Florian,
All of these patches look good to me and should get checked in.
To be clear, patch #1, #2, and #3 are ready to get checked in and
should be checked in immediately to fix CVE-2014-0475.
Allan,
Patch #1 is an alloca hardening that prevents overly long locale
names from blowing out the stack. This should IMO be considered a bug
and this patch allowed in our 2.20 freeze mode.
The rest of the patches fix the CVE, and should absolutely make it for
2.20.
Your final call on patch #1 though.
Cheers,
Carlos.
On 07/09/2014 12:14 PM, Florian Weimer wrote:
> These patches fix a security vulnerability reported privately by
> Stephane Chazelas. Pre-notification was sent out through
> linux-distros. I'd like to thank Stephane for reporting these issues,
> and Carlos for reviewing my patches.
>
> Before committing, I will add the bug number to the NEWS file and the
> ChangeLog. For obvious reasons, I did not have a bug number when I
> prepared those patches.
>
> The meat of the fixes is in the second patch. The first one is just
> the usual alloca hardening (technically not covered by the CVE
> assignment), and the third patch consists solely of documentation
> updates.
Thanks for working on this.
> --
> Florian Weimer / Red Hat Product Security
>
> 0001-setlocale-Use-the-heap-for-the-copy-of-the-locale-ar.patch
This doesn't follow any of the community guidelines for posting
patches. It should just be ChangeLog followed by patch.
See:
https://sourceware.org/glibc/wiki/Contribution%20checklist
> From c9775f8d4de7383ec9fe0121165227ff56a5def5 Mon Sep 17 00:00:00 2001
> From: Florian Weimer <fweimer@redhat.com>
> Date: Wed, 28 May 2014 14:41:52 +0200
> Subject: [PATCH 1/3] setlocale: Use the heap for the copy of the locale
> argument
>
> ---
> ChangeLog | 5 +++++
> locale/setlocale.c | 14 ++++++++++++--
> 2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/ChangeLog b/ChangeLog
> index a1e44b1..457b821 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,8 @@
> +2014-07-02 Florian Weimer <fweimer@redhat.com>
> +
> + * locale/setlocale.c (setlocale): Use strdup for allocating
> + composite name copy.
> +
> 2014-07-01 Stefan Liebler <stli@linux.vnet.ibm.com>
>
> * sysdeps/s390/fpu/libm-test-ulps: Regenerate.
> diff --git a/locale/setlocale.c b/locale/setlocale.c
> index 9458468..6455b8b 100644
> --- a/locale/setlocale.c
> +++ b/locale/setlocale.c
> @@ -272,6 +272,8 @@ setlocale (int category, const char *locale)
> of entries of the form `CATEGORY=VALUE'. */
> const char *newnames[__LC_LAST];
> struct __locale_data *newdata[__LC_LAST];
> + /* Copy of the locale argument, for in-place splitting. */
> + char *locale_copy = NULL;
OK.
>
> /* Set all name pointers to the argument name. */
> for (category = 0; category < __LC_LAST; ++category)
> @@ -281,7 +283,13 @@ setlocale (int category, const char *locale)
> if (__glibc_unlikely (strchr (locale, ';') != NULL))
> {
> /* This is a composite name. Make a copy and split it up. */
> - char *np = strdupa (locale);
> + locale_copy = strdup (locale);
> + if (__glibc_unlikely (locale_copy == NULL))
> + {
> + __libc_rwlock_unlock (__libc_setlocale_lock);
> + return NULL;
> + }
OK.
> + char *np = locale_copy;
> char *cp;
> int cnt;
>
> @@ -299,6 +307,7 @@ setlocale (int category, const char *locale)
> {
> error_return:
> __libc_rwlock_unlock (__libc_setlocale_lock);
> + free (locale_copy);
OK.
>
> /* Bogus category name. */
> ERROR_RETURN;
> @@ -391,8 +400,9 @@ setlocale (int category, const char *locale)
> /* Critical section left. */
> __libc_rwlock_unlock (__libc_setlocale_lock);
>
> - /* Free the resources (the locale path variable). */
> + /* Free the resources. */
> free (locale_path);
> + free (locale_copy);
OK.
>
> return composite;
> }
> -- 1.9.3
The argument can be made that this is a bug since an overly long locale
name could trigger a stack overflow with the unchecked alloca. I've made
the case to Allan, but as release manager he has final say.
>
>
> 0002-_nl_find_locale-Improve-handling-of-crafted-locale-n.patch
>
>
> From 7f3e0956170a0ce04f0ff087f6333419631b0638 Mon Sep 17 00:00:00 2001
> From: Florian Weimer <fweimer@redhat.com>
> Date: Mon, 12 May 2014 15:24:12 +0200
> Subject: [PATCH 2/3] _nl_find_locale: Improve handling of crafted locale names
>
> ---
> ChangeLog | 8 ++
> NEWS | 9 ++
> locale/findlocale.c | 74 +++++++++++++---
> localedata/ChangeLog | 6 ++
> localedata/Makefile | 3 +-
> localedata/tst-setlocale3.c | 203 ++++++++++++++++++++++++++++++++++++++++++++
> 6 files changed, 289 insertions(+), 14 deletions(-)
> create mode 100644 localedata/tst-setlocale3.c
>
> diff --git a/ChangeLog b/ChangeLog
> index 457b821..207597c 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,5 +1,13 @@
> 2014-07-02 Florian Weimer <fweimer@redhat.com>
>
> + * locale/findlocale.c (name_present, valid_locale_name): New
> + functions.
> + (_nl_find_locale): Use the loc_name variable to store name
> + candidates. Call name_present and valid_locale_name to check and
> + validate locale names. Return an error if the locale is invalid.
> +
> +2014-07-02 Florian Weimer <fweimer@redhat.com>
> +
Remove the date line, two commits one after the other should have
only a space between them (again this is why we don't want diff's
of ChangeLog's posted).
> * locale/setlocale.c (setlocale): Use strdup for allocating
> composite name copy.
>
> diff --git a/NEWS b/NEWS
> index a07ea66..c61d121 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -65,6 +65,15 @@ Version 2.20
> * All supported architectures now use the main glibc sysdeps directory
> instead of some being in a separate "ports" directory (which was
> distributed separately before glibc 2.17).
> +
> +* Locale names, including those obtained from environment variables (LANG
> + and the LC_* variables), are more tightly checked for proper syntax.
> + setlocale will now fail (with EINVAL) for locale names that are overly
> + long, contain slashes without starting with a slash, or contain ".." path
> + components. (CVE-2014-0475) Previously, some valid locale names were
> + silently replaced with the "C" locale when running in AT_SECURE mode
> + (e.g., in a SUID program). This is no longer necessary because of the
> + additional checks.
OK.
>
> Version 2.19
>
> diff --git a/locale/findlocale.c b/locale/findlocale.c
> index bbaf708..22e8b53 100644
> --- a/locale/findlocale.c
> +++ b/locale/findlocale.c
> @@ -17,6 +17,7 @@
> <http://www.gnu.org/licenses/>. */
>
> #include <assert.h>
> +#include <errno.h>
> #include <locale.h>
> #include <stdlib.h>
> #include <string.h>
> @@ -57,6 +58,45 @@ struct loaded_l10nfile *_nl_locale_file_list[__LC_LAST];
>
> const char _nl_default_locale_path[] attribute_hidden = LOCALEDIR;
>
> +/* Checks if the name is actually present, that is, not NULL and not
> + empty. */
> +static inline int
> +name_present (const char *name)
> +{
> + return name != NULL && name[0] != '\0';
> +}
OK.
> +
> +/* Checks that the locale name neither extremely long, nor contains a
> + ".." path component (to prevent directory traversal). */
> +static inline int
> +valid_locale_name (const char *name)
> +{
> + /* Not set. */
> + size_t namelen = strlen (name);
> + /* Name too long. The limit is arbitrary and prevents stack overflow
> + issues later. */
> + if (__glibc_unlikely (namelen > 255))
> + return 0;
OK.
> + /* Directory traversal attempt. */
> + static const char slashdot[4] = {'/', '.', '.', '/'};
> + if (__glibc_unlikely (memmem (name, namelen,
> + slashdot, sizeof (slashdot)) != NULL))
> + return 0;
OK.
> + if (namelen == 2 && __glibc_unlikely (name[0] == '.' && name [1] == '.'))
> + return 0;
OK.
> + if (namelen >= 3
> + && __glibc_unlikely (((name[0] == '.'
> + && name[1] == '.'
> + && name[2] == '/')
> + || (name[namelen - 3] == '/'
> + && name[namelen - 2] == '.'
> + && name[namelen - 1] == '.'))))
> + return 0;
OK.
> + /* If there is a slash in the name, it must start with one. */
> + if (__glibc_unlikely (memchr (name, '/', namelen) != NULL) && name[0] != '/')
> + return 0;
OK. This is an implementation limit which you document later, which is fine.
> + return 1;
> +}
>
> struct __locale_data *
> internal_function
> @@ -65,7 +105,7 @@ _nl_find_locale (const char *locale_path, size_t locale_path_len,
> {
> int mask;
> /* Name of the locale for this category. */
> - char *loc_name;
> + char *loc_name = (char *) *name;
OK.
> const char *language;
> const char *modifier;
> const char *territory;
> @@ -73,31 +113,39 @@ _nl_find_locale (const char *locale_path, size_t locale_path_len,
> const char *normalized_codeset;
> struct loaded_l10nfile *locale_file;
>
> - if ((*name)[0] == '\0')
> + if (loc_name[0] == '\0')
> {
> /* The user decides which locale to use by setting environment
> variables. */
> - *name = getenv ("LC_ALL");
> - if (*name == NULL || (*name)[0] == '\0')
> - *name = getenv (_nl_category_names.str
> + loc_name = getenv ("LC_ALL");
> + if (!name_present (loc_name))
> + loc_name = getenv (_nl_category_names.str
> + _nl_category_name_idxs[category]);
> - if (*name == NULL || (*name)[0] == '\0')
> - *name = getenv ("LANG");
> + if (!name_present (loc_name))
> + loc_name = getenv ("LANG");
> + if (!name_present (loc_name))
> + loc_name = (char *) _nl_C_name;
> }
>
> - if (*name == NULL || (*name)[0] == '\0'
> - || (__builtin_expect (__libc_enable_secure, 0)
> - && strchr (*name, '/') != NULL))
> - *name = (char *) _nl_C_name;
> + /* We used to fall back to the C locale if the name contains a slash
> + character '/', but we now check for directory traversal in
> + valid_locale_name, so this is no longer necessary. */
OK.
>
> - if (__builtin_expect (strcmp (*name, _nl_C_name), 1) == 0
> - || __builtin_expect (strcmp (*name, _nl_POSIX_name), 1) == 0)
> + if (__builtin_expect (strcmp (loc_name, _nl_C_name), 1) == 0
> + || __builtin_expect (strcmp (loc_name, _nl_POSIX_name), 1) == 0)
> {
> /* We need not load anything. The needed data is contained in
> the library itself. */
> *name = (char *) _nl_C_name;
> return _nl_C[category];
> }
> + else if (!valid_locale_name (loc_name))
> + {
> + __set_errno (EINVAL);
> + return NULL;
> + }
> +
> + *name = loc_name;
OK.
>
> /* We really have to load some data. First we try the archive,
> but only if there was no LOCPATH environment variable specified. */
> diff --git a/localedata/ChangeLog b/localedata/ChangeLog
> index 9dd3cf2..c22d279 100644
> --- a/localedata/ChangeLog
> +++ b/localedata/ChangeLog
> @@ -1,3 +1,9 @@
> +2014-07-02 Florian Weimer <fweimer@redhat.com>
> +
> + * tst-setlocale3.c: New file.
> + * Makefile (tests): Add tst-setlocale3.
> + (tst-setlocale3-ENV): New variable.
Thanks for the test.
> +
> 2014-06-20 Stefan Liebler <stli@linux.vnet.ibm.com>
>
> * Makefile (LOCALES): Add en_GB.UTF-8.
> diff --git a/localedata/Makefile b/localedata/Makefile
> index e8fe10f..b6235f2 100644
> --- a/localedata/Makefile
> +++ b/localedata/Makefile
> @@ -74,7 +74,8 @@ locale_test_suite := tst_iswalnum tst_iswalpha tst_iswcntrl \
> tests = $(locale_test_suite) tst-digits tst-setlocale bug-iconv-trans \
> tst-leaks tst-mbswcs1 tst-mbswcs2 tst-mbswcs3 tst-mbswcs4 tst-mbswcs5 \
> tst-mbswcs6 tst-xlocale1 tst-xlocale2 bug-usesetlocale \
> - tst-strfmon1 tst-sscanf bug-setlocale1 tst-setlocale2 tst-wctype
> + tst-strfmon1 tst-sscanf bug-setlocale1 tst-setlocale2 tst-setlocale3 \
> + tst-wctype
OK.
> tests-static = bug-setlocale1-static
> tests += $(tests-static)
> ifeq (yes,$(build-shared))
> diff --git a/localedata/tst-setlocale3.c b/localedata/tst-setlocale3.c
> new file mode 100644
> index 0000000..e3b21a9
> --- /dev/null
> +++ b/localedata/tst-setlocale3.c
> @@ -0,0 +1,203 @@
> +/* Regression test for setlocale invalid environment variable handling.
OK.
> + Copyright (C) 2014 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include <locale.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +/* The result of setlocale may be overwritten by subsequent calls, so
> + this wrapper makes a copy. */
> +static char *
> +setlocale_copy (int category, const char *locale)
> +{
> + const char *result = setlocale (category, locale);
> + if (result == NULL)
> + return NULL;
> + return strdup (result);
> +}
OK.
> +
> +static char *de_locale;
> +
> +static void
> +setlocale_fail (const char *envstring)
> +{
> + setenv ("LC_CTYPE", envstring, 1);
> + if (setlocale (LC_CTYPE, "") != NULL)
> + {
> + printf ("unexpected setlocale success for \"%s\" locale\n", envstring);
> + exit (1);
> + }
> + const char *newloc = setlocale (LC_CTYPE, NULL);
> + if (strcmp (newloc, de_locale) != 0)
> + {
> + printf ("failed setlocale call \"%s\" changed locale to \"%s\"\n",
> + envstring, newloc);
> + exit (1);
> + }
> +}
OK.
> +
> +static void
> +setlocale_success (const char *envstring)
> +{
> + setenv ("LC_CTYPE", envstring, 1);
> + char *newloc = setlocale_copy (LC_CTYPE, "");
> + if (newloc == NULL)
> + {
> + printf ("setlocale for \"%s\": %m\n", envstring);
> + exit (1);
> + }
> + if (strcmp (newloc, de_locale) == 0)
> + {
> + printf ("setlocale with LC_CTYPE=\"%s\" left locale at \"%s\"\n",
> + envstring, de_locale);
> + exit (1);
> + }
> + if (setlocale (LC_CTYPE, de_locale) == NULL)
> + {
> + printf ("restoring locale \"%s\" with LC_CTYPE=\"%s\": %m\n",
> + de_locale, envstring);
> + exit (1);
> + }
> + char *newloc2 = setlocale_copy (LC_CTYPE, newloc);
> + if (newloc2 == NULL)
> + {
> + printf ("restoring locale \"%s\" following \"%s\": %m\n",
> + newloc, envstring);
> + exit (1);
> + }
> + if (strcmp (newloc, newloc2) != 0)
> + {
> + printf ("representation of locale \"%s\" changed from \"%s\" to \"%s\"",
> + envstring, newloc, newloc2);
> + exit (1);
> + }
> + free (newloc);
> + free (newloc2);
> +
> + if (setlocale (LC_CTYPE, de_locale) == NULL)
> + {
> + printf ("restoring locale \"%s\" with LC_CTYPE=\"%s\": %m\n",
> + de_locale, envstring);
> + exit (1);
> + }
> +}
OK.
> +
> +/* Checks that a known-good locale still works if LC_ALL contains a
> + value which should be ignored. */
> +static void
> +setlocale_ignore (const char *to_ignore)
> +{
> + const char *fr_locale = "fr_FR.UTF-8";
> + setenv ("LC_CTYPE", fr_locale, 1);
> + char *expected_locale = setlocale_copy (LC_CTYPE, "");
> + if (expected_locale == NULL)
> + {
> + printf ("setlocale with LC_CTYPE=\"%s\" failed: %m\n", fr_locale);
> + exit (1);
> + }
> + if (setlocale (LC_CTYPE, de_locale) == NULL)
> + {
> + printf ("failed to restore locale: %m\n");
> + exit (1);
> + }
> + unsetenv ("LC_CTYPE");
> +
> + setenv ("LC_ALL", to_ignore, 1);
> + setenv ("LC_CTYPE", fr_locale, 1);
> + const char *actual_locale = setlocale (LC_CTYPE, "");
> + if (actual_locale == NULL)
> + {
> + printf ("setlocale with LC_ALL, LC_CTYPE=\"%s\" failed: %m\n",
> + fr_locale);
> + exit (1);
> + }
> + if (strcmp (actual_locale, expected_locale) != 0)
> + {
> + printf ("setlocale under LC_ALL failed: got \"%s\", expected \"%s\"\n",
> + actual_locale, expected_locale);
> + exit (1);
> + }
> + unsetenv ("LC_CTYPE");
> + setlocale_success (fr_locale);
> + unsetenv ("LC_ALL");
> + free (expected_locale);
> +}
OK.
> +
> +static int
> +do_test (void)
> +{
> + /* The glibc test harness sets this environment variable
> + uncondionally. */
> + unsetenv ("LC_ALL");
> +
> + de_locale = setlocale_copy (LC_CTYPE, "de_DE.UTF-8");
> + if (de_locale == NULL)
> + {
> + printf ("setlocale (LC_CTYPE, \"de_DE.UTF-8\"): %m\n");
> + return 1;
> + }
> + setlocale_success ("C");
> + setlocale_success ("en_US.UTF-8");
> + setlocale_success ("/en_US.UTF-8");
> + setlocale_success ("//en_US.UTF-8");
> + setlocale_ignore ("");
> +
> + setlocale_fail ("does-not-exist");
> + setlocale_fail ("/");
> + setlocale_fail ("/../localedata/en_US.UTF-8");
> + setlocale_fail ("en_US.UTF-8/");
> + setlocale_fail ("en_US.UTF-8/..");
> + setlocale_fail ("en_US.UTF-8/../en_US.UTF-8");
> + setlocale_fail ("../localedata/en_US.UTF-8");
> + {
> + size_t large_length = 1024;
> + char *large_name = malloc (large_length + 1);
> + if (large_name == NULL)
> + {
> + puts ("malloc failure");
> + return 1;
> + }
> + memset (large_name, '/', large_length);
> + const char *suffix = "en_US.UTF-8";
> + strcpy (large_name + large_length - strlen (suffix), suffix);
> + setlocale_fail (large_name);
> + free (large_name);
> + }
> + {
> + size_t huge_length = 64 * 1024 * 1024;
> + char *huge_name = malloc (huge_length + 1);
> + if (huge_name == NULL)
> + {
> + puts ("malloc failure");
> + return 1;
> + }
> + memset (huge_name, 'X', huge_length);
> + huge_name[huge_length] = '\0';
> + /* Construct a composite locale specification. */
> + const char *prefix = "LC_CTYPE=de_DE.UTF-8;LC_TIME=";
> + memcpy (huge_name, prefix, strlen (prefix));
> + setlocale_fail (huge_name);
> + free (huge_name);
> + }
> +
> + return 0;
> +}
> +
> +#define TEST_FUNCTION do_test ()
> +#include "../test-skeleton.c"
OK.
> -- 1.9.3
>
>
> 0003-manual-Update-the-locale-documentation.patch
Thanks for the manual update. This looks really good.
>
> From e175c6ba72a77cc3de4a7080f3cb9c072fc87353 Mon Sep 17 00:00:00 2001
> From: Florian Weimer <fweimer@redhat.com>
> Date: Wed, 28 May 2014 14:05:03 +0200
> Subject: [PATCH 3/3] manual: Update the locale documentation
>
> ---
> ChangeLog | 13 +++++
> manual/locale.texi | 146 +++++++++++++++++++++++++++++++++++++++++------------
> 2 files changed, 127 insertions(+), 32 deletions(-)
>
> diff --git a/ChangeLog b/ChangeLog
> index 207597c..d37328e 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,5 +1,18 @@
> 2014-07-02 Florian Weimer <fweimer@redhat.com>
>
> + * manual/locale.texi (Locale Names): New section documenting
> + locale name syntax. Adjust menu and node chaining accordingly.
> + (Choosing Locale): Reference Locale Names, Locale Categories.
> + Mention setting LC_ALL=C. Reflect that name syntax is now
> + documented.
> + (Locale Categories): New section title. Reference Locale Names.
> + LC_ALL is an environment variable, but not a category.
> + (Setting the Locale): Remove "locale -a" invocation and LOCPATH
> + description, now in Locale Name. Reference that section. Locale
> + name syntax is now documented.
OK.
> +
> +2014-07-02 Florian Weimer <fweimer@redhat.com>
> +
> * locale/findlocale.c (name_present, valid_locale_name): New
> functions.
> (_nl_find_locale): Use the loc_name variable to store name
> diff --git a/manual/locale.texi b/manual/locale.texi
> index 45f1e94..ee1c3a1 100644
> --- a/manual/locale.texi
> +++ b/manual/locale.texi
> @@ -29,6 +29,7 @@ will follow the conventions preferred by the user.
> * Setting the Locale:: How a program specifies the locale
> with library functions.
> * Standard Locales:: Locale names available on all systems.
> +* Locale Names:: Format of system-specific locale names.
OK.
> * Locale Information:: How to access the information for the locale.
> * Formatting Numbers:: A dedicated function to format numbers.
> * Yes-or-No Questions:: Check a Response against the locale.
> @@ -99,14 +100,16 @@ locale named @samp{espana-castellano} to use the standard conventions of
> most of Spain.
>
> The set of locales supported depends on the operating system you are
> -using, and so do their names. We can't make any promises about what
> -locales will exist, except for one standard locale called @samp{C} or
> -@samp{POSIX}. Later we will describe how to construct locales.
> -@comment (@pxref{Building Locale Files}).
> +using, and so do their names, except that the standard locale called
> +@samp{C} or @samp{POSIX} always exist. @xref{Locale Names}.
> +
> +In order to force the system to always use the default locale, the
> +user can set the @code{LC_ALL} environment variable to @samp{C}.
OK.
You remove a reference to {Building Locale Files} which is fine because
it doesn't exist :-)
>
> @cindex combining locales
> -A user also has the option of specifying different locales for different
> -purposes---in effect, choosing a mixture of multiple locales.
> +A user also has the option of specifying different locales for
> +different purposes---in effect, choosing a mixture of multiple
> +locales. @xref{Locale Categories}.
OK.
>
> For example, the user might specify the locale @samp{espana-castellano}
> for most purposes, but specify the locale @samp{usa-english} for
> @@ -120,7 +123,7 @@ which locales apply. However, the user can choose to use each locale
> for a particular subset of those purposes.
>
> @node Locale Categories, Setting the Locale, Choosing Locale, Locales
> -@section Categories of Activities that Locales Affect
> +@section Locale Categories
OK.
> @cindex categories for locales
> @cindex locale categories
>
> @@ -128,7 +131,11 @@ The purposes that locales serve are grouped into @dfn{categories}, so
> that a user or a program can choose the locale for each category
> independently. Here is a table of categories; each name is both an
> environment variable that a user can set, and a macro name that you can
> -use as an argument to @code{setlocale}.
> +use as the first argument to @code{setlocale}.
> +
> +The contents of the environment variable (or the string in the second
> +argument to @code{setlocale}) has to be a valid locale name.
> +@xref{Locale Names}.
OK.
>
> @vtable @code
> @comment locale.h
> @@ -172,7 +179,7 @@ for affirmative and negative responses.
> @comment locale.h
> @comment ISO
> @item LC_ALL
> -This is not an environment variable; it is only a macro that you can use
> +This is not a category; it is only a macro that you can use
OK. Good fix.
> with @code{setlocale} to set a single locale for all purposes. Setting
> this environment variable overwrites all selections by the other
> @code{LC_*} variables or @code{LANG}.
> @@ -355,13 +362,7 @@ The symbols in this section are defined in the header file @file{locale.h}.
> @c strndup @ascuheap @acsmem
> @c strcasecmp_l ok (C locale)
> The function @code{setlocale} sets the current locale for category
> -@var{category} to @var{locale}. A list of all the locales the system
> -provides can be created by running
> -
> -@pindex locale
> -@smallexample
> - locale -a
> -@end smallexample
> +@var{category} to @var{locale}.
OK.
>
> If @var{category} is @code{LC_ALL}, this specifies the locale for all
> purposes. The other possible values of @var{category} specify an
> @@ -386,10 +387,9 @@ is passed in as @var{locale} parameter.
>
> When you read the current locale for category @code{LC_ALL}, the value
> encodes the entire combination of selected locales for all categories.
> -In this case, the value is not just a single locale name. In fact, we
> -don't make any promises about what it looks like. But if you specify
> -the same ``locale name'' with @code{LC_ALL} in a subsequent call to
> -@code{setlocale}, it restores the same combination of locale selections.
> +If you specify the same ``locale name'' with @code{LC_ALL} in a
> +subsequent call to @code{setlocale}, it restores the same combination
> +of locale selections.
OK. We are documenting the format here, so that's good.
>
> To be sure you can use the returned string encoding the currently selected
> locale at a later time, you must make a copy of the string. It is not
> @@ -405,20 +405,15 @@ for @var{category}.
> If a nonempty string is given for @var{locale}, then the locale of that
> name is used if possible.
>
> +The effective locale name (either the second argument to
> +@code{setlocale}, or if the argument is an empty string, the name
> +obtained from the process environment) must be valid locale name.
> +@xref{Locale Names}.
OK.
> +
> If you specify an invalid locale name, @code{setlocale} returns a null
> pointer and leaves the current locale unchanged.
> @end deftypefun
>
> -The path used for finding locale data can be set using the
> -@code{LOCPATH} environment variable. The default path for finding
> -locale data is system specific. It is computed from the value given
> -as the prefix while configuring the C library. This value normally is
> -@file{/usr} or @file{/}. For the former the complete path is:
> -
> -@smallexample
> -/usr/lib/locale
> -@end smallexample
OK.
Thanks for removing this, it was always a configuration option and should
not be specified in the manual.
> -
> Here is an example showing how you might use @code{setlocale} to
> temporarily switch to a new locale.
>
> @@ -458,7 +453,7 @@ locale categories, and future versions of the library will do so. For
> portability, assume that any symbol beginning with @samp{LC_} might be
> defined in @file{locale.h}.
>
> -@node Standard Locales, Locale Information, Setting the Locale, Locales
> +@node Standard Locales, Locale Names, Setting the Locale, Locales
> @section Standard Locales
>
> The only locale names you can count on finding on all operating systems
> @@ -492,7 +487,94 @@ with the environment, rather than trying to specify some non-standard
> locale explicitly by name. Remember, different machines might have
> different sets of locales installed.
>
> -@node Locale Information, Formatting Numbers, Standard Locales, Locales
> +@node Locale Names, Locale Information, Standard Locales, Locales
> +@section Locale Names
> +
> +The following command prints a list of locales supported by the
> +system:
> +
> +@pindex locale
> +@smallexample
> + locale -a
> +@end smallexample
> +
> +@strong{Portability Note:} With the notable exception of the standard
> +locale names @samp{C} and @samp{POSIX}, locale names are
> +system-specific.
OK. Good note.
> +
> +Most locale names follow XPG syntax and consist of up to four parts:
> +
> +@smallexample
> +@var{language}[_@var{territory}[.@var{codeset}]][@@@var{modifier}]
> +@end smallexample
> +
> +Beside the first part, all of them are allowed to be missing. If the
> +full specified locale is not found, less specific ones are looked for.
> +The various parts will be stripped off, in the following order:
> +
> +@enumerate
> +@item
> +codeset
> +@item
> +normalized codeset
> +@item
> +territory
> +@item
> +modifier
> +@end enumerate
> +
> +For example, the locale name @samp{de_AT.iso885915@@euro} denotes a
> +German-language locale for use in Austria, using the ISO-8859-15
> +(Latin-9) character set, and with the Euro as the currency symbol.
OK. Good example using the @euro marker, since that's an interesting case
where the locale is extended to change the currency marker.
> +
> +In addition to locale names which follow XPG syntax, systems may
> +provide aliases such as @samp{german}. Both categories of names must
> +not contain the slash character @samp{/}.
OK. Perfect, you documented that '/' shall not be used.
> +
> +If the locale name starts with a slash @samp{/}, it is treated as a
> +path relative to the configured locale directories; see @code{LOCPATH}
> +below. The specified path must not contain a component @samp{..}, or
> +the name is invalid, and @code{setlocale} will fail.
> +
> +@strong{Portability Note:} POSIX suggests that if a locale name starts
> +with a slash @samp{/}, it is resolved as an absolute path. However,
> +@theglibc{} treats it as a relative path under the directories listed
> +in @code{LOCPATH} (or the default locale directory if @code{LOCPATH}
> +is unset).
> +
> +Locale names which are longer than an implementation-defined limit are
> +invalid and cause @code{setlocale} to fail.
> +
> +As a special case, locale names used with @code{LC_ALL} can combine
> +several locales, reflecting different locale settings for different
> +categories. For example, you might want to use a U.S. locale with ISO
> +A4 paper format, so you set @code{LANG} to @samp{en_US.UTF-8}, and
> +@code{LC_PAPER} to @samp{de_DE.UTF-8}. In this case, the
> +@code{LC_ALL}-style combined locale name is
> +
> +@smallexample
> +LC_CTYPE=en_US.UTF-8;LC_TIME=en_US.UTF-8;LC_PAPER=de_DE.UTF-8;@dots{}
> +@end smallexample
> +
> +followed by other category settings not shown here.
> +
> +@vindex LOCPATH
> +The path used for finding locale data can be set using the
> +@code{LOCPATH} environment variable. This variable lists the
> +directories in which to search for locale definitions, separated by a
> +colon @samp{:}.
> +
> +The default path for finding locale data is system specific. A typical
> +value for the @code{LOCPATH} default is:
> +
> +@smallexample
> +/usr/share/locale
> +@end smallexample
> +
> +The value of @code{LOCPATH} is ignored by privileged programs for
> +security reasons, and only the default directory is used.
OK.
> +
> +@node Locale Information, Formatting Numbers, Locale Names, Locales
> @section Accessing Locale Information
>
> There are several ways to access locale information. The simplest
> -- 1.9.3
Cheers,
Carlos.
From e175c6ba72a77cc3de4a7080f3cb9c072fc87353 Mon Sep 17 00:00:00 2001
From: Florian Weimer <fweimer@redhat.com>
Date: Wed, 28 May 2014 14:05:03 +0200
Subject: [PATCH 3/3] manual: Update the locale documentation
---
ChangeLog | 13 +++++
manual/locale.texi | 146 +++++++++++++++++++++++++++++++++++++++++------------
2 files changed, 127 insertions(+), 32 deletions(-)
@@ -1,5 +1,18 @@
2014-07-02 Florian Weimer <fweimer@redhat.com>
+ * manual/locale.texi (Locale Names): New section documenting
+ locale name syntax. Adjust menu and node chaining accordingly.
+ (Choosing Locale): Reference Locale Names, Locale Categories.
+ Mention setting LC_ALL=C. Reflect that name syntax is now
+ documented.
+ (Locale Categories): New section title. Reference Locale Names.
+ LC_ALL is an environment variable, but not a category.
+ (Setting the Locale): Remove "locale -a" invocation and LOCPATH
+ description, now in Locale Name. Reference that section. Locale
+ name syntax is now documented.
+
+2014-07-02 Florian Weimer <fweimer@redhat.com>
+
* locale/findlocale.c (name_present, valid_locale_name): New
functions.
(_nl_find_locale): Use the loc_name variable to store name
@@ -29,6 +29,7 @@ will follow the conventions preferred by the user.
* Setting the Locale:: How a program specifies the locale
with library functions.
* Standard Locales:: Locale names available on all systems.
+* Locale Names:: Format of system-specific locale names.
* Locale Information:: How to access the information for the locale.
* Formatting Numbers:: A dedicated function to format numbers.
* Yes-or-No Questions:: Check a Response against the locale.
@@ -99,14 +100,16 @@ locale named @samp{espana-castellano} to use the standard conventions of
most of Spain.
The set of locales supported depends on the operating system you are
-using, and so do their names. We can't make any promises about what
-locales will exist, except for one standard locale called @samp{C} or
-@samp{POSIX}. Later we will describe how to construct locales.
-@comment (@pxref{Building Locale Files}).
+using, and so do their names, except that the standard locale called
+@samp{C} or @samp{POSIX} always exist. @xref{Locale Names}.
+
+In order to force the system to always use the default locale, the
+user can set the @code{LC_ALL} environment variable to @samp{C}.
@cindex combining locales
-A user also has the option of specifying different locales for different
-purposes---in effect, choosing a mixture of multiple locales.
+A user also has the option of specifying different locales for
+different purposes---in effect, choosing a mixture of multiple
+locales. @xref{Locale Categories}.
For example, the user might specify the locale @samp{espana-castellano}
for most purposes, but specify the locale @samp{usa-english} for
@@ -120,7 +123,7 @@ which locales apply. However, the user can choose to use each locale
for a particular subset of those purposes.
@node Locale Categories, Setting the Locale, Choosing Locale, Locales
-@section Categories of Activities that Locales Affect
+@section Locale Categories
@cindex categories for locales
@cindex locale categories
@@ -128,7 +131,11 @@ The purposes that locales serve are grouped into @dfn{categories}, so
that a user or a program can choose the locale for each category
independently. Here is a table of categories; each name is both an
environment variable that a user can set, and a macro name that you can
-use as an argument to @code{setlocale}.
+use as the first argument to @code{setlocale}.
+
+The contents of the environment variable (or the string in the second
+argument to @code{setlocale}) has to be a valid locale name.
+@xref{Locale Names}.
@vtable @code
@comment locale.h
@@ -172,7 +179,7 @@ for affirmative and negative responses.
@comment locale.h
@comment ISO
@item LC_ALL
-This is not an environment variable; it is only a macro that you can use
+This is not a category; it is only a macro that you can use
with @code{setlocale} to set a single locale for all purposes. Setting
this environment variable overwrites all selections by the other
@code{LC_*} variables or @code{LANG}.
@@ -355,13 +362,7 @@ The symbols in this section are defined in the header file @file{locale.h}.
@c strndup @ascuheap @acsmem
@c strcasecmp_l ok (C locale)
The function @code{setlocale} sets the current locale for category
-@var{category} to @var{locale}. A list of all the locales the system
-provides can be created by running
-
-@pindex locale
-@smallexample
- locale -a
-@end smallexample
+@var{category} to @var{locale}.
If @var{category} is @code{LC_ALL}, this specifies the locale for all
purposes. The other possible values of @var{category} specify an
@@ -386,10 +387,9 @@ is passed in as @var{locale} parameter.
When you read the current locale for category @code{LC_ALL}, the value
encodes the entire combination of selected locales for all categories.
-In this case, the value is not just a single locale name. In fact, we
-don't make any promises about what it looks like. But if you specify
-the same ``locale name'' with @code{LC_ALL} in a subsequent call to
-@code{setlocale}, it restores the same combination of locale selections.
+If you specify the same ``locale name'' with @code{LC_ALL} in a
+subsequent call to @code{setlocale}, it restores the same combination
+of locale selections.
To be sure you can use the returned string encoding the currently selected
locale at a later time, you must make a copy of the string. It is not
@@ -405,20 +405,15 @@ for @var{category}.
If a nonempty string is given for @var{locale}, then the locale of that
name is used if possible.
+The effective locale name (either the second argument to
+@code{setlocale}, or if the argument is an empty string, the name
+obtained from the process environment) must be valid locale name.
+@xref{Locale Names}.
+
If you specify an invalid locale name, @code{setlocale} returns a null
pointer and leaves the current locale unchanged.
@end deftypefun
-The path used for finding locale data can be set using the
-@code{LOCPATH} environment variable. The default path for finding
-locale data is system specific. It is computed from the value given
-as the prefix while configuring the C library. This value normally is
-@file{/usr} or @file{/}. For the former the complete path is:
-
-@smallexample
-/usr/lib/locale
-@end smallexample
-
Here is an example showing how you might use @code{setlocale} to
temporarily switch to a new locale.
@@ -458,7 +453,7 @@ locale categories, and future versions of the library will do so. For
portability, assume that any symbol beginning with @samp{LC_} might be
defined in @file{locale.h}.
-@node Standard Locales, Locale Information, Setting the Locale, Locales
+@node Standard Locales, Locale Names, Setting the Locale, Locales
@section Standard Locales
The only locale names you can count on finding on all operating systems
@@ -492,7 +487,94 @@ with the environment, rather than trying to specify some non-standard
locale explicitly by name. Remember, different machines might have
different sets of locales installed.
-@node Locale Information, Formatting Numbers, Standard Locales, Locales
+@node Locale Names, Locale Information, Standard Locales, Locales
+@section Locale Names
+
+The following command prints a list of locales supported by the
+system:
+
+@pindex locale
+@smallexample
+ locale -a
+@end smallexample
+
+@strong{Portability Note:} With the notable exception of the standard
+locale names @samp{C} and @samp{POSIX}, locale names are
+system-specific.
+
+Most locale names follow XPG syntax and consist of up to four parts:
+
+@smallexample
+@var{language}[_@var{territory}[.@var{codeset}]][@@@var{modifier}]
+@end smallexample
+
+Beside the first part, all of them are allowed to be missing. If the
+full specified locale is not found, less specific ones are looked for.
+The various parts will be stripped off, in the following order:
+
+@enumerate
+@item
+codeset
+@item
+normalized codeset
+@item
+territory
+@item
+modifier
+@end enumerate
+
+For example, the locale name @samp{de_AT.iso885915@@euro} denotes a
+German-language locale for use in Austria, using the ISO-8859-15
+(Latin-9) character set, and with the Euro as the currency symbol.
+
+In addition to locale names which follow XPG syntax, systems may
+provide aliases such as @samp{german}. Both categories of names must
+not contain the slash character @samp{/}.
+
+If the locale name starts with a slash @samp{/}, it is treated as a
+path relative to the configured locale directories; see @code{LOCPATH}
+below. The specified path must not contain a component @samp{..}, or
+the name is invalid, and @code{setlocale} will fail.
+
+@strong{Portability Note:} POSIX suggests that if a locale name starts
+with a slash @samp{/}, it is resolved as an absolute path. However,
+@theglibc{} treats it as a relative path under the directories listed
+in @code{LOCPATH} (or the default locale directory if @code{LOCPATH}
+is unset).
+
+Locale names which are longer than an implementation-defined limit are
+invalid and cause @code{setlocale} to fail.
+
+As a special case, locale names used with @code{LC_ALL} can combine
+several locales, reflecting different locale settings for different
+categories. For example, you might want to use a U.S. locale with ISO
+A4 paper format, so you set @code{LANG} to @samp{en_US.UTF-8}, and
+@code{LC_PAPER} to @samp{de_DE.UTF-8}. In this case, the
+@code{LC_ALL}-style combined locale name is
+
+@smallexample
+LC_CTYPE=en_US.UTF-8;LC_TIME=en_US.UTF-8;LC_PAPER=de_DE.UTF-8;@dots{}
+@end smallexample
+
+followed by other category settings not shown here.
+
+@vindex LOCPATH
+The path used for finding locale data can be set using the
+@code{LOCPATH} environment variable. This variable lists the
+directories in which to search for locale definitions, separated by a
+colon @samp{:}.
+
+The default path for finding locale data is system specific. A typical
+value for the @code{LOCPATH} default is:
+
+@smallexample
+/usr/share/locale
+@end smallexample
+
+The value of @code{LOCPATH} is ignored by privileged programs for
+security reasons, and only the default directory is used.
+
+@node Locale Information, Formatting Numbers, Locale Names, Locales
@section Accessing Locale Information
There are several ways to access locale information. The simplest
--
1.9.3