intl: Treat C.UTF-8 locale like C locale (BZ# 16621)
Checks
Context | Check | Description
dj/TryBot-apply_patch | success | Patch applied to master at the time it was sent
dj/TryBot-32bit | success | Build for i686
Commit Message
The wiki page https://sourceware.org/glibc/wiki/Proposals/C.UTF-8
says that "Setting LC_ALL=C.UTF-8 will ignore LANGUAGE just like it
does with LC_ALL=C." This patch implements it.
* intl/dcigettext.c (guess_category_value): Treat C.<encoding> locale
like the C locale.
---
intl/dcigettext.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
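
For illustration only, the condition this patch introduces can be pulled out
into a standalone predicate. This is a minimal sketch: the helper name
is_c_like_locale is hypothetical and does not exist in glibc. The patch itself
writes the test inline in guess_category_value.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of glibc.  It mirrors the inline test
   added by the patch: accept exactly "C", or any name that starts with
   "C." (for example "C.UTF-8").  A name such as "Ca" or "de_DE.UTF-8"
   is rejected because its second character is neither NUL nor '.'.  */
static bool
is_c_like_locale (const char *locale)
{
  /* The && short-circuits, so locale[1] is only read when locale[0]
     is 'C'; an empty string is therefore handled safely.  */
  return locale[0] == 'C' && (locale[1] == '\0' || locale[1] == '.');
}
```

With this sketch, "C", "C.UTF-8" and "C.ISO-8859-1" are accepted, while "Ca",
"de_DE.UTF-8" and the empty string are rejected. The literal name "POSIX" is
not matched by this test either. Compared with the previous
strcmp (locale, "C") == 0, the only new matches are names with the "C." prefix.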
Comments
* Bruno Haible:
> The wiki page https://sourceware.org/glibc/wiki/Proposals/C.UTF-8
> says that "Setting LC_ALL=C.UTF-8 will ignore LANGUAGE just like it
> does with LC_ALL=C." This patch implements it.
>
> * intl/dcigettext.c (guess_category_value): Treat C.<encoding> locale
> like the C locale.
> ---
> intl/dcigettext.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/intl/dcigettext.c b/intl/dcigettext.c
> index 1fc074a414..6a3c248e68 100644
> --- a/intl/dcigettext.c
> +++ b/intl/dcigettext.c
> @@ -1564,8 +1564,12 @@ guess_category_value (int category, const char *categoryname)
> 2. The precise output of some programs in the "C" locale is specified
> by POSIX and should not depend on environment variables like
> "LANGUAGE" or system-dependent information. We allow such programs
> - to use gettext(). */
> - if (strcmp (locale, "C") == 0)
> + to use gettext().
> + Ignore LANGUAGE and its system-dependent analogon also if the locale is
> + set to "C.UTF-8" or, more generally, to "C.<encoding>", because that's
> + the by-design behaviour for glibc, see
> + <https://sourceware.org/glibc/wiki/Proposals/C.UTF-8>. */
> + if (locale[0] == 'C' && (locale[1] == '\0' || locale[1] == '.'))
> return locale;
>
> /* The highest priority value is the value of the 'LANGUAGE' environment
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Fix pushed. I've posted my test case as well:
[PATCH] intl: Add test case for bug 16621
<https://inbox.sourceware.org/libc-alpha/87o7iiukpt.fsf@oldenburg3.str.redhat.com/T/#u>
Thanks,
Florian
Florian Weimer wrote:
> > * intl/dcigettext.c (guess_category_value): Treat C.<encoding> locale
> > like the C locale.
>
> Reviewed-by: Florian Weimer <fweimer@redhat.com>
>
> Fix pushed.
Thanks!
> I've posted my test case as well:
>
> [PATCH] intl: Add test case for bug 16621
> <https://inbox.sourceware.org/libc-alpha/87o7iiukpt.fsf@oldenburg3.str.redhat.com/T/#u>
Now that the main patch is in glibc, I have also added it to GNU gettext, together with a
unit test. My unit test [1][2] happens to be stricter than what I had tested manually in
Dec. 2022: it adds a .mo file at <LOCALEDIR>/C/LC_MESSAGES/<domain>.mo, and the test fails.
A second patch is needed, basically the same change at a different place in dcigettext.c.
I'm posting it separately.
Bruno
[1] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gettext-tools/tests/intl-0;h=9977cfe2e5d645c3a20fbfe891974720aacb488d;hb=HEAD
[2] https://git.savannah.gnu.org/gitweb/?p=gettext.git;a=blob;f=gettext-tools/tests/intl-1-prg.c;h=cda076140b4d60d2a9535d4fa1d769f26c580c20;hb=HEAD