Remove obsolete aliases that broke 'locale -a'

Message ID

55544398.2030007@cs.ucla.edu

State

Superseded

Headers

Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
Sender: libc-alpha-owner@sourceware.org
Message-ID: <55544398.2030007@cs.ucla.edu>
Date: Wed, 13 May 2015 23:41:28 -0700
From: Paul Eggert <eggert@cs.ucla.edu>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
	rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: GNU C Library <libc-alpha@sourceware.org>
Subject: [PATCH] Remove obsolete aliases that broke 'locale -a'
Content-Type: multipart/mixed;
	boundary="------------080304020608020108030508"

Commit Message

Paul Eggert May 14, 2015, 6:41 a.m. UTC

I'm attaching the proposed patch, as it contains a mixture of UTF-8 and Latin-1
(which is part of the bug!) and there may be issues decoding it.  (Extract it as
binary and then inspect the bytes carefully using your loupe.  :-)
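
For readers who want to do the byte inspection mechanically, here is a minimal Python sketch that classifies each line of a byte string as ASCII, valid UTF-8, or neither. A line in the last group may well be Latin-1, but the bytes alone cannot prove that. The function name `classify_lines` and the sample bytes are illustrative assumptions, not part of the patch.

```python
def classify_lines(data: bytes) -> list:
    """Return (line_number, kind) pairs; kind is 'ascii', 'utf-8' or 'other'."""
    results = []
    for number, line in enumerate(data.splitlines(), start=1):
        if line.isascii():
            kind = "ascii"
        else:
            try:
                line.decode("utf-8")
                kind = "utf-8"
            except UnicodeDecodeError:
                # Not valid UTF-8; Latin-1 is a plausible guess, nothing more.
                kind = "other"
        results.append((number, kind))
    return results


# Hypothetical sample: one ASCII name, one Latin-1 name, one UTF-8 name.
sample = b"french\nbokm\xe5l\nfran\xc3\xa7ais\n"
for number, kind in classify_lines(sample):
    print(number, kind)
```

The same function could be applied to the raw bytes of a saved attachment or to captured 'locale -a' output, as long as the data is read in binary mode.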

Comments

Carlos O'Donell May 14, 2015, 4:17 p.m. UTC | #1

On 05/14/2015 02:41 AM, Paul Eggert wrote:
> I'm attaching the proposed patch, as it contains a mixture of UTF-8
> and Latin-1 (which is part of the bug!) and there may be issues
> decoding it. (Extract it as binary and then inspect the bytes
> carefully using your loupe. :-)

Looks good to me, modulo messed up data in ChangeLog.

> 0001-Remove-obsolete-aliases-that-broke-locale-a.patch
> 
> 
> From 30f91a08a481c1f5f263429544abfeee1285904a Mon Sep 17 00:00:00 2001
> From: Paul Eggert <eggert@cs.ucla.edu>
> Date: Wed, 13 May 2015 18:33:45 -0700
> Subject: [PATCH] Remove obsolete aliases that broke 'locale -a'
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> [BZ #18412]
> * intl/locale.alias: Remove obsolete aliases "bokmÃ¥l" and "franÃ§ais"
> which caused 'locale -a' to output Latin-1 data in UTF-8 locales,
> breaking some applications that use 'locale -a' output.
> ---
>  ChangeLog         | 7 +++++++
>  intl/locale.alias | 9 +++++++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/ChangeLog b/ChangeLog
> index 6d96ce2..5b96578 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,10 @@
> +2015-05-13  Paul Eggert  <eggert@cs.ucla.edu>
> +
> +	[BZ #18412]
> +	* intl/locale.alias: Remove obsolete aliases "bokmÃ¥l" and "franÃ§ais"

This looks like your ChangeLog entry is not in UTF-8?

The ChangeLog is in UTF-8.

> +	which caused 'locale -a' to output Latin-1 data in UTF-8 locales,
> +	breaking some applications that use 'locale -a' output.
> +
>  2015-05-13  Roland McGrath  <roland@hack.frob.com>
>  
>  	* sysdeps/nacl/fdopendir.c: New file.
> diff --git a/intl/locale.alias b/intl/locale.alias
> index ab1cb7a..4053df6 100644
> --- a/intl/locale.alias
> +++ b/intl/locale.alias
> @@ -24,8 +24,14 @@
>  # backward compatibility.  Nobody should rely on the names defined here.
>  # Locales should always be specified by their full name.
>  
> +# Note: This file used to contain the lines:
> +#	bokmål		nb_NO.ISO-8859-1
> +#	français	fr_FR.ISO-8859-1
> +# but these have been commented out since they cause 'locale -a' to output
> +# text encoded in Latin-1, which breaks applications in UTF-8 locales.  See:
> +# https://sourceware.org/bugzilla/show_bug.cgi?id=18412

Could you please convert this file to UTF-8 instead of ISO-8859?

> +
>  bokmal		nb_NO.ISO-8859-1
> -bokmål		nb_NO.ISO-8859-1
>  catalan		ca_ES.ISO-8859-1
>  croatian	hr_HR.ISO-8859-2
>  czech		cs_CZ.ISO-8859-2
> @@ -36,7 +42,6 @@ dutch		nl_NL.ISO-8859-1
>  eesti		et_EE.ISO-8859-1
>  estonian	et_EE.ISO-8859-1
>  finnish         fi_FI.ISO-8859-1
> -français	fr_FR.ISO-8859-1
>  french		fr_FR.ISO-8859-1
>  galego		gl_ES.ISO-8859-1
>  galician	gl_ES.ISO-8859-1
> --
> 2.1.0

Cheers,
Carlos.
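
The garbled names in the quoted ChangeLog hunk match what appears when UTF-8 bytes are decoded as Latin-1. The following Python sketch reproduces that effect; it is an illustration of the general mechanism, not a diagnosis of any particular mail client or archive.

```python
name = "bokmål"

# UTF-8 encodes the letter å as two bytes, 0xC3 0xA5.
utf8_bytes = name.encode("utf-8")

# Decoding those bytes as Latin-1 turns each byte into its own character.
garbled = utf8_bytes.decode("latin-1")
print(garbled)

# Reversing the wrong decode recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == name)
```

The round trip only works while no bytes have been lost or replaced along the way, so it is a way to recognise the pattern and not a general repair tool.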

Joseph Myers May 14, 2015, 4:53 p.m. UTC | #2

On Thu, 14 May 2015, Carlos O'Donell wrote:

> Could you please convert this file to UTF-8 instead of ISO-8859?

Do we want to do this more generally for sources, except for tests of
other character sets?  Files *.[chS] with other character sets in, apart
from such tests, appear to include:

string/strverscmp.c
sysdeps/i386/fpu/e_atanh.S
sysdeps/i386/fpu/e_atanhl.S
sysdeps/i386/fpu/e_log10.S
sysdeps/i386/fpu/e_log10f.S
sysdeps/i386/fpu/e_log10l.S
sysdeps/i386/fpu/e_log2.S
sysdeps/i386/fpu/e_log2f.S
sysdeps/i386/fpu/e_log2l.S
sysdeps/i386/fpu/e_pow.S
sysdeps/i386/fpu/e_powf.S
sysdeps/i386/fpu/e_powl.S
sysdeps/i386/fpu/s_asinh.S
sysdeps/i386/fpu/s_asinhf.S
sysdeps/i386/fpu/s_asinhl.S
sysdeps/i386/fpu/s_log1p.S
sysdeps/i386/fpu/s_log1pf.S
sysdeps/i386/fpu/s_log1pl.S
sysdeps/ia64/fpu/s_tanf.S
sysdeps/x86_64/fpu/e_log10l.S
sysdeps/x86_64/fpu/e_log2l.S
sysdeps/x86_64/fpu/e_powl.S
sysdeps/x86_64/fpu/s_log1pl.S
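
A list like this can be regenerated with a short script. The Python sketch below walks a source tree and yields `*.c`, `*.h`, and `*.S` files whose contents are not valid UTF-8. The function name `find_non_utf8` is a placeholder, and the sketch makes no attempt to exclude the character-set tests mentioned above, so its output would still need manual filtering.

```python
from pathlib import Path

# Suffixes matching the *.[chS] pattern used in the message above.
SUFFIXES = {".c", ".h", ".S"}


def find_non_utf8(root):
    """Yield paths under root with a listed suffix whose bytes are not valid UTF-8."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in SUFFIXES:
            try:
                path.read_bytes().decode("utf-8")
            except UnicodeDecodeError:
                yield path
```

Calling `list(find_non_utf8("."))` from the top of a checkout would give the candidate files. Pure ASCII files pass the check, since ASCII is a subset of UTF-8.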

Mike Frysinger May 15, 2015, 3:55 a.m. UTC | #3

On 14 May 2015 16:53, Joseph Myers wrote:
> On Thu, 14 May 2015, Carlos O'Donell wrote:
> > Could you please convert this file to UTF-8 instead of ISO-8859?
> 
> Do we want to do this more generally for sources, except for tests of 
> other character sets?

yes, i think we should normalize everything to UTF-8 where appropriate
-mike
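
Normalizing one such file is a small transformation, provided the file really is Latin-1 (ISO-8859-1). That has to be checked by hand, because every byte sequence decodes as Latin-1 without error. A Python sketch, with `latin1_to_utf8` as a hypothetical helper name:

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode Latin-1 bytes as UTF-8; return valid UTF-8 input unchanged."""
    try:
        data.decode("utf-8")
        return data  # already UTF-8 (or pure ASCII), nothing to do
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is Latin-1.
        return data.decode("latin-1").encode("utf-8")
```

The decision is made for the whole buffer at once, so a file that mixes both encodings would need to be handled line by line instead.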

Patch

From 30f91a08a481c1f5f263429544abfeee1285904a Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Wed, 13 May 2015 18:33:45 -0700
Subject: [PATCH] Remove obsolete aliases that broke 'locale -a'
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

[BZ #18412]
* intl/locale.alias: Remove obsolete aliases "bokmål" and "français"
which caused 'locale -a' to output Latin-1 data in UTF-8 locales,
breaking some applications that use 'locale -a' output.
---
 ChangeLog         | 7 +++++++
 intl/locale.alias | 9 +++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 6d96ce2..5b96578 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@ 
+2015-05-13  Paul Eggert  <eggert@cs.ucla.edu>
+
+	[BZ #18412]
+	* intl/locale.alias: Remove obsolete aliases "bokmål" and "français"
+	which caused 'locale -a' to output Latin-1 data in UTF-8 locales,
+	breaking some applications that use 'locale -a' output.
+
 2015-05-13  Roland McGrath  <roland@hack.frob.com>
 
 	* sysdeps/nacl/fdopendir.c: New file.
diff --git a/intl/locale.alias b/intl/locale.alias
index ab1cb7a..4053df6 100644
--- a/intl/locale.alias
+++ b/intl/locale.alias
@@ -24,8 +24,14 @@ 
 # backward compatibility.  Nobody should rely on the names defined here.
 # Locales should always be specified by their full name.
 
+# Note: This file used to contain the lines:
+#	bokmål		nb_NO.ISO-8859-1
+#	français	fr_FR.ISO-8859-1
+# but these have been commented out since they cause 'locale -a' to output
+# text encoded in Latin-1, which breaks applications in UTF-8 locales.  See:
+# https://sourceware.org/bugzilla/show_bug.cgi?id=18412
+
 bokmal		nb_NO.ISO-8859-1
-bokmål		nb_NO.ISO-8859-1
 catalan		ca_ES.ISO-8859-1
 croatian	hr_HR.ISO-8859-2
 czech		cs_CZ.ISO-8859-2
@@ -36,7 +42,6 @@  dutch		nl_NL.ISO-8859-1
 eesti		et_EE.ISO-8859-1
 estonian	et_EE.ISO-8859-1
 finnish         fi_FI.ISO-8859-1
-français	fr_FR.ISO-8859-1
 french		fr_FR.ISO-8859-1
 galego		gl_ES.ISO-8859-1
 galician	gl_ES.ISO-8859-1
-- 
2.1.0