[3/5] localedata: CLDRv28: update LC_ADDRESS.country_name translations

Message ID 20160210201256.GA7732@vapier.lan
State Superseded
Delegated to: Mike Frysinger

Commit Message

Mike Frysinger Feb. 10, 2016, 8:12 p.m. UTC
  On 10 Feb 2016 14:21, Carlos O'Donell wrote:
> On 02/09/2016 04:48 PM, Mike Frysinger wrote:
> > the script lists the original upstream URI and will download it on the
> > fly.  since the db is 13MB compressed and 90MB uncompressed, i'm not
> > sure it's something we want to add to the repo.  even if we try to drop
> > unneeded files, we're still talking 40MB+.
> 
> How about this?
> 
> * Download.
> * Post-process.
> * Check in (a) original source URI and (b) post-processed data we actually used.
> 
> That should limit what we check in and allow us to more easily track backwards
> from: glibc data -> post processed data -> URI (unprocessed data) to verify.
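The download / post-process / check-in flow proposed above could be recorded roughly as follows. This is only a sketch: the `provenance_record` helper, the JSON layout, and the use of SHA-256 digests are assumptions made for illustration and are not part of the script under review.

```python
import hashlib
import json


def provenance_record(source_uri: str, archive: bytes, processed: str) -> str:
    """Return a JSON record tying post-processed data to its upstream archive.

    Hypothetical helper: the field names and digests are illustrative only.
    """
    record = {
        # (a) the original source URI, as suggested in the thread
        "source_uri": source_uri,
        # digest of the raw download, so the large archive need not be committed
        "archive_sha256": hashlib.sha256(archive).hexdigest(),
        # digest of (b) the post-processed data that is actually checked in
        "processed_sha256": hashlib.sha256(processed.encode("utf-8")).hexdigest(),
    }
    return json.dumps(record, indent=2, sort_keys=True)
```

With a record like this next to the post-processed data, someone verifying a locale can walk back from glibc data to the processed data to the upstream URI without the repo carrying the full archive.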

the script supports setting the title/source/address/revision/date fields
in the desc.  i'm on the fence whether we want to blow away the existing
metadata and just use the cldr.  e.g.


then it would be obvious what version of CLDR was used to update the
locale.  the downside is that the file isn't 100% sourced from CLDR,
so it seems like clobbering all the fields is wrong ?

maybe set source to a string like:
source "Mostly based on Unicode Common Locale Data Repository (CLDR)"
-mike
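For illustration, rewriting the identification fields mentioned above might look like the sketch below. It is not the script from this series: `set_identification_fields` and `escape_value` are hypothetical names, and the sketch assumes the file uses `/` as its escape character (which is why URLs in the patch appear with doubled slashes) and that the section ends with `END LC_IDENTIFICATION`.

```python
import re


def escape_value(value: str) -> str:
    """Double each '/' because the locale source is assumed to use '/' as escape char."""
    return value.replace("/", "//")


def set_identification_fields(text: str, fields: dict) -> str:
    """Rewrite the named keyword lines inside the LC_IDENTIFICATION section only."""
    out = []
    in_section = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "LC_IDENTIFICATION":
            in_section = True
        elif stripped == "END LC_IDENTIFICATION":
            in_section = False
        elif in_section:
            # keyword, then whitespace, then the opening quote of the value
            match = re.match(r'(\w+)(\s+)"', line)
            if match and match.group(1) in fields:
                key, gap = match.group(1), match.group(2)
                # keep the original column alignment by reusing the gap
                line = f'{key}{gap}"{escape_value(fields[key])}"'
        out.append(line)
    return "\n".join(out) + "\n"
```

Lines outside LC_IDENTIFICATION are passed through untouched, so a keyword with the same name in another category is not clobbered.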
  

Comments

Carlos O'Donell Feb. 11, 2016, 4:27 a.m. UTC | #1
On 02/10/2016 03:12 PM, Mike Frysinger wrote:
> then it would be obvious what version of CLDR was used to update the
> locale.  the downside is that the file isn't 100% sourced from CLDR,
> so it seems like clobbering all the fields is wrong ?

Per FSF statement [1] the locale files are not copyrightable so IMO
attribution matters only so much as we care to thank the previous
authors for their work. Such previous authors already have attribution
in the Changelog, and IMO need not have any more attribution in the
source file, just like we don't use "Contributed by" anymore.

In summary I suggest:

- Remove the old data.

- Use "Based on Unicode Common Locale Data Repository (CLDR)"

When we get to 100% data coming from CLDR then we'll delete "Based on"
(which implies we have local modifications).

Thoughts?

Cheers,
Carlos.

[1] https://sourceware.org/ml/libc-alpha/2013-02/msg00433.html
  
Florian Weimer Feb. 11, 2016, 9:24 a.m. UTC | #2
On 02/11/2016 05:27 AM, Carlos O'Donell wrote:
> On 02/10/2016 03:12 PM, Mike Frysinger wrote:
>> then it would be obvious what version of CLDR was used to update the
>> locale.  the downside is that the file isn't 100% sourced from CLDR,
>> so it seems like clobbering all the fields is wrong ?
> 
> Per FSF statement [1] the locale files are not copyrightable so IMO
> attribution matters only so much as we care to thank the previous
> authors for their work. Such previous authors already have attribution
> in the Changelog, and IMO need not have any more attribution in the
> source file, just like we don't use "Contributed by" anymore.

unicode.org claims copyright on CLDR data:

  <http://unicode.org/repos/cldr/trunk/unicode-license.txt>

The terms do not appear to be too onerous, but I would recommend to
obtain FSF (and internal) sign-off before incorporating data directly
from CLDR into glibc.

Florian
  
keld@keldix.com Feb. 11, 2016, 9:46 a.m. UTC | #3
On Wed, Feb 10, 2016 at 11:27:48PM -0500, Carlos O'Donell wrote:
> On 02/10/2016 03:12 PM, Mike Frysinger wrote:
> > then it would be obvious what version of CLDR was used to update the
> > locale.  the downside is that the file isn't 100% sourced from CLDR,
> > so it seems like clobbering all the fields is wrong ?
> 
> Per FSF statement [1] the locale files are not copyrightable so IMO
> attribution matters only so much as we care to thank the previous
> authors for their work. Such previous authors already have attribution
> in the Changelog, and IMO need not have any more attribution in the
> source file, just like we don't use "Contributed by" anymore.
> 
> In summary I suggest:
> 
> - Remove the old data.
> 
> - Use "Based on Unicode Common Locale Data Repository (CLDR)"
> 
> When we get to 100% data coming from CLDR then we'll delete "Based on"
> (which implies we have local modifications).
> 
> Thoughts?
> 
> Cheers,
> Carlos.
> 
> https://sourceware.org/ml/libc-alpha/2013-02/msg00433.html

I dispute the FSF reasoning about copyright. There is a lot of intellectual
work in making locales.

I also doubt that the FSF still thinks that locales are without copyright.
That is my impression from talking with them.

best regards
Keld
  
Mike Frysinger Feb. 11, 2016, 12:39 p.m. UTC | #4
On 11 Feb 2016 10:24, Florian Weimer wrote:
> On 02/11/2016 05:27 AM, Carlos O'Donell wrote:
> > On 02/10/2016 03:12 PM, Mike Frysinger wrote:
> >> then it would be obvious what version of CLDR was used to update the
> >> locale.  the downside is that the file isn't 100% sourced from CLDR,
> >> so it seems like clobbering all the fields is wrong ?
> > 
> > Per FSF statement [1] the locale files are not copyrightable so IMO
> > attribution matters only so much as we care to thank the previous
> > authors for their work. Such previous authors already have attribution
> > in the Changelog, and IMO need not have any more attribution in the
> > source file, just like we don't use "Contributed by" anymore.
> 
> unicode.org claims copyright on CLDR data:
> 
>   <http://unicode.org/repos/cldr/trunk/unicode-license.txt>

perhaps on all the files and their collection and additional data, but
you can't copyright facts.  i'm not committing their files.

> The terms do not appear to be too onerous, but I would recommend to
> obtain FSF (and internal) sign-off before incorporating data directly
> from CLDR into glibc.

covering our bases makes sense, but i don't see a problem here.  i'm
querying their db for an uncopyrightable fact ("what is the weekend
start day -> saturday") and then merging the answer "saturday" into
the data files on our side.

also, we're already doing this today: localedata/unicode-gen/ parses
the unicode.org databases to generate the list of valid characters.
-mike
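The "query the db for one fact, merge the answer" step described above might be sketched like this. The XML below is a made-up sample, not real CLDR data; the `weekendStart` element, its `day`/`territories` attributes, and the `001` world fallback are assumptions modeled on CLDR's supplemental week data, and `weekend_start` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; real values must come from the CLDR release itself.
SAMPLE = """<supplementalData>
  <weekData>
    <weekendStart day="fri" territories="AE BH"/>
    <weekendStart day="sat" territories="001"/>
  </weekData>
</supplementalData>"""


def weekend_start(xml_text: str, territory: str) -> str:
    """Return the weekend start day for a territory, falling back to world ("001")."""
    root = ET.fromstring(xml_text)
    fallback = None
    for node in root.iter("weekendStart"):
        territories = node.get("territories", "").split()
        if territory in territories:
            return node.get("day")
        if "001" in territories:
            fallback = node.get("day")
    if fallback is None:
        raise KeyError(territory)
    return fallback
```

Only the single answer (for example `sat`) would then be merged into the locale file on the glibc side; none of the upstream files are copied.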
  
Florian Weimer Feb. 11, 2016, 12:46 p.m. UTC | #5
On 02/11/2016 01:39 PM, Mike Frysinger wrote:
> On 11 Feb 2016 10:24, Florian Weimer wrote:
>> On 02/11/2016 05:27 AM, Carlos O'Donell wrote:
>>> On 02/10/2016 03:12 PM, Mike Frysinger wrote:
>>>> then it would be obvious what version of CLDR was used to
>>>> update the locale.  the downside is that the file isn't 100%
>>>> sourced from CLDR, so it seems like clobbering all the fields
>>>> is wrong ?
>>> 
>>> Per FSF statement [1] the locale files are not copyrightable so
>>> IMO attribution matters only so much as we care to thank the
>>> previous authors for their work. Such previous authors already
>>> have attribution in the Changelog, and IMO need not have any
>>> more attribution in the source file, just like we don't use
>>> "Contributed by" anymore.
>> 
>> unicode.org claims copyright on CLDR data:
>> 
>> <http://unicode.org/repos/cldr/trunk/unicode-license.txt>
> 
> perhaps on all the files and their collection and additional data,
> but you can't copyright facts.

There are some jurisdictions where the legislator intends that
creators of collections of facts have some exclusive rights regarding
the use of these collections.  Details are extremely hairy, see the EU
Database Directive.

I have no idea if any of this applies to CLDR, and I'm not qualified
to determine this.  But the statement, “you can't copyright facts” is
only true under a very narrow interpretation of the terms “you”,
“can't”, “copyright”, and “facts”.

Florian
  
Mike Frysinger Feb. 11, 2016, 12:49 p.m. UTC | #6
On 11 Feb 2016 10:46, keld@keldix.com wrote:
> On Wed, Feb 10, 2016 at 11:27:48PM -0500, Carlos O'Donell wrote:
> > On 02/10/2016 03:12 PM, Mike Frysinger wrote:
> > > then it would be obvious what version of CLDR was used to update the
> > > locale.  the downside is that the file isn't 100% sourced from CLDR,
> > > so it seems like clobbering all the fields is wrong ?
> > 
> > Per FSF statement [1] the locale files are not copyrightable so IMO
> > attribution matters only so much as we care to thank the previous
> > authors for their work. Such previous authors already have attribution
> > in the Changelog, and IMO need not have any more attribution in the
> > source file, just like we don't use "Contributed by" anymore.
> > 
> > In summary I suggest:
> > 
> > - Remove the old data.
> > 
> > - Use "Based on Unicode Common Locale Data Repository (CLDR)"
> > 
> > When we get to 100% data coming from CLDR then we'll delete "Based on"
> > (which implies we have local modifications).
> > 
> > https://sourceware.org/ml/libc-alpha/2013-02/msg00433.html
> 
> I dispute the FSF reasoning about copyright. There is a lot of intellectual
> work in making locales.

that the work is hard and takes time is kind of irrelevant.  you can't
copyright facts, nor should you be able to.  i doubt anyone thinks the
timezone database does not require "a lot of intellectual work", yet
we agree (or at least the courts do) that it is public domain.

if i wanted to know the translations for the weekdays in language X
and looked/copied the values out of glibc's localedata, that is not
something you can copyright.  claiming copyright on "Mittwoch" or on
"水曜日" is ridiculous.
-mike
  
Carlos O'Donell Feb. 11, 2016, 3:04 p.m. UTC | #7
On 02/11/2016 04:24 AM, Florian Weimer wrote:
> On 02/11/2016 05:27 AM, Carlos O'Donell wrote:
>> On 02/10/2016 03:12 PM, Mike Frysinger wrote:
>>> then it would be obvious what version of CLDR was used to update the
>>> locale.  the downside is that the file isn't 100% sourced from CLDR,
>>> so it seems like clobbering all the fields is wrong ?
>>
>> Per FSF statement [1] the locale files are not copyrightable so IMO
>> attribution matters only so much as we care to thank the previous
>> authors for their work. Such previous authors already have attribution
>> in the Changelog, and IMO need not have any more attribution in the
>> source file, just like we don't use "Contributed by" anymore.
> 
> unicode.org claims copyright on CLDR data:
> 
>   <http://unicode.org/repos/cldr/trunk/unicode-license.txt>
> 
> The terms do not appear to be too onerous, but I would recommend to
> obtain FSF (and internal) sign-off before incorporating data directly
> from CLDR into glibc.

Does this position not make it clear?
https://sourceware.org/ml/libc-locales/2013-q1/msg00048.html

For the Unicode 8.0 update and this CLDR update we should be
stripping all conflicting copyright notices and adding:

% This file is part of the GNU C Library and contains locale data.
% The Free Software Foundation does not claim any copyright interest
% in the locale data contained in this file.  The foregoing does not
% affect the license of the GNU C Library as a whole.  It does not
% exempt you from the conditions of the license if your use would
% otherwise be governed by that license.

Per the FSF request?

I don't see that we need to keep making legal requests unless the
FSF changes their position.

Cheers,
Carlos.
  
Mike Frysinger Feb. 11, 2016, 3:07 p.m. UTC | #8
On 11 Feb 2016 10:04, Carlos O'Donell wrote:
> On 02/11/2016 04:24 AM, Florian Weimer wrote:
> > On 02/11/2016 05:27 AM, Carlos O'Donell wrote:
> >> On 02/10/2016 03:12 PM, Mike Frysinger wrote:
> >>> then it would be obvious what version of CLDR was used to update the
> >>> locale.  the downside is that the file isn't 100% sourced from CLDR,
> >>> so it seems like clobbering all the fields is wrong ?
> >>
> >> Per FSF statement [1] the locale files are not copyrightable so IMO
> >> attribution matters only so much as we care to thank the previous
> >> authors for their work. Such previous authors already have attribution
> >> in the Changelog, and IMO need not have any more attribution in the
> >> source file, just like we don't use "Contributed by" anymore.
> > 
> > unicode.org claims copyright on CLDR data:
> > 
> >   <http://unicode.org/repos/cldr/trunk/unicode-license.txt>
> > 
> > The terms do not appear to be too onerous, but I would recommend to
> > obtain FSF (and internal) sign-off before incorporating data directly
> > from CLDR into glibc.
> 
> Does this position not make it clear?
> https://sourceware.org/ml/libc-locales/2013-q1/msg00048.html
> 
> For the Unicode 8.0 update and this CLDR update we should be
> stripping all conflicting copyright notices and adding:
> 
> % This file is part of the GNU C Library and contains locale data.
> % The Free Software Foundation does not claim any copyright interest
> % in the locale data contained in this file.  The foregoing does not
> % affect the license of the GNU C Library as a whole.  It does not
> % exempt you from the conditions of the license if your use would
> % otherwise be governed by that license.
> 
> Per the FSF request?
> 
> I don't see that we need to keep making legal requests unless the
> FSF changes their position.

makes sense to me

i'm hacking on a linter of sorts for the locale data files to check for
common issues.  i can add this logic to that.
-mike
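A linter check for the notice quoted above could be as small as the sketch below. The notice text is copied from this thread; `has_fsf_notice` is a hypothetical name, and the real linter may well match more loosely than this exact line-by-line comparison.

```python
# The six-line notice as quoted in this thread (double spaces preserved).
FSF_NOTICE = (
    "% This file is part of the GNU C Library and contains locale data.",
    "% The Free Software Foundation does not claim any copyright interest",
    "% in the locale data contained in this file.  The foregoing does not",
    "% affect the license of the GNU C Library as a whole.  It does not",
    "% exempt you from the conditions of the license if your use would",
    "% otherwise be governed by that license.",
)


def has_fsf_notice(text: str) -> bool:
    """Return True if the notice appears as consecutive lines anywhere in text."""
    lines = [line.rstrip() for line in text.splitlines()]
    n = len(FSF_NOTICE)
    return any(
        tuple(lines[i:i + n]) == FSF_NOTICE
        for i in range(len(lines) - n + 1)
    )
```

Trailing whitespace is ignored, but re-wrapped or reworded notices would be reported as missing, which is probably what a linter wants here.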
  
Florian Weimer Feb. 11, 2016, 5:42 p.m. UTC | #9
On 02/11/2016 04:04 PM, Carlos O'Donell wrote:
> On 02/11/2016 04:24 AM, Florian Weimer wrote:
>> On 02/11/2016 05:27 AM, Carlos O'Donell wrote:
>>> On 02/10/2016 03:12 PM, Mike Frysinger wrote:
>>>> then it would be obvious what version of CLDR was used to update the
>>>> locale.  the downside is that the file isn't 100% sourced from CLDR,
>>>> so it seems like clobbering all the fields is wrong ?
>>>
>>> Per FSF statement [1] the locale files are not copyrightable so IMO
>>> attribution matters only so much as we care to thank the previous
>>> authors for their work. Such previous authors already have attribution
>>> in the Changelog, and IMO need not have any more attribution in the
>>> source file, just like we don't use "Contributed by" anymore.
>>
>> unicode.org claims copyright on CLDR data:
>>
>>   <http://unicode.org/repos/cldr/trunk/unicode-license.txt>
>>
>> The terms do not appear to be too onerous, but I would recommend to
>> obtain FSF (and internal) sign-off before incorporating data directly
>> from CLDR into glibc.
> 
> Does this position not make it clear?
> https://sourceware.org/ml/libc-locales/2013-q1/msg00048.html

It deals with a different question.  The FSF apparently disclaims
copyright ownership of the glibc locale data.  It does not actually say
whether aggregated locale data can be the subject of a sui generis
database right (a question which the FSF would not have any say in
anyway), or if the FSF claims such rights on the glibc locale data
(probably not, but the message and permission notice do not say so
explicitly).

Florian
  
Carlos O'Donell Feb. 11, 2016, 7:20 p.m. UTC | #10
On 02/11/2016 07:49 AM, Mike Frysinger wrote:
> if i wanted to know the translations for the weekdays in language X
> and looked/copied the values out of glibc's localedata, that is not
> something you can copyright.  claiming copyright on "Mittwoch" or on
> "水曜日" is ridiculous.

This is not for us to decide or discuss. The FSF has taken a position
and indicated to us how our project should proceed.

I suggest we not discuss this any further, unless it's to discuss
reaching out to the FSF to clarify again the same question we had
answered in 2013.

Cheers,
Carlos.
  
Mike Frysinger Feb. 12, 2016, 9:33 p.m. UTC | #11
On 11 Feb 2016 14:20, Carlos O'Donell wrote:
> On 02/11/2016 07:49 AM, Mike Frysinger wrote:
> > if i wanted to know the translations for the weekdays in language X
> > and looked/copied the values out of glibc's localedata, that is not
> > something you can copyright.  claiming copyright on "Mittwoch" or on
> > "水曜日" is ridiculous.
> 
> This is not for us to decide or discuss. The FSF has taken a position
> and indicated to us how our project should proceed.
> 
> I suggest we not discuss this any further, unless it's to discuss
> reaching out to the FSF to clarify again the same question we had
> answered in 2013.

i'll keep hacking on my python logic as i think it'll be useful
regardless of cldr (e.g. the linting aspect).  i've noticed that
the "week" setting has already regressed in some locales as was
pointed out years ago by Petr.
	https://sourceware.org/ml/libc-locales/2009-q1/msg00011.html
and whenever someone contributes a new file, we really have no
way of doing any sort of automated validation which is bad juju.
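As a rough idea of what automated validation of the "week" setting could look like, here is a sketch. It assumes the keyword has the form `week <days>;<yyyymmdd>;<min-days>`; that layout, the range checks, and the `check_week` name are assumptions for illustration, not the actual linter logic.

```python
import re

# Assumed shape: week <days-in-week>;<reference date yyyymmdd>;<min days in first week>
WEEK_RE = re.compile(r"^week\s+(\d+);(\d{8});(\d+)\s*$")


def check_week(text: str) -> list:
    """Return a list of problems found with the 'week' keyword (empty if none)."""
    lines = [ln.strip() for ln in text.splitlines() if re.match(r"\s*week\s", ln)]
    if not lines:
        return ["missing week keyword"]
    problems = []
    for ln in lines:
        match = WEEK_RE.match(ln)
        if not match:
            problems.append(f"malformed week line: {ln}")
            continue
        days, min_days = int(match.group(1)), int(match.group(3))
        if days != 7:
            problems.append(f"unexpected days per week: {days}")
        if not 1 <= min_days <= 7:
            problems.append(f"min days out of range: {min_days}")
    return problems
```

Run over every file in localedata/locales, a check like this would flag locales where the setting is missing or malformed before a new contribution lands.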

beyond that, i'll post updates using cldr operating under the
assumption that the link you posted earlier permits this.  i
believe it does as well.

if someone thinks more clarification/consultation is needed, then
i would say "you drive that and let us know how it goes".

as for my personal opinions posted earlier, those are obviously
mine and not an official statement by the FSF or for the GLIBC
stewards (which i am not one of).  i have no problem posting my
personal statements ;).
-mike
  
Carlos O'Donell Feb. 15, 2016, 12:24 a.m. UTC | #12
On 02/11/2016 12:42 PM, Florian Weimer wrote:
> On 02/11/2016 04:04 PM, Carlos O'Donell wrote:
>> Does this position not make it clear?
>> https://sourceware.org/ml/libc-locales/2013-q1/msg00048.html
> 
> It deals with a different question.  The FSF apparently disclaims
> copyright ownership of the glibc locale data.  It does not actually say
> whether aggregated locale data can be the subject of a sui generis
> database right (a question which the FSF would not have any say in
> anyway), or if the FSF claims such rights on the glibc locale data
> (probably not, but the message and permission notice do not say so
> explicitly).

Conceptually to me it seems no different than making changes to
individual files, not recording copyright, and not asking the 
contributor to disclaim copyright to the FSF. Though I admit I
see where you are coming from and there might be a risk there.
Particularly because we have entire copies of the Unicode data
files in our source tree.

It would seem to me that we would have to include an attribution
section in the manual and list the Unicode license (since CLDR
is covered under it).

I suggest we continue the work and I will ask FSF legal to comment
on the issue of needing an attribution for the use of the Unicode
data files. I am still of the opinion that the original statement
from the FSF is enough guidance, to continue the work Mike is doing,
but it doesn't hurt to get clarification.

Cheers,
Carlos.
  
Florian Weimer Feb. 22, 2016, 11:12 a.m. UTC | #13
* Carlos O'Donell:

> I suggest we continue the work and I will ask FSF legal to comment
> on the issue of needing an attribution for the use of the Unicode
> data files. I am still of the opinion that the original statement
> from the FSF is enough guidance, to continue the work Mike is doing,
> but it doesn't hurt to get clarification.

Please also point out to them that ISO currently seems to re-sell glibc
locale data under a restrictive license.  This is probably not what
the FSF wanted to enable when it disclaimed copyright on glibc locale
data.
  
keld@keldix.com Feb. 22, 2016, 11:18 a.m. UTC | #14
On Mon, Feb 22, 2016 at 12:12:00PM +0100, Florian Weimer wrote:
> * Carlos O'Donell:
> 
> > I suggest we continue the work and I will ask FSF legal to comment
> > on the issue of needing an attribution for the use of the Unicode
> > data files. I am still of the opinion that the original statement
> > from the FSF is enough guidance, to continue the work Mike is doing,
> > but it doesn't hurt to get clarification.
> 
> Please also point out to them that ISO currently seems to re-sell glibc
> locale data under a restrictive license.  This is probably not what
> the FSF wanted to enable when it disclaimed copyright on glibc locale
> data.


How does ISO resell these data, and where?

Best regards
Keld
  
Florian Weimer Feb. 22, 2016, 11:34 a.m. UTC | #15
* Keld Simonsen:

> On Mon, Feb 22, 2016 at 12:12:00PM +0100, Florian Weimer wrote:
>> * Carlos O'Donell:
>> 
>> > I suggest we continue the work and I will ask FSF legal to comment
>> > on the issue of needing an attribution for the use of the Unicode
>> > data files. I am still of the opinion that the original statement
>> > from the FSF is enough guidance, to continue the work Mike is doing,
>> > but it doesn't hurt to get clarification.
>> 
>> Please also point out to them that ISO currently seems to re-sell glibc
>> locale data under a restrictive license.  This is probably not what
>> the FSF wanted to enable when it disclaimed copyright on glibc locale
>> data.

> How does ISO resell these data, and where?

A while back, you wrote this:

From: keld@keldix.com
Subject: Re: Should glibc provide a builtin C.UTF-8 locale?
Date: Tue, 27 Oct 2015 14:10:38 +0100 (16 weeks, 5 days, 22 hours ago)
Message-ID: <20151027131038.GB23833@www5.open-std.org>

| Yes, ISO TR 30112 i18n and glibc i18n are essentially the same, as 
| ISO 30112 builds on a bit old copy of glibc i18n locale.
| In turn the glibc i18n  locale was built on ISO TR 14652 i18n 
| locale, so this is a fruitful relation. ISO 30112 is the followup
| spec on ISO 14652, and ISO 30112 has catched up with some glibc development.

<https://sourceware.org/ml/libc-alpha/2015-10/msg00958.html>
  
keld@keldix.com Feb. 22, 2016, 1:07 p.m. UTC | #16
On Mon, Feb 22, 2016 at 12:34:30PM +0100, Florian Weimer wrote:
> * Keld Simonsen:
> 
> > On Mon, Feb 22, 2016 at 12:12:00PM +0100, Florian Weimer wrote:
> >> * Carlos O'Donell:
> >> 
> >> > I suggest we continue the work and I will ask FSF legal to comment
> >> > on the issue of needing an attribution for the use of the Unicode
> >> > data files. I am still of the opinion that the original statement
> >> > from the FSF is enough guidance, to continue the work Mike is doing,
> >> > but it doesn't hurt to get clarification.
> >> 
> >> Please also point out to them that ISO currently seems to re-sell glibc
> >> locale data under a restrictive license.  This is probably not what
> >> the FSF wanted to enable when it disclaimed copyright on glibc locale
> >> data.
> 
> > How does ISO resell these data, and where?
> 
> A while back, you wrote this:
> 
> From: keld@keldix.com
> Subject: Re: Should glibc provide a builtin C.UTF-8 locale?
> Date: Tue, 27 Oct 2015 14:10:38 +0100 (16 weeks, 5 days, 22 hours ago)
> Message-ID: <20151027131038.GB23833@www5.open-std.org>
> 
> | Yes, ISO TR 30112 i18n and glibc i18n are essentially the same, as 
> | ISO 30112 builds on a bit old copy of glibc i18n locale.
> | In turn the glibc i18n  locale was built on ISO TR 14652 i18n 
> | locale, so this is a fruitful relation. ISO 30112 is the followup
> | spec on ISO 14652, and ISO 30112 has catched up with some glibc development.
> 
> <https://sourceware.org/ml/libc-alpha/2015-10/msg00958.html>

Oh well, I did have a thought that it was one of my own texts:-)

Well, the data in both 14652 and 30112 bear a GPL license. 30112 WD10 page 8 says:
"The "i18n" FDCC-set and its parts are released under the GNU Public License, version 2, 
as it is taken from glibc sources" 

But if FSF does not put a license on the locales, as they might think this is
not appropriate, then that would not be so relevant... I would keep the copyrights
anyway in 30112, because we then can lift it out of ISO/IEC copyrights.

I note that the 14652 data predates the FSF mail, and the glibc data
was bearing a GPL license before it was incorporated into 14652. 30112 then copies 14652,
including the license.

I also note that I think the locales meet the threshold of originality in the
copyright sense, because the whole scheme of the locales, and their being
tailorable and character set independent, is almost a work of art:-) Some of
the techniques would be patentable, if we did not know better.

I also wrote this at the time the FSF statement was published, but I would
state that each piece of information in the locales is just information, and you
cannot copyright information.  This also applies to Unicode data: the information
in it is not copyrightable, but the collection is.

Best regards
Keld
  
Carlos O'Donell Feb. 22, 2016, 3:46 p.m. UTC | #17
On 02/22/2016 06:12 AM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> I suggest we continue the work and I will ask FSF legal to comment
>> on the issue of needing an attribution for the use of the Unicode
>> data files. I am still of the opinion that the original statement
>> from the FSF is enough guidance, to continue the work Mike is doing,
>> but it doesn't hurt to get clarification.
> 
> Please also point out to them that ISO currently seems to re-sell glibc
> locale data under a restrictive license.  This is probably not what
> the FSF wanted to enable when it disclaimed copyright on glibc locale
> data.

Could you expand on this please? What exactly are they doing and how
can I verify this?

Cheers,
Carlos.
  
Florian Weimer Feb. 22, 2016, 6:35 p.m. UTC | #18
* Carlos O'Donell:

> On 02/22/2016 06:12 AM, Florian Weimer wrote:
>> * Carlos O'Donell:
>> 
>>> I suggest we continue the work and I will ask FSF legal to comment
>>> on the issue of needing an attribution for the use of the Unicode
>>> data files. I am still of the opinion that the original statement
>>> from the FSF is enough guidance, to continue the work Mike is doing,
>>> but it doesn't hurt to get clarification.
>> 
>> Please also point out to them that ISO currently seems to re-sell glibc
>> locale data under a restrictive license.  This is probably not what
>> the FSF wanted to enable when it disclaimed copyright on glibc locale
>> data.
>
> Could you expand on this please? What exactly are they doing and how
> can I verify this?

See my parallel response to Keld.

It seems you can find some of these documents with a web search for
"it is taken from glibc sources" (*with* quotes).
  
Carlos O'Donell Feb. 23, 2016, 3:43 a.m. UTC | #19
On 02/22/2016 01:35 PM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> On 02/22/2016 06:12 AM, Florian Weimer wrote:
>>> * Carlos O'Donell:
>>>
>>>> I suggest we continue the work and I will ask FSF legal to comment
>>>> on the issue of needing an attribution for the use of the Unicode
>>>> data files. I am still of the opinion that the original statement
>>>> from the FSF is enough guidance, to continue the work Mike is doing,
>>>> but it doesn't hurt to get clarification.
>>>
>>> Please also point out to them that ISO currently seems to re-sell glibc
>>> locale data under a restrictive license.  This is probably not what
>>> the FSF wanted to enable when it disclaimed copyright on glibc locale
>>> data.
>>
>> Could you expand on this please? What exactly are they doing and how
>> can I verify this?
> 
> See my parallel response to Keld.
> 
> It seems you can find some of these documents with a web search for
> "it is taken from glibc sources" (*with* quotes).

Thanks. I do indeed see ISO/IEC TR 30112:2014 for sale (chf 198).

http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53232

However, I can't confirm that the working draft is identical to
the final published copy; the glibc-relevant i18n data might
have been removed before publication (doubtful).

I have not emailed fsf legal about this issue, and I don't
plan to do so because it might confuse the issue, and I myself
am not a lawyer so I can't make any judgement here.

Cheers,
Carlos.
  
keld@keldix.com Feb. 23, 2016, 9:04 a.m. UTC | #20
On Mon, Feb 22, 2016 at 10:43:29PM -0500, Carlos O'Donell wrote:
> On 02/22/2016 01:35 PM, Florian Weimer wrote:
> > * Carlos O'Donell:
> > 
> >> On 02/22/2016 06:12 AM, Florian Weimer wrote:
> >>> * Carlos O'Donell:
> >>>
> >>>> I suggest we continue the work and I will ask FSF legal to comment
> >>>> on the issue of needing an attribution for the use of the Unicode
> >>>> data files. I am still of the opinion that the original statement
> >>>> from the FSF is enough guidance, to continue the work Mike is doing,
> >>>> but it doesn't hurt to get clarification.
> >>>
> >>> Please also point out to them that ISO currently seems to re-sell glibc
> >>> locale data under a restrictive license.  This is probably not what
> >>> the FSF wanted to enable when it disclaimed copyright on glibc locale
> >>> data.
> >>
> >> Could you expand on this please? What exactly are they doing and how
> >> can I verify this?
> > 
> > See my parallel response to Keld.
> > 
> > It seems you can find some of these documents with a web search for
> > "it is taken from glibc sources" (*with* quotes).
> 
> Thanks. I do indeed see ISO/IEC TR 30112:2014 for sale (chf 198).
> 
> http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53232
> 
> However, I can't confirm that the working draft is identical to
> the final published copy; the glibc-relevant i18n data might
> have been removed before publication (doubtful).

The WD 10 of 30112 is an enhanced version of TR 30112:2014
http://www.open-std.org/JTC1/SC35/WG5/docs/30112d10.pdf
No data has been removed in either the published TR nor the WD.

See also my previous post on this subject yesterday.
There is a GPL licence on all the data in 30112 taken from glibc.

Best regards
Keld
  
Carlos O'Donell Feb. 23, 2016, 3:41 p.m. UTC | #21
On 02/23/2016 04:04 AM, Keld Simonsen wrote:
>> However, I can't confirm that the working draft is identical to
>> the final published copy; the glibc-relevant i18n data might
>> have been removed before publication (doubtful).
> 
> The WD 10 of 30112 is an enhanced version of TR 30112:2014
> http://www.open-std.org/JTC1/SC35/WG5/docs/30112d10.pdf
> No data has been removed in either the published TR nor the WD.
> 
> See also my previous post on this subject yesterday.
> There is a GPL licence on all the data in 30112 taken from glibc.

I assume you mean page 8?

"The complete "i18n" FDCC-set is defined as the sum of the "i18n"
 categories specified in the clauses below. The ”i18n”
 FDCC-set and its parts are released under the GNU Public License,
 version 2, as it is taken from glibc sources."

Does this conflict with the "Copyright ISO/IEC 2014 All Rights
Reserved" on the bottom of every page?

Cheers,
Carlos.
  
keld@keldix.com Feb. 23, 2016, 6:16 p.m. UTC | #22
On Tue, Feb 23, 2016 at 10:41:25AM -0500, Carlos O'Donell wrote:
> On 02/23/2016 04:04 AM, Keld Simonsen wrote:
> >> However, I can't confirm that the working draft is identical to
> >> the final published copy; the glibc-relevant i18n data might
> >> have been removed before publication (doubtful).
> > 
> > The WD 10 of 30112 is an enhanced version of TR 30112:2014
> > http://www.open-std.org/JTC1/SC35/WG5/docs/30112d10.pdf
> > No data has been removed in either the published TR nor the WD.
> > 
> > See also my previous post on this subject yesterday.
> > There is a GPL licence on all the data in 30112 taken from glibc.
> 
> I assume you mean page 8?
> 
> "The complete "i18n" FDCC-set is defined as the sum of the "i18n"
>  categories specified in the clauses below. The ”i18n”
>  FDCC-set and its parts are released under the GNU Public License,
>  version 2, as it is taken from glibc sources."
> 
> Does this conflict with the "Copyright ISO/IEC 2014 All Rights
> Reserved" on the bottom of every page?

Other licenses, such as open source licenses, are used in
many ISO standards. I see it as a dual license: the ISO copyright
and the GPL license.

Best regards
Keld
  
Mike Frysinger Feb. 23, 2016, 6:46 p.m. UTC | #23
On 23 Feb 2016 19:16, Keld Simonsen wrote:
> On Tue, Feb 23, 2016 at 10:41:25AM -0500, Carlos O'Donell wrote:
> > On 02/23/2016 04:04 AM, Keld Simonsen wrote:
> > >> However, I can't confirm that the working draft is identical to
> > >> the final published copy; the glibc-relevant i18n data might
> > >> have been removed before publication (doubtful).
> > > 
> > > The WD 10 of 30112 is an enhanced version of TR 30112:2014
> > > http://www.open-std.org/JTC1/SC35/WG5/docs/30112d10.pdf
> > > No data has been removed in either the published TR nor the WD.
> > > 
> > > See also my previous post on this subject yesterday.
> > > There is a GPL licence on all the data in 30112 taken from glibc.
> > 
> > I assume you mean page 8?
> > 
> > "The complete "i18n" FDCC-set is defined as the sum of the "i18n"
> >  categories specified in the clauses below. The ”i18n”
> >  FDCC-set and its parts are released under the GNU Public License,
> >  version 2, as it is taken from glibc sources."
> > 
> > Does this conflict with the "Copyright ISO/IEC 2014 All Rights
> > Reserved" on the bottom of every page?
> 
> Other licenses, such as open source licenses, are used in
> many ISO standards. I see it as a dual license: the ISO copyright
> and the GPL license.

copyrights aren't licenses.  "all rights reserved" is meaningless
splatter that companies insist on posting everywhere still.

https://en.wikipedia.org/wiki/All_rights_reserved#Obsolescence
-mike
  

Patch

--- a/localedata/locales/en_US
+++ b/localedata/locales/en_US
@@ -5,16 +5,16 @@  comment_char %
 
 LC_IDENTIFICATION
 title      "English locale for the USA"
-source     "Free Software Foundation, Inc."
-address    "http:////www.gnu.org//software//libc//"
-contact    ""
+source     "Unicode Common Locale Data Repository (CLDR)"
+address    "http:////unicode.org//Public//cldr//28//core.zip"
+contact    "http:////cldr.unicode.org//index//process"
 email      "bug-glibc-locales@gnu.org"
 tel        ""
 fax        ""
 language   "American English"
 territory  "United States"
-revision   "1.0"
-date       "2000-06-24"
+revision   "28"
+date       "2015-09-16"
 %
 category  "en_US:2000";LC_IDENTIFICATION
 category  "en_US:2000";LC_CTYPE