[v2] localedata: Add Minguo calendar support to zh_TW

Message ID 20190301192730.10959-1-felixonmars@archlinux.org
State Superseded

Commit Message

Felix Yan March 1, 2019, 7:27 p.m. UTC
The Minguo calendar is the official calendar system of Taiwan and is very
widely used there; it would be nice to have support for it in glibc.

Some background information: the government website (www.gov.tw) uses it
without the prefix, and popular public services such as Taiwan HSR also use
this calendar system.

Link to Wikipedia: https://en.wikipedia.org/wiki/Minguo_calendar
---
 localedata/locales/zh_TW | 4 ++++
 1 file changed, 4 insertions(+)
  

Comments

Rafal Luzynski March 2, 2019, 3:49 p.m. UTC | #1
Felix,

Thank you for the update.  Please see the results of my tests.  First,
unpatched version:


$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey
19
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY
2019
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1992-01-01
92
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1992-01-01
1992
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1913-01-01
13
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1913-01-01
1913
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1912-12-31
12
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1912-12-31
1912
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1912-01-01
12
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1912-01-01
1912
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1911-12-31
11
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1911-12-31
1911
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1901-12-31
01
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1901-12-31
1901


Now the same commands with your patch applied:


$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey
108
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY
民國108年
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1992-01-01
81
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1992-01-01
民國81年
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1913-01-01
02
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1913-01-01
民國02年
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1912-12-31
01
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1912-12-31
民國元年
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1912-01-01
01
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1912-01-01
民國元年
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1911-12-31
01
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1911-12-31
民前01年
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1901-12-31
11
$ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1901-12-31
民前11年


Please confirm this is what you wanted to achieve.  Since I cannot
read Chinese script, I can only verify the numbers; for the rest I
have to trust you that it is correct.
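
The numbers in the patched output appear to follow a simple rule: from 1912 onwards the era year is the Gregorian year minus 1911, and before 1912 the years count backwards from 1911. A minimal Python sketch of that rule, written only to cross-check the figures above, follows; the function names are hypothetical and this is not how glibc computes the value internally.

```python
from typing import Tuple


def minguo_era_year(gregorian_year: int) -> Tuple[str, int]:
    """Return (era name, year within the era) for a Gregorian year."""
    if gregorian_year >= 1912:
        # Minguo year 1 is 1912, so the offset from the Gregorian year is 1911.
        return ("民國", gregorian_year - 1911)
    # Years before 1912 count backwards: 1911 is year 1 "before Minguo".
    return ("民前", 1912 - gregorian_year)


def format_era_year(gregorian_year: int) -> str:
    """Mimic the patched %EY output shown above, including the zero padding."""
    era, year = minguo_era_year(gregorian_year)
    if era == "民國" and year == 1:
        # The first year is written with 元 instead of a digit.
        return "民國元年"
    return f"{era}{year:02d}年"


for y in (2019, 1992, 1913, 1912, 1911, 1901):
    print(y, format_era_year(y))
```

The loop covers the same years as the test commands, so its output can be compared line by line with the %EY results.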

Now please look inside the localedata/locales directory. [1] There
are 5 other locales for Taiwan: cmn_TW, hak_TW, lzh_TW, nan_TW,
and nan_TW@latin.  If the Minguo calendar is popular in Taiwan, do
you think those other locales need the same update as well?
If so, can you please update them, and maybe squash all changes into a
single patch?

Also please see below:

1.03.2019 20:27 Felix Yan <felixonmars@archlinux.org> wrote:
> 
> Minguo calendar is the official calendar system, and very widely used in
> Taiwan, it would be nice to have the support in glibc.
> 
> Some background information: The government website (www.gov.tw) uses it
> without the prefix, popular public services like Taiwan HSR also uses
> this calendar system.

I don't understand what the prefix is ("without the prefix") here.
Is this a typo or maybe just a concept I am not familiar with?

> Link to wikipedia: https://en.wikipedia.org/wiki/Minguo_calendar
> ---
>  localedata/locales/zh_TW | 4 ++++
>  1 file changed, 4 insertions(+)

A ChangeLog entry is missing here.  I don't mean updating the ChangeLog
file and posting the diff here; rather, a short summary of the changes
should be included in the commit message, e.g.:

        [BZ #...]
        * localedata/locales/zh_TW (LC_TIME): Add era, support Minguo
        calendar.

or 

        [BZ #...]
        * localedata/locales/zh_TW (era): Add, support Minguo calendar.

I've just realized that a Bugzilla entry is missing for this issue; please
file one. [2] (Select Component: localedata.)  Also add the bug number
to the first line of the commit message, which is also the subject line
of this email.

> diff --git a/localedata/locales/zh_TW b/localedata/locales/zh_TW
> index 92b04b083d..b869dec317 100644
> --- a/localedata/locales/zh_TW
> +++ b/localedata/locales/zh_TW
> @@ -126,6 +126,10 @@ am_pm	"<U4E0A><U5348>";"<U4E0B><U5348>"
>  % t_fmt_ampm: "%p %I<h>%M<m>%S<s>"
>  t_fmt_ampm  "%p %I<U6642>%M<U5206>%S<U79D2>"
>  week 7;19971130;1
> +
> +era "+:2:1913//01//01:+*:<U6C11><U570B>:%EC%Ey<U5E74>";/
> +    "+:1:1912//01//01:1912//12//31:<U6C11><U570B>:%EC<U5143><U5E74>";/
> +    "+:1:1911//12//31:-*:<U6C11><U524D>:%EC%Ey<U5E74>"
>  END LC_TIME
>  
>  LC_MESSAGES
> -- 
> 2.21.0

I believe this is correct, see my tests above.
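
For readers unfamiliar with the era syntax: each entry appears to follow the POSIX layout direction:offset:start_date:end_date:era_name:era_format. The sketch below splits one entry into those fields and decodes the <UXXXX> names. It assumes that "//" in the locale source is an escaped "/" and that the trailing "/" is only a line continuation; the names `EraEntry`, `decode_ucs` and `parse_era` are hypothetical and not part of glibc.

```python
import re
from typing import NamedTuple


class EraEntry(NamedTuple):
    direction: str   # "+" or "-"
    offset: int      # era year number associated with start_date
    start_date: str
    end_date: str    # "+*" and "-*" denote an open-ended range
    name: str
    fmt: str


def decode_ucs(text: str) -> str:
    """Turn <UXXXX> symbolic names into the characters they denote."""
    return re.sub(r"<U([0-9A-Fa-f]{4,8})>",
                  lambda m: chr(int(m.group(1), 16)), text)


def parse_era(entry: str) -> EraEntry:
    # Assumption: "//" is an escaped "/" in the locale source.
    text = entry.replace("//", "/")
    direction, offset, start, end, name, fmt = text.split(":", 5)
    return EraEntry(direction, int(offset), start, end,
                    decode_ucs(name), decode_ucs(fmt))


print(parse_era("+:2:1913//01//01:+*:<U6C11><U570B>:%EC%Ey<U5E74>"))
```

Read this way, the three entries cover 1913 onwards (starting at year 2), the single year 1912 (written with a fixed 元年), and everything up to the end of 1911.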

Regards,

Rafal


[1]
https://sourceware.org/git/?p=glibc.git;a=tree;f=localedata/locales;hb=HEAD
[2] https://sourceware.org/bugzilla/enter_bug.cgi?product=glibc
  
Felix Yan March 2, 2019, 5:04 p.m. UTC | #2
On 2019/3/2 11:49 PM, Rafal Luzynski wrote:
> $ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1913-01-01
> 02
> $ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1913-01-01
> 民國02年

There seems to be a small difference here:

$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1913-01-01
2
$ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1913-01-01
民國2年

The text is correct without the zero padding before the number. Am I missing
something that caused this difference?

> Please confirm this is what you wanted to achieve.  Being unable to
> read Chinese script I can only verify the numbers and I can only
> trust you that this is correct otherwise.

Apart from the above difference, everything else is correct. I have
also double-checked this on #l10n-tw.

> Now please look inside the localedata/locales directory. [1] There
> are 5 other locales from Taiwan: cmn_TW, hak_TW, lzh_TW, nan_TW,
> and nan_TW@latin.  If Minguo calendar is popular in Taiwan then do
> you think that those other locales need the same update as well?
> Can you please update them, and maybe squash all changes into a single
> patch?

For nan_TW@latin I probably need to find a Minnan user, and I haven't found
one yet. Would it be possible to leave that one out for now?

>> Some background information: The government website (www.gov.tw) uses it
>> without the prefix, popular public services like Taiwan HSR also uses
>> this calendar system.
> 
> I don't understand what is the prefix ("without the prefix") here.
> Is this a typo or maybe just a concept I am not familiar with?

I meant the short version (%Ey), sorry for the confusion.

I'll send a revised patch soon.
  
Rafal Luzynski March 2, 2019, 5:46 p.m. UTC | #3
2.03.2019 18:04 Felix Yan <felixonmars@archlinux.org> wrote:
> On 2019/3/2 11:49 PM, Rafal Luzynski wrote:
> > $ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%Ey -d 1913-01-01
> > 02
> > $ LC_TIME=zh_TW.UTF-8 ./testrun.sh /usr/bin/date +%EY -d 1913-01-01
> > 民國02年
> 
> There seems to be a small difference here:
> 
> $ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%Ey -d 1913-01-01
> 2
> $ LC_TIME=zh_TW.UTF-8 /usr/bin/date +%EY -d 1913-01-01
> 民國2年
> 
> The text is correct without the zero padding before number. Am I missing
> something that caused this difference?

Yes, it's a new feature added to glibc 2.29 shortly before the release. [1]
If you want "2" without the zero padding you must use "%-Ey" or "%-EY".
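
As a rough illustration of the difference, and assuming the padding is to a minimum width of two digits as the outputs in this thread suggest, the two forms behave like the hypothetical helper below. It only models the formatting of the number and does not call strftime.

```python
def era_year_field(year_in_era: int, no_pad: bool = False) -> str:
    """Model the era year field: padded like %Ey, unpadded like %-Ey."""
    if no_pad:
        # Corresponds to the "-" flag: print the number as-is.
        return str(year_in_era)
    # Default: zero-pad to at least two digits; longer numbers are untouched.
    return f"{year_in_era:02d}"


print(era_year_field(2))               # padded form
print(era_year_field(2, no_pad=True))  # unpadded form
print(era_year_field(108))             # already wider than two digits
```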

> > Please confirm this is what you wanted to achieve.  Being unable to
> > read Chinese script I can only verify the numbers and I can only
> > trust you that this is correct otherwise.
> 
> Apart from the above difference, the other parts are correct. I have
> double confirmed this on #l10n-tw too.

Thank you.

> > Now please look inside the localedata/locales directory. [1] There
> > are 5 other locales from Taiwan: cmn_TW, hak_TW, lzh_TW, nan_TW,
> > and nan_TW@latin.  If Minguo calendar is popular in Taiwan then do
> > you think that those other locales need the same update as well?
> > Can you please update them, and maybe squash all changes into a single
> > patch?
> 
> For nan_TW@latin maybe I need to find a Minnan user. I didn't find one
> yet though. Would it be possible to leave that one out for now?

Sure.  You can contribute another patch referring to the same bug report
if you send it soon.  After 2.30 is released (August 2019) you will need
to file a new bug report.

> >> Some background information: The government website (www.gov.tw) uses
> >> it
> >> without the prefix, popular public services like Taiwan HSR also uses
> >> this calendar system.
> > 
> > I don't understand what is the prefix ("without the prefix") here.
> > Is this a typo or maybe just a concept I am not familiar with?
> 
> I meant the short version (%Ey), sorry for the confusion.
> 
> I'll send a revised patch soon.

Thank you.  I saw your email but I have not tested or reviewed it yet.

Regards,

Rafal


[1] https://sourceware.org/bugzilla/show_bug.cgi?id=23758
  

Patch

diff --git a/localedata/locales/zh_TW b/localedata/locales/zh_TW
index 92b04b083d..b869dec317 100644
--- a/localedata/locales/zh_TW
+++ b/localedata/locales/zh_TW
@@ -126,6 +126,10 @@  am_pm	"<U4E0A><U5348>";"<U4E0B><U5348>"
 % t_fmt_ampm: "%p %I<h>%M<m>%S<s>"
 t_fmt_ampm  "%p %I<U6642>%M<U5206>%S<U79D2>"
 week 7;19971130;1
+
+era "+:2:1913//01//01:+*:<U6C11><U570B>:%EC%Ey<U5E74>";/
+    "+:1:1912//01//01:1912//12//31:<U6C11><U570B>:%EC<U5143><U5E74>";/
+    "+:1:1911//12//31:-*:<U6C11><U524D>:%EC%Ey<U5E74>"
 END LC_TIME
 
 LC_MESSAGES