[v2] localedata: Updates for Taiwanese locales [BZ #24409]

Message ID CAP1CgNLRxjBpE+TaJkKn+P-1p5hVnwB8coLRFumTGyhEUfEftw@mail.gmail.com
State Superseded
Series [v2] localedata: Updates for Taiwanese locales [BZ #24409]

Checks

Context Check Description
dj/TryBot-apply_patch fail Patch failed to apply to master at the time it was sent

Commit Message

Wei-Lun Chao May 10, 2021, 3:07 a.m. UTC
  From: Wei-Lun Chao <bluebat@member.fsf.org>

Rationale:
* Header cleanup: make the header of localedata files consistent
* Remove space (abmon): remove the extra prefix space in abmon 1~9
* Add (week): add the missing week definition and set first_weekday to 2
* Change (thousands_sep): change the thousands_sep from <U002C> to
<U2009> so that grouping by 4 is not confused with the usual grouping by 3
* Simplify (yesexpr) and (noexpr): remove unusual full-width characters
* Add collation: add missing collation used in cmn_TW

Changelog:
* localedata/locales/cmn_TW: Header cleanup; Remove space (abmon); Add
(week); Change (thousands_sep); Simplify (yesexpr) and (noexpr).
* localedata/locales/hak_TW: Likewise and add collation.
* localedata/locales/nan_TW: Likewise and add collation.
* localedata/locales/lzh_TW: Likewise and add collation.

---
 localedata/locales/cmn_TW |  40 +++++++-------
 localedata/locales/hak_TW |  76 +++++++++++++-------------
 localedata/locales/lzh_TW | 110 ++++++++++++++++++++++++--------------
 localedata/locales/nan_TW |  90 ++++++++++++++++---------------
 4 files changed, 174 insertions(+), 142 deletions(-)
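
The thousands_sep change in the rationale can be illustrated with a minimal Python sketch. It assumes that `grouping 4` means repeated groups of four digits counted from the right; `group_digits` is a hypothetical helper written for this note, not glibc code.

```python
THIN_SPACE = "\u2009"  # <U2009>, the separator proposed in the patch


def group_digits(n: int, size: int = 4, sep: str = THIN_SPACE) -> str:
    """Group the decimal digits of a non-negative integer from the right."""
    digits = str(n)
    groups = []
    while digits:
        groups.append(digits[-size:])   # take the rightmost group
        digits = digits[:-size]         # and drop it from the remainder
    return sep.join(reversed(groups))


print(group_digits(123456789))           # groups of 4, thin space (patched)
print(group_digits(123456789, sep=","))  # groups of 4, comma (before the patch)
print(group_digits(123456789, 3, ","))   # the familiar groups of 3
```

With four-digit groups, a comma-separated value such as 1,2345,6789 is easy to misread as conventional three-digit grouping, which is the confusion the thin space is meant to avoid.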
  

Comments

TAMUKI Shoichi May 11, 2021, 11:15 a.m. UTC | #1
Hello Wei-Lun-san,

From: Wei-Lun Chao via Libc-alpha <libc-alpha@sourceware.org>
Subject: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
Date: Mon, 10 May 2021 11:07:27 +0800

> Rationale:
> * Header cleanup: make the header of localedata files consistent
> * Remove space (abmon): remove the extra prefix space in abmon 1~9

Why are you trying to remove the extra prefix space in abmon 1..9?
The leading space fixes the display width of abbreviated month names
(%b), in the same way as "Jan", "Feb", "Mar", ..., which keeps the
display width constant in output such as system logs.
I don't think we should remove the leading space in abmon 1..9.

Regards,
TAMUKI Shoichi
  
TAMUKI Shoichi May 11, 2021, 12:41 p.m. UTC | #2
Hello Wei-Lun-san,

From: Wei-Lun Chao via Libc-alpha <libc-alpha@sourceware.org>
Subject: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
Date: Mon, 10 May 2021 11:07:27 +0800

> Changelog:
> * localedata/locales/cmn_TW: Header cleanup; Remove space (abmon); Add (week); Change (thousands_sep); Simplify (yesexpr) and (noexpr).
> * localedata/locales/hak_TW: Likewise and add collation.
> * localedata/locales/nan_TW: Likewise and add collation.
> * localedata/locales/lzh_TW: Likewise and add collation.

If elements from 0 to 59 are defined for alt_digits in lzh_TW, there
are the following problems:

$ LANG=lzh_TW date -d "1959-04-01 09:00:00" +"%Oy<U5E74>"
<U4E94><U5341><U4E5D><U5E74>
$ LANG=lzh_TW date -d "1960-04-01 09:00:00" +"%Oy<U5E74>"
60<U5E74>

Up to 99 elements should be defined for alt_digits.

Regards,
TAMUKI Shoichi
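
The behaviour reported above can be modelled roughly as follows. This is a sketch inferred from the two outputs shown, not glibc's implementation; `lzh_name` and `format_alt_year` are hypothetical names, and the two-digit ASCII fallback is an assumption based on the `60` in the second output.

```python
ONES = "\u3007一二三四五六七八九"  # 〇 (U+3007) through 九


def lzh_name(n: int) -> str:
    """Number name for 0..99 in the style of the lzh_TW alt_digits comment."""
    tens, unit = divmod(n, 10)
    if tens == 0:
        return ONES[unit]
    # 十 for 10..19, 廿 for 20..29, 卅 for 30..39, then 四十, 五十, ...
    head = {1: "十", 2: "廿", 3: "卅"}.get(tens, ONES[tens] + "十")
    return head + (ONES[unit] if unit else "")


def format_alt_year(year: int, alt_digits: list[str]) -> str:
    """Model %Oy: use the table entry if there is one, else ASCII digits."""
    n = year % 100
    if n < len(alt_digits):
        return alt_digits[n]
    return f"{n:02d}"  # assumed fallback when the table is too short


table_59 = [lzh_name(n) for n in range(60)]    # what the patch defines
table_99 = [lzh_name(n) for n in range(100)]   # what the review asks for

for year in (1959, 1960):
    print(year,
          format_alt_year(year, table_59) + "年",
          format_alt_year(year, table_99) + "年")
```

In this model a table that stops at 59 gives mixed output from 1960 onwards, while a table that runs to 99 covers every value `%Oy` can produce.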
  
Wei-Lun Chao May 13, 2021, 3:53 a.m. UTC | #3
Hello,

Without this patch, "date +%b" prints a leading space for months 1..9.
The day number already makes the display width vary,
so I think a single space would be a better separator.

Regards,
Wei-Lun CHAO

On Tue, May 11, 2021 at 7:17 PM, TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
>
> Hello Wei-Lun-san,
>
> From: Wei-Lun Chao via Libc-alpha <libc-alpha@sourceware.org>
> Subject: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
> Date: Mon, 10 May 2021 11:07:27 +0800
>
> > Rationale:
> > * Header cleanup: make the header of localedata files consistent
> > * Remove space (abmon): remove the extra prefix space in abmon 1~9
>
> Why are you trying to remove the extra prefix space in abmon 1..9?
> By fixing the display width of abbreviated month names (%b) in the
> same way as "Jan", "Feb", "Mar", ..., it has the effect of making the
> display width constant, such as system log.
> I don't think we should remove the leading space in abmon 1..9.
>
> Regards,
> TAMUKI Shoichi
  
Wei-Lun Chao May 13, 2021, 5:10 a.m. UTC | #4
Hello,

First, alt_digits works like "alt_numbers": each entry stands for a whole
number, and I don't know how to map each digit to another character.
Before this patch, alt_digits covered 0..31, and this patch covers
0..59. I am not sure whether it is worth extending
to 0..99, because those representations are not suitable for year
numbers. For example:
$ LANG=lzh_TW date -d "1959-04-01 09:00:00" +"%Oy<U5E74>"
We would expect <U4E94><U4E5D><U5E74> instead of <U4E94><U5341><U4E5D><U5E74>

Regards,
Wei-Lun CHAO

On Tue, May 11, 2021 at 8:42 PM, TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
>
> Hello Wei-Lun-san,
>
> From: Wei-Lun Chao via Libc-alpha <libc-alpha@sourceware.org>
> Subject: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
> Date: Mon, 10 May 2021 11:07:27 +0800
>
> > Changelog:
> > * localedata/locales/cmn_TW: Header cleanup; Remove space (abmon); Add (week); Change (thousands_sep); Simplify (yesexpr) and (noexpr).
> > * localedata/locales/hak_TW: Likewise and add collation.
> > * localedata/locales/nan_TW: Likewise and add collation.
> > * localedata/locales/lzh_TW: Likewise and add collation.
>
> If elements from 0 to 59 are defined for alt_digits in lzh_TW, there
> are the following problems:
>
> $ LANG=lzh_TW date -d "1959-04-01 09:00:00" +"%Oy<U5E74>"
> <U4E94><U5341><U4E5D><U5E74>
> $ LANG=lzh_TW date -d "1960-04-01 09:00:00" +"%Oy<U5E74>"
> 60<U5E74>
>
> Up to 99 elements should be defined for alt_digits.
>
> Regards,
> TAMUKI Shoichi
  
TAMUKI Shoichi May 16, 2021, 9:46 p.m. UTC | #5
Hello Wei-Lun-san,

From: Wei-Lun Chao <bluebat@member.fsf.org>
Subject: Re: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
Date: Thu, 13 May 2021 11:53:38 +0800

> > > Rationale:
> > > * Header cleanup: make the header of localedata files consistent
> > > * Remove space (abmon): remove the extra prefix space in abmon 1~9
> > 
> > Why are you trying to remove the extra prefix space in abmon 1..9?
> > By fixing the display width of abbreviated month names (%b) in the
> > same way as "Jan", "Feb", "Mar", ..., it has the effect of making the
> > display width constant, such as system log.
> > I don't think we should remove the leading space in abmon 1..9.
> 
> Without this patch, "date +%b" will output with a leading space for month 1..9.
> And the day number makes the display width inconstant already.
> So I think single space would be a better separator.

The elements defined in (ab)mon and (ab)day in the current *_TW locales
are summarized in the attached file mon+day.txt.  In each locale, the
display widths of mon (%B) and day (%A) can change, while the
display widths of abmon (%b) and abday (%a) are constant.

Date and time notations such as logs and form lists may sometimes be
in RFC2822 (%a, %d %b %Y %T %z) or ctime (%a %b %d %T %Z %Y) format
with local locales.

With this patch, these display widths will change in the *_TW locales,
even though the display width of each conversion specification that
makes them up is intended to be constant.  Please see the attached
file ctime.txt.

Regards,
TAMUKI Shoichi
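
The width argument can be checked with a small Python sketch. It approximates terminal columns with `unicodedata.east_asian_width` (wide and fullwidth characters count as two columns); `display_width` is a hypothetical helper, and the month lists are rebuilt here from the abmon entries in the patch rather than read from an installed locale.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate terminal columns: wide/fullwidth characters count as 2."""
    return sum(2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
               for ch in text)


padded = [f"{m:2d}月" for m in range(1, 13)]   # " 1月" .. "12月" (current abmon)
unpadded = [f"{m}月" for m in range(1, 13)]    # "1月" .. "12月" (patched abmon)

for label, names in (("padded", padded), ("unpadded", unpadded)):
    print(label, sorted({display_width(name) for name in names}))
```

Under this approximation the padded names all occupy the same number of columns, whereas the unpadded names come in two different widths, so any format built from `%b` stops lining up.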
  
TAMUKI Shoichi May 16, 2021, 9:46 p.m. UTC | #6
Hello Wei-Lun-san,

From: Wei-Lun Chao <bluebat@member.fsf.org>
Subject: Re: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
Date: Thu, 13 May 2021 13:10:12 +0800

> > > Changelog:
> > > * localedata/locales/cmn_TW: Header cleanup; Remove space (abmon); Add
> > > (week); Change (thousands_sep); Simplify (yesexpr) and (noexpr).
> > > * localedata/locales/hak_TW: Likewise and add collation.
> > > * localedata/locales/nan_TW: Likewise and add collation.
> > > * localedata/locales/lzh_TW: Likewise and add collation.
> > 
> > If elements from 0 to 59 are defined for alt_digits in lzh_TW, there
> > are the following problems:
> > 
> > $ LANG=lzh_TW date -d "1959-04-01 09:00:00" +"%Oy<U5E74>"
> > <U4E94><U5341><U4E5D><U5E74>
> > $ LANG=lzh_TW date -d "1960-04-01 09:00:00" +"%Oy<U5E74>"
> > 60<U5E74>
> > 
> > Up to 99 elements should be defined for alt_digits.
> 
> First, alt_digits works like "alt_numbers". I don't know how to just
> map each digit to another character.
> Before this patch, alt_digits covers 0..31, and this patch covers
> 0..59. I am not sure, is it worthy to extend
> to 0..99, because those presentations are not suitable for year
> numbers. For example:
> $ LANG=lzh_TW date -d "1959-04-01 09:00:00" +"%Oy<U5E74>"
> We will expect <U4E94><U4E5D><U5E74> instead of <U4E94><U5341><U4E5D><U5E74>

I understand what you are concerned about.

In Japan, for example, it is not suitable to use number names to
express another form of year notation using alternative numeric
symbols in years such as AD 1984 and AD 645.  It is commonly expressed
in positional notation.  I think the situation is the same in Taiwan.

On the other hand, for example, it is not suitable to use positional
notation to express another form of year notation using alternative
numeric symbols in years such as AD 78.  It is commonly expressed in
number names.  Please see the attached file alt_digits.txt.

I think that these boundaries are empirically determined by whether
the numerical value is 2 digits or less or 3 digits or more.
Therefore, it makes sense to define alt_digits up to 99.

Regards,
TAMUKI Shoichi
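
The proposed boundary can be sketched in Python. This only illustrates the rule as stated in the message above (number names up to two digits, positional notation from three digits); the function names are hypothetical, and the generic 二十/三十 forms are used here instead of the 廿/卅 forms found in lzh_TW.

```python
DIGITS = "\u3007一二三四五六七八九"  # 〇 (U+3007) through 九


def positional(n: int) -> str:
    """Positional notation: translate each decimal digit on its own."""
    return "".join(DIGITS[int(d)] for d in str(n))


def number_name(n: int) -> str:
    """Number name for 0..99, e.g. 78 becomes 七十八."""
    tens, unit = divmod(n, 10)
    if tens == 0:
        return DIGITS[unit]
    head = "十" if tens == 1 else DIGITS[tens] + "十"
    return head + (DIGITS[unit] if unit else "")


def alt_year(n: int) -> str:
    """Apply the two-digit boundary described in the message."""
    return number_name(n) if n < 100 else positional(n)


for year in (78, 645, 1984):
    print(year, alt_year(year))
```

Because `%Oy` only ever yields a value from 0 to 99, a table of number names up to 99 is consistent with this rule, while full four-digit years would need positional output that alt_digits cannot express.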
  
Wei-Lun Chao Nov. 19, 2021, 2:47 p.m. UTC | #7
Dear Shoichi,

Thanks for your explanation.
I'll drop this part of the patch.

On Sun, May 16, 2021 at 6:04 PM, TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
>
> Hello Wei-Lun-san,
>
> From: Wei-Lun Chao <bluebat@member.fsf.org>
> Subject: Re: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
> Date: Thu, 13 May 2021 11:53:38 +0800
>
> > > > Rationale:
> > > > * Header cleanup: make the header of localedata files consistent
> > > > * Remove space (abmon): remove the extra prefix space in abmon 1~9
> > >
> > > Why are you trying to remove the extra prefix space in abmon 1..9?
> > > By fixing the display width of abbreviated month names (%b) in the
> > > same way as "Jan", "Feb", "Mar", ..., it has the effect of making the
> > > display width constant, such as system log.
> > > I don't think we should remove the leading space in abmon 1..9.
> >
> > Without this patch, "date +%b" will output with a leading space for month 1..9.
> > And the day number makes the display width inconstant already.
> > So I think single space would be a better separator.
>
> The elements defined in (ab)mon and (ab)day in the current *_TW locale
> are summarized in the attached file mon+day.png.  In each locale, the
> display widths of mon (%B) and abday (%A) can change, while the
> display widths of abmon (%b) and abday (%a) are constant.
>
> Date and time notations such as logs and form lists may sometimes be
> in RFC2822 (%a, %d %b %Y %T %z) or ctime (%a %b %d %T %Z %Y) format
> with local locales.
>
> With this patch, these display widths will change in *_TW locale,
> even though the display width of each conversion specification that
> makes up them is intended to be constant.  Please see the attached
> file ctime.png.
>
> Regards,
> TAMUKI Shoichi
  
Wei-Lun Chao Nov. 19, 2021, 2:48 p.m. UTC | #8
Dear Shoichi,

Thanks for your explanation.
I'll add this part of the patch.

On Sun, May 16, 2021 at 6:04 PM, TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
>
> Hello Wei-Lun-san,
>
> From: Wei-Lun Chao <bluebat@member.fsf.org>
> Subject: Re: [PATCH v2] localedata: Updates for Taiwanese locales [BZ #24409]
> Date: Thu, 13 May 2021 13:10:12 +0800
>
> > > > Changelog:
> > > > * localedata/locales/cmn_TW: Header cleanup; Remove space (abmon); Add
> > > > (week); Change (thousands_sep); Simplify (yesexpr) and (noexpr).
> > > > * localedata/locales/hak_TW: Likewise and add collation.
> > > > * localedata/locales/nan_TW: Likewise and add collation.
> > > > * localedata/locales/lzh_TW: Likewise and add collation.
> > >
> > > If elements from 0 to 59 are defined for alt_digits in lzh_TW, there
> > > are the following problems:
> > >
> > > $ LANG=lzh_TW date -d "1959-04-01 09:00:00" +"%Oy<U5E74>"
> > > <U4E94><U5341><U4E5D><U5E74>
> > > $ LANG=lzh_TW date -d "1960-04-01 09:00:00" +"%Oy<U5E74>"
> > > 60<U5E74>
> > >
> > > Up to 99 elements should be defined for alt_digits.
> >
> > First, alt_digits works like "alt_numbers". I don't know how to just
> > map each digit to another character.
> > Before this patch, alt_digits covers 0..31, and this patch covers
> > 0..59. I am not sure, is it worthy to extend
> > to 0..99, because those presentations are not suitable for year
> > numbers. For example:
> > $ LANG=lzh_TW date -d "1959-04-01 09:00:00" +"%Oy<U5E74>"
> > We will expect <U4E94><U4E5D><U5E74> instead of <U4E94><U5341><U4E5D><U5E74>
>
> I understand what you are concerned about.
>
> In Japan, for example, it is not sutable to use number names to
> express another form of year notation using alternative numeric
> symbols in years such as AD 1984 and AD 645.  It is commonly expressed
> in positional notation.  I think the situation is the same in Taiwan.
>
> On the other hand, for example, it is not suitable to use positional
> notation to express another form of year notation using alternative
> numeric symbols in years such as AD 78.  It is commonly expressed in
> number names.  Please see the attached file alt_digits.png.
>
> I think that these boundaries are empirically determined by whether
> the numerical value is 2 digits or less or 3 digits or more.
> Therefore, it makes sense to define alt_digits up to 99.
>
> Regards,
> TAMUKI Shoichi
  

Patch

diff --git a/localedata/locales/cmn_TW b/localedata/locales/cmn_TW
index 9d9aca0f9e..bc90121012 100644
--- a/localedata/locales/cmn_TW
+++ b/localedata/locales/cmn_TW
@@ -8,15 +8,9 @@  escape_char /
 % exempt you from the conditions of the license if your use would
 % otherwise be governed by that license.

-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%
 % Mandarin Chinese locale for the Republic of China
-%
 % Prepared and contributed to glibc by Wei-Lun Chao <bluebat@member.fsf.org>
-%
 % build with: localedef -f UTF-8 -i cmn_TW cmn_TW
-%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 LC_IDENTIFICATION
 title        "Mandarin Chinese locale for the Republic of China"
@@ -28,8 +22,8 @@  tel          ""
 fax          ""
 language     "Mandarin Chinese"
 territory    "Taiwan"
-revision     "0.2"
-date         "2017-07-20"
+revision     "0.3"
+date         "2021-04-21"

 category "i18n:2012";LC_IDENTIFICATION
 category "i18n:2012";LC_CTYPE
@@ -77,16 +71,16 @@  mon           "<U4E00><U6708>";/
      "<U5341><U6708>";/
      "<U5341><U4E00><U6708>";/
      "<U5341><U4E8C><U6708>"
-%  1月,  2月,  3月,  4月,  5月,  6月,  7月,  8月,  9月, 10月, 11月, 12月
-abmon         " 1<U6708>";/
-       " 2<U6708>";/
-       " 3<U6708>";/
-       " 4<U6708>";/
-       " 5<U6708>";/
-       " 6<U6708>";/
-       " 7<U6708>";/
-       " 8<U6708>";/
-       " 9<U6708>";/
+% 1月, 2月, 3月, 4月, 5月, 6月, 7月, 8月, 9月, 10月, 11月, 12月
+abmon         "1<U6708>";/
+       "2<U6708>";/
+       "3<U6708>";/
+       "4<U6708>";/
+       "5<U6708>";/
+       "6<U6708>";/
+       "7<U6708>";/
+       "8<U6708>";/
+       "9<U6708>";/
        "10<U6708>";/
        "11<U6708>";/
        "12<U6708>"
@@ -119,6 +113,8 @@  am_pm         "<U4E0A><U5348>";/
 t_fmt_ampm    "%p %I<U9EDE>%M<U5206>%S<U79D2>"
 % %Y年 %b %-d號 %A %H:%M:%S %Z
 date_fmt      "%Y<U5E74> %b %-d<U865F> %A %H:%M:%S %Z"
+week 7;19971130;1
+first_weekday 2

 era "+:2:1913//01//01:+*:<U6C11><U570B>:%EC%Ey<U5E74>";/
     "+:1:1912//01//01:1912//12//31:<U6C11><U570B>:%EC<U5143><U5E74>";/
@@ -127,7 +123,7 @@  END LC_TIME

 LC_NUMERIC
 decimal_point "."
-thousands_sep ","
+thousands_sep "<U2009>"
 grouping      4
 END LC_NUMERIC

@@ -135,7 +131,7 @@  LC_MONETARY
 currency_symbol    "NT$"
 int_curr_symbol    "TWD "
 mon_decimal_point  "."
-mon_thousands_sep  ","
+mon_thousands_sep  "<U2009>"
 mon_grouping       4
 positive_sign      ""
 negative_sign      "-"
@@ -166,8 +162,8 @@  measurement 1
 END LC_MEASUREMENT

 LC_MESSAGES
-yesexpr "^[+1yY<UFF59><UFF39><U662F>]"
-noexpr  "^[-0nN<UFF4E><UFF2E><U4E0D><U5426>]"
+yesexpr "^[yY]"
+noexpr  "^[nN]"
 % 是
 yesstr  "<U662F>"
 % 不是
diff --git a/localedata/locales/hak_TW b/localedata/locales/hak_TW
index 73b9678ec4..9824be119e 100644
--- a/localedata/locales/hak_TW
+++ b/localedata/locales/hak_TW
@@ -8,15 +8,9 @@  escape_char /
 % exempt you from the conditions of the license if your use would
 % otherwise be governed by that license.

-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%
 % Hakka Chinese locale for the Republic of China
-%
 % Prepared and contributed to glibc by Wei-Lun Chao <bluebat@member.fsf.org>
-%
 % build with: localedef -f UTF-8 -i hak_TW hak_TW
-%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 LC_IDENTIFICATION
 title        "Hakka Chinese locale for the Republic of China"
@@ -28,8 +22,8 @@  tel          ""
 fax          ""
 language     "Hakka Chinese"
 territory    "Taiwan"
-revision     "0.1"
-date         "2013-06-02"
+revision     "0.2"
+date         "2021-04-21"

 category "i18n:2012";LC_IDENTIFICATION
 category "i18n:2012";LC_CTYPE
@@ -47,17 +41,20 @@  END LC_IDENTIFICATION

 LC_CTYPE
 copy "i18n"
+translit_start
+include "translit_combining";""
+translit_end
+
 class    "hanzi"; /
 <U3007>;/
 <U3400>..<U4DBF>;/
 <U4E00>..<U9FA5>;/
 <UF900>..<UFA6A>;/
-<U00020000>..<U0002A6D6>;/
-<U0002F800>..<U0002FA1D>
+<U00020000>..<U0002FA1D>
 END LC_CTYPE

 LC_COLLATE
-copy "iso14651_t1"
+copy "cns11643_stroke"
 END LC_COLLATE

 LC_TIME
@@ -74,16 +71,16 @@  mon           "<U4E00><U6708>";/
      "<U5341><U6708>";/
      "<U5341><U4E00><U6708>";/
      "<U5341><U4E8C><U6708>"
-%  1月,  2月,  3月,  4月,  5月,  6月,  7月,  8月,  9月, 10月, 11月, 12月
-abmon         " 1<U6708>";/
-       " 2<U6708>";/
-       " 3<U6708>";/
-       " 4<U6708>";/
-       " 5<U6708>";/
-       " 6<U6708>";/
-       " 7<U6708>";/
-       " 8<U6708>";/
-       " 9<U6708>";/
+% 1月, 2月, 3月, 4月, 5月, 6月, 7月, 8月, 9月, 10月, 11月, 12月
+abmon         "1<U6708>";/
+       "2<U6708>";/
+       "3<U6708>";/
+       "4<U6708>";/
+       "5<U6708>";/
+       "6<U6708>";/
+       "7<U6708>";/
+       "8<U6708>";/
+       "9<U6708>";/
        "10<U6708>";/
        "11<U6708>";/
        "12<U6708>"
@@ -103,19 +100,20 @@  abday         "<U65E5>";/
        "<U56DB>";/
        "<U4E94>";/
        "<U516D>"
-% %Y年%m月%d日 (%A) %H點%M分%S秒
-d_t_fmt       "%Y<U5E74>%m<U6708>%d<U65E5> (%A) %H<U9EDE>%M<U5206>%S<U79D2>"
-% %Y年%m月%d日
-d_fmt         "%Y<U5E74>%m<U6708>%d<U65E5>"
+% %Y年%m月%d號 (%A) %H點%M分%S秒
+d_t_fmt       "%Y<U5E74>%m<U6708>%d<U865F> (%A) %H<U9EDE>%M<U5206>%S<U79D2>"
+% %Y年%m月%d號
+d_fmt         "%Y<U5E74>%m<U6708>%d<U865F>"
 % %H點%M分%S秒
 t_fmt         "%H<U9EDE>%M<U5206>%S<U79D2>"
 % 上晝, 下晝
 am_pm         "<U4E0A><U665D>";"<U4E0B><U665D>"
 % %p %I點%M分%S秒
 t_fmt_ampm    "%p %I<U9EDE>%M<U5206>%S<U79D2>"
-% %Y年 %b %e日 %A %H:%M:%S %Z
-date_fmt      "%Y<U5E74> %b %e<U65E5> %A %H:%M:%S %Z"
+% %Y年 %b %-d號 %A %H:%M:%S %Z
+date_fmt      "%Y<U5E74> %b %-d<U865F> %A %H:%M:%S %Z"
 week 7;19971130;1
+first_weekday 2

 era "+:2:1913//01//01:+*:<U6C11><U570B>:%EC%Ey<U5E74>";/
     "+:1:1912//01//01:1912//12//31:<U6C11><U570B>:%EC<U5143><U5E74>";/
@@ -124,7 +122,7 @@  END LC_TIME

 LC_NUMERIC
 decimal_point "."
-thousands_sep ","
+thousands_sep "<U2009>"
 grouping      4
 END LC_NUMERIC

@@ -132,7 +130,7 @@  LC_MONETARY
 currency_symbol    "NT$"
 int_curr_symbol    "TWD "
 mon_decimal_point  "."
-mon_thousands_sep  ","
+mon_thousands_sep  "<U2009>"
 mon_grouping       4
 positive_sign      ""
 negative_sign      "-"
@@ -153,18 +151,22 @@  int_n_sign_posn    1
 END LC_MONETARY

 LC_PAPER
-copy "i18n"
+height 297
+width  210
 END LC_PAPER

 LC_MEASUREMENT
-copy "i18n"
+% metric
+measurement 1
 END LC_MEASUREMENT

 LC_MESSAGES
-% ^[+1yYyY係]
-yesexpr "^[+1yY<UFF59><UFF39><U4FC2>]"
-% ^[-0nNnN毋]
-noexpr  "^[-0nN<UFF4E><UFF2E><U6BCB>]"
+yesexpr "^[yY]"
+noexpr  "^[nN]"
+% 係
+yesstr  "<U4FC2>"
+% 毋係
+nostr   "<U6BCB><U4FC2>"
 END LC_MESSAGES

 LC_NAME
@@ -191,8 +193,8 @@  country_ab3  "TWN"
 country_num  158
 country_car "RC"
 country_isbn 957
-% 客家語
-lang_name    "<U5BA2><U5BB6><U8A71>"
+% 漢語客家語
+lang_name    "<U6F22><U8A9E><U5BA2><U5BB6><U8A9E>"
 lang_term    "hak"
 lang_lib     "hak"
 END LC_ADDRESS
diff --git a/localedata/locales/lzh_TW b/localedata/locales/lzh_TW
index 4740418a83..ba4c495b09 100644
--- a/localedata/locales/lzh_TW
+++ b/localedata/locales/lzh_TW
@@ -8,15 +8,9 @@  escape_char /
 % exempt you from the conditions of the license if your use would
 % otherwise be governed by that license.

-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%
 % Literary Chinese locale for the Republic of China
-%
 % Prepared and contributed to glibc by Wei-Lun Chao <bluebat@member.fsf.org>
-%
 % build with: localedef -f UTF-8 -i lzh_TW lzh_TW
-%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 LC_IDENTIFICATION
 title        "Literary Chinese locale for the Republic of China"
@@ -28,8 +22,8 @@  tel          ""
 fax          ""
 language     "Literary Chinese"
 territory    "Taiwan"
-revision     "0.1"
-date         "2013-06-02"
+revision     "0.2"
+date         "2021-04-21"

 category "i18n:2012";LC_IDENTIFICATION
 category "i18n:2012";LC_CTYPE
@@ -47,17 +41,20 @@  END LC_IDENTIFICATION

 LC_CTYPE
 copy "i18n"
+translit_start
+include "translit_combining";""
+translit_end
+
 class    "hanzi"; /
 <U3007>;/
 <U3400>..<U4DBF>;/
 <U4E00>..<U9FA5>;/
 <UF900>..<UFA6A>;/
-<U00020000>..<U0002A6D6>;/
-<U0002F800>..<U0002FA1D>
+<U00020000>..<U0002FA1D>
 END LC_CTYPE

 LC_COLLATE
-copy "iso14651_t1"
+copy "cns11643_stroke"
 END LC_COLLATE

 LC_TIME
@@ -74,17 +71,17 @@  mon           "<U4E00><U6708>";/
      "<U5341><U6708>";/
      "<U5341><U4E00><U6708>";/
      "<U5341><U4E8C><U6708>"
-%  一 ,  二 ,  三 ,  四 ,  五 ,  六 ,  七 ,  八 ,  九 ,  十 , 十一, 十二
-abmon         " <U4E00> ";/
-       " <U4E8C> ";/
-       " <U4E09> ";/
-       " <U56DB> ";/
-       " <U4E94> ";/
-       " <U516D> ";/
-       " <U4E03> ";/
-       " <U516B> ";/
-       " <U4E5D> ";/
-       " <U5341> ";/
+% 一, 二, 三, 四, 五, 六, 七, 八, 九, 十, 十一, 十二
+abmon         "<U4E00>";/
+       "<U4E8C>";/
+       "<U4E09>";/
+       "<U56DB>";/
+       "<U4E94>";/
+       "<U516D>";/
+       "<U4E03>";/
+       "<U516B>";/
+       "<U4E5D>";/
+       "<U5341>";/
        "<U5341><U4E00>";/
        "<U5341><U4E8C>"
 % 週日, 週一, 週二, 週三, 週四, 週五, 週六
@@ -103,19 +100,21 @@  abday         "<U65E5>";/
        "<U56DB>";/
        "<U4E94>";/
        "<U516D>"
-% %OC%Oy年%B%Od日 (%A) %OH時%OM分%OS秒
-d_t_fmt       "%OC%Oy<U5E74>%B%Od<U65E5> (%A) %OH<U6642>%OM<U5206>%OS<U79D2>"
-% %OC%Oy年%B%Od日
-d_fmt         "%OC%Oy<U5E74>%B%Od<U65E5>"
+% %Y年%B%Od (%A) %OH時%OM分%OS秒
+d_t_fmt       "%Y<U5E74>%B%Od (%A) %OH<U6642>%OM<U5206>%OS<U79D2>"
+% %Y年%B%Od
+d_fmt         "%Y<U5E74>%B%Od"
 % %OH時%OM分%OS秒
 t_fmt         "%OH<U6642>%OM<U5206>%OS<U79D2>"
 % 朝, 暮
-am_pm         "<U671D>";"<U66AE>"
+am_pm         "<U671D>";/
+       "<U66AE>"
 % %p %OI時%OM分%OS秒
 t_fmt_ampm    "%p %OI<U6642>%OM<U5206>%OS<U79D2>"
-% 公曆 %C%Oy年 %B %Oe日 %A %OH時%OM分%OS秒
-date_fmt      "<U516C><U66C6> %C%Oy<U5E74> %B %Oe<U65E5> %A %OH<U6642>%OM<U5206>%OS<U79D2>"
-% 〇, 一, 二, 三, 四, 五, 六, 七, 八, 九, 十, 十一, 十二, 十三, 十四, 十五, 十六, 十七, 十八, 十九, 廿, 廿一, 廿二, 廿三, 廿四, 廿五, 廿六, 廿七, 廿八, 廿九, 卅, 卅一
+% 公曆 %Y年 %B %Oe %A %OH時%OM分%OS秒
+date_fmt      "<U516C><U66C6> %Y<U5E74> %B %Oe %A %OH<U6642>%OM<U5206>%OS<U79D2>"
+% 〇, 一, 二, 三, 四, 五, 六, 七, 八, 九, 十, 十一, 十二, 十三, 十四, 十五, 十六, 十七, 十八, 十九, 廿, 廿一, 廿二, 廿三, 廿四, 廿五, 廿六, 廿七, 廿八, 廿九, 卅, 卅一,
+% 卅二, 卅三, 卅四, 卅五, 卅六, 卅七, 卅八, 卅九, 四十, 四十一, 四十二, 四十三, 四十四, 四十五, 四十六, 四十七, 四十八, 四十九, 五十, 五十一, 五十二, 五十三, 五十四, 五十五, 五十六, 五十七, 五十八, 五十九
 alt_digits    "<U3007>";/
             "<U4E00>";/
             "<U4E8C>";/
@@ -147,9 +146,38 @@  alt_digits    "<U3007>";/
             "<U5EFF><U516B>";/
             "<U5EFF><U4E5D>";/
             "<U5345>";/
-            "<U5345><U4E00>"
+            "<U5345><U4E00>";/
+            "<U5345><U4E8C>";/
+            "<U5345><U4E09>";/
+            "<U5345><U56DB>";/
+            "<U5345><U4E94>";/
+            "<U5345><U516D>";/
+            "<U5345><U4E03>";/
+            "<U5345><U516B>";/
+            "<U5345><U4E5D>";/
+            "<U56DB><U5341>";/
+            "<U56DB><U5341><U4E00>";/
+            "<U56DB><U5341><U4E8C>";/
+            "<U56DB><U5341><U4E09>";/
+            "<U56DB><U5341><U56DB>";/
+            "<U56DB><U5341><U4E94>";/
+            "<U56DB><U5341><U516D>";/
+            "<U56DB><U5341><U4E03>";/
+            "<U56DB><U5341><U516B>";/
+            "<U56DB><U5341><U4E5D>";/
+            "<U4E94><U5341>";/
+            "<U4E94><U5341><U4E00>";/
+            "<U4E94><U5341><U4E8C>";/
+            "<U4E94><U5341><U4E09>";/
+            "<U4E94><U5341><U56DB>";/
+            "<U4E94><U5341><U4E94>";/
+            "<U4E94><U5341><U516D>";/
+            "<U4E94><U5341><U4E03>";/
+            "<U4E94><U5341><U516B>";/
+            "<U4E94><U5341><U4E5D>"
 %
 week 7;19971130;1
+first_weekday 2

 era "+:2:1913//01//01:+*:<U6C11><U570B>:%EC%Ey<U5E74>";/
     "+:1:1912//01//01:1912//12//31:<U6C11><U570B>:%EC<U5143><U5E74>";/
@@ -158,7 +186,7 @@  END LC_TIME

 LC_NUMERIC
 decimal_point "."
-thousands_sep ","
+thousands_sep "<U2009>"
 grouping      4
 END LC_NUMERIC

@@ -166,7 +194,7 @@  LC_MONETARY
 currency_symbol    "NT$"
 int_curr_symbol    "TWD "
 mon_decimal_point  "."
-mon_thousands_sep  ","
+mon_thousands_sep  "<U2009>"
 mon_grouping       4
 positive_sign      ""
 negative_sign      "-"
@@ -187,18 +215,22 @@  int_n_sign_posn    1
 END LC_MONETARY

 LC_PAPER
-copy "i18n"
+height 297
+width  210
 END LC_PAPER

 LC_MEASUREMENT
-copy "i18n"
+% metric
+measurement 1
 END LC_MEASUREMENT

 LC_MESSAGES
-% ^[+1yYyY是]
-yesexpr "^[+1yY<UFF59><UFF39><U662F>]"
-% ^[-0nNnN非]
-noexpr  "^[-0nN<UFF4E><UFF2E><U975E>]"
+yesexpr "^[yY]"
+noexpr  "^[nN]"
+% 是
+yesstr  "<U662F>"
+% 否
+nostr   "<U5426>"
 END LC_MESSAGES

 LC_NAME
diff --git a/localedata/locales/nan_TW b/localedata/locales/nan_TW
index 0c2a56f4ca..83e47afbae 100644
--- a/localedata/locales/nan_TW
+++ b/localedata/locales/nan_TW
@@ -1,22 +1,16 @@ 
 comment_char %
 escape_char /

-% This file is part of the GNU C Library and contains locale data.
-% The Free Software Foundation does not claim any copyright interest
-% in the locale data contained in this file.  The foregoing does not
-% affect the license of the GNU C Library as a whole.  It does not
-% exempt you from the conditions of the license if your use would
-% otherwise be governed by that license.
-
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-%
+% This file is a part of GNU C Library (glibc) and contains locale data. The
+% Free Software Foundation does not claim any copyright interest in the
+% locale data contained in this file. The foregoing does not affect the
+% license of GNU C Library (glibc) as a whole. It does not exempt you from the
+% conditions of the license if your use would otherwise be governed by
+% that license.
+
 % Min Nan Chinese locale for the Republic of China
-%
 % Prepared and contributed to glibc by Wei-Lun Chao <bluebat@member.fsf.org>
-%
 % build with: localedef -f UTF-8 -i nan_TW nan_TW
-%
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 LC_IDENTIFICATION
 title        "Min Nan Chinese locale for the Republic of China"
@@ -28,8 +22,8 @@  tel          ""
 fax          ""
 language     "Min Nan Chinese"
 territory    "Taiwan"
-revision     "0.1"
-date         "2013-06-02"
+revision     "0.2"
+date         "2021-04-21"

 category "i18n:2012";LC_IDENTIFICATION
 category "i18n:2012";LC_CTYPE
@@ -47,17 +41,20 @@  END LC_IDENTIFICATION

 LC_CTYPE
 copy "i18n"
+translit_start
+include "translit_combining";""
+translit_end
+
 class    "hanzi"; /
 <U3007>;/
 <U3400>..<U4DBF>;/
 <U4E00>..<U9FA5>;/
 <UF900>..<UFA6A>;/
-<U00020000>..<U0002A6D6>;/
-<U0002F800>..<U0002FA1D>
+<U00020000>..<U0002FA1D>
 END LC_CTYPE

 LC_COLLATE
-copy "iso14651_t1"
+copy "cns11643_stroke"
 END LC_COLLATE

 LC_TIME
@@ -74,16 +71,16 @@  mon           "<U4E00><U6708>";/
      "<U5341><U6708>";/
      "<U5341><U4E00><U6708>";/
      "<U5341><U4E8C><U6708>"
-%  1月,  2月,  3月,  4月,  5月,  6月,  7月,  8月,  9月, 10月, 11月, 12月
-abmon         " 1<U6708>";/
-       " 2<U6708>";/
-       " 3<U6708>";/
-       " 4<U6708>";/
-       " 5<U6708>";/
-       " 6<U6708>";/
-       " 7<U6708>";/
-       " 8<U6708>";/
-       " 9<U6708>";/
+% 1月, 2月, 3月, 4月, 5月, 6月, 7月, 8月, 9月, 10月, 11月, 12月
+abmon         "1<U6708>";/
+       "2<U6708>";/
+       "3<U6708>";/
+       "4<U6708>";/
+       "5<U6708>";/
+       "6<U6708>";/
+       "7<U6708>";/
+       "8<U6708>";/
+       "9<U6708>";/
        "10<U6708>";/
        "11<U6708>";/
        "12<U6708>"
@@ -103,10 +100,10 @@  abday         "<U65E5>";/
        "<U56DB>";/
        "<U4E94>";/
        "<U516D>"
-% %Y年%m月%d日 (%A) %H點%M分%S秒
-d_t_fmt       "%Y<U5E74>%m<U6708>%d<U65E5> (%A) %H<U9EDE>%M<U5206>%S<U79D2>"
-% %Y年%m月%d日
-d_fmt         "%Y<U5E74>%m<U6708>%d<U65E5>"
+% %Y年%m月%d號 (%A) %H點%M分%S秒
+d_t_fmt       "%Y<U5E74>%m<U6708>%d<U865F> (%A) %H<U9EDE>%M<U5206>%S<U79D2>"
+% %Y年%m月%d號
+d_fmt         "%Y<U5E74>%m<U6708>%d<U865F>"
 % %H點%M分%S秒
 t_fmt         "%H<U9EDE>%M<U5206>%S<U79D2>"
 % 頂晡, 下晡
@@ -114,9 +111,10 @@  am_pm         "<U9802><U6661>";/
        "<U4E0B><U6661>"
 % %p %I點%M分%S秒
 t_fmt_ampm    "%p %I<U9EDE>%M<U5206>%S<U79D2>"
-% %Y年 %b %e日 %A %H:%M:%S %Z
-date_fmt      "%Y<U5E74> %b %e<U65E5> %A %H:%M:%S %Z"
+% %Y年 %b %-d號 %A %H:%M:%S %Z
+date_fmt      "%Y<U5E74> %b %-d<U865F> %A %H:%M:%S %Z"
 week 7;19971130;1
+first_weekday 2

 era "+:2:1913//01//01:+*:<U6C11><U570B>:%EC%Ey<U5E74>";/
     "+:1:1912//01//01:1912//12//31:<U6C11><U570B>:%EC<U5143><U5E74>";/
@@ -125,7 +123,7 @@  END LC_TIME

 LC_NUMERIC
 decimal_point "."
-thousands_sep ","
+thousands_sep "<U2009>"
 grouping      4
 END LC_NUMERIC

@@ -133,7 +131,7 @@  LC_MONETARY
 currency_symbol    "NT$"
 int_curr_symbol    "TWD "
 mon_decimal_point  "."
-mon_thousands_sep  ","
+mon_thousands_sep  "<U2009>"
 mon_grouping       4
 positive_sign      ""
 negative_sign      "-"
@@ -154,18 +152,22 @@  int_n_sign_posn    1
 END LC_MONETARY

 LC_PAPER
-copy "i18n"
+height 297
+width  210
 END LC_PAPER

 LC_MEASUREMENT
-copy "i18n"
+% metric
+measurement 1
 END LC_MEASUREMENT

 LC_MESSAGES
-% ^[+1yYyY是]
-yesexpr "^[+1yY<UFF59><UFF39><U662F>]"
-% ^[-0nNnN伓]
-noexpr  "^[-0nN<UFF4E><UFF2E><U4F13>]"
+yesexpr "^[yY]"
+noexpr  "^[nN]"
+% 是
+yesstr  "<U662F>"
+% 毋是
+nostr   "<U6BCB><U662F>"
 END LC_MESSAGES

 LC_NAME
@@ -192,8 +194,8 @@  country_ab3  "TWN"
 country_num  158
 country_car "RC"
 country_isbn 957
-% 閩南語
-lang_name    "<U95A9><U5357><U8A9E>"
+% 漢語閩南語
+lang_name    "<U6F22><U8A9E><U95A9><U5357><U8A9E>"
 lang_term    "nan"
 lang_lib     "nan"
 END LC_ADDRESS
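
The effect of the simplified yesexpr in the patch above can be illustrated with a short Python sketch. Python's `re` module is used here only as a stand-in for the POSIX extended regular expressions that consume yesexpr; the two patterns are transcribed from the cmn_TW hunk, and `accepts` is a hypothetical helper.

```python
import re

# yesexpr before the patch: ASCII, fullwidth ｙ/Ｙ, and 是
OLD_YESEXPR = "^[+1yY\uff59\uff39\u662f]"
# yesexpr after the patch: ASCII y/Y only
NEW_YESEXPR = "^[yY]"


def accepts(pattern: str, reply: str) -> bool:
    """Return True if the reply would count as an affirmative answer."""
    return re.match(pattern, reply) is not None


for reply in ("yes", "Y", "1", "\uff59", "是"):
    print(repr(reply), accepts(OLD_YESEXPR, reply), accepts(NEW_YESEXPR, reply))
```

Under this model the patched pattern no longer accepts `1`, `+`, fullwidth letters, or 是 as input; the native-language word remains available for display through yesstr.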