[v6,1/2] strftime: Set the default width of "%Ey" to 2 [BZ #23758]

Message ID 201901110449.AA04174@tamuki.linet.gr.jp
State Superseded

Commit Message

TAMUKI Shoichi Jan. 11, 2019, 4:49 a.m. UTC
  The Japanese era name is scheduled to be changed on May 1, 2019.
Prior to this, change the alternative representation for year in
strftime to pad the number with zero to keep it constant width, so
that prevent the trouble we saw in the past from becoming obvious
again from the year after the era name changes onward.

Since only one Japanese era name is used by each emperor's reign as
lately, it is rare that the year ends in one digit or lasts more than
three digits.  In addition, the default width of month, day, hour,
minute, and second is 2, so adjust the default width of year the same
as them, and then the whole display balance is improved.  Therefore,
it would be reasonable to set the default width padding with zero of
"%Ey" to 2.

ChangeLog:

	[BZ #23758]
	* NEWS: Mention the change.
	* manual/time.texi (strftime): Document the description for "%Ey".
	Also, fix the wording to "alternative" rather than "alternate".
	* time/strftime_l.c (__strftime_internal): Set the default width
	padding with zero of "%Ey" to 2.
---
 NEWS              |  4 ++++
 manual/time.texi  | 11 ++++++++---
 time/strftime_l.c |  2 +-
 3 files changed, 13 insertions(+), 4 deletions(-)
  

Comments

Zack Weinberg Jan. 16, 2019, 4:17 p.m. UTC | #1
This review covers only the documentation and commit message.

> The Japanese era name is scheduled to be changed on May 1, 2019.
> Prior to this, change the alternative representation for year in
> strftime to pad the number with zero to keep it constant width, so
> that prevent the trouble we saw in the past from becoming obvious
> again from the year after the era name changes onward.
>
> Since only one Japanese era name is used by each emperor's reign as
> lately, it is rare that the year ends in one digit or lasts more than
> three digits.  In addition, the default width of month, day, hour,
> minute, and second is 2, so adjust the default width of year the same
> as them, and then the whole display balance is improved.  Therefore,
> it would be reasonable to set the default width padding with zero of
> "%Ey" to 2.

This commit message will not be clear to readers who are not familiar
with Japanese era dating.  I had to look it up myself.  It also needs
to make clear that %Ey is changed for all locales, not just Japanese
locales, and this is expected to be harmless for locales that use %Ey
for something else.  (I see you mentioned this in the commit message
for the %EY change, but it belongs here.)

Based on what I learned, I suggest this instead:

| In Japanese locales, strftime's alternative year format (%Ey)
| produces the year of the current era (nengō).  A new era typically
| begins when a new emperor is enthroned.  The year of the current
| era is therefore usually a one- or two-digit number.
|
| Many programs that display Japanese era dates assume that the era
| year is two digits wide.  To improve how these programs display
| dates during the first nine years of a new era, change %Ey to pad
| one-digit numbers on the left with a zero.  This change applies to
| all locales.  It is expected to be harmless for other locales that
| use the alternative year format (e.g. lo_LA and th_TH, in which %Ey
| produces the year of the Buddhist calendar) as those calendars'
| year numbers are already more than two digits wide, and this is
| not expected to change.
|
| This change needs to be in place before 2019-05-01 CE, as a new
| era is scheduled to begin on that date.

>       [BZ #23758]
>       * NEWS: Mention the change.

Changes to NEWS are not mentioned in the ChangeLog.

>       * manual/time.texi (strftime): Document the description for "%Ey".
>       Also, fix the wording to "alternative" rather than "alternate".
>       * time/strftime_l.c (__strftime_internal): Set the default width
>       padding with zero of "%Ey" to 2.

The rest of the ChangeLog entry is fine.  However ...

>       Also, fix the wording to "alternative" rather than "alternate".

... this wording change is *correct*, because "alternative" is the
term used by the POSIX specification for strftime, but it should be
committed as a separate patch.

Please remove all of the alternate -> alternative changes in this
patch series.  Please then prepare a separate patch that changes all
uses of "alternate" to "alternative" in the manual's description of
time formatting and makes no other changes.  That patch is
pre-approved: you may post it to the mailing list and immediately push
it to git, without waiting for someone to look at it.  It should be
applied before this patch series.

However, take care not to change uses of "alternate" that refer to
things other than alternative time formats.  For instance, "alternate"
is the correct word to use in the discussion of signal stacks.

(When used as modifiers, "alternate" and "alternative" mean almost
exactly the same thing.  Which one is used in any given context is a
matter of convention.  The manual tries to be consistent with the
POSIX specification, which uses one word for some things and the other
for others.  I regret the confusion this must cause.)

> +* Improve the width of alternative representation for year in
> +  strftime.  For %Ey conversion specifier, the default action is now
> +  to pad the number with zero to keep minimum 2 digits, similar to %y.

It is appropriate to repeat here that this change applies to all
locales, but is for the sake of Japanese era years, and is expected to
be harmless for all other locales.  Taking Paul's comments into
account as well, here is a suggested revision:

| * strftime's default formatting of a locale's alternative year (%Ey)
|   has been changed to zero-pad the year to a minimum of two digits,
|   like %y.  This improves the display of Japanese era years during
|   the first nine years of a new era, and is expected to be harmless
|   for all other locales (only Japanese locales regularly have
|   alternative year numbers less than 10).  Zero-padding can be
|   overridden with the '-' or '_' flags (which are GNU extensions).

> +If the @code{E} modifier is specified (@code{%Ey}), the locale's
> +alternative representation for year (the era year) is used instead.
> +The default action is to pad the number with zero to keep minimum 2
> +digits, similar to @code{%y}.

In addition to Paul's suggestion, this needs to be reworded for
clarity and to avoid using the term "the era year", which is not an
appropriate description of the alternative calendar for all locales.
Also, it should be clarified that %Ey does *not* reduce the result
modulo 100 as %y does.  I suggest

| If the @code{E} modifier is specified (@code{%Ey}), instead produces
| the year number according to a locale-specific alternative calendar.
| Unlike @code{%y}, the number is @emph{not} reduced modulo 100.
| However, by default it is zero-padded to a minimum of two digits
| (this can be overridden by an explicit field width or by the @code{_}
| and @code{-} flags).
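
The padding and flag behavior described above can be checked with a short C sketch.  It uses plain %y, whose two-digit zero padding this patch copies for %Ey.  format_year is a hypothetical helper written for this illustration, and the '-' and '_' flags are GNU extensions, so the results for them assume glibc.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format January 1 of the Gregorian year `year`
   with the strftime format `fmt`.  Returns the number of bytes
   written, or 0 if the result did not fit in `buf`.  */
size_t
format_year (char *buf, size_t size, const char *fmt, int year)
{
  struct tm tm;

  memset (&tm, 0, sizeof tm);
  tm.tm_year = year - 1900;	/* struct tm counts years from 1900.  */
  tm.tm_mon = 0;		/* January.  */
  tm.tm_mday = 1;
  return strftime (buf, size, fmt, &tm);
}
```

With glibc, format_year (buf, sizeof buf, "%y", 2005) produces "05", "%-y" produces "5", and "%_y" produces " 5".  In a locale that defines no alternative calendar, such as the C locale, the E modifier is ignored, so "%Ey" produces the same output as "%y".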

zw
  
TAMUKI Shoichi Jan. 17, 2019, 6:27 a.m. UTC | #2
Hello Zack,

Thank you for reviewing my patch's documentation and commit messages.
Almost all the things I wanted to say were expressed and I think that
is very nice.

Also, Rafal and Paul, thank you for reviewing my patch's documentation
and commit messages.

I will soon prepare a new version of the patch set with these remarks
applied.

From: Zack Weinberg <zackw@panix.com>
Subject: Re: [PATCH v6 1/2] strftime: Set the default width of "%Ey" to 2 [BZ #23758]
Date: Wed, 16 Jan 2019 11:17:09 -0500

> > The Japanese era name is scheduled to be changed on May 1, 2019.
> > Prior to this, change the alternative representation for year in
> > strftime to pad the number with zero to keep it constant width, so
> > that prevent the trouble we saw in the past from becoming obvious
> > again from the year after the era name changes onward.
> > 
> > Since only one Japanese era name is used by each emperor's reign as
> > lately, it is rare that the year ends in one digit or lasts more than
> > three digits.  In addition, the default width of month, day, hour,
> > minute, and second is 2, so adjust the default width of year the same
> > as them, and then the whole display balance is improved.  Therefore,
> > it would be reasonable to set the default width padding with zero of
> > "%Ey" to 2.
> 
> This commit message will not be clear to readers who are not familiar
> with Japanese era dating.  I had to look it up myself.  It also needs
> to make clear that %Ey is changed for all locales, not just Japanese
> locales, and this is expected to be harmless for locales that use %Ey
> for something else.  (I see you mentioned this in the commit message
> for the %EY change, but it belongs here.)
> 
> Based on what I learned, I suggest this instead:
> 
> | In Japanese locales, strftime's alternative year format (%Ey)
> | produces the year of the current era (neng?).  A new era typically
> | begins when a new emperor is enthroned.  The year of the current
> | era is therefore usually a one- or two-digit number.

I'm sorry, but I did not understand the meaning of (neng?).  I will
rewrite it to (Japanese Calendar) tentatively.

Regards,
TAMUKI Shoichi
  
Rafal Luzynski Jan. 17, 2019, 5:56 p.m. UTC | #3
17.01.2019 07:27 TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
> [...]
> I'm sorry, but I did not understand the meaning of (neng?).  I will
> rewrite it to (Japanese Calendar) tentatively.

It looks like your email client was unable to handle the letter "ō"
("o" with macron). [1] According to Wikipedia, [2] Zack probably meant
the Japanese era name.  I guess you are the right person to say whether
it should be written as "nengō" or "nengo" or "Japanese era name"
or "Japanese year name" or just drop the parentheses completely.
I think it's not about the Japanese Calendar, as you suggest.

BTW, this Wikipedia article also explained to me what you mean by the
"year name".  I confirm that the term is confusing for those unfamiliar
with the Japanese calendar.

Regards,

Rafal


[1] https://www.fileformat.info/info/unicode/char/014d/index.htm
[2] https://en.wikipedia.org/wiki/Japanese_era_name
  
Zack Weinberg Jan. 18, 2019, 2:45 a.m. UTC | #4
On Thu, Jan 17, 2019 at 12:57 PM Rafal Luzynski
<digitalfreak@lingonborough.com> wrote:
>
> 17.01.2019 07:27 TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
> > [...]
> > I'm sorry, but I did not understand the meaning of (neng?).  I will
> > rewrite it to (Japanese Calendar) tentatively.
>
> It looks like your email client was unable to handle the letter "ō"
> ("o" with macron). [1] According to Wikipedia, [2] Zack probably meant
> the Japanese era name.

Yes, "nengo" with a macron over the "o" was the word I was trying to
write.  I don't speak Japanese myself, I got it from
https://en.wikipedia.org/wiki/Japanese_era_name .  I wanted to mention
the (romaji version of the) Japanese word for an era of this calendar,
since "era" has several meanings in English and not very many native
English speakers are familiar with this calendar.  (I wonder what you
will get if I copy and paste the kanji next to "nengō" on the
Wikipedia page: 年号.)

The thing I was trying to say is that "%Ey" (in the ja_JP locale)
produces the number of a year _within_ an era.  In this context, the
Japanese word I want is not the word for the _name_ of an era, but the
word for an era itself.  What is that word?

zw
  
TAMUKI Shoichi Jan. 18, 2019, 1:56 p.m. UTC | #5
Hello Rafal,

From: Rafal Luzynski <digitalfreak@lingonborough.com>
Subject: Re: [PATCH v6 1/2] strftime: Set the default width of "%Ey" to 2 [BZ #23758]
Date: Thu, 17 Jan 2019 18:56:20 +0100 (CET)

> > I'm sorry, but I did not understand the meaning of (neng?).  I will
> > rewrite it to (Japanese Calendar) tentatively.
> 
> It looks like your email client was unable to handle the letter "?"
> ("o" with macron). [1] According to Wikipedia, [2] Zack probably meant
> the Japanese era name.  I guess you are the right person to say whether
> it should be written as "neng?" or "nengo" or "Japanese era name"
> or "Japanese year name" or just drop the parentheses completely.
> I think it's not about the Japanese Calendar, as you suggest.

OK, I understand.  I think "nengo" and "gengo" are nearly synonymous,
and "gengo" seems to be used more often in Japan.  These mean "era
name" (%EC).  On the other hand, "the year of the (current) era" means
"the numeric era year" (%Ey).

> BTW, this Wikipedia article also explained me what you mean by the
> "year name".  I confirm that the term is confusing for those unfamiliar
> with the Japanese calendar.

I think "year name" and "era name" are also synonymous, and "era name"
seems to be used more often. [1]

[1] https://mainichi.jp/english/articles/20190104/p2a/00m/0na/034000c

Regards,
TAMUKI Shoichi
  
TAMUKI Shoichi Jan. 18, 2019, 1:59 p.m. UTC | #6
Hello Zack,

From: Zack Weinberg <zackw@panix.com>
Subject: Re: [PATCH v6 1/2] strftime: Set the default width of "%Ey" to 2 [BZ #23758]
Date: Thu, 17 Jan 2019 21:45:37 -0500

> The thing I was trying to say is that "%Ey" (in the ja_JP locale)
> produces the number of a year _within_ an era.  In this context, the
> Japanese word I want is not the word for the _name_ of an era, but the
> word for an era itself.  What is that word?

Hmm, I cannot think of the word for an era itself.

Regards,
TAMUKI Shoichi
  
Rafal Luzynski Jan. 18, 2019, 6:35 p.m. UTC | #7
18.01.2019 14:59 TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
> [...]
> From: Zack Weinberg <zackw@panix.com>
> Subject: Re: [PATCH v6 1/2] strftime: Set the default width of "%Ey" to 2
> [BZ #23758]
> Date: Thu, 17 Jan 2019 21:45:37 -0500
> 
> > The thing I was trying to say is that "%Ey" (in the ja_JP locale)
> > produces the number of a year _within_ an era.  In this context, the
> > Japanese word I want is not the word for the _name_ of an era, but the
> > word for an era itself.  What is that word?
> 
> Hmm, I cannot think of the word for an era itself.

I think we can skip the parentheses and their contents completely
if the English text outside is sufficient and the Japanese term is
only bringing confusion.

Regards,

Rafal
  
Rafal Luzynski Jan. 18, 2019, 6:44 p.m. UTC | #8
18.01.2019 14:56 TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
> 
> Hello Rafal,
> 
> > [...]
> > It looks like your email client was unable to handle the letter "?"
> > ("o" with macron). [1] According to Wikipedia, [2] Zack probably meant
> > the Japanese era name.  I guess you are the right person to say whether
> > it should be written as "neng?" or "nengo" or "Japanese era name"
> > or "Japanese year name" or just drop the parentheses completely.
> > I think it's not about the Japanese Calendar, as you suggest.
> 
> OK, I understand.  I think "nengo" and "gengo" are nearly synonymous,
> and "gengo" seems to be used more often in Japan.  These mean "era
> name" (%EC).  On the other hand, "the year of the (current) era" means
> "the numeric era year" (%Ey).

"The numeric era year" sounds unclear to me.  I think you mean
"the number of the year" or "the number of the year in the current era"
or anything like that.

> [...]
> I think "year name" and "era name" are also synonymous, and "era name"
> seems to be used more often. [1]
> 
> [1] https://mainichi.jp/english/articles/20190104/p2a/00m/0na/034000c

That's probably because, again, the term "year name" is confusing:
at first I thought that each year has its own name (just as it has
its own number).  Only after reading the article did I understand that
each year has a name *and* a number, and that the name of the year is
also the name of the era, shared with the other years of the same era.
I hope I understand this correctly and have explained my confusion
clearly. :-)

Regards,

Rafal
  
TAMUKI Shoichi Jan. 19, 2019, 3:51 a.m. UTC | #9
Hello Rafal and Zack,

From: Rafal Luzynski <digitalfreak@lingonborough.com>
Subject: Re: [PATCH v6 1/2] strftime: Set the default width of "%Ey" to 2 [BZ #23758]
Date: Fri, 18 Jan 2019 19:44:39 +0100 (CET)

> > OK, I understand.  I think "nengo" and "gengo" are nearly synonymous,
> > and "gengo" seems to be used more often in Japan.  These mean "era
> > name" (%EC).  On the other hand, "the year of the (current) era" means
> > "the numeric era year" (%Ey).
> 
> "The numeric era year" sounds unclear for me.  I think you mean
> "the number of the year" or "the number of the year in the current era"
> or anything like that.

I have been concerned about the word "current" for a while.  "%Ey" does
not necessarily indicate the year of the current era.  For example:

$ LANG=ja_JP.UTF-8 date -d "2018-04-01" +"%Ey"
30
$ LANG=ja_JP.UTF-8 date -d "1955-04-01" +"%Ey"
30

The former era name is "Heisei", the latter era name is "Showa".

I think that the same thing can be said in the Christian era:

$ date -d "2018-04-01" +"%y"
18
$ date -d "1918-04-01" +"%y"
18

The former is the 21st century, the latter is the 20th century.

Therefore, I think that the word "current" should not be used here.
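
The same point can be illustrated with a minimal C sketch of how a year within an era is derived.  The era table and the helpers find_era and era_year are hypothetical: they model only the start year of each era (a real era begins on a specific day, and glibc reads era data from the locale), so this is not how strftime implements %Ey.

```c
#include <stddef.h>

/* Hypothetical, simplified era record: only the Gregorian year in
   which the era begins is modeled.  */
struct era
{
  const char *name;
  int start_year;
};

/* Newest era first, so the first entry that has already started by
   the given year is the era containing that year.  */
static const struct era eras[] = {
  { "Heisei", 1989 },
  { "Showa", 1926 },
};

/* Return the era containing `year`, or NULL if it predates the table.  */
const struct era *
find_era (int year)
{
  for (size_t i = 0; i < sizeof eras / sizeof eras[0]; i++)
    if (year >= eras[i].start_year)
      return &eras[i];
  return NULL;
}

/* Return the year numbered within its era (the first year is 1),
   or -1 if no era in the table contains `year`.  */
int
era_year (int year)
{
  const struct era *e = find_era (year);
  return e != NULL ? year - e->start_year + 1 : -1;
}
```

Here era_year (2018) and era_year (1955) both return 30, while find_era reports "Heisei" for the former and "Showa" for the latter.  The year number alone does not identify the era, just as %y alone does not identify the century.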

Based on this, can we drop the word "current" from the commit message
in the first patch?

| In Japanese locales, strftime's alternative year format (%Ey) produces
| the year of the current era.  A new era typically begins when a new
| emperor is enthroned.  The year of the current era is therefore
| usually a one- or two-digit number.

In addition, is "In Japanese locales," correct?

Perhaps, is "In Japanese locale," better?

The manual in the second patch also had the word "current".  Can we
similarly drop the word "current"?

| If the @code{E} modifier is specified (@code{%EC}), instead produces
| the name of the period for the current year (e.g. an era name) in the
| locale's alternative calendar.

Regards,
TAMUKI Shoichi
  
Zack Weinberg Jan. 19, 2019, 5:08 p.m. UTC | #10
On Fri, Jan 18, 2019 at 10:52 PM TAMUKI Shoichi <tamuki@linet.gr.jp> wrote:
> Hello Rafal and Zack,
> >
> > "The numeric era year" sounds unclear for me.  I think you mean
> > "the number of the year" or "the number of the year in the current era"
> > or anything like that.
>
> I was concerned about the word "current" from a while ago.  "%Ey" does
> not necessarily indicate the year of the current era.  For example:
>
> $ LANG=ja_JP.UTF-8 date -d "2018-04-01" +"%Ey"
> 30
> $ LANG=ja_JP.UTF-8 date -d "1955-04-01" +"%Ey"
> 30

Yes, you are right, "current" should not be used.  This occurred to me
when I was writing a later part of my suggestions but I did not
remember to go back and fix this commit message.

> In addition, is "In Japanese locales," correct?
> Perhaps, is "In Japanese locale," better?

There is only one Japanese locale right now, ja_JP, but there could be
others in the future.  There are many native speakers of Japanese in
other countries; I don't know if they would want to use this calendar,
but it's not out of the question.  So it seems more natural to me to
say "In Japanese locales."

(Also, it would have to be "In _the_ Japanese locale" if we were going
to make "locale" singular.)

So maybe something like this:

| In Japanese locales, strftime's alternative year format (%Ey) produces
| a year numbered within a time period called an _era_.  A new era typically
| begins when a new emperor is enthroned.  The result of %Ey is therefore
| usually a one- or two-digit number.

> The manual in the second patch also had the word "current".  Can we
> similarly drop the word "current"?
>
> | If the @code{E} modifier is specified (@code{%EC}), instead produces
> | the name of the period for the current year (e.g. an era name) in the
> | locale's alternative calendar.

Yes, same problem and I think here we can just say "the period for the year".

zw
  
Rafal Luzynski Jan. 19, 2019, 9:42 p.m. UTC | #11
19.01.2019 18:08 Zack Weinberg <zackw@panix.com> wrote:
> On Fri, Jan 18, 2019 at 10:52 PM TAMUKI Shoichi <tamuki@linet.gr.jp>
> wrote:
> > [...]
> > I was concerned about the word "current" from a while ago.  "%Ey" does
> > not necessarily indicate the year of the current era.  For example:
> >
> > $ LANG=ja_JP.UTF-8 date -d "2018-04-01" +"%Ey"
> > 30
> > $ LANG=ja_JP.UTF-8 date -d "1955-04-01" +"%Ey"
> > 30
> 
> Yes, you are right, "current" should not be used.  [...]

I absolutely agree.

> > In addition, is "In Japanese locales," correct?
> > Perhaps, is "In Japanese locale," better?
> 
> There is only one Japanese locale right now, ja_JP, but there could be
> others in the future.  There are many native speakers of Japanese in
> other countries; I don't know if they would want to use this calendar,
> but it's not out of the question.  So it seems more natural to me to
> say "In Japanese locales."
> [...]

I thought you meant that we already had multiple Japanese locales
which differ in the charset: ja_JP.UTF-8, ja_JP.EUC-JP, etc.

Regards,

Rafal
  

Patch

diff --git a/NEWS b/NEWS
index cc20102fda4..00fab6e8825 100644
--- a/NEWS
+++ b/NEWS
@@ -52,6 +52,10 @@  Major new features:
     - C-SKY ABIV2 soft-float little-endian
     - C-SKY ABIV2 hard-float little-endian
 
+* Improve the width of alternative representation for year in
+  strftime.  For %Ey conversion specifier, the default action is now
+  to pad the number with zero to keep minimum 2 digits, similar to %y.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * The glibc.tune tunable namespace has been renamed to glibc.cpu and the
diff --git a/manual/time.texi b/manual/time.texi
index 9e981314876..ab544e590c8 100644
--- a/manual/time.texi
+++ b/manual/time.texi
@@ -1339,7 +1339,7 @@  POSIX.2-1992 and by @w{ISO C99}, are:
 
 @table @code
 @item E
-Use the locale's alternate representation for date and time.  This
+Use the locale's alternative representation for date and time.  This
 modifier applies to the @code{%c}, @code{%C}, @code{%x}, @code{%X},
 @code{%y} and @code{%Y} format specifiers.  In a Japanese locale, for
 example, @code{%Ex} might yield a date format based on the Japanese
@@ -1347,7 +1347,7 @@  Emperors' reigns.
 
 @item O
 With all format specifiers that produce numbers: use the locale's
-alternate numeric symbols.
+alternative numeric symbols.
 
 With @code{%B}, @code{%b}, and @code{%h}: use the grammatical form for
 month names that is appropriate when the month is named by itself,
@@ -1355,7 +1355,7 @@  rather than the form that is appropriate when the month is used as
 part of a complete date.  This is a GNU extension.
 @end table
 
-If the format supports the modifier but no alternate representation
+If the format supports the modifier but no alternative representation
 is available, it is ignored.
 
 The conversion specifier ends with a format specifier taken from the
@@ -1568,6 +1568,11 @@  The preferred time of day representation for the current locale.
 The year without a century as a decimal number (range @code{00} through
 @code{99}).  This is equivalent to the year modulo 100.
 
+If the @code{E} modifier is specified (@code{%Ey}), the locale's
+alternative representation for year (the era year) is used instead.
+The default action is to pad the number with zero to keep minimum 2
+digits, similar to @code{%y}.
+
 @item %Y
 The year as a decimal number, using the Gregorian calendar.  Years
 before the year @code{1} are numbered @code{0}, @code{-1}, and so on.
diff --git a/time/strftime_l.c b/time/strftime_l.c
index 7ba4179de3e..cbe08e7afb4 100644
--- a/time/strftime_l.c
+++ b/time/strftime_l.c
@@ -1294,7 +1294,7 @@  __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 	      if (era)
 		{
 		  int delta = tp->tm_year - era->start_date[0];
-		  DO_NUMBER (1, (era->offset
+		  DO_NUMBER (2, (era->offset
 				 + delta * era->absolute_direction));
 		}
 #else
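
The effect of changing the first argument of DO_NUMBER from 1 to 2 can be sketched in isolation.  This is an assumption-laden illustration, not the glibc macro: format_number is a hypothetical helper that treats that argument as a minimum number of digits and zero-pads accordingly, which matches the behavior described in the commit message.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the padding done by DO_NUMBER: write
   `value` zero-padded to at least `min_digits` digits.  Returns the
   value returned by snprintf.  */
int
format_number (char *buf, size_t size, int min_digits, int value)
{
  return snprintf (buf, size, "%0*d", min_digits, value);
}
```

With min_digits set to 1 (the old default for %Ey), the first year of an era is written as "1".  With min_digits set to 2 (the new default), it is written as "01".  Year 30 stays "30", and a four-digit Buddhist calendar year such as 2562 is unaffected in both cases.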