[v6,2/2] strftime: Pass the additional flags from "%EY" to "%Ey" [BZ #23758]

Message ID 201901110452.AA04175@tamuki.linet.gr.jp
State Superseded

Commit Message

TAMUKI Shoichi Jan. 11, 2019, 4:52 a.m. UTC
  For the output string of the conversion specifier "%EY", an optional
flag is given to the conversion specifier so that it can be also used
the current non-padding format, and the padding format can be
controlled.  To achieve this, when an optional flag is given to the
conversion specifier "%EY", the "%Ey" included in the combined
conversion specifier is interpreted as if decorated with the
appropriate flag.

Currently in glibc, besides ja_JP (Japan) locale, the locales using
the conversion specifier "%Ey" are lo_LA (Laos) and th_TH (Thailand).
In these locales, they use the Buddhist era.  The Buddhist era is a
value obtained by adding 543 to the Christian era, so they are not
affected by the change of the conversion specifier "%Ey".
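
The arithmetic behind that claim can be sketched in C.  This is only an
illustration: the helper name buddhist_era_year and the constant name are
made up here and are not part of glibc.  It assumes the input is a
struct tm style tm_year value (years since 1900).

```c
#include <assert.h>

/* Offset between the Buddhist era and the Christian era, as stated in
   the commit message.  */
enum { BUDDHIST_ERA_OFFSET = 543 };

/* Hypothetical helper, not glibc code: convert a tm_year value (years
   since 1900) to the corresponding Buddhist era year.  */
static int
buddhist_era_year (int tm_year)
{
  assert (tm_year > -1900);	/* Keep the toy model to years >= 1.  */
  return tm_year + 1900 + BUDDHIST_ERA_OFFSET;
}
```

For tm_year 88 (1988) this gives 2531, the first entry of the yrb table in
the new test.  A four-digit value already exceeds the two-digit minimum
width, so zero padding, space padding, and no padding all print the same
digits.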

ChangeLog:

	[BZ #23758]
	* NEWS: Mention the change.
	* manual/time.texi (strftime): Document the desctiption for "%EC" and
	"%EY".
	* time/Makefile: Add tst-strftime2 to tests.  Also add ja_JP.UTF-8,
	lo_LA.UTF-8, and th_TH.UTF-8 to LOCALES.
	* time/strftime_l.c (__strftime_internal): Add argument yr_spec to
	override padding for "%Ey".
	If an optional flag ('_' or '-') is specified to "%EY", the "%Ey" in
	subformat is interpreted as if decorated with the appropriate flag.
	* time/tst-strftime2.c: New file.
---
 NEWS                 |   3 ++
 manual/time.texi     |  10 ++++
 time/Makefile        |   5 +-
 time/strftime_l.c    |  20 +++++---
 time/tst-strftime2.c | 132 +++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 161 insertions(+), 9 deletions(-)
 create mode 100644 time/tst-strftime2.c
  

Comments

Zack Weinberg Jan. 16, 2019, 4:17 p.m. UTC | #1
This review covers only the documentation and commit message.


> For the output string of the conversion specifier "%EY", an optional
> flag is given to the conversion specifier so that it can be also used
> the current non-padding format, and the padding format can be
> controlled.  To achieve this, when an optional flag is given to the
> conversion specifier "%EY", the "%Ey" included in the combined
> conversion specifier is interpreted as if decorated with the
> appropriate flag.

This is unclear.  Suggest instead:

| The full representation of the alternative calendar year (%EY)
| typically includes an internal use of %Ey.  As a GNU extension,
| apply any flags on %EY (e.g. '%-EY', '%_EY') to the internal %Ey,
| allowing users of %EY to control how the year is padded.
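
A toy model of that propagation, for illustration only.  This is not the
glibc implementation: the function expand_era_format, its parameters, and
the ASCII subformat used in the note below are all assumptions made for
the sketch.  It expands a locale-style era format and applies the flag
from the enclosing %EY to each %Ey it finds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model, not glibc code.  Expand SUBFMT, in which "%EC" stands for
   the era name and "%Ey" for the year within the era.  OUTER_FLAG is
   the flag written on the enclosing %EY (0 for none, '_' or '-') and
   is applied to each "%Ey".  Output is truncated to fit SIZE.  */
static void
expand_era_format (char *out, size_t size, const char *subfmt,
                   const char *era_name, int era_year, char outer_flag)
{
  size_t used = 0;

  assert (out != NULL && size > 0);
  out[0] = '\0';
  while (*subfmt != '\0' && used + 1 < size)
    {
      int n;

      if (strncmp (subfmt, "%EC", 3) == 0)
        {
          n = snprintf (out + used, size - used, "%s", era_name);
          subfmt += 3;
        }
      else if (strncmp (subfmt, "%Ey", 3) == 0)
        {
          if (outer_flag == '_')	/* Pad with spaces to 2 digits.  */
            n = snprintf (out + used, size - used, "%2d", era_year);
          else if (outer_flag == '-')	/* No padding at all.  */
            n = snprintf (out + used, size - used, "%d", era_year);
          else				/* Default: zero-pad to 2 digits.  */
            n = snprintf (out + used, size - used, "%02d", era_year);
          subfmt += 3;
        }
      else
        {
          /* Any other character is copied through unchanged.  */
          n = snprintf (out + used, size - used, "%c", *subfmt);
          subfmt += 1;
        }

      if (n < 0 || (size_t) n >= size - used)
        break;			/* Error or truncated; stop here.  */
      used += (size_t) n;
    }
}
```

With the made-up subformat "%EC %Ey", era name "Heisei", and era year 2,
the three flag values 0, '_', and '-' give "Heisei 02", "Heisei  2", and
"Heisei 2" respectively.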

(It seems to me that this extension ought to be generalized to all of
the "macro" formats (%c, %D, %F, %r, %R, %T, %x, %X, %Ec, %Ex, %EX),
and to all format flags and field widths, but that would be a separate
patch and not appropriate for 2.29 at this point.)

> Currently in glibc, besides ja_JP (Japan) locale, the locales using
> the conversion specifier "%Ey" are lo_LA (Laos) and th_TH (Thailand).
> In these locales, they use the Buddhist era.  The Buddhist era is a
> value obtained by adding 543 to the Christian era, so they are not
> affected by the change of the conversion specifier "%Ey".

Drop this paragraph, as it is already covered in the previous patch's
commit message.

>
> ChangeLog:
>
>       [BZ #23758]
>       * NEWS: Mention the change.

Changes to NEWS are not mentioned in the ChangeLog.

>       * manual/time.texi (strftime): Document the desctiption for "%EC" and
>       "%EY".

Typo: "desctiption" -> "description".  But it would be even better written as:

|    * manual/time.texi (strftime): Document "%EC" and "%EY".

>       If an optional flag ('_' or '-') is specified to "%EY", the "%Ey" in
>       subformat is interpreted as if decorated with the appropriate flag.

Write this sentence with an active main verb.  Also, "subformat" needs
a "the", and "the appropriate" would be better as "that", which makes
clear that it applies the *same* optional flag that was used on the %EY.

|       If an optional flag ('_' or '-') is specified to "%EY", interpret
|       the "%Ey" in the subformat as if decorated with that flag.

>  * Improve the width of alternative representation for year in
>    strftime.  For %Ey conversion specifier, the default action is now
>    to pad the number with zero to keep minimum 2 digits, similar to %y.
> +  Also, the optional flag (either _ or -) can be used for %EY, so that
> +  the internal %Ey is interpreted as if decorated with the appropriate
> +  flag.

Paul's suggested revision of this addition is technically incorrect;
he got confused about which way around the flag propagates.  I would
recommend using a separate bullet point for this change, and I would
also recommend not describing the behavior in terms of implementation
details:

| * As a GNU extension, the '-' and '_' flags can now be applied to '%EY'
|   to control how the year number is formatted; they have the same effect
|   that they would on %Ey.

>  The century of the year.  This is equivalent to the greatest integer not
>  greater than the year divided by 100.
>
> +If the @code{E} modifier is specified (@code{%EC}), the locale's
> +alternative representation for year (the era name) is used instead.

Recommend instead

| If the @code{E} modifier is specified (@code{%EC}), instead produces
| the name of the period for the current year (e.g.@: an era name) in
| the locale's alternative calendar.

> +If the @code{E} modifier is specified (@code{%EY}), the locale's
> +alternative representation for year (generally the combination of
> +@code{%EC} and @code{%Ey}) is used instead.  In this case, the
> +optional flag (either @code{_} or @code{-}) can be used, so that the
> +internal @code{%Ey} is interpreted as if decorated with the
> +appropriate flag.

Recommend instead

| If the @code{E} modifier is specified (@code{%EY}), instead produces
| a complete representation of the year according to the locale's
| alternative calendar.  Generally this will be some combination of
| the information produced by @code{%EC} and @code{%Ey}.  As a GNU
| extension, the formatting flags @code{_} or @code{-} may be used
| with this conversion specifier; they affect how the year number is
| printed.
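
From the caller's side, the documented behavior might be exercised as in
the sketch below.  The helper name format_year_in_locale and its parameter
list are hypothetical; only setlocale, memset, and strftime are real
library calls.  The helper changes the process-wide LC_TIME locale and
does not restore it, which is acceptable for a small test program but
probably not for library code.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: format the date YEAR-MON-MDAY (calendar values,
   e.g. 1990, 4, 1) with FMT under LOCALE_NAME.  Returns the number of
   bytes written, or 0 if the locale cannot be selected or the result
   does not fit in BUF.  */
static size_t
format_year_in_locale (char *buf, size_t size, const char *locale_name,
                       const char *fmt, int year, int mon, int mday)
{
  struct tm tm;

  assert (buf != NULL && size > 0);
  buf[0] = '\0';
  if (setlocale (LC_TIME, locale_name) == NULL)
    return 0;

  /* Zero every field so strftime never reads indeterminate values.  */
  memset (&tm, 0, sizeof tm);
  tm.tm_year = year - 1900;
  tm.tm_mon = mon - 1;
  tm.tm_mday = mday;
  return strftime (buf, size, fmt, &tm);
}
```

Assuming a glibc with this patch applied and the ja_JP.UTF-8 locale
installed, calling the helper for 1990-04-01 with "%EY", "%_EY", and
"%-EY" should reproduce the strings in the reference table of
time/tst-strftime2.c: the era name followed by "02", " 2", and "2", then
the year suffix.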

zw
  
Rafal Luzynski Jan. 17, 2019, 6:54 p.m. UTC | #2
16.01.2019 17:17 Zack Weinberg <zackw@panix.com> wrote:
> 
> This review covers only the documentation and commit message.

Thank you Zack and Paul for your reviews.  They were the part we
were missing most.  I believe we can handle the rest of the patches
ourselves.

> [...]
> (It seems to me that this extension ought to be generalized to all of
> the "macro" formats (%c, %D, %F, %r, %R, %T, %x, %X, %Ec, %Ex, %EX),
> and to all format flags and field widths, but that would be a separate
> patch and not appropriate for 2.29 at this point.)

This may be too ambiguous and therefore impossible to implement.

> [...]
> Paul's suggested revision of this addition is technically incorrect;
> he got confused about which way around the flag propagates.  I would
> recommend using a separate bullet point for this change, and I would
> also recommend not describing the behavior in terms of implementation
> details:
> 
> | * As a GNU extension, the '-' and '_' flags can now be applied to '%EY'
> |   to control how the year number is formatted; they have the same effect
> |   that they would on %Ey.

"they would" or "they would have"?

Also, shouldn't all format specifiers be consistently quoted, like "%EY"
and "%Ey"?  I don't mind single quotes, especially for the flags, I just
think that %Ey (without any quotes) may not be absolutely clear.

Regards,

Rafal
  
Zack Weinberg Jan. 18, 2019, 2:32 a.m. UTC | #3
On Thu, Jan 17, 2019 at 1:55 PM Rafal Luzynski
<digitalfreak@lingonborough.com> wrote:
> 16.01.2019 17:17 Zack Weinberg <zackw@panix.com> wrote:
> > (It seems to me that this extension ought to be generalized to all of
> > the "macro" formats (%c, %D, %F, %r, %R, %T, %x, %X, %Ec, %Ex, %EX),
> > and to all format flags and field widths, but that would be a separate
> > patch and not appropriate for 2.29 at this point.)
>
> This may be too ambiguous and therefore impossible to implement.

Yeah, you may be right there.  It was just an idea.

> > | * As a GNU extension, the '-' and '_' flags can now be applied to '%EY'
> > |   to control how the year number is formatted; they have the same effect
> > |   that they would on %Ey.
>
> "they would" or "they would have"?

My ear says an additional "have" is not necessary, but feel free to
add it if it sounds better to you that way.

> Also, shouldn't all format specifiers be consistently quoted, like "%EY"
> and "%Ey"?  I don't mind single quotes, especially for the flags, I just
> think that %Ey (without any quotes) may not be absolutely clear.

This is NEWS, right?  In the manual it should be @code{} and no quotes
for all of them, IIRC, but yes, let's add quotes around %Ey.  It
doesn't matter to me whether they are single or double quotes.

zw
  

Patch

diff --git a/NEWS b/NEWS
index 00fab6e8825..82c1cdf9b3d 100644
--- a/NEWS
+++ b/NEWS
@@ -55,6 +55,9 @@  Major new features:
 * Improve the width of alternative representation for year in
   strftime.  For %Ey conversion specifier, the default action is now
   to pad the number with zero to keep minimum 2 digits, similar to %y.
+  Also, the optional flag (either _ or -) can be used for %EY, so that
+  the internal %Ey is interpreted as if decorated with the appropriate
+  flag.
 
 Deprecated and removed features, and other changes affecting compatibility:
 
diff --git a/manual/time.texi b/manual/time.texi
index ab544e590c8..9dcb35fed14 100644
--- a/manual/time.texi
+++ b/manual/time.texi
@@ -1393,6 +1393,9 @@  The preferred calendar time representation for the current locale.
 The century of the year.  This is equivalent to the greatest integer not
 greater than the year divided by 100.
 
+If the @code{E} modifier is specified (@code{%EC}), the locale's
+alternative representation for year (the era name) is used instead.
+
 This format was first standardized by POSIX.2-1992 and by @w{ISO C99}.
 
 @item %d
@@ -1577,6 +1580,13 @@  digits, similar to @code{%y}.
 The year as a decimal number, using the Gregorian calendar.  Years
 before the year @code{1} are numbered @code{0}, @code{-1}, and so on.
 
+If the @code{E} modifier is specified (@code{%EY}), the locale's
+alternative representation for year (generally the combination of
+@code{%EC} and @code{%Ey}) is used instead.  In this case, the
+optional flag (either @code{_} or @code{-}) can be used, so that the
+internal @code{%Ey} is interpreted as if decorated with the
+appropriate flag.
+
 @item %z
 @w{RFC 822}/@w{ISO 8601:1988} style numeric time zone (e.g.,
 @code{-0600} or @code{+0100}), or nothing if no time zone is
diff --git a/time/Makefile b/time/Makefile
index d23ba2dee6e..5c6304ece1d 100644
--- a/time/Makefile
+++ b/time/Makefile
@@ -43,13 +43,14 @@  tests	:= test_time clocktest tst-posixtz tst-strptime tst_wcsftime \
 	   tst-getdate tst-mktime tst-mktime2 tst-ftime_l tst-strftime \
 	   tst-mktime3 tst-strptime2 bug-asctime bug-asctime_r bug-mktime1 \
 	   tst-strptime3 bug-getdate1 tst-strptime-whitespace tst-ftime \
-	   tst-tzname tst-y2039 bug-mktime4
+	   tst-tzname tst-y2039 bug-mktime4 tst-strftime2
 
 include ../Rules
 
 ifeq ($(run-built-tests),yes)
 LOCALES := de_DE.ISO-8859-1 en_US.ISO-8859-1 ja_JP.EUC-JP fr_FR.UTF-8 \
-	   es_ES.UTF-8 pl_PL.UTF-8 ru_RU.UTF-8
+	   es_ES.UTF-8 pl_PL.UTF-8 ru_RU.UTF-8 \
+	   ja_JP.UTF-8 lo_LA.UTF-8 th_TH.UTF-8
 include ../gen-locales.mk
 
 $(objpfx)tst-ftime_l.out: $(gen-locales)
diff --git a/time/strftime_l.c b/time/strftime_l.c
index cbe08e7afb4..12d7c0e8744 100644
--- a/time/strftime_l.c
+++ b/time/strftime_l.c
@@ -434,7 +434,7 @@  static CHAR_T const month_name[][10] =
 #endif
 
 static size_t __strftime_internal (CHAR_T *, size_t, const CHAR_T *,
-				   const struct tm *, bool *
+				   const struct tm *, int *, bool *
 				   ut_argument_spec
 				   LOCALE_PARAM) __THROW;
 
@@ -456,8 +456,9 @@  my_strftime (CHAR_T *s, size_t maxsize, const CHAR_T *format,
   tmcopy = *tp;
   tp = &tmcopy;
 #endif
+  int yr_spec = 0;		/* Override padding for "%Ey".  */
   bool tzset_called = false;
-  return __strftime_internal (s, maxsize, format, tp, &tzset_called
+  return __strftime_internal (s, maxsize, format, tp, &yr_spec, &tzset_called
 			      ut_argument LOCALE_ARG);
 }
 #ifdef _LIBC
@@ -466,7 +467,7 @@  libc_hidden_def (my_strftime)
 
 static size_t
 __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
-		     const struct tm *tp, bool *tzset_called
+		     const struct tm *tp, int *yr_spec, bool *tzset_called
 		     ut_argument_spec LOCALE_PARAM)
 {
 #if defined _LIBC && defined USE_IN_EXTENDED_LOCALE_MODEL
@@ -838,11 +839,12 @@  __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 	  {
 	    CHAR_T *old_start = p;
 	    size_t len = __strftime_internal (NULL, (size_t) -1, subfmt,
-					      tp, tzset_called ut_argument
-					      LOCALE_ARG);
+					      tp, yr_spec, tzset_called
+					      ut_argument LOCALE_ARG);
 	    add (len, __strftime_internal (p, maxsize - i, subfmt,
-					   tp, tzset_called ut_argument
-					   LOCALE_ARG));
+					   tp, yr_spec, tzset_called
+					   ut_argument LOCALE_ARG));
+	    *yr_spec = 0;
 
 	    if (to_uppcase)
 	      while (old_start < p)
@@ -1273,6 +1275,8 @@  __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 # else
 		  subfmt = era->era_format;
 # endif
+		  if (pad != 0)
+		    *yr_spec = pad;
 		  goto subformat;
 		}
 #else
@@ -1294,6 +1298,8 @@  __strftime_internal (CHAR_T *s, size_t maxsize, const CHAR_T *format,
 	      if (era)
 		{
 		  int delta = tp->tm_year - era->start_date[0];
+		  if (*yr_spec != 0)
+		    pad = *yr_spec;
 		  DO_NUMBER (2, (era->offset
 				 + delta * era->absolute_direction));
 		}
diff --git a/time/tst-strftime2.c b/time/tst-strftime2.c
new file mode 100644
index 00000000000..57d2144c83c
--- /dev/null
+++ b/time/tst-strftime2.c
@@ -0,0 +1,132 @@ 
+/* Verify the behavior of strftime on alternative representation for
+   year.
+
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <locale.h>
+#include <time.h>
+#include <stdio.h>
+#include <string.h>
+
+static const char *locales[] = { "ja_JP.UTF-8", "lo_LA.UTF-8", "th_TH.UTF-8" };
+
+static const char *formats[] = { "%EY", "%_EY", "%-EY" };
+
+static const struct
+{
+  const int d, m, y;
+} dates[] =
+  {
+    { 1, 3, 88 },
+    { 7, 0, 89 },
+    { 8, 0, 89 },
+    { 1, 3, 90 },
+    { 1, 3, 97 },
+    { 1, 3, 98 }
+  };
+
+static char ref[3][3][6][100];
+
+static void
+mkreftable (void)
+{
+  int i, j, k;
+  char era[10];
+  static const int yrj[] = { 63, 64, 1, 2, 9, 10 };
+  static const int yrb[] = { 2531, 2532, 2532, 2533, 2540, 2541 };
+
+  for (i = 0; i < array_length (locales); i++)
+    for (j = 0; j < array_length (formats); j++)
+      for (k = 0; k < array_length (dates); k++)
+	{
+	  if (i == 0)
+	    {
+	      sprintf (era, "%s", (k < 2) ? "\xe6\x98\xad\xe5\x92\x8c"
+					  : "\xe5\xb9\xb3\xe6\x88\x90");
+	      if (yrj[k] == 1)
+		sprintf (ref[i][j][k], "%s\xe5\x85\x83\xe5\xb9\xb4", era);
+	      else
+		{
+		  if (j == 0)
+		    sprintf (ref[i][j][k], "%s%02d\xe5\xb9\xb4", era, yrj[k]);
+		  else if (j == 1)
+		    sprintf (ref[i][j][k], "%s%2d\xe5\xb9\xb4", era, yrj[k]);
+		  else
+		    sprintf (ref[i][j][k], "%s%d\xe5\xb9\xb4", era, yrj[k]);
+		}
+	    }
+	  else if (i == 1)
+	    {
+	      sprintf (era, "\xe0\xba\x9e\x2e\xe0\xba\xaa\x2e ");
+	      sprintf (ref[i][j][k], "%s%d", era, yrb[k]);
+	    }
+	  else
+	    {
+	      sprintf (era, "\xe0\xb8\x9e\x2e\xe0\xb8\xa8\x2e ");
+	      sprintf (ref[i][j][k], "%s%d", era, yrb[k]);
+	    }
+	}
+}
+
+static int
+do_test (void)
+{
+  int i, j, k, result = 0;
+  struct tm ttm;
+  char date[11], buf[100];
+  size_t r, e;
+
+  mkreftable ();
+  for (i = 0; i < array_length (locales); i++)
+    {
+      if (setlocale (LC_ALL, locales[i]) == NULL)
+	{
+	  printf ("locale %s does not exist, skipping...\n", locales[i]);
+	  continue;
+	}
+      printf ("[%s]\n", locales[i]);
+      for (j = 0; j < array_length (formats); j++)
+	{
+	  for (k = 0; k < array_length (dates); k++)
+	    {
+	      ttm.tm_mday = dates[k].d;
+	      ttm.tm_mon  = dates[k].m;
+	      ttm.tm_year = dates[k].y;
+	      strftime (date, sizeof (date), "%F", &ttm);
+	      r = strftime (buf, sizeof (buf), formats[j], &ttm);
+	      e = strlen (ref[i][j][k]);
+	      printf ("%s\t\"%s\"\t\"%s\"", date, formats[j], buf);
+	      if (strcmp (buf, ref[i][j][k]) != 0)
+		{
+		  printf ("\tshould be \"%s\"", ref[i][j][k]);
+		  if (r != e)
+		    printf ("\tgot: %zu, expected: %zu", r, e);
+		  result = 1;
+		}
+	      else
+		printf ("\tOK");
+	      putchar ('\n');
+	    }
+	  putchar ('\n');
+	}
+    }
+  return result;
+}
+
+#include <support/test-driver.c>