[v6,04/11] stdio-common: Add __translated_number_width

Message ID a74a5f508b649766c35adf9e4c26337937508de4.1671221440.git.fweimer@redhat.com
State Committed
Commit 46378560e056300623364669de2405a7182b064f
Series vfprintf refactor

Checks

Context                Check    Description
dj/TryBot-apply_patch  success  Patch applied to master at the time it was sent

Commit Message

Florian Weimer Dec. 16, 2022, 8:15 p.m. UTC
This function will be used to compute the width of a number
after i18n digit translation.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
---
 include/printf.h                       | 13 +++++++-
 stdio-common/Makefile                  |  1 +
 stdio-common/translated_number_width.c | 42 ++++++++++++++++++++++++++
 3 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 stdio-common/translated_number_width.c
  

Comments

Dmitry V. Levin Jan. 30, 2023, 9:40 a.m. UTC | #1
Hi,

On Fri, Dec 16, 2022 at 09:15:22PM +0100, Florian Weimer via Libc-alpha wrote:
> This function will be used to compute the width of a number
> after i18n digit translation.

I haven't bisected, but I suppose it was this changeset that introduced
a regression reported by the strace test suite.

glibc-2.36$ cat <<'EOF' |gcc -xc - && ./a.out
#include <stdio.h>
int main() { printf("%03d\n", 1); return 0; }
EOF
001

The same test on master prints 0001 instead of 001.
  
Florian Weimer Jan. 30, 2023, 10:42 a.m. UTC | #2
* Dmitry V. Levin:

> Hi,
>
> On Fri, Dec 16, 2022 at 09:15:22PM +0100, Florian Weimer via Libc-alpha wrote:
>> This function will be used to compute the width of a number
>> after i18n digit translation.
>
> I haven't bisected, but I suppose it was this changeset that introduced
> a regression reported by the strace test suite.
>
> glibc-2.36$ cat <<'EOF' |gcc -xc - && ./a.out
> #include <stdio.h>
> int main() { printf("%03d\n", 1); return 0; }
> EOF
> 001
>
> The same test on master prints 0001 instead of 001.

Sorry, I can't reproduce it.  Did you build glibc yourself from upstream
sources, or did you get the build from somewhere else?

Thanks,
Florian
  
Dmitry V. Levin Jan. 30, 2023, 10:51 a.m. UTC | #3
On Mon, Jan 30, 2023 at 11:42:21AM +0100, Florian Weimer via Libc-alpha wrote:
> Sorry, I can't reproduce it.  Did you build glibc yourself from upstream
> sources, or did you get the build from somewhere else?

This was initially reported by the strace test suite running on
rawhide-test.fedorainfracloud.org, and I suppose the patch submitted
by Andreas today fixes it.
  
Florian Weimer Jan. 30, 2023, 11:03 a.m. UTC | #4
* Dmitry V. Levin:

> This was initially reported by the strace test suite running on
> rawhide-test.fedorainfracloud.org, and I suppose the patch submitted
> by Andreas today fixes it.

Andreas' fix is for %#03o, though.  Perhaps the reduction above is
incorrect?

Thanks,
Florian
  
Dmitry V. Levin Jan. 30, 2023, 11:06 a.m. UTC | #5
On Mon, Jan 30, 2023 at 12:03:24PM +0100, Florian Weimer wrote:
> Andreas' fix is for %#03o, though.  Perhaps the reduction above is
> incorrect?

Oops, I posted the wrong reduction; it was originally %#03o indeed.
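
For reference, a reduction that matches the original %#03o report might look like the sketch below. The helper name format_alt_octal and its static buffer are made up for illustration and are not part of the strace test suite or of glibc. The sketch only formats a value with "%#03o" so the result can be compared against the expected string.

```c
#include <stdio.h>

/* Hypothetical helper: format VALUE with "%#03o" into a static buffer
   and return it.  Not reentrant; intended only as a small reproducer.  */
static const char *
format_alt_octal (unsigned int value)
{
  static char buf[16];
  snprintf (buf, sizeof buf, "%#03o", value);
  return buf;
}
```

With a C library that follows the standard, format_alt_octal (1) should yield "001": zero padding to width 3 already makes the first digit a zero, so the # flag adds nothing. By analogy with the 0001 output reported earlier in this thread, an affected build would presumably return a four-character string here.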
  

Patch

diff --git a/include/printf.h b/include/printf.h
index 8f064149d3..5127a45f9b 100644
--- a/include/printf.h
+++ b/include/printf.h
@@ -53,7 +53,18 @@  int __wprintf_function_invoke (void *, printf_function callback,
 
 #include <bits/types/locale_t.h>
 
-/* Now define the internal interfaces.  */
+/* Returns the width (as for printf, in bytes) of the converted ASCII
+   number in the characters in the range [FIRST, LAST).  The range
+   must only contain ASCII digits.  The caller is responsible for
+   avoiding overflow.
+
+   This function is used during non-wide digit translation.  Wide
+   digit translation produces one wide character per ASCII digit,
+   so the width is simply LAST - FIRST.  */
+int __translated_number_width (locale_t loc,
+			       const char *first, const char *last)
+  attribute_hidden;
+
 extern int __printf_fphex (FILE *, const struct printf_info *,
 			   const void *const *) attribute_hidden;
 extern int __printf_fp (FILE *, const struct printf_info *,
diff --git a/stdio-common/Makefile b/stdio-common/Makefile
index 6e6da091b1..3e0c574ca5 100644
--- a/stdio-common/Makefile
+++ b/stdio-common/Makefile
@@ -85,6 +85,7 @@  routines := \
   tmpfile64 \
   tmpnam \
   tmpnam_r \
+  translated_number_width \
   vfprintf \
   vfprintf-internal \
   vfscanf \
diff --git a/stdio-common/translated_number_width.c b/stdio-common/translated_number_width.c
new file mode 100644
index 0000000000..f42d5968a1
--- /dev/null
+++ b/stdio-common/translated_number_width.c
@@ -0,0 +1,42 @@ 
+/* Compute the printf width of a sequence of ASCII digits.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <limits.h>
+#include <locale/localeinfo.h>
+#include <printf.h>
+
+int
+__translated_number_width (locale_t loc, const char *first, const char *last)
+{
+  struct lc_ctype_data *ctype = loc->__locales[LC_CTYPE]->private;
+
+  if (ctype->outdigit_bytes_all_equal > 0)
+    return (last - first) * ctype->outdigit_bytes_all_equal;
+  else
+    {
+      /* Digits have varying length, so the fast path cannot be used.  */
+      int digits = 0;
+      for (const char *p = first; p < last; ++p)
+        {
+          assert ('0' <= *p && *p <= '9');
+          digits += ctype->outdigit_bytes[*p - '0'];
+        }
+      return digits;
+    }
+}
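
The two paths in __translated_number_width can be tried outside glibc with a small stand-alone sketch. Everything below is an assumption for illustration: struct digit_widths is a hypothetical stand-in for the two lc_ctype_data fields the patch reads, and the tables uniform2 and mixed contain made-up byte lengths rather than data from any real locale.

```c
#include <assert.h>

/* Hypothetical stand-in for the two lc_ctype_data fields used by the
   patch; this is not the glibc structure.  */
struct digit_widths
{
  /* Byte length of the translated form of each digit '0'..'9'.  */
  unsigned char outdigit_bytes[10];
  /* Common byte length if all ten entries are equal, otherwise 0.  */
  unsigned char outdigit_bytes_all_equal;
};

/* Made-up table: every translated digit occupies two bytes.  */
static const struct digit_widths uniform2
  = { { 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 }, 2 };

/* Made-up table: '0' occupies one byte, all other digits three.  */
static const struct digit_widths mixed
  = { { 1, 3, 3, 3, 3, 3, 3, 3, 3, 3 }, 0 };

/* Sample input, kept in one array so that pointer arithmetic on it
   is well defined.  */
static const char sample[] = "120";

/* Same structure as the function in the patch: multiply on the fast
   path, sum per-digit lengths otherwise.  As in the patch, the caller
   is responsible for avoiding overflow.  */
static int
sketch_translated_width (const struct digit_widths *w,
                         const char *first, const char *last)
{
  if (w->outdigit_bytes_all_equal > 0)
    return (int) (last - first) * w->outdigit_bytes_all_equal;

  int bytes = 0;
  for (const char *p = first; p < last; ++p)
    {
      assert ('0' <= *p && *p <= '9');
      bytes += w->outdigit_bytes[*p - '0'];
    }
  return bytes;
}
```

For the three digits in sample, the uniform2 table takes the fast path and gives 3 * 2 = 6 bytes, while the mixed table falls back to the loop and gives 3 + 3 + 1 = 7 bytes. An empty range yields 0 on either path.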