[v5,4/4] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]

Message ID alpine.DEB.2.21.2407122138340.38148@angie.orcam.me.uk
State Committed
Commit 89cddc8a7096f3d9225868304d2bc0a1aaf07d63
Delegated to: DJ Delorie
Headers
Series stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650] |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit success Build for i686
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Test passed

Commit Message

Maciej W. Rozycki July 12, 2024, 8:48 p.m. UTC
  From: Maciej W. Rozycki <macro@redhat.com>

Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer 
than INT_MAX (bug 27650)") and add a test case for the issue, inspired 
by the reproducer provided with the bug report.

This has been verified to succeed as from the commit referred and fail 
beforehand.

As the test requires 2GiB of data to be passed around its performance 
has been evaluated using a choice of systems and the execution time 
determined to be respectively in the range of 9s for POWER9@2.166GHz, 
24s for FU740@1.2GHz, and 40s for 74Kf@950MHz.  As this is on the verge 
of and beyond the default timeout it has been increased by the factor of 
8.  Regardless, following recent practice the test has been added to the 
standard rather than extended set.
---
Changes from v4:

- Discard return status tracking, having observed that any call to FAIL 
  will cause the test case to exit unsuccessfully owing to a call to 
  `adjust_exit_status' made by the test driver just before termination.

- Remove an extraneous `/' from LD_PRELOAD definition.

Changes from v3:

- Add tst-scanf-bz27650-ENV setting for `mtrace'.

Changes from v2:

- Rework error reporting in terms of <support/check.h>.

- Handle the error condition for `fgetc' analogously to `fscanf'.

- Improve the error message for the EOF condition for both functions.

- Call `fclose' before termination where appropriate.

Changes from v1:

- Reimplement in terms of `fopencookie', eliminating the need for a 
  subprocess and associated handling.

- Update execution times reported in the change description, slightly 
  reduced accordingly.

- Correct error handling for `fscanf'.

- Fix a typo s/MAX_INT/INT_MAX/ in comments.
---
 stdio-common/Makefile            |    5 +
 stdio-common/tst-scanf-bz27650.c |  108 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+)

glibc-tst-scanf-bz27650.diff
  

Comments

DJ Delorie July 23, 2024, 1:45 a.m. UTC | #1
"Maciej W. Rozycki" <macro@orcam.me.uk> writes:
> Index: glibc/stdio-common/Makefile
> +  tst-scanf-bz27650 \

Ok.

> +  tst-scanf-bz27650.mtrace \

Ok.

> +tst-scanf-bz27650-ENV = \
> +  MALLOC_TRACE=$(objpfx)tst-scanf-bz27650.mtrace \
> +  LD_PRELOAD=$(common-objpfx)malloc/libc_malloc_debug.so

Ok.

> Index: glibc/stdio-common/tst-scanf-bz27650.c
> +/* Test for BZ #27650, formatted input matching beyond INT_MAX.
> +   Copyright (C) 2024 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <error.h>
> +#include <errno.h>
> +#include <limits.h>
> +#include <mcheck.h>
> +#include <stddef.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +
> +#include <sys/types.h>
> +
> +#include <support/check.h>
> +#include <support/test-driver.h>

Ok.

> +/* Produce a stream of more than INT_MAX characters via buffer BUF of
> +   size SIZE according to bookkeeping in COOKIE and then return EOF.  */
> +
> +static ssize_t
> +io_read (void *cookie, char *buf, size_t size)
> +{
> +  unsigned int *written = cookie;
> +  unsigned int w = *written;
> +
> +  if (w > INT_MAX)
> +    return 0;

This works because w is unsigned (0..65535) vs INT_MAX (32767).

> +  memset (buf, 'a', size);
> +  *written = w + size;
> +  return size;
> +}

Ok.

> +/* Consume a stream of more than INT_MAX characters from an artificial
> +   input stream of which none is the new line character.  The call to
> +   fscanf is supposed to complete upon the EOF condition of input,
> +   however in the presence of BZ #27650 it will terminate prematurely
> +   with characters still outstanding in input.  Diagnose the condition
> +   and return status accordingly.  */
> +
> +int
> +do_test (void)
> +{
> +  static cookie_io_functions_t io_funcs = { .read = io_read };
> +  unsigned int written = 0;
> +  FILE *in;
> +  int v;
> +
> +  mtrace ();
> +
> +  in = fopencookie (&written, "r", io_funcs);
> +  if (in == NULL)
> +    {
> +      FAIL ("fopencookie: %m");
> +      goto out;
> +    }

Ok.

> +  v = fscanf (in, "%*[^\n]");

function under test, ok.  Actual return value should be 0 or 1.

> +  if (ferror (in))
> +    {
> +      FAIL ("fscanf: input failure, at %u: %m", written);
> +      goto out_close;
> +    }

Ok.  No errors should happen.

> +  else if (v == EOF)
> +    {
> +      FAIL ("fscanf: unexpected end of file, at %u", written);
> +      goto out_close;
> +    }

Ok.  Should not EOF reading the first valid token.

> +  if (!feof (in))

Shouldn't report EOF until the next attempt to read, ok.

> +    {
> +      v = fgetc (in);
> +      if (ferror (in))
> +	FAIL ("fgetc: input failure: %m");
> +      else if (v == EOF)
> +	FAIL ("fgetc: unexpected end of file after missing end of file");
> +      else if (v == '\n')
> +	FAIL ("unexpected new line character received");
> +      else
> +	FAIL ("character received after end of file expected: \\x%02x", v);
> +    }

Ok.

> +out_close:
> +  if (fclose (in) != 0)
> +    FAIL ("fclose: %m");
> +
> +out:
> +  return EXIT_SUCCESS;
> +}
> +
> +#define TIMEOUT (DEFAULT_TIMEOUT * 8)
> +#include <support/test-driver.c>

Ok.

LGTM
Reviewed-by: DJ Delorie <dj@redhat.com>
  
Maciej W. Rozycki July 26, 2024, 12:26 p.m. UTC | #2
On Mon, 22 Jul 2024, DJ Delorie wrote:

> LGTM
> Reviewed-by: DJ Delorie <dj@redhat.com>

 I have applied this change and the previous ones in this series now.  
Thank you for your review!

  Maciej
  

Patch

Index: glibc/stdio-common/Makefile
===================================================================
--- glibc.orig/stdio-common/Makefile
+++ glibc/stdio-common/Makefile
@@ -244,6 +244,7 @@  tests := \
   tst-scanf-binary-c23 \
   tst-scanf-binary-gnu11 \
   tst-scanf-binary-gnu89 \
+  tst-scanf-bz27650 \
   tst-scanf-intn \
   tst-scanf-round \
   tst-scanf-to_inpunct \
@@ -314,6 +315,7 @@  generated += \
   tst-printf-fp-free.mtrace \
   tst-printf-fp-leak-mem.out \
   tst-printf-fp-leak.mtrace \
+  tst-scanf-bz27650.mtrace \
   tst-vfprintf-width-prec-mem.out \
   tst-vfprintf-width-prec.mtrace \
   # generated
@@ -403,6 +405,9 @@  tst-printf-fp-free-ENV = \
 tst-printf-fp-leak-ENV = \
   MALLOC_TRACE=$(objpfx)tst-printf-fp-leak.mtrace \
   LD_PRELOAD=$(common-objpfx)/malloc/libc_malloc_debug.so
+tst-scanf-bz27650-ENV = \
+  MALLOC_TRACE=$(objpfx)tst-scanf-bz27650.mtrace \
+  LD_PRELOAD=$(common-objpfx)malloc/libc_malloc_debug.so
 
 $(objpfx)tst-unbputc.out: tst-unbputc.sh $(objpfx)tst-unbputc
 	$(SHELL) $< $(common-objpfx) '$(test-program-prefix)'; \
Index: glibc/stdio-common/tst-scanf-bz27650.c
===================================================================
--- /dev/null
+++ glibc/stdio-common/tst-scanf-bz27650.c
@@ -0,0 +1,108 @@ 
+/* Test for BZ #27650, formatted input matching beyond INT_MAX.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <error.h>
+#include <errno.h>
+#include <limits.h>
+#include <mcheck.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <sys/types.h>
+
+#include <support/check.h>
+#include <support/test-driver.h>
+
+/* Produce a stream of more than INT_MAX characters via buffer BUF of
+   size SIZE according to bookkeeping in COOKIE and then return EOF.  */
+
+static ssize_t
+io_read (void *cookie, char *buf, size_t size)
+{
+  unsigned int *written = cookie;
+  unsigned int w = *written;
+
+  if (w > INT_MAX)
+    return 0;
+
+  memset (buf, 'a', size);
+  *written = w + size;
+  return size;
+}
+
+/* Consume a stream of more than INT_MAX characters from an artificial
+   input stream of which none is the new line character.  The call to
+   fscanf is supposed to complete upon the EOF condition of input,
+   however in the presence of BZ #27650 it will terminate prematurely
+   with characters still outstanding in input.  Diagnose the condition
+   and return status accordingly.  */
+
+int
+do_test (void)
+{
+  static cookie_io_functions_t io_funcs = { .read = io_read };
+  unsigned int written = 0;
+  FILE *in;
+  int v;
+
+  mtrace ();
+
+  in = fopencookie (&written, "r", io_funcs);
+  if (in == NULL)
+    {
+      FAIL ("fopencookie: %m");
+      goto out;
+    }
+
+  v = fscanf (in, "%*[^\n]");
+  if (ferror (in))
+    {
+      FAIL ("fscanf: input failure, at %u: %m", written);
+      goto out_close;
+    }
+  else if (v == EOF)
+    {
+      FAIL ("fscanf: unexpected end of file, at %u", written);
+      goto out_close;
+    }
+
+  if (!feof (in))
+    {
+      v = fgetc (in);
+      if (ferror (in))
+	FAIL ("fgetc: input failure: %m");
+      else if (v == EOF)
+	FAIL ("fgetc: unexpected end of file after missing end of file");
+      else if (v == '\n')
+	FAIL ("unexpected new line character received");
+      else
+	FAIL ("character received after end of file expected: \\x%02x", v);
+    }
+
+out_close:
+  if (fclose (in) != 0)
+    FAIL ("fclose: %m");
+
+out:
+  return EXIT_SUCCESS;
+}
+
+#define TIMEOUT (DEFAULT_TIMEOUT * 8)
+#include <support/test-driver.c>