[v2] stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650]
Checks
Context |
Check |
Description |
redhat-pt-bot/TryBot-apply_patch |
success
|
Patch applied to master at the time it was sent
|
redhat-pt-bot/TryBot-32bit |
success
|
Build for i686
|
linaro-tcwg-bot/tcwg_glibc_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_glibc_check--master-arm |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 |
success
|
Test passed
|
Commit Message
Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer
than INT_MAX (bug 27650)") and add a test case for the issue, inspired
by the reproducer provided with the bug report.
This has been verified to succeed as from the commit referred and fail
beforehand.
As the test requires 2GiB of data to be passed around its performance
has been evaluated using a choice of systems and the execution time
determined to be respectively in the range of 9s for POWER9@2.166GHz,
24s for FU740@1.2GHz, and 40s for 74Kf@950MHz. As this is on the verge
of and beyond the default timeout it has been increased by the factor of
8. Regardless, following recent practice the test has been added to the
standard rather than extended set.
---
Hi,
This has been verified with the `powerpc64le-linux-gnu' (IBM POWER9)
native target and then the same host and the `riscv64-linux-gnu' (SiFive
FU740) and `mips-linux-gnu' (o32 ABI) (MIPS 74Kf) targets. This is so as
to assess performance requirements for the test case.
I now have a working test environment with a 21064A@266MHz Alpha machine
in my lab, so as a matter of interest as to performance I have tried this
test with that system as well. It has turned out to require ~4m30s to
complete, so setting TIMEOUTFACTOR=2 is required, not unreasonably for a
30 years old system. However the test also failed towards the end, with:
tst-scanf-bz27650: fscanf: input failure, at 2147483648: Cannot allocate memory
be it with v1 or with code updated for `fopencookie' and parts around
`fscanf' unchanged.
Having re-read the relevant parts of the ISO C and POSIX specifications I
have figured out that `fscanf' is one of the functions that returns a
result that is not indicative of an error and is allowed to set `errno'
arbitrarily even in the absence of an error both at a time:
"The value of errno may be set to nonzero by a library function call
whether or not there is an error, provided the use of errno is not
documented in the description of the function in this document."
and POSIX further clarifies for `fscanf':
"If a read error occurs, the error indicator for the stream shall be set."
Indeed returning -1 in `io_read' makes `fscanf' return 0 here and set the
error indicator. I have therefore updated error handling for `fscanf', by
explicitly checking for the error condition of the stream and if it is
clear, then ignoring `errno'. With the fix in place the `alpha-linux-gnu'
target now passes the test too.
I have re-verified that this updated test case still triggers a failure
with commit b03e4d7bd25b reverted.
Any questions, comments or concerns? Otherwise OK to apply?
Maciej
Changes from v1:
- Reimplement in terms of `fopencookie', eliminating the need for a
subprocess and associated handling.
- Update execution times reported in the change description, slightly
reduced accordingly.
- Correct error handling for `fscanf'.
- Fix a typo s/MAX_INT/INT_MAX/ in comments.
---
stdio-common/Makefile | 2
stdio-common/tst-scanf-bz27650.c | 92 +++++++++++++++++++++++++++++++++++++++
2 files changed, 94 insertions(+)
glibc-tst-scanf-bz27650.diff
Comments
* Maciej W. Rozycki:
> --- /dev/null
> +++ glibc/stdio-common/tst-scanf-bz27650.c
> @@ -0,0 +1,92 @@
> +/* Test for BZ #27650, formatted input matching beyond INT_MAX.
I like how much shorter the test is now.
> +#include <mcheck.h>
Where is <mcheck.h> used?
> + if (v == EOF || e != 0)
> + error (EXIT_FAILURE, e ? errno : 0,
> + "fscanf: input failure, at %u", written);
Tests should not write diagnostics to standard error (where they get
interspersed with make output). Please #include <support/check.h> and
use FAIL_EXIT1 instead (or TEST_VERIFY, TEST_COMPARE …).
> + if (!feof (in))
> + {
> + v = fgetc (in);
> + if (v == EOF)
> + error (EXIT_FAILURE, errno, "fgetc: input failure");
> + else if (v == '\n')
> + error (EXIT_FAILURE, 0, "unexpected new line character received");
> + else
> + error (EXIT_FAILURE, 0,
> + "character received after end of file expected: \\x%02x", v);
> + }
Missing fclose (in)?
On Tue, 2 Jul 2024, Florian Weimer wrote:
> > +#include <mcheck.h>
>
> Where is <mcheck.h> used?
There's a call to `mtrace' in `do_test' right at the beginning.
> > + if (v == EOF || e != 0)
> > + error (EXIT_FAILURE, e ? errno : 0,
> > + "fscanf: input failure, at %u", written);
>
> Tests should not write diagnostics to standard error (where they get
> interspersed with make output). Please #include <support/check.h> and
> use FAIL_EXIT1 instead (or TEST_VERIFY, TEST_COMPARE …).
There's prior usage, presumably with older test cases, but indeed I've
become aware of this peculiarity in the course of my other effort, and
it's just escaped me here. Honestly, given that there's no use for fd 2
for test cases the test driver could well just assign `stderr = stdio',
making all the available standard error reporting facilities work as
expected automagically. Maybe it's worth doing as a separate cleanup?
Anyway the macros you refer to standardise the structure of the messages
produced, so after some experimenting I chose to follow your suggestion
rather than doing `stderr = stdio'. However sadly there is currently no
macro that lets one pass a descriptive message but does not terminate the
program or return from the current function. So I chose to trivially add
one and converted the patch into a mini patch series, having concluded
that calling `support_print_failure_impl' directly from a test case felt a
bit hackish.
> > + if (!feof (in))
> > + {
> > + v = fgetc (in);
> > + if (v == EOF)
> > + error (EXIT_FAILURE, errno, "fgetc: input failure");
> > + else if (v == '\n')
> > + error (EXIT_FAILURE, 0, "unexpected new line character received");
> > + else
> > + error (EXIT_FAILURE, 0,
> > + "character received after end of file expected: \\x%02x", v);
> > + }
>
> Missing fclose (in)?
Technically not needed, but you're right it's a bit sloppy not to have
it. This however requires the test case to be reshaped a little for the
cleanup to be always called provided that the stream has been successfully
opened.
I have posted v3 now.
Maciej
On Thu, 4 Jul 2024, Maciej W. Rozycki wrote:
> > > +#include <mcheck.h>
> >
> > Where is <mcheck.h> used?
>
> There's a call to `mtrace' in `do_test' right at the beginning.
And I have spotted that for this to have any effect explicit arrangements
have to be made for the test harness, so I have now posted v4 to address
that, superseding v3. Sorry for the extra noise.
Maciej
On Thu, 4 Jul 2024, Maciej W. Rozycki wrote:
> And I have spotted that for this to have any effect explicit arrangements
> have to be made for the test harness, so I have now posted v4 to address
> that, superseding v3. Sorry for the extra noise.
For the record, I have obviously noticed the CI failures, hard to miss.
It just did not occur to me, sigh, that adding a FAIL definition to
support/check.h is actually a global change and therefore requires running
the whole test suite, unlike with earlier iterations. And it happened
that the FAIL definition clashed with a bunch of same-named local macros,
including, sadly, some recently added, which could have been chosen to use
support/check.h facilities instead. I guess it wasn't my day yesterday.
I've been working on v5 since I became aware of the problem. Current ETA
is Monday next week.
Maciej
===================================================================
@@ -244,6 +244,7 @@ tests := \
tst-scanf-binary-c23 \
tst-scanf-binary-gnu11 \
tst-scanf-binary-gnu89 \
+ tst-scanf-bz27650 \
tst-scanf-intn \
tst-scanf-round \
tst-scanf-to_inpunct \
@@ -314,6 +315,7 @@ generated += \
tst-printf-fp-free.mtrace \
tst-printf-fp-leak-mem.out \
tst-printf-fp-leak.mtrace \
+ tst-scanf-bz27650.mtrace \
tst-vfprintf-width-prec-mem.out \
tst-vfprintf-width-prec.mtrace \
# generated
===================================================================
@@ -0,0 +1,92 @@
+/* Test for BZ #27650, formatted input matching beyond INT_MAX.
+ Copyright (C) 2024 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <error.h>
+#include <errno.h>
+#include <limits.h>
+#include <mcheck.h>
+#include <stddef.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include <sys/types.h>
+
+#include <support/test-driver.h>
+
+/* Produce a stream of more than INT_MAX characters via buffer BUF of
+ size SIZE according to bookkeeping in COOKIE and then return EOF. */
+
+static ssize_t
+io_read (void *cookie, char *buf, size_t size)
+{
+ unsigned int *written = cookie;
+ unsigned int w = *written;
+
+ if (w > INT_MAX)
+ return 0;
+
+ memset (buf, 'a', size);
+ *written = w + size;
+ return size;
+}
+
+/* Consume a stream of more than INT_MAX characters from an artificial
+ input stream of which none is the new line character. The call to
+ fscanf is supposed to complete upon the EOF condition of input,
+ however in the presence of BZ #27650 it will terminate prematurely
+ with characters still outstanding in input. Diagnose the condition
+ and return status accordingly. */
+
+int
+do_test (void)
+{
+ static cookie_io_functions_t io_funcs = { .read = io_read };
+ unsigned int written = 0;
+ FILE *in;
+ int v, e;
+
+ mtrace ();
+
+ in = fopencookie (&written, "r", io_funcs);
+ if (in == NULL)
+ error (EXIT_FAILURE, errno, "fopencookie");
+
+ v = fscanf (in, "%*[^\n]");
+ e = ferror (in);
+ if (v == EOF || e != 0)
+ error (EXIT_FAILURE, e ? errno : 0,
+ "fscanf: input failure, at %u", written);
+
+ if (!feof (in))
+ {
+ v = fgetc (in);
+ if (v == EOF)
+ error (EXIT_FAILURE, errno, "fgetc: input failure");
+ else if (v == '\n')
+ error (EXIT_FAILURE, 0, "unexpected new line character received");
+ else
+ error (EXIT_FAILURE, 0,
+ "character received after end of file expected: \\x%02x", v);
+ }
+
+ return EXIT_SUCCESS;
+}
+
+#define TIMEOUT (DEFAULT_TIMEOUT * 8)
+#include <support/test-driver.c>