[v2] ungetc: Guarantee single char pushback

Message ID 20241129164128.1198099-1-siddhesh@sourceware.org
State Under Review
Delegated to: Florian Weimer
Headers
Series [v2] ungetc: Guarantee single char pushback |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Test passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Test passed
redhat-pt-bot/TryBot-32bit success Build for i686

Commit Message

Siddhesh Poyarekar Nov. 29, 2024, 4:41 p.m. UTC
  The C standard requires that ungetc guarantees at least one pushback, so
put a single byte pushback buffer in the FILE struct to enable that.

Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
---
Changes from v1:

- Drop ungetwc from scope of the patchset
- Fixed nits
- Retain old behaviour for legacy applications
- Minimize changes to fileops
- Namespace-ize free_backup_buf
- Add a test to verify that the subsequent malloc failure results in ungetc
  failure too
- Add GNU Toolchain Authors copyright notice.

 libio/bits/types/struct_FILE.h  |  4 +-
 libio/fileops.c                 |  7 ++-
 libio/genops.c                  | 26 +++++++--
 libio/libioP.h                  |  3 +
 stdio-common/Makefile           |  2 +
 stdio-common/tst-ungetc-nomem.c | 98 +++++++++++++++++++++++++++++++++
 6 files changed, 130 insertions(+), 10 deletions(-)
 create mode 100644 stdio-common/tst-ungetc-nomem.c
  

Comments

Florian Weimer Dec. 2, 2024, 9:18 p.m. UTC | #1
* Siddhesh Poyarekar:

> diff --git a/libio/genops.c b/libio/genops.c
> index d7e35e67d5..996daf0ad8 100644
> --- a/libio/genops.c
> +++ b/libio/genops.c
> @@ -1,4 +1,5 @@
>  /* Copyright (C) 1993-2024 Free Software Foundation, Inc.
> +   Copyright The GNU Toolchain Authors.
>     This file is part of the GNU C Library.
>  
>     The GNU C Library is free software; you can redistribute it and/or
> @@ -48,6 +49,13 @@ flush_cleanup (void *not_used)
>  }
>  #endif
>  
> +void
> +_IO_free_backup_buf (FILE *fp, char *ptr)
> +{
> +  if (ptr != fp->_short_backupbuf)
> +    free (ptr);
> +}

Shouldn't this check for the availability of the _short_backupbuf field,
too?

And with that in place, it might make sense to update oldfileops.c in
similar places, just in case something weird happens with the vtables
and we end up the old vtables code on new file handles.

For similar reasons, please convert the free calls for wide streams to
_IO_free_backup_buf, too.  We definitely have weird vtable interactions
for those.

> diff --git a/stdio-common/tst-ungetc-nomem.c b/stdio-common/tst-ungetc-nomem.c
> new file mode 100644
> index 0000000000..73b88e2e1c
> --- /dev/null
> +++ b/stdio-common/tst-ungetc-nomem.c

> +    FAIL_EXIT1 ("fwrite failed: 5m\n");

Typo: %m

> +  while (!feof (fp))
> +    {
> +      fail = true;
> +      TEST_COMPARE (ungetc ('y', fp), 'y');
> +      /* This will result in resizing, which should fail.  */
> +      TEST_COMPARE (ungetc ('y', fp), EOF);
> +      fail = false;
> +      TEST_COMPARE (fgetc (fp), 'y');

Hmm.  So ungetc doesn't set the error indicator?  POSIX doesn't mention
it, so it seems okay.

When I mentioned multiple ungetc calls in a row, I meant that first
force a single-byte buffer (with fail = true), and then re-enable malloc
and force switch to a larger buffer with multiple ungetc calls.

Thanks,
Florian
  
Siddhesh Poyarekar Dec. 2, 2024, 10:22 p.m. UTC | #2
On 2024-12-02 16:18, Florian Weimer wrote:
> * Siddhesh Poyarekar:
> 
>> diff --git a/libio/genops.c b/libio/genops.c
>> index d7e35e67d5..996daf0ad8 100644
>> --- a/libio/genops.c
>> +++ b/libio/genops.c
>> @@ -1,4 +1,5 @@
>>   /* Copyright (C) 1993-2024 Free Software Foundation, Inc.
>> +   Copyright The GNU Toolchain Authors.
>>      This file is part of the GNU C Library.
>>   
>>      The GNU C Library is free software; you can redistribute it and/or
>> @@ -48,6 +49,13 @@ flush_cleanup (void *not_used)
>>   }
>>   #endif
>>   
>> +void
>> +_IO_free_backup_buf (FILE *fp, char *ptr)
>> +{
>> +  if (ptr != fp->_short_backupbuf)
>> +    free (ptr);
>> +}
> 
> Shouldn't this check for the availability of the _short_backupbuf field,
> too?

Uhmm, I suppose so, so:

if (_IO_vtable_offset (fp) == 0 && ptr != fp->_short_backupbuf)
   ...

> And with that in place, it might make sense to update oldfileops.c in
> similar places, just in case something weird happens with the vtables
> and we end up the old vtables code on new file handles.

I'll take a look and clean up.

> For similar reasons, please convert the free calls for wide streams to
> _IO_free_backup_buf, too.  We definitely have weird vtable interactions
> for those.

Yes, I just wanted to separate it out from this change.  Would you 
rather prefer that I club the two changes together like I did the first 
time?

>> diff --git a/stdio-common/tst-ungetc-nomem.c b/stdio-common/tst-ungetc-nomem.c
>> new file mode 100644
>> index 0000000000..73b88e2e1c
>> --- /dev/null
>> +++ b/stdio-common/tst-ungetc-nomem.c
> 
>> +    FAIL_EXIT1 ("fwrite failed: 5m\n");
> 
> Typo: %m

Eep :/

>> +  while (!feof (fp))
>> +    {
>> +      fail = true;
>> +      TEST_COMPARE (ungetc ('y', fp), 'y');
>> +      /* This will result in resizing, which should fail.  */
>> +      TEST_COMPARE (ungetc ('y', fp), EOF);
>> +      fail = false;
>> +      TEST_COMPARE (fgetc (fp), 'y');
> 
> Hmm.  So ungetc doesn't set the error indicator?  POSIX doesn't mention
> it, so it seems okay.
> 
> When I mentioned multiple ungetc calls in a row, I meant that first
> force a single-byte buffer (with fail = true), and then re-enable malloc
> and force switch to a larger buffer with multiple ungetc calls.

Ah, so a *successful* switch after the transient failure; OK I'll add 
that and also see if there are any other combinations to test.

Thanks,
Sid
  
Florian Weimer Dec. 3, 2024, 8:38 a.m. UTC | #3
* Siddhesh Poyarekar:

> On 2024-12-02 16:18, Florian Weimer wrote:
>> * Siddhesh Poyarekar:
>> 
>>> diff --git a/libio/genops.c b/libio/genops.c
>>> index d7e35e67d5..996daf0ad8 100644
>>> --- a/libio/genops.c
>>> +++ b/libio/genops.c
>>> @@ -1,4 +1,5 @@
>>>   /* Copyright (C) 1993-2024 Free Software Foundation, Inc.
>>> +   Copyright The GNU Toolchain Authors.
>>>      This file is part of the GNU C Library.
>>>        The GNU C Library is free software; you can redistribute it
>>> and/or
>>> @@ -48,6 +49,13 @@ flush_cleanup (void *not_used)
>>>   }
>>>   #endif
>>>   +void
>>> +_IO_free_backup_buf (FILE *fp, char *ptr)
>>> +{
>>> +  if (ptr != fp->_short_backupbuf)
>>> +    free (ptr);
>>> +}
>> Shouldn't this check for the availability of the _short_backupbuf
>> field,
>> too?
>
> Uhmm, I suppose so, so:
>
> if (_IO_vtable_offset (fp) == 0 && ptr != fp->_short_backupbuf)
>   ...

Right.

>> For similar reasons, please convert the free calls for wide streams to
>> _IO_free_backup_buf, too.  We definitely have weird vtable interactions
>> for those.
>
> Yes, I just wanted to separate it out from this change.  Would you
> rather prefer that I club the two changes together like I did the
> first time?

Note that this is not about ungetwc support, it's about breaking stuff
on the wide character side due to the ungetc changes.

Thanks,
Florian
  

Patch

diff --git a/libio/bits/types/struct_FILE.h b/libio/bits/types/struct_FILE.h
index d8d26639d1..5d08509078 100644
--- a/libio/bits/types/struct_FILE.h
+++ b/libio/bits/types/struct_FILE.h
@@ -1,4 +1,5 @@ 
 /* Copyright (C) 1991-2024 Free Software Foundation, Inc.
+   Copyright The GNU Toolchain Authors.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -94,8 +95,9 @@  struct _IO_FILE_complete
   void *_freeres_buf;
   struct _IO_FILE **_prevchain;
   int _mode;
+  char _short_backupbuf[1];
   /* Make sure we don't get into trouble again.  */
-  char _unused2[15 * sizeof (int) - 5 * sizeof (void *)];
+  char _unused2[15 * sizeof (int) - 5 * sizeof (void *) - sizeof (char)];
 };
 
 /* These macros are used by bits/stdio.h and internal headers.  */
diff --git a/libio/fileops.c b/libio/fileops.c
index 759d737ec7..d49e489f55 100644
--- a/libio/fileops.c
+++ b/libio/fileops.c
@@ -1,4 +1,5 @@ 
 /* Copyright (C) 1993-2024 Free Software Foundation, Inc.
+   Copyright The GNU Toolchain Authors.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -480,7 +481,7 @@  _IO_new_file_underflow (FILE *fp)
       /* Maybe we already have a push back pointer.  */
       if (fp->_IO_save_base != NULL)
 	{
-	  free (fp->_IO_save_base);
+	  _IO_free_backup_buf (fp, fp->_IO_save_base);
 	  fp->_flags &= ~_IO_IN_BACKUP;
 	}
       _IO_doallocbuf (fp);
@@ -932,7 +933,7 @@  _IO_new_file_seekoff (FILE *fp, off64_t offset, int dir, int mode)
       /* It could be that we already have a pushback buffer.  */
       if (fp->_IO_read_base != NULL)
 	{
-	  free (fp->_IO_read_base);
+	  _IO_free_backup_buf (fp, fp->_IO_read_base);
 	  fp->_flags &= ~_IO_IN_BACKUP;
 	}
       _IO_doallocbuf (fp);
@@ -1282,7 +1283,7 @@  _IO_file_xsgetn (FILE *fp, void *data, size_t n)
       /* Maybe we already have a push back pointer.  */
       if (fp->_IO_save_base != NULL)
 	{
-	  free (fp->_IO_save_base);
+	  _IO_free_backup_buf (fp, fp->_IO_save_base);
 	  fp->_flags &= ~_IO_IN_BACKUP;
 	}
       _IO_doallocbuf (fp);
diff --git a/libio/genops.c b/libio/genops.c
index d7e35e67d5..996daf0ad8 100644
--- a/libio/genops.c
+++ b/libio/genops.c
@@ -1,4 +1,5 @@ 
 /* Copyright (C) 1993-2024 Free Software Foundation, Inc.
+   Copyright The GNU Toolchain Authors.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -48,6 +49,13 @@  flush_cleanup (void *not_used)
 }
 #endif
 
+void
+_IO_free_backup_buf (FILE *fp, char *ptr)
+{
+  if (ptr != fp->_short_backupbuf)
+    free (ptr);
+}
+
 /* Fields in struct _IO_FILE after the _lock field are internal to
    glibc and opaque to applications.  We can change them as long as
    the size of struct _IO_FILE is unchanged, which is checked as the
@@ -212,7 +220,7 @@  _IO_free_backup_area (FILE *fp)
 {
   if (_IO_in_backup (fp))
     _IO_switch_to_main_get_area (fp);  /* Just in case. */
-  free (fp->_IO_save_base);
+  _IO_free_backup_buf (fp, fp->_IO_save_base);
   fp->_IO_save_base = NULL;
   fp->_IO_save_end = NULL;
   fp->_IO_backup_base = NULL;
@@ -260,7 +268,7 @@  save_for_backup (FILE *fp, char *end_p)
 	memcpy (new_buffer + avail,
 		fp->_IO_read_base + least_mark,
 		needed_size);
-      free (fp->_IO_save_base);
+      _IO_free_backup_buf (fp, fp->_IO_save_base);
       fp->_IO_save_base = new_buffer;
       fp->_IO_save_end = new_buffer + avail + needed_size;
     }
@@ -636,7 +644,7 @@  _IO_default_finish (FILE *fp, int dummy)
 
   if (fp->_IO_save_base)
     {
-      free (fp->_IO_save_base);
+      _IO_free_backup_buf (fp, fp->_IO_save_base);
       fp->_IO_save_base = NULL;
     }
 
@@ -998,11 +1006,17 @@  _IO_default_pbackfail (FILE *fp, int c)
 	  else if (!_IO_have_backup (fp))
 	    {
 	      /* No backup buffer: allocate one. */
-	      /* Use nshort buffer, if unused? (probably not)  FIXME */
 	      int backup_size = 128;
 	      char *bbuf = (char *) malloc (backup_size);
 	      if (bbuf == NULL)
-		return EOF;
+		{
+		  /* Guarantee a 1-char pushback, except for legacy code where
+		     we don't have the extended part of FILE.  */
+		  if (__glibc_unlikely (_IO_vtable_offset (fp) != 0))
+		    return EOF;
+		  bbuf = fp->_short_backupbuf;
+		  backup_size = 1;
+		}
 	      fp->_IO_save_base = bbuf;
 	      fp->_IO_save_end = fp->_IO_save_base + backup_size;
 	      fp->_IO_backup_base = fp->_IO_save_end;
@@ -1022,7 +1036,7 @@  _IO_default_pbackfail (FILE *fp, int c)
 	    return EOF;
 	  memcpy (new_buf + (new_size - old_size), fp->_IO_read_base,
 		  old_size);
-	  free (fp->_IO_read_base);
+	  _IO_free_backup_buf (fp, fp->_IO_read_base);
 	  _IO_setg (fp, new_buf, new_buf + (new_size - old_size),
 		    new_buf + new_size);
 	  fp->_IO_backup_base = fp->_IO_read_ptr;
diff --git a/libio/libioP.h b/libio/libioP.h
index 34bf91fcd8..90ef8e90be 100644
--- a/libio/libioP.h
+++ b/libio/libioP.h
@@ -1,4 +1,5 @@ 
 /* Copyright (C) 1993-2024 Free Software Foundation, Inc.
+   Copyright The GNU Toolchain Authors.
    This file is part of the GNU C Library.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -357,6 +358,8 @@  typedef FILE *_IO_ITER;
 
 /* Generic functions */
 
+extern void _IO_free_backup_buf (FILE *, char *);
+libc_hidden_proto (_IO_free_backup_buf)
 extern void _IO_switch_to_main_get_area (FILE *) __THROW;
 extern void _IO_switch_to_backup_area (FILE *) __THROW;
 extern int _IO_switch_to_get_mode (FILE *);
diff --git a/stdio-common/Makefile b/stdio-common/Makefile
index e76e40e587..b1a04fd064 100644
--- a/stdio-common/Makefile
+++ b/stdio-common/Makefile
@@ -1,4 +1,5 @@ 
 # Copyright (C) 1991-2024 Free Software Foundation, Inc.
+# Copyright The GNU Toolchain Authors.
 # This file is part of the GNU C Library.
 
 # The GNU C Library is free software; you can redistribute it and/or
@@ -303,6 +304,7 @@  tests := \
   tst-tmpnam \
   tst-ungetc \
   tst-ungetc-leak \
+  tst-ungetc-nomem \
   tst-unlockedio \
   tst-vfprintf-mbs-prec \
   tst-vfprintf-user-type \
diff --git a/stdio-common/tst-ungetc-nomem.c b/stdio-common/tst-ungetc-nomem.c
new file mode 100644
index 0000000000..73b88e2e1c
--- /dev/null
+++ b/stdio-common/tst-ungetc-nomem.c
@@ -0,0 +1,98 @@ 
+/* Test ungetc behavior with malloc failures.
+   Copyright The GNU Toolchain Authors.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <stdio.h>
+#include <string.h>
+#include <support/check.h>
+#include <support/support.h>
+#include <support/temp_file.h>
+#include <support/xstdio.h>
+
+extern void *__libc_malloc (size_t)
+     __attribute__ ((malloc)) __attribute__ ((alloc_size (1)));
+
+static volatile bool fail = false;
+
+void *
+malloc (size_t sz)
+{
+  if (fail)
+    return NULL;
+
+  return __libc_malloc (sz);
+}
+
+static int
+do_test (void)
+{
+  char *filename = NULL;
+  struct stat props = {};
+  size_t bufsz = 0;
+
+  create_temp_file ("tst-ungetc-nomem.", &filename);
+  if (stat (filename, &props) != 0)
+    FAIL_EXIT1 ("Could not get file status: %m\n");
+
+  FILE *fp = fopen (filename, "w");
+
+  /* The libio buffer sizes are the same as block size.  */
+  bufsz = props.st_blksize + 2;
+
+  char *buf = xmalloc (bufsz);
+  memset (buf, 'a', bufsz);
+
+  if (fwrite (buf, sizeof (char), bufsz, fp) != bufsz)
+    FAIL_EXIT1 ("fwrite failed: 5m\n");
+  xfclose (fp);
+
+  /* Begin test.  */
+  fp = xfopen (filename, "r");
+
+
+  /* The standard requires the first ungetc to always work.  */
+  fail = true;
+  TEST_COMPARE (ungetc ('y', fp), 'y');
+
+  /* Now let the buffers get allocated to allow for subsequent tests.  */
+  fail = false;
+  TEST_COMPARE (fgetc (fp), 'y');
+  TEST_COMPARE (ungetc ('y', fp), 'y');
+  TEST_COMPARE (fgetc (fp), 'y');
+
+  while (!feof (fp))
+    {
+      fail = true;
+      TEST_COMPARE (ungetc ('y', fp), 'y');
+      /* This will result in resizing, which should fail.  */
+      TEST_COMPARE (ungetc ('y', fp), EOF);
+      fail = false;
+      TEST_COMPARE (fgetc (fp), 'y');
+      /* The second fgetc should return a char from the file or EOF.  */
+      char c = fgetc (fp);
+      if (!feof (fp))
+	TEST_COMPARE (c, 'a');
+    }
+
+  /* Final sanity check before we're done.  */
+  TEST_COMPARE (ferror (fp), 0);
+  xfclose (fp);
+
+  return 0;
+}
+
+#include <support/test-driver.c>