[PING,v2] Resolve-flockfile-funlockfile-differences

Message ID 87ilhj3bsp.fsf@oracle.com
State Under Review
Delegated to: Florian Weimer
Headers
Series [PING,v2] Resolve-flockfile-funlockfile-differences |

Checks

Context Check Description
dj/TryBot-apply_patch fail Patch failed to apply to master at the time it was sent
dj/TryBot-32bit fail Patch series failed to apply

Commit Message

Cupertino Miranda Jan. 6, 2023, 3:47 p.m. UTC
  This is a request for review of the patch sent by Patrick McGehearty based on
an inconsistency in flockfile and funlockfile definitions (macro/function) when
used in glibc code.
At the same time this is a request for comments on the topic based on some
personal concerns on the patch.

After reading the description of the problem as explained in our bug tracking I
decided to make a more targeted patch, which will also help to better explain
the problem.

Unfortunately at this stage I cannot functionally verify this patch due to how
long it was identified and fixed at Oracle.

The patch defines a local wrapper to flockfile and funlockfile such that both
the function pointer passed to libc_cleanup_region_start and the direct call
would point to the same definition.

I have some concerns about the original presented patch as, IMO, it would allow
any user level application to bypass the lock/unlock by directly writing to
the file pointer flags, changing the original definition of f[unlock|lock]file
as described in the man pages.

Should both the macro/function definitions be coherent such that they should be
used interchangeably in glibc code?

Regards,
Cupertino

--- MORE TARGETED PATCH ---


> From a0291671db0d92a8762de3a45fd322e471a19a24 Mon Sep 17 00:00:00 2001
> From: Patrick McGehearty <patrick.mcgehearty@oracle.com>
> Date: Wed, 23 Nov 2022 21:02:02 +0000
> Subject: [PATCH v2] Resolve flockfile/funlockfile differences
>
> - - - - - -
> Only difference from v1 is to correct indenting/tabs
> for the changes to match other .c files in stdio-common.
> - - - - - -
> This patch resolves differences between flockfile/funlockfile and
> LOCK_STREAM/UNLOCKSTREAM to avoid failure to unlock a stream mutex in
> a multi-threaded application context which allows entering using
> LOCK_STREAM but leaving using funlockfile or vice-versa.
> This issue was detected during stress tests of a large proprietary
> application. The cause and solution was identified by Gerd Rausch.
>
> The issue occurs because _IO_funlockfile has different definitions in
> different contexts:
>
> Comparing the inline version in libio/libio.h
> #  define _IO_flockfile(_fp) \
>   if (((_fp)->_flags & _IO_USER_LOCK) == 0) _IO_flockfile (_fp)
>
> And the non-inline version in stdio-common/flockfile.c
> __flockfile (FILE *stream)
> {
>   _IO_lock_lock (*stream->_lock);
> }
>
> Note the lack of the _IO_USER_LOCK in the __flockfile version.  This
> difference means it is possible to bypass the lock in some cases and
> not release the lock in other cases. Either way, it causes trouble.
>
> The proposed fix is to simple add the _IO_USER_LOCK guard to the
> non-line versions of flockfile and funlockfile.
>
> Modified files:
>       stdio-common/flockfile.c
>       stdio-common/funlockfile.c
> ---
>  stdio-common/flockfile.c   | 3 ++-
>  stdio-common/funlockfile.c | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/stdio-common/flockfile.c b/stdio-common/flockfile.c
> index 49f72c6..7c82847 100644
> --- a/stdio-common/flockfile.c
> +++ b/stdio-common/flockfile.c
> @@ -22,7 +22,8 @@
>  void
>  __flockfile (FILE *stream)
>  {
> -  _IO_lock_lock (*stream->_lock);
> +  if ((stream->_flags & _IO_USER_LOCK) == 0)
> +    _IO_lock_lock (*stream->_lock);
>  }
>  weak_alias (__flockfile, flockfile);
>  weak_alias (__flockfile, _IO_flockfile)
> diff --git a/stdio-common/funlockfile.c b/stdio-common/funlockfile.c
> index bf44c99..b77b1b2 100644
> --- a/stdio-common/funlockfile.c
> +++ b/stdio-common/funlockfile.c
> @@ -23,7 +23,8 @@
>  void
>  __funlockfile (FILE *stream)
>  {
> -  _IO_lock_unlock (*stream->_lock);
> +  if ((stream->_flags & _IO_USER_LOCK) == 0)
> +    _IO_lock_unlock (*stream->_lock);
>  }
>  weak_alias (__funlockfile, _IO_funlockfile)
>  weak_alias (__funlockfile, funlockfile);
> --
> 1.8.3.1
  

Comments

Cupertino Miranda Jan. 12, 2023, 3:04 p.m. UTC | #1
Hi Carlos, Florian,

Last patchwork review meeting you mentioned some improvement patches to
(I think) __libc_cleanup_region_start/end mechanism.
I tried to find the patches in the libc-alpha archives without any
success.

Only one I found was something from 2018 changing it for mach:
https://ecos.sourceware.org/ml/libc-alpha/2018-03/msg00433.html

In any case the solution would not address the problem.

Can you possibly point me to the patches you refered to?
My searching ability is limited to the archives / google. It might be
simpler for you to find it. ;-)

Regards,
Cupertino

Cupertino Miranda via Libc-alpha writes:

> This is a request for review of the patch sent by Patrick McGehearty based on
> an inconsistency in flockfile and funlockfile definitions (macro/function) when
> used in glibc code.
> At the same time this is a request for comments on the topic based on some
> personal concerns on the patch.
>
> After reading the description of the problem as explained in our bug tracking I
> decided to make a more targeted patch, which will also help to better explain
> the problem.
>
> Unfortunately at this stage I cannot functionally verify this patch due to how
> long it was identified and fixed at Oracle.
>
> The patch defines a local wrapper to flockfile and funlockfile such that both
> the function pointer passed to libc_cleanup_region_start and the direct call
> would point to the same definition.
>
> I have some concerns about the original presented patch as, IMO, it would allow
> any user level application to bypass the lock/unlock by directly writing to
> the file pointer flags, changing the original definition of f[unlock|lock]file
> as described in the man pages.
>
> Should both the macro/function definitions be coherent such that they should be
> used interchangeably in glibc code?
>
> Regards,
> Cupertino
>
> --- MORE TARGETED PATCH ---
>
> diff --git a/stdio-common/vfscanf-internal.c b/stdio-common/vfscanf-internal.c
> index 2ad34050f3..5b55db962b 100644
> --- a/stdio-common/vfscanf-internal.c
> +++ b/stdio-common/vfscanf-internal.c
> @@ -177,11 +177,15 @@
>           return EOF;                                                         \
>         }                                                                     \
>      } while (0)
> +
> +static inline void checked_IO_funlockfile(FILE *s);
> +static inline void checked_IO_flockfile(FILE *s);
> +
>  #define LOCK_STREAM(S)                                                       \
> -  __libc_cleanup_region_start (1, (void (*) (void *)) &_IO_funlockfile, (S)); \
> -  _IO_flockfile (S)
> +  __libc_cleanup_region_start (1, (void (*) (void *)) &checked_IO_funlockfile, (S)); \
> +  checked_IO_flockfile (S)
>  #define UNLOCK_STREAM(S)                                                     \
> -  _IO_funlockfile (S);                                                       \
> +  checked_IO_funlockfile (S);                                                \
>    __libc_cleanup_region_end (0)
>
>  struct ptrs_to_free
> @@ -197,6 +201,18 @@ struct char_buffer {
>    struct scratch_buffer scratch;
>  };
>
> +static inline void
> +checked_IO_funlockfile(FILE *s)
> +{
> +  _IO_funlockfile (s);
> +}
> +
> +static inline void
> +checked_IO_flockfile(FILE *s)
> +{
> +  _IO_flockfile (s);
> +}
> +
>  /* Returns a pointer to the first CHAR_T object in the buffer.  Only
>     valid if char_buffer_add (BUFFER, CH) has been called and
>     char_buffer_error (BUFFER) is false.  */
>
>> From a0291671db0d92a8762de3a45fd322e471a19a24 Mon Sep 17 00:00:00 2001
>> From: Patrick McGehearty <patrick.mcgehearty@oracle.com>
>> Date: Wed, 23 Nov 2022 21:02:02 +0000
>> Subject: [PATCH v2] Resolve flockfile/funlockfile differences
>>
>> - - - - - -
>> Only difference from v1 is to correct indenting/tabs
>> for the changes to match other .c files in stdio-common.
>> - - - - - -
>> This patch resolves differences between flockfile/funlockfile and
>> LOCK_STREAM/UNLOCKSTREAM to avoid failure to unlock a stream mutex in
>> a multi-threaded application context which allows entering using
>> LOCK_STREAM but leaving using funlockfile or vice-versa.
>> This issue was detected during stress tests of a large proprietary
>> application. The cause and solution was identified by Gerd Rausch.
>>
>> The issue occurs because _IO_funlockfile has different definitions in
>> different contexts:
>>
>> Comparing the inline version in libio/libio.h
>> #  define _IO_flockfile(_fp) \
>>   if (((_fp)->_flags & _IO_USER_LOCK) == 0) _IO_flockfile (_fp)
>>
>> And the non-inline version in stdio-common/flockfile.c
>> __flockfile (FILE *stream)
>> {
>>   _IO_lock_lock (*stream->_lock);
>> }
>>
>> Note the lack of the _IO_USER_LOCK in the __flockfile version.  This
>> difference means it is possible to bypass the lock in some cases and
>> not release the lock in other cases. Either way, it causes trouble.
>>
>> The proposed fix is to simple add the _IO_USER_LOCK guard to the
>> non-line versions of flockfile and funlockfile.
>>
>> Modified files:
>>       stdio-common/flockfile.c
>>       stdio-common/funlockfile.c
>> ---
>>  stdio-common/flockfile.c   | 3 ++-
>>  stdio-common/funlockfile.c | 3 ++-
>>  2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/stdio-common/flockfile.c b/stdio-common/flockfile.c
>> index 49f72c6..7c82847 100644
>> --- a/stdio-common/flockfile.c
>> +++ b/stdio-common/flockfile.c
>> @@ -22,7 +22,8 @@
>>  void
>>  __flockfile (FILE *stream)
>>  {
>> -  _IO_lock_lock (*stream->_lock);
>> +  if ((stream->_flags & _IO_USER_LOCK) == 0)
>> +    _IO_lock_lock (*stream->_lock);
>>  }
>>  weak_alias (__flockfile, flockfile);
>>  weak_alias (__flockfile, _IO_flockfile)
>> diff --git a/stdio-common/funlockfile.c b/stdio-common/funlockfile.c
>> index bf44c99..b77b1b2 100644
>> --- a/stdio-common/funlockfile.c
>> +++ b/stdio-common/funlockfile.c
>> @@ -23,7 +23,8 @@
>>  void
>>  __funlockfile (FILE *stream)
>>  {
>> -  _IO_lock_unlock (*stream->_lock);
>> +  if ((stream->_flags & _IO_USER_LOCK) == 0)
>> +    _IO_lock_unlock (*stream->_lock);
>>  }
>>  weak_alias (__funlockfile, _IO_funlockfile)
>>  weak_alias (__funlockfile, funlockfile);
>> --
>> 1.8.3.1
  
Florian Weimer Feb. 13, 2023, 1:30 p.m. UTC | #2
* Cupertino Miranda via Libc-alpha:

> This is a request for review of the patch sent by Patrick McGehearty
> based on an inconsistency in flockfile and funlockfile definitions
> (macro/function) when used in glibc code.  At the same time this is a
> request for comments on the topic based on some personal concerns on
> the patch.
>
> After reading the description of the problem as explained in our bug
> tracking I decided to make a more targeted patch, which will also help
> to better explain the problem.
>
> Unfortunately at this stage I cannot functionally verify this patch
> due to how long it was identified and fixed at Oracle.
>
> The patch defines a local wrapper to flockfile and funlockfile such
> that both the function pointer passed to libc_cleanup_region_start and
> the direct call would point to the same definition.

So here's the general concern I have regarding
_IO_flockfile.  In libio/libio.h, we have:

    197 extern void _IO_flockfile (FILE *) __THROW;
    198 extern void _IO_funlockfile (FILE *) __THROW;
    199 extern int _IO_ftrylockfile (FILE *) __THROW;
    200 
    201 #define _IO_peekc(_fp) _IO_peekc_unlocked (_fp)
    202 #define _IO_flockfile(_fp) /**/
    203 #define _IO_funlockfile(_fp) ((void) 0)
    204 #define _IO_ftrylockfile(_fp) /**/
    205 #ifndef _IO_cleanup_region_start
    206 #define _IO_cleanup_region_start(_fct, _fp) /**/
    207 #endif
    208 #ifndef _IO_cleanup_region_end
    209 #define _IO_cleanup_region_end(_Doit) /**/
    210 #endif

    275 #ifdef _IO_MTSAFE_IO
    276 # undef _IO_peekc
    277 # undef _IO_flockfile
    278 # undef _IO_funlockfile
    279 # undef _IO_ftrylockfile
    280 
    281 # define _IO_peekc(_fp) _IO_peekc_locked (_fp)
    282 # define _IO_flockfile(_fp) \
    283   if (((_fp)->_flags & _IO_USER_LOCK) == 0) _IO_lock_lock (*(_fp)->_lock    283 )
    284 # define _IO_funlockfile(_fp) \
    285   if (((_fp)->_flags & _IO_USER_LOCK) == 0) _IO_lock_unlock (*(_fp)->_lo    285 ck)
    286 #endif /* _IO_MTSAFE_IO */

As you can see, the exact definition depends on _IO_MTSAFE_IO.  It is
set in Makeconfig, but again conditionally:

   1397 # A sysdeps Makeconfig fragment may set libc-reentrant to yes.
   1398 ifeq (yes,$(libc-reentrant))
   1399 defines += -D_LIBC_REENTRANT
   1400 
   1401 libio-mtsafe = -D_IO_MTSAFE_IO
   1402 endif

$(libc-reentrant) should always evaluate to yes because it is set as
such in sysdeps/pthread/Makeconfig.

On the other hand, this construct requires further opt-in by including
libio-mtsafe in CFLAGS or CPPFLAGS.  There seem to be uses that are
appear to be unconvered by -D_IO_MTSAFE_IO, for example the malloc
subdirectory.  Disassembling __malloc_stats seems to confirm that.

Maybe you can confirm based on your case notes whether this issue might
becaused by parts of glibc stomping over the flags due to the
inconsistency described above.  Or is it because the application sets
__fsetlocking (FSETLOCKING_BYCALLER) concurrently?  I don't see how we
can support the latter, that really seems to be an application bug.

Thanks,
Florian
  
Florian Weimer Sept. 4, 2023, 4:40 p.m. UTC | #3
* Cupertino Miranda via Libc-alpha:

> This is a request for review of the patch sent by Patrick McGehearty
> based on an inconsistency in flockfile and funlockfile definitions
> (macro/function) when used in glibc code.

It finally clicked to for me what is going on here.  Sorry for taking so
long.  It's a basic C problem.

_IO_funlockfile is defined as a macro and as a function.  The function
definition in stdio-common/funlockfile.c unlocks the stream
unconditionally.  The macro version checks if FSETLOCKING_BYCALLER is
active and skips the unlocking in that case.  Same for _IO_flockfile.

With &_IO_funlockfile, we get the unconditional unlock, but with
_IO_flockfile (S), we get the conditionalized lock.

The current code is clearly buggy.  It should be possible to write a
test for this, using ftrylockfile.  The direction of the change seems
right, too.

> Should both the macro/function definitions be coherent such that they should be
> used interchangeably in glibc code?

The external flockfile/funlockfile needs to be unconditional because I
think it is expected that these functions can be used to implement the
locking that is required if FSETLOCKING_BYCALLER is used.

Despite the recursive internal lock, checking for FSETLOCKING_BYCALLER
is probably more than a performance optimization because the unexpected
locking might introduce deadlocks.  So skipping the locking, as you did,
seems correct.

So I think we need the different internal and external behavior.  What
is clearly not okay though is that we use the same identifier for those
two different operations.  We cannot rename the external symbol, but we
really should rename the internal identifier.

Thanks,
Florian
  
Florian Weimer Sept. 5, 2023, 8:42 a.m. UTC | #4
* Cupertino Miranda via Libc-alpha:

> diff --git a/stdio-common/vfscanf-internal.c b/stdio-common/vfscanf-internal.c
> index 2ad34050f3..5b55db962b 100644
> --- a/stdio-common/vfscanf-internal.c
> +++ b/stdio-common/vfscanf-internal.c
> @@ -177,11 +177,15 @@
>           return EOF;                                                         \
>         }                                                                     \
>      } while (0)
> +
> +static inline void checked_IO_funlockfile(FILE *s);
> +static inline void checked_IO_flockfile(FILE *s);
> +
>  #define LOCK_STREAM(S)                                                       \
> -  __libc_cleanup_region_start (1, (void (*) (void *)) &_IO_funlockfile, (S)); \
> -  _IO_flockfile (S)
> +  __libc_cleanup_region_start (1, (void (*) (void *)) &checked_IO_funlockfile, (S)); \
> +  checked_IO_flockfile (S)
>  #define UNLOCK_STREAM(S)                                                     \
> -  _IO_funlockfile (S);                                                       \
> +  checked_IO_funlockfile (S);                                                \
>    __libc_cleanup_region_end (0)
>
>  struct ptrs_to_free
> @@ -197,6 +201,18 @@ struct char_buffer {
>    struct scratch_buffer scratch;
>  };
>
> +static inline void
> +checked_IO_funlockfile(FILE *s)
> +{
> +  _IO_funlockfile (s);
> +}
> +
> +static inline void
> +checked_IO_flockfile(FILE *s)
> +{
> +  _IO_flockfile (s);
> +}

A couple of suggestions:

We only need a wrapper for funlockfile.  We should avoid the forward
declaration.  I think we can keep calling _IO_funlockfile on the
no-cancel path.  The wrapper function should have the correct type, so
that the (technically undefined) cast in __libc_cleanup_region_start is
no longer needed.

Please file a bug, with a summary like ”fscanf may incorrectly unlock
FSETLOCKING_BYCALLER stream if canceled”.

Please let me know if you can make the adjustments.  I have verified
that the change fixes my test case.

Thanks,
Florian
  

Patch

diff --git a/stdio-common/vfscanf-internal.c b/stdio-common/vfscanf-internal.c
index 2ad34050f3..5b55db962b 100644
--- a/stdio-common/vfscanf-internal.c
+++ b/stdio-common/vfscanf-internal.c
@@ -177,11 +177,15 @@ 
          return EOF;                                                         \
        }                                                                     \
     } while (0)
+
+static inline void checked_IO_funlockfile(FILE *s);
+static inline void checked_IO_flockfile(FILE *s);
+
 #define LOCK_STREAM(S)                                                       \
-  __libc_cleanup_region_start (1, (void (*) (void *)) &_IO_funlockfile, (S)); \
-  _IO_flockfile (S)
+  __libc_cleanup_region_start (1, (void (*) (void *)) &checked_IO_funlockfile, (S)); \
+  checked_IO_flockfile (S)
 #define UNLOCK_STREAM(S)                                                     \
-  _IO_funlockfile (S);                                                       \
+  checked_IO_funlockfile (S);                                                \
   __libc_cleanup_region_end (0)

 struct ptrs_to_free
@@ -197,6 +201,18 @@  struct char_buffer {
   struct scratch_buffer scratch;
 };

+static inline void
+checked_IO_funlockfile(FILE *s)
+{
+  _IO_funlockfile (s);
+}
+
+static inline void
+checked_IO_flockfile(FILE *s)
+{
+  _IO_flockfile (s);
+}
+
 /* Returns a pointer to the first CHAR_T object in the buffer.  Only
    valid if char_buffer_add (BUFFER, CH) has been called and
    char_buffer_error (BUFFER) is false.  */