abort: Only flush file-based stdio streams before termination

Message ID 20170817133507.CEA5341DB79B0@oldenburg.str.redhat.com
State Superseded
Headers

Commit Message

Florian Weimer Aug. 17, 2017, 1:35 p.m. UTC
  Historically, glibc flushes streams on abort, which is not
required by POSIX.  This can trigger additional work
(including callbacks through function pointers) in processes
which are known to be in a bad state.  After this change,
only streams which are backed by the standard descriptor-based
implementation are flushed.

2017-08-17  Florian Weimer  <fweimer@redhat.com>

	Only flush file-based stdio streams on process abort.
	* libio/genops.c (_IO_flush_all_lockp): Add restrict_vtables
	parameter.  Do not call __overflow on vtable mismatch.
	(_IO_flush_all): Adjust call to _IO_flush_all_lockp.
	(_IO_cleanup): Likewise.
	* libio/libioP.h (_IO_JUMPS_FUNC_NOVALIDATE): New macro.
	(_IO_JUMPS_FUNC): Use it.
	(_IO_flush_all_lockp): Add restrict_vtables parameter.  Make
	hidden.
	* stdlib/abort.c (fflush): Remove macro definition.
	(abort): Call _IO_flush_all_lockp directly, with vtable
	restriction.
	* stdlib/tst-abort-fflush.c: Newfile.
	* stdlib/Makefile (tests): Add it.
  

Comments

Carlos O'Donell Aug. 17, 2017, 1:45 p.m. UTC | #1
On 08/17/2017 09:35 AM, Florian Weimer wrote:
> Historically, glibc flushes streams on abort, which is not
> required by POSIX.  This can trigger additional work
> (including callbacks through function pointers) in processes
> which are known to be in a bad state.  After this change,
> only streams which are backed by the standard descriptor-based
> implementation are flushed.

Does this mean open_memstream streams to NVM won't be flushed by
default?

As a user I might be convinced that my own custom streams need to
flushed by hand in an abort handler, but I might expect open_memstream
streams to be flushed.
  
Adhemerval Zanella Aug. 17, 2017, 1:50 p.m. UTC | #2
On 17/08/2017 10:45, Carlos O'Donell wrote:
> On 08/17/2017 09:35 AM, Florian Weimer wrote:
>> Historically, glibc flushes streams on abort, which is not
>> required by POSIX.  This can trigger additional work
>> (including callbacks through function pointers) in processes
>> which are known to be in a bad state.  After this change,
>> only streams which are backed by the standard descriptor-based
>> implementation are flushed.
> 
> Does this mean open_memstream streams to NVM won't be flushed by
> default?
> 
> As a user I might be convinced that my own custom streams need to
> flushed by hand in an abort handler, but I might expect open_memstream
> streams to be flushed.
> 

For my BZ#21735 fix (libio: Fix open_memstream fflush (NULL)), you
indicate you asked for clarification regarding open_memstream [1].  
I think we should follow the same semantic for abort after Austin 
Group clarification.

[1] http://austingroupbugs.net/view.php?id=1156
  
Carlos O'Donell Aug. 17, 2017, 1:57 p.m. UTC | #3
On 08/17/2017 09:50 AM, Adhemerval Zanella wrote:
> 
> 
> On 17/08/2017 10:45, Carlos O'Donell wrote:
>> On 08/17/2017 09:35 AM, Florian Weimer wrote:
>>> Historically, glibc flushes streams on abort, which is not
>>> required by POSIX.  This can trigger additional work
>>> (including callbacks through function pointers) in processes
>>> which are known to be in a bad state.  After this change,
>>> only streams which are backed by the standard descriptor-based
>>> implementation are flushed.
>>
>> Does this mean open_memstream streams to NVM won't be flushed by
>> default?
>>
>> As a user I might be convinced that my own custom streams need to
>> flushed by hand in an abort handler, but I might expect open_memstream
>> streams to be flushed.
>>
> 
> For my BZ#21735 fix (libio: Fix open_memstream fflush (NULL)), you
> indicate you asked for clarification regarding open_memstream [1].  
> I think we should follow the same semantic for abort after Austin 
> Group clarification.
> 
> [1] http://austingroupbugs.net/view.php?id=1156
 
Agreed.

IMO until such clarification is made we should conservatively
flush open_memstream streams.
  
Florian Weimer Aug. 17, 2017, 1:59 p.m. UTC | #4
On 08/17/2017 03:45 PM, Carlos O'Donell wrote:
> On 08/17/2017 09:35 AM, Florian Weimer wrote:
>> Historically, glibc flushes streams on abort, which is not
>> required by POSIX.  This can trigger additional work
>> (including callbacks through function pointers) in processes
>> which are known to be in a bad state.  After this change,
>> only streams which are backed by the standard descriptor-based
>> implementation are flushed.
> 
> Does this mean open_memstream streams to NVM won't be flushed by
> default?

You can't put open_memstream streams on NVM because their backing
storage is tied to the malloc allocator.

> As a user I might be convinced that my own custom streams need to
> flushed by hand in an abort handler, but I might expect open_memstream
> streams to be flushed.

Flushing open_memstream streams prior to process termination is not
observable, except via an abort handler.  I don't think we need to flush
them.

Thanks,
Florian
  
Florian Weimer Aug. 17, 2017, 2:19 p.m. UTC | #5
On 08/17/2017 03:50 PM, Adhemerval Zanella wrote:
> For my BZ#21735 fix (libio: Fix open_memstream fflush (NULL)), you
> indicate you asked for clarification regarding open_memstream [1].  
> I think we should follow the same semantic for abort after Austin 
> Group clarification.

POSIX does not require flushing on abort, so this is a separate matter.

Florian
  
Andreas Schwab Aug. 17, 2017, 2:29 p.m. UTC | #6
On Aug 17 2017, fweimer@redhat.com (Florian Weimer) wrote:

> Historically, glibc flushes streams on abort, which is not
> required by POSIX.  This can trigger additional work
> (including callbacks through function pointers) in processes
> which are known to be in a bad state.  After this change,
> only streams which are backed by the standard descriptor-based
> implementation are flushed.

That still doesn't make abort thread-safe.

Andreas.
  
Florian Weimer Aug. 17, 2017, 2:35 p.m. UTC | #7
On 08/17/2017 04:29 PM, Andreas Schwab wrote:
> On Aug 17 2017, fweimer@redhat.com (Florian Weimer) wrote:
> 
>> Historically, glibc flushes streams on abort, which is not
>> required by POSIX.  This can trigger additional work
>> (including callbacks through function pointers) in processes
>> which are known to be in a bad state.  After this change,
>> only streams which are backed by the standard descriptor-based
>> implementation are flushed.
> 
> That still doesn't make abort thread-safe.

Do you mean async-signal-safe?

Sure, but I think it's an incremental improvement over what we have today.

I'm also fine with pulling all the flushing from abort.  But I don't
know if we have consensus for that.

Thanks,
Florian
  
Andreas Schwab Aug. 17, 2017, 2:50 p.m. UTC | #8
On Aug 17 2017, Florian Weimer <fweimer@redhat.com> wrote:

> On 08/17/2017 04:29 PM, Andreas Schwab wrote:
>> On Aug 17 2017, fweimer@redhat.com (Florian Weimer) wrote:
>> 
>>> Historically, glibc flushes streams on abort, which is not
>>> required by POSIX.  This can trigger additional work
>>> (including callbacks through function pointers) in processes
>>> which are known to be in a bad state.  After this change,
>>> only streams which are backed by the standard descriptor-based
>>> implementation are flushed.
>> 
>> That still doesn't make abort thread-safe.
>
> Do you mean async-signal-safe?

No, thread-safe.  Accessing _IO_list_all without locking is broken to
begin with.

Andreas.
  
Carlos O'Donell Aug. 17, 2017, 3:31 p.m. UTC | #9
On 08/17/2017 09:59 AM, Florian Weimer wrote:
> On 08/17/2017 03:45 PM, Carlos O'Donell wrote:
>> On 08/17/2017 09:35 AM, Florian Weimer wrote:
>>> Historically, glibc flushes streams on abort, which is not
>>> required by POSIX.  This can trigger additional work
>>> (including callbacks through function pointers) in processes
>>> which are known to be in a bad state.  After this change,
>>> only streams which are backed by the standard descriptor-based
>>> implementation are flushed.
>>
>> Does this mean open_memstream streams to NVM won't be flushed by
>> default?
> 
> You can't put open_memstream streams on NVM because their backing
> storage is tied to the malloc allocator.

The allocator is user interposable.

>> As a user I might be convinced that my own custom streams need to
>> flushed by hand in an abort handler, but I might expect open_memstream
>> streams to be flushed.
> 
> Flushing open_memstream streams prior to process termination is not
> observable, except via an abort handler.  I don't think we need to flush
> them.

It is observable if you back the allocations onto NVM.

It is conceivable to do so for a robust application that has to journal
state and recover after a crash.

My opinion is that we should flush memory streams, but I'm not firmly
entrenched in this opinion.

I admit the NVM example is contrived, but using NVM in a case like this
is exactly what *I* would use NVM for.

I'm OK with this change if we clearly document what we're doing in the
glibc manual, and explain the alternative solution of flushing from the
abort handler.

The downside to signal handling to flush streams is that signal actions
are not easily composable and therefore rely on the various library
authors coordinating in some kind of compositional way to handle abort
cleanup.
  
Florian Weimer Aug. 17, 2017, 3:52 p.m. UTC | #10
On 08/17/2017 05:31 PM, Carlos O'Donell wrote:

> I'm OK with this change if we clearly document what we're doing in the
> glibc manual, and explain the alternative solution of flushing from the
> abort handler.

The manual currently does not list abort as an action which flushes any
buffers:

<http://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html>

I think you are making up an implementation constraint which does not
actually exist.

What I'm trying to do is to get rid of the flushing (to get a cleaner
process termination sequence) while preserving the legacy behavior that
stdout/stderr and other file buffers are flushed on termination because
that's easily user-visible.  Considering that flushing streams which are
not file-backed can allocate memory using malloc and that abort can be
called from all kinds of contexts (including malloc itself), I think
that's a reasonable precaution.

Thanks,
Florian
  
Florian Weimer Aug. 17, 2017, 3:57 p.m. UTC | #11
On 08/17/2017 04:50 PM, Andreas Schwab wrote:
> On Aug 17 2017, Florian Weimer <fweimer@redhat.com> wrote:
> 
>> On 08/17/2017 04:29 PM, Andreas Schwab wrote:
>>> On Aug 17 2017, fweimer@redhat.com (Florian Weimer) wrote:
>>>
>>>> Historically, glibc flushes streams on abort, which is not
>>>> required by POSIX.  This can trigger additional work
>>>> (including callbacks through function pointers) in processes
>>>> which are known to be in a bad state.  After this change,
>>>> only streams which are backed by the standard descriptor-based
>>>> implementation are flushed.
>>>
>>> That still doesn't make abort thread-safe.
>>
>> Do you mean async-signal-safe?
> 
> No, thread-safe.  Accessing _IO_list_all without locking is broken to
> begin with.

In your opinion, should we remove the flushing altogether?

We still have a similar problem during regular process termination, though.

Thanks,
Florian
  
Adhemerval Zanella Aug. 17, 2017, 5:29 p.m. UTC | #12
On 17/08/2017 12:52, Florian Weimer wrote:
> On 08/17/2017 05:31 PM, Carlos O'Donell wrote:
> 
>> I'm OK with this change if we clearly document what we're doing in the
>> glibc manual, and explain the alternative solution of flushing from the
>> abort handler.
> 
> The manual currently does not list abort as an action which flushes any
> buffers:
> 
> <http://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html>
> 
> I think you are making up an implementation constraint which does not
> actually exist.
> 
> What I'm trying to do is to get rid of the flushing (to get a cleaner
> process termination sequence) while preserving the legacy behavior that
> stdout/stderr and other file buffers are flushed on termination because
> that's easily user-visible.  Considering that flushing streams which are
> not file-backed can allocate memory using malloc and that abort can be
> called from all kinds of contexts (including malloc itself), I think
> that's a reasonable precaution.

I will double-check, but I recall that at least for open_memstream with
my BZ#21735 flushing on about did *not* allocate memory.  And I am
aware it is a different issue, but we will add different semantics for
flushing (even though it is not POSIX).

But I agree with Carlos it should be ok with a proper documentation
about this flushing behaviour.
  
Florian Weimer Aug. 18, 2017, 12:23 p.m. UTC | #13
On 08/17/2017 05:52 PM, Florian Weimer wrote:
> On 08/17/2017 05:31 PM, Carlos O'Donell wrote:
> 
>> I'm OK with this change if we clearly document what we're doing in the
>> glibc manual, and explain the alternative solution of flushing from the
>> abort handler.
> 
> The manual currently does not list abort as an action which flushes any
> buffers:
> 
> <http://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html>
> 
> I think you are making up an implementation constraint which does not
> actually exist.

Furthermore, GCC performs dead store elimination on malloc'ed memory
whichis incompatible with your made-up use case:

#include <stdlib.h>

void f (int i)
{
  int *p = malloc (sizeof (p));
  *p = i;
  abort ();
}

f:
	subq	$8, %rsp
	call	abort

malloc is not an interface useful for persistent memory.

Thanks,
Florian
  
Carlos O'Donell Aug. 18, 2017, 1:45 p.m. UTC | #14
On 08/18/2017 08:23 AM, Florian Weimer wrote:
> On 08/17/2017 05:52 PM, Florian Weimer wrote:
>> On 08/17/2017 05:31 PM, Carlos O'Donell wrote:
>>
>>> I'm OK with this change if we clearly document what we're doing in the
>>> glibc manual, and explain the alternative solution of flushing from the
>>> abort handler.
>>
>> The manual currently does not list abort as an action which flushes any
>> buffers:
>>
>> <http://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html>
>>
>> I think you are making up an implementation constraint which does not
>> actually exist.
> 
> Furthermore, GCC performs dead store elimination on malloc'ed memory
> whichis incompatible with your made-up use case:
> 
> #include <stdlib.h>
> 
> void f (int i)
> {
>   int *p = malloc (sizeof (p));
>   *p = i;
>   abort ();
> }
> 
> f:
> 	subq	$8, %rsp
> 	call	abort
> 
> malloc is not an interface useful for persistent memory.

I agree that DSE is a general NVM problem that needs to be solved.

I don't agree that you can use this specific small example and problem
to refute that flushing an opem_memstream is not useful.

The open memstream is much more complicated than this, and I bet
you can't prove DSE like above to remove all of the stores to the memory
stream (not with the present compiler technology).

Do we have a technical objection to flushing the open memory streams?

I agree that we should do less work on abort(), you have convinced me
of that, but given the existing behaviour to flush streams, I want to
take make these changes conservative. Avoiding flushing user streams
is a good idea, so is removing the complex backtracing.
  
Florian Weimer Aug. 18, 2017, 2:11 p.m. UTC | #15
On 08/18/2017 03:45 PM, Carlos O'Donell wrote:

> Do we have a technical objection to flushing the open memory streams?

abort needs to be async-signal-safe and our current implementation of
open_memstream allocates on flush, for example if exactly 8192 bytes
have been written:

#include <assert.h>
#include <dlfcn.h>
#include <err.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static volatile int to_write;
static volatile bool pending_flush;
static volatile bool malloc_called;

void *
malloc (size_t sz)
{
  malloc_called = true;
  if (pending_flush)
    {
      pending_flush = false;
      printf ("malloc called for size %d\n", to_write);
    }
  void *(*next) (size_t) = dlsym (RTLD_NEXT, "malloc");
  return next (sz);
}

int
main (int argc, char **argv)
{
  to_write = atoi (argv[1]);
  char *data = malloc (to_write);
  assert (malloc_called);
  if (data == NULL)
    err (1, "malloc");
  memset (data, 'X', to_write);

  char *buffer = NULL;
  size_t length = 0;
  FILE *fp = open_memstream (&buffer, &length);
  if (fwrite (data, to_write, 1, fp) != 1)
    err (1, "fwrite");

  pending_flush = true;
  fflush (fp);
}

$ ./memstream-alloc 8192
malloc called for size 8192

Thanks,
Florian
  
Carlos O'Donell Aug. 18, 2017, 2:25 p.m. UTC | #16
On 08/18/2017 10:11 AM, Florian Weimer wrote:
> On 08/18/2017 03:45 PM, Carlos O'Donell wrote:
> 
>> Do we have a technical objection to flushing the open memory streams?
> 
> abort needs to be async-signal-safe and our current implementation of
> open_memstream allocates on flush, for example if exactly 8192 bytes
> have been written:

Thank you for checking this.

Adhemerval, Is there any way to remove the allocation on flush?
  
Adhemerval Zanella Aug. 18, 2017, 3:18 p.m. UTC | #17
On 18/08/2017 11:25, Carlos O'Donell wrote:
> On 08/18/2017 10:11 AM, Florian Weimer wrote:
>> On 08/18/2017 03:45 PM, Carlos O'Donell wrote:
>>
>>> Do we have a technical objection to flushing the open memory streams?
>>
>> abort needs to be async-signal-safe and our current implementation of
>> open_memstream allocates on flush, for example if exactly 8192 bytes
>> have been written:
> 
> Thank you for checking this.
> 
> Adhemerval, Is there any way to remove the allocation on flush?
> 

I think so, the open_memstream malloc on Florian tests came from
the _IO_str_overflow at:

libio/memstream.c:

107 static int
108 _IO_mem_sync (_IO_FILE *fp)
109 {
110   struct _IO_FILE_memstream *mp = (struct _IO_FILE_memstream *) fp;
111 
112   if (fp->_IO_write_ptr == fp->_IO_write_end)
113     {
114       _IO_str_overflow (fp, '\0');
115       --fp->_IO_write_ptr;
116     }
117 
118   *mp->bufloc = fp->_IO_write_base;
119   *mp->sizeloc = fp->_IO_write_ptr - fp->_IO_write_base;
120 
121   return 0;
122 }

However this '\0' overflow check is superflous, underlying FILE operation
used for data input already keep the buffer NULL-terminated (because
_IO_str_overflow which is used for buffer enlarge zeros the buffer). 
Removing the code and just setting _IO_mem_sync does not trigger any
regression and avoid the memory allocation.
  
Andreas Schwab Aug. 21, 2017, 7:23 a.m. UTC | #18
On Aug 17 2017, Florian Weimer <fweimer@redhat.com> wrote:

> In your opinion, should we remove the flushing altogether?

Yes.  abort must be thread-safe, and there is no way to do flushing
without locking.

> We still have a similar problem during regular process termination, though.

_IO_cleanup must do locking as well, of course.

Andreas.
  
Florian Weimer Aug. 21, 2017, 8:20 a.m. UTC | #19
On 08/21/2017 09:23 AM, Andreas Schwab wrote:
> On Aug 17 2017, Florian Weimer <fweimer@redhat.com> wrote:
> 
>> In your opinion, should we remove the flushing altogether?
> 
> Yes.  abort must be thread-safe, and there is no way to do flushing
> without locking.

In the current implementation, yes.

Carlos, do you still think we must continue to flush on abort?

I'm concerned that the current is state (the abort mechanism can be
abused for code execution) is desired by no one, and a squabble over the
ideal behavior of a from-scratch stdio prevents us from applying
incremental improvements to the code we have today.

>> We still have a similar problem during regular process termination, though.
> 
> _IO_cleanup must do locking as well, of course.

That means that a process cannot terminate if flockfile on a stream has
been called without a matching funlockfile.  I don't think this is
permitted by POSIX, and wouldn't be a desirable implementation, either.

In a hypothetical, from-scratch stdio implementation, it should be
possible to implement flush-once without locking, but it requires
careful ordering of buffer pointer updates (or two locks instead of one).

Thanks,
Florian
  
Andreas Schwab Aug. 21, 2017, 8:58 a.m. UTC | #20
On Aug 21 2017, Florian Weimer <fweimer@redhat.com> wrote:

>> _IO_cleanup must do locking as well, of course.
>
> That means that a process cannot terminate if flockfile on a stream has
> been called without a matching funlockfile.  I don't think this is
> permitted by POSIX, and wouldn't be a desirable implementation, either.

Is it?  That would simply be a programming error.  Since POSIX requires
locking on stdio I don't see how it can require exit to use no locking.

> In a hypothetical, from-scratch stdio implementation, it should be
> possible to implement flush-once without locking, but it requires
> careful ordering of buffer pointer updates (or two locks instead of one).

I don't think making stdio lock-free is desirable.

Andreas.
  
Florian Weimer Aug. 21, 2017, 9:11 a.m. UTC | #21
On 08/21/2017 10:58 AM, Andreas Schwab wrote:
> On Aug 21 2017, Florian Weimer <fweimer@redhat.com> wrote:
> 
>>> _IO_cleanup must do locking as well, of course.
>>
>> That means that a process cannot terminate if flockfile on a stream has
>> been called without a matching funlockfile.  I don't think this is
>> permitted by POSIX, and wouldn't be a desirable implementation, either.
> 
> Is it?  That would simply be a programming error.  Since POSIX requires
> locking on stdio I don't see how it can require exit to use no locking.

Is exit a function which references a FILE * object?  What about fflush
(NULL)?

POSIX says this:

“
All functions that reference (FILE *) objects, except those with names
ending in _unlocked, shall behave as if they use flockfile() and
funlockfile() internally to obtain ownership of these (FILE *) objects.
”

I have no idea whether this expresses an intent that only explicit FILE
* references cause locking and, potentially blocking, or if this wording
is the result of a quick specification hack to add thread safety to stdio.

>> In a hypothetical, from-scratch stdio implementation, it should be
>> possible to implement flush-once without locking, but it requires
>> careful ordering of buffer pointer updates (or two locks instead of one).
> 
> I don't think making stdio lock-free is desirable.

But neither is blocking on exit because a thread happens to have called
flockfile on a stdio stream.

Thanks,
Florian
  
Andreas Schwab Aug. 21, 2017, 9:39 a.m. UTC | #22
On Aug 21 2017, Florian Weimer <fweimer@redhat.com> wrote:

> But neither is blocking on exit because a thread happens to have called
> flockfile on a stdio stream.

_IO_cleanup must at least lock _IO_list_all.

Andreas.
  
Carlos O'Donell Aug. 21, 2017, 12:57 p.m. UTC | #23
On 08/21/2017 04:20 AM, Florian Weimer wrote:
> On 08/21/2017 09:23 AM, Andreas Schwab wrote:
>> On Aug 17 2017, Florian Weimer <fweimer@redhat.com> wrote:
>>
>>> In your opinion, should we remove the flushing altogether?
>>
>> Yes.  abort must be thread-safe, and there is no way to do flushing
>> without locking.
> 
> In the current implementation, yes.
> 
> Carlos, do you still think we must continue to flush on abort?
> 
> I'm concerned that the current is state (the abort mechanism can be
> abused for code execution) is desired by no one, and a squabble over the
> ideal behavior of a from-scratch stdio prevents us from applying
> incremental improvements to the code we have today.

Conceptually I am OK with removing *all* flushing on abort.

It is the *partial* flushing for which I have stronger opinions about
what should and should not get flushed.

For the removal of all flushing I would like to think through all of
the consequences including answering:

(a) How do authors get back the previous behaviour in the event they
    need it during a transition period?

    (a.1) Can we offer a tunable for a few releases to allow authors
          to put back the flush (ignored for secure processes)?

(b) How do we explain to application authors that we will no longer
    support any flushing during abort, and that what is left in the
    buffers is unrecoverable? Is it purely an argument about security
    and the complexity of attack vectors?

(c) Do we provide better documentation for application authors who
    are looking for something more robust against crashes? Give
    examples for how to setup a FILE* that is always as close to
    flushed as possible (setvbuf to no buffer).

I want to mitigate any potential problems this will cause our users.
  
Szabolcs Nagy Aug. 21, 2017, 1:46 p.m. UTC | #24
On 21/08/17 10:11, Florian Weimer wrote:
> On 08/21/2017 10:58 AM, Andreas Schwab wrote:
>> Is it?  That would simply be a programming error.  Since POSIX requires
>> locking on stdio I don't see how it can require exit to use no locking.
> 
> Is exit a function which references a FILE * object?  What about fflush
> (NULL)?
> 
> POSIX says this:
> 
> “
> All functions that reference (FILE *) objects, except those with names
> ending in _unlocked, shall behave as if they use flockfile() and
> funlockfile() internally to obtain ownership of these (FILE *) objects.
> ”
> 
> I have no idea whether this expresses an intent that only explicit FILE
> * references cause locking and, potentially blocking, or if this wording
> is the result of a quick specification hack to add thread safety to stdio.
> 

see interpretation response of
http://austingroupbugs.net/view.php?id=611
  
Florian Weimer Aug. 21, 2017, 1:50 p.m. UTC | #25
On 08/21/2017 03:46 PM, Szabolcs Nagy wrote:
> On 21/08/17 10:11, Florian Weimer wrote:
>> On 08/21/2017 10:58 AM, Andreas Schwab wrote:
>>> Is it?  That would simply be a programming error.  Since POSIX requires
>>> locking on stdio I don't see how it can require exit to use no locking.
>>
>> Is exit a function which references a FILE * object?  What about fflush
>> (NULL)?
>>
>> POSIX says this:
>>
>> “
>> All functions that reference (FILE *) objects, except those with names
>> ending in _unlocked, shall behave as if they use flockfile() and
>> funlockfile() internally to obtain ownership of these (FILE *) objects.
>> ”
>>
>> I have no idea whether this expresses an intent that only explicit FILE
>> * references cause locking and, potentially blocking, or if this wording
>> is the result of a quick specification hack to add thread safety to stdio.
>>
> 
> see interpretation response of
> http://austingroupbugs.net/view.php?id=611

*sigh* It's still ambiguous wheter blocking is required or not.

Thanks,
Florian
  
Carlos O'Donell Aug. 23, 2017, 4:09 a.m. UTC | #26
On 08/17/2017 11:52 AM, Florian Weimer wrote:
> On 08/17/2017 05:31 PM, Carlos O'Donell wrote:
> 
>> I'm OK with this change if we clearly document what we're doing in the
>> glibc manual, and explain the alternative solution of flushing from the
>> abort handler.
> 
> The manual currently does not list abort as an action which flushes any
> buffers:
> 
> <http://www.gnu.org/software/libc/manual/html_node/Flushing-Buffers.html>
> 
> I think you are making up an implementation constraint which does not
> actually exist.

Past behaviour is indeed an implementation constraint, if after a long
enough time, everyone expects it to be that way. Worse I would say that
by flushing buffers we have made an implementation-dependent decision
to do so.

We should document it because C11 leaves it up to the implementation to
decide on this behaviour:

C11 7.22.4.1.2
Whether open streams with unwritten buffered data are flushed, open streams
are closed, or temporary files are removed is implementation-defined.

> What I'm trying to do is to get rid of the flushing (to get a cleaner
> process termination sequence) while preserving the legacy behavior that
> stdout/stderr and other file buffers are flushed on termination because
> that's easily user-visible.  Considering that flushing streams which are
> not file-backed can allocate memory using malloc and that abort can be
> called from all kinds of contexts (including malloc itself), I think
> that's a reasonable precaution.

I think we both in general agree on the strategy we're taking, but I'm 
arguing that if we flush anything, then we need to be more conservative.

It's easier to argue we won't flush anything and then document that.
  

Patch

diff --git a/libio/genops.c b/libio/genops.c
index 6ad7346cae..d97196ebef 100644
--- a/libio/genops.c
+++ b/libio/genops.c
@@ -790,7 +790,7 @@  _IO_get_column (_IO_FILE *fp)
 
 
 int
-_IO_flush_all_lockp (int do_lock)
+_IO_flush_all_lockp (int do_lock, int restrict_vtables)
 {
   int result = 0;
   struct _IO_FILE *fp;
@@ -816,9 +816,32 @@  _IO_flush_all_lockp (int do_lock)
 	       && fp->_mode > 0 && (fp->_wide_data->_IO_write_ptr
 				    > fp->_wide_data->_IO_write_base))
 #endif
-	   )
-	  && _IO_OVERFLOW (fp, EOF) == EOF)
-	result = EOF;
+	   ))
+	{
+	  /* Only invoke __overflow if the vtable refers to an actual
+	     file.  (mmap'ed files are read-only and do not need
+	     flushing in this mode.)  */
+	  const struct _IO_jump_t *vtable = NULL;
+	  if (restrict_vtables)
+	    {
+	      vtable = _IO_JUMPS_FUNC_NOVALIDATE (fp);
+	      if (!(vtable == &_IO_file_jumps
+		    || vtable == &_IO_wfile_jumps
+#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_1)
+		    || vtable == &_IO_old_file_jumps
+#endif
+		    ))
+		vtable = NULL;
+	    }
+	  else
+	    vtable = _IO_JUMPS_FUNC (fp);
+
+	  if (vtable != NULL)
+	    {
+	      if (vtable->__overflow (fp, EOF))
+		result = EOF;
+	    }
+	}
 
       if (do_lock)
 	_IO_funlockfile (fp);
@@ -848,7 +871,7 @@  int
 _IO_flush_all (void)
 {
   /* We want locking.  */
-  return _IO_flush_all_lockp (1);
+  return _IO_flush_all_lockp (1, 0);
 }
 libc_hidden_def (_IO_flush_all)
 
@@ -982,7 +1005,7 @@  _IO_cleanup (void)
 {
   /* We do *not* want locking.  Some threads might use streams but
      that is their problem, we flush them underneath them.  */
-  int result = _IO_flush_all_lockp (0);
+  int result = _IO_flush_all_lockp (0, 0);
 
   /* We currently don't have a reliable mechanism for making sure that
      C++ static destructors are executed in the correct order.
diff --git a/libio/libioP.h b/libio/libioP.h
index 1832b44cc7..bfdcb8e949 100644
--- a/libio/libioP.h
+++ b/libio/libioP.h
@@ -123,16 +123,21 @@  extern "C" {
 #define _IO_CHECK_WIDE(THIS) \
   (_IO_CAST_FIELD_ACCESS ((THIS), struct _IO_FILE, _wide_data) != NULL)
 
+/* _IO_JUMPS_FUNC_NOVALIDATE (THIS) expands to a vtable pointer for THIS,
+   possibly adjusted by the vtable offset, _IO_vtable_offset (THIS).  */
 #if _IO_JUMPS_OFFSET
-# define _IO_JUMPS_FUNC(THIS) \
-  (IO_validate_vtable                                                   \
-   (*(struct _IO_jump_t **) ((void *) &_IO_JUMPS_FILE_plus (THIS)	\
-			     + (THIS)->_vtable_offset)))
+# define _IO_JUMPS_FUNC_NOVALIDATE(THIS)				\
+  (*(struct _IO_jump_t **) ((void *) &_IO_JUMPS_FILE_plus (THIS)	\
+			    + (THIS)->_vtable_offset))
 # define _IO_vtable_offset(THIS) (THIS)->_vtable_offset
 #else
-# define _IO_JUMPS_FUNC(THIS) (IO_validate_vtable (_IO_JUMPS_FILE_plus (THIS)))
+# define _IO_JUMPS_FUNC_NOVALIDATE(THIS) (_IO_JUMPS_FILE_plus (THIS))
 # define _IO_vtable_offset(THIS) 0
-#endif
+#endif /* _IO_JUMPS_OFFSET */
+
+/* Like _IO_JUMPS_FUNC_NOVALIDATE, but perform vtable validation.  */
+#define _IO_JUMPS_FUNC(THIS) \
+  (IO_validate_vtable (_IO_JUMPS_FUNC_NOVALIDATE (THIS)))
 #define _IO_WIDE_JUMPS_FUNC(THIS) _IO_WIDE_JUMPS(THIS)
 #define JUMP_FIELD(TYPE, NAME) TYPE NAME
 #define JUMP0(FUNC, THIS) (_IO_JUMPS_FUNC(THIS)->FUNC) (THIS)
@@ -507,7 +512,13 @@  extern int _IO_new_do_write (_IO_FILE *, const char *, _IO_size_t);
 extern int _IO_old_do_write (_IO_FILE *, const char *, _IO_size_t);
 extern int _IO_wdo_write (_IO_FILE *, const wchar_t *, _IO_size_t);
 libc_hidden_proto (_IO_wdo_write)
-extern int _IO_flush_all_lockp (int);
+
+/* If DO_LOCK, perform locking.  If RESTRICT_VTABLES, only flush
+   streams which refer to real files (based on their vtable); this is
+   used for restricted flushing on abort.  */
+extern int _IO_flush_all_lockp (int do_lock, int restrict_vtables)
+  attribute_hidden;
+
 extern int _IO_flush_all (void);
 libc_hidden_proto (_IO_flush_all)
 extern int _IO_cleanup (void);
diff --git a/stdlib/Makefile b/stdlib/Makefile
index e4b36ca118..e1e428cf41 100644
--- a/stdlib/Makefile
+++ b/stdlib/Makefile
@@ -80,7 +80,7 @@  tests		:= tst-strtol tst-strtod testmb testrand testsort testdiv   \
 		   tst-strtol-locale tst-strtod-nan-locale tst-strfmon_l    \
 		   tst-quick_exit tst-thread-quick_exit tst-width	    \
 		   tst-width-stdint tst-strfrom tst-strfrom-locale	    \
-		   tst-getrandom
+		   tst-getrandom tst-abort-fflush
 tests-internal	:= tst-strtod1i tst-strtod3 tst-strtod4 tst-strtod5i \
 		   tst-tls-atexit tst-tls-atexit-nodelete
 tests-static	:= tst-secure-getenv
diff --git a/stdlib/abort.c b/stdlib/abort.c
index 19882f3e3d..79ec2a9cd4 100644
--- a/stdlib/abort.c
+++ b/stdlib/abort.c
@@ -32,7 +32,6 @@ 
 #endif
 
 #include <libio/libioP.h>
-#define fflush(s) _IO_flush_all_lockp (0)
 
 /* Exported variable to locate abort message in core files etc.  */
 struct abort_msg_s *__abort_msg __attribute__ ((nocommon));
@@ -72,7 +71,8 @@  abort (void)
   if (stage == 1)
     {
       ++stage;
-      fflush (NULL);
+      /* Flush streams without locking, but only file streams.  */
+      _IO_flush_all_lockp (0, 1);
     }
 
   /* Send signal which possibly calls a user handler.  */
diff --git a/stdlib/tst-abort-fflush.c b/stdlib/tst-abort-fflush.c
new file mode 100644
index 0000000000..9d2eedce1d
--- /dev/null
+++ b/stdlib/tst-abort-fflush.c
@@ -0,0 +1,219 @@ 
+/* Check stream-flushing behavior on abort.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <signal.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <support/capture_subprocess.h>
+#include <support/check.h>
+#include <support/namespace.h>
+#include <support/support.h>
+#include <support/temp_file.h>
+#include <support/xstdio.h>
+#include <support/xunistd.h>
+#include <sys/stat.h>
+#include <sys/wait.h>
+#include <wchar.h>
+
+/* Test streams and their file names.  */
+static char *name_fdopen;
+static FILE *stream_fdopen;
+static char *name_fopen;
+static FILE *stream_fopen;
+static char *name_fopen_wide;
+static FILE *stream_fopen_wide;
+static FILE *stream_fopencookie;
+
+/* Shared data with subprocess, to detect fopencookie flushing.  */
+static struct
+{
+  unsigned int write_called;
+  bool allow_close;
+} *shared;
+
+static void
+check_size (const char *name, const char *path, off64_t expected)
+{
+  struct stat64 st;
+  xstat (path, &st);
+  if (st.st_size != expected)
+    {
+      support_record_failure ();
+      printf ("error: file for %s (%s) does not have size %llu, got %llu\n",
+              name, path, (unsigned long long) expected,
+              (unsigned long long) st.st_size);
+    }
+}
+
+static void
+check_all_empty (void)
+{
+  check_size ("fdopen", name_fdopen, 0);
+  check_size ("fopen", name_fopen, 0);
+  check_size ("fopen (wide orientation)", name_fopen_wide, 0);
+  TEST_VERIFY (shared->write_called == 0);
+}
+
+static void
+check_all_written (void)
+{
+  check_size ("fdopen", name_fdopen, 1);
+  check_size ("fopen (wide orientation)", name_fopen_wide, 1);
+  TEST_VERIFY (shared->write_called == 1);
+}
+
+static ssize_t
+cookieread (void *cookie, char *buf, size_t count)
+{
+  FAIL_EXIT1 ("cookieread called");
+}
+
+static ssize_t
+cookiewrite (void *cookie, const char *buf, size_t count)
+{
+  ++shared->write_called;
+  return count;
+}
+
+static int
+cookieseek (void *cookie, off64_t *offset, int whence)
+{
+  return 0;
+}
+
+static int
+cookieclose (void *cookie)
+{
+  if (!shared->allow_close)
+    FAIL_EXIT1 ("cookieclose called");
+  return 0;
+}
+
+static void
+pretest (void)
+{
+  shared = support_shared_allocate (sizeof (*shared));
+
+  /* Create the streams.  */
+  {
+    int fd = create_temp_file ("tst-abort-fflush", &name_fdopen);
+    TEST_VERIFY_EXIT (fd >= 0);
+    stream_fdopen = fdopen (fd, "w");
+    TEST_VERIFY_EXIT (stream_fdopen != NULL);
+  }
+  {
+    int fd = create_temp_file ("tst-abort-fflush", &name_fopen);
+    TEST_VERIFY_EXIT (fd >= 0);
+    xclose (fd);
+    stream_fopen = xfopen (name_fopen, "w");
+  }
+  {
+    int fd = create_temp_file ("tst-abort-fflush", &name_fopen_wide);
+    TEST_VERIFY_EXIT (fd >= 0);
+    xclose (fd);
+    stream_fopen_wide = xfopen (name_fopen_wide, "w,ccs=utf-8");
+  }
+
+  stream_fopencookie = fopencookie (NULL, "w", (cookie_io_functions_t)
+                                    {
+                                      .read = cookieread,
+                                      .write = cookiewrite,
+                                      .seek = cookieseek,
+                                      .close = cookieclose
+                                    });
+  TEST_VERIFY_EXIT (stream_fopencookie != NULL);
+  shared->write_called = 0;
+  shared->allow_close = false;
+
+  TEST_VERIFY (fputc ('X', stream_fdopen) == 'X');
+  TEST_VERIFY (fputc ('X', stream_fopen) == 'X');
+  TEST_VERIFY (fputwc (L'X', stream_fopen_wide) == L'X');
+  TEST_VERIFY (fputc ('X', stream_fopencookie) == 'X');
+
+  /* No flushing must have happened at this point.  */
+  check_all_empty ();
+}
+
+static void
+posttest (void)
+{
+  xfclose (stream_fdopen);
+  xfclose (stream_fopen);
+  xfclose (stream_fopen_wide);
+  shared->allow_close = true;
+  xfclose (stream_fopencookie);
+  free (name_fdopen);
+  free (name_fopen);
+  free (name_fopen_wide);
+  support_shared_free (shared);
+  shared = NULL;
+}
+
+static void
+run_exit (void *unused)
+{
+  exit (0);
+}
+
+static void
+run_abort (void *unused)
+{
+  TEST_VERIFY (fputc ('X', stdout) == 'X');
+  TEST_VERIFY (fputc ('Y', stderr) == 'Y');
+  abort ();
+}
+
+static int
+do_test (void)
+{
+  /* Check that fflush flushes all streams.  */
+  pretest ();
+  fflush (NULL);
+  check_all_written ();
+  posttest ();
+
+  /* Check that exit flushes all streams.  */
+  pretest ();
+  support_isolate_in_subprocess (run_exit, NULL);
+  check_all_written ();
+  posttest ();
+
+  /* Check that abort only flushes file streams.  */
+  pretest ();
+  {
+    struct support_capture_subprocess proc
+      = support_capture_subprocess (run_abort, NULL);
+    TEST_VERIFY (proc.out.length == 1);
+    TEST_VERIFY (proc.out.buffer[0] == 'X');
+    TEST_VERIFY (proc.err.length == 1);
+    TEST_VERIFY (proc.err.buffer[0] == 'Y');
+    TEST_VERIFY (WIFSIGNALED (proc.status));
+    TEST_VERIFY (WTERMSIG (proc.status) == SIGABRT);
+    check_size ("fdopen", name_fdopen, 1);
+    check_size ("fopen", name_fopen, 1);
+    check_size ("fopen (wide)", name_fopen_wide, 1);
+    TEST_VERIFY (shared->write_called == 0);
+    support_capture_subprocess_free (&proc);
+  }
+  posttest ();
+
+  return 0;
+}
+
+#include <support/test-driver.c>