[BZ,#19402] Clear list of acquired robust mutexes in the child process after forking.

Message ID 1482401752.14990.777.camel@redhat.com
State New, archived
Headers

Commit Message

Torvald Riegel Dec. 22, 2016, 10:15 a.m. UTC
  I've asked for comments on whether mutexes acquired before fork() remain
to be acquired by just the parent process after fork():
https://sourceware.org/ml/libc-alpha/2016-12/msg00772.html

If we can agree on one of the requirements I'm proposing there, this
patch fixes the core problem of bug 19402.  (The reproducer of this bug
reveals another issue on x86, for which I'll send a patch next.)

The fix is:

Robust mutexes acquired at the time of a call to fork() do not remain
acquired by the forked child process.  We have to clear the list of
acquired robust mutexes before registering this list with the kernel;
otherwise, if some of the robust mutexes are process-shared, the parent
process can alter the child's robust mutex list, which can lead to
deadlocks or even modification of memory that may not be occupied by a
mutex anymore.

Tested on x86_64-linux with glibc's tests and the reproducer from 19402.
  

Comments

Florian Weimer Dec. 22, 2016, 12:22 p.m. UTC | #1
On 12/22/2016 11:15 AM, Torvald Riegel wrote:
> I've asked for comments on whether mutexes acquired before fork() remain
> to be acquired by just the parent process after fork():
> https://sourceware.org/ml/libc-alpha/2016-12/msg00772.html

The conceptual problem here is that we need to effectively 
process-shared mutexes differently from those which are process-private. 
  With effectively process-shared, I mean that the mutex object resides 
on a mapping which is not copied by fork, but remains shared with the 
parent.

I'm going to comment on the other thread as well.

> The fix is:
>
> Robust mutexes acquired at the time of a call to fork() do not remain
> acquired by the forked child process.  We have to clear the list of
> acquired robust mutexes before registering this list with the kernel;
> otherwise, if some of the robust mutexes are process-shared, the parent
> process can alter the child's robust mutex list, which can lead to
> deadlocks or even modification of memory that may not be occupied by a
> mutex anymore.
>
> Tested on x86_64-linux with glibc's tests and the reproducer from 19402.

Can we add a test case for this?

> +      /* Initialize the robust mutex list setting in the kernel which has
> +	 been reset during the fork.  We do not check for errors because if
> +	 it fails here, it must have failed at process startup as well and
> +	 nobody could have used robust mutexes.
> +	 Before we do that, we have to clear the list of robust mutexes
> +	 because we do not inherit ownership of mutexes from the parent.
> +	 We do not have to set self->robust_head.futex_offset since we do
> +	 inherit the correct value from the parent.  We do not need to clear
> +	 the pending operation because it must have been zero when fork was
> +	 called.  */

fork is supposed to be async-signal-safe, so I'm not sure the comment 
about the pending operation is completely correct.

(Our fork implementation is currently not async-signal-safe, though.)

> +# ifdef __PTHREAD_MUTEX_HAVE_PREV
> +      self->robust_prev = &self->robust_head;
> +# endif
> +      self->robust_head.list = &self->robust_head;
>  # ifdef SHARED
>        if (__builtin_expect (__libc_pthread_functions_init, 0))
>  	PTHFCT_CALL (ptr_set_robust, (self));

The change as such looks good to me, assuming this is the semantics we 
want.  I checked it matches the initialization code, and it comes before 
the set_robust_list call, as required.

Thanks,
Florian
  
Torvald Riegel Dec. 23, 2016, 9:31 p.m. UTC | #2
On Thu, 2016-12-22 at 13:22 +0100, Florian Weimer wrote:
> On 12/22/2016 11:15 AM, Torvald Riegel wrote:
> > I've asked for comments on whether mutexes acquired before fork() remain
> > to be acquired by just the parent process after fork():
> > https://sourceware.org/ml/libc-alpha/2016-12/msg00772.html
> 
> The conceptual problem here is that we need to effectively 
> process-shared mutexes differently from those which are process-private. 
>   With effectively process-shared, I mean that the mutex object resides 
> on a mapping which is not copied by fork, but remains shared with the 
> parent.

I don't think we need to, but that's a more general topic than this
patch in particular.

I think the use case of a process-private, robust mutex that is acquired
across fork and needs to be recovered is quite narrow.

Second, I don't see a way how we can accomodate both process-private and
process-shared robust mutexes in one list we're sharing with the kernel.
As soon as there are process-shared pages that the mutexes reside on, we
have lost because we cannot trust the contents of this memory (the
parent might do anything with it, so we can't traverse the list safely).

> > The fix is:
> >
> > Robust mutexes acquired at the time of a call to fork() do not remain
> > acquired by the forked child process.  We have to clear the list of
> > acquired robust mutexes before registering this list with the kernel;
> > otherwise, if some of the robust mutexes are process-shared, the parent
> > process can alter the child's robust mutex list, which can lead to
> > deadlocks or even modification of memory that may not be occupied by a
> > mutex anymore.
> >
> > Tested on x86_64-linux with glibc's tests and the reproducer from 19402.
> 
> Can we add a test case for this?

What do you have in mind?  Checking that the list is reset to zero after
fork?  Do we need a test for that if we have documented the need to do
that in the code?
(Trying to kill the child in such a way that corruption is visible and
can be safely detected is probably nontrivial and not worth it, I
guess.)

> > +      /* Initialize the robust mutex list setting in the kernel which has
> > +	 been reset during the fork.  We do not check for errors because if
> > +	 it fails here, it must have failed at process startup as well and
> > +	 nobody could have used robust mutexes.
> > +	 Before we do that, we have to clear the list of robust mutexes
> > +	 because we do not inherit ownership of mutexes from the parent.
> > +	 We do not have to set self->robust_head.futex_offset since we do
> > +	 inherit the correct value from the parent.  We do not need to clear
> > +	 the pending operation because it must have been zero when fork was
> > +	 called.  */
> 
> fork is supposed to be async-signal-safe, so I'm not sure the comment 
> about the pending operation is completely correct.

Any operation that sets the pending operation to nonzero is not
async-signal-safe or safe for a signal handler though.  So what could go
wrong?  (OTOH, just clearing the pending op is fine too.)
  

Patch

commit f5a07a158cfe1f5d6583c25eeb57ebbbe2fbee83
Author: Torvald Riegel <triegel@redhat.com>
Date:   Wed Dec 21 13:37:19 2016 +0100

    Clear list of acquired robust mutexes in the child process after forking.
    
    Robust mutexes acquired at the time of a call to fork() do not remain
    acquired by the forked child process.  We have to clear the list of
    acquired robust mutexes before registering this list with the kernel;
    otherwise, if some of the robust mutexes are process-shared, the parent
    process can alter the child's robust mutex list, which can lead to
    deadlocks or even modification of memory that may not be occupied by a
    mutex anymore.
    
    2016-12-22  Torvald Riegel  <triegel@redhat.com>
    
    	[BZ #19402]
    	* sysdeps/nptl/fork.c (__libc_fork): Clear list of acquired robust
    	mutexes.

diff --git a/sysdeps/nptl/fork.c b/sysdeps/nptl/fork.c
index 32cecce..fd3a82d 100644
--- a/sysdeps/nptl/fork.c
+++ b/sysdeps/nptl/fork.c
@@ -162,12 +162,20 @@  __libc_fork (void)
 #endif
 
 #ifdef __NR_set_robust_list
-      /* Initialize the robust mutex list which has been reset during
-	 the fork.  We do not check for errors since if it fails here
-	 it failed at process start as well and noone could have used
-	 robust mutexes.  We also do not have to set
-	 self->robust_head.futex_offset since we inherit the correct
-	 value from the parent.  */
+      /* Initialize the robust mutex list setting in the kernel which has
+	 been reset during the fork.  We do not check for errors because if
+	 it fails here, it must have failed at process startup as well and
+	 nobody could have used robust mutexes.
+	 Before we do that, we have to clear the list of robust mutexes
+	 because we do not inherit ownership of mutexes from the parent.
+	 We do not have to set self->robust_head.futex_offset since we do
+	 inherit the correct value from the parent.  We do not need to clear
+	 the pending operation because it must have been zero when fork was
+	 called.  */
+# ifdef __PTHREAD_MUTEX_HAVE_PREV
+      self->robust_prev = &self->robust_head;
+# endif
+      self->robust_head.list = &self->robust_head;
 # ifdef SHARED
       if (__builtin_expect (__libc_pthread_functions_init, 0))
 	PTHFCT_CALL (ptr_set_robust, (self));