Message ID | 1416014955-5408-1-git-send-email-rickyz@chromium.org |
---|---|
State | Dropped |
Headers |
Received: (qmail 21067 invoked by alias); 15 Nov 2014 01:29:22 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 21052 invoked by uid 89); 15 Nov 2014 01:29:20 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: mail-ob0-f202.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=dpm3r4d4NfaUcqqma8ahsXCZC8ufGbQ8YUhVyqQy5Wo=; b=cAlVDa8PgzPNZDHFmxdTrgjJr8OkrOcdVZFcauPmhKrn3rNrSvVPodCL6rDHOGozBn R3dZwXNuNoDybfj5rgPkY7fo2FYO7JIMSWOYMku7hPDak4Or4U5RiUjYI7rTX6gj1OFF dHM5nQOhruaJH7U8BGlR9iKFKg05/uVKESc+bkWO9k6n3crip2OopB/IS9xGUfLlauR0 lemGrU2iE97Uvndffh0aVMyq3m2hr7BvUvv4MwdS52WAYxBX7reYUZPBEPh5HaAMy2lB Pi/v1EwUYELAhTSFJnnWIVQLlGMfWoOmRwjfwkQYmNofoVi8wbsxijws+vMcG/045rl9 0FTQ== X-Gm-Message-State: ALoCoQkkNz5ITpMtXVTT1mWf7z3OXT7JW36g0+2GjGlTZQyvSRjf3igWnn5ufvAPw6NfzDej52WV X-Received: by 10.50.128.226 with SMTP id nr2mr9334084igb.2.1416014957138; Fri, 14 Nov 2014 17:29:17 -0800 (PST) From: Ricky Zhou <rickyz@chromium.org> To: libc-alpha@sourceware.org Cc: Ricky Zhou <rickyz@chromium.org> Subject: [PATCH] [BZ #15392] Remove fork child pid assertion Date: Fri, 14 Nov 2014 17:29:15 -0800 Message-Id: <1416014955-5408-1-git-send-email-rickyz@chromium.org> |
Commit Message
Ricky Zhou
Nov. 15, 2014, 1:29 a.m. UTC
This assertion is no longer always true, since a forked child may be in a different PID namespace than its parent, and the two namespace may have PID collisions. An example program which hits this assert is attached to bug 17596. [BZ #15392] * sysdeps/nptl/fork.c (__libc_fork): Remove assert that the parent and child pid must differ after forking. --- sysdeps/nptl/fork.c | 2 -- 1 file changed, 2 deletions(-)
Comments
On Fri, 2014-11-14 at 17:29 -0800, Ricky Zhou wrote: > This assertion is no longer always true, since a forked child may be in > a different PID namespace than its parent, and the two namespace may > have PID collisions. If that's true, process-shared synchronization facilities based on the thread ID (e.g., mutexes) might not work anymore. Do we at least need a note in the documentation that this is the case? Or do we still need to implement it? Is there any glibc-internal synchronization that this could break? I believe we deal with these issues before dropping the assert. > An example program which hits this assert is attached to bug 17596. > > [BZ #15392] > * sysdeps/nptl/fork.c (__libc_fork): Remove assert that the parent > and child pid must differ after forking. > --- > sysdeps/nptl/fork.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/sysdeps/nptl/fork.c b/sysdeps/nptl/fork.c > index a7dafa8..6adb723 100644 > --- a/sysdeps/nptl/fork.c > +++ b/sysdeps/nptl/fork.c > @@ -139,8 +139,6 @@ __libc_fork (void) > { > struct pthread *self = THREAD_SELF; > > - assert (THREAD_GETMEM (self, tid) != ppid); > - > /* See __pthread_once. */ > if (__fork_generation_pointer != NULL) > *__fork_generation_pointer += __PTHREAD_ONCE_FORK_GEN_INCR;
On Mon, Nov 17, 2014 at 2:38 AM, Torvald Riegel <triegel@redhat.com> wrote: > On Fri, 2014-11-14 at 17:29 -0800, Ricky Zhou wrote: >> This assertion is no longer always true, since a forked child may be in >> a different PID namespace than its parent, and the two namespace may >> have PID collisions. > > If that's true, process-shared synchronization facilities based on the > thread ID (e.g., mutexes) might not work anymore. > > Do we at least need a note in the documentation that this is the case? > Or do we still need to implement it? > > Is there any glibc-internal synchronization that this could break? > > I believe we deal with these issues before dropping the assert. Within a PID namespace, all thread IDs should remain unique, and a process cannot have threads in different PID namespaces (http://lxr.free-electrons.com/source/kernel/fork.c#L1217), so I'm having trouble coming up with a situation where removing this assert would allow badness to happen down the line. Ricky
On Mon, 2014-11-17 at 14:20 -0800, Ricky Zhou wrote: > On Mon, Nov 17, 2014 at 2:38 AM, Torvald Riegel <triegel@redhat.com> wrote: > > On Fri, 2014-11-14 at 17:29 -0800, Ricky Zhou wrote: > >> This assertion is no longer always true, since a forked child may be in > >> a different PID namespace than its parent, and the two namespace may > >> have PID collisions. > > > > If that's true, process-shared synchronization facilities based on the > > thread ID (e.g., mutexes) might not work anymore. > > > > Do we at least need a note in the documentation that this is the case? > > Or do we still need to implement it? > > > > Is there any glibc-internal synchronization that this could break? > > > > I believe we deal with these issues before dropping the assert. > Within a PID namespace, all thread IDs should remain unique, and a > process cannot have threads in different PID namespaces > (http://lxr.free-electrons.com/source/kernel/fork.c#L1217), so I'm > having trouble coming up with a situation where removing this assert > would allow badness to happen down the line. Okay, so private (intra-process) mutexes and such are safe. However, what about process-shared mutexes?
On 11/18/2014 09:32 AM, Torvald Riegel wrote: > On Mon, 2014-11-17 at 14:20 -0800, Ricky Zhou wrote: >> On Mon, Nov 17, 2014 at 2:38 AM, Torvald Riegel <triegel@redhat.com> wrote: >>> On Fri, 2014-11-14 at 17:29 -0800, Ricky Zhou wrote: >>>> This assertion is no longer always true, since a forked child may be in >>>> a different PID namespace than its parent, and the two namespace may >>>> have PID collisions. >>> >>> If that's true, process-shared synchronization facilities based on the >>> thread ID (e.g., mutexes) might not work anymore. >>> >>> Do we at least need a note in the documentation that this is the case? >>> Or do we still need to implement it? >>> >>> Is there any glibc-internal synchronization that this could break? >>> >>> I believe we deal with these issues before dropping the assert. >> Within a PID namespace, all thread IDs should remain unique, and a >> process cannot have threads in different PID namespaces >> (http://lxr.free-electrons.com/source/kernel/fork.c#L1217), so I'm >> having trouble coming up with a situation where removing this assert >> would allow badness to happen down the line. > > Okay, so private (intra-process) mutexes and such are safe. However, > what about process-shared mutexes? I agree with Torvald, and I have further comments. All of pthreads expects POSIX requirements, which are that the PIDs are stable for the life of the process. If setns is called, then the fork'd child violates that invariant and enters undefined behaviour. The only tenable solution I see is documenting that you must use clone to enter the new namespace. The documentation should note that fork is forbidden after setns and unshare. The patch I would like to see is: * Add minimal stub entry for `setns` and `unshare` to manual/threads.texi and mention that after calling this function that fork must not be used, and clone is the only safe function to call. If you have problems using clone, then we need to talk about that interface. The feedback from lxc has been that they would like not to have to worry about the stack requirement in clone(), but we haven't done anything to make that easier. Cheers, Carlos.
Ah, I see, I didn't know about process-shared mutexes - those definitely won't work across PID namespaces (and this should definitely be documented). Do you know of any other places in glibc where we may be assuming that PIDs are unique between processes that may be in different PID namespaces? I think it'd be useful to have a list of these. I'm a little bit confused about why we'd want to disallow fork after setns/unshare (since this is likely to be a common pattern that's already in use). I'm not super familiar with glibc internals/APIs, but are there many other instances where we rely on the uniqueness of PIDs between two different processes apart from process-shared mutexes/condvars/barriers? Would it be possible to document the specific APIs that won't work across PID namespaces instead of forbidding fork after setns/unshare with CLONE_NEWPID? By the way, I am also interested in improving or extending the clone API as well (let me know if you'd rather split this into a separate thread). Two ideas that would solve our issues with the current clone wrapper: 1) Provide an interface to reset the PID cache (this would allow us to use the syscall directly). 2) Provide an alternate fork-like version of the clone wrapper. This version would not take a child_stack, and would enforce that CLONE_VM is not set. Aside from this, it would invalidate the PID cache and perform the syscall. Thanks, Ricky On Tue, Nov 18, 2014 at 8:02 AM, Carlos O'Donell <carlos@redhat.com> wrote: > On 11/18/2014 09:32 AM, Torvald Riegel wrote: >> On Mon, 2014-11-17 at 14:20 -0800, Ricky Zhou wrote: >>> On Mon, Nov 17, 2014 at 2:38 AM, Torvald Riegel <triegel@redhat.com> wrote: >>>> On Fri, 2014-11-14 at 17:29 -0800, Ricky Zhou wrote: >>>>> This assertion is no longer always true, since a forked child may be in >>>>> a different PID namespace than its parent, and the two namespace may >>>>> have PID collisions. >>>> >>>> If that's true, process-shared synchronization facilities based on the >>>> thread ID (e.g., mutexes) might not work anymore. >>>> >>>> Do we at least need a note in the documentation that this is the case? >>>> Or do we still need to implement it? >>>> >>>> Is there any glibc-internal synchronization that this could break? >>>> >>>> I believe we deal with these issues before dropping the assert. >>> Within a PID namespace, all thread IDs should remain unique, and a >>> process cannot have threads in different PID namespaces >>> (http://lxr.free-electrons.com/source/kernel/fork.c#L1217), so I'm >>> having trouble coming up with a situation where removing this assert >>> would allow badness to happen down the line. >> >> Okay, so private (intra-process) mutexes and such are safe. However, >> what about process-shared mutexes? > > I agree with Torvald, and I have further comments. > > All of pthreads expects POSIX requirements, which are that the PIDs > are stable for the life of the process. If setns is called, then the > fork'd child violates that invariant and enters undefined behaviour. > > The only tenable solution I see is documenting that you must use clone > to enter the new namespace. The documentation should note that fork > is forbidden after setns and unshare. > > The patch I would like to see is: > > * Add minimal stub entry for `setns` and `unshare` to > manual/threads.texi and mention that after calling this > function that fork must not be used, and clone is the > only safe function to call. > > If you have problems using clone, then we need to talk about that > interface. The feedback from lxc has been that they would like not > to have to worry about the stack requirement in clone(), but we > haven't done anything to make that easier. > > Cheers, > Carlos. > >
On 11/18/2014 08:51 PM, Ricky Zhou wrote: > Ah, I see, I didn't know about process-shared mutexes - those > definitely won't work across PID namespaces (and this should > definitely be documented). Do you know of any other places in glibc > where we may be assuming that PIDs are unique between processes that > may be in different PID namespaces? I think it'd be useful to have a > list of these. PID uniqueness is a requirement for POSIX, everything can rely on it. Documenting the functions which depend on it will lock down the implementation's flexibility. The uniqueness of the PID continues to be true *before* you enter the namespace and then *after* you enter the namespace. The only problem is during the transition for the child. We should clearly define *how* you transition and limit that transition to a simple set of APIs. > I'm a little bit confused about why we'd want to disallow fork after > setns/unshare (since this is likely to be a common pattern that's > already in use). I'm not super familiar with glibc internals/APIs, but > are there many other instances where we rely on the uniqueness of PIDs > between two different processes apart from process-shared > mutexes/condvars/barriers? Would it be possible to document the > specific APIs that won't work across PID namespaces instead of > forbidding fork after setns/unshare with CLONE_NEWPID? You are asking to support a POSIX API with a set of semantics that violate the contract of that API. I would like to avoid this. Using clone is fine, since clone has none of the POSIX requirements that fork, or vfork have. I would like to see one well documented way to transition to the child safely after which you should be able to continue calling libc functions including fork and vfork. Today, the easiest solution is: "After setns/unshare you must clone followed immediately by exec" The exec starts the process over again in the new namespace and everything works fine. The problem is that this is resource intensive and not what users want. > By the way, I am also interested in improving or extending the clone > API as well (let me know if you'd rather split this into a separate > thread). Two ideas that would solve our issues with the current clone > wrapper: Then let us go in this direction, make clone better, and recommend it as the only way to create a new process after setns/unshare. > 1) Provide an interface to reset the PID cache (this would allow us to > use the syscall directly). Why do you need this if you have a version of clone that resets it for you? > 2) Provide an alternate fork-like version of the clone wrapper. This > version would not take a child_stack, and would enforce that CLONE_VM > is not set. Aside from this, it would invalidate the PID cache and > perform the syscall. I think a new fork-like clone wrapper should be acceptable, we'll have to iterate on the design a bit, but I don't object to the idea. We could document it as being the one safe way to create a new process in the new namespace. Cheers, Carlos.
On Tue, Nov 18, 2014 at 7:57 PM, Carlos O'Donell <carlos@redhat.com> wrote: > On 11/18/2014 08:51 PM, Ricky Zhou wrote: >> Ah, I see, I didn't know about process-shared mutexes - those >> definitely won't work across PID namespaces (and this should >> definitely be documented). Do you know of any other places in glibc >> where we may be assuming that PIDs are unique between processes that >> may be in different PID namespaces? I think it'd be useful to have a >> list of these. > > PID uniqueness is a requirement for POSIX, everything can rely on it. > > Documenting the functions which depend on it will lock down the > implementation's flexibility. My goal with documenting functions that depend on PID uniqueness isn't to lock down the implementation, but rather to lock down people that decide to use PID namespaces. For example, someone using PID namespaces should be aware that they cannot use process-shared mutexes across the PID namespace. > You are asking to support a POSIX API with a set of semantics that > violate the contract of that API. I would like to avoid this. Is the concrete fear that future changes to glibc internals may want to rely on the uniqueness of PIDs across fork? I'm not trying to argue one way or the other here - I just want to better understand the reasoning for disallowing fork after setns/unshare. I haven't fully read through the fork code, but it seems to reset some other glibc internal state, and someone using setns/unshare might want that behavior as well. > Using clone is fine, since clone has none of the POSIX requirements > that fork, or vfork have. Does this mean that developers should implicitly know that features like process-shared mutexes won't work across clones into a new PID namespace because clone is not a POSIX API? > Then let us go in this direction, make clone better, and recommend > it as the only way to create a new process after setns/unshare. > >> 1) Provide an interface to reset the PID cache (this would allow us to >> use the syscall directly). > > Why do you need this if you have a version of clone that resets it for you? From my perspective, this could potentially be useful in the (probably not too likely) situation where Linux adds some new method for creating a process that we want to use before glibc has a wrapper for it. I'm fine with the other approach of adding a new clone wrapper as well though. > I think a new fork-like clone wrapper should be acceptable, we'll have > to iterate on the design a bit, but I don't object to the idea. Cool, I'll send out an RFC in a bit to start talking about what this interface should look like. Thanks, Ricky
diff --git a/sysdeps/nptl/fork.c b/sysdeps/nptl/fork.c index a7dafa8..6adb723 100644 --- a/sysdeps/nptl/fork.c +++ b/sysdeps/nptl/fork.c @@ -139,8 +139,6 @@ __libc_fork (void) { struct pthread *self = THREAD_SELF; - assert (THREAD_GETMEM (self, tid) != ppid); - /* See __pthread_once. */ if (__fork_generation_pointer != NULL) *__fork_generation_pointer += __PTHREAD_ONCE_FORK_GEN_INCR;