Message ID | 20190814154400.6371-2-christian.brauner@ubuntu.com |
---|---|
State | Not applicable |
Headers |
Received: (qmail 84586 invoked by alias); 14 Aug 2019 15:46:17 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 84563 invoked by uid 89); 14 Aug 2019 15:46:17 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-15.3 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SEM_URI, SEM_URIRED autolearn=ham version=3.3.1 spammy= X-HELO: youngberry.canonical.com From: Christian Brauner <christian.brauner@ubuntu.com> To: linux-kernel@vger.kernel.org, libc-alpha@sourceware.org Cc: oleg@redhat.com, alistair23@gmail.com, ebiederm@xmission.com, arnd@arndb.de, dalias@libc.org, torvalds@linux-foundation.org, adhemerval.zanella@linaro.org, fweimer@redhat.com, palmer@sifive.com, macro@wdc.com, zongbox@gmail.com, akpm@linux-foundation.org, viro@zeniv.linux.org.uk, hpa@zytor.com, Christian Brauner <christian.brauner@ubuntu.com> Subject: [PATCH v3 1/1] waitid: Add support for waiting for the current process group Date: Wed, 14 Aug 2019 17:44:00 +0200 Message-Id: <20190814154400.6371-2-christian.brauner@ubuntu.com> In-Reply-To: <20190814154400.6371-1-christian.brauner@ubuntu.com> References: <CAKmqyKMJPQAOKn11xepzAwXOd4e9dU0Cyz=A0T-uMEgUp5yJjA@mail.gmail.com> <20190814154400.6371-1-christian.brauner@ubuntu.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit |
Commit Message
Christian Brauner
Aug. 14, 2019, 3:44 p.m. UTC
From: "Eric W. Biederman" <ebiederm@xmission.com> It was recently discovered that the linux version of waitid is not a superset of the other wait functions because it does not include support for waiting for the current process group. This has two downsides. An extra system call is needed to get the current process group, and a signal could come in between the system call that retrieved the process gorup and the call to waitid that changes the current process group. Allow userspace to avoid both of those issues by defining idtype == P_PGID and id == 0 to mean wait for the caller's process group at the time of the call. Arguments can be made for using a different choice of idtype and id for this case but the BSDs already use this P_PGID and 0 to indicate waiting for the current process's process group. So be nice to user space programmers and don't introduce an unnecessary incompatibility. Some people have noted that the posix description is that waitpid will wait for the current process group, and that in the presence of pthreads that process group can change. To get clarity on this issue I looked at XNU, FreeBSD, and Luminos. All of those flavors of unix waited for the current process group at the time of call and as written could not adapt to the process group changing after the call. At one point Linux did adapt to the current process group changing but that stopped in 161550d74c07 ("pid: sys_wait... fixes"). It has been over 11 years since Linux has that behavior, no programs that fail with the change in behavior have been reported, and I could not find any other unix that does this. So I think it is safe to clarify the definition of current process group, to current process group at the time of the wait function. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Palmer Dabbelt <palmer@sifive.com> Cc: Rich Felker <dalias@libc.org> Cc: Alistair Francis <alistair23@gmail.com> Cc: Zong Li <zongbox@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Florian Weimer <fweimer@redhat.com> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: GNU C Library <libc-alpha@sourceware.org> --- v1: - Christian Brauner <christian.brauner@ubuntu.com>: - move find_get_pid() calls into the switch statements to minimize merge conflicts with P_PIDFD changes and adhere to coding style discussions we had for P_PIDFD v2: - Oleg Nesterov <oleg@redhat.com>: - take rcu_read_lock() around task_pgrp() so that we don't race with another thread changing the pgrp - Christian Brauner <christian.brauner@ubuntu.com>: - introduce find_get_pgrp() v3: - Oleg Nesterov <oleg@redhat.com>: - use get_task_pid() --- kernel/exit.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
Comments
On 08/14, Christian Brauner wrote: > > and a signal could come in between the system call that > retrieved the process gorup and the call to waitid that changes the ^^^^^ > current process group. I noticed this typo only because I spent 2 minutes or more trying to understand this sentence ;) But yes, a signal handler or another thread can change pgrp in between. Reviewed-by: Oleg Nesterov <oleg@redhat.com>
On Wed, Aug 14, 2019 at 06:09:17PM +0200, Oleg Nesterov wrote: > On 08/14, Christian Brauner wrote: > > > > and a signal could come in between the system call that > > retrieved the process gorup and the call to waitid that changes the > ^^^^^ > > current process group. > > I noticed this typo only because I spent 2 minutes or more trying to > understand this sentence ;) But yes, a signal handler or another thread I'll try to rewrite it. :) > can change pgrp in between. > > Reviewed-by: Oleg Nesterov <oleg@redhat.com> >
On Wed, Aug 14, 2019 at 06:15:17PM +0200, Christian Brauner wrote: > On Wed, Aug 14, 2019 at 06:09:17PM +0200, Oleg Nesterov wrote: > > On 08/14, Christian Brauner wrote: > > > > > > and a signal could come in between the system call that > > > retrieved the process gorup and the call to waitid that changes the > > ^^^^^ > > > current process group. > > > > I noticed this typo only because I spent 2 minutes or more trying to > > understand this sentence ;) But yes, a signal handler or another thread > > I'll try to rewrite it. :) Ok, here's what I changed it to: It was recently discovered that the linux version of waitid is not a superset of the other wait functions because it does not include support for waiting for the current process group. This has two downsides: 1. An extra system call is needed to get the current process group. 2. After the current process group is received and before it is passed to waitid a signal could arrive causing the current process group to change. Christian
On Wed, Aug 14, 2019 at 06:34:44PM +0200, Christian Brauner wrote: > On Wed, Aug 14, 2019 at 06:15:17PM +0200, Christian Brauner wrote: > > On Wed, Aug 14, 2019 at 06:09:17PM +0200, Oleg Nesterov wrote: > > > On 08/14, Christian Brauner wrote: > > > > > > > > and a signal could come in between the system call that > > > > retrieved the process gorup and the call to waitid that changes the > > > ^^^^^ > > > > current process group. > > > > > > I noticed this typo only because I spent 2 minutes or more trying to > > > understand this sentence ;) But yes, a signal handler or another thread > > > > I'll try to rewrite it. :) > > Ok, here's what I changed it to: > > It was recently discovered that the linux version of waitid is not a > superset of the other wait functions because it does not include > support for waiting for the current process group. This has two > downsides: > 1. An extra system call is needed to get the current process group. > 2. After the current process group is received and before it is passed > to waitid a signal could arrive causing the current process group to change. I don't think "downsides" sufficiently conveys that this is hard breakage of a requirement for waitpid. How about something like the following? "It was recently discovered that the linux version of waitid is not a superset of the other wait functions because it does not include support for waiting for the current process group. Userspace cannot simply emulate this functionality with an additional getpgid syscall due to inherent race conditions that violate the async-signal safety requirements for waitpid." Rich
On Wed, Aug 14, 2019 at 12:55:01PM -0400, Rich Felker wrote: > On Wed, Aug 14, 2019 at 06:34:44PM +0200, Christian Brauner wrote: > > On Wed, Aug 14, 2019 at 06:15:17PM +0200, Christian Brauner wrote: > > > On Wed, Aug 14, 2019 at 06:09:17PM +0200, Oleg Nesterov wrote: > > > > On 08/14, Christian Brauner wrote: > > > > > > > > > > and a signal could come in between the system call that > > > > > retrieved the process gorup and the call to waitid that changes the > > > > ^^^^^ > > > > > current process group. > > > > > > > > I noticed this typo only because I spent 2 minutes or more trying to > > > > understand this sentence ;) But yes, a signal handler or another thread > > > > > > I'll try to rewrite it. :) > > > > Ok, here's what I changed it to: > > > > It was recently discovered that the linux version of waitid is not a > > superset of the other wait functions because it does not include > > support for waiting for the current process group. This has two > > downsides: > > 1. An extra system call is needed to get the current process group. > > 2. After the current process group is received and before it is passed > > to waitid a signal could arrive causing the current process group to change. > > I don't think "downsides" sufficiently conveys that this is hard > breakage of a requirement for waitpid. How about something like the > following? > > "It was recently discovered that the linux version of waitid is not a > superset of the other wait functions because it does not include > support for waiting for the current process group. Userspace cannot > simply emulate this functionality with an additional getpgid syscall > due to inherent race conditions that violate the async-signal safety > requirements for waitpid." I like the rather specific example in there. How about we add that after this section like so: It was recently discovered that the linux version of waitid is not a superset of the other wait functions because it does not include support for waiting for the current process group. This has two downsides: 1. An extra system call is needed to get the current process group. 2. After the current process group is received and before it is passed to waitid a signal could arrive causing the current process group to change. Such inherent race-conditions as mentioned in 2. make it impossible for userspace to emulate this functionaly and thus violate the async-signal safety requirements for waitpid. Christian
On Wed, Aug 14, 2019 at 9:55 AM Rich Felker <dalias@libc.org> wrote: > > I don't think "downsides" sufficiently conveys that this is hard > breakage of a requirement for waitpid. Well, let's be honest here. Who has _ever_ seen a signal handler changing the current process group? In fact, the SYSV version of setpgid() takes a process ID to set it *for somebody else*, so the signal safety is not even necessarily relevant, since it might be racing with _another_ thread doing it (which even the kernel side won't fix - it's just user space doing odd things). So yes - it's technically true that it's impossible to emulate properly in user space. But I doubt it makes _any_ difference what-so-ever, and glibc might as well do something like ret = waitid(P_PGID, 0, ..); if (ret == -EINVAL) { do the emulation } which makes it work with older kernels, and has zero downside in practice. Hmm? Linus
On Wed, Aug 14, 2019 at 10:06:19AM -0700, Linus Torvalds wrote: > On Wed, Aug 14, 2019 at 9:55 AM Rich Felker <dalias@libc.org> wrote: > > > > I don't think "downsides" sufficiently conveys that this is hard > > breakage of a requirement for waitpid. > > Well, let's be honest here. Who has _ever_ seen a signal handler > changing the current process group? > > In fact, the SYSV version of setpgid() takes a process ID to set it > *for somebody else*, so the signal safety is not even necessarily > relevant, since it might be racing with _another_ thread doing it > (which even the kernel side won't fix - it's just user space doing odd > things). For that case, the operations are inherently unordered with respect to each other, and assuming the interpretation that waitpid is allowed to wait on "the pgid at the time of the call" rather than at the time of child exit/status-change -- which was discussed thoroughly in the thread leading up to this patch -- there is no conformance distinction. On the other hand, with changing your own pgid from a signal handler, there is a clear observable ordering between the events. For example, if the signal handler changes the pgid and forks a child with the new pgid, waitpid for "own pgid" can be assumed to include the new child in its wait set. I agree this is not common usage, so impact of breakage is probably low, but I'd rather not have wrong/racy hacks be something we're committed to supporting indefinitely on the userspace side. > So yes - it's technically true that it's impossible to emulate > properly in user space. > > But I doubt it makes _any_ difference what-so-ever, and glibc might as > well do something like > > ret = waitid(P_PGID, 0, ..); > if (ret == -EINVAL) { do the emulation } > > which makes it work with older kernels, and has zero downside in practice. > > Hmm? It only affects RV32 anyway; other archs all have a waitpid syscall that can be used. Since there's not yet any official libc release with RV32 support and AIUI the ABI is not considered "frozen" yet, emulation doesn't seem useful here. Whatever kernel version fixes this (or some later one, if nobody gets things together on upstreaming libc support of RV32) will just become the minimum version for RV32. Rich
diff --git a/kernel/exit.c b/kernel/exit.c index 5b4a5dcce8f8..8e4e357fddc8 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -1576,19 +1576,23 @@ static long kernel_waitid(int which, pid_t upid, struct waitid_info *infop, type = PIDTYPE_PID; if (upid <= 0) return -EINVAL; + + pid = find_get_pid(upid); break; case P_PGID: type = PIDTYPE_PGID; - if (upid <= 0) + if (upid < 0) return -EINVAL; + + if (upid) + pid = find_get_pid(upid); + else + pid = get_task_pid(current, PIDTYPE_PGID); break; default: return -EINVAL; } - if (type < PIDTYPE_MAX) - pid = find_get_pid(upid); - wo.wo_type = type; wo.wo_pid = pid; wo.wo_flags = options;