Message ID | 55F19819.3010601@gmail.com |
---|---|
State | Rejected |
Headers |
Received: (qmail 72678 invoked by alias); 10 Sep 2015 14:48:04 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 72613 invoked by uid 89); 10 Sep 2015 14:48:02 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-wi0-f169.google.com X-Received: by 10.194.86.161 with SMTP id q1mr72546119wjz.18.1441896477198; Thu, 10 Sep 2015 07:47:57 -0700 (PDT) To: libc-alpha@sourceware.org From: Navid Rahimi <rahimi.nv@gmail.com> Subject: [PATCH] [BZ #18433] Check file access/existence before forking. Message-ID: <55F19819.3010601@gmail.com> Date: Thu, 10 Sep 2015 19:17:53 +0430 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit |
Commit Message
Navid Rahimi
Sept. 10, 2015, 2:47 p.m. UTC
[BZ #18433] * sysdeps/posix/spawni.c (__spawni): Check file access before forking. --- ChangeLog | 5 +++++ sysdeps/posix/spawni.c | 3 +++ 2 files changed, 8 insertions(+)
Comments
On 10/09/15 15:47, Navid Rahimi wrote: > [BZ #18433] > * sysdeps/posix/spawni.c (__spawni): > Check file access before forking. > --- > +++ b/sysdeps/posix/spawni.c > @@ -90,6 +90,9 @@ __spawni (pid_t *pid, const char *file, > size_t len; > size_t pathlen; > > + if(__access (file, X_OK) != 0) > + return errno; > + this does not guarantee the success of __execve (file, argv, envp).
On September 10, 2015 10:47:53 AM EDT, Navid Rahimi <rahimi.nv@gmail.com> wrote: >[BZ #18433] >* sysdeps/posix/spawni.c (__spawni): >Check file access before forking. >--- >ChangeLog | 5 +++++ >sysdeps/posix/spawni.c | 3 +++ >2 files changed, 8 insertions(+) > >diff --git a/ChangeLog b/ChangeLog >index 5f009a8..0e0c85b 100644 >--- a/ChangeLog >+++ b/ChangeLog >@@ -1,3 +1,8 @@ >+2015-09-10 Navid Rahimi <rahimi.nv@gmail.com> >+ >+ [BZ #18433] >+ * sysdeps/posix/spawni.c (__spawni): Check file access before >forking. >+ > 2015-09-08 Joseph Myers <joseph@codesourcery.com> > > [BZ #14912] >diff --git a/sysdeps/posix/spawni.c b/sysdeps/posix/spawni.c >index eee9331..c571390 100644 >--- a/sysdeps/posix/spawni.c >+++ b/sysdeps/posix/spawni.c >@@ -90,6 +90,9 @@ __spawni (pid_t *pid, const char *file, > size_t len; > size_t pathlen; > >+ if(__access (file, X_OK) != 0) >+ return errno; >+ > /* Do this once. */ > short int flags = attrp == NULL ? 0 : attrp->__flags; Why is a TOCTOU race acceptable and/or unavoidable? Also, why is a check using the real rather than the effective credentials correct here?
On 09/10/2015 07:31 PM, Szabolcs Nagy wrote: > On 10/09/15 15:47, Navid Rahimi wrote: >> [BZ #18433] >> * sysdeps/posix/spawni.c (__spawni): >> Check file access before forking. >> --- > >> +++ b/sysdeps/posix/spawni.c >> @@ -90,6 +90,9 @@ __spawni (pid_t *pid, const char *file, >> size_t len; >> size_t pathlen; >> >> + if(__access (file, X_OK) != 0) >> + return errno; >> + > > this does not guarantee the success of __execve (file, argv, envp). > I think our main objection here is to avoid forking when there is no file.There is so many other variable for checking if execve is going to success or not.
On Thu, 2015-09-10 at 19:35 +0430, Navid Rahimi wrote: > I think our main objection here is to avoid forking when there is no > file.There is so many other variable for checking if execve is going to > success or not. On the other hand, your patch will add an extra system call and directory lookup to every successful posix_spawn() call, i.e. you are optimising the failure case at the expense of the successful case. It's not at all obvious to me that this is a sensible thing to do. Can you explain your reasoning in a bit more detail? p.
On Thu, Sep 10, 2015 at 7:32 PM, Zack Weinberg <zackw@panix.com> wrote:
> Why is a TOCTOU race acceptable and/or unavoidable? Also, why is a check using the real rather than the effective credentials correct here?
Because of nature of lock in unix (being advisory) I think there is no
way to lock file and prevent TOCTOU, even if we had lock mechanism ,
it is cumbersome (and almost impossible) to design in correctly.
Because for preventing TOCTOU we should keep lock until execve , and
after that point so many question will raise , "what will happen to
lock in execve ?" or "does child process has lock also ?"
About using real rather effective credentials , it was my mistake ,
euidaccess would be more appropriate .
On Thu, Sep 10, 2015 at 10:56 PM, Phil Blundell <pb@pbcl.net> wrote: > On Thu, 2015-09-10 at 19:35 +0430, Navid Rahimi wrote: >> I think our main objection here is to avoid forking when there is no >> file.There is so many other variable for checking if execve is going to >> success or not. > > On the other hand, your patch will add an extra system call and > directory lookup to every successful posix_spawn() call, i.e. you are > optimising the failure case at the expense of the successful case. It's > not at all obvious to me that this is a sensible thing to do. Can you > explain your reasoning in a bit more detail? > > p. > > I agree with you completely about overhead and extra syscall , but there is no other option if we want to check fork. About optimizing failure case , I would not call it optimizing , I returning correct error code and correct answer to caller at expense of one extra syscall overhead . Perfect Abstraction I think is the most priority of every project.Adding extra syscall will have overhead , but without that we are not reporting status correctly to caller.
On Thu, Sep 10, 2015 at 5:38 PM, navid Rahimi <rahimi.nv@gmail.com> wrote: > On Thu, Sep 10, 2015 at 7:32 PM, Zack Weinberg <zackw@panix.com> wrote: >> Why is a TOCTOU race acceptable and/or unavoidable? Also, why is a check using the real rather than the effective credentials correct here? > > Because of nature of lock in unix (being advisory) I think there is no > way to lock file and prevent TOCTOU, even if we had lock mechanism , > it is cumbersome (and almost impossible) to design in correctly. It just occurred to me that you could open() the file in the parent and then use fexecve() in the child. If that worked correctly it would address both my concerns and Phil's. However, fexecve is faked on top of /proc, which means there are all sorts of ways it might not work 100% reliably, and for correctness you would need an open mode that may not actually exist (O_EXEC, I guess it would be called). All in all I am inclined to say leave things as they are. Nobody uses posix_spawn if they can help it, anyway :) zw
On Fri, 2015-09-11 at 02:15 +0430, navid Rahimi wrote: > I agree with you completely about overhead and extra syscall , but > there is no other option if we want to check fork. About optimizing > failure case , I would not call it optimizing , I returning correct > error code and correct answer to caller at expense of one extra > syscall overhead . Well, right, but I still don't understand why this is desirable. A portable application can't rely on this behaviour since, as I wrote in the bugzilla ticket earlier, POSIX explicitly allows an implementation to not detect errors before fork()ing and instead to have the child do exit(127); and as far as I can tell glibc is already doing this correctly. So there doesn't seem to be any conformance problem here; at most we're talking about a QoI issue. An application which is particularly worried that it might be about to posix_spawn() a nonexistent executable can always do the access() check for itself beforehand. Alternatively it could look for the exit code 127 and then try to figure out retrospectively why the spawn() would have failed by poking around with access(). But it remains a bit unclear to me why the application would really need to know since, even if it can determine why posix_spawn() failed, it's not obvious that it can take any meaningful steps to fix the problem. Do you have a concrete use-case where having a specific failure return from posix_spawn() is important? p.
On Thu, Sep 10, 2015 at 07:26:46PM +0100, Phil Blundell wrote: > On Thu, 2015-09-10 at 19:35 +0430, Navid Rahimi wrote: > > I think our main objection here is to avoid forking when there is no > > file.There is so many other variable for checking if execve is going to > > success or not. > > On the other hand, your patch will add an extra system call and > directory lookup to every successful posix_spawn() call, i.e. you are > optimising the failure case at the expense of the successful case. It's > not at all obvious to me that this is a sensible thing to do. Can you > explain your reasoning in a bit more detail? > And do you have programs where that would matter? On quick test just calling fork around takes 240000 cycles so this would likely lost in noise.
On 10/09/15 22:45, navid Rahimi wrote: > On Thu, Sep 10, 2015 at 10:56 PM, Phil Blundell <pb@pbcl.net> wrote: >> On Thu, 2015-09-10 at 19:35 +0430, Navid Rahimi wrote: >>> I think our main objection here is to avoid forking when there is no >>> file.There is so many other variable for checking if execve is going to >>> success or not. >> >> On the other hand, your patch will add an extra system call and >> directory lookup to every successful posix_spawn() call, i.e. you are >> optimising the failure case at the expense of the successful case. It's >> not at all obvious to me that this is a sensible thing to do. Can you >> explain your reasoning in a bit more detail? >> >> p. >> >> > > I agree with you completely about overhead and extra syscall , but > there is no other option if we want to check fork. About optimizing there are other options. for example you can correctly check if execve returned and report an error then instead of doing approximations of the right check. you can report the error through a cloexec pipe to the parent. note that a correct posix_spawn implementation never uses fork for QoI reasons (that's the point of posix_spawn, so it works on nommu, large multi-thread applications etc.) and vfork has the wrong semantics for c code so there are other issues here. > failure case , I would not call it optimizing , I returning correct > error code and correct answer to caller at expense of one extra > syscall overhead . Perfect Abstraction I think is the most priority of > every project.Adding extra syscall will have overhead , but without > that we are not reporting status correctly to caller. >
On 09/10/2015 11:50 PM, Zack Weinberg wrote: > On Thu, Sep 10, 2015 at 5:38 PM, navid Rahimi <rahimi.nv@gmail.com> wrote: >> On Thu, Sep 10, 2015 at 7:32 PM, Zack Weinberg <zackw@panix.com> wrote: >>> Why is a TOCTOU race acceptable and/or unavoidable? Also, why is a check using the real rather than the effective credentials correct here? >> >> Because of nature of lock in unix (being advisory) I think there is no >> way to lock file and prevent TOCTOU, even if we had lock mechanism , >> it is cumbersome (and almost impossible) to design in correctly. > > It just occurred to me that you could open() the file in the parent > and then use fexecve() in the child. If that worked correctly it > would address both my concerns and Phil's. However, fexecve is faked > on top of /proc, which means there are all sorts of ways it might not > work 100% reliably, and for correctness you would need an open mode > that may not actually exist (O_EXEC, I guess it would be called). The correct way is to set up a pipe with O_CLOEXEC file descriptors in the parent prior to forking. The parent reads from the pipe after forking, waiting. When the execve succeeds, the read in the parent completes without returning any data. On execve failure, the child writes the errno value over the pipe, where the parent can read it and report it. The downside is that this adds additional blocking to the posix_spawn operation, which is not something all callers want (some may even prefer that the vfork happens on a new thread). So I think this needs a configuration knob. > All in all I am inclined to say leave things as they are. Nobody uses > posix_spawn if they can help it, anyway :) We need to move people off fork and clone to posix_spawn. If it is missing functionality, we should add it.
On Fri, 11 Sep 2015, Florian Weimer wrote: > The downside is that this adds additional blocking to the posix_spawn > operation, which is not something all callers want (some may even prefer > that the vfork happens on a new thread). So I think this needs a > configuration knob. Huh? By definition, the vfork parent is suspended until the vfork child either execs or terminates. In both outcomes, the pipe will be already closed when the parent is resumed, so it will not block on reading. (and if you can rely on parent-suspending semantics of vfork, you don't even need a pipe) Unless I've missed something? Alexander
On Fri, Sep 11, 2015 at 1:48 PM, Florian Weimer <fweimer@redhat.com> wrote: > The correct way is to set up a pipe with O_CLOEXEC file descriptors in > the parent prior to forking. The parent reads from the pipe after > forking, waiting. When the execve succeeds, the read in the parent > completes without returning any data. On execve failure, the child > writes the errno value over the pipe, where the parent can read it and > report it. I was reading musl source code to figure it out how they handle this situation , they use exact same method . They use pipe's to get situation of child .The main key point here is the closure of pipe in successful execve(O_CLOEXEC). Implementation of it is not hard at all. I will work on patch to send in days.
On Fri, Sep 11, 2015 at 12:33:24PM +0300, Alexander Monakov wrote: > On Fri, 11 Sep 2015, Florian Weimer wrote: > > The downside is that this adds additional blocking to the posix_spawn > > operation, which is not something all callers want (some may even prefer > > that the vfork happens on a new thread). So I think this needs a > > configuration knob. > > Huh? By definition, the vfork parent is suspended until the vfork child > either execs or terminates. In both outcomes, the pipe will be already > closed when the parent is resumed, so it will not block on reading. > > (and if you can rely on parent-suspending semantics of vfork, you don't even > need a pipe) > > Unless I've missed something? I've found CLONE_VFORK unreliable on Linux: under some conditions, when running under ptrace, I observed the parent returning before the child execs/exits, and of course horrible memory corruption resulted. Thus I strongly prefer using CLONE_VFORK merely as a hint to the scheduler (which helps prevent the extra blocking step in read) and using FD_CLOEXEC for the actual synchronization. Rich
Rich Felker <dalias@libc.org> writes: > I've found CLONE_VFORK unreliable on Linux: under some conditions, > when running under ptrace, I observed the parent returning before the > child execs/exits, and of course horrible memory corruption resulted. Did you report it? Andreas.
On Fri, Sep 11, 2015 at 06:23:07PM +0200, Andreas Schwab wrote: > Rich Felker <dalias@libc.org> writes: > > > I've found CLONE_VFORK unreliable on Linux: under some conditions, > > when running under ptrace, I observed the parent returning before the > > child execs/exits, and of course horrible memory corruption resulted. > > Did you report it? At the time I was too busy to try to track down exactly why it was happening, and it was unclear to me whether this was the kernel's fault or the tracing process's fault (strace). Rich
doing an access/stat/whatever doesn't solve the issue you described -- as others have pointed out, there's a TOCTOU race, but there's also no guarantee that doing a stat/access on mode bits means you can actually execute that file and/or that the exec will succeed. since this doesn't actually fix the problem you've described (and others have pointed out that it's not a spec violation), then this patch as-is is dead. -mike
access has the wrong semantics. You could use euidaccess. But then that will has the wrong semantics if POSIX_SPAWN_RESETIDS was used. It quickly gets quite hairy. Trying to match the semantics that execve will use after everything that posix_spawn might be doing in the child side is a losing battle.
On Sat, Sep 12, 2015 at 12:48 AM, Roland McGrath <roland@hack.frob.com> wrote: > access has the wrong semantics. You could use euidaccess. But then that > will has the wrong semantics if POSIX_SPAWN_RESETIDS was used. It quickly > gets quite hairy. Trying to match the semantics that execve will use after > everything that posix_spawn might be doing in the child side is a losing > battle. About using access/euidaccess , both are dead end. So I get over it. I thought best approach would be using pipe's but pipes are not available all the time.
On Fri, Sep 11, 2015 at 5:18 AM, Florian Weimer <fweimer@redhat.com> wrote: >> All in all I am inclined to say leave things as they are. Nobody uses >> posix_spawn if they can help it, anyway :) > > We need to move people off fork and clone to posix_spawn. If it is > missing functionality, we should add it. ... I honestly have no idea why you say that. I am under the impression that posix_spawn is a monstrosity from the days when the POSIX committee was inclined to make up interfaces out of whole cloth in order to fix things that weren't broken, and that it should under no circumstances be used in new code. zw
On 11/09/15 22:01, Zack Weinberg wrote: > On Fri, Sep 11, 2015 at 5:18 AM, Florian Weimer <fweimer@redhat.com> wrote: >>> All in all I am inclined to say leave things as they are. Nobody uses >>> posix_spawn if they can help it, anyway :) >> >> We need to move people off fork and clone to posix_spawn. If it is >> missing functionality, we should add it. > > ... I honestly have no idea why you say that. I am under the because fork has issues - spawn process from a large application (memory overcommit off) is unreliable with fork. - using it from a multi-threaded application (so from a library) is hard, because the child has to be AS-safe. - leaks sensitive information to the child (the sensitive info is often managed by a library and there is no way of safely clearing it on fork). - pthread_atfork is broken, further limiting the applicability of fork in multi-threaded code (it cannot be implemented if fork has to be AS-safe and the interface contract of the callbacks are ill-defined). - it has no simple implementation if the underlying platform has no fork syscall with the right semantics (posix on windows, nommu,...) > impression that posix_spawn is a monstrosity from the days when the > POSIX committee was inclined to make up interfaces out of whole cloth > in order to fix things that weren't broken, and that it should under > no circumstances be used in new code. posix_spawn is the only standard interface that can work (some) fork limitations around. new code should use it instead of fork (or vfork) if possible. > > zw >
On 14 Sep 2015 15:04, Szabolcs Nagy wrote: > On 11/09/15 22:01, Zack Weinberg wrote: > > On Fri, Sep 11, 2015 at 5:18 AM, Florian Weimer wrote: > >>> All in all I am inclined to say leave things as they are. Nobody uses > >>> posix_spawn if they can help it, anyway :) > >> > >> We need to move people off fork and clone to posix_spawn. If it is > >> missing functionality, we should add it. > > > > ... I honestly have no idea why you say that. I am under the > > because fork has issues > > - spawn process from a large application (memory overcommit off) > is unreliable with fork. > - using it from a multi-threaded application (so from a library) > is hard, because the child has to be AS-safe. > - leaks sensitive information to the child (the sensitive info > is often managed by a library and there is no way of safely > clearing it on fork). > - pthread_atfork is broken, further limiting the applicability > of fork in multi-threaded code (it cannot be implemented if > fork has to be AS-safe and the interface contract of the > callbacks are ill-defined). > - it has no simple implementation if the underlying platform has > no fork syscall with the right semantics (posix on windows, > nommu,...) how does posix_spawn mangically fix these things ? it still calls fork internally, and the use of the vfork flag is non-portable. -mike
On Wed, Sep 16, 2015 at 2:32 AM, Mike Frysinger <vapier@gentoo.org> wrote: > how does posix_spawn mangically fix these things ? it still calls > fork internally, and the use of the vfork flag is non-portable. > -mike I sent a v2 ( in fact v3 ) of patch which solve the bug with pipes in whole new thread in mailing list, but personally I have these question too , posix_spawn is just another layer on top of fork (or vfork) but seems people count on it as hole another approach . It is a little hard to understand for me how is it differ from fork when it is just fork with a bunch of other options. best wishes, -navid
On 15/09/15 23:02, Mike Frysinger wrote: > On 14 Sep 2015 15:04, Szabolcs Nagy wrote: >> On 11/09/15 22:01, Zack Weinberg wrote: >>> On Fri, Sep 11, 2015 at 5:18 AM, Florian Weimer wrote: >>>>> All in all I am inclined to say leave things as they are. Nobody uses >>>>> posix_spawn if they can help it, anyway :) >>>> >>>> We need to move people off fork and clone to posix_spawn. If it is >>>> missing functionality, we should add it. >>> >>> ... I honestly have no idea why you say that. I am under the >> >> because fork has issues >> >> - spawn process from a large application (memory overcommit off) >> is unreliable with fork. >> - using it from a multi-threaded application (so from a library) >> is hard, because the child has to be AS-safe. >> - leaks sensitive information to the child (the sensitive info >> is often managed by a library and there is no way of safely >> clearing it on fork). >> - pthread_atfork is broken, further limiting the applicability >> of fork in multi-threaded code (it cannot be implemented if >> fork has to be AS-safe and the interface contract of the >> callbacks are ill-defined). >> - it has no simple implementation if the underlying platform has >> no fork syscall with the right semantics (posix on windows, >> nommu,...) > > how does posix_spawn mangically fix these things ? it still calls > fork internally, and the use of the vfork flag is non-portable. > -mike the api is such that user code is not run with a copy of the address space in the child so an implementation without fork is possible (that's the point of this api). an implementation with fork is not useful as you noted. an implementation with vfork is problematic in c because the compiler is free to clobber the stack of the parent. it can be implemented with clone and CLONE_VM on linux. (see the musl posix_spawn by Rich Felker) and that solves the problems (of course the kinds of actions before exec are limited by the api so it's not always useful/simple).
On 09/11/2015 05:08 AM, Szabolcs Nagy wrote: > On 10/09/15 22:45, navid Rahimi wrote: >> On Thu, Sep 10, 2015 at 10:56 PM, Phil Blundell <pb@pbcl.net> wrote: >>> On Thu, 2015-09-10 at 19:35 +0430, Navid Rahimi wrote: >>>> I think our main objection here is to avoid forking when there is no >>>> file.There is so many other variable for checking if execve is going to >>>> success or not. >>> >>> On the other hand, your patch will add an extra system call and >>> directory lookup to every successful posix_spawn() call, i.e. you are >>> optimising the failure case at the expense of the successful case. It's >>> not at all obvious to me that this is a sensible thing to do. Can you >>> explain your reasoning in a bit more detail? >>> >>> p. >>> >>> >> >> I agree with you completely about overhead and extra syscall , but >> there is no other option if we want to check fork. About optimizing > > there are other options. > > for example you can correctly check if execve returned > and report an error then instead of doing approximations > of the right check. > > you can report the error through a cloexec pipe to the parent. > > note that a correct posix_spawn implementation never uses > fork for QoI reasons (that's the point of posix_spawn, so > it works on nommu, large multi-thread applications etc.) > and vfork has the wrong semantics for c code so there are > other issues here. Note that it is possible to use vfork in certain conditions, and we do in glibc. So one should not entirely dismiss vfork, but that's slightly off topic. c.
On 18/09/15 15:44, Carlos O'Donell wrote: > Note that it is possible to use vfork in certain conditions, > and we do in glibc. So one should not entirely dismiss vfork, > but that's slightly off topic. are you sure? i think all use of vfork is invalid c: the compiler can spill registers on the stack then in the child clobber them, then after vfork returns in the parent the clobbered registers are restored breaking the expectations of the compiler. (this can break independently of how the c code around vfork looks like). i think you have to know the compiler internals or use some language extension to make it valid.
On Fri, 18 Sep 2015, Szabolcs Nagy wrote: > On 18/09/15 15:44, Carlos O'Donell wrote: > > Note that it is possible to use vfork in certain conditions, > > and we do in glibc. So one should not entirely dismiss vfork, > > but that's slightly off topic. > > are you sure? > > i think all use of vfork is invalid c: the compiler can > spill registers on the stack then in the child clobber > them, then after vfork returns in the parent the > clobbered registers are restored breaking the expectations > of the compiler. (this can break independently of how > the c code around vfork looks like). The same argument applies to setjmp, and the solution in GCC is the same for both: GCC internally recognizes vfork, setjmp, and a few other functions as functions that "return twice". (after all, if it was broken like you describe, one would expect to witness such breakage in practice, for instance with GNU Make prior to 4.0) Alexander
On 18/09/15 16:19, Alexander Monakov wrote: > On Fri, 18 Sep 2015, Szabolcs Nagy wrote: > >> On 18/09/15 15:44, Carlos O'Donell wrote: >>> Note that it is possible to use vfork in certain conditions, >>> and we do in glibc. So one should not entirely dismiss vfork, >>> but that's slightly off topic. >> >> are you sure? >> >> i think all use of vfork is invalid c: the compiler can >> spill registers on the stack then in the child clobber >> them, then after vfork returns in the parent the >> clobbered registers are restored breaking the expectations >> of the compiler. (this can break independently of how >> the c code around vfork looks like). > > The same argument applies to setjmp, and the solution in GCC is the > same for both: GCC internally recognizes vfork, setjmp, and a few other > functions as functions that "return twice". > > (after all, if it was broken like you describe, one would expect to witness > such breakage in practice, for instance with GNU Make prior to 4.0) > that's surprising: setjmp is standard c and it is a macro so it cannot be called through a function pointer.. this is not true for vfork so it's not clear why gcc recognizes it. > Alexander >
> > The same argument applies to setjmp, and the solution in GCC is the > > same for both: GCC internally recognizes vfork, setjmp, and a few other > > functions as functions that "return twice". > > > > (after all, if it was broken like you describe, one would expect to witness > > such breakage in practice, for instance with GNU Make prior to 4.0) > > > > that's surprising: setjmp is standard c > and it is a macro so it cannot be called > through a function pointer.. this is not > true for vfork so it's not clear why gcc > recognizes it. Because a minor change to GCC (extending setjmp treatment to vfork) is overall a more acceptable compromise than demanding fixes to every vfork-in-C usage? (I can see your argument in a way that an abstract C compiler cannot be required to treat vfork specially, unlike setjmp; but fwiw, not only GCC, but Clang and cparser also appear to treat vfork similar to setjmp) Alexander
On Wed, Sep 16, 2015 at 4:35 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote: > On 15/09/15 23:02, Mike Frysinger wrote: >> On 14 Sep 2015 15:04, Szabolcs Nagy wrote: >>> On 11/09/15 22:01, Zack Weinberg wrote: >>>> On Fri, Sep 11, 2015 at 5:18 AM, Florian Weimer wrote: >>>>> We need to move people off fork and clone to posix_spawn. If it is >>>>> missing functionality, we should add it. >>>> >>>> ... I honestly have no idea why you say that. >>> >>> because fork has issues [...] >> >> how does posix_spawn magically fix these things ? it still calls >> fork internally [...] > > the api is such that user code is not run with a copy of the > address space in the child so an implementation without fork > is possible (that's the point of this api). [...] (of course the kinds > of actions before exec are limited by the api so it's not always > useful/simple). So that means migrating to posix_spawn doesn't fix anything until a better kernel API exists, so developer effort should go into _that_, and then applications should migrate directly to the new kernel API, not to posix_spawn. (I actually have a design sketched out in my head; unfortunately I have neither the time nor the kernel programming experience to get it done. The fundamental idea is a new process state, "egg," in which the process has a PID and a task_struct assigned but does not have an address space and cannot execute instructions. There are a bunch of new system calls to set the kernel state of an egg -- open file descriptors and the like -- and a modified execve() that loads a program into an egg and sets it going.) zw
On 09/18/2015 11:03 AM, Szabolcs Nagy wrote: > On 18/09/15 15:44, Carlos O'Donell wrote: >> Note that it is possible to use vfork in certain conditions, >> and we do in glibc. So one should not entirely dismiss vfork, >> but that's slightly off topic. > > are you sure? > > i think all use of vfork is invalid c: the compiler can > spill registers on the stack then in the child clobber > them, then after vfork returns in the parent the > clobbered registers are restored breaking the expectations > of the compiler. (this can break independently of how > the c code around vfork looks like). > > i think you have to know the compiler internals or use > some language extension to make it valid. The only way I know to make it safe is to create a thread to handle the vfork state changes, thus you're running on a distinct stack. See: http://red.ht/1NI34eT Cheers, Carlos.
This was meant to be a reply to list. ---------- Forwarded message ---------- From: Zack Weinberg <zackw@panix.com> Date: Fri, Sep 18, 2015 at 12:42 PM Subject: Re: [PATCH] [BZ #18433] Check file access/existence before forking. To: Szabolcs Nagy <szabolcs.nagy@arm.com> On Fri, Sep 18, 2015 at 11:32 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote: > > that's surprising: setjmp is standard c > and it is a macro so it cannot be called > through a function pointer.. this is not > true for vfork so it's not clear why gcc > recognizes it. Any function can be annotated with these semantics using __attribute__ ((returns_twice)), but glibc's headers don't do that, so GCC must have special knowledge of the name 'vfork'; and this probably _is_ a conformance violation, in that a program that doesn't include <unistd.h> (or includes it in a feature-selection mode that doesn't bring in a declaration of 'vfork') ought to be allowed to define its own 'vfork' that has no such semantics. (As far as I know, vfork is _not_ in POSIX nor XSI.) I don't see any way around it, though. Breaking programs that call vfork, expecting it to have its special behavior, would be worse. And I expect the code-generation changes are not _wrong_ in the case that vfork doesn't return twice, just less optimal. And yes, I expect Bad Things happen if someone calls vfork through a function pointer. zw
On Fri, Sep 18, 2015 at 06:19:33PM +0300, Alexander Monakov wrote: > On Fri, 18 Sep 2015, Szabolcs Nagy wrote: > > > On 18/09/15 15:44, Carlos O'Donell wrote: > > > Note that it is possible to use vfork in certain conditions, > > > and we do in glibc. So one should not entirely dismiss vfork, > > > but that's slightly off topic. > > > > are you sure? > > > > i think all use of vfork is invalid c: the compiler can > > spill registers on the stack then in the child clobber > > them, then after vfork returns in the parent the > > clobbered registers are restored breaking the expectations > > of the compiler. (this can break independently of how > > the c code around vfork looks like). > > The same argument applies to setjmp, Not quite. With setjmp, once there's any return from the function where setjmp was called (or call to a function that's known neither to return nor call longjmp or throw an exception), the compiler may rightfully assume that non-reachable data in the setjmp caller is no longer live and clobber it. What makes vfork is special is that the data must be treated as live even when the caller calls _exit. > and the solution in GCC is the > same for both: GCC internally recognizes vfork, setjmp, and a few other > functions as functions that "return twice". If gcc treats vfork even more specially than setjmp, or slightly pessimizes behavior around setjmp to force all data to remain live, then I think it's safe to use vfork as long as your compiler is gcc. Whether other compilers that are or might become interesting do the same, I don't know. It's not an assumption I like. > (after all, if it was broken like you describe, one would expect to witness > such breakage in practice, for instance with GNU Make prior to 4.0) I thought the reason GNU make switches was that they encountered subtle breakage in practice somewhere... Of course GNU make has to be compatible with basically _all_ compilers, not just gcc, so that might be the reason. Rich
On 18 Sep 2015 13:13, Zack Weinberg wrote: > From: Zack Weinberg <zackw@panix.com> > On Fri, Sep 18, 2015 at 11:32 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote: > > that's surprising: setjmp is standard c > > and it is a macro so it cannot be called > > through a function pointer.. this is not > > true for vfork so it's not clear why gcc > > recognizes it. > > Any function can be annotated with these semantics using __attribute__ > ((returns_twice)), but glibc's headers don't do that feel like posting a patch ? :) -mike
On Fri, 18 Sep 2015, Rich Felker wrote: > > > i think all use of vfork is invalid c: the compiler can > > > spill registers on the stack then in the child clobber > > > them, then after vfork returns in the parent the > > > clobbered registers are restored breaking the expectations > > > of the compiler. (this can break independently of how > > > the c code around vfork looks like). > > > > The same argument applies to setjmp, > > Not quite. With setjmp, once there's any return from the function > where setjmp was called (or call to a function that's known neither to > return nor call longjmp or throw an exception), the compiler may > rightfully assume that non-reachable data in the setjmp caller is no > longer live and clobber it. > > What makes vfork is special is that the data must be treated as live > even when the caller calls _exit. For the compiler, the relationship between vfork and _exit should be the same as between setjmp and longjmp: calling the latter causes the former to return a second time.
On September 18, 2015 4:13:02 PM EDT, Mike Frysinger <vapier@gentoo.org> wrote: >On 18 Sep 2015 13:13, Zack Weinberg wrote: >> From: Zack Weinberg <zackw@panix.com> >> On Fri, Sep 18, 2015 at 11:32 AM, Szabolcs Nagy ><szabolcs.nagy@arm.com> wrote: >> > that's surprising: setjmp is standard c >> > and it is a macro so it cannot be called >> > through a function pointer.. this is not >> > true for vfork so it's not clear why gcc >> > recognizes it. >> >> Any function can be annotated with these semantics using >__attribute__ >> ((returns_twice)), but glibc's headers don't do that > >feel like posting a patch ? :) Is that actually a useful change to make, considering GCC already knows about vfork? To be clear, I am not advocating for the use of vfork (in posix_spawn or otherwise); I am only advocating *against* bothering to make improvements to posix_spawn. zw
On Fri, Sep 18, 2015 at 11:19:56PM +0300, Alexander Monakov wrote: > On Fri, 18 Sep 2015, Rich Felker wrote: > > > > i think all use of vfork is invalid c: the compiler can > > > > spill registers on the stack then in the child clobber > > > > them, then after vfork returns in the parent the > > > > clobbered registers are restored breaking the expectations > > > > of the compiler. (this can break independently of how > > > > the c code around vfork looks like). > > > > > > The same argument applies to setjmp, > > > > Not quite. With setjmp, once there's any return from the function > > where setjmp was called (or call to a function that's known neither to > > return nor call longjmp or throw an exception), the compiler may > > rightfully assume that non-reachable data in the setjmp caller is no > > longer live and clobber it. > > > > What makes vfork is special is that the data must be treated as live > > even when the caller calls _exit. > > For the compiler, the relationship between vfork and _exit should be the same > as between setjmp and longjmp: calling the latter causes the former to return > a second time. Yes, I suppose that's a good way to model it. But the returns_twice attribute does not specify under what conditions the function returns again, so unless gcc has vfork-specific knowledge attached to the specific name, I suppose it must just assume everything stays live. Rich
Rich Felker <dalias@libc.org> writes: > Yes, I suppose that's a good way to model it. But the returns_twice > attribute does not specify under what conditions the function returns > again, so unless gcc has vfork-specific knowledge attached to the > specific name, I suppose it must just assume everything stays live. GCC doesn't treat vfork any different from setjmp, or any other function marked returns_twice, see calls.c:special_function_p. Andreas.
On Sat, Sep 19, 2015 at 2:39 AM, Andreas Schwab <schwab@linux-m68k.org> wrote: > Rich Felker <dalias@libc.org> writes: > >> Yes, I suppose that's a good way to model it. But the returns_twice >> attribute does not specify under what conditions the function returns >> again, so unless gcc has vfork-specific knowledge attached to the >> specific name, I suppose it must just assume everything stays live. > > GCC doesn't treat vfork any different from setjmp, or any other function > marked returns_twice, see calls.c:special_function_p. special_function_p just defines the set of functions assumed to have this behavior (setjmp, setjmp_syscall, sigsetjmp, savectx, qsetjmp, vfork, and getcontext); to find out what the compiler *does* with that knowledge you have to grep for ECF_RETURNS_TWICE. As best I can tell, the direct effects of a callsite being marked ECF_RETURNS_TWICE seem to be that it is not eligible for various optimizations (e.g. tail call) and, much more importantly, the control-flow graph for the containing function is annotated with an "abnormal edge" from *every other callsite which is not known to have no side effects* to that callsite. I'm not sure exactly what "abnormal edge" means to GCC, but I think this should be conservatively correct, whether or not such a function actually does return twice. zw
On 09/19/2015 08:04 AM, Zack Weinberg wrote: > On Sat, Sep 19, 2015 at 2:39 AM, Andreas Schwab <schwab@linux-m68k.org> wrote: >> Rich Felker <dalias@libc.org> writes: >> >>> Yes, I suppose that's a good way to model it. But the returns_twice >>> attribute does not specify under what conditions the function returns >>> again, so unless gcc has vfork-specific knowledge attached to the >>> specific name, I suppose it must just assume everything stays live. Indeed it does assume everything stays live. >> >> GCC doesn't treat vfork any different from setjmp, or any other function >> marked returns_twice, see calls.c:special_function_p. > > special_function_p just defines the set of functions assumed to have > this behavior (setjmp, setjmp_syscall, sigsetjmp, savectx, qsetjmp, > vfork, and getcontext); to find out what the compiler *does* with that > knowledge you have to grep for ECF_RETURNS_TWICE. > > As best I can tell, the direct effects of a callsite being marked > ECF_RETURNS_TWICE seem to be that it is not eligible for various > optimizations (e.g. tail call) and, much more importantly, the > control-flow graph for the containing function is annotated with an > "abnormal edge" from *every other callsite which is not known to have > no side effects* to that callsite. > > I'm not sure exactly what "abnormal edge" means to GCC, but I think > this should be conservatively correct, whether or not such a function > actually does return twice. Abnormal edge, i.e. abnormal flow of control, i.e. longjmp or throw-like. It's a big hammer that kills quite a lot of data flow analysis, but thankfully returns_twice functions are rare. And in the situations they are used, there usually isn't much room for improvement anyway, so why complicate the correctness issue? r~
On 09/18/2015 06:52 PM, Zack Weinberg wrote: > On Wed, Sep 16, 2015 at 4:35 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote: >> On 15/09/15 23:02, Mike Frysinger wrote: >>> On 14 Sep 2015 15:04, Szabolcs Nagy wrote: >>>> On 11/09/15 22:01, Zack Weinberg wrote: >>>>> On Fri, Sep 11, 2015 at 5:18 AM, Florian Weimer wrote: >>>>>> We need to move people off fork and clone to posix_spawn. If it is >>>>>> missing functionality, we should add it. >>>>> >>>>> ... I honestly have no idea why you say that. >>>> >>>> because fork has issues [...] >>> >>> how does posix_spawn magically fix these things ? it still calls >>> fork internally [...] >> >> the api is such that user code is not run with a copy of the >> address space in the child so an implementation without fork >> is possible (that's the point of this api). [...] (of course the kinds >> of actions before exec are limited by the api so it's not always >> useful/simple). > > So that means migrating to posix_spawn doesn't fix anything until a > better kernel API exists, so developer effort should go into _that_, > and then applications should migrate directly to the new kernel API, > not to posix_spawn. posix_spawn is not a kernel API. If applications move to that, we can add optimizations as soon as better kernel APIs become available, and applications will directly benefit. That's the nice aspect of pthread_spawn. The not-so-nice aspect from an application point of view is that the interface is so restrictive, and glibc basically has to vet and implement any child process configuration option. > (I actually have a design sketched out in my head; unfortunately I > have neither the time nor the kernel programming experience to get it > done. The fundamental idea is a new process state, "egg," in which > the process has a PID and a task_struct assigned but does not have an > address space and cannot execute instructions. There are a bunch of > new system calls to set the kernel state of an egg -- open file > descriptors and the like -- and a modified execve() that loads a > program into an egg and sets it going.) I think we actually have most bits in the kernel (what you are proposing is a lot like the restore part of the checkpoint/restore support), we just have to fit things together properly. One thing that we can't really address: Some callers want to know the child PID *before* the execve (so that pthread_spawn does not delay execution), some want to know precisely if execve succeeded. We probably need to introduce a flag to choose between the two behaviors because they are fundamentally incompatible.
On Wed, Sep 30, 2015 at 7:48 AM, Florian Weimer <fweimer@redhat.com> wrote: > On 09/18/2015 06:52 PM, Zack Weinberg wrote: >> (I actually have a design sketched out in my head; unfortunately I >> have neither the time nor the kernel programming experience to get it >> done. The fundamental idea is a new process state, "egg," in which >> the process has a PID and a task_struct assigned but does not have an >> address space and cannot execute instructions. There are a bunch of >> new system calls to set the kernel state of an egg -- open file >> descriptors and the like -- and a modified execve() that loads a >> program into an egg and sets it going.) > > I think we actually have most bits in the kernel (what you are proposing > is a lot like the restore part of the checkpoint/restore support), we > just have to fit things together properly. > > One thing that we can't really address: Some callers want to know the > child PID *before* the execve (so that pthread_spawn does not delay > execution), some want to know precisely if execve succeeded. We > probably need to introduce a flag to choose between the two behaviors > because they are fundamentally incompatible. This is a non-issue if applications code to the new kernel API rather than posix_spawn. egg_create() returns the new PID, which you always need to know, and egg_execve(pid, ...) tells you if it failed. (A critical aspect of my idea, which I guess I forgot to mention, is that all the new setting-up-the-egg syscalls report failure in the *parent*; this wipes out a whole class of problems with failures in between fork() and execve() being difficult to report accurately.) zw
diff --git a/ChangeLog b/ChangeLog index 5f009a8..0e0c85b 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,8 @@ +2015-09-10 Navid Rahimi <rahimi.nv@gmail.com> + + [BZ #18433] + * sysdeps/posix/spawni.c (__spawni): Check file access before forking. + 2015-09-08 Joseph Myers <joseph@codesourcery.com> [BZ #14912] diff --git a/sysdeps/posix/spawni.c b/sysdeps/posix/spawni.c index eee9331..c571390 100644 --- a/sysdeps/posix/spawni.c +++ b/sysdeps/posix/spawni.c @@ -90,6 +90,9 @@ __spawni (pid_t *pid, const char *file, size_t len; size_t pathlen; + if(__access (file, X_OK) != 0) + return errno; + /* Do this once. */ short int flags = attrp == NULL ? 0 : attrp->__flags;