Message ID | 1519679016-12241-2-git-send-email-adhemerval.zanella@linaro.org |
---|---|
State | Dropped |
Headers |
From: Adhemerval Zanella <adhemerval.zanella@linaro.org>
To: libc-alpha@sourceware.org
Subject: [PATCH v2 01/21] powerpc: Create stackframe information on syscall
Date: Mon, 26 Feb 2018 18:03:16 -0300
Message-Id: <1519679016-12241-2-git-send-email-adhemerval.zanella@linaro.org>
In-Reply-To: <1519679016-12241-1-git-send-email-adhemerval.zanella@linaro.org>
References: <1519679016-12241-1-git-send-email-adhemerval.zanella@linaro.org>
Commit Message
Adhemerval Zanella Netto
Feb. 26, 2018, 9:03 p.m. UTC
This patch adds a minimal stackframe creation to the powerpc syscall
implementation so backtrace works correctly in a signal handler.

Checked on powerpc64le-linux-gnu.

	* sysdeps/unix/sysv/linux/powerpc/syscall.S (syscall): Create stack
	frame.
---
 ChangeLog                                 |  3 +++
 sysdeps/unix/sysv/linux/powerpc/syscall.S | 14 ++++++++++++++
 2 files changed, 17 insertions(+)
Comments
On Feb 26 2018, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:

> diff --git a/sysdeps/unix/sysv/linux/powerpc/syscall.S b/sysdeps/unix/sysv/linux/powerpc/syscall.S
> index 2da9172..fae0fe8 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/syscall.S
> +++ b/sysdeps/unix/sysv/linux/powerpc/syscall.S
> @@ -19,6 +19,14 @@
>
>  ENTRY (syscall)
>  	ABORT_TRANSACTION
> +	/* Creates a minimum stack frame so backtrace works.  */

s/Creates/Create/

Andreas.
On 26/02/2018 18:41, Andreas Schwab wrote:
> On Feb 26 2018, Adhemerval Zanella <adhemerval.zanella@linaro.org> wrote:
>
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/syscall.S b/sysdeps/unix/sysv/linux/powerpc/syscall.S
>> index 2da9172..fae0fe8 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/syscall.S
>> +++ b/sysdeps/unix/sysv/linux/powerpc/syscall.S
>> @@ -19,6 +19,14 @@
>>
>>  ENTRY (syscall)
>>  	ABORT_TRANSACTION
>> +	/* Creates a minimum stack frame so backtrace works.  */
>
> s/Creates/Create/
>
> Andreas.

Ack, fixed locally.
On 26 Feb 2018, Adhemerval Zanella wrote:
> This patch adds a minimal stackframe creation on powerpc syscall
> implementation so backtrace works correctly on a signal handler.

I don't know powerpc well enough to know if this should be necessary.
I think one of the powerpc arch maintainers should comment.  I do have
a couple questions:

> +#ifdef __powerpc64__
> +	stdu r1, -FRAME_MIN_SIZE (r1)
> +	cfi_adjust_cfa_offset (FRAME_MIN_SIZE)
> +#else
> +	stwu r1,-16(1)
> +	cfi_def_cfa_offset (16)
> +#endif

Why does this use cfi_adjust_cfa_offset for 64-bit but _def_ for 32-bit?

> +#ifdef __powerpc64__
> +	addi r1, r1, FRAME_MIN_SIZE
> +#else
> +	addi r1,r1,16
> +#endif

If FRAME_MIN_SIZE were defined for ppc32, we could reduce the
ifdeffage here.

> +	cfi_def_cfa_offset (0)
> 	sc
> 	PSEUDO_RET

Shouldn't the stack adjustments be undone _after_ the 'sc'
instruction?  Actually, is it possible that a single
cfi_def_cfa_offset (0) at the beginning of the function is all that's
really needed here?  It seems like backtrace should be able to handle
leaf frames in general...

zw
On 26 Feb 2018, Adhemerval Zanella wrote:
> This patch fixes some race conditions in the NPTL cancellation code by
> redefining how cancellable syscalls are defined and handled.  The
> current approach is to enable asynchronous cancellation prior to making
> the syscall and restore the previous cancellation type once the syscall
> returns.
>
> As described in BZ#12683, this approach shows 2 important problems:
>
> 1. Cancellation can act after the syscall has returned from the kernel,
>    but before userspace saves the return value.  It might result in a
>    resource leak if the syscall allocated a resource or had a side
>    effect (partial read/write), and there is no way for the program to
>    handle it with cancellation handlers.
>
> 2. If a signal is handled while the thread is blocked at a cancellable
>    syscall, the entire signal handler runs with asynchronous
>    cancellation enabled.  This can lead to issues if the signal handler
>    calls functions which are async-signal-safe but not
>    async-cancel-safe.
>
> For cancellation to work correctly, there are 5 points at which the
> cancellation signal could arrive:
>
> 1. Before the final "testcancel" and before the syscall is made.
> 2. Between the "testcancel" and the syscall.
> 3. While the syscall is blocked and no side effects have yet taken
>    place.
> 4. While the syscall is blocked but with some side effects already
>    having taken place (e.g. a partial read or write).
> 5. After the syscall has returned.
>
> And GLIBC wants to act on cancellation in cases 1, 2, and 3 but not in
> case 4 or 5.  The proposed solution follows:
>
> * Handling case 1 is trivial: do a conditional branch based on whether
>   the thread has received a cancellation request;
> * Case 2 can be caught by the signal handler determining that the saved
>   program counter (from the ucontext_t) is in some address range
>   beginning just before the "testcancel" and ending with the syscall
>   instruction.
> * In case 3, except for certain syscalls that ALWAYS fail with EINTR
>   even for non-interrupting signals, the kernel will reset the program
>   counter to point at the syscall instruction during signal handling,
>   so that the syscall is restarted when the signal handler returns.
>   So, from the signal handler's standpoint, this looks the same as
>   case 2, and thus it's taken care of.
> * In case 4, the kernel cannot restart the syscall; when it's
>   interrupted by a signal, the kernel must cause the syscall to return
>   with whatever partial result it obtained (e.g. partial read or
>   write).
> * In case 5, the saved program counter points just after the syscall
>   instruction, so the signal handler won't act on cancellation.
>   This one is equal to case 4, since the program counter is past the
>   syscall instruction already.

To be 100% clear here, you are fixing important problem #1 by having
deferred cancellation _not_ trigger in cases 4 and 5, because the
signal handler observes that the PC is past the syscall instruction?

> Another case that needs handling is syscalls that fail with EINTR even
> when the signal handler is non-interrupting.  In this case, the syscall
> wrapper code can just check the cancellation flag when the errno result
> is EINTR, and act on cancellation if it's set.

Which system calls are affected by this rule, and is it actually
correct to have cancellation trigger when they fail with EINTR?  Can
we be certain that the EINTR was due to the cancellation signal rather
than some other signal delivered first?  (At least for sigsuspend it
is possible to have two signals be delivered simultaneously, if they
were both blocked and pending, and both become unblocked due to the
sigsuspend.  The kernel pushes two signal stack frames.
It would reduce the odds of this being a problem if the SIGCANCEL
handler masked all other signals while running, but I'm not sure that
will work with it jumping directly to __pthread_unwind under some
conditions, and also I don't think it _completely_ eliminates the
possibility; a second signal could be delivered right after sigreturn
unmasks it, without returning to the caller of the syscall stub
first.)

> 1. Remove the enable_asynccancel/disable_asynccancel function usage in
>    syscall definition and instead make them call a common symbol that
>    will check if cancellation is enabled (__syscall_cancel at
>    nptl/libc-cancellation.c), call the arch-specific cancellable
>    entry-point (__syscall_cancel_arch) and cancel the thread when
>    required.
>
> 2. Provide an arch-specific symbol that contains global markers.  These
>    markers will be used in the SIGCANCEL handler to check if the
>    interruption has been called in a valid syscall and if the syscall
>    has been completed or not.
>    A default version is provided
>    (sysdeps/unix/sysv/linux/syscall_cancel.c), however the markers may
>    not be set at the correct expected places depending on how
>    INTERNAL_SYSCALL_NCS is implemented by the underlying architecture.
>    In this case an arch-specific implementation should be provided.
>
> 3. Rewrite the SIGCANCEL asynchronous handler to check for both the
>    cancellation type and whether the current IP from the signal handler
>    falls between the global markers, and act accordingly
>    (sigcancel_handler at nptl/nptl-init.c).
>
> 4. Adjust nptl/pthread_cancel.c to send a signal instead of acting
>    directly.  This avoids synchronization issues when updating the
>    cancellation status and also focuses the logic on the signal handler
>    and cancellation syscall code.
>
> 5. Adjust pthread code to replace CANCEL_ASYNC/CANCEL_RESET calls with
>    appropriate cancellable futex syscalls.
>
> 6. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET with
>    appropriate cancellable syscalls.
>
> 7. Adjust 'lowlevellock-futex.h' arch-specific implementations to
>    provide cancellable futex calls (used in libpthread code).

The overall design seems sound to me.

> This patch adds the proposed changes to NPTL.  The code leaves all the
> ports broken without further patches in the list.

Can we find an ordering of the patches in which the tree is never
broken at an intermediate stage?  Perhaps you could introduce the
generic __syscall_cancel(_arch) first, but not change any of the
existing cancellation logic until after all of the port-specific code
was in place?

Below, all the changes which I erased and didn't comment on seem OK to
me.  I am generally pleased to see many system call wrappers get
simpler in the process of fixing this bug.

> --- a/io/creat.c
> +++ b/io/creat.c
...
> -
> -/* __open handles cancellation.  */
> -LIBC_CANCEL_HANDLED ();

It might be clearer to split patch 2 into two patches, one to remove
tst-cancel-wrappers.sh and one to make all the other testing changes,
and then shift all of the removals of LIBC_CANCEL_HANDLED into the
same patch that removes tst-cancel-wrappers.sh.

> --- a/nptl/descr.h
> +++ b/nptl/descr.h
...
> +  /* Bit set if threads is canceled.  */

Grammar: "Bit set if thread is canceled."  (For 100% proper English,
it would be "This bit is set if this thread has been cancelled", but
all of the other comments are clipped as well, so the above is fine.)

> --- a/nptl/libc-cancellation.c
> +++ b/nptl/libc-cancellation.c
...
> +#include <setjmp.h>
> +#include <stdlib.h>

Why are these additional #includes necessary?

...
> +  volatile struct pthread *pd = (volatile struct pthread *) self;

Why is it necessary for this pointer to be volatile?  It appears that
we only ever access pd->cancelhandling; which loads need to be forced
to go to memory, and should they also be atomic?

> +  /* If cancellation is not enabled, call the syscall directly.  */
> +  if (pd->cancelhandling & CANCELSTATE_BITMASK)

It is unfortunate that the one-bit flag that means "cancellation is
disabled" is named CANCELSTATE_BITMASK instead of, I dunno,
CANCEL_DISABLED_BITMASK maybe.  You didn't introduce that, but please
consider a follow-up patch that renames CANCELSTATE_BIT(MASK) and also
CANCELTYPE_BIT(MASK) [the latter to "CANCEL_ASYNC_BIT(MASK)" perhaps].

> +  if ((result == -EINTR)
> +      && (pd->cancelhandling & CANCELED_BITMASK)
> +      && !(pd->cancelhandling & CANCELSTATE_BITMASK))
> +    __syscall_do_cancel ();

I thought you said this treatment was only applied to system calls
that _always_ fail with EINTR, even when interrupted by SA_RESTART
signals.  This is doing it for system calls interrupted by
non-SA_RESTART signals as well.  You also change SIGCANCEL to be an
SA_RESTART signal in this patch, but it still could happen that we
have SIGCANCEL and some other non-SA_RESTART signal arrive
simultaneously enough that they interrupt the same system call, and
then it fails with EINTR and everything is confused.

I think you need to check over the entire design with "what if
SIGCANCEL and some other signal are delivered as nested interruptions
to the same system call" in mind, in fact.  The dangerous case that I
see is when the _other_ signal causes the system call to be
interrupted after producing some side effects.  Can the SIGCANCEL
handler know that this has happened or is about to happen?

> --- a/nptl/nptl-init.c
> +++ b/nptl/nptl-init.c
>  sigcancel_handler (int sig, siginfo_t *si, void *ctx)
>  {
> +  INTERNAL_SYSCALL_DECL (err);
> +  pid_t pid = INTERNAL_SYSCALL_CALL (getpid, err);
> +
>    /* Safety check.  It would be possible to call this function for
>       other signals and send a signal from another process.  This is
>       not correct and might even be a security problem.  Try to catch
>       as many incorrect invocations as possible.  */
>    if (sig != SIGCANCEL
> -      || si->si_pid != __getpid()
> +      || si->si_pid != pid
>        || si->si_code != SI_TKILL)

Why is this change necessary?

> +  volatile struct pthread *pd = (volatile struct pthread *) self;

Same question as above: why does this need to be volatile, does it
actually do the job, should we instead be using atomics?

> +  /* Add SIGCANCEL on ignored sigmask to avoid the handler to be
> +     called again.  */
> +  sigset_t *set = UCONTEXT_SIGMASK (ctx);
> +  __sigaddset (set, SIGCANCEL);

Screwing with the signal mask that will be restored by sigreturn seems
unsafe.  Could we instead return early if CANCELED_BITMASK and/or
EXITING_BITMASK has already been set?

> +  /* Check if asynchronous cancellation mode is set and if interrupted
                                                      ^^^
This 'and' should be 'or'.

> +      THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
> +      THREAD_SETMEM (self, result, PTHREAD_CANCELED);

I think these ought to be done inside __do_cancel, since several other
places may call __do_cancel.

> +      INTERNAL_SYSCALL_CALL (rt_sigprocmask, err, SIG_SETMASK, set,
> +                             NULL, _NSIG / 8);

See above in re screwing with the signal mask.

> +      __do_cancel ();
> +      return;
>      }
>  }

This 'return' should be unnecessary on two counts: we're about to fall
off the end of the function anyway, and __do_cancel ends by calling
__pthread_unwind which does not return.

> --- a/nptl/pthreadP.h
> +++ b/nptl/pthreadP.h
...
> -  /* Make sure we get no more cancellations.  */
> -  THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
> +  /* Make sure we get no more cancellations by clearing the cancel
> +     state.  */
> +  int oldval = THREAD_GETMEM (self, cancelhandling);
> +  while (1)
> +    {
> +      int newval = (oldval | CANCELSTATE_BITMASK);
> +      newval &= ~(CANCELTYPE_BITMASK);

Disabled cancellation state is specified to take precedence over
asynchronous cancellation type, so it should be enough to atomically
set CANCELSTATE_BIT, at which point the CAS loop would be unnecessary.
> --- a/nptl/pthread_cancel.c
> +++ b/nptl/pthread_cancel.c
...
> +  /* Avoid signaling when thread attempts cancel itself (pthread_kill
> +     is expensive).  */
> +  if (pd == THREAD_SELF && !(pd->cancelhandling & CANCELTYPE_BITMASK))
> +    return 0;

It is not obvious why this is correct.  The answer is that
pthread_cancel is not itself a cancellation point, so, when
cancellation is deferred, a self-cancel is _not_ supposed to take
effect immediately; the signal handler would have no effect, and the
next cancellation point will check the flag anyway.  I suggest instead

  if (pd == THREAD_SELF)
    {
      if (pd->cancelhandling & CANCELTYPE_BITMASK)
        __do_cancel ();

      /* pthread_cancel is not itself a cancellation point; the next
         cancellation point will check the flag anyway, so there is no
         need to send the cancellation signal.  */
      return 0;
    }

  return __pthread_kill (th, SIGCANCEL);

> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
...
>    /* If the parent was running cancellation handlers while creating
>       the thread the new thread inherited the signal mask.  Reset the
>       cancellation signal mask.  */

If the cancellation handler exited early when EXITING_BIT was set, and
it didn't mess with the signal mask, we wouldn't need to do this,
either.

> -  int oldtype = CANCEL_ASYNC ();
> +  int ct;
> +  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);

I don't understand the logic in this part of the code at all so I am
going to trust you that this is the correct change.  It seems like
it's using SIGCANCEL for something other than cancellation, which
seems unwise and maybe should be changed in a follow-up.

> --- a/nptl/pthread_join_common.c
> +++ b/nptl/pthread_join_common.c
...
> -  int oldtype = CANCEL_ASYNC ();
> +  int ct;
> +  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);
>
>    if (abstime != NULL)
>      result = lll_timedwait_tid (pd->tid, abstime);
>    else
>      lll_wait_tid (pd->tid);
>
> -  CANCEL_RESET (oldtype);
> +  __pthread_setcanceltype (ct, NULL);

Maybe we should have lll_(timed)wait_tid_cancel rather than this?

> --- a/sysdeps/unix/sysdep.h
> +++ b/sysdeps/unix/sysdep.h
...
> +/* The loader does not need to handle thread cancellation, use direct
> +   syscall instead.  */

A stronger assertion would be better:

  /* The loader should never make cancellable system calls.  Remap
     them to direct system calls.  */

Also, is this hack still necessary after I went through and made the
loader explicitly use _nocancel operations?

> --- a/sysdeps/unix/sysv/linux/pthread_kill.c
> +++ b/sysdeps/unix/sysv/linux/pthread_kill.c
...
> -  /* Disallow sending the signal we use for cancellation, timers,
> -     for the setxid implementation.  */
> -  if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
> +  /* Disallow sending the signal we use for setxid implementation.  */
> +  if (signo == SIGSETXID)
>      return EINVAL;

Now applications can call pthread_kill(thr, SIGCANCEL) themselves,
can't they?  That seems unsafe.  Also this has since been changed to
use __is_internal_signal, I think.  We need an __internal_pthread_kill
or something.

> --- a/sysdeps/unix/sysv/linux/socketcall.h
> +++ b/sysdeps/unix/sysv/linux/socketcall.h
...
> +#define __SOCKETCALL_CANCEL1(__name, __a1) \
> +  SYSCALL_CANCEL_NCS (socketcall, __name, \
> +                      ((long int [1]) { (long int) __a1 }))

Blech, which architectures still require us to use socketcall?

> --- /dev/null
> +++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
...
> +#define ADD_LABEL(__label) \
> +  asm volatile ( \
> +    ".global " __label "\t\n" \
> +    ".type " __label ",\%function\t\n" \
> +    __label ":\n");

This makes me extremely nervous.
Specifically, I do not trust the compiler to not move these around,
even though they're volatile.  I would actually prefer an .S file for
every architecture.  Do we have a concrete reason to believe they
really will always be where they're supposed to be?  "It compiled fine
with the compiler I have right now" is not enough, I'm afraid.

> --- a/sysdeps/unix/sysv/linux/sysdep.h
> +++ b/sysdeps/unix/sysv/linux/sysdep.h
...
> +/* Check error from cancellable syscall and set errno accordingly.
> +   Linux uses a negative return value to indicate syscall errors
> +   and since version 2.1 the return value of a system call might be
> +   negative even if the call succeeded (e.g., the `lseek' system call
> +   might return a large offset).
> +   The current contract is the kernel makes sure no syscall returns a
> +   value in -1 .. -4095 as a valid result, so we can safely test with
> +   -4095.  */
> +#define SYSCALL_CANCEL_RET(__ret) \
> +  ({ \
> +    if (__ret > -4096UL) \
> +      { \
> +        __set_errno (-__ret); \
> +        __ret = -1; \
> +      } \
> +    __ret; \
> +  })

Don't we already have this somewhere under another name?

zw
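The -4095 error-window convention that SYSCALL_CANCEL_RET relies on can be modelled in plain C. This is an illustrative sketch only: the function name `syscall_ret_to_errno` is invented here and is not the glibc macro, but the unsigned comparison is the same trick.

```c
#include <errno.h>

/* Convert a raw kernel return value into the userspace convention:
   values in [-4095, -1] are negated error codes, everything else is a
   valid result (which may itself look negative when reinterpreted,
   e.g. a large lseek offset).  The comparison is done unsigned, so
   "> -4096UL" matches exactly the top 4095 values of the unsigned
   range.  */
static long
syscall_ret_to_errno (unsigned long ret)
{
  if (ret > -4096UL)
    {
      errno = -(long) ret;      /* e.g. -EINTR becomes EINTR */
      return -1;
    }
  return (long) ret;
}
```

A large positive result such as a file offset passes through untouched, while -EINTR collapses to the familiar -1/errno pair.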
On 06/05/2018 23:49, Zack Weinberg wrote:
> On 26 Feb 2018, Adhemerval Zanella wrote:
>> This patch adds a minimal stackframe creation on powerpc syscall
>> implementation so backtrace works correctly on a signal handler.
>
> I don't know powerpc well enough to know if this should be necessary.
> I think one of the powerpc arch maintainers should comment.  I do have
> a couple questions:
>
>> +#ifdef __powerpc64__
>> +	stdu r1, -FRAME_MIN_SIZE (r1)
>> +	cfi_adjust_cfa_offset (FRAME_MIN_SIZE)
>> +#else
>> +	stwu r1,-16(1)
>> +	cfi_def_cfa_offset (16)
>> +#endif
>
> Why does this use cfi_adjust_cfa_offset for 64-bit but _def_ for
> 32-bit?

I am not well versed in CFI definitions, so I basically followed what
the compiler spills in a usual function call back when I coded it.

>
>> +#ifdef __powerpc64__
>> +	addi r1, r1, FRAME_MIN_SIZE
>> +#else
>> +	addi r1,r1,16
>> +#endif
>
> If FRAME_MIN_SIZE were defined for ppc32, we could reduce the
> ifdeffage here.
>
>> +	cfi_def_cfa_offset (0)
>> 	sc
>> 	PSEUDO_RET
>
> Shouldn't the stack adjustments be undone _after_ the 'sc'
> instruction?  Actually, is it possible that a single
> cfi_def_cfa_offset (0) at the beginning of the function is all that's
> really needed here?  It seems like backtrace should be able to handle
> leaf frames in general...

Indeed the single 'cfi_def_cfa_offset (0)' suffices; I will change the
patch accordingly in the next iteration.
On 06/05/2018 23:49, Zack Weinberg wrote:
> On 26 Feb 2018, Adhemerval Zanella wrote:
>> With the upcoming fix for BZ#12683, pthread cancellation does not act
>> for:
>>
>> 1. If the syscall is blocked but with some side effects already having
>>    taken place (e.g. a partial read or write).
>> 2. After the syscall has returned.
>>
>> The main change is due to the fact that programs need to act on
>> syscalls with side effects (for instance, to avoid a leak of allocated
>> resources or handle a partial read/write).
>>
>> This patch changes the NPTL testcases that assume the old behavior and
>> also removes the tst-cancel-wrappers.sh test (which checks for symbols
>> that will not exist anymore).
>
> It's OK and expected that we have to adjust test cases that were
> expecting the old behavior ...
>
>> For the tst-cancel{2,3} case it removes the pipe close because it
>> might cause the write syscall to return with side effects if the close
>> is executed before the pthread cancel.
>
> ... however, this change appears to be wrong.  If cancellation is
> broken, these tests will now deadlock rather than failing cleanly.

On the current cancellation implementation the thread will finish
regardless, and sigcancel_handler will act whether there are side
effects or not (the pipe close).  The issue is that cancellation should
not happen if the syscall returns but some side effects already took
place, in this case the pipe close.

>
> If I understand correctly, the problem you're trying to avoid is that
> the 'tf' thread could conceivably receive the closed-pipe event before
> the cancellation signal, even though the 'do_test' thread triggers the
> cancellation signal first.  I don't know of any way to fix this 100%,
> but I think it would be good enough to use pthread_timedjoin_np to
> sleep for a hundred milliseconds or so in the 'do_test' thread, and
> then close the pipe if the 'tf' thread is still alive.

Yes, although for this specific case I am not sure if this could happen
in practice.
I assume if a thread issues a 'signal' followed by a 'close', the
signal target thread will receive the events in an ordered manner,
i.e., the signal handler will be activated before the syscall sees any
side effects (the close).  It seems to be Linux behaviour, but I am not
sure if a different system might act differently.

And I try to avoid timing checks, such as pthread_timedjoin_np, because
they tend to be quite fragile in practice for such cases (due either to
system load when testing glibc, machine performance, etc.).  I will
remove the close-removal changes in the next iteration.

>
> (tst-cancel4.c appears to be trying to ensure that cancellations are
> pending with a pthread_barrier_t, but as far as I know there's no
> guarantee that if a thread does
>
>   pthread_cancel(th);
>   pthread_barrier_wait(ba);
>
> where 'th' also waits on 'ba', the SIGCANCEL will actually be
> delivered before the barrier unblocks, either.  Feh.)
>
>> It also changes how the read syscall is called on tst-backtrace{5,6},
>> using syscall instead of the read cancellable syscall, to avoid the
>> need to handle the cancellable bridge function calls.  It requires a
>> change on the powerpc syscall implementation to create a stackframe,
>> since powerpc backtrace relies on such information.
>
> It doesn't look technically difficult to me to handle an additional
> stack frame or two in the trace.  They're always going to be there,
> aren't they?  In the new world order, the stack trace will always be
>
>   0 handle_signal
>   1 <signal trampoline>
>   2 __syscall_cancel_arch
>   3 __syscall_cancel
>   4 read
>   5 fn
>   6 fn
>   7 fn
>   8 do_test
>
> won't it?  I think teaching the backtrace logic about this would be
> better than needing to use a raw syscall() and then mess with the
> PowerPC implementation of syscall().  I might feel differently about
> this change if __read_nocancel were a public API, but it isn't...

With your current suggestion to the powerpc syscall bits, there is no
need to actually change the powerpc syscall implementation besides the
additional CFI mechanism.  But I do not mind changing the testcase in
the bz12683 fix itself; the only advantage I see is that by using the
indirect syscall there is no need to change it again.

>
>> Checked on i686-linux-gnu, x86_64-linux-gnu, x86_64-linux-gnux32,
>> aarch64-linux-gnu, arm-linux-gnueabihf, powerpc64le-linux-gnu,
>> powerpc-linux-gnu, sparcv9-linux-gnu, and sparc64-linux-gnu.
>
> When you say checked, do you mean you actually ran the test cases, or
> did you just compile them (perhaps with a cross-compiler)?

It is a full make check on native machines.

>
>> 	* nptl/Makefile [$(run-built-tests) = yes] (tests-special): Remove
>> 	tst-cancel-wrappers.sh.
>> 	* nptl/tst-cancel-wrappers.sh: Remove file.
>
> This part is OK.
>
>> 	* nptl/tst-cancel4.c (tf_write): Handle cancelled syscall with
>> 	side-effects.
>> 	(tf_send): Likewise.
>
> This part is also OK.  I think the test can still fail spuriously due
> to races, but this one can't deadlock, at least, so we can live with
> it.
>
>> 	* sysdeps/unix/sysv/linux/powerpc/syscall.S (syscall): Create stack
>> 	frame.
>
> This ChangeLog entry belongs to patch 1.

Ack.  I will remove it.

>
> zw
>
On Mon, May 7, 2018 at 1:13 PM, Adhemerval Zanella
<adhemerval.zanella@linaro.org> wrote:
> On 06/05/2018 23:49, Zack Weinberg wrote:
>> On 26 Feb 2018, Adhemerval Zanella wrote:
>>> For tst-cancel{2,3} case it remove the pipe close because it might
>>> cause the write syscall to return with side effects if the close is
>>> executed before the pthread cancel.
>>
>> ... however, this change appears to be wrong.  If cancellation is
>> broken, these tests will now deadlock rather than failing cleanly.
>
> On current cancellation implementation the thread will finish
> regardless and sigcancel_handler will act whether there is
> side-effects or not (the pipe close).  The issue is cancellation
> should not happen if syscall returns but some side effects already
> took place, in this case the pipe close.

I think maybe I didn't explain clearly enough what I'm worried about
here.  What the test case does _when cancellation works_ is sensible.
But this is a test case; it also needs to behave sensibly when
cancellation _doesn't_ work.  Imagine a new port where, for some
reason, the cancellation mechanism is so broken that read/write aren't
acting as cancellation points at all.  Without the close,
tst-cancel{2,3} will block forever in read/write.  We have the
test-driver timeout as a backstop, but we shouldn't rely on it.

> Yes, although for this specific case I am not sure if this could
> happen in practice.  I assume if a thread issues a 'signal' followed
> by a 'close', the signal target thread will receive the events in a
> ordered manner, i.e, the signal handler will be activated before the
> syscall sees any side-effects (the close).  It seems to be Linux
> behaviour, but I am not sure if a different system might act
> differently.

I don't think POSIX makes any requirements, but yes, in practice the
signal should always arrive first.
> And I try to avoid the timing check, such as pthread_timedjoin_np,
> because they tend to quite fragile in practice for such cases (due
> either to system load when testing glibc, machine performance, etc.).

This is reasonable.  For the new cancellation mechanism in general, we
don't have a good way of arranging for SIGCANCEL to arrive at exactly
the critical points within the syscall sequence, do we?  I am tempted
to try to write a test case that scripts gdb and single-steps through
a call to open() and fires SIGCANCEL at each instruction.

>> won't it?  I think teaching the backtrace logic about this would be
>> better than needing to use a raw syscall() and then mess with the
>> PowerPC implementation of syscall().  I might feel differently about
>> this change if __read_nocancel were a public API, but it isn't...
>
> With your current suggestion to powerpc syscall bits, there is no need
> to actually change the powerpc syscall implementation besides an
> additional CFI mechanism.  But I do not mind to change the testcase on
> the bz12683 fix itself, the only advantage I see is by using indirect
> syscall there is no need to actually change it again.

I don't feel especially strongly about this now we have a way that
doesn't add actual instructions to powerpc syscall().

zw
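The decision the SIGCANCEL handler makes at each instruction of the syscall sequence (cases 1-3 act, cases 4-5 do not, as laid out earlier in the thread) reduces to a PC-range test. This is a toy model with invented names: `pc_in_cancellable_region` and the marker addresses stand in for the real __syscall_cancel_arch start/end symbols.

```c
typedef unsigned long addr_t;

/* Act on cancellation only when the interrupted PC lies inside the
   cancellable region: at or after the start marker (just before the
   "testcancel") and not yet past the syscall instruction.  Once the
   PC is past the syscall, side effects may have occurred and the
   handler must not cancel.  */
static int
pc_in_cancellable_region (addr_t pc, addr_t start, addr_t syscall_insn)
{
  return pc >= start && pc <= syscall_insn;
}
```

A gdb script of the kind described above would, in effect, probe this predicate at every instruction between the two markers.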
On 06/05/2018 23:49, Zack Weinberg wrote:

Hi Zack,

First, thank you for the thorough review of this patch.

> On 26 Feb 2018, Adhemerval Zanella wrote:
>> This patch fixes some race conditions in the NPTL cancellation code
>> by redefining how cancellable syscalls are defined and handled.  The
>> current approach is to enable asynchronous cancellation prior to
>> making the syscall and restore the previous cancellation type once
>> the syscall returns.
>>
>> As described in BZ#12683, this approach shows 2 important problems:
>>
>> 1. Cancellation can act after the syscall has returned from the
>>    kernel, but before userspace saves the return value.  It might
>>    result in a resource leak if the syscall allocated a resource or
>>    had a side effect (partial read/write), and there is no way for
>>    the program to handle it with cancellation handlers.
>>
>> 2. If a signal is handled while the thread is blocked at a
>>    cancellable syscall, the entire signal handler runs with
>>    asynchronous cancellation enabled.  This can lead to issues if the
>>    signal handler calls functions which are async-signal-safe but not
>>    async-cancel-safe.
>>
>> For cancellation to work correctly, there are 5 points at which the
>> cancellation signal could arrive:
>>
>> 1. Before the final "testcancel" and before the syscall is made.
>> 2. Between the "testcancel" and the syscall.
>> 3. While the syscall is blocked and no side effects have yet taken
>>    place.
>> 4. While the syscall is blocked but with some side effects already
>>    having taken place (e.g. a partial read or write).
>> 5. After the syscall has returned.
>>
>> And GLIBC wants to act on cancellation in cases 1, 2, and 3 but not
>> in case 4 or 5.
>> The proposed solution follows:
>>
>> * Handling case 1 is trivial: do a conditional branch based on
>>   whether the thread has received a cancellation request;
>> * Case 2 can be caught by the signal handler determining that the
>>   saved program counter (from the ucontext_t) is in some address
>>   range beginning just before the "testcancel" and ending with the
>>   syscall instruction.
>> * In case 3, except for certain syscalls that ALWAYS fail with EINTR
>>   even for non-interrupting signals, the kernel will reset the
>>   program counter to point at the syscall instruction during signal
>>   handling, so that the syscall is restarted when the signal handler
>>   returns.  So, from the signal handler's standpoint, this looks the
>>   same as case 2, and thus it's taken care of.
>> * In case 4, the kernel cannot restart the syscall; when it's
>>   interrupted by a signal, the kernel must cause the syscall to
>>   return with whatever partial result it obtained (e.g. partial read
>>   or write).
>> * In case 5, the saved program counter points just after the syscall
>>   instruction, so the signal handler won't act on cancellation.
>>   This one is equal to case 4, since the program counter is past the
>>   syscall instruction already.
>
> To be 100% clear here, you are fixing important problem #1 by having
> deferred cancellation _not_ trigger in cases 4 and 5, because the
> signal handler observes that the PC is past the syscall instruction?

Yes.

>
>> Another case that needs handling is syscalls that fail with EINTR
>> even when the signal handler is non-interrupting.  In this case, the
>> syscall wrapper code can just check the cancellation flag when the
>> errno result is EINTR, and act on cancellation if it's set.
>
> Which system calls are affected by this rule, and is it actually
> correct to have cancellation trigger when they fail with EINTR?
The best description I have is from the signal man page [1], in the
'Interruption of system calls and library functions by signal handlers'
section.

[1] http://man7.org/linux/man-pages/man7/signal.7.html

And I think cancellation should happen in this case because it follows
from the semantics: another thread explicitly called pthread_cancel,
and it is the underlying system behaviour that is not allowing libc to
explicitly cancel the thread.

> Can we be certain that the EINTR was due to the cancellation signal
> rather than some other signal delivered first?  (At least for
> sigsuspend it is possible to have two signals be delivered
> simultaneously, if they were both blocked and pending, and both
> become unblocked due to the sigsuspend.  The kernel pushes two signal
> stack frames.  It would reduce the odds of this being a problem if
> the SIGCANCEL handler masked all other signals while running, but I'm
> not sure that will work with it jumping directly to __pthread_unwind
> under some conditions, and also I don't think it _completely_
> eliminates the possibility; a second signal could be delivered right
> after sigreturn unmasks it, without returning to the caller of the
> syscall stub first.)

By inspecting the CANCELED_BIT in the thread's cancelhandling mask, in
nptl/libc-cancellation.c:

  if ((result == -EINTR)
      && (pd->cancelhandling & CANCELED_BITMASK)
      && !(pd->cancelhandling & CANCELSTATE_BITMASK))
    __syscall_do_cancel ();

Which is atomically set in nptl/pthread_cancel.c:

  THREAD_ATOMIC_BIT_SET (pd, cancelhandling, CANCELED_BIT);

So for the case where the thread is indeed cancelled by pthread_cancel,
cancelhandling should have the CANCELED bit set, and if the kernel
delivers another signal first and the cancellation signal handler is
not called, the code in libc-cancellation should handle it.

>
>> 1.
>> Remove the enable_asynccancel/disable_asynccancel function usage in
>> the syscall definitions and instead make them call a common symbol
>> that will check if cancellation is enabled (__syscall_cancel at
>> nptl/libc-cancellation.c), call the arch-specific cancellable
>> entry point (__syscall_cancel_arch) and cancel the thread when
>> required.
>>
>> 2. Provide an arch-specific symbol that contains global markers.
>> These markers will be used in the SIGCANCEL handler to check if the
>> interruption occurred in a valid syscall and if the syscall has been
>> completed or not.
>> A default version is provided
>> (sysdeps/unix/sysv/linux/syscall_cancel.c); however, the markers may
>> not be set in the expected places depending on how
>> INTERNAL_SYSCALL_NCS is implemented by the underlying architecture.
>> In that case an arch-specific implementation should be provided.
>>
>> 3. Rewrite the SIGCANCEL asynchronous handler to check both the
>> cancelling type and whether the current IP from the signal handler
>> falls between the global markers, and act accordingly
>> (sigcancel_handler at nptl/nptl-init.c).
>>
>> 4. Adjust nptl/pthread_cancel.c to send a signal instead of acting
>> directly.  This avoids synchronization issues when updating the
>> cancellation status and also concentrates the logic in the signal
>> handler and cancellation syscall code.
>>
>> 5. Adjust pthread code to replace CANCEL_ASYNC/CANCEL_RESET calls
>> with appropriate cancellable futex syscalls.
>>
>> 6. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET
>> with appropriate cancellable syscalls.
>>
>> 7. Adjust the 'lowlevellock-futex.h' arch-specific implementations
>> to provide cancellable futex calls (used in libpthread code).
>
> The overall design seems sound to me.
>
>> This patch adds the proposed changes to NPTL.  The code leaves all
>> the ports broken without further patches in the list.
>
> Can we find an ordering of the patches in which the tree is never
> broken at an intermediate stage?
> Perhaps you could introduce the generic __syscall_cancel(_arch)
> first, but not change any of the existing cancellation logic until
> after all of the port-specific code was in place?

I think I can try to split this into two or more patches and tie the
actual fix to the first architecture changed (x86_64 currently).
x86_64 is the only architecture that provides a *cancellation*
re-implementation in assembly (originally as an optimization (?)), and
I think we can split that out into a preliminary clean-up patch.  But I
am not sure I can keep the tree stable if we only apply the fixes for
some architectures instead of all of them.

>
> Below, all the changes which I erased and didn't comment on seem OK to
> me.  I am generally pleased to see many system call wrappers get
> simpler in the process of fixing this bug.
>
>> --- a/io/creat.c
>> +++ b/io/creat.c
> ...
>> -
>> -/* __open handles cancellation.  */
>> -LIBC_CANCEL_HANDLED ();
>
> It might be clearer to split patch 2 into two patches, one to remove
> tst-cancel-wrappers.sh and one to make all the other testing changes,
> and then shift all of the removals of LIBC_CANCEL_HANDLED into the
> same patch that removes tst-cancel-wrappers.sh.

This sounds reasonable, I will change it.

>
>> --- a/nptl/descr.h
>> +++ b/nptl/descr.h
> ...
>> +  /* Bit set if threads is canceled.  */
>
> Grammar: "Bit set if thread is canceled."
>
> (For 100% proper English, it would be "This bit is set if this thread
> has been cancelled", but all of the other comments are clipped as
> well, so the above is fine.)

Ack.

>
>> --- a/nptl/libc-cancellation.c
>> +++ b/nptl/libc-cancellation.c
> ...
>> +#include <setjmp.h>
>> +#include <stdlib.h>
>
> Why are these additional #includes necessary?

They are not, I will remove them.

>
> ...
>> +  volatile struct pthread *pd = (volatile struct pthread *) self;
>
> Why is it necessary for this pointer to be volatile?
> It appears that we only ever access pd->cancelhandling; which loads
> need to be forced to go to memory, and should they also be atomic?

If I recall correctly this came from a previous review; however, I
can't recall why exactly it was required.  Thinking twice, I see no
reason in fact to use volatile here.  I will remove it.

>
>> +  /* If cancellation is not enabled, call the syscall directly.  */
>> +  if (pd->cancelhandling & CANCELSTATE_BITMASK)
>
> It is unfortunate that the one-bit flag that means "cancellation is
> disabled" is named CANCELSTATE_BITMASK instead of, I dunno,
> CANCEL_DISABLED_BITMASK maybe.  You didn't introduce that, but please
> consider a follow-up patch that renames CANCELSTATE_BIT(MASK) and also
> CANCELTYPE_BIT(MASK) [the latter to "CANCEL_ASYNC_BIT(MASK)" perhaps].

In fact in my bz12683 repo I have 4 more patches that refactor the
cancellation mask altogether [1].

[1] https://sourceware.org/git/?p=glibc.git;a=shortlog;h=refs/heads/azanella/bz12683

>
>> +  if ((result == -EINTR)
>> +      && (pd->cancelhandling & CANCELED_BITMASK)
>> +      && !(pd->cancelhandling & CANCELSTATE_BITMASK))
>> +    __syscall_do_cancel ();
>
> I thought you said this treatment was only applied to system calls
> that _always_ fail with EINTR, even when interrupted by SA_RESTART
> signals.  This is doing it for system calls interrupted by
> non-SA_RESTART signals as well.  You also change SIGCANCEL to be a
> SA_RESTART signal in this patch, but it still could happen that we
> have SIGCANCEL and some other non-SA_RESTART signal arrive
> simultaneously enough that they interrupt the same system call, and
> then it fails with EINTR and everything is confused.
>
> I think you need to check over the entire design with "what if
> SIGCANCEL and some other signal are delivered as nested interruptions
> to the same system call" in mind, in fact.  The dangerous case that I
> see is when the _other_ signal causes the system call to be
> interrupted after producing some side effects.
> Can the SIGCANCEL handler know that this has happened or is about to
> happen?

It is still handling only the cancellation case, since it checks the
cancelhandling bit that is set only if a thread calls pthread_cancel.
My understanding is that nested interruptions should not be a problem:
either the cancel signal handler is activated first, which cancels the
thread if the syscall does not have a side effect, or the syscall
returns with EINTR and with a cancellation pending
(cancelhandling & CANCELED_BITMASK).

However, I think we can improve the cancel handling (and potentially
simplify the synchronization code on the cancelhandling mask) by
masking all signals while handling SIGCANCEL.

>
>> --- a/nptl/nptl-init.c
>> +++ b/nptl/nptl-init.c
>>  sigcancel_handler (int sig, siginfo_t *si, void *ctx)
>>  {
>> +  INTERNAL_SYSCALL_DECL (err);
>> +  pid_t pid = INTERNAL_SYSCALL_CALL (getpid, err);
>> +
>>    /* Safety check.  It would be possible to call this function for
>>       other signals and send a signal from another process.  This is
>>       not correct and might even be a security problem.  Try to catch
>>       as many incorrect invocations as possible.  */
>>    if (sig != SIGCANCEL
>> -      || si->si_pid != __getpid()
>> +      || si->si_pid != pid
>>        || si->si_code != SI_TKILL)
>
> Why is this change necessary?

This code predates the getpid cache removal (c579f48edba8) and I did
not catch it in my rebases.  I will remove it.

>
>> +  volatile struct pthread *pd = (volatile struct pthread *) self;
>
> Same question as above: why does this need to be volatile, does it
> actually do the job, should we instead be using atomics?

Same as before, I will remove it.

>
>> +  /* Add SIGCANCEL on ignored sigmask to avoid the handler to be
>> +     called again.  */
>> +  sigset_t *set = UCONTEXT_SIGMASK (ctx);
>> +  __sigaddset (set, SIGCANCEL);
>
> Screwing with the signal mask that will be restored by sigreturn seems
> unsafe.  Could we instead return early if CANCELED_BITMASK and/or
> EXITING_BITMASK has already been set?
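For reference, the PC-range test that sigcancel_handler performs
(matching the case analysis earlier in the thread) distills to a
one-line predicate.  This is an illustrative sketch with made-up marker
addresses, not the glibc code; the real markers are the arch-specific
global symbols bracketing the "testcancel + syscall" region:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical addresses for the global markers provided by
   __syscall_cancel_arch; the real values come from arch-specific
   symbols, not constants.  */
static const uintptr_t cancel_region_start = 0x1000; /* before the testcancel */
static const uintptr_t cancel_region_end   = 0x1010; /* past the syscall insn */

/* Cases 2 and 3 leave the interrupted PC inside [start, end): the
   kernel rewinds a restartable blocked syscall back to the syscall
   instruction, so case 3 looks like case 2.  Cases 4 and 5 leave the
   PC at or past the end marker, so the handler must not act.  */
int
cancellation_may_act (uintptr_t pc)
{
  return pc >= cancel_region_start && pc < cancel_region_end;
}
```

The handler then only needs this predicate plus the cancellation-type
check to decide whether to call __do_cancel or simply return.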
>
>> +  /* Check if asynchronous cancellation mode is set and if interrupted
>                                                      ^^^
> This 'and' should be 'or'.

Ack.

>
>> +      THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
>> +      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
>
> I think these ought to be done inside __do_cancel, since several other
> places may call __do_cancel.

Indeed, and the 'THREAD_SETMEM (self, result, PTHREAD_CANCELED)' here
is superfluous (since __do_cancel already does it).

>
>> +      INTERNAL_SYSCALL_CALL (rt_sigprocmask, err, SIG_SETMASK, set,
>> +			     NULL, _NSIG / 8);
>
> See above in re screwing with the signal mask.
>
>> +      __do_cancel ();
>> +      return;
>>      }
>>  }
>
> This 'return' should be unnecessary on two counts: we're about to fall
> off the end of the function anyway, and __do_cancel ends by calling
> __pthread_unwind which does not return.

Indeed, and __do_cancel is already marked as noreturn.  I will remove
it.

>
>> --- a/nptl/pthreadP.h
>> +++ b/nptl/pthreadP.h
> ...
>> -  /* Make sure we get no more cancellations.  */
>> -  THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
>> +  /* Make sure we get no more cancellations by clearing the cancel
>> +     state.  */
>> +  int oldval = THREAD_GETMEM (self, cancelhandling);
>> +  while (1)
>> +    {
>> +      int newval = (oldval | CANCELSTATE_BITMASK);
>> +      newval &= ~(CANCELTYPE_BITMASK);
>
> Disabled cancellation state is specified to take precedence over
> asynchronous cancellation type, so it should be enough to atomically
> set CANCELSTATE_BIT, at which point the CAS loop would be unnecessary.

Right, this seems unnecessary.

>
>> --- a/nptl/pthread_cancel.c
>> +++ b/nptl/pthread_cancel.c
> ...
>> +  /* Avoid signaling when thread attempts cancel itself (pthread_kill
>> +     is expensive).  */
>> +  if (pd == THREAD_SELF && !(pd->cancelhandling & CANCELTYPE_BITMASK))
>> +    return 0;
>
> It is not obvious why this is correct.
> The answer is that pthread_cancel is not itself a cancellation point,
> so, when cancellation is deferred, a self-cancel is _not_ supposed to
> take effect immediately; the signal handler would have no effect, and
> the next cancellation point will check the flag anyway.  I suggest
> instead
>
>   if (pd == THREAD_SELF)
>     {
>       if (pd->cancelhandling & CANCELTYPE_BITMASK)
>         __do_cancel ();
>       /* pthread_cancel is not itself a cancellation point;
>          the next cancellation point will check the flag anyway,
>          so there is no need to send the cancellation signal.  */
>       return 0;
>     }
>   return __pthread_kill (th, SIGCANCEL);

I tried your suggestion, but it seems to trigger a regression which I
think is unrelated to the patch:

>>> info threads
  Id   Target Id   Frame
* 1    LWP 28018 "ld-linux-x86-64" __GI___syscall_cancel_arch (
       ch=ch@entry=0x7ffff7ff2a48, nr=nr@entry=202,
       a1=a1@entry=140737345771984, a2=a2@entry=0, a3=28023,
       a4=a4@entry=0, a5=0, a6=0)
       at ../sysdeps/unix/sysv/linux/syscall_cancel.c:63
  3    LWP 28023 "ld-linux-x86-64" __lll_lock_wait ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  5    LWP 28025 "ld-linux-x86-64" __lll_unlock_wake ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
  7    LWP 28027 "ld-linux-x86-64" __lll_lock_wait ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  9    LWP 28029 "ld-linux-x86-64" __lll_unlock_wake ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
  11   LWP 28031 "ld-linux-x86-64" __lll_unlock_wake ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
  13   LWP 28033 "ld-linux-x86-64" __lll_unlock_wake ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
  15   LWP 28035 "ld-linux-x86-64" __lll_unlock_wake ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:371
  17   LWP 28037 "ld-linux-x86-64" __lll_lock_wait ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
  19   LWP 28039 "ld-linux-x86-64" 0x00007ffff7bc3d95 in
       __GI___pthread_mutex_lock (mutex=0x7ffff7ffd990 <_rtld_local+2352>) at
../nptl/pthread_mutex_lock.c:113
  21   LWP 28041 "ld-linux-x86-64" __lll_lock_wait ()
       at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135

The process seems to be stuck in the unwind operation:

#4  0x00007ffff71f9a13 in uw_frame_state_for (
    context=context@entry=0x7ffff71e9c20, fs=fs@entry=0x7ffff71e9a70)
    at /home/azanella/Projects/glibc/ports/src/gcc/libgcc/unwind-dw2.c:1249
#5  0x00007ffff71fac80 in uw_init_context_1 (
    context=context@entry=0x7ffff71e9c20,
    outer_cfa=outer_cfa@entry=0x7ffff71e9e50,
    outer_ra=0x7ffff7bca390 <__GI___pthread_unwind+64>)
    at /home/azanella/Projects/glibc/ports/src/gcc/libgcc/unwind-dw2.c:1578
#6  0x00007ffff71fb426 in _Unwind_ForcedUnwind (exc=0x7ffff71ead70,
    stop=stop@entry=0x7ffff7bca220 <unwind_stop>,
    stop_argument=0x7ffff71e9e80)
    at /home/azanella/Projects/glibc/ports/src/gcc/libgcc/unwind.inc:201
#7  0x00007ffff7bca390 in __GI___pthread_unwind (buf=<optimized out>)
    at unwind.c:121
#8  0x00007ffff7bc8b41 in __do_cancel () at pthreadP.h:302
#9  __pthread_cancel (th=140737339369216) at pthread_cancel.c:60
#10 0x000000000040174d in ?? ()
#11 0x0000000000000000 in ?? ()

It does not happen on powerpc or sparc as far as I can check.  I did
not dig into it, but I think it would be safer to keep the current
mechanism of pthread_cancel always signaling the thread with
pthread_kill, and we can revisit later why your suggestion is failing
on x86.

>
>> --- a/nptl/pthread_create.c
>> +++ b/nptl/pthread_create.c
> ...
>>    /* If the parent was running cancellation handlers while creating
>>       the thread the new thread inherited the signal mask.  Reset the
>>       cancellation signal mask.  */
>
> If the cancellation handler exited early when EXITING_BIT was set, and
> it didn't mess with the signal mask, we wouldn't need to do this,
> either.
>
>> -  int oldtype = CANCEL_ASYNC ();
>> +  int ct;
>> +  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);
>
> I don't understand the logic in this part of the code at all so I am
> going to trust you that this is the correct change.  It seems like
> it's using SIGCANCEL for something other than cancellation, which
> seems unwise and maybe should be changed in a follow-up.

I think we can follow this up in subsequent patches, since the change
here is just to adjust for the CANCEL_ASYNC macro removal.

>
>> --- a/nptl/pthread_join_common.c
>> +++ b/nptl/pthread_join_common.c
> ...
>> -  int oldtype = CANCEL_ASYNC ();
>> +  int ct;
>> +  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);
>>
>>    if (abstime != NULL)
>>      result = lll_timedwait_tid (pd->tid, abstime);
>>    else
>>      lll_wait_tid (pd->tid);
>>
>> -  CANCEL_RESET (oldtype);
>> +  __pthread_setcanceltype (ct, NULL);
>
> Maybe we should have lll_(timed)wait_tid_cancel rather than this?

Both lll_timedwait_tid and lll_wait_tid are already cancellation
points; the idea here is just to enable asynchronous cancellation,
since pthread_join is a cancellation entrypoint (and
nptl/libc-cancellation.c calls the syscall directly if cancellation is
disabled).

>
>> --- a/sysdeps/unix/sysdep.h
>> +++ b/sysdeps/unix/sysdep.h
> ...
>> +/* The loader does not need to handle thread cancellation, use direct
>> +   syscall instead.  */
>
> A stronger assertion would be better:
>
>   /* The loader should never make cancellable system calls.  Remap
>      them to direct system calls.  */

Ack, I changed it to your suggestion.

>
> Also, is this hack still necessary after I went through and made the
> loader explicitly use _nocancel operations?

I need to actually check, but I presume that once the patch goes in,
the loader build will not need to pull in objects that might drag in
libc-cancellation.

>
>> --- a/sysdeps/unix/sysv/linux/pthread_kill.c
>> +++ b/sysdeps/unix/sysv/linux/pthread_kill.c
> ...
>> -  /* Disallow sending the signal we use for cancellation, timers,
>> -     for the setxid implementation.  */
>> -  if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
>> +  /* Disallow sending the signal we use for setxid implementation.  */
>> +  if (signo == SIGSETXID)
>>      return EINVAL;
>
> Now applications can call pthread_kill(thr, SIGCANCEL) themselves,
> can't they?  That seems unsafe.  Also this has since been changed to
> use __is_internal_signal, I think.  We need an __internal_pthread_kill
> or something.

Yes, I will change it to call an internal symbol.

>
>> --- a/sysdeps/unix/sysv/linux/socketcall.h
>> +++ b/sysdeps/unix/sysv/linux/socketcall.h
> ...
>> +#define __SOCKETCALL_CANCEL1(__name, __a1) \
>> +  SYSCALL_CANCEL_NCS (socketcall, __name, \
>> +		      ((long int [1]) { (long int) __a1 }))
>
> Blech, which architectures still require us to use socketcall?

The ones I recall are i386, m68k, s390, and sparc (some socket
operations have been wired up as proper syscalls in recent kernels for
some architectures).

>
>> --- /dev/null
>> +++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
> ...
>> +#define ADD_LABEL(__label) \
>> +  asm volatile ( \
>> +    ".global " __label "\t\n" \
>> +    ".type " __label ",\%function\t\n" \
>> +    __label ":\n");
>
> This makes me extremely nervous.  Specifically, I do not trust the
> compiler to not move these around, even though they're volatile.  I
> would actually prefer an .S file for every architecture.
>
> Do we have a concrete reason to believe they really will always be
> where they're supposed to be?  "It compiled fine with the compiler I
> have right now" is not enough, I'm afraid.

For the architectures where I am using the C implementation (riscv,
mips64, nios2, m68k, alpha, ia64, arm, and aarch64), two different gcc
versions generate the expected code (gcc6 and gcc7).
My understanding is that the volatile asm should prevent the compiler
from moving the ADD_LABEL macro across other C statements, but I grant
that, depending on how the architecture defines
INTERNAL_SYSCALL_NCS_CALL, the labels might not end up in the right
position.

I created it to ease the port adjustments (since for some architectures
I don't have much experience with the ABI), but I grant I don't have a
strong preference here.  The main advantage I see is that it greatly
simplifies the boilerplate required for PIC/non-PIC code on some ABIs.

>
>> --- a/sysdeps/unix/sysv/linux/sysdep.h
>> +++ b/sysdeps/unix/sysv/linux/sysdep.h
> ...
>> +/* Check error from cancellable syscall and set errno accordingly.
>> +   Linux uses a negative return value to indicate syscall errors
>> +   and since version 2.1 the return value of a system call might be
>> +   negative even if the call succeeded (e.g., the `lseek' system call
>> +   might return a large offset).
>> +   The current contract is that the kernel makes sure no syscall
>> +   returns a value in -1 .. -4095 as a valid result so we can safely
>> +   test with -4095.  */
>> +#define SYSCALL_CANCEL_RET(__ret) \
>> +  ({ \
>> +    if (__ret > -4096UL) \
>> +      { \
>> +	__set_errno (-__ret); \
>> +	__ret = -1; \
>> +      } \
>> +    __ret; \
>> +  })
>
> Don't we already have this somewhere under another name?

Not really; we have the INTERNAL_SYSCALL_ERROR_P macro, which checks
the error from the INTERNAL_SYSCALL_DECL (err).  The issue is that some
kernel ABIs (alpha and sparc, for instance) either check the return
code differently (alpha) or expect the special kernel ABI error
register (sparc).  I added this one for consistency.
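The -4095 cutoff in the quoted SYSCALL_CANCEL_RET macro relies on an
unsigned comparison.  A small standalone sketch of the same kernel
convention (the function name is made up for illustration; it is not a
glibc symbol):

```c
#include <assert.h>

/* Linux returns errors from syscalls as -errno, confined to
   [-4095, -1]; viewed as unsigned long these are exactly the values
   greater than -4096UL, so a single compare distinguishes errors from
   large-but-valid results such as lseek offsets.  */
long
decode_syscall_ret (unsigned long ret, int *err)
{
  if (ret > -4096UL)
    {
      *err = (int) -(long) ret;  /* e.g. (unsigned long) -EINTR -> EINTR */
      return -1;
    }
  *err = 0;
  return (long) ret;
}
```

A raw result of (unsigned long) -4 decodes to -1 with err = 4, while a
large positive offset passes through untouched.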
On 08/05/2018 10:35, Zack Weinberg wrote:
> On Mon, May 7, 2018 at 1:13 PM, Adhemerval Zanella
> <adhemerval.zanella@linaro.org> wrote:
>> On 06/05/2018 23:49, Zack Weinberg wrote:
>>> On 26 Feb 2018, Adhemerval Zanella wrote:
>>>> For the tst-cancel{2,3} case it removes the pipe close because it
>>>> might cause the write syscall to return with side effects if the
>>>> close is executed before the pthread cancel.
>>>
>>> ... however, this change appears to be wrong.  If cancellation is
>>> broken, these tests will now deadlock rather than failing cleanly.
>>
>> With the current cancellation implementation the thread will finish
>> regardless, and sigcancel_handler will act whether there are side
>> effects or not (the pipe close).  The issue is that cancellation
>> should not happen if the syscall returns but some side effects
>> already took place, in this case the pipe close.
>
> I think maybe I didn't explain clearly enough what I'm worried about
> here.  What the test case does _when cancellation works_ is sensible.
> But this is a test case; it also needs to behave sensibly when
> cancellation _doesn't_ work.  Imagine a new port where, for some
> reason, the cancellation mechanism is so broken that read/write aren't
> acting as cancellation points at all.  Without the close,
> tst-cancel{2,3} will block forever in read/write.  We have the
> test-driver timeout as a backstop, but we shouldn't rely on it.

Well, adding a timing mechanism (either by alarm, pthread_timedjoin,
etc.) is basically what the test-driver backstop is actually doing,
with the advantage that the backstop is another process, not
susceptible to possible memory corruption or other execution failures.
I don't see much gain in adding more hardening to the test itself.

>
>> Yes, although for this specific case I am not sure if this could
>> happen in practice.
>> I assume that if a thread issues a 'signal' followed by a 'close',
>> the signal target thread will receive the events in an ordered
>> manner, i.e., the signal handler will be activated before the
>> syscall sees any side effects (the close).  It seems to be the Linux
>> behaviour, but I am not sure if a different system might act
>> differently.
>
> I don't think POSIX makes any requirements, but yes, in practice the
> signal should always arrive first.
>
>> And I try to avoid the timing check, such as pthread_timedjoin_np,
>> because they tend to be quite fragile in practice for such cases
>> (due either to system load when testing glibc, machine performance,
>> etc.).
>
> This is reasonable.
>
> For the new cancellation mechanism in general, we don't have a good
> way of arranging for SIGCANCEL to arrive at exactly the critical
> points within the syscall sequence, do we?  I am tempted to try to
> write a test case that scripts gdb and single-steps through a call to
> open() and fires SIGCANCEL at each instruction.

I can't really think of any explicit way to check it; a gdb/ptrace
script is the closest thing I can think of as well.

>
>>> won't it?  I think teaching the backtrace logic about this would be
>>> better than needing to use a raw syscall() and then mess with the
>>> PowerPC implementation of syscall().  I might feel differently
>>> about this change if __read_nocancel were a public API, but it
>>> isn't...
>>
>> With your current suggestion for the powerpc syscall bits, there is
>> no need to actually change the powerpc syscall implementation
>> besides some additional CFI annotations.  But I do not mind changing
>> the testcase in the bz12683 fix itself; the only advantage I see is
>> that by using an indirect syscall there is no need to change it
>> again.
>
> I don't feel especially strongly about this now that we have a way
> that doesn't add actual instructions to powerpc syscall().
>
> zw

Right, I will keep my current version in the next revision.
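The EINTR interaction debated earlier in the thread reduces to a small
decision function.  The following is an illustrative distillation with
made-up bit values and a made-up function name, not the actual glibc
flags or code:

```c
#include <assert.h>

/* Illustrative stand-ins for the cancelhandling bits; the real values
   live in nptl/descr.h.  */
#define CANCELED_BITMASK    0x01  /* pthread_cancel was called */
#define CANCELSTATE_BITMASK 0x02  /* cancellation is disabled */

#define ERR_EINTR (-4)            /* -EINTR as a raw syscall result */

/* After the arch entry point returns: act on cancellation only when
   the syscall was interrupted (-EINTR), a cancellation request is
   pending, and cancellation has not been disabled -- the check from
   nptl/libc-cancellation.c quoted earlier in the thread.  Because the
   CANCELED bit is set atomically by pthread_cancel before the signal
   is sent, this holds even if an unrelated signal caused the EINTR.  */
int
should_act_on_cancel (long result, int cancelhandling)
{
  return result == ERR_EINTR
	 && (cancelhandling & CANCELED_BITMASK)
	 && !(cancelhandling & CANCELSTATE_BITMASK);
}
```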
On 06/05/2018 23:49, Zack Weinberg wrote:
> On 26 Feb 2018, Adhemerval Zanella wrote:
>> This patch adds the x86_64 modifications required for BZ#12683.
>> It basically provides the required ucontext_get_pc symbol and removes
>> the arch-specific libc-cancellation implementations.
>
> These changes all look fine to me.
>
>>   lock;orl $4, %fs:776
>>
>> Where with the patch changes it now compiles to:
>>
>>   mov %fs:16,%rax
>>   lock;orl $4, 776(%rax)
>
> That might actually be faster on some microarchitectures due to not
> having multiple unusual prefixes on one instruction.  I wouldn't worry
> about it.
>
>> In fact all the x86_64 THREAD_ATOMIC_* macros do not respect the
>> input descr and will possibly fail when used with a 'descr' different
>> from THREAD_SELF.
>
> Maybe we should correct or remove all of them, then?  They sound like
> they're bugs waiting to happen.
>
> zw

Right, I will fix it in an extra patch then.
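The THREAD_ATOMIC_* bug pattern, ignoring the descriptor argument and
always touching the current thread, is easiest to see in reduced form.
This sketch uses C11 atomics and a hypothetical descriptor struct, not
the real %fs-relative macros from tls.h:

```c
#include <assert.h>
#include <stdatomic.h>

struct pthread_desc              /* hypothetical stand-in for struct pthread */
{
  _Atomic int cancelhandling;
};

/* A descriptor-honoring bit set: it operates on whatever descriptor it
   is handed, which is what the macro's signature promises.  The buggy
   x86 versions instead always addressed the current thread through the
   thread-pointer segment register, silently ignoring 'descr'.  */
void
thread_atomic_bit_set (struct pthread_desc *descr, int bit)
{
  atomic_fetch_or (&descr->cancelhandling, 1 << bit);
}
```

With two descriptors, setting a bit through one must leave the other
untouched; the segment-register version would fail that check whenever
'descr' is not the calling thread.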
On 06/05/2018 23:49, Zack Weinberg wrote:
> On 26 Feb 2018, Adhemerval Zanella wrote:
>> This patch adds the i386 modifications required for BZ#12683.
>> It basically provides the required ucontext_get_pc symbol, adds the
>> cancellable syscall wrapper and fixes a thread atomic update macro.
>
> This also seems fine.
>
>> On i386 an arch-specific cancellation implementation is required
>> because, depending on the glibc configuration and the underlying
>> kernel, the syscall may be done using a vDSO symbol
>> (__kernel_vsyscall).
> ...
>> Also, since glibc supports i486, the old 'int 0x80' should be used
>> in the syscall wrapper.  One option would be to make the minimum
>> default chip the Pentium II (which implements sysenter) or to add a
>> runtime check in syscall_cancel.S to use 'int 0x80' or sysenter.
>
> If I remember correctly, there can be only one 'sysenter' instruction
> in the entire user address space, due to awkward limitations of the
> interface it presents to the kernel.  That was why __kernel_vsyscall
> was added in the first place.
>
> We can probably live with using int 0x80 for these syscalls that may
> well be blocking anyway.
>
>> Similar to x86_64, it also removes the bogus arch-specific
>> THREAD_ATOMIC_BIT_SET which always refers to the current thread
>> instead of the one referenced by the input 'descr' argument.
>
> Same comment as for x86_64 -- shouldn't we get rid of or repair _all_
> of the THREAD_ATOMIC_ macros that don't honor their arguments?
>
> zw

Alright, I will remove them in an extra patch as well.
diff --git a/sysdeps/unix/sysv/linux/powerpc/syscall.S b/sysdeps/unix/sysv/linux/powerpc/syscall.S
index 2da9172..fae0fe8 100644
--- a/sysdeps/unix/sysv/linux/powerpc/syscall.S
+++ b/sysdeps/unix/sysv/linux/powerpc/syscall.S
@@ -19,6 +19,14 @@
 ENTRY (syscall)
 	ABORT_TRANSACTION
+	/* Creates a minimum stack frame so backtrace works.  */
+#ifdef __powerpc64__
+	stdu r1, -FRAME_MIN_SIZE (r1)
+	cfi_adjust_cfa_offset (FRAME_MIN_SIZE)
+#else
+	stwu r1,-16(1)
+	cfi_def_cfa_offset (16)
+#endif
 	mr r0,r3
 	mr r3,r4
 	mr r4,r5
@@ -26,6 +34,12 @@
 	mr r6,r7
 	mr r7,r8
 	mr r8,r9
+#ifdef __powerpc64__
+	addi r1, r1, FRAME_MIN_SIZE
+#else
+	addi r1,r1,16
+#endif
+	cfi_def_cfa_offset (0)
 	sc
 	PSEUDO_RET
 PSEUDO_END (syscall)
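Why the minimal frame makes backtrace work: on powerpc, 'stdu r1,
-SIZE(r1)' stores the old stack pointer at the new top of stack,
linking each frame to its caller's through the back chain that the
backtrace walker follows.  A simulated walk in C, with a toy frame
struct rather than the real ABI layout:

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of powerpc stack frames: the first word of each frame
   points to the caller's frame (the back chain).  Without syscall()
   allocating its own minimal frame, the walker would step straight
   from the interrupted leaf to the caller's caller, losing a frame.  */
struct frame
{
  struct frame *backchain;
};

int
backtrace_depth (const struct frame *fp)
{
  int depth = 0;
  for (; fp != NULL; fp = fp->backchain)
    depth++;
  return depth;
}
```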