[PATCH/v3] x86: Use pad in pthread_unwind_buf to preserve shadow stack register

Message ID 20180421183714.GA16402@gmail.com
State Committed
Commit d6cc1829aa31b6fb060f24dffd28aa6705cdd33a
Headers

Commit Message

H.J. Lu April 21, 2018, 6:37 p.m. UTC
  On Fri, Apr 20, 2018 at 10:28:24PM -0500, Carlos O'Donell wrote:
> Lets call this v2. Subject adjusted. Please keep incrementing the version
> number on your patches to make the review easier by myself and others.
> 

Done.

> > +
> > +  /* No previous handlers.  NB: This must be done after setjmp since
> > +     the private space in the unwind jump buffer may overlap space
> > +     used by setjmp to store extra architecture-specific information
> > +     which is never be used by the cancellation-specific
> 
> s/be//g
 
 Done

> > +/* Number of bits per long.  */
> > +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
> > +/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64.
> > +   Define it to 96 to leave some rooms for future use.  */
> 
> Why 96?
> 
> Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits,
> so you have 128 signals worth of space and then the shadow stack pointer.
> 
> Does this work on x32?
> 
> On x32 you have only 4 void*'s in the private pad.
> 
> Your presently structured sigset_t looks like this:
> 
> typedef union
>   {
>     __sigset_t __saved_mask_compat;
>     struct
>       {
>         __jmp_buf_sigset_t __saved_mask;
>         /* Used for shadow stack pointer.  */
>         unsigned long int __shadow_stack_pointer;
>       } __saved;
>   } __jmpbuf_arch_t;
> 
> Which means you have a sigset_t *before* the __shadow_stack_pointer.
> 
> On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit
> words left? That's only space for 64 signals?
> 
> Are you counting on one additional 32-bit word of padding between the
> __mask_was_saved and the rest of the long arguments?
> 
> If so, then this still needs spelling out in an a comment why 96 signals
> works on both i686, x32, and x86_64.
> 
> Also it should be explained if 96 is the *maximum* we can do with the current
> layout, or if more is possible. In which case picking 96 is *not* arbitrary.

   The pad array in struct pthread_unwind_buf is used by setjmp to save
   shadow stack register.  The usable space in __saved_mask for sigset
   and shadow stack pointer:
   1. i386: The 4x4 byte pad array which can be used for 4 byte shadow
   stack pointer and maximum 12 byte sigset.
   2. x32: 4 byte padding + the 4x4 byte pad array which can be used
   for 8 byte shadow stack pointer and maximum 12 byte sigset.
   3. x86-64: The 4x8 byte pad array which can be used for 8 byte
   shadow stack pointer and maximum 24 byte sigset.

Please see comments in sysdeps/unix/sysv/linux/x86/setjmpP.h for
details.

> Suggest:
> ~~~
> The pad array in struct pthread_unwind_buf is used by setjmp to save
> the shadow stack register.  Assert that the size of struct pthread_unwind_buf
> is no less than the offset of the shadow stack pointer plus the shadow stack
> pointer size.
> 
> NB: We use setjmp in thread cancellation and this saves the shadow stack
> register, but __libc_unwind_longjmp doesn't restore the shadow stack register
> since cancellation never returns after longjmp.
> ~~~
> 

Done.

> 
> This assertion is too loose.
> 
> The assertion you need is that the shadow stack pointer itself falls within
> the range of the private padding. This would have caught the previous bug
> with the rounded up size.
> 
> Please adjust the assertion to be as tight as possible or add new assertions.
> 
> Completely untested, but just to show what I'm thinking:
> 
> _Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved)
> 		 + sizeof (int) < SHADOW_STACK_POINTER_OFFSET
> 		 && (offsetof (struct pthread_unwind_buf, priv.pad[4])
> 		    > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE),
> 		"Shadow stack pointer is not within private storage of pthread_unwind_buf.");
> 

I used a slighly different assert.

> 
> Looking forward to a v3.
> 

Here it is.

H.J.
---
The pad array in struct pthread_unwind_buf is used by setjmp to save
shadow stack register.  We assert that size of struct pthread_unwind_buf
is no less than offset of shadow stack pointer + shadow stack pointer
size.

Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
these with thread cancellation, call setjmp, but never return after
__libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
__libc_longjmp on x86, doesn't need to restore shadow stack register.
__libc_longjmp, which is a private interface for thread cancellation
implementation in libpthread, is changed to call __longjmp_cancel,
instead of __longjmp.  __longjmp_cancel is a new internal function
in libc, which is similar to __longjmp, but doesn't restore shadow
stack register.

The compatibility longjmp and siglongjmp in libpthread.so are changed
to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
restore shadow stack register.

Tested with build-many-glibcs.py.

	* nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
	handlers after setjmp.
	* setjmp/longjmp.c (__libc_longjmp): Don't define alias if
	defined.
	* sysdeps/unix/sysv/linux/x86/setjmpP.h: Include
	<libc-pointer-arith.h>.
	(_JUMP_BUF_SIGSET_BITS_PER_WORD): New.
	(_JUMP_BUF_SIGSET_NSIG): Changed to 96.
	(_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and
	_JUMP_BUF_SIGSET_BITS_PER_WORD.
	* sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
	* sysdeps/x86/__longjmp_cancel.S: New file.
	* sysdeps/x86/longjmp.c: Likewise.
	* sysdeps/x86/nptl/pt-longjmp.c: Likewise.
---
 nptl/pthread_create.c                 | 17 +++++++--
 setjmp/longjmp.c                      |  2 +
 sysdeps/unix/sysv/linux/x86/setjmpP.h | 71 ++++++++++++++++++++++++++++++++---
 sysdeps/x86/Makefile                  |  4 ++
 sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++++
 sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++++++++
 sysdeps/x86/nptl/pt-longjmp.c         | 71 +++++++++++++++++++++++++++++++++++
 7 files changed, 222 insertions(+), 8 deletions(-)
 create mode 100644 sysdeps/x86/__longjmp_cancel.S
 create mode 100644 sysdeps/x86/longjmp.c
 create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
  

Comments

Carlos O'Donell May 2, 2018, 4:43 a.m. UTC | #1
On 04/21/2018 02:37 PM, H.J. Lu wrote:
> On Fri, Apr 20, 2018 at 10:28:24PM -0500, Carlos O'Donell wrote:
>> Lets call this v2. Subject adjusted. Please keep incrementing the version
>> number on your patches to make the review easier by myself and others.
>>
> 
> Done.
> 
>>> +
>>> +  /* No previous handlers.  NB: This must be done after setjmp since
>>> +     the private space in the unwind jump buffer may overlap space
>>> +     used by setjmp to store extra architecture-specific information
>>> +     which is never be used by the cancellation-specific
>>
>> s/be//g
>  
>  Done
> 
>>> +/* Number of bits per long.  */
>>> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
>>> +/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64.
>>> +   Define it to 96 to leave some rooms for future use.  */
>>
>> Why 96?
>>
>> Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits,
>> so you have 128 signals worth of space and then the shadow stack pointer.
>>
>> Does this work on x32?
>>
>> On x32 you have only 4 void*'s in the private pad.
>>
>> Your presently structured sigset_t looks like this:
>>
>> typedef union
>>   {
>>     __sigset_t __saved_mask_compat;
>>     struct
>>       {
>>         __jmp_buf_sigset_t __saved_mask;
>>         /* Used for shadow stack pointer.  */
>>         unsigned long int __shadow_stack_pointer;
>>       } __saved;
>>   } __jmpbuf_arch_t;
>>
>> Which means you have a sigset_t *before* the __shadow_stack_pointer.
>>
>> On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit
>> words left? That's only space for 64 signals?
>>
>> Are you counting on one additional 32-bit word of padding between the
>> __mask_was_saved and the rest of the long arguments?
>>
>> If so, then this still needs spelling out in an a comment why 96 signals
>> works on both i686, x32, and x86_64.
>>
>> Also it should be explained if 96 is the *maximum* we can do with the current
>> layout, or if more is possible. In which case picking 96 is *not* arbitrary.
> 
>    The pad array in struct pthread_unwind_buf is used by setjmp to save
>    shadow stack register.  The usable space in __saved_mask for sigset
>    and shadow stack pointer:
>    1. i386: The 4x4 byte pad array which can be used for 4 byte shadow
>    stack pointer and maximum 12 byte sigset.
>    2. x32: 4 byte padding + the 4x4 byte pad array which can be used
>    for 8 byte shadow stack pointer and maximum 12 byte sigset.
>    3. x86-64: The 4x8 byte pad array which can be used for 8 byte
>    shadow stack pointer and maximum 24 byte sigset.
> 
> Please see comments in sysdeps/unix/sysv/linux/x86/setjmpP.h for
> details.
> 
>> Suggest:
>> ~~~
>> The pad array in struct pthread_unwind_buf is used by setjmp to save
>> the shadow stack register.  Assert that the size of struct pthread_unwind_buf
>> is no less than the offset of the shadow stack pointer plus the shadow stack
>> pointer size.
>>
>> NB: We use setjmp in thread cancellation and this saves the shadow stack
>> register, but __libc_unwind_longjmp doesn't restore the shadow stack register
>> since cancellation never returns after longjmp.
>> ~~~
>>
> 
> Done.
> 
>>
>> This assertion is too loose.
>>
>> The assertion you need is that the shadow stack pointer itself falls within
>> the range of the private padding. This would have caught the previous bug
>> with the rounded up size.
>>
>> Please adjust the assertion to be as tight as possible or add new assertions.
>>
>> Completely untested, but just to show what I'm thinking:
>>
>> _Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved)
>> 		 + sizeof (int) < SHADOW_STACK_POINTER_OFFSET
>> 		 && (offsetof (struct pthread_unwind_buf, priv.pad[4])
>> 		    > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE),
>> 		"Shadow stack pointer is not within private storage of pthread_unwind_buf.");
>>
> 
> I used a slighly different assert.
> 
>>
>> Looking forward to a v3.
>>
> 
> Here it is.
> 
> H.J.
> ---
> The pad array in struct pthread_unwind_buf is used by setjmp to save
> shadow stack register.  We assert that size of struct pthread_unwind_buf
> is no less than offset of shadow stack pointer + shadow stack pointer
> size.
> 
> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
> these with thread cancellation, call setjmp, but never return after
> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
> __libc_longjmp on x86, doesn't need to restore shadow stack register.
> __libc_longjmp, which is a private interface for thread cancellation
> implementation in libpthread, is changed to call __longjmp_cancel,
> instead of __longjmp.  __longjmp_cancel is a new internal function
> in libc, which is similar to __longjmp, but doesn't restore shadow
> stack register.
> 
> The compatibility longjmp and siglongjmp in libpthread.so are changed
> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
> restore shadow stack register.
> 
> Tested with build-many-glibcs.py.
> 
> 	* nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
> 	handlers after setjmp.
> 	* setjmp/longjmp.c (__libc_longjmp): Don't define alias if
> 	defined.
> 	* sysdeps/unix/sysv/linux/x86/setjmpP.h: Include
> 	<libc-pointer-arith.h>.
> 	(_JUMP_BUF_SIGSET_BITS_PER_WORD): New.
> 	(_JUMP_BUF_SIGSET_NSIG): Changed to 96.
> 	(_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and
> 	_JUMP_BUF_SIGSET_BITS_PER_WORD.
> 	* sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
> 	* sysdeps/x86/__longjmp_cancel.S: New file.
> 	* sysdeps/x86/longjmp.c: Likewise.
> 	* sysdeps/x86/nptl/pt-longjmp.c: Likewise.
> ---
>  nptl/pthread_create.c                 | 17 +++++++--
>  setjmp/longjmp.c                      |  2 +
>  sysdeps/unix/sysv/linux/x86/setjmpP.h | 71 ++++++++++++++++++++++++++++++++---
>  sysdeps/x86/Makefile                  |  4 ++
>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++++
>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++++++++
>  sysdeps/x86/nptl/pt-longjmp.c         | 71 +++++++++++++++++++++++++++++++++++
>  7 files changed, 222 insertions(+), 8 deletions(-)
>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>  create mode 100644 sysdeps/x86/longjmp.c
>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
> 
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index caaf07c134..92c945b12b 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -427,12 +427,23 @@ START_THREAD_DEFN
>       compilers without that support we do use setjmp.  */
>    struct pthread_unwind_buf unwind_buf;
>  
> -  /* No previous handlers.  */
> +  int not_first_call;
> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
> +
> +  /* No previous handlers.  NB: This must be done after setjmp since the
> +     private space in the unwind jump buffer may overlap space used by
> +     setjmp to store extra architecture-specific information which is
> +     never used by the cancellation-specific __libc_unwind_longjmp.
> +
> +     The private space is allowed to overlap because the unwinder never
> +     has to return through any of the jumped-to call frames, and thus
> +     only a minimum amount of saved data need be stored, and for example,
> +     need not include the process signal mask information. This is all
> +     an optimization to reduce stack usage when pushing cancellation
> +     handlers.  */

OK.

>    unwind_buf.priv.data.prev = NULL;
>    unwind_buf.priv.data.cleanup = NULL;
>  
> -  int not_first_call;
> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>    if (__glibc_likely (! not_first_call))
>      {
>        /* Store the new cleanup handler info.  */
> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
> index a2a7065a85..453889e103 100644
> --- a/setjmp/longjmp.c
> +++ b/setjmp/longjmp.c
> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>  }
>  
>  #ifndef __libc_siglongjmp
> +# ifndef __libc_longjmp
>  /* __libc_longjmp is a private interface for cancellation implementation
>     in libpthread.  */
>  strong_alias (__libc_siglongjmp, __libc_longjmp)
> +# endif

OK.

>  weak_alias (__libc_siglongjmp, _longjmp)
>  weak_alias (__libc_siglongjmp, longjmp)
>  weak_alias (__libc_siglongjmp, siglongjmp)
> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> index c0ed767a0d..6b2608453d 100644
> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
> @@ -20,13 +20,72 @@
>  #define	_SETJMPP_H	1
>  
>  #include <bits/types/__sigset_t.h>
> +#include <libc-pointer-arith.h>
>  
> -/* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
> -   Define it to 513 to leave some rooms for future use.  */
> -#define _JUMP_BUF_SIGSET_NSIG	513
> +/* <setjmp/setjmp.h> has
> +
> +struct __jmp_buf_tag
> +  {
> +    __jmp_buf __jmpbuf;
> +    int __mask_was_saved;
> +    __sigset_t __saved_mask;
> +  };
> +
> +   struct __jmp_buf_tag is 32 bits aligned on i386 and is 64 bits
> +   aligned on x32 and x86-64.  __saved_mask is aligned to 32 bits
> +   on i386/x32 without padding and is aligned to 64 bits on x86-64
> +   with 32 bit padding.
> +
> +   and <nptl/descr.h> has
> +
> +struct pthread_unwind_buf
> +{
> +  struct
> +  {
> +    __jmp_buf jmp_buf;
> +    int mask_was_saved;
> +  } cancel_jmp_buf[1];
> +
> +  union
> +  {
> +    void *pad[4];
> +    struct
> +    {
> +      struct pthread_unwind_buf *prev;
> +      struct _pthread_cleanup_buffer *cleanup;
> +      int canceltype;
> +    } data;
> +  } priv;
> +};
> +
> +   struct pthread_unwind_buf is 32 bits aligned on i386 and 64 bits
> +   aligned on x32/x86-64.  cancel_jmp_buf is aligned to 32 bits on
> +   i386 and is aligned to 64 bits on x32/x86-64.

OK.

> +
> +   The pad array in struct pthread_unwind_buf is used by setjmp to save
> +   shadow stack register.  The usable space in __saved_mask for sigset
> +   and shadow stack pointer:
> +   1. i386: The 4x4 byte pad array which can be used for 4 byte shadow
> +   stack pointer and maximum 12 byte sigset.
> +   2. x32: 4 byte padding + the 4x4 byte pad array which can be used
> +   for 8 byte shadow stack pointer and maximum 12 byte sigset.
> +   3. x86-64: The 4x8 byte pad array which can be used for 8 byte
> +   shadow stack pointer and maximum 24 byte sigset.

OK.

> +
> +   NB: We use setjmp in thread cancellation and this saves the shadow
> +   stack register, but __libc_unwind_longjmp doesn't restore the shadow
> +   stack register since cancellation never returns after longjmp.  */
> +
> +/* Number of bits per long.  */
> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
> +/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64. The
> +   common maximum sigset for i386, x32 and x86-64 is 12 bytes (96 bits).
> +   Define it to 96 to leave some rooms for future use.  */
> +#define _JUMP_BUF_SIGSET_NSIG	96
>  /* Number of longs to hold all signals.  */
>  #define _JUMP_BUF_SIGSET_NWORDS \
> -  ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
> +  (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \
> +   / _JUMP_BUF_SIGSET_BITS_PER_WORD)

OK.

>  
>  typedef struct
>    {
> @@ -39,7 +98,9 @@ typedef union
>      struct
>        {
>  	__jmp_buf_sigset_t __saved_mask;
> -	/* Used for shadow stack pointer.  */
> +	/* Used for shadow stack pointer.  NB: Shadow stack pointer
> +	   must have the same alignment as __saved_mask.  Otherwise
> +	   offset of __saved_mask will be changed.  */

s/offset of __saved_mask/offset of __shadow_stack_pointer/g?

OK with that correction.

It is the __shadow_stack_pointer offset which will change if it doesn't
have the same alignment as the __saved_mask. The compiler will insert
padding between the fields.

Perhaps assert there is no padding?

>  	unsigned long int __shadow_stack_pointer;
>        } __saved;
>    } __jmpbuf_arch_t;
> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
> index 0d0326c21a..d25d6f0ae4 100644
> --- a/sysdeps/x86/Makefile
> +++ b/sysdeps/x86/Makefile
> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>  tests += tst-get-cpu-features
>  tests-static += tst-get-cpu-features-static
>  endif
> +
> +ifeq ($(subdir),setjmp)
> +sysdep_routines += __longjmp_cancel
> +endif

OK.

> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
> new file mode 100644
> index 0000000000..b57dbfa376
> --- /dev/null
> +++ b/sysdeps/x86/__longjmp_cancel.S
> @@ -0,0 +1,20 @@
> +/* __longjmp_cancel for x86.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#define __longjmp __longjmp_cancel
> +#include <__longjmp.S>

OK.

> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
> new file mode 100644
> index 0000000000..a53f31e1dd
> --- /dev/null
> +++ b/sysdeps/x86/longjmp.c
> @@ -0,0 +1,45 @@
> +/* __libc_siglongjmp for x86.
> +   Copyright (C) 2018 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#define __libc_longjmp __redirect___libc_longjmp
> +#include <setjmp/longjmp.c>
> +#undef __libc_longjmp
> +
> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
> +     __attribute__ ((__noreturn__)) attribute_hidden;
> +
> +/* Since __libc_longjmp is a private interface for cancellation
> +   implementation in libpthread, there is no need to restore shadow
> +   stack register.  */
> +
> +void
> +__libc_longjmp (sigjmp_buf env, int val)
> +{
> +  /* Perform any cleanups needed by the frames being unwound.  */
> +  _longjmp_unwind (env, val);
> +
> +  if (env[0].__mask_was_saved)
> +    /* Restore the saved signal mask.  */
> +    (void) __sigprocmask (SIG_SETMASK,
> +			  (sigset_t *) &env[0].__saved_mask,
> +			  (sigset_t *) NULL);
> +
> +  /* Call the machine-dependent function to restore machine state
> +     without shadow stack.  */
> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
> +}

OK.

> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
> new file mode 100644
> index 0000000000..6165c7d4a7
> --- /dev/null
> +++ b/sysdeps/x86/nptl/pt-longjmp.c
> @@ -0,0 +1,71 @@
> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
> +   X86 version.
> +   Copyright (C) 18 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <pthreadP.h>
> +#include <jmp_buf-ssp.h>
> +
> +#ifdef __x86_64__
> +# define SHADOW_STACK_POINTER_SIZE 8
> +#else
> +# define SHADOW_STACK_POINTER_SIZE 4
> +#endif
> +
> +/* Assert that the priv field in struct pthread_unwind_buf has space
> +   to store shadow stack pointer.  */
> +_Static_assert ((offsetof (struct pthread_unwind_buf, priv)
> +		 <= SHADOW_STACK_POINTER_OFFSET)
> +		&& ((offsetof (struct pthread_unwind_buf, priv)
> +		     + sizeof (((struct pthread_unwind_buf *) 0)->priv))
> +		    >= (SHADOW_STACK_POINTER_OFFSET
> +			+ SHADOW_STACK_POINTER_SIZE)),
> +		"Shadow stack pointer is not within private storage "
> +		"of pthread_unwind_buf.");

OK. Thanks for testing this out.

> +
> +#include <shlib-compat.h>
> +
> +/* libpthread once had its own longjmp (and siglongjmp alias), though there
> +   was no apparent reason for it.  There is no use in having a separate
> +   symbol in libpthread, but the historical ABI requires it.  For static
> +   linking, there is no need to provide anything here--the libc version
> +   will be linked in.  For shared library ABI compatibility, there must be
> +   longjmp and siglongjmp symbols in libpthread.so.
> +
> +   With an IFUNC resolver, it would be possible to avoid the indirection,
> +   but the IFUNC resolver might run before the __libc_longjmp symbol has
> +   been relocated, in which case the IFUNC resolver would not be able to
> +   provide the correct address.  */
> +
> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
> +
> +static void __attribute__ ((noreturn, used))
> +longjmp_compat (jmp_buf env, int val)
> +{
> +  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
> +     __libc_longjmp is a private interface for cancellation which
> +     doesn't restore shadow stack register.  */
> +  __libc_siglongjmp (env, val);
> +}
> +
> +strong_alias (longjmp_compat, longjmp_alias)
> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
> +
> +strong_alias (longjmp_alias, siglongjmp_alias)
> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
> +
> +#endif
> 

Version 3 looks good to me with the comment correction I mentioned.

Thank you for working with me to improve the quality of the code
and comments, and for balancing the needs of the implementation
against the distribution needs e.g. no added symbol for core CET
enablement.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>
  
H.J. Lu May 2, 2018, 12:45 p.m. UTC | #2
On Tue, May 1, 2018 at 9:43 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 04/21/2018 02:37 PM, H.J. Lu wrote:
>> On Fri, Apr 20, 2018 at 10:28:24PM -0500, Carlos O'Donell wrote:
>>> Lets call this v2. Subject adjusted. Please keep incrementing the version
>>> number on your patches to make the review easier by myself and others.
>>>
>>
>> Done.
>>
>>>> +
>>>> +  /* No previous handlers.  NB: This must be done after setjmp since
>>>> +     the private space in the unwind jump buffer may overlap space
>>>> +     used by setjmp to store extra architecture-specific information
>>>> +     which is never be used by the cancellation-specific
>>>
>>> s/be//g
>>
>>  Done
>>
>>>> +/* Number of bits per long.  */
>>>> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
>>>> +/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64.
>>>> +   Define it to 96 to leave some rooms for future use.  */
>>>
>>> Why 96?
>>>
>>> Sure on x86_64 the cancel jump buf has 4 x void*'s and each is 64-bits,
>>> so you have 128 signals worth of space and then the shadow stack pointer.
>>>
>>> Does this work on x32?
>>>
>>> On x32 you have only 4 void*'s in the private pad.
>>>
>>> Your presently structured sigset_t looks like this:
>>>
>>> typedef union
>>>   {
>>>     __sigset_t __saved_mask_compat;
>>>     struct
>>>       {
>>>         __jmp_buf_sigset_t __saved_mask;
>>>         /* Used for shadow stack pointer.  */
>>>         unsigned long int __shadow_stack_pointer;
>>>       } __saved;
>>>   } __jmpbuf_arch_t;
>>>
>>> Which means you have a sigset_t *before* the __shadow_stack_pointer.
>>>
>>> On x32 to save a 64-bit shadow stack pointer, you'll only have 2 x 32-bit
>>> words left? That's only space for 64 signals?
>>>
>>> Are you counting on one additional 32-bit word of padding between the
>>> __mask_was_saved and the rest of the long arguments?
>>>
>>> If so, then this still needs spelling out in an a comment why 96 signals
>>> works on both i686, x32, and x86_64.
>>>
>>> Also it should be explained if 96 is the *maximum* we can do with the current
>>> layout, or if more is possible. In which case picking 96 is *not* arbitrary.
>>
>>    The pad array in struct pthread_unwind_buf is used by setjmp to save
>>    shadow stack register.  The usable space in __saved_mask for sigset
>>    and shadow stack pointer:
>>    1. i386: The 4x4 byte pad array which can be used for 4 byte shadow
>>    stack pointer and maximum 12 byte sigset.
>>    2. x32: 4 byte padding + the 4x4 byte pad array which can be used
>>    for 8 byte shadow stack pointer and maximum 12 byte sigset.
>>    3. x86-64: The 4x8 byte pad array which can be used for 8 byte
>>    shadow stack pointer and maximum 24 byte sigset.
>>
>> Please see comments in sysdeps/unix/sysv/linux/x86/setjmpP.h for
>> details.
>>
>>> Suggest:
>>> ~~~
>>> The pad array in struct pthread_unwind_buf is used by setjmp to save
>>> the shadow stack register.  Assert that the size of struct pthread_unwind_buf
>>> is no less than the offset of the shadow stack pointer plus the shadow stack
>>> pointer size.
>>>
>>> NB: We use setjmp in thread cancellation and this saves the shadow stack
>>> register, but __libc_unwind_longjmp doesn't restore the shadow stack register
>>> since cancellation never returns after longjmp.
>>> ~~~
>>>
>>
>> Done.
>>
>>>
>>> This assertion is too loose.
>>>
>>> The assertion you need is that the shadow stack pointer itself falls within
>>> the range of the private padding. This would have caught the previous bug
>>> with the rounded up size.
>>>
>>> Please adjust the assertion to be as tight as possible or add new assertions.
>>>
>>> Completely untested, but just to show what I'm thinking:
>>>
>>> _Static_assert ((offsetof (struct pthread_unwind_buf, cancel_jump_buf[0].mask_was_saved)
>>>               + sizeof (int) < SHADOW_STACK_POINTER_OFFSET
>>>               && (offsetof (struct pthread_unwind_buf, priv.pad[4])
>>>                  > (SHADOW_STACK_POINTER_OFFSET + SHADOW_STACK_POINTER_SIZE),
>>>              "Shadow stack pointer is not within private storage of pthread_unwind_buf.");
>>>
>>
>> I used a slighly different assert.
>>
>>>
>>> Looking forward to a v3.
>>>
>>
>> Here it is.
>>
>> H.J.
>> ---
>> The pad array in struct pthread_unwind_buf is used by setjmp to save
>> shadow stack register.  We assert that size of struct pthread_unwind_buf
>> is no less than offset of shadow stack pointer + shadow stack pointer
>> size.
>>
>> Since functions, like LIBC_START_MAIN, START_THREAD_DEFN as well as
>> these with thread cancellation, call setjmp, but never return after
>> __libc_unwind_longjmp, __libc_unwind_longjmp, which is defined as
>> __libc_longjmp on x86, doesn't need to restore shadow stack register.
>> __libc_longjmp, which is a private interface for thread cancellation
>> implementation in libpthread, is changed to call __longjmp_cancel,
>> instead of __longjmp.  __longjmp_cancel is a new internal function
>> in libc, which is similar to __longjmp, but doesn't restore shadow
>> stack register.
>>
>> The compatibility longjmp and siglongjmp in libpthread.so are changed
>> to call __libc_siglongjmp, instead of __libc_longjmp, so that they will
>> restore shadow stack register.
>>
>> Tested with build-many-glibcs.py.
>>
>>       * nptl/pthread_create.c (START_THREAD_DEFN): Clear previous
>>       handlers after setjmp.
>>       * setjmp/longjmp.c (__libc_longjmp): Don't define alias if
>>       defined.
>>       * sysdeps/unix/sysv/linux/x86/setjmpP.h: Include
>>       <libc-pointer-arith.h>.
>>       (_JUMP_BUF_SIGSET_BITS_PER_WORD): New.
>>       (_JUMP_BUF_SIGSET_NSIG): Changed to 96.
>>       (_JUMP_BUF_SIGSET_NWORDS): Changed to use ALIGN_UP and
>>       _JUMP_BUF_SIGSET_BITS_PER_WORD.
>>       * sysdeps/x86/Makefile (sysdep_routines): Add __longjmp_cancel.
>>       * sysdeps/x86/__longjmp_cancel.S: New file.
>>       * sysdeps/x86/longjmp.c: Likewise.
>>       * sysdeps/x86/nptl/pt-longjmp.c: Likewise.
>> ---
>>  nptl/pthread_create.c                 | 17 +++++++--
>>  setjmp/longjmp.c                      |  2 +
>>  sysdeps/unix/sysv/linux/x86/setjmpP.h | 71 ++++++++++++++++++++++++++++++++---
>>  sysdeps/x86/Makefile                  |  4 ++
>>  sysdeps/x86/__longjmp_cancel.S        | 20 ++++++++++
>>  sysdeps/x86/longjmp.c                 | 45 ++++++++++++++++++++++
>>  sysdeps/x86/nptl/pt-longjmp.c         | 71 +++++++++++++++++++++++++++++++++++
>>  7 files changed, 222 insertions(+), 8 deletions(-)
>>  create mode 100644 sysdeps/x86/__longjmp_cancel.S
>>  create mode 100644 sysdeps/x86/longjmp.c
>>  create mode 100644 sysdeps/x86/nptl/pt-longjmp.c
>>
>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>> index caaf07c134..92c945b12b 100644
>> --- a/nptl/pthread_create.c
>> +++ b/nptl/pthread_create.c
>> @@ -427,12 +427,23 @@ START_THREAD_DEFN
>>       compilers without that support we do use setjmp.  */
>>    struct pthread_unwind_buf unwind_buf;
>>
>> -  /* No previous handlers.  */
>> +  int not_first_call;
>> +  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>> +
>> +  /* No previous handlers.  NB: This must be done after setjmp since the
>> +     private space in the unwind jump buffer may overlap space used by
>> +     setjmp to store extra architecture-specific information which is
>> +     never used by the cancellation-specific __libc_unwind_longjmp.
>> +
>> +     The private space is allowed to overlap because the unwinder never
>> +     has to return through any of the jumped-to call frames, and thus
>> +     only a minimum amount of saved data need be stored, and for example,
>> +     need not include the process signal mask information. This is all
>> +     an optimization to reduce stack usage when pushing cancellation
>> +     handlers.  */
>
> OK.
>
>>    unwind_buf.priv.data.prev = NULL;
>>    unwind_buf.priv.data.cleanup = NULL;
>>
>> -  int not_first_call;
>> -  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
>>    if (__glibc_likely (! not_first_call))
>>      {
>>        /* Store the new cleanup handler info.  */
>> diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
>> index a2a7065a85..453889e103 100644
>> --- a/setjmp/longjmp.c
>> +++ b/setjmp/longjmp.c
>> @@ -40,9 +40,11 @@ __libc_siglongjmp (sigjmp_buf env, int val)
>>  }
>>
>>  #ifndef __libc_siglongjmp
>> +# ifndef __libc_longjmp
>>  /* __libc_longjmp is a private interface for cancellation implementation
>>     in libpthread.  */
>>  strong_alias (__libc_siglongjmp, __libc_longjmp)
>> +# endif
>
> OK.
>
>>  weak_alias (__libc_siglongjmp, _longjmp)
>>  weak_alias (__libc_siglongjmp, longjmp)
>>  weak_alias (__libc_siglongjmp, siglongjmp)
>> diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>> index c0ed767a0d..6b2608453d 100644
>> --- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
>> +++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
>> @@ -20,13 +20,72 @@
>>  #define      _SETJMPP_H      1
>>
>>  #include <bits/types/__sigset_t.h>
>> +#include <libc-pointer-arith.h>
>>
>> -/* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
>> -   Define it to 513 to leave some rooms for future use.  */
>> -#define _JUMP_BUF_SIGSET_NSIG        513
>> +/* <setjmp/setjmp.h> has
>> +
>> +struct __jmp_buf_tag
>> +  {
>> +    __jmp_buf __jmpbuf;
>> +    int __mask_was_saved;
>> +    __sigset_t __saved_mask;
>> +  };
>> +
>> +   struct __jmp_buf_tag is 32 bits aligned on i386 and is 64 bits
>> +   aligned on x32 and x86-64.  __saved_mask is aligned to 32 bits
>> +   on i386/x32 without padding and is aligned to 64 bits on x86-64
>> +   with 32 bit padding.
>> +
>> +   and <nptl/descr.h> has
>> +
>> +struct pthread_unwind_buf
>> +{
>> +  struct
>> +  {
>> +    __jmp_buf jmp_buf;
>> +    int mask_was_saved;
>> +  } cancel_jmp_buf[1];
>> +
>> +  union
>> +  {
>> +    void *pad[4];
>> +    struct
>> +    {
>> +      struct pthread_unwind_buf *prev;
>> +      struct _pthread_cleanup_buffer *cleanup;
>> +      int canceltype;
>> +    } data;
>> +  } priv;
>> +};
>> +
>> +   struct pthread_unwind_buf is 32 bits aligned on i386 and 64 bits
>> +   aligned on x32/x86-64.  cancel_jmp_buf is aligned to 32 bits on
>> +   i386 and is aligned to 64 bits on x32/x86-64.
>
> OK.
>
>> +
>> +   The pad array in struct pthread_unwind_buf is used by setjmp to save
>> +   shadow stack register.  The usable space in __saved_mask for sigset
>> +   and shadow stack pointer:
>> +   1. i386: The 4x4 byte pad array which can be used for 4 byte shadow
>> +   stack pointer and maximum 12 byte sigset.
>> +   2. x32: 4 byte padding + the 4x4 byte pad array which can be used
>> +   for 8 byte shadow stack pointer and maximum 12 byte sigset.
>> +   3. x86-64: The 4x8 byte pad array which can be used for 8 byte
>> +   shadow stack pointer and maximum 24 byte sigset.
>
> OK.
>
>> +
>> +   NB: We use setjmp in thread cancellation and this saves the shadow
>> +   stack register, but __libc_unwind_longjmp doesn't restore the shadow
>> +   stack register since cancellation never returns after longjmp.  */
>> +
>> +/* Number of bits per long.  */
>> +#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
>> +/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64. The
>> +   common maximum sigset for i386, x32 and x86-64 is 12 bytes (96 bits).
>> +   Define it to 96 to leave some rooms for future use.  */
>> +#define _JUMP_BUF_SIGSET_NSIG        96
>>  /* Number of longs to hold all signals.  */
>>  #define _JUMP_BUF_SIGSET_NWORDS \
>> -  ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
>> +  (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \
>> +   / _JUMP_BUF_SIGSET_BITS_PER_WORD)
>
> OK.
>
>>
>>  typedef struct
>>    {
>> @@ -39,7 +98,9 @@ typedef union
>>      struct
>>        {
>>       __jmp_buf_sigset_t __saved_mask;
>> -     /* Used for shadow stack pointer.  */
>> +     /* Used for shadow stack pointer.  NB: Shadow stack pointer
>> +        must have the same alignment as __saved_mask.  Otherwise
>> +        offset of __saved_mask will be changed.  */
>
> s/offset of __saved_mask/offset of __shadow_stack_pointer/g?

__saved_mask here is in the public header:

/* Calling environment, plus possibly a saved signal mask.  */
struct __jmp_buf_tag
  {
    /* NOTE: The machine-dependent definitions of `__sigsetjmp'
       assume that a `jmp_buf' begins with a `__jmp_buf' and that
       `__mask_was_saved' follows it.  Do not move these members
       or add others before it.  */
    __jmp_buf __jmpbuf; /* Calling environment.  */
    int __mask_was_saved; /* Saved the signal mask?  */
    __sigset_t __saved_mask; /* Saved signal mask.  */
  };

and sysdeps/unix/sysv/linux/x86/setjmpP.h has

#undef __sigset_t
#define __sigset_t __jmpbuf_arch_t
#include <setjmp.h>
#undef __saved_mask
#define __saved_mask __saved_mask.__saved.__saved_mask

The offset of  __saved_mask must be unchanged, which is checked by
assert in include/setjmp.h:

/* Check if internal fields in jmp_buf have the expected offsets.  */
TEST_OFFSET (struct __jmp_buf_tag, __mask_was_saved,
     MASK_WAS_SAVED_OFFSET);
TEST_OFFSET (struct __jmp_buf_tag, __saved_mask,
     SAVED_MASK_OFFSET);


> OK with that correction.
>
> It is the __shadow_stack_pointer offset which will change if it doesn't
> have the same alignment as the __saved_mask. The compiler will insert
> padding between the fields.
>
> Perhaps assert there is no padding?

I don't think we need it since there is

/* Assert that the priv field in struct pthread_unwind_buf has space
   to store shadow stack pointer.  */
_Static_assert ((offsetof (struct pthread_unwind_buf, priv)
<= SHADOW_STACK_POINTER_OFFSET)
&& ((offsetof (struct pthread_unwind_buf, priv)
     + sizeof (((struct pthread_unwind_buf *) 0)->priv))
    >= (SHADOW_STACK_POINTER_OFFSET
+ SHADOW_STACK_POINTER_SIZE)),
"Shadow stack pointer is not within private storage "
"of pthread_unwind_buf.");

to ensure that offset and size of the priv field are sufficient.

>>       unsigned long int __shadow_stack_pointer;
>>        } __saved;
>>    } __jmpbuf_arch_t;
>> diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
>> index 0d0326c21a..d25d6f0ae4 100644
>> --- a/sysdeps/x86/Makefile
>> +++ b/sysdeps/x86/Makefile
>> @@ -8,3 +8,7 @@ sysdep-dl-routines += dl-get-cpu-features
>>  tests += tst-get-cpu-features
>>  tests-static += tst-get-cpu-features-static
>>  endif
>> +
>> +ifeq ($(subdir),setjmp)
>> +sysdep_routines += __longjmp_cancel
>> +endif
>
> OK.
>
>> diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
>> new file mode 100644
>> index 0000000000..b57dbfa376
>> --- /dev/null
>> +++ b/sysdeps/x86/__longjmp_cancel.S
>> @@ -0,0 +1,20 @@
>> +/* __longjmp_cancel for x86.
>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#define __longjmp __longjmp_cancel
>> +#include <__longjmp.S>
>
> OK.
>
>> diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
>> new file mode 100644
>> index 0000000000..a53f31e1dd
>> --- /dev/null
>> +++ b/sysdeps/x86/longjmp.c
>> @@ -0,0 +1,45 @@
>> +/* __libc_siglongjmp for x86.
>> +   Copyright (C) 2018 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#define __libc_longjmp __redirect___libc_longjmp
>> +#include <setjmp/longjmp.c>
>> +#undef __libc_longjmp
>> +
>> +extern void __longjmp_cancel (__jmp_buf __env, int __val)
>> +     __attribute__ ((__noreturn__)) attribute_hidden;
>> +
>> +/* Since __libc_longjmp is a private interface for cancellation
>> +   implementation in libpthread, there is no need to restore shadow
>> +   stack register.  */
>> +
>> +void
>> +__libc_longjmp (sigjmp_buf env, int val)
>> +{
>> +  /* Perform any cleanups needed by the frames being unwound.  */
>> +  _longjmp_unwind (env, val);
>> +
>> +  if (env[0].__mask_was_saved)
>> +    /* Restore the saved signal mask.  */
>> +    (void) __sigprocmask (SIG_SETMASK,
>> +                       (sigset_t *) &env[0].__saved_mask,
>> +                       (sigset_t *) NULL);
>> +
>> +  /* Call the machine-dependent function to restore machine state
>> +     without shadow stack.  */
>> +  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
>> +}
>
> OK.
>
>> diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
>> new file mode 100644
>> index 0000000000..6165c7d4a7
>> --- /dev/null
>> +++ b/sysdeps/x86/nptl/pt-longjmp.c
>> @@ -0,0 +1,71 @@
>> +/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
>> +   X86 version.
>> +   Copyright (C) 18 Free Software Foundation, Inc.
>> +   This file is part of the GNU C Library.
>> +
>> +   The GNU C Library is free software; you can redistribute it and/or
>> +   modify it under the terms of the GNU Lesser General Public
>> +   License as published by the Free Software Foundation; either
>> +   version 2.1 of the License, or (at your option) any later version.
>> +
>> +   The GNU C Library is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> +   Lesser General Public License for more details.
>> +
>> +   You should have received a copy of the GNU Lesser General Public
>> +   License along with the GNU C Library; if not, see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#include <pthreadP.h>
>> +#include <jmp_buf-ssp.h>
>> +
>> +#ifdef __x86_64__
>> +# define SHADOW_STACK_POINTER_SIZE 8
>> +#else
>> +# define SHADOW_STACK_POINTER_SIZE 4
>> +#endif
>> +
>> +/* Assert that the priv field in struct pthread_unwind_buf has space
>> +   to store shadow stack pointer.  */
>> +_Static_assert ((offsetof (struct pthread_unwind_buf, priv)
>> +              <= SHADOW_STACK_POINTER_OFFSET)
>> +             && ((offsetof (struct pthread_unwind_buf, priv)
>> +                  + sizeof (((struct pthread_unwind_buf *) 0)->priv))
>> +                 >= (SHADOW_STACK_POINTER_OFFSET
>> +                     + SHADOW_STACK_POINTER_SIZE)),
>> +             "Shadow stack pointer is not within private storage "
>> +             "of pthread_unwind_buf.");
>
> OK. Thanks for testing this out.
>
>> +
>> +#include <shlib-compat.h>
>> +
>> +/* libpthread once had its own longjmp (and siglongjmp alias), though there
>> +   was no apparent reason for it.  There is no use in having a separate
>> +   symbol in libpthread, but the historical ABI requires it.  For static
>> +   linking, there is no need to provide anything here--the libc version
>> +   will be linked in.  For shared library ABI compatibility, there must be
>> +   longjmp and siglongjmp symbols in libpthread.so.
>> +
>> +   With an IFUNC resolver, it would be possible to avoid the indirection,
>> +   but the IFUNC resolver might run before the __libc_longjmp symbol has
>> +   been relocated, in which case the IFUNC resolver would not be able to
>> +   provide the correct address.  */
>> +
>> +#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
>> +
>> +static void __attribute__ ((noreturn, used))
>> +longjmp_compat (jmp_buf env, int val)
>> +{
>> +  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
>> +     __libc_longjmp is a private interface for cancellation which
>> +     doesn't restore shadow stack register.  */
>> +  __libc_siglongjmp (env, val);
>> +}
>> +
>> +strong_alias (longjmp_compat, longjmp_alias)
>> +compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
>> +
>> +strong_alias (longjmp_alias, siglongjmp_alias)
>> +compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
>> +
>> +#endif
>>
>
> Version 3 looks good to me with the comment correction I mentioned.
>
> Thank you for working with me to improve the quality of the code
> and comments, and for balancing the needs of the implementation
> against the distribution needs e.g. no added symbol for core CET
> enablement.

Thank you for your valuable feedbacks.  I am checking in my patch now.

> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
>
> --
> Cheers,
> Carlos.
  

Patch

diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index caaf07c134..92c945b12b 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -427,12 +427,23 @@  START_THREAD_DEFN
      compilers without that support we do use setjmp.  */
   struct pthread_unwind_buf unwind_buf;
 
-  /* No previous handlers.  */
+  int not_first_call;
+  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
+
+  /* No previous handlers.  NB: This must be done after setjmp since the
+     private space in the unwind jump buffer may overlap space used by
+     setjmp to store extra architecture-specific information which is
+     never used by the cancellation-specific __libc_unwind_longjmp.
+
+     The private space is allowed to overlap because the unwinder never
+     has to return through any of the jumped-to call frames, and thus
+     only a minimum amount of saved data need be stored, and for example,
+     need not include the process signal mask information. This is all
+     an optimization to reduce stack usage when pushing cancellation
+     handlers.  */
   unwind_buf.priv.data.prev = NULL;
   unwind_buf.priv.data.cleanup = NULL;
 
-  int not_first_call;
-  not_first_call = setjmp ((struct __jmp_buf_tag *) unwind_buf.cancel_jmp_buf);
   if (__glibc_likely (! not_first_call))
     {
       /* Store the new cleanup handler info.  */
diff --git a/setjmp/longjmp.c b/setjmp/longjmp.c
index a2a7065a85..453889e103 100644
--- a/setjmp/longjmp.c
+++ b/setjmp/longjmp.c
@@ -40,9 +40,11 @@  __libc_siglongjmp (sigjmp_buf env, int val)
 }
 
 #ifndef __libc_siglongjmp
+# ifndef __libc_longjmp
 /* __libc_longjmp is a private interface for cancellation implementation
    in libpthread.  */
 strong_alias (__libc_siglongjmp, __libc_longjmp)
+# endif
 weak_alias (__libc_siglongjmp, _longjmp)
 weak_alias (__libc_siglongjmp, longjmp)
 weak_alias (__libc_siglongjmp, siglongjmp)
diff --git a/sysdeps/unix/sysv/linux/x86/setjmpP.h b/sysdeps/unix/sysv/linux/x86/setjmpP.h
index c0ed767a0d..6b2608453d 100644
--- a/sysdeps/unix/sysv/linux/x86/setjmpP.h
+++ b/sysdeps/unix/sysv/linux/x86/setjmpP.h
@@ -20,13 +20,72 @@ 
 #define	_SETJMPP_H	1
 
 #include <bits/types/__sigset_t.h>
+#include <libc-pointer-arith.h>
 
-/* The biggest signal number + 1.  As of kernel 4.14, x86 _NSIG is 64.
-   Define it to 513 to leave some rooms for future use.  */
-#define _JUMP_BUF_SIGSET_NSIG	513
+/* <setjmp/setjmp.h> has
+
+struct __jmp_buf_tag
+  {
+    __jmp_buf __jmpbuf;
+    int __mask_was_saved;
+    __sigset_t __saved_mask;
+  };
+
+   struct __jmp_buf_tag is 32 bits aligned on i386 and is 64 bits
+   aligned on x32 and x86-64.  __saved_mask is aligned to 32 bits
+   on i386/x32 without padding and is aligned to 64 bits on x86-64
+   with 32 bit padding.
+
+   and <nptl/descr.h> has
+
+struct pthread_unwind_buf
+{
+  struct
+  {
+    __jmp_buf jmp_buf;
+    int mask_was_saved;
+  } cancel_jmp_buf[1];
+
+  union
+  {
+    void *pad[4];
+    struct
+    {
+      struct pthread_unwind_buf *prev;
+      struct _pthread_cleanup_buffer *cleanup;
+      int canceltype;
+    } data;
+  } priv;
+};
+
+   struct pthread_unwind_buf is 32 bits aligned on i386 and 64 bits
+   aligned on x32/x86-64.  cancel_jmp_buf is aligned to 32 bits on
+   i386 and is aligned to 64 bits on x32/x86-64.
+
+   The pad array in struct pthread_unwind_buf is used by setjmp to save
+   shadow stack register.  The usable space in __saved_mask for sigset
+   and shadow stack pointer:
+   1. i386: The 4x4 byte pad array which can be used for 4 byte shadow
+   stack pointer and maximum 12 byte sigset.
+   2. x32: 4 byte padding + the 4x4 byte pad array which can be used
+   for 8 byte shadow stack pointer and maximum 12 byte sigset.
+   3. x86-64: The 4x8 byte pad array which can be used for 8 byte
+   shadow stack pointer and maximum 24 byte sigset.
+
+   NB: We use setjmp in thread cancellation and this saves the shadow
+   stack register, but __libc_unwind_longjmp doesn't restore the shadow
+   stack register since cancellation never returns after longjmp.  */
+
+/* Number of bits per long.  */
+#define _JUMP_BUF_SIGSET_BITS_PER_WORD (8 * sizeof (unsigned long int))
+/* The biggest signal number.  As of kernel 4.14, x86 _NSIG is 64. The
+   common maximum sigset for i386, x32 and x86-64 is 12 bytes (96 bits).
+   Define it to 96 to leave some rooms for future use.  */
+#define _JUMP_BUF_SIGSET_NSIG	96
 /* Number of longs to hold all signals.  */
 #define _JUMP_BUF_SIGSET_NWORDS \
-  ((_JUMP_BUF_SIGSET_NSIG - 1 + 7) / (8 * sizeof (unsigned long int)))
+  (ALIGN_UP (_JUMP_BUF_SIGSET_NSIG, _JUMP_BUF_SIGSET_BITS_PER_WORD) \
+   / _JUMP_BUF_SIGSET_BITS_PER_WORD)
 
 typedef struct
   {
@@ -39,7 +98,9 @@  typedef union
     struct
       {
 	__jmp_buf_sigset_t __saved_mask;
-	/* Used for shadow stack pointer.  */
+	/* Used for shadow stack pointer.  NB: Shadow stack pointer
+	   must have the same alignment as __saved_mask.  Otherwise
+	   offset of __saved_mask will be changed.  */
 	unsigned long int __shadow_stack_pointer;
       } __saved;
   } __jmpbuf_arch_t;
diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
index 0d0326c21a..d25d6f0ae4 100644
--- a/sysdeps/x86/Makefile
+++ b/sysdeps/x86/Makefile
@@ -8,3 +8,7 @@  sysdep-dl-routines += dl-get-cpu-features
 tests += tst-get-cpu-features
 tests-static += tst-get-cpu-features-static
 endif
+
+ifeq ($(subdir),setjmp)
+sysdep_routines += __longjmp_cancel
+endif
diff --git a/sysdeps/x86/__longjmp_cancel.S b/sysdeps/x86/__longjmp_cancel.S
new file mode 100644
index 0000000000..b57dbfa376
--- /dev/null
+++ b/sysdeps/x86/__longjmp_cancel.S
@@ -0,0 +1,20 @@ 
+/* __longjmp_cancel for x86.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define __longjmp __longjmp_cancel
+#include <__longjmp.S>
diff --git a/sysdeps/x86/longjmp.c b/sysdeps/x86/longjmp.c
new file mode 100644
index 0000000000..a53f31e1dd
--- /dev/null
+++ b/sysdeps/x86/longjmp.c
@@ -0,0 +1,45 @@ 
+/* __libc_siglongjmp for x86.
+   Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define __libc_longjmp __redirect___libc_longjmp
+#include <setjmp/longjmp.c>
+#undef __libc_longjmp
+
+extern void __longjmp_cancel (__jmp_buf __env, int __val)
+     __attribute__ ((__noreturn__)) attribute_hidden;
+
+/* Since __libc_longjmp is a private interface for cancellation
+   implementation in libpthread, there is no need to restore shadow
+   stack register.  */
+
+void
+__libc_longjmp (sigjmp_buf env, int val)
+{
+  /* Perform any cleanups needed by the frames being unwound.  */
+  _longjmp_unwind (env, val);
+
+  if (env[0].__mask_was_saved)
+    /* Restore the saved signal mask.  */
+    (void) __sigprocmask (SIG_SETMASK,
+			  (sigset_t *) &env[0].__saved_mask,
+			  (sigset_t *) NULL);
+
+  /* Call the machine-dependent function to restore machine state
+     without shadow stack.  */
+  __longjmp_cancel (env[0].__jmpbuf, val ?: 1);
+}
diff --git a/sysdeps/x86/nptl/pt-longjmp.c b/sysdeps/x86/nptl/pt-longjmp.c
new file mode 100644
index 0000000000..6165c7d4a7
--- /dev/null
+++ b/sysdeps/x86/nptl/pt-longjmp.c
@@ -0,0 +1,71 @@ 
+/* ABI compatibility for 'longjmp' and 'siglongjmp' symbols in libpthread ABI.
+   X86 version.
+   Copyright (C) 18 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <pthreadP.h>
+#include <jmp_buf-ssp.h>
+
+#ifdef __x86_64__
+# define SHADOW_STACK_POINTER_SIZE 8
+#else
+# define SHADOW_STACK_POINTER_SIZE 4
+#endif
+
+/* Assert that the priv field in struct pthread_unwind_buf has space
+   to store shadow stack pointer.  */
+_Static_assert ((offsetof (struct pthread_unwind_buf, priv)
+		 <= SHADOW_STACK_POINTER_OFFSET)
+		&& ((offsetof (struct pthread_unwind_buf, priv)
+		     + sizeof (((struct pthread_unwind_buf *) 0)->priv))
+		    >= (SHADOW_STACK_POINTER_OFFSET
+			+ SHADOW_STACK_POINTER_SIZE)),
+		"Shadow stack pointer is not within private storage "
+		"of pthread_unwind_buf.");
+
+#include <shlib-compat.h>
+
+/* libpthread once had its own longjmp (and siglongjmp alias), though there
+   was no apparent reason for it.  There is no use in having a separate
+   symbol in libpthread, but the historical ABI requires it.  For static
+   linking, there is no need to provide anything here--the libc version
+   will be linked in.  For shared library ABI compatibility, there must be
+   longjmp and siglongjmp symbols in libpthread.so.
+
+   With an IFUNC resolver, it would be possible to avoid the indirection,
+   but the IFUNC resolver might run before the __libc_longjmp symbol has
+   been relocated, in which case the IFUNC resolver would not be able to
+   provide the correct address.  */
+
+#if SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_22)
+
+static void __attribute__ ((noreturn, used))
+longjmp_compat (jmp_buf env, int val)
+{
+  /* NB: We call __libc_siglongjmp,  instead of __libc_longjmp, since
+     __libc_longjmp is a private interface for cancellation which
+     doesn't restore shadow stack register.  */
+  __libc_siglongjmp (env, val);
+}
+
+strong_alias (longjmp_compat, longjmp_alias)
+compat_symbol (libpthread, longjmp_alias, longjmp, GLIBC_2_0);
+
+strong_alias (longjmp_alias, siglongjmp_alias)
+compat_symbol (libpthread, siglongjmp_alias, siglongjmp, GLIBC_2_0);
+
+#endif