V7 [PATCH] sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305]

Message ID CAMe9rOoG8E4cgt_HOw-EEAOyQ5cGGKRebmA2YsY=zYefq7cc1A@mail.gmail.com
State Superseded

Commit Message

H.J. Lu Oct. 20, 2020, 6:19 p.m. UTC
  On Tue, Oct 20, 2020 at 7:59 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Tue, Oct 20, 2020 at 2:19 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Mon, Oct 19, 2020 at 02:32:35PM -0700, H.J. Lu via Libc-alpha wrote:
> > > On Mon, Oct 19, 2020 at 8:08 AM Dave Martin <Dave.Martin@arm.com> wrote:
> > > >
> > > > On Thu, Oct 15, 2020 at 04:57:28AM -0700, H.J. Lu via Libc-alpha wrote:
> >
> > [...]
> >
> > > > > > > diff --git a/sysdeps/unix/sysv/linux/x86/dl-minsigstacksize.h b/sysdeps/unix/sysv/linux/x86/dl-minsigstacksize.h
> >
> > [...]
> >
> > > > > > > +/* Emulate AT_MINSIGSTKSZ with XSAVE. */
> > > > > > > +
> > > > > > > +static inline void
> > > > > > > +dl_check_minsigstacksize (void)
> > > > > > > +{
> > > > > > > +  /* NB: Default to a constant MINSIGSTKSZ.  */
> > > > > > > +  _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
> > > > > > > +           "MINSIGSTKSZ is constant");
> > > > > > > +  /* Return if AT_MINSIGSTKSZ is provided by the kernel.  */
> > > > > > > +  if (GLRO(dl_minsigstacksize) != MINSIGSTKSZ)
> > > > > > > +    return;
> > > > > >
> > > > > > Couldn't the kernel actually yield MINSIGSTKSZ or a smaller value, say,
> > > > > > if running on hardware that doesn't have AVX-512?
> > > > > >
> > > > > It is OK for MINSIGSTKSZ > AT_MINSIGSTKSZ.  For _SC_SIGSTKSZ_SOURCE,
> > > > > dynamic MINSIGSTKSZ is defined as sysconf (_SC_SIGSTKSZ) which is
> > > > >
> > > > > MAX (SIGSTKSZ, MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4)
> > > > >
> > > > > and dynamic MINSIGSTKSZ is always > MINSIGSTKSZ.
> > > > >
> > > > > > We might want a separate flag to indicate whether we obtained a value
> > > > > > from the auxv, rather than relying on MINSIGSTKSZ having this magic meaning.
> > > > >
> > > > > AT_MINSIGSTKSZ is the only way for GLRO(dl_minsigstacksize) != MINSIGSTKSZ.
> > > >
> > > > Yes, but reading AT_MINSIGSTKSZ doesn't guarantee that
> > > > GLRO(dl_minsigstacksize) != MINSIGSTKSZ, no?
> > > >
> > > > What if the value reported for AT_MINSIGSTKSZ is actually the same as
> > > > MINSIGSTKSZ?  This could be the case on some arches in future even if
> > > > it's never true today.  But the code here assumes that AT_MINSIGSTKSZ
> > > > wasn't available in this case, and reverts to a fallback guess.
> > >
> > > Since the fallback tracks what the kernel does, if AT_MINSIGSTKSZ
> > > is 2KB, the fallback will be 2KB or slightly larger.
> >
> > Well, I guess that should be safe.  It still feels a bit like it works
> > by accident, but I may be being too paranoid.
>
> Let me work on that.
>

Here is the updated patch to initialize GLRO(dl_minsigstacksize)
to 0 on x86.
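
For illustration only (this is not the patch itself), the idea of using 0 as
a "not provided by the kernel" sentinel, together with the _SC_SIGSTKSZ
formula quoted above, can be sketched as follows.  The helper names
effective_minsigstksz and dynamic_sigstksz are hypothetical, and the fallback
argument stands in for whatever estimate the libc would otherwise compute.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1   /* expose SIGSTKSZ/MINSIGSTKSZ under strict -std=c11 */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: from_auxv models GLRO(dl_minsigstacksize) when it
   is initialized to 0.  A value of 0 means the kernel did not supply
   AT_MINSIGSTKSZ; any other value, including one equal to MINSIGSTKSZ,
   is taken as coming from the kernel.  */
size_t
effective_minsigstksz (size_t from_auxv, size_t fallback)
{
  return from_auxv != 0 ? from_auxv : fallback;
}

/* Hypothetical helper implementing the formula quoted in the thread:
   MAX (SIGSTKSZ, MAX (MINSIGSTKSZ, minsigstksz) * 4).  */
size_t
dynamic_sigstksz (size_t minsigstksz)
{
  size_t legacy_min = (size_t) MINSIGSTKSZ;
  size_t legacy_default = (size_t) SIGSTKSZ;
  size_t min = minsigstksz > legacy_min ? minsigstksz : legacy_min;
  size_t size = min * 4 > legacy_default ? min * 4 : legacy_default;

  /* The result can never be smaller than the minimum it was derived from.  */
  assert (size >= minsigstksz);
  return size;
}
```

With the sentinel, a kernel-reported value that happens to equal MINSIGSTKSZ
is no longer mistaken for "AT_MINSIGSTKSZ was not available".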
  

Comments

H.J. Lu Nov. 3, 2020, 3:06 a.m. UTC | #1
On Tue, Oct 20, 2020 at 11:19 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> Here is the updated patch to initialize GLRO(dl_minsigstacksize)
> to 0 on x86.
>

PING:

https://sourceware.org/pipermail/libc-alpha/2020-October/118843.html
  
Dave Martin Nov. 4, 2020, 4:50 p.m. UTC | #2
On Mon, Nov 02, 2020 at 07:06:13PM -0800, H.J. Lu via Libc-alpha wrote:
> 
> PING:
> 
> https://sourceware.org/pipermail/libc-alpha/2020-October/118843.html

Because the signal context doesn't actually fit in mcontext_t any more
(or, recursively, in ucontext_t), it would make sense to amend some of
the ucontext-related functions at the same time, and force people
towards this updated interface when building with _SC_SIGSTKSZ_SOURCE.

I'm aiming to get a draft proposal onto the list next week, if people
can wait for that before deciding whether the two sets of changes should
be coupled...

Cheers
---Dave
  
H.J. Lu Nov. 4, 2020, 5:48 p.m. UTC | #3
On Wed, Nov 4, 2020 at 8:50 AM Dave Martin <Dave.Martin@arm.com> wrote:
>
> On Mon, Nov 02, 2020 at 07:06:13PM -0800, H.J. Lu via Libc-alpha wrote:
> > PING:
> >
> > https://sourceware.org/pipermail/libc-alpha/2020-October/118843.html
>
> Because the signal context doesn't actually fit in mcontext_t any more
> (or, recursively, in ucontext_t), it would make sense to amend some of
> the ucontext-related functions at the same time, and force people
> towards this updated interface when building with _SC_SIGSTKSZ_SOURCE.

That may be useful.

> I'm aiming to get a draft proposal onto the list next week, if people
> can wait for that before deciding whether the two sets of changes should
> be coupled...
>

Thanks.
  
H.J. Lu Nov. 18, 2020, 2:13 p.m. UTC | #4
On Wed, Nov 4, 2020 at 9:48 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Wed, Nov 4, 2020 at 8:50 AM Dave Martin <Dave.Martin@arm.com> wrote:
> >
> > On Mon, Nov 02, 2020 at 07:06:13PM -0800, H.J. Lu via Libc-alpha wrote:
> > > PING:
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-October/118843.html
> >
> > Because the signal context doesn't actually fit in mcontext_t any more
> > (or, recursively, in ucontext_t), it would make sense to amend some of
> > the ucontext-related functions at the same time, and force people
> > towards this updated interface when building with _SC_SIGSTKSZ_SOURCE.
>
> That may be useful.
>
> > I'm aiming to get a draft proposal onto the list next week, if people
> > can wait for that before deciding whether the two sets of changes should
> > be coupled...
> >
>
> Thanks.

If there are no more comments, I will check it in together with

https://sourceware.org/pipermail/libc-alpha/2020-October/118707.html

on Friday, Nov 20.

Thanks.
  
Zack Weinberg Nov. 18, 2020, 2:25 p.m. UTC | #5
On Wed, Nov 18, 2020 at 9:14 AM H.J. Lu via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> If there are no more comments, I will check it in together with
>
> https://sourceware.org/pipermail/libc-alpha/2020-October/118707.html
>
> on Friday, Nov 20.

I don't think you have consensus for this ABI-breaking change, and I
*certainly* don't think you should check it in without positive
agreement from at least one other person.

zw
  
H.J. Lu Nov. 18, 2020, 2:40 p.m. UTC | #6
On Wed, Nov 18, 2020 at 6:26 AM Zack Weinberg <zackw@panix.com> wrote:
>
> On Wed, Nov 18, 2020 at 9:14 AM H.J. Lu via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> >
> > If there are no more comments, I will check it in together with
> >
> > https://sourceware.org/pipermail/libc-alpha/2020-October/118707.html
> >
> > on Friday, Nov 20.
>
> I don't think you have consensus for this ABI-breaking change, and I
> *certainly* don't think you should check it in without positive
> agreement from at least one other person.
>

My patch adds _SC_SIGSTKSZ_SOURCE.   There are no changes
unless _SC_SIGSTKSZ_SOURCE is defined.

Dave, what do you think?
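
As a rough sketch of what an application could do under this scheme, the
following picks a size for an alternate signal stack by asking sysconf at run
time and falling back to the traditional SIGSTKSZ.  It assumes _SC_SIGSTKSZ
is visible in <unistd.h>, which is guarded with #ifdef because the patch was
still under review at this point; alt_stack_size is a hypothetical helper
name.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1   /* expose SIGSTKSZ/MINSIGSTKSZ under strict -std=c11 */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: return a size suitable for sigaltstack.  */
size_t
alt_stack_size (void)
{
  /* Start from the traditional default.  */
  size_t size = (size_t) SIGSTKSZ;

#ifdef _SC_SIGSTKSZ
  /* Assumption: the libc provides _SC_SIGSTKSZ.  sysconf returns -1 if
     the parameter is not supported, so only accept a larger, positive
     value.  */
  long dyn = sysconf (_SC_SIGSTKSZ);
  if (dyn > 0 && (size_t) dyn > size)
    size = (size_t) dyn;
#endif

  assert (size >= (size_t) MINSIGSTKSZ);
  return size;
}
```

Querying sysconf directly like this does not depend on which feature test
macro, if any, ends up changing what SIGSTKSZ expands to.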
  
Zack Weinberg Nov. 18, 2020, 3:12 p.m. UTC | #7
On Wed, Nov 18, 2020 at 9:40 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> On Wed, Nov 18, 2020 at 6:26 AM Zack Weinberg <zackw@panix.com> wrote:
> >
> > On Wed, Nov 18, 2020 at 9:14 AM H.J. Lu via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> > >
> > > If there are no more comments, I will check it in together with
> > >
> > > https://sourceware.org/pipermail/libc-alpha/2020-October/118707.html
> > >
> > > on Friday, Nov 20.
> >
> > I don't think you have consensus for this ABI-breaking change, and I
> > *certainly* don't think you should check it in without positive
> > agreement from at least one other person.
> >
>
> My patch adds _SC_SIGSTKSZ_SOURCE.   There are no changes
> unless _SC_SIGSTKSZ_SOURCE is defined.

OK, but do we have consensus on adding a new _*_SOURCE macro and a new
sysconf parameter ahead of their adoption by POSIX?  This is not a
rhetorical question, I have not been following that part of the
discussion closely enough to tell.

zw
  
H.J. Lu Nov. 18, 2020, 3:17 p.m. UTC | #8
On Wed, Nov 18, 2020 at 7:13 AM Zack Weinberg <zackw@panix.com> wrote:
>
> On Wed, Nov 18, 2020 at 9:40 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Wed, Nov 18, 2020 at 6:26 AM Zack Weinberg <zackw@panix.com> wrote:
> > >
> > > On Wed, Nov 18, 2020 at 9:14 AM H.J. Lu via Libc-alpha
> > > <libc-alpha@sourceware.org> wrote:
> > > >
> > > > If there are no more comments, I will check it in together with
> > > >
> > > > https://sourceware.org/pipermail/libc-alpha/2020-October/118707.html
> > > >
> > > > on Friday, Nov 20.
> > >
> > > I don't think you have consensus for this ABI-breaking change, and I
> > > *certainly* don't think you should check it in without positive
> > > agreement from at least one other person.
> > >
> >
> > My patch adds _SC_SIGSTKSZ_SOURCE.   There are no changes
> > unless _SC_SIGSTKSZ_SOURCE is defined.
>
> OK, but do we have consensus on adding a new _*_SOURCE macro and a new
> sysconf parameter ahead of their adoption by POSIX?  This is not a

This is what Dave and I came up with.  This approach doesn't change ABI
unless it is requested by developers.

> rhetorical question, I have not been following that part of the
> discussion closely enough to tell.
>
> zw
  
Florian Weimer Nov. 18, 2020, 3:20 p.m. UTC | #9
* H. J. Lu via Libc-alpha:

> On Wed, Nov 18, 2020 at 7:13 AM Zack Weinberg <zackw@panix.com> wrote:
>>
>> On Wed, Nov 18, 2020 at 9:40 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>> > On Wed, Nov 18, 2020 at 6:26 AM Zack Weinberg <zackw@panix.com> wrote:
>> > >
>> > > On Wed, Nov 18, 2020 at 9:14 AM H.J. Lu via Libc-alpha
>> > > <libc-alpha@sourceware.org> wrote:
>> > > >
>> > > > If there are no more comments, I will check it in together with
>> > > >
>> > > > https://sourceware.org/pipermail/libc-alpha/2020-October/118707.html
>> > > >
>> > > > on Friday, Nov 20.
>> > >
>> > > I don't think you have consensus for this ABI-breaking change, and I
>> > > *certainly* don't think you should check it in without positive
>> > > agreement from at least one other person.
>> > >
>> >
>> > My patch adds _SC_SIGSTKSZ_SOURCE.   There are no changes
>> > unless _SC_SIGSTKSZ_SOURCE is defined.
>>
>> OK, but do we have consensus on adding a new _*_SOURCE macro and a new
>> sysconf parameter ahead of their adoption by POSIX?  This is not a
>
> This is what Dave and I came up with.  This approach doesn't change ABI
> unless it is requested by developers.

Did you mean ABI?

I think we should use _GNU_SOURCE. We can add another feature macro if
it's adopted by POSIX.
  
Dave Martin Nov. 18, 2020, 5:04 p.m. UTC | #10
On Wed, Nov 18, 2020 at 04:20:44PM +0100, Florian Weimer wrote:
> * H. J. Lu via Libc-alpha:
> 
> > On Wed, Nov 18, 2020 at 7:13 AM Zack Weinberg <zackw@panix.com> wrote:
> >>
> >> On Wed, Nov 18, 2020 at 9:40 AM H.J. Lu <hjl.tools@gmail.com> wrote:
> >> > On Wed, Nov 18, 2020 at 6:26 AM Zack Weinberg <zackw@panix.com> wrote:
> >> > >
> >> > > On Wed, Nov 18, 2020 at 9:14 AM H.J. Lu via Libc-alpha
> >> > > <libc-alpha@sourceware.org> wrote:
> >> > > >
> >> > > > If there are no more comments, I will check it in together with
> >> > > >
> >> > > > https://sourceware.org/pipermail/libc-alpha/2020-October/118707.html
> >> > > >
> >> > > > on Friday, Nov 20.
> >> > >
> >> > > I don't think you have consensus for this ABI-breaking change, and I
> >> > > *certainly* don't think you should check it in without positive
> >> > > agreement from at least one other person.
> >> > >
> >> >
> >> > My patch adds _SC_SIGSTKSZ_SOURCE.   There are no changes
> >> > unless _SC_SIGSTKSZ_SOURCE is defined.
> >>
> >> OK, but do we have consensus on adding a new _*_SOURCE macro and a new
> >> sysconf parameter ahead of their adoption by POSIX?  This is not a
> >
> > This is what Dave and I came up with.  This approach doesn't change ABI
> > unless it is requested by developers.
> 
> Did you mean ABI?
> 
> I think we should use _GNU_SOURCE. We can add another feature macro if
> it's adopted by POSIX.

Having _GNU_SOURCE enable this feature will break existing source code.

It probably should go under _GNU_SOURCE in the long term, but is there
usually a transitional period when the change is opt-in via some feature
macro?

Otherwise, it feels like we would have a chicken-and-egg problem, with
nobody wanting to be the first adopter.


And also, what, if anything, should be done about the ucontext API?

The signal frame on a couple of arches now overruns the original
mcontext_t definition.

Ucontext seems to be in quite a broken state and is hard to fix because
of the lack of extensibility in the design.  Signals have also diverged
from setcontext() etc. -- some arches don't even encode the same data in
there, or put things in the same places between the two APIs.

I have some thoughts on what a better interface might look like --
basically separating the signal ucontext_t type from the setcontext()/
getcontext() etc. type, and providing accessors for the architectural
register state rather than just having a fixed struct definition for
mcontext_t.

But, there also may not be a lot of appetite for such a change, and
I can't see how it could be backwards compatible.

I can elaborate if people think it's worth discussing.

Cheers
---Dave
  
Florian Weimer Nov. 18, 2020, 5:35 p.m. UTC | #11
* Dave Martin:

> Having _GNU_SOURCE enable this feature will break existing source code.

Does this matter?

The code is already broken on a quickly increasing number of machines,
so it needs fixing anyway.  A compile-time error is probably
preferable to an obscure run-time failure.
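
To make the trade-off concrete: assuming SIGSTKSZ expands to a sysconf call
as described earlier in the thread, a declaration such as
"static char stack[SIGSTKSZ];" would stop compiling, because the size is no
longer a constant.  A minimal sketch of the run-time alternative follows;
install_alt_stack is a hypothetical helper name, and the stack is
deliberately never freed since it stays registered for the life of the
thread.

```c
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1   /* expose sigaltstack/SIGSTKSZ under strict -std=c11 */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocate and register an alternate signal stack.
   Returns 0 on success and -1 on failure.  */
int
install_alt_stack (void)
{
  /* Works whether SIGSTKSZ is a constant or a run-time expression.  */
  size_t size = (size_t) SIGSTKSZ;
  stack_t ss;

  assert (size >= (size_t) MINSIGSTKSZ);

  ss.ss_sp = malloc (size);
  if (ss.ss_sp == NULL)
    return -1;
  ss.ss_size = size;
  ss.ss_flags = 0;

  if (sigaltstack (&ss, NULL) != 0)
    {
      free (ss.ss_sp);
      return -1;
    }
  return 0;
}
```

Code written this way builds in either mode, which is the kind of fix the
compile-time error would push existing sources towards.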

> It probably should go under _GNU_SOURCE in the long term, but is there
> usually a transitional period when the change is opt-in via some feature
> macro?

Not always.  See the iszero and other macros in <math.h>, that also
broke existing sources (largely C++, so we worked around it by using
C++ features instead of macros).  There are also many older examples.

> I have some thoughts on what a better interface might look like --
> basically separating the signal ucontext_t type from the setcontext()/
> getcontext() etc. type, and providing accessors for the architectural
> register state rather than just having a fixed struct definition for
> mcontext_t.
>
> But, there also may not be a lot of appetite for such a change, and
> I can't see how it could be backwards compatible.
>
> I can elaborate if people think it's worth discussing.

I think Rich Felker wants to copy signal contexts around to implement
critical sections that can't be interrupted by a signal handler, I
think that would need this fully fixed.

But this is somewhat separate from fixing stack sizes to accommodate
kernel and hardware needs.

By the way, something that I tried to raise in the past, but wasn't
good at it: In the future, we may need a mechanism to reduce the
kernel stack size usage for legacy binaries, perhaps using
virtualization.  The last time I looked at this, the signal context
did not actually reduce when AVX-512 support was masked in a guest, I
think.  (Okay, this is largely for H.J.'s benefit.)  This also applies
to ld.so and its context save operation.  Fortunately, not many people
have run into this compatibility issue so far.  The case I remember
was an application that broke because it assumed zeros on the stack,
and the XSAVE trampoline wrote some non-zero bits there because it
reached much deeper into the stack.  But the user was happy when we
found the root cause and was able to add the missing initializer to
their code, so all was good.
  
H.J. Lu Nov. 18, 2020, 5:48 p.m. UTC | #12
On Wed, Nov 18, 2020 at 9:35 AM Florian Weimer <fw@deneb.enyo.de> wrote:
>
> * Dave Martin:
>
> > Having _GNU_SOURCE enable this feature will break existing source code.
>
> Does this matter?
>
> The code is already broken on a quickly increasing number of machines,
> so it needs fixing anyway.  A compile-time error is probably
> preferable to an obscure run-time failure.

I can see points on both sides.  Either approach is better than the status quo.
Can we make a decision for glibc 2.33?

> > It probably should go under _GNU_SOURCE in the long term, but is there
> > usually a transitional period when the change is opt-in via some feature
> > macro?
>
> Not always.  See the iszero and other macros in <math.h>, that also
> broke existing sources (largely C++, so we worked around it by using
> C++ features instead of macros).  There are also many older examples.
>
> > I have some thoughts on what a better interface might look like --
> > basically separating the signal ucontext_t type from the setcontext()/
> > getcontext() etc. type, and providing accessors for the architectural
> > register state rather than just having a fixed struct definition for
> > mcontext_t.
> >
> > But, there also may not be a lot of appetite for such a change, and
> > I can't see how it could be backwards compatible.
> >
> > I can elaborate if people think it's worth discussing.
>
> I think Rich Felker wants to copy signal contexts around to implement
> critical sections that can't be interrupted by a signal handler, I
> think that would need this fully fixed.
>
> But this is somewhat separate from fixing stack sizes to accommodate
> kernel and hardware needs.
>
> By the way, something that I tried to raise in the past, but wasn't
> good at it: In the future, we may need a mechanism to reduce the
> kernel stack size usage for legacy binaries, perhaps using
> virtualization.  The last time I looked at this, the signal context
> did not actually reduce when AVX-512 support was masked in a guest, I
> think.  (Okay, this is largely for H.J.'s benefit.)  This also applies
> to ld.so and its context save operation.  Fortunately, not many people
> have run into this compatibility issue so far.  The case I remember
> was an application that broke because it assumed zeros on the stack,
> and the XSAVE trampoline wrote some non-zero bits there because it
> reached much deeper into the stack.  But the user was happy when we
> found the root cause and was able to add the missing initializer to
> their code, so all was good.

We, Intel, are investigating different approaches to improve signal handling
on machines with large XSAVE states.  One possibility is a fast signal handler,
where the programmer tells the kernel to save and restore only a subset of the
XSAVE states in the signal handler, and the compiler is updated to enforce that.
  
Dave Martin Nov. 18, 2020, 6:09 p.m. UTC | #13
On Wed, Nov 18, 2020 at 06:35:00PM +0100, Florian Weimer wrote:
> * Dave Martin:
> 
> > Having _GNU_SOURCE enable this feature will break existing source code.
> 
> Does this matter?
> 
> The code is already broken on a quickly increasing number of machines,
> so it needs fixing anyway.  A compile-time error is probably
> preferable to an obscure run-time failure.

Fair point, but I didn't want to be the one making it :)

If people are happy to go this way, I don't see a problem with it.
Distro maintainers may shout a bit while they deal with the fallout of
that glibc upgrade though.


> > It probably should go under _GNU_SOURCE in the long term, but is there
> > usually a transitional period when the change is opt-in via some feature
> > macro?
> 
> Not always.  See the iszero and other macros in <math.h>, that also
> broke existing sources (largely C++, so we worked around it by using
> C++ features instead of macros).  There are also many older examples.

I see.  Well, I guess that's not my call.


> > I have some thoughts on what a better interface might look like --
> > basically separating the signal ucontext_t type from the setcontext()/
> > getcontext() etc. type, and providing accessors for the architectural
> > register state rather than just having a fixed struct definition for
> > mcontext_t.
> >
> > But, there also may not be a lot of appetite for such a change, and
> > I can't see how it could be backwards compatible.
> >
> > I can elaborate if people think it's worth discussing.
> 
> I think Rich Felker wants to copy signal contexts around to implement
> critical sections that can't be interrupted by a signal handler, I
> think that would need this fully fixed.

Is rseq a more suitable way to do that sort of thing on new-ish Linux?
I guess a fallback may be needed for older / other kernels though.


I'm happy to follow up on here with my ucontext proposal, but it's quite
a big change.  I wasn't sure people would really be up for it...


> But this is somewhat separate from fixing stack sizes to accommodate
> kernel and hardware needs.

I was wondering whether we could feed in ucontext changes on the back of
the MINSIGSTKSZ stuff, so that if people want one they must adopt both.

Probably best to deal with it independently though -- as you say, the
ucontext breakage already exists.


> By the way, something that I tried to raise in the past, but wasn't
> good at it: In the future, we may need a mechanism to reduce the
> kernel stack size usage for legacy binaries, perhaps using
> virtualization.  The last time I looked at this, the signal context
> did not actually reduce when AVX-512 support was masked in a guest, I
> think.  (Okay, this is largely for H.J.'s benefit.)  This also applies
> to ld.so and its context save operation.  Fortunately, not many people
> have run into this compatibility issue so far.  The case I remember
> was an application that broke because it assumed zeros on the stack,
> and the XSAVE trampoline wrote some non-zero bits there because it
> reached much deeper into the stack.  But the user was happy when we
> found the root cause and was able to add the missing initializer to
> their code, so all was good.

For aarch64 an explicit prctl()/sysctl opt-in is needed to enable jumbo
vector registers before you see oversized signal frames, though I don't
think there is a similar control for AVX-512.

Even on aarch64, this interface is not very friendly though.  It might
be better to have some ELF attribute that ld.so or the libc startup can
arbitrate on and twiddle the appropriate switches.

Cheers
---Dave
  
Szabolcs Nagy Nov. 19, 2020, 2:59 p.m. UTC | #14
The 11/18/2020 18:09, Dave Martin via Libc-alpha wrote:
> On Wed, Nov 18, 2020 at 06:35:00PM +0100, Florian Weimer wrote:
> > * Dave Martin:
> > > I have some thoughts on what a better interface might look like --
> > > basically separating the signal ucontext_t type from the setcontext()/
> > > getcontext() etc. type, and providing accessors for the architectural
> > > register state rather than just having a fixed struct definition for
> > > mcontext_t.
> > >
> > > But, there also may not be a lot of appetite for such a change, and
> > > I can't see how it could be backwards compatible.
> > >
> > > I can elaborate if people think it's worth discussing.
> > 
> > I think Rich Felker wants to copy signal contexts around to implement
> > critical sections that can't be interrupted by a signal handler, I
> > think that would need this fully fixed.
> 
> Is rseq a more suitable way to do that sort of thing on new-ish Linux?
> I guess a fallback may be needed for older / other kernels though.

rseq does not help with libc critical sections:

the point is not to restart the critical section
(which would require no side effect or mechanisms
to roll side effects back and that the section is
entirely written in asm between begin/end labels
so the kernel knows when the section is left),

but to let the critical section with all its side
effects complete and delay the signal handler
until then. (the slow and easy way to do this is
masking signals using syscalls around critical
sections, a fast solution needs signal wrapping
and saving the sigcontext.)

for example the entire malloc call can be a critical
section and an incoming signal delayed until malloc
completes. such solution allows hiding all libc
internal inconsistent state from user code so async
signal handlers can call all libc apis.
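As a rough illustration of the "slow and easy" variant described above, here is a minimal sketch that brackets a critical section with two mask-changing system calls. The helper names (run_masked, critical_body, demo_masked_section) are hypothetical and not taken from any libc; a threaded program would use pthread_sigmask() instead of sigprocmask().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical flags standing in for libc-internal state. */
static volatile sig_atomic_t handler_runs;
static volatile sig_atomic_t in_critical;
static volatile sig_atomic_t seen_inconsistent;

static void on_usr1(int sig)
{
    (void)sig;
    handler_runs++;
    if (in_critical)            /* would mean the handler saw half-done state */
        seen_inconsistent = 1;
}

/* Run fn(arg) with all blockable signals masked: two syscalls per section. */
static int run_masked(void (*fn)(void *), void *arg)
{
    sigset_t all, old;

    sigfillset(&all);
    if (sigprocmask(SIG_SETMASK, &all, &old) != 0)
        return -1;
    fn(arg);
    /* Pending signals are delivered once the old mask is restored. */
    return sigprocmask(SIG_SETMASK, &old, NULL);
}

static void critical_body(void *arg)
{
    (void)arg;
    in_critical = 1;
    raise(SIGUSR1);             /* stays pending: SIGUSR1 is blocked here */
    in_critical = 0;
}

/* Returns 0 if the handler ran exactly once, and only after the section. */
static int demo_masked_section(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (run_masked(critical_body, NULL) != 0)
        return -1;
    return (handler_runs == 1 && !seen_inconsistent) ? 0 : 1;
}
```

The cost is the pair of sigprocmask() calls on every entry and exit, which is what the faster wrapping approach tries to avoid.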

> For aarch64 an explicit prctl()/sysctl opt-in is needed to enable jumbo
> vector registers before you see oversized signal frames, though I don't
> think there is a similar control for AVX-512.
> 
> Even on aarch64, this interface is not very friendly though.  It might
> be better to have some ELF attribute that ld.so or the libc startup can
> arbitrate on and twiddle the appropriate switches.

we can probably mark binaries if they are large
vector length compatible. (but the incompatible
binaries have to be hunted down manually and all
we can do with them at dlopen time is to give a
nice dlerror in case the vlength got increased)
  
H.J. Lu Nov. 19, 2020, 3:10 p.m. UTC | #15
On Thu, Nov 19, 2020 at 7:00 AM Szabolcs Nagy via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> The 11/18/2020 18:09, Dave Martin via Libc-alpha wrote:
> > On Wed, Nov 18, 2020 at 06:35:00PM +0100, Florian Weimer wrote:
> > > * Dave Martin:
> > > > I have some thoughts on what a better interface might look like --
> > > > basically separating the signal ucontext_t type from the setcontext()/
> > > > getcontext() etc. type, and providing accessors for the architectural
> > > > register state rather than just having a fixed struct definition for
> > > > mcontext_t.
> > > >
> > > > But, there also may not be a lot of appetite for such a change, and
> > > > I can't see how it could be backwards compatible.
> > > >
> > > > I can elaborate if people think it's worth discussing.
> > >
> > > I think Rich Felker wants to copy signal contexts around to implement
> > > critical sections that can't be interrupted by a signal handler, I
> > > think that would need this fully fixed.
> >
> > Is rseq a more suitable way to do that sort of thing on new-ish Linux?
> > I guess a fallback may be needed for older / other kernels though.
>
> rseq does not help with libc critical sections:
>
> the point is not to restart the critical section
> (which would require no side effect or mechanisms
> to roll side effects back and that the section is
> entirely written in asm between begin/end labels
> so the kernel knows when the section is left),
>
> but to let the critical section with all its side
> effects complete and delay the signal handler
> until then. (the slow and easy way to do this is
> masking signals using syscalls around critical
> sections, a fast solution needs signal wrapping
> and saving the sigcontext.)
>
> for example the entire malloc call can be a critical
> section and an incoming signal delayed until malloc
> completes. such solution allows hiding all libc
> internal inconsistent state from user code so async
> signal handlers can call all libc apis.
>
> > For aarch64 an explicit prctl()/sysctl opt-in is needed to enable jumbo
> > vector registers before you see oversized signal frames, though I don't
> > think there is a similar control for AVX-512.
> >
> > Even on aarch64, this interface is not very friendly though.  It might
> > be better to have some ELF attribute that ld.so or the libc startup can
> > arbitrate on and twiddle the appropriate switches.
>
> we can probably mark binaries if they are large
> vector length compatible. (but the incompatible
> binaries have to be hunted down manually and all
> we can do with them at dlopen time is to give a
> nice dlerror in case the vlength got increased)

We can mark such binaries with GNU property.
  
Zack Weinberg Nov. 19, 2020, 3:39 p.m. UTC | #16
On Thu, Nov 19, 2020 at 10:00 AM Szabolcs Nagy via Libc-alpha
<libc-alpha@sourceware.org> wrote:
...
> but to let the critical section with all its side
> effects complete and delay the signal handler
> until then. (the slow and easy way to do this is
> masking signals using syscalls around critical
> sections, a fast solution needs signal wrapping
> and saving the sigcontext.)
>
> for example the entire malloc call can be a critical
> section and an incoming signal delayed until malloc
> completes.

I read this and I'm wondering how impractical it would be to invent a
way to perform pthread_sigmask() operations without a system call.
Abstractly, the signal mask for each thread _could_ be placed in user
space at an ABI-specified location where the kernel can find it
(bottom of the thread stack, maybe, or an offset from the TLS base);
this doesn't make anything new possible (the kernel will just continue
to ignore the bits for SIGKILL, SIGSEGV, etc.) so it wouldn't be a
privilege violation.

Seems like that would be much easier than copying sigcontexts around,
and less likely to break code that thinks it knows stuff about signal
frames.

zw
  
Florian Weimer Nov. 19, 2020, 3:51 p.m. UTC | #17
* Zack Weinberg:

> I read this and I'm wondering how impractical it would be to invent a
> way to perform pthread_sigmask() operations without a system call.
> Abstractly, the signal mask for each thread _could_ be placed in user
> space at an ABI-specified location where the kernel can find it
> (bottom of the thread stack, maybe, or an offset from the TLS base);
> this doesn't make anything new possible (the kernel will just continue
> to ignore the bits for SIGKILL, SIGSEGV, etc.) so it wouldn't be a
> privilege violation.

Yes, this came up in the KTLS context.  The kernel also needs to set a
flag that indicates that unblocking requires a system call (so that a
pending signal can be delivered).  But it looks quite feasible to
implement.
  
Rich Felker Nov. 19, 2020, 4:16 p.m. UTC | #18
On Thu, Nov 19, 2020 at 10:39:47AM -0500, Zack Weinberg wrote:
> On Thu, Nov 19, 2020 at 10:00 AM Szabolcs Nagy via Libc-alpha
> <libc-alpha@sourceware.org> wrote:
> ....
> > but to let the critical section with all its side
> > effects complete and delay the signal handler
> > until then. (the slow and easy way to do this is
> > masking signals using syscalls around critical
> > sections, a fast solution needs signal wrapping
> > and saving the sigcontext.)
> >
> > for example the entire malloc call can be a critical
> > section and an incoming signal delayed until malloc
> > completes.
> 
> I read this and I'm wondering how impractical it would be to invent a
> way to perform pthread_sigmask() operations without a system call.
> 
> Abstractly, the signal mask for each thread _could_ be placed in user
> space at an ABI-specified location where the kernel can find it

I've worked this out before, and it's mostly simple, but might have
some corner cases. Archs without an atomic (in single threaded sense)
64-bit (128-bit for MIPS) store operation can't update the mask
without tearing (a signal arriving between the partial writes could
see an inconsistent signal mask), so they couldn't use a naive form.
Having two slots you can alternate between (update the inactive one
then write the selector to switch which one is active) should work
though. There are probably lots of other minor issues like this but I
think they're all solvable.
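The two-slot idea can be sketched in plain C as below. This only simulates the reader side in user space: no kernel ABI for a user-space signal mask exists, and the struct layout, the MASK_WORDS value and both function names are assumptions made for illustration.

```c
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>

#define MASK_WORDS 2    /* assumption: a 128-bit mask held as two 64-bit words */

struct user_sigmask {
    uint64_t slot[2][MASK_WORDS];   /* one active copy, one scratch copy */
    volatile sig_atomic_t active;   /* index of the slot a reader must use */
};

/* Writer: fill the inactive slot, then flip the selector with one store. */
static void user_sigmask_publish(struct user_sigmask *m, const uint64_t *words)
{
    int next = !m->active;

    for (int i = 0; i < MASK_WORDS; i++)
        m->slot[next][i] = words[i];
    /* Keep the compiler from moving the slot writes past the selector. */
    atomic_signal_fence(memory_order_release);
    m->active = next;
}

/* Reader (stand-in for the kernel): pick the selector, then copy that slot. */
static void user_sigmask_snapshot(const struct user_sigmask *m, uint64_t *out)
{
    int cur = m->active;

    atomic_signal_fence(memory_order_acquire);
    for (int i = 0; i < MASK_WORDS; i++)
        out[i] = m->slot[cur][i];
}
```

A reader that arrives while the inactive slot is only partly written still sees the previous, complete mask, because the selector has not been flipped yet.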

> (bottom of the thread stack, maybe, or an offset from the TLS base);
> this doesn't make anything new possible (the kernel will just continue
> to ignore the bits for SIGKILL, SIGSEGV, etc.) so it wouldn't be a
> privilege violation.

It could either be managed by the vdso, or at an address registered by
a syscall (similar to robust_list) rather than some hard-coded
contract.

But in any case, even if this were added, it wouldn't meet the needs
for the use case I proposed in musl. It doesn't solve the execve race with
abort (and can't, since it's just equivalent to optimizing signal
blocking we can already do, not introducing any new capability) and it
would not be available on past kernels, only future ones.

> Seems like that would be much easier than copying sigcontexts around,
> and less likely to break code that thinks it knows stuff about signal
> frames.

You don't need to copy sigcontexts around. That's a misunderstanding of
how this works. It's just the siginfo that needs to be saved until the
deferred handler runs. The sigcontext passed to the handler is then
the synchronous context at the time it runs (which becomes the logical
time of signal delivery), not the earlier async context, and can
thereby be pretty much entirely uninitialized except for return
address and stack pointer. Everything else is a representation of
internal state from a libc function and can just be all zeros or
whatever.
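A much simplified sketch of that deferral follows, assuming a single pending slot and a plain flag for "inside a critical section". All names here are hypothetical and this is not musl's implementation: the real design needs more machinery in the wrapper, and this sketch passes NULL where the description above has a mostly empty synchronous context.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

static volatile sig_atomic_t in_section;  /* set while a critical section runs */
static volatile sig_atomic_t deferred;    /* a signal arrived during the section */
static siginfo_t saved_info;              /* the only state carried over */
static void (*user_handler)(int, siginfo_t *, void *);

/* Installed in place of the application's handler. */
static void wrapper(int sig, siginfo_t *info, void *uc)
{
    if (in_section) {
        saved_info = *info;               /* keep siginfo, drop the context */
        atomic_signal_fence(memory_order_release);
        deferred = 1;
        return;
    }
    user_handler(sig, info, uc);
}

static int install_deferrable(int sig, void (*h)(int, siginfo_t *, void *))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = wrapper;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    user_handler = h;
    return sigaction(sig, &sa, NULL);
}

static void section_enter(void) { in_section = 1; }

static void section_leave(void)
{
    in_section = 0;
    if (deferred) {
        atomic_signal_fence(memory_order_acquire);
        siginfo_t info = saved_info;
        deferred = 0;
        /* Logical delivery time is now; no async context is replayed. */
        user_handler(info.si_signo, &info, NULL);
    }
}

/* Demo: the handler must not observe the inside of the section. */
static volatile sig_atomic_t calls, ran_inside;

static void count_handler(int sig, siginfo_t *info, void *uc)
{
    (void)sig; (void)info; (void)uc;
    calls++;
    if (in_section)
        ran_inside = 1;
}

static int demo_deferral(void)
{
    if (install_deferrable(SIGUSR1, count_handler) != 0)
        return -1;
    section_enter();
    raise(SIGUSR1);                       /* wrapper runs, handler is deferred */
    int during = calls;
    section_leave();                      /* deferred handler runs here */
    return (during == 0 && calls == 1 && !ran_inside) ? 0 : 1;
}
```

Entering and leaving the section costs two plain stores and a load, with no system call on the fast path.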

Rich
  
Dave Martin Nov. 19, 2020, 4:37 p.m. UTC | #19
On Thu, Nov 19, 2020 at 02:59:34PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> The 11/18/2020 18:09, Dave Martin via Libc-alpha wrote:
> > On Wed, Nov 18, 2020 at 06:35:00PM +0100, Florian Weimer wrote:
> > > * Dave Martin:
> > > > I have some thoughts on what a better interface might look like --
> > > > basically separating the signal ucontext_t type from the setcontext()/
> > > > getcontext() etc. type, and providing accessors for the architectural
> > > > register state rather than just having a fixed struct definition for
> > > > mcontext_t.
> > > >
> > > > But, there also may not be a lot of appetite for such a change, and
> > > > I can't see how it could be backwards compatible.
> > > >
> > > > I can elaborate if people think it's worth discussing.
> > > 
> > > I think Rich Felker wants to copy signal contexts around to implement
> > > critical sections that can't be interrupted by a signal handler, I
> > > think that would need this fully fixed.
> > 
> > Is rseq a more suitable way to do that sort of thing on new-ish Linux?
> > I guess a fallback may be needed for older / other kernels though.
> 
> rseq does not help with libc critical sections:
> 
> the point is not to restart the critical section
> (which would require no side effect or mechanisms
> to roll side effects back and that the section is
> entirely written in asm between begin/end labels
> so the kernel knows when the section is left),
> 
> but to let the critical section with all its side
> effects complete and delay the signal handler
> until then. (the slow and easy way to do this is
> masking signals using syscalls around critical
> sections, a fast solution needs signal wrapping
> and saving the sigcontext.)
> 
> for example the entire malloc call can be a critical
> > section and an incoming signal delayed until malloc
> completes. such solution allows hiding all libc
> internal inconsistent state from user code so async
> signal handlers can call all libc apis.

Isn't this a bit backwards?  This "makes" trivial signal handlers easier
to write, but this is a bit of a Trojan horse: precisely because signal
handlers can interrupt things, subtleties abound.  So, while there are
plenty of naive signal handlers out there, there are far fewer that are
genuinely trivial -- i.e., free from subtleties.

In any case, the problem of async signal safety remains: even if libc
uses internal locks to hide it, library functions in general may not.

A better approach would be to have function attributes that identify
code that may run in signal context and async-signal-safe functions, so
that the compiler can actually enforce that only reentrant functions are
called from signal context.

Finally, if a fault signal is delivered while blocked or ignored it
kills the process.  So handlers for fault signals raised by the kernel
still wouldn't be able to call random libc functions: to prevent sudden
death while in the middle of malloc etc., libc must not mask these
signals, and wouldn't be safely reentrant while handling them -- thus we
still have the problem we intended to solve.

This is ironic, since these signals are the _only_ signals that must be
handled using signal handlers.  Other signals can all be accepted by
other means, such as signalfd.

The other reason to use signal handlers is to minimise response latency
for asynchronous signals.  Masking signals in order to bulletproof
code that the signal handler probably isn't going to use anyway would
interfere with this goal.

> > For aarch64 an explicit prctl()/sysctl opt-in is needed to enable jumbo
> > vector registers before you see oversized signal frames, though I don't
> > think there is a similar control for AVX-512.
> > 
> > Even on aarch64, this interface is not very friendly though.  It might
> > be better to have some ELF attribute that ld.so or the libc startup can
> > arbitrate on and twiddle the appropriate switches.
> 
> we can probably mark binaries if they are large
> vector length compatible. (but the incompatible
> binaries have to be hunted down manually and all
> we can do with them at dlopen time is to give a
> > nice dlerror in case the vlength got increased)

Unless libc obeys the flag and actively prevents the setting of a larger
vector length.  It could do that, though I don't have a strong view on
whether it should...

dlopening a library with a smaller maximum vector length than the
established one would still be a problem though: we might have to fail
that.  But only in a process where some library uses SVE at all.  I
think there are ELF attributes for that too, no?

I think the incompatible binaries don't necessarily have to be hunted
down:  if a program was built with _GNU_SOURCE and a new enough C
library, it can't use the compile-time constant MINSIGSTKSZ etc., so we
can take the opportunity to mark the binary.  Other binaries can't be
assumed to be compatible with large vectors.
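For illustration, code that avoids the compile-time constants might size its alternate stack at run time along these lines. This is a sketch that assumes a C library exposing _SC_SIGSTKSZ (falling back to SIGSTKSZ when it does not); alt_stack_size and install_alt_stack are hypothetical helper names.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

/* Prefer the run-time value when the C library provides it. */
static size_t alt_stack_size(void)
{
    long sz = -1;

#ifdef _SC_SIGSTKSZ
    sz = sysconf(_SC_SIGSTKSZ);
#endif
    if (sz <= 0)
        sz = SIGSTKSZ;          /* fallback for older C libraries */
    return (size_t)sz;
}

/* Allocate and register an alternate signal stack.
   Returns its size, or 0 on failure. */
static size_t install_alt_stack(void)
{
    stack_t ss;

    ss.ss_size = alt_stack_size();
    ss.ss_flags = 0;
    ss.ss_sp = malloc(ss.ss_size);  /* heap, not a fixed-size array */
    if (ss.ss_sp == NULL)
        return 0;
    if (sigaltstack(&ss, NULL) != 0) {
        free(ss.ss_sp);
        return 0;
    }
    return ss.ss_size;
}
```

The point of the heap allocation is that a declaration such as `char stack[SIGSTKSZ]` at file scope stops compiling once SIGSTKSZ is no longer a constant expression.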

Cheers
---Dave
  
Dave Martin Nov. 19, 2020, 4:52 p.m. UTC | #20
On Thu, Nov 19, 2020 at 11:16:57AM -0500, Rich Felker wrote:
> On Thu, Nov 19, 2020 at 10:39:47AM -0500, Zack Weinberg wrote:
> > On Thu, Nov 19, 2020 at 10:00 AM Szabolcs Nagy via Libc-alpha
> > <libc-alpha@sourceware.org> wrote:
> > ....
> > > but to let the critical section with all its side
> > > effects complete and delay the signal handler
> > > until then. (the slow and easy way to do this is
> > > masking signals using syscalls around critical
> > > sections, a fast solution needs signal wrapping
> > > and saving the sigcontext.)
> > >
> > > for example the entire malloc call can be a critical
> > > section and an incoming signal delayed until malloc
> > > completes.
> > 
> > I read this and I'm wondering how impractical it would be to invent a
> > way to perform pthread_sigmask() operations without a system call.
> > 
> > Abstractly, the signal mask for each thread _could_ be placed in user
> > space at an ABI-specified location where the kernel can find it
> 
> I've worked this out before, and it's mostly simple, but might have
> some corner cases. Archs without an atomic (in single threaded sense)
> 64-bit (128-bit for MIPS) store operation can't update the mask
> without tearing (a signal arriving between the partial writes could
> see an inconsistent signal mask), so they couldn't use a naive form.
> Having two slots you can alternate between (update the inactive one
> then write the selector to switch which one is active) should work
> though. There are probably lots of other minor issues like this but I
> think they're all solvable.

I'd also thought about this, and I feel it does mostly work on systems
with atomic read-modify-writes of some sort, but there are a lot of
tricky details.

We might be able to move a lot of the "legacy" signal delivery logic to
userspace, eliminating all kernel entries except for asynchronous signal
delivery that comes out of the blue.

My instinct says that it would be better to have a more stripped down
kernel signal delivery mechanism that is actually designed as a
foundation for this.  Then all the "legacy" signal delivery logic could
be implemented in userspace.

POSIX signals are trying to meet a bunch of mutually defeating goals,
some of them historic and quite misguided -- so they're never going to
be all that satisfactory as a base to build on.

> 
> > (bottom of the thread stack, maybe, or an offset from the TLS base);
> > this doesn't make anything new possible (the kernel will just continue
> > to ignore the bits for SIGKILL, SIGSEGV, etc.) so it wouldn't be a
> > privilege violation.
> 
> It could either be managed by the vdso, or at an address registered by
> a syscall (similar to robust_list) rather than some hard-coded
> contract.
> 
> But in any case, even if this were added, it wouldn't meet the needs
> for the use case I proposed in musl. It doesn't solve the execve race with
> abort (and can't, since it's just equivalent to optimizing signal
> blocking we can already do, not introducing any new capability) and it
> would not be available on past kernels, only future ones.
> 
> > Seems like that would be much easier than copying sigcontexts around,
> > and less likely to break code that thinks it knows stuff about signal
> > frames.
> 
> You don't need to copy sigcontexts around. That's a misunderstanding of
> how this works. It's just the siginfo that needs to be saved until the
> deferred handler runs. The sigcontext passed to the handler is then
> the synchronous context at the time it runs (which becomes the logical
> time of signal delivery), not the earlier async context, and can
> thereby be pretty much entirely uninitialized except for return
> address and stack pointer. Everything else is a representation of
> internal state from a libc function and can just be all zeros or
> whatever.

I see.  I might still try to hack up a better ucontext interface if
(heh) I find myself with time on my hands, but I won't cite your needs
as a use case!

Cheers
  
Rich Felker Nov. 19, 2020, 5:29 p.m. UTC | #21
On Thu, Nov 19, 2020 at 04:37:45PM +0000, Dave Martin wrote:
> On Thu, Nov 19, 2020 at 02:59:34PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> > The 11/18/2020 18:09, Dave Martin via Libc-alpha wrote:
> > > On Wed, Nov 18, 2020 at 06:35:00PM +0100, Florian Weimer wrote:
> > > > * Dave Martin:
> > > > > I have some thoughts on what a better interface might look like --
> > > > > basically separating the signal ucontext_t type from the setcontext()/
> > > > > getcontext() etc. type, and providing accessors for the architectural
> > > > > register state rather than just having a fixed struct definition for
> > > > > mcontext_t.
> > > > >
> > > > > But, there also may not be a lot of appetite for such a change, and
> > > > > I can't see how it could be backwards compatible.
> > > > >
> > > > > I can elaborate if people think it's worth discussing.
> > > > 
> > > > I think Rich Felker wants to copy signal contexts around to implement
> > > > critical sections that can't be interrupted by a signal handler, I
> > > > think that would need this fully fixed.
> > > 
> > > Is rseq a more suitable way to do that sort of thing on new-ish Linux?
> > > I guess a fallback may be needed for older / other kernels though.
> > 
> > rseq does not help with libc critical sections:
> > 
> > the point is not to restart the critical section
> > (which would require no side effect or mechanisms
> > to roll side effects back and that the section is
> > entirely written in asm between begin/end labels
> > so the kernel knows when the section is left),
> > 
> > but to let the critical section with all its side
> > effects complete and delay the signal handler
> > until then. (the slow and easy way to do this is
> > masking signals using syscalls around critical
> > sections, a fast solution needs signal wrapping
> > and saving the sigcontext.)
> > 
> > for example the entire malloc call can be a critical
> > > section and an incoming signal delayed until malloc
> > completes. such solution allows hiding all libc
> > internal inconsistent state from user code so async
> > signal handlers can call all libc apis.
> 
> Isn't this a bit backwards?  This "makes" trivial signal handlers easier
> to write, but this is a bit of a Trojan horse: precisely because signal
> handlers can interrupt things, subtleties abound.  So, while there are
> plenty of naive signal handlers out there, there are far fewer that are
> genuinely trivial -- i.e., free from subtleties.
> 
> In any case, the problem of async signal safety remains: even if libc
> uses internal locks to hide it, library functions in general may not.
> 
> A better approach..

You're missing a lot of the context of this to understand the
motivations. While the malloc example is interesting and making it
AS-safe as an extension is perhaps nice, the real motivation is places
where we're implementing an interface that is already required or at
least expected to be AS-safe. If such a function needs to access a
shared resource under lock, then presently that lock can never be
taken except with all application signals blocked. Otherwise, it's
possible that the AS-safe function needing the lock runs from a signal
handler that interrupted code that already held the lock. (Note:
recursive locks can't solve this, because a more complex version of
the same deadlock can be recreated with a pair of threads each
interrupted by signal handlers each needing the lock the other holds
in the interrupted code.)

One particularly nasty example is munmap (and mmap or mremap with
MAP_FIXED) which needs to synchronize with removal of robust mutexes
from the pending slot of the unlocking thread's robust_list. (Without
doing this, there are fundamental race conditions whereby async
termination of a process with robust mutexes can corrupt memory-mapped
files.) Presently, this makes it so our munmap, etc. are not AS-safe.
Note that they're not required to be AS-safe by POSIX, but most
programmers assume they are. In order to make them AS-safe, every
pthread_mutex_unlock on a robust mutex would need to mask and unmask
signals, so that the lock that inhibits changes to mappings couldn't
be held arbitrarily long and in deadlocking ways when interrupted by a
signal. But this masking/unmasking would introduce 2 syscalls to a
very hot path that's not supposed to have any.

However the original motivation for signal wrapping (which is what
enables deferral) is fixing an otherwise unfixable race between execve
and abort. In order to exit via SIGABRT when the signal starts out
with SIG_IGN disposition, abort has to change the disposition, and use
locks to prevent it from being changed back and prevent the changed
disposition from being observable. Morally, execve should take this
lock to prevent passing the changed disposition to a new process image
if it races with abort. However, the lock has to be an AS-safe one
(since abort has to be AS-safe), and execve can't mask signals or the
new mask would be passed to the new process image.

With the proposed wrapping/deferral, we can safely take the abort lock
without masking any signals. There's some additional machinery needed
here in the signal wrapper to make everything come out right, which
I'm glossing over because it's not the point here, but it all works
out and gives you an execve that can't leak internal state of abort.

> would be to have function attributes that identify
> code that may run in signal context and async-signal-safe functions, so
> that the compiler can actually enforce that only reentrant functions are
> called from signal context.
> 
> Finally, if a fault signal is delivered while blocked or ignored it
> kills the process.  So handlers for fault signals raised by the kernel
> still wouldn't be able to call random libc functions: to prevent sudden
> death while in the middle of malloc etc., libc must not mask these
> signals, and wouldn't be safely reentrant while handling them -- thus we
> still have the problem we intended to solve.

This has been raised plenty of times already. The desired outcome is
that the process terminate. This is a feature not a bug. If you really
don't want that, you *could* process synchronous signals (SIGSEGV
raised by invalid access) synchronously rather than deferring them,
possibly after releasing the lock for the critical section, but all
this does is de-harden the code. The *only* way a fault can happen in
this code is for the internal state to be corrupted in a manner in
which no further execution in the process context is safe.

> This is ironic, since these signals are the _only_ signals that must be
> handled using signal handlers.  Other signals can all be accepted by
> other means, such as signalfd.

I think you mean the only signals for which installing a handler is
meaningless because they're only delivered when the program already
has undefined behavior. (Technically you could use them in ways that
can have a meaningfully defined model *from your own code* or even from
library code like memcpy or something if you've created mappings that
are intended to fault, but there is no defined or even reasonably
definable condition under which that access would end up taking place
in one of the critical sections under consideration here.)

> The other reason to use signal handlers is to minimise response latency
> for asynchronous signals.  Masking signals in order to bulletproof
> code that the signal handler probably isn't going to use anyway would
> interfere with this goal.

Yes, deferring signals is a matter of trading very small bounded
amounts of latency to get rid of unbounded latency and possible
deadlock from code that holds a lock being interrupted by a signal
handler. Note that the equivalent trade (with much much larger
latency) already happens anywhere you have to mask signals temporarily
via a sigprocmask.

Rich
  
Szabolcs Nagy Nov. 19, 2020, 5:33 p.m. UTC | #22
The 11/19/2020 16:37, Dave Martin wrote:
> On Thu, Nov 19, 2020 at 02:59:34PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> > the point is not to restart the critical section
> > (which would require no side effect or mechanisms
> > to roll side effects back and that the section is
> > entirely written in asm between begin/end labels
> > so the kernel knows when the section is left),
> > 
> > but to let the critical section with all its side
> > effects complete and delay the signal handler
> > until then. (the slow and easy way to do this is
> > masking signals using syscalls around critical
> > sections, a fast solution needs signal wrapping
> > and saving the sigcontext.)
> > 
> > for example the entire malloc call can be a critical
> > section and an incoming signal delayed until malloc
> > completes. such solution allows hiding all libc
> > internal inconsistent state from user code so async
> > signal handlers can call all libc apis.
> 
> Isn't this a bit backwards?  This "makes" trivial signal handlers easier
> to write, but this is a bit of a Trojan horse: precisely because signal
> handlers can interrupt things, subtleties abound.  So, while there are
> plenty of naive signal handlers out there, there are far fewer that are
> genuinely trivial -- i.e., free from subtleties.
> 
> In any case, the problem of async signal safety remains: even if libc
> uses internal locks to hide it, library functions in general may not.
> 
> A better approach would be to have function attributes that identify
> code that may run in signal context and async-signal-safe functions, so
> that the compiler can actually enforce that only reentrant functions are
> called from signal context.

this does not solve any of the real problems with
exposed libc internal state to signal handlers:

- linux has serious bugs that libc has to work around
(e.g. missing abort syscall: abort and execve must be
as-safe but abort must change SIGABRT handling that
must not be inherited by execve and execve cannot
block signals because that must not be inherited either.)

- signal handler can arbitrarily delay another thread
(because it may happen when libc internal lock is
held and user code in signal handlers may not return
immediately so a handler in one thread can break
real-time guarantees of another purely because of
libc internal details that must not be observable).

- there can be libc operations that must be as-safe
but internally need to do non-as-safe operations
(e.g. tls access must be as-safe but in glibc it
may allocate or take internal locks.)

- compiler can generate non-as-safe libc calls into
signal handler code (e.g. memcpy is not required to
be as-safe but compiler can generate it, but a more
realistic example is the various sanitizer
instrumentations that break as-safety because they print
diagnostic messages on failures).

- there are existing slow signal mask operations
around critical sections in libc which can be
improved by the proposed design.

> Finally, if a fault signal is delivered while blocked or ignored it
> kills the process.  So handlers for fault signals raised by the kernel
> still wouldn't be able to call random libc functions: to prevent sudden
> death while in the middle of malloc etc., libc must not mask these
> signals, and wouldn't be safely reentrant while handling them -- thus we
> still have the problem we intended to solve.

synchronous faults cannot happen in libc unless
the caller invoked ub so anything goes.

(i.e. not delaying those signals is fine)
  
Dave Martin Nov. 19, 2020, 7:39 p.m. UTC | #23
On Thu, Nov 19, 2020 at 05:33:57PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> The 11/19/2020 16:37, Dave Martin wrote:
> > On Thu, Nov 19, 2020 at 02:59:34PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> > > the point is not to restart the critical section
> > > (which would require no side effect or mechanisms
> > > to roll side effects back and that the section is
> > > entirely written in asm between begin/end labels
> > > so the kernel knows when the section is left),
> > > 
> > > but to let the critical section with all its side
> > > effects complete and delay the signal handler
> > > until then. (the slow and easy way to do this is
> > > masking signals using syscalls around critical
> > > sections, a fast solution needs signal wrapping
> > > and saving the sigcontext.)
> > > 
> > > for example the entire malloc call can be a critical
> > > section and an incoming signal delayed until malloc
> > > completes. such solution allows hiding all libc
> > > internal inconsistent state from user code so async
> > > signal handlers can call all libc apis.
> > 
> > Isn't this a bit backwards?  This "makes" trivial signal handlers easier
> > to write, but this is a bit of a Trojan horse: precisely because signal
> > handlers can interrupt things, subtleties abound.  So, while there are
> > plenty of naive signal handlers out there, there are far fewer that are
> > genuinely trivial -- i.e., free from subtleties.
> > 
> > In any case, the problem of async signal safety remains: even if libc
> > uses internal locks to hide it, library functions in general may not.
> > 
> > A better approach would be to have function attributes that identify
> > code that may run in signal context and async-signal-safe functions, so
> > that the compiler can actually enforce that only reentrant functions are
> > called from signal context.
> 
> this does not solve any of the real problems with
> exposed libc internal state to signal handlers:
> 
> - linux has serious bugs that libc has to work around
> (e.g. missing abort syscall: abort and execve must be
> as-safe but abort must change SIGABRT handling that
> must not be inherited by execve and execve cannot
> block signals because that must not be inherited either.)

I can see the argument here, but this feels more like an implementation
detail of the execve() wrapper and libc's abort() implementation (?)

I guess this is analogous to spin_lock_irq() in the kernel: you take the
lock, but an irq that may be handled in the same (hardware) thread must
not attempt to take the same lock, since the spinlock code is not itself
reentrant for a given lock.  So, in the main thread you must also mask
IRQs before taking the lock.

> - signal handler can arbitrarily delay another thread
> (because it may happen when libc internal lock is
> held and user code in signal handlers may not return
> immediately so a handler in one thread can break
> real-time guarantees of another purely because of
> libc internal details that must not be observable).
> 
> - there can be libc operations that must be as-safe
> but internally needs to do non-as-safe operation
> (e.g. tls access must be as-safe but in glibc it
> may allocate or take internal locks.)

Seems reasonable.  I was concerned that these critical sections might be
long, potentially sleeping sequences of code.  It sounds like that's
definitely not the intention, so I guess it's workable.

The need to call sigprocmask() does suck here, and a way to get that
effect purely within userspace would be preferable.

> - compiler can generate non-as-safe libc calls into
> signal handler code (e.g. memcpy is not required to
> be as-safe but compiler can generate it, but a more

But presumably there is an understanding between the compiler and libcs
that it targets that the function called for out-of-line memcpy()s must
be as-safe.  By definition compiler output is not portable -- it assumes
a particular runtime environment.  So the lack of guaranteed as-safety
for memcpy() in the standards is not necessarily an issue here.

(I wonder how many memcpy() implementations are really non-as-safe
though.  I guess that could happen if some kind of accelerator were used
for large copies.)

> realistic example is the various sanitizer
> instrumentations that break as-safety because they print
> diagnostic messages on failures).

Well, you can do this in an as-safe way.  But it is unfortunate not to
be able to use the usual C functions to do the printing.

I generally assume that the sprintf() family of formats are at least
safe if you're not using locales or custom format specifiers.  I
probably shouldn't though, strictly speaking.


I suspect that unsafely printing diagnostics from handlers for fatal
signals is rather common in the wild, even if the standards say you
mustn't do it.
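
As a rough illustration of the as-safe route, the sketch below formats a
number by hand and emits it with write(), which POSIX lists as
async-signal-safe.  The names format_ulong(), safe_write() and
fatal_handler() are invented for this example and do not come from any
sanitizer runtime or from glibc.

```c
#include <stddef.h>
#include <unistd.h>

/* Store VALUE in decimal in BUF, which must hold at least 21 bytes
   (enough for a 64-bit value plus the terminating NUL).  No libc
   formatting function is called.  Returns the number of digits.  */
size_t
format_ulong (unsigned long value, char *buf)
{
  char tmp[20];
  size_t n = 0;

  do
    tmp[n++] = (char) ('0' + value % 10);
  while ((value /= 10) != 0);

  /* The digits were produced least significant first; reverse them.  */
  for (size_t i = 0; i < n; i++)
    buf[i] = tmp[n - 1 - i];
  buf[n] = '\0';
  return n;
}

/* Write LEN bytes of S to stderr, retrying on short writes and giving
   up silently on error.  */
void
safe_write (const char *s, size_t len)
{
  while (len > 0)
    {
      ssize_t r = write (STDERR_FILENO, s, len);
      if (r <= 0)
	return;
      s += r;
      len -= (size_t) r;
    }
}

/* Hypothetical handler for a fatal signal: report the signal number and
   terminate with _exit (), which is also async-signal-safe.  */
void
fatal_handler (int signo)
{
  static const char msg[] = "fatal signal ";
  char num[21];
  size_t len = format_ulong ((unsigned long) signo, num);

  safe_write (msg, sizeof msg - 1);
  safe_write (num, len);
  safe_write ("\n", 1);
  _exit (128 + signo);
}
```

This avoids stdio entirely, at the price of writing the formatting by
hand.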

> - there are existing slow signal mask operations
> around critical sections in libc which can be
> improved by the proposed design.
> 
> > Finally, if a fault signal is delivered while blocked or ignored it
> > kills the process.  So handlers for fault signals raised by the kernel
> > still wouldn't be able to call random libc functions: to prevent sudden
> > death while in the middle of malloc etc., libc must not mask these
> > signals, and wouldn't be safely reentrant while handling them -- thus we
> > still have the problem we intended to solve.
> 
> synchronous faults cannot happen in libc unless
> the caller invoked ub so anything goes.
> 
> (i.e. not delaying those signals is fine)

Debatable.  The whole point of handling a fatal signal is to clean up
mess and/or capture diagnostic data.  By definition the process may be
in a partially unknown state at this point, but best efforts should
still be made to handle the signal -- otherwise handling the signal
doesn't make sense at all.  It doesn't seem right for this to fail
unpredictably depending on where the signal landed.

Cheers
---Dave
  
H.J. Lu Nov. 20, 2020, 2:08 p.m. UTC | #24
On Wed, Nov 18, 2020 at 10:09 AM Dave Martin via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> On Wed, Nov 18, 2020 at 06:35:00PM +0100, Florian Weimer wrote:
> > * Dave Martin:
> >
> > > Having _GNU_SOURCE enable this feature will break existing source code.
> >
> > Does this matter?
> >
> > The code is already broken on a quickly increasing number of machines,
> > so it needs fixing anyway.  A compile-time error is probably
> > preferable to an obscure run-time failure.
>
> Fair point, but I didn't want to be the one making it :)
>
> If people are happy to go this way, I don't see a problem with it.
> Distro maintainers may shout a bit while they deal with the fallout of
> that glibc upgrade though.
>
>
> > > It probably should go under _GNU_SOURCE in the long term, but is there
> > > usually a transitional period when the change is opt-in via some feature
> > > macro?
> >
> > Not always.  See the iszero and other macros in <math.h>, that also
> > broke existing sources (largely C++, so we worked around it by using
> > C++ features instead of macros).  There are also many older examples.
>
> I see.  Well, I guess that's not my call.
>

Can we reach a consensus for _GNU_SOURCE vs _SC_SIGSTKSZ_SOURCE?

How about we make _GNU_SOURCE enable _SC_SIGSTKSZ_SOURCE and
check _SC_SIGSTKSZ_SOURCE != 0?  If _GNU_SOURCE triggers a compilation
error and the source code can't be changed, we can add
-D_SC_SIGSTKSZ_SOURCE=0 to disable _SC_SIGSTKSZ_SOURCE.


H.J.
  
Florian Weimer Nov. 20, 2020, 2:11 p.m. UTC | #25
* H. J. Lu:

> Can we reach a consensus for _GNU_SOURCE vs _SC_SIGSTKSZ_SOURCE?
>
> How about we make _GNU_SOURCE enable _SC_SIGSTKSZ_SOURCE and
> check _SC_SIGSTKSZ_SOURCE != 0?  If _GNU_SOURCE triggers a compilation
> error and the source code can't be changed, we can add
> -D_SC_SIGSTKSZ_SOURCE=0 to disable _SC_SIGSTKSZ_SOURCE.

I think the source code compatibility is much more obscure than the
other things we broke quite recently, even without deprecation.  Why
do we add the complexity in this case?  I feel it is disproportional.
  

Patch

From 38eb69fdc50cc9b44259f549f4ea023f61ed843f Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Fri, 25 Sep 2020 13:10:08 -0700
Subject: [PATCH] sysconf: Add _SC_MINSIGSTKSZ/_SC_SIGSTKSZ [BZ #20305]

Add _SC_MINSIGSTKSZ for the minimum signal stack size derived from
AT_MINSIGSTKSZ, which is the minimum number of bytes of free stack
space required in order to guarantee successful, non-nested handling
of a single signal whose handler is an empty function, and _SC_SIGSTKSZ
which is the suggested minimum number of bytes of stack space required
for a signal stack.

If AT_MINSIGSTKSZ isn't available, sysconf (_SC_MINSIGSTKSZ) returns
MINSIGSTKSZ.  On Linux/x86 with XSAVE, the signal frame used by kernel
is composed of the following areas and laid out as:

 ------------------------------
 | alignment padding          |
 ------------------------------
 | xsave buffer               |
 ------------------------------
 | fsave header (32-bit only) |
 ------------------------------
 | siginfo + ucontext         |
 ------------------------------

Compute AT_MINSIGSTKSZ value as size of xsave buffer + size of fsave
header (32-bit only) + size of siginfo and ucontext + alignment padding.

If _SC_SIGSTKSZ_SOURCE is defined, MINSIGSTKSZ and SIGSTKSZ are redefined
as

/* Default stack size for a signal handler: sysconf (_SC_SIGSTKSZ).  */
 # undef SIGSTKSZ
 # define SIGSTKSZ sysconf (_SC_SIGSTKSZ)

/* Minimum stack size for a signal handler: SIGSTKSZ.  */
 # undef MINSIGSTKSZ
 # define MINSIGSTKSZ SIGSTKSZ

Compilation will fail if the source assumes constant MINSIGSTKSZ or
SIGSTKSZ.

The reason for not simply increasing the kernel's MINSIGSTKSZ #define
(apart from the fact that it is rarely used, due to glibc's shadowing
definitions) was that userspace binaries will have baked in the old
value of the constant and may be making assumptions about it.

For example, the type (char [MINSIGSTKSZ]) changes if this #define
changes.  This could be a problem if a newly built library tries to
memcpy() or dump such an object defined by an old binary.
Bounds-checking and the stack sizes passed to things like sigaltstack()
and makecontext() could similarly go wrong.
---
 NEWS                                          |  5 ++
 bits/confname.h                               |  8 +-
 bits/sigstksz.h                               | 21 +++++
 elf/dl-support.c                              |  5 ++
 elf/dl-sysdep.c                               |  9 ++
 include/bits/sigstack.h                       |  5 ++
 include/features.h                            | 12 +++
 manual/conf.texi                              | 21 +++++
 manual/creature.texi                          |  6 ++
 posix/sysconf.c                               |  3 +
 signal/Makefile                               |  5 +-
 signal/signal.h                               |  1 +
 signal/tst-minsigstksz-5.c                    | 84 +++++++++++++++++++
 sysdeps/generic/ldsodefs.h                    |  3 +
 sysdeps/unix/sysv/linux/Makefile              |  8 ++
 sysdeps/unix/sysv/linux/bits/sigstksz.h       | 33 ++++++++
 .../unix/sysv/linux/ia64/sysconf-sigstksz.h   | 27 ++++++
 sysdeps/unix/sysv/linux/sysconf-sigstksz.h    | 38 +++++++++
 sysdeps/unix/sysv/linux/sysconf.c             |  9 ++
 .../unix/sysv/linux/x86/dl-minsigstacksize.h  | 83 ++++++++++++++++++
 .../sysv/linux/x86/include/bits/sigstack.h    |  5 ++
 sysdeps/x86/cpu-features.c                    |  3 +
 sysdeps/x86/dl-minsigstacksize.h              | 27 ++++++
 23 files changed, 418 insertions(+), 3 deletions(-)
 create mode 100644 bits/sigstksz.h
 create mode 100644 include/bits/sigstack.h
 create mode 100644 signal/tst-minsigstksz-5.c
 create mode 100644 sysdeps/unix/sysv/linux/bits/sigstksz.h
 create mode 100644 sysdeps/unix/sysv/linux/ia64/sysconf-sigstksz.h
 create mode 100644 sysdeps/unix/sysv/linux/sysconf-sigstksz.h
 create mode 100644 sysdeps/unix/sysv/linux/x86/dl-minsigstacksize.h
 create mode 100644 sysdeps/unix/sysv/linux/x86/include/bits/sigstack.h
 create mode 100644 sysdeps/x86/dl-minsigstacksize.h

diff --git a/NEWS b/NEWS
index 3c1e509744..0fc57afc3f 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,11 @@  Version 2.33
 
 Major new features:
 
+* Add _SC_MINSIGSTKSZ and _SC_SIGSTKSZ.  When _SC_SIGSTKSZ_SOURCE is
+  defined, MINSIGSTKSZ and SIGSTKSZ are no longer constant on Linux.
+  MINSIGSTKSZ is redefined to sysconf(_SC_MINSIGSTKSZ) and SIGSTKSZ
+  is redefined to sysconf (_SC_SIGSTKSZ).
+
 * The dynamic linker accepts the --argv0 argument and provides opportunity
   to change argv[0] string.
 
diff --git a/bits/confname.h b/bits/confname.h
index 5dc8215093..451d3eb636 100644
--- a/bits/confname.h
+++ b/bits/confname.h
@@ -525,8 +525,14 @@  enum
 
     _SC_THREAD_ROBUST_PRIO_INHERIT,
 #define _SC_THREAD_ROBUST_PRIO_INHERIT	_SC_THREAD_ROBUST_PRIO_INHERIT
-    _SC_THREAD_ROBUST_PRIO_PROTECT
+    _SC_THREAD_ROBUST_PRIO_PROTECT,
 #define _SC_THREAD_ROBUST_PRIO_PROTECT	_SC_THREAD_ROBUST_PRIO_PROTECT
+
+    _SC_MINSIGSTKSZ,
+#define	_SC_MINSIGSTKSZ			_SC_MINSIGSTKSZ
+
+    _SC_SIGSTKSZ
+#define	_SC_SIGSTKSZ			_SC_SIGSTKSZ
   };
 
 /* Values for the NAME argument to `confstr'.  */
diff --git a/bits/sigstksz.h b/bits/sigstksz.h
new file mode 100644
index 0000000000..5535d873b5
--- /dev/null
+++ b/bits/sigstksz.h
@@ -0,0 +1,21 @@ 
+/* Definition of MINSIGSTKSZ.  Generic version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SIGNAL_H
+# error "Never include <bits/sigstksz.h> directly; use <signal.h> instead."
+#endif
diff --git a/elf/dl-support.c b/elf/dl-support.c
index afbc94df54..1bbce03f9f 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -136,6 +136,8 @@  void (*_dl_init_static_tls) (struct link_map *) = &_dl_nothread_init_static_tls;
 
 size_t _dl_pagesize = EXEC_PAGESIZE;
 
+size_t _dl_minsigstacksize = CONSTANT_MINSIGSTKSZ;
+
 int _dl_inhibit_cache;
 
 unsigned int _dl_osversion;
@@ -294,6 +296,9 @@  _dl_aux_init (ElfW(auxv_t) *av)
       case AT_RANDOM:
 	_dl_random = (void *) av->a_un.a_val;
 	break;
+      case AT_MINSIGSTKSZ:
+	_dl_minsigstacksize = av->a_un.a_val;
+	break;
       DL_PLATFORM_AUXV
       }
   if (seen == 0xf)
diff --git a/elf/dl-sysdep.c b/elf/dl-sysdep.c
index 854570821c..47400f2937 100644
--- a/elf/dl-sysdep.c
+++ b/elf/dl-sysdep.c
@@ -117,6 +117,11 @@  _dl_sysdep_start (void **start_argptr,
   user_entry = (ElfW(Addr)) ENTRY_POINT;
   GLRO(dl_platform) = NULL; /* Default to nothing known about the platform.  */
 
+  /* NB: Default to a constant CONSTANT_MINSIGSTKSZ.  */
+  _Static_assert (__builtin_constant_p (CONSTANT_MINSIGSTKSZ),
+		  "CONSTANT_MINSIGSTKSZ is constant");
+  GLRO(dl_minsigstacksize) = CONSTANT_MINSIGSTKSZ;
+
   for (av = GLRO(dl_auxv); av->a_type != AT_NULL; set_seen (av++))
     switch (av->a_type)
       {
@@ -181,6 +186,9 @@  _dl_sysdep_start (void **start_argptr,
       case AT_RANDOM:
 	_dl_random = (void *) av->a_un.a_val;
 	break;
+      case AT_MINSIGSTKSZ:
+	GLRO(dl_minsigstacksize) = av->a_un.a_val;
+	break;
       DL_PLATFORM_AUXV
       }
 
@@ -308,6 +316,7 @@  _dl_show_auxv (void)
 	  [AT_SYSINFO_EHDR - 2] =	{ "SYSINFO_EHDR:      0x", hex },
 	  [AT_RANDOM - 2] =		{ "RANDOM:            0x", hex },
 	  [AT_HWCAP2 - 2] =		{ "HWCAP2:            0x", hex },
+	  [AT_MINSIGSTKSZ - 2] =	{ "MINSIGSTKSZ        ", dec },
 	  [AT_L1I_CACHESIZE - 2] =	{ "L1I_CACHESIZE:     ", dec },
 	  [AT_L1I_CACHEGEOMETRY - 2] =	{ "L1I_CACHEGEOMETRY: 0x", hex },
 	  [AT_L1D_CACHESIZE - 2] =	{ "L1D_CACHESIZE:     ", dec },
diff --git a/include/bits/sigstack.h b/include/bits/sigstack.h
new file mode 100644
index 0000000000..eea939f59d
--- /dev/null
+++ b/include/bits/sigstack.h
@@ -0,0 +1,5 @@ 
+#include_next <bits/sigstack.h>
+
+#if !defined _ISOMAC && !defined CONSTANT_MINSIGSTKSZ
+# define CONSTANT_MINSIGSTKSZ MINSIGSTKSZ
+#endif
diff --git a/include/features.h b/include/features.h
index f3e62d3362..b202301ad8 100644
--- a/include/features.h
+++ b/include/features.h
@@ -55,6 +55,9 @@ 
    _FORTIFY_SOURCE	Add security hardening to many library functions.
 			Set to 1 or 2; 2 performs stricter checks than 1.
 
+   _SC_SIGSTKSZ_SOURCE	Select correct (but non compile-time constant)
+			MINSIGSTKSZ and SIGSTKSZ.
+
    _REENTRANT, _THREAD_SAFE
 			Obsolete; equivalent to _POSIX_C_SOURCE=199506L.
 
@@ -96,6 +99,8 @@ 
    __USE_ATFILE		Define *at interfaces and AT_* constants for them.
    __USE_GNU		Define GNU extensions.
    __USE_FORTIFY_LEVEL	Additional security measures used, according to level.
+   __USE_SC_SIGSTKSZ	Define correct (but non compile-time constant)
+			MINSIGSTKSZ and SIGSTKSZ.
 
    The macros `__GNU_LIBRARY__', `__GLIBC__', and `__GLIBC_MINOR__' are
    defined by this file unconditionally.  `__GNU_LIBRARY__' is provided
@@ -139,6 +144,7 @@ 
 #undef	__USE_ATFILE
 #undef	__USE_GNU
 #undef	__USE_FORTIFY_LEVEL
+#undef	__USE_SC_SIGSTKSZ
 #undef	__KERNEL_STRICT_NAMES
 #undef	__GLIBC_USE_ISOC2X
 #undef	__GLIBC_USE_DEPRECATED_GETS
@@ -407,6 +413,12 @@ 
 # define __USE_FORTIFY_LEVEL 0
 #endif
 
+#ifdef	_SC_SIGSTKSZ_SOURCE
+# define __USE_SC_SIGSTKSZ	1
+#else
+# define __USE_SC_SIGSTKSZ	0
+#endif
+
 /* The function 'gets' existed in C89, but is impossible to use
    safely.  It has been removed from ISO C11 and ISO C++14.  Note: for
    compatibility with various implementations of <cstdio>, this test
diff --git a/manual/conf.texi b/manual/conf.texi
index f959b00bb6..ba9847aaa4 100644
--- a/manual/conf.texi
+++ b/manual/conf.texi
@@ -913,6 +913,27 @@  Inquire about the parameter corresponding to @code{NL_SETMAX}.
 @item _SC_NL_TEXTMAX
 @standards{X/Open, unistd.h}
 Inquire about the parameter corresponding to @code{NL_TEXTMAX}.
+
+@item _SC_MINSIGSTKSZ
+@standards{GNU, unistd.h}
+Inquire about the minimum number of bytes of free stack space required
+in order to guarantee successful, non-nested handling of a single signal
+whose handler is an empty function.
+
+@item _SC_SIGSTKSZ
+@standards{GNU, unistd.h}
+Inquire about the suggested minimum number of bytes of stack space
+required for a signal stack.
+
+This is not guaranteed to be enough for any specific purpose other than
+the invocation of a single, non-nested, empty handler, but nonetheless
+should be enough for basic scenarios involving simple signal handlers
+and very low levels of signal nesting (say, 2 or 3 levels at the very
+most).
+
+This value is provided for developer convenience and to ease migration
+from the legacy @code{SIGSTKSZ} constant.  Programs requiring stronger
+guarantees should avoid using it if at all possible.
 @end vtable
 
 @node Examples of Sysconf
diff --git a/manual/creature.texi b/manual/creature.texi
index be5050468b..8fef4150fa 100644
--- a/manual/creature.texi
+++ b/manual/creature.texi
@@ -257,6 +257,12 @@  various library functions.  If defined to @math{2}, even stricter
 checks are applied.
 @end defvr
 
+@defvr Macro _SC_SIGSTKSZ_SOURCE
+@standards{GNU, (none)}
+If this macro is defined, correct (but non compile-time constant)
+MINSIGSTKSZ and SIGSTKSZ are defined.
+@end defvr
+
 @defvr Macro _REENTRANT
 @defvrx Macro _THREAD_SAFE
 @standards{Obsolete, (none)}
diff --git a/posix/sysconf.c b/posix/sysconf.c
index ba303384c1..ca7833e6c4 100644
--- a/posix/sysconf.c
+++ b/posix/sysconf.c
@@ -266,6 +266,9 @@  __sysconf (int name)
     case _SC_XOPEN_REALTIME:
     case _SC_XOPEN_REALTIME_THREADS:
 
+    case _SC_MINSIGSTKSZ:
+    case _SC_SIGSTKSZ:
+
       break;
     }
 
diff --git a/signal/Makefile b/signal/Makefile
index 2ec3ddd74f..641d30582d 100644
--- a/signal/Makefile
+++ b/signal/Makefile
@@ -31,7 +31,8 @@  headers := signal.h sys/signal.h \
 	   bits/types/sigevent_t.h bits/types/siginfo_t.h \
 	   bits/types/sigset_t.h bits/types/sigval_t.h \
 	   bits/types/stack_t.h bits/types/struct_sigstack.h \
-	   bits/types/__sigval_t.h bits/signal_ext.h
+	   bits/types/__sigval_t.h bits/signal_ext.h \
+	   bits/sigstksz.h
 
 routines	:= signal raise killpg \
 		   sigaction sigprocmask kill \
@@ -48,7 +49,7 @@  routines	:= signal raise killpg \
 tests		:= tst-signal tst-sigset tst-sigsimple tst-raise tst-sigset2 \
 		   tst-sigwait-eintr tst-sigaction \
 		   tst-minsigstksz-1 tst-minsigstksz-2 tst-minsigstksz-3 \
-		   tst-minsigstksz-3a tst-minsigstksz-4 \
+		   tst-minsigstksz-3a tst-minsigstksz-4 tst-minsigstksz-5 \
 		   tst-sigisemptyset
 
 include ../Rules
diff --git a/signal/signal.h b/signal/signal.h
index effe3d698f..0311eb2a66 100644
--- a/signal/signal.h
+++ b/signal/signal.h
@@ -312,6 +312,7 @@  extern int siginterrupt (int __sig, int __interrupt) __THROW
   __attribute_deprecated_msg__ ("Use sigaction with SA_RESTART instead");
 
 # include <bits/sigstack.h>
+# include <bits/sigstksz.h>
 # include <bits/ss_flags.h>
 
 /* Alternate signal handler stack interface.
diff --git a/signal/tst-minsigstksz-5.c b/signal/tst-minsigstksz-5.c
new file mode 100644
index 0000000000..d657d2f4e6
--- /dev/null
+++ b/signal/tst-minsigstksz-5.c
@@ -0,0 +1,84 @@ 
+/* Test of signal delivery on an alternate stack with MINSIGSTKSZ size.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <signal.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <support/check.h>
+#include <support/support.h>
+
+static volatile sig_atomic_t handler_run;
+
+static void
+handler (int signo)
+{
+  /* Clear a bit of on-stack memory.  */
+  volatile char buffer[256];
+  for (size_t i = 0; i < sizeof (buffer); ++i)
+    buffer[i] = 0;
+  handler_run = 1;
+}
+
+int
+do_test (void)
+{
+  size_t stack_buffer_size = 64 * 1024 * 1024;
+  void *stack_buffer = xmalloc (stack_buffer_size);
+  void *stack_end = stack_buffer + stack_buffer_size;
+  memset (stack_buffer, 0xCC, stack_buffer_size);
+
+  void *stack_bottom = stack_buffer + (stack_buffer_size + MINSIGSTKSZ) / 2;
+  void *stack_top = stack_bottom + MINSIGSTKSZ;
+  stack_t stack =
+    {
+      .ss_sp = stack_bottom,
+      .ss_size = MINSIGSTKSZ,
+    };
+  if (sigaltstack (&stack, NULL) < 0)
+    FAIL_RET ("sigaltstack: %m\n");
+
+  struct sigaction act =
+    {
+      .sa_handler = handler,
+      .sa_flags = SA_ONSTACK,
+    };
+  if (sigaction (SIGUSR1, &act, NULL) < 0)
+    FAIL_RET ("sigaction: %m\n");
+
+  if (kill (getpid (), SIGUSR1) < 0)
+    FAIL_RET ("kill: %m\n");
+
+  if (handler_run != 1)
+    FAIL_RET ("handler did not run\n");
+
+  for (void *p = stack_buffer; p < stack_bottom; ++p)
+    if (*(unsigned char *) p != 0xCC)
+      FAIL_RET ("changed byte %ld bytes below configured stack\n",
+		(long) (stack_bottom - p));
+  for (void *p = stack_top; p < stack_end; ++p)
+    if (*(unsigned char *) p != 0xCC)
+      FAIL_RET ("changed byte %ld bytes above configured stack\n",
+		(long) (p - stack_top));
+
+  free (stack_buffer);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/generic/ldsodefs.h b/sysdeps/generic/ldsodefs.h
index 382eeb9be0..5ce847aeac 100644
--- a/sysdeps/generic/ldsodefs.h
+++ b/sysdeps/generic/ldsodefs.h
@@ -528,6 +528,9 @@  struct rtld_global_ro
   /* Cached value of `getpagesize ()'.  */
   EXTERN size_t _dl_pagesize;
 
+  /* Cached value of `sysconf (_SC_MINSIGSTKSZ)'.  */
+  EXTERN size_t _dl_minsigstacksize;
+
   /* Do we read from ld.so.cache?  */
   EXTERN int _dl_inhibit_cache;
 
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 09604e128b..b51a02a6e6 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -190,6 +190,9 @@  sysdep_routines += ntp_gettime ntp_gettimex
 endif
 
 ifeq ($(subdir),signal)
+# Compile tst-minsigstksz-5.c with _SC_SIGSTKSZ_SOURCE.
+CFLAGS-tst-minsigstksz-5.c += -D_SC_SIGSTKSZ_SOURCE
+
 tests-special += $(objpfx)tst-signal-numbers.out
 # Depending on signal.o* is a hack.  What we actually want is a dependency
 # on signal.h and everything it includes.  That's impractical to write
@@ -228,6 +231,11 @@  ifeq ($(subdir),sunrpc)
 sysdep_headers += nfs/nfs.h
 endif
 
+ifeq ($(subdir),support)
+# Compile xsigstack.c with _SC_SIGSTKSZ_SOURCE.
+CFLAGS-xsigstack.c += -D_SC_SIGSTKSZ_SOURCE
+endif
+
 ifeq ($(subdir),termios)
 sysdep_headers += termio.h
 endif
diff --git a/sysdeps/unix/sysv/linux/bits/sigstksz.h b/sysdeps/unix/sysv/linux/bits/sigstksz.h
new file mode 100644
index 0000000000..cd5b3cc895
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/bits/sigstksz.h
@@ -0,0 +1,33 @@ 
+/* Definition of MINSIGSTKSZ and SIGSTKSZ.  Linux version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SIGNAL_H
+# error "Never include <bits/sigstksz.h> directly; use <signal.h> instead."
+#endif
+
+#if __USE_SC_SIGSTKSZ
+# include <unistd.h>
+
+/* Default stack size for a signal handler: sysconf (_SC_SIGSTKSZ).  */
+# undef SIGSTKSZ
+# define SIGSTKSZ sysconf (_SC_SIGSTKSZ)
+
+/* Minimum stack size for a signal handler: SIGSTKSZ.  */
+# undef MINSIGSTKSZ
+# define MINSIGSTKSZ SIGSTKSZ
+#endif
diff --git a/sysdeps/unix/sysv/linux/ia64/sysconf-sigstksz.h b/sysdeps/unix/sysv/linux/ia64/sysconf-sigstksz.h
new file mode 100644
index 0000000000..7e5ceba151
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/ia64/sysconf-sigstksz.h
@@ -0,0 +1,27 @@ 
+/* sysconf_sigstksz ().  Linux/ia64 version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* Return sysconf (_SC_SIGSTKSZ).  */
+
+static long int
+sysconf_sigstksz (void)
+{
+  _Static_assert (__builtin_constant_p (SIGSTKSZ),
+		  "SIGSTKSZ is constant");
+  return SIGSTKSZ;
+}
diff --git a/sysdeps/unix/sysv/linux/sysconf-sigstksz.h b/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
new file mode 100644
index 0000000000..64d450b22c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/sysconf-sigstksz.h
@@ -0,0 +1,38 @@ 
+/* sysconf_sigstksz ().  Linux version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* Return sysconf (_SC_SIGSTKSZ).  */
+
+static long int
+sysconf_sigstksz (void)
+{
+  long int minsigstacksize = GLRO(dl_minsigstacksize);
+  assert (minsigstacksize != 0);
+  _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
+		  "MINSIGSTKSZ is constant");
+  if (minsigstacksize < MINSIGSTKSZ)
+    minsigstacksize = MINSIGSTKSZ;
+  /* MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4.  */
+  long int sigstacksize = minsigstacksize * 4;
+  /* Return MAX (SIGSTKSZ, sigstacksize).  */
+  _Static_assert (__builtin_constant_p (SIGSTKSZ),
+		  "SIGSTKSZ is constant");
+  if (sigstacksize < SIGSTKSZ)
+    sigstacksize = SIGSTKSZ;
+  return sigstacksize;
+}
diff --git a/sysdeps/unix/sysv/linux/sysconf.c b/sysdeps/unix/sysv/linux/sysconf.c
index 7958a74164..d1d86df3fb 100644
--- a/sysdeps/unix/sysv/linux/sysconf.c
+++ b/sysdeps/unix/sysv/linux/sysconf.c
@@ -16,6 +16,7 @@ 
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
+#include <assert.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <stdlib.h>
@@ -26,6 +27,7 @@ 
 #include <sys/param.h>
 #include <not-cancel.h>
 #include <ldsodefs.h>
+#include <sysconf-sigstksz.h>
 
 /* Legacy value of ARG_MAX.  The macro is now not defined since the
    actual value varies based on the stack size.  */
@@ -75,6 +77,13 @@  __sysconf (int name)
       }
       break;
 
+    case _SC_MINSIGSTKSZ:
+      assert (GLRO(dl_minsigstacksize) != 0);
+      return GLRO(dl_minsigstacksize);
+
+    case _SC_SIGSTKSZ:
+      return sysconf_sigstksz ();
+
     default:
       break;
     }
diff --git a/sysdeps/unix/sysv/linux/x86/dl-minsigstacksize.h b/sysdeps/unix/sysv/linux/x86/dl-minsigstacksize.h
new file mode 100644
index 0000000000..6088bbc99e
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86/dl-minsigstacksize.h
@@ -0,0 +1,83 @@ 
+/* Emulate AT_MINSIGSTKSZ.  Linux/x86 version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* Emulate AT_MINSIGSTKSZ with XSAVE. */
+
+static inline void
+dl_check_minsigstacksize (const struct cpu_features *cpu_features)
+{
+  /* Return if AT_MINSIGSTKSZ is provided by the kernel.  */
+  if (GLRO(dl_minsigstacksize) != 0)
+    return;
+
+  if (cpu_features->basic.max_cpuid >= 0xd
+      && CPU_FEATURES_CPU_P (cpu_features, OSXSAVE))
+    {
+      /* Emulate AT_MINSIGSTKSZ.  In the Linux kernel, the signal frame data
+	 with XSAVE is composed of the following areas and laid out as:
+	 ------------------------------
+	 | alignment padding          |
+	 ------------------------------
+	 | xsave buffer               |
+	 ------------------------------
+	 | fsave header (32-bit only) |
+	 ------------------------------
+	 | siginfo + ucontext         |
+	 ------------------------------
+	 */
+
+      unsigned int sigframe_size;
+
+#ifdef __x86_64__
+      /* NB: sizeof(struct rt_sigframe) + 8-byte return address in Linux
+	 kernel.  */
+      sigframe_size = 440 + 8;
+#else
+      /* NB: sizeof(struct sigframe_ia32) + sizeof(struct fregs_state) +
+	 4-byte return address + 3 * 4-byte arguments in Linux kernel.  */
+      sigframe_size = 736 + 112 + 4 + 3 * 4;
+#endif
+
+      /* Add 15 bytes to align the stack to 16 bytes.  */
+      sigframe_size += 15;
+
+      /* Make the space before xsave buffer a multiple of 16 bytes.  */
+      sigframe_size = ALIGN_UP (sigframe_size, 16);
+
+      /* Add (64 - 16)-byte padding to align xsave buffer at 64 bytes.  */
+      sigframe_size += 64 - 16;
+
+      unsigned int eax, ebx, ecx, edx;
+      __cpuid_count (0xd, 0, eax, ebx, ecx, edx);
+
+      /* Add the size of xsave buffer.  */
+      sigframe_size += ebx;
+
+      /* Add the size of FP_XSTATE_MAGIC2.  */
+#define FP_XSTATE_MAGIC2 0x46505845U
+      sigframe_size += sizeof (FP_XSTATE_MAGIC2);
+
+      GLRO(dl_minsigstacksize) = sigframe_size;
+    }
+  else
+    {
+      /* NB: Default to a constant MINSIGSTKSZ.  */
+      _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
+		      "MINSIGSTKSZ is constant");
+      GLRO(dl_minsigstacksize) = MINSIGSTKSZ;
+    }
+}
diff --git a/sysdeps/unix/sysv/linux/x86/include/bits/sigstack.h b/sysdeps/unix/sysv/linux/x86/include/bits/sigstack.h
new file mode 100644
index 0000000000..208754c497
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/x86/include/bits/sigstack.h
@@ -0,0 +1,5 @@ 
+#include_next <bits/sigstack.h>
+
+#ifndef _ISOMAC
+# define CONSTANT_MINSIGSTKSZ 0
+#endif
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index f26deba38d..bef6084f5e 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -19,6 +19,7 @@ 
 #include <cpuid.h>
 #include <dl-hwcap.h>
 #include <libc-pointer-arith.h>
+#include <dl-minsigstacksize.h>
 #if IS_IN (libc) && !defined SHARED
 # include <assert.h>
 # include <unistd.h>
@@ -367,6 +368,8 @@  get_common_indices (struct cpu_features *cpu_features,
 		   cpu_features->features[COMMON_CPUID_INDEX_19].cpuid.ebx,
 		   cpu_features->features[COMMON_CPUID_INDEX_19].cpuid.ecx,
 		   cpu_features->features[COMMON_CPUID_INDEX_19].cpuid.edx);
+
+  dl_check_minsigstacksize (cpu_features);
 }
 
 _Static_assert (((index_arch_Fast_Unaligned_Load
diff --git a/sysdeps/x86/dl-minsigstacksize.h b/sysdeps/x86/dl-minsigstacksize.h
new file mode 100644
index 0000000000..959871c970
--- /dev/null
+++ b/sysdeps/x86/dl-minsigstacksize.h
@@ -0,0 +1,27 @@ 
+/* Emulate AT_MINSIGSTKSZ.  Generic x86 version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* Default AT_MINSIGSTKSZ to the constant MINSIGSTKSZ.  */
+
+static inline void
+dl_check_minsigstacksize (const struct cpu_features *cpu_features)
+{
+  /* NB: Default to a constant MINSIGSTKSZ.  */
+  _Static_assert (__builtin_constant_p (MINSIGSTKSZ),
+		  "MINSIGSTKSZ is constant");
+  GLRO(dl_minsigstacksize) = MINSIGSTKSZ;
+}
-- 
2.26.2
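
For reviewers who want to check the arithmetic by hand, here is a minimal standalone sketch of the two computations in this patch. It is not part of the patch. The helper names (align_up, estimate_sigframe_size, estimate_sigstksz) are hypothetical, the xsave_size parameter stands in for the EBX value that CPUID leaf 0xd, subleaf 0 would return, and the MINSIGSTKSZ/SIGSTKSZ constants are passed in as arguments rather than taken from the headers.

```c
#include <stddef.h>

/* Round VALUE up to the next multiple of ALIGN, which must be a power
   of two.  Stand-in for glibc's ALIGN_UP.  */
static unsigned int
align_up (unsigned int value, unsigned int align)
{
  return (value + align - 1) & ~(align - 1);
}

/* Hypothetical helper mirroring the arithmetic in
   dl_check_minsigstacksize.  XSAVE_SIZE is an assumed input that
   replaces the EBX result of CPUID leaf 0xd, subleaf 0.  */
static unsigned int
estimate_sigframe_size (unsigned int xsave_size, int is_x86_64)
{
  /* Fixed part of the frame, using the constants from the patch.  */
  unsigned int size = is_x86_64 ? 440 + 8 : 736 + 112 + 4 + 3 * 4;

  /* Worst-case padding to align the stack to 16 bytes.  */
  size += 15;

  /* Make the space before the xsave buffer a multiple of 16 bytes.  */
  size = align_up (size, 16);

  /* Padding to align the xsave buffer at 64 bytes.  */
  size += 64 - 16;

  /* The xsave buffer itself, then the 4-byte FP_XSTATE_MAGIC2.  */
  size += xsave_size;
  size += 4;

  return size;
}

/* Hypothetical helper mirroring sysconf_sigstksz:
   MAX (SIGSTKSZ, MAX (MINSIGSTKSZ, dynamic minimum) * 4).  */
static long int
estimate_sigstksz (long int dynamic_min, long int const_min,
                   long int const_sigstksz)
{
  long int min = dynamic_min < const_min ? const_min : dynamic_min;
  long int size = min * 4;
  return size < const_sigstksz ? const_sigstksz : size;
}
```

With an assumed xsave size of 2696 bytes on x86-64, estimate_sigframe_size (2696, 1) gives 448 + 15 = 463, rounded up to 464, plus 48, plus 2696, plus 4 = 3212. Feeding that into estimate_sigstksz (3212, 2048, 8192) gives 12848, while a dynamic minimum below the constant (for example 1092) falls back to 8192.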