Message ID | 20200929205746.6763-1-chang.seok.bae@intel.com |
---|---|
Headers |
Return-Path: <libc-alpha-bounces@sourceware.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A4A303987831; Tue, 29 Sep 2020 21:01:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A4A303987831 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1601413289; bh=0rfjm2zHeBe6gi3MJBU/ClvPmZjAXxMpjKrPS9r/1t8=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=CePBj4Qm+Y1osyrn/SolF3U9IpjEYrWK96ZLX/tT8b9/M8EtqviDIBQxmR+5sLEyA mGlslVYpVQmuB921oWO0IMAUaFXPFDHBmW5QTyTYLKpVFH7PB4BavbWFlaimDJZUUg 7f679kTcnuv+GhapdEyekG83iW+Fc21+dWDtzELs= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by sourceware.org (Postfix) with ESMTPS id E62DD39730B5 for <libc-alpha@sourceware.org>; Tue, 29 Sep 2020 21:01:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org E62DD39730B5 IronPort-SDR: Woqg1kL1HbZd5UZ/tMh5984GKAOgvGxcViFGc1eAhY1MKjk4camN/ZRx2wMSWy8F9othQL/c+N Ex0EkC4EHuEA== X-IronPort-AV: E=McAfee;i="6000,8403,9759"; a="223888246" X-IronPort-AV: E=Sophos;i="5.77,319,1596524400"; d="scan'208";a="223888246" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Sep 2020 14:01:22 -0700 IronPort-SDR: bFn8JtJ42VossFrlSUkzKDjyua51j/jF2FeiiAoBoXvDILU7PlBemmWR34TyuRuKfxE1NraZ6/ TvBROEA/0eLg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.77,319,1596524400"; d="scan'208";a="514024807" Received: from chang-linux-3.sc.intel.com ([172.25.66.175]) by fmsmga006.fm.intel.com with ESMTP; 29 Sep 2020 14:01:22 -0700 To: tglx@linutronix.de, mingo@kernel.org, bp@suse.de, luto@kernel.org, x86@kernel.org Subject: [RFC PATCH 0/4] x86: Improve Minimum Alternate Stack Size Date: Tue, 29 Sep 2020 13:57:42 -0700 Message-Id: <20200929205746.6763-1-chang.seok.bae@intel.com> X-Mailer: git-send-email 2.17.1 X-Spam-Status: No, score=-2.9 required=5.0 tests=AC_FROM_MANY_DOTS, BAYES_00, KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org> List-Unsubscribe: <https://sourceware.org/mailman/options/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe> List-Archive: <https://sourceware.org/pipermail/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help> List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=subscribe> From: "Chang S. Bae via Libc-alpha" <libc-alpha@sourceware.org> Reply-To: "Chang S. Bae" <chang.seok.bae@intel.com> Cc: linux-arch@vger.kernel.org, len.brown@intel.com, tony.luck@intel.com, libc-alpha@sourceware.org, ravi.v.shankar@intel.com, chang.seok.bae@intel.com, linux-kernel@vger.kernel.org, dave.hansen@intel.com, linux-api@vger.kernel.org, Dave.Martin@arm.com Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" <libc-alpha-bounces@sourceware.org> |
Series |
x86: Improve Minimum Alternate Stack Size
|
|
Message
Chang S. Bae
Sept. 29, 2020, 8:57 p.m. UTC
During signal entry, the kernel pushes data onto the normal userspace stack. On x86, the data pushed onto the user stack includes XSAVE state, which has grown over time as new features and larger registers have been added to the architecture. MINSIGSTKSZ is a constant provided in the kernel signal.h headers and typically distributed in lib-dev(el) packages, e.g. [1]. Its value is compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ constant indicates to userspace how much data the kernel expects to push on the user stack, [2][3]. However, this constant is much too small and does not reflect recent additions to the architecture. For instance, when AVX-512 states are in use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can cause user stack overflow when delivering a signal. In this series, we suggest a couple of things: 1. Provide a variable minimum stack size to userspace, as a similar approach to [5] 2. Avoid using a too-small alternate stack [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=b9dca794da093dc4d41d39db9851d444e1b54d9b;hb=HEAD [2]: https://www.gnu.org/software/libc/manual/html_node/Signal-Stack.html [3]: https://man7.org/linux/man-pages/man2/sigaltstack.2.html [4]: https://bugzilla.kernel.org/show_bug.cgi?id=153531 [5]: https://blog.linuxplumbersconf.org/2017/ocw/system/presentations/4671/original/plumbers-dm-2017.pdf Chang S. Bae (4): x86/signal: Introduce helpers to get the maximum signal frame size x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ x86/signal: Prevent an alternate stack overflow before a signal delivery selftest/x86/signal: Include test cases for validating sigaltstack arch/x86/ia32/ia32_signal.c | 11 +- arch/x86/include/asm/elf.h | 4 + arch/x86/include/asm/fpu/signal.h | 2 + arch/x86/include/asm/sigframe.h | 25 +++++ arch/x86/include/uapi/asm/auxvec.h | 6 +- arch/x86/kernel/cpu/common.c | 3 + arch/x86/kernel/fpu/signal.c | 20 ++++ arch/x86/kernel/signal.c | 66 +++++++++++- tools/testing/selftests/x86/Makefile | 2 +- tools/testing/selftests/x86/sigaltstack.c | 126 ++++++++++++++++++++++ 10 files changed, 258 insertions(+), 7 deletions(-) create mode 100644 tools/testing/selftests/x86/sigaltstack.c -- 2.17.1
Comments
On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > During signal entry, the kernel pushes data onto the normal userspace > stack. On x86, the data pushed onto the user stack includes XSAVE state, > which has grown over time as new features and larger registers have been > added to the architecture. > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > constant indicates to userspace how much data the kernel expects to push on > the user stack, [2][3]. > > However, this constant is much too small and does not reflect recent > additions to the architecture. For instance, when AVX-512 states are in > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > cause user stack overflow when delivering a signal. > > In this series, we suggest a couple of things: > 1. Provide a variable minimum stack size to userspace, as a similar > approach to [5] > 2. Avoid using a too-small alternate stack I can't comment on the x86 specifics, but the approach followed in this series does seem consistent with the way arm64 populates AT_MINSIGSTKSZ. I need to dig up my glibc hacks for providing a sysconf interface to this... Cheers ---Dave > > [1]: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/bits/sigstack.h;h=b9dca794da093dc4d41d39db9851d444e1b54d9b;hb=HEAD > [2]: https://www.gnu.org/software/libc/manual/html_node/Signal-Stack.html > [3]: https://man7.org/linux/man-pages/man2/sigaltstack.2.html > [4]: https://bugzilla.kernel.org/show_bug.cgi?id=153531 > [5]: https://blog.linuxplumbersconf.org/2017/ocw/system/presentations/4671/original/plumbers-dm-2017.pdf > > Chang S. Bae (4): > x86/signal: Introduce helpers to get the maximum signal frame size > x86/elf: Support a new ELF aux vector AT_MINSIGSTKSZ > x86/signal: Prevent an alternate stack overflow before a signal > delivery > selftest/x86/signal: Include test cases for validating sigaltstack > > arch/x86/ia32/ia32_signal.c | 11 +- > arch/x86/include/asm/elf.h | 4 + > arch/x86/include/asm/fpu/signal.h | 2 + > arch/x86/include/asm/sigframe.h | 25 +++++ > arch/x86/include/uapi/asm/auxvec.h | 6 +- > arch/x86/kernel/cpu/common.c | 3 + > arch/x86/kernel/fpu/signal.c | 20 ++++ > arch/x86/kernel/signal.c | 66 +++++++++++- > tools/testing/selftests/x86/Makefile | 2 +- > tools/testing/selftests/x86/sigaltstack.c | 126 ++++++++++++++++++++++ > 10 files changed, 258 insertions(+), 7 deletions(-) > create mode 100644 tools/testing/selftests/x86/sigaltstack.c > > -- > 2.17.1 >
On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > During signal entry, the kernel pushes data onto the normal userspace > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > which has grown over time as new features and larger registers have been > > added to the architecture. > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > constant indicates to userspace how much data the kernel expects to push on > > the user stack, [2][3]. > > > > However, this constant is much too small and does not reflect recent > > additions to the architecture. For instance, when AVX-512 states are in > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > cause user stack overflow when delivering a signal. > > > > In this series, we suggest a couple of things: > > 1. Provide a variable minimum stack size to userspace, as a similar > > approach to [5] > > 2. Avoid using a too-small alternate stack > > I can't comment on the x86 specifics, but the approach followed in this > series does seem consistent with the way arm64 populates > AT_MINSIGSTKSZ. > > I need to dig up my glibc hacks for providing a sysconf interface to > this... Here is my proposal for glibc: https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE is in use.
On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > During signal entry, the kernel pushes data onto the normal userspace > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > which has grown over time as new features and larger registers have been > > > added to the architecture. > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > constant indicates to userspace how much data the kernel expects to push on > > > the user stack, [2][3]. > > > > > > However, this constant is much too small and does not reflect recent > > > additions to the architecture. For instance, when AVX-512 states are in > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > cause user stack overflow when delivering a signal. > > > > > > In this series, we suggest a couple of things: > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > approach to [5] > > > 2. Avoid using a too-small alternate stack > > > > I can't comment on the x86 specifics, but the approach followed in this > > series does seem consistent with the way arm64 populates > > AT_MINSIGSTKSZ. > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > this... > > Here is my proposal for glibc: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html Thanks for the link. Are there patches yet? I already had some hacks in the works, but I can drop them if there's something already out there. > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. Can we do this? IIUC, this is an ABI break and carries the risk of buffer overruns. The reason for not simply increasing the kernel's MINSIGSTKSZ #define (apart from the fact that it is rarely used, due to glibc's shadowing definitions) was that userspace binaries will have baked in the old value of the constant and may be making assumptions about it. For example, the type (char [MINSIGSTKSZ]) changes if this #define changes. This could be a problem if an newly built library tries to memcpy() or dump such an object defined by and old binary. Bounds-checking and the stack sizes passed to things like sigaltstack() and makecontext() could similarly go wrong. > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the discovery method is changing. The meaning of the value is exactly the same as before. If we are going to rename it though, it could make sense to go for something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". The trouble with including "STKSZ" is that is sounds like a recommendation for your stack size. While the signal frame size is relevant to picking a stack size, it's not the only thing to consider. Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept of a "recommended stack size" be abandoned? glibc can at least make a slightly more informed guess about suitable stack sizes than the kernel (and glibc already has to guess anyway, in order to determine the default thread stack size). > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > is in use. Great if we can do it. I was concerned that this might be controversial. Would this just be a recommendation, or can we enforce it somehow? Cheers ---Dave
On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > which has grown over time as new features and larger registers have been > > > > added to the architecture. > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > constant indicates to userspace how much data the kernel expects to push on > > > > the user stack, [2][3]. > > > > > > > > However, this constant is much too small and does not reflect recent > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > cause user stack overflow when delivering a signal. > > > > > > > > In this series, we suggest a couple of things: > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > approach to [5] > > > > 2. Avoid using a too-small alternate stack > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > series does seem consistent with the way arm64 populates > > > AT_MINSIGSTKSZ. > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > this... > > > > Here is my proposal for glibc: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > Thanks for the link. > > Are there patches yet? I already had some hacks in the works, but I can > drop them if there's something already out there. I am working on it. > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > Can we do this? IIUC, this is an ABI break and carries the risk of > buffer overruns. > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > (apart from the fact that it is rarely used, due to glibc's shadowing > definitions) was that userspace binaries will have baked in the old > value of the constant and may be making assumptions about it. > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > changes. This could be a problem if an newly built library tries to > memcpy() or dump such an object defined by and old binary. > Bounds-checking and the stack sizes passed to things like sigaltstack() > and makecontext() could similarly go wrong. With my original proposal: https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html char [MINSIGSTKSZ] won't compile. The feedback is to increase the constants: https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > discovery method is changing. The meaning of the value is exactly the > same as before. > > If we are going to rename it though, it could make sense to go for > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > The trouble with including "STKSZ" is that is sounds like a > recommendation for your stack size. While the signal frame size is > relevant to picking a stack size, it's not the only thing to > consider. The problem is that AT_MINSIGSTKSZ is the signal frame size used by kernel. The minimum stack size for a signal handler is more likely AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal frame size used by kernel + 6KB for user application. > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > of a "recommended stack size" be abandoned? glibc can at least make a > slightly more informed guess about suitable stack sizes than the kernel > (and glibc already has to guess anyway, in order to determine the > default thread stack size). Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't available. > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > is in use. > > Great if we can do it. I was concerned that this might be > controversial. > > Would this just be a recommendation, or can we enforce it somehow? It is just an idea. We need to move away from constant SIGSTKSZ and MINSIGSTKSZ.
On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > which has grown over time as new features and larger registers have been > > > > > added to the architecture. > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > the user stack, [2][3]. > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > In this series, we suggest a couple of things: > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > approach to [5] > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > series does seem consistent with the way arm64 populates > > > > AT_MINSIGSTKSZ. > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > this... > > > > > > Here is my proposal for glibc: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > Thanks for the link. > > > > Are there patches yet? I already had some hacks in the works, but I can > > drop them if there's something already out there. > > I am working on it. > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > buffer overruns. > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > (apart from the fact that it is rarely used, due to glibc's shadowing > > definitions) was that userspace binaries will have baked in the old > > value of the constant and may be making assumptions about it. > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > changes. This could be a problem if an newly built library tries to > > memcpy() or dump such an object defined by and old binary. > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > and makecontext() could similarly go wrong. > > With my original proposal: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > constants: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > discovery method is changing. The meaning of the value is exactly the > > same as before. > > > > If we are going to rename it though, it could make sense to go for > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > The trouble with including "STKSZ" is that is sounds like a > > recommendation for your stack size. While the signal frame size is > > relevant to picking a stack size, it's not the only thing to > > consider. > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > kernel. The minimum stack size for a signal handler is more likely > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > frame size used by kernel + 6KB for user application. > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > of a "recommended stack size" be abandoned? glibc can at least make a > > slightly more informed guess about suitable stack sizes than the kernel > > (and glibc already has to guess anyway, in order to determine the > > default thread stack size). > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > available. > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > is in use. > > > > Great if we can do it. I was concerned that this might be > > controversial. > > > > Would this just be a recommendation, or can we enforce it somehow? > > It is just an idea. We need to move away from constant SIGSTKSZ and > MINSIGSTKSZ. > Here is the glibc patch: https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ AT_MINSIGSTKSZ should return the signal frame size used by kernel + 6KB for user application.
On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > which has grown over time as new features and larger registers have been > > > > > added to the architecture. > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > the user stack, [2][3]. > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > In this series, we suggest a couple of things: > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > approach to [5] > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > series does seem consistent with the way arm64 populates > > > > AT_MINSIGSTKSZ. > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > this... > > > > > > Here is my proposal for glibc: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > Thanks for the link. > > > > Are there patches yet? I already had some hacks in the works, but I can > > drop them if there's something already out there. > > I am working on it. OK. I may post something for discussion, but I'm happy for it to be superseded by someone (i.e., other than me) who actually knows what they're doing... > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > buffer overruns. > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > (apart from the fact that it is rarely used, due to glibc's shadowing > > definitions) was that userspace binaries will have baked in the old > > value of the constant and may be making assumptions about it. > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > changes. This could be a problem if an newly built library tries to > > memcpy() or dump such an object defined by and old binary. > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > and makecontext() could similarly go wrong. > > With my original proposal: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > constants: > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html Ah, I see. But both still API and ABI breaks; moreover, declaraing an array with size based on (MIN)SIGSTKSZ is not just reasonable, but the obvious thing to do with this constant in many simple cases. Such usage is widespread, see: * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1 Your two approaches seem to trade off two different sources of buffer overruns: undersized stacks versus ABI breaks across library boundaries. Since undersized stack is by far the more familiar problem and we at least have guard regions to help detect overruns, I'd vote to keep MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. Or are people reporting real stack overruns on x86 today? For arm64, we made large vectors on SVE opt-in, so that oversized signal frames are not seen by default. Would somethine similar be feasible on x86? > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > discovery method is changing. The meaning of the value is exactly the > > same as before. > > > > If we are going to rename it though, it could make sense to go for > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > The trouble with including "STKSZ" is that is sounds like a > > recommendation for your stack size. While the signal frame size is > > relevant to picking a stack size, it's not the only thing to > > consider. > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > kernel. The minimum stack size for a signal handler is more likely > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > frame size used by kernel + 6KB for user application. Ack; to be correct, you also need to take into account which signals may be unmasked while running on this stack, and the stack requirements of all their handlers. Unfortunately, that's hard :( What's your view on my naming suggesions? > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > of a "recommended stack size" be abandoned? glibc can at least make a > > slightly more informed guess about suitable stack sizes than the kernel > > (and glibc already has to guess anyway, in order to determine the > > default thread stack size). > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > available. In my code, I generate _SC_SIGSTKSZ as the equivalent of max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ) which is >= the legacy value, and broadly reperesentative of the relationship between MINSIGSTKSZ and SIGSTKSZ on most arches. What do you think? > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > is in use. > > > > Great if we can do it. I was concerned that this might be > > controversial. > > > > Would this just be a recommendation, or can we enforce it somehow? > > It is just an idea. We need to move away from constant SIGSTKSZ and > MINSIGSTKSZ. Totally agree with that. Cheers ---Dave
On 10/6/20 8:25 AM, Dave Martin wrote:
> Or are people reporting real stack overruns on x86 today?
We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state
mostly from AVX-512, and a 2048 byte MINSIGSTKSZ.
On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > > which has grown over time as new features and larger registers have been > > > > > > added to the architecture. > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > > the user stack, [2][3]. > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > > approach to [5] > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > > series does seem consistent with the way arm64 populates > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > this... > > > > > > > > Here is my proposal for glibc: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > Thanks for the link. > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > drop them if there's something already out there. > > > > I am working on it. > > OK. I may post something for discussion, but I'm happy for it to be > superseded by someone (i.e., other than me) who actually knows what > they're doing... Please see my previous email for my glibc patch: https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > buffer overruns. > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > definitions) was that userspace binaries will have baked in the old > > > value of the constant and may be making assumptions about it. > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > changes. This could be a problem if an newly built library tries to > > > memcpy() or dump such an object defined by and old binary. > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > and makecontext() could similarly go wrong. > > > > With my original proposal: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > constants: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > obvious thing to do with this constant in many simple cases. Such usage > is widespread, see: > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1 > > > Your two approaches seem to trade off two different sources of buffer > overruns: undersized stacks versus ABI breaks across library boundaries. We can't get everything we want. > Since undersized stack is by far the more familiar problem and we at > least have guard regions to help detect overruns, I'd vote to keep > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. Agree. > Or are people reporting real stack overruns on x86 today? I hope so. > > For arm64, we made large vectors on SVE opt-in, so that oversized signal > frames are not seen by default. Would somethine similar be feasible on > x86? > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > discovery method is changing. The meaning of the value is exactly the > > > same as before. > > > > > > If we are going to rename it though, it could make sense to go for > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > recommendation for your stack size. While the signal frame size is > > > relevant to picking a stack size, it's not the only thing to > > > consider. > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > kernel. The minimum stack size for a signal handler is more likely > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > frame size used by kernel + 6KB for user application. > > Ack; to be correct, you also need to take into account which signals may > be unmasked while running on this stack, and the stack requirements of > all their handlers. Unfortunately, that's hard :( > > What's your view on my naming suggesions? I used _SC_MINSIGSTKSZ: https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9 > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > slightly more informed guess about suitable stack sizes than the kernel > > > (and glibc already has to guess anyway, in order to determine the > > > default thread stack size). > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > available. > > In my code, I generate _SC_SIGSTKSZ as the equivalent of > > max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ) > > which is >= the legacy value, and broadly reperesentative of the > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches. > > > What do you think? sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases. > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > > is in use. > > > > > > Great if we can do it. I was concerned that this might be > > > controversial. > > > > > > Would this just be a recommendation, or can we enforce it somehow? > > > > It is just an idea. We need to move away from constant SIGSTKSZ and > > MINSIGSTKSZ. > > Totally agree with that. > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile if the source assumes constant SIGSTKSZ or MINSIGSTKSZ.
On Tue, Oct 06, 2020 at 08:18:03AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > > which has grown over time as new features and larger registers have been > > > > > > added to the architecture. > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > > the user stack, [2][3]. > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > > approach to [5] > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > > series does seem consistent with the way arm64 populates > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > this... > > > > > > > > Here is my proposal for glibc: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > Thanks for the link. > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > drop them if there's something already out there. > > > > I am working on it. > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > buffer overruns. > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > definitions) was that userspace binaries will have baked in the old > > > value of the constant and may be making assumptions about it. > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > changes. This could be a problem if an newly built library tries to > > > memcpy() or dump such an object defined by and old binary. > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > and makecontext() could similarly go wrong. > > > > With my original proposal: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > constants: > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > discovery method is changing. The meaning of the value is exactly the > > > same as before. > > > > > > If we are going to rename it though, it could make sense to go for > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > recommendation for your stack size. While the signal frame size is > > > relevant to picking a stack size, it's not the only thing to > > > consider. > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > kernel. The minimum stack size for a signal handler is more likely > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > frame size used by kernel + 6KB for user application. > > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > slightly more informed guess about suitable stack sizes than the kernel > > > (and glibc already has to guess anyway, in order to determine the > > > default thread stack size). > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > available. > > > > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > > is in use. > > > > > > Great if we can do it. I was concerned that this might be > > > controversial. > > > > > > Would this just be a recommendation, or can we enforce it somehow? > > > > It is just an idea. We need to move away from constant SIGSTKSZ and > > MINSIGSTKSZ. > > > > Here is the glibc patch: > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > AT_MINSIGSTKSZ should return the signal frame size used by kernel + 6KB > for user application. I'm not sure about the 6K here. We a few fundamental parameters: * the actual maximum size of the kernel-allocated signal frame (which we'll report via AT_MINSIGSTKSZ); * the size of additional userspace stack frame required to execute the minimal (i.e., empty) signal handler. (On AArch64, this is 0. In environments where the C lirbrary calls signal handlers through some sort of wrapper, this would need to include the wrapper's stack needs also); * additional userspace stack needs for the actual signal handler code. This is completely unknown. _SC_MINSIGSTKSZ (however named) should certainly include the first two, but I'm not sure about the third. It will at least be architecture- dependent. This is one reason why I still favor having more than one constant here: the fundamental system properties should be discoverable for software that knows how to calculate its own stack needs accurately. Since calculating stack needs is hard and most software doesn't bother to do it, we could also give a "recommended" stack size which incorporates a guess of typical handler stack needs (similarly to the legacy SIGSTKSZ constant), but I think that should be a separate parameter. Cheers ---Dave
On Tue, Oct 6, 2020 at 8:43 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Tue, Oct 06, 2020 at 08:18:03AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 5:12 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > > > which has grown over time as new features and larger registers have been > > > > > > > added to the architecture. > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > > > approach to [5] > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > > > series does seem consistent with the way arm64 populates > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > > this... > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > Thanks for the link. > > > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > > drop them if there's something already out there. > > > > > > I am working on it. > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > buffer overruns. > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > definitions) was that userspace binaries will have baked in the old > > > > value of the constant and may be making assumptions about it. > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > changes. This could be a problem if an newly built library tries to > > > > memcpy() or dump such an object defined by and old binary. > > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > > and makecontext() could similarly go wrong. > > > > > > With my original proposal: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > constants: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > > discovery method is changing. The meaning of the value is exactly the > > > > same as before. > > > > > > > > If we are going to rename it though, it could make sense to go for > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > > recommendation for your stack size. While the signal frame size is > > > > relevant to picking a stack size, it's not the only thing to > > > > consider. > > > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > > kernel. The minimum stack size for a signal handler is more likely > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > > frame size used by kernel + 6KB for user application. > > > > > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > > slightly more informed guess about suitable stack sizes than the kernel > > > > (and glibc already has to guess anyway, in order to determine the > > > > default thread stack size). > > > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > > available. > > > > > > > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > > > is in use. > > > > > > > > Great if we can do it. I was concerned that this might be > > > > controversial. > > > > > > > > Would this just be a recommendation, or can we enforce it somehow? > > > > > > It is just an idea. We need to move away from constant SIGSTKSZ and > > > MINSIGSTKSZ. > > > > > > > Here is the glibc patch: > > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > AT_MINSIGSTKSZ should return the signal frame size used by kernel + 6KB > > for user application. > > I'm not sure about the 6K here. 6KB is something I made up. > We a few fundamental parameters: > > * the actual maximum size of the kernel-allocated signal frame (which > we'll report via AT_MINSIGSTKSZ); Agree. > * the size of additional userspace stack frame required to execute the > minimal (i.e., empty) signal handler. (On AArch64, this is 0. In It is also 0 for x86. > environments where the C lirbrary calls signal handlers through some > sort of wrapper, this would need to include the wrapper's stack > needs also); > * additional userspace stack needs for the actual signal handler code. > This is completely unknown. That is 6KB I made up. > > _SC_MINSIGSTKSZ (however named) should certainly include the first two, > but I'm not sure about the third. It will at least be architecture- > dependent. > > > This is one reason why I still favor having more than one constant here: > the fundamental system properties should be discoverable for software > that knows how to calculate its own stack needs accurately. > > Since calculating stack needs is hard and most software doesn't bother > to do it, we could also give a "recommended" stack size which > incorporates a guess of typical handler stack needs (similarly to the > legacy SIGSTKSZ constant), but I think that should be a separate > parameter. Sounds reasonable. We can have _SC_MINSIGSTKSZ and _SC_SIGSTKSZ which is _SC_MINSIGSTKSZ + 6KB (or some other value).
On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > > > which has grown over time as new features and larger registers have been > > > > > > > added to the architecture. > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > > > approach to [5] > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > > > series does seem consistent with the way arm64 populates > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > > this... > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > Thanks for the link. > > > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > > drop them if there's something already out there. > > > > > > I am working on it. > > > > OK. I may post something for discussion, but I'm happy for it to be > > superseded by someone (i.e., other than me) who actually knows what > > they're doing... > > Please see my previous email for my glibc patch: > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > buffer overruns. > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > definitions) was that userspace binaries will have baked in the old > > > > value of the constant and may be making assumptions about it. > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > changes. This could be a problem if an newly built library tries to > > > > memcpy() or dump such an object defined by and old binary. > > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > > and makecontext() could similarly go wrong. > > > > > > With my original proposal: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > constants: > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > > obvious thing to do with this constant in many simple cases. Such usage > > is widespread, see: > > > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1 > > > > > > Your two approaches seem to trade off two different sources of buffer > > overruns: undersized stacks versus ABI breaks across library boundaries. > > We can't get everything we want. > > > Since undersized stack is by far the more familiar problem and we at > > least have guard regions to help detect overruns, I'd vote to keep > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. > > Agree. > > > Or are people reporting real stack overruns on x86 today? > > I hope so. > > > > > For arm64, we made large vectors on SVE opt-in, so that oversized signal > > frames are not seen by default. Would somethine similar be feasible on > > x86? > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > > discovery method is changing. The meaning of the value is exactly the > > > > same as before. > > > > > > > > If we are going to rename it though, it could make sense to go for > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > > recommendation for your stack size. While the signal frame size is > > > > relevant to picking a stack size, it's not the only thing to > > > > consider. > > > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > > kernel. The minimum stack size for a signal handler is more likely > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > > frame size used by kernel + 6KB for user application. > > > > Ack; to be correct, you also need to take into account which signals may > > be unmasked while running on this stack, and the stack requirements of > > all their handlers. Unfortunately, that's hard :( > > > > What's your view on my naming suggesions? > > I used _SC_MINSIGSTKSZ: > > https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9 Apologies, I missed that. Otherwise, the changes look much as I would expect, except for the "6K for user program" thing. This is strictly not included in the legacy MINSIGSTKSZ. > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > > slightly more informed guess about suitable stack sizes than the kernel > > > > (and glibc already has to guess anyway, in order to determine the > > > > default thread stack size). > > > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > > available. > > > > In my code, I generate _SC_SIGSTKSZ as the equivalent of > > > > max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ) > > > > which is >= the legacy value, and broadly reperesentative of the > > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches. > > > > > > What do you think? > > sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases. Why, though? MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever. Software that calculates its own needs to know the actual system values, not estimates based on guesses about how much stack a typical program might need if it were recompiled for x86. This doesn't mean we can't have a generic suggested value that's suitable for common scenarios (like SIGSTKSZ), but if we do then I think it should be a separate constant. > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > > > is in use. > > > > > > > > Great if we can do it. I was concerned that this might be > > > > controversial. > > > > > > > > Would this just be a recommendation, or can we enforce it somehow? > > > > > > It is just an idea. We need to move away from constant SIGSTKSZ and > > > MINSIGSTKSZ. > > > > Totally agree with that. > > > > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile > if the source assumes constant SIGSTKSZ or MINSIGSTKSZ. Ah yes, I see. That's a sensible precaution. Is it accepted in general that defining different feature test macros can lead to ABI incompatibilities? I have thought that building a shared library with _GNU_SOURCE (say) doesn't mean that a program that loads that library must also be built with _GNU_SOURCE. For one thing, that's hard to police. However, there are already combinations that could break, e.g., mixing -D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if this define changes off_t. So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an acceptable compromise. Interfaces that depend on the value of MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice -- I don't know of a specific example. Cheers ---Dave
On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: > On 10/6/20 8:25 AM, Dave Martin wrote: > > Or are people reporting real stack overruns on x86 today? > > We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state > mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. Right. Out of interest, do you believe that's a direct consequence of the larger kernel-generated signal frame, or does the expansion of userspace stack frames play a role too? In practice software just assumes SIGSTKSZ and then ignores the problem until / unless an actual stack overflow is seen. There's probably a lot of software out there whose stack is theoretically too small even without AVX-512 etc. in the mix, especially when considering the possibility of nested signals... Cheers ---Dave
On Tue, Oct 6, 2020 at 9:55 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > > > > which has grown over time as new features and larger registers have been > > > > > > > > added to the architecture. > > > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > > > > approach to [5] > > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > > > > series does seem consistent with the way arm64 populates > > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > > > this... > > > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > > > Thanks for the link. > > > > > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > > > drop them if there's something already out there. > > > > > > > > I am working on it. > > > > > > OK. I may post something for discussion, but I'm happy for it to be > > > superseded by someone (i.e., other than me) who actually knows what > > > they're doing... > > > > Please see my previous email for my glibc patch: > > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > > buffer overruns. > > > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > > definitions) was that userspace binaries will have baked in the old > > > > > value of the constant and may be making assumptions about it. > > > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > > changes. This could be a problem if an newly built library tries to > > > > > memcpy() or dump such an object defined by and old binary. > > > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > > > and makecontext() could similarly go wrong. > > > > > > > > With my original proposal: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > > constants: > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > > > obvious thing to do with this constant in many simple cases. Such usage > > > is widespread, see: > > > > > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1 > > > > > > > > > Your two approaches seem to trade off two different sources of buffer > > > overruns: undersized stacks versus ABI breaks across library boundaries. > > > > We can't get everything we want. > > > > > Since undersized stack is by far the more familiar problem and we at > > > least have guard regions to help detect overruns, I'd vote to keep > > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. > > > > Agree. > > > > > Or are people reporting real stack overruns on x86 today? > > > > I hope so. > > > > > > > > For arm64, we made large vectors on SVE opt-in, so that oversized signal > > > frames are not seen by default. Would somethine similar be feasible on > > > x86? > > > > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > > > discovery method is changing. The meaning of the value is exactly the > > > > > same as before. > > > > > > > > > > If we are going to rename it though, it could make sense to go for > > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > > > recommendation for your stack size. While the signal frame size is > > > > > relevant to picking a stack size, it's not the only thing to > > > > > consider. > > > > > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > > > kernel. The minimum stack size for a signal handler is more likely > > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > > > frame size used by kernel + 6KB for user application. > > > > > > Ack; to be correct, you also need to take into account which signals may > > > be unmasked while running on this stack, and the stack requirements of > > > all their handlers. Unfortunately, that's hard :( > > > > > > What's your view on my naming suggesions? > > > > I used _SC_MINSIGSTKSZ: > > > > https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9 > > Apologies, I missed that. > > Otherwise, the changes look much as I would expect, except for the > "6K for user program" thing. This is strictly not included in the > legacy MINSIGSTKSZ. > > > > > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > > > slightly more informed guess about suitable stack sizes than the kernel > > > > > (and glibc already has to guess anyway, in order to determine the > > > > > default thread stack size). > > > > > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > > > available. > > > > > > In my code, I generate _SC_SIGSTKSZ as the equivalent of > > > > > > max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ) > > > > > > which is >= the legacy value, and broadly reperesentative of the > > > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches. > > > > > > > > > What do you think? > > > > sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases. > > Why, though? > > MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever. > > > Software that calculates its own needs to know the actual system values, > not estimates based on guesses about how much stack a typical program > might need if it were recompiled for x86. > > This doesn't mean we can't have a generic suggested value that's suitable > for common scenarios (like SIGSTKSZ), but if we do then I think it > should be a separate constant. I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. _SC_MINSIGSTKSZ is the minimum signal stack size from AT_MINSIGSTKSZ, which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application. > > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > > > > is in use. > > > > > > > > > > Great if we can do it. I was concerned that this might be > > > > > controversial. > > > > > > > > > > Would this just be a recommendation, or can we enforce it somehow? > > > > > > > > It is just an idea. We need to move away from constant SIGSTKSZ and > > > > MINSIGSTKSZ. > > > > > > Totally agree with that. > > > > > > > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile > > if the source assumes constant SIGSTKSZ or MINSIGSTKSZ. > > Ah yes, I see. That's a sensible precaution. > > Is it accepted in general that defining different feature test macros > can lead to ABI incompatibilities? > > I have thought that building a shared library with _GNU_SOURCE (say) > doesn't mean that a program that loads that library must also be built > with _GNU_SOURCE. For one thing, that's hard to police. > > > However, there are already combinations that could break, e.g., mixing > -D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if > this define changes off_t. > > > So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an > acceptable compromise. Interfaces that depend on the value of > MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice -- > I don't know of a specific example. > I changed it to _SC_SIGSTKSZ_SOURCE: https://gitlab.com/x86-glibc/glibc/-/commit/41d5e6b31025721590f12d5aa543eb0bc53ce263 #ifdef __USE_SC_SIGSTKSZ # include <unistd.h> /* Minimum stack size for a signal handler: sysconf (SC_SIGSTKSZ). */ # undef MINSIGSTKSZ # define MINSIGSTKSZ sysconf (_SC_SIGSTKSZ) /* System default stack size for a signal handler: MINSIGSTKSZ. */ # undef SIGSTKSZ # define SIGSTKSZ MINSIGSTKSZ #endif Compilation will fail if the source assumes constant MINSIGSTKSZ or SIGSTKSZ.
* Dave Martin via Libc-alpha: > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: >> On 10/6/20 8:25 AM, Dave Martin wrote: >> > Or are people reporting real stack overruns on x86 today? >> >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > > Right. Out of interest, do you believe that's a direct consequence of > the larger kernel-generated signal frame, or does the expansion of > userspace stack frames play a role too? I must say that I do not quite understand this question. 32 64-*byte* registers simply need 2048 bytes of storage space worst case, there is really no way around that. > In practice software just assumes SIGSTKSZ and then ignores the problem > until / unless an actual stack overflow is seen. > > There's probably a lot of software out there whose stack is > theoretically too small even without AVX-512 etc. in the mix, especially > when considering the possibility of nested signals... That is certainly true. We have seen problems with ntpd, which requested a 16 KiB stack, at a time when there were various deductions from the stack size, and since the glibc dynamic loader also uses XSAVE, ntpd exceeded the remaining stack space. But in this case, we just fudged the stack size computation in pthread_create and made it less likely that the dynamic loader was activated, which largely worked around this particular problem. For MINSIGSTKSZ, we just don't have this option because it's simply too small in the first place. I don't immediately recall a bug due to SIGSTKSZ being too small. The test cases I wrote for this were all artificial, to raise awareness of this issue (applications treating these as recommended values, rather than minimum value to avoid immediately sigaltstack/phtread_create failures, same issue with PTHREAD_STACK_MIN). Thanks, Florian
On 10/6/20 10:00 AM, Dave Martin wrote: > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: >> On 10/6/20 8:25 AM, Dave Martin wrote: >>> Or are people reporting real stack overruns on x86 today? >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > Right. Out of interest, do you believe that's a direct consequence of > the larger kernel-generated signal frame, or does the expansion of > userspace stack frames play a role too? The kernel-generated signal frame is entirely responsible for the ~2800 bytes that I'm talking about. I'm sure there are some systems where userspace plays a role, but those are much less of a worry at the moment, since the kernel-induced overflows mean an instant crash that userspace has no recourse for.
On Tue, Oct 06, 2020 at 08:21:00PM +0200, Florian Weimer wrote: > * Dave Martin via Libc-alpha: > > > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: > >> On 10/6/20 8:25 AM, Dave Martin wrote: > >> > Or are people reporting real stack overruns on x86 today? > >> > >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state > >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > > > > Right. Out of interest, do you believe that's a direct consequence of > > the larger kernel-generated signal frame, or does the expansion of > > userspace stack frames play a role too? > > I must say that I do not quite understand this question. > > 32 64-*byte* registers simply need 2048 bytes of storage space worst > case, there is really no way around that. If the architecture grows more or bigger registers, and if those registers are used in general-purpose code, then all stack frames will tend to grow, not just the signal frame. So a stack overflow might be caused by the larger signal frame by itself; or it might be caused by the growth of the stack of 20 function frames created by someone's signal handler. In the latter case, this is just a "normal" stack overflow, and nothing really to do with signals or SIGSTKSZ. Rebuilding with different compiler flags could also grow the stack usage and cause just the same problem. I also strongly suspect that people often don't think about signal nesting when allocating signal stacks. So, there might be a pre- existing potential overflow that just becomes more likely when the signal frame grows. That's not really SIGSTKSZ's fault. Of course, AVX-512 might never be used in general-purpose code. On AArch64, SVE can be used in general-purpose code, but it's too early to say what its prevalence will be in signal handlers. Probably low. > > In practice software just assumes SIGSTKSZ and then ignores the problem > > until / unless an actual stack overflow is seen. > > > > There's probably a lot of software out there whose stack is > > theoretically too small even without AVX-512 etc. in the mix, especially > > when considering the possibility of nested signals... > > That is certainly true. We have seen problems with ntpd, which > requested a 16 KiB stack, at a time when there were various deductions > from the stack size, and since the glibc dynamic loader also uses XSAVE, > ntpd exceeded the remaining stack space. But in this case, we just > fudged the stack size computation in pthread_create and made it less > likely that the dynamic loader was activated, which largely worked > around this particular problem. For MINSIGSTKSZ, we just don't have > this option because it's simply too small in the first place. > > I don't immediately recall a bug due to SIGSTKSZ being too small. The > test cases I wrote for this were all artificial, to raise awareness of > this issue (applications treating these as recommended values, rather > than minimum value to avoid immediately sigaltstack/phtread_create > failures, same issue with PTHREAD_STACK_MIN). Ack, I think if SIGSTKSZ was too small significantly often, there would be more awareness of the issue. Cheers ---Dave
On Tue, Oct 06, 2020 at 11:30:42AM -0700, Dave Hansen wrote: > On 10/6/20 10:00 AM, Dave Martin wrote: > > On Tue, Oct 06, 2020 at 08:33:47AM -0700, Dave Hansen wrote: > >> On 10/6/20 8:25 AM, Dave Martin wrote: > >>> Or are people reporting real stack overruns on x86 today? > >> We have real overruns. We have ~2800 bytes of XSAVE (regisiter) state > >> mostly from AVX-512, and a 2048 byte MINSIGSTKSZ. > > Right. Out of interest, do you believe that's a direct consequence of > > the larger kernel-generated signal frame, or does the expansion of > > userspace stack frames play a role too? > > The kernel-generated signal frame is entirely responsible for the ~2800 > bytes that I'm talking about. > > I'm sure there are some systems where userspace plays a role, but those > are much less of a worry at the moment, since the kernel-induced > overflows mean an instant crash that userspace has no recourse for. Ack, that sounds pretty convincing. Cheers ---Dave
On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: > On Tue, Oct 6, 2020 at 9:55 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > > > > > which has grown over time as new features and larger registers have been > > > > > > > > > added to the architecture. > > > > > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > > > > > approach to [5] > > > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > > > > > series does seem consistent with the way arm64 populates > > > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > > > > this... > > > > > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > > > > > Thanks for the link. > > > > > > > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > > > > drop them if there's something already out there. > > > > > > > > > > I am working on it. > > > > > > > > OK. I may post something for discussion, but I'm happy for it to be > > > > superseded by someone (i.e., other than me) who actually knows what > > > > they're doing... > > > > > > Please see my previous email for my glibc patch: > > > > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > > > buffer overruns. > > > > > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > > > definitions) was that userspace binaries will have baked in the old > > > > > > value of the constant and may be making assumptions about it. > > > > > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > > > changes. This could be a problem if an newly built library tries to > > > > > > memcpy() or dump such an object defined by and old binary. > > > > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > > > > and makecontext() could similarly go wrong. > > > > > > > > > > With my original proposal: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > > > constants: > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > > > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > > > > obvious thing to do with this constant in many simple cases. Such usage > > > > is widespread, see: > > > > > > > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1 > > > > > > > > > > > > Your two approaches seem to trade off two different sources of buffer > > > > overruns: undersized stacks versus ABI breaks across library boundaries. > > > > > > We can't get everything we want. > > > > > > > Since undersized stack is by far the more familiar problem and we at > > > > least have guard regions to help detect overruns, I'd vote to keep > > > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. > > > > > > Agree. > > > > > > > Or are people reporting real stack overruns on x86 today? > > > > > > I hope so. > > > > > > > > > > > For arm64, we made large vectors on SVE opt-in, so that oversized signal > > > > frames are not seen by default. Would somethine similar be feasible on > > > > x86? > > > > > > > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > > > > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > > > > discovery method is changing. The meaning of the value is exactly the > > > > > > same as before. > > > > > > > > > > > > If we are going to rename it though, it could make sense to go for > > > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > > > > recommendation for your stack size. While the signal frame size is > > > > > > relevant to picking a stack size, it's not the only thing to > > > > > > consider. > > > > > > > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > > > > kernel. The minimum stack size for a signal handler is more likely > > > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > > > > frame size used by kernel + 6KB for user application. > > > > > > > > Ack; to be correct, you also need to take into account which signals may > > > > be unmasked while running on this stack, and the stack requirements of > > > > all their handlers. Unfortunately, that's hard :( > > > > > > > > What's your view on my naming suggesions? > > > > > > I used _SC_MINSIGSTKSZ: > > > > > > https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9 > > > > Apologies, I missed that. > > > > Otherwise, the changes look much as I would expect, except for the > > "6K for user program" thing. This is strictly not included in the > > legacy MINSIGSTKSZ. > > > > > > > > > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > > > > slightly more informed guess about suitable stack sizes than the kernel > > > > > > (and glibc already has to guess anyway, in order to determine the > > > > > > default thread stack size). > > > > > > > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > > > > available. > > > > > > > > In my code, I generate _SC_SIGSTKSZ as the equivalent of > > > > > > > > max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ) > > > > > > > > which is >= the legacy value, and broadly reperesentative of the > > > > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches. > > > > > > > > > > > > What do you think? > > > > > > sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases. > > > > Why, though? > > > > MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever. > > > > > > Software that calculates its own needs to know the actual system values, > > not estimates based on guesses about how much stack a typical program > > might need if it were recompiled for x86. > > > > This doesn't mean we can't have a generic suggested value that's suitable > > for common scenarios (like SIGSTKSZ), but if we do then I think it > > should be a separate constant. > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. > _SC_MINSIGSTKSZ is the minimum signal stack size from AT_MINSIGSTKSZ, > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application. Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead? If the arch has more or bigger registers to save in the signal frame, the chances are that they're going to get saved in some userspace stack frames too. So I suspect that the user signal handler stack usage may scale up to some extent rather than being a constant. To help people migrate without unpleasant surprises, I also figured it would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >= legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ. This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a drop-in replacement for MINSIGSTKSZ, etc. (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64. My idea was that sysconf () should hide this surprise, but people who really want to know the kernel value would tolerate some nonportability and read AT_MINSIGSTKSZ directly.) So then: kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ); minsigstksz = LEGACY_MINSIGSTKSZ; if (kernel_minsigstksz > minsigstksz) minsistksz = kernel_minsigstksz; sigstksz = LEGACY_SIGSTKSZ; if (minsigstksz * 4 > sigstksz) sigstksz = minsigstksz * 4; (Or something like that, unless the architecture provides its own definitions. ia64's MINSIGSTKSZ is enormous, so it probably doesn't want this.) Also: should all these values be rounded up to a multiple of the architecture's preferred stack alignment? Should the preferred stack alignment also be exposed through sysconf? Portable code otherwise has no way to know this, though if the preferred alignment is <= the minimum malloc()/alloca() alignment then this is generally not an issue.) > > > > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > > > > > is in use. > > > > > > > > > > > > Great if we can do it. I was concerned that this might be > > > > > > controversial. > > > > > > > > > > > > Would this just be a recommendation, or can we enforce it somehow? > > > > > > > > > > It is just an idea. We need to move away from constant SIGSTKSZ and > > > > > MINSIGSTKSZ. > > > > > > > > Totally agree with that. > > > > > > > > > > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile > > > if the source assumes constant SIGSTKSZ or MINSIGSTKSZ. > > > > Ah yes, I see. That's a sensible precaution. > > > > Is it accepted in general that defining different feature test macros > > can lead to ABI incompatibilities? > > > > I have thought that building a shared library with _GNU_SOURCE (say) > > doesn't mean that a program that loads that library must also be built > > with _GNU_SOURCE. For one thing, that's hard to police. > > > > > > However, there are already combinations that could break, e.g., mixing > > -D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if > > this define changes off_t. > > > > > > So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an > > acceptable compromise. Interfaces that depend on the value of > > MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice -- > > I don't know of a specific example. > > > > I changed it to _SC_SIGSTKSZ_SOURCE: > > https://gitlab.com/x86-glibc/glibc/-/commit/41d5e6b31025721590f12d5aa543eb0bc53ce263 > > #ifdef __USE_SC_SIGSTKSZ > # include <unistd.h> > /* Minimum stack size for a signal handler: sysconf (SC_SIGSTKSZ). */ > # undef MINSIGSTKSZ > # define MINSIGSTKSZ sysconf (_SC_SIGSTKSZ) > /* System default stack size for a signal handler: MINSIGSTKSZ. */ > # undef SIGSTKSZ > # define SIGSTKSZ MINSIGSTKSZ > #endif > > Compilation will fail if the source assumes constant MINSIGSTKSZ or > SIGSTKSZ. I don't understand all the glibc-fu, bit it looks reasonable overall (notwithstanding my comments above). Cheers ---Dave
On Wed, Oct 7, 2020 at 3:47 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: > > On Tue, Oct 6, 2020 at 9:55 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > On Tue, Oct 06, 2020 at 08:34:06AM -0700, H.J. Lu wrote: > > > > On Tue, Oct 6, 2020 at 8:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > On Tue, Oct 06, 2020 at 05:12:29AM -0700, H.J. Lu wrote: > > > > > > On Tue, Oct 6, 2020 at 2:25 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > > > > > On Mon, Oct 05, 2020 at 10:17:06PM +0100, H.J. Lu wrote: > > > > > > > > On Mon, Oct 5, 2020 at 6:45 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, Sep 29, 2020 at 01:57:42PM -0700, Chang S. Bae wrote: > > > > > > > > > > During signal entry, the kernel pushes data onto the normal userspace > > > > > > > > > > stack. On x86, the data pushed onto the user stack includes XSAVE state, > > > > > > > > > > which has grown over time as new features and larger registers have been > > > > > > > > > > added to the architecture. > > > > > > > > > > > > > > > > > > > > MINSIGSTKSZ is a constant provided in the kernel signal.h headers and > > > > > > > > > > typically distributed in lib-dev(el) packages, e.g. [1]. Its value is > > > > > > > > > > compiled into programs and is part of the user/kernel ABI. The MINSIGSTKSZ > > > > > > > > > > constant indicates to userspace how much data the kernel expects to push on > > > > > > > > > > the user stack, [2][3]. > > > > > > > > > > > > > > > > > > > > However, this constant is much too small and does not reflect recent > > > > > > > > > > additions to the architecture. For instance, when AVX-512 states are in > > > > > > > > > > use, the signal frame size can be 3.5KB while MINSIGSTKSZ remains 2KB. > > > > > > > > > > > > > > > > > > > > The bug report [4] explains this as an ABI issue. The small MINSIGSTKSZ can > > > > > > > > > > cause user stack overflow when delivering a signal. > > > > > > > > > > > > > > > > > > > > In this series, we suggest a couple of things: > > > > > > > > > > 1. Provide a variable minimum stack size to userspace, as a similar > > > > > > > > > > approach to [5] > > > > > > > > > > 2. Avoid using a too-small alternate stack > > > > > > > > > > > > > > > > > > I can't comment on the x86 specifics, but the approach followed in this > > > > > > > > > series does seem consistent with the way arm64 populates > > > > > > > > > AT_MINSIGSTKSZ. > > > > > > > > > > > > > > > > > > I need to dig up my glibc hacks for providing a sysconf interface to > > > > > > > > > this... > > > > > > > > > > > > > > > > Here is my proposal for glibc: > > > > > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118098.html > > > > > > > > > > > > > > Thanks for the link. > > > > > > > > > > > > > > Are there patches yet? I already had some hacks in the works, but I can > > > > > > > drop them if there's something already out there. > > > > > > > > > > > > I am working on it. > > > > > > > > > > OK. I may post something for discussion, but I'm happy for it to be > > > > > superseded by someone (i.e., other than me) who actually knows what > > > > > they're doing... > > > > > > > > Please see my previous email for my glibc patch: > > > > > > > > https://gitlab.com/x86-glibc/glibc/-/commits/users/hjl/AT_MINSIGSTKSZ > > > > > > > > > > > > > > > > > > > 1. Define SIGSTKSZ and MINSIGSTKSZ to 64KB. > > > > > > > > > > > > > > Can we do this? IIUC, this is an ABI break and carries the risk of > > > > > > > buffer overruns. > > > > > > > > > > > > > > The reason for not simply increasing the kernel's MINSIGSTKSZ #define > > > > > > > (apart from the fact that it is rarely used, due to glibc's shadowing > > > > > > > definitions) was that userspace binaries will have baked in the old > > > > > > > value of the constant and may be making assumptions about it. > > > > > > > > > > > > > > For example, the type (char [MINSIGSTKSZ]) changes if this #define > > > > > > > changes. This could be a problem if an newly built library tries to > > > > > > > memcpy() or dump such an object defined by and old binary. > > > > > > > Bounds-checking and the stack sizes passed to things like sigaltstack() > > > > > > > and makecontext() could similarly go wrong. > > > > > > > > > > > > With my original proposal: > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118028.html > > > > > > > > > > > > char [MINSIGSTKSZ] won't compile. The feedback is to increase the > > > > > > constants: > > > > > > > > > > > > https://sourceware.org/pipermail/libc-alpha/2020-September/118092.html > > > > > > > > > > Ah, I see. But both still API and ABI breaks; moreover, declaraing an > > > > > array with size based on (MIN)SIGSTKSZ is not just reasonable, but the > > > > > obvious thing to do with this constant in many simple cases. Such usage > > > > > is widespread, see: > > > > > > > > > > * https://codesearch.debian.net/search?q=%5BSIGSTKSZ%5D&literal=1 > > > > > > > > > > > > > > > Your two approaches seem to trade off two different sources of buffer > > > > > overruns: undersized stacks versus ABI breaks across library boundaries. > > > > > > > > We can't get everything we want. > > > > > > > > > Since undersized stack is by far the more familiar problem and we at > > > > > least have guard regions to help detect overruns, I'd vote to keep > > > > > MINSIGSTKSZ and SIGSTKSZ as-is, at least for now. > > > > > > > > Agree. > > > > > > > > > Or are people reporting real stack overruns on x86 today? > > > > > > > > I hope so. > > > > > > > > > > > > > > For arm64, we made large vectors on SVE opt-in, so that oversized signal > > > > > frames are not seen by default. Would somethine similar be feasible on > > > > > x86? > > > > > > > > > > > > > > > > > > 2. Add _SC_RSVD_SIG_STACK_SIZE for signal stack size reserved by the kernel. > > > > > > > > > > > > > > How about "_SC_MINSIGSTKSZ"? This was my initial choice since only the > > > > > > > discovery method is changing. The meaning of the value is exactly the > > > > > > > same as before. > > > > > > > > > > > > > > If we are going to rename it though, it could make sense to go for > > > > > > > something more directly descriptive, say, "_SC_SIGNAL_FRAME_SIZE". > > > > > > > > > > > > > > The trouble with including "STKSZ" is that is sounds like a > > > > > > > recommendation for your stack size. While the signal frame size is > > > > > > > relevant to picking a stack size, it's not the only thing to > > > > > > > consider. > > > > > > > > > > > > The problem is that AT_MINSIGSTKSZ is the signal frame size used by > > > > > > kernel. The minimum stack size for a signal handler is more likely > > > > > > AT_MINSIGSTKSZ + 1.5KB unless AT_MINSIGSTKSZ returns the signal > > > > > > frame size used by kernel + 6KB for user application. > > > > > > > > > > Ack; to be correct, you also need to take into account which signals may > > > > > be unmasked while running on this stack, and the stack requirements of > > > > > all their handlers. Unfortunately, that's hard :( > > > > > > > > > > What's your view on my naming suggesions? > > > > > > > > I used _SC_MINSIGSTKSZ: > > > > > > > > https://gitlab.com/x86-glibc/glibc/-/commit/73ca53bfbc1c105bc579f55f15af011a07fcded9 > > > > > > Apologies, I missed that. > > > > > > Otherwise, the changes look much as I would expect, except for the > > > "6K for user program" thing. This is strictly not included in the > > > legacy MINSIGSTKSZ. > > > > > > > > > > > > > > > > > > > Also, do we need a _SC_SIGSTKSZ constant, or should the entire concept > > > > > > > of a "recommended stack size" be abandoned? glibc can at least make a > > > > > > > slightly more informed guess about suitable stack sizes than the kernel > > > > > > > (and glibc already has to guess anyway, in order to determine the > > > > > > > default thread stack size). > > > > > > > > > > > > Glibc should try to deduct signal frame size if AT_MINSIGSTKSZ isn't > > > > > > available. > > > > > > > > > > In my code, I generate _SC_SIGSTKSZ as the equivalent of > > > > > > > > > > max(sysconf(_SC_MINSIGSTKSZ) * 4, SIGSTKSZ) > > > > > > > > > > which is >= the legacy value, and broadly reperesentative of the > > > > > relationship between MINSIGSTKSZ and SIGSTKSZ on most arches. > > > > > > > > > > > > > > > What do you think? > > > > > > > > sysconf(_SC_MINSIGSTKSZ) should be usable ASIS for most cases. > > > > > > Why, though? > > > > > > MINSIGSTKSZ is not specified to be usable as-is for any case whatsoever. > > > > > > > > > Software that calculates its own needs to know the actual system values, > > > not estimates based on guesses about how much stack a typical program > > > might need if it were recompiled for x86. > > > > > > This doesn't mean we can't have a generic suggested value that's suitable > > > for common scenarios (like SIGSTKSZ), but if we do then I think it > > > should be a separate constant. > > > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. > > _SC_MINSIGSTKSZ is the minimum signal stack size from AT_MINSIGSTKSZ, > > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value > > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application. > > Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead? Done. > If the arch has more or bigger registers to save in the signal frame, > the chances are that they're going to get saved in some userspace stack > frames too. So I suspect that the user signal handler stack usage may > scale up to some extent rather than being a constant. > > > To help people migrate without unpleasant surprises, I also figured it > would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >= > legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ. > This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a > drop-in replacement for MINSIGSTKSZ, etc. > > (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64. > My idea was that sysconf () should hide this surprise, but people who > really want to know the kernel value would tolerate some > nonportability and read AT_MINSIGSTKSZ directly.) > > > So then: > > kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ); > minsigstksz = LEGACY_MINSIGSTKSZ; > if (kernel_minsigstksz > minsigstksz) > minsistksz = kernel_minsigstksz; > > sigstksz = LEGACY_SIGSTKSZ; > if (minsigstksz * 4 > sigstksz) > sigstksz = minsigstksz * 4; I updated users/hjl/AT_MINSIGSTKSZ branch with +@item _SC_MINSIGSTKSZ +@standards{GNU, unistd.h} +Inquire about the signal stack size used by the kernel. + +@item _SC_SIGSTKSZ +@standards{GNU, unistd.h} +Inquire about the default signal stack size for a signal handler. case _SC_MINSIGSTKSZ: assert (GLRO(dl_minsigstacksize) != 0); return GLRO(dl_minsigstacksize); case _SC_SIGSTKSZ: { /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4. */ long int minsigstacksize = GLRO(dl_minsigstacksize); _Static_assert (__builtin_constant_p (MINSIGSTKSZ), "MINSIGSTKSZ is constant"); if (minsigstacksize < MINSIGSTKSZ) minsigstacksize = MINSIGSTKSZ; return minsigstacksize * 4; } > > (Or something like that, unless the architecture provides its own > definitions. ia64's MINSIGSTKSZ is enormous, so it probably doesn't > want this.) > > > Also: should all these values be rounded up to a multiple of the > architecture's preferred stack alignment? Kernel should provide a properly aligned AT_MINSIGSTKSZ. > Should the preferred stack alignment also be exposed through sysconf? > Portable code otherwise has no way to know this, though if the > preferred alignment is <= the minimum malloc()/alloca() alignment then > this is generally not an issue.) Good question. But it is orthogonal to the signal stack size issue. > > > > > > > > > > > > 3. Deprecate SIGSTKSZ and MINSIGSTKSZ if _SC_RSVD_SIG_STACK_SIZE > > > > > > > > is in use. > > > > > > > > > > > > > > Great if we can do it. I was concerned that this might be > > > > > > > controversial. > > > > > > > > > > > > > > Would this just be a recommendation, or can we enforce it somehow? > > > > > > > > > > > > It is just an idea. We need to move away from constant SIGSTKSZ and > > > > > > MINSIGSTKSZ. > > > > > > > > > > Totally agree with that. > > > > > > > > > > > > > With my glibc patch, -D_SC_MINSIGSTKSZ_SOURCE will fail to compile > > > > if the source assumes constant SIGSTKSZ or MINSIGSTKSZ. > > > > > > Ah yes, I see. That's a sensible precaution. > > > > > > Is it accepted in general that defining different feature test macros > > > can lead to ABI incompatibilities? > > > > > > I have thought that building a shared library with _GNU_SOURCE (say) > > > doesn't mean that a program that loads that library must also be built > > > with _GNU_SOURCE. For one thing, that's hard to police. > > > > > > > > > However, there are already combinations that could break, e.g., mixing > > > -D_FILE_OFFSET_BITS=64 with -D_FILE_OFFSET_BITS=32 would be broken if > > > this define changes off_t. > > > > > > > > > So, maybe having _SC_MINSIGSTKSZ_SOURCE break things in this way is an > > > acceptable compromise. Interfaces that depend on the value of > > > MINSIGSTKSZ or SIGSTKSZ are possible, but probably rare in practice -- > > > I don't know of a specific example. > > > > > > > I changed it to _SC_SIGSTKSZ_SOURCE: > > > > https://gitlab.com/x86-glibc/glibc/-/commit/41d5e6b31025721590f12d5aa543eb0bc53ce263 > > > > #ifdef __USE_SC_SIGSTKSZ > > # include <unistd.h> > > /* Minimum stack size for a signal handler: sysconf (SC_SIGSTKSZ). */ > > # undef MINSIGSTKSZ > > # define MINSIGSTKSZ sysconf (_SC_SIGSTKSZ) > > /* System default stack size for a signal handler: MINSIGSTKSZ. */ > > # undef SIGSTKSZ > > # define SIGSTKSZ MINSIGSTKSZ > > #endif > > > > Compilation will fail if the source assumes constant MINSIGSTKSZ or > > SIGSTKSZ. > > I don't understand all the glibc-fu, bit it looks reasonable overall > (notwithstanding my comments above). > > Cheers > ---Dave
On Wed, Oct 07, 2020 at 06:30:03AM -0700, H.J. Lu wrote: > On Wed, Oct 7, 2020 at 3:47 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: [...] > > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. > > > _SC_MINSIGSTKSZ is the minimum signal stack size from AT_MINSIGSTKSZ, > > > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value > > > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application. > > > > Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead? > > Done. OK. I was prepared to have to argue my case a bit more, but if you think this is OK then I will stop arguing ;) > > If the arch has more or bigger registers to save in the signal frame, > > the chances are that they're going to get saved in some userspace stack > > frames too. So I suspect that the user signal handler stack usage may > > scale up to some extent rather than being a constant. > > > > > > To help people migrate without unpleasant surprises, I also figured it > > would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >= > > legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ. > > This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a > > drop-in replacement for MINSIGSTKSZ, etc. > > > > (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64. > > My idea was that sysconf () should hide this surprise, but people who > > really want to know the kernel value would tolerate some > > nonportability and read AT_MINSIGSTKSZ directly.) > > > > > > So then: > > > > kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ); > > minsigstksz = LEGACY_MINSIGSTKSZ; > > if (kernel_minsigstksz > minsigstksz) > > minsistksz = kernel_minsigstksz; > > > > sigstksz = LEGACY_SIGSTKSZ; > > if (minsigstksz * 4 > sigstksz) > > sigstksz = minsigstksz * 4; > > I updated users/hjl/AT_MINSIGSTKSZ branch with > > +@item _SC_MINSIGSTKSZ > +@standards{GNU, unistd.h} Can we specify these more agressively now? > +Inquire about the signal stack size used by the kernel. I think we've already concluded that this should included all mandatory overheads, including those imposed by the compiler and glibc? e.g.: --8<-- The returned value is the minimum number of bytes of free stack space required in order to gurantee successful, non-nested handling of a single signal whose handler is an empty function. -->8-- > + > +@item _SC_SIGSTKSZ > +@standards{GNU, unistd.h} > +Inquire about the default signal stack size for a signal handler. Similarly: --8<-- The returned value is the suggested minimum number of bytes of stack space required for a signal stack. This is not guaranteed to be enough for any specific purpose other than the invocation of a single, non-nested, empty handler, but nonetheless should be enough for basic scenarios involving simple signal handlers and very low levels of signal nesting (say, 2 or 3 levels at the very most). This value is provided for developer convenience and to ease migration from the legacy SIGSTKSZ constant. Programs requiring stronger guarantees should avoid using it if at all possible. -->8-- If these descriptions are too wordy, we might want to move some of it out to signal.texi, though. > > case _SC_MINSIGSTKSZ: > assert (GLRO(dl_minsigstacksize) != 0); > return GLRO(dl_minsigstacksize); > > case _SC_SIGSTKSZ: > { > /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4. */ > long int minsigstacksize = GLRO(dl_minsigstacksize); > _Static_assert (__builtin_constant_p (MINSIGSTKSZ), > "MINSIGSTKSZ is constant"); > if (minsigstacksize < MINSIGSTKSZ) > minsigstacksize = MINSIGSTKSZ; > return minsigstacksize * 4; > } > > > > > (Or something like that, unless the architecture provides its own > > definitions. ia64's MINSIGSTKSZ is enormous, so it probably doesn't > > want this.) > > > > > > Also: should all these values be rounded up to a multiple of the > > architecture's preferred stack alignment? > > Kernel should provide a properly aligned AT_MINSIGSTKSZ. OK. Can you comment on Chang S. Bae's series? I wasn't sure that the proposal produces an aligned value for AT_MINSIGSTKSZ on x86, but maybe I just worked it out wrong. > > Should the preferred stack alignment also be exposed through sysconf? > > Portable code otherwise has no way to know this, though if the > > preferred alignment is <= the minimum malloc()/alloca() alignment then > > this is generally not an issue.) > > Good question. But it is orthogonal to the signal stack size issue. Ack. There are various other brokennesses around this area, such as the fact that the register data may now run off the end of the mcontext_t object that is supposed to contain it. Ideally we should fork ucontext_t or mcontext_t into two types, one for the ucontext API and one for the signal API (which are anyway highly incompatible with each other). If this stuff is addressed, I guess we could bundle it under a more general feature test macro. But it's probably best to nail down this series in something like its current form first. I'll follow up on libc-alpha with a wishlist, but that will be a separate conversation... [...] Cheers ---Dave
On Wed, Oct 7, 2020 at 8:46 AM Dave Martin <Dave.Martin@arm.com> wrote: > > On Wed, Oct 07, 2020 at 06:30:03AM -0700, H.J. Lu wrote: > > On Wed, Oct 7, 2020 at 3:47 AM Dave Martin <Dave.Martin@arm.com> wrote: > > > > > > On Tue, Oct 06, 2020 at 10:44:14AM -0700, H.J. Lu wrote: > > [...] > > > > > I updated my glibc patch to add both _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. > > > > _SC_MINSIGSTKSZ is the minimum signal stack size from AT_MINSIGSTKSZ, > > > > which is the signal frame size used by kernel, and _SC_SIGSTKSZ is the value > > > > of sysconf (_SC_MINSIGSTKSZ) + 6KB for user application. > > > > > > Can I suggest sysconf (_SC_MINSIGSTKSZ) * 4 instead? > > > > Done. > > OK. I was prepared to have to argue my case a bit more, but if you > think this is OK then I will stop arguing ;) > > > > > If the arch has more or bigger registers to save in the signal frame, > > > the chances are that they're going to get saved in some userspace stack > > > frames too. So I suspect that the user signal handler stack usage may > > > scale up to some extent rather than being a constant. > > > > > > > > > To help people migrate without unpleasant surprises, I also figured it > > > would be a good idea to make sure that sysconf (_SC_MINSIGSTKSZ) >= > > > legacy MINSIGSTKSZ, and sysconf (_SC_SIGSTKSZ) >= legacy SIGSTKSZ. > > > This should makes it safer to use sysconf (_SC_MINSIGSTKSZ) as a > > > drop-in replacement for MINSIGSTKSZ, etc. > > > > > > (To explain: AT_MINSIGSTKSZ may actually be < MINSIGSTKSZ on AArch64. > > > My idea was that sysconf () should hide this surprise, but people who > > > really want to know the kernel value would tolerate some > > > nonportability and read AT_MINSIGSTKSZ directly.) > > > > > > > > > So then: > > > > > > kernel_minsigstksz = getauxval(AT_MINSIGSTKSZ); > > > minsigstksz = LEGACY_MINSIGSTKSZ; > > > if (kernel_minsigstksz > minsigstksz) > > > minsistksz = kernel_minsigstksz; > > > > > > sigstksz = LEGACY_SIGSTKSZ; > > > if (minsigstksz * 4 > sigstksz) > > > sigstksz = minsigstksz * 4; > > > > I updated users/hjl/AT_MINSIGSTKSZ branch with > > > > +@item _SC_MINSIGSTKSZ > > +@standards{GNU, unistd.h} > > Can we specify these more agressively now? > > > +Inquire about the signal stack size used by the kernel. > > I think we've already concluded that this should included all mandatory > overheads, including those imposed by the compiler and glibc? > > e.g.: > > --8<-- > > The returned value is the minimum number of bytes of free stack space > required in order to gurantee successful, non-nested handling of a > single signal whose handler is an empty function. > > -->8-- > > > + > > +@item _SC_SIGSTKSZ > > +@standards{GNU, unistd.h} > > +Inquire about the default signal stack size for a signal handler. > > Similarly: > > --8<-- > > The returned value is the suggested minimum number of bytes of stack > space required for a signal stack. > > This is not guaranteed to be enough for any specific purpose other than > the invocation of a single, non-nested, empty handler, but nonetheless > should be enough for basic scenarios involving simple signal handlers > and very low levels of signal nesting (say, 2 or 3 levels at the very > most). > > This value is provided for developer convenience and to ease migration > from the legacy SIGSTKSZ constant. Programs requiring stronger > guarantees should avoid using it if at all possible. > > -->8-- Done. > > If these descriptions are too wordy, we might want to move some of it > out to signal.texi, though. > > > > > case _SC_MINSIGSTKSZ: > > assert (GLRO(dl_minsigstacksize) != 0); > > return GLRO(dl_minsigstacksize); > > > > case _SC_SIGSTKSZ: > > { > > /* Return MAX (MINSIGSTKSZ, sysconf (_SC_MINSIGSTKSZ)) * 4. */ > > long int minsigstacksize = GLRO(dl_minsigstacksize); > > _Static_assert (__builtin_constant_p (MINSIGSTKSZ), > > "MINSIGSTKSZ is constant"); > > if (minsigstacksize < MINSIGSTKSZ) > > minsigstacksize = MINSIGSTKSZ; > > return minsigstacksize * 4; > > } > > > > > > > > (Or something like that, unless the architecture provides its own > > > definitions. ia64's MINSIGSTKSZ is enormous, so it probably doesn't > > > want this.) > > > > > > > > > Also: should all these values be rounded up to a multiple of the > > > architecture's preferred stack alignment? > > > > Kernel should provide a properly aligned AT_MINSIGSTKSZ. > > OK. Can you comment on Chang S. Bae's series? I wasn't sure that the > proposal produces an aligned value for AT_MINSIGSTKSZ on x86, but maybe > I just worked it out wrong. In Linux kernel, the signal frame data is composed of the following areas and laid out as: ------------------------------ | alignment padding | ------------------------------ | xsave buffer | <<<<<<< This must be 64 byte aligned ------------------------------ | fsave header (32-bit only) | ------------------------------ | siginfo + ucontext | ------------------------------ In order to get the proper alignment for xsave buffer, the signal stake size should be rounded up to 64 byte aligned. In glibc, I have /* Size of xsave buffer. */ unsigned int sigframe_size = ebx; if (64 bit) /* NB: sizeof(struct rt_sigframe) in Linux kernel. */ sigframe_size += 440; else /* NB: sizeof(struct sigframe_ia32) + sizeof(struct fregs_state)) in Linux kernel. */ sigframe_size += 736 + 112; endif /* Round up to 64-byte aligned. */ sigframe_size = ALIGN_UP (sigframe_size, 64); GLRO(dl_minsigstacksize) = sigframe_size; Kernel should do something similar. > > > > Should the preferred stack alignment also be exposed through sysconf? > > > Portable code otherwise has no way to know this, though if the > > > preferred alignment is <= the minimum malloc()/alloca() alignment then > > > this is generally not an issue.) > > > > Good question. But it is orthogonal to the signal stack size issue. > > Ack. > > There are various other brokennesses around this area, such as the fact > that the register data may now run off the end of the mcontext_t object > that is supposed to contain it. Ideally we should fork ucontext_t or > mcontext_t into two types, one for the ucontext API and one for the > signal API (which are anyway highly incompatible with each other). > > If this stuff is addressed, I guess we could bundle it under a more > general feature test macro. But it's probably best to nail down this > series in something like its current form first. Agreed. > > I'll follow up on libc-alpha with a wishlist, but that will be a > separate conversation... > > [...] > > Cheers > ---Dave