Message ID | CAMe9rOqCO0A8cCzqi2DjZqq+1HM9wHsx0t97WGq2N3vuGp=Hmg@mail.gmail.com |
---|---|
State | Committed |
Headers |
Received: (qmail 93000 invoked by alias); 18 Jan 2019 14:54:11 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 89156 invoked by uid 89); 18 Jan 2019 14:54:04 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, FREEMAIL_FROM, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=sk:fasynch, fexceptions X-HELO: mail-oi1-f193.google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=dwKaYUKaoLaw9EUFtTjgqMixxpuLz/sdHIEJ/yvi5yk=; b=PvzXvXcDkTmBQ94R6hwRgQz9qTzBA7h4GaB9AMMT43eqBJdsaCUvWOpQ7hwwT4zybD QPWakEzlua48V4BEpVvQ0ZzXkQ4ubdT3byh/Qt7IR+5kSwblr+brqhLiEHv4sXeo8kES i62Fxlf9XS7YRSl02k4czVJP7amMUs5AOD59yTYR4GxtsSku0tURSHJiUYb8StR6EjU6 8GOMD2b3oUVDIGZCaQ53k/GHJr6tXZPqNkKVbBlMg88t9yYKeW5pUbtv0aT3sHF/dUPT mMOfd94uhpro5YJOx7bGq3S5EtVFk2UFWayjCatupbrYX+1czponnjFAzr5YlOLVows3 SqLg== MIME-Version: 1.0 References: <20190115200526.4677-1-zackw@panix.com> <1d45c6cb-192c-5ade-513e-a40c65d9fb7e@redhat.com> <CAKCAbMhSxEznwJDjSscq_whWk8j8TV409HAgGJ4AGOrHhV=8bw@mail.gmail.com> <dfefe1c3-7952-9878-757d-48a15c880332@redhat.com> <CAKCAbMiaUYOLkRD36MEVJRuzbAyjC7DEYNz68ZuJdWDrq=YPRQ@mail.gmail.com> <35b94b4d-adaa-7a83-a357-d82d6b4b2249@redhat.com> <CAMe9rOqbGSMHQbn5Wp+-Ttm4wKzL03FTJ6WA+bwjVQvL3gWh6g@mail.gmail.com> <CAKCAbMj3MQxOHswUJAssjAuZdG_3ZOAXhGpjWLBs=Aw0wC5Tbw@mail.gmail.com> In-Reply-To: <CAKCAbMj3MQxOHswUJAssjAuZdG_3ZOAXhGpjWLBs=Aw0wC5Tbw@mail.gmail.com> From: "H.J. Lu" <hjl.tools@gmail.com> Date: Fri, 18 Jan 2019 06:53:24 -0800 Message-ID: <CAMe9rOqCO0A8cCzqi2DjZqq+1HM9wHsx0t97WGq2N3vuGp=Hmg@mail.gmail.com> Subject: Re: [PATCH] Tests for minimal signal handler functionality in MINSIGSTKSZ space. To: Zack Weinberg <zackw@panix.com> Cc: "Carlos O'Donell" <carlos@redhat.com>, GNU C Library <libc-alpha@sourceware.org>, Siddhesh Poyarekar <siddhesh@gotplt.org> Content-Type: text/plain; charset="UTF-8" |
Commit Message
H.J. Lu
Jan. 18, 2019, 2:53 p.m. UTC
On Thu, Jan 17, 2019 at 7:12 PM Zack Weinberg <zackw@panix.com> wrote: > > On Wed, Jan 16, 2019 at 10:31 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > The new tests failed on AVX512 machines: > > > > Program received signal SIGUSR1, User defined signal 1. > > __GI_raise (sig=sig@entry=10) at ../sysdeps/unix/sysv/linux/raise.c:50 > > 50 return ret; > > (gdb) c > > Continuing. > > > > Program received signal SIGSEGV, Segmentation fault. > > _dl_runtime_resolve_xsavec () at ../sysdeps/x86_64/dl-trampoline.h:93 > > 93 movq %rax, REGISTER_SAVE_RAX(%rsp) > > (gdb) bt > > #0 _dl_runtime_resolve_xsavec () at ../sysdeps/x86_64/dl-trampoline.h:93 > > #1 0x0040248d in handler (unused=<optimized out>) at tst-minsigstksz-4.c:44 > > #2 <signal handler called> > > #3 __GI_raise (sig=sig@entry=10) at ../sysdeps/unix/sysv/linux/raise.c:50 > > #4 0x004024da in do_test () at tst-minsigstksz-4.c:59 > > #5 0x00402cd6 in support_test_main (argc=1, argv=0xffffcef8, > > config=config@entry=0xffffcdf0) at support_test_main.c:350 > > #6 0x00402348 in main (argc=<optimized out>, argv=<optimized out>) > > at ../support/test-driver.c:168 > > (gdb) > > > > AVX512 needs 2560 bytes to save processor state. > > Well, this is the problem that we knew existed and can't fix quickly. > If I'm reading http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html > correctly, we can't introduce new sysconf constants unilaterally > (there is no license to define system-specific _SC_* symbols). > > I wonder if this test passes if you run it with LD_BIND_NOW=t in the > environment. Forcing -z now for these tests might be the best we can > do for 2.29. > This works:
Comments
On 1/18/19 9:53 AM, H.J. Lu wrote: > On Thu, Jan 17, 2019 at 7:12 PM Zack Weinberg <zackw@panix.com> wrote: >> >> On Wed, Jan 16, 2019 at 10:31 PM H.J. Lu <hjl.tools@gmail.com> wrote: >>> >>> The new tests failed on AVX512 machines: >>> >>> Program received signal SIGUSR1, User defined signal 1. >>> __GI_raise (sig=sig@entry=10) at ../sysdeps/unix/sysv/linux/raise.c:50 >>> 50 return ret; >>> (gdb) c >>> Continuing. >>> >>> Program received signal SIGSEGV, Segmentation fault. >>> _dl_runtime_resolve_xsavec () at ../sysdeps/x86_64/dl-trampoline.h:93 >>> 93 movq %rax, REGISTER_SAVE_RAX(%rsp) >>> (gdb) bt >>> #0 _dl_runtime_resolve_xsavec () at ../sysdeps/x86_64/dl-trampoline.h:93 >>> #1 0x0040248d in handler (unused=<optimized out>) at tst-minsigstksz-4.c:44 >>> #2 <signal handler called> >>> #3 __GI_raise (sig=sig@entry=10) at ../sysdeps/unix/sysv/linux/raise.c:50 >>> #4 0x004024da in do_test () at tst-minsigstksz-4.c:59 >>> #5 0x00402cd6 in support_test_main (argc=1, argv=0xffffcef8, >>> config=config@entry=0xffffcdf0) at support_test_main.c:350 >>> #6 0x00402348 in main (argc=<optimized out>, argv=<optimized out>) >>> at ../support/test-driver.c:168 >>> (gdb) >>> >>> AVX512 needs 2560 bytes to save processor state. >> >> Well, this is the problem that we knew existed and can't fix quickly. >> If I'm reading http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html >> correctly, we can't introduce new sysconf constants unilaterally >> (there is no license to define system-specific _SC_* symbols). >> >> I wonder if this test passes if you run it with LD_BIND_NOW=t in the >> environment. Forcing -z now for these tests might be the best we can >> do for 2.29. >> > > This works: > > diff --git a/signal/Makefile b/signal/Makefile > index 9597287bca..9c65c26887 100644 > --- a/signal/Makefile > +++ b/signal/Makefile > @@ -59,3 +59,9 @@ CFLAGS-sigwait.c += -fexceptions -fasynchronous-unwind-tables > CFLAGS-sigwaitinfo.c += -fexceptions -fasynchronous-unwind-tables > > CFLAGS-sigreturn.c += $(no-stack-protector) > + OK with comment: # We don't want to test the lazy resolution stack usage # just the execution of the handler and the functions > +LDFLAGS-tst-minsigstksz-1 = -Wl,-z,now > +LDFLAGS-tst-minsigstksz-2 = -Wl,-z,now > +LDFLAGS-tst-minsigstksz-3 = -Wl,-z,now > +LDFLAGS-tst-minsigstksz-3a = -Wl,-z,now > +LDFLAGS-tst-minsigstksz-4 = -Wl,-z,now > I'd say this is OK for master. Reviewed-by: Carlos O'Donell <carlos@redhat.com> All of this raises an interesting point. Should MINSIGSTKSZ have included enough space for the lazy resolution to happen? I think we have to, because you're never going to have already called abort, quick_exit, or _exit, so they will all go through lazy binding resolution if you're not BIND_NOW. Which means we need an average estimate from all arches about the lazy binding stack usage.
On Fri, Jan 18, 2019 at 11:40 AM Carlos O'Donell <carlos@redhat.com> wrote: > > All of this raises an interesting point. Should MINSIGSTKSZ > have included enough space for the lazy resolution to happen? > I think we have to, because you're never going to have already > called abort, quick_exit, or _exit, so they will all go through > lazy binding resolution if you're not BIND_NOW. Which means we > need an average estimate from all arches about the lazy binding > stack usage. in the long term, this strikes me as another reason we should be thinking about making eager symbol resolution the default ... (and then we could start thinking about moving the dynamic loader out of process ...) in the medium term, though, I completely agree in the 2.29 term, though, perhaps the best we can do is some documentation, I'll have a go at writing that in the next couple days zw
On Fri, Jan 18, 2019 at 10:08 AM Zack Weinberg <zackw@panix.com> wrote: > > On Fri, Jan 18, 2019 at 11:40 AM Carlos O'Donell <carlos@redhat.com> wrote: > > > > All of this raises an interesting point. Should MINSIGSTKSZ > > have included enough space for the lazy resolution to happen? > > I think we have to, because you're never going to have already > > called abort, quick_exit, or _exit, so they will all go through > > lazy binding resolution if you're not BIND_NOW. Which means we > > need an average estimate from all arches about the lazy binding > > stack usage. > > in the long term, this strikes me as another reason we should be > thinking about making eager symbol resolution the default ... (and > then we could start thinking about moving the dynamic loader out of > process ...) > > in the medium term, though, I completely agree > > in the 2.29 term, though, perhaps the best we can do is some > documentation, I'll have a go at writing that in the next couple days > The 32-bit signal/tst-minsigstksz-1 failed on AVX512 machine: Program received signal SIGSEGV, Segmentation fault. 0xf7e2e366 in __GI___libc_sigaction (sig=10, act=0xf7cf70a8, oact=0xf7cf7134) at ../sysdeps/unix/sysv/linux/sigaction.c:48 48 if (act) (gdb) disass Dump of assembler code for function __GI___libc_sigaction: 0xf7e2e350 <+0>: sub $0x14c,%esp 0xf7e2e356 <+6>: xor %ecx,%ecx 0xf7e2e358 <+8>: mov %esi,0x140(%esp) 0xf7e2e35f <+15>: mov 0x154(%esp),%edx => 0xf7e2e366 <+22>: call 0xf7f44c2f <__x86.get_pc_thunk.si> For 64-bit: Breakpoint 3, __GI___libc_sigaction (sig=sig@entry=10, act=act@entry=0x7fffffffdb40, oact=oact@entry=0x0) at ../sysdeps/unix/sysv/linux/sigaction.c:48 48 if (act) (gdb) disass Dump of assembler code for function __GI___libc_sigaction: => 0x00007ffff7e4dc50 <+0>: sub $0xd0,%rsp 0x00007ffff7e4dc57 <+7>: mov %rdx,%r8 0x00007ffff7e4dc5a <+10>: test %rsi,%rsi 0x00007ffff7e4dc5d <+13>: je 0x7ffff7e4ddb0 <__GI___libc_sigaction+352> 0x00007ffff7e4dc63 <+19>: mov (%rsi),%rax 64-bit allocates smaller stack and it doesn't need to call __x86.get_pc_thunk.si. On AVX512, kernel needs larger stack space to save signal context.
On 1/18/19 2:49 PM, H.J. Lu wrote: > On Fri, Jan 18, 2019 at 10:08 AM Zack Weinberg <zackw@panix.com> wrote: >> >> On Fri, Jan 18, 2019 at 11:40 AM Carlos O'Donell <carlos@redhat.com> wrote: >>> >>> All of this raises an interesting point. Should MINSIGSTKSZ >>> have included enough space for the lazy resolution to happen? >>> I think we have to, because you're never going to have already >>> called abort, quick_exit, or _exit, so they will all go through >>> lazy binding resolution if you're not BIND_NOW. Which means we >>> need an average estimate from all arches about the lazy binding >>> stack usage. >> >> in the long term, this strikes me as another reason we should be >> thinking about making eager symbol resolution the default ... (and >> then we could start thinking about moving the dynamic loader out of >> process ...) >> >> in the medium term, though, I completely agree >> >> in the 2.29 term, though, perhaps the best we can do is some >> documentation, I'll have a go at writing that in the next couple days >> > > The 32-bit signal/tst-minsigstksz-1 failed on AVX512 machine: > > Program received signal SIGSEGV, Segmentation fault. > 0xf7e2e366 in __GI___libc_sigaction (sig=10, act=0xf7cf70a8, oact=0xf7cf7134) > at ../sysdeps/unix/sysv/linux/sigaction.c:48 > 48 if (act) > (gdb) disass > Dump of assembler code for function __GI___libc_sigaction: > 0xf7e2e350 <+0>: sub $0x14c,%esp > 0xf7e2e356 <+6>: xor %ecx,%ecx > 0xf7e2e358 <+8>: mov %esi,0x140(%esp) > 0xf7e2e35f <+15>: mov 0x154(%esp),%edx > => 0xf7e2e366 <+22>: call 0xf7f44c2f <__x86.get_pc_thunk.si> > > For 64-bit: > > Breakpoint 3, __GI___libc_sigaction (sig=sig@entry=10, > act=act@entry=0x7fffffffdb40, oact=oact@entry=0x0) > at ../sysdeps/unix/sysv/linux/sigaction.c:48 > 48 if (act) > (gdb) disass > Dump of assembler code for function __GI___libc_sigaction: > => 0x00007ffff7e4dc50 <+0>: sub $0xd0,%rsp > 0x00007ffff7e4dc57 <+7>: mov %rdx,%r8 > 0x00007ffff7e4dc5a <+10>: test %rsi,%rsi > 0x00007ffff7e4dc5d <+13>: je 0x7ffff7e4ddb0 <__GI___libc_sigaction+352> > 0x00007ffff7e4dc63 <+19>: mov (%rsi),%rax > > 64-bit allocates smaller stack and it doesn't need to call > __x86.get_pc_thunk.si. On AVX512, kernel needs larger > stack space to save signal context. I think these should be marked as XFAIL for x86, if the tests can't pass with BIND_NOW then there is a real problem here that needs fixing in the next release cycle.
diff --git a/signal/Makefile b/signal/Makefile index 9597287bca..9c65c26887 100644 --- a/signal/Makefile +++ b/signal/Makefile @@ -59,3 +59,9 @@ CFLAGS-sigwait.c += -fexceptions -fasynchronous-unwind-tables CFLAGS-sigwaitinfo.c += -fexceptions -fasynchronous-unwind-tables CFLAGS-sigreturn.c += $(no-stack-protector) + +LDFLAGS-tst-minsigstksz-1 = -Wl,-z,now +LDFLAGS-tst-minsigstksz-2 = -Wl,-z,now +LDFLAGS-tst-minsigstksz-3 = -Wl,-z,now +LDFLAGS-tst-minsigstksz-3a = -Wl,-z,now +LDFLAGS-tst-minsigstksz-4 = -Wl,-z,now