powerpc64: Workaround sigtramp vdso return call.

Message ID 20210126130515.jivsqeoyj5nu5znk@work-tp
State Superseded
Headers
Series powerpc64: Workaround sigtramp vdso return call. |

Commit Message

Raoni Fassina Firmino Jan. 26, 2021, 1:05 p.m. UTC
  Hi all,

There was some initial discussions on the mailing list about the
failing misc/tst-sigcontext-get_pc[1], and I made some suggestions of
possible solutions.  As the window for the release is closing I want
to sent the more simple one of them ASAP for consideration (others
would not make it in time)

o/
Raoni

[1] https://sourceware.org/pipermail/libc-alpha/2021-January/121933.html

---- 8< ----

A not so recent kernel change[1] changed how the trampoline
`__kernel_sigtramp_rt64` is used to call signal handlers.

This was exposed on the test misc/tst-sigcontext-get_pc

Before kernel 5.9, the kernel set LR to the trampoline address and
jumped directly to the signal handler, and at the end the signal
handler, as any other function, would `blr` to the address set.  In
other words, the trampoline was executed just at the end of the signal
handler and the only thing it did was call sigreturn.  But since
kernel 5.9 the kernel set CTRL to the signal handler and calls to the
trampoline code, the trampoline then `bctrl` to the address in CTRL,
setting the LR to the next instruction in the middle of the
trampoline, when the signal handler returns, the rest of the
trampoline code executes the same code as before.

Here is the full trampoline code as of kernel 5.11.0-rc5 for
reference:

    V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
    .Lsigrt_start:
            bctrl▸  /* call the handler */
            addi▸   r1, r1, __SIGNAL_FRAMESIZE
            li▸     r0,__NR_rt_sigreturn
            sc
    .Lsigrt_end:
    V_FUNCTION_END(__kernel_sigtramp_rt64)

This new behavior breaks how `backtrace()` uses to detect the
trampoline frame to correctly reconstruct the stack frame when it is
called from inside a signal handling.

This workaround rely on the fact that the trampoline code is at very
least two (maybe 3?) instructions in size (as it is in the 32 bits
version, only on `li` and `sc`), so it is safe to check the return
address be in the range __kernel_sigtramp_rt64 .. + 4.

[1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline
    commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
    url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9
---
 sysdeps/powerpc/powerpc64/backtrace.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
  

Comments

Raoni Fassina Firmino Jan. 26, 2021, 1:08 p.m. UTC | #1
Forgot to add:

Tested on top of master (04c6a8073d1c9d73c4a88b536aeb803b12fbffdc)
on the following platforms with no regression:
   - powerpc64le-linux-gnu (Power 9) kernel 5.4.0-59
   - powerpc64le-linux-gnu (Power 9) kernel 5.9.14-1
   - powerpc64-linux-gnu (Power 9) kernel 5.10.0-1
  
Florian Weimer Jan. 26, 2021, 2:12 p.m. UTC | #2
* Raoni Fassina Firmino:

> A not so recent kernel change[1] changed how the trampoline
> `__kernel_sigtramp_rt64` is used to call signal handlers.
>
> This was exposed on the test misc/tst-sigcontext-get_pc
>
> Before kernel 5.9, the kernel set LR to the trampoline address and
> jumped directly to the signal handler, and at the end the signal
> handler, as any other function, would `blr` to the address set.  In
> other words, the trampoline was executed just at the end of the signal
> handler and the only thing it did was call sigreturn.  But since
> kernel 5.9 the kernel set CTRL to the signal handler and calls to the
> trampoline code, the trampoline then `bctrl` to the address in CTRL,
> setting the LR to the next instruction in the middle of the
> trampoline, when the signal handler returns, the rest of the
> trampoline code executes the same code as before.

Thanks for the patch, byt:

No one has explained so far why the original blr instruction couldn't be
augmented with the appropriate branch predictor hint.  The 2.07 ISA
manual suggests that it's possible, but maybe I'm reading it wrong.

Thanks,
Florian
  
Paul E Murphy Jan. 26, 2021, 2:45 p.m. UTC | #3
On 1/26/21 8:12 AM, Florian Weimer via Libc-alpha wrote:
> * Raoni Fassina Firmino:
> 
>> A not so recent kernel change[1] changed how the trampoline
>> `__kernel_sigtramp_rt64` is used to call signal handlers.
>>
>> This was exposed on the test misc/tst-sigcontext-get_pc
>>
>> Before kernel 5.9, the kernel set LR to the trampoline address and
>> jumped directly to the signal handler, and at the end the signal
>> handler, as any other function, would `blr` to the address set.  In
>> other words, the trampoline was executed just at the end of the signal
>> handler and the only thing it did was call sigreturn.  But since
>> kernel 5.9 the kernel set CTRL to the signal handler and calls to the
>> trampoline code, the trampoline then `bctrl` to the address in CTRL,
>> setting the LR to the next instruction in the middle of the
>> trampoline, when the signal handler returns, the rest of the
>> trampoline code executes the same code as before.
> 
> Thanks for the patch, byt:
> 
> No one has explained so far why the original blr instruction couldn't be
> augmented with the appropriate branch predictor hint.  The 2.07 ISA
> manual suggests that it's possible, but maybe I'm reading it wrong.

bctrl is the preferred form of making indirect calls.  You can add hint 
0b01 to bclr (blr) to get similar behavior on power8/9, but as noted in 
the ISA, it is optional.
  
Adhemerval Zanella Jan. 26, 2021, 4:36 p.m. UTC | #4
On 26/01/2021 10:05, Raoni Fassina Firmino via Libc-alpha wrote:
> Hi all,
> 
> There was some initial discussions on the mailing list about the
> failing misc/tst-sigcontext-get_pc[1], and I made some suggestions of
> possible solutions.  As the window for the release is closing I want
> to sent the more simple one of them ASAP for consideration (others
> would not make it in time)
> 
> o/
> Raoni
> 
> [1] https://sourceware.org/pipermail/libc-alpha/2021-January/121933.html

As Rich has noted, this should not interfere with BZ#12683 fix (which
I do hope we get upstream in next release): the resulting Program Counter
from the signal handle context is set to next instruct pass the syscall:

(gdb) n
39        uintptr_t pc = sigcontext_get_pc (ctx);
$1 = 140737351668608
(gdb) p /x pc
$2 = 0x7ffff7da5380
(gdb) x/20i 0x7ffff7da5380
   0x7ffff7da5380 <__GI_raise+496>:     mfcr    r0
   0x7ffff7da5384 <__GI_raise+500>:     addi    r1,r1,400
   0x7ffff7da5388 <__GI_raise+504>:     mr      r3,r31
   0x7ffff7da538c <__GI_raise+508>:     ld      r0,16(r1)

So this fix is indeed a backtrace() one, which raise another question:
do we still need to keep this powerpc optimization?  It seems that only
powerpc and microblaze does use an optimized version and I think by
using the generic libgcc based one might allow some consolidation
and code simplification (specially for powerpc64 case, where it would
allow to remove some internal vDSO symbol setup).

> 
> ---- 8< ----
> 
> A not so recent kernel change[1] changed how the trampoline
> `__kernel_sigtramp_rt64` is used to call signal handlers.
> 
> This was exposed on the test misc/tst-sigcontext-get_pc
> 
> Before kernel 5.9, the kernel set LR to the trampoline address and
> jumped directly to the signal handler, and at the end the signal
> handler, as any other function, would `blr` to the address set.  In
> other words, the trampoline was executed just at the end of the signal
> handler and the only thing it did was call sigreturn.  But since
> kernel 5.9 the kernel set CTRL to the signal handler and calls to the
> trampoline code, the trampoline then `bctrl` to the address in CTRL,
> setting the LR to the next instruction in the middle of the
> trampoline, when the signal handler returns, the rest of the
> trampoline code executes the same code as before.
> 
> Here is the full trampoline code as of kernel 5.11.0-rc5 for
> reference:
> 
>     V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
>     .Lsigrt_start:
>             bctrl▸  /* call the handler */
>             addi▸   r1, r1, __SIGNAL_FRAMESIZE
>             li▸     r0,__NR_rt_sigreturn
>             sc
>     .Lsigrt_end:
>     V_FUNCTION_END(__kernel_sigtramp_rt64)
> 
> This new behavior breaks how `backtrace()` uses to detect the
> trampoline frame to correctly reconstruct the stack frame when it is
> called from inside a signal handling.
> 
> This workaround rely on the fact that the trampoline code is at very
> least two (maybe 3?) instructions in size (as it is in the 32 bits
> version, only on `li` and `sc`), so it is safe to check the return
> address be in the range __kernel_sigtramp_rt64 .. + 4.
> 
> [1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline
>     commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
>     url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9
> ---
>  sysdeps/powerpc/powerpc64/backtrace.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/sysdeps/powerpc/powerpc64/backtrace.c b/sysdeps/powerpc/powerpc64/backtrace.c
> index ae64c5d7a5..5540085436 100644
> --- a/sysdeps/powerpc/powerpc64/backtrace.c
> +++ b/sysdeps/powerpc/powerpc64/backtrace.c
> @@ -54,11 +54,20 @@ struct signal_frame_64 {
>    /* We don't care about the rest, since the IP value is at 'uc' field.  */
>  };
>  
> +/* Test if the address match to the inside the trampoline code.  We cannot
> + * count on the return address be the beginning of sigtramp, the kernel may jump

Line too long.

> + * to the trampoline and the trampoline jump to the signal handler and in this
> + * case the address will be somewhere in the middle of the trampoline.  This is
> + * a workaround while we cannot know the sigtramp size.  It will be at least two

Ditto.

> + * instructions long since it needs at least to load the syscall number for
> + * sigreturn and call it.
> + */

Maybe describe the kernel change itself, such as:

/* Up to kernel 5.9, returning from an interrupt or syscall to a signal 
   handler starts execution directly at the handler's entry point, with
   LR set to address of the sigreturn trampoline (the vDSO symbol).
   Newer kernels will branch to signal handler from the trampoline
   instead, so checking the stacktrace against the vDSO entrypoint does
   not work in such case.
   The vDSO branches with a 'bctrl' instruction, so checking either
   the vDSO address itself and the next instruction should cover all
   kernel versions.  */

>  static inline bool
>  is_sigtramp_address (void *nip)
>  {
>  #ifdef HAVE_SIGTRAMP_RT64
> -  if (nip == GLRO (dl_vdso_sigtramp_rt64))
> +  void *sigtramp_addr = GLRO (dl_vdso_sigtramp_rt64);
> +  if (nip >= sigtramp_addr && nip <= sigtramp_addr + 4)

Maybe it would be simpler to just:

 if (nip == GLRO (dl_vdso_sigtramp_rt64) ||
     nip == GLRO (dl_vdso_sigtramp_rt64) + 4)

(dl_vdso_sigtramp_rt64 is a void* and gcc allows void pointer arithmetic 
as an extension).

>      return true;
>  #endif
>    return false;
>
  
Raoni Fassina Firmino Jan. 27, 2021, 4:21 p.m. UTC | #5
On Tue, Jan 26, 2021 at 08:45:00AM -0600, AL glibc-alpha wrote:
> 
> 
> On 1/26/21 8:12 AM, Florian Weimer via Libc-alpha wrote:
> > * Raoni Fassina Firmino:
> > 
> > > A not so recent kernel change[1] changed how the trampoline
> > > `__kernel_sigtramp_rt64` is used to call signal handlers.
> > > 
> > > This was exposed on the test misc/tst-sigcontext-get_pc
> > > 
> > > Before kernel 5.9, the kernel set LR to the trampoline address and
> > > jumped directly to the signal handler, and at the end the signal
> > > handler, as any other function, would `blr` to the address set.  In
> > > other words, the trampoline was executed just at the end of the signal
> > > handler and the only thing it did was call sigreturn.  But since
> > > kernel 5.9 the kernel set CTRL to the signal handler and calls to the
> > > trampoline code, the trampoline then `bctrl` to the address in CTRL,
> > > setting the LR to the next instruction in the middle of the
> > > trampoline, when the signal handler returns, the rest of the
> > > trampoline code executes the same code as before.
> > 
> > Thanks for the patch, byt:
> > 
> > No one has explained so far why the original blr instruction couldn't be
> > augmented with the appropriate branch predictor hint.  The 2.07 ISA
> > manual suggests that it's possible, but maybe I'm reading it wrong.
> 
> bctrl is the preferred form of making indirect calls.  You can add hint 0b01
> to bclr (blr) to get similar behavior on power8/9, but as noted in the ISA,
> it is optional.

What branch prediction we are talking about? I think there is only one
blr that is relevant, the one returning from the signal handler to the
trampoline. In this case it if it is a simple blr is already hinted
correctly with 0b00 (I think it is the default BH for blr), that it is a
subroutine return. We don't have control over the blr from the signal
handler to change the hint to 0b01 anyway. So IIUC, the return address
predictor failed before because the signal handler don't go back from
the same place (+4) it was called, and it changes with the added bctrl.

I am CC'ing Nicholas and maybe he has more insight.

(I know that now this discussion is split in two places, the original
thread Florian started and this on for the patch. Not sure where best to
continue this)
  
Raoni Fassina Firmino Jan. 27, 2021, 7:22 p.m. UTC | #6
Thank you for the review :-)


On Tue, Jan 26, 2021 at 01:36:12PM -0300, AL glibc-alpha wrote:
> 
> 
> On 26/01/2021 10:05, Raoni Fassina Firmino via Libc-alpha wrote:
> > Hi all,
> > 
> > There was some initial discussions on the mailing list about the
> > failing misc/tst-sigcontext-get_pc[1], and I made some suggestions of
> > possible solutions.  As the window for the release is closing I want
> > to sent the more simple one of them ASAP for consideration (others
> > would not make it in time)
> > 
> > o/
> > Raoni
> > 
> > [1] https://sourceware.org/pipermail/libc-alpha/2021-January/121933.html
> 
> As Rich has noted, this should not interfere with BZ#12683 fix (which
> I do hope we get upstream in next release): the resulting Program Counter
> from the signal handle context is set to next instruct pass the syscall:
> 
> (gdb) n
> 39        uintptr_t pc = sigcontext_get_pc (ctx);
> $1 = 140737351668608
> (gdb) p /x pc
> $2 = 0x7ffff7da5380
> (gdb) x/20i 0x7ffff7da5380
>    0x7ffff7da5380 <__GI_raise+496>:     mfcr    r0
>    0x7ffff7da5384 <__GI_raise+500>:     addi    r1,r1,400
>    0x7ffff7da5388 <__GI_raise+504>:     mr      r3,r31
>    0x7ffff7da538c <__GI_raise+508>:     ld      r0,16(r1)
> 

Glad it didn't messed up anything unexpected and thanks for the test :-)


> So this fix is indeed a backtrace() one, which raise another question:
> do we still need to keep this powerpc optimization?  It seems that only
> powerpc and microblaze does use an optimized version and I think by
> using the generic libgcc based one might allow some consolidation
> and code simplification (specially for powerpc64 case, where it would
> allow to remove some internal vDSO symbol setup).

I would not know about it, I would have to look into it. In any case
this would be a change only for the next release or beyond, right?

CC'ing Tulio, maybe he or someone else has some insight into it.


> 
> > 
> > ---- 8< ----
> > 
> > A not so recent kernel change[1] changed how the trampoline
> > `__kernel_sigtramp_rt64` is used to call signal handlers.
> > 
> > This was exposed on the test misc/tst-sigcontext-get_pc
> > 
> > Before kernel 5.9, the kernel set LR to the trampoline address and
> > jumped directly to the signal handler, and at the end the signal
> > handler, as any other function, would `blr` to the address set.  In
> > other words, the trampoline was executed just at the end of the signal
> > handler and the only thing it did was call sigreturn.  But since
> > kernel 5.9 the kernel set CTRL to the signal handler and calls to the
> > trampoline code, the trampoline then `bctrl` to the address in CTRL,
> > setting the LR to the next instruction in the middle of the
> > trampoline, when the signal handler returns, the rest of the
> > trampoline code executes the same code as before.
> > 
> > Here is the full trampoline code as of kernel 5.11.0-rc5 for
> > reference:
> > 
> >     V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
> >     .Lsigrt_start:
> >             bctrl▸  /* call the handler */
> >             addi▸   r1, r1, __SIGNAL_FRAMESIZE
> >             li▸     r0,__NR_rt_sigreturn
> >             sc
> >     .Lsigrt_end:
> >     V_FUNCTION_END(__kernel_sigtramp_rt64)
> > 
> > This new behavior breaks how `backtrace()` uses to detect the
> > trampoline frame to correctly reconstruct the stack frame when it is
> > called from inside a signal handling.
> > 
> > This workaround rely on the fact that the trampoline code is at very
> > least two (maybe 3?) instructions in size (as it is in the 32 bits
> > version, only on `li` and `sc`), so it is safe to check the return
> > address be in the range __kernel_sigtramp_rt64 .. + 4.
> > 
> > [1] subject: powerpc/64/signal: Balance return predictor stack in signal trampoline
> >     commit: 0138ba5783ae0dcc799ad401a1e8ac8333790df9
> >     url: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0138ba5783ae0dcc799ad401a1e8ac8333790df9
> > ---
> >  sysdeps/powerpc/powerpc64/backtrace.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/sysdeps/powerpc/powerpc64/backtrace.c b/sysdeps/powerpc/powerpc64/backtrace.c
> > index ae64c5d7a5..5540085436 100644
> > --- a/sysdeps/powerpc/powerpc64/backtrace.c
> > +++ b/sysdeps/powerpc/powerpc64/backtrace.c
> > @@ -54,11 +54,20 @@ struct signal_frame_64 {
> >    /* We don't care about the rest, since the IP value is at 'uc' field.  */
> >  };
> >  
> > +/* Test if the address match to the inside the trampoline code.  We cannot
> > + * count on the return address be the beginning of sigtramp, the kernel may jump
> 
> Line too long.

Oh yes, Is it 79, my mistake. Fixed.


> 
> > + * to the trampoline and the trampoline jump to the signal handler and in this
> > + * case the address will be somewhere in the middle of the trampoline.  This is
> > + * a workaround while we cannot know the sigtramp size.  It will be at least two
> 
> Ditto.

Fixed.


> 
> > + * instructions long since it needs at least to load the syscall number for
> > + * sigreturn and call it.
> > + */
> 
> Maybe describe the kernel change itself, such as:
> 
> /* Up to kernel 5.9, returning from an interrupt or syscall to a signal 
>    handler starts execution directly at the handler's entry point, with
>    LR set to address of the sigreturn trampoline (the vDSO symbol).
>    Newer kernels will branch to signal handler from the trampoline
>    instead, so checking the stacktrace against the vDSO entrypoint does
>    not work in such case.
>    The vDSO branches with a 'bctrl' instruction, so checking either
>    the vDSO address itself and the next instruction should cover all
>    kernel versions.  */


Thank you, I was not happy with my explanation, yours is better.

I also notice that I did the comment block wrong (Bad vim!), fixed that
too.


> 
> >  static inline bool
> >  is_sigtramp_address (void *nip)
> >  {
> >  #ifdef HAVE_SIGTRAMP_RT64
> > -  if (nip == GLRO (dl_vdso_sigtramp_rt64))
> > +  void *sigtramp_addr = GLRO (dl_vdso_sigtramp_rt64);
> > +  if (nip >= sigtramp_addr && nip <= sigtramp_addr + 4)
> 
> Maybe it would be simpler to just:
> 
>  if (nip == GLRO (dl_vdso_sigtramp_rt64) ||
>      nip == GLRO (dl_vdso_sigtramp_rt64) + 4)
> 
> (dl_vdso_sigtramp_rt64 is a void* and gcc allows void pointer arithmetic 
> as an extension).

I have no strong opinion one way or the other.  In the end
"sigtramp_addr" is almost as long as "GLRO (dl_vdso_sigtramp_rt64)"
anyway and the line split I think is very and more readable.

Thanks, changed to this version.
  
Adhemerval Zanella Jan. 27, 2021, 7:31 p.m. UTC | #7
On 27/01/2021 16:22, Raoni Fassina Firmino wrote:
>> So this fix is indeed a backtrace() one, which raise another question:
>> do we still need to keep this powerpc optimization?  It seems that only
>> powerpc and microblaze does use an optimized version and I think by
>> using the generic libgcc based one might allow some consolidation
>> and code simplification (specially for powerpc64 case, where it would
>> allow to remove some internal vDSO symbol setup).
> 
> I would not know about it, I would have to look into it. In any case
> this would be a change only for the next release or beyond, right?

Yes, I will probably propose it once 2.34 opens.
  
Nicholas Piggin Jan. 28, 2021, 5:38 a.m. UTC | #8
+linuxppc-dev

Excerpts from Raoni Fassina Firmino's message of January 28, 2021 2:21 am:
> On Tue, Jan 26, 2021 at 08:45:00AM -0600, AL glibc-alpha wrote:
>> 
>> 
>> On 1/26/21 8:12 AM, Florian Weimer via Libc-alpha wrote:
>> > * Raoni Fassina Firmino:
>> > 
>> > > A not so recent kernel change[1] changed how the trampoline
>> > > `__kernel_sigtramp_rt64` is used to call signal handlers.
>> > > 
>> > > This was exposed on the test misc/tst-sigcontext-get_pc
>> > > 
>> > > Before kernel 5.9, the kernel set LR to the trampoline address and
>> > > jumped directly to the signal handler, and at the end the signal
>> > > handler, as any other function, would `blr` to the address set.  In
>> > > other words, the trampoline was executed just at the end of the signal
>> > > handler and the only thing it did was call sigreturn.  But since
>> > > kernel 5.9 the kernel set CTRL to the signal handler and calls to the
>> > > trampoline code, the trampoline then `bctrl` to the address in CTRL,
>> > > setting the LR to the next instruction in the middle of the
>> > > trampoline, when the signal handler returns, the rest of the
>> > > trampoline code executes the same code as before.
>> > 
>> > Thanks for the patch, byt:
>> > 
>> > No one has explained so far why the original blr instruction couldn't be
>> > augmented with the appropriate branch predictor hint.  The 2.07 ISA
>> > manual suggests that it's possible, but maybe I'm reading it wrong.
>> 
>> bctrl is the preferred form of making indirect calls.  You can add hint 0b01
>> to bclr (blr) to get similar behavior on power8/9, but as noted in the ISA,
>> it is optional.
> 
> What branch prediction we are talking about? I think there is only one
> blr that is relevant, the one returning from the signal handler to the
> trampoline. In this case it if it is a simple blr is already hinted
> correctly with 0b00 (I think it is the default BH for blr), that it is a
> subroutine return. We don't have control over the blr from the signal
> handler to change the hint to 0b01 anyway. So IIUC, the return address
> predictor failed before because the signal handler don't go back from
> the same place (+4) it was called, and it changes with the added bctrl.
> 
> I am CC'ing Nicholas and maybe he has more insight.

Prior to the kernel patch, if the signal handler code used blr BH=0b01
for returns that would indeed prevent the unbalance on processors which
implement it.

But you are right, as explained in Linux commit 0138ba5783ae, the blr is 
in the signal handler function so we can't change that.

> (I know that now this discussion is split in two places, the original
> thread Florian started and this on for the patch. Not sure where best to
> continue this)

linuxppc-dev doesn't mind responsible cross posts to other lists,
hopefully libc-alpha is too.

Thanks,
Nick
  
Andreas Schwab Jan. 28, 2021, 8:54 a.m. UTC | #9
Please explain why this cannot be fixed in the kernel, given the strict
no-regression policy.

Andreas.
  
Raoni Fassina Firmino Jan. 28, 2021, 1:03 p.m. UTC | #10
On Thu, Jan 28, 2021 at 09:54:19AM +0100, Andreas Schwab wrote:
> Please explain why this cannot be fixed in the kernel, given the strict
> no-regression policy.

Yes, I think it can and I am already looking into it. But I also think
it can be improved on the glic side, like I hinted in [1].


[1] https://sourceware.org/pipermail/libc-alpha/2021-January/121951.html
  

Patch

diff --git a/sysdeps/powerpc/powerpc64/backtrace.c b/sysdeps/powerpc/powerpc64/backtrace.c
index ae64c5d7a5..5540085436 100644
--- a/sysdeps/powerpc/powerpc64/backtrace.c
+++ b/sysdeps/powerpc/powerpc64/backtrace.c
@@ -54,11 +54,20 @@  struct signal_frame_64 {
   /* We don't care about the rest, since the IP value is at 'uc' field.  */
 };
 
+/* Test if the address match to the inside the trampoline code.  We cannot
+ * count on the return address be the beginning of sigtramp, the kernel may jump
+ * to the trampoline and the trampoline jump to the signal handler and in this
+ * case the address will be somewhere in the middle of the trampoline.  This is
+ * a workaround while we cannot know the sigtramp size.  It will be at least two
+ * instructions long since it needs at least to load the syscall number for
+ * sigreturn and call it.
+ */
 static inline bool
 is_sigtramp_address (void *nip)
 {
 #ifdef HAVE_SIGTRAMP_RT64
-  if (nip == GLRO (dl_vdso_sigtramp_rt64))
+  void *sigtramp_addr = GLRO (dl_vdso_sigtramp_rt64);
+  if (nip >= sigtramp_addr && nip <= sigtramp_addr + 4)
     return true;
 #endif
   return false;