[2/4] aarch64: Re-implement setcontext without sigreturn syscall

Message ID 1394707543-9690-2-git-send-email-will.newton@linaro.org
State Committed

Commit Message

Will Newton March 13, 2014, 10:45 a.m. UTC
  The current implementation of setcontext uses sigreturn to restore
the contents of registers. This contrasts with the way most other
architectures implement setcontext:

  powerpc, mips, tile:

  Call sigreturn if context was created by a call to a signal handler,
  otherwise restore in user code.

  x86_64, sparc, hppa, sh, ia64, m68k, s390, arm:

  Only support restoring "synchronous" contexts, that is contexts
  created by getcontext, and restoring in user code.

  alpha, aarch64:

  Call sigreturn in all cases to do the restore.

The text of the setcontext manpage suggests that the requirement to be
able to restore a signal handler created context has been dropped from
SUSv2:

  If  the context was obtained by a call to a signal handler, then old
  standard text says that "program execution continues with the program
  instruction following the instruction interrupted by the signal".
  However, this sentence was removed in SUSv2, and the present verdict
  is "the result is unspecified".

Implementing setcontext by calling sigreturn unconditionally causes
problems when used with sigaltstack as in BZ #16629. On this basis it
seems that aarch64 and likely alpha are broken and that new ports should
only support restoring contexts created with getcontext and do not need
to call sigreturn at all.

This patch re-implements the aarch64 setcontext function to restore
the context in user code in a similar manner to x86_64 and other ports.
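
For reference, the "synchronous" case described above can be sketched
in C as follows. This is only an illustration, not part of the patch:
the helper name resume_count is hypothetical, and the sketch assumes a
libc that provides the SUSv3 <ucontext.h> interfaces. The context is
created by getcontext and restored by setcontext within the same
function, so no signal frame is ever involved.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <ucontext.h>

/* Hypothetical helper: capture a context with getcontext and resume it
   with setcontext LIMIT times.  Returns the number of times the code
   following getcontext was executed, or -1 if getcontext failed.  */
static int
resume_count (int limit)
{
  ucontext_t ctx;
  volatile int passes = 0;	/* Kept in memory so it survives the restore.  */

  assert (limit >= 0);
  if (getcontext (&ctx) != 0)
    return -1;

  /* Execution arrives here once when getcontext returns and once more
     after every successful setcontext.  */
  passes++;
  if (passes <= limit)
    setcontext (&ctx);		/* Does not return on success.  */
  return passes;
}
```

With this sketch resume_count (3) returns 4: one pass for the original
getcontext return and three more for the resumes.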

ChangeLog:

2014-03-13  Will Newton  <will.newton@linaro.org>

	[BZ #16629]
	* sysdeps/unix/sysv/linux/aarch64/setcontext.S (__setcontext):
	Re-implement to restore registers in user code and avoid
	sigreturn system call.
---
 sysdeps/unix/sysv/linux/aarch64/setcontext.S | 139 +++++++++++++++++----------
 1 file changed, 88 insertions(+), 51 deletions(-)
  

Comments

Marcus Shawcroft March 17, 2014, 5:48 p.m. UTC | #1
On 13 March 2014 10:45, Will Newton <will.newton@linaro.org> wrote:


> +       /* Restore the general purpose registers.  */
> +       mov     x0, x9

Looks like this code is treating x9 as callee saved over the kernel
call. While this is true with the current implementation of the
kernel, the glibc port for AArch64 currently treats only the argument
registers as preserved.  This is a hangover from the early days of
the AArch64 port when the kernel guys wanted the glibc port to be
conservative in this respect in order that they have the opportunity
to be selective in what was restored on exit from the kernel.

Catalin, Can you comment on the above paragraph?  Is it still
important that glibc / user space be conservative in its assumptions
about which registers are preserved over system call or can we now
relax this position and exploit the fact that the kernel preserves
x9-x30 over kernel calls?


> +       cfi_def_cfa(x0, 0)

Space before (....

Cheers
/Marcus
  
Will Newton March 18, 2014, 3:35 p.m. UTC | #2
On 17 March 2014 17:48, Marcus Shawcroft <marcus.shawcroft@gmail.com> wrote:
> On 13 March 2014 10:45, Will Newton <will.newton@linaro.org> wrote:
>
>
>> +       /* Restore the general purpose registers.  */
>> +       mov     x0, x9
>
> Looks like this code is treating x9 as callee saved over the kernel
> call. While this is true with the current implementation of the
> kernel, the glibc port for AArch64 currently treats only the argument
> registers as preserved.  This is a hangover from the early days of
> the AArch64 port when the kernel guys wanted the glibc port to be
> conservative in this respect in order that they have the opportunity
> to be selective in what was restored on exit from the kernel.

Really, is that true? It seems to me you would have to save/restore in
userland anyway so I am not sure what would actually be saved by
making syscalls behave so differently from function calls.

> Catalin, Can you comment on the above paragraph?  Is it still
> important that glibc / user space be conservative in its assumptions
> about which registers are preserved over system call or can we now
> relax this position and exploit the fact that the kernel preserves
> x9-x30 over kernel calls?
>
>
>> +       cfi_def_cfa(x0, 0)
>
> Space before (....

Thanks, I fixed that.
  
Catalin Marinas March 18, 2014, 5:20 p.m. UTC | #3
On 18 Mar 2014, at 15:35, Will Newton <will.newton@linaro.org> wrote:
> On 17 March 2014 17:48, Marcus Shawcroft <marcus.shawcroft@gmail.com> wrote:
> > On 13 March 2014 10:45, Will Newton <will.newton@linaro.org> wrote:
> > 
> > +       /* Restore the general purpose registers.  */
> > +       mov     x0, x9
> > 
> > Looks like this code is treating x9 as callee saved over the kernel
> > call. While this is true with the current implementation of the
> > kernel, the glibc port for AArch64 currently treats only the argument
> > registers as preserved.  This is a hangover from the early days of
> > the AArch64 port when the kernel guys wanted the glibc port to be
> > conservative in this respect in order that they have the opportunity
> > to be selective in what was restored on exit from the kernel.
> 
> Really, is that true? It seems to me you would have to save/restore in
> userland anyway so I am not sure what would actually be saved by
> making syscalls behave so differently from function calls.

I think Marcus and I had several debates some time ago and I really
don’t remember what we decided is the most optimal way. Currently
Linux saves all the registers on syscall entry and restores all apart
from the return value.

> > Catalin, Can you comment on the above paragraph?  Is it still
> > important that glibc / user space be conservative in its assumptions
> > about which registers are preserved over system call or can we now
> > relax this position and exploit the fact that the kernel preserves
> > x9-x30 over kernel calls?

The kernel tradition is that once stuff got into mainline it is
considered user/kernel ABI (usually even if it is buggy) and cannot be
changed. So you can relax the user code and assume that everything is
preserved by the kernel (apart from x0 which is the return value).

Catalin
  
Marcus Shawcroft March 19, 2014, 4:39 p.m. UTC | #4
Hi Will,

On 13 March 2014 10:45, Will Newton <will.newton@linaro.org> wrote:

The use of x9 is clearly a non-issue now.  Which also means that the
syscall code in sysdeps.h can be relaxed somewhat...  That aside I
have another comment on the parsing of extension blocks in the signal
context:

> +       cfi_offset( d8, oV0 + 8 * SZVREG)
> +       cfi_offset( d9, oV0 + 9 * SZVREG)
> +       cfi_offset(d10, oV0 + 10 * SZVREG)
> +       cfi_offset(d11, oV0 + 11 * SZVREG)
> +       cfi_offset(d12, oV0 + 12 * SZVREG)
> +       cfi_offset(d13, oV0 + 13 * SZVREG)
> +       cfi_offset(d14, oV0 + 14 * SZVREG)
> +       cfi_offset(d15, oV0 + 15 * SZVREG)


> +       ldp     x18, x19, [x0, oX0 + 18 * SZREG]
> +       ldp     x20, x21, [x0, oX0 + 20 * SZREG]
> +       ldp     x22, x23, [x0, oX0 + 22 * SZREG]
> +       ldp     x24, x25, [x0, oX0 + 24 * SZREG]
> +       ldp     x26, x27, [x0, oX0 + 26 * SZREG]
> +       ldp     x28, x29, [x0, oX0 + 28 * SZREG]
> +       ldr     x30,      [x0, oX0 + 30 * SZREG]
> +       ldr     x2, [x0, oSP]
> +       mov     sp, x2
> +
> +       /* Check for FP SIMD context.  */
> +       add     x2, x0, #oEXTENSION
> +
> +       mov     w3, #(FPSIMD_MAGIC & 0xffff)
> +       movk    w3, #(FPSIMD_MAGIC >> 16), lsl #16
> +       ldr     w1, [x2, #oHEAD + oMAGIC]
> +       cmp     w1, w3
> +       b.ne    2f

The code should not assume the next block will be the fp/simd block.
The code should iterate over all of the remaining blocks using the
size field looking for the magic marker of the fp/simd block or the
null marker.

This also implies the cfi_offset code above using oV0 is incorrect.
Note that the existing use of oV0 is to construct a context; it is valid
for us to choose a layout where the fpsimd block follows
immediately after the initial context, but we should not assume the
kernel will use such a layout.
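
The iteration described above could look roughly like the following C
sketch. It is an illustration only: struct ctx_head, find_record and
the EXAMPLE_* constants are hypothetical names, the header layout
(a 32-bit magic followed by a 32-bit size covering the whole record)
and the FP/SIMD magic value are assumed to match the kernel's signal
context records, and the real code in setcontext.S would have to do
the same walk in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the header that starts every record in the
   extension area: a magic tag and the record size in bytes.  */
struct ctx_head
{
  uint32_t magic;
  uint32_t size;		/* Whole record, header included.  */
};

#define EXAMPLE_FPSIMD_MAGIC 0x46508001u	/* Assumed FP/SIMD marker.  */
#define EXAMPLE_NULL_MAGIC   0u			/* Assumed terminator.  */

/* Return a pointer to the first record tagged MAGIC in the LEN-byte
   area at AREA, or NULL if the terminator, a malformed record, or the
   end of the area is reached first.  */
static const unsigned char *
find_record (const unsigned char *area, size_t len, uint32_t magic)
{
  size_t off = 0;

  assert (area != NULL);
  while (len - off >= sizeof (struct ctx_head))
    {
      struct ctx_head head;
      memcpy (&head, area + off, sizeof head);	/* No alignment assumed.  */
      if (head.magic == magic)
	return area + off;
      if (head.magic == EXAMPLE_NULL_MAGIC)
	return NULL;
      /* Reject records shorter than a header or longer than what is left.  */
      if (head.size < sizeof head || head.size > len - off)
	return NULL;
      off += head.size;
    }
  return NULL;
}
```

The size check keeps the walk from looping forever or running off the
end of the area when a record is corrupt.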

Cheers
/Marcus
  
Rich Felker March 19, 2014, 5:12 p.m. UTC | #5
On Thu, Mar 13, 2014 at 10:45:41AM +0000, Will Newton wrote:
> The current implementation of setcontext uses sigreturn to restore
> the contents of registers. This contrasts with the way most other
> architectures implement setcontext:
> 
>   powerpc, mips, tile:
> 
>   Call sigreturn if context was created by a call to a signal handler,
>   otherwise restore in user code.
> 
>   x86_64, sparc, hppa, sh, ia64, m68k, s390, arm:
> 
>   Only support restoring "synchronous" contexts, that is contexts
>   created by getcontext, and restoring in user code.
> 
>   alpha, aarch64:
> 
>   Call sigreturn in all cases to do the restore.
> 
> The text of the setcontext manpage suggests that the requirement to be
> able to restore a signal handler created context has been dropped from
> SUSv2:
> 
>   If  the context was obtained by a call to a signal handler, then old
>   standard text says that "program execution continues with the program
>   instruction following the instruction interrupted by the signal".
>   However, this sentence was removed in SUSv2, and the present verdict
>   is "the result is unspecified".
> 
> Implementing setcontext by calling sigreturn unconditionally causes
> problems when used with sigaltstack as in BZ #16629. On this basis it
> seems that aarch64 and likely alpha are broken and that new ports should
> only support restoring contexts created with getcontext and do not need
> to call sigreturn at all.
> 
> This patch re-implements the aarch64 setcontext function to restore
> the context in user code in a similar manner to x86_64 and other ports.

I question the whole concept of this patch. On aarch64, the kernel
stores cpu-model-specific register state that libc won't necessarily
know how to restore as part of the context. Of course this isn't
needed for synchronous context switches, and it seems you're ok with
dropping support for asynchronous ones (from signal handlers), but my
impression was that using these interfaces from signal handlers was
one of the only reasons they're interesting. Otherwise they're just a
glorified setjmp/longjmp, and there's no reason to save/restore ANY
registers in the mcontext_t except the call-saved ones (same as
setjmp/longjmp do).

BTW sigreturn is also nice from the standpoint that restoring the
signal mask is atomic with respect to switching stacks. Otherwise, if
you restore the signal mask before switching, you risk a
newly-unmasked signal handler running on the old stack, and if you
restore the signal mask after switching, you risk a
previously-unmasked signal that should be masked running on the new
stack. The only way to solve this problem without sigreturn is to
block ALL signals that should be blocked in either the old or the new
mask before switching stacks.
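
A minimal C sketch of that two-step approach, assuming the plain POSIX
sigprocmask interface; the function names are hypothetical, and a real
setcontext would make the corresponding rt_sigprocmask calls from
assembly around the stack switch:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Step 1, before switching stacks: add every signal in NEW_MASK to the
   currently blocked set, so that anything blocked in either the old or
   the new mask stays blocked across the switch.  Returns 0 on success.  */
static int
block_union_before_switch (const sigset_t *new_mask)
{
  assert (new_mask != NULL);
  return sigprocmask (SIG_BLOCK, new_mask, NULL);
}

/* Step 2, after switching stacks: install exactly NEW_MASK, which
   unblocks the signals that were blocked only in the old mask.  */
static int
install_mask_after_switch (const sigset_t *new_mask)
{
  assert (new_mask != NULL);
  return sigprocmask (SIG_SETMASK, new_mask, NULL);
}
```

This costs two system calls instead of one, and a signal that is
unblocked in both masks can still arrive between the two steps, but it
will find a consistent stack and mask in either case.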

As for this last issue, I suspect glibc is buggy on other archs with
regard to this property...

Rich
  
Will Newton March 19, 2014, 5:13 p.m. UTC | #6
On 19 March 2014 16:39, Marcus Shawcroft <marcus.shawcroft@gmail.com> wrote:
> Hi Will,
>
> On 13 March 2014 10:45, Will Newton <will.newton@linaro.org> wrote:
>
> The use of x9 is clearly a non-issue now.  Which also means that the
> syscall code in sysdeps.h can be relaxed somewhat...  That aside I
> have another comment on the parsing of extension blocks in the signal
> context:
>
>> +       cfi_offset( d8, oV0 + 8 * SZVREG)
>> +       cfi_offset( d9, oV0 + 9 * SZVREG)
>> +       cfi_offset(d10, oV0 + 10 * SZVREG)
>> +       cfi_offset(d11, oV0 + 11 * SZVREG)
>> +       cfi_offset(d12, oV0 + 12 * SZVREG)
>> +       cfi_offset(d13, oV0 + 13 * SZVREG)
>> +       cfi_offset(d14, oV0 + 14 * SZVREG)
>> +       cfi_offset(d15, oV0 + 15 * SZVREG)
>
>
>> +       ldp     x18, x19, [x0, oX0 + 18 * SZREG]
>> +       ldp     x20, x21, [x0, oX0 + 20 * SZREG]
>> +       ldp     x22, x23, [x0, oX0 + 22 * SZREG]
>> +       ldp     x24, x25, [x0, oX0 + 24 * SZREG]
>> +       ldp     x26, x27, [x0, oX0 + 26 * SZREG]
>> +       ldp     x28, x29, [x0, oX0 + 28 * SZREG]
>> +       ldr     x30,      [x0, oX0 + 30 * SZREG]
>> +       ldr     x2, [x0, oSP]
>> +       mov     sp, x2
>> +
>> +       /* Check for FP SIMD context.  */
>> +       add     x2, x0, #oEXTENSION
>> +
>> +       mov     w3, #(FPSIMD_MAGIC & 0xffff)
>> +       movk    w3, #(FPSIMD_MAGIC >> 16), lsl #16
>> +       ldr     w1, [x2, #oHEAD + oMAGIC]
>> +       cmp     w1, w3
>> +       b.ne    2f
>
> The code should not assume the next block will be the fp/simd block.
> The code should iterate over all of the remaining blocks using the
> size field looking for the magic marker of the fp/simd block or the
> null marker.
>
> This also implies the cfi_offset code above using oV0 is incorrect.
> Note that the existing use of oV0 is to construct a context; it is valid
> for us to choose a layout where the fpsimd block follows
> immediately after the initial context, but we should not assume the
> kernel will use such a layout.

As per the commit message this code will only be run on contexts
created by getcontext/makecontext and not on kernel created contexts.
I can add support for handling arbitrarily shaped contexts, but it may
cause some unnecessary complexity in the code.
  
Will Newton March 19, 2014, 5:19 p.m. UTC | #7
On 19 March 2014 17:12, Rich Felker <dalias@aerifal.cx> wrote:
> On Thu, Mar 13, 2014 at 10:45:41AM +0000, Will Newton wrote:
>> The current implementation of setcontext uses sigreturn to restore
>> the contents of registers. This contrasts with the way most other
>> architectures implement setcontext:
>>
>>   powerpc, mips, tile:
>>
>>   Call sigreturn if context was created by a call to a signal handler,
>>   otherwise restore in user code.
>>
>>   x86_64, sparc, hppa, sh, ia64, m68k, s390, arm:
>>
>>   Only support restoring "synchronous" contexts, that is contexts
>>   created by getcontext, and restoring in user code.
>>
>>   alpha, aarch64:
>>
>>   Call sigreturn in all cases to do the restore.
>>
>> The text of the setcontext manpage suggests that the requirement to be
>> able to restore a signal handler created context has been dropped from
>> SUSv2:
>>
>>   If  the context was obtained by a call to a signal handler, then old
>>   standard text says that "program execution continues with the program
>>   instruction following the instruction interrupted by the signal".
>>   However, this sentence was removed in SUSv2, and the present verdict
>>   is "the result is unspecified".
>>
>> Implementing setcontext by calling sigreturn unconditionally causes
>> problems when used with sigaltstack as in BZ #16629. On this basis it
>> seems that aarch64 and likely alpha are broken and that new ports should
>> only support restoring contexts created with getcontext and do not need
>> to call sigreturn at all.
>>
>> This patch re-implements the aarch64 setcontext function to restore
>> the context in user code in a similar manner to x86_64 and other ports.
>
> I question the whole concept of this patch. On aarch64, the kernel
> stores cpu-model-specific register state that libc won't necessarily
> know how to restore as part of the context. Of course this isn't
> needed for synchronous context switches, and it seems you're ok with
> dropping support for asynchronous ones (from signal handlers), but my
> impression was that using these interfaces from signal handlers was
> one of the only reasons they're interesting. Otherwise they're just a
> glorified setjmp/longjmp, and there's no reason to save/restore ANY
> registers in the mcontext_t except the call-saved ones (same as
> setjmp/longjmp do).

Looking at the list of architectures that do not support asynchronous
contexts suggests that basically nobody is using that on Linux. I
would be interested if anyone was aware of software using that
feature, either on Linux on powerpc/mips/tile, BSD or Solaris.

> BTW sigreturn is also nice from the standpoint that restoring the
> signal mask is atomic with respect to switching stacks. Otherwise, if
> you restore the signal mask before switching, you risk a
> newly-unmasked signal handler running on the old stack, and if you
> restore the signal mask after switching, you risk a
> previously-unmasked signal that should be masked running on the new
> stack. The only way to solve this problem without sigreturn is to
> block ALL signals that should be blocked in either the old or the new
> mask before switching stacks.
>
> As for this last issue, I suspect glibc is buggy on other archs with
> regard to this property...

But on the other hand sigreturn interacts badly with sigaltstack which
actually causes problems with real running code (in this case the Go
language).
  
Rich Felker March 19, 2014, 5:37 p.m. UTC | #8
On Wed, Mar 19, 2014 at 05:19:23PM +0000, Will Newton wrote:
> >> This patch re-implements the aarch64 setcontext function to restore
> >> the context in user code in a similar manner to x86_64 and other ports.
> >
> > I question the whole concept of this patch. On aarch64, the kernel
> > stores cpu-model-specific register state that libc won't necessarily
> > know how to restore as part of the context. Of course this isn't
> > needed for synchronous context switches, and it seems you're ok with
> > dropping support for asynchronous ones (from signal handlers), but my
> > impression was that using these interfaces from signal handlers was
> > one of the only reasons they're interesting. Otherwise they're just a
> > glorified setjmp/longjmp, and there's no reason to save/restore ANY
> > registers in the mcontext_t except the call-saved ones (same as
> > setjmp/longjmp do).
> 
> Looking at the list of architectures that do not support asynchronous
> contexts suggests that basically nobody is using that on Linux. I
> would be interested if anyone was aware of software using that
> feature, either on Linux on powerpc/mips/tile, BSD or Solaris.

Well I think "basically nobody" is using any archs but x86[_64] and
arm, at least for some sense of "basically nobody"... ;-) But you may
very well be right. I just sort of question what's the point in even
offering these interfaces on new archs, though, if they don't do
anything significant that setjmp/longjmp can't do.

> > BTW sigreturn is also nice from the standpoint that restoring the
> > signal mask is atomic with respect to switching stacks. Otherwise, if
> > you restore the signal mask before switching, you risk a
> > newly-unmasked signal handler running on the old stack, and if you
> > restore the signal mask after switching, you risk a
> > previously-unmasked signal that should be masked running on the new
> > stack. The only way to solve this problem without sigreturn is to
> > block ALL signals that should be blocked in either the old or the new
> > mask before switching stacks.
> >
> > As for this last issue, I suspect glibc is buggy on other archs with
> > regard to this property...
> 
> But on the other hand sigreturn interacts badly with sigaltstack which
> actually causes problems with real running code (in this case the Go
> language).

Could you explain the bad interaction? I'd be interested in knowing
what it is (since I was planning on using sigreturn to implement the
ucontext functions in musl) and if I knew, I might even have a
workaround.

Rich
  
Richard Henderson March 19, 2014, 5:43 p.m. UTC | #9
On 03/19/2014 10:37 AM, Rich Felker wrote:
> On Wed, Mar 19, 2014 at 05:19:23PM +0000, Will Newton wrote:
>> But on the other hand sigreturn interacts badly with sigaltstack which
>> actually causes problems with real running code (in this case the Go
>> language).
> 
> Could you explain the bad interaction? I'd be interested in knowing
> what it is (since I was planning on using sigreturn to implement the
> ucontext functions in musl) and if I knew, I might even have a
> workaround.

Yes, please explain.  I've used sigreturn on alpha since day 1 and
haven't noticed a problem.


r~
  
Andreas Schwab March 19, 2014, 5:45 p.m. UTC | #10
Rich Felker <dalias@aerifal.cx> writes:

> Well I think "basically nobody" is using any archs but x86[_64] and
> arm, at least for some sense of "basically nobody"... ;-) But you may
> very well be right. I just sort of question what's the point in even
> offering these interfaces on new archs, though, if they don't do
> anything significant that setjmp/longjmp can't do.

There's a reason POSIX has removed them...

Andreas.
  
Will Newton March 20, 2014, 8:23 a.m. UTC | #11
On 19 March 2014 17:37, Rich Felker <dalias@aerifal.cx> wrote:
> On Wed, Mar 19, 2014 at 05:19:23PM +0000, Will Newton wrote:
>> >> This patch re-implements the aarch64 setcontext function to restore
>> >> the context in user code in a similar manner to x86_64 and other ports.
>> >
>> > I question the whole concept of this patch. On aarch64, the kernel
>> > stores cpu-model-specific register state that libc won't necessarily
>> > know how to restore as part of the context. Of course this isn't
>> > needed for synchronous context switches, and it seems you're ok with
>> > dropping support for asynchronous ones (from signal handlers), but my
>> > impression was that using these interfaces from signal handlers was
>> > one of the only reasons they're interesting. Otherwise they're just a
>> > glorified setjmp/longjmp, and there's no reason to save/restore ANY
>> > registers in the mcontext_t except the call-saved ones (same as
>> > setjmp/longjmp do).
>>
>> Looking at the list of architectures that do not support asynchronous
>> contexts suggests that basically nobody is using that on Linux. I
>> would be interested if anyone was aware of software using that
>> feature, either on Linux on powerpc/mips/tile, BSD or Solaris.
>
> Well I think "basically nobody" is using any archs but x86[_64] and
> arm, at least for some sense of "basically nobody"... ;-) But you may
> very well be right. I just sort of question what's the point in even
> offering these interfaces on new archs, though, if they don't do
> anything significant that setjmp/longjmp can't do.

Because existing software uses them, for better or worse.

>> > BTW sigreturn is also nice from the standpoint that restoring the
>> > signal mask is atomic with respect to switching stacks. Otherwise, if
>> > you restore the signal mask before switching, you risk a
>> > newly-unmasked signal handler running on the old stack, and if you
>> > restore the signal mask after switching, you risk a
>> > previously-unmasked signal that should be masked running on the new
>> > stack. The only way to solve this problem without sigreturn is to
>> > block ALL signals that should be blocked in either the old or the new
>> > mask before switching stacks.
>> >
>> > As for this last issue, I suspect glibc is buggy on other archs with
>> > regard to this property...
>>
>> But on the other hand sigreturn interacts badly with sigaltstack which
>> actually causes problems with real running code (in this case the Go
>> language).
>
> Could you explain the bad interaction? I'd be interested in knowing
> what it is (since I was planning on using sigreturn to implement the
> ucontext functions in musl) and if I knew, I might even have a
> workaround.

https://sourceware.org/bugzilla/show_bug.cgi?id=16629
  
Richard Henderson March 20, 2014, 3:24 p.m. UTC | #12
On 03/20/2014 01:23 AM, Will Newton wrote:
>> Could you explain the bad interaction? I'd be interested in knowing
>> what it is (since I was planning on using sigreturn to implement the
>> ucontext functions in musl) and if I knew, I might even have a
>> workaround.
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=16629

Ah, the rt_sigreturn.  Alpha is using the "old" one for setcontext.


r~
  

Patch

diff --git a/sysdeps/unix/sysv/linux/aarch64/setcontext.S b/sysdeps/unix/sysv/linux/aarch64/setcontext.S
index d220c41..2a70cc2 100644
--- a/sysdeps/unix/sysv/linux/aarch64/setcontext.S
+++ b/sysdeps/unix/sysv/linux/aarch64/setcontext.S
@@ -22,63 +22,100 @@ 
 #include "ucontext_i.h"
 #include "ucontext-internal.h"
 
-/* int setcontext (const ucontext_t *ucp) */
+/*  int __setcontext (const ucontext_t *ucp)
 
-	.text
-
-ENTRY(__setcontext)
-
-	/* Create a signal frame on the stack:
-
-		fp
-		lr
-		...
-	   sp-> rt_sigframe
-	 */
-
-	stp     x29, x30, [sp, -16]!
-	cfi_adjust_cfa_offset (16)
-	cfi_rel_offset (x29, 0)
-	cfi_rel_offset (x30, 8)
-
-        mov     x29, sp
-	cfi_def_cfa_register (x29)
-
-	/* Allocate space for the sigcontext.  */
-	mov	w3, #((RT_SIGFRAME_SIZE + SP_ALIGN_SIZE) & SP_ALIGN_MASK)
-	sub	sp, sp,	x3
+  Restores the machine context in UCP and thereby resumes execution
+  in that context.
 
-	/* Compute the base address of the ucontext structure.  */
-	add	x1, sp, #RT_SIGFRAME_UCONTEXT
+  This implementation is intended to be used for *synchronous* context
+  switches only.  Therefore, it does not have to restore anything
+  other than the PRESERVED state.  */
 
-	/* Only ucontext is required in the frame, *copy* it in.  */
-
-#if UCONTEXT_SIZE % 16
-#error The implementation of setcontext.S assumes sizeof(ucontext_t) % 16 == 0
-#endif
-
-	mov	x2, #UCONTEXT_SIZE / 16
-0:
-	ldp	x3, x4, [x0], #16
-	stp	x3, x4, [x1], #16
-	sub	x2, x2, 1
-	cbnz	x2, 0b
+	.text
 
-	/* rt_sigreturn () -- no arguments, sp points to struct rt_sigframe.  */
-	mov	x8, SYS_ify (rt_sigreturn)
+ENTRY(__setcontext)
+	/* Save a copy of UCP.  */
+	mov	x9, x0
+
+	/* Set the signal mask with
+	   rt_sigprocmask (SIG_SETMASK, mask, NULL, _NSIG/8).  */
+	mov	x0, #SIG_SETMASK
+	ldr	x1, [x9, UCONTEXT_SIGMASK]
+	mov	x2, #0
+	mov	x3, #_NSIG8
+	mov	x8, SYS_ify (rt_sigprocmask)
 	svc	0
-
-	/* Ooops we failed.  Recover the stack */
-
-	mov	sp, x29
-	cfi_def_cfa_register (sp)
-
-        ldp     x29, x30, [sp], 16
-	cfi_adjust_cfa_offset (16)
-	cfi_restore (x29)
-	cfi_restore (x30)
+	cbz	x0, 1f
 	b	C_SYMBOL_NAME(__syscall_error)
-
+1:
+	/* Restore the general purpose registers.  */
+	mov	x0, x9
+	cfi_def_cfa(x0, 0)
+	cfi_offset(x18, oX0 + 18 * SZREG)
+	cfi_offset(x19, oX0 + 19 * SZREG)
+	cfi_offset(x20, oX0 + 20 * SZREG)
+	cfi_offset(x21, oX0 + 21 * SZREG)
+	cfi_offset(x22, oX0 + 22 * SZREG)
+	cfi_offset(x23, oX0 + 23 * SZREG)
+	cfi_offset(x24, oX0 + 24 * SZREG)
+	cfi_offset(x25, oX0 + 25 * SZREG)
+	cfi_offset(x26, oX0 + 26 * SZREG)
+	cfi_offset(x27, oX0 + 27 * SZREG)
+	cfi_offset(x28, oX0 + 28 * SZREG)
+	cfi_offset(x29, oX0 + 29 * SZREG)
+	cfi_offset(x30, oX0 + 30 * SZREG)
+
+	cfi_offset( d8, oV0 + 8 * SZVREG)
+	cfi_offset( d9, oV0 + 9 * SZVREG)
+	cfi_offset(d10, oV0 + 10 * SZVREG)
+	cfi_offset(d11, oV0 + 11 * SZVREG)
+	cfi_offset(d12, oV0 + 12 * SZVREG)
+	cfi_offset(d13, oV0 + 13 * SZVREG)
+	cfi_offset(d14, oV0 + 14 * SZVREG)
+	cfi_offset(d15, oV0 + 15 * SZVREG)
+	ldp	x18, x19, [x0, oX0 + 18 * SZREG]
+	ldp	x20, x21, [x0, oX0 + 20 * SZREG]
+	ldp	x22, x23, [x0, oX0 + 22 * SZREG]
+	ldp	x24, x25, [x0, oX0 + 24 * SZREG]
+	ldp	x26, x27, [x0, oX0 + 26 * SZREG]
+	ldp	x28, x29, [x0, oX0 + 28 * SZREG]
+	ldr     x30,      [x0, oX0 + 30 * SZREG]
+	ldr     x2, [x0, oSP]
+	mov	sp, x2
+
+	/* Check for FP SIMD context.  */
+	add     x2, x0, #oEXTENSION
+
+	mov	w3, #(FPSIMD_MAGIC & 0xffff)
+	movk	w3, #(FPSIMD_MAGIC >> 16), lsl #16
+	ldr	w1, [x2, #oHEAD + oMAGIC]
+	cmp	w1, w3
+	b.ne	2f
+
+	/* Restore the FP SIMD context.  */
+	add	x3, x2, #oV0 + 8 * SZVREG
+	ldp	 d8,  d9, [x3], #2 * SZVREG
+	ldp	d10, d11, [x3], #2 * SZVREG
+	ldp	d12, d13, [x3], #2 * SZVREG
+	ldp	d14, d15, [x3], #2 * SZVREG
+
+	add	x3, x2, oFPSR
+
+	ldr	w4, [x3]
+	msr	fpsr, x4
+
+	ldr	w4, [x3, oFPCR - oFPSR]
+	msr	fpcr, x4
+
+2:
+	ldr     x16, [x0, oPC]
+	/* Restore arg registers.  */
+	ldp	x2, x3, [x0, oX0 + 2 * SZREG]
+	ldp	x4, x5, [x0, oX0 + 4 * SZREG]
+	ldp	x6, x7, [x0, oX0 + 6 * SZREG]
+	ldp	x0, x1, [x0, oX0 + 0 * SZREG]
+	/* Jump to the new pc value.  */
+	br	x16
 PSEUDO_END (__setcontext)
 weak_alias (__setcontext, setcontext)