[v4,04/22] aarch64: Define jmp_buf offset for GCS

Message ID 20241129163721.2385847-5-yury.khrustalev@arm.com
State Superseded
Delegated to: Carlos O'Donell
Series aarch64: Add support for Guarded Control Stack extension

Checks

Context                                            Check    Description
redhat-pt-bot/TryBot-apply_patch                   success  Patch applied to master at the time it was sent
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64   success  Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64   success  Test passed
linaro-tcwg-bot/tcwg_glibc_build--master-arm       success  Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm       success  Test passed

Commit Message

Yury Khrustalev Nov. 29, 2024, 4:37 p.m. UTC
  From: Szabolcs Nagy <szabolcs.nagy@arm.com>

The target specific internal __longjmp is called with a __jmp_buf
argument which has its size exposed in the ABI. On aarch64 this has
no space left, so GCSPR cannot be restored in longjmp in the usual
way, which is needed for the Guarded Control Stack (GCS) extension.

setjmp is implemented via __sigsetjmp, which takes a jmp_buf argument.
However, it is also called with a __pthread_unwind_buf_t argument cast
to jmp_buf (in cancellation cleanup code built with -fno-exceptions).
The two types, jmp_buf and __pthread_unwind_buf_t, have common bits
beyond the __jmp_buf field and there is unused space there which we
can use for saving GCSPR.
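
As a rough illustration of the overlap described above, the two layouts
from the header comment in this patch can be written as plain C structs
and checked at compile time.  The struct names are hypothetical stand-ins
(the real types are jmp_buf and __pthread_unwind_buf_t), pointer fields
are written as uint64_t so the sizes match LP64 on any host, and 208 is
the offset the patch defines as JB_GCSPR.  This is a sketch, not glibc
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define JB_GCSPR 208 /* Offset chosen by the patch.  */

/* Hypothetical stand-in for the AArch64 jmp_buf layout.  */
struct jmp_buf_layout
{
  uint64_t jmpbuf[22];     /* Target specific part (__jmp_buf).  */
  uint32_t mask_was_saved;
  uint32_t pad;
  uint64_t saved_mask;     /* sigset_t bits used on Linux.  */
  uint64_t unused[15];     /* sigset_t bits not used on Linux.  */
};

/* Hypothetical stand-in for the __pthread_unwind_buf_t layout.
   Pointers are modelled as uint64_t (assumption: LP64).  */
struct unwind_buf_layout
{
  uint64_t jmpbuf[22];
  uint32_t mask_was_saved;
  uint32_t pad;
  uint64_t prev;
  uint64_t cleanup;
  uint32_t canceltype;
  uint32_t pad2;
  uint64_t pad3;
};

/* The slot lands on padding in one type and on unused sigset_t bits
   in the other.  */
static_assert (sizeof (struct jmp_buf_layout) == 312, "jmp_buf size");
static_assert (sizeof (struct unwind_buf_layout) == 216, "unwind buf size");
static_assert (offsetof (struct unwind_buf_layout, pad3) == JB_GCSPR,
               "slot is pad3 in the unwind buffer");
static_assert (offsetof (struct jmp_buf_layout, unused)
               + 2 * sizeof (uint64_t) == JB_GCSPR,
               "slot is unused[2] in jmp_buf");

/* Returns 1 if the 8-byte slot ends inside both layouts.  */
int
slot_inside_both (void)
{
  size_t end = JB_GCSPR + sizeof (uint64_t);
  return end <= sizeof (struct jmp_buf_layout)
         && end <= sizeof (struct unwind_buf_layout);
}
```

If either layout changed, the static assertions would fail at compile
time rather than silently overlapping a live field.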

For this to work some bits of those two generic types have to be
reserved for target specific use and the generic code in glibc has
to ensure that __longjmp is always called with a __jmp_buf that is
embedded into one of those two types. Morally __longjmp should be
changed to take jmp_buf as argument, but that is an intrusive change
across targets.
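
The embedding requirement can be sketched in C as follows.  The helper
names are hypothetical and the real setjmp/longjmp code is assembly that
moves GCSPR_EL0 to and from the slot; here a plain 64 bit value stands in
for the register.  The point is only that the access at offset 208 is in
bounds for a full jmp_buf (312 bytes) or __pthread_unwind_buf_t (216
bytes), but not for a bare 176 byte __jmp_buf.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define JB_GCSPR 208          /* Offset from the patch.  */
#define JMPBUF_BYTES (22 * 8) /* Size of the target specific part.  */
#define UNWIND_BUF_BYTES 216  /* Smaller of the two enclosing types.  */

/* Hypothetical helper: store VALUE in the GCSPR slot of ENV.  ENV must
   point to the start of a full enclosing object, not a bare __jmp_buf.  */
void
slot_store (void *env, uint64_t value)
{
  memcpy ((unsigned char *) env + JB_GCSPR, &value, sizeof value);
}

/* Hypothetical helper: read the GCSPR slot of ENV back.  */
uint64_t
slot_load (const void *env)
{
  uint64_t value;
  memcpy (&value, (const unsigned char *) env + JB_GCSPR, sizeof value);
  return value;
}

/* Returns 1 if a buffer of SIZE bytes is large enough to hold the slot.  */
int
slot_fits (size_t size)
{
  return size >= JB_GCSPR + sizeof (uint64_t);
}

/* Round trip through a buffer the size of the smaller enclosing type.  */
uint64_t
round_trip (uint64_t value)
{
  uint64_t env[UNWIND_BUF_BYTES / sizeof (uint64_t)] = { 0 };
  slot_store (env, value);
  return slot_load (env);
}
```

slot_fits (JMPBUF_BYTES) is 0, which is why generic code must never hand
__longjmp a __jmp_buf that is not embedded in one of the two larger
types.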

Note: longjmp is never called with __pthread_unwind_buf_t from user
code, only the internal __libc_longjmp is called with that type and
thus the two types could have separate longjmp implementations on a
target. We don't rely on this now (but might in the future given that
cancellation unwind does not need to restore GCSPR).

Given the above this patch finds an unused slot for GCSPR. This
placement is not exposed in the ABI so it may change in the future.
This is also very target ABI specific so the generic types cannot
be easily changed to clearly mark the reserved fields.
---
 sysdeps/aarch64/jmpbuf-offsets.h | 63 ++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)
  

Comments

Carlos O'Donell Dec. 2, 2024, 9:36 p.m. UTC | #1
On 11/29/24 11:37 AM, Yury Khrustalev wrote:
> From: Szabolcs Nagy <szabolcs.nagy@arm.com>
> 
> The target specific internal __longjmp is called with a __jmp_buf
> argument which has its size exposed in the ABI. On aarch64 this has
> no space left, so GCSPR cannot be restored in longjmp in the usual
> way, which is needed for the Guarded Control Stack (GCS) extension.
> 
> setjmp is implemented via __sigsetjmp which has a jmp_buf argument
> however it is also called with __pthread_unwind_buf_t argument cast
> to jmp_buf (in cancellation cleanup code built with -fno-exception).
> The two types, jmp_buf and __pthread_unwind_buf_t, have common bits
> beyond the __jmp_buf field and there is unused space there which we
> can use for saving GCSPR.
> 
> For this to work some bits of those two generic types have to be
> reserved for target specific use and the generic code in glibc has
> to ensure that __longjmp is always called with a __jmp_buf that is
> embedded into one of those two types. Morally __longjmp should be
> changed to take jmp_buf as argument, but that is an intrusive change
> across targets.
> 
> Note: longjmp is never called with __pthread_unwind_buf_t from user
> code, only the internal __libc_longjmp is called with that type and
> thus the two types could have separate longjmp implementations on a
> target. We don't rely on this now (but migh in the future given that

s/migh/might/g

> cancellation unwind does not need to restore GCSPR).
> 
> Given the above this patch finds an unused slot for GCSPR. This
> placement is not exposed in the ABI so it may change in the future.
> This is also very target ABI specific so the generic types cannot
> be easily changed to clearly mark the reserved fields.

Private header. No exported ABI.

OK to commit with the noted formatting fixes corrected.
You can keep my RB if you make only the formatting changes.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> ---
>  sysdeps/aarch64/jmpbuf-offsets.h | 63 ++++++++++++++++++++++++++++++++
>  1 file changed, 63 insertions(+)
> 
> diff --git a/sysdeps/aarch64/jmpbuf-offsets.h b/sysdeps/aarch64/jmpbuf-offsets.h
> index 632328c7e2..ec047cf6b1 100644
> --- a/sysdeps/aarch64/jmpbuf-offsets.h
> +++ b/sysdeps/aarch64/jmpbuf-offsets.h
> @@ -39,6 +39,69 @@
>  #define JB_D14		 20
>  #define JB_D15		 21
>  
> +/* The target specific part of jmp_buf has no space for expansion but
> +   the public jmp_buf ABI type has. Unfortunately there is another type

Two spaces after period i.e. "s/. /.  /g"

> +   that is used with setjmp APIs and exposed by thread cancellation (in
> +   binaries built with -fno-exceptions) which complicates the situation.
> +
> +  // Internal layout of the public jmp_buf type on AArch64.
> +  // This is passed to setjmp, longjmp, sigsetjmp, siglongjmp.
> +  struct
> +  {
> +    uint64_t jmpbuf[22];     // Target specific part.
> +    uint32_t mask_was_saved; // savemask bool used by sigsetjmp/siglongjmp.
> +    uint32_t pad;
> +    uint64_t saved_mask;     // sigset_t bits used on linux.
> +    uint64_t unused[15];     // sigset_t bits not used on linux.
> +  };
> +
> +  // Internal layout of the public __pthread_unwind_buf_t type.
> +  // This is passed to sigsetjmp with !savemask and to the internal
> +  // __libc_longjmp (currently alias of longjmp on AArch64).
> +  struct
> +  {
> +    uint64_t jmpbuf[22];     // Must match jmp_buf.
> +    uint32_t mask_was_saved; // Must match jmp_buf, always 0.
> +    uint32_t pad;
> +    void *prev;              // List for unwinding.
> +    void *cleanup;           // Cleanup handlers.
> +    uint32_t canceltype;     // 1 bit cancellation type.
> +    uint32_t pad2;
> +    void *pad3;
> +  };
> +
> +  Ideally only the target specific part of jmp_buf (A) is accessed by
> +  __setjmp and __longjmp.  But that is always embedded into one of the
> +  two types above so the bits that are unused in those types (B) may be
> +  reused for target specific purposes.  Setjmp can't distinguish between
> +  jmp_buf and __pthread_unwind_buf_t, but longjmp can: only an internal
> +  longjmp call uses the latter, so state that is not needed for cancel
> +  cleanups can go to fields (C).  If generic code is refactored then the
> +  usage of additional fields can be optimized (D). And some fields are

s/. /.  /g

> +  only accessible in the savedmask case (E). Reusability of jmp_buf

s/. /.  /g

> +  fields on AArch64 for target purposes:
> +
> +  struct
> +  {
> +    uint64_t A[22];  //   0 .. 176
> +    uint32_t D;      // 176 .. 180
> +    uint32_t B;      // 180 .. 184
> +    uint64_t D;      // 184 .. 192
> +    uint64_t C;      // 192 .. 200
> +    uint32_t C;      // 200 .. 204
> +    uint32_t B;      // 204 .. 208
> +    uint64_t B;      // 208 .. 216
> +    uint64_t E[12];  // 216 .. 312
> +  }
> +
> +  The B fields can be used with minimal glibc code changes. We need a

s/. /.  /g

> +  64 bit field for the Guarded Control Stack pointer (GCSPR_EL0) which
> +  can use a C field too as cancellation cleanup does not execute RET
> +  for a previous BL of the cancelled thread, but that would require a
> +  custom __libc_longjmp. This layout can change in the future.
> +*/

Two spaces followed by comment terminator. e.g. "in the future.  */"

> +#define JB_GCSPR 208
> +
>  #ifndef  __ASSEMBLER__
>  #include <setjmp.h>
>  #include <stdint.h>
  

Patch

diff --git a/sysdeps/aarch64/jmpbuf-offsets.h b/sysdeps/aarch64/jmpbuf-offsets.h
index 632328c7e2..ec047cf6b1 100644
--- a/sysdeps/aarch64/jmpbuf-offsets.h
+++ b/sysdeps/aarch64/jmpbuf-offsets.h
@@ -39,6 +39,69 @@ 
 #define JB_D14		 20
 #define JB_D15		 21
 
+/* The target specific part of jmp_buf has no space for expansion but
+   the public jmp_buf ABI type has. Unfortunately there is another type
+   that is used with setjmp APIs and exposed by thread cancellation (in
+   binaries built with -fno-exceptions) which complicates the situation.
+
+  // Internal layout of the public jmp_buf type on AArch64.
+  // This is passed to setjmp, longjmp, sigsetjmp, siglongjmp.
+  struct
+  {
+    uint64_t jmpbuf[22];     // Target specific part.
+    uint32_t mask_was_saved; // savemask bool used by sigsetjmp/siglongjmp.
+    uint32_t pad;
+    uint64_t saved_mask;     // sigset_t bits used on linux.
+    uint64_t unused[15];     // sigset_t bits not used on linux.
+  };
+
+  // Internal layout of the public __pthread_unwind_buf_t type.
+  // This is passed to sigsetjmp with !savemask and to the internal
+  // __libc_longjmp (currently alias of longjmp on AArch64).
+  struct
+  {
+    uint64_t jmpbuf[22];     // Must match jmp_buf.
+    uint32_t mask_was_saved; // Must match jmp_buf, always 0.
+    uint32_t pad;
+    void *prev;              // List for unwinding.
+    void *cleanup;           // Cleanup handlers.
+    uint32_t canceltype;     // 1 bit cancellation type.
+    uint32_t pad2;
+    void *pad3;
+  };
+
+  Ideally only the target specific part of jmp_buf (A) is accessed by
+  __setjmp and __longjmp.  But that is always embedded into one of the
+  two types above so the bits that are unused in those types (B) may be
+  reused for target specific purposes.  Setjmp can't distinguish between
+  jmp_buf and __pthread_unwind_buf_t, but longjmp can: only an internal
+  longjmp call uses the latter, so state that is not needed for cancel
+  cleanups can go to fields (C).  If generic code is refactored then the
+  usage of additional fields can be optimized (D). And some fields are
+  only accessible in the savedmask case (E). Reusability of jmp_buf
+  fields on AArch64 for target purposes:
+
+  struct
+  {
+    uint64_t A[22];  //   0 .. 176
+    uint32_t D;      // 176 .. 180
+    uint32_t B;      // 180 .. 184
+    uint64_t D;      // 184 .. 192
+    uint64_t C;      // 192 .. 200
+    uint32_t C;      // 200 .. 204
+    uint32_t B;      // 204 .. 208
+    uint64_t B;      // 208 .. 216
+    uint64_t E[12];  // 216 .. 312
+  }
+
+  The B fields can be used with minimal glibc code changes. We need a
+  64 bit field for the Guarded Control Stack pointer (GCSPR_EL0) which
+  can use a C field too as cancellation cleanup does not execute RET
+  for a previous BL of the cancelled thread, but that would require a
+  custom __libc_longjmp. This layout can change in the future.
+*/
+#define JB_GCSPR 208
+
 #ifndef  __ASSEMBLER__
 #include <setjmp.h>
 #include <stdint.h>
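
The reusability table in the comment above can also be read as a lookup
from byte offset to class.  The following is a minimal sketch of that
reading; the function names are hypothetical and the ranges are copied
from the table (A target specific, B unused in both types, C not needed
for cancellation cleanup, D usable only after refactoring generic code,
E reachable only in the jmp_buf case).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: reuse class of the byte at OFFSET, following the
   table in the patch comment.  Returns '?' past the end of jmp_buf.  */
char
reuse_class (size_t offset)
{
  if (offset < 176) return 'A';
  if (offset < 180) return 'D';
  if (offset < 184) return 'B';
  if (offset < 192) return 'D';
  if (offset < 204) return 'C';
  if (offset < 216) return 'B';
  if (offset < 312) return 'E';
  return '?';
}

/* Returns 1 if all WIDTH bytes starting at OFFSET are class B, i.e.
   usable with minimal glibc changes according to the comment.  */
int
usable_without_refactoring (size_t offset, size_t width)
{
  for (size_t i = 0; i < width; i++)
    if (reuse_class (offset + i) != 'B')
      return 0;
  return 1;
}
```

With this reading, usable_without_refactoring (208, 8) holds for the
JB_GCSPR slot, while the 8 bytes at 192 fall in class C and would need
the custom __libc_longjmp mentioned in the comment.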