[1/2] arm: Implement cortex-M return signing address codegen

Message ID gkrlf23f0hp.fsf@arm.com
State Superseded
Series [1/2] arm: Implement cortex-M return signing address codegen

Commit Message

Andrea Corallo Nov. 5, 2021, 8:52 a.m. UTC
  Hi all,

this patch enables return address signing and verification based on
Armv8.1-M Pointer Authentication [1].

To sign the return address, we use the PAC R12, LR, SP instruction
upon function entry.  This signs LR using SP and stores the result in
R12.  R12 is then pushed onto the stack.

In the function epilogue, R12 is popped and AUT R12, LR, SP is used to
verify that the content of LR is still valid before returning.

Here is an example of a PAC-instrumented function prologue and epilogue:

        pac     r12, lr, sp
        push    {r3, r7, lr}
        push    {r12}
        sub     sp, sp, #4
        [...] function body
        add     sp, sp, #4
        pop     {r12}
        pop     {r3, r7, lr}
        aut     r12, lr, sp
        bx      lr
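
The sign-on-entry, verify-on-exit pairing can be modelled in plain C.
This is only a sketch: toy_mix is a made-up stand-in for the keyed PAC
computation (the real algorithm and key handling are defined by the
architecture, not by this patch), and every name below is hypothetical.
It illustrates why AUT has to see the same LR and SP values that PAC saw.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up mixing function standing in for the keyed PAC computation.
   NOT the real algorithm; it only needs to depend on LR, SP and a key.  */
static uint32_t
toy_mix (uint32_t lr, uint32_t sp, uint32_t key)
{
  uint32_t h = lr ^ (sp * 0x9E3779B1u) ^ key;
  h ^= h >> 15;
  h *= 0x85EBCA6Bu;
  h ^= h >> 13;
  return h;
}

/* Models "pac r12, lr, sp": returns the value that would land in R12.  */
static uint32_t
toy_pac (uint32_t lr, uint32_t sp, uint32_t key)
{
  return toy_mix (lr, sp, key);
}

/* Models "aut r12, lr, sp": nonzero when R12 matches LR and SP.  The
   hardware instruction does not return a flag; this is a simplification.  */
static int
toy_aut (uint32_t r12, uint32_t lr, uint32_t sp, uint32_t key)
{
  return r12 == toy_mix (lr, sp, key);
}

/* Walk through the example prologue and epilogue with made-up values.  */
static void
toy_example (void)
{
  uint32_t key = 0x12345678u, sp = 0x20001000u, lr = 0x08000101u;
  uint32_t r12 = toy_pac (lr, sp, key);        /* pac r12, lr, sp */

  /* push / sub ... body ... add / pop: SP, LR and R12 are restored.  */
  assert (toy_aut (r12, lr, sp, key));         /* LR untouched: passes */
  assert (!toy_aut (r12, lr + 4, sp, key));    /* LR overwritten: fails */
}
```

In the example above, PAC runs before the first push and AUT runs after
the last pop, so both see the same SP.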

The patch also takes care of generating a PACBTI instruction in place
of the sequence BTI+PAC when Branch Target Identification is enabled
at the same time.
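
As a rough illustration of that selection, here is a standalone C sketch.
The function name is hypothetical, and the BTI-only case is an assumption
about the series this patch builds on rather than something this patch
implements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the instruction emitted at function entry.  Returns NULL when
   neither protection is enabled.  */
static const char *
entry_mnemonic (int pac_enabled, int bti_enabled)
{
  if (pac_enabled)
    return bti_enabled ? "pacbti" : "pac";

  /* Assumption: a plain BTI landing pad comes from the BTI series.  */
  return bti_enabled ? "bti" : NULL;
}

/* Exercise the four combinations.  */
static void
entry_mnemonic_check (void)
{
  assert (strcmp (entry_mnemonic (1, 1), "pacbti") == 0);
  assert (strcmp (entry_mnemonic (1, 0), "pac") == 0);
  assert (strcmp (entry_mnemonic (0, 1), "bti") == 0);
  assert (entry_mnemonic (0, 0) == NULL);
}
```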

These two patches apply on top of Tejas's series posted here [2].

Regression tested and bootstrapped on arm-linux-gnu and aarch64-linux-gnu.

Best Regards

  Andrea

[1] <https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
[2] <https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581176.html>
From 605970bdef506d749bbe9650ee469f41b1d7377f Mon Sep 17 00:00:00 2001
From: Andrea Corallo <andrea.corallo@arm.com>
Date: Fri, 24 Sep 2021 14:50:29 +0200
Subject: [PATCH 1/2] [PATCH] [1/2] arm: Implement cortex-M return signing
 address codegen

gcc/Changelog

2021-11-03  Andrea Corallo  <andrea.corallo@arm.com>

	* config/arm/arm.c (arm_compute_frame_layout)
	(arm_expand_prologue, thumb2_expand_return, arm_expand_epilogue)
	(arm_conditional_register_usage): Update for pac codegen.
	(arm_pac_enabled_for_curr_function_p): New function.
	* config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
	Add new patterns.
	* config/arm/unspecs.md (UNSPEC_PAC_IP_LR_SP)
	(UNSPEC_PACBTI_IP_LR_SP, UNSPEC_AUT_IP_LR_SP): Add unspecs.

gcc/testsuite/Changelog

2021-11-03  Andrea Corallo  <andrea.corallo@arm.com>

	* gcc.target/arm/pac-1.c: New test case.
	* gcc.target/arm/pac-2.c: Likewise.
	* gcc.target/arm/pac-3.c: Likewise.
	* gcc.target/arm/pac-4.c: Likewise.
	* gcc.target/arm/pac-5.c: Likewise.
---
 gcc/config/arm/arm.c                 | 85 ++++++++++++++++++++++++----
 gcc/config/arm/arm.md                | 20 +++++++
 gcc/config/arm/unspecs.md            |  3 +
 gcc/testsuite/gcc.target/arm/pac-1.c | 25 ++++++++
 gcc/testsuite/gcc.target/arm/pac-2.c | 25 ++++++++
 gcc/testsuite/gcc.target/arm/pac-3.c | 25 ++++++++
 gcc/testsuite/gcc.target/arm/pac-4.c | 25 ++++++++
 gcc/testsuite/gcc.target/arm/pac-5.c | 26 +++++++++
 8 files changed, 224 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-5.c
  

Comments

Andrea Corallo Nov. 24, 2021, 10:08 a.m. UTC | #1
Ping

Best Regards

  Andrea
  
Andrea Corallo Dec. 8, 2021, 10:33 a.m. UTC | #2
Hi all,

pinging this and 2/2.

Thanks

  Andrea
  
Richard Earnshaw Dec. 8, 2021, 11:31 a.m. UTC | #3
On 05/11/2021 08:52, Andrea Corallo via Gcc-patches wrote:
> Hi all,
> 
> this patch enables address return signature and verification based on
> Armv8.1-M Pointer Authentication [1].
> 
> To sign the return address, we use the PAC R12, LR, SP instruction
> upon function entry.  This is signing LR using SP and storing the
> result in R12.  R12 will be pushed into the stack.
> 
> During function epilogue R12 will be popped and AUT R12, LR, SP will
> be used to verify that the content of LR is still valid before return.
> 
> Here an example of PAC instrumented function prologue and epilogue:
> 
>          pac     r12, lr, sp
>          push    {r3, r7, lr}
>          push    {r12}
>          sub     sp, sp, #4

Which, as shown here, generates a stack which does not preserve 8-byte 
alignment.

Also, what's wrong with

	pac	r12, lr, sp
	push	{r3, r7, ip, lr}
?

Which saves 2 bytes in the prologue and ...

>          [...] function body
>          add     sp, sp, #4
>          pop     {r12}
>          pop     {r3, r7, lr}
>          aut     r12, lr, sp
>          bx      lr

	pop	{r3, r7, ip, lr}
	aut	r12, lr, sp
	bx	lr

which saves 4 bytes in the epilogue (repeated for each instance of the 
epilogue).
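
The byte counts can be checked with a small C sketch.  It assumes the
usual Thumb-2 encoding rules: a 16-bit PUSH can only name r0-r7 and lr,
a 16-bit POP can only name r0-r7 and pc, and anything else takes a
32-bit encoding.  The helper names are hypothetical, and the PAC, AUT,
SUB and ADD instructions are left out because they are the same in both
variants.

```c
#include <assert.h>
#include <stdint.h>

enum { R3 = 3, R7 = 7, IP = 12, LR = 14, PC = 15 };

#define BIT(r) (1u << (r))

/* Size in bytes of a Thumb-2 PUSH of the (non-empty) register MASK.  */
static unsigned
push_size (uint32_t mask)
{
  return (mask & ~(0xFFu | BIT (LR))) ? 4 : 2;
}

/* Size in bytes of a Thumb-2 POP of the (non-empty) register MASK.  */
static unsigned
pop_size (uint32_t mask)
{
  return (mask & ~(0xFFu | BIT (PC))) ? 4 : 2;
}

/* Bytes saved by the single push over the two pushes as posted.  */
static unsigned
prologue_saving (void)
{
  uint32_t regs = BIT (R3) | BIT (R7) | BIT (LR);
  return push_size (regs) + push_size (BIT (IP)) - push_size (regs | BIT (IP));
}

/* Bytes saved by the single pop over the two pops as posted.  */
static unsigned
epilogue_saving (void)
{
  uint32_t regs = BIT (R3) | BIT (R7) | BIT (LR);
  return pop_size (BIT (IP)) + pop_size (regs) - pop_size (regs | BIT (IP));
}

static void
saving_check (void)
{
  assert (prologue_saving () == 2);
  assert (epilogue_saving () == 4);
}
```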

> 
> The patch also takes care of generating a PACBTI instruction in place
> of the sequence BTI+PAC when Branch Target Identification is enabled
> contextually.
> 

What about variadic functions?

What about functions where lr is live on entry (where it's used for 
passing the closure in nested functions)?


> These two patches apply on top of Tejas series posted here [2].
> 
> Regression tested and bootstrapped on arm-linux-gnu and aarch64-linux-gnu.
> 
> Best Regards
> 
>    Andrea
> 
> [1] <https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
> [2] <https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581176.html>
> 


+static bool arm_pac_enabled_for_curr_function_p (void);

I really don't like that name.  There are a lot of functions with 
variations of 'current function' in the name already and this creates 
yet another variant.  Something like 
arm_current_function_pac_enabled_p() would be preferable; or, if that 
really is too long, use 'current_func' which already has usage within 
the compiler.
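
For reference, the logic of the predicate under discussion can be
modelled outside the compiler as follows.  This is a sketch only: the
enum is a hypothetical stand-in for the AARCH_FUNCTION_* scope values
used in the patch (SCOPE_NONE is an assumption), and the leaf flag
replaces the crtl->is_leaf lookup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the return-address signing scope.  */
enum ra_sign_scope
{
  SCOPE_NONE,      /* assumed: signing disabled */
  SCOPE_NON_LEAF,  /* sign only functions that make calls */
  SCOPE_ALL        /* sign every function, leaf or not */
};

/* Mirrors the condition in the patch: sign everything, or sign only
   non-leaf functions.  */
static bool
pac_enabled_p (enum ra_sign_scope scope, bool is_leaf)
{
  return scope == SCOPE_ALL
	 || (scope == SCOPE_NON_LEAF && !is_leaf);
}

static void
pac_enabled_check (void)
{
  assert (pac_enabled_p (SCOPE_ALL, true));
  assert (pac_enabled_p (SCOPE_ALL, false));
  assert (pac_enabled_p (SCOPE_NON_LEAF, false));
  assert (!pac_enabled_p (SCOPE_NON_LEAF, true));
  assert (!pac_enabled_p (SCOPE_NONE, false));
}
```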

+(define_insn "pac_ip_lr_sp"
+  [(set (reg:DI IP_REGNUM)
+	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+                   UNSPEC_PAC_IP_LR_SP))]
+  ""
+  "pac\tr12, lr, sp")
+
+(define_insn "pacbti_ip_lr_sp"
+  [(set (reg:DI IP_REGNUM)
+	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+                   UNSPEC_PACBTI_IP_LR_SP))]
+  ""
+  "pacbti\tr12, lr, sp")
+
+(define_insn "aut_ip_lr_sp"
+  [(unspec:DI [(reg:DI IP_REGNUM) (reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+              UNSPEC_AUT_IP_LR_SP)]
+  ""
+  "aut\tr12, lr, sp")
+

I think all these need a length attribute.  Also, they should only be 
enabled for thumb2 (certainly not in Arm state).
And when using explicit register names in an asm, prefix each name with 
'%|', just in case the assembler dialect has a register name prefix.

The names are somewhat unwieldy; can't we use something more usefully 
descriptive, like 'pac_nop', 'pacbti_nop' and 'aut_nop', since all these 
instructions are using the architectural NOP space?

Finally, I think we need some more tests that cover the various 
frame-pointer flavours when used in combination with this feature and 
for various corners of the PCS.

R.
  
Andrea Corallo Dec. 17, 2021, 3:52 p.m. UTC | #4
Hi Richard,

thanks for reviewing!  Some comments inline.

Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
> On 05/11/2021 08:52, Andrea Corallo via Gcc-patches wrote:
>> Hi all,
>> this patch enables address return signature and verification based
>> on
>> Armv8.1-M Pointer Authentication [1].
>> To sign the return address, we use the PAC R12, LR, SP instruction
>> upon function entry.  This is signing LR using SP and storing the
>> result in R12.  R12 will be pushed into the stack.
>> During function epilogue R12 will be popped and AUT R12, LR, SP will
>> be used to verify that the content of LR is still valid before return.
>> Here an example of PAC instrumented function prologue and epilogue:
>>          pac     r12, lr, sp
>>          push    {r3, r7, lr}
>>          push    {r12}
>>          sub     sp, sp, #4
>
> Which, as shown here, generates a stack which does not preserve 8-byte
> alignment.

I'm probably catastrophically wrong, but shouldn't the stack "at all
times be aligned to a word boundary" [1]?

> Also, what's wrong with
>
> 	pac	r12, lr, sp
> 	push	{r3, r7, ip, lr}
> ?

AFAIK the AAPCS32 defines the Frame Record to be 2 consecutive 32-bit
values of LR and FP on the stack so that's the reason.

> Which saves 2 bytes in the prologue and ...
>
>>          [...] function body
>>          add     sp, sp, #4
>>          pop     {r12}
>>          pop     {r3, r7, lr}
>>          aut     r12, lr, sp
>>          bx      lr
>
> 	pop	{r3, r7, ip, lr}
> 	aut	r12, lr, sp
> 	bx	lr
>
> which saves 4 bytes in the epilogue (repeated for each instance of the
> epilogue).
>
>> The patch also takes care of generating a PACBTI instruction in
>> place
>> of the sequence BTI+PAC when Branch Target Identification is enabled
>> contextually.
>> 
>
> What about variadic functions?
>
> What about functions where lr is live on entry (where it's used for
> passing the closure in nested functions)?
>
>> These two patches apply on top of Tejas series posted here [2].
>> Regression tested and bootstrapped on arm-linux-gnu and aarch64-linux-gnu.
>> Best Regards
>>    Andrea
>> [1]
>> <https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
>> [2] <https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581176.html>
>> 
>
>
> +static bool arm_pac_enabled_for_curr_function_p (void);
>
> I really don't like that name.  There are a lot of functions with
> variations of 'current function' in the name already and this creates
> yet another variant.  Something like
> arm_current_function_pac_enabled_p() would be preferable; or, if that
> really is too long, use 'current_func' which already has usage within
> the compiler.

Ack

> +(define_insn "pac_ip_lr_sp"
> +  [(set (reg:DI IP_REGNUM)
> +	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
> +                   UNSPEC_PAC_IP_LR_SP))]
> +  ""
> +  "pac\tr12, lr, sp")
> +
> +(define_insn "pacbti_ip_lr_sp"
> +  [(set (reg:DI IP_REGNUM)
> +	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
> +                   UNSPEC_PACBTI_IP_LR_SP))]
> +  ""
> +  "pacbti\tr12, lr, sp")
> +
> +(define_insn "aut_ip_lr_sp"
> +  [(unspec:DI [(reg:DI IP_REGNUM) (reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
> +              UNSPEC_AUT_IP_LR_SP)]
> +  ""
> +  "aut\tr12, lr, sp")
> +
>
> I think all these need a length attribute.  Also, they should only be
> enabled for thumb2 (certainly not in Arm state).
> And when using explicit register names in an asm, prefix each name
> with '%|', just in case the assembler dialect has a register name
> prefix.

Ack

> The names are somewhat unwieldy; can't we use something more usefully
> descriptive, like 'pac_nop', 'pacbti_nop' and 'aut_nop', since all
> these instructions are using the architectural NOP space?
>
> Finally, I think we need some more tests that cover the various
> frame-pointer flavours when used in combination with this feature and
> for various corners of the PCS.

Could you give some more indication of the flavours we want to have
tests for?

Thanks

  Andrea

[1] <https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#id46>
  
Richard Earnshaw Dec. 17, 2021, 4:55 p.m. UTC | #5
On 17/12/2021 15:52, Andrea Corallo wrote:
> Hi Richard,
> 
> thanks for reviewing!  Some comments inline.
> 
> Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>> On 05/11/2021 08:52, Andrea Corallo via Gcc-patches wrote:
>>> Hi all,
>>> this patch enables address return signature and verification based
>>> on
>>> Armv8.1-M Pointer Authentication [1].
>>> To sign the return address, we use the PAC R12, LR, SP instruction
>>> upon function entry.  This is signing LR using SP and storing the
>>> result in R12.  R12 will be pushed into the stack.
>>> During function epilogue R12 will be popped and AUT R12, LR, SP will
>>> be used to verify that the content of LR is still valid before return.
>>> Here an example of PAC instrumented function prologue and epilogue:
>>>           pac     r12, lr, sp
>>>           push    {r3, r7, lr}
>>>           push    {r12}
>>>           sub     sp, sp, #4
>>
>> Which, as shown here, generates a stack which does not preserve 8-byte
>> alignment.
> 
> I'm probably catastrophically wrong, but shouldn't the stack "at all
> times be aligned to a word boundary" [1]?

At a function boundary it must be 8-byte aligned (same reference).  I 
don't think GCC really optimizes leaf functions to permit sub 8-byte 
alignment, but since you omitted the body of your function I might be 
wrong in this case.
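
The arithmetic behind the alignment remark, as a small C sketch.  It
counts only the instructions shown in the posted prologue and assumes SP
is 8-byte aligned on entry; the omitted function body may change the
picture.  Helper names are hypothetical.

```c
#include <assert.h>

/* Bytes by which SP drops: 4 per pushed register plus an explicit
   "sub sp, sp, #extra".  */
static unsigned
frame_bytes (unsigned pushed_regs, unsigned extra)
{
  return 4 * pushed_regs + extra;
}

/* Nonzero when a drop of BYTES keeps an 8-byte aligned SP aligned.  */
static int
keeps_8byte_alignment (unsigned bytes)
{
  return bytes % 8 == 0;
}

static void
alignment_check (void)
{
  /* push {r3, r7, lr}; push {r12}; sub sp, sp, #4  */
  unsigned posted = frame_bytes (3, 0) + frame_bytes (1, 4);

  assert (posted == 20);
  assert (!keeps_8byte_alignment (posted));   /* 20 % 8 == 4 */
}
```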

> 
>> Also, what's wrong with
>>
>> 	pac	r12, lr, sp
>> 	push	{r3, r7, ip, lr}
>> ?
> 
> AFAIK the AAPCS32 defines the Frame Record to be 2 consecutive 32-bit
> values of LR and FP on the stack so that's the reason.

GCC does not currently support AAPCS frame chains as that's a relatively 
new feature in the AAPCS; so it's not something you need to be concerned 
about right now.  The AAPCS frame chain uses R11 as the frame-chain 
register anyway.  However, you are right that this does affect 
-mtpcs-frame and will become relevant when we do add support for AAPCS 
frame chains.

> 
>> Which saves 2 bytes in the prologue and ...
>>
>>>           [...] function body
>>>           add     sp, sp, #4
>>>           pop     {r12}
>>>           pop     {r3, r7, lr}
>>>           aut     r12, lr, sp
>>>           bx      lr
>>
>> 	pop	{r3, r7, ip, lr}
>> 	aut	r12, lr, sp
>> 	bx	lr
>>
>> which saves 4 bytes in the epilogue (repeated for each instance of the
>> epilogue).
>>
>>> The patch also takes care of generating a PACBTI instruction in
>>> place
>>> of the sequence BTI+PAC when Branch Target Identification is enabled
>>> contextually.
>>>
>>
>> What about variadic functions?
>>
>> What about functions where lr is live on entry (where it's used for
>> passing the closure in nested functions)?
>>
>>> These two patches apply on top of Tejas series posted here [2].
>>> Regression tested and bootstrapped on arm-linux-gnu and aarch64-linux-gnu.
>>> Best Regards
>>>     Andrea
>>> [1]
>>> <https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>
>>> [2] <https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581176.html>
>>>
>>
>>
>> +static bool arm_pac_enabled_for_curr_function_p (void);
>>
>> I really don't like that name.  There are a lot of functions with
>> variations of 'current function' in the name already and this creates
>> yet another variant.  Something like
>> arm_current_function_pac_enabled_p() would be preferable; or, if that
>> really is too long, use 'current_func' which already has usage within
>> the compiler.
> 
> Ack
> 
>> +(define_insn "pac_ip_lr_sp"
>> +  [(set (reg:DI IP_REGNUM)
>> +	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
>> +                   UNSPEC_PAC_IP_LR_SP))]
>> +  ""
>> +  "pac\tr12, lr, sp")
>> +
>> +(define_insn "pacbti_ip_lr_sp"
>> +  [(set (reg:DI IP_REGNUM)
>> +	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
>> +                   UNSPEC_PACBTI_IP_LR_SP))]
>> +  ""
>> +  "pacbti\tr12, lr, sp")
>> +
>> +(define_insn "aut_ip_lr_sp"
>> +  [(unspec:DI [(reg:DI IP_REGNUM) (reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
>> +              UNSPEC_AUT_IP_LR_SP)]
>> +  ""
>> +  "aut\tr12, lr, sp")
>> +
>>
>> I think all these need a length attribute.  Also, they should only be
>> enabled for thumb2 (certainly not in Arm state).
>> And when using explicit register names in an asm, prefix each name
>> with '%|', just in case the assembler dialect has a register name
>> prefix.
> 
> Ack
> 
>> The names are somewhat unwieldy; can't we use something more usefully
>> descriptive, like 'pac_nop', 'pacbti_nop' and 'aut_nop', since all
>> these instructions are using the architectural NOP space?
>>
>> Finally, I think we need some more tests that cover the various
>> frame-pointer flavours when used in combination with this feature and
>> for various corners of the PCS.
> 
> Could you give some more indication of the flavours we want to have
> tests for?

I'm thinking mostly about test cases for the additional situations I've 
described above.  But there's also testing that the code does the right 
thing with -mtpcs-frame.
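
As a rough sketch of what a test for the variadic case might look like,
modelled on pac-1.c: the dg-options line is copied from that test, the
function name is hypothetical, and no scan-assembler directives are
given because the expected instruction counts for this case have not
been established in this thread.

```c
/* Testing return address signing in a variadic function (sketch).  */
/* { dg-do run } */
/* { dg-options "-march=armv8.1-m.main -mbranch-protection=pac-ret+leaf -mthumb --save-temps -O0" } */

#include <assert.h>
#include <stdarg.h>

/* Sum N int arguments; a variadic callee spills its anonymous
   arguments, which changes the frame layout around the saved R12.  */
int
__attribute__((noinline))
sum_n (int n, ...)
{
  va_list ap;
  int total = 0;

  va_start (ap, n);
  for (int i = 0; i < n; i++)
    total += va_arg (ap, int);
  va_end (ap);

  return total;
}

/* Runtime check that the signed return path still returns correctly.  */
static void
check_sum_n (void)
{
  assert (sum_n (3, 1, 2, 3) == 6);
  assert (sum_n (0) == 0);
}
```

A real testsuite file would call the check from main and use abort (),
as the existing pac-*.c tests do.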

It might be that we want to declare -mtpcs-frame and branch protection 
incompatible, which would save a lot of complicated validation.  That's 
probably acceptable because -mtpcs-frame is essentially deprecated 
anyway (and will hopefully be superseded by AAPCS frame-chain support 
before long).  However, if we do that, then we at least need some option 
compatibility diagnostic to reject such a combination, and the 
incompatibility will need documentation.

R.

> 
> Thanks
> 
>    Andrea
> 
> [1] <https://github.com/ARM-software/abi-aa/blob/main/aapcs32/aapcs32.rst#id46>
>
  
Andrea Corallo Dec. 17, 2021, 5:22 p.m. UTC | #6
Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:

> On 17/12/2021 15:52, Andrea Corallo wrote:
>> Hi Richard,
>> thanks for reviewing!  Some comments inline.
>> Richard Earnshaw <Richard.Earnshaw@foss.arm.com> writes:
>>> On 05/11/2021 08:52, Andrea Corallo via Gcc-patches wrote:
>>>> Hi all,
>>>> this patch enables address return signature and verification based
>>>> on
>>>> Armv8.1-M Pointer Authentication [1].
>>>> To sign the return address, we use the PAC R12, LR, SP instruction
>>>> upon function entry.  This is signing LR using SP and storing the
>>>> result in R12.  R12 will be pushed into the stack.
>>>> During function epilogue R12 will be popped and AUT R12, LR, SP will
>>>> be used to verify that the content of LR is still valid before return.
>>>> Here an example of PAC instrumented function prologue and epilogue:
>>>>           pac     r12, lr, sp
>>>>           push    {r3, r7, lr}
>>>>           push    {r12}
>>>>           sub     sp, sp, #4
>>>
>>> Which, as shown here, generates a stack which does not preserve 8-byte
>>> alignment.
>> I'm probably catastrophically wrong, but shouldn't the stack "at all
>> times be aligned to a word boundary" [1]?
>
> At a function boundary it must be 8-byte aligned (same reference).  I
> don't think GCC really optimizes leaf functions to permit sub 8-byte
> alignment, but since you omitted the body of your function I might be
> wrong in this case.

I see, thanks.

>> 
>>> Also, what's wrong with
>>>
>>> 	pac	r12, lr, sp
>>> 	push	{r3, r7, ip, lr}
>>> ?
>> AFAIK the AAPCS32 defines the Frame Record to be 2 consecutive
>> 32-bit
>> values of LR and FP on the stack so that's the reason.
>
> GCC does not currently support AAPCS frame chains as that's a
> relatively new feature in the AAPCS; so it's not something you need to
> be concerned about right now.  The AAPCS frame chain uses R11 as the
> frame-chain register anyway.  However, you are right that this does
> affect -mtpcs-frame and will become relevant when we do add support
> for AAPCS frame chains.

Do you think it would be better to go for the "push {r3, r7, ip, lr}"
solution even if we decide -mtpcs-frame is not compatible with PAC
instrumentation, or do we want to stay with the proposed approach for
future compatibility?

Thanks

  Andrea
  

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index a87bcb298f9..2889a471fa5 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -302,6 +302,7 @@  static bool arm_vectorize_vec_perm_const (machine_mode, rtx, rtx, rtx,
 					  const vec_perm_indices &);
 
 static bool aarch_macro_fusion_pair_p (rtx_insn*, rtx_insn*);
+static bool arm_pac_enabled_for_curr_function_p (void);
 
 static int arm_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
 					   tree vectype,
@@ -22696,6 +22697,10 @@  arm_compute_frame_layout (void)
 	 nonecure entry functions with VSTR/VLDR.  */
       if (TARGET_HAVE_FPCXT_CMSE && IS_CMSE_ENTRY (func_type))
 	saved += 4;
+
+      /* Allocate space for saving R12.  */
+      if (arm_pac_enabled_for_curr_function_p ())
+	saved += 4;
     }
   else /* TARGET_THUMB1 */
     {
@@ -23288,11 +23293,12 @@  arm_expand_prologue (void)
   /* The static chain register is the same as the IP register.  If it is
      clobbered when creating the frame, we need to save and restore it.  */
   clobber_ip = IS_NESTED (func_type)
-	       && ((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
-		   || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
-			|| flag_stack_clash_protection)
-		       && !df_regs_ever_live_p (LR_REGNUM)
-		       && arm_r3_live_at_start_p ()));
+    && (((TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
+	 || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+	      || flag_stack_clash_protection)
+	     && !df_regs_ever_live_p (LR_REGNUM)
+	     && arm_r3_live_at_start_p ()))
+	|| (arm_pac_enabled_for_curr_function_p ()));
 
   /* Find somewhere to store IP whilst the frame is being created.
      We try the following places in order:
@@ -23368,6 +23374,14 @@  arm_expand_prologue (void)
 	}
     }
 
+  if (arm_pac_enabled_for_curr_function_p ())
+    {
+      if (aarch_bti_enabled ())
+	emit_insn (gen_pacbti_ip_lr_sp ());
+      else
+	emit_insn (gen_pac_ip_lr_sp ());
+    }
+
   if (TARGET_APCS_FRAME && frame_pointer_needed && TARGET_ARM)
     {
       if (IS_INTERRUPT (func_type))
@@ -23490,6 +23504,9 @@  arm_expand_prologue (void)
   if (! IS_VOLATILE (func_type))
     saved_regs += arm_save_coproc_regs ();
 
+  if (arm_pac_enabled_for_curr_function_p ())
+    emit_multi_reg_push (1 << IP_REGNUM, 1 << IP_REGNUM);
+
   if (frame_pointer_needed && TARGET_ARM)
     {
       /* Create the new frame pointer.  */
@@ -27150,7 +27167,8 @@  thumb2_expand_return (bool simple_return)
 	 to assert it for now to ensure that future code changes do not silently
 	 change this behavior.  */
       gcc_assert (!IS_CMSE_ENTRY (arm_current_func_type ()));
-      if (num_regs == 1)
+      if (num_regs == 1
+	  && !(arm_pac_enabled_for_curr_function_p ()))
         {
           rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2));
           rtx reg = gen_rtx_REG (SImode, PC_REGNUM);
@@ -27165,13 +27183,34 @@  thumb2_expand_return (bool simple_return)
         }
       else
         {
-          saved_regs_mask &= ~ (1 << LR_REGNUM);
-          saved_regs_mask |=   (1 << PC_REGNUM);
-          arm_emit_multi_reg_pop (saved_regs_mask);
+	  if (arm_pac_enabled_for_curr_function_p ())
+	    {
+	      emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
+				     GEN_INT (4)));
+	      arm_emit_multi_reg_pop (1 << IP_REGNUM);
+	      saved_regs_mask &= ~ (1 << PC_REGNUM);
+	      arm_emit_multi_reg_pop (saved_regs_mask);
+	      emit_insn (gen_aut_ip_lr_sp ());
+	      emit_jump_insn (simple_return_rtx);
+	    }
+	  else
+	    {
+	      saved_regs_mask &= ~ (1 << LR_REGNUM);
+	      saved_regs_mask |=   (1 << PC_REGNUM);
+	      arm_emit_multi_reg_pop (saved_regs_mask);
+	    }
         }
     }
   else
     {
+      if (arm_pac_enabled_for_curr_function_p ())
+	{
+	  emit_insn (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
+				 GEN_INT (4)));
+	  arm_emit_multi_reg_pop (1 << IP_REGNUM);
+	  emit_insn (gen_aut_ip_lr_sp ());
+	}
+
       if (IS_CMSE_ENTRY (arm_current_func_type ()))
 	cmse_nonsecure_entry_clear_before_return ();
       emit_jump_insn (simple_return_rtx);
@@ -27469,6 +27508,9 @@  arm_expand_epilogue (bool really_return)
           /* In Thumb-2 mode, the frame pointer points to the last saved
              register.  */
 	  amount = offsets->locals_base - offsets->saved_regs;
+	  if (arm_pac_enabled_for_curr_function_p ())
+	    amount += 4;
+
 	  if (amount)
 	    {
 	      insn = emit_insn (gen_addsi3 (hard_frame_pointer_rtx,
@@ -27497,6 +27539,10 @@  arm_expand_epilogue (bool really_return)
       /* Pop off outgoing args and local frame to adjust stack pointer to
          last saved register.  */
       amount = offsets->outgoing_args - offsets->saved_regs;
+
+      if (arm_pac_enabled_for_curr_function_p ())
+	amount += 4;
+
       if (amount)
         {
 	  rtx_insn *tmp;
@@ -27562,6 +27608,9 @@  arm_expand_epilogue (bool really_return)
 				       stack_pointer_rtx, stack_pointer_rtx);
         }
 
+  if (arm_pac_enabled_for_curr_function_p ())
+    arm_emit_multi_reg_pop (1 << IP_REGNUM);
+
   if (saved_regs_mask)
     {
       rtx insn;
@@ -27574,7 +27623,8 @@  arm_expand_epilogue (bool really_return)
           && really_return
           && crtl->args.pretend_args_size == 0
           && saved_regs_mask & (1 << LR_REGNUM)
-          && !crtl->calls_eh_return)
+          && !crtl->calls_eh_return
+	  && !arm_pac_enabled_for_curr_function_p ())
         {
           saved_regs_mask &= ~(1 << LR_REGNUM);
           saved_regs_mask |= (1 << PC_REGNUM);
@@ -27688,6 +27738,9 @@  arm_expand_epilogue (bool really_return)
 	}
     }
 
+  if (arm_pac_enabled_for_curr_function_p ())
+    emit_insn (gen_aut_ip_lr_sp ());
+
   if (!really_return)
     return;
 
@@ -30393,6 +30446,9 @@  arm_conditional_register_usage (void)
 	global_regs[ARM_HARD_FRAME_POINTER_REGNUM] = 1;
     }
 
+  if (TARGET_HAVE_PACBTI)
+    call_used_regs[IP_REGNUM] = 1;
+
   /* The Q and GE bits are only accessed via special ACLE patterns.  */
   CLEAR_HARD_REG_BIT (operand_reg_set, APSRQ_REGNUM);
   CLEAR_HARD_REG_BIT (operand_reg_set, APSRGE_REGNUM);
@@ -32822,6 +32878,15 @@  arm_fusion_enabled_p (tune_params::fuse_ops op)
   return current_tune->fusible_ops & op;
 }
 
+/* Return TRUE if return address signing mechanism is enabled.  */
+static bool
+arm_pac_enabled_for_curr_function_p (void)
+{
+  return aarch_ra_sign_scope == AARCH_FUNCTION_ALL
+    || (aarch_ra_sign_scope == AARCH_FUNCTION_NON_LEAF
+	&& !crtl->is_leaf);
+}
+
 /* Implement TARGET_SCHED_CAN_SPECULATE_INSN.  Return true if INSN can be
    scheduled for speculative execution.  Reject the long-running division
    and square-root instructions.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 4adc976b8b6..132135d244d 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -12823,6 +12823,26 @@ 
    (set_attr "length" "8")]
 )
 
+(define_insn "pac_ip_lr_sp"
+  [(set (reg:DI IP_REGNUM)
+	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+                   UNSPEC_PAC_IP_LR_SP))]
+  ""
+  "pac\tr12, lr, sp")
+
+(define_insn "pacbti_ip_lr_sp"
+  [(set (reg:DI IP_REGNUM)
+	(unspec:DI [(reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+                   UNSPEC_PACBTI_IP_LR_SP))]
+  ""
+  "pacbti\tr12, lr, sp")
+
+(define_insn "aut_ip_lr_sp"
+  [(unspec:DI [(reg:DI IP_REGNUM) (reg:DI SP_REGNUM) (reg:DI LR_REGNUM)]
+              UNSPEC_AUT_IP_LR_SP)]
+  ""
+  "aut\tr12, lr, sp")
+
 ;; Vector bits common to IWMMXT, Neon and MVE
 (include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index ad1c6edd005..d60d0ceb87c 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -159,6 +159,9 @@ 
   UNSPEC_VCDE		; Custom Datapath Extension instruction.
   UNSPEC_VCDEA		; Custom Datapath Extension instruction.
   UNSPEC_DLS		; Used for DLS (Do Loop Start), Armv8.1-M Mainline instruction
+  UNSPEC_PAC_IP_LR_SP   ; Represents PAC signing LR
+  UNSPEC_PACBTI_IP_LR_SP ; Represents PAC signing LR + valid landing pad
+  UNSPEC_AUT_IP_LR_SP   ; Represents PAC verifying LR
 ])
 
 
diff --git a/gcc/testsuite/gcc.target/arm/pac-1.c b/gcc/testsuite/gcc.target/arm/pac-1.c
new file mode 100644
index 00000000000..8979a554e63
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pac-1.c
@@ -0,0 +1,25 @@ 
+/* Testing return address signing.  */
+/* { dg-do run } */
+/* { dg-options "-march=armv8.1-m.main -mbranch-protection=pac-ret+leaf -mthumb --save-temps -O0" } */
+
+#include <stdlib.h>
+
+int
+__attribute__((noinline))
+foo1 (int a, int b)
+{
+  return a + b;
+}
+
+int
+main (void)
+{
+  if (foo1 (1, 2) != 3)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "pac\tr12, lr, sp" 2 } } */
+/* { dg-final { scan-assembler-times "aut\tr12, lr, sp" 2 } } */
+/* { dg-final { scan-assembler-not "bti" } } */
diff --git a/gcc/testsuite/gcc.target/arm/pac-2.c b/gcc/testsuite/gcc.target/arm/pac-2.c
new file mode 100644
index 00000000000..678294af67d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pac-2.c
@@ -0,0 +1,25 @@ 
+/* Testing return address signing.  */
+/* { dg-do run } */
+/* { dg-options "-march=armv8.1-m.main -mbranch-protection=pac-ret -mthumb --save-temps -O0" } */
+
+#include <stdlib.h>
+
+int
+__attribute__((noinline))
+foo1 (int a, int b)
+{
+  return a + b;
+}
+
+int
+main (void)
+{
+  if (foo1 (1, 2) != 3)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler "pac\tr12, lr, sp" } } */
+/* { dg-final { scan-assembler "aut\tr12, lr, sp" } } */
+/* { dg-final { scan-assembler-not "bti" } } */
diff --git a/gcc/testsuite/gcc.target/arm/pac-3.c b/gcc/testsuite/gcc.target/arm/pac-3.c
new file mode 100644
index 00000000000..e67ee910683
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pac-3.c
@@ -0,0 +1,25 @@ 
+/* Testing return address signing.  */
+/* { dg-do run } */
+/* { dg-options "-march=armv8.1-m.main -mbranch-protection=bti+pac-ret+leaf -mthumb --save-temps -O2" } */
+
+#include <stdlib.h>
+
+int
+__attribute__((noinline))
+foo1 (int a, int b)
+{
+  return a + b;
+}
+
+int
+main (void)
+{
+  if (foo1 (1, 2) != 3)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "pacbti\tr12, lr, sp" 2 } } */
+/* { dg-final { scan-assembler-times "aut\tr12, lr, sp" 2 } } */
+/* { dg-final { scan-assembler-not "\tbti\t" } } */
diff --git a/gcc/testsuite/gcc.target/arm/pac-4.c b/gcc/testsuite/gcc.target/arm/pac-4.c
new file mode 100644
index 00000000000..404457313a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pac-4.c
@@ -0,0 +1,25 @@ 
+/* Testing return address signing.  */
+/* { dg-do run } */
+/* { dg-options "-march=armv8.1-m.main+pacbti -mthumb --save-temps -O2" } */
+
+#include <stdlib.h>
+
+int
+__attribute__((noinline))
+foo1 (int a, int b)
+{
+  return a + b;
+}
+
+int
+main (void)
+{
+  if (foo1 (1, 2) != 3)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-not "\tbti\t" } } */
+/* { dg-final { scan-assembler-not "\tpac\t" } } */
+/* { dg-final { scan-assembler-not "\tpacbti\t" } } */
diff --git a/gcc/testsuite/gcc.target/arm/pac-5.c b/gcc/testsuite/gcc.target/arm/pac-5.c
new file mode 100644
index 00000000000..d2d996b921a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pac-5.c
@@ -0,0 +1,26 @@ 
+/* Testing return address signing.  */
+/* { dg-do run } */
+/* { dg-options "-march=armv8.1-m.main -mbranch-protection=pac-ret+leaf -mthumb --save-temps -O0" } */
+
+#include <stdlib.h>
+
+int
+__attribute__((noinline))
+foo1 (int a, int b)
+{
+  int square (int z) { return z * z; }
+  return square (a) + square (b);
+}
+
+int
+main (void)
+{
+  if (foo1 (1, 2) != 5)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-assembler-times "pac\tr12, lr, sp" 3 } } */
+/* { dg-final { scan-assembler-times "aut\tr12, lr, sp" 3 } } */
+/* { dg-final { scan-assembler-not "bti" } } */