[2/5] xtensa: Add support for sibling call optimization
Commit Message
This patch introduces support for sibling call optimization when the
call0 ABI is in effect.
gcc/ChangeLog:
* config/xtensa/xtensa-protos.h (xtensa_prepare_expand_call,
xtensa_emit_sibcall): New prototypes.
(xtensa_expand_epilogue): Add new argument that specifies whether
the epilogue is for a sibling call.
* config/xtensa/xtensa.cc (TARGET_FUNCTION_OK_FOR_SIBCALL):
New macro definition.
(xtensa_prepare_expand_call): New function to share the common
code.
(xtensa_emit_sibcall, xtensa_function_ok_for_sibcall):
New functions.
(xtensa_expand_epilogue): Add new argument sibcall_p and use it
for sibling call handling.
* config/xtensa/xtensa.md (call, call_value):
Use xtensa_prepare_expand_call.
(call_internal, call_value_internal):
Add a condition that disables the pattern for sibling calls.
(sibcall, sibcall_value, sibcall_epilogue): New expansions.
(sibcall_internal, sibcall_value_internal): New insn patterns.
gcc/testsuite/ChangeLog:
* gcc.target/xtensa/sibcalls.c: New.
---
gcc/config/xtensa/xtensa-protos.h | 4 +-
gcc/config/xtensa/xtensa.cc | 63 ++++++++++++++++++--
gcc/config/xtensa/xtensa.md | 68 +++++++++++++++++-----
gcc/testsuite/gcc.target/xtensa/sibcalls.c | 15 +++++
4 files changed, 130 insertions(+), 20 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/xtensa/sibcalls.c
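For readers unfamiliar with the term, a sibling call is a call made as the final action of a function, with its result (if any) returned unchanged. The sketch below illustrates the distinction in plain C. The names `scale`, `tail_caller`, and `non_tail_caller` are hypothetical and do not appear in the patch. With this optimization enabled under the call0 ABI, the compiler may turn the call in `tail_caller` into a jump, whereas `non_tail_caller` still needs an ordinary call followed by a return.

```c
/* Hypothetical callee, used only for illustration.  */
int
scale (int x)
{
  return x * 2;
}

/* Sibling call candidate: the call is the last action and its result
   is returned unchanged, so the caller's frame is no longer needed.  */
int
tail_caller (int a)
{
  return scale (a + 1);
}

/* Not a candidate: the result is modified after the call returns, so
   control must come back to this function.  */
int
non_tail_caller (int a)
{
  return scale (a) + 1;
}
```

Whether the jump is actually emitted depends on the target hook and the optimization level, so this only shows which source shapes are eligible.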
Comments
Hi Suwa-san,
On Mon, Jun 13, 2022 at 8:54 PM Takayuki 'January June' Suwa
<jjsuwa_sys3175@yahoo.co.jp> wrote:
>
> This patch introduces support for sibling call optimization, when call0
> ABI is in effect.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa-protos.h (xtensa_prepare_expand_call,
> xtensa_emit_sibcall): New prototypes.
> (xtensa_expand_epilogue): Add new argument that specifies whether
> or not sibling call.
> * config/xtensa/xtensa.cc (TARGET_FUNCTION_OK_FOR_SIBCALL):
> New macro definition.
> (xtensa_prepare_expand_call): New function in order to share
> the common code.
> (xtensa_emit_sibcall, xtensa_function_ok_for_sibcall):
> New functions.
> (xtensa_expand_epilogue): Add new argument sibcall_p and use it
> for sibling call handling.
> * config/xtensa/xtensa.md (call, call_value):
> Use xtensa_prepare_expand_call.
> (call_internal, call_value_internal):
> Add the condition in order to be disabled if sibling call.
> (sibcall, sibcall_value, sibcall_epilogue): New expansions.
> (sibcall_internal, sibcall_value_internal): New insn patterns.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/xtensa/sibcalls.c: New.
> ---
> gcc/config/xtensa/xtensa-protos.h | 4 +-
> gcc/config/xtensa/xtensa.cc | 63 ++++++++++++++++++--
> gcc/config/xtensa/xtensa.md | 68 +++++++++++++++++-----
> gcc/testsuite/gcc.target/xtensa/sibcalls.c | 15 +++++
> 4 files changed, 130 insertions(+), 20 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/xtensa/sibcalls.c
This change results in a bunch of new regression test failures:
FAIL: gcc.c-torture/execute/builtins/fprintf.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/fprintf.c execution, -O3 -g
FAIL: gcc.c-torture/execute/builtins/fprintf.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/fprintf.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/memset.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/memset.c execution, -O3 -g
FAIL: gcc.c-torture/execute/builtins/memset.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/memset.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/printf.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/printf.c execution, -O3 -g
FAIL: gcc.c-torture/execute/builtins/printf.c execution, -Os
FAIL: gcc.c-torture/execute/builtins/printf.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/strcpy-2.c execution, -O2
FAIL: gcc.c-torture/execute/builtins/strcpy-2.c execution, -O2 -flto -fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/strcpy-2.c execution, -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
FAIL: gcc.c-torture/execute/20000121-1.c -O2 execution test
FAIL: gcc.c-torture/execute/20000121-1.c -O3 -g execution test
FAIL: gcc.c-torture/execute/20000121-1.c -Os execution test
FAIL: gcc.c-torture/execute/20000121-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/921208-2.c -O2 execution test
FAIL: gcc.c-torture/execute/921208-2.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.c-torture/execute/921208-2.c -O3 -g execution test
FAIL: gcc.c-torture/execute/921208-2.c -Os execution test
FAIL: gcc.c-torture/execute/921208-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/pr33992.c -O2 execution test
FAIL: gcc.c-torture/execute/pr33992.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.c-torture/execute/pr33992.c -O3 -g execution test
FAIL: gcc.c-torture/execute/pr33992.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/pr49161.c -O2 execution test
FAIL: gcc.c-torture/execute/pr49161.c -O3 -g execution test
FAIL: gcc.c-torture/execute/pr49161.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/pr88714.c -O2 execution test
FAIL: gcc.c-torture/execute/pr88714.c -O3 -g execution test
FAIL: gcc.c-torture/execute/pr88714.c -Os execution test
FAIL: gcc.c-torture/execute/pr88714.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/pr88714.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL: gcc.c-torture/execute/pr88739.c -O2 execution test
FAIL: gcc.c-torture/execute/pr88739.c -O3 -g execution test
FAIL: gcc.c-torture/execute/pr88739.c -Os execution test
FAIL: gcc.c-torture/execute/pr88739.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/pr90949.c -O2 execution test
FAIL: gcc.c-torture/execute/pr90949.c -O3 -g execution test
FAIL: gcc.c-torture/execute/pr90949.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/pr90949.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL: gcc.c-torture/execute/printf-2.c -O2 execution test
FAIL: gcc.c-torture/execute/printf-2.c -O3 -g execution test
FAIL: gcc.c-torture/execute/printf-2.c -Os execution test
FAIL: gcc.c-torture/execute/printf-2.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.c-torture/execute/printf-2.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
FAIL: gcc.dg/packed-array.c execution test
FAIL: gcc.dg/pr20115.c execution test
FAIL: gcc.dg/pr44404.c execution test
FAIL: gcc.dg/pr81292-2.c execution test
FAIL: gcc.dg/strlenopt-31.c execution test
FAIL: gcc.dg/strlenopt-81.c execution test
FAIL: gcc.dg/torture/builtin-complex-1.c -O2 execution test
FAIL: gcc.dg/torture/builtin-complex-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.dg/torture/builtin-complex-1.c -O3 -g execution test
FAIL: gcc.dg/torture/builtin-complex-1.c -Os execution test
FAIL: gcc.dg/torture/builtin-complex-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.dg/torture/pr56661.c -Os execution test
FAIL: gcc.dg/torture/pr65077.c -O2 execution test
FAIL: gcc.dg/torture/pr65077.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.dg/torture/pr65077.c -O3 -g execution test
FAIL: gcc.dg/torture/pr65077.c -Os execution test
FAIL: gcc.dg/torture/pr65077.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.dg/torture/pr67916.c -O2 execution test
FAIL: gcc.dg/torture/pr67916.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gcc.dg/torture/pr67916.c -O3 -g execution test
FAIL: gcc.dg/torture/pr67916.c -Os execution test
FAIL: gcc.dg/torture/pr67916.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
FAIL: gcc.dg/tree-ssa/cswtch-3.c execution test
FAIL: gcc.dg/tree-ssa/predcom-dse-5.c execution test
FAIL: gcc.dg/tree-ssa/predcom-dse-6.c execution test
FAIL: gcc.dg/tree-ssa/predcom-dse-7.c execution test
The code generated for e.g. gcc.c-torture/execute/921208-2.c looks like this:
	.file "921208-2.c"
	.text
	.literal_position
	.align 4
	.global g
	.type g, @function
g:
	ret.n
	.size g, .-g
	.literal_position
	.literal .LC1, g@PLT
	.literal .LC3, 1072693248
	.literal .LC4, 1073741824
	.align 4
	.global f
	.type f, @function
f:
	addi sp, sp, -16
	s32i.n a13, sp, 4
	l32r a13, .LC3
	s32i.n a12, sp, 8
	s32i.n a14, sp, 0
	movi.n a12, 0
	l32r a14, .LC1
	s32i.n a0, sp, 12
	mov.n a3, a13
	mov.n a4, a12
	mov.n a5, a13
	mov.n a2, a12
	callx0 a14
	l32i.n a0, sp, 12
	l32i.n a14, sp, 0
	mov.n a4, a12
	mov.n a5, a13
	l32i.n a12, sp, 8
	l32i.n a13, sp, 4
	l32r a3, .LC4
	movi.n a2, 0
	addi sp, sp, 16
	jx a14
	.size f, .-f
	.section .text.startup,"ax",@progbits
	.literal_position
	.literal .LC5, f@PLT
	.literal .LC6, exit@PLT
	.align 4
	.global main
	.type main, @function
main:
	addi sp, sp, -16
	l32r a2, .LC5
	s32i.n a0, sp, 12
	callx0 a2
	l32r a3, .LC6
	movi.n a2, 0
	callx0 a3
	.size main, .-main
	.ident "GCC: (GNU) 13.0.0 20220614 (experimental)"
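One reading of the listing for f: the address of g is loaded into a14, which is callee-saved under the call0 ABI and is therefore spilled in the prologue. The epilogue reloads the saved a14 from the stack (`l32i.n a14, sp, 0`) before `jx a14` executes, so the jump appears to target the caller's old a14 value rather than g. The C sketch below shows a call shape that matches the argument registers in the listing. It is a hypothetical reconstruction, not the source of 921208-2.c; the decoding of `.LC3` and `.LC4` as the high words of 1.0 and 2.0 assumes IEEE doubles on a little-endian configuration, and `g_calls`, `last_a`, `last_b`, and `run_f` are bookkeeping added only so the sketch can be checked on a host.

```c
/* Hypothetical reconstruction of the call shape in the listing above;
   this is not the source of 921208-2.c.  */
static int g_calls;            /* illustration-only bookkeeping */
static double last_a, last_b;  /* illustration-only bookkeeping */

void
g (double a, double b)
{
  g_calls++;
  last_a = a;
  last_b = b;
}

void
f (void)
{
  g (1.0, 1.0);  /* ordinary call: "callx0 a14" in the listing */
  g (2.0, 1.0);  /* last action of f: the sibling call, "jx a14" */
}

/* Host-side helper: run f once and report how many times g ran.  */
int
run_f (void)
{
  g_calls = 0;
  f ();
  return g_calls;
}
```

On a correct build both calls reach g. In the failing configurations the second transfer presumably goes elsewhere, which would be consistent with the execution failures reported above.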
Patch
--- a/gcc/config/xtensa/xtensa-protos.h
+++ b/gcc/config/xtensa/xtensa-protos.h
@@ -53,7 +53,9 @@ extern void xtensa_expand_atomic (enum rtx_code, rtx, rtx, rtx, bool);
extern void xtensa_emit_loop_end (rtx_insn *, rtx *);
extern char *xtensa_emit_branch (bool, rtx *);
extern char *xtensa_emit_movcc (bool, bool, bool, rtx *);
+extern void xtensa_prepare_expand_call (int, rtx *);
extern char *xtensa_emit_call (int, rtx *);
+extern char *xtensa_emit_sibcall (int, rtx *);
extern bool xtensa_tls_referenced_p (rtx);
extern enum rtx_code xtensa_shlrd_which_direction (rtx, rtx);
@@ -73,7 +75,7 @@ extern int xtensa_dbx_register_number (int);
extern long compute_frame_size (poly_int64);
extern bool xtensa_use_return_instruction_p (void);
extern void xtensa_expand_prologue (void);
-extern void xtensa_expand_epilogue (void);
+extern void xtensa_expand_epilogue (bool);
extern void order_regs_for_local_alloc (void);
extern enum reg_class xtensa_regno_to_class (int regno);
extern HOST_WIDE_INT xtensa_initial_elimination_offset (int from, int to);
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -189,7 +189,7 @@ static bool xtensa_can_eliminate (const int from ATTRIBUTE_UNUSED,
const int to);
static HOST_WIDE_INT xtensa_starting_frame_offset (void);
static unsigned HOST_WIDE_INT xtensa_asan_shadow_offset (void);
-
+static bool xtensa_function_ok_for_sibcall (tree, tree);
static rtx xtensa_delegitimize_address (rtx);
@@ -347,6 +347,9 @@ static rtx xtensa_delegitimize_address (rtx);
#undef TARGET_DELEGITIMIZE_ADDRESS
#define TARGET_DELEGITIMIZE_ADDRESS xtensa_delegitimize_address
+#undef TARGET_FUNCTION_OK_FOR_SIBCALL
+#define TARGET_FUNCTION_OK_FOR_SIBCALL xtensa_function_ok_for_sibcall
+
struct gcc_target targetm = TARGET_INITIALIZER;
@@ -2127,6 +2130,20 @@ xtensa_emit_movcc (bool inverted, bool isfp, bool isbool, rtx *operands)
}
+void
+xtensa_prepare_expand_call (int callop, rtx *operands)
+{
+ rtx addr = XEXP (operands[callop], 0);
+
+ if (flag_pic && SYMBOL_REF_P (addr)
+ && (!SYMBOL_REF_LOCAL_P (addr) || SYMBOL_REF_EXTERNAL_P (addr)))
+ addr = gen_sym_PLT (addr);
+
+ if (!call_insn_operand (addr, VOIDmode))
+ XEXP (operands[callop], 0) = copy_to_mode_reg (Pmode, addr);
+}
+
+
char *
xtensa_emit_call (int callop, rtx *operands)
{
@@ -2145,6 +2162,24 @@ xtensa_emit_call (int callop, rtx *operands)
}
+char *
+xtensa_emit_sibcall (int callop, rtx *operands)
+{
+ static char result[64];
+ rtx tgt = operands[callop];
+
+ if (GET_CODE (tgt) == CONST_INT)
+ sprintf (result, "j.l\t" HOST_WIDE_INT_PRINT_HEX ", a9",
+ INTVAL (tgt));
+ else if (register_operand (tgt, VOIDmode))
+ sprintf (result, "jx\t%%%d", callop);
+ else
+ sprintf (result, "j.l\t%%%d, a9", callop);
+
+ return result;
+}
+
+
bool
xtensa_legitimate_address_p (machine_mode mode, rtx addr, bool strict)
{
@@ -3270,7 +3305,7 @@ xtensa_expand_prologue (void)
}
void
-xtensa_expand_epilogue (void)
+xtensa_expand_epilogue (bool sibcall_p)
{
if (!TARGET_WINDOWED_ABI)
{
@@ -3304,10 +3339,13 @@ xtensa_expand_epilogue (void)
if (xtensa_call_save_reg(regno))
{
rtx x = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
+ rtx reg;
offset -= UNITS_PER_WORD;
- emit_move_insn (gen_rtx_REG (SImode, regno),
+ emit_move_insn (reg = gen_rtx_REG (SImode, regno),
gen_frame_mem (SImode, x));
+ if (regno == A0_REG && sibcall_p)
+ emit_use (reg);
}
}
@@ -3342,7 +3380,8 @@ xtensa_expand_epilogue (void)
EH_RETURN_STACKADJ_RTX));
}
cfun->machine->epilogue_done = true;
- emit_jump_insn (gen_return ());
+ if (!sibcall_p)
+ emit_jump_insn (gen_return ());
}
bool
@@ -4869,6 +4908,22 @@ xtensa_asan_shadow_offset (void)
return HOST_WIDE_INT_UC (0x10000000);
}
+/* Implement TARGET_FUNCTION_OK_FOR_SIBCALL. */
+static bool
+xtensa_function_ok_for_sibcall (tree decl, tree exp ATTRIBUTE_UNUSED)
+{
+ /* Do not allow tailcalls if the Windowed Register Option is
+ configured. */
+ if (TARGET_WINDOWED_ABI)
+ return false;
+
+ /* Do not allow indirect tailcalls. */
+ if (decl == NULL)
+ return false;
+
+ return true;
+}
+
static rtx
xtensa_delegitimize_address (rtx op)
{
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -2148,18 +2148,13 @@
(match_operand 1 "" ""))]
""
{
- rtx addr = XEXP (operands[0], 0);
- if (flag_pic && GET_CODE (addr) == SYMBOL_REF
- && (!SYMBOL_REF_LOCAL_P (addr) || SYMBOL_REF_EXTERNAL_P (addr)))
- addr = gen_sym_PLT (addr);
- if (!call_insn_operand (addr, VOIDmode))
- XEXP (operands[0], 0) = copy_to_mode_reg (Pmode, addr);
+ xtensa_prepare_expand_call (0, operands);
})
(define_insn "call_internal"
[(call (mem (match_operand:SI 0 "call_insn_operand" "nir"))
(match_operand 1 "" "i"))]
- ""
+ "!SIBLING_CALL_P (insn)"
{
return xtensa_emit_call (0, operands);
}
@@ -2173,19 +2168,14 @@
(match_operand 2 "" "")))]
""
{
- rtx addr = XEXP (operands[1], 0);
- if (flag_pic && GET_CODE (addr) == SYMBOL_REF
- && (!SYMBOL_REF_LOCAL_P (addr) || SYMBOL_REF_EXTERNAL_P (addr)))
- addr = gen_sym_PLT (addr);
- if (!call_insn_operand (addr, VOIDmode))
- XEXP (operands[1], 0) = copy_to_mode_reg (Pmode, addr);
+ xtensa_prepare_expand_call (1, operands);
})
(define_insn "call_value_internal"
[(set (match_operand 0 "register_operand" "=a")
(call (mem (match_operand:SI 1 "call_insn_operand" "nir"))
(match_operand 2 "" "i")))]
- ""
+ "!SIBLING_CALL_P (insn)"
{
return xtensa_emit_call (1, operands);
}
@@ -2193,6 +2183,46 @@
(set_attr "mode" "none")
(set_attr "length" "3")])
+(define_expand "sibcall"
+ [(call (match_operand 0 "memory_operand" "")
+ (match_operand 1 "" ""))]
+ "!TARGET_WINDOWED_ABI"
+{
+ xtensa_prepare_expand_call (0, operands);
+})
+
+(define_insn "sibcall_internal"
+ [(call (mem:SI (match_operand:SI 0 "call_insn_operand" "nir"))
+ (match_operand 1 "" "i"))]
+ "!TARGET_WINDOWED_ABI && SIBLING_CALL_P (insn)"
+{
+ return xtensa_emit_sibcall (0, operands);
+}
+ [(set_attr "type" "call")
+ (set_attr "mode" "none")
+ (set_attr "length" "3")])
+
+(define_expand "sibcall_value"
+ [(set (match_operand 0 "register_operand" "")
+ (call (match_operand 1 "memory_operand" "")
+ (match_operand 2 "" "")))]
+ "!TARGET_WINDOWED_ABI"
+{
+ xtensa_prepare_expand_call (1, operands);
+})
+
+(define_insn "sibcall_value_internal"
+ [(set (match_operand 0 "register_operand" "=a")
+ (call (mem:SI (match_operand:SI 1 "call_insn_operand" "nir"))
+ (match_operand 2 "" "i")))]
+ "!TARGET_WINDOWED_ABI && SIBLING_CALL_P (insn)"
+{
+ return xtensa_emit_sibcall (1, operands);
+}
+ [(set_attr "type" "call")
+ (set_attr "mode" "none")
+ (set_attr "length" "3")])
+
(define_insn "entry"
[(set (reg:SI A1_REG)
(unspec_volatile:SI [(match_operand:SI 0 "const_int_operand" "i")]
@@ -2260,7 +2290,15 @@
[(return)]
""
{
- xtensa_expand_epilogue ();
+ xtensa_expand_epilogue (false);
+ DONE;
+})
+
+(define_expand "sibcall_epilogue"
+ [(return)]
+ "!TARGET_WINDOWED_ABI"
+{
+ xtensa_expand_epilogue (true);
DONE;
})
new file mode 100644
--- /dev/null
+++ b/gcc/testsuite/gcc.target/xtensa/sibcalls.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabi=call0 -foptimize-sibling-calls" } */
+
+extern int foo(int);
+extern void bar(int);
+
+int test_0(int a) {
+ return foo(a);
+}
+
+void test_1(int a) {
+ bar(a);
+}
+
+/* { dg-final { scan-assembler-not "ret" } } */
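The operand dispatch in xtensa_emit_sibcall can be tried outside GCC with a small model. The sketch below mirrors the three branches of that function using a plain enum instead of RTL. The names `target_kind` and `model_emit_sibcall` are hypothetical, and `%#lx` is only an assumed stand-in for HOST_WIDE_INT_PRINT_HEX on a typical host. The `a9` operand of `j.l` appears to serve as a scratch register that the assembler may use if it has to relax the jump.

```c
#include <stdio.h>

/* Hypothetical stand-in for the three operand cases the patch
   distinguishes: CONST_INT, register operand, everything else.  */
enum target_kind { TARGET_CONST_INT, TARGET_REGISTER, TARGET_SYMBOL };

/* Model of the template selection in xtensa_emit_sibcall.  Returns a
   pointer to a static buffer, as the function in the patch does.  */
const char *
model_emit_sibcall (enum target_kind kind, int callop, unsigned long value)
{
  static char result[64];

  switch (kind)
    {
    case TARGET_CONST_INT:
      /* Absolute address: jump with a9 named as scratch register.  */
      snprintf (result, sizeof result, "j.l\t%#lx, a9", value);
      break;
    case TARGET_REGISTER:
      /* Address already in a register: indirect jump.  */
      snprintf (result, sizeof result, "jx\t%%%d", callop);
      break;
    default:
      /* Symbolic target: operand substituted later by the compiler.  */
      snprintf (result, sizeof result, "j.l\t%%%d, a9", callop);
      break;
    }
  return result;
}
```

The `%%%d` format produces an operand placeholder such as `%0`, which GCC substitutes when it prints the instruction. The model stops at the template string and does not perform that substitution.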