From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai
Subject: [PATCH 13/21] Adjust scalable frame and full testcases
Date: Tue, 31 May 2022 16:50:04 +0800
Message-Id: <20220531085012.269719-14-juzhe.zhong@rivai.ai>
In-Reply-To: <20220531085012.269719-1-juzhe.zhong@rivai.ai>
X-Patchwork-Id: 54555

From: zhongjuzhe

gcc/ChangeLog:

	* config/riscv/riscv-vector.cc (rvv_adjust_frame): Adjust frame
	manipulation for RVV scalable vector.
	* config/riscv/riscv-vector.h (rvv_adjust_frame): Adjust frame
	manipulation for RVV scalable vector.
	* config/riscv/riscv.cc (riscv_compute_frame_info): Adjust frame
	manipulation for RVV scalable vector.
	(riscv_first_stack_step): Adjust frame manipulation for RVV scalable
	vector.
	(riscv_expand_prologue): Adjust frame manipulation for RVV scalable
	vector.
	(riscv_expand_epilogue): Adjust frame manipulation for RVV scalable
	vector.
	(riscv_dwarf_poly_indeterminate_value): New function.
	(riscv_estimated_poly_value): New function.
	(TARGET_DWARF_POLY_INDETERMINATE_VALUE): New targethook.
	(TARGET_ESTIMATED_POLY_VALUE): New targethook.
	* config/riscv/riscv.h (RISCV_PROLOGUE_TEMP2_REGNUM): New macro define.
	(RISCV_PROLOGUE_TEMP2): New macro define.
	(RISCV_DWARF_VLENB): New macro define.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/stack/rvv-stack.exp: New test.
	* gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c: New test.
	* gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c: New test.
	* gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c: New test.
	* gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c: New test.
	* gcc.target/riscv/rvv/stack/stack-check-scalar.c: New test.
	* gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c: New test.
	* gcc.target/riscv/rvv/stack/stack-check-vector_1.c: New test.
	* gcc.target/riscv/rvv/stack/stack-check-vector_2.c: New test.
---
 gcc/config/riscv/riscv-vector.cc              |  33 +++
 gcc/config/riscv/riscv-vector.h               |   1 +
 gcc/config/riscv/riscv.cc                     | 275 ++++++++++++-----
 gcc/config/riscv/riscv.h                      |   4 +
 .../gcc.target/riscv/rvv/stack/rvv-stack.exp  |  47 +++
 .../rvv/stack/stack-check-alloca-scalar.c     |  53 ++++
 .../rvv/stack/stack-check-alloca-vector.c     |  45 +++
 .../stack/stack-check-save-restore-scalar.c   |  48 +++
 .../stack/stack-check-save-restore-vector.c   |  62 ++++
 .../riscv/rvv/stack/stack-check-scalar.c      | 205 +++++++++++++
 .../rvv/stack/stack-check-vararg-scalar.c     |  33 +++
 .../riscv/rvv/stack/stack-check-vector_1.c    | 277 ++++++++++++++++++
 .../riscv/rvv/stack/stack-check-vector_2.c    | 141 +++++++++
 13 files changed, 1143 insertions(+), 81 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c

diff --git a/gcc/config/riscv/riscv-vector.cc b/gcc/config/riscv/riscv-vector.cc
index d09fc1b8e49..4cb5e79421d 100644
--- a/gcc/config/riscv/riscv-vector.cc
+++ 
b/gcc/config/riscv/riscv-vector.cc @@ -846,6 +846,39 @@ rvv_expand_poly_move (machine_mode mode, rtx dest, rtx clobber, rtx src) emit_insn (gen_rtx_SET (dest, riscv_add_offset (clobber, dest, constant))); } +/* Adjust the vector frame in the prologue and epilogue. */ +void +rvv_adjust_frame (rtx target, poly_int64 offset, bool epilogue) +{ + rtx clobber = RISCV_PROLOGUE_TEMP (Pmode); + rtx space = RISCV_PROLOGUE_TEMP2 (Pmode); + rtx insn, dwarf, adjust_frame_rtx; + + rvv_expand_poly_move (Pmode, space, clobber, gen_int_mode (offset, Pmode)); + + if (epilogue) + { + insn = gen_add3_insn (target, target, space); + } + else + { + insn = gen_sub3_insn (target, target, space); + } + + insn = emit_insn (insn); + + RTX_FRAME_RELATED_P (insn) = 1; + + adjust_frame_rtx = + gen_rtx_SET (target, + plus_constant (Pmode, target, epilogue ? offset : -offset)); + + dwarf = alloc_reg_note (REG_FRAME_RELATED_EXPR, + copy_rtx (adjust_frame_rtx), NULL_RTX); + + REG_NOTES (insn) = dwarf; +} + /* Helper functions for handling sew=64 on RV32 system. */ bool imm32_p (rtx a) diff --git a/gcc/config/riscv/riscv-vector.h b/gcc/config/riscv/riscv-vector.h index b70cf676e26..98f47ea0ec1 100644 --- a/gcc/config/riscv/riscv-vector.h +++ b/gcc/config/riscv/riscv-vector.h @@ -22,4 +22,5 @@ #define GCC_RISCV_VECTOR_H void rvv_report_required (void); void rvv_expand_poly_move (machine_mode, rtx, rtx, rtx); +void rvv_adjust_frame (rtx, poly_int64, bool); #endif // GCC_RISCV_VECTOR_H \ No newline at end of file diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 832c1754002..29106bbf6fe 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -4357,7 +4357,11 @@ riscv_compute_frame_info (void) padding. */ frame->arg_pointer_offset = offset - crtl->args.pretend_args_size; frame->total_size = offset; - + + /* Calculate the constant offset of a scalable frame. We handle the constant + and scalable parts of the frame separately. 
*/ + frame->constant_offset = riscv_stack_align (frame->total_size.coeffs[0]) - + riscv_stack_align (frame->total_size.coeffs[1]); /* Next points the incoming stack pointer and any incoming arguments. */ /* Only use save/restore routines when the GPRs are atop the frame. */ @@ -4538,21 +4542,27 @@ riscv_restore_reg (rtx reg, rtx mem) static HOST_WIDE_INT riscv_first_stack_step (struct riscv_frame_info *frame) { - if (SMALL_OPERAND (frame->total_size.to_constant())) - return frame->total_size.to_constant (); + HOST_WIDE_INT frame_total_size; + if (!frame->total_size.is_constant()) + frame_total_size = frame->constant_offset; + else + frame_total_size = frame->total_size.to_constant(); + + if (SMALL_OPERAND (frame_total_size)) + return frame_total_size; HOST_WIDE_INT min_first_step = RISCV_STACK_ALIGN ((frame->total_size - frame->fp_sp_offset).to_constant()); HOST_WIDE_INT max_first_step = IMM_REACH / 2 - PREFERRED_STACK_BOUNDARY / 8; - HOST_WIDE_INT min_second_step = frame->total_size.to_constant() - max_first_step; + HOST_WIDE_INT min_second_step = frame_total_size - max_first_step; gcc_assert (min_first_step <= max_first_step); /* As an optimization, use the least-significant bits of the total frame size, so that the second adjustment step is just LUI + ADD. 
*/ if (!SMALL_OPERAND (min_second_step) - && frame->total_size.to_constant() % IMM_REACH < IMM_REACH / 2 - && frame->total_size.to_constant() % IMM_REACH >= min_first_step) - return frame->total_size.to_constant() % IMM_REACH; + && frame_total_size % IMM_REACH < IMM_REACH / 2 + && frame_total_size % IMM_REACH >= min_first_step) + return frame_total_size % IMM_REACH; if (TARGET_RVC) { @@ -4625,12 +4635,12 @@ void riscv_expand_prologue (void) { struct riscv_frame_info *frame = &cfun->machine->frame; - HOST_WIDE_INT size = frame->total_size.to_constant (); + poly_int64 size = frame->total_size; unsigned mask = frame->mask; rtx insn; if (flag_stack_usage_info) - current_function_static_stack_size = size; + current_function_static_stack_size = constant_lower_bound (size); if (cfun->machine->naked_p) return; @@ -4640,7 +4650,6 @@ riscv_expand_prologue (void) { rtx dwarf = NULL_RTX; dwarf = riscv_adjust_libcall_cfi_prologue (); - size -= frame->save_libcall_adjustment; insn = emit_insn (riscv_gen_gpr_save_insn (frame)); frame->mask = 0; /* Temporarily fib that we need not save GPRs. */ @@ -4652,11 +4661,14 @@ riscv_expand_prologue (void) /* Save the registers. */ if ((frame->mask | frame->fmask) != 0) { - HOST_WIDE_INT step1 = MIN (size, riscv_first_stack_step (frame)); + HOST_WIDE_INT step1 = riscv_first_stack_step (frame); + if (size.is_constant ()) + step1 = MIN (size.to_constant(), step1); + gcc_assert (SMALL_OPERAND (-step1)); insn = gen_add3_insn (stack_pointer_rtx, - stack_pointer_rtx, - GEN_INT (-step1)); + stack_pointer_rtx, + GEN_INT (-step1)); RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; size -= step1; riscv_for_each_saved_reg (size, riscv_save_reg, false, false); @@ -4667,34 +4679,56 @@ riscv_expand_prologue (void) /* Set up the frame pointer, if we're using one. 
*/ if (frame_pointer_needed) { + poly_int64 offset = frame->hard_frame_pointer_offset - size; insn = gen_add3_insn (hard_frame_pointer_rtx, stack_pointer_rtx, - GEN_INT ((frame->hard_frame_pointer_offset - size).to_constant ())); + GEN_INT (offset.to_constant ())); RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; riscv_emit_stack_tie (); } /* Allocate the rest of the frame. */ - if (size > 0) + if (known_gt (size, 0)) { - if (SMALL_OPERAND (-size)) - { - insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, - GEN_INT (-size)); - RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; - } - else - { - riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), GEN_INT (-size)); - emit_insn (gen_add3_insn (stack_pointer_rtx, - stack_pointer_rtx, - RISCV_PROLOGUE_TEMP (Pmode))); - - /* Describe the effect of the previous instructions. */ - insn = plus_constant (Pmode, stack_pointer_rtx, -size); - insn = gen_rtx_SET (stack_pointer_rtx, insn); - riscv_set_frame_expr (insn); - } + /* Two step adjustment, first for scalar frame, second for vector frame. */ + poly_int64 poly_offset (0, 0); + if (!size.is_constant ()) + { + HOST_WIDE_INT factor = size.coeffs[1]; + poly_offset.coeffs[0] = factor; + poly_offset.coeffs[1] = factor; + size -= poly_offset; + } + + /* First step for scalar frame. */ + HOST_WIDE_INT size_value = size.to_constant (); + if (size_value > 0) + { + if (SMALL_OPERAND (-size_value)) + { + insn = gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, + GEN_INT (-size_value)); + RTX_FRAME_RELATED_P (emit_insn (insn)) = 1; + } + else + { + riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), GEN_INT (size_value)); + emit_insn (gen_sub3_insn (stack_pointer_rtx, + stack_pointer_rtx, + RISCV_PROLOGUE_TEMP (Pmode))); + + /* Describe the effect of the previous instructions. 
*/ + insn = plus_constant (Pmode, stack_pointer_rtx, -size_value); + insn = gen_rtx_SET (stack_pointer_rtx, insn); + riscv_set_frame_expr (insn); + } + } + + /* Second step for vector frame */ + if (known_gt (poly_offset, 0)) + { + rvv_adjust_frame (stack_pointer_rtx, poly_offset, false); + } } } @@ -4734,7 +4768,8 @@ riscv_expand_epilogue (int style) Start off by assuming that no registers need to be restored. */ struct riscv_frame_info *frame = &cfun->machine->frame; unsigned mask = frame->mask; - HOST_WIDE_INT step1 = frame->total_size.to_constant (); + poly_int64 step1 = frame->total_size; + poly_int64 restore_offset; /* For restore register */ HOST_WIDE_INT step2 = 0; bool use_restore_libcall = ((style == NORMAL_RETURN) && riscv_use_save_libcall (frame)); @@ -4742,8 +4777,8 @@ riscv_expand_epilogue (int style) rtx insn; /* We need to add memory barrier to prevent read from deallocated stack. */ - bool need_barrier_p = known_ne (get_frame_size (), - cfun->machine->frame.arg_pointer_offset); + bool need_barrier_p = known_ne (get_frame_size () + + cfun->machine->frame.arg_pointer_offset, 0); if (cfun->machine->naked_p) { @@ -4763,6 +4798,21 @@ riscv_expand_epilogue (int style) /* Reset the epilogue cfa info before starting to emit the epilogue. */ epilogue_cfa_sp_offset = 0; + if (use_restore_libcall) + { + step1 -= frame->save_libcall_adjustment; + frame->mask = 0; /* Temporarily fib that we need not save GPRs. */ + } + + /* If we need to restore registers, deallocate as much stack as + possible in the second step without going out of range. */ + if ((frame->mask | frame->fmask) != 0) + { + step2 = riscv_first_stack_step (frame); + step1 -= step2; + restore_offset = step1; + } + /* Move past any dynamic stack allocations. 
*/ if (cfun->calls_alloca) { @@ -4770,21 +4820,18 @@ riscv_expand_epilogue (int style) riscv_emit_stack_tie (); need_barrier_p = false; - rtx adjust = GEN_INT (-frame->hard_frame_pointer_offset.to_constant ()); - if (!SMALL_OPERAND (INTVAL (adjust))) - { - riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust); - adjust = RISCV_PROLOGUE_TEMP (Pmode); - } - - insn = emit_insn ( - gen_add3_insn (stack_pointer_rtx, hard_frame_pointer_rtx, - adjust)); + gcc_assert (frame_pointer_needed); + poly_int64 offset = frame->hard_frame_pointer_offset - step1; + insn = emit_insn (gen_add3_insn (stack_pointer_rtx, hard_frame_pointer_rtx, + GEN_INT (-offset.to_constant ()))); + /* By using hard_frame_pointer_rtx, it can skip the adjust of step1 + and go directly to the position of step2 */ + step1 = 0; rtx dwarf = NULL_RTX; rtx cfa_adjust_value = gen_rtx_PLUS ( - Pmode, hard_frame_pointer_rtx, - GEN_INT (-frame->hard_frame_pointer_offset.to_constant ())); + Pmode, hard_frame_pointer_rtx, + GEN_INT (-offset.to_constant ())); rtx cfa_adjust_rtx = gen_rtx_SET (stack_pointer_rtx, cfa_adjust_value); dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, cfa_adjust_rtx, dwarf); RTX_FRAME_RELATED_P (insn) = 1; @@ -4792,14 +4839,6 @@ riscv_expand_epilogue (int style) REG_NOTES (insn) = dwarf; } - /* If we need to restore registers, deallocate as much stack as - possible in the second step without going out of range. */ - if ((frame->mask | frame->fmask) != 0) - { - step2 = riscv_first_stack_step (frame); - step1 -= step2; - } - /* Set TARGET to BASE + STEP1. */ if (known_gt (step1, 0)) { @@ -4807,25 +4846,38 @@ riscv_expand_epilogue (int style) riscv_emit_stack_tie (); need_barrier_p = false; - /* Get an rtx for STEP1 that we can add to BASE. 
*/ - rtx adjust = GEN_INT (step1); - if (!SMALL_OPERAND (step1)) - { - riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust); - adjust = RISCV_PROLOGUE_TEMP (Pmode); - } + /* First step for vector frame */ + if (!step1.is_constant ()) + { + HOST_WIDE_INT factor = step1.coeffs[1]; + poly_int64 poly_offset (factor, factor); + rvv_adjust_frame (stack_pointer_rtx, poly_offset, true); + step1 -= poly_offset; + } - insn = emit_insn ( - gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, adjust)); + /* Second step for scalar frame. */ + HOST_WIDE_INT scalar_step1 = step1.to_constant (); + if (scalar_step1 > 0) + { + rtx adjust = GEN_INT (scalar_step1); + if (!SMALL_OPERAND (scalar_step1)) + { + riscv_emit_move (RISCV_PROLOGUE_TEMP (Pmode), adjust); + adjust = RISCV_PROLOGUE_TEMP (Pmode); + } - rtx dwarf = NULL_RTX; - rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx, - GEN_INT (step2)); + insn = emit_insn ( + gen_add3_insn (stack_pointer_rtx, stack_pointer_rtx, adjust)); - dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf); - RTX_FRAME_RELATED_P (insn) = 1; + rtx dwarf = NULL_RTX; + rtx cfa_adjust_rtx = gen_rtx_PLUS (Pmode, stack_pointer_rtx, + GEN_INT (step2)); - REG_NOTES (insn) = dwarf; + dwarf = alloc_reg_note (REG_CFA_DEF_CFA, cfa_adjust_rtx, dwarf); + RTX_FRAME_RELATED_P (insn) = 1; + + REG_NOTES (insn) = dwarf; + } } else if (frame_pointer_needed) { @@ -4834,18 +4886,11 @@ riscv_expand_epilogue (int style) epilogue_cfa_sp_offset = step2; } - if (use_restore_libcall) - frame->mask = 0; /* Temporarily fib that we need not save GPRs. */ - /* Restore the registers. */ - riscv_for_each_saved_reg (frame->total_size - step2, riscv_restore_reg, - true, style == EXCEPTION_RETURN); - - if (use_restore_libcall) + if ((frame->mask | frame->fmask) != 0) { - frame->mask = mask; /* Undo the above fib. 
*/ - gcc_assert (step2 >= frame->save_libcall_adjustment); - step2 -= frame->save_libcall_adjustment; + riscv_for_each_saved_reg (restore_offset, riscv_restore_reg, + true, style == EXCEPTION_RETURN); } if (need_barrier_p) @@ -4868,6 +4913,7 @@ riscv_expand_epilogue (int style) if (use_restore_libcall) { + frame->mask = mask; /* Undo the above fib. */ rtx dwarf = riscv_adjust_libcall_cfi_epilogue (); insn = emit_insn (gen_gpr_restore (GEN_INT (riscv_save_libcall_count (mask)))); RTX_FRAME_RELATED_P (insn) = 1; @@ -6118,6 +6164,67 @@ riscv_regmode_natural_size (machine_mode mode) return UNITS_PER_WORD; } +/* Implement the TARGET_DWARF_POLY_INDETERMINATE_VALUE hook. */ + +static unsigned int +riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor, + int *offset) +{ + /* Polynomial invariant 1 == (VLENB / 8) - 1. */ + gcc_assert (i == 1); + *factor = 8; + *offset = 1; + return RISCV_DWARF_VLENB; +} + +/* Implement TARGET_ESTIMATED_POLY_VALUE. + Look into the tuning structure for an estimate. + KIND specifies the type of requested estimate: min, max or likely. + For cores with a known RVV width all three estimates are the same. + For generic RVV tuning we want to distinguish the maximum estimate from + the minimum and likely ones. + The likely estimate is the same as the minimum in that case to give a + conservative behavior of auto-vectorizing with RVV when it is a win + even for 128-bit RVV. + When RVV width information is available VAL.coeffs[1] is multiplied by + the number of VQ chunks over the initial Advanced SIMD 128 bits. */ + +static HOST_WIDE_INT +riscv_estimated_poly_value (poly_int64 val, + poly_value_estimate_kind kind = POLY_VALUE_LIKELY) +{ + unsigned int width_source = + BITS_PER_RISCV_VECTOR.is_constant () + ? 
(unsigned int)BITS_PER_RISCV_VECTOR.to_constant () + : (unsigned int)RVV_SCALABLE; + + /* If there is no core-specific information then the minimum and likely + values are based on 128-bit vectors and the maximum is based on + the architectural maximum of 2048 bits. */ + if (width_source == RVV_SCALABLE) + switch (kind) + { + case POLY_VALUE_MIN: + case POLY_VALUE_LIKELY: + return val.coeffs[0]; + + case POLY_VALUE_MAX: + return val.coeffs[0] + val.coeffs[1] * 15; + } + + /* Allow BITS_PER_RISCV_VECTOR to be a bitmask of different VL, treating the + lowest as likely. This could be made more general if future -mtune + options need it to be. */ + if (kind == POLY_VALUE_MAX) + width_source = 1 << floor_log2 (width_source); + else + width_source = least_bit_hwi (width_source); + + /* If the core provides width information, use that. */ + HOST_WIDE_INT over_128 = width_source - 128; + return val.coeffs[0] + val.coeffs[1] * over_128 / 128; +} + /* Initialize the GCC target structure. */ #undef TARGET_ASM_ALIGNED_HI_OP #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t" @@ -6336,6 +6443,12 @@ riscv_regmode_natural_size (machine_mode mode) #undef TARGET_MANGLE_TYPE #define TARGET_MANGLE_TYPE riscv_mangle_type +#undef TARGET_DWARF_POLY_INDETERMINATE_VALUE +#define TARGET_DWARF_POLY_INDETERMINATE_VALUE riscv_dwarf_poly_indeterminate_value + +#undef TARGET_ESTIMATED_POLY_VALUE +#define TARGET_ESTIMATED_POLY_VALUE riscv_estimated_poly_value + struct gcc_target targetm = TARGET_INITIALIZER; #include "gt-riscv.h" diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h index 5de745bc824..03eb92900be 100644 --- a/gcc/config/riscv/riscv.h +++ b/gcc/config/riscv/riscv.h @@ -396,6 +396,8 @@ ASM_MISA_SPEC #define RISCV_PROLOGUE_TEMP_REGNUM (GP_TEMP_FIRST) #define RISCV_PROLOGUE_TEMP(MODE) gen_rtx_REG (MODE, RISCV_PROLOGUE_TEMP_REGNUM) +#define RISCV_PROLOGUE_TEMP2_REGNUM (GP_TEMP_FIRST + 1) +#define RISCV_PROLOGUE_TEMP2(MODE) gen_rtx_REG (MODE, RISCV_PROLOGUE_TEMP2_REGNUM) #define 
RISCV_CALL_ADDRESS_TEMP_REGNUM (GP_TEMP_FIRST + 1) #define RISCV_CALL_ADDRESS_TEMP(MODE) \ @@ -1085,4 +1087,6 @@ extern void riscv_remove_unneeded_save_restore_calls (void); #define REGMODE_NATURAL_SIZE(MODE) riscv_regmode_natural_size (MODE) +#define RISCV_DWARF_VLENB (4096 + 0xc22) + #endif /* ! GCC_RISCV_H */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp b/gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp new file mode 100644 index 00000000000..9a558f32ed0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/rvv-stack.exp @@ -0,0 +1,47 @@ +# Copyright (C) 2022-2022 Free Software Foundation, Inc. + +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . + +# GCC testsuite that uses the `dg.exp' driver. + +# Exit immediately if this isn't a RISC-V target. +if ![istarget riscv*-*-*] then { + return +} + +# Load support procs. +load_lib gcc-dg.exp + +set gcc_march_list [list "-march=rv64gcv" "-march=rv64gv"] +if [istarget riscv32-*-*] then { + set gcc_march_list [list "-march=rv32gcv" "-march=rv32gv"] +} + +# If a testcase doesn't have special options, use these. +global DEFAULT_CFLAGS +if ![info exists DEFAULT_CFLAGS] then { + set DEFAULT_CFLAGS " -ansi -pedantic-errors" +} + +# Initialize `dg'. +dg-init + +# Main loop. +foreach march $gcc_march_list { + gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] \ + $march $DEFAULT_CFLAGS +} +# All done. 
+dg-finish diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c new file mode 100644 index 00000000000..58c8e6de603 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-scalar.c @@ -0,0 +1,53 @@ + +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +void f2 (char*); +void f3 (char*, ...); + +/* +** stach_check_alloca_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-48 +** sw ra,12\(sp\) +** sw s0,8\(sp\) +** addi s0,sp,16 +** ... +** addi a0,a0,23 +** andi a0,a0,-16 +** sub sp,sp,a0 +** ... +** addi sp,s0,-16 +** lw ra,12\(sp\) +** lw s0,8\(sp\) +** addi sp,sp,48 +** jr ra +*/ +void stach_check_alloca_1 (int y, ...) +{ + char* pStr = (char*)__builtin_alloca(y); + f2(pStr); +} + +/* +** stach_check_alloca_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-48 +** sw ra,44\(sp\) +** sw s0,40\(sp\) +** addi s0,sp,48 +** addi a0,a0,23 +** andi a0,a0,-16 +** sub sp,sp,a0 +** ... 
+** addi sp,s0,-48 +** lw ra,44\(sp\) +** lw s0,40\(sp\) +** addi sp,sp,48 +** jr ra +*/ +void stach_check_alloca_2 (int y) +{ + char* pStr = (char*)__builtin_alloca(y); + f3(pStr, pStr, pStr, pStr, pStr, pStr, pStr, pStr, 2, pStr, pStr, pStr, 1); +} + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c new file mode 100644 index 00000000000..12c05b337cb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-alloca-vector.c @@ -0,0 +1,45 @@ + +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +void f (char*); + +/* +** stach_check_alloca_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-48 +** sw ra,12\(sp\) +** sw s0,8\(sp\) +** addi s0,sp,16 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** addi a1,a1,23 +** andi a1,a1,-16 +** sub sp,sp,a1 +** ... +** addi sp,s0,-16 +** lw ra,12\(sp\) +** lw s0,8\(sp\) +** addi sp,sp,48 +** jr ra +*/ +void stach_check_alloca_1 (vuint8m1_t data, uint8_t *base, int y, ...) 
+{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m1_t *)base = data; + char* pStr = (char*)__builtin_alloca(y); + f(pStr); +} + + diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c new file mode 100644 index 00000000000..72179791677 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-scalar.c @@ -0,0 +1,48 @@ + +/* { dg-do compile } */ +/* { dg-options "-msave-restore -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + + +void fn2 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8); +void fn3 (char*); + + +/* +** stack_save_restore_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** call t0,__riscv_save_0 +** addi sp,sp,-32 +** fs(w|d) fs0,24\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** ... 
+** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** fl(w|d) fs0,24\(sp\) +** addi sp,sp,32 +** tail __riscv_restore_0 +*/ +int stack_save_restore_1 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8) +{ + char d[8000]; + float f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13; + asm volatile ("nop" + : "=f" (f1), "=f" (f2), "=f" (f3), "=f" (f4), "=f" (f5), "=f" (f6), + "=f" (f7), "=f" (f8), "=f" (f9), "=f" (f10), "=f" (f11), + "=f" (f12), "=f" (f13) + : + :); + asm volatile ("nop" + : + : "f" (f1), "f" (f2), "f" (f3), "f" (f4), "f" (f5), "f" (f6), + "f" (f7), "f" (f8), "f" (f9), "f" (f10), "f" (f11), + "f" (f12), "f" (f13) + :); + fn2 (a1, a2, a3, a4, a5, a6, a7, a8); + fn3(d); + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c new file mode 100644 index 00000000000..694ce4669c0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-save-restore-vector.c @@ -0,0 +1,62 @@ + +/* { dg-do compile } */ +/* { dg-options "-msave-restore -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ +#include + +void fn2 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8); +void fn3 (char*); + +/* +** stack_save_restore_2: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** call t0,__riscv_save_0 +** addi sp,sp,-32 +** fs(w|d) fs0,24\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** fl(w|d) fs0,24\(sp\) +** addi sp,sp,32 +** tail __riscv_restore_0 +*/ +int stack_save_restore_2 (float a1, float a2, float a3, float a4, + float a5, float a6, float a7, float a8, + vuint8m1_t data, uint8_t *base) +{ + char d[8000]; + float f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11, f12, f13; + asm volatile ("nop" + : "=f" (f1), "=f" (f2), "=f" (f3), "=f" (f4), "=f" (f5), "=f" (f6), + "=f" (f7), "=f" (f8), "=f" (f9), "=f" (f10), "=f" (f11), + "=f" (f12), "=f" (f13) + : + :); + asm volatile ("nop" + : + : "f" (f1), "f" (f2), "f" (f3), "f" (f4), "f" (f5), "f" (f6), + "f" (f7), "f" (f8), "f" (f9), "f" (f10), "f" (f11), + "f" (f12), "f" (f13) + :); + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m1_t *)base = data; + fn2 (a1, a2, a3, a4, a5, a6, a7, a8); + fn3(d); + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c new file mode 100644 index 00000000000..4400470b650 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-scalar.c @@ -0,0 +1,205 @@ +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +void f (uint8_t *); + +/* GPR: 16, local: 16, total: 32 +** stack_offset1_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** addi a0,sp,4 +** call f +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +void stack_offset1_1 () +{ + uint8_t local[10]; + f(local); +} + +/* GPR: 16, local: 2016, total: 2032 +** stack_offset1_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { 
any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** mv a0,sp +** call f +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset1_2 () +{ + uint8_t local[2016]; + f(local); +} + +/* GPR: 16, local: 6000, total: 6016 +** stack_offset2_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-1920 +** sw ra,1916\(sp\) +** li t0,4096 +** sub sp,sp,t0 +** li a0,-4096 +** addi a0,a0,-1904 +** li a5,4096 +** addi a5,a5,1904 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** li t0,4096 +** add sp,sp,t0 +** lw ra,1916\(sp\) +** addi sp,sp,1920 +** jr ra +*/ +void stack_offset2_1 () +{ + uint8_t local[6000]; + f(local); +} + +/* GPR: 16, local: 2032, total: 2048 +** stack_offset3_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** addi sp,sp,-2016 +** addi a0,sp,12 +** call f +** addi sp,sp,2016 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset3_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-16 +** addi a0,sp,12 +** call f +** addi sp,sp,16 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_1 () +{ + uint8_t local[2017]; + f(local); +} + +/* GPR: 16, local: 2112, total: 2128 +** stack_offset3_2: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-96 +** sw ra,92\(sp\) +** addi sp,sp,-2032 +** li a0,-4096 +** addi a0,a0,1996 +** li a5,4096 +** addi a5,a5,-1984 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** addi sp,sp,2032 +** lw ra,92\(sp\) +** addi sp,sp,96 +** jr ra +*/ +/* +** stack_offset3_2: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" 
"-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-96 +** li a0,-4096 +** addi a0,a0,1996 +** li a5,4096 +** addi a5,a5,-1984 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** addi sp,sp,96 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_2 () +{ + uint8_t local[2100]; + f(local); +} + +/* GPR: 16, local: 8000, total: 8016 +** stack_offset4_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** li a0,-8192 +** addi a0,a0,192 +** li a5,8192 +** addi a5,a5,-192 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset4_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** li t0,4096 +** addi t0,t0,1888 +** sub sp,sp,t0 +** li a0,-8192 +** addi a0,a0,192 +** li a5,8192 +** addi a5,a5,-192 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** li t0,4096 +** addi t0,t0,1888 +** add sp,sp,t0 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset4_1 () +{ + uint8_t local[8000]; + f(local); +} + +/* GPR: 16, local: 3056, total: 3072 +** stack_offset5_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-1040 +** li a0,-4096 +** addi a0,a0,1048 +** li a5,4096 +** addi a5,a5,-1040 +** add a5,a5,a0 +** add a0,a5,sp +** call f +** addi sp,sp,1040 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset5_1 () +{ + uint8_t local[3048]; + f(local); +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c 
b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c new file mode 100644 index 00000000000..ffc90a02f65 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vararg-scalar.c @@ -0,0 +1,33 @@ +/* { dg-do run } */ +/* { dg-options "-std=c11" } */ + +#include +#include + +int va_sum(int a, int count, ...) +{ + va_list ap; + va_start(ap, count); + for (int i = count; i > 0; i--) + { + int arg = va_arg(ap, int); + a = a + arg; + } + va_end(ap); + return a; +} + +int main() +{ + int sum = 0; + int a = 1; + int b = 2; + int c = 3; + int d = 4; + sum = va_sum(sum, 4, a, b, c, d); + + if (sum != 10) + { + abort(); + } +} \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c new file mode 100644 index 00000000000..afca87532ae --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_1.c @@ -0,0 +1,277 @@ +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include +#include + +void f (uint8_t *); +void f2 (vuint8m1_t); + +/* GPR: 16, local: 16+vlenb, total: 32+vlenb +** stack_offset1_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +void stack_offset1_1 (vuint8m1_t data) +{ + uint8_t local[10]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 2016+vlenb, total: 2032+vlenb +** stack_offset1_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset1_2 (vuint8m1_t data) +{ + uint8_t local[2016]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 6000+vlenb, total: 6016+vlenb +** stack_offset2_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-1920 +** sw ra,1916\(sp\) +** li t0,4096 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** li t0,4096 +** add sp,sp,t0 +** lw ra,1916\(sp\) +** addi sp,sp,1920 +** jr ra +*/ +void stack_offset2_1 (vuint8m1_t data) +{ + uint8_t local[6000]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 2032+vlenb, total: 2048+vlenb +** stack_offset3_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** addi sp,sp,-2016 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,2016 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset3_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-16 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,16 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_1 (vuint8m1_t data) +{ + uint8_t local[2017]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 2112+vlenb, total: 2128+vlenb +** stack_offset3_2: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-96 +** sw ra,92\(sp\) +** addi sp,sp,-2032 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,2032 +** lw ra,92\(sp\) +** addi sp,sp,96 +** jr ra +*/ +/* +** stack_offset3_2: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-96 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,96 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset3_2 (vuint8m1_t data) +{ + uint8_t local[2100]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 8000+vlenb, total: 8016+vlenb +** stack_offset4_1: { target { { any-opts "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-32 +** sw ra,28\(sp\) +** li t0,8192 +** addi t0,t0,-208 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** li t0,8192 +** addi t0,t0,-208 +** add sp,sp,t0 +** lw ra,28\(sp\) +** addi sp,sp,32 +** jr ra +*/ +/* +** stack_offset4_1: { target { { any-opts "-march=rv32gv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** li t0,4096 +** addi t0,t0,1888 +** sub sp,sp,t0 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** li t0,4096 +** addi t0,t0,1888 +** add sp,sp,t0 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset4_1 (vuint8m1_t data) +{ + uint8_t local[8000]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} + +/* GPR: 16, local: 3056+vlenb, total: 3072+vlenb +** stack_offset5_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** addi sp,sp,-2032 +** sw ra,2028\(sp\) +** addi sp,sp,-1040 +** csrr t0,vlenb +** sub sp,sp,t0 +** ... +** csrr t0,vlenb +** add sp,sp,t0 +** addi sp,sp,1040 +** lw ra,2028\(sp\) +** addi sp,sp,2032 +** jr ra +*/ +void stack_offset5_1 (vuint8m1_t data) +{ + uint8_t local[3048]; + f(local); + vint16m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + f2(data); +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c new file mode 100644 index 00000000000..7938fa6261c --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/stack/stack-check-vector_2.c @@ -0,0 +1,141 @@ +/* { dg-do compile } */ +/* { dg-options "-fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include +#include + +void f (uint8_t *); +void f2 (vuint8m1_t); + +/* 1*vlenb +** stack_offset1_1: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** sub sp,sp,t0 +** ... 
+** csrr t0,vlenb +** add sp,sp,t0 +** jr ra +*/ +void stack_offset1_1 (vuint8m1_t data, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m1_t *)base = data; +} + +/* 8*vlenb +** stack_offset1_2: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** slli t1,t0,3 +** sub sp,sp,t1 +** ... +** csrr t0,vlenb +** slli t1,t0,3 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_2 (vuint8m8_t data, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m8_t *)base = data; +} + +/* 3*vlenb +** stack_offset1_3: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** slli t1,t0,2 +** sub t1,t1,t0 +** sub sp,sp,t1 +** ... +** csrr t0,vlenb +** slli t1,t0,2 +** sub t1,t1,t0 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_3 (vuint8m1_t data, vuint8m2_t data2, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m2_t *)base = data2; + *(vuint8m1_t *)base = data; +} + +/* 9*vlenb +** stack_offset1_4: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** slli t1,t0,3 +** add t1,t1,t0 +** sub sp,sp,t1 +** ... 
+** csrr t0,vlenb +** slli t1,t0,3 +** add t1,t1,t0 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_4 (vuint8m1_t data, vuint8m8_t data2, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m8_t *)base = data2; + *(vuint8m1_t *)base = data; +} + +/* 10*vlenb +** stack_offset1_5: { target { { any-opts "-march=rv32gv" "-march=rv32gcv" } && { { any-opts "-O1" "-O2" "-O3" "-Os" } && { no-opts "-flto" } } } } +** csrr t0,vlenb +** li t1,10 +** mul t1,t1,t0 +** sub sp,sp,t1 +** ... +** csrr t0,vlenb +** li t1,10 +** mul t1,t1,t0 +** add sp,sp,t1 +** jr ra +*/ +void stack_offset1_5 (vuint8m2_t data, vuint8m8_t data2, uint8_t *base) +{ + vuint8m8_t v0, v8, v16, v24; + asm volatile ("nop" + : "=vr" (v0), "=vr" (v8), "=vr" (v16), "=vr" (v24) + : + :); + asm volatile ("nop" + : + : "vr" (v0), "vr" (v8), "vr" (v16), "vr" (v24) + :); + *(vuint8m8_t *)base = data2; + *(vuint8m2_t *)base = data; +}
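The expected bodies in stack-check-vector_2.c suggest a pattern for scaling vlenb by the number of vector registers' worth of stack: factor 1 uses t0 directly, 8 is a shift, 3 and 9 are a shift followed by sub or add, and 10 falls back to li plus mul. Below is a minimal sketch of that selection in C, inferred only from those five cases. It is not the code in rvv_adjust_frame, and the helper names log2_if_pow2 and scale_vlenb are hypothetical; they do not appear in the patch.

```c
#include <stdio.h>

/* Hypothetical helper: returns log2(x) when x is a power of two,
   otherwise -1.  */
static int log2_if_pow2(unsigned x)
{
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int k = 0;
    while ((x >>= 1) != 0)
        k++;
    return k;
}

/* Hypothetical helper (assumption, not from the patch): writes an
   instruction sequence that computes factor * vlenb into buf and
   returns the name of the register holding the result.  The case
   order is chosen so that the output agrees with the five expected
   bodies in the testcase; factor is assumed to be >= 1.  */
static const char *scale_vlenb(unsigned factor, char *buf, size_t len)
{
    int k;

    if (factor == 1) {
        /* 1*vlenb: the CSR value is used as-is.  */
        snprintf(buf, len, "csrr t0,vlenb\n");
        return "t0";
    }

    if ((k = log2_if_pow2(factor)) > 0)
        /* 2^k * vlenb: a single shift.  */
        snprintf(buf, len, "csrr t0,vlenb\nslli t1,t0,%d\n", k);
    else if ((k = log2_if_pow2(factor + 1)) > 0)
        /* (2^k - 1) * vlenb: shift, then subtract vlenb once.  */
        snprintf(buf, len,
                 "csrr t0,vlenb\nslli t1,t0,%d\nsub t1,t1,t0\n", k);
    else if ((k = log2_if_pow2(factor - 1)) > 0)
        /* (2^k + 1) * vlenb: shift, then add vlenb once.  */
        snprintf(buf, len,
                 "csrr t0,vlenb\nslli t1,t0,%d\nadd t1,t1,t0\n", k);
    else
        /* Anything else: load the constant and multiply.  */
        snprintf(buf, len,
                 "csrr t0,vlenb\nli t1,%u\nmul t1,t1,t0\n", factor);

    return "t1";
}
```

With this sketch, scale_vlenb(9, buf, sizeof buf) produces the csrr / slli 3 / add sequence expected for stack_offset1_4, and scale_vlenb(10, buf, sizeof buf) produces the li / mul sequence expected for stack_offset1_5. The prologue would then subtract the returned register from sp and the epilogue would add it back.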