From patchwork Wed Nov 9 23:07:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Philipp Tomsich
X-Patchwork-Id: 60312
From: Philipp Tomsich
To: gcc-patches@gcc.gnu.org
Cc: Christoph Muellner, Palmer Dabbelt, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich
Subject: [PATCH] RISC-V: Optimise adding a (larger than simm12) constant
Date: Thu, 10 Nov 2022 00:07:18 +0100
Message-Id: <20221109230718.3240479-1-philipp.tomsich@vrull.eu>
X-Mailer: git-send-email 2.34.1
List-Id: Gcc-patches mailing list

Handling the
register-const_int addition in GCC for RISC-V has so far quickly
escalated to creating a full sign-extended 32-bit constant and
performing a register-register addition, resulting in sequences like
(for the case of "a + 2048"):

        li      a5,4096
        addi    a5,a5,-2048
        add     a0,a0,a5

By adding an expansion for add3, we can emit optimised RTL that matches
the capabilities of RISC-V better by adding support for the following,
previously unoptimised cases:
  - addi + addi
        addi    a0,a0,2047
        addi    a0,a0,1
  - li + sh[123]add (if Zba is enabled)
        li      a5,960
        sh3add  a0,a5,a0

With this commit, we also fix up riscv_adjust_libcall_cfi_prologue()
and riscv_adjust_libcall_cfi_epilogue() to not use gen_add3_insn, as
the expander will otherwise wrap the resulting set-expression in an
insn (causing an ICE at dwarf2-time) when invoked with -msave-restore.

This closes the gap to LLVM, which has already been emitting these
optimised sequences.

Note that this benefits perlbench (in SPEC CPU 2017), which needs to
add the constant 3840.

gcc/ChangeLog:

        * config/riscv/bitmanip.md (*shNadd): Rename.
        (riscv_shNadd): Expose as gen_riscv_shNadd{di/si}.
        * config/riscv/predicates.md (const_arith_shifted123_operand):
        New predicate (for constants that are a simm12, shifted by
        1, 2 or 3).
        (const_arith_2simm12_operand): New predicate (that can be
        expressed by adding 2 simm12 together).
        (addi_operand): New predicate (an immediate operand suitable
        for the new add3 expansion).
        * config/riscv/riscv.cc (riscv_adjust_libcall_cfi_prologue):
        Don't use gen_add3_insn, where an RTX instead of an INSN is
        required (otherwise this will break as soon as we have a
        define_expand for add3).
        (riscv_adjust_libcall_cfi_epilogue): Same.
        * config/riscv/riscv.md (addsi3): Rename.
        (riscv_addsi3): New name for addsi3.
        (adddi3): Rename.
        (riscv_adddi3): New name for adddi3.
        (add3): New expander that handles the basic and fancy
        (such as li+sh[123]add, addi+addi, ...) cases for adding
        register-register and register-const_int.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/addi.c: New test.
        * gcc.target/riscv/zba-shNadd-06.c: New test.

Signed-off-by: Philipp Tomsich
---
 gcc/config/riscv/bitmanip.md                  |  2 +-
 gcc/config/riscv/predicates.md                | 28 +++++++++
 gcc/config/riscv/riscv.cc                     | 10 ++--
 gcc/config/riscv/riscv.md                     | 58 ++++++++++++++++++-
 gcc/testsuite/gcc.target/riscv/addi.c         | 39 +++++++++++++
 .../gcc.target/riscv/zba-shNadd-06.c          | 11 ++++
 6 files changed, 141 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/addi.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index bb23ceb86d9..78fdf02c2ec 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -29,7 +29,7 @@
   [(set_attr "type" "bitmanip,load")
    (set_attr "mode" "DI")])
 
-(define_insn "*shNadd"
+(define_insn "riscv_shNadd"
   [(set (match_operand:X 0 "register_operand" "=r")
         (plus:X (ashift:X (match_operand:X 1 "register_operand" "r")
                           (match_operand:QI 2 "imm123_operand" "Ds3"))
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index 6772228e5b6..c56bfa99339 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -308,3 +308,31 @@
                (match_test "INTVAL (op) > 0")))
        (ior (match_test "SMALL_OPERAND (UINTVAL (op) & ~(HOST_WIDE_INT_1U << floor_log2 (UINTVAL (op))))")
             (match_test "popcount_hwi (UINTVAL (op)) == 2"))))
+
+;; A CONST_INT that can be shifted down by 1, 2 or 3 bits (i.e., has
+;; these bits clear) and will then form a SMALL_OPERAND.
+(define_predicate "const_arith_shifted123_operand"
+  (and (match_code "const_int")
+       (not (match_test "SMALL_OPERAND (INTVAL (op))")))
+{
+  HOST_WIDE_INT val = INTVAL (op);
+  int trailing = ctz_hwi (val);
+
+  /* Clamp to 3, as we have sh[123]add instructions only.  */
+  if (trailing > 3)
+    trailing = 3;
+
+  return trailing > 0 && SMALL_OPERAND (val >> trailing);
+})
+
+;; A CONST_INT that can formed by adding two SMALL_OPERANDs together
+(define_predicate "const_arith_2simm12_operand"
+  (and (match_code "const_int")
+       (ior (match_test "SMALL_OPERAND(INTVAL (op) - ~(HOST_WIDE_INT_M1U << (IMM_BITS - 1)))")
+            (match_test "SMALL_OPERAND(INTVAL (op) - (HOST_WIDE_INT_M1U << (IMM_BITS - 1)))"))))
+
+(define_predicate "addi_operand"
+  (ior (match_operand 0 "arith_operand")
+       (match_operand 0 "const_arith_2simm12_operand")
+       (and (match_operand 0 "const_arith_shifted123_operand")
+            (match_test "TARGET_ZBA"))))
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 5a632058003..8efe90cc0fa 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -4910,8 +4910,9 @@ riscv_adjust_libcall_cfi_prologue ()
       }
 
   /* Debug info for adjust sp.  */
-  adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
-                                 stack_pointer_rtx, GEN_INT (-saved_size));
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+                                             GEN_INT (-saved_size)));
   dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
                           dwarf);
   return dwarf;
@@ -5013,8 +5014,9 @@ riscv_adjust_libcall_cfi_epilogue ()
   int saved_size = cfun->machine->frame.save_libcall_adjustment;
 
   /* Debug info for adjust sp.  */
-  adjust_sp_rtx = gen_add3_insn (stack_pointer_rtx,
-                                 stack_pointer_rtx, GEN_INT (saved_size));
+  adjust_sp_rtx = gen_rtx_SET (stack_pointer_rtx,
+                               gen_rtx_PLUS (Pmode, stack_pointer_rtx,
+                                             GEN_INT (saved_size)));
   dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
                           dwarf);
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 171a0cdced6..289ff7470c6 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -446,7 +446,7 @@
   [(set_attr "type" "fadd")
    (set_attr "mode" "")])
 
-(define_insn "addsi3"
+(define_insn "riscv_addsi3"
   [(set (match_operand:SI 0 "register_operand" "=r,r")
         (plus:SI (match_operand:SI 1 "register_operand" " r,r")
                  (match_operand:SI 2 "arith_operand" " r,I")))]
@@ -455,7 +455,7 @@
   [(set_attr "type" "arith")
    (set_attr "mode" "SI")])
 
-(define_insn "adddi3"
+(define_insn "riscv_adddi3"
   [(set (match_operand:DI 0 "register_operand" "=r,r")
         (plus:DI (match_operand:DI 1 "register_operand" " r,r")
                  (match_operand:DI 2 "arith_operand" " r,I")))]
@@ -464,6 +464,60 @@
   [(set_attr "type" "arith")
    (set_attr "mode" "DI")])
 
+(define_expand "add3"
+  [(set (match_operand:GPR 0 "register_operand" "=r,r")
+        (plus:GPR (match_operand:GPR 1 "register_operand" " r,r")
+                  (match_operand:GPR 2 "addi_operand" " r,I")))]
+  ""
+{
+  if (arith_operand (operands[2], mode))
+    emit_insn (gen_riscv_add3 (operands[0], operands[1], operands[2]));
+  else if (const_arith_2simm12_operand (operands[2], mode))
+    {
+      /* Split into two immediates that add up to the desired value:
+       * e.g., break up "a + 2445" into:
+       *   addi a0,a0,2047
+       *   addi a0,a0,398
+       */
+
+      HOST_WIDE_INT val = INTVAL (operands[2]);
+      HOST_WIDE_INT saturated = HOST_WIDE_INT_M1U << (IMM_BITS - 1);
+
+      if (val >= 0)
+        saturated = ~saturated;
+
+      val -= saturated;
+
+      rtx tmp = gen_reg_rtx (mode);
+      emit_insn (gen_riscv_add3 (tmp, operands[1], GEN_INT (saturated)));
+      emit_insn (gen_riscv_add3 (operands[0], tmp, GEN_INT (val)));
+    }
+  else if (mode == word_mode
+           && const_arith_shifted123_operand (operands[2], mode))
+    {
+      /* Use a sh[123]add and an immediate shifted down by 1, 2, or 3.  */
+
+      HOST_WIDE_INT val = INTVAL (operands[2]);
+      int shamt = ctz_hwi (val);
+
+      if (shamt > 3)
+        shamt = 3;
+
+      rtx tmp = gen_reg_rtx (mode);
+      emit_insn (gen_rtx_SET (tmp, GEN_INT (val >> shamt)));
+
+      /* We don't use gen_riscv_shNadd here, as it will only exist for
+         .  Instead we build up its canonical form directly.  */
+      rtx shifted_imm = gen_rtx_ASHIFT (mode, tmp, GEN_INT (shamt));
+      rtx shNadd = gen_rtx_PLUS (mode, shifted_imm, operands[1]);
+      emit_insn (gen_rtx_SET (operands[0], shNadd));
+    }
+  else
+    FAIL;
+
+  DONE;
+})
+
 (define_expand "addv4"
   [(set (match_operand:GPR 0 "register_operand" "=r,r")
         (plus:GPR (match_operand:GPR 1 "register_operand" " r,r")
diff --git a/gcc/testsuite/gcc.target/riscv/addi.c b/gcc/testsuite/gcc.target/riscv/addi.c
new file mode 100644
index 00000000000..01339e44697
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/addi.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long long f (long long a)
+{
+  // addi a0,a0,2047
+  // addi a0,a0,1
+
+  return a + 2048;
+}
+
+long long f2 (long long a)
+{
+  // addi a0,a0,2047
+  // addi a0,a0,398
+
+  return a + 2445;
+}
+
+long long f3 (long long a)
+{
+  // addi a0,a0,-2048
+  // addi a0,a0,-397
+
+  return a - 2445;
+}
+
+long long f6 (long long a)
+{
+  // li a5,1179648
+  // add a0,a0,a5
+
+  return a + (0x12 << 16);
+}
+
+/* { dg-final { scan-assembler-times "addi\t" 6 } } */
+/* { dg-final { scan-assembler-times "li\t" 1 } } */
+/* { dg-final { scan-assembler-times "add\t" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c b/gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c
new file mode 100644
index 00000000000..c55f05ed1d2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba-shNadd-06.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zba -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long long f (long long a)
+{
+  return a + 7680;
+}
+
+/* { dg-final { scan-assembler-times "li\t" 1 } } */
+/* { dg-final { scan-assembler-times "sh3add\t" 1 } } */
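
For readers who want to check the arithmetic, the two constant-splitting
strategies used by the new expander can be modelled outside of GCC. The
following is only a minimal sketch in plain C and not code from this patch:
fits_simm12, split_2simm12 and split_shifted123 are hypothetical helper names,
and IMM_BITS is assumed to be 12 (the width of a RISC-V addi immediate). It is
meant to mirror the conditions tested by const_arith_2simm12_operand and
const_arith_shifted123_operand in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 12-bit signed immediates, as used by RISC-V addi.  */
#define IMM_BITS 12
#define IMM_MIN (-(INT64_C (1) << (IMM_BITS - 1)))    /* -2048 */
#define IMM_MAX ((INT64_C (1) << (IMM_BITS - 1)) - 1) /*  2047 */

/* Hypothetical stand-in for GCC's SMALL_OPERAND.  */
static bool
fits_simm12 (int64_t val)
{
  return val >= IMM_MIN && val <= IMM_MAX;
}

/* addi + addi: saturate the first immediate in the direction of VAL and
   put the remainder into the second one.  Values that already fit a
   single simm12 are accepted too; the expander handles those earlier.  */
static bool
split_2simm12 (int64_t val, int64_t *first, int64_t *second)
{
  assert (first != NULL && second != NULL);

  /* Range check first, so that the subtraction below cannot overflow.  */
  if (val < 2 * IMM_MIN || val > 2 * IMM_MAX)
    return false;

  int64_t saturated = (val >= 0) ? IMM_MAX : IMM_MIN;
  *first = saturated;
  *second = val - saturated;
  return true;
}

/* li + sh[123]add: VAL must not fit a simm12 itself, but must do so
   after shifting out 1 to 3 trailing zero bits.  */
static bool
split_shifted123 (int64_t val, int *shamt, int64_t *imm)
{
  assert (shamt != NULL && imm != NULL);

  if (fits_simm12 (val))
    return false;

  /* Count trailing zero bits, clamped to 3 (sh1add, sh2add, sh3add).  */
  int trailing = 0;
  while (trailing < 3 && (((uint64_t) val >> trailing) & 1) == 0)
    trailing++;

  /* The low TRAILING bits are zero, so this division is exact.  */
  int64_t shifted = val / (INT64_C (1) << trailing);
  if (trailing == 0 || !fits_simm12 (shifted))
    return false;

  *shamt = trailing;
  *imm = shifted;
  return true;
}
```

With these helpers, split_2simm12 (2445, ...) yields 2047 and 398, and
split_shifted123 (7680, ...) yields a shift amount of 3 with the immediate
960, which corresponds to the sequences quoted in the commit message and in
the new tests. A constant such as 0x12 << 16 is rejected by both helpers,
matching the li + add fallback expected by f6 in addi.c.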