From patchwork Fri Nov 17 08:33:43 2023
X-Patchwork-Submitter: Lulu Cheng
X-Patchwork-Id: 80103
From: Lulu Cheng
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, xuchenghua@loongson.cn, Lulu Cheng
Subject: [PATCH v1 2/3] LoongArch: Implement atomic operations using LoongArch1.1 instructions.
Date: Fri, 17 Nov 2023 16:33:43 +0800
Message-Id: <20231117083344.29037-3-chenglulu@loongson.cn>
In-Reply-To: <20231117083344.29037-1-chenglulu@loongson.cn>
References: <20231117083344.29037-1-chenglulu@loongson.cn>

1. Calls to __atomic_add_fetch and __atomic_fetch_add on short and char types
   are implemented using amadd{_db}.{b/h}.
2. Use amcas{_db}.{b/h/w/d} to implement __atomic_compare_exchange_n and
   __atomic_compare_exchange.
3. Calls to __atomic_exchange and __atomic_exchange_n on short and char types
   are implemented using amswap{_db}.{b/h}.

gcc/ChangeLog:

	* config/loongarch/loongarch-def.h: Add comments.
	* config/loongarch/loongarch-opts.h (ISA_BASE_IS_LA64V110): Define macro.
	* config/loongarch/loongarch.cc
	(loongarch_memmodel_needs_rel_acq_fence): Remove redundant code
	implementations.
	* config/loongarch/sync.md (d): Added QI, HI support.
	(atomic_add): New template.
	(atomic_exchange_short): Likewise.
	(atomic_cas_value_strong_amcas): Likewise.
	(atomic_fetch_add_short): Likewise.
---
 gcc/config/loongarch/loongarch-def.h  |   2 +
 gcc/config/loongarch/loongarch-opts.h |   2 +-
 gcc/config/loongarch/loongarch.cc     |   6 +-
 gcc/config/loongarch/sync.md          | 186 ++++++++++++++++++++------
 4 files changed, 147 insertions(+), 49 deletions(-)

diff --git a/gcc/config/loongarch/loongarch-def.h b/gcc/config/loongarch/loongarch-def.h
index db497f3ffe2..b319cded456 100644
--- a/gcc/config/loongarch/loongarch-def.h
+++ b/gcc/config/loongarch/loongarch-def.h
@@ -54,7 +54,9 @@ extern "C" {

 /* enum isa_base */
 extern const char* loongarch_isa_base_strings[];
+/* LoongArch V1.00.  */
 #define ISA_BASE_LA64V100     0
+/* LoongArch V1.10.  */
 #define ISA_BASE_LA64V110     1
 #define N_ISA_BASE_TYPES      2

diff --git a/gcc/config/loongarch/loongarch-opts.h b/gcc/config/loongarch/loongarch-opts.h
index bd2e86a5aa7..e8645d9ad5c 100644
--- a/gcc/config/loongarch/loongarch-opts.h
+++ b/gcc/config/loongarch/loongarch-opts.h
@@ -86,10 +86,10 @@ loongarch_update_gcc_opt_status (struct loongarch_target *target,
                                    || la_target.isa.simd == ISA_EXT_SIMD_LASX)
 #define ISA_HAS_LASX              (la_target.isa.simd == ISA_EXT_SIMD_LASX)

-
 /* TARGET_ macros for use in *.md template conditionals */
 #define TARGET_uARCH_LA464        (la_target.cpu_tune == CPU_LA464)
 #define TARGET_uARCH_LA664        (la_target.cpu_tune == CPU_LA664)
+#define ISA_BASE_IS_LA64V110      (la_target.isa.base == ISA_BASE_LA64V110)

 /* Note: optimize_size may vary across functions,
    while -m[no]-memcpy imposes a global constraint.  */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 464f6c4dd63..5bec10d7418 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -5813,16 +5813,12 @@ loongarch_print_operand_punct_valid_p (unsigned char code)
 static bool
 loongarch_memmodel_needs_rel_acq_fence (enum memmodel model)
 {
-  switch (model)
+  switch (memmodel_base (model))
     {
     case MEMMODEL_ACQ_REL:
     case MEMMODEL_SEQ_CST:
-    case MEMMODEL_SYNC_SEQ_CST:
     case MEMMODEL_RELEASE:
-    case MEMMODEL_SYNC_RELEASE:
     case MEMMODEL_ACQUIRE:
-    case MEMMODEL_CONSUME:
-    case MEMMODEL_SYNC_ACQUIRE:
       return true;

     case MEMMODEL_RELAXED:
diff --git a/gcc/config/loongarch/sync.md b/gcc/config/loongarch/sync.md
index c112091a60f..66e316d80f5 100644
--- a/gcc/config/loongarch/sync.md
+++ b/gcc/config/loongarch/sync.md
@@ -38,7 +38,7 @@ (define_code_attr atomic_optab
   [(plus "add") (ior "or") (xor "xor") (and "and")])

 ;; This attribute gives the format suffix for atomic memory operations.
-(define_mode_attr amo [(SI "w") (DI "d")])
+(define_mode_attr amo [(QI "b") (HI "h") (SI "w") (DI "d")])

 ;; expands to the name of the atomic operand that implements a
 ;; particular code.
@@ -123,7 +123,18 @@ (define_insn "atomic_"
            UNSPEC_SYNC_OLD_OP))]
   ""
   "am%A2.\t$zero,%z1,%0"
-  [(set (attr "length") (const_int 8))])
+  [(set (attr "length") (const_int 4))])
+
+(define_insn "atomic_add"
+  [(set (match_operand:SHORT 0 "memory_operand" "+ZB")
+        (unspec_volatile:SHORT
+          [(plus:SHORT (match_dup 0)
+                       (match_operand:SHORT 1 "reg_or_0_operand" "rJ"))
+           (match_operand:SI 2 "const_int_operand")] ;; model
+         UNSPEC_SYNC_OLD_OP))]
+  "ISA_BASE_IS_LA64V110"
+  "amadd%A2.\t$zero,%z1,%0"
+  [(set (attr "length") (const_int 4))])

 (define_insn "atomic_fetch_"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
@@ -131,12 +142,12 @@ (define_insn "atomic_fetch_"
    (set (match_dup 1)
         (unspec_volatile:GPR
           [(any_atomic:GPR (match_dup 1)
-                     (match_operand:GPR 2 "reg_or_0_operand" "rJ"))
+                           (match_operand:GPR 2 "reg_or_0_operand" "rJ"))
            (match_operand:SI 3 "const_int_operand")] ;; model
          UNSPEC_SYNC_OLD_OP))]
   ""
   "am%A3.\t%0,%z2,%1"
-  [(set (attr "length") (const_int 8))])
+  [(set (attr "length") (const_int 4))])

 (define_insn "atomic_exchange"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
@@ -148,7 +159,19 @@ (define_insn "atomic_exchange"
         (match_operand:GPR 2 "register_operand" "r"))]
   ""
   "amswap%A3.\t%0,%z2,%1"
-  [(set (attr "length") (const_int 8))])
+  [(set (attr "length") (const_int 4))])
+
+(define_insn "atomic_exchange_short"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+        (unspec_volatile:SHORT
+          [(match_operand:SHORT 1 "memory_operand" "+ZB")
+           (match_operand:SI 3 "const_int_operand")] ;; model
+          UNSPEC_SYNC_EXCHANGE))
+   (set (match_dup 1)
+        (match_operand:SHORT 2 "register_operand" "r"))]
+  "ISA_BASE_IS_LA64V110"
+  "amswap%A3.\t%0,%z2,%1"
+  [(set (attr "length") (const_int 4))])

 (define_insn "atomic_cas_value_strong"
   [(set (match_operand:GPR 0 "register_operand" "=&r")
@@ -156,25 +179,36 @@ (define_insn "atomic_cas_value_strong"
    (set (match_dup 1)
         (unspec_volatile:GPR [(match_operand:GPR 2 "reg_or_0_operand" "rJ")
                               (match_operand:GPR 3 "reg_or_0_operand" "rJ")
-                              (match_operand:SI 4 "const_int_operand")  ;; mod_s
-                              (match_operand:SI 5 "const_int_operand")] ;; mod_f
+                              (match_operand:SI 4 "const_int_operand")]  ;; mod_s
          UNSPEC_COMPARE_AND_SWAP))
-   (clobber (match_scratch:GPR 6 "=&r"))]
+   (clobber (match_scratch:GPR 5 "=&r"))]
   ""
 {
   return "1:\\n\\t"
          "ll.\\t%0,%1\\n\\t"
          "bne\\t%0,%z2,2f\\n\\t"
-         "or%i3\\t%6,$zero,%3\\n\\t"
-         "sc.\\t%6,%1\\n\\t"
-         "beqz\\t%6,1b\\n\\t"
+         "or%i3\\t%5,$zero,%3\\n\\t"
+         "sc.\\t%5,%1\\n\\t"
+         "beqz\\t%5,1b\\n\\t"
          "b\\t3f\\n\\t"
          "2:\\n\\t"
-         "%G5\\n\\t"
+         "%G4\\n\\t"
          "3:\\n\\t";
 }
   [(set (attr "length") (const_int 28))])

+(define_insn "atomic_cas_value_strong_amcas"
+  [(set (match_operand:QHWD 0 "register_operand" "=&r")
+        (match_operand:QHWD 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+        (unspec_volatile:QHWD [(match_operand:QHWD 2 "reg_or_0_operand" "rJ")
+                               (match_operand:QHWD 3 "reg_or_0_operand" "rJ")
+                               (match_operand:SI 4 "const_int_operand")]  ;; mod_s
+         UNSPEC_COMPARE_AND_SWAP))]
+  "ISA_BASE_IS_LA64V110"
+  "ori\t%0,%z2,0\n\tamcas%A4.\t%0,%z3,%1"
+  [(set (attr "length") (const_int 8))])
+
 (define_expand "atomic_compare_and_swap"
   [(match_operand:SI 0 "register_operand" "")   ;; bool output
    (match_operand:GPR 1 "register_operand" "")  ;; val output
@@ -186,9 +220,29 @@ (define_expand "atomic_compare_and_swap"
    (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
   ""
 {
-  emit_insn (gen_atomic_cas_value_strong (operands[1], operands[2],
-                                          operands[3], operands[4],
-                                          operands[6], operands[7]));
+  rtx mod_s, mod_f;
+
+  mod_s = operands[6];
+  mod_f = operands[7];
+
+  /* Normally the succ memory model must be stronger than fail, but in the
+     unlikely event of fail being ACQUIRE and succ being RELEASE we need to
+     promote succ to ACQ_REL so that we don't lose the acquire semantics.  */
+
+  if (is_mm_acquire (memmodel_base (INTVAL (mod_f)))
+      && is_mm_release (memmodel_base (INTVAL (mod_s))))
+    mod_s = GEN_INT (MEMMODEL_ACQ_REL);
+
+  operands[6] = mod_s;
+
+  if (ISA_BASE_IS_LA64V110)
+    emit_insn (gen_atomic_cas_value_strong_amcas (operands[1], operands[2],
+                                                  operands[3], operands[4],
+                                                  operands[6]));
+  else
+    emit_insn (gen_atomic_cas_value_strong (operands[1], operands[2],
+                                            operands[3], operands[4],
+                                            operands[6]));

   rtx compare = operands[1];
   if (operands[3] != const0_rtx)
@@ -292,31 +346,53 @@ (define_expand "atomic_compare_and_swap"
    (match_operand:SI 7 "const_int_operand" "")] ;; mod_f
   ""
 {
-  union loongarch_gen_fn_ptrs generator;
-  generator.fn_7 = gen_atomic_cas_value_cmp_and_7_si;
-  loongarch_expand_atomic_qihi (generator, operands[1], operands[2],
-                                operands[3], operands[4], operands[7]);
+  rtx mod_s, mod_f;

-  rtx compare = operands[1];
-  if (operands[3] != const0_rtx)
-    {
-      machine_mode mode = GET_MODE (operands[3]);
-      rtx op1 = convert_modes (SImode, mode, operands[1], true);
-      rtx op3 = convert_modes (SImode, mode, operands[3], true);
-      rtx difference = gen_rtx_MINUS (SImode, op1, op3);
-      compare = gen_reg_rtx (SImode);
-      emit_insn (gen_rtx_SET (compare, difference));
-    }
+  mod_s = operands[6];
+  mod_f = operands[7];

-  if (word_mode != mode)
+  /* Normally the succ memory model must be stronger than fail, but in the
+     unlikely event of fail being ACQUIRE and succ being RELEASE we need to
+     promote succ to ACQ_REL so that we don't lose the acquire semantics.  */
+
+  if (is_mm_acquire (memmodel_base (INTVAL (mod_f)))
+      && is_mm_release (memmodel_base (INTVAL (mod_s))))
+    mod_s = GEN_INT (MEMMODEL_ACQ_REL);
+
+  operands[6] = mod_s;
+
+  if (ISA_BASE_IS_LA64V110)
+    emit_insn (gen_atomic_cas_value_strong_amcas (operands[1], operands[2],
+                                                  operands[3], operands[4],
+                                                  operands[6]));
+  else
     {
-      rtx reg = gen_reg_rtx (word_mode);
-      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
-      compare = reg;
+      union loongarch_gen_fn_ptrs generator;
+      generator.fn_7 = gen_atomic_cas_value_cmp_and_7_si;
+      loongarch_expand_atomic_qihi (generator, operands[1], operands[2],
+                                    operands[3], operands[4], operands[6]);
     }

-  emit_insn (gen_rtx_SET (operands[0],
-                          gen_rtx_EQ (SImode, compare, const0_rtx)));
+  rtx compare = operands[1];
+  if (operands[3] != const0_rtx)
+    {
+      machine_mode mode = GET_MODE (operands[3]);
+      rtx op1 = convert_modes (SImode, mode, operands[1], true);
+      rtx op3 = convert_modes (SImode, mode, operands[3], true);
+      rtx difference = gen_rtx_MINUS (SImode, op1, op3);
+      compare = gen_reg_rtx (SImode);
+      emit_insn (gen_rtx_SET (compare, difference));
+    }
+
+  if (word_mode != mode)
+    {
+      rtx reg = gen_reg_rtx (word_mode);
+      emit_insn (gen_rtx_SET (reg, gen_rtx_SIGN_EXTEND (word_mode, compare)));
+      compare = reg;
+    }
+
+  emit_insn (gen_rtx_SET (operands[0],
+                          gen_rtx_EQ (SImode, compare, const0_rtx)));
   DONE;
 })

@@ -505,13 +581,31 @@ (define_expand "atomic_exchange"
         (match_operand:SHORT 2 "register_operand"))]
   ""
 {
-  union loongarch_gen_fn_ptrs generator;
-  generator.fn_7 = gen_atomic_cas_value_exchange_7_si;
-  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
-                                const0_rtx, operands[2], operands[3]);
+  if (ISA_BASE_IS_LA64V110)
+    emit_insn (gen_atomic_exchange_short (operands[0], operands[1], operands[2], operands[3]));
+  else
+    {
+      union loongarch_gen_fn_ptrs generator;
+      generator.fn_7 = gen_atomic_cas_value_exchange_7_si;
+      loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+                                    const0_rtx, operands[2], operands[3]);
+    }
   DONE;
 })

+(define_insn "atomic_fetch_add_short"
+  [(set (match_operand:SHORT 0 "register_operand" "=&r")
+        (match_operand:SHORT 1 "memory_operand" "+ZB"))
+   (set (match_dup 1)
+        (unspec_volatile:SHORT
+          [(plus:SHORT (match_dup 1)
+                       (match_operand:SHORT 2 "reg_or_0_operand" "rJ"))
+           (match_operand:SI 3 "const_int_operand")] ;; model
+         UNSPEC_SYNC_OLD_OP))]
+  "ISA_BASE_IS_LA64V110"
+  "amadd%A3.\t%0,%z2,%1"
+  [(set (attr "length") (const_int 4))])
+
 (define_expand "atomic_fetch_add"
   [(set (match_operand:SHORT 0 "register_operand" "=&r")
         (match_operand:SHORT 1 "memory_operand" "+ZB"))
@@ -523,10 +617,16 @@ (define_expand "atomic_fetch_add"
          UNSPEC_SYNC_OLD_OP))]
   ""
 {
-  union loongarch_gen_fn_ptrs generator;
-  generator.fn_7 = gen_atomic_cas_value_add_7_si;
-  loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
-                                operands[1], operands[2], operands[3]);
+  if (ISA_BASE_IS_LA64V110)
+    emit_insn (gen_atomic_fetch_add_short (operands[0], operands[1],
+                                           operands[2], operands[3]));
+  else
+    {
+      union loongarch_gen_fn_ptrs generator;
+      generator.fn_7 = gen_atomic_cas_value_add_7_si;
+      loongarch_expand_atomic_qihi (generator, operands[0], operands[1],
+                                    operands[1], operands[2], operands[3]);
+    }
   DONE;
 })
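
For reference, the builtin calls named in the commit message can be exercised from plain C roughly as sketched below. This is not part of the patch: the helper names (fetch_add_byte, add_fetch_half, cas_half, exchange_byte, self_check) are made up for illustration. The sketch only checks the observable results of the builtins. Whether each call is emitted as amadd/amswap/amcas or as the older LL/SC sequence depends on the ISA base selected when compiling, which this code does not inspect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of these names
   appear in the patch.  */

/* Item 1: byte-sized fetch-and-add; returns the value before the add.  */
int8_t
fetch_add_byte (int8_t *p, int8_t v)
{
  return __atomic_fetch_add (p, v, __ATOMIC_SEQ_CST);
}

/* Item 1: halfword add-and-fetch; returns the value after the add.  */
int16_t
add_fetch_half (int16_t *p, int16_t v)
{
  return __atomic_add_fetch (p, v, __ATOMIC_RELAXED);
}

/* Item 2: strong compare-and-swap on a halfword.  On failure the
   builtin writes the value it observed back into *expected.  */
bool
cas_half (int16_t *p, int16_t *expected, int16_t desired)
{
  return __atomic_compare_exchange_n (p, expected, desired, false,
                                      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
}

/* Item 3: byte-sized exchange; returns the previous value.  */
int8_t
exchange_byte (int8_t *p, int8_t v)
{
  return __atomic_exchange_n (p, v, __ATOMIC_ACQ_REL);
}

/* Single-threaded sanity check of the results described above.  */
void
self_check (void)
{
  int8_t b = 5;
  int16_t h = 10;
  int16_t expected = 12;

  assert (fetch_add_byte (&b, 3) == 5 && b == 8);
  assert (add_fetch_half (&h, 2) == 12);
  assert (cas_half (&h, &expected, 20) && h == 20);

  expected = 7;   /* Stale expectation: the swap must fail.  */
  assert (!cas_half (&h, &expected, 30) && expected == 20 && h == 20);

  assert (exchange_byte (&b, 1) == 8 && b == 1);
}
```

Compiling such a file with -S for an ISA base that predates LoongArch V1.10 and again for one that includes it, then comparing the assembly, would be one way to see which of the two code paths in the expanders was taken.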