From patchwork Fri Nov 17 08:51:13 2023
X-Patchwork-Submitter: Jiahao Xu
X-Patchwork-Id: 80323
From: Jiahao Xu
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu
Subject: [PATCH] LoongArch: Add support for xorsign.
Date: Fri, 17 Nov 2023 16:51:13 +0800
Message-Id: <20231117085113.7180-1-xujiahao@loongson.cn>

This patch adds support for the xorsign pattern for scalar floating-point
and vector modes.  The new expanders handle xorsign uniformly with vector
bitwise logical operations.  On LoongArch64, floating-point registers and
vector registers share the same registers, so this patch also allows
conversion between LSX vector modes and scalar fp modes to avoid generating
unnecessary instructions.

gcc/ChangeLog:

	* config/loongarch/lasx.md (xorsign3): New expander.
	* config/loongarch/loongarch.cc (loongarch_can_change_mode_class): Allow
	conversion between LSX vector mode and scalar fp mode.
	* config/loongarch/loongarch.md (@xorsign3): New expander.
	* config/loongarch/lsx.md (@xorsign3): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-xorsign.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-xorsign.c: New test.
	* gcc.target/loongarch/xorsign-run.c: New test.
	* gcc.target/loongarch/xorsign.c: New test.

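For readers unfamiliar with the pattern, here is a minimal sketch in plain C of what xorsign computes and why it can be lowered to an AND followed by an XOR. It is not part of the patch. The function names `xorsign_ref` and `xorsign_bits` are hypothetical, and the sketch assumes IEEE 754 binary64 `double` and a compiler that provides `__builtin_copysign` (GCC or Clang).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Source-level form of xorsign: a * copysign (1.0, b).
   The multiplication only ever flips the sign of a.  */
static double
xorsign_ref (double a, double b)
{
  return a * __builtin_copysign (1.0, b);
}

/* Bitwise form: isolate the sign bit of b with AND, then XOR it into a.
   Assumes double is IEEE 754 binary64 with the sign in bit 63.  */
static double
xorsign_bits (double a, double b)
{
  const uint64_t sign_mask = UINT64_C (1) << 63;
  uint64_t ua, ub;

  memcpy (&ua, &a, sizeof ua);
  memcpy (&ub, &b, sizeof ub);
  ua ^= ub & sign_mask;
  memcpy (&a, &ua, sizeof a);
  return a;
}
```

The vector expanders in the patch have the same shape: a per-element sign-bit mask is ANDed with operand 2, and the result is XORed with operand 1.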
diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index f0f2dd08dd8..5a4be588fb4 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1120,10 +1120,10 @@ (define_insn "umod3"
    (set_attr "mode" "")])
 
 (define_insn "xor3"
-  [(set (match_operand:ILASX 0 "register_operand" "=f,f,f")
-        (xor:ILASX
-          (match_operand:ILASX 1 "register_operand" "f,f,f")
-          (match_operand:ILASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  [(set (match_operand:LASX 0 "register_operand" "=f,f,f")
+        (xor:LASX
+          (match_operand:LASX 1 "register_operand" "f,f,f")
+          (match_operand:LASX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
   "ISA_HAS_LASX"
   "@
    xvxor.v\t%u0,%u1,%u2
@@ -3147,6 +3147,20 @@ (define_expand "copysign3"
   operands[5] = gen_reg_rtx (mode);
 })
 
+(define_expand "xorsign3"
+  [(set (match_dup 4)
+        (and:FLASX (match_dup 3)
+                   (match_operand:FLASX 2 "register_operand")))
+   (set (match_operand:FLASX 0 "register_operand")
+        (xor:FLASX (match_dup 4)
+                   (match_operand:FLASX 1 "register_operand")))]
+  "ISA_HAS_LASX"
+{
+  operands[3] = loongarch_build_signbit_mask (mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (mode);
+})
+
 
 (define_insn "absv4df2"
   [(set (match_operand:V4DF 0 "register_operand" "=f")
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index d05743bec87..e4cdbcf0f2d 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -6687,6 +6687,11 @@ loongarch_can_change_mode_class (machine_mode from, machine_mode to,
   if (LSX_SUPPORTED_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))
     return true;
 
+  /* Allow conversion between LSX vector mode and scalar fp mode. */
+  if ((LSX_SUPPORTED_MODE_P (from) && SCALAR_FLOAT_MODE_P (to))
+      || ((SCALAR_FLOAT_MODE_P (from) && LSX_SUPPORTED_MODE_P (to))))
+    return true;
+
   return !reg_classes_intersect_p (FP_REGS, rclass);
 }
 
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 22814a3679c..117c0924a85 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -1146,6 +1146,23 @@ (define_insn "copysign3"
   "fcopysign.\t%0,%1,%2"
   [(set_attr "type" "fcopysign")
    (set_attr "mode" "")])
+
+(define_expand "@xorsign3"
+  [(match_operand:ANYF 0 "register_operand")
+   (match_operand:ANYF 1 "register_operand")
+   (match_operand:ANYF 2 "register_operand")]
+  "ISA_HAS_LSX"
+{
+  machine_mode lsx_mode
+    = mode == SFmode ? V4SFmode : V2DFmode;
+  rtx tmp = gen_reg_rtx (lsx_mode);
+  rtx op1 = lowpart_subreg (lsx_mode, operands[1], mode);
+  rtx op2 = lowpart_subreg (lsx_mode, operands[2], mode);
+  emit_insn (gen_xorsign3 (lsx_mode, tmp, op1, op2));
+  emit_move_insn (operands[0],
+                  lowpart_subreg (mode, tmp, lsx_mode));
+  DONE;
+})
 
 ;;
 ;; ....................
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 55c7d79a030..40500363dc0 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -1027,10 +1027,10 @@ (define_insn "umod3"
    (set_attr "mode" "")])
 
 (define_insn "xor3"
-  [(set (match_operand:ILSX 0 "register_operand" "=f,f,f")
-        (xor:ILSX
-          (match_operand:ILSX 1 "register_operand" "f,f,f")
-          (match_operand:ILSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
+  [(set (match_operand:LSX 0 "register_operand" "=f,f,f")
+        (xor:LSX
+          (match_operand:LSX 1 "register_operand" "f,f,f")
+          (match_operand:LSX 2 "reg_or_vector_same_val_operand" "f,YC,Urv8")))]
   "ISA_HAS_LSX"
   "@
    vxor.v\t%w0,%w1,%w2
@@ -2884,6 +2884,21 @@ (define_expand "copysign3"
   operands[5] = gen_reg_rtx (mode);
 })
 
+(define_expand "@xorsign3"
+  [(set (match_dup 4)
+        (and:FLSX (match_dup 3)
+                  (match_operand:FLSX 2 "register_operand")))
+   (set (match_operand:FLSX 0 "register_operand")
+        (xor:FLSX (match_dup 4)
+                  (match_operand:FLSX 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  operands[3] = loongarch_build_signbit_mask (mode, 1, 0);
+
+  operands[4] = gen_reg_rtx (mode);
+})
+
+
 (define_insn "absv2df2"
   [(set (match_operand:V2DF 0 "register_operand" "=f")
         (abs:V2DF (match_operand:V2DF 1 "register_operand" "f")))]
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c
new file mode 100644
index 00000000000..2295503d4a1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign-run.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -mlasx" } */
+/* { dg-require-effective-target loongarch_asx_hw } */
+
+#include "lasx-xorsign.c"
+
+extern void abort ();
+
+#define N 16
+float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
+              -12.5f, -15.6f, -18.7f, -21.8f,
+              24.9f, 27.1f, 30.2f, 33.3f,
+              36.4f, 39.5f, 42.6f, 45.7f};
+float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
+              -9.0f, 1.0f, -2.0f, 3.0f,
+              -4.0f, -5.0f, 6.0f, 7.0f,
+              -8.0f, -9.0f, 10.0f, 11.0f};
+float r[N];
+
+double ad[N] = {-0.1d, -3.2d, -6.3d, -9.4d,
+                -12.5d, -15.6d, -18.7d, -21.8d,
+                24.9d, 27.1d, 30.2d, 33.3d,
+                36.4d, 39.5d, 42.6d, 45.7d};
+double bd[N] = {-1.2d, 3.4d, -5.6d, 7.8d,
+                -9.0d, 1.0d, -2.0d, 3.0d,
+                -4.0d, -5.0d, 6.0d, 7.0d,
+                -8.0d, -9.0d, 10.0d, 11.0d};
+double rd[N];
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsignf (void)
+{
+  for (int i = 0; i < N; i++)
+    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
+      abort ();
+}
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsign (void)
+{
+  for (int i = 0; i < N; i++)
+    if (rd[i] != ad[i] * __builtin_copysign (1.0d, bd[i]))
+      abort ();
+}
+
+int
+main (void)
+{
+  my_xorsignf (r, a, b, N);
+  /* check results: */
+  check_xorsignf ();
+
+  my_xorsign (rd, ad, bd, N);
+  /* check results: */
+  check_xorsign ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign.c
new file mode 100644
index 00000000000..190a9239b31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-xorsign.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -mlasx" } */
+/* { dg-final { scan-assembler "xvand\\.v" } } */
+/* { dg-final { scan-assembler "xvxor\\.v" } } */
+/* { dg-final { scan-assembler-not "xvfmul" } } */
+
+double
+my_xorsign (double *restrict a, double *restrict b, double *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysign (1.0d, c[i]);
+}
+
+float
+my_xorsignf (float *restrict a, float *restrict b, float *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysignf (1.0f, c[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c
new file mode 100644
index 00000000000..927d9b06d87
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign-run.c
@@ -0,0 +1,60 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -mlsx" } */
+/* { dg-require-effective-target loongarch_asx_hw } */
+
+#include "lsx-xorsign.c"
+
+extern void abort ();
+
+#define N 16
+float a[N] = {-0.1f, -3.2f, -6.3f, -9.4f,
+              -12.5f, -15.6f, -18.7f, -21.8f,
+              24.9f, 27.1f, 30.2f, 33.3f,
+              36.4f, 39.5f, 42.6f, 45.7f};
+float b[N] = {-1.2f, 3.4f, -5.6f, 7.8f,
+              -9.0f, 1.0f, -2.0f, 3.0f,
+              -4.0f, -5.0f, 6.0f, 7.0f,
+              -8.0f, -9.0f, 10.0f, 11.0f};
+float r[N];
+
+double ad[N] = {-0.1d, -3.2d, -6.3d, -9.4d,
+                -12.5d, -15.6d, -18.7d, -21.8d,
+                24.9d, 27.1d, 30.2d, 33.3d,
+                36.4d, 39.5d, 42.6d, 45.7d};
+double bd[N] = {-1.2d, 3.4d, -5.6d, 7.8d,
+                -9.0d, 1.0d, -2.0d, 3.0d,
+                -4.0d, -5.0d, 6.0d, 7.0d,
+                -8.0d, -9.0d, 10.0d, 11.0d};
+double rd[N];
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsignf (void)
+{
+  for (int i = 0; i < N; i++)
+    if (r[i] != a[i] * __builtin_copysignf (1.0f, b[i]))
+      abort ();
+}
+
+void
+__attribute__ ((optimize ("-O0")))
+check_xorsign (void)
+{
+  for (int i = 0; i < N; i++)
+    if (rd[i] != ad[i] * __builtin_copysign (1.0d, bd[i]))
+      abort ();
+}
+
+int
+main (void)
+{
+  my_xorsignf (r, a, b, N);
+  /* check results: */
+  check_xorsignf ();
+
+  my_xorsign (rd, ad, bd, N);
+  /* check results: */
+  check_xorsign ();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign.c
new file mode 100644
index 00000000000..c2694c11e79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-xorsign.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -mlsx" } */
+/* { dg-final { scan-assembler "vand\\.v" } } */
+/* { dg-final { scan-assembler "vxor\\.v" } } */
+/* { dg-final { scan-assembler-not "vfmul" } } */
+
+double
+my_xorsign (double *restrict a, double *restrict b, double *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysign (1.0d, c[i]);
+}
+
+float
+my_xorsignf (float *restrict a, float *restrict b, float *restrict c, int n)
+{
+  for (int i = 0; i < n; i++)
+    a[i] = b[i] * __builtin_copysignf (1.0f, c[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/xorsign-run.c b/gcc/testsuite/gcc.target/loongarch/xorsign-run.c
new file mode 100644
index 00000000000..5dd04cabe62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/xorsign-run.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mlsx" } */
+/* { dg-require-effective-target loongarch_asx_hw } */
+
+extern void abort(void);
+
+static double x = 2.0;
+static float y = 2.0;
+
+int main()
+{
+  if ((2.5 * __builtin_copysign(1.0d, x)) != 2.5)
+    abort();
+
+  if ((2.5 * __builtin_copysign(1.0f, y)) != 2.5)
+    abort();
+
+  if ((2.5 * __builtin_copysignf(1.0d, -x)) != -2.5)
+    abort();
+
+  if ((2.5 * __builtin_copysignf(1.0f, -y)) != -2.5)
+    abort();
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/xorsign.c b/gcc/testsuite/gcc.target/loongarch/xorsign.c
new file mode 100644
index 00000000000..ca80603d48b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/xorsign.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlsx" } */
+/* { dg-final { scan-assembler "vand\\.v" } } */
+/* { dg-final { scan-assembler "vxor\\.v" } } */
+/* { dg-final { scan-assembler-not "fcopysign" } } */
+/* { dg-final { scan-assembler-not "fmul" } } */
+
+double
+my_xorsign (double a, double b)
+{
+  return a * __builtin_copysign (1.0d, b);
+}
+
+float
+my_xorsignf (float a, float b)
+{
+  return a * __builtin_copysignf (1.0f, b);
+}