From patchwork Thu Jan 13 14:56:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 49975
X-Original-To: gcc-patches@gcc.gnu.org
Subject: [PATCH v3 00/15] ARM/MVE use vectors of boolean for predicates
Date: Thu, 13 Jan 2022 15:56:10 +0100
Message-ID: <20220113145645.4077141-1-christophe.lyon@foss.st.com>
X-Mailer: git-send-email 2.25.1
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Christophe Lyon via Gcc-patches
From: Christophe Lyon
Reply-To: Christophe Lyon

This is v3 of this patch series, fixing issues I discovered before
committing v2 (which had been approved).

Thanks a lot to Richard Sandiford for his help.

The changes v2 -> v3 are:

Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support VPR.

Patch 7: Changes to the underlying representation of vectors of
booleans to account for the different expectations between AArch64/SVE
and Arm/MVE.

Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of
duplicating them in mve_mov.
This requires the introduction of a new constraint to match a constant
vector of booleans. Add a new RTL test.

Patch 9: Introduce check_effective_target_arm_mve and skip
gcc.dg/signbit-2.c, because with MVE there is no fallback architecture,
unlike SVE or AVX512.

Patch 12: Update fewer load/store MVE builtins. We keep HI mode for
vpr_register_operand in the following ones:
  mve_vldrdq_gather_base_z_v2di,
  mve_vldrdq_gather_offset_z_v2di,
  mve_vldrdq_gather_shifted_offset_z_v2di,
  mve_vstrdq_scatter_base_p_v2di,
  mve_vstrdq_scatter_offset_p_v2di,
  mve_vstrdq_scatter_offset_p_v2di_insn,
  mve_vstrdq_scatter_shifted_offset_p_v2di,
  mve_vstrdq_scatter_shifted_offset_p_v2di_insn,
  mve_vstrdq_scatter_base_wb_p_v2di,
  mve_vldrdq_gather_base_wb_z_v2di,
  mve_vldrdq_gather_base_nowb_z_v2di,
  mve_vldrdq_gather_base_wb_z_v2di_insn.

Patch 13: No need to update
gcc.target/arm/acle/cde-mve-full-assembly.c anymore, since we re-use
the mov pattern that emits '@ movhi' in the assembly.

Patch 15: This is a new patch to fix a problem I noticed during this
v2->v3 update.

I'll squash patch 2 with patch 9 and patch 3 with patch 8.

Original text:

This patch series addresses PR 100757 and 101325 by representing
vectors of predicates (MVE VPR.P0 register) as vectors of booleans
rather than using HImode.

As this implies a lot of mostly mechanical changes, I have tried to
split the patches in a way that should help reviewers, but the split is
a bit artificial.

Patches 1-3 add new tests.

Patches 4-6 are small independent improvements.

Patch 7 implements the predicate qualifier, but does not change any
builtin yet.

Patch 8 is the first of the two main patches, and uses the new
qualifier to describe the vcmp and vpsel builtins that are useful for
auto-vectorization of comparisons.

Patch 9 is the second main patch, which fixes the vcond_mask expander.

Patches 10-13 convert almost all the remaining builtins with HI
operands to use the predicate qualifier.
After these, there are still a few builtins with HI operands left,
about which I am not sure: vctp, vpnot, load-gather and store-scatter
with v2di operands. In fact, patches 11/12 update some STR/LDR
qualifiers in a way that breaks these v2di builtins although existing
tests still pass.

Christophe Lyon (15):
  arm: Add new tests for comparison vectorization with Neon and MVE
  arm: Add tests for PR target/100757
  arm: Add tests for PR target/101325
  arm: Add GENERAL_AND_VPR_REGS regclass
  arm: Add support for VPR_REG in arm_class_likely_spilled_p
  arm: Fix mve_vmvnq_n_ argument mode
  arm: Implement MVE predicates as vectors of booleans
  arm: Implement auto-vectorized MVE comparisons with vectors of
    boolean predicates
  arm: Fix vcond_mask expander for MVE (PR target/100757)
  arm: Convert remaining MVE vcmp builtins to predicate qualifiers
  arm: Convert more MVE builtins to predicate qualifiers
  arm: Convert more load/store MVE builtins to predicate qualifiers
  arm: Convert more MVE/CDE builtins to predicate qualifiers
  arm: Add VPR_REG to ALL_REGS
  arm: Fix constraint check for V8HI in mve_vector_mem_operand

 gcc/config/aarch64/aarch64-modes.def | 8 +-
 gcc/config/arm/arm-builtins.c | 224 +++--
 gcc/config/arm/arm-builtins.h | 4 +-
 gcc/config/arm/arm-modes.def | 8 +
 gcc/config/arm/arm-protos.h | 4 +-
 gcc/config/arm/arm-simd-builtin-types.def | 4 +
 gcc/config/arm/arm.c | 169 ++--
 gcc/config/arm/arm.h | 9 +-
 gcc/config/arm/arm_mve_builtins.def | 746 ++++++++--------
 gcc/config/arm/constraints.md | 6 +
 gcc/config/arm/iterators.md | 6 +
 gcc/config/arm/mve.md | 795 ++++++++++--------
 gcc/config/arm/neon.md | 39 +
 gcc/config/arm/vec-common.md | 52 --
 gcc/config/arm/vfp.md | 34 +-
 gcc/doc/sourcebuild.texi | 4 +
 gcc/emit-rtl.c | 20 +-
 gcc/genmodes.c | 81 +-
 gcc/machmode.def | 2 +-
 gcc/rtx-vector-builder.c | 4 +-
 gcc/simplify-rtx.c | 34 +-
 gcc/testsuite/gcc.dg/signbit-2.c | 1 +
 .../gcc.target/arm/simd/mve-vcmp-f32-2.c | 32 +
 .../gcc.target/arm/simd/neon-compare-1.c | 78 ++
 .../gcc.target/arm/simd/neon-compare-2.c | 13 +
 .../gcc.target/arm/simd/neon-compare-3.c | 14 +
 .../arm/simd/neon-compare-scalar-1.c | 57 ++
 .../gcc.target/arm/simd/neon-vcmp-f16.c | 12 +
 .../gcc.target/arm/simd/neon-vcmp-f32-2.c | 15 +
 .../gcc.target/arm/simd/neon-vcmp-f32-3.c | 12 +
 .../gcc.target/arm/simd/neon-vcmp-f32.c | 12 +
 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c | 22 +
 .../gcc.target/arm/simd/pr100757-2.c | 20 +
 .../gcc.target/arm/simd/pr100757-3.c | 20 +
 .../gcc.target/arm/simd/pr100757-4.c | 19 +
 gcc/testsuite/gcc.target/arm/simd/pr100757.c | 19 +
 .../gcc.target/arm/simd/pr101325-2.c | 19 +
 gcc/testsuite/gcc.target/arm/simd/pr101325.c | 14 +
 gcc/testsuite/lib/target-supports.exp | 15 +-
 gcc/varasm.c | 7 +-
 40 files changed, 1635 insertions(+), 1019 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c
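
Illustration (not part of the series): the cover letter contrasts a
predicate held as a plain 16-bit HImode value with the same predicate
viewed as a vector of booleans. The portable C sketch below models that
difference. It is not GCC code and does not use MVE intrinsics, and all
function names in it (pred_pack, pred_lane, cmp_gt_s32x4, select_s32x4)
are hypothetical. It assumes the MVE layout in which VPR.P0 holds 16
bits, one per byte of the 128-bit vector, so a true 32-bit lane sets
four adjacent bits. Reading a lane back through its lowest bit is a
choice made for the sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model only: 16 predicate bits, one per byte of a 128-bit vector.  */
enum { VECTOR_BYTES = 16 };

/* Pack NLANES booleans into a 16-bit mask, replicating each boolean
   over every bit that belongs to its lane.  */
static uint16_t
pred_pack (const bool *lanes, int nlanes)
{
  assert (nlanes == 4 || nlanes == 8 || nlanes == 16);
  int bits_per_lane = VECTOR_BYTES / nlanes;
  uint32_t lane_bits = (1u << bits_per_lane) - 1u;
  uint32_t mask = 0;
  for (int i = 0; i < nlanes; i++)
    if (lanes[i])
      mask |= lane_bits << (i * bits_per_lane);
  return (uint16_t) mask;
}

/* Read lane I back from the mask by testing the lowest bit of the
   lane (an assumption of this sketch).  */
static bool
pred_lane (uint16_t mask, int nlanes, int i)
{
  assert (nlanes == 4 || nlanes == 8 || nlanes == 16);
  assert (i >= 0 && i < nlanes);
  int bits_per_lane = VECTOR_BYTES / nlanes;
  return (mask >> (i * bits_per_lane)) & 1u;
}

/* Lane-wise a > b on four 32-bit elements, returned as a 16-bit mask.  */
static uint16_t
cmp_gt_s32x4 (const int32_t a[4], const int32_t b[4])
{
  bool lanes[4];
  for (int i = 0; i < 4; i++)
    lanes[i] = a[i] > b[i];
  return pred_pack (lanes, 4);
}

/* Lane-wise select: out[i] = (lane i of mask) ? a[i] : b[i].  */
static void
select_s32x4 (int32_t out[4], const int32_t a[4], const int32_t b[4],
              uint16_t mask)
{
  for (int i = 0; i < 4; i++)
    out[i] = pred_lane (mask, 4, i) ? a[i] : b[i];
}
```

In this model, comparing a = {1, 5, 3, 7} with b = {2, 4, 3, 6} gives
the mask 0xF0F0: lanes 1 and 3 are true and each contributes four bits.
The same 16-bit value means something different when interpreted with 8
or 16 lanes, which is the information a bare HImode operand does not
carry and a vector-of-booleans type does.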