From patchwork Fri Jun 10 13:37:14 2022
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 55011
Date: Fri, 10 Jun 2022 14:37:14 +0100
From: "Andre Vieira (lists)"
To: "gcc-patches@gcc.gnu.org"
Cc: Richard Sandiford
Subject: [PATCH][AArch64] Implement ACLE Data Intrinsics

Hi,

This patch adds support for the ACLE Data Intrinsics to the AArch64 port.

Bootstrapped and regression tested on aarch64-none-linux.

OK for trunk?

gcc/ChangeLog:

2022-06-10  Andre Vieira

        * config/aarch64/aarch64.md (rbit<mode>2): Rename this ...
        (@aarch64_rbit<mode>): ... to this and change it in...
        (ffs<mode>2, ctz<mode>2): ... here.
        (@aarch64_rev16<mode>): New.
        * config/aarch64/aarch64-builtins.cc (aarch64_builtins): Define the
        following enum values: AARCH64_REV16, AARCH64_REV16L,
        AARCH64_REV16LL, AARCH64_RBIT, AARCH64_RBITL, AARCH64_RBITLL.
        (aarch64_init_data_intrinsics): New.
        (handle_arm_acle_h): Add call to aarch64_init_data_intrinsics.
        (aarch64_expand_builtin_data_intrinsic): New.
        (aarch64_general_expand_builtin): Add call to
        aarch64_expand_builtin_data_intrinsic.
        * config/aarch64/arm_acle.h (__clz, __clzl, __clzll, __cls, __clsl,
        __clsll, __rbit, __rbitl, __rbitll, __rev, __revl, __revll, __rev16,
        __rev16l, __rev16ll, __ror, __rorl, __rorll, __revsh): New.

gcc/testsuite/ChangeLog:

2022-06-10  Andre Vieira

        * gcc.target/aarch64/acle/data-intrinsics.c: New test.

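For readers less familiar with these operations, here is a minimal portable C sketch of what three of the new intrinsics are expected to compute. It is not part of the patch: the `ref_*` names are hypothetical, and the semantics are inferred from the instructions the test file expects (`rbit`, `rev16`), so treat it as an illustration rather than a specification.

```c
#include <stdint.h>

/* Hypothetical reference model of __rbit: reverse the order of all
   32 bits (bit 0 becomes bit 31, and so on).  */
static uint32_t
ref_rbit32 (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 32; i++)
    {
      r = (r << 1) | (x & 1);
      x >>= 1;
    }
  return r;
}

/* Hypothetical reference model of __rev16: swap the two bytes inside
   each 16-bit halfword, leaving the halfwords in place.  */
static uint32_t
ref_rev16_32 (uint32_t x)
{
  return ((x & 0x00ff00ffu) << 8) | ((x & 0xff00ff00u) >> 8);
}

/* Hypothetical reference model of __revsh: byte-swap a 16-bit value
   and return it as a signed halfword.  */
static int16_t
ref_revsh (int16_t x)
{
  uint16_t u = (uint16_t) x;
  return (int16_t) (uint16_t) ((u << 8) | (u >> 8));
}
```

On an AArch64 target, `__rbit`, `__rev16` and `__revsh` from arm_acle.h would be expected to agree with these helpers for every input.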
diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index e0a741ac663188713e21f457affa57217d074783..91a687dee13a27c21f0c50de9ba777aa900d6096 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -613,6 +613,12 @@ enum aarch64_builtins
   AARCH64_LS64_BUILTIN_ST64B,
   AARCH64_LS64_BUILTIN_ST64BV,
   AARCH64_LS64_BUILTIN_ST64BV0,
+  AARCH64_REV16,
+  AARCH64_REV16L,
+  AARCH64_REV16LL,
+  AARCH64_RBIT,
+  AARCH64_RBITL,
+  AARCH64_RBITLL,
   AARCH64_BUILTIN_MAX
 };
 
@@ -1664,10 +1670,41 @@ aarch64_init_ls64_builtins (void)
       = aarch64_general_add_builtin (data[i].name, data[i].type, data[i].code);
 }
 
+static void
+aarch64_init_data_intrinsics (void)
+{
+  tree uint32_fntype = build_function_type_list (uint32_type_node,
+                                                 uint32_type_node, NULL_TREE);
+  tree long_fntype = build_function_type_list (long_unsigned_type_node,
+                                               long_unsigned_type_node,
+                                               NULL_TREE);
+  tree uint64_fntype = build_function_type_list (uint64_type_node,
+                                                 uint64_type_node, NULL_TREE);
+  aarch64_builtin_decls[AARCH64_REV16]
+    = aarch64_general_add_builtin ("__builtin_aarch64_rev16", uint32_fntype,
+                                   AARCH64_REV16);
+  aarch64_builtin_decls[AARCH64_REV16L]
+    = aarch64_general_add_builtin ("__builtin_aarch64_rev16l", long_fntype,
+                                   AARCH64_REV16L);
+  aarch64_builtin_decls[AARCH64_REV16LL]
+    = aarch64_general_add_builtin ("__builtin_aarch64_rev16ll", uint64_fntype,
+                                   AARCH64_REV16LL);
+  aarch64_builtin_decls[AARCH64_RBIT]
+    = aarch64_general_add_builtin ("__builtin_aarch64_rbit", uint32_fntype,
+                                   AARCH64_RBIT);
+  aarch64_builtin_decls[AARCH64_RBITL]
+    = aarch64_general_add_builtin ("__builtin_aarch64_rbitl", long_fntype,
+                                   AARCH64_RBITL);
+  aarch64_builtin_decls[AARCH64_RBITLL]
+    = aarch64_general_add_builtin ("__builtin_aarch64_rbitll", uint64_fntype,
+                                   AARCH64_RBITLL);
+}
+
 /* Implement #pragma GCC aarch64 "arm_acle.h".  */
 void
 handle_arm_acle_h (void)
 {
+  aarch64_init_data_intrinsics ();
   if (TARGET_LS64)
     aarch64_init_ls64_builtins ();
 }
@@ -2393,6 +2430,32 @@ aarch64_expand_builtin_memtag (int fcode, tree exp, rtx target)
   emit_insn (pat);
   return target;
 }
+/* Function to expand an expression EXP which calls one of the ACLE Data
+   Intrinsic builtins FCODE with the result going to TARGET.  */
+static rtx
+aarch64_expand_builtin_data_intrinsic (unsigned int fcode, tree exp, rtx target)
+{
+  rtx op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
+  machine_mode mode = GET_MODE (op0);
+  rtx pat;
+  switch (fcode)
+    {
+    case AARCH64_REV16:
+    case AARCH64_REV16L:
+    case AARCH64_REV16LL:
+      pat = gen_aarch64_rev16 (mode, target, op0);
+      break;
+    case AARCH64_RBIT:
+    case AARCH64_RBITL:
+    case AARCH64_RBITLL:
+      pat = gen_aarch64_rbit (mode, target, op0);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  emit_insn (pat);
+  return target;
+}
 
 /* Expand an expression EXP as fpsr or fpcr setter (depending on UNSPEC) using
    MODE.  */
@@ -2551,6 +2614,9 @@ aarch64_general_expand_builtin (unsigned int fcode, tree exp, rtx target,
   if (fcode >= AARCH64_MEMTAG_BUILTIN_START
       && fcode <= AARCH64_MEMTAG_BUILTIN_END)
     return aarch64_expand_builtin_memtag (fcode, exp, target);
+  if (fcode >= AARCH64_REV16
+      && fcode <= AARCH64_RBITLL)
+    return aarch64_expand_builtin_data_intrinsic (fcode, exp, target);
 
   gcc_unreachable ();
 }
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index acec8c1146765c0fac73c15351853324b8f03209..ef0aed25c6b26eff61f9f6030dc5921a534e3d19 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4950,7 +4950,7 @@ (define_expand "ffs<mode>2"
     rtx ccreg = aarch64_gen_compare_reg (EQ, operands[1], const0_rtx);
     rtx x = gen_rtx_NE (VOIDmode, ccreg, const0_rtx);
 
-    emit_insn (gen_rbit<mode>2 (operands[0], operands[1]));
+    emit_insn (gen_aarch64_rbit (<MODE>mode, operands[0], operands[1]));
     emit_insn (gen_clz<mode>2 (operands[0], operands[0]));
     emit_insn (gen_csinc3<mode>_insn (operands[0], x, operands[0], const0_rtx));
     DONE;
@@ -4996,7 +4996,7 @@ (define_insn "clrsb<mode>2"
   [(set_attr "type" "clz")]
 )
 
-(define_insn "rbit<mode>2"
+(define_insn "@aarch64_rbit<mode>"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (unspec:GPI [(match_operand:GPI 1 "register_operand" "r")] UNSPEC_RBIT))]
   ""
@@ -5017,7 +5017,7 @@ (define_insn_and_split "ctz<mode>2"
   "reload_completed"
   [(const_int 0)]
   "
-  emit_insn (gen_rbit<mode>2 (operands[0], operands[1]));
+  emit_insn (gen_aarch64_rbit (<MODE>mode, operands[0], operands[1]));
   emit_insn (gen_clz<mode>2 (operands[0], operands[0]));
   DONE;
 ")
@@ -6022,6 +6022,13 @@ (define_insn "bswaphi2"
   [(set_attr "type" "rev")]
 )
 
+(define_insn "@aarch64_rev16<mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+        (unspec:GPI [(match_operand:GPI 1 "register_operand" "r")] UNSPEC_REV))]
+  ""
+  "rev16\\t%<w>0, %<w>1"
+  [(set_attr "type" "rev")])
+
 (define_insn "*aarch64_bfxil<mode>"
   [(set (match_operand:GPI 0 "register_operand" "=r,r")
     (ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "r,0")
diff --git a/gcc/config/aarch64/arm_acle.h b/gcc/config/aarch64/arm_acle.h
index 9775a48c65825b424d3eb442384f5ab87b734fd7..faddd5d0a780c5d65ba430bd3174c701e848c794 100644
--- a/gcc/config/aarch64/arm_acle.h
+++ b/gcc/config/aarch64/arm_acle.h
@@ -28,6 +28,7 @@
 #define _GCC_ARM_ACLE_H
 
 #include <stdint.h>
+#include <stddef.h>
 
 #pragma GCC aarch64 "arm_acle.h"
 
@@ -35,6 +36,54 @@
 extern "C" {
 #endif
 
+#define _GCC_ARM_ACLE_ROR_FN(NAME, TYPE) \
+__extension__ extern __inline TYPE \
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) \
+NAME (TYPE value, uint32_t rotate) \
+{ \
+  size_t size = sizeof (TYPE) * __CHAR_BIT__; \
+  rotate = rotate % size; \
+  return value >> rotate | value << (size - rotate); \
+}
+
+_GCC_ARM_ACLE_ROR_FN (__ror, uint32_t)
+_GCC_ARM_ACLE_ROR_FN (__rorl, unsigned long)
+_GCC_ARM_ACLE_ROR_FN (__rorll, uint64_t)
+
+#define _GCC_ARM_ACLE_DATA_FN(NAME, BUILTIN, TYPE) \
+__extension__ extern __inline TYPE \
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) \
+__##NAME (TYPE value) \
+{ \
+  return __builtin_##BUILTIN (value); \
+}
+
+_GCC_ARM_ACLE_DATA_FN (clz, clz, uint32_t)
+_GCC_ARM_ACLE_DATA_FN (clzl, clzl, unsigned long)
+_GCC_ARM_ACLE_DATA_FN (clzll, clzll, uint64_t)
+_GCC_ARM_ACLE_DATA_FN (cls, clrsb, uint32_t)
+_GCC_ARM_ACLE_DATA_FN (clsl, clrsbl, unsigned long)
+_GCC_ARM_ACLE_DATA_FN (clsll, clrsbll, uint64_t)
+_GCC_ARM_ACLE_DATA_FN (rev16, aarch64_rev16, uint32_t)
+_GCC_ARM_ACLE_DATA_FN (rev16l, aarch64_rev16l, unsigned long)
+_GCC_ARM_ACLE_DATA_FN (rev16ll, aarch64_rev16ll, uint64_t)
+_GCC_ARM_ACLE_DATA_FN (rbit, aarch64_rbit, uint32_t)
+_GCC_ARM_ACLE_DATA_FN (rbitl, aarch64_rbitl, unsigned long)
+_GCC_ARM_ACLE_DATA_FN (rbitll, aarch64_rbitll, uint64_t)
+_GCC_ARM_ACLE_DATA_FN (revsh, bswap16, int16_t)
+_GCC_ARM_ACLE_DATA_FN (rev, bswap32, uint32_t)
+_GCC_ARM_ACLE_DATA_FN (revll, bswap64, uint64_t)
+
+__extension__ extern __inline unsigned long
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__revl (unsigned long __value)
+{
+  if (sizeof (unsigned long) == 8)
+    return __revll (__value);
+  else
+    return __rev (__value);
+}
+
 #pragma GCC push_options
 #pragma GCC target ("arch=armv8.3-a")
 __extension__ extern __inline int32_t
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/data-intrinsics.c b/gcc/testsuite/gcc.target/aarch64/acle/data-intrinsics.c
new file mode 100644
index 0000000000000000000000000000000000000000..90813184704dfcdaf2d24d523ff744aa6cbedf1a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/data-intrinsics.c
@@ -0,0 +1,215 @@
+/* Test the ACLE data intrinsics.  */
+/* { dg-do assemble } */
+/* { dg-additional-options "--save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include "arm_acle.h"
+
+/*
+** test_clz:
+**	clz	w0, w0
+**	ret
+*/
+
+uint32_t test_clz (uint32_t a)
+{
+  return __clz (a);
+}
+
+/*
+** test_clzl:
+**	clz	[wx]0, [wx]0
+**	ret
+*/
+
+unsigned long test_clzl (unsigned long a)
+{
+  return __clzl (a);
+}
+
+/*
+** test_clzll:
+**	clz	x0, x0
+**	ret
+*/
+
+uint64_t test_clzll (uint64_t a)
+{
+  return __clzll (a);
+}
+
+/*
+** test_cls:
+**	cls	w0, w0
+**	ret
+*/
+
+uint32_t test_cls (uint32_t a)
+{
+  return __cls (a);
+}
+
+/*
+** test_clsl:
+**	cls	[wx]0, [wx]0
+**	ret
+*/
+
+unsigned long test_clsl (unsigned long a)
+{
+  return __clsl (a);
+}
+
+/*
+** test_clsll:
+**	cls	x0, x0
+**	ret
+*/
+
+uint64_t test_clsll (uint64_t a)
+{
+  return __clsll (a);
+}
+
+/*
+** test_rbit:
+**	rbit	w0, w0
+**	ret
+*/
+
+uint32_t test_rbit (uint32_t a)
+{
+  return __rbit (a);
+}
+
+/*
+** test_rbitl:
+**	rbit	[wx]0, [wx]0
+**	ret
+*/
+
+unsigned long test_rbitl (unsigned long a)
+{
+  return __rbitl (a);
+}
+
+/*
+** test_rbitll:
+**	rbit	x0, x0
+**	ret
+*/
+
+uint64_t test_rbitll (uint64_t a)
+{
+  return __rbitll (a);
+}
+
+/*
+** test_rev:
+**	rev	w0, w0
+**	ret
+*/
+
+uint32_t test_rev (uint32_t a)
+{
+  return __rev (a);
+}
+
+/*
+** test_revl:
+**	rev	[wx]0, [wx]0
+**	ret
+*/
+
+unsigned long test_revl (unsigned long a)
+{
+  return __revl (a);
+}
+
+/*
+** test_revll:
+**	rev	x0, x0
+**	ret
+*/
+
+uint64_t test_revll (uint64_t a)
+{
+  return __revll (a);
+}
+
+/*
+** test_rev16:
+**	rev16	w0, w0
+**	ret
+*/
+
+uint32_t test_rev16 (uint32_t a)
+{
+  return __rev16 (a);
+}
+
+/*
+** test_rev16l:
+**	rev16	[wx]0, [wx]0
+**	ret
+*/
+
+unsigned long test_rev16l (unsigned long a)
+{
+  return __rev16l (a);
+}
+
+/*
+** test_rev16ll:
+**	rev16	x0, x0
+**	ret
+*/
+
+uint64_t test_rev16ll (uint64_t a)
+{
+  return __rev16ll (a);
+}
+
+/*
+** test_ror:
+**	ror	w0, w0, w1
+**	ret
+*/
+
+uint32_t test_ror (uint32_t a, uint32_t r)
+{
+  return __ror (a, r);
+}
+
+/*
+** test_rorl:
+**	ror	[wx]0, [wx]0, [wx]1
+**	ret
+*/
+
+unsigned long test_rorl (unsigned long a, uint32_t r)
+{
+  return __rorl (a, r);
+}
+
+/*
+** test_rorll:
+**	ror	x0, x0, x1
+**	ret
+*/
+
+uint64_t test_rorll (uint64_t a, uint32_t r)
+{
+  return __rorll (a, r);
+}
+
+/*
+** test_revsh:
+**	rev16	w0, w0
+**	ret
+*/
+
+int16_t test_revsh (int16_t a)
+{
+  return __revsh (a);
+}
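
One detail in the `_GCC_ARM_ACLE_ROR_FN` macro from the arm_acle.h hunk may deserve a second look: after `rotate = rotate % size`, a rotate of 0 (or any multiple of the width) makes `value << (size - rotate)` a shift by the full width of the type, which C leaves undefined. The following is a minimal sketch of one way to avoid that, written as a standalone portable function. `ref_ror32` is a hypothetical name used only for illustration, and whether the compiler still emits a single `ror` for this form would need to be checked against the test above.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: rotate right without ever
   shifting by the full width of the type.  */
static uint32_t
ref_ror32 (uint32_t value, uint32_t rotate)
{
  const uint32_t size = sizeof (uint32_t) * CHAR_BIT;
  rotate %= size;
  /* (size - rotate) % size maps a rotate of 0 to a left shift of 0
     instead of a left shift of 32.  */
  return (value >> rotate) | (value << ((size - rotate) % size));
}
```

The extra `% size` only changes the shift amount when `rotate` is 0, so every other rotate count behaves exactly as in the macro.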