From patchwork Tue Nov 8 14:35:28 2022
X-Patchwork-Submitter: Kwok Cheung Yeung
X-Patchwork-Id: 60203
X-Patchwork-Delegate: ams@gcc.gnu.org
Message-ID: <952c73e5-ba66-0a5a-e33e-1feb6396743e@codesourcery.com>
Date: Tue, 8 Nov 2022 14:35:28 +0000
From: Kwok Cheung Yeung
Subject: [PATCH] amdgcn: Add builtins for vectorized native versions of abs, floorf and floor
To: gcc-patches, Andrew Stubbs

Hello

This patch adds three extra builtins for the vectorized forms of the abs,
floorf and floor math functions, which are implemented by native GCN
instructions. I have also added a test to check that they generate the
expected assembler instructions.

Okay for trunk?

Thanks

Kwok

From 37f49b204d501327d0867b3e8a3f01b9445fb9bd Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung
Date: Tue, 8 Nov 2022 11:59:58 +0000
Subject: [PATCH] amdgcn: Add builtins for vectorized native versions of abs, floorf and floor

2022-11-08  Kwok Cheung Yeung

	gcc/
	* config/gcn/gcn-builtins.def (FABSV, FLOORVF, FLOORV): New builtins.
	* config/gcn/gcn.cc (gcn_expand_builtin_1): Expand GCN_BUILTIN_FABSV,
	GCN_BUILTIN_FLOORVF and GCN_BUILTIN_FLOORV.

	gcc/testsuite/
	* gcc.target/gcn/math-builtins-1.c: New test.
---
 gcc/config/gcn/gcn-builtins.def               | 15 +++++++++
 gcc/config/gcn/gcn.cc                         | 33 +++++++++++++++++++
 .../gcc.target/gcn/math-builtins-1.c          | 33 +++++++++++++++++++
 3 files changed, 81 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/gcn/math-builtins-1.c

-- 
2.25.1

diff --git a/gcc/config/gcn/gcn-builtins.def b/gcc/config/gcn/gcn-builtins.def
index 27691909925..c50777bd3b0 100644
--- a/gcc/config/gcn/gcn-builtins.def
+++ b/gcc/config/gcn/gcn-builtins.def
@@ -64,6 +64,21 @@ DEF_BUILTIN (FABSVF, 3 /*CODE_FOR_fabsvf */,
              _A2 (GCN_BTI_V64SF, GCN_BTI_V64SF),
              gcn_expand_builtin_1)
 
+DEF_BUILTIN (FABSV, 3 /*CODE_FOR_fabsv */,
+             "fabsv", B_INSN,
+             _A2 (GCN_BTI_V64DF, GCN_BTI_V64DF),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (FLOORVF, 3 /*CODE_FOR_floorvf */,
+             "floorvf", B_INSN,
+             _A2 (GCN_BTI_V64SF, GCN_BTI_V64SF),
+             gcn_expand_builtin_1)
+
+DEF_BUILTIN (FLOORV, 3 /*CODE_FOR_floorv */,
+             "floorv", B_INSN,
+             _A2 (GCN_BTI_V64DF, GCN_BTI_V64DF),
+             gcn_expand_builtin_1)
+
 DEF_BUILTIN (LDEXPVF, 3 /*CODE_FOR_ldexpvf */,
              "ldexpvf", B_INSN,
              _A3 (GCN_BTI_V64SF, GCN_BTI_V64SF, GCN_BTI_V64SI),
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 1996115a686..9c5e3419748 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -4329,6 +4329,39 @@ gcn_expand_builtin_1 (tree exp, rtx target, rtx /*subtarget */ ,
         emit_insn (gen_absv64sf2 (target, arg));
         return target;
       }
+    case GCN_BUILTIN_FABSV:
+      {
+        if (ignore)
+          return target;
+        rtx arg = force_reg (V64DFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64DFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_absv64df2 (target, arg));
+        return target;
+      }
+    case GCN_BUILTIN_FLOORVF:
+      {
+        if (ignore)
+          return target;
+        rtx arg = force_reg (V64SFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64SFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_floorv64sf2 (target, arg));
+        return target;
+      }
+    case GCN_BUILTIN_FLOORV:
+      {
+        if (ignore)
+          return target;
+        rtx arg = force_reg (V64DFmode,
+                             expand_expr (CALL_EXPR_ARG (exp, 0), NULL_RTX,
+                                          V64DFmode,
+                                          EXPAND_NORMAL));
+        emit_insn (gen_floorv64df2 (target, arg));
+        return target;
+      }
     case GCN_BUILTIN_LDEXPVF:
       {
         if (ignore)
diff --git a/gcc/testsuite/gcc.target/gcn/math-builtins-1.c b/gcc/testsuite/gcc.target/gcn/math-builtins-1.c
new file mode 100644
index 00000000000..e1aadfb40d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/gcn/math-builtins-1.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O1" } */
+
+typedef float v64sf __attribute__ ((vector_size (256)));
+typedef double v64df __attribute__ ((vector_size (512)));
+typedef int v64si __attribute__ ((vector_size (256)));
+typedef long v64di __attribute__ ((vector_size (512)));
+
+v64sf f (v64sf _x, v64si _y)
+{
+  v64sf x = _x;
+  v64si y = _y;
+  x = __builtin_gcn_fabsvf (x); /* { dg-final { scan-assembler "v_add_f32\\s+v\[0-9\]+, 0, \\|v\[0-9\]+\\|" } } */
+  x = __builtin_gcn_floorvf (x); /* { dg-final { scan-assembler "v_floor_f32\\s+v\[0-9\]+, v\[0-9\]+" } }*/
+  x = __builtin_gcn_frexpvf_mant (x); /* { dg-final { scan-assembler "v_frexp_mant_f32\\s+v\[0-9\]+, v\[0-9\]+" } }*/
+  y = __builtin_gcn_frexpvf_exp (x); /* { dg-final { scan-assembler "v_frexp_exp_i32_f32\\s+v\[0-9\]+, v\[0-9\]+" } }*/
+  x = __builtin_gcn_ldexpvf (x, y); /* { dg-final { scan-assembler "v_ldexp_f32\\s+v\[0-9\]+, v\[0-9\]+, v\[0-9\]+" } }*/
+
+  return x;
+}
+
+v64df g (v64df _x, v64si _y)
+{
+  v64df x = _x;
+  v64si y = _y;
+  x = __builtin_gcn_fabsv (x); /* { dg-final { scan-assembler "v_add_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], 0, \\|v\\\[\[0-9\]+:\[0-9\]+\\\]\\|" } } */
+  x = __builtin_gcn_floorv (x); /* { dg-final { scan-assembler "v_floor_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], v\\\[\[0-9\]+:\[0-9]+\\\]" } }*/
+  x = __builtin_gcn_frexpv_mant (x); /* { dg-final { scan-assembler "v_frexp_mant_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], v\\\[\[0-9\]+:\[0-9]+\\\]" } }*/
+  y = __builtin_gcn_frexpv_exp (x); /* { dg-final { scan-assembler "v_frexp_exp_i32_f64\\s+v\[0-9\]+, v\\\[\[0-9\]+:\[0-9]+\\\]" } }*/
+  x = __builtin_gcn_ldexpv (x, y); /* { dg-final { scan-assembler "v_ldexp_f64\\s+v\\\[\[0-9\]+:\[0-9]+\\\], v\\\[\[0-9\]+:\[0-9]+\\\], v\[0-9\]+" } }*/
+
+  return x;
+}
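
For readers without an amdgcn toolchain, here is a minimal host-side sketch of the per-lane behaviour one would expect from the three new builtins. It is not part of the patch and does not call the builtins themselves. The names ref_fabsv, ref_floorvf, ref_floorv and lane_floor are hypothetical, chosen only for this illustration. The hand-written lane_floor avoids a libm dependency and is only meant for finite inputs whose magnitude fits in a long long. It uses the same GCC vector-extension typedefs as the test above, so it should build with any GCC or Clang host compiler.

```c
/* Host-side reference model (assumption: illustrative only, not the
   GCN builtins).  Each function applies a scalar operation to all 64
   lanes of a GCC generic vector.  */
#include <assert.h>

#define LANES 64

typedef float v64sf __attribute__ ((vector_size (256)));
typedef double v64df __attribute__ ((vector_size (512)));

static_assert (sizeof (v64sf) == LANES * sizeof (float), "v64sf has 64 lanes");
static_assert (sizeof (v64df) == LANES * sizeof (double), "v64df has 64 lanes");

/* Scalar floor without libm.  Only meant for finite inputs whose
   magnitude fits in a long long.  */
static double
lane_floor (double v)
{
  double t = (double) (long long) v;	/* Truncates toward zero.  */
  return t > v ? t - 1.0 : t;		/* Step down for negative fractions.  */
}

/* Hypothetical stand-in for __builtin_gcn_fabsv: |x| in every lane.
   Unlike a sign-bit clear, this comparison leaves -0.0 unchanged.  */
v64df
ref_fabsv (v64df x)
{
  for (int i = 0; i < LANES; i++)
    x[i] = x[i] < 0.0 ? -x[i] : x[i];
  return x;
}

/* Hypothetical stand-in for __builtin_gcn_floorvf: floor of every
   float lane.  The float -> double -> float round trip is exact.  */
v64sf
ref_floorvf (v64sf x)
{
  for (int i = 0; i < LANES; i++)
    x[i] = (float) lane_floor (x[i]);
  return x;
}

/* Hypothetical stand-in for __builtin_gcn_floorv: floor of every
   double lane.  */
v64df
ref_floorv (v64df x)
{
  for (int i = 0; i < LANES; i++)
    x[i] = lane_floor (x[i]);
  return x;
}
```

A model like this could serve as an oracle for a run-time test that compares each lane of the builtin's result against the reference, complementing the assembler scan in math-builtins-1.c, which only checks instruction selection.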