From patchwork Thu Jun 8 10:31:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oluwatamilore Adebayo
X-Patchwork-Id: 70780
Subject: [PATCH 1/2] Missed opportunity to use [SU]ABD
Date: Thu, 8 Jun 2023 11:31:03 +0100
Message-ID: <20230608103103.23794-1-oluwatamilore.adebayo@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230608102830.23565-1-oluwatamilore.adebayo@arm.com>
References: <20230608102830.23565-1-oluwatamilore.adebayo@arm.com>
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Oluwatamilore Adebayo via Gcc-patches
From: Oluwatamilore Adebayo
Reply-To: Oluwatamilore Adebayo
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"

From: oluade01

This adds a recognition pattern for the non-widening
absolute difference (ABD).

gcc/ChangeLog:

        * doc/md.texi (sabd, uabd): Document them.
        * internal-fn.def (ABD): Use new optab.
        * optabs.def (sabd_optab, uabd_optab): New optabs.
        * tree-vect-patterns.cc (vect_recog_absolute_difference):
        Recognize the following idiom abs (a - b).
        (vect_recog_sad_pattern): Refactor to use
        vect_recog_absolute_difference.
        (vect_recog_abd_pattern): Use patterns found by
        vect_recog_absolute_difference to build a new ABD
        internal call.
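For illustration only (this example is not part of the patch or its
testsuite), a loop of roughly the following shape contains the
abs (a - b) idiom that vect_recog_absolute_difference looks for.  The
names abd_loop and abd_ref are made up here; abd_ref simply restates the
C equivalent given in the md.texi text.  Whether such a loop is actually
vectorised to [SU]ABD depends on the target providing the optab and on
the other checks in the recogniser.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example, not taken from the patch.  After integer
   promotion the subtraction is done in int, so the loop body has the
   shape DIFF = X - Y; DAD = ABS_EXPR <DIFF>; out[i] = DAD.  */
void
abd_loop (const int8_t *x, const int8_t *y, int8_t *out, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = (int8_t) abs (x[i] - y[i]);
}

/* Scalar reference that follows the md.texi wording:
   op0 = op1 > op2 ? op1 - op2 : op2 - op1.  */
int
abd_ref (int op1, int op2)
{
  return op1 > op2 ? op1 - op2 : op2 - op1;
}
```

For inputs whose difference fits in the element type, abd_loop and
abd_ref agree element by element.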
---
 gcc/doc/md.texi           |  10 ++
 gcc/internal-fn.def       |   3 +
 gcc/optabs.def            |   2 +
 gcc/tree-vect-patterns.cc | 259 +++++++++++++++++++++++++++++++++-----
 4 files changed, 244 insertions(+), 30 deletions(-)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 6a435eb44610960513e9739ac9ac1e8a27182c10..e11b10d2fca11016232921bc85e47975f700e6c6 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5787,6 +5787,16 @@ Other shift and rotate instructions, analogous to the
 Vector shift and rotate instructions that take vectors as operand 2
 instead of a scalar type.
 
+@cindex @code{uabd@var{m}} instruction pattern
+@cindex @code{sabd@var{m}} instruction pattern
+@item @samp{uabd@var{m}}, @samp{sabd@var{m}}
+Signed and unsigned absolute difference instructions.  These
+instructions find the difference between operands 1 and 2
+then return the absolute value.  A C code equivalent would be:
+@smallexample
+op0 = op1 > op2 ? op1 - op2 : op2 - op1;
+@end smallexample
+
 @cindex @code{avg@var{m}3_floor} instruction pattern
 @cindex @code{uavg@var{m}3_floor} instruction pattern
 @item @samp{avg@var{m}3_floor}
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 3ac9d82aace322bd8ef108596e5583daa18c76e3..116965f4830cec8f60642ff011a86b6562e2c509 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -191,6 +191,9 @@ DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary)
 DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary)
 
+DEF_INTERNAL_SIGNED_OPTAB_FN (ABD, ECF_CONST | ECF_NOTHROW, first,
+                              sabd, uabd, binary)
+
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_FLOOR, ECF_CONST | ECF_NOTHROW, first,
                               savg_floor, uavg_floor, binary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (AVG_CEIL, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 6c064ff4993620067d38742a0bfe0a3efb511069..35b835a6ac56d72417dac8ddfd77a8a7e2475e65 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -359,6 +359,8 @@ OPTAB_D (mask_fold_left_plus_optab, "mask_fold_left_plus_$a")
 OPTAB_D (extract_last_optab, "extract_last_$a")
 OPTAB_D (fold_extract_last_optab, "fold_extract_last_$a")
 
+OPTAB_D (uabd_optab, "uabd$a3")
+OPTAB_D (sabd_optab, "sabd$a3")
 OPTAB_D (savg_floor_optab, "avg$a3_floor")
 OPTAB_D (uavg_floor_optab, "uavg$a3_floor")
 OPTAB_D (savg_ceil_optab, "avg$a3_ceil")
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index dc102c919352a0328cf86eabceb3a38c41a7e4fd..7296892aaa07da59b8122d29a22a2f583e8ff5aa 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -782,6 +782,100 @@ vect_split_statement (vec_info *vinfo, stmt_vec_info stmt2_info, tree new_rhs,
     }
 }
 
+/* Look for the following pattern
+        X = x[i]
+        Y = y[i]
+        DIFF = X - Y
+        DAD = ABS_EXPR <DIFF>
+
+   ABS_STMT should point to a statement of code ABS_EXPR or ABSU_EXPR.
+   HALF_TYPE and UNPROM will be set should the statement be found to
+   be a widened operation.
+   DIFF_STMT will be set to the MINUS_EXPR
+   statement that precedes the ABS_STMT unless vect_widened_op_tree
+   succeeds.
+ */
+static bool
+vect_recog_absolute_difference (vec_info *vinfo, gassign *abs_stmt,
+                                tree *half_type,
+                                vect_unpromoted_value unprom[2],
+                                gassign **diff_stmt)
+{
+  if (!abs_stmt)
+    return false;
+
+  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
+     inside the loop (in case we are analyzing an outer-loop).  */
+  enum tree_code code = gimple_assign_rhs_code (abs_stmt);
+  if (code != ABS_EXPR && code != ABSU_EXPR)
+    return false;
+
+  tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
+  tree abs_type = TREE_TYPE (abs_oprnd);
+  if (!abs_oprnd)
+    return false;
+  if (!ANY_INTEGRAL_TYPE_P (abs_type)
+      || TYPE_OVERFLOW_WRAPS (abs_type)
+      || TYPE_UNSIGNED (abs_type))
+    return false;
+
+  /* Peel off conversions from the ABS input.  This can involve sign
+     changes (e.g. from an unsigned subtraction to a signed ABS input)
+     or signed promotion, but it can't include unsigned promotion.
+     (Note that ABS of an unsigned promotion should have been folded
+     away before now anyway.)  */
+  vect_unpromoted_value unprom_diff;
+  abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd,
+                                                    &unprom_diff);
+  if (!abs_oprnd)
+    return false;
+  if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type)
+      && TYPE_UNSIGNED (unprom_diff.type))
+    return false;
+
+  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
+  stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd);
+  if (!diff_stmt_vinfo)
+    return false;
+
+  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
+     inside the loop (in case we are analyzing an outer-loop).  */
+  if (vect_widened_op_tree (vinfo, diff_stmt_vinfo,
+                            MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+                            false, 2, unprom, half_type))
+    return true;
+
+  /* Failed to find a widen operation so we check for a regular MINUS_EXPR.  */
+  gassign *diff = dyn_cast <gassign *> (STMT_VINFO_STMT (diff_stmt_vinfo));
+  if (!diff || gimple_assign_rhs_code (diff) != MINUS_EXPR)
+    return false;
+
+  if (diff_stmt)
+    *diff_stmt = diff;
+
+  if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (abs_oprnd)))
+    {
+      *half_type = NULL_TREE;
+      return true;
+    }
+
+  vect_unpromoted_value diff0, diff1;
+  if (vect_look_through_possible_promotion (vinfo, gimple_assign_rhs1 (diff),
+                                            &diff0)
+      && vect_look_through_possible_promotion (vinfo, gimple_assign_rhs2 (diff),
+                                               &diff1))
+    {
+      if (TYPE_OVERFLOW_UNDEFINED (diff0.type)
+          && TYPE_OVERFLOW_UNDEFINED (diff1.type))
+        {
+          *half_type = NULL_TREE;
+          return true;
+        }
+    }
+
+  return false;
+}
+
 /* Convert UNPROM to TYPE and return the result, adding new statements
    to STMT_INFO's pattern definition statements if no better way is
    available.  VECTYPE is the vector form of TYPE.
@@ -1320,41 +1414,31 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   gassign *abs_stmt = dyn_cast <gassign *> (abs_stmt_vinfo->stmt);
-  if (!abs_stmt
-      || (gimple_assign_rhs_code (abs_stmt) != ABS_EXPR
-          && gimple_assign_rhs_code (abs_stmt) != ABSU_EXPR))
-    return NULL;
+  vect_unpromoted_value unprom[2];
 
-  tree abs_oprnd = gimple_assign_rhs1 (abs_stmt);
-  tree abs_type = TREE_TYPE (abs_oprnd);
-  if (TYPE_UNSIGNED (abs_type))
-    return NULL;
+  if (!abs_stmt)
+    {
+      gcall *abd_stmt = dyn_cast <gcall *> (abs_stmt_vinfo->stmt);
+      if (!abd_stmt
+          || !gimple_call_internal_p (abd_stmt)
+          || gimple_call_internal_fn (abd_stmt) != IFN_ABD)
+        return NULL;
 
-  /* Peel off conversions from the ABS input.  This can involve sign
-     changes (e.g. from an unsigned subtraction to a signed ABS input)
-     or signed promotion, but it can't include unsigned promotion.
-     (Note that ABS of an unsigned promotion should have been folded
-     away before now anyway.)  */
-  vect_unpromoted_value unprom_diff;
-  abs_oprnd = vect_look_through_possible_promotion (vinfo, abs_oprnd,
-                                                    &unprom_diff);
-  if (!abs_oprnd)
-    return NULL;
-  if (TYPE_PRECISION (unprom_diff.type) != TYPE_PRECISION (abs_type)
-      && TYPE_UNSIGNED (unprom_diff.type))
-    return NULL;
+      tree abd_oprnd0 = gimple_call_arg (abd_stmt, 0);
+      tree abd_oprnd1 = gimple_call_arg (abd_stmt, 1);
 
-  /* We then detect if the operand of abs_expr is defined by a minus_expr.  */
-  stmt_vec_info diff_stmt_vinfo = vect_get_internal_def (vinfo, abs_oprnd);
-  if (!diff_stmt_vinfo)
+      if (!vect_look_through_possible_promotion (vinfo, abd_oprnd0, &unprom[0])
+          || !vect_look_through_possible_promotion (vinfo, abd_oprnd1,
+                                                    &unprom[1]))
+        return NULL;
+
+      half_type = unprom[0].type;
+    }
+  else if (!vect_recog_absolute_difference (vinfo, abs_stmt, &half_type,
+                                            unprom, NULL))
     return NULL;
 
-  /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
-     inside the loop (in case we are analyzing an outer-loop).  */
-  vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
-                             IFN_VEC_WIDEN_MINUS,
-                             false, 2, unprom, &half_type))
+  if (!half_type)
     return NULL;
 
   vect_pattern_detected ("vect_recog_sad_pattern", last_stmt);
@@ -1376,6 +1460,120 @@ vect_recog_sad_pattern (vec_info *vinfo,
   return pattern_stmt;
 }
 
+/* Function vect_recog_abd_pattern
+
+   Try to find the following ABsolute Difference (ABD) pattern:
+
+   VTYPE x, y, out;
+   type diff;
+   loop i in range:
+   S1 diff = x[i] - y[i]
+   S2 out[i] = ABS_EXPR <diff>;
+
+   where 'type' is an integer and 'VTYPE' is a vector of integers
+   the same size as 'type'
+
+   Input:
+
+   * STMT_VINFO: The stmt from which the pattern search begins
+
+   Output:
+
+   * TYPE_out: The type of the output of this pattern
+
+   * Return value: A new stmt that will be used to replace the sequence of
+     stmts that constitute the pattern; either SABD or UABD:
+        SABD_EXPR
+        UABD_EXPR
+ */
+
+static gimple *
+vect_recog_abd_pattern (vec_info *vinfo,
+                        stmt_vec_info stmt_vinfo, tree *type_out)
+{
+  /* Look for the following patterns
+        X = x[i]
+        Y = y[i]
+        DIFF = X - Y
+        DAD = ABS_EXPR <DIFF>
+        out[i] = DAD
+
+     In which
+      - X, Y, DIFF, DAD all have the same type
+      - x, y, out are all vectors of the same type
+  */
+
+  gassign *last_stmt = dyn_cast <gassign *> (STMT_VINFO_STMT (stmt_vinfo));
+  if (!last_stmt)
+    return NULL;
+
+  tree out_type = TREE_TYPE (gimple_assign_lhs (last_stmt));
+
+  vect_unpromoted_value unprom[2];
+  gassign *diff_stmt;
+  tree half_type;
+  if (!vect_recog_absolute_difference (vinfo, last_stmt, &half_type,
+                                       unprom, &diff_stmt))
+    return NULL;
+
+  tree abd_type = out_type, vectype;
+  tree abd_oprnds[2];
+  bool extend = false;
+  if (half_type)
+    {
+      vectype = get_vectype_for_scalar_type (vinfo, half_type);
+      if (!vectype)
+        return NULL;
+
+      abd_type = half_type;
+      vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
+                           half_type, unprom, vectype);
+
+      extend = TYPE_PRECISION (abd_type) < TYPE_PRECISION (out_type);
+    }
+  else
+    {
+      unprom[0].op = gimple_assign_rhs1 (diff_stmt);
+      unprom[1].op = gimple_assign_rhs2 (diff_stmt);
+      tree signed_out = signed_type_for (out_type);
+      vectype = get_vectype_for_scalar_type (vinfo, signed_out);
+      if (!vectype)
+        return NULL;
+
+      vect_convert_inputs (vinfo, stmt_vinfo, 2, abd_oprnds,
+                           signed_out, unprom, vectype);
+    }
+
+  vect_pattern_detected ("vect_recog_abd_pattern", last_stmt);
+
+  if (!vectype
+      || !direct_internal_fn_supported_p (IFN_ABD, vectype,
+                                          OPTIMIZE_FOR_SPEED))
+    return NULL;
+
+  *type_out = get_vectype_for_scalar_type (vinfo, out_type);
+
+  tree abd_result = vect_recog_temp_ssa_var (abd_type, NULL);
+  gcall *abd_stmt = gimple_build_call_internal (IFN_ABD, 2,
+                                                abd_oprnds[0], abd_oprnds[1]);
+  gimple_call_set_lhs (abd_stmt, abd_result);
+  gimple_set_location (abd_stmt, gimple_location (last_stmt));
+
+  if (!extend)
+    return abd_stmt;
+
+  gimple *stmt = abd_stmt;
+  if (!TYPE_UNSIGNED (abd_type))
+    {
+      tree unsign = unsigned_type_for (abd_type);
+      tree unsign_vectype = get_vectype_for_scalar_type (vinfo, unsign);
+      stmt = vect_convert_output (vinfo, stmt_vinfo, unsign, stmt,
+                                  unsign_vectype);
+    }
+
+  return vect_convert_output (vinfo, stmt_vinfo, out_type, stmt, vectype);
+}
+
 /* Recognize an operation that performs ORIG_CODE on widened inputs,
    so that it can be treated as though it had the form:
 
@@ -6471,6 +6669,7 @@ struct vect_recog_func
 static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_bitfield_ref_pattern, "bitfield_ref" },
   { vect_recog_bit_insert_pattern, "bit_insert" },
+  { vect_recog_abd_pattern, "abd" },
   { vect_recog_over_widening_pattern, "over_widening" },
   /* Must come after over_widening, which narrows the shift as much as
      possible beforehand.  */

From patchwork Thu Jun 8 10:38:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oluwatamilore Adebayo
X-Patchwork-Id: 70781
Subject: [PATCH 2/2] AArch64: New RTL for ABD
Date: Thu, 8 Jun 2023 11:38:33 +0100
Message-ID: <20230608103833.25420-1-oluwatamilore.adebayo@arm.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230608103713.25297-1-oluwatamilore.adebayo@arm.com>
References: <20230608103713.25297-1-oluwatamilore.adebayo@arm.com>
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Oluwatamilore Adebayo via Gcc-patches
From: Oluwatamilore Adebayo
Reply-To: Oluwatamilore Adebayo
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"

From: oluade01

This patch adds new RTL and tests for sabd and uabd

PR tree-optimization/109156

gcc/ChangeLog:

	* config/aarch64/aarch64-simd-builtins.def (sabd, uabd):
	Change the mode to 3.
	* config/aarch64/aarch64-simd.md (aarch64_abd):
	Rename to abd3.
	* config/aarch64/aarch64-sve.md (abd_3): Rename
	to abd3.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/abd.h: New file.
	* gcc.target/aarch64/abd_2.c: New test.
	* gcc.target/aarch64/abd_3.c: New test.
	* gcc.target/aarch64/abd_4.c: New test.
	* gcc.target/aarch64/abd_none_2.c: New test.
	* gcc.target/aarch64/abd_none_3.c: New test.
	* gcc.target/aarch64/abd_none_4.c: New test.
	* gcc.target/aarch64/abd_run_1.c: New test.
	* gcc.target/aarch64/sve/abd_1.c: New test.
	* gcc.target/aarch64/sve/abd_none_1.c: New test.
	* gcc.target/aarch64/sve/abd_2.c: New test.
	* gcc.target/aarch64/sve/abd_none_2.c: New test.
---
 gcc/config/aarch64/aarch64-simd-builtins.def  |  6 +-
 gcc/config/aarch64/aarch64-simd.md            |  4 +-
 gcc/config/aarch64/aarch64-sve.md             |  4 +-
 gcc/testsuite/gcc.target/aarch64/abd.h        | 68 ++++++++++++++
 gcc/testsuite/gcc.target/aarch64/abd_2.c      | 35 +++++++
 gcc/testsuite/gcc.target/aarch64/abd_3.c      | 36 +++++++
 gcc/testsuite/gcc.target/aarch64/abd_4.c      | 30 ++++++
 gcc/testsuite/gcc.target/aarch64/abd_none_2.c | 14 +++
 gcc/testsuite/gcc.target/aarch64/abd_none_3.c | 14 +++
 gcc/testsuite/gcc.target/aarch64/abd_none_4.c | 19 ++++
 gcc/testsuite/gcc.target/aarch64/abd_run_1.c  | 93 +++++++++++++++++++
 gcc/testsuite/gcc.target/aarch64/sve/abd_1.c  | 35 +++++++
 gcc/testsuite/gcc.target/aarch64/sve/abd_2.c  | 32 +++++++
 .../gcc.target/aarch64/sve/abd_none_1.c       | 13 +++
 .../gcc.target/aarch64/sve/abd_none_2.c       | 18 ++++
 15 files changed, 414 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd.h
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_none_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_none_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_none_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/abd_run_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_none_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/abd_none_2.c

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 1beaa08c1e7c94bc13a64865ddb677345534699c..3efbf0a1874f6242e69665b8316d9a7d62a9c8cf 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -194,9 +194,9 @@
   BUILTIN_VDQV_L (UNOP, saddlv, 0, NONE)
   BUILTIN_VDQV_L (UNOPU, uaddlv, 0, NONE)
 
-  /* Implemented by aarch64_abd. */
-  BUILTIN_VDQ_BHSI (BINOP, sabd, 0, NONE)
-  BUILTIN_VDQ_BHSI (BINOPU, uabd, 0, NONE)
+  /* Implemented by abd3. */
+  BUILTIN_VDQ_BHSI (BINOP, sabd, 3, NONE)
+  BUILTIN_VDQ_BHSI (BINOPU, uabd, 3, NONE)
 
   /* Implemented by aarch64_aba. */
   BUILTIN_VDQ_BHSI (TERNOP, saba, 0, NONE)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index a567f016b354c0f0542e58e7b51c0be739882d65..da35a928bac91db61f4e9884d9c8b162c3a3c937 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -896,7 +896,7 @@ (define_insn "aarch64_abs"
 ;; So (ABS:QI (minus:QI 64 -128)) == (ABS:QI (192 or -64 signed)) == 64.
 ;; Whereas SABD would return 192 (-64 signed) on the above example.
 ;; Use MINUS ([us]max (op1, op2), [us]min (op1, op2)) instead.
-(define_insn "aarch64_abd"
+(define_insn "abd3"
   [(set (match_operand:VDQ_BHSI 0 "register_operand" "=w")
         (minus:VDQ_BHSI
           (USMAX:VDQ_BHSI
@@ -1087,7 +1087,7 @@ (define_expand "sadv16qi"
   {
     rtx ones = force_reg (V16QImode, CONST1_RTX (V16QImode));
     rtx abd = gen_reg_rtx (V16QImode);
-    emit_insn (gen_aarch64_abdv16qi (abd, operands[1], operands[2]));
+    emit_insn (gen_abdv16qi3 (abd, operands[1], operands[2]));
     emit_insn (gen_udot_prodv16qi (operands[0], abd, ones, operands[3]));
     DONE;
   }
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index 2898b85376b831c2728b806e0f2079086345f1fe..2de651a1989c6b36272dd78a8744c700ebc75c1a 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -4001,7 +4001,7 @@ (define_insn_and_rewrite "*aarch64_adr_shift_uxtw"
 ;; -------------------------------------------------------------------------
 
 ;; Unpredicated integer absolute difference.
-(define_expand "abd_3"
+(define_expand "abd3"
   [(use (match_operand:SVE_I 0 "register_operand"))
    (USMAX:SVE_I
      (match_operand:SVE_I 1 "register_operand")
@@ -6973,7 +6973,7 @@ (define_expand "sad"
   {
     rtx ones = force_reg (mode, CONST1_RTX (mode));
     rtx diff = gen_reg_rtx (mode);
-    emit_insn (gen_abd_3 (diff, operands[1], operands[2]));
+    emit_insn (gen_abd3 (diff, operands[1], operands[2]));
     emit_insn (gen_udot_prod (operands[0], diff, ones, operands[3]));
     DONE;
   }
diff --git a/gcc/testsuite/gcc.target/aarch64/abd.h b/gcc/testsuite/gcc.target/aarch64/abd.h
new file mode 100644
index 0000000000000000000000000000000000000000..b95fd908d91d9e576e4d76638844e22deb50a006
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd.h
@@ -0,0 +1,68 @@
+#ifdef ABD_IDIOM
+
+#define TEST1(S, TYPE)                         \
+__attribute__((noipa))                         \
+void fn_##S##_##TYPE (S TYPE * restrict a,     \
+                      S TYPE * restrict b,     \
+                      S TYPE * restrict out) { \
+  for (int i = 0; i < N; i++) {                \
+    signed TYPE diff = b[i] - a[i];            \
+    out[i] = diff > 0 ? diff : -diff;          \
+} }
+
+#define TEST2(S, TYPE1, TYPE2)                 \
+__attribute__((noipa))                         \
+void fn_##S##_##TYPE1##_##TYPE1##_##TYPE2      \
+    (S TYPE1 * restrict a,                     \
+     S TYPE1 * restrict b,                     \
+     S TYPE2 * restrict out) {                 \
+  for (int i = 0; i < N; i++) {                \
+    signed TYPE2 diff = b[i] - a[i];           \
+    out[i] = diff > 0 ? diff : -diff;          \
+} }
+
+#define TEST3(S, TYPE1, TYPE2, TYPE3)          \
+__attribute__((noipa))                         \
+void fn_##S##_##TYPE1##_##TYPE2##_##TYPE3      \
+    (S TYPE1 * restrict a,                     \
+     S TYPE2 * restrict b,                     \
+     S TYPE3 * restrict out) {                 \
+  for (int i = 0; i < N; i++) {                \
+    signed TYPE3 diff = b[i] - a[i];           \
+    out[i] = diff > 0 ? diff : -diff;          \
+} }
+
+#endif
+
+#ifdef ABD_ABS
+
+#define TEST1(S, TYPE)                         \
+__attribute__((noipa))                         \
+void fn_##S##_##TYPE (S TYPE * restrict a,     \
+                      S TYPE * restrict b,     \
+                      S TYPE * restrict out) { \
+  for (int i = 0; i < N; i++)                  \
+    out[i] = __builtin_abs(a[i] - b[i]);       \
+}
+
+#define TEST2(S, TYPE1, TYPE2)                 \
+__attribute__((noipa))                         \
+void fn_##S##_##TYPE1##_##TYPE1##_##TYPE2      \
+    (S TYPE1 * restrict a,                     \
+     S TYPE1 * restrict b,                     \
+     S TYPE2 * restrict out) {                 \
+  for (int i = 0; i < N; i++)                  \
+    out[i] = __builtin_abs(a[i] - b[i]);       \
+}
+
+#define TEST3(S, TYPE1, TYPE2, TYPE3)          \
+__attribute__((noipa))                         \
+void fn_##S##_##TYPE1##_##TYPE2##_##TYPE3      \
+    (S TYPE1 * restrict a,                     \
+     S TYPE2 * restrict b,                     \
+     S TYPE3 * restrict out) {                 \
+  for (int i = 0; i < N; i++)                  \
+    out[i] = __builtin_abs(a[i] - b[i]);       \
+}
+
+#endif
diff --git a/gcc/testsuite/gcc.target/aarch64/abd_2.c b/gcc/testsuite/gcc.target/aarch64/abd_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..c0d41fb7ef99baf95b0f6a2e68a88f6748482af3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd_2.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#pragma GCC target "+nosve"
+#define N 1024
+
+#define ABD_ABS
+#include "abd.h"
+
+TEST1(signed, int)
+TEST1(signed, short)
+TEST1(signed, char)
+
+TEST2(signed, char, short)
+TEST2(signed, char, int)
+TEST2(signed, short, int)
+
+TEST3(signed, char, int, short)
+TEST3(signed, char, short, int)
+
+TEST1(unsigned, short)
+TEST1(unsigned, char)
+
+TEST2(unsigned, char, short)
+TEST2(unsigned, char, int)
+
+TEST3(unsigned, char, short, int)
+
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 5 } } */
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, v\[0-9\]+\.8h" 4 } } */
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, v\[0-9\]+\.16b" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, v\[0-9\]+\.8h" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, v\[0-9\]+\.16b" 3 } } */
+
+/* { dg-final { scan-assembler-not {\tabs\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/abd_3.c b/gcc/testsuite/gcc.target/aarch64/abd_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..4873c64f34885b3993010beafa01087569336dec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd_3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#pragma GCC target "arch=armv8-a"
+#define N 1024
+
+#define ABD_ABS
+#include "abd.h"
+
+TEST1(signed, int)
+TEST1(signed, short)
+TEST1(signed, char)
+
+TEST2(signed, char, short)
+TEST2(signed, char, int)
+TEST2(signed, short, int)
+
+TEST3(signed, char, int, short)
+TEST3(signed, char, short, int)
+
+TEST1(unsigned, short)
+TEST1(unsigned, char)
+
+TEST2(unsigned, char, short)
+TEST2(unsigned, char, int)
+TEST2(unsigned, short, int)
+
+TEST3(unsigned, char, short, int)
+
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 5 } } */
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, v\[0-9\]+\.8h" 4 } } */
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, v\[0-9\]+\.16b" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, v\[0-9\]+\.8h" 4 } } */
+/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, v\[0-9\]+\.16b" 3 } } */
+
+/* { dg-final { scan-assembler-not {\tabs\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/abd_4.c b/gcc/testsuite/gcc.target/aarch64/abd_4.c
new file mode 100644
index 0000000000000000000000000000000000000000..4670ec8fe92f3cd4b1bb0a15125fc66231b2be0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd_4.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#pragma GCC target "+nosve"
+#define N 1024
+
+#define ABD_IDIOM
+#include "abd.h"
+
+TEST1(signed, int)
+TEST1(signed, short)
+TEST1(signed, char)
+
+TEST2(signed, char, short)
+TEST2(signed, char, int)
+TEST2(signed, short, int)
+
+TEST3(signed, char, short, int)
+
+TEST2(unsigned, char, short)
+TEST2(unsigned, char, int)
+TEST2(unsigned, short, int)
+
+TEST3(unsigned, char, short, int)
+
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, v\[0-9\]+\.8h" 4 } } */
+/* { dg-final { scan-assembler-times "sabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, v\[0-9\]+\.16b" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h, v\[0-9\]+\.8h" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b, v\[0-9\]+\.16b" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/abd_none_2.c b/gcc/testsuite/gcc.target/aarch64/abd_none_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..658e7426965ead945d0cad68ef721f176fb41665
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd_none_2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#pragma GCC target "+nosve"
+#define N 1024
+
+#define ABD_ABS
+#include "abd.h"
+
+TEST1(unsigned, int)
+TEST3(unsigned, char, int, short)
+
+/* { dg-final { scan-assembler-not {\tsabd\t} } } */
+/* { dg-final { scan-assembler-not {\tuabd\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/abd_none_3.c b/gcc/testsuite/gcc.target/aarch64/abd_none_3.c
new file mode 100644
index 0000000000000000000000000000000000000000..14cfdcbde6998b527989326ff8848d071a4774e8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd_none_3.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast" } */
+
+#pragma GCC target "arch=armv8-a"
+#define N 1024
+
+#define ABD_ABS
+#include "abd.h"
+
+TEST1(unsigned, int)
+TEST3(unsigned, char, int, short)
+
+/* { dg-final { scan-assembler-not {\tsabd\t} } } */
+/* { dg-final { scan-assembler-not {\tuabd\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/abd_none_4.c b/gcc/testsuite/gcc.target/aarch64/abd_none_4.c
new file mode 100644
index 0000000000000000000000000000000000000000..667acf2c3c8156fcf5ff9a56b40fdf2cf2aa5502
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd_none_4.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#pragma GCC target "+nosve"
+#define N 1024
+
+#define ABD_IDIOM
+#include "abd.h"
+
+TEST3(signed, char, int, short)
+
+TEST1(unsigned, int)
+TEST1(unsigned, short)
+TEST1(unsigned, char)
+
+TEST3(unsigned, char, int, short)
+
+/* { dg-final { scan-assembler-not {\tsabd\t} } } */
+/* { dg-final { scan-assembler-not {\tuabd\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/abd_run_1.c b/gcc/testsuite/gcc.target/aarch64/abd_run_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..7bb0a801415ffeab235bd636032112228255e836
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/abd_run_1.c
@@ -0,0 +1,93 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+#pragma GCC target "+nosve"
+#define N 16
+
+#define ABD_ABS
+#include "abd.h"
+
+TEST1(signed, int)
+TEST1(signed, short)
+TEST1(signed, char)
+
+TEST1(unsigned, int)
+TEST1(unsigned, short)
+TEST1(unsigned, char)
+
+#define EMPTY { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
+#define sA { -50, -50, -50, -50, -50, -50, -50, -50, -50, -50, -50, -50, -50, -50, -50, -50 }
+#define uA { 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100 }
+#define B { 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25 }
+#define GOLD { 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75 }
+
+typedef signed char s8;
+typedef unsigned char u8;
+typedef signed short s16;
+typedef unsigned short u16;
+typedef signed int s32;
+typedef unsigned int u32;
+
+s8 sc_out[] = EMPTY;
+u8 uc_out[] = EMPTY;
+s16 ss_out[] = EMPTY;
+u16 us_out[] = EMPTY;
+s32 si_out[] = EMPTY;
+u32 ui_out[] = EMPTY;
+
+s8 sc_A[] = sA;
+s8 sc_B[] = B;
+u8 uc_A[] = uA;
+u8 uc_B[] = B;
+
+s16 ss_A[] = sA;
+s16 ss_B[] = B;
+u16 us_A[] = uA;
+u16 us_B[] = B;
+
+s32 si_A[] = sA;
+s32 si_B[] = B;
+u32 ui_A[] = uA;
+u32 ui_B[] = B;
+
+s8 sc_gold[] = GOLD;
+u8 uc_gold[] = GOLD;
+s16 ss_gold[] = GOLD;
+u16 us_gold[] = GOLD;
+s32 si_gold[] = GOLD;
+u32 ui_gold[] = GOLD;
+
+extern void abort (void);
+
+#define CLEAR(arr)          \
+for (int i = 0; i < N; i++) \
+  arr[i] = 0;
+
+#define COMPARE(A, B)       \
+for (int i = 0; i < N; i++) \
+  if (A[i] != B[i])         \
+    abort();
+
+int main ()
+{
+  fn_signed_char (sc_A, sc_B, sc_out);
+  COMPARE (sc_out, sc_gold);
+
+  fn_unsigned_char (uc_A, uc_B, uc_out);
+  COMPARE (uc_out, uc_gold);
+
+  fn_signed_short (ss_A, ss_B, ss_out);
+  COMPARE (ss_out, ss_gold)
+
+  fn_unsigned_short (us_A, us_B, us_out);
+  COMPARE (us_out, us_gold)
+
+  fn_signed_int (si_A, si_B, si_out);
+  COMPARE (si_out, si_gold);
+
+  fn_unsigned_int (ui_A, ui_B, ui_out);
+  COMPARE (ui_out, ui_gold);
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/abd_1.c b/gcc/testsuite/gcc.target/aarch64/sve/abd_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..e49006f90b22040f890c279ec19c490a655abd63
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/abd_1.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#define N 1024
+
+#define ABD_ABS
+#include "../abd.h"
+
+TEST1(signed, int)
+TEST1(signed, short)
+TEST1(signed, char)
+
+TEST2(signed, char, int)
+TEST2(signed, char, short)
+TEST2(signed, short, int)
+
+TEST3(signed, char, int, short)
+TEST3(signed, char, short, int)
+
+TEST1(unsigned, short)
+TEST1(unsigned, char)
+
+TEST2(unsigned, char, short)
+TEST2(unsigned, char, int)
+TEST2(unsigned, short, int)
+
+TEST3(unsigned, char, short, int)
+
+/* { dg-final { scan-assembler-times "sabd\\tz\[0-9\]+\.s, p\[0-9\]/m, z\[0-9\]+\.s, z\[0-9\]+\.s" 2 } } */
+/* { dg-final { scan-assembler-times "sabd\\tz\[0-9\]+\.h, p\[0-9\]/m, z\[0-9\]+\.h, z\[0-9\]+\.h" 3 } } */
+/* { dg-final { scan-assembler-times "sabd\\tz\[0-9\]+\.b, p\[0-9\]/m, z\[0-9\]+\.b, z\[0-9\]+\.b" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tz\[0-9\]+\.h, p\[0-9\]/m, z\[0-9\]+\.h, z\[0-9\]+\.h" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tz\[0-9\]+\.b, p\[0-9\]/m, z\[0-9\]+\.b, z\[0-9\]+\.b" 3 } } */
+
+/* { dg-final { scan-assembler-not {\tabs\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/abd_2.c b/gcc/testsuite/gcc.target/aarch64/sve/abd_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..bb17c8d6855eeee0a109b991c6610cfe7af12fd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/abd_2.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#define N 1024
+
+#define ABD_IDIOM
+#include "../abd.h"
+
+TEST1(signed, int)
+TEST1(signed, short)
+TEST1(signed, char)
+
+TEST2(signed, char, int)
+TEST2(signed, char, short)
+TEST2(signed, short, int)
+
+TEST3(signed, char, short, int)
+
+TEST2(unsigned, char, int)
+TEST2(unsigned, char, short)
+TEST2(unsigned, short, int)
+
+TEST3(unsigned, char, int, short)
+TEST3(unsigned, char, short, int)
+
+/* { dg-final { scan-assembler-times "sabd\\tz\[0-9\]+\.s, p\[0-9\]/m, z\[0-9\]+\.s, z\[0-9\]+\.s" 1 } } */
+/* { dg-final { scan-assembler-times "sabd\\tz\[0-9\]+\.h, p\[0-9\]/m, z\[0-9\]+\.h, z\[0-9\]+\.h" 3 } } */
+/* { dg-final { scan-assembler-times "sabd\\tz\[0-9\]+\.b, p\[0-9\]/m, z\[0-9\]+\.b, z\[0-9\]+\.b" 3 } } */
+/* { dg-final { scan-assembler-times "uabd\\tz\[0-9\]+\.h, p\[0-9\]/m, z\[0-9\]+\.h, z\[0-9\]+\.h" 2 } } */
+/* { dg-final { scan-assembler-times "uabd\\tz\[0-9\]+\.b, p\[0-9\]/m, z\[0-9\]+\.b, z\[0-9\]+\.b" 2 } } */
+
+/* { dg-final { scan-assembler-not {\tabs\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/abd_none_1.c b/gcc/testsuite/gcc.target/aarch64/sve/abd_none_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..a4c2053c50e235e6ea6ad8bfb124889556be1657
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/abd_none_1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#define N 1024
+
+#define ABD_ABS
+#include "../abd.h"
+
+TEST1(unsigned, int)
+TEST3(unsigned, char, int, short)
+
+/* { dg-final { scan-assembler-not {\tsabd\t} } } */
+/* { dg-final { scan-assembler-not {\tuabd\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/abd_none_2.c b/gcc/testsuite/gcc.target/aarch64/sve/abd_none_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..b540c38844936a17dc658acd909f3416f8887071
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/abd_none_2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+#define N 1024
+
+#define ABD_IDIOM
+#include "../abd.h"
+
+TEST3(signed, char, int, short)
+
+TEST1(unsigned, int)
+TEST1(unsigned, short)
+TEST1(unsigned, char)
+
+TEST3(unsigned, char, int, short)
+
+/* { dg-final { scan-assembler-not {\tsabd\t} } } */
+/* { dg-final { scan-assembler-not {\tuabd\t} } } */
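
Note on the comment kept as context in the aarch64-simd.md hunk: it says that ABS of a wrapped 8-bit MINUS gives 64 for the operands 64 and -128, whereas SABD gives 192, which is why the pattern is written as MINUS of max and min. The scalar sketch below only illustrates that arithmetic for one 8-bit lane. It is not part of the patch, and the helper names (abs_of_wrapped_sub, max_minus_min, abd_example_check) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models (ABS:QI (minus:QI a b)), i.e. the absolute
   value of a subtraction that has already wrapped to 8 bits.  */
uint8_t abs_of_wrapped_sub (int8_t a, int8_t b)
{
  /* Subtract modulo 256, as a single 8-bit lane would.  */
  uint8_t wrapped = (uint8_t) ((uint8_t) a - (uint8_t) b);
  /* Reinterpret the 8 bits as signed without relying on
     implementation-defined narrowing conversions.  */
  int as_signed = wrapped >= 128 ? (int) wrapped - 256 : (int) wrapped;
  int magnitude = as_signed < 0 ? -as_signed : as_signed;
  return (uint8_t) magnitude;
}

/* Hypothetical helper: models MINUS (smax (a, b), smin (a, b)), keeping
   the low 8 bits of the result.  */
uint8_t max_minus_min (int8_t a, int8_t b)
{
  int hi = a > b ? a : b;
  int lo = a > b ? b : a;
  return (uint8_t) (hi - lo);
}

/* The example from the comment: operands 64 and -128.  */
void abd_example_check (void)
{
  assert (abs_of_wrapped_sub (64, -128) == 64);   /* ABS of wrapped MINUS.  */
  assert (max_minus_min (64, -128) == 192);       /* 192 (-64 signed).  */
}
```

For operands whose difference fits in the signed 8-bit range, such as -50 and 25 from abd_run_1.c, both helpers return 75, so the two forms only diverge when the subtraction overflows.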