From patchwork Mon Oct 31 11:56:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 59648 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 68F4B385140C for ; Mon, 31 Oct 2022 11:59:14 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 68F4B385140C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217554; bh=rmbC28QJ7mY2s6QJJ6tPuQFAKcyl2yCevmb8ZKp2Xxg=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=BYe6/aEg0Hibu8Ard7VJazXjkvAsgsqdMrGBS/LIlQJI8smbGDfxEKNF0+D4HR5x4 AqpwT1eDoKYpaWDRFyv71AJR0B8ZPDx1GXNev8GY1pFYmFFFW5VI5K5822bBsyhyT5 MvUYXYrt1dD6EzC+P6kH2pBKiXY4Bv3M5nZYM6QY= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR01-VE1-obe.outbound.protection.outlook.com (mail-eopbgr140059.outbound.protection.outlook.com [40.107.14.59]) by sourceware.org (Postfix) with ESMTPS id 7A8AF385482E for ; Mon, 31 Oct 2022 11:57:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7A8AF385482E ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=JPGRJlWlcnK4LgJpFspP7CQNiFSMpi/WndndCYshLKalFc8rzxhqQgsAtfy7kLt+tSXVYHRV6yS/0O0CfneH8vpvB7bK3QtZkU8dnjWPEUWQAkVT0vfIQVVUTPQIk9h/Yft5NclI3F2Cpkvq1S/QxF8gUA92leEHbPVDy2W2ZTcbW7iZ9qEIyS2dVFIDf5b8qINRatM8TGyeeLlmjJ+PG1+TbWndWD7VNYe1vKZ9iceu2HF8ZxpyFDxzrBFz22tUG0MijjEzZ2lC6uIxqLOHJ2103UFadgn2Ci4GMAn7ul2G22ToV4FEIDAs9d2eKmLjtVw95J0Y1GO2fg18sZAGZA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=rmbC28QJ7mY2s6QJJ6tPuQFAKcyl2yCevmb8ZKp2Xxg=; b=imNgik6QKEyzmJ9iS76kL4gzKEJDzLY7bZ/9Lqf1kgpKxAd25rmHiXzGDUALqWpV8BvJlOAz1G9xNy7/z1HmrplSk1Av22nOPZYFxRkMTNMdENmvsfzc6Z1qxQzBIBwlYmzFQmIuCWyZ8GHzCV9WfqXWOKB6rqs+w627zzQRfg0xQPwaEtpyNaF9SWaCSH9XAmd/hXJnwBUVqjiLMiiZoPgtP6QODNho/lOxUBLgu1tFEAG1fvN8N50PWAoFVxuqQcDplmlunOJ6mwlnpoXuw52PpZr2y9i0zgSbjqkbD1cTj5vuHjvZgJf8R9Wj8nm7S+T+jt5cEGrBFfG0/ZjpoQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) Received: from DB6PR0301CA0080.eurprd03.prod.outlook.com (2603:10a6:6:30::27) by PAWPR08MB10257.eurprd08.prod.outlook.com (2603:10a6:102:367::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.15; Mon, 31 Oct 2022 11:57:32 +0000 Received: from DBAEUR03FT035.eop-EUR03.prod.protection.outlook.com (2603:10a6:6:30:cafe::7f) by DB6PR0301CA0080.outlook.office365.com (2603:10a6:6:30::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.15 via Frontend Transport; Mon, 31 Oct 2022 11:57:32 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; 
pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT035.mail.protection.outlook.com (100.127.142.136) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 Oct 2022 11:57:31 +0000 Received: ("Tessian outbound 0800d254cb3b:v130"); Mon, 31 Oct 2022 11:57:31 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 98af2278ba981402 X-CR-MTA-TID: 64aa7808 Received: from 98a33b3d9617.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 7C6FF9E9-61D8-411B-BB9B-DDCDE57357E6.1; Mon, 31 Oct 2022 11:56:46 +0000 Received: from EUR02-DB5-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 98a33b3d9617.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 31 Oct 2022 11:56:46 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=LOIfRIyycB3o6t+fXwxJSElJDAUjVtT6N4+6IIu/+suBj/18WXinzm5CGQg89trS8xMICDlr+w5rtYB8X5g2wKYKgNnfztGWp+M5G5xcOMXiUGVmjOv8Ck5dONS4Oyy39XXbxKclnqpmqztFaRRnyp0bWn+yGWV3oW/raQ1pNXfMc59y8ZvA0v0XxjykuBQaBf+FJr11o3rh8Itga2CIS0HlR7QdWMrJwrK/pRkZ1N0MfUli0kZ8DUjYoINzY7wtKDbPvZ/y/4wSLdsCV1+bD7tAv9ES+Tt94i77VpqwY38ElFfDUUfnmSKnBoMxWjOmnIuPTIR/igRY3+QOSyrD7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=rmbC28QJ7mY2s6QJJ6tPuQFAKcyl2yCevmb8ZKp2Xxg=; b=RDyZFvNg5+P9N2HGvfqRyaF/G3YG/E8elVrr3o1v9aE3jE9euadlPZeh2H7+pojk39qCG5LKPQujsWVNG8YsHMgvhsY+E1fqDzxYUq3x8w+Po7F5qllzJS2/KlJ8IZK33vxfI06dDEc5s39CUUwmvh1mTK/YZx7z0Em4spK8XhB+YT51Is9a1uoz7PbLMFBsVFxSDiWaDs2+sIiIPDFdVVnVrOVmohUsRK38qkMfnQrUy0LXsY27ainLRFJnJk3JFzkYS/1Mv0hHzrzk5CHo0CnKGqa9HpH5lgB4YzgEehzSoVH3knq5fTJ09+k5UWXZWDfba1TIYJKI65Fv4JSQSg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; 
spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by DB9PR08MB6730.eurprd08.prod.outlook.com (2603:10a6:10:2a2::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19; Mon, 31 Oct 2022 11:56:44 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52%4]) with mapi id 15.20.5769.019; Mon, 31 Oct 2022 11:56:44 +0000 Date: Mon, 31 Oct 2022 11:56:42 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH 1/8]middle-end: Recognize scalar reductions from bitfields and array_refs Message-ID: Content-Disposition: inline X-ClientProxiedBy: LO4P123CA0430.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:18b::21) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|DB9PR08MB6730:EE_|DBAEUR03FT035:EE_|PAWPR08MB10257:EE_ X-MS-Office365-Filtering-Correlation-Id: b6db9dd2-e002-4627-4d18-08dabb371713 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: 
QJLINOLDq3A0q/8kix+kxsPS1/LtmBppnuDsk0jlEgKnmqG40KnNkfFf8crS+amknDV+vwvq4EyL07qi7SM/y62skhuBeibKpHcr9MHfHRXtYFiVPFu8cALsajnEjGyZxZ16wtB4ebbaUs+5ZKGqBdh2NQpJcfZHyjjLAa6V0iPGVDRz4bRZ8smSsxhkM6yQshYcqL0+ddRPDn7l/0a8xf9dl+v24hrUKQpdQgU5xu0SZOJINEsTWMEFI6YJEEDNCZJLE/xo8Aicj+/2qKgjh4L7inSAvVTDnx2ZbxqmtCnVeC7btrXCWZnMIjesI+MYkhNDX7StAYnXIyG8d4DYKLqIjqTm8Ya6LfSmYphTsFbeERN0lXZX+DmusabckEeA50tLrj3ASw1mX4x/NNQWtz8fjQ+JtabJ2TOAHB89s9lP8BTQm6ZND2QWO8enE1KS2SJk/yxZqPgQbg39U42XwZPmqSIUimag8OOF0+UoY8w309rPLWwpP2dZjjqYYOteFKSPjpTiL2XnTS/Aq8aWhems7bzlFyow85IbPKMF5VxLfkIhB9UtENCmk+/WQXbsW1NKbSarGAKIKc4d450ujZwNWS0MND+ELgCzH4MkBcre53AdQ7VW+LsmNNSUCJEm3aSoA8++SNtMQmzHvDJ0I/+IiPu7HvpRLLuufSydLUrIXMJ+74eyOE42VGylRWDMK8+m0WbnSjqPdVbTiggXDsruTcQjEGEj18PB+uIJIj33a8D85RVYC/MyidJn63I/IpXIGMkAvFzxErhOKv3ABTj1PHvznjj+JpGANvfVY/w= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(4636009)(346002)(366004)(396003)(136003)(376002)(39860400002)(451199015)(83380400001)(66899015)(36756003)(6486002)(2906002)(44832011)(86362001)(38100700002)(6512007)(33964004)(44144004)(26005)(4743002)(2616005)(186003)(316002)(66946007)(66556008)(8676002)(478600001)(4326008)(41300700001)(66476007)(235185007)(5660300002)(6916009)(8936002)(6506007)(4216001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR08MB6730 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT035.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: dd27aa71-f569-4ba5-0c1d-08dabb36fb50 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
W9FoGqDbDmFLiBBYOZakT4aqYzWq+1zRRMvCPiqwyd48TtxYUDVh7f6j7PILjFDoptHJa2o3FA2yrjWMZQ7/fD+vvGlJdefLRSphwCJzJEo6U6Y5EMVNpRNA0IP/RfnDom4YDfdmxlkvcYMnnPlK81h+cBA2SA9palTOlM4SZziu4lXjMANN2AkGkg+0qn9iVv/jXJ9jfURJ3AzOPYsSVZhiOcTxrUuNRrdmjYy8PN4udLPMd3uahN8LNKQFrEBQjFDnOszUrZPD76nsJ4M7gwRg8xQLbpRoBoNJtk5wwyfBVgdb8O0lvDf9cXUP4p9SPAUAgoBKRgQSdCdL8C8VQ9bUX/ZB8fFOMXi2ACWoPusZy9S7ExfJzNZ0FxuG4SPoHVyl2tZmUsqsHOAx+gydg/IFJAuug2W4DmFzjpah1o3giu6nv134L5lloDir3xUGiIQgRm4t2Q4/oNW4t7vsZctqNdY5nN+C469JJRx2VYRIBLeLExaocR/jyb8GiXSSFKNCe71BkSdIjjALp6add77mVtEJMMBZRbTrAD5/KiKJLbsV6aoRBo/qqgSvB0T9AoKKKnmCc+BLXYRD5Lls+iiIUzRVXmeXI3cs9HXA5ViXOhdyuy5TCDfP3g2SgEsWkcgnT7kw1lmfJAOZM1Rt/YsVs6Qq8L4ZsawfCBuwpAG5pfK3kSEQivq8Ktg1Ex9uUgQJhsAOjLyNgZ4Gk8aZMUPS8JE25H457woMXFfGFB3gNqSThRJl4DLNg/FbbmjML+9lwuigF0IjpXHK9HWCg/3TNYPJAyO8gJK47hxCMnRFZs1m6tpicPWGQ56DleP0 X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(136003)(346002)(396003)(39860400002)(376002)(451199015)(36840700001)(46966006)(40470700004)(66899015)(82740400003)(36756003)(86362001)(356005)(81166007)(478600001)(4743002)(2906002)(336012)(83380400001)(40480700001)(44832011)(186003)(44144004)(107886003)(33964004)(2616005)(26005)(6506007)(40460700003)(36860700001)(6512007)(47076005)(4326008)(6916009)(316002)(235185007)(6486002)(82310400005)(70206006)(70586007)(8676002)(41300700001)(8936002)(5660300002)(4216001)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:57:31.0926 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: b6db9dd2-e002-4627-4d18-08dabb371713 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; 
Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT035.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAWPR08MB10257 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: nd@arm.com, rguenther@suse.de Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"

Hi All,

This patch series adds recognition of pairwise operations (reductions)
in match.pd such that we can benefit from them even at -O1 when the
vectorizer isn't enabled.

Using these allows for much simpler codegen on AArch64 and lets us
avoid quite a lot of codegen warts.

As an example, a simple:

typedef float v4sf __attribute__((vector_size (16)));

float
foo3 (v4sf x)
{
  return x[1] + x[2];
}

currently generates:

foo3:
        dup     s1, v0.s[1]
        dup     s0, v0.s[2]
        fadd    s0, s1, s0
        ret

while with this patch series it now generates:

foo3:
        ext     v0.16b, v0.16b, v0.16b, #4
        faddp   s0, v0.2s
        ret

This patch will not perform the operation if the source is not a gimple
register and leaves memory sources to the vectorizer as it's able to deal
correctly with clobbers.

The use of these instructions makes a significant difference in codegen
quality for AArch64 and Arm.
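For illustration only, here is a small sketch of the source shapes the two new match.pd patterns look for. It is not taken from the series or its testsuite, and the function names (pair_add, pair_min, far_add) are hypothetical. The first two functions combine adjacent lanes of a vector held in a register, so they would be candidates for the pairwise rewrite. The third uses lanes two elements apart, so the adjacency check is not expected to succeed.

```c
/* Hypothetical examples, not part of the patch series.  They rely on
   the GCC vector extension, as the foo3 example above does.  */
typedef float v4sf __attribute__((vector_size (16)));
typedef int v4si __attribute__((vector_size (16)));

/* Adjacent lanes combined with plus: the (op @0 @1) shape with
   op = plus.  */
float
pair_add (v4sf x)
{
  return x[0] + x[1];
}

/* Adjacent lanes combined through a conditional: the
   (cond (op @0 @1) @0 @1) shape with op = lt, i.e. a minimum.  */
int
pair_min (v4si x)
{
  return x[2] < x[3] ? x[2] : x[3];
}

/* Lanes 0 and 2 are not adjacent, so this is expected to be left as
   two extracts and a scalar add.  */
float
far_add (v4sf x)
{
  return x[0] + x[2];
}
```

One way to check what happens to such functions would be to compile at -O1 with -fdump-tree-optimized and look for reduction internal-function calls in the dump. Whether the rewrite actually fires depends on the rest of the series and on the target.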
NOTE: The last entry in the series contains tests for all of the previous
patches as it's a bit of an all-or-nothing thing.

Bootstrapped and regtested on aarch64-none-linux-gnu and
x86_64-pc-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* match.pd (adjacent_data_access_p): Import.
	Add new pattern for bitwise plus, min, max, fmax, fmin.
	* tree-cfg.cc (verify_gimple_call): Allow function arguments in IFNs.
	* tree.cc (adjacent_data_access_p): New.
	* tree.h (adjacent_data_access_p): New.

--- inline copy of patch --
diff --git a/gcc/match.pd b/gcc/match.pd
index 2617d56091dfbd41ae49f980ee0af3757f5ec1cf..aecaa3520b36e770d11ea9a10eb18db23c0cd9f7 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -39,7 +39,8 @@ along with GCC; see the file COPYING3.  If not see
    HONOR_NANS
    uniform_vector_p
    expand_vec_cmp_expr_p
-   bitmask_inv_cst_vector_p)
+   bitmask_inv_cst_vector_p
+   adjacent_data_access_p)

 /* Operator lists.  */
 (define_operator_list tcc_comparison
@@ -7195,6 +7196,47 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)

 /* Canonicalizations of BIT_FIELD_REFs.  */

+/* Canonicalize BIT_FIELD_REFS to pairwise operations. */
+(for op (plus min max FMIN_ALL FMAX_ALL)
+     ifn (IFN_REDUC_PLUS IFN_REDUC_MIN IFN_REDUC_MAX
+          IFN_REDUC_FMIN IFN_REDUC_FMAX)
+ (simplify
+  (op @0 @1)
+   (if (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type))
+    (with { poly_uint64 nloc = 0;
+            tree src = adjacent_data_access_p (@0, @1, &nloc, true);
+            tree ntype = build_vector_type (type, 2);
+            tree size = TYPE_SIZE (ntype);
+            tree pos = build_int_cst (TREE_TYPE (size), nloc);
+            poly_uint64 _sz;
+            poly_uint64 _total; }
+     (if (src && is_gimple_reg (src) && ntype
+          && poly_int_tree_p (size, &_sz)
+          && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (src)), &_total)
+          && known_ge (_total, _sz + nloc))
+      (ifn (BIT_FIELD_REF:ntype { src; } { size; } { pos; })))))))
+
+(for op (lt gt)
+     ifni (IFN_REDUC_MIN IFN_REDUC_MAX)
+     ifnf (IFN_REDUC_FMIN IFN_REDUC_FMAX)
+ (simplify
+  (cond (op @0 @1) @0 @1)
+   (if (INTEGRAL_TYPE_P (type) || SCALAR_FLOAT_TYPE_P (type))
+    (with { poly_uint64 nloc = 0;
+            tree src = adjacent_data_access_p (@0, @1, &nloc, false);
+            tree ntype = build_vector_type (type, 2);
+            tree size = TYPE_SIZE (ntype);
+            tree pos = build_int_cst (TREE_TYPE (size), nloc);
+            poly_uint64 _sz;
+            poly_uint64 _total; }
+     (if (src && is_gimple_reg (src) && ntype
+          && poly_int_tree_p (size, &_sz)
+          && poly_int_tree_p (TYPE_SIZE (TREE_TYPE (src)), &_total)
+          && known_ge (_total, _sz + nloc))
+      (if (SCALAR_FLOAT_MODE_P (TYPE_MODE (type)))
+       (ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))
+       (ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))))))))
+
 (simplify
  (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
  (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 91ec33c80a41e1e0cc6224e137dd42144724a168..b19710392940cf469de52d006603ae1e3deb6b76 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -3492,6 +3492,7 @@ verify_gimple_call (gcall *stmt)
     {
       tree arg = gimple_call_arg (stmt, i);
       if ((is_gimple_reg_type (TREE_TYPE (arg))
+           && !is_gimple_variable (arg)
            && !is_gimple_val (arg))
           || (!is_gimple_reg_type (TREE_TYPE (arg))
               && !is_gimple_lvalue (arg)))
diff --git a/gcc/tree.h b/gcc/tree.h
index e6564aaccb7b69cd938ff60b6121aec41b7e8a59..8f8a9660c9e0605eb516de194640b8c1b531b798 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5006,6 +5006,11 @@ extern bool integer_pow2p (const_tree);

 extern tree bitmask_inv_cst_vector_p (tree);

+/* TRUE if the two operands represent adjacent access of data such that a
+   pairwise operation can be used.  */
+
+extern tree adjacent_data_access_p (tree, tree, poly_uint64*, bool);
+
 /* integer_nonzerop (tree x) is nonzero if X is an integer constant
    with a nonzero value.  */

diff --git a/gcc/tree.cc b/gcc/tree.cc
index 007c9325b17076f474e6681c49966c59cf6b91c7..5315af38a1ead89ca5f75dc4b19de9841e29d311 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10457,6 +10457,90 @@ bitmask_inv_cst_vector_p (tree t)
   return builder.build ();
 }

+/* Returns base address if the two operands represent adjacent access of data
+   such that a pairwise operation can be used.  OP1 must be a lower subpart
+   than OP2.  If POS is not NULL then on return if a value is returned POS
+   will indicate the position of the lower address.  If COMMUTATIVE_P then
+   the operation is also tried by flipping op1 and op2.  */
+
+tree adjacent_data_access_p (tree op1, tree op2, poly_uint64 *pos,
+                             bool commutative_p)
+{
+  gcc_assert (op1);
+  gcc_assert (op2);
+  if (TREE_CODE (op1) != TREE_CODE (op2)
+      || TREE_TYPE (op1) != TREE_TYPE (op2))
+    return NULL;
+
+  tree type = TREE_TYPE (op1);
+  gimple *stmt1 = NULL, *stmt2 = NULL;
+  unsigned int bits = GET_MODE_BITSIZE (GET_MODE_INNER (TYPE_MODE (type)));
+
+  if (TREE_CODE (op1) == BIT_FIELD_REF
+      && operand_equal_p (TREE_OPERAND (op1, 0), TREE_OPERAND (op2, 0), 0)
+      && operand_equal_p (TREE_OPERAND (op1, 1), TREE_OPERAND (op2, 1), 0)
+      && known_eq (bit_field_size (op1), bits))
+    {
+      poly_uint64 offset1 = bit_field_offset (op1);
+      poly_uint64 offset2 = bit_field_offset (op2);
+      if (known_eq (offset2 - offset1, bits))
+        {
+          if (pos)
+            *pos = offset1;
+          return TREE_OPERAND (op1, 0);
+        }
+      else if (commutative_p && known_eq (offset1 - offset2, bits))
+        {
+          if (pos)
+            *pos = offset2;
+          return TREE_OPERAND (op1, 0);
+        }
+    }
+  else if (TREE_CODE (op1) == ARRAY_REF
+           && operand_equal_p (get_base_address (op1), get_base_address (op2)))
+    {
+      wide_int size1 = wi::to_wide (array_ref_element_size (op1));
+      wide_int size2 = wi::to_wide (array_ref_element_size (op2));
+      if (wi::ne_p (size1, size2) || wi::ne_p (size1, bits / 8)
+          || !tree_fits_poly_uint64_p (TREE_OPERAND (op1, 1))
+          || !tree_fits_poly_uint64_p (TREE_OPERAND (op2, 1)))
+        return NULL;
+
+      poly_uint64 offset1 = tree_to_poly_uint64 (TREE_OPERAND (op1, 1));
+      poly_uint64 offset2 = tree_to_poly_uint64 (TREE_OPERAND (op2, 1));
+      if (known_eq (offset2 - offset1, 1UL))
+        {
+          if (pos)
+            *pos = offset1 * bits;
+          return TREE_OPERAND (op1, 0);
+        }
+      else if (commutative_p && known_eq (offset1 - offset2, 1UL))
+        {
+          if (pos)
+            *pos = offset2 * bits;
+          return TREE_OPERAND (op1, 0);
+        }
+    }
+  else if (TREE_CODE (op1) == SSA_NAME
+           && (stmt1 = SSA_NAME_DEF_STMT (op1)) != NULL
+           && (stmt2 = SSA_NAME_DEF_STMT (op2)) != NULL
+           && is_gimple_assign (stmt1)
+           && is_gimple_assign (stmt2))
+    {
+      if (gimple_assign_rhs_code (stmt1) != ARRAY_REF
+          && gimple_assign_rhs_code (stmt1) != BIT_FIELD_REF
+          && gimple_assign_rhs_code (stmt2) != ARRAY_REF
+          && gimple_assign_rhs_code (stmt2) != BIT_FIELD_REF)
+        return NULL;
+
+      return adjacent_data_access_p (gimple_assign_rhs1 (stmt1),
+                                     gimple_assign_rhs1 (stmt2), pos,
+                                     commutative_p);
+    }
+
+  return NULL;
+}
+
 /* If VECTOR_CST T has a single nonzero element, return the index of that
    element, otherwise return -1.  */

From patchwork Mon Oct 31 11:57:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 59646 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4E4DD385DC00 for ; Mon, 31 Oct 2022 11:58:10 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4E4DD385DC00 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217490; bh=zCFtO2cA+eRp9a8JuRzQLFoJcbjwpEihJmCNQxZU1IU=; h=Date:To:Subject:In-Reply-To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=lijaPM63vqSj5qu7C/41jAQvGWU7QCfku4RPvECb0GiaHQCS+6u8LxJHD0ntvnhdk gmeT5obQjN3MFusucNs/xTnOicoIHb+zGaNYLIGT4WShhMYZQC8Tz7f0UUiOzRhqsv bg9WWFkGxDcuuCl3UbsUg7smL1KMBEG8zRyrcfTM= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR05-DB8-obe.outbound.protection.outlook.com (mail-db8eur05on2065.outbound.protection.outlook.com [40.107.20.65]) by sourceware.org (Postfix) with ESMTPS id BACA33854838 for ; Mon, 31 Oct 2022 11:57:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BACA33854838 ARC-Seal: i=2; a=rsa-sha256; 
s=arcselector9901; d=microsoft.com; cv=pass; b=EpgbRQc8Xt1M6Rp1bq5O62vlp0ELoMkVb6hjYsxrzEUqv6+uT+BtSh1Fxp1XS3EVl0iGOT1cumI26igwyvgb12PEg/PaIkD8JlgsHOi2zs11mgqLahDQJN3qkWWImZzo9ByzDBlsZRHrz+HZuUft7DjYi5VKGquP+3sV4QvIF+nybV9aA2r3m5VU2pgr8OuhqEEdaI4wpuMJ0OD5o1BhtDKh9RKEVtLb/OQBQi6b/978+eNHgi9rbkAhTjS92a81GZA5MEA9x9tYOlICLCj79JHIBp0MR49q8F1sW6k94CkPtcQo/AH+Uh3tK2SyP7fDJryZgPu4n6G4p1zm1xVYDQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=zCFtO2cA+eRp9a8JuRzQLFoJcbjwpEihJmCNQxZU1IU=; b=YDgHs/H/RCyagBFX7+SboyzFu+PN3iCRq0Q3gkGgsx7+cA2vtf959hu3lbwpRUc8+esFkNctMUXPsrK7Fa3VaE/6k0SCo1eT2jD7Qb8x85oROaf+tp98dPkvs7vJ1g/BACLUonqZn2KrpO1XMGXrA++CpFU4MDlKZ3vXXWgYqXPupmL+zQWL2E3geaoEajNwZ4AEheFJi3YLrL0dPugYtyqBmXBuNv58TaFXDrBYTpQLIZyEFiUuRNTNkP2LHPdE46l861t5sDdQmxGKzkj/onHYUqGIUxWsNjc/xdAof+XrKJViP7DE0i0JeezrVJqhGs/oepAnwjEFG6HPc5TNYg== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) Received: from DB6PR0202CA0011.eurprd02.prod.outlook.com (2603:10a6:4:29::21) by PAXPR08MB6655.eurprd08.prod.outlook.com (2603:10a6:102:15d::22) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19; Mon, 31 Oct 2022 11:57:30 +0000 Received: from DBAEUR03FT063.eop-EUR03.prod.protection.outlook.com (2603:10a6:4:29:cafe::dd) by DB6PR0202CA0011.outlook.office365.com (2603:10a6:4:29::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 
15.20.5769.16 via Frontend Transport; Mon, 31 Oct 2022 11:57:30 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT063.mail.protection.outlook.com (100.127.142.255) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 Oct 2022 11:57:30 +0000 Received: ("Tessian outbound 58faf9791229:v130"); Mon, 31 Oct 2022 11:57:30 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 968f3eed71563821 X-CR-MTA-TID: 64aa7808 Received: from b6e485f884c1.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id BABE0E2A-D6AC-4E16-AF1C-84AEA68790A2.1; Mon, 31 Oct 2022 11:57:22 +0000 Received: from EUR04-HE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id b6e485f884c1.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 31 Oct 2022 11:57:22 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WktOn7a1MWvxHYivIuMX1XSyP9eVr6phW52k+b7cBZHYL7bcgcgqj+0Nfgzn7AkNHm9dtGWavnSiWU6o+2a33wv1DRzdPkNtGy+zbluIy6Zban0YhD0YfoZkfrmWEzOGdm4U/izBAP45DyFVZut6cq+8YUAPIrN3GaRpeocjgx94ZmRg3KJCo/J0whcpmBG72NXhwEXRFWki31IqYN/+eHmXC5F4Gfbt4EPAZ2Le9wsJyrAEtI8uAiDbImWBWKSf4tyC3hanlOaNNPcbBn1SaWYrYz3pvjtWuWp+cNF3z9aJm3/8H5/Uq3zimnYroZ0Ngg03R1wuJpHs3zRmbrJcow== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=zCFtO2cA+eRp9a8JuRzQLFoJcbjwpEihJmCNQxZU1IU=; b=DHgZzoU3Xxk1Axn9vUCRas0d7vaqG1LX/U1zZJ3gULvCP9rUFL7gosNIADCFOakMadPj8aWAtNhivAxfDSGd84NK8hb11MzFmrB0gNzwjxZdK0zhlfjscRFvmgLgMp2M5CtlHlxsD++G2OkSGbwSqpNP/thPb8rbA1YwoR8vpaMMeaanMhcI9mZeWV/WHZxgT0z2tSo2y6Jh+slRxJ+AThRbt2kNAfTm7Wm3Q40tg20ZhKCUa1asB2hiB020dlasmyozKLl2MPepvdoBQS8RfouK4535EsLbRQtV71gUcd/gy6aX3CpwPZ7E4gE5HjN4IAMPzejEofjSxTeSiwyoZA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by AS2PR08MB8717.eurprd08.prod.outlook.com (2603:10a6:20b:55d::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14; Mon, 31 Oct 2022 11:57:20 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52%4]) with mapi id 15.20.5769.019; Mon, 31 Oct 2022 11:57:19 +0000 Date: Mon, 31 Oct 2022 11:57:12 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH 2/8]middle-end: Recognize scalar widening reductions Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: SA1P222CA0074.NAMP222.PROD.OUTLOOK.COM (2603:10b6:806:2c1::20) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|AS2PR08MB8717:EE_|DBAEUR03FT063:EE_|PAXPR08MB6655:EE_ X-MS-Office365-Filtering-Correlation-Id: 189bdc31-6b7c-4796-b8ef-08dabb3716aa x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 
X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: 7Gq7iRX5VFNjQeqE6OCoxD1xbxRtXtwmk57Q5gqJJcjNrklN0CNKwUlblYqmr5gwDqTopDgrFe/Dd4JmrQ3VzESHWI6dQw98cegHktFcR8IvE3JyAE2RC+3vlOxgeJT/4BDrna4Vj0IhGbTeuoqYE9qlq/VFELW0N4WF7FeN8dZP9GerYguPByYylNDh36sQcrAZhZknwZxG0bruuw7LBWtHLDvDKVravzvIQou07o3aALI1vNGmKa/kN/8MFZz/POiLyggM94qge1ZycuJOMehTkF3MbkymTV2quEJorJw8pB5qqo4RBqLIOrOleNKBn/0P/GAOa0zfZq+Edo4BIYx87F96HZvcwecSkDnZOfd0b/K8RIXVZWGnoD0lC/OP00vuSrTchxdc/QxSkjINw7+VEkCKqmWQmCkDYp3Fy2Rwl9wesU/oMy8hm6Bs6d40CLLFQh0c1VBHk4fnsGvjteMtjcqg8fyDeoR5EHhBMHp2YVGJW6HhzTAFrbskuPjgTn1mk5lKhA1mI4em8ZSsKqefFL4X7XvwWV8ErYYHs6f3eQabKTCoQvv3dMdRDKr/lJWoJej8shRm2bqktfzmV04Lqo8+8AlMudd9hIYGDlUvhhTY+2FjrhvVomXyIFmJousGxIQBffBlzSYqDkeqHj7Vvv3shcYG0CGSOjweowmiv3KP27YyOx1RrUMoDUArlJz8JhJlvOBT0O0CDL2C3StOsqFzmeZEBH/NhcD8GX9IqkPvRO+2qG1YKbz1eHvjb2JwfuCH5S/XhkEjbrtTyL/nq4vD3AxdrbysUuXXWUk= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(4636009)(346002)(376002)(39860400002)(136003)(396003)(366004)(451199015)(478600001)(8676002)(38100700002)(6486002)(316002)(6666004)(86362001)(6512007)(26005)(4743002)(66476007)(4326008)(41300700001)(186003)(235185007)(2906002)(33964004)(66946007)(2616005)(8936002)(6916009)(66556008)(36756003)(5660300002)(44832011)(44144004)(6506007)(67856001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB8717 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT063.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 87578d1b-655c-4f7d-f979-08dabb371055 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 
1RDKzcoUK+iNXv9aejZ0Wl/fgOQ6HKqumG+iTrrebVX2duuxqPyMw2j6Qqy9OYWEJ9uo7bYxVyZVwqJl1wkM/KC60A/2V8Ev9dn9OTxXhQzKnQN2R+tXNxYyGGE/mM65rC+hlQ0n9wDGL73KUonJuNlANYPBHf+bukZ/tuiPe0/4NFsAhvznatDWtd12eabE3SQb2c90Vd2aMLBQv9zYjBrAOlkBOvDZ7kmowqHhE6uwvWUD0DPKWCWR2Lz/ZVie1sbHh+57J+TAKrebd97jMn0gF9SBLkAnjFwv2B5in/lQCFKt/pj06htLPu0XEjJKEM52Fw1PePAOf+e0otJy3RDQeX2GogHtgH99C8kuwantr7BawqwjtzivxNS/Vu1VJTjeEElD5/10EEtMdIn4x5PNxm2rRG9uZiYCdUhMiR5iNrMOG8L1bFsJaueisJxOarsfx0shdUDpYs50R2DgsofmqS3Lgpb1FXOdZReRG2Rt9WwGf4bhWxqOXJGXlTQQ8AuQ5EbB75xo/EoUDc8gDPuWFY3+4RXkmuborqailB/1j6SObwuckXwUEw9KOTTsY8XVDT8WIRtAX4x+JoDnVoTN2TPF9QDGudbF2ggprMwFWsw1He+tAX0ELW//lda10Obm8x2dlXzlntNv73ZB7eNS8fZewI1H3f0P3KShPUNBRzhVvbbrd+NW7RliQfkXxoHxLhj9kTM4WgadCXTp66XmXS/q3FQcOxlEK5Vz+P7Fc7vNDrK+FVocK+23VTcj4EzQ059pLMlC8o05wtWWIi1Iz4ytJU8sI3N7fXXYm/Ky7I4Hsr5SieNE5No4NOnkFUiAUaxsKOYLo5cK2LTxGA== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(346002)(376002)(39860400002)(136003)(396003)(451199015)(46966006)(40470700004)(36840700001)(36860700001)(47076005)(86362001)(82310400005)(356005)(81166007)(82740400003)(41300700001)(2906002)(6506007)(235185007)(44832011)(8676002)(40480700001)(4326008)(70586007)(70206006)(5660300002)(8936002)(316002)(6916009)(4743002)(6512007)(40460700003)(2616005)(26005)(336012)(186003)(107886003)(6486002)(478600001)(33964004)(44144004)(6666004)(36756003)(2700100001)(67856001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:57:30.4229 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 189bdc31-6b7c-4796-b8ef-08dabb3716aa X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; 
Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT063.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAXPR08MB6655 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: nd@arm.com, rguenther@suse.de Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"

Hi All,

This adds a new optab and IFNs for REDUC_PLUS_WIDEN where the resulting scalar
reduction has twice the precision of the input elements.

At some point in a later patch I will also teach the vectorizer to recognize
this builtin once I figure out how the various bits of reductions work. For
now it's generated only by the match.pd pattern.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* internal-fn.def (REDUC_PLUS_WIDEN): New.
	* doc/md.texi: Document it.
	* match.pd: Recognize widening plus.
	* optabs.def (reduc_splus_widen_scal_optab,
	reduc_uplus_widen_scal_optab): New.
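As a rough scalar model of what is being recognized, the sketch below spells out two possible readings of a widening plus-reduction for unsigned 16-bit elements. The first follows the shape the match.pd pattern matches, `(convert (IFN_REDUC_PLUS @0))`: reduce in the element type, then convert. The second extends each element before accumulating. The function names are hypothetical and not part of the patch, and the patch text does not spell out which reading the new optab is meant to implement; the two agree whenever the sum fits in the element type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reading 1 (hypothetical model): reduce in the 16-bit element type,
   wrapping on overflow, then zero-extend the scalar result to 32 bits.
   This mirrors a conversion applied to a plain plus-reduction.  */
uint32_t
sum_then_extend_u16 (const uint16_t *v, size_t n)
{
  assert (v != NULL || n == 0);
  uint16_t sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum = (uint16_t) (sum + v[i]);
  return (uint32_t) sum;
}

/* Reading 2 (hypothetical model): zero-extend every element first and
   accumulate at 32 bits, so a 16-bit overflow is not lost.  */
uint32_t
extend_then_sum_u16 (const uint16_t *v, size_t n)
{
  assert (v != NULL || n == 0);
  uint32_t sum = 0;
  for (size_t i = 0; i < n; ++i)
    sum += v[i];
  return sum;
}
```

For an input such as {65535, 1} the first model yields 0 and the second 65536, which is the case a reviewer would want the optab documentation to pin down.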
--- inline copy of patch --
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 34825549ed4e315b07d36dc3d63bae0cc0a3932d..c08691ab4c9a4bfe55ae81e5e228a414d6242d78 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5284,6 +5284,20 @@ Compute the sum of the elements of a vector. The vector is operand 1, and
 operand 0 is the scalar result, with mode equal to the mode of the elements of
 the input vector.
 
+@cindex @code{reduc_uplus_widen_scal_@var{m}} instruction pattern
+@item @samp{reduc_uplus_widen_scal_@var{m}}
+Compute the sum of the elements of a vector and zero-extend @var{m} to a mode
+that has twice the precision of @var{m}. The vector is operand 1, and
+operand 0 is the scalar result, with mode equal to twice the precision of the
+mode of the elements of the input vector.
+
+@cindex @code{reduc_splus_widen_scal_@var{m}} instruction pattern
+@item @samp{reduc_splus_widen_scal_@var{m}}
+Compute the sum of the elements of a vector and sign-extend @var{m} to a mode
+that has twice the precision of @var{m}. The vector is operand 1, and
+operand 0 is the scalar result, with mode equal to twice the precision of the
+mode of the elements of the input vector.
+
 @cindex @code{reduc_and_scal_@var{m}} instruction pattern
 @item @samp{reduc_and_scal_@var{m}}
 @cindex @code{reduc_ior_scal_@var{m}} instruction pattern
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 5e672183f4def9d0cdc29cf12fe17e8cff928f9f..f64a8421b1087b6c0f3602dc556876b0fd15c7ad 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -215,6 +215,9 @@ DEF_INTERNAL_OPTAB_FN (RSQRT, ECF_CONST, rsqrt, unary)
 
 DEF_INTERNAL_OPTAB_FN (REDUC_PLUS, ECF_CONST | ECF_NOTHROW,
                        reduc_plus_scal, unary)
+DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_PLUS_WIDEN, ECF_CONST | ECF_NOTHROW,
+                              first, reduc_splus_widen_scal,
+                              reduc_uplus_widen_scal, unary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MAX, ECF_CONST | ECF_NOTHROW, first,
                               reduc_smax_scal, reduc_umax_scal, unary)
 DEF_INTERNAL_SIGNED_OPTAB_FN (REDUC_MIN, ECF_CONST | ECF_NOTHROW, first,
diff --git a/gcc/match.pd b/gcc/match.pd
index aecaa3520b36e770d11ea9a10eb18db23c0cd9f7..1d407414bee278c64c00d425d9f025c1c58d853d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7237,6 +7237,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
         (ifnf (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))
         (ifni (BIT_FIELD_REF:ntype { src; } { size; } { pos; }))))))))
 
+/* Widening reduction conversions.  */
+(simplify
+ (convert (IFN_REDUC_PLUS @0))
+ (if (element_precision (TREE_TYPE (@0)) * 2 == element_precision (type)
+      && TYPE_UNSIGNED (type) == TYPE_UNSIGNED (TREE_TYPE (@0))
+      && ANY_INTEGRAL_TYPE_P (type) && ANY_INTEGRAL_TYPE_P (TREE_TYPE(@0)))
+  (IFN_REDUC_PLUS_WIDEN @0)))
+
 (simplify
  (BIT_FIELD_REF (BIT_FIELD_REF @0 @1 @2) @3 @4)
  (BIT_FIELD_REF @0 @3 { const_binop (PLUS_EXPR, bitsizetype, @2, @4); }))
diff --git a/gcc/optabs.def b/gcc/optabs.def
index a6db2342bed6baf13ecbd84112c8432c6972e6fe..9947aed67fb8a3b675cb0aab9aeb059f89644106 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -346,6 +346,8 @@ OPTAB_D (reduc_fmin_scal_optab, "reduc_fmin_scal_$a")
 OPTAB_D (reduc_smax_scal_optab, "reduc_smax_scal_$a")
 OPTAB_D (reduc_smin_scal_optab, "reduc_smin_scal_$a")
 OPTAB_D (reduc_plus_scal_optab, "reduc_plus_scal_$a")
+OPTAB_D (reduc_splus_widen_scal_optab, "reduc_splus_widen_scal_$a")
+OPTAB_D (reduc_uplus_widen_scal_optab, "reduc_uplus_widen_scal_$a")
 OPTAB_D (reduc_umax_scal_optab, "reduc_umax_scal_$a")
 OPTAB_D (reduc_umin_scal_optab, "reduc_umin_scal_$a")
 OPTAB_D (reduc_and_scal_optab, "reduc_and_scal_$a")

From patchwork Mon Oct 31 11:57:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 59647 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7D91D385415F for ; Mon, 31 Oct 2022 11:58:31 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 7D91D385415F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217511; bh=xWzAevfcmxmIehbQ96I1Iiq22u8sQ8kFIWZiKXqtP3A=; h=Date:To:Subject:In-Reply-To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=Mh9qeOsK/KWTuGNs4b1o7MdxIyRiagLfg3NMNRSpqJ+spkmB/QoRVVuZ6o6UPH/k8
p+lN5q9mYQAT1kb2nFfKegGECo18IPjp+7dSKKuzvuS0ubsTms1tRW6fkGMR3fZgp3 QJaAhwlWSmO6nP1V6MMUIOB5eBip+NtC9d36qKJ8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on2089.outbound.protection.outlook.com [40.107.105.89]) by sourceware.org (Postfix) with ESMTPS id 5CBEF3854811 for ; Mon, 31 Oct 2022 11:57:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5CBEF3854811 ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=BBTv1a3v5Cj02IEqRATC1iDfyREAHp9E5qRbtM0lkbYcOD2drpvz+UH1RTJ7IcWQhxVHEcK9KOQrDgAKw6bfI61y1r47g6WcGAd+kcvlWFXBrCpNb7xEh4TBGysqssj+hvK5E+hyoIibcoQvp0gvlj3YtnMSIWbbMIexUeiQjxaLowIvi5ZpMJRxZ1dySVtuaDO5fFiuOtruCZeMKZypEADH8HenKU+iAuJt/jFfpQFS8D10VOq3YTzS8AGjUfZ8un4HmAazQkbOldsBDxFcDUc/iDsrEfC3mRFCYg+aecrWRzI9Hfn9O2QPmO4Ax8UW8VdbMGpsemtsPFuffSERig== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=xWzAevfcmxmIehbQ96I1Iiq22u8sQ8kFIWZiKXqtP3A=; b=KUo/EQeXu8Tdt1GsuxngCK5Y+T+g5L1cXtT6k/p/RvlXY+f+NQ/IZs2FGGv3Vcagjx1/dh2Nz5a6qJQPKjJ2LbnxvA0WF/+jYehSMfa6Jda324AJC9pnroW/e7VoQ7MSpEnc04RF4Ogo6n2RXLEo0TH4ZbJbaHz4TUMw2pVhOWZ8KQxL6C4WeXEi6oMfPAvZmZG+4rygt4xybdL9VUkqlTbomNpcv84P0Mh2jrUFlw9twi+NmPIyz+nNYBe1JstA64tpSpo0ufoVs9vEGAUuvdK2suCd8+zQx0bBAA6AoqZUneeKr7/e6jOG6nQ0/ThxdnrGnO5XuJXks7nWddkbqQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) Received: from 
DB6PR0802CA0037.eurprd08.prod.outlook.com (2603:10a6:4:a3::23) by GVXPR08MB7703.eurprd08.prod.outlook.com (2603:10a6:150:6b::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19; Mon, 31 Oct 2022 11:57:56 +0000 Received: from DBAEUR03FT021.eop-EUR03.prod.protection.outlook.com (2603:10a6:4:a3:cafe::d5) by DB6PR0802CA0037.outlook.office365.com (2603:10a6:4:a3::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19 via Frontend Transport; Mon, 31 Oct 2022 11:57:55 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT021.mail.protection.outlook.com (100.127.142.184) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 Oct 2022 11:57:55 +0000 Received: ("Tessian outbound 58faf9791229:v130"); Mon, 31 Oct 2022 11:57:55 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 6dcb4d70b20071fd X-CR-MTA-TID: 64aa7808 Received: from 3f5959128ad8.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 3895C469-ADEB-42BB-A8D0-B57348B8FEDD.1; Mon, 31 Oct 2022 11:57:48 +0000 Received: from EUR05-DB8-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 3f5959128ad8.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 31 Oct 2022 11:57:48 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=XjwVwANzkYrX++qJaUSKGhKhcPzoOl6kIVq+NGOBF7+w+K3FcI8rp/oW3NrnPD/tUrSUa2iAauzndC/tkx3Zei3tH6BH/EF62IpyaH+6jENPlkEDUAqphD3yqUlnRycKFLlOXIxO+WjatzzLjar3X0papyUny0pJVj58eAPt45BnWAu0sk+Lb1j5OyxwBhgEOpSeOlcpwLJrElqPGXioo6N1X7C8JeSFBoRFBx4Qi8guvqQdc1eoyQiHGmxi8Bggor9+jQVBJ91vy+9XWtnpMwwzz+x0t1K2QE4rH0vx08gDmzvn/VMkFX67NI0JAHB4BjZv5y5TS3IVYUTAstTnOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=xWzAevfcmxmIehbQ96I1Iiq22u8sQ8kFIWZiKXqtP3A=; b=RbGsS4UQZD4jyFrqVzQMUY4UU15vLH0oEfhpXyp7NjnWkwmVasHa7pevf+bCnefb9Uc4VfCs5Qve7/mfOAGexkLJrzJI23fEYbRIaEfxPzS20Td4LnAsZCykNSwLpOjxkbsYPw8BXXwH1Y3/lygxUYyv7GD4uDgKL6v0bkyoXm6v+xU2UH4beAxDMhtZ6RgIzSN8o7fr58XemPUIjBErECznetTncJ4qmP6yuQ+WfJXlIwEpOCgOisxlACnxBs3vBPVWLxp/8nThfxEipmJhpf7+7pqIMB1eDnsS8ASnRrNDvq5RzhyWd7Qbr/l25lRGAZbPFXnClD2GJzULhpgetQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by DB9PR08MB8385.eurprd08.prod.outlook.com (2603:10a6:10:3da::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.16; Mon, 31 Oct 2022 11:57:47 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52%4]) with mapi id 15.20.5769.019; Mon, 31 Oct 2022 11:57:47 +0000 Date: Mon, 31 Oct 2022 11:57:42 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH 3/8]middle-end: Support extractions of subvectors from arbitrary element position inside a vector 
Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO2P265CA0516.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:13b::23) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|DB9PR08MB8385:EE_|DBAEUR03FT021:EE_|GVXPR08MB7703:EE_ X-MS-Office365-Filtering-Correlation-Id: 94b9cbf5-2d93-4b28-4014-08dabb3725ce x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: UDhc20mtfcYfmTiTnUYBx5uH+d3T5eV14AwjAKzMv7zhZNccdUy7YX05PMh7CO+zpO3yv5xrngwGmmXi9xh75LJ5scWQRrOgSw8DwXr4Gl/iQhb8OJNeTh7EqpIr8eHWERwp9FXXr15uimni2374sK5Z8ic3E9bWiVPHl8voRjE4DA15NpJpY9w/ID5TfziR7veGhH7DCdes5JPdj0PXAtcqpe9AjhuLGP3LCew9VqVcoCGK4ebsdquKJUTwrgw1H5H7Pc+NiNEQW7dg3x06S0JHfVHXoC0afzHTWI4NYdDyo+Jd869d1nomT/bM59gfvO4NnsFin5OCBZfcDeY2KB/BZ8dDn7KDbhd5hIJxqjiwhyPlkiuszCYe0Pol4lx/MFUdNKQWcqttti162aIo3Y4UdMJXz2cBwbwMh6OKryHgdbaCCDH2bDCqrPOoM3R7UH2wtRdfLgwHi2zGQn15WpCjpNQmR3HDnPXspLQQYN9hV3Z+rU1vs/9rw7HsWXxyM99mjiSjptDbFD795OGnHNtlnMz2DCB5vixDzWT9kJ+6im+hEGTi3PN+oF7gHdK11NQB0BGXv4GWvNE4dEZgr6e0AtuHfCybFWxgzqNIo2fM0wjnX26DW0nwryDSyY+yY+ub/rrMaJ2BeWTAEKdMidOXRDLfJNRgA+afntZkP5bghyevYBFcomIVHEmsY08J3flZtfZNEQ9vQk077pQCFQbgifwwdDeRICO97GIeRUJMUFuOPBUinPTQVFklAe4I/8IYxEnEFY763VPa3sUAdJ90AVDP192ewh7DmXiEwdmytxoOgwux4Hh2SOZ9ECf9 X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(4636009)(346002)(366004)(136003)(396003)(39860400002)(376002)(451199015)(66899015)(36756003)(84970400001)(38100700002)(5660300002)(44832011)(2906002)(235185007)(186003)(86362001)(4326008)(6916009)(4743002)(2616005)(478600001)(6666004)(8676002)(316002)(6486002)(66476007)(66556008)(66946007)(41300700001)(8936002)(26005)(33964004)(6512007)(44144004)(6506007)(67856001)(2700100001); DIR:OUT; 
SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR08MB8385 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT021.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: ca3693d2-cdd0-446c-b5a3-08dabb37207b X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: aQ+SqH0fOzIP+0WM6SWa09zcWnQGBgKxZl2F7AC+9jcD7yjEtUwOybMdEscrLOrjYRDqwOPxkcFiba/AteCOVnzTNnU0IialYbubpYhwX3IbjQR/UzdF7UR4uhKE/KTh2DEXOPTeOMNbeEIIUYimuv2RQqQ68at7J8I1alIb14XlvFVDgYAfO1ODiqHkrYQseX9aXGj8jyQqWCX3u/McofFST4HmWxGCqMcKNnFouZ378MUtTWhUxe7D+Dn4+2LRKEKF/sHTsP/HUBBOxm7XQWhDTTXFZ+aCGW04ydbm3W7jZoq/ahro7i7qDmHQFAY1y8NN1dJjGo45EPCcxQBxXLhBxW44JBINJK90Vu0MdC+bYRRqaP5vIjUzfK5XxmoGlM2rynlXZ349gU0BlrG06avhunTTOk1sjCN8mvfSJHTZ1Daa02xSCi0tS4lzb93VbesNZeKPzNnNrSUr3xfsWblH7yhOhVsn4kWWMNlg70/HhUQeZ51rByMxqMMt+pp69c16oXXQpJh5jdgMZOGj/gOAJwdJ0S4UbJqWainVBF5++99/5mOe5hxDzrAZRq7QxU2KjzEe29EJ1HjrL5rhaK5ivwIm+DUUbkJH53Kpq2x7OKZDUTLbQUjK4FDawXkl/z20HuhFJbxyREw2I0S49UyCGcw+K+D0qp3mIDA9mTcPnl/dA30FyXCzfvF0N0TQOXb17j1OUzHAzXwS25xreNWG3XocBj9FUHhIN8c3wo81hkkswfsKe93CE03LujCIIEz2SaQdKIu2+agob8kTMZQCasts7fqxpcei3QEDsARsSwagCBk2HaeF/SUiYO8zMcH5Y1CYaMVbcr8yid8BQ464BT/zKtQhbkZkXjGgKGg= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; 
SFS:(13230022)(4636009)(136003)(39850400004)(396003)(376002)(346002)(451199015)(36840700001)(40470700004)(46966006)(8936002)(84970400001)(44832011)(235185007)(41300700001)(478600001)(82740400003)(40460700003)(5660300002)(316002)(2906002)(66899015)(70206006)(70586007)(4326008)(8676002)(6916009)(47076005)(81166007)(4743002)(6512007)(26005)(6486002)(36860700001)(6506007)(82310400005)(33964004)(44144004)(186003)(40480700001)(356005)(2616005)(336012)(107886003)(86362001)(36756003)(6666004)(67856001)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:57:55.8059 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 94b9cbf5-2d93-4b28-4014-08dabb3725ce X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT021.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: GVXPR08MB7703 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: nd@arm.com, rguenther@suse.de Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org 
Sender: "Gcc-patches"

Hi All,

The current vector extract pattern can only extract from a vector when the
position to extract is a multiple of the bit size of the vector being
extracted. That means extracting something like a V2SI from a V4SI vector at
position 32 isn't possible, as 32 is not a multiple of 64. Ideally this optab
should have worked on multiples of the element size, but too many targets rely
on this semantic now.

So instead add a new case which allows any extraction as long as the bit
position is a multiple of the element size. We use a VEC_PERM to shuffle the
elements into the bottom part of the vector and then use a subreg to extract
the values.

This now allows various vector operations that were previously decomposed into
very inefficient scalar operations.

NOTE: I added 3 testcases, but only fixed the 3rd one. The 1st one is missed
because we don't optimize VEC_PERM expressions into bitfields. The 2nd one is
missed because extract_bit_field only works on vector modes; in this case the
intermediate extract is DImode. On targets where the scalar mode is tieable to
vector modes the extract should work fine.

However, I ran out of time to fix the first two and so will do so in GCC 14.
For now this catches the case that my pattern now introduces more easily.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* expmed.cc (extract_bit_field_1): Add support for vector element
	extracts.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ext_1.c: New.
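The shuffle described above can be pictured with a small scalar sketch: build a selector that rotates the vector so the element at the requested position lands in lane 0, apply it, and read the low lanes. This is only an illustration under stated assumptions. The function names are hypothetical, `pos` is assumed to be an element index (bit position divided by the element bit size), and the code models the permute on a plain array rather than calling any GCC internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill SEL with a single-stepped rotation so that
   lane I of the result reads source lane (I + POS) % NELTS.  */
void
rotate_selector (unsigned *sel, unsigned nelts, unsigned pos)
{
  assert (sel != NULL && nelts > 0);
  for (unsigned i = 0; i < nelts; ++i)
    sel[i] = (i + pos) % nelts;
}

/* Hypothetical helper: scalar model of a one-input permute,
   DST[I] = SRC[SEL[I]] for every lane.  */
void
permute_u32 (uint32_t *dst, const uint32_t *src, const unsigned *sel,
             unsigned nelts)
{
  for (unsigned i = 0; i < nelts; ++i)
    dst[i] = src[sel[i]];
}
```

With four 32-bit lanes and `pos` equal to 1, the selector is {1, 2, 3, 0}; after the permute the low two lanes hold the original elements 1 and 2, which is the V2SI the message wants to extract from bit position 32.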
--- inline copy of patch --
diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index bab020c07222afa38305ef8d7333f271b1965b78..ffdf65210d17580a216477cfe4ac1598941ac9e4 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -1718,6 +1718,45 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, poly_uint64 bitnum,
               return target;
             }
         }
+      else if (!known_eq (bitnum, 0U)
+               && multiple_p (GET_MODE_UNIT_BITSIZE (tmode), bitnum, &pos))
+        {
+          /* The encoding has a single stepped pattern.  */
+          poly_uint64 nunits = GET_MODE_NUNITS (new_mode);
+          int nelts = nunits.to_constant ();
+          vec_perm_builder sel (nunits, nelts, 1);
+          int delta = -pos.to_constant ();
+          for (int i = 0; i < nelts; ++i)
+            sel.quick_push ((i - delta) % nelts);
+          vec_perm_indices indices (sel, 1, nunits);
+
+          if (can_vec_perm_const_p (new_mode, new_mode, indices, false))
+            {
+              class expand_operand ops[4];
+              machine_mode outermode = new_mode;
+              machine_mode innermode = tmode;
+              enum insn_code icode
+                = direct_optab_handler (vec_perm_optab, outermode);
+              target = gen_reg_rtx (outermode);
+              if (icode != CODE_FOR_nothing)
+                {
+                  rtx sel = vec_perm_indices_to_rtx (outermode, indices);
+                  create_output_operand (&ops[0], target, outermode);
+                  ops[0].target = 1;
+                  create_input_operand (&ops[1], op0, outermode);
+                  create_input_operand (&ops[2], op0, outermode);
+                  create_input_operand (&ops[3], sel, outermode);
+                  if (maybe_expand_insn (icode, 4, ops))
+                    return simplify_gen_subreg (innermode, target, outermode, 0);
+                }
+              else if (targetm.vectorize.vec_perm_const != NULL)
+                {
+                  if (targetm.vectorize.vec_perm_const (outermode, outermode,
+                                                        target, op0, op0, indices))
+                    return simplify_gen_subreg (innermode, target, outermode, 0);
+                }
+            }
+        }
     }
 
   /* See if we can get a better vector mode before extracting.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/ext_1.c b/gcc/testsuite/gcc.target/aarch64/ext_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..18a10a14f1161584267a8472e571b3bc2ddf887a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ext_1.c
@@ -0,0 +1,54 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-O" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+#include <string.h>
+
+typedef unsigned int v4si __attribute__((vector_size (16)));
+typedef unsigned int v2si __attribute__((vector_size (8)));
+
+/*
+** extract: { xfail *-*-* }
+**	ext	v0.16b, v0.16b, v0.16b, #4
+**	ret
+*/
+v2si extract (v4si x)
+{
+    v2si res = {x[1], x[2]};
+    return res;
+}
+
+/*
+** extract1: { xfail *-*-* }
+**	ext	v0.16b, v0.16b, v0.16b, #4
+**	ret
+*/
+v2si extract1 (v4si x)
+{
+    v2si res;
+    memcpy (&res, ((int*)&x)+1, sizeof(res));
+    return res;
+}
+
+typedef struct cast {
+  int a;
+  v2si b __attribute__((packed));
+} cast_t;
+
+typedef union Data {
+   v4si x;
+   cast_t y;
+} data;
+
+/*
+** extract2:
+**	ext	v0.16b, v0.16b, v0.16b, #4
+**	ret
+*/
+v2si extract2 (v4si x)
+{
+    data d;
+    d.x = x;
+    return d.y.b;
+}
+

From patchwork Mon Oct 31 11:58:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 59649 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8B4983851C24 for ; Mon, 31 Oct 2022 11:59:35 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8B4983851C24 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217575; bh=2cWYoTPz9RrOqsM99wuLiq1THCAJ1UmpU2Eavf5Xll8=; h=Date:To:Subject:In-Reply-To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=Z85KCE/9yfUXI6yNWCH/gWU3kiLYD7acxGJbIf+NB+2IYZ2+fCzCjib2TrgeUWzAu
/mrMMHLquQfReAGzvu4oA427/2IXLvx9Ouc26HlULpoVqidaJiNn2TJfjOGL49wwSk BkvAqqIGZrAJsYDj4iNmRInrP5lYvRm/Ipx/001o= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR02-AM0-obe.outbound.protection.outlook.com (mail-am0eur02on2084.outbound.protection.outlook.com [40.107.247.84]) by sourceware.org (Postfix) with ESMTPS id AF1F7385483B for ; Mon, 31 Oct 2022 11:58:24 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AF1F7385483B ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=EVVR3bm+uum24EmGqIE21KOttfbql1gWS2G6j1eDTaBhSOOhtHpefb+ERiez9nQA20VO1zdBt05PqIlAC/IyRT9sJHlK4qQMQ+/N2L8cXhXSgMa0fqqXrE9E0+HQj8KR8SQdnqzjxo3JYgYG8quEncu10yxDI0+AhqDqworNIpsE8NKeerzafjj5yf07GJgFyfDNKhsnf4blty4DXVN/39zUDhyF1eqeIQWCPJVLU2wHf6KopIH4axsMy4L8WYE7BpyZO1Jf01i/WOWOAx62M1SphqmdMOsHeocBxILMWu96qWIYD7gdxpNqZgVexmOrL1OmKMY/rSy+3E6BLJfNKw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=2cWYoTPz9RrOqsM99wuLiq1THCAJ1UmpU2Eavf5Xll8=; b=RnF1MecwAvvTkGhm2uqEtFjREMgk9zkMh5D6WwN46oVPfEyy12RKUAQajBoAncVuCeVi7V/CCUQfd27W2A7Ib6M8KWg48DhPUlkTACZDMK5sjSKRbd8cWHc6jj/T4ym5vj1XP/qKprCrF8ioQsirfiPsZHta9gif0tfcPUOYJaB/HdWfEZWpdxJ/KGbqECENfLTvwvcwAKzeLSQa8FBJ9VeiFi/b4CZ2dRIFQRckhfQiGY39E/rz8xxadmeEJ+9qvfHO2pIK3A8uLy0oaxeQA0v3BZunayGOq5ZUcan3xbt8Yjo9lwkN4/RA9i/U03kcB4pcviB4v72P/VSXRIPSgg== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) Received: from 
DB9PR02CA0019.eurprd02.prod.outlook.com (2603:10a6:10:1d9::24) by GVXPR08MB7728.eurprd08.prod.outlook.com (2603:10a6:150:6a::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19; Mon, 31 Oct 2022 11:58:20 +0000 Received: from DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com (2603:10a6:10:1d9:cafe::42) by DB9PR02CA0019.outlook.office365.com (2603:10a6:10:1d9::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.15 via Frontend Transport; Mon, 31 Oct 2022 11:58:20 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT023.mail.protection.outlook.com (100.127.142.253) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 Oct 2022 11:58:20 +0000 Received: ("Tessian outbound 6c699027a257:v130"); Mon, 31 Oct 2022 11:58:20 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 70de9d9dc337fd6f X-CR-MTA-TID: 64aa7808 Received: from bf87c50e828e.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 415339F3-1947-48EE-9AF7-87E54FEBE504.1; Mon, 31 Oct 2022 11:58:13 +0000 Received: from EUR05-DB8-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id bf87c50e828e.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 31 Oct 2022 11:58:13 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=PZeFbwDhSEqJllBizXPtq9W55V35QSG7gzEqBsX6ZfPF1c3Lo35T3eTNukylftmRqxMpfdtBoxb9Z14QTmS4A5edCPWVzkbfE0m3FE1UYFV0rAJOE0zNgjAkpJLuGGIWPe3LpdX7J9ClMa1XU4pTWfANcdr8Sj2h07JJSSMQq+cBhe42bTQWgvxaC3JpSNuMlTlBtCqEbQ812dPcgQhShF1/cU7Dymo7w14M40QDnk5fnCMOheMAAjZxlUc/T/jwYMoqoO0PwU81eIHmPw+5BQ9aM9yGIcnksVmO4m0PgPF7LPuo9kUNIlQraIk4VTEAWf+RBu17FPMhX3pLYqMS3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=2cWYoTPz9RrOqsM99wuLiq1THCAJ1UmpU2Eavf5Xll8=; b=UctEiDRG3XI+HOTVaXxwwKbHobLbTQAreLhFfSgZk5+mxD1Y4SGmTEhzLhvlpbwkiawDMpw0K6t9vMmJZre1on7Yy21NEDyjA5Np6itmm1EmZieboEfToEAIS6wQTwv0wVCXuFM2xI5nCE93sdr3q/tibW8S3xd6WJ+l19soQxtTfdndem1SS915xASM8Ahjg/M6QM6jCgJVVcYWyRvQABdt1S5mqpbQ035Fhxrh50jBBYTywJhhel+JJTPE0/92dWymzZtH9qXDSb8sAIdzNaNxWx338KOn6RFDpbfAygLCqoO5kTrpwxZO8gItlu+otuhw5q4aCMlrdk8y5Tqpmw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by DB9PR08MB8385.eurprd08.prod.outlook.com (2603:10a6:10:3da::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.16; Mon, 31 Oct 2022 11:58:11 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52%4]) with mapi id 15.20.5769.019; Mon, 31 Oct 2022 11:58:11 +0000 Date: Mon, 31 Oct 2022 11:58:09 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH 4/8]AArch64 aarch64: Implement widening reduction patterns Message-ID: Content-Disposition: inline 
In-Reply-To: X-ClientProxiedBy: LO4P123CA0323.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:197::22) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|DB9PR08MB8385:EE_|DBAEUR03FT023:EE_|GVXPR08MB7728:EE_ X-MS-Office365-Filtering-Correlation-Id: 18863909-ebb8-4201-fea8-08dabb37346f x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: pVAp4GsMDHJ8GhMvyMQfIgULuH6vwOLucx7yK0Ae7yRjHY2TWe6ywp16+HGgoFciLoE02QTzuT6NPFE0QQfL/mMWEyIegeJxWf9ATYc61yfNYTww8AH5N+AAPV0KxufcfQV6VY28xSP0Uan1Dg+L5Xl37rHvZ3eW5evLyIZA0POEax57RrFkgOOg+ojCyjesvxEDiX/dZlGawS6f3wFjnxfdPIfqPAjGb823LCI0skITlB+tH9jHagg61zpL0heLL3ENYuDIPDFzeB+U+Hfe9oxm/CW9RpSTitzFlSMBbBGJ3c+q5Rpitd6RKXqyqx568NloqjAgQnK/N8idXgqBLm5XKJdx3u+PdoZ1jsPSKCslZTiDSJsmX/aYKTMnjI+66RfjXBwDow1qAC+C7eDvkPLA8pkvFh1YkMCiYzUkHLxAgSgJfGPaoVGObcaoQwamTsekVlBfWb7WYl5qOXcYxUKHxtJ5ZkSHRSynXhQp1lyOGGgjLRzNO7/grkCZXQcj6WDipd0cdAvxbdqxv0dMAfyL1qms8Nmn/hAIcgJgjn1jmrNJC1hyWXVs8lwKNzwFDPhPwL1Q3c/RqlXW7jiSEGvniRvPOFSzrCBjteVIs1wIMHLKti8fH7g21a5PXNiX0I6zTxc6zx0TC0xnsVr3Zv4Q/rXEtTHbmQ4qQi/MIUxJtQQ2383gphT9x1hhmAfkyj82pUuuddA8+iiF8jBhnTO28pA0JosCbgxcnee2k01klsvURocF3TBqhhln9CvNFO8ji7bYA1GquI8maWO4DyS1p+tOyAEwFcopb20K0h4= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(4636009)(346002)(366004)(136003)(396003)(39860400002)(376002)(451199015)(36756003)(38100700002)(5660300002)(44832011)(2906002)(235185007)(186003)(86362001)(4326008)(6916009)(4743002)(2616005)(478600001)(8676002)(316002)(6486002)(66476007)(66556008)(66946007)(41300700001)(8936002)(26005)(33964004)(6512007)(44144004)(6506007)(67856001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB9PR08MB8385 
Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 4081c3d2-425b-4665-e72a-08dabb372f2e X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: pCFP8OSkA/4gsn/cSjKQAAZHpcm7bj5t3AmLBBVD1zr3ovOUQ2nwzHEXPGfyTJpeuSiVtyr+xADUbjiYj0BTKLRUE7U1Ur3+OApjZXziOveocS6Ye0fvcyRHLxpK4zbrbwpAoIDw0FQr9VcDpzRsgjt8YA94qIuHEOc+PgqzbQUbl89UXxJ/x73KiuGffnwccadaORVpTfsSb2+ko3WgmaC+gSwQUJeTH3Qe6siN0VvyssquhieLvb9VjElLLfT3Qgp2b0DdFkHLhIEzE8zsvujTw+Br9TslzCp6rLlfm0uVHGQuGOBOpIi+7EfIPr9x4LYM9YSBX5RUTna6HK593sNu5fpRpOnqZ60MR5pFob+IlOd+RVY8G8gV27JPgvK+9ZVZz90YRG5+r28STMwhC/D09vCayga5pdqC5XA7HlWx2Z/dVEMbxFdcmFz5yxWm/Hkv7OHjFD41lYoqsFKx9VT2eUH5yjxn7UIBJypGHyIOLMRU1lQ6U2MCPf5OCeVJDTOH1EFE2LGfcudWblgKQzISkA3XmgX+dHuTq8CVtkzSYDw9SWWojpThSnGP26R270Twn8xJl4Xr2tvlVdb+ofOGZARks71dhnLdvQKFQQ+UYLAG7ylJ5h4y4wNkd7qcpynNRmQ7R3YgVNpzWOw4Ls6Enxpfx2oZaUlvyMBwxb4j+maV3pRKOK43Lqa9owK1jdJkM04gJ/267i+5Fz8/RKecJcHRPcL/Rg6SVv6h1PH/uyyn+4JQyfj8ssWdU/8eRhwJ5q15nckbyamnK8XIs5sszrHHZyq2GBNs/qoFqADywQPGJmqF/GoDB7QfbQSbFxrdps4Tkg3v2kkfGQT0Sw== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(376002)(346002)(39860400002)(396003)(136003)(451199015)(46966006)(40470700004)(36840700001)(336012)(47076005)(40460700003)(6486002)(2906002)(44832011)(86362001)(81166007)(356005)(36756003)(40480700001)(82740400003)(82310400005)(186003)(4743002)(26005)(6512007)(44144004)(2616005)(33964004)(36860700001)(70586007)(8936002)(316002)(70206006)(478600001)(8676002)(4326008)(6506007)(41300700001)(6916009)(5660300002)(235185007)(67856001)(2700100001); DIR:OUT; SFP:1101; 
X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:58:20.3540 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 18863909-ebb8-4201-fea8-08dabb37346f X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: GVXPR08MB7728 X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar Christina Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"

Hi All,

This implements the new widening reduction optab in the backend. Instead of introducing a duplicate definition for the same thing, I have renamed the intrinsics definitions to use the same optab.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-simd-builtins.def (saddlv, uaddlv): Rename to reduc_splus_widen_scal_ and reduc_uplus_widen_scal_ respectively.
	* config/aarch64/aarch64-simd.md (aarch64_addlv): Renamed to ...
	(reduc_plus_widen_scal_): ... This.
	* config/aarch64/arm_neon.h (vaddlv_s8, vaddlv_s16, vaddlv_u8, vaddlv_u16, vaddlvq_s8, vaddlvq_s16, vaddlvq_s32, vaddlvq_u8, vaddlvq_u16, vaddlvq_u32, vaddlv_s32, vaddlv_u32): Use it.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index cf46b31627b84476a25762ffc708fd84a4086e43..a4b21e1495c5699d8557a4bcb9e73ef98ae60b35 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -190,9 +190,9 @@ BUILTIN_VDQV_L (UNOP, saddlp, 0, NONE) BUILTIN_VDQV_L (UNOPU, uaddlp, 0, NONE) - /* Implemented by aarch64_addlv. */ - BUILTIN_VDQV_L (UNOP, saddlv, 0, NONE) - BUILTIN_VDQV_L (UNOPU, uaddlv, 0, NONE) + /* Implemented by reduc_plus_widen_scal_. */ + BUILTIN_VDQV_L (UNOP, reduc_splus_widen_scal_, 10, NONE) + BUILTIN_VDQV_L (UNOPU, reduc_uplus_widen_scal_, 10, NONE) /* Implemented by aarch64_abd.
*/ BUILTIN_VDQ_BHSI (BINOP, sabd, 0, NONE) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index cf8c094bd4b76981cef2dd5dd7b8e6be0d56101f..25aed74f8cf939562ed65a578fe32ca76605b58a 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -3455,7 +3455,7 @@ (define_expand "reduc_plus_scal_v4sf" DONE; }) -(define_insn "aarch64_addlv" +(define_insn "reduc_plus_widen_scal_" [(set (match_operand: 0 "register_operand" "=w") (unspec: [(match_operand:VDQV_L 1 "register_operand" "w")] USADDLV))] diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index cf6af728ca99dae1cb6ab647466cfec32f7e913e..7b2c4c016191bcd6c3e075d27810faedb23854b7 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -3664,70 +3664,70 @@ __extension__ extern __inline int16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlv_s8 (int8x8_t __a) { - return __builtin_aarch64_saddlvv8qi (__a); + return __builtin_aarch64_reduc_splus_widen_scal_v8qi (__a); } __extension__ extern __inline int32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlv_s16 (int16x4_t __a) { - return __builtin_aarch64_saddlvv4hi (__a); + return __builtin_aarch64_reduc_splus_widen_scal_v4hi (__a); } __extension__ extern __inline uint16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlv_u8 (uint8x8_t __a) { - return __builtin_aarch64_uaddlvv8qi_uu (__a); + return __builtin_aarch64_reduc_uplus_widen_scal_v8qi_uu (__a); } __extension__ extern __inline uint32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlv_u16 (uint16x4_t __a) { - return __builtin_aarch64_uaddlvv4hi_uu (__a); + return __builtin_aarch64_reduc_uplus_widen_scal_v4hi_uu (__a); } __extension__ extern __inline int16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlvq_s8 (int8x16_t __a) { - return __builtin_aarch64_saddlvv16qi (__a); 
+ return __builtin_aarch64_reduc_splus_widen_scal_v16qi (__a); } __extension__ extern __inline int32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlvq_s16 (int16x8_t __a) { - return __builtin_aarch64_saddlvv8hi (__a); + return __builtin_aarch64_reduc_splus_widen_scal_v8hi (__a); } __extension__ extern __inline int64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlvq_s32 (int32x4_t __a) { - return __builtin_aarch64_saddlvv4si (__a); + return __builtin_aarch64_reduc_splus_widen_scal_v4si (__a); } __extension__ extern __inline uint16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlvq_u8 (uint8x16_t __a) { - return __builtin_aarch64_uaddlvv16qi_uu (__a); + return __builtin_aarch64_reduc_uplus_widen_scal_v16qi_uu (__a); } __extension__ extern __inline uint32_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlvq_u16 (uint16x8_t __a) { - return __builtin_aarch64_uaddlvv8hi_uu (__a); + return __builtin_aarch64_reduc_uplus_widen_scal_v8hi_uu (__a); } __extension__ extern __inline uint64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlvq_u32 (uint32x4_t __a) { - return __builtin_aarch64_uaddlvv4si_uu (__a); + return __builtin_aarch64_reduc_uplus_widen_scal_v4si_uu (__a); } __extension__ extern __inline float32x2_t @@ -6461,14 +6461,14 @@ __extension__ extern __inline int64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlv_s32 (int32x2_t __a) { - return __builtin_aarch64_saddlvv2si (__a); + return __builtin_aarch64_reduc_splus_widen_scal_v2si (__a); } __extension__ extern __inline uint64_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vaddlv_u32 (uint32x2_t __a) { - return __builtin_aarch64_uaddlvv2si_uu (__a); + return __builtin_aarch64_reduc_uplus_widen_scal_v2si_uu (__a); } __extension__ extern __inline int16x4_t
From patchwork Mon Oct 31 11:58:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 59651 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AE6B73853806 for ; Mon, 31 Oct 2022 12:00:46 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AE6B73853806 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217646; bh=GbuvD6we8C52P7Xvv2yIJLB8MjygodYPPTGJolOa/Hw=; h=Date:To:Subject:In-Reply-To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=KR/Q7X8vQYykYZup/DOWq8J4LyzUWu8VhVuFliSJEKTrllOoF+Fm7HDlDyng1NeOV PJXRIJdac0ODOsO+cTZeO1l5JJz+pgLOtUqsxbPhDVNn7GtoSak/r3kkkIrJVKzroq KeeUBUh2qRa8MNE+avFmsMDF7YQAG8yDeBy9/7tM= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-eopbgr150088.outbound.protection.outlook.com [40.107.15.88]) by sourceware.org (Postfix) with ESMTPS id 6E96B3851C1E for ; Mon, 31 Oct 2022 11:59:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6E96B3851C1E ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass;
b=Uxiw146OAHf0ZFp0k4bMVgfidX4qPAXlwaj4+ZWCYNzOJgBgIoqTE0kud5LdgALkB4a/PuBv490pOSrgb7fKeuV1e/4qpXlXCwTKkrdJQ7zDDGbXJT+W+wB8fpcLtvi88dWihS9ehrxEVNK5r9SAfd+c0hr1vaduEJ33bkisT9QduT4nvV2+jA40S6GaFw/QeI1qxQco/DuYsXQvkslk+7Eu9PYN/iCsmGKXgZpmB++1W4tq1eSBOqXPavePvC6QI7pTvIyYKOqgLZI3+gxQvW3R56NjUkJRG+pEsTejsk7W3UHJrIGPA2f3qj+F4qbS3bkJxME8z8VfL0SP+zXfbA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GbuvD6we8C52P7Xvv2yIJLB8MjygodYPPTGJolOa/Hw=; b=RufwG+pkB6E8Yd3o5ohsHRm9c777NemS68pyFe19LHh4WUwVFPu6frUZ7lT8vGsJPMh5t5ZHQW/W1T+GhlGLBIKEyZIw45MKGRm1IjNDAHWMFOYhPHJrwC2aezp38Nn3Fome0+zi2YM6EF5RS+5YztAzgIbbWJkkPvbfTO3cNZNV18IT9ZWMQSrIp0JzrZIsvgnmjib5Vg8PaYA9UG7Lp8H92VghAdcYtknhPOh3LDxU53njmc4Qw5pQ8csMk3eZYgJd2aVsh4LhJ++kGvap8F9sL5z9lNAIyvKxytNXFWusUt96j70ZO4HBDoyfmyVAOptLMYRzop3m0DWulaQ44w== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) Received: from DB9PR02CA0028.eurprd02.prod.outlook.com (2603:10a6:10:1d9::33) by PA4PR08MB7385.eurprd08.prod.outlook.com (2603:10a6:102:2a0::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.15; Mon, 31 Oct 2022 11:59:02 +0000 Received: from DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com (2603:10a6:10:1d9:cafe::a) by DB9PR02CA0028.outlook.office365.com (2603:10a6:10:1d9::33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.15 via Frontend Transport; Mon, 31 Oct 
2022 11:59:02 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT023.mail.protection.outlook.com (100.127.142.253) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 Oct 2022 11:59:02 +0000 Received: ("Tessian outbound 58faf9791229:v130"); Mon, 31 Oct 2022 11:59:02 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: b0559d7bca7c3507 X-CR-MTA-TID: 64aa7808 Received: from c1694a330e7c.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id C4D91AC3-947B-427E-B065-375B4160E6A9.1; Mon, 31 Oct 2022 11:58:55 +0000 Received: from EUR04-HE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id c1694a330e7c.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 31 Oct 2022 11:58:55 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=ahZbY2ODi7NNQEyld05kSM8tyei0ijttGQvF31BNCLgNBb3dnqhrZ3RuId5k7AeRjG0W1WvwZVFsVQA17CqDbrBqBeChWfKIznyHwEKN5yQnSaB858aG7guE0moKTso7YRj2d/nOEuK1LhE/60zXju3cWut5OqolSVpCpsrEZCfM93lPpnSgUmY8Sv0yOCdcHWsr3RZ30QcEBRUKYrp0ufW7+whKFBimBPRWbhi9PotHiXGgWmlm++lpo/PRTP89OqJxcGBiMxAfHhlsjSe9M+/yrF7VFkkZ4nLP6OsFpn6c6q+UG6SrH1AEEDaio43g3B7lLwA7McSeBOFCBHl4Bw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=GbuvD6we8C52P7Xvv2yIJLB8MjygodYPPTGJolOa/Hw=; b=eR4xOrcBbgfLF32W+I69LEiOli4c5nF7kqR1Jbw7jTq2cJ9s36LlLLDTw1AnN/CW3oOKvabYW9cPeoNYPVzLm6t6Z9i5J25f6Fd6M1RI4LHJbqIcXYBamx0wkm9OWGT5vJz843Oj7ejZ5o7mecUrYZBTQ+2pppIuSDv4LWumAXSeNh9kDNJkZXGv1rYarAJF85hKrN89NB9aPO3WujsTPcfMwf0pjFWBpnpPrsmxgZdp4vm9BPqdw1qmjIFvot+VU24K5laoSN9dkmCvg6aKhgP3r3Vnh95x6t2Qt+ptfVSJ4707uZqRH7K8013t40oSjcO5HEwdB4MdveKDxydgRw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by AS2PR08MB8717.eurprd08.prod.outlook.com (2603:10a6:20b:55d::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14; Mon, 31 Oct 2022 11:58:53 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52%4]) with mapi id 15.20.5769.019; Mon, 31 Oct 2022 11:58:53 +0000 Date: Mon, 31 Oct 2022 11:58:50 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH 5/8]AArch64 aarch64: Make existing V2HF be usable. 
Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO4P123CA0489.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:1ab::8) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|AS2PR08MB8717:EE_|DBAEUR03FT023:EE_|PA4PR08MB7385:EE_ X-MS-Office365-Filtering-Correlation-Id: 73d59646-5faa-4bc8-2867-08dabb374d90 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: gPgn4EpSlKRuG+Em7xxvjqpDQ7gDGplOxqy3tGFed9MQSOFL9aZkbwWz4d+9j+EdWhhUJ3BZm8puHbIUORi6VYheHCBkB0acGQdWhznJCeX/ltthor2Jl3IA6dn3Gd3K/OpJkBDp8VYFFtmnTizQU+2Wi5RDolPZnxp8JeBXG/k/RD2AL6r/rm4BE+3Z8IiSjj2SOXJpXRcYQPtcPD390Ksru42ZrYXCvJivpj1VdmGSCARvkGCmuJfyUHc1qzEpt3APuMLYTbdeB0RomZoiVASUr2A/a32dMpu/Rl21WiNSCBW9h+JfCDRMnc4tRoYiNEdhyfejIsjAOPmzDDJZpsP90RcX5IPyH0aDP9V8BszW0nFzwzzjaTVyjDz/BAgcGw4y0oQHaleYiFr0zZw7UPLt+9arrtuNb1lZEy2/5t77YNjq9eGyC4TzpMiMjJWdErvBs5VfZo9RJQUprh4a74R/Iqq6DHqYT/BuN0WcMhw8EEcUGauXcf7g0YWJLwa1xwhVOH/IR24d6LjDfv0KMEkeLq/PapwKON4E/U/g1ZRoz4/Aj0QsouYtIWST31frtxAOROyyP2VV2qktj0eV8M3aQcNnWJI26jNKaIv5GKPgAlhnvtNz9R8ckB7ZfEV9DGbDXL703ZLP3VJp4NGRMZgcJCXB41hxGUkQ7MoV5MyiypxY0f+jMeD00cnFOan+it/SGoOJdRbyL4pQUIi9svPo9IbUdGVPFWixXI1LigNaqeEr4pTdljbTRxl74BmbcUD5TKEAttHqykVxnTVcOPFG8tUFrWL0SDOGb6CajyRrt9IcG39k8bhzeoAu98j1aO3af75kg+WMflqzYVk59zlNCoMeOjKWOoj5sB5Bm+4= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; 
SFS:(13230022)(4636009)(346002)(376002)(39860400002)(136003)(396003)(366004)(451199015)(478600001)(8676002)(38100700002)(84970400001)(6486002)(316002)(86362001)(6512007)(26005)(4743002)(66476007)(4326008)(41300700001)(186003)(83380400001)(235185007)(2906002)(33964004)(66946007)(2616005)(8936002)(30864003)(6916009)(66556008)(36756003)(5660300002)(44832011)(44144004)(6506007)(67856001)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS2PR08MB8717 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: e9b592e1-fc7a-4b6a-1cca-08dabb3747bc X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: yS9pZjM7O72lD66W8R85RvOUmHNAkCYY2ta8L6rOdeR5zdUoLJ5WHfO8ltCudKrpEd31QjuC+RmTnzx2J5YerhAZ6lcOwVx3CHlC01MusWbZhc9skB0dgp9PDlaNxu2OpXTBc1l0PvkN5VhtgRPUQlpHwKe/CYP8p6oreS4Eqj0j9CnHB8aauiCJM33wM5eh9/3HAEwbo3qGNIAQdoAfjt7s6RU/aahDYMF03jCBx1JXjCYcr/hAUq5BnLbnHcEGzmX9blPKjavIp871gIpjTB5QyxHYfouhloeUgACdtYpeVVr7WxErJKsmubuEbPXG93/refFewi6kq+t7WPH3EMQZka1jN7CT2j0zqdAq48IXECDWPOvJrzcdmZSivRiZgzfv/0i4wCCXXc6p45PQSupG1oCTRA94Era17bYdKI3KoogfEchAlfWyQrYOcdOggWm3KEHIhxRsYHjoGVr1tEIIzfzDMzfZVPt3+Iev292BEI2UsIpmxxQ+lShzQIyUDOEVTtuqXV3i/gM72qjzbZxUSjV9F+/hYZFOXR4gvnydX9WXcrJ7u2EuUoilxouvidZcKbBP4R9DG//nNVtacveuKnYGKTqzc78wo0AEe6V37VVRGN1/U5B0FHaggSG8i6iQik6UWdddQbPvWIZ41Lv3ziL/QuNW/eRbWzoIOkBVg+wkEnppN18yW72/sfiwkzbsCgJEsRa9GKJPO4vQ3Zm8PAL67scFf/7DlNS/Su6Gynx4ZIpX0Vy9X+nLhM8gXLIZF6YPbqoyCnaXoviU9+nVpvO52rbKssv12PpqAw7xlO69RYeFdU0CY/e7ByX1yPBiqdvqKlvAAPSExR9JyqqmTa/sMb10/AWUep8mveWdKLTgKzqsp27TXdtUDniMgTthZ5K9CDU7nr6d2q/q6A== X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; 
PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(396003)(39860400002)(376002)(136003)(346002)(451199015)(46966006)(40470700004)(36840700001)(47076005)(81166007)(36860700001)(40460700003)(356005)(82740400003)(86362001)(44832011)(2906002)(30864003)(5660300002)(235185007)(8676002)(6512007)(41300700001)(70206006)(2616005)(4326008)(82310400005)(6506007)(70586007)(33964004)(6486002)(186003)(336012)(478600001)(4743002)(8936002)(316002)(6916009)(44144004)(26005)(40480700001)(84970400001)(83380400001)(36756003)(2700100001)(67856001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 31 Oct 2022 11:59:02.5078 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 73d59646-5faa-4bc8-2867-08dabb374d90 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PA4PR08MB7385 X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_LOTSOFHASH, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SCC_10_SHORT_WORD_LINES, SCC_5_SHORT_WORD_LINES, SPF_HELO_PASS, SPF_NONE, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Tamar Christina via Gcc-patches From: Tamar Christina Reply-To: Tamar 
Christina Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches"

Hi All,

The backend has an existing V2HFmode that is used by pairwise operations.
This mode was, however, never made fully functional. Among other things, it
was never declared as a vector type, which made it unusable from the mid-end.

It also lacks an implementation for loads and stores, so reload ICEs if this
mode is ever used. This patch finishes the implementation by providing both.

Note that I have created a new iterator, VHSDF_P, instead of extending VHSDF,
because the existing iterator is used for far more than just loads and stores.

It is also used, for instance, in intrinsics, and extending it would force me
to provide support for mangling the type even though we never expose it
through intrinsics.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (*aarch64_simd_movv2hf): New.
	(mov, movmisalign, aarch64_dup_lane, aarch64_store_lane0,
	aarch64_simd_vec_set, @aarch64_simd_vec_copy_lane, vec_set,
	reduc__scal_, reduc__scal_, aarch64_reduc__internal,
	aarch64_get_lane, vec_init, vec_extract): Support V2HF.
	* config/aarch64/aarch64.cc (aarch64_classify_vector_mode):
	Add E_V2HFmode.
	* config/aarch64/iterators.md (VHSDF_P): New.
	(V2F, VALL_F16_FULL, nunits, Vtype, Vmtype, Vetype, stype, VEL,
	Vel, q, vp): Add V2HF.
	* config/arm/types.md (neon_fp_reduc_add_h): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/slp_1.c: Update testcase.
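For anyone who wants a concrete picture of the source this affects, here is a minimal sketch modelled on the vec_slp_##TYPE signature visible in the slp_1.c hunk context. The loop body is an assumption (only the signature appears in this patch), and the names vec_slp_half and half_t are hypothetical. The loop-invariant pair {b, c} is two 16-bit floats, i.e. 32 bits in total, which is presumably where a V2HF value shows up and why the test now expects a third "mov z.s" instead of LD1RW.

```c
/* Sketch only: modelled on the vec_slp_##TYPE signature in slp_1.c.
   The loop body and the half_t fallback are assumptions, not part of
   the patch.  */

#ifdef __FLT16_MAX__
typedef _Float16 half_t;	/* The element type this patch is about.  */
#else
typedef float half_t;		/* Fallback so the sketch still builds on
				   compilers without _Float16.  */
#endif

/* Fill a[0 .. 2*n-1] with the repeating pair (b, c).  When half_t is
   _Float16 the pair is 32 bits wide, i.e. a two-element half-float
   vector.  */
void
vec_slp_half (half_t *restrict a, half_t b, half_t c, int n)
{
  for (int i = 0; i < n; ++i)
    {
      a[i * 2] = b;
      a[i * 2 + 1] = c;
    }
}
```

Whether a given compiler actually vectorizes this through V2HF depends on the target and options; the sketch only shows the shape of the input, not the generated code.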
--- inline copy of patch -- diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 25aed74f8cf939562ed65a578fe32ca76605b58a..93a2888f567460ad10ec050ea7d4f701df4729d1 100644 --- diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 25aed74f8cf939562ed65a578fe32ca76605b58a..93a2888f567460ad10ec050ea7d4f701df4729d1 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -19,10 +19,10 @@ ;; . (define_expand "mov" - [(set (match_operand:VALL_F16 0 "nonimmediate_operand") - (match_operand:VALL_F16 1 "general_operand"))] + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand") + (match_operand:VALL_F16_FULL 1 "general_operand"))] "TARGET_SIMD" - " +{ /* Force the operand into a register if it is not an immediate whose use can be replaced with xzr. If the mode is 16 bytes wide, then we will be doing @@ -46,12 +46,11 @@ (define_expand "mov" aarch64_expand_vector_init (operands[0], operands[1]); DONE; } - " -) +}) (define_expand "movmisalign" - [(set (match_operand:VALL_F16 0 "nonimmediate_operand") - (match_operand:VALL_F16 1 "general_operand"))] + [(set (match_operand:VALL_F16_FULL 0 "nonimmediate_operand") + (match_operand:VALL_F16_FULL 1 "general_operand"))] "TARGET_SIMD && !STRICT_ALIGNMENT" { /* This pattern is not permitted to fail during expansion: if both arguments @@ -85,10 +84,10 @@ (define_insn "aarch64_simd_dup" ) (define_insn "aarch64_dup_lane" - [(set (match_operand:VALL_F16 0 "register_operand" "=w") - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w") + (vec_duplicate:VALL_F16_FULL (vec_select: - (match_operand:VALL_F16 1 "register_operand" "w") + (match_operand:VALL_F16_FULL 1 "register_operand" "w") (parallel [(match_operand:SI 2 "immediate_operand" "i")]) )))] "TARGET_SIMD" @@ -142,6 +141,29 @@ (define_insn "*aarch64_simd_mov" mov_reg, neon_move")] ) +(define_insn "*aarch64_simd_movv2hf" + [(set 
(match_operand:V2HF 0 "nonimmediate_operand" + "=w, m, m, w, ?r, ?w, ?r, w, w") + (match_operand:V2HF 1 "general_operand" + "m, Dz, w, w, w, r, r, Dz, Dn"))] + "TARGET_SIMD_F16INST + && (register_operand (operands[0], V2HFmode) + || aarch64_simd_reg_or_zero (operands[1], V2HFmode))" + "@ + ldr\\t%s0, %1 + str\\twzr, %0 + str\\t%s1, %0 + mov\\t%0.2s[0], %1.2s[0] + umov\\t%w0, %1.s[0] + fmov\\t%s0, %1 + mov\\t%0, %1 + movi\\t%d0, 0 + * return aarch64_output_simd_mov_immediate (operands[1], 32);" + [(set_attr "type" "neon_load1_1reg, store_8, neon_store1_1reg,\ + neon_logic, neon_to_gp, f_mcr,\ + mov_reg, neon_move, neon_move")] +) + (define_insn "*aarch64_simd_mov" [(set (match_operand:VQMOV 0 "nonimmediate_operand" "=w, Umn, m, w, ?r, ?w, ?r, w") @@ -182,7 +204,7 @@ (define_insn "*aarch64_simd_mov" (define_insn "aarch64_store_lane0" [(set (match_operand: 0 "memory_operand" "=m") - (vec_select: (match_operand:VALL_F16 1 "register_operand" "w") + (vec_select: (match_operand:VALL_F16_FULL 1 "register_operand" "w") (parallel [(match_operand 2 "const_int_operand" "n")])))] "TARGET_SIMD && ENDIAN_LANE_N (, INTVAL (operands[2])) == 0" @@ -1035,11 +1057,11 @@ (define_insn "one_cmpl2" ) (define_insn "aarch64_simd_vec_set" - [(set (match_operand:VALL_F16 0 "register_operand" "=w,w,w") - (vec_merge:VALL_F16 - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w,w,w") + (vec_merge:VALL_F16_FULL + (vec_duplicate:VALL_F16_FULL (match_operand: 1 "aarch64_simd_nonimmediate_operand" "w,?r,Utv")) - (match_operand:VALL_F16 3 "register_operand" "0,0,0") + (match_operand:VALL_F16_FULL 3 "register_operand" "0,0,0") (match_operand:SI 2 "immediate_operand" "i,i,i")))] "TARGET_SIMD" { @@ -1061,14 +1083,14 @@ (define_insn "aarch64_simd_vec_set" ) (define_insn "@aarch64_simd_vec_copy_lane" - [(set (match_operand:VALL_F16 0 "register_operand" "=w") - (vec_merge:VALL_F16 - (vec_duplicate:VALL_F16 + [(set (match_operand:VALL_F16_FULL 0 "register_operand" "=w") + 
(vec_merge:VALL_F16_FULL + (vec_duplicate:VALL_F16_FULL (vec_select: - (match_operand:VALL_F16 3 "register_operand" "w") + (match_operand:VALL_F16_FULL 3 "register_operand" "w") (parallel [(match_operand:SI 4 "immediate_operand" "i")]))) - (match_operand:VALL_F16 1 "register_operand" "0") + (match_operand:VALL_F16_FULL 1 "register_operand" "0") (match_operand:SI 2 "immediate_operand" "i")))] "TARGET_SIMD" { @@ -1376,7 +1398,7 @@ (define_insn "vec_shr_" ) (define_expand "vec_set" - [(match_operand:VALL_F16 0 "register_operand") + [(match_operand:VALL_F16_FULL 0 "register_operand") (match_operand: 1 "aarch64_simd_nonimmediate_operand") (match_operand:SI 2 "immediate_operand")] "TARGET_SIMD" @@ -3503,7 +3525,7 @@ (define_insn "popcount2" ;; gimple_fold'd to the IFN_REDUC_(MAX|MIN) function. (This is FP smax/smin). (define_expand "reduc__scal_" [(match_operand: 0 "register_operand") - (unspec: [(match_operand:VHSDF 1 "register_operand")] + (unspec: [(match_operand:VHSDF_P 1 "register_operand")] FMAXMINV)] "TARGET_SIMD" { @@ -3518,7 +3540,7 @@ (define_expand "reduc__scal_" (define_expand "reduc__scal_" [(match_operand: 0 "register_operand") - (unspec: [(match_operand:VHSDF 1 "register_operand")] + (unspec: [(match_operand:VHSDF_P 1 "register_operand")] FMAXMINNMV)] "TARGET_SIMD" { @@ -3562,8 +3584,8 @@ (define_insn "aarch64_reduc__internalv2si" ) (define_insn "aarch64_reduc__internal" - [(set (match_operand:VHSDF 0 "register_operand" "=w") - (unspec:VHSDF [(match_operand:VHSDF 1 "register_operand" "w")] + [(set (match_operand:VHSDF_P 0 "register_operand" "=w") + (unspec:VHSDF_P [(match_operand:VHSDF_P 1 "register_operand" "w")] FMAXMINV))] "TARGET_SIMD" "\\t%0, %1." 
@@ -4208,7 +4230,7 @@ (define_insn "*aarch64_get_lane_zero_extend" (define_insn_and_split "aarch64_get_lane" [(set (match_operand: 0 "aarch64_simd_nonimmediate_operand" "=?r, w, Utv") (vec_select: - (match_operand:VALL_F16 1 "register_operand" "w, w, w") + (match_operand:VALL_F16_FULL 1 "register_operand" "w, w, w") (parallel [(match_operand:SI 2 "immediate_operand" "i, i, i")])))] "TARGET_SIMD" { @@ -7989,7 +8011,7 @@ (define_expand "aarch64_st1" ;; Standard pattern name vec_init. (define_expand "vec_init" - [(match_operand:VALL_F16 0 "register_operand") + [(match_operand:VALL_F16_FULL 0 "register_operand") (match_operand 1 "" "")] "TARGET_SIMD" { @@ -8068,7 +8090,7 @@ (define_insn "aarch64_urecpe" (define_expand "vec_extract" [(match_operand: 0 "aarch64_simd_nonimmediate_operand") - (match_operand:VALL_F16 1 "register_operand") + (match_operand:VALL_F16_FULL 1 "register_operand") (match_operand:SI 2 "immediate_operand")] "TARGET_SIMD" { diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index f05bac713e88ea8c7feaa2367d55bd523ca66f57..1e08f8453688210afe1566092b19b59c9bdd0c97 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -3566,6 +3566,7 @@ aarch64_classify_vector_mode (machine_mode mode) case E_V8BFmode: case E_V4SFmode: case E_V2DFmode: + case E_V2HFmode: return TARGET_SIMD ? VEC_ADVSIMD : 0; default: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 37d8161a33b1c399d80be82afa67613a087389d4..1df09f7fe2eb35aed96113476541e0faa5393551 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -160,6 +160,10 @@ (define_mode_iterator VDQF [V2SF V4SF V2DF]) (define_mode_iterator VHSDF [(V4HF "TARGET_SIMD_F16INST") (V8HF "TARGET_SIMD_F16INST") V2SF V4SF V2DF]) +;; Advanced SIMD Float modes suitable for pairwise operations. 
+(define_mode_iterator VHSDF_P [(V4HF "TARGET_SIMD_F16INST") + (V8HF "TARGET_SIMD_F16INST") + V2SF V4SF V2DF (V2HF "TARGET_SIMD_F16INST")]) ;; Advanced SIMD Float modes, and DF. (define_mode_iterator VDQF_DF [V2SF V4SF V2DF DF]) @@ -188,15 +192,23 @@ (define_mode_iterator VDQF_COND [V2SF V2SI V4SF V4SI V2DF V2DI]) (define_mode_iterator VALLF [V2SF V4SF V2DF SF DF]) ;; Advanced SIMD Float modes with 2 elements. -(define_mode_iterator V2F [V2SF V2DF]) +(define_mode_iterator V2F [V2SF V2DF V2HF]) ;; All Advanced SIMD modes on which we support any arithmetic operations. (define_mode_iterator VALL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V2SF V4SF V2DF]) -;; All Advanced SIMD modes suitable for moving, loading, and storing. +;; All Advanced SIMD modes suitable for moving, loading, and storing +;; except V2HF. (define_mode_iterator VALL_F16 [V8QI V16QI V4HI V8HI V2SI V4SI V2DI V4HF V8HF V4BF V8BF V2SF V4SF V2DF]) +;; All Advanced SIMD modes suitable for moving, loading, and storing +;; including V2HF +(define_mode_iterator VALL_F16_FULL [V8QI V16QI V4HI V8HI V2SI V4SI V2DI + V4HF V8HF V4BF V8BF V2SF V4SF V2DF + (V2HF "TARGET_SIMD_F16INST")]) + + ;; The VALL_F16 modes except the 128-bit 2-element ones. (define_mode_iterator VALL_F16_NO_V2Q [V8QI V16QI V4HI V8HI V2SI V4SI V4HF V8HF V2SF V4SF]) @@ -1076,7 +1088,7 @@ (define_mode_attr nunits [(V8QI "8") (V16QI "16") (V2SF "2") (V4SF "4") (V1DF "1") (V2DF "2") (DI "1") (DF "1") - (V8DI "8")]) + (V8DI "8") (V2HF "2")]) ;; Map a mode to the number of bits in it, if the size of the mode ;; is constant. @@ -1090,6 +1102,7 @@ (define_mode_attr s [(HF "h") (SF "s") (DF "d") (SI "s") (DI "d")]) ;; Give the length suffix letter for a sign- or zero-extension. 
(define_mode_attr size [(QI "b") (HI "h") (SI "w")]) +(define_mode_attr sizel [(QI "b") (HI "h") (SI "")]) ;; Give the number of bits in the mode (define_mode_attr sizen [(QI "8") (HI "16") (SI "32") (DI "64")]) @@ -1134,8 +1147,9 @@ (define_mode_attr Vtype [(V8QI "8b") (V16QI "16b") (V2SI "2s") (V4SI "4s") (DI "1d") (DF "1d") (V2DI "2d") (V2SF "2s") - (V4SF "4s") (V2DF "2d") - (V4HF "4h") (V8HF "8h") + (V2HF "2h") (V4SF "4s") + (V2DF "2d") (V4HF "4h") + (V8HF "8h") (V2x8QI "8b") (V2x4HI "4h") (V2x2SI "2s") (V2x1DI "1d") (V2x4HF "4h") (V2x2SF "2s") @@ -1175,9 +1189,10 @@ (define_mode_attr Vmtype [(V8QI ".8b") (V16QI ".16b") (V4HI ".4h") (V8HI ".8h") (V2SI ".2s") (V4SI ".4s") (V2DI ".2d") (V4HF ".4h") - (V8HF ".8h") (V4BF ".4h") - (V8BF ".8h") (V2SF ".2s") - (V4SF ".4s") (V2DF ".2d") + (V8HF ".8h") (V2HF ".2h") + (V4BF ".4h") (V8BF ".8h") + (V2SF ".2s") (V4SF ".4s") + (V2DF ".2d") (DI "") (SI "") (HI "") (QI "") (TI "") (HF "") @@ -1193,7 +1208,7 @@ (define_mode_attr Vmntype [(V8HI ".8b") (V4SI ".4h") (define_mode_attr Vetype [(V8QI "b") (V16QI "b") (V4HI "h") (V8HI "h") (V2SI "s") (V4SI "s") - (V2DI "d") + (V2DI "d") (V2HF "h") (V4HF "h") (V8HF "h") (V2SF "s") (V4SF "s") (V2DF "d") @@ -1285,7 +1300,7 @@ (define_mode_attr Vcwtype [(VNx16QI "b") (VNx8QI "h") (VNx4QI "w") (VNx2QI "d") ;; more accurately. 
(define_mode_attr stype [(V8QI "b") (V16QI "b") (V4HI "s") (V8HI "s") (V2SI "s") (V4SI "s") (V2DI "d") (V4HF "s") - (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d") + (V8HF "s") (V2SF "s") (V4SF "s") (V2DF "d") (V2HF "s") (HF "s") (SF "s") (DF "d") (QI "b") (HI "s") (SI "s") (DI "d")]) @@ -1360,8 +1375,8 @@ (define_mode_attr VEL [(V8QI "QI") (V16QI "QI") (V4HF "HF") (V8HF "HF") (V2SF "SF") (V4SF "SF") (DF "DF") (V2DF "DF") - (SI "SI") (HI "HI") - (QI "QI") + (SI "SI") (V2HF "HF") + (QI "QI") (HI "HI") (V4BF "BF") (V8BF "BF") (VNx16QI "QI") (VNx8QI "QI") (VNx4QI "QI") (VNx2QI "QI") (VNx8HI "HI") (VNx4HI "HI") (VNx2HI "HI") @@ -1381,7 +1396,7 @@ (define_mode_attr Vel [(V8QI "qi") (V16QI "qi") (V2SF "sf") (V4SF "sf") (V2DF "df") (DF "df") (SI "si") (HI "hi") - (QI "qi") + (QI "qi") (V2HF "hf") (V4BF "bf") (V8BF "bf") (VNx16QI "qi") (VNx8QI "qi") (VNx4QI "qi") (VNx2QI "qi") (VNx8HI "hi") (VNx4HI "hi") (VNx2HI "hi") @@ -1866,7 +1881,7 @@ (define_mode_attr q [(V8QI "") (V16QI "_q") (V4HF "") (V8HF "_q") (V4BF "") (V8BF "_q") (V2SF "") (V4SF "_q") - (V2DF "_q") + (V2HF "") (V2DF "_q") (QI "") (HI "") (SI "") (DI "") (HF "") (SF "") (DF "") (V2x8QI "") (V2x16QI "_q") (V2x4HI "") (V2x8HI "_q") @@ -1905,6 +1920,7 @@ (define_mode_attr vp [(V8QI "v") (V16QI "v") (V2SI "p") (V4SI "v") (V2DI "p") (V2DF "p") (V2SF "p") (V4SF "v") + (V2HF "p") (V4HF "v") (V8HF "v")]) (define_mode_attr vsi2qi [(V2SI "v8qi") (V4SI "v16qi") diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md index 7d0504bdd944e9c0d1b545b0b66a9a1adc808714..3cfbc7a93cca1bea4925853e51d0a147c5722247 100644 --- a/gcc/config/arm/types.md +++ b/gcc/config/arm/types.md @@ -483,6 +483,7 @@ (define_attr "autodetect_type" ; neon_fp_minmax_s_q ; neon_fp_minmax_d ; neon_fp_minmax_d_q +; neon_fp_reduc_add_h ; neon_fp_reduc_add_s ; neon_fp_reduc_add_s_q ; neon_fp_reduc_add_d @@ -1033,6 +1034,7 @@ (define_attr "type" neon_fp_minmax_d,\ neon_fp_minmax_d_q,\ \ + neon_fp_reduc_add_h,\ neon_fp_reduc_add_s,\ neon_fp_reduc_add_s_q,\ 
neon_fp_reduc_add_d,\ @@ -1257,8 +1259,8 @@ (define_attr "is_neon_type" "yes,no" neon_fp_compare_d, neon_fp_compare_d_q, neon_fp_minmax_s,\ neon_fp_minmax_s_q, neon_fp_minmax_d, neon_fp_minmax_d_q,\ neon_fp_neg_s, neon_fp_neg_s_q, neon_fp_neg_d, neon_fp_neg_d_q,\ - neon_fp_reduc_add_s, neon_fp_reduc_add_s_q, neon_fp_reduc_add_d,\ - neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s, + neon_fp_reduc_add_h, neon_fp_reduc_add_s, neon_fp_reduc_add_s_q,\ + neon_fp_reduc_add_d, neon_fp_reduc_add_d_q, neon_fp_reduc_minmax_s,\ neon_fp_reduc_minmax_s_q, neon_fp_reduc_minmax_d,\ neon_fp_reduc_minmax_d_q,\ neon_fp_cvt_narrow_s_q, neon_fp_cvt_narrow_d_q,\ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c index 07d71a63414b1066ea431e287286ad048515711a..8e35e0b574d49913b43c7d8d4f4ba75f127f42e9 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c @@ -30,11 +30,9 @@ vec_slp_##TYPE (TYPE *restrict a, TYPE b, TYPE c, int n) \ TEST_ALL (VEC_PERM) /* We should use one DUP for each of the 8-, 16- and 32-bit types, - although we currently use LD1RW for _Float16. We should use two - DUPs for each of the three 64-bit types. */ + We should use two DUPs for each of the three 64-bit types. */ /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.h, [hw]} 2 } } */ -/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 2 } } */ -/* { dg-final { scan-assembler-times {\tld1rw\tz[0-9]+\.s, } 1 } } */ +/* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.s, [sw]} 3 } } */ /* { dg-final { scan-assembler-times {\tmov\tz[0-9]+\.d, [dx]} 9 } } */ /* { dg-final { scan-assembler-times {\tzip1\tz[0-9]+\.d, z[0-9]+\.d, z[0-9]+\.d\n} 3 } } */ /* { dg-final { scan-assembler-not {\tzip2\t} } } */

From patchwork Mon Oct 31 11:59:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 59650 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DD8AB3887F48 for ; Mon, 31 Oct 2022 12:00:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DD8AB3887F48 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217629; bh=NMByqjfRKqOASUbUkrPjwQO+J7oSNrjaV3ZkTi1BOQ0=; h=Date:To:Subject:In-Reply-To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=IfweRBL6qIYYM/LaUriO5i68DSbKSMOKT18ia4nV/xzspE84g9lQkTX8MSs2/j0Og bW4KmdKANZPwoAQD1AqN+biB36zAZo+gL4yOp3/BJw9xoov1J2zof33f5j6cU3ZEhJ KHyn0Bffht59lQPzVYSkNuHaX6pX81J40J+B7iqA= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR04-HE1-obe.outbound.protection.outlook.com (mail-eopbgr70083.outbound.protection.outlook.com [40.107.7.83]) by sourceware.org (Postfix) with ESMTPS id BECA2385381D for ; Mon, 31 Oct 2022 11:59:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org BECA2385381D ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass;
b=I2AXpv8BbiVZN7q4pjgu/eJv3uSfp7h5IF/D3JWMu7PFbMUz1H0C9ek6bfhftey0ecub4LIhyjjjFvSpCiG+peVbNGr7lPv8gK8dS6ZkL9ZxS+nxsotAo4CBufShdWRm/tgJ+9gFznTKSoLvtIgLBwwfk+V+0CsojqhmfJJwsiz47W6qVf3lsiOt3bP/YIjGkrDALkYfKAnBz7wy0OQLwuS1XbFim5bb/DmHxXGs/ZrJm3GFY7pTPND0OCKdtINp2vWhyYI2xuSiAHcl2nzbTKWeyCVyZVQXN2V04N/XxL9OsU66BM+OesTQWKC3h12q/nTjc8LJUhx+pihyXUAmqQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=NMByqjfRKqOASUbUkrPjwQO+J7oSNrjaV3ZkTi1BOQ0=; b=QB20LsCkLsZJLASnhSG1jo5IbdAadS7Gvw8TBajInC2/NGlDal5lNfMKUZiAYnPguauQCtz1X9K75tCLHmELYJ+Ljl+b7Y64edzrvI6OxUTSn3oOzQHwcDMB8hyCtWZX+WYA+2yZdVxB0XbxA0gmrQxydrs7O82WWJqpVc0EU52H0/ULBh4vWzlaCwHQujUmDbe3nw2m5P4N1XLJYCLm6wANuydsf/OcE1GghaHChSffmvStuPwiDe39MV7qbbbC0VhMt/jzs8rxWmOEWjTRja3trBm0uNgGy3MYmFVyc9snPsA51SoDUTMUF9+WiEsN2QCdZYlxdHlKGAw2WUNCUQ== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) Received: from AS9PR04CA0128.eurprd04.prod.outlook.com (2603:10a6:20b:531::10) by PA4PR08MB6254.eurprd08.prod.outlook.com (2603:10a6:102:f3::11) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.18; Mon, 31 Oct 2022 11:59:31 +0000 Received: from AM7EUR03FT056.eop-EUR03.prod.protection.outlook.com (2603:10a6:20b:531:cafe::89) by AS9PR04CA0128.outlook.office365.com (2603:10a6:20b:531::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 
Oct 2022 11:59:31 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM7EUR03FT056.mail.protection.outlook.com (100.127.140.107) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.14 via Frontend Transport; Mon, 31 Oct 2022 11:59:31 +0000 Received: ("Tessian outbound 6c699027a257:v130"); Mon, 31 Oct 2022 11:59:31 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 86b378ac9063a837 X-CR-MTA-TID: 64aa7808 Received: from 23c15775faa1.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 79764953-D9D6-4575-B055-B007F5552F0E.1; Mon, 31 Oct 2022 11:59:20 +0000 Received: from EUR05-AM6-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 23c15775faa1.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 31 Oct 2022 11:59:20 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=LosX++JY6fBhj/i+7wIFWFpY6/WozxsjPUzcjDVnqdMxU6G0I0O9IZqkVpmBjW7kqGYZYQibMed4F2JMPyav43CJoJNhQl6dAxUdsfqAFVOyuA4GFjzN4fWNRgt2ZdYhkpLyY1pli0Yz8gLKzRQEJ6zzUzptS8kaOkSQUsLeA2AmtVwTQKyf9GuY3mormnHB1tY1fwKRSQGP3xTzfirheRXC9Y1xnhEpyGMBsfHrDjlQwaW7/lVNESsE4Yw3sJra8tKHT9twgxGWit6EF4FBCjjHYqpoOTmmVN5hOBcbQt8/MYehEK9gmpR3qRbX97aQeoIImc2vu68TFk31CW6KBQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; 
bh=NMByqjfRKqOASUbUkrPjwQO+J7oSNrjaV3ZkTi1BOQ0=; b=oC49Ko2EHnE8dt8EFhfUTz797X3yjjXkt/JCQGn0tjWlBj3jdup8+RE4KhqnZBUbmNuzSVz3KaSAeWS0V10O+al9xxNUPR0vA2ps7IG9CRNItGIB1XJqF5wOJLNWiQ3MtiUMJuLB3aGQqkmsLmDmlMwSpnrKPVd1tle+hAEH2LdiGoQVqwsngXTcqdS4lv3vbHnMVT98vhZLu3Iiv3mVx7YKDzvCYplyiljOIFfNq6QGbe0E/gdG+kXu+RlKzdgNKlgHTPVbCHarhd73FOdq8Wm9roq9jjV47sjpykaOpgfcrwmnLc9UwePDOYhZ4EuC2L3nIooPvMzU6bEEWWHIqg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) by DBBPR08MB6268.eurprd08.prod.outlook.com (2603:10a6:10:202::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5769.19; Mon, 31 Oct 2022 11:59:17 +0000 Received: from VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52]) by VI1PR08MB5325.eurprd08.prod.outlook.com ([fe80::c57d:50c2:3502:a52%4]) with mapi id 15.20.5769.019; Mon, 31 Oct 2022 11:59:17 +0000 Date: Mon, 31 Oct 2022 11:59:14 +0000 To: gcc-patches@gcc.gnu.org Subject: [PATCH 6/8]AArch64: Add peephole and scheduling logic for pairwise operations that appear late in RTL. 
Message-ID: Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO2P265CA0031.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:61::19) To VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: VI1PR08MB5325:EE_|DBBPR08MB6268:EE_|AM7EUR03FT056:EE_|PA4PR08MB6254:EE_ X-MS-Office365-Filtering-Correlation-Id: fd0488e7-f991-4d0f-23ac-08dabb375ef5 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: OBcggxxwVXZdJlgF1bA6d3Yx10fbv2/xCk3y3roGqQL0UiBu9RlsiFUK3kz1ka6TvJsFHlmxwBETV4hIWQSVN2S/j7Kf24OzA2IoVmI/jiY75DSa2R8ssX+jg4dWm6K6pIqQ6ODdcRz5wnp0caZNLBFvOBcSd+7oZly4ZUZ4L7Y0t17xFzVk644fvqK7E5oPg02HhBeo6n0dpH9W9SabvTPgzaYRRsSZZ+J+GnzdfYo/T5XkqNjJa0Fu3veDvWnWb7L4QDBaF30ZGvK2ennOOQaDFP2bGRVz+vIRmqz3lc9eUblkXyj49gqMxD2t6zHzv2x6B45W9/2bPw0tnv68oA5NJP5hhwmH0RsqE5VRbh1gjKJafB/LiCFlmb4dWZ1Wijafo+vWN5QuUKsLqd++6KqyBaS+5k+h26ek5jO5JV6CuO8MqDUL0LaRLDt6mXGu31tnmXe7pNIHfmq6PZ6Ex7m/OaJDBwKa5O7xgle1G32G0YwEYRhlp6wLhKwEXniTssc7HaORxsCLK/tqzUNW7jffhKRBux4Cy2smkTDdXHzwW4DOpruMgojpcSBfyLQW6mWkdxsTqWsVwb0VgtHkruotXV6gL1JNdoSct7reE9VCLNXubxOw9ptEn2xjH4bO6oBXWym6DQKCSGGCYDCif0SYuFX/oIWTdMmspeB1ZeadthzFrUel8+f2PPVhBQ9XjPmgcUaEh+/nq6VmO5qDPBUusdBPhZ0B0T2N2yzn+iWLfvWB2LNEhClEgYKs4qjy//hulhpbWXeK8TTKK5uHVQ== X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VI1PR08MB5325.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230022)(4636009)(346002)(366004)(39860400002)(136003)(376002)(396003)(451199015)(478600001)(2906002)(38100700002)(66476007)(66556008)(66946007)(4326008)(8676002)(33964004)(44832011)(2616005)(186003)(41300700001)(86362001)(235185007)(5660300002)(6506007)(6486002)(44144004)(316002)(36756003)(6916009)(6666004)(8936002)(26005)(4743002)(6512007)(2700100001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBBPR08MB6268 
Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM7EUR03FT056.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: a1852387-5bfd-404a-6efd-08dabb375622 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 2hfWXmx9zh1XRzJjDfa2n2Rn34fUWM4gsae/5Nnb6He/omTEobAw+H0SUxU+ogw4wkK+EsQ43ShHzzjYtLxcMnpWit0IZBJWkjH3llb1nj8xCLfbbRawk5odqnFaAek6LjpDy38ypFbXKJe1emJMDyLeadbRfDUMo60XPiDvyr5pICZRAs2XJhfjnHs9Be1EYKKnPsEIG62+jhBelCl3H8JzKWd8OCtZSYoyl9Iviw5yGhAQor6ZpATRnSEFROxZFxJqLXzX5AJtRnRYLvsP2IAPYumuPcnkSCp5/+2huEZJWq1OV8+xjSYS3TS3+pe8rLId7yULiFnj1Ba1pv5rdpu5p3IgCAMD1zgCBYq4bZdmqSXjPywdycyieku0881os4Zksd7JI9rXF0FKZ1gYw2Uo2CTZK0SzcLKJFf32TTKTmYwn8xNDeR//NZ3E3rMdYIcCoNsFMWe4yyoiEGv3hZcS4fVUOI+yI5ugsSxxaFuEcavEAlN0Upby9plQMzipWtXlbwdAyqTeKJBL1IsUoczVsYTY3rYo1og/HUI9aR3hbtC/MFOdDO7HbNMEvyWzzJVMz/G5fcfSAfsc/6WgoTj4+Z9RfaVnZv2XzESpH7i2rSE1gs8xizklVlktfrQPvLw+2UQV7EV1uxG9zOMCUtzRrH/Cr3LKwvQiPqsNcQgo5MXc/P7Ai23zOgI0KuCVrEloKTXonB3hddVU3cXkyXblH6oCDPrx0SVBL19+R2QG8axGeRQWWHqoHe4spJFZE81lOSt0HjG006niOBOsHn3nFiUQ3p/d2zHVZzxb+tA= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230022)(4636009)(346002)(136003)(376002)(396003)(39860400002)(451199015)(40470700004)(46966006)(36840700001)(47076005)(336012)(40480700001)(40460700003)(6666004)(6486002)(2906002)(44832011)(44144004)(356005)(81166007)(36756003)(86362001)(82740400003)(82310400005)(4743002)(6512007)(33964004)(26005)(36860700001)(2616005)(186003)(316002)(70206006)(8676002)(478600001)(4326008)(70586007)(41300700001)(6506007)(235185007)(8936002)(5660300002)(6916009)(2700100001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com 
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com

Hi All,

Does what it says on the tin.  If pairwise operations only form in RTL as the
result of a split, combine or any other RTL pass, still try to recognize them.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md: Add new peepholes.
	* config/aarch64/aarch64.cc (aarch_macro_fusion_pair_p): Schedule
	sequential PLUS operations next to each other to increase the chance of
	forming pairwise operations.
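For illustration only (not part of the patch): a minimal C sketch of the
source-level shape behind the RTL this peephole looks for, namely a lane-1
vec_select followed by a plus with the same vector register.  It assumes GCC's
generic vector extension; the names v2sf and pairwise_sum are hypothetical.
Whether this exact source reaches the peephole, rather than being matched by an
earlier pass, depends on the compiler version and options, so treat any
expected FADDP as an assumption rather than verified output.

```c
#include <assert.h>

/* Two-lane float vector using GCC's generic vector extension
   (assumed to be available; hypothetical type name).  */
typedef float v2sf __attribute__ ((vector_size (8)));

/* Add lane 1 to lane 0.  On AArch64 this is the scalar form of a
   pairwise add, which the message hopes to see emitted as FADDP.  */
float
pairwise_sum (v2sf v)
{
  return v[1] + v[0];
}
```

A quick check such as `assert (pairwise_sum ((v2sf) { 1.5f, 2.25f }) == 3.75f);`
confirms the arithmetic; the interesting part is the generated assembly, which
can be inspected with -O2 -S on an AArch64 target.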
--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 93a2888f567460ad10ec050ea7d4f701df4729d1..20e9adbf7b9b484f9a19f0c62770930dc3941eb2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3425,6 +3425,22 @@ (define_insn "aarch64_faddp"
   [(set_attr "type" "neon_fp_reduc_add_")]
 )
 
+(define_peephole2
+  [(set (match_operand: 0 "register_operand")
+        (vec_select:
+          (match_operand:VHSDF 1 "register_operand")
+          (parallel [(match_operand 2 "const_int_operand")])))
+   (set (match_operand: 3 "register_operand")
+        (plus:
+          (match_dup 0)
+          (match_operand: 5 "register_operand")))]
+  "TARGET_SIMD
+   && ENDIAN_LANE_N (, INTVAL (operands[2])) == 1
+   && REGNO (operands[5]) == REGNO (operands[1])
+   && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 3) (unspec: [(match_dup 1)] UNSPEC_FADDV))]
+)
+
 (define_insn "reduc_plus_scal_"
  [(set (match_operand: 0 "register_operand" "=w")
        (unspec: [(match_operand:VDQV 1 "register_operand" "w")]
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index f3bd71c9f10868f9e6ab50d8e36ed3ee3d48ac22..4023b1729d92bf37f5a2fc8fc8cd3a5194532079 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25372,6 +25372,29 @@ aarch_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
         }
     }
 
+  /* Try to schedule vec_select and add together so the peephole works.  */
+  if (simple_sets_p && REG_P (SET_DEST (prev_set)) && REG_P (SET_DEST (curr_set))
+      && GET_CODE (SET_SRC (prev_set)) == VEC_SELECT && GET_CODE (SET_SRC (curr_set)) == PLUS)
+    {
+      /* We're trying to match:
+       prev (vec_select) == (set (reg r0)
+                                 (vec_select (reg r1) n)
+       curr (plus) == (set (reg r2)
+                           (plus (reg r0) (reg r1)))  */
+      rtx prev_src = SET_SRC (prev_set);
+      rtx curr_src = SET_SRC (curr_set);
+      rtx parallel = XEXP (prev_src, 1);
+      auto idx
+        = ENDIAN_LANE_N (GET_MODE_NUNITS (GET_MODE (XEXP (prev_src, 0))), 1);
+      if (GET_CODE (parallel) == PARALLEL
+          && XVECLEN (parallel, 0) == 1
+          && known_eq (INTVAL (XVECEXP (parallel, 0, 0)), idx)
+          && GET_MODE (SET_DEST (prev_set)) == GET_MODE (curr_src)
+          && GET_MODE_INNER (GET_MODE (XEXP (prev_src, 0)))
+             == GET_MODE (XEXP (curr_src, 1)))
+        return true;
+    }
+
   /* Fuse compare (CMP/CMN/TST/BICS) and conditional branch.
     */
   if (aarch64_fusion_enabled_p (AARCH64_FUSE_CMP_BRANCH)
       && prev_set && curr_set && any_condjump_p (curr)

From patchwork Mon Oct 31 11:59:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 59652
Date: Mon, 31 Oct 2022 11:59:36 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 7/8]AArch64: Consolidate zero and sign extension patterns and add missing ones.
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com

Hi All,

The target has various zero and sign extension patterns.  However, these live
in various locations around the MD file and almost all of them are split
differently.  Because the patterns are so scattered, we also ended up missing
valid extensions.  For instance smov is almost never generated.

This change tries to make this more manageable by consolidating the patterns as
much as possible and, in doing so, fixes the missing alternatives.

There were also some duplicate patterns.  Note that the
zero_extend<*_ONLY:mode>2 patterns are nearly identical; however, QImode lacks
an alternative that the others have, so I have left them as 3 different
patterns next to each other.
In a lot of cases the wrong iterator was used, leaving out cases that should
exist.

I've also changed the masks used for zero extensions to hex instead of decimal,
as it's clearer what they do that way and it aligns better with the output of
other compilers.

This leaves the bulk of the extensions in just 3 patterns.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (*aarch64_get_lane_zero_extend): Changed to ...
	(*aarch64_get_lane_zero_extend): ... This.
	(*aarch64_get_lane_extenddi): New.
	* config/aarch64/aarch64.md (sidi2, *extendsidi2_aarch64, qihi2,
	*extendqihi2_aarch64, *zero_extendsidi2_aarch64): Remove duplicate patterns.
	(2, *extend2_aarch64): Remove, consolidate into ...
	(extend2): ... This.
	(*zero_extendqihi2_aarch64, *zero_extend2_aarch64): Remove, consolidate into ...
	(zero_extend2, zero_extend2, zero_extend2):
	(*ands_compare0): Renamed to ...
	(*ands_compare0): ... This.
	* config/aarch64/iterators.md (HI_ONLY, QI_ONLY): New.
	(short_mask): Use hex rather than dec and add SI.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ands_3.c: Update codegen.
	* gcc.target/aarch64/sve/slp_1.c: Likewise.
	* gcc.target/aarch64/tst_5.c: Likewise.
	* gcc.target/aarch64/tst_6.c: Likewise.
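For illustration only (not part of the patch): a small C sketch of the kinds of
extension the consolidated patterns are meant to cover.  The function and type
names are hypothetical, and the instructions named in the comments are what the
message suggests one might expect on AArch64, not verified compiler output.
The vector lane access assumes GCC's generic vector extension.

```c
#include <assert.h>
#include <stdint.h>

/* Four-lane 32-bit integer vector (GCC generic vector extension;
   hypothetical type name).  */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Zero extension of a byte.  Equivalent to masking with 0xff, the hex
   spelling of the decimal mask 255.  */
uint32_t
zext_qi (uint8_t x)
{
  return x;
}

/* Sign extension of a 32-bit value to 64 bits (possibly SXTW).  */
int64_t
sext_si (int32_t x)
{
  return x;
}

/* Sign extension of a vector lane to 64 bits, the sort of case where
   the message says SMOV was almost never generated.  */
int64_t
sext_lane (v4si v)
{
  return v[1];
}
```

For example, `sext_lane ((v4si) { 1, -2, 3, 4 })` yields -2; comparing the
assembly before and after the patch (for instance with -O2 -S) would show
whether a single SMOV replaces a lane move followed by a separate extend.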
--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 8a84a8560e982b8155b18541f5504801b3330124..d0b37c4dd48aeafd3d87c90dc3270e71af5a72b9 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4237,19 +4237,34 @@ (define_insn "*aarch64_get_lane_extend"
   [(set_attr "type" "neon_to_gp")]
 )
 
-(define_insn "*aarch64_get_lane_zero_extend"
+(define_insn "*aarch64_get_lane_extenddi"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+        (sign_extend:DI
+          (vec_select:
+            (match_operand:VS 1 "register_operand" "w")
+            (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
+  "TARGET_SIMD"
+  {
+    operands[2] = aarch64_endian_lane_rtx (mode,
+                                           INTVAL (operands[2]));
+    return "smov\\t%x0, %1.[%2]";
+  }
+  [(set_attr "type" "neon_to_gp")]
+)
+
+(define_insn "*aarch64_get_lane_zero_extend"
   [(set (match_operand:GPI 0 "register_operand" "=r")
         (zero_extend:GPI
-          (vec_select:
-            (match_operand:VDQQH 1 "register_operand" "w")
+          (vec_select:
+            (match_operand:VDQV_L 1 "register_operand" "w")
             (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_endian_lane_rtx (mode,
+    operands[2] = aarch64_endian_lane_rtx (mode,
                                            INTVAL (operands[2]));
-    return "umov\\t%w0, %1.[%2]";
+    return "umov\\t%w0, %1.[%2]";
   }
-  [(set_attr "type" "neon_to_gp")]
+  [(set_attr "type" "neon_to_gp")]
 )
 
 ;; Lane extraction of a value, neither sign nor zero extension
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 3ea16dbc2557c6a4f37104d44a49f77f768eb53d..09ae1118371f82ca63146fceb953eb9e820d05a4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1911,22 +1911,6 @@ (define_insn "storewb_pair_"
 ;; Sign/Zero extension
 ;; -------------------------------------------------------------------
 
-(define_expand "sidi2"
-  [(set (match_operand:DI 0 "register_operand")
-        (ANY_EXTEND:DI (match_operand:SI 1 "nonimmediate_operand")))]
-  ""
-)
-
-(define_insn "*extendsidi2_aarch64"
-  [(set (match_operand:DI 0 "register_operand" "=r,r")
-        (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))]
-  ""
-  "@
-   sxtw\t%0, %w1
-   ldrsw\t%0, %1"
-  [(set_attr "type" "extend,load_4")]
-)
-
 (define_insn "*load_pair_extendsidi2_aarch64"
   [(set (match_operand:DI 0 "register_operand" "=r")
         (sign_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump")))
@@ -1940,21 +1924,6 @@ (define_insn "*load_pair_extendsidi2_aarch64"
   [(set_attr "type" "load_8")]
 )
 
-(define_insn "*zero_extendsidi2_aarch64"
-  [(set (match_operand:DI 0 "register_operand" "=r,r,w,w,r,w")
-        (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m,r,m,w,w")))]
-  ""
-  "@
-   uxtw\t%0, %w1
-   ldr\t%w0, %1
-   fmov\t%s0, %w1
-   ldr\t%s0, %1
-   fmov\t%w0, %s1
-   fmov\t%s0, %s1"
-  [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov")
-   (set_attr "arch" "*,*,fp,fp,fp,fp")]
-)
-
 (define_insn "*load_pair_zero_extendsidi2_aarch64"
   [(set (match_operand:DI 0 "register_operand" "=r,w")
         (zero_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump,Ump")))
@@ -1971,61 +1940,64 @@ (define_insn "*load_pair_zero_extendsidi2_aarch64"
    (set_attr "arch" "*,fp")]
 )
 
-(define_expand "2"
-  [(set (match_operand:GPI 0 "register_operand")
-        (ANY_EXTEND:GPI (match_operand:SHORT 1 "nonimmediate_operand")))]
-  ""
-)
-
-(define_insn "*extend2_aarch64"
-  [(set (match_operand:GPI 0 "register_operand" "=r,r,r")
-        (sign_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m,w")))]
+(define_insn "extend2"
+  [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,r")
+        (sign_extend:SD_HSDI
+          (match_operand:ALLX 1 "nonimmediate_operand" "r,m,w")))]
   ""
   "@
-   sxt\t%0, %w1
-   ldrs\t%0, %1
-   smov\t%0, %1.[0]"
+   sxt\t%0, %w1
+   ldrs\t%0, %1
+   smov\t%0, %1.[0]"
   [(set_attr "type" "extend,load_4,neon_to_gp")
    (set_attr "arch" "*,*,fp")]
 )
 
-(define_insn "*zero_extend2_aarch64"
-  [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r")
-        (zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m,m,w")))]
+(define_insn "zero_extend2"
+  [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,w,r,w")
+        (zero_extend:SD_HSDI
+          (match_operand:SI_ONLY 1 "nonimmediate_operand" "r,m,r,m,w,w")))]
   ""
   "@
-   and\t%0, %1, 
-   ldr\t%w0, %1
-   ldr\t%0, %1
-   umov\t%w0, %1.[0]"
-  [(set_attr "type" "logic_imm,load_4,f_loads,neon_to_gp")
-   (set_attr "arch" "*,*,fp,fp")]
-)
-
-(define_expand "qihi2"
-  [(set (match_operand:HI 0 "register_operand")
-        (ANY_EXTEND:HI (match_operand:QI 1 "nonimmediate_operand")))]
-  ""
+   uxt\t%0, %w1
+   ldr\t%w0, %1
+   fmov\t%0, %w1
+   ldr\t%0, %1
+   fmov\t%w0, %1
+   fmov\t%0, %1"
+  [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov")
+   (set_attr "arch" "*,*,fp,fp,fp,fp")]
 )
 
-(define_insn "*extendqihi2_aarch64"
-  [(set (match_operand:HI 0 "register_operand" "=r,r")
-        (sign_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
+(define_insn "zero_extend2"
+  [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,w,r,w")
+        (zero_extend:SD_HSDI
+          (match_operand:HI_ONLY 1 "nonimmediate_operand" "r,m,r,m,w,w")))]
   ""
   "@
-   sxtb\t%w0, %w1
-   ldrsb\t%w0, %1"
-  [(set_attr "type" "extend,load_4")]
+   uxt\t%0, %w1
+   ldr\t%w0, %1
+   fmov\t%0, %w1
+   ldr\t%0, %1
+   umov\t%w0, %1.[0]
+   fmov\t%0, %1"
+  [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov")
+   (set_attr "arch" "*,*,fp16,fp,fp,fp16")]
 )
 
-(define_insn "*zero_extendqihi2_aarch64"
-  [(set (match_operand:HI 0 "register_operand" "=r,r")
-        (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
+(define_insn "zero_extend2"
+  [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,r,w")
+        (zero_extend:SD_HSDI
+          (match_operand:QI_ONLY 1 "nonimmediate_operand" "r,m,m,w,w")))]
   ""
   "@
-   and\t%w0, %w1, 255
-   ldrb\t%w0, %1"
-  [(set_attr "type" "logic_imm,load_4")]
+   uxt\t%0, %w1
+   ldr\t%w0, %1
+   ldr\t%0, %1
+   umov\t%w0, %1.[0]
+   dup\t%0, %1.[0]"
+  [(set_attr "type" "mov_reg,load_4,f_loads,f_mrc,fmov")
+   (set_attr "arch" "*,*,fp,fp,fp")]
 )
 
 ;; -------------------------------------------------------------------
@@ -5029,15 +5001,15 @@ (define_insn "*and_compare0"
   [(set_attr "type" "alus_imm")]
 )
 
-(define_insn "*ands_compare0"
+(define_insn "*ands_compare0"
   [(set (reg:CC_NZ CC_REGNUM)
         (compare:CC_NZ
-         (zero_extend:GPI (match_operand:SHORT 1 "register_operand" "r"))
+         (zero_extend:SD_HSDI (match_operand:ALLX 1 "register_operand" "r"))
          (const_int 0)))
-   (set (match_operand:GPI 0 "register_operand" "=r")
-        (zero_extend:GPI (match_dup 1)))]
+   (set (match_operand:SD_HSDI 0 "register_operand" "=r")
+        (zero_extend:SD_HSDI (match_dup 1)))]
   ""
-  "ands\\t%0, %1, "
+  "ands\\t%0, %1, "
   [(set_attr "type" "alus_imm")]
 )
 
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 1df09f7fe2eb35aed96113476541e0faa5393551..e904407b2169e589b7007ff966b2d9347a6d0fd2 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -41,6 +41,8 @@ (define_mode_iterator SHORT [QI HI])
 ;; Iterators for single modes, for "@" patterns.
(define_mode_iterator SI_ONLY [SI]) (define_mode_iterator DI_ONLY [DI]) +(define_mode_iterator HI_ONLY [HI]) +(define_mode_iterator QI_ONLY [QI]) ;; Iterator for all integer modes (up to 64-bit) (define_mode_iterator ALLI [QI HI SI DI]) @@ -1033,7 +1035,7 @@ (define_mode_attr w2 [(HF "x") (SF "x") (DF "w")]) ;; For width of fp registers in fcvt instruction (define_mode_attr fpw [(DI "s") (SI "d")]) -(define_mode_attr short_mask [(HI "65535") (QI "255")]) +(define_mode_attr short_mask [(SI "0xffffffff") (HI "0xffff") (QI "0xff")]) ;; For constraints used in scalar immediate vector moves (define_mode_attr hq [(HI "h") (QI "q")]) diff --git a/gcc/testsuite/gcc.target/aarch64/ands_3.c b/gcc/testsuite/gcc.target/aarch64/ands_3.c index 42cb7f0f0bc86a4aceb09851c31eb2e888d93403..421aa5cea7a51ad810cc9c5653a149cb21bb871c 100644 --- a/gcc/testsuite/gcc.target/aarch64/ands_3.c +++ b/gcc/testsuite/gcc.target/aarch64/ands_3.c @@ -9,4 +9,4 @@ f9 (unsigned char x, int y) return x; } -/* { dg-final { scan-assembler "ands\t(x|w)\[0-9\]+,\[ \t\]*(x|w)\[0-9\]+,\[ \t\]*255" } } */ +/* { dg-final { scan-assembler "ands\t(x|w)\[0-9\]+,\[ \t\]*(x|w)\[0-9\]+,\[ \t\]*0xff" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c index 8e35e0b574d49913b43c7d8d4f4ba75f127f42e9..03288976b3397cdbe0e822f94f2a6448d9fa9a52 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c @@ -51,7 +51,6 @@ TEST_ALL (VEC_PERM) /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s} 6 } } */ /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d} 6 } } */ /* { dg-final { scan-assembler-not {\tldr} } } */ -/* { dg-final { scan-assembler-times {\tstr} 2 } } */ -/* { dg-final { scan-assembler-times {\tstr\th[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {\tins\tv[0-9]+\.h\[1\], v[0-9]+\.h\[0\]} 1 } } */ /* { dg-final { scan-assembler-not {\tuqdec} } } */ diff --git 
a/gcc/testsuite/gcc.target/aarch64/tst_5.c b/gcc/testsuite/gcc.target/aarch64/tst_5.c index 0de40a6c47a7d63c1b7a81aeba438a096c0041b8..19034cd74ed07ea4d670c25d9ab3d1cff805a483 100644 --- a/gcc/testsuite/gcc.target/aarch64/tst_5.c +++ b/gcc/testsuite/gcc.target/aarch64/tst_5.c @@ -4,7 +4,7 @@ int f255 (int x) { - if (x & 255) + if (x & 0xff) return 1; return x; } @@ -12,10 +12,10 @@ f255 (int x) int f65535 (int x) { - if (x & 65535) + if (x & 0xffff) return 1; return x; } -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*255" } } */ -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*65535" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xff" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xffff" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/tst_6.c b/gcc/testsuite/gcc.target/aarch64/tst_6.c index f15ec114c391fed79cc43b7740fde83fb3d4ea53..1c047cfae214b60e5bf003e6781a277202fcc588 100644 --- a/gcc/testsuite/gcc.target/aarch64/tst_6.c +++ b/gcc/testsuite/gcc.target/aarch64/tst_6.c @@ -7,4 +7,4 @@ foo (long x) return ((short) x != 0) ? 
x : 1; } -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*65535" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xffff" } } */ --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4237,19 +4237,34 @@ (define_insn "*aarch64_get_lane_extend" [(set_attr "type" "neon_to_gp")] ) -(define_insn "*aarch64_get_lane_zero_extend" +(define_insn "*aarch64_get_lane_extenddi" + [(set (match_operand:DI 0 "register_operand" "=r") + (sign_extend:DI + (vec_select: + (match_operand:VS 1 "register_operand" "w") + (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))] + "TARGET_SIMD" + { + operands[2] = aarch64_endian_lane_rtx (mode, + INTVAL (operands[2])); + return "smov\\t%x0, %1.[%2]"; + } + [(set_attr "type" "neon_to_gp")] +) + +(define_insn "*aarch64_get_lane_zero_extend" [(set (match_operand:GPI 0 "register_operand" "=r") (zero_extend:GPI - (vec_select: - (match_operand:VDQQH 1 "register_operand" "w") + (vec_select: + (match_operand:VDQV_L 1 "register_operand" "w") (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))] "TARGET_SIMD" { - operands[2] = aarch64_endian_lane_rtx (mode, + operands[2] = aarch64_endian_lane_rtx (mode, INTVAL (operands[2])); - return "umov\\t%w0, %1.[%2]"; + return "umov\\t%w0, %1.[%2]"; } - [(set_attr "type" "neon_to_gp")] + [(set_attr "type" "neon_to_gp")] ) ;; Lane extraction of a value, neither sign nor zero extension diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 3ea16dbc2557c6a4f37104d44a49f77f768eb53d..09ae1118371f82ca63146fceb953eb9e820d05a4 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1911,22 +1911,6 @@ (define_insn "storewb_pair_" ;; Sign/Zero extension ;; ------------------------------------------------------------------- -(define_expand "sidi2" - [(set (match_operand:DI 0 "register_operand") - (ANY_EXTEND:DI (match_operand:SI 1 "nonimmediate_operand")))] - "" -) - -(define_insn 
"*extendsidi2_aarch64" - [(set (match_operand:DI 0 "register_operand" "=r,r") - (sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))] - "" - "@ - sxtw\t%0, %w1 - ldrsw\t%0, %1" - [(set_attr "type" "extend,load_4")] -) - (define_insn "*load_pair_extendsidi2_aarch64" [(set (match_operand:DI 0 "register_operand" "=r") (sign_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump"))) @@ -1940,21 +1924,6 @@ (define_insn "*load_pair_extendsidi2_aarch64" [(set_attr "type" "load_8")] ) -(define_insn "*zero_extendsidi2_aarch64" - [(set (match_operand:DI 0 "register_operand" "=r,r,w,w,r,w") - (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m,r,m,w,w")))] - "" - "@ - uxtw\t%0, %w1 - ldr\t%w0, %1 - fmov\t%s0, %w1 - ldr\t%s0, %1 - fmov\t%w0, %s1 - fmov\t%s0, %s1" - [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov") - (set_attr "arch" "*,*,fp,fp,fp,fp")] -) - (define_insn "*load_pair_zero_extendsidi2_aarch64" [(set (match_operand:DI 0 "register_operand" "=r,w") (zero_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump,Ump"))) @@ -1971,61 +1940,64 @@ (define_insn "*load_pair_zero_extendsidi2_aarch64" (set_attr "arch" "*,fp")] ) -(define_expand "2" - [(set (match_operand:GPI 0 "register_operand") - (ANY_EXTEND:GPI (match_operand:SHORT 1 "nonimmediate_operand")))] - "" -) - -(define_insn "*extend2_aarch64" - [(set (match_operand:GPI 0 "register_operand" "=r,r,r") - (sign_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m,w")))] +(define_insn "extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,r") + (sign_extend:SD_HSDI + (match_operand:ALLX 1 "nonimmediate_operand" "r,m,w")))] "" "@ - sxt\t%0, %w1 - ldrs\t%0, %1 - smov\t%0, %1.[0]" + sxt\t%0, %w1 + ldrs\t%0, %1 + smov\t%0, %1.[0]" [(set_attr "type" "extend,load_4,neon_to_gp") (set_attr "arch" "*,*,fp")] ) -(define_insn "*zero_extend2_aarch64" - [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r") - (zero_extend:GPI (match_operand:SHORT 1 
"nonimmediate_operand" "r,m,m,w")))] +(define_insn "zero_extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,w,r,w") + (zero_extend:SD_HSDI + (match_operand:SI_ONLY 1 "nonimmediate_operand" "r,m,r,m,w,w")))] "" "@ - and\t%0, %1, - ldr\t%w0, %1 - ldr\t%0, %1 - umov\t%w0, %1.[0]" - [(set_attr "type" "logic_imm,load_4,f_loads,neon_to_gp") - (set_attr "arch" "*,*,fp,fp")] -) - -(define_expand "qihi2" - [(set (match_operand:HI 0 "register_operand") - (ANY_EXTEND:HI (match_operand:QI 1 "nonimmediate_operand")))] - "" + uxt\t%0, %w1 + ldr\t%w0, %1 + fmov\t%0, %w1 + ldr\t%0, %1 + fmov\t%w0, %1 + fmov\t%0, %1" + [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov") + (set_attr "arch" "*,*,fp,fp,fp,fp")] ) -(define_insn "*extendqihi2_aarch64" - [(set (match_operand:HI 0 "register_operand" "=r,r") - (sign_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))] +(define_insn "zero_extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,w,r,w") + (zero_extend:SD_HSDI + (match_operand:HI_ONLY 1 "nonimmediate_operand" "r,m,r,m,w,w")))] "" "@ - sxtb\t%w0, %w1 - ldrsb\t%w0, %1" - [(set_attr "type" "extend,load_4")] + uxt\t%0, %w1 + ldr\t%w0, %1 + fmov\t%0, %w1 + ldr\t%0, %1 + umov\t%w0, %1.[0] + fmov\t%0, %1" + [(set_attr "type" "mov_reg,load_4,f_mcr,f_loads,f_mrc,fmov") + (set_attr "arch" "*,*,fp16,fp,fp,fp16")] ) -(define_insn "*zero_extendqihi2_aarch64" - [(set (match_operand:HI 0 "register_operand" "=r,r") - (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))] +(define_insn "zero_extend2" + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,r,w") + (zero_extend:SD_HSDI + (match_operand:QI_ONLY 1 "nonimmediate_operand" "r,m,m,w,w")))] "" "@ - and\t%w0, %w1, 255 - ldrb\t%w0, %1" - [(set_attr "type" "logic_imm,load_4")] + uxt\t%0, %w1 + ldr\t%w0, %1 + ldr\t%0, %1 + umov\t%w0, %1.[0] + dup\t%0, %1.[0]" + [(set_attr "type" "mov_reg,load_4,f_loads,f_mrc,fmov") + (set_attr "arch" "*,*,fp,fp,fp")] ) ;; 
------------------------------------------------------------------- @@ -5029,15 +5001,15 @@ (define_insn "*and_compare0" [(set_attr "type" "alus_imm")] ) -(define_insn "*ands_compare0" +(define_insn "*ands_compare0" [(set (reg:CC_NZ CC_REGNUM) (compare:CC_NZ - (zero_extend:GPI (match_operand:SHORT 1 "register_operand" "r")) + (zero_extend:SD_HSDI (match_operand:ALLX 1 "register_operand" "r")) (const_int 0))) - (set (match_operand:GPI 0 "register_operand" "=r") - (zero_extend:GPI (match_dup 1)))] + (set (match_operand:SD_HSDI 0 "register_operand" "=r") + (zero_extend:SD_HSDI (match_dup 1)))] "" - "ands\\t%0, %1, " + "ands\\t%0, %1, " [(set_attr "type" "alus_imm")] ) diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 1df09f7fe2eb35aed96113476541e0faa5393551..e904407b2169e589b7007ff966b2d9347a6d0fd2 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -41,6 +41,8 @@ (define_mode_iterator SHORT [QI HI]) ;; Iterators for single modes, for "@" patterns. 
(define_mode_iterator SI_ONLY [SI]) (define_mode_iterator DI_ONLY [DI]) +(define_mode_iterator HI_ONLY [HI]) +(define_mode_iterator QI_ONLY [QI]) ;; Iterator for all integer modes (up to 64-bit) (define_mode_iterator ALLI [QI HI SI DI]) @@ -1033,7 +1035,7 @@ (define_mode_attr w2 [(HF "x") (SF "x") (DF "w")]) ;; For width of fp registers in fcvt instruction (define_mode_attr fpw [(DI "s") (SI "d")]) -(define_mode_attr short_mask [(HI "65535") (QI "255")]) +(define_mode_attr short_mask [(SI "0xffffffff") (HI "0xffff") (QI "0xff")]) ;; For constraints used in scalar immediate vector moves (define_mode_attr hq [(HI "h") (QI "q")]) diff --git a/gcc/testsuite/gcc.target/aarch64/ands_3.c b/gcc/testsuite/gcc.target/aarch64/ands_3.c index 42cb7f0f0bc86a4aceb09851c31eb2e888d93403..421aa5cea7a51ad810cc9c5653a149cb21bb871c 100644 --- a/gcc/testsuite/gcc.target/aarch64/ands_3.c +++ b/gcc/testsuite/gcc.target/aarch64/ands_3.c @@ -9,4 +9,4 @@ f9 (unsigned char x, int y) return x; } -/* { dg-final { scan-assembler "ands\t(x|w)\[0-9\]+,\[ \t\]*(x|w)\[0-9\]+,\[ \t\]*255" } } */ +/* { dg-final { scan-assembler "ands\t(x|w)\[0-9\]+,\[ \t\]*(x|w)\[0-9\]+,\[ \t\]*0xff" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c index 8e35e0b574d49913b43c7d8d4f4ba75f127f42e9..03288976b3397cdbe0e822f94f2a6448d9fa9a52 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/slp_1.c @@ -51,7 +51,6 @@ TEST_ALL (VEC_PERM) /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.s} 6 } } */ /* { dg-final { scan-assembler-times {\twhilelo\tp[0-7]\.d} 6 } } */ /* { dg-final { scan-assembler-not {\tldr} } } */ -/* { dg-final { scan-assembler-times {\tstr} 2 } } */ -/* { dg-final { scan-assembler-times {\tstr\th[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {\tins\tv[0-9]+\.h\[1\], v[0-9]+\.h\[0\]} 1 } } */ /* { dg-final { scan-assembler-not {\tuqdec} } } */ diff --git 
a/gcc/testsuite/gcc.target/aarch64/tst_5.c b/gcc/testsuite/gcc.target/aarch64/tst_5.c index 0de40a6c47a7d63c1b7a81aeba438a096c0041b8..19034cd74ed07ea4d670c25d9ab3d1cff805a483 100644 --- a/gcc/testsuite/gcc.target/aarch64/tst_5.c +++ b/gcc/testsuite/gcc.target/aarch64/tst_5.c @@ -4,7 +4,7 @@ int f255 (int x) { - if (x & 255) + if (x & 0xff) return 1; return x; } @@ -12,10 +12,10 @@ f255 (int x) int f65535 (int x) { - if (x & 65535) + if (x & 0xffff) return 1; return x; } -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*255" } } */ -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*65535" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xff" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xffff" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/tst_6.c b/gcc/testsuite/gcc.target/aarch64/tst_6.c index f15ec114c391fed79cc43b7740fde83fb3d4ea53..1c047cfae214b60e5bf003e6781a277202fcc588 100644 --- a/gcc/testsuite/gcc.target/aarch64/tst_6.c +++ b/gcc/testsuite/gcc.target/aarch64/tst_6.c @@ -7,4 +7,4 @@ foo (long x) return ((short) x != 0) ? 
x : 1; } -/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*65535" } } */ +/* { dg-final { scan-assembler "tst\t(x|w)\[0-9\]+,\[ \t\]*0xffff" } } */ From patchwork Mon Oct 31 12:00:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tamar Christina X-Patchwork-Id: 59653 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 697CF38187CB for ; Mon, 31 Oct 2022 12:01:22 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 697CF38187CB DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667217682; bh=mojxDEAB+ExO7kL51Mb6c/xWmlMv52XmIPhgGglTp8c=; h=Date:To:Subject:In-Reply-To:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:Cc:From; b=eScIwUbFGqehOvuykI7pP6emRAak3pLjIPm6/MKCrY6fQ7gpyuCQxCxCKSXz5pxeb HTejf/RGbZC71VAtENP40aXvckthcGVaKWJnLqVMAfvJnVonI8/8+I3whmSXpWZNxx HmCDGTTRTrDC1JD5beWFHsnB5jzwZjyUY/7Ja3sE= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from EUR03-AM7-obe.outbound.protection.outlook.com (mail-am7eur03on2063.outbound.protection.outlook.com [40.107.105.63]) by sourceware.org (Postfix) with ESMTPS id 185473865C1B for ; Mon, 31 Oct 2022 12:00:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 185473865C1B ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=nmv00S0GzoSVy8mgKj6frbrTYTcETmNrEa/jP7DK/sQ2ZmqAZPm4WEEkK4+LRLdZfmg5B2KRBHZL9nDnx05PpNJkR7dbO8PdSln0n+NWALDYfbPSxBdurq3VC0jrG6q9V0dl3KlyH2jNn1j+JEiqxG/vC259iMCbQrCla6CdLlMiJ60zLHsgD6hHZuQ9YAZlI2oW/j7FXp/fYu+5D7QxrmnYkgWOF5++dROrwFxYIFXGarGLWJ1THLeJ7druPAEaVKVqQFTosMiTzohBCPGTjvyNXIAhNUzLUxSU11fCWIcdQOWLCzbQylEYr8i1U9AkXACIzGr2REQW2PVQfqw4rQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
Date: Mon, 31 Oct 2022 12:00:22 +0000
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 8/8] AArch64: Have reload not choose to do add on the scalar side if both values exist on the SIMD side.
From: Tamar Christina
Reply-To: Tamar Christina
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com

Hi All,

Currently we often generate an r -> r add even if it means we need two
reloads to perform it, i.e. in the case that the values are on the SIMD side.

The pairwise operations expose these more now and so we get suboptimal codegen.

Normally I would have liked to use ^ or $ here, but while this works for the
simple examples, reload inexplicably falls apart on examples that should have
been trivial. It forces a move to r -> w to use the w ADD, which is counter to
what ^ and $ should do.

However ! seems to fix all the regressions and still maintains the good codegen.

I have tried looking into whether it's our costings that are off, but I can't
see anything logical here.
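For concreteness, here is a minimal sketch (not part of the patch) of the kind of source where both addends already live on the SIMD side. It uses GCC's generic vector extension; the names vec2di, sum_lanes, sum_lanes_plus and self_check are made up for illustration, and which instructions actually get emitted depends on the target, the options and the compiler revision.

```c
#include <assert.h>

/* GCC/Clang generic vector extension: two 64-bit lanes in a 16-byte vector.
   The type and function names are illustrative only, not from the patch.  */
typedef long long vec2di __attribute__ ((vector_size (16)));

/* Both addends are lanes of the same vector, so on AArch64 they start out
   in a SIMD ("w") register rather than in a general ("r") register.  */
long long
sum_lanes (vec2di x)
{
  return x[1] + x[0];
}

/* A lane sum followed by a further add with a lane of another vector:
   every input of the second add also starts on the SIMD side.  */
long long
sum_lanes_plus (vec2di x, vec2di y)
{
  return (x[1] + x[0]) + y[0];
}

/* Portable check of the arithmetic only; it says nothing about which
   instructions the compiler selects.  */
int
self_check (void)
{
  vec2di x = { 3, 4 };
  vec2di y = { 10, 20 };
  assert (sum_lanes (x) == 7);
  assert (sum_lanes_plus (x, y) == 17);
  return 0;
}
```

Compiling the two functions for aarch64 at -O1 and reading the assembly is one way to see whether the add stays on the w side; self_check only validates the results.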
So I'd like to push this change instead, along with tests that augment the
other testcases that guard the r -> r variants.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*add3_aarch64): Add ! to the r -> r
	alternative.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/scalar_addp.c: New test.
	* gcc.target/aarch64/simd/scalar_faddp.c: New test.
	* gcc.target/aarch64/simd/scalar_faddp2.c: New test.
	* gcc.target/aarch64/simd/scalar_fmaxp.c: New test.
	* gcc.target/aarch64/simd/scalar_fminp.c: New test.
	* gcc.target/aarch64/simd/scalar_maxp.c: New test.
	* gcc.target/aarch64/simd/scalar_minp.c: New test.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 09ae1118371f82ca63146fceb953eb9e820d05a4..c333fb1f72725992bb304c560f1245a242d5192d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2043,7 +2043,7 @@ (define_expand "add3"

 (define_insn "*add3_aarch64"
   [(set
-    (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,r,rk")
+    (match_operand:GPI 0 "register_operand" "=rk,!rk,w,rk,r,r,rk")
     (plus:GPI
      (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,0,rk")
      (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uai,Uav")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
new file mode 100644
index 0000000000000000000000000000000000000000..5b8d40f19884fc7b4e7decd80758bc36fa76d058
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
@@ -0,0 +1,70 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef unsigned long long v2udi __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (16)));
+typedef unsigned int v2usi __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	addp	d0, v0.2d
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo (v2di x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo1:
+**	saddlp	v0.1d, v0.2s
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo1 (v2si x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo2:
+**	uaddlp	v0.1d, v0.2s
+**	fmov	x0, d0
+**	ret
+*/
+unsigned long long
+foo2 (v2usi x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo3:
+**	uaddlp	v0.1d, v0.2s
+**	add	d0, d0, d1
+**	fmov	x0, d0
+**	ret
+*/
+unsigned long long
+foo3 (v2usi x, v2udi y)
+{
+  return (x[1] + x[0]) + y[0];
+}
+
+/*
+** foo4:
+**	saddlp	v0.1d, v0.2s
+**	add	d0, d0, d1
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo4 (v2si x, v2di y)
+{
+  return (x[1] + x[0]) + y[0];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff455e060fc833b2f63e89c467b91a76fbe31aff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
@@ -0,0 +1,66 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	faddp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo1:
+**	faddp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] + x[1];
+}
+
+/*
+** foo2:
+**	faddp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] + x[1];
+}
+
+/*
+** foo3:
+**	ext	v0.16b, v0.16b, v0.16b, #4
+**	faddp	s0, v0.2s
+**	ret
+*/
+float
+foo3 (v4sf x)
+{
+  return x[1] + x[2];
+}
+
+/*
+** foo4:
+**	dup	s0, v0.s\[3\]
+**	faddp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo4 (v8hf x)
+{
+  return x[6] + x[7];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
new file mode 100644
index 0000000000000000000000000000000000000000..04412c3b45c51648e46ff20f730b1213e940391a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
@@ -0,0 +1,14 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -w" } */
+
+typedef __m128i __attribute__((__vector_size__(2 * sizeof(long))));
+double a[];
+*b;
+fn1() {
+  __m128i c;
+  *(__m128i *)a = c;
+  *b = a[0] + a[1];
+}
+
+/* { dg-final { scan-assembler-times {faddp\td0, v0\.2d} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
new file mode 100644
index 0000000000000000000000000000000000000000..aa1d2bf17cd707b74d8f7c574506610ab4fd7299
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
@@ -0,0 +1,56 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	fmaxnmp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	fmaxnmp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	fmaxnmp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	fmaxnmp	s0, v0.2s
+**	fcvt	d0, s0
+**	fadd	d0, d0, d1
+**	ret
+*/
+double
+foo3 (v4sf x, v2df y)
+{
+  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
new file mode 100644
index 0000000000000000000000000000000000000000..6136c5272069c4d86f09951cdff25f1494e839f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
@@ -0,0 +1,55 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	fminnmp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	fminnmp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	fminnmp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	fminnmp	s0, v0.2s
+**	fcvt	d0, s0
+**	fadd	d0, d0, d1
+**	ret
+*/
+double
+foo3 (v4sf x, v2df y)
+{
+  return (x[0] < x[1] ?
x[0] : x[1]) + y[0]; +} diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c new file mode 100644 index 0000000000000000000000000000000000000000..e219a13abc745b83dca58633fd2d812e276d6b2d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c @@ -0,0 +1,74 @@ +/* { dg-do assemble } */ +/* { dg-additional-options "-save-temps -O1 -std=c99" } */ +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */ + +typedef long long v2di __attribute__((vector_size (16))); +typedef unsigned long long v2udi __attribute__((vector_size (16))); +typedef int v2si __attribute__((vector_size (16))); +typedef unsigned int v2usi __attribute__((vector_size (16))); + +/* +** foo: +** umov x0, v0.d\[1\] +** fmov x1, d0 +** cmp x0, x1 +** csel x0, x0, x1, ge +** ret +*/ +long long +foo (v2di x) +{ + return x[0] > x[1] ? x[0] : x[1]; +} + +/* +** foo1: +** smaxp v0.2s, v0.2s, v0.2s +** smov x0, v0.s\[0\] +** ret +*/ +long long +foo1 (v2si x) +{ + return x[0] > x[1] ? x[0] : x[1]; +} + +/* +** foo2: +** umaxp v0.2s, v0.2s, v0.2s +** fmov w0, s0 +** ret +*/ +unsigned long long +foo2 (v2usi x) +{ + return x[0] > x[1] ? x[0] : x[1]; +} + +/* +** foo3: +** umaxp v0.2s, v0.2s, v0.2s +** fmov w0, s0 +** fmov x1, d1 +** add x0, x1, w0, uxtw +** ret +*/ +unsigned long long +foo3 (v2usi x, v2udi y) +{ + return (x[0] > x[1] ? x[0] : x[1]) + y[0]; +} + +/* +** foo4: +** smaxp v0.2s, v0.2s, v0.2s +** fmov w0, s0 +** fmov x1, d1 +** add x0, x1, w0, sxtw +** ret +*/ +long long +foo4 (v2si x, v2di y) +{ + return (x[0] > x[1] ? 
x[0] : x[1]) + y[0]; +} diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c new file mode 100644 index 0000000000000000000000000000000000000000..2a32fb4ea3edaa4c547a7a481c3ddca6b477430e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c @@ -0,0 +1,74 @@ +/* { dg-do assemble } */ +/* { dg-additional-options "-save-temps -O1 -std=c99" } */ +/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */ + +typedef long long v2di __attribute__((vector_size (16))); +typedef unsigned long long v2udi __attribute__((vector_size (16))); +typedef int v2si __attribute__((vector_size (16))); +typedef unsigned int v2usi __attribute__((vector_size (16))); + +/* +** foo: +** umov x0, v0.d\[1\] +** fmov x1, d0 +** cmp x0, x1 +** csel x0, x0, x1, le +** ret +*/ +long long +foo (v2di x) +{ + return x[0] < x[1] ? x[0] : x[1]; +} + +/* +** foo1: +** sminp v0.2s, v0.2s, v0.2s +** smov x0, v0.s\[0\] +** ret +*/ +long long +foo1 (v2si x) +{ + return x[0] < x[1] ? x[0] : x[1]; +} + +/* +** foo2: +** uminp v0.2s, v0.2s, v0.2s +** fmov w0, s0 +** ret +*/ +unsigned long long +foo2 (v2usi x) +{ + return x[0] < x[1] ? x[0] : x[1]; +} + +/* +** foo3: +** uminp v0.2s, v0.2s, v0.2s +** fmov w0, s0 +** fmov x1, d1 +** add x0, x1, w0, uxtw +** ret +*/ +unsigned long long +foo3 (v2usi x, v2udi y) +{ + return (x[0] < x[1] ? x[0] : x[1]) + y[0]; +} + +/* +** foo4: +** sminp v0.2s, v0.2s, v0.2s +** fmov w0, s0 +** fmov x1, d1 +** add x0, x1, w0, sxtw +** ret +*/ +long long +foo4 (v2si x, v2di y) +{ + return (x[0] < x[1] ? 
x[0] : x[1]) + y[0]; +}