From patchwork Mon Oct 31 12:00:22 2022
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 59653
Date: Mon, 31 Oct 2022 12:00:22 +0000
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@arm.com, nd@arm.com, richard.sandiford@arm.com, Marcus.Shawcroft@arm.com
Subject: [PATCH 8/8] AArch64: Have reload not choose to do add on the scalar side if both values exist on the SIMD side.

Hi All,

Currently we often generate an r -> r add even if it means we need two
reloads to perform it, i.e. in the case that the values are on the SIMD side.

The pairwise operations expose these more now and so we get suboptimal codegen.

Normally I would have liked to use the ^ or $ constraint modifiers here, but
while this works for the simple examples, reload inexplicably falls apart on
examples that should have been trivial. It forces an r -> w move in order to
use the w ADD, which is counter to what ^ and $ should do.

However ! seems to fix all the regressions and still maintains the good codegen.

I have tried looking into whether it's our costings that are off, but I can't
see anything logical here.
So I'd like to push this change instead, along with tests that augment the
other testcases that guard the r -> r variants.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64.md (*add<mode>3_aarch64): Add ! to the r -> r
	alternative.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/scalar_addp.c: New test.
	* gcc.target/aarch64/simd/scalar_faddp.c: New test.
	* gcc.target/aarch64/simd/scalar_faddp2.c: New test.
	* gcc.target/aarch64/simd/scalar_fmaxp.c: New test.
	* gcc.target/aarch64/simd/scalar_fminp.c: New test.
	* gcc.target/aarch64/simd/scalar_maxp.c: New test.
	* gcc.target/aarch64/simd/scalar_minp.c: New test.

--- inline copy of patch --
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 09ae1118371f82ca63146fceb953eb9e820d05a4..c333fb1f72725992bb304c560f1245a242d5192d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -2043,7 +2043,7 @@ (define_expand "add<mode>3"
 
 (define_insn "*add<mode>3_aarch64"
   [(set
-    (match_operand:GPI 0 "register_operand" "=rk,rk,w,rk,r,r,rk")
+    (match_operand:GPI 0 "register_operand" "=rk,!rk,w,rk,r,r,rk")
     (plus:GPI
      (match_operand:GPI 1 "register_operand" "%rk,rk,w,rk,rk,0,rk")
      (match_operand:GPI 2 "aarch64_pluslong_operand" "I,r,w,J,Uaa,Uai,Uav")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
new file mode 100644
index 0000000000000000000000000000000000000000..5b8d40f19884fc7b4e7decd80758bc36fa76d058
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_addp.c
@@ -0,0 +1,70 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef unsigned long long v2udi __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (16)));
+typedef unsigned int v2usi __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	addp	d0, v0.2d
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo (v2di x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo1:
+**	saddlp	v0.1d, v0.2s
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo1 (v2si x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo2:
+**	uaddlp	v0.1d, v0.2s
+**	fmov	x0, d0
+**	ret
+*/
+unsigned long long
+foo2 (v2usi x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo3:
+**	uaddlp	v0.1d, v0.2s
+**	add	d0, d0, d1
+**	fmov	x0, d0
+**	ret
+*/
+unsigned long long
+foo3 (v2usi x, v2udi y)
+{
+  return (x[1] + x[0]) + y[0];
+}
+
+/*
+** foo4:
+**	saddlp	v0.1d, v0.2s
+**	add	d0, d0, d1
+**	fmov	x0, d0
+**	ret
+*/
+long long
+foo4 (v2si x, v2di y)
+{
+  return (x[1] + x[0]) + y[0];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff455e060fc833b2f63e89c467b91a76fbe31aff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp.c
@@ -0,0 +1,66 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	faddp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[1] + x[0];
+}
+
+/*
+** foo1:
+**	faddp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] + x[1];
+}
+
+/*
+** foo2:
+**	faddp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] + x[1];
+}
+
+/*
+** foo3:
+**	ext	v0.16b, v0.16b, v0.16b, #4
+**	faddp	s0, v0.2s
+**	ret
+*/
+float
+foo3 (v4sf x)
+{
+  return x[1] + x[2];
+}
+
+/*
+** foo4:
+**	dup	s0, v0.s\[3\]
+**	faddp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo4 (v8hf x)
+{
+  return x[6] + x[7];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
new file mode 100644
index 0000000000000000000000000000000000000000..04412c3b45c51648e46ff20f730b1213e940391a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_faddp2.c
@@ -0,0 +1,14 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -w" } */
+
+typedef __m128i __attribute__((__vector_size__(2 * sizeof(long))));
+double a[];
+*b;
+fn1() {
+  __m128i c;
+  *(__m128i *)a = c;
+  *b = a[0] + a[1];
+}
+
+/* { dg-final { scan-assembler-times {faddp\td0, v0\.2d} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
new file mode 100644
index 0000000000000000000000000000000000000000..aa1d2bf17cd707b74d8f7c574506610ab4fd7299
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fmaxp.c
@@ -0,0 +1,56 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	fmaxnmp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	fmaxnmp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	fmaxnmp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	fmaxnmp	s0, v0.2s
+**	fcvt	d0, s0
+**	fadd	d0, d0, d1
+**	ret
+*/
+double
+foo3 (v4sf x, v2df y)
+{
+  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
new file mode 100644
index 0000000000000000000000000000000000000000..6136c5272069c4d86f09951cdff25f1494e839f0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_fminp.c
@@ -0,0 +1,55 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_fp16_scalar_ok } */
+/* { dg-add-options arm_v8_2a_fp16_scalar } */
+/* { dg-additional-options "-save-temps -O1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef double v2df __attribute__((vector_size (16)));
+typedef float v4sf __attribute__((vector_size (16)));
+typedef __fp16 v8hf __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	fminnmp	d0, v0.2d
+**	ret
+*/
+double
+foo (v2df x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	fminnmp	s0, v0.2s
+**	ret
+*/
+float
+foo1 (v4sf x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	fminnmp	h0, v0.2h
+**	ret
+*/
+__fp16
+foo2 (v8hf x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	fminnmp	s0, v0.2s
+**	fcvt	d0, s0
+**	fadd	d0, d0, d1
+**	ret
+*/
+double
+foo3 (v4sf x, v2df y)
+{
+  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c
new file mode 100644
index 0000000000000000000000000000000000000000..e219a13abc745b83dca58633fd2d812e276d6b2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_maxp.c
@@ -0,0 +1,74 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef unsigned long long v2udi __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (16)));
+typedef unsigned int v2usi __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	umov	x0, v0.d\[1\]
+**	fmov	x1, d0
+**	cmp	x0, x1
+**	csel	x0, x0, x1, ge
+**	ret
+*/
+long long
+foo (v2di x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	smaxp	v0.2s, v0.2s, v0.2s
+**	smov	x0, v0.s\[0\]
+**	ret
+*/
+long long
+foo1 (v2si x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	umaxp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	ret
+*/
+unsigned long long
+foo2 (v2usi x)
+{
+  return x[0] > x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	umaxp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, uxtw
+**	ret
+*/
+unsigned long long
+foo3 (v2usi x, v2udi y)
+{
+  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
+}
+
+/*
+** foo4:
+**	smaxp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, sxtw
+**	ret
+*/
+long long
+foo4 (v2si x, v2di y)
+{
+  return (x[0] > x[1] ? x[0] : x[1]) + y[0];
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c
new file mode 100644
index 0000000000000000000000000000000000000000..2a32fb4ea3edaa4c547a7a481c3ddca6b477430e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/scalar_minp.c
@@ -0,0 +1,74 @@
+/* { dg-do assemble } */
+/* { dg-additional-options "-save-temps -O1 -std=c99" } */
+/* { dg-final { check-function-bodies "**" "" "" { target { le } } } } */
+
+typedef long long v2di __attribute__((vector_size (16)));
+typedef unsigned long long v2udi __attribute__((vector_size (16)));
+typedef int v2si __attribute__((vector_size (16)));
+typedef unsigned int v2usi __attribute__((vector_size (16)));
+
+/*
+** foo:
+**	umov	x0, v0.d\[1\]
+**	fmov	x1, d0
+**	cmp	x0, x1
+**	csel	x0, x0, x1, le
+**	ret
+*/
+long long
+foo (v2di x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo1:
+**	sminp	v0.2s, v0.2s, v0.2s
+**	smov	x0, v0.s\[0\]
+**	ret
+*/
+long long
+foo1 (v2si x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo2:
+**	uminp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	ret
+*/
+unsigned long long
+foo2 (v2usi x)
+{
+  return x[0] < x[1] ? x[0] : x[1];
+}
+
+/*
+** foo3:
+**	uminp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, uxtw
+**	ret
+*/
+unsigned long long
+foo3 (v2usi x, v2udi y)
+{
+  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
+}
+
+/*
+** foo4:
+**	sminp	v0.2s, v0.2s, v0.2s
+**	fmov	w0, s0
+**	fmov	x1, d1
+**	add	x0, x1, w0, sxtw
+**	ret
+*/
+long long
+foo4 (v2si x, v2di y)
+{
+  return (x[0] < x[1] ? x[0] : x[1]) + y[0];
+}
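
For reference, here is a minimal standalone sketch of the case this change targets, modelled on foo3 in scalar_addp.c. The function name pairwise_sum_plus is hypothetical and used only for illustration; it is not part of the patch. Both addends of the final add are already on the SIMD side, which is the situation where picking the r -> r alternative would need two reloads.

```c
/* Sketch only: modelled on foo3 in scalar_addp.c.  The function name is
   hypothetical and not part of the patch.  Uses GNU C vector extensions.  */
typedef unsigned long long v2udi __attribute__((vector_size (16)));
typedef unsigned int v2usi __attribute__((vector_size (16)));

/* On AArch64 both vector arguments arrive in SIMD registers, so the
   pairwise sum of x and the first lane of y are both on the w side when
   the final add is selected.  */
unsigned long long
pairwise_sum_plus (v2usi x, v2udi y)
{
  return (x[1] + x[0]) + y[0];
}
```

scalar_addp.c expects the equivalent foo3 to compile at -O1 to uaddlp, then add d0, d0, d1, then a single fmov x0, d0, i.e. the add stays on the SIMD side and only the result is moved to a general register.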