From: Jirui Wu
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, ian@airs.com, Richard Biener
Date: Thu, 16 Sep 2021 13:39:43 +0000
Subject: [Patch][GCC][middle-end] - Lower store and load neon builtins to gimple
List-Id: Gcc-patches mailing list
X-Patchwork-Submitter: Jirui Wu
X-Patchwork-Id: 45085

Hi all,

This patch lowers the vld1 and vst1 variants of the Neon load and store
builtin functions to gimple.

The changes in this patch:

* Replace calls to the vld1 and vst1 variants of the builtins
* Use MEM_REF gimple assignments to generate better code
* Update test cases to prevent over-optimization

Bootstrapped and regression-tested on aarch64-none-linux-gnu with no issues.

Ok for master? If OK, can it be committed for me? I have no commit rights.
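As a rough illustration of why this matters (not part of the patch): once a load or store is an ordinary memory reference rather than an opaque builtin call, the generic optimizers can see through it, which is also why the tests need noipa/noinline and barriers to keep the instructions they scan for. The sketch below is portable C that uses the GCC/Clang vector extension as a stand-in for a Neon type. The names `v4si`, `load4`, `store4` and `sum_doubled` are hypothetical and do not appear in GCC or in arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a 128-bit Neon vector such as int32x4_t
   (GCC/Clang vector extension; an assumption for illustration).  */
typedef int32_t v4si __attribute__ ((vector_size (16)));

/* Hypothetical analogue of vld1q_s32: a plain memory read that the
   optimizers can analyse like any other load.  */
static inline v4si
load4 (const int32_t *p)
{
  v4si v;
  assert (p != 0);
  memcpy (&v, p, sizeof v);
  return v;
}

/* Hypothetical analogue of vst1q_s32: a plain memory write.  */
static inline void
store4 (int32_t *p, v4si v)
{
  assert (p != 0);
  memcpy (p, &v, sizeof v);
}

/* Loads four elements, doubles them, stores them to a temporary and
   sums the temporary.  Because every access is visible as ordinary
   memory traffic, a compiler is free to forward the stored values and
   may drop the temporary altogether.  */
int32_t
sum_doubled (const int32_t *in)
{
  int32_t tmp[4];
  v4si v = load4 (in);
  store4 (tmp, v + v);		/* Element-wise addition.  */
  return tmp[0] + tmp[1] + tmp[2] + tmp[3];
}
```

Calling `sum_doubled` on the array `{1, 2, 3, 4}` returns 20. Whether a given compiler actually removes the temporary depends on the optimization level, so treat the comment about forwarding as a possibility rather than a guarantee.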

Thanks,
Jirui

gcc/ChangeLog:

	* config/aarch64/aarch64-builtins.c (aarch64_general_gimple_fold_builtin):
	lower vld1 and vst1 variants of the neon builtins

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/fmla_intrinsic_1.c: prevent over optimization
	* gcc.target/aarch64/fmls_intrinsic_1.c: prevent over optimization
	* gcc.target/aarch64/fmul_intrinsic_1.c: prevent over optimization
	* gcc.target/aarch64/mla_intrinsic_1.c: prevent over optimization
	* gcc.target/aarch64/mls_intrinsic_1.c: prevent over optimization
	* gcc.target/aarch64/mul_intrinsic_1.c: prevent over optimization
	* gcc.target/aarch64/simd/vmul_elem_1.c: prevent over optimization
	* gcc.target/aarch64/vclz.c: replace macro with function to prevent over optimization
	* gcc.target/aarch64/vneg_s.c: replace macro with function to prevent over optimization

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index eef9fc0f4440d7db359e53a7b4e21e48cf2a65f4..027491414da16b66a7fe922a1b979d97f553b724 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -2382,6 +2382,31 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt)
                                1, args[0]);
         gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt));
         break;
+      /*Lower store and load neon builtins to gimple.  */
+      BUILTIN_VALL_F16 (LOAD1, ld1, 0, LOAD)
+        if (!BYTES_BIG_ENDIAN)
+          {
+            new_stmt = gimple_build_assign (gimple_call_lhs (stmt),
+                                            fold_build2 (MEM_REF,
+                                                         TREE_TYPE
+                                                         (gimple_call_lhs (stmt)),
+                                                         args[0], build_int_cst
+                                                         (TREE_TYPE (args[0]), 0)));
+          }
+        break;
+      BUILTIN_VALL_F16 (STORE1, st1, 0, STORE)
+        if (!BYTES_BIG_ENDIAN)
+          {
+            new_stmt = gimple_build_assign (fold_build2 (MEM_REF,
+                                            TREE_TYPE (gimple_call_arg
+                                                       (stmt, 1)),
+                                            gimple_call_arg (stmt, 0),
+                                            build_int_cst
+                                            (TREE_TYPE (gimple_call_arg
+                                                        (stmt, 0)), 0)),
+                                            gimple_call_arg (stmt, 1));
+          }
+        break;
       BUILTIN_VDQIF (UNOP, reduc_smax_scal_, 10, ALL)
       BUILTIN_VDQ_BHSI (UNOPU, reduc_umax_scal_, 10, ALL)
         new_stmt = gimple_build_call_internal (IFN_REDUC_MAX,
diff --git a/gcc/testsuite/gcc.target/aarch64/fmla_intrinsic_1.c b/gcc/testsuite/gcc.target/aarch64/fmla_intrinsic_1.c
index 59ad41ed0471b17418c395f31fbe666b60ec3623..bef31c45650dcd088b38a755083e6bd9fe530c52 100644
--- a/gcc/testsuite/gcc.target/aarch64/fmla_intrinsic_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/fmla_intrinsic_1.c
@@ -11,6 +11,7 @@ extern void abort (void);

 #define TEST_VMLA(q1, q2, size, in1_lanes, in2_lanes)          \
 static void                                                     \
+__attribute__((noipa,noinline))                                 \
 test_vfma##q1##_lane##q2##_f##size (float##size##_t * res,      \
                                    const float##size##_t *in1,  \
                                    const float##size##_t *in2)  \
@@ -104,12 +105,12 @@ main (int argc, char **argv)
    vfmaq_laneq_f32.  */
 /* { dg-final { scan-assembler-times "fmla\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.s\\\[\[0-9\]+\\\]" 2 } } */

-/* vfma_lane_f64.  */
-/* { dg-final { scan-assembler-times "fmadd\\td\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+" 1 } } */
+/* vfma_lane_f64.
+   vfma_laneq_f64. */
+/* { dg-final { scan-assembler-times "fmadd\\td\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+" 2 } } */

 /* vfmaq_lane_f64.
-   vfma_laneq_f64.
    vfmaq_laneq_f64.  */
-/* { dg-final { scan-assembler-times "fmla\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 3 } } */
+/* { dg-final { scan-assembler-times "fmla\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/fmls_intrinsic_1.c b/gcc/testsuite/gcc.target/aarch64/fmls_intrinsic_1.c
index 2d5a3d305360a08a9663cfd497cb1a5374b4b327..865def28c3f4d04042ab495d232bb865cabb2b50 100644
--- a/gcc/testsuite/gcc.target/aarch64/fmls_intrinsic_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/fmls_intrinsic_1.c
@@ -11,6 +11,7 @@ extern void abort (void);

 #define TEST_VMLS(q1, q2, size, in1_lanes, in2_lanes)          \
 static void                                                     \
+__attribute__((noipa,noinline))                                 \
 test_vfms##q1##_lane##q2##_f##size (float##size##_t * res,      \
                                    const float##size##_t *in1,  \
                                    const float##size##_t *in2)  \
@@ -105,12 +106,12 @@ main (int argc, char **argv)
    vfmsq_laneq_f32.  */
 /* { dg-final { scan-assembler-times "fmls\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s, v\[0-9\]+\.s\\\[\[0-9\]+\\\]" 2 } } */

-/* vfms_lane_f64.  */
-/* { dg-final { scan-assembler-times "fmsub\\td\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+" 1 } } */
+/* vfms_lane_f64.
+   vfms_laneq_f64. */
+/* { dg-final { scan-assembler-times "fmsub\\td\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+\, d\[0-9\]+" 2 } } */

 /* vfmsq_lane_f64.
-   vfms_laneq_f64.
    vfmsq_laneq_f64.  */
-/* { dg-final { scan-assembler-times "fmls\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 3 } } */
+/* { dg-final { scan-assembler-times "fmls\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d, v\[0-9\]+\.d\\\[\[0-9\]+\\\]" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/fmul_intrinsic_1.c b/gcc/testsuite/gcc.target/aarch64/fmul_intrinsic_1.c
index 8b0880d89b13596dea7db79c14cb7d124cf7079c..63dc56c70a2572c6a8789c5a75713a7952ab9746 100644
--- a/gcc/testsuite/gcc.target/aarch64/fmul_intrinsic_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/fmul_intrinsic_1.c
@@ -9,6 +9,7 @@ extern double fabs (double);

 #define TEST_VMUL(q1, q2, size, in1_lanes, in2_lanes)          \
 static void                                                     \
+__attribute__((noipa,noinline))                                 \
 test_vmul##q1##_lane##q2##_f##size (float##size##_t * res,      \
                                    const float##size##_t *in1,  \
                                    const float##size##_t *in2)  \
diff --git a/gcc/testsuite/gcc.target/aarch64/mla_intrinsic_1.c b/gcc/testsuite/gcc.target/aarch64/mla_intrinsic_1.c
index 46b3c78c131ea92eae208d399ef25c71cd8446f7..885bfb39b797e6d095aaecafa0271094c34fbea5 100644
--- a/gcc/testsuite/gcc.target/aarch64/mla_intrinsic_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/mla_intrinsic_1.c
@@ -11,6 +11,7 @@ extern void abort (void);

 #define TEST_VMLA(q, su, size, in1_lanes, in2_lanes)           \
 static void                                                     \
+__attribute__((noipa,noinline))                                 \
 test_vmlaq_lane##q##_##su##size (MAP##su (size, ) * res,        \
                                 const MAP##su(size, ) *in1,     \
                                 const MAP##su(size, ) *in2)     \
diff --git a/gcc/testsuite/gcc.target/aarch64/mls_intrinsic_1.c b/gcc/testsuite/gcc.target/aarch64/mls_intrinsic_1.c
index e01a4f6d0e1e83cac042a2cad4f02664b87e8c05..df046ce32c032bce70559a842d52001264ecbcbc 100644
--- a/gcc/testsuite/gcc.target/aarch64/mls_intrinsic_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/mls_intrinsic_1.c
@@ -11,6 +11,7 @@ extern void abort (void);

 #define TEST_VMLS(q, su, size, in1_lanes, in2_lanes)           \
 static void                                                     \
+__attribute__((noipa,noinline))                                 \
 test_vmlsq_lane##q##_##su##size (MAP##su (size, ) * res,        \
                                 const MAP##su(size, ) *in1,     \
                                 const MAP##su(size, ) *in2)     \
diff --git a/gcc/testsuite/gcc.target/aarch64/mul_intrinsic_1.c b/gcc/testsuite/gcc.target/aarch64/mul_intrinsic_1.c
index 00ef4f2de6c5510638b7e31990c0754f60d3e4d0..517b937f3e1b612d5a9c3c2f68a529a631d848e0 100644
--- a/gcc/testsuite/gcc.target/aarch64/mul_intrinsic_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/mul_intrinsic_1.c
@@ -11,6 +11,7 @@ extern void abort (void);

 #define TEST_VMUL(q, su, size, in1_lanes, in2_lanes)           \
 static void                                                     \
+__attribute__((noipa,noinline))                                 \
 test_vmulq_lane##q##_##su##size (MAP##su (size, ) * res,        \
                                 const MAP##su(size, ) *in1,     \
                                 const MAP##su(size, ) *in2)     \
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c b/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c
index a1faefd88bacabadf45bf5a22ca5481db13c41cb..ffa391aeae1fa0b52ef4ad7ae040a8bc40e160d2 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vmul_elem_1.c
@@ -146,12 +146,14 @@ check_v2sf (float32_t elemA, float32_t elemB)

   vst1_f32 (vec32x2_res,
             vmul_n_f32 (vec32x2_src, elemA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (* (uint32_t *) &vec32x2_res[indx] != * (uint32_t *) &expected2_1[indx])
       abort ();

   vst1_f32 (vec32x2_res,
             vmul_n_f32 (vec32x2_src, elemB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (* (uint32_t *) &vec32x2_res[indx] != * (uint32_t *) &expected2_2[indx])
       abort ();
@@ -169,24 +171,28 @@ check_v4sf (float32_t elemA, float32_t elemB, float32_t elemC, float32_t elemD)

   vst1q_f32 (vec32x4_res,
              vmulq_n_f32 (vec32x4_src, elemA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (* (uint32_t *) &vec32x4_res[indx] != * (uint32_t *) &expected4_1[indx])
       abort ();

   vst1q_f32 (vec32x4_res,
              vmulq_n_f32 (vec32x4_src, elemB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (* (uint32_t *) &vec32x4_res[indx] != * (uint32_t *) &expected4_2[indx])
       abort ();

   vst1q_f32 (vec32x4_res,
              vmulq_n_f32 (vec32x4_src, elemC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (* (uint32_t *) &vec32x4_res[indx] != * (uint32_t *) &expected4_3[indx])
       abort ();

   vst1q_f32 (vec32x4_res,
              vmulq_n_f32 (vec32x4_src, elemD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (* (uint32_t *) &vec32x4_res[indx] != * (uint32_t *) &expected4_4[indx])
       abort ();
@@ -204,12 +210,14 @@ check_v2df (float64_t elemdC, float64_t elemdD)

   vst1q_f64 (vec64x2_res,
              vmulq_n_f64 (vec64x2_src, elemdC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (* (uint64_t *) &vec64x2_res[indx] != * (uint64_t *) &expectedd2_1[indx])
       abort ();

   vst1q_f64 (vec64x2_res,
              vmulq_n_f64 (vec64x2_src, elemdD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (* (uint64_t *) &vec64x2_res[indx] != * (uint64_t *) &expectedd2_2[indx])
       abort ();
@@ -227,12 +235,14 @@ check_v2si (int32_t elemsA, int32_t elemsB)

   vst1_s32 (vecs32x2_res,
             vmul_n_s32 (vecs32x2_src, elemsA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (vecs32x2_res[indx] != expecteds2_1[indx])
       abort ();

   vst1_s32 (vecs32x2_res,
             vmul_n_s32 (vecs32x2_src, elemsB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (vecs32x2_res[indx] != expecteds2_2[indx])
       abort ();
@@ -248,12 +258,14 @@ check_v2si_unsigned (uint32_t elemusA, uint32_t elemusB)

   vst1_u32 (vecus32x2_res,
             vmul_n_u32 (vecus32x2_src, elemusA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (vecus32x2_res[indx] != expectedus2_1[indx])
       abort ();

   vst1_u32 (vecus32x2_res,
             vmul_n_u32 (vecus32x2_src, elemusB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 2; indx++)
     if (vecus32x2_res[indx] != expectedus2_2[indx])
       abort ();
@@ -271,24 +283,28 @@ check_v4si (int32_t elemsA, int32_t elemsB, int32_t elemsC, int32_t elemsD)

   vst1q_s32 (vecs32x4_res,
              vmulq_n_s32 (vecs32x4_src, elemsA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecs32x4_res[indx] != expecteds4_1[indx])
       abort ();

   vst1q_s32 (vecs32x4_res,
              vmulq_n_s32 (vecs32x4_src, elemsB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecs32x4_res[indx] != expecteds4_2[indx])
       abort ();

   vst1q_s32 (vecs32x4_res,
              vmulq_n_s32 (vecs32x4_src, elemsC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecs32x4_res[indx] != expecteds4_3[indx])
       abort ();

   vst1q_s32 (vecs32x4_res,
              vmulq_n_s32 (vecs32x4_src, elemsD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecs32x4_res[indx] != expecteds4_4[indx])
       abort ();
@@ -305,24 +321,28 @@ check_v4si_unsigned (uint32_t elemusA, uint32_t elemusB, uint32_t elemusC,

   vst1q_u32 (vecus32x4_res,
              vmulq_n_u32 (vecus32x4_src, elemusA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecus32x4_res[indx] != expectedus4_1[indx])
       abort ();

   vst1q_u32 (vecus32x4_res,
              vmulq_n_u32 (vecus32x4_src, elemusB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecus32x4_res[indx] != expectedus4_2[indx])
       abort ();

   vst1q_u32 (vecus32x4_res,
              vmulq_n_u32 (vecus32x4_src, elemusC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecus32x4_res[indx] != expectedus4_3[indx])
       abort ();

   vst1q_u32 (vecus32x4_res,
              vmulq_n_u32 (vecus32x4_src, elemusD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecus32x4_res[indx] != expectedus4_4[indx])
       abort ();
@@ -341,24 +361,28 @@ check_v4hi (int16_t elemhA, int16_t elemhB, int16_t elemhC, int16_t elemhD)

   vst1_s16 (vech16x4_res,
             vmul_n_s16 (vech16x4_src, elemhA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vech16x4_res[indx] != expectedh4_1[indx])
       abort ();

   vst1_s16 (vech16x4_res,
             vmul_n_s16 (vech16x4_src, elemhB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vech16x4_res[indx] != expectedh4_2[indx])
       abort ();

   vst1_s16 (vech16x4_res,
             vmul_n_s16 (vech16x4_src, elemhC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vech16x4_res[indx] != expectedh4_3[indx])
       abort ();

   vst1_s16 (vech16x4_res,
             vmul_n_s16 (vech16x4_src, elemhD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vech16x4_res[indx] != expectedh4_4[indx])
       abort ();
@@ -375,24 +399,28 @@ check_v4hi_unsigned (uint16_t elemuhA, uint16_t elemuhB, uint16_t elemuhC,

   vst1_u16 (vecuh16x4_res,
             vmul_n_u16 (vecuh16x4_src, elemuhA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecuh16x4_res[indx] != expecteduh4_1[indx])
       abort ();

   vst1_u16 (vecuh16x4_res,
             vmul_n_u16 (vecuh16x4_src, elemuhB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecuh16x4_res[indx] != expecteduh4_2[indx])
       abort ();

   vst1_u16 (vecuh16x4_res,
             vmul_n_u16 (vecuh16x4_src, elemuhC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecuh16x4_res[indx] != expecteduh4_3[indx])
       abort ();

   vst1_u16 (vecuh16x4_res,
             vmul_n_u16 (vecuh16x4_src, elemuhD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 4; indx++)
     if (vecuh16x4_res[indx] != expecteduh4_4[indx])
       abort ();
@@ -411,48 +439,56 @@ check_v8hi (int16_t elemhA, int16_t elemhB, int16_t elemhC, int16_t elemhD,

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_1[indx])
       abort ();

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_2[indx])
       abort ();

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_3[indx])
       abort ();

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_4[indx])
       abort ();

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhE));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_5[indx])
       abort ();

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhF));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_6[indx])
       abort ();

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhG));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_7[indx])
       abort ();

   vst1q_s16 (vech16x8_res,
              vmulq_n_s16 (vech16x8_src, elemhH));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vech16x8_res[indx] != expectedh8_8[indx])
       abort ();
@@ -470,48 +506,56 @@ check_v8hi_unsigned (uint16_t elemuhA, uint16_t elemuhB, uint16_t elemuhC,

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhA));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_1[indx])
       abort ();

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhB));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_2[indx])
       abort ();

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhC));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_3[indx])
       abort ();

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhD));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_4[indx])
       abort ();

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhE));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_5[indx])
       abort ();

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhF));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_6[indx])
       abort ();

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhG));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_7[indx])
       abort ();

   vst1q_u16 (vecuh16x8_res,
              vmulq_n_u16 (vecuh16x8_src, elemuhH));
+  asm volatile ("" : : : "memory");
   for (indx = 0; indx < 8; indx++)
     if (vecuh16x8_res[indx] != expecteduh8_8[indx])
       abort ();
diff --git a/gcc/testsuite/gcc.target/aarch64/vclz.c b/gcc/testsuite/gcc.target/aarch64/vclz.c
index a36ee44fc1658886f04dff19b946b933f9668008..ca4d17426e645c0f8bbe3a4cdd962848b4e1cbed 100644
--- a/gcc/testsuite/gcc.target/aarch64/vclz.c
+++ b/gcc/testsuite/gcc.target/aarch64/vclz.c
@@ -66,22 +66,62 @@ extern void abort (void);
 #define CLZ_INST(reg_len, data_len, is_signed) \
   CONCAT1 (vclz, POSTFIX (reg_len, data_len, is_signed))

-#define RUN_TEST(test_set, answ_set, reg_len, data_len, is_signed, n)  \
-  INHIB_OPTIMIZATION;                                                   \
-  a = LOAD_INST (reg_len, data_len, is_signed) (test_set);              \
-  b = LOAD_INST (reg_len, data_len, is_signed) (answ_set);              \
-  a = CLZ_INST (reg_len, data_len, is_signed) (a);                      \
-  for (i = 0; i < n; i++)                                               \
-    if (a [i] != b [i])                                                 \
-      return 1;
+#define BUILD_TEST(type, size, lanes)                           \
+int __attribute__((noipa,noinline))                             \
+run_test##type##size##x##lanes (int##size##_t* test_set,        \
+                                int##size##_t* answ_set,        \
+                                int reg_len, int data_len,      \
+                                int n)                          \
+{                                                               \
+  int i;                                                        \
+  INHIB_OPTIMIZATION;                                           \
+  int##size##x##lanes##_t a = vld1##type##size (test_set);      \
+  int##size##x##lanes##_t b = vld1##type##size (answ_set);      \
+  a = vclz##type##size (a);                                     \
+  for (i = 0; i < n; i++){                                      \
+    if (a [i] != b [i])                                         \
+      return 1;                                                 \
+  }                                                             \
+  return 0;                                                     \
+}
+
+/* unsigned inputs  */
+#define U_BUILD_TEST(type, size, lanes)                         \
+int __attribute__((noipa,noinline))                             \
+run_test##type##size##x##lanes (uint##size##_t* test_set,       \
+                                uint##size##_t* answ_set,       \
+                                int reg_len, int data_len,      \
+                                int n)                          \
+{                                                               \
+  int i;                                                        \
+  INHIB_OPTIMIZATION;                                           \
+  uint##size##x##lanes##_t a = vld1##type##size (test_set);     \
+  uint##size##x##lanes##_t b = vld1##type##size (answ_set);     \
+  a = vclz##type##size (a);                                     \
+  for (i = 0; i < n; i++){                                      \
+    if (a [i] != b [i])                                         \
+      return 1;                                                 \
+  }                                                             \
+  return 0;                                                     \
+}
+
+BUILD_TEST (_s, 8, 8)
+BUILD_TEST (_s, 16, 4)
+BUILD_TEST (_s, 32, 2)
+BUILD_TEST (q_s, 8, 16)
+BUILD_TEST (q_s, 16, 8)
+BUILD_TEST (q_s, 32, 4)
+
+U_BUILD_TEST (_u, 8, 8)
+U_BUILD_TEST (_u, 16, 4)
+U_BUILD_TEST (_u, 32, 2)
+U_BUILD_TEST (q_u, 8, 16)
+U_BUILD_TEST (q_u, 16, 8)
+U_BUILD_TEST (q_u, 32, 4)

 int __attribute__ ((noinline))
 test_vclz_s8 ()
 {
-  int i;
-  int8x8_t a;
-  int8x8_t b;
-
   int8_t test_set0[8] = {
     TEST0, TEST1, TEST2, TEST3,
     TEST4, TEST5, TEST6, TEST7
@@ -98,22 +138,18 @@ test_vclz_s8 ()
     0, 0, 0, 0, 0, 0, 0, 0
   };

-  RUN_TEST (test_set0, answ_set0, 64, 8, 1, 8);
-  RUN_TEST (test_set1, answ_set1, 64, 8, 1, 1);
+  int o1 = run_test_s8x8 (test_set0, answ_set0, 64, 8, 8);
+  int o2 = run_test_s8x8 (test_set1, answ_set1, 64, 8, 1);

-  return 0;
+  return o1||o2;
 }

 /* Double scan-assembler-times to take account of unsigned functions.  */
-/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.8b, v\[0-9\]+\.8b" 4 } } */
+/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.8b, v\[0-9\]+\.8b" 2 } } */

 int __attribute__ ((noinline))
 test_vclz_s16 ()
 {
-  int i;
-  int16x4_t a;
-  int16x4_t b;
-
   int16_t test_set0[4] = { TEST0, TEST1, TEST2, TEST3 };
   int16_t test_set1[4] = { TEST4, TEST5, TEST6, TEST7 };
   int16_t test_set2[4] = { TEST8, TEST9, TEST10, TEST11 };
@@ -126,25 +162,21 @@ test_vclz_s16 ()
   int16_t answ_set3[4] = { 4, 3, 2, 1 };
   int16_t answ_set4[4] = { 0, 0, 0, 0 };

-  RUN_TEST (test_set0, answ_set0, 64, 16, 1, 4);
-  RUN_TEST (test_set1, answ_set1, 64, 16, 1, 4);
-  RUN_TEST (test_set2, answ_set2, 64, 16, 1, 4);
-  RUN_TEST (test_set3, answ_set3, 64, 16, 1, 4);
-  RUN_TEST (test_set4, answ_set4, 64, 16, 1, 1);
+  int o1 = run_test_s16x4 (test_set0, answ_set0, 64, 16, 4);
+  int o2 = run_test_s16x4 (test_set1, answ_set1, 64, 16, 4);
+  int o3 = run_test_s16x4 (test_set2, answ_set2, 64, 16, 4);
+  int o4 = run_test_s16x4 (test_set3, answ_set3, 64, 16, 4);
+  int o5 = run_test_s16x4 (test_set4, answ_set4, 64, 16, 1);

-  return 0;
+  return o1||o2||o3||o4||o5;
 }

 /* Double scan-assembler-times to take account of unsigned functions.  */
-/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.4h, v\[0-9\]+\.4h" 10} } */
+/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.4h, v\[0-9\]+\.4h" 2} } */

 int __attribute__ ((noinline))
 test_vclz_s32 ()
 {
-  int i;
-  int32x2_t a;
-  int32x2_t b;
-
   int32_t test_set0[2] = { TEST0, TEST1 };
   int32_t test_set1[2] = { TEST2, TEST3 };
   int32_t test_set2[2] = { TEST4, TEST5 };
@@ -181,37 +213,34 @@ test_vclz_s32 ()
   int32_t answ_set15[2] = { 2, 1 };
   int32_t answ_set16[2] = { 0, 0 };

-  RUN_TEST (test_set0, answ_set0, 64, 32, 1, 2);
-  RUN_TEST (test_set1, answ_set1, 64, 32, 1, 2);
-  RUN_TEST (test_set2, answ_set2, 64, 32, 1, 2);
-  RUN_TEST (test_set3, answ_set3, 64, 32, 1, 2);
-  RUN_TEST (test_set4, answ_set4, 64, 32, 1, 2);
-  RUN_TEST (test_set5, answ_set5, 64, 32, 1, 2);
-  RUN_TEST (test_set6, answ_set6, 64, 32, 1, 2);
-  RUN_TEST (test_set7, answ_set7, 64, 32, 1, 2);
-  RUN_TEST (test_set8, answ_set8, 64, 32, 1, 2);
-  RUN_TEST (test_set9, answ_set9, 64, 32, 1, 2);
-  RUN_TEST (test_set10, answ_set10, 64, 32, 1, 2);
-  RUN_TEST (test_set11, answ_set11, 64, 32, 1, 2);
-  RUN_TEST (test_set12, answ_set12, 64, 32, 1, 2);
-  RUN_TEST (test_set13, answ_set13, 64, 32, 1, 2);
-  RUN_TEST (test_set14, answ_set14, 64, 32, 1, 2);
-  RUN_TEST (test_set15, answ_set15, 64, 32, 1, 2);
-  RUN_TEST (test_set16, answ_set16, 64, 32, 1, 1);
-
-  return 0;
+  int o1 = run_test_s32x2 (test_set0, answ_set0, 64, 32, 2);
+  int o2 = run_test_s32x2 (test_set1, answ_set1, 64, 32, 2);
+  int o3 = run_test_s32x2 (test_set2, answ_set2, 64, 32, 2);
+  int o4 = run_test_s32x2 (test_set3, answ_set3, 64, 32, 2);
+  int o5 = run_test_s32x2 (test_set4, answ_set4, 64, 32, 2);
+  int o6 = run_test_s32x2 (test_set5, answ_set5, 64, 32, 2);
+  int o7 = run_test_s32x2 (test_set6, answ_set6, 64, 32, 2);
+  int o8 = run_test_s32x2 (test_set7, answ_set7, 64, 32, 2);
+  int o9 = run_test_s32x2 (test_set8, answ_set8, 64, 32, 2);
+  int o10 = run_test_s32x2 (test_set9, answ_set9, 64, 32, 2);
+  int o11 = run_test_s32x2 (test_set10, answ_set10, 64, 32, 2);
+  int o12 = run_test_s32x2 (test_set11, answ_set11, 64, 32, 2);
+  int o13 = run_test_s32x2 (test_set12, answ_set12, 64, 32, 2);
+  int o14 = run_test_s32x2 (test_set13, answ_set13, 64, 32, 2);
+  int o15 = run_test_s32x2 (test_set14, answ_set14, 64, 32, 2);
+  int o16 = run_test_s32x2 (test_set15, answ_set15, 64, 32, 2);
+  int o17 = run_test_s32x2 (test_set16, answ_set16, 64, 32, 1);
+
+  return o1||o2||o3||o4||o5||o6||o7||o8||o9||o10||o11||o12||o13||o14
+         ||o15||o16||o17;
 }

 /* Double scan-assembler-times to take account of unsigned functions.  */
-/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.2s, v\[0-9\]+\.2s" 34 } } */
+/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.2s, v\[0-9\]+\.2s" 2 } } */

 int __attribute__ ((noinline))
 test_vclzq_s8 ()
 {
-  int i;
-  int8x16_t a;
-  int8x16_t b;
-
   int8_t test_set0[16] = {
     TEST0, TEST1, TEST2, TEST3, TEST4, TEST5, TEST6, TEST7,
     TEST8, TEST8, TEST8, TEST8, TEST8, TEST8, TEST8, TEST8
@@ -219,8 +248,8 @@ test_vclzq_s8 ()
   int8_t answ_set0[16] = {
     8, 7, 6, 5, 4, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0
   };
-  RUN_TEST (test_set0, answ_set0, 128, 8, 1, 9);
-  return 0;
+  int o1 = run_testq_s8x16 (test_set0, answ_set0, 128, 8, 9);
+  return o1;
 }

 /* Double scan-assembler-times to take account of unsigned functions.  */
@@ -229,10 +258,6 @@ test_vclzq_s8 ()
 int __attribute__ ((noinline))
 test_vclzq_s16 ()
 {
-  int i;
-  int16x8_t a;
-  int16x8_t b;
-
   int16_t test_set0[8] = {
     TEST0, TEST1, TEST2, TEST3, TEST4, TEST5, TEST6, TEST7
   };
@@ -252,23 +277,19 @@ test_vclzq_s16 ()
   int16_t answ_set2[8] = {
     0, 0, 0, 0, 0, 0, 0, 0
   };
-  RUN_TEST (test_set0, answ_set0, 128, 16, 1, 8);
-  RUN_TEST (test_set1, answ_set1, 128, 16, 1, 8);
-  RUN_TEST (test_set2, answ_set2, 128, 16, 1, 1);
+  int o1 = run_testq_s16x8 (test_set0, answ_set0, 128, 16, 8);
+  int o2 = run_testq_s16x8 (test_set1, answ_set1, 128, 16, 8);
+  int o3 = run_testq_s16x8 (test_set2, answ_set2, 128, 16, 1);

-  return 0;
+  return o1||o2||o3;
 }

 /* Double scan-assembler-times to take account of unsigned functions.  */
-/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h" 6 } } */
+/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h" 2 } } */

 int __attribute__ ((noinline))
 test_vclzq_s32 ()
 {
-  int i;
-  int32x4_t a;
-  int32x4_t b;
-
   int32_t test_set0[4] = { TEST0, TEST1, TEST2, TEST3 };
   int32_t test_set1[4] = { TEST4, TEST5, TEST6, TEST7 };
   int32_t test_set2[4] = { TEST8, TEST9, TEST10, TEST11 };
@@ -289,27 +310,23 @@ test_vclzq_s32 ()
   int32_t answ_set7[4] = { 4, 3, 2, 1 };
   int32_t answ_set8[4] = { 0, 0, 0, 0 };

-  RUN_TEST (test_set0, answ_set0, 128, 32, 1, 4);
-  RUN_TEST (test_set1, answ_set1, 128, 32, 1, 4);
-  RUN_TEST (test_set2, answ_set2, 128, 32, 1, 4);
-  RUN_TEST (test_set3, answ_set3, 128, 32, 1, 4);
-  RUN_TEST (test_set4, answ_set4, 128, 32, 1, 1);
+  int o1 = run_testq_s32x4 (test_set0, answ_set0, 128, 32, 4);
+  int o2 = run_testq_s32x4 (test_set1, answ_set1, 128, 32, 4);
+  int o3 = run_testq_s32x4 (test_set2, answ_set2, 128, 32, 4);
+  int o4 = run_testq_s32x4 (test_set3, answ_set3, 128, 32, 4);
+  int o5 = run_testq_s32x4 (test_set4, answ_set4, 128, 32, 1);

-  return 0;
+  return o1||o2||o3||o4||o5;
 }

 /* Double scan-assembler-times to take account of unsigned functions.
*/ -/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s" 10 } } */ +/* { dg-final { scan-assembler-times "clz\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s" 2 } } */ /* Unsigned versions. */ int __attribute__ ((noinline)) test_vclz_u8 () { - int i; - uint8x8_t a; - uint8x8_t b; - uint8_t test_set0[8] = { TEST0, TEST1, TEST2, TEST3, TEST4, TEST5, TEST6, TEST7 }; @@ -323,10 +340,10 @@ test_vclz_u8 () 0, 0, 0, 0, 0, 0, 0, 0 }; - RUN_TEST (test_set0, answ_set0, 64, 8, 0, 8); - RUN_TEST (test_set1, answ_set1, 64, 8, 0, 1); + int o1 = run_test_u8x8 (test_set0, answ_set0, 64, 8, 8); + int o2 = run_test_u8x8 (test_set1, answ_set1, 64, 8, 1); - return 0; + return o1||o2; } /* ASM scan near test for signed version. */ @@ -334,10 +351,6 @@ test_vclz_u8 () int __attribute__ ((noinline)) test_vclz_u16 () { - int i; - uint16x4_t a; - uint16x4_t b; - uint16_t test_set0[4] = { TEST0, TEST1, TEST2, TEST3 }; uint16_t test_set1[4] = { TEST4, TEST5, TEST6, TEST7 }; uint16_t test_set2[4] = { TEST8, TEST9, TEST10, TEST11 }; @@ -350,13 +363,13 @@ test_vclz_u16 () uint16_t answ_set3[4] = { 4, 3, 2, 1 }; uint16_t answ_set4[4] = { 0, 0, 0, 0 }; - RUN_TEST (test_set0, answ_set0, 64, 16, 0, 4); - RUN_TEST (test_set1, answ_set1, 64, 16, 0, 4); - RUN_TEST (test_set2, answ_set2, 64, 16, 0, 4); - RUN_TEST (test_set3, answ_set3, 64, 16, 0, 4); - RUN_TEST (test_set4, answ_set4, 64, 16, 0, 1); + int o1 = run_test_u16x4 (test_set0, answ_set0, 64, 16, 4); + int o2 = run_test_u16x4 (test_set1, answ_set1, 64, 16, 4); + int o3 = run_test_u16x4 (test_set2, answ_set2, 64, 16, 4); + int o4 = run_test_u16x4 (test_set3, answ_set3, 64, 16, 4); + int o5 = run_test_u16x4 (test_set4, answ_set4, 64, 16, 1); - return 0; + return o1||o2||o3||o4||o5; } /* ASM scan near test for signed version. 
*/ @@ -364,10 +377,6 @@ test_vclz_u16 () int __attribute__ ((noinline)) test_vclz_u32 () { - int i; - uint32x2_t a; - uint32x2_t b; - uint32_t test_set0[2] = { TEST0, TEST1 }; uint32_t test_set1[2] = { TEST2, TEST3 }; uint32_t test_set2[2] = { TEST4, TEST5 }; @@ -404,25 +413,26 @@ test_vclz_u32 () uint32_t answ_set15[2] = { 2, 1 }; uint32_t answ_set16[2] = { 0, 0 }; - RUN_TEST (test_set0, answ_set0, 64, 32, 0, 2); - RUN_TEST (test_set1, answ_set1, 64, 32, 0, 2); - RUN_TEST (test_set2, answ_set2, 64, 32, 0, 2); - RUN_TEST (test_set3, answ_set3, 64, 32, 0, 2); - RUN_TEST (test_set4, answ_set4, 64, 32, 0, 2); - RUN_TEST (test_set5, answ_set5, 64, 32, 0, 2); - RUN_TEST (test_set6, answ_set6, 64, 32, 0, 2); - RUN_TEST (test_set7, answ_set7, 64, 32, 0, 2); - RUN_TEST (test_set8, answ_set8, 64, 32, 0, 2); - RUN_TEST (test_set9, answ_set9, 64, 32, 0, 2); - RUN_TEST (test_set10, answ_set10, 64, 32, 0, 2); - RUN_TEST (test_set11, answ_set11, 64, 32, 0, 2); - RUN_TEST (test_set12, answ_set12, 64, 32, 0, 2); - RUN_TEST (test_set13, answ_set13, 64, 32, 0, 2); - RUN_TEST (test_set14, answ_set14, 64, 32, 0, 2); - RUN_TEST (test_set15, answ_set15, 64, 32, 0, 2); - RUN_TEST (test_set16, answ_set16, 64, 32, 0, 1); - - return 0; + int o1 = run_test_u32x2 (test_set0, answ_set0, 64, 32, 2); + int o2 = run_test_u32x2 (test_set1, answ_set1, 64, 32, 2); + int o3 = run_test_u32x2 (test_set2, answ_set2, 64, 32, 2); + int o4 = run_test_u32x2 (test_set3, answ_set3, 64, 32, 2); + int o5 = run_test_u32x2 (test_set4, answ_set4, 64, 32, 2); + int o6 = run_test_u32x2 (test_set5, answ_set5, 64, 32, 2); + int o7 = run_test_u32x2 (test_set6, answ_set6, 64, 32, 2); + int o8 = run_test_u32x2 (test_set7, answ_set7, 64, 32, 2); + int o9 = run_test_u32x2 (test_set8, answ_set8, 64, 32, 2); + int o10 = run_test_u32x2 (test_set9, answ_set9, 64, 32, 2); + int o11 = run_test_u32x2 (test_set10, answ_set10, 64, 32, 2); + int o12 = run_test_u32x2 (test_set11, answ_set11, 64, 32, 2); + int o13 = run_test_u32x2 
(test_set12, answ_set12, 64, 32, 2); + int o14 = run_test_u32x2 (test_set13, answ_set13, 64, 32, 2); + int o15 = run_test_u32x2 (test_set14, answ_set14, 64, 32, 2); + int o16 = run_test_u32x2 (test_set15, answ_set15, 64, 32, 2); + int o17 = run_test_u32x2 (test_set16, answ_set16, 64, 32, 1); + + return o1||o2||o3||o4||o5||o6||o7||o8||o9||o10||o11||o12||o13||o14 + ||o15||o16||o17; } /* ASM scan near test for signed version. */ @@ -441,9 +451,9 @@ test_vclzq_u8 () uint8_t answ_set0[16] = { 8, 7, 6, 5, 4, 3, 2, 1, 0, 0, 0, 0, 0, 0, 0, 0 }; - RUN_TEST (test_set0, answ_set0, 128, 8, 0, 9); + int o1 = run_testq_u8x16 (test_set0, answ_set0, 128, 8, 9); - return 0; + return o1; } /* ASM scan near test for signed version. */ @@ -476,11 +486,11 @@ test_vclzq_u16 () 0, 0, 0, 0, 0, 0, 0, 0 }; - RUN_TEST (test_set0, answ_set0, 128, 16, 0, 8); - RUN_TEST (test_set1, answ_set1, 128, 16, 0, 8); - RUN_TEST (test_set2, answ_set2, 128, 16, 0, 1); + int o1 = run_testq_u16x8 (test_set0, answ_set0, 128, 16, 8); + int o2 = run_testq_u16x8 (test_set1, answ_set1, 128, 16, 8); + int o3 = run_testq_u16x8 (test_set2, answ_set2, 128, 16, 1); - return 0; + return o1||o2||o3; } /* ASM scan near test for signed version. 
*/ @@ -488,10 +498,6 @@ test_vclzq_u16 () int __attribute__ ((noinline)) test_vclzq_u32 () { - int i; - uint32x4_t a; - uint32x4_t b; - uint32_t test_set0[4] = { TEST0, TEST1, TEST2, TEST3 }; uint32_t test_set1[4] = { TEST4, TEST5, TEST6, TEST7 }; uint32_t test_set2[4] = { TEST8, TEST9, TEST10, TEST11 }; @@ -512,13 +518,13 @@ test_vclzq_u32 () uint32_t answ_set7[4] = { 4, 3, 2, 1 }; uint32_t answ_set8[4] = { 0, 0, 0, 0 }; - RUN_TEST (test_set0, answ_set0, 128, 32, 0, 4); - RUN_TEST (test_set1, answ_set1, 128, 32, 0, 4); - RUN_TEST (test_set2, answ_set2, 128, 32, 0, 4); - RUN_TEST (test_set3, answ_set3, 128, 32, 0, 4); - RUN_TEST (test_set4, answ_set4, 128, 32, 0, 1); + int o1 = run_testq_u32x4 (test_set0, answ_set0, 128, 32, 4); + int o2 = run_testq_u32x4 (test_set1, answ_set1, 128, 32, 4); + int o3 = run_testq_u32x4 (test_set2, answ_set2, 128, 32, 4); + int o4 = run_testq_u32x4 (test_set3, answ_set3, 128, 32, 4); + int o5 = run_testq_u32x4 (test_set4, answ_set4, 128, 32, 1); - return 0; + return o1||o2||o3||o4||o5; } /* ASM scan near test for signed version. 
*/ diff --git a/gcc/testsuite/gcc.target/aarch64/vneg_s.c b/gcc/testsuite/gcc.target/aarch64/vneg_s.c index 6947526abdd4f49cf560661531e96feb9b934eb5..8ddc4d21c1f89d6c66624a33ee0386cb3a28c512 100644 --- a/gcc/testsuite/gcc.target/aarch64/vneg_s.c +++ b/gcc/testsuite/gcc.target/aarch64/vneg_s.c @@ -31,49 +31,24 @@ extern void abort (void); -#define CONCAT(a, b) a##b -#define CONCAT1(a, b) CONCAT (a, b) -#define REG_INFEX64 _ -#define REG_INFEX128 q_ -#define REG_INFEX(reg_len) REG_INFEX##reg_len -#define POSTFIX(reg_len, data_len) \ - CONCAT1 (REG_INFEX (reg_len), s##data_len) -#define DATA_TYPE_32 float -#define DATA_TYPE_64 double -#define DATA_TYPE(data_len) DATA_TYPE_##data_len - -#define FORCE_SIMD_INST64_8(data) -#define FORCE_SIMD_INST64_16(data) -#define FORCE_SIMD_INST64_32(data) -#define FORCE_SIMD_INST64_64(data) force_simd (data) -#define FORCE_SIMD_INST128_8(data) -#define FORCE_SIMD_INST128_16(data) -#define FORCE_SIMD_INST128_32(data) -#define FORCE_SIMD_INST128_64(data) - -#define FORCE_SIMD_INST(reg_len, data_len, data) \ - CONCAT1 (FORCE_SIMD_INST, reg_len##_##data_len) (data) -#define LOAD_INST(reg_len, data_len) \ - CONCAT1 (vld1, POSTFIX (reg_len, data_len)) -#define NEG_INST(reg_len, data_len) \ - CONCAT1 (vneg, POSTFIX (reg_len, data_len)) - -#define RUN_TEST(test_set, answ_set, reg_len, data_len, n, a, b) \ - { \ - int i; \ - INHIB_OPTIMIZATION; \ - (a) = LOAD_INST (reg_len, data_len) (test_set); \ - (b) = LOAD_INST (reg_len, data_len) (answ_set); \ - FORCE_SIMD_INST (reg_len, data_len, a) \ - a = NEG_INST (reg_len, data_len) (a); \ - FORCE_SIMD_INST (reg_len, data_len, a) \ - for (i = 0; i < n; i++) \ - { \ - INHIB_OPTIMIZATION; \ - if (a[i] != b[i]) \ - return 1; \ - } \ - } +#define BUILD_TEST(type, size, lanes) \ +int __attribute__((noipa,noinline)) \ +run_test##type##size##x##lanes (int##size##_t* test_set, \ + int##size##_t* answ_set, \ + int reg_len, int data_len, int n) \ +{ \ + int i; \ + int##size##x##lanes##_t a = vld1##type##size 
(test_set); \ + int##size##x##lanes##_t b = vld1##type##size (answ_set); \ + a = vneg##type##size (a); \ + for (i = 0; i < n; i++) \ + { \ + INHIB_OPTIMIZATION; \ + if (a[i] != b[i]) \ + return 1; \ + } \ + return 0; \ +} \ #define RUN_TEST_SCALAR(test_val, answ_val, a, b) \ { \ @@ -87,12 +62,19 @@ extern void abort (void); force_simd (res); \ } +BUILD_TEST (_s, 8, 8) +BUILD_TEST (_s, 16, 4) +BUILD_TEST (_s, 32, 2) +BUILD_TEST (_s, 64, 1) + +BUILD_TEST (q_s, 8, 16) +BUILD_TEST (q_s, 16, 8) +BUILD_TEST (q_s, 32, 4) +BUILD_TEST (q_s, 64, 2) + int __attribute__ ((noinline)) test_vneg_s8 () { - int8x8_t a; - int8x8_t b; - int8_t test_set0[8] = { TEST0, TEST1, TEST2, TEST3, TEST4, TEST5, SCHAR_MAX, SCHAR_MIN }; @@ -100,9 +82,9 @@ test_vneg_s8 () ANSW0, ANSW1, ANSW2, ANSW3, ANSW4, ANSW5, SCHAR_MIN + 1, SCHAR_MIN }; - RUN_TEST (test_set0, answ_set0, 64, 8, 8, a, b); + int o1 = run_test_s8x8 (test_set0, answ_set0, 64, 8, 8); - return 0; + return o1; } /* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.8b, v\[0-9\]+\.8b" 1 } } */ @@ -110,29 +92,23 @@ test_vneg_s8 () int __attribute__ ((noinline)) test_vneg_s16 () { - int16x4_t a; - int16x4_t b; - int16_t test_set0[4] = { TEST0, TEST1, TEST2, TEST3 }; int16_t test_set1[4] = { TEST4, TEST5, SHRT_MAX, SHRT_MIN }; int16_t answ_set0[4] = { ANSW0, ANSW1, ANSW2, ANSW3 }; int16_t answ_set1[4] = { ANSW4, ANSW5, SHRT_MIN + 1, SHRT_MIN }; - RUN_TEST (test_set0, answ_set0, 64, 16, 4, a, b); - RUN_TEST (test_set1, answ_set1, 64, 16, 4, a, b); + int o1 = run_test_s16x4 (test_set0, answ_set0, 64, 16, 4); + int o2 = run_test_s16x4 (test_set1, answ_set1, 64, 16, 4); - return 0; + return o1||o2; } -/* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.4h, v\[0-9\]+\.4h" 2 } } */ +/* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.4h, v\[0-9\]+\.4h" 1 } } */ int __attribute__ ((noinline)) test_vneg_s32 () { - int32x2_t a; - int32x2_t b; - int32_t test_set0[2] = { TEST0, TEST1 }; int32_t test_set1[2] = { TEST2, TEST3 }; int32_t 
test_set2[2] = { TEST4, TEST5 }; @@ -143,22 +119,19 @@ test_vneg_s32 () int32_t answ_set2[2] = { ANSW4, ANSW5 }; int32_t answ_set3[2] = { INT_MIN + 1, INT_MIN }; - RUN_TEST (test_set0, answ_set0, 64, 32, 2, a, b); - RUN_TEST (test_set1, answ_set1, 64, 32, 2, a, b); - RUN_TEST (test_set2, answ_set2, 64, 32, 2, a, b); - RUN_TEST (test_set3, answ_set3, 64, 32, 2, a, b); + int o1 = run_test_s32x2 (test_set0, answ_set0, 64, 32, 2); + int o2 = run_test_s32x2 (test_set1, answ_set1, 64, 32, 2); + int o3 = run_test_s32x2 (test_set2, answ_set2, 64, 32, 2); + int o4 = run_test_s32x2 (test_set3, answ_set3, 64, 32, 2); - return 0; + return o1||o2||o3||o4; } -/* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.2s, v\[0-9\]+\.2s" 4 } } */ +/* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.2s, v\[0-9\]+\.2s" 1 } } */ int __attribute__ ((noinline)) test_vneg_s64 () { - int64x1_t a; - int64x1_t b; - int64_t test_set0[1] = { TEST0 }; int64_t test_set1[1] = { TEST1 }; int64_t test_set2[1] = { TEST2 }; @@ -177,16 +150,16 @@ test_vneg_s64 () int64_t answ_set6[1] = { LLONG_MIN + 1 }; int64_t answ_set7[1] = { LLONG_MIN }; - RUN_TEST (test_set0, answ_set0, 64, 64, 1, a, b); - RUN_TEST (test_set1, answ_set1, 64, 64, 1, a, b); - RUN_TEST (test_set2, answ_set2, 64, 64, 1, a, b); - RUN_TEST (test_set3, answ_set3, 64, 64, 1, a, b); - RUN_TEST (test_set4, answ_set4, 64, 64, 1, a, b); - RUN_TEST (test_set5, answ_set5, 64, 64, 1, a, b); - RUN_TEST (test_set6, answ_set6, 64, 64, 1, a, b); - RUN_TEST (test_set7, answ_set7, 64, 64, 1, a, b); + int o1 = run_test_s64x1 (test_set0, answ_set0, 64, 64, 1); + int o2 = run_test_s64x1 (test_set1, answ_set1, 64, 64, 1); + int o3 = run_test_s64x1 (test_set2, answ_set2, 64, 64, 1); + int o4 = run_test_s64x1 (test_set3, answ_set3, 64, 64, 1); + int o5 = run_test_s64x1 (test_set4, answ_set4, 64, 64, 1); + int o6 = run_test_s64x1 (test_set5, answ_set5, 64, 64, 1); + int o7 = run_test_s64x1 (test_set6, answ_set6, 64, 64, 1); + int o8 = run_test_s64x1 
(test_set7, answ_set7, 64, 64, 1); - return 0; + return o1||o2||o3||o4||o5||o6||o7||o8; } int __attribute__ ((noinline)) @@ -206,14 +179,11 @@ test_vnegd_s64 () return 0; } -/* { dg-final { scan-assembler-times "neg\\td\[0-9\]+, d\[0-9\]+" 16 } } */ +/* { dg-final { scan-assembler-times "neg\\td\[0-9\]+, d\[0-9\]+" 8 } } */ int __attribute__ ((noinline)) test_vnegq_s8 () { - int8x16_t a; - int8x16_t b; - int8_t test_set0[16] = { TEST0, TEST1, TEST2, TEST3, TEST4, TEST5, SCHAR_MAX, SCHAR_MIN, 4, 8, 15, 16, 23, 42, -1, -2 @@ -224,9 +194,9 @@ test_vnegq_s8 () -4, -8, -15, -16, -23, -42, 1, 2 }; - RUN_TEST (test_set0, answ_set0, 128, 8, 8, a, b); + int o1 = run_testq_s8x16 (test_set0, answ_set0, 128, 8, 8); - return 0; + return o1; } /* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.16b, v\[0-9\]+\.16b" 1 } } */ @@ -234,9 +204,6 @@ test_vnegq_s8 () int __attribute__ ((noinline)) test_vnegq_s16 () { - int16x8_t a; - int16x8_t b; - int16_t test_set0[8] = { TEST0, TEST1, TEST2, TEST3, TEST4, TEST5, SHRT_MAX, SHRT_MIN }; @@ -244,9 +211,9 @@ test_vnegq_s16 () ANSW0, ANSW1, ANSW2, ANSW3, ANSW4, ANSW5, SHRT_MIN + 1, SHRT_MIN }; - RUN_TEST (test_set0, answ_set0, 128, 16, 8, a, b); + int o1 = run_testq_s16x8 (test_set0, answ_set0, 128, 16, 8); - return 0; + return o1; } /* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.8h, v\[0-9\]+\.8h" 1 } } */ @@ -254,29 +221,23 @@ test_vnegq_s16 () int __attribute__ ((noinline)) test_vnegq_s32 () { - int32x4_t a; - int32x4_t b; - int32_t test_set0[4] = { TEST0, TEST1, TEST2, TEST3 }; int32_t test_set1[4] = { TEST4, TEST5, INT_MAX, INT_MIN }; int32_t answ_set0[4] = { ANSW0, ANSW1, ANSW2, ANSW3 }; int32_t answ_set1[4] = { ANSW4, ANSW5, INT_MIN + 1, INT_MIN }; - RUN_TEST (test_set0, answ_set0, 128, 32, 4, a, b); - RUN_TEST (test_set1, answ_set1, 128, 32, 4, a, b); + int o1 = run_testq_s32x4 (test_set0, answ_set0, 128, 32, 4); + int o2 = run_testq_s32x4 (test_set1, answ_set1, 128, 32, 4); - return 0; + return o1||o2; } -/* { dg-final 
{ scan-assembler-times "neg\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s" 2 } } */ +/* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.4s, v\[0-9\]+\.4s" 1 } } */ int __attribute__ ((noinline)) test_vnegq_s64 () { - int64x2_t a; - int64x2_t b; - int64_t test_set0[2] = { TEST0, TEST1 }; int64_t test_set1[2] = { TEST2, TEST3 }; int64_t test_set2[2] = { TEST4, TEST5 }; @@ -287,15 +248,15 @@ test_vnegq_s64 () int64_t answ_set2[2] = { ANSW4, ANSW5 }; int64_t answ_set3[2] = { LLONG_MIN + 1, LLONG_MIN }; - RUN_TEST (test_set0, answ_set0, 128, 64, 2, a, b); - RUN_TEST (test_set1, answ_set1, 128, 64, 2, a, b); - RUN_TEST (test_set2, answ_set2, 128, 64, 2, a, b); - RUN_TEST (test_set3, answ_set3, 128, 64, 2, a, b); + int o1 = run_testq_s64x2 (test_set0, answ_set0, 128, 64, 2); + int o2 = run_testq_s64x2 (test_set1, answ_set1, 128, 64, 2); + int o3 = run_testq_s64x2 (test_set2, answ_set2, 128, 64, 2); + int o4 = run_testq_s64x2 (test_set3, answ_set3, 128, 64, 2); - return 0; + return o1||o2||o3||o4; } -/* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d" 4 } } */ +/* { dg-final { scan-assembler-times "neg\\tv\[0-9\]+\.2d, v\[0-9\]+\.2d" 1 } } */ int main (int argc, char **argv)