From patchwork Fri Dec 9 13:32:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Stamatis Markianos-Wright
X-Patchwork-Id: 61728
Message-ID: <08fcbcb4-c1c5-2e9f-1efd-e1d08fb7a3f6@arm.com>
Date: Fri, 9 Dec 2022 13:32:29 +0000
To: "gcc-patches@gcc.gnu.org"
Cc: Kyrylo Tkachov, Richard Earnshaw, Ramana Radhakrishnan, nickc@redhat.com
Subject: [PATCH] Fix memory constraint on MVE v[ld/st][2/4] instructions [PR107714]
List-Id: Gcc-patches mailing list
From: Stamatis Markianos-Wright
Reply-To: Stam Markianos-Wright

Hi all,

In the M-Class Arm-ARM:

https://developer.arm.com/documentation/ddi0553/bu/?lang=en

these MVE instructions only have a '!' writeback variant, and at:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

we found that the Um constraint would also allow a register offset
writeback through, resulting in an assembler error.

Here I have added a new constraint and predicate for these
instructions, which (uniquely, AFAICT) only support a `!` writeback
increment by the data size (inside the compiler this is a POST_INC).

No regressions in arm-none-eabi with MVE and MVE.FP.

Ok for trunk, and backport to GCC11 and GCC12 (testing pending)?

Thanks,
Stam

gcc/ChangeLog:

        PR target/107714
        * config/arm/arm-protos.h (mve_struct_mem_operand): New prototype.
        * config/arm/arm.cc (mve_struct_mem_operand): New function.
        * config/arm/constraints.md (Ug): New constraint.
        * config/arm/mve.md (mve_vst4q): Change constraint.
        (mve_vst2q): Likewise.
        (mve_vld4q): Likewise.
        (mve_vld2q): Likewise.
        * config/arm/predicates.md (mve_struct_operand): New predicate.

gcc/testsuite/ChangeLog:

        PR target/107714
        * gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c: New test.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 550272facd12e60a49bf8a3b20f811cc13765b3a..8ea38118b05769bd6fcb1d22d902a50979cfd953 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -122,6 +122,7 @@ extern int arm_coproc_mem_operand_wb (rtx, int);
 extern int neon_vector_mem_operand (rtx, int, bool);
 extern int mve_vector_mem_operand (machine_mode, rtx, bool);
 extern int neon_struct_mem_operand (rtx);
+extern int mve_struct_mem_operand (rtx);
 
 extern rtx *neon_vcmla_lane_prepare_operands (rtx *);
 
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index b587561eebea921bdc68016922d37948e2870ce2..31f2a7b9d4688dde69d1435e24cf885e8544be71 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -13737,6 +13737,24 @@ neon_vector_mem_operand (rtx op, int type, bool strict)
   return FALSE;
 }
 
+/* Return TRUE if OP is a mem suitable for loading/storing an MVE struct
+   type. */
+int
+mve_struct_mem_operand (rtx op)
+{
+  rtx ind = XEXP (op, 0);
+
+  /* Match: (mem (reg)). */
+  if (REG_P (ind))
+    return arm_address_register_rtx_p (ind, 0);
+
+  /* Allow only post-increment by the mode size. */
+  if (GET_CODE (ind) == POST_INC)
+    return arm_address_register_rtx_p (XEXP (ind, 0), 0);
+
+  return FALSE;
+}
+
 /* Return TRUE if OP is a mem suitable for loading/storing a Neon struct
    type. */
 int
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index e5a36d29c7135943b9bb5ea396f70e2e4beb1e4a..8908b7f5b15ce150685868e78e75280bf32053f1 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -474,6 +474,12 @@
  (and (match_code "mem")
       (match_test "TARGET_32BIT && arm_coproc_mem_operand (op, FALSE)")))
 
+(define_memory_constraint "Ug"
+ "@internal
+  In Thumb-2 state a valid MVE struct load/store address."
+ (and (match_code "mem")
+      (match_test "TARGET_HAVE_MVE && mve_struct_mem_operand (op)")))
+
 (define_memory_constraint "Uj"
  "@internal
   In ARM/Thumb-2 state a VFP load/store address that supports writeback
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index b5e6da4b1335818a3e8815de59850e845a2d0400..847bc032afa2c3977c05725562a14940beb282d4 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -99,7 +99,7 @@
 ;; [vst4q])
 ;;
 (define_insn "mve_vst4q"
-  [(set (match_operand:XI 0 "neon_struct_operand" "=Um")
+  [(set (match_operand:XI 0 "mve_struct_operand" "=Ug")
         (unspec:XI [(match_operand:XI 1 "s_register_operand" "w")
                     (unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
          VST4Q))
@@ -9959,7 +9959,7 @@
 ;; [vst2q])
 ;;
 (define_insn "mve_vst2q"
-  [(set (match_operand:OI 0 "neon_struct_operand" "=Um")
+  [(set (match_operand:OI 0 "mve_struct_operand" "=Ug")
         (unspec:OI [(match_operand:OI 1 "s_register_operand" "w")
                     (unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
          VST2Q))
@@ -9988,7 +9988,7 @@
 ;;
 (define_insn "mve_vld2q"
   [(set (match_operand:OI 0 "s_register_operand" "=w")
-        (unspec:OI [(match_operand:OI 1 "neon_struct_operand" "Um")
+        (unspec:OI [(match_operand:OI 1 "mve_struct_operand" "Ug")
                     (unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
          VLD2Q))
   ]
@@ -10016,7 +10016,7 @@
 ;;
 (define_insn "mve_vld4q"
   [(set (match_operand:XI 0 "s_register_operand" "=w")
-        (unspec:XI [(match_operand:XI 1 "neon_struct_operand" "Um")
+        (unspec:XI [(match_operand:XI 1 "mve_struct_operand" "Ug")
                     (unspec:MVE_VLD_ST [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
          VLD4Q))
   ]
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index aab5a91ad4ddc6a7a02611d05442d6de63841a7c..67f2fdb4f8f607ceb50871e1bc17dbdb9b987c2c 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -876,6 +876,10 @@
   (and (match_code "mem")
        (match_test "TARGET_32BIT && neon_vector_mem_operand (op, 2, true)")))
 
+(define_predicate "mve_struct_operand"
+  (and (match_code "mem")
+       (match_test "TARGET_HAVE_MVE && mve_struct_mem_operand (op)")))
+
 (define_predicate "neon_permissive_struct_operand"
   (and (match_code "mem")
        (match_test "TARGET_32BIT && neon_vector_mem_operand (op, 2, false)")))
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c
new file mode 100644
index 0000000000000000000000000000000000000000..d028b91e81aed97e4b30978b6d130a6f97f1cbc3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c
@@ -0,0 +1,300 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O1" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_mve.h"
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**test:
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test(const uint8_t * in, uint8_t * out, int width)
+{
+  uint8x16x2_t rg = vld2q(in);
+  uint8x16x2_t gb = vld2q(in + width);
+  vst2q (out, rg);
+  vst2q (out + width, gb);
+}
+
+/*
+**test2:
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]!
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]!
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test2(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x2_t rg = vld2q(in);
+  uint8x16x2_t gb = vld2q(in + 32);
+  vst2q (out, rg);
+  vst2q (out + 32, gb);
+}
+
+/*
+**test3:
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test3(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x2_t rg = vld2q(in);
+  uint8x16x2_t gb = vld2q(in - 32);
+  vst2q (out, rg);
+  vst2q (out - 32, gb);
+}
+
+/*
+**test4:
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test4(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x2_t rg = vld2q(in);
+  uint8x16x2_t gb = vld2q(in + 64);
+  vst2q (out, rg);
+  vst2q (out + 64, gb);
+}
+
+/*
+**test5:
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst20.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst21.8 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test5(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x2_t rg = vld2q(in);
+  uint8x16x2_t gb = vld2q(in + 42);
+  vst2q (out, rg);
+  vst2q (out + 42, gb);
+}
+
+/*
+**test6:
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test6(const uint8_t * in, uint8_t * out, int width)
+{
+  uint8x16x4_t rg = vld4q(in);
+  uint8x16x4_t gb = vld4q(in + width);
+  vst4q (out, rg);
+  vst4q (out + width, gb);
+}
+
+/*
+**test7:
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test7(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x4_t rg = vld4q(in);
+  uint8x16x4_t gb = vld4q(in + 32);
+  vst4q (out, rg);
+  vst4q (out + 32, gb);
+}
+
+/*
+**test8:
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]!
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]!
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test8(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x4_t rg = vld4q(in);
+  uint8x16x4_t gb = vld4q(in + 64);
+  vst4q (out, rg);
+  vst4q (out + 64, gb);
+}
+
+/*
+**test9:
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test9(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x4_t rg = vld4q(in);
+  uint8x16x4_t gb = vld4q(in - 64);
+  vst4q (out, rg);
+  vst4q (out - 64, gb);
+}
+
+/*
+**test10:
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vld40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vld43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+** vst40.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst41.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst42.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** vst43.8 {q[0-9]+, q[0-9]+, q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\]
+** ...
+*/
+void
+test10(const uint8_t * in, uint8_t * out)
+{
+  uint8x16x4_t rg = vld4q(in);
+  uint8x16x4_t gb = vld4q(in + 42);
+  vst4q (out, rg);
+  vst4q (out + 42, gb);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
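
For readers who do not know GCC's RTL address forms, the rule that mve_struct_mem_operand enforces can be modelled outside the compiler roughly as below. This is only a toy sketch and not GCC code: the toy_addr_kind enum, the toy_addr struct and both functions are hypothetical names invented for illustration, and the base_is_core_reg flag stands in for the arm_address_register_rtx_p check. It shows a plain register base and a post-increment being accepted, and a register-offset writeback (the form that Um let through in PR107714) or an immediate offset being rejected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the RTL address shapes discussed in the
   patch; these are not GCC's real rtx codes.  */
enum toy_addr_kind {
  TOY_REG,             /* (mem (reg)): plain base register */
  TOY_POST_INC,        /* (mem (post_inc (reg))): '!' writeback by the data size */
  TOY_POST_MODIFY_REG, /* writeback by a register offset */
  TOY_PLUS_IMM         /* base register plus an immediate offset */
};

struct toy_addr {
  enum toy_addr_kind kind;
  bool base_is_core_reg; /* models the arm_address_register_rtx_p check */
};

/* Accept only the two address shapes the patch allows for
   vld2/vld4/vst2/vst4: a plain register, or a post-increment.  */
static bool
toy_mve_struct_addr_ok (struct toy_addr a)
{
  switch (a.kind)
    {
    case TOY_REG:
    case TOY_POST_INC:
      return a.base_is_core_reg;
    default:
      return false;
    }
}

/* Small self-check covering each address shape once.  */
static void
toy_self_check (void)
{
  assert (toy_mve_struct_addr_ok ((struct toy_addr) { TOY_REG, true }));
  assert (toy_mve_struct_addr_ok ((struct toy_addr) { TOY_POST_INC, true }));
  assert (!toy_mve_struct_addr_ok ((struct toy_addr) { TOY_POST_MODIFY_REG, true }));
  assert (!toy_mve_struct_addr_ok ((struct toy_addr) { TOY_PLUS_IMM, true }));
}
```

In the real patch the same yes/no answer is used twice, once by the mve_struct_operand predicate and once by the Ug constraint, so that an address rejected here is never offered to the four insn patterns.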