From patchwork Thu Nov 28 21:12:30 2024
X-Patchwork-Submitter: Claudio Bantaloukas
X-Patchwork-Id: 102052
From: Claudio Bantaloukas <claudio.bantaloukas@arm.com>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v5 1/5] aarch64: Add basic svmfloat8_t support to arm_sve.h
Date: Thu, 28 Nov 2024 21:12:30 +0000
Message-ID: <20241128211234.1714776-2-claudio.bantaloukas@arm.com>
In-Reply-To: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
References: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
This patch adds support for the fp8 related vectors to arm_sve.h. It also
adds support for functions that treat mfloat8_t as a plain bag of 8 bits
(reinterpret casts, vector manipulation, loads, stores, element selection,
vector tuples, table lookups and the SVE<->SIMD bridge); these functions are
available for fp8 whenever they are available for the other 8-bit types.
Arithmetic operations, bit manipulation and conversions are notably absent.

The generated assembly is mostly consistent with the _u8 equivalents, which
can be used to validate the tests, except where immediates are involved:
immediates cannot be expressed for mf8, so the tests instead take their
values from function arguments passed in registers w0-w9.

gcc/
        * config/aarch64/aarch64-sve-builtins.cc (TYPES_b_data): Add mf8.
        (TYPES_reinterpret1, TYPES_reinterpret): Likewise.
        * config/aarch64/aarch64-sve-builtins.def (svmfloat8_t): New type.
        (mf8): New type suffix.
        * config/aarch64/aarch64-sve-builtins.h (TYPE_mfloat): New
        type_class_index.

gcc/testsuite/
        * g++.target/aarch64/sve/acle/general-c++/mangle_1.C: Test mangling
        of svmfloat8_t.
        * g++.target/aarch64/sve/acle/general-c++/mangle_2.C: Likewise for
        __SVMfloat8_t.
        * gcc.target/aarch64/sve/acle/asm/clasta_mf8.c: New test.
        * gcc.target/aarch64/sve/acle/asm/clastb_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/create2_1.c (create2_mf8):
        Likewise.
        * gcc.target/aarch64/sve/acle/asm/create3_1.c (create_mf8): Likewise.
        * gcc.target/aarch64/sve/acle/asm/create4_1.c (create_mf8): Likewise.
        * gcc.target/aarch64/sve/acle/asm/dup_lane_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/dup_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/dup_neonq_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/dupq_lane_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ext_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/get_neonq_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/get2_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/get3_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/get4_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/insr_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/lasta_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/lastb_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ld1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ld1ro_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ld1rq_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ld2_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ld3_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ld4_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ldff1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ldnf1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/ldnt1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/len_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c
        (reinterpret_bf16_mf8_tied1, reinterpret_bf16_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c
        (reinterpret_f16_mf8_tied1, reinterpret_f16_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c
        (reinterpret_f32_mf8_tied1, reinterpret_f32_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c
        (reinterpret_f64_mf8_tied1, reinterpret_f64_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c
        (reinterpret_s16_mf8_tied1, reinterpret_s16_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c
        (reinterpret_s32_mf8_tied1, reinterpret_s32_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c
        (reinterpret_s64_mf8_tied1, reinterpret_s64_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c
        (reinterpret_s8_mf8_tied1, reinterpret_s8_mf8_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c
        (reinterpret_u16_mf8_tied1, reinterpret_u16_mf8_untied)
        (reinterpret_u16_mf8_x3_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c
        (reinterpret_u32_mf8_tied1, reinterpret_u32_mf8_untied)
        (reinterpret_u32_mf8_x3_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c
        (reinterpret_u64_mf8_tied1, reinterpret_u64_mf8_untied)
        (reinterpret_u64_mf8_x3_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c
        (reinterpret_u8_mf8_tied1, reinterpret_u8_mf8_untied)
        (reinterpret_u8_mf8_x3_untied): Likewise.
        * gcc.target/aarch64/sve/acle/asm/rev_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/sel_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/set_neonq_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/set2_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/set3_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/set4_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/splice_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/st1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/st2_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/st3_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/st4_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/stnt1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/tbl_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/trn1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/trn1q_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/trn2_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/trn2q_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/undef2_1.c (mfloat8_t): Likewise.
        * gcc.target/aarch64/sve/acle/asm/undef3_1.c (mfloat8_t): Likewise.
        * gcc.target/aarch64/sve/acle/asm/undef4_1.c (mfloat8_t): Likewise.
        * gcc.target/aarch64/sve/acle/asm/undef_1.c (mfloat8_t): Likewise.
        * gcc.target/aarch64/sve/acle/asm/uzp1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/uzp1q_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/uzp2_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/uzp2q_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/zip1_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/zip1q_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/zip2_mf8.c: Likewise.
        * gcc.target/aarch64/sve/acle/asm/zip2q_mf8.c: Likewise.
        * gcc.target/aarch64/sve/pcs/annotate_1.c (ret_mf8, ret_mf8x2)
        (ret_mf8x3, ret_mf8x4): Likewise.
        * gcc.target/aarch64/sve/pcs/annotate_2.c (fn_mf8, fn_mf8x2)
        (fn_mf8x3, fn_mf8x4): Likewise.
        * gcc.target/aarch64/sve/pcs/annotate_3.c (fn_mf8, fn_mf8x2)
        (fn_mf8x3, fn_mf8x4): Likewise.
        * gcc.target/aarch64/sve/pcs/annotate_4.c (fn_mf8, fn_mf8x2)
        (fn_mf8x3, fn_mf8x4): Likewise.
        * gcc.target/aarch64/sve/pcs/annotate_5.c (fn_mf8, fn_mf8x2)
        (fn_mf8x3, fn_mf8x4): Likewise.
        * gcc.target/aarch64/sve/pcs/annotate_6.c (fn_mf8, fn_mf8x2)
        (fn_mf8x3, fn_mf8x4): Likewise.
        * gcc.target/aarch64/sve/pcs/annotate_7.c (fn_mf8, fn_mf8x2)
        (fn_mf8x3, fn_mf8x4): Likewise.
        * gcc.target/aarch64/sve/pcs/args_5_be_mf8.c: Likewise.
        * gcc.target/aarch64/sve/pcs/args_5_le_mf8.c: Likewise.
        * gcc.target/aarch64/sve/pcs/args_6_be_mf8.c: Likewise.
        * gcc.target/aarch64/sve/pcs/args_6_le_mf8.c: Likewise.
        * gcc.target/aarch64/sve/pcs/gnu_vectors_1.c (mfloat8x32_t): New
        typedef.
        (mfloat8_callee, mfloat8_caller): New tests.
        * gcc.target/aarch64/sve/pcs/gnu_vectors_2.c (mfloat8x32_t): New
        typedef.
        (mfloat8_callee, mfloat8_caller): New tests.
        * gcc.target/aarch64/sve/pcs/return_4_128.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_4_256.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_4_512.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_4_1024.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_4_2048.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_4.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_5_128.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_5_256.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_5_512.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_5_1024.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_5_2048.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_5.c (CALLER_NON_NUMERIC):
        Renamed CALLER_BF16 macro.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_6.c (mfloat8_t): New typedef.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_6_128.c (mfloat8_t): New typedef.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_6_256.c (mfloat8_t): New typedef.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_6_512.c (mfloat8_t): New typedef.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_6_1024.c (mfloat8_t): New
        typedef.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_6_2048.c (mfloat8_t): New
        typedef.
        (callee_mf8, caller_mf8): New tests.
        * gcc.target/aarch64/sve/pcs/return_7.c (callee_mf8): New tests.
        (caller_mf8): Likewise.
        * gcc.target/aarch64/sve/pcs/return_8.c (callee_mf8): Likewise.
        (caller_mf8): Likewise.
        * gcc.target/aarch64/sve/pcs/return_9.c (callee_mf8): Likewise.
        (caller_mf8): Likewise.
        * gcc.target/aarch64/sve/pcs/varargs_2_mf8.c: New tests.
        * gcc.target/aarch64/sve2/acle/asm/tbl2_mf8.c: Likewise.
        * gcc.target/aarch64/sve2/acle/asm/tbx_mf8.c: Likewise.
        * gcc.target/aarch64/sve2/acle/asm/whilerw_mf8.c: Likewise.
        * gcc.target/aarch64/sve2/acle/asm/whilewr_mf8.c: Likewise.
---
 gcc/config/aarch64/aarch64-sve-builtins.cc    |   9 +-
 gcc/config/aarch64/aarch64-sve-builtins.def   |   3 +
 gcc/config/aarch64/aarch64-sve-builtins.h     |   1 +
 .../aarch64/sve/acle/general-c++/mangle_1.C   |   2 +
 .../aarch64/sve/acle/general-c++/mangle_2.C   |   2 +
 .../aarch64/sve/acle/asm/clasta_mf8.c         |  52 +++
 .../aarch64/sve/acle/asm/clastb_mf8.c         |  52 +++
 .../aarch64/sve/acle/asm/create2_1.c          |  15 +
 .../aarch64/sve/acle/asm/create3_1.c          |  11 +
 .../aarch64/sve/acle/asm/create4_1.c          |  12 +
 .../aarch64/sve/acle/asm/dup_lane_mf8.c       | 124 ++++++++
 .../gcc.target/aarch64/sve/acle/asm/dup_mf8.c |  31 ++
 .../aarch64/sve/acle/asm/dup_neonq_mf8.c      |  30 ++
 .../aarch64/sve/acle/asm/dupq_lane_mf8.c      |  48 +++
 .../gcc.target/aarch64/sve/acle/asm/ext_mf8.c |  73 +++++
 .../aarch64/sve/acle/asm/get2_mf8.c           |  55 ++++
 .../aarch64/sve/acle/asm/get3_mf8.c           | 108 +++++++
 .../aarch64/sve/acle/asm/get4_mf8.c           | 179 +++++++++++
 .../aarch64/sve/acle/asm/get_neonq_mf8.c      |  33 ++
 .../aarch64/sve/acle/asm/insr_mf8.c           |  22 ++
 .../aarch64/sve/acle/asm/lasta_mf8.c          |  12 +
 .../aarch64/sve/acle/asm/lastb_mf8.c          |  12 +
 .../gcc.target/aarch64/sve/acle/asm/ld1_mf8.c | 162 ++++++++++
 .../aarch64/sve/acle/asm/ld1ro_mf8.c          | 121 +++++++
 .../aarch64/sve/acle/asm/ld1rq_mf8.c          | 137 ++++++++
 .../gcc.target/aarch64/sve/acle/asm/ld2_mf8.c | 204 ++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/ld3_mf8.c | 246 +++++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/ld4_mf8.c | 290 +++++++++++++++++
 .../aarch64/sve/acle/asm/ldff1_mf8.c          |  91 ++++++
 .../aarch64/sve/acle/asm/ldnf1_mf8.c          | 155 +++++++++
 .../aarch64/sve/acle/asm/ldnt1_mf8.c          | 162 ++++++++++
 .../gcc.target/aarch64/sve/acle/asm/len_mf8.c |  12 +
 .../aarch64/sve/acle/asm/reinterpret_bf16.c   |  17 +
 .../aarch64/sve/acle/asm/reinterpret_f16.c    |  17 +
 .../aarch64/sve/acle/asm/reinterpret_f32.c    |  17 +
 .../aarch64/sve/acle/asm/reinterpret_f64.c    |  17 +
 .../aarch64/sve/acle/asm/reinterpret_mf8.c    | 297 ++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_s16.c    |  17 +
 .../aarch64/sve/acle/asm/reinterpret_s32.c    |  17 +
 .../aarch64/sve/acle/asm/reinterpret_s64.c    |  17 +
 .../aarch64/sve/acle/asm/reinterpret_s8.c     |  17 +
 .../aarch64/sve/acle/asm/reinterpret_u16.c    |  28 ++
 .../aarch64/sve/acle/asm/reinterpret_u32.c    |  28 ++
 .../aarch64/sve/acle/asm/reinterpret_u64.c    |  28 ++
 .../aarch64/sve/acle/asm/reinterpret_u8.c     |  28 ++
 .../gcc.target/aarch64/sve/acle/asm/rev_mf8.c |  21 ++
 .../gcc.target/aarch64/sve/acle/asm/sel_mf8.c |  30 ++
 .../aarch64/sve/acle/asm/set2_mf8.c           |  41 +++
 .../aarch64/sve/acle/asm/set3_mf8.c           |  63 ++++
 .../aarch64/sve/acle/asm/set4_mf8.c           |  87 +++++
 .../aarch64/sve/acle/asm/set_neonq_mf8.c      |  23 ++
 .../aarch64/sve/acle/asm/splice_mf8.c         |  33 ++
 .../gcc.target/aarch64/sve/acle/asm/st1_mf8.c | 162 ++++++++++
 .../gcc.target/aarch64/sve/acle/asm/st2_mf8.c | 204 ++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/st3_mf8.c | 246 +++++++++++++++
 .../gcc.target/aarch64/sve/acle/asm/st4_mf8.c | 290 +++++++++++++++++
 .../aarch64/sve/acle/asm/stnt1_mf8.c          | 162 ++++++++++
 .../gcc.target/aarch64/sve/acle/asm/tbl_mf8.c |  30 ++
 .../aarch64/sve/acle/asm/trn1_mf8.c           |  30 ++
 .../aarch64/sve/acle/asm/trn1q_mf8.c          |  33 ++
 .../aarch64/sve/acle/asm/trn2_mf8.c           |  30 ++
 .../aarch64/sve/acle/asm/trn2q_mf8.c          |  33 ++
 .../aarch64/sve/acle/asm/undef2_1.c           |   7 +
 .../aarch64/sve/acle/asm/undef3_1.c           |   7 +
 .../aarch64/sve/acle/asm/undef4_1.c           |   7 +
 .../gcc.target/aarch64/sve/acle/asm/undef_1.c |   7 +
 .../aarch64/sve/acle/asm/uzp1_mf8.c           |  30 ++
 .../aarch64/sve/acle/asm/uzp1q_mf8.c          |  33 ++
 .../aarch64/sve/acle/asm/uzp2_mf8.c           |  30 ++
 .../aarch64/sve/acle/asm/uzp2q_mf8.c          |  33 ++
 .../aarch64/sve/acle/asm/zip1_mf8.c           |  30 ++
 .../aarch64/sve/acle/asm/zip1q_mf8.c          |  33 ++
 .../aarch64/sve/acle/asm/zip2_mf8.c           |  30 ++
 .../aarch64/sve/acle/asm/zip2q_mf8.c          |  33 ++
 .../gcc.target/aarch64/sve/pcs/annotate_1.c   |   8 +
 .../gcc.target/aarch64/sve/pcs/annotate_2.c   |   8 +
 .../gcc.target/aarch64/sve/pcs/annotate_3.c   |   8 +
 .../gcc.target/aarch64/sve/pcs/annotate_4.c   |  12 +
 .../gcc.target/aarch64/sve/pcs/annotate_5.c   |  12 +
 .../gcc.target/aarch64/sve/pcs/annotate_6.c   |  12 +
 .../gcc.target/aarch64/sve/pcs/annotate_7.c   |   8 +
 .../aarch64/sve/pcs/args_5_be_mf8.c           |  63 ++++
 .../aarch64/sve/pcs/args_5_le_mf8.c           |  58 ++++
 .../aarch64/sve/pcs/args_6_be_mf8.c           |  71 +++++
 .../aarch64/sve/pcs/args_6_le_mf8.c           |  70 +++++
 .../aarch64/sve/pcs/gnu_vectors_1.c           |  12 +-
 .../aarch64/sve/pcs/gnu_vectors_2.c           |  10 +-
 .../gcc.target/aarch64/sve/pcs/return_4.c     |  21 +-
 .../aarch64/sve/pcs/return_4_1024.c           |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_4_128.c |  21 +-
 .../aarch64/sve/pcs/return_4_2048.c           |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_4_256.c |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_4_512.c |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_5.c     |  21 +-
 .../aarch64/sve/pcs/return_5_1024.c           |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_5_128.c |  21 +-
 .../aarch64/sve/pcs/return_5_2048.c           |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_5_256.c |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_5_512.c |  21 +-
 .../gcc.target/aarch64/sve/pcs/return_6.c     |  24 ++
 .../aarch64/sve/pcs/return_6_1024.c           |  22 ++
 .../gcc.target/aarch64/sve/pcs/return_6_128.c |  19 ++
 .../aarch64/sve/pcs/return_6_2048.c           |  22 ++
 .../gcc.target/aarch64/sve/pcs/return_6_256.c |  22 ++
 .../gcc.target/aarch64/sve/pcs/return_6_512.c |  22 ++
 .../gcc.target/aarch64/sve/pcs/return_7.c     |  28 ++
 .../gcc.target/aarch64/sve/pcs/return_8.c     |  29 ++
 .../gcc.target/aarch64/sve/pcs/return_9.c     |  33 ++
 .../aarch64/sve/pcs/varargs_2_mf8.c           | 182 +++++++++++
 .../aarch64/sve2/acle/asm/tbl2_mf8.c          |  31 ++
 .../aarch64/sve2/acle/asm/tbx_mf8.c           |  37 +++
 .../aarch64/sve2/acle/asm/whilerw_mf8.c       |  50 +++
 .../aarch64/sve2/acle/asm/whilewr_mf8.c       |  50 +++
 113 files changed, 5954 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clasta_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clastb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_neonq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dupq_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ext_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get_neonq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/insr_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lasta_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lastb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1rq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnt1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/len_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rev_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sel_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set_neonq_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/splice_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st3_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st4_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/stnt1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2q_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbl2_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbx_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilerw_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilewr_mf8.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 0fec1cd439e..4596404f8a0 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -253,10 +253,11 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
 #define TYPES_b_integer(S, D) \
   S (s8), TYPES_b_unsigned (S, D)
 
-/* _s8
+/* _mf8
+   _s8
    _u8.  */
 #define TYPES_b_data(S, D) \
-  TYPES_b_integer (S, D)
+  S (mf8), TYPES_b_integer (S, D)
 
 /* _s8 _s16
    _u8 _u16.  */
@@ -539,16 +540,18 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
   D (u8, s32), \
   D (u16, s64)
 
-/* { _bf16 }              { _bf16 }
+/* { _mf8 _bf16 }         { _mf8 _bf16 }
    { _f16 _f32 _f64 }     { _f16 _f32 _f64 }
    { _s8 _s16 _s32 _s64 } x { _s8 _s16 _s32 _s64 }
    { _u8 _u16 _u32 _u64 }   { _u8 _u16 _u32 _u64 }.
*/ #define TYPES_reinterpret1(D, A) \ + D (A, mf8), \ D (A, bf16), \ D (A, f16), D (A, f32), D (A, f64), \ D (A, s8), D (A, s16), D (A, s32), D (A, s64), \ D (A, u8), D (A, u16), D (A, u32), D (A, u64) #define TYPES_reinterpret(S, D) \ + TYPES_reinterpret1 (D, mf8), \ TYPES_reinterpret1 (D, bf16), \ TYPES_reinterpret1 (D, f16), \ TYPES_reinterpret1 (D, f32), \ diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def index a9243c40a97..47c396b866d 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.def +++ b/gcc/config/aarch64/aarch64-sve-builtins.def @@ -81,6 +81,7 @@ DEF_SVE_MODE (vnum, none, none, vectors) DEF_SVE_TYPE (svbool_t, 10, __SVBool_t, boolean_type_node) DEF_SVE_TYPE (svcount_t, 11, __SVCount_t, boolean_type_node) +DEF_SVE_TYPE (svmfloat8_t, 13, __SVMfloat8_t, aarch64_mfp8_type_node) DEF_SVE_TYPE (svbfloat16_t, 14, __SVBfloat16_t, bfloat16_type_node) DEF_SVE_TYPE (svfloat16_t, 13, __SVFloat16_t, aarch64_fp16_type_node) DEF_SVE_TYPE (svfloat32_t, 13, __SVFloat32_t, float_type_node) @@ -107,6 +108,8 @@ DEF_SVE_TYPE_SUFFIX (c8, svcount_t, count, 8, VNx16BImode) DEF_SVE_TYPE_SUFFIX (c16, svcount_t, count, 16, VNx16BImode) DEF_SVE_TYPE_SUFFIX (c32, svcount_t, count, 32, VNx16BImode) DEF_SVE_TYPE_SUFFIX (c64, svcount_t, count, 64, VNx16BImode) +DEF_SVE_NEON_TYPE_SUFFIX (mf8, svmfloat8_t, mfloat, 8, VNx16QImode, + Mfloat8x8_t, Mfloat8x16_t) DEF_SVE_NEON_TYPE_SUFFIX (bf16, svbfloat16_t, bfloat, 16, VNx8BFmode, Bfloat16x4_t, Bfloat16x8_t) DEF_SVE_NEON_TYPE_SUFFIX (f16, svfloat16_t, float, 16, VNx8HFmode, diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 4094f8207f9..d209aebe96e 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -173,6 +173,7 @@ enum type_class_index TYPE_bfloat, TYPE_count, TYPE_float, + TYPE_mfloat, TYPE_signed, TYPE_unsigned, NUM_TYPE_CLASSES diff --git 
a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C index 2ad0c7f9838..c4984065416 100644 --- a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C +++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_1.C @@ -16,6 +16,7 @@ void f11(svfloat32_t) {} void f12(svfloat64_t) {} void f13(svbfloat16_t) {} void f14(svcount_t) {} +void f15(svmfloat8_t) {} /* { dg-final { scan-assembler "_Z2f1u10__SVBool_t:" } } */ /* { dg-final { scan-assembler "_Z2f2u10__SVInt8_t:" } } */ @@ -31,3 +32,4 @@ void f14(svcount_t) {} /* { dg-final { scan-assembler "_Z3f12u13__SVFloat64_t:" } } */ /* { dg-final { scan-assembler "_Z3f13u14__SVBfloat16_t:" } } */ /* { dg-final { scan-assembler "_Z3f14u11__SVCount_t:" } } */ +/* { dg-final { scan-assembler "_Z3f15u13__SVMfloat8_t:" } } */ diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C index c8bfcc5a9c2..3d83ddb7ab4 100644 --- a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C +++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/mangle_2.C @@ -14,6 +14,7 @@ void f11(__SVFloat32_t) {} void f12(__SVFloat64_t) {} void f13(__SVBfloat16_t) {} void f14(__SVCount_t) {} +void f15(__SVMfloat8_t) {} /* { dg-final { scan-assembler "_Z2f1u10__SVBool_t:" } } */ /* { dg-final { scan-assembler "_Z2f2u10__SVInt8_t:" } } */ @@ -29,3 +30,4 @@ void f14(__SVCount_t) {} /* { dg-final { scan-assembler "_Z3f12u13__SVFloat64_t:" } } */ /* { dg-final { scan-assembler "_Z3f13u14__SVBfloat16_t:" } } */ /* { dg-final { scan-assembler "_Z3f14u11__SVCount_t:" } } */ +/* { dg-final { scan-assembler "_Z3f15u13__SVMfloat8_t:" } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clasta_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clasta_mf8.c new file mode 100644 index 00000000000..708ecb1ff39 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clasta_mf8.c @@ -0,0 +1,52 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** clasta_mf8_tied1: +** clasta z0\.b, p0, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_Z (clasta_mf8_tied1, svmfloat8_t, + z0 = svclasta_mf8 (p0, z0, z1), + z0 = svclasta (p0, z0, z1)) + +/* +** clasta_mf8_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** clasta z0\.b, p0, z0\.b, \1\.b +** ret +*/ +TEST_UNIFORM_Z (clasta_mf8_tied2, svmfloat8_t, + z0 = svclasta_mf8 (p0, z1, z0), + z0 = svclasta (p0, z1, z0)) + +/* +** clasta_mf8_untied: +** movprfx z0, z1 +** clasta z0\.b, p0, z0\.b, z2\.b +** ret +*/ +TEST_UNIFORM_Z (clasta_mf8_untied, svmfloat8_t, + z0 = svclasta_mf8 (p0, z1, z2), + z0 = svclasta (p0, z1, z2)) + +/* +** clasta_x0_mf8: +** clasta b0, p0, b0, z2\.b +** ret +*/ +TEST_FOLD_LEFT_X (clasta_x0_mf8, mfloat8_t, svmfloat8_t, + x0 = svclasta_n_mf8 (p0, x0, z0), + x0 = svclasta (p0, x0, z0)) + +/* +** clasta_x1_mf8: +** clasta b1, p0, b1, z2\.b +** dup b0, v1.b\[0\] +** ret +*/ +TEST_FOLD_LEFT_X (clasta_x1_mf8, mfloat8_t, svmfloat8_t, + x0 = svclasta_n_mf8 (p0, x1, z0), + x0 = svclasta (p0, x1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clastb_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clastb_mf8.c new file mode 100644 index 00000000000..179c102caef --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/clastb_mf8.c @@ -0,0 +1,52 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** clastb_mf8_tied1: +** clastb z0\.b, p0, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_Z (clastb_mf8_tied1, svmfloat8_t, + z0 = svclastb_mf8 (p0, z0, z1), + z0 = svclastb (p0, z0, z1)) + +/* +** clastb_mf8_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** clastb z0\.b, p0, z0\.b, \1\.b +** ret +*/ +TEST_UNIFORM_Z (clastb_mf8_tied2, svmfloat8_t, + z0 = svclastb_mf8 (p0, z1, z0), + z0 = svclastb (p0, 
z1, z0)) + +/* +** clastb_mf8_untied: +** movprfx z0, z1 +** clastb z0\.b, p0, z0\.b, z2\.b +** ret +*/ +TEST_UNIFORM_Z (clastb_mf8_untied, svmfloat8_t, + z0 = svclastb_mf8 (p0, z1, z2), + z0 = svclastb (p0, z1, z2)) + +/* +** clastb_x0_mf8: +** clastb b0, p0, b0, z2\.b +** ret +*/ +TEST_FOLD_LEFT_X (clastb_x0_mf8, mfloat8_t, svmfloat8_t, + x0 = svclastb_n_mf8 (p0, x0, z0), + x0 = svclastb (p0, x0, z0)) + +/* +** clastb_x1_mf8: +** clastb b1, p0, b1, z2\.b +** dup b0, v1.b\[0\] +** ret +*/ +TEST_FOLD_LEFT_X (clastb_x1_mf8, mfloat8_t, svmfloat8_t, + x0 = svclastb_n_mf8 (p0, x1, z0), + x0 = svclastb (p0, x1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create2_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create2_1.c index 7e7d8901d21..a9369c8bdf8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create2_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create2_1.c @@ -62,6 +62,21 @@ TEST_CREATE (create2_u16, svuint16x2_t, svuint16_t, z0 = svcreate2_u16 (z6, z5), z0 = svcreate2 (z6, z5)) +/* +** create2_mf8: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z1\.d, z5\.d +** mov z0\.d, z4\.d +** ) +** ret +*/ +TEST_CREATE (create2_mf8, svmfloat8x2_t, svmfloat8_t, + z0 = svcreate2_mf8 (z4, z5), + z0 = svcreate2 (z4, z5)) + /* ** create2_bf16: ** ( diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create3_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create3_1.c index 0bea95195b8..da787cb1b3b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create3_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create3_1.c @@ -46,6 +46,17 @@ TEST_CREATE (create3_u16, svuint16x3_t, svuint16_t, z0 = svcreate3_u16 (z6, z5, z4), z0 = svcreate3 (z6, z5, z4)) +/* +** create3_mf8: +** mov [^\n]+ +** mov [^\n]+ +** mov [^\n]+ +** ret +*/ +TEST_CREATE (create3_mf8, svmfloat8x3_t, svmfloat8_t, + z0 = svcreate3_mf8 (z4, z5, z6), + z0 = svcreate3 (z4, z5, z6)) + /* ** create3_bf16: ** mov [^\n]+ diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c index 1d2ff4e871d..a9eaa4e3335 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/create4_1.c @@ -50,6 +50,18 @@ TEST_CREATE (create4_u16, svuint16x4_t, svuint16_t, z0 = svcreate4_u16 (z6, z5, z4, z7), z0 = svcreate4 (z6, z5, z4, z7)) +/* +** create4_mf8: +** mov [^\n]+ +** mov [^\n]+ +** mov [^\n]+ +** mov [^\n]+ +** ret +*/ +TEST_CREATE (create4_mf8, svmfloat8x4_t, svmfloat8_t, + z0 = svcreate4_mf8 (z4, z5, z6, z7), + z0 = svcreate4 (z4, z5, z6, z7)) + /* ** create4_bf16: ** mov [^\n]+ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_mf8.c new file mode 100644 index 00000000000..2036963d567 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_lane_mf8.c @@ -0,0 +1,124 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** dup_lane_w0_mf8_tied1: +** mov (z[0-9]+\.b), w0 +** tbl z0\.b, {z0\.b}, \1 +** ret +*/ +TEST_UNIFORM_ZX (dup_lane_w0_mf8_tied1, svmfloat8_t, uint8_t, + z0 = svdup_lane_mf8 (z0, x0), + z0 = svdup_lane (z0, x0)) + +/* +** dup_lane_w0_mf8_untied: +** mov (z[0-9]+\.b), w0 +** tbl z0\.b, {z1\.b}, \1 +** ret +*/ +TEST_UNIFORM_ZX (dup_lane_w0_mf8_untied, svmfloat8_t, uint8_t, + z0 = svdup_lane_mf8 (z1, x0), + z0 = svdup_lane (z1, x0)) + +/* +** dup_lane_0_mf8_tied1: +** dup z0\.b, z0\.b\[0\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_0_mf8_tied1, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 0), + z0 = svdup_lane (z0, 0)) + +/* +** dup_lane_0_mf8_untied: +** dup z0\.b, z1\.b\[0\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_0_mf8_untied, svmfloat8_t, + z0 = svdup_lane_mf8 (z1, 0), + z0 = svdup_lane (z1, 0)) + +/* +** dup_lane_7_mf8: +** dup z0\.b, z0\.b\[7\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_7_mf8, svmfloat8_t, + z0 = 
svdup_lane_mf8 (z0, 7), + z0 = svdup_lane (z0, 7)) + +/* +** dup_lane_8_mf8: +** dup z0\.b, z0\.b\[8\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_8_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 8), + z0 = svdup_lane (z0, 8)) + +/* +** dup_lane_15_mf8: +** dup z0\.b, z0\.b\[15\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_15_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 15), + z0 = svdup_lane (z0, 15)) + +/* +** dup_lane_16_mf8: +** dup z0\.b, z0\.b\[16\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_16_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 16), + z0 = svdup_lane (z0, 16)) + +/* +** dup_lane_31_mf8: +** dup z0\.b, z0\.b\[31\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_31_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 31), + z0 = svdup_lane (z0, 31)) + +/* +** dup_lane_32_mf8: +** dup z0\.b, z0\.b\[32\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_32_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 32), + z0 = svdup_lane (z0, 32)) + +/* +** dup_lane_63_mf8: +** dup z0\.b, z0\.b\[63\] +** ret +*/ +TEST_UNIFORM_Z (dup_lane_63_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 63), + z0 = svdup_lane (z0, 63)) + +/* +** dup_lane_64_mf8: +** mov (z[0-9]+\.b), #64 +** tbl z0\.b, {z0\.b}, \1 +** ret +*/ +TEST_UNIFORM_Z (dup_lane_64_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 64), + z0 = svdup_lane (z0, 64)) + +/* +** dup_lane_255_mf8: +** mov (z[0-9]+\.b), #-1 +** tbl z0\.b, {z0\.b}, \1 +** ret +*/ +TEST_UNIFORM_Z (dup_lane_255_mf8, svmfloat8_t, + z0 = svdup_lane_mf8 (z0, 255), + z0 = svdup_lane (z0, 255)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_mf8.c new file mode 100644 index 00000000000..fbeac4b46ae --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_mf8.c @@ -0,0 +1,31 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** dup_w0_mf8: +** mov z0\.b, b4 +** ret +*/ +TEST_UNIFORM_ZX (dup_w0_mf8, svmfloat8_t, mfloat8_t, + z0 = svdup_n_mf8 (x0), + z0 = 
svdup_mf8 (x0)) + +/* +** dup_w0_mf8_m: +** movprfx z0, z1 +** mov z0\.b, p0/m, b4 +** ret +*/ +TEST_UNIFORM_ZX (dup_w0_mf8_m, svmfloat8_t, mfloat8_t, + z0 = svdup_n_mf8_m (z1, p0, x0), + z0 = svdup_mf8_m (z1, p0, x0)) + +/* +** dup_w0_mf8_x: +** mov z0\.b, b4 +** ret +*/ +TEST_UNIFORM_ZX (dup_w0_mf8_x, svmfloat8_t, mfloat8_t, + z0 = svdup_n_mf8_x (p0, x0), + z0 = svdup_mf8_x (p0, x0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_neonq_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_neonq_mf8.c new file mode 100644 index 00000000000..55cf2742bac --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dup_neonq_mf8.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** dup_neonq_mf8_z0: +** dup z0.q, z4.q\[0\] +** ret +*/ +TEST_DUP_NEONQ (dup_neonq_mf8_z0, mfloat8x16_t, svmfloat8_t, + z0 = svdup_neonq_mf8 (z4), + z0 = svdup_neonq (z4)) + +/* +** dup_neonq_mf8_z4: +** dup z4.q, z4.q\[0\] +** ret +*/ +TEST_DUP_NEONQ (dup_neonq_mf8_z4, mfloat8x16_t, svmfloat8_t, + z4_res = svdup_neonq_mf8 (z4), + z4_res = svdup_neonq (z4)) + +/* +** dup_neonq_mf8_z5: +** dup z5.q, z4.q\[0\] +** ret +*/ +TEST_DUP_NEONQ (dup_neonq_mf8_z5, mfloat8x16_t, svmfloat8_t, + z5_res = svdup_neonq_mf8 (z4), + z5_res = svdup_neonq (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dupq_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dupq_lane_mf8.c new file mode 100644 index 00000000000..02e95f73c19 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/dupq_lane_mf8.c @@ -0,0 +1,48 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** dupq_lane_0_mf8_tied: +** dup z0\.q, z0\.q\[0\] +** ret +*/ +TEST_UNIFORM_Z (dupq_lane_0_mf8_tied, svmfloat8_t, + z0 = svdupq_lane_mf8 (z0, 0), + z0 = svdupq_lane (z0, 0)) + +/* +** dupq_lane_0_mf8_untied: +** dup z0\.q, z1\.q\[0\] +** ret +*/ +TEST_UNIFORM_Z 
(dupq_lane_0_mf8_untied, svmfloat8_t, + z0 = svdupq_lane_mf8 (z1, 0), + z0 = svdupq_lane (z1, 0)) + +/* +** dupq_lane_1_mf8: +** dup z0\.q, z0\.q\[1\] +** ret +*/ +TEST_UNIFORM_Z (dupq_lane_1_mf8, svmfloat8_t, + z0 = svdupq_lane_mf8 (z0, 1), + z0 = svdupq_lane (z0, 1)) + +/* +** dupq_lane_2_mf8: +** dup z0\.q, z0\.q\[2\] +** ret +*/ +TEST_UNIFORM_Z (dupq_lane_2_mf8, svmfloat8_t, + z0 = svdupq_lane_mf8 (z0, 2), + z0 = svdupq_lane (z0, 2)) + +/* +** dupq_lane_3_mf8: +** dup z0\.q, z0\.q\[3\] +** ret +*/ +TEST_UNIFORM_Z (dupq_lane_3_mf8, svmfloat8_t, + z0 = svdupq_lane_mf8 (z0, 3), + z0 = svdupq_lane (z0, 3)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ext_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ext_mf8.c new file mode 100644 index 00000000000..ceeca3dd367 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ext_mf8.c @@ -0,0 +1,73 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** ext_0_mf8_tied1: +** ext z0\.b, z0\.b, z1\.b, #0 +** ret +*/ +TEST_UNIFORM_Z (ext_0_mf8_tied1, svmfloat8_t, + z0 = svext_mf8 (z0, z1, 0), + z0 = svext (z0, z1, 0)) + +/* +** ext_0_mf8_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** ext z0\.b, z0\.b, \1\.b, #0 +** ret +*/ +TEST_UNIFORM_Z (ext_0_mf8_tied2, svmfloat8_t, + z0 = svext_mf8 (z1, z0, 0), + z0 = svext (z1, z0, 0)) + +/* +** ext_0_mf8_untied: +** movprfx z0, z1 +** ext z0\.b, z0\.b, z2\.b, #0 +** ret +*/ +TEST_UNIFORM_Z (ext_0_mf8_untied, svmfloat8_t, + z0 = svext_mf8 (z1, z2, 0), + z0 = svext (z1, z2, 0)) + +/* +** ext_1_mf8: +** movprfx z0, z1 +** ext z0\.b, z0\.b, z2\.b, #1 +** ret +*/ +TEST_UNIFORM_Z (ext_1_mf8, svmfloat8_t, + z0 = svext_mf8 (z1, z2, 1), + z0 = svext (z1, z2, 1)) + +/* +** ext_2_mf8: +** movprfx z0, z1 +** ext z0\.b, z0\.b, z2\.b, #2 +** ret +*/ +TEST_UNIFORM_Z (ext_2_mf8, svmfloat8_t, + z0 = svext_mf8 (z1, z2, 2), + z0 = svext (z1, z2, 2)) + +/* +** ext_3_mf8: +** movprfx z0, z1 +** ext z0\.b, z0\.b, 
z2\.b, #3 +** ret +*/ +TEST_UNIFORM_Z (ext_3_mf8, svmfloat8_t, + z0 = svext_mf8 (z1, z2, 3), + z0 = svext (z1, z2, 3)) + +/* +** ext_255_mf8: +** movprfx z0, z1 +** ext z0\.b, z0\.b, z2\.b, #255 +** ret +*/ +TEST_UNIFORM_Z (ext_255_mf8, svmfloat8_t, + z0 = svext_mf8 (z1, z2, 255), + z0 = svext (z1, z2, 255)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get2_mf8.c new file mode 100644 index 00000000000..e365f09d17e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get2_mf8.c @@ -0,0 +1,55 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** get2_mf8_z0_0: +** mov z0\.d, z4\.d +** ret +*/ +TEST_GET (get2_mf8_z0_0, svmfloat8x2_t, svmfloat8_t, + z0 = svget2_mf8 (z4, 0), + z0 = svget2 (z4, 0)) + +/* +** get2_mf8_z0_1: +** mov z0\.d, z5\.d +** ret +*/ +TEST_GET (get2_mf8_z0_1, svmfloat8x2_t, svmfloat8_t, + z0 = svget2_mf8 (z4, 1), + z0 = svget2 (z4, 1)) + +/* +** get2_mf8_z4_0: +** ret +*/ +TEST_GET (get2_mf8_z4_0, svmfloat8x2_t, svmfloat8_t, + z4_res = svget2_mf8 (z4, 0), + z4_res = svget2 (z4, 0)) + +/* +** get2_mf8_z4_1: +** mov z4\.d, z5\.d +** ret +*/ +TEST_GET (get2_mf8_z4_1, svmfloat8x2_t, svmfloat8_t, + z4_res = svget2_mf8 (z4, 1), + z4_res = svget2 (z4, 1)) + +/* +** get2_mf8_z5_0: +** mov z5\.d, z4\.d +** ret +*/ +TEST_GET (get2_mf8_z5_0, svmfloat8x2_t, svmfloat8_t, + z5_res = svget2_mf8 (z4, 0), + z5_res = svget2 (z4, 0)) + +/* +** get2_mf8_z5_1: +** ret +*/ +TEST_GET (get2_mf8_z5_1, svmfloat8x2_t, svmfloat8_t, + z5_res = svget2_mf8 (z4, 1), + z5_res = svget2 (z4, 1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get3_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get3_mf8.c new file mode 100644 index 00000000000..6acab814c59 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get3_mf8.c @@ -0,0 +1,108 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + 
+#include "test_sve_acle.h" + +/* +** get3_mf8_z0_0: +** mov z0\.d, z4\.d +** ret +*/ +TEST_GET (get3_mf8_z0_0, svmfloat8x3_t, svmfloat8_t, + z0 = svget3_mf8 (z4, 0), + z0 = svget3 (z4, 0)) + +/* +** get3_mf8_z0_1: +** mov z0\.d, z5\.d +** ret +*/ +TEST_GET (get3_mf8_z0_1, svmfloat8x3_t, svmfloat8_t, + z0 = svget3_mf8 (z4, 1), + z0 = svget3 (z4, 1)) + +/* +** get3_mf8_z0_2: +** mov z0\.d, z6\.d +** ret +*/ +TEST_GET (get3_mf8_z0_2, svmfloat8x3_t, svmfloat8_t, + z0 = svget3_mf8 (z4, 2), + z0 = svget3 (z4, 2)) + +/* +** get3_mf8_z4_0: +** ret +*/ +TEST_GET (get3_mf8_z4_0, svmfloat8x3_t, svmfloat8_t, + z4_res = svget3_mf8 (z4, 0), + z4_res = svget3 (z4, 0)) + +/* +** get3_mf8_z4_1: +** mov z4\.d, z5\.d +** ret +*/ +TEST_GET (get3_mf8_z4_1, svmfloat8x3_t, svmfloat8_t, + z4_res = svget3_mf8 (z4, 1), + z4_res = svget3 (z4, 1)) + +/* +** get3_mf8_z4_2: +** mov z4\.d, z6\.d +** ret +*/ +TEST_GET (get3_mf8_z4_2, svmfloat8x3_t, svmfloat8_t, + z4_res = svget3_mf8 (z4, 2), + z4_res = svget3 (z4, 2)) + +/* +** get3_mf8_z5_0: +** mov z5\.d, z4\.d +** ret +*/ +TEST_GET (get3_mf8_z5_0, svmfloat8x3_t, svmfloat8_t, + z5_res = svget3_mf8 (z4, 0), + z5_res = svget3 (z4, 0)) + +/* +** get3_mf8_z5_1: +** ret +*/ +TEST_GET (get3_mf8_z5_1, svmfloat8x3_t, svmfloat8_t, + z5_res = svget3_mf8 (z4, 1), + z5_res = svget3 (z4, 1)) + +/* +** get3_mf8_z5_2: +** mov z5\.d, z6\.d +** ret +*/ +TEST_GET (get3_mf8_z5_2, svmfloat8x3_t, svmfloat8_t, + z5_res = svget3_mf8 (z4, 2), + z5_res = svget3 (z4, 2)) + +/* +** get3_mf8_z6_0: +** mov z6\.d, z4\.d +** ret +*/ +TEST_GET (get3_mf8_z6_0, svmfloat8x3_t, svmfloat8_t, + z6_res = svget3_mf8 (z4, 0), + z6_res = svget3 (z4, 0)) + +/* +** get3_mf8_z6_1: +** mov z6\.d, z5\.d +** ret +*/ +TEST_GET (get3_mf8_z6_1, svmfloat8x3_t, svmfloat8_t, + z6_res = svget3_mf8 (z4, 1), + z6_res = svget3 (z4, 1)) + +/* +** get3_mf8_z6_2: +** ret +*/ +TEST_GET (get3_mf8_z6_2, svmfloat8x3_t, svmfloat8_t, + z6_res = svget3_mf8 (z4, 2), + z6_res = svget3 (z4, 2)) diff --git 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_mf8.c new file mode 100644 index 00000000000..cdee90adb4a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get4_mf8.c @@ -0,0 +1,179 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** get4_mf8_z0_0: +** mov z0\.d, z4\.d +** ret +*/ +TEST_GET (get4_mf8_z0_0, svmfloat8x4_t, svmfloat8_t, + z0 = svget4_mf8 (z4, 0), + z0 = svget4 (z4, 0)) + +/* +** get4_mf8_z0_1: +** mov z0\.d, z5\.d +** ret +*/ +TEST_GET (get4_mf8_z0_1, svmfloat8x4_t, svmfloat8_t, + z0 = svget4_mf8 (z4, 1), + z0 = svget4 (z4, 1)) + +/* +** get4_mf8_z0_2: +** mov z0\.d, z6\.d +** ret +*/ +TEST_GET (get4_mf8_z0_2, svmfloat8x4_t, svmfloat8_t, + z0 = svget4_mf8 (z4, 2), + z0 = svget4 (z4, 2)) + +/* +** get4_mf8_z0_3: +** mov z0\.d, z7\.d +** ret +*/ +TEST_GET (get4_mf8_z0_3, svmfloat8x4_t, svmfloat8_t, + z0 = svget4_mf8 (z4, 3), + z0 = svget4 (z4, 3)) + +/* +** get4_mf8_z4_0: +** ret +*/ +TEST_GET (get4_mf8_z4_0, svmfloat8x4_t, svmfloat8_t, + z4_res = svget4_mf8 (z4, 0), + z4_res = svget4 (z4, 0)) + +/* +** get4_mf8_z4_1: +** mov z4\.d, z5\.d +** ret +*/ +TEST_GET (get4_mf8_z4_1, svmfloat8x4_t, svmfloat8_t, + z4_res = svget4_mf8 (z4, 1), + z4_res = svget4 (z4, 1)) + +/* +** get4_mf8_z4_2: +** mov z4\.d, z6\.d +** ret +*/ +TEST_GET (get4_mf8_z4_2, svmfloat8x4_t, svmfloat8_t, + z4_res = svget4_mf8 (z4, 2), + z4_res = svget4 (z4, 2)) + +/* +** get4_mf8_z4_3: +** mov z4\.d, z7\.d +** ret +*/ +TEST_GET (get4_mf8_z4_3, svmfloat8x4_t, svmfloat8_t, + z4_res = svget4_mf8 (z4, 3), + z4_res = svget4 (z4, 3)) + +/* +** get4_mf8_z5_0: +** mov z5\.d, z4\.d +** ret +*/ +TEST_GET (get4_mf8_z5_0, svmfloat8x4_t, svmfloat8_t, + z5_res = svget4_mf8 (z4, 0), + z5_res = svget4 (z4, 0)) + +/* +** get4_mf8_z5_1: +** ret +*/ +TEST_GET (get4_mf8_z5_1, svmfloat8x4_t, svmfloat8_t, + z5_res = svget4_mf8 (z4, 1), + z5_res = svget4 (z4, 1)) + +/* +** 
get4_mf8_z5_2: +** mov z5\.d, z6\.d +** ret +*/ +TEST_GET (get4_mf8_z5_2, svmfloat8x4_t, svmfloat8_t, + z5_res = svget4_mf8 (z4, 2), + z5_res = svget4 (z4, 2)) + +/* +** get4_mf8_z5_3: +** mov z5\.d, z7\.d +** ret +*/ +TEST_GET (get4_mf8_z5_3, svmfloat8x4_t, svmfloat8_t, + z5_res = svget4_mf8 (z4, 3), + z5_res = svget4 (z4, 3)) + +/* +** get4_mf8_z6_0: +** mov z6\.d, z4\.d +** ret +*/ +TEST_GET (get4_mf8_z6_0, svmfloat8x4_t, svmfloat8_t, + z6_res = svget4_mf8 (z4, 0), + z6_res = svget4 (z4, 0)) + +/* +** get4_mf8_z6_1: +** mov z6\.d, z5\.d +** ret +*/ +TEST_GET (get4_mf8_z6_1, svmfloat8x4_t, svmfloat8_t, + z6_res = svget4_mf8 (z4, 1), + z6_res = svget4 (z4, 1)) + +/* +** get4_mf8_z6_2: +** ret +*/ +TEST_GET (get4_mf8_z6_2, svmfloat8x4_t, svmfloat8_t, + z6_res = svget4_mf8 (z4, 2), + z6_res = svget4 (z4, 2)) + +/* +** get4_mf8_z6_3: +** mov z6\.d, z7\.d +** ret +*/ +TEST_GET (get4_mf8_z6_3, svmfloat8x4_t, svmfloat8_t, + z6_res = svget4_mf8 (z4, 3), + z6_res = svget4 (z4, 3)) + +/* +** get4_mf8_z7_0: +** mov z7\.d, z4\.d +** ret +*/ +TEST_GET (get4_mf8_z7_0, svmfloat8x4_t, svmfloat8_t, + z7_res = svget4_mf8 (z4, 0), + z7_res = svget4 (z4, 0)) + +/* +** get4_mf8_z7_1: +** mov z7\.d, z5\.d +** ret +*/ +TEST_GET (get4_mf8_z7_1, svmfloat8x4_t, svmfloat8_t, + z7_res = svget4_mf8 (z4, 1), + z7_res = svget4 (z4, 1)) + +/* +** get4_mf8_z7_2: +** mov z7\.d, z6\.d +** ret +*/ +TEST_GET (get4_mf8_z7_2, svmfloat8x4_t, svmfloat8_t, + z7_res = svget4_mf8 (z4, 2), + z7_res = svget4 (z4, 2)) + +/* +** get4_mf8_z7_3: +** ret +*/ +TEST_GET (get4_mf8_z7_3, svmfloat8x4_t, svmfloat8_t, + z7_res = svget4_mf8 (z4, 3), + z7_res = svget4 (z4, 3)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get_neonq_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get_neonq_mf8.c new file mode 100644 index 00000000000..d659a821fa1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/get_neonq_mf8.c @@ -0,0 +1,33 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" 
} } */ + +#include "test_sve_acle.h" + +/* +** get_neonq_mf8_z0: +** mov v0.16b, v4.16b +** ret +*/ +TEST_GET (get_neonq_mf8_z0, svmfloat8_t, mfloat8x16_t, + z0 = svget_neonq_mf8 (z4), + z0 = svget_neonq (z4)) + +/* +** get_neonq_mf8_z4: +** ret +*/ +TEST_GET (get_neonq_mf8_z4, svmfloat8_t, mfloat8x16_t, + z4_res = svget_neonq_mf8 (z4), + z4_res = svget_neonq (z4)) + +/* +** get_neonq_mf8_z5: +** ( +** mov z5.d, z4.d +** | +** mov v5.16b, v4.16b +** ) +** ret +*/ +TEST_GET (get_neonq_mf8_z5, svmfloat8_t, mfloat8x16_t, + z5_res = svget_neonq_mf8 (z4), + z5_res = svget_neonq (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/insr_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/insr_mf8.c new file mode 100644 index 00000000000..69dfc0a97c8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/insr_mf8.c @@ -0,0 +1,22 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** insr_w0_mf8_tied1: +** insr z0\.b, b4 +** ret +*/ +TEST_UNIFORM_ZX (insr_w0_mf8_tied1, svmfloat8_t, mfloat8_t, + z0 = svinsr_n_mf8 (z0, x0), + z0 = svinsr (z0, x0)) + +/* +** insr_w0_mf8_untied: +** movprfx z0, z1 +** insr z0\.b, b4 +** ret +*/ +TEST_UNIFORM_ZX (insr_w0_mf8_untied, svmfloat8_t, mfloat8_t, + z0 = svinsr_n_mf8 (z1, x0), + z0 = svinsr (z1, x0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lasta_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lasta_mf8.c new file mode 100644 index 00000000000..8e8cb51e65e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lasta_mf8.c @@ -0,0 +1,12 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** lasta_x0_mf8: +** lasta b0, p0, z0\.b +** ret +*/ +TEST_REDUCTION_X (lasta_x0_mf8, mfloat8_t, svmfloat8_t, + x0 = svlasta_mf8 (p0, z0), + x0 = svlasta (p0, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lastb_mf8.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lastb_mf8.c
new file mode 100644
index 00000000000..a0d96f83f55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/lastb_mf8.c
@@ -0,0 +1,12 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** lastb_x0_mf8:
+**	lastb	b0, p0, z0\.b
+**	ret
+*/
+TEST_REDUCTION_X (lastb_x0_mf8, mfloat8_t, svmfloat8_t,
+		  x0 = svlastb_mf8 (p0, z0),
+		  x0 = svlastb (p0, z0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_mf8.c
new file mode 100644
index 00000000000..7fdc60faeec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1_mf8.c
@@ -0,0 +1,162 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ld1_mf8_base:
+**	ld1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_base, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0),
+	   z0 = svld1 (p0, x0))
+
+/*
+** ld1_mf8_index:
+**	ld1b	z0\.b, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_index, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0 + x1),
+	   z0 = svld1 (p0, x0 + x1))
+
+/*
+** ld1_mf8_1:
+**	ld1b	z0\.b, p0/z, \[x0, #1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0 + svcntb ()),
+	   z0 = svld1 (p0, x0 + svcntb ()))
+
+/*
+** ld1_mf8_7:
+**	ld1b	z0\.b, p0/z, \[x0, #7, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_7, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0 + svcntb () * 7),
+	   z0 = svld1 (p0, x0 + svcntb () * 7))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld1_mf8_8:
+**	incb	x0, all, mul #8
+**	ld1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_8, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0 + svcntb () * 8),
+	   z0 = svld1 (p0, x0 + svcntb () * 8))
+
+/*
+** ld1_mf8_m1:
+**	ld1b	z0\.b, p0/z, \[x0, #-1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0 - svcntb ()),
+	   z0 = svld1 (p0, x0 - svcntb ()))
+
+/*
+** ld1_mf8_m8:
+**	ld1b	z0\.b, p0/z, \[x0, #-8, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_m8, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0 - svcntb () * 8),
+	   z0 = svld1 (p0, x0 - svcntb () * 8))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld1_mf8_m9:
+**	decb	x0, all, mul #9
+**	ld1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1_mf8_m9, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_mf8 (p0, x0 - svcntb () * 9),
+	   z0 = svld1 (p0, x0 - svcntb () * 9))
+
+/*
+** ld1_vnum_mf8_0:
+**	ld1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_0, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, 0),
+	   z0 = svld1_vnum (p0, x0, 0))
+
+/*
+** ld1_vnum_mf8_1:
+**	ld1b	z0\.b, p0/z, \[x0, #1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, 1),
+	   z0 = svld1_vnum (p0, x0, 1))
+
+/*
+** ld1_vnum_mf8_7:
+**	ld1b	z0\.b, p0/z, \[x0, #7, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_7, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, 7),
+	   z0 = svld1_vnum (p0, x0, 7))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld1_vnum_mf8_8:
+**	incb	x0, all, mul #8
+**	ld1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_8, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, 8),
+	   z0 = svld1_vnum (p0, x0, 8))
+
+/*
+** ld1_vnum_mf8_m1:
+**	ld1b	z0\.b, p0/z, \[x0, #-1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, -1),
+	   z0 = svld1_vnum (p0, x0, -1))
+
+/*
+** ld1_vnum_mf8_m8:
+**	ld1b	z0\.b, p0/z, \[x0, #-8, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_m8, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, -8),
+	   z0 = svld1_vnum (p0, x0, -8))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld1_vnum_mf8_m9:
+**	decb	x0, all, mul #9
+**	ld1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_m9, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, -9),
+	   z0 = svld1_vnum (p0, x0, -9))
+
+/*
+** ld1_vnum_mf8_x1:
+**	cntb	(x[0-9]+)
+** (
+**	madd	(x[0-9]+), (?:x1, \1|\1, x1), x0
+**	ld1b	z0\.b, p0/z, \[\2\]
+** |
+**	mul	(x[0-9]+), (?:x1, \1|\1, x1)
+**	ld1b	z0\.b, p0/z, \[x0, \3\]
+** )
+**	ret
+*/
+TEST_LOAD (ld1_vnum_mf8_x1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1_vnum_mf8 (p0, x0, x1),
+	   z0 = svld1_vnum (p0, x0, x1))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_mf8.c
new file mode 100644
index 00000000000..08b7f68f7d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1ro_mf8.c
@@ -0,0 +1,121 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+/* { dg-additional-options "-march=armv8.6-a+f64mm" } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok } */
+
+#include "test_sve_acle.h"
+
+/*
+** ld1ro_mf8_base:
+**	ld1rob	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_base, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0),
+	   z0 = svld1ro (p0, x0))
+
+/*
+** ld1ro_mf8_index:
+**	ld1rob	z0\.b, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_index, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 + x1),
+	   z0 = svld1ro (p0, x0 + x1))
+
+/*
+** ld1ro_mf8_1:
+**	add	(x[0-9]+), x0, #?1
+**	ld1rob	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 + 1),
+	   z0 = svld1ro (p0, x0 + 1))
+
+/*
+** ld1ro_mf8_16:
+**	add	(x[0-9]+), x0, #?16
+**	ld1rob	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_16, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 + 16),
+	   z0 = svld1ro (p0, x0 + 16))
+
+/*
+** ld1ro_mf8_256:
+**	add	(x[0-9]+), x0, #?256
+**	ld1rob	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_256, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 + 256),
+	   z0 = svld1ro (p0, x0 + 256))
+
+/*
+** ld1ro_mf8_m1:
+**	sub	(x[0-9]+), x0, #?1
+**	ld1rob	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 - 1),
+	   z0 = svld1ro (p0, x0 - 1))
+
+/*
+** ld1ro_mf8_m16:
+**	sub	(x[0-9]+), x0, #?16
+**	ld1rob	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_m16, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 - 16),
+	   z0 = svld1ro (p0, x0 - 16))
+
+/*
+** ld1ro_mf8_m288:
+**	sub	(x[0-9]+), x0, #?288
+**	ld1rob	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_m288, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 - 288),
+	   z0 = svld1ro (p0, x0 - 288))
+
+/*
+** ld1ro_mf8_32:
+**	ld1rob	z0\.b, p0/z, \[x0, #?32\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_32, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 + 32),
+	   z0 = svld1ro (p0, x0 + 32))
+
+/*
+** ld1ro_mf8_224:
+**	ld1rob	z0\.b, p0/z, \[x0, #?224\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_224, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 + 224),
+	   z0 = svld1ro (p0, x0 + 224))
+
+/*
+** ld1ro_mf8_m32:
+**	ld1rob	z0\.b, p0/z, \[x0, #?-32\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_m32, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 - 32),
+	   z0 = svld1ro (p0, x0 - 32))
+
+/*
+** ld1ro_mf8_m256:
+**	ld1rob	z0\.b, p0/z, \[x0, #?-256\]
+**	ret
+*/
+TEST_LOAD (ld1ro_mf8_m256, svmfloat8_t, mfloat8_t,
+	   z0 = svld1ro_mf8 (p0, x0 - 256),
+	   z0 = svld1ro (p0, x0 - 256))
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1rq_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1rq_mf8.c
new file mode 100644
index 00000000000..b3a3f4f09bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld1rq_mf8.c
@@ -0,0 +1,137 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ld1rq_mf8_base:
+**	ld1rqb	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_base, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0),
+	   z0 = svld1rq (p0, x0))
+
+/*
+** ld1rq_mf8_index:
+**	ld1rqb	z0\.b, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_index, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 + x1),
+	   z0 = svld1rq (p0, x0 + x1))
+
+/*
+** ld1rq_mf8_1:
+**	add	(x[0-9]+), x0, #?1
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 + 1),
+	   z0 = svld1rq (p0, x0 + 1))
+
+/*
+** ld1rq_mf8_8:
+**	add	(x[0-9]+), x0, #?8
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_8, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 + 8),
+	   z0 = svld1rq (p0, x0 + 8))
+
+/*
+** ld1rq_mf8_15:
+**	add	(x[0-9]+), x0, #?15
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_15, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 + 15),
+	   z0 = svld1rq (p0, x0 + 15))
+
+/*
+** ld1rq_mf8_16:
+**	ld1rqb	z0\.b, p0/z, \[x0, #?16\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_16, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 + 16),
+	   z0 = svld1rq (p0, x0 + 16))
+
+/*
+** ld1rq_mf8_112:
+**	ld1rqb	z0\.b, p0/z, \[x0, #?112\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_112, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 + 112),
+	   z0 = svld1rq (p0, x0 + 112))
+
+/*
+** ld1rq_mf8_128:
+**	add	(x[0-9]+), x0, #?128
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_128, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 + 128),
+	   z0 = svld1rq (p0, x0 + 128))
+
+/*
+** ld1rq_mf8_m1:
+**	sub	(x[0-9]+), x0, #?1
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 - 1),
+	   z0 = svld1rq (p0, x0 - 1))
+
+/*
+** ld1rq_mf8_m8:
+**	sub	(x[0-9]+), x0, #?8
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_m8, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 - 8),
+	   z0 = svld1rq (p0, x0 - 8))
+
+/*
+** ld1rq_mf8_m15:
+**	sub	(x[0-9]+), x0, #?15
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_m15, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 - 15),
+	   z0 = svld1rq (p0, x0 - 15))
+
+/*
+** ld1rq_mf8_m16:
+**	ld1rqb	z0\.b, p0/z, \[x0, #?-16\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_m16, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 - 16),
+	   z0 = svld1rq (p0, x0 - 16))
+
+/*
+** ld1rq_mf8_m128:
+**	ld1rqb	z0\.b, p0/z, \[x0, #?-128\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_m128, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 - 128),
+	   z0 = svld1rq (p0, x0 - 128))
+
+/*
+** ld1rq_mf8_m144:
+**	sub	(x[0-9]+), x0, #?144
+**	ld1rqb	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld1rq_mf8_m144, svmfloat8_t, mfloat8_t,
+	   z0 = svld1rq_mf8 (p0, x0 - 144),
+	   z0 = svld1rq (p0, x0 - 144))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld2_mf8.c
new file mode 100644
index 00000000000..b533bf8169b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld2_mf8.c
@@ -0,0 +1,204 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ld2_mf8_base:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_base, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0),
+	   z0 = svld2 (p0, x0))
+
+/*
+** ld2_mf8_index:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_index, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 + x1),
+	   z0 = svld2 (p0, x0 + x1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld2_mf8_1:
+**	incb	x0
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_1, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 + svcntb ()),
+	   z0 = svld2 (p0, x0 + svcntb ()))
+
+/*
+** ld2_mf8_2:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #2, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_2, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 + svcntb () * 2),
+	   z0 = svld2 (p0, x0 + svcntb () * 2))
+
+/*
+** ld2_mf8_14:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #14, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_14, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 + svcntb () * 14),
+	   z0 = svld2 (p0, x0 + svcntb () * 14))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld2_mf8_16:
+**	incb	x0, all, mul #16
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_16, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 + svcntb () * 16),
+	   z0 = svld2 (p0, x0 + svcntb () * 16))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld2_mf8_m1:
+**	decb	x0
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_m1, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 - svcntb ()),
+	   z0 = svld2 (p0, x0 - svcntb ()))
+
+/*
+** ld2_mf8_m2:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #-2, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_m2, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 - svcntb () * 2),
+	   z0 = svld2 (p0, x0 - svcntb () * 2))
+
+/*
+** ld2_mf8_m16:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #-16, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_m16, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 - svcntb () * 16),
+	   z0 = svld2 (p0, x0 - svcntb () * 16))
+
+/*
+** ld2_mf8_m18:
+**	addvl	(x[0-9]+), x0, #-18
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld2_mf8_m18, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_mf8 (p0, x0 - svcntb () * 18),
+	   z0 = svld2 (p0, x0 - svcntb () * 18))
+
+/*
+** ld2_vnum_mf8_0:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_0, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, 0),
+	   z0 = svld2_vnum (p0, x0, 0))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld2_vnum_mf8_1:
+**	incb	x0
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_1, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, 1),
+	   z0 = svld2_vnum (p0, x0, 1))
+
+/*
+** ld2_vnum_mf8_2:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #2, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_2, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, 2),
+	   z0 = svld2_vnum (p0, x0, 2))
+
+/*
+** ld2_vnum_mf8_14:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #14, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_14, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, 14),
+	   z0 = svld2_vnum (p0, x0, 14))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld2_vnum_mf8_16:
+**	incb	x0, all, mul #16
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_16, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, 16),
+	   z0 = svld2_vnum (p0, x0, 16))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld2_vnum_mf8_m1:
+**	decb	x0
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_m1, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, -1),
+	   z0 = svld2_vnum (p0, x0, -1))
+
+/*
+** ld2_vnum_mf8_m2:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #-2, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_m2, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, -2),
+	   z0 = svld2_vnum (p0, x0, -2))
+
+/*
+** ld2_vnum_mf8_m16:
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, #-16, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_m16, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, -16),
+	   z0 = svld2_vnum (p0, x0, -16))
+
+/*
+** ld2_vnum_mf8_m18:
+**	addvl	(x[0-9]+), x0, #-18
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_m18, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, -18),
+	   z0 = svld2_vnum (p0, x0, -18))
+
+/*
+** ld2_vnum_mf8_x1:
+**	cntb	(x[0-9]+)
+** (
+**	madd	(x[0-9]+), (?:x1, \1|\1, x1), x0
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[\2\]
+** |
+**	mul	(x[0-9]+), (?:x1, \1|\1, x1)
+**	ld2b	{z0\.b(?: - |, )z1\.b}, p0/z, \[x0, \3\]
+** )
+**	ret
+*/
+TEST_LOAD (ld2_vnum_mf8_x1, svmfloat8x2_t, mfloat8_t,
+	   z0 = svld2_vnum_mf8 (p0, x0, x1),
+	   z0 = svld2_vnum (p0, x0, x1))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld3_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld3_mf8.c
new file mode 100644
index 00000000000..f43d8050e7d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld3_mf8.c
@@ -0,0 +1,246 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ld3_mf8_base:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_base, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0),
+	   z0 = svld3 (p0, x0))
+
+/*
+** ld3_mf8_index:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_index, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 + x1),
+	   z0 = svld3 (p0, x0 + x1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_mf8_1:
+**	incb	x0
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_1, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 + svcntb ()),
+	   z0 = svld3 (p0, x0 + svcntb ()))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_mf8_2:
+**	incb	x0, all, mul #2
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_2, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 + svcntb () * 2),
+	   z0 = svld3 (p0, x0 + svcntb () * 2))
+
+/*
+** ld3_mf8_3:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #3, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_3, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 + svcntb () * 3),
+	   z0 = svld3 (p0, x0 + svcntb () * 3))
+
+/*
+** ld3_mf8_21:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #21, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_21, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 + svcntb () * 21),
+	   z0 = svld3 (p0, x0 + svcntb () * 21))
+
+/*
+** ld3_mf8_24:
+**	addvl	(x[0-9]+), x0, #24
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_24, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 + svcntb () * 24),
+	   z0 = svld3 (p0, x0 + svcntb () * 24))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_mf8_m1:
+**	decb	x0
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_m1, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 - svcntb ()),
+	   z0 = svld3 (p0, x0 - svcntb ()))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_mf8_m2:
+**	decb	x0, all, mul #2
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_m2, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 - svcntb () * 2),
+	   z0 = svld3 (p0, x0 - svcntb () * 2))
+
+/*
+** ld3_mf8_m3:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #-3, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_m3, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 - svcntb () * 3),
+	   z0 = svld3 (p0, x0 - svcntb () * 3))
+
+/*
+** ld3_mf8_m24:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #-24, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_m24, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 - svcntb () * 24),
+	   z0 = svld3 (p0, x0 - svcntb () * 24))
+
+/*
+** ld3_mf8_m27:
+**	addvl	(x[0-9]+), x0, #-27
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld3_mf8_m27, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_mf8 (p0, x0 - svcntb () * 27),
+	   z0 = svld3 (p0, x0 - svcntb () * 27))
+
+/*
+** ld3_vnum_mf8_0:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_0, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, 0),
+	   z0 = svld3_vnum (p0, x0, 0))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_vnum_mf8_1:
+**	incb	x0
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_1, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, 1),
+	   z0 = svld3_vnum (p0, x0, 1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_vnum_mf8_2:
+**	incb	x0, all, mul #2
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_2, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, 2),
+	   z0 = svld3_vnum (p0, x0, 2))
+
+/*
+** ld3_vnum_mf8_3:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #3, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_3, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, 3),
+	   z0 = svld3_vnum (p0, x0, 3))
+
+/*
+** ld3_vnum_mf8_21:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #21, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_21, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, 21),
+	   z0 = svld3_vnum (p0, x0, 21))
+
+/*
+** ld3_vnum_mf8_24:
+**	addvl	(x[0-9]+), x0, #24
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_24, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, 24),
+	   z0 = svld3_vnum (p0, x0, 24))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_vnum_mf8_m1:
+**	decb	x0
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_m1, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, -1),
+	   z0 = svld3_vnum (p0, x0, -1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld3_vnum_mf8_m2:
+**	decb	x0, all, mul #2
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_m2, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, -2),
+	   z0 = svld3_vnum (p0, x0, -2))
+
+/*
+** ld3_vnum_mf8_m3:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #-3, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_m3, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, -3),
+	   z0 = svld3_vnum (p0, x0, -3))
+
+/*
+** ld3_vnum_mf8_m24:
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, #-24, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_m24, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, -24),
+	   z0 = svld3_vnum (p0, x0, -24))
+
+/*
+** ld3_vnum_mf8_m27:
+**	addvl	(x[0-9]+), x0, #-27
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_m27, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, -27),
+	   z0 = svld3_vnum (p0, x0, -27))
+
+/*
+** ld3_vnum_mf8_x1:
+**	cntb	(x[0-9]+)
+** (
+**	madd	(x[0-9]+), (?:x1, \1|\1, x1), x0
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[\2\]
+** |
+**	mul	(x[0-9]+), (?:x1, \1|\1, x1)
+**	ld3b	{z0\.b - z2\.b}, p0/z, \[x0, \3\]
+** )
+**	ret
+*/
+TEST_LOAD (ld3_vnum_mf8_x1, svmfloat8x3_t, mfloat8_t,
+	   z0 = svld3_vnum_mf8 (p0, x0, x1),
+	   z0 = svld3_vnum (p0, x0, x1))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld4_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld4_mf8.c
new file mode 100644
index 00000000000..e4e9dc016c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ld4_mf8.c
@@ -0,0 +1,290 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ld4_mf8_base:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_base, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0),
+	   z0 = svld4 (p0, x0))
+
+/*
+** ld4_mf8_index:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_index, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 + x1),
+	   z0 = svld4 (p0, x0 + x1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_mf8_1:
+**	incb	x0
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_1, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 + svcntb ()),
+	   z0 = svld4 (p0, x0 + svcntb ()))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_mf8_2:
+**	incb	x0, all, mul #2
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_2, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 + svcntb () * 2),
+	   z0 = svld4 (p0, x0 + svcntb () * 2))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_mf8_3:
+**	incb	x0, all, mul #3
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_3, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 + svcntb () * 3),
+	   z0 = svld4 (p0, x0 + svcntb () * 3))
+
+/*
+** ld4_mf8_4:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #4, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_4, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 + svcntb () * 4),
+	   z0 = svld4 (p0, x0 + svcntb () * 4))
+
+/*
+** ld4_mf8_28:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #28, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_28, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 + svcntb () * 28),
+	   z0 = svld4 (p0, x0 + svcntb () * 28))
+
+/*
+** ld4_mf8_32:
+**	[^{]*
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, x[0-9]+\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_32, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 + svcntb () * 32),
+	   z0 = svld4 (p0, x0 + svcntb () * 32))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_mf8_m1:
+**	decb	x0
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_m1, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 - svcntb ()),
+	   z0 = svld4 (p0, x0 - svcntb ()))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_mf8_m2:
+**	decb	x0, all, mul #2
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_m2, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 - svcntb () * 2),
+	   z0 = svld4 (p0, x0 - svcntb () * 2))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_mf8_m3:
+**	decb	x0, all, mul #3
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_m3, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 - svcntb () * 3),
+	   z0 = svld4 (p0, x0 - svcntb () * 3))
+
+/*
+** ld4_mf8_m4:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #-4, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_m4, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 - svcntb () * 4),
+	   z0 = svld4 (p0, x0 - svcntb () * 4))
+
+/*
+** ld4_mf8_m32:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #-32, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_m32, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 - svcntb () * 32),
+	   z0 = svld4 (p0, x0 - svcntb () * 32))
+
+/*
+** ld4_mf8_m36:
+**	[^{]*
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, x[0-9]+\]
+**	ret
+*/
+TEST_LOAD (ld4_mf8_m36, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_mf8 (p0, x0 - svcntb () * 36),
+	   z0 = svld4 (p0, x0 - svcntb () * 36))
+
+/*
+** ld4_vnum_mf8_0:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_0, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, 0),
+	   z0 = svld4_vnum (p0, x0, 0))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_vnum_mf8_1:
+**	incb	x0
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_1, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, 1),
+	   z0 = svld4_vnum (p0, x0, 1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_vnum_mf8_2:
+**	incb	x0, all, mul #2
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_2, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, 2),
+	   z0 = svld4_vnum (p0, x0, 2))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_vnum_mf8_3:
+**	incb	x0, all, mul #3
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_3, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, 3),
+	   z0 = svld4_vnum (p0, x0, 3))
+
+/*
+** ld4_vnum_mf8_4:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #4, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_4, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, 4),
+	   z0 = svld4_vnum (p0, x0, 4))
+
+/*
+** ld4_vnum_mf8_28:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #28, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_28, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, 28),
+	   z0 = svld4_vnum (p0, x0, 28))
+
+/*
+** ld4_vnum_mf8_32:
+**	[^{]*
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, x[0-9]+\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_32, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, 32),
+	   z0 = svld4_vnum (p0, x0, 32))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_vnum_mf8_m1:
+**	decb	x0
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_m1, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, -1),
+	   z0 = svld4_vnum (p0, x0, -1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_vnum_mf8_m2:
+**	decb	x0, all, mul #2
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_m2, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, -2),
+	   z0 = svld4_vnum (p0, x0, -2))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ld4_vnum_mf8_m3:
+**	decb	x0, all, mul #3
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_m3, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, -3),
+	   z0 = svld4_vnum (p0, x0, -3))
+
+/*
+** ld4_vnum_mf8_m4:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #-4, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_m4, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, -4),
+	   z0 = svld4_vnum (p0, x0, -4))
+
+/*
+** ld4_vnum_mf8_m32:
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, #-32, mul vl\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_m32, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, -32),
+	   z0 = svld4_vnum (p0, x0, -32))
+
+/*
+** ld4_vnum_mf8_m36:
+**	[^{]*
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, x[0-9]+\]
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_m36, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, -36),
+	   z0 = svld4_vnum (p0, x0, -36))
+
+/*
+** ld4_vnum_mf8_x1:
+**	cntb	(x[0-9]+)
+** (
+**	madd	(x[0-9]+), (?:x1, \1|\1, x1), x0
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[\2\]
+** |
+**	mul	(x[0-9]+), (?:x1, \1|\1, x1)
+**	ld4b	{z0\.b - z3\.b}, p0/z, \[x0, \3\]
+** )
+**	ret
+*/
+TEST_LOAD (ld4_vnum_mf8_x1, svmfloat8x4_t, mfloat8_t,
+	   z0 = svld4_vnum_mf8 (p0, x0, x1),
+	   z0 = svld4_vnum (p0, x0, x1))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_mf8.c
new file mode 100644
index 00000000000..45cdf4168fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldff1_mf8.c
@@ -0,0 +1,91 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ldff1_mf8_base:
+**	ldff1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldff1_mf8_base, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_mf8 (p0, x0),
+	   z0 = svldff1 (p0, x0))
+
+/*
+** ldff1_mf8_index:
+**	ldff1b	z0\.b, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ldff1_mf8_index, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_mf8 (p0, x0 + x1),
+	   z0 = svldff1 (p0, x0 + x1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ldff1_mf8_1:
+**	incb	x0
+**	ldff1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldff1_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_mf8 (p0, x0 + svcntb ()),
+	   z0 = svldff1 (p0, x0 + svcntb ()))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ldff1_mf8_m1:
+**	decb	x0
+**	ldff1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldff1_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_mf8 (p0, x0 - svcntb ()),
+	   z0 = svldff1 (p0, x0 - svcntb ()))
+
+/*
+** ldff1_vnum_mf8_0:
+**	ldff1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldff1_vnum_mf8_0, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_vnum_mf8 (p0, x0, 0),
+	   z0 = svldff1_vnum (p0, x0, 0))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ldff1_vnum_mf8_1:
+**	incb	x0
+**	ldff1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldff1_vnum_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_vnum_mf8 (p0, x0, 1),
+	   z0 = svldff1_vnum (p0, x0, 1))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ldff1_vnum_mf8_m1:
+**	decb	x0
+**	ldff1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldff1_vnum_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_vnum_mf8 (p0, x0, -1),
+	   z0 = svldff1_vnum (p0, x0, -1))
+
+/*
+** ldff1_vnum_mf8_x1:
+**	cntb	(x[0-9]+)
+** (
+**	madd	(x[0-9]+), (?:x1, \1|\1, x1), x0
+**	ldff1b	z0\.b, p0/z, \[\2\]
+** |
+**	mul	(x[0-9]+), (?:x1, \1|\1, x1)
+**	ldff1b	z0\.b, p0/z, \[x0, \3\]
+** )
+**	ret
+*/
+TEST_LOAD (ldff1_vnum_mf8_x1, svmfloat8_t, mfloat8_t,
+	   z0 = svldff1_vnum_mf8 (p0, x0, x1),
+	   z0 = svldff1_vnum (p0, x0, x1))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_mf8.c
new file mode 100644
index 00000000000..a5054e9047e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnf1_mf8.c
@@ -0,0 +1,155 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ldnf1_mf8_base:
+**	ldnf1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_base, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0),
+	   z0 = svldnf1 (p0, x0))
+
+/*
+** ldnf1_mf8_index:
+**	add	(x[0-9]+), x0, x1
+**	ldnf1b	z0\.b, p0/z, \[\1\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_index, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0 + x1),
+	   z0 = svldnf1 (p0, x0 + x1))
+
+/*
+** ldnf1_mf8_1:
+**	ldnf1b	z0\.b, p0/z, \[x0, #1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0 + svcntb ()),
+	   z0 = svldnf1 (p0, x0 + svcntb ()))
+
+/*
+** ldnf1_mf8_7:
+**	ldnf1b	z0\.b, p0/z, \[x0, #7, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_7, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0 + svcntb () * 7),
+	   z0 = svldnf1 (p0, x0 + svcntb () * 7))
+
+/*
+** ldnf1_mf8_8:
+**	incb	x0, all, mul #8
+**	ldnf1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_8, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0 + svcntb () * 8),
+	   z0 = svldnf1 (p0, x0 + svcntb () * 8))
+
+/*
+** ldnf1_mf8_m1:
+**	ldnf1b	z0\.b, p0/z, \[x0, #-1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0 - svcntb ()),
+	   z0 = svldnf1 (p0, x0 - svcntb ()))
+
+/*
+** ldnf1_mf8_m8:
+**	ldnf1b	z0\.b, p0/z, \[x0, #-8, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_m8, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0 - svcntb () * 8),
+	   z0 = svldnf1 (p0, x0 - svcntb () * 8))
+
+/*
+** ldnf1_mf8_m9:
+**	decb	x0, all, mul #9
+**	ldnf1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnf1_mf8_m9, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_mf8 (p0, x0 - svcntb () * 9),
+	   z0 = svldnf1 (p0, x0 - svcntb () * 9))
+
+/*
+** ldnf1_vnum_mf8_0:
+**	ldnf1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_0, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, 0),
+	   z0 = svldnf1_vnum (p0, x0, 0))
+
+/*
+** ldnf1_vnum_mf8_1:
+**	ldnf1b	z0\.b, p0/z, \[x0, #1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, 1),
+	   z0 = svldnf1_vnum (p0, x0, 1))
+
+/*
+** ldnf1_vnum_mf8_7:
+**	ldnf1b	z0\.b, p0/z, \[x0, #7, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_7, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, 7),
+	   z0 = svldnf1_vnum (p0, x0, 7))
+
+/*
+** ldnf1_vnum_mf8_8:
+**	incb	x0, all, mul #8
+**	ldnf1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_8, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, 8),
+	   z0 = svldnf1_vnum (p0, x0, 8))
+
+/*
+** ldnf1_vnum_mf8_m1:
+**	ldnf1b	z0\.b, p0/z, \[x0, #-1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, -1),
+	   z0 = svldnf1_vnum (p0, x0, -1))
+
+/*
+** ldnf1_vnum_mf8_m8:
+**	ldnf1b	z0\.b, p0/z, \[x0, #-8, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_m8, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, -8),
+	   z0 = svldnf1_vnum (p0, x0, -8))
+
+/*
+** ldnf1_vnum_mf8_m9:
+**	decb	x0, all, mul #9
+**	ldnf1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_m9, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, -9),
+	   z0 = svldnf1_vnum (p0, x0, -9))
+
+/*
+** ldnf1_vnum_mf8_x1:
+**	cntb	(x[0-9]+)
+**	madd	(x[0-9]+), (?:x1, \1|\1, x1), x0
+**	ldnf1b	z0\.b, p0/z, \[\2\]
+**	ret
+*/
+TEST_LOAD (ldnf1_vnum_mf8_x1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnf1_vnum_mf8 (p0, x0, x1),
+	   z0 = svldnf1_vnum (p0, x0, x1))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnt1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnt1_mf8.c
new file mode 100644
index 00000000000..dbfd9ae83d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/ldnt1_mf8.c
@@ -0,0 +1,162 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** ldnt1_mf8_base:
+**	ldnt1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_base, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0),
+	   z0 = svldnt1 (p0, x0))
+
+/*
+** ldnt1_mf8_index:
+**	ldnt1b	z0\.b, p0/z, \[x0, x1\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_index, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0 + x1),
+	   z0 = svldnt1 (p0, x0 + x1))
+
+/*
+** ldnt1_mf8_1:
+**	ldnt1b	z0\.b, p0/z, \[x0, #1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0 + svcntb ()),
+	   z0 = svldnt1 (p0, x0 + svcntb ()))
+
+/*
+** ldnt1_mf8_7:
+**	ldnt1b	z0\.b, p0/z, \[x0, #7, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_7, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0 + svcntb () * 7),
+	   z0 = svldnt1 (p0, x0 + svcntb () * 7))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ldnt1_mf8_8:
+**	incb	x0, all, mul #8
+**	ldnt1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_8, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0 + svcntb () * 8),
+	   z0 = svldnt1 (p0, x0 + svcntb () * 8))
+
+/*
+** ldnt1_mf8_m1:
+**	ldnt1b	z0\.b, p0/z, \[x0, #-1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_m1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0 - svcntb ()),
+	   z0 = svldnt1 (p0, x0 - svcntb ()))
+
+/*
+** ldnt1_mf8_m8:
+**	ldnt1b	z0\.b, p0/z, \[x0, #-8, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_m8, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0 - svcntb () * 8),
+	   z0 = svldnt1 (p0, x0 - svcntb () * 8))
+
+/* Moving the constant into a register would also be OK.  */
+/*
+** ldnt1_mf8_m9:
+**	decb	x0, all, mul #9
+**	ldnt1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnt1_mf8_m9, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_mf8 (p0, x0 - svcntb () * 9),
+	   z0 = svldnt1 (p0, x0 - svcntb () * 9))
+
+/*
+** ldnt1_vnum_mf8_0:
+**	ldnt1b	z0\.b, p0/z, \[x0\]
+**	ret
+*/
+TEST_LOAD (ldnt1_vnum_mf8_0, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_vnum_mf8 (p0, x0, 0),
+	   z0 = svldnt1_vnum (p0, x0, 0))
+
+/*
+** ldnt1_vnum_mf8_1:
+**	ldnt1b	z0\.b, p0/z, \[x0, #1, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnt1_vnum_mf8_1, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_vnum_mf8 (p0, x0, 1),
+	   z0 = svldnt1_vnum (p0, x0, 1))
+
+/*
+** ldnt1_vnum_mf8_7:
+**	ldnt1b	z0\.b, p0/z, \[x0, #7, mul vl\]
+**	ret
+*/
+TEST_LOAD (ldnt1_vnum_mf8_7, svmfloat8_t, mfloat8_t,
+	   z0 = svldnt1_vnum_mf8 (p0, x0, 7),
+	   z0 = svldnt1_vnum (p0, x0, 7))
+
+/* Moving the constant into a register would also be OK.
*/ +/* +** ldnt1_vnum_mf8_8: +** incb x0, all, mul #8 +** ldnt1b z0\.b, p0/z, \[x0\] +** ret +*/ +TEST_LOAD (ldnt1_vnum_mf8_8, svmfloat8_t, mfloat8_t, + z0 = svldnt1_vnum_mf8 (p0, x0, 8), + z0 = svldnt1_vnum (p0, x0, 8)) + +/* +** ldnt1_vnum_mf8_m1: +** ldnt1b z0\.b, p0/z, \[x0, #-1, mul vl\] +** ret +*/ +TEST_LOAD (ldnt1_vnum_mf8_m1, svmfloat8_t, mfloat8_t, + z0 = svldnt1_vnum_mf8 (p0, x0, -1), + z0 = svldnt1_vnum (p0, x0, -1)) + +/* +** ldnt1_vnum_mf8_m8: +** ldnt1b z0\.b, p0/z, \[x0, #-8, mul vl\] +** ret +*/ +TEST_LOAD (ldnt1_vnum_mf8_m8, svmfloat8_t, mfloat8_t, + z0 = svldnt1_vnum_mf8 (p0, x0, -8), + z0 = svldnt1_vnum (p0, x0, -8)) + +/* Moving the constant into a register would also be OK. */ +/* +** ldnt1_vnum_mf8_m9: +** decb x0, all, mul #9 +** ldnt1b z0\.b, p0/z, \[x0\] +** ret +*/ +TEST_LOAD (ldnt1_vnum_mf8_m9, svmfloat8_t, mfloat8_t, + z0 = svldnt1_vnum_mf8 (p0, x0, -9), + z0 = svldnt1_vnum (p0, x0, -9)) + +/* +** ldnt1_vnum_mf8_x1: +** cntb (x[0-9]+) +** ( +** madd (x[0-9]+), (?:x1, \1|\1, x1), x0 +** ldnt1b z0\.b, p0/z, \[\2\] +** | +** mul (x[0-9]+), (?:x1, \1|\1, x1) +** ldnt1b z0\.b, p0/z, \[x0, \3\] +** ) +** ret +*/ +TEST_LOAD (ldnt1_vnum_mf8_x1, svmfloat8_t, mfloat8_t, + z0 = svldnt1_vnum_mf8 (p0, x0, x1), + z0 = svldnt1_vnum (p0, x0, x1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/len_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/len_mf8.c new file mode 100644 index 00000000000..c8730c6886a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/len_mf8.c @@ -0,0 +1,12 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** len_x0_mf8: +** cntb x0 +** ret +*/ +TEST_REDUCTION_X (len_x0_mf8, uint64_t, svmfloat8_t, + x0 = svlen_mf8 (z0), + x0 = svlen (z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c index dd0daf2eff0..e745b2500dc 100644 --- 
a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_bf16_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_bf16_mf8_tied1, svbfloat16_t, svmfloat8_t, + z0_res = svreinterpret_bf16_mf8 (z0), + z0_res = svreinterpret_bf16 (z0)) + +/* +** reinterpret_bf16_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_bf16_mf8_untied, svbfloat16_t, svmfloat8_t, + z0 = svreinterpret_bf16_mf8 (z4), + z0 = svreinterpret_bf16 (z4)) + /* ** reinterpret_bf16_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c index 9b6f8227d2a..75dcf300b75 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_f16_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f16_mf8_tied1, svfloat16_t, svmfloat8_t, + z0_res = svreinterpret_f16_mf8 (z0), + z0_res = svreinterpret_f16 (z0)) + +/* +** reinterpret_f16_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_f16_mf8_untied, svfloat16_t, svmfloat8_t, + z0 = svreinterpret_f16_mf8 (z4), + z0 = svreinterpret_f16 (z4)) + /* ** reinterpret_f16_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c index ce981fce9d8..4bf7860b9dc 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_f32_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f32_mf8_tied1, svfloat32_t, svmfloat8_t, + z0_res = svreinterpret_f32_mf8 (z0), + z0_res = svreinterpret_f32 (z0)) + 
+/* +** reinterpret_f32_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_f32_mf8_untied, svfloat32_t, svmfloat8_t, + z0 = svreinterpret_f32_mf8 (z4), + z0 = svreinterpret_f32 (z4)) + /* ** reinterpret_f32_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c index 4f51824ab7e..f4012fa54aa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_f64_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_f64_mf8_tied1, svfloat64_t, svmfloat8_t, + z0_res = svreinterpret_f64_mf8 (z0), + z0_res = svreinterpret_f64 (z0)) + +/* +** reinterpret_f64_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_f64_mf8_untied, svfloat64_t, svmfloat8_t, + z0 = svreinterpret_f64_mf8 (z4), + z0 = svreinterpret_f64 (z4)) + /* ** reinterpret_f64_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_mf8.c new file mode 100644 index 00000000000..dcd4978dc92 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_mf8.c @@ -0,0 +1,297 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** reinterpret_mf8_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_mf8_tied1, svmfloat8_t, svmfloat8_t, + z0_res = svreinterpret_mf8_mf8 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_mf8_untied, svmfloat8_t, svmfloat8_t, + z0 = svreinterpret_mf8_mf8 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_bf16_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_bf16_tied1, svmfloat8_t, svbfloat16_t, + z0_res = 
svreinterpret_mf8_bf16 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_bf16_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_bf16_untied, svmfloat8_t, svbfloat16_t, + z0 = svreinterpret_mf8_bf16 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_f16_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_f16_tied1, svmfloat8_t, svfloat16_t, + z0_res = svreinterpret_mf8_f16 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_f16_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_f16_untied, svmfloat8_t, svfloat16_t, + z0 = svreinterpret_mf8_f16 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_f32_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_f32_tied1, svmfloat8_t, svfloat32_t, + z0_res = svreinterpret_mf8_f32 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_f32_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_f32_untied, svmfloat8_t, svfloat32_t, + z0 = svreinterpret_mf8_f32 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_f64_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_f64_tied1, svmfloat8_t, svfloat64_t, + z0_res = svreinterpret_mf8_f64 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_f64_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_f64_untied, svmfloat8_t, svfloat64_t, + z0 = svreinterpret_mf8_f64 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_s8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_s8_tied1, svmfloat8_t, svint8_t, + z0_res = svreinterpret_mf8_s8 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_s8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_s8_untied, svmfloat8_t, svint8_t, + z0 = svreinterpret_mf8_s8 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_s16_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_s16_tied1, svmfloat8_t, svint16_t, + z0_res = 
svreinterpret_mf8_s16 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_s16_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_s16_untied, svmfloat8_t, svint16_t, + z0 = svreinterpret_mf8_s16 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_s32_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_s32_tied1, svmfloat8_t, svint32_t, + z0_res = svreinterpret_mf8_s32 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_s32_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_s32_untied, svmfloat8_t, svint32_t, + z0 = svreinterpret_mf8_s32 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_s64_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_s64_tied1, svmfloat8_t, svint64_t, + z0_res = svreinterpret_mf8_s64 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_s64_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_s64_untied, svmfloat8_t, svint64_t, + z0 = svreinterpret_mf8_s64 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_u8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_u8_tied1, svmfloat8_t, svuint8_t, + z0_res = svreinterpret_mf8_u8 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_u8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_u8_untied, svmfloat8_t, svuint8_t, + z0 = svreinterpret_mf8_u8 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_u16_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_u16_tied1, svmfloat8_t, svuint16_t, + z0_res = svreinterpret_mf8_u16 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_u16_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_u16_untied, svmfloat8_t, svuint16_t, + z0 = svreinterpret_mf8_u16 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_u32_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_u32_tied1, svmfloat8_t, svuint32_t, + z0_res = svreinterpret_mf8_u32 (z0), 
+ z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_u32_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_u32_untied, svmfloat8_t, svuint32_t, + z0 = svreinterpret_mf8_u32 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_u64_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_u64_tied1, svmfloat8_t, svuint64_t, + z0_res = svreinterpret_mf8_u64 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_u64_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_mf8_u64_untied, svmfloat8_t, svuint64_t, + z0 = svreinterpret_mf8_u64 (z4), + z0 = svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_bf16_x2_tied1, svmfloat8x2_t, svbfloat16x2_t, + z0_res = svreinterpret_mf8_bf16_x2 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z1\.d, z5\.d +** mov z0\.d, z4\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_mf8_f32_x2_untied, svmfloat8x2_t, svfloat32x2_t, z0, + svreinterpret_mf8_f32_x2 (z4), + svreinterpret_mf8 (z4)) + +/* +** reinterpret_mf8_mf8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_mf8_mf8_x3_untied, svmfloat8x3_t, svmfloat8x3_t, z18, + svreinterpret_mf8_mf8_x3 (z23), + svreinterpret_mf8 (z23)) + +/* +** reinterpret_mf8_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_s64_x3_tied1, svmfloat8x3_t, svint64x3_t, + z0_res = svreinterpret_mf8_s64_x3 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_mf8_u8_x3_untied, svmfloat8x3_t, svuint8x3_t, z18, + svreinterpret_mf8_u8_x3 (z23), + svreinterpret_mf8 (z23)) + +/* +** 
reinterpret_mf8_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_mf8_u32_x4_tied1, svmfloat8x4_t, svuint32x4_t, + z0_res = svreinterpret_mf8_u32_x4 (z0), + z0_res = svreinterpret_mf8 (z0)) + +/* +** reinterpret_mf8_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_mf8_f64_x4_untied, svmfloat8x4_t, svfloat64x4_t, z28, + svreinterpret_mf8_f64_x4 (z4), + svreinterpret_mf8 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c index 7e15f3e9bd3..17558aa718b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_s16_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s16_mf8_tied1, svint16_t, svmfloat8_t, + z0_res = svreinterpret_s16_mf8 (z0), + z0_res = svreinterpret_s16 (z0)) + +/* +** reinterpret_s16_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_s16_mf8_untied, svint16_t, svmfloat8_t, + z0 = svreinterpret_s16_mf8 (z4), + z0 = svreinterpret_s16 (z4)) + /* ** reinterpret_s16_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c index 60da8aef333..c78e90e2776 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_s32_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s32_mf8_tied1, svint32_t, svmfloat8_t, + z0_res = svreinterpret_s32_mf8 (z0), + z0_res = svreinterpret_s32 (z0)) + +/* +** reinterpret_s32_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z 
(reinterpret_s32_mf8_untied, svint32_t, svmfloat8_t, + z0 = svreinterpret_s32_mf8 (z4), + z0 = svreinterpret_s32 (z4)) + /* ** reinterpret_s32_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c index d705c60dfd7..9370c4b5789 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_s64_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s64_mf8_tied1, svint64_t, svmfloat8_t, + z0_res = svreinterpret_s64_mf8 (z0), + z0_res = svreinterpret_s64 (z0)) + +/* +** reinterpret_s64_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_s64_mf8_untied, svint64_t, svmfloat8_t, + z0 = svreinterpret_s64_mf8 (z4), + z0 = svreinterpret_s64 (z4)) + /* ** reinterpret_s64_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c index ab90a54d746..46a5cd17f80 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_s8_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_s8_mf8_tied1, svint8_t, svmfloat8_t, + z0_res = svreinterpret_s8_mf8 (z0), + z0_res = svreinterpret_s8 (z0)) + +/* +** reinterpret_s8_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_s8_mf8_untied, svint8_t, svmfloat8_t, + z0 = svreinterpret_s8_mf8 (z4), + z0 = svreinterpret_s8 (z4)) + /* ** reinterpret_s8_bf16_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c index fcfc0eb9da5..d91b305fab7 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c 
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_u16_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u16_mf8_tied1, svuint16_t, svmfloat8_t, + z0_res = svreinterpret_u16_mf8 (z0), + z0_res = svreinterpret_u16 (z0)) + +/* +** reinterpret_u16_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_u16_mf8_untied, svuint16_t, svmfloat8_t, + z0 = svreinterpret_u16_mf8 (z4), + z0 = svreinterpret_u16 (z4)) + /* ** reinterpret_u16_bf16_tied1: ** ret @@ -229,6 +246,17 @@ TEST_DUAL_XN (reinterpret_u16_f32_x2_untied, svuint16x2_t, svfloat32x2_t, z0, svreinterpret_u16_f32_x2 (z4), svreinterpret_u16 (z4)) +/* +** reinterpret_u16_mf8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u16_mf8_x3_untied, svuint16x3_t, svmfloat8x3_t, z18, + svreinterpret_u16_mf8_x3 (z23), + svreinterpret_u16 (z23)) + /* ** reinterpret_u16_s64_x3_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c index 6d7e05857fe..77f5abc3465 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_u32_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u32_mf8_tied1, svuint32_t, svmfloat8_t, + z0_res = svreinterpret_u32_mf8 (z0), + z0_res = svreinterpret_u32 (z0)) + +/* +** reinterpret_u32_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_u32_mf8_untied, svuint32_t, svmfloat8_t, + z0 = svreinterpret_u32_mf8 (z4), + z0 = svreinterpret_u32 (z4)) + /* ** reinterpret_u32_bf16_tied1: ** ret @@ -229,6 +246,17 @@ TEST_DUAL_XN (reinterpret_u32_f32_x2_untied, svuint32x2_t, svfloat32x2_t, z0, svreinterpret_u32_f32_x2 
(z4), svreinterpret_u32 (z4)) +/* +** reinterpret_u32_mf8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u32_mf8_x3_untied, svuint32x3_t, svmfloat8x3_t, z18, + svreinterpret_u32_mf8_x3 (z23), + svreinterpret_u32 (z23)) + /* ** reinterpret_u32_s64_x3_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c index 55c0baefb6f..90fb1ff8478 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_u64_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u64_mf8_tied1, svuint64_t, svmfloat8_t, + z0_res = svreinterpret_u64_mf8 (z0), + z0_res = svreinterpret_u64 (z0)) + +/* +** reinterpret_u64_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_u64_mf8_untied, svuint64_t, svmfloat8_t, + z0 = svreinterpret_u64_mf8 (z4), + z0 = svreinterpret_u64 (z4)) + /* ** reinterpret_u64_bf16_tied1: ** ret @@ -229,6 +246,17 @@ TEST_DUAL_XN (reinterpret_u64_f32_x2_untied, svuint64x2_t, svfloat32x2_t, z0, svreinterpret_u64_f32_x2 (z4), svreinterpret_u64 (z4)) +/* +** reinterpret_u64_mf8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u64_mf8_x3_untied, svuint64x3_t, svmfloat8x3_t, z18, + svreinterpret_u64_mf8_x3 (z23), + svreinterpret_u64 (z23)) + /* ** reinterpret_u64_s64_x3_tied1: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c index f7302196162..87500e334a1 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c +++ 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c @@ -2,6 +2,23 @@ #include "test_sve_acle.h" +/* +** reinterpret_u8_mf8_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u8_mf8_tied1, svuint8_t, svmfloat8_t, + z0_res = svreinterpret_u8_mf8 (z0), + z0_res = svreinterpret_u8 (z0)) + +/* +** reinterpret_u8_mf8_untied: +** mov z0\.d, z4\.d +** ret +*/ +TEST_DUAL_Z (reinterpret_u8_mf8_untied, svuint8_t, svmfloat8_t, + z0 = svreinterpret_u8_mf8 (z4), + z0 = svreinterpret_u8 (z4)) + /* ** reinterpret_u8_bf16_tied1: ** ret @@ -214,6 +231,17 @@ TEST_DUAL_Z_REV (reinterpret_u8_bf16_x2_tied1, svuint8x2_t, svbfloat16x2_t, z0_res = svreinterpret_u8_bf16_x2 (z0), z0_res = svreinterpret_u8 (z0)) +/* +** reinterpret_u8_mf8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u8_mf8_x3_untied, svuint8x3_t, svmfloat8x3_t, z18, + svreinterpret_u8_mf8_x3 (z23), + svreinterpret_u8 (z23)) + /* ** reinterpret_u8_f32_x2_untied: ** ( diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rev_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rev_mf8.c new file mode 100644 index 00000000000..f0c6532f153 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/rev_mf8.c @@ -0,0 +1,21 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** rev_mf8_tied1: +** rev z0\.b, z0\.b +** ret +*/ +TEST_UNIFORM_Z (rev_mf8_tied1, svmfloat8_t, + z0 = svrev_mf8 (z0), + z0 = svrev (z0)) + +/* +** rev_mf8_untied: +** rev z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_Z (rev_mf8_untied, svmfloat8_t, + z0 = svrev_mf8 (z1), + z0 = svrev (z1)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sel_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sel_mf8.c new file mode 100644 index 00000000000..8a76ce864b7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/sel_mf8.c @@ -0,0 +1,30 @@ 
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** sel_mf8_tied1: +** sel z0\.b, p0, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_Z (sel_mf8_tied1, svmfloat8_t, + z0 = svsel_mf8 (p0, z0, z1), + z0 = svsel (p0, z0, z1)) + +/* +** sel_mf8_tied2: +** sel z0\.b, p0, z1\.b, z0\.b +** ret +*/ +TEST_UNIFORM_Z (sel_mf8_tied2, svmfloat8_t, + z0 = svsel_mf8 (p0, z1, z0), + z0 = svsel (p0, z1, z0)) + +/* +** sel_mf8_untied: +** sel z0\.b, p0, z1\.b, z2\.b +** ret +*/ +TEST_UNIFORM_Z (sel_mf8_untied, svmfloat8_t, + z0 = svsel_mf8 (p0, z1, z2), + z0 = svsel (p0, z1, z2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set2_mf8.c new file mode 100644 index 00000000000..7f190d0b4cd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set2_mf8.c @@ -0,0 +1,41 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** set2_mf8_z24_0: +** mov z25\.d, z5\.d +** mov z24\.d, z0\.d +** ret +*/ +TEST_SET (set2_mf8_z24_0, svmfloat8x2_t, svmfloat8_t, + z24 = svset2_mf8 (z4, 0, z0), + z24 = svset2 (z4, 0, z0)) + +/* +** set2_mf8_z24_1: +** mov z24\.d, z4\.d +** mov z25\.d, z0\.d +** ret +*/ +TEST_SET (set2_mf8_z24_1, svmfloat8x2_t, svmfloat8_t, + z24 = svset2_mf8 (z4, 1, z0), + z24 = svset2 (z4, 1, z0)) + +/* +** set2_mf8_z4_0: +** mov z4\.d, z0\.d +** ret +*/ +TEST_SET (set2_mf8_z4_0, svmfloat8x2_t, svmfloat8_t, + z4 = svset2_mf8 (z4, 0, z0), + z4 = svset2 (z4, 0, z0)) + +/* +** set2_mf8_z4_1: +** mov z5\.d, z0\.d +** ret +*/ +TEST_SET (set2_mf8_z4_1, svmfloat8x2_t, svmfloat8_t, + z4 = svset2_mf8 (z4, 1, z0), + z4 = svset2 (z4, 1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set3_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set3_mf8.c new file mode 100644 index 00000000000..19247616349 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set3_mf8.c @@ -0,0 +1,63 @@ 
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** set3_mf8_z24_0: +** mov z25\.d, z5\.d +** mov z26\.d, z6\.d +** mov z24\.d, z0\.d +** ret +*/ +TEST_SET (set3_mf8_z24_0, svmfloat8x3_t, svmfloat8_t, + z24 = svset3_mf8 (z4, 0, z0), + z24 = svset3 (z4, 0, z0)) + +/* +** set3_mf8_z24_1: +** mov z24\.d, z4\.d +** mov z26\.d, z6\.d +** mov z25\.d, z0\.d +** ret +*/ +TEST_SET (set3_mf8_z24_1, svmfloat8x3_t, svmfloat8_t, + z24 = svset3_mf8 (z4, 1, z0), + z24 = svset3 (z4, 1, z0)) + +/* +** set3_mf8_z24_2: +** mov z24\.d, z4\.d +** mov z25\.d, z5\.d +** mov z26\.d, z0\.d +** ret +*/ +TEST_SET (set3_mf8_z24_2, svmfloat8x3_t, svmfloat8_t, + z24 = svset3_mf8 (z4, 2, z0), + z24 = svset3 (z4, 2, z0)) + +/* +** set3_mf8_z4_0: +** mov z4\.d, z0\.d +** ret +*/ +TEST_SET (set3_mf8_z4_0, svmfloat8x3_t, svmfloat8_t, + z4 = svset3_mf8 (z4, 0, z0), + z4 = svset3 (z4, 0, z0)) + +/* +** set3_mf8_z4_1: +** mov z5\.d, z0\.d +** ret +*/ +TEST_SET (set3_mf8_z4_1, svmfloat8x3_t, svmfloat8_t, + z4 = svset3_mf8 (z4, 1, z0), + z4 = svset3 (z4, 1, z0)) + +/* +** set3_mf8_z4_2: +** mov z6\.d, z0\.d +** ret +*/ +TEST_SET (set3_mf8_z4_2, svmfloat8x3_t, svmfloat8_t, + z4 = svset3_mf8 (z4, 2, z0), + z4 = svset3 (z4, 2, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_mf8.c new file mode 100644 index 00000000000..faf0ceb3dd7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set4_mf8.c @@ -0,0 +1,87 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** set4_mf8_z24_0: +** mov z25\.d, z5\.d +** mov z26\.d, z6\.d +** mov z27\.d, z7\.d +** mov z24\.d, z0\.d +** ret +*/ +TEST_SET (set4_mf8_z24_0, svmfloat8x4_t, svmfloat8_t, + z24 = svset4_mf8 (z4, 0, z0), + z24 = svset4 (z4, 0, z0)) + +/* +** set4_mf8_z24_1: +** mov z24\.d, z4\.d +** mov z26\.d, z6\.d +** mov z27\.d, z7\.d +** mov z25\.d, z0\.d +** 
ret +*/ +TEST_SET (set4_mf8_z24_1, svmfloat8x4_t, svmfloat8_t, + z24 = svset4_mf8 (z4, 1, z0), + z24 = svset4 (z4, 1, z0)) + +/* +** set4_mf8_z24_2: +** mov z24\.d, z4\.d +** mov z25\.d, z5\.d +** mov z27\.d, z7\.d +** mov z26\.d, z0\.d +** ret +*/ +TEST_SET (set4_mf8_z24_2, svmfloat8x4_t, svmfloat8_t, + z24 = svset4_mf8 (z4, 2, z0), + z24 = svset4 (z4, 2, z0)) + +/* +** set4_mf8_z24_3: +** mov z24\.d, z4\.d +** mov z25\.d, z5\.d +** mov z26\.d, z6\.d +** mov z27\.d, z0\.d +** ret +*/ +TEST_SET (set4_mf8_z24_3, svmfloat8x4_t, svmfloat8_t, + z24 = svset4_mf8 (z4, 3, z0), + z24 = svset4 (z4, 3, z0)) + +/* +** set4_mf8_z4_0: +** mov z4\.d, z0\.d +** ret +*/ +TEST_SET (set4_mf8_z4_0, svmfloat8x4_t, svmfloat8_t, + z4 = svset4_mf8 (z4, 0, z0), + z4 = svset4 (z4, 0, z0)) + +/* +** set4_mf8_z4_1: +** mov z5\.d, z0\.d +** ret +*/ +TEST_SET (set4_mf8_z4_1, svmfloat8x4_t, svmfloat8_t, + z4 = svset4_mf8 (z4, 1, z0), + z4 = svset4 (z4, 1, z0)) + +/* +** set4_mf8_z4_2: +** mov z6\.d, z0\.d +** ret +*/ +TEST_SET (set4_mf8_z4_2, svmfloat8x4_t, svmfloat8_t, + z4 = svset4_mf8 (z4, 2, z0), + z4 = svset4 (z4, 2, z0)) + +/* +** set4_mf8_z4_3: +** mov z7\.d, z0\.d +** ret +*/ +TEST_SET (set4_mf8_z4_3, svmfloat8x4_t, svmfloat8_t, + z4 = svset4_mf8 (z4, 3, z0), + z4 = svset4 (z4, 3, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set_neonq_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set_neonq_mf8.c new file mode 100644 index 00000000000..1de9ac240cc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/set_neonq_mf8.c @@ -0,0 +1,23 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** set_neonq_mf8_z24: +** ptrue (p[0-9]+)\.b, vl16 +** sel z24\.b, \1, z0\.b, z4\.b +** ret +*/ +TEST_SET_NEONQ (set_neonq_mf8_z24, svmfloat8_t, mfloat8x16_t, + z24 = svset_neonq_mf8 (z4, z0), + z24 = svset_neonq (z4, z0)) + +/* +** set_neonq_mf8_z4: +** ptrue (p[0-9]+)\.b, vl16 +** sel z4\.b, \1, z0\.b, z4\.b +** ret +*/ 
+TEST_SET_NEONQ (set_neonq_mf8_z4, svmfloat8_t, mfloat8x16_t, + z4_res = svset_neonq_mf8 (z4, z0), + z4_res = svset_neonq (z4, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/splice_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/splice_mf8.c new file mode 100644 index 00000000000..5ddeff1d3dd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/splice_mf8.c @@ -0,0 +1,33 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** splice_mf8_tied1: +** splice z0\.b, p0, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_Z (splice_mf8_tied1, svmfloat8_t, + z0 = svsplice_mf8 (p0, z0, z1), + z0 = svsplice (p0, z0, z1)) + +/* +** splice_mf8_tied2: +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z1 +** splice z0\.b, p0, z0\.b, \1\.b +** ret +*/ +TEST_UNIFORM_Z (splice_mf8_tied2, svmfloat8_t, + z0 = svsplice_mf8 (p0, z1, z0), + z0 = svsplice (p0, z1, z0)) + +/* +** splice_mf8_untied: +** movprfx z0, z1 +** splice z0\.b, p0, z0\.b, z2\.b +** ret +*/ +TEST_UNIFORM_Z (splice_mf8_untied, svmfloat8_t, + z0 = svsplice_mf8 (p0, z1, z2), + z0 = svsplice (p0, z1, z2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_mf8.c new file mode 100644 index 00000000000..d4ca82bd08c --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st1_mf8.c @@ -0,0 +1,162 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ + +#include "test_sve_acle.h" + +/* +** st1_mf8_base: +** st1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (st1_mf8_base, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0, z0), + svst1 (p0, x0, z0)) + +/* +** st1_mf8_index: +** st1b z0\.b, p0, \[x0, x1\] +** ret +*/ +TEST_STORE (st1_mf8_index, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0 + x1, z0), + svst1 (p0, x0 + x1, z0)) + +/* +** st1_mf8_1: +** st1b z0\.b, p0, \[x0, #1, mul vl\] +** ret +*/ +TEST_STORE (st1_mf8_1, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0 + svcntb (), z0), + svst1 (p0, x0 + svcntb (), z0)) + +/* +** st1_mf8_7: +** st1b z0\.b, p0, \[x0, #7, mul vl\] +** ret +*/ +TEST_STORE (st1_mf8_7, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0 + svcntb () * 7, z0), + svst1 (p0, x0 + svcntb () * 7, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st1_mf8_8: +** incb x0, all, mul #8 +** st1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (st1_mf8_8, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0 + svcntb () * 8, z0), + svst1 (p0, x0 + svcntb () * 8, z0)) + +/* +** st1_mf8_m1: +** st1b z0\.b, p0, \[x0, #-1, mul vl\] +** ret +*/ +TEST_STORE (st1_mf8_m1, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0 - svcntb (), z0), + svst1 (p0, x0 - svcntb (), z0)) + +/* +** st1_mf8_m8: +** st1b z0\.b, p0, \[x0, #-8, mul vl\] +** ret +*/ +TEST_STORE (st1_mf8_m8, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0 - svcntb () * 8, z0), + svst1 (p0, x0 - svcntb () * 8, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st1_mf8_m9: +** decb x0, all, mul #9 +** st1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (st1_mf8_m9, svmfloat8_t, mfloat8_t, + svst1_mf8 (p0, x0 - svcntb () * 9, z0), + svst1 (p0, x0 - svcntb () * 9, z0)) + +/* +** st1_vnum_mf8_0: +** st1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (st1_vnum_mf8_0, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, 0, z0), + svst1_vnum (p0, x0, 0, z0)) + +/* +** st1_vnum_mf8_1: +** st1b z0\.b, p0, \[x0, #1, mul vl\] +** ret +*/ +TEST_STORE (st1_vnum_mf8_1, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, 1, z0), + svst1_vnum (p0, x0, 1, z0)) + +/* +** st1_vnum_mf8_7: +** st1b z0\.b, p0, \[x0, #7, mul vl\] +** ret +*/ +TEST_STORE (st1_vnum_mf8_7, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, 7, z0), + svst1_vnum (p0, x0, 7, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st1_vnum_mf8_8: +** incb x0, all, mul #8 +** st1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (st1_vnum_mf8_8, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, 8, z0), + svst1_vnum (p0, x0, 8, z0)) + +/* +** st1_vnum_mf8_m1: +** st1b z0\.b, p0, \[x0, #-1, mul vl\] +** ret +*/ +TEST_STORE (st1_vnum_mf8_m1, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, -1, z0), + svst1_vnum (p0, x0, -1, z0)) + +/* +** st1_vnum_mf8_m8: +** st1b z0\.b, p0, \[x0, #-8, mul vl\] +** ret +*/ +TEST_STORE (st1_vnum_mf8_m8, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, -8, z0), + svst1_vnum (p0, x0, -8, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st1_vnum_mf8_m9: +** decb x0, all, mul #9 +** st1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (st1_vnum_mf8_m9, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, -9, z0), + svst1_vnum (p0, x0, -9, z0)) + +/* +** st1_vnum_mf8_x1: +** cntb (x[0-9]+) +** ( +** madd (x[0-9]+), (?:x1, \1|\1, x1), x0 +** st1b z0\.b, p0, \[\2\] +** | +** mul (x[0-9]+), (?:x1, \1|\1, x1) +** st1b z0\.b, p0, \[x0, \3\] +** ) +** ret +*/ +TEST_STORE (st1_vnum_mf8_x1, svmfloat8_t, mfloat8_t, + svst1_vnum_mf8 (p0, x0, x1, z0), + svst1_vnum (p0, x0, x1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st2_mf8.c new file mode 100644 index 00000000000..7473f110172 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st2_mf8.c @@ -0,0 +1,204 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */ + +#include "test_sve_acle.h" + +/* +** st2_mf8_base: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_mf8_base, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0, z0), + svst2 (p0, x0, z0)) + +/* +** st2_mf8_index: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, x1\] +** ret +*/ +TEST_STORE (st2_mf8_index, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 + x1, z0), + svst2 (p0, x0 + x1, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st2_mf8_1: +** incb x0 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_mf8_1, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 + svcntb (), z0), + svst2 (p0, x0 + svcntb (), z0)) + +/* +** st2_mf8_2: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #2, mul vl\] +** ret +*/ +TEST_STORE (st2_mf8_2, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 + svcntb () * 2, z0), + svst2 (p0, x0 + svcntb () * 2, z0)) + +/* +** st2_mf8_14: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #14, mul vl\] +** ret +*/ +TEST_STORE (st2_mf8_14, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 + svcntb () * 14, z0), + svst2 (p0, x0 + svcntb () * 14, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st2_mf8_16: +** incb x0, all, mul #16 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_mf8_16, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 + svcntb () * 16, z0), + svst2 (p0, x0 + svcntb () * 16, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st2_mf8_m1: +** decb x0 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_mf8_m1, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 - svcntb (), z0), + svst2 (p0, x0 - svcntb (), z0)) + +/* +** st2_mf8_m2: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #-2, mul vl\] +** ret +*/ +TEST_STORE (st2_mf8_m2, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 - svcntb () * 2, z0), + svst2 (p0, x0 - svcntb () * 2, z0)) + +/* +** st2_mf8_m16: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #-16, mul vl\] +** ret +*/ +TEST_STORE (st2_mf8_m16, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 - svcntb () * 16, z0), + svst2 (p0, x0 - svcntb () * 16, z0)) + +/* +** st2_mf8_m18: +** addvl (x[0-9]+), x0, #-18 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[\1\] +** ret +*/ +TEST_STORE (st2_mf8_m18, svmfloat8x2_t, mfloat8_t, + svst2_mf8 (p0, x0 - svcntb () * 18, z0), + svst2 (p0, x0 - svcntb () * 18, z0)) + +/* +** st2_vnum_mf8_0: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_0, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, 0, z0), + svst2_vnum (p0, x0, 0, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st2_vnum_mf8_1: +** incb x0 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_1, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, 1, z0), + svst2_vnum (p0, x0, 1, z0)) + +/* +** st2_vnum_mf8_2: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #2, mul vl\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_2, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, 2, z0), + svst2_vnum (p0, x0, 2, z0)) + +/* +** st2_vnum_mf8_14: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #14, mul vl\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_14, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, 14, z0), + svst2_vnum (p0, x0, 14, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st2_vnum_mf8_16: +** incb x0, all, mul #16 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_16, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, 16, z0), + svst2_vnum (p0, x0, 16, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st2_vnum_mf8_m1: +** decb x0 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_m1, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, -1, z0), + svst2_vnum (p0, x0, -1, z0)) + +/* +** st2_vnum_mf8_m2: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #-2, mul vl\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_m2, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, -2, z0), + svst2_vnum (p0, x0, -2, z0)) + +/* +** st2_vnum_mf8_m16: +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, #-16, mul vl\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_m16, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, -16, z0), + svst2_vnum (p0, x0, -16, z0)) + +/* +** st2_vnum_mf8_m18: +** addvl (x[0-9]+), x0, #-18 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[\1\] +** ret +*/ +TEST_STORE (st2_vnum_mf8_m18, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, -18, z0), + svst2_vnum (p0, x0, -18, z0)) + +/* +** st2_vnum_mf8_x1: +** cntb (x[0-9]+) +** ( +** madd (x[0-9]+), (?:x1, \1|\1, x1), x0 +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[\2\] +** | +** mul (x[0-9]+), (?:x1, \1|\1, x1) +** st2b {z0\.b(?: - |, )z1\.b}, p0, \[x0, \3\] +** ) +** ret +*/ +TEST_STORE (st2_vnum_mf8_x1, svmfloat8x2_t, mfloat8_t, + svst2_vnum_mf8 (p0, x0, x1, z0), + svst2_vnum (p0, x0, x1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st3_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st3_mf8.c new file mode 100644 index 00000000000..e23e7e979de --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st3_mf8.c @@ -0,0 +1,246 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ + +#include "test_sve_acle.h" + +/* +** st3_mf8_base: +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_mf8_base, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0, z0), + svst3 (p0, x0, z0)) + +/* +** st3_mf8_index: +** st3b {z0\.b - z2\.b}, p0, \[x0, x1\] +** ret +*/ +TEST_STORE (st3_mf8_index, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 + x1, z0), + svst3 (p0, x0 + x1, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st3_mf8_1: +** incb x0 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_mf8_1, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 + svcntb (), z0), + svst3 (p0, x0 + svcntb (), z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st3_mf8_2: +** incb x0, all, mul #2 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_mf8_2, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 + svcntb () * 2, z0), + svst3 (p0, x0 + svcntb () * 2, z0)) + +/* +** st3_mf8_3: +** st3b {z0\.b - z2\.b}, p0, \[x0, #3, mul vl\] +** ret +*/ +TEST_STORE (st3_mf8_3, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 + svcntb () * 3, z0), + svst3 (p0, x0 + svcntb () * 3, z0)) + +/* +** st3_mf8_21: +** st3b {z0\.b - z2\.b}, p0, \[x0, #21, mul vl\] +** ret +*/ +TEST_STORE (st3_mf8_21, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 + svcntb () * 21, z0), + svst3 (p0, x0 + svcntb () * 21, z0)) + +/* +** st3_mf8_24: +** addvl (x[0-9]+), x0, #24 +** st3b {z0\.b - z2\.b}, p0, \[\1\] +** ret +*/ +TEST_STORE (st3_mf8_24, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 + svcntb () * 24, z0), + svst3 (p0, x0 + svcntb () * 24, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st3_mf8_m1: +** decb x0 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_mf8_m1, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 - svcntb (), z0), + svst3 (p0, x0 - svcntb (), z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st3_mf8_m2: +** decb x0, all, mul #2 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_mf8_m2, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 - svcntb () * 2, z0), + svst3 (p0, x0 - svcntb () * 2, z0)) + +/* +** st3_mf8_m3: +** st3b {z0\.b - z2\.b}, p0, \[x0, #-3, mul vl\] +** ret +*/ +TEST_STORE (st3_mf8_m3, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 - svcntb () * 3, z0), + svst3 (p0, x0 - svcntb () * 3, z0)) + +/* +** st3_mf8_m24: +** st3b {z0\.b - z2\.b}, p0, \[x0, #-24, mul vl\] +** ret +*/ +TEST_STORE (st3_mf8_m24, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 - svcntb () * 24, z0), + svst3 (p0, x0 - svcntb () * 24, z0)) + +/* +** st3_mf8_m27: +** addvl (x[0-9]+), x0, #-27 +** st3b {z0\.b - z2\.b}, p0, \[\1\] +** ret +*/ +TEST_STORE (st3_mf8_m27, svmfloat8x3_t, mfloat8_t, + svst3_mf8 (p0, x0 - svcntb () * 27, z0), + svst3 (p0, x0 - svcntb () * 27, z0)) + +/* +** st3_vnum_mf8_0: +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_0, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, 0, z0), + svst3_vnum (p0, x0, 0, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st3_vnum_mf8_1: +** incb x0 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_1, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, 1, z0), + svst3_vnum (p0, x0, 1, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st3_vnum_mf8_2: +** incb x0, all, mul #2 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_2, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, 2, z0), + svst3_vnum (p0, x0, 2, z0)) + +/* +** st3_vnum_mf8_3: +** st3b {z0\.b - z2\.b}, p0, \[x0, #3, mul vl\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_3, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, 3, z0), + svst3_vnum (p0, x0, 3, z0)) + +/* +** st3_vnum_mf8_21: +** st3b {z0\.b - z2\.b}, p0, \[x0, #21, mul vl\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_21, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, 21, z0), + svst3_vnum (p0, x0, 21, z0)) + +/* +** st3_vnum_mf8_24: +** addvl (x[0-9]+), x0, #24 +** st3b {z0\.b - z2\.b}, p0, \[\1\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_24, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, 24, z0), + svst3_vnum (p0, x0, 24, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st3_vnum_mf8_m1: +** decb x0 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_m1, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, -1, z0), + svst3_vnum (p0, x0, -1, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st3_vnum_mf8_m2: +** decb x0, all, mul #2 +** st3b {z0\.b - z2\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_m2, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, -2, z0), + svst3_vnum (p0, x0, -2, z0)) + +/* +** st3_vnum_mf8_m3: +** st3b {z0\.b - z2\.b}, p0, \[x0, #-3, mul vl\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_m3, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, -3, z0), + svst3_vnum (p0, x0, -3, z0)) + +/* +** st3_vnum_mf8_m24: +** st3b {z0\.b - z2\.b}, p0, \[x0, #-24, mul vl\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_m24, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, -24, z0), + svst3_vnum (p0, x0, -24, z0)) + +/* +** st3_vnum_mf8_m27: +** addvl (x[0-9]+), x0, #-27 +** st3b {z0\.b - z2\.b}, p0, \[\1\] +** ret +*/ +TEST_STORE (st3_vnum_mf8_m27, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, -27, z0), + svst3_vnum (p0, x0, -27, z0)) + +/* +** st3_vnum_mf8_x1: +** cntb (x[0-9]+) +** ( +** madd (x[0-9]+), (?:x1, \1|\1, x1), x0 +** st3b {z0\.b - z2\.b}, p0, \[\2\] +** | +** mul (x[0-9]+), (?:x1, \1|\1, x1) +** st3b {z0\.b - z2\.b}, p0, \[x0, \3\] +** ) +** ret +*/ +TEST_STORE (st3_vnum_mf8_x1, svmfloat8x3_t, mfloat8_t, + svst3_vnum_mf8 (p0, x0, x1, z0), + svst3_vnum (p0, x0, x1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st4_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st4_mf8.c new file mode 100644 index 00000000000..5c9526a816d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/st4_mf8.c @@ -0,0 +1,290 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ + +#include "test_sve_acle.h" + +/* +** st4_mf8_base: +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_mf8_base, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0, z0), + svst4 (p0, x0, z0)) + +/* +** st4_mf8_index: +** st4b {z0\.b - z3\.b}, p0, \[x0, x1\] +** ret +*/ +TEST_STORE (st4_mf8_index, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 + x1, z0), + svst4 (p0, x0 + x1, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_mf8_1: +** incb x0 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_mf8_1, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 + svcntb (), z0), + svst4 (p0, x0 + svcntb (), z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_mf8_2: +** incb x0, all, mul #2 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_mf8_2, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 + svcntb () * 2, z0), + svst4 (p0, x0 + svcntb () * 2, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_mf8_3: +** incb x0, all, mul #3 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_mf8_3, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 + svcntb () * 3, z0), + svst4 (p0, x0 + svcntb () * 3, z0)) + +/* +** st4_mf8_4: +** st4b {z0\.b - z3\.b}, p0, \[x0, #4, mul vl\] +** ret +*/ +TEST_STORE (st4_mf8_4, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 + svcntb () * 4, z0), + svst4 (p0, x0 + svcntb () * 4, z0)) + +/* +** st4_mf8_28: +** st4b {z0\.b - z3\.b}, p0, \[x0, #28, mul vl\] +** ret +*/ +TEST_STORE (st4_mf8_28, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 + svcntb () * 28, z0), + svst4 (p0, x0 + svcntb () * 28, z0)) + +/* +** st4_mf8_32: +** [^{]* +** st4b {z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\] +** ret +*/ +TEST_STORE (st4_mf8_32, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 + svcntb () * 32, z0), + svst4 (p0, x0 + svcntb () * 32, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st4_mf8_m1: +** decb x0 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_mf8_m1, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 - svcntb (), z0), + svst4 (p0, x0 - svcntb (), z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_mf8_m2: +** decb x0, all, mul #2 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_mf8_m2, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 - svcntb () * 2, z0), + svst4 (p0, x0 - svcntb () * 2, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_mf8_m3: +** decb x0, all, mul #3 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_mf8_m3, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 - svcntb () * 3, z0), + svst4 (p0, x0 - svcntb () * 3, z0)) + +/* +** st4_mf8_m4: +** st4b {z0\.b - z3\.b}, p0, \[x0, #-4, mul vl\] +** ret +*/ +TEST_STORE (st4_mf8_m4, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 - svcntb () * 4, z0), + svst4 (p0, x0 - svcntb () * 4, z0)) + +/* +** st4_mf8_m32: +** st4b {z0\.b - z3\.b}, p0, \[x0, #-32, mul vl\] +** ret +*/ +TEST_STORE (st4_mf8_m32, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 - svcntb () * 32, z0), + svst4 (p0, x0 - svcntb () * 32, z0)) + +/* +** st4_mf8_m36: +** [^{]* +** st4b {z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\] +** ret +*/ +TEST_STORE (st4_mf8_m36, svmfloat8x4_t, mfloat8_t, + svst4_mf8 (p0, x0 - svcntb () * 36, z0), + svst4 (p0, x0 - svcntb () * 36, z0)) + +/* +** st4_vnum_mf8_0: +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_0, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, 0, z0), + svst4_vnum (p0, x0, 0, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_vnum_mf8_1: +** incb x0 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_1, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, 1, z0), + svst4_vnum (p0, x0, 1, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st4_vnum_mf8_2: +** incb x0, all, mul #2 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_2, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, 2, z0), + svst4_vnum (p0, x0, 2, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_vnum_mf8_3: +** incb x0, all, mul #3 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_3, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, 3, z0), + svst4_vnum (p0, x0, 3, z0)) + +/* +** st4_vnum_mf8_4: +** st4b {z0\.b - z3\.b}, p0, \[x0, #4, mul vl\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_4, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, 4, z0), + svst4_vnum (p0, x0, 4, z0)) + +/* +** st4_vnum_mf8_28: +** st4b {z0\.b - z3\.b}, p0, \[x0, #28, mul vl\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_28, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, 28, z0), + svst4_vnum (p0, x0, 28, z0)) + +/* +** st4_vnum_mf8_32: +** [^{]* +** st4b {z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_32, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, 32, z0), + svst4_vnum (p0, x0, 32, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_vnum_mf8_m1: +** decb x0 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_m1, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, -1, z0), + svst4_vnum (p0, x0, -1, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** st4_vnum_mf8_m2: +** decb x0, all, mul #2 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_m2, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, -2, z0), + svst4_vnum (p0, x0, -2, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** st4_vnum_mf8_m3: +** decb x0, all, mul #3 +** st4b {z0\.b - z3\.b}, p0, \[x0\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_m3, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, -3, z0), + svst4_vnum (p0, x0, -3, z0)) + +/* +** st4_vnum_mf8_m4: +** st4b {z0\.b - z3\.b}, p0, \[x0, #-4, mul vl\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_m4, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, -4, z0), + svst4_vnum (p0, x0, -4, z0)) + +/* +** st4_vnum_mf8_m32: +** st4b {z0\.b - z3\.b}, p0, \[x0, #-32, mul vl\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_m32, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, -32, z0), + svst4_vnum (p0, x0, -32, z0)) + +/* +** st4_vnum_mf8_m36: +** [^{]* +** st4b {z0\.b - z3\.b}, p0, \[x[0-9]+, x[0-9]+\] +** ret +*/ +TEST_STORE (st4_vnum_mf8_m36, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, -36, z0), + svst4_vnum (p0, x0, -36, z0)) + +/* +** st4_vnum_mf8_x1: +** cntb (x[0-9]+) +** ( +** madd (x[0-9]+), (?:x1, \1|\1, x1), x0 +** st4b {z0\.b - z3\.b}, p0, \[\2\] +** | +** mul (x[0-9]+), (?:x1, \1|\1, x1) +** st4b {z0\.b - z3\.b}, p0, \[x0, \3\] +** ) +** ret +*/ +TEST_STORE (st4_vnum_mf8_x1, svmfloat8x4_t, mfloat8_t, + svst4_vnum_mf8 (p0, x0, x1, z0), + svst4_vnum (p0, x0, x1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/stnt1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/stnt1_mf8.c new file mode 100644 index 00000000000..5bcb8dfa4fa --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/stnt1_mf8.c @@ -0,0 +1,162 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! 
ilp32 } } } } */ + +#include "test_sve_acle.h" + +/* +** stnt1_mf8_base: +** stnt1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (stnt1_mf8_base, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0, z0), + svstnt1 (p0, x0, z0)) + +/* +** stnt1_mf8_index: +** stnt1b z0\.b, p0, \[x0, x1\] +** ret +*/ +TEST_STORE (stnt1_mf8_index, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0 + x1, z0), + svstnt1 (p0, x0 + x1, z0)) + +/* +** stnt1_mf8_1: +** stnt1b z0\.b, p0, \[x0, #1, mul vl\] +** ret +*/ +TEST_STORE (stnt1_mf8_1, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0 + svcntb (), z0), + svstnt1 (p0, x0 + svcntb (), z0)) + +/* +** stnt1_mf8_7: +** stnt1b z0\.b, p0, \[x0, #7, mul vl\] +** ret +*/ +TEST_STORE (stnt1_mf8_7, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0 + svcntb () * 7, z0), + svstnt1 (p0, x0 + svcntb () * 7, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** stnt1_mf8_8: +** incb x0, all, mul #8 +** stnt1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (stnt1_mf8_8, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0 + svcntb () * 8, z0), + svstnt1 (p0, x0 + svcntb () * 8, z0)) + +/* +** stnt1_mf8_m1: +** stnt1b z0\.b, p0, \[x0, #-1, mul vl\] +** ret +*/ +TEST_STORE (stnt1_mf8_m1, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0 - svcntb (), z0), + svstnt1 (p0, x0 - svcntb (), z0)) + +/* +** stnt1_mf8_m8: +** stnt1b z0\.b, p0, \[x0, #-8, mul vl\] +** ret +*/ +TEST_STORE (stnt1_mf8_m8, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0 - svcntb () * 8, z0), + svstnt1 (p0, x0 - svcntb () * 8, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** stnt1_mf8_m9: +** decb x0, all, mul #9 +** stnt1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (stnt1_mf8_m9, svmfloat8_t, mfloat8_t, + svstnt1_mf8 (p0, x0 - svcntb () * 9, z0), + svstnt1 (p0, x0 - svcntb () * 9, z0)) + +/* +** stnt1_vnum_mf8_0: +** stnt1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_0, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, 0, z0), + svstnt1_vnum (p0, x0, 0, z0)) + +/* +** stnt1_vnum_mf8_1: +** stnt1b z0\.b, p0, \[x0, #1, mul vl\] +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_1, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, 1, z0), + svstnt1_vnum (p0, x0, 1, z0)) + +/* +** stnt1_vnum_mf8_7: +** stnt1b z0\.b, p0, \[x0, #7, mul vl\] +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_7, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, 7, z0), + svstnt1_vnum (p0, x0, 7, z0)) + +/* Moving the constant into a register would also be OK. */ +/* +** stnt1_vnum_mf8_8: +** incb x0, all, mul #8 +** stnt1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_8, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, 8, z0), + svstnt1_vnum (p0, x0, 8, z0)) + +/* +** stnt1_vnum_mf8_m1: +** stnt1b z0\.b, p0, \[x0, #-1, mul vl\] +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_m1, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, -1, z0), + svstnt1_vnum (p0, x0, -1, z0)) + +/* +** stnt1_vnum_mf8_m8: +** stnt1b z0\.b, p0, \[x0, #-8, mul vl\] +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_m8, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, -8, z0), + svstnt1_vnum (p0, x0, -8, z0)) + +/* Moving the constant into a register would also be OK. 
*/ +/* +** stnt1_vnum_mf8_m9: +** decb x0, all, mul #9 +** stnt1b z0\.b, p0, \[x0\] +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_m9, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, -9, z0), + svstnt1_vnum (p0, x0, -9, z0)) + +/* +** stnt1_vnum_mf8_x1: +** cntb (x[0-9]+) +** ( +** madd (x[0-9]+), (?:x1, \1|\1, x1), x0 +** stnt1b z0\.b, p0, \[\2\] +** | +** mul (x[0-9]+), (?:x1, \1|\1, x1) +** stnt1b z0\.b, p0, \[x0, \3\] +** ) +** ret +*/ +TEST_STORE (stnt1_vnum_mf8_x1, svmfloat8_t, mfloat8_t, + svstnt1_vnum_mf8 (p0, x0, x1, z0), + svstnt1_vnum (p0, x0, x1, z0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_mf8.c new file mode 100644 index 00000000000..ec77002ceeb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/tbl_mf8.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** tbl_mf8_tied1: +** tbl z0\.b, {z0\.b}, z4\.b +** ret +*/ +TEST_DUAL_Z (tbl_mf8_tied1, svmfloat8_t, svuint8_t, + z0 = svtbl_mf8 (z0, z4), + z0 = svtbl (z0, z4)) + +/* +** tbl_mf8_tied2: +** tbl z0\.b, {z4\.b}, z0\.b +** ret +*/ +TEST_DUAL_Z_REV (tbl_mf8_tied2, svmfloat8_t, svuint8_t, + z0_res = svtbl_mf8 (z4, z0), + z0_res = svtbl (z4, z0)) + +/* +** tbl_mf8_untied: +** tbl z0\.b, {z1\.b}, z4\.b +** ret +*/ +TEST_DUAL_Z (tbl_mf8_untied, svmfloat8_t, svuint8_t, + z0 = svtbl_mf8 (z1, z4), + z0 = svtbl (z1, z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1_mf8.c new file mode 100644 index 00000000000..2676c79bb74 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1_mf8.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** trn1_mf8_tied1: +** trn1 z0\.b, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_Z (trn1_mf8_tied1, svmfloat8_t, + z0 = svtrn1_mf8 (z0, z1), + z0 = svtrn1 (z0, z1)) 
+ +/* +** trn1_mf8_tied2: +** trn1 z0\.b, z1\.b, z0\.b +** ret +*/ +TEST_UNIFORM_Z (trn1_mf8_tied2, svmfloat8_t, + z0 = svtrn1_mf8 (z1, z0), + z0 = svtrn1 (z1, z0)) + +/* +** trn1_mf8_untied: +** trn1 z0\.b, z1\.b, z2\.b +** ret +*/ +TEST_UNIFORM_Z (trn1_mf8_untied, svmfloat8_t, + z0 = svtrn1_mf8 (z1, z2), + z0 = svtrn1 (z1, z2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1q_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1q_mf8.c new file mode 100644 index 00000000000..3f5cf5a7e54 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn1q_mf8.c @@ -0,0 +1,33 @@ +/* { dg-require-effective-target aarch64_asm_f64mm_ok } */ +/* { dg-additional-options "-march=armv8.2-a+f64mm" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ + +#include "test_sve_acle.h" + +/* +** trn1q_mf8_tied1: +** trn1 z0\.q, z0\.q, z1\.q +** ret +*/ +TEST_UNIFORM_Z (trn1q_mf8_tied1, svmfloat8_t, + z0 = svtrn1q_mf8 (z0, z1), + z0 = svtrn1q (z0, z1)) + +/* +** trn1q_mf8_tied2: +** trn1 z0\.q, z1\.q, z0\.q +** ret +*/ +TEST_UNIFORM_Z (trn1q_mf8_tied2, svmfloat8_t, + z0 = svtrn1q_mf8 (z1, z0), + z0 = svtrn1q (z1, z0)) + +/* +** trn1q_mf8_untied: +** trn1 z0\.q, z1\.q, z2\.q +** ret +*/ +TEST_UNIFORM_Z (trn1q_mf8_untied, svmfloat8_t, + z0 = svtrn1q_mf8 (z1, z2), + z0 = svtrn1q (z1, z2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2_mf8.c new file mode 100644 index 00000000000..8bdb8eda63e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2_mf8.c @@ -0,0 +1,30 @@ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +/* +** trn2_mf8_tied1: +** trn2 z0\.b, z0\.b, z1\.b +** ret +*/ +TEST_UNIFORM_Z (trn2_mf8_tied1, svmfloat8_t, + z0 = svtrn2_mf8 (z0, z1), + z0 = svtrn2 (z0, z1)) + +/* +** trn2_mf8_tied2: +** trn2 z0\.b, z1\.b, z0\.b 
+** ret +*/ +TEST_UNIFORM_Z (trn2_mf8_tied2, svmfloat8_t, + z0 = svtrn2_mf8 (z1, z0), + z0 = svtrn2 (z1, z0)) + +/* +** trn2_mf8_untied: +** trn2 z0\.b, z1\.b, z2\.b +** ret +*/ +TEST_UNIFORM_Z (trn2_mf8_untied, svmfloat8_t, + z0 = svtrn2_mf8 (z1, z2), + z0 = svtrn2 (z1, z2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2q_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2q_mf8.c new file mode 100644 index 00000000000..a1fff6fd8d0 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/trn2q_mf8.c @@ -0,0 +1,33 @@ +/* { dg-require-effective-target aarch64_asm_f64mm_ok } */ +/* { dg-additional-options "-march=armv8.2-a+f64mm" } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ +/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */ + +#include "test_sve_acle.h" + +/* +** trn2q_mf8_tied1: +** trn2 z0\.q, z0\.q, z1\.q +** ret +*/ +TEST_UNIFORM_Z (trn2q_mf8_tied1, svmfloat8_t, + z0 = svtrn2q_mf8 (z0, z1), + z0 = svtrn2q (z0, z1)) + +/* +** trn2q_mf8_tied2: +** trn2 z0\.q, z1\.q, z0\.q +** ret +*/ +TEST_UNIFORM_Z (trn2q_mf8_tied2, svmfloat8_t, + z0 = svtrn2q_mf8 (z1, z0), + z0 = svtrn2q (z1, z0)) + +/* +** trn2q_mf8_untied: +** trn2 z0\.q, z1\.q, z2\.q +** ret +*/ +TEST_UNIFORM_Z (trn2q_mf8_untied, svmfloat8_t, + z0 = svtrn2q_mf8 (z1, z2), + z0 = svtrn2q (z1, z2)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c index 2c520df99a3..176b121eece 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef2_1.c @@ -37,6 +37,13 @@ TEST_UNDEF (uint16, svuint16x2_t, TEST_UNDEF (float16, svfloat16x2_t, z0 = svundef2_f16 ()) +/* +** mfloat8: +** ret +*/ +TEST_UNDEF (mfloat8, svmfloat8x2_t, + z0 = svundef2_mf8 ()) + /* ** bfloat16: ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef3_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef3_1.c 
index 5c18c6317d1..f9a40895fb1 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef3_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef3_1.c
@@ -37,6 +37,13 @@ TEST_UNDEF (uint16, svuint16x3_t,
 TEST_UNDEF (float16, svfloat16x3_t,
 	    z0 = svundef3_f16 ())
 
+/*
+** mfloat8:
+**	ret
+*/
+TEST_UNDEF (mfloat8, svmfloat8x3_t,
+	    z0 = svundef3_mf8 ())
+
 /*
 ** bfloat16:
 **	ret
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c
index 9bda4d66e89..4e36301d28a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef4_1.c
@@ -37,6 +37,13 @@ TEST_UNDEF (uint16, svuint16x4_t,
 TEST_UNDEF (float16, svfloat16x4_t,
 	    z0 = svundef4_f16 ())
 
+/*
+** mfloat8:
+**	ret
+*/
+TEST_UNDEF (mfloat8, svmfloat8x4_t,
+	    z0 = svundef4_mf8 ())
+
 /*
 ** bfloat16:
 **	ret
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef_1.c
index 62873b6e1b3..bfa51b82f14 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/undef_1.c
@@ -37,6 +37,13 @@ TEST_UNDEF (uint16, svuint16_t,
 TEST_UNDEF (float16, svfloat16_t,
 	    z0 = svundef_f16 ())
 
+/*
+** mfloat8:
+**	ret
+*/
+TEST_UNDEF (mfloat8, svmfloat8_t,
+	    z0 = svundef_mf8 ())
+
 /*
 ** bfloat16:
 **	ret
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1_mf8.c
new file mode 100644
index 00000000000..c0bab36fff3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1_mf8.c
@@ -0,0 +1,30 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** uzp1_mf8_tied1:
+**	uzp1	z0\.b, z0\.b, z1\.b
+**	ret
+*/
+TEST_UNIFORM_Z (uzp1_mf8_tied1, svmfloat8_t,
+		z0 = svuzp1_mf8 (z0, z1),
+		z0 = svuzp1 (z0, z1))
+
+/*
+** uzp1_mf8_tied2:
+**	uzp1	z0\.b, z1\.b, z0\.b
+**	ret
+*/
+TEST_UNIFORM_Z (uzp1_mf8_tied2, svmfloat8_t,
+		z0 = svuzp1_mf8 (z1, z0),
+		z0 = svuzp1 (z1, z0))
+
+/*
+** uzp1_mf8_untied:
+**	uzp1	z0\.b, z1\.b, z2\.b
+**	ret
+*/
+TEST_UNIFORM_Z (uzp1_mf8_untied, svmfloat8_t,
+		z0 = svuzp1_mf8 (z1, z2),
+		z0 = svuzp1 (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1q_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1q_mf8.c
new file mode 100644
index 00000000000..5a779dbf9c1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp1q_mf8.c
@@ -0,0 +1,33 @@
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
+/* { dg-require-effective-target aarch64_asm_f64mm_ok } */
+/* { dg-additional-options "-march=armv8.2-a+f64mm" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** uzp1q_mf8_tied1:
+**	uzp1	z0\.q, z0\.q, z1\.q
+**	ret
+*/
+TEST_UNIFORM_Z (uzp1q_mf8_tied1, svmfloat8_t,
+		z0 = svuzp1q_mf8 (z0, z1),
+		z0 = svuzp1q (z0, z1))
+
+/*
+** uzp1q_mf8_tied2:
+**	uzp1	z0\.q, z1\.q, z0\.q
+**	ret
+*/
+TEST_UNIFORM_Z (uzp1q_mf8_tied2, svmfloat8_t,
+		z0 = svuzp1q_mf8 (z1, z0),
+		z0 = svuzp1q (z1, z0))
+
+/*
+** uzp1q_mf8_untied:
+**	uzp1	z0\.q, z1\.q, z2\.q
+**	ret
+*/
+TEST_UNIFORM_Z (uzp1q_mf8_untied, svmfloat8_t,
+		z0 = svuzp1q_mf8 (z1, z2),
+		z0 = svuzp1q (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2_mf8.c
new file mode 100644
index 00000000000..91518b09e47
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2_mf8.c
@@ -0,0 +1,30 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** uzp2_mf8_tied1:
+**	uzp2	z0\.b, z0\.b, z1\.b
+**	ret
+*/
+TEST_UNIFORM_Z (uzp2_mf8_tied1, svmfloat8_t,
+		z0 = svuzp2_mf8 (z0, z1),
+		z0 = svuzp2 (z0, z1))
+
+/*
+** uzp2_mf8_tied2:
+**	uzp2	z0\.b, z1\.b, z0\.b
+**	ret
+*/
+TEST_UNIFORM_Z (uzp2_mf8_tied2, svmfloat8_t,
+		z0 = svuzp2_mf8 (z1, z0),
+		z0 = svuzp2 (z1, z0))
+
+/*
+** uzp2_mf8_untied:
+**	uzp2	z0\.b, z1\.b, z2\.b
+**	ret
+*/
+TEST_UNIFORM_Z (uzp2_mf8_untied, svmfloat8_t,
+		z0 = svuzp2_mf8 (z1, z2),
+		z0 = svuzp2 (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2q_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2q_mf8.c
new file mode 100644
index 00000000000..02a95fcd565
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/uzp2q_mf8.c
@@ -0,0 +1,33 @@
+/* { dg-require-effective-target aarch64_asm_f64mm_ok } */
+/* { dg-additional-options "-march=armv8.2-a+f64mm" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** uzp2q_mf8_tied1:
+**	uzp2	z0\.q, z0\.q, z1\.q
+**	ret
+*/
+TEST_UNIFORM_Z (uzp2q_mf8_tied1, svmfloat8_t,
+		z0 = svuzp2q_mf8 (z0, z1),
+		z0 = svuzp2q (z0, z1))
+
+/*
+** uzp2q_mf8_tied2:
+**	uzp2	z0\.q, z1\.q, z0\.q
+**	ret
+*/
+TEST_UNIFORM_Z (uzp2q_mf8_tied2, svmfloat8_t,
+		z0 = svuzp2q_mf8 (z1, z0),
+		z0 = svuzp2q (z1, z0))
+
+/*
+** uzp2q_mf8_untied:
+**	uzp2	z0\.q, z1\.q, z2\.q
+**	ret
+*/
+TEST_UNIFORM_Z (uzp2q_mf8_untied, svmfloat8_t,
+		z0 = svuzp2q_mf8 (z1, z2),
+		z0 = svuzp2q (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1_mf8.c
new file mode 100644
index 00000000000..97b5e0f3cf3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1_mf8.c
@@ -0,0 +1,30 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** zip1_mf8_tied1:
+**	zip1	z0\.b, z0\.b, z1\.b
+**	ret
+*/
+TEST_UNIFORM_Z (zip1_mf8_tied1, svmfloat8_t,
+		z0 = svzip1_mf8 (z0, z1),
+		z0 = svzip1 (z0, z1))
+
+/*
+** zip1_mf8_tied2:
+**	zip1	z0\.b, z1\.b, z0\.b
+**	ret
+*/
+TEST_UNIFORM_Z (zip1_mf8_tied2, svmfloat8_t,
+		z0 = svzip1_mf8 (z1, z0),
+		z0 = svzip1 (z1, z0))
+
+/*
+** zip1_mf8_untied:
+**	zip1	z0\.b, z1\.b, z2\.b
+**	ret
+*/
+TEST_UNIFORM_Z (zip1_mf8_untied, svmfloat8_t,
+		z0 = svzip1_mf8 (z1, z2),
+		z0 = svzip1 (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1q_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1q_mf8.c
new file mode 100644
index 00000000000..b0b438eebde
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip1q_mf8.c
@@ -0,0 +1,33 @@
+/* { dg-require-effective-target aarch64_asm_f64mm_ok } */
+/* { dg-additional-options "-march=armv8.2-a+f64mm" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** zip1q_mf8_tied1:
+**	zip1	z0\.q, z0\.q, z1\.q
+**	ret
+*/
+TEST_UNIFORM_Z (zip1q_mf8_tied1, svmfloat8_t,
+		z0 = svzip1q_mf8 (z0, z1),
+		z0 = svzip1q (z0, z1))
+
+/*
+** zip1q_mf8_tied2:
+**	zip1	z0\.q, z1\.q, z0\.q
+**	ret
+*/
+TEST_UNIFORM_Z (zip1q_mf8_tied2, svmfloat8_t,
+		z0 = svzip1q_mf8 (z1, z0),
+		z0 = svzip1q (z1, z0))
+
+/*
+** zip1q_mf8_untied:
+**	zip1	z0\.q, z1\.q, z2\.q
+**	ret
+*/
+TEST_UNIFORM_Z (zip1q_mf8_untied, svmfloat8_t,
+		z0 = svzip1q_mf8 (z1, z2),
+		z0 = svzip1q (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2_mf8.c
new file mode 100644
index 00000000000..96244fb3627
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2_mf8.c
@@ -0,0 +1,30 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** zip2_mf8_tied1:
+**	zip2	z0\.b, z0\.b, z1\.b
+**	ret
+*/
+TEST_UNIFORM_Z (zip2_mf8_tied1, svmfloat8_t,
+		z0 = svzip2_mf8 (z0, z1),
+		z0 = svzip2 (z0, z1))
+
+/*
+** zip2_mf8_tied2:
+**	zip2	z0\.b, z1\.b, z0\.b
+**	ret
+*/
+TEST_UNIFORM_Z (zip2_mf8_tied2, svmfloat8_t,
+		z0 = svzip2_mf8 (z1, z0),
+		z0 = svzip2 (z1, z0))
+
+/*
+** zip2_mf8_untied:
+**	zip2	z0\.b, z1\.b, z2\.b
+**	ret
+*/
+TEST_UNIFORM_Z (zip2_mf8_untied, svmfloat8_t,
+		z0 = svzip2_mf8 (z1, z2),
+		z0 = svzip2 (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2q_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2q_mf8.c
new file mode 100644
index 00000000000..05bf1de8488
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/zip2q_mf8.c
@@ -0,0 +1,33 @@
+/* { dg-require-effective-target aarch64_asm_f64mm_ok } */
+/* { dg-additional-options "-march=armv8.2-a+f64mm" } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+/* { dg-skip-if "" { *-*-* } { "-DSTREAMING_COMPATIBLE" } { "" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** zip2q_mf8_tied1:
+**	zip2	z0\.q, z0\.q, z1\.q
+**	ret
+*/
+TEST_UNIFORM_Z (zip2q_mf8_tied1, svmfloat8_t,
+		z0 = svzip2q_mf8 (z0, z1),
+		z0 = svzip2q (z0, z1))
+
+/*
+** zip2q_mf8_tied2:
+**	zip2	z0\.q, z1\.q, z0\.q
+**	ret
+*/
+TEST_UNIFORM_Z (zip2q_mf8_tied2, svmfloat8_t,
+		z0 = svzip2q_mf8 (z1, z0),
+		z0 = svzip2q (z1, z0))
+
+/*
+** zip2q_mf8_untied:
+**	zip2	z0\.q, z1\.q, z2\.q
+**	ret
+*/
+TEST_UNIFORM_Z (zip2q_mf8_untied, svmfloat8_t,
+		z0 = svzip2q_mf8 (z1, z2),
+		z0 = svzip2q (z1, z2))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c
index c3ac692d7ff..a85d068607a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_1.c
@@ -14,6 +14,7 @@ svuint8_t ret_u8 (void) { return svdup_u8 (0); }
 svuint16_t ret_u16 (void) { return svdup_u16 (0); }
 svuint32_t ret_u32 (void) { return svdup_u32 (0); }
 svuint64_t ret_u64 (void) { return svdup_u64 (0); }
+svmfloat8_t ret_mf8 (void) { return svundef_mf8 (); }
 svbfloat16_t ret_bf16 (void) { return svundef_bf16 (); }
 svfloat16_t ret_f16 (void) { return svdup_f16 (0); }
 svfloat32_t ret_f32 (void) { return svdup_f32 (0); }
@@ -27,6 +28,7 @@ svuint8x2_t ret_u8x2 (void) { return svundef2_u8 (); }
 svuint16x2_t ret_u16x2 (void) { return svundef2_u16 (); }
 svuint32x2_t ret_u32x2 (void) { return svundef2_u32 (); }
 svuint64x2_t ret_u64x2 (void) { return svundef2_u64 (); }
+svmfloat8x2_t ret_mf8x2 (void) { return svundef2_mf8 (); }
 svbfloat16x2_t ret_bf16x2 (void) { return svundef2_bf16 (); }
 svfloat16x2_t ret_f16x2 (void) { return svundef2_f16 (); }
 svfloat32x2_t ret_f32x2 (void) { return svundef2_f32 (); }
@@ -40,6 +42,7 @@ svuint8x3_t ret_u8x3 (void) { return svundef3_u8 (); }
 svuint16x3_t ret_u16x3 (void) { return svundef3_u16 (); }
 svuint32x3_t ret_u32x3 (void) { return svundef3_u32 (); }
 svuint64x3_t ret_u64x3 (void) { return svundef3_u64 (); }
+svmfloat8x3_t ret_mf8x3 (void) { return svundef3_mf8 (); }
 svbfloat16x3_t ret_bf16x3 (void) { return svundef3_bf16 (); }
 svfloat16x3_t ret_f16x3 (void) { return svundef3_f16 (); }
 svfloat32x3_t ret_f32x3 (void) { return svundef3_f32 (); }
@@ -53,6 +56,7 @@ svuint8x4_t ret_u8x4 (void) { return svundef4_u8 (); }
 svuint16x4_t ret_u16x4 (void) { return svundef4_u16 (); }
 svuint32x4_t ret_u32x4 (void) { return svundef4_u32 (); }
 svuint64x4_t ret_u64x4 (void) { return svundef4_u64 (); }
+svmfloat8x4_t ret_mf8x4 (void) { return svundef4_mf8 (); }
 svbfloat16x4_t ret_bf16x4 (void) { return svundef4_bf16 (); }
 svfloat16x4_t ret_f16x4 (void) { return svundef4_f16 (); }
 svfloat32x4_t ret_f32x4 (void) { return svundef4_f32 (); }
@@ -70,6 +74,7 @@ svfloat64x4_t ret_f64x4 (void) { return svundef4_f64 (); }
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_mf8\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_bf16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32\n} } } */
@@ -83,6 +88,7 @@ svfloat64x4_t ret_f64x4 (void) { return svundef4_f64 (); }
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_mf8x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_bf16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32x2\n} } } */
@@ -97,6 +103,7 @@ svfloat64x4_t ret_f64x4 (void) { return svundef4_f64 (); }
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_mf8x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_bf16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32x3\n} } } */
@@ -110,6 +117,7 @@ svfloat64x4_t ret_f64x4 (void) { return svundef4_f64 (); }
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u32x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_u64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tret_mf8x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_bf16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tret_f32x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c
index c3508735fc4..eb9e28044da 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_2.c
@@ -14,6 +14,7 @@ void fn_u8 (svuint8_t x) {}
 void fn_u16 (svuint16_t x) {}
 void fn_u32 (svuint32_t x) {}
 void fn_u64 (svuint64_t x) {}
+void fn_mf8 (svmfloat8_t x) {}
 void fn_bf16 (svbfloat16_t x) {}
 void fn_f16 (svfloat16_t x) {}
 void fn_f32 (svfloat32_t x) {}
@@ -27,6 +28,7 @@ void fn_u8x2 (svuint8x2_t x) {}
 void fn_u16x2 (svuint16x2_t x) {}
 void fn_u32x2 (svuint32x2_t x) {}
 void fn_u64x2 (svuint64x2_t x) {}
+void fn_mf8x2 (svmfloat8x2_t x) {}
 void fn_bf16x2 (svbfloat16x2_t x) {}
 void fn_f16x2 (svfloat16x2_t x) {}
 void fn_f32x2 (svfloat32x2_t x) {}
@@ -40,6 +42,7 @@ void fn_u8x3 (svuint8x3_t x) {}
 void fn_u16x3 (svuint16x3_t x) {}
 void fn_u32x3 (svuint32x3_t x) {}
 void fn_u64x3 (svuint64x3_t x) {}
+void fn_mf8x3 (svmfloat8x3_t x) {}
 void fn_bf16x3 (svbfloat16x3_t x) {}
 void fn_f16x3 (svfloat16x3_t x) {}
 void fn_f32x3 (svfloat32x3_t x) {}
@@ -53,6 +56,7 @@ void fn_u8x4 (svuint8x4_t x) {}
 void fn_u16x4 (svuint16x4_t x) {}
 void fn_u32x4 (svuint32x4_t x) {}
 void fn_u64x4 (svuint64x4_t x) {}
+void fn_mf8x4 (svmfloat8x4_t x) {}
 void fn_bf16x4 (svbfloat16x4_t x) {}
 void fn_f16x4 (svfloat16x4_t x) {}
 void fn_f32x4 (svfloat32x4_t x) {}
@@ -70,6 +74,7 @@ void fn_f64x4 (svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
@@ -83,6 +88,7 @@ void fn_f64x4 (svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
@@ -96,6 +102,7 @@ void fn_f64x4 (svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x3\n} } } */
@@ -109,6 +116,7 @@ void fn_f64x4 (svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c
index 42e7860ff7e..a6e9fc64960 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_3.c
@@ -10,6 +10,7 @@ void fn_u8 (float d0, float d1, float d2, float d3, svuint8_t x) {}
 void fn_u16 (float d0, float d1, float d2, float d3, svuint16_t x) {}
 void fn_u32 (float d0, float d1, float d2, float d3, svuint32_t x) {}
 void fn_u64 (float d0, float d1, float d2, float d3, svuint64_t x) {}
+void fn_mf8 (float d0, float d1, float d2, float d3, svmfloat8_t x) {}
 void fn_bf16 (float d0, float d1, float d2, float d3, svbfloat16_t x) {}
 void fn_f16 (float d0, float d1, float d2, float d3, svfloat16_t x) {}
 void fn_f32 (float d0, float d1, float d2, float d3, svfloat32_t x) {}
@@ -23,6 +24,7 @@ void fn_u8x2 (float d0, float d1, float d2, float d3, svuint8x2_t x) {}
 void fn_u16x2 (float d0, float d1, float d2, float d3, svuint16x2_t x) {}
 void fn_u32x2 (float d0, float d1, float d2, float d3, svuint32x2_t x) {}
 void fn_u64x2 (float d0, float d1, float d2, float d3, svuint64x2_t x) {}
+void fn_mf8x2 (float d0, float d1, float d2, float d3, svmfloat8x2_t x) {}
 void fn_bf16x2 (float d0, float d1, float d2, float d3, svbfloat16x2_t x) {}
 void fn_f16x2 (float d0, float d1, float d2, float d3, svfloat16x2_t x) {}
 void fn_f32x2 (float d0, float d1, float d2, float d3, svfloat32x2_t x) {}
@@ -36,6 +38,7 @@ void fn_u8x3 (float d0, float d1, float d2, float d3, svuint8x3_t x) {}
 void fn_u16x3 (float d0, float d1, float d2, float d3, svuint16x3_t x) {}
 void fn_u32x3 (float d0, float d1, float d2, float d3, svuint32x3_t x) {}
 void fn_u64x3 (float d0, float d1, float d2, float d3, svuint64x3_t x) {}
+void fn_mf8x3 (float d0, float d1, float d2, float d3, svmfloat8x3_t x) {}
 void fn_bf16x3 (float d0, float d1, float d2, float d3, svbfloat16x3_t x) {}
 void fn_f16x3 (float d0, float d1, float d2, float d3, svfloat16x3_t x) {}
 void fn_f32x3 (float d0, float d1, float d2, float d3, svfloat32x3_t x) {}
@@ -49,6 +52,7 @@ void fn_u8x4 (float d0, float d1, float d2, float d3, svuint8x4_t x) {}
 void fn_u16x4 (float d0, float d1, float d2, float d3, svuint16x4_t x) {}
 void fn_u32x4 (float d0, float d1, float d2, float d3, svuint32x4_t x) {}
 void fn_u64x4 (float d0, float d1, float d2, float d3, svuint64x4_t x) {}
+void fn_mf8x4 (float d0, float d1, float d2, float d3, svmfloat8x4_t x) {}
 void fn_bf16x4 (float d0, float d1, float d2, float d3, svbfloat16x4_t x) {}
 void fn_f16x4 (float d0, float d1, float d2, float d3, svfloat16x4_t x) {}
 void fn_f32x4 (float d0, float d1, float d2, float d3, svfloat32x4_t x) {}
@@ -62,6 +66,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3, svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
@@ -75,6 +80,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3, svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
@@ -88,6 +94,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3, svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x3\n} } } */
@@ -101,6 +108,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3, svfloat64x4_t x) {}
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x4\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c
index 7e4438ed49a..240c1c9faae 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_4.c
@@ -18,6 +18,8 @@ void fn_u32 (float d0, float d1, float d2, float d3,
 	     float d4, svuint32_t x) {}
 void fn_u64 (float d0, float d1, float d2, float d3,
 	     float d4, svuint64_t x) {}
+void fn_mf8 (float d0, float d1, float d2, float d3,
+	     float d4, svmfloat8_t x) {}
 void fn_bf16 (float d0, float d1, float d2, float d3,
 	      float d4, svbfloat16_t x) {}
 void fn_f16 (float d0, float d1, float d2, float d3,
@@ -43,6 +45,8 @@ void fn_u32x2 (float d0, float d1, float d2, float d3,
 	       float d4, svuint32x2_t x) {}
 void fn_u64x2 (float d0, float d1, float d2, float d3,
 	       float d4, svuint64x2_t x) {}
+void fn_mf8x2 (float d0, float d1, float d2, float d3,
+	       float d4, svmfloat8x2_t x) {}
 void fn_bf16x2 (float d0, float d1, float d2, float d3,
 		float d4, svbfloat16x2_t x) {}
 void fn_f16x2 (float d0, float d1, float d2, float d3,
@@ -68,6 +72,8 @@ void fn_u32x3 (float d0, float d1, float d2, float d3,
 	       float d4, svuint32x3_t x) {}
 void fn_u64x3 (float d0, float d1, float d2, float d3,
 	       float d4, svuint64x3_t x) {}
+void fn_mf8x3 (float d0, float d1, float d2, float d3,
+	       float d4, svmfloat8x3_t x) {}
 void fn_bf16x3 (float d0, float d1, float d2, float d3,
 		float d4, svbfloat16x3_t x) {}
 void fn_f16x3 (float d0, float d1, float d2, float d3,
@@ -93,6 +99,8 @@ void fn_u32x4 (float d0, float d1, float d2, float d3,
 	       float d4, svuint32x4_t x) {}
 void fn_u64x4 (float d0, float d1, float d2, float d3,
 	       float d4, svuint64x4_t x) {}
+void fn_mf8x4 (float d0, float d1, float d2, float d3,
+	       float d4, svmfloat8x4_t x) {}
 void fn_bf16x4 (float d0, float d1, float d2, float d3,
 		float d4, svbfloat16x4_t x) {}
 void fn_f16x4 (float d0, float d1, float d2, float d3,
@@ -110,6 +118,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
@@ -123,6 +132,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
@@ -136,6 +146,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x3\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x3\n} } } */
@@ -149,6 +160,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_mf8x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_bf16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c
index 6dadc0492cd..0ff1df3b31d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_5.c
@@ -18,6 +18,8 @@ void fn_u32 (float d0, float d1, float d2, float d3,
 	     float d4, float d5, svuint32_t x) {}
 void fn_u64 (float d0, float d1, float d2, float d3,
 	     float d4, float d5, svuint64_t x) {}
+void fn_mf8 (float d0, float d1, float d2, float d3,
+	     float d4, float d5, svmfloat8_t x) {}
 void fn_bf16 (float d0, float d1, float d2, float d3,
 	      float d4, float d5, svbfloat16_t x) {}
 void fn_f16 (float d0, float d1, float d2, float d3,
@@ -43,6 +45,8 @@ void fn_u32x2 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, svuint32x2_t x) {}
 void fn_u64x2 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, svuint64x2_t x) {}
+void fn_mf8x2 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, svmfloat8x2_t x) {}
 void fn_bf16x2 (float d0, float d1, float d2, float d3,
 		float d4, float d5, svbfloat16x2_t x) {}
 void fn_f16x2 (float d0, float d1, float d2, float d3,
@@ -68,6 +72,8 @@ void fn_u32x3 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, svuint32x3_t x) {}
 void fn_u64x3 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, svuint64x3_t x) {}
+void fn_mf8x3 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, svmfloat8x3_t x) {}
 void fn_bf16x3 (float d0, float d1, float d2, float d3,
 		float d4, float d5, svbfloat16x3_t x) {}
 void fn_f16x3 (float d0, float d1, float d2, float d3,
@@ -93,6 +99,8 @@ void fn_u32x4 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, svuint32x4_t x) {}
 void fn_u64x4 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, svuint64x4_t x) {}
+void fn_mf8x4 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, svmfloat8x4_t x) {}
 void fn_bf16x4 (float d0, float d1, float d2, float d3,
 		float d4, float d5, svbfloat16x4_t x) {}
 void fn_f16x4 (float d0, float d1, float d2, float d3,
@@ -110,6 +118,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
@@ -123,6 +132,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16x2\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32x2\n} } } */
@@ -136,6 +146,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_mf8x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_bf16x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x3\n} } } */
@@ -149,6 +160,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_mf8x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_bf16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c
index 0ff73e2598e..019e76476c4 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_6.c
@@ -18,6 +18,8 @@ void fn_u32 (float d0, float d1, float d2, float d3,
 	     float d4, float d5, float d6, svuint32_t x) {}
 void fn_u64 (float d0, float d1, float d2, float d3,
 	     float d4, float d5, float d6, svuint64_t x) {}
+void fn_mf8 (float d0, float d1, float d2, float d3,
+	     float d4, float d5, float d6, svmfloat8_t x) {}
 void fn_bf16 (float d0, float d1, float d2, float d3,
 	      float d4, float d5, float d6, svbfloat16_t x) {}
 void fn_f16 (float d0, float d1, float d2, float d3,
@@ -43,6 +45,8 @@ void fn_u32x2 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, svuint32x2_t x) {}
 void fn_u64x2 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, svuint64x2_t x) {}
+void fn_mf8x2 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, float d6, svmfloat8x2_t x) {}
 void fn_bf16x2 (float d0, float d1, float d2, float d3,
 		float d4, float d5, float d6, svbfloat16x2_t x) {}
 void fn_f16x2 (float d0, float d1, float d2, float d3,
@@ -68,6 +72,8 @@ void fn_u32x3 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, svuint32x3_t x) {}
 void fn_u64x3 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, svuint64x3_t x) {}
+void fn_mf8x3 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, float d6, svmfloat8x3_t x) {}
 void fn_bf16x3 (float d0, float d1, float d2, float d3,
 		float d4, float d5, float d6, svbfloat16x3_t x) {}
 void fn_f16x3 (float d0, float d1, float d2, float d3,
@@ -93,6 +99,8 @@ void fn_u32x4 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, svuint32x4_t x) {}
 void fn_u64x4 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, svuint64x4_t x) {}
+void fn_mf8x4 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, float d6, svmfloat8x4_t x) {}
 void fn_bf16x4 (float d0, float d1, float d2, float d3,
 		float d4, float d5, float d6, svbfloat16x4_t x) {}
 void fn_f16x4 (float d0, float d1, float d2, float d3,
@@ -110,6 +118,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u32\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_u64\n} } } */
+/* { dg-final { scan-assembler {\t\.variant_pcs\tfn_mf8\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_bf16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f16\n} } } */
 /* { dg-final { scan-assembler {\t\.variant_pcs\tfn_f32\n} } } */
@@ -123,6 +132,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x2\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x2\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x2\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_mf8x2\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_bf16x2\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x2\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x2\n} } } */
@@ -136,6 +146,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x3\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_mf8x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_bf16x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x3\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x3\n} } } */
@@ -149,6 +160,7 @@ void fn_f64x4 (float d0, float d1, float d2, float d3,
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u32x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_u64x4\n} } } */
+/* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_mf8x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_bf16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f16x4\n} } } */
 /* { dg-final { scan-assembler-not {\t\.variant_pcs\tfn_f32x4\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c
index 4f3ff810778..bb1cee0db56 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/annotate_7.c
@@ -18,6 +18,8 @@ void fn_u32 (float d0, float d1, float d2, float d3,
 	     float d4, float d5, float d6, float d7, svuint32_t x) {}
 void fn_u64 (float d0, float d1, float d2, float d3,
 	     float d4, float d5, float d6, float d7, svuint64_t x) {}
+void fn_mf8 (float d0, float d1, float d2, float d3,
+	     float d4, float d5, float d6, float d7, svmfloat8_t x) {}
 void fn_bf16 (float d0, float d1, float d2, float d3,
 	      float d4, float d5, float d6, float d7, svbfloat16_t x) {}
 void fn_f16 (float d0, float d1, float d2, float d3,
@@ -43,6 +45,8 @@ void fn_u32x2 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, float d7, svuint32x2_t x) {}
 void fn_u64x2 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, float d7, svuint64x2_t x) {}
+void fn_mf8x2 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, float d6, float d7, svmfloat8x2_t x) {}
 void fn_bf16x2 (float d0, float d1, float d2, float d3,
 		float d4, float d5, float d6, float d7, svbfloat16x2_t x) {}
 void fn_f16x2 (float d0, float d1, float d2, float d3,
@@ -68,6 +72,8 @@ void fn_u32x3 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, float d7, svuint32x3_t x) {}
 void fn_u64x3 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, float d7, svuint64x3_t x) {}
+void fn_mf8x3 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, float d6, float d7, svmfloat8x3_t x) {}
 void fn_bf16x3 (float d0, float d1, float d2, float d3,
 		float d4, float d5, float d6, float d7, svbfloat16x3_t x) {}
 void fn_f16x3 (float d0, float d1, float d2, float d3,
@@ -93,6 +99,8 @@ void fn_u32x4 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, float d7, svuint32x4_t x) {}
 void fn_u64x4 (float d0, float d1, float d2, float d3,
 	       float d4, float d5, float d6, float d7, svuint64x4_t x) {}
+void fn_mf8x4 (float d0, float d1, float d2, float d3,
+	       float d4, float d5, float d6, float d7, svmfloat8x4_t x) {}
 void fn_bf16x4 (float d0, float d1, float d2, float d3,
 		float d4, float d5, float d6, float d7, svbfloat16x4_t x) {}
 void fn_f16x4 (float d0, float d1, float d2, float d3,
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_mf8.c
new file mode 100644
index 00000000000..3bd18a92393
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_be_mf8.c
@@ -0,0 +1,63 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -g" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#pragma GCC aarch64 "arm_sve.h"
+
+/*
+** callee:
+**	addvl	sp, sp, #-1
+**	str	(p[4-7]), \[sp\]
+**	ptrue	\1\.b, all
+** (
+**	ld1b	(z[0-9]+\.b), \1/z, \[x1, #1, mul vl\]
+**	ld1b	(z[0-9]+\.b), \1/z, \[x1\]
+**	st2b	{\3 - \2}, p0, \[x0\]
+** |
+**	ld1b	(z[0-9]+\.b), \1/z, \[x1\]
+**	ld1b	(z[0-9]+\.b), \1/z, \[x1, #1, mul vl\]
+**	st2b	{\4 - \5}, p0, \[x0\]
+** )
+**	st4b	{z0\.b - z3\.b}, p1, \[x0\]
+**	st3b	{z4\.b - z6\.b}, p2, \[x0\]
+**	st1b	z7\.b, p3, \[x0\]
+**	ldr	\1, \[sp\]
+**	addvl	sp, sp, #1
+**	ret
+*/
+void __attribute__((noipa))
+callee (void *x0, svmfloat8x4_t z0, svmfloat8x3_t z4, svmfloat8x2_t stack,
+	svmfloat8_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3)
+{
+  svst2 (p0, x0, stack);
+  svst4 (p1, x0, z0);
+  svst3 (p2, x0, z4);
+  svst1_mf8 (p3, x0, z7);
+}
+
+void __attribute__((noipa))
+caller (void *x0)
+{
+  svbool_t pg;
+  pg = svptrue_b8 ();
+  callee (x0,
+	  svld4_vnum_mf8 (pg, x0, -8),
+	  svld3_vnum_mf8 (pg, x0, -3),
+	  svld2_vnum_mf8 (pg, x0, 0),
+	  svld1_vnum_mf8 (pg, x0, 2),
+	  svptrue_pat_b8 (SV_VL1),
+	  svptrue_pat_b16 (SV_VL2),
+	  svptrue_pat_b32 (SV_VL3),
+	  svptrue_pat_b64 (SV_VL4));
+}
+
+/* { dg-final { scan-assembler {\tld4b\t{z0\.b - z3\.b}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld3b\t{z4\.b - z6\.b}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tld1b\tz7\.b, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{(z[0-9]+\.b) - z[0-9]+\.b}.*\tst1b\t\1, p[0-7], \[(?:x1|sp)\]\n} } } */
+/* { dg-final { scan-assembler {\tld2b\t{z[0-9]+\.b - (z[0-9]+\.b)}.*\tst1b\t\1, p[0-7], \[(?:x1|sp), #1, mul vl\]\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */
+/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } }
*/ +/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_mf8.c new file mode 100644 index 00000000000..6106240e648 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_5_le_mf8.c @@ -0,0 +1,58 @@ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee: +** ( +** ldr (z[0-9]+), \[x1, #1, mul vl\] +** ldr (z[0-9]+), \[x1\] +** st2b {\2\.b - \1\.b}, p0, \[x0\] +** | +** ldr (z[0-9]+), \[x1\] +** ldr (z[0-9]+), \[x1, #1, mul vl\] +** st2b {\3\.b - \4\.b}, p0, \[x0\] +** ) +** st4b {z0\.b - z3\.b}, p1, \[x0\] +** st3b {z4\.b - z6\.b}, p2, \[x0\] +** st1b z7\.b, p3, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee (void *x0, svmfloat8x4_t z0, svmfloat8x3_t z4, svmfloat8x2_t stack, + svmfloat8_t z7, svbool_t p0, svbool_t p1, svbool_t p2, svbool_t p3) +{ + svst2 (p0, x0, stack); + svst4 (p1, x0, z0); + svst3 (p2, x0, z4); + svst1_mf8 (p3, x0, z7); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee (x0, + svld4_vnum_mf8 (pg, x0, -8), + svld3_vnum_mf8 (pg, x0, -3), + svld2_vnum_mf8 (pg, x0, 0), + svld1_vnum_mf8 (pg, x0, 2), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3), + svptrue_pat_b64 (SV_VL4)); +} + +/* { dg-final { scan-assembler {\tld4b\t{z0\.b - z3\.b}, p[0-7]/z, \[x0, #-8, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3b\t{z4\.b - z6\.b}, p[0-7]/z, \[x0, #-3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1b\tz7\.b, p[0-7]/z, \[x0, #2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tmov\tx1, sp\n} } } */ +/* { dg-final { scan-assembler {\tld2b\t{(z[0-9]+)\.b - z[0-9]+\.b}.*\tstr\t\1, \[(?:x1|sp)\]\n} } } */ +/* { dg-final { scan-assembler 
{\tld2b\t{z[0-9]+\.b - (z[0-9]+)\.b}.*\tstr\t\1, \[(?:x1|sp), #1, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp3\.d, vl4\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_mf8.c new file mode 100644 index 00000000000..80af390b944 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_be_mf8.c @@ -0,0 +1,71 @@ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-O -mbig-endian -fno-stack-clash-protection -fno-cprop-registers -fdisable-rtl-combine -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ptrue p3\.b, all +** ... +** ld1b (z[0-9]+\.b), p3/z, \[x1, #3, mul vl\] +** ... +** st4b {z[0-9]+\.b - \1}, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z5\.b - z7\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svmfloat8x3_t z0, svmfloat8x2_t z3, svmfloat8x3_t z5, + svmfloat8x4_t stack1, svmfloat8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_mf8 (p0, x0, stack1); + svst2_mf8 (p1, x0, z3); + svst3_mf8 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1b (z[0-9]+\.b), p3/z, \[x2\] +** st1b \1, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z0\.b - z2\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svmfloat8x3_t z0, svmfloat8x2_t z3, svmfloat8x3_t z5, + svmfloat8x4_t stack1, svmfloat8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_mf8 (p0, x0, stack2); + svst2_mf8 (p1, x0, z3); + svst3_mf8 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_mf8 (pg, x0, -9), + svld2_vnum_mf8 (pg, x0, -2), + 
svld3_vnum_mf8 (pg, x0, 0), + svld4_vnum_mf8 (pg, x0, 8), + svld1_vnum_mf8 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3b\t{z0\.b - z2\.b}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2b\t{z3\.b - z4\.b}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3b\t{z5\.b - z7\.b}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{(z[0-9]+\.b) - z[0-9]+\.b}.*\tst1b\t\1, p[0-7], \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{z[0-9]+\.b - (z[0-9]+\.b)}.*\tst1b\t\1, p[0-7], \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1b\t(z[0-9]+\.b), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1b\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_mf8.c new file mode 100644 index 00000000000..e32089947c7 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/args_6_le_mf8.c @@ -0,0 +1,70 @@ +/* { dg-do compile { target lp64 } } */ +/* { dg-options "-O -mlittle-endian -fno-stack-clash-protection -fno-cprop-registers -fdisable-rtl-combine -g" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#pragma GCC aarch64 "arm_sve.h" + +/* +** callee1: +** ... +** ldr (z[0-9]+), \[x1, #3, mul vl\] +** ... 
+** st4b {z[0-9]+\.b - \1\.b}, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z5\.b - z7\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee1 (void *x0, svmfloat8x3_t z0, svmfloat8x2_t z3, svmfloat8x3_t z5, + svmfloat8x4_t stack1, svmfloat8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst4_mf8 (p0, x0, stack1); + svst2_mf8 (p1, x0, z3); + svst3_mf8 (p2, x0, z5); +} + +/* +** callee2: +** ptrue p3\.b, all +** ld1b (z[0-9]+\.b), p3/z, \[x2\] +** st1b \1, p0, \[x0\] +** st2b {z3\.b - z4\.b}, p1, \[x0\] +** st3b {z0\.b - z2\.b}, p2, \[x0\] +** ret +*/ +void __attribute__((noipa)) +callee2 (void *x0, svmfloat8x3_t z0, svmfloat8x2_t z3, svmfloat8x3_t z5, + svmfloat8x4_t stack1, svmfloat8_t stack2, svbool_t p0, + svbool_t p1, svbool_t p2) +{ + svst1_mf8 (p0, x0, stack2); + svst2_mf8 (p1, x0, z3); + svst3_mf8 (p2, x0, z0); +} + +void __attribute__((noipa)) +caller (void *x0) +{ + svbool_t pg; + pg = svptrue_b8 (); + callee1 (x0, + svld3_vnum_mf8 (pg, x0, -9), + svld2_vnum_mf8 (pg, x0, -2), + svld3_vnum_mf8 (pg, x0, 0), + svld4_vnum_mf8 (pg, x0, 8), + svld1_vnum_mf8 (pg, x0, 5), + svptrue_pat_b8 (SV_VL1), + svptrue_pat_b16 (SV_VL2), + svptrue_pat_b32 (SV_VL3)); +} + +/* { dg-final { scan-assembler {\tld3b\t{z0\.b - z2\.b}, p[0-7]/z, \[x0, #-9, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld2b\t{z3\.b - z4\.b}, p[0-7]/z, \[x0, #-2, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld3b\t{z5\.b - z7\.b}, p[0-7]/z, \[x0\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{(z[0-9]+)\.b - z[0-9]+\.b}.*\tstr\t\1, \[x1\]\n} } } */ +/* { dg-final { scan-assembler {\tld4b\t{z[0-9]+\.b - (z[0-9]+)\.b}.*\tstr\t\1, \[x1, #3, mul vl\]\n} } } */ +/* { dg-final { scan-assembler {\tld1b\t(z[0-9]+\.b), p[0-7]/z, \[x0, #5, mul vl\]\n.*\tst1b\t\1, p[0-7], \[x2\]\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp0\.b, vl1\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp1\.h, vl2\n} } } */ +/* { dg-final { scan-assembler {\tptrue\tp2\.s, 
vl3\n} } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c index e5fceb14bbe..009a987fc31 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c @@ -2,6 +2,7 @@ #include <arm_sve.h> +typedef mfloat8_t mfloat8x32_t __attribute__((vector_size (32))); typedef bfloat16_t bfloat16x16_t __attribute__((vector_size (32))); typedef float16_t float16x16_t __attribute__((vector_size (32))); typedef float32_t float32x8_t __attribute__((vector_size (32))); @@ -15,6 +16,7 @@ typedef uint16_t uint16x16_t __attribute__((vector_size (32))); typedef uint32_t uint32x8_t __attribute__((vector_size (32))); typedef uint64_t uint64x4_t __attribute__((vector_size (32))); +void mfloat8_callee (mfloat8x32_t); void bfloat16_callee (bfloat16x16_t); void float16_callee (float16x16_t); void float32_callee (float32x8_t); @@ -28,6 +30,12 @@ void uint16_callee (uint16x16_t); void uint32_callee (uint32x8_t); void uint64_callee (uint64x4_t); +void +mfloat8_caller (mfloat8_t val) +{ + mfloat8_callee (svdup_mf8 (val)); +} + void bfloat16_caller (bfloat16_t val) { @@ -100,8 +108,8 @@ uint64_caller (void) uint64_callee (svindex_u64 (1, 4)); } -/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */ +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 3 } } */ /* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 4 } } */ /* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */ /* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */ -/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 12 } } */ +/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 13 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c index 875567f0197..375ac16495a 100644 
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c @@ -2,6 +2,7 @@ #include <arm_sve.h> +typedef mfloat8_t mfloat8x32_t __attribute__((vector_size (32))); typedef bfloat16_t bfloat16x16_t __attribute__((vector_size (32))); typedef float16_t float16x16_t __attribute__((vector_size (32))); typedef float32_t float32x8_t __attribute__((vector_size (32))); @@ -15,6 +16,7 @@ typedef uint16_t uint16x16_t __attribute__((vector_size (32))); typedef uint32_t uint32x8_t __attribute__((vector_size (32))); typedef uint64_t uint64x4_t __attribute__((vector_size (32))); +void mfloat8_callee (svmfloat8_t); void bfloat16_callee (svbfloat16_t); void float16_callee (svfloat16_t); void float32_callee (svfloat32_t); @@ -28,6 +30,12 @@ void uint16_callee (svuint16_t); void uint32_callee (svuint32_t); void uint64_callee (svuint64_t); +void +mfloat8_caller (mfloat8x32_t arg) +{ + mfloat8_callee (arg); +} + void bfloat16_caller (bfloat16x16_t arg) { @@ -100,7 +108,7 @@ uint64_caller (uint64x4_t arg) uint64_callee (arg); } -/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */ +/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 3 } } */ /* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 4 } } */ /* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */ /* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c index 91fdd3c202e..7fb4dcc3dad 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4.c @@ -25,6 +25,14 @@ CALLEE (s8, __SVInt8_t) */ CALLEE (u8, __SVUint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, all +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, __SVMfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, all @@ -115,7 
+123,7 @@ CALLEE (f64, __SVFloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, __SVInt8_t) */ CALLER (u8, __SVUint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, __SVMfloat8_t) + /* ** caller_s16: ** ... @@ -189,7 +206,7 @@ CALLER (f16, __SVFloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, __SVBfloat16_t) +CALLER_NON_NUMERIC (bf16, __SVBfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c index 7d824caae1b..f3372eae7ed 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_1024.c @@ -25,6 +25,14 @@ CALLEE (s8, __SVInt8_t) */ CALLEE (u8, __SVUint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, __SVMfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl128 @@ -115,7 +123,7 @@ CALLEE (f64, __SVFloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, __SVInt8_t) */ CALLER (u8, __SVUint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, __SVMfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, __SVFloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, __SVBfloat16_t) +CALLER_NON_NUMERIC (bf16, __SVBfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_128.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_128.c index e0aa3a5fa68..87d528c84cd 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_128.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_128.c @@ -25,6 +25,14 @@ CALLEE (s8, __SVInt8_t) */ CALLEE (u8, __SVUint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl16 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, __SVMfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl16 @@ -115,7 +123,7 @@ CALLEE (f64, __SVFloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, __SVInt8_t) */ CALLER (u8, __SVUint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, __SVMfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, __SVFloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, __SVBfloat16_t) +CALLER_NON_NUMERIC (bf16, __SVBfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c index 3238015d9eb..4b429293dca 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_2048.c @@ -25,6 +25,14 @@ CALLEE (s8, __SVInt8_t) */ CALLEE (u8, __SVUint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl256 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, __SVMfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl256 @@ -115,7 +123,7 @@ CALLEE (f64, __SVFloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, __SVInt8_t) */ CALLER (u8, __SVUint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, __SVMfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, __SVFloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, __SVBfloat16_t) +CALLER_NON_NUMERIC (bf16, __SVBfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c index 50861098934..f90181a9829 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_256.c @@ -25,6 +25,14 @@ CALLEE (s8, __SVInt8_t) */ CALLEE (u8, __SVUint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl32 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, __SVMfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl32 @@ -115,7 +123,7 @@ CALLEE (f64, __SVFloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, __SVInt8_t) */ CALLER (u8, __SVUint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, __SVMfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, __SVFloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, __SVBfloat16_t) +CALLER_NON_NUMERIC (bf16, __SVBfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c index 300dacce955..c3ae7acd5a5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_4_512.c @@ -25,6 +25,14 @@ CALLEE (s8, __SVInt8_t) */ CALLEE (u8, __SVUint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl64 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, __SVMfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl64 @@ -115,7 +123,7 @@ CALLEE (f64, __SVFloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, __SVInt8_t) */ CALLER (u8, __SVUint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, __SVMfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, __SVFloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, __SVBfloat16_t) +CALLER_NON_NUMERIC (bf16, __SVBfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c index 0a840a38384..e1b941e8f5e 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5.c @@ -27,6 +27,14 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, all +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, all @@ -115,7 +123,7 @@ CALLEE (f64, svfloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, svfloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, svbfloat16_t) +CALLER_NON_NUMERIC (bf16, svbfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c index 18cefbff1e6..d621b0c40c8 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_1024.c @@ -27,6 +27,14 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl128 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl128 @@ -115,7 +123,7 @@ CALLEE (f64, svfloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, svfloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, svbfloat16_t) +CALLER_NON_NUMERIC (bf16, svbfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_128.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_128.c index c622ed55674..347a16c1367 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_128.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_128.c @@ -27,6 +27,14 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl16 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl16 @@ -115,7 +123,7 @@ CALLEE (f64, svfloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, svfloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, svbfloat16_t) +CALLER_NON_NUMERIC (bf16, svbfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c index 3286280687d..cb369842ff0 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_2048.c @@ -27,6 +27,14 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl256 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl256 @@ -115,7 +123,7 @@ CALLEE (f64, svfloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, svfloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, svbfloat16_t) +CALLER_NON_NUMERIC (bf16, svbfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c index 3c6afa2fdf1..959a698f970 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_256.c @@ -27,6 +27,14 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl32 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl32 @@ -115,7 +123,7 @@ CALLEE (f64, svfloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, svfloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, svbfloat16_t) +CALLER_NON_NUMERIC (bf16, svbfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c index bb7d3ebf9d4..9e40821de3c 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_5_512.c @@ -27,6 +27,14 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl64 +** ld1b z0\.b, \1/z, \[x0\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl64 @@ -115,7 +123,7 @@ CALLEE (f64, svfloat64_t) return svaddv (svptrue_b8 (), callee_##SUFFIX (ptr1)); \ } -#define CALLER_BF16(SUFFIX, TYPE) \ +#define CALLER_NON_NUMERIC(SUFFIX, TYPE) \ typeof (svlasta (svptrue_b8 (), *(TYPE *) 0)) \ __attribute__((noipa)) \ caller_##SUFFIX (TYPE *ptr1) \ @@ -147,6 +155,15 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldp x29, x30, \[sp\], 16 +** ret +*/ +CALLER_NON_NUMERIC (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
@@ -189,7 +206,7 @@ CALLER (f16, svfloat16_t) ** ldp x29, x30, \[sp\], 16 ** ret */ -CALLER_BF16 (bf16, svbfloat16_t) +CALLER_NON_NUMERIC (bf16, svbfloat16_t) /* ** caller_s32: diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c index 1bc2f43bcf9..81c0a4163fa 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6.c @@ -6,6 +6,7 @@ typedef int8_t svint8_t __attribute__ ((vector_size (32))); typedef uint8_t svuint8_t __attribute__ ((vector_size (32))); +typedef __mfp8 svmfloat8_t __attribute__ ((vector_size (32))); typedef int16_t svint16_t __attribute__ ((vector_size (32))); typedef uint16_t svuint16_t __attribute__ ((vector_size (32))); @@ -53,6 +54,19 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ( +** ld1 ({v.*}), \[x0\] +** st1 \1, \[x8\] +** | +** ldp (q[0-9]+, q[0-9]+), \[x0\] +** stp \2, \[x8\] +** ) +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ( @@ -171,6 +185,16 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ldr b0, \[sp, 16\] +** ldp x29, x30, \[sp\], 48 +** ret +*/ +CALLER (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c index 6c716ef7c34..6b58dd48eab 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_1024.c @@ -6,6 +6,7 @@ typedef int8_t svint8_t __attribute__ ((vector_size (128))); typedef uint8_t svuint8_t __attribute__ ((vector_size (128))); +typedef __mfp8 svmfloat8_t __attribute__ ((vector_size (128))); typedef int16_t svint16_t __attribute__ ((vector_size (128))); typedef uint16_t svuint16_t __attribute__ ((vector_size (128))); @@ -45,6 +46,15 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ptrue (p[0-7])\.b, vl128 +** ld1b (z[0-9]+)\.b, \1/z, \[x0\] +** st1b \2\.b, \1, \[x8\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ptrue (p[0-7])\.b, vl128 @@ -166,6 +176,18 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ... +** ld1b (z[0-9]+\.b), (p[0-7])/z, \[[^]]*\] +** st1b \1, \2, \[[^]]*\] +** ... +** ret +*/ +CALLER (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_128.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_128.c index 4f190fd1444..18ace1985fe 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_128.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_128.c @@ -6,6 +6,7 @@ typedef int8_t svint8_t __attribute__ ((vector_size (16))); typedef uint8_t svuint8_t __attribute__ ((vector_size (16))); +typedef __mfp8 svmfloat8_t __attribute__ ((vector_size (16))); typedef int16_t svint16_t __attribute__ ((vector_size (16))); typedef uint16_t svuint16_t __attribute__ ((vector_size (16))); @@ -41,6 +42,13 @@ CALLEE (s8, svint8_t) */ CALLEE (u8, svuint8_t) +/* +** callee_mf8: +** ldr q0, \[x0\] +** ret +*/ +CALLEE (mf8, svmfloat8_t) + /* ** callee_s16: ** ldr q0, \[x0\] @@ -140,6 +148,17 @@ CALLER (s8, svint8_t) */ CALLER (u8, svuint8_t) +/* +** caller_mf8: +** ... +** bl callee_mf8 +** ... +** str q0, \[[^]]*\] +** ... +** ret +*/ +CALLER (mf8, svmfloat8_t) + /* ** caller_s16: ** ... 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c
index 0eb9607d9db..0def3b5f2da 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_2048.c
@@ -6,6 +6,7 @@
 typedef int8_t svint8_t __attribute__ ((vector_size (256)));
 typedef uint8_t svuint8_t __attribute__ ((vector_size (256)));
+typedef __mfp8 svmfloat8_t __attribute__ ((vector_size (256)));
 typedef int16_t svint16_t __attribute__ ((vector_size (256)));
 typedef uint16_t svuint16_t __attribute__ ((vector_size (256)));
@@ -45,6 +46,15 @@ CALLEE (s8, svint8_t)
 */
 CALLEE (u8, svuint8_t)
 
+/*
+** callee_mf8:
+**	ptrue	(p[0-7])\.b, vl256
+**	ld1b	(z[0-9]+)\.b, \1/z, \[x0\]
+**	st1b	\2\.b, \1, \[x8\]
+**	ret
+*/
+CALLEE (mf8, svmfloat8_t)
+
 /*
 ** callee_s16:
 **	ptrue	(p[0-7])\.b, vl256
@@ -166,6 +176,18 @@ CALLER (s8, svint8_t)
 */
 CALLER (u8, svuint8_t)
 
+/*
+** caller_mf8:
+**	...
+**	bl	callee_mf8
+**	...
+**	ld1b	(z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+**	st1b	\1, \2, \[[^]]*\]
+**	...
+**	ret
+*/
+CALLER (mf8, svmfloat8_t)
+
 /*
 ** caller_s16:
 **	...
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c
index 749eb332599..17055521f7d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_256.c
@@ -6,6 +6,7 @@
 typedef int8_t svint8_t __attribute__ ((vector_size (32)));
 typedef uint8_t svuint8_t __attribute__ ((vector_size (32)));
+typedef __mfp8 svmfloat8_t __attribute__ ((vector_size (32)));
 typedef int16_t svint16_t __attribute__ ((vector_size (32)));
 typedef uint16_t svuint16_t __attribute__ ((vector_size (32)));
@@ -45,6 +46,15 @@ CALLEE (s8, svint8_t)
 */
 CALLEE (u8, svuint8_t)
 
+/*
+** callee_mf8:
+**	ptrue	(p[0-7])\.b, vl32
+**	ld1b	(z[0-9]+)\.b, \1/z, \[x0\]
+**	st1b	\2\.b, \1, \[x8\]
+**	ret
+*/
+CALLEE (mf8, svmfloat8_t)
+
 /*
 ** callee_s16:
 **	ptrue	(p[0-7])\.b, vl32
@@ -166,6 +176,18 @@ CALLER (s8, svint8_t)
 */
 CALLER (u8, svuint8_t)
 
+/*
+** caller_mf8:
+**	...
+**	bl	callee_mf8
+**	...
+**	ld1b	(z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+**	st1b	\1, \2, \[[^]]*\]
+**	...
+**	ret
+*/
+CALLER (mf8, svmfloat8_t)
+
 /*
 ** caller_s16:
 **	...
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c
index f6a64cc4944..324d0973ece 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_6_512.c
@@ -6,6 +6,7 @@
 typedef int8_t svint8_t __attribute__ ((vector_size (64)));
 typedef uint8_t svuint8_t __attribute__ ((vector_size (64)));
+typedef __mfp8 svmfloat8_t __attribute__ ((vector_size (64)));
 typedef int16_t svint16_t __attribute__ ((vector_size (64)));
 typedef uint16_t svuint16_t __attribute__ ((vector_size (64)));
@@ -45,6 +46,15 @@ CALLEE (s8, svint8_t)
 */
 CALLEE (u8, svuint8_t)
 
+/*
+** callee_mf8:
+**	ptrue	(p[0-7])\.b, vl64
+**	ld1b	(z[0-9]+)\.b, \1/z, \[x0\]
+**	st1b	\2\.b, \1, \[x8\]
+**	ret
+*/
+CALLEE (mf8, svmfloat8_t)
+
 /*
 ** callee_s16:
 **	ptrue	(p[0-7])\.b, vl64
@@ -166,6 +176,18 @@ CALLER (s8, svint8_t)
 */
 CALLER (u8, svuint8_t)
 
+/*
+** caller_mf8:
+**	...
+**	bl	callee_mf8
+**	...
+**	ld1b	(z[0-9]+\.b), (p[0-7])/z, \[[^]]*\]
+**	st1b	\1, \2, \[[^]]*\]
+**	...
+**	ret
+*/
+CALLER (mf8, svmfloat8_t)
+
 /*
 ** caller_s16:
 **	...
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c
index 55456a3b4cb..5d1d4595259 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_7.c
@@ -60,6 +60,34 @@ caller_u8 (void)
   return svtrn2 (svget2 (res, 1), svget2 (res, 0));
 }
 
+/*
+** callee_mf8:
+**	mov	z0\.b, b2
+**	mov	z1\.b, b3
+**	ret
+*/
+svmfloat8x2_t __attribute__((noipa))
+callee_mf8 (mfloat8_t h0, mfloat8_t h1, mfloat8_t h2, mfloat8_t h3)
+{
+  return svcreate2 (svdup_mf8 (h2), svdup_mf8 (h3));
+}
+
+/*
+** caller_mf8:
+**	...
+**	bl	callee_mf8
+**	trn2	z0\.b, z1\.b, z0\.b
+**	ldp	x29, x30, \[sp\], 16
+**	ret
+*/
+svmfloat8_t __attribute__((noipa))
+caller_mf8 (mfloat8_t h0, mfloat8_t h1, mfloat8_t h2, mfloat8_t h3)
+{
+  svmfloat8x2_t res;
+  res = callee_mf8 (h0, h1, h2, h3);
+  return svtrn2 (svget2 (res, 1), svget2 (res, 0));
+}
+
 /*
 ** callee_s16:
 **	mov	z0\.h, #1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c
index 9581811e7f3..05373029fe5 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_8.c
@@ -66,6 +66,35 @@ caller_u8 (void)
 		 svget3 (res, 0), svget3 (res, 1), svget3 (res, 2));
 }
 
+/*
+** callee_mf8:
+**	mov	z0\.b, b0
+**	mov	z1\.b, b1
+**	mov	z2\.b, b2
+**	ret
+*/
+svmfloat8x3_t __attribute__((noipa))
+callee_mf8 (mfloat8_t h0, mfloat8_t h1, mfloat8_t h2)
+{
+  return svcreate3 (svdup_mf8 (h0), svdup_mf8 (h1), svdup_mf8 (h2));
+}
+
+/*
+** caller_mf8:
+**	...
+**	bl	callee_mf8
+**	trn2	z0\.b, z0\.b, z2\.b
+**	ldp	x29, x30, \[sp\], 16
+**	ret
+*/
+svmfloat8_t __attribute__((noipa))
+caller_mf8 (mfloat8_t h0, mfloat8_t h1, mfloat8_t h2)
+{
+  svmfloat8x3_t res;
+  res = callee_mf8 (h0, h1, h2);
+  return svtrn2 (svget3 (res, 0), svget3 (res, 2));
+}
+
 /*
 ** callee_s16:
 **	mov	z0\.h, #1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c
index 3b2604e6068..4133709dd2b 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/return_9.c
@@ -74,6 +74,39 @@ caller_u8 (void)
 			  svget4 (res, 3)));
 }
 
+/*
+** callee_mf8:
+**	mov	z0\.b, b4
+**	mov	z1\.b, b5
+**	mov	z2\.b, b6
+**	mov	z3\.b, b7
+**	ret
+*/
+svmfloat8x4_t __attribute__((noipa))
+callee_mf8 (mfloat8_t h0, mfloat8_t h1, mfloat8_t h2, mfloat8_t h3,
+	    mfloat8_t h4, mfloat8_t h5, mfloat8_t h6, mfloat8_t h7)
+{
+  return svcreate4 (svdup_mf8 (h4), svdup_mf8 (h5),
+		    svdup_mf8 (h6), svdup_mf8 (h7));
+}
+
+/*
+** caller_mf8:
+**	...
+**	bl	callee_mf8
+**	trn2	z0\.b, z0\.b, z3\.b
+**	ldp	x29, x30, \[sp\], 16
+**	ret
+*/
+svmfloat8_t __attribute__((noipa))
+caller_mf8 (mfloat8_t h0, mfloat8_t h1, mfloat8_t h2, mfloat8_t h3,
+	    mfloat8_t h4, mfloat8_t h5, mfloat8_t h6, mfloat8_t h7)
+{
+  svmfloat8x4_t res;
+  res = callee_mf8 (h0, h1, h2, h3, h4, h5, h6, h7);
+  return svtrn2 (svget4 (res, 0), svget4 (res, 3));
+}
+
 /*
 ** callee_s16:
 **	mov	z0\.h, #1
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_mf8.c
new file mode 100644
index 00000000000..28777878d56
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pcs/varargs_2_mf8.c
@@ -0,0 +1,182 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-stack-clash-protection -fno-cprop-registers -fdisable-rtl-combine -g" } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
+
+#include
+#include
+
+/*
+** callee_0:
+**	...
+**	ld1b	(z[0-9]+\.b), (p[0-7])/z, \[x1\]
+**	...
+**	st1b	\1, \2, \[x0\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+callee_0 (mfloat8_t *ptr, ...)
+{
+  va_list va;
+  svmfloat8_t vec;
+
+  va_start (va, ptr);
+  vec = va_arg (va, svmfloat8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/* FIXME: optimize the umov and mov pair.  */
+/*
+** caller_0:
+**	...
+**	umov	(w[0-9]+), v0.b\[0\]
+**	...
+**	mov	(z[0-9]+\.b), \1
+**	...
+**	st1b	\2, p[0-7], \[x1\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+caller_0 (mfloat8_t *ptr, mfloat8_t in)
+{
+  callee_0 (ptr, svdup_mf8 (in));
+}
+
+/*
+** callee_1:
+**	...
+**	ld1b	(z[0-9]+\.b), (p[0-7])/z, \[x2\]
+**	...
+**	st1b	\1, p[0-7], \[x0\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+callee_1 (mfloat8_t *ptr, ...)
+{
+  va_list va;
+  svmfloat8_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  vec = va_arg (va, svmfloat8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/* FIXME: optimize the umov and mov pair.  */
+/*
+** caller_1:
+**	...
+**	umov	(w[0-9]+), v0.b\[0\]
+**	...
+**	mov	(z[0-9]+\.b), \1
+**	...
+**	st1b	\2, p[0-7], \[x2\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+caller_1 (mfloat8_t *ptr, mfloat8_t in)
+{
+  callee_1 (ptr, 1, svdup_mf8 (in));
+}
+
+/*
+** callee_7:
+**	...
+**	ld1b	(z[0-9]+\.b), (p[0-7])/z, \[x7\]
+**	...
+**	st1b	\1, p[0-7], \[x0\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+callee_7 (mfloat8_t *ptr, ...)
+{
+  va_list va;
+  svmfloat8_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svmfloat8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/* FIXME: optimize the umov and mov pair.  */
+/*
+** caller_7:
+**	...
+**	umov	(w[0-9]+), v0.b\[0\]
+**	...
+**	mov	(z[0-9]+\.b), \1
+**	...
+**	st1b	\2, p[0-7], \[x7\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+caller_7 (mfloat8_t *ptr, mfloat8_t in)
+{
+  callee_7 (ptr, 1, 2, 3, 4, 5, 6, svdup_mf8 (in));
+}
+
+/* FIXME: We should be able to get rid of the va_list object.  */
+/*
+** callee_8:
+**	sub	sp, sp, #([0-9]+)
+**	...
+**	ldr	(x[0-9]+), \[sp, \1\]
+**	...
+**	ld1b	(z[0-9]+\.b), (p[0-7])/z, \[\2\]
+**	...
+**	st1b	\3, \4, \[x0\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+callee_8 (mfloat8_t *ptr, ...)
+{
+  va_list va;
+  svmfloat8_t vec;
+
+  va_start (va, ptr);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  va_arg (va, int);
+  vec = va_arg (va, svmfloat8_t);
+  va_end (va);
+  svst1 (svptrue_b8 (), ptr, vec);
+}
+
+/* FIXME: optimize the umov and mov pair.  */
+/*
+** caller_8:
+**	...
+**	umov	(w[0-9]+), v0.b\[0\]
+**	...
+**	mov	(z[0-9]+\.b), \1
+**	...
+**	st1b	\2, p[0-7], \[(x[0-9]+)\]
+**	...
+**	str	\3, \[sp\]
+**	...
+**	ret
+*/
+void __attribute__((noipa))
+caller_8 (mfloat8_t *ptr, mfloat8_t in)
+{
+  callee_8 (ptr, 1, 2, 3, 4, 5, 6, 7, svdup_mf8 (in));
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbl2_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbl2_mf8.c
new file mode 100644
index 00000000000..19cc739e7ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbl2_mf8.c
@@ -0,0 +1,31 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** tbl2_mf8_tied1:
+**	tbl	z0\.b, {z0\.b(?:, | - )z1\.b}, z4\.b
+**	ret
+*/
+TEST_TBL2 (tbl2_mf8_tied1, svmfloat8x2_t, svmfloat8_t, svuint8_t,
+	   z0_res = svtbl2_mf8 (z0, z4),
+	   z0_res = svtbl2 (z0, z4))
+
+/*
+** tbl2_mf8_tied2:
+**	tbl	z0\.b, {z1\.b(?:, | - )z2\.b}, z0\.b
+**	ret
+*/
+TEST_TBL2_REV (tbl2_mf8_tied2, svmfloat8x2_t, svmfloat8_t, svuint8_t,
+	       z0_res = svtbl2_mf8 (z1, z0),
+	       z0_res = svtbl2 (z1, z0))
+
+/*
+** tbl2_mf8_untied:
+**	tbl	z0\.b, {z2\.b(?:, | - )z3\.b}, z4\.b
+**	ret
+*/
+TEST_TBL2 (tbl2_mf8_untied, svmfloat8x2_t, svmfloat8_t, svuint8_t,
+	   z0_res = svtbl2_mf8 (z2, z4),
+	   z0_res = svtbl2 (z2, z4))
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbx_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbx_mf8.c
new file mode 100644
index 00000000000..ba0fef3934b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/tbx_mf8.c
@@ -0,0 +1,37 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+/*
+** tbx_mf8_tied1:
+**	tbx	z0\.b, z1\.b, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (tbx_mf8_tied1, svmfloat8_t, svuint8_t,
+	     z0 = svtbx_mf8 (z0, z1, z4),
+	     z0 = svtbx (z0, z1, z4))
+
+/* Bad RA choice: no preferred output sequence.  */
+TEST_DUAL_Z (tbx_mf8_tied2, svmfloat8_t, svuint8_t,
+	     z0 = svtbx_mf8 (z1, z0, z4),
+	     z0 = svtbx (z1, z0, z4))
+
+/* Bad RA choice: no preferred output sequence.  */
+TEST_DUAL_Z_REV (tbx_mf8_tied3, svmfloat8_t, svuint8_t,
+		 z0_res = svtbx_mf8 (z4, z5, z0),
+		 z0_res = svtbx (z4, z5, z0))
+
+/*
+** tbx_mf8_untied:
+** (
+**	mov	z0\.d, z1\.d
+**	tbx	z0\.b, z2\.b, z4\.b
+** |
+**	tbx	z1\.b, z2\.b, z4\.b
+**	mov	z0\.d, z1\.d
+** )
+**	ret
+*/
+TEST_DUAL_Z (tbx_mf8_untied, svmfloat8_t, svuint8_t,
+	     z0 = svtbx_mf8 (z1, z2, z4),
+	     z0 = svtbx (z1, z2, z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilerw_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilerw_mf8.c
new file mode 100644
index 00000000000..12cf0d2c365
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilerw_mf8.c
@@ -0,0 +1,50 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** whilerw_rr_mf8:
+**	whilerw	p0\.b, x0, x1
+**	ret
+*/
+TEST_COMPARE_S (whilerw_rr_mf8, const mfloat8_t *,
+		p0 = svwhilerw_mf8 (x0, x1),
+		p0 = svwhilerw (x0, x1))
+
+/*
+** whilerw_0r_mf8:
+**	whilerw	p0\.b, xzr, x1
+**	ret
+*/
+TEST_COMPARE_S (whilerw_0r_mf8, const mfloat8_t *,
+		p0 = svwhilerw_mf8 ((const mfloat8_t *) 0, x1),
+		p0 = svwhilerw ((const mfloat8_t *) 0, x1))
+
+/*
+** whilerw_cr_mf8:
+**	mov	(x[0-9]+), #?1073741824
+**	whilerw	p0\.b, \1, x1
+**	ret
+*/
+TEST_COMPARE_S (whilerw_cr_mf8, const mfloat8_t *,
+		p0 = svwhilerw_mf8 ((const mfloat8_t *) 1073741824, x1),
+		p0 = svwhilerw ((const mfloat8_t *) 1073741824, x1))
+
+/*
+** whilerw_r0_mf8:
+**	whilerw	p0\.b, x0, xzr
+**	ret
+*/
+TEST_COMPARE_S (whilerw_r0_mf8, const mfloat8_t *,
+		p0 = svwhilerw_mf8 (x0, (const mfloat8_t *) 0),
+		p0 = svwhilerw (x0, (const mfloat8_t *) 0))
+
+/*
+** whilerw_rc_mf8:
+**	mov	(x[0-9]+), #?1073741824
+**	whilerw	p0\.b, x0, \1
+**	ret
+*/
+TEST_COMPARE_S (whilerw_rc_mf8, const mfloat8_t *,
+		p0 = svwhilerw_mf8 (x0, (const mfloat8_t *) 1073741824),
+		p0 = svwhilerw (x0, (const mfloat8_t *) 1073741824))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilewr_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilewr_mf8.c
new file mode 100644
index 00000000000..c4023a2fbff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/whilewr_mf8.c
@@ -0,0 +1,50 @@
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" { target { ! ilp32 } } } } */
+
+#include "test_sve_acle.h"
+
+/*
+** whilewr_rr_mf8:
+**	whilewr	p0\.b, x0, x1
+**	ret
+*/
+TEST_COMPARE_S (whilewr_rr_mf8, const mfloat8_t *,
+		p0 = svwhilewr_mf8 (x0, x1),
+		p0 = svwhilewr (x0, x1))
+
+/*
+** whilewr_0r_mf8:
+**	whilewr	p0\.b, xzr, x1
+**	ret
+*/
+TEST_COMPARE_S (whilewr_0r_mf8, const mfloat8_t *,
+		p0 = svwhilewr_mf8 ((const mfloat8_t *) 0, x1),
+		p0 = svwhilewr ((const mfloat8_t *) 0, x1))
+
+/*
+** whilewr_cr_mf8:
+**	mov	(x[0-9]+), #?1073741824
+**	whilewr	p0\.b, \1, x1
+**	ret
+*/
+TEST_COMPARE_S (whilewr_cr_mf8, const mfloat8_t *,
+		p0 = svwhilewr_mf8 ((const mfloat8_t *) 1073741824, x1),
+		p0 = svwhilewr ((const mfloat8_t *) 1073741824, x1))
+
+/*
+** whilewr_r0_mf8:
+**	whilewr	p0\.b, x0, xzr
+**	ret
+*/
+TEST_COMPARE_S (whilewr_r0_mf8, const mfloat8_t *,
+		p0 = svwhilewr_mf8 (x0, (const mfloat8_t *) 0),
+		p0 = svwhilewr (x0, (const mfloat8_t *) 0))
+
+/*
+** whilewr_rc_mf8:
+**	mov	(x[0-9]+), #?1073741824
+**	whilewr	p0\.b, x0, \1
+**	ret
+*/
+TEST_COMPARE_S (whilewr_rc_mf8, const mfloat8_t *,
+		p0 = svwhilewr_mf8 (x0, (const mfloat8_t *) 1073741824),
+		p0 = svwhilewr (x0, (const mfloat8_t *) 1073741824))

From patchwork Thu Nov 28 21:12:31 2024
X-Patchwork-Submitter: Claudio Bantaloukas
X-Patchwork-Id: 102050
From: Claudio Bantaloukas
To:
CC: Claudio Bantaloukas
Subject: [PATCH v5 2/5] aarch64: specify fpm mode in function instances and groups
Date: Thu, 28 Nov 2024 21:12:31 +0000
Message-ID: <20241128211234.1714776-3-claudio.bantaloukas@arm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
References: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
MIME-Version: 1.0
Some intrinsics require setting the fpm register before emitting the
specific asm opcode required.  In order to simplify review, this patch:
- adds the fpm_mode_index attribute to function_group_info and
  function_instance objects
- updates existing initialisations and call sites
- updates equality and hash operations

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl): Specify
	FPM_unused when folding.
	(svmul_impl): Likewise.
	* config/aarch64/aarch64-sve-builtins-shapes.cc (build_one): Use the
	group fpm_mode when creating function instances.
	* config/aarch64/aarch64-sve-builtins-sve2.cc (svaba_impl,
	svqrshl_impl, svqshl_impl, svrshl_impl, svsra_impl): Specify
	FPM_unused when folding.
	* config/aarch64/aarch64-sve-builtins.cc (function_groups): Set
	fpm_mode on all elements.
	(neon_sve_function_groups, sme_function_groups): Likewise.
	(function_instance::hash): Include fpm_mode in hash.
	(function_builder::add_overloaded_functions): Use the group fpm
	mode.
	(function_resolver::lookup_form): Use the function instance fpm_mode
	when looking up a function.
	* config/aarch64/aarch64-sve-builtins.def
	(DEF_SVE_FUNCTION_GS_FPM): New define.
	(DEF_SVE_FUNCTION_GS): Redefine against DEF_SVE_FUNCTION_GS_FPM.
	* config/aarch64/aarch64-sve-builtins.h (fpm_mode_index): New.
	(function_group_info): Add fpm_mode.
	(function_instance): Likewise.
	(function_instance::operator==): Handle fpm_mode.
---
 .../aarch64/aarch64-sve-builtins-base.cc      | 21 ++++++++-------
 .../aarch64/aarch64-sve-builtins-shapes.cc    |  4 +--
 .../aarch64/aarch64-sve-builtins-sve2.cc      | 27 +++++++++++--------
 gcc/config/aarch64/aarch64-sve-builtins.cc    | 21 +++++++++------
 gcc/config/aarch64/aarch64-sve-builtins.def   |  8 +++++-
 gcc/config/aarch64/aarch64-sve-builtins.h     | 27 ++++++++++++++-----
 6 files changed, 71 insertions(+), 37 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 87e9909b55a..95e66dc2adf 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -775,9 +775,9 @@ public:
     tree pg = gimple_call_arg (f.call, 0);
     if (!f.type_suffix (0).unsigned_p && integer_minus_onep (op2))
       {
-	function_instance instance ("svneg", functions::svneg,
-				    shapes::unary, MODE_none,
-				    f.type_suffix_ids, GROUP_none, f.pred);
+	function_instance instance ("svneg", functions::svneg, shapes::unary,
+				    MODE_none, f.type_suffix_ids, GROUP_none,
+				    f.pred, FPM_unused);
 	gcall *call = f.redirect_call (instance);
 	unsigned offset_index = 0;
 	if (f.pred == PRED_m)
@@ -805,7 +805,8 @@ public:
       {
 	function_instance instance ("svlsr", functions::svlsr,
 				    shapes::binary_uint_opt_n, MODE_n,
-				    f.type_suffix_ids, GROUP_none, f.pred);
+				    f.type_suffix_ids, GROUP_none, f.pred,
+				    FPM_unused);
 	call = f.redirect_call (instance);
 	tree d = INTEGRAL_TYPE_P (TREE_TYPE (op2)) ? op2 : op2_cst;
 	new_divisor = wide_int_to_tree (TREE_TYPE (d), tree_log2 (d));
@@ -818,7 +819,8 @@ public:
 
 	function_instance instance ("svasrd", functions::svasrd,
 				    shapes::shift_right_imm, MODE_n,
-				    f.type_suffix_ids, GROUP_none, f.pred);
+				    f.type_suffix_ids, GROUP_none, f.pred,
+				    FPM_unused);
 	call = f.redirect_call (instance);
 	new_divisor = wide_int_to_tree (scalar_types[VECTOR_TYPE_svuint64_t],
 					tree_log2 (op2_cst));
@@ -2100,9 +2102,9 @@ public:
 	negated_op = op2;
       if (!f.type_suffix (0).unsigned_p && negated_op)
 	{
-	  function_instance instance ("svneg", functions::svneg,
-				      shapes::unary, MODE_none,
-				      f.type_suffix_ids, GROUP_none, f.pred);
+	  function_instance instance ("svneg", functions::svneg, shapes::unary,
+				      MODE_none, f.type_suffix_ids, GROUP_none,
+				      f.pred, FPM_unused);
 	  gcall *call = f.redirect_call (instance);
 	  unsigned offset_index = 0;
 	  if (f.pred == PRED_m)
@@ -2143,7 +2145,8 @@ public:
 				       tree_log2 (shift_op2));
 	  function_instance instance ("svlsl", functions::svlsl,
 				      shapes::binary_uint_opt_n, MODE_n,
-				      f.type_suffix_ids, GROUP_none, f.pred);
+				      f.type_suffix_ids, GROUP_none, f.pred,
+				      FPM_unused);
 	  gcall *call = f.redirect_call (instance);
 	  gimple_call_set_arg (call, 1, shift_op1);
 	  gimple_call_set_arg (call, 2, shift_op2);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index 371507513c3..ebe2e581728 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -349,8 +349,8 @@ build_one (function_builder &b, const char *signature,
   /* Byte forms of svdupq take 16 arguments.  */
   auto_vec<tree, 16> argument_types;
   function_instance instance (group.base_name, *group.base, *group.shape,
-			      mode_suffix_id, group.types[ti],
-			      group.groups[gi], group.preds[pi]);
+			      mode_suffix_id, group.types[ti], group.groups[gi],
+			      group.preds[pi], group.fpm_mode);
   tree return_type = parse_signature (instance, signature, argument_types);
   apply_predication (instance, return_type, argument_types);
   b.add_unique_function (instance, return_type, argument_types,
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
index b17b78dadd5..6bfc62bdce6 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
@@ -126,9 +126,9 @@ public:
     tree op1 = gimple_call_arg (f.call, 0);
     if (!integer_zerop (op1))
       return NULL;
-    function_instance instance ("svabd", functions::svabd,
-				shapes::binary_opt_n, f.mode_suffix_id,
-				f.type_suffix_ids, GROUP_none, PRED_x);
+    function_instance instance ("svabd", functions::svabd, shapes::binary_opt_n,
+				f.mode_suffix_id, f.type_suffix_ids, GROUP_none,
+				PRED_x, FPM_unused);
     gcall *call = f.redirect_call (instance);
     /* Add a ptrue as predicate, because unlike svaba, svabd is
        predicated.  */
@@ -512,7 +512,8 @@ public:
 	       that we can use for sensible shift amounts.  */
 	    function_instance instance ("svqshl", functions::svqshl,
 					shapes::binary_int_opt_n, MODE_n,
-					f.type_suffix_ids, GROUP_none, f.pred);
+					f.type_suffix_ids, GROUP_none, f.pred,
+					FPM_unused);
 	    return f.redirect_call (instance);
 	  }
 	else
@@ -520,9 +521,9 @@ public:
 	    /* The saturation has no effect, and [SU]RSHL has immediate forms
 	       that we can use for sensible shift amounts.  */
 	    function_instance instance ("svrshl", functions::svrshl,
-					shapes::binary_int_opt_single_n,
-					MODE_n, f.type_suffix_ids, GROUP_none,
-					f.pred);
+					shapes::binary_int_opt_single_n, MODE_n,
+					f.type_suffix_ids, GROUP_none, f.pred,
+					FPM_unused);
 	    return f.redirect_call (instance);
 	  }
       }
@@ -551,7 +552,8 @@ public:
 					   -wi::to_wide (amount));
 	    function_instance instance ("svasr", functions::svasr,
 					shapes::binary_uint_opt_n, MODE_n,
-					f.type_suffix_ids, GROUP_none, f.pred);
+					f.type_suffix_ids, GROUP_none, f.pred,
+					FPM_unused);
 	    if (f.type_suffix (0).unsigned_p)
 	      {
 		instance.base_name = "svlsr";
@@ -586,7 +588,8 @@ public:
 	       that we can use for sensible shift amounts.  */
 	    function_instance instance ("svlsl", functions::svlsl,
 					shapes::binary_uint_opt_n, MODE_n,
-					f.type_suffix_ids, GROUP_none, f.pred);
+					f.type_suffix_ids, GROUP_none, f.pred,
+					FPM_unused);
 	    gcall *call = f.redirect_call (instance);
 	    gimple_call_set_arg (call, 2, amount);
 	    return call;
@@ -599,7 +602,8 @@ public:
 					   -wi::to_wide (amount));
 	    function_instance instance ("svrshr", functions::svrshr,
 					shapes::shift_right_imm, MODE_n,
-					f.type_suffix_ids, GROUP_none, f.pred);
+					f.type_suffix_ids, GROUP_none, f.pred,
+					FPM_unused);
 	    gcall *call = f.redirect_call (instance);
 	    gimple_call_set_arg (call, 2, amount);
 	    return call;
@@ -635,7 +639,8 @@ public:
       return NULL;
     function_instance instance ("svlsr", functions::svlsr,
 				shapes::binary_uint_opt_n, MODE_n,
-				f.type_suffix_ids, GROUP_none, PRED_x);
+				f.type_suffix_ids, GROUP_none, PRED_x,
+				FPM_unused);
     if (!f.type_suffix (0).unsigned_p)
       {
 	instance.base_name = "svasr";
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 4596404f8a0..66320be9adc 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -933,9 +933,10 @@ static const predication_index preds_za_m[] = { PRED_za_m, NUM_PREDS };
 
 /* A list of all arm_sve.h functions.  */
 static CONSTEXPR const function_group_info function_groups[] = {
-#define DEF_SVE_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \
+#define DEF_SVE_FUNCTION_GS_FPM(NAME, SHAPE, TYPES, GROUPS, PREDS, FPM_MODE) \
   { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_##GROUPS, \
-    preds_##PREDS, aarch64_required_extensions::REQUIRED_EXTENSIONS },
+    preds_##PREDS, aarch64_required_extensions::REQUIRED_EXTENSIONS, \
+    FPM_##FPM_MODE },
 #include "aarch64-sve-builtins.def"
 };
 
@@ -943,7 +944,8 @@ static CONSTEXPR const function_group_info neon_sve_function_groups[] = {
 #define DEF_NEON_SVE_FUNCTION(NAME, SHAPE, TYPES, GROUPS, PREDS) \
   { #NAME, &neon_sve_bridge_functions::NAME, &shapes::SHAPE, types_##TYPES, \
-    groups_##GROUPS, preds_##PREDS, aarch64_required_extensions::ssve (0) },
+    groups_##GROUPS, preds_##PREDS, aarch64_required_extensions::ssve (0), \
+    FPM_unused },
 #include "aarch64-neon-sve-bridge-builtins.def"
 };
 
@@ -951,12 +953,13 @@ static CONSTEXPR const function_group_info sme_function_groups[] = {
 #define DEF_SME_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \
   { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_##GROUPS, \
-    preds_##PREDS, aarch64_required_extensions::REQUIRED_EXTENSIONS },
+    preds_##PREDS, aarch64_required_extensions::REQUIRED_EXTENSIONS, \
+    FPM_unused },
 #define DEF_SME_ZA_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \
   { #NAME, &functions::NAME##_za, &shapes::SHAPE, types_##TYPES, \
     groups_##GROUPS, preds_##PREDS, \
     aarch64_required_extensions::REQUIRED_EXTENSIONS \
-      .and_also (AARCH64_FL_ZA_ON) },
+      .and_also (AARCH64_FL_ZA_ON), FPM_unused },
 #include "aarch64-sve-builtins-sme.def"
 };
 
@@ -1238,6 +1241,7 @@ function_instance::hash () const
   h.add_int (type_suffix_ids[1]);
   h.add_int (group_suffix_id);
   h.add_int (pred);
+  h.add_int (fpm_mode);
   return h.end ();
 }
 
@@ -1668,7 +1672,8 @@ function_builder::add_overloaded_functions (const function_group_info &group,
     {
       function_instance instance (group.base_name, *group.base,
 				  *group.shape, mode, types,
-				  group_suffix_id, group.preds[pi]);
+				  group_suffix_id, group.preds[pi],
+				  group.fpm_mode);
       add_overloaded_function (instance, group.required_extensions);
     };
 
@@ -1845,8 +1850,8 @@ function_resolver::lookup_form (mode_suffix_index mode,
 				group_suffix_index group)
 {
   type_suffix_pair types = { type0, type1 };
-  function_instance instance (base_name, base, shape, mode, types,
-			      group, pred);
+  function_instance instance (base_name, base, shape, mode, types, group, pred,
+			      fpm_mode);
   registered_function *rfn
     = function_table->find_with_hash (instance, instance.hash ());
   return rfn ? rfn->decl : NULL_TREE;
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def
index 47c396b866d..252c126dd39 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins.def
@@ -37,8 +37,13 @@
 #define DEF_SVE_GROUP_SUFFIX(A, B, C)
 #endif
 
+#ifndef DEF_SVE_FUNCTION_GS_FPM
+#define DEF_SVE_FUNCTION_GS_FPM(A, B, C, D, E, F)
+#endif
+
 #ifndef DEF_SVE_FUNCTION_GS
-#define DEF_SVE_FUNCTION_GS(A, B, C, D, E)
+#define DEF_SVE_FUNCTION_GS(A, B, C, D, E) \
+  DEF_SVE_FUNCTION_GS_FPM(A, B, C, D, E, unused)
 #endif
 
 #ifndef DEF_SVE_NEON_TYPE_SUFFIX
@@ -164,6 +169,7 @@ DEF_SVE_GROUP_SUFFIX (vg4x4, 4, 4)
 
 #undef DEF_SVE_FUNCTION
 #undef DEF_SVE_FUNCTION_GS
+#undef DEF_SVE_FUNCTION_GS_FPM
 #undef DEF_SVE_GROUP_SUFFIX
 #undef DEF_SME_ZA_SUFFIX
 #undef DEF_SVE_NEON_TYPE_SUFFIX
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index d209aebe96e..417960cafe9 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -28,6 +28,7 @@
    - the "mode" suffix ("_n", "_index", etc.)
    - the type suffixes ("_s32", "_b8", etc.)
    - the predication suffix ("_x", "_z", etc.)
+   - the "_fpm" suffix when the floating point mode register is set
 
    Each piece of information is individually useful, so we retain
    this classification throughout:
@@ -42,6 +43,8 @@
    - prediction_index extends the predication suffix with an additional
      alternative: PRED_implicit for implicitly-predicated operations
 
+   - fpm_mode represents whether the fpm register is set or not
+
    In addition to its unique full name, a function may have a shorter
    overloaded alias.  This alias removes pieces of the suffixes that
    can be inferred from the arguments, such as by shortening the mode
@@ -164,6 +167,14 @@ enum predication_index
   NUM_PREDS
 };
 
+/* Classifies intrinsics on whether they set the FPM register */
+enum fpm_mode_index
+{
+  FPM_unused,
+  FPM_set,
+  NUM_FPM_MODES
+};
+
 /* Classifies element types, based on type suffixes with the bit count
    removed.  "count" isn't really an element type, but we pretend it is
    for consistency.  */
@@ -366,6 +377,9 @@ struct function_group_info
 
   /* The architecture extensions that the functions require.  */
   aarch64_required_extensions required_extensions;
+
+  /* Whether the floating point register is set */
+  fpm_mode_index fpm_mode;
 };
 
 /* Describes a single fully-resolved function (i.e. one that has a
@@ -376,7 +390,7 @@ public:
   function_instance (const char *, const function_base *,
		      const function_shape *, mode_suffix_index,
		      const type_suffix_pair &, group_suffix_index,
-		     predication_index);
+		     predication_index, fpm_mode_index);
 
   bool operator== (const function_instance &) const;
   bool operator!= (const function_instance &) const;
@@ -420,6 +434,7 @@ public:
   type_suffix_pair type_suffix_ids;
   group_suffix_index group_suffix_id;
   predication_index pred;
+  fpm_mode_index fpm_mode;
 };
 
 class registered_function;
@@ -876,16 +891,15 @@ tuple_type_field (tree type)
 }
 
 inline function_instance::
-function_instance (const char *base_name_in,
-		   const function_base *base_in,
+function_instance (const char *base_name_in, const function_base *base_in,
		    const function_shape *shape_in,
		    mode_suffix_index mode_suffix_id_in,
		    const type_suffix_pair &type_suffix_ids_in,
		    group_suffix_index group_suffix_id_in,
-		   predication_index pred_in)
+		   predication_index pred_in, fpm_mode_index fpm_mode_in)
   : base_name (base_name_in), base (base_in), shape (shape_in),
     mode_suffix_id (mode_suffix_id_in), group_suffix_id (group_suffix_id_in),
-    pred (pred_in)
+    pred (pred_in), fpm_mode (fpm_mode_in)
 {
   memcpy (type_suffix_ids, type_suffix_ids_in, sizeof (type_suffix_ids));
 }
 
@@ -899,7 +913,8 @@ function_instance::operator== (const function_instance &other) const
	  && type_suffix_ids[0] == other.type_suffix_ids[0]
	  && type_suffix_ids[1] == other.type_suffix_ids[1]
	  && group_suffix_id == other.group_suffix_id
-	  && pred == other.pred);
+	  && pred == other.pred
+	  && fpm_mode == other.fpm_mode);
 }
 
 inline bool

From patchwork Thu Nov 28 21:12:32 2024
X-Patchwork-Submitter: Claudio Bantaloukas
X-Patchwork-Id: 102048
From: Claudio Bantaloukas
Subject: [PATCH v5 3/5] aarch64: add svcvt* FP8 intrinsics
Date: Thu, 28 Nov 2024 21:12:32 +0000
Message-ID: <20241128211234.1714776-4-claudio.bantaloukas@arm.com>
In-Reply-To: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
References: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
List-Subscribe: , Errors-To: gcc-patches-bounces~patchwork=sourceware.org@gcc.gnu.org This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_x2]_fpm - svcvtnt_mf8[_f32_x2]_fpm The underlying instructions are only available when SVE2 is enabled and the PE is not in streaming SVE mode. They are also available when SME2 is enabled and the PE is in streaming SVE mode. gcc/ * config/aarch64/aarch64-sve-builtins-shapes.cc (parse_signature): Add an fpm_t (uint64_t) argument to functions that set the fpm register. (unary_convertxn_narrowt_def): New class. (unary_convertxn_narrowt): New shape. (unary_convertxn_narrow_def): New class. (unary_convertxn_narrow): New shape. * config/aarch64/aarch64-sve-builtins-shapes.h (unary_convertxn_narrowt): Declare. (unary_convertxn_narrow): Likewise. * config/aarch64/aarch64-sve-builtins-sve2.cc (svcvt_fp8_impl): New class. (svcvtn_impl): Handle fp8 cases. (svcvt1, svcvt2, svcvtlt1, svcvtlt2): Add new FUNCTION. (svcvtnb): Likewise. * config/aarch64/aarch64-sve-builtins-sve2.def (svcvt1, svcvt2, svcvtlt1, svcvtlt2): Add new DEF_SVE_FUNCTION_GS_FPM. (svcvtn): Likewise. (svcvtnb, svcvtnt): Likewise. * config/aarch64/aarch64-sve-builtins-sve2.h (svcvt1, svcvt2, svcvtlt1, svcvtlt2, svcvtnb, svcvtnt): Declare. * config/aarch64/aarch64-sve-builtins.cc (TYPES_cvt_mf8, TYPES_cvtn_mf8, TYPES_cvtnx_mf8): Add new types arrays. (function_builder::get_name): Append _fpm to functions that set fpmr. (function_resolver::check_gp_argument): Deal with the fpm_t argument. (function_expander::expand): Set the fpm register before calling the insn if the function warrants it. * config/aarch64/aarch64-sve2.md (@aarch64_sve2_fp8_cvt): Add new. (@aarch64_sve2_fp8_cvtn): Likewise. (@aarch64_sve2_fp8_cvtnb): Likewise. 
(@aarch64_sve_cvtnt): Likewise. * config/aarch64/aarch64.h (TARGET_SSVE_FP8): Add new. * config/aarch64/iterators.md (VNx8SF_ONLY, SVE_FULL_HFx2): New mode iterators. (UNSPEC_F1CVT, UNSPEC_F1CVTLT, UNSPEC_F2CVT, UNSPEC_F2CVTLT): Add new. (UNSPEC_FCVTNB, UNSPEC_FCVTNT): Likewise. (UNSPEC_FP8FCVTN): Likewise. (FP8CVT_UNS, fp8_cvt_uns_op): Likewise. gcc/testsuite/ * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h (TEST_DUAL_Z): Add fpm0 argument * gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c: Add new tests. * gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrowt_1.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cvtlt_mf8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cvtn_mf8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cvtnb_mf8.c: Likewise. * gcc.target/aarch64/sve2/acle/asm/cvtnt_mf8.c: Likewise. * lib/target-supports.exp: Add aarch64_asm_fp8_ok check. --- .../aarch64/aarch64-sve-builtins-shapes.cc | 78 +++++++++++++++++++ .../aarch64/aarch64-sve-builtins-shapes.h | 2 + .../aarch64/aarch64-sve-builtins-sve2.cc | 28 ++++++- .../aarch64/aarch64-sve-builtins-sve2.def | 12 +++ .../aarch64/aarch64-sve-builtins-sve2.h | 6 ++ gcc/config/aarch64/aarch64-sve-builtins.cc | 31 +++++++- gcc/config/aarch64/aarch64-sve2.md | 51 ++++++++++++ gcc/config/aarch64/aarch64.h | 5 ++ gcc/config/aarch64/iterators.md | 24 ++++++ .../aarch64/sve/acle/asm/test_sve_acle.h | 2 +- .../acle/general-c/unary_convertxn_narrow_1.c | 60 ++++++++++++++ .../general-c/unary_convertxn_narrowt_1.c | 38 +++++++++ .../aarch64/sve2/acle/asm/cvt_mf8.c | 48 ++++++++++++ .../aarch64/sve2/acle/asm/cvtlt_mf8.c | 50 ++++++++++++ .../aarch64/sve2/acle/asm/cvtn_mf8.c | 30 +++++++ .../aarch64/sve2/acle/asm/cvtnb_mf8.c | 20 +++++ .../aarch64/sve2/acle/asm/cvtnt_mf8.c | 31 ++++++++ gcc/testsuite/lib/target-supports.exp | 2 +- 18 files changed, 513 insertions(+), 5 deletions(-) create mode 100644 
gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrowt_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtlt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtn_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_mf8.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index ebe2e581728..62831b3c1e2 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -325,6 +325,8 @@ parse_signature (const function_instance &instance, const char *format,
       argument_types.quick_push (argument_type);
     }
   gcc_assert (format[0] == 0);
+  if (instance.fpm_mode == FPM_set)
+    argument_types.quick_push (get_typenode_from_name (UINT64_TYPE));
   return return_type;
 }
 
@@ -4596,6 +4598,46 @@ struct unary_convert_narrowt_def : public overloaded_base<1>
 };
 SHAPE (unary_convert_narrowt)
 
+/* sv<t0>_t svfoo_t0[_t1_g](sv<t0>_t, sv<t1>x<g>_t, fpm_t)
+
+   where the target type <t0> must be specified explicitly but the
+   source type <t1> can be inferred.  */
+struct unary_convertxn_narrowt_def : public overloaded_base<1>
+{
+  bool
+  explicit_group_suffix_p () const override
+  {
+    return false;
+  }
+
+  bool
+  has_merge_argument_p (const function_instance &, unsigned int) const override
+  {
+    return true;
+  }
+
+  void
+  build (function_builder &b, const function_group_info &group) const override
+  {
+    b.add_overloaded_functions (group, MODE_none);
+    build_all (b, "v0,v0,t1", group, MODE_none);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    gcc_assert (r.fpm_mode == FPM_set);
+    sve_type type;
+    if (!r.check_num_arguments (3)
+	|| !(type = r.infer_sve_type (1))
+	|| !r.require_scalar_type (2, "uint64_t"))
+      return error_mark_node;
+
+    return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (unary_convertxn_narrowt)
+
 /* sv<t0>x<g>_t svfoo_t0[_t1_g](sv<t1>x<g>_t)
 
    where the target type <t0> must be specified explicitly but the
    source type <t1> can be inferred.  */
@@ -4628,6 +4670,42 @@ struct unary_convertxn_def : public unary_convert_def
 };
 SHAPE (unary_convertxn)
 
+/* sv<t0>_t svfoo_t0[_t1_g](sv<t1>x<g>_t, fpm_t)
+
+   where the target type <t0> must be specified explicitly but the
+   source type <t1> can be inferred.
+
+   Functions with a group suffix are unpredicated.  */
+struct unary_convertxn_narrow_def : public unary_convert_def
+{
+  bool
+  explicit_group_suffix_p () const override
+  {
+    return false;
+  }
+
+  void
+  build (function_builder &b, const function_group_info &group) const override
+  {
+    b.add_overloaded_functions (group, MODE_none);
+    build_all (b, "v0,t1", group, MODE_none);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    gcc_assert (r.fpm_mode == FPM_set);
+    sve_type type;
+    if (!r.check_num_arguments (2)
+	|| !(type = r.infer_sve_type (0))
+	|| !r.require_scalar_type (1, "uint64_t"))
+      return error_mark_node;
+
+    return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (unary_convertxn_narrow)
+
 /* sv<t0>_t svfoo_<t0>(sv<t0>_t, uint64_t)
 
    where the final argument is an integer constant expression in the
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
index e1d661c5a46..dc3d4557288 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
@@ -229,7 +229,9 @@ namespace aarch64_sve
   extern const function_shape *const unary;
   extern const function_shape *const unary_convert;
   extern const function_shape *const unary_convert_narrowt;
+  extern const function_shape *const unary_convertxn_narrowt;
   extern const function_shape *const unary_convertxn;
+  extern const function_shape *const unary_convertxn_narrow;
   extern const function_shape *const unary_lane;
   extern const function_shape *const unary_long;
   extern const function_shape *const unary_n;
diff --git
a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
index 6bfc62bdce6..1a1d2c4c6ec 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
@@ -221,13 +221,34 @@ public:
   }
 };
 
+class svcvt_fp8_impl : public function_base
+{
+public:
+  CONSTEXPR
+  svcvt_fp8_impl (int unspec) : m_unspec (unspec) {}
+
+  rtx
+  expand (function_expander &e) const override
+  {
+    auto icode = code_for_aarch64_sve2_fp8_cvt (m_unspec, e.result_mode ());
+    return e.use_exact_insn (icode);
+  }
+
+  int m_unspec;
+};
+
 class svcvtn_impl : public function_base
 {
 public:
   rtx
   expand (function_expander &e) const override
   {
-    return e.use_exact_insn (code_for_aarch64_sve_cvtn (e.result_mode ()));
+    insn_code icode;
+    if (e.fpm_mode == FPM_set)
+      icode = code_for_aarch64_sve2_fp8_cvtn (GET_MODE (e.args[0]));
+    else
+      icode = code_for_aarch64_sve_cvtn (e.result_mode ());
+    return e.use_exact_insn (icode);
   }
 };
 
@@ -922,9 +943,14 @@ FUNCTION (svbsl2n, CODE_FOR_MODE0 (aarch64_sve2_bsl2n),)
 FUNCTION (svcdot, svcdot_impl,)
 FUNCTION (svcdot_lane, svcdot_lane_impl,)
 FUNCTION (svclamp, svclamp_impl,)
+FUNCTION (svcvt1, svcvt_fp8_impl, (UNSPEC_F1CVT))
+FUNCTION (svcvt2, svcvt_fp8_impl, (UNSPEC_F2CVT))
+FUNCTION (svcvtlt1, svcvt_fp8_impl, (UNSPEC_F1CVTLT))
+FUNCTION (svcvtlt2, svcvt_fp8_impl, (UNSPEC_F2CVTLT))
 FUNCTION (svcvtlt, unspec_based_function, (-1, -1, UNSPEC_COND_FCVTLT))
 FUNCTION (svcvtl, svcvtl_impl,)
 FUNCTION (svcvtn, svcvtn_impl,)
+FUNCTION (svcvtnb, fixed_insn_function, (CODE_FOR_aarch64_sve2_fp8_cvtnbvnx16qi))
 FUNCTION (svcvtx, unspec_based_function, (-1, -1, UNSPEC_COND_FCVTX))
 FUNCTION (svcvtxnt, CODE_FOR_MODE1 (aarch64_sve2_cvtxnt),)
 FUNCTION (svdup_laneq, svdup_laneq_impl,)
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
index 2189855d705..8a63998fcc6 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
@@ -367,3 +367,15 @@ DEF_SVE_FUNCTION_GS (svmaxnm, binary_opt_single_n, h_bfloat, x24, none)
 DEF_SVE_FUNCTION_GS (svmin, binary_opt_single_n, h_bfloat, x24, none)
 DEF_SVE_FUNCTION_GS (svminnm, binary_opt_single_n, h_bfloat, x24, none)
 #undef REQUIRED_EXTENSIONS
+
+#define REQUIRED_EXTENSIONS \
+  sve_and_sme (AARCH64_FL_SVE2 | AARCH64_FL_FP8, \
+	       AARCH64_FL_SME2 | AARCH64_FL_FP8)
+DEF_SVE_FUNCTION_GS_FPM (svcvt1, unary_convert, cvt_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svcvt2, unary_convert, cvt_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svcvtlt1, unary_convert, cvt_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svcvtlt2, unary_convert, cvt_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svcvtn, unary_convertxn_narrow, cvtn_mf8, x2, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svcvtnb, unary_convertxn_narrow, cvtnx_mf8, x2, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svcvtnt, unary_convertxn_narrowt, cvtnx_mf8, x2, none, set)
+#undef REQUIRED_EXTENSIONS
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h
index bfe3d170e70..d26751e8042 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h
@@ -62,8 +62,14 @@ namespace aarch64_sve
   extern const function_base *const svclamp;
   extern const function_base *const svcntp;
   extern const function_base *const svcvtl;
+  extern const function_base *const svcvt1;
+  extern const function_base *const svcvt2;
+  extern const function_base *const svcvtlt1;
+  extern const function_base *const svcvtlt2;
   extern const function_base *const svcvtlt;
   extern const function_base *const svcvtn;
+  extern const function_base *const svcvtnb;
+  extern const function_base *const svcvtnt;
   extern const function_base *const svcvtx;
   extern const function_base *const svcvtxnt;
   extern const function_base *const svdup_laneq;
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 66320be9adc..4201ece9d59 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -481,6 +481,20 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
   D (f32, s32), \
   D (f32, u32)
 
+/* _f16_mf8
+   _bf16_mf8.  */
+#define TYPES_cvt_mf8(S, D) \
+  D (f16, mf8), D (bf16, mf8)
+
+/* _mf8_f16
+   _mf8_bf16.  */
+#define TYPES_cvtn_mf8(S, D) \
+  D (mf8, f16), D (mf8, bf16)
+
+/* _mf8_f32.  */
+#define TYPES_cvtnx_mf8(S, D) \
+  D (mf8, f32)
+
 /* { _s32 _s64 } x { _b8 _b16 _b32 _b64 }
    { _u32 _u64 }.  */
 #define TYPES_inc_dec_n1(D, A) \
@@ -793,9 +807,12 @@ DEF_SVE_TYPES_ARRAY (cvt_bfloat);
 DEF_SVE_TYPES_ARRAY (cvt_h_s_float);
 DEF_SVE_TYPES_ARRAY (cvt_f32_f16);
 DEF_SVE_TYPES_ARRAY (cvt_long);
+DEF_SVE_TYPES_ARRAY (cvt_mf8);
 DEF_SVE_TYPES_ARRAY (cvt_narrow_s);
 DEF_SVE_TYPES_ARRAY (cvt_narrow);
 DEF_SVE_TYPES_ARRAY (cvt_s_s);
+DEF_SVE_TYPES_ARRAY (cvtn_mf8);
+DEF_SVE_TYPES_ARRAY (cvtnx_mf8);
 DEF_SVE_TYPES_ARRAY (inc_dec_n);
 DEF_SVE_TYPES_ARRAY (qcvt_x2);
 DEF_SVE_TYPES_ARRAY (qcvt_x4);
@@ -1428,6 +1445,8 @@ function_builder::get_name (const function_instance &instance,
   if (!overloaded_p || instance.shape->explicit_group_suffix_p ())
     append_name (instance.group_suffix ().string);
   append_name (pred_suffixes[instance.pred]);
+  if (instance.fpm_mode == FPM_set)
+    append_name ("_fpm");
   return finish_name ();
 }
 
@@ -3063,11 +3082,12 @@ function_resolver::check_gp_argument (unsigned int nops,
 {
   gcc_assert (pred != PRED_za_m);
   i = 0;
+  unsigned int nfpm_args = (fpm_mode == FPM_set) ? 1 : 0;
   if (pred != PRED_none)
     {
       /* Unary merge operations should use resolve_unary instead.  */
       gcc_assert (!shape->has_merge_argument_p (*this, nops));
-      nargs = nops + 1;
+      nargs = nops + nfpm_args + 1;
       if (!check_num_arguments (nargs)
	  || !require_vector_type (i, gp_type_index ()))
	return false;
@@ -3075,7 +3095,7 @@ function_resolver::check_gp_argument (unsigned int nops,
     }
   else
     {
-      nargs = nops;
+      nargs = nops + nfpm_args;
       if (!check_num_arguments (nargs))
	return false;
     }
@@ -4512,6 +4532,13 @@ function_expander::expand ()
   for (unsigned int i = 0; i < nargs; ++i)
     args.quick_push (expand_normal (CALL_EXPR_ARG (call_expr, i)));
 
+  if (fpm_mode == FPM_set)
+    {
+      /* The last argument of these functions is always an fpm_t that must
+	 be written to FPMR before the call to the instruction itself.  */
+      gcc_assert (args.last ()->mode == DImode);
+      emit_move_insn (gen_rtx_REG (DImode, FPM_REGNUM), args.last ());
+    }
   return base->expand (*this);
 }
 
diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index 66affa85d36..e5bd2861b48 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -2936,6 +2936,14 @@ (define_insn "@aarch64__lane_"
 ;; -------------------------------------------------------------------------
 ;; ---- [FP<-FP] Widening conversions
 ;; -------------------------------------------------------------------------
 ;; Includes:
+;; - BF1CVT
+;; - BF1CVTLT
+;; - BF2CVT
+;; - BF2CVTLT
+;; - F1CVT
+;; - F1CVTLT
+;; - F2CVT
+;; - F2CVTLT
 ;; - FCVTLT
 ;; -------------------------------------------------------------------------
 
@@ -3001,6 +3009,16 @@ (define_insn "*cond__strict"
   "\t%0., %1/m, %2."
 )
 
+(define_insn "@aarch64_sve2_fp8_cvt_<fp8_cvt_uns_op><mode>"
+  [(set (match_operand:SVE_FULL_HF 0 "register_operand" "=w")
+	(unspec:SVE_FULL_HF
+	  [(match_operand:VNx16QI 1 "register_operand" "w")
+	   (reg:DI FPM_REGNUM)]
+	  FP8CVT_UNS))]
+  "TARGET_SSVE_FP8"
+  "<fp8_cvt_uns_op>\t%0.h, %1.b"
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- [FP<-FP] Narrowing conversions
 ;; -------------------------------------------------------------------------
@@ -3150,6 +3168,8 @@ (define_insn "@aarch64_sve_cvtl"
 ;; - BFCVTN
 ;; - FCVT
 ;; - FCVTN
+;; - FCVTNB
+;; - FCVTNT
 ;; -------------------------------------------------------------------------
 
 (define_insn "truncvnx8sf2"
@@ -3169,6 +3189,37 @@ (define_insn "@aarch64_sve_cvtn"
   "fcvtn\t%0.h, %1"
 )
 
+(define_insn "@aarch64_sve2_fp8_cvtn<mode>"
+  [(set (match_operand:VNx16QI 0 "register_operand" "=w")
+	(unspec:VNx16QI
+	  [(match_operand:SVE_FULL_HFx2 1 "aligned_register_operand" "Uw2")
+	   (reg:DI FPM_REGNUM)]
+	  UNSPEC_FP8FCVTN))]
+  "TARGET_SSVE_FP8"
+  "fcvtn\t%0.b, %1"
+)
+
+(define_insn "@aarch64_sve2_fp8_cvtnb<mode>"
+  [(set (match_operand:VNx16QI_ONLY 0 "register_operand" "=w")
+	(unspec:VNx16QI_ONLY
+	  [(match_operand:VNx8SF 1 "aligned_register_operand" "Uw2")
+	   (reg:DI FPM_REGNUM)]
+	  UNSPEC_FCVTNB))]
+  "TARGET_SSVE_FP8"
+  "fcvtnb\t%0.b, %1"
+)
+
+(define_insn "@aarch64_sve_cvtnt<mode>"
+  [(set (match_operand:VNx16QI_ONLY 0 "register_operand" "=w")
+	(unspec:VNx16QI_ONLY
+	  [(match_operand:VNx16QI_ONLY 1 "register_operand" "0")
+	   (match_operand:VNx8SF 2 "aligned_register_operand" "Uw2")
+	   (reg:DI FPM_REGNUM)]
+	  UNSPEC_FCVTNT))]
+  "TARGET_SSVE_FP8"
+  "fcvtnt\t%0.b, %2"
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- [FP<-INT] Multi-vector conversions
 ;; -------------------------------------------------------------------------
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index b063c315fba..f43b1659db6 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -513,6 +513,11 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
 #define TARGET_SSVE_B16B16 \
   (AARCH64_HAVE_ISA (SVE_B16B16) && TARGET_SVE2_OR_SME2)
 
+/* Some fp8 instructions require +fp8 and one of +sve2 or +sme2.  */
+#define TARGET_SSVE_FP8 (TARGET_FP8 \
+			 && (TARGET_SVE2 || TARGET_STREAMING) \
+			 && (TARGET_SME2 || TARGET_NON_STREAMING))
+
 /* Standard register usage.  */
 
 /* 31 64-bit general purpose registers R0-R30:
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 023893d35f3..26716d593de 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -477,6 +477,9 @@ (define_mode_iterator SVE_FULL_BHSIx2 [VNx32QI VNx16HI VNx8SI])
 ;; Fully-packed SVE vector modes that have 16-bit float elements.
 (define_mode_iterator SVE_FULL_HF [VNx8BF VNx8HF])
 
+;; Pairs of the above.
+(define_mode_iterator SVE_FULL_HFx2 [VNx16BF VNx16HF])
+
 ;; Fully-packed SVE vector modes that have 16-bit, 32-bit or 64-bit elements.
 (define_mode_iterator SVE_FULL_HSD [VNx8HI VNx4SI VNx2DI
				    VNx8BF VNx8HF VNx4SF VNx2DF])
@@ -960,7 +963,13 @@ (define_c_enum "unspec"
     UNSPEC_COND_FLOGB	; Used in aarch64-sve2.md.
     UNSPEC_EORBT	; Used in aarch64-sve2.md.
     UNSPEC_EORTB	; Used in aarch64-sve2.md.
+    UNSPEC_F1CVT	; Used in aarch64-sve2.md.
+    UNSPEC_F1CVTLT	; Used in aarch64-sve2.md.
+    UNSPEC_F2CVT	; Used in aarch64-sve2.md.
+    UNSPEC_F2CVTLT	; Used in aarch64-sve2.md.
     UNSPEC_FADDP	; Used in aarch64-sve2.md.
+    UNSPEC_FCVTNB	; Used in aarch64-sve2.md.
+    UNSPEC_FCVTNT	; Used in aarch64-sve2.md.
     UNSPEC_FMAXNMP	; Used in aarch64-sve2.md.
     UNSPEC_FMAXP	; Used in aarch64-sve2.md.
     UNSPEC_FMINNMP	; Used in aarch64-sve2.md.
@@ -969,6 +978,7 @@ (define_c_enum "unspec"
     UNSPEC_FMLALT	; Used in aarch64-sve2.md.
     UNSPEC_FMLSLB	; Used in aarch64-sve2.md.
     UNSPEC_FMLSLT	; Used in aarch64-sve2.md.
+    UNSPEC_FP8FCVTN	; Used in aarch64-sve2.md.
     UNSPEC_HISTCNT	; Used in aarch64-sve2.md.
     UNSPEC_HISTSEG	; Used in aarch64-sve2.md.
    UNSPEC_LD1_COUNT	; Used in aarch64-sve2.md.
@@ -4731,3 +4741,17 @@ (define_int_attr faminmax_uns_op
 
 (define_code_attr faminmax_op
   [(smax "famax") (smin "famin")])
+
+;; Iterators and attributes for fp8 sve/sme conversions
+
+(define_int_iterator FP8CVT_UNS
+  [UNSPEC_F1CVT
+   UNSPEC_F2CVT
+   UNSPEC_F1CVTLT
+   UNSPEC_F2CVTLT])
+
+(define_int_attr fp8_cvt_uns_op
+  [(UNSPEC_F1CVT "f1cvt")
+   (UNSPEC_F2CVT "f2cvt")
+   (UNSPEC_F1CVTLT "f1cvtlt")
+   (UNSPEC_F2CVTLT "f2cvtlt")])
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
index e9112c02b3e..4a146c3e157 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
@@ -75,7 +75,7 @@
 #define TEST_DUAL_Z(NAME, TYPE1, TYPE2, CODE1, CODE2)		\
   PROTO (NAME, TYPE1, (TYPE1 z0, TYPE1 z1, TYPE1 z2, TYPE1 z3,	\
		       TYPE2 z4, TYPE2 z5, TYPE2 z6, TYPE2 z7,	\
-		       svbool_t p0, svbool_t p1))		\
+		       svbool_t p0, svbool_t p1, fpm_t fpm0))	\
   {								\
     INVOKE (CODE1, CODE2);					\
     return z0;							\
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c
new file mode 100644
index 00000000000..d312e857d81
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrow_1.c
@@ -0,0 +1,60 @@
+#include <arm_sve.h>
+
+#pragma GCC target "+sme2+fp8"
+
+void
+test (svfloat16x2_t f16x2, svbfloat16x2_t bf16x2, svfloat32x2_t f32x2,
+      svfloat16x3_t f16x3, svfloat16x4_t f16x4,
+      svfloat32x3_t f32x3, svfloat32x4_t f32x4,
+      fpm_t fpm0,
+      svbool_t pg, float f, svint8_t s8, svint32x2_t s32x2)
+  __arm_streaming
+{
+  svcvtn_mf8_fpm (f16x2, fpm0);
+  svcvtn_mf8_fpm (bf16x2, fpm0);
+
+  svcvtn_mf8_fpm (); /* { dg-error {too few arguments to function 'svcvtn_mf8_fpm'} } */
+
+  svcvtn_mf8_fpm (f16x2); /* { dg-error {too few arguments to function 'svcvtn_mf8_fpm'} } */
+  svcvtn_mf8_fpm (fpm0); /* { dg-error {too few arguments to function 'svcvtn_mf8_fpm'} } */
+
+  svcvtn_mf8_fpm (f); /* { dg-error {too few arguments to function 'svcvtn_mf8_fpm'} } */
+  svcvtn_mf8_fpm (pg); /* { dg-error {too few arguments to function 'svcvtn_mf8_fpm'} } */
+  svcvtn_mf8_fpm (s8); /* { dg-error {too few arguments to function 'svcvtn_mf8_fpm'} } */
+
+  svcvtn_mf8_fpm (f16x2, f16x2, fpm0); /* { dg-error {too many arguments to function 'svcvtn_mf8_fpm'} } */
+
+  svcvtn_mf8_fpm (f16x3, fpm0); /* { dg-error {'svcvtn_mf8_fpm' has no form that takes 'svfloat16x3_t' arguments} } */
+  svcvtn_mf8_fpm (f16x4, fpm0); /* { dg-error {'svcvtn_mf8_fpm' has no form that takes 'svfloat16x4_t' arguments} } */
+  svcvtn_mf8_fpm (0, fpm0); /* { dg-error {passing 'int' to argument 1 of 'svcvtn_mf8_fpm', which expects an SVE type rather than a scalar type} } */
+  svcvtn_mf8_fpm (f, fpm0); /* { dg-error {passing 'float' to argument 1 of 'svcvtn_mf8_fpm', which expects an SVE type rather than a scalar type} } */
+  svcvtn_mf8_fpm (pg, fpm0); /* { dg-error {'svcvtn_mf8_fpm' has no form that takes 'svbool_t' arguments} } */
+  svcvtn_mf8_fpm (s8, fpm0); /* { dg-error {'svcvtn_mf8_fpm' has no form that takes 'svint8_t' arguments} } */
+  svcvtn_mf8_fpm (s32x2, fpm0); /* { dg-error {'svcvtn_mf8_fpm' has no form that takes 'svint32x2_t' arguments} } */
+
+  svcvtn_mf8_fpm (f16x2, f16x2); /* { dg-error {passing 'svfloat16x2_t' to argument 2 of 'svcvtn_mf8_fpm', which expects 'uint64_t'} } */
+
+
+  svcvtnb_mf8_fpm (f32x2, fpm0);
+
+  svcvtnb_mf8_fpm (); /* { dg-error {too few arguments to function 'svcvtnb_mf8_fpm'} } */
+
+  svcvtnb_mf8_fpm (f32x2); /* { dg-error {too few arguments to function 'svcvtnb_mf8_fpm'} } */
+  svcvtnb_mf8_fpm (fpm0); /* { dg-error {too few arguments to function 'svcvtnb_mf8_fpm'} } */
+
+  svcvtnb_mf8_fpm (f); /* { dg-error {too few arguments to function 'svcvtnb_mf8_fpm'} } */
+  svcvtnb_mf8_fpm (pg); /* { dg-error {too few arguments to function 'svcvtnb_mf8_fpm'} } */
+  svcvtnb_mf8_fpm (s8); /* { dg-error {too few arguments to function 'svcvtnb_mf8_fpm'} } */
+
+  svcvtnb_mf8_fpm (f32x2, f32x2, fpm0); /* { dg-error {too many arguments to function 'svcvtnb_mf8_fpm'} } */
+
+  svcvtnb_mf8_fpm (f32x3, fpm0); /* { dg-error {'svcvtnb_mf8_fpm' has no form that takes 'svfloat32x3_t' arguments} } */
+  svcvtnb_mf8_fpm (f32x4, fpm0); /* { dg-error {'svcvtnb_mf8_fpm' has no form that takes 'svfloat32x4_t' arguments} } */
+  svcvtnb_mf8_fpm (0, fpm0); /* { dg-error {passing 'int' to argument 1 of 'svcvtnb_mf8_fpm', which expects an SVE type rather than a scalar type} } */
+  svcvtnb_mf8_fpm (f, fpm0); /* { dg-error {passing 'float' to argument 1 of 'svcvtnb_mf8_fpm', which expects an SVE type rather than a scalar type} } */
+  svcvtnb_mf8_fpm (pg, fpm0); /* { dg-error {'svcvtnb_mf8_fpm' has no form that takes 'svbool_t' arguments} } */
+  svcvtnb_mf8_fpm (s8, fpm0); /* { dg-error {'svcvtnb_mf8_fpm' has no form that takes 'svint8_t' arguments} } */
+  svcvtnb_mf8_fpm (s32x2, fpm0); /* { dg-error {'svcvtnb_mf8_fpm' has no form that takes 'svint32x2_t' arguments} } */
+
+  svcvtnb_mf8_fpm (f32x2, f32x2); /* { dg-error {passing 'svfloat32x2_t' to argument 2 of 'svcvtnb_mf8_fpm', which expects 'uint64_t'} } */
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrowt_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrowt_1.c
new file mode 100644
index 00000000000..ab97eef3472
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/unary_convertxn_narrowt_1.c
@@ -0,0 +1,38 @@
+#include <arm_sve.h>
+
+#pragma GCC target "+sme2+fp8"
+
+void
+test (svmfloat8_t f8, svfloat32x2_t f32x2, fpm_t fpm0,
+      svfloat16x2_t f16x2, svfloat16x4_t f16x4,
+      svfloat32x3_t f32x3, svfloat32x4_t f32x4,
+      svbool_t pg, float f, svint8_t s8, svint32x2_t s32x2)
+  __arm_streaming
+{
+  svcvtnt_mf8_fpm (f8, f32x2, fpm0);
+
+  svcvtnt_mf8_fpm (); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+
+  svcvtnt_mf8_fpm (f8); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (f32x2); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (fpm0); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (f); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (f8, f32x2); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (f32x2, fpm0); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (f8, fpm0); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (pg); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+  svcvtnt_mf8_fpm (s8); /* { dg-error {too few arguments to function 'svcvtnt_mf8_fpm'} } */
+
+  svcvtnt_mf8_fpm (f8, f16x2, fpm0); /* { dg-error {'svcvtnt_mf8_fpm' has no form that takes 'svfloat16x2_t' arguments} } */
+  svcvtnt_mf8_fpm (f8, f16x4, fpm0); /* { dg-error {'svcvtnt_mf8_fpm' has no form that takes 'svfloat16x4_t' arguments} } */
+  svcvtnt_mf8_fpm (f8, f32x3, fpm0); /* { dg-error {'svcvtnt_mf8_fpm' has no form that takes 'svfloat32x3_t' arguments} } */
+  svcvtnt_mf8_fpm (f8, f32x4, fpm0); /* { dg-error {'svcvtnt_mf8_fpm' has no form that takes 'svfloat32x4_t' arguments} } */
+
+  svcvtnt_mf8_fpm (f8, 0, fpm0); /* { dg-error {passing 'int' to argument 2 of 'svcvtnt_mf8_fpm', which expects an SVE type rather than a scalar type} } */
+  svcvtnt_mf8_fpm (f8, f, fpm0); /* { dg-error {passing 'float' to argument 2 of 'svcvtnt_mf8_fpm', which expects an SVE type rather than a scalar type} } */
+  svcvtnt_mf8_fpm (f8, pg, fpm0); /* { dg-error {'svcvtnt_mf8_fpm' has no form that takes 'svbool_t' arguments} } */
+  svcvtnt_mf8_fpm (f8, s8, fpm0); /* { dg-error {'svcvtnt_mf8_fpm' has no form that takes 'svint8_t' arguments} } */
+  svcvtnt_mf8_fpm (f8, s32x2, fpm0); /* { dg-error {'svcvtnt_mf8_fpm' has no form that takes 'svint32x2_t' arguments} } */
+
+  svcvtnt_mf8_fpm (f8, f32x2, f32x2); /* { dg-error {passing 'svfloat32x2_t' to argument 3 of 'svcvtnt_mf8_fpm', which expects 'uint64_t'} } */
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c
new file mode 100644
index 00000000000..4fd915ee73a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvt_mf8.c
@@ -0,0 +1,48 @@
+/* { dg-do assemble { target aarch64_asm_fp8_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+bf16+fp8"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+sme2"
+#endif
+
+/*
+** cvt1_f16_mf8_fpm:
+**	msr	fpmr, x0
+**	f1cvt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvt1_f16_mf8_fpm, svfloat16_t, svmfloat8_t,
+	     z0 = svcvt1_f16_mf8_fpm (z4, fpm0), z0 = svcvt1_f16_fpm (z4, fpm0))
+
+/*
+** cvt1_bf16_mf8_fpm:
+**	msr	fpmr, x0
+**	bf1cvt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvt1_bf16_mf8_fpm, svbfloat16_t, svmfloat8_t,
+	     z0 = svcvt1_bf16_mf8_fpm (z4, fpm0),
+	     z0 = svcvt1_bf16_fpm (z4, fpm0))
+
+/*
+** cvt2_f16_mf8_fpm:
+**	msr	fpmr, x0
+**	f2cvt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvt2_f16_mf8_fpm, svfloat16_t, svmfloat8_t,
+	     z0 = svcvt2_f16_mf8_fpm (z4, fpm0), z0 = svcvt2_f16_fpm (z4, fpm0))
+
+/*
+** cvt2_bf16_mf8_fpm:
+**	msr	fpmr, x0
+**	bf2cvt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvt2_bf16_mf8_fpm, svbfloat16_t, svmfloat8_t,
+	     z0 = svcvt2_bf16_mf8_fpm (z4, fpm0),
+	     z0 = svcvt2_bf16_fpm (z4, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtlt_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtlt_mf8.c
new file mode 100644
index 00000000000..fb645eed630
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtlt_mf8.c
@@ -0,0 +1,50 @@
+/* { dg-do assemble { target aarch64_asm_fp8_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+bf16+fp8"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+sme2"
+#endif
+
+/*
+** cvtlt1_f16_mf8_fpm:
+**	msr	fpmr, x0
+**	f1cvtlt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvtlt1_f16_mf8_fpm, svfloat16_t, svmfloat8_t,
+	     z0 = svcvtlt1_f16_mf8_fpm (z4, fpm0),
+	     z0 = svcvtlt1_f16_fpm (z4, fpm0))
+
+/*
+** cvtlt1_bf16_mf8_fpm:
+**	msr	fpmr, x0
+**	bf1cvtlt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvtlt1_bf16_mf8_fpm, svbfloat16_t, svmfloat8_t,
+	     z0 = svcvtlt1_bf16_mf8_fpm (z4, fpm0),
+	     z0 = svcvtlt1_bf16_fpm (z4, fpm0))
+
+/*
+** cvtlt2_f16_mf8_fpm:
+**	msr	fpmr, x0
+**	f2cvtlt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvtlt2_f16_mf8_fpm, svfloat16_t, svmfloat8_t,
+	     z0 = svcvtlt2_f16_mf8_fpm (z4, fpm0),
+	     z0 = svcvtlt2_f16_fpm (z4, fpm0))
+
+/*
+** cvtlt2_bf16_mf8_fpm:
+**	msr	fpmr, x0
+**	bf2cvtlt	z0\.h, z4\.b
+**	ret
+*/
+TEST_DUAL_Z (cvtlt2_bf16_mf8_fpm, svbfloat16_t, svmfloat8_t,
+	     z0 = svcvtlt2_bf16_mf8_fpm (z4, fpm0),
+	     z0 = svcvtlt2_bf16_fpm (z4, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtn_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtn_mf8.c
new file mode 100644
index 00000000000..b0bff2ffb0f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtn_mf8.c
@@ -0,0 +1,30 @@
+/* { dg-do assemble { target aarch64_asm_fp8_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+bf16+fp8"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+sme2"
+#endif
+
+/*
+** cvtn_mf8_f16_x2_fpm:
+**	msr	fpmr, x2
+**	fcvtn	z0\.b, {z4\.h(?:, | - )z5\.h}
+**	ret
+*/
+TEST_DUAL_Z (cvtn_mf8_f16_x2_fpm, svmfloat8_t, svfloat16x2_t,
+	     z0 = svcvtn_mf8_f16_x2_fpm (z4, fpm0),
+	     z0 = svcvtn_mf8_fpm (z4, fpm0))
+
+/*
+** cvtn_mf8_bf16_x2_fpm:
+**	msr	fpmr, x2
+**	bfcvtn	z0\.b, {z4\.h(?:, | - )z5\.h}
+**	ret
+*/
+TEST_DUAL_Z (cvtn_mf8_bf16_x2_fpm, svmfloat8_t, svbfloat16x2_t,
+	     z0 = svcvtn_mf8_bf16_x2_fpm (z4, fpm0),
+	     z0 = svcvtn_mf8_fpm (z4, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnb_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnb_mf8.c
new file mode 100644
index 00000000000..c7c58ebff53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnb_mf8.c
@@ -0,0 +1,20 @@
+/* { dg-do assemble { target aarch64_asm_fp8_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+bf16+fp8"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+sme2"
+#endif
+
+/*
+** cvtnb_mf8_f32_x2_fpm:
+**	msr	fpmr, x2
+**	fcvtnb	z0\.b, {z4\.s(?:, | - )z5\.s}
+**	ret
+*/
+TEST_DUAL_Z (cvtnb_mf8_f32_x2_fpm, svmfloat8_t, svfloat32x2_t,
+	     z0 = svcvtnb_mf8_f32_x2_fpm (z4, fpm0),
+	     z0 = svcvtnb_mf8_fpm (z4, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_mf8.c
new file mode 100644
index 00000000000..46b42c4318d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/cvtnt_mf8.c
@@ -0,0 +1,31 @@
+/* { dg-do assemble { target aarch64_asm_fp8_ok } } */
+/* { dg-do compile { target { !
aarch64_asm_fp8_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+bf16+fp8"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+sme2"
+#endif
+
+/*
+** cvtnt_mf8_f32_x2_fpm_untied:
+**	msr	fpmr, x2
+**	fcvtnt	z1\.b, {z4\.s(?:, | - )z5\.s}
+**	mov	z0.d, z1.d
+**	ret
+*/
+TEST_DUAL_Z (cvtnt_mf8_f32_x2_fpm_untied, svmfloat8_t, svfloat32x2_t,
+	     z0 = svcvtnt_mf8_f32_x2_fpm (z1, z4, fpm0),
+	     z0 = svcvtnt_mf8_fpm (z1, z4, fpm0))
+
+/*
+** cvtnt_mf8_f32_x2_fpm_tied:
+**	msr	fpmr, x2
+**	fcvtnt	z0\.b, {z4\.s(?:, | - )z5\.s}
+**	ret
+*/
+TEST_DUAL_Z (cvtnt_mf8_f32_x2_fpm_tied, svmfloat8_t, svfloat32x2_t,
+	     z0 = svcvtnt_mf8_f32_x2_fpm (z0, z4, fpm0),
+	     z0 = svcvtnt_mf8_fpm (z0, z4, fpm0))
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 4d3e3ac04d4..a3edccf1fda 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12140,7 +12140,7 @@ proc check_effective_target_aarch64_tiny { } {
 foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve"
			  "i8mm" "f32mm" "f64mm" "bf16" "sb" "sve2" "ls64"
			  "sme" "sme-i16i64" "sme2" "sve-b16b16"
-			  "sme-b16b16" "sme-f16f16" "sme2p1" } {
+			  "sme-b16b16" "sme-f16f16" "sme2p1" "fp8" } {
     eval [string map [list FUNC $aarch64_ext] {
	proc check_effective_target_aarch64_asm_FUNC_ok { } {
	    if { [istarget aarch64*-*-*] } {
From: Claudio Bantaloukas 
To: 
CC: Claudio Bantaloukas 
Subject: [PATCH v5 4/5] aarch64: add SVE2 FP8 multiply accumulate intrinsics
Date: Thu, 28 Nov 2024 21:12:33 +0000
Message-ID: <20241128211234.1714776-5-claudio.bantaloukas@arm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
References: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
MIME-Version: 1.0
This patch adds support for the following intrinsics:
- svmlalb[_f16_mf8]_fpm
- svmlalb[_n_f16_mf8]_fpm
- svmlalt[_f16_mf8]_fpm
- svmlalt[_n_f16_mf8]_fpm
- svmlalb_lane[_f16_mf8]_fpm
- svmlalt_lane[_f16_mf8]_fpm
- svmlallbb[_f32_mf8]_fpm
- svmlallbb[_n_f32_mf8]_fpm
- svmlallbt[_f32_mf8]_fpm
- svmlallbt[_n_f32_mf8]_fpm
- svmlalltb[_f32_mf8]_fpm
- svmlalltb[_n_f32_mf8]_fpm
- svmlalltt[_f32_mf8]_fpm
- svmlalltt[_n_f32_mf8]_fpm
- svmlallbb_lane[_f32_mf8]_fpm
- svmlallbt_lane[_f32_mf8]_fpm
- svmlalltb_lane[_f32_mf8]_fpm
- svmlalltt_lane[_f32_mf8]_fpm

These intrinsics are available when the FP8FMA and SVE2 features are both
enabled, or, in streaming mode, under the SSVE_FP8FMA feature.

gcc/
	* config/aarch64/aarch64-option-extensions.def (fp8fma,
	ssve-fp8fma): Add new options.
	* config/aarch64/aarch64-sve-builtins-functions.h
	(unspec_based_function_base): Add unspec_for_mfp8.
	(unspec_for): Return unspec_for_mfp8 on fpm-using cases.
	(sme_1mode_function): Fix call to parent ctor.
	(sme_2mode_function_t): Likewise.
	(unspec_based_mla_function, unspec_based_mla_lane_function):
	Handle fpm-using cases.
	* config/aarch64/aarch64-sve-builtins-shapes.cc
	(parse_element_type): Treat M as TYPE_SUFFIX_mf8.
	(ternary_mfloat8_lane_def): Add new class.
	(ternary_mfloat8_opt_n_def): Likewise.
	(ternary_mfloat8_lane): Add new shape.
	(ternary_mfloat8_opt_n): Likewise.
	* config/aarch64/aarch64-sve-builtins-shapes.h
	(ternary_mfloat8_lane, ternary_mfloat8_opt_n): Declare.
	* config/aarch64/aarch64-sve-builtins-sve2.cc (svmlalb_lane,
	svmlalb, svmlalt_lane, svmlalt): Update definitions with
	mfloat8_t unspec in ctor.
	(svmlallbb_lane, svmlallbb, svmlallbt_lane, svmlallbt,
	svmlalltb_lane, svmlalltb, svmlalltt_lane, svmlalltt,
	svmlal_impl): Add new FUNCTIONs.
	(svqrshr, svqrshrn, svqrshru, svqrshrun): Update definitions with
	nop mfloat8 unspec in ctor.
	* config/aarch64/aarch64-sve-builtins-sve2.def (svmlalb, svmlalt,
	svmlalb_lane, svmlalt_lane, svmlallbb, svmlallbt, svmlalltb,
	svmlalltt, svmlalltt_lane, svmlallbb_lane, svmlallbt_lane,
	svmlalltb_lane): Add new DEF_SVE_FUNCTION_GS_FPMs.
	* config/aarch64/aarch64-sve-builtins-sve2.h (svmlallbb_lane,
	svmlallbb, svmlallbt_lane, svmlallbt, svmlalltb_lane, svmlalltb,
	svmlalltt_lane, svmlalltt): Declare.
	* config/aarch64/aarch64-sve-builtins.cc (TYPES_h_float_mf8,
	TYPES_s_float_mf8): Add new types.
	(h_float_mf8, s_float_mf8): Add new SVE_TYPES_ARRAY.
	* config/aarch64/aarch64-sve2.md (@aarch64_sve_add_): Add new.
	(@aarch64_sve_add_): Add new.
	(@aarch64_sve_add_lane_): Likewise.
	(@aarch64_sve_add_lane_): Likewise.
	* config/aarch64/aarch64.h (TARGET_FP8FMA, TARGET_SSVE_FP8FMA):
	Likewise.
	* config/aarch64/iterators.md (VNx8HF_ONLY): Add new.
	(UNSPEC_FMLALB_FP8, UNSPEC_FMLALLBB_FP8, UNSPEC_FMLALLBT_FP8,
	UNSPEC_FMLALLTB_FP8, UNSPEC_FMLALLTT_FP8, UNSPEC_FMLALT_FP8):
	Likewise.
	(SVE2_FP8_TERNARY_VNX8HF, SVE2_FP8_TERNARY_VNX4SF): Likewise.
	(SVE2_FP8_TERNARY_LANE_VNX8HF, SVE2_FP8_TERNARY_LANE_VNX4SF):
	Likewise.
	(sve2_fp8_fma_op_vnx8hf, sve2_fp8_fma_op_vnx4sf): Likewise.
	* doc/invoke.texi: Document fp8fma and ssve-fp8fma extensions.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
	(TEST_DUAL_Z_REV, TEST_DUAL_LANE_REG, TEST_DUAL_ZD): Add fpm0
	argument.
	* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_opt_n_1.c:
	Add new shape test.
	* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c:
	Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c: Add new test.
	* gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c: Likewise.
	* lib/target-supports.exp: Add check_effective_target for fp8fma
	and ssve-fp8fma.
---
 .../aarch64/aarch64-option-extensions.def     |  4 +
 .../aarch64/aarch64-sve-builtins-functions.h  | 16 +++-
 .../aarch64/aarch64-sve-builtins-shapes.cc    | 78 ++++++++++++++++
 .../aarch64/aarch64-sve-builtins-shapes.h     |  2 +
 .../aarch64/aarch64-sve-builtins-sve2.cc      | 46 +++++++---
 .../aarch64/aarch64-sve-builtins-sve2.def     | 17 ++++
 .../aarch64/aarch64-sve-builtins-sve2.h       |  8 ++
 gcc/config/aarch64/aarch64-sve-builtins.cc    | 10 ++
 gcc/config/aarch64/aarch64-sve2.md            | 81 +++++++++++++++++
 gcc/config/aarch64/aarch64.h                  |  9 ++
 gcc/config/aarch64/iterators.md               | 37 ++++++++
 gcc/doc/invoke.texi                           |  5 +
 .../aarch64/sve/acle/asm/test_sve_acle.h      |  6 +-
 .../acle/general-c/ternary_mfloat8_lane_1.c   | 84 +++++++++++++++++
 .../acle/general-c/ternary_mfloat8_opt_n_1.c  | 60 ++++++++++++
 .../aarch64/sve2/acle/asm/mlalb_lane_mf8.c    | 91 +++++++++++++++++++
 .../aarch64/sve2/acle/asm/mlalb_mf8.c         | 78 ++++++++++++++++
 .../aarch64/sve2/acle/asm/mlallbb_lane_mf8.c  | 91 +++++++++++++++++++
 .../aarch64/sve2/acle/asm/mlallbb_mf8.c       | 78 ++++++++++++++++
 .../aarch64/sve2/acle/asm/mlallbt_lane_mf8.c  | 91 +++++++++++++++++++
 .../aarch64/sve2/acle/asm/mlallbt_mf8.c       | 78 ++++++++++++++++
 .../aarch64/sve2/acle/asm/mlalltb_lane_mf8.c  | 91 +++++++++++++++++++
 .../aarch64/sve2/acle/asm/mlalltb_mf8.c       | 78 ++++++++++++++++
 .../aarch64/sve2/acle/asm/mlalltt_lane_mf8.c  | 91 +++++++++++++++++++
 .../aarch64/sve2/acle/asm/mlalltt_mf8.c       | 78 ++++++++++++++++
 .../aarch64/sve2/acle/asm/mlalt_lane_mf8.c    | 91 +++++++++++++++++++
 .../aarch64/sve2/acle/asm/mlalt_mf8.c         | 78 ++++++++++++++++
 gcc/testsuite/lib/target-supports.exp         |  3 +-
 28 files
changed, 1458 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_opt_n_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index f4cf6618238..f39c9e6f897 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -245,6 +245,10 @@ AARCH64_OPT_EXTENSION("gcs", GCS, (), (), (), "gcs")
 
 AARCH64_OPT_EXTENSION("fp8", FP8, (SIMD), (), (), "fp8")
 
+AARCH64_OPT_EXTENSION("fp8fma", FP8FMA, (FP8), (), (), "fp8fma")
+
+AARCH64_OPT_EXTENSION("ssve-fp8fma", SSVE_FP8FMA, (SME2,FP8), (), (), "ssve-fp8fma")
+
 AARCH64_OPT_EXTENSION("faminmax", FAMINMAX, (SIMD), (), (), "faminmax")
 
 #undef AARCH64_OPT_FMV_EXTENSION
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h
index 409062ca3dd..3dad0c02972 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h
@@ -270,10 +270,12 @@ public:
   CONSTEXPR unspec_based_function_base (int unspec_for_sint,
					int unspec_for_uint,
					int unspec_for_fp,
+					int unspec_for_mfp8 = -1,
					unsigned int suffix_index = 0)
     : m_unspec_for_sint (unspec_for_sint),
       m_unspec_for_uint (unspec_for_uint),
       m_unspec_for_fp (unspec_for_fp),
+      m_unspec_for_mfp8 (unspec_for_mfp8),
       m_suffix_index (suffix_index)
   {}
 
@@ -281,6 +283,9 @@ public:
   int
   unspec_for (const function_instance &instance) const
   {
+    if (instance.fpm_mode == FPM_set)
+      return m_unspec_for_mfp8;
+
     auto &suffix = instance.type_suffix (m_suffix_index);
     return (!suffix.integer_p ? m_unspec_for_fp
	     : suffix.unsigned_p ? m_unspec_for_uint
@@ -292,6 +297,7 @@ public:
   int m_unspec_for_sint;
   int m_unspec_for_uint;
   int m_unspec_for_fp;
+  int m_unspec_for_mfp8;
 
   /* Which type suffix is used to choose between the unspecs.  */
   unsigned int m_suffix_index;
@@ -427,7 +433,7 @@ public:
   CONSTEXPR sme_1mode_function (int unspec_for_sint, int unspec_for_uint,
				int unspec_for_fp)
-    : parent (unspec_for_sint, unspec_for_uint, unspec_for_fp, 1)
+    : parent (unspec_for_sint, unspec_for_uint, unspec_for_fp, -1, 1)
   {}
 
   rtx
@@ -457,7 +463,7 @@ public:
   CONSTEXPR sme_2mode_function_t (int unspec_for_sint, int unspec_for_uint,
				  int unspec_for_fp)
-    : parent (unspec_for_sint, unspec_for_uint, unspec_for_fp, 1)
+    : parent (unspec_for_sint, unspec_for_uint, unspec_for_fp, -1, 1)
   {}
 
   rtx
@@ -496,7 +502,8 @@ public:
   {
     int unspec = unspec_for (e);
     insn_code icode;
-    if (e.type_suffix (m_suffix_index).float_p)
+    if (e.type_suffix (m_suffix_index).float_p
+	&& e.fpm_mode != FPM_set)
       {
	/* Put the operands in the normal (fma ...) order, with the accumulator
	   last.  This fits naturally since that's also the unprinted operand
@@ -526,7 +533,8 @@ public:
   {
     int unspec = unspec_for (e);
     insn_code icode;
-    if (e.type_suffix (m_suffix_index).float_p)
+    if (e.type_suffix (m_suffix_index).float_p
+	&& e.fpm_mode != FPM_set)
       {
	/* Put the operands in the normal (fma ...) order, with the accumulator
	   last.  This fits naturally since that's also the unprinted operand
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index 62831b3c1e2..94f4da8ce31 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -96,6 +96,7 @@ apply_predication (const function_instance &instance, tree return_type,
      B - bfloat16_t
      c - a predicate-as-counter
      h - a half-sized version of
+     M - mfloat8_t
      p - a predicate (represented as TYPE_SUFFIX_b)
      q - a quarter-sized version of
      s - a signed type with the given number of bits
@@ -140,6 +141,9 @@ parse_element_type (const function_instance &instance, const char *&format)
   if (ch == 'B')
     return TYPE_SUFFIX_bf16;
 
+  if (ch == 'M')
+    return TYPE_SUFFIX_mf8;
+
   if (ch == 'q')
     {
       type_suffix_index suffix = parse_element_type (instance, format);
@@ -4015,6 +4019,44 @@ SHAPE (ternary_bfloat_lane)
 typedef ternary_bfloat_lane_base<2> ternary_bfloat_lanex2_def;
 SHAPE (ternary_bfloat_lanex2)
 
+/* sv_t svfoo[_t0](sv_t, svmfloat8_t, svmfloat8_t, uint64_t)
+
+   where the final argument is an integer constant expression in the range
+   [0, 15].  */
+struct ternary_mfloat8_lane_def
+  : public ternary_resize2_lane_base<8, TYPE_mfloat, TYPE_mfloat>
+{
+  void
+  build (function_builder &b, const function_group_info &group) const override
+  {
+    gcc_assert (group.fpm_mode == FPM_set);
+    b.add_overloaded_functions (group, MODE_none);
+    build_all (b, "v0,v0,vM,vM,su64", group, MODE_none);
+  }
+
+  bool
+  check (function_checker &c) const override
+  {
+    return c.require_immediate_lane_index (3, 2, 1);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    type_suffix_index type;
+    if (!r.check_num_arguments (5)
+	|| (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES
+	|| !r.require_vector_type (1, VECTOR_TYPE_svmfloat8_t)
+	|| !r.require_vector_type (2, VECTOR_TYPE_svmfloat8_t)
+	|| !r.require_integer_immediate (3)
+	|| !r.require_scalar_type (4, "uint64_t"))
+      return error_mark_node;
+
+    return r.resolve_to (r.mode_suffix_id, type, TYPE_SUFFIX_mf8, GROUP_none);
+  }
+};
+SHAPE (ternary_mfloat8_lane)
+
 /* sv_t svfoo[_t0](sv_t, svbfloat16_t, svbfloat16_t)
    sv_t svfoo[_n_t0](sv_t, svbfloat16_t, bfloat16_t).  */
 struct ternary_bfloat_opt_n_def
@@ -4030,6 +4072,42 @@ struct ternary_bfloat_opt_n_def
 };
 SHAPE (ternary_bfloat_opt_n)
 
+/* sv_t svfoo[_t0](sv_t, svmfloat8_t, svmfloat8_t)
+   sv_t svfoo[_n_t0](sv_t, svmfloat8_t, mfloat8_t).  */
+struct ternary_mfloat8_opt_n_def
+  : public ternary_resize2_opt_n_base<8, TYPE_mfloat, TYPE_mfloat>
+{
+  void
+  build (function_builder &b, const function_group_info &group) const override
+  {
+    gcc_assert (group.fpm_mode == FPM_set);
+    b.add_overloaded_functions (group, MODE_none);
+    build_all (b, "v0,v0,vM,vM", group, MODE_none);
+    build_all (b, "v0,v0,vM,sM", group, MODE_n);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    type_suffix_index type;
+    if (!r.check_num_arguments (4)
+	|| (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES
+	|| !r.require_vector_type (1, VECTOR_TYPE_svmfloat8_t)
+	|| !r.require_vector_or_scalar_type (2)
+	|| !r.require_scalar_type (3, "uint64_t"))
+      return error_mark_node;
+
+    auto mode = r.mode_suffix_id;
+    if (r.scalar_argument_p (2))
+      mode = MODE_n;
+    else if (!r.require_vector_type (2, VECTOR_TYPE_svmfloat8_t))
+      return error_mark_node;
+
+    return r.resolve_to (mode, type, TYPE_SUFFIX_mf8, GROUP_none);
+  }
+};
+SHAPE (ternary_mfloat8_opt_n)
+
 /* sv_t svfoo[_t0](sv_t, sv_t, sv_t, uint64_t)
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
index dc3d4557288..1c8937ae027 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
@@ -213,6 +213,8 @@ namespace aarch64_sve
   extern const function_shape *const ternary_lane_rotate;
   extern const function_shape *const ternary_long_lane;
   extern const function_shape *const ternary_long_opt_n;
+  extern const function_shape *const ternary_mfloat8_lane;
+  extern const function_shape *const ternary_mfloat8_opt_n;
   extern const function_shape *const ternary_opt_n;
   extern const function_shape *const ternary_qq_or_011_lane;
   extern const function_shape *const ternary_qq_lane_rotate;
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
index 1a1d2c4c6ec..ad52030f226 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc
@@ -990,16 +990,34 @@ FUNCTION (svminnmqv, reduction, (-1, -1, UNSPEC_FMINNMQV))
 FUNCTION (svminp, unspec_based_pred_function, (UNSPEC_SMINP, UNSPEC_UMINP,
					       UNSPEC_FMINP))
 FUNCTION (svminqv, reduction, (UNSPEC_SMINQV, UNSPEC_UMINQV, UNSPEC_FMINQV))
-FUNCTION (svmlalb, unspec_based_mla_function, (UNSPEC_SMULLB,
-					       UNSPEC_UMULLB, UNSPEC_FMLALB))
-FUNCTION (svmlalb_lane, unspec_based_mla_lane_function, (UNSPEC_SMULLB,
-							 UNSPEC_UMULLB,
-							 UNSPEC_FMLALB))
-FUNCTION (svmlalt, unspec_based_mla_function, (UNSPEC_SMULLT,
-					       UNSPEC_UMULLT, UNSPEC_FMLALT))
-FUNCTION (svmlalt_lane, unspec_based_mla_lane_function, (UNSPEC_SMULLT,
-							 UNSPEC_UMULLT,
-							 UNSPEC_FMLALT))
+FUNCTION (svmlalb_lane, unspec_based_mla_lane_function,
+	  (UNSPEC_SMULLB, UNSPEC_UMULLB, UNSPEC_FMLALB,
+	   UNSPEC_FMLALB_FP8))
+FUNCTION (svmlalb, unspec_based_mla_function,
+	  (UNSPEC_SMULLB, UNSPEC_UMULLB, UNSPEC_FMLALB,
+	   UNSPEC_FMLALB_FP8))
+FUNCTION (svmlallbb_lane, unspec_based_mla_lane_function,
+	  (-1, -1, -1, UNSPEC_FMLALLBB_FP8))
+FUNCTION (svmlallbb, unspec_based_mla_function,
+	  (-1, -1, -1, UNSPEC_FMLALLBB_FP8))
+FUNCTION (svmlallbt_lane, unspec_based_mla_lane_function,
+	  (-1, -1, -1, UNSPEC_FMLALLBT_FP8))
+FUNCTION (svmlallbt, unspec_based_mla_function,
+	  (-1, -1, -1, UNSPEC_FMLALLBT_FP8))
+FUNCTION (svmlalltb_lane, unspec_based_mla_lane_function,
+	  (-1, -1, -1, UNSPEC_FMLALLTB_FP8))
+FUNCTION (svmlalltb, unspec_based_mla_function,
+	  (-1, -1, -1, UNSPEC_FMLALLTB_FP8))
+FUNCTION (svmlalltt_lane, unspec_based_mla_lane_function,
+	  (-1, -1, -1, UNSPEC_FMLALLTT_FP8))
+FUNCTION (svmlalltt, unspec_based_mla_function,
+	  (-1, -1, -1, UNSPEC_FMLALLTT_FP8))
+FUNCTION (svmlalt_lane, unspec_based_mla_lane_function,
+	  (UNSPEC_SMULLT, UNSPEC_UMULLT, UNSPEC_FMLALT,
+	   UNSPEC_FMLALT_FP8))
+FUNCTION (svmlalt, unspec_based_mla_function,
+	  (UNSPEC_SMULLT, UNSPEC_UMULLT, UNSPEC_FMLALT,
+	   UNSPEC_FMLALT_FP8))
 FUNCTION (svmlslb, unspec_based_mls_function, (UNSPEC_SMULLB,
					       UNSPEC_UMULLB, UNSPEC_FMLSLB))
 FUNCTION (svmlslb_lane, unspec_based_mls_lane_function, (UNSPEC_SMULLB,
@@ -1072,15 +1090,15 @@ FUNCTION (svqrdmulh_lane, unspec_based_lane_function, (UNSPEC_SQRDMULH,
							-1, -1))
 FUNCTION (svqrshl, svqrshl_impl,)
 FUNCTION (svqrshr, unspec_based_uncond_function, (UNSPEC_SQRSHR,
-						  UNSPEC_UQRSHR, -1, 1))
+						  UNSPEC_UQRSHR, -1, -1, 1))
 FUNCTION (svqrshrn, unspec_based_uncond_function, (UNSPEC_SQRSHRN,
-						   UNSPEC_UQRSHRN, -1, 1))
+						   UNSPEC_UQRSHRN, -1, -1, 1))
 FUNCTION (svqrshrnb, unspec_based_function, (UNSPEC_SQRSHRNB,
					      UNSPEC_UQRSHRNB, -1))
 FUNCTION (svqrshrnt, unspec_based_function, (UNSPEC_SQRSHRNT,
					      UNSPEC_UQRSHRNT, -1))
-FUNCTION (svqrshru, unspec_based_uncond_function, (UNSPEC_SQRSHRU, -1, -1, 1))
-FUNCTION (svqrshrun, unspec_based_uncond_function, (UNSPEC_SQRSHRUN, -1, -1, 1))
+FUNCTION (svqrshru, unspec_based_uncond_function, (UNSPEC_SQRSHRU, -1, -1, -1, 1))
+FUNCTION (svqrshrun, unspec_based_uncond_function, (UNSPEC_SQRSHRUN, -1, -1, -1, 1))
 FUNCTION (svqrshrunb, unspec_based_function, (UNSPEC_SQRSHRUNB, -1, -1))
 FUNCTION (svqrshrunt, unspec_based_function, (UNSPEC_SQRSHRUNT, -1, -1))
 FUNCTION (svqshl, svqshl_impl,)
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
index 8a63998fcc6..b489e8fad2f 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def
@@ -379,3 +379,20 @@ DEF_SVE_FUNCTION_GS_FPM (svcvtn, unary_convertxn_narrow, cvtn_mf8, x2, none, set
 DEF_SVE_FUNCTION_GS_FPM (svcvtnb, unary_convertxn_narrow, cvtnx_mf8, x2, none, set)
 DEF_SVE_FUNCTION_GS_FPM (svcvtnt, unary_convertxn_narrowt, cvtnx_mf8, x2, none, set)
 #undef REQUIRED_EXTENSIONS
+
+#define REQUIRED_EXTENSIONS \
+  streaming_compatible (AARCH64_FL_SVE2 | AARCH64_FL_FP8FMA, \
+			AARCH64_FL_SSVE_FP8FMA)
+DEF_SVE_FUNCTION_GS_FPM (svmlalb, ternary_mfloat8_opt_n, h_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlalt, ternary_mfloat8_opt_n, h_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlalb_lane, ternary_mfloat8_lane, h_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlalt_lane, ternary_mfloat8_lane, h_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlallbb, ternary_mfloat8_opt_n, s_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlallbt, ternary_mfloat8_opt_n, s_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlalltb, ternary_mfloat8_opt_n, s_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlalltt, ternary_mfloat8_opt_n, s_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlalltt_lane, ternary_mfloat8_lane, s_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlallbb_lane, ternary_mfloat8_lane, s_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlallbt_lane, ternary_mfloat8_lane, s_float_mf8, none, none, set)
+DEF_SVE_FUNCTION_GS_FPM (svmlalltb_lane, ternary_mfloat8_lane, s_float_mf8, none, none, set)
+#undef REQUIRED_EXTENSIONS
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h
index d26751e8042..ff3e0cc0e9f 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.h
@@ -108,6 +108,14 @@ namespace aarch64_sve
   extern const function_base *const svminqv;
   extern const function_base *const svmlalb;
   extern const function_base *const svmlalb_lane;
+  extern const function_base *const svmlallbb_lane;
+  extern const function_base *const svmlallbb;
+  extern const function_base *const svmlallbt_lane;
+  extern const function_base *const svmlallbt;
+  extern const function_base *const svmlalltb_lane;
+  extern const function_base *const svmlalltb;
+  extern const function_base *const svmlalltt_lane;
+  extern const function_base *const svmlalltt;
   extern const function_base *const svmlalt;
   extern const function_base *const svmlalt_lane;
   extern const function_base *const svmlslb;
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 4201ece9d59..00284162cc0 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -347,10 +347,18 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
   TYPES_s_data (S, D), \
   TYPES_d_data (S, D)
 
+/* _f16_mf8.  */
+#define TYPES_h_float_mf8(S, D) \
+  D (f16, mf8)
+
 /* _f32.  */
 #define TYPES_s_float(S, D) \
   S (f32)
 
+/* _f32_mf8.  */
+#define TYPES_s_float_mf8(S, D) \
+  D (f32, mf8)
+
 /* _f32 _s16 _s32 _s64 _u16 _u32 _u64.  */
@@ -777,6 +785,7 @@ DEF_SVE_TYPES_ARRAY (bhs_widen);
 DEF_SVE_TYPES_ARRAY (c);
 DEF_SVE_TYPES_ARRAY (h_bfloat);
 DEF_SVE_TYPES_ARRAY (h_float);
+DEF_SVE_TYPES_ARRAY (h_float_mf8);
 DEF_SVE_TYPES_ARRAY (h_integer);
 DEF_SVE_TYPES_ARRAY (hs_signed);
 DEF_SVE_TYPES_ARRAY (hs_integer);
@@ -788,6 +797,7 @@ DEF_SVE_TYPES_ARRAY (hsd_integer);
 DEF_SVE_TYPES_ARRAY (hsd_data);
 DEF_SVE_TYPES_ARRAY (s_float);
 DEF_SVE_TYPES_ARRAY (s_float_hsd_integer);
+DEF_SVE_TYPES_ARRAY (s_float_mf8);
 DEF_SVE_TYPES_ARRAY (s_float_sd_integer);
 DEF_SVE_TYPES_ARRAY (s_signed);
 DEF_SVE_TYPES_ARRAY (s_unsigned);
diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md
index e5bd2861b48..5498eac0b03 100644
--- a/gcc/config/aarch64/aarch64-sve2.md
+++ b/gcc/config/aarch64/aarch64-sve2.md
@@ -67,6 +67,7 @@
 ;; ---- [INT] Shift-and-accumulate operations
 ;; ---- [INT] Shift-and-insert operations
 ;; ---- [INT] Sum of absolute differences
+;; ---- [FP] Mfloat8 Multiply-and-accumulate operations
 ;;
 ;; == Extending arithmetic
 ;; ---- [INT] Multi-register widening conversions
@@ -1993,6 +1994,86 @@ (define_insn "*aarch64_sve2_aba"
 }
 )
 
+;; -------------------------------------------------------------------------
+;; ---- [FP] Mfloat8 Multiply-and-accumulate operations
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - FMLALB (vectors, FP8 to FP16)
+;; - FMLALT (vectors, FP8
to FP16)
+;; - FMLALB (indexed, FP8 to FP16)
+;; - FMLALT (indexed, FP8 to FP16)
+;; - FMLALLBB (vectors)
+;; - FMLALLBB (indexed)
+;; - FMLALLBT (vectors)
+;; - FMLALLBT (indexed)
+;; - FMLALLTB (vectors)
+;; - FMLALLTB (indexed)
+;; - FMLALLTT (vectors)
+;; - FMLALLTT (indexed)
+;; -------------------------------------------------------------------------
+
+(define_insn "@aarch64_sve_add_<sve2_fp8_fma_op_vnx8hf><mode>"
+  [(set (match_operand:VNx8HF_ONLY 0 "register_operand")
+	(unspec:VNx8HF_ONLY
+	  [(match_operand:VNx8HF 1 "register_operand")
+	   (match_operand:VNx16QI 2 "register_operand")
+	   (match_operand:VNx16QI 3 "register_operand")
+	   (reg:DI FPM_REGNUM)]
+	  SVE2_FP8_TERNARY_VNX8HF))]
+  "TARGET_SSVE_FP8FMA"
+  {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ]
+     [ w        , 0 , w , w ; *              ] <sve2_fp8_fma_op_vnx8hf>\t%0.h, %2.b, %3.b
+     [ ?&w      , w , w , w ; yes            ] movprfx\t%0, %1\;<sve2_fp8_fma_op_vnx8hf>\t%0.h, %2.b, %3.b
+  }
+)
+
+(define_insn "@aarch64_sve_add_<sve2_fp8_fma_op_vnx4sf><mode>"
+  [(set (match_operand:VNx4SF_ONLY 0 "register_operand")
+	(unspec:VNx4SF_ONLY
+	  [(match_operand:VNx4SF 1 "register_operand")
+	   (match_operand:VNx16QI 2 "register_operand")
+	   (match_operand:VNx16QI 3 "register_operand")
+	   (reg:DI FPM_REGNUM)]
+	  SVE2_FP8_TERNARY_VNX4SF))]
+  "TARGET_SSVE_FP8FMA"
+  {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ]
+     [ w        , 0 , w , w ; *              ] <sve2_fp8_fma_op_vnx4sf>\t%0.s, %2.b, %3.b
+     [ ?&w      , w , w , w ; yes            ] movprfx\t%0, %1\;<sve2_fp8_fma_op_vnx4sf>\t%0.s, %2.b, %3.b
+  }
+)
+
+(define_insn "@aarch64_sve_add_lane_<sve2_fp8_fma_op_vnx8hf><mode>"
+  [(set (match_operand:VNx8HF_ONLY 0 "register_operand")
+	(unspec:VNx8HF_ONLY
+	  [(match_operand:VNx8HF 1 "register_operand")
+	   (match_operand:VNx16QI 2 "register_operand")
+	   (match_operand:VNx16QI 3 "register_operand")
+	   (match_operand:SI 4 "const_int_operand")
+	   (reg:DI FPM_REGNUM)]
+	  SVE2_FP8_TERNARY_LANE_VNX8HF))]
+  "TARGET_SSVE_FP8FMA"
+  {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ]
+     [ w        , 0 , w , y ; *              ] <sve2_fp8_fma_op_vnx8hf>\t%0.h, %2.b, %3.b[%4]
+     [ ?&w      , w , w , y ; yes            ] movprfx\t%0, %1\;<sve2_fp8_fma_op_vnx8hf>\t%0.h, %2.b, %3.b[%4]
+  }
+)
+
+(define_insn "@aarch64_sve_add_lane_<sve2_fp8_fma_op_vnx4sf><mode>"
+  [(set (match_operand:VNx4SF_ONLY 0 "register_operand")
+	(unspec:VNx4SF_ONLY
+	  [(match_operand:VNx4SF 1 "register_operand")
+	   (match_operand:VNx16QI 2 "register_operand")
+	   (match_operand:VNx16QI 3 "register_operand")
+	   (match_operand:SI 4 "const_int_operand")
+	   (reg:DI FPM_REGNUM)]
+	  SVE2_FP8_TERNARY_LANE_VNX4SF))]
+  "TARGET_SSVE_FP8FMA"
+  {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ]
+     [ w        , 0 , w , y ; *              ] <sve2_fp8_fma_op_vnx4sf>\t%0.s, %2.b, %3.b[%4]
+     [ ?&w      , w , w , y ; yes            ] movprfx\t%0, %1\;<sve2_fp8_fma_op_vnx4sf>\t%0.s, %2.b, %3.b[%4]
+  }
+)
+
 ;; =========================================================================
 ;; == Extending arithmetic
 ;; =========================================================================
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index f43b1659db6..80a1fa40709 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -518,6 +518,15 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
    && (TARGET_SVE2 || TARGET_STREAMING) \
    && (TARGET_SME2 || TARGET_NON_STREAMING))
 
+/* fp8 multiply-accumulate instructions are enabled through +fp8fma.  */
+#define TARGET_FP8FMA AARCH64_HAVE_ISA (FP8FMA)
+
+/* SVE2 versions of fp8 multiply-accumulate instructions are enabled for
+   non-streaming mode by +fp8fma and for streaming mode by +ssve-fp8fma.  */
+#define TARGET_SSVE_FP8FMA \
+  (((TARGET_SVE2 && TARGET_FP8FMA) || TARGET_STREAMING) \
+   && (AARCH64_HAVE_ISA (SSVE_FP8FMA) || TARGET_NON_STREAMING))
+
 /* Standard register usage.
*/ /* 31 64-bit general purpose registers R0-R30: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 26716d593de..4b265a73d9a 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -430,6 +430,7 @@ (define_mode_iterator VMULD [V4HI V8HI V2SI V4SI (define_mode_iterator VNx16QI_ONLY [VNx16QI]) (define_mode_iterator VNx16SI_ONLY [VNx16SI]) (define_mode_iterator VNx8HI_ONLY [VNx8HI]) +(define_mode_iterator VNx8HF_ONLY [VNx8HF]) (define_mode_iterator VNx8BF_ONLY [VNx8BF]) (define_mode_iterator VNx8SI_ONLY [VNx8SI]) (define_mode_iterator VNx8SF_ONLY [VNx8SF]) @@ -975,7 +976,13 @@ (define_c_enum "unspec" UNSPEC_FMINNMP ; Used in aarch64-sve2.md. UNSPEC_FMINP ; Used in aarch64-sve2.md. UNSPEC_FMLALB ; Used in aarch64-sve2.md. + UNSPEC_FMLALB_FP8 ; Used in aarch64-sve2.md. + UNSPEC_FMLALLBB_FP8 ; Used in aarch64-sve2.md. + UNSPEC_FMLALLBT_FP8 ; Used in aarch64-sve2.md. + UNSPEC_FMLALLTB_FP8 ; Used in aarch64-sve2.md. + UNSPEC_FMLALLTT_FP8 ; Used in aarch64-sve2.md. UNSPEC_FMLALT ; Used in aarch64-sve2.md. + UNSPEC_FMLALT_FP8 ; Used in aarch64-sve2.md. UNSPEC_FMLSLB ; Used in aarch64-sve2.md. UNSPEC_FMLSLT ; Used in aarch64-sve2.md. UNSPEC_FP8FCVTN ; Used in aarch64-sve2.md. 
@@ -4755,3 +4762,33 @@ (define_int_attr fp8_cvt_uns_op
 			 (UNSPEC_F2CVT "f2cvt")
 			 (UNSPEC_F1CVTLT "f1cvtlt")
 			 (UNSPEC_F2CVTLT "f2cvtlt")])
+
+(define_int_iterator SVE2_FP8_TERNARY_VNX8HF
+  [UNSPEC_FMLALB_FP8
+   UNSPEC_FMLALT_FP8])
+
+(define_int_iterator SVE2_FP8_TERNARY_VNX4SF
+  [UNSPEC_FMLALLBB_FP8
+   UNSPEC_FMLALLBT_FP8
+   UNSPEC_FMLALLTB_FP8
+   UNSPEC_FMLALLTT_FP8])
+
+(define_int_iterator SVE2_FP8_TERNARY_LANE_VNX8HF
+  [UNSPEC_FMLALB_FP8
+   UNSPEC_FMLALT_FP8])
+
+(define_int_iterator SVE2_FP8_TERNARY_LANE_VNX4SF
+  [UNSPEC_FMLALLBB_FP8
+   UNSPEC_FMLALLBT_FP8
+   UNSPEC_FMLALLTB_FP8
+   UNSPEC_FMLALLTT_FP8])
+
+(define_int_attr sve2_fp8_fma_op_vnx8hf
+  [(UNSPEC_FMLALB_FP8 "fmlalb")
+   (UNSPEC_FMLALT_FP8 "fmlalt")])
+
+(define_int_attr sve2_fp8_fma_op_vnx4sf
+  [(UNSPEC_FMLALLBB_FP8 "fmlallbb")
+   (UNSPEC_FMLALLBT_FP8 "fmlallbt")
+   (UNSPEC_FMLALLTB_FP8 "fmlalltb")
+   (UNSPEC_FMLALLTT_FP8 "fmlalltt")])
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1b7b712085f..2a4f016e2df 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -21952,6 +21952,11 @@ Enable support for Armv8.9-a/9.4-a translation hardening extension.
 Enable the RCpc3 (Release Consistency) extension.
 @item fp8
 Enable the fp8 (8-bit floating point) extension.
+@item fp8fma
+Enable the fp8 (8-bit floating point) multiply accumulate extension.
+@item ssve-fp8fma
+Enable the fp8 (8-bit floating point) multiply accumulate extension in streaming
+mode.
 @item faminmax
 Enable the Floating Point Absolute Maximum/Minimum extension.
 @item sve-b16b16
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
index 4a146c3e157..d3ae707ac49 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h
@@ -84,7 +84,7 @@
 #define TEST_DUAL_Z_REV(NAME, TYPE1, TYPE2, CODE1, CODE2) \
   PROTO (NAME, TYPE1, (TYPE2 z0, TYPE2 z1, TYPE2 z2, TYPE2 z3, \
 		       TYPE1 z4, TYPE1 z5, TYPE1 z6, TYPE1 z7, \
-		       svbool_t p0, svbool_t p1)) \
+		       svbool_t p0, svbool_t p1, fpm_t fpm0)) \
   { \
     TYPE1 z0_res; \
     INVOKE (CODE1, CODE2); \
@@ -136,7 +136,7 @@
   }
 
 #define TEST_DUAL_LANE_REG(NAME, ZTYPE1, ZTYPE2, REG, CODE1, CODE2) \
-  PROTO (NAME, void, (void)) \
+  PROTO (NAME, void, (fpm_t fpm0)) \
   { \
     register ZTYPE1 z0 __asm ("z0"); \
     register ZTYPE2 z1 __asm ("z1"); \
@@ -194,7 +194,7 @@
   PROTO (NAME, ZTYPE1, (ZTYPE1 z0, ZTYPE1 z1, ZTYPE1 z2, \
 			ZTYPE1 z3, ZTYPE2 z4, ZTYPE2 z5, \
 			ZTYPE2 z6, STYPE d7, svbool_t p0, \
-			svbool_t p1)) \
+			svbool_t p1, fpm_t fpm0)) \
   { \
     INVOKE (CODE1, CODE2); \
     return z0; \
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c
new file mode 100644
index 00000000000..6bdd3c06dc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_1.c
@@ -0,0 +1,84 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+#pragma GCC target ("arch=armv8.2-a+ssve-fp8fma")
+
+void
+f1 (svfloat16_t f16, svmfloat8_t f8, fpm_t fpm,
+    svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32,
+    svbfloat16_t bf16, svfloat32_t f32, svfloat64_t f64, mfloat8_t f, int i)
+  __arm_streaming
+{
+  svmlalb_lane_fpm (f16, f8, f8, 0, fpm);
+  svmlalb_lane_fpm (f16, f8, f8, 7, fpm);
+  svmlalb_lane_fpm (f16, f8, f8, 8, fpm);
+  svmlalb_lane_fpm (f16, f8, f8, 15, fpm);
+
+  svmlalb_lane_fpm (f16); /* { dg-error {too few arguments to function
'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f8); /* { dg-error {too few arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f8, f8); /* { dg-error {too few arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f8, f8, 0); /* { dg-error {too few arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f8, f8, fpm); /* { dg-error {too few arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f8, 15, fpm); /* { dg-error {too few arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f8, f8, 15, fpm); /* { dg-error {too few arguments to function 'svmlalb_lane_fpm'} } */ + + svmlalb_lane_fpm (f16, f8, f8, 15, 0, fpm); /* { dg-error {too many arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f8, f8, 15, fpm, fpm); /* { dg-error {too many arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f8, f8, f8, 15, fpm); /* { dg-error {too many arguments to function 'svmlalb_lane_fpm'} } */ + svmlalb_lane_fpm (f16, f16, f8, f8, 15, fpm); /* { dg-error {too many arguments to function 'svmlalb_lane_fpm'} } */ + + svmlalb_lane_fpm (f32, bf16, bf16, 0, fpm); /* { dg-error {passing 'svbfloat16_t' to argument 2 of 'svmlalb_lane_fpm', which expects 'svmfloat8_t'} } */ + svmlalb_lane_fpm (0, f8, f8, 0, fpm); /* { dg-error {passing 'int' to argument 1 of 'svmlalb_lane_fpm', which expects an SVE type rather than a scalar} } */ + svmlalb_lane_fpm (pg, f8, f8, 0, fpm); /* { dg-error {'svmlalb_lane_fpm' has no form that takes 'svbool_t' and 'svmfloat8_t' arguments} } */ + svmlalb_lane_fpm (u8, f8, f8, 0, fpm); /* { dg-error {'svmlalb_lane_fpm' has no form that takes 'svuint8_t' and 'svmfloat8_t' arguments} } */ + svmlalb_lane_fpm (u16, f8, f8, 0, fpm); /* { dg-error {'svmlalb_lane_fpm' has no form that takes 'svuint16_t' and 'svmfloat8_t' arguments} } */ + svmlalb_lane_fpm (f32, f8, f8, 0, fpm); /* { dg-error {'svmlalb_lane_fpm' has no form that takes 'svfloat32_t' 
and 'svmfloat8_t' arguments} } */ + svmlalb_lane_fpm (f64, f8, f8, 0, fpm); /* { dg-error {'svmlalb_lane_fpm' has no form that takes 'svfloat64_t' and 'svmfloat8_t' arguments} } */ + svmlalb_lane_fpm (f16, 0, f8, 0, fpm); /* { dg-error {passing 'int' to argument 2 of 'svmlalb_lane_fpm', which expects 'svmfloat8_t'} } */ + svmlalb_lane_fpm (f16, f32, f8, 0, fpm); /* { dg-error {passing 'svfloat32_t' to argument 2 of 'svmlalb_lane_fpm', which expects 'svmfloat8_t'} } */ + svmlalb_lane_fpm (f16, f8, 0, 0, fpm); /* { dg-error {passing 'int' to argument 3 of 'svmlalb_lane_fpm', which expects 'svmfloat8_t'} } */ + svmlalb_lane_fpm (f16, f8, f32, 0, fpm); /* { dg-error {passing 'svfloat32_t' to argument 3 of 'svmlalb_lane_fpm', which expects 'svmfloat8_t'} } */ + + svmlalb_lane_fpm (f16, f8, f8, s32, fpm); /* { dg-error {argument 4 of 'svmlalb_lane_fpm' must be an integer constant expression} } */ + svmlalb_lane_fpm (f16, f8, f8, i, fpm); /* { dg-error {argument 4 of 'svmlalb_lane_fpm' must be an integer constant expression} } */ + svmlalb_lane_fpm (f16, f8, f8, 16, fpm); /* { dg-error {passing 16 to argument 4 of 'svmlalb_lane_fpm', which expects a value in the range \[0, 15\]} } */ + svmlalb_lane_fpm (f16, f8, f8, -1, fpm); /* { dg-error {passing -1 to argument 4 of 'svmlalb_lane_fpm', which expects a value in the range \[0, 15\]} } */ + svmlalb_lane_fpm (f16, f8, f8, 15, f8); /* { dg-error {passing 'svmfloat8_t' to argument 5 of 'svmlalb_lane_fpm', which expects 'uint64_t'} } */ + + + svmlallbb_lane_fpm (f32, f8, f8, 0, fpm); + svmlallbb_lane_fpm (f32, f8, f8, 7, fpm); + svmlallbb_lane_fpm (f32, f8, f8, 8, fpm); + svmlallbb_lane_fpm (f32, f8, f8, 15, fpm); + + svmlallbb_lane_fpm (f32); /* { dg-error {too few arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f32, f8); /* { dg-error {too few arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f32, f8, f8); /* { dg-error {too few arguments to function 'svmlallbb_lane_fpm'} } */ + 
svmlallbb_lane_fpm (f32, f8, f8, 0); /* { dg-error {too few arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f32, f8, f8, fpm); /* { dg-error {too few arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f32, f8, 15, fpm); /* { dg-error {too few arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f8, f8, 15, fpm); /* { dg-error {too few arguments to function 'svmlallbb_lane_fpm'} } */ + + svmlallbb_lane_fpm (f32, f8, f8, 15, 0, fpm); /* { dg-error {too many arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f32, f8, f8, 15, fpm, fpm); /* { dg-error {too many arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f32, f8, f8, f8, 15, fpm); /* { dg-error {too many arguments to function 'svmlallbb_lane_fpm'} } */ + svmlallbb_lane_fpm (f32, f16, f8, f8, 15, fpm); /* { dg-error {too many arguments to function 'svmlallbb_lane_fpm'} } */ + + svmlallbb_lane_fpm (f32, bf16, bf16, 0, fpm); /* { dg-error {passing 'svbfloat16_t' to argument 2 of 'svmlallbb_lane_fpm', which expects 'svmfloat8_t'} } */ + svmlallbb_lane_fpm (0, f8, f8, 0, fpm); /* { dg-error {passing 'int' to argument 1 of 'svmlallbb_lane_fpm', which expects an SVE type rather than a scalar} } */ + svmlallbb_lane_fpm (pg, f8, f8, 0, fpm); /* { dg-error {'svmlallbb_lane_fpm' has no form that takes 'svbool_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_lane_fpm (u8, f8, f8, 0, fpm); /* { dg-error {'svmlallbb_lane_fpm' has no form that takes 'svuint8_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_lane_fpm (u16, f8, f8, 0, fpm); /* { dg-error {'svmlallbb_lane_fpm' has no form that takes 'svuint16_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_lane_fpm (f16, f8, f8, 0, fpm); /* { dg-error {'svmlallbb_lane_fpm' has no form that takes 'svfloat16_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_lane_fpm (f64, f8, f8, 0, fpm); /* { dg-error {'svmlallbb_lane_fpm' has no form that takes 'svfloat64_t' and 'svmfloat8_t' 
arguments} } */
+  svmlallbb_lane_fpm (f32, 0, f8, 0, fpm); /* { dg-error {passing 'int' to argument 2 of 'svmlallbb_lane_fpm', which expects 'svmfloat8_t'} } */
+  svmlallbb_lane_fpm (f32, f32, f8, 0, fpm); /* { dg-error {passing 'svfloat32_t' to argument 2 of 'svmlallbb_lane_fpm', which expects 'svmfloat8_t'} } */
+  svmlallbb_lane_fpm (f32, f8, 0, 0, fpm); /* { dg-error {passing 'int' to argument 3 of 'svmlallbb_lane_fpm', which expects 'svmfloat8_t'} } */
+  svmlallbb_lane_fpm (f32, f8, f32, 0, fpm); /* { dg-error {passing 'svfloat32_t' to argument 3 of 'svmlallbb_lane_fpm', which expects 'svmfloat8_t'} } */
+
+  svmlallbb_lane_fpm (f32, f8, f8, s32, fpm); /* { dg-error {argument 4 of 'svmlallbb_lane_fpm' must be an integer constant expression} } */
+  svmlallbb_lane_fpm (f32, f8, f8, i, fpm); /* { dg-error {argument 4 of 'svmlallbb_lane_fpm' must be an integer constant expression} } */
+  svmlallbb_lane_fpm (f32, f8, f8, 16, fpm); /* { dg-error {passing 16 to argument 4 of 'svmlallbb_lane_fpm', which expects a value in the range \[0, 15\]} } */
+  svmlallbb_lane_fpm (f32, f8, f8, -1, fpm); /* { dg-error {passing -1 to argument 4 of 'svmlallbb_lane_fpm', which expects a value in the range \[0, 15\]} } */
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_opt_n_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_opt_n_1.c
new file mode 100644
index 00000000000..1b6ff882e68
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_opt_n_1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+
+#include <arm_sve.h>
+
+#pragma GCC target ("arch=armv8.2-a+sve2+fp8fma")
+
+void
+test (svfloat16_t f16, svmfloat8_t f8, fpm_t fpm,
+      svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32,
+      svbfloat16_t bf16, svfloat32_t f32, svfloat64_t f64, mfloat8_t f)
+{
+  svmlalb_fpm (f16, f8, f8, fpm);
+  svmlalt_fpm (f16, f8, f8, fpm);
+  svmlalb_fpm (f16, f8, f, fpm);
+
+  svmlalb_fpm (f16); /* { dg-error {too few arguments
to function 'svmlalb_fpm'} } */ + svmlalb_fpm (f16, f8); /* { dg-error {too few arguments to function 'svmlalb_fpm'} } */ + svmlalb_fpm (f16, f8, f8); /* { dg-error {too few arguments to function 'svmlalb_fpm'} } */ + svmlalb_fpm (f8, f8, fpm); /* { dg-error {too few arguments to function 'svmlalb_fpm'} } */ + svmlalb_fpm (f16, f8, fpm); /* { dg-error {too few arguments to function 'svmlalb_fpm'} } */ + svmlalb_fpm (f16, f8, f8, fpm, 0); /* { dg-error {too many arguments to function 'svmlalb_fpm'} } */ + + svmlalt_fpm (f32, f8, f8, fpm); /* { dg-error {'svmlalt_fpm' has no form that takes 'svfloat32_t' and 'svmfloat8_t' arguments} } */ + svmlalb_fpm (0, f8, f8, fpm); /* { dg-error {passing 'int' to argument 1 of 'svmlalb_fpm', which expects an SVE type rather than a scalar} } */ + svmlalb_fpm (pg, f8, f8, fpm); /* { dg-error {'svmlalb_fpm' has no form that takes 'svbool_t' and 'svmfloat8_t' arguments} } */ + svmlalb_fpm (u8, f8, f8, fpm); /* { dg-error {'svmlalb_fpm' has no form that takes 'svuint8_t' and 'svmfloat8_t' arguments} } */ + svmlalb_fpm (u16, f8, f8, fpm); /* { dg-error {'svmlalb_fpm' has no form that takes 'svuint16_t' and 'svmfloat8_t' arguments} } */ + svmlalb_fpm (f64, f8, f8, fpm); /* { dg-error {'svmlalb_fpm' has no form that takes 'svfloat64_t' and 'svmfloat8_t' arguments} } */ + svmlalb_fpm (f16, 0, f8, fpm); /* { dg-error {passing 'int' to argument 2 of 'svmlalb_fpm', which expects 'svmfloat8_t'} } */ + svmlalb_fpm (f16, f16, f8, fpm); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svmlalb_fpm', which expects 'svmfloat8_t'} } */ + svmlalb_fpm (f16, f8, 0, fpm); /* { dg-error {invalid conversion to type 'mfloat8_t'} } */ + svmlalb_fpm (f16, f8, f16, fpm); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svmlalb_fpm', which expects 'svmfloat8_t'} } */ + svmlalb_fpm (f16, f8, f8, f8); /* { dg-error {passing 'svmfloat8_t' to argument 4 of 'svmlalb_fpm', which expects 'uint64_t'} } */ + + + svmlallbb_fpm (f32, f8, f8, fpm); + 
svmlallbt_fpm (f32, f8, f8, fpm); + svmlalltb_fpm (f32, f8, f8, fpm); + svmlalltt_fpm (f32, f8, f8, fpm); + svmlallbb_fpm (f32, f8, f, fpm); + + svmlallbb_fpm (f16, f8, f8, fpm); /* { dg-error {'svmlallbb_fpm' has no form that takes 'svfloat16_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_fpm (f32); /* { dg-error {too few arguments to function 'svmlallbb_fpm'} } */ + svmlallbb_fpm (f32, f8); /* { dg-error {too few arguments to function 'svmlallbb_fpm'} } */ + svmlallbb_fpm (f32, f8, f8); /* { dg-error {too few arguments to function 'svmlallbb_fpm'} } */ + svmlallbb_fpm (f8, f8, fpm); /* { dg-error {too few arguments to function 'svmlallbb_fpm'} } */ + svmlallbb_fpm (f32, f8, fpm); /* { dg-error {too few arguments to function 'svmlallbb_fpm'} } */ + svmlallbb_fpm (f32, f8, f8, fpm, 0); /* { dg-error {too many arguments to function 'svmlallbb_fpm'} } */ + svmlallbb_fpm (0, f8, f8, fpm); /* { dg-error {passing 'int' to argument 1 of 'svmlallbb_fpm', which expects an SVE type rather than a scalar} } */ + svmlallbb_fpm (pg, f8, f8, fpm); /* { dg-error {'svmlallbb_fpm' has no form that takes 'svbool_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_fpm (u8, f8, f8, fpm); /* { dg-error {'svmlallbb_fpm' has no form that takes 'svuint8_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_fpm (u16, f8, f8, fpm); /* { dg-error {'svmlallbb_fpm' has no form that takes 'svuint16_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_fpm (f64, f8, f8, fpm); /* { dg-error {'svmlallbb_fpm' has no form that takes 'svfloat64_t' and 'svmfloat8_t' arguments} } */ + svmlallbb_fpm (f32, 0, f8, fpm); /* { dg-error {passing 'int' to argument 2 of 'svmlallbb_fpm', which expects 'svmfloat8_t'} } */ + svmlallbb_fpm (f32, f16, f8, fpm); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svmlallbb_fpm', which expects 'svmfloat8_t'} } */ + svmlallbb_fpm (f32, f8, 0, fpm); /* { dg-error {invalid conversion to type 'mfloat8_t'} } */ + svmlallbb_fpm (f32, f8, f16, fpm); /* { dg-error {passing 
'svfloat16_t' to argument 3 of 'svmlallbb_fpm', which expects 'svmfloat8_t'} } */ + svmlallbb_fpm (f32, f8, f8, f8); /* { dg-error {passing 'svmfloat8_t' to argument 4 of 'svmlallbb_fpm', which expects 'uint64_t'} } */ + +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c new file mode 100644 index 00000000000..e7af1b6dcc6 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_lane_mf8.c @@ -0,0 +1,91 @@ +/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */ +/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+fp8fma" +#ifdef STREAMING_COMPATIBLE +#pragma GCC target "+ssve-fp8fma" +#endif + +/* +** mlalb_lane_0_f16_tied1: +** msr fpmr, x0 +** fmlalb z0\.h, z4\.b, z5\.b\[0\] +** ret +*/ +TEST_DUAL_Z (mlalb_lane_0_f16_tied1, svfloat16_t, svmfloat8_t, + z0 = svmlalb_lane_f16_mf8_fpm (z0, z4, z5, 0, fpm0), + z0 = svmlalb_lane_fpm (z0, z4, z5, 0, fpm0)) + +/* +** mlalb_lane_0_f16_tied2: +** msr fpmr, x0 +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z4 +** fmlalb z0\.h, \1\.b, z1\.b\[0\] +** ret +*/ +TEST_DUAL_Z_REV (mlalb_lane_0_f16_tied2, svfloat16_t, svmfloat8_t, + z0_res = svmlalb_lane_f16_mf8_fpm (z4, z0, z1, 0, fpm0), + z0_res = svmlalb_lane_fpm (z4, z0, z1, 0, fpm0)) + +/* +** mlalb_lane_0_f16_tied3: +** msr fpmr, x0 +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z4 +** fmlalb z0\.h, z1\.b, \1\.b\[0\] +** ret +*/ +TEST_DUAL_Z_REV (mlalb_lane_0_f16_tied3, svfloat16_t, svmfloat8_t, + z0_res = svmlalb_lane_f16_mf8_fpm (z4, z1, z0, 0, fpm0), + z0_res = svmlalb_lane_fpm (z4, z1, z0, 0, fpm0)) + +/* +** mlalb_lane_0_f16_untied: +** msr fpmr, x0 +** movprfx z0, z1 +** fmlalb z0\.h, z4\.b, z5\.b\[0\] +** ret +*/ +TEST_DUAL_Z (mlalb_lane_0_f16_untied, svfloat16_t, svmfloat8_t, + z0 = svmlalb_lane_f16_mf8_fpm (z1, z4, z5, 0, fpm0), + 
z0 = svmlalb_lane_fpm (z1, z4, z5, 0, fpm0)) + +/* +** mlalb_lane_1_f16: +** msr fpmr, x0 +** fmlalb z0\.h, z4\.b, z5\.b\[1\] +** ret +*/ +TEST_DUAL_Z (mlalb_lane_1_f16, svfloat16_t, svmfloat8_t, + z0 = svmlalb_lane_f16_mf8_fpm (z0, z4, z5, 1, fpm0), + z0 = svmlalb_lane_fpm (z0, z4, z5, 1, fpm0)) + +/* +** mlalb_lane_z8_f16: +** ... +** msr fpmr, x0 +** mov (z[0-7])\.d, z8\.d +** fmlalb z0\.h, z1\.b, \1\.b\[1\] +** ldr d8, \[sp\], 32 +** ret +*/ +TEST_DUAL_LANE_REG (mlalb_lane_z8_f16, svfloat16_t, svmfloat8_t, z8, + z0 = svmlalb_lane_f16_mf8_fpm (z0, z1, z8, 1, fpm0), + z0 = svmlalb_lane_fpm (z0, z1, z8, 1, fpm0)) + +/* +** mlalb_lane_z16_f16: +** ... +** msr fpmr, x0 +** mov (z[0-7])\.d, z16\.d +** fmlalb z0\.h, z1\.b, \1\.b\[15\] +** ... +** ret +*/ +TEST_DUAL_LANE_REG (mlalb_lane_z16_f16, svfloat16_t, svmfloat8_t, z16, + z0 = svmlalb_lane_f16_mf8_fpm (z0, z1, z16, 15, fpm0), + z0 = svmlalb_lane_fpm (z0, z1, z16, 15, fpm0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c new file mode 100644 index 00000000000..424640031fb --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalb_mf8.c @@ -0,0 +1,78 @@ +/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */ +/* { dg-do compile { target { ! 
aarch64_asm_fp8fma_ok } } } */ +/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */ + +#include "test_sve_acle.h" + +#pragma GCC target "+fp8fma" +#ifdef STREAMING_COMPATIBLE +#pragma GCC target "+ssve-fp8fma" +#endif + +/* +** mlalb_f16_mf8_tied1: +** msr fpmr, x0 +** fmlalb z0\.h, z4\.b, z5\.b +** ret +*/ +TEST_DUAL_Z (mlalb_f16_mf8_tied1, svfloat16_t, svmfloat8_t, + z0 = svmlalb_f16_mf8_fpm (z0, z4, z5, fpm0), + z0 = svmlalb_fpm (z0, z4, z5, fpm0)) + +/* +** mlalb_f16_mf8_tied2: +** msr fpmr, x0 +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z4 +** fmlalb z0\.h, \1\.b, z1\.b +** ret +*/ +TEST_DUAL_Z_REV (mlalb_f16_mf8_tied2, svfloat16_t, svmfloat8_t, + z0_res = svmlalb_f16_mf8_fpm (z4, z0, z1, fpm0), + z0_res = svmlalb_fpm (z4, z0, z1, fpm0)) + +/* +** mlalb_f16_mf8_tied3: +** msr fpmr, x0 +** mov (z[0-9]+)\.d, z0\.d +** movprfx z0, z4 +** fmlalb z0\.h, z1\.b, \1\.b +** ret +*/ +TEST_DUAL_Z_REV (mlalb_f16_mf8_tied3, svfloat16_t, svmfloat8_t, + z0_res = svmlalb_f16_mf8_fpm (z4, z1, z0, fpm0), + z0_res = svmlalb_fpm (z4, z1, z0, fpm0)) + +/* +** mlalb_f16_mf8_untied: +** msr fpmr, x0 +** movprfx z0, z1 +** fmlalb z0\.h, z4\.b, z5\.b +** ret +*/ +TEST_DUAL_Z (mlalb_f16_mf8_untied, svfloat16_t, svmfloat8_t, + z0 = svmlalb_f16_mf8_fpm (z1, z4, z5, fpm0), + z0 = svmlalb_fpm (z1, z4, z5, fpm0)) + +/* +** mlalb_h7_f16_tied1: +** msr fpmr, x0 +** mov (z[0-9]+\.b), b7 +** fmlalb z0\.h, z4\.b, \1 +** ret +*/ +TEST_DUAL_ZD (mlalb_h7_f16_tied1, svfloat16_t, svmfloat8_t, mfloat8_t, + z0 = svmlalb_n_f16_mf8_fpm (z0, z4, d7, fpm0), + z0 = svmlalb_fpm (z0, z4, d7, fpm0)) + +/* +** mlalb_h7_f16_untied: +** msr fpmr, x0 +** mov (z[0-9]+\.b), b7 +** movprfx z0, z1 +** fmlalb z0\.h, z4\.b, \1 +** ret +*/ +TEST_DUAL_ZD (mlalb_h7_f16_untied, svfloat16_t, svmfloat8_t, mfloat8_t, + z0 = svmlalb_n_f16_mf8_fpm (z1, z4, d7, fpm0), + z0 = svmlalb_fpm (z1, z4, d7, fpm0)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c 
b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c
new file mode 100644
index 00000000000..07a529d8dc9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_lane_mf8.c
@@ -0,0 +1,91 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlallbb_lane_0_f32_tied1:
+**	msr	fpmr, x0
+**	fmlallbb	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlallbb_lane_0_f32_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbb_lane_f32_mf8_fpm (z0, z4, z5, 0, fpm0),
+	     z0 = svmlallbb_lane_fpm (z0, z4, z5, 0, fpm0))
+
+/*
+** mlallbb_lane_0_f32_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbb	z0\.s, \1\.b, z1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbb_lane_0_f32_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbb_lane_f32_mf8_fpm (z4, z0, z1, 0, fpm0),
+		 z0_res = svmlallbb_lane_fpm (z4, z0, z1, 0, fpm0))
+
+/*
+** mlallbb_lane_0_f32_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbb	z0\.s, z1\.b, \1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbb_lane_0_f32_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbb_lane_f32_mf8_fpm (z4, z1, z0, 0, fpm0),
+		 z0_res = svmlallbb_lane_fpm (z4, z1, z0, 0, fpm0))
+
+/*
+** mlallbb_lane_0_f32_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlallbb	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlallbb_lane_0_f32_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbb_lane_f32_mf8_fpm (z1, z4, z5, 0, fpm0),
+	     z0 = svmlallbb_lane_fpm (z1, z4, z5, 0, fpm0))
+
+/*
+** mlallbb_lane_1_f32:
+**	msr	fpmr, x0
+**	fmlallbb	z0\.s, z4\.b, z5\.b\[1\]
+**	ret
+*/
+TEST_DUAL_Z (mlallbb_lane_1_f32, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbb_lane_f32_mf8_fpm (z0, z4, z5, 1, fpm0),
+	     z0 = svmlallbb_lane_fpm (z0, z4, z5, 1, fpm0))
+
+/*
+** mlallbb_lane_z8_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z8\.d
+**	fmlallbb	z0\.s, z1\.b, \1\.b\[1\]
+**	ldr	d8, \[sp\], 32
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlallbb_lane_z8_f32, svfloat32_t, svmfloat8_t, z8,
+		    z0 = svmlallbb_lane_f32_mf8_fpm (z0, z1, z8, 1, fpm0),
+		    z0 = svmlallbb_lane_fpm (z0, z1, z8, 1, fpm0))
+
+/*
+** mlallbb_lane_z16_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z16\.d
+**	fmlallbb	z0\.s, z1\.b, \1\.b\[15\]
+**	...
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlallbb_lane_z16_f32, svfloat32_t, svmfloat8_t, z16,
+		    z0 = svmlallbb_lane_f32_mf8_fpm (z0, z1, z16, 15, fpm0),
+		    z0 = svmlallbb_lane_fpm (z0, z1, z16, 15, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c
new file mode 100644
index 00000000000..543cd9030d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbb_mf8.c
@@ -0,0 +1,78 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlallbb_f32_mf8_tied1:
+**	msr	fpmr, x0
+**	fmlallbb	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlallbb_f32_mf8_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbb_f32_mf8_fpm (z0, z4, z5, fpm0),
+	     z0 = svmlallbb_fpm (z0, z4, z5, fpm0))
+
+/*
+** mlallbb_f32_mf8_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbb	z0\.s, \1\.b, z1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbb_f32_mf8_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbb_f32_mf8_fpm (z4, z0, z1, fpm0),
+		 z0_res = svmlallbb_fpm (z4, z0, z1, fpm0))
+
+/*
+** mlallbb_f32_mf8_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbb	z0\.s, z1\.b, \1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbb_f32_mf8_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbb_f32_mf8_fpm (z4, z1, z0, fpm0),
+		 z0_res = svmlallbb_fpm (z4, z1, z0, fpm0))
+
+/*
+** mlallbb_f32_mf8_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlallbb	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlallbb_f32_mf8_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbb_f32_mf8_fpm (z1, z4, z5, fpm0),
+	     z0 = svmlallbb_fpm (z1, z4, z5, fpm0))
+
+/*
+** mlallbb_h7_f32_tied1:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	fmlallbb	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlallbb_h7_f32_tied1, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlallbb_n_f32_mf8_fpm (z0, z4, d7, fpm0),
+	      z0 = svmlallbb_fpm (z0, z4, d7, fpm0))
+
+/*
+** mlallbb_h7_f32_untied:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	movprfx	z0, z1
+**	fmlallbb	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlallbb_h7_f32_untied, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlallbb_n_f32_mf8_fpm (z1, z4, d7, fpm0),
+	      z0 = svmlallbb_fpm (z1, z4, d7, fpm0))
diff --git
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c
new file mode 100644
index 00000000000..9da29fbfb0b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_lane_mf8.c
@@ -0,0 +1,91 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlallbt_lane_0_f32_tied1:
+**	msr	fpmr, x0
+**	fmlallbt	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlallbt_lane_0_f32_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbt_lane_f32_mf8_fpm (z0, z4, z5, 0, fpm0),
+	     z0 = svmlallbt_lane_fpm (z0, z4, z5, 0, fpm0))
+
+/*
+** mlallbt_lane_0_f32_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbt	z0\.s, \1\.b, z1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbt_lane_0_f32_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbt_lane_f32_mf8_fpm (z4, z0, z1, 0, fpm0),
+		 z0_res = svmlallbt_lane_fpm (z4, z0, z1, 0, fpm0))
+
+/*
+** mlallbt_lane_0_f32_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbt	z0\.s, z1\.b, \1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbt_lane_0_f32_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbt_lane_f32_mf8_fpm (z4, z1, z0, 0, fpm0),
+		 z0_res = svmlallbt_lane_fpm (z4, z1, z0, 0, fpm0))
+
+/*
+** mlallbt_lane_0_f32_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlallbt	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlallbt_lane_0_f32_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbt_lane_f32_mf8_fpm (z1, z4, z5, 0, fpm0),
+	     z0 = svmlallbt_lane_fpm (z1, z4, z5, 0, fpm0))
+
+/*
+** mlallbt_lane_1_f32:
+**	msr	fpmr, x0
+**	fmlallbt	z0\.s, z4\.b, z5\.b\[1\]
+**	ret
+*/
+TEST_DUAL_Z (mlallbt_lane_1_f32, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbt_lane_f32_mf8_fpm (z0, z4, z5, 1, fpm0),
+	     z0 = svmlallbt_lane_fpm (z0, z4, z5, 1, fpm0))
+
+/*
+** mlallbt_lane_z8_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z8\.d
+**	fmlallbt	z0\.s, z1\.b, \1\.b\[1\]
+**	ldr	d8, \[sp\], 32
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlallbt_lane_z8_f32, svfloat32_t, svmfloat8_t, z8,
+		    z0 = svmlallbt_lane_f32_mf8_fpm (z0, z1, z8, 1, fpm0),
+		    z0 = svmlallbt_lane_fpm (z0, z1, z8, 1, fpm0))
+
+/*
+** mlallbt_lane_z16_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z16\.d
+**	fmlallbt	z0\.s, z1\.b, \1\.b\[15\]
+**	...
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlallbt_lane_z16_f32, svfloat32_t, svmfloat8_t, z16,
+		    z0 = svmlallbt_lane_f32_mf8_fpm (z0, z1, z16, 15, fpm0),
+		    z0 = svmlallbt_lane_fpm (z0, z1, z16, 15, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c
new file mode 100644
index 00000000000..aa8299c66b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlallbt_mf8.c
@@ -0,0 +1,78 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlallbt_f32_mf8_tied1:
+**	msr	fpmr, x0
+**	fmlallbt	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlallbt_f32_mf8_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbt_f32_mf8_fpm (z0, z4, z5, fpm0),
+	     z0 = svmlallbt_fpm (z0, z4, z5, fpm0))
+
+/*
+** mlallbt_f32_mf8_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbt	z0\.s, \1\.b, z1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbt_f32_mf8_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbt_f32_mf8_fpm (z4, z0, z1, fpm0),
+		 z0_res = svmlallbt_fpm (z4, z0, z1, fpm0))
+
+/*
+** mlallbt_f32_mf8_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlallbt	z0\.s, z1\.b, \1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlallbt_f32_mf8_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlallbt_f32_mf8_fpm (z4, z1, z0, fpm0),
+		 z0_res = svmlallbt_fpm (z4, z1, z0, fpm0))
+
+/*
+** mlallbt_f32_mf8_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlallbt	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlallbt_f32_mf8_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlallbt_f32_mf8_fpm (z1, z4, z5, fpm0),
+	     z0 = svmlallbt_fpm (z1, z4, z5, fpm0))
+
+/*
+** mlallbt_h7_f32_tied1:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	fmlallbt	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlallbt_h7_f32_tied1, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlallbt_n_f32_mf8_fpm (z0, z4, d7, fpm0),
+	      z0 = svmlallbt_fpm (z0, z4, d7, fpm0))
+
+/*
+** mlallbt_h7_f32_untied:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	movprfx	z0, z1
+**	fmlallbt	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlallbt_h7_f32_untied, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlallbt_n_f32_mf8_fpm (z1, z4, d7, fpm0),
+	      z0 = svmlallbt_fpm (z1, z4, d7, fpm0))
diff --git
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c
new file mode 100644
index 00000000000..cbe297c188b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_lane_mf8.c
@@ -0,0 +1,91 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlalltb_lane_0_f32_tied1:
+**	msr	fpmr, x0
+**	fmlalltb	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlalltb_lane_0_f32_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltb_lane_f32_mf8_fpm (z0, z4, z5, 0, fpm0),
+	     z0 = svmlalltb_lane_fpm (z0, z4, z5, 0, fpm0))
+
+/*
+** mlalltb_lane_0_f32_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltb	z0\.s, \1\.b, z1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltb_lane_0_f32_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltb_lane_f32_mf8_fpm (z4, z0, z1, 0, fpm0),
+		 z0_res = svmlalltb_lane_fpm (z4, z0, z1, 0, fpm0))
+
+/*
+** mlalltb_lane_0_f32_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltb	z0\.s, z1\.b, \1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltb_lane_0_f32_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltb_lane_f32_mf8_fpm (z4, z1, z0, 0, fpm0),
+		 z0_res = svmlalltb_lane_fpm (z4, z1, z0, 0, fpm0))
+
+/*
+** mlalltb_lane_0_f32_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlalltb	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlalltb_lane_0_f32_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltb_lane_f32_mf8_fpm (z1, z4, z5, 0, fpm0),
+	     z0 = svmlalltb_lane_fpm (z1, z4, z5, 0, fpm0))
+
+/*
+** mlalltb_lane_1_f32:
+**	msr	fpmr, x0
+**	fmlalltb	z0\.s, z4\.b, z5\.b\[1\]
+**	ret
+*/
+TEST_DUAL_Z (mlalltb_lane_1_f32, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltb_lane_f32_mf8_fpm (z0, z4, z5, 1, fpm0),
+	     z0 = svmlalltb_lane_fpm (z0, z4, z5, 1, fpm0))
+
+/*
+** mlalltb_lane_z8_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z8\.d
+**	fmlalltb	z0\.s, z1\.b, \1\.b\[1\]
+**	ldr	d8, \[sp\], 32
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlalltb_lane_z8_f32, svfloat32_t, svmfloat8_t, z8,
+		    z0 = svmlalltb_lane_f32_mf8_fpm (z0, z1, z8, 1, fpm0),
+		    z0 = svmlalltb_lane_fpm (z0, z1, z8, 1, fpm0))
+
+/*
+** mlalltb_lane_z16_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z16\.d
+**	fmlalltb	z0\.s, z1\.b, \1\.b\[15\]
+**	...
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlalltb_lane_z16_f32, svfloat32_t, svmfloat8_t, z16,
+		    z0 = svmlalltb_lane_f32_mf8_fpm (z0, z1, z16, 15, fpm0),
+		    z0 = svmlalltb_lane_fpm (z0, z1, z16, 15, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c
new file mode 100644
index 00000000000..a921dbd1881
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltb_mf8.c
@@ -0,0 +1,78 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlalltb_f32_mf8_tied1:
+**	msr	fpmr, x0
+**	fmlalltb	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlalltb_f32_mf8_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltb_f32_mf8_fpm (z0, z4, z5, fpm0),
+	     z0 = svmlalltb_fpm (z0, z4, z5, fpm0))
+
+/*
+** mlalltb_f32_mf8_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltb	z0\.s, \1\.b, z1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltb_f32_mf8_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltb_f32_mf8_fpm (z4, z0, z1, fpm0),
+		 z0_res = svmlalltb_fpm (z4, z0, z1, fpm0))
+
+/*
+** mlalltb_f32_mf8_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltb	z0\.s, z1\.b, \1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltb_f32_mf8_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltb_f32_mf8_fpm (z4, z1, z0, fpm0),
+		 z0_res = svmlalltb_fpm (z4, z1, z0, fpm0))
+
+/*
+** mlalltb_f32_mf8_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlalltb	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlalltb_f32_mf8_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltb_f32_mf8_fpm (z1, z4, z5, fpm0),
+	     z0 = svmlalltb_fpm (z1, z4, z5, fpm0))
+
+/*
+** mlalltb_h7_f32_tied1:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	fmlalltb	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlalltb_h7_f32_tied1, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlalltb_n_f32_mf8_fpm (z0, z4, d7, fpm0),
+	      z0 = svmlalltb_fpm (z0, z4, d7, fpm0))
+
+/*
+** mlalltb_h7_f32_untied:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	movprfx	z0, z1
+**	fmlalltb	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlalltb_h7_f32_untied, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlalltb_n_f32_mf8_fpm (z1, z4, d7, fpm0),
+	      z0 = svmlalltb_fpm (z1, z4, d7, fpm0))
diff --git
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c
new file mode 100644
index 00000000000..fc5bfba7877
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_lane_mf8.c
@@ -0,0 +1,91 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlalltt_lane_0_f32_tied1:
+**	msr	fpmr, x0
+**	fmlalltt	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlalltt_lane_0_f32_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltt_lane_f32_mf8_fpm (z0, z4, z5, 0, fpm0),
+	     z0 = svmlalltt_lane_fpm (z0, z4, z5, 0, fpm0))
+
+/*
+** mlalltt_lane_0_f32_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltt	z0\.s, \1\.b, z1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltt_lane_0_f32_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltt_lane_f32_mf8_fpm (z4, z0, z1, 0, fpm0),
+		 z0_res = svmlalltt_lane_fpm (z4, z0, z1, 0, fpm0))
+
+/*
+** mlalltt_lane_0_f32_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltt	z0\.s, z1\.b, \1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltt_lane_0_f32_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltt_lane_f32_mf8_fpm (z4, z1, z0, 0, fpm0),
+		 z0_res = svmlalltt_lane_fpm (z4, z1, z0, 0, fpm0))
+
+/*
+** mlalltt_lane_0_f32_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlalltt	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlalltt_lane_0_f32_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltt_lane_f32_mf8_fpm (z1, z4, z5, 0, fpm0),
+	     z0 = svmlalltt_lane_fpm (z1, z4, z5, 0, fpm0))
+
+/*
+** mlalltt_lane_1_f32:
+**	msr	fpmr, x0
+**	fmlalltt	z0\.s, z4\.b, z5\.b\[1\]
+**	ret
+*/
+TEST_DUAL_Z (mlalltt_lane_1_f32, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltt_lane_f32_mf8_fpm (z0, z4, z5, 1, fpm0),
+	     z0 = svmlalltt_lane_fpm (z0, z4, z5, 1, fpm0))
+
+/*
+** mlalltt_lane_z8_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z8\.d
+**	fmlalltt	z0\.s, z1\.b, \1\.b\[1\]
+**	ldr	d8, \[sp\], 32
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlalltt_lane_z8_f32, svfloat32_t, svmfloat8_t, z8,
+		    z0 = svmlalltt_lane_f32_mf8_fpm (z0, z1, z8, 1, fpm0),
+		    z0 = svmlalltt_lane_fpm (z0, z1, z8, 1, fpm0))
+
+/*
+** mlalltt_lane_z16_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z16\.d
+**	fmlalltt	z0\.s, z1\.b, \1\.b\[15\]
+**	...
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlalltt_lane_z16_f32, svfloat32_t, svmfloat8_t, z16,
+		    z0 = svmlalltt_lane_f32_mf8_fpm (z0, z1, z16, 15, fpm0),
+		    z0 = svmlalltt_lane_fpm (z0, z1, z16, 15, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c
new file mode 100644
index 00000000000..5cd6beb348a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalltt_mf8.c
@@ -0,0 +1,78 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlalltt_f32_mf8_tied1:
+**	msr	fpmr, x0
+**	fmlalltt	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlalltt_f32_mf8_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltt_f32_mf8_fpm (z0, z4, z5, fpm0),
+	     z0 = svmlalltt_fpm (z0, z4, z5, fpm0))
+
+/*
+** mlalltt_f32_mf8_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltt	z0\.s, \1\.b, z1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltt_f32_mf8_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltt_f32_mf8_fpm (z4, z0, z1, fpm0),
+		 z0_res = svmlalltt_fpm (z4, z0, z1, fpm0))
+
+/*
+** mlalltt_f32_mf8_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalltt	z0\.s, z1\.b, \1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalltt_f32_mf8_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svmlalltt_f32_mf8_fpm (z4, z1, z0, fpm0),
+		 z0_res = svmlalltt_fpm (z4, z1, z0, fpm0))
+
+/*
+** mlalltt_f32_mf8_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlalltt	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlalltt_f32_mf8_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svmlalltt_f32_mf8_fpm (z1, z4, z5, fpm0),
+	     z0 = svmlalltt_fpm (z1, z4, z5, fpm0))
+
+/*
+** mlalltt_h7_f32_tied1:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	fmlalltt	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlalltt_h7_f32_tied1, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlalltt_n_f32_mf8_fpm (z0, z4, d7, fpm0),
+	      z0 = svmlalltt_fpm (z0, z4, d7, fpm0))
+
+/*
+** mlalltt_h7_f32_untied:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	movprfx	z0, z1
+**	fmlalltt	z0\.s, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlalltt_h7_f32_untied, svfloat32_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlalltt_n_f32_mf8_fpm (z1, z4, d7, fpm0),
+	      z0 = svmlalltt_fpm (z1, z4, d7, fpm0))
diff --git
a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c
new file mode 100644
index 00000000000..4f5a1045420
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_lane_mf8.c
@@ -0,0 +1,91 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlalt_lane_0_f16_tied1:
+**	msr	fpmr, x0
+**	fmlalt	z0\.h, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlalt_lane_0_f16_tied1, svfloat16_t, svmfloat8_t,
+	     z0 = svmlalt_lane_f16_mf8_fpm (z0, z4, z5, 0, fpm0),
+	     z0 = svmlalt_lane_fpm (z0, z4, z5, 0, fpm0))
+
+/*
+** mlalt_lane_0_f16_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalt	z0\.h, \1\.b, z1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalt_lane_0_f16_tied2, svfloat16_t, svmfloat8_t,
+		 z0_res = svmlalt_lane_f16_mf8_fpm (z4, z0, z1, 0, fpm0),
+		 z0_res = svmlalt_lane_fpm (z4, z0, z1, 0, fpm0))
+
+/*
+** mlalt_lane_0_f16_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalt	z0\.h, z1\.b, \1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalt_lane_0_f16_tied3, svfloat16_t, svmfloat8_t,
+		 z0_res = svmlalt_lane_f16_mf8_fpm (z4, z1, z0, 0, fpm0),
+		 z0_res = svmlalt_lane_fpm (z4, z1, z0, 0, fpm0))
+
+/*
+** mlalt_lane_0_f16_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlalt	z0\.h, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (mlalt_lane_0_f16_untied, svfloat16_t, svmfloat8_t,
+	     z0 = svmlalt_lane_f16_mf8_fpm (z1, z4, z5, 0, fpm0),
+	     z0 = svmlalt_lane_fpm (z1, z4, z5, 0, fpm0))
+
+/*
+** mlalt_lane_1_f16:
+**	msr	fpmr, x0
+**	fmlalt	z0\.h, z4\.b, z5\.b\[1\]
+**	ret
+*/
+TEST_DUAL_Z (mlalt_lane_1_f16, svfloat16_t, svmfloat8_t,
+	     z0 = svmlalt_lane_f16_mf8_fpm (z0, z4, z5, 1, fpm0),
+	     z0 = svmlalt_lane_fpm (z0, z4, z5, 1, fpm0))
+
+/*
+** mlalt_lane_z8_f16:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z8\.d
+**	fmlalt	z0\.h, z1\.b, \1\.b\[1\]
+**	ldr	d8, \[sp\], 32
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlalt_lane_z8_f16, svfloat16_t, svmfloat8_t, z8,
+		    z0 = svmlalt_lane_f16_mf8_fpm (z0, z1, z8, 1, fpm0),
+		    z0 = svmlalt_lane_fpm (z0, z1, z8, 1, fpm0))
+
+/*
+** mlalt_lane_z16_f16:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z16\.d
+**	fmlalt	z0\.h, z1\.b, \1\.b\[15\]
+**	...
+**	ret
+*/
+TEST_DUAL_LANE_REG (mlalt_lane_z16_f16, svfloat16_t, svmfloat8_t, z16,
+		    z0 = svmlalt_lane_f16_mf8_fpm (z0, z1, z16, 15, fpm0),
+		    z0 = svmlalt_lane_fpm (z0, z1, z16, 15, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c
new file mode 100644
index 00000000000..3a305d31cb8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/mlalt_mf8.c
@@ -0,0 +1,78 @@
+/* { dg-do assemble { target aarch64_asm_fp8fma_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8fma_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8fma"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8fma"
+#endif
+
+/*
+** mlalt_f16_mf8_tied1:
+**	msr	fpmr, x0
+**	fmlalt	z0\.h, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlalt_f16_mf8_tied1, svfloat16_t, svmfloat8_t,
+	     z0 = svmlalt_f16_mf8_fpm (z0, z4, z5, fpm0),
+	     z0 = svmlalt_fpm (z0, z4, z5, fpm0))
+
+/*
+** mlalt_f16_mf8_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalt	z0\.h, \1\.b, z1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalt_f16_mf8_tied2, svfloat16_t, svmfloat8_t,
+		 z0_res = svmlalt_f16_mf8_fpm (z4, z0, z1, fpm0),
+		 z0_res = svmlalt_fpm (z4, z0, z1, fpm0))
+
+/*
+** mlalt_f16_mf8_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fmlalt	z0\.h, z1\.b, \1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (mlalt_f16_mf8_tied3, svfloat16_t, svmfloat8_t,
+		 z0_res = svmlalt_f16_mf8_fpm (z4, z1, z0, fpm0),
+		 z0_res = svmlalt_fpm (z4, z1, z0, fpm0))
+
+/*
+** mlalt_f16_mf8_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fmlalt	z0\.h, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (mlalt_f16_mf8_untied, svfloat16_t, svmfloat8_t,
+	     z0 = svmlalt_f16_mf8_fpm (z1, z4, z5, fpm0),
+	     z0 = svmlalt_fpm (z1, z4, z5, fpm0))
+
+/*
+** mlalt_h7_f16_tied1:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	fmlalt	z0\.h, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlalt_h7_f16_tied1, svfloat16_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlalt_n_f16_mf8_fpm (z0, z4, d7, fpm0),
+	      z0 = svmlalt_fpm (z0, z4, d7, fpm0))
+
+/*
+** mlalt_h7_f16_untied:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+\.b), b7
+**	movprfx	z0, z1
+**	fmlalt	z0\.h, z4\.b, \1
+**	ret
+*/
+TEST_DUAL_ZD (mlalt_h7_f16_untied, svfloat16_t, svmfloat8_t, mfloat8_t,
+	      z0 = svmlalt_n_f16_mf8_fpm (z1, z4, d7, fpm0),
+	      z0 = svmlalt_fpm (z1, z4, d7, fpm0))
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index
a3edccf1fda..a122178bd21 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -12140,7 +12140,8 @@ proc check_effective_target_aarch64_tiny { } { foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve" "i8mm" "f32mm" "f64mm" "bf16" "sb" "sve2" "ls64" "sme" "sme-i16i64" "sme2" "sve-b16b16" - "sme-b16b16" "sme-f16f16" "sme2p1" "fp8" } { + "sme-b16b16" "sme-f16f16" "sme2p1" "fp8" "fp8fma" + "ssve-fp8fma" } { eval [string map [list FUNC $aarch64_ext] { proc check_effective_target_aarch64_asm_FUNC_ok { } { if { [istarget aarch64*-*-*] } { From patchwork Thu Nov 28 21:12:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Claudio Bantaloukas X-Patchwork-Id: 102049 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 136473858423 for ; Thu, 28 Nov 2024 21:19:29 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 136473858423 Authentication-Results: sourceware.org; dkim=pass (1024-bit key, unprotected) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=hYVHZoyN; dkim=pass (1024-bit key) header.d=arm.com header.i=@arm.com header.a=rsa-sha256 header.s=selector1 header.b=hYVHZoyN X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from DUZPR83CU001.outbound.protection.outlook.com (mail-northeuropeazlp170130004.outbound.protection.outlook.com [IPv6:2a01:111:f403:c200::4]) by sourceware.org (Postfix) with ESMTPS id D3CE53858D39 for ; Thu, 28 Nov 2024 21:13:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D3CE53858D39 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 
sourceware.org D3CE53858D39 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=2a01:111:f403:c200::4 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1732828381; cv=pass; b=Q+yj20Bg14/xawWf38O9dPKoFT5x3pq7LR8CTSFJd5gecozzcNu7k0VrLXwoxcrfX1LMj4KxHsd8oUBmK520aoo/OkJDYt7r0gPmU+YIW6wScRd26BglPIsWvQH70FP442zt/d70jJzYJZtxNE0vqTAEFTNdP9d6X4ZJKT2F2FA= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1732828381; c=relaxed/simple; bh=v0pbrdFyblQNCP3jto20h/Ji8qYbiPJY7mQyn6xbugw=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=YvVkAdYqie2o0AwBM5AJpfOMgJWsomwD+B7SBjJCluHjmhCRLWiSxCUarTZbeUGl37y6t6D6Vz05AaTKvDgCheCGvAUvU8l5NoQdczx8yMFxb6fNJnJZOfrRUY7CZoNLqqRAbjTVqagHKNpJCtovdvqMu8qbpJVYod1i0QqF/LU= ARC-Authentication-Results: i=3; server2.sourceware.org DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D3CE53858D39 ARC-Seal: i=2; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=pass; b=ObmRbk4uZXjy/GlGpUQpZ8KGW4x1pJNp6o+I1b0ZD+GAxVcpFiE6n3YyOikowJJ4YveymAn1nFDqz5rlxvu4ogmugeQhZqBWKXNybJOJn+qKLw1kv0R7l4uuBTp1FcJjrGbP3fAmJAxoqMK4hMOoCfYJcARWNW25UORKSA6ZkdS197LOhZIjGKBfTEyqVUHACaPMl+4vSjEN03ZJ7m7+XEEhwsN40m3fO3XChGebM+9rqhc47QROfwYZRl5rGxGRxSISruuNcxRihmO+3WwzMlHZ7TcFP1upYrVF9DOYCClNdcnZ4ReTjOnCB1PH6lmKSQMOHgxrONNicM/CmRmeYA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=IwGUzlxoWT1KJYvBFIsM9JRIJpax4LOo0Mr16p8nRQU=; b=KOsErBR0myWfRqcUPtyW2f0eqOIqS/ZEtt3DpUU1Gr0WlvkIRKDfACqDqRuke+9odTOEqo1yx/Spr4IMPmmhbbCaDb2OdyHq0zKGvoxJ165bH5dOmtUvR1hHVWii29N6vjGr2XIHT5je4p2nEaHathNo2ZXk2QDV8l8bmUnsNtGN/Jbh4VQ33QSwvhURmvTTJJ4a9PJV9+6RpmdL1TAtVE2ayvQcrqARi6mjo/5X8rqeNqcLCnMrFMj+00bZ8mni8vTWmJ31F78siEEw1ttninIHi1KbIrHutYouqVRLrH+vMojKT+XQ6Rbj5nOk1SHjGCQSIsD3aXW+v5nelbXsSw== 
ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=arm.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IwGUzlxoWT1KJYvBFIsM9JRIJpax4LOo0Mr16p8nRQU=; b=hYVHZoyNnIm6CaSKIms+VhlNGxsERhMKEnpPEuD1pmsxzyc7hD8LkDeUpz0POIPlxgd5YhW9to24mx5zqZVMg5kN0lDUeEk4hVKOVwwv3sAVhrt5QE27+AHhAjyftXVoAQPJ5EwyfzB7/YDc970WwBj+B+0AEoAyMqRMPGZaL/k= Received: from DU6P191CA0005.EURP191.PROD.OUTLOOK.COM (2603:10a6:10:540::18) by AS8PR08MB8224.eurprd08.prod.outlook.com (2603:10a6:20b:52b::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8207.12; Thu, 28 Nov 2024 21:12:54 +0000 Received: from DU2PEPF00028CFE.eurprd03.prod.outlook.com (2603:10a6:10:540:cafe::22) by DU6P191CA0005.outlook.office365.com (2603:10a6:10:540::18) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8207.14 via Frontend Transport; Thu, 28 Nov 2024 21:12:54 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=arm.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DU2PEPF00028CFE.mail.protection.outlook.com (10.167.242.182) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8207.12 via Frontend Transport; Thu, 
28 Nov 2024 21:12:54 +0000 Received: ("Tessian outbound 3b1f0cd68b0e:v514"); Thu, 28 Nov 2024 21:12:54 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 722045d75d011195 X-TessianGatewayMetadata: pTn+4Cf1BJllkEK7dvdaSe9J8dMJVHMMBPTjdPELYtQrj6/rnKDqcnzQUgNUViHUqN3/2be/l3tJINj63slbXnNEvBY2prbY67j3JNJNeqRwc5pkYDTXUtp/SS0K3NpX5PZxLhWuL8KN7qRSfuqC2ifC44+G4w43Q6c1njkorJWPm8gjDL/tNAuveqQ8jSVY X-CR-MTA-TID: 64aa7808 Received: from L72d38c7719d6.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 1221766E-D0B3-4590-85DD-2B6A6ABA77D3.1; Thu, 28 Nov 2024 21:12:42 +0000 Received: from EUR05-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id L72d38c7719d6.1 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384); Thu, 28 Nov 2024 21:12:42 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=Aw47Y1svbIjcjfnOU6uPdWUuEjLOt5wUf5oxhs8saamIttVwmWyv296RqmNS4PEC83aGkGR9b3Wmq3pO8AnTH95hdEihJ8o6gVKsLervSdmnBd2if7OeyF21PgYOHvCV6hhTRxHqE65vu2yX1lPH2Vshe8c/SdNL7td4sk7h5ohVgS733l757NBIT+vZiZhaGA4i6yayxfjkZ19lPzw9mEQnZLGVz2oRtDgUn+66LQ1DubdLnxreLm5H/q7IUK2vHWhDmhxJGACHhVpu5/hyyT1EdF103j7gMfoKvfosHuLmsz8m7NdgNgod8MJ97sTnQGxa/A6VxB0N25d7zQoIGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=IwGUzlxoWT1KJYvBFIsM9JRIJpax4LOo0Mr16p8nRQU=; b=XxAVWlIeEBsC7yhRseAzDXpmmtn9KOezflSRsd29amUXoweBdGMgB3I6Wraz+AqTJJ5zl8XWLtsXcX5jeaOxp4hu9vCY+hNh36KYxtZu/EWl3yduEp4IMXcqq0oW8nTOMV8vU4uq7qQRteajO9xDzOl+4OXFalISildibqTRyGNf4ZCmlbL99Lk5vHQwdpHlAzCsjAHQFACnX7UuD0d248HvCQ7ePLPyWwxJwBzuz+imxmChfNkTxvAVwY9QkB2uNZP+AFkFDnoUlJq41IBijEj56JqbZwovICzjq5sxZnn7VAwr4p68JRKROfepkB1Nd+jB9uXSLkpQXQxbaJzVng== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) 
smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=arm.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=IwGUzlxoWT1KJYvBFIsM9JRIJpax4LOo0Mr16p8nRQU=; b=hYVHZoyNnIm6CaSKIms+VhlNGxsERhMKEnpPEuD1pmsxzyc7hD8LkDeUpz0POIPlxgd5YhW9to24mx5zqZVMg5kN0lDUeEk4hVKOVwwv3sAVhrt5QE27+AHhAjyftXVoAQPJ5EwyfzB7/YDc970WwBj+B+0AEoAyMqRMPGZaL/k= Received: from DB8PR06CA0051.eurprd06.prod.outlook.com (2603:10a6:10:120::25) by DB9PR08MB6636.eurprd08.prod.outlook.com (2603:10a6:10:250::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8207.14; Thu, 28 Nov 2024 21:12:38 +0000 Received: from DU6PEPF0000A7E4.eurprd02.prod.outlook.com (2603:10a6:10:120:cafe::14) by DB8PR06CA0051.outlook.office365.com (2603:10a6:10:120::25) with Microsoft SMTP Server (version=TLS1_3, cipher=TLS_AES_256_GCM_SHA384) id 15.20.8207.13 via Frontend Transport; Thu, 28 Nov 2024 21:12:38 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by DU6PEPF0000A7E4.mail.protection.outlook.com (10.167.8.43) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.8207.12 via Frontend Transport; Thu, 28 Nov 2024 21:12:38 +0000 Received: from AZ-NEU-EX04.Arm.com (10.251.24.32) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.39; Thu, 28 Nov 2024 
From: Claudio Bantaloukas
To:
CC: Claudio Bantaloukas
Subject: [PATCH v5 5/5] aarch64: add SVE2 FP8DOT2 and FP8DOT4 intrinsics
Date: Thu, 28 Nov 2024 21:12:34 +0000
Message-ID: <20241128211234.1714776-6-claudio.bantaloukas@arm.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>
References: <20241128211234.1714776-1-claudio.bantaloukas@arm.com>

This patch adds support for the following intrinsics:
- svdot[_f32_mf8]_fpm
- svdot_lane[_f32_mf8]_fpm
- svdot[_f16_mf8]_fpm
- svdot_lane[_f16_mf8]_fpm

The first two are available under a combination of the FP8DOT4 and SVE2
features; alternatively, they are available under the SSVE_FP8DOT4 feature
in streaming mode.  The final two are available under a combination of the
FP8DOT2 and SVE2 features.
Alternatively, they are available under the SSVE_FP8DOT2 feature in
streaming mode.

gcc/
	* config/aarch64/aarch64-option-extensions.def
	(fp8dot4, ssve-fp8dot4): Add new extensions.
	(fp8dot2, ssve-fp8dot2): Likewise.
	* config/aarch64/aarch64-sve-builtins-base.cc (svdot_impl): Support
	fp8.
	(svdotprod_lane_impl): Likewise.
	(svdot_lane): Provide an unspec for fp8 types.
	* config/aarch64/aarch64-sve-builtins-shapes.cc (ternary_mfloat8_def):
	Add new class.
	(ternary_mfloat8): Add new shape.
	(ternary_mfloat8_lane_group_selection_def): Add new class.
	(ternary_mfloat8_lane_group_selection): Add new shape.
	* config/aarch64/aarch64-sve-builtins-shapes.h (ternary_mfloat8)
	(ternary_mfloat8_lane_group_selection): Declare.
	* config/aarch64/aarch64-sve-builtins-sve2.def (svdot, svdot_lane):
	Add new DEF_SVE_FUNCTION_GS_FPM, twice to deal with the combination
	of features providing support for 32 and 16 bit floating point.
	* config/aarch64/aarch64-sve2.md (@aarch64_sve_dot): Add new.
	(@aarch64_sve_dot_lane): Likewise.
	* config/aarch64/aarch64.h (TARGET_FP8DOT4, TARGET_SSVE_FP8DOT4):
	Add new defines.
	(TARGET_FP8DOT2, TARGET_SSVE_FP8DOT2): Likewise.
	* config/aarch64/iterators.md (UNSPEC_DOT_FP8, UNSPEC_DOT_LANE_FP8):
	Add new unspecs.
	* doc/invoke.texi: Document fp8dot4, fp8dot2, ssve-fp8dot4,
	ssve-fp8dot2 extensions.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_1.c: Add new.
	* gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c:
	Likewise.
	* gcc.target/aarch64/sve2/acle/asm/dot_lane_mf8.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/dot_mf8.c: Likewise.
	* lib/target-supports.exp: Add dg-require-effective-target support for
	aarch64_asm_fp8dot2_ok, aarch64_asm_fp8dot4_ok,
	aarch64_asm_ssve-fp8dot2_ok and aarch64_asm_ssve-fp8dot4_ok.
---
 .../aarch64/aarch64-option-extensions.def     |   8 +
 .../aarch64/aarch64-sve-builtins-base.cc      |  56 +++---
 .../aarch64/aarch64-sve-builtins-shapes.cc    |  48 +++++
 .../aarch64/aarch64-sve-builtins-shapes.h     |   8 +-
 .../aarch64/aarch64-sve-builtins-sve2.def     |  14 ++
 gcc/config/aarch64/aarch64-sve2.md            |  41 +++++
 gcc/config/aarch64/aarch64.h                  |  18 ++
 gcc/config/aarch64/iterators.md               |   2 +
 gcc/doc/invoke.texi                           |  12 ++
 .../sve/acle/general-c/ternary_mfloat8_1.c    |  33 ++++
 .../ternary_mfloat8_lane_group_selection_1.c  |  49 +++++
 .../aarch64/sve2/acle/asm/dot_lane_mf8.c      | 172 ++++++++++++++++++
 .../aarch64/sve2/acle/asm/dot_mf8.c           | 101 ++++++++++
 gcc/testsuite/lib/target-supports.exp         |   3 +-
 14 files changed, 541 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_lane_mf8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_mf8.c

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index f39c9e6f897..089a0a74ec0 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -251,6 +251,14 @@ AARCH64_OPT_EXTENSION("ssve-fp8fma", SSVE_FP8FMA, (SME2,FP8), (), (), "ssve-fp8f
 
 AARCH64_OPT_EXTENSION("faminmax", FAMINMAX, (SIMD), (), (), "faminmax")
 
+AARCH64_OPT_EXTENSION("fp8dot4", FP8DOT4, (FP8FMA), (), (), "fp8dot4")
+
+AARCH64_OPT_EXTENSION("ssve-fp8dot4", SSVE_FP8DOT4, (SSVE_FP8FMA), (), (), "ssve-fp8dot4")
+
+AARCH64_OPT_EXTENSION("fp8dot2", FP8DOT2, (FP8DOT4), (), (), "fp8dot2")
+
+AARCH64_OPT_EXTENSION("ssve-fp8dot2", SSVE_FP8DOT2, (SSVE_FP8DOT4), (), (), "ssve-fp8dot2")
+
 #undef AARCH64_OPT_FMV_EXTENSION
 #undef AARCH64_OPT_EXTENSION
 #undef AARCH64_FMV_FEATURE

diff --git
a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc index 95e66dc2adf..b97941932ab 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc @@ -838,21 +838,26 @@ public: rtx expand (function_expander &e) const override { - /* In the optab, the multiplication operands come before the accumulator - operand. The optab is keyed off the multiplication mode. */ - e.rotate_inputs_left (0, 3); insn_code icode; - if (e.type_suffix_ids[1] == NUM_TYPE_SUFFIXES) - icode = e.convert_optab_handler_for_sign (sdot_prod_optab, - udot_prod_optab, - 0, e.result_mode (), - GET_MODE (e.args[0])); + if (e.fpm_mode == aarch64_sve::FPM_set) + icode = code_for_aarch64_sve_dot (e.result_mode ()); else - icode = (e.type_suffix (0).float_p - ? CODE_FOR_aarch64_sve_fdotvnx4sfvnx8hf - : e.type_suffix (0).unsigned_p - ? CODE_FOR_udot_prodvnx4sivnx8hi - : CODE_FOR_sdot_prodvnx4sivnx8hi); + { + /* In the optab, the multiplication operands come before the accumulator + operand. The optab is keyed off the multiplication mode. */ + e.rotate_inputs_left (0, 3); + if (e.type_suffix_ids[1] == NUM_TYPE_SUFFIXES) + icode = e.convert_optab_handler_for_sign (sdot_prod_optab, + udot_prod_optab, + 0, e.result_mode (), + GET_MODE (e.args[0])); + else + icode = (e.type_suffix (0).float_p + ? CODE_FOR_aarch64_sve_fdotvnx4sfvnx8hf + : e.type_suffix (0).unsigned_p + ? CODE_FOR_udot_prodvnx4sivnx8hi + : CODE_FOR_sdot_prodvnx4sivnx8hi); + } return e.use_unpred_insn (icode); } }; @@ -865,17 +870,24 @@ public: rtx expand (function_expander &e) const override { + insn_code icode; machine_mode mode0 = GET_MODE (e.args[0]); machine_mode mode1 = GET_MODE (e.args[1]); - /* Use the same ordering as the dot_prod_optab, with the - accumulator last. 
*/ - e.rotate_inputs_left (0, 4); - int unspec = unspec_for (e); - insn_code icode; - if (unspec == UNSPEC_FDOT) - icode = CODE_FOR_aarch64_fdot_prod_lanevnx4sfvnx8hf; + if (e.fpm_mode == aarch64_sve::FPM_set) + { + icode = code_for_aarch64_sve_dot_lane (mode0); + } else - icode = code_for_aarch64_dot_prod_lane (unspec, mode0, mode1); + { + /* Use the same ordering as the dot_prod_optab, with the + accumulator last. */ + e.rotate_inputs_left (0, 4); + int unspec = unspec_for (e); + if (unspec == UNSPEC_FDOT) + icode = CODE_FOR_aarch64_fdot_prod_lanevnx4sfvnx8hf; + else + icode = code_for_aarch64_dot_prod_lane (unspec, mode0, mode1); + } return e.use_exact_insn (icode); } }; @@ -3255,7 +3267,7 @@ FUNCTION (svdiv, svdiv_impl,) FUNCTION (svdivr, rtx_code_function_rotated, (DIV, UDIV, UNSPEC_COND_FDIV)) FUNCTION (svdot, svdot_impl,) FUNCTION (svdot_lane, svdotprod_lane_impl, (UNSPEC_SDOT, UNSPEC_UDOT, - UNSPEC_FDOT)) + UNSPEC_FDOT, UNSPEC_DOT_LANE_FP8)) FUNCTION (svdup, svdup_impl,) FUNCTION (svdup_lane, svdup_lane_impl,) FUNCTION (svdupq, svdupq_impl,) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc index 94f4da8ce31..cf3ddab09b6 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc @@ -4005,6 +4005,34 @@ struct ternary_bfloat_def }; SHAPE (ternary_bfloat) +/* sv_t svfoo[_t0](sv_t, svmfloat8_t, svmfloat8_t). 
*/ +struct ternary_mfloat8_def + : public ternary_resize2_base<8, TYPE_mfloat, TYPE_mfloat> +{ + void + build (function_builder &b, const function_group_info &group) const override + { + gcc_assert (group.fpm_mode == FPM_set); + b.add_overloaded_functions (group, MODE_none); + build_all (b, "v0,v0,vM,vM", group, MODE_none); + } + + tree + resolve (function_resolver &r) const override + { + type_suffix_index type; + if (!r.check_num_arguments (4) + || (type = r.infer_vector_type (0)) == NUM_TYPE_SUFFIXES + || !r.require_vector_type (1, VECTOR_TYPE_svmfloat8_t) + || !r.require_vector_type (2, VECTOR_TYPE_svmfloat8_t) + || !r.require_scalar_type (3, "uint64_t")) + return error_mark_node; + + return r.resolve_to (r.mode_suffix_id, type, TYPE_SUFFIX_mf8, GROUP_none); + } +}; +SHAPE (ternary_mfloat8) + /* sv_t svfoo[_t0](sv_t, svbfloat16_t, svbfloat16_t, uint64_t) where the final argument is an integer constant expression in the range @@ -4057,6 +4085,26 @@ struct ternary_mfloat8_lane_def }; SHAPE (ternary_mfloat8_lane) +/* sv_t svfoo[_t0](sv_t, svmfloat8_t, svmfloat8_t, uint64_t) + + where the final argument is an integer constant expression in the range + [0, 7] or [0, 3]. */ +struct ternary_mfloat8_lane_group_selection_def + : public ternary_mfloat8_lane_def +{ + bool + check (function_checker &c) const override + { + machine_mode mode = c.vector_mode (0); + if (mode == E_VNx8HFmode) + return c.require_immediate_lane_index (3, 2, 2); + else if (mode == E_VNx4SFmode) + return c.require_immediate_lane_index (3, 2, 4); + gcc_unreachable (); + } +}; +SHAPE (ternary_mfloat8_lane_group_selection) + /* sv_t svfoo[_t0](sv_t, svbfloatt16_t, svbfloat16_t) sv_t svfoo[_n_t0](sv_t, svbfloat16_t, bfloat16_t). 
*/ struct ternary_bfloat_opt_n_def diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h index 1c8937ae027..c7e448c1fd4 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h +++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h @@ -71,7 +71,11 @@ namespace aarch64_sve scalar displacement". - "_pred" indicates that the function takes an svbool_t argument - that does not act as a governing predicate.. */ + that does not act as a governing predicate.. + + - "_group_selection" indicates that the function takes an imm integer + argument that selects a specific group of elements that fit a 128 bit + vector. */ namespace shapes { extern const function_shape *const adr_index; @@ -213,7 +217,9 @@ namespace aarch64_sve extern const function_shape *const ternary_lane_rotate; extern const function_shape *const ternary_long_lane; extern const function_shape *const ternary_long_opt_n; + extern const function_shape *const ternary_mfloat8; extern const function_shape *const ternary_mfloat8_lane; + extern const function_shape *const ternary_mfloat8_lane_group_selection; extern const function_shape *const ternary_mfloat8_opt_n; extern const function_shape *const ternary_opt_n; extern const function_shape *const ternary_qq_or_011_lane; diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def index b489e8fad2f..082dec1377d 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def @@ -396,3 +396,17 @@ DEF_SVE_FUNCTION_GS_FPM (svmlallbb_lane, ternary_mfloat8_lane, s_float_mf8, none DEF_SVE_FUNCTION_GS_FPM (svmlallbt_lane, ternary_mfloat8_lane, s_float_mf8, none, none, set) DEF_SVE_FUNCTION_GS_FPM (svmlalltb_lane, ternary_mfloat8_lane, s_float_mf8, none, none, set) #undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS \ + streaming_compatible (AARCH64_FL_SVE2 | AARCH64_FL_FP8DOT4, \ + 
AARCH64_FL_SSVE_FP8DOT4) +DEF_SVE_FUNCTION_GS_FPM (svdot, ternary_mfloat8, s_float_mf8, none, none, set) +DEF_SVE_FUNCTION_GS_FPM (svdot_lane, ternary_mfloat8_lane_group_selection, s_float_mf8, none, none, set) +#undef REQUIRED_EXTENSIONS + +#define REQUIRED_EXTENSIONS \ + streaming_compatible (AARCH64_FL_SVE2 | AARCH64_FL_FP8DOT2, \ + AARCH64_FL_SSVE_FP8DOT2) +DEF_SVE_FUNCTION_GS_FPM (svdot, ternary_mfloat8, h_float_mf8, none, none, set) +DEF_SVE_FUNCTION_GS_FPM (svdot_lane, ternary_mfloat8_lane_group_selection, h_float_mf8, none, none, set) +#undef REQUIRED_EXTENSIONS diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 5498eac0b03..219e9fc1c81 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -68,6 +68,7 @@ ;; ---- [INT] Shift-and-insert operations ;; ---- [INT] Sum of absolute differences ;; ---- [FP] Mfloat8 Multiply-and-accumulate operations +;; ---- [FP] Mfloat8 dot products ;; ;; == Extending arithmetic ;; ---- [INT] Multi-register widening conversions @@ -2074,6 +2075,46 @@ (define_insn "@aarch64_sve_add_lane_" } ) +;; ------------------------------------------------------------------------- +;; ---- [FP] Mfloat8 dot products +;; ------------------------------------------------------------------------- +;; Includes: +;; - FDOT (4-way, vectors) +;; - FDOT (4-way, indexed) +;; - FDOT (2-way, vectors) +;; - FDOT (2-way, indexed) +;; ------------------------------------------------------------------------- +(define_insn "@aarch64_sve_dot" + [(set (match_operand:SVE_FULL_HSF 0 "register_operand") + (unspec:SVE_FULL_HSF + [(match_operand:SVE_FULL_HSF 1 "register_operand") + (match_operand:VNx16QI 2 "register_operand") + (match_operand:VNx16QI 3 "register_operand") + (reg:DI FPM_REGNUM)] + UNSPEC_DOT_FP8))] + "TARGET_SSVE_FP8DOT4 && !(mode == VNx8HFmode && !TARGET_SSVE_FP8DOT2)" + {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] + [ w , 0 , w , w ; * ] fdot\t%0., %2.b, %3.b + [ ?&w , w , 
w , w ; yes ] movprfx\t%0, %1\;fdot\t%0., %2.b, %3.b + } +) + +(define_insn "@aarch64_sve_dot_lane" + [(set (match_operand:SVE_FULL_HSF 0 "register_operand") + (unspec:SVE_FULL_HSF + [(match_operand:SVE_FULL_HSF 1 "register_operand") + (match_operand:VNx16QI 2 "register_operand") + (match_operand:VNx16QI 3 "register_operand") + (match_operand:SI 4 "const_int_operand") + (reg:DI FPM_REGNUM)] + UNSPEC_DOT_LANE_FP8))] + "TARGET_SSVE_FP8DOT4 && !(mode == VNx8HFmode && !TARGET_SSVE_FP8DOT2)" + {@ [ cons: =0 , 1 , 2 , 3 ; attrs: movprfx ] + [ w , 0 , w , y ; * ] fdot\t%0., %2.b, %3.b[%4] + [ ?&w , w , w , y ; yes ] movprfx\t%0, %1\;fdot\t%0., %2.b, %3.b[%4] + } +) + ;; ========================================================================= ;; == Extending arithmetic ;; ========================================================================= diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index 80a1fa40709..53b4f88b17a 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -527,6 +527,24 @@ constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED (((TARGET_SVE2 && TARGET_FP8FMA) || TARGET_STREAMING) \ && (AARCH64_HAVE_ISA (SSVE_FP8FMA) || TARGET_NON_STREAMING)) +/* fp8 four way dot product enabled through +fp8dot4. */ +#define TARGET_FP8DOT4 AARCH64_HAVE_ISA (FP8DOT4) + +/* Streaming versions of fp8 four way dot product instructions are enabled +through +ssve-fp8dot4. */ +#define TARGET_SSVE_FP8DOT4 ((\ + (TARGET_SVE2 && TARGET_FP8DOT4) || TARGET_STREAMING) \ + && (AARCH64_HAVE_ISA(SSVE_FP8DOT4) || TARGET_NON_STREAMING)) + +/* fp8 two way dot product enabled through +fp8dot2. */ +#define TARGET_FP8DOT2 AARCH64_HAVE_ISA (FP8DOT2) + +/* Streaming versions of fp8 two way dot product instructions are enabled +through +ssve-fp8dot2. */ +#define TARGET_SSVE_FP8DOT2 ((\ + (TARGET_SVE2 && TARGET_FP8DOT2) || TARGET_STREAMING) \ + && (AARCH64_HAVE_ISA(SSVE_FP8DOT2) || TARGET_NON_STREAMING)) + /* Standard register usage. 
*/ /* 31 64-bit general purpose registers R0-R30: diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md index 4b265a73d9a..4786b0210e7 100644 --- a/gcc/config/aarch64/iterators.md +++ b/gcc/config/aarch64/iterators.md @@ -962,6 +962,8 @@ (define_c_enum "unspec" UNSPEC_COND_FCVTX ; Used in aarch64-sve2.md. UNSPEC_COND_FCVTXNT ; Used in aarch64-sve2.md. UNSPEC_COND_FLOGB ; Used in aarch64-sve2.md. + UNSPEC_DOT_FP8 ; Used in aarch64-sve2.md. + UNSPEC_DOT_LANE_FP8 ; Used in aarch64-sve2.md. UNSPEC_EORBT ; Used in aarch64-sve2.md. UNSPEC_EORTB ; Used in aarch64-sve2.md. UNSPEC_F1CVT ; Used in aarch64-sve2.md. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 2a4f016e2df..f7440113570 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -21957,6 +21957,18 @@ Enable the fp8 (8-bit floating point) multiply accumulate extension. @item ssve-fp8fma Enable the fp8 (8-bit floating point) multiply accumulate extension in streaming mode. +@item fp8dot4 +Enable the fp8 (8-bit floating point) to single-precision 4-way dot product +extension. +@item ssve-fp8dot4 +Enable the fp8 (8-bit floating point) to single-precision 4-way dot product +extension in streaming mode. +@item fp8dot2 +Enable the fp8 (8-bit floating point) to half-precision 2-way dot product +extension. +@item ssve-fp8dot2 +Enable the fp8 (8-bit floating point) to half-precision 2-way dot product +extension in streaming mode. @item faminmax Enable the Floating Point Absolute Maximum/Minimum extension.
@item sve-b16b16 diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_1.c new file mode 100644 index 00000000000..9ad789a8ad2 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_1.c @@ -0,0 +1,33 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+sve2+fp8dot2") + +void +test (svfloat16_t f16, svmfloat8_t f8, fpm_t fpm, + svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32, + svbfloat16_t bf16, svfloat32_t f32, svfloat64_t f64, mfloat8_t f) +{ + svdot_fpm (f16, f8, f8, fpm); + svdot_fpm (f32, f8, f8, fpm); + + svdot_fpm (f16); /* { dg-error {too few arguments to function 'svdot_fpm'} } */ + svdot_fpm (f16, f8); /* { dg-error {too few arguments to function 'svdot_fpm'} } */ + svdot_fpm (f16, f8, f8); /* { dg-error {too few arguments to function 'svdot_fpm'} } */ + svdot_fpm (f8, f8, fpm); /* { dg-error {too few arguments to function 'svdot_fpm'} } */ + svdot_fpm (f16, f8, fpm); /* { dg-error {too few arguments to function 'svdot_fpm'} } */ + svdot_fpm (f16, f8, f8, fpm, 0); /* { dg-error {too many arguments to function 'svdot_fpm'} } */ + + svdot_fpm (0, f8, f8, fpm); /* { dg-error {passing 'int' to argument 1 of 'svdot_fpm', which expects an SVE type rather than a scalar} } */ + svdot_fpm (f16, f8, f, fpm); /* { dg-error {passing 'mfloat8_t' {aka '__mfp8'} to argument 3 of 'svdot_fpm', which expects 'svmfloat8_t'} } */ + svdot_fpm (pg, f8, f8, fpm); /* { dg-error {'svdot_fpm' has no form that takes 'svbool_t' and 'svmfloat8_t' arguments} } */ + svdot_fpm (u8, f8, f8, fpm); /* { dg-error {'svdot_fpm' has no form that takes 'svuint8_t' and 'svmfloat8_t' arguments} } */ + svdot_fpm (u16, f8, f8, fpm); /* { dg-error {'svdot_fpm' has no form that takes 'svuint16_t' and 'svmfloat8_t' arguments} } */ + svdot_fpm (f64, f8, f8, fpm); /* { dg-error {'svdot_fpm' has no form that takes 'svfloat64_t' and 
'svmfloat8_t' arguments} } */ + svdot_fpm (f16, 0, f8, fpm); /* { dg-error {passing 'int' to argument 2 of 'svdot_fpm', which expects 'svmfloat8_t'} } */ + svdot_fpm (f16, f16, f8, fpm); /* { dg-error {passing 'svfloat16_t' to argument 2 of 'svdot_fpm', which expects 'svmfloat8_t'} } */ + svdot_fpm (f16, f8, 0, fpm); /* { dg-error {passing 'int' to argument 3 of 'svdot_fpm', which expects 'svmfloat8_t'} } */ + svdot_fpm (f16, f8, f16, fpm); /* { dg-error {passing 'svfloat16_t' to argument 3 of 'svdot_fpm', which expects 'svmfloat8_t'} } */ + svdot_fpm (f16, f8, f8, f8); /* { dg-error {passing 'svmfloat8_t' to argument 4 of 'svdot_fpm', which expects 'uint64_t'} } */ +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c new file mode 100644 index 00000000000..dec00e3abf1 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/ternary_mfloat8_lane_group_selection_1.c @@ -0,0 +1,49 @@ +/* { dg-do compile } */ + +#include + +#pragma GCC target ("arch=armv8.2-a+ssve-fp8fma+ssve-fp8dot2") + +void +f1 (svfloat16_t f16, svmfloat8_t f8, fpm_t fpm, + svbool_t pg, svuint8_t u8, svuint16_t u16, svint32_t s32, + svbfloat16_t bf16, svfloat32_t f32, svfloat64_t f64, mfloat8_t f, int i) + __arm_streaming +{ + svdot_lane_fpm (f32, f8, f8, 0, fpm); + svdot_lane_fpm (f32, f8, f8, 3, fpm); + svdot_lane_fpm (f16, f8, f8, 0, fpm); + svdot_lane_fpm (f16, f8, f8, 7, fpm); + + svdot_lane_fpm (f32, f8, f8, -1, fpm); /* { dg-error {passing -1 to argument 4 of 'svdot_lane_fpm', which expects a value in the range \[0, 3\]} } */ + svdot_lane_fpm (f32, f8, f8, 4, fpm); /* { dg-error {passing 4 to argument 4 of 'svdot_lane_fpm', which expects a value in the range \[0, 3\]} } */ + svdot_lane_fpm (f16, f8, f8, -1, fpm); /* { dg-error {passing -1 to argument 4 of 'svdot_lane_fpm', which expects a value in the range \[0, 7\]} } */ + 
svdot_lane_fpm (f16, f8, f8, 8, fpm); /* { dg-error {passing 8 to argument 4 of 'svdot_lane_fpm', which expects a value in the range \[0, 7\]} } */ + + svdot_lane_fpm (f16); /* { dg-error {too few arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f8); /* { dg-error {too few arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f8, f8); /* { dg-error {too few arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f8, f8, 0); /* { dg-error {too few arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f8, f8, fpm); /* { dg-error {too few arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f8, 15, fpm); /* { dg-error {too few arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f8, f8, 15, fpm); /* { dg-error {too few arguments to function 'svdot_lane_fpm'} } */ + + svdot_lane_fpm (f16, f8, f8, 15, 0, fpm); /* { dg-error {too many arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f8, f8, 15, fpm, fpm); /* { dg-error {too many arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f8, f8, f8, 15, fpm); /* { dg-error {too many arguments to function 'svdot_lane_fpm'} } */ + svdot_lane_fpm (f16, f16, f8, f8, 15, fpm); /* { dg-error {too many arguments to function 'svdot_lane_fpm'} } */ + + svdot_lane_fpm (f32, bf16, bf16, 0, fpm); /* { dg-error {passing 'svbfloat16_t' to argument 2 of 'svdot_lane_fpm', which expects 'svmfloat8_t'} } */ + svdot_lane_fpm (0, f8, f8, 0, fpm); /* { dg-error {passing 'int' to argument 1 of 'svdot_lane_fpm', which expects an SVE type rather than a scalar} } */ + svdot_lane_fpm (pg, f8, f8, 0, fpm); /* { dg-error {'svdot_lane_fpm' has no form that takes 'svbool_t' and 'svmfloat8_t' arguments} } */ + svdot_lane_fpm (u8, f8, f8, 0, fpm); /* { dg-error {'svdot_lane_fpm' has no form that takes 'svuint8_t' and 'svmfloat8_t' arguments} } */ + svdot_lane_fpm (u16, f8, f8, 0, fpm); /* { dg-error {'svdot_lane_fpm' has no form that takes 
'svuint16_t' and 'svmfloat8_t' arguments} } */
+  svdot_lane_fpm (f64, f8, f8, 0, fpm); /* { dg-error {'svdot_lane_fpm' has no form that takes 'svfloat64_t' and 'svmfloat8_t' arguments} } */
+  svdot_lane_fpm (f16, 0, f8, 0, fpm); /* { dg-error {passing 'int' to argument 2 of 'svdot_lane_fpm', which expects 'svmfloat8_t'} } */
+  svdot_lane_fpm (f16, f32, f8, 0, fpm); /* { dg-error {passing 'svfloat32_t' to argument 2 of 'svdot_lane_fpm', which expects 'svmfloat8_t'} } */
+  svdot_lane_fpm (f16, f8, 0, 0, fpm); /* { dg-error {passing 'int' to argument 3 of 'svdot_lane_fpm', which expects 'svmfloat8_t'} } */
+  svdot_lane_fpm (f16, f8, f32, 0, fpm); /* { dg-error {passing 'svfloat32_t' to argument 3 of 'svdot_lane_fpm', which expects 'svmfloat8_t'} } */
+
+  svdot_lane_fpm (f16, f8, f8, s32, fpm); /* { dg-error {argument 4 of 'svdot_lane_fpm' must be an integer constant expression} } */
+  svdot_lane_fpm (f16, f8, f8, i, fpm); /* { dg-error {argument 4 of 'svdot_lane_fpm' must be an integer constant expression} } */
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_lane_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_lane_mf8.c
new file mode 100644
index 00000000000..9e54cd11c4b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_lane_mf8.c
@@ -0,0 +1,172 @@
+/* { dg-do assemble { target aarch64_asm_fp8dot2_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8dot2_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8dot2"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8dot2"
+#endif
+
+/*
+** dot_lane_0_f16_tied1:
+**	msr	fpmr, x0
+**	fdot	z0\.h, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (dot_lane_0_f16_tied1, svfloat16_t, svmfloat8_t,
+	     z0 = svdot_lane_f16_mf8_fpm (z0, z4, z5, 0, fpm0),
+	     z0 = svdot_lane_fpm (z0, z4, z5, 0, fpm0))
+
+/*
+** dot_lane_0_f16_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.h, \1\.b, z1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_lane_0_f16_tied2, svfloat16_t, svmfloat8_t,
+		 z0_res = svdot_lane_f16_mf8_fpm (z4, z0, z1, 0, fpm0),
+		 z0_res = svdot_lane_fpm (z4, z0, z1, 0, fpm0))
+
+/*
+** dot_lane_0_f16_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.h, z1\.b, \1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_lane_0_f16_tied3, svfloat16_t, svmfloat8_t,
+		 z0_res = svdot_lane_f16_mf8_fpm (z4, z1, z0, 0, fpm0),
+		 z0_res = svdot_lane_fpm (z4, z1, z0, 0, fpm0))
+
+/*
+** dot_lane_0_f16_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fdot	z0\.h, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (dot_lane_0_f16_untied, svfloat16_t, svmfloat8_t,
+	     z0 = svdot_lane_f16_mf8_fpm (z1, z4, z5, 0, fpm0),
+	     z0 = svdot_lane_fpm (z1, z4, z5, 0, fpm0))
+
+/*
+** dot_lane_1_f16:
+**	msr	fpmr, x0
+**	fdot	z0\.h, z4\.b, z5\.b\[1\]
+**	ret
+*/
+TEST_DUAL_Z (dot_lane_1_f16, svfloat16_t, svmfloat8_t,
+	     z0 = svdot_lane_f16_mf8_fpm (z0, z4, z5, 1, fpm0),
+	     z0 = svdot_lane_fpm (z0, z4, z5, 1, fpm0))
+
+/*
+** dot_lane_z8_f16:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z8\.d
+**	fdot	z0\.h, z1\.b, \1\.b\[1\]
+**	ldr	d8, \[sp\], 32
+**	ret
+*/
+TEST_DUAL_LANE_REG (dot_lane_z8_f16, svfloat16_t, svmfloat8_t, z8,
+		    z0 = svdot_lane_f16_mf8_fpm (z0, z1, z8, 1, fpm0),
+		    z0 = svdot_lane_fpm (z0, z1, z8, 1, fpm0))
+
+/*
+** dot_lane_z16_f16:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z16\.d
+**	fdot	z0\.h, z1\.b, \1\.b\[7\]
+**	...
+**	ret
+*/
+TEST_DUAL_LANE_REG (dot_lane_z16_f16, svfloat16_t, svmfloat8_t, z16,
+		    z0 = svdot_lane_f16_mf8_fpm (z0, z1, z16, 7, fpm0),
+		    z0 = svdot_lane_fpm (z0, z1, z16, 7, fpm0))
+
+/*
+** dot_lane_0_f32_tied1:
+**	msr	fpmr, x0
+**	fdot	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (dot_lane_0_f32_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svdot_lane_f32_mf8_fpm (z0, z4, z5, 0, fpm0),
+	     z0 = svdot_lane_fpm (z0, z4, z5, 0, fpm0))
+
+/*
+** dot_lane_0_f32_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.s, \1\.b, z1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_lane_0_f32_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svdot_lane_f32_mf8_fpm (z4, z0, z1, 0, fpm0),
+		 z0_res = svdot_lane_fpm (z4, z0, z1, 0, fpm0))
+
+/*
+** dot_lane_0_f32_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.s, z1\.b, \1\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_lane_0_f32_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svdot_lane_f32_mf8_fpm (z4, z1, z0, 0, fpm0),
+		 z0_res = svdot_lane_fpm (z4, z1, z0, 0, fpm0))
+
+/*
+** dot_lane_0_f32_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fdot	z0\.s, z4\.b, z5\.b\[0\]
+**	ret
+*/
+TEST_DUAL_Z (dot_lane_0_f32_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svdot_lane_f32_mf8_fpm (z1, z4, z5, 0, fpm0),
+	     z0 = svdot_lane_fpm (z1, z4, z5, 0, fpm0))
+
+/*
+** dot_lane_1_f32:
+**	msr	fpmr, x0
+**	fdot	z0\.s, z4\.b, z5\.b\[1\]
+**	ret
+*/
+TEST_DUAL_Z (dot_lane_1_f32, svfloat32_t, svmfloat8_t,
+	     z0 = svdot_lane_f32_mf8_fpm (z0, z4, z5, 1, fpm0),
+	     z0 = svdot_lane_fpm (z0, z4, z5, 1, fpm0))
+
+/*
+** dot_lane_z8_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z8\.d
+**	fdot	z0\.s, z1\.b, \1\.b\[1\]
+**	ldr	d8, \[sp\], 32
+**	ret
+*/
+TEST_DUAL_LANE_REG (dot_lane_z8_f32, svfloat32_t, svmfloat8_t, z8,
+		    z0 = svdot_lane_f32_mf8_fpm (z0, z1, z8, 1, fpm0),
+		    z0 = svdot_lane_fpm (z0, z1, z8, 1, fpm0))
+
+/*
+** dot_lane_z32_f32:
+**	...
+**	msr	fpmr, x0
+**	mov	(z[0-7])\.d, z16\.d
+**	fdot	z0\.s, z1\.b, \1\.b\[3\]
+**	...
+**	ret
+*/
+TEST_DUAL_LANE_REG (dot_lane_z32_f32, svfloat32_t, svmfloat8_t, z16,
+		    z0 = svdot_lane_f32_mf8_fpm (z0, z1, z16, 3, fpm0),
+		    z0 = svdot_lane_fpm (z0, z1, z16, 3, fpm0))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_mf8.c b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_mf8.c
new file mode 100644
index 00000000000..12e28e3284f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve2/acle/asm/dot_mf8.c
@@ -0,0 +1,101 @@
+/* { dg-do assemble { target aarch64_asm_fp8dot2_ok } } */
+/* { dg-do compile { target { ! aarch64_asm_fp8dot2_ok } } } */
+/* { dg-final { check-function-bodies "**" "" "-DCHECK_ASM" } } */
+
+#include "test_sve_acle.h"
+
+#pragma GCC target "+fp8dot2"
+#ifdef STREAMING_COMPATIBLE
+#pragma GCC target "+ssve-fp8dot2"
+#endif
+
+/*
+** dot_f16_mf8_tied1:
+**	msr	fpmr, x0
+**	fdot	z0\.h, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (dot_f16_mf8_tied1, svfloat16_t, svmfloat8_t,
+	     z0 = svdot_f16_mf8_fpm (z0, z4, z5, fpm0),
+	     z0 = svdot_fpm (z0, z4, z5, fpm0))
+
+/*
+** dot_f16_mf8_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.h, \1\.b, z1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_f16_mf8_tied2, svfloat16_t, svmfloat8_t,
+		 z0_res = svdot_f16_mf8_fpm (z4, z0, z1, fpm0),
+		 z0_res = svdot_fpm (z4, z0, z1, fpm0))
+
+/*
+** dot_f16_mf8_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.h, z1\.b, \1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_f16_mf8_tied3, svfloat16_t, svmfloat8_t,
+		 z0_res = svdot_f16_mf8_fpm (z4, z1, z0, fpm0),
+		 z0_res = svdot_fpm (z4, z1, z0, fpm0))
+
+/*
+** dot_f16_mf8_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fdot	z0\.h, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (dot_f16_mf8_untied, svfloat16_t, svmfloat8_t,
+	     z0 = svdot_f16_mf8_fpm (z1, z4, z5, fpm0),
+	     z0 = svdot_fpm (z1, z4, z5, fpm0))
+
+/*
+** dot_f32_mf8_tied1:
+**	msr	fpmr, x0
+**	fdot	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (dot_f32_mf8_tied1, svfloat32_t, svmfloat8_t,
+	     z0 = svdot_f32_mf8_fpm (z0, z4, z5, fpm0),
+	     z0 = svdot_fpm (z0, z4, z5, fpm0))
+
+/*
+** dot_f32_mf8_tied2:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.s, \1\.b, z1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_f32_mf8_tied2, svfloat32_t, svmfloat8_t,
+		 z0_res = svdot_f32_mf8_fpm (z4, z0, z1, fpm0),
+		 z0_res = svdot_fpm (z4, z0, z1, fpm0))
+
+/*
+** dot_f32_mf8_tied3:
+**	msr	fpmr, x0
+**	mov	(z[0-9]+)\.d, z0\.d
+**	movprfx	z0, z4
+**	fdot	z0\.s, z1\.b, \1\.b
+**	ret
+*/
+TEST_DUAL_Z_REV (dot_f32_mf8_tied3, svfloat32_t, svmfloat8_t,
+		 z0_res = svdot_f32_mf8_fpm (z4, z1, z0, fpm0),
+		 z0_res = svdot_fpm (z4, z1, z0, fpm0))
+
+/*
+** dot_f32_mf8_untied:
+**	msr	fpmr, x0
+**	movprfx	z0, z1
+**	fdot	z0\.s, z4\.b, z5\.b
+**	ret
+*/
+TEST_DUAL_Z (dot_f32_mf8_untied, svfloat32_t, svmfloat8_t,
+	     z0 = svdot_f32_mf8_fpm (z1, z4, z5, fpm0),
+	     z0 = svdot_fpm (z1, z4, z5, fpm0))
+
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index a122178bd21..95acd0975bb 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -12141,7 +12141,8 @@ foreach { aarch64_ext } { "fp" "simd" "crypto" "crc" "lse" "dotprod" "sve"
 			   "i8mm" "f32mm" "f64mm" "bf16" "sb" "sve2" "ls64"
 			   "sme" "sme-i16i64" "sme2" "sve-b16b16" "sme-b16b16"
 			   "sme-f16f16" "sme2p1" "fp8" "fp8fma"
-			   "ssve-fp8fma" } {
+			   "ssve-fp8fma" "fp8dot2" "ssve-fp8dot2" "fp8dot4"
+			   "ssve-fp8dot4"} {
     eval [string map [list FUNC $aarch64_ext] {
 	proc check_effective_target_aarch64_asm_FUNC_ok { } {
 	    if { [istarget aarch64*-*-*] } {