From patchwork Mon Nov 27 17:02:55 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joe Ramsay
X-Patchwork-Id: 80829
From: Joe Ramsay
To:
CC: Joe Ramsay
Subject: [PATCH] aarch64: Improve special-case handling in AdvSIMD double-precision libmvec routines
Date: Mon, 27 Nov 2023 17:02:55 +0000
Message-ID: <20231127170255.52890-1-Joe.Ramsay@arm.com>
X-Mailer: git-send-email 2.27.0
X-BeenThere: libc-alpha@sourceware.org
List-Id: Libc-alpha mailing list

Avoids emitting many saves/restores of vector registers and reduces
the amount of code generated around the scalar fallback.
---
Thanks,
Joe
 sysdeps/aarch64/fpu/v_math.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/sysdeps/aarch64/fpu/v_math.h b/sysdeps/aarch64/fpu/v_math.h
index cfc87f8dd0..d286eb81b3 100644
--- a/sysdeps/aarch64/fpu/v_math.h
+++ b/sysdeps/aarch64/fpu/v_math.h
@@ -137,7 +137,13 @@ v_lookup_u64 (const uint64_t *tab, uint64x2_t idx)
 static inline float64x2_t
 v_call_f64 (double (*f) (double), float64x2_t x, float64x2_t y, uint64x2_t p)
 {
-  return (float64x2_t){ p[0] ? f (x[0]) : y[0], p[1] ? f (x[1]) : y[1] };
+  double p1 = p[1];
+  double x1 = x[1];
+  if (__glibc_likely (p[0]))
+    y[0] = f (x[0]);
+  if (__glibc_likely (p1))
+    y[1] = f (x1);
+  return y;
 }
 static inline float64x2_t
 v_call2_f64 (double (*f) (double, double), float64x2_t x1, float64x2_t x2,
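
The two forms of v_call_f64 in the diff above are meant to return the same lanes. A minimal portable sketch of that equivalence follows. It assumes plain structs (f64x2, u64x2) as stand-ins for float64x2_t and uint64x2_t, and a made-up scalar routine (twice) in place of a real libm fallback. All of these names are hypothetical, and nothing here measures the code-generation effect the commit message describes.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the AdvSIMD types float64x2_t and uint64x2_t,
   so the sketch builds without <arm_neon.h>.  */
typedef struct { double v[2]; } f64x2;
typedef struct { uint64_t v[2]; } u64x2;

/* Hypothetical scalar fallback; a real caller would pass a libm routine.  */
static double
twice (double a)
{
  return 2.0 * a;
}

/* Shape of the removed line: build a new vector, choosing per lane between
   the scalar result and the value already computed in y.  */
static f64x2
call_select (double (*f) (double), f64x2 x, f64x2 y, u64x2 p)
{
  f64x2 r = { { p.v[0] ? f (x.v[0]) : y.v[0],
		p.v[1] ? f (x.v[1]) : y.v[1] } };
  return r;
}

/* Shape of the added lines: copy lane 1 of the predicate and input into
   scalars before the first call, then overwrite only the flagged lanes
   of y.  The sketch keeps the predicate as an integer and omits the
   __glibc_likely hints used in the patch.  */
static f64x2
call_inplace (double (*f) (double), f64x2 x, f64x2 y, u64x2 p)
{
  uint64_t p1 = p.v[1];
  double x1 = x.v[1];
  if (p.v[0])
    y.v[0] = f (x.v[0]);
  if (p1)
    y.v[1] = f (x1);
  return y;
}
```

With lane 0 unflagged and lane 1 flagged, both helpers leave lane 0 of y untouched and replace lane 1 with the scalar result.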