From patchwork Mon Dec 18 15:51:15 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joe Ramsay
X-Patchwork-Id: 82395
X-Original-To: libc-alpha@sourceware.org
From: Joe Ramsay
To:
CC: Joe Ramsay
Subject: [PATCH v2 1/2] aarch64: Add half-width versions of AdvSIMD f32 libmvec routines
Date: Mon, 18 Dec 2023 15:51:15 +0000
Message-ID: <20231218155116.6444-1-Joe.Ramsay@arm.com>
X-Mailer: git-send-email 2.27.0
List-Id: Libc-alpha mailing list

Compilers may emit calls to 'half-width' routines (two-lane
single-precision variants). These have been added in the form of
wrappers around the full-width versions: the two-lane input is
duplicated into both halves of a full-width vector, and the low half
of the result is returned. This will perform poorly when one lane
triggers the special-case handler, as there will be a redundant call
to the scalar version. However, this is expected to be rare at -Ofast.
---
Split out v1 into this and 2/2.
Thanks,
Joe

 include/libc-symbols.h                        |  2 ++
 sysdeps/aarch64/fpu/Versions                  | 15 ++++++++
 sysdeps/aarch64/fpu/acosf_advsimd.c           |  2 ++
 sysdeps/aarch64/fpu/advsimd_f32_protos.h      | 34 +++++++++++++++++++
 sysdeps/aarch64/fpu/asinf_advsimd.c           |  2 ++
 sysdeps/aarch64/fpu/atan2f_advsimd.c          |  2 ++
 sysdeps/aarch64/fpu/atanf_advsimd.c           |  2 ++
 sysdeps/aarch64/fpu/cosf_advsimd.c            |  2 ++
 sysdeps/aarch64/fpu/exp10f_advsimd.c          |  2 ++
 sysdeps/aarch64/fpu/exp2f_advsimd.c           |  2 ++
 sysdeps/aarch64/fpu/expf_advsimd.c            |  2 ++
 sysdeps/aarch64/fpu/expm1f_advsimd.c          |  2 ++
 sysdeps/aarch64/fpu/log10f_advsimd.c          |  2 ++
 sysdeps/aarch64/fpu/log1pf_advsimd.c          |  2 ++
 sysdeps/aarch64/fpu/log2f_advsimd.c           |  2 ++
 sysdeps/aarch64/fpu/logf_advsimd.c            |  2 ++
 sysdeps/aarch64/fpu/sinf_advsimd.c            |  2 ++
 sysdeps/aarch64/fpu/tanf_advsimd.c            |  2 ++
 sysdeps/aarch64/fpu/v_math.h                  | 15 ++++++++
 .../unix/sysv/linux/aarch64/libmvec.abilist   | 15 ++++++++
 20 files changed, 111 insertions(+)
 create mode 100644 sysdeps/aarch64/fpu/advsimd_f32_protos.h

diff --git a/include/libc-symbols.h b/include/libc-symbols.h
index 5794614488..a226119295 100644
--- a/include/libc-symbols.h
+++ b/include/libc-symbols.h
@@ -600,8 +600,10 @@ for linking")
 #endif
 
 #if IS_IN (libmvec)
+# define libmvec_hidden_proto(name, attrs...) hidden_proto (name, ##attrs)
 # define libmvec_hidden_def(name) hidden_def (name)
 #else
+# define libmvec_hidden_proto(name, attrs...)
 # define libmvec_hidden_def(name)
 #endif
 
diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
index aaacacaebe..accd101184 100644
--- a/sysdeps/aarch64/fpu/Versions
+++ b/sysdeps/aarch64/fpu/Versions
@@ -18,47 +18,62 @@ libmvec {
     _ZGVsMxv_sinf;
   }
   GLIBC_2.39 {
+    _ZGVnN2v_cosf;
+    _ZGVnN2v_expf;
+    _ZGVnN2v_logf;
+    _ZGVnN2v_sinf;
     _ZGVnN4v_acosf;
+    _ZGVnN2v_acosf;
     _ZGVnN2v_acos;
     _ZGVsMxv_acosf;
     _ZGVsMxv_acos;
     _ZGVnN4v_asinf;
+    _ZGVnN2v_asinf;
     _ZGVnN2v_asin;
     _ZGVsMxv_asinf;
     _ZGVsMxv_asin;
     _ZGVnN4v_atanf;
+    _ZGVnN2v_atanf;
     _ZGVnN2v_atan;
     _ZGVsMxv_atanf;
     _ZGVsMxv_atan;
     _ZGVnN4vv_atan2f;
+    _ZGVnN2vv_atan2f;
     _ZGVnN2vv_atan2;
     _ZGVsMxvv_atan2f;
     _ZGVsMxvv_atan2;
     _ZGVnN4v_exp10f;
+    _ZGVnN2v_exp10f;
     _ZGVnN2v_exp10;
     _ZGVsMxv_exp10f;
     _ZGVsMxv_exp10;
     _ZGVnN4v_exp2f;
+    _ZGVnN2v_exp2f;
     _ZGVnN2v_exp2;
     _ZGVsMxv_exp2f;
     _ZGVsMxv_exp2;
     _ZGVnN4v_expm1f;
+    _ZGVnN2v_expm1f;
     _ZGVnN2v_expm1;
     _ZGVsMxv_expm1f;
     _ZGVsMxv_expm1;
     _ZGVnN4v_log10f;
+    _ZGVnN2v_log10f;
     _ZGVnN2v_log10;
     _ZGVsMxv_log10f;
     _ZGVsMxv_log10;
     _ZGVnN4v_log1pf;
+    _ZGVnN2v_log1pf;
     _ZGVnN2v_log1p;
     _ZGVsMxv_log1pf;
     _ZGVsMxv_log1p;
     _ZGVnN4v_log2f;
+    _ZGVnN2v_log2f;
     _ZGVnN2v_log2;
     _ZGVsMxv_log2f;
     _ZGVsMxv_log2;
     _ZGVnN4v_tanf;
+    _ZGVnN2v_tanf;
     _ZGVnN2v_tan;
     _ZGVsMxv_tanf;
     _ZGVsMxv_tan;
diff --git a/sysdeps/aarch64/fpu/acosf_advsimd.c b/sysdeps/aarch64/fpu/acosf_advsimd.c
index 7d39e9b805..e28c200a96 100644
--- a/sysdeps/aarch64/fpu/acosf_advsimd.c
+++ b/sysdeps/aarch64/fpu/acosf_advsimd.c
@@ -111,3 +111,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (acos) (float32x4_t x)
 
   return vfmaq_f32 (add, mul, y);
 }
+libmvec_hidden_def (V_NAME_F1(acos))
+HALF_WIDTH_ALIAS_F1 (acos)
diff --git a/sysdeps/aarch64/fpu/advsimd_f32_protos.h b/sysdeps/aarch64/fpu/advsimd_f32_protos.h
new file mode 100644
index 0000000000..b406ad7156
--- /dev/null
+++ b/sysdeps/aarch64/fpu/advsimd_f32_protos.h
@@ -0,0 +1,34 @@
+/* Hidden prototypes for single-precision AdvSIMD routines
+
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+libmvec_hidden_proto (V_NAME_F1(acos));
+libmvec_hidden_proto (V_NAME_F1(asin));
+libmvec_hidden_proto (V_NAME_F1(atan));
+libmvec_hidden_proto (V_NAME_F1(cos));
+libmvec_hidden_proto (V_NAME_F1(exp10));
+libmvec_hidden_proto (V_NAME_F1(exp2));
+libmvec_hidden_proto (V_NAME_F1(exp));
+libmvec_hidden_proto (V_NAME_F1(expm1));
+libmvec_hidden_proto (V_NAME_F1(log10));
+libmvec_hidden_proto (V_NAME_F1(log1p));
+libmvec_hidden_proto (V_NAME_F1(log2));
+libmvec_hidden_proto (V_NAME_F1(log));
+libmvec_hidden_proto (V_NAME_F1(sin));
+libmvec_hidden_proto (V_NAME_F1(tan));
+libmvec_hidden_proto (V_NAME_F2(atan2));
diff --git a/sysdeps/aarch64/fpu/asinf_advsimd.c b/sysdeps/aarch64/fpu/asinf_advsimd.c
index 3180ae7c8e..842037fd56 100644
--- a/sysdeps/aarch64/fpu/asinf_advsimd.c
+++ b/sysdeps/aarch64/fpu/asinf_advsimd.c
@@ -102,3 +102,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (asin) (float32x4_t x)
   /* Copy sign.  */
   return vbslq_f32 (v_u32 (AbsMask), y, x);
 }
+libmvec_hidden_def (V_NAME_F1 (asin))
+HALF_WIDTH_ALIAS_F1 (asin)
diff --git a/sysdeps/aarch64/fpu/atan2f_advsimd.c b/sysdeps/aarch64/fpu/atan2f_advsimd.c
index 5a5a6202d1..fa161b1458 100644
--- a/sysdeps/aarch64/fpu/atan2f_advsimd.c
+++ b/sysdeps/aarch64/fpu/atan2f_advsimd.c
@@ -114,3 +114,5 @@ float32x4_t VPCS_ATTR V_NAME_F2 (atan2) (float32x4_t y, float32x4_t x)
 
   return ret;
 }
+libmvec_hidden_def (V_NAME_F2 (atan2))
+HALF_WIDTH_ALIAS_F2(atan2)
diff --git a/sysdeps/aarch64/fpu/atanf_advsimd.c b/sysdeps/aarch64/fpu/atanf_advsimd.c
index 589b0e8c96..3c08c51965 100644
--- a/sysdeps/aarch64/fpu/atanf_advsimd.c
+++ b/sysdeps/aarch64/fpu/atanf_advsimd.c
@@ -107,3 +107,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (atan) (float32x4_t x)
 
   return y;
 }
+libmvec_hidden_def (V_NAME_F1 (atan))
+HALF_WIDTH_ALIAS_F1 (atan)
diff --git a/sysdeps/aarch64/fpu/cosf_advsimd.c b/sysdeps/aarch64/fpu/cosf_advsimd.c
index f05dd2bcda..9f82e7c5d6 100644
--- a/sysdeps/aarch64/fpu/cosf_advsimd.c
+++ b/sysdeps/aarch64/fpu/cosf_advsimd.c
@@ -92,3 +92,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (cos) (float32x4_t x)
     return special_case (x, y, odd, cmp);
   return vreinterpretq_f32_u32 (veorq_u32 (vreinterpretq_u32_f32 (y), odd));
 }
+libmvec_hidden_def (V_NAME_F1 (cos))
+HALF_WIDTH_ALIAS_F1 (cos)
diff --git a/sysdeps/aarch64/fpu/exp10f_advsimd.c b/sysdeps/aarch64/fpu/exp10f_advsimd.c
index 9e754c46fa..10c9dedad7 100644
--- a/sysdeps/aarch64/fpu/exp10f_advsimd.c
+++ b/sysdeps/aarch64/fpu/exp10f_advsimd.c
@@ -138,3 +138,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (exp10) (float32x4_t x)
 
   return vfmaq_f32 (scale, poly, scale);
 }
+libmvec_hidden_def (V_NAME_F1 (exp10))
+HALF_WIDTH_ALIAS_F1 (exp10)
diff --git a/sysdeps/aarch64/fpu/exp2f_advsimd.c b/sysdeps/aarch64/fpu/exp2f_advsimd.c
index 70b3ab66c1..cf84a5b9cd 100644
--- a/sysdeps/aarch64/fpu/exp2f_advsimd.c
+++ b/sysdeps/aarch64/fpu/exp2f_advsimd.c
@@ -122,3 +122,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (exp2) (float32x4_t x)
 
   return vfmaq_f32 (scale, poly, scale);
 }
+libmvec_hidden_def (V_NAME_F1 (exp2))
+HALF_WIDTH_ALIAS_F1 (exp2)
diff --git a/sysdeps/aarch64/fpu/expf_advsimd.c b/sysdeps/aarch64/fpu/expf_advsimd.c
index 69d5d1ea77..d70d1add95 100644
--- a/sysdeps/aarch64/fpu/expf_advsimd.c
+++ b/sysdeps/aarch64/fpu/expf_advsimd.c
@@ -131,3 +131,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (exp) (float32x4_t x)
 
   return vfmaq_f32 (scale, poly, scale);
 }
+libmvec_hidden_def (V_NAME_F1 (exp))
+HALF_WIDTH_ALIAS_F1 (exp)
diff --git a/sysdeps/aarch64/fpu/expm1f_advsimd.c b/sysdeps/aarch64/fpu/expm1f_advsimd.c
index b27b75068a..e3b6e1b01b 100644
--- a/sysdeps/aarch64/fpu/expm1f_advsimd.c
+++ b/sysdeps/aarch64/fpu/expm1f_advsimd.c
@@ -115,3 +115,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (expm1) (float32x4_t x)
   /* expm1(x) ~= p * t + (t - 1).  */
   return vfmaq_f32 (vsubq_f32 (t, v_f32 (1.0f)), p, t);
 }
+libmvec_hidden_def (V_NAME_F1 (expm1))
+HALF_WIDTH_ALIAS_F1 (expm1)
diff --git a/sysdeps/aarch64/fpu/log10f_advsimd.c b/sysdeps/aarch64/fpu/log10f_advsimd.c
index ba02060bbe..4c4e7cb2f4 100644
--- a/sysdeps/aarch64/fpu/log10f_advsimd.c
+++ b/sysdeps/aarch64/fpu/log10f_advsimd.c
@@ -80,3 +80,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (log10) (float32x4_t x)
     return special_case (x, y, poly, r2, special);
   return vfmaq_f32 (y, poly, r2);
 }
+libmvec_hidden_def (V_NAME_F1 (log10))
+HALF_WIDTH_ALIAS_F1 (log10)
diff --git a/sysdeps/aarch64/fpu/log1pf_advsimd.c b/sysdeps/aarch64/fpu/log1pf_advsimd.c
index 3748830de8..0530dc2002 100644
--- a/sysdeps/aarch64/fpu/log1pf_advsimd.c
+++ b/sysdeps/aarch64/fpu/log1pf_advsimd.c
@@ -126,3 +126,5 @@ VPCS_ATTR float32x4_t V_NAME_F1 (log1p) (float32x4_t x)
     return special_case (special_arg, y, special_cases);
   return y;
 }
+libmvec_hidden_def (V_NAME_F1 (log1p))
+HALF_WIDTH_ALIAS_F1 (log1p)
diff --git a/sysdeps/aarch64/fpu/log2f_advsimd.c b/sysdeps/aarch64/fpu/log2f_advsimd.c
index e913bcda18..4049db242e 100644
--- a/sysdeps/aarch64/fpu/log2f_advsimd.c
+++ b/sysdeps/aarch64/fpu/log2f_advsimd.c
@@ -75,3 +75,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (log2) (float32x4_t x)
     return special_case (x, n, p, r, special);
   return vfmaq_f32 (n, p, r);
 }
+libmvec_hidden_def (V_NAME_F1 (log2))
+HALF_WIDTH_ALIAS_F1 (log2)
diff --git a/sysdeps/aarch64/fpu/logf_advsimd.c b/sysdeps/aarch64/fpu/logf_advsimd.c
index 93903c7962..930e4b9c8f 100644
--- a/sysdeps/aarch64/fpu/logf_advsimd.c
+++ b/sysdeps/aarch64/fpu/logf_advsimd.c
@@ -83,3 +83,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (log) (float32x4_t x)
     return special_case (x, y, r2, p, cmp);
   return vfmaq_f32 (p, y, r2);
 }
+libmvec_hidden_def (V_NAME_F1 (log))
+HALF_WIDTH_ALIAS_F1 (log)
diff --git a/sysdeps/aarch64/fpu/sinf_advsimd.c b/sysdeps/aarch64/fpu/sinf_advsimd.c
index 0e78cf55f0..3638ab5508 100644
--- a/sysdeps/aarch64/fpu/sinf_advsimd.c
+++ b/sysdeps/aarch64/fpu/sinf_advsimd.c
@@ -92,3 +92,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (sin) (float32x4_t x)
     return special_case (x, y, odd, cmp);
   return vreinterpretq_f32_u32 (veorq_u32 (vreinterpretq_u32_f32 (y), odd));
 }
+libmvec_hidden_def (V_NAME_F1 (sin))
+HALF_WIDTH_ALIAS_F1 (sin)
diff --git a/sysdeps/aarch64/fpu/tanf_advsimd.c b/sysdeps/aarch64/fpu/tanf_advsimd.c
index 4c8a7f740e..dbc7b4dd6e 100644
--- a/sysdeps/aarch64/fpu/tanf_advsimd.c
+++ b/sysdeps/aarch64/fpu/tanf_advsimd.c
@@ -127,3 +127,5 @@ float32x4_t VPCS_ATTR V_NAME_F1 (tan) (float32x4_t x)
     return special_case (special_arg, vbslq_f32 (pred_alt, inv_y, y), special);
   return vbslq_f32 (pred_alt, inv_y, y);
 }
+libmvec_hidden_def (V_NAME_F1 (tan))
+HALF_WIDTH_ALIAS_F1 (tan)
diff --git a/sysdeps/aarch64/fpu/v_math.h b/sysdeps/aarch64/fpu/v_math.h
index d286eb81b3..e8ac0e2332 100644
--- a/sysdeps/aarch64/fpu/v_math.h
+++ b/sysdeps/aarch64/fpu/v_math.h
@@ -29,6 +29,21 @@
 #define V_NAME_F2(fun) _ZGVnN4vv_##fun##f
 #define V_NAME_D2(fun) _ZGVnN2vv_##fun
 
+#include "advsimd_f32_protos.h"
+
+#define HALF_WIDTH_ALIAS_F1(fun) \
+  float32x2_t VPCS_ATTR _ZGVnN2v_##fun##f (float32x2_t x) \
+  { \
+    return vget_low_f32 (_ZGVnN4v_##fun##f (vcombine_f32 (x, x))); \
+  }
+
+#define HALF_WIDTH_ALIAS_F2(fun) \
+  float32x2_t VPCS_ATTR _ZGVnN2vv_##fun##f (float32x2_t x, float32x2_t y) \
+  { \
+    return vget_low_f32 ( \
+      _ZGVnN4vv_##fun##f (vcombine_f32 (x, x), vcombine_f32 (y, y))); \
+  }
+
 /* Shorthand helpers for declaring constants.  */
 #define V2(X) { X, X }
 #define V4(X) { X, X, X, X }
diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
index 2bf4ea6332..580952b4de 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
@@ -15,16 +15,31 @@ GLIBC_2.38 _ZGVsMxv_logf F
 GLIBC_2.38 _ZGVsMxv_sin F
 GLIBC_2.38 _ZGVsMxv_sinf F
 GLIBC_2.39 _ZGVnN2v_acos F
+GLIBC_2.39 _ZGVnN2v_acosf F
 GLIBC_2.39 _ZGVnN2v_asin F
+GLIBC_2.39 _ZGVnN2v_asinf F
 GLIBC_2.39 _ZGVnN2v_atan F
+GLIBC_2.39 _ZGVnN2v_atanf F
+GLIBC_2.39 _ZGVnN2v_cosf F
 GLIBC_2.39 _ZGVnN2v_exp10 F
+GLIBC_2.39 _ZGVnN2v_exp10f F
 GLIBC_2.39 _ZGVnN2v_exp2 F
+GLIBC_2.39 _ZGVnN2v_exp2f F
+GLIBC_2.39 _ZGVnN2v_expf F
 GLIBC_2.39 _ZGVnN2v_expm1 F
+GLIBC_2.39 _ZGVnN2v_expm1f F
 GLIBC_2.39 _ZGVnN2v_log10 F
+GLIBC_2.39 _ZGVnN2v_log10f F
 GLIBC_2.39 _ZGVnN2v_log1p F
+GLIBC_2.39 _ZGVnN2v_log1pf F
 GLIBC_2.39 _ZGVnN2v_log2 F
+GLIBC_2.39 _ZGVnN2v_log2f F
+GLIBC_2.39 _ZGVnN2v_logf F
+GLIBC_2.39 _ZGVnN2v_sinf F
 GLIBC_2.39 _ZGVnN2v_tan F
+GLIBC_2.39 _ZGVnN2v_tanf F
 GLIBC_2.39 _ZGVnN2vv_atan2 F
+GLIBC_2.39 _ZGVnN2vv_atan2f F
 GLIBC_2.39 _ZGVnN4v_acosf F
 GLIBC_2.39 _ZGVnN4v_asinf F
 GLIBC_2.39 _ZGVnN4v_atanf F
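
Illustrative note, not part of the patch: the wrapper scheme in HALF_WIDTH_ALIAS_F1 and the cost mentioned in the commit message can be modelled in portable C. The sketch below is only an approximation under stated assumptions. The types vec2f and vec4f and the functions toy_scalar, toy_full_width and toy_half_width are hypothetical stand-ins for float32x2_t, float32x4_t, a scalar routine, a full-width libmvec routine and its half-width wrapper. Treating negative lanes as "special" is an arbitrary choice made for the demonstration, and the per-lane scalar fallback is a simplified model of a special-case handler.

```c
#include <stddef.h>

/* Hypothetical stand-ins for float32x2_t and float32x4_t, so that the
   sketch builds without <arm_neon.h>.  */
typedef struct { float lane[2]; } vec2f;
typedef struct { float lane[4]; } vec4f;

/* Counts scalar fallback calls so the redundancy is observable.  */
static int scalar_calls;

/* Toy scalar routine standing in for a scalar libm function.  */
static float
toy_scalar (float x)
{
  scalar_calls++;
  return 2.0f * x;
}

/* Toy full-width routine standing in for a four-lane routine.  Lanes
   holding a negative value are treated as "special" (an assumption of
   this sketch) and fall back to the scalar routine one lane at a time.  */
static vec4f
toy_full_width (vec4f x)
{
  vec4f r;
  for (size_t i = 0; i < 4; i++)
    r.lane[i] = x.lane[i] < 0.0f ? toy_scalar (x.lane[i]) : 2.0f * x.lane[i];
  return r;
}

/* Half-width wrapper modelled on HALF_WIDTH_ALIAS_F1.  */
static vec2f
toy_half_width (vec2f x)
{
  /* Models vcombine_f32 (x, x): both halves hold the same two lanes.  */
  vec4f wide = { { x.lane[0], x.lane[1], x.lane[0], x.lane[1] } };
  vec4f res = toy_full_width (wide);
  /* Models vget_low_f32: keep only the low two lanes of the result.  */
  vec2f low = { { res.lane[0], res.lane[1] } };
  return low;
}
```

In this model, calling toy_half_width on { 1.5f, -3.0f } leaves scalar_calls at 2 even though only one input lane is special, because the special lane also appears in the duplicated upper half. This is the redundant scalar call the commit message refers to.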