From patchwork Tue Nov 28 13:05:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joe Ramsay X-Patchwork-Id: 80914 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 68D153870C3C for ; Tue, 28 Nov 2023 13:06:14 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from EUR05-AM6-obe.outbound.protection.outlook.com (mail-am6eur05on2053.outbound.protection.outlook.com [40.107.22.53]) by sourceware.org (Postfix) with ESMTPS id 0856A3858C50 for ; Tue, 28 Nov 2023 13:05:59 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0856A3858C50 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 0856A3858C50 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.22.53 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1701176761; cv=pass; b=m461uWgZn3cX2ntt5UYnJRjUAHnZluEuQsOhuq5fEQM6rFwoL9TMXaLfuGXooyXv51lOc0OLG3+tYrgZX87jYhON2jMH5hn4Ou2umNT+cKBPJCzi12wPdBon45j5QmwVHsVfF3ACP1Sw9nSIS4H/NYfUuRIk0WzeUNhbchS0+LY= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1701176761; c=relaxed/simple; bh=L8G35JYBOG5bOMJfT/Dshw/fvP9BvJqdsIRNiX23pPc=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=nW6+TT6pw2EnO5ASfIO8Liz66ej7aN8gDXhIaDjoBe7ovEZRxI82gJhnjNuKzaUGu8dRXWqubMvUHwNvG9mdAJtFRZGkQNnZ2ro6d/xIytdN31es7wmAYGwyk43kqsUvPATohsM+JbZIAVoVD4Kx4oDIEve+pss5c+NvsPMmHo4= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=cZsSYGdR41lD0sNcbifSQcfU5IIX2pRym0L7vM+S/p/NEwn3Omxd8TnoqYrhjioS4b9ZhVvIhV+HNpqtbWCh62xfXUXkp9H54keN2pF5Jc14r5tDWsmiX+DvwZJT0rqYIqrhl3lgbYeSnIjtbwav50jCMFcpMC/f2FFwd6tt1Fj0m3Qwg4bjsHmN66hZpqyUmJm4w7Dd8t5HuS7sSqqWc3anDBgiCBXwBvmNQf8nhUorPjqRionW90EfruFSWKXOJNmWrxvURB0E1JPAMQqKYnI0lV/wGq14U7y1T+dHdX67gm7A9VIsRCVPzOFDEGfofBa7SlONp6bO79ZR8go+nQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ToXD5uGE12iSGo+vsvYHN7xoHJSrU32HWyq8ybgDWak=; b=EUIl8qASqaW3GeB9hcjpuUyWyllgSkxRJGi6fqeki8feQxow58zKPBlgRuYlerVElAoI6M3uMH54a2BmZsLwCIkpkQvB8fJXgk81sLWKOhxxTSrYIyRkgSpuX2uf/g5lZPYjbZsrtXde91gHACSRkbGUyIVft/7BVjNWLXDUeFChT/bBa309K29nMMvPuSIoiFrHikesQfu0nyATnAiY65Xlq5r8N75rT0fWZY4g1mhOZ/Zns2e3qfm+tDbBfI9HxYZWFVT2iiJTNVz13UzlAsgn5GK5onjvyK8ndICTmGgiO+g81HBWbcR4weBMlUqWvX1gW4GFKqejUxjS5KU4xg== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=sourceware.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1, 1, smtp.mailfrom=arm.com] dmarc=[1, 1, header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ToXD5uGE12iSGo+vsvYHN7xoHJSrU32HWyq8ybgDWak=; b=G01RHjOmMmJGHP9MxYEUqsE+PjTiFed1XsyXHvNNWAKa9D7lZgM28dAUrNmRWdPSfJ2YXpZc1sxYarXPAaHOBwGnRuaJizobhl9tNKDkk1GMhhIgNlxOp8+1ee5i9F5wv60s0sDtKJiMs5CVJKM2RVGrn2zehr+E2NzlaQiGCUo= Received: from AM6P191CA0078.EURP191.PROD.OUTLOOK.COM (2603:10a6:209:8a::19) by VI1PR08MB10074.eurprd08.prod.outlook.com (2603:10a6:800:1be::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7025.27; Tue, 28 Nov 2023 13:05:53 +0000 Received: from AMS1EPF00000045.eurprd04.prod.outlook.com (2603:10a6:209:8a:cafe::52) by AM6P191CA0078.outlook.office365.com (2603:10a6:209:8a::19) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7025.27 via Frontend Transport; Tue, 28 Nov 2023 13:05:53 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AMS1EPF00000045.mail.protection.outlook.com (10.167.16.42) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7046.17 via Frontend Transport; Tue, 28 Nov 2023 13:05:53 +0000 Received: ("Tessian outbound 26ee1d40577c:v228"); Tue, 28 Nov 2023 13:05:52 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 0e8235bf71f1ee08 X-CR-MTA-TID: 64aa7808 Received: from 718cf859b288.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 3362265F-5144-416E-81FF-F4C759981C20.1; Tue, 28 Nov 2023 13:05:45 +0000 Received: from EUR05-DB8-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 718cf859b288.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Tue, 28 Nov 2023 13:05:45 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dYdVjjykzz5p6B24O4HlgIGdPEjcUwvuxltPjGxoB2Jg62bgXRYCuCYlxipOsmXCDTEXPqbCSDF+x0TT/i/KQZxMIWfsHPwpyfH/8VJH/68aBbJ5RC6XrJb3BumgHpDY9e1N2cB1zuZ1zdautnOmxPKFYv7m0/00s9ojjIA7Dv7RaMGzql9e06tEiK56lH9CVQJwZylaDkXm1pkN6T5PmmS7VaBpr9UFxyFLy9Zt1OAcJuC5AFWB+WCSVpRnCy3jes9WM9gK2GTC6vmTRNjup+xVtPgHMo7KhgJ+jJnQi1iRSHNbLo1rN6CpPt1V/Qb0Yny0XSODx0K6ZJBRpXNuOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ToXD5uGE12iSGo+vsvYHN7xoHJSrU32HWyq8ybgDWak=; b=fA+7otRuqJ3t9tbUiDMz1YlDO6Wzw6G23uCGAmFC6udpSkr7iW8umg1SI2PityzOU91TNJqObnZoJzHIpQ5MSrPQd7asIn8Qj8NIF2+Xe11JgPQqPq5T8WCTPkZfPDaBt0AHUSn7A2o+LmYeQWTem6B+/R0HU2ggEuRORnc7ZTE1r9PP+5C3Krezss+cn8dPEBD4flxBPW+rj0+dTE1y4LRGuiQO61eafO7L+U8ltDbCOv4vFH7l2B4SJhc7sRUpxdQCugPuL5rDvhqFefXZ0eg22G2SoS9DICpWYan0uo3/UT3ms9bFLKUM5UZk8XXRv2LDRIoeB1lZLeF2uz75nQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=sourceware.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ToXD5uGE12iSGo+vsvYHN7xoHJSrU32HWyq8ybgDWak=; b=G01RHjOmMmJGHP9MxYEUqsE+PjTiFed1XsyXHvNNWAKa9D7lZgM28dAUrNmRWdPSfJ2YXpZc1sxYarXPAaHOBwGnRuaJizobhl9tNKDkk1GMhhIgNlxOp8+1ee5i9F5wv60s0sDtKJiMs5CVJKM2RVGrn2zehr+E2NzlaQiGCUo= Received: from DUZP191CA0061.EURP191.PROD.OUTLOOK.COM (2603:10a6:10:4fa::16) by AS4PR08MB8119.eurprd08.prod.outlook.com (2603:10a6:20b:58b::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7025.29; Tue, 28 Nov 2023 13:05:43 +0000 Received: from DB5PEPF00014B88.eurprd02.prod.outlook.com (2603:10a6:10:4fa:cafe::5f) by DUZP191CA0061.outlook.office365.com (2603:10a6:10:4fa::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7025.27 via Frontend Transport; Tue, 28 Nov 2023 13:05:43 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by DB5PEPF00014B88.mail.protection.outlook.com (10.167.8.196) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7046.17 via Frontend Transport; Tue, 28 Nov 2023 13:05:43 +0000 Received: from AZ-NEU-EX02.Emea.Arm.com (10.251.26.5) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.32; Tue, 28 Nov 2023 13:05:43 +0000 Received: from AZ-NEU-EX04.Arm.com (10.251.24.32) by AZ-NEU-EX02.Emea.Arm.com (10.251.26.5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.32; Tue, 28 Nov 2023 13:05:42 +0000 Received: from vcn-man-apps.manchester.arm.com (10.32.108.22) by mail.arm.com (10.251.24.32) with Microsoft SMTP Server id 15.1.2507.32 via Frontend Transport; Tue, 28 Nov 2023 13:05:42 +0000 From: Joe Ramsay To: CC: Joe Ramsay Subject: [PATCH] math: Add new exp10 implementation Date: Tue, 28 Nov 2023 13:05:41 +0000 Message-ID: <20231128130541.7913-1-Joe.Ramsay@arm.com> X-Mailer: git-send-email 2.27.0 MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: DB5PEPF00014B88:EE_|AS4PR08MB8119:EE_|AMS1EPF00000045:EE_|VI1PR08MB10074:EE_ X-MS-Office365-Filtering-Correlation-Id: 8b1a0497-8b33-4088-d9ec-08dbf012c06d x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: X0TCaTwbPLhRfZ0RsVg9DEaHKiD4gwo6dT9hums5/uOIAj7v3NlmeU+6etrPhB4adQcwBOgzCEFZEbTAo1WK5t24d97k4ORBFoRL+YK9/rIwmOkznVoHrGF0F2djrRlAwag/38gSybN0b7l/UeAyolM5IkZDo7mjUbl0+s6ibuhmm+1lB8omKvLE+fB8EQUY+N5XPUKcwcANS2VJUxi1y/8ELAYf08k92eSiGOIzhG3Wy+tY4FFm1XyzDSC5jpGDohE/7/3FRvhXiyrH1EOrUAd2sL+jtcoCkvvGfZTiapnNtdi84stgsH5DIwGsUKRRN3NFxa0NfsqkKZY6PNLrmBG0V7wgJiLpXUuPV1wVwmbhpJw5HbOhA2rgcC8Vy5yTCqoMYyVb6MZkbJ1N1Gb/jPB+GDBVqMpnVfIov6uFl1wOVwgpIRo0vtkw+98l8RmNfgqKzeJyF6l7Kw29+MOHmnZzXp9NRKA1VAxZ3TqKl/ltOB7eoXDSg2dozb6UZDeXyC8CJt8bZU2etrN2XmF1LWKJbJjTu7t0LvIADr5UywNMcQtMj/HQDCIz8h/4eYMrqDO8ScPNiHNZvSqmRinx2b399A1SHMF7MpFbI4ZTyYb17+ep70WME3kUiFuo3tR5JCAyyt71dBx4Tg9TByEiITjCqUvJ1P7qQaOpehVo/26fuBX26r+NnVGmT/y1aV34CHPhQzN+lGsK9PyOwRdThkU59WyIjgHbyMU1QjvbfMfZSrrqVrNE98Nln1qWHfPM X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230031)(4636009)(346002)(376002)(396003)(39860400002)(136003)(230922051799003)(451199024)(82310400011)(186009)(64100799003)(1800799012)(36840700001)(40470700004)(46966006)(40460700003)(70206006)(70586007)(316002)(4326008)(8936002)(8676002)(478600001)(6916009)(41300700001)(36756003)(2906002)(86362001)(5660300002)(36860700001)(356005)(47076005)(81166007)(2616005)(26005)(1076003)(7696005)(82740400003)(336012)(426003)(83380400001)(40480700001)(36900700001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: AS4PR08MB8119 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AMS1EPF00000045.eurprd04.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: 688fca3a-0f32-4f2e-e1b6-08dbf012baa6 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: hwbTEtazbIqF8F8AqCF5Xy4n8mTZ6/WoMnsj1ZN29Xo904chJMbk8dfyiGumVUfVGFFIt4DsJbpgwtaeM4v/P8+REIbcNWZU7LgRp+v/NcSC4UHgi0UdHXMxMFFhLTZaiDERpocjxt21syTuSWRWZZg2l/3gfy2biDL1tXi+q1ymMaXMF8bbhS4So04qEb0RNyBP3UkMM1qkfYrYCPGwehYEnEGs5zqhBAehMyf/iqP0EFAkyccWWTyRI8zC0aX2lzKz+tAe7A7cjtN8RcQPnpisuZm2k8NQK8FWqo4AY0z1bwjeRwINiNj1ks96LbUqcU7ejL8TgvbS8TRDwNyL8ghMH46oAaBPbd6TwLrL91NMuaGzgXdO/gL6YCqh9gBj3vwoTJL13cGUieL1RzZerjhV5Q92MwTrgMtjJNxCDm812XmIzoYqs+l453yF4qlHY44OK//POx/ZVVuc14LrBHLcigQ3Xu27BRtM+k/x8qPwHdSQ3CeciZe+2/w26ZEK/WRYwKwpM8kEtPPMU9BbZdt2qM7qs1PD1iuxgsDtdM7ykX8T8EdsxZbz8yPHkKchPoHAQVXdhJOviYU69IfUFseNcfRRcmw8ZGBcPDzGXQ31WEPcLJQLjd4MZ0N+Eg3R0TaI/KGAzBrYJ8bWipPBBDIyHH5Wy8AMU+lTc2YvfFKKAGphKXXjKmO68W8zfrYHaEjaYr+VOhaD74fYEP6n/P81AAmTIGX5S3vZSadLlV8= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230031)(4636009)(39850400004)(376002)(396003)(136003)(346002)(230922051799003)(451199024)(64100799003)(186009)(82310400011)(1800799012)(40470700004)(36840700001)(46966006)(40460700003)(2616005)(1076003)(26005)(426003)(86362001)(7696005)(82740400003)(336012)(8676002)(8936002)(5660300002)(4326008)(478600001)(6916009)(316002)(70586007)(70206006)(36860700001)(83380400001)(47076005)(81166007)(41300700001)(40480700001)(36756003)(2906002); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 28 Nov 2023 13:05:53.0778 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 8b1a0497-8b33-4088-d9ec-08dbf012c06d X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AMS1EPF00000045.eurprd04.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR08MB10074 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org New implementation is based on the existing exp/exp2, with different reduction constants and polynomial. The default selection has worst-case error of 0.501 ULP. For targets without rounding intrinsics a wider polynomial can be enabled to retain 1 ULP of accuracy in non-nearest rounding modes - this would be enabled by toggling EXP10_POLY_WIDE=1. The exp/exp2 shared table is reused for exp10 - .rodata size of e_exp_data increases by 64 bytes. As for exp/exp2, targets with single-instruction rounding/conversion intrinsics can use them by toggling TOINT_INTRINSICS=1 and adding the necessary code to their math_private.h. Improvements on Neoverse V1 compared to current GLIBC master: exp10 thruput: 4.1x in [-9.9, 9.9] exp10 latency: 1.6x in [-9.9, 9.9] exp10 thruput: 4.1x in [0.5, 1] exp10 latency: 1.6x in [0.5, 1] Tested on: aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) --- Thanks, Joe sysdeps/ieee754/dbl-64/e_exp10.c | 145 ++++++++++++++++++++++----- sysdeps/ieee754/dbl-64/e_exp_data.c | 21 ++++ sysdeps/ieee754/dbl-64/math_config.h | 7 ++ 3 files changed, 149 insertions(+), 24 deletions(-) diff --git a/sysdeps/ieee754/dbl-64/e_exp10.c b/sysdeps/ieee754/dbl-64/e_exp10.c index fa47f4f922..3f3a65d06f 100644 --- a/sysdeps/ieee754/dbl-64/e_exp10.c +++ b/sysdeps/ieee754/dbl-64/e_exp10.c @@ -16,36 +16,133 @@ . */ #include +#include +#include #include #include #include +#include "math_config.h" -static const double log10_high = 0x2.4d7637p0; -static const double log10_low = 0x7.6aaa2b05ba95cp-28; +#define N (1 << EXP_TABLE_BITS) +#define IndexMask (N - 1) +#define OFlowBound 0x1.34413509f79ffp8 /* log10(DBL_MAX). */ +#define UFlowBound -350 +#define SmallTop 0x3c6 /* top12(0x1p-57). */ +#define BigTop 0x407 /* top12(0x1p8). */ +#define Thresh 0x41 /* BigTop - SmallTop. */ +#define Shift __exp_data.shift +#define C(i) __exp_data.exp10_poly[i] +static double __attribute__ ((noinline)) +special_case (uint64_t sbits, double_t tmp, uint64_t ki) +{ + double_t scale, y; + + if (ki - (1ull << 16) < 0x80000000) + { + /* The exponent of scale might have overflowed by 1. */ + sbits -= 1ull << 52; + scale = asdouble (sbits); + y = 2 * (scale + scale * tmp); + return check_oflow (y); + } + + /* n < 0, need special care in the subnormal range. */ + sbits += 1022ull << 52; + scale = asdouble (sbits); + y = scale + scale * tmp; + + if (y < 1.0) + { + /* Round y to the right precision before scaling it into the subnormal + range to avoid double rounding that can cause 0.5+E/2 ulp error where + E is the worst-case ulp error outside the subnormal range. So this + is only useful if the goal is better than 1 ulp worst-case error. */ + double_t lo = scale - y + scale * tmp; + double_t hi = 1.0 + y; + lo = 1.0 - hi + y + lo; + y = math_narrow_eval (hi + lo) - 1.0; + /* Avoid -0.0 with downward rounding. */ + if (WANT_ROUNDING && y == 0.0) + y = 0.0; + /* The underflow exception needs to be signaled explicitly. */ + math_force_eval (math_opt_barrier (0x1p-1022) * 0x1p-1022); + } + y = 0x1p-1022 * y; + + return check_uflow (y); +} + +/* Double-precision 10^x approximation. Largest observed error is ~0.501 ULP. + */ double -__ieee754_exp10 (double arg) +__ieee754_exp10 (double x) { - int32_t lx; - double arg_high, arg_low; - double exp_high, exp_low; - - if (!isfinite (arg)) - return __ieee754_exp (arg); - if (arg < DBL_MIN_10_EXP - DBL_DIG - 10) - return DBL_MIN * DBL_MIN; - else if (arg > DBL_MAX_10_EXP + 1) - return DBL_MAX * DBL_MAX; - else if (fabs (arg) < 0x1p-56) - return 1.0; - - GET_LOW_WORD (lx, arg); - lx &= 0xf8000000; - arg_high = arg; - SET_LOW_WORD (arg_high, lx); - arg_low = arg - arg_high; - exp_high = arg_high * log10_high; - exp_low = arg_high * log10_low + arg_low * M_LN10; - return __ieee754_exp (exp_high) * __ieee754_exp (exp_low); + uint64_t ix = asuint64 (x); + uint32_t abstop = (ix >> 52) & 0x7ff; + + if (__glibc_unlikely (abstop - SmallTop >= Thresh)) + { + if (abstop - SmallTop >= 0x80000000) + /* Avoid spurious underflow for tiny x. + Note: 0 is common input. */ + return x + 1; + if (abstop == 0x7ff) + return ix == asuint64 (-INFINITY) ? 0.0 : x + 1.0; + if (x >= OFlowBound) + return __math_oflow (0); + if (x < UFlowBound) + return __math_uflow (0); + + /* Large x is special-cased below. */ + abstop = 0; + } + + /* Reduce x: z = x * N / log10(2), k = round(z). */ + double_t z = __exp_data.invlog10_2N * x; + double_t kd; + int64_t ki; +#if TOINT_INTRINSICS + kd = roundtoint (z); + ki = converttoint (z); +#else + kd = math_narrow_eval (z + Shift); + kd -= Shift; + ki = kd; +#endif + + /* r = x - k * log10(2), r in [-0.5, 0.5]. */ + double_t r = x; + r = __exp_data.neglog10_2hiN * kd + r; + r = __exp_data.neglog10_2loN * kd + r; + + /* exp10(x) = 2^(k/N) * 2^(r/N). + Approximate the two components separately. */ + + /* s = 2^(k/N), using lookup table. */ + uint64_t e = ki << (52 - EXP_TABLE_BITS); + uint64_t i = (ki & IndexMask) * 2; + uint64_t u = __exp_data.tab[i + 1]; + uint64_t sbits = u + e; + + double_t tail = asdouble (__exp_data.tab[i]); + + /* 2^(r/N) ~= 1 + r * Poly(r). */ + double_t r2 = r * r; + double_t p = C (0) + r * C (1); + double_t y = C (2) + r * C (3); + y = y + r2 * C (4); + y = p + r2 * y; + y = tail + y * r; + + if (__glibc_unlikely (abstop == 0)) + return special_case (sbits, y, ki); + + /* Assemble components: + y = 2^(r/N) * 2^(k/N) + ~= (y + 1) * s. */ + double_t s = asdouble (sbits); + return s * y + s; } + libm_alias_finite (__ieee754_exp10, __exp10) diff --git a/sysdeps/ieee754/dbl-64/e_exp_data.c b/sysdeps/ieee754/dbl-64/e_exp_data.c index d498b8bc3b..c81f6dd9b1 100644 --- a/sysdeps/ieee754/dbl-64/e_exp_data.c +++ b/sysdeps/ieee754/dbl-64/e_exp_data.c @@ -58,6 +58,27 @@ const struct exp_data __exp_data = { 0x1.5d7e09b4e3a84p-10, #endif }, +.invlog10_2N = 0x1.a934f0979a371p1 * N, +.neglog10_2hiN = -0x1.3441350ap-2 / N, +.neglog10_2loN = 0x1.0c0219dc1da99p-39 / N, +.exp10_poly = { +#if EXP10_POLY_WIDE +/* Range is wider if using shift-based reduction: coeffs generated + using Remez in [-log10(2)/128, log10(2)/128 ]. */ +0x1.26bb1bbb55515p1, +0x1.53524c73cd32bp1, +0x1.0470591e1a108p1, +0x1.2bd77b12fe9a8p0, +0x1.14289fef24b78p-1 +#else +/* Coeffs generated using Remez in [-log10(2)/256, log10(2)/256 ]. */ +0x1.26bb1bbb55516p1, +0x1.53524c73ce9fep1, +0x1.0470591ce4b26p1, +0x1.2bd76577fe684p0, +0x1.1446eeccd0efbp-1 +#endif +}, // 2^(k/N) ~= H[k]*(1 + T[k]) for int k in [0,N) // tab[2*k] = asuint64(T[k]) // tab[2*k+1] = asuint64(H[k]) - (k << 52)/N diff --git a/sysdeps/ieee754/dbl-64/math_config.h b/sysdeps/ieee754/dbl-64/math_config.h index 19af33fd86..3db595a296 100644 --- a/sysdeps/ieee754/dbl-64/math_config.h +++ b/sysdeps/ieee754/dbl-64/math_config.h @@ -192,6 +192,9 @@ check_uflow (double x) #define EXP_TABLE_BITS 7 #define EXP_POLY_ORDER 5 #define EXP2_POLY_ORDER 5 +/* Wider exp10 polynomial necessary for good precision in non-nearest rounding + and !TOINT_INTRINSICS. */ +#define EXP10_POLY_WIDE 0 extern const struct exp_data { double invln2N; @@ -201,6 +204,10 @@ extern const struct exp_data double poly[4]; /* Last four coefficients. */ double exp2_shift; double exp2_poly[EXP2_POLY_ORDER]; + double invlog10_2N; + double neglog10_2hiN; + double neglog10_2loN; + double exp10_poly[5]; uint64_t tab[2*(1 << EXP_TABLE_BITS)]; } __exp_data attribute_hidden;