From patchwork Fri Dec 1 09:49:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Joe Ramsay X-Patchwork-Id: 81097 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 33E2D38618BE for ; Fri, 1 Dec 2023 09:50:30 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from EUR04-HE1-obe.outbound.protection.outlook.com (mail-he1eur04on2047.outbound.protection.outlook.com [40.107.7.47]) by sourceware.org (Postfix) with ESMTPS id 33E323858415 for ; Fri, 1 Dec 2023 09:50:04 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 33E323858415 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 33E323858415 Authentication-Results: server2.sourceware.org; arc=pass smtp.remote-ip=40.107.7.47 ARC-Seal: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1701424212; cv=pass; b=Jg2lj97eFy+5Jgb18U/FNtWlwrUuWOKsB8JJPfTWENnnSpwV8zFgSWV3nvUq/r7tOlDtEAfXYH804vFD3eGQwWsAZ8vbKxjSGZ5BnEkeiHlvR3rAFzruQHXoNabRIamtvdCx/zRlg2EVzs5MbH192N4kPQ+4qZSmAWrtuZ19WA4= ARC-Message-Signature: i=3; a=rsa-sha256; d=sourceware.org; s=key; t=1701424212; c=relaxed/simple; bh=/z6UyNtd8WXTvTneUU0DZd34tCVGBOHskXvMeObn4jk=; h=DKIM-Signature:DKIM-Signature:From:To:Subject:Date:Message-ID: MIME-Version; b=czMn5ZgfNpSXCwHwFDAEIqVoodY6+jj1QhWaR4VJxkLm1Ezr/Q3S5E8lyDQldJETpJHUMJ+pzpHWW7uh8cM7rbkWuHpTHprAW7VGR+Fs40PYS8UXmCXcOcdQ4gU3gEjKHE7dWCO3zDlV7L4F+QxMvWHFyT6KMvR5mQiGbNAjL0A= ARC-Authentication-Results: i=3; server2.sourceware.org ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=UjTWWu4OWXyCElKISd0xlAdynbgKGjLlEB1OVKA7t3GHeG4XOZkPvCWCHQN0Lx8w3Y/ps5Ea7BpnmOE9GKC6OG2lf3k1VfBfrZVYrEUoXnPSTjtWhFFESWQDy2KZFX9ZV8R3XutNPVTdSR+w3MaobF0Nr0MJSIgAcybtU2RsjixZq5pAKZUoFoSEh2ZcpA9gqUK2AHdUA+bofkjXiB1bWqWzd7g4JSXiWJw41rCkPb/6B0N/KrFkMezw6Z7cbkimgSyOpEWTUd3/0eQVCa4LBEi4XOu6XiPeUbBrSGJTVV+1ef8nPc3BVfFU8VtZSIgMjnv6v40CBi8NWK8ypnj+0A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Zxy5z9n6kjngHyzia4UtEi7WfGjwEC8SUIZhySGoukM=; b=NDwnoIQ7mCQRgRcvFFEAt+ndYYoDRnvxttTy87N3xtLGA3AjsM8lLn9prJfzLQ2TrbQdOjOjk3oQ0JEXbZcF2KAPjcm6gjOiARQGJj5Zje+ZbgmFSxerENPhIxJJkQvZ5dNzU0AuiMvAfMNFqEeIEamC6eDiSkoN4if+SgSnktvKa0l+PhRsJ7iPrIsujcwMQo6hvbH81zCr2UaVo17plaxjysSlZ9kCj5XPGcRkfiuYcwrPdRq4mTdK+gtobmxejh2+hZ8J2sWw5EKB8cVqnyOa1+F/kgQLF2mbzrDGMJYaLTBwy89tUheQCQg52UZPxE60lrhfVqUkfA8EXCIrBw== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=sourceware.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1, 1, smtp.mailfrom=arm.com] dmarc=[1, 1, header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Zxy5z9n6kjngHyzia4UtEi7WfGjwEC8SUIZhySGoukM=; b=62KHFPCdnwFJ2QOCBPDnf5y/q4ZmgdS6GpwlXHLgIlyRV7JR7PzUlreHW8dvVyriPPraqIVNwGI80dI/WPoE55omP8qtMxPZsOakmKXehok0ru8+SyP6kjLroDW28mhxtH4+8SdUmLhLDqLOpPNTl9ESuLwYX5Ma+ehwWuuD/Y0= Received: from AM0PR02CA0183.eurprd02.prod.outlook.com (2603:10a6:20b:28e::20) by VI1PR08MB10276.eurprd08.prod.outlook.com (2603:10a6:800:1bd::16) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7046.27; Fri, 1 Dec 2023 09:49:58 +0000 Received: from AM4PEPF00025F95.EURPRD83.prod.outlook.com (2603:10a6:20b:28e:cafe::d3) by AM0PR02CA0183.outlook.office365.com (2603:10a6:20b:28e::20) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7046.23 via Frontend Transport; Fri, 1 Dec 2023 09:49:58 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM4PEPF00025F95.mail.protection.outlook.com (10.167.16.4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7068.1 via Frontend Transport; Fri, 1 Dec 2023 09:49:58 +0000 Received: ("Tessian outbound 385ad2f98d71:v228"); Fri, 01 Dec 2023 09:49:58 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 20cedc12fdffea20 X-CR-MTA-TID: 64aa7808 Received: from a669918273e2.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 9478F180-76AB-45E9-8978-37879B51A63C.1; Fri, 01 Dec 2023 09:49:52 +0000 Received: from EUR03-AM7-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id a669918273e2.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 01 Dec 2023 09:49:52 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=iRtsfRRchUd+1EkQHOyQiFBgN07OqxHMZafjL1HP6+gl6nDIG9PuB53+OjPDPKW9nnLe7uHQE5x+K+KV/EN6LtwdZwVk3fq0cy96sr7pm+6qlSYeRTnWdW8lhsahDhOwp0iwIgu5TCKxLxGMNLDZSCR4Fy+LUHGBNyThc+mvqBzg+bYkefj4Lefl0RPfcYopfqSehi2Kl5+IaF029vGbgvtPg9uJThISAOMCARu41qMFR4fFnfyM6a1tFsBh/oUwmXDZA06E0UEH6P1ujwDaz1FTyf+wlSgO9pw1M6O5DpcYSged0Vdp8vgfFm8FIEjCQNHfgtoBbqOEEojiGYtudQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=Zxy5z9n6kjngHyzia4UtEi7WfGjwEC8SUIZhySGoukM=; b=dgmKtLzWHkATKxxibe737CdzDAl+VvTAUC9OJj7h8BylVPwahvZ7cIhSWsbkGgYHZw7SsiB3Fb6tfJn1gXk9CpJnx1tQ0kPZ/4vUnhYhb/3i1wyl5ruGzHElbD8DE8Whj91HWWacoMRNyNlcNTDmm/XEIOz6LC3JPih+0H5VwlUHQCBRt37WQeP9NDPwxzB1ML7CawqY9G6Jpn4r9rXN/btDH6Ek8ayLxM8uPFW7voy6PT8dO3i+7bd+bddxUjzYh8rV/adEAYdEihRrk6UOSKKbK7roXzHd25rWIH2niGE28BoWoSV5PSFyZkg+vOYZ9j+DgxIUlZiJCBD5qatibA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=sourceware.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=Zxy5z9n6kjngHyzia4UtEi7WfGjwEC8SUIZhySGoukM=; b=62KHFPCdnwFJ2QOCBPDnf5y/q4ZmgdS6GpwlXHLgIlyRV7JR7PzUlreHW8dvVyriPPraqIVNwGI80dI/WPoE55omP8qtMxPZsOakmKXehok0ru8+SyP6kjLroDW28mhxtH4+8SdUmLhLDqLOpPNTl9ESuLwYX5Ma+ehwWuuD/Y0= Received: from AS4P189CA0040.EURP189.PROD.OUTLOOK.COM (2603:10a6:20b:5dd::9) by GV1PR08MB8036.eurprd08.prod.outlook.com (2603:10a6:150:97::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7046.24; Fri, 1 Dec 2023 09:49:48 +0000 Received: from AM4PEPF00025F98.EURPRD83.prod.outlook.com (2603:10a6:20b:5dd:cafe::7a) by AS4P189CA0040.outlook.office365.com (2603:10a6:20b:5dd::9) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7046.27 via Frontend Transport; Fri, 1 Dec 2023 09:49:48 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; pr=C Received: from nebula.arm.com (40.67.248.234) by AM4PEPF00025F98.mail.protection.outlook.com (10.167.16.7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7068.1 via Frontend Transport; Fri, 1 Dec 2023 09:49:48 +0000 Received: from AZ-NEU-EX04.Arm.com (10.251.24.32) by AZ-NEU-EX03.Arm.com (10.251.24.31) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.32; Fri, 1 Dec 2023 09:49:47 +0000 Received: from vcn-man-apps.manchester.arm.com (10.32.108.22) by mail.arm.com (10.251.24.32) with Microsoft SMTP Server id 15.1.2507.32 via Frontend Transport; Fri, 1 Dec 2023 09:49:47 +0000 From: Joe Ramsay To: CC: Joe Ramsay Subject: [PATCH v2] math: Add new exp10 implementation Date: Fri, 1 Dec 2023 09:49:45 +0000 Message-ID: <20231201094945.49035-1-Joe.Ramsay@arm.com> X-Mailer: git-send-email 2.27.0 MIME-Version: 1.0 X-EOPAttributedMessage: 1 X-MS-TrafficTypeDiagnostic: AM4PEPF00025F98:EE_|GV1PR08MB8036:EE_|AM4PEPF00025F95:EE_|VI1PR08MB10276:EE_ X-MS-Office365-Filtering-Correlation-Id: ba8e6c86-ee91-4e57-b9e1-08dbf252e154 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: l+WXlAAH4tioh9ZJs9vDTUoIP3DeYR7A/daFsqrqW2vbaBF2JlsGTuoVuHlBdw9IzfEdst/gjFBgDiI3ggGLG4++joJP3Tme2Fec/sI936vmZ2qh7IAloisJCObW1BLoCh/lVyy5kxiAeqxfAzA6aJS4CBuYJadH4w1FK3yI3gDXTJw7KpReW0X77z+cUjnNmyUTkKakgHfUEZ/EdrqPksl1f+bycySPBrPr3aQE9kSU0R06Bh6NL4kWZHeeR6pQHfVyOvP4otXP2s3mZdR+eeyIvZCWAM/1m25hf99VWU/mKp3PGb0zS7N8jBWBUr1uMRnUNGtk2OwlcjFdxC3ufGMhSiUV4xu6H2GqVdo97s5x8bA0p6qJbkhFhVsFwEo6JqOnAuKdS9l7p7ILm7nUlJ4yBVH699i0OFpT8g1iDggf1Pp/4KRNFYjtR7hXfl22rW8CLQCuXosbLJXABrNU3NwMWz71ByRpbAL5mMPsc/fHozP4ws4ZgJWwUU3zt1QWelAhD5/86jQ/RVEqyvnJLr13sWbhp0oay/A4fhiCeKKqhJ1noW1aArOgc9n14RkWu1e5QRBndaes+1bsLh3JWgfIJGKTQR9hoMdsteevQt/6KX0CkKXJTAjvKVkiJc9b5A82glNxnGwdLjoZCGmNvRncxx14ofHRT20GvMaZDamUD0abzowJflo6sjAz0t4Yl3bIcF7xYztq9QP6pit9Kqmnqn5hgb4nwvEoGEos2MUQSeCFp90eGuylDeeDq3xO X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:nebula.arm.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230031)(4636009)(39860400002)(346002)(376002)(396003)(136003)(230922051799003)(451199024)(82310400011)(186009)(64100799003)(1800799012)(46966006)(36840700001)(40470700004)(40460700003)(1076003)(2616005)(336012)(26005)(83380400001)(7696005)(426003)(8936002)(4326008)(8676002)(86362001)(316002)(40480700001)(478600001)(70586007)(70206006)(6916009)(36860700001)(81166007)(356005)(47076005)(82740400003)(36756003)(41300700001)(2906002)(5660300002)(36900700001); DIR:OUT; SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV1PR08MB8036 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM4PEPF00025F95.EURPRD83.prod.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: a9f4f0c3-1b45-46b9-41d8-08dbf252db69 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: txoJe/Sy0UZhVKhwMt4rzNJ6tuTBD/qh2MXmpqiqS5NE/wCvpnjMWjej5BL9aPEvvNyyVhKh0Bxs3f06BAVdvB/ffHgckyMZ5gslwW6IVrwKHkr1+XzqGGXG15RQdzlcbpt8IBafya9uzkUHbvs6ctv3Y9qkY1+H1g4HZfaFNGgVcW4B4eoPABOv7qW1beXGHePUsd4bbBcFMpSi5HwIEmKWh2LcKJiac//RO1ruNmFaDRZnTUrYFI67DCz5das4Mpol35HnMF191/eAOvejbU4jAK0xuT0vlM4AaFysV7Kki7E2lDIQ5xc0NH38l/j8DyzEs/gF2fdZOUIpt1wgvqQWGiMlpNgRMdCBa9I8OyKVLpKuNKFkmc+OXgx0R9LTPIFQejoJ80uwfDsWhGzGKO02jmrSg1l5BNPLD8d2ZmSaiIqCarAV+YU5BGghJSRTV2ZkEGWGC8RzziDe0gp13i79vNj/0aIN+yxixai1/aq8P5sB9ppw2ISsCO0t4syzbLOElUOfX12mH5zD6smbsi/bJXf3QHkofTnz7BzCNo+MhNrGAxm97M/5UmOLZHc2TaDs8vbIaQcZ/AHrNfDapJvWJxUw9/OmRXpuc7Sni11ucMJH8J0fbyCxDikD75AbJaAXO7hCeq6aFjZ7oCqPXqKZxZg50fixlGsfQntPa3JUxw8r64xbkpUs6kGuhYx323cI5ObrWLZDC20xHacxXnMJdTNln9gmFbsYBFSglZQ= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(13230031)(4636009)(136003)(396003)(39850400004)(346002)(376002)(230922051799003)(451199024)(1800799012)(64100799003)(186009)(82310400011)(40470700004)(36840700001)(46966006)(40480700001)(40460700003)(86362001)(336012)(70586007)(70206006)(6916009)(81166007)(82740400003)(36756003)(26005)(47076005)(36860700001)(83380400001)(1076003)(2616005)(426003)(7696005)(2906002)(316002)(478600001)(8936002)(5660300002)(8676002)(4326008)(41300700001); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 01 Dec 2023 09:49:58.4098 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: ba8e6c86-ee91-4e57-b9e1-08dbf252e154 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM4PEPF00025F95.EURPRD83.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR08MB10276 X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, FORGED_SPF_HELO, GIT_PATCH_0, KAM_DMARC_NONE, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+patchwork=sourceware.org@sourceware.org New implementation is based on the existing exp/exp2, with different reduction constants and polynomial. Worst-case error in round-to- nearest is 0.513 ULP. The exp/exp2 shared table is reused for exp10 - .rodata size of e_exp_data increases by 64 bytes. As for exp/exp2, targets with single-instruction rounding/conversion intrinsics can use them by toggling TOINT_INTRINSICS=1 and adding the necessary code to their math_private.h. Improvements on Neoverse V1 compared to current GLIBC master: exp10 thruput: 3.3x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8] exp10 latency: 1.8x in [-0x1.439b746e36b52p+8 0x1.34413509f79ffp+8] Tested on: aarch64-linux-gnu (TOINT_INTRINSICS, fma contraction) and x86_64-linux-gnu (!TOINT_INTRINSICS, no fma contraction) Reviewed-by: Szabolcs Nagy --- Differences from v1: * Stop preventing inlining of special-case - no good reason for this, in fact performance is slightly better if it is inlined * Remove configurable wide poly * Update max ULP based on several runs of the subdivide program with threshold=1000000 * Define float constant in hex format * No benchtest added, as there is already one for exp10 (I couldn't see a way to have multiple intervals in exp10-inputs?). Anyway the intervals in the previous version's commit message were chosen fairly arbitrarily - added speedup measurement for the interval used in exp10-inputs Thanks, Joe sysdeps/ieee754/dbl-64/e_exp10.c | 144 ++++++++++++++++++++++----- sysdeps/ieee754/dbl-64/e_exp_data.c | 11 ++ sysdeps/ieee754/dbl-64/math_config.h | 4 + 3 files changed, 135 insertions(+), 24 deletions(-) diff --git a/sysdeps/ieee754/dbl-64/e_exp10.c b/sysdeps/ieee754/dbl-64/e_exp10.c index fa47f4f922..08069140c0 100644 --- a/sysdeps/ieee754/dbl-64/e_exp10.c +++ b/sysdeps/ieee754/dbl-64/e_exp10.c @@ -16,36 +16,132 @@ . */ #include +#include +#include #include #include #include +#include "math_config.h" -static const double log10_high = 0x2.4d7637p0; -static const double log10_low = 0x7.6aaa2b05ba95cp-28; +#define N (1 << EXP_TABLE_BITS) +#define IndexMask (N - 1) +#define OFlowBound 0x1.34413509f79ffp8 /* log10(DBL_MAX). */ +#define UFlowBound -0x1.5ep+8 /* -350. */ +#define SmallTop 0x3c6 /* top12(0x1p-57). */ +#define BigTop 0x407 /* top12(0x1p8). */ +#define Thresh 0x41 /* BigTop - SmallTop. */ +#define Shift __exp_data.shift +#define C(i) __exp_data.exp10_poly[i] +static double +special_case (uint64_t sbits, double_t tmp, uint64_t ki) +{ + double_t scale, y; + + if (ki - (1ull << 16) < 0x80000000) + { + /* The exponent of scale might have overflowed by 1. */ + sbits -= 1ull << 52; + scale = asdouble (sbits); + y = 2 * (scale + scale * tmp); + return check_oflow (y); + } + + /* n < 0, need special care in the subnormal range. */ + sbits += 1022ull << 52; + scale = asdouble (sbits); + y = scale + scale * tmp; + + if (y < 1.0) + { + /* Round y to the right precision before scaling it into the subnormal + range to avoid double rounding that can cause 0.5+E/2 ulp error where + E is the worst-case ulp error outside the subnormal range. So this + is only useful if the goal is better than 1 ulp worst-case error. */ + double_t lo = scale - y + scale * tmp; + double_t hi = 1.0 + y; + lo = 1.0 - hi + y + lo; + y = math_narrow_eval (hi + lo) - 1.0; + /* Avoid -0.0 with downward rounding. */ + if (WANT_ROUNDING && y == 0.0) + y = 0.0; + /* The underflow exception needs to be signaled explicitly. */ + math_force_eval (math_opt_barrier (0x1p-1022) * 0x1p-1022); + } + y = 0x1p-1022 * y; + + return check_uflow (y); +} + +/* Double-precision 10^x approximation. Largest observed error is ~0.513 ULP. */ double -__ieee754_exp10 (double arg) +__ieee754_exp10 (double x) { - int32_t lx; - double arg_high, arg_low; - double exp_high, exp_low; - - if (!isfinite (arg)) - return __ieee754_exp (arg); - if (arg < DBL_MIN_10_EXP - DBL_DIG - 10) - return DBL_MIN * DBL_MIN; - else if (arg > DBL_MAX_10_EXP + 1) - return DBL_MAX * DBL_MAX; - else if (fabs (arg) < 0x1p-56) - return 1.0; - - GET_LOW_WORD (lx, arg); - lx &= 0xf8000000; - arg_high = arg; - SET_LOW_WORD (arg_high, lx); - arg_low = arg - arg_high; - exp_high = arg_high * log10_high; - exp_low = arg_high * log10_low + arg_low * M_LN10; - return __ieee754_exp (exp_high) * __ieee754_exp (exp_low); + uint64_t ix = asuint64 (x); + uint32_t abstop = (ix >> 52) & 0x7ff; + + if (__glibc_unlikely (abstop - SmallTop >= Thresh)) + { + if (abstop - SmallTop >= 0x80000000) + /* Avoid spurious underflow for tiny x. + Note: 0 is common input. */ + return x + 1; + if (abstop == 0x7ff) + return ix == asuint64 (-INFINITY) ? 0.0 : x + 1.0; + if (x >= OFlowBound) + return __math_oflow (0); + if (x < UFlowBound) + return __math_uflow (0); + + /* Large x is special-cased below. */ + abstop = 0; + } + + /* Reduce x: z = x * N / log10(2), k = round(z). */ + double_t z = __exp_data.invlog10_2N * x; + double_t kd; + int64_t ki; +#if TOINT_INTRINSICS + kd = roundtoint (z); + ki = converttoint (z); +#else + kd = math_narrow_eval (z + Shift); + kd -= Shift; + ki = kd; +#endif + + /* r = x - k * log10(2), r in [-0.5, 0.5]. */ + double_t r = x; + r = __exp_data.neglog10_2hiN * kd + r; + r = __exp_data.neglog10_2loN * kd + r; + + /* exp10(x) = 2^(k/N) * 2^(r/N). + Approximate the two components separately. */ + + /* s = 2^(k/N), using lookup table. */ + uint64_t e = ki << (52 - EXP_TABLE_BITS); + uint64_t i = (ki & IndexMask) * 2; + uint64_t u = __exp_data.tab[i + 1]; + uint64_t sbits = u + e; + + double_t tail = asdouble (__exp_data.tab[i]); + + /* 2^(r/N) ~= 1 + r * Poly(r). */ + double_t r2 = r * r; + double_t p = C (0) + r * C (1); + double_t y = C (2) + r * C (3); + y = y + r2 * C (4); + y = p + r2 * y; + y = tail + y * r; + + if (__glibc_unlikely (abstop == 0)) + return special_case (sbits, y, ki); + + /* Assemble components: + y = 2^(r/N) * 2^(k/N) + ~= (y + 1) * s. */ + double_t s = asdouble (sbits); + return s * y + s; } + libm_alias_finite (__ieee754_exp10, __exp10) diff --git a/sysdeps/ieee754/dbl-64/e_exp_data.c b/sysdeps/ieee754/dbl-64/e_exp_data.c index d498b8bc3b..5c774afbcb 100644 --- a/sysdeps/ieee754/dbl-64/e_exp_data.c +++ b/sysdeps/ieee754/dbl-64/e_exp_data.c @@ -58,6 +58,17 @@ const struct exp_data __exp_data = { 0x1.5d7e09b4e3a84p-10, #endif }, +.invlog10_2N = 0x1.a934f0979a371p1 * N, +.neglog10_2hiN = -0x1.3441350ap-2 / N, +.neglog10_2loN = 0x1.0c0219dc1da99p-39 / N, +.exp10_poly = { +/* Coeffs generated using Remez in [-log10(2)/256, log10(2)/256 ]. */ +0x1.26bb1bbb55516p1, +0x1.53524c73ce9fep1, +0x1.0470591ce4b26p1, +0x1.2bd76577fe684p0, +0x1.1446eeccd0efbp-1 +}, // 2^(k/N) ~= H[k]*(1 + T[k]) for int k in [0,N) // tab[2*k] = asuint64(T[k]) // tab[2*k+1] = asuint64(H[k]) - (k << 52)/N diff --git a/sysdeps/ieee754/dbl-64/math_config.h b/sysdeps/ieee754/dbl-64/math_config.h index 19af33fd86..d617addfbe 100644 --- a/sysdeps/ieee754/dbl-64/math_config.h +++ b/sysdeps/ieee754/dbl-64/math_config.h @@ -201,6 +201,10 @@ extern const struct exp_data double poly[4]; /* Last four coefficients. */ double exp2_shift; double exp2_poly[EXP2_POLY_ORDER]; + double invlog10_2N; + double neglog10_2hiN; + double neglog10_2loN; + double exp10_poly[5]; uint64_t tab[2*(1 << EXP_TABLE_BITS)]; } __exp_data attribute_hidden;