From patchwork Thu Jan 7 19:28:15 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wilco Dijkstra X-Patchwork-Id: 41671 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 15909386F41C; Thu, 7 Jan 2021 19:28:31 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 15909386F41C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1610047711; bh=zVGx4p1xk0pkrNTBsCxHAfMnLlAsK0UQD2Xv0iKpCO4=; h=To:Subject:Date:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=VVSnwNrVon6srxmG4pLfLk/ZJnA2IEqWV5X5hPiwPTH3+jeT32cSQT2W2GKFq/L2p F/G28uWN7kyR1Xp8Uy6RxMWvneVR1ScJU9sY2NAfWYyQKyA8sDs33P/GQut9M4dwWj IMKl5d1M0dDDE0hB3QSHOHKvA82qAceSFHY0Vcw8= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from FRA01-MR2-obe.outbound.protection.outlook.com (mail-eopbgr90040.outbound.protection.outlook.com [40.107.9.40]) by sourceware.org (Postfix) with ESMTPS id 60209385043A for ; Thu, 7 Jan 2021 19:28:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 60209385043A Received: from DB6PR0801CA0058.eurprd08.prod.outlook.com (2603:10a6:4:2b::26) by PR2PR08MB4714.eurprd08.prod.outlook.com (2603:10a6:101:1c::18) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3721.23; Thu, 7 Jan 2021 19:28:24 +0000 Received: from DB5EUR03FT023.eop-EUR03.prod.protection.outlook.com (2603:10a6:4:2b:cafe::fc) by DB6PR0801CA0058.outlook.office365.com (2603:10a6:4:2b::26) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3742.6 via Frontend Transport; Thu, 7 Jan 2021 19:28:24 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; sourceware.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;sourceware.org; dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5EUR03FT023.mail.protection.outlook.com (10.152.20.68) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3742.6 via Frontend Transport; Thu, 7 Jan 2021 19:28:24 +0000 Received: ("Tessian outbound 76bd5a04122f:v71"); Thu, 07 Jan 2021 19:28:24 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: ce610e4120dec9dc X-CR-MTA-TID: 64aa7808 Received: from a4c625dcb2d5.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id EBF120F7-E1ED-453D-AECE-93A72C77B87D.1; Thu, 07 Jan 2021 19:28:17 +0000 Received: from EUR04-VI1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id a4c625dcb2d5.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Thu, 07 Jan 2021 19:28:17 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=jvrE5EaFBlMvT9p7hDS5PvuTIqODiKIxMN2qcljTWmqJ8IZRjcwYERETvRfy2JCs3HynAk3wpIG18a3i7sEtaUVyq6nATg8JJCtBjFwVMCihV6skYguR5JY7KNvOxJ08UYzkaaNqdrSwTyxu+MJqJkxghk2TV/T8B4p0lxY/7vcVK6ZoASWLt5jM437Dii5l9uMSJtbP5FxZkYTWYEYUSDhCsIoZUJeHQfwmshh0jtQuVNHWMZcmmWhvgeAw6Avep3DFBVOJqswGxX27swE/nCq+UT2W5rUIWaqY3toBCo0MeOxMVnVyGz7giIuAQ6rFNI/a+ZGj1h539L0ubh2yhA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=zVGx4p1xk0pkrNTBsCxHAfMnLlAsK0UQD2Xv0iKpCO4=; b=oRYDL9KIBbLe4+McBZwnEy5SKKF7Vm6ZnzUUfQIr1CSdsMcrZCEABq/did4XlP+qZ7jFV69v9o7oO5Gh7wllcQ/+IRRrv1mtNw129/51uMpg+YdqjJQomAD2271FOxWH62aa/B+RPDYD9O5WEWhRWa7dAICT9jFBa3IRsptvnEdfA2bhEo6v3nTlNv6v9Ix248wy7S/w9O0UZ9UbDVPbbpSXmmAX7e+E2s+TshU/VFdBfVLOGGvZIeYzq6IYnfzjJ3TaZe7KRKQJMRtDLyjviMfvCApEdd1MeeRlCitS7IGLZ70gtukGgcItX+G/exTizjWjryrmuZTjmlS3WavWTQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Received: from VE1PR08MB5599.eurprd08.prod.outlook.com (2603:10a6:800:1a1::12) by VI1PR0801MB1760.eurprd08.prod.outlook.com (2603:10a6:800:51::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3721.21; Thu, 7 Jan 2021 19:28:15 +0000 Received: from VE1PR08MB5599.eurprd08.prod.outlook.com ([fe80::6d00:2694:e0d7:986f]) by VE1PR08MB5599.eurprd08.prod.outlook.com ([fe80::6d00:2694:e0d7:986f%5]) with mapi id 15.20.3742.006; Thu, 7 Jan 2021 19:28:15 +0000 To: 'GNU C Library' Subject: [PATCH v2 4/5] Remove slow paths from atan2 Thread-Topic: [PATCH v2 4/5] Remove slow paths from atan2 Thread-Index: AQHW5Ss+TmnABq1ts0ic/G3EGCP+/g== Date: Thu, 7 Jan 2021 19:28:15 +0000 Message-ID: References: , In-Reply-To: Accept-Language: en-GB, en-US Content-Language: en-GB X-MS-Has-Attach: X-MS-TNEF-Correlator: Authentication-Results-Original: sourceware.org; dkim=none (message not signed) header.d=none;sourceware.org; dmarc=none action=none header.from=arm.com; x-originating-ip: [82.24.249.100] x-ms-publictraffictype: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: a8e89958-f1f4-4fb7-cf0d-08d8b342668a x-ms-traffictypediagnostic: VI1PR0801MB1760:|PR2PR08MB4714: X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true nodisclaimer: true x-ms-oob-tlc-oobclassifiers: OLM:393;OLM:393; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: sgf5I9E3kXqNcoAsvb62L8gOc1UiCPT8PAFSlFT4HYDSUfprrqUmkogZm3MwRMCcnb/ktZqCXNGFgnMDhxmy72h6zmmjtRaiMm7UxiR1cJzCJjB4/jSHUw/0GlEaaiVfYEG4QqZCxfG6MYBTlkf8jC5gM0YD/vzent2oJJIUqvc+gRK+mjpbH6Ei43jz5vY+rIrdr7u2EL70QBcm8rP4qZ4XUmbelkkWroIWqhqwyTuU9S7chzHm8XzK1gGZDFHfxoDpOwa/qgppS+9uB4CJk2WDz8h5+cHc8kkVejOscCWK+atULeuaMzlv8+W86jtwRzwuscUuP/QzDG+guqbtwKyojC83Fmld4hGMbwCQuDmuj9f+wjoIoidxSwfRC+WDFP2eX/D5UxF26OmX5X4QOw== X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:VE1PR08MB5599.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(366004)(498600001)(83380400001)(6916009)(7696005)(26005)(33656002)(55016002)(71200400001)(8936002)(30864003)(5660300002)(6506007)(9686003)(76116006)(86362001)(8676002)(186003)(66446008)(2906002)(66946007)(66556008)(64756008)(66476007)(52536014); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?iso-8859-1?q?e2QOVNEvM4nc40sM11iKtmYZW?= =?iso-8859-1?q?C58MObjQMG8M2jc22gK8LJ6yjdKVKjcHh9Fc7Dv83Bbh6T+sH1ow1ehwMxJj?= =?iso-8859-1?q?/qKZhz+b9Cf96aYLfocQW7w0qH/d6Peg2ZcCd5EFP3/Luneqf5Li3SLLxYZY?= =?iso-8859-1?q?3IvtnzTNc7WDahkJXqYsSQs83t1BZGBFL6u3S66A+XQlSE7TpRTEUZzunKvf?= =?iso-8859-1?q?nyfX5SSCrQr9YnRQsZBx1LBXacYy07ZlFlDsf36ui3RK6lEo2Y0q85EEmrN4?= =?iso-8859-1?q?IemFhGKSb9LBhSiUhjJh+5Fkul1WFsfP896AmnJ6Qkh3kB6YlzAWgZxBZoBY?= =?iso-8859-1?q?fCREG1+qEkztp7x6oi3CybsPhHUDIfLRYQTP5jPRx7x1mNH8QNKBGuc2+8Ex?= =?iso-8859-1?q?LGTdYfwicnHhya0q2zm4yCfgMGNGDPby5zH9rSxDauJJkODQYnkIrweLNP12?= =?iso-8859-1?q?tGAcNwHBGG5EpcH4TNZkSz0g6/No8r/x/4rMm78gcPPMYMSfKEtxmsNrbklI?= =?iso-8859-1?q?M7YmxJeKfFYrXhAKmuf/9w/QSs/NHvSuQcA4p9OO4wlHOGgvk/1+FoAhYFRk?= =?iso-8859-1?q?7T0QPGPra4DLqAfH8uIr3NNYFoiYPZxsV/k5jBy98H5RqLjub/CbWSCcP7ET?= =?iso-8859-1?q?fiyUQ/2Eprcxo2vW6ROXcB1rRl9xMv22rNqq6KTFgBVPCdTg1l2eJayQx3sW?= =?iso-8859-1?q?FlCtBvmjLhCJiiX6Kx3+FnhXxRoTDRuRPWdqBW1ARVIKcNtu1b1uHNoswEej?= =?iso-8859-1?q?eDIkIphlszNHQXnvkEEjHOvl2Fzkat6XqqIaVRAGKYDvMBcdBdict6z0t9sQ?= =?iso-8859-1?q?pVUCwTR28vYO9Yoov9sCvgUwOM9n1FqV10xIVXfkfWGoS7fBSxWVrPnMjfJR?= =?iso-8859-1?q?2RYpKRlpUyJnCSaJ9+o/ievW4o4q8eDzmIN51aly5OFN+uTxIJaqhteEFzex?= =?iso-8859-1?q?WxJeYXV1Jwz/cCpfNWOMOziIlKdi0hauLmgPW/3V0IIUiDhzGuiflF/j4CdW?= =?iso-8859-1?q?qSPdQV/e8orVk/ahsc=3D?= x-ms-exchange-transport-forked: True MIME-Version: 1.0 X-MS-Exchange-Transport-CrossTenantHeadersStamped: VI1PR0801MB1760 Original-Authentication-Results: sourceware.org; dkim=none (message not signed) header.d=none; sourceware.org; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: e9b1333d-e83e-4e17-9d6d-08d8b3426174 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: vr1iRqrIIUMylgtunvEXrqfH66XUMi/JQG3j54hOTaiK3Uy5X7+1jx/hVve5WKzcLteufvFqPPFg4EqthNmMO4Ga1OjOlNOWmwJBeaYJQ77n1kLA0/wN1uYfI+gWFvAELdPWQ0iGv4yWDzX7l3mobr9LgNrXq7z7A+XDnvEmk/NR6EjC/4ZOkOisPijq7mlKUOu76gxz7MLtmd8xzOs00ll2c9iwVqXA6YzoIrd/vOmQjIMI05gLmqZLzF6OOJaiqorW61MLmE5n6Is9SlPde7MO0mNfxbbTlPgWrl54E8XUOPGjPgg7Ge4iRjPxlpwjUa3K0/oNU2Bg5JRAMRzRcictFay1ur9a06pCBcZkzAbqdeVvvODWz7AaYerQ+tHudRcRB6YSlUdEdJ8G3BoNfG7+e6gVL0DQdDp66HIKJSSw6dupJw5YFMsackEkfedsulJtQpReGOx4yvfDTS0Y95/cBdkOls/bNBhUoJUKxwKT7jrTkU0NqxMXif4b0SO+ X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(396003)(376002)(136003)(346002)(39850400004)(46966006)(70206006)(478600001)(316002)(82740400003)(8676002)(81166007)(82310400003)(5660300002)(33656002)(8936002)(34020700004)(30864003)(70586007)(9686003)(7696005)(2906002)(26005)(83380400001)(47076005)(55016002)(356005)(52536014)(6916009)(186003)(86362001)(336012)(6506007); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 07 Jan 2021 19:28:24.2441 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: a8e89958-f1f4-4fb7-cf0d-08d8b342668a X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DB5EUR03FT023.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PR2PR08MB4714 X-Spam-Status: No, score=-12.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_NUMSUBJECT, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Wilco Dijkstra via Libc-alpha From: Wilco Dijkstra Reply-To: Wilco Dijkstra Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" Remove slow paths from atan2. Add ULP annotations. Passes GLIBC testsuite. diff --git a/sysdeps/ieee754/dbl-64/atnat2.h b/sysdeps/ieee754/dbl-64/atnat2.h index f1c0d23f9ecd3084a79a9dc18f6550153eba06db..f9f8debd55d85e77163a1bd3ebbf339ce0ac3bc7 100644 --- a/sysdeps/ieee754/dbl-64/atnat2.h +++ b/sysdeps/ieee754/dbl-64/atnat2.h @@ -34,7 +34,7 @@ #define MM 5 #ifdef BIG_ENDI - static const number + static const mynumber /* polynomial I */ /**/ d3 = {{0xbfd55555, 0x55555555} }, /* -0.333... */ /**/ d5 = {{0x3fc99999, 0x999997fd} }, /* 0.199... */ @@ -96,7 +96,7 @@ #else #ifdef LITTLE_ENDI - static const number + static const mynumber /* polynomial I */ /**/ d3 = {{0x55555555, 0xbfd55555} }, /* -0.333... */ /**/ d5 = {{0x999997fd, 0x3fc99999} }, /* 0.199... */ diff --git a/sysdeps/ieee754/dbl-64/e_atan2.c b/sysdeps/ieee754/dbl-64/e_atan2.c index e6b98142fbbc7c6bd4a51df0257b361fa46181ec..b77ad2dc6b9561e6396a76db11f6db686f170bb1 100644 --- a/sysdeps/ieee754/dbl-64/e_atan2.c +++ b/sysdeps/ieee754/dbl-64/e_atan2.c @@ -20,25 +20,14 @@ /* MODULE_NAME: atnat2.c */ /* */ /* FUNCTIONS: uatan2 */ -/* atan2Mp */ /* signArctan2 */ -/* normalized */ /* */ -/* FILES NEEDED: dla.h endian.h mpa.h mydefs.h atnat2.h */ -/* mpatan.c mpatan2.c mpsqrt.c */ +/* FILES NEEDED: dla.h endian.h mydefs.h atnat2.h */ /* uatan.tbl */ /* */ -/* An ultimate atan2() routine. Given two IEEE double machine numbers y,*/ -/* x it computes the correctly rounded (to nearest) value of atan2(y,x).*/ -/* */ -/* Assumption: Machine arithmetic operations are performed in */ -/* round to nearest mode of IEEE 754 standard. */ -/* */ /************************************************************************/ #include -#include "mpa.h" -#include "MathLib.h" #include "mydefs.h" #include "uatan.tbl" #include "atnat2.h" @@ -48,20 +37,15 @@ #include #include #include -#include #include #ifndef SECTION # define SECTION #endif -/************************************************************************/ -/* An ultimate atan2 routine. Given two IEEE double machine numbers y,x */ -/* it computes the correctly rounded (to nearest) value of atan2(y,x). */ -/* Assumption: Machine arithmetic operations are performed in */ -/* round to nearest mode of IEEE 754 standard. */ -/************************************************************************/ -static double atan2Mp (double, double, const int[]); +#define TWO52 0x1.0p52 +#define TWOM1022 0x1.0p-1022 + /* Fix the sign and return after stage 1 or stage 2 */ static double signArctan2 (double y, double z) @@ -69,18 +53,15 @@ signArctan2 (double y, double z) return copysign (z, y); } -static double normalized (double, double, double, double); -void __mpatan2 (mp_no *, mp_no *, mp_no *, int); - +/* atan2 with max ULP of ~0.524 based on random sampling. */ double SECTION __ieee754_atan2 (double y, double x) { int i, de, ux, dx, uy, dy; - static const int pr[MM] = { 6, 8, 10, 20, 32 }; - double ax, ay, u, du, u9, ua, v, vv, dv, t1, t2, t3, - z, zz, cor, s1, ss1, s2, ss2; - number num; + double ax, ay, u, du, v, vv, dv, t1, t2, t3, + z, zz, cor; + mynumber num; static const int ep = 59768832, /* 57*16**5 */ em = -59768832; /* -57*16**5 */ @@ -208,10 +189,8 @@ __ieee754_atan2 (double y, double x) if (x > 0) { double ret; - if ((z = ay / ax) < TWOM1022) - ret = normalized (ax, ay, y, z); - else - ret = signArctan2 (y, z); + z = ay / ax; + ret = signArctan2 (y, z); if (fabs (ret) < DBL_MIN) { double vret = ret ? ret : DBL_MIN; @@ -270,30 +249,12 @@ __ieee754_atan2 (double y, double x) + v * (d11.d + v * d13.d))))); - if ((z = u + (zz - u1.d * u)) == u + (zz + u1.d * u)) - return signArctan2 (y, z); - - MUL2 (u, du, u, du, v, vv, t1, t2); - s1 = v * (f11.d + v * (f13.d - + v * (f15.d + v * (f17.d + v * f19.d)))); - ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - MUL2 (u, du, s1, ss1, s2, ss2, t1, t2); - ADD2 (u, du, s2, ss2, s1, ss1, t1, t2); - - if ((z = s1 + (ss1 - u5.d * s1)) == s1 + (ss1 + u5.d * s1)) - return signArctan2 (y, z); - - return atan2Mp (x, y, pr); + z = u + zz; + /* Max ULP is 0.504. */ + return signArctan2 (y, z); } - i = (TWO52 + TWO8 * u) - TWO52; + i = (TWO52 + 256 * u) - TWO52; i -= 16; t3 = u - cij[i][0].d; EADD (t3, du, v, dv); @@ -304,43 +265,9 @@ __ieee754_atan2 (double y, double x) + v * (cij[i][4].d + v * (cij[i][5].d + v * cij[i][6].d)))); - if (i < 112) - { - if (i < 48) - u9 = u91.d; /* u < 1/4 */ - else - u9 = u92.d; - } /* 1/4 <= u < 1/2 */ - else - { - if (i < 176) - u9 = u93.d; /* 1/2 <= u < 3/4 */ - else - u9 = u94.d; - } /* 3/4 <= u <= 1 */ - if ((z = t1 + (zz - u9 * t1)) == t1 + (zz + u9 * t1)) - return signArctan2 (y, z); - - t1 = u - hij[i][0].d; - EADD (t1, du, v, vv); - s1 = v * (hij[i][11].d - + v * (hij[i][12].d - + v * (hij[i][13].d - + v * (hij[i][14].d - + v * hij[i][15].d)))); - ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2); - - if ((z = s2 + (ss2 - ub.d * s2)) == s2 + (ss2 + ub.d * s2)) - return signArctan2 (y, z); - return atan2Mp (x, y, pr); + z = t1 + zz; + /* Max ULP is 0.56. */ + return signArctan2 (y, z); } /* (ii) x>0, abs(x)<=abs(y): pi/2-atan(ax/ay) */ @@ -355,31 +282,12 @@ __ieee754_atan2 (double y, double x) + v * d13.d))))); ESUB (hpi.d, u, t2, cor); t3 = ((hpi1.d + cor) - du) - zz; - if ((z = t2 + (t3 - u2.d)) == t2 + (t3 + u2.d)) - return signArctan2 (y, z); - - MUL2 (u, du, u, du, v, vv, t1, t2); - s1 = v * (f11.d - + v * (f13.d - + v * (f15.d + v * (f17.d + v * f19.d)))); - ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - MUL2 (u, du, s1, ss1, s2, ss2, t1, t2); - ADD2 (u, du, s2, ss2, s1, ss1, t1, t2); - SUB2 (hpi.d, hpi1.d, s1, ss1, s2, ss2, t1, t2); - - if ((z = s2 + (ss2 - u6.d)) == s2 + (ss2 + u6.d)) - return signArctan2 (y, z); - return atan2Mp (x, y, pr); + z = t2 + t3; + /* Max ULP is 0.501. */ + return signArctan2 (y, z); } - i = (TWO52 + TWO8 * u) - TWO52; + i = (TWO52 + 256 * u) - TWO52; i -= 16; v = (u - cij[i][0].d) + du; @@ -389,36 +297,9 @@ __ieee754_atan2 (double y, double x) + v * (cij[i][5].d + v * cij[i][6].d)))); t1 = hpi.d - cij[i][1].d; - if (i < 112) - ua = ua1.d; /* w < 1/2 */ - else - ua = ua2.d; /* w >= 1/2 */ - if ((z = t1 + (zz - ua)) == t1 + (zz + ua)) - return signArctan2 (y, z); - - t1 = u - hij[i][0].d; - EADD (t1, du, v, vv); - - s1 = v * (hij[i][11].d - + v * (hij[i][12].d - + v * (hij[i][13].d - + v * (hij[i][14].d - + v * hij[i][15].d)))); - - ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2); - SUB2 (hpi.d, hpi1.d, s2, ss2, s1, ss1, t1, t2); - - if ((z = s1 + (ss1 - uc.d)) == s1 + (ss1 + uc.d)) - return signArctan2 (y, z); - return atan2Mp (x, y, pr); + z = t1 + zz; + /* Max ULP is 0.503. */ + return signArctan2 (y, z); } /* (iii) x<0, abs(x)< abs(y): pi/2+atan(ax/ay) */ @@ -434,30 +315,12 @@ __ieee754_atan2 (double y, double x) + v * (d11.d + v * d13.d))))); EADD (hpi.d, u, t2, cor); t3 = ((hpi1.d + cor) + du) + zz; - if ((z = t2 + (t3 - u3.d)) == t2 + (t3 + u3.d)) - return signArctan2 (y, z); - - MUL2 (u, du, u, du, v, vv, t1, t2); - s1 = v * (f11.d - + v * (f13.d + v * (f15.d + v * (f17.d + v * f19.d)))); - ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - MUL2 (u, du, s1, ss1, s2, ss2, t1, t2); - ADD2 (u, du, s2, ss2, s1, ss1, t1, t2); - ADD2 (hpi.d, hpi1.d, s1, ss1, s2, ss2, t1, t2); - - if ((z = s2 + (ss2 - u7.d)) == s2 + (ss2 + u7.d)) - return signArctan2 (y, z); - return atan2Mp (x, y, pr); + z = t2 + t3; + /* Max ULP is 0.501. */ + return signArctan2 (y, z); } - i = (TWO52 + TWO8 * u) - TWO52; + i = (TWO52 + 256 * u) - TWO52; i -= 16; v = (u - cij[i][0].d) + du; zz = hpi1.d + v * (cij[i][2].d @@ -466,34 +329,9 @@ __ieee754_atan2 (double y, double x) + v * (cij[i][5].d + v * cij[i][6].d)))); t1 = hpi.d + cij[i][1].d; - if (i < 112) - ua = ua1.d; /* w < 1/2 */ - else - ua = ua2.d; /* w >= 1/2 */ - if ((z = t1 + (zz - ua)) == t1 + (zz + ua)) - return signArctan2 (y, z); - - t1 = u - hij[i][0].d; - EADD (t1, du, v, vv); - s1 = v * (hij[i][11].d - + v * (hij[i][12].d - + v * (hij[i][13].d - + v * (hij[i][14].d - + v * hij[i][15].d)))); - ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2); - ADD2 (hpi.d, hpi1.d, s2, ss2, s1, ss1, t1, t2); - - if ((z = s1 + (ss1 - uc.d)) == s1 + (ss1 + uc.d)) - return signArctan2 (y, z); - return atan2Mp (x, y, pr); + z = t1 + zz; + /* Max ULP is 0.503. */ + return signArctan2 (y, z); } /* (iv) x<0, abs(y)<=abs(x): pi-atan(ax/ay) */ @@ -506,29 +344,12 @@ __ieee754_atan2 (double y, double x) + v * (d9.d + v * (d11.d + v * d13.d))))); ESUB (opi.d, u, t2, cor); t3 = ((opi1.d + cor) - du) - zz; - if ((z = t2 + (t3 - u4.d)) == t2 + (t3 + u4.d)) - return signArctan2 (y, z); - - MUL2 (u, du, u, du, v, vv, t1, t2); - s1 = v * (f11.d + v * (f13.d + v * (f15.d + v * (f17.d + v * f19.d)))); - ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - MUL2 (u, du, s1, ss1, s2, ss2, t1, t2); - ADD2 (u, du, s2, ss2, s1, ss1, t1, t2); - SUB2 (opi.d, opi1.d, s1, ss1, s2, ss2, t1, t2); - - if ((z = s2 + (ss2 - u8.d)) == s2 + (ss2 + u8.d)) - return signArctan2 (y, z); - return atan2Mp (x, y, pr); + z = t2 + t3; + /* Max ULP is 0.501. */ + return signArctan2 (y, z); } - i = (TWO52 + TWO8 * u) - TWO52; + i = (TWO52 + 256 * u) - TWO52; i -= 16; v = (u - cij[i][0].d) + du; zz = opi1.d - v * (cij[i][2].d @@ -536,86 +357,11 @@ __ieee754_atan2 (double y, double x) + v * (cij[i][4].d + v * (cij[i][5].d + v * cij[i][6].d)))); t1 = opi.d - cij[i][1].d; - if (i < 112) - ua = ua1.d; /* w < 1/2 */ - else - ua = ua2.d; /* w >= 1/2 */ - if ((z = t1 + (zz - ua)) == t1 + (zz + ua)) - return signArctan2 (y, z); - - t1 = u - hij[i][0].d; - - EADD (t1, du, v, vv); - - s1 = v * (hij[i][11].d - + v * (hij[i][12].d - + v * (hij[i][13].d - + v * (hij[i][14].d + v * hij[i][15].d)))); - - ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2); - MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2); - ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2); - SUB2 (opi.d, opi1.d, s2, ss2, s1, ss1, t1, t2); - - if ((z = s1 + (ss1 - uc.d)) == s1 + (ss1 + uc.d)) - return signArctan2 (y, z); - return atan2Mp (x, y, pr); + z = t1 + zz; + /* Max ULP is 0.502. */ + return signArctan2 (y, z); } #ifndef __ieee754_atan2 libm_alias_finite (__ieee754_atan2, __atan2) #endif - -/* Treat the Denormalized case */ -static double -SECTION -normalized (double ax, double ay, double y, double z) -{ - int p; - mp_no mpx, mpy, mpz, mperr, mpz2, mpt1; - p = 6; - __dbl_mp (ax, &mpx, p); - __dbl_mp (ay, &mpy, p); - __dvd (&mpy, &mpx, &mpz, p); - __dbl_mp (ue.d, &mpt1, p); - __mul (&mpz, &mpt1, &mperr, p); - __sub (&mpz, &mperr, &mpz2, p); - __mp_dbl (&mpz2, &z, p); - return signArctan2 (y, z); -} - -/* Stage 3: Perform a multi-Precision computation */ -static double -SECTION -atan2Mp (double x, double y, const int pr[]) -{ - double z1, z2; - int i, p; - mp_no mpx, mpy, mpz, mpz1, mpz2, mperr, mpt1; - for (i = 0; i < MM; i++) - { - p = pr[i]; - __dbl_mp (x, &mpx, p); - __dbl_mp (y, &mpy, p); - __mpatan2 (&mpy, &mpx, &mpz, p); - __dbl_mp (ud[i].d, &mpt1, p); - __mul (&mpz, &mpt1, &mperr, p); - __add (&mpz, &mperr, &mpz1, p); - __sub (&mpz, &mperr, &mpz2, p); - __mp_dbl (&mpz1, &z1, p); - __mp_dbl (&mpz2, &z2, p); - if (z1 == z2) - { - LIBC_PROBE (slowatan2, 4, &p, &x, &y, &z1); - return z1; - } - } - LIBC_PROBE (slowatan2_inexact, 4, &p, &x, &y, &z1); - return z1; /*if impossible to do exact computing */ -}