From patchwork Mon Jan 4 12:21:53 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 41622
To: 'GNU C Library'
Subject: [PATCH 4/5] Remove slow paths from atan2
Date: Mon, 4 Jan 2021 12:21:53 +0000
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Wilco Dijkstra via Libc-alpha
From: Wilco Dijkstra
Reply-To: Wilco Dijkstra
Errors-To: libc-alpha-bounces@sourceware.org
Sender: "Libc-alpha"

Remove slow paths from atan2. Add ULP annotations.
Passes GLIBC testsuite.

diff --git a/sysdeps/ieee754/dbl-64/atnat2.h b/sysdeps/ieee754/dbl-64/atnat2.h
index de4300c5fc559f391e0b20a818452640dabb0e41..9cb8a1fbeae6ff2845151f23d49797c1218c0062 100644
--- a/sysdeps/ieee754/dbl-64/atnat2.h
+++ b/sysdeps/ieee754/dbl-64/atnat2.h
@@ -34,7 +34,7 @@
 #define MM 5
 #ifdef BIG_ENDI

-  static const number
+  static const mynumber
   /* polynomial I */
 /**/ d3 = {{0xbfd55555, 0x55555555} }, /* -0.333... */
 /**/ d5 = {{0x3fc99999, 0x999997fd} }, /* 0.199... */
@@ -96,7 +96,7 @@
 #else
 #ifdef LITTLE_ENDI

-  static const number
+  static const mynumber
   /* polynomial I */
 /**/ d3 = {{0x55555555, 0xbfd55555} }, /* -0.333... */
 /**/ d5 = {{0x999997fd, 0x3fc99999} }, /* 0.199... */
diff --git a/sysdeps/ieee754/dbl-64/e_atan2.c b/sysdeps/ieee754/dbl-64/e_atan2.c
index 7af2a4f495d39f6450699de1d0d9d2f7a33fdcf7..5af12d37c3c89cac951d293339e811fd96d0daf3 100644
--- a/sysdeps/ieee754/dbl-64/e_atan2.c
+++ b/sysdeps/ieee754/dbl-64/e_atan2.c
@@ -20,25 +20,14 @@
 /* MODULE_NAME: atnat2.c */
 /* */
 /* FUNCTIONS: uatan2 */
-/* atan2Mp */
 /* signArctan2 */
-/* normalized */
 /* */
-/* FILES NEEDED: dla.h endian.h mpa.h mydefs.h atnat2.h */
-/* mpatan.c mpatan2.c mpsqrt.c */
+/* FILES NEEDED: dla.h endian.h mydefs.h atnat2.h */
 /* uatan.tbl */
 /* */
-/* An ultimate atan2() routine. Given two IEEE double machine numbers y,*/
-/* x it computes the correctly rounded (to nearest) value of atan2(y,x).*/
-/* */
-/* Assumption: Machine arithmetic operations are performed in */
-/* round to nearest mode of IEEE 754 standard. */
-/* */
 /************************************************************************/

 #include
-#include "mpa.h"
-#include "MathLib.h"
 #include "mydefs.h"
 #include "uatan.tbl"
 #include "atnat2.h"
@@ -48,20 +37,21 @@
 #include
 #include
 #include
-#include
 #include

 #ifndef SECTION
 # define SECTION
 #endif

+#define TWO52 0x1.0p52
+#define TWOM1022 0x1.0p-1022
+
 /************************************************************************/
 /* An ultimate atan2 routine. Given two IEEE double machine numbers y,x */
 /* it computes the correctly rounded (to nearest) value of atan2(y,x). */
 /* Assumption: Machine arithmetic operations are performed in */
 /* round to nearest mode of IEEE 754 standard. */
 /************************************************************************/
-static double atan2Mp (double, double, const int[]);
 /* Fix the sign and return after stage 1 or stage 2 */
 static double
 signArctan2 (double y, double z)
@@ -69,18 +59,14 @@ signArctan2 (double y, double z)
   return copysign (z, y);
 }

-static double normalized (double, double, double, double);
-void __mpatan2 (mp_no *, mp_no *, mp_no *, int);
-
 double
 SECTION
 __ieee754_atan2 (double y, double x)
 {
   int i, de, ux, dx, uy, dy;
-  static const int pr[MM] = { 6, 8, 10, 20, 32 };
-  double ax, ay, u, du, u9, ua, v, vv, dv, t1, t2, t3,
-         z, zz, cor, s1, ss1, s2, ss2;
-  number num;
+  double ax, ay, u, du, v, vv, dv, t1, t2, t3,
+         z, zz, cor;
+  mynumber num;

   static const int ep = 59768832, /* 57*16**5 */
                    em = -59768832; /* -57*16**5 */
@@ -208,10 +194,8 @@ __ieee754_atan2 (double y, double x)
       if (x > 0)
         {
           double ret;
-          if ((z = ay / ax) < TWOM1022)
-            ret = normalized (ax, ay, y, z);
-          else
-            ret = signArctan2 (y, z);
+          z = ay / ax;
+          ret = signArctan2 (y, z);
           if (fabs (ret) < DBL_MIN)
             {
               double vret = ret ? ret : DBL_MIN;
@@ -270,30 +254,12 @@ __ieee754_atan2 (double y, double x)
                                                   + v * (d11.d
                                                          + v * d13.d)))));

-          if ((z = u + (zz - u1.d * u)) == u + (zz + u1.d * u))
-            return signArctan2 (y, z);
-
-          MUL2 (u, du, u, du, v, vv, t1, t2);
-          s1 = v * (f11.d + v * (f13.d
-                                 + v * (f15.d + v * (f17.d + v * f19.d))));
-          ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          MUL2 (u, du, s1, ss1, s2, ss2, t1, t2);
-          ADD2 (u, du, s2, ss2, s1, ss1, t1, t2);
-
-          if ((z = s1 + (ss1 - u5.d * s1)) == s1 + (ss1 + u5.d * s1))
-            return signArctan2 (y, z);
-
-          return atan2Mp (x, y, pr);
+          z = u + zz;
+          /* Max ULP is 0.504. */
+          return signArctan2 (y, z);
         }

-      i = (TWO52 + TWO8 * u) - TWO52;
+      i = (TWO52 + 256 * u) - TWO52;
       i -= 16;
       t3 = u - cij[i][0].d;
       EADD (t3, du, v, dv);
@@ -304,43 +270,9 @@ __ieee754_atan2 (double y, double x)
                        + v * (cij[i][4].d
                               + v * (cij[i][5].d
                                      + v * cij[i][6].d))));
-      if (i < 112)
-        {
-          if (i < 48)
-            u9 = u91.d; /* u < 1/4 */
-          else
-            u9 = u92.d;
-        } /* 1/4 <= u < 1/2 */
-      else
-        {
-          if (i < 176)
-            u9 = u93.d; /* 1/2 <= u < 3/4 */
-          else
-            u9 = u94.d;
-        } /* 3/4 <= u <= 1 */
-      if ((z = t1 + (zz - u9 * t1)) == t1 + (zz + u9 * t1))
-        return signArctan2 (y, z);
-
-      t1 = u - hij[i][0].d;
-      EADD (t1, du, v, vv);
-      s1 = v * (hij[i][11].d
-                + v * (hij[i][12].d
-                       + v * (hij[i][13].d
-                              + v * (hij[i][14].d
-                                     + v * hij[i][15].d))));
-      ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2);
-
-      if ((z = s2 + (ss2 - ub.d * s2)) == s2 + (ss2 + ub.d * s2))
-        return signArctan2 (y, z);
-      return atan2Mp (x, y, pr);
+      z = t1 + zz;
+      /* Max ULP is 0.56. */
+      return signArctan2 (y, z);
     }

   /* (ii) x>0, abs(x)<=abs(y): pi/2-atan(ax/ay) */
@@ -355,31 +287,12 @@ __ieee754_atan2 (double y, double x)
                                                          + v * d13.d)))));
           ESUB (hpi.d, u, t2, cor);
           t3 = ((hpi1.d + cor) - du) - zz;
-          if ((z = t2 + (t3 - u2.d)) == t2 + (t3 + u2.d))
-            return signArctan2 (y, z);
-
-          MUL2 (u, du, u, du, v, vv, t1, t2);
-          s1 = v * (f11.d
-                    + v * (f13.d
-                           + v * (f15.d + v * (f17.d + v * f19.d))));
-          ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          MUL2 (u, du, s1, ss1, s2, ss2, t1, t2);
-          ADD2 (u, du, s2, ss2, s1, ss1, t1, t2);
-          SUB2 (hpi.d, hpi1.d, s1, ss1, s2, ss2, t1, t2);
-
-          if ((z = s2 + (ss2 - u6.d)) == s2 + (ss2 + u6.d))
-            return signArctan2 (y, z);
-          return atan2Mp (x, y, pr);
+          z = t2 + t3;
+          /* Max ULP is 0.501. */
+          return signArctan2 (y, z);
         }

-      i = (TWO52 + TWO8 * u) - TWO52;
+      i = (TWO52 + 256 * u) - TWO52;
       i -= 16;
       v = (u - cij[i][0].d) + du;

@@ -389,36 +302,9 @@ __ieee754_atan2 (double y, double x)
                                      + v * (cij[i][5].d
                                             + v * cij[i][6].d))));
       t1 = hpi.d - cij[i][1].d;
-      if (i < 112)
-        ua = ua1.d; /* w < 1/2 */
-      else
-        ua = ua2.d; /* w >= 1/2 */
-      if ((z = t1 + (zz - ua)) == t1 + (zz + ua))
-        return signArctan2 (y, z);
-
-      t1 = u - hij[i][0].d;
-      EADD (t1, du, v, vv);
-
-      s1 = v * (hij[i][11].d
-                + v * (hij[i][12].d
-                       + v * (hij[i][13].d
-                              + v * (hij[i][14].d
-                                     + v * hij[i][15].d))));
-
-      ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2);
-      SUB2 (hpi.d, hpi1.d, s2, ss2, s1, ss1, t1, t2);
-
-      if ((z = s1 + (ss1 - uc.d)) == s1 + (ss1 + uc.d))
-        return signArctan2 (y, z);
-      return atan2Mp (x, y, pr);
+      z = t1 + zz;
+      /* Max ULP is 0.503. */
+      return signArctan2 (y, z);
     }

   /* (iii) x<0, abs(x)< abs(y): pi/2+atan(ax/ay) */
@@ -434,30 +320,12 @@ __ieee754_atan2 (double y, double x)
                                                   + v * (d11.d + v * d13.d)))));
           EADD (hpi.d, u, t2, cor);
           t3 = ((hpi1.d + cor) + du) + zz;
-          if ((z = t2 + (t3 - u3.d)) == t2 + (t3 + u3.d))
-            return signArctan2 (y, z);
-
-          MUL2 (u, du, u, du, v, vv, t1, t2);
-          s1 = v * (f11.d
-                    + v * (f13.d + v * (f15.d + v * (f17.d + v * f19.d))));
-          ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2);
-          MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-          MUL2 (u, du, s1, ss1, s2, ss2, t1, t2);
-          ADD2 (u, du, s2, ss2, s1, ss1, t1, t2);
-          ADD2 (hpi.d, hpi1.d, s1, ss1, s2, ss2, t1, t2);
-
-          if ((z = s2 + (ss2 - u7.d)) == s2 + (ss2 + u7.d))
-            return signArctan2 (y, z);
-          return atan2Mp (x, y, pr);
+          z = t2 + t3;
+          /* Max ULP is 0.501. */
+          return signArctan2 (y, z);
         }

-      i = (TWO52 + TWO8 * u) - TWO52;
+      i = (TWO52 + 256 * u) - TWO52;
       i -= 16;
       v = (u - cij[i][0].d) + du;
       zz = hpi1.d + v * (cij[i][2].d
@@ -466,34 +334,9 @@ __ieee754_atan2 (double y, double x)
                                      + v * (cij[i][5].d
                                             + v * cij[i][6].d))));
       t1 = hpi.d + cij[i][1].d;
-      if (i < 112)
-        ua = ua1.d; /* w < 1/2 */
-      else
-        ua = ua2.d; /* w >= 1/2 */
-      if ((z = t1 + (zz - ua)) == t1 + (zz + ua))
-        return signArctan2 (y, z);
-
-      t1 = u - hij[i][0].d;
-      EADD (t1, du, v, vv);
-      s1 = v * (hij[i][11].d
-                + v * (hij[i][12].d
-                       + v * (hij[i][13].d
-                              + v * (hij[i][14].d
-                                     + v * hij[i][15].d))));
-      ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2);
-      ADD2 (hpi.d, hpi1.d, s2, ss2, s1, ss1, t1, t2);
-
-      if ((z = s1 + (ss1 - uc.d)) == s1 + (ss1 + uc.d))
-        return signArctan2 (y, z);
-      return atan2Mp (x, y, pr);
+      z = t1 + zz;
+      /* Max ULP is 0.503. */
+      return signArctan2 (y, z);
     }

   /* (iv) x<0, abs(y)<=abs(x): pi-atan(ax/ay) */
@@ -506,29 +349,12 @@ __ieee754_atan2 (double y, double x)
                                        + v * (d9.d + v * (d11.d + v * d13.d)))));
       ESUB (opi.d, u, t2, cor);
       t3 = ((opi1.d + cor) - du) - zz;
-      if ((z = t2 + (t3 - u4.d)) == t2 + (t3 + u4.d))
-        return signArctan2 (y, z);
-
-      MUL2 (u, du, u, du, v, vv, t1, t2);
-      s1 = v * (f11.d + v * (f13.d + v * (f15.d + v * (f17.d + v * f19.d))));
-      ADD2 (f9.d, ff9.d, s1, 0, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (f7.d, ff7.d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (f5.d, ff5.d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      ADD2 (f3.d, ff3.d, s1, ss1, s2, ss2, t1, t2);
-      MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-      MUL2 (u, du, s1, ss1, s2, ss2, t1, t2);
-      ADD2 (u, du, s2, ss2, s1, ss1, t1, t2);
-      SUB2 (opi.d, opi1.d, s1, ss1, s2, ss2, t1, t2);
-
-      if ((z = s2 + (ss2 - u8.d)) == s2 + (ss2 + u8.d))
-        return signArctan2 (y, z);
-      return atan2Mp (x, y, pr);
+      z = t2 + t3;
+      /* Max ULP is 0.501. */
+      return signArctan2 (y, z);
     }

-  i = (TWO52 + TWO8 * u) - TWO52;
+  i = (TWO52 + 256 * u) - TWO52;
   i -= 16;
   v = (u - cij[i][0].d) + du;
   zz = opi1.d - v * (cij[i][2].d
@@ -536,86 +362,11 @@ __ieee754_atan2 (double y, double x)
                                    + v * (cij[i][4].d
                                           + v * (cij[i][5].d + v * cij[i][6].d))));
   t1 = opi.d - cij[i][1].d;
-  if (i < 112)
-    ua = ua1.d; /* w < 1/2 */
-  else
-    ua = ua2.d; /* w >= 1/2 */
-  if ((z = t1 + (zz - ua)) == t1 + (zz + ua))
-    return signArctan2 (y, z);
-
-  t1 = u - hij[i][0].d;
-
-  EADD (t1, du, v, vv);
-
-  s1 = v * (hij[i][11].d
-            + v * (hij[i][12].d
-                   + v * (hij[i][13].d
-                          + v * (hij[i][14].d + v * hij[i][15].d))));
-
-  ADD2 (hij[i][9].d, hij[i][10].d, s1, 0, s2, ss2, t1, t2);
-  MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-  ADD2 (hij[i][7].d, hij[i][8].d, s1, ss1, s2, ss2, t1, t2);
-  MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-  ADD2 (hij[i][5].d, hij[i][6].d, s1, ss1, s2, ss2, t1, t2);
-  MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-  ADD2 (hij[i][3].d, hij[i][4].d, s1, ss1, s2, ss2, t1, t2);
-  MUL2 (v, vv, s2, ss2, s1, ss1, t1, t2);
-  ADD2 (hij[i][1].d, hij[i][2].d, s1, ss1, s2, ss2, t1, t2);
-  SUB2 (opi.d, opi1.d, s2, ss2, s1, ss1, t1, t2);
-
-  if ((z = s1 + (ss1 - uc.d)) == s1 + (ss1 + uc.d))
-    return signArctan2 (y, z);
-  return atan2Mp (x, y, pr);
+  z = t1 + zz;
+  /* Max ULP is 0.502. */
+  return signArctan2 (y, z);
 }

 #ifndef __ieee754_atan2
 libm_alias_finite (__ieee754_atan2, __atan2)
 #endif
-
-/* Treat the Denormalized case */
-static double
-SECTION
-normalized (double ax, double ay, double y, double z)
-{
-  int p;
-  mp_no mpx, mpy, mpz, mperr, mpz2, mpt1;
-  p = 6;
-  __dbl_mp (ax, &mpx, p);
-  __dbl_mp (ay, &mpy, p);
-  __dvd (&mpy, &mpx, &mpz, p);
-  __dbl_mp (ue.d, &mpt1, p);
-  __mul (&mpz, &mpt1, &mperr, p);
-  __sub (&mpz, &mperr, &mpz2, p);
-  __mp_dbl (&mpz2, &z, p);
-  return signArctan2 (y, z);
-}
-
-/* Stage 3: Perform a multi-Precision computation */
-static double
-SECTION
-atan2Mp (double x, double y, const int pr[])
-{
-  double z1, z2;
-  int i, p;
-  mp_no mpx, mpy, mpz, mpz1, mpz2, mperr, mpt1;
-  for (i = 0; i < MM; i++)
-    {
-      p = pr[i];
-      __dbl_mp (x, &mpx, p);
-      __dbl_mp (y, &mpy, p);
-      __mpatan2 (&mpy, &mpx, &mpz, p);
-      __dbl_mp (ud[i].d, &mpt1, p);
-      __mul (&mpz, &mpt1, &mperr, p);
-      __add (&mpz, &mperr, &mpz1, p);
-      __sub (&mpz, &mperr, &mpz2, p);
-      __mp_dbl (&mpz1, &z1, p);
-      __mp_dbl (&mpz2, &z2, p);
-      if (z1 == z2)
-        {
-          LIBC_PROBE (slowatan2, 4, &p, &x, &y, &z1);
-          return z1;
-        }
-    }
-  LIBC_PROBE (slowatan2_inexact, 4, &p, &x, &y, &z1);
-  return z1; /*if impossible to do exact computing */
-}