From patchwork Mon Mar 12 15:26:00 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 26284
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org", "sellcey@cavium.com"
CC: nd
Subject: Re: [PATCH 4/6] Remove slow paths from sin/cos
Date: Mon, 12 Mar 2018 15:26:00 +0000
In-Reply-To: <1520632203.6774.151.camel@cavium.com>

Steve Ellcey wrote:

> Did this patch get mangled in transit or something?  I could apply
> patches 1 through 3 with no problem (and no fuzzing) to ToT but when I
> got to 4 I get errors.  I may have mangled something on this end but I
> treated this patch just like 1 through 3 and those worked fine.  5 and
> 6 have issues too but I think that is because 4 did not apply cleanly.

Sorry, it looks like a few lines were missing due to cut&paste, here is the full version:

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 3b748821f6e5f817dc234ec7f96d951910299e21..5966282db60224528fea2bf55a05dd4120ab12a9 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -67,11 +67,10 @@
 
    The constants s1, s2, s3, etc. are pre-computed values of 1/3!, 1/5! and so
    on.  The result is returned to LHS and correction in COR.  */
-#define TAYLOR_SIN(xx, a, da, cor) \
+#define TAYLOR_SIN(xx, a, da) \
 ({ \
   double t = ((POLYNOMIAL (xx) * (a) - 0.5 * (da)) * (xx) + (da)); \
   double res = (a) + t; \
-  (cor) = ((a) - res) + t; \
   res; \
 })
 
@@ -145,10 +144,10 @@ static double cslow2 (double x);
 /* Given a number partitioned into X and DX, this function computes the cosine
    of the number by combining the sin and cos of X (as computed by a variation
    of the Taylor series) with the values looked up from the sin/cos table to
-   get the result in RES and a correction value in COR.  */
+   get the result.  */
 static inline double
 __always_inline
-do_cos (double x, double dx, double *corp)
+do_cos (double x, double dx)
 {
   mynumber u;
 
@@ -158,16 +157,13 @@ do_cos (double x, double dx, double *corp)
   u.x = big + fabs (x);
   x = fabs (x) - (u.x - big) + dx;
 
-  double xx, s, sn, ssn, c, cs, ccs, res, cor;
+  double xx, s, sn, ssn, c, cs, ccs, cor;
   xx = x * x;
   s = x + x * xx * (sn3 + xx * sn5);
   c = xx * (cs2 + xx * (cs4 + xx * cs6));
   SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
   cor = (ccs - s * ssn - cs * c) - sn * s;
-  res = cs + cor;
-  cor = (cs - res) + cor;
-  *corp = cor;
-  return res;
+  return cs + cor;
 }
 
 /* A more precise variant of DO_COS.  EPS is the adjustment to the correction
@@ -207,10 +203,10 @@ do_cos_slow (double x, double dx, double eps, double *corp)
 /* Given a number partitioned into X and DX, this function computes the sine of
    the number by combining the sin and cos of X (as computed by a variation of
    the Taylor series) with the values looked up from the sin/cos table to get
-   the result in RES and a correction value in COR.  */
+   the result.  */
 static inline double
 __always_inline
-do_sin (double x, double dx, double *corp)
+do_sin (double x, double dx)
 {
   mynumber u;
 
@@ -219,16 +215,13 @@ do_sin (double x, double dx, double *corp)
   u.x = big + fabs (x);
   x = fabs (x) - (u.x - big);
 
-  double xx, s, sn, ssn, c, cs, ccs, cor, res;
+  double xx, s, sn, ssn, c, cs, ccs, cor;
   xx = x * x;
   s = x + (dx + x * xx * (sn3 + xx * sn5));
   c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6));
   SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
   cor = (ssn + s * ccs - sn * c) + cs * s;
-  res = sn + cor;
-  cor = (sn - res) + cor;
-  *corp = cor;
-  return res;
+  return sn + cor;
 }
 
 /* A more precise variant of DO_SIN.  EPS is the adjustment to the correction
@@ -340,19 +333,19 @@ static double
 __always_inline
 do_sincos (double a, double da, int4 n)
 {
-  double retval, cor;
+  double retval;
 
   if (n & 1)
     /* Max ULP is 0.513.  */
-    retval = do_cos (a, da, &cor);
+    retval = do_cos (a, da);
   else
     {
       double xx = a * a;
       /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518.  */
       if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da, cor);
+        retval = TAYLOR_SIN (xx, a, da);
       else
-        retval = __copysign (do_sin (a, da, &cor), a);
+        retval = __copysign (do_sin (a, da), a);
     }
 
   return (n & 2) ? -retval : retval;
@@ -371,7 +364,7 @@ SECTION
 #endif
 __sin (double x)
 {
-  double xx, t, a, da, cor;
+  double xx, t, a, da;
   mynumber u;
   int4 k, m, n;
   double retval = 0;
@@ -401,7 +394,7 @@ __sin (double x)
   else if (k < 0x3feb6000)
     {
       /* Max ULP is 0.548.  */
-      retval = __copysign (do_sin (x, 0, &cor), x);
+      retval = __copysign (do_sin (x, 0), x);
     } /* else if (k < 0x3feb6000) */
 
 /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/
@@ -409,7 +402,7 @@ __sin (double x)
     {
       t = hp0 - fabs (x);
       /* Max ULP is 0.51.  */
-      retval = __copysign (do_cos (t, hp1, &cor), x);
+      retval = __copysign (do_cos (t, hp1), x);
     } /* else if (k < 0x400368fd) */
 
 #ifndef IN_SINCOS
@@ -422,8 +415,10 @@ __sin (double x)
 
 /* --------------------105414350 <|x| <2^1024------------------------------*/
   else if (k < 0x7ff00000)
-    retval = reduce_and_compute (x, false);
-
+    {
+      n = __branred (x, &a, &da);
+      retval = do_sincos (a, da, n);
+    }
 /*--------------------- |x| > 2^1024 ----------------------------------*/
   else
     {
@@ -455,7 +450,7 @@ SECTION
 #endif
 __cos (double x)
 {
-  double y, xx, cor, a, da;
+  double y, xx, a, da;
   mynumber u;
   int4 k, m, n;
 
@@ -476,7 +471,7 @@ __cos (double x)
   else if (k < 0x3feb6000)
     { /* 2^-27 < |x| < 0.855469 */
       /* Max ULP is 0.51.  */
-      retval = do_cos (x, 0, &cor);
+      retval = do_cos (x, 0);
     } /* else if (k < 0x3feb6000) */
 
   else if (k < 0x400368fd)
@@ -488,9 +483,9 @@ __cos (double x)
       /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise.
          Range reduction uses 106 bits here which is sufficient.  */
       if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da, cor);
+        retval = TAYLOR_SIN (xx, a, da);
       else
-        retval = __copysign (do_sin (a, da, &cor), a);
+        retval = __copysign (do_sin (a, da), a);
     } /* else if (k < 0x400368fd) */
 
 
@@ -503,7 +498,10 @@ __cos (double x)
 
   /* 105414350 <|x| <2^1024 */
   else if (k < 0x7ff00000)
-    retval = reduce_and_compute (x, true);
+    {
+      n = __branred (x, &a, &da);
+      retval = do_sincos (a, da, n + 1);
+    }
 
   else
     {
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index 4f032d2e42593ccde22169b374728386dd8fca8e..4335ecbba3c9894e61c087ac970b392fa73abfab 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -28,37 +28,6 @@
 #define IN_SINCOS 1
 #include "s_sin.c"
 
-/* Consolidated version of reduce_and_compute in s_sin.c that does range
-   reduction only once and computes sin and cos together.  */
-static inline void
-__always_inline
-reduce_and_compute_sincos (double x, double *sinx, double *cosx)
-{
-  double a, da;
-  unsigned int n = __branred (x, &a, &da);
-
-  n = n & 3;
-
-  if (n == 1 || n == 2)
-    {
-      a = -a;
-      da = -da;
-    }
-
-  if (n & 1)
-    {
-      double *temp = cosx;
-      cosx = sinx;
-      sinx = temp;
-    }
-
-  if (a * a < 0.01588)
-    *sinx = bsloww (a, da, x, n);
-  else
-    *sinx = bsloww1 (a, da, x, n);
-  *cosx = bsloww2 (a, da, x, n);
-}
-
 void
 __sincos (double x, double *sinx, double *cosx)
 {
@@ -88,8 +57,11 @@ __sincos (double x, double *sinx, double *cosx)
     }
   if (k < 0x7ff00000)
     {
-      reduce_and_compute_sincos (x, sinx, cosx);
-      return;
+      double a, da;
+      int4 n = __branred (x, &a, &da);
+
+      *sinx = do_sincos (a, da, n);
+      *cosx = do_sincos (a, da, n + 1);
     }
 
   if (isinf (x))
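
For readers following the large-argument path in the patch, here is a minimal standalone sketch of the quadrant dispatch that do_sincos performs after range reduction: bit 0 of n selects the cosine kernel, bit 1 flips the sign, and the cosine of x is obtained by passing n + 1. Everything prefixed sketch_ is hypothetical and not part of glibc. The truncated Taylor kernels stand in for do_sin, do_cos and TAYLOR_SIN, and the naive reduction stands in for __branred, so it is only usable for moderate arguments and makes no accuracy claims.

```c
#include <assert.h>

/* Hypothetical kernels standing in for do_sin/TAYLOR_SIN and do_cos:
   truncated Taylor series, adequate only for |r| <= pi/4.  */
static double
sketch_sin_kernel (double r)
{
  double rr = r * r;
  return r * (1.0 - rr / 6.0 * (1.0 - rr / 20.0 * (1.0 - rr / 42.0
         * (1.0 - rr / 72.0 * (1.0 - rr / 110.0 * (1.0 - rr / 156.0))))));
}

static double
sketch_cos_kernel (double r)
{
  double rr = r * r;
  return 1.0 - rr / 2.0 * (1.0 - rr / 12.0 * (1.0 - rr / 30.0
         * (1.0 - rr / 56.0 * (1.0 - rr / 90.0 * (1.0 - rr / 132.0
         * (1.0 - rr / 182.0))))));
}

/* Hypothetical stand-in for __branred: writes x = q * (pi/2) + a with
   |a| <= pi/4 and returns q mod 4.  The tail DA is always zero here.  */
static int
sketch_reduce (double x, double *a, double *da)
{
  const double half_pi = 1.57079632679489661923;
  /* The naive reduction loses accuracy quickly; keep arguments moderate.  */
  assert (x > -1.0e6 && x < 1.0e6);
  double t = x / half_pi;
  long q = (long) (t < 0 ? t - 0.5 : t + 0.5);  /* Round to nearest.  */
  *a = x - (double) q * half_pi;
  *da = 0.0;
  return (int) (((q % 4) + 4) % 4);
}

/* Same dispatch shape as do_sincos in the patch: odd quadrants use the
   cosine kernel, quadrants 2 and 3 negate the result.  */
static double
sketch_do_sincos (double a, double da, int n)
{
  double r = a + da;
  double retval = (n & 1) ? sketch_cos_kernel (r) : sketch_sin_kernel (r);
  return (n & 2) ? -retval : retval;
}

/* One reduction, two dispatches: cos (x) = sin (x + pi/2), hence n + 1.  */
static void
sketch_sincos (double x, double *sinx, double *cosx)
{
  double a, da;
  int n = sketch_reduce (x, &a, &da);
  *sinx = sketch_do_sincos (a, da, n);
  *cosx = sketch_do_sincos (a, da, n + 1);
}
```

Calling sketch_sincos (2.0, &s, &c) reduces once to quadrant 1 and then evaluates the cosine kernel for s and the negated sine kernel for c, which mirrors how the patched __sincos calls do_sincos twice on a single __branred result.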