From patchwork Wed Aug 22 14:27:14 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
X-Patchwork-Id: 29012
From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
To: Joseph Myers <joseph@codesourcery.com>
CC: "libc-alpha@sourceware.org" <libc-alpha@sourceware.org>, nd <nd@arm.com>
Subject: Re: [PATCH] Speedup tanf range reduction
Date: Wed, 22 Aug 2018 14:27:14 +0000
Message-ID: 
 <HE1PR08MB103555974BE551C979EFD28883300@HE1PR08MB1035.eurprd08.prod.outlook.com>
References: 
 <HE1PR08MB10357845B62EA7A4DFCD927783300@HE1PR08MB1035.eurprd08.prod.outlook.com>,
	<alpine.DEB.2.21.1808221231420.19290@digraph.polyomino.org.uk>
In-Reply-To: <alpine.DEB.2.21.1808221231420.19290@digraph.polyomino.org.uk>

Joseph Myers wrote:

> +
> +static inline int32_t
> +rem_pio2f (float x, float *y)

> Please put a comment on this function documenting its semantics.

Done, see below.


Speed up tanf range reduction by using the new sincosf range
reduction algorithm.  Inlining the reduction also improves the
generated code, so tanf is faster even when no range reduction is
required.

Passes the glibc testsuite on AArch64.  Some files are no longer
required; they are removed in the next patch.

tanf throughput gains on Cortex-A72:
* |x| < M_PI_4  : 1.1x
* |x| < M_PI_2  : 1.2x
* |x| < 2 * M_PI: 1.5x
* |x| < 120.0   : 1.6x
* |x| < Inf     : 12.1x

ChangeLog:
2018-08-22  Wilco Dijkstra  <wdijkstr@arm.com>

	* sysdeps/ieee754/flt-32/s_tanf.c (__tanf): Use fast range reduction.

diff --git a/sysdeps/ieee754/flt-32/s_tanf.c b/sysdeps/ieee754/flt-32/s_tanf.c
index ba3af54913669e4abdfd864307856ec44138f9b9..fd104103ad026a8c87ea7b571f13e868561a2998 100644
--- a/sysdeps/ieee754/flt-32/s_tanf.c
+++ b/sysdeps/ieee754/flt-32/s_tanf.c
@@ -21,6 +21,33 @@ static char rcsid[] = "$NetBSD: s_tanf.c,v 1.4 1995/05/10 20:48:20 jtc Exp $";
 #include <math.h>
 #include <math_private.h>
 #include <libm-alias-float.h>
+#include "s_sincosf.h"
+
+/* Reduce X modulo PI/2.  The remainder lies between -PI/4 and PI/4 and is
+   returned as a high part y[0] and a low part y[1].  The low bit of the
+   return value selects the first or second half of the tanf period.  */
+static inline int32_t
+rem_pio2f (float x, float *y)
+{
+  double dx = x;
+  int n;
+  const sincos_t *p = &__sincosf_table[0];
+
+  if (__glibc_likely (abstop12 (x) < abstop12 (120.0f)))
+    dx = reduce_fast (dx, p, &n);
+  else
+    {
+      uint32_t xi = asuint (x);
+      int sign = xi >> 31;
+
+      dx = reduce_large (xi, &n);
+      dx = sign ? -dx : dx;
+    }
+
+  y[0] = dx;
+  y[1] = dx - y[0];
+  return n;
+}
 
 float __tanf(float x)
 {
@@ -42,7 +69,7 @@ float __tanf(float x)
 
     /* argument reduction needed */
 	else {
-	    n = __ieee754_rem_pio2f(x,y);
+	    n = rem_pio2f(x,y);
 	    return __kernel_tanf(y[0],y[1],1-((n&1)<<1)); /*   1 -- n even
 							      -1 -- n odd */
 	}