Patchwork [powerpc] get_rounding_mode: utilize faster method to get rounding mode

login
register
mail settings
Submitter Paul Clarke
Date June 5, 2019, 10:17 p.m.
Message ID <1559773022-26575-1-git-send-email-pc@us.ibm.com>
Download mbox | patch
Permalink /patch/33033/
State New
Headers show

Comments

Paul Clarke - June 5, 2019, 10:17 p.m.
From: "Paul A. Clarke" <pc@us.ibm.com>

Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

Also, mask the result to avoid bleeding unrelated bits into the result of
_FPU_GET_RC().

2019-06-05  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
	with call to equivalent function.
	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
	  (__FPU_MFFS): New.
	  (__FPU_MFFSL): New.
	  (_FPU_GET_RC): New.
	  (_FPU_GETCW): Use __FPU_MFFS().
	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
---
 sysdeps/powerpc/fpu/fenv_libc.h               | 15 ++++++++++++
 sysdeps/powerpc/fpu_control.h                 | 29 ++++++++++++++++++++---
 sysdeps/powerpc/powerpc64/get-rounding-mode.h | 33 +++++++++++++++++++++++++++
 3 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 sysdeps/powerpc/powerpc64/get-rounding-mode.h
Tulio Magno Quites Machado Filho - June 5, 2019, 10:42 p.m.
"Paul A. Clarke" <pc@us.ibm.com> writes:

> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>
> Also, mask the result to avoid bleeding unrelated bits into the result of
> _FPU_GET_RC().
>
> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>
> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
> 	with call to equivalent function.

This file is missing in this patch.

> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
> index 8a0bace..621e57f 100644
> --- a/sysdeps/powerpc/fpu/fenv_libc.h
> +++ b/sysdeps/powerpc/fpu/fenv_libc.h
> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
>     pointer.  */
>  #define fegetenv_register() __builtin_mffs()
>  
> +/* Equivalent to fegetenv_register, but only returns bits for
> +   status, exception enables, and mode.  */
> +#ifdef _ARCH_PWR9
> +#define fegetenv_status() \
> +  ({union { double __d; unsigned long long __ll; } __u;	\
> +    register double __fr;				\
> +    __asm__ ("mffsl %0" : "=f" (__fr));			\
> +    __u.__d = __fr;					\
> +    __u.__ll;						\
> +  })
> +#else
> +#define fegetenv_status() __builtin_mffs()
> +#error "power8"
> +#endif

Is this macro used in fegetexcept.c?
Is the error intentional?
Paul Clarke - June 5, 2019, 10:54 p.m.
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
> 
>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>>
>> Also, mask the result to avoid bleeding unrelated bits into the result of
>> _FPU_GET_RC().
>>
>> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>>
>> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
>> 	with call to equivalent function.
> 
> This file is missing in this patch.

Copy and paste error from the previous patch I submitted earlier today.  Sorry.
I will fix when committing, presuming the rest of this patch is OK.

PC
Paul Clarke - June 5, 2019, 11:04 p.m.
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
> 
>> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
>> index 8a0bace..621e57f 100644
>> --- a/sysdeps/powerpc/fpu/fenv_libc.h
>> +++ b/sysdeps/powerpc/fpu/fenv_libc.h
>> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
>>     pointer.  */
>>  #define fegetenv_register() __builtin_mffs()
>>  
>> +/* Equivalent to fegetenv_register, but only returns bits for
>> +   status, exception enables, and mode.  */
>> +#ifdef _ARCH_PWR9
>> +#define fegetenv_status() \
>> +  ({union { double __d; unsigned long long __ll; } __u;	\
>> +    register double __fr;				\
>> +    __asm__ ("mffsl %0" : "=f" (__fr));			\
>> +    __u.__d = __fr;					\
>> +    __u.__ll;						\
>> +  })
>> +#else
>> +#define fegetenv_status() __builtin_mffs()
>> +#error "power8"
>> +#endif
> 
> Is this macro used in fegetexcept.c?
> Is the error intentional?

This hunk should not be in this patch submission.  I shall remove it.

Sorry I missed this comment before.

PC
Tulio Magno Quites Machado Filho - June 6, 2019, 2:17 p.m.
"Paul A. Clarke" <pc@us.ibm.com> writes:

> From: "Paul A. Clarke" <pc@us.ibm.com>
>
> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>
> Also, mask the result to avoid bleeding unrelated bits into the result of
> _FPU_GET_RC().
>
> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>
> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
> 	with call to equivalent function.
> 	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.

As you explained, both entries have to be removed.

> 	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
> 	  (__FPU_MFFS): New.
> 	  (__FPU_MFFSL): New.
> 	  (_FPU_GET_RC): New.
> 	  (_FPU_GETCW): Use __FPU_MFFS().

        ^ Extra spaces here.  Should be only 1 tab.

> 	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.

I also think this file is unnecessary.

> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
> new file mode 100644
> index 0000000..e2fdbbb
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
> @@ -0,0 +1,33 @@
> +/* Determine floating-point rounding mode within libc.  powerpc64 version.
> +   Copyright (C) 2019 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef _POWERPC64_GET_ROUNDING_MODE_H
> +#define _POWERPC64_GET_ROUNDING_MODE_H	1
> +
> +#include <fenv.h>
> +#include <fpu_control.h>
> +
> +/* Return the floating-point rounding mode.  */
> +
> +static inline int
> +get_rounding_mode (void)
> +{
> +  return _FPU_GET_RC ();
> +}
> +
> +#endif /* get-rounding-mode.h */

I don't understand why this file/function is needed.
Looks like another file that should be removed from this patch.

LGTM if you fix the ChangeLog and keep only the changes to
sysdeps/powerpc/fpu_control.h.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
Paul Clarke - June 6, 2019, 5:41 p.m.
On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>>
>> Also, mask the result to avoid bleeding unrelated bits into the result of
>> _FPU_GET_RC().
>>
>> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>>
>> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
>> 	with call to equivalent function.
>> 	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
> 
> As you explained, both entries have to be removed.

Yes, sorry about the cruft.

>> 	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
>> 	  (__FPU_MFFS): New.
>> 	  (__FPU_MFFSL): New.
>> 	  (_FPU_GET_RC): New.
>> 	  (_FPU_GETCW): Use __FPU_MFFS().
> 
>         ^ Extra spaces here.  Should be only 1 tab.

OK.

>> 	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
> 
> I also think this file is unnecessary.

(discussed below)

>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>> new file mode 100644
>> index 0000000..e2fdbbb
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>> @@ -0,0 +1,33 @@
>> +/* Determine floating-point rounding mode within libc.  powerpc64 version.
<snip>
>> +static inline int
>> +get_rounding_mode (void)
>> +{
>> +  return _FPU_GET_RC ();
>> +}
>> +
>> +#endif /* get-rounding-mode.h */
> 
> I don't understand why this file/function is needed.
> Looks like another file that should be removed from this patch.

Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs".  get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9).

> LGTM if you fix the ChangeLog and keep only the changes to
> sysdeps/powerpc/fpu_control.h.
> 
> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PC
Tulio Magno Quites Machado Filho - June 6, 2019, 6:05 p.m.
Paul Clarke <pc@us.ibm.com> writes:

> On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:
>> "Paul A. Clarke" <pc@us.ibm.com> writes:

>>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>>> new file mode 100644
>>> index 0000000..e2fdbbb
>>> --- /dev/null
>>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>>> @@ -0,0 +1,33 @@
>>> +/* Determine floating-point rounding mode within libc.  powerpc64 version.
> <snip>
>>> +static inline int
>>> +get_rounding_mode (void)
>>> +{
>>> +  return _FPU_GET_RC ();
>>> +}
>>> +
>>> +#endif /* get-rounding-mode.h */
>> 
>> I don't understand why this file/function is needed.
>> Looks like another file that should be removed from this patch.
>
> Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs".  get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9).

Very good point.  I had missed this.

In that case, I think you should move this file to
sysdeps/powerpc/fpu/get-rounding-mode.h allowing 32-bits builds to use it
and restricting its usage to fpu.

Patch

diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
index 8a0bace..621e57f 100644
--- a/sysdeps/powerpc/fpu/fenv_libc.h
+++ b/sysdeps/powerpc/fpu/fenv_libc.h
@@ -34,6 +34,21 @@  extern const fenv_t *__fe_mask_env (void) attribute_hidden;
    pointer.  */
 #define fegetenv_register() __builtin_mffs()
 
+/* Equivalent to fegetenv_register, but only returns bits for
+   status, exception enables, and mode.  */
+#ifdef _ARCH_PWR9
+#define fegetenv_status() \
+  ({union { double __d; unsigned long long __ll; } __u;	\
+    register double __fr;				\
+    __asm__ ("mffsl %0" : "=f" (__fr));			\
+    __u.__d = __fr;					\
+    __u.__ll;						\
+  })
+#else
+#define fegetenv_status() __builtin_mffs()
+#error "power8"
+#endif
+
 /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer.  */
 #define fesetenv_register(env) \
 	do { \
diff --git a/sysdeps/powerpc/fpu_control.h b/sysdeps/powerpc/fpu_control.h
index e0c5cf6..22c52b7 100644
--- a/sysdeps/powerpc/fpu_control.h
+++ b/sysdeps/powerpc/fpu_control.h
@@ -71,6 +71,8 @@  extern fpu_control_t __fpu_control;
 # define _FPU_RC_UP      0x02
 # define _FPU_RC_ZERO    0x01
 
+# define _FPU_MASK_RC (_FPU_RC_NEAREST|_FPU_RC_DOWN|_FPU_RC_UP|_FPU_RC_ZERO)
+
 # define _FPU_MASK_NI  0x04 /* non-ieee mode */
 
 /* masking of interrupts */
@@ -94,15 +96,36 @@  extern fpu_control_t __fpu_control;
 typedef unsigned int fpu_control_t;
 
 /* Macros for accessing the hardware control word.  */
+# define __FPU_MFFS()						\
+  ({register double __fr;					\
+    __asm__ ("mffs %0" : "=f" (__fr));				\
+    __fr;							\
+  })
+
 # define _FPU_GETCW(cw)						\
   ({union { double __d; unsigned long long __ll; } __u;		\
-    register double __fr;					\
-    __asm__ ("mffs %0" : "=f" (__fr));				\
-    __u.__d = __fr;						\
+    __u.__d = __FPU_MFFS();					\
     (cw) = (fpu_control_t) __u.__ll;				\
     (fpu_control_t) __u.__ll;					\
   })
 
+#ifdef _ARCH_PWR9
+# define __FPU_MFFSL()						\
+  ({register double __fr;					\
+    __asm__ ("mffsl %0" : "=f" (__fr));				\
+    __fr;							\
+  })
+#else
+# define __FPU_MFFSL() __FPU_MFFS()
+#endif
+    
+# define _FPU_GET_RC()						\
+  ({union { double __d; unsigned long long __ll; } __u;		\
+    __u.__d = __FPU_MFFSL();					\
+    __u.__ll &= _FPU_MASK_RC;					\
+    (fpu_control_t) __u.__ll;					\
+  })
+
 # define _FPU_SETCW(cw)						\
   { union { double __d; unsigned long long __ll; } __u;		\
     register double __fr;					\
diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
new file mode 100644
index 0000000..e2fdbbb
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
@@ -0,0 +1,33 @@ 
+/* Determine floating-point rounding mode within libc.  powerpc64 version.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _POWERPC64_GET_ROUNDING_MODE_H
+#define _POWERPC64_GET_ROUNDING_MODE_H	1
+
+#include <fenv.h>
+#include <fpu_control.h>
+
+/* Return the floating-point rounding mode.  */
+
+static inline int
+get_rounding_mode (void)
+{
+  return _FPU_GET_RC ();
+}
+
+#endif /* get-rounding-mode.h */