[powerpc] get_rounding_mode: utilize faster method to get rounding mode
Commit Message
From: "Paul A. Clarke" <pc@us.ibm.com>
Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
Also, mask the result to avoid bleeding unrelated bits into the result of
_FPU_GET_RC().
2019-06-05 Paul A. Clarke <pc@us.ibm.com>
* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
with call to equivalent function.
* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
(__FPU_MFFS): New.
(__FPU_MFFSL): New.
(_FPU_GET_RC): New.
(_FPU_GETCW): Use __FPU_MFFS().
* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
---
sysdeps/powerpc/fpu/fenv_libc.h | 15 ++++++++++++
sysdeps/powerpc/fpu_control.h | 29 ++++++++++++++++++++---
sysdeps/powerpc/powerpc64/get-rounding-mode.h | 33 +++++++++++++++++++++++++++
3 files changed, 74 insertions(+), 3 deletions(-)
create mode 100644 sysdeps/powerpc/powerpc64/get-rounding-mode.h
Comments
"Paul A. Clarke" <pc@us.ibm.com> writes:
> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>
> Also, mask the result to avoid bleeding unrelated bits into the result of
> _FPU_GET_RC().
>
> 2019-06-05 Paul A. Clarke <pc@us.ibm.com>
>
> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
> with call to equivalent function.
This file is missing in this patch.
> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
> index 8a0bace..621e57f 100644
> --- a/sysdeps/powerpc/fpu/fenv_libc.h
> +++ b/sysdeps/powerpc/fpu/fenv_libc.h
> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
> pointer. */
> #define fegetenv_register() __builtin_mffs()
>
> +/* Equivalent to fegetenv_register, but only returns bits for
> + status, exception enables, and mode. */
> +#ifdef _ARCH_PWR9
> +#define fegetenv_status() \
> + ({union { double __d; unsigned long long __ll; } __u; \
> + register double __fr; \
> + __asm__ ("mffsl %0" : "=f" (__fr)); \
> + __u.__d = __fr; \
> + __u.__ll; \
> + })
> +#else
> +#define fegetenv_status() __builtin_mffs()
> +#error "power8"
> +#endif
Is this macro used in fegetexcept.c?
Is the error intentional?
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
>
>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>>
>> Also, mask the result to avoid bleeding unrelated bits into the result of
>> _FPU_GET_RC().
>>
>> 2019-06-05 Paul A. Clarke <pc@us.ibm.com>
>>
>> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
>> with call to equivalent function.
>
> This file is missing in this patch.
Copy and paste error from the previous patch I submitted earlier today. Sorry.
I will fix when committing, presuming the rest of this patch is OK.
PC
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
>
>> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
>> index 8a0bace..621e57f 100644
>> --- a/sysdeps/powerpc/fpu/fenv_libc.h
>> +++ b/sysdeps/powerpc/fpu/fenv_libc.h
>> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
>> pointer. */
>> #define fegetenv_register() __builtin_mffs()
>>
>> +/* Equivalent to fegetenv_register, but only returns bits for
>> + status, exception enables, and mode. */
>> +#ifdef _ARCH_PWR9
>> +#define fegetenv_status() \
>> + ({union { double __d; unsigned long long __ll; } __u; \
>> + register double __fr; \
>> + __asm__ ("mffsl %0" : "=f" (__fr)); \
>> + __u.__d = __fr; \
>> + __u.__ll; \
>> + })
>> +#else
>> +#define fegetenv_status() __builtin_mffs()
>> +#error "power8"
>> +#endif
>
> Is this macro used in fegetexcept.c?
> Is the error intentional?
This hunk should not be in this patch submission. I shall remove it.
Sorry I missed this comment before.
PC
"Paul A. Clarke" <pc@us.ibm.com> writes:
> From: "Paul A. Clarke" <pc@us.ibm.com>
>
> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>
> Also, mask the result to avoid bleeding unrelated bits into the result of
> _FPU_GET_RC().
>
> 2019-06-05 Paul A. Clarke <pc@us.ibm.com>
>
> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
> with call to equivalent function.
> * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
As you explained, both entries have to be removed.
> * sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
> (__FPU_MFFS): New.
> (__FPU_MFFSL): New.
> (_FPU_GET_RC): New.
> (_FPU_GETCW): Use __FPU_MFFS().
^ Extra spaces here. Should be only 1 tab.
> * sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
I also think this file is unnecessary.
> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
> new file mode 100644
> index 0000000..e2fdbbb
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
> @@ -0,0 +1,33 @@
> +/* Determine floating-point rounding mode within libc. powerpc64 version.
> + Copyright (C) 2019 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#ifndef _POWERPC64_GET_ROUNDING_MODE_H
> +#define _POWERPC64_GET_ROUNDING_MODE_H 1
> +
> +#include <fenv.h>
> +#include <fpu_control.h>
> +
> +/* Return the floating-point rounding mode. */
> +
> +static inline int
> +get_rounding_mode (void)
> +{
> + return _FPU_GET_RC ();
> +}
> +
> +#endif /* get-rounding-mode.h */
I don't understand why this file/function is needed.
Looks like another file that should be removed from this patch.
LGTM if you fix the ChangeLog and keep only the changes to
sysdeps/powerpc/fpu_control.h.
Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>>
>> Also, mask the result to avoid bleeding unrelated bits into the result of
>> _FPU_GET_RC().
>>
>> 2019-06-05 Paul A. Clarke <pc@us.ibm.com>
>>
>> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
>> with call to equivalent function.
>> * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
>
> As you explained, both entries have to be removed.
Yes, sorry about the cruft.
>> * sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
>> (__FPU_MFFS): New.
>> (__FPU_MFFSL): New.
>> (_FPU_GET_RC): New.
>> (_FPU_GETCW): Use __FPU_MFFS().
>
> ^ Extra spaces here. Should be only 1 tab.
OK.
>> * sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
>
> I also think this file is unnecessary.
(discussed below)
>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>> new file mode 100644
>> index 0000000..e2fdbbb
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>> @@ -0,0 +1,33 @@
>> +/* Determine floating-point rounding mode within libc. powerpc64 version.
<snip>
>> +static inline int
>> +get_rounding_mode (void)
>> +{
>> + return _FPU_GET_RC ();
>> +}
>> +
>> +#endif /* get-rounding-mode.h */
>
> I don't understand why this file/function is needed.
> Looks like another file that should be removed from this patch.
Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs". get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9).
> LGTM if you fix the ChangeLog and keep only the changes to
> sysdeps/powerpc/fpu_control.h.
>
> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
PC
Paul Clarke <pc@us.ibm.com> writes:
> On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:
>> "Paul A. Clarke" <pc@us.ibm.com> writes:
>>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>>> new file mode 100644
>>> index 0000000..e2fdbbb
>>> --- /dev/null
>>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>>> @@ -0,0 +1,33 @@
>>> +/* Determine floating-point rounding mode within libc. powerpc64 version.
> <snip>
>>> +static inline int
>>> +get_rounding_mode (void)
>>> +{
>>> + return _FPU_GET_RC ();
>>> +}
>>> +
>>> +#endif /* get-rounding-mode.h */
>>
>> I don't understand why this file/function is needed.
>> Looks like another file that should be removed from this patch.
>
> Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs". get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9).
Very good point. I had missed this.
In that case, I think you should move this file to
sysdeps/powerpc/fpu/get-rounding-mode.h allowing 32-bits builds to use it
and restricting its usage to fpu.
@@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
pointer. */
#define fegetenv_register() __builtin_mffs()
+/* Equivalent to fegetenv_register, but only returns bits for
+ status, exception enables, and mode. */
+#ifdef _ARCH_PWR9
+#define fegetenv_status() \
+ ({union { double __d; unsigned long long __ll; } __u; \
+ register double __fr; \
+ __asm__ ("mffsl %0" : "=f" (__fr)); \
+ __u.__d = __fr; \
+ __u.__ll; \
+ })
+#else
+#define fegetenv_status() __builtin_mffs()
+#error "power8"
+#endif
+
/* Equivalent to fesetenv, but takes a fenv_t instead of a pointer. */
#define fesetenv_register(env) \
do { \
@@ -71,6 +71,8 @@ extern fpu_control_t __fpu_control;
# define _FPU_RC_UP 0x02
# define _FPU_RC_ZERO 0x01
+# define _FPU_MASK_RC (_FPU_RC_NEAREST|_FPU_RC_DOWN|_FPU_RC_UP|_FPU_RC_ZERO)
+
# define _FPU_MASK_NI 0x04 /* non-ieee mode */
/* masking of interrupts */
@@ -94,15 +96,36 @@ extern fpu_control_t __fpu_control;
typedef unsigned int fpu_control_t;
/* Macros for accessing the hardware control word. */
+# define __FPU_MFFS() \
+ ({register double __fr; \
+ __asm__ ("mffs %0" : "=f" (__fr)); \
+ __fr; \
+ })
+
# define _FPU_GETCW(cw) \
({union { double __d; unsigned long long __ll; } __u; \
- register double __fr; \
- __asm__ ("mffs %0" : "=f" (__fr)); \
- __u.__d = __fr; \
+ __u.__d = __FPU_MFFS(); \
(cw) = (fpu_control_t) __u.__ll; \
(fpu_control_t) __u.__ll; \
})
+#ifdef _ARCH_PWR9
+# define __FPU_MFFSL() \
+ ({register double __fr; \
+ __asm__ ("mffsl %0" : "=f" (__fr)); \
+ __fr; \
+ })
+#else
+# define __FPU_MFFSL() __FPU_MFFS()
+#endif
+
+# define _FPU_GET_RC() \
+ ({union { double __d; unsigned long long __ll; } __u; \
+ __u.__d = __FPU_MFFSL(); \
+ __u.__ll &= _FPU_MASK_RC; \
+ (fpu_control_t) __u.__ll; \
+ })
+
# define _FPU_SETCW(cw) \
{ union { double __d; unsigned long long __ll; } __u; \
register double __fr; \
new file mode 100644
@@ -0,0 +1,33 @@
+/* Determine floating-point rounding mode within libc. powerpc64 version.
+ Copyright (C) 2019 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef _POWERPC64_GET_ROUNDING_MODE_H
+#define _POWERPC64_GET_ROUNDING_MODE_H 1
+
+#include <fenv.h>
+#include <fpu_control.h>
+
+/* Return the floating-point rounding mode. */
+
+static inline int
+get_rounding_mode (void)
+{
+ return _FPU_GET_RC ();
+}
+
+#endif /* get-rounding-mode.h */