Patchwork powerpc: fix check-before-set in SET_RESTORE_ROUND

Submitter Paul Clarke
Date Oct. 10, 2017, 3:52 p.m.
Message ID <81a9460e-368d-997c-0e85-d7d8a0387e46@us.ibm.com>
Permalink /patch/23426/
State Committed
Delegated to: Tulio Magno Quites Machado Filho

Comments

Paul Clarke - Oct. 10, 2017, 3:52 p.m.
A performance regression was introduced by commit
84d74e427a771906830800e574a72f8d25a954b8 "powerpc: Cleanup fenv_private.h".

In the powerpc implementation of SET_RESTORE_ROUND, there is the
following code in the "SET" function (slightly simplified):
--
  old.fenv = fegetenv_register ();

  new.l = (old.l & _FPU_MASK_TRAPS_RN) | r; (1)

  if (new.l != old.l)                       (2)
    {
      if ((old.l & _FPU_ALL_TRAPS) != 0)
        (void) __fe_mask_env ();
      fesetenv_register (new.fenv);         (3)
--

Line (1) sets "new" to the current value of the FPSCR with the
summary bits, exception enables, non-IEEE mode, and rounding mode
masked off, then ORs in the new rounding mode.

Line (2) compares this new value to the current value in order to
avoid setting a new value in the FPSCR (line (3)) unless something
significant has changed (exception enables or rounding mode).

The summary bits are not germane to the comparison, but they are
cleared in "new" and preserved in "old". Whenever a summary bit is
set, the two values differ even though neither the exception enables
nor the rounding mode has changed, so the FPSCR is written
unnecessarily, with the associated performance cost.
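
The spurious mismatch can be reproduced outside glibc. The following
is only a sketch: it mirrors lines (1) and (2) on plain 64-bit
integers, uses the pre-patch mask value from the diff, and assumes the
usual FPSCR layout in which FX and XX are 0x80000000 and 0x02000000
in the low word. The names set_would_write, check_old_mask, FPSCR_FX
and FPSCR_XX are illustrative and are not glibc identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch mask, copied from the line the patch removes.  */
#define OLD_FPU_MASK_TRAPS_RN 0xffffffff1fffff00ULL

/* Illustrative names (not glibc macros) for two FPSCR status bits.  */
#define FPSCR_FX 0x80000000ULL	/* exception summary */
#define FPSCR_XX 0x02000000ULL	/* sticky inexact exception */

/* Mirror lines (1) and (2): return 1 if SET would go on to write the
   FPSCR at line (3), 0 if the write would be skipped.  */
static int
set_would_write (uint64_t old_l, uint64_t mask, uint64_t r)
{
  uint64_t new_l = (old_l & mask) | r;
  return new_l != old_l;
}

/* Hypothetical self-check of the two cases discussed above.  */
static void
check_old_mask (void)
{
  /* Clean FPSCR, round-to-nearest requested (r == 0): no write.  */
  assert (!set_would_write (0, OLD_FPU_MASK_TRAPS_RN, 0));

  /* Same request after an earlier inexact result left FX and XX set:
     the mask clears FX in new_l only, so the values differ and the
     FPSCR is written although nothing relevant changed.  */
  assert (set_would_write (FPSCR_FX | FPSCR_XX, OLD_FPU_MASK_TRAPS_RN, 0));
}
```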

The solution is to treat the summaries identically for "new" and "old":
- save them in SET
- leave them alone otherwise
- restore the saved values in RESTORE
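
As a rough illustration of the SET side only (RESTORE is not modeled
here), the check can be repeated with the post-patch value of
_FPU_MASK_TRAPS_RN from the diff. This is a sketch on plain integers,
not the glibc code; patched_set_would_write, check_new_mask and the
FPSCR_* and RN_TOWARD_ZERO names are made up for the example, and the
bit values assume the usual FPSCR layout.

```c
#include <assert.h>
#include <stdint.h>

/* Post-patch mask, copied from the line the patch adds.  */
#define NEW_FPU_MASK_TRAPS_RN 0xffffffffffffff00ULL

/* Illustrative names (not glibc macros).  */
#define FPSCR_FX       0x80000000ULL	/* exception summary */
#define FPSCR_XX       0x02000000ULL	/* sticky inexact exception */
#define FPSCR_XE       0x00000008ULL	/* inexact exception enable */
#define RN_TOWARD_ZERO 0x1ULL		/* rounding mode field value 0b01 */

/* Return 1 if SET would write the FPSCR when using the new mask.  */
static int
patched_set_would_write (uint64_t old_l, uint64_t r)
{
  uint64_t new_l = (old_l & NEW_FPU_MASK_TRAPS_RN) | r;
  return new_l != old_l;
}

/* Hypothetical self-check.  */
static void
check_new_mask (void)
{
  uint64_t sticky = FPSCR_FX | FPSCR_XX;

  /* Status bits alone no longer force a write.  */
  assert (!patched_set_would_write (sticky, 0));

  /* A real rounding mode change still does.  */
  assert (patched_set_would_write (sticky, RN_TOWARD_ZERO));

  /* So does an enabled trap, which SET has to clear.  */
  assert (patched_set_would_write (sticky | FPSCR_XE, 0));
}
```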

Also minor changes:
- expand _FPU_MASK_RN to 64bit hex, to match other MASKs
- treat bit 52 (left-to-right) as reserved (since it is)
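
For reference, a small sketch of the arithmetic behind these two
changes. Under the usual C integer conversions, (~0x3) is the int -4
and converts to the same 64-bit value as the new literal, so the
first change is one of notation. With bits numbered left-to-right
from 0 to 63, bit 52 corresponds to 0x800, which is the only
difference between the old and new _FPU_MASK_NOT_RN_NI values in the
diff. The helper fpscr_bit and the function check_masks are
hypothetical names used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value of bit N when bits are numbered
   left-to-right, 0 (most significant) through 63 (least significant).  */
static uint64_t
fpscr_bit (unsigned int n)
{
  return UINT64_C (1) << (63 - n);
}

/* Hypothetical self-check of the mask values quoted in the patch.  */
static void
check_masks (void)
{
  /* Old and new spellings of _FPU_MASK_RN agree as 64-bit values.  */
  assert ((unsigned long long) (~0x3) == 0xfffffffffffffffcULL);

  /* The rounding mode field is bits 62 and 63, i.e. the low two bits.  */
  assert ((fpscr_bit (62) | fpscr_bit (63)) == 0x3);

  /* Bit 52 is 0x800; adding it to the old _FPU_MASK_NOT_RN_NI value
     yields the new one.  */
  assert (fpscr_bit (52) == 0x800);
  assert ((0xffffffff00000007ULL | fpscr_bit (52))
	  == 0xffffffff00000807ULL);
}
```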

2017-10-10  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fenv_private.h: Fix masks to
	more properly handle summary bits.
---
 sysdeps/powerpc/fpu/fenv_private.h | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

Tulio Magno Quites Machado Filho - Oct. 13, 2017, 8:16 p.m.
Paul Clarke <pc@us.ibm.com> writes:

> [...]
>
> 2017-10-10  Paul A. Clarke  <pc@us.ibm.com>
>
> 	* sysdeps/powerpc/fpu/fenv_private.h: Fix masks to
> 	more properly handle summary bits.

I think this ChangeLog entry has to mention all the macros that are being
changed, e.g.:

	* sysdeps/powerpc/fpu/fenv_private.h (_FPU_MASK_TRAPS_RN):
	(_FPU_MASK_FRAC_INEX_RET_CC): Fix masks to more properly handle
	summary bits.
	(_FPU_MASK_RN): Expand _FPU_MASK_RN to 64bit hex.
	(_FPU_MASK_NOT_RN_NI): Treat bit 52 (left-to-right) as reserved. 

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com>

Patch

diff --git a/sysdeps/powerpc/fpu/fenv_private.h b/sysdeps/powerpc/fpu/fenv_private.h
index 877f25b..984dff9 100644
--- a/sysdeps/powerpc/fpu/fenv_private.h
+++ b/sysdeps/powerpc/fpu/fenv_private.h
@@ -28,17 +28,16 @@ 
                       | _FPU_MASK_XM | _FPU_MASK_IM)
 
 /* Mask the rounding mode bits.  */
-#define _FPU_MASK_RN (~0x3)
+#define _FPU_MASK_RN 0xfffffffffffffffcLL
 
-/* Mask everything but the rounding moded and non-IEEE arithmetic flags.  */
-#define _FPU_MASK_NOT_RN_NI 0xffffffff00000007LL
+/* Mask everything but the rounding modes and non-IEEE arithmetic flags.  */
+#define _FPU_MASK_NOT_RN_NI 0xffffffff00000807LL
 
 /* Mask restore rounding mode and exception enabled.  */
-#define _FPU_MASK_TRAPS_RN 0xffffffff1fffff00LL
+#define _FPU_MASK_TRAPS_RN 0xffffffffffffff00LL
 
-/* Mask exception enable but fraction rounded/inexact and FP result/CC
-   bits.  */
-#define _FPU_MASK_FRAC_INEX_RET_CC 0xffffffff1ff80fff
+/* Mask FP result flags, preserve fraction rounded/inexact bits.  */
+#define _FPU_MASK_FRAC_INEX_RET_CC 0xfffffffffff80fffLL
 
 static __always_inline void
 __libc_feholdbits_ppc (fenv_t *envp, unsigned long long mask,