[1/2] ARC: fp: (micro)optimize FPU_STATUS read by eliding FWE bit clearing

Message ID 20210720205800.1056218-2-vgupta@synopsys.com
State Committed
Commit 31aefa93f3e9a49b7a493d410acb70108e176d61
Series ARC fixes/updates

Checks

Context                Check    Description
dj/TryBot-apply_patch  success  Patch applied to master at the time it was sent

Commit Message

Vineet Gupta July 20, 2021, 8:57 p.m. UTC
Any FPU_STATUS write needs to set the FWE bit (31), which just provides
a "control signal" to enable an explicit write (vs. the side-effect of FPU
instructions).  However, this bit is RAZ (read-as-zero) and write-only, so
it is never actually stored in the FPU_STATUS register, and there is no
need to clear it when reading the register.  This shaves a BCLR
instruction off the fe*exception family of functions; it is no big deal,
but it still makes sense to do.

This came up when debugging a race in math/test-fenv-tls [1]

[1]: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/54

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 sysdeps/arc/fpu_control.h | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)
  

Comments

Vineet Gupta July 21, 2021, 8:16 p.m. UTC | #1
On 7/20/21 1:57 PM, Vineet Gupta via Libc-alpha wrote:
> Any FPU_STATUS write needs to set the FWE bit (31), which just provides
> a "control signal" to enable an explicit write (vs. the side-effect of FPU
> instructions).  However, this bit is RAZ (read-as-zero) and write-only, so
> it is never actually stored in the FPU_STATUS register, and there is no
> need to clear it when reading the register.  This shaves a BCLR
> instruction off the fe*exception family of functions; it is no big deal,
> but it still makes sense to do.
> 
> This came up when debugging a race in math/test-fenv-tls [1]
> 
> [1]: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/54
> 
> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>

Committed!

  

Patch

diff --git a/sysdeps/arc/fpu_control.h b/sysdeps/arc/fpu_control.h
index c7d101e7838f..ae4348321c16 100644
--- a/sysdeps/arc/fpu_control.h
+++ b/sysdeps/arc/fpu_control.h
@@ -81,21 +81,20 @@  typedef unsigned int fpu_control_t;
 #  define _FPU_SETCW(cw) __asm__ volatile ("sr %0, [0x300]" : : "r" (cw))
 
 /*  Macros for accessing the hardware status word.
-    FWE bit is special as it controls if actual status bits could be wrritten
-    explicitly (other than FPU instructions). We handle it here to keep the
-    callers agnostic of it:
-      - clear it out when reporting status bits
-      - always set it when changing status bits.  */
+    Writing to FPU_STATUS requires a "control" bit FWE to be able to set the
+    exception flags directly (as opposed to side-effects of FP instructions).
+    That is done in the macro here to keeps callers agnostic of this detail.
+    And given FWE is write-only and RAZ, no need to "clear" it in _FPU_GETS
+    macro.  */
 #  define _FPU_GETS(cw)				\
     __asm__ volatile ("lr   %0, [0x301]	\r\n" 	\
-                      "bclr %0, %0, 31	\r\n" 	\
                       : "=r" (cw))
 
 #  define _FPU_SETS(cw)				\
     do {					\
-      unsigned int __tmp = 0x80000000 | (cw);	\
+      unsigned int __fwe = 0x80000000 | (cw);	\
       __asm__ volatile ("sr  %0, [0x301] \r\n" 	\
-                        : : "r" (__tmp));	\
+                        : : "r" (__fwe));	\
     } while (0)
 
 /* Default control word set at startup.  */
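
The reasoning behind dropping the bclr can be checked on a host machine with a small software model of the register. This is only a sketch under stated assumptions, not the hardware's documented behaviour and not glibc code: the names model_status, model_sr_status, model_lr_status, MODEL_FPU_GETS, MODEL_FPU_SETS and roundtrip_status are hypothetical, and the model assumes that a write without FWE is ignored entirely and that FWE itself is never stored.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 of FPU_STATUS: the FWE "control" bit named in the patch.  */
#define FWE_BIT 0x80000000u

/* Hypothetical stand-in for the FPU_STATUS aux register (0x301).
   It only ever holds the bits the model assumes are really stored.  */
static uint32_t model_status;

/* Models "sr %0, [0x301]".  Assumption: the write is dropped unless
   FWE is set, and FWE itself is never stored (write-only).  */
static void
model_sr_status (uint32_t val)
{
  if (val & FWE_BIT)
    model_status = val & ~FWE_BIT;
}

/* Models "lr %0, [0x301]".  FWE reads as zero because it was never
   stored in the first place.  */
static uint32_t
model_lr_status (void)
{
  return model_status;
}

/* Host-only stand-ins mirroring the shape of the post-patch macros.  */
#define MODEL_FPU_GETS(cw) ((cw) = model_lr_status ())
#define MODEL_FPU_SETS(cw) model_sr_status (FWE_BIT | (cw))

/* Write FLAGS the way _FPU_SETS does, read them back the way the
   post-patch _FPU_GETS does, and check that clearing bit 31 (what the
   removed "bclr %0, %0, 31" did) would not change the value read.  */
static uint32_t
roundtrip_status (uint32_t flags)
{
  uint32_t st;

  MODEL_FPU_SETS (flags);
  MODEL_FPU_GETS (st);
  assert ((st & ~FWE_BIT) == st);
  return st;
}
```

Calling roundtrip_status (0x5u) returns 0x5 under this model, and a later model_sr_status (0x3u) without FWE leaves the stored value untouched. Whether real hardware drops such a write completely is an assumption of the model; the commit message only states that FWE enables the explicit write and is never stored.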