[1/2] ARC: fp: (micro)optimize FPU_STATUS read by eliding FWE bit clearing
Checks
Context | Check | Description
dj/TryBot-apply_patch | success | Patch applied to master at the time it was sent
Commit Message
Any FPU_STATUS write needs the FWE bit (31) set, which just provides
a "control signal" to enable an explicit write (vs. the side-effect of FPU
instructions). However, this bit is RAZ and write-only, and thus effectively
never stored in the FPU_STATUS register. So when reading the register
there is no need to clear it. This shaves off a BCLR instruction from
the fe*exception family of functions and, while no big deal, still makes
sense to do.
This came up when debugging a race in math/test-fenv-tls [1].
[1]: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/54
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
sysdeps/arc/fpu_control.h | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
Comments
On 7/20/21 1:57 PM, Vineet Gupta via Libc-alpha wrote:
> Any FPU_STATUS write needs the FWE bit (31) set, which just provides
> a "control signal" to enable an explicit write (vs. the side-effect of FPU
> instructions). However, this bit is RAZ and write-only, and thus effectively
> never stored in the FPU_STATUS register. So when reading the register
> there is no need to clear it. This shaves off a BCLR instruction from
> the fe*exception family of functions and, while no big deal, still makes
> sense to do.
>
> This came up when debugging a race in math/test-fenv-tls [1].
>
> [1]: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/54
>
> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Committed !
> ---
> sysdeps/arc/fpu_control.h | 15 +++++++--------
> 1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/sysdeps/arc/fpu_control.h b/sysdeps/arc/fpu_control.h
> index c7d101e7838f..ae4348321c16 100644
> --- a/sysdeps/arc/fpu_control.h
> +++ b/sysdeps/arc/fpu_control.h
> @@ -81,21 +81,20 @@ typedef unsigned int fpu_control_t;
> # define _FPU_SETCW(cw) __asm__ volatile ("sr %0, [0x300]" : : "r" (cw))
>
> /* Macros for accessing the hardware status word.
> - FWE bit is special as it controls if actual status bits could be wrritten
> - explicitly (other than FPU instructions). We handle it here to keep the
> - callers agnostic of it:
> - - clear it out when reporting status bits
> - - always set it when changing status bits. */
> + Writing to FPU_STATUS requires a "control" bit FWE to be able to set the
> + exception flags directly (as opposed to side-effects of FP instructions).
> + That is done in the macro here to keep callers agnostic of this detail.
> + And given FWE is write-only and RAZ, no need to "clear" it in _FPU_GETS
> + macro. */
> # define _FPU_GETS(cw) \
> __asm__ volatile ("lr %0, [0x301] \r\n" \
> - "bclr %0, %0, 31 \r\n" \
> : "=r" (cw))
>
> # define _FPU_SETS(cw) \
> do { \
> - unsigned int __tmp = 0x80000000 | (cw); \
> + unsigned int __fwe = 0x80000000 | (cw); \
> __asm__ volatile ("sr %0, [0x301] \r\n" \
> - : : "r" (__tmp)); \
> + : : "r" (__fwe)); \
> } while (0)
>
> /* Default control word set at startup. */
>