[3/4,powerpc] fesetenv: optimize FPSCR access

Message ID 1566335985-14601-4-git-send-email-pc@us.ibm.com
State Committed
Headers

Commit Message

Paul A. Clarke Aug. 20, 2019, 9:19 p.m. UTC
  From: "Paul A. Clarke" <pc@us.ibm.com>

fesetenv() reads the current value of the Floating-Point Status and Control
Register (FPSCR) to determine the difference between the current state of
exception enables and the newly requested state.  All of these bits are also
returned by the lighter weight 'mffsl' instruction used by fegetenv_status().
Use that instead.

Also, remove a local macro _FPU_MASK_ALL in favor of a common macro,
FPU_ENABLES_MASK from fenv_libc.h.

Finally, use a local variable ('new') in favor of a pointer dereference
('*envp').

2019-08-20  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fesetenv.c (__fesetenv):  Utilize lightweight
	FPSCR read.
	(_FPU_MASK_ALL):  Delete.
---
 sysdeps/powerpc/fpu/fesetenv.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)
  

Comments

Paul E Murphy Aug. 21, 2019, 10:13 p.m. UTC | #1
On 8/20/19 4:19 PM, Paul A. Clarke wrote:
> From: "Paul A. Clarke" <pc@us.ibm.com>
> 
> fesetenv() reads the current value of the Floating-Point Status and Control
> Register (FPSCR) to determine the difference between the current state of
> exception enables and the newly requested state.  All of these bits are also
> returned by the lighter weight 'mffsl' instruction used by fegetenv_status().
> Use that instead.
> 
> Also, remove a local macro _FPU_MASK_ALL in favor of a common macro,
> FPU_ENABLES_MASK from fenv_libc.h.
> 
> Finally, use a local variable ('new') in favor of a pointer dereference
> ('*envp').
> 
> 2019-08-20  Paul A. Clarke  <pc@us.ibm.com>
> 
> 	* sysdeps/powerpc/fpu/fesetenv.c (__fesetenv):  Utilize lightweight
> 	FPSCR read.
> 	(_FPU_MASK_ALL):  Delete.
> ---
>   sysdeps/powerpc/fpu/fesetenv.c | 12 ++++--------
>   1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/sysdeps/powerpc/fpu/fesetenv.c b/sysdeps/powerpc/fpu/fesetenv.c
> index 009a4f0..5ca15c7 100644
> --- a/sysdeps/powerpc/fpu/fesetenv.c
> +++ b/sysdeps/powerpc/fpu/fesetenv.c
> @@ -19,8 +19,6 @@
>   #include <fenv_libc.h>
>   #include <fpu_control.h>
>   
> -#define _FPU_MASK_ALL (_FPU_MASK_ZM | _FPU_MASK_OM | _FPU_MASK_UM | _FPU_MASK_XM | _FPU_MASK_IM)
> -
>   int
>   __fesetenv (const fenv_t *envp)
>   {
> @@ -28,25 +26,23 @@ __fesetenv (const fenv_t *envp)
>   
>     /* get the currently set exceptions.  */
>     new.fenv = *envp;
> -  old.fenv = fegetenv_register ();
> -  if (old.l == new.l)
> -    return 0;
> +  old.fenv = fegetenv_status ();
>   
>     /* If the old env has no enabled exceptions and the new env has any enabled
>        exceptions, then unmask SIGFPE in the MSR FE0/FE1 bits.  This will put the
>        hardware into "precise mode" and may cause the FPU to run slower on some
>        hardware.  */
> -  if ((old.l & _FPU_MASK_ALL) == 0 && (new.l & _FPU_MASK_ALL) != 0)
> +  if ((old.l & FPSCR_ENABLES_MASK) == 0 && (new.l & FPSCR_ENABLES_MASK) != 0)
>       (void) __fe_nomask_env_priv (); >
>     /* If the old env had any enabled exceptions and the new env has no enabled
>        exceptions, then mask SIGFPE in the MSR FE0/FE1 bits.  This may allow the
>        FPU to run faster because it always takes the default action and can not
>        generate SIGFPE. */
> -  if ((old.l & _FPU_MASK_ALL) != 0 && (new.l & _FPU_MASK_ALL) == 0)
> +  if ((old.l & FPSCR_ENABLES_MASK) != 0 && (new.l & FPSCR_ENABLES_MASK) == 0)
>       (void)__fe_mask_env ();
If you need to make another similar change, I might recommend 
consolidating the code to toggle the MSR bits.

>   
> -  fesetenv_register (*envp);
> +  fesetenv_register (new.fenv);
>   
>     /* Success.  */
>     return 0;
> 

LGTM, thanks.
  
Paul A. Clarke Aug. 22, 2019, 4:55 p.m. UTC | #2
On 8/21/19 5:13 PM, Paul E Murphy wrote:
> On 8/20/19 4:19 PM, Paul A. Clarke wrote:

>> diff --git a/sysdeps/powerpc/fpu/fesetenv.c b/sysdeps/powerpc/fpu/fesetenv.c
>> index 009a4f0..5ca15c7 100644
>> --- a/sysdeps/powerpc/fpu/fesetenv.c
>> +++ b/sysdeps/powerpc/fpu/fesetenv.c
>> @@ -28,25 +26,23 @@ __fesetenv (const fenv_t *envp)

>>       /* If the old env has no enabled exceptions and the new env has any enabled
>>        exceptions, then unmask SIGFPE in the MSR FE0/FE1 bits.  This will put the
>>        hardware into "precise mode" and may cause the FPU to run slower on some
>>        hardware.  */
>> -  if ((old.l & _FPU_MASK_ALL) == 0 && (new.l & _FPU_MASK_ALL) != 0)
>> +  if ((old.l & FPSCR_ENABLES_MASK) == 0 && (new.l & FPSCR_ENABLES_MASK) != 0)
>>       (void) __fe_nomask_env_priv (); >
>>     /* If the old env had any enabled exceptions and the new env has no enabled
>>        exceptions, then mask SIGFPE in the MSR FE0/FE1 bits.  This may allow the
>>        FPU to run faster because it always takes the default action and can not
>>        generate SIGFPE. */
>> -  if ((old.l & _FPU_MASK_ALL) != 0 && (new.l & _FPU_MASK_ALL) == 0)
>> +  if ((old.l & FPSCR_ENABLES_MASK) != 0 && (new.l & FPSCR_ENABLES_MASK) == 0)
>>       (void)__fe_mask_env ();

> If you need to make another similar change, I might recommend consolidating the code to toggle the MSR bits.

Could you elaborate on what consolidation you are anticipating?

PC
  
Paul E Murphy Aug. 22, 2019, 6:54 p.m. UTC | #3
On 8/22/19 11:55 AM, Paul Clarke wrote:
> On 8/21/19 5:13 PM, Paul E Murphy wrote:
>> On 8/20/19 4:19 PM, Paul A. Clarke wrote:
> 
>>> diff --git a/sysdeps/powerpc/fpu/fesetenv.c b/sysdeps/powerpc/fpu/fesetenv.c
>>> index 009a4f0..5ca15c7 100644
>>> --- a/sysdeps/powerpc/fpu/fesetenv.c
>>> +++ b/sysdeps/powerpc/fpu/fesetenv.c
>>> @@ -28,25 +26,23 @@ __fesetenv (const fenv_t *envp)
> 
>>>        /* If the old env has no enabled exceptions and the new env has any enabled
>>>         exceptions, then unmask SIGFPE in the MSR FE0/FE1 bits.  This will put the
>>>         hardware into "precise mode" and may cause the FPU to run slower on some
>>>         hardware.  */
>>> -  if ((old.l & _FPU_MASK_ALL) == 0 && (new.l & _FPU_MASK_ALL) != 0)
>>> +  if ((old.l & FPSCR_ENABLES_MASK) == 0 && (new.l & FPSCR_ENABLES_MASK) != 0)
>>>        (void) __fe_nomask_env_priv (); >
>>>      /* If the old env had any enabled exceptions and the new env has no enabled
>>>         exceptions, then mask SIGFPE in the MSR FE0/FE1 bits.  This may allow the
>>>         FPU to run faster because it always takes the default action and can not
>>>         generate SIGFPE. */
>>> -  if ((old.l & _FPU_MASK_ALL) != 0 && (new.l & _FPU_MASK_ALL) == 0)
>>> +  if ((old.l & FPSCR_ENABLES_MASK) != 0 && (new.l & FPSCR_ENABLES_MASK) == 0)
>>>        (void)__fe_mask_env ();
> 
>> If you need to make another similar change, I might recommend consolidating the code to toggle the MSR bits.
> 
> Could you elaborate on what consolidation you are anticipating?

Code similar to the above exists in at least four places (as of 
f615e3fced) to toggle the MSR bits if the number of enabled exceptions 
changes from 0.

git grep "([a-z]*[.]l & _FPU_[A-Z_]*) .= . && ([a-z]*.l & _FPU_[A-Z_]*)"

Maybe those can be moved into a single macro/inline function in 
fenv_private.h if more fixups need to be made?

> 
> PC
>
  

Patch

diff --git a/sysdeps/powerpc/fpu/fesetenv.c b/sysdeps/powerpc/fpu/fesetenv.c
index 009a4f0..5ca15c7 100644
--- a/sysdeps/powerpc/fpu/fesetenv.c
+++ b/sysdeps/powerpc/fpu/fesetenv.c
@@ -19,8 +19,6 @@ 
 #include <fenv_libc.h>
 #include <fpu_control.h>
 
-#define _FPU_MASK_ALL (_FPU_MASK_ZM | _FPU_MASK_OM | _FPU_MASK_UM | _FPU_MASK_XM | _FPU_MASK_IM)
-
 int
 __fesetenv (const fenv_t *envp)
 {
@@ -28,25 +26,23 @@  __fesetenv (const fenv_t *envp)
 
   /* get the currently set exceptions.  */
   new.fenv = *envp;
-  old.fenv = fegetenv_register ();
-  if (old.l == new.l)
-    return 0;
+  old.fenv = fegetenv_status ();
 
   /* If the old env has no enabled exceptions and the new env has any enabled
      exceptions, then unmask SIGFPE in the MSR FE0/FE1 bits.  This will put the
      hardware into "precise mode" and may cause the FPU to run slower on some
      hardware.  */
-  if ((old.l & _FPU_MASK_ALL) == 0 && (new.l & _FPU_MASK_ALL) != 0)
+  if ((old.l & FPSCR_ENABLES_MASK) == 0 && (new.l & FPSCR_ENABLES_MASK) != 0)
     (void) __fe_nomask_env_priv ();
 
   /* If the old env had any enabled exceptions and the new env has no enabled
      exceptions, then mask SIGFPE in the MSR FE0/FE1 bits.  This may allow the
      FPU to run faster because it always takes the default action and can not
      generate SIGFPE. */
-  if ((old.l & _FPU_MASK_ALL) != 0 && (new.l & _FPU_MASK_ALL) == 0)
+  if ((old.l & FPSCR_ENABLES_MASK) != 0 && (new.l & FPSCR_ENABLES_MASK) == 0)
     (void)__fe_mask_env ();
 
-  fesetenv_register (*envp);
+  fesetenv_register (new.fenv);
 
   /* Success.  */
   return 0;