[2/2] ARM: Improve fenv implementation

Message ID 000801cf6ad1$64fd3a60$2ef7af20$@com
State Committed
Headers

Commit Message

Wilco Dijkstra May 8, 2014, 3:22 p.m. UTC
  Hi,

Same for this one.

Wilco

ChangeLog:
2014-05-08  Wilco  <wdijkstr@arm.com>

	* sysdeps/arm/fclrexcpt.c: Optimize to avoid unnecessary FPSCR writes.
	* sysdeps/arm/fedisblxcpt.c: Likewise.
	* sysdeps/arm/feenablxcpt.c: Likewise.
	* sysdeps/arm/fegetround.c: Call (get_rounding_mode).
	* sysdeps/arm/feholdexcpt.c: Call optimized (libc_feholdexcept_vfp).
	* sysdeps/arm/fesetenv.c: Special case FE_DFL_ENV and FE_NOMASK_ENV.
	Call optimized (libc_fesetenv_vfp).
	* sysdeps/arm/fesetround.c: Call optimized (libc_fesetround_vfp).
	* sysdeps/arm/feupdateenv.c: Special case FE_DFL_ENV and FE_NOMASK_ENV.
	Call optimized (libc_feupdateenv_vfp).
	* sysdeps/arm/fgetexcptflg.c: Call optimized (libc_fetestexcept_vfp).
	* sysdeps/arm/fsetexcptflg.c: Optimize to avoid unnecessary FPSCR
	writes.
	* sysdeps/arm/ftestexcept.c: Call optimized (libc_fetestexcept_vfp).
	* sysdeps/arm/setfpucw.c: Optimize to avoid unnecessary FPSCR writes.

-----Original Message-----
From: Wilco [mailto:wdijkstr@arm.com] 
Sent: 25 April 2014 11:50
To: 'libc-alpha@sourceware.org'
Subject: [PATCH 2/2] ARM: Improve fenv implementation

Hi,

The second patch improves the fenv implementation by using the inline functions from fenv_private
where possible rather than duplicating functionality. The remaining functions are optimized in a
similar way to avoid unnecessary FPSCR writes (but no longer use __glibc_unlikely like in the
previous version).

OK for commit?

Wilco
---
 sysdeps/arm/fclrexcpt.c    |   10 +++++-----
 sysdeps/arm/fedisblxcpt.c  |    4 +++-
 sysdeps/arm/feenablxcpt.c  |   10 +++++-----
 sysdeps/arm/fegetround.c   |   14 ++------------
 sysdeps/arm/feholdexcpt.c  |   16 ++--------------
 sysdeps/arm/fesetenv.c     |   42 +++++++++++++++++++++---------------------
 sysdeps/arm/fesetround.c   |   10 ++--------
 sysdeps/arm/feupdateenv.c  |   24 ++++++++++++++++--------
 sysdeps/arm/fgetexcptflg.c |    9 ++-------
 sysdeps/arm/fsetexcptflg.c |   13 ++++++++-----
 sysdeps/arm/ftestexcept.c  |   11 +++--------
 sysdeps/arm/setfpucw.c     |   11 ++++++-----
 12 files changed, 75 insertions(+), 99 deletions(-)
  

Comments

Joseph Myers May 18, 2014, 2:14 p.m. UTC | #1
This breaks the build when VFP isn't enabled at compile-time, because then 
the libc_* functions aren't defined.  You need to have the <fenv.h> 
functions conditionally call libc_* functions that always use VFP (and are 
always defined), while those VFP-using functions are only conditionally 
used within the rest of libm.  I.e., I think the __SOFTFP__ conditionals 
in fenv_private.h should only control the definitions of libc_* to use 
libc_*_vfp, not the definitions of the libc_*_vfp inlines.
  
Marcus Shawcroft May 19, 2014, 8:13 a.m. UTC | #2
On 18 May 2014 15:14, Joseph S. Myers <joseph@codesourcery.com> wrote:
> This breaks the build when VFP isn't enabled at compile-time, because then
> the libc_* functions aren't defined.  You need to have the <fenv.h>
> functions conditionally call libc_* functions that always use VFP (and are
> always defined), while those VFP-using functions are only conditionally
> used within the rest of libm.  I.e., I think the __SOFTFP__ conditionals
> in fenv_private.h should only control the definitions of libc_* to use
> libc_*_vfp, not the definitions of the libc_*_vfp inlines.
>

Wilco,  I've reverted this patch for you.  You will need to re-spin it
(and test it) to address the build issues highlighted by Joseph.

Cheers
/Marcus

> --
> Joseph S. Myers
> joseph@codesourcery.com
  
Wilco Dijkstra May 19, 2014, 5:05 p.m. UTC | #3
> Joseph Myers wrote: 
> This breaks the build when VFP isn't enabled at compile-time, because then
> the libc_* functions aren't defined.  You need to have the <fenv.h>
> functions conditionally call libc_* functions that always use VFP (and are
> always defined), while those VFP-using functions are only conditionally
> used within the rest of libm.  I.e., I think the __SOFTFP__ conditionals
> in fenv_private.h should only control the definitions of libc_* to use
> libc_*_vfp, not the definitions of the libc_*_vfp inlines.

You're right, it wouldn't build properly. I'll move the #ifndef __SOFTFP__
to go just before the defines and add the reverted patch to my outstanding 
patch set.

Any chance we can obsolete softfp only builds any time soon? I can't
remember when the last non-VFP ARM was made, but it must have been at
least a decade ago... Are those NetWinder boards still working/in use?

Wilco
  
Joseph Myers May 19, 2014, 5:26 p.m. UTC | #4
On Mon, 19 May 2014, Wilco wrote:

> > Joseph Myers wrote: 
> > This breaks the build when VFP isn't enabled at compile-time, because then
> > the libc_* functions aren't defined.  You need to have the <fenv.h>
> > functions conditionally call libc_* functions that always use VFP (and are
> > always defined), while those VFP-using functions are only conditionally
> > used within the rest of libm.  I.e., I think the __SOFTFP__ conditionals
> > in fenv_private.h should only control the definitions of libc_* to use
> > libc_*_vfp, not the definitions of the libc_*_vfp inlines.
> 
> You're right, it wouldn't build properly. I'll move the #ifndef __SOFTFP__
> to go just before the defines and add the reverted patch to my outstanding 
> patch set.
> 
> Any chance we can obsolete softfp only builds any time soon? I can't
> remember when the last non-VFP ARM was made, but it must have been at
> least a decade ago... Are those NetWinder boards still working/in use?

That would be very premature.  You still get people building glibc for v4 
(not v4T) although only v4T and above are really expected to work properly 
for EABI.  VFP is still optional in v7 (without the "only for 
implementations targeting specialised markets" caveat of v8); my 
impression was that there are Cortex-A9 processors without VFP, for 
example.
  
Richard Earnshaw May 20, 2014, 8:49 a.m. UTC | #5
On 19/05/14 18:26, Joseph S. Myers wrote:
> On Mon, 19 May 2014, Wilco wrote:
> 
>>> Joseph Myers wrote: 
>>> This breaks the build when VFP isn't enabled at compile-time, because then
>>> the libc_* functions aren't defined.  You need to have the <fenv.h>
>>> functions conditionally call libc_* functions that always use VFP (and are
>>> always defined), while those VFP-using functions are only conditionally
>>> used within the rest of libm.  I.e., I think the __SOFTFP__ conditionals
>>> in fenv_private.h should only control the definitions of libc_* to use
>>> libc_*_vfp, not the definitions of the libc_*_vfp inlines.
>>
>> You're right, it wouldn't build properly. I'll move the #ifndef __SOFTFP__
>> to go just before the defines and add the reverted patch to my outstanding 
>> patch set.
>>
>> Any chance we can obsolete softfp only builds any time soon? I can't
>> remember when the last non-VFP ARM was made, but it must have been at
>> least a decade ago... Are those NetWinder boards still working/in use?
> 
> That would be very premature.  You still get people building glibc for v4 
> (not v4T) although only v4T and above are really expected to work properly 
> for EABI.  VFP is still optional in v7 (without the "only for 
> implementations targeting specialised markets" caveat of v8); my 
> impression was that there are Cortex-A9 processors without VFP, for 
> example.
> 

I'm aware of Cortex-A9s without Neon, and thus implementing VFPv3-D16,
but I don't recall ever seeing one without the FP unit.

[Someone will no doubt show me to be wrong though :-) ]

R.
  

Patch

diff --git a/sysdeps/arm/fclrexcpt.c b/sysdeps/arm/fclrexcpt.c
index 098a9bb..824145e 100644
--- a/sysdeps/arm/fclrexcpt.c
+++ b/sysdeps/arm/fclrexcpt.c
@@ -24,7 +24,7 @@ 
 int
 __feclearexcept (int excepts)
 {
-  fpu_control_t fpscr;
+  fpu_control_t fpscr, new_fpscr;
 
   /* Fail if a VFP unit isn't present unless nothing needs to be done.  */
   if (!ARM_HAVE_VFP)
@@ -32,11 +32,11 @@  __feclearexcept (int excepts)
 
   _FPU_GETCW (fpscr);
   excepts &= FE_ALL_EXCEPT;
+  new_fpscr = fpscr & ~excepts;
 
-  /* Clear the relevant bits.  */
-  fpscr = (fpscr & ~FE_ALL_EXCEPT) | (fpscr & FE_ALL_EXCEPT & ~excepts);
-
-  _FPU_SETCW (fpscr);
+  /* Write new exception flags if changed.  */
+  if (new_fpscr != fpscr)
+    _FPU_SETCW (new_fpscr);
 
   return 0;
 }
diff --git a/sysdeps/arm/fedisblxcpt.c b/sysdeps/arm/fedisblxcpt.c
index f2956cd..d5e0f00 100644
--- a/sysdeps/arm/fedisblxcpt.c
+++ b/sysdeps/arm/fedisblxcpt.c
@@ -35,7 +35,9 @@  fedisableexcept (int excepts)
   excepts &= FE_ALL_EXCEPT;
   new_fpscr = fpscr & ~(excepts << FE_EXCEPT_SHIFT);
 
-  _FPU_SETCW (new_fpscr);
+  /* Write new exceptions if changed.  */
+  if (new_fpscr != fpscr)
+    _FPU_SETCW (new_fpscr);
 
   return (fpscr >> FE_EXCEPT_SHIFT) & FE_ALL_EXCEPT;
 }
diff --git a/sysdeps/arm/feenablxcpt.c b/sysdeps/arm/feenablxcpt.c
index afd8943..e649b2f 100644
--- a/sysdeps/arm/feenablxcpt.c
+++ b/sysdeps/arm/feenablxcpt.c
@@ -35,15 +35,15 @@  feenableexcept (int excepts)
   excepts &= FE_ALL_EXCEPT;
   new_fpscr = fpscr | (excepts << FE_EXCEPT_SHIFT);
 
-  _FPU_SETCW (new_fpscr);
-
-  if (excepts != 0)
+  if (new_fpscr != fpscr)
     {
+      _FPU_SETCW (new_fpscr);
+
       /* Not all VFP architectures support trapping exceptions, so
 	 test whether the relevant bits were set and fail if not.  */
       _FPU_GETCW (new_fpscr);
-      if ((new_fpscr & (excepts << FE_EXCEPT_SHIFT))
-	  != (excepts << FE_EXCEPT_SHIFT))
+
+      if (((new_fpscr >> FE_EXCEPT_SHIFT) & excepts) != excepts)
 	return -1;
     }
 
diff --git a/sysdeps/arm/fegetround.c b/sysdeps/arm/fegetround.c
index 1c9c151..fbad0b3 100644
--- a/sysdeps/arm/fegetround.c
+++ b/sysdeps/arm/fegetround.c
@@ -16,22 +16,12 @@ 
    License along with the GNU C Library.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <fenv.h>
-#include <fpu_control.h>
-#include <arm-features.h>
+#include <get-rounding-mode.h>
 
 
 int
 fegetround (void)
 {
-  fpu_control_t fpscr;
-
-  /* FE_TONEAREST is the only supported rounding mode
-     if a VFP unit isn't present.  */
-  if (!ARM_HAVE_VFP)
-    return FE_TONEAREST;
-
-  _FPU_GETCW (fpscr);
-  return fpscr & FE_TOWARDZERO;
+  return get_rounding_mode ();
 }
 libm_hidden_def (fegetround)
diff --git a/sysdeps/arm/feholdexcpt.c b/sysdeps/arm/feholdexcpt.c
index 258ba66..2d79e0c 100644
--- a/sysdeps/arm/feholdexcpt.c
+++ b/sysdeps/arm/feholdexcpt.c
@@ -16,30 +16,18 @@ 
    License along with the GNU C Library.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <fenv.h>
-#include <fpu_control.h>
+#include <fenv_private.h>
 #include <arm-features.h>
 
 
 int
 feholdexcept (fenv_t *envp)
 {
-  fpu_control_t fpscr;
-
   /* Fail if a VFP unit isn't present.  */
   if (!ARM_HAVE_VFP)
     return 1;
 
-  _FPU_GETCW (fpscr);
-  envp->__cw = fpscr;
-
-  /* Now set all exceptions to non-stop.  */
-  fpscr &= ~(FE_ALL_EXCEPT << FE_EXCEPT_SHIFT);
-
-  /* And clear all exception flags.  */
-  fpscr &= ~FE_ALL_EXCEPT;
-
-  _FPU_SETCW (fpscr);
+  libc_feholdexcept_vfp (envp);
   return 0;
 }
 
diff --git a/sysdeps/arm/fesetenv.c b/sysdeps/arm/fesetenv.c
index 43b9b47..8677227 100644
--- a/sysdeps/arm/fesetenv.c
+++ b/sysdeps/arm/fesetenv.c
@@ -16,43 +16,43 @@ 
    License along with the GNU C Library.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <fenv.h>
-#include <fpu_control.h>
+#include <fenv_private.h>
 #include <arm-features.h>
 
 
 int
 __fesetenv (const fenv_t *envp)
 {
-  fpu_control_t fpscr;
-
   /* Fail if a VFP unit isn't present.  */
   if (!ARM_HAVE_VFP)
     return 1;
 
-  _FPU_GETCW (fpscr);
+  if ((envp == FE_DFL_ENV) || (envp == FE_NOMASK_ENV))
+    {
+      fpu_control_t fpscr, new_fpscr;
+
+      _FPU_GETCW (fpscr);
 
-  /* Preserve the reserved FPSCR flags.  */
-  fpscr &= _FPU_RESERVED;
+      /* Preserve the reserved FPSCR flags.  */
+      new_fpscr = fpscr & _FPU_RESERVED;
 
-  if (envp == FE_DFL_ENV)
-    fpscr |= _FPU_DEFAULT;
-  else if (envp == FE_NOMASK_ENV)
-    fpscr |= _FPU_IEEE;
-  else
-    fpscr |= envp->__cw & ~_FPU_RESERVED;
+      if (envp == FE_DFL_ENV)
+	_FPU_SETCW (new_fpscr | _FPU_DEFAULT);
+      else
+	{
+	  _FPU_SETCW (new_fpscr | _FPU_IEEE);
+	  /* Not all VFP architectures support trapping exceptions, so
+	     test whether the relevant bits were set and fail if not.  */
+	  _FPU_GETCW (fpscr);
 
-  _FPU_SETCW (fpscr);
+	  if ((fpscr & _FPU_IEEE) != _FPU_IEEE)
+	    return 1;
+	}
 
-  if (envp == FE_NOMASK_ENV)
-    {
-      /* Not all VFP architectures support trapping exceptions, so
-	 test whether the relevant bits were set and fail if not.  */
-      _FPU_GETCW (fpscr);
-      if ((fpscr & _FPU_IEEE) != _FPU_IEEE)
-	return 1;
+      return 0;
     }
 
+  libc_fesetenv_vfp (envp);
   return 0;
 }
 
diff --git a/sysdeps/arm/fesetround.c b/sysdeps/arm/fesetround.c
index d1b92dc..f52c50a 100644
--- a/sysdeps/arm/fesetround.c
+++ b/sysdeps/arm/fesetround.c
@@ -16,28 +16,22 @@ 
    License along with the GNU C Library.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <fenv.h>
-#include <fpu_control.h>
+#include <fenv_private.h>
 #include <arm-features.h>
 
 
 int
 fesetround (int round)
 {
-  fpu_control_t fpscr;
-
   /* FE_TONEAREST is the only supported rounding mode
      if a VFP unit isn't present.  */
   if (!ARM_HAVE_VFP)
     return (round == FE_TONEAREST) ? 0 : 1;
 
-  /* Fail if the rounding mode is not valid.  */
   if (round & ~FE_TOWARDZERO)
     return 1;
 
-  _FPU_GETCW (fpscr);
-  fpscr = (fpscr & ~FE_TOWARDZERO) | round;
-  _FPU_SETCW (fpscr);
+  libc_fesetround_vfp (round);
   return 0;
 }
 
diff --git a/sysdeps/arm/feupdateenv.c b/sysdeps/arm/feupdateenv.c
index 7cf6206..4b2cf0f 100644
--- a/sysdeps/arm/feupdateenv.c
+++ b/sysdeps/arm/feupdateenv.c
@@ -17,27 +17,35 @@ 
    License along with the GNU C Library.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <fenv.h>
-#include <fpu_control.h>
+#include <fenv_private.h>
 #include <arm-features.h>
 
 
 int
 __feupdateenv (const fenv_t *envp)
 {
-  fpu_control_t fpscr;
+  fenv_t fenv;
 
   /* Fail if a VFP unit isn't present.  */
   if (!ARM_HAVE_VFP)
     return 1;
 
-  _FPU_GETCW (fpscr);
+  if ((envp == FE_DFL_ENV) || (envp == FE_NOMASK_ENV))
+    {
+      fpu_control_t fpscr;
 
-  /* Install new environment.  */
-  fesetenv (envp);
+      _FPU_GETCW (fpscr);
 
-  /* Raise the saved exceptions.  */
-  feraiseexcept (fpscr & FE_ALL_EXCEPT);
+      /* Preserve the reserved FPSCR flags.  */
+      fpscr &= _FPU_RESERVED;
+      fpscr |= (envp == FE_DFL_ENV) ? _FPU_DEFAULT : _FPU_IEEE;
+
+      /* Create a valid fenv to pass to libc_feupdateenv_vfp.  */
+      fenv.__cw = fpscr;
+      envp = &fenv;
+    }
+
+  libc_feupdateenv_vfp (envp);
   return 0;
 }
 
diff --git a/sysdeps/arm/fgetexcptflg.c b/sysdeps/arm/fgetexcptflg.c
index e1eb420..19c8b6d 100644
--- a/sysdeps/arm/fgetexcptflg.c
+++ b/sysdeps/arm/fgetexcptflg.c
@@ -17,23 +17,18 @@ 
    License along with the GNU C Library.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <fenv.h>
-#include <fpu_control.h>
+#include <fenv_private.h>
 #include <arm-features.h>
 
 
 int
 __fegetexceptflag (fexcept_t *flagp, int excepts)
 {
-  fpu_control_t fpscr;
-
   /* Fail if a VFP unit isn't present.  */
   if (!ARM_HAVE_VFP)
     return 1;
 
-  _FPU_GETCW (fpscr);
-
-  *flagp = fpscr & excepts & FE_ALL_EXCEPT;
+  *flagp = libc_fetestexcept_vfp (excepts);
   return 0;
 }
 
diff --git a/sysdeps/arm/fsetexcptflg.c b/sysdeps/arm/fsetexcptflg.c
index a594e15..c1a6c1a 100644
--- a/sysdeps/arm/fsetexcptflg.c
+++ b/sysdeps/arm/fsetexcptflg.c
@@ -25,20 +25,23 @@ 
 int
 __fesetexceptflag (const fexcept_t *flagp, int excepts)
 {
-  fpu_control_t fpscr;
+  fpu_control_t fpscr, new_fpscr;
 
   /* Fail if a VFP unit isn't present unless nothing needs to be done.  */
   if (!ARM_HAVE_VFP)
     return (excepts != 0);
 
   _FPU_GETCW (fpscr);
+  excepts &= FE_ALL_EXCEPT;
 
   /* Set the desired exception mask.  */
-  fpscr &= ~(excepts & FE_ALL_EXCEPT);
-  fpscr |= (*flagp & excepts & FE_ALL_EXCEPT);
+  new_fpscr = fpscr & ~excepts;
+  new_fpscr |= *flagp & excepts;
+
+  /* Write new exception flags if changed.  */
+  if (new_fpscr != fpscr)
+    _FPU_SETCW (new_fpscr);
 
-  /* Save state back to the FPU.  */
-  _FPU_SETCW (fpscr);
   return 0;
 }
 
diff --git a/sysdeps/arm/ftestexcept.c b/sysdeps/arm/ftestexcept.c
index de082b2..6c5d3a8 100644
--- a/sysdeps/arm/ftestexcept.c
+++ b/sysdeps/arm/ftestexcept.c
@@ -16,23 +16,18 @@ 
    License along with the GNU C Library.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <fenv.h>
-#include <fpu_control.h>
+#include <fenv_private.h>
 #include <arm-features.h>
 
 
 int
 fetestexcept (int excepts)
 {
-  fpu_control_t fpscr;
-
   /* Return no exception flags if a VFP unit isn't present.  */
   if (!ARM_HAVE_VFP)
     return 0;
 
-  /* Get current exceptions.  */
-  _FPU_GETCW (fpscr);
-
-  return fpscr & excepts & FE_ALL_EXCEPT;
+  return libc_fetestexcept_vfp (excepts);
 }
+
 libm_hidden_def (fetestexcept)
diff --git a/sysdeps/arm/setfpucw.c b/sysdeps/arm/setfpucw.c
index 7416377..259b020 100644
--- a/sysdeps/arm/setfpucw.c
+++ b/sysdeps/arm/setfpucw.c
@@ -24,19 +24,20 @@ 
 void
 __setfpucw (fpu_control_t set)
 {
-  fpu_control_t fpscr;
+  fpu_control_t fpscr, new_fpscr;
 
   /* Do nothing if a VFP unit isn't present.  */
   if (!ARM_HAVE_VFP)
     return;
 
-  /* Fetch the current control word.  */
   _FPU_GETCW (fpscr);
 
   /* Preserve the reserved bits, and set the rest as the user
      specified (or the default, if the user gave zero).  */
-  fpscr &= _FPU_RESERVED;
-  fpscr |= set & ~_FPU_RESERVED;
+  new_fpscr = fpscr & _FPU_RESERVED;
+  new_fpscr |= set & ~_FPU_RESERVED;
 
-  _FPU_SETCW (fpscr);
+  /* Write FPSCR if changed.  */
+  if (new_fpscr != fpscr)
+    _FPU_SETCW (fpscr);
 }