On 2016-08-02 21:01, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Mon, 01 Aug 2016 18:10:33 -0700 (PDT)
>
> > From: Aurelien Jarno <aurelien@aurel32.net>
> > Date: Tue, 2 Aug 2016 01:59:24 +0200
> >
> >> On 2016-08-01 14:57, David Miller wrote:
> >>> >> Aurelien, it looks like we have the same exact issue with nearbyint on
> >>> >> sparc, right?
> >>> >
> >>> > I don't see the issue on nearbyint here. What is the issue exactly?
> >>>
> >>> Maybe only the vis3 variant shows the problem:
> >>>
> >>> Failure: nearbyint (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint (-sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_downward (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_downward (-sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_towardzero (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_towardzero (-sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_upward (sNaN): Exception "Invalid operation" not set
> >>> Failure: nearbyint_upward (-sNaN): Exception "Invalid operation" not set
> ...
> >> Now about the fix itself, we have to move the check before the fsr is
> >> saved and after the value has been moved to the floating point register,
> >> which is not something easy to do without breaking the whole code. One
> >> option would be to do it after loading the fsr at the end, the other one
> >> would be to use the generic version.
> >
> > I'll look into this.
>
> Aurelien I just tested the following and committed it to master:
Thanks!
I have finally been able to look at the fdim issue. I have tried to
compile the generic s_fdim.c without the missing test to generate
ERANGE. It appears that GCC generate very similar code on sparc32. On
sparc64 it does the same provided the file is compiled with -mvis. It
appears we compile sparc64 (and sysdeps/sparc/sparc32/sparcv9) assuming
a CPU with VIS instructions but we don't ask GCC to use them.
I therefore propose the following set of patches:
- Remove the current fdim/fdimf assembler implementation. This way this
patch can be easily backported.
- Pass -mvis when targetting sparc64
- Pass -mvis for files in sysdeps/sparc/sparc32/sparcv9. This will also
improve some more functions.
- Compile the generic fdim/fdimf functions with -mvis and -mvis3 in the
multiarch directory on sparc32.
I have already started working on that, I hope to get them by tomorrow.
In general we should probably review the existing optimized assembly
code. It looks like recent versions of GCC are good at generating
optimized code, provided they are fed with the correct -mvis or -mvis3
options.
Aurelien
====================
[PATCH] Fix sNaN handling in nearbyint on 32-bit sparc.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
(__nearbyint_vis3): Don't check for sNaN before float register is
loaded with the incoming argument.
* sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
(__nearbyintf_vis3): Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint):
Likewise.
* sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf):
Likewise.
---
ChangeLog | 10 ++++++++++
sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S | 6 +++---
.../sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S | 2 +-
sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S | 8 ++++----
sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S | 4 ++--
5 files changed, 20 insertions(+), 10 deletions(-)
@@ -1,5 +1,15 @@
2016-08-02 David S. Miller <davem@davemloft.net>
+ * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyint-vis3.S
+ (__nearbyint_vis3): Don't check for sNaN before float register is
+ loaded with the incoming argument.
+ * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_nearbyintf-vis3.S
+ (__nearbyintf_vis3): Likewise.
+ * sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyint.S (__nearbyint):
+ Likewise.
+ * sysdeps/sparc/sparc32/sparcv9/fpu/s_nearbyintf.S (__nearbyintf):
+ Likewise.
+
* string/test-strncmp.c (do_test_limit): Make sure the test data
stream is aligned as required for the type "CHAR".
(do_test): Likewise.
@@ -36,15 +36,15 @@
#define SIGN_BIT %f12 /* -0.0 */
ENTRY (__nearbyint_vis3)
+ sllx %o0, 32, %o0
+ or %o0, %o1, %o0
+ movxtod %o0, %f0
fcmpd %fcc3, %f0, %f0 /* Check for sNaN */
st %fsr, [%sp + 88]
sethi %hi(TWO_FIFTYTWO), %o2
sethi %hi(0xf8003e0), %o5
ld [%sp + 88], %o4
- sllx %o0, 32, %o0
or %o5, %lo(0xf8003e0), %o5
- or %o0, %o1, %o0
- movxtod %o0, %f0
andn %o4, %o5, %o4
fzero ZERO
st %o4, [%sp + 80]
@@ -35,9 +35,9 @@
#define SIGN_BIT %f12 /* -0.0 */
ENTRY (__nearbyintf_vis3)
+ movwtos %o0, %f1
fcmps %fcc3, %f1, %f1 /* Check for sNaN */
st %fsr, [%sp + 88]
- movwtos %o0, %f1
sethi %hi(TWO_TWENTYTHREE), %o2
sethi %hi(0xf8003e0), %o5
ld [%sp + 88], %o4
@@ -36,21 +36,21 @@
#define SIGN_BIT %f12 /* -0.0 */
ENTRY (__nearbyint)
+ sllx %o0, 32, %o0
+ or %o0, %o1, %o0
+ stx %o0, [%sp + 72]
+ ldd [%sp + 72], %f0
fcmpd %fcc3, %f0, %f0 /* Check for sNaN */
st %fsr, [%sp + 88]
sethi %hi(TWO_FIFTYTWO), %o2
sethi %hi(0xf8003e0), %o5
ld [%sp + 88], %o4
- sllx %o0, 32, %o0
or %o5, %lo(0xf8003e0), %o5
- or %o0, %o1, %o0
andn %o4, %o5, %o4
fzero ZERO
st %o4, [%sp + 80]
- stx %o0, [%sp + 72]
sllx %o2, 32, %o2
fnegd ZERO, SIGN_BIT
- ldd [%sp + 72], %f0
ld [%sp + 80], %fsr
stx %o2, [%sp + 72]
fabsd %f0, %f14
@@ -35,9 +35,10 @@
#define SIGN_BIT %f12 /* -0.0 */
ENTRY (__nearbyintf)
+ st %o0, [%sp + 68]
+ ld [%sp + 68], %f1
fcmps %fcc3, %f1, %f1 /* Check for sNaN */
st %fsr, [%sp + 88]
- st %o0, [%sp + 68]
sethi %hi(TWO_TWENTYTHREE), %o2
sethi %hi(0xf8003e0), %o5
ld [%sp + 88], %o4
@@ -46,7 +47,6 @@ ENTRY (__nearbyintf)
fnegs ZERO, SIGN_BIT
andn %o4, %o5, %o4
st %o4, [%sp + 80]
- ld [%sp + 68], %f1
ld [%sp + 80], %fsr
st %o2, [%sp + 68]
fabss %f1, %f14