[alpha] : Fix sysdeps/alpha/remqu.S clobbering $f3 reg
Commit Message
Hello!
Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
$y_is_neg path. There was missing restore of $f3 before the return
from the function.
The patch also reorders insns a bit, so it becomes similar as much as
possible to divqu.S.
Without the patch, math/big testcase from Go-1.11 testsuite (that
includes lots of corner cases that exercise remqu) FAIL, with patched
function, the testcase PASSes without problems.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Uros.
Comments
On Fri, Jan 18, 2019 at 2:06 PM Uros Bizjak <ubizjak@gmail.com>
wrote:Please also find the missing ChangeLog entry.
Please find missing ChangeLog entry in this message. Also, please note
I don't have write access to glibc repository.
Uros.
> Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> $y_is_neg path. There was missing restore of $f3 before the return
> from the function.
>
> The patch also reorders insns a bit, so it becomes similar as much as
> possible to divqu.S.
>
> Without the patch, math/big testcase from Go-1.11 testsuite (that
> includes lots of corner cases that exercise remqu) FAIL, with patched
> function, the testcase PASSes without problems.
2019-18-01 Uroš Bizjak <ubizjak@gmail.com>
* sysdeps/alpha/remqu.S (__remqu): Add missing restore
of $f3 register on $y_is_neg path. Reorder instructions
similar to divqu.S order.
> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
>
> Uros.
On 1/18/19 5:06 AM, Uros Bizjak wrote:
> Hello!
>
> Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> $y_is_neg path. There was missing restore of $f3 before the return
> from the function.
>
> The patch also reorders insns a bit, so it becomes similar as much as
> possible to divqu.S.
>
> Without the patch, math/big testcase from Go-1.11 testsuite (that
> includes lots of corner cases that exercise remqu) FAIL, with patched
> function, the testcase PASSes without problems.
> +++ b/sysdeps/alpha/remqu.S
> @@ -59,20 +59,19 @@ __remqu:
> subq Y, 1, AT
> stt $f0, 0(sp)
> and Y, AT, AT
> + excb
> + beq AT, $powerof2
>
> stt $f1, 8(sp)
> - excb
Why are you moving the excb above the powerof2 branch?
The path at powerof2 does not touch fpcr or issue fp insns.
> @@ -94,12 +93,12 @@ __remqu:
> mulq AT, Y, AT
> ldt $f0, 0(sp)
> ldt $f3, 48(sp)
> - lda sp, FRAME(sp)
> cfi_remember_state
> cfi_restore ($f0)
> cfi_restore ($f1)
> cfi_restore ($f3)
> cfi_def_cfa_offset (0)
> + lda sp, FRAME(sp)
This change is actively wrong wrt the unwind info.
> @@ -246,12 +247,16 @@ $y_is_neg:
> quotient must be either 0 or 1, so the remainder must be X
> or X-Y, so just compute it directly. */
> cmpule Y, X, AT
> + excb
> + mt_fpcr $f3
> subq X, Y, RV
> ldt $f0, 0(sp)
> + ldt $f3, 48(sp)
> cmoveq AT, X, RV
>
> lda sp, FRAME(sp)
> cfi_restore ($f0)
> + cfi_restore ($f3)
> cfi_def_cfa_offset (0)
> ret $31, (RA), 1
This appears to be the only change required to fix the bug.
Can you walk me through why the other changes?
r~
On Thu, Jan 24, 2019 at 9:23 AM Richard Henderson <rth@twiddle.net> wrote:
>
> On 1/18/19 5:06 AM, Uros Bizjak wrote:
> > Hello!
> >
> > Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> > $y_is_neg path. There was missing restore of $f3 before the return
> > from the function.
> >
> > The patch also reorders insns a bit, so it becomes similar as much as
> > possible to divqu.S.
> >
> > Without the patch, math/big testcase from Go-1.11 testsuite (that
> > includes lots of corner cases that exercise remqu) FAIL, with patched
> > function, the testcase PASSes without problems.
>
>
> > +++ b/sysdeps/alpha/remqu.S
> > @@ -59,20 +59,19 @@ __remqu:
> > subq Y, 1, AT
> > stt $f0, 0(sp)
> > and Y, AT, AT
> > + excb
> > + beq AT, $powerof2
> >
> > stt $f1, 8(sp)
> > - excb
>
> Why are you moving the excb above the powerof2 branch?
> The path at powerof2 does not touch fpcr or issue fp insns.
This was meant to unify the flow with the __divqu assembly, which does
the above before calling DIVBYZERO. The idea was that __divqu is used
much more than __remqu, so the later should do the same as the former.
> > @@ -94,12 +93,12 @@ __remqu:
> > mulq AT, Y, AT
> > ldt $f0, 0(sp)
> > ldt $f3, 48(sp)
> > - lda sp, FRAME(sp)
> > cfi_remember_state
> > cfi_restore ($f0)
> > cfi_restore ($f1)
> > cfi_restore ($f3)
> > cfi_def_cfa_offset (0)
> > + lda sp, FRAME(sp)
>
> This change is actively wrong wrt the unwind info.
Again, this will match __divqu assembly. It looks that __divqu needs
to be fixed then.
> > @@ -246,12 +247,16 @@ $y_is_neg:
> > quotient must be either 0 or 1, so the remainder must be X
> > or X-Y, so just compute it directly. */
> > cmpule Y, X, AT
> > + excb
> > + mt_fpcr $f3
> > subq X, Y, RV
> > ldt $f0, 0(sp)
> > + ldt $f3, 48(sp)
> > cmoveq AT, X, RV
> >
> > lda sp, FRAME(sp)
> > cfi_restore ($f0)
> > + cfi_restore ($f3)
> > cfi_def_cfa_offset (0)
> > ret $31, (RA), 1
>
> This appears to be the only change required to fix the bug.
That is true. This part is the problematic part and clobbers $f3.
Should I resend the patch only with this part fixed?
> Can you walk me through why the other changes?
As said above, I was trying to make __remqu like __divqu, but it looks
that __divqu should be fixed in some places.
Thanks,
Uros.
On 1/24/19 12:35 AM, Uros Bizjak wrote:
> That is true. This part is the problematic part and clobbers $f3.
> Should I resend the patch only with this part fixed?
Yes please. We're in the final stages of release
and any changes should be most minimal.
We can fix the div/rem matchup and unwind errors
for the next release.
r~
@@ -59,20 +59,19 @@ __remqu:
subq Y, 1, AT
stt $f0, 0(sp)
and Y, AT, AT
+ excb
+ beq AT, $powerof2
stt $f1, 8(sp)
- excb
stt $f3, 48(sp)
- beq AT, $powerof2
cfi_rel_offset ($f0, 0)
cfi_rel_offset ($f1, 8)
cfi_rel_offset ($f3, 48)
+ mf_fpcr $f3
_ITOFT2 X, $f0, 16, Y, $f1, 24
- mf_fpcr $f3
cvtqt $f0, $f0
cvtqt $f1, $f1
-
blt X, $x_is_neg
divt/c $f0, $f1, $f0
@@ -94,12 +93,12 @@ __remqu:
mulq AT, Y, AT
ldt $f0, 0(sp)
ldt $f3, 48(sp)
- lda sp, FRAME(sp)
cfi_remember_state
cfi_restore ($f0)
cfi_restore ($f1)
cfi_restore ($f3)
cfi_def_cfa_offset (0)
+ lda sp, FRAME(sp)
.align 4
subq X, AT, RV
@@ -116,11 +115,13 @@ $x_is_neg:
cfi_rel_offset ($f2, 24)
_ITOFS AT, $f2, 16
+ .align 4
addt $f0, $f2, $f0
+ unop
divt/c $f0, $f1, $f0
+ unop
/* Ok, we've now the divide issued. Continue with other checks. */
- .align 4
ldt $f1, 8(sp)
unop
ldt $f2, 24(sp)
@@ -246,12 +247,16 @@ $y_is_neg:
quotient must be either 0 or 1, so the remainder must be X
or X-Y, so just compute it directly. */
cmpule Y, X, AT
+ excb
+ mt_fpcr $f3
subq X, Y, RV
ldt $f0, 0(sp)
+ ldt $f3, 48(sp)
cmoveq AT, X, RV
lda sp, FRAME(sp)
cfi_restore ($f0)
+ cfi_restore ($f3)
cfi_def_cfa_offset (0)
ret $31, (RA), 1