[alpha] : Fix sysdeps/alpha/remqu.S clobbering $f3 reg

Message ID CAFULd4ZzARNW3Z-XEVbpG9KS6-PY0nzzfzFs3iOHJ9fADsWU0w@mail.gmail.com
State Superseded
Headers

Commit Message

Uros Bizjak Jan. 18, 2019, 1:06 p.m. UTC
  Hello!

Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
$y_is_neg path. There was missing restore of $f3 before the return
from the function.

The patch also reorders insns a bit, so it becomes similar as much as
possible to divqu.S.

Without the patch, math/big testcase from Go-1.11 testsuite (that
includes lots of corner cases that exercise remqu) FAIL, with patched
function, the testcase PASSes without problems.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>

Uros.
  

Comments

Uros Bizjak Jan. 20, 2019, 6:21 p.m. UTC | #1
On Fri, Jan 18, 2019 at 2:06 PM Uros Bizjak <ubizjak@gmail.com>
wrote:Please also find the missing ChangeLog entry.

Please find missing ChangeLog entry in this message. Also, please note
I don't have write access to glibc repository.

Uros.

> Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> $y_is_neg path. There was missing restore of $f3 before the return
> from the function.
>
> The patch also reorders insns a bit, so it becomes similar as much as
> possible to divqu.S.
>
> Without the patch, math/big testcase from Go-1.11 testsuite (that
> includes lots of corner cases that exercise remqu) FAIL, with patched
> function, the testcase PASSes without problems.

2019-18-01  Uroš Bizjak  <ubizjak@gmail.com>

    * sysdeps/alpha/remqu.S (__remqu): Add missing restore
    of $f3 register on $y_is_neg path.  Reorder instructions
    similar to divqu.S order.

> Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
>
> Uros.
  
Richard Henderson Jan. 24, 2019, 8:23 a.m. UTC | #2
On 1/18/19 5:06 AM, Uros Bizjak wrote:
> Hello!
> 
> Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> $y_is_neg path. There was missing restore of $f3 before the return
> from the function.
> 
> The patch also reorders insns a bit, so it becomes similar as much as
> possible to divqu.S.
> 
> Without the patch, math/big testcase from Go-1.11 testsuite (that
> includes lots of corner cases that exercise remqu) FAIL, with patched
> function, the testcase PASSes without problems.


> +++ b/sysdeps/alpha/remqu.S
> @@ -59,20 +59,19 @@ __remqu:
>  	subq	Y, 1, AT
>  	stt	$f0, 0(sp)
>  	and	Y, AT, AT
> +	excb
> +	beq	AT, $powerof2
>  
>  	stt	$f1, 8(sp)
> -	excb

Why are you moving the excb above the powerof2 branch?
The path at powerof2 does not touch fpcr or issue fp insns.

> @@ -94,12 +93,12 @@ __remqu:
>  	mulq	AT, Y, AT
>  	ldt	$f0, 0(sp)
>  	ldt	$f3, 48(sp)
> -	lda	sp, FRAME(sp)
>  	cfi_remember_state
>  	cfi_restore ($f0)
>  	cfi_restore ($f1)
>  	cfi_restore ($f3)
>  	cfi_def_cfa_offset (0)
> +	lda	sp, FRAME(sp)

This change is actively wrong wrt the unwind info.

> @@ -246,12 +247,16 @@ $y_is_neg:
>  	   quotient must be either 0 or 1, so the remainder must be X
>  	   or X-Y, so just compute it directly.  */
>  	cmpule	Y, X, AT
> +	excb
> +	mt_fpcr	$f3
>  	subq	X, Y, RV
>  	ldt	$f0, 0(sp)
> +	ldt	$f3, 48(sp)
>  	cmoveq	AT, X, RV
>  
>  	lda	sp, FRAME(sp)
>  	cfi_restore ($f0)
> +	cfi_restore ($f3)
>  	cfi_def_cfa_offset (0)
>  	ret	$31, (RA), 1

This appears to be the only change required to fix the bug.

Can you walk me through why the other changes?


r~
  
Uros Bizjak Jan. 24, 2019, 8:35 a.m. UTC | #3
On Thu, Jan 24, 2019 at 9:23 AM Richard Henderson <rth@twiddle.net> wrote:
>
> On 1/18/19 5:06 AM, Uros Bizjak wrote:
> > Hello!
> >
> > Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via
> > $y_is_neg path. There was missing restore of $f3 before the return
> > from the function.
> >
> > The patch also reorders insns a bit, so it becomes similar as much as
> > possible to divqu.S.
> >
> > Without the patch, math/big testcase from Go-1.11 testsuite (that
> > includes lots of corner cases that exercise remqu) FAIL, with patched
> > function, the testcase PASSes without problems.
>
>
> > +++ b/sysdeps/alpha/remqu.S
> > @@ -59,20 +59,19 @@ __remqu:
> >       subq    Y, 1, AT
> >       stt     $f0, 0(sp)
> >       and     Y, AT, AT
> > +     excb
> > +     beq     AT, $powerof2
> >
> >       stt     $f1, 8(sp)
> > -     excb
>
> Why are you moving the excb above the powerof2 branch?
> The path at powerof2 does not touch fpcr or issue fp insns.

This was meant to unify the flow with the __divqu assembly, which does
the above before calling DIVBYZERO. The idea was that __divqu is used
much more than __remqu, so the later should do the same as the former.

> > @@ -94,12 +93,12 @@ __remqu:
> >       mulq    AT, Y, AT
> >       ldt     $f0, 0(sp)
> >       ldt     $f3, 48(sp)
> > -     lda     sp, FRAME(sp)
> >       cfi_remember_state
> >       cfi_restore ($f0)
> >       cfi_restore ($f1)
> >       cfi_restore ($f3)
> >       cfi_def_cfa_offset (0)
> > +     lda     sp, FRAME(sp)
>
> This change is actively wrong wrt the unwind info.

Again, this will match __divqu assembly. It looks that __divqu needs
to be fixed then.

> > @@ -246,12 +247,16 @@ $y_is_neg:
> >          quotient must be either 0 or 1, so the remainder must be X
> >          or X-Y, so just compute it directly.  */
> >       cmpule  Y, X, AT
> > +     excb
> > +     mt_fpcr $f3
> >       subq    X, Y, RV
> >       ldt     $f0, 0(sp)
> > +     ldt     $f3, 48(sp)
> >       cmoveq  AT, X, RV
> >
> >       lda     sp, FRAME(sp)
> >       cfi_restore ($f0)
> > +     cfi_restore ($f3)
> >       cfi_def_cfa_offset (0)
> >       ret     $31, (RA), 1
>
> This appears to be the only change required to fix the bug.

That is true. This part is the problematic part and clobbers $f3.
Should I resend the patch only with this part fixed?

> Can you walk me through why the other changes?

As said above, I was trying to make __remqu like __divqu, but it looks
that __divqu should be fixed in some places.

Thanks,
Uros.
  
Richard Henderson Jan. 24, 2019, 9:04 a.m. UTC | #4
On 1/24/19 12:35 AM, Uros Bizjak wrote:
> That is true. This part is the problematic part and clobbers $f3.
> Should I resend the patch only with this part fixed?

Yes please.  We're in the final stages of release
and any changes should be most minimal.

We can fix the div/rem matchup and unwind errors
for the next release.


r~
  

Patch

diff --git a/sysdeps/alpha/remqu.S b/sysdeps/alpha/remqu.S
index c2c5caf3c20..7210628f973 100644
--- a/sysdeps/alpha/remqu.S
+++ b/sysdeps/alpha/remqu.S
@@ -59,20 +59,19 @@  __remqu:
 	subq	Y, 1, AT
 	stt	$f0, 0(sp)
 	and	Y, AT, AT
+	excb
+	beq	AT, $powerof2
 
 	stt	$f1, 8(sp)
-	excb
 	stt	$f3, 48(sp)
-	beq	AT, $powerof2
 	cfi_rel_offset ($f0, 0)
 	cfi_rel_offset ($f1, 8)
 	cfi_rel_offset ($f3, 48)
+	mf_fpcr	$f3
 
 	_ITOFT2	X, $f0, 16, Y, $f1, 24
-	mf_fpcr	$f3
 	cvtqt	$f0, $f0
 	cvtqt	$f1, $f1
-
 	blt	X, $x_is_neg
 	divt/c	$f0, $f1, $f0
 
@@ -94,12 +93,12 @@  __remqu:
 	mulq	AT, Y, AT
 	ldt	$f0, 0(sp)
 	ldt	$f3, 48(sp)
-	lda	sp, FRAME(sp)
 	cfi_remember_state
 	cfi_restore ($f0)
 	cfi_restore ($f1)
 	cfi_restore ($f3)
 	cfi_def_cfa_offset (0)
+	lda	sp, FRAME(sp)
 
 	.align	4
 	subq	X, AT, RV
@@ -116,11 +115,13 @@  $x_is_neg:
 	cfi_rel_offset ($f2, 24)
 	_ITOFS	AT, $f2, 16
 
+	.align	4
 	addt	$f0, $f2, $f0
+	unop
 	divt/c	$f0, $f1, $f0
+	unop
 
 	/* Ok, we've now the divide issued.  Continue with other checks.  */
-	.align	4
 	ldt	$f1, 8(sp)
 	unop
 	ldt	$f2, 24(sp)
@@ -246,12 +247,16 @@  $y_is_neg:
 	   quotient must be either 0 or 1, so the remainder must be X
 	   or X-Y, so just compute it directly.  */
 	cmpule	Y, X, AT
+	excb
+	mt_fpcr	$f3
 	subq	X, Y, RV
 	ldt	$f0, 0(sp)
+	ldt	$f3, 48(sp)
 	cmoveq	AT, X, RV
 
 	lda	sp, FRAME(sp)
 	cfi_restore ($f0)
+	cfi_restore ($f3)
 	cfi_def_cfa_offset (0)
 	ret	$31, (RA), 1