[v2] cse: avoid signed overflow in compute_const_anchors [PR 104843]

Message ID b278df38f01b823d8a3915f67321affc9a8ca05b.camel@mengyan1223.wang

Commit Message

Xi Ruoyao March 9, 2022, 4:12 p.m. UTC
  On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote:

> isn't it better to make targetm.const_anchor unsigned?
> The & and ~ are not subject to overflow rules.

It's not enough: if n is the minimum value of HOST_WIDE_INT and
const_anchor = 0x8000 (the value for MIPS), we'll have a signed 0x7fff
in *upper_base.  Then the next line, "*upper_offs = n - *upper_base;"
will be a signed overflow again.

How about the following?

-- >8 --

With a non-zero const_anchor, the behavior of this function relied on
signed overflow.

gcc/

	PR rtl-optimization/104843
	* cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for
	n to perform overflow arithmetic safely.
---
 gcc/cse.cc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
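
As a standalone sketch of the rounding the patch performs (my own
illustration, not GCC code; 0x8000 is the MIPS const_anchor value and the
input constant is just an example), the computation on unsigned
HOST_WIDE_INT is well defined:

#include <stdint.h>
#include <stdio.h>

int main (void)
{
  uint64_t anchor = 0x8000;                /* assumed: the MIPS const_anchor */
  uint64_t n = 0xffffffff8123ffffull;      /* SImode 0x8123ffff, sign-extended */
  /* Round down / up to a multiple of the (power-of-two) anchor.  */
  uint64_t lower_base = n & ~(anchor - 1);                   /* ...81238000 */
  uint64_t upper_base = (n + (anchor - 1)) & ~(anchor - 1);  /* ...81240000 */
  int64_t lower_offs = (int64_t) (n - lower_base);           /* 0x7fff */
  int64_t upper_offs = (int64_t) (n - upper_base);           /* -1 */
  printf ("lower %#llx %+lld, upper %#llx %+lld\n",
          (unsigned long long) lower_base, (long long) lower_offs,
          (unsigned long long) upper_base, (long long) upper_offs);
  return 0;
}

It reports a lower anchor of 0xffffffff81238000 (offset 0x7fff) and an
upper anchor of 0xffffffff81240000 (offset -1); the latter is the anchor
the li/daddiu sequence quoted later in the thread builds on.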
  

Comments

Richard Biener March 10, 2022, 8:01 a.m. UTC | #1
On Wed, Mar 9, 2022 at 5:12 PM Xi Ruoyao <xry111@mengyan1223.wang> wrote:
>
> On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote:
>
> > isn't it better to make targetm.const_anchor unsigned?
> > The & and ~ are not subject to overflow rules.
>
> It's not enough: if n is the minimum value of HOST_WIDE_INT and
> const_anchor = 0x8000 (the value for MIPS), we'll have a signed 0x7fff
> in *upper_base.  Then the next line, "*upper_offs = n - *upper_base;"
> will be a signed overflow again.
>
> How about the following?

Hmm, so all this seems to be to round CST up and down to a multiple of
CONST_ANCHOR.  It works on CONST_INT only, which is sign-extended, so if
there is overflow the resulting anchor is broken as far as I can see.  So
instead of papering over this issue the function should return false when
n is negative, since then n & ~(targetm.const_anchor - 1) is also not n
rounded down to a multiple of const_anchor.

But of course I know nothing about this ..

Richard.
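
A sketch of the check suggested here (an illustration of the suggestion
only, not a tested change) would bail out before any rounding:

  /* In compute_const_anchors: give up for negative values instead of
     relying on wrap-around.  The follow-up below shows this would regress
     the negative-constant case.  */
  HOST_WIDE_INT n = INTVAL (cst);
  if (n < 0)
    return false;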

> -- >8 --
>
> With a non-zero const_anchor, the behavior of this function relied on
> signed overflow.
>
> gcc/
>
>         PR rtl-optimization/104843
>         * cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for
>         n to perform overflow arithmetic safely.
> ---
>  gcc/cse.cc | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/cse.cc b/gcc/cse.cc
> index a18b599d324..052fa0c3490 100644
> --- a/gcc/cse.cc
> +++ b/gcc/cse.cc
> @@ -1169,12 +1169,12 @@ compute_const_anchors (rtx cst,
>                        HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
>                        HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
>  {
> -  HOST_WIDE_INT n = INTVAL (cst);
> -
> -  *lower_base = n & ~(targetm.const_anchor - 1);
> -  if (*lower_base == n)
> +  unsigned HOST_WIDE_INT n = UINTVAL (cst);
> +  unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1);
> +  if (lb == n)
>      return false;
>
> +  *lower_base = lb;
>    *upper_base =
>      (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
>    *upper_offs = n - *upper_base;
> --
> 2.35.1
>
>
> >
  
Xi Ruoyao March 10, 2022, 11:32 a.m. UTC | #2
On Thu, 2022-03-10 at 09:01 +0100, Richard Biener wrote:
> On Wed, Mar 9, 2022 at 5:12 PM Xi Ruoyao <xry111@mengyan1223.wang>
> wrote:
> > 
> > On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote:
> > 
> > > isn't it better to make targetm.const_anchor unsigned?
> > > The & and ~ are not subject to overflow rules.
> > 
> > It's not enough: if n is the minimum value of HOST_WIDE_INT and
> > const_anchor = 0x8000 (the value for MIPS), we'll have a signed
> > 0x7fff
> > in *upper_base.  Then the next line, "*upper_offs = n -
> > *upper_base;"
> > will be a signed overflow again.
> > 
> > How about the following?
> 
> Hmm, so all this seems to be to round CST up and down to a multiple of
> CONST_ANCHOR.
> It works on CONST_INT only which is sign-extended, so if there is
> overflow the resulting
> anchor is broken as far as I can see.

On MIPS addiu/daddiu do two's-complement addition, so the overflowed result
is still usable.

> So instead of papering over this issue
> the function should return false when n is negative since then
> n & ~(targetm.const_anchor - 1) is also not n rounded down to a
> multiple of const_anchor.

This function does work for negative n, like:

void g (int, int);
void
f (void)
{
  g(0x8123ffff, 0x81240001);
}

It should produce:

	li	$4,-2128347136			# 0xffffffff81240000
	daddiu	$5,$4,1
	daddiu	$4,$4,-1
	jal	g

But returning false for negative n would cause a regression for this case,
producing:

	li	$5,-2128347136			# 0xffffffff81240000
	li	$4,-2128412672			# 0xffffffff81230000
	ori	$5,$5,0x1
	ori	$4,$4,0xffff
	jal	g

That being said, it indeed does not work for:

void g (int, int);
void f ()
{
  g (0x7fffffff, 0x80000001);
}

It produces:

	li	$5,-2147483648			# 0xffffffff80000000
	li	$4,2147418112			# 0x7fff0000
	daddiu	$5,$5,1
	ori	$4,$4,0xffff
	jal	g

Should be:

	li	$5,-2147483648		# 0xffffffff80000000
	daddiu	$5,$5,1
        addiu   $4,$5,-1

> > -- >8 --
> > 
> > With a non-zero const_anchor, the behavior of this function relied on
> > signed overflow.
> > 
> > gcc/
> > 
> >         PR rtl-optimization/104843
> >         * cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for
> >         n to perform overflow arithmetic safely.
> > ---
> >  gcc/cse.cc | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/gcc/cse.cc b/gcc/cse.cc
> > index a18b599d324..052fa0c3490 100644
> > --- a/gcc/cse.cc
> > +++ b/gcc/cse.cc
> > @@ -1169,12 +1169,12 @@ compute_const_anchors (rtx cst,
> >                        HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
> >                        HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
> >  {
> > -  HOST_WIDE_INT n = INTVAL (cst);
> > -
> > -  *lower_base = n & ~(targetm.const_anchor - 1);
> > -  if (*lower_base == n)
> > +  unsigned HOST_WIDE_INT n = UINTVAL (cst);
> > +  unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1);
> > +  if (lb == n)
> >      return false;
> > 
> > +  *lower_base = lb;
> >    *upper_base =
> >      (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
> >    *upper_offs = n - *upper_base;
> > --
> > 2.35.1
> > 
> > 
> > >
  
Richard Biener March 11, 2022, 11:12 a.m. UTC | #3
On Thu, Mar 10, 2022 at 12:32 PM Xi Ruoyao <xry111@mengyan1223.wang> wrote:
>
> On Thu, 2022-03-10 at 09:01 +0100, Richard Biener wrote:
> > On Wed, Mar 9, 2022 at 5:12 PM Xi Ruoyao <xry111@mengyan1223.wang>
> > wrote:
> > >
> > > On Wed, 2022-03-09 at 15:55 +0100, Richard Biener wrote:
> > >
> > > > isn't it better to make targetm.const_anchor unsigned?
> > > > The & and ~ are not subject to overflow rules.
> > >
> > > It's not enough: if n is the minimum value of HOST_WIDE_INT and
> > > const_anchor = 0x8000 (the value for MIPS), we'll have a signed
> > > 0x7fff
> > > in *upper_base.  Then the next line, "*upper_offs = n -
> > > *upper_base;"
> > > will be a signed overflow again.
> > >
> > > How about the following?
> >
> > Hmm, so all this seems to be to round CST up and down to a multiple of
> > CONST_ANCHOR.
> > It works on CONST_INT only which is sign-extended, so if there is
> > overflow the resulting
> > anchor is broken as far as I can see.
>
> On MIPS addiu/daddiu do two's-complement addition, so the overflowed result
> is still usable.

The issue is that what the CONST_INT actually means depends on the
mode; an "overflow" to a positive number will eventually change what
is the lower and what is the upper bound(?)
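
Concretely, for the g (0x7fffffff, 0x80000001) case quoted below (a
standalone illustration of this point, not part of the original mail):
the SImode CONST_INT for 0x80000001 is the sign-extended value
0xffffffff80000001 and its lower anchor is 0xffffffff80000000, while
rounding 0x7fffffff up crosses the sign boundary and yields
0x0000000080000000, which is not a valid sign-extended SImode value, so
the two constants do not end up sharing an anchor:

#include <stdint.h>
#include <stdio.h>

int main (void)
{
  uint64_t anchor = 0x8000;            /* assumed: the MIPS const_anchor */
  uint64_t a = 0x000000007fffffffull;  /* SImode 0x7fffffff, sign-extended */
  uint64_t b = 0xffffffff80000001ull;  /* SImode 0x80000001, sign-extended */
  uint64_t upper_a = (a + (anchor - 1)) & ~(anchor - 1);  /* 0x0000000080000000 */
  uint64_t lower_b = b & ~(anchor - 1);                   /* 0xffffffff80000000 */
  printf ("upper anchor of 0x7fffffff: %#llx\n", (unsigned long long) upper_a);
  printf ("lower anchor of 0x80000001: %#llx\n", (unsigned long long) lower_b);
  return 0;
}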

> > So instead of papering over this issue
> > the function should return false when n is negative since then
> > n & ~(targetm.const_anchor - 1) is also not n rounded down to a
> > multiple of const_anchor.
>
> This function does work for negative n, like:
>
> void g (int, int);
> void
> f (void)
> {
>   g(0x8123ffff, 0x81240001);
> }
>
> It should produce:
>
>         li      $4,-2128347136                  # 0xffffffff81240000
>         daddiu  $5,$4,1
>         daddiu  $4,$4,-1
>         jal     g
>
> But returning false for negative n would cause a regression for this case,
> producing:
>
>         li      $5,-2128347136                  # 0xffffffff81240000
>         li      $4,-2128412672                  # 0xffffffff81230000
>         ori     $5,$5,0x1
>         ori     $4,$4,0xffff
>         jal     g
>
> That being said, it indeed does not work for:
>
> void g (int, int);
> void f ()
> {
>   g (0x7fffffff, 0x80000001);
> }
>
> It produces:
>
>         li      $5,-2147483648                  # 0xffffffff80000000
>         li      $4,2147418112                   # 0x7fff0000
>         daddiu  $5,$5,1
>         ori     $4,$4,0xffff
>         jal     g
>
> Should be:
>
>         li      $5,-2147483648          # 0xffffffff80000000
>         daddiu  $5,$5,1
>         addiu   $4,$5,-1

So maybe you can figure out a fix that makes it work for this case as well.

> > > -- >8 --
> > >
> > > With a non-zero const_anchor, the behavior of this function relied on
> > > signed overflow.
> > >
> > > gcc/
> > >
> > >         PR rtl-optimization/104843
> > >         * cse.cc (compute_const_anchors): Use unsigned HOST_WIDE_INT for
> > >         n to perform overflow arithmetic safely.
> > > ---
> > >  gcc/cse.cc | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/gcc/cse.cc b/gcc/cse.cc
> > > index a18b599d324..052fa0c3490 100644
> > > --- a/gcc/cse.cc
> > > +++ b/gcc/cse.cc
> > > @@ -1169,12 +1169,12 @@ compute_const_anchors (rtx cst,
> > >                        HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
> > >                        HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
> > >  {
> > > -  HOST_WIDE_INT n = INTVAL (cst);
> > > -
> > > -  *lower_base = n & ~(targetm.const_anchor - 1);
> > > -  if (*lower_base == n)
> > > +  unsigned HOST_WIDE_INT n = UINTVAL (cst);
> > > +  unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1);
> > > +  if (lb == n)
> > >      return false;
> > >
> > > +  *lower_base = lb;
> > >    *upper_base =
> > >      (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
> > >    *upper_offs = n - *upper_base;
> > > --
> > > 2.35.1
> > >
> > >
> > > >
>
> --
> Xi Ruoyao <xry111@mengyan1223.wang>
> School of Aerospace Science and Technology, Xidian University
  

Patch

diff --git a/gcc/cse.cc b/gcc/cse.cc
index a18b599d324..052fa0c3490 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -1169,12 +1169,12 @@  compute_const_anchors (rtx cst,
 		       HOST_WIDE_INT *lower_base, HOST_WIDE_INT *lower_offs,
 		       HOST_WIDE_INT *upper_base, HOST_WIDE_INT *upper_offs)
 {
-  HOST_WIDE_INT n = INTVAL (cst);
-
-  *lower_base = n & ~(targetm.const_anchor - 1);
-  if (*lower_base == n)
+  unsigned HOST_WIDE_INT n = UINTVAL (cst);
+  unsigned HOST_WIDE_INT lb = n & ~(targetm.const_anchor - 1);
+  if (lb == n)
     return false;
 
+  *lower_base = lb;
   *upper_base =
     (n + (targetm.const_anchor - 1)) & ~(targetm.const_anchor - 1);
   *upper_offs = n - *upper_base;