c++: Reject in constant evaluation address comparisons of start of one var and end of another [PR89074]

Message ID 20220106092416.GX2646553@tucnak
State New

Commit Message

Jakub Jelinek Jan. 6, 2022, 9:24 a.m. UTC
  Hi!

The following testcase used to be incorrectly accepted.  The match.pd
optimization that uses address_compare punts on folding a comparison of
the start of one object with the end of another only when those addresses
are cast to integral types.  When the comparison is done on pointer types,
it assumes undefined behavior and folds the comparison such that the
addresses don't compare equal, even when at runtime they could be equal.
But C++ says it is undefined behavior, so during constant evaluation we
should reject such comparisons; this patch adds a !folding_initializer &&
check to that spot.
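
To illustrate why "never equal" is only an assumption, here is a minimal
sketch of a one-past-the-end address coinciding with the start of another
array at run time.  The struct Pair and the helper name are hypothetical and
not part of the patch.  For two separate variables the layout is up to the
compiler and linker, so the struct is only a stand-in that makes the
adjacency predictable (assuming the ABI inserts no padding between the two
members, which the offsetof check below would catch).

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // std::uintptr_t

// Hypothetical aggregate: two arrays that are normally laid out back to back.
struct Pair {
  int a[2];
  int b[2];
};

// Assumption made explicit: no padding between a and b on this target.
static_assert(offsetof(Pair, b) == 2 * sizeof(int),
              "expected b to start right after a");

// Compare the two addresses as integers at run time.  p.a + 2 is the
// one-past-the-end pointer of a; p.b is the start of b.
inline bool end_of_a_meets_start_of_b(const Pair &p) {
  return reinterpret_cast<std::uintptr_t>(p.a + 2)
         == reinterpret_cast<std::uintptr_t>(p.b);
}
```

Calling end_of_a_meets_start_of_b on any Pair object is expected to report
that the two addresses coincide under the stated layout assumption; the same
coincidence is possible, though not guaranteed, for two independent
variables.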

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, address_compare has some special cases, e.g. it assumes that
static vars are never adjacent to automatic vars, which is the case
for the usual layout where automatic vars are on the stack and after
.rodata/.data sections there is heap:
  /* Assume that automatic variables can't be adjacent to global
     variables.  */
  else if (is_global_var (base0) != is_global_var (base1))
    ;
Is it ok that during constant evaluation we don't treat those as undefined
behavior, or shall that be with !folding_initializer && too?
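
The order of the checks discussed so far can be sketched as a small model.
This is a deliberately simplified, hypothetical illustration and not GCC's
address_compare: Verdict, Ref and compare_distinct are made-up names, sizes
and offsets are plain byte counts, and only the pointer-type check, the
automatic vs. global check and the start vs. one-past-the-end check are
modelled.

```cpp
// Hypothetical, simplified model; NOT the real address_compare.
enum class Verdict { Unequal, Unknown };

struct Ref {
  bool is_global;  // stands in for is_global_var (base)
  long size;       // size of the object in bytes
  long offset;     // byte offset of the address within the object
};

// Compare addresses that point into two *different* objects.
constexpr Verdict compare_distinct(Ref r0, Ref r1, bool integral_type,
                                   bool folding_initializer) {
  // Pointer-typed comparison outside constant evaluation: assume the
  // addresses differ (the behavior the patch keeps for that case).
  if (!folding_initializer && !integral_type)
    return Verdict::Unequal;
  // Automatic and global variables are assumed never to be adjacent.
  if (r0.is_global != r1.is_global)
    return Verdict::Unequal;
  // Start of one object against one past the end of the other: punt.
  if ((r0.offset == 0 && r1.offset == r1.size)
      || (r1.offset == 0 && r0.offset == r0.size))
    return Verdict::Unknown;
  return Verdict::Unequal;
}
```

With two automatic 8-byte arrays, offsets 8 and 0 model &a[2] == &b[0]: the
model yields Unknown when folding_initializer is set and Unequal otherwise,
while offsets 4 and 0 (&a[1] == &b[0]) yield Unequal either way.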

Another special case is:
  if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
       || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
       || (TREE_CODE (base0) == STRING_CST
           && TREE_CODE (base1) == STRING_CST
           && ioff0 >= 0 && ioff1 >= 0
           && ioff0 < TREE_STRING_LENGTH (base0)
           && ioff1 < TREE_STRING_LENGTH (base1)
          /* This is a too conservative test that the STRING_CSTs
             will not end up being string-merged.  */
           && strncmp (TREE_STRING_POINTER (base0) + ioff0,
                       TREE_STRING_POINTER (base1) + ioff1,
                       MIN (TREE_STRING_LENGTH (base0) - ioff0,
                            TREE_STRING_LENGTH (base1) - ioff1)) != 0))
    ;
  else if (!DECL_P (base0) || !DECL_P (base1))
    return 2;
Here we similarly assume that vars aren't adjacent to string literals
or vice versa.  Do we need to stick !folding_initializer && to those
DECL_P vs. STRING_CST cases?  Though, because of the return 2; for
non-DECL_P that would mean rejecting comparisons like &var == &"foobar"[3]
etc. which ought to be fine, no?  So perhaps, for DECLs vs. STRING_CSTs,
we need to check, as we do for two DECLs, whether the address is at the
start, at the end, or somewhere in the middle of the string literal (at
least for folding_initializer)?
And yet another chapter, though probably an unsolvable one, is comparison
of string literal addresses.  I think pedantically in C++
&"foo"[0] == &"foo"[0] is undefined behavior, since different occurrences of
the same string literal might still not be merged in some implementations.
But constexpr const char *s = "foo"; &s[0] == &s[0] should be well defined,
and we aren't tracking anywhere whether the string literal was the same one
or different (and I think other compilers don't track that either).
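
The two situations can be put side by side in a short sketch.  The function
names are hypothetical; the only assumption is that comparing a pointer with
itself, or with a different offset into the same literal, is a constant
expression, while two separate occurrences of "foo" may or may not share
storage depending on the implementation.

```cpp
// One pointer initialized from one string literal: both operands of the
// comparison refer to the same object.
constexpr const char *s = "foo";

constexpr bool same_pointer_compares_equal() { return &s[0] == &s[0]; }
constexpr bool different_offsets_differ() { return &s[0] != &s[1]; }

// Two separate occurrences of the same literal.  Whether they are merged
// is up to the implementation, so this is left as a run-time question
// here and no particular result is assumed.
inline bool two_occurrences_share_storage() {
  const char *p = "foo";
  const char *q = "foo";
  return p == q;
}
```

Only the first two functions are usable in a static_assert; the third may
return either value depending on whether the literals are merged.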

2022-01-06  Jakub Jelinek  <jakub@redhat.com>

	PR c++/89074
	* fold-const.c (address_compare): Punt on comparison of address of
	one object with address of end of another object if
	folding_initializer.

	* g++.dg/cpp1y/constexpr-89074-1.C: New test.


	Jakub
  

Comments

Richard Biener Jan. 10, 2022, 2:10 p.m. UTC | #1
On Thu, Jan 6, 2022 at 10:25 AM Jakub Jelinek via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi!
>
> The following testcase used to be incorrectly accepted.  The match.pd
> optimization that uses address_compare punts on folding comparison
> of start of one object and end of another one only when those addresses
> are cast to integral types, when the comparison is done on pointer types
> it assumes undefined behavior and decides to fold the comparison such
> that the addresses don't compare equal even when at runtime they
> could be equal.
> But C++ says it is undefined behavior and so during constant evaluation
> we should reject those, so this patch adds !folding_initializer &&
> check to that spot.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> Note, address_compare has some special cases, e.g. it assumes that
> static vars are never adjacent to automatic vars, which is the case
> for the usual layout where automatic vars are on the stack and after
> .rodata/.data sections there is heap:
>   /* Assume that automatic variables can't be adjacent to global
>      variables.  */
>   else if (is_global_var (base0) != is_global_var (base1))
>     ;
> Is it ok that during constant evaluation we don't treat those as undefined
> behavior, or shall that be with !folding_initializer && too?
>
> Another special case is:
>   if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
>        || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
>        || (TREE_CODE (base0) == STRING_CST
>            && TREE_CODE (base1) == STRING_CST
>            && ioff0 >= 0 && ioff1 >= 0
>            && ioff0 < TREE_STRING_LENGTH (base0)
>            && ioff1 < TREE_STRING_LENGTH (base1)
>           /* This is a too conservative test that the STRING_CSTs
>              will not end up being string-merged.  */
>            && strncmp (TREE_STRING_POINTER (base0) + ioff0,
>                        TREE_STRING_POINTER (base1) + ioff1,
>                        MIN (TREE_STRING_LENGTH (base0) - ioff0,
>                             TREE_STRING_LENGTH (base1) - ioff1)) != 0))
>     ;
>   else if (!DECL_P (base0) || !DECL_P (base1))
>     return 2;
> Here we similarly assume that vars aren't adjacent to string literals
> or vice versa.  Do we need to stick !folding_initializer && to those
> DECL_P vs. STRING_CST cases?  Though, because of the return 2; for
> non-DECL_P that would mean rejecting comparisons like &var == &"foobar"[3]
> etc. which ought to be fine, no?  So perhaps we need to watch for
> decls. vs. STRING_CSTs like for DECLs whether the address is at the start
> or at the end of the string literal or somewhere in between (at least
> for folding_initializer)?
> And yet another chapter but probably unsolvable is comparison of
> string literal addresses.  I think pedantically in C++
> &"foo"[0] == &"foo"[0] is undefined behavior, different occurrences of
> the same string literals might still not be merged in some implementations.
> But constexpr const char *s = "foo"; &s[0] == &s[0] should be well defined,
> and we aren't tracking anywhere whether the string literal was the same one
> or different (and I think other compilers don't track that either).

On my TODO list is to make &"foo" invalid and instead require &CONST_DECL
(and DECL_INITIAL of it then being "foo"), that would make it possible to
track the "original" string literal and perform string merging in a
more considerate way.

Richard.

  
Andrew Pinski Jan. 11, 2022, 3:24 a.m. UTC | #2
On Mon, Jan 10, 2022 at 6:11 AM Richard Biener via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Thu, Jan 6, 2022 at 10:25 AM Jakub Jelinek via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi!
> >
> > The following testcase used to be incorrectly accepted.  The match.pd
> > optimization that uses address_compare punts on folding comparison
> > of start of one object and end of another one only when those addresses
> > are cast to integral types, when the comparison is done on pointer types
> > it assumes undefined behavior and decides to fold the comparison such
> > that the addresses don't compare equal even when at runtime they
> > could be equal.
> > But C++ says it is undefined behavior and so during constant evaluation
> > we should reject those, so this patch adds !folding_initializer &&
> > check to that spot.
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > Note, address_compare has some special cases, e.g. it assumes that
> > static vars are never adjacent to automatic vars, which is the case
> > for the usual layout where automatic vars are on the stack and after
> > .rodata/.data sections there is heap:
> >   /* Assume that automatic variables can't be adjacent to global
> >      variables.  */
> >   else if (is_global_var (base0) != is_global_var (base1))
> >     ;
> > Is it ok that during constant evaluation we don't treat those as undefined
> > behavior, or shall that be with !folding_initializer && too?
> >
> > Another special case is:
> >   if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
> >        || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
> >        || (TREE_CODE (base0) == STRING_CST
> >            && TREE_CODE (base1) == STRING_CST
> >            && ioff0 >= 0 && ioff1 >= 0
> >            && ioff0 < TREE_STRING_LENGTH (base0)
> >            && ioff1 < TREE_STRING_LENGTH (base1)
> >           /* This is a too conservative test that the STRING_CSTs
> >              will not end up being string-merged.  */
> >            && strncmp (TREE_STRING_POINTER (base0) + ioff0,
> >                        TREE_STRING_POINTER (base1) + ioff1,
> >                        MIN (TREE_STRING_LENGTH (base0) - ioff0,
> >                             TREE_STRING_LENGTH (base1) - ioff1)) != 0))
> >     ;
> >   else if (!DECL_P (base0) || !DECL_P (base1))
> >     return 2;
> > Here we similarly assume that vars aren't adjacent to string literals
> > or vice versa.  Do we need to stick !folding_initializer && to those
> > DECL_P vs. STRING_CST cases?  Though, because of the return 2; for
> > non-DECL_P that would mean rejecting comparisons like &var == &"foobar"[3]
> > etc. which ought to be fine, no?  So perhaps we need to watch for
> > decls. vs. STRING_CSTs like for DECLs whether the address is at the start
> > or at the end of the string literal or somewhere in between (at least
> > for folding_initializer)?
> > And yet another chapter but probably unsolvable is comparison of
> > string literal addresses.  I think pedantically in C++
> > &"foo"[0] == &"foo"[0] is undefined behavior, different occurrences of
> > the same string literals might still not be merged in some implementations.
> > But constexpr const char *s = "foo"; &s[0] == &s[0] should be well defined,
> > and we aren't tracking anywhere whether the string literal was the same one
> > or different (and I think other compilers don't track that either).
>
> On my TODO list is to make &"foo" invalid and instead require &CONST_DECL
> (and DECL_INITIAL of it then being "foo"), that would make it possible to
> track the "original" string literal and perform string merging in a
> more considerate way.

Interesting, because I wrote that this would be one way to fix PR88925.

Thanks,
Andrew Pinski

  
Jakub Jelinek Jan. 13, 2022, 5:35 p.m. UTC | #3
Hi!

I'd like to ping this patch:

> 2022-01-06  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/89074
> 	* fold-const.c (address_compare): Punt on comparison of address of
> 	one object with address of end of another object if
> 	folding_initializer.
> 
> 	* g++.dg/cpp1y/constexpr-89074-1.C: New test.

Thanks.

	Jakub
  
Jason Merrill Jan. 13, 2022, 9:18 p.m. UTC | #4
On 1/6/22 04:24, Jakub Jelinek wrote:
> 
> The following testcase used to be incorrectly accepted.  The match.pd
> optimization that uses address_compare punts on folding comparison
> of start of one object and end of another one only when those addresses
> are cast to integral types, when the comparison is done on pointer types
> it assumes undefined behavior and decides to fold the comparison such
> that the addresses don't compare equal even when at runtime they
> could be equal.
> But C++ says it is undefined behavior and so during constant evaluation
> we should reject those, so this patch adds !folding_initializer &&
> check to that spot.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> Note, address_compare has some special cases, e.g. it assumes that
> static vars are never adjacent to automatic vars, which is the case
> for the usual layout where automatic vars are on the stack and after
> .rodata/.data sections there is heap:
>    /* Assume that automatic variables can't be adjacent to global
>       variables.  */
>    else if (is_global_var (base0) != is_global_var (base1))
>      ;
> Is it ok that during constant evaluation we don't treat those as undefined
> behavior, or shall that be with !folding_initializer && too?

I guess that's undefined as well.

> Another special case is:
>    if ((DECL_P (base0) && TREE_CODE (base1) == STRING_CST)
>         || (TREE_CODE (base0) == STRING_CST && DECL_P (base1))
>         || (TREE_CODE (base0) == STRING_CST
>             && TREE_CODE (base1) == STRING_CST
>             && ioff0 >= 0 && ioff1 >= 0
>             && ioff0 < TREE_STRING_LENGTH (base0)
>             && ioff1 < TREE_STRING_LENGTH (base1)
>            /* This is a too conservative test that the STRING_CSTs
>               will not end up being string-merged.  */
>             && strncmp (TREE_STRING_POINTER (base0) + ioff0,
>                         TREE_STRING_POINTER (base1) + ioff1,
>                         MIN (TREE_STRING_LENGTH (base0) - ioff0,
>                              TREE_STRING_LENGTH (base1) - ioff1)) != 0))
>      ;
>    else if (!DECL_P (base0) || !DECL_P (base1))
>      return 2;
> Here we similarly assume that vars aren't adjacent to string literals
> or vice versa.  Do we need to stick !folding_initializer && to those
> DECL_P vs. STRING_CST cases?

Seems so.

> Though, because of the return 2; for
> non-DECL_P that would mean rejecting comparisons like &var == &"foobar"[3]
> etc. which ought to be fine, no?  So perhaps we need to watch for
> decls. vs. STRING_CSTs like for DECLs whether the address is at the start
> or at the end of the string literal or somewhere in between (at least
> for folding_initializer)?

Agreed.

> And yet another chapter but probably unsolvable is comparison of
> string literal addresses.  I think pedantically in C++
> &"foo"[0] == &"foo"[0] is undefined behavior, different occurrences of
> the same string literals might still not be merged in some implementations.

I disagree; it's unspecified whether string literals are merged, but I 
think the comparison result is well specified depending on that 
implementation behavior.

> But constexpr const char *s = "foo"; &s[0] == &s[0] should be well defined,
> and we aren't tracking anywhere whether the string literal was the same one
> or different (and I think other compilers don't track that either).
  

Patch

--- gcc/fold-const.c.jj	2022-01-05 20:30:08.731806756 +0100
+++ gcc/fold-const.c	2022-01-05 20:34:52.277822349 +0100
@@ -16627,7 +16627,7 @@  address_compare (tree_code code, tree ty
   /* If this is a pointer comparison, ignore for now even
      valid equalities where one pointer is the offset zero
      of one object and the other to one past end of another one.  */
-  else if (!INTEGRAL_TYPE_P (type))
+  else if (!folding_initializer && !INTEGRAL_TYPE_P (type))
     ;
   /* Assume that automatic variables can't be adjacent to global
      variables.  */
--- gcc/testsuite/g++.dg/cpp1y/constexpr-89074-1.C.jj	2022-01-05 20:43:03.696917484 +0100
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-89074-1.C	2022-01-05 20:42:12.676634044 +0100
@@ -0,0 +1,28 @@ 
+// PR c++/89074
+// { dg-do compile { target c++14 } }
+
+constexpr bool
+foo ()
+{
+  int a[] = { 1, 2 };
+  int b[] = { 3, 4 };
+
+  if (&a[0] == &b[0])
+    return false;
+
+  if (&a[1] == &b[0])
+    return false;
+
+  if (&a[1] == &b[1])
+    return false;
+
+  if (&a[2] == &b[1])
+    return false;
+
+  if (&a[2] == &b[0])		// { dg-error "is not a constant expression" }
+    return false;
+
+  return true;
+}
+
+constexpr bool a = foo ();