[2/2] Avoid default-initializing auto_vec<T, N> storage, fix vec<vl_embed>

Message ID 20230224114444.7E2CE13246@imap2.suse-dmz.suse.de
State New
Headers
Series [1/2] Change vec<, , vl_embed>::m_vecdata refrences into address () |

Commit Message

Richard Biener Feb. 24, 2023, 11:44 a.m. UTC
  The following avoids default-initializing auto_vec storage for
non-POD T since that's not what the allocated storage fallback
will do and it's also not expected for existing cases like

  auto_vec<std::pair<unsigned, unsigned>, 64> elts;

which exist to optimize the allocation.

It also fixes the array accesses done by vec<vl_embed> to not
use its own m_vecdata member but instead access the container
provided storage via pointer arithmetic.

This seems to work but it also somehow breaks genrecog which now
goes OOM with this change.  I'm going to see if the testsuite
shows anything, but maybe it's obvious from a second eye what
I did wrong ...

Comments welcome of course.

Thanks,
Richard.

	* vec.h (vec<T, A, vl_embed>::m_vecdata): Remove.
	(vec<T, A, vl_embed>::m_vecpfx): Align as T to avoid
	changing alignment of vec<T, A, vl_embed> and simplifying
	address.
	(vec<T, A, vl_embed>::address): Compute as this + 1.
	(vec<T, A, vl_embed>::embedded_size): Use sizeof the
	vector instead of the offset of the m_vecdata member.
	(auto_vec<T, N>): Turn m_data storage into
	uninitialized unsigned char aligned as T.
	* vec.cc (test_auto_alias): New.
	(vec_cc_tests): Call it.
---
 gcc/vec.cc | 17 +++++++++++++++++
 gcc/vec.h  | 11 +++++------
 2 files changed, 22 insertions(+), 6 deletions(-)
  

Comments

Jakub Jelinek Feb. 24, 2023, 11:52 a.m. UTC | #1
On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> --- a/gcc/vec.h
> +++ b/gcc/vec.h
> @@ -586,8 +586,8 @@ public:
>    unsigned allocated (void) const { return m_vecpfx.m_alloc; }
>    unsigned length (void) const { return m_vecpfx.m_num; }
>    bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> -  T *address (void) { return m_vecdata; }
> -  const T *address (void) const { return m_vecdata; }
> +  T *address (void) { return reinterpret_cast <T *> (this + 1); }
> +  const T *address (void) const { return reinterpret_cast <const T *> (this + 1); }

This is now too long.

>    T *begin () { return address (); }
>    const T *begin () const { return address (); }
>    T *end () { return address () + length (); }
> @@ -631,8 +631,7 @@ public:
>  
>    /* FIXME - These fields should be private, but we need to cater to
>  	     compilers that have stricter notions of PODness for types.  */
> -  vec_prefix m_vecpfx;
> -  T m_vecdata[1];
> +  alignas (T) vec_prefix m_vecpfx;

The comment needs adjustment and down't we need
alignas (T) alignas (vec_prefix) ?

> @@ -1588,7 +1587,7 @@ public:
>  
>  private:
>    vec<T, va_heap, vl_embed> m_auto;
> -  T m_data[MAX (N - 1, 1)];
> +  alignas(T) unsigned char m_data[sizeof (T) * N];
>  };

I still believe you don't need alignas(T) here (and space before (T) ).
Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
standard C.

Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
bootstrap that I got with my version or not.

	Jakub
  
Jonathan Wakely Feb. 24, 2023, 11:54 a.m. UTC | #2
On Fri, 24 Feb 2023 at 11:52, Jakub Jelinek <jakub@redhat.com> wrote:
>
> On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> > --- a/gcc/vec.h
> > +++ b/gcc/vec.h
> > @@ -586,8 +586,8 @@ public:
> >    unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> >    unsigned length (void) const { return m_vecpfx.m_num; }
> >    bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > -  T *address (void) { return m_vecdata; }
> > -  const T *address (void) const { return m_vecdata; }
> > +  T *address (void) { return reinterpret_cast <T *> (this + 1); }
> > +  const T *address (void) const { return reinterpret_cast <const T *> (this + 1); }
>
> This is now too long.
>
> >    T *begin () { return address (); }
> >    const T *begin () const { return address (); }
> >    T *end () { return address () + length (); }
> > @@ -631,8 +631,7 @@ public:
> >
> >    /* FIXME - These fields should be private, but we need to cater to
> >            compilers that have stricter notions of PODness for types.  */
> > -  vec_prefix m_vecpfx;
> > -  T m_vecdata[1];
> > +  alignas (T) vec_prefix m_vecpfx;
>
> The comment needs adjustment and down't we need
> alignas (T) alignas (vec_prefix) ?

Yes. If alignas(T) is less than the natural alignment then this will
be an error. We want it to be the larger of  the two alignments, so we
need to specify both.

>
> > @@ -1588,7 +1587,7 @@ public:
> >
> >  private:
> >    vec<T, va_heap, vl_embed> m_auto;
> > -  T m_data[MAX (N - 1, 1)];
> > +  alignas(T) unsigned char m_data[sizeof (T) * N];
> >  };
>
> I still believe you don't need alignas(T) here (and space before (T) ).
> Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> standard C.
>
> Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> bootstrap that I got with my version or not.
>
>         Jakub
>
  
Jakub Jelinek Feb. 24, 2023, 12:13 p.m. UTC | #3
On Fri, Feb 24, 2023 at 11:54:54AM +0000, Jonathan Wakely wrote:
> > The comment needs adjustment and don't we need
> > alignas (T) alignas (vec_prefix) ?
> 
> Yes. If alignas(T) is less than the natural alignment then this will
> be an error. We want it to be the larger of  the two alignments, so we
> need to specify both.

Seems g++ doesn't diagnose this but clang++ does:
struct S { int a; };
alignas (char) alignas (S) S s;
alignas (char) S t;
$ g++ -S -o /tmp/1.s /tmp/1.C -pedantic-errors
$ clang++ -S -o /tmp/1.s /tmp/1.C -pedantic-errors
/tmp/1.C:3:1: error: requested alignment is less than minimum alignment of 4 for type 'S'
alignas (char) S t;
^
1 error generated.

	Jakub
  
Richard Biener Feb. 24, 2023, 12:15 p.m. UTC | #4
On Fri, 24 Feb 2023, Jonathan Wakely wrote:

> On Fri, 24 Feb 2023 at 11:52, Jakub Jelinek <jakub@redhat.com> wrote:
> >
> > On Fri, Feb 24, 2023 at 12:44:44PM +0100, Richard Biener wrote:
> > > --- a/gcc/vec.h
> > > +++ b/gcc/vec.h
> > > @@ -586,8 +586,8 @@ public:
> > >    unsigned allocated (void) const { return m_vecpfx.m_alloc; }
> > >    unsigned length (void) const { return m_vecpfx.m_num; }
> > >    bool is_empty (void) const { return m_vecpfx.m_num == 0; }
> > > -  T *address (void) { return m_vecdata; }
> > > -  const T *address (void) const { return m_vecdata; }
> > > +  T *address (void) { return reinterpret_cast <T *> (this + 1); }
> > > +  const T *address (void) const { return reinterpret_cast <const T *> (this + 1); }
> >
> > This is now too long.

Fixed.

> > >    T *begin () { return address (); }
> > >    const T *begin () const { return address (); }
> > >    T *end () { return address () + length (); }
> > > @@ -631,8 +631,7 @@ public:
> > >
> > >    /* FIXME - These fields should be private, but we need to cater to
> > >            compilers that have stricter notions of PODness for types.  */
> > > -  vec_prefix m_vecpfx;
> > > -  T m_vecdata[1];
> > > +  alignas (T) vec_prefix m_vecpfx;
> >
> > The comment needs adjustment and down't we need
> > alignas (T) alignas (vec_prefix) ?
> 
> Yes. If alignas(T) is less than the natural alignment then this will
> be an error. We want it to be the larger of  the two alignments, so we
> need to specify both.

OK, changed to specify both and adjusted the comment, also noting why
we do this - it simplifies address (), otherwise we'd have to round up
to an aligned address.

> >
> > > @@ -1588,7 +1587,7 @@ public:
> > >
> > >  private:
> > >    vec<T, va_heap, vl_embed> m_auto;
> > > -  T m_data[MAX (N - 1, 1)];
> > > +  alignas(T) unsigned char m_data[sizeof (T) * N];
> > >  };
> >
> > I still believe you don't need alignas(T) here (and space before (T) ).

I was worried that with auto_vec<__int128> we get tail-padding in m_auto
re-used, but since this isn't inheritance we're probably safe.  So 
removed give that m_auto is aligned to T.

> > Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> > ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> > now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> > standard C.

I've removed the MAX (N, 2) now, I think that N == 0 cannot happen
because we have a specialization covering that.  So we know N is
at least 1.

> > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > bootstrap that I got with my version or not.

Yes, I get this as well, not sure how to suppress it.  I guess there's
no standard way to get at the address after some object without going
through uintptr obfuscation - and obviously we do not want to have
that (and if we optimize it away that doesn't help the diagnostic ...)

Richard.
  
Jakub Jelinek Feb. 24, 2023, 12:26 p.m. UTC | #5
On Fri, Feb 24, 2023 at 12:15:04PM +0000, Richard Biener wrote:
> > > Also, I think it needs to be MAX (N, 2) instead of N, because auto_vec
> > > ctors use MAX (N, 2).  We could also change all those to MAX (N, 1)
> > > now, but it can't be N because m_data[sizeof (T) * 0] is invalid in
> > > standard C.
> 
> I've removed the MAX (N, 2) now, I think that N == 0 cannot happen
> because we have a specialization covering that.  So we know N is
> at least 1.

I think you're right.

> > > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > > bootstrap that I got with my version or not.
> 
> Yes, I get this as well, not sure how to suppress it.  I guess there's
> no standard way to get at the address after some object without going
> through uintptr obfuscation - and obviously we do not want to have
> that (and if we optimize it away that doesn't help the diagnostic ...)

I think we need to look at the exact IL on which it warns and see what
our options are.

	Jakub
  
Jakub Jelinek Feb. 24, 2023, 1:22 p.m. UTC | #6
On Fri, Feb 24, 2023 at 12:15:04PM +0000, Richard Biener wrote:
> > > Anyway, I wonder if you get the -Werror=stringop-overflow= errors during
> > > bootstrap that I got with my version or not.
> 
> Yes, I get this as well, not sure how to suppress it.  I guess there's
> no standard way to get at the address after some object without going
> through uintptr obfuscation - and obviously we do not want to have
> that (and if we optimize it away that doesn't help the diagnostic ...)

As I wrote on IRC, incremental:
--- gcc/vec.h	2023-02-24 11:54:49.859564268 +0100
+++ gcc/vec.h	2023-02-24 14:13:38.199163428 +0100
@@ -1553,7 +1553,8 @@
   auto_vec ()
   {
     m_auto.embedded_init (MAX (N, 2), 0, 1);
-    this->m_vec = &m_auto;
+    size_t off = (char *) &m_auto - (char *) this;
+    this->m_vec = (vec<T, va_heap, vl_embed> *) ((char *) this + off);
   }
 
   auto_vec (size_t s CXX_MEM_STAT_INFO)
@@ -1565,7 +1566,8 @@
       }
 
     m_auto.embedded_init (MAX (N, 2), 0, 1);
-    this->m_vec = &m_auto;
+    size_t off = (char *) &m_auto - (char *) this;
+    this->m_vec = (vec<T, va_heap, vl_embed> *) ((char *) this + off);
   }
 
   ~auto_vec ()
fixes the -Werror=stringop-overflow= errors for me during stage3, but
haven't done full bootstrap.
It is very ugly, but works.

	Jakub
  

Patch

diff --git a/gcc/vec.cc b/gcc/vec.cc
index 511e6dff50d..2128f6666b1 100644
--- a/gcc/vec.cc
+++ b/gcc/vec.cc
@@ -568,6 +568,22 @@  test_auto_delete_vec ()
   ASSERT_EQ (dtor_count, 2);
 }
 
+/* Verify accesses to m_vecdata are done indirectly.  */
+
+static void
+test_auto_alias ()
+{
+  volatile int i = 1;
+  auto_vec<int, 8> v;
+  v.quick_grow (2);
+  v[0] = 1;
+  v[1] = 2;
+  int val;
+  for (int ix = i; v.iterate (ix, &val); ix++)
+    ASSERT_EQ (val, 2);
+  ASSERT_EQ (val, 0);
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -587,6 +603,7 @@  vec_cc_tests ()
   test_qsort ();
   test_reverse ();
   test_auto_delete_vec ();
+  test_auto_alias ();
 }
 
 } // namespace selftest
diff --git a/gcc/vec.h b/gcc/vec.h
index 5a2ee9c0294..b680efebe7a 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -586,8 +586,8 @@  public:
   unsigned allocated (void) const { return m_vecpfx.m_alloc; }
   unsigned length (void) const { return m_vecpfx.m_num; }
   bool is_empty (void) const { return m_vecpfx.m_num == 0; }
-  T *address (void) { return m_vecdata; }
-  const T *address (void) const { return m_vecdata; }
+  T *address (void) { return reinterpret_cast <T *> (this + 1); }
+  const T *address (void) const { return reinterpret_cast <const T *> (this + 1); }
   T *begin () { return address (); }
   const T *begin () const { return address (); }
   T *end () { return address () + length (); }
@@ -631,8 +631,7 @@  public:
 
   /* FIXME - These fields should be private, but we need to cater to
 	     compilers that have stricter notions of PODness for types.  */
-  vec_prefix m_vecpfx;
-  T m_vecdata[1];
+  alignas (T) vec_prefix m_vecpfx;
 };
 
 
@@ -1313,7 +1312,7 @@  vec<T, A, vl_embed>::embedded_size (unsigned alloc)
 				    vec, vec_embedded>::type vec_stdlayout;
   static_assert (sizeof (vec_stdlayout) == sizeof (vec), "");
   static_assert (alignof (vec_stdlayout) == alignof (vec), "");
-  return offsetof (vec_stdlayout, m_vecdata) + alloc * sizeof (T);
+  return sizeof (vec_stdlayout) + alloc * sizeof (T);
 }
 
 
@@ -1588,7 +1587,7 @@  public:
 
 private:
   vec<T, va_heap, vl_embed> m_auto;
-  T m_data[MAX (N - 1, 1)];
+  alignas(T) unsigned char m_data[sizeof (T) * N];
 };
 
 /* auto_vec is a sub class of vec whose storage is released when it is