tree: Fix up tree_code_{length,type}

Message ID Y9KjbJA8HZs3nLNX@tucnak
State New

Commit Message

Jakub Jelinek Jan. 26, 2023, 3:59 p.m. UTC
  On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > +
> > +
> >  /* Class of tree given its code.  */
> > -extern const enum tree_code_class tree_code_type[];
> > +constexpr enum tree_code_class tree_code_type[] = {
> > +#include "all-tree.def"
> > +};
> > +
> > +#undef DEFTREECODE
> > +#undef END_OF_BASE_TREE_CODES
> >  
> >  /* Each tree code class has an associated string representation.
> >     These must correspond to the tree_code_class entries.  */
> >  extern const char *const tree_code_class_strings[];
> >  
> >  /* Number of argument-words in each kind of tree-node.  */
> > -extern const unsigned char tree_code_length[];
> > +
> > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > +#define END_OF_BASE_TREE_CODES 0,
> > +constexpr unsigned char tree_code_length[] = {
> > +#include "all-tree.def"
> > +};
> > +
> > +#undef DEFTREECODE
> > +#undef END_OF_BASE_TREE_CODES
> 
> IIUC defining these globals as non-inline constexpr gives them internal
> linkage, and so each TU contains its own unique copy of these globals.
> This bloats cc1plus by a tiny bit and is technically an ODR violation
> because some inline functions such as tree_class_check also ODR-use
> these variables and so each defn of tree_class_check will refer to a
> "different" tree_code_class.  Since inline variables are a C++17
> feature, I guess we could fix this by defining the globals the old way
> before C++17 and as inline constexpr otherwise?

And I'd argue with the tiny bit.
In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
So, that means waste of 555016 .rodata bytes, plus being highly non-cache
friendly.
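
For reference, a minimal sketch of the arithmetic behind the 555016 figure.
It assumes that one copy of each array is needed and only the remaining
copies count as waste; that assumption is not spelled out in the message,
but it reproduces the number exactly.  The helper name is made up for the
illustration.

```cpp
// Assumption: one copy of each table is needed, every further copy is waste.
constexpr long wasted_bytes(long copies, long bytes_per_copy)
{
  return (copies - 1) * bytes_per_copy;
}

// Counts and sizes as reported for the x86_64-linux cc1plus above.
constexpr long length_waste = wasted_bytes(193, 374);   // tree_code_length
constexpr long type_waste = wasted_bytes(324, 1496);    // tree_code_type

static_assert(length_waste + type_waste == 555016,
              "matches the reported .rodata waste");
```

Note also that 1496 is 4 * 374, i.e. the same number of entries with
4-byte instead of 1-byte elements.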

The following patch does that.

So far tested on x86_64-linux in my -O0 working tree (system gcc 12
compiler) where .rodata shrunk with the patch by 928896 bytes, in last
stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
shrunk by 561728 bytes (in neither case .text or most other sections
changed sizes) and on powerpc64le-linux --disable-bootstrap
(system gcc 4.8.5) to test also the non-C++17 case.

Ok for trunk if it passes full bootstrap/regtest?

BTW, wonder if tree_code_type couldn't be an array of unsigned char
elements rather than enum tree_code_class and we'd then cast it
to the enum in the macro, that would shrink that array from 1496 bytes
to 374.  Of course, that sounds like stage1 material.

2023-01-26  Patrick Palka  <ppalka@redhat.com>
	    Jakub Jelinek  <jakub@redhat.com>

	* tree-core.h (tree_code_type, tree_code_length): For
	C++17 and later, add inline keyword, otherwise don't define
	the arrays, but declare extern arrays.
	* tree.cc (tree_code_type, tree_code_length): Define these
	arrays for C++14 and older.


	Jakub
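
The approach described in the ChangeLog entry can be sketched in a single
file as follows.  This is only an illustration under stated assumptions:
DEMO_CODES, DEMO_LENGTH, demo_code_length and demo_operand_count are
hypothetical stand-ins for all-tree.def, DEFTREECODE and tree_code_length,
and the "header" and "source" parts are shown together for brevity.

```cpp
// Hypothetical X-macro list standing in for all-tree.def.
#define DEMO_CODES(X) X(DEMO_ERROR, 0) X(DEMO_PLUS, 2) X(DEMO_NEGATE, 1)
#define DEMO_LENGTH(SYM, LENGTH) LENGTH,

// --- what would live in the header ---
#if __cpp_inline_variables >= 201606L
// C++17 and later: define the table in the header as an inline variable,
// so all translation units share a single copy.
constexpr inline unsigned char demo_code_length[] = { DEMO_CODES(DEMO_LENGTH) };
#else
// C++11/14: only declare the table in the header.
extern const unsigned char demo_code_length[];
#endif

// --- what would live in exactly one source file ---
#if __cpp_inline_variables < 201606L
// C++11/14: the single out-of-line definition.
const unsigned char demo_code_length[] = { DEMO_CODES(DEMO_LENGTH) };
#endif

// A runtime lookup works the same way in both configurations.
unsigned demo_operand_count(unsigned code)
{
  return demo_code_length[code];
}
```

Either way there is one definition of the table per program rather than
one per translation unit.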
  

Comments

Patrick Palka Jan. 26, 2023, 6:03 p.m. UTC | #1
On Thu, 26 Jan 2023, Jakub Jelinek wrote:

> On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > > +
> > > +
> > >  /* Class of tree given its code.  */
> > > -extern const enum tree_code_class tree_code_type[];
> > > +constexpr enum tree_code_class tree_code_type[] = {
> > > +#include "all-tree.def"
> > > +};
> > > +
> > > +#undef DEFTREECODE
> > > +#undef END_OF_BASE_TREE_CODES
> > >  
> > >  /* Each tree code class has an associated string representation.
> > >     These must correspond to the tree_code_class entries.  */
> > >  extern const char *const tree_code_class_strings[];
> > >  
> > >  /* Number of argument-words in each kind of tree-node.  */
> > > -extern const unsigned char tree_code_length[];
> > > +
> > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > > +#define END_OF_BASE_TREE_CODES 0,
> > > +constexpr unsigned char tree_code_length[] = {
> > > +#include "all-tree.def"
> > > +};
> > > +
> > > +#undef DEFTREECODE
> > > +#undef END_OF_BASE_TREE_CODES
> > 
> > IIUC defining these globals as non-inline constexpr gives them internal
> > linkage, and so each TU contains its own unique copy of these globals.
> > This bloats cc1plus by a tiny bit and is technically an ODR violation
> > because some inline functions such as tree_class_check also ODR-use
> > these variables and so each defn of tree_class_check will refer to a
> > "different" tree_code_class.  Since inline variables are a C++17
> > feature, I guess we could fix this by defining the globals the old way
> > before C++17 and as inline constexpr otherwise?
> 
> And I'd argue with the tiny bit.
> In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
> 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
> So, that means waste of 555016 .rodata bytes, plus being highly non-cache
> friendly.
> 
> The following patch does that.
> 
> So far tested on x86_64-linux in my -O0 working tree (system gcc 12
> compiler) where .rodata shrunk with the patch by 928896 bytes, in last
> stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
> shrunk by 561728 bytes (in neither case .text or most other sections
> changed sizes) and on powerpc64le-linux --disable-bootstrap
> (system gcc 4.8.5) to test also the non-C++17 case.

LGTM FWIW.  On a related note I noticed the function
tree.h:tree_operand_length is declared static and is then used in the
non-static inline functions tree_operand_check etc, which also seems to
be a (harmless) ODR violation?

We probably should do s/static inline/inline throughout the header files
at some point, which'd hopefully reduce the size of and speed up stage1
cc1plus.
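
A small sketch of the linkage difference in question, with hypothetical
function names (these are not the tree.h functions themselves):

```cpp
// Hypothetical header contents; the names are illustrative only.

// Internal linkage: every TU that includes the header gets its own entity.
static inline int demo_length_static(int code) { return code + 1; }

// External linkage: all TUs refer to one entity, merged at link time.
inline int demo_length_shared(int code) { return code + 1; }

// An external-linkage inline function that calls the static helper names
// a different helper in each TU, which is the ODR concern raised above.
inline int demo_check_questionable(int code)
{
  return demo_length_static(code);
}

// Calling the external-linkage helper avoids the problem.
inline int demo_check_ok(int code)
{
  return demo_length_shared(code);
}
```

Both variants compute the same result; the difference is only in how many
entities the program ends up with.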

> 
> Ok for trunk if it passes full bootstrap/regtest?
> 
> BTW, wonder if tree_code_type couldn't be an array of unsigned char
> elements rather than enum tree_code_class and we'd then cast it
> to the enum in the macro, that would shrink that array from 1496 bytes
> to 374.  Of course, that sounds like stage1 material.
> 
> 2023-01-26  Patrick Palka  <ppalka@redhat.com>
> 	    Jakub Jelinek  <jakub@redhat.com>
> 
> 	* tree-core.h (tree_code_type, tree_code_length): For
> 	C++17 and later, add inline keyword, otherwise don't define
> 	the arrays, but declare extern arrays.
> 	* tree.cc (tree_code_type, tree_code_length): Define these
> 	arrays for C++14 and older.
> 
> --- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
> +++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
> @@ -2284,17 +2284,20 @@ struct floatn_type_info {
>  /* Matrix describing the structures contained in a given tree code.  */
>  extern bool tree_contains_struct[MAX_TREE_CODES][64];
>  
> +/* Class of tree given its code.  */
> +#if __cpp_inline_variables >= 201606L
>  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
>  #define END_OF_BASE_TREE_CODES tcc_exceptional,
>  
> -
> -/* Class of tree given its code.  */
> -constexpr enum tree_code_class tree_code_type[] = {
> +constexpr inline enum tree_code_class tree_code_type[] = {
>  #include "all-tree.def"
>  };
>  
>  #undef DEFTREECODE
>  #undef END_OF_BASE_TREE_CODES
> +#else
> +extern const enum tree_code_class tree_code_type[];
> +#endif
>  
>  /* Each tree code class has an associated string representation.
>     These must correspond to the tree_code_class entries.  */
> @@ -2302,14 +2305,18 @@ extern const char *const tree_code_class
>  
>  /* Number of argument-words in each kind of tree-node.  */
>  
> +#if __cpp_inline_variables >= 201606L
>  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
>  #define END_OF_BASE_TREE_CODES 0,
> -constexpr unsigned char tree_code_length[] = {
> +constexpr inline unsigned char tree_code_length[] = {
>  #include "all-tree.def"
>  };
>  
>  #undef DEFTREECODE
>  #undef END_OF_BASE_TREE_CODES
> +#else
> +extern const unsigned char tree_code_length[];
> +#endif
>  
>  /* Vector of all alias pairs for global symbols.  */
>  extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
> --- gcc/tree.cc.jj	2023-01-13 17:37:45.259482663 +0100
> +++ gcc/tree.cc	2023-01-26 16:03:59.796878082 +0100
> @@ -74,7 +74,33 @@ along with GCC; see the file COPYING3.
>  #include "asan.h"
>  #include "ubsan.h"
>  
> +#if __cpp_inline_variables < 201606L
> +/* Tree code classes.  */
>  
> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> +
> +const enum tree_code_class tree_code_type[] = {
> +#include "all-tree.def"
> +};
> +
> +#undef DEFTREECODE
> +#undef END_OF_BASE_TREE_CODES
> +
> +/* Table indexed by tree code giving number of expression
> +   operands beyond the fixed part of the node structure.
> +   Not used for types or decls.  */
> +
> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> +#define END_OF_BASE_TREE_CODES 0,
> +
> +const unsigned char tree_code_length[] = {
> +#include "all-tree.def"
> +};
> +
> +#undef DEFTREECODE
> +#undef END_OF_BASE_TREE_CODES
> +#endif
>  
>  /* Names of tree components.
>     Used for printing out the tree and error messages.  */
> 
> 	Jakub
> 
>
  
Richard Biener Jan. 27, 2023, 7:42 a.m. UTC | #2
On Thu, 26 Jan 2023, Jakub Jelinek wrote:

> On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > > +
> > > +
> > >  /* Class of tree given its code.  */
> > > -extern const enum tree_code_class tree_code_type[];
> > > +constexpr enum tree_code_class tree_code_type[] = {
> > > +#include "all-tree.def"
> > > +};
> > > +
> > > +#undef DEFTREECODE
> > > +#undef END_OF_BASE_TREE_CODES
> > >  
> > >  /* Each tree code class has an associated string representation.
> > >     These must correspond to the tree_code_class entries.  */
> > >  extern const char *const tree_code_class_strings[];
> > >  
> > >  /* Number of argument-words in each kind of tree-node.  */
> > > -extern const unsigned char tree_code_length[];
> > > +
> > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > > +#define END_OF_BASE_TREE_CODES 0,
> > > +constexpr unsigned char tree_code_length[] = {
> > > +#include "all-tree.def"
> > > +};
> > > +
> > > +#undef DEFTREECODE
> > > +#undef END_OF_BASE_TREE_CODES
> > 
> > IIUC defining these globals as non-inline constexpr gives them internal
> > linkage, and so each TU contains its own unique copy of these globals.
> > This bloats cc1plus by a tiny bit and is technically an ODR violation
> > because some inline functions such as tree_class_check also ODR-use
> > these variables and so each defn of tree_class_check will refer to a
> > "different" tree_code_class.  Since inline variables are a C++17
> > feature, I guess we could fix this by defining the globals the old way
> > before C++17 and as inline constexpr otherwise?
> 
> And I'd argue with the tiny bit.
> In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
> 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
> So, that means waste of 555016 .rodata bytes, plus being highly non-cache
> friendly.
> 
> The following patch does that.
> 
> So far tested on x86_64-linux in my -O0 working tree (system gcc 12
> compiler) where .rodata shrunk with the patch by 928896 bytes, in last
> stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
> shrunk by 561728 bytes (in neither case .text or most other sections
> changed sizes) and on powerpc64le-linux --disable-bootstrap
> (system gcc 4.8.5) to test also the non-C++17 case.
> 
> Ok for trunk if it passes full bootstrap/regtest?
> 
> BTW, wonder if tree_code_type couldn't be an array of unsigned char
> elements rather than enum tree_code_class and we'd then cast it
> to the enum in the macro, that would shrink that array from 1496 bytes
> to 374.  Of course, that sounds like stage1 material.

One could argue the same way for this patch (and instead revert),
I'd say if we tweak this now then tweak it to the maximum extent?
Isn't sth like 'enum unsigned char tree_code_class' now possible?
(and a static assert the enum values all fit, though that would
be diagnosed anyway?)

> 2023-01-26  Patrick Palka  <ppalka@redhat.com>
> 	    Jakub Jelinek  <jakub@redhat.com>
> 
> 	* tree-core.h (tree_code_type, tree_code_length): For
> 	C++17 and later, add inline keyword, otherwise don't define
> 	the arrays, but declare extern arrays.
> 	* tree.cc (tree_code_type, tree_code_length): Define these
> 	arrays for C++14 and older.
> 
> --- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
> +++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
> @@ -2284,17 +2284,20 @@ struct floatn_type_info {
>  /* Matrix describing the structures contained in a given tree code.  */
>  extern bool tree_contains_struct[MAX_TREE_CODES][64];
>  
> +/* Class of tree given its code.  */
> +#if __cpp_inline_variables >= 201606L
>  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
>  #define END_OF_BASE_TREE_CODES tcc_exceptional,
>  
> -
> -/* Class of tree given its code.  */
> -constexpr enum tree_code_class tree_code_type[] = {
> +constexpr inline enum tree_code_class tree_code_type[] = {
>  #include "all-tree.def"
>  };

Do we need an explicit external definition somewhere when
constant folding isn't possible?

Otherwise looks good to me.

Thanks,
Richard.

  
Jakub Jelinek Jan. 27, 2023, 8:57 a.m. UTC | #3
On Fri, Jan 27, 2023 at 07:42:39AM +0000, Richard Biener wrote:
> > BTW, wonder if tree_code_type couldn't be an array of unsigned char
> > elements rather than enum tree_code_class and we'd then cast it
> > to the enum in the macro, that would shrink that array from 1496 bytes
> > to 374.  Of course, that sounds like stage1 material.
> 
> One could argue the same way for this patch (and instead revert),

Well, this patch is in fact a conditional reversion (revert for
C++11/14, add one keyword to 2 declarations otherwise).

> I'd say if we tweak this now then tweak it to the maximum extent?
> Isn't sth like 'enum unsigned char tree_code_class' now possible?
> (and a static assert the enum values all fit, though that would
> be diagnosed anyway?)

C++11 indeed has
enum tree_code_class : unsigned char {
  tcc_exceptional,
  ...
  tcc_expression
};
and one indeed gets an error if some enumerator doesn't fit.
The problem I see with this is that the type is 8-bit everywhere,
which I'd be afraid could cause worse code generation (of course,
one would need to try to see how much; e.g. build the compiler
unmodified, with the unsigned char array plus explicit casts from
the array and finally with unsigned char as underlying type).
When passing around enum tree_code_class etc., it is fine if it
is 32-bit.  And there isn't a way to create an enum with different
underlying type but with the same enumerators as in another enum.
Perhaps for tree_code_class we could get away with the underlying type
because it is mostly used in the macros which immediately compare
it, in gcc/*.cc just in the following explicitly:
expr.cc:get_def_for_expr_class (tree name, enum tree_code_class tclass)
fold-const.cc:  enum tree_code_class tclass;
fold-const.cc:  enum tree_code_class tclass = TREE_CODE_CLASS (code);
fold-const.cc:  enum tree_code_class tclass = TREE_CODE_CLASS (code);
fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
gimple-fold.cc:              enum tree_code_class kind = TREE_CODE_CLASS (subcode);
print-tree.cc:  enum tree_code_class tclass;
print-tree.cc:  enum tree_code_class tclass;
tree.cc:   These must correspond to the tree_code_class entries.  */
tree.cc:const char *const tree_code_class_strings[] =
tree.cc:  enum tree_code_class type = TREE_CODE_CLASS (code);
tree.cc:  enum tree_code_class type = TREE_CODE_CLASS (code);
tree.cc:tree_class_check_failed (const_tree node, const enum tree_code_class cl,
tree.cc:tree_not_class_check_failed (const_tree node, const enum tree_code_class cl,
tree.cc:  const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
tree.cc:  const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
tree-dump.cc:  enum tree_code_class code_class;
tree-inline.cc:  enum tree_code_class cl = TREE_CODE_CLASS (code);
tree-pretty-print.cc:	enum tree_code_class tclass;
tree-ssa-live.cc:  enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
tree-ssa-operands.cc:  enum tree_code_class codeclass;
But as I said, one would need to watch for code generation at least on
a couple of common hosts, and while x86_64 should be one of them, it might
have bigger effects on others as x86 has byte comparison etc. instructions.
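
To make the two alternatives concrete, here is a minimal sketch with
hypothetical names (demo_class, demo_small_class, demo_class_table and
demo_class_of are not GCC identifiers).  It only shows the storage side
and says nothing about the code generation question, which would have to
be measured.

```cpp
#include <cstddef>

// Enum without a fixed underlying type: values passed around stay as wide
// as the implementation chooses (commonly int-sized).
enum demo_class { dc_exceptional, dc_constant, dc_expression };

// Alternative 1: narrow table plus an explicit cast on access; the enum
// itself keeps its normal width everywhere else.
constexpr unsigned char demo_class_table[] = {
  dc_exceptional, dc_constant, dc_expression, dc_expression
};

constexpr demo_class demo_class_of(std::size_t code)
{
  return static_cast<demo_class>(demo_class_table[code]);
}

// Alternative 2: C++11 fixed underlying type; the enum is 8-bit everywhere.
enum demo_small_class : unsigned char {
  dsc_exceptional, dsc_constant, dsc_expression
};

static_assert(sizeof(demo_class_table) == 4, "one byte per table entry");
static_assert(sizeof(demo_small_class) == 1, "8-bit enum");
```

With the first alternative only the table shrinks; with the second every
variable of the enum type becomes a byte as well.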

> 
> > 2023-01-26  Patrick Palka  <ppalka@redhat.com>
> > 	    Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	* tree-core.h (tree_code_type, tree_code_length): For
> > 	C++17 and later, add inline keyword, otherwise don't define
> > 	the arrays, but declare extern arrays.
> > 	* tree.cc (tree_code_type, tree_code_length): Define these
> > 	arrays for C++14 and older.
> > 
> > --- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
> > +++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
> > @@ -2284,17 +2284,20 @@ struct floatn_type_info {
> >  /* Matrix describing the structures contained in a given tree code.  */
> >  extern bool tree_contains_struct[MAX_TREE_CODES][64];
> >  
> > +/* Class of tree given its code.  */
> > +#if __cpp_inline_variables >= 201606L
> >  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> >  #define END_OF_BASE_TREE_CODES tcc_exceptional,
> >  
> > -
> > -/* Class of tree given its code.  */
> > -constexpr enum tree_code_class tree_code_type[] = {
> > +constexpr inline enum tree_code_class tree_code_type[] = {
> >  #include "all-tree.def"
> >  };
> 
> Do we need an explicit external definition somewhere when
> constant folding isn't possible?

> 
> Otherwise looks good to me.
> 
> Thanks,
> Richard.
> 
> >  #undef DEFTREECODE
> >  #undef END_OF_BASE_TREE_CODES
> > +#else
> > +extern const enum tree_code_class tree_code_type[];

There is one here for the C++11 and C++14 cases.
For C++17 and later it isn't needed,
constexpr inline enum tree_code_class tree_code_type[] = {
...
};
means this is a comdat variable in all TUs which ODR-use it
(tree_code_type[23] evaluates to a constant expression, but
tree_code_type[x] or &tree_code_type[23] etc. often don't); in those
TUs the comdat var is emitted and all TUs share one copy of the variable.
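
A minimal sketch of that distinction, assuming a C++17 compiler and using
a hypothetical demo_table in place of tree_code_type:

```cpp
// Hypothetical stand-in for tree_code_type; requires C++17.
inline constexpr unsigned char demo_table[] = { 3, 1, 4, 1, 5 };

// Constant index: the value is a constant expression, so no reference to
// the array's storage is required.
constexpr unsigned char demo_folded = demo_table[2];
static_assert(demo_folded == 4);

// Runtime index: the array object itself is needed.
unsigned char demo_lookup(unsigned i)
{
  return demo_table[i];
}

// Taking an address also needs the array object.
const unsigned char *demo_entry_address()
{
  return &demo_table[2];
}
```

Only uses like the last two require the variable to be emitted at all.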

	Jakub
  
Richard Biener Jan. 27, 2023, 9:49 a.m. UTC | #4
On Fri, 27 Jan 2023, Jakub Jelinek wrote:

> On Fri, Jan 27, 2023 at 07:42:39AM +0000, Richard Biener wrote:
> > > BTW, wonder if tree_code_type couldn't be an array of unsigned char
> > > elements rather than enum tree_code_class and we'd then cast it
> > > to the enum in the macro, that would shrink that array from 1496 bytes
> > > to 374.  Of course, that sounds like stage1 material.
> > 
> > One could argue the same way for this patch (and instead revert),
> 
> Well, this patch is in fact a conditional reversion (revert for
> C++11/14, add one keyword to 2 declarations otherwise).
> 
> > I'd say if we tweak this now then tweak it to the maximum extent?
> > Isn't sth like 'enum unsigned char tree_code_class' now possible?
> > (and a static assert the enum values all fit, though that would
> > be diagnosed anyway?)
> 
> C++11 indeed has
> enum tree_code_class : unsigned char {
>   tcc_exceptional,
>   ...
>   tcc_expression
> };
> and one indeed gets an error if some enumerator doesn't fit.
> The problem I see with this is that the type is 8-bit everywhere,
> which I'd be afraid could cause worse code generation (of course,
> one would need to try to see how much; e.g. build the compiler
> unmodified, with the unsigned char array plus explicit casts from
> the array and finally with unsigned char as underlying type).
> When passing around enum tree_code_class etc., it is fine if it
> is 32-bit.  And there isn't a way to create an enum with different
> underlying type but with the same enumerators as in another enum.
> Perhaps for tree_code_class we could get away with the underlying type
> because it is mostly used in the macros which immediately compare
> it, in gcc/*.cc just in the following explicitly:
> expr.cc:get_def_for_expr_class (tree name, enum tree_code_class tclass)
> fold-const.cc:  enum tree_code_class tclass;
> fold-const.cc:  enum tree_code_class tclass = TREE_CODE_CLASS (code);
> fold-const.cc:  enum tree_code_class tclass = TREE_CODE_CLASS (code);
> fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
> fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
> fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
> fold-const.cc:  enum tree_code_class kind = TREE_CODE_CLASS (code);
> gimple-fold.cc:              enum tree_code_class kind = TREE_CODE_CLASS (subcode);
> print-tree.cc:  enum tree_code_class tclass;
> print-tree.cc:  enum tree_code_class tclass;
> tree.cc:   These must correspond to the tree_code_class entries.  */
> tree.cc:const char *const tree_code_class_strings[] =
> tree.cc:  enum tree_code_class type = TREE_CODE_CLASS (code);
> tree.cc:  enum tree_code_class type = TREE_CODE_CLASS (code);
> tree.cc:tree_class_check_failed (const_tree node, const enum tree_code_class cl,
> tree.cc:tree_not_class_check_failed (const_tree node, const enum tree_code_class cl,
> tree.cc:  const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
> tree.cc:  const enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
> tree-dump.cc:  enum tree_code_class code_class;
> tree-inline.cc:  enum tree_code_class cl = TREE_CODE_CLASS (code);
> tree-pretty-print.cc:	enum tree_code_class tclass;
> tree-ssa-live.cc:  enum tree_code_class c = TREE_CODE_CLASS (TREE_CODE (t));
> tree-ssa-operands.cc:  enum tree_code_class codeclass;
> But as I said, one would need to watch for code generation at least on
> a couple of common hosts, and while x86_64 should be one of them, it might
> have bigger effects on others as x86 has byte comparison etc. instructions.

Hm, yes.  Not sure if using uint_fast8_t would make a difference where
it should.  So let's keep this change separate.

Richard.

  
Patrick Palka Jan. 27, 2023, 12:40 p.m. UTC | #5
On Thu, 26 Jan 2023, Patrick Palka wrote:

> On Thu, 26 Jan 2023, Jakub Jelinek wrote:
> 
> > On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
> > > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > > > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > > > +
> > > > +
> > > >  /* Class of tree given its code.  */
> > > > -extern const enum tree_code_class tree_code_type[];
> > > > +constexpr enum tree_code_class tree_code_type[] = {
> > > > +#include "all-tree.def"
> > > > +};
> > > > +
> > > > +#undef DEFTREECODE
> > > > +#undef END_OF_BASE_TREE_CODES
> > > >  
> > > >  /* Each tree code class has an associated string representation.
> > > >     These must correspond to the tree_code_class entries.  */
> > > >  extern const char *const tree_code_class_strings[];
> > > >  
> > > >  /* Number of argument-words in each kind of tree-node.  */
> > > > -extern const unsigned char tree_code_length[];
> > > > +
> > > > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > > > +#define END_OF_BASE_TREE_CODES 0,
> > > > +constexpr unsigned char tree_code_length[] = {
> > > > +#include "all-tree.def"
> > > > +};
> > > > +
> > > > +#undef DEFTREECODE
> > > > +#undef END_OF_BASE_TREE_CODES
> > > 
> > > IIUC defining these globals as non-inline constexpr gives them internal
> > > linkage, and so each TU contains its own unique copy of these globals.
> > > This bloats cc1plus by a tiny bit and is technically an ODR violation
> > > because some inline functions such as tree_class_check also ODR-use
> > > these variables and so each defn of tree_class_check will refer to a
> > > "different" tree_code_class.  Since inline variables are a C++17
> > > feature, I guess we could fix this by defining the globals the old way
> > > before C++17 and as inline constexpr otherwise?
> > 
> > And I'd argue with the tiny bit.
> > In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
> > 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
> > So, that means waste of 555016 .rodata bytes, plus being highly non-cache
> > friendly.
> > 
> > The following patch does that.
> > 
> > So far tested on x86_64-linux in my -O0 working tree (system gcc 12
> > compiler) where .rodata shrunk with the patch by 928896 bytes, in last
> > stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
> > shrunk by 561728 bytes (in neither case .text or most other sections
> > changed sizes) and on powerpc64le-linux --disable-bootstrap
> > (system gcc 4.8.5) to test also the non-C++17 case.
> 
> LGTM FWIW.  On a related note I noticed the function
> tree.h:tree_operand_length is declared static and is then used in the
> non-static inline functions tree_operand_check etc, which also seems to
> be a (harmless) ODR violation?
> 
> We probably should do s/static inline/inline throughout the header files
> at some point, which'd hopefully reduce the size of and speed up stage1
> cc1plus.

Mechanically replacing uses of static inline in headers via

  echo gcc/*.h gcc/*/*.h | xargs sed -i 's/^static inline/inline/g'

reduces rodata size of stage1 cc1plus by ~1.5MB and seems to make it ~2%
faster.  Not bad..

> 
> > 
> > Ok for trunk if it passes full bootstrap/regtest?
> > 
> > BTW, wonder if tree_code_type couldn't be an array of unsigned char
> > elements rather than enum tree_code_class and we'd then cast it
> > to the enum in the macro, that would shrink that array from 1496 bytes
> > to 374.  Of course, that sounds like stage1 material.
> > 
> > 2023-01-26  Patrick Palka  <ppalka@redhat.com>
> > 	    Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	* tree-core.h (tree_code_type, tree_code_length): For
> > 	C++17 and later, add inline keyword, otherwise don't define
> > 	the arrays, but declare extern arrays.
> > 	* tree.cc (tree_code_type, tree_code_length): Define these
> > 	arrays for C++14 and older.
> > 
> > --- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
> > +++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
> > @@ -2284,17 +2284,20 @@ struct floatn_type_info {
> >  /* Matrix describing the structures contained in a given tree code.  */
> >  extern bool tree_contains_struct[MAX_TREE_CODES][64];
> >  
> > +/* Class of tree given its code.  */
> > +#if __cpp_inline_variables >= 201606L
> >  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> >  #define END_OF_BASE_TREE_CODES tcc_exceptional,
> >  
> > -
> > -/* Class of tree given its code.  */
> > -constexpr enum tree_code_class tree_code_type[] = {
> > +constexpr inline enum tree_code_class tree_code_type[] = {
> >  #include "all-tree.def"
> >  };
> >  
> >  #undef DEFTREECODE
> >  #undef END_OF_BASE_TREE_CODES
> > +#else
> > +extern const enum tree_code_class tree_code_type[];
> > +#endif
> >  
> >  /* Each tree code class has an associated string representation.
> >     These must correspond to the tree_code_class entries.  */
> > @@ -2302,14 +2305,18 @@ extern const char *const tree_code_class
> >  
> >  /* Number of argument-words in each kind of tree-node.  */
> >  
> > +#if __cpp_inline_variables >= 201606L
> >  #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> >  #define END_OF_BASE_TREE_CODES 0,
> > -constexpr unsigned char tree_code_length[] = {
> > +constexpr inline unsigned char tree_code_length[] = {
> >  #include "all-tree.def"
> >  };
> >  
> >  #undef DEFTREECODE
> >  #undef END_OF_BASE_TREE_CODES
> > +#else
> > +extern const unsigned char tree_code_length[];
> > +#endif
> >  
> >  /* Vector of all alias pairs for global symbols.  */
> >  extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
> > --- gcc/tree.cc.jj	2023-01-13 17:37:45.259482663 +0100
> > +++ gcc/tree.cc	2023-01-26 16:03:59.796878082 +0100
> > @@ -74,7 +74,33 @@ along with GCC; see the file COPYING3.
> >  #include "asan.h"
> >  #include "ubsan.h"
> >  
> > +#if __cpp_inline_variables < 201606L
> > +/* Tree code classes.  */
> >  
> > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
> > +#define END_OF_BASE_TREE_CODES tcc_exceptional,
> > +
> > +const enum tree_code_class tree_code_type[] = {
> > +#include "all-tree.def"
> > +};
> > +
> > +#undef DEFTREECODE
> > +#undef END_OF_BASE_TREE_CODES
> > +
> > +/* Table indexed by tree code giving number of expression
> > +   operands beyond the fixed part of the node structure.
> > +   Not used for types or decls.  */
> > +
> > +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
> > +#define END_OF_BASE_TREE_CODES 0,
> > +
> > +const unsigned char tree_code_length[] = {
> > +#include "all-tree.def"
> > +};
> > +
> > +#undef DEFTREECODE
> > +#undef END_OF_BASE_TREE_CODES
> > +#endif
> >  
> >  /* Names of tree components.
> >     Used for printing out the tree and error messages.  */
> > 
> > 	Jakub
> > 
> > 
>
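The suggestion quoted above, storing tree_code_type as unsigned char and casting back to the enum in the accessor macro, can be sketched as follows. This is only an illustration under stated assumptions: demo_code_class, demo_code_type_raw and DEMO_CODE_CLASS are hypothetical stand-ins rather than names from the GCC sources, and the 4x saving assumes the enum's underlying type is 4 bytes wide, as the 1496 vs. 374 figures imply.

```cpp
#include <cstddef>

// Hypothetical stand-in for enum tree_code_class (assumption, not GCC's enum).
enum demo_code_class
{
  demo_exceptional,
  demo_constant,
  demo_type,
  demo_declaration
};

// Table with enum elements: each entry takes sizeof (demo_code_class) bytes.
constexpr demo_code_class demo_code_type_enum[] = {
  demo_exceptional, demo_constant, demo_type, demo_declaration
};

// Same table with unsigned char elements: one byte per entry.
constexpr unsigned char demo_code_type_raw[] = {
  demo_exceptional, demo_constant, demo_type, demo_declaration
};

// Accessor macro that casts the stored byte back to the enum.
#define DEMO_CODE_CLASS(CODE) \
  (static_cast<demo_code_class> (demo_code_type_raw[(CODE)]))

// The byte table is never larger than the enum table.
static_assert (sizeof (demo_code_type_raw) <= sizeof (demo_code_type_enum),
	       "unsigned char storage should not be larger");
```

Callers would keep using the macro and would not see the storage change; only the table definition and the macro need to agree on the element type.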
  
Richard Biener Jan. 27, 2023, 1:14 p.m. UTC | #6
> On 27.01.2023 at 13:41, Patrick Palka via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> On Thu, 26 Jan 2023, Patrick Palka wrote:
> 
>>> On Thu, 26 Jan 2023, Jakub Jelinek wrote:
>>> 
>>> On Thu, Jan 26, 2023 at 09:45:35AM -0500, Patrick Palka via Gcc-patches wrote:
>>>>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
>>>>> +#define END_OF_BASE_TREE_CODES tcc_exceptional,
>>>>> +
>>>>> +
>>>>> /* Class of tree given its code.  */
>>>>> -extern const enum tree_code_class tree_code_type[];
>>>>> +constexpr enum tree_code_class tree_code_type[] = {
>>>>> +#include "all-tree.def"
>>>>> +};
>>>>> +
>>>>> +#undef DEFTREECODE
>>>>> +#undef END_OF_BASE_TREE_CODES
>>>>> 
>>>>> /* Each tree code class has an associated string representation.
>>>>>    These must correspond to the tree_code_class entries.  */
>>>>> extern const char *const tree_code_class_strings[];
>>>>> 
>>>>> /* Number of argument-words in each kind of tree-node.  */
>>>>> -extern const unsigned char tree_code_length[];
>>>>> +
>>>>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
>>>>> +#define END_OF_BASE_TREE_CODES 0,
>>>>> +constexpr unsigned char tree_code_length[] = {
>>>>> +#include "all-tree.def"
>>>>> +};
>>>>> +
>>>>> +#undef DEFTREECODE
>>>>> +#undef END_OF_BASE_TREE_CODES
>>>> 
>>>> IIUC defining these globals as non-inline constexpr gives them internal
>>>> linkage, and so each TU contains its own unique copy of these globals.
>>>> This bloats cc1plus by a tiny bit and is technically an ODR violation
>>>> because some inline functions such as tree_class_check also ODR-use
>>>> these variables and so each defn of tree_class_check will refer to a
>>>> "different" tree_code_class.  Since inline variables are a C++17
>>>> feature, I guess we could fix this by defining the globals the old way
>>>> before C++17 and as inline constexpr otherwise?
>>> 
>>> And I'd argue with the tiny bit.
>>> In my x86_64-linux cc1plus from today, I see 193 _ZL16tree_code_length vars,
>>> 374 bytes each, and 324 _ZL14tree_code_type vars, 1496 bytes each.
>>> So, that means waste of 555016 .rodata bytes, plus being highly non-cache
>>> friendly.
>>> 
>>> The following patch does that.
>>> 
>>> So far tested on x86_64-linux in my -O0 working tree (system gcc 12
>>> compiler) where .rodata shrunk with the patch by 928896 bytes, in last
>>> stage of a bootstrapped tree (built by today's prev-gcc) where .rodata
>>> shrunk by 561728 bytes (in neither case .text or most other sections
>>> changed sizes) and on powerpc64le-linux --disable-bootstrap
>>> (system gcc 4.8.5) to test also the non-C++17 case.
>> 
>> LGTM FWIW.  On a related note I noticed the function
>> tree.h:tree_operand_length is declared static and is then used in the
>> non-static inline functions tree_operand_check etc, which seems to
>> also be a (harmless) ODR violation?
>> 
>> We probably should do s/static inline/inline throughout the header files
>> at some point, which'd hopefully reduce the size of and speed up stage1
>> cc1plus.
> 
> Mechanically replacing uses of static inline in headers via
> 
>  echo gcc/*.h gcc/*/*.h | xargs sed -i 's/^static inline/inline/g'
> 
> reduces rodata size of stage1 cc1plus by ~1.5MB and seems to make it ~2%
> faster.  Not bad..

Nice.

Richard 


>> 
>>> 
>>> Ok for trunk if it passes full bootstrap/regtest?
>>> 
>>> BTW, wonder if tree_code_type couldn't be an array of unsigned char
>>> elements rather than enum tree_code_class and we'd then cast it
>>> to the enum in the macro, that would shrink that array from 1496 bytes
>>> to 374.  Of course, that sounds like stage1 material.
>>> 
>>> 2023-01-26  Patrick Palka  <ppalka@redhat.com>
>>>        Jakub Jelinek  <jakub@redhat.com>
>>> 
>>>    * tree-core.h (tree_code_type, tree_code_length): For
>>>    C++17 and later, add inline keyword, otherwise don't define
>>>    the arrays, but declare extern arrays.
>>>    * tree.cc (tree_code_type, tree_code_length): Define these
>>>    arrays for C++14 and older.
>>> 
>>> --- gcc/tree-core.h.jj    2023-01-02 09:32:31.188158094 +0100
>>> +++ gcc/tree-core.h    2023-01-26 16:02:34.212113251 +0100
>>> @@ -2284,17 +2284,20 @@ struct floatn_type_info {
>>> /* Matrix describing the structures contained in a given tree code.  */
>>> extern bool tree_contains_struct[MAX_TREE_CODES][64];
>>> 
>>> +/* Class of tree given its code.  */
>>> +#if __cpp_inline_variables >= 201606L
>>> #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
>>> #define END_OF_BASE_TREE_CODES tcc_exceptional,
>>> 
>>> -
>>> -/* Class of tree given its code.  */
>>> -constexpr enum tree_code_class tree_code_type[] = {
>>> +constexpr inline enum tree_code_class tree_code_type[] = {
>>> #include "all-tree.def"
>>> };
>>> 
>>> #undef DEFTREECODE
>>> #undef END_OF_BASE_TREE_CODES
>>> +#else
>>> +extern const enum tree_code_class tree_code_type[];
>>> +#endif
>>> 
>>> /* Each tree code class has an associated string representation.
>>>    These must correspond to the tree_code_class entries.  */
>>> @@ -2302,14 +2305,18 @@ extern const char *const tree_code_class
>>> 
>>> /* Number of argument-words in each kind of tree-node.  */
>>> 
>>> +#if __cpp_inline_variables >= 201606L
>>> #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
>>> #define END_OF_BASE_TREE_CODES 0,
>>> -constexpr unsigned char tree_code_length[] = {
>>> +constexpr inline unsigned char tree_code_length[] = {
>>> #include "all-tree.def"
>>> };
>>> 
>>> #undef DEFTREECODE
>>> #undef END_OF_BASE_TREE_CODES
>>> +#else
>>> +extern const unsigned char tree_code_length[];
>>> +#endif
>>> 
>>> /* Vector of all alias pairs for global symbols.  */
>>> extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
>>> --- gcc/tree.cc.jj    2023-01-13 17:37:45.259482663 +0100
>>> +++ gcc/tree.cc    2023-01-26 16:03:59.796878082 +0100
>>> @@ -74,7 +74,33 @@ along with GCC; see the file COPYING3.
>>> #include "asan.h"
>>> #include "ubsan.h"
>>> 
>>> +#if __cpp_inline_variables < 201606L
>>> +/* Tree code classes.  */
>>> 
>>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
>>> +#define END_OF_BASE_TREE_CODES tcc_exceptional,
>>> +
>>> +const enum tree_code_class tree_code_type[] = {
>>> +#include "all-tree.def"
>>> +};
>>> +
>>> +#undef DEFTREECODE
>>> +#undef END_OF_BASE_TREE_CODES
>>> +
>>> +/* Table indexed by tree code giving number of expression
>>> +   operands beyond the fixed part of the node structure.
>>> +   Not used for types or decls.  */
>>> +
>>> +#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
>>> +#define END_OF_BASE_TREE_CODES 0,
>>> +
>>> +const unsigned char tree_code_length[] = {
>>> +#include "all-tree.def"
>>> +};
>>> +
>>> +#undef DEFTREECODE
>>> +#undef END_OF_BASE_TREE_CODES
>>> +#endif
>>> 
>>> /* Names of tree components.
>>>    Used for printing out the tree and error messages.  */
>>> 
>>>    Jakub
>>> 
>>> 
>> 
>
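For reference, the 555016-byte figure quoted above appears to count every per-TU copy except the one copy of each array that is needed anyway. A small check of that arithmetic, using only the counts and sizes from the quoted mail (the variable names are made up for this sketch):

```cpp
// Counts and sizes as reported for the x86_64-linux cc1plus in the thread.
constexpr unsigned long length_copies = 193; // _ZL16tree_code_length instances
constexpr unsigned long length_bytes = 374;  // bytes per tree_code_length copy
constexpr unsigned long type_copies = 324;   // _ZL14tree_code_type instances
constexpr unsigned long type_bytes = 1496;   // bytes per tree_code_type copy

// One copy of each array is required; every further copy is waste.
constexpr unsigned long wasted_rodata_bytes
  = (length_copies - 1) * length_bytes + (type_copies - 1) * type_bytes;

static_assert (wasted_rodata_bytes == 555016,
	       "matches the figure quoted in the thread");
```

The measured .rodata reductions reported in the same mail (928896 and 561728 bytes) are somewhat larger than this estimate; the thread does not break down where the remainder comes from.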
  

Patch

--- gcc/tree-core.h.jj	2023-01-02 09:32:31.188158094 +0100
+++ gcc/tree-core.h	2023-01-26 16:02:34.212113251 +0100
@@ -2284,17 +2284,20 @@  struct floatn_type_info {
 /* Matrix describing the structures contained in a given tree code.  */
 extern bool tree_contains_struct[MAX_TREE_CODES][64];
 
+/* Class of tree given its code.  */
+#if __cpp_inline_variables >= 201606L
 #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
 #define END_OF_BASE_TREE_CODES tcc_exceptional,
 
-
-/* Class of tree given its code.  */
-constexpr enum tree_code_class tree_code_type[] = {
+constexpr inline enum tree_code_class tree_code_type[] = {
 #include "all-tree.def"
 };
 
 #undef DEFTREECODE
 #undef END_OF_BASE_TREE_CODES
+#else
+extern const enum tree_code_class tree_code_type[];
+#endif
 
 /* Each tree code class has an associated string representation.
    These must correspond to the tree_code_class entries.  */
@@ -2302,14 +2305,18 @@  extern const char *const tree_code_class
 
 /* Number of argument-words in each kind of tree-node.  */
 
+#if __cpp_inline_variables >= 201606L
 #define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
 #define END_OF_BASE_TREE_CODES 0,
-constexpr unsigned char tree_code_length[] = {
+constexpr inline unsigned char tree_code_length[] = {
 #include "all-tree.def"
 };
 
 #undef DEFTREECODE
 #undef END_OF_BASE_TREE_CODES
+#else
+extern const unsigned char tree_code_length[];
+#endif
 
 /* Vector of all alias pairs for global symbols.  */
 extern GTY(()) vec<alias_pair, va_gc> *alias_pairs;
--- gcc/tree.cc.jj	2023-01-13 17:37:45.259482663 +0100
+++ gcc/tree.cc	2023-01-26 16:03:59.796878082 +0100
@@ -74,7 +74,33 @@  along with GCC; see the file COPYING3.
 #include "asan.h"
 #include "ubsan.h"
 
+#if __cpp_inline_variables < 201606L
+/* Tree code classes.  */
 
+#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) TYPE,
+#define END_OF_BASE_TREE_CODES tcc_exceptional,
+
+const enum tree_code_class tree_code_type[] = {
+#include "all-tree.def"
+};
+
+#undef DEFTREECODE
+#undef END_OF_BASE_TREE_CODES
+
+/* Table indexed by tree code giving number of expression
+   operands beyond the fixed part of the node structure.
+   Not used for types or decls.  */
+
+#define DEFTREECODE(SYM, NAME, TYPE, LENGTH) LENGTH,
+#define END_OF_BASE_TREE_CODES 0,
+
+const unsigned char tree_code_length[] = {
+#include "all-tree.def"
+};
+
+#undef DEFTREECODE
+#undef END_OF_BASE_TREE_CODES
+#endif
 
 /* Names of tree components.
    Used for printing out the tree and error messages.  */
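The shape of the patch can be reduced to a single-file sketch. Everything named demo_* or DEMO_* below is hypothetical, and the three-entry DEMO_ALL_CODES list merely stands in for all-tree.def. With C++17 inline variables the table is defined once in the header and all TUs share it; otherwise the header only declares it and a single source file provides the definition.

```cpp
// Hypothetical three-entry stand-in for all-tree.def (assumption).
#define DEMO_ALL_CODES \
  DEFDEMOCODE (DEMO_ERROR, 0) \
  DEFDEMOCODE (DEMO_PLUS, 2) \
  DEFDEMOCODE (DEMO_COND, 3)

#if __cpp_inline_variables >= 201606L
/* "Header" part, C++17 and later: an inline variable has external linkage
   and a single definition shared by every TU that includes the header.  */
#define DEFDEMOCODE(SYM, LENGTH) LENGTH,
constexpr inline unsigned char demo_code_length[] = { DEMO_ALL_CODES };
#undef DEFDEMOCODE
#else
/* "Header" part, C++14 and older: declaration only.  */
extern const unsigned char demo_code_length[];

/* "Source" part: this definition would live in exactly one .cc file.  */
#define DEFDEMOCODE(SYM, LENGTH) LENGTH,
const unsigned char demo_code_length[] = { DEMO_ALL_CODES };
#undef DEFDEMOCODE
#endif

/* An inline function that ODR-uses the table, as tree_class_check does
   with the real arrays.  With either branch above, every TU refers to
   the same object.  */
inline unsigned int
demo_operand_count (int code)
{
  return demo_code_length[code];
}
```

Since an undefined __cpp_inline_variables evaluates to 0 inside #if, older compilers fall through to the extern branch without any extra configure check.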