lower-bitint: Handle INTEGER_CST rhs1 in handle_cast [PR113462]

Message ID: ZauUjO7HYqbIsIw+@tucnak
State: New
Series: lower-bitint: Handle INTEGER_CST rhs1 in handle_cast [PR113462]

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 fail Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-arm fail Patch failed to apply

Commit Message

Jakub Jelinek Jan. 20, 2024, 9:38 a.m. UTC
  Hi!

The following testcase ICEs because fre3 leaves around unfolded
  _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);
statement and in handle_cast I was expecting just SSA_NAMEs for the
large/huge _BitInt to large/huge _BitInt casts; INTEGER_CST is something
we can handle in that case exactly the same, as the handle_operand recursion
handles those.

Of course, maybe we should also try to fold_stmt such cases somewhere in
bitint lowering preparation.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-20  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/113462
	* gimple-lower-bitint.cc (bitint_large_huge::handle_cast):
	Handle rhs1 INTEGER_CST like SSA_NAME.

	* gcc.dg/bitint-76.c: New test.


	Jakub
  

Comments

Richard Biener Jan. 20, 2024, 11:11 a.m. UTC | #1
> Am 20.01.2024 um 10:39 schrieb Jakub Jelinek <jakub@redhat.com>:
> 
> Hi!
> 
> The following testcase ICEs because fre3 leaves around unfolded
>  _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);

Hmm, const_unop should handle this, I suppose either we fail to convert this to a NOP_EXPR or native encode/interpret do not handle it?

Richard 

> statement and in handle_cast I was expecting just SSA_NAMEs for the
> large/huge _BitInt to large/huge _BitInt casts; INTEGER_CST is something
> we can handle in that case exactly the same, as the handle_operand recursion
> handles those.
> 
> Of course, maybe we should also try to fold_stmt such cases somewhere in
> bitint lowering preparation.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok

Richard 

> 2024-01-20  Jakub Jelinek  <jakub@redhat.com>
> 
>    PR tree-optimization/113462
>    * gimple-lower-bitint.cc (bitint_large_huge::handle_cast):
>    Handle rhs1 INTEGER_CST like SSA_NAME.
> 
>    * gcc.dg/bitint-76.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj    2024-01-19 10:01:37.696673929 +0100
> +++ gcc/gimple-lower-bitint.cc    2024-01-19 18:36:34.175254308 +0100
> @@ -1250,7 +1250,7 @@ bitint_large_huge::handle_cast (tree lhs
> {
>   tree rhs_type = TREE_TYPE (rhs1);
>   gimple *g;
> -  if (TREE_CODE (rhs1) == SSA_NAME
> +  if ((TREE_CODE (rhs1) == SSA_NAME || TREE_CODE (rhs1) == INTEGER_CST)
>       && TREE_CODE (lhs_type) == BITINT_TYPE
>       && TREE_CODE (rhs_type) == BITINT_TYPE
>       && bitint_precision_kind (lhs_type) >= bitint_prec_large
> --- gcc/testsuite/gcc.dg/bitint-76.c.jj    2024-01-19 18:39:23.883980766 +0100
> +++ gcc/testsuite/gcc.dg/bitint-76.c    2024-01-19 18:38:48.758451335 +0100
> @@ -0,0 +1,16 @@
> +/* PR tree-optimization/113462 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-std=c23 -O2" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 129
> +typedef _BitInt(129) B;
> +#else
> +typedef _BitInt(63) B;
> +#endif
> +
> +B
> +foo (void)
> +{
> +  struct { B b; } s = {};
> +  return s.b;
> +}
> 
>    Jakub
>
  
Jakub Jelinek Jan. 20, 2024, 11:31 a.m. UTC | #2
On Sat, Jan 20, 2024 at 12:11:10PM +0100, Richard Biener wrote:
> > The following testcase ICEs because fre3 leaves around unfolded
> >  _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);
> 
> Hmm, const_unop should handle this, I suppose either we fail to convert this to a NOP_EXPR or native encode/interpret do not handle it?

We just propagate
  _BitInt(192) s;
  _BitInt(129) _1;

  s_6 = 0;
  _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(s_6);
to
  _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);
without trying to fold it in any way (note, fre1, I mistyped fre3).
And the earlier is the result of sra, before that it is
  s = {};
  _1 = s.b;

	Jakub
  
Richard Biener Jan. 22, 2024, 8:58 a.m. UTC | #3
On Sat, 20 Jan 2024, Jakub Jelinek wrote:

> On Sat, Jan 20, 2024 at 12:11:10PM +0100, Richard Biener wrote:
> > > The following testcase ICEs because fre3 leaves around unfolded
> > >  _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);
> > 
> > Hmm, const_unop should handle this, I suppose either we fail to convert this to a NOP_EXPR or native encode/interpret do not handle it?
> 
> We just propagate
>   _BitInt(192) s;
>   _BitInt(129) _1;
> 
>   s_6 = 0;
>   _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(s_6);
> to
>   _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);
> without trying to fold it in any way (note, fre1, I mistyped fre3).

FRE folds all stmts it substitutes into (:7182 and on):

  /* Fold the stmt if modified, this canonicalizes MEM_REFs we propagated
     into which is a requirement for the IPA devirt machinery.  */
  gimple *old_stmt = stmt;
  if (modified)
    {
      /* If a formerly non-invariant ADDR_EXPR is turned into an
         invariant one it was on a separate stmt.  */
      if (gimple_assign_single_p (stmt)
          && TREE_CODE (gimple_assign_rhs1 (stmt)) == ADDR_EXPR)
        recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1 (stmt));
      gimple_stmt_iterator prev = *gsi;
      gsi_prev (&prev);
      if (fold_stmt (gsi, follow_all_ssa_edges))
        {

so it's up to fold_stmt / match to optimize.  And match will also
try const_unop on unary ops with constant_for_folding op0 where
for VIEW_CONVERT_EXPR we should eventually try native_encode/interpret
via fold_view_convert_expr (but it uses a fixed 128 byte buffer which
might be too low here?).
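
For reference, the path there is roughly (my summary of the fold-const.cc
code, not a quote):

  const_unop (VIEW_CONVERT_EXPR, type, arg0)
    -> fold_view_convert_expr (type, arg0)
       -> native_encode_expr (arg0, buffer, sizeof (buffer))
       -> native_interpret_expr (type, buffer, len)

with buffer being that fixed 128 byte array.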

Richard.

> And the earlier is the result of sra, before that it is
>   s = {};
>   _1 = s.b;
> 
> 	Jakub
> 
>
  
Richard Biener Jan. 22, 2024, 10:27 a.m. UTC | #4
On Mon, 22 Jan 2024, Richard Biener wrote:

> On Sat, 20 Jan 2024, Jakub Jelinek wrote:
> 
> > On Sat, Jan 20, 2024 at 12:11:10PM +0100, Richard Biener wrote:
> > > > The following testcase ICEs because fre3 leaves around unfolded
> > > >  _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);
> > > 
> > > Hmm, const_unop should handle this, I suppose either we fail to convert this to a NOP_EXPR or native encode/interpret do not handle it?
> > 
> > We just propagate
> >   _BitInt(192) s;
> >   _BitInt(129) _1;
> > 
> >   s_6 = 0;
> >   _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(s_6);
> > to
> >   _1 = VIEW_CONVERT_EXPR<_BitInt(129)>(0);
> > without trying to fold it in any way (note, fre1, I mistyped fre3).
> 
> FRE folds all stmts it substitutes into (:7182 and on):
> 
>   /* Fold the stmt if modified, this canonicalizes MEM_REFs we propagated
>      into which is a requirement for the IPA devirt machinery.  */
>   gimple *old_stmt = stmt;
>   if (modified)
>     {
>       /* If a formerly non-invariant ADDR_EXPR is turned into an
>          invariant one it was on a separate stmt.  */
>       if (gimple_assign_single_p (stmt)
>           && TREE_CODE (gimple_assign_rhs1 (stmt)) == ADDR_EXPR)
>         recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1 (stmt));
>       gimple_stmt_iterator prev = *gsi;
>       gsi_prev (&prev);
>       if (fold_stmt (gsi, follow_all_ssa_edges))
>         {
> 
> so it's up to fold_stmt / match to optimize.  And match will also
> try const_unop on unary ops with constant_for_folding op0 where
> for VIEW_CONVERT_EXPR we should eventually try native_encode/interpret
> via fold_view_convert_expr (but it uses a fixed 128 byte buffer which
> might be too low here?).

We run into

static tree
native_interpret_int (tree type, const unsigned char *ptr, int len)
{ 
...
  if (total_bytes > len
      || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
    return NULL_TREE;

OTOH using a V_C_E to "truncate" a _BitInt looks wrong?  OTOH the
check doesn't really handle native_encode_expr using the "proper"
wide_int encoding however that's exactly handled.  So it might be
a pre-existing issue that's only uncovered by large _BitInts
(__int128 might show similar issues?)
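
For concreteness (my numbers, assuming the usual 64-bit configuration):
HOST_BITS_PER_DOUBLE_INT is 2 * HOST_BITS_PER_WIDE_INT = 128, while a
_BitInt(129) already occupies e.g. 24 bytes on x86_64, so total_bytes *
BITS_PER_UNIT = 192 > 128 and native_interpret_int returns NULL_TREE even
though those 24 bytes fit comfortably in the buffer the caller passes in.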

Richard.
  
Jakub Jelinek Jan. 22, 2024, 10:40 a.m. UTC | #5
On Mon, Jan 22, 2024 at 11:27:52AM +0100, Richard Biener wrote:
> We run into
> 
> static tree
> native_interpret_int (tree type, const unsigned char *ptr, int len)
> { 
> ...
>   if (total_bytes > len
>       || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
>     return NULL_TREE;
> 
> OTOH using a V_C_E to "truncate" a _BitInt looks wrong?  OTOH the
> check doesn't really handle native_encode_expr using the "proper"
> wide_int encoding however that's exactly handled.  So it might be
> a pre-existing issue that's only uncovered by large _BitInts
> (__int128 might show similar issues?)

I guess the || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT
condition makes no sense; all we care about is whether it fits in the buffer
or not.
But then there is
fold_view_convert_expr
(and other spots) which use
  /* We support up to 1024-bit values (for GCN/RISC-V V128QImode).  */
  unsigned char buffer[128];
or something similar.
Perhaps we could use XALLOCAVEC there instead (or use it only for the
larger sizes and keep the static buffer for the common case).
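
Something like this minimal sketch, perhaps (untested, and ignoring the
sanity checks and the VECTOR_CST shortcut the real fold_view_convert_expr
has):

static tree
fold_view_convert_expr (tree type, tree expr)
{
  /* Keep the static 128 byte buffer for the common case.  */
  unsigned char stack_buf[128];
  unsigned char *buffer = stack_buf;
  int len = sizeof (stack_buf);

  /* For wider operands (e.g. large _BitInt) fall back to alloca.  */
  HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (expr));
  if (size > len)
    {
      buffer = XALLOCAVEC (unsigned char, size);
      len = size;
    }

  int ret = native_encode_expr (expr, buffer, len);
  if (ret == 0)
    return NULL_TREE;
  return native_interpret_expr (type, buffer, ret);
}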

	Jakub
  
Richard Biener Jan. 22, 2024, 10:56 a.m. UTC | #6
On Mon, 22 Jan 2024, Jakub Jelinek wrote:

> On Mon, Jan 22, 2024 at 11:27:52AM +0100, Richard Biener wrote:
> > We run into
> > 
> > static tree
> > native_interpret_int (tree type, const unsigned char *ptr, int len)
> > { 
> > ...
> >   if (total_bytes > len
> >       || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT)
> >     return NULL_TREE;
> > 
> > OTOH using a V_C_E to "truncate" a _BitInt looks wrong?  OTOH the
> > check doesn't really handle native_encode_expr using the "proper"
> > wide_int encoding however that's exactly handled.  So it might be
> > a pre-existing issue that's only uncovered by large _BitInts
> > (__int128 might show similar issues?)
> 
> I guess the || total_bytes * BITS_PER_UNIT > HOST_BITS_PER_DOUBLE_INT
> condition makes no sense; all we care about is whether it fits in the buffer
> or not.
> But then there is
> fold_view_convert_expr
> (and other spots) which use
>   /* We support up to 1024-bit values (for GCN/RISC-V V128QImode).  */
>   unsigned char buffer[128];
> or something similar.
> Perhaps we could use XALLOCAVEC there instead (or use it only for the
> larger sizes and keep the static buffer for the common case).

Well, yes.  V_C_E folding could do this but then the native_encode_expr
API could also allow lazy allocation or so.

> 	Jakub
> 
>
  

Patch

--- gcc/gimple-lower-bitint.cc.jj	2024-01-19 10:01:37.696673929 +0100
+++ gcc/gimple-lower-bitint.cc	2024-01-19 18:36:34.175254308 +0100
@@ -1250,7 +1250,7 @@  bitint_large_huge::handle_cast (tree lhs
 {
   tree rhs_type = TREE_TYPE (rhs1);
   gimple *g;
-  if (TREE_CODE (rhs1) == SSA_NAME
+  if ((TREE_CODE (rhs1) == SSA_NAME || TREE_CODE (rhs1) == INTEGER_CST)
       && TREE_CODE (lhs_type) == BITINT_TYPE
       && TREE_CODE (rhs_type) == BITINT_TYPE
       && bitint_precision_kind (lhs_type) >= bitint_prec_large
--- gcc/testsuite/gcc.dg/bitint-76.c.jj	2024-01-19 18:39:23.883980766 +0100
+++ gcc/testsuite/gcc.dg/bitint-76.c	2024-01-19 18:38:48.758451335 +0100
@@ -0,0 +1,16 @@ 
+/* PR tree-optimization/113462 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-std=c23 -O2" } */
+
+#if __BITINT_MAXWIDTH__ >= 129
+typedef _BitInt(129) B;
+#else
+typedef _BitInt(63) B;
+#endif
+
+B
+foo (void)
+{
+  struct { B b; } s = {};
+  return s.b;
+}