lower-bitint: Force some arrays corresponding to large/huge _BitInt SSA_NAMEs to BLKmode

Message ID ZajSnzdsI007ABOH@tucnak
State New
Headers
Series lower-bitint: Force some arrays corresponding to large/huge _BitInt SSA_NAMEs to BLKmode |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm fail Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 fail Patch failed to apply

Commit Message

Jakub Jelinek Jan. 18, 2024, 7:26 a.m. UTC
  Hi!

On aarch64 the backend decides to use non-BLKmode for some arrays
like unsigned long[4] - OImode in that case, but the corresponding
BITINT_TYPEs have BLKmode (like structures containing that many limb
elements).  This both isn't a good idea (we really want such underlying vars
to live in memory and access them there, rather than live in registers and
access their parts in there) and causes ICEs during expansion
(VIEW_CONVERT_EXPR from such OImode array to BLKmode BITINT_TYPE), so the
following patch makes sure such arrays reflect the BLKmode of BITINT_TYPEs
it is accessed with (if any).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-18  Jakub Jelinek  <jakub@redhat.com>

	* gimple-lower-bitint.cc (gimple_lower_bitint): When creating
	array VAR_DECL for BITINT_TYPE SSA_NAMEs which have BLKmode, force
	DECL_MODE of those vars to be BLKmode as well.


	Jakub
  

Comments

Richard Biener Jan. 18, 2024, 7:27 a.m. UTC | #1
On Thu, 18 Jan 2024, Jakub Jelinek wrote:

> Hi!
> 
> On aarch64 the backend decides to use non-BLKmode for some arrays
> like unsigned long[4] - OImode in that case, but the corresponding
> BITINT_TYPEs have BLKmode (like structures containing that many limb
> elements).  This both isn't a good idea (we really want such underlying vars
> to live in memory and access them there, rather than live in registers and
> access their parts in there) and causes ICEs during expansion
> (VIEW_CONVERT_EXPR from such OImode array to BLKmode BITINT_TYPE), so the
> following patch makes sure such arrays reflect the BLKmode of BITINT_TYPEs
> it is accessed with (if any).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

So the issue is only manifesting during expansion?  I think it would
be better to detect the specific issue (V_C_E from register to BLKmode)
in discover_nonconstant_array_refs_r and force the register argument
to stack?

> 2024-01-18  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* gimple-lower-bitint.cc (gimple_lower_bitint): When creating
> 	array VAR_DECL for BITINT_TYPE SSA_NAMEs which have BLKmode, force
> 	DECL_MODE of those vars to be BLKmode as well.
> 
> --- gcc/gimple-lower-bitint.cc.jj	2024-01-17 14:43:33.498961304 +0100
> +++ gcc/gimple-lower-bitint.cc	2024-01-17 14:50:50.252889131 +0100
> @@ -6348,7 +6348,15 @@ gimple_lower_bitint (void)
>  	  tree s = ssa_name (i);
>  	  int p = var_to_partition (large_huge.m_map, s);
>  	  if (large_huge.m_vars[p] != NULL_TREE)
> -	    continue;
> +	    {
> +	      /* If BITINT_TYPE is BLKmode, make sure the underlying
> +		 variable is BLKmode as well.  */
> +	      if (TYPE_MODE (TREE_TYPE (s)) == BLKmode
> +		  && VAR_P (large_huge.m_vars[p])
> +		  && DECL_MODE (large_huge.m_vars[p]) != BLKmode)
> +		DECL_MODE (large_huge.m_vars[p]) = BLKmode;
> +	      continue;
> +	    }
>  	  if (atype == NULL_TREE
>  	      || !tree_int_cst_equal (TYPE_SIZE (atype),
>  				      TYPE_SIZE (TREE_TYPE (s))))
> @@ -6359,6 +6367,11 @@ gimple_lower_bitint (void)
>  	    }
>  	  large_huge.m_vars[p] = create_tmp_var (atype, "bitint");
>  	  mark_addressable (large_huge.m_vars[p]);
> +	  /* If BITINT_TYPE is BLKmode, make sure the underlying
> +	     variable is BLKmode as well.  */
> +	  if (TYPE_MODE (TREE_TYPE (s)) == BLKmode
> +	      && DECL_MODE (large_huge.m_vars[p]) != BLKmode)
> +	    DECL_MODE (large_huge.m_vars[p]) = BLKmode;
>  	}
>      }
>  
> 
> 	Jakub
> 
>
  
Jakub Jelinek Jan. 18, 2024, 11:01 a.m. UTC | #2
On Thu, Jan 18, 2024 at 08:27:51AM +0100, Richard Biener wrote:
> On Thu, 18 Jan 2024, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > On aarch64 the backend decides to use non-BLKmode for some arrays
> > like unsigned long[4] - OImode in that case, but the corresponding
> > BITINT_TYPEs have BLKmode (like structures containing that many limb
> > elements).  This both isn't a good idea (we really want such underlying vars
> > to live in memory and access them there, rather than live in registers and
> > access their parts in there) and causes ICEs during expansion
> > (VIEW_CONVERT_EXPR from such OImode array to BLKmode BITINT_TYPE), so the
> > following patch makes sure such arrays reflect the BLKmode of BITINT_TYPEs
> > it is accessed with (if any).
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> So the issue is only manifesting during expansion?  I think it would
> be better to detect the specific issue (V_C_E from register to BLKmode)
> in discover_nonconstant_array_refs_r and force the register argument
> to stack?

That doesn't really work, tried:
--- gcc/cfgexpand.cc.jj	2024-01-16 11:45:16.159326506 +0100
+++ gcc/cfgexpand.cc	2024-01-18 11:26:17.906447274 +0100
@@ -6380,11 +6380,16 @@ discover_nonconstant_array_refs_r (tree
   /* References of size POLY_INT_CST to a fixed-size object must go
      through memory.  It's more efficient to force that here than
      to create temporary slots on the fly.
-     RTL expansion expectes TARGET_MEM_REF to always address actual memory.  */
+     RTL expansion expectes TARGET_MEM_REF to always address actual memory.
+     Also, force to stack non-BLKmode vars accessed through VIEW_CONVERT_EXPR
+     to BLKmode BITINT_TYPEs.  */
   else if (TREE_CODE (t) == TARGET_MEM_REF
 	   || (TREE_CODE (t) == MEM_REF
 	       && TYPE_SIZE (TREE_TYPE (t))
-	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
+	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
+	   || (TREE_CODE (t) == VIEW_CONVERT_EXPR
+	       && TREE_CODE (TREE_TYPE (t)) == BITINT_TYPE
+	       && TYPE_MODE (TREE_TYPE (t)) == BLKmode))
     {
       tree base = get_base_address (t);
       if (base
This doesn't actually do anything, because the base is TREE_ADDRESSABLE.
The var gets both with -O0 and -O2 DECL_RTL like
        (mem/c:OI (plus:DI (reg/f:DI 95 virtual-stack-vars)
		  (const_int -64 [0xffffffffffffffc0])) [2 bitint.2+0 S32 A128])
but the problem is that the expansion of the VAR_DECL because of the
non-BLKmode is forced into a pseudo.
--- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
+++ gcc/expr.cc	2024-01-18 11:56:07.142361031 +0100
@@ -12382,6 +12382,17 @@ expand_expr_real_1 (tree exp, rtx target
 	  }
       }
 
+      /* Ensure non-BLKmode array VAR_DECLs VCEd to BLKmode BITINT_TYPE
+	 aren't promoted to registers.  */
+      if (op0 == NULL_RTX
+	  && mode == BLKmode
+	  && TREE_CODE (type) == BITINT_TYPE
+	  && modifier == EXPAND_NORMAL
+	  && VAR_P (treeop0)
+	  && DECL_MODE (treeop0) != BLKmode)
+	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, EXPAND_MEMORY,
+				NULL, inner_reference_p);
+
       if (!op0)
 	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, modifier,
 				NULL, inner_reference_p);
doesn't work either, while we get the MEM op0 in that case, because
mode != GET_MODE (op0) we still take the
      /* If the output type is a bit-field type, do an extraction.  */
      else if (reduce_bit_field)
        return extract_bit_field (op0, TYPE_PRECISION (type), 0,
                                  TYPE_UNSIGNED (type), NULL_RTX,
                                  mode, mode, false, NULL);
path which can't deal with BLKmode extraction from non-BLKmode.
I guess we could in the above new expr.cc hunk perhaps
also if (MEM_P (op0)) op0 = adjust_address (op0, BLKmode, 0);

	Jakub
  
Jakub Jelinek Jan. 18, 2024, 11:11 a.m. UTC | #3
And
--- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
+++ gcc/expr.cc	2024-01-18 12:08:16.412147569 +0100
@@ -12382,6 +12382,21 @@ expand_expr_real_1 (tree exp, rtx target
 	  }
       }
 
+      /* Ensure non-BLKmode array VAR_DECLs VCEd to BLKmode BITINT_TYPE
+	 aren't promoted to registers.  */
+      if (op0 == NULL_RTX
+	  && mode == BLKmode
+	  && TREE_CODE (type) == BITINT_TYPE
+	  && modifier == EXPAND_NORMAL
+	  && VAR_P (treeop0)
+	  && DECL_MODE (treeop0) != BLKmode)
+	{
+	  op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, EXPAND_MEMORY,
+				  NULL, inner_reference_p);
+	  if (MEM_P (op0))
+	    op0 = adjust_address (op0, BLKmode, 0);
+	}
+
       if (!op0)
 	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, modifier,
 				NULL, inner_reference_p);
helps at -O0, but doesn't at -O2, where even EXPAND_MEMORY doesn't actually
cause MEM_P result, it is still forced into a REG.

	Jakub
  
Richard Biener Jan. 18, 2024, 12:16 p.m. UTC | #4
On Thu, 18 Jan 2024, Jakub Jelinek wrote:

> On Thu, Jan 18, 2024 at 08:27:51AM +0100, Richard Biener wrote:
> > On Thu, 18 Jan 2024, Jakub Jelinek wrote:
> > 
> > > Hi!
> > > 
> > > On aarch64 the backend decides to use non-BLKmode for some arrays
> > > like unsigned long[4] - OImode in that case, but the corresponding
> > > BITINT_TYPEs have BLKmode (like structures containing that many limb
> > > elements).  This both isn't a good idea (we really want such underlying vars
> > > to live in memory and access them there, rather than live in registers and
> > > access their parts in there) and causes ICEs during expansion
> > > (VIEW_CONVERT_EXPR from such OImode array to BLKmode BITINT_TYPE), so the
> > > following patch makes sure such arrays reflect the BLKmode of BITINT_TYPEs
> > > it is accessed with (if any).
> > > 
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > So the issue is only manifesting during expansion?  I think it would
> > be better to detect the specific issue (V_C_E from register to BLKmode)
> > in discover_nonconstant_array_refs_r and force the register argument
> > to stack?
> 
> That doesn't really work, tried:
> --- gcc/cfgexpand.cc.jj	2024-01-16 11:45:16.159326506 +0100
> +++ gcc/cfgexpand.cc	2024-01-18 11:26:17.906447274 +0100
> @@ -6380,11 +6380,16 @@ discover_nonconstant_array_refs_r (tree
>    /* References of size POLY_INT_CST to a fixed-size object must go
>       through memory.  It's more efficient to force that here than
>       to create temporary slots on the fly.
> -     RTL expansion expectes TARGET_MEM_REF to always address actual memory.  */
> +     RTL expansion expectes TARGET_MEM_REF to always address actual memory.
> +     Also, force to stack non-BLKmode vars accessed through VIEW_CONVERT_EXPR
> +     to BLKmode BITINT_TYPEs.  */
>    else if (TREE_CODE (t) == TARGET_MEM_REF
>  	   || (TREE_CODE (t) == MEM_REF
>  	       && TYPE_SIZE (TREE_TYPE (t))
> -	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
> +	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
> +	   || (TREE_CODE (t) == VIEW_CONVERT_EXPR
> +	       && TREE_CODE (TREE_TYPE (t)) == BITINT_TYPE
> +	       && TYPE_MODE (TREE_TYPE (t)) == BLKmode))
>      {
>        tree base = get_base_address (t);
>        if (base
> This doesn't actually do anything, because the base is TREE_ADDRESSABLE.
> The var gets both with -O0 and -O2 DECL_RTL like
>         (mem/c:OI (plus:DI (reg/f:DI 95 virtual-stack-vars)
> 		  (const_int -64 [0xffffffffffffffc0])) [2 bitint.2+0 S32 A128])

But then it's not promoted to register but instead somebody decides
it gets an integer mode instead of BLKmode.

> but the problem is that the expansion of the VAR_DECL because of the
> non-BLKmode is forced into a pseudo.
> --- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
> +++ gcc/expr.cc	2024-01-18 11:56:07.142361031 +0100
> @@ -12382,6 +12382,17 @@ expand_expr_real_1 (tree exp, rtx target
>  	  }
>        }

There's already an odd bit of code dealing with non-BLKmode to
BLKmode converts, suggesting those would need intermediate memory,
but likely not triggering because the base is a decl, not a
handled_component.  Does it work to go there also for DECL_P (treeop0)?


> +      /* Ensure non-BLKmode array VAR_DECLs VCEd to BLKmode BITINT_TYPE
> +	 aren't promoted to registers.  */
> +      if (op0 == NULL_RTX
> +	  && mode == BLKmode
> +	  && TREE_CODE (type) == BITINT_TYPE
> +	  && modifier == EXPAND_NORMAL
> +	  && VAR_P (treeop0)
> +	  && DECL_MODE (treeop0) != BLKmode)
> +	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, EXPAND_MEMORY,
> +				NULL, inner_reference_p);
> +
>        if (!op0)
>  	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, modifier,
>  				NULL, inner_reference_p);
> doesn't work either, while we get the MEM op0 in that case, because
> mode != GET_MODE (op0) we still take the
>       /* If the output type is a bit-field type, do an extraction.  */
>       else if (reduce_bit_field)
>         return extract_bit_field (op0, TYPE_PRECISION (type), 0,
>                                   TYPE_UNSIGNED (type), NULL_RTX,
>                                   mode, mode, false, NULL);
> path which can't deal with BLKmode extraction from non-BLKmode.
> I guess we could in the above new expr.cc hunk perhaps
> also if (MEM_P (op0)) op0 = adjust_address (op0, BLKmode, 0);

Hmm, 'reduce_bit_field' is odd with V_C_E - if you disable that,
does it work?

How does it behave differently when the base is BLKmode instead
of OImode?

Richard.
  
Jakub Jelinek Jan. 18, 2024, 12:32 p.m. UTC | #5
On Thu, Jan 18, 2024 at 01:16:45PM +0100, Richard Biener wrote:
> > This doesn't actually do anything, because the base is TREE_ADDRESSABLE.
> > The var gets both with -O0 and -O2 DECL_RTL like
> >         (mem/c:OI (plus:DI (reg/f:DI 95 virtual-stack-vars)
> > 		  (const_int -64 [0xffffffffffffffc0])) [2 bitint.2+0 S32 A128])
> 
> But then it's not promoted to register but instead somebody decides
> it gets an integer mode instead of BLKmode.

It is promoted to register, expand_expr on treeop0 in that case
sees it is OImode MEM and at least when optimize forces it into a register.

> > but the problem is that the expansion of the VAR_DECL because of the
> > non-BLKmode is forced into a pseudo.
> > --- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
> > +++ gcc/expr.cc	2024-01-18 11:56:07.142361031 +0100
> > @@ -12382,6 +12382,17 @@ expand_expr_real_1 (tree exp, rtx target
> >  	  }
> >        }
> 
> There's already an odd bit of code dealing with non-BLKmode to
> BLKmode converts, suggesting those would need intermediate memory,
> but likely not triggering because the base is a decl, not a
> handled_component.  Does it work to go there also for DECL_P (treeop0)?

Tried that, it doesn't do anything interesting.  It can handle mostly
the case where it is say a large structure element, the structure is
BLKmode, but the element is not and then is cast to BLKmode
VIEW_CONVERT_EXPR.

> > path which can't deal with BLKmode extraction from non-BLKmode.
> > I guess we could in the above new expr.cc hunk perhaps
> > also if (MEM_P (op0)) op0 = adjust_address (op0, BLKmode, 0);
> 
> Hmm, 'reduce_bit_field' is odd with V_C_E - if you disable that,
> does it work?

Generally, we need reduce_bit_field to work as is even for _BitInt,
in most spots it results in the reduction for bit-field precision
which has to happen.
If we'd somehow disable the
      /* If the output type is a bit-field type, do an extraction.  */
      else if (reduce_bit_field)
        return extract_bit_field (op0, TYPE_PRECISION (type), 0,
                                  TYPE_UNSIGNED (type), NULL_RTX,
                                  mode, mode, false, NULL);
case for mode == BLKmode, then we'd trigger the
      /* As a last resort, spill op0 to memory, and reload it in a
         different mode.  */
      else if (!MEM_P (op0))
        {
case which would spill it again into memory and extract.  That would
then not ICE, but I strongly doubt we'd be able to undo that later,
at least not the stack allocations.  IMHO it is much better to keep
the stuff in memory, instead of forcing it into register and then
force it to some other memory.

> How does it behave differently when the base is BLKmode instead
> of OImode?

If op0 has BLKmode and mode is BLKmode too, then it triggers the
      /* If the input and output modes are both the same, we are done.  */
      if (mode == GET_MODE (op0))
        ;
case and doesn't run into anything else, so the result is just the MEM
with the array.  Which is what the hack to set BLKmode DECL_MODE achieves
too.

	Jakub
  
Richard Biener Jan. 18, 2024, 12:34 p.m. UTC | #6
On Thu, 18 Jan 2024, Jakub Jelinek wrote:

> On Thu, Jan 18, 2024 at 01:16:45PM +0100, Richard Biener wrote:
> > > This doesn't actually do anything, because the base is TREE_ADDRESSABLE.
> > > The var gets both with -O0 and -O2 DECL_RTL like
> > >         (mem/c:OI (plus:DI (reg/f:DI 95 virtual-stack-vars)
> > > 		  (const_int -64 [0xffffffffffffffc0])) [2 bitint.2+0 S32 A128])
> > 
> > But then it's not promoted to register but instead somebody decides
> > it gets an integer mode instead of BLKmode.
> 
> It is promoted to register, expand_expr on treeop0 in that case
> sees it is OImode MEM and at least when optimize forces it into a register.
> 
> > > but the problem is that the expansion of the VAR_DECL because of the
> > > non-BLKmode is forced into a pseudo.
> > > --- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
> > > +++ gcc/expr.cc	2024-01-18 11:56:07.142361031 +0100
> > > @@ -12382,6 +12382,17 @@ expand_expr_real_1 (tree exp, rtx target
> > >  	  }
> > >        }
> > 
> > There's already an odd bit of code dealing with non-BLKmode to
> > BLKmode converts, suggesting those would need intermediate memory,
> > but likely not triggering because the base is a decl, not a
> > handled_component.  Does it work to go there also for DECL_P (treeop0)?
> 
> Tried that, it doesn't do anything interesting.  It can handle mostly
> the case where it is say a large structure element, the structure is
> BLKmode, but the element is not and then is cast to BLKmode
> VIEW_CONVERT_EXPR.
> 
> > > path which can't deal with BLKmode extraction from non-BLKmode.
> > > I guess we could in the above new expr.cc hunk perhaps
> > > also if (MEM_P (op0)) op0 = adjust_address (op0, BLKmode, 0);
> > 
> > Hmm, 'reduce_bit_field' is odd with V_C_E - if you disable that,
> > does it work?
> 
> Generally, we need reduce_bit_field to work as is even for _BitInt,
> in most spots it results in the reduction for bit-field precision
> which has to happen.
> If we'd somehow disable the
>       /* If the output type is a bit-field type, do an extraction.  */
>       else if (reduce_bit_field)
>         return extract_bit_field (op0, TYPE_PRECISION (type), 0,
>                                   TYPE_UNSIGNED (type), NULL_RTX,
>                                   mode, mode, false, NULL);
> case for mode == BLKmode, then we'd trigger the
>       /* As a last resort, spill op0 to memory, and reload it in a
>          different mode.  */
>       else if (!MEM_P (op0))
>         {
> case which would spill it again into memory and extract.  That would
> then not ICE, but I strongly doubt we'd be able to undo that later,
> at least not the stack allocations.  IMHO it is much better to keep
> the stuff in memory, instead of forcing it into register and then
> force it to some other memory.
> 
> > How does it behave differently when the base is BLKmode instead
> > of OImode?
> 
> If op0 has BLKmode and mode is BLKmode too, then it triggers the
>       /* If the input and output modes are both the same, we are done.  */
>       if (mode == GET_MODE (op0))
>         ;
> case and doesn't run into anything else, so the result is just the MEM
> with the array.  Which is what the hack to set BLKmode DECL_MODE achieves
> too.

So - if we simply do

      /* If the input and output modes are both the same, we are done.  */
      if (mode == GET_MODE (op0) || (mode == BLKmode && MEM_P (op0))
        ;

?  After all if we want BLKmode we want a MEM, and if we have one
we should be done already.  V_C_E isn't supposed to do any value
transform.

Richard.
  
Jakub Jelinek Jan. 18, 2024, 12:46 p.m. UTC | #7
On Thu, Jan 18, 2024 at 01:34:53PM +0100, Richard Biener wrote:
> So - if we simply do
> 
>       /* If the input and output modes are both the same, we are done.  */
>       if (mode == GET_MODE (op0) || (mode == BLKmode && MEM_P (op0))
>         ;
> 
> ?  After all if we want BLKmode we want a MEM, and if we have one
> we should be done already.  V_C_E isn't supposed to do any value
> transform.

We'd need to make sure that op0 is actually MEM, which is not the case.

The following patch seems to work though, the discover_nonconstant_array_refs_r
part helps for -O2, the expr.cc part is needed at all optimization levels.

2024-01-18  Jakub Jelinek  <jakub@redhat.com>

	* cfgexpand.cc (discover_nonconstant_array_refs_r): Force non-BLKmode
	VAR_DECLs referenced in BLKmode VIEW_CONVERT_EXPRs into memory.
	* expr.cc (expand_expr_real_1) <case VIEW_CONVERT_EXPR>: If mode is
	BLKmode and type a BITINT_TYPE from a non-BLKmode VAR_P with MEM
	DECL_RTL, use adjust_address to BLKmode for it instead of expand_expr.

--- gcc/cfgexpand.cc.jj	2024-01-16 11:45:16.159326506 +0100
+++ gcc/cfgexpand.cc	2024-01-18 13:41:54.579551282 +0100
@@ -6380,11 +6380,16 @@ discover_nonconstant_array_refs_r (tree
   /* References of size POLY_INT_CST to a fixed-size object must go
      through memory.  It's more efficient to force that here than
      to create temporary slots on the fly.
-     RTL expansion expectes TARGET_MEM_REF to always address actual memory.  */
+     RTL expansion expectes TARGET_MEM_REF to always address actual memory.
+     Also, force to stack non-BLKmode vars accessed through VIEW_CONVERT_EXPR
+     to BLKmode BITINT_TYPEs.  */
   else if (TREE_CODE (t) == TARGET_MEM_REF
 	   || (TREE_CODE (t) == MEM_REF
 	       && TYPE_SIZE (TREE_TYPE (t))
-	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
+	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
+	   || (TREE_CODE (t) == VIEW_CONVERT_EXPR
+	       && TREE_CODE (TREE_TYPE (t)) == BITINT_TYPE
+	       && TYPE_MODE (TREE_TYPE (t)) == BLKmode))
     {
       tree base = get_base_address (t);
       if (base
--- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
+++ gcc/expr.cc	2024-01-18 13:38:19.677556646 +0100
@@ -12382,6 +12382,17 @@ expand_expr_real_1 (tree exp, rtx target
 	  }
       }
 
+      /* Ensure non-BLKmode array VAR_DECLs VCEd to BLKmode BITINT_TYPE
+	 aren't promoted to registers.  */
+      if (op0 == NULL_RTX
+	  && mode == BLKmode
+	  && TREE_CODE (type) == BITINT_TYPE
+	  && VAR_P (treeop0)
+	  && DECL_MODE (treeop0) != BLKmode
+	  && DECL_RTL_SET_P (treeop0)
+	  && MEM_P (DECL_RTL (treeop0)))
+	op0 = adjust_address (DECL_RTL (treeop0), BLKmode, 0);
+
       if (!op0)
 	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, modifier,
 				NULL, inner_reference_p);


	Jakub
  
Richard Biener Jan. 18, 2024, 12:57 p.m. UTC | #8
On Thu, 18 Jan 2024, Jakub Jelinek wrote:

> On Thu, Jan 18, 2024 at 01:34:53PM +0100, Richard Biener wrote:
> > So - if we simply do
> > 
> >       /* If the input and output modes are both the same, we are done.  */
> >       if (mode == GET_MODE (op0) || (mode == BLKmode && MEM_P (op0))
> >         ;
> > 
> > ?  After all if we want BLKmode we want a MEM, and if we have one
> > we should be done already.  V_C_E isn't supposed to do any value
> > transform.
> 
> We'd need to make sure that op0 is actually MEM, which is not the case.
> 
> The following patch seems to work though, the discover_nonconstant_array_refs_r
> part helps for -O2, the expr.cc part is needed at all optimization levels.
> 
> 2024-01-18  Jakub Jelinek  <jakub@redhat.com>
> 
> 	* cfgexpand.cc (discover_nonconstant_array_refs_r): Force non-BLKmode
> 	VAR_DECLs referenced in BLKmode VIEW_CONVERT_EXPRs into memory.
> 	* expr.cc (expand_expr_real_1) <case VIEW_CONVERT_EXPR>: If mode is
> 	BLKmode and type a BITINT_TYPE from a non-BLKmode VAR_P with MEM
> 	DECL_RTL, use adjust_address to BLKmode for it instead of expand_expr.
> 
> --- gcc/cfgexpand.cc.jj	2024-01-16 11:45:16.159326506 +0100
> +++ gcc/cfgexpand.cc	2024-01-18 13:41:54.579551282 +0100
> @@ -6380,11 +6380,16 @@ discover_nonconstant_array_refs_r (tree
>    /* References of size POLY_INT_CST to a fixed-size object must go
>       through memory.  It's more efficient to force that here than
>       to create temporary slots on the fly.
> -     RTL expansion expectes TARGET_MEM_REF to always address actual memory.  */
> +     RTL expansion expectes TARGET_MEM_REF to always address actual memory.
> +     Also, force to stack non-BLKmode vars accessed through VIEW_CONVERT_EXPR
> +     to BLKmode BITINT_TYPEs.  */
>    else if (TREE_CODE (t) == TARGET_MEM_REF
>  	   || (TREE_CODE (t) == MEM_REF
>  	       && TYPE_SIZE (TREE_TYPE (t))
> -	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
> +	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
> +	   || (TREE_CODE (t) == VIEW_CONVERT_EXPR
> +	       && TREE_CODE (TREE_TYPE (t)) == BITINT_TYPE
> +	       && TYPE_MODE (TREE_TYPE (t)) == BLKmode))

I'm still not getting what's special about BITINT_TYPE here so
shouldn't that apply to all BLKmode V_C_E?  But sure we can for
now just handle BITINT_TYPE.

That hunk looks OK to me.

>      {
>        tree base = get_base_address (t);
>        if (base
> --- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
> +++ gcc/expr.cc	2024-01-18 13:38:19.677556646 +0100
> @@ -12382,6 +12382,17 @@ expand_expr_real_1 (tree exp, rtx target
>  	  }
>        }
>  
> +      /* Ensure non-BLKmode array VAR_DECLs VCEd to BLKmode BITINT_TYPE
> +	 aren't promoted to registers.  */
> +      if (op0 == NULL_RTX
> +	  && mode == BLKmode
> +	  && TREE_CODE (type) == BITINT_TYPE
> +	  && VAR_P (treeop0)
> +	  && DECL_MODE (treeop0) != BLKmode
> +	  && DECL_RTL_SET_P (treeop0)
> +	  && MEM_P (DECL_RTL (treeop0)))
> +	op0 = adjust_address (DECL_RTL (treeop0), BLKmode, 0);
> +
>        if (!op0)
>  	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, modifier,
>  				NULL, inner_reference_p);

So we're now sure we have MEM_P (op0) after expand_expr_real,
even without this change, right?  What's wrong with the
suggestion to use

  if (mode == GET_MODE (op0) || (mode == BLKmode && MEM_P (op0))

thus not run into any of the special-casing?  We're doing just

          op0 = adjust_address (op0, mode, 0);

then which is basically what you do above?

> 
> 	Jakub
> 
>
  
Jakub Jelinek Jan. 18, 2024, 1:13 p.m. UTC | #9
On Thu, Jan 18, 2024 at 01:57:49PM +0100, Richard Biener wrote:
> > -     RTL expansion expectes TARGET_MEM_REF to always address actual memory.  */
> > +     RTL expansion expectes TARGET_MEM_REF to always address actual memory.
> > +     Also, force to stack non-BLKmode vars accessed through VIEW_CONVERT_EXPR
> > +     to BLKmode BITINT_TYPEs.  */
> >    else if (TREE_CODE (t) == TARGET_MEM_REF
> >  	   || (TREE_CODE (t) == MEM_REF
> >  	       && TYPE_SIZE (TREE_TYPE (t))
> > -	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
> > +	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
> > +	   || (TREE_CODE (t) == VIEW_CONVERT_EXPR
> > +	       && TREE_CODE (TREE_TYPE (t)) == BITINT_TYPE
> > +	       && TYPE_MODE (TREE_TYPE (t)) == BLKmode))
> 
> I'm still not getting what's special about BITINT_TYPE here so
> shouldn't that apply to all BLKmode V_C_E?  But sure we can for
> now just handle BITINT_TYPE.
> 
> That hunk looks OK to me.

The == BITINT_TYPE check is non-essential, was just trying to keep existing
behavior otherwise.  I can certainly drop that.

> > --- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
> > +++ gcc/expr.cc	2024-01-18 13:38:19.677556646 +0100
> > @@ -12382,6 +12382,17 @@ expand_expr_real_1 (tree exp, rtx target
> >  	  }
> >        }
> >  
> > +      /* Ensure non-BLKmode array VAR_DECLs VCEd to BLKmode BITINT_TYPE
> > +	 aren't promoted to registers.  */
> > +      if (op0 == NULL_RTX
> > +	  && mode == BLKmode
> > +	  && TREE_CODE (type) == BITINT_TYPE
> > +	  && VAR_P (treeop0)
> > +	  && DECL_MODE (treeop0) != BLKmode
> > +	  && DECL_RTL_SET_P (treeop0)
> > +	  && MEM_P (DECL_RTL (treeop0)))
> > +	op0 = adjust_address (DECL_RTL (treeop0), BLKmode, 0);
> > +
> >        if (!op0)
> >  	op0 = expand_expr_real (treeop0, NULL_RTX, VOIDmode, modifier,
> >  				NULL, inner_reference_p);
> 
> So we're now sure we have MEM_P (op0) after expand_expr_real,
> even without this change, right?  What's wrong with the
> suggestion to use

I wasn't sure if VAR_P (treeop0) && MEM_P (DECL_RTL (treeop0)) implies that
expand_expr_real will return a MEM, but I'm not able to find a path in which
it would return something different, so maybe ok.

>   if (mode == GET_MODE (op0) || (mode == BLKmode && MEM_P (op0))
> 
> thus not run into any of the special-casing?  We're doing just

It is true the later code will then do:
> 
>           op0 = adjust_address (op0, mode, 0);

so perhaps it is ok as you wrote it (though perhaps adding it as a separate
else if would allow a separate comment).

	Jakub
  
Jakub Jelinek Jan. 18, 2024, 1:18 p.m. UTC | #10
On Thu, Jan 18, 2024 at 02:13:55PM +0100, Jakub Jelinek wrote:
> The == BITINT_TYPE check is non-essential, was just trying to keep existing
> behavior otherwise.  I can certainly drop that.

So following then?

2024-01-18  Jakub Jelinek  <jakub@redhat.com>
	    Richard Biener  <rguenther@suse.de>

	* cfgexpand.cc (discover_nonconstant_array_refs_r): Force non-BLKmode
	VAR_DECLs referenced in BLKmode VIEW_CONVERT_EXPRs into memory.
	* expr.cc (expand_expr_real_1) <case VIEW_CONVERT_EXPR>: Do nothing
	but adjust_address also for BLKmode mode and MEM op0.

--- gcc/cfgexpand.cc.jj	2024-01-16 11:45:16.159326506 +0100
+++ gcc/cfgexpand.cc	2024-01-18 14:15:54.853008586 +0100
@@ -6380,11 +6380,15 @@ discover_nonconstant_array_refs_r (tree
   /* References of size POLY_INT_CST to a fixed-size object must go
      through memory.  It's more efficient to force that here than
      to create temporary slots on the fly.
-     RTL expansion expectes TARGET_MEM_REF to always address actual memory.  */
+     RTL expansion expectes TARGET_MEM_REF to always address actual memory.
+     Also, force to stack non-BLKmode vars accessed through VIEW_CONVERT_EXPR
+     to BLKmode type.  */
   else if (TREE_CODE (t) == TARGET_MEM_REF
 	   || (TREE_CODE (t) == MEM_REF
 	       && TYPE_SIZE (TREE_TYPE (t))
-	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
+	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
+	   || (TREE_CODE (t) == VIEW_CONVERT_EXPR
+	       && TYPE_MODE (TREE_TYPE (t)) == BLKmode))
     {
       tree base = get_base_address (t);
       if (base
--- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
+++ gcc/expr.cc	2024-01-18 14:15:31.970328685 +0100
@@ -12389,6 +12389,10 @@ expand_expr_real_1 (tree exp, rtx target
       /* If the input and output modes are both the same, we are done.  */
       if (mode == GET_MODE (op0))
 	;
+      /* Similarly if the output mode is BLKmode and input is a MEM,
+	 adjust_address done below is all we need.  */
+      else if (mode == BLKmode && MEM_P (op0))
+	;
       /* If neither mode is BLKmode, and both modes are the same size
 	 then we can use gen_lowpart.  */
       else if (mode != BLKmode


	Jakub
  
Richard Biener Jan. 18, 2024, 1:36 p.m. UTC | #11
On Thu, 18 Jan 2024, Jakub Jelinek wrote:

> On Thu, Jan 18, 2024 at 02:13:55PM +0100, Jakub Jelinek wrote:
> > The == BITINT_TYPE check is non-essential, was just trying to keep existing
> > behavior otherwise.  I can certainly drop that.
> 
> So following then?

OK.

Thanks,
Richard.

> 2024-01-18  Jakub Jelinek  <jakub@redhat.com>
> 	    Richard Biener  <rguenther@suse.de>
> 
> 	* cfgexpand.cc (discover_nonconstant_array_refs_r): Force non-BLKmode
> 	VAR_DECLs referenced in BLKmode VIEW_CONVERT_EXPRs into memory.
> 	* expr.cc (expand_expr_real_1) <case VIEW_CONVERT_EXPR>: Do nothing
> 	but adjust_address also for BLKmode mode and MEM op0.
> 
> --- gcc/cfgexpand.cc.jj	2024-01-16 11:45:16.159326506 +0100
> +++ gcc/cfgexpand.cc	2024-01-18 14:15:54.853008586 +0100
> @@ -6380,11 +6380,15 @@ discover_nonconstant_array_refs_r (tree
>    /* References of size POLY_INT_CST to a fixed-size object must go
>       through memory.  It's more efficient to force that here than
>       to create temporary slots on the fly.
> -     RTL expansion expectes TARGET_MEM_REF to always address actual memory.  */
> +     RTL expansion expectes TARGET_MEM_REF to always address actual memory.
> +     Also, force to stack non-BLKmode vars accessed through VIEW_CONVERT_EXPR
> +     to BLKmode type.  */
>    else if (TREE_CODE (t) == TARGET_MEM_REF
>  	   || (TREE_CODE (t) == MEM_REF
>  	       && TYPE_SIZE (TREE_TYPE (t))
> -	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t)))))
> +	       && POLY_INT_CST_P (TYPE_SIZE (TREE_TYPE (t))))
> +	   || (TREE_CODE (t) == VIEW_CONVERT_EXPR
> +	       && TYPE_MODE (TREE_TYPE (t)) == BLKmode))
>      {
>        tree base = get_base_address (t);
>        if (base
> --- gcc/expr.cc.jj	2024-01-12 10:07:58.194851657 +0100
> +++ gcc/expr.cc	2024-01-18 14:15:31.970328685 +0100
> @@ -12389,6 +12389,10 @@ expand_expr_real_1 (tree exp, rtx target
>        /* If the input and output modes are both the same, we are done.  */
>        if (mode == GET_MODE (op0))
>  	;
> +      /* Similarly if the output mode is BLKmode and input is a MEM,
> +	 adjust_address done below is all we need.  */
> +      else if (mode == BLKmode && MEM_P (op0))
> +	;
>        /* If neither mode is BLKmode, and both modes are the same size
>  	 then we can use gen_lowpart.  */
>        else if (mode != BLKmode
> 
> 
> 	Jakub
> 
>
  

Patch

--- gcc/gimple-lower-bitint.cc.jj	2024-01-17 14:43:33.498961304 +0100
+++ gcc/gimple-lower-bitint.cc	2024-01-17 14:50:50.252889131 +0100
@@ -6348,7 +6348,15 @@  gimple_lower_bitint (void)
 	  tree s = ssa_name (i);
 	  int p = var_to_partition (large_huge.m_map, s);
 	  if (large_huge.m_vars[p] != NULL_TREE)
-	    continue;
+	    {
+	      /* If BITINT_TYPE is BLKmode, make sure the underlying
+		 variable is BLKmode as well.  */
+	      if (TYPE_MODE (TREE_TYPE (s)) == BLKmode
+		  && VAR_P (large_huge.m_vars[p])
+		  && DECL_MODE (large_huge.m_vars[p]) != BLKmode)
+		DECL_MODE (large_huge.m_vars[p]) = BLKmode;
+	      continue;
+	    }
 	  if (atype == NULL_TREE
 	      || !tree_int_cst_equal (TYPE_SIZE (atype),
 				      TYPE_SIZE (TREE_TYPE (s))))
@@ -6359,6 +6367,11 @@  gimple_lower_bitint (void)
 	    }
 	  large_huge.m_vars[p] = create_tmp_var (atype, "bitint");
 	  mark_addressable (large_huge.m_vars[p]);
+	  /* If BITINT_TYPE is BLKmode, make sure the underlying
+	     variable is BLKmode as well.  */
+	  if (TYPE_MODE (TREE_TYPE (s)) == BLKmode
+	      && DECL_MODE (large_huge.m_vars[p]) != BLKmode)
+	    DECL_MODE (large_huge.m_vars[p]) = BLKmode;
 	}
     }