vect: Fix wrong shift_n after widening on BE [PR107338]

Message ID 95d598d7-4f00-ad36-08f9-4b5942e48e42@linux.ibm.com
State New
Headers
Series vect: Fix wrong shift_n after widening on BE [PR107338] |

Commit Message

Kewen.Lin Oct. 24, 2022, 10:43 a.m. UTC
  Hi,

As PR107338 shows, with the use of widening loads, the
container_type can become a wider type, it causes us to
get wrong shift_n since the BIT_FIELD_REF offset actually
becomes bigger on BE.  Taking the case in PR107338 as
example, at the beginning the container type is short and
BIT_FIELD_REF offset is 8 and size is 4, with unpacking to
wider type int, the high 16 bits are zero, by viewing it
as type int, its offset actually becomes to 24.  So the
shift_n should be 4 (32 - 24 - 4) instead of 20 (32 - 8
- 4).

I noticed that if we move shift_n calculation early
before the adjustments for widening loads (container type
change), it's based on all the stuffs of the original
container, the shfit_n calculated there is exactly what
we want, it can be independent of widening.  Besides, I
add prec adjustment together with the current adjustments
for widening loads, although prec's subsequent uses don't
require this change for now, since the container type gets
changed, we should keep the corresponding prec consistent.

Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu, powerpc64-linux-gnu P7 and P8 and
powerpc64le-linux-gnu P9 and P10.

Is it ok for trunk?

BR,
Kewen
-----

	PR tree-optimization/107338

gcc/ChangeLog:

	* tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Move
	shfit_n calculation before the adjustments for widening loads.
---
 gcc/tree-vect-patterns.cc | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

--
2.35.4
  

Comments

Richard Biener Oct. 24, 2022, 12:55 p.m. UTC | #1
On Mon, Oct 24, 2022 at 12:43 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi,
>
> As PR107338 shows, with the use of widening loads, the
> container_type can become a wider type, it causes us to
> get wrong shift_n since the BIT_FIELD_REF offset actually
> becomes bigger on BE.  Taking the case in PR107338 as
> example, at the beginning the container type is short and
> BIT_FIELD_REF offset is 8 and size is 4, with unpacking to
> wider type int, the high 16 bits are zero, by viewing it
> as type int, its offset actually becomes to 24.  So the
> shift_n should be 4 (32 - 24 - 4) instead of 20 (32 - 8
> - 4).
>
> I noticed that if we move shift_n calculation early
> before the adjustments for widening loads (container type
> change), it's based on all the stuffs of the original
> container, the shfit_n calculated there is exactly what
> we want, it can be independent of widening.  Besides, I
> add prec adjustment together with the current adjustments
> for widening loads, although prec's subsequent uses don't
> require this change for now, since the container type gets
> changed, we should keep the corresponding prec consistent.
>
> Bootstrapped and regtested on x86_64-redhat-linux,
> aarch64-linux-gnu, powerpc64-linux-gnu P7 and P8 and
> powerpc64le-linux-gnu P9 and P10.
>
> Is it ok for trunk?

OK.

Richard.

> BR,
> Kewen
> -----
>
>         PR tree-optimization/107338
>
> gcc/ChangeLog:
>
>         * tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Move
>         shfit_n calculation before the adjustments for widening loads.
> ---
>  gcc/tree-vect-patterns.cc | 17 +++++++++++------
>  1 file changed, 11 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 777ba2f5903..01094e8cb86 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -1925,6 +1925,16 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
>    tree container_type = TREE_TYPE (container);
>    tree vectype = get_vectype_for_scalar_type (vinfo, container_type);
>
> +  /* Calculate shift_n before the adjustments for widening loads, otherwise
> +     the container may change and we have to consider offset change for
> +     widening loads on big endianness.  The shift_n calculated here can be
> +     independent of widening.  */
> +  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
> +  unsigned HOST_WIDE_INT mask_width = bit_field_size (bf_ref).to_constant ();
> +  unsigned HOST_WIDE_INT prec = tree_to_uhwi (TYPE_SIZE (container_type));
> +  if (BYTES_BIG_ENDIAN)
> +    shift_n = prec - shift_n - mask_width;
> +
>    /* We move the conversion earlier if the loaded type is smaller than the
>       return type to enable the use of widening loads.  */
>    if (TYPE_PRECISION (TREE_TYPE (container)) < TYPE_PRECISION (ret_type)
> @@ -1935,6 +1945,7 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
>                                NOP_EXPR, container);
>        container = gimple_get_lhs (pattern_stmt);
>        container_type = TREE_TYPE (container);
> +      prec = tree_to_uhwi (TYPE_SIZE (container_type));
>        vectype = get_vectype_for_scalar_type (vinfo, container_type);
>        append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
>      }
> @@ -1953,12 +1964,6 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
>         shift_first = false;
>      }
>
> -  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
> -  unsigned HOST_WIDE_INT mask_width = bit_field_size (bf_ref).to_constant ();
> -  unsigned HOST_WIDE_INT prec = tree_to_uhwi (TYPE_SIZE (container_type));
> -  if (BYTES_BIG_ENDIAN)
> -    shift_n = prec - shift_n - mask_width;
> -
>    /* If we don't have to shift we only generate the mask, so just fix the
>       code-path to shift_first.  */
>    if (shift_n == 0)
> --
> 2.35.4
  
Kewen.Lin Oct. 25, 2022, 5:28 a.m. UTC | #2
on 2022/10/24 20:55, Richard Biener wrote:
> On Mon, Oct 24, 2022 at 12:43 PM Kewen.Lin <linkw@linux.ibm.com> wrote:
>>
>> Hi,
>>
>> As PR107338 shows, with the use of widening loads, the
>> container_type can become a wider type, it causes us to
>> get wrong shift_n since the BIT_FIELD_REF offset actually
>> becomes bigger on BE.  Taking the case in PR107338 as
>> example, at the beginning the container type is short and
>> BIT_FIELD_REF offset is 8 and size is 4, with unpacking to
>> wider type int, the high 16 bits are zero, by viewing it
>> as type int, its offset actually becomes to 24.  So the
>> shift_n should be 4 (32 - 24 - 4) instead of 20 (32 - 8
>> - 4).
>>
>> I noticed that if we move shift_n calculation early
>> before the adjustments for widening loads (container type
>> change), it's based on all the stuffs of the original
>> container, the shfit_n calculated there is exactly what
>> we want, it can be independent of widening.  Besides, I
>> add prec adjustment together with the current adjustments
>> for widening loads, although prec's subsequent uses don't
>> require this change for now, since the container type gets
>> changed, we should keep the corresponding prec consistent.
>>
>> Bootstrapped and regtested on x86_64-redhat-linux,
>> aarch64-linux-gnu, powerpc64-linux-gnu P7 and P8 and
>> powerpc64le-linux-gnu P9 and P10.
>>
>> Is it ok for trunk?
> 
> OK.

Thanks Richi, committed in r13-3474.

BR,
Kewen
  

Patch

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 777ba2f5903..01094e8cb86 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1925,6 +1925,16 @@  vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
   tree container_type = TREE_TYPE (container);
   tree vectype = get_vectype_for_scalar_type (vinfo, container_type);

+  /* Calculate shift_n before the adjustments for widening loads, otherwise
+     the container may change and we have to consider offset change for
+     widening loads on big endianness.  The shift_n calculated here can be
+     independent of widening.  */
+  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
+  unsigned HOST_WIDE_INT mask_width = bit_field_size (bf_ref).to_constant ();
+  unsigned HOST_WIDE_INT prec = tree_to_uhwi (TYPE_SIZE (container_type));
+  if (BYTES_BIG_ENDIAN)
+    shift_n = prec - shift_n - mask_width;
+
   /* We move the conversion earlier if the loaded type is smaller than the
      return type to enable the use of widening loads.  */
   if (TYPE_PRECISION (TREE_TYPE (container)) < TYPE_PRECISION (ret_type)
@@ -1935,6 +1945,7 @@  vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
 			       NOP_EXPR, container);
       container = gimple_get_lhs (pattern_stmt);
       container_type = TREE_TYPE (container);
+      prec = tree_to_uhwi (TYPE_SIZE (container_type));
       vectype = get_vectype_for_scalar_type (vinfo, container_type);
       append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
     }
@@ -1953,12 +1964,6 @@  vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
 	shift_first = false;
     }

-  unsigned HOST_WIDE_INT shift_n = bit_field_offset (bf_ref).to_constant ();
-  unsigned HOST_WIDE_INT mask_width = bit_field_size (bf_ref).to_constant ();
-  unsigned HOST_WIDE_INT prec = tree_to_uhwi (TYPE_SIZE (container_type));
-  if (BYTES_BIG_ENDIAN)
-    shift_n = prec - shift_n - mask_width;
-
   /* If we don't have to shift we only generate the mask, so just fix the
      code-path to shift_first.  */
   if (shift_n == 0)