match.pd: Fix popcount (X) + popcount (Y) simplification [PR112719]

Message ID ZWWk9ObmRt5RlIuV@tucnak
State New
Series match.pd: Fix popcount (X) + popcount (Y) simplification [PR112719]

Checks

Context                                          Check  Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm       fail   Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64   fail   Patch failed to apply

Commit Message

Jakub Jelinek Nov. 28, 2023, 8:29 a.m. UTC
  Hi!

Since my PR112566 r14-5557 changes, the following testcase ICEs, because
a simplification of .POPCOUNT (x) + .POPCOUNT (y) is attempted even when
x and y have incompatible types (different precisions).
Note that with _BitInt it can ICE already starting with r14-5435, and
I think it has existed as a latent problem for years, because IFN_POPCOUNT
calls inherently can have different argument types while the return type
is always the same.
The following patch fixes it by using widest_int during the analysis
(which is where it was ICEing) and, if the expression is optimizable,
casting both operands to the wider type so that bit_ior has matching
argument types.
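
As an illustration of the transformation described above (this is not part of the patch), here is a minimal C sketch of the identity and of the widening step. The helper names disjoint_bits, popcount_sum and popcount_of_ior are hypothetical, and uint64_t only stands in for the role widest_int plays in match.pd. The sketch assumes a compiler that provides __builtin_popcount (GCC or Clang) and a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: true when the two operands share
   no set bit once both are viewed at one common, wide-enough width.
   uint64_t is only a stand-in for widest_int here.  */
static bool
disjoint_bits (unsigned int x, unsigned short y)
{
  return ((uint64_t) x & (uint64_t) y) == 0;
}

/* The original form: two popcounts on operands of different precision.  */
static int
popcount_sum (unsigned int x, unsigned short y)
{
  return __builtin_popcount (x) + __builtin_popcount (y);
}

/* The simplified form: convert both operands to the wider of the two
   types, OR them, and count once.  Only valid for disjoint operands,
   which the assertion enforces.  */
static int
popcount_of_ior (unsigned int x, unsigned short y)
{
  assert (disjoint_bits (x, y));
  unsigned int wide_y = y;      /* narrower operand converted to wider type */
  return __builtin_popcount (x | wide_y);
}
```

With x = 0xaaaaaaaa and y = 0x5555 the operands are disjoint, so popcount_sum and popcount_of_ior agree. With x = 3 and y = 1 they overlap, the two forms would differ, and the simplification must not be applied.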

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-11-28  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/112719
	* match.pd (popcount (X) + popcount (Y) -> POPCOUNT (X | Y)): Deal
	with argument types with different precisions.


	Jakub
  

Comments

Richard Biener Nov. 28, 2023, 9:09 a.m. UTC | #1
> On 28.11.2023 at 09:30, Jakub Jelinek <jakub@redhat.com> wrote:
> 
> Hi!
> 
> Since my PR112566 r14-5557 changes, the following testcase ICEs, because
> a simplification of .POPCOUNT (x) + .POPCOUNT (y) is attempted even when
> x and y have incompatible types (different precisions).
> Note that with _BitInt it can ICE already starting with r14-5435, and
> I think it has existed as a latent problem for years, because IFN_POPCOUNT
> calls inherently can have different argument types while the return type
> is always the same.
> The following patch fixes it by using widest_int during the analysis
> (which is where it was ICEing) and, if the expression is optimizable,
> casting both operands to the wider type so that bit_ior has matching
> argument types.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 

Ok

Richard 

> 2023-11-28  Jakub Jelinek  <jakub@redhat.com>
> 
>    PR tree-optimization/112719
>    * match.pd (popcount (X) + popcount (Y) -> POPCOUNT (X | Y)): Deal
>    with argument types with different precisions.
> 
> --- gcc/match.pd.jj    2023-11-24 11:32:22.161777182 +0100
> +++ gcc/match.pd    2023-11-27 10:43:54.857068074 +0100
> @@ -8723,8 +8723,13 @@ (define_operator_list SYNC_FETCH_AND_AND
> (simplify
>   (plus (POPCOUNT:s @0) (POPCOUNT:s @1))
>   (if (INTEGRAL_TYPE_P (type)
> -       && wi::bit_and (tree_nonzero_bits (@0), tree_nonzero_bits (@1)) == 0)
> -    (POPCOUNT (bit_ior @0 @1))))
> +       && (wi::bit_and (widest_int::from (tree_nonzero_bits (@0), UNSIGNED),
> +            widest_int::from (tree_nonzero_bits (@1), UNSIGNED))
> +       == 0))
> +   (with { tree utype = TREE_TYPE (@0);
> +       if (TYPE_PRECISION (utype) < TYPE_PRECISION (TREE_TYPE (@1)))
> +         utype = TREE_TYPE (@1); }
> +    (POPCOUNT (bit_ior (convert:utype @0) (convert:utype @1))))))
> 
> /* popcount(X) == 0 is X == 0, and related (in)equalities.  */
> (for popcount (POPCOUNT)
> --- gcc/testsuite/gcc.dg/pr112719.c.jj    2023-11-27 10:35:44.428911015 +0100
> +++ gcc/testsuite/gcc.dg/pr112719.c    2023-11-27 10:35:27.262153103 +0100
> @@ -0,0 +1,18 @@
> +/* PR tree-optimization/112719 */
> +/* { dg-do compile } */
> +/* { dg-options "-O" } */
> +/* { dg-additional-options "-msse4" { target i?86-*-* x86_64-*-* } } */
> +
> +int
> +foo (unsigned int a, unsigned short b)
> +{
> +  return __builtin_popcountl (a) + __builtin_popcountl (b);
> +}
> +
> +int
> +bar (unsigned int a, unsigned short b)
> +{
> +  a &= 0xaaaaaaaaUL;
> +  b &= 0x5555;
> +  return __builtin_popcountll (a) + __builtin_popcountll (b);
> +}
> 
>    Jakub
>
  

Patch

--- gcc/match.pd.jj	2023-11-24 11:32:22.161777182 +0100
+++ gcc/match.pd	2023-11-27 10:43:54.857068074 +0100
@@ -8723,8 +8723,13 @@  (define_operator_list SYNC_FETCH_AND_AND
 (simplify
   (plus (POPCOUNT:s @0) (POPCOUNT:s @1))
   (if (INTEGRAL_TYPE_P (type)
-       && wi::bit_and (tree_nonzero_bits (@0), tree_nonzero_bits (@1)) == 0)
-    (POPCOUNT (bit_ior @0 @1))))
+       && (wi::bit_and (widest_int::from (tree_nonzero_bits (@0), UNSIGNED),
+			widest_int::from (tree_nonzero_bits (@1), UNSIGNED))
+	   == 0))
+   (with { tree utype = TREE_TYPE (@0);
+	   if (TYPE_PRECISION (utype) < TYPE_PRECISION (TREE_TYPE (@1)))
+	     utype = TREE_TYPE (@1); }
+    (POPCOUNT (bit_ior (convert:utype @0) (convert:utype @1))))))
 
 /* popcount(X) == 0 is X == 0, and related (in)equalities.  */
 (for popcount (POPCOUNT)
--- gcc/testsuite/gcc.dg/pr112719.c.jj	2023-11-27 10:35:44.428911015 +0100
+++ gcc/testsuite/gcc.dg/pr112719.c	2023-11-27 10:35:27.262153103 +0100
@@ -0,0 +1,18 @@ 
+/* PR tree-optimization/112719 */
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+/* { dg-additional-options "-msse4" { target i?86-*-* x86_64-*-* } } */
+
+int
+foo (unsigned int a, unsigned short b)
+{
+  return __builtin_popcountl (a) + __builtin_popcountl (b);
+}
+
+int
+bar (unsigned int a, unsigned short b)
+{
+  a &= 0xaaaaaaaaUL;
+  b &= 0x5555;
+  return __builtin_popcountll (a) + __builtin_popcountll (b);
+}
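
The two functions in the testcase differ in what is known about the operands' possibly-nonzero bits. The following C sketch models that analysis in a simplified way. It is an illustration only: struct nonzero_mask, mask_after_and and operands_disjoint are hypothetical names, not GCC's tree_nonzero_bits or wide_int API, and the sketch assumes the operands keep their declared 32-bit and 16-bit precisions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a "possibly nonzero bits" mask for an operand of
   a given precision.  Not GCC's representation.  */
struct nonzero_mask
{
  unsigned precision;   /* number of bits in the operand's type */
  uint64_t bits;        /* bits that may be set, zero-extended to 64 bits */
};

/* Mask of an operand of the given precision after "operand &= constant".
   Passing UINT64_MAX as the constant models an unconstrained operand.  */
static struct nonzero_mask
mask_after_and (unsigned precision, uint64_t constant)
{
  assert (precision >= 1 && precision <= 64);
  uint64_t all = precision == 64
                 ? UINT64_MAX
                 : (((uint64_t) 1 << precision) - 1);
  struct nonzero_mask m = { precision, all & constant };
  return m;
}

/* Both masks are already stored zero-extended at one common width, so
   intersecting them is well defined even when the precisions differ.  */
static bool
operands_disjoint (struct nonzero_mask a, struct nonzero_mask b)
{
  return (a.bits & b.bits) == 0;
}
```

Under this model, bar's operands (32 bits masked with 0xaaaaaaaa, 16 bits masked with 0x5555) are disjoint, so the simplification can apply. foo's unconstrained operands overlap in the low 16 bits, so it cannot.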