[v2] phi-opt: Add missed optimization for "(cond | (a != b)) ? b : a"

Message ID DB9PR08MB6634EE2CF525F57E9DFA8960B7542@DB9PR08MB6634.eurprd08.prod.outlook.com
State New
Headers
Series [v2] phi-opt: Add missed optimization for "(cond | (a != b)) ? b : a" |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 success Test passed

Commit Message

Jovan Vukic Oct. 30, 2024, 12:33 p.m. UTC
  Thanks for the feedback on the first version of the patch. Accordingly:

I have corrected the code formatting as requested. I added new tests to
the existing file phi-opt-11.c, instead of creating a new one.

I performed testing before and after applying the patch on the x86
architecture, and I confirm that there are no new regressions.

The logic and general code of the patch itself have not been changed.

> So the A EQ/NE B expression, we can reverse A and B in the expression
> and still get the same result. But don't we have to be more careful for
> the TRUE/FALSE arms of the ternary? For BIT_AND we need ? a : b for
> BIT_IOR we need ? b : a.
>
> I don't see that gets verified in the existing code or after your
> change. I suspect I'm just missing something here. Can you clarify how
> we verify that BIT_AND gets ? a : b for the true/false arms and that
> BIT_IOR gets ? b : a for the true/false arms?

I did not communicate this clearly last time, but the existing optimization
simplifies the expression "(cond & (a == b)) ? a : b" to the simpler "b".
Similarly, the expression "(cond & (a == b)) ? b : a" simplifies to "a".

Thus, the existing and my optimization perform the following
simplifications:

(cond & (a == b)) ? a : b -> b
(cond & (a == b)) ? b : a -> a
(cond | (a != b)) ? a : b -> a
(cond | (a != b)) ? b : a -> b

For this reason, for BIT_AND_EXPR when we have A EQ B, it is sufficient to
confirm that one operand matches the true/false arm and the other matches
the false/true arm. In both cases, we simplify the expression to the third
operand of the ternary operation (i.e., OP0 ? OP1 : OP2 simplifies to OP2).
This is achieved in the value_replacement function after successfully
setting the value of *code within the rhs_is_fed_for_value_replacement
function to EQ_EXPR.

For BIT_IOR_EXPR, the same check is performed for A NE B, except now
*code remains NE_EXPR, and then value_replacement returns the second
operand (i.e., OP0 ? OP1 : OP2 simplifies to OP1).

2024-10-30  Jovan Vukic  <Jovan.Vukic@rt-rk.com>

gcc/ChangeLog:

        * tree-ssa-phiopt.cc
        (rhs_is_fed_for_value_replacement): Add a new optimization opportunity
        for BIT_IOR_EXPR and a != b.
        (operand_equal_for_value_replacement): Ditto.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-ssa/phi-opt-11.c: Add more tests.


CONFIDENTIALITY: The contents of this e-mail are confidential and intended only for the above addressee(s). If you are not the intended recipient, or the person responsible for delivering it to the intended recipient, copying or delivering it to anyone else or using it in any unauthorized manner is prohibited and may be unlawful. If you receive this e-mail by mistake, please notify the sender and the systems administrator at straymail@rt-rk.com immediately.
---
 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-11.c | 31 +++++++++++++-
 gcc/tree-ssa-phiopt.cc                     | 48 ++++++++++++++--------
 2 files changed, 60 insertions(+), 19 deletions(-)
  

Comments

Andrew Pinski Nov. 1, 2024, 11:38 p.m. UTC | #1
On Wed, Oct 30, 2024 at 5:34 AM Jovan Vukic <Jovan.Vukic@rt-rk.com> wrote:
>
> Thanks for the feedback on the first version of the patch. Accordingly:
>
> I have corrected the code formatting as requested. I added new tests to
> the existing file phi-opt-11.c, instead of creating a new one.

The previous practice has been adding a new file rather than appending
to the old one.
Though maybe this is a good exception to that rule. The main reason is
to compare between a much older testsuite run and a testsuite run
after this patch.
I am not so picky about this either way though; just bring up the
reasons why we did it differently before.

>
> I performed testing before and after applying the patch on the x86
> architecture, and I confirm that there are no new regressions.
>
> The logic and general code of the patch itself have not been changed.
>
> > So the A EQ/NE B expression, we can reverse A and B in the expression
> > and still get the same result. But don't we have to be more careful for
> > the TRUE/FALSE arms of the ternary? For BIT_AND we need ? a : b for
> > BIT_IOR we need ? b : a.
> >
> > I don't see that gets verified in the existing code or after your
> > change. I suspect I'm just missing something here. Can you clarify how
> > we verify that BIT_AND gets ? a : b for the true/false arms and that
> > BIT_IOR gets ? b : a for the true/false arms?
>
> I did not communicate this clearly last time, but the existing optimization
> simplifies the expression "(cond & (a == b)) ? a : b" to the simpler "b".
> Similarly, the expression "(cond & (a == b)) ? b : a" simplifies to "a".
>
> Thus, the existing and my optimization perform the following
> simplifications:
>
> (cond & (a == b)) ? a : b -> b
> (cond & (a == b)) ? b : a -> a
> (cond | (a != b)) ? a : b -> a
> (cond | (a != b)) ? b : a -> b
>
> For this reason, for BIT_AND_EXPR when we have A EQ B, it is sufficient to
> confirm that one operand matches the true/false arm and the other matches
> the false/true arm. In both cases, we simplify the expression to the third
> operand of the ternary operation (i.e., OP0 ? OP1 : OP2 simplifies to OP2).
> This is achieved in the value_replacement function after successfully
> setting the value of *code within the rhs_is_fed_for_value_replacement
> function to EQ_EXPR.
>
> For BIT_IOR_EXPR, the same check is performed for A NE B, except now
> *code remains NE_EXPR, and then value_replacement returns the second
> operand (i.e., OP0 ? OP1 : OP2 simplifies to OP1).

I had started to rewrite this part of phiopt to use match-and-simplify
and to handle these kind of things.
I attached the first set of patches to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51030 and redid a
simplified version attached here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61110 .
I am not blocking this patch but mentioning that a rewrite is on the
way (how long before I get back to this, I don't know yet).
Also as far as reviewing the patch as it stands alone.

+   && ((bit_expression_code == BIT_AND_EXPR
+        && gimple_assign_rhs_code (def1) == EQ_EXPR)
+       || (bit_expression_code == BIT_IOR_EXPR
+   && gimple_assign_rhs_code (def1) == NE_EXPR)))

This could be simplified down to I think:
`gimple_assign_rhs_code (def1) == (bit_expression_code ==
BIT_AND_EXPR) ? EQ_EXPR : NE_EXPR`

And then add an assert at the beginning of
rhs_is_fed_for_value_replacement checking to make sure
bit_expression_code is either `BIT_AND_EXPR` or `BIT_IOR_EXPR` or
maybe return false right if it is not BIT_AND_EXPR/BIT_IOR_EXPR.

Otherwise the change looks good to me (but I can't fully approve it).

Thanks,
Andrew Pinski

>
> 2024-10-30  Jovan Vukic  <Jovan.Vukic@rt-rk.com>
>
> gcc/ChangeLog:
>
>         * tree-ssa-phiopt.cc
>         (rhs_is_fed_for_value_replacement): Add a new optimization opportunity
>         for BIT_IOR_EXPR and a != b.
>         (operand_equal_for_value_replacement): Ditto.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/phi-opt-11.c: Add more tests.
>
>
> CONFIDENTIALITY: The contents of this e-mail are confidential and intended only for the above addressee(s). If you are not the intended recipient, or the person responsible for delivering it to the intended recipient, copying or delivering it to anyone else or using it in any unauthorized manner is prohibited and may be unlawful. If you receive this e-mail by mistake, please notify the sender and the systems administrator at straymail@rt-rk.com immediately.
  
Jeff Law Nov. 2, 2024, 12:04 a.m. UTC | #2
On 10/30/24 6:33 AM, Jovan Vukic wrote:
> Thanks for the feedback on the first version of the patch. Accordingly:
> 
> I have corrected the code formatting as requested. I added new tests to
> the existing file phi-opt-11.c, instead of creating a new one.
> 
> I performed testing before and after applying the patch on the x86
> architecture, and I confirm that there are no new regressions.
> 
> The logic and general code of the patch itself have not been changed.
> 
>> So the A EQ/NE B expression, we can reverse A and B in the expression
>> and still get the same result. But don't we have to be more careful for
>> the TRUE/FALSE arms of the ternary? For BIT_AND we need ? a : b for
>> BIT_IOR we need ? b : a.
>>
>> I don't see that gets verified in the existing code or after your
>> change. I suspect I'm just missing something here. Can you clarify how
>> we verify that BIT_AND gets ? a : b for the true/false arms and that
>> BIT_IOR gets ? b : a for the true/false arms?
> 
> I did not communicate this clearly last time, but the existing optimization
> simplifies the expression "(cond & (a == b)) ? a : b" to the simpler "b".
> Similarly, the expression "(cond & (a == b)) ? b : a" simplifies to "a".
> 
> Thus, the existing and my optimization perform the following
> simplifications:
> 
> (cond & (a == b)) ? a : b -> b
> (cond & (a == b)) ? b : a -> a
> (cond | (a != b)) ? a : b -> a
> (cond | (a != b)) ? b : a -> b
This is well understood.  The key in my mind is that for AND we always 
select the FALSE arm.  For IOR we always select the TRUE arm.


> 
> For this reason, for BIT_AND_EXPR when we have A EQ B, it is sufficient to
> confirm that one operand matches the true/false arm and the other matches
> the false/true arm. In both cases, we simplify the expression to the third
> operand of the ternary operation (i.e., OP0 ? OP1 : OP2 simplifies to OP2).
Right.  No concerns there.

> This is achieved in the value_replacement function after successfully
> setting the value of *code within the rhs_is_fed_for_value_replacement
> function to EQ_EXPR.


> 
> For BIT_IOR_EXPR, the same check is performed for A NE B, except now
> *code remains NE_EXPR, and then value_replacement returns the second
> operand (i.e., OP0 ? OP1 : OP2 simplifies to OP1).
This is the part I'm struggling a bit with.  But I'm guessing you're 
referring to this code:

>      /* For NE_EXPR, we want to build an assignment result = arg where
>          arg is the PHI argument associated with the true edge.  For
>          EQ_EXPR we want the PHI argument associated with the false edge.  */
>       e = (code == NE_EXPR ? true_edge : false_edge);
If I understand everything correctly your assertion is that we'll only 
get here for AND/EQ_EXPR and IOR/NE_EXPR.  There's no way to get here 
for AND/NE_EXPR or IOR/EQ_EXPR?

Intuitively that's probably true since if it wasn't we'd likely already 
be generating incorrect code for AND with NE_EXPR.  But I'd like you to 
confirm my basic understanding.

Thanks,
jeff
  
Jovan Vukic Nov. 4, 2024, 1:23 p.m. UTC | #3
On 11/02/24, Jeff Law wrote:
>This is well understood.  The key in my mind is that for AND we always
>select the FALSE arm.  For IOR we always select the TRUE arm.

Yes, I agree.

>>       e = (code == NE_EXPR ? true_edge : false_edge);
>If I understand everything correctly your assertion is that we'll only
>get here for AND/EQ_EXPR and IOR/NE_EXPR.  There's no way to get here
>for AND/NE_EXPR or IOR/EQ_EXPR?

If we examine the patch step by step, we can see that the function
rhs_is_fed_for_value_replacement enters the if block exclusively for
the combinations BIT_AND_EXPR/EQ_EXPR and BIT_IOR_EXPR/NE_EXPR. It is
only at this point that it returns true and sets the value of *code. This is
evident in the code:

>  if (TREE_CODE (rhs) == SSA_NAME)
>    {
>      ...
>      if (is_gimple_assign (def1)
>         && ((bit_expression_code == BIT_AND_EXPR
>              && gimple_assign_rhs_code (def1) == EQ_EXPR)
>             || (bit_expression_code == BIT_IOR_EXPR
>                 && gimple_assign_rhs_code (def1) == NE_EXPR)))
>       {
>         ...
>             *code = gimple_assign_rhs_code (def1);
>             return true;
>         ...
>       }
>    }
>  return false;

The function operand_equal_for_value_replacement only passes the value
it receives from rhs_is_fed_for_value_replacement to the call site.

At the call site, the returned value initializes the variable equal_p.

Thus, the return value is simply passed from
rhs_is_fed_for_value_replacement to the variable equal_p, and it is
true only in the situation of BIT_AND_EXPR/EQ_EXPR or
BIT_IOR_EXPR/NE_EXPR because that is how it is defined in
rhs_is_fed_for_value_replacement.

The patch only sets equal_p to true for the combinations AND/EQ and
OR/NE. These are the only two situations when equal_p becomes true as
a result of the patch, and therefore, only in those situations, as a
consequence of the patch, does the line of code you quoted execute:

> if (equal_p || maybe_equal_p)
> {
>       ...
>       e = (code == NE_EXPR ? true_edge : false_edge);
>       ...
>       if (...)
>       {
>               ...
>               if (equal_p)
>               {
>                       ...
>                       return 2;
>               }
>       }
>       else if (equal_p)
>       {
>               ...
>               return 1;
>       }
> }

Also, Mr. Pinski left a comment
(https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667258.html)
and offered some suggestions about the patch.

He also mentioned that he is working on integrating the affected code into
match-and-simplify, so the rewrite is on the way. We can either move forward
with this patch or stop it. Either way, I’m fine with any decision made.


CONFIDENTIALITY: The contents of this e-mail are confidential and intended only for the above addressee(s). If you are not the intended recipient, or the person responsible for delivering it to the intended recipient, copying or delivering it to anyone else or using it in any unauthorized manner is prohibited and may be unlawful. If you receive this e-mail by mistake, please notify the sender and the systems administrator at straymail@rt-rk.com immediately.
  

Patch

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-11.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-11.c
index 14c82cd5216..d1e284c5325 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-11.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-11.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O1 -fdump-tree-optimized --param logical-op-non-short-circuit=1" } */
+/* { dg-options "-O1 -fdump-tree-phiopt2 -fdump-tree-optimized --param logical-op-non-short-circuit=1" } */
 
 int f(int a, int b, int c)
 {
@@ -22,4 +22,33 @@  int h(int a, int b, int c, int d)
  return a;
 }
 
+int i(int a, int b, int c)
+{
+  if ((a > c) & (a == b))
+    return a;
+  return b;
+}
+
+int j(int a, int b, int c)
+{
+  if ((a > c) & (a == b))
+    return b;
+  return a;
+}
+
+int k(int a, int b, int c)
+{
+  if ((a > c) | (a != b))
+    return b;
+  return a;
+}
+
+int l(int a, int b, int c)
+{
+  if ((a > c) | (a != b))
+    return a;
+  return b;
+}
+
+/* { dg-final { scan-tree-dump-times "if" 0 "phiopt2" } } */
 /* { dg-final { scan-tree-dump-times "if" 0 "optimized" } } */
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index cffafe101a4..61b33bfc361 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -1078,17 +1078,18 @@  jump_function_from_stmt (tree *arg, gimple *stmt)
   return false;
 }
 
-/* RHS is a source argument in a BIT_AND_EXPR which feeds a conditional
-   of the form SSA_NAME NE 0.
+/* RHS is a source argument in a BIT_AND_EXPR or BIT_IOR_EXPR which feeds
+   a conditional of the form SSA_NAME NE 0.
 
-   If RHS is fed by a simple EQ_EXPR comparison of two values, see if
-   the two input values of the EQ_EXPR match arg0 and arg1.
+   If RHS is fed by a simple EQ_EXPR or NE_EXPR comparison of two values,
+   see if the two input values of the comparison match arg0 and arg1.
 
    If so update *code and return TRUE.  Otherwise return FALSE.  */
 
 static bool
 rhs_is_fed_for_value_replacement (const_tree arg0, const_tree arg1,
-				  enum tree_code *code, const_tree rhs)
+				  enum tree_code *code, const_tree rhs,
+				  enum tree_code bit_expression_code)
 {
   /* Obviously if RHS is not an SSA_NAME, we can't look at the defining
      statement.  */
@@ -1096,11 +1097,15 @@  rhs_is_fed_for_value_replacement (const_tree arg0, const_tree arg1,
     {
       gimple *def1 = SSA_NAME_DEF_STMT (rhs);
 
-      /* Verify the defining statement has an EQ_EXPR on the RHS.  */
-      if (is_gimple_assign (def1) && gimple_assign_rhs_code (def1) == EQ_EXPR)
+      /* Verify the defining statement has an EQ_EXPR or NE_EXPR on the RHS.  */
+      if (is_gimple_assign (def1)
+	  && ((bit_expression_code == BIT_AND_EXPR
+	       && gimple_assign_rhs_code (def1) == EQ_EXPR)
+	      || (bit_expression_code == BIT_IOR_EXPR
+		  && gimple_assign_rhs_code (def1) == NE_EXPR)))
 	{
-	  /* Finally verify the source operands of the EQ_EXPR are equal
-	     to arg0 and arg1.  */
+	  /* Finally verify the source operands of the EQ_EXPR or NE_EXPR
+	     are equal to arg0 and arg1.  */
 	  tree op0 = gimple_assign_rhs1 (def1);
 	  tree op1 = gimple_assign_rhs2 (def1);
 	  if ((operand_equal_for_phi_arg_p (arg0, op0)
@@ -1119,8 +1124,9 @@  rhs_is_fed_for_value_replacement (const_tree arg0, const_tree arg1,
 
 /* Return TRUE if arg0/arg1 are equal to the rhs/lhs or lhs/rhs of COND.
 
-   Also return TRUE if arg0/arg1 are equal to the source arguments of a
-   an EQ comparison feeding a BIT_AND_EXPR which feeds COND.
+   Also return TRUE if arg0/arg1 are equal to the source arguments of an
+   EQ comparison feeding a BIT_AND_EXPR, or NE comparison feeding a
+   BIT_IOR_EXPR which feeds COND.
 
    Return FALSE otherwise.  */
 
@@ -1139,27 +1145,33 @@  operand_equal_for_value_replacement (const_tree arg0, const_tree arg1,
     return true;
 
   /* Now handle more complex case where we have an EQ comparison
-     which feeds a BIT_AND_EXPR which feeds COND.
+     feeding a BIT_AND_EXPR, or a NE comparison feeding a BIT_IOR_EXPR,
+     which then feeds into COND.
 
      First verify that COND is of the form SSA_NAME NE 0.  */
   if (*code != NE_EXPR || !integer_zerop (rhs)
       || TREE_CODE (lhs) != SSA_NAME)
     return false;
 
-  /* Now ensure that SSA_NAME is set by a BIT_AND_EXPR.  */
+  /* Now ensure that SSA_NAME is set by a BIT_AND_EXPR or BIT_OR_EXPR.  */
   def = SSA_NAME_DEF_STMT (lhs);
-  if (!is_gimple_assign (def) || gimple_assign_rhs_code (def) != BIT_AND_EXPR)
+  if (!is_gimple_assign (def)
+      || (gimple_assign_rhs_code (def) != BIT_AND_EXPR
+	  && gimple_assign_rhs_code (def) != BIT_IOR_EXPR))
     return false;
 
-  /* Now verify arg0/arg1 correspond to the source arguments of an
-     EQ comparison feeding the BIT_AND_EXPR.  */
+  /* Now verify arg0/arg1 correspond to the source arguments of an EQ
+     comparison feeding the BIT_AND_EXPR or a NE comparison feeding the
+     BIT_IOR_EXPR.  */
 
   tree tmp = gimple_assign_rhs1 (def);
-  if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
+  if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp,
+					gimple_assign_rhs_code (def)))
     return true;
 
   tmp = gimple_assign_rhs2 (def);
-  if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp))
+  if (rhs_is_fed_for_value_replacement (arg0, arg1, code, tmp,
+					gimple_assign_rhs_code (def)))
     return true;
 
   return false;