simplify-rtx, v2: Punt on simplify_associative_operation with large operands [PR102356]
Commit Message
On Tue, Nov 30, 2021 at 10:43:28AM +0100, Richard Biener wrote:
> I wonder given we now have 'simplify_context' whether we can
> track a re-association budget we can eat from. At least your
> code to determine whether the expression is too large is
> quadratic as well (but bound to 64, so just a very large constant
> overhead for an outermost expression of size 63). We already
> have a mem_depth there,
Makes sense.
> so just have reassoc_times and punt
> if that reaches --param max-simplify-reassoc-times, incrementing
> it each time simplify_associative_operation is entered?
Though, is a --param worth for it? There is IMO no way the 64 limit
can trigger for non-debug insns (I can certainly gather how many times
it triggers when > 20 and in which pass during bootstrap/regtest
to verify).
2021-11-30 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/102356
* rtl.h (simplify_context): Add assoc_count member.
* simplify-rtx.c (simplify_associative_operation): Don't reassociate
more than 64 times within one outermost simplify_* call.
* dwarf2out.c (mem_loc_descriptor): Optimize binary operation
with both operands the same using DW_OP_dup.
* gcc.dg/pr102356.c: New test.
Jakub
Comments
On Tue, 30 Nov 2021, Jakub Jelinek wrote:
> On Tue, Nov 30, 2021 at 10:43:28AM +0100, Richard Biener wrote:
> > I wonder given we now have 'simplify_context' whether we can
> > track a re-association budget we can eat from. At least your
> > code to determine whether the expression is too large is
> > quadratic as well (but bound to 64, so just a very large constant
> > overhead for an outermost expression of size 63). We already
> > have a mem_depth there,
>
> Makes sense.
>
> > so just have reassoc_times and punt
> > if that reaches --param max-simplify-reassoc-times, incrementing
> > it each time simplify_associative_operation is entered?
>
> Though, is a --param worth for it? There is IMO no way the 64 limit
> can trigger for non-debug insns (I can certainly gather how many times
> it triggers when > 20 and in which pass during bootstrap/regtest
> to verify).
Probably not - but maybe use a (static) const unsigned int max_assoc_count
in the class then?
OK either way I guess.
Thanks,
Richard.
> 2021-11-30 Jakub Jelinek <jakub@redhat.com>
>
> PR rtl-optimization/102356
> * rtl.h (simplify_context): Add assoc_count member.
> * simplify-rtx.c (simplify_associative_operation): Don't reassociate
> more than 64 times within one outermost simplify_* call.
> * dwarf2out.c (mem_loc_descriptor): Optimize binary operation
> with both operands the same using DW_OP_dup.
>
> * gcc.dg/pr102356.c: New test.
>
> --- gcc/rtl.h.jj 2021-11-02 09:06:05.904396581 +0100
> +++ gcc/rtl.h 2021-11-30 14:55:39.701257736 +0100
> @@ -3433,6 +3433,10 @@ public:
> inside a MEM than outside. */
> unsigned int mem_depth = 0;
>
> + /* Tracks number of simplify_associative_operation calls performed during
> + outermost simplify* call. */
> + unsigned int assoc_count = 0;
> +
> private:
> rtx simplify_truncation (machine_mode, rtx, machine_mode);
> rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
> --- gcc/simplify-rtx.c.jj 2021-11-30 09:44:46.619606170 +0100
> +++ gcc/simplify-rtx.c 2021-11-30 14:59:00.251321577 +0100
> @@ -2263,6 +2263,16 @@ simplify_context::simplify_associative_o
> {
> rtx tem;
>
> + /* Normally expressions simplified by simplify-rtx.c are combined
> + at most from a few machine instructions and therefore the
> + expressions should be fairly small. During var-tracking
> + we can see arbitrarily large expressions though and reassociating
> + those can be quadratic, so punt after encountering 64
> + simplify_associative_operation calls during outermost simplify_*
> + call. */
> + if (++assoc_count >= 64)
> + return NULL_RTX;
> +
> /* Linearize the operator to the left. */
> if (GET_CODE (op1) == code)
> {
> --- gcc/dwarf2out.c.jj 2021-11-30 09:44:46.568606908 +0100
> +++ gcc/dwarf2out.c 2021-11-30 14:53:28.779174490 +0100
> @@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
> do_binop:
> op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
> VAR_INIT_STATUS_INITIALIZED);
> + if (XEXP (rtl, 0) == XEXP (rtl, 1))
> + {
> + if (op0 == 0)
> + break;
> + mem_loc_result = op0;
> + add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0));
> + add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0));
> + break;
> + }
> op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode,
> VAR_INIT_STATUS_INITIALIZED);
>
> --- gcc/testsuite/gcc.dg/pr102356.c.jj 2021-11-30 14:53:28.779174490 +0100
> +++ gcc/testsuite/gcc.dg/pr102356.c 2021-11-30 14:53:28.779174490 +0100
> @@ -0,0 +1,33 @@
> +/* PR rtl-optimization/102356 */
> +/* { dg-do compile { target int32plus } } */
> +/* { dg-options "-O3 -g" } */
> +
> +signed char a = 0;
> +unsigned char b = 9;
> +unsigned long long c = 0xF1FBFC17225F7A57ULL;
> +int d = 0x3A6667C6;
> +
> +unsigned char
> +foo (unsigned int x)
> +{
> + unsigned int *e = &x;
> + if ((c /= ((0 * (*e *= b)) <= 0)))
> + ;
> + for (d = 9; d > 2; d -= 2)
> + {
> + c = -2;
> + do
> + if ((*e *= *e))
> + {
> + a = 4;
> + do
> + {
> + a -= 3;
> + if ((*e *= *e))
> + b = 9;
> + }
> + while (a > 2);
> + }
> + while (c++);
> + }
> +}
>
>
> Jakub
>
>
@@ -3433,6 +3433,10 @@ public:
inside a MEM than outside. */
unsigned int mem_depth = 0;
+ /* Tracks number of simplify_associative_operation calls performed during
+ outermost simplify* call. */
+ unsigned int assoc_count = 0;
+
private:
rtx simplify_truncation (machine_mode, rtx, machine_mode);
rtx simplify_byte_swapping_operation (rtx_code, machine_mode, rtx, rtx);
@@ -2263,6 +2263,16 @@ simplify_context::simplify_associative_o
{
rtx tem;
+ /* Normally expressions simplified by simplify-rtx.c are combined
+ at most from a few machine instructions and therefore the
+ expressions should be fairly small. During var-tracking
+ we can see arbitrarily large expressions though and reassociating
+ those can be quadratic, so punt after encountering 64
+ simplify_associative_operation calls during outermost simplify_*
+ call. */
+ if (++assoc_count >= 64)
+ return NULL_RTX;
+
/* Linearize the operator to the left. */
if (GET_CODE (op1) == code)
{
@@ -16363,6 +16363,15 @@ mem_loc_descriptor (rtx rtl, machine_mod
do_binop:
op0 = mem_loc_descriptor (XEXP (rtl, 0), mode, mem_mode,
VAR_INIT_STATUS_INITIALIZED);
+ if (XEXP (rtl, 0) == XEXP (rtl, 1))
+ {
+ if (op0 == 0)
+ break;
+ mem_loc_result = op0;
+ add_loc_descr (&mem_loc_result, new_loc_descr (DW_OP_dup, 0, 0));
+ add_loc_descr (&mem_loc_result, new_loc_descr (op, 0, 0));
+ break;
+ }
op1 = mem_loc_descriptor (XEXP (rtl, 1), mode, mem_mode,
VAR_INIT_STATUS_INITIALIZED);
@@ -0,0 +1,33 @@
+/* PR rtl-optimization/102356 */
+/* { dg-do compile { target int32plus } } */
+/* { dg-options "-O3 -g" } */
+
+signed char a = 0;
+unsigned char b = 9;
+unsigned long long c = 0xF1FBFC17225F7A57ULL;
+int d = 0x3A6667C6;
+
+unsigned char
+foo (unsigned int x)
+{
+ unsigned int *e = &x;
+ if ((c /= ((0 * (*e *= b)) <= 0)))
+ ;
+ for (d = 9; d > 2; d -= 2)
+ {
+ c = -2;
+ do
+ if ((*e *= *e))
+ {
+ a = 4;
+ do
+ {
+ a -= 3;
+ if ((*e *= *e))
+ b = 9;
+ }
+ while (a > 2);
+ }
+ while (c++);
+ }
+}