From patchwork Fri Feb 4 13:17:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Biener X-Patchwork-Id: 50795 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id E4B823858C1F for ; Fri, 4 Feb 2022 13:17:36 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E4B823858C1F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1643980656; bh=nhmqdrbFKA5Je/78jW3cvp5Z4of1XhI+HYoydhMNlRM=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=unmAocwsxUtPp7ldTLcQnxzV5S2ox1RVJKYK4zsBuoOLzvYxTA2elXGiWa1puIs0t 5drWJeVUPy3jCw6cr6pUZtPc4qcxK6TKi076EmITCsEe4SPKbwGMIbDUc99yKpnueO aQGp4t5vLEwCjJem3jryNiJ0GIU769wlsjCrqSAg= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 845C53858D20 for ; Fri, 4 Feb 2022 13:17:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 845C53858D20 Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 5E579210F9; Fri, 4 Feb 2022 13:17:05 +0000 (UTC) Received: from imap2.suse-dmz.suse.de (imap2.suse-dmz.suse.de [192.168.254.74]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-521) server-digest SHA512) (No client certificate requested) by imap2.suse-dmz.suse.de (Postfix) with ESMTPS id 4432513AA9; Fri, 4 Feb 2022 13:17:05 +0000 (UTC) Received: from dovecot-director2.suse.de ([192.168.254.65]) by imap2.suse-dmz.suse.de with ESMTPSA id uGheD1En/WHzagAAMHmgww (envelope-from ); Fri, 04 Feb 2022 13:17:05 +0000 Date: Fri, 4 Feb 2022 14:17:04 +0100 (CET) To: gcc-patches@gcc.gnu.org Subject: [PATCH][v2] tree-optimization/100499 - niter analysis and multiple_of_p MIME-Version: 1.0 Message-Id: <20220204131705.4432513AA9@imap2.suse-dmz.suse.de> X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Biener via Gcc-patches From: Richard Biener Reply-To: Richard Biener Cc: richard.sandiford@arm.com Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" niter analysis uses multiple_of_p which currently assumes operations like MULT_EXPR do not wrap. We've got to rely on this for optimizing size expressions like those in DECL_SIZE and those generally use unsigned arithmetic with no indication that they are not expected to wrap. To preserve that the following adds a parameter to multiple_of_p, defaulted to true, indicating that the TOP expression is not expected to wrap for outer computations in TYPE. This mostly follows a patch proposed by Bin last year with the conversion behavior added. Applying to all users the new effect is that upon type conversions in the TOP expression the behavior will switch to honor TYPE_OVERFLOW_UNDEFINED for the converted sub-expressions. The patch also changes the occurance in niter analysis that we know is problematic and we have testcases for to pass false to multiple_of_p. The patch also contains a change to the PR72817 fix from Bin to avoid regressing gcc.dg/tree-ssa/loop-42.c. The intent for stage1 is to introduce a size_multiple_of_p and internalize the added parameter so all multiple_of_p users will honor TYPE_OVERFLOW_UNDEFINED and users dealing with size expressions need to be switched to size_multiple_of_p. Boostrapped and tested on x86_64-unknown-linux-gnu. OK? Thanks, Richard. 2022-01-26 Richard Biener PR tree-optimization/100499 * fold-const.h (multiple_of_p): Add nowrap parameter, defaulted to true. * fold-const.cc (multiple_of_p): Likewise. Honor it for MULT_EXPR, PLUS_EXPR and MINUS_EXPR and pass it along, switching to false for conversions. * tree-ssa-loop-niter.cc (number_of_iterations_ne): Do not claim the outermost expression does not wrap when calling multiple_of_p. Refactor the check done to check the original IV, avoiding a bias that might wrap. * gcc.dg/torture/pr100499-1.c: New testcase. * gcc.dg/torture/pr100499-2.c: Likewise. * gcc.dg/torture/pr100499-3.c: Likewise. Co-authored-by: Bin Cheng --- gcc/fold-const.cc | 80 +++++++++++++++-------- gcc/fold-const.h | 2 +- gcc/testsuite/gcc.dg/torture/pr100499-1.c | 27 ++++++++ gcc/testsuite/gcc.dg/torture/pr100499-2.c | 16 +++++ gcc/testsuite/gcc.dg/torture/pr100499-3.c | 14 ++++ gcc/tree-ssa-loop-niter.cc | 52 ++++++--------- 6 files changed, 130 insertions(+), 61 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-1.c create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-2.c create mode 100644 gcc/testsuite/gcc.dg/torture/pr100499-3.c diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 12732d39c79..2578a86ca1a 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -14073,10 +14073,16 @@ fold_binary_initializer_loc (location_t loc, tree_code code, tree type, SAVE_EXPR (I) * SAVE_EXPR (J) (where the same SAVE_EXPR (J) is used in the original and the - transformed version). */ + transformed version). + + NOWRAP specifies whether all outer operations in TYPE should + be considered not wrapping. Any type conversion within TOP acts + as a barrier and we will fall back to NOWRAP being false. + NOWRAP is mostly used to treat expressions in TYPE_SIZE and friends + as not wrapping even though they are generally using unsigned arithmetic. */ int -multiple_of_p (tree type, const_tree top, const_tree bottom) +multiple_of_p (tree type, const_tree top, const_tree bottom, bool nowrap) { gimple *stmt; tree op1, op2; @@ -14094,10 +14100,17 @@ multiple_of_p (tree type, const_tree top, const_tree bottom) a multiple of BOTTOM then TOP is a multiple of BOTTOM. */ if (!integer_pow2p (bottom)) return 0; - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom) - || multiple_of_p (type, TREE_OPERAND (top, 0), bottom)); + return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap) + || multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap)); case MULT_EXPR: + /* If the multiplication can wrap we cannot recurse further unless + the bottom is a power of two which is where wrapping does not + matter. */ + if (!nowrap + && !TYPE_OVERFLOW_UNDEFINED (type) + && !integer_pow2p (bottom)) + return 0; if (TREE_CODE (bottom) == INTEGER_CST) { op1 = TREE_OPERAND (top, 0); @@ -14106,24 +14119,24 @@ multiple_of_p (tree type, const_tree top, const_tree bottom) std::swap (op1, op2); if (TREE_CODE (op2) == INTEGER_CST) { - if (multiple_of_p (type, op2, bottom)) + if (multiple_of_p (type, op2, bottom, nowrap)) return 1; /* Handle multiple_of_p ((x * 2 + 2) * 4, 8). */ - if (multiple_of_p (type, bottom, op2)) + if (multiple_of_p (type, bottom, op2, nowrap)) { widest_int w = wi::sdiv_trunc (wi::to_widest (bottom), wi::to_widest (op2)); if (wi::fits_to_tree_p (w, TREE_TYPE (bottom))) { op2 = wide_int_to_tree (TREE_TYPE (bottom), w); - return multiple_of_p (type, op1, op2); + return multiple_of_p (type, op1, op2, nowrap); } } - return multiple_of_p (type, op1, bottom); + return multiple_of_p (type, op1, bottom, nowrap); } } - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom) - || multiple_of_p (type, TREE_OPERAND (top, 0), bottom)); + return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap) + || multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap)); case LSHIFT_EXPR: /* Handle X << CST as X * (1 << CST) and only process the constant. */ @@ -14135,29 +14148,38 @@ multiple_of_p (tree type, const_tree top, const_tree bottom) wide_int mul_op = wi::one (TYPE_PRECISION (type)) << wi::to_wide (op1); return multiple_of_p (type, - wide_int_to_tree (type, mul_op), bottom); + wide_int_to_tree (type, mul_op), bottom, + nowrap); } } return 0; case MINUS_EXPR: - /* It is impossible to prove if op0 - op1 is multiple of bottom - precisely, so be conservative here checking if both op0 and op1 - are multiple of bottom. Note we check the second operand first - since it's usually simpler. */ - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom) - && multiple_of_p (type, TREE_OPERAND (top, 0), bottom)); - case PLUS_EXPR: - /* The same as MINUS_EXPR, but handle cases like op0 + 0xfffffffd - as op0 - 3 if the expression has unsigned type. For example, - (X / 3) + 0xfffffffd is multiple of 3, but 0xfffffffd is not. */ + /* If the addition or subtraction can wrap we cannot recurse further + unless bottom is a power of two which is where wrapping does not + matter. */ + if (!nowrap + && !TYPE_OVERFLOW_UNDEFINED (type) + && !integer_pow2p (bottom)) + return 0; + + /* Handle cases like op0 + 0xfffffffd as op0 - 3 if the expression has + unsigned type. For example, (X / 3) + 0xfffffffd is multiple of 3, + but 0xfffffffd is not. */ op1 = TREE_OPERAND (top, 1); - if (TYPE_UNSIGNED (type) + if (TREE_CODE (top) == PLUS_EXPR + && nowrap + && TYPE_UNSIGNED (type) && TREE_CODE (op1) == INTEGER_CST && tree_int_cst_sign_bit (op1)) op1 = fold_build1 (NEGATE_EXPR, type, op1); - return (multiple_of_p (type, op1, bottom) - && multiple_of_p (type, TREE_OPERAND (top, 0), bottom)); + + /* It is impossible to prove if op0 +- op1 is multiple of bottom + precisely, so be conservative here checking if both op0 and op1 + are multiple of bottom. Note we check the second operand first + since it's usually simpler. */ + return (multiple_of_p (type, op1, bottom, nowrap) + && multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap)); CASE_CONVERT: /* Can't handle conversions from non-integral or wider integral type. */ @@ -14165,15 +14187,17 @@ multiple_of_p (tree type, const_tree top, const_tree bottom) || (TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (top, 0))))) return 0; + /* NOWRAP only extends to operations in the outermost type so + make sure to strip it off here. */ return multiple_of_p (TREE_TYPE (TREE_OPERAND (top, 0)), - TREE_OPERAND (top, 0), bottom); + TREE_OPERAND (top, 0), bottom, false); case SAVE_EXPR: - return multiple_of_p (type, TREE_OPERAND (top, 0), bottom); + return multiple_of_p (type, TREE_OPERAND (top, 0), bottom, nowrap); case COND_EXPR: - return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom) - && multiple_of_p (type, TREE_OPERAND (top, 2), bottom)); + return (multiple_of_p (type, TREE_OPERAND (top, 1), bottom, nowrap) + && multiple_of_p (type, TREE_OPERAND (top, 2), bottom, nowrap)); case INTEGER_CST: if (TREE_CODE (bottom) != INTEGER_CST diff --git a/gcc/fold-const.h b/gcc/fold-const.h index 394a67ece79..d36bfc813cf 100644 --- a/gcc/fold-const.h +++ b/gcc/fold-const.h @@ -96,7 +96,7 @@ extern void fold_overflow_warning (const char*, enum warn_strict_overflow_code); extern enum tree_code fold_div_compare (enum tree_code, tree, tree, tree *, tree *, bool *); extern bool operand_equal_p (const_tree, const_tree, unsigned int flags = 0); -extern int multiple_of_p (tree, const_tree, const_tree); +extern int multiple_of_p (tree, const_tree, const_tree, bool = true); #define omit_one_operand(T1,T2,T3)\ omit_one_operand_loc (UNKNOWN_LOCATION, T1, T2, T3) extern tree omit_one_operand_loc (location_t, tree, tree, tree); diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-1.c b/gcc/testsuite/gcc.dg/torture/pr100499-1.c new file mode 100644 index 00000000000..9511c323505 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr100499-1.c @@ -0,0 +1,27 @@ +/* { dg-do run } */ + +typedef __UINT16_TYPE__ uint16_t; +typedef __INT32_TYPE__ int32_t; +static uint16_t g_2823 = 0xEC75L; +static uint16_t g_116 = 0xBC07L; + +static uint16_t +safe_mul_func_uint16_t_u_u(uint16_t ui1, uint16_t ui2) +{ + return ((unsigned int)ui1) * ((unsigned int)ui2); +} + +int main () +{ + uint16_t l_2815 = 0xffff; + uint16_t *l_2821 = &g_116; + uint16_t *l_2822 = &g_2823; + +lbl_2826: + l_2815 &= 0x1eae; + if (safe_mul_func_uint16_t_u_u(((*l_2821) = l_2815), (--(*l_2822)))) + goto lbl_2826; + if (g_2823 != 32768) + __builtin_abort (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-2.c b/gcc/testsuite/gcc.dg/torture/pr100499-2.c new file mode 100644 index 00000000000..999f931806a --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr100499-2.c @@ -0,0 +1,16 @@ +/* { dg-do run } */ + +unsigned char ag = 55; +unsigned i; +int main() +{ + unsigned char c; + unsigned char a = ag; +d: + c = a-- * 52; + if (c) + goto d; + if (a != 255) + __builtin_abort (); + return 0; +} diff --git a/gcc/testsuite/gcc.dg/torture/pr100499-3.c b/gcc/testsuite/gcc.dg/torture/pr100499-3.c new file mode 100644 index 00000000000..01be1e50690 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/pr100499-3.c @@ -0,0 +1,14 @@ +/* { dg-do run } */ + +int a = 0; +unsigned char b = 0; + +int main() { + a - 6; + for (; a >= -13; a = a - 8) + while((unsigned char)(b-- * 6)) + ; + if (b != 127) + __builtin_abort(); + return 0; +} diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc index d33095b8e03..318d10c8fac 100644 --- a/gcc/tree-ssa-loop-niter.cc +++ b/gcc/tree-ssa-loop-niter.cc @@ -1042,17 +1042,21 @@ number_of_iterations_ne (class loop *loop, tree type, affine_iv *iv, new_base = base - step > FINAL ; step < 0 && base - step doesn't overflow - 2') |FINAL - new_base| is an exact multiple of step. - - Please refer to PR34114 as an example of loop-ch's impact, also refer - to PR72817 as an example why condition 2') is necessary. + Please refer to PR34114 as an example of loop-ch's impact. Note, for NE_EXPR, base equals to FINAL is a special case, in - which the loop exits immediately, and the iv does not overflow. */ + which the loop exits immediately, and the iv does not overflow. + + Also note, we prove condition 2) by checking base and final seperately + along with condition 1) or 1'). */ if (!niter->control.no_overflow - && (integer_onep (s) || multiple_of_p (type, c, s))) + && (integer_onep (s) + || (multiple_of_p (type, fold_convert (niter_type, iv->base), s, + false) + && multiple_of_p (type, fold_convert (niter_type, final), s, + false)))) { - tree t, cond, new_c, relaxed_cond = boolean_false_node; + tree t, cond, relaxed_cond = boolean_false_node; if (tree_int_cst_sign_bit (iv->step)) { @@ -1066,12 +1070,8 @@ number_of_iterations_ne (class loop *loop, tree type, affine_iv *iv, if (integer_nonzerop (t)) { t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step); - new_c = fold_build2 (MINUS_EXPR, niter_type, - fold_convert (niter_type, t), - fold_convert (niter_type, final)); - if (multiple_of_p (type, new_c, s)) - relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node, - t, final); + relaxed_cond = fold_build2 (GT_EXPR, boolean_type_node, t, + final); } } } @@ -1087,12 +1087,8 @@ number_of_iterations_ne (class loop *loop, tree type, affine_iv *iv, if (integer_nonzerop (t)) { t = fold_build2 (MINUS_EXPR, type, iv->base, iv->step); - new_c = fold_build2 (MINUS_EXPR, niter_type, - fold_convert (niter_type, final), - fold_convert (niter_type, t)); - if (multiple_of_p (type, new_c, s)) - relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node, - t, final); + relaxed_cond = fold_build2 (LT_EXPR, boolean_type_node, t, + final); } } } @@ -1102,19 +1098,11 @@ number_of_iterations_ne (class loop *loop, tree type, affine_iv *iv, t = simplify_using_initial_conditions (loop, relaxed_cond); if (t && integer_onep (t)) - niter->control.no_overflow = true; - } - - /* First the trivial cases -- when the step is 1. */ - if (integer_onep (s)) - { - niter->niter = c; - return true; - } - if (niter->control.no_overflow && multiple_of_p (type, c, s)) - { - niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, c, s); - return true; + { + niter->control.no_overflow = true; + niter->niter = fold_build2 (EXACT_DIV_EXPR, niter_type, c, s); + return true; + } } /* Let nsd (step, size of mode) = d. If d does not divide c, the loop