Message ID | 20211027063448.1844771-2-luoxhu@linux.ibm.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D98D33857C5B for <patchwork@sourceware.org>; Wed, 27 Oct 2021 06:37:34 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D98D33857C5B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1635316654; bh=OJp4e1Rgr0DL/flbhf6pZ9Q2COF9UXZmDUDLdDT/l7o=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=JFVoucs2771kXzHDy6hn602hKJtp1dRf8917k1X4BhJ9P1xTiXe81C1If14N3fdy5 TAZtkLJkhg1PsUTZZg0QoeHPdPlnC1kF0TNpXFq84+TjAn30ZdiGerTNoPEx21NiWj w2kBshN5xUtgo3SZABJIm4M6eiGQvLhwRbpasnXU= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id A84FF3858429; Wed, 27 Oct 2021 06:36:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org A84FF3858429 Received: from pps.filterd (m0098410.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19R5rn6x032587; Wed, 27 Oct 2021 06:36:00 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 3by1148twu-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 27 Oct 2021 06:36:00 +0000 Received: from m0098410.ppops.net (m0098410.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 19R6IcR4032747; Wed, 27 Oct 2021 06:36:00 GMT Received: from ppma01fra.de.ibm.com (46.49.7a9f.ip4.static.sl-reverse.com [159.122.73.70]) by mx0a-001b2d01.pphosted.com with ESMTP id 3by1148tvx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 27 Oct 2021 06:35:59 +0000 Received: from pps.filterd (ppma01fra.de.ibm.com [127.0.0.1]) by ppma01fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19R6R18V008805; Wed, 27 Oct 2021 06:35:57 GMT Received: from b06cxnps3074.portsmouth.uk.ibm.com (d06relay09.portsmouth.uk.ibm.com [9.149.109.194]) by ppma01fra.de.ibm.com with ESMTP id 3bx4f2b59k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 27 Oct 2021 06:35:57 +0000 Received: from d06av22.portsmouth.uk.ibm.com (d06av22.portsmouth.uk.ibm.com [9.149.105.58]) by b06cxnps3074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19R6ZsDm59441624 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 27 Oct 2021 06:35:54 GMT Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 517EC4C044; Wed, 27 Oct 2021 06:35:54 +0000 (GMT) Received: from d06av22.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E905D4C040; Wed, 27 Oct 2021 06:35:52 +0000 (GMT) Received: from marlin.aus.stglabs.ibm.com (unknown [9.40.194.84]) by d06av22.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 27 Oct 2021 06:35:52 +0000 (GMT) To: gcc-patches@gcc.gnu.org Subject: [PATCH v2 1/4] Fix loop split incorrect count and probability Date: Wed, 27 Oct 2021 01:34:45 -0500 Message-Id: <20211027063448.1844771-2-luoxhu@linux.ibm.com> X-Mailer: git-send-email 2.27.0.90.geebb51ba8c In-Reply-To: <20211027063448.1844771-1-luoxhu@linux.ibm.com> References: <20211027063448.1844771-1-luoxhu@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: fyWVvkXEUnFrELApfzJqkV0pf4y05CpM X-Proofpoint-ORIG-GUID: XVYdw00hd7-9-1UE7wwFseyChCOvhG8I X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-27_02,2021-10-26_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 clxscore=1015 malwarescore=0 suspectscore=0 impostorscore=0 adultscore=0 lowpriorityscore=0 spamscore=0 mlxscore=0 bulkscore=0 priorityscore=1501 mlxlogscore=999 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2110150000 definitions=main-2110270038 X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Xionghu Luo via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Xionghu Luo <luoxhu@linux.ibm.com> Cc: rguenther@suse.de, segher@kernel.crashing.org, Xionghu Luo <luoxhu@linux.ibm.com>, hubicka@kam.mff.cuni.cz, wschmidt@linux.ibm.com, linkw@gcc.gnu.org, dje.gcc@gmail.com Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
loop split fix and functions renaming
|
|
Commit Message
Xionghu Luo
Oct. 27, 2021, 6:34 a.m. UTC
loop split condition is moved between loop1 and loop2, the split bb's count and probability should also be duplicated instead of (100% vs INV), secondly, the original loop1 and loop2 count need be propotional from the original loop. Regression tested pass, OK for master?
Comments
> > gcc/ChangeLog: > > * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > (do_split_loop_on_cond): Likewise. > --- > gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++--------- > 1 file changed, 16 insertions(+), 9 deletions(-) > > diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c > index 3f6ad046623..d30782888f3 100644 > --- a/gcc/tree-ssa-loop-split.c > +++ b/gcc/tree-ssa-loop-split.c > @@ -575,7 +575,11 @@ split_loop (class loop *loop1) > stmts2); > tree cond = build2 (guard_code, boolean_type_node, guard_init, border); > if (!initial_true) > - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + > + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE > + ? EDGE_SUCC (bbs[i], 0) > + : EDGE_SUCC (bbs[i], 1); > > /* Now version the loop, placing loop2 after loop1 connecting > them, and fix up SSA form for that. */ > @@ -583,10 +587,10 @@ split_loop (class loop *loop1) > basic_block cond_bb; > > class loop *loop2 = loop_version (loop1, cond, &cond_bb, > - profile_probability::always (), > - profile_probability::always (), > - profile_probability::always (), > - profile_probability::always (), > + true_edge->probability, > + true_edge->probability.invert (), > + true_edge->probability, > + true_edge->probability.invert (), > true); As discussed yesterday, for loop of form for (...) if (cond) cond = something(); else something2 Split as loop1: for (...) if (true) cond = something(); if (!cond) break else something2 (); loop2: for (...) if (false) cond = something(); else something2 (); If "if (cond)" has probability p, you want to scale loop1 by p and loop2 by 1-p as your patch does, but you need to exclude the basic blocks guarded by the condition. One way is to break out loop_version and implement it inline, other option (perhaps leading to less code duplication) is to add argument listing basic blocks that should not be scaled, which would be set to both arms of the if. Are there other profile patches of your I should look at? Honza > gcc_assert (loop2); > > @@ -1486,10 +1490,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > initialize_original_copy_tables (); > > struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, > - profile_probability::always (), > - profile_probability::never (), > - profile_probability::always (), > - profile_probability::always (), > + invar_branch->probability.invert (), > + invar_branch->probability, > + invar_branch->probability.invert (), > + invar_branch->probability, > true); > if (!loop2) > { > @@ -1530,6 +1534,9 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE; > to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE; > > + to_loop1->probability = invar_branch->probability.invert (); > + to_loop2->probability = invar_branch->probability; > + > /* Due to introduction of a control flow edge from loop1 latch to loop2 > pre-header, we should update PHIs in loop2 to reflect this connection > between loop1 and loop2. */ > -- > 2.27.0.90.geebb51ba8c >
> As discussed yesterday, for loop of form > > for (...) > if (cond) > cond = something(); > else > something2 > > Split as > Say "if (cond)" has probability p, then individual statements scale as follows: loop1: p for (...) p if (true) 1 cond = something(); 1 if (!cond) x break 0 else 0 something2 (); loop2: 1-p for (...) 1-p if (false) 0 cond = something(); 1 else 1 something2 (); Block with x has same count as preheader of the loop. Your patch does loop1: p for (...) p if (true) p cond = something(); p if (!cond) x break p else p something2 (); loop2: 1-p for (...) 1-p if (false) 1-p cond = something(); 1-p else 1-p something2 (); One does not need set 0 correctly (blocks will be removed), but it would be nice to avoid dropping 1s down. Looking at this, things will go well with your change if we are guaranteed that something() and something2() is always 1 bb becuase block merging happening after turning condiitonal to constant will remove the misupdated count. Your profile scales after merging is: loop1: p for (...) 1 cond = something(); 1 if (!cond) x break loop2: 1-p for (...) 1 something2 (); assuming that profile was sane and frequency of something() is p*frequency of the conditional and similarly for something2(). This is why you see final profile correct. So if we split only loops with one BB then/else arms, the patch is OK with comment explaining this. Also I wonder, do we correctly duplicate&update known bounds on iteration counts attached to the loop struccture? Honza > > If "if (cond)" has probability p, you want to scale loop1 by p > and loop2 by 1-p as your patch does, but you need to exclude the basic > blocks guarded by the condition. > > One way is to break out loop_version and implement it inline, other > option (perhaps leading to less code duplication) is to add argument listing > basic blocks that should not be scaled, which would be set to both arms > of the if. > > Are there other profile patches of your I should look at? > Honza > > gcc_assert (loop2); > > > > @@ -1486,10 +1490,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > > initialize_original_copy_tables (); > > > > struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, > > - profile_probability::always (), > > - profile_probability::never (), > > - profile_probability::always (), > > - profile_probability::always (), > > + invar_branch->probability.invert (), > > + invar_branch->probability, > > + invar_branch->probability.invert (), > > + invar_branch->probability, > > true); > > if (!loop2) > > { > > @@ -1530,6 +1534,9 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > > to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE; > > to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE; > > > > + to_loop1->probability = invar_branch->probability.invert (); > > + to_loop2->probability = invar_branch->probability; > > + > > /* Due to introduction of a control flow edge from loop1 latch to loop2 > > pre-header, we should update PHIs in loop2 to reflect this connection > > between loop1 and loop2. */ > > -- > > 2.27.0.90.geebb51ba8c > >
On Wed, 27 Oct 2021, Jan Hubicka wrote: > > > > gcc/ChangeLog: > > > > * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > > (do_split_loop_on_cond): Likewise. > > --- > > gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++--------- > > 1 file changed, 16 insertions(+), 9 deletions(-) > > > > diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c > > index 3f6ad046623..d30782888f3 100644 > > --- a/gcc/tree-ssa-loop-split.c > > +++ b/gcc/tree-ssa-loop-split.c > > @@ -575,7 +575,11 @@ split_loop (class loop *loop1) > > stmts2); > > tree cond = build2 (guard_code, boolean_type_node, guard_init, border); > > if (!initial_true) > > - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > > + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > > + > > + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE > > + ? EDGE_SUCC (bbs[i], 0) > > + : EDGE_SUCC (bbs[i], 1); > > > > /* Now version the loop, placing loop2 after loop1 connecting > > them, and fix up SSA form for that. */ > > @@ -583,10 +587,10 @@ split_loop (class loop *loop1) > > basic_block cond_bb; > > > > class loop *loop2 = loop_version (loop1, cond, &cond_bb, > > - profile_probability::always (), > > - profile_probability::always (), > > - profile_probability::always (), > > - profile_probability::always (), > > + true_edge->probability, > > + true_edge->probability.invert (), > > + true_edge->probability, > > + true_edge->probability.invert (), > > true); > > As discussed yesterday, for loop of form > > for (...) > if (cond) > cond = something(); > else > something2 > > Split as Note that you are missing to conditionalize loop1 execution on 'cond' (not sure if that makes a difference). > loop1: if (cond) > for (...) > if (true) > cond = something(); > if (!cond) > break > else > something2 (); > > loop2: > for (...) > if (false) > cond = something(); > else > something2 (); > > If "if (cond)" has probability p, you want to scale loop1 by p > and loop2 by 1-p as your patch does, but you need to exclude the basic > blocks guarded by the condition. > > One way is to break out loop_version and implement it inline, other > option (perhaps leading to less code duplication) is to add argument listing > basic blocks that should not be scaled, which would be set to both arms > of the if. > > Are there other profile patches of your I should look at? > Honza > > gcc_assert (loop2); > > > > @@ -1486,10 +1490,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > > initialize_original_copy_tables (); > > > > struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, > > - profile_probability::always (), > > - profile_probability::never (), > > - profile_probability::always (), > > - profile_probability::always (), > > + invar_branch->probability.invert (), > > + invar_branch->probability, > > + invar_branch->probability.invert (), > > + invar_branch->probability, > > true); > > if (!loop2) > > { > > @@ -1530,6 +1534,9 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > > to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE; > > to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE; > > > > + to_loop1->probability = invar_branch->probability.invert (); > > + to_loop2->probability = invar_branch->probability; > > + > > /* Due to introduction of a control flow edge from loop1 latch to loop2 > > pre-header, we should update PHIs in loop2 to reflect this connection > > between loop1 and loop2. */ > > -- > > 2.27.0.90.geebb51ba8c > > >
> On Wed, 27 Oct 2021, Jan Hubicka wrote: > > > > > > > gcc/ChangeLog: > > > > > > * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. > > > (do_split_loop_on_cond): Likewise. > > > --- > > > gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++--------- > > > 1 file changed, 16 insertions(+), 9 deletions(-) > > > > > > diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c > > > index 3f6ad046623..d30782888f3 100644 > > > --- a/gcc/tree-ssa-loop-split.c > > > +++ b/gcc/tree-ssa-loop-split.c > > > @@ -575,7 +575,11 @@ split_loop (class loop *loop1) > > > stmts2); > > > tree cond = build2 (guard_code, boolean_type_node, guard_init, border); > > > if (!initial_true) > > > - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > > > + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > > > + > > > + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE > > > + ? EDGE_SUCC (bbs[i], 0) > > > + : EDGE_SUCC (bbs[i], 1); > > > > > > /* Now version the loop, placing loop2 after loop1 connecting > > > them, and fix up SSA form for that. */ > > > @@ -583,10 +587,10 @@ split_loop (class loop *loop1) > > > basic_block cond_bb; > > > > > > class loop *loop2 = loop_version (loop1, cond, &cond_bb, > > > - profile_probability::always (), > > > - profile_probability::always (), > > > - profile_probability::always (), > > > - profile_probability::always (), > > > + true_edge->probability, > > > + true_edge->probability.invert (), > > > + true_edge->probability, > > > + true_edge->probability.invert (), > > > true); > > > > As discussed yesterday, for loop of form > > > > for (...) > > if (cond) > > cond = something(); > > else > > something2 > > > > Split as > > Note that you are missing to conditionalize loop1 execution > on 'cond' (not sure if that makes a difference). You are right - forgot to mention that. Entry conditional makes no difference on scaling stmts inside loop but affects its header and expected trip count. We however need to set up probability of this conditional (and preheader count if it exists) There is no general way to read the probability of this initial conditional from cfg profile. So I guess we are stuck with guessing some arbitrary value. I guess common case is that cond is true first iteration tough and often we can easily see that fromo PHI node initializing the test variable. Other thing that changes is expected number of iterations of the split loops, so we may want to update the exit conditinal probability accordingly... Honza > > > loop1: > if (cond) > > for (...) > > if (true) > > cond = something(); > > if (!cond) > > break > > else > > something2 (); > > > > loop2: > > for (...) > > if (false) > > cond = something(); > > else > > something2 (); > > > > If "if (cond)" has probability p, you want to scale loop1 by p > > and loop2 by 1-p as your patch does, but you need to exclude the basic > > blocks guarded by the condition. > > > > One way is to break out loop_version and implement it inline, other > > option (perhaps leading to less code duplication) is to add argument listing > > basic blocks that should not be scaled, which would be set to both arms > > of the if. > > > > Are there other profile patches of your I should look at? > > Honza > > > gcc_assert (loop2); > > > > > > @@ -1486,10 +1490,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > > > initialize_original_copy_tables (); > > > > > > struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, > > > - profile_probability::always (), > > > - profile_probability::never (), > > > - profile_probability::always (), > > > - profile_probability::always (), > > > + invar_branch->probability.invert (), > > > + invar_branch->probability, > > > + invar_branch->probability.invert (), > > > + invar_branch->probability, > > > true); > > > if (!loop2) > > > { > > > @@ -1530,6 +1534,9 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > > > to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE; > > > to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE; > > > > > > + to_loop1->probability = invar_branch->probability.invert (); > > > + to_loop2->probability = invar_branch->probability; > > > + > > > /* Due to introduction of a control flow edge from loop1 latch to loop2 > > > pre-header, we should update PHIs in loop2 to reflect this connection > > > between loop1 and loop2. */ > > > -- > > > 2.27.0.90.geebb51ba8c > > > > > > > -- > Richard Biener <rguenther@suse.de> > SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, > Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
On 2021/10/27 15:44, Jan Hubicka wrote: >> On Wed, 27 Oct 2021, Jan Hubicka wrote: >> >>>> >>>> gcc/ChangeLog: >>>> >>>> * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. >>>> (do_split_loop_on_cond): Likewise. >>>> --- >>>> gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++--------- >>>> 1 file changed, 16 insertions(+), 9 deletions(-) >>>> >>>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c >>>> index 3f6ad046623..d30782888f3 100644 >>>> --- a/gcc/tree-ssa-loop-split.c >>>> +++ b/gcc/tree-ssa-loop-split.c >>>> @@ -575,7 +575,11 @@ split_loop (class loop *loop1) >>>> stmts2); >>>> tree cond = build2 (guard_code, boolean_type_node, guard_init, border); >>>> if (!initial_true) >>>> - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); >>>> + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); >>>> + >>>> + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE >>>> + ? EDGE_SUCC (bbs[i], 0) >>>> + : EDGE_SUCC (bbs[i], 1); >>>> >>>> /* Now version the loop, placing loop2 after loop1 connecting >>>> them, and fix up SSA form for that. */ >>>> @@ -583,10 +587,10 @@ split_loop (class loop *loop1) >>>> basic_block cond_bb; >>>> >>>> class loop *loop2 = loop_version (loop1, cond, &cond_bb, >>>> - profile_probability::always (), >>>> - profile_probability::always (), >>>> - profile_probability::always (), >>>> - profile_probability::always (), >>>> + true_edge->probability, >>>> + true_edge->probability.invert (), >>>> + true_edge->probability, >>>> + true_edge->probability.invert (), >>>> true); >>> >>> As discussed yesterday, for loop of form >>> >>> for (...) >>> if (cond) >>> cond = something(); >>> else >>> something2 >>> >>> Split as >> >> Note that you are missing to conditionalize loop1 execution >> on 'cond' (not sure if that makes a difference). > You are right - forgot to mention that. > > Entry conditional makes no difference on scaling stmts inside loop but > affects its header and expected trip count. We however need to set up > probability of this conditional (and preheader count if it exists) > There is no general way to read the probability of this initial > conditional from cfg profile. So I guess we are stuck with guessing > some arbitrary value. I guess common case is that cond is true first > iteration tough and often we can easily see that fromo PHI node > initializing the test variable. > > Other thing that changes is expected number of iterations of the split > loops, so we may want to update the exit conditinal probability > accordingly... > Sorry for the late reply. The below updated patch mainly solves the issues you pointed out: - profile count proportion for both original loop and copied loop without dropping down the true branch's count; - probability update in the two loops and between the two loops; - number of iterations update/check for split_loop. [PATCH v3] Fix loop split incorrect count and probability In tree-ssa-loop-split.c, split_loop and split_loop_on_cond does two kind of split. split_loop only works for single loop and insert edge at exit when split, while split_loop_on_cond is not limited to single loop and insert edge at latch when split. Both split behavior should consider loop count and probability update. For split_loop, loop split condition is moved in front of loop1 and loop2; But split_loop_on_cond moves the condition between loop1 and loop2, this patch does: 1) profile count proportion for both original loop and copied loop without dropping down the true branch's count; 2) probability update in and between the two loops; 3) number of iterations update for split_loop. Regression tested pass, OK for master? Changes diff for split_loop and split_loop_on_cond cases: 1) diff base/loop-split.c.151t.lsplit patched/loop-split.c.152t.lsplit ... <bb 2> [local count: 118111600]: if (beg_5(D) < end_8(D)) goto <bb 14>; [89.00%] else goto <bb 6>; [11.00%] <bb 14> [local count: 105119324]: if (beg2_6(D) < c_9(D)) - goto <bb 15>; [100.00%] + goto <bb 15>; [33.00%] else - goto <bb 16>; [100.00%] + goto <bb 16>; [67.00%] - <bb 15> [local count: 105119324]: + <bb 15> [local count: 34689377]: _25 = beg_5(D) + 1; _26 = end_8(D) - beg_5(D); _27 = beg2_6(D) + _26; _28 = MIN_EXPR <c_9(D), _27>; - <bb 3> [local count: 955630225]: + <bb 3> [local count: 315357973]: # i_16 = PHI <i_11(8), beg_5(D)(15)> # j_17 = PHI <j_12(8), beg2_6(D)(15)> printf ("a: %d %d\n", i_16, j_17); i_11 = i_16 + 1; j_12 = j_17 + 1; if (j_12 < _28) - goto <bb 8>; [89.00%] + goto <bb 8>; [29.37%] else - goto <bb 17>; [11.00%] + goto <bb 17>; [70.63%] - <bb 8> [local count: 850510901]: + <bb 8> [local count: 280668596]: goto <bb 3>; [100.00%] - <bb 16> [local count: 105119324]: + <bb 16> [local count: 70429947]: # i_22 = PHI <beg_5(D)(14), i_29(17)> # j_23 = PHI <beg2_6(D)(14), j_30(17)> <bb 10> [local count: 955630225]: # i_2 = PHI <i_22(16), i_20(13)> # j_1 = PHI <j_23(16), j_21(13)> i_20 = i_2 + 1; j_21 = j_1 + 1; if (end_8(D) > i_20) - goto <bb 13>; [89.00%] + goto <bb 13>; [59.63%] else - goto <bb 9>; [11.00%] + goto <bb 9>; [40.37%] - <bb 13> [local count: 850510901]: + <bb 13> [local count: 569842305]: goto <bb 10>; [100.00%] <bb 17> [local count: 105119324]: # i_29 = PHI <i_11(3)> # j_30 = PHI <j_12(3)> if (end_8(D) > i_29) goto <bb 16>; [80.00%] else goto <bb 9>; [20.00%] <bb 9> [local count: 105119324]: <bb 6> [local count: 118111600]: return 0; } 2) diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit: ... <bb 2> [local count: 118111600]: if (n_7(D) > 0) goto <bb 4>; [89.00%] else goto <bb 3>; [11.00%] <bb 3> [local count: 118111600]: return; <bb 4> [local count: 105119324]: pretmp_3 = ga; - <bb 5> [local count: 955630225]: + <bb 5> [local count: 315357973]: # i_13 = PHI <i_10(20), 0(4)> # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)> if (prephitmp_12 != 0) goto <bb 6>; [33.00%] else goto <bb 7>; [67.00%] <bb 6> [local count: 315357972]: _2 = do_something (); ga = _2; - <bb 7> [local count: 955630225]: + <bb 7> [local count: 315357973]: # prephitmp_5 = PHI <prephitmp_12(5), _2(6)> i_10 = inc (i_13); if (n_7(D) > i_10) goto <bb 21>; [89.00%] else goto <bb 11>; [11.00%] <bb 11> [local count: 105119324]: goto <bb 3>; [100.00%] - <bb 21> [local count: 850510901]: + <bb 21> [local count: 280668596]: if (prephitmp_12 != 0) - goto <bb 20>; [100.00%] + goto <bb 20>; [33.00%] else - goto <bb 19>; [INV] + goto <bb 19>; [67.00%] - <bb 20> [local count: 850510901]: + <bb 20> [local count: 280668596]: goto <bb 5>; [100.00%] - <bb 19> [count: 0]: + <bb 19> [local count: 70429947]: # i_23 = PHI <i_10(21)> # prephitmp_25 = PHI <prephitmp_5(21)> - <bb 12> [local count: 955630225]: + <bb 12> [local count: 640272252]: # i_15 = PHI <i_23(19), i_22(16)> # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)> i_22 = inc (i_15); if (n_7(D) > i_22) - goto <bb 16>; [89.00%] + goto <bb 16>; [59.63%] else - goto <bb 11>; [11.00%] + goto <bb 11>; [40.37%] - <bb 16> [local count: 850510901]: + <bb 16> [local count: 569842305]: goto <bb 12>; [100.00%] } gcc/ChangeLog: * tree-ssa-loop-split.c (split_loop): Fix incorrect profile_count and probability. (do_split_loop_on_cond): Likewise. --- gcc/tree-ssa-loop-split.c | 110 +++++++++++++++++++++++++++++++++++--- 1 file changed, 102 insertions(+), 8 deletions(-) diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c index 3f6ad046623..102766241fb 100644 --- a/gcc/tree-ssa-loop-split.c +++ b/gcc/tree-ssa-loop-split.c @@ -575,7 +575,10 @@ split_loop (class loop *loop1) stmts2); tree cond = build2 (guard_code, boolean_type_node, guard_init, border); if (!initial_true) - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); + + edge true_edge, false_edge; + extract_true_false_edges_from_block (bbs[i], &true_edge, &false_edge); /* Now version the loop, placing loop2 after loop1 connecting them, and fix up SSA form for that. */ @@ -583,11 +586,11 @@ split_loop (class loop *loop1) basic_block cond_bb; class loop *loop2 = loop_version (loop1, cond, &cond_bb, - profile_probability::always (), - profile_probability::always (), - profile_probability::always (), - profile_probability::always (), - true); + true_edge->probability, + true_edge->probability.invert (), + profile_probability::always (), + profile_probability::always (), + true); gcc_assert (loop2); edge new_e = connect_loops (loop1, loop2); @@ -607,6 +610,53 @@ split_loop (class loop *loop1) tree guard_next = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop1)); patch_loop_exit (loop1, guard_stmt, guard_next, newend, initial_true); + /* Check first loop's number of iterations. */ + update_ssa (TODO_update_ssa); + gcc_assert (number_of_iterations_exit (loop1, single_exit (loop1), + &niter, false, true)); + + /* Proportion first loop's bb counts except those dominated by true + branch to avoid drop 1s down. */ + basic_block *bbs1, *bbs2; + bbs1 = get_loop_body (loop1); + unsigned j; + for (j = 0; j < loop1->num_nodes; j++) + if (bbs1[j] == loop1->latch + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) + bbs1[j]->count + = bbs1[j]->count.apply_probability (true_edge->probability); + free (bbs1); + + /* Fix first loop's exit probability after scaling. */ + edge exit_to_latch1 = single_pred_edge (loop1->latch); + exit_to_latch1->probability = exit_to_latch1->probability.apply_scale ( + true_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); + single_exit (loop1)->probability + = exit_to_latch1->probability.invert (); + + /* Check second loop's number of iterations. */ + class tree_niter_desc niter2; + gcc_assert (number_of_iterations_exit (loop2, single_exit (loop2), + &niter2, false, true)); + + /* Proportion second loop's bb counts except those dominated by false + branch to avoid drop 1s down. */ + basic_block bbi_copy = get_bb_copy (false_edge->dest); + bbs2 = get_loop_body (loop2); + for (j = 0; j < loop2->num_nodes; j++) + if (bbs2[j] == loop2->latch + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) + bbs2[j]->count = bbs2[j]->count.apply_probability ( + true_edge->probability.invert ()); + free (bbs2); + + /* Fix second loop's exit probability after scaling. */ + edge exit_to_latch2 = single_pred_edge (loop2->latch); + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale ( + false_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); + single_exit (loop2)->probability + = exit_to_latch2->probability.invert (); + /* Finally patch out the two copies of the condition to be always true/false (or opposite). */ gcond *force_true = as_a<gcond *> (last_stmt (bbs[i])); @@ -1486,8 +1536,8 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) initialize_original_copy_tables (); struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, - profile_probability::always (), - profile_probability::never (), + invar_branch->probability.invert (), + invar_branch->probability, profile_probability::always (), profile_probability::always (), true); @@ -1535,6 +1585,50 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) between loop1 and loop2. */ connect_loop_phis (loop1, loop2, to_loop2); + update_ssa (TODO_update_ssa); + + edge true_edge, false_edge, skip_edge1, skip_edge2; + extract_true_false_edges_from_block (cond_bb, &true_edge, &false_edge); + + /* Proportion first loop's bb counts except those dominated by true + branch to avoid drop 1s down. */ + skip_edge1 = true_invar ? false_edge : true_edge; + skip_edge2 = true_invar ? true_edge : false_edge; + basic_block *bbs1, *bbs2; + bbs1 = get_loop_body (loop1); + unsigned j; + for (j = 0; j < loop1->num_nodes; j++) + if (bbs1[j] == loop1->latch + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], skip_edge1->dest)) + bbs1[j]->count + = bbs1[j]->count.apply_probability (skip_edge1->probability); + free (bbs1); + + /* Fix first loop's exit probability after scaling. */ + to_loop1->probability = invar_branch->probability.invert (); + to_loop2->probability = invar_branch->probability; + + /* Proportion second loop's bb counts except those dominated by false + branch to avoid drop 1s down. */ + basic_block bbi_copy = get_bb_copy (skip_edge2->dest); + bbs2 = get_loop_body (loop2); + for (j = 0; j < loop2->num_nodes; j++) + if (bbs2[j] == loop2->latch + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) + bbs2[j]->count + = bbs2[j]->count.apply_probability (skip_edge2->probability); + free (bbs2); + + /* Fix second loop's exit probability after scaling. */ + edge loop2_latch_exit; + edge exit_to_latch2 = single_pred_edge (loop2->latch); + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale ( + skip_edge2->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); + loop2_latch_exit = EDGE_SUCC (exit_to_latch2->src, 0) == exit_to_latch2 + ? EDGE_SUCC (exit_to_latch2->src, 1) + : EDGE_SUCC (exit_to_latch2->src, 0); + loop2_latch_exit->probability = exit_to_latch2->probability.invert (); + free_original_copy_tables (); return true;
Gentle ping, thanks. [PATCH v3] Fix loop split incorrect count and probability https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583626.html On 2021/11/8 14:09, Xionghu Luo via Gcc-patches wrote: > > > On 2021/10/27 15:44, Jan Hubicka wrote: >>> On Wed, 27 Oct 2021, Jan Hubicka wrote: >>> >>>>> >>>>> gcc/ChangeLog: >>>>> >>>>> * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. >>>>> (do_split_loop_on_cond): Likewise. >>>>> --- >>>>> gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++--------- >>>>> 1 file changed, 16 insertions(+), 9 deletions(-) >>>>> >>>>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c >>>>> index 3f6ad046623..d30782888f3 100644 >>>>> --- a/gcc/tree-ssa-loop-split.c >>>>> +++ b/gcc/tree-ssa-loop-split.c >>>>> @@ -575,7 +575,11 @@ split_loop (class loop *loop1) >>>>> stmts2); >>>>> tree cond = build2 (guard_code, boolean_type_node, guard_init, border); >>>>> if (!initial_true) >>>>> - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); >>>>> + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); >>>>> + >>>>> + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE >>>>> + ? EDGE_SUCC (bbs[i], 0) >>>>> + : EDGE_SUCC (bbs[i], 1); >>>>> >>>>> /* Now version the loop, placing loop2 after loop1 connecting >>>>> them, and fix up SSA form for that. */ >>>>> @@ -583,10 +587,10 @@ split_loop (class loop *loop1) >>>>> basic_block cond_bb; >>>>> >>>>> class loop *loop2 = loop_version (loop1, cond, &cond_bb, >>>>> - profile_probability::always (), >>>>> - profile_probability::always (), >>>>> - profile_probability::always (), >>>>> - profile_probability::always (), >>>>> + true_edge->probability, >>>>> + true_edge->probability.invert (), >>>>> + true_edge->probability, >>>>> + true_edge->probability.invert (), >>>>> true); >>>> >>>> As discussed yesterday, for loop of form >>>> >>>> for (...) >>>> if (cond) >>>> cond = something(); >>>> else >>>> something2 >>>> >>>> Split as >>> >>> Note that you are missing to conditionalize loop1 execution >>> on 'cond' (not sure if that makes a difference). >> You are right - forgot to mention that. >> >> Entry conditional makes no difference on scaling stmts inside loop but >> affects its header and expected trip count. We however need to set up >> probability of this conditional (and preheader count if it exists) >> There is no general way to read the probability of this initial >> conditional from cfg profile. So I guess we are stuck with guessing >> some arbitrary value. I guess common case is that cond is true first >> iteration tough and often we can easily see that fromo PHI node >> initializing the test variable. >> >> Other thing that changes is expected number of iterations of the split >> loops, so we may want to update the exit conditinal probability >> accordingly... >> > Sorry for the late reply. The below updated patch mainly solves the issues > you pointed out: > - profile count proportion for both original loop and copied loop > without dropping down the true branch's count; > - probability update in the two loops and between the two loops; > - number of iterations update/check for split_loop. > > > [PATCH v3] Fix loop split incorrect count and probability > > In tree-ssa-loop-split.c, split_loop and split_loop_on_cond does two > kind of split. split_loop only works for single loop and insert edge at > exit when split, while split_loop_on_cond is not limited to single loop > and insert edge at latch when split. Both split behavior should consider > loop count and probability update. For split_loop, loop split condition > is moved in front of loop1 and loop2; But split_loop_on_cond moves the > condition between loop1 and loop2, this patch does: > 1) profile count proportion for both original loop and copied loop > without dropping down the true branch's count; > 2) probability update in and between the two loops; > 3) number of iterations update for split_loop. > > Regression tested pass, OK for master? > > Changes diff for split_loop and split_loop_on_cond cases: > > 1) diff base/loop-split.c.151t.lsplit patched/loop-split.c.152t.lsplit > ... > <bb 2> [local count: 118111600]: > if (beg_5(D) < end_8(D)) > goto <bb 14>; [89.00%] > else > goto <bb 6>; [11.00%] > > <bb 14> [local count: 105119324]: > if (beg2_6(D) < c_9(D)) > - goto <bb 15>; [100.00%] > + goto <bb 15>; [33.00%] > else > - goto <bb 16>; [100.00%] > + goto <bb 16>; [67.00%] > > - <bb 15> [local count: 105119324]: > + <bb 15> [local count: 34689377]: > _25 = beg_5(D) + 1; > _26 = end_8(D) - beg_5(D); > _27 = beg2_6(D) + _26; > _28 = MIN_EXPR <c_9(D), _27>; > > - <bb 3> [local count: 955630225]: > + <bb 3> [local count: 315357973]: > # i_16 = PHI <i_11(8), beg_5(D)(15)> > # j_17 = PHI <j_12(8), beg2_6(D)(15)> > printf ("a: %d %d\n", i_16, j_17); > i_11 = i_16 + 1; > j_12 = j_17 + 1; > if (j_12 < _28) > - goto <bb 8>; [89.00%] > + goto <bb 8>; [29.37%] > else > - goto <bb 17>; [11.00%] > + goto <bb 17>; [70.63%] > > - <bb 8> [local count: 850510901]: > + <bb 8> [local count: 280668596]: > goto <bb 3>; [100.00%] > > - <bb 16> [local count: 105119324]: > + <bb 16> [local count: 70429947]: > # i_22 = PHI <beg_5(D)(14), i_29(17)> > # j_23 = PHI <beg2_6(D)(14), j_30(17)> > > <bb 10> [local count: 955630225]: > # i_2 = PHI <i_22(16), i_20(13)> > # j_1 = PHI <j_23(16), j_21(13)> > i_20 = i_2 + 1; > j_21 = j_1 + 1; > if (end_8(D) > i_20) > - goto <bb 13>; [89.00%] > + goto <bb 13>; [59.63%] > else > - goto <bb 9>; [11.00%] > + goto <bb 9>; [40.37%] > > - <bb 13> [local count: 850510901]: > + <bb 13> [local count: 569842305]: > goto <bb 10>; [100.00%] > > <bb 17> [local count: 105119324]: > # i_29 = PHI <i_11(3)> > # j_30 = PHI <j_12(3)> > if (end_8(D) > i_29) > goto <bb 16>; [80.00%] > else > goto <bb 9>; [20.00%] > > <bb 9> [local count: 105119324]: > > <bb 6> [local count: 118111600]: > return 0; > > } > > 2) diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit: > ... > <bb 2> [local count: 118111600]: > if (n_7(D) > 0) > goto <bb 4>; [89.00%] > else > goto <bb 3>; [11.00%] > > <bb 3> [local count: 118111600]: > return; > > <bb 4> [local count: 105119324]: > pretmp_3 = ga; > > - <bb 5> [local count: 955630225]: > + <bb 5> [local count: 315357973]: > # i_13 = PHI <i_10(20), 0(4)> > # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)> > if (prephitmp_12 != 0) > goto <bb 6>; [33.00%] > else > goto <bb 7>; [67.00%] > > <bb 6> [local count: 315357972]: > _2 = do_something (); > ga = _2; > > - <bb 7> [local count: 955630225]: > + <bb 7> [local count: 315357973]: > # prephitmp_5 = PHI <prephitmp_12(5), _2(6)> > i_10 = inc (i_13); > if (n_7(D) > i_10) > goto <bb 21>; [89.00%] > else > goto <bb 11>; [11.00%] > > <bb 11> [local count: 105119324]: > goto <bb 3>; [100.00%] > > - <bb 21> [local count: 850510901]: > + <bb 21> [local count: 280668596]: > if (prephitmp_12 != 0) > - goto <bb 20>; [100.00%] > + goto <bb 20>; [33.00%] > else > - goto <bb 19>; [INV] > + goto <bb 19>; [67.00%] > > - <bb 20> [local count: 850510901]: > + <bb 20> [local count: 280668596]: > goto <bb 5>; [100.00%] > > - <bb 19> [count: 0]: > + <bb 19> [local count: 70429947]: > # i_23 = PHI <i_10(21)> > # prephitmp_25 = PHI <prephitmp_5(21)> > > - <bb 12> [local count: 955630225]: > + <bb 12> [local count: 640272252]: > # i_15 = PHI <i_23(19), i_22(16)> > # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)> > i_22 = inc (i_15); > if (n_7(D) > i_22) > - goto <bb 16>; [89.00%] > + goto <bb 16>; [59.63%] > else > - goto <bb 11>; [11.00%] > + goto <bb 11>; [40.37%] > > - <bb 16> [local count: 850510901]: > + <bb 16> [local count: 569842305]: > goto <bb 12>; [100.00%] > > } > > gcc/ChangeLog: > > * tree-ssa-loop-split.c (split_loop): Fix incorrect > profile_count and probability. > (do_split_loop_on_cond): Likewise. > --- > gcc/tree-ssa-loop-split.c | 110 +++++++++++++++++++++++++++++++++++--- > 1 file changed, 102 insertions(+), 8 deletions(-) > > diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c > index 3f6ad046623..102766241fb 100644 > --- a/gcc/tree-ssa-loop-split.c > +++ b/gcc/tree-ssa-loop-split.c > @@ -575,7 +575,10 @@ split_loop (class loop *loop1) > stmts2); > tree cond = build2 (guard_code, boolean_type_node, guard_init, border); > if (!initial_true) > - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); > + > + edge true_edge, false_edge; > + extract_true_false_edges_from_block (bbs[i], &true_edge, &false_edge); > > /* Now version the loop, placing loop2 after loop1 connecting > them, and fix up SSA form for that. */ > @@ -583,11 +586,11 @@ split_loop (class loop *loop1) > basic_block cond_bb; > > class loop *loop2 = loop_version (loop1, cond, &cond_bb, > - profile_probability::always (), > - profile_probability::always (), > - profile_probability::always (), > - profile_probability::always (), > - true); > + true_edge->probability, > + true_edge->probability.invert (), > + profile_probability::always (), > + profile_probability::always (), > + true); > gcc_assert (loop2); > > edge new_e = connect_loops (loop1, loop2); > @@ -607,6 +610,53 @@ split_loop (class loop *loop1) > tree guard_next = PHI_ARG_DEF_FROM_EDGE (phi, loop_latch_edge (loop1)); > patch_loop_exit (loop1, guard_stmt, guard_next, newend, initial_true); > > + /* Check first loop's number of iterations. */ > + update_ssa (TODO_update_ssa); > + gcc_assert (number_of_iterations_exit (loop1, single_exit (loop1), > + &niter, false, true)); > + > + /* Proportion first loop's bb counts except those dominated by true > + branch to avoid drop 1s down. */ > + basic_block *bbs1, *bbs2; > + bbs1 = get_loop_body (loop1); > + unsigned j; > + for (j = 0; j < loop1->num_nodes; j++) > + if (bbs1[j] == loop1->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], true_edge->dest)) > + bbs1[j]->count > + = bbs1[j]->count.apply_probability (true_edge->probability); > + free (bbs1); > + > + /* Fix first loop's exit probability after scaling. */ > + edge exit_to_latch1 = single_pred_edge (loop1->latch); > + exit_to_latch1->probability = exit_to_latch1->probability.apply_scale ( > + true_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); > + single_exit (loop1)->probability > + = exit_to_latch1->probability.invert (); > + > + /* Check second loop's number of iterations. */ > + class tree_niter_desc niter2; > + gcc_assert (number_of_iterations_exit (loop2, single_exit (loop2), > + &niter2, false, true)); > + > + /* Proportion second loop's bb counts except those dominated by false > + branch to avoid drop 1s down. */ > + basic_block bbi_copy = get_bb_copy (false_edge->dest); > + bbs2 = get_loop_body (loop2); > + for (j = 0; j < loop2->num_nodes; j++) > + if (bbs2[j] == loop2->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) > + bbs2[j]->count = bbs2[j]->count.apply_probability ( > + true_edge->probability.invert ()); > + free (bbs2); > + > + /* Fix second loop's exit probability after scaling. */ > + edge exit_to_latch2 = single_pred_edge (loop2->latch); > + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale ( > + false_edge->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); > + single_exit (loop2)->probability > + = exit_to_latch2->probability.invert (); > + > /* Finally patch out the two copies of the condition to be always > true/false (or opposite). */ > gcond *force_true = as_a<gcond *> (last_stmt (bbs[i])); > @@ -1486,8 +1536,8 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > initialize_original_copy_tables (); > > struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, > - profile_probability::always (), > - profile_probability::never (), > + invar_branch->probability.invert (), > + invar_branch->probability, > profile_probability::always (), > profile_probability::always (), > true); > @@ -1535,6 +1585,50 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) > between loop1 and loop2. */ > connect_loop_phis (loop1, loop2, to_loop2); > > + update_ssa (TODO_update_ssa); > + > + edge true_edge, false_edge, skip_edge1, skip_edge2; > + extract_true_false_edges_from_block (cond_bb, &true_edge, &false_edge); > + > + /* Proportion first loop's bb counts except those dominated by true > + branch to avoid drop 1s down. */ > + skip_edge1 = true_invar ? false_edge : true_edge; > + skip_edge2 = true_invar ? true_edge : false_edge; > + basic_block *bbs1, *bbs2; > + bbs1 = get_loop_body (loop1); > + unsigned j; > + for (j = 0; j < loop1->num_nodes; j++) > + if (bbs1[j] == loop1->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs1[j], skip_edge1->dest)) > + bbs1[j]->count > + = bbs1[j]->count.apply_probability (skip_edge1->probability); > + free (bbs1); > + > + /* Fix first loop's exit probability after scaling. */ > + to_loop1->probability = invar_branch->probability.invert (); > + to_loop2->probability = invar_branch->probability; > + > + /* Proportion second loop's bb counts except those dominated by false > + branch to avoid drop 1s down. */ > + basic_block bbi_copy = get_bb_copy (skip_edge2->dest); > + bbs2 = get_loop_body (loop2); > + for (j = 0; j < loop2->num_nodes; j++) > + if (bbs2[j] == loop2->latch > + || !dominated_by_p (CDI_DOMINATORS, bbs2[j], bbi_copy)) > + bbs2[j]->count > + = bbs2[j]->count.apply_probability (skip_edge2->probability); > + free (bbs2); > + > + /* Fix second loop's exit probability after scaling. */ > + edge loop2_latch_exit; > + edge exit_to_latch2 = single_pred_edge (loop2->latch); > + exit_to_latch2->probability = exit_to_latch2->probability.apply_scale ( > + skip_edge2->probability.to_reg_br_prob_base (), REG_BR_PROB_BASE); > + loop2_latch_exit = EDGE_SUCC (exit_to_latch2->src, 0) == exit_to_latch2 > + ? EDGE_SUCC (exit_to_latch2->src, 1) > + : EDGE_SUCC (exit_to_latch2->src, 0); > + loop2_latch_exit->probability = exit_to_latch2->probability.invert (); > + > free_original_copy_tables (); > > return true; >
diff base/loop-cond-split-1.c.151t.lsplit patched/loop-cond-split-1.c.151t.lsplit: ... int prephitmp_16; int prephitmp_25; <bb 2> [local count: 118111600]: if (n_7(D) > 0) goto <bb 4>; [89.00%] else goto <bb 3>; [11.00%] <bb 3> [local count: 118111600]: return; <bb 4> [local count: 105119324]: pretmp_3 = ga; - <bb 5> [local count: 955630225]: + <bb 5> [local count: 315357973]: # i_13 = PHI <i_10(20), 0(4)> # prephitmp_12 = PHI <prephitmp_5(20), pretmp_3(4)> if (prephitmp_12 != 0) goto <bb 6>; [33.00%] else goto <bb 7>; [67.00%] - <bb 6> [local count: 315357972]: + <bb 6> [local count: 104068130]: _2 = do_something (); ga = _2; - <bb 7> [local count: 955630225]: + <bb 7> [local count: 315357973]: # prephitmp_5 = PHI <prephitmp_12(5), _2(6)> i_10 = inc (i_13); if (n_7(D) > i_10) goto <bb 21>; [89.00%] else goto <bb 11>; [11.00%] <bb 11> [local count: 105119324]: goto <bb 3>; [100.00%] - <bb 21> [local count: 850510901]: + <bb 21> [local count: 280668596]: if (prephitmp_12 != 0) - goto <bb 20>; [100.00%] + goto <bb 20>; [33.00%] else - goto <bb 19>; [INV] + goto <bb 19>; [67.00%] - <bb 20> [local count: 850510901]: + <bb 20> [local count: 280668596]: goto <bb 5>; [100.00%] - <bb 19> [count: 0]: + <bb 19> [local count: 70429947]: # i_23 = PHI <i_10(21)> # prephitmp_25 = PHI <prephitmp_5(21)> - <bb 12> [local count: 955630225]: + <bb 12> [local count: 640272252]: # i_15 = PHI <i_23(19), i_22(16)> # prephitmp_16 = PHI <prephitmp_25(19), prephitmp_16(16)> i_22 = inc (i_15); if (n_7(D) > i_22) goto <bb 16>; [89.00%] else goto <bb 11>; [11.00%] - <bb 16> [local count: 850510901]: + <bb 16> [local count: 569842305]: goto <bb 12>; [100.00%] } gcc/ChangeLog: * tree-ssa-loop-split.c (split_loop): Fix incorrect probability. (do_split_loop_on_cond): Likewise. --- gcc/tree-ssa-loop-split.c | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c index 3f6ad046623..d30782888f3 100644 --- a/gcc/tree-ssa-loop-split.c +++ b/gcc/tree-ssa-loop-split.c @@ -575,7 +575,11 @@ split_loop (class loop *loop1) stmts2); tree cond = build2 (guard_code, boolean_type_node, guard_init, border); if (!initial_true) - cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); + cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); + + edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE + ? EDGE_SUCC (bbs[i], 0) + : EDGE_SUCC (bbs[i], 1); /* Now version the loop, placing loop2 after loop1 connecting them, and fix up SSA form for that. */ @@ -583,10 +587,10 @@ split_loop (class loop *loop1) basic_block cond_bb; class loop *loop2 = loop_version (loop1, cond, &cond_bb, - profile_probability::always (), - profile_probability::always (), - profile_probability::always (), - profile_probability::always (), + true_edge->probability, + true_edge->probability.invert (), + true_edge->probability, + true_edge->probability.invert (), true); gcc_assert (loop2); @@ -1486,10 +1490,10 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) initialize_original_copy_tables (); struct loop *loop2 = loop_version (loop1, boolean_true_node, NULL, - profile_probability::always (), - profile_probability::never (), - profile_probability::always (), - profile_probability::always (), + invar_branch->probability.invert (), + invar_branch->probability, + invar_branch->probability.invert (), + invar_branch->probability, true); if (!loop2) { @@ -1530,6 +1534,9 @@ do_split_loop_on_cond (struct loop *loop1, edge invar_branch) to_loop1->flags |= true_invar ? EDGE_FALSE_VALUE : EDGE_TRUE_VALUE; to_loop2->flags |= true_invar ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE; + to_loop1->probability = invar_branch->probability.invert (); + to_loop2->probability = invar_branch->probability; + /* Due to introduction of a control flow edge from loop1 latch to loop2 pre-header, we should update PHIs in loop2 to reflect this connection between loop1 and loop2. */