Message ID | 20211208055416.1415283-3-luoxhu@linux.ibm.com |
---|---|
State | Committed |
Commit | 46bfe1b0e11c4797c5926e0754fae2848026376c |
Headers |
To: gcc-patches@gcc.gnu.org Subject: [PATCH 2/3] Fix incorrect loop exit edge probability [PR103270] Date: Tue, 7 Dec 2021 23:54:15 -0600 Message-Id: <20211208055416.1415283-3-luoxhu@linux.ibm.com> In-Reply-To: <20211208055416.1415283-1-luoxhu@linux.ibm.com> References: <20211208055416.1415283-1-luoxhu@linux.ibm.com> From: Xionghu Luo via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Xionghu Luo <luoxhu@linux.ibm.com> Cc: segher@kernel.crashing.org, Xionghu Luo <luoxhu@linux.ibm.com>, hubicka@kam.mff.cuni.cz, wschmidt@linux.ibm.com, linkw@gcc.gnu.org, dje.gcc@gmail.com |
Series | Dependency patches for hoist LIM code to cold loop |
Commit Message
Xionghu Luo
Dec. 8, 2021, 5:54 a.m. UTC
r12-4526 cancelled the jump-threading paths that rotate loops. This exposes an issue in profile-estimate: in predict_extra_loop_exits, the outer loop's exit edge is marked as an extra loop exit of the inner loop and given an incorrect prediction, so a hot inner loop eventually becomes a cold loop after later optimizations. This patch adds a loop check when searching for extra exit edges, to avoid an unexpected predict_edge from predict_paths_for_bb.

Regression tested on P8LE, OK for master?

gcc/ChangeLog:

    PR middle-end/103270
    * predict.c (predict_extra_loop_exits): Add loop parameter.
    (predict_loops): Call with loop argument.

gcc/testsuite/ChangeLog:

    PR middle-end/103270
    * gcc.dg/pr103270.c: New test.
---
 gcc/predict.c                   | 10 ++++++----
 gcc/testsuite/gcc.dg/pr103270.c | 19 +++++++++++++++++++
 2 files changed, 25 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr103270.c
Comments
On 12/7/2021 10:54 PM, Xionghu Luo via Gcc-patches wrote:
> r12-4526 cancelled jump thread path rotates loop. It exposes a issue in
> profile-estimate when predict_extra_loop_exits, outer loop's exit edge
> is marked as inner loop's extra loop exit and set with incorrect
> prediction, then a hot inner loop will become cold loop finally through
> optimizations, this patch add loop check when searching extra exit edges
> to avoid unexpected predict_edge from predict_paths_for_bb.
>
> Regression tested on P8LE, OK for master?
>
> gcc/ChangeLog:
>
> PR middle-end/103270
> * predict.c (predict_extra_loop_exits): Add loop parameter.
> (predict_loops): Call with loop argument.
>
> gcc/testsuite/ChangeLog:
>
> PR middle-end/103270
> * gcc.dg/pr103270.c: New test.

OK
jeff
> r12-4526 cancelled jump thread path rotates loop. It exposes a issue in
> profile-estimate when predict_extra_loop_exits, outer loop's exit edge
> is marked as inner loop's extra loop exit and set with incorrect
> prediction, then a hot inner loop will become cold loop finally through
> optimizations, this patch add loop check when searching extra exit edges
> to avoid unexpected predict_edge from predict_paths_for_bb.
>
> Regression tested on P8LE, OK for master?
>
> gcc/ChangeLog:
>
> PR middle-end/103270
> * predict.c (predict_extra_loop_exits): Add loop parameter.
> (predict_loops): Call with loop argument.

With changes to branch predictors it is useful to re-test their effectiveness on SPEC and see whether their hitrates still match reality. You can do this by building SPEC with -fprofile-generate, training it, then building with -fprofile-use -fdump-tree-ipa-profile-details and running contrib/analyze_brprob.py, which collects information on how the predictors perform.

This patch looks good to me, but it would be nice to have things reality-checked (and since we have not done the stats for some time, there may be surprises), so if you could run SPEC and post the results of analyze_brprob, it would be great. I will also try to get to that soon, but currently I am a bit swamped by other problems I noticed on clang builds.

Thanks a lot for working on profile fixes - I am now trying to get things into shape. With Martin we added basic testing infrastructure for keeping track of profile updates, and I am now trying to see how it works in practice. Hopefully it will make it easier to judge profile-updating patches. I would welcome a list of patches I should look at.

I will write a separate mail on this.
Honza

>
> gcc/testsuite/ChangeLog:
>
> PR middle-end/103270
> * gcc.dg/pr103270.c: New test.
> ---
>  gcc/predict.c                   | 10 ++++++----
>  gcc/testsuite/gcc.dg/pr103270.c | 19 +++++++++++++++++++
>  2 files changed, 25 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr103270.c
>
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 3cb4e3c0eb5..5b6e0cf722b 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -1859,7 +1859,7 @@ predict_iv_comparison (class loop *loop, basic_block bb,
>     exits to predict them using PRED_LOOP_EXTRA_EXIT.  */
>
>  static void
> -predict_extra_loop_exits (edge exit_edge)
> +predict_extra_loop_exits (class loop *loop, edge exit_edge)
>  {
>    unsigned i;
>    bool check_value_one;
> @@ -1912,12 +1912,14 @@ predict_extra_loop_exits (edge exit_edge)
>          continue;
>        if (EDGE_COUNT (e->src->succs) != 1)
>          {
> -          predict_paths_leading_to_edge (e, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN);
> +          predict_paths_leading_to_edge (e, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN,
> +                                         loop);
>            continue;
>          }
>
>        FOR_EACH_EDGE (e1, ei, e->src->preds)
> -        predict_paths_leading_to_edge (e1, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN);
> +        predict_paths_leading_to_edge (e1, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN,
> +                                       loop);
>      }
>  }
>
> @@ -2008,7 +2010,7 @@ predict_loops (void)
>                       ex->src->index, ex->dest->index);
>            continue;
>          }
> -      predict_extra_loop_exits (ex);
> +      predict_extra_loop_exits (loop, ex);
>
>        if (number_of_iterations_exit (loop, ex, &niter_desc, false, false))
>          niter = niter_desc.niter;
> diff --git a/gcc/testsuite/gcc.dg/pr103270.c b/gcc/testsuite/gcc.dg/pr103270.c
> new file mode 100644
> index 00000000000..819310e360e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr103270.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
> +
> +void test(int a, int* i)
> +{
> +  for (; a < 5; ++a)
> +    {
> +      int b = 0;
> +      int c = 0;
> +      for (; b != -11; b--)
> +        for (int d = 0; d ==0; d++)
> +          {
> +            *i += c & a;
> +            c = b;
> +          }
> +    }
> +}
> +
> +/* { dg-final { scan-tree-dump-not "extra loop exit heuristics of edge\[^:\]*:" "profile_estimate"} } */
> --
> 2.25.1
>
On 2021/12/13 17:25, Jan Hubicka wrote:
> With changes to branch predictors it is useful to re-test their
> effectiveness on SPEC and see whether their hitrates still match
> reality. You can do this by building SPEC with -fprofile-generate,
> training it, then building with -fprofile-use
> -fdump-tree-ipa-profile-details and running contrib/analyze_brprob.py,
> which collects information on how the predictors perform.
>
> This patch looks good to me, but it would be nice to have things reality
> checked (and since we have not done the stats for some time, there may be
> surprises), so if you could run SPEC and post the results of
> analyze_brprob, it would be great.

With the patch, analyze_brprob.py outputs the data below for a PGO build. There is no verification code in the script, so how do I check whether the result is correct? Run it again without the patch and compare the "extra loop exit" row?

./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all

HEURISTICS | BRANCHES | (REL) | BR. HITRATE | HITRATE | COVERAGE | COVERAGE | (REL) | predict.def | (REL) | HOT branches (>10%) |
---|---|---|---|---|---|---|---|---|---|---|
noreturn call | 1 | 0.0% | 100.00% | 50.00% / 50.00% | 2 | 2.00 | 0.0% | | | 100%:1 |
Fortran zero-sized array | 3 | 0.0% | 66.67% | 41.71% / 60.50% | 362 | 362.00 | 0.0% | | | 100%:3 |
loop iv compare | 16 | 0.0% | 93.75% | 98.26% / 98.76% | 279847 | 279.85k | 0.0% | | | 93%:4 |
__builtin_expect | 35 | 0.0% | 97.14% | 78.09% / 78.35% | 17079558 | 17.08M | 0.0% | | | |
loop guard with recursion | 45 | 0.1% | 86.67% | 85.13% / 85.14% | 6722424412 | 6.72G | 1.3% | | | 74%:4 |
extra loop exit | 80 | 0.1% | 58.75% | 81.49% / 89.21% | 438470261 | 438.47M | 0.1% | | | 86%:3 |
guess loop iv compare | 235 | 0.3% | 80.85% | 52.83% / 73.97% | 148558247 | 148.56M | 0.0% | | | 47%:3 |
negative return | 241 | 0.3% | 71.37% | 25.33% / 92.61% | 250402383 | 250.40M | 0.0% | | | 69%:2 |
loop exit with recursion | 315 | 0.4% | 74.60% | 85.07% / 85.71% | 9403136858 | 9.40G | 1.8% | | | 59%:4 |
const return | 320 | 0.4% | 51.88% | 90.45% / 95.63% | 925341727 | 925.34M | 0.2% | | | 76%:5 |
indirect call | 377 | 0.5% | 51.46% | 84.72% / 91.14% | 2133772848 | 2.13G | 0.4% | | | 69%:1 |
polymorphic call | 410 | 0.5% | 44.15% | 31.26% / 79.37% | 3272688244 | 3.27G | 0.6% | | | 53%:2 |
recursive call | 506 | 0.7% | 39.53% | 44.97% / 83.92% | 1211036806 | 1.21G | 0.2% | | | 10%:1 |
goto | 618 | 0.8% | 64.24% | 65.37% / 83.57% | 702446178 | 702.45M | 0.1% | | | 20%:1 |
null return | 800 | 1.1% | 64.62% | 56.59% / 77.70% | 603952067 | 603.95M | 0.1% | | | 28%:2 |
continue | 956 | 1.3% | 63.70% | 65.65% / 79.97% | 3780303799 | 3.78G | 0.7% | | | 52%:3 |
loop guard | 1177 | 1.6% | 56.33% | 42.54% / 80.32% | 7373601457 | 7.37G | 1.4% | | | 50%:2 |
opcode values positive (on trees) | 2020 | 2.7% | 62.38% | 64.16% / 84.44% | 31695571761 | 31.70G | 6.0% | | | 21%:2 |
loop exit | 3293 | 4.4% | 76.19% | 87.18% / 88.35% | 50377138963 | 50.38G | 9.6% | | | 18%:1 |
loop iterations | 4761 | 6.3% | 99.98% | 84.27% / 84.27% | 73463634555 | 73.46G | 13.9% | | | |
pointer (on trees) | 8076 | 10.7% | 56.23% | 69.36% / 83.15% | 12322099991 | 12.32G | 2.3% | | | |
call | 11396 | 15.1% | 64.14% | 74.13% / 89.82% | 25197949198 | 25.20G | 4.8% | | | 34%:1 |
opcode values nonequal (on trees) | 12237 | 16.3% | 70.70% | 70.86% / 83.54% | 36638772333 | 36.64G | 6.9% | | | |
guessed loop iterations | 16760 | 22.3% | 99.78% | 91.49% / 91.49% | 162952747918 | 162.95G | 30.9% | | | |

HEURISTICS | BRANCHES | (REL) | BR. HITRATE | HITRATE | COVERAGE | COVERAGE | (REL) | predict.def | (REL) | HOT branches (>10%) |
---|---|---|---|---|---|---|---|---|---|---|
no prediction | 12730 | 16.9% | 39.29% | 33.32% / 79.93% | 121106031835 | 121.11G | 23.0% | | | |
first match | 25261 | 33.6% | 92.17% | 88.33% / 88.98% | 296652487962 | 296.65G | 56.3% | | | |
DS theory | 28333 | 37.7% | 63.03% | 72.05% / 85.00% | 109563734005 | 109.56G | 20.8% | | | |
combined | 75232 | 100.0% | 73.17% | 72.32% / 86.08% | 527351738575 | 527.35G | 100.0% | | | |

Loop count: 37870
avg. # of iter: 8444.77
median # of iter: 7.00
avg. (1% cutoff) # of iter: 174.68
avg. (5% cutoff) # of iter: 55.14
avg. (10% cutoff) # of iter: 35.21
avg. (20% cutoff) # of iter: 26.23
avg. (30% cutoff) # of iter: 21.70
On 2021/12/14 17:27, Xionghu Luo via Gcc-patches wrote:
> With the patch, the analyze_brprob.py outputs below data with PGO build,
> there is no verification code in the script, so how to check whether it
> is correct? Run it again without the patch and compare "extra loop exit"
> field?

This is the output collected without the patch. As can be seen, there is no difference in the "extra loop exit" row. But this issue should still be fixed.

./contrib/analyze_brprob_spec.py ~/workspace/tests/spec2017/

benchspec

HEURISTICS | BRANCHES | (REL) | BR. HITRATE | HITRATE | COVERAGE | COVERAGE | (REL) | predict.def | (REL) | HOT branches (>10%) |
---|---|---|---|---|---|---|---|---|---|---|
noreturn call | 1 | 0.0% | 100.00% | 50.00% / 50.00% | 2 | 2.00 | 0.0% | | | 100%:1 |
Fortran zero-sized array | 3 | 0.0% | 66.67% | 41.71% / 60.50% | 362 | 362.00 | 0.0% | | | 100%:3 |
loop iv compare | 16 | 0.0% | 93.75% | 98.26% / 98.76% | 279847 | 279.85k | 0.0% | | | 93%:4 |
__builtin_expect | 35 | 0.0% | 97.14% | 78.09% / 78.35% | 17079558 | 17.08M | 0.0% | | | |
loop guard with recursion | 45 | 0.1% | 86.67% | 85.13% / 85.14% | 6722424412 | 6.72G | 1.3% | | | 74%:4 |
extra loop exit | 80 | 0.1% | 58.75% | 81.49% / 89.21% | 438470261 | 438.47M | 0.1% | | | 86%:3 |
guess loop iv compare | 235 | 0.3% | 80.85% | 52.83% / 73.97% | 148558247 | 148.56M | 0.0% | | | 47%:3 |
negative return | 241 | 0.3% | 71.37% | 25.33% / 92.61% | 250402383 | 250.40M | 0.0% | | | 69%:2 |
loop exit with recursion | 315 | 0.4% | 74.60% | 85.07% / 85.71% | 9403136858 | 9.40G | 1.8% | | | 59%:4 |
const return | 320 | 0.4% | 51.88% | 90.45% / 95.63% | 925341727 | 925.34M | 0.2% | | | 76%:5 |
indirect call | 377 | 0.5% | 51.46% | 84.72% / 91.14% | 2133772848 | 2.13G | 0.4% | | | 69%:1 |
polymorphic call | 410 | 0.5% | 44.15% | 31.26% / 79.37% | 3272688238 | 3.27G | 0.6% | | | 53%:2 |
recursive call | 506 | 0.7% | 39.53% | 44.97% / 83.92% | 1211036806 | 1.21G | 0.2% | | | 10%:1 |
goto | 618 | 0.8% | 64.24% | 65.37% / 83.57% | 702446178 | 702.45M | 0.1% | | | 20%:1 |
null return | 800 | 1.1% | 64.62% | 56.59% / 77.70% | 603952067 | 603.95M | 0.1% | | | 28%:2 |
continue | 956 | 1.3% | 63.70% | 65.65% / 79.97% | 3780303795 | 3.78G | 0.7% | | | 52%:3 |
loop guard | 1178 | 1.6% | 56.37% | 42.54% / 80.32% | 7373601533 | 7.37G | 1.4% | | | 50%:2 |
opcode values positive (on trees) | 2020 | 2.7% | 62.38% | 64.16% / 84.44% | 31695571761 | 31.70G | 5.9% | | | 21%:2 |
loop exit | 3293 | 4.4% | 76.19% | 87.18% / 88.35% | 50377138963 | 50.38G | 9.4% | | | 18%:1 |
loop iterations | 4772 | 6.3% | 99.98% | 84.27% / 84.27% | 74045982111 | 74.05G | 13.8% | | | |
pointer (on trees) | 8076 | 10.7% | 56.23% | 69.36% / 83.15% | 12322099991 | 12.32G | 2.3% | | | |
call | 11396 | 15.1% | 64.14% | 74.13% / 89.82% | 25197949198 | 25.20G | 4.7% | | | 34%:1 |
opcode values nonequal (on trees) | 12240 | 16.2% | 70.71% | 70.86% / 83.54% | 36638772682 | 36.64G | 6.9% | | | |
guessed loop iterations | 16854 | 22.4% | 99.78% | 91.21% / 91.22% | 169765264401 | 169.77G | 31.7% | | | |

HEURISTICS | BRANCHES | (REL) | BR. HITRATE | HITRATE | COVERAGE | COVERAGE | (REL) | predict.def | (REL) | HOT branches (>10%) |
---|---|---|---|---|---|---|---|---|---|---|
no prediction | 12731 | 16.9% | 39.30% | 33.32% / 79.93% | 121106031963 | 121.11G | 22.6% | | | |
first match | 25366 | 33.7% | 92.20% | 88.24% / 88.88% | 304047352001 | 304.05G | 56.9% | | | |
DS theory | 28337 | 37.6% | 63.03% | 72.05% / 85.00% | 109563734430 | 109.56G | 20.5% | | | |
combined | 75342 | 100.0% | 73.21% | 72.49% / 86.06% | 534746603167 | 534.75G | 100.0% | | | |

Loop count: 38058
avg. # of iter: 8403.32
median # of iter: 7.00
avg. (1% cutoff) # of iter: 173.72
avg. (5% cutoff) # of iter: 54.90
avg. (10% cutoff) # of iter: 35.20
avg. (20% cutoff) # of iter: 26.35
avg. (30% cutoff) # of iter: 21.87
> > > > > > ./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all > > HEURISTICS BRANCHES (REL) BR. HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%) > > noreturn call 1 0.0% 100.00% 50.00% / 50.00% 2 2.00 0.0% 100%:1 > > Fortran zero-sized array 3 0.0% 66.67% 41.71% / 60.50% 362 362.00 0.0% 100%:3 > > loop iv compare 16 0.0% 93.75% 98.26% / 98.76% 279847 279.85k 0.0% 93%:4 > > __builtin_expect 35 0.0% 97.14% 78.09% / 78.35% 17079558 17.08M 0.0% > > loop guard with recursion 45 0.1% 86.67% 85.13% / 85.14% 6722424412 6.72G 1.3% 74%:4 > > extra loop exit 80 0.1% 58.75% 81.49% / 89.21% 438470261 438.47M 0.1% 86%:3 > > guess loop iv compare 235 0.3% 80.85% 52.83% / 73.97% 148558247 148.56M 0.0% 47%:3 > > negative return 241 0.3% 71.37% 25.33% / 92.61% 250402383 250.40M 0.0% 69%:2 > > loop exit with recursion 315 0.4% 74.60% 85.07% / 85.71% 9403136858 9.40G 1.8% 59%:4 > > const return 320 0.4% 51.88% 90.45% / 95.63% 925341727 925.34M 0.2% 76%:5 > > indirect call 377 0.5% 51.46% 84.72% / 91.14% 2133772848 2.13G 0.4% 69%:1 > > polymorphic call 410 0.5% 44.15% 31.26% / 79.37% 3272688244 3.27G 0.6% 53%:2 > > recursive call 506 0.7% 39.53% 44.97% / 83.92% 1211036806 1.21G 0.2% 10%:1 > > goto 618 0.8% 64.24% 65.37% / 83.57% 702446178 702.45M 0.1% 20%:1 > > null return 800 1.1% 64.62% 56.59% / 77.70% 603952067 603.95M 0.1% 28%:2 > > continue 956 1.3% 63.70% 65.65% / 79.97% 3780303799 3.78G 0.7% 52%:3 > > loop guard 1177 1.6% 56.33% 42.54% / 80.32% 7373601457 7.37G 1.4% 50%:2 > > opcode values positive (on trees) 2020 2.7% 62.38% 64.16% / 84.44% 31695571761 31.70G 6.0% 21%:2 > > loop exit 3293 4.4% 76.19% 87.18% / 88.35% 50377138963 50.38G 9.6% 18%:1 > > loop iterations 4761 6.3% 99.98% 84.27% / 84.27% 73463634555 73.46G 13.9% > > pointer (on trees) 8076 10.7% 56.23% 69.36% / 83.15% 12322099991 12.32G 2.3% > > call 11396 15.1% 64.14% 74.13% / 89.82% 25197949198 25.20G 4.8% 34%:1 > > opcode values nonequal (on trees) 12237 16.3% 70.70% 
70.86% / 83.54% 36638772333 36.64G 6.9% > > guessed loop iterations 16760 22.3% 99.78% 91.49% / 91.49% 162952747918 162.95G 30.9% > > > > HEURISTICS BRANCHES (REL) BR. HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%) > > no prediction 12730 16.9% 39.29% 33.32% / 79.93% 121106031835 121.11G 23.0% > > first match 25261 33.6% 92.17% 88.33% / 88.98% 296652487962 296.65G 56.3% > > DS theory 28333 37.7% 63.03% 72.05% / 85.00% 109563734005 109.56G 20.8% > > combined 75232 100.0% 73.17% 72.32% / 86.08% 527351738575 527.35G 100.0% > > > > Loop count: 37870 > > avg. # of iter: 8444.77 > > median # of iter: 7.00 > > avg. (1% cutoff) # of iter: 174.68 > > avg. (5% cutoff) # of iter: 55.14 > > avg. (10% cutoff) # of iter: 35.21 > > avg. (20% cutoff) # of iter: 26.23 > > avg. (30% cutoff) # of iter: 21.70 > > This is the output data collected without the patch, as can be seen, no difference on "extra loop exit". > But this issue should be fixed. > > > ./contrib/analyze_brprob_spec.py ~/workspace/tests/spec2017/ > > benchspec > HEURISTICS BRANCHES (REL) BR. 
HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%)
> noreturn call                         1    0.0%  100.00%  50.00% / 50.00%             2     2.00    0.0%  100%:1
> Fortran zero-sized array              3    0.0%   66.67%  41.71% / 60.50%           362   362.00    0.0%  100%:3
> loop iv compare                      16    0.0%   93.75%  98.26% / 98.76%        279847  279.85k    0.0%  93%:4
> __builtin_expect                     35    0.0%   97.14%  78.09% / 78.35%      17079558   17.08M    0.0%
> loop guard with recursion            45    0.1%   86.67%  85.13% / 85.14%    6722424412    6.72G    1.3%  74%:4
> extra loop exit                      80    0.1%   58.75%  81.49% / 89.21%     438470261  438.47M    0.1%  86%:3
> guess loop iv compare               235    0.3%   80.85%  52.83% / 73.97%     148558247  148.56M    0.0%  47%:3
> negative return                     241    0.3%   71.37%  25.33% / 92.61%     250402383  250.40M    0.0%  69%:2
> loop exit with recursion            315    0.4%   74.60%  85.07% / 85.71%    9403136858    9.40G    1.8%  59%:4
> const return                        320    0.4%   51.88%  90.45% / 95.63%     925341727  925.34M    0.2%  76%:5
> indirect call                       377    0.5%   51.46%  84.72% / 91.14%    2133772848    2.13G    0.4%  69%:1
> polymorphic call                    410    0.5%   44.15%  31.26% / 79.37%    3272688238    3.27G    0.6%  53%:2
> recursive call                      506    0.7%   39.53%  44.97% / 83.92%    1211036806    1.21G    0.2%  10%:1
> goto                                618    0.8%   64.24%  65.37% / 83.57%     702446178  702.45M    0.1%  20%:1
> null return                         800    1.1%   64.62%  56.59% / 77.70%     603952067  603.95M    0.1%  28%:2
> continue                            956    1.3%   63.70%  65.65% / 79.97%    3780303795    3.78G    0.7%  52%:3
> loop guard                         1178    1.6%   56.37%  42.54% / 80.32%    7373601533    7.37G    1.4%  50%:2
> opcode values positive (on trees)  2020    2.7%   62.38%  64.16% / 84.44%   31695571761   31.70G    5.9%  21%:2
> loop exit                          3293    4.4%   76.19%  87.18% / 88.35%   50377138963   50.38G    9.4%  18%:1
> loop iterations                    4772    6.3%   99.98%  84.27% / 84.27%   74045982111   74.05G   13.8%
> pointer (on trees)                 8076   10.7%   56.23%  69.36% / 83.15%   12322099991   12.32G    2.3%
> call                              11396   15.1%   64.14%  74.13% / 89.82%   25197949198   25.20G    4.7%  34%:1
> opcode values nonequal (on trees) 12240   16.2%   70.71%  70.86% / 83.54%   36638772682   36.64G    6.9%
> guessed loop iterations           16854   22.4%   99.78%  91.21% / 91.22%  169765264401  169.77G   31.7%
>
> HEURISTICS BRANCHES (REL) BR. HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%)
> no prediction                     12731   16.9%   39.30%  33.32% / 79.93%  121106031963  121.11G   22.6%
> first match                       25366   33.7%   92.20%  88.24% / 88.88%  304047352001  304.05G   56.9%
> DS theory                         28337   37.6%   63.03%  72.05% / 85.00%  109563734430  109.56G   20.5%
> combined                          75342  100.0%   73.21%  72.49% / 86.06%  534746603167  534.75G  100.0%

Thank you. So it seems that the problem does not trigger in Spec, but I
was also wondering whether our current predict.def values are anywhere
near reality.

The table reads as follows:
 - BRANCHES is the number of branches the heuristic hit on (so extra loop
   exit has 80, and therefore we do not have very good statistics on it).
 - HITRATE is the probability that the branch goes in the predicted
   direction during the train run.
   The value after the / is what would be reached by a perfect predictor
   (one which predicts each branch in the direction that dominates during
   train).
   Extra loop exit is 81% out of 89%, so it is pretty close to optimum.
 - COVERAGE is how many times the predicted branch was executed.

In general the idea is that for most heuristics (which, unlike loop
iterations, cannot determine an exact value) the HITRATE values can be
put into predict.def so that the Dempster-Shafer formula (DS theory)
combines the hypotheses sort of realistically (it assumes that all the
predictors are statistically independent, which they are not).

We have HITRATE 67 for extra loop exit, which is a bit off from what we
have in the measured data, but I think our predict.def is still based on
spec2006 numbers.

So the patch is OK. Perhaps we could experiment with updating
predict.def (it does develop even when run across the same benchmark
suite, since early optimizations change; in this stage1 I think the
threading work definitely affects the situation substantially).

Honza
>
> Loop count: 38058
>   avg. # of iter: 8403.32
>   median # of iter: 7.00
>   avg. (1% cutoff) # of iter: 173.72
>   avg. (5% cutoff) # of iter: 54.90
>   avg. (10% cutoff) # of iter: 35.20
>   avg. (20% cutoff) # of iter: 26.35
>   avg. (30% cutoff) # of iter: 21.87
>
>
> --
> Thanks,
> Xionghu
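To make the column description above concrete, here is a minimal Python sketch that splits one table row into fields and relates the measured hit rate to the perfect-predictor bound. It is not part of contrib/analyze_brprob.py: the names `ROW_RE`, `parse_row` and `share_of_perfect` are hypothetical, and the regular expression only assumes the row layout visible in the tables in this thread.

```python
import re

# Hypothetical helper, not taken from GCC's contrib scripts.
# Assumed row layout (as in the tables quoted in this thread):
#   name  BRANCHES  (REL)%  BR.HITRATE%  HITRATE% / PERFECT%  COVERAGE ...
ROW_RE = re.compile(
    r"^(?P<name>.+?)\s+(?P<branches>\d+)\s+(?P<rel>[\d.]+)%\s+"
    r"(?P<br_hitrate>[\d.]+)%\s+(?P<hitrate>[\d.]+)%\s+/\s+"
    r"(?P<perfect>[\d.]+)%\s+(?P<coverage>\d+)\s+"
)


def parse_row(line):
    """Split one heuristics row into a dict of typed fields."""
    match = ROW_RE.match(line.strip())
    if match is None:
        raise ValueError("not a heuristics row: %r" % line)
    return {
        "name": match.group("name"),
        "branches": int(match.group("branches")),
        "br_hitrate": float(match.group("br_hitrate")),
        "hitrate": float(match.group("hitrate")),   # value before the '/'
        "perfect": float(match.group("perfect")),   # value after the '/'
        "coverage": int(match.group("coverage")),
    }


def share_of_perfect(row):
    """Fraction of the perfect-predictor hit rate the heuristic reaches."""
    return row["hitrate"] / row["perfect"]


# Row copied from the table above.
EXTRA_LOOP_EXIT = ("extra loop exit 80 0.1% 58.75% 81.49% / 89.21% "
                   "438470261 438.47M 0.1% 86%:3")

if __name__ == "__main__":
    row = parse_row(EXTRA_LOOP_EXIT)
    print(row["name"], row["branches"], "branches")
    print("share of perfect predictor: %.3f" % share_of_perfect(row))
    # 67 is the predict.def HITRATE quoted in the review above.
    print("measured minus predict.def: %.2f points" % (row["hitrate"] - 67))
```

For the extra loop exit row this gives roughly 0.91 of the perfect predictor, which matches the "81% out of 89%" reading, while only 80 branches contribute to that figure.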
On 2021/12/16 19:18, Jan Hubicka wrote:
>>>
>>> ./contrib/analyze_brprob.py ~/workspace/tests/spec2017/dump_file_all
>>> HEURISTICS BRANCHES (REL) BR. HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%)
>>> noreturn call                         1    0.0%  100.00%  50.00% / 50.00%             2     2.00    0.0%  100%:1
>>> Fortran zero-sized array              3    0.0%   66.67%  41.71% / 60.50%           362   362.00    0.0%  100%:3
>>> loop iv compare                      16    0.0%   93.75%  98.26% / 98.76%        279847  279.85k    0.0%  93%:4
>>> __builtin_expect                     35    0.0%   97.14%  78.09% / 78.35%      17079558   17.08M    0.0%
>>> loop guard with recursion            45    0.1%   86.67%  85.13% / 85.14%    6722424412    6.72G    1.3%  74%:4
>>> extra loop exit                      80    0.1%   58.75%  81.49% / 89.21%     438470261  438.47M    0.1%  86%:3
>>> guess loop iv compare               235    0.3%   80.85%  52.83% / 73.97%     148558247  148.56M    0.0%  47%:3
>>> negative return                     241    0.3%   71.37%  25.33% / 92.61%     250402383  250.40M    0.0%  69%:2
>>> loop exit with recursion            315    0.4%   74.60%  85.07% / 85.71%    9403136858    9.40G    1.8%  59%:4
>>> const return                        320    0.4%   51.88%  90.45% / 95.63%     925341727  925.34M    0.2%  76%:5
>>> indirect call                       377    0.5%   51.46%  84.72% / 91.14%    2133772848    2.13G    0.4%  69%:1
>>> polymorphic call                    410    0.5%   44.15%  31.26% / 79.37%    3272688244    3.27G    0.6%  53%:2
>>> recursive call                      506    0.7%   39.53%  44.97% / 83.92%    1211036806    1.21G    0.2%  10%:1
>>> goto                                618    0.8%   64.24%  65.37% / 83.57%     702446178  702.45M    0.1%  20%:1
>>> null return                         800    1.1%   64.62%  56.59% / 77.70%     603952067  603.95M    0.1%  28%:2
>>> continue                            956    1.3%   63.70%  65.65% / 79.97%    3780303799    3.78G    0.7%  52%:3
>>> loop guard                         1177    1.6%   56.33%  42.54% / 80.32%    7373601457    7.37G    1.4%  50%:2
>>> opcode values positive (on trees)  2020    2.7%   62.38%  64.16% / 84.44%   31695571761   31.70G    6.0%  21%:2
>>> loop exit                          3293    4.4%   76.19%  87.18% / 88.35%   50377138963   50.38G    9.6%  18%:1
>>> loop iterations                    4761    6.3%   99.98%  84.27% / 84.27%   73463634555   73.46G   13.9%
>>> pointer (on trees)                 8076   10.7%   56.23%  69.36% / 83.15%   12322099991   12.32G    2.3%
>>> call                              11396   15.1%   64.14%  74.13% / 89.82%   25197949198   25.20G    4.8%  34%:1
>>> opcode values nonequal (on trees) 12237   16.3%   70.70%  70.86% / 83.54%   36638772333   36.64G    6.9%
>>> guessed loop iterations           16760   22.3%   99.78%  91.49% / 91.49%  162952747918  162.95G   30.9%
>>>
>>> HEURISTICS BRANCHES (REL) BR. HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%)
>>> no prediction                     12730   16.9%   39.29%  33.32% / 79.93%  121106031835  121.11G   23.0%
>>> first match                       25261   33.6%   92.17%  88.33% / 88.98%  296652487962  296.65G   56.3%
>>> DS theory                         28333   37.7%   63.03%  72.05% / 85.00%  109563734005  109.56G   20.8%
>>> combined                          75232  100.0%   73.17%  72.32% / 86.08%  527351738575  527.35G  100.0%
>>>
>>> Loop count: 37870
>>>   avg. # of iter: 8444.77
>>>   median # of iter: 7.00
>>>   avg. (1% cutoff) # of iter: 174.68
>>>   avg. (5% cutoff) # of iter: 55.14
>>>   avg. (10% cutoff) # of iter: 35.21
>>>   avg. (20% cutoff) # of iter: 26.23
>>>   avg. (30% cutoff) # of iter: 21.70
>>
>> This is the output data collected without the patch; as can be seen,
>> there is no difference on "extra loop exit".
>> But this issue should be fixed.
>>
>>
>> ./contrib/analyze_brprob_spec.py ~/workspace/tests/spec2017/
>>
>> benchspec
>> HEURISTICS BRANCHES (REL) BR. HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%)
>> noreturn call                         1    0.0%  100.00%  50.00% / 50.00%             2     2.00    0.0%  100%:1
>> Fortran zero-sized array              3    0.0%   66.67%  41.71% / 60.50%           362   362.00    0.0%  100%:3
>> loop iv compare                      16    0.0%   93.75%  98.26% / 98.76%        279847  279.85k    0.0%  93%:4
>> __builtin_expect                     35    0.0%   97.14%  78.09% / 78.35%      17079558   17.08M    0.0%
>> loop guard with recursion            45    0.1%   86.67%  85.13% / 85.14%    6722424412    6.72G    1.3%  74%:4
>> extra loop exit                      80    0.1%   58.75%  81.49% / 89.21%     438470261  438.47M    0.1%  86%:3
>> guess loop iv compare               235    0.3%   80.85%  52.83% / 73.97%     148558247  148.56M    0.0%  47%:3
>> negative return                     241    0.3%   71.37%  25.33% / 92.61%     250402383  250.40M    0.0%  69%:2
>> loop exit with recursion            315    0.4%   74.60%  85.07% / 85.71%    9403136858    9.40G    1.8%  59%:4
>> const return                        320    0.4%   51.88%  90.45% / 95.63%     925341727  925.34M    0.2%  76%:5
>> indirect call                       377    0.5%   51.46%  84.72% / 91.14%    2133772848    2.13G    0.4%  69%:1
>> polymorphic call                    410    0.5%   44.15%  31.26% / 79.37%    3272688238    3.27G    0.6%  53%:2
>> recursive call                      506    0.7%   39.53%  44.97% / 83.92%    1211036806    1.21G    0.2%  10%:1
>> goto                                618    0.8%   64.24%  65.37% / 83.57%     702446178  702.45M    0.1%  20%:1
>> null return                         800    1.1%   64.62%  56.59% / 77.70%     603952067  603.95M    0.1%  28%:2
>> continue                            956    1.3%   63.70%  65.65% / 79.97%    3780303795    3.78G    0.7%  52%:3
>> loop guard                         1178    1.6%   56.37%  42.54% / 80.32%    7373601533    7.37G    1.4%  50%:2
>> opcode values positive (on trees)  2020    2.7%   62.38%  64.16% / 84.44%   31695571761   31.70G    5.9%  21%:2
>> loop exit                          3293    4.4%   76.19%  87.18% / 88.35%   50377138963   50.38G    9.4%  18%:1
>> loop iterations                    4772    6.3%   99.98%  84.27% / 84.27%   74045982111   74.05G   13.8%
>> pointer (on trees)                 8076   10.7%   56.23%  69.36% / 83.15%   12322099991   12.32G    2.3%
>> call                              11396   15.1%   64.14%  74.13% / 89.82%   25197949198   25.20G    4.7%  34%:1
>> opcode values nonequal (on trees) 12240   16.2%   70.71%  70.86% / 83.54%   36638772682   36.64G    6.9%
>> guessed loop iterations           16854   22.4%   99.78%  91.21% / 91.22%  169765264401  169.77G   31.7%
>>
>> HEURISTICS BRANCHES (REL) BR. HITRATE HITRATE COVERAGE COVERAGE (REL) predict.def (REL) HOT branches (>10%)
>> no prediction                     12731   16.9%   39.30%  33.32% / 79.93%  121106031963  121.11G   22.6%
>> first match                       25366   33.7%   92.20%  88.24% / 88.88%  304047352001  304.05G   56.9%
>> DS theory                         28337   37.6%   63.03%  72.05% / 85.00%  109563734430  109.56G   20.5%
>> combined                          75342  100.0%   73.21%  72.49% / 86.06%  534746603167  534.75G  100.0%
>
> Thank you. So it seems that the problem does not trigger in Spec, but I
> was also wondering whether our current predict.def values are anywhere
> near reality.
>
> The table reads as follows:
>  - BRANCHES is the number of branches the heuristic hit on (so extra loop
>    exit has 80, and therefore we do not have very good statistics on it).
>  - HITRATE is the probability that the branch goes in the predicted
>    direction during the train run.
>    The value after the / is what would be reached by a perfect predictor
>    (one which predicts each branch in the direction that dominates during
>    train).
>    Extra loop exit is 81% out of 89%, so it is pretty close to optimum.
>  - COVERAGE is how many times the predicted branch was executed.
>
> In general the idea is that for most heuristics (which, unlike loop
> iterations, cannot determine an exact value) the HITRATE values can be
> put into predict.def so that the Dempster-Shafer formula (DS theory)
> combines the hypotheses sort of realistically (it assumes that all the
> predictors are statistically independent, which they are not).
>
> We have HITRATE 67 for extra loop exit, which is a bit off from what we
> have in the measured data, but I think our predict.def is still based on
> spec2006 numbers.
>
> So the patch is OK. Perhaps we could experiment with updating
> predict.def (it does develop even when run across the same benchmark
> suite, since early optimizations change; in this stage1 I think the
> threading work definitely affects the situation substantially).

Thanks, committed to r12-6085.

>
> Honza
>>
>> Loop count: 38058
>>   avg. # of iter: 8403.32
>>   median # of iter: 7.00
>>   avg. (1% cutoff) # of iter: 173.72
>>   avg. (5% cutoff) # of iter: 54.90
>>   avg. (10% cutoff) # of iter: 35.20
>>   avg. (20% cutoff) # of iter: 26.35
>>   avg. (30% cutoff) # of iter: 21.87
>>
>>
>> --
>> Thanks,
>> Xionghu
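The Dempster-Shafer combination mentioned in the quoted review can be sketched as follows. This is the textbook two-hypothesis form for independent predictors written in floating point, not a copy of the code in gcc/predict.c; the names `ds_combine` and `combine_all` are made up for illustration, and the 0.5 fallback for fully contradictory inputs is a choice of this sketch. The sample inputs are the measured hit rates of "pointer (on trees)" and "call" from the table, used purely as example numbers.

```python
from functools import reduce


def ds_combine(p, q):
    """Combine two estimates of 'this edge is taken', assuming independence.

    Both predictors agree on 'taken' with weight p*q and on 'not taken'
    with weight (1-p)*(1-q); the result renormalizes over those two cases.
    """
    agree_taken = p * q
    agree_not_taken = (1.0 - p) * (1.0 - q)
    total = agree_taken + agree_not_taken
    if total == 0.0:
        # One input says 0% and the other 100%: no usable evidence,
        # so this sketch falls back to an even split.
        return 0.5
    return agree_taken / total


def combine_all(probabilities):
    """Fold any number of predictions together, starting from a neutral 0.5."""
    return reduce(ds_combine, probabilities, 0.5)


if __name__ == "__main__":
    # Hit rates taken from the table, as illustrative inputs only.
    pointer_hitrate = 0.6936
    call_hitrate = 0.7413
    combined = combine_all([pointer_hitrate, call_hitrate])
    print("combined probability: %.4f" % combined)
```

Two predictors at roughly 0.69 and 0.74 combine to about 0.87, stronger than either input. That is exactly where the independence assumption matters: if the two heuristics tend to fire on the same branches for the same reason, the combined value overstates the evidence.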
diff --git a/gcc/predict.c b/gcc/predict.c
index 3cb4e3c0eb5..5b6e0cf722b 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -1859,7 +1859,7 @@ predict_iv_comparison (class loop *loop, basic_block bb,
    exits to predict them using PRED_LOOP_EXTRA_EXIT.  */
 
 static void
-predict_extra_loop_exits (edge exit_edge)
+predict_extra_loop_exits (class loop *loop, edge exit_edge)
 {
   unsigned i;
   bool check_value_one;
@@ -1912,12 +1912,14 @@ predict_extra_loop_exits (edge exit_edge)
         continue;
       if (EDGE_COUNT (e->src->succs) != 1)
         {
-          predict_paths_leading_to_edge (e, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN);
+          predict_paths_leading_to_edge (e, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN,
+                                         loop);
           continue;
         }
 
       FOR_EACH_EDGE (e1, ei, e->src->preds)
-        predict_paths_leading_to_edge (e1, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN);
+        predict_paths_leading_to_edge (e1, PRED_LOOP_EXTRA_EXIT, NOT_TAKEN,
+                                       loop);
     }
 }
 
@@ -2008,7 +2010,7 @@ predict_loops (void)
                          ex->src->index, ex->dest->index);
               continue;
             }
-          predict_extra_loop_exits (ex);
+          predict_extra_loop_exits (loop, ex);
 
           if (number_of_iterations_exit (loop, ex, &niter_desc, false, false))
             niter = niter_desc.niter;
diff --git a/gcc/testsuite/gcc.dg/pr103270.c b/gcc/testsuite/gcc.dg/pr103270.c
new file mode 100644
index 00000000000..819310e360e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr103270.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate" } */
+
+void test(int a, int* i)
+{
+  for (; a < 5; ++a)
+    {
+      int b = 0;
+      int c = 0;
+      for (; b != -11; b--)
+        for (int d = 0; d ==0; d++)
+          {
+            *i += c & a;
+            c = b;
+          }
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "extra loop exit heuristics of edge\[^:\]*:" "profile_estimate"} } */