Fix invalid profile mismatch error

Message ID Z9LUpdYtLtQjWoPO@kam.mff.cuni.cz
State New
Headers
Series Fix invalid profile mismatch error |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-arm-bootstrap success Build passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-aarch64-bootstrap success Build passed

Commit Message

Jan Hubicka March 13, 2025, 12:50 p.m. UTC
  Hi,
this patch fixes false incosistent profile error message seen when building SPEC with
-fprofile-use -fdump-ipa-profile.
The problem is that with dumping tree_esitmate_probability is run in dry run
mode to report success rates of heuristics.  It however runs determine_unlikely_bbs
which ovewrites some counts to profile_count::zero and later value profiling sees
the mismatch.

In sane profile determine_unlikely_bbs should be almost always no-op since it
should only drop to 0 things that are known to be unlikely executed. What
happens here is that there is a comdat where profile is lost and we see a
call with non-zero count calling function with zero count and "fix" the profile
by making the call to have zero count, too.

I also extended unlikely prediates to avoid tampering with predictions when
prediciton is believed to be reliable.  This also avoids us from dropping all
EH regions to 0 count as tested by the testcase.

Jeff, I think we can use same logic as in isolate-paths and detect
statements which will trigger undefined effect or errornerous
termination in unlikely_executed_stmt_p.  Are you OK with exporting
stmt_uses_0_or_null_in_undefined_way and using it in
unlikely_executed_stmt_p?

Do we have other ways to detect that given STMT will very likely not be
executed?

Bootsrapped/regtested x86_64-linux, will commit it shortly.

gcc/ChangeLog:

	* predict.cc (unlikely_executed_edge_p): Ignore EDGE_EH if profile
	is reliable.
	(unlikely_executed_stmt_p): special case builtin_trap/unreachable and
	ignore other heuristics for reliable profiles.
	(tree_estimate_probability): Disable unlikely bb detection when
	doing dry run

gcc/testsuite/ChangeLog:

	* g++.dg/tree-prof/eh1.C: New test.
  

Patch

diff --git a/gcc/predict.cc b/gcc/predict.cc
index ef31c48bfe2..eefff5908f9 100644
--- a/gcc/predict.cc
+++ b/gcc/predict.cc
@@ -245,7 +245,10 @@  unlikely_executed_edge_p (edge e)
 {
   return (e->src->count == profile_count::zero ()
 	  || e->probability == profile_probability::never ())
-	 || (e->flags & (EDGE_EH | EDGE_FAKE));
+	 || (e->flags & EDGE_FAKE)
+	 /* If we read profile and know EH edge is executed, trust it.
+	    Otherwise we consider EH edges never executed.  */
+	 || ((e->flags & EDGE_EH) && !e->probability.reliable_p ());
 }
 
 /* Return true if edge E of function FUN is probably never executed.  */
@@ -830,6 +833,26 @@  unlikely_executed_stmt_p (gimple *stmt)
 {
   if (!is_gimple_call (stmt))
     return false;
+
+  /* Those calls are inserted by optimizers when code is known to be
+     unreachable or undefined.  */
+  if (gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE)
+      || gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE_TRAP)
+      || gimple_call_builtin_p (stmt, BUILT_IN_TRAP))
+    return false;
+
+  /* Checks below do not need to be fully reliable.  Cold attribute may be
+     misplaced by user and in the presence of comdat we may result in call to
+     function with 0 profile having non-zero profile.
+
+     We later detect that profile is lost and will drop the profile of the
+     comdat.
+
+     So if we think profile count is reliable, do not try to apply these
+     heuristics.  */
+  if (gimple_bb (stmt)->count.reliable_p ()
+      && gimple_bb (stmt)->count.nonzero_p ())
+    return gimple_bb (stmt)->count == profile_count::zero ();
   /* NORETURN attribute alone is not strong enough: exit() may be quite
      likely executed once during program run.  */
   if (gimple_call_fntype (stmt)
@@ -3269,7 +3292,8 @@  tree_estimate_probability (bool dry_run)
   calculate_dominance_info (CDI_POST_DOMINATORS);
   /* Decide which edges are known to be unlikely.  This improves later
      branch prediction. */
-  determine_unlikely_bbs ();
+  if (!dry_run)
+    determine_unlikely_bbs ();
 
   bb_predictions = new hash_map<const_basic_block, edge_prediction *>;
   ssa_expected_value = new hash_map<int_hash<unsigned, 0>, expected_value>;
diff --git a/gcc/testsuite/g++.dg/tree-prof/eh1.C b/gcc/testsuite/g++.dg/tree-prof/eh1.C
new file mode 100644
index 00000000000..10a35968dc4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-prof/eh1.C
@@ -0,0 +1,34 @@ 
+/* { dg-options "-O3 -fdump-ipa-profile-details -fno-inline -fdump-tree-fixup_cfg3-details -fdump-tree-optimized-details" } */
+char a[10000];
+char b[10000];
+int sz = 1000;
+
+__attribute__((noipa))
+     void test2 ()
+{
+  throw (sz);
+}
+void
+test ()
+{
+  try
+  {
+    test2 ();
+  }
+  catch (int v)
+  {
+    __builtin_memcpy (b, a, v);
+  }
+}
+int
+main ()
+{
+  for (int i = 0; i < 100000; i++)
+    test ();
+}
+/* { dg-final-use-not-autofdo { scan-ipa-dump-times "Average value sum:100000000" 2 "profile" } } */
+/* 1 zero count for resx block.  */
+/* { dg-final-use-not-autofdo { scan-tree-dump-times "count: 0" 1 "fixup_cfg3" } } */
+/* 2 zero count for resx block and return block since return gets duplicated by tracer.  */
+/* { dg-final-use-not-autofdo { scan-tree-dump-times "count: 0" 2 "optimized" } } */
+/* { dg-final-use-not-autofdo { scan-tree-dump-times "Average value sum:100000000" 1 "optimized" } } */