From patchwork Mon Mar 20 06:31:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 66628 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 11E3B385B533 for ; Mon, 20 Mar 2023 06:31:44 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 11E3B385B533 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1679293904; bh=U3r3QnXq85enZl07gMCU/3cuHZq7RIa7RRRNSo1i6BU=; h=Date:To:Cc:Subject:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=Jr3S+d7Xqr03z3hM14J7N47F7V3lasxT3VW+kfeqE4vfzzHcbZNUZq4H4HkxKR5fz BrEnMyIIA/fu/ozv2ACmcYR4w1nJ4mlXCkNGadBTT6cOj44uRPqdLTVCrnbTzb5JCs zhqvDqGfY4S9S6Dd8NHXDsTSFFXBITUdRE+/WG0s= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 283633858C53 for ; Mon, 20 Mar 2023 06:31:15 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 283633858C53 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 32K59SjH026755; Mon, 20 Mar 2023 06:31:12 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3pdpvrqnp5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 20 Mar 2023 06:31:12 +0000 Received: from m0098419.ppops.net (m0098419.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 32K6U90D030458; Mon, 20 Mar 2023 06:31:11 GMT Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3pdpvrqnne-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 20 Mar 2023 06:31:11 +0000 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 32K3i91p017149; Mon, 20 Mar 2023 06:31:09 GMT Received: from smtprelay07.fra02v.mail.ibm.com ([9.218.2.229]) by ppma04ams.nl.ibm.com (PPS) with ESMTPS id 3pd4x6asq4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 20 Mar 2023 06:31:09 +0000 Received: from smtpav06.fra02v.mail.ibm.com (smtpav06.fra02v.mail.ibm.com [10.20.54.105]) by smtprelay07.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 32K6V7iL8127166 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 20 Mar 2023 06:31:07 GMT Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3620F20049; Mon, 20 Mar 2023 06:31:07 +0000 (GMT) Received: from smtpav06.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id E7E3720043; Mon, 20 Mar 2023 06:31:03 +0000 (GMT) Received: from [9.177.11.227] (unknown [9.177.11.227]) by smtpav06.fra02v.mail.ibm.com (Postfix) with ESMTP; Mon, 20 Mar 2023 06:31:03 +0000 (GMT) Message-ID: <928b5bd5-387c-5400-6863-0c045fd22aef@linux.ibm.com> Date: Mon, 20 Mar 2023 14:31:02 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 Content-Language: en-US To: GCC Patches Cc: Segher Boessenkool , Peter Bergner , Richard Biener , Jeff Law , vladimir Makarov , amonakov@ispras.ru, Richard Sandiford Subject: [RFC/PATCH] sched: Consider debug insn in no_real_insns_p [PR108273] X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: jFls4na45aTGPO3uuFZjJABmKwkMG8r1 X-Proofpoint-GUID: UAvuh80vrmLr_Ilg0EUP54d3atl2MgfB X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-03-20_02,2023-03-16_02,2023-02-09_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 bulkscore=0 phishscore=0 malwarescore=0 impostorscore=0 mlxlogscore=999 lowpriorityscore=0 adultscore=0 clxscore=1011 spamscore=0 suspectscore=0 priorityscore=1501 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303150002 definitions=main-2303200053 X-Spam-Status: No, score=-10.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Kewen.Lin via Gcc-patches" From: "Kewen.Lin" Reply-To: "Kewen.Lin" Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" Hi, As PR108273 shows, when there is one block which only has NOTE_P and LABEL_P insns at non-debug mode while has some extra DEBUG_INSN_P insns at debug mode, after scheduling it, the DFA states would be different between debug mode and non-debug mode. Since at non-debug mode, the block meets no_real_insns_p, it gets skipped; while at debug mode, it gets scheduled, even it only has NOTE_P, LABEL_P and DEBUG_INSN_P, the call of function advance_one_cycle will change the DFA state. PR108519 also shows this issue issue can be exposed by some scheduler changes. This patch is to take debug insn into account in function no_real_insns_p, which make us not try to schedule for the block having only NOTE_P, LABEL_P and DEBUG_INSN_P insns, resulting in consistent DFA states between non-debug and debug mode. Changing no_real_insns_p caused ICE when doing free_block_dependencies, the root cause is that we create dependencies for debug insns, those dependencies are expected to be resolved during scheduling insns which gets skipped after the change in no_real_insns_p. By checking the code, it looks it's reasonable to skip to compute block dependencies for no_real_insns_p blocks. It can be bootstrapped and regtested but it hit one ICE when built SPEC2017 bmks at option -O2 -g. The root cause is that initially there are no no_real_insns_p blocks in a region, but in the later scheduling one block has one insn scheduled speculatively then becomes no_real_insns_p, so we compute dependencies and rgn_n_insns for this special block before scheduling, later it gets skipped so not scheduled, the following counts would mismatch: /* Sanity check: verify that all region insns were scheduled. */ gcc_assert (sched_rgn_n_insns == rgn_n_insns); , and we miss to release the allocated dependencies. To avoid the unexpected mis-matchings, this patch adds one bitmap to track this kind of special block which isn't no_real_insns_p but becomes no_real_insns_p later, then we can adjust the count and free deps for it. This patch can be bootstrapped and regress-tested on x86_64-redhat-linux, aarch64-linux-gnu and powerpc64{,le}-linux-gnu. I also verified this patch can pass SPEC2017 both intrate and fprate bmks building at -g -O2/-O3. This is for next stage 1, but since I know little on the scheduler, I'd like to post it early for more comments. Is it on the right track? Any thoughts? BR, Kewen ----- PR rtl-optimization/108273 gcc/ChangeLog: * haifa-sched.cc (no_real_insns_p): Consider DEBUG_INSN_P insn. * sched-rgn.cc (no_real_insns): New static bitmap variable. (compute_block_dependences): Skip for no_real_insns_p. (free_deps_for_bb_no_real_insns_p): New function. (free_block_dependencies): Call free_deps_for_bb_no_real_insns_p for no_real_insns_p bb. (schedule_region): Fix up sched_rgn_n_insns for some block for which rgn_n_insns is computed before, and move sched_rgn_local_finish after free_block_dependencies loop. (sched_rgn_local_init): Allocate and compute no_real_insns. (sched_rgn_local_free): Free no_real_insns. --- gcc/haifa-sched.cc | 8 ++++- gcc/sched-rgn.cc | 84 +++++++++++++++++++++++++++++++++++++++++++--- 2 files changed, 87 insertions(+), 5 deletions(-) -- 2.39.1 diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc index 48b53776fa9..378f3b34cc0 100644 --- a/gcc/haifa-sched.cc +++ b/gcc/haifa-sched.cc @@ -5040,7 +5040,13 @@ no_real_insns_p (const rtx_insn *head, const rtx_insn *tail) { while (head != NEXT_INSN (tail)) { - if (!NOTE_P (head) && !LABEL_P (head)) + /* Take debug insn into account here, otherwise we can have different + DFA states after scheduling a block which only has NOTE_P, LABEL_P + and DEBUG_P (debug mode) insns between non-debug and debug modes, + it could cause -fcompare-debug failure. */ + if (!NOTE_P (head) + && !LABEL_P (head) + && !DEBUG_INSN_P (head)) return 0; head = NEXT_INSN (head); } diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc index f2751f62450..211b62e2b4a 100644 --- a/gcc/sched-rgn.cc +++ b/gcc/sched-rgn.cc @@ -213,6 +213,11 @@ static int rgn_nr_edges; /* Array of size rgn_nr_edges. */ static edge *rgn_edges; +/* For basic block i, the corresponding set bit i in bitmap indicates this basic + block meets predicate no_real_insns_p before scheduling any basic blocks in + the region. */ +static bitmap no_real_insns; + /* Mapping from each edge in the graph to its number in the rgn. */ #define EDGE_TO_BIT(edge) ((int)(size_t)(edge)->aux) #define SET_EDGE_TO_BIT(edge,nr) ((edge)->aux = (void *)(size_t)(nr)) @@ -2730,6 +2735,15 @@ compute_block_dependences (int bb) gcc_assert (EBB_FIRST_BB (bb) == EBB_LAST_BB (bb)); get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); + /* Don't compute block dependencies if there are no real insns. */ + if (no_real_insns_p (head, tail)) + { + if (current_nr_blocks > 1) + propagate_deps (bb, &tmp_deps); + free_deps (&tmp_deps); + return; + } + sched_analyze (&tmp_deps, head, tail); add_branch_dependences (head, tail); @@ -2744,6 +2758,42 @@ compute_block_dependences (int bb) targetm.sched.dependencies_evaluation_hook (head, tail); } +/* The basic block without any real insns (no_real_insns_p) would be + skipped in compute_block_dependences, so for most no_real_insns_p + basic block, we don't need to free dependencies for them. But + sometimes some basic block which isn't no_real_insns_p before + scheduling can become no_real_insns_p with speculative scheduling, + so we still need to free dependencies computed early for it. So + this function is to free dependencies if need. */ + +static void +free_deps_for_bb_no_real_insns_p (rtx_insn *head, rtx_insn *tail, int bb) +{ + gcc_assert (no_real_insns_p (head, tail)); + + /* Don't bother if there is only one block. */ + if (current_nr_blocks == 1) + return; + + /* We don't compute dependencies before. */ + if (bitmap_bit_p (no_real_insns, bb)) + return; + + rtx_insn *insn; + rtx_insn *next_tail = NEXT_INSN (tail); + sd_iterator_def sd_it; + dep_t dep; + + /* There could be some insns which get skipped in scheduling but we + compute dependencies for them previously, so make them resolved. */ + for (insn = head; insn != next_tail; insn = NEXT_INSN (insn)) + for (sd_it = sd_iterator_start (insn, SD_LIST_FORW); + sd_iterator_cond (&sd_it, &dep);) + sd_resolve_dep (sd_it); + + sched_free_deps (head, tail, true); +} + /* Free dependencies of instructions inside BB. */ static void free_block_dependencies (int bb) @@ -2754,7 +2804,10 @@ free_block_dependencies (int bb) get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); if (no_real_insns_p (head, tail)) - return; + { + free_deps_for_bb_no_real_insns_p (head, tail, bb); + return; + } sched_free_deps (head, tail, true); } @@ -3177,6 +3230,18 @@ schedule_region (int rgn) { gcc_assert (first_bb == last_bb); save_state_for_fallthru_edge (last_bb, bb_state[first_bb->index]); + /* We have counted this block when computing rgn_n_insns + previously, so need to fix up sched_rgn_n_insns if we + skip this. */ + if (current_nr_blocks > 1 && !bitmap_bit_p (no_real_insns, bb)) + { + while (head != NEXT_INSN (tail)) + { + if (INSN_P (head)) + sched_rgn_n_insns++; + head = NEXT_INSN (head); + } + } continue; } @@ -3218,13 +3283,13 @@ schedule_region (int rgn) sched_finish_ready_list (); - /* Done with this region. */ - sched_rgn_local_finish (); - /* Free dependencies. */ for (bb = 0; bb < current_nr_blocks; ++bb) free_block_dependencies (bb); + /* Done with this region. */ + sched_rgn_local_finish (); + gcc_assert (haifa_recovery_bb_ever_added_p || deps_pools_are_empty_p ()); } @@ -3444,6 +3509,16 @@ sched_rgn_local_init (int rgn) FOR_EACH_EDGE (e, ei, block->succs) e->aux = NULL; } + + /* Compute no_real_insns. */ + no_real_insns = BITMAP_ALLOC (NULL); + for (bb = 0; bb < current_nr_blocks; bb++) + { + rtx_insn *head, *tail; + get_ebb_head_tail (EBB_FIRST_BB (bb), EBB_LAST_BB (bb), &head, &tail); + if (no_real_insns_p (head, tail)) + bitmap_set_bit (no_real_insns, bb); + } } } @@ -3456,6 +3531,7 @@ sched_rgn_local_free (void) sbitmap_vector_free (pot_split); sbitmap_vector_free (ancestor_edges); free (rgn_edges); + BITMAP_FREE (no_real_insns); } /* Free data computed for the finished region. */