From patchwork Sun Nov 14 22:32:48 2021
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 47639
From: Jan Hubicka
Date: Sun, 14 Nov 2021 23:32:48 +0100
To: gcc-patches@gcc.gnu.rg
Resent-To: gcc-patches@gcc.gnu.org
Subject: Track nondeterminism and interposable calls in ipa-modref
Message-ID: <20211114223248.GB42690@kam.mff.cuni.cz>

Hi,

This patch adds tracking of two new flags in ipa-modref: nondeterministic
and calls_interposable. The first is set when a function does something that
is not guaranteed to give the same result if run again (a volatile memory
access, a volatile asm, or an external function call). The second is set if
the function calls something that does not bind to the current def.

The nondeterministic flag enables ipa-modref to discover looping pure/const
functions, and it now discovers 138 of them during the cc1plus link (which
about doubles the number of such functions detected late). However, we can
do more:

 1) We can extend FRE to eliminate redundant calls. I filed PR103168 for
    that. A common case is inline functions that are not autodetected as
    ECF_CONST just because they do not bind to the local def; these can be
    handled easily. Using the modref summary to check which memory
    locations are read is trickier.
 2) DSE can eliminate redundant stores.

The calls_interposable flag currently also improves tree-ssa-structalias on
functions that are not binds_to_current_def, since reads_global_memory is
now not cleared by interposable functions.

Bootstrapped/regtested x86_64-linux, will commit it shortly.

gcc/ChangeLog:

	* ipa-modref.h (struct modref_summary): Add nondeterministic and
	calls_interposable flags.
	* ipa-modref.c (modref_summary::modref_summary): Initialize new flags.
	(modref_summary::useful_p): Check new flags.
	(struct modref_summary_lto): Add nondeterministic and
	calls_interposable flags.
	(modref_summary_lto::modref_summary_lto): Initialize new flags.
	(modref_summary_lto::useful_p): Check new flags.
	(modref_summary::dump): Dump new flags.
	(modref_summary_lto::dump): Dump new flags.
	(ignore_nondeterminism_p): New function.
	(merge_call_side_effects): Merge new flags.
	(process_fnspec): Likewise.
	(analyze_load): Volatile access is nondeterministic.
	(analyze_store): Likewise.
	(analyze_stmt): Volatile ASM is nondeterministic.
	(analyze_function): Clear new flags.
	(modref_summaries::duplicate): Duplicate new flags.
	(modref_summaries_lto::duplicate): Duplicate new flags.
	(modref_write): Stream new flags.
	(read_section): Stream new flags.
	(propagate_unknown_call): Update new flags.
	(modref_propagate_in_scc): Propagate new flags.
	* tree-ssa-alias.c (ref_maybe_used_by_call_p_1): Check
	calls_interposable.
	* tree-ssa-structalias.c (determine_global_memory_access): Likewise.

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index b75ed84135b..4d878f45e30 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -276,7 +276,8 @@ static GTY(()) fast_function_summary modref_summary::modref_summary () : loads (NULL), stores (NULL), retslot_flags (0), static_chain_flags (0), - writes_errno (false), side_effects (false), global_memory_read (false), + writes_errno (false), side_effects (false), nondeterministic (false), + calls_interposable (false), global_memory_read (false), global_memory_written (false), try_dse (false) { } @@ -332,11 +333,13 @@ modref_summary::useful_p (int ecf_flags, bool check_flags) && remove_useless_eaf_flags (static_chain_flags, ecf_flags, false)) return true; if (ecf_flags & (ECF_CONST | ECF_NOVOPS)) - return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); + return ((!side_effects || !nondeterministic) + && (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); if (loads && !loads->every_base) return true; if (ecf_flags & ECF_PURE) - return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); + return ((!side_effects || !nondeterministic) + 
&& (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); return stores && !stores->every_base; } @@ -354,8 +357,10 @@ struct GTY(()) modref_summary_lto auto_vec GTY((skip)) arg_flags; eaf_flags_t retslot_flags; eaf_flags_t static_chain_flags; - bool writes_errno; - bool side_effects; + unsigned writes_errno : 1; + unsigned side_effects : 1; + unsigned nondeterministic : 1; + unsigned calls_interposable : 1; modref_summary_lto (); ~modref_summary_lto (); @@ -367,7 +372,8 @@ struct GTY(()) modref_summary_lto modref_summary_lto::modref_summary_lto () : loads (NULL), stores (NULL), retslot_flags (0), static_chain_flags (0), - writes_errno (false), side_effects (false) + writes_errno (false), side_effects (false), nondeterministic (false), + calls_interposable (false) { } @@ -397,11 +403,13 @@ modref_summary_lto::useful_p (int ecf_flags, bool check_flags) && remove_useless_eaf_flags (static_chain_flags, ecf_flags, false)) return true; if (ecf_flags & (ECF_CONST | ECF_NOVOPS)) - return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); + return ((!side_effects || !nondeterministic) + && (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); if (loads && !loads->every_base) return true; if (ecf_flags & ECF_PURE) - return (!side_effects && (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); + return ((!side_effects || !nondeterministic) + && (ecf_flags & ECF_LOOPING_CONST_OR_PURE)); return stores && !stores->every_base; } @@ -574,6 +582,10 @@ modref_summary::dump (FILE *out) fprintf (out, " Writes errno\n"); if (side_effects) fprintf (out, " Side effects\n"); + if (nondeterministic) + fprintf (out, " Nondeterministic\n"); + if (calls_interposable) + fprintf (out, " Calls interposable\n"); if (global_memory_read) fprintf (out, " Global memory read\n"); if (global_memory_written) @@ -614,6 +626,10 @@ modref_summary_lto::dump (FILE *out) fprintf (out, " Writes errno\n"); if (side_effects) fprintf (out, " Side effects\n"); + if (nondeterministic) + fprintf (out, " Nondeterministic\n"); + if 
(calls_interposable) + fprintf (out, " Calls interposable\n"); if (arg_flags.length ()) { for (unsigned int i = 0; i < arg_flags.length (); i++) @@ -859,6 +875,20 @@ record_access_p (tree expr) return true; } +/* Return true if ECF flags says that nondeterminism can be ignored. */ + +static bool +ignore_nondeterminism_p (tree caller, int flags) +{ + if ((flags & (ECF_CONST | ECF_PURE)) + && !(flags & ECF_LOOPING_CONST_OR_PURE)) + return true; + if ((flags & (ECF_NORETURN | ECF_NOTHROW)) == (ECF_NORETURN | ECF_NOTHROW) + || (!opt_for_fn (caller, flag_exceptions) && (flags & ECF_NORETURN))) + return true; + return false; +} + /* Return true if ECF flags says that return value can be ignored. */ static bool @@ -939,26 +969,48 @@ merge_call_side_effects (modref_summary *cur_summary, bool changed = false; int flags = gimple_call_flags (stmt); - if (!cur_summary->side_effects && callee_summary->side_effects) + if (!(flags & (ECF_CONST | ECF_NOVOPS | ECF_PURE)) + || (flags & ECF_LOOPING_CONST_OR_PURE)) + { + if (!cur_summary->side_effects && callee_summary->side_effects) + { + if (dump_file) + fprintf (dump_file, " - merging side effects.\n"); + cur_summary->side_effects = true; + changed = true; + } + if (!cur_summary->nondeterministic && callee_summary->nondeterministic + && !ignore_nondeterminism_p (current_function_decl, flags)) + { + if (dump_file) + fprintf (dump_file, " - merging side effects.\n"); + cur_summary->nondeterministic = true; + changed = true; + } + } + + if (flags & (ECF_CONST | ECF_NOVOPS)) + return changed; + + if (!cur_summary->calls_interposable && callee_summary->calls_interposable) { if (dump_file) fprintf (dump_file, " - merging side effects.\n"); - cur_summary->side_effects = true; + cur_summary->calls_interposable = true; changed = true; } - if (flags & (ECF_CONST | ECF_NOVOPS)) - return changed; - /* We can not safely optimize based on summary of callee if it does not always bind to current def: it is possible that memory load was optimized 
out earlier which may not happen in the interposed variant. */ - if (!callee_node->binds_to_current_def_p ()) + if (!callee_node->binds_to_current_def_p () + && !cur_summary->calls_interposable) { if (dump_file) fprintf (dump_file, " - May be interposed: collapsing loads.\n"); - cur_summary->loads->collapse (); + cur_summary->calls_interposable = true; + changed = true; } if (dump_file) @@ -1115,9 +1167,17 @@ process_fnspec (modref_summary *cur_summary, && stmt_could_throw_p (cfun, call))) { if (cur_summary) - cur_summary->side_effects = true; + { + cur_summary->side_effects = true; + if (!ignore_nondeterminism_p (current_function_decl, flags)) + cur_summary->nondeterministic = true; + } if (cur_summary_lto) - cur_summary_lto->side_effects = true; + { + cur_summary_lto->side_effects = true; + if (!ignore_nondeterminism_p (current_function_decl, flags)) + cur_summary_lto->nondeterministic = true; + } } if (flags & (ECF_CONST | ECF_NOVOPS)) return true; @@ -1340,9 +1400,9 @@ analyze_load (gimple *, tree, tree op, void *data) if (dump_file) fprintf (dump_file, " (volatile or can throw; marking side effects) "); if (summary) - summary->side_effects = true; + summary->side_effects = summary->nondeterministic = true; if (summary_lto) - summary_lto->side_effects = true; + summary_lto->side_effects = summary_lto->nondeterministic = true; } if (!record_access_p (op)) @@ -1380,9 +1440,9 @@ analyze_store (gimple *, tree, tree op, void *data) if (dump_file) fprintf (dump_file, " (volatile or can throw; marking side effects) "); if (summary) - summary->side_effects = true; + summary->side_effects = summary->nondeterministic = true; if (summary_lto) - summary_lto->side_effects = true; + summary_lto->side_effects = summary_lto->nondeterministic = true; } if (!record_access_p (op)) @@ -1421,9 +1481,15 @@ analyze_stmt (modref_summary *summary, modref_summary_lto *summary_lto, switch (gimple_code (stmt)) { case GIMPLE_ASM: - if (gimple_asm_volatile_p (as_a (stmt)) - || 
(cfun->can_throw_non_call_exceptions - && stmt_could_throw_p (cfun, stmt))) + if (gimple_asm_volatile_p (as_a (stmt))) + { + if (summary) + summary->side_effects = summary->nondeterministic = true; + if (summary_lto) + summary_lto->side_effects = summary_lto->nondeterministic = true; + } + if (cfun->can_throw_non_call_exceptions + && stmt_could_throw_p (cfun, stmt)) { if (summary) summary->side_effects = true; @@ -2749,6 +2815,8 @@ analyze_function (function *f, bool ipa) param_modref_max_accesses); summary->writes_errno = false; summary->side_effects = false; + summary->nondeterministic = false; + summary->calls_interposable = false; } if (lto) { @@ -2764,6 +2832,8 @@ analyze_function (function *f, bool ipa) param_modref_max_accesses); summary_lto->writes_errno = false; summary_lto->side_effects = false; + summary_lto->nondeterministic = false; + summary_lto->calls_interposable = false; } analyze_parms (summary, summary_lto, ipa, @@ -2829,9 +2899,10 @@ analyze_function (function *f, bool ipa) if (!ipa && flag_ipa_pure_const) { if (!summary->stores->every_base && !summary->stores->bases - && !summary->side_effects) + && !summary->nondeterministic) { - if (!summary->loads->every_base && !summary->loads->bases) + if (!summary->loads->every_base && !summary->loads->bases + && !summary->calls_interposable) fixup_cfg = ipa_make_function_const (cgraph_node::get (current_function_decl), summary->side_effects, true); @@ -3063,6 +3134,8 @@ modref_summaries::duplicate (cgraph_node *, cgraph_node *dst, dst_data->loads->copy_from (src_data->loads); dst_data->writes_errno = src_data->writes_errno; dst_data->side_effects = src_data->side_effects; + dst_data->nondeterministic = src_data->nondeterministic; + dst_data->calls_interposable = src_data->calls_interposable; if (src_data->arg_flags.length ()) dst_data->arg_flags = src_data->arg_flags.copy (); dst_data->retslot_flags = src_data->retslot_flags; @@ -3091,6 +3164,8 @@ modref_summaries_lto::duplicate (cgraph_node *, 
cgraph_node *, dst_data->loads->copy_from (src_data->loads); dst_data->writes_errno = src_data->writes_errno; dst_data->side_effects = src_data->side_effects; + dst_data->nondeterministic = src_data->nondeterministic; + dst_data->calls_interposable = src_data->calls_interposable; if (src_data->arg_flags.length ()) dst_data->arg_flags = src_data->arg_flags.copy (); dst_data->retslot_flags = src_data->retslot_flags; @@ -3420,6 +3495,8 @@ modref_write () struct bitpack_d bp = bitpack_create (ob->main_stream); bp_pack_value (&bp, r->writes_errno, 1); bp_pack_value (&bp, r->side_effects, 1); + bp_pack_value (&bp, r->nondeterministic, 1); + bp_pack_value (&bp, r->calls_interposable, 1); if (!flag_wpa) { for (cgraph_edge *e = cnode->indirect_calls; @@ -3492,11 +3569,15 @@ read_section (struct lto_file_decl_data *file_data, const char *data, { modref_sum->writes_errno = false; modref_sum->side_effects = false; + modref_sum->nondeterministic = false; + modref_sum->calls_interposable = false; } if (modref_sum_lto) { modref_sum_lto->writes_errno = false; modref_sum_lto->side_effects = false; + modref_sum_lto->nondeterministic = false; + modref_sum_lto->calls_interposable = false; } gcc_assert (!modref_sum || (!modref_sum->loads @@ -3549,6 +3630,20 @@ read_section (struct lto_file_decl_data *file_data, const char *data, if (modref_sum_lto) modref_sum_lto->side_effects = true; } + if (bp_unpack_value (&bp, 1)) + { + if (modref_sum) + modref_sum->nondeterministic = true; + if (modref_sum_lto) + modref_sum_lto->nondeterministic = true; + } + if (bp_unpack_value (&bp, 1)) + { + if (modref_sum) + modref_sum->calls_interposable = true; + if (modref_sum_lto) + modref_sum_lto->calls_interposable = true; + } if (!flag_ltrans) { for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee) @@ -4072,6 +4167,18 @@ propagate_unknown_call (cgraph_node *node, cur_summary_lto->side_effects = true; changed = true; } + if (cur_summary && !cur_summary->nondeterministic + && 
!ignore_nondeterminism_p (node->decl, ecf_flags)) + { + cur_summary->nondeterministic = true; + changed = true; + } + if (cur_summary_lto && !cur_summary_lto->nondeterministic + && !ignore_nondeterminism_p (node->decl, ecf_flags)) + { + cur_summary_lto->nondeterministic = true; + changed = true; + } } if (ecf_flags & (ECF_CONST | ECF_NOVOPS)) return changed; @@ -4331,6 +4438,20 @@ modref_propagate_in_scc (cgraph_node *component_node) cur_summary_lto->side_effects = true; changed = true; } + if (callee_summary && !cur_summary->nondeterministic + && callee_summary->nondeterministic + && !ignore_nondeterminism_p (cur->decl, flags)) + { + cur_summary->nondeterministic = true; + changed = true; + } + if (callee_summary_lto && !cur_summary_lto->nondeterministic + && callee_summary_lto->nondeterministic + && !ignore_nondeterminism_p (cur->decl, flags)) + { + cur_summary_lto->nondeterministic = true; + changed = true; + } if (flags & (ECF_CONST | ECF_NOVOPS)) continue; @@ -4340,7 +4461,16 @@ modref_propagate_in_scc (cgraph_node *component_node) the interposed variant. */ if (!callee_edge->binds_to_current_def_p ()) { - changed |= collapse_loads (cur_summary, cur_summary_lto); + if (cur_summary && !cur_summary->calls_interposable) + { + cur_summary->calls_interposable = true; + changed = true; + } + if (cur_summary_lto && !cur_summary_lto->calls_interposable) + { + cur_summary_lto->calls_interposable = true; + changed = true; + } if (dump_file) fprintf (dump_file, " May not bind local;" " collapsing loads\n"); @@ -4426,9 +4556,10 @@ modref_propagate_in_scc (cgraph_node *component_node) ? 
summaries_lto->get (cur) : NULL; if (summary && !summary->stores->every_base && !summary->stores->bases - && !summary->side_effects) + && !summary->nondeterministic) { - if (!summary->loads->every_base && !summary->loads->bases) + if (!summary->loads->every_base && !summary->loads->bases + && !summary->calls_interposable) pureconst |= ipa_make_function_const (cur, summary->side_effects, false); else @@ -4436,9 +4567,10 @@ modref_propagate_in_scc (cgraph_node *component_node) (cur, summary->side_effects, false); } if (summary_lto && !summary_lto->stores->every_base - && !summary_lto->stores->bases && !summary_lto->side_effects) + && !summary_lto->stores->bases && !summary_lto->nondeterministic) { - if (!summary_lto->loads->every_base && !summary_lto->loads->bases) + if (!summary_lto->loads->every_base && !summary_lto->loads->bases + && !summary_lto->calls_interposable) pureconst |= ipa_make_function_const (cur, summary_lto->side_effects, false); else diff --git a/gcc/ipa-modref.h b/gcc/ipa-modref.h index 118dc5f2abf..adaf7b1bfa5 100644 --- a/gcc/ipa-modref.h +++ b/gcc/ipa-modref.h @@ -35,7 +35,19 @@ struct GTY(()) modref_summary eaf_flags_t static_chain_flags; unsigned writes_errno : 1; unsigned side_effects : 1; + /* If true function can not be CSE optimized because it may behave + differently even if invoked with same inputs. */ + unsigned nondeterministic : 1; + /* If true the function may read any reachable memory but not use + it for anything useful. This may happen e.g. when interposing + function with optimized out conditional with unoptimized one. + + In this situation the loads summary is not useful for DSE but + it is still useful for CSE. */ + unsigned calls_interposable : 1; + /* Flags computed by finalize method. 
*/ + unsigned global_memory_read : 1; unsigned global_memory_written : 1; unsigned try_dse : 1; diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c index 29be1f848b5..928a25d3f02 100644 --- a/gcc/tree-ssa-alias.c +++ b/gcc/tree-ssa-alias.c @@ -2749,7 +2749,7 @@ ref_maybe_used_by_call_p_1 (gcall *call, ao_ref *ref, bool tbaa_p) if (node && node->binds_to_current_def_p ()) { modref_summary *summary = get_modref_function_summary (node); - if (summary) + if (summary && !summary->calls_interposable) { if (!modref_may_conflict (call, summary->loads, ref, tbaa_p)) { diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c index 6141e944b9f..06eb22c029a 100644 --- a/gcc/tree-ssa-structalias.c +++ b/gcc/tree-ssa-structalias.c @@ -4266,6 +4266,7 @@ determine_global_memory_access (gcall *stmt, if (reads_global_memory && *reads_global_memory) *reads_global_memory = summary->global_memory_read; if (reads_global_memory && uses_global_memory + && !summary->calls_interposable && !*reads_global_memory && node->binds_to_current_def_p ()) *uses_global_memory = false; }
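
Not part of the patch: a small sketch of user code that should exercise the two new flags, for anyone who wants to look at the dumps. All names here (tick, read_tick, scale, scaled_square) are made up for illustration. The comments describe what I would expect from reading the patch; whether a given GCC build actually reports these flags depends on the target and options (for instance -fPIC), so treat them as assumptions rather than guarantees.

```cpp
// Illustrative only; none of these names come from the patch.

// Stand-in for a hardware register or a counter updated asynchronously.
static volatile int tick;

// Performs a volatile load.  In the patch, analyze_load sets both
// side_effects and nondeterministic for such an access, so two calls
// to this function should not be merged.
int
read_tick ()
{
  return tick;
}

// An inline function with external linkage.  The cover letter names
// inline functions that do not bind to the local def as the common
// case; whether this one is treated that way is an assumption.
inline int
scale (int x)
{
  return 2 * x;
}

// Calls scale.  If scale does not bind to the current def, the patch
// sets calls_interposable on this function's summary instead of
// collapsing its loads.
int
scaled_square (int x)
{
  return scale (x * x);
}
```

Compiling this at -O2 with a modref dump enabled (something like -fdump-tree-modref1, assuming that spelling matches your GCC) and searching for the "Nondeterministic" and "Calls interposable" lines printed by modref_summary::dump should show which summaries got the flags.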