Message ID | ri6czn5z8mw.fsf@suse.cz |
---|---|
State | Committed |
Commit | 458d2c689963d8461d84670a3d8988cd6ecbfd81 |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3B8303857C4A for <patchwork@sourceware.org>; Fri, 12 Nov 2021 15:39:05 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.220.29]) by sourceware.org (Postfix) with ESMTPS id 58C25385840F for <gcc-patches@gcc.gnu.org>; Fri, 12 Nov 2021 15:38:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 58C25385840F Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.cz Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out2.suse.de (Postfix) with ESMTP id 217F71FD66; Fri, 12 Nov 2021 15:38:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1636731528; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=sR+wv4WpFX+vK+CGe1c6szY5XXdEK6jMDMCi243yeSk=; b=VfVyvXG8hgOtkdNnel/oIIqKnodLTrzDYuojl9ERvGgMdVeu98ygbZ2UZ3CCFxb74fRxv0 T7Gpr8sHTKWQObGeOE5AH5Hp5y4JZznt4BXGJvb8eGMteWrkoywwa18zHc1IGY1xk7EDfo m5Y7YLXnsCFMJ9A/e/XvlMijPN7y1d4= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1636731528; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=sR+wv4WpFX+vK+CGe1c6szY5XXdEK6jMDMCi243yeSk=; b=bEuRBBPDU8SgvA9GIvT9g1Wzjto63FgRozfcYEPQhGDiRvGC6R8SC8KIeFW7wzna04Ntnq dPEruM/uCDyPLKCw== Received: from suse.cz (virgil.suse.cz [10.100.13.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id 0E66FA3B8F; Fri, 12 Nov 2021 15:38:48 +0000 (UTC) From: Martin Jambor <mjambor@suse.cz> To: GCC Patches <gcc-patches@gcc.gnu.org> Subject: [PATCH] options: Make -Ofast switch off -fsemantic-interposition User-Agent: Notmuch/0.33.2 (https://notmuchmail.org) Emacs/27.2 (x86_64-suse-linux-gnu) Date: Fri, 12 Nov 2021 16:38:47 +0100 Message-ID: <ri6czn5z8mw.fsf@suse.cz> MIME-Version: 1.0 Content-Type: text/plain X-Spam-Status: No, score=-11.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> Cc: Jan Hubicka <jh@suse.cz>, Richard Biener <rguenther@suse.de>, Jan Hubicka <hubicka@ucw.cz> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
options: Make -Ofast switch off -fsemantic-interposition
|
|
Commit Message
Martin Jambor
Nov. 12, 2021, 3:38 p.m. UTC
Hi, using -fno-semantic-interposition has been reported by various people to bring about considerable speed up at the cost of strict compliance to the ELF symbol interposition rules See for example https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup As such I believe it should be implied by our -Ofast optimization level, not only so that benchmarks that can benefit run faster, but also so that people looking at -Ofast documentation for options that could speed their programs find it. I have verified that with the following patch IPA-CP sees flag_semantic_interposition set to zero at Ofast and that info and pdf manual builds fine with the documentation change. I am bootstrapping and testing it now in order to comply with submission criteria but I don't think an Ofast change gets much tested. Assuming it passes, is the patch OK? (If it is, I will also add a note about it in the "Caveats" section in gcc-12/changes.html of wwwdocs after I commit the patch.) Thanks, Martin gcc/ChangeLog: 2021-11-12 Martin Jambor <mjambor@suse.cz> * opts.c (default_options_table): Switch off flag_semantic_interposition at Ofast. * doc/invoke.texi (Optimize Options): Document that Ofast switches off -fsemantic-interposition. --- gcc/doc/invoke.texi | 1 + gcc/opts.c | 1 + 2 files changed, 2 insertions(+)
Comments
> Hi, > > using -fno-semantic-interposition has been reported by various people > to bring about considerable speed up at the cost of strict compliance > to the ELF symbol interposition rules See for example > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup > > As such I believe it should be implied by our -Ofast optimization > level, not only so that benchmarks that can benefit run faster, but > also so that people looking at -Ofast documentation for options that > could speed their programs find it. > > I have verified that with the following patch IPA-CP sees > flag_semantic_interposition set to zero at Ofast and that info and pdf > manual builds fine with the documentation change. I am bootstrapping > and testing it now in order to comply with submission criteria but I > don't think an Ofast change gets much tested. > > Assuming it passes, is the patch OK? (If it is, I will also add a note > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs > after I commit the patch.) > > Thanks, > > Martin > > > gcc/ChangeLog: > > 2021-11-12 Martin Jambor <mjambor@suse.cz> > > * opts.c (default_options_table): Switch off > flag_semantic_interposition at Ofast. > * doc/invoke.texi (Optimize Options): Document that Ofast switches off > -fsemantic-interposition. OK, thanks! Honza
Hi, On Fri, Nov 12 2021, Martin Jambor wrote: > Hi, > > using -fno-semantic-interposition has been reported by various people > to bring about considerable speed up at the cost of strict compliance > to the ELF symbol interposition rules See for example > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup > > As such I believe it should be implied by our -Ofast optimization > level, not only so that benchmarks that can benefit run faster, but > also so that people looking at -Ofast documentation for options that > could speed their programs find it. > > I have verified that with the following patch IPA-CP sees > flag_semantic_interposition set to zero at Ofast and that info and pdf > manual builds fine with the documentation change. I am bootstrapping > and testing it now in order to comply with submission criteria but I > don't think an Ofast change gets much tested. > > Assuming it passes, is the patch OK? (If it is, I will also add a note > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs > after I commit the patch.) > Unfortunately, I was wrong, there are testcases which use the optimize attribute to switch a function to Ofast and those ICE because -fsemantic-interposition is not an optimization flag and only optimization flags can change in an optimize attribute (AFAIK, I only had a quick glance at the results). I am not sure what is the right way to tackle this, whether to set the flag at Ofast in some nonstandard way or make the flag an optimization flag - probably affecting function definitions, having it affect call-sites seems too fine-grained. I will try to discuss this on IRC on Monday (and hope such change is still doable early stage3). Sorry for posting this a bit prematurely, Martin > > > gcc/ChangeLog: > > 2021-11-12 Martin Jambor <mjambor@suse.cz> > > * opts.c (default_options_table): Switch off > flag_semantic_interposition at Ofast. > * doc/invoke.texi (Optimize Options): Document that Ofast switches off > -fsemantic-interposition. > --- > gcc/doc/invoke.texi | 1 + > gcc/opts.c | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 2ea23d07c4c..fd16c91aec8 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -10551,6 +10551,7 @@ valid for all standard-compliant programs. > It turns on @option{-ffast-math}, @option{-fallow-store-data-races} > and the Fortran-specific @option{-fstack-arrays}, unless > @option{-fmax-stack-var-size} is specified, and @option{-fno-protect-parens}. > +It turns off @option {-fsemantic-interposition}. > > @item -Og > @opindex Og > diff --git a/gcc/opts.c b/gcc/opts.c > index caed6255500..3da53d8f890 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -682,6 +682,7 @@ static const struct default_options default_options_table[] = > /* -Ofast adds optimizations to -O3. */ > { OPT_LEVELS_FAST, OPT_ffast_math, NULL, 1 }, > { OPT_LEVELS_FAST, OPT_fallow_store_data_races, NULL, 1 }, > + { OPT_LEVELS_FAST, OPT_fsemantic_interposition, NULL, 0 }, > > { OPT_LEVELS_NONE, 0, NULL, 0 } > }; > -- > 2.33.0
> Hi, > > On Fri, Nov 12 2021, Martin Jambor wrote: > > Hi, > > > > using -fno-semantic-interposition has been reported by various people > > to bring about considerable speed up at the cost of strict compliance > > to the ELF symbol interposition rules See for example > > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup > > > > As such I believe it should be implied by our -Ofast optimization > > level, not only so that benchmarks that can benefit run faster, but > > also so that people looking at -Ofast documentation for options that > > could speed their programs find it. > > > > I have verified that with the following patch IPA-CP sees > > flag_semantic_interposition set to zero at Ofast and that info and pdf > > manual builds fine with the documentation change. I am bootstrapping > > and testing it now in order to comply with submission criteria but I > > don't think an Ofast change gets much tested. > > > > Assuming it passes, is the patch OK? (If it is, I will also add a note > > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs > > after I commit the patch.) > > > > Unfortunately, I was wrong, there are testcases which use the optimize > attribute to switch a function to Ofast and those ICE because > -fsemantic-interposition is not an optimization flag and only > optimization flags can change in an optimize attribute (AFAIK, I only > had a quick glance at the results). > > I am not sure what is the right way to tackle this, whether to set the > flag at Ofast in some nonstandard way or make the flag an optimization > flag - probably affecting function definitions, having it affect > call-sites seems too fine-grained. I will try to discuss this on IRC on > Monday (and hope such change is still doable early stage3). > > Sorry for posting this a bit prematurely, Hi, This patch turns flag_semantic_interposition to optimization option so it can be enabled with per-function granuality. This is done by adding the flag among visibility flags into the symbol table. This fixes the behaviour on the testcase I added to testsuite. There are bugs where get_availability misbehaves on partitioned program. We can also use the new flag to fix those, but I will do that incrementally. The -Ofast change should be safe now. Bootstrapped/regtested x86_64-linux, comitted. gcc/ChangeLog: 2021-11-18 Jan Hubicka <hubicka@ucw.cz> * cgraph.c (cgraph_node::get_availability): Update call of decl_replaceable_p. (cgraph_node::verify_node): Verify that semantic_interposition flag is set correclty. * cgraph.h: (symtab_node): Add semantic_interposition flag. * cgraphclones.c (set_new_clone_decl_and_node_flags): Clear semantic_interposition flag. * cgraphunit.c (cgraph_node::finalize_function): Set semantic_interposition flag. (cgraph_node::add_new_function): Likewise. (varpool_node::finalize_decl): Likewise. (cgraph_node::create_wrapper): Likewise. * common.opt (fsemantic-interposition): Turn to optimization node. * lto-cgraph.c (lto_output_node): Stream semantic_interposition. (lto_output_varpool_node): Likewise. (input_overwrite_node): Likewise. (input_varpool_node): Likewise. * symtab.c (symtab_node::dump_base): * varasm.c (decl_replaceable_p): Add semantic_interposition_p parameter. * varasm.h (decl_replaceable_p): Update declaration. * varpool.c (varpool_node::ctor_useable_for_folding_p): Use semantic_interposition flag. (varpool_node::get_availability): Likewise. (varpool_node::create_alias): Copy semantic_interposition flag. gcc/cp/ChangeLog: 2021-11-18 Jan Hubicka <hubicka@ucw.cz> * decl.c (finish_function): Update use of decl_replaceable_p. gcc/lto/ChangeLog: 2021-11-18 Jan Hubicka <hubicka@ucw.cz> * lto-partition.c (promote_symbol): Clear semantic_interposition flag. gcc/testsuite/ChangeLog: 2021-11-18 Jan Hubicka <hubicka@ucw.cz> * gcc.dg/lto/semantic-interposition-1_0.c: New test. * gcc.dg/lto/semantic-interposition-1_1.c: New test. diff --git a/gcc/cgraph.c b/gcc/cgraph.c index 466b66d5ba5..8e7c12642ad 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -2390,7 +2390,8 @@ cgraph_node::get_availability (symtab_node *ref) to code cp_cannot_inline_tree_fn and probably shall be shared and the inlinability hooks completely eliminated). */ - else if (decl_replaceable_p (decl) && !DECL_EXTERNAL (decl)) + else if (decl_replaceable_p (decl, semantic_interposition) + && !DECL_EXTERNAL (decl)) avail = AVAIL_INTERPOSABLE; else avail = AVAIL_AVAILABLE; @@ -3486,6 +3487,13 @@ cgraph_node::verify_node (void) "returns a pointer"); error_found = true; } + if (definition && externally_visible + && semantic_interposition + != opt_for_fn (decl, flag_semantic_interposition)) + { + error ("semantic interposition mismatch"); + error_found = true; + } for (e = indirect_calls; e; e = e->next_callee) { if (e->aux) diff --git a/gcc/cgraph.h b/gcc/cgraph.h index e42e305cdb6..dafdfd289a0 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -117,6 +117,7 @@ public: refuse_visibility_changes (false), externally_visible (false), no_reorder (false), force_output (false), forced_by_abi (false), unique_name (false), implicit_section (false), body_removed (false), + semantic_interposition (flag_semantic_interposition), used_from_other_partition (false), in_other_partition (false), address_taken (false), in_init_priority_hash (false), need_lto_streaming (false), offloadable (false), ifunc_resolver (false), @@ -557,6 +558,8 @@ public: /* True when body and other characteristics have been removed by symtab_remove_unreachable_nodes. */ unsigned body_removed : 1; + /* True when symbol should comply to -fsemantic-interposition flag. */ + unsigned semantic_interposition : 1; /*** WHOPR Partitioning flags. These flags are used at ltrans stage when only part of the callgraph is diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c index ae91dccd31d..c997b82f3e1 100644 --- a/gcc/cgraphclones.c +++ b/gcc/cgraphclones.c @@ -172,6 +172,7 @@ set_new_clone_decl_and_node_flags (cgraph_node *new_node) new_node->externally_visible = 0; new_node->local = 1; new_node->lowered = true; + new_node->semantic_interposition = 0; } /* Duplicate thunk THUNK if necessary but make it to refer to NODE. diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c index 55cb0347149..1e58ffd65e8 100644 --- a/gcc/cgraphunit.c +++ b/gcc/cgraphunit.c @@ -452,6 +452,7 @@ cgraph_node::finalize_function (tree decl, bool no_collect) node->definition = true; notice_global_symbol (decl); node->lowered = DECL_STRUCT_FUNCTION (decl)->cfg != NULL; + node->semantic_interposition = opt_for_fn (decl, flag_semantic_interposition); if (!flag_toplevel_reorder) node->no_reorder = true; @@ -554,6 +555,8 @@ cgraph_node::add_new_function (tree fndecl, bool lowered) node = cgraph_node::get_create (fndecl); node->local = false; node->definition = true; + node->semantic_interposition = opt_for_fn (fndecl, + flag_semantic_interposition); node->force_output = true; if (TREE_PUBLIC (fndecl)) node->externally_visible = true; @@ -581,6 +584,8 @@ cgraph_node::add_new_function (tree fndecl, bool lowered) if (lowered) node->lowered = true; node->definition = true; + node->semantic_interposition = opt_for_fn (fndecl, + flag_semantic_interposition); node->analyze (); push_cfun (DECL_STRUCT_FUNCTION (fndecl)); gimple_register_cfg_hooks (); @@ -954,6 +959,7 @@ varpool_node::finalize_decl (tree decl) /* Set definition first before calling notice_global_symbol so that it is available to notice_global_symbol. */ node->definition = true; + node->semantic_interposition = flag_semantic_interposition; notice_global_symbol (decl); if (!flag_toplevel_reorder) node->no_reorder = true; @@ -2576,6 +2582,7 @@ cgraph_node::create_wrapper (cgraph_node *target) /* Turn alias into thunk and expand it into GIMPLE representation. */ definition = true; + semantic_interposition = opt_for_fn (decl, flag_semantic_interposition); /* Create empty thunk, but be sure we did not keep former thunk around. In that case we would need to preserve the info. */ diff --git a/gcc/common.opt b/gcc/common.opt index de9b848eda5..db6010e4e20 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2539,7 +2539,7 @@ Common Var(flag_sel_sched_reschedule_pipelined) Init(0) Optimization Reschedule pipelined regions without pipelining. fsemantic-interposition -Common Var(flag_semantic_interposition) Init(1) +Common Var(flag_semantic_interposition) Init(1) Optimization Allow interposing function (or variables) by ones with different semantics (or initializer) respectively by dynamic linker. ; sched_stalled_insns means that insns can be moved prematurely from the queue diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 2ddf0e4a524..9f68d1a5590 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -17609,7 +17609,8 @@ finish_function (bool inline_p) if (!processing_template_decl && !cp_function_chain->can_throw && !flag_non_call_exceptions - && !decl_replaceable_p (fndecl)) + && !decl_replaceable_p (fndecl, + opt_for_fn (fndecl, flag_semantic_interposition))) TREE_NOTHROW (fndecl) = 1; /* This must come after expand_function_end because cleanups might diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index 7c3e276a8ea..ef3c371c98f 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -516,6 +516,7 @@ lto_output_node (struct lto_simple_output_block *ob, struct cgraph_node *node, bp_pack_value (&bp, node->forced_by_abi, 1); bp_pack_value (&bp, node->unique_name, 1); bp_pack_value (&bp, node->body_removed, 1); + bp_pack_value (&bp, node->semantic_interposition, 1); bp_pack_value (&bp, node->implicit_section, 1); bp_pack_value (&bp, node->address_taken, 1); bp_pack_value (&bp, tag == LTO_symtab_analyzed_node @@ -604,6 +605,7 @@ lto_output_varpool_node (struct lto_simple_output_block *ob, varpool_node *node, node->body_removed || (!encode_initializer_p && !node->alias && node->definition), 1); + bp_pack_value (&bp, node->semantic_interposition, 1); bp_pack_value (&bp, node->implicit_section, 1); bp_pack_value (&bp, node->writeonly, 1); bp_pack_value (&bp, node->definition && (encode_initializer_p || node->alias), @@ -1167,6 +1169,7 @@ input_overwrite_node (struct lto_file_decl_data *file_data, node->forced_by_abi = bp_unpack_value (bp, 1); node->unique_name = bp_unpack_value (bp, 1); node->body_removed = bp_unpack_value (bp, 1); + node->semantic_interposition = bp_unpack_value (bp, 1); node->implicit_section = bp_unpack_value (bp, 1); node->address_taken = bp_unpack_value (bp, 1); node->used_from_other_partition = bp_unpack_value (bp, 1); @@ -1373,6 +1376,7 @@ input_varpool_node (struct lto_file_decl_data *file_data, node->forced_by_abi = bp_unpack_value (&bp, 1); node->unique_name = bp_unpack_value (&bp, 1); node->body_removed = bp_unpack_value (&bp, 1); + node->semantic_interposition = bp_unpack_value (&bp, 1); node->implicit_section = bp_unpack_value (&bp, 1); node->writeonly = bp_unpack_value (&bp, 1); node->definition = bp_unpack_value (&bp, 1); diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c index bee40218159..6d20555073e 100644 --- a/gcc/lto/lto-partition.c +++ b/gcc/lto/lto-partition.c @@ -1001,6 +1001,7 @@ promote_symbol (symtab_node *node) so it is prevailing. This is important to keep binds_to_current_def_p to work across partitions. */ node->resolution = LDPR_PREVAILING_DEF_IRONLY; + node->semantic_interposition = false; DECL_VISIBILITY (node->decl) = VISIBILITY_HIDDEN; DECL_VISIBILITY_SPECIFIED (node->decl) = true; if (dump_file) diff --git a/gcc/symtab.c b/gcc/symtab.c index c7ea8ecef74..94b4e47f749 100644 --- a/gcc/symtab.c +++ b/gcc/symtab.c @@ -929,6 +929,8 @@ symtab_node::dump_base (FILE *f) fprintf (f, " forced_by_abi"); if (externally_visible) fprintf (f, " externally_visible"); + if (semantic_interposition) + fprintf (f, " semantic_interposition"); if (no_reorder) fprintf (f, " no_reorder"); if (resolution != LDPR_UNKNOWN) diff --git a/gcc/testsuite/gcc.dg/lto/semantic-interposition-1_0.c b/gcc/testsuite/gcc.dg/lto/semantic-interposition-1_0.c new file mode 100644 index 00000000000..db842749008 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/semantic-interposition-1_0.c @@ -0,0 +1,13 @@ +/* { dg-lto-do link } */ +/* { dg-require-effective-target shared } */ +/* { dg-extra-ld-options { -shared } } */ +/* { dg-lto-options {{-O2 -flto -fpic -fdump-ipa-inline-details --shared}} } */ +extern int ret1(); + +int +test() +{ + return ret1(); +} +/* { dg-final { scan-wpa-ipa-dump "Inlined 1 calls" "inline" } } */ + diff --git a/gcc/testsuite/gcc.dg/lto/semantic-interposition-1_1.c b/gcc/testsuite/gcc.dg/lto/semantic-interposition-1_1.c new file mode 100644 index 00000000000..9a741773002 --- /dev/null +++ b/gcc/testsuite/gcc.dg/lto/semantic-interposition-1_1.c @@ -0,0 +1,5 @@ +/* { dg-options "-O2 -flto -fpic -fno-semantic-interposition" } */ +int ret1() +{ + return 1; +} diff --git a/gcc/varasm.c b/gcc/varasm.c index 09316c62050..8c7aba2db61 100644 --- a/gcc/varasm.c +++ b/gcc/varasm.c @@ -7625,6 +7625,8 @@ decl_binds_to_current_def_p (const_tree decl) at link-time with an entirely different definition, provided that the replacement has the same type. For example, functions declared with __attribute__((weak)) on most systems are replaceable. + If SEMANTIC_INTERPOSITION_P is false allow interposition only on + symbols explicitly declared weak. COMDAT functions are not replaceable, since all definitions of the function must be equivalent. It is important that COMDAT functions @@ -7632,12 +7634,12 @@ decl_binds_to_current_def_p (const_tree decl) instantiations is not penalized. */ bool -decl_replaceable_p (tree decl) +decl_replaceable_p (tree decl, bool semantic_interposition_p) { gcc_assert (DECL_P (decl)); if (!TREE_PUBLIC (decl) || DECL_COMDAT (decl)) return false; - if (!flag_semantic_interposition + if (!semantic_interposition_p && !DECL_WEAK (decl)) return false; return !decl_binds_to_current_def_p (decl); diff --git a/gcc/varasm.h b/gcc/varasm.h index 442ca6a380a..4ab9dc51838 100644 --- a/gcc/varasm.h +++ b/gcc/varasm.h @@ -38,7 +38,7 @@ extern void mark_decl_referenced (tree); extern void notice_global_symbol (tree); extern void set_user_assembler_name (tree, const char *); extern void process_pending_assemble_externals (void); -extern bool decl_replaceable_p (tree); +extern bool decl_replaceable_p (tree, bool); extern bool decl_binds_to_current_def_p (const_tree); extern enum tls_model decl_default_tls_model (const_tree); diff --git a/gcc/varpool.c b/gcc/varpool.c index 4830df5c320..fd0d53b9eb6 100644 --- a/gcc/varpool.c +++ b/gcc/varpool.c @@ -372,7 +372,8 @@ varpool_node::ctor_useable_for_folding_p (void) */ if ((!DECL_INITIAL (real_node->decl) || (DECL_WEAK (decl) && !DECL_COMDAT (decl))) - && (DECL_EXTERNAL (decl) || decl_replaceable_p (decl))) + && ((DECL_EXTERNAL (decl) && !in_other_partition) + || decl_replaceable_p (decl, semantic_interposition))) return false; /* Variables declared `const' with an initializer are considered @@ -511,8 +512,8 @@ varpool_node::get_availability (symtab_node *ref) /* If the variable can be overwritten, return OVERWRITABLE. Takes care of at least one notable extension - the COMDAT variables used to share template instantiations in C++. */ - if (decl_replaceable_p (decl) - || DECL_EXTERNAL (decl)) + if (decl_replaceable_p (decl, semantic_interposition) + || (DECL_EXTERNAL (decl) && !in_other_partition)) return AVAIL_INTERPOSABLE; return AVAIL_AVAILABLE; } @@ -779,6 +780,7 @@ varpool_node::create_alias (tree alias, tree decl) alias_node = varpool_node::get_create (alias); alias_node->alias = true; alias_node->definition = true; + alias_node->semantic_interposition = flag_semantic_interposition; alias_node->alias_target = decl; if (lookup_attribute ("weakref", DECL_ATTRIBUTES (alias)) != NULL) alias_node->weakref = alias_node->transparent_alias = true;
> > Hi, > > > > On Fri, Nov 12 2021, Martin Jambor wrote: > > > Hi, > > > > > > using -fno-semantic-interposition has been reported by various people > > > to bring about considerable speed up at the cost of strict compliance > > > to the ELF symbol interposition rules See for example > > > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup > > > > > > As such I believe it should be implied by our -Ofast optimization > > > level, not only so that benchmarks that can benefit run faster, but > > > also so that people looking at -Ofast documentation for options that > > > could speed their programs find it. > > > > > > I have verified that with the following patch IPA-CP sees > > > flag_semantic_interposition set to zero at Ofast and that info and pdf > > > manual builds fine with the documentation change. I am bootstrapping > > > and testing it now in order to comply with submission criteria but I > > > don't think an Ofast change gets much tested. > > > > > > Assuming it passes, is the patch OK? (If it is, I will also add a note > > > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs > > > after I commit the patch.) > > > > > > > Unfortunately, I was wrong, there are testcases which use the optimize > > attribute to switch a function to Ofast and those ICE because > > -fsemantic-interposition is not an optimization flag and only > > optimization flags can change in an optimize attribute (AFAIK, I only > > had a quick glance at the results). > > > > I am not sure what is the right way to tackle this, whether to set the > > flag at Ofast in some nonstandard way or make the flag an optimization > > flag - probably affecting function definitions, having it affect > > call-sites seems too fine-grained. I will try to discuss this on IRC on > > Monday (and hope such change is still doable early stage3). > > > > Sorry for posting this a bit prematurely, > > Hi, > > This patch turns flag_semantic_interposition to optimization option so > it can be enabled with per-function granuality. This is done by adding > the flag among visibility flags into the symbol table. This fixes the > behaviour on the testcase I added to testsuite. > > There are bugs where get_availability misbehaves on partitioned program. > We can also use the new flag to fix those, but I will do that > incrementally. > > The -Ofast change should be safe now. Also forgot to say it explicitly, the patch is OK, so please commit it. Honza
Hi, On Fri, Nov 19 2021, Jan Hubicka wrote: >> > Hi, >> > >> > On Fri, Nov 12 2021, Martin Jambor wrote: >> > > Hi, >> > > >> > > using -fno-semantic-interposition has been reported by various people >> > > to bring about considerable speed up at the cost of strict compliance >> > > to the ELF symbol interposition rules See for example >> > > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup >> > > >> > > As such I believe it should be implied by our -Ofast optimization >> > > level, not only so that benchmarks that can benefit run faster, but >> > > also so that people looking at -Ofast documentation for options that >> > > could speed their programs find it. >> > > >> > > I have verified that with the following patch IPA-CP sees >> > > flag_semantic_interposition set to zero at Ofast and that info and pdf >> > > manual builds fine with the documentation change. I am bootstrapping >> > > and testing it now in order to comply with submission criteria but I >> > > don't think an Ofast change gets much tested. >> > > >> > > Assuming it passes, is the patch OK? (If it is, I will also add a note >> > > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs >> > > after I commit the patch.) >> > > >> > >> > Unfortunately, I was wrong, there are testcases which use the optimize >> > attribute to switch a function to Ofast and those ICE because >> > -fsemantic-interposition is not an optimization flag and only >> > optimization flags can change in an optimize attribute (AFAIK, I only >> > had a quick glance at the results). >> > >> > I am not sure what is the right way to tackle this, whether to set the >> > flag at Ofast in some nonstandard way or make the flag an optimization >> > flag - probably affecting function definitions, having it affect >> > call-sites seems too fine-grained. I will try to discuss this on IRC on >> > Monday (and hope such change is still doable early stage3). >> > >> > Sorry for posting this a bit prematurely, >> >> Hi, >> >> This patch turns flag_semantic_interposition to optimization option so >> it can be enabled with per-function granuality. This is done by adding >> the flag among visibility flags into the symbol table. This fixes the >> behaviour on the testcase I added to testsuite. >> >> There are bugs where get_availability misbehaves on partitioned program. >> We can also use the new flag to fix those, but I will do that >> incrementally. >> >> The -Ofast change should be safe now. > > Also forgot to say it explicitly, the patch is OK, so please commit it. > Thanks a lot for the enabling patch. I have committed mine after re-testing. Martin
On Fri, Nov 19, 2021 at 9:55 AM Martin Jambor <mjambor@suse.cz> wrote: > > Hi, > > On Fri, Nov 19 2021, Jan Hubicka wrote: > >> > Hi, > >> > > >> > On Fri, Nov 12 2021, Martin Jambor wrote: > >> > > Hi, > >> > > > >> > > using -fno-semantic-interposition has been reported by various people > >> > > to bring about considerable speed up at the cost of strict compliance > >> > > to the ELF symbol interposition rules See for example > >> > > https://fedoraproject.org/wiki/Changes/PythonNoSemanticInterpositionSpeedup > >> > > > >> > > As such I believe it should be implied by our -Ofast optimization > >> > > level, not only so that benchmarks that can benefit run faster, but > >> > > also so that people looking at -Ofast documentation for options that > >> > > could speed their programs find it. > >> > > > >> > > I have verified that with the following patch IPA-CP sees > >> > > flag_semantic_interposition set to zero at Ofast and that info and pdf > >> > > manual builds fine with the documentation change. I am bootstrapping > >> > > and testing it now in order to comply with submission criteria but I > >> > > don't think an Ofast change gets much tested. > >> > > > >> > > Assuming it passes, is the patch OK? (If it is, I will also add a note > >> > > about it in the "Caveats" section in gcc-12/changes.html of wwwdocs > >> > > after I commit the patch.) > >> > > > >> > > >> > Unfortunately, I was wrong, there are testcases which use the optimize > >> > attribute to switch a function to Ofast and those ICE because > >> > -fsemantic-interposition is not an optimization flag and only > >> > optimization flags can change in an optimize attribute (AFAIK, I only > >> > had a quick glance at the results). > >> > > >> > I am not sure what is the right way to tackle this, whether to set the > >> > flag at Ofast in some nonstandard way or make the flag an optimization > >> > flag - probably affecting function definitions, having it affect > >> > call-sites seems too fine-grained. I will try to discuss this on IRC on > >> > Monday (and hope such change is still doable early stage3). > >> > > >> > Sorry for posting this a bit prematurely, > >> > >> Hi, > >> > >> This patch turns flag_semantic_interposition to optimization option so > >> it can be enabled with per-function granuality. This is done by adding > >> the flag among visibility flags into the symbol table. This fixes the > >> behaviour on the testcase I added to testsuite. > >> > >> There are bugs where get_availability misbehaves on partitioned program. > >> We can also use the new flag to fix those, but I will do that > >> incrementally. > >> > >> The -Ofast change should be safe now. > > > > Also forgot to say it explicitly, the patch is OK, so please commit it. > > > > Thanks a lot for the enabling patch. I have committed mine after re-testing. > > Martin Perhaps reconsider https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100937 (configure: Add --enable-default-semantic-interposition)? Patch: https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572018.html I happen a write-up on https://maskray.me/blog/2021-05-09-fno-semantic-interposition
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 2ea23d07c4c..fd16c91aec8 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -10551,6 +10551,7 @@ valid for all standard-compliant programs. It turns on @option{-ffast-math}, @option{-fallow-store-data-races} and the Fortran-specific @option{-fstack-arrays}, unless @option{-fmax-stack-var-size} is specified, and @option{-fno-protect-parens}. +It turns off @option {-fsemantic-interposition}. @item -Og @opindex Og diff --git a/gcc/opts.c b/gcc/opts.c index caed6255500..3da53d8f890 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -682,6 +682,7 @@ static const struct default_options default_options_table[] = /* -Ofast adds optimizations to -O3. */ { OPT_LEVELS_FAST, OPT_ffast_math, NULL, 1 }, { OPT_LEVELS_FAST, OPT_fallow_store_data_races, NULL, 1 }, + { OPT_LEVELS_FAST, OPT_fsemantic_interposition, NULL, 0 }, { OPT_LEVELS_NONE, 0, NULL, 0 } };