Message ID | 20211212104912.GD50931@kam.mff.cuni.cz |
---|---|
State | Committed |
Commit | e93809f62363ba4b233858005aef652fb550e896 |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 70541385841C for <patchwork@sourceware.org>; Sun, 12 Dec 2021 10:49:43 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 70541385841C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1639306183; bh=26hMTE/4wAEWApalkOwTIyT9ItlaKsdoYK5owH1xCeM=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=FT8jNxPWJEzb6pRNZChhIw72iM6PB1/MFCkmeSLgU+u6Bx2l/iEIjUN4g2BMVuaiA VHsZij29f5WtbZqf7pl4qYVDEyAMhJS9DUtDqEMWi1sezC3M1nbY3BlwtrVcPFdxm1 z4giK2Col1kFRG0BLyXCBWOQWDiy/rJH8is/q+Fk= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from nikam.ms.mff.cuni.cz (nikam.ms.mff.cuni.cz [195.113.20.16]) by sourceware.org (Postfix) with ESMTPS id 084B9385840C for <gcc-patches@gcc.gnu.org>; Sun, 12 Dec 2021 10:49:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 084B9385840C Received: by nikam.ms.mff.cuni.cz (Postfix, from userid 16202) id A12DA280E3E; Sun, 12 Dec 2021 11:49:12 +0100 (CET) Date: Sun, 12 Dec 2021 11:49:12 +0100 To: gcc-patches@gcc.gnu.org Subject: Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref Message-ID: <20211212104912.GD50931@kam.mff.cuni.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.10.1 (2018-07-13) X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, GIT_PATCH_0, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Jan Hubicka via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Jan Hubicka <hubicka@kam.mff.cuni.cz> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
Terminate BB analysis on NULL pointer access in ipa-pure-const and ipa-modref
|
|
Commit Message
Jan Hubicka
Dec. 12, 2021, 10:49 a.m. UTC
Hi, As discussed in the PR, we miss some optimization becuase gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds __builtin_trap after them. This is seen as a side-effect by IPA analysis and additionally the (fully unreachable) builtin_trap is believed to load all global memory. I think we should think of less intrusive gimple representation of this, but it is also easy enough to special case that in IPA analysers as done in this patch. This is a win even if we improve the representation since gimple-ssa-isolate-paths is run late and this way we improve optimization early. This affects 1623 functions during cc1plus link. Current cc1plus disambiguation stats are: Alias oracle query stats: refs_may_alias_p: 76712244 disambiguations, 90400285 queries ref_maybe_used_by_call_p: 581770 disambiguations, 77716066 queries call_may_clobber_ref_p: 322533 disambiguations, 325280 queries stmt_kills_ref_p: 106434 kills, 5728738 queries nonoverlapping_component_refs_p: 0 disambiguations, 8849 queries nonoverlapping_refs_since_match_p: 29947 disambiguations, 66277 must overlaps, 97248 queries aliasing_component_refs_p: 55737 disambiguations, 3055719 queries TBAA oracle: 27336487 disambiguations 67140201 queries 16214932 are in alias set 0 9713534 queries asked about the same object 92 queries asked about the same alias set 0 access volatile 12049850 are dependent in the DAG 1825306 are aritificially in conflict with void * Modref stats: modref kill: 70 kills, 8279 queries modref use: 29557 disambiguations, 701616 queries modref clobber: 1612655 disambiguations, 21688020 queries 4925889 tbaa queries (0.227125 per modref query) 864389 base compares (0.039856 per modref query) PTA query stats: pt_solution_includes: 13512509 disambiguations, 27571176 queries pt_solutions_intersect: 1594296 disambiguations, 15943975 queries Bootstrapped/regtested x86_64-linux, comitted. gcc/ChangeLog: 2021-12-12 Jan Hubicka <hubicka@ucw.cz> PR ipa/103665 * ipa-modref.c (modref_access_analysis::analyze): Terminate BB analysis on NULL memory access. * ipa-pure-const.c (analyze_function): Likewise.
Comments
On December 12, 2021 11:49:12 AM GMT+01:00, Jan Hubicka via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: >Hi, >As discussed in the PR, we miss some optimization becuase >gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds >__builtin_trap after them. This is seen as a side-effect by IPA analysis >and additionally the (fully unreachable) builtin_trap is believed to load >all global memory. > >I think we should think of less intrusive gimple representation of this, but >it is also easy enough to special case that in IPA analysers as done in >this patch. This is a win even if we improve the representation since >gimple-ssa-isolate-paths is run late and this way we improve optimization >early. > >This affects 1623 functions during cc1plus link. >Current cc1plus disambiguation stats are: > >Alias oracle query stats: > refs_may_alias_p: 76712244 disambiguations, 90400285 queries > ref_maybe_used_by_call_p: 581770 disambiguations, 77716066 queries > call_may_clobber_ref_p: 322533 disambiguations, 325280 queries > stmt_kills_ref_p: 106434 kills, 5728738 queries > nonoverlapping_component_refs_p: 0 disambiguations, 8849 queries > nonoverlapping_refs_since_match_p: 29947 disambiguations, 66277 must overlaps, 97248 queries > aliasing_component_refs_p: 55737 disambiguations, 3055719 queries > TBAA oracle: 27336487 disambiguations 67140201 queries > 16214932 are in alias set 0 > 9713534 queries asked about the same object > 92 queries asked about the same alias set > 0 access volatile > 12049850 are dependent in the DAG > 1825306 are aritificially in conflict with void * > >Modref stats: > modref kill: 70 kills, 8279 queries > modref use: 29557 disambiguations, 701616 queries > modref clobber: 1612655 disambiguations, 21688020 queries > 4925889 tbaa queries (0.227125 per modref query) > 864389 base compares (0.039856 per modref query) > >PTA query stats: > pt_solution_includes: 13512509 disambiguations, 27571176 queries > pt_solutions_intersect: 1594296 disambiguations, 15943975 queries > > >Bootstrapped/regtested x86_64-linux, comitted. > >gcc/ChangeLog: > >2021-12-12 Jan Hubicka <hubicka@ucw.cz> > > PR ipa/103665 > * ipa-modref.c (modref_access_analysis::analyze): Terminate BB > analysis on NULL memory access. > * ipa-pure-const.c (analyze_function): Likewise. > >diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c >index 55fa873e1f0..2c89c63baf6 100644 >--- a/gcc/ipa-modref.c >+++ b/gcc/ipa-modref.c >@@ -1750,6 +1750,19 @@ modref_access_analysis::analyze () > for (si = gsi_start_nondebug_after_labels_bb (bb); > !gsi_end_p (si); gsi_next_nondebug (&si)) > { >+ /* NULL memory accesses terminates BB. These accesses are known >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them >+ to volatile accesses and adds builtin_trap call which would >+ confuse us otherwise. */ >+ if (infer_nonnull_range_by_dereference (gsi_stmt (si), >+ null_pointer_node)) Does that correctly handle flag_delete_null_pointer_checks and address spaces with null? >+ { >+ if (dump_file) >+ fprintf (dump_file, " - NULL memory access; terminating BB\n"); >+ if (flag_non_call_exceptions) >+ set_side_effects (); >+ break; >+ } > analyze_stmt (gsi_stmt (si), always_executed); > > /* Avoid doing useles work. */ >diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c >index fea8b08c4eb..25503f360e6 100644 >--- a/gcc/ipa-pure-const.c >+++ b/gcc/ipa-pure-const.c >@@ -1097,6 +1097,28 @@ analyze_function (struct cgraph_node *fn, bool ipa) > !gsi_end_p (gsi); > gsi_next (&gsi)) > { >+ /* NULL memory accesses terminates BB. These accesses are known >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them >+ to volatile accesses and adds builtin_trap call which would >+ confuse us otherwise. */ >+ if (infer_nonnull_range_by_dereference (gsi_stmt (gsi), >+ null_pointer_node)) >+ { >+ if (dump_file) >+ fprintf (dump_file, " NULL memory access; terminating BB%s\n", >+ flag_non_call_exceptions ? "; looping" : ""); >+ if (flag_non_call_exceptions) >+ { >+ l->looping = true; >+ if (stmt_can_throw_external (cfun, gsi_stmt (gsi))) >+ { >+ if (dump_file) >+ fprintf (dump_file, " can throw externally\n"); >+ l->can_throw = true; >+ } >+ } >+ break; >+ } > check_stmt (&gsi, l, ipa); > if (l->pure_const_state == IPA_NEITHER > && l->looping
On Sun, Dec 12, 2021 at 4:34 AM Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > On December 12, 2021 11:49:12 AM GMT+01:00, Jan Hubicka via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > >Hi, > >As discussed in the PR, we miss some optimization becuase > >gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > >__builtin_trap after them. This is seen as a side-effect by IPA analysis > >and additionally the (fully unreachable) builtin_trap is believed to load > >all global memory. > > > >I think we should think of less intrusive gimple representation of this, but > >it is also easy enough to special case that in IPA analysers as done in > >this patch. This is a win even if we improve the representation since > >gimple-ssa-isolate-paths is run late and this way we improve optimization > >early. > > > >This affects 1623 functions during cc1plus link. > >Current cc1plus disambiguation stats are: > > > >Alias oracle query stats: > > refs_may_alias_p: 76712244 disambiguations, 90400285 queries > > ref_maybe_used_by_call_p: 581770 disambiguations, 77716066 queries > > call_may_clobber_ref_p: 322533 disambiguations, 325280 queries > > stmt_kills_ref_p: 106434 kills, 5728738 queries > > nonoverlapping_component_refs_p: 0 disambiguations, 8849 queries > > nonoverlapping_refs_since_match_p: 29947 disambiguations, 66277 must overlaps, 97248 queries > > aliasing_component_refs_p: 55737 disambiguations, 3055719 queries > > TBAA oracle: 27336487 disambiguations 67140201 queries > > 16214932 are in alias set 0 > > 9713534 queries asked about the same object > > 92 queries asked about the same alias set > > 0 access volatile > > 12049850 are dependent in the DAG > > 1825306 are aritificially in conflict with void * > > > >Modref stats: > > modref kill: 70 kills, 8279 queries > > modref use: 29557 disambiguations, 701616 queries > > modref clobber: 1612655 disambiguations, 21688020 queries > > 4925889 tbaa queries (0.227125 per modref query) > > 864389 base compares (0.039856 per modref query) > > > >PTA query stats: > > pt_solution_includes: 13512509 disambiguations, 27571176 queries > > pt_solutions_intersect: 1594296 disambiguations, 15943975 queries > > > > > >Bootstrapped/regtested x86_64-linux, comitted. > > > >gcc/ChangeLog: > > > >2021-12-12 Jan Hubicka <hubicka@ucw.cz> > > > > PR ipa/103665 > > * ipa-modref.c (modref_access_analysis::analyze): Terminate BB > > analysis on NULL memory access. > > * ipa-pure-const.c (analyze_function): Likewise. > > > >diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c > >index 55fa873e1f0..2c89c63baf6 100644 > >--- a/gcc/ipa-modref.c > >+++ b/gcc/ipa-modref.c > >@@ -1750,6 +1750,19 @@ modref_access_analysis::analyze () > > for (si = gsi_start_nondebug_after_labels_bb (bb); > > !gsi_end_p (si); gsi_next_nondebug (&si)) > > { > >+ /* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+ if (infer_nonnull_range_by_dereference (gsi_stmt (si), > >+ null_pointer_node)) > > Does that correctly handle flag_delete_null_pointer_checks and address spaces with null? Yes. infer_nonnull_range_by_dereference has the following check: /* We can only assume that a pointer dereference will yield non-NULL if -fdelete-null-pointer-checks is enabled. */ if (!flag_delete_null_pointer_checks || !POINTER_TYPE_P (TREE_TYPE (op)) || gimple_code (stmt) == GIMPLE_ASM || gimple_clobber_p (stmt)) return false; And then calls walk_stmt_load_store_ops with check_loadstore as the callback function which does: /* Some address spaces may legitimately dereference zero. */ addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (op)); if (targetm.addr_space.zero_address_valid (as)) return false; return operand_equal_p (TREE_OPERAND (op, 0), (tree)data, 0); Thanks, Andrew Pinski > > >+ { > >+ if (dump_file) > >+ fprintf (dump_file, " - NULL memory access; terminating BB\n"); > >+ if (flag_non_call_exceptions) > >+ set_side_effects (); > >+ break; > >+ } > > analyze_stmt (gsi_stmt (si), always_executed); > > > > /* Avoid doing useles work. */ > >diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c > >index fea8b08c4eb..25503f360e6 100644 > >--- a/gcc/ipa-pure-const.c > >+++ b/gcc/ipa-pure-const.c > >@@ -1097,6 +1097,28 @@ analyze_function (struct cgraph_node *fn, bool ipa) > > !gsi_end_p (gsi); > > gsi_next (&gsi)) > > { > >+ /* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+ if (infer_nonnull_range_by_dereference (gsi_stmt (gsi), > >+ null_pointer_node)) > >+ { > >+ if (dump_file) > >+ fprintf (dump_file, " NULL memory access; terminating BB%s\n", > >+ flag_non_call_exceptions ? "; looping" : ""); > >+ if (flag_non_call_exceptions) > >+ { > >+ l->looping = true; > >+ if (stmt_can_throw_external (cfun, gsi_stmt (gsi))) > >+ { > >+ if (dump_file) > >+ fprintf (dump_file, " can throw externally\n"); > >+ l->can_throw = true; > >+ } > >+ } > >+ break; > >+ } > > check_stmt (&gsi, l, ipa); > > if (l->pure_const_state == IPA_NEITHER > > && l->looping >
> >+ /* NULL memory accesses terminates BB. These accesses are known > >+ to trip undefined behaviour. gimple-ssa-isolate-paths turns them > >+ to volatile accesses and adds builtin_trap call which would > >+ confuse us otherwise. */ > >+ if (infer_nonnull_range_by_dereference (gsi_stmt (si), > >+ null_pointer_node)) > > Does that correctly handle flag_delete_null_pointer_checks and address spaces with null? Yes it does both - it is also same check that gimple-ssa-isolate-paths eventually uses to add the trap so we would probably see PRs if it did not. Honza
On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: > Hi, > As discussed in the PR, we miss some optimization becuase > gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > __builtin_trap after them. This is seen as a side-effect by IPA analysis > and additionally the (fully unreachable) builtin_trap is believed to load > all global memory. > > I think we should think of less intrusive gimple representation of this, but > it is also easy enough to special case that in IPA analysers as done in > this patch. This is a win even if we improve the representation since > gimple-ssa-isolate-paths is run late and this way we improve optimization > early. So what's important about the IR representation is that we keep some kind of memory access around (so that it will fault), that the memory access reference a minimal amount of other stuff (we don't want the address computation to keep anything live for example) and that we have a subsequent trap so that the CFG routines know it traps. Otherwise we're free to do whatever we want. jeff
> > > On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: > > Hi, > > As discussed in the PR, we miss some optimization becuase > > gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds > > __builtin_trap after them. This is seen as a side-effect by IPA analysis > > and additionally the (fully unreachable) builtin_trap is believed to load > > all global memory. > > > > I think we should think of less intrusive gimple representation of this, but > > it is also easy enough to special case that in IPA analysers as done in > > this patch. This is a win even if we improve the representation since > > gimple-ssa-isolate-paths is run late and this way we improve optimization > > early. > So what's important about the IR representation is that we keep some kind of > memory access around (so that it will fault), that the memory access > reference a minimal amount of other stuff (we don't want the address > computation to keep anything live for example) and that we have a subsequent > trap so that the CFG routines know it traps. > > Otherwise we're free to do whatever we want. I was thinking about __builtin_trap_load (ptr) and __builtin_trap_store (ptr) which we could annotate correctly for PTA and avoid need for two statements, but I am not sure if this is good idea. coioncidentally I just ran across another artifact of this. One of hottest functions in clang in getTerminator. Clang builds it as: │ 000000000122f4b0 <llvm::BasicBlock::getTerminator() const>: │ llvm::BasicBlock::getTerminator() const: 2.16 │ mov 0x28(%rdi),%rax 28.85 │ add $0x28,%rdi 0.20 │ cmp %rax,%rdi 3.55 │ ↓ je 29 │ lea -0x18(%rax),%rcx 0.79 │ test %rax,%rax │ cmove %rax,%rcx 4.15 │ movzbl 0x10(%rcx),%edx 48.46 │ add $0xffffffe5,%edx 3.95 │ xor %eax,%eax 0.20 │ cmp $0xb,%edx 3.35 │ cmovb %rcx,%rax 4.33 │ ← ret │29: xor %eax,%eax │ ← ret While we do: │ │ 0000000001471840 <llvm::BasicBlock::getTerminator() const>: │ llvm::BasicBlock::getTerminator() const: 3.24 │ mov 0x28(%rdi),%rax 31.31 │ add $0x28,%rdi 0.59 │ cmp %rdi,%rax 5.31 │ ↓ je 30 │ test %rax,%rax 2.36 │ → je 9366f4 <llvm::BasicBlock::getTerminator() const [clone .cold]> │ movzbl -0x8(%rax),%edx 45.73 │ sub $0x18,%rax │ sub $0x1b,%edx 4.70 │ cmp $0xb,%edx 3.53 │ mov $0x0,%edx │ cmovae %rdx,%rax 3.23 │ ← ret │ xchg %ax,%ax │30: xor %eax,%eax │ ← ret .... │ │ 00000000009366f4 <llvm::BasicBlock::getTerminator() const [clone .cold]>: │ llvm::BasicBlock::getTerminator() const [clone .cold]: │ movzbl 0x10,%eax │ ud2 The clang generated code obviously conditionally loads NULL pointer, but replacing this cmov by conditional jump to cold section seems overkill. Loading 10 into eax seems useless... I think this is common pattern in C++ code originating from cast with multiple inheritance. I would vote towards optimizing out the conditial move in this case and I think it is correct. Honza
> > I think this is common pattern in C++ code originating from cast with > multiple inheritance. I would vote towards optimizing out the conditial > move in this case and I think it is correct. I crafted a testcse and filled in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103674 Honza > > Honza
On 12/12/2021 12:57 PM, Jan Hubicka wrote: >> >> On 12/12/2021 3:49 AM, Jan Hubicka via Gcc-patches wrote: >>> Hi, >>> As discussed in the PR, we miss some optimization becuase >>> gimple-ssa-isolate-paths turns NULL memory accesses to volatile and adds >>> __builtin_trap after them. This is seen as a side-effect by IPA analysis >>> and additionally the (fully unreachable) builtin_trap is believed to load >>> all global memory. >>> >>> I think we should think of less intrusive gimple representation of this, but >>> it is also easy enough to special case that in IPA analysers as done in >>> this patch. This is a win even if we improve the representation since >>> gimple-ssa-isolate-paths is run late and this way we improve optimization >>> early. >> So what's important about the IR representation is that we keep some kind of >> memory access around (so that it will fault), that the memory access >> reference a minimal amount of other stuff (we don't want the address >> computation to keep anything live for example) and that we have a subsequent >> trap so that the CFG routines know it traps. >> >> Otherwise we're free to do whatever we want. > I was thinking about __builtin_trap_load (ptr) and __builtin_trap_store (ptr) > which we could annotate correctly for PTA and avoid need for two > statements, but I am not sure if this is good idea. Seems reasonable. I'm not sure if we need both though. __builtin_trap_memref(ptr) perhaps? > > coioncidentally I just ran across another artifact of this. One of > hottest functions in clang in getTerminator. Clang builds it as: > > │ 000000000122f4b0 <llvm::BasicBlock::getTerminator() const>: > │ llvm::BasicBlock::getTerminator() const: > 2.16 │ mov 0x28(%rdi),%rax > 28.85 │ add $0x28,%rdi > 0.20 │ cmp %rax,%rdi > 3.55 │ ↓ je 29 > │ lea -0x18(%rax),%rcx > 0.79 │ test %rax,%rax > │ cmove %rax,%rcx > 4.15 │ movzbl 0x10(%rcx),%edx > 48.46 │ add $0xffffffe5,%edx > 3.95 │ xor %eax,%eax > 0.20 │ cmp $0xb,%edx > 3.35 │ cmovb %rcx,%rax > 4.33 │ ← ret > │29: xor %eax,%eax > │ ← ret > > While we do: > > │ > │ 0000000001471840 <llvm::BasicBlock::getTerminator() const>: > │ llvm::BasicBlock::getTerminator() const: > 3.24 │ mov 0x28(%rdi),%rax > 31.31 │ add $0x28,%rdi > 0.59 │ cmp %rdi,%rax > 5.31 │ ↓ je 30 > │ test %rax,%rax > 2.36 │ → je 9366f4 <llvm::BasicBlock::getTerminator() const [clone .cold]> > │ movzbl -0x8(%rax),%edx > 45.73 │ sub $0x18,%rax > │ sub $0x1b,%edx > 4.70 │ cmp $0xb,%edx > 3.53 │ mov $0x0,%edx > │ cmovae %rdx,%rax > 3.23 │ ← ret > │ xchg %ax,%ax > │30: xor %eax,%eax > │ ← ret > > .... > > │ > │ 00000000009366f4 <llvm::BasicBlock::getTerminator() const [clone .cold]>: > │ llvm::BasicBlock::getTerminator() const [clone .cold]: > │ movzbl 0x10,%eax > │ ud2 > > The clang generated code obviously conditionally loads NULL pointer, but > replacing this cmov by conditional jump to cold section seems overkill. > Loading 10 into eax seems useless... The load into %eax definitely seems useless. Given branch prediction I'm less sure about the cmov vs jump to the cold section though, but I don't have strong opinions. If you think doing the cmov is a better choice, I won't argue. jeff
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c index 55fa873e1f0..2c89c63baf6 100644 --- a/gcc/ipa-modref.c +++ b/gcc/ipa-modref.c @@ -1750,6 +1750,19 @@ modref_access_analysis::analyze () for (si = gsi_start_nondebug_after_labels_bb (bb); !gsi_end_p (si); gsi_next_nondebug (&si)) { + /* NULL memory accesses terminates BB. These accesses are known + to trip undefined behaviour. gimple-ssa-isolate-paths turns them + to volatile accesses and adds builtin_trap call which would + confuse us otherwise. */ + if (infer_nonnull_range_by_dereference (gsi_stmt (si), + null_pointer_node)) + { + if (dump_file) + fprintf (dump_file, " - NULL memory access; terminating BB\n"); + if (flag_non_call_exceptions) + set_side_effects (); + break; + } analyze_stmt (gsi_stmt (si), always_executed); /* Avoid doing useles work. */ diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c index fea8b08c4eb..25503f360e6 100644 --- a/gcc/ipa-pure-const.c +++ b/gcc/ipa-pure-const.c @@ -1097,6 +1097,28 @@ analyze_function (struct cgraph_node *fn, bool ipa) !gsi_end_p (gsi); gsi_next (&gsi)) { + /* NULL memory accesses terminates BB. These accesses are known + to trip undefined behaviour. gimple-ssa-isolate-paths turns them + to volatile accesses and adds builtin_trap call which would + confuse us otherwise. */ + if (infer_nonnull_range_by_dereference (gsi_stmt (gsi), + null_pointer_node)) + { + if (dump_file) + fprintf (dump_file, " NULL memory access; terminating BB%s\n", + flag_non_call_exceptions ? "; looping" : ""); + if (flag_non_call_exceptions) + { + l->looping = true; + if (stmt_can_throw_external (cfun, gsi_stmt (gsi))) + { + if (dump_file) + fprintf (dump_file, " can throw externally\n"); + l->can_throw = true; + } + } + break; + } check_stmt (&gsi, l, ipa); if (l->pure_const_state == IPA_NEITHER && l->looping