Improve jump threading dump output.

Message ID 20210928094545.889111-1-aldyh@redhat.com
State Committed
Commit 0400ca17f361dcc7f8230bb69a25de22497c73c3
Headers
Series Improve jump threading dump output. |

Commit Message

Aldy Hernandez Sept. 28, 2021, 9:45 a.m. UTC
  In analyzing PR102511, it has become abundantly clear that we need
better debugging aids for the jump threader solver.  Currently
debugging these issues is a nightmare if you're not intimately
familiar with the code.  This patch attempts to improve this.

First, I'm enabling path solver dumps with TDF_THREADING.  None of the
available TDF_* flags are a good match, and using TDF_DETAILS would blow
up the dump file, since both threaders continually call the solver to
try out candidates.  This will allow dumping path solver details without
having to resort to hacking the source.

I am also dumping the current registered_jump_thread dbg counter used
by the registry, in the solver.  That way narrowing down a problematic
thread can then be examined by -fdump-*-threading and looking at the
solver details surrounding the appropriate counter (which the dbgcnt
also dumps to the dump file).

You still need knowledge of the solver to debug these issues, but at
least now it's not entirely opaque.

OK?

gcc/ChangeLog:

	* dbgcnt.c (dbg_cnt_counter): New.
	* dbgcnt.h (dbg_cnt_counter): New.
	* dumpfile.c (dump_options): Add entry for TDF_THREADING.
	* dumpfile.h (enum dump_flag): Add TDF_THREADING.
	* gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
	* tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
	debug counter.
---
 gcc/dbgcnt.c                |  8 ++++++++
 gcc/dbgcnt.h                |  1 +
 gcc/dumpfile.c              |  1 +
 gcc/dumpfile.h              |  3 +++
 gcc/gimple-range-path.cc    |  2 +-
 gcc/tree-ssa-threadupdate.c | 13 +++++++++----
 6 files changed, 23 insertions(+), 5 deletions(-)
  

Comments

Jeff Law Sept. 28, 2021, 1:47 p.m. UTC | #1
On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
> In analyzing PR102511, it has become abundantly clear that we need
> better debugging aids for the jump threader solver.  Currently
> debugging these issues is a nightmare if you're not intimately
> familiar with the code.  This patch attempts to improve this.
>
> First, I'm enabling path solver dumps with TDF_THREADING.  None of the
> available TDF_* flags are a good match, and using TDF_DETAILS would blow
> up the dump file, since both threaders continually call the solver to
> try out candidates.  This will allow dumping path solver details without
> having to resort to hacking the source.
>
> I am also dumping the current registered_jump_thread dbg counter used
> by the registry, in the solver.  That way narrowing down a problematic
> thread can then be examined by -fdump-*-threading and looking at the
> solver details surrounding the appropriate counter (which the dbgcnt
> also dumps to the dump file).
>
> You still need knowledge of the solver to debug these issues, but at
> least now it's not entirely opaque.
>
> OK?
>
> gcc/ChangeLog:
>
> 	* dbgcnt.c (dbg_cnt_counter): New.
> 	* dbgcnt.h (dbg_cnt_counter): New.
> 	* dumpfile.c (dump_options): Add entry for TDF_THREADING.
> 	* dumpfile.h (enum dump_flag): Add TDF_THREADING.
> 	* gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
> 	* tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
> 	debug counter.
OK.

Note we've got massive failures in the tester starting sometime 
yesterday and I suspect all the threader work.    So I'm going to slow 
down on reviews of that code as we stabilize stuff.

jeff
  
Aldy Hernandez Sept. 28, 2021, 1:53 p.m. UTC | #2
On 9/28/21 3:47 PM, Jeff Law wrote:
> 
> 
> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>> In analyzing PR102511, it has become abundantly clear that we need
>> better debugging aids for the jump threader solver.  Currently
>> debugging these issues is a nightmare if you're not intimately
>> familiar with the code.  This patch attempts to improve this.
>>
>> First, I'm enabling path solver dumps with TDF_THREADING.  None of the
>> available TDF_* flags are a good match, and using TDF_DETAILS would blow
>> up the dump file, since both threaders continually call the solver to
>> try out candidates.  This will allow dumping path solver details without
>> having to resort to hacking the source.
>>
>> I am also dumping the current registered_jump_thread dbg counter used
>> by the registry, in the solver.  That way narrowing down a problematic
>> thread can then be examined by -fdump-*-threading and looking at the
>> solver details surrounding the appropriate counter (which the dbgcnt
>> also dumps to the dump file).
>>
>> You still need knowledge of the solver to debug these issues, but at
>> least now it's not entirely opaque.
>>
>> OK?
>>
>> gcc/ChangeLog:
>>
>>     * dbgcnt.c (dbg_cnt_counter): New.
>>     * dbgcnt.h (dbg_cnt_counter): New.
>>     * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>     * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>     * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>     * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>     debug counter.
> OK.
> 
> Note we've got massive failures in the tester starting sometime 
> yesterday and I suspect all the threader work.    So I'm going to slow 
> down on reviews of that code as we stabilize stuff.

Fair enough.  Let's knock those out then.

I just fixed a P1 that was causing undefined behavior.  Other than that, 
I don't have any known regressions apart from the loop crossing 
restrictions which you and me haven't agreed upon yet.  (Well...there 
are some archs that need testsuite tweaking, but they're not bugs per se.)

Send anything my way.

Aldy
  
Jeff Law Sept. 28, 2021, 1:56 p.m. UTC | #3
On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>
>
> On 9/28/21 3:47 PM, Jeff Law wrote:
>>
>>
>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>> In analyzing PR102511, it has become abundantly clear that we need
>>> better debugging aids for the jump threader solver.  Currently
>>> debugging these issues is a nightmare if you're not intimately
>>> familiar with the code.  This patch attempts to improve this.
>>>
>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>> available TDF_* flags are a good match, and using TDF_DETAILS would 
>>> blow
>>> up the dump file, since both threaders continually call the solver to
>>> try out candidates.  This will allow dumping path solver details 
>>> without
>>> having to resort to hacking the source.
>>>
>>> I am also dumping the current registered_jump_thread dbg counter used
>>> by the registry, in the solver.  That way narrowing down a problematic
>>> thread can then be examined by -fdump-*-threading and looking at the
>>> solver details surrounding the appropriate counter (which the dbgcnt
>>> also dumps to the dump file).
>>>
>>> You still need knowledge of the solver to debug these issues, but at
>>> least now it's not entirely opaque.
>>>
>>> OK?
>>>
>>> gcc/ChangeLog:
>>>
>>>     * dbgcnt.c (dbg_cnt_counter): New.
>>>     * dbgcnt.h (dbg_cnt_counter): New.
>>>     * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>     * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>     * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>     * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>     debug counter.
>> OK.
>>
>> Note we've got massive failures in the tester starting sometime 
>> yesterday and I suspect all the threader work.    So I'm going to 
>> slow down on reviews of that code as we stabilize stuff.
>
> Fair enough.  Let's knock those out then.
Yup.  I suspect it's just one or two issues that are showing up on a 
variety of targets.  And as I've said before, that's why we've got a 
tester :-)

>
> I just fixed a P1 that was causing undefined behavior.  Other than 
> that, I don't have any known regressions apart from the loop crossing 
> restrictions which you and me haven't agreed upon yet. (Well...there 
> are some archs that need testsuite tweaking, but they're not bugs per 
> se.)
These could end up being testsuite issues.  I've only debugged as far as 
"there's a sea of red failures" on the dashboard.

>
> Send anything my way.
Got a docker instance of the first one spinning right now for debugging 
purposes.  I'll look at it after I finish playing chauffeur for my daughter.

jeff
  
Jeff Law Sept. 28, 2021, 3:45 p.m. UTC | #4
On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>
>
> On 9/28/21 3:47 PM, Jeff Law wrote:
>>
>>
>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>> In analyzing PR102511, it has become abundantly clear that we need
>>> better debugging aids for the jump threader solver.  Currently
>>> debugging these issues is a nightmare if you're not intimately
>>> familiar with the code.  This patch attempts to improve this.
>>>
>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>> available TDF_* flags are a good match, and using TDF_DETAILS would 
>>> blow
>>> up the dump file, since both threaders continually call the solver to
>>> try out candidates.  This will allow dumping path solver details 
>>> without
>>> having to resort to hacking the source.
>>>
>>> I am also dumping the current registered_jump_thread dbg counter used
>>> by the registry, in the solver.  That way narrowing down a problematic
>>> thread can then be examined by -fdump-*-threading and looking at the
>>> solver details surrounding the appropriate counter (which the dbgcnt
>>> also dumps to the dump file).
>>>
>>> You still need knowledge of the solver to debug these issues, but at
>>> least now it's not entirely opaque.
>>>
>>> OK?
>>>
>>> gcc/ChangeLog:
>>>
>>>     * dbgcnt.c (dbg_cnt_counter): New.
>>>     * dbgcnt.h (dbg_cnt_counter): New.
>>>     * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>     * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>     * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>     * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>     debug counter.
>> OK.
>>
>> Note we've got massive failures in the tester starting sometime 
>> yesterday and I suspect all the threader work.    So I'm going to 
>> slow down on reviews of that code as we stabilize stuff.
>
> Fair enough.  Let's knock those out then.
So several are failing gcc.dg/loop-unswitch-3.c.

This test appears to be verifying that we unswitch a test in one of the 
loops, which is no longer happening after the change to replace the VRP 
threader with the hybrid forward threader.

So both the old VRP threader and the new style identify and realize a 
single jump thread.

In the old VRP threader realization of the jump thread ends up creating 
nested loops.  In the new implementation we end up creating a single 
loop with two back edges to the header.

ie, the (partial) graphs look like this

OLD

        1<--+
        |      |
+->  2     |
|    /   \   |
|  3     4  |
+- +     +-+

NEW


+->  2 <-+
|    /   \   |
|  3     4  |
+- +     +-+


I wonder if we're not doing proper loop fixups or something similar 
after that change.  IIRC we have/had bits in the copier and CFG update 
code to mark the loops that need re-analysis and fixing up.

Anyway, you should be able to trigger and analyze with a cross compiler.

I've got to switch to my day job, but I'll pass along more as I get a 
chance to look at them.

jeff
  
Richard Biener Sept. 28, 2021, 4:05 p.m. UTC | #5
On September 28, 2021 5:45:52 PM GMT+02:00, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
>
>On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>>
>>
>> On 9/28/21 3:47 PM, Jeff Law wrote:
>>>
>>>
>>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>>> In analyzing PR102511, it has become abundantly clear that we need
>>>> better debugging aids for the jump threader solver.  Currently
>>>> debugging these issues is a nightmare if you're not intimately
>>>> familiar with the code.  This patch attempts to improve this.
>>>>
>>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>>> available TDF_* flags are a good match, and using TDF_DETAILS would 
>>>> blow
>>>> up the dump file, since both threaders continually call the solver to
>>>> try out candidates.  This will allow dumping path solver details 
>>>> without
>>>> having to resort to hacking the source.
>>>>
>>>> I am also dumping the current registered_jump_thread dbg counter used
>>>> by the registry, in the solver.  That way narrowing down a problematic
>>>> thread can then be examined by -fdump-*-threading and looking at the
>>>> solver details surrounding the appropriate counter (which the dbgcnt
>>>> also dumps to the dump file).
>>>>
>>>> You still need knowledge of the solver to debug these issues, but at
>>>> least now it's not entirely opaque.
>>>>
>>>> OK?
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>     * dbgcnt.c (dbg_cnt_counter): New.
>>>>     * dbgcnt.h (dbg_cnt_counter): New.
>>>>     * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>>     * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>>     * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>>     * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>>     debug counter.
>>> OK.
>>>
>>> Note we've got massive failures in the tester starting sometime 
>>> yesterday and I suspect all the threader work.    So I'm going to 
>>> slow down on reviews of that code as we stabilize stuff.
>>
>> Fair enough.  Let's knock those out then.
>So several are failing gcc.dg/loop-unswitch-3.c.
>
>This test appears to be verifying that we unswitch a test in one of the 
>loops, which is no longer happening after the change to replace the VRP 
>threader with the hybrid forward threader.
>
>So both the old VRP threader and the new style identify and realize a 
>single jump thread.
>
>In the old VRP threader realization of the jump thread ends up creating 
>nested loops.  In the new implementation we end up creating a single 
>loop with two back edges to the header.
>
>ie, the (partial) graphs look like this
>
>OLD
>
>        1<--+
>        |      |
>+->  2     |
>|    /   \   |
>|  3     4  |
>+- +     +-+
>
>NEW
>
>
>+->  2 <-+
>|    /   \   |
>|  3     4  |
>+- +     +-+
>
>
>I wonder if we're not doing proper loop fixups or something similar 
>after that change.  IIRC we have/had bits in the copier and CFG update 
>code to mark the loops that need re-analysis and fixing up.
>
>Anyway, you should be able to trigger and analyze with a cross compiler.
>
>I've got to switch to my day job, but I'll pass along more as I get a 
>chance to look at them.

If you're stuck I'm also happy to help. Note that relying on loop fixup is almost never good because we easily lose track of loop association of info like OMP simd loops and all loop pragmas. 

Richard. 

>jeff
>
>
>
  
Richard Biener Sept. 28, 2021, 4:05 p.m. UTC | #6
On September 28, 2021 5:45:52 PM GMT+02:00, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
>
>On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>>
>>
>> On 9/28/21 3:47 PM, Jeff Law wrote:
>>>
>>>
>>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>>> In analyzing PR102511, it has become abundantly clear that we need
>>>> better debugging aids for the jump threader solver.  Currently
>>>> debugging these issues is a nightmare if you're not intimately
>>>> familiar with the code.  This patch attempts to improve this.
>>>>
>>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>>> available TDF_* flags are a good match, and using TDF_DETAILS would 
>>>> blow
>>>> up the dump file, since both threaders continually call the solver to
>>>> try out candidates.  This will allow dumping path solver details 
>>>> without
>>>> having to resort to hacking the source.
>>>>
>>>> I am also dumping the current registered_jump_thread dbg counter used
>>>> by the registry, in the solver.  That way narrowing down a problematic
>>>> thread can then be examined by -fdump-*-threading and looking at the
>>>> solver details surrounding the appropriate counter (which the dbgcnt
>>>> also dumps to the dump file).
>>>>
>>>> You still need knowledge of the solver to debug these issues, but at
>>>> least now it's not entirely opaque.
>>>>
>>>> OK?
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>>     * dbgcnt.c (dbg_cnt_counter): New.
>>>>     * dbgcnt.h (dbg_cnt_counter): New.
>>>>     * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>>     * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>>     * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>>     * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>>     debug counter.
>>> OK.
>>>
>>> Note we've got massive failures in the tester starting sometime 
>>> yesterday and I suspect all the threader work.    So I'm going to 
>>> slow down on reviews of that code as we stabilize stuff.
>>
>> Fair enough.  Let's knock those out then.
>So several are failing gcc.dg/loop-unswitch-3.c.
>
>This test appears to be verifying that we unswitch a test in one of the 
>loops, which is no longer happening after the change to replace the VRP 
>threader with the hybrid forward threader.
>
>So both the old VRP threader and the new style identify and realize a 
>single jump thread.
>
>In the old VRP threader realization of the jump thread ends up creating 
>nested loops.  In the new implementation we end up creating a single 
>loop with two back edges to the header.
>
>ie, the (partial) graphs look like this
>
>OLD
>
>        1<--+
>        |      |
>+->  2     |
>|    /   \   |
>|  3     4  |
>+- +     +-+
>
>NEW
>
>
>+->  2 <-+
>|    /   \   |
>|  3     4  |
>+- +     +-+
>
>
>I wonder if we're not doing proper loop fixups or something similar 
>after that change.  IIRC we have/had bits in the copier and CFG update 
>code to mark the loops that need re-analysis and fixing up.
>
>Anyway, you should be able to trigger and analyze with a cross compiler.
>
>I've got to switch to my day job, but I'll pass along more as I get a 
>chance to look at them.

If you're stuck I'm also happy to help. Note that relying on loop fixup is almost never good because we easily lose track of loop association of info like OMP simd loops and all loop pragmas. 

Richard. 

>jeff
>
>
>
  
Aldy Hernandez Sept. 28, 2021, 4:13 p.m. UTC | #7
On 9/28/21 6:05 PM, Richard Biener wrote:
> On September 28, 2021 5:45:52 PM GMT+02:00, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>
>>
>> On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>>>
>>>
>>> On 9/28/21 3:47 PM, Jeff Law wrote:
>>>>
>>>>
>>>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>>>> In analyzing PR102511, it has become abundantly clear that we need
>>>>> better debugging aids for the jump threader solver.  Currently
>>>>> debugging these issues is a nightmare if you're not intimately
>>>>> familiar with the code.  This patch attempts to improve this.
>>>>>
>>>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>>>> available TDF_* flags are a good match, and using TDF_DETAILS would
>>>>> blow
>>>>> up the dump file, since both threaders continually call the solver to
>>>>> try out candidates.  This will allow dumping path solver details
>>>>> without
>>>>> having to resort to hacking the source.
>>>>>
>>>>> I am also dumping the current registered_jump_thread dbg counter used
>>>>> by the registry, in the solver.  That way narrowing down a problematic
>>>>> thread can then be examined by -fdump-*-threading and looking at the
>>>>> solver details surrounding the appropriate counter (which the dbgcnt
>>>>> also dumps to the dump file).
>>>>>
>>>>> You still need knowledge of the solver to debug these issues, but at
>>>>> least now it's not entirely opaque.
>>>>>
>>>>> OK?
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>>      * dbgcnt.c (dbg_cnt_counter): New.
>>>>>      * dbgcnt.h (dbg_cnt_counter): New.
>>>>>      * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>>>      * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>>>      * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>>>      * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>>>      debug counter.
>>>> OK.
>>>>
>>>> Note we've got massive failures in the tester starting sometime
>>>> yesterday and I suspect all the threader work.    So I'm going to
>>>> slow down on reviews of that code as we stabilize stuff.
>>>
>>> Fair enough.  Let's knock those out then.
>> So several are failing gcc.dg/loop-unswitch-3.c.
>>
>> This test appears to be verifying that we unswitch a test in one of the
>> loops, which is no longer happening after the change to replace the VRP
>> threader with the hybrid forward threader.
>>
>> So both the old VRP threader and the new style identify and realize a
>> single jump thread.
>>
>> In the old VRP threader realization of the jump thread ends up creating
>> nested loops.  In the new implementation we end up creating a single
>> loop with two back edges to the header.
>>
>> ie, the (partial) graphs look like this
>>
>> OLD
>>
>>         1<--+
>>         |      |
>> +->  2     |
>> |    /   \   |
>> |  3     4  |
>> +- +     +-+
>>
>> NEW
>>
>>
>> +->  2 <-+
>> |    /   \   |
>> |  3     4  |
>> +- +     +-+
>>
>>
>> I wonder if we're not doing proper loop fixups or something similar
>> after that change.  IIRC we have/had bits in the copier and CFG update
>> code to mark the loops that need re-analysis and fixing up.
>>
>> Anyway, you should be able to trigger and analyze with a cross compiler.
>>
>> I've got to switch to my day job, but I'll pass along more as I get a
>> chance to look at them.
> 
> If you're stuck I'm also happy to help. Note that relying on loop fixup is almost never good because we easily lose track of loop association of info like OMP simd loops and all loop pragmas.

I could absolutely use the help here.  Care to take a look?

Aldy
  
Richard Biener Sept. 29, 2021, 8:53 a.m. UTC | #8
On Tue, Sep 28, 2021 at 6:13 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>
>
>
> On 9/28/21 6:05 PM, Richard Biener wrote:
> > On September 28, 2021 5:45:52 PM GMT+02:00, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> >>
> >>
> >> On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
> >>>
> >>>
> >>> On 9/28/21 3:47 PM, Jeff Law wrote:
> >>>>
> >>>>
> >>>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
> >>>>> In analyzing PR102511, it has become abundantly clear that we need
> >>>>> better debugging aids for the jump threader solver.  Currently
> >>>>> debugging these issues is a nightmare if you're not intimately
> >>>>> familiar with the code.  This patch attempts to improve this.
> >>>>>
> >>>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
> >>>>> available TDF_* flags are a good match, and using TDF_DETAILS would
> >>>>> blow
> >>>>> up the dump file, since both threaders continually call the solver to
> >>>>> try out candidates.  This will allow dumping path solver details
> >>>>> without
> >>>>> having to resort to hacking the source.
> >>>>>
> >>>>> I am also dumping the current registered_jump_thread dbg counter used
> >>>>> by the registry, in the solver.  That way narrowing down a problematic
> >>>>> thread can then be examined by -fdump-*-threading and looking at the
> >>>>> solver details surrounding the appropriate counter (which the dbgcnt
> >>>>> also dumps to the dump file).
> >>>>>
> >>>>> You still need knowledge of the solver to debug these issues, but at
> >>>>> least now it's not entirely opaque.
> >>>>>
> >>>>> OK?
> >>>>>
> >>>>> gcc/ChangeLog:
> >>>>>
> >>>>>      * dbgcnt.c (dbg_cnt_counter): New.
> >>>>>      * dbgcnt.h (dbg_cnt_counter): New.
> >>>>>      * dumpfile.c (dump_options): Add entry for TDF_THREADING.
> >>>>>      * dumpfile.h (enum dump_flag): Add TDF_THREADING.
> >>>>>      * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
> >>>>>      * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
> >>>>>      debug counter.
> >>>> OK.
> >>>>
> >>>> Note we've got massive failures in the tester starting sometime
> >>>> yesterday and I suspect all the threader work.    So I'm going to
> >>>> slow down on reviews of that code as we stabilize stuff.
> >>>
> >>> Fair enough.  Let's knock those out then.
> >> So several are failing gcc.dg/loop-unswitch-3.c.
> >>
> >> This test appears to be verifying that we unswitch a test in one of the
> >> loops, which is no longer happening after the change to replace the VRP
> >> threader with the hybrid forward threader.
> >>
> >> So both the old VRP threader and the new style identify and realize a
> >> single jump thread.
> >>
> >> In the old VRP threader realization of the jump thread ends up creating
> >> nested loops.  In the new implementation we end up creating a single
> >> loop with two back edges to the header.
> >>
> >> ie, the (partial) graphs look like this
> >>
> >> OLD
> >>
> >>         1<--+
> >>         |      |
> >> +->  2     |
> >> |    /   \   |
> >> |  3     4  |
> >> +- +     +-+
> >>
> >> NEW
> >>
> >>
> >> +->  2 <-+
> >> |    /   \   |
> >> |  3     4  |
> >> +- +     +-+
> >>
> >>
> >> I wonder if we're not doing proper loop fixups or something similar
> >> after that change.  IIRC we have/had bits in the copier and CFG update
> >> code to mark the loops that need re-analysis and fixing up.
> >>
> >> Anyway, you should be able to trigger and analyze with a cross compiler.
> >>
> >> I've got to switch to my day job, but I'll pass along more as I get a
> >> chance to look at them.
> >
> > If you're stuck I'm also happy to help. Note that relying on loop fixup is almost never good because we easily lose track of loop association of info like OMP simd loops and all loop pragmas.
>
> I could absolutely use the help here.  Care to take a look?

Any hint on which target FAILs gcc.dg/loop-unswitch-3.c?

Richard.

> Aldy
>
  
Jeff Law Sept. 30, 2021, 12:49 a.m. UTC | #9
On 9/29/2021 2:53 AM, Richard Biener wrote:
> On Tue, Sep 28, 2021 at 6:13 PM Aldy Hernandez <aldyh@redhat.com> wrote:
>>
>>
>> On 9/28/21 6:05 PM, Richard Biener wrote:
>>> On September 28, 2021 5:45:52 PM GMT+02:00, Jeff Law via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>>>>>
>>>>> On 9/28/21 3:47 PM, Jeff Law wrote:
>>>>>>
>>>>>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>>>>>> In analyzing PR102511, it has become abundantly clear that we need
>>>>>>> better debugging aids for the jump threader solver.  Currently
>>>>>>> debugging these issues is a nightmare if you're not intimately
>>>>>>> familiar with the code.  This patch attempts to improve this.
>>>>>>>
>>>>>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>>>>>> available TDF_* flags are a good match, and using TDF_DETAILS would
>>>>>>> blow
>>>>>>> up the dump file, since both threaders continually call the solver to
>>>>>>> try out candidates.  This will allow dumping path solver details
>>>>>>> without
>>>>>>> having to resort to hacking the source.
>>>>>>>
>>>>>>> I am also dumping the current registered_jump_thread dbg counter used
>>>>>>> by the registry, in the solver.  That way narrowing down a problematic
>>>>>>> thread can then be examined by -fdump-*-threading and looking at the
>>>>>>> solver details surrounding the appropriate counter (which the dbgcnt
>>>>>>> also dumps to the dump file).
>>>>>>>
>>>>>>> You still need knowledge of the solver to debug these issues, but at
>>>>>>> least now it's not entirely opaque.
>>>>>>>
>>>>>>> OK?
>>>>>>>
>>>>>>> gcc/ChangeLog:
>>>>>>>
>>>>>>>       * dbgcnt.c (dbg_cnt_counter): New.
>>>>>>>       * dbgcnt.h (dbg_cnt_counter): New.
>>>>>>>       * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>>>>>       * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>>>>>       * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>>>>>       * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>>>>>       debug counter.
>>>>>> OK.
>>>>>>
>>>>>> Note we've got massive failures in the tester starting sometime
>>>>>> yesterday and I suspect all the threader work.    So I'm going to
>>>>>> slow down on reviews of that code as we stabilize stuff.
>>>>> Fair enough.  Let's knock those out then.
>>>> So several are failing gcc.dg/loop-unswitch-3.c.
>>>>
>>>> This test appears to be verifying that we unswitch a test in one of the
>>>> loops, which is no longer happening after the change to replace the VRP
>>>> threader with the hybrid forward threader.
>>>>
>>>> So both the old VRP threader and the new style identify and realize a
>>>> single jump thread.
>>>>
>>>> In the old VRP threader realization of the jump thread ends up creating
>>>> nested loops.  In the new implementation we end up creating a single
>>>> loop with two back edges to the header.
>>>>
>>>> ie, the (partial) graphs look like this
>>>>
>>>> OLD
>>>>
>>>>          1<--+
>>>>          |      |
>>>> +->  2     |
>>>> |    /   \   |
>>>> |  3     4  |
>>>> +- +     +-+
>>>>
>>>> NEW
>>>>
>>>>
>>>> +->  2 <-+
>>>> |    /   \   |
>>>> |  3     4  |
>>>> +- +     +-+
>>>>
>>>>
>>>> I wonder if we're not doing proper loop fixups or something similar
>>>> after that change.  IIRC we have/had bits in the copier and CFG update
>>>> code to mark the loops that need re-analysis and fixing up.
>>>>
>>>> Anyway, you should be able to trigger and analyze with a cross compiler.
>>>>
>>>> I've got to switch to my day job, but I'll pass along more as I get a
>>>> chance to look at them.
>>> If you're stuck I'm also happy to help. Note that relying on loop fixup is almost never good because we easily lose track of loop association of info like OMP simd loops and all loop pragmas.
>> I could absolutely use the help here.  Care to take a look?
> Any hint on which target FAILs gcc.dg/loop-unswitch-3.c?
I thought it was cris-elf, but I just re-enabled loop-unswitch-3.c and 
cris-elf handled it just fine.  That makes me wonder if it was a testing 
patch from Aldy that I might have had in my tree.

jeff
  
Jeff Law Sept. 30, 2021, 6:26 p.m. UTC | #10
On 9/28/2021 7:53 AM, Aldy Hernandez wrote:
>
>
> On 9/28/21 3:47 PM, Jeff Law wrote:
>>
>>
>> On 9/28/2021 3:45 AM, Aldy Hernandez wrote:
>>> In analyzing PR102511, it has become abundantly clear that we need
>>> better debugging aids for the jump threader solver.  Currently
>>> debugging these issues is a nightmare if you're not intimately
>>> familiar with the code.  This patch attempts to improve this.
>>>
>>> First, I'm enabling path solver dumps with TDF_THREADING. None of the
>>> available TDF_* flags are a good match, and using TDF_DETAILS would 
>>> blow
>>> up the dump file, since both threaders continually call the solver to
>>> try out candidates.  This will allow dumping path solver details 
>>> without
>>> having to resort to hacking the source.
>>>
>>> I am also dumping the current registered_jump_thread dbg counter used
>>> by the registry, in the solver.  That way narrowing down a problematic
>>> thread can then be examined by -fdump-*-threading and looking at the
>>> solver details surrounding the appropriate counter (which the dbgcnt
>>> also dumps to the dump file).
>>>
>>> You still need knowledge of the solver to debug these issues, but at
>>> least now it's not entirely opaque.
>>>
>>> OK?
>>>
>>> gcc/ChangeLog:
>>>
>>>     * dbgcnt.c (dbg_cnt_counter): New.
>>>     * dbgcnt.h (dbg_cnt_counter): New.
>>>     * dumpfile.c (dump_options): Add entry for TDF_THREADING.
>>>     * dumpfile.h (enum dump_flag): Add TDF_THREADING.
>>>     * gimple-range-path.cc (DEBUG_SOLVER): Use TDF_THREADING.
>>>     * tree-ssa-threadupdate.c (dump_jump_thread_path): Dump out
>>>     debug counter.
>> OK.
>>
>> Note we've got massive failures in the tester starting sometime 
>> yesterday and I suspect all the threader work.    So I'm going to 
>> slow down on reviews of that code as we stabilize stuff.
>
> Fair enough.  Let's knock those out then.
So I'm really wondering if these were caused by that patch you'd sent me 
privately for the visium issue.  Right now we're regressing in a few 
places, but it's not bad.

visium & bfin are the only embedded targets failing.

visium fails:
Tests that now fail, but worked before (9 tests):

visium-sim: gcc.c-torture/execute/960218-1.c   -Os  (test for excess errors)
visium-sim: gcc.c-torture/execute/961125-1.c   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for 
excess errors)
visium-sim: gcc.c-torture/execute/961125-1.c   -O3 -g  (test for excess 
errors)
visium-sim: gcc.c-torture/execute/pending-4.c   -O1  (test for excess 
errors)
visium-sim: gcc.c-torture/execute/pr58209.c   -O2  (test for excess errors)
visium-sim: gcc.c-torture/execute/pr58209.c   -O2 -flto 
-fno-use-linker-plugin -flto-partition=none  (test for excess errors)
visium-sim: gcc.c-torture/execute/pr58209.c   -O2 -flto 
-fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
visium-sim: gcc.c-torture/execute/pr58209.c   -O3 -g  (test for excess 
errors)
visium-sim: gcc.c-torture/execute/pr68911.c   -O1  (test for excess errors)

We've already discussed 960218-1 a bit.  I wouldn't be surprised if 
they're all the same problem in the end.  These started with:

commit 4a960d548b7d7d942f316c5295f6d849b74214f5 (HEAD, refs/bisect/bad)
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Thu Sep 23 10:59:24 2021 +0200

     Avoid invalid loop transformations in jump threading registry.
  
Aldy Hernandez Oct. 4, 2021, 12:05 p.m. UTC | #11
On Thu, Sep 30, 2021 at 8:26 PM Jeff Law <jeffreyalaw@gmail.com> wrote:

> So I'm really wondering if these were caused by that patch you'd sent me
> privately for the visium issue.  Right now we're regressing in a few
> places, but it's not bad.
>
> visium & bfin are the only embedded targets failing.
>
> visium fails:
> Tests that now fail, but worked before (9 tests):
>
> visium-sim: gcc.c-torture/execute/960218-1.c   -Os  (test for excess errors)
> visium-sim: gcc.c-torture/execute/961125-1.c   -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
> excess errors)
> visium-sim: gcc.c-torture/execute/961125-1.c   -O3 -g  (test for excess
> errors)
> visium-sim: gcc.c-torture/execute/pending-4.c   -O1  (test for excess
> errors)
> visium-sim: gcc.c-torture/execute/pr58209.c   -O2  (test for excess errors)
> visium-sim: gcc.c-torture/execute/pr58209.c   -O2 -flto
> -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
> visium-sim: gcc.c-torture/execute/pr58209.c   -O2 -flto
> -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
> visium-sim: gcc.c-torture/execute/pr58209.c   -O3 -g  (test for excess
> errors)
> visium-sim: gcc.c-torture/execute/pr68911.c   -O1  (test for excess errors)
>
> We've already discussed 960218-1 a bit.  I wouldn't be surprised if
> they're all the same problem in the end.  These started with:

Is this still an issue?  I'm having some trouble reproducing it.

I tried building a combined tree with adapted instructions from here
(dunno if they still apply):

https://gcc.gnu.org/wiki/Building_Cross_Toolchains_with_gcc

Building visium-sim fails with:

    BFD does not support target visium-sim-none.

Building visium-elf or even bfin-elf has libctf compile errors.

Instead, I tried just building cc1 for both visium and bfin but I
don't get any *compilation* errors from the above tests.  I'm assuming
compilation errors since they say "test for excess errors".  Are they
compilation errors?

If this is still an issue, is there an easy way to reproduce?

Thanks.
Aldy
  
Jeff Law Oct. 4, 2021, 1:29 p.m. UTC | #12
On 10/4/2021 6:05 AM, Aldy Hernandez wrote:
> On Thu, Sep 30, 2021 at 8:26 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>> So I'm really wondering if these were caused by that patch you'd sent me
>> privately for the visium issue.  Right now we're regressing in a few
>> places, but it's not bad.
>>
>> visium & bfin are the only embedded targets failing.
>>
>> visium fails:
>> Tests that now fail, but worked before (9 tests):
>>
>> visium-sim: gcc.c-torture/execute/960218-1.c   -Os  (test for excess errors)
>> visium-sim: gcc.c-torture/execute/961125-1.c   -O3 -fomit-frame-pointer
>> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for
>> excess errors)
>> visium-sim: gcc.c-torture/execute/961125-1.c   -O3 -g  (test for excess
>> errors)
>> visium-sim: gcc.c-torture/execute/pending-4.c   -O1  (test for excess
>> errors)
>> visium-sim: gcc.c-torture/execute/pr58209.c   -O2  (test for excess errors)
>> visium-sim: gcc.c-torture/execute/pr58209.c   -O2 -flto
>> -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
>> visium-sim: gcc.c-torture/execute/pr58209.c   -O2 -flto
>> -fuse-linker-plugin -fno-fat-lto-objects  (test for excess errors)
>> visium-sim: gcc.c-torture/execute/pr58209.c   -O3 -g  (test for excess
>> errors)
>> visium-sim: gcc.c-torture/execute/pr68911.c   -O1  (test for excess errors)
>>
>> We've already discussed 960218-1 a bit.  I wouldn't be surprised if
>> they're all the same problem in the end.  These started with:
> Is this still an issue?  I'm having some trouble reproducing it.
>
> I tried building a combined tree with adapted instructions from here
> (dunno if they still apply):
>
> https://gcc.gnu.org/wiki/Building_Cross_Toolchains_with_gcc
>
> Building visium-sim fails with:
>
>      BFD does not support target visium-sim-none.
>
> Building visium-elf or even bfin-elf has libctf compile errors.
>
> Instead, I tried just building cc1 for both visium and bfin but I
> don't get any *compilation* errors from the above tests.  I'm assuming
> compilation errors since they say "test for excess errors".  Are they
> compilation errors?
>
> If this is still an issue, is there an easy way to reproduce?
For the visium stuff, look for a call to abort () in the resulting 
assembly code of 960218-1.  Due to the newlib/libgloss bug on the visum 
port, any call to abort () results in a link error which triggers the 
excess errors failure.

Jeff
  

Patch

diff --git a/gcc/dbgcnt.c b/gcc/dbgcnt.c
index 934bbe033ee..6a7eb34cd3e 100644
--- a/gcc/dbgcnt.c
+++ b/gcc/dbgcnt.c
@@ -98,6 +98,14 @@  dbg_cnt (enum debug_counter index)
     return false;
 }
 
+/* Return the counter for INDEX.  */
+
+unsigned
+dbg_cnt_counter (enum debug_counter index)
+{
+  return count[index];
+}
+
 /* Compare limit_tuple intervals by first item in descending order.  */
 
 static int
diff --git a/gcc/dbgcnt.h b/gcc/dbgcnt.h
index 17f2091f5a7..3c35dcc3e0a 100644
--- a/gcc/dbgcnt.h
+++ b/gcc/dbgcnt.h
@@ -33,6 +33,7 @@  enum debug_counter {
 
 extern bool dbg_cnt_is_enabled (enum debug_counter index);
 extern bool dbg_cnt (enum debug_counter index);
+extern unsigned dbg_cnt_counter (enum debug_counter index);
 extern void dbg_cnt_process_opt (const char *arg);
 extern void dbg_cnt_list_all_counters (void);
 
diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c
index 8169daf7f59..e6ead5debe5 100644
--- a/gcc/dumpfile.c
+++ b/gcc/dumpfile.c
@@ -145,6 +145,7 @@  static const kv_pair<dump_flags_t> dump_options[] =
   {"missed", MSG_MISSED_OPTIMIZATION},
   {"note", MSG_NOTE},
   {"optall", MSG_ALL_KINDS},
+  {"threading", TDF_THREADING},
   {"all", dump_flags_t (TDF_ALL_VALUES
 			& ~(TDF_RAW | TDF_SLIM | TDF_LINENO | TDF_GRAPH
 			    | TDF_STMTADDR | TDF_RHS_ONLY | TDF_NOUID
diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h
index 892bfc9ae90..6c7758dd2fb 100644
--- a/gcc/dumpfile.h
+++ b/gcc/dumpfile.h
@@ -197,6 +197,9 @@  enum dump_flag
   /* For error.  */
   TDF_ERROR = (1 << 26),
 
+  /* Dumping for range path solver.  */
+  TDF_THREADING = (1 << 27),
+
   /* All values.  */
   TDF_ALL_VALUES = (1 << 29) - 1
 };
diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index 9da67d2a35b..a29d5318ca9 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -34,7 +34,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 
 // Internal construct to help facilitate debugging of solver.
-#define DEBUG_SOLVER (0 && dump_file)
+#define DEBUG_SOLVER (dump_file && dump_flags & TDF_THREADING)
 
 path_range_query::path_range_query (gimple_ranger &ranger, bool resolve)
   : m_ranger (ranger)
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index cf96c903668..905dea2e6ca 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -218,10 +218,15 @@  dump_jump_thread_path (FILE *dump_file,
 		       const vec<jump_thread_edge *> &path,
 		       bool registering)
 {
-  fprintf (dump_file,
-	   "  %s jump thread: (%d, %d) incoming edge; ",
-	   (registering ? "Registering" : "Cancelling"),
-	   path[0]->e->src->index, path[0]->e->dest->index);
+  if (registering)
+    fprintf (dump_file,
+	     "  [%u] Registering jump thread: (%d, %d) incoming edge; ",
+	     dbg_cnt_counter (registered_jump_thread),
+	     path[0]->e->src->index, path[0]->e->dest->index);
+  else
+    fprintf (dump_file,
+	     "  Cancelling jump thread: (%d, %d) incoming edge; ",
+	     path[0]->e->src->index, path[0]->e->dest->index);
 
   for (unsigned int i = 1; i < path.length (); i++)
     {