[v2] doc: Document order of define_peephole2 scanning

Message ID 20230419031527.6D39120420@pchp3.se.axis.com
State New
Headers
Series [v2] doc: Document order of define_peephole2 scanning |

Commit Message

Hans-Peter Nilsson April 19, 2023, 3:15 a.m. UTC
  > From: Hans-Peter Nilsson <hp@axis.com>
> Date: Tue, 18 Apr 2023 20:44:12 +0200
> 
> > From: Paul Koning <paulkoning@comcast.net>
> 
> > Date: Tue, 18 Apr 2023 14:32:07 -0400
> > 
> > I'm not sure about the meaning of part of this.
> > "...resumes at the last generated insn."  Does that mean:

[...]

> (Neither...)

[...]

> Sorry, your confusement confuses me.  I just don't see how
> to confuse last with first or matched with generated. :)

It's 4:30am and things appear much clearer, in particular
wrt. confusion.  Hopefully the version below is clearer.
Here's also the example from 35 lines up in md.texi:

(define_peephole2
  [(match_scratch:SI 4 "r")
   (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" ""))
   (set (match_operand:SI 2 "" "") (match_dup 1))
   (match_dup 4)
   (set (match_operand:SI 3 "" "") (match_dup 1))]
  "/* @r{determine 1 does not overlap 0 and 2} */"
  [(set (match_dup 4) (match_dup 1))
   (set (match_dup 0) (match_dup 4))
   (set (match_dup 2) (match_dup 4))
   (set (match_dup 3) (match_dup 4))]
  "")

Approvers: pdf output reviewed.  Ok to commit?

All: thoughts on making define_peephole2 work "as expected";
"backtracing" so the replacement buffer ends with the first
generated replacement insn?  Might be simpler to restart at
the beginning of the BB, but I'm scared of overly long BB's.
Does anyone have statistics on the sizes of BB's in terms of
number of insns?

-- >8 --
I was a bit surprised when my define_peephole2 didn't match, but
it was because it was expected to partially match the generated
output of a previous define_peephole2.  I had assumed that the
algorithm backed-up the size of the match-buffer, thereby
exposing newly created opportunities with context to all
define_peephole2's.  While things can change in that direction,
let's start with documenting the current state.

	* doc/md.texi (define_peephole2): Document order of scanning.
---
 gcc/doc/md.texi | 8 ++++++++
 1 file changed, 8 insertions(+)
  

Comments

Hans-Peter Nilsson April 19, 2023, 4:06 a.m. UTC | #1
> From: Hans-Peter Nilsson <hp@axis.com>
> Date: Wed, 19 Apr 2023 05:15:27 +0200

> Approvers: pdf output reviewed.  Ok to commit?

Patch retracted, at least temporarily.  My "understanding"
may be clouded by looking at an actual bug.  Sigh.

brgds, H-P
  
Hans-Peter Nilsson April 19, 2023, 4:59 p.m. UTC | #2
> From: Hans-Peter Nilsson <hp@axis.com>
> Date: Wed, 19 Apr 2023 06:06:27 +0200
> 
> Patch retracted, at least temporarily.  My "understanding"
> may be clouded by looking at an actual bug.  Sigh.

Mea culpa.  I was looking at the result of one
define_peephole2 and thinking it was due to another, and
also tricked by incorrect code comments (patch posted, will
commit).

TL;DR: Matching indeed does resume with attempting to match
the *first* define_peephole2 replacement insn.  But the
match-and-replacement order is largely undocumented.

Anyway, the missing-context problem I ran into remains: if
you have an insn sequence {foo bar} and a define_peephole2
matching and replacing {bar} into {baz}, the resulting {foo
baz} *will not be matched* against a define_peephole2
looking for {foo baz}.  But, I'm not trying to document this
caveat specifically, though at least it'll now be implied by
the documentation.

This could be fixed by always backing up MAX_INSNS_PER_PEEP2
- 1 insns after a successful replacement.  I'm somewhat
worries that this would also mean lots of futile re-match
attempts.  Thoughts?

(I could also just restart at the BB start, but I see all
this support for backing-up live info by single insns being
used.  Taking notes about a partial match for the first insn
of a failed attempt, as the maximum need to back-up to,
doesn't look like it'd fly, judging from the nonspecific
looking (set dest src) patterns being the first in i386
define_peephole2's match sequences.)

So again: Approvers: pdf output reviewed.  Ok to commit?
-- >8 --
I was a bit surprised when my newly-added define_peephole2 didn't
match, but it was because it was expected to partially match the
generated output of a previous define_peephole2, which matched and
modified the last insn of a sequence to be matched.  I had assumed
that the algorithm backed-up the size of the match-buffer, thereby
exposing newly created opportunities *with sufficient context* to all
define_peephole2's.  While things can change in that direction, let's
start with documenting the current state.

	* doc/md.texi (define_peephole2): Document order of scanning.
---
 gcc/doc/md.texi | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 07bf8bdebffb..300d104d58ab 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor
 @code{FAIL}), then the @code{define_peephole2} uses the replacement
 template.
 
+Insns are scanned in forward order from beginning to end for each basic
+block.  Matches are attempted in order of @code{define_peephole2}
+appearance in the @file{md} file.  After a successful replacement,
+scanning for further opportunities for @code{define_peephole2}, resumes
+with the first generated replacement insn as the first insn to be
+matched against all @code{define_peephole2}.  For the example above,
+after its successful replacement, the first insn that can be matched by
+a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}.
+
 @end ifset
 @ifset INTERNALS
 @node Insn Attributes
  
Bernhard Reutner-Fischer April 19, 2023, 8:16 p.m. UTC | #3
[+list]
On 19 April 2023 21:21:18 CEST, Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> wrote:
>Hi H-P!
>
>This begs the question iff now (i fear it's not), upon successful match(es), the whole peepholes get re-run again per BB (ATM?), exposing more opportunities?
>If not, would we want to retry, at least for -fexpensive-optimisations (sp?) or would all hell break loose?
>
>Please also see below.
>
>On 19 April 2023 18:59:14 CEST, Hans-Peter Nilsson via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>> From: Hans-Peter Nilsson <hp@axis.com>
>>> Date: Wed, 19 Apr 2023 06:06:27 +0200
>>> 
>>> Patch retracted, at least temporarily.  My "understanding"
>>> may be clouded by looking at an actual bug.  Sigh.
>>
>>Mea culpa.  I was looking at the result of one
>>define_peephole2 and thinking it was due to another, and
>>also tricked by incorrect code comments (patch posted, will
>>commit).
>>
>>TL;DR: Matching indeed does resume with attempting to match
>>the *first* define_peephole2 replacement insn.  But the
>>match-and-replacement order is largely undocumented.
>>
>>Anyway, the missing-context problem I ran into remains: if
>>you have an insn sequence {foo bar} and a define_peephole2
>>matching and replacing {bar} into {baz}, the resulting {foo
>>baz} *will not be matched* against a define_peephole2
>>looking for {foo baz}.  But, I'm not trying to document this
>>caveat specifically, though at least it'll now be implied by
>>the documentation.
>>
>>This could be fixed by always backing up MAX_INSNS_PER_PEEP2
>>- 1 insns after a successful replacement.  I'm somewhat
>>worries that this would also mean lots of futile re-match
>>attempts.  Thoughts?
>
>Good point. Probably. Do you have stats?
>
>If there is even a slight benefit, then some certainly would be willing to take that penalty for sure. I.e. iff it helps -Os or locality then such expensive optimisations certainly provide benefit for at least a few if not some, IMO.
>
>Just curious && cheers,
>
>>
>>(I could also just restart at the BB start, but I see all
>>this support for backing-up live info by single insns being
>>used.  Taking notes about a partial match for the first insn
>>of a failed attempt, as the maximum need to back-up to,
>>doesn't look like it'd fly, judging from the nonspecific
>>looking (set dest src) patterns being the first in i386
>>define_peephole2's match sequences.)
>
  
Hans-Peter Nilsson April 20, 2023, 2:57 a.m. UTC | #4
> Date: Wed, 19 Apr 2023 22:16:36 +0200
> From: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com>

> On 19 April 2023 21:21:18 CEST, Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> wrote:
> >Hi H-P!
> >
> >This begs the question iff now (i fear it's not), upon
> >successful match(es), the whole peepholes get re-run
> >again per BB (ATM?), exposing more opportunities?

(Not sure IIUC the question, but:) no; as mentioned the
peephole2 machinery doesn't back up to include the context
of longer sequences when having done replacement for a
shorter match.  It resumes matching at the beginning of the
shorter, matched and replaced, sequence.

> >If not, would we want to retry, at least for
> >-fexpensive-optimisations (sp?) or would all hell break
> >loose?

I don't think hell would break loose, but maybe slowdown
and/or buggy define_peephole2s would be weeded out.  Or
something else entirely unexpected. :)

> >Please also see below.
> >
> >On 19 April 2023 18:59:14 CEST, Hans-Peter Nilsson via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> >>Anyway, the missing-context problem I ran into remains: if
> >>you have an insn sequence {foo bar} and a define_peephole2
> >>matching and replacing {bar} into {baz}, the resulting {foo
> >>baz} *will not be matched* against a define_peephole2
> >>looking for {foo baz}.  But, I'm not trying to document this
> >>caveat specifically, though at least it'll now be implied by
> >>the documentation.
> >>
> >>This could be fixed by always backing up MAX_INSNS_PER_PEEP2
> >>- 1 insns after a successful replacement.  I'm somewhat
> >>worries that this would also mean lots of futile re-match
> >>attempts.  Thoughts?
> >
> >Good point. Probably. Do you have stats?

None whatsoever.  I'm going to test with the "original"
define_peephole2 for CRIS that was suffering, but with an
"extra" define_peephole2 that also does the modification for
the one that caused it do be missed and see if the original
free one still fires off.  (I see I cut down my foo bar baz
examples too much to make sense to explain that.)

> >If there is even a slight benefit, then some certainly
> >would be willing to take that penalty for sure. I.e. iff
> >it helps -Os or locality then such expensive
> >optimisations certainly provide benefit for at least a
> >few if not some, IMO.

Right.  Not sure I'll be doing it though, having other
things, gcc-related and others, on my plate.  If so, first
I'd do a crude attempt at getting statistics for x86_64, by
after a successful replacement, restarting at the beginning
of a BB and checking whether any define_peephole2 fires off
before reaching the first replaced insn.

brgds, H-P
  
Hans-Peter Nilsson April 26, 2023, 11:55 p.m. UTC | #5
> From: Hans-Peter Nilsson <hp@axis.com>
> Date: Wed, 19 Apr 2023 18:59:14 +0200
[...]

> So again: Approvers: pdf output reviewed.  Ok to commit?
> -- >8 --
> I was a bit surprised when my newly-added define_peephole2 didn't
> match, but it was because it was expected to partially match the
> generated output of a previous define_peephole2, which matched and
> modified the last insn of a sequence to be matched.  I had assumed
> that the algorithm backed-up the size of the match-buffer, thereby
> exposing newly created opportunities *with sufficient context* to all
> define_peephole2's.  While things can change in that direction, let's
> start with documenting the current state.
> 
> 	* doc/md.texi (define_peephole2): Document order of scanning.
> ---
>  gcc/doc/md.texi | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 07bf8bdebffb..300d104d58ab 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor
>  @code{FAIL}), then the @code{define_peephole2} uses the replacement
>  template.
>  
> +Insns are scanned in forward order from beginning to end for each basic
> +block.  Matches are attempted in order of @code{define_peephole2}
> +appearance in the @file{md} file.  After a successful replacement,
> +scanning for further opportunities for @code{define_peephole2}, resumes
> +with the first generated replacement insn as the first insn to be
> +matched against all @code{define_peephole2}.  For the example above,
> +after its successful replacement, the first insn that can be matched by
> +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}.
> +
>  @end ifset
>  @ifset INTERNALS
>  @node Insn Attributes
> -- 
> 2.30.2
>
  
Hans-Peter Nilsson May 4, 2023, 12:09 a.m. UTC | #6
Ping again.

> From: Hans-Peter Nilsson <hp@axis.com>
> Date: Thu, 27 Apr 2023 01:55:24 +0200
> 
> > From: Hans-Peter Nilsson <hp@axis.com>
> > Date: Wed, 19 Apr 2023 18:59:14 +0200
> [...]
> 
> > So again: Approvers: pdf output reviewed.  Ok to commit?
> > -- >8 --
> > I was a bit surprised when my newly-added define_peephole2 didn't
> > match, but it was because it was expected to partially match the
> > generated output of a previous define_peephole2, which matched and
> > modified the last insn of a sequence to be matched.  I had assumed
> > that the algorithm backed-up the size of the match-buffer, thereby
> > exposing newly created opportunities *with sufficient context* to all
> > define_peephole2's.  While things can change in that direction, let's
> > start with documenting the current state.
> > 
> > 	* doc/md.texi (define_peephole2): Document order of scanning.
> > ---
> >  gcc/doc/md.texi | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index 07bf8bdebffb..300d104d58ab 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor
> >  @code{FAIL}), then the @code{define_peephole2} uses the replacement
> >  template.
> >  
> > +Insns are scanned in forward order from beginning to end for each basic
> > +block.  Matches are attempted in order of @code{define_peephole2}
> > +appearance in the @file{md} file.  After a successful replacement,
> > +scanning for further opportunities for @code{define_peephole2}, resumes
> > +with the first generated replacement insn as the first insn to be
> > +matched against all @code{define_peephole2}.  For the example above,
> > +after its successful replacement, the first insn that can be matched by
> > +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}.
> > +
> >  @end ifset
> >  @ifset INTERNALS
> >  @node Insn Attributes
> > -- 
> > 2.30.2
> > 
>
  
Richard Biener May 4, 2023, 9:45 a.m. UTC | #7
On Thu, May 4, 2023 at 2:10 AM Hans-Peter Nilsson via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Ping again.

OK.

> > From: Hans-Peter Nilsson <hp@axis.com>
> > Date: Thu, 27 Apr 2023 01:55:24 +0200
> >
> > > From: Hans-Peter Nilsson <hp@axis.com>
> > > Date: Wed, 19 Apr 2023 18:59:14 +0200
> > [...]
> >
> > > So again: Approvers: pdf output reviewed.  Ok to commit?
> > > -- >8 --
> > > I was a bit surprised when my newly-added define_peephole2 didn't
> > > match, but it was because it was expected to partially match the
> > > generated output of a previous define_peephole2, which matched and
> > > modified the last insn of a sequence to be matched.  I had assumed
> > > that the algorithm backed-up the size of the match-buffer, thereby
> > > exposing newly created opportunities *with sufficient context* to all
> > > define_peephole2's.  While things can change in that direction, let's
> > > start with documenting the current state.
> > >
> > >     * doc/md.texi (define_peephole2): Document order of scanning.
> > > ---
> > >  gcc/doc/md.texi | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > > index 07bf8bdebffb..300d104d58ab 100644
> > > --- a/gcc/doc/md.texi
> > > +++ b/gcc/doc/md.texi
> > > @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor
> > >  @code{FAIL}), then the @code{define_peephole2} uses the replacement
> > >  template.
> > >
> > > +Insns are scanned in forward order from beginning to end for each basic
> > > +block.  Matches are attempted in order of @code{define_peephole2}
> > > +appearance in the @file{md} file.  After a successful replacement,
> > > +scanning for further opportunities for @code{define_peephole2}, resumes
> > > +with the first generated replacement insn as the first insn to be
> > > +matched against all @code{define_peephole2}.  For the example above,
> > > +after its successful replacement, the first insn that can be matched by
> > > +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}.
> > > +
> > >  @end ifset
> > >  @ifset INTERNALS
> > >  @node Insn Attributes
> > > --
> > > 2.30.2
> > >
> >
  

Patch

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 07bf8bdebffb..2ce043e6edc2 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -9362,6 +9362,14 @@  If the preparation falls through (invokes neither @code{DONE} nor
 @code{FAIL}), then the @code{define_peephole2} uses the replacement
 template.
 
+Insns are scanned in forward order from beginning to end for each basic
+block.  Matches are attempted in order of appearance in the @file{md}
+file.  After a successful replacement, scanning for further
+opportunities for @code{define_peephole2}, resumes with the last
+generated replacement insn as the first insn to be matched.  For the
+example above, the first insn that can be matched by another
+@code{define_peephole2}, is @code{(set (match_dup 3) (match_dup 4))}.
+
 @end ifset
 @ifset INTERNALS
 @node Insn Attributes