Message ID | 20230419031527.6D39120420@pchp3.se.axis.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 12EDB3858D28 for <patchwork@sourceware.org>; Wed, 19 Apr 2023 03:15:58 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 12EDB3858D28 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1681874158; bh=V5x2om15iUXMvFZxegmFvYdnCBC/8q47fNSa9Xq4sbs=; h=To:CC:In-Reply-To:Subject:References:Date:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=JOIViMCkuOGjdl0jQaNpB+UfnWCF2eBEN65mzs0sUjIniZW9meH3GRYfreG+NA48b av43iVFiI1yO0oVakl6t0sDuQs01uo3ahN+Og75l0/MUCt3WqFuVygmo14X9q/FPs2 bhOxZZ3kp3yEvMwFJISfeuz59JW7hnN9WelSev6c= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp2.axis.com (smtp2.axis.com [195.60.68.18]) by sourceware.org (Postfix) with ESMTPS id 1785D3858D1E for <gcc-patches@gcc.gnu.org>; Wed, 19 Apr 2023 03:15:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1785D3858D1E To: <gcc-patches@gcc.gnu.org> CC: <paulkoning@comcast.net> In-Reply-To: <20230418184412.95E632043B@pchp3.se.axis.com> (message from Hans-Peter Nilsson on Tue, 18 Apr 2023 20:44:12 +0200) Subject: [PATCH v2] doc: Document order of define_peephole2 scanning MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8BIT References: <20230418175507.2C40B2040B@pchp3.se.axis.com> <0442F4ED-45A1-44D3-AB1B-F28657F04D3F@comcast.net> <20230418184412.95E632043B@pchp3.se.axis.com> Message-ID: <20230419031527.6D39120420@pchp3.se.axis.com> Date: Wed, 19 Apr 2023 05:15:27 +0200 X-Spam-Status: No, score=-11.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Hans-Peter Nilsson via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Hans-Peter Nilsson <hp@axis.com> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
[v2] doc: Document order of define_peephole2 scanning
|
|
Commit Message
Hans-Peter Nilsson
April 19, 2023, 3:15 a.m. UTC
> From: Hans-Peter Nilsson <hp@axis.com> > Date: Tue, 18 Apr 2023 20:44:12 +0200 > > > From: Paul Koning <paulkoning@comcast.net> > > > Date: Tue, 18 Apr 2023 14:32:07 -0400 > > > > I'm not sure about the meaning of part of this. > > "...resumes at the last generated insn." Does that mean: [...] > (Neither...) [...] > Sorry, your confusement confuses me. I just don't see how > to confuse last with first or matched with generated. :) It's 4:30am and things appear much clearer, in particular wrt. confusion. Hopefully the version below is clearer. Here's also the example from 35 lines up in md.texi: (define_peephole2 [(match_scratch:SI 4 "r") (set (match_operand:SI 0 "" "") (match_operand:SI 1 "" "")) (set (match_operand:SI 2 "" "") (match_dup 1)) (match_dup 4) (set (match_operand:SI 3 "" "") (match_dup 1))] "/* @r{determine 1 does not overlap 0 and 2} */" [(set (match_dup 4) (match_dup 1)) (set (match_dup 0) (match_dup 4)) (set (match_dup 2) (match_dup 4)) (set (match_dup 3) (match_dup 4))] "") Approvers: pdf output reviewed. Ok to commit? All: thoughts on making define_peephole2 work "as expected"; "backtracing" so the replacement buffer ends with the first generated replacement insn? Might be simpler to restart at the beginning of the BB, but I'm scared of overly long BB's. Does anyone have statistics on the sizes of BB's in terms of number of insns? -- >8 -- I was a bit surprised when my define_peephole2 didn't match, but it was because it was expected to partially match the generated output of a previous define_peephole2. I had assumed that the algorithm backed-up the size of the match-buffer, thereby exposing newly created opportunities with context to all define_peephole2's. While things can change in that direction, let's start with documenting the current state. * doc/md.texi (define_peephole2): Document order of scanning. --- gcc/doc/md.texi | 8 ++++++++ 1 file changed, 8 insertions(+)
Comments
> From: Hans-Peter Nilsson <hp@axis.com> > Date: Wed, 19 Apr 2023 05:15:27 +0200 > Approvers: pdf output reviewed. Ok to commit? Patch retracted, at least temporarily. My "understanding" may be clouded by looking at an actual bug. Sigh. brgds, H-P
> From: Hans-Peter Nilsson <hp@axis.com> > Date: Wed, 19 Apr 2023 06:06:27 +0200 > > Patch retracted, at least temporarily. My "understanding" > may be clouded by looking at an actual bug. Sigh. Mea culpa. I was looking at the result of one define_peephole2 and thinking it was due to another, and also tricked by incorrect code comments (patch posted, will commit). TL;DR: Matching indeed does resume with attempting to match the *first* define_peephole2 replacement insn. But the match-and-replacement order is largely undocumented. Anyway, the missing-context problem I ran into remains: if you have an insn sequence {foo bar} and a define_peephole2 matching and replacing {bar} into {baz}, the resulting {foo baz} *will not be matched* against a define_peephole2 looking for {foo baz}. But, I'm not trying to document this caveat specifically, though at least it'll now be implied by the documentation. This could be fixed by always backing up MAX_INSNS_PER_PEEP2 - 1 insns after a successful replacement. I'm somewhat worries that this would also mean lots of futile re-match attempts. Thoughts? (I could also just restart at the BB start, but I see all this support for backing-up live info by single insns being used. Taking notes about a partial match for the first insn of a failed attempt, as the maximum need to back-up to, doesn't look like it'd fly, judging from the nonspecific looking (set dest src) patterns being the first in i386 define_peephole2's match sequences.) So again: Approvers: pdf output reviewed. Ok to commit? -- >8 -- I was a bit surprised when my newly-added define_peephole2 didn't match, but it was because it was expected to partially match the generated output of a previous define_peephole2, which matched and modified the last insn of a sequence to be matched. I had assumed that the algorithm backed-up the size of the match-buffer, thereby exposing newly created opportunities *with sufficient context* to all define_peephole2's. While things can change in that direction, let's start with documenting the current state. * doc/md.texi (define_peephole2): Document order of scanning. --- gcc/doc/md.texi | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 07bf8bdebffb..300d104d58ab 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor @code{FAIL}), then the @code{define_peephole2} uses the replacement template. +Insns are scanned in forward order from beginning to end for each basic +block. Matches are attempted in order of @code{define_peephole2} +appearance in the @file{md} file. After a successful replacement, +scanning for further opportunities for @code{define_peephole2}, resumes +with the first generated replacement insn as the first insn to be +matched against all @code{define_peephole2}. For the example above, +after its successful replacement, the first insn that can be matched by +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}. + @end ifset @ifset INTERNALS @node Insn Attributes
[+list] On 19 April 2023 21:21:18 CEST, Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> wrote: >Hi H-P! > >This begs the question iff now (i fear it's not), upon successful match(es), the whole peepholes get re-run again per BB (ATM?), exposing more opportunities? >If not, would we want to retry, at least for -fexpensive-optimisations (sp?) or would all hell break loose? > >Please also see below. > >On 19 April 2023 18:59:14 CEST, Hans-Peter Nilsson via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: >>> From: Hans-Peter Nilsson <hp@axis.com> >>> Date: Wed, 19 Apr 2023 06:06:27 +0200 >>> >>> Patch retracted, at least temporarily. My "understanding" >>> may be clouded by looking at an actual bug. Sigh. >> >>Mea culpa. I was looking at the result of one >>define_peephole2 and thinking it was due to another, and >>also tricked by incorrect code comments (patch posted, will >>commit). >> >>TL;DR: Matching indeed does resume with attempting to match >>the *first* define_peephole2 replacement insn. But the >>match-and-replacement order is largely undocumented. >> >>Anyway, the missing-context problem I ran into remains: if >>you have an insn sequence {foo bar} and a define_peephole2 >>matching and replacing {bar} into {baz}, the resulting {foo >>baz} *will not be matched* against a define_peephole2 >>looking for {foo baz}. But, I'm not trying to document this >>caveat specifically, though at least it'll now be implied by >>the documentation. >> >>This could be fixed by always backing up MAX_INSNS_PER_PEEP2 >>- 1 insns after a successful replacement. I'm somewhat >>worries that this would also mean lots of futile re-match >>attempts. Thoughts? > >Good point. Probably. Do you have stats? > >If there is even a slight benefit, then some certainly would be willing to take that penalty for sure. I.e. iff it helps -Os or locality then such expensive optimisations certainly provide benefit for at least a few if not some, IMO. > >Just curious && cheers, > >> >>(I could also just restart at the BB start, but I see all >>this support for backing-up live info by single insns being >>used. Taking notes about a partial match for the first insn >>of a failed attempt, as the maximum need to back-up to, >>doesn't look like it'd fly, judging from the nonspecific >>looking (set dest src) patterns being the first in i386 >>define_peephole2's match sequences.) >
> Date: Wed, 19 Apr 2023 22:16:36 +0200 > From: Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> > On 19 April 2023 21:21:18 CEST, Bernhard Reutner-Fischer <rep.dot.nop@gmail.com> wrote: > >Hi H-P! > > > >This begs the question iff now (i fear it's not), upon > >successful match(es), the whole peepholes get re-run > >again per BB (ATM?), exposing more opportunities? (Not sure IIUC the question, but:) no; as mentioned the peephole2 machinery doesn't back up to include the context of longer sequences when having done replacement for a shorter match. It resumes matching at the beginning of the shorter, matched and replaced, sequence. > >If not, would we want to retry, at least for > >-fexpensive-optimisations (sp?) or would all hell break > >loose? I don't think hell would break loose, but maybe slowdown and/or buggy define_peephole2s would be weeded out. Or something else entirely unexpected. :) > >Please also see below. > > > >On 19 April 2023 18:59:14 CEST, Hans-Peter Nilsson via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > >>Anyway, the missing-context problem I ran into remains: if > >>you have an insn sequence {foo bar} and a define_peephole2 > >>matching and replacing {bar} into {baz}, the resulting {foo > >>baz} *will not be matched* against a define_peephole2 > >>looking for {foo baz}. But, I'm not trying to document this > >>caveat specifically, though at least it'll now be implied by > >>the documentation. > >> > >>This could be fixed by always backing up MAX_INSNS_PER_PEEP2 > >>- 1 insns after a successful replacement. I'm somewhat > >>worries that this would also mean lots of futile re-match > >>attempts. Thoughts? > > > >Good point. Probably. Do you have stats? None whatsoever. I'm going to test with the "original" define_peephole2 for CRIS that was suffering, but with an "extra" define_peephole2 that also does the modification for the one that caused it do be missed and see if the original free one still fires off. (I see I cut down my foo bar baz examples too much to make sense to explain that.) > >If there is even a slight benefit, then some certainly > >would be willing to take that penalty for sure. I.e. iff > >it helps -Os or locality then such expensive > >optimisations certainly provide benefit for at least a > >few if not some, IMO. Right. Not sure I'll be doing it though, having other things, gcc-related and others, on my plate. If so, first I'd do a crude attempt at getting statistics for x86_64, by after a successful replacement, restarting at the beginning of a BB and checking whether any define_peephole2 fires off before reaching the first replaced insn. brgds, H-P
> From: Hans-Peter Nilsson <hp@axis.com> > Date: Wed, 19 Apr 2023 18:59:14 +0200 [...] > So again: Approvers: pdf output reviewed. Ok to commit? > -- >8 -- > I was a bit surprised when my newly-added define_peephole2 didn't > match, but it was because it was expected to partially match the > generated output of a previous define_peephole2, which matched and > modified the last insn of a sequence to be matched. I had assumed > that the algorithm backed-up the size of the match-buffer, thereby > exposing newly created opportunities *with sufficient context* to all > define_peephole2's. While things can change in that direction, let's > start with documenting the current state. > > * doc/md.texi (define_peephole2): Document order of scanning. > --- > gcc/doc/md.texi | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > index 07bf8bdebffb..300d104d58ab 100644 > --- a/gcc/doc/md.texi > +++ b/gcc/doc/md.texi > @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor > @code{FAIL}), then the @code{define_peephole2} uses the replacement > template. > > +Insns are scanned in forward order from beginning to end for each basic > +block. Matches are attempted in order of @code{define_peephole2} > +appearance in the @file{md} file. After a successful replacement, > +scanning for further opportunities for @code{define_peephole2}, resumes > +with the first generated replacement insn as the first insn to be > +matched against all @code{define_peephole2}. For the example above, > +after its successful replacement, the first insn that can be matched by > +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}. > + > @end ifset > @ifset INTERNALS > @node Insn Attributes > -- > 2.30.2 >
Ping again. > From: Hans-Peter Nilsson <hp@axis.com> > Date: Thu, 27 Apr 2023 01:55:24 +0200 > > > From: Hans-Peter Nilsson <hp@axis.com> > > Date: Wed, 19 Apr 2023 18:59:14 +0200 > [...] > > > So again: Approvers: pdf output reviewed. Ok to commit? > > -- >8 -- > > I was a bit surprised when my newly-added define_peephole2 didn't > > match, but it was because it was expected to partially match the > > generated output of a previous define_peephole2, which matched and > > modified the last insn of a sequence to be matched. I had assumed > > that the algorithm backed-up the size of the match-buffer, thereby > > exposing newly created opportunities *with sufficient context* to all > > define_peephole2's. While things can change in that direction, let's > > start with documenting the current state. > > > > * doc/md.texi (define_peephole2): Document order of scanning. > > --- > > gcc/doc/md.texi | 9 +++++++++ > > 1 file changed, 9 insertions(+) > > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > > index 07bf8bdebffb..300d104d58ab 100644 > > --- a/gcc/doc/md.texi > > +++ b/gcc/doc/md.texi > > @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor > > @code{FAIL}), then the @code{define_peephole2} uses the replacement > > template. > > > > +Insns are scanned in forward order from beginning to end for each basic > > +block. Matches are attempted in order of @code{define_peephole2} > > +appearance in the @file{md} file. After a successful replacement, > > +scanning for further opportunities for @code{define_peephole2}, resumes > > +with the first generated replacement insn as the first insn to be > > +matched against all @code{define_peephole2}. For the example above, > > +after its successful replacement, the first insn that can be matched by > > +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}. > > + > > @end ifset > > @ifset INTERNALS > > @node Insn Attributes > > -- > > 2.30.2 > > >
On Thu, May 4, 2023 at 2:10 AM Hans-Peter Nilsson via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Ping again. OK. > > From: Hans-Peter Nilsson <hp@axis.com> > > Date: Thu, 27 Apr 2023 01:55:24 +0200 > > > > > From: Hans-Peter Nilsson <hp@axis.com> > > > Date: Wed, 19 Apr 2023 18:59:14 +0200 > > [...] > > > > > So again: Approvers: pdf output reviewed. Ok to commit? > > > -- >8 -- > > > I was a bit surprised when my newly-added define_peephole2 didn't > > > match, but it was because it was expected to partially match the > > > generated output of a previous define_peephole2, which matched and > > > modified the last insn of a sequence to be matched. I had assumed > > > that the algorithm backed-up the size of the match-buffer, thereby > > > exposing newly created opportunities *with sufficient context* to all > > > define_peephole2's. While things can change in that direction, let's > > > start with documenting the current state. > > > > > > * doc/md.texi (define_peephole2): Document order of scanning. > > > --- > > > gcc/doc/md.texi | 9 +++++++++ > > > 1 file changed, 9 insertions(+) > > > > > > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi > > > index 07bf8bdebffb..300d104d58ab 100644 > > > --- a/gcc/doc/md.texi > > > +++ b/gcc/doc/md.texi > > > @@ -9362,6 +9362,15 @@ If the preparation falls through (invokes neither @code{DONE} nor > > > @code{FAIL}), then the @code{define_peephole2} uses the replacement > > > template. > > > > > > +Insns are scanned in forward order from beginning to end for each basic > > > +block. Matches are attempted in order of @code{define_peephole2} > > > +appearance in the @file{md} file. After a successful replacement, > > > +scanning for further opportunities for @code{define_peephole2}, resumes > > > +with the first generated replacement insn as the first insn to be > > > +matched against all @code{define_peephole2}. For the example above, > > > +after its successful replacement, the first insn that can be matched by > > > +a @code{define_peephole2} is @code{(set (match_dup 4) (match_dup 1))}. > > > + > > > @end ifset > > > @ifset INTERNALS > > > @node Insn Attributes > > > -- > > > 2.30.2 > > > > >
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 07bf8bdebffb..2ce043e6edc2 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -9362,6 +9362,14 @@ If the preparation falls through (invokes neither @code{DONE} nor @code{FAIL}), then the @code{define_peephole2} uses the replacement template. +Insns are scanned in forward order from beginning to end for each basic +block. Matches are attempted in order of appearance in the @file{md} +file. After a successful replacement, scanning for further +opportunities for @code{define_peephole2}, resumes with the last +generated replacement insn as the first insn to be matched. For the +example above, the first insn that can be matched by another +@code{define_peephole2}, is @code{(set (match_dup 3) (match_dup 4))}. + @end ifset @ifset INTERNALS @node Insn Attributes