Message ID | fc85253b-cc4c-02cf-3c65-717633b08d27@linux.ibm.com |
---|---|
State | New |
Headers | From: HAO CHEN GUI via Gcc-patches <gcc-patches@gcc.gnu.org>; Reply-To: HAO CHEN GUI <guihaoc@linux.ibm.com>; To: gcc-patches <gcc-patches@gcc.gnu.org>; Cc: Bill Schmidt <wschmidt@linux.ibm.com>, David <dje.gcc@gmail.com>, Segher Boessenkool <segher@kernel.crashing.org>; Date: Mon, 13 Dec 2021 11:00:20 +0800; Subject: [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124] |
Series | [rs6000] new split pattern for TI to V1TI move [PR103124] |
Commit Message
HAO CHEN GUI
Dec. 13, 2021, 3 a.m. UTC
Hi,
  This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of a TI to a V2DI, then moves the V2DI to V1TI. With the pattern, the subreg pass can do register split for TI when there is a TI to V1TI move. The patch optimizes one unnecessary "mr" out on P9. The new test case illustrates it.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2021-12-13 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
    * config/rs6000/vsx.md (split pattern for TI to V1TI move): Defined.

gcc/testsuite/
    * gcc.target/powerpc/pr103124.c: New testcase.

patch.diff
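The data movement described above (take the two 64-bit halves of a 128-bit integer, concatenate them into a two-lane vector, then view that vector as a single 128-bit element) can be modelled in plain C. This is only an illustrative sketch, not code from the patch: the type `v2di_model` and the functions `ti_high`, `ti_low`, `concat_v2di`, `v2di_as_v1ti` and `ti_to_v1ti` are hypothetical names, the model talks about "high" and "low" halves rather than subreg byte offsets 0 and 8 (whose meaning depends on endianness), and it assumes a compiler that provides `unsigned __int128`, such as GCC on a 64-bit target.

```c
#include <stdint.h>

/* Hypothetical model of a V2DI value: two 64-bit lanes, lane[0] being
   the most significant half.  */
typedef struct { uint64_t lane[2]; } v2di_model;

/* The two DImode halves of a 128-bit integer; these stand in for the
   two subreg:DI operands taken from the TI register pair.  */
uint64_t ti_high (unsigned __int128 x) { return (uint64_t) (x >> 64); }
uint64_t ti_low (unsigned __int128 x) { return (uint64_t) x; }

/* Stand-in for the concat step: build a two-lane vector from two
   64-bit values.  */
v2di_model concat_v2di (uint64_t hi, uint64_t lo)
{
  v2di_model v = { { hi, lo } };
  return v;
}

/* Stand-in for the final V2DI -> V1TI subreg: same bits, viewed as one
   128-bit element.  */
unsigned __int128 v2di_as_v1ti (v2di_model v)
{
  return ((unsigned __int128) v.lane[0] << 64) | v.lane[1];
}

/* The whole sequence: split, concatenate, reinterpret.  */
unsigned __int128 ti_to_v1ti (unsigned __int128 x)
{
  return v2di_as_v1ti (concat_v2di (ti_high (x), ti_low (x)));
}
```

The point of the model is that each half is handled as an independent 64-bit value, which is what lets the 128-bit integer be treated as two separate registers instead of one indivisible pair.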
Comments
On Sun, Dec 12, 2021 at 10:00 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>
> Hi,
>   This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of
> a TI to a V2DI, then moves the V2DI to V1TI. With the pattern, the subreg pass can do register split for
> TI when there is a TI to V1TI move. The patch optimizes one unnecessary "mr" out on P9. The new
> test case illustrates it.
>
> Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk?
> Any recommendations? Thanks a lot.
>
> ChangeLog
> 2021-12-13 Haochen Gui <guihaoc@linux.ibm.com>
>
> gcc/
>     * config/rs6000/vsx.md (split pattern for TI to V1TI move): Defined.
>
> gcc/testsuite/
>     * gcc.target/powerpc/pr103124.c: New testcase.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index bf033e31c1c..7bca7780735 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -6589,3 +6589,19 @@ (define_insn "xxeval"
>    [(set_attr "type" "vecperm")
>     (set_attr "prefixed" "yes")])
>
> +;; split TI to V1TI move
> +(define_split
> +  [(set (match_operand:V1TI 0 "vsx_register_operand")
> +        (subreg:V1TI
> +          (match_operand:TI 1 "int_reg_operand") 0 ))]
> +  "TARGET_P9_VECTOR && !reload_completed"
> +  [(const_int 0)]
> +{
> +  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
> +  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
> +  rtx tmp3 = gen_reg_rtx (V2DImode);
> +  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
> +  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
> +  emit_move_insn (operands[0], tmp4);
> +  DONE;
> +})
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103124.c b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> new file mode 100644
> index 00000000000..724492dbcd2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } */

Please don't include the "powerpc" target selector in the
gcc.target/powerpc directory. Just use lp64.

> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
> +/* { dg-final { scan-assembler-not "\mmr\M" } } */
> +
> +vector __int128 add (long long a)
> +{
> +  vector __int128 b;
> +  b = (vector __int128) {a};
> +  return b;
> +}

Okay with that change.

Thanks, David
Hi!

On Mon, Dec 13, 2021 at 05:22:06PM -0500, David Edelsohn wrote:
> On Sun, Dec 12, 2021 at 10:00 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
> > --- a/gcc/config/rs6000/vsx.md
> > +++ b/gcc/config/rs6000/vsx.md
> > @@ -6589,3 +6589,19 @@ (define_insn "xxeval"
> >    [(set_attr "type" "vecperm")
> >     (set_attr "prefixed" "yes")])
> >
> > +;; split TI to V1TI move

Please comment that this splitter tries to generate mtvsrdd insns, and
don't say the obvious things :-)

> > +(define_split
> > +  [(set (match_operand:V1TI 0 "vsx_register_operand")
> > +        (subreg:V1TI
> > +          (match_operand:TI 1 "int_reg_operand") 0 ))]
> > +  "TARGET_P9_VECTOR && !reload_completed"

Why the "!reload_completed"? Is this generated after reload as well,
and that is bad for some reason?

> > +  [(const_int 0)]
> > +{
> > +  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
> > +  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
> > +  rtx tmp3 = gen_reg_rtx (V2DImode);
> > +  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
> > +  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
> > +  emit_move_insn (operands[0], tmp4);
> > +  DONE;
> > +})

Ah, it is bad because it generates a pseudo.

So either you just make it work when everything is hard regs, or you do
this *and comment it*.

The first option is not very easy to do. You need to make sure you can
do those subregs (and get GPRs!), and you need to use a hard reg instead
of the new pseudo (you can use operand 0 for this here though, it can
never be the same as operand 1 :-) (but only do this if this *is* after
reload)).

But, it sounds like you actually saw problems when allowing it after
reload, so it sounds like it would actually be useful to do it then?

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-do compile { target { powerpc*-*-* && lp64 } } */
>
> Please don't include the "powerpc" target selector in the
> gcc.target/powerpc directory. Just use lp64.

Or actually, don't use anything, and do a dg-require int128 instead.

Segher
Hi Segher,
  Thanks for your advice. Please see my comments.

On 14/12/2021 6:59 AM, Segher Boessenkool wrote:
> Hi!
>
> On Mon, Dec 13, 2021 at 05:22:06PM -0500, David Edelsohn wrote:
>> On Sun, Dec 12, 2021 at 10:00 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>>> --- a/gcc/config/rs6000/vsx.md
>>> +++ b/gcc/config/rs6000/vsx.md
>>> @@ -6589,3 +6589,19 @@ (define_insn "xxeval"
>>>    [(set_attr "type" "vecperm")
>>>     (set_attr "prefixed" "yes")])
>>>
>>> +;; split TI to V1TI move
>
> Please comment that this splitter tries to generate mtvsrdd insns, and
> don't say the obvious things :-)
>

OK, I will modify it.

>>> +(define_split
>>> +  [(set (match_operand:V1TI 0 "vsx_register_operand")
>>> +        (subreg:V1TI
>>> +          (match_operand:TI 1 "int_reg_operand") 0 ))]
>>> +  "TARGET_P9_VECTOR && !reload_completed"
>
> Why the "!reload_completed"? Is this generated after reload as well,
> and that is bad for some reason?
>
>>> +  [(const_int 0)]
>>> +{
>>> +  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
>>> +  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
>>> +  rtx tmp3 = gen_reg_rtx (V2DImode);
>>> +  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
>>> +  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
>>> +  emit_move_insn (operands[0], tmp4);
>>> +  DONE;
>>> +})
>
> Ah, it is bad because it generates a pseudo.
>
> So either you just make it work when everything is hard regs, or you do
> this *and comment it*.
>
> The first option is not very easy to do. You need to make sure you can
> do those subregs (and get GPRs!), and you need to use a hard reg instead
> of the new pseudo (you can use operand 0 for this here though, it can
> never be the same as operand 1 :-) (but only do this if this *is* after
> reload)).
>
> But, it sounds like you actually saw problems when allowing it after
> reload, so it sounds like it would actually be useful to do it then?

The purpose of this split pattern is to generate the V1TI from two subregs of the TI. The subsequent subreg pass can then recognize the TI in the insn as splittable. As there is no subreg pass after reload, I want the split to be done only before reload. Also, as you mentioned, my patch generates a pseudo, so it doesn't work after reload. That's why I set the "!reload_completed" condition.

>
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } */
>>
>> Please don't include the "powerpc" target selector in the
>> gcc.target/powerpc directory. Just use lp64.
>
> Or actually, don't use anything, and do a dg-require int128 instead.
>

Thanks, I will take it.

>
> Segher
>
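For reference, a sketch of how the test header might look once the review comments on the target selector are applied. This is an assumption about the follow-up, not the committed test: the `dg-do` line drops the selector entirely, and `dg-require-effective-target int128` is the usual DejaGnu spelling of the "dg-require int128" suggestion above. The regexp is written in braces here, which is my substitution and was not requested in the review; inside double quotes Tcl would process the `\m` and `\M` escapes before the regexp engine sees them.

```c
/* { dg-do compile } */
/* { dg-require-effective-target int128 } */
/* { dg-require-effective-target powerpc_p9vector_ok } */
/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
/* { dg-final { scan-assembler-not {\mmr\M} } } */

/* Body unchanged from the posted patch.  */
vector __int128 add (long long a)
{
  vector __int128 b;
  b = (vector __int128) {a};
  return b;
}
```

The file only compiles for a PowerPC target with VSX enabled, since `vector __int128` is a target-specific type.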
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..7bca7780735 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6589,3 +6589,19 @@ (define_insn "xxeval"
   [(set_attr "type" "vecperm")
    (set_attr "prefixed" "yes")])
 
+;; split TI to V1TI move
+(define_split
+  [(set (match_operand:V1TI 0 "vsx_register_operand")
+        (subreg:V1TI
+          (match_operand:TI 1 "int_reg_operand") 0 ))]
+  "TARGET_P9_VECTOR && !reload_completed"
+  [(const_int 0)]
+{
+  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
+  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
+  rtx tmp3 = gen_reg_rtx (V2DImode);
+  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
+  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
+  emit_move_insn (operands[0], tmp4);
+  DONE;
+})
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103124.c b/gcc/testsuite/gcc.target/powerpc/pr103124.c
new file mode 100644
index 00000000000..724492dbcd2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+/* { dg-final { scan-assembler-not "\mmr\M" } } */
+
+vector __int128 add (long long a)
+{
+  vector __int128 b;
+  b = (vector __int128) {a};
+  return b;
+}