Message ID | YqNi7DDZC+9Eqzsm@toto.the-meissners.org |
---|---|
State | Committed |
Commit | fddb7f65129a12dabb5ddc3c8082fe576f4af451 |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 44527385C30B for <patchwork@sourceware.org>; Fri, 10 Jun 2022 15:28:16 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 44527385C30B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1654874896; bh=7pRtGcU7ScRVphMFzzaxk6zpS9aZdKjWM+Nd75FgYjc=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=pYLjcOFp4kC+ptLBbSL3RCI3reFV6I20ckufZ02zvSwF8cGy/xMgFg6ICE6pRDNCo h//CWj/zT7zL2+17yQMkHukWzo5az/Vr/P4bMLRzC8ukHOW98mJo9ZfdZhY6AXCMTn qX2mGS+O0A3ra8UDQN4esJJEsQywTKqcqvtHRG6A= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 7790A385C30B for <gcc-patches@gcc.gnu.org>; Fri, 10 Jun 2022 15:27:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 7790A385C30B Received: from pps.filterd (m0098413.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 25AEuVjl007553; Fri, 10 Jun 2022 15:27:44 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3gm85prjc2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 10 Jun 2022 15:27:44 +0000 Received: from m0098413.ppops.net (m0098413.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 25AF0KAC016451; Fri, 10 Jun 2022 15:27:44 GMT Received: from ppma03wdc.us.ibm.com (ba.79.3fa9.ip4.static.sl-reverse.com [169.63.121.186]) by mx0b-001b2d01.pphosted.com (PPS) with ESMTPS id 3gm85prjbm-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 10 Jun 2022 15:27:44 +0000 Received: from pps.filterd (ppma03wdc.us.ibm.com [127.0.0.1]) by ppma03wdc.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 25AFKW8d021807; Fri, 10 Jun 2022 15:27:43 GMT Received: from b01cxnp23034.gho.pok.ibm.com (b01cxnp23034.gho.pok.ibm.com [9.57.198.29]) by ppma03wdc.us.ibm.com with ESMTP id 3gfy1a9mhg-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 10 Jun 2022 15:27:43 +0000 Received: from b01ledav006.gho.pok.ibm.com (b01ledav006.gho.pok.ibm.com [9.57.199.111]) by b01cxnp23034.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 25AFRgCU13500790 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Fri, 10 Jun 2022 15:27:42 GMT Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 64A6CAC05F; Fri, 10 Jun 2022 15:27:42 +0000 (GMT) Received: from b01ledav006.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 215E9AC05E; Fri, 10 Jun 2022 15:27:42 +0000 (GMT) Received: from toto.the-meissners.org (unknown [9.160.87.14]) by b01ledav006.gho.pok.ibm.com (Postfix) with ESMTPS; Fri, 10 Jun 2022 15:27:42 +0000 (GMT) Date: Fri, 10 Jun 2022 11:27:40 -0400 To: gcc-patches@gcc.gnu.org, Michael Meissner <meissner@linux.ibm.com>, Segher Boessenkool <segher@kernel.crashing.org>, "Kewen.Lin" <linkw@linux.ibm.com>, David Edelsohn <dje.gcc@gmail.com>, Peter Bergner <bergner@linux.ibm.com>, Will Schmidt <will_schmidt@vnet.ibm.com> Subject: [PATCH V2] Disable generating load/store vector pairs for block copies. Message-ID: <YqNi7DDZC+9Eqzsm@toto.the-meissners.org> Mail-Followup-To: Michael Meissner <meissner@linux.ibm.com>, gcc-patches@gcc.gnu.org, Segher Boessenkool <segher@kernel.crashing.org>, "Kewen.Lin" <linkw@linux.ibm.com>, David Edelsohn <dje.gcc@gmail.com>, Peter Bergner <bergner@linux.ibm.com>, Will Schmidt <will_schmidt@vnet.ibm.com> Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-TM-AS-GCONF: 00 X-Proofpoint-GUID: aFmxrp5ko9XcVGORSzJcNS_HV2SMU2_V X-Proofpoint-ORIG-GUID: aJ11bGREhefWnyLTGWH1D1oORcB7jBKW X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.874,Hydra:6.0.517,FMLib:17.11.64.514 definitions=2022-06-10_06,2022-06-09_02,2022-02-23_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 suspectscore=0 clxscore=1015 impostorscore=0 malwarescore=0 bulkscore=0 phishscore=0 mlxscore=0 mlxlogscore=999 lowpriorityscore=0 priorityscore=1501 spamscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2204290000 definitions=main-2206100060 X-Spam-Status: No, score=-10.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_MANYTO, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: Michael Meissner via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Michael Meissner <meissner@linux.ibm.com> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
[V2] Disable generating load/store vector pairs for block copies.
|
|
Commit Message
Michael Meissner
June 10, 2022, 3:27 p.m. UTC
[PATCH, V2] Disable generating load/store vector pairs for block copies. Testing has found that using store vector pair for block copies can result in a slow down on power10. This patch disables using the vector pair instructions for block copies if we are tuning for power10. This is version 2 of the patch. | Date: Mon, 6 Jun 2022 20:55:55 -0400 | Subject: [PATCH 2/3] Disable generating load/store vector pairs for block copies. | Message-ID: <Yp6iGwU03VjRSDAc@toto.the-meissners.org> | https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596275.html Compared to version 1, this patch is a stand-alone patch, and it doesn't depend on a new switch (-mno-store-vector-pair). Instead, this patch just sets the default for -mblock-ops-vector-pair to be off if the current cpu being tuned for is power10. It would be anticipated that it would automatically be eabled when tuning for a future cpu. I have tested this patch on: little endian power10 using --with-cpu=power10 little endian power9 using --with-cpu=power9 big endian power8 using --with-cpu=power8, both 32/64-bit tested there were no regressions. Can I apply this to the master branch, and then apply it to the GCC 12 patch after a burn-in period? 2022-06-09 Michael Meissner <meissner@linux.ibm.com> gcc/ * config/rs6000/rs6000.cc (rs6000_option_override_internal): Do not generate block copies with vector pair instructions if we are tuning for power10. --- gcc/config/rs6000/rs6000.cc | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
Comments
Hi! On Fri, Jun 10, 2022 at 11:27:40AM -0400, Michael Meissner wrote: > Testing has found that using store vector pair for block copies can result > in a slow down on power10. This patch disables using the vector pair > instructions for block copies if we are tuning for power10. Load paired should be disabled as well, for the same reason. The patch seems to do that fine? Please fix the commit message. Thanks, Segher > 2022-06-09 Michael Meissner <meissner@linux.ibm.com> > > gcc/ > * config/rs6000/rs6000.cc (rs6000_option_override_internal): Do > not generate block copies with vector pair instructions if we are > tuning for power10. > --- > gcc/config/rs6000/rs6000.cc | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc > index 0af2085adc0..59481d9ac70 100644 > --- a/gcc/config/rs6000/rs6000.cc > +++ b/gcc/config/rs6000/rs6000.cc > @@ -4141,7 +4141,10 @@ rs6000_option_override_internal (bool global_init_p) > > if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)) > { > - if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX) > + /* Do not generate lxvp and stxvp on power10 since there are some > + performance issues. */ > + if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX > + && rs6000_tune != PROCESSOR_POWER10) > rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; > else > rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
I have applied the patch to GCC 12. | From 22736f3d0d4fb8ce4afb3230023f8accdb03a623 Mon Sep 17 00:00:00 2001 | From: Michael Meissner <meissner@linux.ibm.com> | Date: Thu, 14 Jul 2022 11:16:08 -0400 | Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for block copies. Testing has found that using load and store vector pair for block copies can result in a slow down on power10. This patch disables using the vector pair instructions for block copies if we are tuning for power10. 2022-06-11 Michael Meissner <meissner@linux.ibm.com> gcc/ * config/rs6000/rs6000.cc (rs6000_option_override_internal): Do not generate block copies with vector pair instructions if we are tuning for power10. Back port from master branch. --- gcc/config/rs6000/rs6000.cc | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 4030864aa1a..040487bd277 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -4151,7 +4151,10 @@ rs6000_option_override_internal (bool global_init_p) if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)) { - if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX) + /* Do not generate lxvp and stxvp on power10 since there are some + performance issues. */ + if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX + && rs6000_tune != PROCESSOR_POWER10) rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; else rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
Back port patch (changing .cc to .c) from trunk to GCC 11 committed. | From 3118d0856b030fe491a170354fed2df570df199f Mon Sep 17 00:00:00 2001 | From: Michael Meissner <meissner@linux.ibm.com> | Date: Thu, 14 Jul 2022 14:03:37 -0400 | Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for block copies. Testing has found that using load and store vector pair for block copies can result in a slow down on power10. This patch disables using the vector pair instructions for block copies if we are tuning for power10. 2022-06-11 Michael Meissner <meissner@linux.ibm.com> gcc/ * config/rs6000/rs6000.c (rs6000_option_override_internal): Do not generate block copies with vector pair instructions if we are tuning for power10. Back port from master branch. --- gcc/config/rs6000/rs6000.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 8c89b45922f..73f90f134f8 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -4149,7 +4149,10 @@ rs6000_option_override_internal (bool global_init_p) if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)) { - if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX) + /* Do not generate lxvp and stxvp on power10 since there are some + performance issues. */ + if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX + && rs6000_tune != PROCESSOR_POWER10) rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; else rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
On Thu, Jul 14, 2022 at 11:20:56AM -0400, Michael Meissner wrote: > I have applied the patch to GCC 12. > > | From 22736f3d0d4fb8ce4afb3230023f8accdb03a623 Mon Sep 17 00:00:00 2001 > | From: Michael Meissner <meissner@linux.ibm.com> > | Date: Thu, 14 Jul 2022 11:16:08 -0400 > | Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for block copies. > > Testing has found that using load and store vector pair for block copies > can result in a slow down on power10. This patch disables using the > vector pair instructions for block copies if we are tuning for power10. > > 2022-06-11 Michael Meissner <meissner@linux.ibm.com> > > gcc/ > > * config/rs6000/rs6000.cc (rs6000_option_override_internal): Do > not generate block copies with vector pair instructions if we are > tuning for power10. Back port from master branch. You never posted the trunk version of this, so that never was approved either. > +++ b/gcc/config/rs6000/rs6000.cc > @@ -4151,7 +4151,10 @@ rs6000_option_override_internal (bool global_init_p) > > if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)) > { > - if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX) > + /* Do not generate lxvp and stxvp on power10 since there are some > + performance issues. */ > + if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX > + && rs6000_tune != PROCESSOR_POWER10) > rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; > else > rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; The TARGET_MMA in that should not be there. Please fix that (that probably needs more changes). This statement does the opposite of what the comment says. Please fix this. On trunk, first. Segher
On Thu, Jul 14, 2022 at 04:12:14PM -0500, Segher Boessenkool wrote: > On Thu, Jul 14, 2022 at 11:20:56AM -0400, Michael Meissner wrote: > > I have applied the patch to GCC 12. > > > > | From 22736f3d0d4fb8ce4afb3230023f8accdb03a623 Mon Sep 17 00:00:00 2001 > > | From: Michael Meissner <meissner@linux.ibm.com> > > | Date: Thu, 14 Jul 2022 11:16:08 -0400 > > | Subject: [PATCH] [BACKPORT] Disable generating load/store vector pairs for block copies. > > > > Testing has found that using load and store vector pair for block copies > > can result in a slow down on power10. This patch disables using the > > vector pair instructions for block copies if we are tuning for power10. > > > > 2022-06-11 Michael Meissner <meissner@linux.ibm.com> > > > > gcc/ > > > > * config/rs6000/rs6000.cc (rs6000_option_override_internal): Do > > not generate block copies with vector pair instructions if we are > > tuning for power10. Back port from master branch. > > You never posted the trunk version of this, so that never was approved > either. I did post the trunk version on June 10th, and your only comment was fix the commit message, which I thought I did in the commit. > > +++ b/gcc/config/rs6000/rs6000.cc > > @@ -4151,7 +4151,10 @@ rs6000_option_override_internal (bool global_init_p) > > > > if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)) > > { > > - if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX) > > + /* Do not generate lxvp and stxvp on power10 since there are some > > + performance issues. */ > > + if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX > > + && rs6000_tune != PROCESSOR_POWER10) > > rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; > > else > > rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; > > The TARGET_MMA in that should not be there. Please fix that (that > probably needs more changes). All of the movoo and movxo support require TARGET_MMA as does the code in rs6000-string.cc that could possibly generate load/store vector pair. To remove the check here would mean also fixing all of the vector load and store pairs in mma.md. > This statement does the opposite of what the comment says. > > Please fix this. On trunk, first.
On Thu, Jul 14, 2022 at 05:49:57PM -0400, Michael Meissner wrote: > On Thu, Jul 14, 2022 at 04:12:14PM -0500, Segher Boessenkool wrote: > > You never posted the trunk version of this, so that never was approved > > either. > > I did post the trunk version on June 10th, and your only comment was fix the > commit message, which I thought I did in the commit. I did not approve the patch. Of course not, I didn't even get as far as reading it. You should have fixed it and sent again, I did not approve anything. > > > + if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX > > > + && rs6000_tune != PROCESSOR_POWER10) > > > rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; > > > else > > > rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; > > > > The TARGET_MMA in that should not be there. Please fix that (that > > probably needs more changes). > > All of the movoo and movxo support require TARGET_MMA as does the code in > rs6000-string.cc that could possibly generate load/store vector pair. And all that is wrong and should be fixed. > To > remove the check here would mean also fixing all of the vector load and store > pairs in mma.md. That is wha I said, yes,. > > This statement does the opposite of what the comment says. > > > > Please fix this. On trunk, first. This is the core problem with this patch: it is simply wrong. It is a very roundabout way of saying "only enable vector pairs if -mcpu=power10 but -mtune=somethingelse". Which is not a sensible thing to do, and not what the comment says either. Segher
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index 0af2085adc0..59481d9ac70 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -4141,7 +4141,10 @@ rs6000_option_override_internal (bool global_init_p) if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR)) { - if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX) + /* Do not generate lxvp and stxvp on power10 since there are some + performance issues. */ + if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX + && rs6000_tune != PROCESSOR_POWER10) rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR; else rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;