Message ID | d2de3c57-01ab-3e42-97d4-80ad552eaac8@linux.ibm.com |
---|---|
State | Committed |
Commit | 300dbea12693e365c89971527ca14cb0242def64 |
Headers | Subject: [PATCH] rs6000/test: Add emulated gather test case; From: "Kewen.Lin via Gcc-patches" <gcc-patches@gcc.gnu.org>; Reply-To: "Kewen.Lin" <linkw@linux.ibm.com>; To: GCC Patches <gcc-patches@gcc.gnu.org>; Cc: Bill Schmidt <wschmidt@linux.ibm.com>, David Edelsohn <dje.gcc@gmail.com>, Segher Boessenkool <segher@kernel.crashing.org>; Date: Thu, 25 Nov 2021 11:20:57 +0800; Message-ID: <d2de3c57-01ab-3e42-97d4-80ad552eaac8@linux.ibm.com> |
Series |
rs6000/test: Add emulated gather test case
|
|
Commit Message
Kewen.Lin
Nov. 25, 2021, 3:20 a.m. UTC
Hi,

This patch adds a test case similar to the one in i386, to add testing coverage for the 510.parest_r hotspots.

As evaluated, the emulated gather capability of the vectorizer (r12-2733) can help to speed up SPEC2017 510.parest_r on Power8/9/10 by 5% to 9% with the option sets "Ofast unroll" and "Ofast lto". But since rs6000 lacked unpacking support for unsigned int before, the hotspots could not be vectorized until r12-3134.

While checking why r12-2733 didn't immediately show its impact on SPEC2017 510.parest_r, even though the associated test case could already be vectorized on rs6000 at that time, I realized that the associated test case uses int as INDEXTYPE while the hotspots actually use unsigned int. So, unlike the one in i386, this patch uses unsigned int as INDEXTYPE, since the unpack support for unsigned int (r12-3134) also matters for vectorizing the hotspots. Not sure if it's worth updating the one in i386 as well?

Tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-----
gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vect-gather-1.c: New test.

--
2.25.1
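To make the term "emulated gather" concrete, here is a minimal hand-written sketch of the idea for two lanes. It is only an illustration under stated assumptions: the function name `vmul_emulated`, the two-lane width, and the split into `acc0`/`acc1` are hypothetical and are not what GCC actually emits. The point it shows is that the indices are loaded and widened (zero-extended, because they are unsigned int), and each "gathered" element of `dst` is then fetched with an ordinary scalar load. Summing in two lanes changes the order of the floating-point additions, which is the kind of reassociation that -Ofast permits.

```c
#include <stddef.h>

/* Hypothetical two-lane analogue of an emulated gather; a sketch of
   the idea, not the code the vectorizer generates.  */
double vmul_emulated(const unsigned int *rowstart, const unsigned int *rowend,
                     const double *luval, const double *dst)
{
  double acc0 = 0.0, acc1 = 0.0;
  const unsigned int *col = rowstart;

  /* Main loop: handle two elements (one per lane) per iteration.  */
  for (; rowend - col >= 2; col += 2, luval += 2)
    {
      /* Widen the 32-bit indices; unsigned int is zero-extended.  */
      size_t i0 = col[0];
      size_t i1 = col[1];

      /* The "gather": one scalar load per lane.  */
      double g0 = dst[i0];
      double g1 = dst[i1];

      acc0 += luval[0] * g0;
      acc1 += luval[1] * g1;
    }

  /* Combine the lanes, then finish any leftover element.  */
  double res = acc0 + acc1;
  for (; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}
```

With a signed int index the widening step would be a sign extension instead, which is presumably why the int-based test case and the unsigned int hotspots depended on different unpack support.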
Comments
On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> So different from the one in i386, this patch uses unsigned int
> as INDEXTYPE since the unpack support for unsigned int (r12-3134)
> also matters for the hotspots vectorization. Not sure if it's
> worth updating the one in i386 as well?

It looks like the same testcase added in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88531
on 2021/11/25 1:17 PM, Hongtao Liu wrote:
> On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>> So different from the one in i386, this patch uses unsigned int
>> as INDEXTYPE since the unpack support for unsigned int (r12-3134)
>> also matters for the hotspots vectorization. Not sure if it's
>> worth updating the one in i386 as well?
> It looks like the same testcase added in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88531

Thanks for the information! Good to know that there are already some cases to cover. :)

BR,
Kewen
Hi!

On Thu, Nov 25, 2021 at 11:20:57AM +0800, Kewen.Lin wrote:
> This patch is to add a test case similar to the one in i386
> to add testing coverage for 510.parest_r hotspots.

> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vect-gather-1.c: New test.

This is okay for trunk. Thanks!

Segher
on 2021/11/27 12:24 AM, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Nov 25, 2021 at 11:20:57AM +0800, Kewen.Lin wrote:
>> This patch is to add a test case similar to the one in i386
>> to add testing coverage for 510.parest_r hotspots.
>
>> gcc/testsuite/ChangeLog:
>> 	* gcc.target/powerpc/vect-gather-1.c: New test.
>
> This is okay for trunk. Thanks!
>

Thanks Segher! Committed as r12-5569.

BR,
Kewen
diff --git a/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
new file mode 100644
index 00000000000..bf98045ab03
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* Profitable from Power8 since it supports efficient unaligned load.  */
+/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
+
+#ifndef INDEXTYPE
+#define INDEXTYPE unsigned int
+#endif
+double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
+	    double *luval, double *dst)
+{
+  double res = 0;
+  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
+        res += *luval * dst[*col];
+  return res;
+}
+
+/* With gather emulation this should be profitable to vectorize from Power8.  */
+/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
+/* The index vector loads and promotions should be scalar after forwprop.  */
+/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */
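
The test in the patch is compile-only, so it checks the dumps but never runs `vmul`. As a rough illustration of what the kernel computes, the sketch below pairs the function from the patch with a small, hypothetical helper (`check_row` is not part of the patch) that feeds it one sparse row: each value in `luval` is multiplied by the element of `dst` selected by the matching column index.

```c
#ifndef INDEXTYPE
#define INDEXTYPE unsigned int
#endif

/* Kernel as in the patch: a sparse dot product over one row.  */
double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
            double *luval, double *dst)
{
  double res = 0;
  for (const INDEXTYPE *col = rowstart; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}

/* Hypothetical helper, for illustration only: one row with three
   entries whose column indices are 2, 0 and 1.  */
double check_row(void)
{
  INDEXTYPE col[] = {2, 0, 1};
  double luval[] = {1.0, 2.0, 3.0};
  double dst[] = {10.0, 20.0, 30.0};

  /* 1*dst[2] + 2*dst[0] + 3*dst[1] = 30 + 20 + 60 = 110.  */
  return vmul(col, col + 3, luval, dst);
}
```

Because of the `#ifndef INDEXTYPE` guard, the index type can be swapped at compile time, for example by passing `-DINDEXTYPE=int`, which makes it easy to compare the signed and unsigned variants discussed in the thread.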