Message ID | ad59cfde-5424-27bc-29e3-24d8f0a4b129@linux.ibm.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EFD7F3857C6F for <patchwork@sourceware.org>; Mon, 9 May 2022 01:55:37 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org EFD7F3857C6F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1652061338; bh=eVuqUzpOGJ3fD4EeJps80mStFxkerW0cAScZDzZR9+k=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=svMyd5BvNmsGxX8uvjA91Ubc6xCqlNGBdflzEPCcV+rWqZoGKD8/uCvs6MgRoF+Wk G2lZE3Y284WMQYYFybzC0l6j4RTI7l78PnnF/K1K09FjV//GQN/RjDya8pfQjcWwU3 atvskCfvzl1zInWzlbbFKTtsjvBdD1hZqCmROFS8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 24B873858C54 for <gcc-patches@gcc.gnu.org>; Mon, 9 May 2022 01:55:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 24B873858C54 Received: from pps.filterd (m0098396.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.5/8.17.1.5) with ESMTP id 2490kYL9000841; Mon, 9 May 2022 01:55:07 GMT Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3fx247qupk-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 09 May 2022 01:55:06 +0000 Received: from m0098396.ppops.net (m0098396.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 2491q7ZE029096; Mon, 9 May 2022 01:55:06 GMT Received: from ppma06ams.nl.ibm.com (66.31.33a9.ip4.static.sl-reverse.com [169.51.49.102]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3fx247qup4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 09 May 2022 01:55:06 +0000 Received: from pps.filterd (ppma06ams.nl.ibm.com [127.0.0.1]) by ppma06ams.nl.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 2491r0wQ016078; Mon, 9 May 2022 01:55:04 GMT Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by ppma06ams.nl.ibm.com with ESMTP id 3fwg1j1rdd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 09 May 2022 01:55:03 +0000 Received: from b06wcsmtp001.portsmouth.uk.ibm.com (b06wcsmtp001.portsmouth.uk.ibm.com [9.149.105.160]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 2491t08M31129994 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 9 May 2022 01:55:00 GMT Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 88362A405B; Mon, 9 May 2022 01:55:00 +0000 (GMT) Received: from b06wcsmtp001.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 378BFA4054; Mon, 9 May 2022 01:54:48 +0000 (GMT) Received: from [9.200.38.215] (unknown [9.200.38.215]) by b06wcsmtp001.portsmouth.uk.ibm.com (Postfix) with ESMTP; Mon, 9 May 2022 01:54:47 +0000 (GMT) Message-ID: <ad59cfde-5424-27bc-29e3-24d8f0a4b129@linux.ibm.com> Date: Mon, 9 May 2022 09:54:22 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1 Content-Language: en-US To: gcc-patches <gcc-patches@gcc.gnu.org> Subject: [PATCH, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-GUID: huQip1G3qMxS7LPmAHDxo-fTUwjkJ3J2 X-Proofpoint-ORIG-GUID: BoarbXoNNBPT9mDMnc-WruI7EFKJ4497 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-05-08_09,2022-05-06_01,2022-02-23_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 spamscore=0 clxscore=1015 adultscore=0 lowpriorityscore=0 priorityscore=1501 phishscore=0 mlxlogscore=999 mlxscore=0 bulkscore=0 malwarescore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2202240000 definitions=main-2205090006 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: HAO CHEN GUI via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: HAO CHEN GUI <guihaoc@linux.ibm.com> Cc: Peter Bergner <bergner@linux.ibm.com>, David <dje.gcc@gmail.com>, Segher Boessenkool <segher@kernel.crashing.org> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
[rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
|
|
Commit Message
HAO CHEN GUI
May 9, 2022, 1:54 a.m. UTC
Hi, This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. Tests show that outputs of xs[min/max]dp are consistent with the standard of C99 fmin/max. Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. ChangeLog 2022-05-09 Haochen Gui <guihaoc@linux.ibm.com> gcc/ PR target/103605 * rs6000.md (unspec): Add UNSPEC_FMAX and UNSPEC_FMIN. (fminmax): New. (minmax_op): Likewise. (<fminmax><mode>3): New pattern. Implemented by UNSPEC_FMAX and UNSPEC_FMIN. gcc/testsuite/ PR target/103605 * gcc.dg/pr103605.c: New. patch.diff
Comments
Hi Haochen, Thanks for the patch, some comments are inlined. on 2022/5/9 09:54, HAO CHEN GUI wrote: > Hi, > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > Tests show that outputs of xs[min/max]dp are consistent with the standard > of C99 fmin/max. > > Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. > Is this okay for trunk? Any recommendations? Thanks a lot. > > ChangeLog > 2022-05-09 Haochen Gui <guihaoc@linux.ibm.com> > > gcc/ > PR target/103605 > * rs6000.md (unspec): Add UNSPEC_FMAX and UNSPEC_FMIN. Nit: one entry for iterator FMINMAX? > (fminmax): New. > (minmax_op): Likewise. > (<fminmax><mode>3): New pattern. Implemented by UNSPEC_FMAX and > UNSPEC_FMIN. > > gcc/testsuite/ > PR target/103605 > * gcc.dg/pr103605.c: New. > > patch.diff > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index fdfbc6566a5..8aae3e80bcd 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -158,6 +158,8 @@ (define_c_enum "unspec" > UNSPEC_HASHCHK > UNSPEC_XXSPLTIDP_CONST > UNSPEC_XXSPLTIW_CONST > + UNSPEC_FMAX > + UNSPEC_FMIN > ]) > > ;; > @@ -5350,6 +5352,25 @@ (define_insn_and_split "*s<minmax><mode>3_fpr" > DONE; > }) > > + > +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) > + > +(define_int_attr fminmax [(UNSPEC_FMAX "fmax") > + (UNSPEC_FMIN "fmin")]) > + > +(define_int_attr minmax_op [(UNSPEC_FMAX "max") > + (UNSPEC_FMIN "min")]) > + Can we use the later one for both? Like f<minmax_op><mode>3. > +(define_insn "<fminmax><mode>3" > + [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>") > + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "<Fv>") > + (match_operand:SFDF 2 "vsx_register_operand" "<Fv>")] Nit: both SD and DF are mapped to constraint wa, just hardcode Fv to wa? > + FMINMAX))] > +"TARGET_VSX" > +"xs<minmax_op>dp %x0,%x1,%x2" > +[(set_attr "type" "fp")] > +) Maybe it's good to put one comment before this pattern to note that xs<minmax_op> can satisfy all required semantics of fmin/fmax. PR103605 also exposes another problem on bif __builtin_vsx_xs{min,max}dp, both bifs are expanded into xs{min,max}cdp instead of xs{min,max}dp starting from power9. IMHO, it's something we want to fix as well, based on the reasons: 1) bif names have the corresponding mnemonics, users would expect 1-1 mapping here. 2) clang emits xs{min,max}dp all the time, with cpu type power7/8/9/10. 3) according to uarch info, xs{min,max}cdp use the same units and have the same latency, no benefits to replace with xs{min,max}cdp. So I wonder if it would be more clear with: 1) add new define_insn for xs{min,max}dp 2) use them for new define_expand of fmin/fmax 3) use them for bif expansion pattern BR, Kewen > + > (define_expand "mov<mode>cc" > [(set (match_operand:GPR 0 "gpc_reg_operand") > (if_then_else:GPR (match_operand 1 "comparison_operator") > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c > new file mode 100644 > index 00000000000..a40da064742 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > +/* { dg-options "-O1 -mvsx" } */ > +/* { dg-final { scan-assembler-times "xsmaxdp" 2 } } */ > +/* { dg-final { scan-assembler-times "xsmindp" 2 } } */ > + > +#include <math.h> > + > +double test1 (double d0, double d1) > +{ > + return fmin (d0, d1); > +} > + > +float test2 (float d0, float d1) > +{ > + return fmin (d0, d1); > +} > + > +double test3 (double d0, double d1) > +{ > + return fmax (d0, d1); > +} > + > +float test4 (float d0, float d1) > +{ > + return fmax (d0, d1); > +} >
Hi guys, On Mon, May 09, 2022 at 12:05:51PM +0800, Kewen.Lin wrote: > on 2022/5/9 09:54, HAO CHEN GUI wrote: > > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > > Tests show that outputs of xs[min/max]dp are consistent with the standard > > of C99 fmin/max. > > gcc/ > > PR target/103605 > > * rs6000.md (unspec): Add UNSPEC_FMAX and UNSPEC_FMIN. > > Nit: one entry for iterator FMINMAX? It should be * rs6000.md (FMINMAX): New. > > (fminmax): New. > > (minmax_op): Likewise. Please say "New." when something is new, never "Likewise." > > (<fminmax><mode>3): New pattern. Implemented by UNSPEC_FMAX and > > UNSPEC_FMIN. Leave out the "Implemented..." part please. > > +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) > > + > > +(define_int_attr fminmax [(UNSPEC_FMAX "fmax") > > + (UNSPEC_FMIN "fmin")]) > > + > > +(define_int_attr minmax_op [(UNSPEC_FMAX "max") > > + (UNSPEC_FMIN "min")]) > > + > > Can we use the later one for both? > > Like f<minmax_op><mode>3. Yes. RTL iterators and attributes are textual replacement and expansion only: there is no deeper semantic meaning to it. In fact, we should have only a "minmax" iterator and a "MINMAX" attribute, and then simplify everything else. I'll do this for the existing code. > > +(define_insn "<fminmax><mode>3" > > + [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>") > > + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "<Fv>") > > + (match_operand:SFDF 2 "vsx_register_operand" "<Fv>")] > > Nit: both SD and DF are mapped to constraint wa, just hardcode Fv to wa? (SF and DF) Please keep <Fv> until <Ff> is gone as well, it is easier to read that way. And at that time get rid of *all* <Fv>. Indeed we could get rid of it, but only replacing it in some places and not others is confusing (or at least distracting). > PR103605 also exposes another problem on bif __builtin_vsx_xs{min,max}dp, both bifs are > expanded into xs{min,max}cdp instead of xs{min,max}dp starting from power9. > > IMHO, it's something we want to fix as well, based on the reasons: > 1) bif names have the corresponding mnemonics, users would expect 1-1 mapping here. > 2) clang emits xs{min,max}dp all the time, with cpu type power7/8/9/10. > 3) according to uarch info, xs{min,max}cdp use the same units and have the same latency, > no benefits to replace with xs{min,max}cdp. I never understood any of this. Mike? Why do we do those "c" things at all, ever? > So I wonder if it would be more clear with: > 1) add new define_insn for xs{min,max}dp No, the existing thing should always do these insns. > 2) use them for new define_expand of fmin/fmax > 3) use them for bif expansion pattern There is no way to express fmin/fmax without unspecs currently (without fastmath). This is a serious problem. Segher
On Tue, May 10, 2022 at 07:27:30AM -0500, Segher Boessenkool wrote: > > IMHO, it's something we want to fix as well, based on the reasons: > > 1) bif names have the corresponding mnemonics, users would expect 1-1 mapping here. > > 2) clang emits xs{min,max}dp all the time, with cpu type power7/8/9/10. > > 3) according to uarch info, xs{min,max}cdp use the same units and have the same latency, > > no benefits to replace with xs{min,max}cdp. > > I never understood any of this. Mike? Why do we do those "c" things > at all, ever? In the power7, we only had x{s,v}{min,max}{sp,dp}. But those aren't useful for optimizing normal (a > b) ? a : b without using -ffast-math. Power9 added the 'c' and 'j' versions of the insns. GCC never generates the 'j' version. Basically for ?: we generate: * Code = power8, no -ffast-math: Generate compare, move; * Code = power8, -ffast-math: Generate xsmaxdp/xsmindp; * Code = power9, no -ffast-mth: Generate xsmaxcdp/xsmincdp; (and) * Code = power9, -ffast-math: Generate xsmaxcdp/xsmincdp. For the __builtin_fmax and __builtin_fmin functions: * Code = power8, no -ffast-math: Generate call to fmax/fmin; * Code = power8, -ffast-math: Generate xsmaxdp/xsmindp; * Code = power9, no -ffast-mth: Generate call to fmax/fmin; (and) * Code = power9, -ffast-math: Generate xsmaxcdp/xsmincdp. For IEEE 128-bit floating point, we only have xs{min,max}cqp. We do not have the version without 'c' nor do we have the 'j' version.
On Tue, May 10, 2022 at 02:56:58PM -0400, Michael Meissner wrote: > On Tue, May 10, 2022 at 07:27:30AM -0500, Segher Boessenkool wrote: > > > IMHO, it's something we want to fix as well, based on the reasons: > > > 1) bif names have the corresponding mnemonics, users would expect 1-1 mapping here. > > > 2) clang emits xs{min,max}dp all the time, with cpu type power7/8/9/10. > > > 3) according to uarch info, xs{min,max}cdp use the same units and have the same latency, > > > no benefits to replace with xs{min,max}cdp. > > > > I never understood any of this. Mike? Why do we do those "c" things > > at all, ever? > > In the power7, we only had x{s,v}{min,max}{sp,dp}. But those aren't useful for > optimizing normal (a > b) ? a : b without using -ffast-math. But RTL smin (as well as Gimple min_expr) is *undefined* without -ffast-math (well, -ffinite-math-only and -fno-signed-zeros at least). The only place we generate xs{min,max}[c]dp uses s{min,max}. So the much saner xs{min,max}dp are fine always. > Power9 added the > 'c' and 'j' versions of the insns. GCC never generates the 'j' version. > > Basically for ?: we generate: > > * Code = power8, no -ffast-math: Generate compare, move; > * Code = power8, -ffast-math: Generate xsmaxdp/xsmindp; > * Code = power9, no -ffast-mth: Generate xsmaxcdp/xsmincdp; (and) This one uses broken RTL (and broken Gimple before that): s{min,max} cannot be used for FP without -ffast-math. > * Code = power9, -ffast-math: Generate xsmaxcdp/xsmincdp. xs{min,max}dp will work just as well. > For the __builtin_fmax and __builtin_fmin functions: > > * Code = power8, no -ffast-math: Generate call to fmax/fmin; > * Code = power8, -ffast-math: Generate xsmaxdp/xsmindp; > * Code = power9, no -ffast-mth: Generate call to fmax/fmin; (and) > * Code = power9, -ffast-math: Generate xsmaxcdp/xsmincdp. Same brokenness here. > For IEEE 128-bit floating point, we only have xs{min,max}cqp. We do not have > the version without 'c' nor do we have the 'j' version. And here. Why would we ever prefer xsmincdp over xsmindp, other than for machine code for some not-so-smart C code that wil not do useful things for signed zeros or NaNs (but using the "c" insns generates faster, smaller code that has those silly semantics)? Segher
Hi Segher, on 2022/5/10 20:27, Segher Boessenkool wrote: > Hi guys, > > On Mon, May 09, 2022 at 12:05:51PM +0800, Kewen.Lin wrote: >> on 2022/5/9 09:54, HAO CHEN GUI wrote: >>> This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. >>> Tests show that outputs of xs[min/max]dp are consistent with the standard >>> of C99 fmin/max. > >>> gcc/ >>> PR target/103605 >>> * rs6000.md (unspec): Add UNSPEC_FMAX and UNSPEC_FMIN. >> >> Nit: one entry for iterator FMINMAX? > > It should be > * rs6000.md (FMINMAX): New. > >>> (fminmax): New. >>> (minmax_op): Likewise. > > Please say "New." when something is new, never "Likewise." > >>> (<fminmax><mode>3): New pattern. Implemented by UNSPEC_FMAX and >>> UNSPEC_FMIN. > > Leave out the "Implemented..." part please. > >>> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) >>> + >>> +(define_int_attr fminmax [(UNSPEC_FMAX "fmax") >>> + (UNSPEC_FMIN "fmin")]) >>> + >>> +(define_int_attr minmax_op [(UNSPEC_FMAX "max") >>> + (UNSPEC_FMIN "min")]) >>> + >> >> Can we use the later one for both? >> >> Like f<minmax_op><mode>3. > > Yes. RTL iterators and attributes are textual replacement and expansion > only: there is no deeper semantic meaning to it. In fact, we should > have only a "minmax" iterator and a "MINMAX" attribute, and then > simplify everything else. I'll do this for the existing code. > >>> +(define_insn "<fminmax><mode>3" >>> + [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>") >>> + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "<Fv>") >>> + (match_operand:SFDF 2 "vsx_register_operand" "<Fv>")] >> >> Nit: both SD and DF are mapped to constraint wa, just hardcode Fv to wa? > > (SF and DF) > > Please keep <Fv> until <Ff> is gone as well, it is easier to read that > way. And at that time get rid of *all* <Fv>. > > Indeed we could get rid of it, but only replacing it in some places and > not others is confusing (or at least distracting). > aha, thanks for the correction and explanation! >> PR103605 also exposes another problem on bif __builtin_vsx_xs{min,max}dp, both bifs are >> expanded into xs{min,max}cdp instead of xs{min,max}dp starting from power9. >> >> IMHO, it's something we want to fix as well, based on the reasons: >> 1) bif names have the corresponding mnemonics, users would expect 1-1 mapping here. >> 2) clang emits xs{min,max}dp all the time, with cpu type power7/8/9/10. >> 3) according to uarch info, xs{min,max}cdp use the same units and have the same latency, >> no benefits to replace with xs{min,max}cdp. > > I never understood any of this. Mike? Why do we do those "c" things > at all, ever? > >> So I wonder if it would be more clear with: >> 1) add new define_insn for xs{min,max}dp > > No, the existing thing should always do these insns. Currently xs{min,max}dp are only covered by smin/smax, no their own define_insn? > >> 2) use them for new define_expand of fmin/fmax >> 3) use them for bif expansion pattern > > There is no way to express fmin/fmax without unspecs currently (without > fastmath). This is a serious problem. > Yeah, I agree. The above question/proposal still needs UNSPEC for new define_insn, and if we want to replace the expansion pattern for bif __builtin_vsx_xs{min,max}dp, we seem to have alternatives: 1) as Haochen's patch, new define_insn fmin/fmax with UNSPEC_F{MAX,MIN} and used for actual insns xs{min,max}dp and replace bif table entry with fmin/fmax*. 2) new define_insn vsx_xs{min,max}dp with UNSPEC_XS{MAX,MIN}DP and used for actual insns xs{min,max}dp, new define_expand for fmin/fmax optabs, replace bif table entry with vsx_xs{min,max}dp*. I personally felt (wondered) if 2) is more clear? Because the mnemonics xs{min,max}dp don't have fmin/fmax inside the names, ISA doesn't even mention they can be used for fmin/fmax in Programming Notes, it seems more straight to see vsx_xs{min,max}dp used as bif expansion patterns and UNSPEC_XS{MAX,MIN}DP. But both 1) and 2) are fine to me. :) BR, Kewen
Hi~ On Wed, May 11, 2022 at 04:53:08PM +0800, Kewen.Lin wrote: > on 2022/5/10 20:27, Segher Boessenkool wrote: > > Yes. RTL iterators and attributes are textual replacement and expansion > > only: there is no deeper semantic meaning to it. In fact, we should > > have only a "minmax" iterator and a "MINMAX" attribute, and then > > simplify everything else. I'll do this for the existing code. Iterators of course are *not* just text replacement, which is pretty nasty, makes inconvenient contentless iterators necessary. Bah. But at least we can get rid of many attributes. > >>> +(define_insn "<fminmax><mode>3" > >>> + [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>") > >>> + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "<Fv>") > >>> + (match_operand:SFDF 2 "vsx_register_operand" "<Fv>")] > >> > >> Nit: both SD and DF are mapped to constraint wa, just hardcode Fv to wa? > > > > (SF and DF) > > > > Please keep <Fv> until <Ff> is gone as well, it is easier to read that > > way. And at that time get rid of *all* <Fv>. > > > > Indeed we could get rid of it, but only replacing it in some places and > > not others is confusing (or at least distracting). > > > > aha, thanks for the correction and explanation! I'll commit a patch series doing just this in a bit: it removes <Fv>, <Ff>, and RS6000_CONSTRAINT_f (it is the same as RS6000_CONSTRAINT_d always after all, since 2c2aa74d1df0 (2018, removing Xilinx FP)). > >> So I wonder if it would be more clear with: > >> 1) add new define_insn for xs{min,max}dp > > > > No, the existing thing should always do these insns. > > Currently xs{min,max}dp are only covered by smin/smax, no their own define_insn? Yeah, smin/smax cannot work if -ffast-math is not enabled, so we need some new stuff using an unspec, instead. But ideally we should have an RTL code that just is the correct thing always! Maybe "fmin/fmax" :-) Until such a thing exists we can use an unspec, sure. > Yeah, I agree. The above question/proposal still needs UNSPEC for new > define_insn, and if we want to replace the expansion pattern for bif > __builtin_vsx_xs{min,max}dp, we seem to have alternatives: > > 1) as Haochen's patch, new define_insn fmin/fmax with UNSPEC_F{MAX,MIN} > and used for actual insns xs{min,max}dp and replace bif table entry with > fmin/fmax*. > > 2) new define_insn vsx_xs{min,max}dp with UNSPEC_XS{MAX,MIN}DP and used > for actual insns xs{min,max}dp, new define_expand for fmin/fmax optabs, > replace bif table entry with vsx_xs{min,max}dp*. > > I personally felt (wondered) if 2) is more clear? Because the mnemonics > xs{min,max}dp don't have fmin/fmax inside the names, ISA doesn't even > mention they can be used for fmin/fmax in Programming Notes, it seems > more straight to see vsx_xs{min,max}dp used as bif expansion patterns > and UNSPEC_XS{MAX,MIN}DP. But both 1) and 2) are fine to me. :) I prefer 1), for the time being at least, because we have two different patterns both resulting in xs{min,max}dp. Segher
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index fdfbc6566a5..8aae3e80bcd 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -158,6 +158,8 @@ (define_c_enum "unspec" UNSPEC_HASHCHK UNSPEC_XXSPLTIDP_CONST UNSPEC_XXSPLTIW_CONST + UNSPEC_FMAX + UNSPEC_FMIN ]) ;; @@ -5350,6 +5352,25 @@ (define_insn_and_split "*s<minmax><mode>3_fpr" DONE; }) + +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) + +(define_int_attr fminmax [(UNSPEC_FMAX "fmax") + (UNSPEC_FMIN "fmin")]) + +(define_int_attr minmax_op [(UNSPEC_FMAX "max") + (UNSPEC_FMIN "min")]) + +(define_insn "<fminmax><mode>3" + [(set (match_operand:SFDF 0 "vsx_register_operand" "=<Fv>") + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "<Fv>") + (match_operand:SFDF 2 "vsx_register_operand" "<Fv>")] + FMINMAX))] +"TARGET_VSX" +"xs<minmax_op>dp %x0,%x1,%x2" +[(set_attr "type" "fp")] +) + (define_expand "mov<mode>cc" [(set (match_operand:GPR 0 "gpc_reg_operand") (if_then_else:GPR (match_operand 1 "comparison_operator") diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c new file mode 100644 index 00000000000..a40da064742 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O1 -mvsx" } */ +/* { dg-final { scan-assembler-times "xsmaxdp" 2 } } */ +/* { dg-final { scan-assembler-times "xsmindp" 2 } } */ + +#include <math.h> + +double test1 (double d0, double d1) +{ + return fmin (d0, d1); +} + +float test2 (float d0, float d1) +{ + return fmin (d0, d1); +} + +double test3 (double d0, double d1) +{ + return fmax (d0, d1); +} + +float test4 (float d0, float d1) +{ + return fmax (d0, d1); +}