Message ID | df790a93-52bf-761a-6586-78d540934f96@linux.ibm.com |
---|---|
State | New |
Headers |
Message-ID: <df790a93-52bf-761a-6586-78d540934f96@linux.ibm.com>
Date: Fri, 24 Jun 2022 10:02:19 +0800
From: HAO CHEN GUI via Gcc-patches <gcc-patches@gcc.gnu.org>
Reply-To: HAO CHEN GUI <guihaoc@linux.ibm.com>
To: gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Peter Bergner <bergner@linux.ibm.com>, David <dje.gcc@gmail.com>, Segher Boessenkool <segher@kernel.crashing.org>
Subject: [PATCH v6, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] |
Series |
[v6,rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
|
|
Commit Message
HAO CHEN GUI
June 24, 2022, 2:02 a.m. UTC
Hi,
  This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
Tests show that outputs of xs[min/max]dp are consistent with the standard
of C99 fmin/max.

  This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
of smin/max. So the builtins always generate xs[min/max]dp on all
platforms.

Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-06-24 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	PR target/103605
	* config/rs6000/rs6000.md (FMINMAX): New.
	(minmax_op): New.
	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
	* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
	pattern to fmaxdf3.
	(__builtin_vsx_xsmindp): Set pattern to fmindf3.

gcc/testsuite/
	PR target/103605
	* gcc.dg/powerpc/pr103605.c: New.

patch.diff
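The C99 behaviour that the commit message refers to can be sketched in portable C. This is only an illustration of the library-level rule (a quiet NaN operand is treated as missing data, so the numeric operand is returned); it does not exercise xs[min/max]dp itself, and the helper names below are made up for this sketch rather than taken from the patch.

```c
#include <math.h>
#include <stdbool.h>

/* True when fmin returns the numeric operand x regardless of which
   side the quiet NaN is on.  x must itself be a number.  */
static bool fmin_returns_number_over_nan(double x)
{
    return fmin(NAN, x) == x && fmin(x, NAN) == x;
}

/* Same check for fmax.  */
static bool fmax_returns_number_over_nan(double x)
{
    return fmax(NAN, x) == x && fmax(x, NAN) == x;
}

/* When both operands are NaN, the result is a NaN.  */
static bool fmin_of_two_nans_is_nan(void)
{
    return isnan(fmin(NAN, NAN)) != 0;
}
```

On most systems this needs to be linked with -lm. The checks assume the default floating-point mode; with -ffinite-math-only (implied by -ffast-math) the compiler may assume NaNs never occur, so the checks are not meaningful there.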
Comments
Hi,
  Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.
Hi,
  Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.

On 4/7/2022 2:32 PM, HAO CHEN GUI wrote:
> Hi,
>   Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
> Thanks.
Hi,
  Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
Thanks.

On 1/8/2022 10:03 AM, HAO CHEN GUI wrote:
> Hi,
>   Gentle ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html
> Thanks.
Hi Haochen,

on 2022/6/24 10:02, HAO CHEN GUI wrote:
> ChangeLog
> 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com>
>
> gcc/
> 	PR target/103605
> 	* config/rs6000/rs6000.md (FMINMAX): New.
> 	(minmax_op): New.
> 	(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.

Nit: the ChangeLog entry here misses UNSPEC_FMAX and UNSPEC_FMIN.

> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> new file mode 100644
> index 00000000000..1c938d40e61
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */

Nit: This dg-do line isn't needed.

OK with or without the two nits fixed. Thanks!

BR,
Kewen
Hi!

On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
> of smin/max. So the builtins always generate xs[min/max]dp on all
> platforms.

But how does this not blow up with -ffast-math?

In the other direction I am worried that the unspecs will degrade
performance (relative to smin/smax) when -ffast-math *is* active (and
this new builtin code and pattern doesn't blow up).

I still think we should get RTL codes for this, to have access to proper
floating point min/max semantics always and everywhere.  "fmin" and
"fmax" seem to be good names :-)


Segher
on 2022/9/22 05:56, Segher Boessenkool wrote:
> Hi!
>
> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>> of smin/max. So the builtins always generate xs[min/max]dp on all
>> platforms.
>
> But how does this not blow up with -ffast-math?

Indeed. Since it guards with "TARGET_VSX && !flag_finite_math_only",
the bifs seem to cause ICE at -ffast-math.

Haochen, could you double check it?

>
> In the other direction I am worried that the unspecs will degrade
> performance (relative to smin/smax) when -ffast-math *is* active (and
> this new builtin code and pattern doesn't blow up).

For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
the previous expansion to smin/smax contradicts the bif names (users
expect to map them to xs{min,max}dp rather than others).

>
> I still think we should get RTL codes for this, to have access to proper
> floating point min/max semantics always and everywhere.  "fmin" and
> "fmax" seem to be good names :-)

It would be good, especially if we have observed some uses of these bifs
and further opportunities around them. :)

BR,
Kewen
Hi Kewen & Segher,
  Thanks so much for your review comments.

On 22/9/2022 10:28 AM, Kewen.Lin wrote:
> on 2022/9/22 05:56, Segher Boessenkool wrote:
>> Hi!
>>
>> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>>>   This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
>>> of smin/max. So the builtins always generate xs[min/max]dp on all
>>> platforms.
>>
>> But how does this not blow up with -ffast-math?
>
> Indeed. Since it guards with "TARGET_VSX && !flag_finite_math_only",
> the bifs seem to cause ICE at -ffast-math.
>
> Haochen, could you double check it?

I tested it with "-ffast-math". The fmin/fmax functions are converted to
MIN/MAX_EXPR in the gimple lowering pass, but the built-ins are not, and
they hit the ICE. I thought the built-ins were folded to MIN/MAX_EXPR, like
the vec_ versions, when fast-math is set. In fact they are not. Sorry for
that.

I made a patch to fold these two built-ins to MIN/MAX_EXPR when fast-math
is set. Then the built-ins are converted to MIN/MAX_EXPR and expanded to
smin/max. Thanks for pointing out the problem!

>
>>
>> In the other direction I am worried that the unspecs will degrade
>> performance (relative to smin/smax) when -ffast-math *is* active (and
>> this new builtin code and pattern doesn't blow up).
>
> For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
> EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
> the previous expansion to smin/smax contradicts the bif names (users
> expect to map them to xs{min,max}dp rather than others).
>
>>
>> I still think we should get RTL codes for this, to have access to proper
>> floating point min/max semantics always and everywhere.  "fmin" and
>> "fmax" seem to be good names :-)
>
> It would be good, especially if we have observed some uses of these bifs
> and further opportunities around them. :)
>
Shall we submit a PR to add fmin/fmax to RTL codes?

> BR,
> Kewen
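The decision described in the reply above can be pictured with a small toy model. This is not GCC source: the enum, the function names and the boolean parameters are hypothetical stand-ins, and only the condition `TARGET_VSX && !flag_finite_math_only` is taken from the posted pattern.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in GCC.  */
enum builtin_lowering {
    LOWER_TO_MIN_MAX_EXPR,   /* folded early, later expanded via smin/smax */
    LOWER_TO_FMINMAX_INSN    /* kept as a call, expanded via f{min,max}df3 */
};

/* Mirrors the insn condition "TARGET_VSX && !flag_finite_math_only"
   from the posted f<minmax_op><mode>3 pattern.  */
static bool fminmax_insn_enabled(bool target_vsx, bool finite_math_only)
{
    return target_vsx && !finite_math_only;
}

/* Models the fix described above: fold the built-in when fast-math
   (finite-math-only) is set, so that it never reaches a pattern whose
   condition is false.  */
static enum builtin_lowering choose_builtin_lowering(bool finite_math_only)
{
    return finite_math_only ? LOWER_TO_MIN_MAX_EXPR : LOWER_TO_FMINMAX_INSN;
}
```

In this model the ICE corresponds to choosing `LOWER_TO_FMINMAX_INSN` while `fminmax_insn_enabled` is false, which is the combination the folding is meant to rule out.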
Hi!

On Thu, Sep 22, 2022 at 05:59:07PM +0800, HAO CHEN GUI wrote:
> >> I still think we should get RTL codes for this, to have access to proper
> >> floating point min/max semantics always and everywhere.  "fmin" and
> >> "fmax" seem to be good names :-)
> >
> > It would be good, especially if we have observed some uses of these bifs
> > and further opportunities around them. :)
> >
> Shall we submit a PR to add fmin/fmax to RTL codes?

Yes, please do.

If we have fmin/fmax RTL codes that describe the standard semantics, we
can generate code for that with -ffast-math as well, since the code
generated is optimal in either case; it's just the *generic*
optimisations that fall behind.


Segher
Hi!

On Thu, Sep 22, 2022 at 10:28:23AM +0800, Kewen.Lin wrote:
> on 2022/9/22 05:56, Segher Boessenkool wrote:
> > On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
> > In the other direction I am worried that the unspecs will degrade
> > performance (relative to smin/smax) when -ffast-math *is* active (and
> > this new builtin code and pattern doesn't blow up).
>
> For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
> EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
> the previous expansion to smin/smax contradicts the bif names (users
> expect to map them to xs{min,max}dp rather than others).

But builtins *never* say to generate any particular instruction.  They
say to generate code that implements certain functionality.  For many
builtins this does of course boil down to specific instructions, but
even then it could be optimised away completely or replaced with
something more specific if things can be folded or such.

> > I still think we should get RTL codes for this, to have access to proper
> > floating point min/max semantics always and everywhere.  "fmin" and
> > "fmax" seem to be good names :-)
>
> It would be good, especially if we have observed some uses of these bifs
> and further opportunities around them. :)

Currently we only have smin/smax for float, and those are not valid for
NaNs, or when the sign of zeros is relevant.  On the other hand the
semantics of fmin/fmax are settled and in most standards nowadays.  So
it is time we did this I would say :-)


Segher
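The point about NaNs and signed zeros can be made concrete with a comparison-based minimum, which is the kind of operation smin is permitted to be. The sketch below is plain C written for this illustration (the function names are not from GCC); it shows that with such an implementation the result depends on operand order exactly in the two cases named above.

```c
#include <math.h>
#include <stdbool.h>

/* A comparison-based minimum: fine for ordinary numbers, but it has no
   special handling for NaNs or for the sign of zero.  */
static double min_by_compare(double a, double b)
{
    return a < b ? a : b;
}

/* Every comparison with a NaN is false, so the second operand is always
   returned: the number in one order, the NaN in the other.  */
static bool nan_result_depends_on_order(void)
{
    return min_by_compare(NAN, 1.0) == 1.0
        && isnan(min_by_compare(1.0, NAN)) != 0;
}

/* +0.0 and -0.0 compare equal, so again the second operand is returned
   and the sign of the zero result depends on operand order.  */
static bool zero_sign_depends_on_order(void)
{
    return signbit(min_by_compare(0.0, -0.0)) != 0
        && signbit(min_by_compare(-0.0, 0.0)) == 0;
}
```

By contrast, C99 fmin returns the numeric operand whichever side the NaN is on, which is why a separate fmin/fmax operation is needed when NaNs are honoured.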
Hi Segher,

on 2022/9/22 22:05, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Sep 22, 2022 at 10:28:23AM +0800, Kewen.Lin wrote:
>> on 2022/9/22 05:56, Segher Boessenkool wrote:
>>> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote:
>>> In the other direction I am worried that the unspecs will degrade
>>> performance (relative to smin/smax) when -ffast-math *is* active (and
>>> this new builtin code and pattern doesn't blow up).
>>
>> For fmin/fmax it would be fine, since they are transformed to {MAX,MIN}
>> EXPR in middle end, and yes, it can degrade for the bifs, although IMHO
>> the previous expansion to smin/smax contradicts the bif names (users
>> expect to map them to xs{min,max}dp rather than others).
>
> But builtins *never* say to generate any particular instruction.  They
> say to generate code that implements certain functionality.  For many
> builtins this does of course boil down to specific instructions, but
> even then it could be optimised away completely or replaced with
> something more specific if things can be folded or such.

Ah, your explanation refreshed my mind, thanks! Previously I thought that
bifs with a specific mnemonic as part of their names should be used to
generate that specific instruction, to save users the effort of using
inline asm, and that if we want them to represent generic functionality
(not bound to a specific instruction) we can use generic names instead.

Following your explanation, binding at fast-math isn't needed, so I think
Haochen's patch v7 with gimple folding can avoid the concern about
degradation at fast-math (still smax/smin), nice. :)

BR,
Kewen
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f4a9f24bcc5..8b735493b40 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1613,10 +1613,10 @@
     XSCVSPDP vsx_xscvspdp {}

   const double __builtin_vsx_xsmaxdp (double, double);
-    XSMAXDP smaxdf3 {}
+    XSMAXDP fmaxdf3 {}

   const double __builtin_vsx_xsmindp (double, double);
-    XSMINDP smindf3 {}
+    XSMINDP fmindf3 {}

   const double __builtin_vsx_xsrdpi (double);
     XSRDPI vsx_xsrdpi {}
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bf85baa5370..ae0dd98f0f9 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -158,6 +158,8 @@ (define_c_enum "unspec"
    UNSPEC_HASHCHK
    UNSPEC_XXSPLTIDP_CONST
    UNSPEC_XXSPLTIW_CONST
+   UNSPEC_FMAX
+   UNSPEC_FMIN
   ])

 ;;
@@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
   DONE;
 })

+
+(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
+
+(define_int_attr minmax_op [(UNSPEC_FMAX "max")
+                            (UNSPEC_FMIN "min")])
+
+(define_insn "f<minmax_op><mode>3"
+  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
+        (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
+                      (match_operand:SFDF 2 "vsx_register_operand" "wa")]
+                     FMINMAX))]
+  "TARGET_VSX && !flag_finite_math_only"
+  "xs<minmax_op>dp %x0,%x1,%x2"
+  [(set_attr "type" "fp")]
+)
+
 (define_expand "mov<mode>cc"
    [(set (match_operand:GPR 0 "gpc_reg_operand")
          (if_then_else:GPR (match_operand 1 "comparison_operator")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c
new file mode 100644
index 00000000000..1c938d40e61
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
+
+#include <math.h>
+
+double test1 (double d0, double d1)
+{
+  return fmin (d0, d1);
+}
+
+float test2 (float d0, float d1)
+{
+  return fmin (d0, d1);
+}
+
+double test3 (double d0, double d1)
+{
+  return fmax (d0, d1);
+}
+
+float test4 (float d0, float d1)
+{
+  return fmax (d0, d1);
+}
+
+double test5 (double d0, double d1)
+{
+  return __builtin_vsx_xsmindp (d0, d1);
+}
+
+double test6 (double d0, double d1)
+{
+  return __builtin_vsx_xsmaxdp (d0, d1);
+}