[rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set
Message ID | d98a12d0-6089-315d-9264-62260ec85530@linux.ibm.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 046CD3858402 for <patchwork@sourceware.org>; Tue, 12 Oct 2021 08:58:53 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 046CD3858402 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1634029133; bh=9ZQxJeDnL7D7X5RAQHT1PLH9Dzc/ZZm3ZIx0GbhnOxM=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=CT/KCD+2wn39LaDqGhZ9slHbXbDrfgm81acEQYhuFsPELGozeN44b5Xe8mznSJuMn yhvubGSEWaiPKlxUS6IEHXzqUcj5UyCo/B6NCkZ1pxH0sbpgZESzj/Zt41SwqPkkHu 19VoMj04OqpWSF1UPpHsm3rhD4OIeGgfKHCnBSQQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 172A33857C6C for <gcc-patches@gcc.gnu.org>; Tue, 12 Oct 2021 08:57:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 172A33857C6C Received: from pps.filterd (m0098399.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.1.2/8.16.1.2) with SMTP id 19C887C0026655; Tue, 12 Oct 2021 04:57:55 -0400 Received: from ppma05fra.de.ibm.com (6c.4a.5195.ip4.static.sl-reverse.com [149.81.74.108]) by mx0a-001b2d01.pphosted.com with ESMTP id 3bn49ykwyp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 12 Oct 2021 04:57:55 -0400 Received: from pps.filterd (ppma05fra.de.ibm.com [127.0.0.1]) by ppma05fra.de.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 19C8rOXI032284; Tue, 12 Oct 2021 08:57:52 GMT Received: from b06cxnps4074.portsmouth.uk.ibm.com (d06relay11.portsmouth.uk.ibm.com [9.149.109.196]) by ppma05fra.de.ibm.com with ESMTP id 3bk2q9mpc2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 12 Oct 2021 08:57:52 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps4074.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 19C8vmqi2097718 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Tue, 12 Oct 2021 08:57:48 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B6807A4040; Tue, 12 Oct 2021 08:57:48 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id F0FF8A406B; Tue, 12 Oct 2021 08:57:46 +0000 (GMT) Received: from [9.200.40.250] (unknown [9.200.40.250]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTP; Tue, 12 Oct 2021 08:57:46 +0000 (GMT) Message-ID: <d98a12d0-6089-315d-9264-62260ec85530@linux.ibm.com> Date: Tue, 12 Oct 2021 16:57:43 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.1.2 Content-Language: en-US To: gcc-patches <gcc-patches@gcc.gnu.org> Subject: [PATCH, rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: EDtNizIQK2yYNf9q5Py9HpVBvBEsKu7T X-Proofpoint-GUID: EDtNizIQK2yYNf9q5Py9HpVBvBEsKu7T X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.182.1,Aquarius:18.0.790,Hydra:6.0.425,FMLib:17.0.607.475 definitions=2021-10-12_02,2021-10-11_01,2020-04-07_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 lowpriorityscore=0 bulkscore=0 clxscore=1015 mlxscore=0 mlxlogscore=999 malwarescore=0 adultscore=0 priorityscore=1501 phishscore=0 spamscore=0 impostorscore=0 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2109230001 definitions=main-2110120047 X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, BODY_8BITS, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> From: HAO CHEN GUI via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: HAO CHEN GUI <guihaoc@linux.ibm.com> Cc: Bill Schmidt <wschmidt@linux.ibm.com>, Segher Boessenkool <segher@kernel.crashing.org> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
[rs6000] Disable gimple fold for float or double vec_minmax when fast-math is not set
|
|
Commit Message
HAO CHEN GUI
Oct. 12, 2021, 8:57 a.m. UTC
Hi, This patch disables gimple folding for float or double vec_min/max when fast-math is not set. It makes vec_min/max conform with the guide. Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. I re-send the patch as previous one is messed up in email thread. Sorry for that. ChangeLog 2021-08-25 Haochen Gui <guihaoc@linux.ibm.com> gcc/ * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): Modify the VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP, VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP expansions. gcc/testsuite/ * gcc.target/powerpc/vec-minmax-1.c: New test. * gcc.target/powerpc/vec-minmax-2.c: Likewise. patch.diff
Comments
On Tue, Oct 12, 2021 at 10:59 AM HAO CHEN GUI via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi, > > This patch disables gimple folding for float or double vec_min/max when fast-math is not set. It makes vec_min/max conform with the guide. > > Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. > > I re-send the patch as previous one is messed up in email thread. Sorry for that. If the VSX/altivec min/max instructions conform to IEEE behavior then you could instead fold to .F{MIN,MAX} internal functions and define the f{min,max} optabs. Otherwise the patch looks correct to me - MIN_EXPR and MAX_EXPR are not IEEE conforming. Note a better check would be to use HONOR_NANS/HONOR_SIGNED_ZEROS on the argument type (that also works for the integer types with the obvious answer). Richard. > ChangeLog > > 2021-08-25 Haochen Gui <guihaoc@linux.ibm.com> > > gcc/ > * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): > Modify the VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP, > VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP expansions. > > gcc/testsuite/ > * gcc.target/powerpc/vec-minmax-1.c: New test. > * gcc.target/powerpc/vec-minmax-2.c: Likewise. > > > patch.diff > > diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c > index b4e13af4dc6..90527734ceb 100644 > --- a/gcc/config/rs6000/rs6000-call.c > +++ b/gcc/config/rs6000/rs6000-call.c > @@ -12159,6 +12159,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > return true; > /* flavors of vec_min. */ > case VSX_BUILTIN_XVMINDP: > + case ALTIVEC_BUILTIN_VMINFP: > + if (!flag_finite_math_only || flag_signed_zeros) > + return false; > + /* Fall through to MIN_EXPR. */ > + gcc_fallthrough (); > case P8V_BUILTIN_VMINSD: > case P8V_BUILTIN_VMINUD: > case ALTIVEC_BUILTIN_VMINSB: > @@ -12167,7 +12172,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > case ALTIVEC_BUILTIN_VMINUB: > case ALTIVEC_BUILTIN_VMINUH: > case ALTIVEC_BUILTIN_VMINUW: > - case ALTIVEC_BUILTIN_VMINFP: > arg0 = gimple_call_arg (stmt, 0); > arg1 = gimple_call_arg (stmt, 1); > lhs = gimple_call_lhs (stmt); > @@ -12177,6 +12181,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > return true; > /* flavors of vec_max. */ > case VSX_BUILTIN_XVMAXDP: > + case ALTIVEC_BUILTIN_VMAXFP: > + if (!flag_finite_math_only || flag_signed_zeros) > + return false; > + /* Fall through to MAX_EXPR. */ > + gcc_fallthrough (); > case P8V_BUILTIN_VMAXSD: > case P8V_BUILTIN_VMAXUD: > case ALTIVEC_BUILTIN_VMAXSB: > @@ -12185,7 +12194,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > case ALTIVEC_BUILTIN_VMAXUB: > case ALTIVEC_BUILTIN_VMAXUH: > case ALTIVEC_BUILTIN_VMAXUW: > - case ALTIVEC_BUILTIN_VMAXFP: > arg0 = gimple_call_arg (stmt, 0); > arg1 = gimple_call_arg (stmt, 1); > lhs = gimple_call_lhs (stmt); > diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c > new file mode 100644 > index 00000000000..547798fd65c > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c > @@ -0,0 +1,53 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target powerpc_p9vector_ok } */ > +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */ > +/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */ > + > +/* This test verifies that float or double vec_min/max are bound to > + xv[min|max][d|s]p instructions when fast-math is not set. */ > + > + > +#include <altivec.h> > + > +#ifdef _BIG_ENDIAN > + const int PREF_D = 0; > +#else > + const int PREF_D = 1; > +#endif > + > +double vmaxd (double a, double b) > +{ > + vector double va = vec_promote (a, PREF_D); > + vector double vb = vec_promote (b, PREF_D); > + return vec_extract (vec_max (va, vb), PREF_D); > +} > + > +double vmind (double a, double b) > +{ > + vector double va = vec_promote (a, PREF_D); > + vector double vb = vec_promote (b, PREF_D); > + return vec_extract (vec_min (va, vb), PREF_D); > +} > + > +#ifdef _BIG_ENDIAN > + const int PREF_F = 0; > +#else > + const int PREF_F = 3; > +#endif > + > +float vmaxf (float a, float b) > +{ > + vector float va = vec_promote (a, PREF_F); > + vector float vb = vec_promote (b, PREF_F); > + return vec_extract (vec_max (va, vb), PREF_F); > +} > + > +float vminf (float a, float b) > +{ > + vector float va = vec_promote (a, PREF_F); > + vector float vb = vec_promote (b, PREF_F); > + return vec_extract (vec_min (va, vb), PREF_F); > +} > diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c > new file mode 100644 > index 00000000000..4c6f4365830 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c > @@ -0,0 +1,51 @@ > +/* { dg-do compile { target { powerpc*-*-* } } } */ > +/* { dg-require-effective-target powerpc_p9vector_ok } */ > +/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */ > +/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */ > +/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */ > + > +/* This test verifies that float or double vec_min/max can be converted > + to scalar comparison when fast-math is set. */ > + > + > +#include <altivec.h> > + > +#ifdef _BIG_ENDIAN > + const int PREF_D = 0; > +#else > + const int PREF_D = 1; > +#endif > + > +double vmaxd (double a, double b) > +{ > + vector double va = vec_promote (a, PREF_D); > + vector double vb = vec_promote (b, PREF_D); > + return vec_extract (vec_max (va, vb), PREF_D); > +} > + > +double vmind (double a, double b) > +{ > + vector double va = vec_promote (a, PREF_D); > + vector double vb = vec_promote (b, PREF_D); > + return vec_extract (vec_min (va, vb), PREF_D); > +} > + > +#ifdef _BIG_ENDIAN > + const int PREF_F = 0; > +#else > + const int PREF_F = 3; > +#endif > + > +float vmaxf (float a, float b) > +{ > + vector float va = vec_promote (a, PREF_F); > + vector float vb = vec_promote (b, PREF_F); > + return vec_extract (vec_max (va, vb), PREF_F); > +} > + > +float vminf (float a, float b) > +{ > + vector float va = vec_promote (a, PREF_F); > + vector float vb = vec_promote (b, PREF_F); > + return vec_extract (vec_min (va, vb), PREF_F); > +} >
Richard, Thanks so much for your comments. As far as I know, VSX/altivec min/max instructions don't conform with C-Sytle Min/Max Macro. The fold converts it to MIN/MAX_EXPR then it has a chance to be implemented by scalar min/max instructions which conform with C-Sytle Min/Max Macro. That's why I made this patch. As to IEEE behavior, do you mean "Minimum and maximum operations" defined in IEEE-754 2019? If so, I think VSX/altivec min/max instructions don't conform with it. It demands a quite NaN if either operand is a NaN while our instructions don't. IEEE-754 2019 maximum(x, y) is xif x>y, yif y>x, and a quiet NaN if either operand is a NaN, according to 6.2. For this operation, +0 compares greater than −0. Otherwise (i.e., when x=y and signs are the same) it is either xor y. Actions for xvmaxdp On 12/10/2021 下午 5:57, Richard Biener wrote: > On Tue, Oct 12, 2021 at 10:59 AM HAO CHEN GUI via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> Hi, >> >> This patch disables gimple folding for float or double vec_min/max when fast-math is not set. It makes vec_min/max conform with the guide. >> >> Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. >> >> I re-send the patch as previous one is messed up in email thread. Sorry for that. > If the VSX/altivec min/max instructions conform to IEEE behavior then > you could instead fold > to .F{MIN,MAX} internal functions and define the f{min,max} optabs. > > Otherwise the patch looks correct to me - MIN_EXPR and MAX_EXPR are > not IEEE conforming. > Note a better check would be to use HONOR_NANS/HONOR_SIGNED_ZEROS on > the argument type > (that also works for the integer types with the obvious answer). > > Richard. > >> ChangeLog >> >> 2021-08-25 Haochen Gui <guihaoc@linux.ibm.com> >> >> gcc/ >> * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): >> Modify the VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP, >> VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP expansions. >> >> gcc/testsuite/ >> * gcc.target/powerpc/vec-minmax-1.c: New test. >> * gcc.target/powerpc/vec-minmax-2.c: Likewise. >> >> >> patch.diff >> >> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c >> index b4e13af4dc6..90527734ceb 100644 >> --- a/gcc/config/rs6000/rs6000-call.c >> +++ b/gcc/config/rs6000/rs6000-call.c >> @@ -12159,6 +12159,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >> return true; >> /* flavors of vec_min. */ >> case VSX_BUILTIN_XVMINDP: >> + case ALTIVEC_BUILTIN_VMINFP: >> + if (!flag_finite_math_only || flag_signed_zeros) >> + return false; >> + /* Fall through to MIN_EXPR. */ >> + gcc_fallthrough (); >> case P8V_BUILTIN_VMINSD: >> case P8V_BUILTIN_VMINUD: >> case ALTIVEC_BUILTIN_VMINSB: >> @@ -12167,7 +12172,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >> case ALTIVEC_BUILTIN_VMINUB: >> case ALTIVEC_BUILTIN_VMINUH: >> case ALTIVEC_BUILTIN_VMINUW: >> - case ALTIVEC_BUILTIN_VMINFP: >> arg0 = gimple_call_arg (stmt, 0); >> arg1 = gimple_call_arg (stmt, 1); >> lhs = gimple_call_lhs (stmt); >> @@ -12177,6 +12181,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >> return true; >> /* flavors of vec_max. */ >> case VSX_BUILTIN_XVMAXDP: >> + case ALTIVEC_BUILTIN_VMAXFP: >> + if (!flag_finite_math_only || flag_signed_zeros) >> + return false; >> + /* Fall through to MAX_EXPR. */ >> + gcc_fallthrough (); >> case P8V_BUILTIN_VMAXSD: >> case P8V_BUILTIN_VMAXUD: >> case ALTIVEC_BUILTIN_VMAXSB: >> @@ -12185,7 +12194,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >> case ALTIVEC_BUILTIN_VMAXUB: >> case ALTIVEC_BUILTIN_VMAXUH: >> case ALTIVEC_BUILTIN_VMAXUW: >> - case ALTIVEC_BUILTIN_VMAXFP: >> arg0 = gimple_call_arg (stmt, 0); >> arg1 = gimple_call_arg (stmt, 1); >> lhs = gimple_call_lhs (stmt); >> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c >> new file mode 100644 >> index 00000000000..547798fd65c >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c >> @@ -0,0 +1,53 @@ >> +/* { dg-do compile { target { powerpc*-*-* } } } */ >> +/* { dg-require-effective-target powerpc_p9vector_ok } */ >> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */ >> +/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */ >> +/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */ >> +/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */ >> +/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */ >> + >> +/* This test verifies that float or double vec_min/max are bound to >> + xv[min|max][d|s]p instructions when fast-math is not set. */ >> + >> + >> +#include <altivec.h> >> + >> +#ifdef _BIG_ENDIAN >> + const int PREF_D = 0; >> +#else >> + const int PREF_D = 1; >> +#endif >> + >> +double vmaxd (double a, double b) >> +{ >> + vector double va = vec_promote (a, PREF_D); >> + vector double vb = vec_promote (b, PREF_D); >> + return vec_extract (vec_max (va, vb), PREF_D); >> +} >> + >> +double vmind (double a, double b) >> +{ >> + vector double va = vec_promote (a, PREF_D); >> + vector double vb = vec_promote (b, PREF_D); >> + return vec_extract (vec_min (va, vb), PREF_D); >> +} >> + >> +#ifdef _BIG_ENDIAN >> + const int PREF_F = 0; >> +#else >> + const int PREF_F = 3; >> +#endif >> + >> +float vmaxf (float a, float b) >> +{ >> + vector float va = vec_promote (a, PREF_F); >> + vector float vb = vec_promote (b, PREF_F); >> + return vec_extract (vec_max (va, vb), PREF_F); >> +} >> + >> +float vminf (float a, float b) >> +{ >> + vector float va = vec_promote (a, PREF_F); >> + vector float vb = vec_promote (b, PREF_F); >> + return vec_extract (vec_min (va, vb), PREF_F); >> +} >> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c >> new file mode 100644 >> index 00000000000..4c6f4365830 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c >> @@ -0,0 +1,51 @@ >> +/* { dg-do compile { target { powerpc*-*-* } } } */ >> +/* { dg-require-effective-target powerpc_p9vector_ok } */ >> +/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */ >> +/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */ >> +/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */ >> + >> +/* This test verifies that float or double vec_min/max can be converted >> + to scalar comparison when fast-math is set. */ >> + >> + >> +#include <altivec.h> >> + >> +#ifdef _BIG_ENDIAN >> + const int PREF_D = 0; >> +#else >> + const int PREF_D = 1; >> +#endif >> + >> +double vmaxd (double a, double b) >> +{ >> + vector double va = vec_promote (a, PREF_D); >> + vector double vb = vec_promote (b, PREF_D); >> + return vec_extract (vec_max (va, vb), PREF_D); >> +} >> + >> +double vmind (double a, double b) >> +{ >> + vector double va = vec_promote (a, PREF_D); >> + vector double vb = vec_promote (b, PREF_D); >> + return vec_extract (vec_min (va, vb), PREF_D); >> +} >> + >> +#ifdef _BIG_ENDIAN >> + const int PREF_F = 0; >> +#else >> + const int PREF_F = 3; >> +#endif >> + >> +float vmaxf (float a, float b) >> +{ >> + vector float va = vec_promote (a, PREF_F); >> + vector float vb = vec_promote (b, PREF_F); >> + return vec_extract (vec_max (va, vb), PREF_F); >> +} >> + >> +float vminf (float a, float b) >> +{ >> + vector float va = vec_promote (a, PREF_F); >> + vector float vb = vec_promote (b, PREF_F); >> + return vec_extract (vec_min (va, vb), PREF_F); >> +} >>
On Wed, Oct 13, 2021 at 9:43 AM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote: > > Richard, > > Thanks so much for your comments. > > As far as I know, VSX/altivec min/max instructions don't conform with C-Sytle Min/Max Macro. The fold converts it to MIN/MAX_EXPR then it has a chance to be implemented by scalar min/max instructions which conform with C-Sytle Min/Max Macro. That's why I made this patch. > > As to IEEE behavior, do you mean "Minimum and maximum operations" defined in IEEE-754 2019? If so, I think VSX/altivec min/max instructions don't conform with it. It demands a quite NaN if either operand is a NaN while our instructions don't. > > IEEE-754 2019 maximum(x, y) is xif x>y, yif y>x, and a quiet NaN if either operand is a NaN, according to 6.2. For this operation, +0 compares greater than −0. Otherwise (i.e., when x=y and signs are the same) it is either xor y. Actions for xvmaxdp Hmm, then I do not understand the reason for the patch - people using the intrinsics cannot expect IEEE semantics then. So you are concerned that people don't get the 1:1 machine instruction but eventually the IEEE conforming MIN/MAX_EXPR? But that can then still happen with -ffast-math so I wonder what's the point. Richard. > On 12/10/2021 下午 5:57, Richard Biener wrote: > > On Tue, Oct 12, 2021 at 10:59 AM HAO CHEN GUI via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> Hi, > >> > >> This patch disables gimple folding for float or double vec_min/max when fast-math is not set. It makes vec_min/max conform with the guide. > >> > >> Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. > >> > >> I re-send the patch as previous one is messed up in email thread. Sorry for that. > > If the VSX/altivec min/max instructions conform to IEEE behavior then > > you could instead fold > > to .F{MIN,MAX} internal functions and define the f{min,max} optabs. > > > > Otherwise the patch looks correct to me - MIN_EXPR and MAX_EXPR are > > not IEEE conforming. > > Note a better check would be to use HONOR_NANS/HONOR_SIGNED_ZEROS on > > the argument type > > (that also works for the integer types with the obvious answer). > > > > Richard. > > > >> ChangeLog > >> > >> 2021-08-25 Haochen Gui <guihaoc@linux.ibm.com> > >> > >> gcc/ > >> * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): > >> Modify the VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP, > >> VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP expansions. > >> > >> gcc/testsuite/ > >> * gcc.target/powerpc/vec-minmax-1.c: New test. > >> * gcc.target/powerpc/vec-minmax-2.c: Likewise. > >> > >> > >> patch.diff > >> > >> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c > >> index b4e13af4dc6..90527734ceb 100644 > >> --- a/gcc/config/rs6000/rs6000-call.c > >> +++ b/gcc/config/rs6000/rs6000-call.c > >> @@ -12159,6 +12159,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > >> return true; > >> /* flavors of vec_min. */ > >> case VSX_BUILTIN_XVMINDP: > >> + case ALTIVEC_BUILTIN_VMINFP: > >> + if (!flag_finite_math_only || flag_signed_zeros) > >> + return false; > >> + /* Fall through to MIN_EXPR. */ > >> + gcc_fallthrough (); > >> case P8V_BUILTIN_VMINSD: > >> case P8V_BUILTIN_VMINUD: > >> case ALTIVEC_BUILTIN_VMINSB: > >> @@ -12167,7 +12172,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > >> case ALTIVEC_BUILTIN_VMINUB: > >> case ALTIVEC_BUILTIN_VMINUH: > >> case ALTIVEC_BUILTIN_VMINUW: > >> - case ALTIVEC_BUILTIN_VMINFP: > >> arg0 = gimple_call_arg (stmt, 0); > >> arg1 = gimple_call_arg (stmt, 1); > >> lhs = gimple_call_lhs (stmt); > >> @@ -12177,6 +12181,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > >> return true; > >> /* flavors of vec_max. */ > >> case VSX_BUILTIN_XVMAXDP: > >> + case ALTIVEC_BUILTIN_VMAXFP: > >> + if (!flag_finite_math_only || flag_signed_zeros) > >> + return false; > >> + /* Fall through to MAX_EXPR. */ > >> + gcc_fallthrough (); > >> case P8V_BUILTIN_VMAXSD: > >> case P8V_BUILTIN_VMAXUD: > >> case ALTIVEC_BUILTIN_VMAXSB: > >> @@ -12185,7 +12194,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) > >> case ALTIVEC_BUILTIN_VMAXUB: > >> case ALTIVEC_BUILTIN_VMAXUH: > >> case ALTIVEC_BUILTIN_VMAXUW: > >> - case ALTIVEC_BUILTIN_VMAXFP: > >> arg0 = gimple_call_arg (stmt, 0); > >> arg1 = gimple_call_arg (stmt, 1); > >> lhs = gimple_call_lhs (stmt); > >> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c > >> new file mode 100644 > >> index 00000000000..547798fd65c > >> --- /dev/null > >> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c > >> @@ -0,0 +1,53 @@ > >> +/* { dg-do compile { target { powerpc*-*-* } } } */ > >> +/* { dg-require-effective-target powerpc_p9vector_ok } */ > >> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */ > >> +/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */ > >> +/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */ > >> +/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */ > >> +/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */ > >> + > >> +/* This test verifies that float or double vec_min/max are bound to > >> + xv[min|max][d|s]p instructions when fast-math is not set. */ > >> + > >> + > >> +#include <altivec.h> > >> + > >> +#ifdef _BIG_ENDIAN > >> + const int PREF_D = 0; > >> +#else > >> + const int PREF_D = 1; > >> +#endif > >> + > >> +double vmaxd (double a, double b) > >> +{ > >> + vector double va = vec_promote (a, PREF_D); > >> + vector double vb = vec_promote (b, PREF_D); > >> + return vec_extract (vec_max (va, vb), PREF_D); > >> +} > >> + > >> +double vmind (double a, double b) > >> +{ > >> + vector double va = vec_promote (a, PREF_D); > >> + vector double vb = vec_promote (b, PREF_D); > >> + return vec_extract (vec_min (va, vb), PREF_D); > >> +} > >> + > >> +#ifdef _BIG_ENDIAN > >> + const int PREF_F = 0; > >> +#else > >> + const int PREF_F = 3; > >> +#endif > >> + > >> +float vmaxf (float a, float b) > >> +{ > >> + vector float va = vec_promote (a, PREF_F); > >> + vector float vb = vec_promote (b, PREF_F); > >> + return vec_extract (vec_max (va, vb), PREF_F); > >> +} > >> + > >> +float vminf (float a, float b) > >> +{ > >> + vector float va = vec_promote (a, PREF_F); > >> + vector float vb = vec_promote (b, PREF_F); > >> + return vec_extract (vec_min (va, vb), PREF_F); > >> +} > >> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c > >> new file mode 100644 > >> index 00000000000..4c6f4365830 > >> --- /dev/null > >> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c > >> @@ -0,0 +1,51 @@ > >> +/* { dg-do compile { target { powerpc*-*-* } } } */ > >> +/* { dg-require-effective-target powerpc_p9vector_ok } */ > >> +/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */ > >> +/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */ > >> +/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */ > >> + > >> +/* This test verifies that float or double vec_min/max can be converted > >> + to scalar comparison when fast-math is set. */ > >> + > >> + > >> +#include <altivec.h> > >> + > >> +#ifdef _BIG_ENDIAN > >> + const int PREF_D = 0; > >> +#else > >> + const int PREF_D = 1; > >> +#endif > >> + > >> +double vmaxd (double a, double b) > >> +{ > >> + vector double va = vec_promote (a, PREF_D); > >> + vector double vb = vec_promote (b, PREF_D); > >> + return vec_extract (vec_max (va, vb), PREF_D); > >> +} > >> + > >> +double vmind (double a, double b) > >> +{ > >> + vector double va = vec_promote (a, PREF_D); > >> + vector double vb = vec_promote (b, PREF_D); > >> + return vec_extract (vec_min (va, vb), PREF_D); > >> +} > >> + > >> +#ifdef _BIG_ENDIAN > >> + const int PREF_F = 0; > >> +#else > >> + const int PREF_F = 3; > >> +#endif > >> + > >> +float vmaxf (float a, float b) > >> +{ > >> + vector float va = vec_promote (a, PREF_F); > >> + vector float vb = vec_promote (b, PREF_F); > >> + return vec_extract (vec_max (va, vb), PREF_F); > >> +} > >> + > >> +float vminf (float a, float b) > >> +{ > >> + vector float va = vec_promote (a, PREF_F); > >> + vector float vb = vec_promote (b, PREF_F); > >> + return vec_extract (vec_min (va, vb), PREF_F); > >> +} > >>
On 13/10/2021 下午 4:29, Richard Biener wrote: > On Wed, Oct 13, 2021 at 9:43 AM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote: >> Richard, >> >> Thanks so much for your comments. >> >> As far as I know, VSX/altivec min/max instructions don't conform with C-Sytle Min/Max Macro. The fold converts it to MIN/MAX_EXPR then it has a chance to be implemented by scalar min/max instructions which conform with C-Sytle Min/Max Macro. That's why I made this patch. >> >> As to IEEE behavior, do you mean "Minimum and maximum operations" defined in IEEE-754 2019? If so, I think VSX/altivec min/max instructions don't conform with it. It demands a quite NaN if either operand is a NaN while our instructions don't. >> >> IEEE-754 2019 maximum(x, y) is xif x>y, yif y>x, and a quiet NaN if either operand is a NaN, according to 6.2. For this operation, +0 compares greater than −0. Otherwise (i.e., when x=y and signs are the same) it is either xor y. Actions for xvmaxdp > Hmm, then I do not understand the reason for the patch - people using > the intrinsics cannot expect IEEE semantics then. > So you are concerned that people don't get the 1:1 machine instruction > but eventually the IEEE conforming MIN/MAX_EXPR? > But that can then still happen with -ffast-math so I wonder what's the point. > > Richard. The reason for the patch is to keep compatibility between different Power servers. The old servers don't have C-style Min/Max instructions and all are implemented by VSX/altivec min/max instructions. So I just want to keep the compatibility. For fast-math, the C-style Min/Max should be acceptable, I think. The IEEE standard changed. VSX/altivec min/max instructions conform with IEEE 754-2008 (the old standard), but not with IEEE 754-2019 (the last one). As of 2019, the formerly required/minNum, maxNum, minNumMag, and maxNumMag/in IEEE 754-2008 are now deleted due to their non-associativity. Instead, two sets of new/minimum and maximum operations/are recommended. 754-2008 maxNum(x, y) is the canonicalized number yif x< y, xif y< x, the canonicalized number if one operand is a number and the other a quiet NaN. Otherwise it is either xor y, canonicalized (this means results might differ among implementations). When either xor yis a signalingNaN, then the result is according to 6.2. Thanks again. Gui Haochen > >> On 12/10/2021 下午 5:57, Richard Biener wrote: >>> On Tue, Oct 12, 2021 at 10:59 AM HAO CHEN GUI via Gcc-patches >>> <gcc-patches@gcc.gnu.org> wrote: >>>> Hi, >>>> >>>> This patch disables gimple folding for float or double vec_min/max when fast-math is not set. It makes vec_min/max conform with the guide. >>>> >>>> Bootstrapped and tested on powerpc64le-linux with no regressions. Is this okay for trunk? Any recommendations? Thanks a lot. >>>> >>>> I re-send the patch as previous one is messed up in email thread. Sorry for that. >>> If the VSX/altivec min/max instructions conform to IEEE behavior then >>> you could instead fold >>> to .F{MIN,MAX} internal functions and define the f{min,max} optabs. >>> >>> Otherwise the patch looks correct to me - MIN_EXPR and MAX_EXPR are >>> not IEEE conforming. >>> Note a better check would be to use HONOR_NANS/HONOR_SIGNED_ZEROS on >>> the argument type >>> (that also works for the integer types with the obvious answer). >>> >>> Richard. >>> >>>> ChangeLog >>>> >>>> 2021-08-25 Haochen Gui <guihaoc@linux.ibm.com> >>>> >>>> gcc/ >>>> * config/rs6000/rs6000-call.c (rs6000_gimple_fold_builtin): >>>> Modify the VSX_BUILTIN_XVMINDP, ALTIVEC_BUILTIN_VMINFP, >>>> VSX_BUILTIN_XVMAXDP, ALTIVEC_BUILTIN_VMAXFP expansions. >>>> >>>> gcc/testsuite/ >>>> * gcc.target/powerpc/vec-minmax-1.c: New test. >>>> * gcc.target/powerpc/vec-minmax-2.c: Likewise. >>>> >>>> >>>> patch.diff >>>> >>>> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c >>>> index b4e13af4dc6..90527734ceb 100644 >>>> --- a/gcc/config/rs6000/rs6000-call.c >>>> +++ b/gcc/config/rs6000/rs6000-call.c >>>> @@ -12159,6 +12159,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >>>> return true; >>>> /* flavors of vec_min. */ >>>> case VSX_BUILTIN_XVMINDP: >>>> + case ALTIVEC_BUILTIN_VMINFP: >>>> + if (!flag_finite_math_only || flag_signed_zeros) >>>> + return false; >>>> + /* Fall through to MIN_EXPR. */ >>>> + gcc_fallthrough (); >>>> case P8V_BUILTIN_VMINSD: >>>> case P8V_BUILTIN_VMINUD: >>>> case ALTIVEC_BUILTIN_VMINSB: >>>> @@ -12167,7 +12172,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >>>> case ALTIVEC_BUILTIN_VMINUB: >>>> case ALTIVEC_BUILTIN_VMINUH: >>>> case ALTIVEC_BUILTIN_VMINUW: >>>> - case ALTIVEC_BUILTIN_VMINFP: >>>> arg0 = gimple_call_arg (stmt, 0); >>>> arg1 = gimple_call_arg (stmt, 1); >>>> lhs = gimple_call_lhs (stmt); >>>> @@ -12177,6 +12181,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >>>> return true; >>>> /* flavors of vec_max. */ >>>> case VSX_BUILTIN_XVMAXDP: >>>> + case ALTIVEC_BUILTIN_VMAXFP: >>>> + if (!flag_finite_math_only || flag_signed_zeros) >>>> + return false; >>>> + /* Fall through to MAX_EXPR. */ >>>> + gcc_fallthrough (); >>>> case P8V_BUILTIN_VMAXSD: >>>> case P8V_BUILTIN_VMAXUD: >>>> case ALTIVEC_BUILTIN_VMAXSB: >>>> @@ -12185,7 +12194,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) >>>> case ALTIVEC_BUILTIN_VMAXUB: >>>> case ALTIVEC_BUILTIN_VMAXUH: >>>> case ALTIVEC_BUILTIN_VMAXUW: >>>> - case ALTIVEC_BUILTIN_VMAXFP: >>>> arg0 = gimple_call_arg (stmt, 0); >>>> arg1 = gimple_call_arg (stmt, 1); >>>> lhs = gimple_call_lhs (stmt); >>>> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c >>>> new file mode 100644 >>>> index 00000000000..547798fd65c >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c >>>> @@ -0,0 +1,53 @@ >>>> +/* { dg-do compile { target { powerpc*-*-* } } } */ >>>> +/* { dg-require-effective-target powerpc_p9vector_ok } */ >>>> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */ >>>> +/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */ >>>> +/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */ >>>> +/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */ >>>> +/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */ >>>> + >>>> +/* This test verifies that float or double vec_min/max are bound to >>>> + xv[min|max][d|s]p instructions when fast-math is not set. */ >>>> + >>>> + >>>> +#include <altivec.h> >>>> + >>>> +#ifdef _BIG_ENDIAN >>>> + const int PREF_D = 0; >>>> +#else >>>> + const int PREF_D = 1; >>>> +#endif >>>> + >>>> +double vmaxd (double a, double b) >>>> +{ >>>> + vector double va = vec_promote (a, PREF_D); >>>> + vector double vb = vec_promote (b, PREF_D); >>>> + return vec_extract (vec_max (va, vb), PREF_D); >>>> +} >>>> + >>>> +double vmind (double a, double b) >>>> +{ >>>> + vector double va = vec_promote (a, PREF_D); >>>> + vector double vb = vec_promote (b, PREF_D); >>>> + return vec_extract (vec_min (va, vb), PREF_D); >>>> +} >>>> + >>>> +#ifdef _BIG_ENDIAN >>>> + const int PREF_F = 0; >>>> +#else >>>> + const int PREF_F = 3; >>>> +#endif >>>> + >>>> +float vmaxf (float a, float b) >>>> +{ >>>> + vector float va = vec_promote (a, PREF_F); >>>> + vector float vb = vec_promote (b, PREF_F); >>>> + return vec_extract (vec_max (va, vb), PREF_F); >>>> +} >>>> + >>>> +float vminf (float a, float b) >>>> +{ >>>> + vector float va = vec_promote (a, PREF_F); >>>> + vector float vb = vec_promote (b, PREF_F); >>>> + return vec_extract (vec_min (va, vb), PREF_F); >>>> +} >>>> diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c >>>> new file mode 100644 >>>> index 00000000000..4c6f4365830 >>>> --- /dev/null >>>> +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c >>>> @@ -0,0 +1,51 @@ >>>> +/* { dg-do compile { target { powerpc*-*-* } } } */ >>>> +/* { dg-require-effective-target powerpc_p9vector_ok } */ >>>> +/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */ >>>> +/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */ >>>> +/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */ >>>> + >>>> +/* This test verifies that float or double vec_min/max can be converted >>>> + to scalar comparison when fast-math is set. */ >>>> + >>>> + >>>> +#include <altivec.h> >>>> + >>>> +#ifdef _BIG_ENDIAN >>>> + const int PREF_D = 0; >>>> +#else >>>> + const int PREF_D = 1; >>>> +#endif >>>> + >>>> +double vmaxd (double a, double b) >>>> +{ >>>> + vector double va = vec_promote (a, PREF_D); >>>> + vector double vb = vec_promote (b, PREF_D); >>>> + return vec_extract (vec_max (va, vb), PREF_D); >>>> +} >>>> + >>>> +double vmind (double a, double b) >>>> +{ >>>> + vector double va = vec_promote (a, PREF_D); >>>> + vector double vb = vec_promote (b, PREF_D); >>>> + return vec_extract (vec_min (va, vb), PREF_D); >>>> +} >>>> + >>>> +#ifdef _BIG_ENDIAN >>>> + const int PREF_F = 0; >>>> +#else >>>> + const int PREF_F = 3; >>>> +#endif >>>> + >>>> +float vmaxf (float a, float b) >>>> +{ >>>> + vector float va = vec_promote (a, PREF_F); >>>> + vector float vb = vec_promote (b, PREF_F); >>>> + return vec_extract (vec_max (va, vb), PREF_F); >>>> +} >>>> + >>>> +float vminf (float a, float b) >>>> +{ >>>> + vector float va = vec_promote (a, PREF_F); >>>> + vector float vb = vec_promote (b, PREF_F); >>>> + return vec_extract (vec_min (va, vb), PREF_F); >>>> +} >>>>
On Wed, Oct 13, 2021 at 10:29:26AM +0200, Richard Biener wrote: > On Wed, Oct 13, 2021 at 9:43 AM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote: > > As to IEEE behavior, do you mean "Minimum and maximum operations" defined in IEEE-754 2019? If so, I think VSX/altivec min/max instructions don't conform with it. It demands a quite NaN if either operand is a NaN while our instructions don't. > > > > IEEE-754 2019 maximum(x, y) is xif x>y, yif y>x, and a quiet NaN if either operand is a NaN, according to 6.2. For this operation, +0 compares greater than −0. Otherwise (i.e., when x=y and signs are the same) it is either xor y. Actions for xvmaxdp > > Hmm, then I do not understand the reason for the patch - people using > the intrinsics cannot expect IEEE semantics then. > So you are concerned that people don't get the 1:1 machine instruction > but eventually the IEEE conforming MIN/MAX_EXPR? I do not know about Gimple MIN_EXPR (it is not documented?), but the RTL "smin" is meaningless in the presence of NaNs or signed zeros. This is documented (in rtl.texi): """ @findex smin @findex smax @cindex signed minimum @cindex signed maximum @item (smin:@var{m} @var{x} @var{y}) @itemx (smax:@var{m} @var{x} @var{y}) Represents the smaller (for @code{smin}) or larger (for @code{smax}) of @var{x} and @var{y}, interpreted as signed values in mode @var{m}. When used with floating point, if both operands are zeros, or if either operand is @code{NaN}, then it is unspecified which of the two operands is returned as the result. """ (not exactly meaningless, okay, but not usable for almost anything). > But that can then still happen with -ffast-math so I wonder what's the point. That :-) Segher
On Tue, Oct 12, 2021 at 04:57:43PM +0800, HAO CHEN GUI wrote: > b/gcc/config/rs6000/rs6000-call.c > index b4e13af4dc6..90527734ceb 100644 > --- a/gcc/config/rs6000/rs6000-call.c > +++ b/gcc/config/rs6000/rs6000-call.c > @@ -12159,6 +12159,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator > *gsi) > return true; > /* flavors of vec_min. */ > case VSX_BUILTIN_XVMINDP: > + case ALTIVEC_BUILTIN_VMINFP: > + if (!flag_finite_math_only || flag_signed_zeros) > + return false; > + /* Fall through to MIN_EXPR. */ > + gcc_fallthrough (); > case P8V_BUILTIN_VMINSD: > case P8V_BUILTIN_VMINUD: > case ALTIVEC_BUILTIN_VMINSB: "Fall though to code for MIN_EXPR"? It suggests it is a label, as written now. Or don't have this comment at all, maybe? > +/* { dg-do compile { target { powerpc*-*-* } } } */ Leave out the target clause? Testcases in gcc.target/powerpc/ are not run when this is not satisfied anyway, testing it twice is just more noise. Segher
On Wed, 13 Oct 2021, HAO CHEN GUI via Gcc-patches wrote: > As to IEEE behavior, do you mean "Minimum and maximum operations" defined in > IEEE-754 2019? If so, I think VSX/altivec min/max instructions don't conform > with it. It demands a quite NaN if either operand is a NaN while our > instructions don't. > > IEEE-754 2019 maximum(x, y) is xif x>y, yif y>x, and a quiet NaN if either > operand is a NaN, according to 6.2. For this operation, +0 compares greater > than −0. Otherwise (i.e., when x=y and signs are the same) it is either xor > y. Actions for xvmaxdp We don't have any built-in functions (or I think other internal operations) for the IEEE 754-2019 operations (C2X function names fmaximum, fminimum, fmaximum_num, fminimum_num, plus per-type suffixes) either, though as I noted when adding those functions to glibc, having such built-in functions would make sense (specifically, so that RISC-V can expand calls to fmaximum_num and fminimum_num inline when building for F or D extension version 2.2 and later). The built-in functions we have for fmax and fmin correspond to the IEEE 754-2008 operations (as implemented by the AArch64 fmaxnm / fminnm instructions, for example).
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index b4e13af4dc6..90527734ceb 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -12159,6 +12159,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) return true; /* flavors of vec_min. */ case VSX_BUILTIN_XVMINDP: + case ALTIVEC_BUILTIN_VMINFP: + if (!flag_finite_math_only || flag_signed_zeros) + return false; + /* Fall through to MIN_EXPR. */ + gcc_fallthrough (); case P8V_BUILTIN_VMINSD: case P8V_BUILTIN_VMINUD: case ALTIVEC_BUILTIN_VMINSB: @@ -12167,7 +12172,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VMINUB: case ALTIVEC_BUILTIN_VMINUH: case ALTIVEC_BUILTIN_VMINUW: - case ALTIVEC_BUILTIN_VMINFP: arg0 = gimple_call_arg (stmt, 0); arg1 = gimple_call_arg (stmt, 1); lhs = gimple_call_lhs (stmt); @@ -12177,6 +12181,11 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) return true; /* flavors of vec_max. */ case VSX_BUILTIN_XVMAXDP: + case ALTIVEC_BUILTIN_VMAXFP: + if (!flag_finite_math_only || flag_signed_zeros) + return false; + /* Fall through to MAX_EXPR. */ + gcc_fallthrough (); case P8V_BUILTIN_VMAXSD: case P8V_BUILTIN_VMAXUD: case ALTIVEC_BUILTIN_VMAXSB: @@ -12185,7 +12194,6 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi) case ALTIVEC_BUILTIN_VMAXUB: case ALTIVEC_BUILTIN_VMAXUH: case ALTIVEC_BUILTIN_VMAXUW: - case ALTIVEC_BUILTIN_VMAXFP: arg0 = gimple_call_arg (stmt, 0); arg1 = gimple_call_arg (stmt, 1); lhs = gimple_call_lhs (stmt); diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c new file mode 100644 index 00000000000..547798fd65c --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-1.c @@ -0,0 +1,53 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */ +/* { dg-final { scan-assembler-times {\mxvmaxdp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvmaxsp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvmindp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mxvminsp\M} 1 } } */ + +/* This test verifies that float or double vec_min/max are bound to + xv[min|max][d|s]p instructions when fast-math is not set. */ + + +#include <altivec.h> + +#ifdef _BIG_ENDIAN + const int PREF_D = 0; +#else + const int PREF_D = 1; +#endif + +double vmaxd (double a, double b) +{ + vector double va = vec_promote (a, PREF_D); + vector double vb = vec_promote (b, PREF_D); + return vec_extract (vec_max (va, vb), PREF_D); +} + +double vmind (double a, double b) +{ + vector double va = vec_promote (a, PREF_D); + vector double vb = vec_promote (b, PREF_D); + return vec_extract (vec_min (va, vb), PREF_D); +} + +#ifdef _BIG_ENDIAN + const int PREF_F = 0; +#else + const int PREF_F = 3; +#endif + +float vmaxf (float a, float b) +{ + vector float va = vec_promote (a, PREF_F); + vector float vb = vec_promote (b, PREF_F); + return vec_extract (vec_max (va, vb), PREF_F); +} + +float vminf (float a, float b) +{ + vector float va = vec_promote (a, PREF_F); + vector float vb = vec_promote (b, PREF_F); + return vec_extract (vec_min (va, vb), PREF_F); +} diff --git a/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c new file mode 100644 index 00000000000..4c6f4365830 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-minmax-2.c @@ -0,0 +1,51 @@ +/* { dg-do compile { target { powerpc*-*-* } } } */ +/* { dg-require-effective-target powerpc_p9vector_ok } */ +/* { dg-options "-O2 -mdejagnu-cpu=power9 -ffast-math" } */ +/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxsmincdp\M} 2 } } */ + +/* This test verifies that float or double vec_min/max can be converted + to scalar comparison when fast-math is set. */ + + +#include <altivec.h> + +#ifdef _BIG_ENDIAN + const int PREF_D = 0; +#else + const int PREF_D = 1; +#endif + +double vmaxd (double a, double b) +{ + vector double va = vec_promote (a, PREF_D); + vector double vb = vec_promote (b, PREF_D); + return vec_extract (vec_max (va, vb), PREF_D); +} + +double vmind (double a, double b) +{ + vector double va = vec_promote (a, PREF_D); + vector double vb = vec_promote (b, PREF_D); + return vec_extract (vec_min (va, vb), PREF_D); +} + +#ifdef _BIG_ENDIAN + const int PREF_F = 0; +#else + const int PREF_F = 3; +#endif + +float vmaxf (float a, float b) +{ + vector float va = vec_promote (a, PREF_F); + vector float vb = vec_promote (b, PREF_F); + return vec_extract (vec_max (va, vb), PREF_F); +} + +float vminf (float a, float b) +{ + vector float va = vec_promote (a, PREF_F); + vector float vb = vec_promote (b, PREF_F); + return vec_extract (vec_min (va, vb), PREF_F); +}