Message ID | 20220422081019.31897-1-hongyu.wang@intel.com |
---|---|
State | New |
Headers |
To: hongtao.liu@intel.com Cc: gcc-patches@gcc.gnu.org From: Hongyu Wang via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Hongyu Wang <hongyu.wang@intel.com> Subject: [PATCH] AVX512F: Add missing macro for mask(z?)_scalf_s[sd] [PR 105339] Date: Fri, 22 Apr 2022 16:10:19 +0800 Message-Id: <20220422081019.31897-1-hongyu.wang@intel.com> X-Mailer: git-send-email 2.18.1 |
Series |
AVX512F: Add missing macro for mask(z?)_scalf_s[sd] [PR 105339]
|
|
Commit Message
Hongyu Wang
April 22, 2022, 8:10 a.m. UTC
Hi,

Add the missing macros for -O0 and adjust the macro format for the scalef
intrinsics.

Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.

Ok for master and backport to GCC 9/10/11?

gcc/ChangeLog:

        PR target/105339
        * config/i386/avx512fintrin.h (_mm512_scalef_round_pd):
        Add parentheses for parameters and adjust format.
        (_mm512_mask_scalef_round_pd): Ditto.
        (_mm512_maskz_scalef_round_pd): Ditto.
        (_mm512_scalef_round_ps): Ditto.
        (_mm512_mask_scalef_round_ps): Ditto.
        (_mm512_maskz_scalef_round_ps): Ditto.
        (_mm_scalef_round_sd): Use _mm_undefined_pd.
        (_mm_scalef_round_ss): Use _mm_undefined_ps.
        (_mm_mask_scalef_round_sd): New macro.
        (_mm_mask_scalef_round_ss): Ditto.
        (_mm_maskz_scalef_round_sd): Ditto.
        (_mm_maskz_scalef_round_ss): Ditto.
---
 gcc/config/i386/avx512fintrin.h | 76 ++++++++++++++++++++++++---------
 1 file changed, 56 insertions(+), 20 deletions(-)
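The ChangeLog entry "Add parentheses for parameters" can be illustrated with a small portable sketch. The two macros below are hypothetical stand-ins and do not appear in avx512fintrin.h; they only show how a cast in a macro body binds when the argument and the result are left unparenthesized.

```c
#include <assert.h>

/* Hypothetical stand-ins; the names are not taken from avx512fintrin.h. */
#define TO_INT_BAD(X)   (int) X        /* argument and result unparenthesized */
#define TO_INT_GOOD(X)  ((int) (X))    /* both parenthesized, as in the patch */

/* Returns 1 when the two expansions disagree for the argument 1.5 + 1.5. */
static int expansions_differ(void)
{
    /* Expands to (int) 1.5 + 1.5, so only the first operand is cast. */
    double bad = TO_INT_BAD(1.5 + 1.5);
    /* Expands to ((int) (1.5 + 1.5)), so the whole sum is cast. */
    double good = TO_INT_GOOD(1.5 + 1.5);

    assert(bad == 2.5);
    assert(good == 3.0);
    return bad != good;
}
```

Calling expansions_differ() returns 1. The same reasoning motivates wrapping each macro argument and the whole cast expression in the rewritten scalef macros.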
Comments
On Fri, Apr 22, 2022 at 4:12 PM Hongyu Wang via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> Add missing macro under O0 and adjust macro format for scalef
> intrinsics.
>
Please add the corresponding intrinsic test in sse-14.c.

> Bootstrapped/regtested on x86_64-pc-linux-gnu{-m32,}.
>
> Ok for master and backport to GCC 9/10/11?
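As background for why macro forms exist at all, here is a minimal sketch of the dual-definition pattern suggested by the `#else` branch in the patch context: an inline function when optimizing, a macro otherwise. It assumes the guard is GCC's predefined `__OPTIMIZE__` macro, which the patch excerpt does not show. `scale_impl` and `scale_round` are hypothetical names, and `scale_impl` merely stands in for a builtin whose rounding argument presumably has to be a compile-time constant.

```c
#include <assert.h>

/* Hypothetical worker standing in for a builtin that takes a rounding
   selector; it is not a GCC builtin. */
static int scale_impl(int value, int rounding)
{
    return rounding ? value * 2 : value;
}

#ifdef __OPTIMIZE__
/* When optimizing, the wrapper is inlined and the constant argument
   reaches the worker unchanged. */
static inline int scale_round(int value, const int rounding)
{
    return scale_impl(value, rounding);
}
#else
/* Without optimization nothing is inlined, so a macro pastes the
   constant directly into the call. */
#define scale_round(V, R) (scale_impl((V), (R)))
#endif
```

Either branch gives the same result for `scale_round(3, 1)`. A variant that has only the inline-function form is unavailable when the header is compiled without optimization, which is the gap the patch closes for the masked scalar scalef intrinsics.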
> Please add the corresponding intrinsic test in sse-14.c

Sorry for forgetting this part. Updated patch. Thanks.

On Fri, Apr 22, 2022 at 16:49, Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, Apr 22, 2022 at 4:12 PM Hongyu Wang via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Add missing macro under O0 and adjust macro format for scalef
> > intrinsics.
> >
> Please add the corresponding intrinsic test in sse-14.c.
On Fri, Apr 22, 2022 at 8:43 PM Hongyu Wang <wwwhhhyyy333@gmail.com> wrote:
>
> > Please add the corresponding intrinsic test in sse-14.c
>
> Sorry for forgetting this part. Updated patch. Thanks.
>
LGTM.
diff --git a/gcc/config/i386/avx512fintrin.h b/gcc/config/i386/avx512fintrin.h
index 29511fd2831..6dc69ff0234 100644
--- a/gcc/config/i386/avx512fintrin.h
+++ b/gcc/config/i386/avx512fintrin.h
@@ -3286,31 +3286,67 @@ _mm_maskz_scalef_round_ss (__mmask8 __U, __m128 __A, __m128 __B, const int __R)
                                                    (__mmask8) __U, __R);
 }
 #else
-#define _mm512_scalef_round_pd(A, B, C)            \
-    (__m512d)__builtin_ia32_scalefpd512_mask(A, B, (__v8df)_mm512_undefined_pd(), -1, C)
-
-#define _mm512_mask_scalef_round_pd(W, U, A, B, C) \
-    (__m512d)__builtin_ia32_scalefpd512_mask(A, B, W, U, C)
-
-#define _mm512_maskz_scalef_round_pd(U, A, B, C)   \
-    (__m512d)__builtin_ia32_scalefpd512_mask(A, B, (__v8df)_mm512_setzero_pd(), U, C)
+#define _mm512_scalef_round_pd(A, B, C)                                 \
+  ((__m512d)                                                            \
+   __builtin_ia32_scalefpd512_mask((A), (B),                            \
+                                   (__v8df) _mm512_undefined_pd(),      \
+                                   -1, (C)))
+
+#define _mm512_mask_scalef_round_pd(W, U, A, B, C)                      \
+  ((__m512d) __builtin_ia32_scalefpd512_mask((A), (B), (W), (U), (C)))
+
+#define _mm512_maskz_scalef_round_pd(U, A, B, C)                        \
+  ((__m512d)                                                            \
+   __builtin_ia32_scalefpd512_mask((A), (B),                            \
+                                   (__v8df) _mm512_setzero_pd(),        \
+                                   (U), (C)))
+
+#define _mm512_scalef_round_ps(A, B, C)                                 \
+  ((__m512)                                                             \
+   __builtin_ia32_scalefps512_mask((A), (B),                            \
+                                   (__v16sf) _mm512_undefined_ps(),     \
+                                   -1, (C)))
+
+#define _mm512_mask_scalef_round_ps(W, U, A, B, C)                      \
+  ((__m512) __builtin_ia32_scalefps512_mask((A), (B), (W), (U), (C)))
+
+#define _mm512_maskz_scalef_round_ps(U, A, B, C)                        \
+  ((__m512)                                                             \
+   __builtin_ia32_scalefps512_mask((A), (B),                            \
+                                   (__v16sf) _mm512_setzero_ps(),       \
+                                   (U), (C)))
+
+#define _mm_scalef_round_sd(A, B, C)                                    \
+  ((__m128d)                                                            \
+   __builtin_ia32_scalefsd_mask_round ((A), (B),                        \
+                                       (__v2df) _mm_undefined_pd (),    \
+                                       -1, (C)))
 
-#define _mm512_scalef_round_ps(A, B, C)            \
-    (__m512)__builtin_ia32_scalefps512_mask(A, B, (__v16sf)_mm512_undefined_ps(), -1, C)
+#define _mm_scalef_round_ss(A, B, C)                                    \
+  ((__m128)                                                             \
+   __builtin_ia32_scalefss_mask_round ((A), (B),                        \
+                                       (__v4sf) _mm_undefined_ps (),    \
+                                       -1, (C)))
 
-#define _mm512_mask_scalef_round_ps(W, U, A, B, C) \
-    (__m512)__builtin_ia32_scalefps512_mask(A, B, W, U, C)
+#define _mm_mask_scalef_round_sd(W, U, A, B, C)                         \
+  ((__m128d)                                                            \
+   __builtin_ia32_scalefsd_mask_round ((A), (B), (W), (U), (C)))
 
-#define _mm512_maskz_scalef_round_ps(U, A, B, C)   \
-    (__m512)__builtin_ia32_scalefps512_mask(A, B, (__v16sf)_mm512_setzero_ps(), U, C)
+#define _mm_mask_scalef_round_ss(W, U, A, B, C)                         \
+  ((__m128)                                                             \
+   __builtin_ia32_scalefss_mask_round ((A), (B), (W), (U), (C)))
 
-#define _mm_scalef_round_sd(A, B, C)                    \
-    (__m128d)__builtin_ia32_scalefsd_mask_round (A, B, \
-        (__v2df)_mm_setzero_pd (), -1, C)
+#define _mm_maskz_scalef_round_sd(U, A, B, C)                           \
+  ((__m128d)                                                            \
+   __builtin_ia32_scalefsd_mask_round ((A), (B),                        \
+                                       (__v2df) _mm_setzero_pd (),      \
+                                       (U), (C)))
 
-#define _mm_scalef_round_ss(A, B, C)                    \
-    (__m128)__builtin_ia32_scalefss_mask_round (A, B, \
-        (__v4sf)_mm_setzero_ps (), -1, C)
+#define _mm_maskz_scalef_round_ss(U, A, B, C)                           \
+  ((__m128)                                                             \
+   __builtin_ia32_scalefss_mask_round ((A), (B),                        \
+                                       (__v4sf) _mm_setzero_ps (),      \
+                                       (U), (C)))
 #endif
 
 #define _mm_mask_scalef_sd(W, U, A, B) \
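For readers unfamiliar with the masked scalar forms the patch adds, the following is a rough scalar model in plain C of what the `_sd` merge-masked and zero-masked variants are expected to compute. It is a sketch under simplifying assumptions: it ignores the rounding-mode argument, NaN, infinity and denormal handling, and it avoids libm so it builds anywhere. `vec2d` and all function names are hypothetical and are not part of avx512fintrin.h.

```c
#include <assert.h>

/* Hypothetical two-lane vector standing in for __m128d. */
typedef struct { double lane[2]; } vec2d;

/* floor() for small finite values, written without libm. */
static long floor_to_long(double x)
{
    long n = (long) x;                      /* truncates toward zero */
    return ((double) n > x) ? n - 1 : n;    /* step down for negatives */
}

/* a * 2^floor(b) for finite inputs; special values are not modeled. */
static double scalef_model(double a, double b)
{
    long e = floor_to_long(b);
    for (; e > 0; --e) a *= 2.0;
    for (; e < 0; ++e) a /= 2.0;
    return a;
}

/* Merge masking: lane 0 is taken from w when bit 0 of u is clear. */
static vec2d mask_scalef_sd_model(vec2d w, unsigned u, vec2d a, vec2d b)
{
    vec2d r;
    r.lane[0] = (u & 1u) ? scalef_model(a.lane[0], b.lane[0]) : w.lane[0];
    r.lane[1] = a.lane[1];                  /* upper lane is copied from a */
    return r;
}

/* Zero masking: lane 0 is 0.0 when bit 0 of u is clear. */
static vec2d maskz_scalef_sd_model(unsigned u, vec2d a, vec2d b)
{
    vec2d zero = { { 0.0, 0.0 } };
    return mask_scalef_sd_model(zero, u, a, b);
}
```

In this model the only difference between the two variants is the fallback value for lane 0, which mirrors the macros: the `mask` forms pass `W` to the builtin, while the `maskz` forms pass a zero vector.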