From patchwork Fri Oct 15 08:38:07 2021
X-Patchwork-Submitter: "Li, Pan2 via Gcc-patches"
X-Patchwork-Id: 46264
From: "Li, Pan2 via Gcc-patches"
Reply-To: dianhong.xu@intel.com
To: hongtao.liu@intel.com
Cc: gcc-patches@gcc.gnu.org, dianhong.xu@intel.com, dianhong7@gmail.com
Subject: [PATCH] AVX512FP16: Add *_set1_pch intrinsics.
Date: Fri, 15 Oct 2021 16:38:07 +0800
Message-Id: <20211015083807.21741-1-dianhong.xu@intel.com>

From: dianhong xu

Add *_set1_pch (_Float16 _Complex A) intrinsics.

gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h:
	(_mm512_set1_pch): New intrinsic.
	* config/i386/avx512fp16vlintrin.h:
	(_mm256_set1_pch): New intrinsic.
	(_mm_set1_pch): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx512fp16-set1-pch-1a.c: New test.
	* gcc.target/i386/avx512fp16-set1-pch-1b.c: New test.
	* gcc.target/i386/avx512fp16vl-set1-pch-1a.c: New test.
	* gcc.target/i386/avx512fp16vl-set1-pch-1b.c: New test.
---
 gcc/config/i386/avx512fp16intrin.h            | 13 +++++
 gcc/config/i386/avx512fp16vlintrin.h          | 26 +++++++++
 .../gcc.target/i386/avx512fp16-set1-pch-1a.c  | 13 +++++
 .../gcc.target/i386/avx512fp16-set1-pch-1b.c  | 42 ++++++++++++++
 .../i386/avx512fp16vl-set1-pch-1a.c           | 20 +++++++
 .../i386/avx512fp16vl-set1-pch-1b.c           | 57 +++++++++++++++++++
 6 files changed, 171 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1b.c

diff --git a/gcc/config/i386/avx512fp16intrin.h b/gcc/config/i386/avx512fp16intrin.h
index 079ce321c01..17025d68b8e 100644
--- a/gcc/config/i386/avx512fp16intrin.h
+++ b/gcc/config/i386/avx512fp16intrin.h
@@ -7237,6 +7237,19 @@ _mm512_permutexvar_ph (__m512i __A, __m512h __B)
 						       (__mmask32)-1);
 }
 
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_set1_pch (_Float16 _Complex __A)
+{
+  union
+  {
+    _Float16 _Complex a;
+    float b;
+  } u = { .a = __A};
+
+  return (__m512h) _mm512_set1_ps (u.b);
+}
+
 #ifdef __DISABLE_AVX512FP16__
 #undef __DISABLE_AVX512FP16__
 #pragma GCC pop_options
diff --git a/gcc/config/i386/avx512fp16vlintrin.h b/gcc/config/i386/avx512fp16vlintrin.h
index f83a429ba43..1de4513d7f1 100644
--- a/gcc/config/i386/avx512fp16vlintrin.h
+++ b/gcc/config/i386/avx512fp16vlintrin.h
@@ -3315,6 +3315,32 @@ _mm_permutexvar_ph (__m128i __A, __m128h __B)
 						     (__mmask8)-1);
 }
 
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_set1_pch (_Float16 _Complex __A)
+{
+  union
+  {
+    _Float16 _Complex a;
+    float b;
+  } u = { .a = __A };
+
+  return (__m256h) _mm256_set1_ps (u.b);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_set1_pch (_Float16 _Complex __A)
+{
+  union
+  {
+    _Float16 _Complex a;
+    float b;
+  } u = { .a = __A };
+
+  return (__m128h) _mm_set1_ps (u.b);
+}
+
 #ifdef __DISABLE_AVX512FP16VL__
 #undef __DISABLE_AVX512FP16VL__
 #pragma GCC pop_options
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c
new file mode 100644
index 00000000000..0055193f243
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1a.c
@@ -0,0 +1,13 @@
+/* { dg-do compile} */
+/* { dg-options "-O2 -mavx512fp16" } */
+
+#include
+
+__m512h
+__attribute__ ((noinline, noclone))
+test_mm512_set1_pch (_Float16 _Complex A)
+{
+  return _mm512_set1_pch(A);
+}
+
+/* { dg-final { scan-assembler "vbroadcastss\[ \\t\]+\[^\n\r\]*%zmm\[01\]" } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c
new file mode 100644
index 00000000000..450d7e37237
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16-set1-pch-1b.c
@@ -0,0 +1,42 @@
+/* { dg-do run { target avx512fp16 } } */
+/* { dg-options "-O2 -mavx512fp16" } */
+
+#include
+#include
+#include
+
+static void do_test (void);
+
+#define DO_TEST do_test
+#define AVX512FP16
+
+#include
+#include "avx512-check.h"
+
+static void
+do_test (void)
+{
+  _Float16 _Complex fc = 1.0 + 1.0*I;
+  union
+  {
+    _Float16 _Complex a;
+    float b;
+  } u = { .a = fc };
+  float ff= u.b;
+
+  typedef union
+  {
+    float fp[16];
+    __m512h m512h;
+  } u1;
+
+  __m512h test512 = _mm512_set1_pch(fc);
+
+  u1 test;
+  test.m512h = test512;
+  for (int i = 0; i<16; i++)
+  {
+    if (test.fp[i] != ff) abort();
+  }
+
+}
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c b/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c
new file mode 100644
index 00000000000..4c5624f9935
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1a.c
@@ -0,0 +1,20 @@
+/* { dg-do compile} */
+/* { dg-options "-O2 -mavx512fp16 -mavx512vl" } */
+
+#include
+
+__m256h
+__attribute__ ((noinline, noclone))
+test_mm256_set1_pch (_Float16 _Complex A)
+{
+  return _mm256_set1_pch(A);
+}
+
+__m128h
+__attribute__ ((noinline, noclone))
+test_mm_set1_pch (_Float16 _Complex A)
+{
+  return _mm_set1_pch(A);
+}
+
+/* { dg-final { scan-assembler-times "vbroadcastss" 2 } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1b.c b/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1b.c
new file mode 100644
index 00000000000..aebff141821
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx512fp16vl-set1-pch-1b.c
@@ -0,0 +1,57 @@
+/* { dg-do run { target avx512fp16 } } */
+/* { dg-options "-O2 -mavx512fp16 -mavx512vl" } */
+
+#include
+#include
+#include
+
+static void do_test (void);
+
+#define DO_TEST do_test
+#define AVX512FP16
+
+#include
+#include "avx512-check.h"
+
+static void
+do_test (void)
+{
+  _Float16 _Complex fc = 1.0 + 1.0*I;
+  union
+  {
+    _Float16 _Complex a;
+    float b;
+  } u = { .a = fc };
+  float ff= u.b;
+
+  typedef union
+  {
+    float fp[8];
+    __m256h m256h;
+  } u1;
+
+  __m256h test256 = _mm256_set1_pch(fc);
+
+  u1 test1;
+  test1.m256h = test256;
+  for (int i = 0; i<8; i++)
+  {
+    if (test1.fp[i] != ff) abort();
+  }
+
+  typedef union
+  {
+    float fp[4];
+    __m128h m128h;
+  } u2;
+
+  __m128h test128 = _mm_set1_pch(fc);
+
+  u2 test2;
+  test2.m128h = test128;
+  for (int i = 0; i<4; i++)
+  {
+    if (test2.fp[i] != ff) abort();
+  }
+
+}
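
Note for readers: the three intrinsics in this patch share one idea. A _Float16 _Complex value occupies 32 bits, so it is reinterpreted as a single float lane and broadcast with the existing *_set1_ps intrinsics. The scalar sketch below models that idea in portable C so it can be built without AVX512FP16 support. It is not code from the patch: struct half_pair, pack_pair and broadcast_pair are hypothetical names, two uint16_t fields stand in for the real and imaginary halves, and memcpy stands in for the union.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _Float16 _Complex: two 16-bit halves
   (real part first, then imaginary part), 32 bits in total.  */
struct half_pair
{
  uint16_t re;
  uint16_t im;
};

_Static_assert (sizeof (struct half_pair) == sizeof (uint32_t),
		"a complex half-precision value must fill one 32-bit lane");

/* Reinterpret the pair as one 32-bit lane without converting any bits.
   memcpy plays the role of the union used in the patch.  */
static uint32_t
pack_pair (struct half_pair p)
{
  uint32_t lane;
  memcpy (&lane, &p, sizeof lane);
  return lane;
}

/* Scalar model of the broadcast: store the same lane in all N slots.
   N would be 16, 8 or 4 for 512-, 256- and 128-bit vectors.  */
static void
broadcast_pair (uint32_t *dst, size_t n, struct half_pair p)
{
  assert (dst != NULL);
  uint32_t lane = pack_pair (p);
  for (size_t i = 0; i < n; i++)
    dst[i] = lane;
}
```

To mirror the run tests above, call broadcast_pair on a 16-element buffer with both fields set to 0x3c00 (the binary16 encoding of 1.0, matching 1.0 + 1.0*I) and compare every element against pack_pair of the same input.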