From patchwork Wed Nov 23 12:28:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Liu, Hongtao" <hongtao.liu@intel.com>
X-Patchwork-Id: 61027
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id D8F483853D5F
	for <patchwork@sourceware.org>; Wed, 23 Nov 2022 12:30:53 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D8F483853D5F
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1669206653;
	bh=DEffwPcTvHD23pKOZKSIdXT9VS7ZII/wYtmOdwvO6lA=;
	h=To:Cc:Subject:Date:List-Id:List-Unsubscribe:List-Archive:
	 List-Post:List-Help:List-Subscribe:From:Reply-To:From;
	b=ikvK5VQLwPEJz8psAOTGxarAqO28O6NGdoKNF66ayg1zJUzrbhonBzvUEhdPn2eF6
	 P4hnthVtVBoDiwnUcyVlEqQcGLLuk1N2JDpjnw4VlkvKXohYtjN2GiP378iOC1QEO6
	 buG6lKeEpJmyqIIkwAnoxFiCpM/xM9AL+/VCcG7o=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by sourceware.org (Postfix) with ESMTPS id 838173851886
 for <gcc-patches@gcc.gnu.org>; Wed, 23 Nov 2022 12:30:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 838173851886
X-IronPort-AV: E=McAfee;i="6500,9779,10539"; a="340934533"
X-IronPort-AV: E=Sophos;i="5.96,187,1665471600"; d="scan'208";a="340934533"
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 23 Nov 2022 04:30:23 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6500,9779,10539"; a="641782453"
X-IronPort-AV: E=Sophos;i="5.96,187,1665471600"; d="scan'208";a="641782453"
Received: from shvmail03.sh.intel.com ([10.239.245.20])
 by orsmga002.jf.intel.com with ESMTP; 23 Nov 2022 04:30:20 -0800
Received: from shliclel4051.sh.intel.com (shliclel4051.sh.intel.com
 [10.239.240.51])
 by shvmail03.sh.intel.com (Postfix) with ESMTP id 3F638100568F;
 Wed, 23 Nov 2022 20:30:20 +0800 (CST)
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com,
	hjl.tools@gmail.com,
	ubizjak@gmail.com
Subject: [PATCH] [x86] Fix incorrect implementation for _mm_cvtsbh_ss.
Date: Wed, 23 Nov 2022 20:28:20 +0800
Message-Id: <20221123122820.3150670-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.27.0
MIME-Version: 1.0
X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 KAM_SHORT,
 RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: liuhongt via Gcc-patches <gcc-patches@gcc.gnu.org>
From: "Liu, Hongtao" <hongtao.liu@intel.com>
Reply-To: liuhongt <hongtao.liu@intel.com>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

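After __bf16 became a real floating-point type, the implementation of
_mm_cvtsbh_ss went wrong: the cast (unsigned int)(__A) now converts the
value instead of reinterpreting the bits.  This patch implements
extendbfsf2/truncsfbf2 with pslld/psrld and rewrites the intrinsic to
use an implicit conversion.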
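
For illustration only, the bit-level effect of the two shifts can be
sketched in portable C.  This is not code from the patch: the helper
names are hypothetical, uint16_t stands in for the raw __bf16 bits, and
the last function merely models what the old intrinsic body computes
once its argument is treated as an arithmetic value.

```c
#include <stdint.h>
#include <string.h>

/* Widen: put the 16 bfloat16 bits into the upper half of a 32-bit
   word, which is what a left shift by 16 (pslld $16) does per lane.  */
static float
bf16_bits_to_float (uint16_t bits)
{
  uint32_t wide = (uint32_t) bits << 16;
  float f;
  memcpy (&f, &wide, sizeof f);
  return f;
}

/* Narrow: keep only the upper 16 bits of the float, as a right shift
   by 16 (psrld $16) does.  The low mantissa bits are truncated, not
   rounded.  */
static uint16_t
float_to_bf16_bits (float f)
{
  uint32_t wide;
  memcpy (&wide, &f, sizeof wide);
  return (uint16_t) (wide >> 16);
}

/* Hypothetical model of the old intrinsic body: the cast converts the
   numeric value to an integer before shifting.  The float parameter
   stands in for the __bf16 value; only small non-negative inputs are
   meaningful here.  */
static float
old_cvtsbh_ss_model (float value)
{
  uint32_t wide = (uint32_t) value << 16;
  float f;
  memcpy (&f, &wide, sizeof f);
  return f;
}
```

With these helpers, the bfloat16 encoding of 1.0 (bits 0x3F80) widens
to exactly 1.0f, whereas the old-style model applied to the value 1.0
builds the word 0x00010000, which is a denormal rather than 1.0f.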

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR target/107748
	* config/i386/avx512bf16intrin.h (_mm_cvtsbh_ss): Refine to use
	implicit conversion.
	* config/i386/i386.md (extendbfsf2): New define_insn.
	(truncsfbf2): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/extendbfsf.c: New test.
	* gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: Adjust testcase.
---
 gcc/config/i386/avx512bf16intrin.h            |  4 +--
 gcc/config/i386/i386.md                       | 33 ++++++++++++++++++-
 .../gcc.target/i386/avx512bf16-cvtsbh2ss-1.c  |  3 +-
 gcc/testsuite/gcc.target/i386/extendbfsf.c    | 16 +++++++++
 4 files changed, 50 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/extendbfsf.c

diff --git a/gcc/config/i386/avx512bf16intrin.h b/gcc/config/i386/avx512bf16intrin.h
index ea1d0125b3f..4a071bcd75a 100644
--- a/gcc/config/i386/avx512bf16intrin.h
+++ b/gcc/config/i386/avx512bf16intrin.h
@@ -46,9 +46,7 @@ extern __inline float
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cvtsbh_ss (__bf16 __A)
 {
-  union{ float a; unsigned int b;} __tmp;
-  __tmp.b = ((unsigned int)(__A)) << 16;
-  return __tmp.a;
+  return __A;
 }
 
 /* vcvtne2ps2bf16 */
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 01faa911b77..f5215596d44 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4961,6 +4961,21 @@ (define_insn "*extendhf<mode>2"
    (set_attr "prefix" "evex")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "extendbfsf2"
+  [(set (match_operand:SF 0 "register_operand"   "=x,Yw")
+	(float_extend:SF
+	  (match_operand:BF 1 "register_operand" " 0,Yw")))]
+ "TARGET_SSE2"
+ "@
+  pslld\t{$16, %0|%0, 16}
+  vpslld\t{$16, %1, %0|%0, %1, 16}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix" "orig,vex")
+   (set_attr "mode" "TI")
+   (set_attr "memory" "none")])
 
 (define_expand "extend<mode>xf2"
   [(set (match_operand:XF 0 "nonimmediate_operand")
@@ -5177,7 +5192,23 @@ (define_insn "*trunc<mode>hf2"
   [(set_attr "type" "ssecvt")
    (set_attr "prefix" "evex")
    (set_attr "mode" "HF")])
-
+
+(define_insn "truncsfbf2"
+  [(set (match_operand:BF 0 "register_operand"   "=x,Yw")
+	(float_truncate:BF
+	  (match_operand:SF 1 "register_operand" " 0,Yw")))]
+ "TARGET_SSE2"
+ "@
+  psrld\t{$16, %0|%0, 16}
+  vpsrld\t{$16, %1, %0|%0, %1, 16}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sseishft")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix_data16" "1,*")
+   (set_attr "prefix" "orig,vex")
+   (set_attr "mode" "TI")
+   (set_attr "memory" "none")])
+
 ;; Signed conversion to DImode.
 
 (define_expand "fix_truncxfdi2"
diff --git a/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c b/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
index 8e929e6f159..edf30b583b9 100644
--- a/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
+++ b/gcc/testsuite/gcc.target/i386/avx512bf16-cvtsbh2ss-1.c
@@ -1,8 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-mavx512bf16 -O2" } */
 /* { dg-additional-options "-fno-PIE -mfpmath=sse" { target ia32 } } */
-/* { dg-final { scan-assembler-times "sall\[ \\t\]+\[^\{\n\]*16" 1 } } */
-/* { dg-final { scan-assembler-times "movl" 1 } } */
+/* { dg-final { scan-assembler-times "pslld" 1 } } */
 
 #include <immintrin.h>
 
diff --git a/gcc/testsuite/gcc.target/i386/extendbfsf.c b/gcc/testsuite/gcc.target/i386/extendbfsf.c
new file mode 100644
index 00000000000..f1b4c218742
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/extendbfsf.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-msse2 -O2" } */
+/* { dg-final { scan-assembler-times "pslld" 1 } } */
+/* { dg-final { scan-assembler-times "psrld" 1 } } */
+
+float
+extendsfbf (__bf16 a)
+{
+  return a;
+}
+
+__bf16
+truncsfbf (float a)
+{
+  return a;
+}