From patchwork Thu Mar 30 01:44:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt <hongtao.liu@intel.com>
X-Patchwork-Id: 67109
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 213453858C1F
	for <patchwork@sourceware.org>; Thu, 30 Mar 2023 01:48:06 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 213453858C1F
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1680140886;
	bh=W5H3CuLlR87aT8nX6uqFnQu7lmkM2XathmhUYquMSAU=;
	h=To:Cc:Subject:Date:List-Id:List-Unsubscribe:List-Archive:
	 List-Post:List-Help:List-Subscribe:From:Reply-To:From;
	b=L1jS60Sie7jBVv/a3jCxpsn2cCoAk7UN/vAb33cRu8eTjLXd/UxWTTcYaE3UfVpBK
	 Q6wurH4YQ1YuHpT9VnuC2omwDv86XKnDcRncXHBljDM+OZf8r5/Wtmc0XRW8pM6Usi
	 E0qHZGVT9/hvIQ1/4r7wMpSFKmJwaUw/Twp3iWQg=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mga17.intel.com (mga17.intel.com [192.55.52.151])
 by sourceware.org (Postfix) with ESMTPS id 5F20B3858CDA
 for <gcc-patches@gcc.gnu.org>; Thu, 30 Mar 2023 01:47:21 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5F20B3858CDA
X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="321430164"
X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="321430164"
Received: from orsmga005.jf.intel.com ([10.7.209.41])
 by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 29 Mar 2023 18:46:59 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6600,9927,10664"; a="858722763"
X-IronPort-AV: E=Sophos;i="5.98,301,1673942400"; d="scan'208";a="858722763"
Received: from shvmail03.sh.intel.com ([10.239.245.20])
 by orsmga005.jf.intel.com with ESMTP; 29 Mar 2023 18:46:57 -0700
Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com
 [10.239.240.127])
 by shvmail03.sh.intel.com (Postfix) with ESMTP id 31E3C10189E3;
 Thu, 30 Mar 2023 09:46:56 +0800 (CST)
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com,
	hjl.tools@gmail.com,
	ubizjak@gmail.com
Subject: [PATCH] Support vector conversion for AVX512
 vcvtudq2pd/vcvttps2udq/vcvttpd2udq.
Date: Thu, 30 Mar 2023 09:44:56 +0800
Message-Id: <20230330014456.1425596-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.39.1.388.g2fc9e9ca3c
MIME-Version: 1.0
X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: liuhongt via Gcc-patches <gcc-patches@gcc.gnu.org>
From: liuhongt <hongtao.liu@intel.com>
Reply-To: liuhongt <hongtao.liu@intel.com>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"
 <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>

The standard pattern names for unsigned_{float,fix} are misspelled in
current trunk: they should be floatunsmn2/fixuns_truncmn2, not
ufloatmn2/ufix_truncmn2.  This patch adds expanders under the standard names.

Also vcvttps2udq is available under AVX512VL, so it can be generated
directly instead of being emulated via vcvttps2dq.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for GCC14 stage1 (or maybe for trunk)?

gcc/ChangeLog:

	PR target/85048
	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2):
	Generate vcvttps2udq under AVX512VL.
	(fixuns_truncv4dfv4si2): New expander.
	(floatuns<si2dfmodelower><mode>2): New expander.

gcc/testsuite/ChangeLog:

	* g++.target/i386/pr85048.C: New test.
---
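
For readers unfamiliar with why the direct instruction matters: a signed
truncating conversion (what vcvttps2dq provides per lane) cannot represent
values of 2^31 or more, so an unsigned result has to be built around it.
The following is only a scalar sketch of that general idea, not the code
GCC emits for the fallback path; the function name fixuns_via_signed is
hypothetical and the input is assumed to lie in [0, 2^32).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model: float -> uint32_t using only a signed
// float -> int32_t truncation, as a stand-in for one vector lane.
// Assumption: 0 <= x < 2^32; anything else is out of range.
std::uint32_t fixuns_via_signed(float x) {
    const float two31 = 2147483648.0f;  // 2^31, exactly representable
    assert(x >= 0.0f && x < 4294967296.0f);
    if (x < two31) {
        // Already fits the signed range: convert directly.
        return static_cast<std::uint32_t>(static_cast<std::int32_t>(x));
    }
    // Bias into the signed range (the subtraction is exact here),
    // convert, then restore the top bit.
    const std::int32_t low = static_cast<std::int32_t>(x - two31);
    return static_cast<std::uint32_t>(low) ^ 0x80000000u;
}
```

With vcvttps2udq the compare, bias and fix-up steps are unnecessary, which
is why the expander can emit the instruction directly under AVX512VL.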
 gcc/config/i386/sse.md                  | 18 ++++++++++++--
 gcc/testsuite/g++.target/i386/pr85048.C | 33 +++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr85048.C

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 172ec3bea4f..9c2bd468c65 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8014,8 +8014,9 @@ (define_expand "fixuns_trunc<mode><sseintvecmodelower>2"
    (match_operand:VF1 1 "register_operand")]
   "TARGET_SSE2"
 {
-  if (<MODE>mode == V16SFmode)
-    emit_insn (gen_ufix_truncv16sfv16si2 (operands[0],
+  /* AVX512 supports vcvttps2udq for all 128/256/512-bit vectors.  */
+  if (<MODE>mode == V16SFmode || TARGET_AVX512VL)
+    emit_insn (gen_ufix_trunc<mode><sseintvecmodelower>2 (operands[0],
 					  operands[1]));
   else
     {
@@ -8413,6 +8414,12 @@ (define_insn "*float<floatunssuffix>v2div2sf2_mask_1"
    (set_attr "prefix" "evex")
    (set_attr "mode" "V4SF")])
 
+(define_expand "floatuns<si2dfmodelower><mode>2"
+  [(set (match_operand:VF2_512_256VL 0 "register_operand")
+	(unsigned_float:VF2_512_256VL
+	  (match_operand:<si2dfmode> 1 "nonimmediate_operand")))]
+   "TARGET_AVX512F")
+
 (define_insn "ufloat<si2dfmodelower><mode>2<mask_name>"
   [(set (match_operand:VF2_512_256VL 0 "register_operand" "=v")
 	(unsigned_float:VF2_512_256VL
@@ -8694,6 +8701,13 @@ (define_insn "fix_truncv4dfv4si2<mask_name>"
    (set_attr "prefix" "maybe_evex")
    (set_attr "mode" "OI")])
 
+
+/* The standard pattern name is fixuns_truncmn2.  */
+(define_expand "fixuns_truncv4dfv4si2"
+  [(set (match_operand:V4SI 0 "register_operand")
+	(unsigned_fix:V4SI (match_operand:V4DF 1 "nonimmediate_operand")))]
+  "TARGET_AVX512VL && TARGET_AVX512F")
+
 (define_insn "ufix_truncv4dfv4si2<mask_name>"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
 	(unsigned_fix:V4SI (match_operand:V4DF 1 "nonimmediate_operand" "vm")))]
diff --git a/gcc/testsuite/g++.target/i386/pr85048.C b/gcc/testsuite/g++.target/i386/pr85048.C
new file mode 100644
index 00000000000..52973c18ebd
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr85048.C
@@ -0,0 +1,33 @@
+/* PR target/85048 */
+/* { dg-do compile }  */
+/* { dg-options "-std=c++17 -O2 -mavx512vl -mavx512dq -mprefer-vector-width=512" } */
+/* { dg-final { scan-assembler-times {(?n)vcvtudq2pd[ \t]+} 2 } } */
+/* { dg-final { scan-assembler-times {(?n)vcvttps2udq[ \t]+} 2 } } */
+/* { dg-final { scan-assembler-times {(?n)vcvttpd2udqy?[ \t]+} 1 } } */
+
+#include <cstdint>
+
+template <class T, int N, int Size = N * sizeof(T)>
+using V [[gnu::vector_size(Size)]] = T;
+
+template <class From, class To> V<To, 4> cvt4(V<From, 4> x) {
+    return V<To, 4>{To(x[0]), To(x[1]), To(x[2]), To(x[3])};
+}
+template <class From, class To> V<To, 8> cvt8(V<From, 8> x) {
+    return V<To, 8>{
+        To(x[0]), To(x[1]), To(x[2]), To(x[3]),
+        To(x[4]), To(x[5]), To(x[6]), To(x[7])
+    };
+}
+
+#define _(name, from, to, size) \
+auto name(V<from, size> x) { return cvt##size<from, to>(x); }
+// integral -> double
+_(vcvtudq2pd, uint32_t, double, 4)
+_(vcvtudq2pd, uint32_t, double, 8)
+
+_( cvttps2udq, float, uint32_t,  4)
+_(vcvttps2udq, float, uint32_t,  8)
+
+// double -> integral
+_(vcvttpd2udq, double, uint32_t, 4)