From patchwork Thu Mar 30 01:44:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 67109
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com, ubizjak@gmail.com
Subject: [PATCH] Support vector conversion for AVX512 vcvtudq2pd/vcvttps2udq/vcvttpd2udq.
Date: Thu, 30 Mar 2023 09:44:56 +0800
Message-Id: <20230330014456.1425596-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.39.1.388.g2fc9e9ca3c

There is a typo in the standard pattern names for unsigned_{float,fix}
in current trunk: they should be floatunsmn2/fixuns_truncmn2, not
ufloatmn2/ufix_truncmn2. This patch fixes the typo.

Also, vcvttps2udq is available under AVX512VL, so it can be generated
directly instead of being emulated via vcvttps2dq.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for GCC14 stage1 (or maybe for trunk)?

gcc/ChangeLog:

	PR target/85048
	* config/i386/sse.md (floatuns2): Generate vcvtudq2ps under
	AVX512VL.
	(fixuns_truncv4dfv4si2): New expander.
	(floatuns2): New expander.

gcc/testsuite/ChangeLog:

	* g++.target/i386/pr85048.C: New test.
---
 gcc/config/i386/sse.md                  | 18 ++++++++++++--
 gcc/testsuite/g++.target/i386/pr85048.C | 33 +++++++++++++++++++++++++
 2 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr85048.C

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 172ec3bea4f..9c2bd468c65 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -8014,8 +8014,9 @@ (define_expand "fixuns_trunc2"
    (match_operand:VF1 1 "register_operand")]
   "TARGET_SSE2"
 {
-  if (mode == V16SFmode)
-    emit_insn (gen_ufix_truncv16sfv16si2 (operands[0],
+  /* AVX512 support vcvttps2udq for all 128/256/512-bit vectors.  */
+  if (mode == V16SFmode || TARGET_AVX512VL)
+    emit_insn (gen_ufix_trunc2 (operands[0],
 					  operands[1]));
   else
     {
@@ -8413,6 +8414,12 @@ (define_insn "*floatv2div2sf2_mask_1"
    (set_attr "prefix" "evex")
    (set_attr "mode" "V4SF")])
 
+(define_expand "floatuns2"
+  [(set (match_operand:VF2_512_256VL 0 "register_operand")
+	(unsigned_float:VF2_512_256VL
+	  (match_operand: 1 "nonimmediate_operand")))]
+   "TARGET_AVX512F")
+
 (define_insn "ufloat2"
   [(set (match_operand:VF2_512_256VL 0 "register_operand" "=v")
 	(unsigned_float:VF2_512_256VL
@@ -8694,6 +8701,13 @@ (define_insn "fix_truncv4dfv4si2"
    (set_attr "prefix" "maybe_evex")
    (set_attr "mode" "OI")])
 
+
+/* The standard pattern name is fixuns_truncmn2.  */
+(define_expand "fixuns_truncv4dfv4si2"
+  [(set (match_operand:V4SI 0 "register_operand")
+	(unsigned_fix:V4SI (match_operand:V4DF 1 "nonimmediate_operand")))]
+  "TARGET_AVX512VL && TARGET_AVX512F")
+
 (define_insn "ufix_truncv4dfv4si2"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
 	(unsigned_fix:V4SI (match_operand:V4DF 1 "nonimmediate_operand" "vm")))]
diff --git a/gcc/testsuite/g++.target/i386/pr85048.C b/gcc/testsuite/g++.target/i386/pr85048.C
new file mode 100644
index 00000000000..52973c18ebd
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr85048.C
@@ -0,0 +1,33 @@
+/* PR target/85048 */
+/* { dg-do compile } */
+/* { dg-options "-std=c++17 -O2 -mavx512vl -mavx512dq -mprefer-vector-width=512" } */
+/* { dg-final { scan-assembler-times {(?n)vcvtudq2pd[ \t]+} 2 } } */
+/* { dg-final { scan-assembler-times {(?n)vcvttps2udq[ \t]+} 2 } } */
+/* { dg-final { scan-assembler-times {(?n)vcvttpd2udqy?[ \t]+} 1 } } */
+
+#include
+
+template
+using V [[gnu::vector_size(Size)]] = T;
+
+template V cvt4(V x) {
+  return V{To(x[0]), To(x[1]), To(x[2]), To(x[3])};
+}
+template V cvt8(V x) {
+  return V{
+    To(x[0]), To(x[1]), To(x[2]), To(x[3]),
+    To(x[4]), To(x[5]), To(x[6]), To(x[7])
+  };
+}
+
+#define _(name, from, to, size) \
+auto name(V x) { return cvt##size(x); }
+// integral -> double
+_(vcvtudq2pd, uint32_t, double, 4)
+_(vcvtudq2pd, uint32_t, double, 8)
+
+_( cvttps2udq, float, uint32_t, 4)
+_(vcvttps2udq, float, uint32_t, 8)
+
+// double -> integral
+_(vcvttpd2udq, double, uint32_t, 4)
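
For readers unfamiliar with what "emulated via vcvttps2dq" means in the description above, here is a rough scalar sketch of the general technique: convert float to uint32_t using only a signed float-to-int32 truncation, which is what vcvttps2dq provides per lane. This is an illustration under stated assumptions, not the exact sequence GCC's expander emits. The names fixuns_via_signed and self_check are hypothetical, and inputs are assumed to lie in [0, 2^32).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model of one vector lane. Assumes 0 <= x < 2^32.
// Values below 2^31 fit a signed conversion directly; larger values are
// shifted down by 2^31 first and the top bit is restored afterwards.
std::uint32_t fixuns_via_signed(float x) {
    const float two31 = 2147483648.0f;  // 2^31, exactly representable as float
    if (x < two31) {
        return static_cast<std::uint32_t>(static_cast<std::int32_t>(x));
    }
    // For floats in [2^31, 2^32) this subtraction is exact.
    const std::int32_t low = static_cast<std::int32_t>(x - two31);
    // low is in [0, 2^31), so XOR with the sign bit adds 2^31 back.
    return static_cast<std::uint32_t>(low) ^ 0x80000000u;
}

// Compare the emulation with a direct unsigned conversion on a few
// in-range samples, including both sides of the 2^31 boundary.
void self_check() {
    const float samples[] = {0.0f, 1.5f, 2147483520.0f,
                             2147483648.0f, 4294967040.0f};
    for (float x : samples) {
        assert(fixuns_via_signed(x) == static_cast<std::uint32_t>(x));
    }
}
```

With AVX512VL the compare, subtract and fix-up steps modelled here are unnecessary, since vcvttps2udq performs the unsigned truncation in a single instruction for 128/256-bit vectors as well.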