From patchwork Wed Mar 29 07:21:26 2023
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 67076
From: liuhongt
Reply-To: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com, ubizjak@gmail.com
Subject: [PATCH] Generate vpblendd instead of vpblendw for V4SI under AVX2.
Date: Wed, 29 Mar 2023 15:21:26 +0800
Message-Id: <20230329072126.2297953-1-hongtao.liu@intel.com>

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
Ok for GCC14 stage-1 (or maybe trunk)?

gcc/ChangeLog:

	* config/i386/i386-expand.cc (expand_vec_perm_blend): Generate
	vpblendd instead of vpblendw for V4SI under avx2.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr88828-0.c: Adjust testcase.
---
 gcc/config/i386/i386-expand.cc            | 18 ++++++++++++++----
 gcc/testsuite/gcc.target/i386/pr88828-0.c |  2 +-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index c1300dc4e26..1c436262ee5 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -19069,10 +19069,20 @@ expand_vec_perm_blend (struct expand_vec_perm_d *d)
       goto do_subreg;
 
     case E_V4SImode:
-      for (i = 0; i < 4; ++i)
-        mask |= (d->perm[i] >= 4 ? 3 : 0) << (i * 2);
-      vmode = V8HImode;
-      goto do_subreg;
+      if (TARGET_AVX2)
+        {
+          /* Use vpblendd instead of vpblendw.  */
+          for (i = 0; i < nelt; ++i)
+            mask |= ((unsigned HOST_WIDE_INT) (d->perm[i] >= nelt)) << i;
+          break;
+        }
+      else
+        {
+          for (i = 0; i < 4; ++i)
+            mask |= (d->perm[i] >= 4 ? 3 : 0) << (i * 2);
+          vmode = V8HImode;
+          goto do_subreg;
+        }
 
     case E_V16QImode:
       /* See if bytes move in pairs so we can use pblendw with
diff --git a/gcc/testsuite/gcc.target/i386/pr88828-0.c b/gcc/testsuite/gcc.target/i386/pr88828-0.c
index 3ddb2d13526..441c441b51d 100644
--- a/gcc/testsuite/gcc.target/i386/pr88828-0.c
+++ b/gcc/testsuite/gcc.target/i386/pr88828-0.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -msse4.2" } */
+/* { dg-options "-O2 -msse4.2 -mno-avx2" } */
 
 typedef int v4si __attribute__((vector_size(16)));
 typedef float v4sf __attribute__((vector_size(16)));
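
For reviewers who want to check the immediates by hand, here is a minimal
standalone sketch of the two mask computations in the E_V4SImode hunk. It is
not part of the patch. The helper names blend_mask_words and
blend_mask_dwords are hypothetical, and the plain unsigned array stands in
for d->perm with nelt fixed at 4. The pblendw form sets two bits per 32-bit
element because each element spans two 16-bit lanes; the vpblendd form sets
one bit per element.

```c
#include <assert.h>

/* Hypothetical helper, not from the patch: pblendw immediate for a V4SI
   blend.  Each 32-bit element covers two 16-bit lanes, so an element taken
   from the second operand contributes the bit pair 0b11.  */
static unsigned
blend_mask_words (const unsigned perm[4])
{
  unsigned mask = 0;
  for (unsigned i = 0; i < 4; ++i)
    {
      /* A blend keeps every element in its own position.  */
      assert (perm[i] % 4 == i);
      mask |= (perm[i] >= 4 ? 3u : 0u) << (i * 2);
    }
  return mask;
}

/* Hypothetical helper, not from the patch: vpblendd immediate for the same
   blend, one bit per 32-bit element.  */
static unsigned
blend_mask_dwords (const unsigned perm[4])
{
  unsigned mask = 0;
  for (unsigned i = 0; i < 4; ++i)
    {
      assert (perm[i] % 4 == i);
      mask |= (unsigned) (perm[i] >= 4) << i;
    }
  return mask;
}
```

For the selector {0, 5, 2, 7}, which takes elements 1 and 3 from the second
operand, the word form gives 0xCC and the dword form gives 0xA. Both
immediates describe the same blend.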