From patchwork Mon Dec  4 05:32:08 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt <hongtao.liu@intel.com>
X-Patchwork-Id: 81247
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org>
X-Original-To: patchwork@sourceware.org
Delivered-To: patchwork@sourceware.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 4A9763857C58
	for <patchwork@sourceware.org>; Mon,  4 Dec 2023 05:32:33 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.12])
 by sourceware.org (Postfix) with ESMTPS id 926B03858C78
 for <gcc-patches@gcc.gnu.org>; Mon,  4 Dec 2023 05:32:14 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 926B03858C78
Authentication-Results: sourceware.org;
 dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 926B03858C78
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=198.175.65.12
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701667937; cv=none;
 b=WYWHlWFJwprt0MX0ufGPUMAW3dseUM/TGkhqKXQRcjoZDo2rPGBND0Q3Kyky0GFjFUpz1yNSpo2MKJyGRhyrPxmVvKbKB51JHc50Is8WJAJ/ofFNv2N5f/Gm/VuGPi1ULc1NNm22BygBXj501pT2q1vU7pQ8necdflLP/zvRxVs=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701667937; c=relaxed/simple;
 bh=g5paXve4KjIfqlwjfzZQ86XmkcsDeNU2UxxOL6bX+nU=;
 h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version;
 b=bpPWuFjSM0CNZYn+6Av3GGQO9mapdaZjoRPv4YHE+TKfsVCzPOgp3+PN3/fWtwObd9tEoSPQrgNslJ1TrM7mtCRn9JuBbzcsfK4TbVb5hK0EnVzjNHFL2w3GUxEVCAZgL+DhFIUD4zmN8RpJrPXaw9KbBaWeAhS6CHkaTPJfau4=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple;
 d=intel.com; i=@intel.com; q=dns/txt; s=Intel;
 t=1701667934; x=1733203934;
 h=from:to:cc:subject:date:message-id:mime-version:
 content-transfer-encoding;
 bh=g5paXve4KjIfqlwjfzZQ86XmkcsDeNU2UxxOL6bX+nU=;
 b=BXmlhRUR+E29dcEaipbbq6nHJ8XU51rLxgM12PDeIb6mLE4I4I8lbniY
 psbi32SBcttEtcV5uJvPC2UIYlco0xZrOx6NIe7kLAlTH5C06rL6pXDXx
 nlY4c8BqQZgKYCHvr9Dm6idqCy/rOuhfBCaDkFwcq0AyDHHodaIuhAYFp
 e8iVhbbpJDpxvvrCskZ7SlClw+6x2OZovYowUUUC1cPL9ufpHvjGqph29
 c2xEr8g4mMReIBtNAtmgexnWtcxycDtHzBJSg2pdqc1s7NM0a4/5JUr2Q
 JgtcySx+h5JTjMVtIcqWld+3S5PSMkZfOES3zFI03PUE4akMTUUEvh7HX g==;
X-IronPort-AV: E=McAfee;i="6600,9927,10913"; a="736407"
X-IronPort-AV: E=Sophos;i="6.04,248,1695711600";
   d="scan'208";a="736407"
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by orvoesa104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 03 Dec 2023 21:32:13 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6600,9927,10913"; a="861261515"
X-IronPort-AV: E=Sophos;i="6.04,248,1695711600"; d="scan'208";a="861261515"
Received: from shvmail03.sh.intel.com ([10.239.245.20])
 by FMSMGA003.fm.intel.com with ESMTP; 03 Dec 2023 21:32:09 -0800
Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com
 [10.239.240.127])
 by shvmail03.sh.intel.com (Postfix) with ESMTP id 9D3871005689;
 Mon,  4 Dec 2023 13:32:08 +0800 (CST)
From: liuhongt <hongtao.liu@intel.com>
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com,
	hjl.tools@gmail.com
Subject: [PATCH] Don't vectorize when vector stmts are only vec_contruct and
 stores
Date: Mon,  4 Dec 2023 13:32:08 +0800
Message-Id: <20231204053208.908533-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1
MIME-Version: 1.0
X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 KAM_SHORT,
 SPF_HELO_NONE, SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org

.i.e. for below cases.
   a[0] = b1;
   a[1] = b2;
   ..
   a[n] = bn;

There're extra dependences when contructing the vector, but not for
scalar store. According to experiments, it's generally worse.

The patch adds an cut-off heuristic when vec_stmt is just
vec_construct and vector store. It improves SPEC2017 a little bit.

BenchMarks		Ratio
500.perlbench_r		2.60%
502.gcc_r		0.30%
505.mcf_r		0.40%
520.omnetpp_r		-1.00%
523.xalancbmk_r		0.90%
525.x264_r		0.00%
531.deepsjeng_r		0.30%
541.leela_r		0.90%
548.exchange2_r		3.20%
557.xz_r		1.40%
503.bwaves_r		0.00%
507.cactuBSSN_r		0.00%
508.namd_r		0.30%
510.parest_r		0.00%
511.povray_r		0.20%
519.lbm_r		SAME BIN
521.wrf_r		-0.30%
526.blender_r		-1.20%
527.cam4_r		-0.20%
538.imagick_r		4.00%
544.nab_r		0.40%
549.fotonik3d_r		0.00%
554.roms_r		0.00%
Geomean-int		0.90%
Geomean-fp		0.30%
Geomean-all		0.50%

And
Regressed testcases:

gcc.target/i386/part-vect-absneghf.c
gcc.target/i386/part-vect-copysignhf.c
gcc.target/i386/part-vect-xorsignhf.c

Regressed under -m32 since it generates 2 vector
.ABS/NEG/XORSIGN/COPYSIGN vs original 1 64-bit vec_construct. The
original testcases are used to test vectorization capability for
.ABS/NEG/XORG/COPYSIGN, so just restrict testcase to TARGET_64BIT.

gcc.target/i386/pr111023-2.c
gcc.target/i386/pr111023.c
Regressed under -m32

testcase as below

void
v8hi_v8qi (v8hi *dst, v16qi src)
{
  short tem[8];
  tem[0] = src[0];
  tem[1] = src[1];
  tem[2] = src[2];
  tem[3] = src[3];
  tem[4] = src[4];
  tem[5] = src[5];
  tem[6] = src[6];
  tem[7] = src[7];
  dst[0] = *(v8hi *) tem;
}

under 64-bit target, vectorizer realize it's just permutation of
original src vector, but under -m32, vectorizer relies on
vec_construct for vectorization. I think optimziation for this case
under 32-bit target maynot impact much, so just add
-fno-vect-cost-model.

gcc.target/i386/pr91446.c: This testcase is guard for cost model of
vector store, not vectorization capability, so just adjust testcase.

gcc.target/i386/pr108938-3.c: This testcase relies on vec_construct to
optimize for bswap, like other optimziation vectorizer can't realize
optimization after it. So the current solution is add
-fno-vect-cost-model to the testcase.

costmodel-pr104582-1.c
costmodel-pr104582-2.c
costmodel-pr104582-4.c

Failed since it's now not vectorized, looked at the PR, it's exactly
what's wanted, so adjust testcase to scan-tree-dump-not.


Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR target/99881
	PR target/104582
	* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
	Check if kind is vec_construct or vector store.
	(ix86_vector_costs::finish_cost): Don't do vectorization when
	vector stmts are only vec_construct and stores.
	(ix86_vector_costs::ix86_vect_construct_store_only_p): New
	function.
	(ix86_vector_costs::ix86_vect_cut_off): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/part-vect-absneghf.c: Restrict testcase to
	TARGET_64BIT.
	* gcc.target/i386/part-vect-copysignhf.c: Ditto.
	* gcc.target/i386/part-vect-xorsignhf.c: Ditto.
	* gcc.target/i386/pr91446.c: Adjust testcase.
	* gcc.target/i386/pr108938-3.c: Add -fno-vect-cost-model.
	* gcc.target/i386/pr111023-2.c: Ditto.
	* gcc.target/i386/pr111023.c: Ditto.
	* gcc.target/i386/pr99881.c: Remove xfail.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c: Changed
	to Scan-tree-dump-not.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c: Ditto.
	* gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c: Ditto.
---
 gcc/config/i386/i386.cc                       | 81 ++++++++++++++++++-
 .../costmodel/x86_64/costmodel-pr104582-1.c   |  2 +-
 .../costmodel/x86_64/costmodel-pr104582-3.c   |  2 +-
 .../costmodel/x86_64/costmodel-pr104582-4.c   |  2 +-
 .../gcc.target/i386/part-vect-absneghf.c      |  4 +-
 .../gcc.target/i386/part-vect-copysignhf.c    |  4 +-
 .../gcc.target/i386/part-vect-xorsignhf.c     |  4 +-
 gcc/testsuite/gcc.target/i386/pr108938-3.c    |  2 +-
 gcc/testsuite/gcc.target/i386/pr111023-2.c    |  2 +-
 gcc/testsuite/gcc.target/i386/pr111023.c      |  2 +-
 gcc/testsuite/gcc.target/i386/pr91446.c       | 14 ++--
 gcc/testsuite/gcc.target/i386/pr99881.c       |  2 +-
 12 files changed, 99 insertions(+), 22 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index dcaea6c2096..a4b23e29eba 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -24573,6 +24573,10 @@ public:
 
 private:
 
+  /* Don't do vectorization for certain patterns.  */
+  void ix86_vect_cut_off ();
+
+  bool ix86_vect_construct_store_only_p (vect_cost_for_stmt, stmt_vec_info);
   /* Estimate register pressure of the vectorized code.  */
   void ix86_vect_estimate_reg_pressure ();
   /* Number of GENERAL_REGS/SSE_REGS used in the vectorizer, it's used for
@@ -24581,12 +24585,15 @@ private:
      where we know it's not loaded from memory.  */
   unsigned m_num_gpr_needed[3];
   unsigned m_num_sse_needed[3];
+
+  bool m_vec_construct_store_only;
 };
 
 ix86_vector_costs::ix86_vector_costs (vec_info* vinfo, bool costing_for_scalar)
   : vector_costs (vinfo, costing_for_scalar),
     m_num_gpr_needed (),
-    m_num_sse_needed ()
+    m_num_sse_needed (),
+    m_vec_construct_store_only (true)
 {
 }
 
@@ -24609,6 +24616,10 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
     = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
   int stmt_cost = - 1;
 
+  if (m_vec_construct_store_only
+      && !ix86_vect_construct_store_only_p (kind, stmt_info))
+    m_vec_construct_store_only = false;
+
   bool fp = false;
   machine_mode mode = scalar_p ? SImode : TImode;
 
@@ -24865,8 +24876,45 @@ ix86_vector_costs::ix86_vect_estimate_reg_pressure ()
     }
 }
 
+/* Return true if KIND and STMT_INFO is either vec_construct or vector
+   store, stmt_info could be promote/demote between vec_construct and
+   vector store, also count that.  */
+bool
+ix86_vector_costs::ix86_vect_construct_store_only_p (vect_cost_for_stmt kind,
+						    stmt_vec_info stmt_info)
+{
+  switch (kind)
+    {
+    case vec_construct:
+    case vector_store:
+    case unaligned_store:
+    case vector_scatter_store:
+    case vec_promote_demote:
+      return true;
+
+      /* Vectorizer will try VEC_PACK_TRUNK_EXPR for things likes
+	 char* a;
+	 short b1, b2, b3, b4;
+	 a[0] = b1;
+	 a[1] = b2;
+	 a[2] = b3;
+	 a[3] = b4;
+	 Also don't vectorized it.  */
+    case vector_stmt:
+      if (stmt_info && stmt_info->stmt
+	  && gimple_code (stmt_info->stmt) == GIMPLE_ASSIGN
+	  && gimple_assign_rhs_code (stmt_info->stmt) == NOP_EXPR)
+	return true;
+
+    default:
+      break;
+    }
+
+  return false;
+}
+
 void
-ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
+ix86_vector_costs::ix86_vect_cut_off ()
 {
   loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo);
   if (loop_vinfo && !m_costing_for_scalar)
@@ -24885,10 +24933,39 @@ ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
 	  && (exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo).to_constant ())
 	      > ceil_log2 (LOOP_VINFO_INT_NITERS (loop_vinfo))))
 	m_costs[vect_body] = INT_MAX;
+      return;
+    }
+
+  /* Don't do vectorization for things like
+
+     a[0] = b1;
+     a[1] = b2;
+     ..
+     a[n] = bn;
+
+     There're extra dependences when contructing the vector, but not for
+     scalar store. According to experiments, it's generally worse.  */
+  if (m_vec_construct_store_only)
+    {
+      m_costs[0] = m_costs[1] = m_costs[2] = INT_MAX;
+      if (dump_enabled_p ())
+	dump_printf_loc (MSG_NOTE, vect_location,
+			 "Skip vectorization for stmts which only contains"
+			 " vec_construct and vector store.\n");
+      return;
     }
 
+  return;
+}
+
+void
+ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
+{
+
   ix86_vect_estimate_reg_pressure ();
 
+  ix86_vect_cut_off ();
+
   vector_costs::finish_cost (scalar_costs);
 }
 
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c
index 992a845ad7a..f940af70b72 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-1.c
@@ -12,4 +12,4 @@ foo (unsigned long *a, unsigned long *b)
   s.b = b_;
 }
 
-/* { dg-final { scan-tree-dump "basic block part vectorized" "slp2" } } */
+/* { dg-final { scan-tree-dump-not "basic block part vectorized" "slp2" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c
index 999c4905708..eff60b2a82a 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-3.c
@@ -10,4 +10,4 @@ foo (double a, double b)
   s.b = b;
 }
 
-/* { dg-final { scan-tree-dump "basic block part vectorized" "slp2" } } */
+/* { dg-final { scan-tree-dump-not "basic block part vectorized" "slp2" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c
index cc471e1ed73..2f354338e96 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-pr104582-4.c
@@ -12,4 +12,4 @@ foo (signed long *a, unsigned long *b)
   s.b = b_;
 }
 
-/* { dg-final { scan-tree-dump "basic block part vectorized" "slp2" } } */
+/* { dg-final { scan-tree-dump-not "basic block part vectorized" "slp2" } } */
diff --git a/gcc/testsuite/gcc.target/i386/part-vect-absneghf.c b/gcc/testsuite/gcc.target/i386/part-vect-absneghf.c
index 48aed14d604..4052210ec39 100644
--- a/gcc/testsuite/gcc.target/i386/part-vect-absneghf.c
+++ b/gcc/testsuite/gcc.target/i386/part-vect-absneghf.c
@@ -85,7 +85,7 @@ do_test (void)
       abort ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized using 8 byte vectors" 2 "slp2" } } */
-/* { dg-final { scan-tree-dump-times "vectorized using 4 byte vectors" 2 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "vectorized using 8 byte vectors" 2 "slp2" { target { ! ia32 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized using 4 byte vectors" 2 "slp2" { target { ! ia32 } } } } */
 /* { dg-final { scan-tree-dump-times {(?n)ABS_EXPR <vect} 2 "optimized" { target { ! ia32 } } } } */
 /* { dg-final { scan-tree-dump-times {(?n)= -vect} 2 "optimized" { target { ! ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/part-vect-copysignhf.c b/gcc/testsuite/gcc.target/i386/part-vect-copysignhf.c
index 811617bc3dd..006941f6651 100644
--- a/gcc/testsuite/gcc.target/i386/part-vect-copysignhf.c
+++ b/gcc/testsuite/gcc.target/i386/part-vect-copysignhf.c
@@ -55,6 +55,6 @@ do_test (void)
       abort ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized using 8 byte vectors" 1 "slp2" } } */
-/* { dg-final { scan-tree-dump-times "vectorized using 4 byte vectors" 1 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "vectorized using 8 byte vectors" 1 "slp2" { target { ! ia32 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized using 4 byte vectors" 1 "slp2" { target { ! ia32 } } } } */
 /* { dg-final { scan-tree-dump-times ".COPYSIGN" 2 "optimized" { target { ! ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/part-vect-xorsignhf.c b/gcc/testsuite/gcc.target/i386/part-vect-xorsignhf.c
index a8ec60a088a..a57dc5aba23 100644
--- a/gcc/testsuite/gcc.target/i386/part-vect-xorsignhf.c
+++ b/gcc/testsuite/gcc.target/i386/part-vect-xorsignhf.c
@@ -55,6 +55,6 @@ do_test (void)
       abort ();
 }
 
-/* { dg-final { scan-tree-dump-times "vectorized using 8 byte vectors" 1 "slp2" } } */
-/* { dg-final { scan-tree-dump-times "vectorized using 4 byte vectors" 1 "slp2" } } */
+/* { dg-final { scan-tree-dump-times "vectorized using 8 byte vectors" 1 "slp2" { target { ! ia32 } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized using 4 byte vectors" 1 "slp2" { target { ! ia32 } } } } */
 /* { dg-final { scan-tree-dump-times ".XORSIGN" 2 "optimized" { target { ! ia32 } } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr108938-3.c b/gcc/testsuite/gcc.target/i386/pr108938-3.c
index 32ac544c7ed..24725a9ab1d 100644
--- a/gcc/testsuite/gcc.target/i386/pr108938-3.c
+++ b/gcc/testsuite/gcc.target/i386/pr108938-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -ftree-vectorize -mno-movbe" } */
+/* { dg-options "-O2 -ftree-vectorize -mno-movbe -fno-vect-cost-model" } */
 /* { dg-final { scan-assembler-times "bswap\[\t ]+" 2 { target { ! ia32 } } } } */
 /* { dg-final { scan-assembler-times "bswap\[\t ]+" 3 { target ia32 } } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr111023-2.c b/gcc/testsuite/gcc.target/i386/pr111023-2.c
index 6c69f947544..db25668a9ae 100644
--- a/gcc/testsuite/gcc.target/i386/pr111023-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr111023-2.c
@@ -1,6 +1,6 @@
 /* PR target/111023 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse2 -mno-sse4.1" } */
+/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse2 -mno-sse4.1 -fno-vect-cost-model" } */
 
 typedef char v16qi __attribute__((vector_size (16)));
 typedef short v8hi __attribute__((vector_size (16)));
diff --git a/gcc/testsuite/gcc.target/i386/pr111023.c b/gcc/testsuite/gcc.target/i386/pr111023.c
index 6144c371f32..18cd579d937 100644
--- a/gcc/testsuite/gcc.target/i386/pr111023.c
+++ b/gcc/testsuite/gcc.target/i386/pr111023.c
@@ -1,6 +1,6 @@
 /* PR target/111023 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse2 -mno-sse4.1" } */
+/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse2 -mno-sse4.1 -fno-vect-cost-model" } */
 
 typedef unsigned char v16qi __attribute__((vector_size (16)));
 typedef unsigned short v8hi __attribute__((vector_size (16)));
diff --git a/gcc/testsuite/gcc.target/i386/pr91446.c b/gcc/testsuite/gcc.target/i386/pr91446.c
index 0243ca3ea68..3e0f8a4b315 100644
--- a/gcc/testsuite/gcc.target/i386/pr91446.c
+++ b/gcc/testsuite/gcc.target/i386/pr91446.c
@@ -10,15 +10,15 @@ typedef struct
 extern void bar (info *);
 
 void
-foo (unsigned long long width, unsigned long long height,
-     long long x, long long y)
+foo (unsigned long long* width,
+     long long* x)
 {
   info t;
-  t.width = width;
-  t.height = height;
-  t.x = x;
-  t.y = y;
+  t.width = width[0];
+  t.height = width[1];
+  t.x = x[0];
+  t.y = x[1];
   bar (&t);
 }
 
-/* { dg-final { scan-assembler-times "vmovdqa\[^\n\r\]*xmm\[0-9\]" 2 } } */
+/* { dg-final { scan-assembler-times "vmovdq\[au\]\[^\n\r\]*xmm\[0-9\]" 4 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr99881.c b/gcc/testsuite/gcc.target/i386/pr99881.c
index 3e087eb2ed7..1a59325ac1c 100644
--- a/gcc/testsuite/gcc.target/i386/pr99881.c
+++ b/gcc/testsuite/gcc.target/i386/pr99881.c
@@ -1,7 +1,7 @@
 /* PR target/99881.  */
 /* { dg-do compile { target { ! ia32 } } } */
 /* { dg-options "-Ofast -march=skylake" } */
-/* { dg-final { scan-assembler-not "xmm\[0-9\]" { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not "xmm\[0-9\]"  } } */
 
 void
 foo (int* __restrict a, int n, int c)