x86: Check XMM destination when optimizing 128-bit VPBROADCASTQ

Message ID CAMe9rOqfwneN-54vwwLez0_YGTBn88NuzCzL9=N4h5kio2mVJQ@mail.gmail.com
State New
Series x86: Check XMM destination when optimizing 128-bit VPBROADCASTQ

Checks

Context                                              Check    Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm      success  Build passed
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64  success  Build passed
linaro-tcwg-bot/tcwg_binutils_check--master-aarch64  success  Test passed
linaro-tcwg-bot/tcwg_binutils_check--master-arm      success  Test passed

Commit Message

H.J. Lu May 27, 2026, 12:53 a.m. UTC
  commit eb4031cb20aa710834be891f8638e04dbba81edc
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Jul 4 17:07:26 2023 +0200

    x86: optimize 128-bit VPBROADCASTQ to VPUNPCKLQDQ

was supposed to optimize

vpbroadcastq %xmmN, %xmmM  -> vpunpcklqdq %xmmN, %xmmN, %xmmM (N < 8)

But it didn't check whether the destination operand is an XMM register.
As a result, it turned:

vpbroadcastq %xmmN, %ymmM

into

vpunpcklqdq %xmmN, %xmmN, %xmmM

Fix this by checking for an XMM destination.
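
To illustrate why the rewrite is only safe for an XMM destination, here is a
minimal C sketch that models the three instruction forms on a register viewed
as four 64-bit lanes.  The type vec256 and all function names are hypothetical
and exist only for this illustration; they are not part of gas.  The model
assumes the usual VEX behavior that a 128-bit operation zeroes the upper bits
of the full register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a 256-bit register as four 64-bit lanes,
   q[0] being the lowest.  */
typedef struct { uint64_t q[4]; } vec256;

/* Models vpbroadcastq %xmmN, %xmmM: the low qword of the source fills
   both qwords of the 128-bit destination; upper lanes are zeroed.  */
static vec256 vpbroadcastq_xmm(vec256 src)
{
  vec256 dst = { { src.q[0], src.q[0], 0, 0 } };
  return dst;
}

/* Models vpbroadcastq %xmmN, %ymmM: the low qword of the source fills
   all four qwords of the 256-bit destination.  */
static vec256 vpbroadcastq_ymm(vec256 src)
{
  vec256 dst = { { src.q[0], src.q[0], src.q[0], src.q[0] } };
  return dst;
}

/* Models the 128-bit vpunpcklqdq: the low qwords of the two sources are
   interleaved into the low 128 bits; upper lanes are zeroed.  */
static vec256 vpunpcklqdq_xmm(vec256 a, vec256 b)
{
  vec256 dst = { { a.q[0], b.q[0], 0, 0 } };
  return dst;
}

/* Returns nonzero when all four lanes match.  */
static int vec_equal(vec256 a, vec256 b)
{
  return memcmp(a.q, b.q, sizeof a.q) == 0;
}

/* With both sources equal, the unpack matches the 128-bit broadcast
   but not the 256-bit one.  */
static void show_difference(void)
{
  vec256 src = { { 0x1122334455667788ULL, 0x99, 0, 0 } };
  vec256 unpacked = vpunpcklqdq_xmm(src, src);

  assert(vec_equal(vpbroadcastq_xmm(src), unpacked));
  assert(!vec_equal(vpbroadcastq_ymm(src), unpacked));
}
```

In this model the two upper lanes of the YMM result are lost by the rewrite,
which is the miscompilation described above.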

PR gas/34171
* config/tc-i386.c (optimize_encoding): Check XMM destination
when optimizing 128-bit VPBROADCASTQ.
* testsuite/gas/i386/optimize-2.d: Updated.
* testsuite/gas/i386/optimize-2.s: Add 256-bit vpbroadcastq.
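
The guard described in the entries above can be sketched as a standalone
predicate.  This is a simplified model, not the code in optimize_encoding:
struct reg, enum reg_width and can_use_vpunpcklqdq are hypothetical names, the
register-number test stands in for the RegRex flag check, and any conditions
that precede the lines touched by the patch are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the assembler's register description.  */
enum reg_width { WIDTH_XMM, WIDTH_YMM, WIDTH_ZMM };

struct reg
{
  enum reg_width width;
  unsigned int num;
};

/* Returns true when "vpbroadcastq src, dst" may be rewritten to
   "vpunpcklqdq src, src, dst" in this simplified model.  */
static bool
can_use_vpunpcklqdq (struct reg src, struct reg dst,
		     bool vex_template, bool vex3_requested)
{
  return vex_template
	 && src.num < 8			/* Models the RegRex flag test.  */
	 && src.width == WIDTH_XMM
	 && dst.width == WIDTH_XMM	/* The check added by the patch.  */
	 && !vex3_requested;
}

/* Mirrors the two test inputs: an XMM destination is rewritten, a YMM
   destination is left alone.  */
static void
check_testsuite_cases (void)
{
  struct reg xmm2 = { WIDTH_XMM, 2 };
  struct reg xmm0 = { WIDTH_XMM, 0 };
  struct reg ymm0 = { WIDTH_YMM, 0 };

  assert (can_use_vpunpcklqdq (xmm2, xmm0, true, false));
  assert (!can_use_vpunpcklqdq (xmm2, ymm0, true, false));
}
```

Without the dst.width term the predicate would accept the YMM case, which
corresponds to the behavior before the fix.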
  

Comments

H.J. Lu May 27, 2026, 11:08 p.m. UTC | #1
On Wed, May 27, 2026 at 8:53 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> commit eb4031cb20aa710834be891f8638e04dbba81edc
> Author: Jan Beulich <jbeulich@suse.com>
> Date:   Tue Jul 4 17:07:26 2023 +0200
>
>     x86: optimize 128-bit VPBROADCASTQ to VPUNPCKLQDQ
>
> was supposed to optimize
>
> vpbroadcastq %xmmN, %xmmM  -> vpunpcklqdq %xmmN, %xmmN, %xmmM (N < 8)
>
> But it didn't check whether the destination operand is an XMM register.
> As a result, it turned:
>
> vpbroadcastq %xmmN, %ymmM
>
> into
>
> vpunpcklqdq %xmmN, %xmmN, %xmmM
>
> Fix this by checking for an XMM destination.
>
> PR gas/34171
> * config/tc-i386.c (optimize_encoding): Check XMM destination
> when optimizing 128-bit VPBROADCASTQ.
> * testsuite/gas/i386/optimize-2.d: Updated.
> * testsuite/gas/i386/optimize-2.s: Add 256-bit vpbroadcastq.
>
>
> --
> H.J.

I am checking it in.
  

Patch

From adc8d3e9f4219230189b1081d01cc1a09f1ff735 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Wed, 27 May 2026 08:39:16 +0800
Subject: [PATCH] x86: Check XMM destination when optimizing 128-bit
 VPBROADCASTQ

commit eb4031cb20aa710834be891f8638e04dbba81edc
Author: Jan Beulich <jbeulich@suse.com>
Date:   Tue Jul 4 17:07:26 2023 +0200

    x86: optimize 128-bit VPBROADCASTQ to VPUNPCKLQDQ

was supposed to optimize

vpbroadcastq %xmmN, %xmmM  -> vpunpcklqdq %xmmN, %xmmN, %xmmM (N < 8)

But it didn't check whether the destination operand is an XMM register.
As a result, it turned:

vpbroadcastq %xmmN, %ymmM

into

vpunpcklqdq %xmmN, %xmmN, %xmmM

Fix this by checking for an XMM destination.

	PR gas/34171
	* config/tc-i386.c (optimize_encoding): Check XMM destination
	when optimizing 128-bit VPBROADCASTQ.
	* testsuite/gas/i386/optimize-2.d: Updated.
	* testsuite/gas/i386/optimize-2.s: Add 256-bit vpbroadcastq.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
---
 gas/config/tc-i386.c                | 1 +
 gas/testsuite/gas/i386/optimize-2.d | 1 +
 gas/testsuite/gas/i386/optimize-2.s | 1 +
 3 files changed, 3 insertions(+)

diff --git a/gas/config/tc-i386.c b/gas/config/tc-i386.c
index 988b024f0b8..365c2ee95f5 100644
--- a/gas/config/tc-i386.c
+++ b/gas/config/tc-i386.c
@@ -5802,6 +5802,7 @@  optimize_encoding (void)
 	   && i.tm.opcode_modifier.vex
 	   && !(i.op[0].regs->reg_flags & RegRex)
 	   && i.op[0].regs->reg_type.bitfield.xmmword
+	   && i.op[1].regs->reg_type.bitfield.xmmword
 	   && pp.encoding != encoding_vex3)
     {
       /* Optimize: -Os:
diff --git a/gas/testsuite/gas/i386/optimize-2.d b/gas/testsuite/gas/i386/optimize-2.d
index 2738b84b80d..3c90cc9b178 100644
--- a/gas/testsuite/gas/i386/optimize-2.d
+++ b/gas/testsuite/gas/i386/optimize-2.d
@@ -198,4 +198,5 @@  Disassembly of section .text:
  +[a-f0-9]+:	c5 .*	vpaddq %xmm2,%xmm2,%xmm3
  +[a-f0-9]+:	62 .*	vpaddq %zmm2,%zmm2,%zmm3
  +[a-f0-9]+:	c5 .*	vpunpcklqdq %xmm2,%xmm2,%xmm0
+ +[a-f0-9]+:	c4 .*	vpbroadcastq %xmm2,%ymm0
 #pass
diff --git a/gas/testsuite/gas/i386/optimize-2.s b/gas/testsuite/gas/i386/optimize-2.s
index b2b1cc112df..80a46eab485 100644
--- a/gas/testsuite/gas/i386/optimize-2.s
+++ b/gas/testsuite/gas/i386/optimize-2.s
@@ -233,3 +233,4 @@  _start:
 	vpsllq	$1, %zmm2, %zmm3
 
 	vpbroadcastq	%xmm2, %xmm0
+	vpbroadcastq	%xmm2, %ymm0
-- 
2.54.0