[committed] i386: Promote {QI, HI}mode x86_mov<mode>cc_0_m1_neg to SImode

Message ID CAFULd4ZX55pkdO3h8Cckq877q7Cy3TMzU0FE6Xwf_HuRRcYMMQ@mail.gmail.com

Checks

Context | Check | Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm | success | Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm | warning | Patch is already merged
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | warning | Patch is already merged

Commit Message

Uros Bizjak July 8, 2024, 8:41 p.m. UTC
  Promote the HImode x86_mov<mode>cc_0_m1_neg insn to SImode to avoid
redundant operand-size prefixes. Also promote the QImode insn when
TARGET_PROMOTE_QImode is set. This is similar to the
promotable_binary_operator splitter, where we promote the result to
SImode.

Also correct the insn condition of the splitters that promote NEG and
NOT instructions to SImode. The QImode and SImode instructions are
always the same size, so there is no need for the
optimize_insn_for_size_p bypass.

    gcc/ChangeLog:

    * config/i386/i386.md (x86_mov<mode>cc_0_m1_neg splitter to SImode):
    New splitter.
    (NEG and NOT splitter to SImode): Remove optimize_insn_for_size_p
    predicate from insn condition.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
  

Patch

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index b24c4fe5875..214cb2e239a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -26576,9 +26576,7 @@  (define_split
    (clobber (reg:CC FLAGS_REG))]
   "! TARGET_PARTIAL_REG_STALL && reload_completed
    && (GET_MODE (operands[0]) == HImode
-       || (GET_MODE (operands[0]) == QImode
-	   && (TARGET_PROMOTE_QImode
-	       || optimize_insn_for_size_p ())))"
+       || (GET_MODE (operands[0]) == QImode && TARGET_PROMOTE_QImode))"
   [(parallel [(set (match_dup 0)
 		   (neg:SI (match_dup 1)))
 	      (clobber (reg:CC FLAGS_REG))])]
@@ -26593,15 +26591,30 @@  (define_split
 	(not (match_operand 1 "general_reg_operand")))]
   "! TARGET_PARTIAL_REG_STALL && reload_completed
    && (GET_MODE (operands[0]) == HImode
-       || (GET_MODE (operands[0]) == QImode
-	   && (TARGET_PROMOTE_QImode
-	       || optimize_insn_for_size_p ())))"
+       || (GET_MODE (operands[0]) == QImode && TARGET_PROMOTE_QImode))"
   [(set (match_dup 0)
 	(not:SI (match_dup 1)))]
 {
   operands[0] = gen_lowpart (SImode, operands[0]);
   operands[1] = gen_lowpart (SImode, operands[1]);
 })
+
+(define_split
+  [(set (match_operand 0 "general_reg_operand")
+	(neg (match_operator 1 "ix86_carry_flag_operator"
+	      [(reg FLAGS_REG) (const_int 0)])))
+   (clobber (reg:CC FLAGS_REG))]
+  "! TARGET_PARTIAL_REG_STALL && reload_completed
+   && (GET_MODE (operands[0]) == HImode
+       || (GET_MODE (operands[0]) == QImode && TARGET_PROMOTE_QImode))"
+  [(parallel [(set (match_dup 0)
+		   (neg:SI (match_dup 1)))
+	      (clobber (reg:CC FLAGS_REG))])]
+{
+  operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = shallow_copy_rtx (operands[1]);
+  PUT_MODE (operands[1], SImode);
+})
 
 ;; RTL Peephole optimizations, run before sched2.  These primarily look to
 ;; transform a complex memory operation into two memory to register operations.