[v2] bpf: md: fix "*movsi" to generate wN regs [PR124688]

Message ID 20260331221908.4067255-1-vineet.gupta@linux.dev
State New
Series [v2] bpf: md: fix "*movsi" to generate wN regs [PR124688]

Checks

Context                                               Check  Description
linaro-tcwg-bot/tcwg_binutils_build--master-arm       fail   Patch failed to apply
linaro-tcwg-bot/tcwg_binutils_build--master-aarch64   fail   Patch failed to apply

Commit Message

Vineet Gupta March 31, 2026, 10:19 p.m. UTC
Changes since v1 [1]:
 - Only update the asm template for the reg-reg form to emit wN/rN (mem forms
   continue to emit rN even for SImode, as discussed at [2]).

[1] https://gcc.gnu.org/pipermail/bpf/2026-March/000077.html
[2] https://gcc.gnu.org/pipermail/bpf/2026-March/000085.html
---

movsi currently only generates DImode rN regs, despite RTL being SImode.

| (insn 14 5 11 (set (reg:SI 1 %r1 [23])
|        (reg:SI 0 %r0))  {*movsi}
|     (expr_list:REG_DEAD (reg:SI 0 %r0)
|        (nil)))

generates

|  r1 = r0

as opposed to

| w1 = w0

This is not just a matter of taste or of getting more wN regs. As illustrated
by the accompanying test, it can be a correctness issue: the mov is needed to
zero out the upper bits, which the rN form won't do.
The fix, as with some prior similar issues, is to add 'w' to the asm template.
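
A minimal sketch of the difference, modelled in plain C rather than BPF. The
helper names mov64/mov32 are illustrative assumptions and not part of the
patch. It relies on the eBPF rule that a 32-bit ALU mov zero-extends its
result into the 64-bit register:

```c
#include <stdint.h>

/* Models "rD = rS": all 64 bits are copied, stale upper bits included.  */
static uint64_t mov64 (uint64_t src)
{
  return src;
}

/* Models "wD = wS": only the low 32 bits are copied and the upper 32 bits
   of the destination are zeroed.  */
static uint64_t mov32 (uint64_t src)
{
  return (uint64_t) (uint32_t) src;
}
```

If the upper half of r0 holds stale bits, mov64 passes them on unchanged,
while mov32 hands the callee a clean 32-bit value.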

Note #1:
-------
Technically the asm templates of all the alternatives of the pattern need
to be fixed, but intentionally are not. Specifically, the mem ones with the "q"
constraint in src/dst are left out because the encodings for the rN and wN
forms below are exactly the same:

|  *(u32 *) (r9+0) = r5
|  *(u32 *) (r9+0) = w5

So even if gcc were to generate them differently, tools like the disassembler
can only print one form, so for consistency we decided to emit only rN.
Side note: Since llvm supports the wN form, a binutils/gas patch has been
           posted to support it even though gcc does not currently generate it.
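
A rough C sketch of why only the reg-reg mov needs the distinction. The
opcode field values follow the eBPF instruction set. The struct and helper
names are made up for illustration, and a little-endian layout (dst in the
low nibble of the register byte) is assumed:

```c
#include <stdint.h>

/* Opcode field values from the eBPF instruction set.  */
enum
{
  BPF_STX   = 0x03,	/* store-from-register class */
  BPF_ALU   = 0x04,	/* 32-bit ALU class */
  BPF_ALU64 = 0x07,	/* 64-bit ALU class */
  BPF_X     = 0x08,	/* source operand is a register */
  BPF_MOV   = 0xb0,
  BPF_MEM   = 0x60,
  BPF_W     = 0x00	/* 32-bit access size */
};

/* One basic 64-bit instruction (hypothetical name).  */
struct insn
{
  uint8_t opcode;
  uint8_t regs;		/* src in the high nibble, dst in the low nibble */
  int16_t off;
  int32_t imm;
};

static struct insn make_insn (uint8_t opcode, unsigned dst, unsigned src,
			      int16_t off)
{
  struct insn i = { opcode, (uint8_t) (((src & 0xf) << 4) | (dst & 0xf)),
		    off, 0 };
  return i;
}

/* "wD = wS": the 32-bit width lives in the instruction class.  */
static struct insn mov32_reg (unsigned dst, unsigned src)
{
  return make_insn (BPF_ALU | BPF_MOV | BPF_X, dst, src, 0);
}

/* "rD = rS": same operation, 64-bit class.  */
static struct insn mov64_reg (unsigned dst, unsigned src)
{
  return make_insn (BPF_ALU64 | BPF_MOV | BPF_X, dst, src, 0);
}

/* "*(u32 *) (rB+off) = rS": the width comes from the size field alone, so
   there is no bit left to tell a wN source from an rN source.  */
static struct insn store_u32 (unsigned base, int16_t off, unsigned src)
{
  return make_insn (BPF_STX | BPF_MEM | BPF_W, base, src, off);
}
```

With these, mov32_reg (1, 0) and mov64_reg (1, 0) differ in the opcode byte
(0xbc vs. 0xbf), while store_u32 (9, 0, 5) has a single encoding (opcode
0x63) however the source register is spelled.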

Note #2:
-------
I was debating fixing the movhi asm template to also emit "and 0xffff",
since that is missing if one changes the new test to use "short"
(vs. "int") for the return and arg types. However, this ventures into ABI
territory and is something core code will handle once the ABI hooks are fixed
for the forthcoming PR/124171. The gist of this patch is to emit the right
"container" reg for a given mode. Any adjustments due to modes are better
handled by core compiler mechanisms.

Testing is clean as well, apart from the usual flaky tests bumping here and there.

	PR target/124688

gcc/ChangeLog:

	* config/bpf/bpf.md (*movsi): Add 'w' to asm template.

gcc/testsuite/ChangeLog:

	* gcc.target/bpf/ret-reuse-arg-1.c: New test.

Signed-off-by: Vineet Gupta <vineet.gupta@linux.dev>
---
 gcc/config/bpf/bpf.md                          |  8 +++++++-
 gcc/testsuite/gcc.target/bpf/ret-reuse-arg-1.c | 14 ++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/bpf/ret-reuse-arg-1.c
  

Patch

diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
index a2bceb8998d7..078752ec6854 100644
--- a/gcc/config/bpf/bpf.md
+++ b/gcc/config/bpf/bpf.md
@@ -378,13 +378,19 @@ 
     operands[1] = force_reg (<MM:MODE>mode, operands[1]);
 }")
 
+;; Note: In the asm templates below, only the r->r/mov alternative has the 'w'
+;; reg specifier, so SImode regs are emitted as wN.  The rest just emit rN regs.
+;; Technically the rest of the mem forms with "q" src/dst could take 'w', except
+;; that the encodings are the same, e.g.
+;;         *(u32 *) (r9+0) = r5 vs. *(u32 *) (r9+0) = w5
+
 (define_insn "*mov<MM:mode>"
   [(set (match_operand:MM 0 "nonimmediate_operand" "=r,  r, r,q,q")
         (match_operand:MM 1 "mov_src_operand"      " q,rIc,BC,r,I"))]
   ""
   "@
    *return bpf_output_move (operands, \"{ldx<mop>\t%0,%1|%0 = *(<smop> *) %1}\");
-   *return bpf_output_move (operands, \"{mov\t%0,%1|%0 = %1}\");
+   *return bpf_output_move (operands, \"{mov\t%0,%1|%w0 = %w1}\");
    *return bpf_output_move (operands, \"{lddw\t%0,%1|%0 = %1 ll}\");
    *return bpf_output_move (operands, \"{stx<mop>\t%0,%1|*(<smop> *) %0 = %1}\");
    *return bpf_output_move (operands, \"{st<mop>\t%0,%1|*(<smop> *) %0 = %1}\");"
diff --git a/gcc/testsuite/gcc.target/bpf/ret-reuse-arg-1.c b/gcc/testsuite/gcc.target/bpf/ret-reuse-arg-1.c
new file mode 100644
index 000000000000..6d0a4f280cd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/ret-reuse-arg-1.c
@@ -0,0 +1,14 @@ 
+/* Return value of first call is arg to second call.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mcpu=v4" } */
+
+int ret_int ();
+void arg_int (int);
+
+void foo () {
+   arg_int(ret_int ());
+}
+
+/* { dg-final { scan-assembler-not {r1 = r0} } } */
+/* { dg-final { scan-assembler-times {w1 = w0} 1 } } */