xtensa: Eliminate unnecessary general-purpose reg-reg moves

Message ID aba72208-c9b6-0d9e-918d-4b7e402b634b@yahoo.co.jp
State New

Commit Message

Takayuki 'January June' Suwa Jan. 17, 2023, 4:54 a.m. UTC
  Register-register move instructions that a human reader can easily
recognize as unnecessary may remain in the compiled result.
For example:

/* example */
double test(double a, double b) {
  return __builtin_copysign(a, b);
}

test:
	add.n	a3, a3, a3
	extui	a5, a5, 31, 1
	ssai	1
				;; Both insns are in the same BB
	src	a7, a5, a3	;; No '0' in the source constraints
				;; No CALL insns in this span
				;; Both A3 and A7 are irrelevant to
				;;   insns in this span
	mov.n	a3, a7		;; An unnecessary reg-reg move
				;; A7 is not used after this
	ret.n

The last two instructions before the return (src and mov.n) could be
replaced by a single instruction:

	src	a3, a5, a3

This symptom often occurs when handling DI/DFmode values with SImode
instructions.  This patch solves the above problem using a peephole2
pattern.
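
The conditions annotated in the listing above can be modelled outside
GCC.  The following is only a minimal sketch on a made-up instruction
record, not the RTL code in the patch: struct toy_insn, the tied flag
(standing in for a '0' constraint) and toy_fold_move are hypothetical
names, and the caller is assumed to have already established that the
move's source register is dead after the move.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy instruction kinds; these are not GCC rtx codes.  */
enum toy_kind { TOY_OP, TOY_MOVE, TOY_CALL, TOY_NOP };

struct toy_insn {
  enum toy_kind kind;
  int dest;        /* register written, -1 if none */
  int src1, src2;  /* registers read, -1 if unused */
  bool tied;       /* models a '0' constraint: dest must match a source */
};

static bool toy_mentions (const struct toy_insn *insn, int reg)
{
  return insn->dest == reg || insn->src1 == reg || insn->src2 == reg;
}

/* Try to fold the move at index MV into the instruction that computed
   its source.  Assumes one basic block and that the move's source is
   dead after the move.  Returns true if the move was eliminated.  */
bool toy_fold_move (struct toy_insn *insns, size_t mv)
{
  assert (insns != NULL);
  if (insns[mv].kind != TOY_MOVE)
    return false;
  int dest = insns[mv].dest, src = insns[mv].src1;
  for (size_t k = mv; k-- > 0; )
    {
      struct toy_insn *insn = &insns[k];
      if (insn->kind == TOY_CALL)
        return false;              /* no calls allowed in the span */
      if (insn->kind == TOY_OP && insn->dest == src)
        {
          if (insn->tied)
            return false;          /* destination is tied to an input */
          insn->dest = dest;       /* write the final register directly */
          insns[mv] = (struct toy_insn) { TOY_NOP, -1, -1, -1, false };
          return true;
        }
      if (toy_mentions (insn, dest) || toy_mentions (insn, src))
        return false;              /* either register is used in between */
    }
  return false;
}
```

Applied to a three-entry model of the example (add.n, extui, src,
followed by mov.n a3, a7), the defining instruction ends up writing a3
and the move becomes a no-op entry.  Note that the defining instruction
is allowed to read the final destination, as src does with a3 here.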

gcc/ChangeLog:

	* config/xtensa/xtensa.md: New peephole2 pattern that eliminates
	general-purpose registers that are used only once, to transfer an
	intermediate value.
---
 gcc/config/xtensa/xtensa.md | 44 +++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)
  

Comments

Max Filippov Jan. 17, 2023, 8:03 a.m. UTC | #1
Hi Suwa-san,

On Mon, Jan 16, 2023 at 8:54 PM Takayuki 'January June' Suwa
<jjsuwa_sys3175@yahoo.co.jp> wrote:
>
> Register-register move instructions that a human reader can easily
> recognize as unnecessary may remain in the compiled result.
> For example:
>
> /* example */
> double test(double a, double b) {
>   return __builtin_copysign(a, b);
> }
>
> test:
>         add.n   a3, a3, a3
>         extui   a5, a5, 31, 1
>         ssai    1
>                                 ;; Both insns are in the same BB
>         src     a7, a5, a3      ;; No '0' in the source constraints
>                                 ;; No CALL insns in this span
>                                 ;; Both A3 and A7 are irrelevant to
>                                 ;;   insns in this span
>         mov.n   a3, a7          ;; An unnecessary reg-reg move
>                                 ;; A7 is not used after this
>         ret.n
>
> The last two instructions before the return (src and mov.n) could be
> replaced by a single instruction:
>
>         src     a3, a5, a3
>
> This symptom often occurs when handling DI/DFmode values with SImode
> instructions.  This patch solves the above problem using a peephole2
> pattern.
>
> gcc/ChangeLog:
>
>         * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
>         general-purpose registers that are used only once, to transfer an
>         intermediate value.
> ---
>  gcc/config/xtensa/xtensa.md | 44 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 44 insertions(+)

This change results in a bunch of ICEs with the following backtrace:

gcc/libgcc/unwind-dw2.c: In function ‘execute_cfa_program_specialized’:
gcc/libgcc/unwind-dw2.c:972:1: internal compiler error: RTL check:
expected elt 2 type 'B', have '0' (rtx barrier) in BLOCK_FOR_INSN, at
rtl.h:1493
 972 | }
     | ^
0x6c3334 rtl_check_failed_type1(rtx_def const*, int, int, char const*,
int, char const*)
       gcc/gcc/rtl.cc:897
0x7bf285 BLOCK_FOR_INSN(rtx_def*)
       gcc/gcc/rtl.h:1493
0x7c448d BLOCK_FOR_INSN(rtx_def*)
       gcc/gcc/rtl.h:1509
0x7c448d gen_peephole2_4(rtx_insn*, rtx_def**)
       gcc/gcc/config/xtensa/xtensa.md:3102
0xe1cce2 peephole2_optimize
       gcc/gcc/recog.cc:4180
0xe1cce2 rest_of_handle_peephole2
       gcc/gcc/recog.cc:4331
0xe1cce2 execute
       gcc/gcc/recog.cc:4368
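
The RTL check message says that BLOCK_FOR_INSN was applied to a
barrier, which has no basic-block field.  As a rough illustration of
that failure mode only, and not a proposed fix for the pattern, here is
a small standalone sketch of a backward walk that tests the node kind
before reading the block field.  All names (toy_node, toy_has_block,
toy_count_block_predecessors) are hypothetical and nothing here is GCC
API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for insn kinds; not GCC's rtx codes.  */
enum toy_code { TOY_INSN, TOY_NOTE, TOY_BARRIER };

struct toy_node {
  enum toy_code code;
  int block;              /* only meaningful when code != TOY_BARRIER */
  struct toy_node *prev;
};

/* True if N exists and carries a block field that may be read.  */
static bool toy_has_block (const struct toy_node *n)
{
  return n != NULL && n->code != TOY_BARRIER;
}

/* Walk backwards from START and count TOY_INSN nodes in the same block.
   The kind is tested before the block field is read, so a barrier ends
   the walk instead of being treated as if it belonged to a block.  */
size_t toy_count_block_predecessors (const struct toy_node *start)
{
  if (!toy_has_block (start))
    return 0;
  size_t count = 0;
  for (const struct toy_node *n = start->prev;
       toy_has_block (n) && n->block == start->block;
       n = n->prev)
    if (n->code == TOY_INSN)
      ++count;
  return count;
}
```

In this model a walk that starts on a barrier, or reaches one, simply
stops; the checked accessor in the backtrace aborts instead.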
  

Patch

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 61bbad8e4..1b53c8c9e 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -3089,3 +3089,47 @@  FALLTHRU:;
   df_insn_rescan (insnR);
   set_insn_deleted (insnP);
 })
+
+(define_peephole2
+  [(set (match_operand 0 "register_operand")
+	(match_operand 1 "register_operand"))]
+  "GET_MODE_SIZE (GET_MODE (operands[0])) == 4
+   && GET_MODE_SIZE (GET_MODE (operands[1])) == 4
+   && GP_REG_P (REGNO (operands[0])) && GP_REG_P (REGNO (operands[1]))
+   && peep2_reg_dead_p (1, operands[1])"
+  [(const_int 0)]
+{
+  basic_block bb = BLOCK_FOR_INSN (curr_insn);
+  rtx dest = operands[0], src = operands[1], pattern, t_dest;
+  rtx_insn *insn;
+  int i;
+  for (insn = PREV_INSN (curr_insn);
+       insn && BLOCK_FOR_INSN (insn) == bb;
+       insn = PREV_INSN (insn))
+    if (CALL_P (insn))
+      break;
+    else if (INSN_P (insn))
+      {
+	if (GET_CODE (pattern = PATTERN (insn)) == SET
+	    && REG_P (t_dest = SET_DEST (pattern))
+	    && GET_MODE_SIZE (GET_MODE (t_dest)) == 4
+	    && REGNO (t_dest) == REGNO (src)
+	    && ! REG_P (SET_SRC (pattern)))
+	{
+	  extract_constrain_insn (insn);
+	  for (i = 1; i < recog_data.n_operands; ++i)
+	    if (strchr (recog_data.constraints[i], '0'))
+	      goto ABORT;
+	  SET_REGNO (t_dest, REGNO (dest));
+	  goto FALLTHRU;
+	}
+	if (reg_overlap_mentioned_p (dest, pattern)
+	    || reg_overlap_mentioned_p (src, pattern)
+	    || set_of (dest, insn)
+	    || set_of (src, insn))
+	  break;
+      }
+ABORT:
+  FAIL;
+FALLTHRU:;
+})