[4/7] riscv: Move riscv_block_move_loop to separate file

Message ID 20221113230521.712693-5-christoph.muellner@vrull.eu
State New
Headers
Series riscv: Improve builtins expansion |

Commit Message

Christoph Müllner Nov. 13, 2022, 11:05 p.m. UTC
  From: Christoph Müllner <christoph.muellner@vrull.eu>

Let's try to not accumulate too much functionality in one single file
as this does not really help maintaining or extending the code.
So in order to add more similar functionality like riscv_block_move_loop
let's move this function to a separate file.

This change does not do any functional changes.
It does modify a single line in the existing code,
that check_GNU_style.py complained about.

gcc/ChangeLog:

	* config.gcc: Add new object riscv-string.o
	* config/riscv/riscv-protos.h (riscv_expand_block_move): Remove
	  duplicated prototype and move to new section for
	  riscv-string.cc.
	* config/riscv/riscv.cc (riscv_block_move_straight): Remove function.
	(riscv_adjust_block_mem): Likewise.
	(riscv_block_move_loop): Likewise.
	(riscv_expand_block_move): Likewise.
	* config/riscv/riscv.md (cpymemsi): Move to new section for
	  riscv-string.cc.
	* config/riscv/t-riscv: Add compile rule for riscv-string.o
	* config/riscv/riscv-string.c: New file.

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
---
 gcc/config.gcc                   |   3 +-
 gcc/config/riscv/riscv-protos.h  |   5 +-
 gcc/config/riscv/riscv-string.cc | 194 +++++++++++++++++++++++++++++++
 gcc/config/riscv/riscv.cc        | 155 ------------------------
 gcc/config/riscv/riscv.md        |  28 ++---
 gcc/config/riscv/t-riscv         |   4 +
 6 files changed, 218 insertions(+), 171 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-string.cc
  

Comments

Jeff Law Nov. 14, 2022, 4:56 p.m. UTC | #1
On 11/13/22 16:05, Christoph Muellner wrote:
> From: Christoph Müllner <christoph.muellner@vrull.eu>
>
> Let's try to not accumulate too much functionality in one single file
> as this does not really help maintaining or extending the code.
> So in order to add more similar functionality like riscv_block_move_loop
> let's move this function to a separate file.
>
> This change does not do any functional changes.
> It does modify a single line in the existing code,
> that check_GNU_style.py complained about.
>
> gcc/ChangeLog:
>
> 	* config.gcc: Add new object riscv-string.o
> 	* config/riscv/riscv-protos.h (riscv_expand_block_move): Remove
> 	  duplicated prototype and move to new section for
> 	  riscv-string.cc.
> 	* config/riscv/riscv.cc (riscv_block_move_straight): Remove function.
> 	(riscv_adjust_block_mem): Likewise.
> 	(riscv_block_move_loop): Likewise.
> 	(riscv_expand_block_move): Likewise.
> 	* config/riscv/riscv.md (cpymemsi): Move to new section for
> 	  riscv-string.cc.
> 	* config/riscv/t-riscv: Add compile rule for riscv-string.o
> 	* config/riscv/riscv-string.c: New file.

OK.  Note I suspect the commit hooks are going to complain about your 
ChangeLog formatting.


jeff
  

Patch

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b5eda046033..fc9e582e713 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -518,7 +518,8 @@  pru-*-*)
 	;;
 riscv*)
 	cpu_type=riscv
-	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o"
+	extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o"
+	extra_objs="${extra_objs} riscv-string.o riscv-v.o"
 	extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
 	d_target_objs="riscv-d.o"
 	extra_headers="riscv_vector.h"
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a718bb62b4..344515dbaf4 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -62,7 +62,6 @@  extern void riscv_expand_conditional_move (rtx, rtx, rtx, rtx_code, rtx, rtx);
 #endif
 extern rtx riscv_legitimize_call_address (rtx);
 extern void riscv_set_return_address (rtx, rtx);
-extern bool riscv_expand_block_move (rtx, rtx, rtx);
 extern rtx riscv_return_addr (int, rtx);
 extern poly_int64 riscv_initial_elimination_offset (int, int);
 extern void riscv_expand_prologue (void);
@@ -70,7 +69,6 @@  extern void riscv_expand_epilogue (int);
 extern bool riscv_epilogue_uses (unsigned int);
 extern bool riscv_can_use_return_insn (void);
 extern rtx riscv_function_value (const_tree, const_tree, enum machine_mode);
-extern bool riscv_expand_block_move (rtx, rtx, rtx);
 extern bool riscv_store_data_bypass_p (rtx_insn *, rtx_insn *);
 extern rtx riscv_gen_gpr_save_insn (struct riscv_frame_info *);
 extern bool riscv_gpr_save_operation_p (rtx);
@@ -96,6 +94,9 @@  extern bool riscv_hard_regno_rename_ok (unsigned, unsigned);
 
 rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt);
 
+/* Routines implemented in riscv-string.c.  */
+extern bool riscv_expand_block_move (rtx, rtx, rtx);
+
 /* Information about one CPU we know about.  */
 struct riscv_cpu_info {
   /* This CPU's canonical name.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
new file mode 100644
index 00000000000..6882f0be269
--- /dev/null
+++ b/gcc/config/riscv/riscv-string.cc
@@ -0,0 +1,194 @@ 
+/* Subroutines used to expand string and block move, clear,
+   compare and other operations for RISC-V.
+   Copyright (C) 2011-2022 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "tm_p.h"
+#include "ira.h"
+#include "print-tree.h"
+#include "varasm.h"
+#include "explow.h"
+#include "expr.h"
+#include "output.h"
+#include "target.h"
+#include "predict.h"
+#include "optabs.h"
+
+/* Emit straight-line code to move LENGTH bytes from SRC to DEST.
+   Assume that the areas do not overlap.  */
+
+static void
+riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length)
+{
+  unsigned HOST_WIDE_INT offset, delta;
+  unsigned HOST_WIDE_INT bits;
+  int i;
+  enum machine_mode mode;
+  rtx *regs;
+
+  bits = MAX (BITS_PER_UNIT,
+	      MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest))));
+
+  mode = mode_for_size (bits, MODE_INT, 0).require ();
+  delta = bits / BITS_PER_UNIT;
+
+  /* Allocate a buffer for the temporary registers.  */
+  regs = XALLOCAVEC (rtx, length / delta);
+
+  /* Load as many BITS-sized chunks as possible.  Use a normal load if
+     the source has enough alignment, otherwise use left/right pairs.  */
+  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+    {
+      regs[i] = gen_reg_rtx (mode);
+      riscv_emit_move (regs[i], adjust_address (src, mode, offset));
+    }
+
+  /* Copy the chunks to the destination.  */
+  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+    riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
+
+  /* Mop up any left-over bytes.  */
+  if (offset < length)
+    {
+      src = adjust_address (src, BLKmode, offset);
+      dest = adjust_address (dest, BLKmode, offset);
+      move_by_pieces (dest, src, length - offset,
+		      MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), RETURN_BEGIN);
+    }
+}
+
+/* Helper function for doing a loop-based block operation on memory
+   reference MEM.  Each iteration of the loop will operate on LENGTH
+   bytes of MEM.
+
+   Create a new base register for use within the loop and point it to
+   the start of MEM.  Create a new memory reference that uses this
+   register.  Store them in *LOOP_REG and *LOOP_MEM respectively.  */
+
+static void
+riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length,
+			rtx *loop_reg, rtx *loop_mem)
+{
+  *loop_reg = copy_addr_to_reg (XEXP (mem, 0));
+
+  /* Although the new mem does not refer to a known location,
+     it does keep up to LENGTH bytes of alignment.  */
+  *loop_mem = change_address (mem, BLKmode, *loop_reg);
+  set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT));
+}
+
+/* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER
+   bytes at a time.  LENGTH must be at least BYTES_PER_ITER.  Assume that
+   the memory regions do not overlap.  */
+
+static void
+riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
+		       unsigned HOST_WIDE_INT bytes_per_iter)
+{
+  rtx label, src_reg, dest_reg, final_src, test;
+  unsigned HOST_WIDE_INT leftover;
+
+  leftover = length % bytes_per_iter;
+  length -= leftover;
+
+  /* Create registers and memory references for use within the loop.  */
+  riscv_adjust_block_mem (src, bytes_per_iter, &src_reg, &src);
+  riscv_adjust_block_mem (dest, bytes_per_iter, &dest_reg, &dest);
+
+  /* Calculate the value that SRC_REG should have after the last iteration
+     of the loop.  */
+  final_src = expand_simple_binop (Pmode, PLUS, src_reg, GEN_INT (length),
+				   0, 0, OPTAB_WIDEN);
+
+  /* Emit the start of the loop.  */
+  label = gen_label_rtx ();
+  emit_label (label);
+
+  /* Emit the loop body.  */
+  riscv_block_move_straight (dest, src, bytes_per_iter);
+
+  /* Move on to the next block.  */
+  riscv_emit_move (src_reg, plus_constant (Pmode, src_reg, bytes_per_iter));
+  riscv_emit_move (dest_reg, plus_constant (Pmode, dest_reg, bytes_per_iter));
+
+  /* Emit the loop condition.  */
+  test = gen_rtx_NE (VOIDmode, src_reg, final_src);
+  emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label));
+
+  /* Mop up any left-over bytes.  */
+  if (leftover)
+    riscv_block_move_straight (dest, src, leftover);
+  else
+    emit_insn (gen_nop ());
+}
+
+/* Expand a cpymemsi instruction, which copies LENGTH bytes from
+   memory reference SRC to memory reference DEST.  */
+
+bool
+riscv_expand_block_move (rtx dest, rtx src, rtx length)
+{
+  if (CONST_INT_P (length))
+    {
+      unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
+      unsigned HOST_WIDE_INT factor, align;
+
+      align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
+      factor = BITS_PER_WORD / align;
+
+      if (optimize_function_for_size_p (cfun)
+	  && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
+	return false;
+
+      if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
+	{
+	  riscv_block_move_straight (dest, src, INTVAL (length));
+	  return true;
+	}
+      else if (optimize && align >= BITS_PER_WORD)
+	{
+	  unsigned min_iter_words
+	    = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
+	  unsigned iter_words = min_iter_words;
+	  unsigned HOST_WIDE_INT bytes = hwi_length;
+	  unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD;
+
+	  /* Lengthen the loop body if it shortens the tail.  */
+	  for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++)
+	    {
+	      unsigned cur_cost = iter_words + words % iter_words;
+	      unsigned new_cost = i + words % i;
+	      if (new_cost <= cur_cost)
+		iter_words = i;
+	    }
+
+	  riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD);
+	  return true;
+	}
+    }
+  return false;
+}
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7357cf51cdf..fab40c6f8dc 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3949,161 +3949,6 @@  riscv_legitimize_call_address (rtx addr)
   return addr;
 }
 
-/* Emit straight-line code to move LENGTH bytes from SRC to DEST.
-   Assume that the areas do not overlap.  */
-
-static void
-riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length)
-{
-  unsigned HOST_WIDE_INT offset, delta;
-  unsigned HOST_WIDE_INT bits;
-  int i;
-  enum machine_mode mode;
-  rtx *regs;
-
-  bits = MAX (BITS_PER_UNIT,
-	      MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest))));
-
-  mode = mode_for_size (bits, MODE_INT, 0).require ();
-  delta = bits / BITS_PER_UNIT;
-
-  /* Allocate a buffer for the temporary registers.  */
-  regs = XALLOCAVEC (rtx, length / delta);
-
-  /* Load as many BITS-sized chunks as possible.  Use a normal load if
-     the source has enough alignment, otherwise use left/right pairs.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
-    {
-      regs[i] = gen_reg_rtx (mode);
-      riscv_emit_move (regs[i], adjust_address (src, mode, offset));
-    }
-
-  /* Copy the chunks to the destination.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
-    riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
-
-  /* Mop up any left-over bytes.  */
-  if (offset < length)
-    {
-      src = adjust_address (src, BLKmode, offset);
-      dest = adjust_address (dest, BLKmode, offset);
-      move_by_pieces (dest, src, length - offset,
-		      MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), RETURN_BEGIN);
-    }
-}
-
-/* Helper function for doing a loop-based block operation on memory
-   reference MEM.  Each iteration of the loop will operate on LENGTH
-   bytes of MEM.
-
-   Create a new base register for use within the loop and point it to
-   the start of MEM.  Create a new memory reference that uses this
-   register.  Store them in *LOOP_REG and *LOOP_MEM respectively.  */
-
-static void
-riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length,
-			rtx *loop_reg, rtx *loop_mem)
-{
-  *loop_reg = copy_addr_to_reg (XEXP (mem, 0));
-
-  /* Although the new mem does not refer to a known location,
-     it does keep up to LENGTH bytes of alignment.  */
-  *loop_mem = change_address (mem, BLKmode, *loop_reg);
-  set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT));
-}
-
-/* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER
-   bytes at a time.  LENGTH must be at least BYTES_PER_ITER.  Assume that
-   the memory regions do not overlap.  */
-
-static void
-riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
-		       unsigned HOST_WIDE_INT bytes_per_iter)
-{
-  rtx label, src_reg, dest_reg, final_src, test;
-  unsigned HOST_WIDE_INT leftover;
-
-  leftover = length % bytes_per_iter;
-  length -= leftover;
-
-  /* Create registers and memory references for use within the loop.  */
-  riscv_adjust_block_mem (src, bytes_per_iter, &src_reg, &src);
-  riscv_adjust_block_mem (dest, bytes_per_iter, &dest_reg, &dest);
-
-  /* Calculate the value that SRC_REG should have after the last iteration
-     of the loop.  */
-  final_src = expand_simple_binop (Pmode, PLUS, src_reg, GEN_INT (length),
-				   0, 0, OPTAB_WIDEN);
-
-  /* Emit the start of the loop.  */
-  label = gen_label_rtx ();
-  emit_label (label);
-
-  /* Emit the loop body.  */
-  riscv_block_move_straight (dest, src, bytes_per_iter);
-
-  /* Move on to the next block.  */
-  riscv_emit_move (src_reg, plus_constant (Pmode, src_reg, bytes_per_iter));
-  riscv_emit_move (dest_reg, plus_constant (Pmode, dest_reg, bytes_per_iter));
-
-  /* Emit the loop condition.  */
-  test = gen_rtx_NE (VOIDmode, src_reg, final_src);
-  emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label));
-
-  /* Mop up any left-over bytes.  */
-  if (leftover)
-    riscv_block_move_straight (dest, src, leftover);
-  else
-    emit_insn(gen_nop ());
-}
-
-/* Expand a cpymemsi instruction, which copies LENGTH bytes from
-   memory reference SRC to memory reference DEST.  */
-
-bool
-riscv_expand_block_move (rtx dest, rtx src, rtx length)
-{
-  if (CONST_INT_P (length))
-    {
-      unsigned HOST_WIDE_INT hwi_length = UINTVAL (length);
-      unsigned HOST_WIDE_INT factor, align;
-
-      align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD);
-      factor = BITS_PER_WORD / align;
-
-      if (optimize_function_for_size_p (cfun)
-	  && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false))
-	return false;
-
-      if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
-	{
-	  riscv_block_move_straight (dest, src, INTVAL (length));
-	  return true;
-	}
-      else if (optimize && align >= BITS_PER_WORD)
-	{
-	  unsigned min_iter_words
-	    = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD;
-	  unsigned iter_words = min_iter_words;
-	  unsigned HOST_WIDE_INT bytes = hwi_length;
-	  unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD;
-
-	  /* Lengthen the loop body if it shortens the tail.  */
-	  for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++)
-	    {
-	      unsigned cur_cost = iter_words + words % iter_words;
-	      unsigned new_cost = i + words % i;
-	      if (new_cost <= cur_cost)
-		iter_words = i;
-	    }
-
-	  riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD);
-	  return true;
-	}
-    }
-  return false;
-}
-
 /* Print symbolic operand OP, which is part of a HIGH or LO_SUM
    in context CONTEXT.  HI_RELOC indicates a high-part reloc.  */
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 532289dd178..43b97f1181e 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1872,19 +1872,6 @@  (define_split
   DONE;
 })
 
-(define_expand "cpymemsi"
-  [(parallel [(set (match_operand:BLK 0 "general_operand")
-		   (match_operand:BLK 1 "general_operand"))
-	      (use (match_operand:SI 2 ""))
-	      (use (match_operand:SI 3 "const_int_operand"))])]
-  ""
-{
-  if (riscv_expand_block_move (operands[0], operands[1], operands[2]))
-    DONE;
-  else
-    FAIL;
-})
-
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
@@ -3005,6 +2992,21 @@  (define_insn "riscv_prefetchi_<mode>"
   "prefetch.i\t%a0"
 )
 
+;; Expansions from riscv-string.c
+
+(define_expand "cpymemsi"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+		   (match_operand:BLK 1 "general_operand"))
+	      (use (match_operand:SI 2 ""))
+	      (use (match_operand:SI 3 "const_int_operand"))])]
+  ""
+{
+  if (riscv_expand_block_move (operands[0], operands[1], operands[2]))
+    DONE;
+  else
+    FAIL;
+})
+
 (include "bitmanip.md")
 (include "sync.md")
 (include "peephole.md")
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv
index 7997db3d898..5cb58a74a53 100644
--- a/gcc/config/riscv/t-riscv
+++ b/gcc/config/riscv/t-riscv
@@ -63,6 +63,10 @@  riscv-selftests.o: $(srcdir)/config/riscv/riscv-selftests.cc
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
+riscv-string.o: $(srcdir)/config/riscv/riscv-string.cc
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
 riscv-v.o: $(srcdir)/config/riscv/riscv-v.cc
 	$(COMPILE) $<
 	$(POSTCOMPILE)