[5/5] cris: Reload using special-regs before general-regs

Message ID 20220202002531.12CFA20439@pchp3.se.axis.com
State Committed
Commit 07a6c52c4cd145d20488c4823669a2d984ba2051
Headers
Series A few CRIS port improvements |

Commit Message

Hans-Peter Nilsson Feb. 2, 2022, 12:25 a.m. UTC
  On code where reload has an effect (i.e. quite rarely, just enough to be
noticeable), this change gets code quality back to the situation prior
to "Remove CRIS v32 ACR artefacts".  We had from IRA a pseudoregister
marked to be reloaded from a union of all allocatable registers (here:
SPEC_GENNONACR_REGS) but where the register-class corresponding to the
constraint for the register-type alternative (here: GENERAL_REGS) was
*not* a subset of that class: SPEC_GENNONACR_REGS (and GENNONACR_REGS)
had a one-register "hole" for the ACR register, a register present in
GENERAL_REGS.

Code in reload.cc:find_reloads adds 4 to the cost of a register-type
alternative that is neither a subset of the preferred register class nor
vice versa and thus reload thinks it can't use.  It would be preferable
to look for a non-empty intersection of the two, and use that
intersection for that alternative, something that can't be expressed
because a register class can't be formed from a random register set.

The effect was here that the GENERAL_REGS to/from memory alternatives
("r") had their cost raised such that the SPECIAL_REGS alternatives
("x") looked better.  This happened to improve code quality just a
little bit compared to GENERAL_REGS being chosen.

Anyway, with the improved CRIS register-class topology, the
subset-checking code no longer has the GENERAL_REGS-demoting effect.
To get the same quality, we have to adjust the port such that
SPECIAL_REGS are specifically preferred when possible and advisible,
i.e. when there's at least two of those registers as for the CPU variant
with multiplication (which happens to be the variant maintained for
performance).

For the move-pattern, the obvious method may seem to simply "curse" the
constraints of some alternatives (by prepending one of the "?!^$"
characters) but that method can't be used, because we want the effect to
be conditional on the CPU variant.  It'd also be a shame to split the
"*movsi_internal<setcc><setnz><setnzvc>" into two CPU-variants (with
different cursing).  Iterators would help, but it still seems unwieldy.
Instead, add copies of the GENERAL_REGS variants (to the SPECIAL_REGS
alternatives) on the "other" side, and make use of the "enabled"
attribute to activate just the desired order of alternatives.

gcc:

	* config/cris/cris.cc (cris_preferred_reload_class): Reject
	"eliminated" registers and small-enough constants unless
	reloaded into a class that is a subset of GENERAL_REGS.
	* config/cris/cris.md (attribute "cpu_variant"): New.
	(attribute "enabled"): Conditionalize on a matching attribute
	cpu_variant, if specified.
	("*movsi_internal<setcc><setnz><setnzvc>"): For moves to and from
	memory, add cpu-variant-enabled variants for "r" alternatives on
	the far side of the "x" alternatives, preferring the "x" ones
	only for variants where MOF is present (in addition to SRP).
---
 gcc/config/cris/cris.cc | 13 ++++++++++++-
 gcc/config/cris/cris.md | 25 ++++++++++++++++++++-----
 2 files changed, 32 insertions(+), 6 deletions(-)
  

Patch

diff --git a/gcc/config/cris/cris.cc b/gcc/config/cris/cris.cc
index 4f977221f459..f0017d630229 100644
--- a/gcc/config/cris/cris.cc
+++ b/gcc/config/cris/cris.cc
@@ -1661,7 +1661,7 @@  cris_reload_address_legitimized (rtx x,
    a bug.  */
 
 static reg_class_t
-cris_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t rclass)
+cris_preferred_reload_class (rtx x, reg_class_t rclass)
 {
   if (rclass != MOF_REGS
       && rclass != MOF_SRP_REGS
@@ -1670,6 +1670,17 @@  cris_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, reg_class_t rclass)
       && rclass != SPECIAL_REGS)
     return GENERAL_REGS;
 
+  /* We can't make use of something that's not a general register when
+     reloading an "eliminated" register (i.e. something that has turned into
+     e.g. sp + const_int).  */
+  if (GET_CODE (x) == PLUS && !reg_class_subset_p (rclass, GENERAL_REGS))
+    return NO_REGS;
+
+  /* Avoid putting constants into a special register, where the instruction is
+     shorter if loaded into a general register.  */
+  if (satisfies_constraint_P (x) && !reg_class_subset_p (rclass, GENERAL_REGS))
+    return NO_REGS;
+
   return rclass;
 }
 
diff --git a/gcc/config/cris/cris.md b/gcc/config/cris/cris.md
index 9d9eb8b7dbbf..dd7094163784 100644
--- a/gcc/config/cris/cris.md
+++ b/gcc/config/cris/cris.md
@@ -153,9 +153,20 @@  (define_delay (eq_attr "slottable" "has_return_slot")
 	(not (match_test "dead_or_set_regno_p (insn, CRIS_SRP_REGNUM)")))
    (nil) (nil)])
 
+;; Enable choosing particular instructions.  The discriminator choice
+;; "v0" stands for "pre-v10", for brevity.
+(define_attr "cpu_variant" "default,v0,v10" (const_string "default"))
+
 (define_attr "enabled" "no,yes"
   (if_then_else
-   (eq_attr "cc_enabled" "normal")
+   (and
+    (eq_attr "cc_enabled" "normal")
+    (ior
+     (eq_attr "cpu_variant" "default")
+     (and (eq_attr "cpu_variant" "v10")
+	  (match_test "TARGET_HAS_MUL_INSNS"))
+     (and (eq_attr "cpu_variant" "v0")
+	  (not (match_test "TARGET_HAS_MUL_INSNS")))))
    (const_string "yes")
    (const_string "no")))
 
@@ -578,9 +589,9 @@  (define_expand "movsi"
 (define_insn "*movsi_internal<setcc><setnz><setnzvc>"
   [(set
     (match_operand:SI 0 "nonimmediate_operand"
-		      "=r,r, r,Q>,r,Q>,g,r,r,g,rQ>,x,  m,x")
+		      "=r,r, r,Q>,r,Q>,g,r,r,g,rQ>,x,  m,x, Q>,r,g")
     (match_operand:SI 1 "general_operand"
-		       "r,Q>,M,M, I,r, M,n,g,r,x,  rQ>,x,gi"))
+		       "r,Q>,M,M, I,r, M,n,g,r,x,  rQ>,x,gi,r, g,r"))
    (clobber (reg:CC CRIS_CC0_REGNUM))]
   ;; Avoid matching insns we know must be reloaded.  Without one
   ;; operand being a (pseudo-)register, reload chooses
@@ -597,6 +608,9 @@  (define_insn "*movsi_internal<setcc><setnz><setnzvc>"
     case 5:
     case 8:
     case 9:
+    case 14:
+    case 15:
+    case 16:
       return "move.d %1,%0";
 
     case 10:
@@ -634,9 +648,10 @@  (define_insn "*movsi_internal<setcc><setnz><setnzvc>"
       gcc_unreachable ();
     }
 }
-  [(set_attr "slottable" "yes,yes,yes,yes,yes,yes,no,no,no,no,yes,yes,no,no")
+  [(set_attr "cpu_variant" "*,*,*,*,*,v0,*,*,v0,v0,*,*,*,*,v10,v10,v10")
+   (set_attr "slottable" "yes,yes,yes,yes,yes,yes,no,no,no,no,yes,yes,no,no,yes,no,no")
    (set_attr "cc<cccc><ccnz><ccnzvc>"
-	     "*,*,none,none,*,none,none,*,*,none,none,none,none,none")])
+	     "*,*,none,none,*,none,none,*,*,none,none,none,none,none,none,*,none")])
 
 ;; FIXME: See movsi.