[v2,2/2] RISC-V: Add instruction fusion (for ventana-vt1)

Message ID 20221113204824.4062042-3-philipp.tomsich@vrull.eu
State Deferred, archived
Series Basic support for the Ventana VT1 w/ instruction fusion

Commit Message

Philipp Tomsich Nov. 13, 2022, 8:48 p.m. UTC
The Ventana VT1 core supports quad-issue and instruction fusion.
This implements TARGET_SCHED_MACRO_FUSION_P and
TARGET_SCHED_MACRO_FUSION_PAIR_P to keep fusible sequences together
and adds idiom matching for the supported fusion cases.

gcc/ChangeLog:

	* config/riscv/riscv.cc (enum riscv_fusion_pairs): Add symbolic
	constants to identify supported fusion patterns.
	(struct riscv_tune_param): Add fusible_ops field.
	(riscv_macro_fusion_p): Implement.
	(riscv_fusion_enabled_p): Implement.
	(riscv_macro_fusion_pair_p): Implement and recoginze fusible
	idioms for Ventana VT1.
	(TARGET_SCHED_MACRO_FUSION_P): Point to riscv_macro_fusion_p.
	(TARGET_SCHED_MACRO_FUSION_PAIR_P): Point to riscv_macro_fusion_pair_p.

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---

Changes in v2:
- Update fusion patterns and catch some missing idioms/fusion pairs.

 gcc/config/riscv/riscv.cc | 219 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 219 insertions(+)
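
For readers unfamiliar with the idioms being fused: the shift pairs
matched by this patch compute zero-extensions.  A minimal C sketch of the
arithmetic follows, assuming 64-bit registers (RV64).  The function names
are illustrative only and do not appear in the patch.

```c
#include <stdint.h>

/* slli rd, rs, 32 ; srli rd, rd, 32
   Zero-extend the low 32 bits (the RISCV_FUSE_ZEXTW case).  */
uint64_t
zext_w (uint64_t x)
{
  return (x << 32) >> 32;
}

/* slli rd, rs, 48 ; srli rd, rd, 48
   Zero-extend the low 16 bits (the RISCV_FUSE_ZEXTH case).  */
uint64_t
zext_h (uint64_t x)
{
  return (x << 48) >> 48;
}

/* slli rd, rs, 32 ; srli rd, rd, shift   (with shift <= 32)
   Zero-extend the low 32 bits and leave the result shifted left by
   32 - shift (the RISCV_FUSE_ZEXTWS case uses shift < 32).  */
uint64_t
zext_w_shifted (uint64_t x, unsigned shift)
{
  return (x << 32) >> shift;
}
```

Shifts on uint64_t are logical in C, so each function mirrors the
corresponding slli/srli pair.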
  

Comments

Jeff Law Nov. 14, 2022, 4:06 p.m. UTC | #1
On 11/13/22 13:48, Philipp Tomsich wrote:
> The Ventana VT1 core supports quad-issue and instruction fusion.
> This implemented TARGET_SCHED_MACRO_FUSION_P to keep fusible sequences
> together and adds idiom matcheing for the supported fusion cases.
>
> gcc/ChangeLog:
>
> 	* config/riscv/riscv.cc (enum riscv_fusion_pairs): Add symbolic
> 	constants to identify supported fusion patterns.
> 	(struct riscv_tune_param): Add fusible_op field.
> 	(riscv_macro_fusion_p): Implement.
> 	(riscv_fusion_enabled_p): Implement.
> 	(riscv_macro_fusion_pair_p): Implement and recoginze fusible
> 	idioms for Ventana VT1.
> 	(TARGET_SCHED_MACRO_FUSION_P): Point to riscv_macro_fusion_p.
> 	(TARGET_SCHED_MACRO_FUSION_PAIR_P): Point to riscv_macro_fusion_pair_p.

You know the fusion rules for VT1 better than I...  I'm happy to largely 
defer to you on this.

I do wonder if going forward hand matching RTL like this is going to be 
an unmaintainable mess and whether or not we would be better served 
using insn attributes to describe instruction fusion.



>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> ---
>
> Changes in v2:
> - Update fusion patterns and catch some missing idioms/fusion pairs.
>
>   gcc/config/riscv/riscv.cc | 219 ++++++++++++++++++++++++++++++++++++++
>   1 file changed, 219 insertions(+)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 31d651f8744..43ba520885c 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
>
> +static bool
> +riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
> +{
> +  rtx prev_set = single_set (prev);
> +  rtx curr_set = single_set (curr);
> +  /* prev and curr are simple SET insns i.e. no flag setting or branching.  */
> +  bool simple_sets_p = prev_set && curr_set && !any_condjump_p (curr);
> +
> +  if (!riscv_macro_fusion_p ())
> +    return false;
> +
> +  if (simple_sets_p && (riscv_fusion_enabled_p (RISCV_FUSE_ZEXTW) ||
> +			riscv_fusion_enabled_p (RISCV_FUSE_ZEXTH)))

Formatting nit.  Bring the && down to a new line and if you still need a 
line break for the "||",  then the "||" should be on a new line as 
well.  Something like this...


if (simple_sets_p
    && (riscv_fusion_enabled_p (RISCV_FUSE_ZEXTW)
        || riscv_fusion_enabled_p (RISCV_FUSE_ZEXTH)))


> +	  && REGNO (XEXP (SET_SRC (curr_set), 0)) == REGNO(SET_DEST (curr_set))

Missing space before the open paren after REGNO on this line.


>
> +	  && (( INTVAL (XEXP (SET_SRC (curr_set), 1)) == 32
> +		&& riscv_fusion_enabled_p(RISCV_FUSE_ZEXTW) )
> +	      || ( INTVAL (XEXP (SET_SRC (curr_set), 1)) < 32
> +		   && riscv_fusion_enabled_p(RISCV_FUSE_ZEXTWS))))

Extraneous spaces after the open parens before INTVALs above.


> +	  && REGNO (XEXP (SET_SRC (curr_set), 0)) == REGNO(SET_DEST (curr_set))

Missing whitespace before open paren on this line.


OK with the nits fixed.


Jeff
  
Jakub Jelinek Nov. 14, 2022, 4:11 p.m. UTC | #2
On Mon, Nov 14, 2022 at 09:06:10AM -0700, Jeff Law via Gcc-patches wrote:
> 
> On 11/13/22 13:48, Philipp Tomsich wrote:
> > The Ventana VT1 core supports quad-issue and instruction fusion.
> > This implemented TARGET_SCHED_MACRO_FUSION_P to keep fusible sequences
> > together and adds idiom matcheing for the supported fusion cases.
> > 
> > gcc/ChangeLog:
> > 
> > 	* config/riscv/riscv.cc (enum riscv_fusion_pairs): Add symbolic
> > 	constants to identify supported fusion patterns.
> > 	(struct riscv_tune_param): Add fusible_op field.
> > 	(riscv_macro_fusion_p): Implement.
> > 	(riscv_fusion_enabled_p): Implement.
> > 	(riscv_macro_fusion_pair_p): Implement and recoginze fusible

s/recoginze/recognize/

> > 	idioms for Ventana VT1.
> > 	(TARGET_SCHED_MACRO_FUSION_P): Point to riscv_macro_fusion_p.
> > 	(TARGET_SCHED_MACRO_FUSION_PAIR_P): Point to riscv_macro_fusion_pair_p.
> 
> You know the fusion rules for VT1 better than I...  I'm happy to largely
> defer to you on this.
> 
> I do wonder if going forward hand matching RTL like this is going to be an
> unmaintainable mess and whether or not we would be better served using insn
> attributes to describe instruction fusion.

	Jakub
  
Philipp Tomsich Nov. 14, 2022, 6:55 p.m. UTC | #3
On Mon, 14 Nov 2022 at 17:06, Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
> On 11/13/22 13:48, Philipp Tomsich wrote:
> > The Ventana VT1 core supports quad-issue and instruction fusion.
> > This implemented TARGET_SCHED_MACRO_FUSION_P to keep fusible sequences
> > together and adds idiom matcheing for the supported fusion cases.
> >
> > gcc/ChangeLog:
> >
> >       * config/riscv/riscv.cc (enum riscv_fusion_pairs): Add symbolic
> >       constants to identify supported fusion patterns.
> >       (struct riscv_tune_param): Add fusible_op field.
> >       (riscv_macro_fusion_p): Implement.
> >       (riscv_fusion_enabled_p): Implement.
> >       (riscv_macro_fusion_pair_p): Implement and recoginze fusible
> >       idioms for Ventana VT1.
> >       (TARGET_SCHED_MACRO_FUSION_P): Point to riscv_macro_fusion_p.
> >       (TARGET_SCHED_MACRO_FUSION_PAIR_P): Point to riscv_macro_fusion_pair_p.
>
> You know the fusion rules for VT1 better than I...  I'm happy to largely
> defer to you on this.
>
> I do wonder if going forward hand matching RTL like this is going to be
> an unmaintainable mess and whether or not we would be better served
> using insn attributes to describe instruction fusion.

I had thought about that, too.
In the end our team decided to stay away from it for the time being:
fusion frequently needs to look at second-level properties and whether
the first instruction's output register is overwritten by the second
instruction.  So we stuck with the same style of idiom matching
that is also used for AArch64 today.

That said, both the RISC-V and the AArch64 implementations of this are
on my list of things to refactor in a quiet hour.
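
To illustrate the "second-level properties" point, here is a minimal
sketch in plain C.  It assumes a hypothetical, simplified instruction
record (struct toy_insn); GCC itself works on RTL, and none of these
names exist in the patch.  The point is that matching the two opcodes is
not enough: the second instruction must also consume and overwrite the
first instruction's result, and the immediates must have specific values.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction record (not GCC's RTL).  */
enum toy_opcode { TOY_SLLI, TOY_SRLI, TOY_ADD, TOY_LD };

struct toy_insn
{
  enum toy_opcode op;
  int rd;	/* destination register number */
  int rs1;	/* first source register number */
  int rs2;	/* second source register number, or -1 if unused */
  long imm;	/* immediate operand, or 0 if unused */
};

/* Return true if PREV and CURR form the slli+srli zero-extend-word
   idiom.  Beyond the opcodes, CURR must read PREV's result, CURR must
   overwrite that same register, and both shift amounts must be 32.  */
bool
toy_zextw_pair_p (const struct toy_insn *prev, const struct toy_insn *curr)
{
  return (prev->op == TOY_SLLI
	  && curr->op == TOY_SRLI
	  && curr->rs1 == prev->rd	/* CURR consumes PREV's output.  */
	  && curr->rd == prev->rd	/* CURR overwrites PREV's output.  */
	  && prev->imm == 32
	  && curr->imm == 32);
}
```

A per-instruction attribute could express the opcode check, but the
register and immediate relationships between the two instructions still
need code of this kind.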

>
>
>
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> >
> > Changes in v2:
> > - Update fusion patterns and catch some missing idioms/fusion pairs.
> >
> >   gcc/config/riscv/riscv.cc | 219 ++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 219 insertions(+)
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 31d651f8744..43ba520885c 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> >
> > +static bool
> > +riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
> > +{
> > +  rtx prev_set = single_set (prev);
> > +  rtx curr_set = single_set (curr);
> > +  /* prev and curr are simple SET insns i.e. no flag setting or branching.  */
> > +  bool simple_sets_p = prev_set && curr_set && !any_condjump_p (curr);
> > +
> > +  if (!riscv_macro_fusion_p ())
> > +    return false;
> > +
> > +  if (simple_sets_p && (riscv_fusion_enabled_p (RISCV_FUSE_ZEXTW) ||
> > +                     riscv_fusion_enabled_p (RISCV_FUSE_ZEXTH)))
>
> Formatting nit.  Bring the && down to a new line and if you still need a
> line break for the "||",  then the "||" should be on a new line as
> well.  Something like this...
>
>
> if (simple_sets_p
>     && (riscv_fusion_enabled_p (RISCV_FUSE_ZEXTW)
>         || riscv_fusion_enabled_p (RISCV_FUSE_ZEXTH)))
>
>
> > +       && REGNO (XEXP (SET_SRC (curr_set), 0)) == REGNO(SET_DEST (curr_set))
>
> Space before open paren on this line.
>
>
> >
> > +       && (( INTVAL (XEXP (SET_SRC (curr_set), 1)) == 32
> > +             && riscv_fusion_enabled_p(RISCV_FUSE_ZEXTW) )
> > +           || ( INTVAL (XEXP (SET_SRC (curr_set), 1)) < 32
> > +                && riscv_fusion_enabled_p(RISCV_FUSE_ZEXTWS))))
>
> Extraneous spaces after the open parens before INTVALs above.
>
>
> > +       && REGNO (XEXP (SET_SRC (curr_set), 0)) == REGNO(SET_DEST (curr_set))
>
> Missing whitespace before open paren on this line.
>
>
> OK with the nits fixed.

Applied to master with these fixes (and a fix for the typo in the
commit message that Jakub spotted).
Thanks!

Philipp.
  
Jeff Law Nov. 14, 2022, 7:10 p.m. UTC | #4
On 11/14/22 11:55, Philipp Tomsich wrote:
> On Mon, 14 Nov 2022 at 17:06, Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>> On 11/13/22 13:48, Philipp Tomsich wrote:
>>> The Ventana VT1 core supports quad-issue and instruction fusion.
>>> This implemented TARGET_SCHED_MACRO_FUSION_P to keep fusible sequences
>>> together and adds idiom matcheing for the supported fusion cases.
>>>
>>> gcc/ChangeLog:
>>>
>>>        * config/riscv/riscv.cc (enum riscv_fusion_pairs): Add symbolic
>>>        constants to identify supported fusion patterns.
>>>        (struct riscv_tune_param): Add fusible_op field.
>>>        (riscv_macro_fusion_p): Implement.
>>>        (riscv_fusion_enabled_p): Implement.
>>>        (riscv_macro_fusion_pair_p): Implement and recoginze fusible
>>>        idioms for Ventana VT1.
>>>        (TARGET_SCHED_MACRO_FUSION_P): Point to riscv_macro_fusion_p.
>>>        (TARGET_SCHED_MACRO_FUSION_PAIR_P): Point to riscv_macro_fusion_pair_p.
>> You know the fusion rules for VT1 better than I...  I'm happy to largely
>> defer to you on this.
>>
>> I do wonder if going forward hand matching RTL like this is going to be
>> an unmaintainable mess and whether or not we would be better served
>> using insn attributes to describe instruction fusion.
> I had thought about that, too.
> In the end our team decided to stay away from it for the time being:
> fusion frequently needs to look at second-level properties and whether
> the first instruction's output register is overwritten by the second
> instruction.  So we kept with the same stereotype of idiom-matching
> that is also used for AArch64 today.

Yea, we're still going to have to grub around to get the operands.  But 
we'd know the overall form of the insn and the types of its operands were 
right.  But it's still going to be clunky either way.  My worry with the 
attribute approach is we'll end up with a horrible mess of attributes 
due to multiple fusion implementations and that we'll need to split 
alternatives so that we can tag them more precisely, etc.

It feels like we almost need a DSL to specify this stuff, much like we 
have for scheduling models.
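
As a rough idea of what a declarative description might look like, here
is a table-driven sketch in plain C.  Everything in it (struct
dsl_fusion_rule, its fields, the rule names) is hypothetical and exists
only for illustration; it is not an existing GCC facility and not how
scheduling models are written.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record (not GCC's RTL).  */
enum dsl_opcode { DSL_SLLI, DSL_SRLI, DSL_LUI, DSL_ADDI };

struct dsl_insn
{
  enum dsl_opcode op;
  int rd;	/* destination register number */
  int rs1;	/* source register number, or -1 if unused */
  long imm;	/* immediate operand */
};

/* One row describes one fusible pair.  */
struct dsl_fusion_rule
{
  const char *name;
  enum dsl_opcode prev_op;
  enum dsl_opcode curr_op;
  bool curr_reads_prev_rd;	/* CURR's source must be PREV's dest.  */
  bool curr_overwrites_prev_rd;	/* CURR's dest must be PREV's dest.  */
};

static const struct dsl_fusion_rule dsl_rules[] = {
  { "slli+srli", DSL_SLLI, DSL_SRLI, true, true },
  { "lui+addi", DSL_LUI, DSL_ADDI, true, true },
};

/* Return the name of the first rule matching PREV and CURR, or NULL
   if no rule applies.  */
const char *
dsl_match_fusion (const struct dsl_insn *prev, const struct dsl_insn *curr)
{
  for (size_t i = 0; i < sizeof dsl_rules / sizeof dsl_rules[0]; i++)
    {
      const struct dsl_fusion_rule *r = &dsl_rules[i];

      if (prev->op != r->prev_op || curr->op != r->curr_op)
	continue;
      if (r->curr_reads_prev_rd && curr->rs1 != prev->rd)
	continue;
      if (r->curr_overwrites_prev_rd && curr->rd != prev->rd)
	continue;
      return r->name;
    }
  return NULL;
}
```

Supporting another core would then mean adding rows rather than another
hand-written block of matching code, although constraints on immediates
and memory operands would need further fields.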


Jeff
  

Patch

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 31d651f8744..43ba520885c 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -215,6 +215,19 @@  struct riscv_integer_op {
    The worst case is LUI, ADDI, SLLI, ADDI, SLLI, ADDI, SLLI, ADDI.  */
 #define RISCV_MAX_INTEGER_OPS 8
 
+enum riscv_fusion_pairs
+{
+  RISCV_FUSE_NOTHING = 0,
+  RISCV_FUSE_ZEXTW = (1 << 0),
+  RISCV_FUSE_ZEXTH = (1 << 1),
+  RISCV_FUSE_ZEXTWS = (1 << 2),
+  RISCV_FUSE_LDINDEXED = (1 << 3),
+  RISCV_FUSE_LUI_ADDI = (1 << 4),
+  RISCV_FUSE_AUIPC_ADDI = (1 << 5),
+  RISCV_FUSE_LUI_LD = (1 << 6),
+  RISCV_FUSE_AUIPC_LD = (1 << 7),
+};
+
 /* Costs of various operations on the different architectures.  */
 
 struct riscv_tune_param
@@ -229,6 +242,7 @@  struct riscv_tune_param
   unsigned short memory_cost;
   unsigned short fmv_cost;
   bool slow_unaligned_access;
+  unsigned int fusible_ops;
 };
 
 /* Information about one micro-arch we know about.  */
@@ -316,6 +330,7 @@  static const struct riscv_tune_param rocket_tune_info = {
   5,						/* memory_cost */
   8,						/* fmv_cost */
   true,						/* slow_unaligned_access */
+  RISCV_FUSE_NOTHING,                           /* fusible_ops */
 };
 
 /* Costs to use when optimizing for Sifive 7 Series.  */
@@ -330,6 +345,7 @@  static const struct riscv_tune_param sifive_7_tune_info = {
   3,						/* memory_cost */
   8,						/* fmv_cost */
   true,						/* slow_unaligned_access */
+  RISCV_FUSE_NOTHING,                           /* fusible_ops */
 };
 
 /* Costs to use when optimizing for T-HEAD c906.  */
@@ -344,6 +360,7 @@  static const struct riscv_tune_param thead_c906_tune_info = {
   5,            /* memory_cost */
   8,		/* fmv_cost */
   false,            /* slow_unaligned_access */
+  RISCV_FUSE_NOTHING,                           /* fusible_ops */
 };
 
 /* Costs to use when optimizing for size.  */
@@ -358,6 +375,7 @@  static const struct riscv_tune_param optimize_size_tune_info = {
   2,						/* memory_cost */
   8,						/* fmv_cost */
   false,					/* slow_unaligned_access */
+  RISCV_FUSE_NOTHING,                           /* fusible_ops */
 };
 
 /* Costs to use when optimizing for Ventana Micro VT1.  */
@@ -372,6 +390,10 @@  static const struct riscv_tune_param ventana_vt1_tune_info = {
   5,						/* memory_cost */
   8,						/* fmv_cost */
   false,					/* slow_unaligned_access */
+  (RISCV_FUSE_ZEXTW | RISCV_FUSE_ZEXTH		/* fusible_ops */
+   | RISCV_FUSE_ZEXTWS | RISCV_FUSE_LDINDEXED
+   | RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI
+   | RISCV_FUSE_LUI_LD | RISCV_FUSE_AUIPC_LD)
 };
 
 static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *);
@@ -5627,6 +5649,199 @@  riscv_issue_rate (void)
   return tune_param->issue_rate;
 }
 
+/* Implement TARGET_SCHED_MACRO_FUSION_P.  Return true if target supports
+   instruction fusion of some sort.  */
+
+static bool
+riscv_macro_fusion_p (void)
+{
+  return tune_param->fusible_ops != RISCV_FUSE_NOTHING;
+}
+
+/* Return true iff the instruction fusion described by OP is enabled.  */
+
+static bool
+riscv_fusion_enabled_p (enum riscv_fusion_pairs op)
+{
+  return tune_param->fusible_ops & op;
+}
+
+/* Implement TARGET_SCHED_MACRO_FUSION_PAIR_P.  Return true if PREV and CURR
+   should be kept together during scheduling.  */
+
+static bool
+riscv_macro_fusion_pair_p (rtx_insn *prev, rtx_insn *curr)
+{
+  rtx prev_set = single_set (prev);
+  rtx curr_set = single_set (curr);
+  /* prev and curr are simple SET insns i.e. no flag setting or branching.  */
+  bool simple_sets_p = prev_set && curr_set && !any_condjump_p (curr);
+
+  if (!riscv_macro_fusion_p ())
+    return false;
+
+  if (simple_sets_p && (riscv_fusion_enabled_p (RISCV_FUSE_ZEXTW)
+			|| riscv_fusion_enabled_p (RISCV_FUSE_ZEXTH)))
+    {
+      /* We are trying to match the following:
+	   prev (slli) == (set (reg:DI rD)
+			       (ashift:DI (reg:DI rS) (const_int 32)))
+	   curr (srli) == (set (reg:DI rD)
+			       (lshiftrt:DI (reg:DI rD) (const_int <shift>)))
+	 with <shift> being either 32 for FUSE_ZEXTW, or
+			 less than 32 for FUSE_ZEXTWS.  */
+
+      if (GET_CODE (SET_SRC (prev_set)) == ASHIFT
+	  && GET_CODE (SET_SRC (curr_set)) == LSHIFTRT
+	  && REG_P (SET_DEST (prev_set))
+	  && REG_P (SET_DEST (curr_set))
+	  && REGNO (SET_DEST (prev_set)) == REGNO (SET_DEST (curr_set))
+	  && REGNO (XEXP (SET_SRC (curr_set), 0)) == REGNO (SET_DEST (curr_set))
+	  && CONST_INT_P (XEXP (SET_SRC (prev_set), 1))
+	  && CONST_INT_P (XEXP (SET_SRC (curr_set), 1))
+	  && INTVAL (XEXP (SET_SRC (prev_set), 1)) == 32
+	  && ((INTVAL (XEXP (SET_SRC (curr_set), 1)) == 32
+	       && riscv_fusion_enabled_p (RISCV_FUSE_ZEXTW))
+	      || (INTVAL (XEXP (SET_SRC (curr_set), 1)) < 32
+		  && riscv_fusion_enabled_p (RISCV_FUSE_ZEXTWS))))
+	return true;
+    }
+
+  if (simple_sets_p && riscv_fusion_enabled_p (RISCV_FUSE_ZEXTH))
+    {
+      /* We are trying to match the following:
+	   prev (slli) == (set (reg:DI rD)
+			       (ashift:DI (reg:DI rS) (const_int 48)))
+	   curr (srli) == (set (reg:DI rD)
+			       (lshiftrt:DI (reg:DI rD) (const_int 48))) */
+
+      if (GET_CODE (SET_SRC (prev_set)) == ASHIFT
+	  && GET_CODE (SET_SRC (curr_set)) == LSHIFTRT
+	  && REG_P (SET_DEST (prev_set))
+	  && REG_P (SET_DEST (curr_set))
+	  && REGNO (SET_DEST (prev_set)) == REGNO (SET_DEST (curr_set))
+	  && REGNO (XEXP (SET_SRC (curr_set), 0)) == REGNO (SET_DEST (curr_set))
+	  && CONST_INT_P (XEXP (SET_SRC (prev_set), 1))
+	  && CONST_INT_P (XEXP (SET_SRC (curr_set), 1))
+	  && INTVAL (XEXP (SET_SRC (prev_set), 1)) == 48
+	  && INTVAL (XEXP (SET_SRC (curr_set), 1)) == 48)
+	return true;
+    }
+
+  if (simple_sets_p && riscv_fusion_enabled_p (RISCV_FUSE_LDINDEXED))
+    {
+      /* We are trying to match the following:
+	   prev (add) == (set (reg:DI rD)
+			      (plus:DI (reg:DI rS1) (reg:DI rS2)))
+	   curr (ld)  == (set (reg:DI rD)
+			      (mem:DI (reg:DI rD))) */
+
+      if (MEM_P (SET_SRC (curr_set))
+	  && REG_P (XEXP (SET_SRC (curr_set), 0))
+	  && REGNO (XEXP (SET_SRC (curr_set), 0)) == REGNO (SET_DEST (prev_set))
+	  && GET_CODE (SET_SRC (prev_set)) == PLUS
+	  && REG_P (XEXP (SET_SRC (prev_set), 0))
+	  && REG_P (XEXP (SET_SRC (prev_set), 1)))
+	return true;
+
+      /* We are trying to match the following:
+	   prev (add) == (set (reg:DI rD)
+			      (plus:DI (reg:DI rS1) (reg:DI rS2)))
+	   curr (lw)  == (set (reg:DI rD) (any_extend:DI (mem:SUBX (reg:DI rD)))) */
+
+      if ((GET_CODE (SET_SRC (curr_set)) == SIGN_EXTEND
+	   || (GET_CODE (SET_SRC (curr_set)) == ZERO_EXTEND))
+	  && MEM_P (XEXP (SET_SRC (curr_set), 0))
+	  && REG_P (XEXP (XEXP (SET_SRC (curr_set), 0), 0))
+	  && REGNO (XEXP (XEXP (SET_SRC (curr_set), 0), 0)) == REGNO (SET_DEST (prev_set))
+	  && GET_CODE (SET_SRC (prev_set)) == PLUS
+	  && REG_P (XEXP (SET_SRC (prev_set), 0))
+	  && REG_P (XEXP (SET_SRC (prev_set), 1)))
+	return true;
+    }
+
+  if (simple_sets_p && riscv_fusion_enabled_p (RISCV_FUSE_LUI_ADDI))
+    {
+      /* We are trying to match the following:
+	   prev (lui)  == (set (reg:DI rD) (const_int UPPER_IMM_20))
+	   curr (addi) == (set (reg:DI rD)
+			       (plus:DI (reg:DI rD) (const_int IMM12))) */
+
+      if ((GET_CODE (SET_SRC (curr_set)) == LO_SUM
+	   || (GET_CODE (SET_SRC (curr_set)) == PLUS
+	       && CONST_INT_P (XEXP (SET_SRC (curr_set), 1))
+	       && SMALL_OPERAND (INTVAL (XEXP (SET_SRC (curr_set), 1)))))
+	  && (GET_CODE (SET_SRC (prev_set)) == HIGH
+	      || (CONST_INT_P (SET_SRC (prev_set))
+		  && LUI_OPERAND (INTVAL (SET_SRC (prev_set))))))
+	return true;
+    }
+
+  if (simple_sets_p && riscv_fusion_enabled_p (RISCV_FUSE_AUIPC_ADDI))
+    {
+      /* We are trying to match the following:
+	   prev (auipc) == (set (reg:DI rD) (unspec:DI [...] UNSPEC_AUIPC))
+	   curr (addi)  == (set (reg:DI rD)
+				(plus:DI (reg:DI rD) (const_int IMM12)))
+	 and
+	   prev (auipc) == (set (reg:DI rD) (unspec:DI [...] UNSPEC_AUIPC))
+	   curr (addi)  == (set (reg:DI rD)
+				(lo_sum:DI (reg:DI rD) (const_int IMM12))) */
+
+      if (GET_CODE (SET_SRC (prev_set)) == UNSPEC
+	  && XINT (SET_SRC (prev_set), 1) == UNSPEC_AUIPC
+	  && (GET_CODE (SET_SRC (curr_set)) == LO_SUM
+	      || (GET_CODE (SET_SRC (curr_set)) == PLUS
+		  && SMALL_OPERAND (INTVAL (XEXP (SET_SRC (curr_set), 1))))))
+
+	return true;
+    }
+
+  if (simple_sets_p && riscv_fusion_enabled_p (RISCV_FUSE_LUI_LD))
+    {
+      /* We are trying to match the following:
+	   prev (lui)  == (set (reg:DI rD) (const_int UPPER_IMM_20))
+	   curr (ld)  == (set (reg:DI rD)
+			      (mem:DI (plus:DI (reg:DI rD) (const_int IMM12)))) */
+
+      if (CONST_INT_P (SET_SRC (prev_set))
+	  && LUI_OPERAND (INTVAL (SET_SRC (prev_set)))
+	  && MEM_P (SET_SRC (curr_set))
+	  && GET_CODE (XEXP (SET_SRC (curr_set), 0)) == PLUS)
+	return true;
+
+      if (GET_CODE (SET_SRC (prev_set)) == HIGH
+	  && MEM_P (SET_SRC (curr_set))
+	  && GET_CODE (XEXP (SET_SRC (curr_set), 0)) == LO_SUM
+	  && REGNO (SET_DEST (prev_set)) == REGNO (XEXP (XEXP (SET_SRC (curr_set), 0), 0)))
+	return true;
+
+      if (GET_CODE (SET_SRC (prev_set)) == HIGH
+	  && (GET_CODE (SET_SRC (curr_set)) == SIGN_EXTEND
+	      || GET_CODE (SET_SRC (curr_set)) == ZERO_EXTEND)
+	  && MEM_P (XEXP (SET_SRC (curr_set), 0))
+	  && (GET_CODE (XEXP (XEXP (SET_SRC (curr_set), 0), 0)) == LO_SUM
+	      && REGNO (SET_DEST (prev_set)) == REGNO (XEXP (XEXP (XEXP (SET_SRC (curr_set), 0), 0), 0))))
+	return true;
+    }
+
+  if (simple_sets_p && riscv_fusion_enabled_p (RISCV_FUSE_AUIPC_LD))
+    {
+      /* We are trying to match the following:
+	   prev (auipc) == (set (reg:DI rD) (unspec:DI [...] UNSPEC_AUIPC))
+	   curr (ld)  == (set (reg:DI rD)
+			      (mem:DI (plus:DI (reg:DI rD) (const_int IMM12)))) */
+
+      if (GET_CODE (SET_SRC (prev_set)) == UNSPEC
+	  && XINT (SET_SRC (prev_set), 1) == UNSPEC_AUIPC
+	  && MEM_P (SET_SRC (curr_set))
+	  && GET_CODE (XEXP (SET_SRC (curr_set), 0)) == PLUS)
+	return true;
+    }
+
+  return false;
+}
+
 /* Auxiliary function to emit RISC-V ELF attribute. */
 static void
 riscv_emit_attribute ()
@@ -6657,6 +6872,10 @@  riscv_shamt_matches_mask_p (int shamt, HOST_WIDE_INT mask)
 
 #undef TARGET_SCHED_ISSUE_RATE
 #define TARGET_SCHED_ISSUE_RATE riscv_issue_rate
+#undef TARGET_SCHED_MACRO_FUSION_P
+#define TARGET_SCHED_MACRO_FUSION_P riscv_macro_fusion_p
+#undef TARGET_SCHED_MACRO_FUSION_PAIR_P
+#define TARGET_SCHED_MACRO_FUSION_PAIR_P riscv_macro_fusion_pair_p
 
 #undef TARGET_FUNCTION_OK_FOR_SIBCALL
 #define TARGET_FUNCTION_OK_FOR_SIBCALL riscv_function_ok_for_sibcall