RISC-V: Support the ins "rol" with immediate operand

Message ID 20221128021428.13824-1-wangfeng@eswincomputing.com
State Deferred, archived
Headers
Series RISC-V: Support the ins "rol" with immediate operand |

Commit Message

wangfeng@eswincomputing.com Nov. 28, 2022, 2:14 a.m. UTC
  From: wangfeng <wangfeng@eswincomputing.com>

There is no Immediate operand of ins "rol" accroding to the B-ext,
so the immediate operand should be loaded into register at first.
But we can convert it to the ins "rori" or "roriw", and then one
immediate load ins can be reduced.

Please refer to the following use cases:
unsigned long foo2(unsigned long rs1)
{
    return (rs1 << 10) | (rs1 >> 54);
}

The complier result is:
li	a1,10
rol	a0,a0,a1

This patch will generate one ins
rori a0,a0,54

gcc/ChangeLog:

        * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/zbb-rol-ror-04.c: New test.
        * gcc.target/riscv/zbb-rol-ror-05.c: New test.
---
 gcc/config/riscv/bitmanip.md                  | 36 +++++++++++++++----
 .../gcc.target/riscv/zbb-rol-ror-04.c         | 24 +++++++++++++
 .../gcc.target/riscv/zbb-rol-ror-05.c         | 15 ++++++++
 3 files changed, 69 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
  

Comments

Jeff Law Nov. 28, 2022, 3:39 p.m. UTC | #1
On 11/27/22 19:14, Feng Wang wrote:
> From: wangfeng <wangfeng@eswincomputing.com>
>
> There is no Immediate operand of ins "rol" accroding to the B-ext,
> so the immediate operand should be loaded into register at first.
> But we can convert it to the ins "rori" or "roriw", and then one
> immediate load ins can be reduced.
>
> Please refer to the following use cases:
> unsigned long foo2(unsigned long rs1)
> {
>      return (rs1 << 10) | (rs1 >> 54);
> }
>
> The complier result is:
> li	a1,10
> rol	a0,a0,a1
>
> This patch will generate one ins
> rori a0,a0,54
>
> gcc/ChangeLog:
>
>          * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern
>
> gcc/testsuite/ChangeLog:
>
>          * gcc.target/riscv/zbb-rol-ror-04.c: New test.
>          * gcc.target/riscv/zbb-rol-ror-05.c: New test.

So this arrived after stage1 close and I'm not aware of an existing BZ 
around this issue, so I'd tend to think this should wait for stage1 to 
re-open in the spring.


 From a technical standpoint, would it be better to hand this in a more 
generic way?   ie, when converting from gimple into RTL, if we want to 
generate a rotate left by immediate and don't have a suitable insn, then 
change it to a rotate right by an adjusted immediate.    This could 
probably be done in optabs.cc::expand_binop.


We might need similar code in combine.cc or simplify-rtx.cc since some 
rotate cases (or exposure of the constant) may not show up until later 
in the RTL pipeline.


Anyway, doing this in a more generic way seems like it's worth 
investigating.


jeff
  
wangfeng@eswincomputing.com Nov. 29, 2022, 1:53 a.m. UTC | #2
on 2022-11-28 23:39  Jeff Law<jeffreyalaw@gmail.com> wrote:
>
>
>On 11/27/22 19:14, Feng Wang wrote:
>> From: wangfeng <wangfeng@eswincomputing.com>
>>
>> There is no Immediate operand of ins "rol" accroding to the B-ext,
>> so the immediate operand should be loaded into register at first.
>> But we can convert it to the ins "rori" or "roriw", and then one
>> immediate load ins can be reduced.
>>
>> Please refer to the following use cases:
>> unsigned long foo2(unsigned long rs1)
>> {
>>      return (rs1 << 10) | (rs1 >> 54);
>> }
>>
>> The complier result is:
>> li	a1,10
>> rol	a0,a0,a1
>>
>> This patch will generate one ins
>> rori a0,a0,54
>>
>> gcc/ChangeLog:
>>
>>          * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern
>>
>> gcc/testsuite/ChangeLog:
>>
>>          * gcc.target/riscv/zbb-rol-ror-04.c: New test.
>>          * gcc.target/riscv/zbb-rol-ror-05.c: New test.
>
>So this arrived after stage1 close and I'm not aware of an existing BZ
>around this issue, so I'd tend to think this should wait for stage1 to
>re-open in the spring.
>
>
> From a technical standpoint, would it be better to hand this in a more
>generic way?   ie, when converting from gimple into RTL, if we want to
>generate a rotate left by immediate and don't have a suitable insn, then
>change it to a rotate right by an adjusted immediate.    This could
>probably be done in optabs.cc::expand_binop.
>
>
>We might need similar code in combine.cc or simplify-rtx.cc since some
>rotate cases (or exposure of the constant) may not show up until later
>in the RTL pipeline.
>
>
>Anyway, doing this in a more generic way seems like it's worth
>investigating.
>
>
>jeff
> 
Hi jeff,

Thanks for your reply. In the currently it will judge the rotate shift number when converting from
gimple into RTL, if the shift number bigger than mode_size/2, then reverse the rotate direction. 
I think the purpose of this process is to handle rotate shift quickly. I will think about your advice
and try to modify it in the expand pass.

Wang Feng
Best regards
  
Jeff Law Nov. 29, 2022, 4:52 a.m. UTC | #3
On 11/28/22 18:53, Feng Wang wrote:
> on 2022-11-28 23:39  Jeff Law<jeffreyalaw@gmail.com> wrote:
>>
>>
>> On 11/27/22 19:14, Feng Wang wrote:
>>> From: wangfeng <wangfeng@eswincomputing.com>
>>>
>>> There is no Immediate operand of ins "rol" accroding to the B-ext,
>>> so the immediate operand should be loaded into register at first.
>>> But we can convert it to the ins "rori" or "roriw", and then one
>>> immediate load ins can be reduced.
>>>
>>> Please refer to the following use cases:
>>> unsigned long foo2(unsigned long rs1)
>>> {
>>>        return (rs1 << 10) | (rs1 >> 54);
>>> }
>>>
>>> The complier result is:
>>> li	a1,10
>>> rol	a0,a0,a1
>>>
>>> This patch will generate one ins
>>> rori a0,a0,54
>>>
>>> gcc/ChangeLog:
>>>
>>>            * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>            * gcc.target/riscv/zbb-rol-ror-04.c: New test.
>>>            * gcc.target/riscv/zbb-rol-ror-05.c: New test.
>>
>> So this arrived after stage1 close and I'm not aware of an existing BZ
>> around this issue, so I'd tend to think this should wait for stage1 to
>> re-open in the spring.
>>
>>
>>  From a technical standpoint, would it be better to hand this in a more
>> generic way?   ie, when converting from gimple into RTL, if we want to
>> generate a rotate left by immediate and don't have a suitable insn, then
>> change it to a rotate right by an adjusted immediate.    This could
>> probably be done in optabs.cc::expand_binop.
>>
>>
>> We might need similar code in combine.cc or simplify-rtx.cc since some
>> rotate cases (or exposure of the constant) may not show up until later
>> in the RTL pipeline.
>>
>>
>> Anyway, doing this in a more generic way seems like it's worth
>> investigating.
>>
>>
>> jeff
>>
> Hi jeff,
> 
> Thanks for your reply. In the currently it will judge the rotate shift number when converting from
> gimple into RTL, if the shift number bigger than mode_size/2, then reverse the rotate direction.
> I think the purpose of this process is to handle rotate shift quickly. I will think about your advice
> and try to modify it in the expand pass.
Yea, in the past there were targets where the cost of a shift or rotate 
was proportional to the number of bits shifted.  So for a rotate in 
particular it was advantageous to reverse the rotation if the count was 
more than mode_size/2.

I suspect such processors are a lot less common now than in the past. 
But we can probably utilize some of that code to suit our needs.

Jeff
  

Patch

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index d17133d58c1..cddfa2a4b19 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -300,25 +300,49 @@ 
 (define_insn "rotlsi3"
   [(set (match_operand:SI 0 "register_operand" "=r")
 	(rotate:SI (match_operand:SI 1 "register_operand" "r")
-		   (match_operand:QI 2 "register_operand" "r")))]
+		   (match_operand:QI 2 "arith_operand" "rI")))]
   "TARGET_ZBB"
-  "rol%~\t%0,%1,%2"
+  {
+    if (immediate_operand(operands[2], QImode))
+    {
+      operands[2] = GEN_INT(GET_MODE_BITSIZE (SImode) - INTVAL(operands[2]));
+	    return "rori\t%0,%1,%2";
+    }
+    else
+      return "rol\t%0,%1,%2";
+  }
   [(set_attr "type" "bitmanip")])
 
 (define_insn "rotldi3"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(rotate:DI (match_operand:DI 1 "register_operand" "r")
-		   (match_operand:QI 2 "register_operand" "r")))]
+		   (match_operand:QI 2 "arith_operand" "rI")))]
   "TARGET_64BIT && TARGET_ZBB"
-  "rol\t%0,%1,%2"
+  {
+    if (immediate_operand(operands[2], QImode))
+    {
+      operands[2] = GEN_INT(GET_MODE_BITSIZE (DImode) - INTVAL(operands[2]));
+	    return "rori\t%0,%1,%2";
+    }
+    else
+      return "rol\t%0,%1,%2";
+  }
   [(set_attr "type" "bitmanip")])
 
 (define_insn "rotlsi3_sext"
   [(set (match_operand:DI 0 "register_operand" "=r")
 	(sign_extend:DI (rotate:SI (match_operand:SI 1 "register_operand" "r")
-				   (match_operand:QI 2 "register_operand" "r"))))]
+				   (match_operand:QI 2 "arith_operand" "rI"))))]
   "TARGET_64BIT && TARGET_ZBB"
-  "rolw\t%0,%1,%2"
+  {
+    if (immediate_operand(operands[2], QImode))
+    {
+      operands[2] = GEN_INT(GET_MODE_BITSIZE (SImode) - INTVAL(operands[2]));
+	    return "roriw\t%0,%1,%2";
+    }
+    else
+      return "rolw\t%0,%1,%2";
+  }
   [(set_attr "type" "bitmanip")])
 
 ;; orc.b (or-combine) is added as an unspec for the benefit of the support
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
new file mode 100644
index 00000000000..23883cc3a5e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -O2" } */
+/* { dg-skip-if "" { *-*-* } { "-g" } } */
+
+unsigned long foo1 (unsigned long rs1)
+{ return (rs1 >> (34)) | (rs1 << 30); }
+
+unsigned long foo2(unsigned long rs1)
+{
+    return (rs1 << 10) | (rs1 >> 54);
+}
+
+unsigned int foo3(unsigned int rs1)
+{
+    return (rs1 >> 20) | (rs1 << 12);
+}
+
+unsigned int foo4(unsigned int rs1)
+{
+    return (rs1 << 10) | (rs1 >> 22);
+}
+
+/* { dg-final { scan-assembler-times "rori\t" 2 } } */
+/* { dg-final { scan-assembler-times "roriw" 2 } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
new file mode 100644
index 00000000000..3e300a30b9b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gc_zbb -mabi=ilp32" } */
+/* { dg-skip-if "" { rv64-*-* } { "-g" } } */
+
+unsigned int foo1(unsigned int rs1)
+{
+    return (rs1 >> 20) | (rs1 << 12);
+}
+
+unsigned int foo2(unsigned int rs1)
+{
+    return (rs1 << 10) | (rs1 >> 22);
+}
+
+/* { dg-final { scan-assembler-times "rori" 2 } } */
\ No newline at end of file