bfin: Popcount-related improvements to machine description.
Commit Message
Blackfin processors support a ONES instruction that implements a
32-bit popcount returning a 16-bit result. This instruction was
previously described by GCC's bfin backend using an UNSPEC, but
this patch uses a POPCOUNT:SI rtx to capture the semantics, allowing
it to be evaluated at compile-time. I've decided to keep the instruction
name the same (avoiding any changes to the __builtin_bfin_ones
machinery), but have provided popcountsi2 and popcounthi2 expanders
so that the middle-end can use this instruction to implement
__builtin_popcount (and __builtin_parity).
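
For readers unfamiliar with the instruction, the semantics described above can be sketched in portable C. This is only an illustrative model, not code from the patch: the names model_ones, model_popcountsi2 and model_popcounthi2 are hypothetical, and the bit-counting loop merely stands in for the hardware instruction.

```c
#include <stdint.h>

/* Hypothetical portable model of the Blackfin ONES instruction:
   count the set bits of a 32-bit value and return a 16-bit result,
   i.e. (truncate:HI (popcount:SI x)).  */
static uint16_t
model_ones (uint32_t x)
{
  uint16_t count = 0;
  while (x != 0)
    {
      x &= x - 1;	/* Clear the lowest set bit.  */
      count++;
    }
  return count;
}

/* Sketch of the popcountsi2 expander: ONES, then zero-extend the
   16-bit result to 32 bits.  */
static uint32_t
model_popcountsi2 (uint32_t x)
{
  return (uint32_t) model_ones (x);
}

/* Sketch of the popcounthi2 expander: zero-extend the 16-bit input
   to 32 bits, then ONES.  */
static uint16_t
model_popcounthi2 (uint16_t x)
{
  return model_ones ((uint32_t) x);
}
```

Under this model, model_ones (5) is 2, which is the constant the compiler can now fold to once the pattern is expressed as a popcount rather than an UNSPEC.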

The new testcase ones.c

short foo ()
{
  int t = 5;
  short r = __builtin_bfin_ones(t);
  return r;
}

previously generated:

_foo:   nop;
        nop;
        R0 = 5 (X);
        R0.L = ONES R0;
        rts;

with this patch, now generates:

_foo:   nop;
        nop;
        nop;
        R0 = 2 (X);
        rts;

The new testcase popcount.c

int foo(int x)
{
  return __builtin_popcount(x);
}

previously generated:

_foo:   [--SP] = RETS;
        SP += -12;
        call ___popcountsi2;
        SP += 12;
        RETS = [SP++];
        rts;

now generates:

_foo:   nop;
        nop;
        R0.L = ONES R0;
        R0 = R0.L (Z);
        rts;

And the new testcase parity.c

int foo(int x)
{
  return __builtin_parity(x);
}

previously generated:

_foo:   [--SP] = RETS;
        SP += -12;
        call ___paritysi2;
        SP += 12;
        RETS = [SP++];
        rts;

now generates:

_foo:   nop;
        R1 = 1 (X);
        R0.L = ONES R0;
        R0 = R1 & R0;
        rts;

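
The parity sequence above relies on the identity that the parity of a word is the low bit of its population count. A minimal C sketch of that identity follows; the function names are hypothetical, and the counting loop is just a stand-in for ONES, not the compiler's implementation.

```c
#include <stdint.h>

/* Bit count via a clear-lowest-set-bit loop; stands in for ONES.  */
static unsigned
count_bits (uint32_t x)
{
  unsigned count = 0;
  while (x != 0)
    {
      x &= x - 1;	/* Clear the lowest set bit.  */
      count++;
    }
  return count;
}

/* Parity the way the emitted sequence computes it: the bit count
   ANDed with 1.  */
static unsigned
parity_from_count (uint32_t x)
{
  return count_bits (x) & 1u;
}

/* Reference parity computed independently by XOR folding, used only
   to cross-check the identity.  */
static unsigned
parity_by_folding (uint32_t x)
{
  x ^= x >> 16;
  x ^= x >> 8;
  x ^= x >> 4;
  x ^= x >> 2;
  x ^= x >> 1;
  return x & 1u;
}
```

Both functions should agree for every 32-bit input, which is why the middle-end can implement __builtin_parity with a single ONES plus an AND once a popcount expander is available.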
This patch has been tested on a cross-compiler to bfin-elf hosted
on x86_64-pc-linux-gnu, but without a toolchain, and shows no
regressions in the compile-only parts of the testsuite.
Ok for mainline?
2021-10-17 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/bfin/bfin.md (define_constants): Remove UNSPEC_ONES.
(define_insn "ones"): Replace UNSPEC_ONES with a truncate of
a popcount, allowing compile-time evaluation/simplification.
(popcountsi2, popcounthi2): New expanders using a "ones" insn.
gcc/testsuite/ChangeLog
* gcc.target/bfin/ones.c: New test case.
* gcc.target/bfin/parity.c: New test case.
* gcc.target/bfin/popcount.c: New test case.
Thanks in advance,
Roger
--
/* { dg-do compile } */
/* { dg-options "-O2" } */
short foo ()
{
  int t = 5;
  short r = __builtin_bfin_ones(t);
  return r;
}
/* { dg-final { scan-assembler-not "ONES" } } */

/* { dg-do compile } */
/* { dg-options "-O2" } */
int foo(int x)
{
  return __builtin_parity(x);
}
/* { dg-final { scan-assembler "ONES" } } */

/* { dg-do compile } */
/* { dg-options "-O2" } */
int foo(int x)
{
  return __builtin_popcount(x);
}
/* { dg-final { scan-assembler "ONES" } } */
Comments
On 10/17/2021 7:08 AM, Roger Sayle wrote:
> [...]
OK
jeff
@@ -138,8 +138,7 @@
    ;; Distinguish a 32-bit version of an insn from a 16-bit version.
    (UNSPEC_32BIT 11)
    (UNSPEC_NOP 12)
-   (UNSPEC_ONES 13)
-   (UNSPEC_ATOMIC 14)])
+   (UNSPEC_ATOMIC 13)])

 (define_constants
   [(UNSPEC_VOLATILE_CSYNC 1)
@@ -1398,12 +1397,32 @@

 (define_insn "ones"
   [(set (match_operand:HI 0 "register_operand" "=d")
-        (unspec:HI [(match_operand:SI 1 "register_operand" "d")]
-                UNSPEC_ONES))]
+        (truncate:HI
+         (popcount:SI (match_operand:SI 1 "register_operand" "d"))))]
   ""
   "%h0 = ONES %1;"
   [(set_attr "type" "alu0")])

+(define_expand "popcountsi2"
+  [(set (match_dup 2)
+        (truncate:HI (popcount:SI (match_operand:SI 1 "register_operand" ""))))
+   (set (match_operand:SI 0 "register_operand")
+        (zero_extend:SI (match_dup 2)))]
+  ""
+{
+  operands[2] = gen_reg_rtx (HImode);
+})
+
+(define_expand "popcounthi2"
+  [(set (match_dup 2)
+        (zero_extend:SI (match_operand:HI 1 "register_operand" "")))
+   (set (match_operand:HI 0 "register_operand")
+        (truncate:HI (popcount:SI (match_dup 2))))]
+  ""
+{
+  operands[2] = gen_reg_rtx (SImode);
+})
+
 (define_insn "smaxsi3"
   [(set (match_operand:SI 0 "register_operand" "=d")
         (smax:SI (match_operand:SI 1 "register_operand" "d")