[v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]
Checks
Context |
Check |
Description |
rivoscibot/toolchain-ci-rivos-lint |
success
|
Lint passed
|
rivoscibot/toolchain-ci-rivos-apply-patch |
success
|
Patch applied
|
rivoscibot/toolchain-ci-rivos-build--newlib-rv64gcv-lp64d-multilib |
success
|
Build passed
|
rivoscibot/toolchain-ci-rivos-build--linux-rv64gcv-lp64d-multilib |
success
|
Build passed
|
rivoscibot/toolchain-ci-rivos-build--linux-rv64gc_zba_zbb_zbc_zbs-lp64d-multilib |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-aarch64-bootstrap |
success
|
Build passed
|
rivoscibot/toolchain-ci-rivos-test |
fail
|
Testing failed
|
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_simplebootstrap_build--master-arm-bootstrap |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gcc_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gcc_check--master-arm |
success
|
Test passed
|
Commit Message
From: Pan Li <pan2.li@intel.com>
This patch would like to fix one bug when expanding const vector for the
interleave case. For example, we have:
base1 = 151
step = 121
For vec_series, we will generate vector in format of v[i] = base + i * step.
Then the vec_series will have below result for HImode, and we can find
that the result overflow to the highest 8 bits of HImode.
v1.b = {151, 255, 7, 0, 119, 0, 231, 0, 87, 1, 199, 1, 55, 2, 167, 2}
Aka we expect v1.b should be:
v1.b = {151, 0, 7, 0, 119, 0, 231, 0, 87, 0, 199, 0, 55, 0, 167, 0}
After that it will perform the IOR with v2 for the base2(aka another series).
v2.b = {0, 17, 0, 33, 0, 49, 0, 65, 0, 81, 0, 97, 0, 113, 0, 129}
Unfortunately, the base1 + i * step1 in HImode may overflow to the high
8 bits, and the high 8 bits will pollute the v2 and result in incorrect
value in const_vector.
This patch would like to perform the overflow to smode check before IOR
the base2 series, and perform the clean highest bit if the const_vector
overflow to smode occurs. If no overflow, there will do nothing here.
The below test suites are passed for this patch.
* The rv64gcv fully regression test.
PR target/118931
gcc/ChangeLog:
* config/riscv/riscv-v.cc (expand_const_vector): Add overflow to
smode check and clean up highest bits if overflow.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr118931-run-1.c: New test.
Signed-off-by: Pan Li <pan2.li@intel.com>
---
gcc/config/riscv/riscv-v.cc | 31 ++++++++++++++++++-
.../riscv/rvv/base/pr118931-run-1.c | 19 ++++++++++++
2 files changed, 49 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr118931-run-1.c
Comments
> This patch would like to perform the overflow to smode check before IOR
> the base2 series, and perform the clean highest bit if the const_vector
> overflow to smode occurs. If no overflow, there will do nothing here.
I agree with the general idea but I'm a bit wary fiddling with the coefficients
directly. I think for a fixed-size, non VLA vector it should be sufficient to
check whether the last step overflows.
For VLA we actually know the largest possible runtime vector length and could
use it the check? But maybe it's probably easier to bail with VLA anyway.
In addition: if we're already checking the step anyway we could get rid of the
negative step restriction as well.
Thanks Robin for comments.
> I agree with the general idea but I'm a bit wary fiddling with the coefficients
> directly. I think for a fixed-size, non VLA vector it should be sufficient to
> check whether the last step overflows.
Initial idea I would like to take care of VLS and VLA in the same chunk of code, given
the expand_const_vector is sort of complicated currently(I don't want to make it worse by add more if VLA, else if VLS ... etc).
I will try to leverage poly_int mult, shift, and compare if possible, instead of touch the coefficients directly in v2.
> For VLA we actually know the largest possible runtime vector length and could
> use it the check? But maybe it's probably easier to bail with VLA anyway.
> In addition: if we're already checking the step anyway we could get rid of the
> negative step restriction as well.
I don't explore more cases here consider we are in stage 4. I think the expand_const_vector need some
refactor up to a point.
Pan
-----Original Message-----
From: Robin Dapp <rdapp.gcc@gmail.com>
Sent: Monday, February 24, 2025 4:29 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; Robin Dapp <rdapp.gcc@gmail.com>
Subject: Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]
> This patch would like to perform the overflow to smode check before IOR
> the base2 series, and perform the clean highest bit if the const_vector
> overflow to smode occurs. If no overflow, there will do nothing here.
I agree with the general idea but I'm a bit wary fiddling with the coefficients
directly. I think for a fixed-size, non VLA vector it should be sufficient to
check whether the last step overflows.
For VLA we actually know the largest possible runtime vector length and could
use it the check? But maybe it's probably easier to bail with VLA anyway.
In addition: if we're already checking the step anyway we could get rid of the
negative step restriction as well.
--
Regards
Robin
> I don't explore more cases here consider we are in stage 4. I think the
> expand_const_vector need some refactor up to a point.
I added the negative step check just some weeks ago and I'd see it as
simplification to remove the restriction again if you're touching the actual
step anyway. So I wouldn't worry about stage 4 in that regard.
> I added the negative step check just some weeks ago and I'd see it as
> simplification to remove the restriction again if you're touching the actual
> step anyway. So I wouldn't worry about stage 4 in that regard.
Oh, I see. I'll have a try after this bug fix.
Pan
-----Original Message-----
From: Robin Dapp <rdapp.gcc@gmail.com>
Sent: Monday, February 24, 2025 7:44 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; jeffreyalaw@gmail.com; Robin Dapp <rdapp.gcc@gmail.com>
Subject: Re: [PATCH v1] RISC-V: Fix bug for expand_const_vector interleave [PR118931]
> I don't explore more cases here consider we are in stage 4. I think the
> expand_const_vector need some refactor up to a point.
I added the negative step check just some weeks ago and I'd see it as
simplification to remove the restriction again if you're touching the actual
step anyway. So I wouldn't worry about stage 4 in that regard.
--
Regards
Robin
@@ -1504,9 +1504,38 @@ expand_const_vector (rtx target, rtx src)
&& get_vector_mode (new_smode, new_nunits).exists (&new_mode))
{
rtx tmp1 = gen_reg_rtx (new_mode);
- base1 = gen_int_mode (rtx_to_poly_int64 (base1), new_smode);
+ poly_int64 base1_poly = rtx_to_poly_int64 (base1);
+ base1 = gen_int_mode (base1_poly, new_smode);
expand_vec_series (tmp1, base1, gen_int_mode (step1, new_smode));
+ bool overflow_smode_p = false;
+
+ for (int i = 0; i < XVECLEN (src, 0) / 2; i++)
+ {
+ for (int k = 0; k < NUM_POLY_INT_COEFFS; k++)
+ if (((uint64_t)(base1_poly.coeffs[k] + i * step1.coeffs[k])
+ >> builder.inner_bits_size ()) != 0)
+ overflow_smode_p = true;
+
+ if (overflow_smode_p)
+ break;
+ }
+
+ if (overflow_smode_p)
+ {
+ /* The vec_series base1 may overflow bits to base2 series. */
+ rtx vec_mask = gen_vec_duplicate (new_mode,
+ CONSTM1_RTX (new_smode));
+ rtx lshift_vec_mask = gen_reg_rtx (new_mode);
+ rtx shift = gen_int_mode (builder.inner_bits_size (), Xmode);
+ rtx lshift_ops[] = {lshift_vec_mask, vec_mask, shift};
+ emit_vlmax_insn (code_for_pred_scalar (LSHIFTRT, new_mode),
+ BINARY_OP, lshift_ops);
+ rtx and_ops[] = {tmp1, tmp1, lshift_vec_mask};
+ emit_vlmax_insn (code_for_pred (AND, new_mode), BINARY_OP,
+ and_ops);
+ }
+
if (rtx_equal_p (base2, const0_rtx) && known_eq (step2, 0))
/* { 1, 0, 2, 0, ... }. */
emit_move_insn (result, gen_lowpart (mode, tmp1));
new file mode 100644
@@ -0,0 +1,19 @@
+/* { dg-do run { target { riscv_v } } } */
+/* { dg-options "-O3 -march=rv64gcv -flto -mrvv-vector-bits=zvl" } */
+
+long long m;
+char f = 151;
+char h = 103;
+unsigned char a = 109;
+
+int main() {
+ for (char l = 0; l < 255 - 241; l += h - 102)
+ a *= f;
+
+ m = a;
+
+ if (m != 29)
+ __builtin_abort ();
+
+ return 0;
+}