ext-dce: Fix SIGN_EXTEND handling and cleanups [PR117360]
Checks
Context                                        | Check   | Description
rivoscibot/toolchain-ci-rivos-lint             | success | Lint passed
rivoscibot/toolchain-ci-rivos-apply-patch      | fail    | Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-arm     | success | Build passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm     | success | Test passed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 | success | Build passed
Commit Message
Hi!
This is mostly a blind attempt to fix the PR + various cleanups.
The PR is about a shift of a HOST_WIDE_INT by 127 invoking UB.
Most of carry_backpropagate works on GET_MODE_INNER of the operand:
mode is assigned
enum machine_mode mode = GET_MODE_INNER (GET_MODE (x));
at the beginning and everything is done in that mode, so for vector
modes (or perhaps even complex modes) we work with the element mode
rather than the vector/complex mode.
But the SIGN_EXTEND handling does that inconsistently: it looks at
the mode of the operand and uses GET_MODE_INNER in the GET_MODE_MASK
check, but doesn't use it in the shift.
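As an illustration (the 128-bit V16QImode operand here is just an assumed
example; I don't know the exact RTL from the PR testcase), the old
SIGN_EXTEND handling did roughly:
  /* Hypothetical operand: (reg:V16QI ...), a 128-bit vector of QImode
     elements.  */
  mode = GET_MODE (XEXP (x, 0));                      /* V16QImode */
  if (mask & ~GET_MODE_MASK (GET_MODE_INNER (mode)))  /* 8-bit element mask */
    /* GET_MODE_BITSIZE (mode) is 128, so this is 1ULL << 127: shifting a
       64-bit value by 127, which is undefined behaviour.  */
    mask |= 1ULL << (GET_MODE_BITSIZE (mode).to_constant () - 1);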
The following patch, apart from the cleanups, fixes it by doing
essentially:
mode = GET_MODE (XEXP (x, 0));
if (mask & ~GET_MODE_MASK (GET_MODE_INNER (mode)))
- mask |= 1ULL << (GET_MODE_BITSIZE (mode).to_constant () - 1);
+ mask |= 1ULL << (GET_MODE_BITSIZE (GET_MODE_INNER (mode)).to_constant () - 1);
i.e. also shifting by the GET_MODE_BITSIZE of the GET_MODE_INNER of the
operand's mode.  We don't need to check that it is at most 64 bits:
at the start of the function we've already verified the result mode
is at most 64 bits, and SIGN_EXTEND by definition extends from a narrower
mode.
The rest of the patch is cleanups.  For HOST_WIDE_INT we have the
HOST_WIDE_INT_{UC,1U} macros; a HWI isn't necessarily unsigned long long,
so using ULL suffixes for it is weird.
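For reference, the macros boil down to something like the following
simplified sketch (the real definitions in gcc/hwint.h pick the suffix
matching whichever host type is 64 bits wide):
  /* Simplified sketch, not the exact gcc/hwint.h definitions.  */
  #define HOST_WIDE_INT_C(X)  X ## L             /* or X ## LL, host dependent */
  #define HOST_WIDE_INT_UC(X) HOST_WIDE_INT_C (X ## U)
  #define HOST_WIDE_INT_1U    HOST_WIDE_INT_UC (1)
So HOST_WIDE_INT_UC (0xffff0000) always carries the suffix matching
HOST_WIDE_INT, while 0xffff0000ULL hard-codes unsigned long long.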
More importantly, the function does
scalar_int_mode smode;
if (!is_a <scalar_int_mode> (mode, &smode)
|| GET_MODE_BITSIZE (smode) > HOST_BITS_PER_WIDE_INT)
return mmask;
early, so we don't need GET_MODE_BITSIZE (mode), which is
a poly_int, but can use GET_MODE_BITSIZE (smode), which has the same value
but is an unsigned short, and so we don't need known_lt or .to_constant ()
everywhere.
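Concretely, the shift-count checks simplify from the poly_int form to a
plain comparison; e.g. for LSHIFTRT (before/after from the patch below):
  /* Before: GET_MODE_BITSIZE (mode) is a poly_int, so known_lt is needed.  */
  if (CONST_INT_P (XEXP (x, 1))
      && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode)))
    return mmask & (mask << INTVAL (XEXP (x, 1)));
  /* After: GET_MODE_BITSIZE (smode) is an ordinary unsigned short.  */
  if (CONST_INT_P (XEXP (x, 1))
      && UINTVAL (XEXP (x, 1)) < GET_MODE_BITSIZE (smode))
    return mmask & (mask << INTVAL (XEXP (x, 1)));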
Plus some formatting issues.
What I've left in place is
if (!GET_MODE_BITSIZE (GET_MODE (x)).is_constant ()
|| !GET_MODE_BITSIZE (GET_MODE (XEXP (x, 0))).is_constant ())
return -1;
at the start of SIGN_EXTEND or ZERO_EXTEND; I'm afraid I don't know enough
about aarch64/riscv VL vectors to know why this is done (though even that
return -1; is weird; the rest of the code does return mmask; when it wants to
punt).
Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
2024-11-29 Jakub Jelinek <jakub@redhat.com>
PR rtl-optimization/117360
* ext-dce.cc (ext_dce_process_sets): Use HOST_WIDE_INT_UC
macro instead of ULL suffixed constants.
(carry_backpropagate): Likewise. Use HOST_WIDE_INT_1U instead of
1ULL. Use GET_MODE_BITSIZE (smode) instead of
GET_MODE_BITSIZE (mode) and with that avoid having to use
known_lt instead of < or use .to_constant (). Formatting fixes.
(case SIGN_EXTEND): Set mode to GET_MODE_INNER (GET_MODE (XEXP (x, 0)))
rather than GET_MODE (XEXP (x, 0)) and don't use GET_MODE_INNER (mode).
(ext_dce_process_uses): Use HOST_WIDE_INT_UC macro instead of ULL
suffixed constants.
Jakub
Comments
On 11/29/24 1:43 AM, Jakub Jelinek wrote:
> Hi!
>
> This is mostly a blind attempt to fix the PR + various cleanups.
> The PR is about a shift of a HOST_WIDE_INT by 127 invoking UB.
>
> Most of carry_backpropagate works on GET_MODE_INNER of the operand:
> mode is assigned
> enum machine_mode mode = GET_MODE_INNER (GET_MODE (x));
> at the beginning and everything is done in that mode, so for vector
> modes (or perhaps even complex modes) we work with the element mode
> rather than the vector/complex mode.
> But the SIGN_EXTEND handling does that inconsistently: it looks at
> the mode of the operand and uses GET_MODE_INNER in the GET_MODE_MASK
> check, but doesn't use it in the shift.
> The following patch, apart from the cleanups, fixes it by doing
> essentially:
> mode = GET_MODE (XEXP (x, 0));
> if (mask & ~GET_MODE_MASK (GET_MODE_INNER (mode)))
> - mask |= 1ULL << (GET_MODE_BITSIZE (mode).to_constant () - 1);
> + mask |= 1ULL << (GET_MODE_BITSIZE (GET_MODE_INNER (mode)).to_constant () - 1);
> i.e. also shifting by the GET_MODE_BITSIZE of the GET_MODE_INNER of the
> operand's mode.  We don't need to check that it is at most 64 bits:
> at the start of the function we've already verified the result mode
> is at most 64 bits, and SIGN_EXTEND by definition extends from a narrower
> mode.
Yea, the code was definitely not consistent in this regard. Thanks for
cleaning it up.
>
> The rest of the patch is cleanups.  For HOST_WIDE_INT we have the
> HOST_WIDE_INT_{UC,1U} macros; a HWI isn't necessarily unsigned long long,
> so using ULL suffixes for it is weird.
Yea, the code clearly needed a lot of cleanups related to its constants.
I actually just cleaned up several similar problems in the CRC bits I
committed earlier this week. It's an easy issue to miss.
>
> More importantly, the function does
> scalar_int_mode smode;
> if (!is_a <scalar_int_mode> (mode, &smode)
> || GET_MODE_BITSIZE (smode) > HOST_BITS_PER_WIDE_INT)
> return mmask;
> early, so we don't need GET_MODE_BITSIZE (mode), which is
> a poly_int, but can use GET_MODE_BITSIZE (smode), which has the same value
> but is an unsigned short, and so we don't need known_lt or .to_constant ()
> everywhere.
Good cleanup. I'd started working backwards from the bottom of the
patch and had repeatedly asked myself if the changes were safe. This
explains it just fine.
>
> Plus some formatting issues.
>
> What I've left in place is
> if (!GET_MODE_BITSIZE (GET_MODE (x)).is_constant ()
> || !GET_MODE_BITSIZE (GET_MODE (XEXP (x, 0))).is_constant ())
> return -1;
> at the start of SIGN_EXTEND or ZERO_EXTEND; I'm afraid I don't know enough
> about aarch64/riscv VL vectors to know why this is done (though even that
> return -1; is weird; the rest of the code does return mmask; when it wants to
> punt).
It's just saying everything is live if we're presented with a mode that
doesn't have a constant size.  mmask, if wide enough, should work
better in theory.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2024-11-29 Jakub Jelinek <jakub@redhat.com>
>
> PR rtl-optimization/117360
> * ext-dce.cc (ext_dce_process_sets): Use HOST_WIDE_INT_UC
> macro instead of ULL suffixed constants.
> (carry_backpropagate): Likewise. Use HOST_WIDE_INT_1U instead of
> 1ULL. Use GET_MODE_BITSIZE (smode) instead of
> GET_MODE_BITSIZE (mode) and with that avoid having to use
> known_lt instead of < or use .to_constant (). Formatting fixes.
> (case SIGN_EXTEND): Set mode to GET_MODE_INNER (GET_MODE (XEXP (x, 0)))
> rather than GET_MODE (XEXP (x, 0)) and don't use GET_MODE_INNER (mode).
> (ext_dce_process_uses): Use HOST_WIDE_INT_UC macro instead of ULL
> suffixed constants.
Thanks. I hadn't forgotten this, but was still a few days away from
being able to focus on it.
I still have other concerns about this code (the carry_backpropagate
bits), but your changes don't impact those concerns at all.
OK for the trunk.
jeff
@@ -357,8 +357,8 @@ ext_dce_process_sets (rtx_insn *insn, rt
Note that BIT need not be a power of two, consider a
ZERO_EXTRACT destination. */
int start = (bit < 8 ? 0 : bit < 16 ? 1 : bit < 32 ? 2 : 3);
- int end = ((mask & ~0xffffffffULL) ? 4
- : (mask & 0xffff0000ULL) ? 3
+ int end = ((mask & ~HOST_WIDE_INT_UC (0xffffffff)) ? 4
+ : (mask & HOST_WIDE_INT_UC (0xffff0000)) ? 3
: (mask & 0xff00) ? 2 : 1);
bitmap_clear_range (livenow, 4 * rn + start, end - start);
}
@@ -509,21 +509,21 @@ carry_backpropagate (unsigned HOST_WIDE_
case PLUS:
case MINUS:
case MULT:
- return (2ULL << floor_log2 (mask)) - 1;
+ return (HOST_WIDE_INT_UC (2) << floor_log2 (mask)) - 1;
/* We propagate for the shifted operand, but not the shift
count. The count is handled specially. */
case ASHIFT:
if (CONST_INT_P (XEXP (x, 1))
- && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode)))
- return (HOST_WIDE_INT)mask >> INTVAL (XEXP (x, 1));
- return (2ULL << floor_log2 (mask)) - 1;
+ && UINTVAL (XEXP (x, 1)) < GET_MODE_BITSIZE (smode))
+ return (HOST_WIDE_INT) mask >> INTVAL (XEXP (x, 1));
+ return (HOST_WIDE_INT_UC (2) << floor_log2 (mask)) - 1;
/* We propagate for the shifted operand, but not the shift
count. The count is handled specially. */
case LSHIFTRT:
if (CONST_INT_P (XEXP (x, 1))
- && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode)))
+ && UINTVAL (XEXP (x, 1)) < GET_MODE_BITSIZE (smode))
return mmask & (mask << INTVAL (XEXP (x, 1)));
return mmask;
@@ -531,12 +531,12 @@ carry_backpropagate (unsigned HOST_WIDE_
count. The count is handled specially. */
case ASHIFTRT:
if (CONST_INT_P (XEXP (x, 1))
- && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode)))
+ && UINTVAL (XEXP (x, 1)) < GET_MODE_BITSIZE (smode))
{
HOST_WIDE_INT sign = 0;
if (HOST_BITS_PER_WIDE_INT - clz_hwi (mask) + INTVAL (XEXP (x, 1))
- > GET_MODE_BITSIZE (mode).to_constant ())
- sign = 1ULL << (GET_MODE_BITSIZE (mode).to_constant () - 1);
+ > GET_MODE_BITSIZE (smode))
+ sign = HOST_WIDE_INT_1U << (GET_MODE_BITSIZE (smode) - 1);
return sign | (mmask & (mask << INTVAL (XEXP (x, 1))));
}
return mmask;
@@ -550,14 +550,13 @@ carry_backpropagate (unsigned HOST_WIDE_
if (CONST_INT_P (XEXP (x, 1)))
{
if (pow2p_hwi (INTVAL (XEXP (x, 1))))
- return mmask & (mask << (GET_MODE_BITSIZE (mode).to_constant ()
+ return mmask & (mask << (GET_MODE_BITSIZE (smode)
- exact_log2 (INTVAL (XEXP (x, 1)))));
- int bits = (HOST_BITS_PER_WIDE_INT
- + GET_MODE_BITSIZE (mode).to_constant ()
+ int bits = (HOST_BITS_PER_WIDE_INT + GET_MODE_BITSIZE (smode)
- clz_hwi (mask) - ctz_hwi (INTVAL (XEXP (x, 1))));
- if (bits < GET_MODE_BITSIZE (mode).to_constant ())
- return (1ULL << bits) - 1;
+ if (bits < GET_MODE_BITSIZE (smode))
+ return (HOST_WIDE_INT_1U << bits) - 1;
}
return mmask;
@@ -568,9 +567,10 @@ carry_backpropagate (unsigned HOST_WIDE_
/* We want the mode of the inner object. We need to ensure its
sign bit is on in MASK. */
- mode = GET_MODE (XEXP (x, 0));
- if (mask & ~GET_MODE_MASK (GET_MODE_INNER (mode)))
- mask |= 1ULL << (GET_MODE_BITSIZE (mode).to_constant () - 1);
+ mode = GET_MODE_INNER (GET_MODE (XEXP (x, 0)));
+ if (mask & ~GET_MODE_MASK (mode))
+ mask |= HOST_WIDE_INT_1U << (GET_MODE_BITSIZE (mode).to_constant ()
+ - 1);
/* Recurse into the operand. */
return carry_backpropagate (mask, GET_CODE (XEXP (x, 0)), XEXP (x, 0));
@@ -588,13 +588,13 @@ carry_backpropagate (unsigned HOST_WIDE_
case SS_ASHIFT:
case US_ASHIFT:
if (CONST_INT_P (XEXP (x, 1))
- && UINTVAL (XEXP (x, 1)) < GET_MODE_BITSIZE (mode).to_constant ())
+ && UINTVAL (XEXP (x, 1)) < GET_MODE_BITSIZE (smode))
{
- return ((mmask & ~((unsigned HOST_WIDE_INT)mmask
+ return ((mmask & ~((unsigned HOST_WIDE_INT) mmask
>> (INTVAL (XEXP (x, 1))
+ (XEXP (x, 1) != const0_rtx
&& code == SS_ASHIFT))))
- | ((HOST_WIDE_INT)mask >> INTVAL (XEXP (x, 1))));
+ | ((HOST_WIDE_INT) mask >> INTVAL (XEXP (x, 1))));
}
return mmask;
@@ -681,7 +681,8 @@ ext_dce_process_uses (rtx_insn *insn, rt
unsigned HOST_WIDE_INT dst_mask = 0;
HOST_WIDE_INT rn = REGNO (dst);
unsigned HOST_WIDE_INT mask_array[]
- = { 0xff, 0xff00, 0xffff0000ULL, -0x100000000ULL };
+ = { 0xff, 0xff00, HOST_WIDE_INT_UC (0xffff0000),
+ -HOST_WIDE_INT_UC (0x100000000) };
for (int i = 0; i < 4; i++)
if (bitmap_bit_p (live_tmp, 4 * rn + i))
dst_mask |= mask_array[i];
@@ -786,7 +787,7 @@ ext_dce_process_uses (rtx_insn *insn, rt
{
dst_mask <<= bit;
if (!dst_mask)
- dst_mask = -0x100000000ULL;
+ dst_mask = -HOST_WIDE_INT_UC (0x100000000);
}
y = SUBREG_REG (y);
}
@@ -811,9 +812,9 @@ ext_dce_process_uses (rtx_insn *insn, rt
bitmap_set_bit (livenow, rn);
if (tmp_mask & 0xff00)
bitmap_set_bit (livenow, rn + 1);
- if (tmp_mask & 0xffff0000ULL)
+ if (tmp_mask & HOST_WIDE_INT_UC (0xffff0000))
bitmap_set_bit (livenow, rn + 2);
- if (tmp_mask & -0x100000000ULL)
+ if (tmp_mask & -HOST_WIDE_INT_UC (0x100000000))
bitmap_set_bit (livenow, rn + 3);
}
else if (!CONSTANT_P (y))