combine: Don't simplify high part of paradoxical-SUBREG-of-MEM on machines that sign-extend loads [PR113010]
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gcc_build--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gcc_check--master-arm |
fail
|
Testing failed
|
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 |
success
|
Testing passed
|
Commit Message
The sign bit of a sign-extending load cannot be known until runtime,
so don't attempt to simplify it in the combiner.
2024-02-22 Greg McGary <gkm@rivosinc.com>
PR rtl-optimization/113010
* combine.cc (simplify_comparison): Don't simplify high part
of paradoxical-SUBREG-of-MEM on machines that sign-extend loads
* gcc.c-torture/execute/pr113010.c: New test.
---
gcc/combine.cc | 10 ++++++++--
gcc/testsuite/gcc.c-torture/execute/pr113010.c | 9 +++++++++
2 files changed, 17 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
Comments
On Thu, Feb 22, 2024 at 12:59:18PM -0800, Greg McGary wrote:
> The sign bit of a sign-extending load cannot be known until runtime,
> so don't attempt to simplify it in the combiner.
>
> 2024-02-22 Greg McGary <gkm@rivosinc.com>
>
> PR rtl-optimization/113010
> * combine.cc (simplify_comparison): Don't simplify high part
> of paradoxical-SUBREG-of-MEM on machines that sign-extend loads
>
> * gcc.c-torture/execute/pr113010.c: New test.
> ---
> gcc/combine.cc | 10 ++++++++--
> gcc/testsuite/gcc.c-torture/execute/pr113010.c | 9 +++++++++
> 2 files changed, 17 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
>
> diff --git a/gcc/combine.cc b/gcc/combine.cc
> index 812553c091e..736206242e1 100644
> --- a/gcc/combine.cc
> +++ b/gcc/combine.cc
> @@ -12550,9 +12550,15 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
> }
>
> /* If the inner mode is narrower and we are extracting the low part,
> - we can treat the SUBREG as if it were a ZERO_EXTEND. */
> + we can treat the SUBREG as if it were a ZERO_EXTEND ... */
> if (paradoxical_subreg_p (op0))
> - ;
> + {
> + /* ... except we can't treat as ZERO_EXTEND when a machine
> + automatically sign-extends loads. */
> + if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
> + && load_extend_op (inner_mode) == SIGN_EXTEND)
> + break;
That doesn't feel sufficient. Like in the PR112758 patch, I believe
for WORD_REGISTER_OPERATIONS you should treat it as a ZERO_EXTEND only
if MEM_P (SUBREG_REG (op0)) && load_extend_op (inner_mode) == ZERO_EXTEND
or if GET_MODE_PRECISION (inner_mode) is known to be >= BITS_PER_WORD.
Jakub
On 2/22/24 2:08 PM, Jakub Jelinek wrote:
> On Thu, Feb 22, 2024 at 12:59:18PM -0800, Greg McGary wrote:
>> The sign bit of a sign-extending load cannot be known until runtime,
>> so don't attempt to simplify it in the combiner.
>>
>> 2024-02-22 Greg McGary <gkm@rivosinc.com>
>>
>> PR rtl-optimization/113010
>> * combine.cc (simplify_comparison): Don't simplify high part
>> of paradoxical-SUBREG-of-MEM on machines that sign-extend loads
>>
>> * gcc.c-torture/execute/pr113010.c: New test.
>> ---
>> gcc/combine.cc | 10 ++++++++--
>> gcc/testsuite/gcc.c-torture/execute/pr113010.c | 9 +++++++++
>> 2 files changed, 17 insertions(+), 2 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
>>
>> diff --git a/gcc/combine.cc b/gcc/combine.cc
>> index 812553c091e..736206242e1 100644
>> --- a/gcc/combine.cc
>> +++ b/gcc/combine.cc
>> @@ -12550,9 +12550,15 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
>> }
>>
>> /* If the inner mode is narrower and we are extracting the low part,
>> - we can treat the SUBREG as if it were a ZERO_EXTEND. */
>> + we can treat the SUBREG as if it were a ZERO_EXTEND ... */
>> if (paradoxical_subreg_p (op0))
>> - ;
>> + {
>> + /* ... except we can't treat as ZERO_EXTEND when a machine
>> + automatically sign-extends loads. */
>> + if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
>> + && load_extend_op (inner_mode) == SIGN_EXTEND)
>> + break;
> That doesn't feel sufficient. Like in the PR112758 patch, I believe
> for WORD_REGISTER_OPERATIONS you should treat it as a ZERO_EXTEND only
> if MEM_P (SUBREG_REG (op0)) && load_extend_op (inner_mode) == ZERO_EXTEND
> or if GET_MODE_PRECISION (inner_mode) is known to be >= BITS_PER_WORD.
>
> Jakub
The gist of your comment is that my patch was missing the test for width
of inner_mode vs. BITS_PER_WORD. Here's a revision. Looks good to you?
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 812553c091e..4626f2edae9 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -12550,9 +12550,17 @@ simplify_comparison (enum rtx_code code, rtx
*pop0, rtx *pop1)
}
/* If the inner mode is narrower and we are extracting the
low part,
- we can treat the SUBREG as if it were a ZERO_EXTEND. */
+ we can treat the SUBREG as if it were a ZERO_EXTEND ... */
if (paradoxical_subreg_p (op0))
- ;
+ {
+ /* ... except we can't do that for loads on machines
+ that don't automatically zero-extend loads. */
+ if (WORD_REGISTER_OPERATIONS
+ && GET_MODE_PRECISION (inner_mode) < BITS_PER_WORD
+ && MEM_P (SUBREG_REG (op0))
+ && load_extend_op (inner_mode) != ZERO_EXTEND)
+ break;
+ }
else if (subreg_lowpart_p (op0)
&& GET_MODE_CLASS (mode) == MODE_INT
&& is_int_mode (GET_MODE (SUBREG_REG (op0)),
&inner_mode)
@@ -12550,9 +12550,15 @@ simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
}
/* If the inner mode is narrower and we are extracting the low part,
- we can treat the SUBREG as if it were a ZERO_EXTEND. */
+ we can treat the SUBREG as if it were a ZERO_EXTEND ... */
if (paradoxical_subreg_p (op0))
- ;
+ {
+ /* ... except we can't treat as ZERO_EXTEND when a machine
+ automatically sign-extends loads. */
+ if (MEM_P (SUBREG_REG (op0)) && WORD_REGISTER_OPERATIONS
+ && load_extend_op (inner_mode) == SIGN_EXTEND)
+ break;
+ }
else if (subreg_lowpart_p (op0)
&& GET_MODE_CLASS (mode) == MODE_INT
&& is_int_mode (GET_MODE (SUBREG_REG (op0)), &inner_mode)
new file mode 100644
@@ -0,0 +1,9 @@
+int minus_1 = -1;
+
+int
+main ()
+{
+ if ((0, 0xfffffffful) >= minus_1)
+ __builtin_abort ();
+ return 0;
+}