combine: Don't simplify paradoxical SUBREG on WORD_REGISTER_OPERATIONS [PR113010]

Message ID 20240227001736.3690294-1-gkm@rivosinc.com
State New
Series combine: Don't simplify paradoxical SUBREG on WORD_REGISTER_OPERATIONS [PR113010]

Checks

Context                                         Check    Description
linaro-tcwg-bot/tcwg_gcc_build--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-arm      fail     Testing failed
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64  success  Testing passed

Commit Message

Greg McGary Feb. 27, 2024, 12:17 a.m. UTC
  The sign-bit-copies of a sign-extending load cannot be known until runtime on
WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM
load.  See the fix for PR112758.

2024-02-22  Greg McGary  <gkm@rivosinc.com>

        PR rtl-optimization/113010
	* combine.cc (simplify_comparison): Simplify a SUBREG on
	  WORD_REGISTER_OPERATIONS targets only if it is a zero-extending
	  MEM load.

	* gcc.c-torture/execute/pr113010.c: New test.
---
 gcc/combine.cc                                 | 15 +++++++++++++--
 gcc/testsuite/gcc.c-torture/execute/pr113010.c |  9 +++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr113010.c
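
As a rough illustration of the problem described in the commit message, the
following C sketch models a 64-bit WORD_REGISTER_OPERATIONS target whose
32-bit loads sign-extend (an assumption made for the example; 64-bit RISC-V
is one such target). All function names are hypothetical and none of this is
GCC code. It contrasts what the register holds after the load with what is
assumed when the paradoxical SUBREG is treated as a ZERO_EXTEND.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 32-bit load into a 64-bit register on a target
   whose loads sign-extend.  */
static uint64_t
load_si_sign_extending (int32_t mem)
{
  return (uint64_t) (int64_t) mem;
}

/* The ZERO_EXTEND view of the same value: upper 32 bits assumed zero.  */
static uint64_t
assume_zero_extended (int32_t mem)
{
  return (uint64_t) (uint32_t) mem;
}

/* Unsigned >= on full 64-bit registers, as in the testcase.  */
static int
compare_geu (uint64_t lhs, uint64_t rhs)
{
  return lhs >= rhs;
}

/* Returns 1 if the two views of the register give different results
   for `0xffffffff >= value`.  */
static int
views_disagree (int32_t mem)
{
  uint64_t k = 0xffffffffull;
  int actual = compare_geu (k, load_si_sign_extending (mem));
  int assumed = compare_geu (k, assume_zero_extended (mem));
  return actual != assumed;
}
```

For a stored value of -1 the two views disagree: the register holds all
ones, so the comparison is false, while the zero-extended view makes it
true. For non-negative values both views agree, which is why the upper bits
matter only at runtime.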
  

Comments

Greg McGary Feb. 27, 2024, 3:26 p.m. UTC | #1
On 2/26/24 5:17 PM, Greg McGary wrote:
> diff --git a/gcc/testsuite/gcc.c-torture/execute/pr113010.c b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
> new file mode 100644
> index 00000000000..a95c613c1df
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
> @@ -0,0 +1,9 @@
> +int minus_1 = -1;
> +
> +int
> +main ()
> +{
> +  if ((0, 0xfffffffful) >= minus_1)
> +    __builtin_abort ();
> +  return 0;
> +}


Note that this is a stale version of the testcase. The constant needs to be
unsigned long long (0xffffffffull) for the sake of 32-bit machines, such as ARM.

G
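
The effect of the suffix can be sketched with fixed-width types. This is an
illustration only, with hypothetical function names, and it assumes that
unsigned long is 32 bits wide on the 32-bit target (as on ILP32 ARM). With
the `ul` suffix the int operand converts to a 32-bit all-ones value, so the
comparison holds even when compiled correctly; that is presumably why the
stale testcase is unsuitable for such machines.

```c
#include <assert.h>
#include <stdint.h>

/* Models `0xfffffffful >= minus_1` where unsigned long is 32 bits wide
   (assumed ILP32 target).  */
static int
compare_with_32bit_ulong (int rhs)
{
  return (uint32_t) 0xffffffffu >= (uint32_t) rhs;
}

/* Models `0xffffffffull >= minus_1`; unsigned long long is at least
   64 bits wide on every target.  */
static int
compare_with_ulonglong (int rhs)
{
  return 0xffffffffull >= (unsigned long long) rhs;
}
```

With rhs equal to -1, the 32-bit variant compares 0xffffffff against
0xffffffff and is true, while the unsigned long long variant compares
against a 64-bit all-ones value and is false regardless of the word size.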
  
Jeff Law March 1, 2024, 4:30 a.m. UTC | #2
On 2/26/24 17:17, Greg McGary wrote:
> The sign-bit-copies of a sign-extending load cannot be known until runtime on
> WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM
> load.  See the fix for PR112758.
> 
> 2024-02-22  Greg McGary  <gkm@rivosinc.com>
> 
>          PR rtl-optimization/113010
> 	* combine.cc (simplify_comparison): Simplify a SUBREG on
> 	  WORD_REGISTER_OPERATIONS targets only if it is a zero-extending
> 	  MEM load.
> 
> 	* gcc.c-torture/execute/pr113010.c: New test.
I think this is fine for the trunk.  I'll do some final testing on it 
tomorrow.

Jeff
  
Rainer Orth March 4, 2024, 4:18 p.m. UTC | #3
Hi Jeff,

> On 2/26/24 17:17, Greg McGary wrote:
>> The sign-bit-copies of a sign-extending load cannot be known until runtime on
>> WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM
>> load.  See the fix for PR112758.
>> 2024-02-22  Greg McGary  <gkm@rivosinc.com>
>>          PR rtl-optimization/113010
>> 	* combine.cc (simplify_comparison): Simplify a SUBREG on
>> 	  WORD_REGISTER_OPERATIONS targets only if it is a zero-extending
>> 	  MEM load.
>> 	* gcc.c-torture/execute/pr113010.c: New test.
> I think this is fine for the trunk.  I'll do some final testing on it
> tomorrow.

Unfortunately, the patch broke Solaris/SPARC bootstrap
(sparc-sun-solaris2.11):

/vol/gcc/src/hg/master/local/gcc/combine.cc: In function 'rtx_code simplify_comparison(rtx_code, rtx_def**, rtx_def**)':
/vol/gcc/src/hg/master/local/gcc/combine.cc:12101:25: error: '*(unsigned int*)((char*)&inner_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))' may be used uninitialized [-Werror=maybe-uninitialized]
12101 |   scalar_int_mode mode, inner_mode, tmode;
      |                         ^~~~~~~~~~

	Rainer
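
Judging only from the hunk in the Patch section, one plausible reading of
this warning is that the new code reads inner_mode inside the
paradoxical_subreg_p branch, while the only assignment visible in that hunk
(through is_int_mode) sits in the following else-if. The standalone sketch
below reproduces that shape with hypothetical names and shows one way to
avoid the uninitialized read: set the value in the same branch that uses it.
It is not GCC code, it omits the MEM and load_extend_op conditions, and this
thread does not show how the problem was actually resolved.

```c
#include <assert.h>
#include <stdbool.h>

enum action { SKIP, TREAT_AS_ZERO_EXTEND, NARROW };

/* Hypothetical stand-in for is_int_mode: sets *out only on success.  */
static bool
get_precision (int raw_mode, unsigned *out)
{
  if (raw_mode <= 0)
    return false;
  *out = (unsigned) raw_mode;
  return true;
}

static enum action
classify (bool paradoxical, int raw_mode, unsigned bits_per_word)
{
  unsigned inner_prec;  /* Not initialized at declaration, like inner_mode.  */

  if (paradoxical)
    {
      /* Set inner_prec in the branch that reads it; && short-circuits,
         so the comparison runs only after a successful assignment.  */
      if (get_precision (raw_mode, &inner_prec)
	  && inner_prec < bits_per_word)
	return SKIP;
      return TREAT_AS_ZERO_EXTEND;
    }
  else if (get_precision (raw_mode, &inner_prec))
    return NARROW;
  return SKIP;
}
```

Dropping the get_precision call from the paradoxical branch would recreate
the pattern the compiler complains about, since inner_prec would then be
read on a path where nothing has assigned it.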
  

Patch

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 76543d85b7c..b09200d757e 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -12550,9 +12550,20 @@  simplify_comparison (enum rtx_code code, rtx *pop0, rtx *pop1)
 	    }
 
 	  /* If the inner mode is narrower and we are extracting the low part,
-	     we can treat the SUBREG as if it were a ZERO_EXTEND.  */
+	     we can treat the SUBREG as if it were a ZERO_EXTEND ...  */
 	  if (paradoxical_subreg_p (op0))
-	    ;
+	    {
+	      if (WORD_REGISTER_OPERATIONS
+		  && GET_MODE_PRECISION (inner_mode) < BITS_PER_WORD
+		  /* On WORD_REGISTER_OPERATIONS targets the bits
+		     beyond sub_mode aren't considered undefined,
+		     so optimize only if it is a MEM load when MEM loads
+		     zero extend, because then the upper bits are all zero.  */
+		  && !(MEM_P (SUBREG_REG (op0))
+		       && load_extend_op (inner_mode) == ZERO_EXTEND))
+		break;
+	      /* FALLTHROUGH to case ZERO_EXTEND */
+	    }
 	  else if (subreg_lowpart_p (op0)
 		   && GET_MODE_CLASS (mode) == MODE_INT
 		   && is_int_mode (GET_MODE (SUBREG_REG (op0)), &inner_mode)
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr113010.c b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
new file mode 100644
index 00000000000..a95c613c1df
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr113010.c
@@ -0,0 +1,9 @@ 
+int minus_1 = -1;
+
+int
+main ()
+{
+  if ((0, 0xfffffffful) >= minus_1)
+    __builtin_abort ();
+  return 0;
+}