PR tree-optimization/123319 - Always snap subranges after intersection.

Message ID 402f1898-4c90-4ed3-951f-e250cffdeaec@redhat.com
State New

Commit Message

Andrew MacLeod Jan. 6, 2026, 5:35 p.m. UTC
  There was a tiny bug in intersection.

[irange] int64_t [-INF, -65536][65536, 9223372036854710272] MASK 0xffffffffffff0000
    intersect
[irange] int64_t [-1736223011, -1736223011][1, 1]

The first range does not contain the second range, but the second range's
first subrange lies within it.  The resulting range is simply

  [-1736223011, -1736223011]

But as it is being constructed into the same range and the second range
has no bitmask, the bitmask remains the same.

In this particular case, intersect_bitmask takes a shortcut: since the
bitmask doesn't change, it doesn't snap the bounds to the bitmask.

The new range SHOULD be UNDEFINED since this bitmask cannot contain
-1736223011.  The intersection routine SHOULD always snap the bounds if
it changed any of the bounds in the new range.
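
As an illustration only, here is a minimal sketch of what snapping means for
this mask, assuming MASK 0xffffffffffff0000 VALUE 0x0 is read as "the low 16
bits are known to be zero", so only multiples of 65536 are allowed.  The
helpers floor_to, ceil_to and snap_subrange are hypothetical names for this
sketch; this is not the irange::snap_subranges implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers, not GCC code.  ALIGN is 1 << 16 when the low
   16 bits are known to be zero.  */

/* Round V down to a multiple of ALIGN (toward -INF).  */
static int64_t
floor_to (int64_t v, int64_t align)
{
  assert (align > 0);
  int64_t r = v % align;  /* C truncates toward zero, so r may be negative.  */
  if (r < 0)
    r += align;
  return v - r;           /* Sketch only: ignores overflow near INT64_MIN.  */
}

/* Round V up to a multiple of ALIGN (toward +INF).  */
static int64_t
ceil_to (int64_t v, int64_t align)
{
  int64_t f = floor_to (v, align);
  return f == v ? v : f + align;  /* Ignores overflow near INT64_MAX.  */
}

/* Snap [*LO, *HI] inward to multiples of ALIGN.  Return false if no
   multiple of ALIGN is left, i.e. the subrange is empty.  */
static bool
snap_subrange (int64_t *lo, int64_t *hi, int64_t align)
{
  int64_t new_lo = ceil_to (*lo, align);
  int64_t new_hi = floor_to (*hi, align);
  if (new_lo > new_hi)
    return false;
  *lo = new_lo;
  *hi = new_hi;
  return true;
}
```

For the singleton [-1736223011, -1736223011] the lower bound rounds up past
the upper bound, so no value survives; that is the sense in which the result
should have been UNDEFINED.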

This was showing up in IPA as we were getting an invalid range to union
with:

   [irange] int64_t [-1736223011, -1736223011] MASK 0xffffffffffff0000 VALUE 0x0

It had the effect of an UNDEFINED during union processing once the mask
was applied, but it was tricking union into thinking it had changed the
range, even though it had not.  This sort of chaos happens when the
range should be undefined but isn't :-P

Bootstraps on x86_64-pc-linux-gnu with no regressions, and very minor 
performance impact.  OK?

Andrew
  

Comments

Richard Biener Jan. 7, 2026, 1:06 p.m. UTC | #1
OK.

Richard.
  
Andrew Pinski Jan. 8, 2026, 2:45 a.m. UTC | #2
On Wed, Jan 7, 2026 at 5:08 AM Richard Biener
<richard.guenther@gmail.com> wrote:
> OK.

Note that the testcase does not work on any non-x86 target. I have placed a
new reduced testcase in the bug report that uses neither vectors nor
x86-specific functions/headers.
Does it make sense to replace the broken testcase with that one? Or keep
the old one, mark it as compiled only for x86, and then add the new one?

Thanks,
Andrew

  
Jakub Jelinek Jan. 8, 2026, 9:53 a.m. UTC | #3
On Tue, Jan 06, 2026 at 12:35:17PM -0500, Andrew MacLeod wrote:
> 	gcc/testsuite/
> 	* gcc.dg/pr123319.c: New.

The test doesn't work on non-x86 targets, but doesn't work on ia32 either.
On the former because (except for powerpc with its emulations) those don't
have immintrin.h header at all, on the latter due to
FAIL: gcc.dg/pr123319.c (test for excess errors)
Excess errors:
/home/jakub/src/gcc/obj39/gcc/include/xmmintrin.h:1288:1: error: inlining failed in call to 'always_inline' '_mm_avg_pu8': target specific option mismatch

The following patch fixes that, committed as obvious to trunk.

2026-01-08  Jakub Jelinek  <jakub@redhat.com>
	    Andrew Pinski  <andrew.pinski@oss.qualcomm.com>

	PR tree-optimization/123319
	* gcc.dg/pr123319.c: Replace test with target independent one.  Move
	previous test to ...
	* gcc.target/i386/pr123319.c: ... here.  Add comment with PR number,
	add -msse to dg-options, move immintrin.h include right after stdint.h
	include.

--- gcc/testsuite/gcc.dg/pr123319.c.jj	2026-01-08 09:44:44.081248326 +0100
+++ gcc/testsuite/gcc.dg/pr123319.c	2026-01-08 10:28:01.411420483 +0100
@@ -1,89 +1,35 @@
-/* { dg-do compile } */
-/* { dg-options "-O3 -w -Wno-psabi" } */
+/* PR tree-optimization/123319 */
+/* { dg-do compile { target int32plus } } */
+/* { dg-options "-O3" } */
 
-#include <stdint.h>
-#define BS_VEC(type, num) type __attribute__((vector_size(num * sizeof(type))))
-#define BITCAST(T, F, arg)                                                     \
-    (union {                                                                   \
-        F src;                                                                 \
-        T dst                                                                  \
-    }){ arg }                                                                  \
-        .dst
-#include <immintrin.h>
-#define BARRIER_v8u8(x) (BS_VEC(uint8_t, 8)) _mm_avg_pu8((__m64)x, (__m64)x)
-uint64_t BS_CHECKSUM_ARR[];
-BS_VEC(int8_t, 16) backsmith_pure_2(BS_VEC(int32_t, 8));
-BS_VEC(uint16_t, 4)
-backsmith_pure_0(BS_VEC(uint8_t, 4), int16_t, int64_t BS_ARG_2)
+signed char b, c[4];
+void foo (void);
+
+void
+bar (long e)
 {
-    int8_t BS_VAR_1;
-    BS_VEC(uint8_t, 16) BS_VAR_4[7];
-    if (BS_ARG_2)
-    {
-        if ((uint16_t)BS_ARG_2)
-            for (uint16_t BS_INC_0;;)
-                __builtin_convertvector(
-                    __builtin_shufflevector(BS_VAR_4[BS_INC_0],
-                                            BS_VAR_4[BS_INC_0], 8, 4, 5, 8),
-                    BS_VEC(uint16_t, 4));
-        BS_VAR_1 = __builtin_convertvector(
-            __builtin_shufflevector((BS_VEC(uint8_t, 8)){}, BARRIER_v8u8({}), 4,
-                                    10, 2, 1),
-            BS_VEC(int8_t, 4))[BS_ARG_2];
-        uint16_t BS_TEMP_11 = BS_VAR_1;
-        return (BS_VEC(uint16_t, 4)){ BS_TEMP_11 };
-    }
-    __builtin_convertvector(__builtin_shufflevector((BS_VEC(uint8_t, 8)){},
-                                                    (BS_VEC(uint8_t, 8)){}, 2,
-                                                    6, 4, 4),
-                            BS_VEC(uint16_t, 4));
+  if (e) {
+    if ((short) e)
+      for (;;)
+        ;
+    foo ();
+    b = c[e];
+  }
 }
-static int32_t *func_31(int32_t *, uint64_t);
-uint16_t func_1()
+
+static void
+baz (long e)
 {
-    BITCAST(
-        int64_t, BS_VEC(int16_t, 4),
-        (__builtin_convertvector(
-             __builtin_shufflevector((BS_VEC(uint32_t, 4)){},
-                                     (BS_VEC(uint32_t, 4)){}, 0, 3, 3, 0, 2, 7,
-                                     2, 5, 3, 5, 0, 4, 0, 1, 1, 7, 1, 0, 6, 7,
-                                     6, 3, 4, 6, 3, 3, 1, 7, 3, 6, 0, 0),
-             BS_VEC(int16_t, 32)),
-         __builtin_convertvector(
-             __builtin_shufflevector(
-                 backsmith_pure_0((BS_VEC(uint8_t, 4)){}, 0,
-                                  BITCAST(int64_t, BS_VEC(int32_t, 2), )),
-                 backsmith_pure_0(
-                     (BS_VEC(uint8_t, 4)){}, 0,
-                     BITCAST(int64_t, BS_VEC(int32_t, 2),
-                             __builtin_convertvector(
-                                 __builtin_shufflevector(
-                                     backsmith_pure_2(__builtin_convertvector(
-                                         (BS_VEC(int64_t, 8)){},
-                                         BS_VEC(int32_t, 8))),
-                                     backsmith_pure_2(__builtin_convertvector(
-                                         (BS_VEC(int64_t, 8)){},
-                                         BS_VEC(int32_t, 8))),
-                                     9, 2),
-                                 BS_VEC(int32_t, 2)))),
-                 3, 7, 5, 5, 1, 0, 2, 1, 1, 4, 6, 4, 5, 1, 5, 6, 1, 0, 1, 6, 4,
-                 1, 2, 3, 1, 1, 1, 0, 7, 2, 5, 1),
-             BS_VEC(int16_t, 32)),
-         1));
-    int32_t l_969;
-    int8_t l_1016 = 1;
-    func_31(&l_969, l_1016);
-    __builtin_convertvector((BS_VEC(int32_t, 32)){}, BS_VEC(int16_t, 32)),
-        __builtin_convertvector((BS_VEC(int32_t, 2)){}, BS_VEC(uint8_t, 2));
-    int l_572 = 2558744285;
-    func_31(0, l_572);
+  bar (e);
 }
-int32_t *func_31(int32_t *, uint64_t p_33)
+
+int g;
+
+void
+qux ()
 {
-    uint64_t LOCAL_CHECKSUM;
-    backsmith_pure_0(
-        __builtin_convertvector((BS_VEC(int32_t, 4)){}, BS_VEC(uint8_t, 4)),
-        20966, p_33);
-    for (uint32_t BS_TEMP_215; BS_TEMP_215;)
-        BS_CHECKSUM_ARR[6] += LOCAL_CHECKSUM;
+  bar (g);
+  baz (1);
+  int h = 2558744285;
+  baz (h);
 }
--- gcc/testsuite/gcc.target/i386/pr123319.c.jj	2026-01-08 10:23:41.672845301 +0100
+++ gcc/testsuite/gcc.target/i386/pr123319.c	2026-01-08 10:28:10.468266193 +0100
@@ -0,0 +1,90 @@
+/* PR tree-optimization/123319 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -w -Wno-psabi -msse" } */
+
+#include <stdint.h>
+#include <immintrin.h>
+#define BS_VEC(type, num) type __attribute__((vector_size(num * sizeof(type))))
+#define BITCAST(T, F, arg)                                                     \
+    (union {                                                                   \
+        F src;                                                                 \
+        T dst                                                                  \
+    }){ arg }                                                                  \
+        .dst
+#define BARRIER_v8u8(x) (BS_VEC(uint8_t, 8)) _mm_avg_pu8((__m64)x, (__m64)x)
+uint64_t BS_CHECKSUM_ARR[];
+BS_VEC(int8_t, 16) backsmith_pure_2(BS_VEC(int32_t, 8));
+BS_VEC(uint16_t, 4)
+backsmith_pure_0(BS_VEC(uint8_t, 4), int16_t, int64_t BS_ARG_2)
+{
+    int8_t BS_VAR_1;
+    BS_VEC(uint8_t, 16) BS_VAR_4[7];
+    if (BS_ARG_2)
+    {
+        if ((uint16_t)BS_ARG_2)
+            for (uint16_t BS_INC_0;;)
+                __builtin_convertvector(
+                    __builtin_shufflevector(BS_VAR_4[BS_INC_0],
+                                            BS_VAR_4[BS_INC_0], 8, 4, 5, 8),
+                    BS_VEC(uint16_t, 4));
+        BS_VAR_1 = __builtin_convertvector(
+            __builtin_shufflevector((BS_VEC(uint8_t, 8)){}, BARRIER_v8u8({}), 4,
+                                    10, 2, 1),
+            BS_VEC(int8_t, 4))[BS_ARG_2];
+        uint16_t BS_TEMP_11 = BS_VAR_1;
+        return (BS_VEC(uint16_t, 4)){ BS_TEMP_11 };
+    }
+    __builtin_convertvector(__builtin_shufflevector((BS_VEC(uint8_t, 8)){},
+                                                    (BS_VEC(uint8_t, 8)){}, 2,
+                                                    6, 4, 4),
+                            BS_VEC(uint16_t, 4));
+}
+static int32_t *func_31(int32_t *, uint64_t);
+uint16_t func_1()
+{
+    BITCAST(
+        int64_t, BS_VEC(int16_t, 4),
+        (__builtin_convertvector(
+             __builtin_shufflevector((BS_VEC(uint32_t, 4)){},
+                                     (BS_VEC(uint32_t, 4)){}, 0, 3, 3, 0, 2, 7,
+                                     2, 5, 3, 5, 0, 4, 0, 1, 1, 7, 1, 0, 6, 7,
+                                     6, 3, 4, 6, 3, 3, 1, 7, 3, 6, 0, 0),
+             BS_VEC(int16_t, 32)),
+         __builtin_convertvector(
+             __builtin_shufflevector(
+                 backsmith_pure_0((BS_VEC(uint8_t, 4)){}, 0,
+                                  BITCAST(int64_t, BS_VEC(int32_t, 2), )),
+                 backsmith_pure_0(
+                     (BS_VEC(uint8_t, 4)){}, 0,
+                     BITCAST(int64_t, BS_VEC(int32_t, 2),
+                             __builtin_convertvector(
+                                 __builtin_shufflevector(
+                                     backsmith_pure_2(__builtin_convertvector(
+                                         (BS_VEC(int64_t, 8)){},
+                                         BS_VEC(int32_t, 8))),
+                                     backsmith_pure_2(__builtin_convertvector(
+                                         (BS_VEC(int64_t, 8)){},
+                                         BS_VEC(int32_t, 8))),
+                                     9, 2),
+                                 BS_VEC(int32_t, 2)))),
+                 3, 7, 5, 5, 1, 0, 2, 1, 1, 4, 6, 4, 5, 1, 5, 6, 1, 0, 1, 6, 4,
+                 1, 2, 3, 1, 1, 1, 0, 7, 2, 5, 1),
+             BS_VEC(int16_t, 32)),
+         1));
+    int32_t l_969;
+    int8_t l_1016 = 1;
+    func_31(&l_969, l_1016);
+    __builtin_convertvector((BS_VEC(int32_t, 32)){}, BS_VEC(int16_t, 32)),
+        __builtin_convertvector((BS_VEC(int32_t, 2)){}, BS_VEC(uint8_t, 2));
+    int l_572 = 2558744285;
+    func_31(0, l_572);
+}
+int32_t *func_31(int32_t *, uint64_t p_33)
+{
+    uint64_t LOCAL_CHECKSUM;
+    backsmith_pure_0(
+        __builtin_convertvector((BS_VEC(int32_t, 4)){}, BS_VEC(uint8_t, 4)),
+        20966, p_33);
+    for (uint32_t BS_TEMP_215; BS_TEMP_215;)
+        BS_CHECKSUM_ARR[6] += LOCAL_CHECKSUM;
+}


	Jakub
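
A side note on the reduced test: the constant 2558744285 assigned to an int
appears to be where the -1736223011 in the reported range comes from.  A small
sketch, assuming a 32-bit two's-complement int; the helper name wrap_to_int32
is hypothetical, and the arithmetic is done in int64_t so it does not rely on
the implementation-defined result of an out-of-range conversion.

```c
#include <stdint.h>

/* Hypothetical helper: the value a 32-bit two's-complement int would
   hold after V is reduced modulo 2^32.  */
static int64_t
wrap_to_int32 (int64_t v)
{
  uint32_t u = (uint32_t) v;      /* Unsigned conversion is modulo 2^32.  */
  return u >= 0x80000000u ? (int64_t) u - 4294967296LL : (int64_t) u;
}
```

With this helper, wrap_to_int32 (2558744285) evaluates to -1736223011, which
matches the singleton in the range dumps from the original report.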
  
Andrew MacLeod Jan. 8, 2026, 2:36 p.m. UTC | #4
Ugh.

Sorry about that.  I spent a few minutes seeing if I could remove the
warnings and gave up; I didn't think about other targets.  Thanks guys, you
are saints :-).

Andrew

  

Patch

From 9285064e0315972057c16a153e7862be6d689394 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacleod@redhat.com>
Date: Mon, 5 Jan 2026 11:00:00 -0500
Subject: [PATCH 1/3] Always snap subranges after intersection.

	PR tree-optimization/123319
	gcc/
	* value-range.cc (irange::intersect): If there is a bitmask, snap
	subranges after creating them.

	gcc/testsuite/
	* gcc.dg/pr123319.c: New.
---
 gcc/testsuite/gcc.dg/pr123319.c | 89 +++++++++++++++++++++++++++++++++
 gcc/value-range.cc              |  7 +++
 2 files changed, 96 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr123319.c

diff --git a/gcc/testsuite/gcc.dg/pr123319.c b/gcc/testsuite/gcc.dg/pr123319.c
new file mode 100644
index 00000000000..a60f2de22d6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr123319.c
@@ -0,0 +1,89 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -w -Wno-psabi" } */
+
+#include <stdint.h>
+#define BS_VEC(type, num) type __attribute__((vector_size(num * sizeof(type))))
+#define BITCAST(T, F, arg)                                                     \
+    (union {                                                                   \
+        F src;                                                                 \
+        T dst                                                                  \
+    }){ arg }                                                                  \
+        .dst
+#include <immintrin.h>
+#define BARRIER_v8u8(x) (BS_VEC(uint8_t, 8)) _mm_avg_pu8((__m64)x, (__m64)x)
+uint64_t BS_CHECKSUM_ARR[];
+BS_VEC(int8_t, 16) backsmith_pure_2(BS_VEC(int32_t, 8));
+BS_VEC(uint16_t, 4)
+backsmith_pure_0(BS_VEC(uint8_t, 4), int16_t, int64_t BS_ARG_2)
+{
+    int8_t BS_VAR_1;
+    BS_VEC(uint8_t, 16) BS_VAR_4[7];
+    if (BS_ARG_2)
+    {
+        if ((uint16_t)BS_ARG_2)
+            for (uint16_t BS_INC_0;;)
+                __builtin_convertvector(
+                    __builtin_shufflevector(BS_VAR_4[BS_INC_0],
+                                            BS_VAR_4[BS_INC_0], 8, 4, 5, 8),
+                    BS_VEC(uint16_t, 4));
+        BS_VAR_1 = __builtin_convertvector(
+            __builtin_shufflevector((BS_VEC(uint8_t, 8)){}, BARRIER_v8u8({}), 4,
+                                    10, 2, 1),
+            BS_VEC(int8_t, 4))[BS_ARG_2];
+        uint16_t BS_TEMP_11 = BS_VAR_1;
+        return (BS_VEC(uint16_t, 4)){ BS_TEMP_11 };
+    }
+    __builtin_convertvector(__builtin_shufflevector((BS_VEC(uint8_t, 8)){},
+                                                    (BS_VEC(uint8_t, 8)){}, 2,
+                                                    6, 4, 4),
+                            BS_VEC(uint16_t, 4));
+}
+static int32_t *func_31(int32_t *, uint64_t);
+uint16_t func_1()
+{
+    BITCAST(
+        int64_t, BS_VEC(int16_t, 4),
+        (__builtin_convertvector(
+             __builtin_shufflevector((BS_VEC(uint32_t, 4)){},
+                                     (BS_VEC(uint32_t, 4)){}, 0, 3, 3, 0, 2, 7,
+                                     2, 5, 3, 5, 0, 4, 0, 1, 1, 7, 1, 0, 6, 7,
+                                     6, 3, 4, 6, 3, 3, 1, 7, 3, 6, 0, 0),
+             BS_VEC(int16_t, 32)),
+         __builtin_convertvector(
+             __builtin_shufflevector(
+                 backsmith_pure_0((BS_VEC(uint8_t, 4)){}, 0,
+                                  BITCAST(int64_t, BS_VEC(int32_t, 2), )),
+                 backsmith_pure_0(
+                     (BS_VEC(uint8_t, 4)){}, 0,
+                     BITCAST(int64_t, BS_VEC(int32_t, 2),
+                             __builtin_convertvector(
+                                 __builtin_shufflevector(
+                                     backsmith_pure_2(__builtin_convertvector(
+                                         (BS_VEC(int64_t, 8)){},
+                                         BS_VEC(int32_t, 8))),
+                                     backsmith_pure_2(__builtin_convertvector(
+                                         (BS_VEC(int64_t, 8)){},
+                                         BS_VEC(int32_t, 8))),
+                                     9, 2),
+                                 BS_VEC(int32_t, 2)))),
+                 3, 7, 5, 5, 1, 0, 2, 1, 1, 4, 6, 4, 5, 1, 5, 6, 1, 0, 1, 6, 4,
+                 1, 2, 3, 1, 1, 1, 0, 7, 2, 5, 1),
+             BS_VEC(int16_t, 32)),
+         1));
+    int32_t l_969;
+    int8_t l_1016 = 1;
+    func_31(&l_969, l_1016);
+    __builtin_convertvector((BS_VEC(int32_t, 32)){}, BS_VEC(int16_t, 32)),
+        __builtin_convertvector((BS_VEC(int32_t, 2)){}, BS_VEC(uint8_t, 2));
+    int l_572 = 2558744285;
+    func_31(0, l_572);
+}
+int32_t *func_31(int32_t *, uint64_t p_33)
+{
+    uint64_t LOCAL_CHECKSUM;
+    backsmith_pure_0(
+        __builtin_convertvector((BS_VEC(int32_t, 4)){}, BS_VEC(uint8_t, 4)),
+        20966, p_33);
+    for (uint32_t BS_TEMP_215; BS_TEMP_215;)
+        BS_CHECKSUM_ARR[6] += LOCAL_CHECKSUM;
+}
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 9f5abcbcbf8..9bd9dc7506b 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2163,6 +2163,13 @@  irange::intersect (const vrange &v)
     }
 
   m_kind = VR_RANGE;
+  // Snap subranges if there is a bitmask.  See PR 123319.
+  if (!m_bitmask.unknown_p ())
+    {
+      snap_subranges ();
+      if (undefined_p ())
+	return true;
+    }
   // The range has been altered, so normalize it even if nothing
   // changed in the mask.
   if (!intersect_bitmask (r))
-- 
2.45.0