[x86_64] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.
Commit Message
This patch resolves PR target/105791, a regression that was
accidentally introduced by my workaround to PR tree-optimization/10566
(a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
shouldn't). The latest issue is that by providing a vcond_mask_v1tiv1ti
pattern in sse.md, the backend now calls ix86_expand_sse_movcc with
V1TImode operands, which has a special case for TARGET_XOP to generate
a vpcmov instruction. Unfortunately, there wasn't previously a V1TImode
variant, xop_pcmov_v1ti, so we'd ICE.
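For readers unfamiliar with vpcmov: it is XOP's bitwise conditional move, which picks each result bit from one of two sources according to a selector. The following is only a rough C sketch of that semantics on a single 128-bit element, written with GCC's generic vector extension. The typedef name v1ti and the helper bitwise_select are made up for illustration and do not appear in the patch, and the sketch assumes a target with __int128 support.

```c
/* One 128-bit vector holding a single __int128 element, i.e. what GCC
   calls V1TImode.  The typedef name is illustrative only.  */
typedef unsigned __int128 v1ti __attribute__((__vector_size__ (16)));

/* Hypothetical helper (not part of the patch): for each bit, take the
   bit from IF_TRUE where MASK is 1 and from IF_FALSE where MASK is 0.
   This is the bitwise select that vpcmov computes in one instruction.  */
static v1ti
bitwise_select (v1ti mask, v1ti if_true, v1ti if_false)
{
  return (if_true & mask) | (if_false & ~mask);
}
```

With an all-ones or all-zeros mask, as produced by a vector comparison, this reduces to selecting one whole operand, which is what a vcond_mask expansion needs.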
This is easily fixed by adding V1TImode (and V2TImode) to V_128_256,
which is only used for defining XOP's vpcmov instruction. This in turn
requires V1TI (and V2TI) to be supported by <avxsizesuffix> (though
the use of <avxsizesuffix> in the names xop_pcmov_<mode><avxsizesuffix>
seems unnecessary; the mode makes the name unique).
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures. Ok for mainline?
2022-06-02 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
PR target/105791
* config/i386/sse.md (V_128_256): Add V1TI and V2TI.
(define_mode_attr avxsizesuffix): Add support for V1TI and V2TI.
gcc/testsuite/ChangeLog
PR target/105791
* gcc.target/i386/pr105791.c: New test case.
Thanks in advance. Sorry for the inconvenience/breakage.
Roger
--
Comments
On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch resolves PR target/105791, a regression that was
> accidentally introduced by my workaround to PR tree-optimization/10566
> (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
> shouldn't). The latest issue is that by providing a vcond_mask_v1tiv1ti
> pattern in sse.md, the backend now calls ix86_expand_sse_movcc with
> V1TImode operands, which has a special case for TARGET_XOP to generate
> a vpcmov instruction. Unfortunately, there wasn't previously a V1TImode
> variant, xop_pcmov_v1ti, so we'd ICE.
>
> This is easily fixed by adding V1TImode (and V2TImode) to V_128_256,
> which is only used for defining XOP's vpcmov instruction. This in turn
> requires V1TI (and V2TI) to be supported by <avxsizesuffix> (though
> the use of <avxsizesuffix> in the names xop_pcmov_<mode><avxsizesuffix>
> seems unnecessary; the mode makes the name unique).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures. Ok for mainline?
LGTM.
>
>
> 2022-06-02 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> PR target/105791
> * config/i386/sse.md (V_128_256): Add V1TI and V2TI.
> (define_mode_attr avxsizesuffix): Add support for V1TI and V2TI.
>
> gcc/testsuite/ChangeLog
> PR target/105791
> * gcc.target/i386/pr105791.c: New test case.
>
>
> Thanks in advance. Sorry for the inconvenience/breakage.
> Roger
> --
>
@@ -301,7 +301,8 @@
;; All 128bit and 256bit vector modes
(define_mode_iterator V_128_256
- [V32QI V16QI V16HI V8HI V8SI V4SI V4DI V2DI V16HF V8HF V8SF V4SF V4DF V2DF])
+ [V32QI V16QI V16HI V8HI V8SI V4SI V4DI V2DI V2TI V1TI
+ V16HF V8HF V8SF V4SF V4DF V2DF])
;; All 512bit vector modes
(define_mode_iterator V_512 [V64QI V32HI V16SI V8DI V16SF V8DF])
@@ -897,9 +898,9 @@
(V8HI "sse4_1") (V16HI "avx")])
(define_mode_attr avxsizesuffix
- [(V64QI "512") (V32HI "512") (V16SI "512") (V8DI "512")
- (V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256")
- (V16QI "") (V8HI "") (V4SI "") (V2DI "")
+ [(V64QI "512") (V32HI "512") (V16SI "512") (V8DI "512") (V4TI "512")
+ (V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256") (V2TI "256")
+ (V16QI "") (V8HI "") (V4SI "") (V2DI "") (V1TI "")
(V32HF "512") (V16SF "512") (V8DF "512")
(V16HF "256") (V8SF "256") (V4DF "256")
(V8HF "") (V4SF "") (V2DF "")])
new file mode 100644
@@ -0,0 +1,13 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -mxop" } */
+typedef __int128 __attribute__((__vector_size__ (sizeof (__int128)))) U;
+typedef int __attribute__((__vector_size__ (sizeof (int)))) V;
+
+U u;
+V v;
+
+U
+foo (void)
+{
+ return (0 != __builtin_convertvector (v, U)) <= (0 != u);
+}