[v2] Do not emulate vectors containing floats.
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gcc_build--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gcc_check--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 |
success
|
Testing passed
|
Commit Message
The emulation via word mode tries to perform integer arithmetic on floating
point values instead of floating point arithmetic. This leads to
mis-compilations.
Failure occured on s390x on these existing test cases:
gcc.dg/vect/tsvc/vect-tsvc-s112.c
gcc.dg/vect/tsvc/vect-tsvc-s113.c
gcc.dg/vect/tsvc/vect-tsvc-s119.c
gcc.dg/vect/tsvc/vect-tsvc-s121.c
gcc.dg/vect/tsvc/vect-tsvc-s131.c
gcc.dg/vect/tsvc/vect-tsvc-s132.c
gcc.dg/vect/tsvc/vect-tsvc-s2233.c
gcc.dg/vect/tsvc/vect-tsvc-s421.c
gcc.dg/vect/vect-alias-check-14.c
gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
gcc/ChangeLog:
* tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
point vectors
Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
---
gcc/tree-vect-stmts.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Comments
Juergen Christ <jchrist@linux.ibm.com> writes:
> The emulation via word mode tries to perform integer arithmetic on floating
> point values instead of floating point arithmetic. This leads to
> mis-compilations.
Is the bug ref + test missing?
>
> Failure occured on s390x on these existing test cases:
> gcc.dg/vect/tsvc/vect-tsvc-s112.c
> gcc.dg/vect/tsvc/vect-tsvc-s113.c
> gcc.dg/vect/tsvc/vect-tsvc-s119.c
> gcc.dg/vect/tsvc/vect-tsvc-s121.c
> gcc.dg/vect/tsvc/vect-tsvc-s131.c
> gcc.dg/vect/tsvc/vect-tsvc-s132.c
> gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> gcc.dg/vect/tsvc/vect-tsvc-s421.c
> gcc.dg/vect/vect-alias-check-14.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
>
> gcc/ChangeLog:
>
> * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> point vectors
>
> Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> ---
> gcc/tree-vect-stmts.cc | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 09749ae38174..f95ff2c2aa34 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> those through even when the mode isn't word_mode. For
> ops we have to lower the lowering code assumes we are
> dealing with word_mode. */
> - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> || !target_support_p)
> && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> /* Check only during analysis. */
Am Fri, Feb 23, 2024 at 01:57:12PM +0000 schrieb Sam James:
>
> Juergen Christ <jchrist@linux.ibm.com> writes:
>
> > The emulation via word mode tries to perform integer arithmetic on floating
> > point values instead of floating point arithmetic. This leads to
> > mis-compilations.
>
> Is the bug ref + test missing?
Sorry, forgot to add the "bootstrapped and tested on s390x and x86_64".
Not sure how to reference a bugzilla here. There is 114075 that
should be solved with this, too.
> >
> > Failure occured on s390x on these existing test cases:
> > gcc.dg/vect/tsvc/vect-tsvc-s112.c
> > gcc.dg/vect/tsvc/vect-tsvc-s113.c
> > gcc.dg/vect/tsvc/vect-tsvc-s119.c
> > gcc.dg/vect/tsvc/vect-tsvc-s121.c
> > gcc.dg/vect/tsvc/vect-tsvc-s131.c
> > gcc.dg/vect/tsvc/vect-tsvc-s132.c
> > gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> > gcc.dg/vect/tsvc/vect-tsvc-s421.c
> > gcc.dg/vect/vect-alias-check-14.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> > point vectors
> >
> > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> > ---
> > gcc/tree-vect-stmts.cc | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 09749ae38174..f95ff2c2aa34 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > those through even when the mode isn't word_mode. For
> > ops we have to lower the lowering code assumes we are
> > dealing with word_mode. */
> > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > || !target_support_p)
> > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > /* Check only during analysis. */
>
On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote:
> The emulation via word mode tries to perform integer arithmetic on floating
> point values instead of floating point arithmetic. This leads to
> mis-compilations.
>
> Failure occured on s390x on these existing test cases:
> gcc.dg/vect/tsvc/vect-tsvc-s112.c
> gcc.dg/vect/tsvc/vect-tsvc-s113.c
> gcc.dg/vect/tsvc/vect-tsvc-s119.c
> gcc.dg/vect/tsvc/vect-tsvc-s121.c
> gcc.dg/vect/tsvc/vect-tsvc-s131.c
> gcc.dg/vect/tsvc/vect-tsvc-s132.c
> gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> gcc.dg/vect/tsvc/vect-tsvc-s421.c
> gcc.dg/vect/vect-alias-check-14.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
>
> gcc/ChangeLog:
>
Please add
PR tree-optimization/114075
above the * tree-vect-stmts line.
> * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> point vectors
This line should be tab indented like the first one, and end with .
And given what the patch does, perhaps say non-integral instead of floating
point.
As for testcase, I'll handle it separately, given that it already
fixes some pre-existing tests.
> Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> ---
> gcc/tree-vect-stmts.cc | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index 09749ae38174..f95ff2c2aa34 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> those through even when the mode isn't word_mode. For
> ops we have to lower the lowering code assumes we are
> dealing with word_mode. */
> - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> || !target_support_p)
> && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> /* Check only during analysis. */
LGTM, but please wait until Monday evening so that Richi or Richard
have a chance to chime in.
Jakub
On Fri, 23 Feb 2024, Jakub Jelinek wrote:
> On Fri, Feb 23, 2024 at 02:43:45PM +0100, Juergen Christ wrote:
> > The emulation via word mode tries to perform integer arithmetic on floating
> > point values instead of floating point arithmetic. This leads to
> > mis-compilations.
> >
> > Failure occured on s390x on these existing test cases:
> > gcc.dg/vect/tsvc/vect-tsvc-s112.c
> > gcc.dg/vect/tsvc/vect-tsvc-s113.c
> > gcc.dg/vect/tsvc/vect-tsvc-s119.c
> > gcc.dg/vect/tsvc/vect-tsvc-s121.c
> > gcc.dg/vect/tsvc/vect-tsvc-s131.c
> > gcc.dg/vect/tsvc/vect-tsvc-s132.c
> > gcc.dg/vect/tsvc/vect-tsvc-s2233.c
> > gcc.dg/vect/tsvc/vect-tsvc-s421.c
> > gcc.dg/vect/vect-alias-check-14.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-1.c
> > gcc.target/s390/vector/partial/s390-vec-length-epil-run-3.c
> > gcc.target/s390/vector/partial/s390-vec-length-full-run-3.c
> >
> > gcc/ChangeLog:
> >
>
> Please add
> PR tree-optimization/114075
> above the * tree-vect-stmts line.
> > * tree-vect-stmts.cc (vectorizable_operation): Don't emulate floating
> > point vectors
>
> This line should be tab indented like the first one, and end with .
> And given what the patch does, perhaps say non-integral instead of floating
> point.
>
> As for testcase, I'll handle it separately, given that it already
> fixes some pre-existing tests.
>
> > Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
> > ---
> > gcc/tree-vect-stmts.cc | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index 09749ae38174..f95ff2c2aa34 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > those through even when the mode isn't word_mode. For
> > ops we have to lower the lowering code assumes we are
> > dealing with word_mode. */
> > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > || !target_support_p)
> > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > /* Check only during analysis. */
I think it will work fine. Even after the last TLC this feels like in
the need of more TLC ;)
So OK. Also for affected branches - the effective check should be the
same in GCC 13 at least, but with some added ad-hoc costing which might
make this not trigger (maybe_lt (nunits_out, 4U)) - so we'd need a
word_mode that can cover 4 FP elements. Possibly triggerable with
HFmode?
Thanks,
Richard.
> LGTM, but please wait until Monday evening so that Richi or Richard
> have a chance to chime in.
>
> Jakub
>
>
On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > those through even when the mode isn't word_mode. For
> > > ops we have to lower the lowering code assumes we are
> > > dealing with word_mode. */
> > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > || !target_support_p)
> > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > /* Check only during analysis. */
>
> I think it will work fine. Even after the last TLC this feels like in
> the need of more TLC ;)
Note, at least in theory, floating point NEGATE_EXPR could be handled just
fine in the emulated vectors, just xor the sign bit, and
BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
floating point modes (though e.g. in RTL they are used even for them),
it is mainly PLUS_EXPR/MINUS_EXPR. Perhaps the NEGATE_EXPR isn't worth
it though. In any case, that wouldn't be stage4 material.
Jakub
On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > those through even when the mode isn't word_mode. For
> > > > ops we have to lower the lowering code assumes we are
> > > > dealing with word_mode. */
> > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > || !target_support_p)
> > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > /* Check only during analysis. */
> >
> > I think it will work fine. Even after the last TLC this feels like in
> > the need of more TLC ;)
>
> Note, at least in theory, floating point NEGATE_EXPR could be handled just
> fine in the emulated vectors, just xor the sign bit, and
> BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> floating point modes (though e.g. in RTL they are used even for them),
Hmm, I think vector types not supported only ever get integer modes
assigned, not FP modes so the operations would be on integer modes.
Unless I'm missing the point ...
> it is mainly PLUS_EXPR/MINUS_EXPR. Perhaps the NEGATE_EXPR isn't worth
> it though. In any case, that wouldn't be stage4 material.
Agreed.
Richard.
On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote:
> On Mon, 26 Feb 2024, Jakub Jelinek wrote:
>
> > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > > those through even when the mode isn't word_mode. For
> > > > > ops we have to lower the lowering code assumes we are
> > > > > dealing with word_mode. */
> > > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > || !target_support_p)
> > > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > > /* Check only during analysis. */
> > >
> > > I think it will work fine. Even after the last TLC this feels like in
> > > the need of more TLC ;)
> >
> > Note, at least in theory, floating point NEGATE_EXPR could be handled just
> > fine in the emulated vectors, just xor the sign bit, and
> > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> > floating point modes (though e.g. in RTL they are used even for them),
>
> Hmm, I think vector types not supported only ever get integer modes
> assigned, not FP modes so the operations would be on integer modes.
I mean one could still SLP vectorize using the emulated vectors
float a[2], b[2];
...
a[0] = -b[0];
a[1] = -b[1];
as effectively
*(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL;
if we handled that case later on (except the above !INTEGRAL_TYPE_P check
would prevent that).
As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't
disallow
float f, g;
f_3 |= g_4;
etc. (but the FEs do), though maybe we'd ICE on some targets during
expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs.
Emulation using integer mode would be well defined.
Jakub
On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> On Mon, Feb 26, 2024 at 09:53:41AM +0100, Richard Biener wrote:
> > On Mon, 26 Feb 2024, Jakub Jelinek wrote:
> >
> > > On Mon, Feb 26, 2024 at 09:00:58AM +0100, Richard Biener wrote:
> > > > > > @@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
> > > > > > those through even when the mode isn't word_mode. For
> > > > > > ops we have to lower the lowering code assumes we are
> > > > > > dealing with word_mode. */
> > > > > > - if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > > + if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
> > > > > > + || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
> > > > > > || !target_support_p)
> > > > > > && maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
> > > > > > /* Check only during analysis. */
> > > >
> > > > I think it will work fine. Even after the last TLC this feels like in
> > > > the need of more TLC ;)
> > >
> > > Note, at least in theory, floating point NEGATE_EXPR could be handled just
> > > fine in the emulated vectors, just xor the sign bit, and
> > > BIT_{AND,IOR,XOR}_EXPR also would work but likely aren't valid IL on
> > > floating point modes (though e.g. in RTL they are used even for them),
> >
> > Hmm, I think vector types not supported only ever get integer modes
> > assigned, not FP modes so the operations would be on integer modes.
>
> I mean one could still SLP vectorize using the emulated vectors
> float a[2], b[2];
> ...
> a[0] = -b[0];
> a[1] = -b[1];
> as effectively
> *(unsigned long long *)a = *(unsigned long long *)b ^ 0x8000000080000000ULL;
> if we handled that case later on (except the above !INTEGRAL_TYPE_P check
> would prevent that).
Yep.
> As for BIT_{AND,IOR,XOR}_EXPR, seems the GIMPLE verifiers actually don't
> disallow
> float f, g;
> f_3 |= g_4;
> etc. (but the FEs do), though maybe we'd ICE on some targets during
> expansion; i386.md has {and,ior,xor}{s,d,x}f3 optabs.
I think that's an omission in the verifier (we allow the ops on
pointers though).
Richard.
@@ -6756,7 +6756,8 @@ vectorizable_operation (vec_info *vinfo,
those through even when the mode isn't word_mode. For
ops we have to lower the lowering code assumes we are
dealing with word_mode. */
- if ((((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
+ if (!INTEGRAL_TYPE_P (TREE_TYPE (vectype))
+ || (((code == PLUS_EXPR || code == MINUS_EXPR || code == NEGATE_EXPR)
|| !target_support_p)
&& maybe_ne (GET_MODE_SIZE (vec_mode), UNITS_PER_WORD))
/* Check only during analysis. */