rs6000: Replace UNSPECS with ss_plus/us_plus and ss_minus/us_minus

Message ID 20211220082405.2206998-1-luoxhu@linux.ibm.com
State Committed
Commit 460d53f816fe30093653cb22ef0feeb4bddc0895
Headers
Series rs6000: Replace UNSPECS with ss_plus/us_plus and ss_minus/us_minus |

Commit Message

Xionghu Luo Dec. 20, 2021, 8:24 a.m. UTC
  These four UNSPECS seems could be replaced with native RTL, and why
"(set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))"
in the RTL pattern, per ISA of VSCR bit 127(VECTOR Saturation, SAT):

  This bit is sticky; that is, once set to 1 it
  remains set to 1 until it is set to 0 by an
  mtvscr instruction.

The RTL pattern set it to 0 but final ASM doesn't present it?  And why
not use Clobber VSCR_REGNO instead?  Tested pass on P10, OK for master?
Thanks.

gcc/ChangeLog:

	* config/rs6000/altivec.md (altivec_vaddu<VI_char>s): Replace
	UNSPEC_VADDU with us_plus.
	(altivec_vadds<VI_char>s): Replace UNSPEC_VADDS with ss_plus.
	(altivec_vsubu<VI_char>s): Replace UNSPEC_VSUBU with us_minus.
	(altivec_vsubs<VI_char>s): Replace UNSPEC_VSUBS with ss_minus.
	(altivec_abss_<mode>): Likewise.
---
 gcc/config/rs6000/altivec.md | 29 ++++++++++-------------------
 1 file changed, 10 insertions(+), 19 deletions(-)
  

Comments

David Edelsohn Dec. 20, 2021, 4:45 p.m. UTC | #1
On Mon, Dec 20, 2021 at 3:24 AM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>
> These four UNSPECS seems could be replaced with native RTL, and why
> "(set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))"
> in the RTL pattern, per ISA of VSCR bit 127(VECTOR Saturation, SAT):
>
>   This bit is sticky; that is, once set to 1 it
>   remains set to 1 until it is set to 0 by an
>   mtvscr instruction.
>
> The RTL pattern set it to 0 but final ASM doesn't present it?  And why
> not use Clobber VSCR_REGNO instead?

The design came from the early implementation of Altivec:

https://gcc.gnu.org/pipermail/gcc-patches/2002-May/077409.html

If one later checks for saturation (reads VSCR), one needs a
corresponding SET of the value.  It's set in an architecture-specific
manner that isn't described to GCC, but it's set, not just clobbered
and in an undefined state.

The RTL does not describe that VSCR is set to the value 0.  The
(const_int 0) is not the value set.  You can think of the (const_int
0) as a dummy RTL argument to the VSCR UNSPEC.  UNSPEC requires at
least one argument and the pattern doesn't try to express the
argument, so it uses a dummy RTL constant.  It's part of a PARALLEL
and the plus or minus already expresses the data dependency of the
pattern on the input operands.

I'm unsure of the meaning of your question "final ASM doesn't present
it".  The operation on VSCR is implicit and not emitted as an
instruction.  It's in a PARALLEL, which means that the single Altivec
instruction has both effects.  Is that what you were asking?

> Tested pass on P10, OK for master?

This patch is okay.  Thanks for updating the machine description and
for cleaning up the formatting.

> Thanks.
>
> gcc/ChangeLog:
>
>         * config/rs6000/altivec.md (altivec_vaddu<VI_char>s): Replace
>         UNSPEC_VADDU with us_plus.
>         (altivec_vadds<VI_char>s): Replace UNSPEC_VADDS with ss_plus.
>         (altivec_vsubu<VI_char>s): Replace UNSPEC_VSUBU with us_minus.
>         (altivec_vsubs<VI_char>s): Replace UNSPEC_VSUBS with ss_minus.
>         (altivec_abss_<mode>): Likewise.
> ---
>  gcc/config/rs6000/altivec.md | 29 ++++++++++-------------------
>  1 file changed, 10 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index a057218aa28..b2909857c34 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -29,8 +29,6 @@ (define_c_enum "unspec"
>     UNSPEC_VMHADDSHS
>     UNSPEC_VMHRADDSHS
>     UNSPEC_VADDCUW
> -   UNSPEC_VADDU
> -   UNSPEC_VADDS
>     UNSPEC_VAVGU
>     UNSPEC_VAVGS
>     UNSPEC_VMULEUB
> @@ -61,8 +59,6 @@ (define_c_enum "unspec"
>     UNSPEC_VSR
>     UNSPEC_VSRO
>     UNSPEC_VSUBCUW
> -   UNSPEC_VSUBU
> -   UNSPEC_VSUBS
>     UNSPEC_VSUM4UBS
>     UNSPEC_VSUM4S
>     UNSPEC_VSUM2SWS
> @@ -517,9 +513,8 @@ (define_insn "altivec_vaddcuw"
>
>  (define_insn "altivec_vaddu<VI_char>s"
>    [(set (match_operand:VI 0 "register_operand" "=v")
> -        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
> -                   (match_operand:VI 2 "register_operand" "v")]
> -                  UNSPEC_VADDU))
> +        (us_plus:VI (match_operand:VI 1 "register_operand" "v")
> +                   (match_operand:VI 2 "register_operand" "v")))
>     (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
>    "<VI_unit>"
>    "vaddu<VI_char>s %0,%1,%2"
> @@ -527,9 +522,8 @@ (define_insn "altivec_vaddu<VI_char>s"
>
>  (define_insn "altivec_vadds<VI_char>s"
>    [(set (match_operand:VI 0 "register_operand" "=v")
> -        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
> -                    (match_operand:VI 2 "register_operand" "v")]
> -                  UNSPEC_VADDS))
> +        (ss_plus:VI (match_operand:VI 1 "register_operand" "v")
> +                   (match_operand:VI 2 "register_operand" "v")))
>     (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
>    "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
>    "vadds<VI_char>s %0,%1,%2"
> @@ -563,9 +557,8 @@ (define_insn "altivec_vsubcuw"
>
>  (define_insn "altivec_vsubu<VI_char>s"
>    [(set (match_operand:VI 0 "register_operand" "=v")
> -        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
> -                    (match_operand:VI 2 "register_operand" "v")]
> -                  UNSPEC_VSUBU))
> +        (us_minus:VI (match_operand:VI 1 "register_operand" "v")
> +                    (match_operand:VI 2 "register_operand" "v")))
>     (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
>    "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
>    "vsubu<VI_char>s %0,%1,%2"
> @@ -573,9 +566,8 @@ (define_insn "altivec_vsubu<VI_char>s"
>
>  (define_insn "altivec_vsubs<VI_char>s"
>    [(set (match_operand:VI 0 "register_operand" "=v")
> -        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
> -                    (match_operand:VI 2 "register_operand" "v")]
> -                  UNSPEC_VSUBS))
> +        (ss_minus:VI (match_operand:VI 1 "register_operand" "v")
> +                    (match_operand:VI 2 "register_operand" "v")))
>     (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
>    "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
>    "vsubs<VI_char>s %0,%1,%2"
> @@ -3480,9 +3472,8 @@ (define_expand "altivec_absv4sf2"
>  (define_expand "altivec_abss_<mode>"
>    [(set (match_dup 2) (vec_duplicate:VI (const_int 0)))
>     (parallel [(set (match_dup 3)
> -                  (unspec:VI [(match_dup 2)
> -                              (match_operand:VI 1 "register_operand" "v")]
> -                             UNSPEC_VSUBS))
> +                  (ss_minus:VI (match_dup 2)
> +                               (match_operand:VI 1 "register_operand" "v")))
>               (set (reg:SI VSCR_REGNO)
>                    (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))])
>     (set (match_operand:VI 0 "register_operand" "=v")
> --
> 2.27.0
>
  
Segher Boessenkool Dec. 20, 2021, 11:53 p.m. UTC | #2
On Mon, Dec 20, 2021 at 11:45:45AM -0500, David Edelsohn wrote:
> On Mon, Dec 20, 2021 at 3:24 AM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
> > These four UNSPECS seems could be replaced with native RTL, and why
> > "(set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))"
> > in the RTL pattern, per ISA of VSCR bit 127(VECTOR Saturation, SAT):
> >
> >   This bit is sticky; that is, once set to 1 it
> >   remains set to 1 until it is set to 0 by an
> >   mtvscr instruction.
> >
> > The RTL pattern set it to 0 but final ASM doesn't present it?  And why
> > not use Clobber VSCR_REGNO instead?
> 
> The design came from the early implementation of Altivec:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2002-May/077409.html
> 
> If one later checks for saturation (reads VSCR), one needs a
> corresponding SET of the value.  It's set in an architecture-specific
> manner that isn't described to GCC, but it's set, not just clobbered
> and in an undefined state.

Well.  RTL clobber and set do exactly the same thing, except with
clobber it is not specified *what* value is set.  All bits are set, all
bits are defined.  There is no (direct) way in RTL to say
"undetermined".

An RTL clobber would work just fine afaics?

> The RTL does not describe that VSCR is set to the value 0.  The
> (const_int 0) is not the value set.  You can think of the (const_int
> 0) as a dummy RTL argument to the VSCR UNSPEC.  UNSPEC requires at
> least one argument and the pattern doesn't try to express the
> argument, so it uses a dummy RTL constant.

Yup.  Traditionally (pc) was used for this.  Nowadays (const_int 0) is
not really more expensive anymore, and many people find it clearer (but
not in this case it seems :-) ).

> It's part of a PARALLEL
> and the plus or minus already expresses the data dependency of the
> pattern on the input operands.

But they do not describe any dependency on vscr, or output to it.  This
is the same problem we have with fpscr (most FP insns use some of its
fields, most set some, but there is no way to cleanly express that).

Explicit clobbers like this help one side of the issue.  For vscr, other
than the sat bit there is only the nj bit, and we just ignore that :-)

> This patch is okay.  Thanks for updating the machine description and
> for cleaning up the formatting.

x2.  Thanks!


Segher
  
David Edelsohn Dec. 21, 2021, 1:32 a.m. UTC | #3
On Mon, Dec 20, 2021 at 6:55 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Mon, Dec 20, 2021 at 11:45:45AM -0500, David Edelsohn wrote:
> > On Mon, Dec 20, 2021 at 3:24 AM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
> > > These four UNSPECS seems could be replaced with native RTL, and why
> > > "(set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))"
> > > in the RTL pattern, per ISA of VSCR bit 127(VECTOR Saturation, SAT):
> > >
> > >   This bit is sticky; that is, once set to 1 it
> > >   remains set to 1 until it is set to 0 by an
> > >   mtvscr instruction.
> > >
> > > The RTL pattern set it to 0 but final ASM doesn't present it?  And why
> > > not use Clobber VSCR_REGNO instead?
> >
> > The design came from the early implementation of Altivec:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2002-May/077409.html
> >
> > If one later checks for saturation (reads VSCR), one needs a
> > corresponding SET of the value.  It's set in an architecture-specific
> > manner that isn't described to GCC, but it's set, not just clobbered
> > and in an undefined state.
>
> Well.  RTL clobber and set do exactly the same thing, except with
> clobber it is not specified *what* value is set.  All bits are set, all
> bits are defined.  There is no (direct) way in RTL to say
> "undetermined".
>
> An RTL clobber would work just fine afaics?

I don't know about the original intention from Aldy, but if one were
looking at an RTL dump and the code used the saturation bit from VSCR,
it might be confusing to see a CLOBBER instead of a SET.  The SET
documents that VSCR_REGNO is assigned a specific value; GCC doesn't
know about the semantics, but it's not some undefined bit pattern.
CLOBBER implies a trash value or a value that one will not query
later, i.e., one would want to SET the register to a specific value
before using it.

>
> > The RTL does not describe that VSCR is set to the value 0.  The
> > (const_int 0) is not the value set.  You can think of the (const_int
> > 0) as a dummy RTL argument to the VSCR UNSPEC.  UNSPEC requires at
> > least one argument and the pattern doesn't try to express the
> > argument, so it uses a dummy RTL constant.
>
> Yup.  Traditionally (pc) was used for this.  Nowadays (const_int 0) is
> not really more expensive anymore, and many people find it clearer (but
> not in this case it seems :-) ).
>
> > It's part of a PARALLEL
> > and the plus or minus already expresses the data dependency of the
> > pattern on the input operands.
>
> But they do not describe any dependency on vscr, or output to it.  This
> is the same problem we have with fpscr (most FP insns use some of its
> fields, most set some, but there is no way to cleanly express that).

It describes that VSCR_REGNO is set, an output. It doesn't describe
how it is set nor inform the compiler that the value depends on the
input operands in some complicated way unknown to the compiler, but
the compiler cannot do anything useful with the additional
information.

>
> Explicit clobbers like this help one side of the issue.  For vscr, other
> than the sat bit there is only the nj bit, and we just ignore that :-)
>
> > This patch is okay.  Thanks for updating the machine description and
> > for cleaning up the formatting.
>
> x2.  Thanks!
>
>
> Segher
  
Xionghu Luo Dec. 21, 2021, 2:19 a.m. UTC | #4
On 2021/12/21 09:32, David Edelsohn wrote:
> On Mon, Dec 20, 2021 at 6:55 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>>
>> On Mon, Dec 20, 2021 at 11:45:45AM -0500, David Edelsohn wrote:
>>> On Mon, Dec 20, 2021 at 3:24 AM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>>>> These four UNSPECS seems could be replaced with native RTL, and why
>>>> "(set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))"
>>>> in the RTL pattern, per ISA of VSCR bit 127(VECTOR Saturation, SAT):
>>>>
>>>>   This bit is sticky; that is, once set to 1 it
>>>>   remains set to 1 until it is set to 0 by an
>>>>   mtvscr instruction.
>>>>
>>>> The RTL pattern set it to 0 but final ASM doesn't present it?  And why
>>>> not use Clobber VSCR_REGNO instead?
>>>
>>> The design came from the early implementation of Altivec:
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2002-May/077409.html

Thanks, I constructed a testcase for this,

cat vadds.c
#include <stdio.h>
#include <altivec.h>
vector signed int foo1 (vector signed int a, vector signed int c) {
  vector signed int ret = vec_vaddsws (a, c);
  union {vector unsigned short v; unsigned short s[8];} u;
  u.v = vec_mfvscr();
  printf("foo1: vscr == { %d,%d,%d,%d,%d,%d,%d,%d }\n", u.s[0], u.s[1],
u.s[2],
         u.s[3], u.s[4], u.s[5], u.s[6], u.s[7]);
  return a;
}

vector signed int foo2 (vector signed int a, vector signed int c) {
  vector signed int ret = vec_vaddsws (a, c);
  union {vector unsigned short v; unsigned short s[8];} u;
  u.v = vec_mfvscr();
  printf("foo2: vscr == { %d,%d,%d,%d,%d,%d,%d,%d }\n", u.s[0], u.s[1],
u.s[2],
         u.s[3], u.s[4], u.s[5], u.s[6], u.s[7]);
  return ret;
}

int main ()
{
  vector signed int a = {0x12345678, 0x22345678, 0x32345678, 0x42345678};
  vector signed int c = {0x12345678, 0x22345678, 0x32345678, 0x42345678};
  vector signed int b1 = foo1 (a, c);
  vector signed int b2 = foo2 (a, c);
  return b1[0]+b2[0];
}

The output is:

./a.out
foo1: vscr == { 0,0,0,0,0,0,0,0 }
foo2: vscr == { 1,0,0,0,0,0,0,0 }


So is this expected behavior?  Seems doesn't meet with Aldy's
description...  (foo1's temp is optimized away so no vaddsws produced)

" foo()
{
  vector int result;

  result = vec_adds(blah, blah);
  __check_for_saturation__
}

gcc, will optimize away vec_adds() because the result is a local
variable unused later.  then when we check the saturation bit in VSCR,
we get wrong results.

this patch explains to gcc all about VSCR, and adds it as a global
register as well."


>>>
>>> If one later checks for saturation (reads VSCR), one needs a
>>> corresponding SET of the value.  It's set in an architecture-specific
>>> manner that isn't described to GCC, but it's set, not just clobbered
>>> and in an undefined state.
>>
>> Well.  RTL clobber and set do exactly the same thing, except with
>> clobber it is not specified *what* value is set.  All bits are set, all
>> bits are defined.  There is no (direct) way in RTL to say
>> "undetermined".
>>
>> An RTL clobber would work just fine afaics?
> 
> I don't know about the original intention from Aldy, but if one were
> looking at an RTL dump and the code used the saturation bit from VSCR,
> it might be confusing to see a CLOBBER instead of a SET.  The SET
> documents that VSCR_REGNO is assigned a specific value; GCC doesn't
> know about the semantics, but it's not some undefined bit pattern.
> CLOBBER implies a trash value or a value that one will not query
> later, i.e., one would want to SET the register to a specific value
> before using it.
> 
>>
>>> The RTL does not describe that VSCR is set to the value 0.  The
>>> (const_int 0) is not the value set.  You can think of the (const_int
>>> 0) as a dummy RTL argument to the VSCR UNSPEC.  UNSPEC requires at
>>> least one argument and the pattern doesn't try to express the
>>> argument, so it uses a dummy RTL constant.
>>
>> Yup.  Traditionally (pc) was used for this.  Nowadays (const_int 0) is
>> not really more expensive anymore, and many people find it clearer (but
>> not in this case it seems :-) ).
>>
>>> It's part of a PARALLEL
>>> and the plus or minus already expresses the data dependency of the
>>> pattern on the input operands.
>>
>> But they do not describe any dependency on vscr, or output to it.  This
>> is the same problem we have with fpscr (most FP insns use some of its
>> fields, most set some, but there is no way to cleanly express that).
> 
> It describes that VSCR_REGNO is set, an output. It doesn't describe
> how it is set nor inform the compiler that the value depends on the
> input operands in some complicated way unknown to the compiler, but
> the compiler cannot do anything useful with the additional
> information.
> 
>>
>> Explicit clobbers like this help one side of the issue.  For vscr, other
>> than the sat bit there is only the nj bit, and we just ignore that :-)
>>
>>> This patch is okay.  Thanks for updating the machine description and
>>> for cleaning up the formatting.
>>
>> x2.  Thanks!
>>
>>
>> Segher
  
Xionghu Luo Dec. 21, 2021, 2:37 a.m. UTC | #5
On 2021/12/21 10:19, Xionghu Luo via Gcc-patches wrote:
> 
> 
> On 2021/12/21 09:32, David Edelsohn wrote:
>> On Mon, Dec 20, 2021 at 6:55 PM Segher Boessenkool
>> <segher@kernel.crashing.org> wrote:
>>>
>>> On Mon, Dec 20, 2021 at 11:45:45AM -0500, David Edelsohn wrote:
>>>> On Mon, Dec 20, 2021 at 3:24 AM Xionghu Luo <luoxhu@linux.ibm.com> wrote:
>>>>> These four UNSPECS seems could be replaced with native RTL, and why
>>>>> "(set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))"
>>>>> in the RTL pattern, per ISA of VSCR bit 127(VECTOR Saturation, SAT):
>>>>>
>>>>>   This bit is sticky; that is, once set to 1 it
>>>>>   remains set to 1 until it is set to 0 by an
>>>>>   mtvscr instruction.
>>>>>
>>>>> The RTL pattern set it to 0 but final ASM doesn't present it?  And why
>>>>> not use Clobber VSCR_REGNO instead?
>>>>
>>>> The design came from the early implementation of Altivec:
>>>>
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2002-May/077409.html
> 
> Thanks, I constructed a testcase for this,
> 
> cat vadds.c
> #include <stdio.h>
> #include <altivec.h>
> vector signed int foo1 (vector signed int a, vector signed int c) {
>   vector signed int ret = vec_vaddsws (a, c);
>   union {vector unsigned short v; unsigned short s[8];} u;
>   u.v = vec_mfvscr();
>   printf("foo1: vscr == { %d,%d,%d,%d,%d,%d,%d,%d }\n", u.s[0], u.s[1],
> u.s[2],
>          u.s[3], u.s[4], u.s[5], u.s[6], u.s[7]);
>   return a;
> }
> 
> vector signed int foo2 (vector signed int a, vector signed int c) {
>   vector signed int ret = vec_vaddsws (a, c);
>   union {vector unsigned short v; unsigned short s[8];} u;
>   u.v = vec_mfvscr();
>   printf("foo2: vscr == { %d,%d,%d,%d,%d,%d,%d,%d }\n", u.s[0], u.s[1],
> u.s[2],
>          u.s[3], u.s[4], u.s[5], u.s[6], u.s[7]);
>   return ret;
> }
> 
> int main ()
> {
>   vector signed int a = {0x12345678, 0x22345678, 0x32345678, 0x42345678};
>   vector signed int c = {0x12345678, 0x22345678, 0x32345678, 0x42345678};
>   vector signed int b1 = foo1 (a, c);
>   vector signed int b2 = foo2 (a, c);
>   return b1[0]+b2[0];
> }
> 
> The output is:
> 
> ./a.out
> foo1: vscr == { 0,0,0,0,0,0,0,0 }
> foo2: vscr == { 1,0,0,0,0,0,0,0 }
> 
> 
> So is this expected behavior?  Seems doesn't meet with Aldy's
> description...  (foo1's temp is optimized away so no vaddsws produced)


Just realized that foo1's "ret" is optimized way in gimple already, so
no such RTL and vaddsws generated in foo1, only foo2's vaddsws will set
VSCR explicitly.

> 
> " foo()
> {
>   vector int result;
> 
>   result = vec_adds(blah, blah);
>   __check_for_saturation__
> }
> 
> gcc, will optimize away vec_adds() because the result is a local
> variable unused later.  then when we check the saturation bit in VSCR,
> we get wrong results.
> 
> this patch explains to gcc all about VSCR, and adds it as a global
> register as well."
> 
> 
>>>>
>>>> If one later checks for saturation (reads VSCR), one needs a
>>>> corresponding SET of the value.  It's set in an architecture-specific
>>>> manner that isn't described to GCC, but it's set, not just clobbered
>>>> and in an undefined state.
>>>
>>> Well.  RTL clobber and set do exactly the same thing, except with
>>> clobber it is not specified *what* value is set.  All bits are set, all
>>> bits are defined.  There is no (direct) way in RTL to say
>>> "undetermined".
>>>
>>> An RTL clobber would work just fine afaics?
>>
>> I don't know about the original intention from Aldy, but if one were
>> looking at an RTL dump and the code used the saturation bit from VSCR,
>> it might be confusing to see a CLOBBER instead of a SET.  The SET
>> documents that VSCR_REGNO is assigned a specific value; GCC doesn't
>> know about the semantics, but it's not some undefined bit pattern.
>> CLOBBER implies a trash value or a value that one will not query
>> later, i.e., one would want to SET the register to a specific value
>> before using it.

Agree that SET is better than CLOBBER.  Thank you!

>>
>>>
>>>> The RTL does not describe that VSCR is set to the value 0.  The
>>>> (const_int 0) is not the value set.  You can think of the (const_int
>>>> 0) as a dummy RTL argument to the VSCR UNSPEC.  UNSPEC requires at
>>>> least one argument and the pattern doesn't try to express the
>>>> argument, so it uses a dummy RTL constant.
>>>
>>> Yup.  Traditionally (pc) was used for this.  Nowadays (const_int 0) is
>>> not really more expensive anymore, and many people find it clearer (but
>>> not in this case it seems :-) ).
>>>
>>>> It's part of a PARALLEL
>>>> and the plus or minus already expresses the data dependency of the
>>>> pattern on the input operands.
>>>
>>> But they do not describe any dependency on vscr, or output to it.  This
>>> is the same problem we have with fpscr (most FP insns use some of its
>>> fields, most set some, but there is no way to cleanly express that).
>>
>> It describes that VSCR_REGNO is set, an output. It doesn't describe
>> how it is set nor inform the compiler that the value depends on the
>> input operands in some complicated way unknown to the compiler, but
>> the compiler cannot do anything useful with the additional
>> information.
>>
>>>
>>> Explicit clobbers like this help one side of the issue.  For vscr, other
>>> than the sat bit there is only the nj bit, and we just ignore that :-)
>>>
>>>> This patch is okay.  Thanks for updating the machine description and
>>>> for cleaning up the formatting.
>>>
>>> x2.  Thanks!
>>>
>>>
>>> Segher
>
  
Xionghu Luo Dec. 21, 2021, 3:54 a.m. UTC | #6
On 2021/12/21 09:32, David Edelsohn wrote:
> Explicit clobbers like this help one side of the issue.  For vscr, other
> than the sat bit there is only the nj bit, and we just ignore that :-)
> 
>> This patch is okay.  Thanks for updating the machine description and
>> for cleaning up the formatting.
> x2.  Thanks!


Committed to r12-6084.
  

Patch

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index a057218aa28..b2909857c34 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -29,8 +29,6 @@  (define_c_enum "unspec"
    UNSPEC_VMHADDSHS
    UNSPEC_VMHRADDSHS
    UNSPEC_VADDCUW
-   UNSPEC_VADDU
-   UNSPEC_VADDS
    UNSPEC_VAVGU
    UNSPEC_VAVGS
    UNSPEC_VMULEUB
@@ -61,8 +59,6 @@  (define_c_enum "unspec"
    UNSPEC_VSR
    UNSPEC_VSRO
    UNSPEC_VSUBCUW
-   UNSPEC_VSUBU
-   UNSPEC_VSUBS
    UNSPEC_VSUM4UBS
    UNSPEC_VSUM4S
    UNSPEC_VSUM2SWS
@@ -517,9 +513,8 @@  (define_insn "altivec_vaddcuw"
 
 (define_insn "altivec_vaddu<VI_char>s"
   [(set (match_operand:VI 0 "register_operand" "=v")
-        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
-		    (match_operand:VI 2 "register_operand" "v")]
-		   UNSPEC_VADDU))
+        (us_plus:VI (match_operand:VI 1 "register_operand" "v")
+		    (match_operand:VI 2 "register_operand" "v")))
    (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "<VI_unit>"
   "vaddu<VI_char>s %0,%1,%2"
@@ -527,9 +522,8 @@  (define_insn "altivec_vaddu<VI_char>s"
 
 (define_insn "altivec_vadds<VI_char>s"
   [(set (match_operand:VI 0 "register_operand" "=v")
-        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
-                    (match_operand:VI 2 "register_operand" "v")]
-		   UNSPEC_VADDS))
+        (ss_plus:VI (match_operand:VI 1 "register_operand" "v")
+		    (match_operand:VI 2 "register_operand" "v")))
    (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "vadds<VI_char>s %0,%1,%2"
@@ -563,9 +557,8 @@  (define_insn "altivec_vsubcuw"
 
 (define_insn "altivec_vsubu<VI_char>s"
   [(set (match_operand:VI 0 "register_operand" "=v")
-        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
-                    (match_operand:VI 2 "register_operand" "v")]
-		   UNSPEC_VSUBU))
+        (us_minus:VI (match_operand:VI 1 "register_operand" "v")
+		     (match_operand:VI 2 "register_operand" "v")))
    (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "vsubu<VI_char>s %0,%1,%2"
@@ -573,9 +566,8 @@  (define_insn "altivec_vsubu<VI_char>s"
 
 (define_insn "altivec_vsubs<VI_char>s"
   [(set (match_operand:VI 0 "register_operand" "=v")
-        (unspec:VI [(match_operand:VI 1 "register_operand" "v")
-                    (match_operand:VI 2 "register_operand" "v")]
-		   UNSPEC_VSUBS))
+        (ss_minus:VI (match_operand:VI 1 "register_operand" "v")
+		     (match_operand:VI 2 "register_operand" "v")))
    (set (reg:SI VSCR_REGNO) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "vsubs<VI_char>s %0,%1,%2"
@@ -3480,9 +3472,8 @@  (define_expand "altivec_absv4sf2"
 (define_expand "altivec_abss_<mode>"
   [(set (match_dup 2) (vec_duplicate:VI (const_int 0)))
    (parallel [(set (match_dup 3)
-		   (unspec:VI [(match_dup 2)
-			       (match_operand:VI 1 "register_operand" "v")]
-			      UNSPEC_VSUBS))
+		   (ss_minus:VI (match_dup 2)
+				(match_operand:VI 1 "register_operand" "v")))
 	      (set (reg:SI VSCR_REGNO)
 		   (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))])
    (set (match_operand:VI 0 "register_operand" "=v")