[3/4] Make vsx_extract_<mode> use correct insn attributes, PR target 99293.

Message ID YkHiFLLuHIS4R59K@toto.the-meissners.org
State New
Headers
Series Optimize vec_splats of vec_extract, PR target/99293 |

Commit Message

Michael Meissner March 28, 2022, 4:28 p.m. UTC
  Make vsx_extract_<mode> use correct insn attributes, PR target 99293.

In looking at PR target/99293, I noticed that the insn "type" attribute is
incorrect for the vsx_extract_<mode> insns.  In particular:

    1)	Simple vector register move should be vecmove (alternative 1);
    2)	Xxpermdi should be vecperm (alternative 2); (and)
    3)	Mfvsrld should be mfvsr (alternative 4).

This patch fixes those attributes.

I have built the spec 2017 benchmark with this patch (#3) and the previous
patch (#2), along with the first patch in the series for power9 and
power10 targets.  Most of the floating point benchmarks changed code
slightly, due to the scheduling changes that came from changing the insn
type attribute.  I ran the spec 2017 suite on power9, and I did not not
notice any significant changes from these changes.

The power9 benchmarks that had code differences with these 2 patches
applied in addition to the build with just the first patch applied were:

	namd_r, pareset_r, povray_r, wrf_r, blender_r, cam4_r,
	deepsjeng_r, imagick_r, roms_r

The power9 benchmarks that had code differences with these 2 patches
applied in addition to the build with just the first patch applied were
(cactuBSSN_r had changes for power10 but not power9):

	cactuBSSN_r, namd_r, pareset_r, povray_r, wrf_r, blender_r,
	cam4_r, deepsjeng_r, imagick_r, nab_r, roms_r

I have built bootstrap versions on the following systems.  There were no
regressions in the runs:

	Power9 little endian, --with-cpu=power9
	Power10 little endian, --with-cpu=power10
	Power8 big endian, --with-cpu=power8 (both 32-bit & 64-bit tests)

Can I install this into the trunk?  After a burn-in period, can I backport
and install this into GCC 11 and GCC 10 branches?

2022-03-28   Michael Meissner  <meissner@linux.ibm.com>

gcc/
	PR target/99293
	* config/rs6000/rs6000.md (vsx_extract_<mode>): Use the correct
	insn type for the alternatives.
---
 gcc/config/rs6000/vsx.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Segher Boessenkool March 28, 2022, 10:06 p.m. UTC | #1
Hi!

On Mon, Mar 28, 2022 at 12:28:04PM -0400, Michael Meissner wrote:
> In looking at PR target/99293, I noticed that the insn "type" attribute is
> incorrect for the vsx_extract_<mode> insns.  In particular:
> 
>     1)	Simple vector register move should be vecmove (alternative 1);
>     2)	Xxpermdi should be vecperm (alternative 2); (and)
>     3)	Mfvsrld should be mfvsr (alternative 4).
> 
> This patch fixes those attributes.

But the code does not correspond well to the alternatives, even worse
for BE.  It would be much clearer (and even possibly correct!) if it
would just use the alternative # in the code, instead of using twenty
different conditions.  There are some important cases that have no
alternative right now, like, when op 1 is the same as op 0: it should
have the constraint "0" for op 1 then, and have cost 0.  The register
allocator will then hopefully try to make that happen.

> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -3451,7 +3451,7 @@ (define_insn "vsx_extract_<mode>"
>    else
>      gcc_unreachable ();
>  }
> -  [(set_attr "type" "veclogical,mfvsr,mfvsr,vecperm")
> +  [(set_attr "type" "vecmove,vecperm,mfvsr,mfvsr")
>     (set_attr "isa" "*,*,p8v,p9v")])

The generated code is one of
  no-op
  mfvsrd
  fmr
  xxlor
  mfvsrld
  xxpermdi

Which of the 4 alts are meant to correspond to which of those six
possible generated pieces of code?


Segher
  
Michael Meissner March 30, 2022, 10:58 p.m. UTC | #2
On Mon, Mar 28, 2022 at 05:06:00PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Mar 28, 2022 at 12:28:04PM -0400, Michael Meissner wrote:
> > In looking at PR target/99293, I noticed that the insn "type" attribute is
> > incorrect for the vsx_extract_<mode> insns.  In particular:
> > 
> >     1)	Simple vector register move should be vecmove (alternative 1);
> >     2)	Xxpermdi should be vecperm (alternative 2); (and)
> >     3)	Mfvsrld should be mfvsr (alternative 4).
> > 
> > This patch fixes those attributes.
> 
> But the code does not correspond well to the alternatives, even worse
> for BE.  It would be much clearer (and even possibly correct!) if it
> would just use the alternative # in the code, instead of using twenty
> different conditions.  There are some important cases that have no
> alternative right now, like, when op 1 is the same as op 0: it should
> have the constraint "0" for op 1 then, and have cost 0.  The register
> allocator will then hopefully try to make that happen.
> 
> > --- a/gcc/config/rs6000/vsx.md
> > +++ b/gcc/config/rs6000/vsx.md
> > @@ -3451,7 +3451,7 @@ (define_insn "vsx_extract_<mode>"
> >    else
> >      gcc_unreachable ();
> >  }
> > -  [(set_attr "type" "veclogical,mfvsr,mfvsr,vecperm")
> > +  [(set_attr "type" "vecmove,vecperm,mfvsr,mfvsr")
> >     (set_attr "isa" "*,*,p8v,p9v")])
> 
> The generated code is one of
>   no-op
>   mfvsrd
>   fmr
>   xxlor
>   mfvsrld
>   xxpermdi
> 
> Which of the 4 alts are meant to correspond to which of those six
> possible generated pieces of code?

The mop comment, fmr, and xxlor are all alternative 1 (target is a VSX
register, element number is which double-word that that is in the upper 64-bits
of the register).

The xxpermdi is alternative 2 (target is a VSX register, elment number is which
double-word that is in the lower 64-bits of the register).

The mfvsrd is alternative 3 (target is a GPR register, element number is which
double-word that is in the upper 64-bits of the register).

The mfvsrld is alternative 4 (target is a GPR register, elment number is which
double-word that is in the lower 64-bits of the register).

In terms of this patch, I had originally rewritten the insn as a
define_insn_and_split, and changed the move case into just a move to let the
later passes just delete moves to the same register.  However, it caused some
unrelated errors (in the gcc.dg/tree-ssa/builtin-s*printf tests), so I backed
out the code to use split to create moves.

But it looks like alternative 2 and alternative 4 have the insn attribute
"type" mixed up (alternative 2 should be vecperm, alternative 4 should be move
from vsr).
  

Patch

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index ad722cff70f..2a23807c2dc 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3451,7 +3451,7 @@  (define_insn "vsx_extract_<mode>"
   else
     gcc_unreachable ();
 }
-  [(set_attr "type" "veclogical,mfvsr,mfvsr,vecperm")
+  [(set_attr "type" "vecmove,vecperm,mfvsr,mfvsr")
    (set_attr "isa" "*,*,p8v,p9v")])
 
 ;; Optimize extracting a single scalar element from memory.