[4/4] Allow vsx_extract_<mode> to use Altivec registers, PR target/99293

Message ID YkHiR3hZQ2LCnEpE@toto.the-meissners.org
State Committed
Commit 9f9ccc4a5788fc6afbb4fb2d56ad20dde28f0de5
Series Optimize vec_splats of vec_extract, PR target/99293

Commit Message

Michael Meissner March 28, 2022, 4:28 p.m. UTC
  Allow vsx_extract_<mode> to use Altivec registers, PR target/99293

In looking at PR target/99293, I noticed that the vsx_extract_<mode>
pattern for V2DImode and V2DFmode only allowed traditional floating point
registers, and it did not allow Altivec registers.  The original code was
written a few years ago when we used the old register allocator, and
support for scalar floating point in Altivec registers was just being
added to GCC.
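
As a rough illustration of the kind of source that reaches this pattern, here is a minimal sketch. It is not part of the patch or the GCC testsuite: the function name `extract_and_scale` is hypothetical, and the sketch assumes a VSX-enabled target where a constant-index extract from a two-element double vector is matched by vsx_extract_v2df. It uses GCC's generic vector extension so that it also compiles on non-PowerPC hosts.

```c
/* Hypothetical sketch, not taken from the patch or the GCC testsuite.  */

/* Two doubles in one 16-byte vector (V2DFmode on a VSX target),
   declared with GCC's generic vector extension.  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Extract element 1 with a constant index and use it in scalar
   arithmetic.  Assumption: on a VSX target the extract is matched by
   vsx_extract_v2df, and the scalar result then needs a register that
   the floating point multiply can read.  */
double
extract_and_scale (v2df v, double scale)
{
  return v[1] * scale;
}
```

One way to see the effect would be to compile such a function with something like `-O2 -mcpu=power9` before and after the patch and compare the assembly; which registers the allocator actually picks depends on the surrounding code.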

I have built the spec 2017 benchmark suite with all 4 patches in this
series applied, and compared it to the build with the previous 3 patches
applied.  In addition to the changes from the previous 3 patches, this
patch now changes the code for the following 3 benchmarks (2 floating
point, 1 integer):

	bwaves_r, fotonik3d_r, xalancbmk_r

I have built bootstrap versions on the following systems.  There were no
regressions in the runs:

	Power9 little endian, --with-cpu=power9
	Power10 little endian, --with-cpu=power10
	Power8 big endian, --with-cpu=power8 (both 32-bit & 64-bit tests)

Can I install this into the trunk?  After a burn-in period, can I backport
and install this into GCC 11 and GCC 10 branches?

2022-03-28   Michael Meissner  <meissner@linux.ibm.com>

gcc/
	PR target/99293
	* config/rs6000/vsx.md (vsx_extract_<mode>): Allow destination
	to be an Altivec register.
---
 gcc/config/rs6000/vsx.md | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)
  

Comments

Segher Boessenkool March 28, 2022, 11:59 p.m. UTC | #1
On Mon, Mar 28, 2022 at 12:28:55PM -0400, Michael Meissner wrote:
> In looking at PR target/99293, I noticed that the vsx_extract_<mode>
> pattern for V2DImode and V2DFmode only allowed traditional floating point
> registers, and it did not allow Altivec registers.  The original code was
> written a few years ago when we used the old register allocator, and
> support for scalar floating point in Altivec registers was just being
> added to GCC.

vsx_extract_<mode> is from 2009...  How time flies :-)

This comment is from 2016 though.  Still before LRA was default for us
of course ;-)

It would have been nice if we had a testcase for this breakage, so that
we could now be confident it really has been fixed.  But the "reload"
here likely means "old reload", so okay.

> 	PR target/99293

It has essentially nothing to do with that PR, right?  Or I just do not
see it, always a possibility of course.

> 	* config/rs6000/vsx.md (vsx_extract_<mode>): Allow destination
> 	to be an Altivec register.

... to be any VSX register.
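
The distinction between the two wordings can be sketched with a small model of the register file. This is an illustration only, not GCC's implementation: the helper names are hypothetical, and it assumes the usual VSX layout in which VSR0-31 overlay the traditional FPRs and VSR32-63 overlay the Altivec registers. Under that layout the old "d" constraint covers only the lower half, while "wa" covers both halves.

```c
/* Illustrative model only; these helpers are hypothetical and are not
   how GCC represents register classes.  */
#include <stdbool.h>

/* Assumed VSX layout: VSR0-31 overlay the FPRs, VSR32-63 overlay the
   Altivec registers.  */
static bool
is_fpr (int vsr)
{
  return vsr >= 0 && vsr <= 31;
}

static bool
is_altivec (int vsr)
{
  return vsr >= 32 && vsr <= 63;
}

/* Old destination constraint "d": traditional floating point
   registers only.  */
bool
constraint_d_allows (int vsr)
{
  return is_fpr (vsr);
}

/* New destination constraint "wa": any VSX register, which includes
   both the FPR half and the Altivec half.  */
bool
constraint_wa_allows (int vsr)
{
  return is_fpr (vsr) || is_altivec (vsr);
}
```

In this model, register 40 is rejected by `constraint_d_allows` but accepted by `constraint_wa_allows`, and FPRs remain acceptable under both, which is why "any VSX register" describes the new constraint more accurately than "an Altivec register".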

Okay for trunk with those things fixed.  Thanks!


Segher
  
Michael Meissner March 29, 2022, 5:26 p.m. UTC | #2
On Mon, Mar 28, 2022 at 06:59:14PM -0500, Segher Boessenkool wrote:
> On Mon, Mar 28, 2022 at 12:28:55PM -0400, Michael Meissner wrote:
> > In looking at PR target/99293, I noticed that the vsx_extract_<mode>
> > pattern for V2DImode and V2DFmode only allowed traditional floating point
> > registers, and it did not allow Altivec registers.  The original code was
> > written a few years ago when we used the old register allocator, and
> > support for scalar floating point in Altivec registers was just being
> > added to GCC.
> 
> vsx_extract_<mode> is from 2009...  How time flies :-)
> 
> This comment is from 2016 though.  Still before LRA was default for us
> of course ;-)

The support for scalars in Altivec registers wasn't really done until the 2016
time frame.  At the time I had tried to use VSX registers for this, but I could
never get a reproducible case for the failure, other than one spec benchmark not
building with some flags (most likely spec 2017's 521.wrf_r or spec 2006's
481.wrf).  So I opted to just keep it limited to traditional FPR registers, and
maybe fix it some time later.

> It would have been nice if we had a testcase for this breakage, so that
> we could now be confident it really has been fixed.  But the "reload"
> here likely means "old reload", so okay.

Yes, it was the old reload.

> > 	PR target/99293
> 
> It has essentially nothing to do with that PR, right?  Or I just do not
> see it, always a possibility of course.

It was just that I noticed the change in looking at PR target/99293.  I did
remove the reference from the checkin commit.

> > 	* config/rs6000/vsx.md (vsx_extract_<mode>): Allow destination
> > 	to be an Altivec register.
> 
> ... to be any VSX register.

Thanks.

> Okay for trunk with those things fixed.  Thanks!

Done.
  

Patch

diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2a23807c2dc..d30fd4f2596 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3397,15 +3397,12 @@  (define_expand "vsx_set_<mode>"
 ;; Optimize cases were we can do a simple or direct move.
 ;; Or see if we can avoid doing the move at all
 
-;; There are some unresolved problems with reload that show up if an Altivec
-;; register was picked.  Limit the scalar value to FPRs for now.
-
 (define_insn "vsx_extract_<mode>"
-  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=d, d,  wr, wr")
+  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand" "=wa, wa, wr, wr")
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "gpc_reg_operand"      "wa, wa, wa, wa")
+	 (match_operand:VSX_D 1 "gpc_reg_operand"       "wa, wa, wa, wa")
 	 (parallel
-	  [(match_operand:QI 2 "const_0_to_1_operand"  "wD, n,  wD, n")])))]
+	  [(match_operand:QI 2 "const_0_to_1_operand"   "wD, n,  wD, n")])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
   int element = INTVAL (operands[2]);