[0/4] Optimize vec_splats of vec_extract, PR target/99293

Message ID YkHhL5rL39UoKIHC@toto.the-meissners.org
Michael Meissner March 28, 2022, 4:24 p.m. UTC
The following 4 patches fix PR target/99293.  The bug report points out that on
power9 and power10:

	vector long long v, v0, v1;
	// ...
	v0 = __builtin_vec_splats (__builtin_vec_extract (v, 0));
	v1 = __builtin_vec_splats (__builtin_vec_extract (v, 1));

generates move-from-vector-register and move-to-vector-register instructions
(a round trip through the GPRs) instead of keeping the data within the vector
registers.

The first patch adds combiner patterns to match this case and generate a
single xxpermdi instruction, instead of two instructions (the extract and then
the splat).
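
To illustrate why a single xxpermdi is enough, here is a small portable C model
of the instruction.  This is only a sketch for discussion, not the pattern from
the patch: the type v2di_model and the helpers model_xxpermdi and
model_splat_dw are hypothetical names, and doublewords are indexed in the ISA's
big-endian numbering.  On little endian the element numbers seen by
vec_extract are reversed relative to that numbering, so which DM value
corresponds to element 0 depends on endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Portable model of a 128-bit VSX register viewed as two 64-bit
   doublewords.  dw[0] is doubleword 0 in the ISA (big-endian) numbering.  */
typedef struct { int64_t dw[2]; } v2di_model;

/* Model of "xxpermdi XT,XA,XB,DM": the high bit of DM selects which
   doubleword of XA becomes XT[0], the low bit selects which doubleword
   of XB becomes XT[1].  */
static v2di_model
model_xxpermdi (v2di_model xa, v2di_model xb, unsigned dm)
{
  v2di_model xt;

  assert (dm <= 3);
  xt.dw[0] = xa.dw[(dm >> 1) & 1];
  xt.dw[1] = xb.dw[dm & 1];
  return xt;
}

/* Hypothetical helper: splat doubleword N (ISA numbering) of V by using
   the same register for both inputs.  DM = 0 copies doubleword 0 into
   both halves, DM = 3 copies doubleword 1 into both halves.  */
static v2di_model
model_splat_dw (v2di_model v, unsigned n)
{
  assert (n <= 1);
  return model_xxpermdi (v, v, n ? 3u : 0u);
}
```

With both inputs set to the same register, the extract and the splat collapse
into one permute, and the value never has to leave the vector registers.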

The second and third patches correct the insn attributes of the extract and
concat operations.

The fourth patch allows the target to be traditional Altivec registers in
addition to traditional floating point registers and GPRs.

I have bootstrapped GCC with these patches applied on the following systems.
There were no regressions in the test runs:

	Power9 little endian, --with-cpu=power9
	Power10 little endian, --with-cpu=power10
	Power8 big endian, --with-cpu=power8 (both 32-bit & 64-bit tests)