[2/3] Disable generating load/store vector pairs for block copies.
Commit Message
If the store vector pair instruction is disabled, do not generate block
copies that use load and store vector pair instructions.
I have built bootstrap compilers and run the regression tests on three
different systems:
1) Little endian power10 using the --with-cpu=power10 option.
2) Little endian power9 using the --with-cpu=power9 option.
3) Big endian power8 using the --with-cpu=power8 option. On this system,
both 64-bit and 32-bit code generation were tested.
There were no regressions in the runs. Can I check this patch into the
trunk? If there are no changes needed for the backports, can I check this
code into the active branches after a burn-in period?
2022-06-06 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/rs6000-string.cc (expand_block_move): If the store
vector pair instructions are disabled, do not generate block
copies using load and store vector pairs.
---
gcc/config/rs6000/rs6000-string.cc | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
Comments
On Mon, 2022-06-06 at 20:55 -0400, Michael Meissner wrote:
>
> diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
> index 59d901ac68d..1b18e043269 100644
> --- a/gcc/config/rs6000/rs6000-string.cc
> +++ b/gcc/config/rs6000/rs6000-string.cc
> @@ -2787,14 +2787,16 @@ expand_block_move (rtx operands[], bool might_overlap)
> rtx src, dest;
> bool move_with_length = false;
>
> - /* Use OOmode for paired vsx load/store. Use V2DI for single
> - unaligned vsx load/store, for consistency with what other
> - expansions (compare) already do, and so we can use lxvd2x on
> - p8. Order is VSX pair unaligned, VSX unaligned, Altivec, VSX
> - with length < 16 (if allowed), then gpr load/store. */
> + /* Use OOmode for paired vsx load/store unless the store vector pair
> + instructions are disabled. Use V2DI for single unaligned vsx
> + load/store, for consistency with what other expansions (compare)
> + already do, and so we can use lxvd2x on p8. Order is VSX pair
> + unaligned, VSX unaligned, Altivec, VSX with length < 16 (if allowed),
> + then gpr load/store. */
>
> if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX
> && TARGET_BLOCK_OPS_VECTOR_PAIR
> + && TARGET_STORE_VECTOR_PAIR
> && bytes >= 32
> && (align >= 256 || !STRICT_ALIGNMENT))
Seems straightforward. LGTM,
Thanks
-Will
> {
@@ -2787,14 +2787,16 @@ expand_block_move (rtx operands[], bool might_overlap)
   rtx src, dest;
   bool move_with_length = false;

-  /* Use OOmode for paired vsx load/store. Use V2DI for single
-     unaligned vsx load/store, for consistency with what other
-     expansions (compare) already do, and so we can use lxvd2x on
-     p8. Order is VSX pair unaligned, VSX unaligned, Altivec, VSX
-     with length < 16 (if allowed), then gpr load/store. */
+  /* Use OOmode for paired vsx load/store unless the store vector pair
+     instructions are disabled. Use V2DI for single unaligned vsx
+     load/store, for consistency with what other expansions (compare)
+     already do, and so we can use lxvd2x on p8. Order is VSX pair
+     unaligned, VSX unaligned, Altivec, VSX with length < 16 (if allowed),
+     then gpr load/store. */

   if (TARGET_MMA && TARGET_BLOCK_OPS_UNALIGNED_VSX
       && TARGET_BLOCK_OPS_VECTOR_PAIR
+      && TARGET_STORE_VECTOR_PAIR
       && bytes >= 32
       && (align >= 256 || !STRICT_ALIGNMENT))
     {
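
For readers who want to see the effect of the added test in isolation, the guard in the hunk above can be modeled as a standalone predicate. This is only a sketch: `struct target_flags` and `use_vector_pair_copy` are hypothetical names that stand in for the target macros tested in `expand_block_move`; they are not part of the patch or of GCC. As in the hunk, `align` is taken to be in bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target macros tested in the hunk
   (TARGET_MMA, TARGET_BLOCK_OPS_UNALIGNED_VSX, and so on).  */
struct target_flags
{
  bool mma;
  bool block_ops_unaligned_vsx;
  bool block_ops_vector_pair;
  bool store_vector_pair;	/* The condition this patch adds.  */
  bool strict_alignment;
};

/* Return true if the 32-byte vector pair (OOmode) path would be chosen
   for a copy of BYTES bytes with alignment ALIGN (in bits).  When this
   returns false, the real code falls through to the next choice in the
   order given by the comment (single VSX, Altivec, ...).  */
static bool
use_vector_pair_copy (const struct target_flags *t,
		      unsigned long bytes, unsigned int align)
{
  assert (t);
  return (t->mma && t->block_ops_unaligned_vsx
	  && t->block_ops_vector_pair
	  && t->store_vector_pair
	  && bytes >= 32
	  && (align >= 256 || !t->strict_alignment));
}
```

With every other flag set, clearing `store_vector_pair` alone is enough to reject the paired path, which is the behavior the ChangeLog entry describes.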