[V2] Do not enable -mblock-ops-vector-pair.
Commit Message
Do not enable -mblock-ops-vector-pair.
Testing has shown that using the load vector pair and store vector pair
instructions for block moves has some performance issues on power10.
A patch on June 11th modified the code so that GCC would not set
-mblock-ops-vector-pair by default when tuning for power10, but would
still set the option when tuning for a different machine with load and
store vector pair instructions enabled.
This patch removes the code that sets -mblock-ops-vector-pair by default.
If you want to generate load vector pair and store vector pair
instructions for block moves, you must now pass -mblock-ops-vector-pair
explicitly.
2022-07-25 Michael Meissner <meissner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove
code setting -mblock-ops-vector-pair.
---
gcc/config/rs6000/rs6000.cc | 11 -----------
1 file changed, 11 deletions(-)
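For readers unfamiliar with the term, a "block move" here is a fixed-size copy that the compiler may expand inline. The following is a minimal sketch, not part of the patch: the struct name, its 64-byte size, and the function name are hypothetical and chosen only to give the compiler a candidate for inline expansion.

```c
/* Hypothetical 64-byte payload; the size is an assumption picked so the
   copy is a plausible candidate for inline block-move expansion.  */
struct payload
{
  unsigned char bytes[64];
};

/* A fixed-size struct assignment is a block move.  How it is expanded
   (scalar, single vector, or vector pair loads and stores) is up to the
   compiler and the options in effect.  */
void
copy_payload (struct payload *dst, const struct payload *src)
{
  *dst = *src;
}
```

One way to compare the generated code, assuming a compiler that targets PowerPC, is to build the file once with "gcc -O2 -mcpu=power10 -S" and once with "-mblock-ops-vector-pair" added, then look for lxvp and stxvp in the assembly. Whether they appear depends on the compiler version and the size of the copy.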
Comments
Ping patch.
| Date: Mon, 25 Jul 2022 16:15:05 -0400
| Subject: [PATCH, V2] Do not enable -mblock-ops-vector-pair.
| Message-ID: <Yt75yUWKIqoWI5kZ@toto.the-meissners.org>
Hi Mike,
On Mon, Jul 25, 2022 at 04:15:05PM -0400, Michael Meissner wrote:
> Testing has shown that using the load vector pair and store vector pair
> instructions for block moves has some performance issues on power10.
> This patch eliminates the code setting -mblock-ops-vector-pair. If you
> want to generate load vector pair and store vector pair instructions for
> block moves, you must use -mblock-ops-vector-pair.
>
> 2022-07-25 Michael Meissner <meissner@linux.ibm.com>
>
> gcc/
>
> * config/rs6000/rs6000.cc (rs6000_option_override_internal): Remove
> code setting -mblock-ops-vector-pair.
Okay for trunk (and any backports you may need). Thanks!
Segher
@@ -4139,17 +4139,6 @@ rs6000_option_override_internal (bool global_init_p)
       rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_UNALIGNED_VSX;
     }
 
-  if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR))
-    {
-      /* Do not generate lxvp and stxvp on power10 since there are some
-         performance issues.  */
-      if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX
-          && rs6000_tune != PROCESSOR_POWER10)
-        rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
-      else
-        rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
-    }
-
   /* Use long double size to select the appropriate long double.  We use
      TYPE_PRECISION to differentiate the 3 different long double types.  We map
      128 into the precision used for TFmode.  */
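The effect of the removed hunk can be modeled outside the compiler. This is a simplified sketch rather than GCC code: the struct and function names are hypothetical, the boolean fields stand in for the macros and flag bits tested in the hunk, and the "after" function assumes that no other code changes the option, so it keeps whatever value it already had.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the compiler state tested by the removed hunk.  */
struct target_state
{
  bool option_explicit;     /* user passed -m[no-]block-ops-vector-pair */
  bool option_value;        /* value of the option bit on entry */
  bool has_mma;             /* stands in for TARGET_MMA */
  bool efficient_unaligned; /* stands in for TARGET_EFFICIENT_UNALIGNED_VSX */
  bool tune_power10;        /* rs6000_tune == PROCESSOR_POWER10 */
};

/* Before the patch: an explicit user choice wins; otherwise the option
   is turned on only when vector pair support is available and we are
   not tuning for power10.  */
bool
vector_pair_before_patch (struct target_state s)
{
  if (s.option_explicit)
    return s.option_value;
  return s.has_mma && s.efficient_unaligned && !s.tune_power10;
}

/* After the patch (assumption: nothing else overrides the option): the
   value on entry is left untouched.  */
bool
vector_pair_after_patch (struct target_state s)
{
  return s.option_value;
}
```

Under this model the only case that changes is a non-power10 tuning with vector pair support and no explicit option: it used to enable the option and now leaves it off.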