[V2] Eliminate power8 fusion options, use power8 tuning, PR target/102059

Message ID Yil1VQyyrnsqX7/A@toto.the-meissners.org
State New
Headers
Series [V2] Eliminate power8 fusion options, use power8 tuning, PR target/102059 |

Commit Message

Michael Meissner March 10, 2022, 3:49 a.m. UTC
  Eliminate power8 fusion options, use power8 tuning, PR target/102059

The power8 fusion support used to be set automatically when -mcpu=power8 or
-mtune=power8 was used, and it was cleared for other cpu's.  However, if you
used the target attribute or target #pragma to change the default cpu type or
tuning, you would get an error that a target specifiction option mismatch
occurred.

This occurred because the rs6000_can_inline_p function just compares the ISA
bits between the called inline function and the caller.  If the ISA flags of
the called function is not a subset of the ISA flags of the caller, we won't do
the inlinging.  When a power9 or power10 function inlines a function that is
explicitly compiled for power8, the power8 function has the power8 fusion bits
set and the power9 or power10 functions do not have the fusion bits set.

This code removes the -mpower8-fusion and -mpower8-fusion-sign options, and
only enables power8 fusion if we are tuning for a power8.  Power8 sign fusion
is only enabled if we are tuning for a power8 and we have -O3 optimization or
higher.

I left the options -mno-power8-fusion and -mno-power8-fusion-sign in rs6000.opt
and they don't issue a warning.  If the user explicitly used -mpower8-fusion or
-mpower8-fusion-sign, then they will get a warning that the swtich has been
removed.

Similarly, I left in the pragma target and attribute target support for the
fusion options, but they don't do anything now.  This is because I believe the
customer who encountered this problem now is explicitly setting the
no-power8-fusion option in the pragma or attribute to avoid the warning.

I have tested this on the following systems, and they all bootstraps fine and
there were no regressions in the test suite:

    big endian power8 (both 64-bit and 32-bit)
    little endian power9
    little endian power10

Can I check this patch into the current master branch for GCC and after a
cooling period check in the patch to the GCC 11 and GCC 10 branches.  The
customer is currently using GCC 10.

2022-03-09   Michael Meissner  <meissner@linux.ibm.com>

gcc/
	PR target/102059
	* config/rs6000/rs6000-cpus.def (OTHER_FUSION_MASKS): Delete.
	(ISA_3_0_MASKS_SERVER): Don't clear the fusion masks.
	(POWERPC_MASKS): Remove OPTION_MASK_P8_FUSION.
	* config/rs6000/rs6000.cc (rs6000_option_override_internal):
	Delete code that set the power8 fusion options automatically.
	(rs6000_opt_masks): Allow #pragma target and attribute target to set
	power8-fusion and power8-fusion-sign, but these no longer represent
	options that the user can set.
	(rs6000_print_options_internal): Skip printing nop options.
	* config/rs6000/rs6000.h (TARGET_P8_FUSION): New macro.
	(TARGET_P8_FUSION_SIGN): Likewise.
	(MASK_P8_FUSION): Delete.
	* config/rs6000/rs6000.opt (-mpower8-fusion): Recognize the option but
	ignore the no form and warn that the option was removed for the regular
	form.
	(-mpower8-fusion-sign): Likewise.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Delete -mpower8-fusion
	and -mpower8-fusion-sign.

gcc/testsuite/
	PR target/102059
	* gcc.dg/lto/pr102059-1_0.c: Remove -mno-power8-fusion.
	* gcc.dg/lto/pr102059-2_0.c: Likewise.
	* gcc.target/powerpc/pr102059-3.c: Likewise.
	* gcc.target/powerpc/pr102059-4.c: New test.
---
 gcc/config/rs6000/rs6000-cpus.def             | 22 +++------
 gcc/config/rs6000/rs6000.cc                   | 49 +++++--------------
 gcc/config/rs6000/rs6000.h                    | 14 +++++-
 gcc/config/rs6000/rs6000.opt                  | 19 +++++--
 gcc/doc/invoke.texi                           | 13 +----
 gcc/testsuite/gcc.dg/lto/pr102059-1_0.c       |  2 +-
 gcc/testsuite/gcc.dg/lto/pr102059-2_0.c       |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr102059-3.c |  2 +-
 gcc/testsuite/gcc.target/powerpc/pr102059-4.c | 23 +++++++++
 9 files changed, 75 insertions(+), 71 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c
  

Comments

will schmidt March 10, 2022, 4:44 p.m. UTC | #1
On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote:
> Eliminate power8 fusion options, use power8 tuning, PR target/102059

Hi,

> 
> The power8 fusion support used to be set automatically when -mcpu=power8 or
> -mtune=power8 was used, and it was cleared for other cpu's.  However, if you
> used the target attribute or target #pragma to change the default cpu type or
> tuning, you would get an error that a target specifiction option mismatch
> occurred.
> 
specification. 
(ok :-)


> This occurred because the rs6000_can_inline_p function just compares the ISA
> bits between the called inline function and the caller.  If the ISA flags of
> the called function is not a subset of the ISA flags of the caller, we won't do
> the inlinging.  When a power9 or power10 function inlines a function that is
> explicitly compiled for power8, the power8 function has the power8 fusion bits
> set and the power9 or power10 functions do not have the fusion bits set.
> 

inlining.


> This code removes the -mpower8-fusion and -mpower8-fusion-sign options, and
> only enables power8 fusion if we are tuning for a power8.  Power8 sign fusion
> is only enabled if we are tuning for a power8 and we have -O3 optimization or
> higher.
> 
> I left the options -mno-power8-fusion and -mno-power8-fusion-sign in rs6000.opt
> and they don't issue a warning.  If the user explicitly used -mpower8-fusion or
> -mpower8-fusion-sign, then they will get a warning that the swtich has been
> removed.
> 

switch


> Similarly, I left in the pragma target and attribute target support for the
> fusion options, but they don't do anything now.  This is because I believe the
> customer who encountered this problem now is explicitly setting the
> no-power8-fusion option in the pragma or attribute to avoid the warning.
> 
> I have tested this on the following systems, and they all bootstraps fine and
> there were no regressions in the test suite:
> 
>     big endian power8 (both 64-bit and 32-bit)
>     little endian power9
>     little endian power10
> 

ok.

> Can I check this patch into the current master branch for GCC and after a
> cooling period check in the patch to the GCC 11 and GCC 10 branches.  The
> customer is currently using GCC 10.
> 
> 2022-03-09   Michael Meissner  <meissner@linux.ibm.com>
> 
> gcc/
> 	PR target/102059
> 	* config/rs6000/rs6000-cpus.def (OTHER_FUSION_MASKS): Delete.
> 	(ISA_3_0_MASKS_SERVER): Don't clear the fusion masks.
> 	(POWERPC_MASKS): Remove OPTION_MASK_P8_FUSION.

ok

> 	* config/rs6000/rs6000.cc (rs6000_option_override_internal):
> 	Delete code that set the power8 fusion options automatically.
> 	(rs6000_opt_masks): Allow #pragma target and attribute target to set
> 	power8-fusion and power8-fusion-sign, but these no longer represent
> 	options that the user can set.
> 	(rs6000_print_options_internal): Skip printing nop options.

ok


> 	* config/rs6000/rs6000.h (TARGET_P8_FUSION): New macro.
> 	(TARGET_P8_FUSION_SIGN): Likewise.
> 	(MASK_P8_FUSION): Delete.

ok


> 	* config/rs6000/rs6000.opt (-mpower8-fusion): Recognize the option but
> 	ignore the no form and warn that the option was removed for the regular
> 	form.
> 	(-mpower8-fusion-sign): Likewise.

ok

> 	* doc/invoke.texi (RS/6000 and PowerPC Options): Delete -mpower8-fusion
> 	and -mpower8-fusion-sign.

This change removes the -mpower8-fusion and -mno-power8-fusion options,
There is not a direct reference to -mpower8-fusion-sign in the change
here.  It may be an implied removal, but not immediately obvious to me.


> 
> gcc/testsuite/
> 	PR target/102059
> 	* gcc.dg/lto/pr102059-1_0.c: Remove -mno-power8-fusion.
> 	* gcc.dg/lto/pr102059-2_0.c: Likewise.
> 	* gcc.target/powerpc/pr102059-3.c: Likewise.
> 	* gcc.target/powerpc/pr102059-4.c: New test.

ok

> ---
>  gcc/config/rs6000/rs6000-cpus.def             | 22 +++------
>  gcc/config/rs6000/rs6000.cc                   | 49 +++++--------------
>  gcc/config/rs6000/rs6000.h                    | 14 +++++-
>  gcc/config/rs6000/rs6000.opt                  | 19 +++++--
>  gcc/doc/invoke.texi                           | 13 +----
>  gcc/testsuite/gcc.dg/lto/pr102059-1_0.c       |  2 +-
>  gcc/testsuite/gcc.dg/lto/pr102059-2_0.c       |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr102059-3.c |  2 +-
>  gcc/testsuite/gcc.target/powerpc/pr102059-4.c | 23 +++++++++
>  9 files changed, 75 insertions(+), 71 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr102059-4.c
> 
> diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
> index 963947f6939..a05b2d8c41a 100644
> --- a/gcc/config/rs6000/rs6000-cpus.def
> +++ b/gcc/config/rs6000/rs6000-cpus.def
> @@ -43,9 +43,7 @@
>  				 | OPTION_MASK_ALTIVEC			\
>  				 | OPTION_MASK_VSX)
> 
> -/* For now, don't provide an embedded version of ISA 2.07.  Do not set power8
> -   fusion here, instead set it in rs6000.cc if we are tuning for a power8
> -   system.  */
> +/* For now, don't provide an embedded version of ISA 2.07.  */

ok.  (as far as removing the comment, I'm not clear what the remaining
comment is telling me, but thats outside of the scope of this patch).


>  #define ISA_2_7_MASKS_SERVER	(ISA_2_6_MASKS_SERVER			\
>  				 | OPTION_MASK_P8_VECTOR		\
>  				 | OPTION_MASK_CRYPTO			\
> @@ -54,19 +52,14 @@
>  				 | OPTION_MASK_QUAD_MEMORY		\
>  				 | OPTION_MASK_QUAD_MEMORY_ATOMIC)
> 
> -/* ISA masks setting fusion options.  */
> -#define OTHER_FUSION_MASKS	(OPTION_MASK_P8_FUSION			\
> -				 | OPTION_MASK_P8_FUSION_SIGN)
> -
ok
>  /* Add ISEL back into ISA 3.0, since it is supposed to be a win.  Do not add
>     FLOAT128_HW here until we are ready to make -mfloat128 on by default.  */
> -#define ISA_3_0_MASKS_SERVER	((ISA_2_7_MASKS_SERVER			\
> -				  | OPTION_MASK_ISEL			\
> -				  | OPTION_MASK_MODULO			\
> -				  | OPTION_MASK_P9_MINMAX		\
> -				  | OPTION_MASK_P9_MISC			\
> -				  | OPTION_MASK_P9_VECTOR)		\
> -				 & ~OTHER_FUSION_MASKS)
> +#define ISA_3_0_MASKS_SERVER	(ISA_2_7_MASKS_SERVER			\
> +				 | OPTION_MASK_ISEL			\
> +				 | OPTION_MASK_MODULO			\
> +				 | OPTION_MASK_P9_MINMAX		\
> +				 | OPTION_MASK_P9_MISC			\
> +				 | OPTION_MASK_P9_VECTOR)
> 

Whitespace change.  I assume this fixes an indentation glitch. 


>  /* Support for the IEEE 128-bit floating point hardware requires a lot of the
>     VSX instructions that are part of ISA 3.0.  */
> @@ -140,7 +133,6 @@
>  				 | OPTION_MASK_MODULO			\
>  				 | OPTION_MASK_MULHW			\
>  				 | OPTION_MASK_NO_UPDATE		\
> -				 | OPTION_MASK_P8_FUSION		\
>  				 | OPTION_MASK_P8_VECTOR		\
>  				 | OPTION_MASK_P9_MINMAX		\
>  				 | OPTION_MASK_P9_MISC			\

ok.

> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 3afe78f5d04..46dad54a066 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -4040,41 +4040,6 @@ rs6000_option_override_internal (bool global_init_p)
>        && optimize_function_for_speed_p (cfun))
>      rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;
> 
> -  /* Enable power8 fusion if we are tuning for power8, even if we aren't
> -     generating power8 instructions.  Power9 does not optimize power8 fusion
> -     cases.  */
> -  if (!(rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION))
> -    {
> -      if (processor_target_table[tune_index].processor == PROCESSOR_POWER8)
> -	rs6000_isa_flags |= OPTION_MASK_P8_FUSION;
> -      else
> -	rs6000_isa_flags &= ~OPTION_MASK_P8_FUSION;
> -    }
> -
> -  /* Setting additional fusion flags turns on base fusion.  */
> -  if (!TARGET_P8_FUSION && TARGET_P8_FUSION_SIGN)
> -    {
> -      if (rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION)
> -	{
> -	  if (TARGET_P8_FUSION_SIGN)
> -	    error ("%qs requires %qs", "-mpower8-fusion-sign",
> -		   "-mpower8-fusion");
> -
> -	  rs6000_isa_flags &= ~OPTION_MASK_P8_FUSION;
> -	}
> -      else
> -	rs6000_isa_flags |= OPTION_MASK_P8_FUSION;
> -    }
> -
> -  /* Power8 does not fuse sign extended loads with the addis.  If we are
> -     optimizing at high levels for speed, convert a sign extended load into a
> -     zero extending load, and an explicit sign extension.  */
> -  if (TARGET_P8_FUSION
> -      && !(rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION_SIGN)
> -      && optimize_function_for_speed_p (cfun)
> -      && optimize >= 3)
> -    rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN;
> -


ok

>    /* ISA 3.0 vector instructions include ISA 2.07.  */
>    if (TARGET_P9_VECTOR && !TARGET_P8_VECTOR)
>      {
> @@ -23983,8 +23948,6 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
>    { "pcrel-opt",		OPTION_MASK_PCREL_OPT,		false, true  },
>    { "popcntb",			OPTION_MASK_POPCNTB,		false, true  },
>    { "popcntd",			OPTION_MASK_POPCNTD,		false, true  },
> -  { "power8-fusion",		OPTION_MASK_P8_FUSION,		false, true  },
> -  { "power8-fusion-sign",	OPTION_MASK_P8_FUSION_SIGN,	false, true  },
>    { "power8-vector",		OPTION_MASK_P8_VECTOR,		false, true  },
>    { "power9-minmax",		OPTION_MASK_P9_MINMAX,		false, true  },
>    { "power9-misc",		OPTION_MASK_P9_MISC,		false, true  },
> @@ -24024,6 +23987,14 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
>  #endif
>    { "soft-float",		OPTION_MASK_SOFT_FLOAT,		false, false },
>    { "string",			0,				false, false },
> +
> +  /* Power8 fusion options were removed, but ignore using them in #pragma and
> +     attribute target.  Users may have used these options to suppress errors if
> +     they declare an inline function to be specifically power8 and the function
> +     was included by power9 or power10 which turned off the power8 fusion
> +     support.  */

I'd be tempted to clarify 'ignore' with some form of 'continue to
allow', but i'm not certain if that would be helpful. 
After reading
further, ..  The comment in the rs6000.opt change below flows better to
me.

> +  { "power8-fusion",		0,				false, true  },
> +  { "power8-fusion-sign",	0,				false, true  },
>  };


ok



> 
>  /* Builtin mask mapping for printing the flags.  */
> @@ -24660,6 +24631,10 @@ rs6000_print_options_internal (FILE *file,
>        HOST_WIDE_INT mask = opts[i].mask;
>        size_t len = comma_len + prefix_len + strlen (name);
> 
> +      /* Don't print NOP options.  */
> +      if (!mask)
> +	continue;
> +

ok

>        if (!invert)
>  	{
>  	  if ((flags & mask) == 0)
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index 17af314416c..4ae45ff822f 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -504,6 +504,19 @@ extern int rs6000_vector_align[];
>  #define TARGET_MINMAX	(TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT		\
>  			 && (TARGET_P9_MINMAX || !flag_trapping_math))
> 
> +/* Power8 has special fusion operations that are enabled if we are tuning for
> +   power8.  This used to be settable with an option (-mpower8-fusion), but that
> +   option has been removed.  */
> +#define TARGET_P8_FUSION	(rs6000_tune == PROCESSOR_POWER8)
> +
> +/* Power8 fusion does not fuse loads with sign extends.  If we are doing higher
> +   optimization levels, split loads with sign extension to loads with zero
> +   extension and an explicit sign extend operation, so that the zero extending
> +   load can be fused.  */
> +#define TARGET_P8_FUSION_SIGN	(TARGET_P8_FUSION			\
> +				 && optimize_function_for_speed_p (cfun) \
> +				 && optimize >= 3)
> +

ok


>  /* In switching from using target_flags to using rs6000_isa_flags, the options
>     machinery creates OPTION_MASK_<xxx> instead of MASK_<xxx>.  For now map
>     OPTION_MASK_<xxx> back into MASK_<xxx>.  */
> @@ -517,7 +530,6 @@ extern int rs6000_vector_align[];
>  #define MASK_FLOAT128_KEYWORD		OPTION_MASK_FLOAT128_KEYWORD
>  #define MASK_FLOAT128_HW		OPTION_MASK_FLOAT128_HW
>  #define MASK_FPRND			OPTION_MASK_FPRND
> -#define MASK_P8_FUSION			OPTION_MASK_P8_FUSION
>  #define MASK_HARD_FLOAT			OPTION_MASK_HARD_FLOAT
>  #define MASK_HTM			OPTION_MASK_HTM
>  #define MASK_ISEL			OPTION_MASK_ISEL

ok

> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index 4931d781c4e..a817df1b448 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -474,13 +474,24 @@ Save the TOC in the prologue for indirect calls rather than inline.
>  mvsx-timode
>  Target RejectNegative Undocumented Ignore
> 
> +; The -mpower8-fusion and -mpower8-fusion-sign options existed in the past, but
> +; they have been removed.  Some users in the past needed to use the
> +; -mno-power8-fusion option when they had inline functions that were specified
> +; as generating power8 code and the functions were included by power9 or
> +; power10 function.  Using -mno-power8-fusion prevented an error in this case.
> +; We allow the -mno-power8-fusion and -mno-power8-fusion-sign options without a
> +; warning.


Ok. 


> +mno-power8-fusion
> +Target RejectNegative Undocumented
> +
>  mpower8-fusion
> -Target Mask(P8_FUSION) Var(rs6000_isa_flags)
> -Fuse certain integer operations together for better performance on power8.
> +Target RejectNegative Undocumented WarnRemoved
> +
> +mno-power8-fusion-sign
> +Target RejectNegative Undocumented
> 
>  mpower8-fusion-sign
> -Target Undocumented Mask(P8_FUSION_SIGN) Var(rs6000_isa_flags)
> -Allow sign extension in fusion operations.
> +Target RejectNegative Undocumented WarnRemoved
> 

ok


>  mpower8-vector
>  Target Mask(P8_VECTOR) Var(rs6000_isa_flags)
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index a0fa5e1cf43..cf7e4e22de6 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1255,8 +1255,7 @@ See RS/6000 and PowerPC Options.
>  -mveclibabi=@var{type}  -mfriz  -mno-friz @gol
>  -mpointers-to-nested-functions  -mno-pointers-to-nested-functions @gol
>  -msave-toc-indirect  -mno-save-toc-indirect @gol
> --mpower8-fusion  -mno-mpower8-fusion  -mpower8-vector  -mno-power8-vector @gol
> --mcrypto  -mno-crypto  -mhtm  -mno-htm @gol
> +-mpower8-vector  -mno-power8-vector -mcrypto  -mno-crypto  -mhtm  -mno-htm @gol
>  -mquad-memory  -mno-quad-memory @gol
>  -mquad-memory-atomic  -mno-quad-memory-atomic @gol
>  -mcompat-align-parm  -mno-compat-align-parm @gol

ok.   (Assuming the combining of the two lines onto one is kosher. )


> @@ -28080,7 +28079,7 @@ following options:
>  -mpopcntb  -mpopcntd  -mpowerpc64 @gol
>  -mpowerpc-gpopt  -mpowerpc-gfxopt @gol
>  -mmulhw  -mdlmzb  -mmfpgpr  -mvsx @gol
> --mcrypto  -mhtm  -mpower8-fusion  -mpower8-vector @gol
> +-mcrypto  -mhtm  -mpower8-vector @gol
>  -mquad-memory  -mquad-memory-atomic  -mfloat128 @gol
>  -mfloat128-hardware -mprefixed -mpcrel -mmma @gol
>  -mrop-protect}
> @@ -28195,14 +28194,6 @@ Enable (disable) the use of the built-in functions that allow direct
>  access to the Hardware Transactional Memory (HTM) instructions that
>  were added in version 2.07 of the PowerPC ISA.
> 
> -@item -mpower8-fusion
> -@itemx -mno-power8-fusion
> -@opindex mpower8-fusion
> -@opindex mno-power8-fusion
> -Generate code that keeps (does not keeps) some integer operations
> -adjacent so that the instructions can be fused together on power8 and
> -later processors.
> -


ok

>  @item -mpower8-vector
>  @itemx -mno-power8-vector
>  @opindex mpower8-vector
> diff --git a/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c b/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
> index e1004f5cfbf..b215b701097 100644
> --- a/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
> +++ b/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
> @@ -1,7 +1,7 @@
>  /* { dg-lto-do link } */
>  /* { dg-skip-if "power10 and above only" { ! { power10_ok } } } */
>  /* -Wno-attributes suppresses always_inline warnings.  */
> -/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -Wno-attributes -mno-power8-fusion" } } */
> +/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -Wno-attributes" } } */
> 
>  int __attribute__ ((always_inline))
>  foo1 (int *b)
> diff --git a/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c b/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c
> index ebb0a7b8fa1..ffab23ab7e1 100644
> --- a/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c
> +++ b/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c
> @@ -1,6 +1,6 @@
>  /* { dg-lto-do link } */
>  /* { dg-skip-if "power10 and above only" { ! { power10_ok } } } */
> -/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -flto -fdump-ipa-inline" } } */
> +/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -fdump-ipa-inline" } } */
> 
>  int
>  foo1 (int *b)
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-3.c b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
> index 21c982d93f0..0cb3a4cb9f9 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -fdump-tree-einline-optimized" } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8 -fdump-tree-einline-optimized" } */
> 
>  /* Like pr102059-1.c, to verify the inlining still happens
>     even without always_inline attribute.  */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-4.c b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
> new file mode 100644
> index 00000000000..5fe66f8af4b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
> +/* { dg-require-effective-target power10_ok } */
> +
> +/* Verify that power10 can explicity include functions compiled for power8.
> +   The issue was -mcpu=power8 enables -mpower8-fusion, but -mcpu=power9 or
> +   -mcpu=power10 do not set power8-fusion by default.  Thus when doing this
> +   compilation, they would get an error that the inline function failed in its
> +   inlining due to having incompatible options.  */
> +
> +static inline int __attribute__ ((always_inline,target("cpu=power8")))
> +foo (int *b)
> +{
> +  *b += 10;
> +  return *b;
> +}
> +
> +int
> +bar (int *a)
> +{
> +  *a = foo (a);
> +  return 0;
> +}
> -- 
> 2.35.1
> 


lgtm, 
thanks,
-Will
  
Segher Boessenkool March 10, 2022, 7:49 p.m. UTC | #2
On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote:
> On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote:
> > --- a/gcc/config/rs6000/rs6000-cpus.def
> > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > @@ -43,9 +43,7 @@
> >  				 | OPTION_MASK_ALTIVEC			\
> >  				 | OPTION_MASK_VSX)
> > 
> > -/* For now, don't provide an embedded version of ISA 2.07.  Do not set power8
> > -   fusion here, instead set it in rs6000.cc if we are tuning for a power8
> > -   system.  */
> > +/* For now, don't provide an embedded version of ISA 2.07.  */
> 
> ok.  (as far as removing the comment, I'm not clear what the remaining
> comment is telling me, but thats outside of the scope of this patch).

It is saying there is nothing that implements Book III-E of ISA 2.07
(nothing in GCC, but no actual CPU either).  Or Category: Embedded even
maybe :-)

It could be clearer perhaps, or just be removed completely; it might
have been useful historically, but it isn't anymore really.


Segher
  
will schmidt March 10, 2022, 8 p.m. UTC | #3
On Thu, 2022-03-10 at 13:49 -0600, Segher Boessenkool wrote:
> On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote:
> > On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote:
> > > --- a/gcc/config/rs6000/rs6000-cpus.def
> > > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > > @@ -43,9 +43,7 @@
> > >  				 | OPTION_MASK_ALTIVEC			
> > > \
> > >  				 | OPTION_MASK_VSX)
> > > 
> > > -/* For now, don't provide an embedded version of ISA 2.07.  Do
> > > not set power8
> > > -   fusion here, instead set it in rs6000.cc if we are tuning for
> > > a power8
> > > -   system.  */
> > > +/* For now, don't provide an embedded version of ISA 2.07.  */
> > 
> > ok.  (as far as removing the comment, I'm not clear what the
> > remaining
> > comment is telling me, but thats outside of the scope of this
> > patch).
> 
> It is saying there is nothing that implements Book III-E of ISA 2.07
> (nothing in GCC, but no actual CPU either).  Or Category: Embedded
> even
> maybe :-)

Lol, Ok.  The small-e in embedded did not clue me in that this was
referring to the big-E Embedded category.  :-)

> It could be clearer perhaps, or just be removed completely; it might
> have been useful historically, but it isn't anymore really.


THanks,
-Will

> 
> 
> Segher
  
Michael Meissner March 10, 2022, 8:34 p.m. UTC | #4
On Thu, Mar 10, 2022 at 01:49:36PM -0600, Segher Boessenkool wrote:
> On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote:
> > On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote:
> > > --- a/gcc/config/rs6000/rs6000-cpus.def
> > > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > > @@ -43,9 +43,7 @@
> > >  				 | OPTION_MASK_ALTIVEC			\
> > >  				 | OPTION_MASK_VSX)
> > > 
> > > -/* For now, don't provide an embedded version of ISA 2.07.  Do not set power8
> > > -   fusion here, instead set it in rs6000.cc if we are tuning for a power8
> > > -   system.  */
> > > +/* For now, don't provide an embedded version of ISA 2.07.  */
> > 
> > ok.  (as far as removing the comment, I'm not clear what the remaining
> > comment is telling me, but thats outside of the scope of this patch).
> 
> It is saying there is nothing that implements Book III-E of ISA 2.07
> (nothing in GCC, but no actual CPU either).  Or Category: Embedded even
> maybe :-)
> 
> It could be clearer perhaps, or just be removed completely; it might
> have been useful historically, but it isn't anymore really.

At the time it was written, there were possiblities of power8 that weren't
server class (i.e. with VSX).  But yeah now, it might be useful to delete it.
  
Michael Meissner March 10, 2022, 8:46 p.m. UTC | #5
On Thu, Mar 10, 2022 at 01:49:36PM -0600, Segher Boessenkool wrote:
> On Thu, Mar 10, 2022 at 10:44:52AM -0600, will schmidt wrote:
> > On Wed, 2022-03-09 at 22:49 -0500, Michael Meissner wrote:
> > > --- a/gcc/config/rs6000/rs6000-cpus.def
> > > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > > @@ -43,9 +43,7 @@
> > >  				 | OPTION_MASK_ALTIVEC			\
> > >  				 | OPTION_MASK_VSX)
> > > 
> > > -/* For now, don't provide an embedded version of ISA 2.07.  Do not set power8
> > > -   fusion here, instead set it in rs6000.cc if we are tuning for a power8
> > > -   system.  */
> > > +/* For now, don't provide an embedded version of ISA 2.07.  */
> > 
> > ok.  (as far as removing the comment, I'm not clear what the remaining
> > comment is telling me, but thats outside of the scope of this patch).
> 
> It is saying there is nothing that implements Book III-E of ISA 2.07
> (nothing in GCC, but no actual CPU either).  Or Category: Embedded even
> maybe :-)
> 
> It could be clearer perhaps, or just be removed completely; it might
> have been useful historically, but it isn't anymore really.

Other than possibly removing the comment, are there other things about the
patch that need to be done?
  

Patch

diff --git a/gcc/config/rs6000/rs6000-cpus.def b/gcc/config/rs6000/rs6000-cpus.def
index 963947f6939..a05b2d8c41a 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -43,9 +43,7 @@ 
 				 | OPTION_MASK_ALTIVEC			\
 				 | OPTION_MASK_VSX)
 
-/* For now, don't provide an embedded version of ISA 2.07.  Do not set power8
-   fusion here, instead set it in rs6000.cc if we are tuning for a power8
-   system.  */
+/* For now, don't provide an embedded version of ISA 2.07.  */
 #define ISA_2_7_MASKS_SERVER	(ISA_2_6_MASKS_SERVER			\
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_CRYPTO			\
@@ -54,19 +52,14 @@ 
 				 | OPTION_MASK_QUAD_MEMORY		\
 				 | OPTION_MASK_QUAD_MEMORY_ATOMIC)
 
-/* ISA masks setting fusion options.  */
-#define OTHER_FUSION_MASKS	(OPTION_MASK_P8_FUSION			\
-				 | OPTION_MASK_P8_FUSION_SIGN)
-
 /* Add ISEL back into ISA 3.0, since it is supposed to be a win.  Do not add
    FLOAT128_HW here until we are ready to make -mfloat128 on by default.  */
-#define ISA_3_0_MASKS_SERVER	((ISA_2_7_MASKS_SERVER			\
-				  | OPTION_MASK_ISEL			\
-				  | OPTION_MASK_MODULO			\
-				  | OPTION_MASK_P9_MINMAX		\
-				  | OPTION_MASK_P9_MISC			\
-				  | OPTION_MASK_P9_VECTOR)		\
-				 & ~OTHER_FUSION_MASKS)
+#define ISA_3_0_MASKS_SERVER	(ISA_2_7_MASKS_SERVER			\
+				 | OPTION_MASK_ISEL			\
+				 | OPTION_MASK_MODULO			\
+				 | OPTION_MASK_P9_MINMAX		\
+				 | OPTION_MASK_P9_MISC			\
+				 | OPTION_MASK_P9_VECTOR)
 
 /* Support for the IEEE 128-bit floating point hardware requires a lot of the
    VSX instructions that are part of ISA 3.0.  */
@@ -140,7 +133,6 @@ 
 				 | OPTION_MASK_MODULO			\
 				 | OPTION_MASK_MULHW			\
 				 | OPTION_MASK_NO_UPDATE		\
-				 | OPTION_MASK_P8_FUSION		\
 				 | OPTION_MASK_P8_VECTOR		\
 				 | OPTION_MASK_P9_MINMAX		\
 				 | OPTION_MASK_P9_MISC			\
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 3afe78f5d04..46dad54a066 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -4040,41 +4040,6 @@  rs6000_option_override_internal (bool global_init_p)
       && optimize_function_for_speed_p (cfun))
     rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;
 
-  /* Enable power8 fusion if we are tuning for power8, even if we aren't
-     generating power8 instructions.  Power9 does not optimize power8 fusion
-     cases.  */
-  if (!(rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION))
-    {
-      if (processor_target_table[tune_index].processor == PROCESSOR_POWER8)
-	rs6000_isa_flags |= OPTION_MASK_P8_FUSION;
-      else
-	rs6000_isa_flags &= ~OPTION_MASK_P8_FUSION;
-    }
-
-  /* Setting additional fusion flags turns on base fusion.  */
-  if (!TARGET_P8_FUSION && TARGET_P8_FUSION_SIGN)
-    {
-      if (rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION)
-	{
-	  if (TARGET_P8_FUSION_SIGN)
-	    error ("%qs requires %qs", "-mpower8-fusion-sign",
-		   "-mpower8-fusion");
-
-	  rs6000_isa_flags &= ~OPTION_MASK_P8_FUSION;
-	}
-      else
-	rs6000_isa_flags |= OPTION_MASK_P8_FUSION;
-    }
-
-  /* Power8 does not fuse sign extended loads with the addis.  If we are
-     optimizing at high levels for speed, convert a sign extended load into a
-     zero extending load, and an explicit sign extension.  */
-  if (TARGET_P8_FUSION
-      && !(rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION_SIGN)
-      && optimize_function_for_speed_p (cfun)
-      && optimize >= 3)
-    rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN;
-
   /* ISA 3.0 vector instructions include ISA 2.07.  */
   if (TARGET_P9_VECTOR && !TARGET_P8_VECTOR)
     {
@@ -23983,8 +23948,6 @@  static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "pcrel-opt",		OPTION_MASK_PCREL_OPT,		false, true  },
   { "popcntb",			OPTION_MASK_POPCNTB,		false, true  },
   { "popcntd",			OPTION_MASK_POPCNTD,		false, true  },
-  { "power8-fusion",		OPTION_MASK_P8_FUSION,		false, true  },
-  { "power8-fusion-sign",	OPTION_MASK_P8_FUSION_SIGN,	false, true  },
   { "power8-vector",		OPTION_MASK_P8_VECTOR,		false, true  },
   { "power9-minmax",		OPTION_MASK_P9_MINMAX,		false, true  },
   { "power9-misc",		OPTION_MASK_P9_MISC,		false, true  },
@@ -24024,6 +23987,14 @@  static struct rs6000_opt_mask const rs6000_opt_masks[] =
 #endif
   { "soft-float",		OPTION_MASK_SOFT_FLOAT,		false, false },
   { "string",			0,				false, false },
+
+  /* Power8 fusion options were removed, but ignore using them in #pragma and
+     attribute target.  Users may have used these options to suppress errors if
+     they declare an inline function to be specifically power8 and the function
+     was included by power9 or power10 which turned off the power8 fusion
+     support.  */
+  { "power8-fusion",		0,				false, true  },
+  { "power8-fusion-sign",	0,				false, true  },
 };
 
 /* Builtin mask mapping for printing the flags.  */
@@ -24660,6 +24631,10 @@  rs6000_print_options_internal (FILE *file,
       HOST_WIDE_INT mask = opts[i].mask;
       size_t len = comma_len + prefix_len + strlen (name);
 
+      /* Don't print NOP options.  */
+      if (!mask)
+	continue;
+
       if (!invert)
 	{
 	  if ((flags & mask) == 0)
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 17af314416c..4ae45ff822f 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -504,6 +504,19 @@  extern int rs6000_vector_align[];
 #define TARGET_MINMAX	(TARGET_HARD_FLOAT && TARGET_PPC_GFXOPT		\
 			 && (TARGET_P9_MINMAX || !flag_trapping_math))
 
+/* Power8 has special fusion operations that are enabled if we are tuning for
+   power8.  This used to be settable with an option (-mpower8-fusion), but that
+   option has been removed.  */
+#define TARGET_P8_FUSION	(rs6000_tune == PROCESSOR_POWER8)
+
+/* Power8 fusion does not fuse loads with sign extends.  If we are doing higher
+   optimization levels, split loads with sign extension to loads with zero
+   extension and an explicit sign extend operation, so that the zero extending
+   load can be fused.  */
+#define TARGET_P8_FUSION_SIGN	(TARGET_P8_FUSION			\
+				 && optimize_function_for_speed_p (cfun) \
+				 && optimize >= 3)
+
 /* In switching from using target_flags to using rs6000_isa_flags, the options
    machinery creates OPTION_MASK_<xxx> instead of MASK_<xxx>.  For now map
    OPTION_MASK_<xxx> back into MASK_<xxx>.  */
@@ -517,7 +530,6 @@  extern int rs6000_vector_align[];
 #define MASK_FLOAT128_KEYWORD		OPTION_MASK_FLOAT128_KEYWORD
 #define MASK_FLOAT128_HW		OPTION_MASK_FLOAT128_HW
 #define MASK_FPRND			OPTION_MASK_FPRND
-#define MASK_P8_FUSION			OPTION_MASK_P8_FUSION
 #define MASK_HARD_FLOAT			OPTION_MASK_HARD_FLOAT
 #define MASK_HTM			OPTION_MASK_HTM
 #define MASK_ISEL			OPTION_MASK_ISEL
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 4931d781c4e..a817df1b448 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -474,13 +474,24 @@  Save the TOC in the prologue for indirect calls rather than inline.
 mvsx-timode
 Target RejectNegative Undocumented Ignore
 
+; The -mpower8-fusion and -mpower8-fusion-sign options existed in the past, but
+; they have been removed.  Some users in the past needed to use the
+; -mno-power8-fusion option when they had inline functions that were specified
+; as generating power8 code and the functions were included by power9 or
+; power10 function.  Using -mno-power8-fusion prevented an error in this case.
+; We allow the -mno-power8-fusion and -mno-power8-fusion-sign options without a
+; warning.
+mno-power8-fusion
+Target RejectNegative Undocumented
+
 mpower8-fusion
-Target Mask(P8_FUSION) Var(rs6000_isa_flags)
-Fuse certain integer operations together for better performance on power8.
+Target RejectNegative Undocumented WarnRemoved
+
+mno-power8-fusion-sign
+Target RejectNegative Undocumented
 
 mpower8-fusion-sign
-Target Undocumented Mask(P8_FUSION_SIGN) Var(rs6000_isa_flags)
-Allow sign extension in fusion operations.
+Target RejectNegative Undocumented WarnRemoved
 
 mpower8-vector
 Target Mask(P8_VECTOR) Var(rs6000_isa_flags)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a0fa5e1cf43..cf7e4e22de6 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1255,8 +1255,7 @@  See RS/6000 and PowerPC Options.
 -mveclibabi=@var{type}  -mfriz  -mno-friz @gol
 -mpointers-to-nested-functions  -mno-pointers-to-nested-functions @gol
 -msave-toc-indirect  -mno-save-toc-indirect @gol
--mpower8-fusion  -mno-mpower8-fusion  -mpower8-vector  -mno-power8-vector @gol
--mcrypto  -mno-crypto  -mhtm  -mno-htm @gol
+-mpower8-vector  -mno-power8-vector -mcrypto  -mno-crypto  -mhtm  -mno-htm @gol
 -mquad-memory  -mno-quad-memory @gol
 -mquad-memory-atomic  -mno-quad-memory-atomic @gol
 -mcompat-align-parm  -mno-compat-align-parm @gol
@@ -28080,7 +28079,7 @@  following options:
 -mpopcntb  -mpopcntd  -mpowerpc64 @gol
 -mpowerpc-gpopt  -mpowerpc-gfxopt @gol
 -mmulhw  -mdlmzb  -mmfpgpr  -mvsx @gol
--mcrypto  -mhtm  -mpower8-fusion  -mpower8-vector @gol
+-mcrypto  -mhtm  -mpower8-vector @gol
 -mquad-memory  -mquad-memory-atomic  -mfloat128 @gol
 -mfloat128-hardware -mprefixed -mpcrel -mmma @gol
 -mrop-protect}
@@ -28195,14 +28194,6 @@  Enable (disable) the use of the built-in functions that allow direct
 access to the Hardware Transactional Memory (HTM) instructions that
 were added in version 2.07 of the PowerPC ISA.
 
-@item -mpower8-fusion
-@itemx -mno-power8-fusion
-@opindex mpower8-fusion
-@opindex mno-power8-fusion
-Generate code that keeps (does not keeps) some integer operations
-adjacent so that the instructions can be fused together on power8 and
-later processors.
-
 @item -mpower8-vector
 @itemx -mno-power8-vector
 @opindex mpower8-vector
diff --git a/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c b/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
index e1004f5cfbf..b215b701097 100644
--- a/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
+++ b/gcc/testsuite/gcc.dg/lto/pr102059-1_0.c
@@ -1,7 +1,7 @@ 
 /* { dg-lto-do link } */
 /* { dg-skip-if "power10 and above only" { ! { power10_ok } } } */
 /* -Wno-attributes suppresses always_inline warnings.  */
-/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -Wno-attributes -mno-power8-fusion" } } */
+/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -Wno-attributes" } } */
 
 int __attribute__ ((always_inline))
 foo1 (int *b)
diff --git a/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c b/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c
index ebb0a7b8fa1..ffab23ab7e1 100644
--- a/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c
+++ b/gcc/testsuite/gcc.dg/lto/pr102059-2_0.c
@@ -1,6 +1,6 @@ 
 /* { dg-lto-do link } */
 /* { dg-skip-if "power10 and above only" { ! { power10_ok } } } */
-/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -flto -fdump-ipa-inline" } } */
+/* { dg-lto-options { "-O2 -mdejagnu-cpu=power8 -flto -fdump-ipa-inline" } } */
 
 int
 foo1 (int *b)
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-3.c b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
index 21c982d93f0..0cb3a4cb9f9 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-3.c
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -mdejagnu-cpu=power8 -mno-power8-fusion -fdump-tree-einline-optimized" } */
+/* { dg-options "-O2 -mdejagnu-cpu=power8 -fdump-tree-einline-optimized" } */
 
 /* Like pr102059-1.c, to verify the inlining still happens
    even without always_inline attribute.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr102059-4.c b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
new file mode 100644
index 00000000000..5fe66f8af4b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr102059-4.c
@@ -0,0 +1,23 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+/* { dg-require-effective-target power10_ok } */
+
+/* Verify that power10 can explicity include functions compiled for power8.
+   The issue was -mcpu=power8 enables -mpower8-fusion, but -mcpu=power9 or
+   -mcpu=power10 do not set power8-fusion by default.  Thus when doing this
+   compilation, they would get an error that the inline function failed in its
+   inlining due to having incompatible options.  */
+
+static inline int __attribute__ ((always_inline,target("cpu=power8")))
+foo (int *b)
+{
+  *b += 10;
+  return *b;
+}
+
+int
+bar (int *a)
+{
+  *a = foo (a);
+  return 0;
+}