Rename the glibc.tune namespace to glibc.cpu

Message ID 20180716141633.6948-1-siddhesh@sourceware.org
State Superseded

Commit Message

Siddhesh Poyarekar July 16, 2018, 2:16 p.m. UTC
  The glibc.tune namespace is vaguely named since it is a 'tunable', so
give it a more specific name that describes what it refers to.  Rename
the tunable namespace to 'cpu' to more accurately reflect what it
encompasses.  Also rename glibc.tune.cpu to glibc.cpu.name since
glibc.cpu.cpu is weird.

	* NEWS: Mention the change.
	* elf/dl-tunables.list: Rename tune namespace to cpu.
	* sysdeps/powerpc/dl-tunables.list: Likewise.
	* sysdeps/x86/dl-tunables.list: Likewise.
	* sysdeps/aarch64/dl-tunables.list: Rename tune.cpu to
	cpu.name.
	* elf/dl-hwcaps.c (_dl_important_hwcaps): Adjust.
	* elf/dl-hwcaps.h (GET_HWCAP_MASK): Likewise.
	* manual/README.tunables: Likewise.
	* manual/tunables.texi: Likewise.
	* sysdeps/powerpc/cpu-features.c: Likewise.
	* sysdeps/unix/sysv/linux/aarch64/cpu-features.c
	(init_cpu_features): Likewise.
	* sysdeps/x86/cpu-features.c: Likewise.
	* sysdeps/x86/cpu-features.h: Likewise.
	* sysdeps/x86/cpu-tunables.c: Likewise.
	* sysdeps/x86_64/Makefile: Likewise.
---
 NEWS                                          |  3 ++
 elf/dl-hwcaps.c                               |  2 +-
 elf/dl-hwcaps.h                               |  2 +-
 elf/dl-tunables.list                          |  2 +-
 manual/README.tunables                        |  6 ++--
 manual/tunables.texi                          | 30 +++++++++----------
 sysdeps/aarch64/dl-tunables.list              |  4 +--
 sysdeps/powerpc/cpu-features.c                |  2 +-
 sysdeps/powerpc/dl-tunables.list              |  2 +-
 .../unix/sysv/linux/aarch64/cpu-features.c    |  2 +-
 sysdeps/x86/cpu-features.c                    |  4 +--
 sysdeps/x86/cpu-features.h                    |  2 +-
 sysdeps/x86/cpu-tunables.c                    |  4 +--
 sysdeps/x86/dl-tunables.list                  |  2 +-
 sysdeps/x86_64/Makefile                       |  4 +--
 15 files changed, 37 insertions(+), 34 deletions(-)
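
For anyone carrying old tunable names in scripts, the rename described above can be sketched as a small name-mapping helper.  This is only an illustration: rename_tunable is a hypothetical function and is not part of glibc, and it handles bare tunable names rather than a full GLIBC_TUNABLES string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: map a tunable name from the
   old glibc.tune namespace to the renamed glibc.cpu namespace.
   glibc.tune.cpu is special-cased because the patch renames it to
   glibc.cpu.name rather than glibc.cpu.cpu.  Names outside glibc.tune
   are copied unchanged.  Returns 0 on success, -1 if the result does
   not fit in OUT.  */
int
rename_tunable (const char *name, char *out, size_t outlen)
{
  static const char old_ns[] = "glibc.tune.";
  static const char new_ns[] = "glibc.cpu.";
  size_t old_len = sizeof old_ns - 1;
  int n;

  assert (name != NULL && out != NULL);

  if (strcmp (name, "glibc.tune.cpu") == 0)
    n = snprintf (out, outlen, "glibc.cpu.name");
  else if (strncmp (name, old_ns, old_len) == 0)
    n = snprintf (out, outlen, "%s%s", new_ns, name + old_len);
  else
    n = snprintf (out, outlen, "%s", name);

  /* snprintf reports the length it wanted; treat truncation as failure.  */
  return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

The special case has to be checked before the generic prefix rewrite, otherwise glibc.tune.cpu would come out as glibc.cpu.cpu.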
  

Comments

Carlos O'Donell July 16, 2018, 3:17 p.m. UTC | #1
On 07/16/2018 10:16 AM, Siddhesh Poyarekar wrote:
> The glibc.tune namespace is vaguely named since it is a 'tunable', so
> give it a more specific name that describes what it refers to.  Rename
> the tunable namespace to 'cpu' to more accurately reflect what it
> encompasses.  Also rename glibc.tune.cpu to glibc.cpu.name since
> glibc.cpu.cpu is weird.
> 
> 	* NEWS: Mention the change.
> 	* elf/dl-tunables.list: Rename tune namespace to cpu.
> 	* sysdeps/powerpc/dl-tunables.list: Likewise.
> 	* sysdeps/x86/dl-tunables.list: Likewise.
> 	* sysdeps/aarch64/dl-tunables.list: Rename tune.cpu to
> 	cpu.name.
> 	* elf/dl-hwcaps.c (_dl_important_hwcaps): Adjust.
> 	* elf/dl-hwcaps.h (GET_HWCAP_MASK): Likewise.
> 	* manual/README.tunables: Likewise.
> 	* manual/tunables.texi: Likewise.
> 	* sysdeps/powerpc/cpu-features.c: Likewise.
> 	* sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> 	(init_cpu_features): Likewise.
> 	* sysdeps/x86/cpu-features.c: Likewise.
> 	* sysdeps/x86/cpu-features.h: Likewise.
> 	* sysdeps/x86/cpu-tunables.c: Likewise.
> 	* sysdeps/x86_64/Makefile: Likewise.

This looks good to me. I'd like this to wait until 2.29 opens. I want to
minimize spurious changes. It would also be nice in 2.29 to rename the
options to arch_* prefixed options for those that are arch-specific.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> ---
>  NEWS                                          |  3 ++
>  elf/dl-hwcaps.c                               |  2 +-
>  elf/dl-hwcaps.h                               |  2 +-
>  elf/dl-tunables.list                          |  2 +-
>  manual/README.tunables                        |  6 ++--
>  manual/tunables.texi                          | 30 +++++++++----------
>  sysdeps/aarch64/dl-tunables.list              |  4 +--
>  sysdeps/powerpc/cpu-features.c                |  2 +-
>  sysdeps/powerpc/dl-tunables.list              |  2 +-
>  .../unix/sysv/linux/aarch64/cpu-features.c    |  2 +-
>  sysdeps/x86/cpu-features.c                    |  4 +--
>  sysdeps/x86/cpu-features.h                    |  2 +-
>  sysdeps/x86/cpu-tunables.c                    |  4 +--
>  sysdeps/x86/dl-tunables.list                  |  2 +-
>  sysdeps/x86_64/Makefile                       |  4 +--
>  15 files changed, 37 insertions(+), 34 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 5de2c2816f..b5308fd596 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -173,6 +173,9 @@ Deprecated and removed features, and other changes affecting compatibility:
>    project's versions of these files.  The plan is to make this the default
>    behavior in a future release.
>  
> +* The glibc.tune tunable namespace has been renamed to glibc.cpu and the
> +  tunable glibc.tune.cpu has been renamed to glibc.cpu.name.
> +
>  Changes to build and runtime requirements:
>  
>    GNU make 4.0 or later is now required to build glibc.
> diff --git a/elf/dl-hwcaps.c b/elf/dl-hwcaps.c
> index 23482a88a1..ecf00b4577 100644
> --- a/elf/dl-hwcaps.c
> +++ b/elf/dl-hwcaps.c
> @@ -140,7 +140,7 @@ _dl_important_hwcaps (const char *platform, size_t platform_len, size_t *sz,
>  	 string and bit like you can ignore an OS-supplied HWCAP bit.  */
>        hwcap_mask |= (uint64_t) mask << _DL_FIRST_EXTRA;
>  #if HAVE_TUNABLES
> -      TUNABLE_SET (glibc, tune, hwcap_mask, uint64_t, hwcap_mask);
> +      TUNABLE_SET (glibc, cpu, hwcap_mask, uint64_t, hwcap_mask);
>  #else
>        GLRO(dl_hwcap_mask) = hwcap_mask;
>  #endif
> diff --git a/elf/dl-hwcaps.h b/elf/dl-hwcaps.h
> index 17f0da4c73..d69ee11dc2 100644
> --- a/elf/dl-hwcaps.h
> +++ b/elf/dl-hwcaps.h
> @@ -19,7 +19,7 @@
>  #include <elf/dl-tunables.h>
>  
>  #if HAVE_TUNABLES
> -# define GET_HWCAP_MASK() TUNABLE_GET (glibc, tune, hwcap_mask, uint64_t, NULL)
> +# define GET_HWCAP_MASK() TUNABLE_GET (glibc, cpu, hwcap_mask, uint64_t, NULL)
>  #else
>  # ifdef SHARED
>  #   define GET_HWCAP_MASK() GLRO(dl_hwcap_mask)
> diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list
> index 1f8ecb8437..b108592b62 100644
> --- a/elf/dl-tunables.list
> +++ b/elf/dl-tunables.list
> @@ -86,7 +86,7 @@ glibc {
>        type: SIZE_T
>      }
>    }
> -  tune {
> +  cpu {
>      hwcap_mask {
>        type: UINT_64
>        env_alias: LD_HWCAP_MASK
> diff --git a/manual/README.tunables b/manual/README.tunables
> index 3967679f43..f87a31a65e 100644
> --- a/manual/README.tunables
> +++ b/manual/README.tunables
> @@ -105,11 +105,11 @@ where 'check' is the tunable name, 'int32_t' is the C type of the tunable and
>  To get and set tunables in a different namespace from that module, use the full
>  form of the macros as follows:
>  
> -  val = TUNABLE_GET_FULL (glibc, tune, hwcap_mask, uint64_t, NULL)
> +  val = TUNABLE_GET_FULL (glibc, cpu, hwcap_mask, uint64_t, NULL)
>  
> -  TUNABLE_SET_FULL (glibc, tune, hwcap_mask, uint64_t, val)
> +  TUNABLE_SET_FULL (glibc, cpu, hwcap_mask, uint64_t, val)
>  
> -where 'glibc' is the top namespace, 'tune' is the tunable namespace and the
> +where 'glibc' is the top namespace, 'cpu' is the tunable namespace and the
>  remaining arguments are the same as the short form macros.
>  
>  When TUNABLE_NAMESPACE is not defined in a module, TUNABLE_GET is equivalent to
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index be33c9fc79..9b8f9e4610 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}.
>  @cindex non_temporal_threshold tunables
>  @cindex tunables, non_temporal_threshold
>  
> -@deftp {Tunable namespace} glibc.tune
> +@deftp {Tunable namespace} glibc.cpu
>  Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
>  by setting the following tunables in the @code{tune} namespace:
>  @end deftp
>  
> -@deftp Tunable glibc.tune.hwcap_mask
> +@deftp Tunable glibc.cpu.hwcap_mask
>  This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is
>  identical in features.
>  
>  The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set
>  extensions available in the processor at runtime for some architectures.  The
> -@code{glibc.tune.hwcap_mask} tunable allows the user to mask out those
> +@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those
>  capabilities at runtime, thus disabling use of those extensions.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.hwcaps
> -The @code{glibc.tune.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
> +@deftp Tunable glibc.cpu.hwcaps
> +The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
>  enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx}
>  and @code{zzz} where the feature name is case-sensitive and has to match
>  the ones in @code{sysdeps/x86/cpu-features.h}.
> @@ -319,8 +319,8 @@ the ones in @code{sysdeps/x86/cpu-features.h}.
>  This tunable is specific to i386 and x86-64.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.cached_memopt
> -The @code{glibc.tune.cached_memopt=[0|1]} tunable allows the user to
> +@deftp Tunable glibc.cpu.cached_memopt
> +The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
>  enable optimizations recommended for cacheable memory.  If set to
>  @code{1}, @theglibc{} assumes that the process memory image consists
>  of cacheable (non-device) memory only.  The default, @code{0},
> @@ -329,8 +329,8 @@ indicates that the process may use device memory.
>  This tunable is specific to powerpc, powerpc64 and powerpc64le.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.cpu
> -The @code{glibc.tune.cpu=xxx} tunable allows the user to tell @theglibc{} to
> +@deftp Tunable glibc.cpu.cpu
> +The @code{glibc.cpu.cpu=xxx} tunable allows the user to tell @theglibc{} to
>  assume that the CPU is @code{xxx} where xxx may have one of these values:
>  @code{generic}, @code{falkor}, @code{thunderxt88}, @code{thunderx2t99},
>  @code{thunderx2t99p1}.
> @@ -338,20 +338,20 @@ assume that the CPU is @code{xxx} where xxx may have one of these values:
>  This tunable is specific to aarch64.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.x86_data_cache_size
> -The @code{glibc.tune.x86_data_cache_size} tunable allows the user to set
> +@deftp Tunable glibc.cpu.x86_data_cache_size
> +The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set
>  data cache size in bytes for use in memory and string routines.
>  
>  This tunable is specific to i386 and x86-64.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.x86_shared_cache_size
> -The @code{glibc.tune.x86_shared_cache_size} tunable allows the user to
> +@deftp Tunable glibc.cpu.x86_shared_cache_size
> +The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to
>  set shared cache size in bytes for use in memory and string routines.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.x86_non_temporal_threshold
> -The @code{glibc.tune.x86_non_temporal_threshold} tunable allows the user
> +@deftp Tunable glibc.cpu.x86_non_temporal_threshold
> +The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user
>  to set threshold in bytes for non temporal store.
>  
>  This tunable is specific to i386 and x86-64.
> diff --git a/sysdeps/aarch64/dl-tunables.list b/sysdeps/aarch64/dl-tunables.list
> index f6a88168cc..cfcf940ebd 100644
> --- a/sysdeps/aarch64/dl-tunables.list
> +++ b/sysdeps/aarch64/dl-tunables.list
> @@ -17,8 +17,8 @@
>  # <http://www.gnu.org/licenses/>.
>  
>  glibc {
> -  tune {
> -    cpu {
> +  cpu {
> +    name {
>        type: STRING
>      }
>    }
> diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c
> index 955d4778a6..ad809b9815 100644
> --- a/sysdeps/powerpc/cpu-features.c
> +++ b/sysdeps/powerpc/cpu-features.c
> @@ -30,7 +30,7 @@ init_cpu_features (struct cpu_features *cpu_features)
>       tunables is enable, since for this case user can explicit disable
>       unaligned optimizations.  */
>  #if HAVE_TUNABLES
> -  int32_t cached_memfunc = TUNABLE_GET (glibc, tune, cached_memopt, int32_t,
> +  int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t,
>  					NULL);
>    cpu_features->use_cached_memopt = (cached_memfunc > 0);
>  #else
> diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list
> index d26636a16b..b3372555f7 100644
> --- a/sysdeps/powerpc/dl-tunables.list
> +++ b/sysdeps/powerpc/dl-tunables.list
> @@ -17,7 +17,7 @@
>  # <http://www.gnu.org/licenses/>.
>  
>  glibc {
> -  tune {
> +  cpu {
>      cached_memopt {
>        type: INT_32
>        minval: 0
> diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> index 39eba0186f..b4f348509e 100644
> --- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> +++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> @@ -57,7 +57,7 @@ init_cpu_features (struct cpu_features *cpu_features)
>  
>  #if HAVE_TUNABLES
>    /* Get the tunable override.  */
> -  const char *mcpu = TUNABLE_GET (glibc, tune, cpu, const char *, NULL);
> +  const char *mcpu = TUNABLE_GET (glibc, cpu, name, const char *, NULL);
>    if (mcpu != NULL)
>      midr = get_midr_from_mcpu (mcpu);
>  #endif
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index d41ebde823..b8bef8d54b 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -22,7 +22,7 @@
>  #include <libc-pointer-arith.h>
>  
>  #if HAVE_TUNABLES
> -# define TUNABLE_NAMESPACE tune
> +# define TUNABLE_NAMESPACE cpu
>  # include <unistd.h>		/* Get STDOUT_FILENO for _dl_printf.  */
>  # include <elf/dl-tunables.h>
>  
> @@ -398,7 +398,7 @@ no_cpuid:
>  
>    /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86.  */
>  #if !HAVE_TUNABLES && defined SHARED
> -  /* The glibc.tune.hwcap_mask tunable is initialized already, so no need to do
> +  /* The glibc.cpu.hwcap_mask tunable is initialized already, so no need to do
>       this.  */
>    GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT;
>  #endif
> diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
> index 624e681e96..e9713f6215 100644
> --- a/sysdeps/x86/cpu-features.h
> +++ b/sysdeps/x86/cpu-features.h
> @@ -141,7 +141,7 @@ struct cpu_features
>    unsigned long int xsave_state_size;
>    /* The full state size for XSAVE when XSAVEC is disabled by
>  
> -     GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable
> +     GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable
>     */
>    unsigned int xsave_state_full_size;
>    unsigned int feature[FEATURE_INDEX_MAX];
> diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c
> index af761dcbbc..c38af71b8a 100644
> --- a/sysdeps/x86/cpu-tunables.c
> +++ b/sysdeps/x86/cpu-tunables.c
> @@ -17,7 +17,7 @@
>     <http://www.gnu.org/licenses/>.  */
>  
>  #if HAVE_TUNABLES
> -# define TUNABLE_NAMESPACE tune
> +# define TUNABLE_NAMESPACE cpu
>  # include <stdbool.h>
>  # include <stdint.h>
>  # include <unistd.h>		/* Get STDOUT_FILENO for _dl_printf.  */
> @@ -116,7 +116,7 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
>       the hardware which wasn't available when the selection was made.
>       The environment variable:
>  
> -     GLIBC_TUNABLES=glibc.tune.hwcaps=-xxx,yyy,-zzz,....
> +     GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz,....
>  
>       can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature
>       yyy and zzz, where the feature name is case-sensitive and has to
> diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list
> index 7c3236a68f..9a5a0b1a63 100644
> --- a/sysdeps/x86/dl-tunables.list
> +++ b/sysdeps/x86/dl-tunables.list
> @@ -17,7 +17,7 @@
>  # <http://www.gnu.org/licenses/>.
>  
>  glibc {
> -  tune {
> +  cpu {
>      hwcaps {
>        type: STRING
>      }
> diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
> index 9f1562f1b2..d51cf03ac9 100644
> --- a/sysdeps/x86_64/Makefile
> +++ b/sysdeps/x86_64/Makefile
> @@ -57,7 +57,7 @@ modules-names += x86_64/tst-x86_64mod-1
>  LDFLAGS-tst-x86_64mod-1.so = -Wl,-soname,tst-x86_64mod-1.so
>  ifneq (no,$(have-tunables))
>  # Test the state size for XSAVE when XSAVEC is disabled.
> -tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable
> +tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable
>  endif
>  
>  $(objpfx)tst-x86_64-1: $(objpfx)x86_64/tst-x86_64mod-1.so
> @@ -74,7 +74,7 @@ $(objpfx)tst-platform-1.out: $(objpfx)x86_64/tst-platformmod-2.so
>  # Turn off AVX512F_Usable and AVX2_Usable so that GLRO(dl_platform) is
>  # always set to x86_64.
>  tst-platform-1-ENV = LD_PRELOAD=$(objpfx)\$$PLATFORM/tst-platformmod-2.so \
> -	GLIBC_TUNABLES=glibc.tune.hwcaps=-AVX512F_Usable,-AVX2_Usable
> +	GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F_Usable,-AVX2_Usable
>  endif
>  
>  tests += tst-audit3 tst-audit4 tst-audit5 tst-audit6 tst-audit7 \
>
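
The glibc.cpu.hwcaps value format documented in the patch above (-xxx,yyy,-zzz) can be illustrated with a minimal standalone parser.  This is a sketch under assumptions, not the code in sysdeps/x86/cpu-tunables.c: parse_hwcaps and struct hwcap_entry are hypothetical names, and the sketch does not validate feature names against cpu-features.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One parsed entry of a hwcaps-style list such as "-xxx,yyy,-zzz".
   Hypothetical type for illustration only.  */
struct hwcap_entry
{
  const char *name;   /* Points into the input; not NUL-terminated.  */
  size_t len;         /* Length of the feature name.  */
  int disable;        /* Nonzero if the entry had a leading '-'.  */
};

/* Split LIST on commas and record up to MAX entries in OUT.  Empty
   entries (including a bare "-") are skipped.  Returns the number of
   entries stored.  */
size_t
parse_hwcaps (const char *list, struct hwcap_entry *out, size_t max)
{
  size_t count = 0;

  assert (list != NULL && out != NULL);

  while (*list != '\0' && count < max)
    {
      size_t len = strcspn (list, ",");
      const char *name = list;
      size_t name_len = len;
      int disable = 0;

      if (name_len > 0 && name[0] == '-')
        {
          disable = 1;
          name++;
          name_len--;
        }

      if (name_len > 0)
        {
          out[count].name = name;
          out[count].len = name_len;
          out[count].disable = disable;
          count++;
        }

      /* Advance past this entry and its trailing comma, if any.  */
      list += len;
      if (*list == ',')
        list++;
    }

  return count;
}
```

Entries keep pointers into the caller's string, so the input must outlive the parsed array.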
  
Siddhesh Poyarekar July 16, 2018, 3:42 p.m. UTC | #2
On 07/16/2018 08:47 PM, Carlos O'Donell wrote:
> This looks good to me. I'd like this to wait until 2.29 opens. I want to

OK, I'll queue it up for 2.29.

> minimize spurious changes. It would also be nice in 2.29 to rename the
> options to arch_* prefixed options for those that are arch-specific.

There's only the powerpc one that's named in a generic manner but is 
architecture specific.  I'd actually like to discuss dropping the 
tunable (and use hwcap_mask instead) but we can have that discussion for 
2.29.

Siddhesh
  
Rical Jasan July 16, 2018, 3:46 p.m. UTC | #3
On 07/16/2018 07:16 AM, Siddhesh Poyarekar wrote:
...
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index be33c9fc79..9b8f9e4610 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}.
>  @cindex non_temporal_threshold tunables
>  @cindex tunables, non_temporal_threshold
>  
> -@deftp {Tunable namespace} glibc.tune
> +@deftp {Tunable namespace} glibc.cpu
>  Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
>  by setting the following tunables in the @code{tune} namespace:

Should be @code{cpu} now.

Rical
  
Siddhesh Poyarekar July 16, 2018, 6:20 p.m. UTC | #4
On 07/16/2018 09:16 PM, Rical Jasan wrote:
> On 07/16/2018 07:16 AM, Siddhesh Poyarekar wrote:
> ...
>> diff --git a/manual/tunables.texi b/manual/tunables.texi
>> index be33c9fc79..9b8f9e4610 100644
>> --- a/manual/tunables.texi
>> +++ b/manual/tunables.texi
>> @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}.
>>   @cindex non_temporal_threshold tunables
>>   @cindex tunables, non_temporal_threshold
>>   
>> -@deftp {Tunable namespace} glibc.tune
>> +@deftp {Tunable namespace} glibc.cpu
>>   Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
>>   by setting the following tunables in the @code{tune} namespace:
> 
> Should be @code{cpu} now.

Oh yes, thanks.  I'll fix it up when I commit.

Siddhesh
  
Tulio Magno Quites Machado Filho July 16, 2018, 8:02 p.m. UTC | #5
Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> On 07/16/2018 08:47 PM, Carlos O'Donell wrote:
>> This looks good to me. I'd like this to wait until 2.29 opens. I want to
>
> OK, I'll queue it up for 2.29.
>
>> minimize spurious changes. It would also be nice in 2.29 to rename the
>> options to arch_* prefixed options for those that are arch-specific.
>
> There's only the powerpc one that's named in a generic manner but is 
> architecture specific.

I'm not following your line of thought here:

 - glibc.cpu.hwcaps is specific to i386 and x86-64
 - glibc.cpu is specific to aarch64
 - glibc.cpu.cached_memopt is specific to powerpc, powerpc64 and powerpc64le

What am I missing?

> I'd actually like to discuss dropping the 
> tunable (and use hwcap_mask instead) but we can have that discussion for 
> 2.29.

Notice that the optimization is not specific to a CPU, but to a user scenario
(cacheable memory).  In other words, the optimization can't be enabled
unconditionally whenever PPC_FEATURE2_ARCH_2_07 is set, because it could
degrade performance when cache-inhibited memory is being used.
  
Tulio Magno Quites Machado Filho July 16, 2018, 8:16 p.m. UTC | #6
Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index be33c9fc79..9b8f9e4610 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}.
>  @cindex non_temporal_threshold tunables
>  @cindex tunables, non_temporal_threshold
>  
> -@deftp {Tunable namespace} glibc.tune
> +@deftp {Tunable namespace} glibc.cpu
>  Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
>  by setting the following tunables in the @code{tune} namespace:
>  @end deftp
>  
> -@deftp Tunable glibc.tune.hwcap_mask
> +@deftp Tunable glibc.cpu.hwcap_mask
>  This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is
>  identical in features.
>  
>  The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set
>  extensions available in the processor at runtime for some architectures.  The
> -@code{glibc.tune.hwcap_mask} tunable allows the user to mask out those
> +@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those
>  capabilities at runtime, thus disabling use of those extensions.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.hwcaps
> -The @code{glibc.tune.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
> +@deftp Tunable glibc.cpu.hwcaps
> +The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
>  enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx}
>  and @code{zzz} where the feature name is case-sensitive and has to match
>  the ones in @code{sysdeps/x86/cpu-features.h}.
> @@ -319,8 +319,8 @@ the ones in @code{sysdeps/x86/cpu-features.h}.
>  This tunable is specific to i386 and x86-64.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.cached_memopt
> -The @code{glibc.tune.cached_memopt=[0|1]} tunable allows the user to
> +@deftp Tunable glibc.cpu.cached_memopt
> +The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
>  enable optimizations recommended for cacheable memory.  If set to
>  @code{1}, @theglibc{} assumes that the process memory image consists
>  of cacheable (non-device) memory only.  The default, @code{0},
> @@ -329,8 +329,8 @@ indicates that the process may use device memory.
>  This tunable is specific to powerpc, powerpc64 and powerpc64le.
>  @end deftp
>  
> -@deftp Tunable glibc.tune.cpu
> -The @code{glibc.tune.cpu=xxx} tunable allows the user to tell @theglibc{} to
> +@deftp Tunable glibc.cpu.cpu
> +The @code{glibc.cpu.cpu=xxx} tunable allows the user to tell @theglibc{} to

s/cpu.cpu/cpu.name/  in both lines.

Otherwise, looks good to me.
  
Siddhesh Poyarekar July 17, 2018, 2 a.m. UTC | #7
On 07/17/2018 01:32 AM, Tulio Magno Quites Machado Filho wrote:
> I'm not following your line of thought here:
> 
>   - glibc.cpu.hwcaps is specific to i386 and x86-64
>   - glibc.cpu is specific to aarch64
>   - glibc.cpu.cached_memopt is specific to powerpc, powerpc64 and powerpc64le
> 
> What am I missing?

The difference is that glibc.cpu.name and glibc.cpu.hwcaps are
conceptually generic tunables, i.e. there is a reasonable chance that a
couple of releases down the line another architecture may want to
provide a tuning facility for CPUs by name or by HWCAPS.  The
cached_memopt one is not very clear to me and seems more like something
that is only useful on power8.  x86-specific tunables, i.e. those where
the concept is not currently applicable to other architectures
(x86_non_temporal_threshold), are prefixed with x86_*.

> Notice the optimization is not specific to a CPU, but specific to an user
> scenario (cacheable memory).  In other words, the optimization can't be used
> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
> cache-inhibited memory is being used.

Ahh OK, I got thrown off by the fact that there's a separate routine for 
it and assumed that it is Power8-specific.  I have a different concern 
then; a tunable is process-wide so the cached_memopt tunable essentially 
assumes that the entire process is using cache-inhibited memory.  Is 
that a reasonable assumption?  In my experience a typical process would 
have only a set of structures in cache-inhibited memory and most of it 
would be regular memory.  In that sense it looks more like a tradeoff 
hack and it would be nice to consider alternatives.  Here are a couple I 
can think of off the top of my head:

1. A new relocation that overlays on top of ifuncs and allows selection 
of routines based on specific properties.  I have had this idea for a 
while but no time to implement it and it has much more general scope 
than memory type; for example memory alignment could also be a factor to 
short-cut parts of string routines at compile time itself.  It does not 
have the runtime flexibility of a tunable but is probably far more 
configurable.

2. If there is a correlation to size then implement something similar to 
the x86 temporal_threshold tunable.  This is probably just as good or 
bad as setting a cached_memopt flag but has the effect of generalizing 
what was a tunable.

What do you think?

Siddhesh
  
Siddhesh Poyarekar July 17, 2018, 2:01 a.m. UTC | #8
On 07/17/2018 01:46 AM, Tulio Magno Quites Machado Filho wrote:
> s/cpu.cpu/cpu.name/  in both lines.
> 
> Otherwise, looks good to me.

Oops, my find/replace skills are getting rusty.  I'll fix this, thanks.

Siddhesh
  
Tulio Magno Quites Machado Filho Aug. 3, 2018, 8:33 p.m. UTC | #9
Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> On 07/17/2018 01:32 AM, Tulio Magno Quites Machado Filho wrote:
>> I'm not following your line of thought here:
>> 
>>   - glibc.cpu.hwcaps is specific to i386 and x86-64
>>   - glibc.cpu is specific to aarch64
>>   - glibc.cpu.cached_memopt is specific to powerpc, powerpc64 and powerpc64le
>> 
>> What am I missing?
>
> The difference is that glibc.cpu.name and glibc.cpu.hwcaps are 
> conceptually generic tunables, i.e. there is a reasonable chance that 
> couple of releases down the line another architecture may want to 
> provide tuning facility for CPUs by name or by HWCAPS.  The 
> cached_memopt one is not very clear to me and seems more like something 
> that is only useful on power8.

Maybe it isn't restricted only to powerpc:
https://sourceware.org/ml/libc-alpha/2018-08/msg00069.html

Obviously other machine maintainers may not be interested in cached_memopt,
but this thread helps me explain why I was thinking cached_memopt was
generic.

>> Notice the optimization is not specific to a CPU, but specific to an user
>> scenario (cacheable memory).  In other words, the optimization can't be used
>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
>> cache-inhibited memory is being used.
>
> Ahh OK, I got thrown off by the fact that there's a separate routine for 
> it and assumed that it is Power8-specific.  I have a different concern 
> then; a tunable is process-wide so the cached_memopt tunable essentially 
> assumes that the entire process is using cache-inhibited memory.  Is 
> that a reasonable assumption?

It's the opposite.
When cached_memopt=1, it's assumed the process only uses cacheable memory.
If cached_memopt=0 (default value), nothing is assumed and a safe execution
is taken.

> 1. A new relocation that overlays on top of ifuncs and allows selection 
> of routines based on specific properties.  I have had this idea for a 
> while but no time to implement it and it has much more general scope 
> than memory type; for example memory alignment could also be a factor to 
> short-cut parts of string routines at compile time itself.  It does not 
> have the runtime flexibility of a tunable but is probably far more 
> configurable.

Sounds interesting.  Where are these properties coming from?

> 2. If there is a correlation to size then implement something similar to 
> the x86 temporal_threshold tunable.  This is probably just as good or 
> bad as setting a cached_memopt flag but has the effect of generalizing 
> what was a tunable.

I don't think this option would help in this case.
I can't correlate size to cache-inhibited memory.
  
Siddhesh Poyarekar Aug. 6, 2018, 11 a.m. UTC | #10
On 08/04/2018 02:03 AM, Tulio Magno Quites Machado Filho wrote:
> Maybe it isn't restricted only to powerpc:
> https://sourceware.org/ml/libc-alpha/2018-08/msg00069.html
> 
> Obviously other machine maintainers may not be interested on cached_memopt,
> but this thread helps me to explain why I was thinking cached_memopt was
> generic.

OK.

>>> Notice the optimization is not specific to a CPU, but specific to an user
>>> scenario (cacheable memory).  In other words, the optimization can't be used
>>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
>>> cache-inhibited memory is being used.
>>
>> Ahh OK, I got thrown off by the fact that there's a separate routine for
>> it and assumed that it is Power8-specific.  I have a different concern
>> then; a tunable is process-wide so the cached_memopt tunable essentially
>> assumes that the entire process is using cache-inhibited memory.  Is
>> that a reasonable assumption?
> 
> It's the opposite.
> When cached_memopt=1, it's assumed the process only uses cacheable memory.
> If cached_memopt=0 (default value), nothing is assumed and a safe execution
> is taken.

OK, thanks for the clarification. It doesn't change my question though; 
is there a performance loss when you do a safe execution and does it 
make sense to fix this in glibc?  I haven't formed a strong opinion 
either way for the latter yet but one thing that would be nice to ensure 
is that we don't do different things for different architectures.  There 
seems to be scope to come to a consensus across architectures for this 
and we should try to do that.

Given that Cauldron is only a month away, we could have a more detailed 
conversation on this in the glibc BoF too if necessary.

>> 1. A new relocation that overlays on top of ifuncs and allows selection
>> of routines based on specific properties.  I have had this idea for a
>> while but no time to implement it and it has much more general scope
>> than memory type; for example memory alignment could also be a factor to
>> short-cut parts of string routines at compile time itself.  It does not
>> have the runtime flexibility of a tunable but is probably far more
>> configurable.
> 
> Sounds interesting.  Where are these properties coming from?

I haven't thought this through tbh, but something like this:

- Add new relocations for each special case: R_MEMCPY_REG, 
R_MEMCPY_CACHE_INHIBITED, R_MEMCPY_ALIGN16, etc. that can be generated 
based on properties of the inputs such as volatileness, alignment, etc.

- Create separate entry points memcpy@plt and memcpy_noncached@plt for 
each relocation we end up using for that TU.

- Have the ifunc resolver take into consideration the relocation type 
when patching in the PLT.

It may be simpler to just emit different entry points (similar to the 
*_finite math functions) and separate ifunc resolvers if there is no 
overlap between ifunc implementations for these entry points.
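
A very rough sketch of the resolver idea above, assuming the relocation kind is simply passed in as an argument.  Everything here is hypothetical: no such relocations exist, the implementation names are placeholders, and a real ifunc resolver would return a function pointer rather than an enum.

```c
#include <assert.h>

/* Hypothetical relocation kinds, named after the ones floated above.  */
enum memcpy_reloc
{
  R_MEMCPY_REG,
  R_MEMCPY_CACHE_INHIBITED,
  R_MEMCPY_ALIGN16
};

/* Placeholder identifiers for the implementations a resolver could
   choose between.  */
enum memcpy_impl
{
  IMPL_GENERIC,
  IMPL_UNALIGNED,
  IMPL_ALIGNED_ONLY,
  IMPL_ALIGN16
};

/* Toy resolver: pick an implementation from the relocation kind first
   and the hardware capability second.  */
enum memcpy_impl
resolve_memcpy (enum memcpy_reloc kind, int has_fast_unaligned)
{
  switch (kind)
    {
    case R_MEMCPY_CACHE_INHIBITED:
      /* Never pick a variant that may issue unaligned accesses.  */
      return IMPL_ALIGNED_ONLY;
    case R_MEMCPY_ALIGN16:
      /* Inputs are known to be 16-byte aligned, so the alignment
         prologue can be skipped.  */
      return IMPL_ALIGN16;
    case R_MEMCPY_REG:
      return has_fast_unaligned ? IMPL_UNALIGNED : IMPL_GENERIC;
    }

  assert (!"unknown relocation kind");
  return IMPL_GENERIC;
}
```

The point of the sketch is only that the property of the call site overrides the hardware capability: a cache-inhibited call site gets the aligned-only variant even on a CPU with fast unaligned accesses.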

> I don't think this option would help in this case.
> I can't correlate size to cache-inhibited memory.

Right, I had not understood where you were coming from then and assumed 
you were talking about non-temporal accesses.

Siddhesh
  
Tulio Magno Quites Machado Filho Aug. 6, 2018, 1:33 p.m. UTC | #11
Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> On 08/04/2018 02:03 AM, Tulio Magno Quites Machado Filho wrote:
>>>> Notice the optimization is not specific to a CPU, but specific to an user
>>>> scenario (cacheable memory).  In other words, the optimization can't be used
>>>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
>>>> cache-inhibited memory is being used.
>>>
>>> Ahh OK, I got thrown off by the fact that there's a separate routine for
>>> it and assumed that it is Power8-specific.  I have a different concern
>>> then; a tunable is process-wide so the cached_memopt tunable essentially
>>> assumes that the entire process is using cache-inhibited memory.  Is
>>> that a reasonable assumption?
>> 
>> It's the opposite.
>> When cached_memopt=1, it's assumed the process only uses cacheable memory.
>> If cached_memopt=0 (default value), nothing is assumed and a safe execution
>> is taken.
>
> OK, thanks for the clarification. It doesn't change my question though; 
> is there a performance loss when you do a safe execution

Yes, for cacheable memory.  A safe execution uses only naturally aligned memory
accesses and doesn't provide the best performance we have.

However, an unsafe execution on cache-inhibited memory is catastrophic because
every memory access that is not naturally aligned generates an alignment
interrupt that is handled by the kernel, causing an even greater performance
impact than a safe execution on cacheable memory.
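
To make "naturally aligned accesses only" concrete, here is a minimal C
sketch of the access pattern.  It is not the actual powerpc routine; the
name copy_naturally_aligned and the may_alias typedef (a GCC/Clang
extension) are assumptions for illustration.  A compiler may also rewrite
such loops on its own, so a real implementation would be written in
assembly:

```c
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang extension: a 64-bit word type that may alias any object.  */
typedef uint64_t __attribute__ ((may_alias)) word_t;

void *
copy_naturally_aligned (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  /* Word copies are only possible when both pointers have the same offset
     within a word; otherwise aligning one would misalign the other.  */
  if ((uintptr_t) d % sizeof (word_t) == (uintptr_t) s % sizeof (word_t))
    {
      /* Head: byte copies until both pointers are word-aligned.  */
      while (n > 0 && (uintptr_t) d % sizeof (word_t) != 0)
        {
          *d++ = *s++;
          n--;
        }

      /* Body: naturally aligned 8-byte loads and stores.  */
      while (n >= sizeof (word_t))
        {
          *(word_t *) d = *(const word_t *) s;
          d += sizeof (word_t);
          s += sizeof (word_t);
          n -= sizeof (word_t);
        }
    }

  /* Tail, or the whole buffer when the offsets differ: byte accesses,
     which are always naturally aligned.  */
  while (n > 0)
    {
      *d++ = *s++;
      n--;
    }
  return dst;
}
```

The byte-only fallback is what costs performance on cacheable memory.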

> does it make sense to fix this in glibc?

IMHO, yes.  I haven't yet seen a good explanation of why userspace programs
should not be using memcpy in these conditions; AFAIK, ISO C11 does not
prohibit this.

> There seems to be scope to come to a consensus across architectures for this 
> and we should try to do that.

Agreed.

>>> 1. A new relocation that overlays on top of ifuncs and allows selection
>>> of routines based on specific properties.  I have had this idea for a
>>> while but no time to implement it and it has much more general scope
>>> than memory type; for example memory alignment could also be a factor to
>>> short-cut parts of string routines at compile time itself.  It does not
>>> have the runtime flexibility of a tunable but is probably far more
>>> configurable.
>> 
>> Sounds interesting.  Where are these properties coming from?
>
> I haven't thought this through tbh, but something like this:
>
> - Add new relocations for each special case: R_MEMCPY_REG, 
> R_MEMCPY_CACHE_INHIBITED, R_MEMCPY_ALIGN16, etc. that can be generated 
> based on properties of the inputs such as volatileness, alignment, etc.
>
> - Create separate entry points memcpy@plt and memcpy_noncached@plt for 
> each relocation we end up using for that TU.
>
> - Have the ifunc resolver take into consideration the relocation type 
> when patching in the PLT.
>
> It may be simpler to just emit different entry points (similar to the 
> *_finite math functions) and separate ifunc resolvers if there is no 
> overlap between ifunc implementations for these entry points.

I still believe this could help, but there is still one open issue: how do we
know that a memcpy call is accessing cache-inhibited memory?
I'm afraid this property is not that easy to detect.
  
Rich Felker Aug. 7, 2018, 12:17 a.m. UTC | #12
On Mon, Aug 06, 2018 at 10:33:40AM -0300, Tulio Magno Quites Machado Filho wrote:
> Siddhesh Poyarekar <siddhesh@sourceware.org> writes:
> 
> > On 08/04/2018 02:03 AM, Tulio Magno Quites Machado Filho wrote:
> >>>> Notice the optimization is not specific to a CPU, but specific to an user
> >>>> scenario (cacheable memory).  In other words, the optimization can't be used
> >>>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
> >>>> cache-inhibited memory is being used.
> >>>
> >>> Ahh OK, I got thrown off by the fact that there's a separate routine for
> >>> it and assumed that it is Power8-specific.  I have a different concern
> >>> then; a tunable is process-wide so the cached_memopt tunable essentially
> >>> assumes that the entire process is using cache-inhibited memory.  Is
> >>> that a reasonable assumption?
> >> 
> >> It's the opposite.
> >> When cached_memopt=1, it's assumed the process only uses cacheable memory.
> >> If cached_memopt=0 (default value), nothing is assumed and a safe execution
> >> is taken.
> >
> > OK, thanks for the clarification. It doesn't change my question though; 
> > is there a performance loss when you do a safe execution
> 
> Yes, for cacheable memory.  A safe execution uses only naturally aligned memory
> accesses and doesn't provide the best performance we have.
> 
> However an unsafe execution on cached inhibited memory is catastrophic because
> every naturally unaligned memory access generates an alignment interruption
> that is treated by the kernel, causing an even greater performance impact than
> a safe execution on cacheable memory.
> 
> > does it make sense to fix this in glibc?
> 
> IMHO, yes.  I haven't seen yet a good explanation on why userspace programs
> should not be using memcpy in these conditions, e.g. AFAIK, ISO C 11 does not
> prohibit this.

I'm not entirely clear on what "these conditions" are, but ISO C does
not allow memcpy to be used on volatile objects. It's not clear how
such objects would come into existence (presumably some sort of mmap),
but if they have weird properties about how you can perform accesses
on them, I think it's reasonable to argue that they have to be
volatile.

The compiler should not be generating calls to memcpy for volatile
objects; if it does that's a compiler bug.
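
For illustration, this is roughly what a conforming copy out of a volatile
region looks like: memcpy's parameters are not volatile-qualified, so the
accesses have to go through a volatile lvalue one element at a time.  The
function name and the 32-bit element width are assumptions made for the
sketch:

```c
#include <stddef.h>
#include <stdint.h>

/* Copy NWORDS 32-bit words out of a volatile region.  Every read of SRC
   is a separate volatile access, so the compiler must perform each one as
   written and cannot replace the loop with a call to memcpy.  */
void
copy_from_volatile (uint32_t *dst, const volatile uint32_t *src,
                    size_t nwords)
{
  for (size_t i = 0; i < nwords; i++)
    dst[i] = src[i];
}
```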

Rich
  
Siddhesh Poyarekar Aug. 7, 2018, 7:46 a.m. UTC | #13
On 08/06/2018 07:03 PM, Tulio Magno Quites Machado Filho wrote:
> Yes, for cacheable memory.  A safe execution uses only naturally aligned memory
> accesses and doesn't provide the best performance we have.
> 
> However an unsafe execution on cached inhibited memory is catastrophic because
> every naturally unaligned memory access generates an alignment interruption
> that is treated by the kernel, causing an even greater performance impact than
> a safe execution on cacheable memory.

There seem to be two slightly orthogonal discussions here: there's the
issue of using memcpy for volatile objects because overlapping writes may
not work correctly without barriers, and then there is the question of
ensuring aligned accesses for device memory that may have been mapped in
as cache-inhibited and does not tolerate misaligned accesses.

It seems to me the issue with Power w.r.t. cache-inhibited memory access 
is only the latter.  Is that correct?

>> does it make sense to fix this in glibc?
> 
> IMHO, yes.  I haven't seen yet a good explanation on why userspace programs
> should not be using memcpy in these conditions, e.g. AFAIK, ISO C 11 does not
> prohibit this.

If it is a question of misaligned accesses only then there may be a case 
to add a memcpy that strictly does aligned accesses only, but a better 
name for that would be glibc.cpu.misaligned_access and not cached_memopt 
since that has slightly different implications.

If volatile (and overlapping) access is also an issue then it seems clear
that we need not attempt to support it in memcpy by default.  I don't know
if having support only in Power makes sense, but if there is a strong need
for it then the tunable name should change to something more precise, e.g.
glibc.cpu.ppc_allow_volatile_memcpy.
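
Whatever the name ends up being, the setting stays process-wide: it is
read once from a colon-separated list of name=value pairs, as in
GLIBC_TUNABLES=glibc.cpu.cached_memopt=1.  A minimal user-level sketch of
such a lookup follows; tunable_lookup is a hypothetical helper and not
glibc's tunables parser:

```c
#include <stdlib.h>
#include <string.h>

/* Return the integer value of NAME in a "name=value:name=value" string,
   or DEF if NAME is not present.  Hypothetical helper for illustration.  */
long
tunable_lookup (const char *tunables, const char *name, long def)
{
  size_t len = strlen (name);

  while (tunables != NULL && *tunables != '\0')
    {
      /* An entry matches only if NAME is followed directly by '='.  */
      if (strncmp (tunables, name, len) == 0 && tunables[len] == '=')
        return strtol (tunables + len + 1, NULL, 10);

      /* Skip to the entry after the next ':' separator, if any.  */
      tunables = strchr (tunables, ':');
      if (tunables != NULL)
        tunables++;
    }
  return def;
}
```

A single value looked up this way then governs every memcpy call in the
process, which is the limitation discussed above.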

> I still believe this could help, but there is still one open issue: how do we
> know a memcpy call is accessing cached inhibited memory?
> I'm afraid this property is not that easy to detect.

It's not, it has to be annotated by the developer.

Siddhesh
  

Patch

diff --git a/NEWS b/NEWS
index 5de2c2816f..b5308fd596 100644
--- a/NEWS
+++ b/NEWS
@@ -173,6 +173,9 @@  Deprecated and removed features, and other changes affecting compatibility:
   project's versions of these files.  The plan is to make this the default
   behavior in a future release.
 
+* The glibc.tune tunable namespace has been renamed to glibc.cpu and the
+  tunable glibc.tune.cpu has been renamed to glibc.cpu.name.
+
 Changes to build and runtime requirements:
 
   GNU make 4.0 or later is now required to build glibc.
diff --git a/elf/dl-hwcaps.c b/elf/dl-hwcaps.c
index 23482a88a1..ecf00b4577 100644
--- a/elf/dl-hwcaps.c
+++ b/elf/dl-hwcaps.c
@@ -140,7 +140,7 @@  _dl_important_hwcaps (const char *platform, size_t platform_len, size_t *sz,
 	 string and bit like you can ignore an OS-supplied HWCAP bit.  */
       hwcap_mask |= (uint64_t) mask << _DL_FIRST_EXTRA;
 #if HAVE_TUNABLES
-      TUNABLE_SET (glibc, tune, hwcap_mask, uint64_t, hwcap_mask);
+      TUNABLE_SET (glibc, cpu, hwcap_mask, uint64_t, hwcap_mask);
 #else
       GLRO(dl_hwcap_mask) = hwcap_mask;
 #endif
diff --git a/elf/dl-hwcaps.h b/elf/dl-hwcaps.h
index 17f0da4c73..d69ee11dc2 100644
--- a/elf/dl-hwcaps.h
+++ b/elf/dl-hwcaps.h
@@ -19,7 +19,7 @@ 
 #include <elf/dl-tunables.h>
 
 #if HAVE_TUNABLES
-# define GET_HWCAP_MASK() TUNABLE_GET (glibc, tune, hwcap_mask, uint64_t, NULL)
+# define GET_HWCAP_MASK() TUNABLE_GET (glibc, cpu, hwcap_mask, uint64_t, NULL)
 #else
 # ifdef SHARED
 #   define GET_HWCAP_MASK() GLRO(dl_hwcap_mask)
diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list
index 1f8ecb8437..b108592b62 100644
--- a/elf/dl-tunables.list
+++ b/elf/dl-tunables.list
@@ -86,7 +86,7 @@  glibc {
       type: SIZE_T
     }
   }
-  tune {
+  cpu {
     hwcap_mask {
       type: UINT_64
       env_alias: LD_HWCAP_MASK
diff --git a/manual/README.tunables b/manual/README.tunables
index 3967679f43..f87a31a65e 100644
--- a/manual/README.tunables
+++ b/manual/README.tunables
@@ -105,11 +105,11 @@  where 'check' is the tunable name, 'int32_t' is the C type of the tunable and
 To get and set tunables in a different namespace from that module, use the full
 form of the macros as follows:
 
-  val = TUNABLE_GET_FULL (glibc, tune, hwcap_mask, uint64_t, NULL)
+  val = TUNABLE_GET_FULL (glibc, cpu, hwcap_mask, uint64_t, NULL)
 
-  TUNABLE_SET_FULL (glibc, tune, hwcap_mask, uint64_t, val)
+  TUNABLE_SET_FULL (glibc, cpu, hwcap_mask, uint64_t, val)
 
-where 'glibc' is the top namespace, 'tune' is the tunable namespace and the
+where 'glibc' is the top namespace, 'cpu' is the tunable namespace and the
 remaining arguments are the same as the short form macros.
 
 When TUNABLE_NAMESPACE is not defined in a module, TUNABLE_GET is equivalent to
diff --git a/manual/tunables.texi b/manual/tunables.texi
index be33c9fc79..9b8f9e4610 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -295,23 +295,23 @@  The default value of this tunable is @samp{3}.
 @cindex non_temporal_threshold tunables
 @cindex tunables, non_temporal_threshold
 
-@deftp {Tunable namespace} glibc.tune
+@deftp {Tunable namespace} glibc.cpu
 Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
 by setting the following tunables in the @code{tune} namespace:
 @end deftp
 
-@deftp Tunable glibc.tune.hwcap_mask
+@deftp Tunable glibc.cpu.hwcap_mask
 This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is
 identical in features.
 
 The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set
 extensions available in the processor at runtime for some architectures.  The
-@code{glibc.tune.hwcap_mask} tunable allows the user to mask out those
+@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those
 capabilities at runtime, thus disabling use of those extensions.
 @end deftp
 
-@deftp Tunable glibc.tune.hwcaps
-The @code{glibc.tune.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
+@deftp Tunable glibc.cpu.hwcaps
+The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
 enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx}
 and @code{zzz} where the feature name is case-sensitive and has to match
 the ones in @code{sysdeps/x86/cpu-features.h}.
@@ -319,8 +319,8 @@  the ones in @code{sysdeps/x86/cpu-features.h}.
 This tunable is specific to i386 and x86-64.
 @end deftp
 
-@deftp Tunable glibc.tune.cached_memopt
-The @code{glibc.tune.cached_memopt=[0|1]} tunable allows the user to
+@deftp Tunable glibc.cpu.cached_memopt
+The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
 enable optimizations recommended for cacheable memory.  If set to
 @code{1}, @theglibc{} assumes that the process memory image consists
 of cacheable (non-device) memory only.  The default, @code{0},
@@ -329,8 +329,8 @@  indicates that the process may use device memory.
 This tunable is specific to powerpc, powerpc64 and powerpc64le.
 @end deftp
 
-@deftp Tunable glibc.tune.cpu
-The @code{glibc.tune.cpu=xxx} tunable allows the user to tell @theglibc{} to
+@deftp Tunable glibc.cpu.name
+The @code{glibc.cpu.name=xxx} tunable allows the user to tell @theglibc{} to
 assume that the CPU is @code{xxx} where xxx may have one of these values:
 @code{generic}, @code{falkor}, @code{thunderxt88}, @code{thunderx2t99},
 @code{thunderx2t99p1}.
@@ -338,20 +338,20 @@  assume that the CPU is @code{xxx} where xxx may have one of these values:
 This tunable is specific to aarch64.
 @end deftp
 
-@deftp Tunable glibc.tune.x86_data_cache_size
-The @code{glibc.tune.x86_data_cache_size} tunable allows the user to set
+@deftp Tunable glibc.cpu.x86_data_cache_size
+The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set
 data cache size in bytes for use in memory and string routines.
 
 This tunable is specific to i386 and x86-64.
 @end deftp
 
-@deftp Tunable glibc.tune.x86_shared_cache_size
-The @code{glibc.tune.x86_shared_cache_size} tunable allows the user to
+@deftp Tunable glibc.cpu.x86_shared_cache_size
+The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to
 set shared cache size in bytes for use in memory and string routines.
 @end deftp
 
-@deftp Tunable glibc.tune.x86_non_temporal_threshold
-The @code{glibc.tune.x86_non_temporal_threshold} tunable allows the user
+@deftp Tunable glibc.cpu.x86_non_temporal_threshold
+The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user
 to set threshold in bytes for non temporal store.
 
 This tunable is specific to i386 and x86-64.
diff --git a/sysdeps/aarch64/dl-tunables.list b/sysdeps/aarch64/dl-tunables.list
index f6a88168cc..cfcf940ebd 100644
--- a/sysdeps/aarch64/dl-tunables.list
+++ b/sysdeps/aarch64/dl-tunables.list
@@ -17,8 +17,8 @@ 
 # <http://www.gnu.org/licenses/>.
 
 glibc {
-  tune {
-    cpu {
+  cpu {
+    name {
       type: STRING
     }
   }
diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c
index 955d4778a6..ad809b9815 100644
--- a/sysdeps/powerpc/cpu-features.c
+++ b/sysdeps/powerpc/cpu-features.c
@@ -30,7 +30,7 @@  init_cpu_features (struct cpu_features *cpu_features)
      tunables is enable, since for this case user can explicit disable
      unaligned optimizations.  */
 #if HAVE_TUNABLES
-  int32_t cached_memfunc = TUNABLE_GET (glibc, tune, cached_memopt, int32_t,
+  int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t,
 					NULL);
   cpu_features->use_cached_memopt = (cached_memfunc > 0);
 #else
diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list
index d26636a16b..b3372555f7 100644
--- a/sysdeps/powerpc/dl-tunables.list
+++ b/sysdeps/powerpc/dl-tunables.list
@@ -17,7 +17,7 @@ 
 # <http://www.gnu.org/licenses/>.
 
 glibc {
-  tune {
+  cpu {
     cached_memopt {
       type: INT_32
       minval: 0
diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
index 39eba0186f..b4f348509e 100644
--- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
+++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c
@@ -57,7 +57,7 @@  init_cpu_features (struct cpu_features *cpu_features)
 
 #if HAVE_TUNABLES
   /* Get the tunable override.  */
-  const char *mcpu = TUNABLE_GET (glibc, tune, cpu, const char *, NULL);
+  const char *mcpu = TUNABLE_GET (glibc, cpu, name, const char *, NULL);
   if (mcpu != NULL)
     midr = get_midr_from_mcpu (mcpu);
 #endif
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index d41ebde823..b8bef8d54b 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -22,7 +22,7 @@ 
 #include <libc-pointer-arith.h>
 
 #if HAVE_TUNABLES
-# define TUNABLE_NAMESPACE tune
+# define TUNABLE_NAMESPACE cpu
 # include <unistd.h>		/* Get STDOUT_FILENO for _dl_printf.  */
 # include <elf/dl-tunables.h>
 
@@ -398,7 +398,7 @@  no_cpuid:
 
   /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86.  */
 #if !HAVE_TUNABLES && defined SHARED
-  /* The glibc.tune.hwcap_mask tunable is initialized already, so no need to do
+  /* The glibc.cpu.hwcap_mask tunable is initialized already, so no need to do
      this.  */
   GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT;
 #endif
diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index 624e681e96..e9713f6215 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -141,7 +141,7 @@  struct cpu_features
   unsigned long int xsave_state_size;
   /* The full state size for XSAVE when XSAVEC is disabled by
 
-     GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable
+     GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable
    */
   unsigned int xsave_state_full_size;
   unsigned int feature[FEATURE_INDEX_MAX];
diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c
index af761dcbbc..c38af71b8a 100644
--- a/sysdeps/x86/cpu-tunables.c
+++ b/sysdeps/x86/cpu-tunables.c
@@ -17,7 +17,7 @@ 
    <http://www.gnu.org/licenses/>.  */
 
 #if HAVE_TUNABLES
-# define TUNABLE_NAMESPACE tune
+# define TUNABLE_NAMESPACE cpu
 # include <stdbool.h>
 # include <stdint.h>
 # include <unistd.h>		/* Get STDOUT_FILENO for _dl_printf.  */
@@ -116,7 +116,7 @@  TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
      the hardware which wasn't available when the selection was made.
      The environment variable:
 
-     GLIBC_TUNABLES=glibc.tune.hwcaps=-xxx,yyy,-zzz,....
+     GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz,....
 
      can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature
      yyy and zzz, where the feature name is case-sensitive and has to
diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list
index 7c3236a68f..9a5a0b1a63 100644
--- a/sysdeps/x86/dl-tunables.list
+++ b/sysdeps/x86/dl-tunables.list
@@ -17,7 +17,7 @@ 
 # <http://www.gnu.org/licenses/>.
 
 glibc {
-  tune {
+  cpu {
     hwcaps {
       type: STRING
     }
diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
index 9f1562f1b2..d51cf03ac9 100644
--- a/sysdeps/x86_64/Makefile
+++ b/sysdeps/x86_64/Makefile
@@ -57,7 +57,7 @@  modules-names += x86_64/tst-x86_64mod-1
 LDFLAGS-tst-x86_64mod-1.so = -Wl,-soname,tst-x86_64mod-1.so
 ifneq (no,$(have-tunables))
 # Test the state size for XSAVE when XSAVEC is disabled.
-tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable
+tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable
 endif
 
 $(objpfx)tst-x86_64-1: $(objpfx)x86_64/tst-x86_64mod-1.so
@@ -74,7 +74,7 @@  $(objpfx)tst-platform-1.out: $(objpfx)x86_64/tst-platformmod-2.so
 # Turn off AVX512F_Usable and AVX2_Usable so that GLRO(dl_platform) is
 # always set to x86_64.
 tst-platform-1-ENV = LD_PRELOAD=$(objpfx)\$$PLATFORM/tst-platformmod-2.so \
-	GLIBC_TUNABLES=glibc.tune.hwcaps=-AVX512F_Usable,-AVX2_Usable
+	GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F_Usable,-AVX2_Usable
 endif
 
 tests += tst-audit3 tst-audit4 tst-audit5 tst-audit6 tst-audit7 \