x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ #21967]

Message ID 20170901174034.GA17543@gmail.com
State Committed
Commit ef8adeb0416309082c41a1518caee6961b5c42e8

Commit Message

Lu, Hongjiu Sept. 1, 2017, 5:40 p.m. UTC
AVX512 functions in mathvec are used on machines with AVX512.  An AVX2
wrapper is also provided, and it can be used when the AVX512 version
isn't profitable.  MathVec_Prefer_No_AVX512 is added to cpu-features.
If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in the GLIBC_TUNABLES
environment variable, the AVX2 wrapper will be used.

Tested on x86-64 machines with and without AVX512.  Also verified
glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on an AVX512 machine.
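
For illustration only, here is a minimal C sketch of how a launcher might
build the tunable string described above and export it before starting a
program.  The helpers format_hwcaps_tunable and export_hwcaps_tunable are
hypothetical names and are not part of glibc; only the tunable name and the
feature string come from this patch.  Tunables are read when a process
starts, so exporting the variable affects programs started afterwards, not
the process that sets it.

```c
#ifndef _POSIX_C_SOURCE
# define _POSIX_C_SOURCE 200112L	/* Expose setenv on strict C compilers.  */
#endif

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: write "glibc.tune.hwcaps=<feature>" into BUF.
   Returns 0 on success, -1 if FEATURE is unusable or BUF is too small.  */
static int
format_hwcaps_tunable (char *buf, size_t size, const char *feature)
{
  assert (buf != NULL && feature != NULL);

  /* ':' separates tunables and '=' separates a name from its value, so
     neither may appear inside a feature name.  */
  if (feature[0] == '\0' || strpbrk (feature, ":=") != NULL)
    return -1;

  int n = snprintf (buf, size, "glibc.tune.hwcaps=%s", feature);
  return (n < 0 || (size_t) n >= size) ? -1 : 0;
}

/* Hypothetical helper: export GLIBC_TUNABLES so that programs started
   later (for example via exec) see the requested feature string.  */
static int
export_hwcaps_tunable (const char *feature)
{
  char value[128];

  if (format_hwcaps_tunable (value, sizeof value, feature) != 0)
    return -1;
  return setenv ("GLIBC_TUNABLES", value, 1);
}
```

Calling export_hwcaps_tunable ("MathVec_Prefer_No_AVX512") and then
exec'ing the target program corresponds to setting the variable by hand in
the shell before running it.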

Any comments?

H.J.
---
	[BZ #21967]
	* sysdeps/x86/cpu-features.h (bit_arch_MathVec_Prefer_No_AVX512):
	New.
	(index_arch_MathVec_Prefer_No_AVX512): Likewise.
	* sysdeps/x86/cpu-tunables.c (TUNABLE_CALLBACK (set_hwcaps)):
	Handle MathVec_Prefer_No_AVX512.
---
 sysdeps/x86/cpu-features.h                          |  2 ++
 sysdeps/x86/cpu-tunables.c                          |  7 +++++++
 sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h | 13 ++++++++-----
 3 files changed, 17 insertions(+), 5 deletions(-)
  

Comments

Andrew Senkevich Sept. 4, 2017, 10:16 a.m. UTC | #1
> -----Original Message-----
> From: Lu, Hongjiu
> Sent: Friday, September 1, 2017 19:41
> To: GNU C Library <libc-alpha@sourceware.org>
> Cc: Senkevich, Andrew <andrew.senkevich@intel.com>
> Subject: [PATCH] x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ
> #21967]
> 
> AVX512 functions in mathvec are used on machines with AVX512.  An AVX2
> wrapper is also provided and it can be used when the AVX512 version isn't
> profitable.  MathVec_Prefer_No_AVX512 is added to cpu-features.
> If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in GLIBC_TUNABLES
> environment variable, the AVX2 wrapper will be used.
> 
> Tested on x86-64 machines with and without AVX512.  Also verified
> glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on AVX512 machine.
> 
> Any comments?

But libc selects the AVX2 mem/str functions on SKX, so it looks like libmvec should do the same (and the opposite option, MathVec_Prefer_AVX512, would then be needed).


--
Andrew
  
H.J. Lu Sept. 4, 2017, 8:27 p.m. UTC | #2
On Mon, Sep 4, 2017 at 3:16 AM, Senkevich, Andrew
<andrew.senkevich@intel.com> wrote:
>> -----Original Message-----
>> From: Lu, Hongjiu
>> Sent: Friday, September 1, 2017 19:41
>> To: GNU C Library <libc-alpha@sourceware.org>
>> Cc: Senkevich, Andrew <andrew.senkevich@intel.com>
>> Subject: [PATCH] x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ
>> #21967]
>>
>> AVX512 functions in mathvec are used on machines with AVX512.  An AVX2
>> wrapper is also provided and it can be used when the AVX512 version isn't
>> profitable.  MathVec_Prefer_No_AVX512 is added to cpu-features.
>> If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in GLIBC_TUNABLES
>> environment variable, the AVX2 wrapper will be used.
>>
>> Tested on x86-64 machines with and without AVX512.  Also verified
>> glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on AVX512 machine.
>>
>> Any comments?
>
> But libc selects AVX2 mem/str on SKX, looks like libmvec should do the same (and backward option MathVec_Prefer_AVX512 needed).
>

The AVX2 mem/str functions can be called from code that may use only SSE
or AVX.  But the AVX512 functions in mathvec are called only from AVX512
programs.  Do we have benchmarks showing that AVX512 programs are faster
with the AVX2 mathvec wrapper?
  
Andrew Senkevich Sept. 12, 2017, 2:27 p.m. UTC | #3
> -----Original Message-----
> From: H.J. Lu [mailto:hjl.tools@gmail.com]
> Sent: Monday, September 4, 2017 22:27
> To: Senkevich, Andrew <andrew.senkevich@intel.com>
> Cc: GNU C Library <libc-alpha@sourceware.org>
> Subject: Re: [PATCH] x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ
> #21967]
>
> On Mon, Sep 4, 2017 at 3:16 AM, Senkevich, Andrew
> <andrew.senkevich@intel.com> wrote:
> >> -----Original Message-----
> >> From: Lu, Hongjiu
> >> Sent: Friday, September 1, 2017 19:41
> >> To: GNU C Library <libc-alpha@sourceware.org>
> >> Cc: Senkevich, Andrew <andrew.senkevich@intel.com>
> >> Subject: [PATCH] x86: Add MathVec_Prefer_No_AVX512 to cpu-features
> >> [BZ #21967]
> >>
> >> AVX512 functions in mathvec are used on machines with AVX512.  An
> >> AVX2 wrapper is also provided and it can be used when the AVX512
> >> version isn't profitable.  MathVec_Prefer_No_AVX512 is added to
> >> cpu-features.
> >> If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in
> >> GLIBC_TUNABLES environment variable, the AVX2 wrapper will be used.
> >>
> >> Tested on x86-64 machines with and without AVX512.  Also verified
> >> glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on AVX512 machine.
> >>
> >> Any comments?
> >
> > But libc selects AVX2 mem/str on SKX, looks like libmvec should do the same
> > (and backward option MathVec_Prefer_AVX512 needed).
> >
>
> AVX2 mem/str  can be called from codes which may use only SSE or AVX.  But
> AVX512 functions in mathvec are called only from AVX512 programs.  Do we
> have benchmarks to show that AVX512 programs are faster with AVX2 mathvec?

I agree: if the compiler vectorizes with AVX512, then glibc should use the AVX512 implementations in mathvec by default.


--
Andrew
  
H.J. Lu Sept. 12, 2017, 2:43 p.m. UTC | #4
On Tue, Sep 12, 2017 at 7:27 AM, Senkevich, Andrew
<andrew.senkevich@intel.com> wrote:
>> -----Original Message-----
>> From: H.J. Lu [mailto:hjl.tools@gmail.com]
>> Sent: Monday, September 4, 2017 22:27
>> To: Senkevich, Andrew <andrew.senkevich@intel.com>
>> Cc: GNU C Library <libc-alpha@sourceware.org>
>> Subject: Re: [PATCH] x86: Add MathVec_Prefer_No_AVX512 to cpu-features [BZ
>> #21967]
>>
>> On Mon, Sep 4, 2017 at 3:16 AM, Senkevich, Andrew
>> <andrew.senkevich@intel.com> wrote:
>> >> -----Original Message-----
>> >> From: Lu, Hongjiu
>> >> Sent: Friday, September 1, 2017 19:41
>> >> To: GNU C Library <libc-alpha@sourceware.org>
>> >> Cc: Senkevich, Andrew <andrew.senkevich@intel.com>
>> >> Subject: [PATCH] x86: Add MathVec_Prefer_No_AVX512 to cpu-features
>> >> [BZ #21967]
>> >>
>> >> AVX512 functions in mathvec are used on machines with AVX512.  An
>> >> AVX2 wrapper is also provided and it can be used when the AVX512
>> >> version isn't profitable.  MathVec_Prefer_No_AVX512 is added to
>> >> cpu-features.
>> >> If glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 is set in
>> >> GLIBC_TUNABLES environment variable, the AVX2 wrapper will be used.
>> >>
>> >> Tested on x86-64 machines with and without AVX512.  Also verified
>> >> glibc.tune.hwcaps=MathVec_Prefer_No_AVX512 on AVX512 machine.
>> >>
>> >> Any comments?
>> >
>> > But libc selects AVX2 mem/str on SKX, looks like libmvec should do the same
>> > (and backward option MathVec_Prefer_AVX512 needed).
>> >
>>
>> AVX2 mem/str  can be called from codes which may use only SSE or AVX.  But
>> AVX512 functions in mathvec are called only from AVX512 programs.  Do we
>> have benchmarks to show that AVX512 programs are faster with AVX2 mathvec?
>
> I agree, if compiler vectorizes with AVX512 then Glibc should use AVX512 implementations in mathvec by default.
>

I am checking it in.   Thanks.
  

Patch

diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index 9e01781424..a032a2e168 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -40,6 +40,7 @@ 
 #define bit_arch_Use_dl_runtime_resolve_opt	(1 << 20)
 #define bit_arch_Use_dl_runtime_resolve_slow	(1 << 21)
 #define bit_arch_Prefer_No_AVX512		(1 << 22)
+#define bit_arch_MathVec_Prefer_No_AVX512	(1 << 23)
 
 /* CPUID Feature flags.  */
 
@@ -239,6 +240,7 @@  extern const struct cpu_features *__get_cpu_features (void)
 # define index_arch_Use_dl_runtime_resolve_opt FEATURE_INDEX_1
 # define index_arch_Use_dl_runtime_resolve_slow FEATURE_INDEX_1
 # define index_arch_Prefer_No_AVX512	FEATURE_INDEX_1
+# define index_arch_MathVec_Prefer_No_AVX512 FEATURE_INDEX_1
 
 #endif	/* !__ASSEMBLER__ */
 
diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c
index 0ab708cca8..ec72d86f08 100644
--- a/sysdeps/x86/cpu-tunables.c
+++ b/sysdeps/x86/cpu-tunables.c
@@ -303,6 +303,13 @@  TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp)
 		 disable, 23);
 	    }
 	  break;
+	case 24:
+	    {
+	      CHECK_GLIBC_IFUNC_ARCH_NEED_ARCH_BOTH
+		(n, cpu_features, MathVec_Prefer_No_AVX512,
+		 AVX512F_Usable, disable, 24);
+	    }
+	  break;
 	case 26:
 	    {
 	      CHECK_GLIBC_IFUNC_ARCH_NEED_CPU_BOTH
diff --git a/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h b/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h
index 1857e1f760..fffc9da114 100644
--- a/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h
+++ b/sysdeps/x86_64/fpu/multiarch/ifunc-mathvec-avx512.h
@@ -32,11 +32,14 @@  IFUNC_SELECTOR (void)
 {
   const struct cpu_features* cpu_features = __get_cpu_features ();
 
-  if (CPU_FEATURES_ARCH_P (cpu_features, AVX512DQ_Usable))
-    return OPTIMIZE (skx);
-
-  if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable))
-    return OPTIMIZE (knl);
+  if (!CPU_FEATURES_ARCH_P (cpu_features, MathVec_Prefer_No_AVX512))
+    {
+      if (CPU_FEATURES_ARCH_P (cpu_features, AVX512DQ_Usable))
+	return OPTIMIZE (skx);
+
+      if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable))
+	return OPTIMIZE (knl);
+    }
 
   return OPTIMIZE (avx2_wrapper);
 }
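
The selector change in the last hunk can be modeled outside glibc as a plain
function.  The sketch below is not glibc code: struct mathvec_flags and
select_mathvec_variant are hypothetical stand-ins for struct cpu_features
and IFUNC_SELECTOR, and the returned strings stand for the OPTIMIZE (skx),
OPTIMIZE (knl), and OPTIMIZE (avx2_wrapper) variants.  It is meant only to
show that the new preference bit overrides both AVX512 checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the three flags the selector reads from
   struct cpu_features.  */
struct mathvec_flags
{
  bool avx512dq_usable;		  /* Models AVX512DQ_Usable.  */
  bool avx512f_usable;		  /* Models AVX512F_Usable.  */
  bool mathvec_prefer_no_avx512;  /* Models MathVec_Prefer_No_AVX512.  */
};

/* Return the name of the variant the patched selector would pick.  */
static const char *
select_mathvec_variant (const struct mathvec_flags *flags)
{
  assert (flags != NULL);

  /* The AVX512 variants are considered only when the preference bit
     is clear.  */
  if (!flags->mathvec_prefer_no_avx512)
    {
      if (flags->avx512dq_usable)
	return "skx";

      if (flags->avx512f_usable)
	return "knl";
    }

  /* Fallback: no usable AVX512, or the tunable asked to avoid it.  */
  return "avx2_wrapper";
}
```

In this model, a machine with AVX512DQ and the preference bit clear selects
"skx", while setting the bit selects "avx2_wrapper" regardless of the two
AVX512 flags.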