Bug 201689: Belt-and-suspenders detection of FMA.

Message ID 85c5a00e-583b-da48-0bac-66ec6231be5a@redhat.com
State Committed
Headers

Commit Message

Carlos O'Donell Oct. 17, 2016, 3:44 p.m. UTC
  On 10/14/2016 03:12 PM, H.J. Lu wrote:
> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>> In the Intel Architecture Instruction Set Extensions Programming reference
>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>> is:
>>
>> "Application Software must identify that hardware supports AVX as explained
>>  in ... after that it must also detect support for FMA..."
>>
>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>> after that we check for AVX and FMA orthogonally. It is conceivable that
>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>
>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>> we should follow the intel specifications and adjust our check as the following
>> patch works.
> 
> One can't write a program with FMA nor AVX2 without using AVX
> instructions.  AVX2/FMA aren't usable if AVX isn't.   We should do
> 
> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
>    {
>       Set AVX available
>       if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
>          Set AVX2 available
>       if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>          Set FMA available
>    }
> 

Agreed. I double checked the manual, here is a new patch.

Testing on x86_64 and i686.

OK to commit if the results are clean?

v2
- Move FMA and AVX2 check up into AVX check since they depend upon it.

2016-10-14  Carlos O'Donell  <carlos@redhat.com>

	[BZ #20689]
	* sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if
	AVX is usable.

---
  

Comments

H.J. Lu Oct. 17, 2016, 3:57 p.m. UTC | #1
On Mon, Oct 17, 2016 at 8:44 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 10/14/2016 03:12 PM, H.J. Lu wrote:
>> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>>> In the Intel Architecture Instruction Set Extensions Programming reference
>>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>>> is:
>>>
>>> "Application Software must identify that hardware supports AVX as explained
>>>  in ... after that it must also detect support for FMA..."
>>>
>>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>>> after that we check for AVX and FMA orthogonally. It is conceivable that
>>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>>
>>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>>> we should follow the intel specifications and adjust our check as the following
>>> patch works.
>>
>> One can't write a program with FMA nor AVX2 without using AVX
>> instructions.  AVX2/FMA aren't usable if AVX isn't.   We should do
>>
>> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
>>    {
>>       Set AVX available
>>       if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
>>          Set AVX2 available
>>       if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>>          Set FMA available
>>    }
>>
>
> Agreed. I double checked the manual, here is a new patch.
>
> Testing on x86_64 and i686.
>
> OK to commit if the results are clean?
>
> v2
> - Move FMA and AVX2 check up into AVX check since they depend upon it.
>
> 2016-10-14  Carlos O'Donell  <carlos@redhat.com>
>
>         [BZ #20689]
>         * sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if
>         AVX is usable.
>
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index 11b9af2..e228a76 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -60,12 +60,20 @@ get_common_indeces (struct cpu_features *cpu_features,
>         {
>           /* Determine if AVX is usable.  */
>           if (CPU_FEATURES_CPU_P (cpu_features, AVX))
> -           cpu_features->feature[index_arch_AVX_Usable]
> -             |= bit_arch_AVX_Usable;
> -         /* Determine if AVX2 is usable.  */
> -         if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
> -           cpu_features->feature[index_arch_AVX2_Usable]
> -             |= bit_arch_AVX2_Usable;
> +           {
> +             cpu_features->feature[index_arch_AVX_Usable]
> +               |= bit_arch_AVX_Usable;
> +             /* The following features depend on AVX being usable.  */
> +             /* Determine if AVX2 is usable.  */
> +             if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
> +               cpu_features->feature[index_arch_AVX2_Usable]
> +                 |= bit_arch_AVX2_Usable;
> +             /* Determine if FMA is usable.  */
> +             if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> +               cpu_features->feature[index_arch_FMA_Usable]
> +                 |= bit_arch_FMA_Usable;
> +           }
> +
>           /* Check if OPMASK state, upper 256-bit of ZMM0-ZMM15 and
>              ZMM16-ZMM31 state are enabled.  */
>           if ((xcrlow & (bit_Opmask_state | bit_ZMM0_15_state
> @@ -83,10 +91,6 @@ get_common_indeces (struct cpu_features *cpu_features,
>                       |= bit_arch_AVX512DQ_Usable;
>                 }
>             }
> -         /* Determine if FMA is usable.  */
> -         if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> -           cpu_features->feature[index_arch_FMA_Usable]
> -             |= bit_arch_FMA_Usable;
>         }
>      }
>  }
> ---
>

This is OK.

Thanks.
  
Carlos O'Donell Oct. 17, 2016, 11:40 p.m. UTC | #2
On 10/17/2016 11:44 AM, Carlos O'Donell wrote:
> On 10/14/2016 03:12 PM, H.J. Lu wrote:
>> On Fri, Oct 14, 2016 at 11:09 AM, Carlos O'Donell <carlos@redhat.com> wrote:
>>> In the Intel Architecture Instruction Set Extensions Programming reference
>>> the recommended way to test for FMA in section '2.2.1 Detection of FMA'
>>> is:
>>>
>>> "Application Software must identify that hardware supports AVX as explained
>>>  in ... after that it must also detect support for FMA..."
>>>
>>> We don't do that in glibc. We use osxsave to detect the use of xgetbv, and
>>> after that we check for AVX and FMA orthogonally. It is conceivable that
>>> you could have the AVX bit clear and the FMA bit in an undefined state.
>>>
>>> I have never seen a machine with the AVX bit clear and the FMA bit set, but
>>> we should follow the intel specifications and adjust our check as the following
>>> patch works.
>>
>> One can't write a program with FMA nor AVX2 without using AVX
>> instructions.  AVX2/FMA aren't usable if AVX isn't.   We should do
>>
>> if (CPU_FEATURES_CPU_P (cpu_features, AVX))
>>    {
>>       Set AVX available
>>       if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
>>          Set AVX2 available
>>       if (CPU_FEATURES_CPU_P (cpu_features, FMA))
>>          Set FMA available
>>    }
>>
> 
> Agreed. I double checked the manual, here is a new patch.
> 
> Testing on x86_64 and i686.
> 
> OK to commit if the results are clean?
> 
> v2
> - Move FMA and AVX2 check up into AVX check since they depend upon it.
> 
> 2016-10-14  Carlos O'Donell  <carlos@redhat.com>
> 
> 	[BZ #20689]
> 	* sysdeps/x86/cpu-features.c: Only enable FMA and AVX2 if
> 	AVX is usable.

Committed. Thanks.
  

Patch

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 11b9af2..e228a76 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -60,12 +60,20 @@  get_common_indeces (struct cpu_features *cpu_features,
        {
          /* Determine if AVX is usable.  */
          if (CPU_FEATURES_CPU_P (cpu_features, AVX))
-           cpu_features->feature[index_arch_AVX_Usable]
-             |= bit_arch_AVX_Usable;
-         /* Determine if AVX2 is usable.  */
-         if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
-           cpu_features->feature[index_arch_AVX2_Usable]
-             |= bit_arch_AVX2_Usable;
+           {
+             cpu_features->feature[index_arch_AVX_Usable]
+               |= bit_arch_AVX_Usable;
+             /* The following features depend on AVX being usable.  */
+             /* Determine if AVX2 is usable.  */
+             if (CPU_FEATURES_CPU_P (cpu_features, AVX2))
+               cpu_features->feature[index_arch_AVX2_Usable]
+                 |= bit_arch_AVX2_Usable;
+             /* Determine if FMA is usable.  */
+             if (CPU_FEATURES_CPU_P (cpu_features, FMA))
+               cpu_features->feature[index_arch_FMA_Usable]
+                 |= bit_arch_FMA_Usable;
+           }
+
          /* Check if OPMASK state, upper 256-bit of ZMM0-ZMM15 and
             ZMM16-ZMM31 state are enabled.  */
          if ((xcrlow & (bit_Opmask_state | bit_ZMM0_15_state
@@ -83,10 +91,6 @@  get_common_indeces (struct cpu_features *cpu_features,
                      |= bit_arch_AVX512DQ_Usable;
                }
            }
-         /* Determine if FMA is usable.  */
-         if (CPU_FEATURES_CPU_P (cpu_features, FMA))
-           cpu_features->feature[index_arch_FMA_Usable]
-             |= bit_arch_FMA_Usable;
        }
     }
 }