Commit Message
After the git commit "f4b6d20366aac66070f1cf50552cf2951991a1e5" FMA4 detection is failing in trunk.
This commit was related to Bugzilla id https://sourceware.org/bugzilla/show_bug.cgi?id=19214 and is ok to fix in following way in cpu-features.c file?
Thanks,
Amit Pawar
Comments
On Thu, 2 Jun 2016, Pawar, Amit wrote:
> After the git commit "f4b6d20366aac66070f1cf50552cf2951991a1e5" FMA4
> detection is failing in trunk.
> This commit was related to Bugzilla id
> https://sourceware.org/bugzilla/show_bug.cgi?id=19214 and is ok to fix
> in following way in cpu-features.c file?
You need to open a new bug for what appears to be a new issue (and then
mention [BZ #N] in your ChangeLog entry, etc.).
On 06/02/2016 05:35 AM, Pawar, Amit wrote:
> After the git commit "f4b6d20366aac66070f1cf50552cf2951991a1e5" FMA4 detection is failing in trunk.
You need to explain _why_ it's not working.
> This commit was related to Bugzilla id https://sourceware.org/bugzilla/show_bug.cgi?id=19214 and is ok to fix in following way in cpu-features.c file?
>
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index a5fa81f..384e328 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -87,10 +87,6 @@ get_common_indeces (struct cpu_features *cpu_features,
> if (CPU_FEATURES_CPU_P (cpu_features, FMA))
> cpu_features->feature[index_arch_FMA_Usable]
> |= bit_arch_FMA_Usable;
> - /* Determine if FMA4 is usable. */
> - if (CPU_FEATURES_CPU_P (cpu_features, FMA4))
> - cpu_features->feature[index_arch_FMA4_Usable]
> - |= bit_arch_FMA4_Usable;
> }
You move the feature check from the subroutine get_common_indeces...
> }
> }
> @@ -230,6 +226,25 @@ init_cpu_features (struct cpu_features *cpu_features)
> cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx,
> cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx);
>
> + /* Can we call xgetbv? */
> + if (CPU_FEATURES_CPU_P (cpu_features, OSXSAVE))
> + {
> + unsigned int xcrlow;
> + unsigned int xcrhigh;
> +
> + asm ("xgetbv" : "=a" (xcrlow), "=d" (xcrhigh) : "c" (0));
> +
> + /* Is YMM and XMM state usable? */
> + if ((xcrlow & (bit_YMM_state | bit_XMM_state)) ==
> + (bit_YMM_state | bit_XMM_state))
> + {
> + /* Determine if FMA4 is usable. */
> + if (CPU_FEATURES_CPU_P (cpu_features, FMA4))
> + cpu_features->feature[index_arch_FMA4_Usable]
> + |= bit_arch_FMA4_Usable;
... and up into the init_cpu_features (caller of get_common_indeces).
As you refactor it up the call stack you end up copying the osxsave/xgetbv
checks which get_common_indeces was already doing.
As far as I can tell your patch just moves the check earlier... why?
> + }
> + }
> +
> if (family == 0x15)
> {
> #if index_arch_Fast_Unaligned_Load != index_arch_Fast_Copy_Backward
>
>
> Thanks,
> Amit Pawar
>... and up into the init_cpu_features (caller of get_common_indeces).
>
>As you refactor it up the call stack you end up copying the osxsave/xgetbv checks which get_common_indeces was already doing.
>
>As far as I can tell your patch just moves the check earlier... why?
Thanks for replying and I have filed a new defect in Bugzilla as per the suggestion and please check the same https://sourceware.org/bugzilla/show_bug.cgi?id=20195
Thanks,
Amit
@@ -87,10 +87,6 @@ get_common_indeces (struct cpu_features *cpu_features,
if (CPU_FEATURES_CPU_P (cpu_features, FMA))
cpu_features->feature[index_arch_FMA_Usable]
|= bit_arch_FMA_Usable;
- /* Determine if FMA4 is usable. */
- if (CPU_FEATURES_CPU_P (cpu_features, FMA4))
- cpu_features->feature[index_arch_FMA4_Usable]
- |= bit_arch_FMA4_Usable;
}
}
}
@@ -230,6 +226,25 @@ init_cpu_features (struct cpu_features *cpu_features)
cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].ecx,
cpu_features->cpuid[COMMON_CPUID_INDEX_80000001].edx);
+ /* Can we call xgetbv? */
+ if (CPU_FEATURES_CPU_P (cpu_features, OSXSAVE))
+ {
+ unsigned int xcrlow;
+ unsigned int xcrhigh;
+
+ asm ("xgetbv" : "=a" (xcrlow), "=d" (xcrhigh) : "c" (0));
+
+ /* Is YMM and XMM state usable? */
+ if ((xcrlow & (bit_YMM_state | bit_XMM_state)) ==
+ (bit_YMM_state | bit_XMM_state))
+ {
+ /* Determine if FMA4 is usable. */
+ if (CPU_FEATURES_CPU_P (cpu_features, FMA4))
+ cpu_features->feature[index_arch_FMA4_Usable]
+ |= bit_arch_FMA4_Usable;
+ }
+ }
+
if (family == 0x15)
{
#if index_arch_Fast_Unaligned_Load != index_arch_Fast_Copy_Backward