libatomic: Improve ifunc selection on AArch64
Commit Message
Add support for ifunc selection based on the CPUID register (MIDR_EL1). Neoverse
N1 supports atomic 128-bit load/store, so use the FEAT_USCAT ifunc as on newer
Neoverse cores.
Passes regress, OK for commit?
libatomic/
config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
selection.
---
Comments
On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:
> +#ifdef HWCAP_USCAT
> +
> +#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
> +#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
> +
> +static inline bool
> +ifunc1 (unsigned long hwcap)
> +{
> +  if (hwcap & HWCAP_USCAT)
> +    return true;
> +  if (!(hwcap & HWCAP_CPUID))
> +    return false;
> +
> +  unsigned long midr;
> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
> +
> +  /* Neoverse N1 supports atomic 128-bit load/store. */
> +  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
> +    return true;
> +
> +  return false;
> +}
> +#endif
Why would HWCAP_USCAT not be set by the kernel?
Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.
r~
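For reference, the ID-register check suggested above can be sketched as a plain field extraction. This is only an illustration and not part of the patch: the helper names are hypothetical, and it assumes the architectural layout in which ID_AA64MMFR2_EL1.AT occupies bits [35:32], with a nonzero value indicating FEAT_LSE2. Reading the register from userspace would additionally rely on the kernel's HWCAP_CPUID emulation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract the AT field, assumed to live in
   bits [35:32] of an ID_AA64MMFR2_EL1 value.  */
static inline unsigned
mmfr2_at_field (uint64_t mmfr2)
{
  return (unsigned) ((mmfr2 >> 32) & 0xf);
}

/* Hypothetical helper: treat any nonzero AT value as "FEAT_LSE2
   implemented".  */
static inline bool
mmfr2_has_lse2 (uint64_t mmfr2)
{
  return mmfr2_at_field (mmfr2) != 0;
}
```

The functions take the register value as an argument, so the decoding can be exercised on any host.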
On 8/9/23 19:11, Richard Henderson wrote:
> On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:
>> +#ifdef HWCAP_USCAT
>> +
>> +#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
>> +#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
>> +
>> +static inline bool
>> +ifunc1 (unsigned long hwcap)
>> +{
>> +  if (hwcap & HWCAP_USCAT)
>> +    return true;
>> +  if (!(hwcap & HWCAP_CPUID))
>> +    return false;
>> +
>> +  unsigned long midr;
>> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
>> +
>> +  /* Neoverse N1 supports atomic 128-bit load/store. */
>> +  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
>> +    return true;
>> +
>> +  return false;
>> +}
>> +#endif
>
> Why would HWCAP_USCAT not be set by the kernel?
>
> Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.
Answering my own question, N1 does not officially have FEAT_LSE2.
r~
Hi Richard,
>> Why would HWCAP_USCAT not be set by the kernel?
>>
>> Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.
>>
> Answering my own question, N1 does not officially have FEAT_LSE2.
Indeed it doesn't. However, most cores support atomic 128-bit load/store
(part of LSE2), so we can still use the LSE2 ifunc for those cores. Since there
isn't a feature bit for this in the CPU or HWCAP, I check the CPUID register.
Cheers,
Wilco
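As an illustration of the CPUID check described above, the MIDR decoding can be written as small pure functions. This is a sketch, not the patch itself: the function and macro names are hypothetical, while the field positions (implementer in bits [31:24], part number in bits [15:4]) and the values 'A' and 0xd0c are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values match those used in the patch.  */
#define MIDR_IMPLEMENTER_ARM      0x41u   /* ASCII 'A' */
#define MIDR_PARTNUM_NEOVERSE_N1  0xd0cu

/* Implementer code: bits [31:24] of MIDR_EL1.  */
static inline unsigned
midr_implementer (uint64_t midr)
{
  return (unsigned) ((midr >> 24) & 0xff);
}

/* Primary part number: bits [15:4] of MIDR_EL1.  */
static inline unsigned
midr_partnum (uint64_t midr)
{
  return (unsigned) ((midr >> 4) & 0xfff);
}

/* True if the MIDR value identifies an Arm Neoverse N1.  */
static inline bool
midr_is_neoverse_n1 (uint64_t midr)
{
  return midr_implementer (midr) == MIDR_IMPLEMENTER_ARM
	 && midr_partnum (midr) == MIDR_PARTNUM_NEOVERSE_N1;
}
```

For example, a sample value such as 0x413fd0c1 decodes to implementer 0x41 and part number 0xd0c, so midr_is_neoverse_n1 accepts it. Variant and revision bits are ignored, as in the patch.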
On 8/10/23 02:50, Wilco Dijkstra wrote:
> Hi Richard,
>
>>> Why would HWCAP_USCAT not be set by the kernel?
>>>
>>> Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.
>>>
>> Answering my own question, N1 does not officially have FEAT_LSE2.
>
> It doesn't indeed. However most cores support atomic 128-bit load/store
> (part of LSE2), so we can still use the LSE2 ifunc for those cores. Since there
> isn't a feature bit for this in the CPU or HWCAP, I check the CPUID register.
That would be a really nice bit to add to HWCAP, then, to consolidate this knowledge in
one place. Certainly I would use it in QEMU as well.
r~
Hi Richard,
>>> Answering my own question, N1 does not officially have FEAT_LSE2.
>>
>> It doesn't indeed. However most cores support atomic 128-bit load/store
>> (part of LSE2), so we can still use the LSE2 ifunc for those cores. Since there
>> isn't a feature bit for this in the CPU or HWCAP, I check the CPUID register.
>
> That would be a really nice bit to add to HWCAP, then, to consolidate this knowledge in
> one place. Certainly I would use it in QEMU as well.
Yes, this was suggested by a colleague as well. I'll ask and see whether the kernel guys
like the idea. It would take some time to get added, so we still need this for the time
being.
Cheers,
Wilco
ping
From: Wilco Dijkstra
Sent: 04 August 2023 16:05
To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
Subject: [PATCH] libatomic: Improve ifunc selection on AArch64
Add support for ifunc selection based on CPUID register. Neoverse N1 supports
atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
cores.
Passes regress, OK for commit?
libatomic/
config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
selection.
---
diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index 851c78c01cd643318aaa52929ce4550266238b79..e5dc33c030a4bab927874fa6c69425db463fdc4b 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -26,7 +26,7 @@
#ifdef HWCAP_USCAT
# if N == 16
-# define IFUNC_COND_1 (hwcap & HWCAP_USCAT)
+# define IFUNC_COND_1 ifunc1 (hwcap)
# else
# define IFUNC_COND_1 (hwcap & HWCAP_ATOMICS)
# endif
@@ -50,4 +50,28 @@
#undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
#define MAYBE_HAVE_ATOMIC_EXCHANGE_16 1
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!(hwcap & HWCAP_CPUID))
+    return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store. */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+    return true;
+
+  return false;
+}
+#endif
+
#include_next <host-config.h>
ping
ping
Hi Wilco,
> -----Original Message-----
> From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Sent: Monday, November 6, 2023 12:13 PM
> To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: Re: [PATCH] libatomic: Improve ifunc selection on AArch64
>
>
>
> ping
>
>
> From: Wilco Dijkstra
> Sent: 04 August 2023 16:05
> To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: [PATCH] libatomic: Improve ifunc selection on AArch64
>
>
> Add support for ifunc selection based on CPUID register. Neoverse N1 supports
> atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
> cores.
>
> Passes regress, OK for commit?
>
> libatomic/
> config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
> selection.
>
> ---
>
> diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
> index 851c78c01cd643318aaa52929ce4550266238b79..e5dc33c030a4bab927874fa6c69425db463fdc4b 100644
> --- a/libatomic/config/linux/aarch64/host-config.h
> +++ b/libatomic/config/linux/aarch64/host-config.h
> @@ -26,7 +26,7 @@
>
> #ifdef HWCAP_USCAT
> # if N == 16
> -# define IFUNC_COND_1 (hwcap & HWCAP_USCAT)
> +# define IFUNC_COND_1 ifunc1 (hwcap)
> # else
> # define IFUNC_COND_1 (hwcap & HWCAP_ATOMICS)
> # endif
> @@ -50,4 +50,28 @@
> #undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
> #define MAYBE_HAVE_ATOMIC_EXCHANGE_16 1
>
> +#ifdef HWCAP_USCAT
> +
> +#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
> +#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
> +
> +static inline bool
> +ifunc1 (unsigned long hwcap)
> +{
> +  if (hwcap & HWCAP_USCAT)
> +    return true;
> +  if (!(hwcap & HWCAP_CPUID))
> +    return false;
> +
> +  unsigned long midr;
> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
From what I recall, the midr_el1 register is emulated by the kernel, so userspace software has to check through hwcaps that the kernel supports that emulation before reading it.
According to https://www.kernel.org/doc/html/v5.8/arm64/cpu-feature-registers.html you need to check (getauxval(AT_HWCAP) & HWCAP_CPUID) before doing that read.
Thanks,
Kyrill
> +
> +  /* Neoverse N1 supports atomic 128-bit load/store. */
> +  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
> +    return true;
> +
> +  return false;
> +}
> +#endif
> +
> #include_next <host-config.h>
Hi Kyrill,
> +  if (!(hwcap & HWCAP_CPUID))
> +    return false;
> +
> +  unsigned long midr;
> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
> From what I recall, the midr_el1 register is emulated by the kernel, so userspace software
> has to check through hwcaps that the kernel supports that emulation before reading it.
> According to https://www.kernel.org/doc/html/v5.8/arm64/cpu-feature-registers.html you
> need to check (getauxval(AT_HWCAP) & HWCAP_CPUID) before doing that read.
That's why I do that immediately before reading midr_el1 - see above.
Cheers,
Wilco
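The ordering discussed here (test HWCAP_CPUID first, only then read midr_el1) can be sketched as a standalone helper. This is a hedged illustration rather than the libatomic code: read_midr_checked is a hypothetical name, the fallback definition of HWCAP_CPUID as bit 11 is an assumption for older headers, and on non-AArch64 or non-Linux hosts the helper simply reports that the register is unavailable.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined (__aarch64__) && defined (__linux__)
# include <sys/auxv.h>
# ifndef HWCAP_CPUID
#  define HWCAP_CPUID (1UL << 11)  /* Assumed value for older headers.  */
# endif
#endif

/* Hypothetical helper: store MIDR_EL1 in *MIDR and return true only if
   the kernel advertises emulation of the ID registers via HWCAP_CPUID.
   Otherwise leave *MIDR untouched and return false.  */
static bool
read_midr_checked (uint64_t *midr)
{
#if defined (__aarch64__) && defined (__linux__)
  if (!(getauxval (AT_HWCAP) & HWCAP_CPUID))
    return false;

  uint64_t value;
  __asm__ volatile ("mrs %0, midr_el1" : "=r" (value));
  *midr = value;
  return true;
#else
  (void) midr;
  return false;
#endif
}
```

Inside an ifunc resolver the hwcap value is passed in as an argument, as in the patch, so the getauxval call here only stands in for that argument in ordinary application code.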
> -----Original Message-----
> From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Sent: Friday, November 10, 2023 10:23 AM
> To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; GCC Patches <gcc-patches@gcc.gnu.org>;
> Richard Sandiford <Richard.Sandiford@arm.com>
> Subject: Re: [PATCH] libatomic: Improve ifunc selection on AArch64
>
> Hi Kyrill,
>
> > +  if (!(hwcap & HWCAP_CPUID))
> > +    return false;
> > +
> > +  unsigned long midr;
> > +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
>
> > From what I recall that midr_el1 register is emulated by the kernel and so userspace software
> > has to check that the kernel supports that emulation through hwcaps before reading it.
> > According to https://www.kernel.org/doc/html/v5.8/arm64/cpu-feature-registers.html you
> > need to check getauxval(AT_HWCAP) & HWCAP_CPUID) before doing that read.
>
> That's why I do that immediately before reading midr_el1 - see above.
Errr, yes. Obviously I wasn't fully awake when I looked at it!
Sorry for the noise.
Ok for trunk then.
Kyrill
>
> Cheers,
> Wilco
@@ -26,7 +26,7 @@
#ifdef HWCAP_USCAT
# if N == 16
-# define IFUNC_COND_1 (hwcap & HWCAP_USCAT)
+# define IFUNC_COND_1 ifunc1 (hwcap)
# else
# define IFUNC_COND_1 (hwcap & HWCAP_ATOMICS)
# endif
@@ -50,4 +50,28 @@
#undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
#define MAYBE_HAVE_ATOMIC_EXCHANGE_16 1
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr) (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!(hwcap & HWCAP_CPUID))
+    return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store. */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+    return true;
+
+  return false;
+}
+#endif
+
#include_next <host-config.h>