libatomic: Improve ifunc selection on AArch64

Message ID PAWPR08MB8982E6D99993A9BC82615B368309A@PAWPR08MB8982.eurprd08.prod.outlook.com
State New
Headers
Series libatomic: Improve ifunc selection on AArch64 |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gcc_check--master-arm warning Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-aarch64 warning Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_check--master-aarch64 warning Patch failed to apply
linaro-tcwg-bot/tcwg_gcc_build--master-arm warning Patch failed to apply

Commit Message

Wilco Dijkstra Aug. 4, 2023, 3:05 p.m. UTC
  Add support for ifunc selection based on CPUID register.  Neoverse N1 supports
atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
cores.

Passes regress, OK for commit?

libatomic/
	config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
	selection.

---
  

Comments

Richard Henderson Aug. 10, 2023, 2:11 a.m. UTC | #1
On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:
> +#ifdef HWCAP_USCAT
> +
> +#define MIDR_IMPLEMENTOR(midr)	(((midr) >> 24) & 255)
> +#define MIDR_PARTNUM(midr)	(((midr) >> 4) & 0xfff)
> +
> +static inline bool
> +ifunc1 (unsigned long hwcap)
> +{
> +  if (hwcap & HWCAP_USCAT)
> +    return true;
> +  if (!(hwcap & HWCAP_CPUID))
> +    return false;
> +
> +  unsigned long midr;
> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
> +
> +  /* Neoverse N1 supports atomic 128-bit load/store.  */
> +  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
> +    return true;
> +
> +  return false;
> +}
> +#endif

Why would HWCAP_USCAT not be set by the kernel?

Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.


r~
  
Richard Henderson Aug. 10, 2023, 2:15 a.m. UTC | #2
On 8/9/23 19:11, Richard Henderson wrote:
> On 8/4/23 08:05, Wilco Dijkstra via Gcc-patches wrote:
>> +#ifdef HWCAP_USCAT
>> +
>> +#define MIDR_IMPLEMENTOR(midr)    (((midr) >> 24) & 255)
>> +#define MIDR_PARTNUM(midr)    (((midr) >> 4) & 0xfff)
>> +
>> +static inline bool
>> +ifunc1 (unsigned long hwcap)
>> +{
>> +  if (hwcap & HWCAP_USCAT)
>> +    return true;
>> +  if (!(hwcap & HWCAP_CPUID))
>> +    return false;
>> +
>> +  unsigned long midr;
>> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
>> +
>> +  /* Neoverse N1 supports atomic 128-bit load/store.  */
>> +  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
>> +    return true;
>> +
>> +  return false;
>> +}
>> +#endif
> 
> Why would HWCAP_USCAT not be set by the kernel?
> 
> Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.

Answering my own question, N1 does not officially have FEAT_LSE2.


r~
  
Wilco Dijkstra Aug. 10, 2023, 9:50 a.m. UTC | #3
Hi Richard,

>> Why would HWCAP_USCAT not be set by the kernel?
>> 
>> Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.
>>
> Answering my own question, N1 does not officially have FEAT_LSE2.

It doesn't indeed. However most cores support atomic 128-bit load/store
(part of LSE2), so we can still use the LSE2 ifunc for those cores. Since there
isn't a feature bit for this in the CPU or HWCAP, I check the CPUID register.

Cheers,
Wilco
  
Richard Henderson Aug. 10, 2023, 3:14 p.m. UTC | #4
On 8/10/23 02:50, Wilco Dijkstra wrote:
> Hi Richard,
> 
>>> Why would HWCAP_USCAT not be set by the kernel?
>>>
>>> Failing that, I would think you would check ID_AA64MMFR2_EL1.AT.
>>>
>> Answering my own question, N1 does not officially have FEAT_LSE2.
> 
> It doesn't indeed. However most cores support atomic 128-bit load/store
> (part of LSE2), so we can still use the LSE2 ifunc for those cores. Since there
> isn't a feature bit for this in the CPU or HWCAP, I check the CPUID register.

That would be a really nice bit to add to HWCAP, then, to consolidate this knowledge in 
one place.  Certainly I would use it in QEMU as well.


r~
  
Wilco Dijkstra Aug. 10, 2023, 4:09 p.m. UTC | #5
Hi Richard,

>>> Answering my own question, N1 does not officially have FEAT_LSE2.
>> 
>> It doesn't indeed. However most cores support atomic 128-bit load/store
>> (part of LSE2), so we can still use the LSE2 ifunc for those cores. Since there
>> isn't a feature bit for this in the CPU or HWCAP, I check the CPUID register.
>
> That would be a really nice bit to add to HWCAP, then, to consolidate this knowledge in 
> one place.  Certainly I would use it in QEMU as well.

Yes this was suggested by a colleague as well. I'll ask and see whether the kernel guys
like the idea. It would take some time to get added, so we still need this for the time
being.

Cheers,
Wilco
  
Wilco Dijkstra Sept. 13, 2023, 2:56 p.m. UTC | #6
ping


From: Wilco Dijkstra
Sent: 04 August 2023 16:05
To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
Subject: [PATCH] libatomic: Improve ifunc selection on AArch64 
 

Add support for ifunc selection based on CPUID register.  Neoverse N1 supports
atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
cores.

Passes regress, OK for commit?

libatomic/
        config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
        selection.

---

diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index 851c78c01cd643318aaa52929ce4550266238b79..e5dc33c030a4bab927874fa6c69425db463fdc4b 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -26,7 +26,7 @@
 
 #ifdef HWCAP_USCAT
 # if N == 16
-#  define IFUNC_COND_1 (hwcap & HWCAP_USCAT)
+#  define IFUNC_COND_1 ifunc1 (hwcap)
 # else
 #  define IFUNC_COND_1  (hwcap & HWCAP_ATOMICS)
 # endif
@@ -50,4 +50,28 @@
 #undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
 #define MAYBE_HAVE_ATOMIC_EXCHANGE_16   1
 
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!(hwcap & HWCAP_CPUID))
+    return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store.  */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+    return true;
+
+  return false;
+}
+#endif
+
 #include_next <host-config.h>
  
Wilco Dijkstra Oct. 16, 2023, 12:51 p.m. UTC | #7
ping


From: Wilco Dijkstra
Sent: 04 August 2023 16:05
To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
Subject: [PATCH] libatomic: Improve ifunc selection on AArch64 
 

Add support for ifunc selection based on CPUID register.  Neoverse N1 supports
atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
cores.

Passes regress, OK for commit?

libatomic/
        config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
        selection.

---

diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index 851c78c01cd643318aaa52929ce4550266238b79..e5dc33c030a4bab927874fa6c69425db463fdc4b 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -26,7 +26,7 @@
 
 #ifdef HWCAP_USCAT
 # if N == 16
-#  define IFUNC_COND_1 (hwcap & HWCAP_USCAT)
+#  define IFUNC_COND_1 ifunc1 (hwcap)
 # else
 #  define IFUNC_COND_1  (hwcap & HWCAP_ATOMICS)
 # endif
@@ -50,4 +50,28 @@
 #undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
 #define MAYBE_HAVE_ATOMIC_EXCHANGE_16   1
 
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!(hwcap & HWCAP_CPUID))
+    return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store.  */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+    return true;
+
+  return false;
+}
+#endif
+
 #include_next <host-config.h>
  
Wilco Dijkstra Nov. 6, 2023, 12:12 p.m. UTC | #8
ping


From: Wilco Dijkstra
Sent: 04 August 2023 16:05
To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford <Richard.Sandiford@arm.com>
Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
Subject: [PATCH] libatomic: Improve ifunc selection on AArch64 
 

Add support for ifunc selection based on CPUID register.  Neoverse N1 supports
atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
cores.

Passes regress, OK for commit?

libatomic/
        config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
        selection.

---

diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index 851c78c01cd643318aaa52929ce4550266238b79..e5dc33c030a4bab927874fa6c69425db463fdc4b 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -26,7 +26,7 @@
 
 #ifdef HWCAP_USCAT
 # if N == 16
-#  define IFUNC_COND_1 (hwcap & HWCAP_USCAT)
+#  define IFUNC_COND_1 ifunc1 (hwcap)
 # else
 #  define IFUNC_COND_1  (hwcap & HWCAP_ATOMICS)
 # endif
@@ -50,4 +50,28 @@
 #undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
 #define MAYBE_HAVE_ATOMIC_EXCHANGE_16   1
 
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!(hwcap & HWCAP_CPUID))
+    return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store.  */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+    return true;
+
+  return false;
+}
+#endif
+
 #include_next <host-config.h>
  
Kyrylo Tkachov Nov. 10, 2023, 10 a.m. UTC | #9
Hi Wilco,

> -----Original Message-----
> From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Sent: Monday, November 6, 2023 12:13 PM
> To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: Re: [PATCH] libatomic: Improve ifunc selection on AArch64
> 
> 
> 
> ping
> 
> 
> From: Wilco Dijkstra
> Sent: 04 August 2023 16:05
> To: GCC Patches <gcc-patches@gcc.gnu.org>; Richard Sandiford
> <Richard.Sandiford@arm.com>
> Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: [PATCH] libatomic: Improve ifunc selection on AArch64
> 
> 
> Add support for ifunc selection based on CPUID register.  Neoverse N1
> supports
> atomic 128-bit load/store, so use the FEAT_USCAT ifunc like newer Neoverse
> cores.
> 
> Passes regress, OK for commit?
> 
> libatomic/
>         config/linux/aarch64/host-config.h (ifunc1): Use CPUID in ifunc
>         selection.
> 
> ---
> 
> diff --git a/libatomic/config/linux/aarch64/host-config.h
> b/libatomic/config/linux/aarch64/host-config.h
> index
> 851c78c01cd643318aaa52929ce4550266238b79..e5dc33c030a4bab927874fa6
> c69425db463fdc4b 100644
> --- a/libatomic/config/linux/aarch64/host-config.h
> +++ b/libatomic/config/linux/aarch64/host-config.h
> @@ -26,7 +26,7 @@
> 
>  #ifdef HWCAP_USCAT
>  # if N == 16
> -#  define IFUNC_COND_1 (hwcap & HWCAP_USCAT)
> +#  define IFUNC_COND_1 ifunc1 (hwcap)
>  # else
>  #  define IFUNC_COND_1  (hwcap & HWCAP_ATOMICS)
>  # endif
> @@ -50,4 +50,28 @@
>  #undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
>  #define MAYBE_HAVE_ATOMIC_EXCHANGE_16   1
> 
> +#ifdef HWCAP_USCAT
> +
> +#define MIDR_IMPLEMENTOR(midr) (((midr) >> 24) & 255)
> +#define MIDR_PARTNUM(midr)     (((midr) >> 4) & 0xfff)
> +
> +static inline bool
> +ifunc1 (unsigned long hwcap)
> +{
> +  if (hwcap & HWCAP_USCAT)
> +    return true;
> +  if (!(hwcap & HWCAP_CPUID))
> +    return false;
> +
> +  unsigned long midr;
> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));

From what I recall that midr_el1 register is emulated by the kernel and so userspace software has to check that the kernel supports that emulation through hwcaps before reading it.
According to https://www.kernel.org/doc/html/v5.8/arm64/cpu-feature-registers.html you need to check getauxval(AT_HWCAP) & HWCAP_CPUID) before doing that read.

Thanks,
Kyrill

> +
> +  /* Neoverse N1 supports atomic 128-bit load/store.  */
> +  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
> +    return true;
> +
> +  return false;
> +}
> +#endif
> +
>  #include_next <host-config.h>
  
Wilco Dijkstra Nov. 10, 2023, 10:22 a.m. UTC | #10
Hi Kyrill,

> +  if (!(hwcap & HWCAP_CPUID))
> +    return false;
> +
> +  unsigned long midr;
> +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));

> From what I recall that midr_el1 register is emulated by the kernel and so userspace software
> has to check that the kernel supports that emulation through hwcaps before reading it.
> According to https://www.kernel.org/doc/html/v5.8/arm64/cpu-feature-registers.html you
> need to check getauxval(AT_HWCAP) & HWCAP_CPUID) before doing that read.

That's why I do that immediately before reading midr_el1 - see above.

Cheers,
Wilco
  
Kyrylo Tkachov Nov. 10, 2023, 12:36 p.m. UTC | #11
> -----Original Message-----
> From: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
> Sent: Friday, November 10, 2023 10:23 AM
> To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; GCC Patches <gcc-
> patches@gcc.gnu.org>; Richard Sandiford <Richard.Sandiford@arm.com>
> Subject: Re: [PATCH] libatomic: Improve ifunc selection on AArch64
> 
> Hi Kyrill,
> 
> > +  if (!(hwcap & HWCAP_CPUID))
> > +    return false;
> > +
> > +  unsigned long midr;
> > +  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
> 
> > From what I recall that midr_el1 register is emulated by the kernel and so
> userspace software
> > has to check that the kernel supports that emulation through hwcaps before
> reading it.
> > According to https://www.kernel.org/doc/html/v5.8/arm64/cpu-feature-
> registers.html you
> > need to check getauxval(AT_HWCAP) & HWCAP_CPUID) before doing that
> read.
> 
> That's why I do that immediately before reading midr_el1 - see above.

Errr, yes. Obviously I wasn't fully awake when I looked at it!
Sorry for the noise.
Ok for trunk then.
Kyrill

> 
> Cheers,
> Wilco
  

Patch

diff --git a/libatomic/config/linux/aarch64/host-config.h b/libatomic/config/linux/aarch64/host-config.h
index 851c78c01cd643318aaa52929ce4550266238b79..e5dc33c030a4bab927874fa6c69425db463fdc4b 100644
--- a/libatomic/config/linux/aarch64/host-config.h
+++ b/libatomic/config/linux/aarch64/host-config.h
@@ -26,7 +26,7 @@ 
 
 #ifdef HWCAP_USCAT
 # if N == 16
-#  define IFUNC_COND_1	(hwcap & HWCAP_USCAT)
+#  define IFUNC_COND_1	ifunc1 (hwcap)
 # else
 #  define IFUNC_COND_1	(hwcap & HWCAP_ATOMICS)
 # endif
@@ -50,4 +50,28 @@ 
 #undef MAYBE_HAVE_ATOMIC_EXCHANGE_16
 #define MAYBE_HAVE_ATOMIC_EXCHANGE_16	1
 
+#ifdef HWCAP_USCAT
+
+#define MIDR_IMPLEMENTOR(midr)	(((midr) >> 24) & 255)
+#define MIDR_PARTNUM(midr)	(((midr) >> 4) & 0xfff)
+
+static inline bool
+ifunc1 (unsigned long hwcap)
+{
+  if (hwcap & HWCAP_USCAT)
+    return true;
+  if (!(hwcap & HWCAP_CPUID))
+    return false;
+
+  unsigned long midr;
+  asm volatile ("mrs %0, midr_el1" : "=r" (midr));
+
+  /* Neoverse N1 supports atomic 128-bit load/store.  */
+  if (MIDR_IMPLEMENTOR (midr) == 'A' && MIDR_PARTNUM(midr) == 0xd0c)
+    return true;
+
+  return false;
+}
+#endif
+
 #include_next <host-config.h>