[1/3] Guess L1 cache linesize for aarch64

Message ID 20170608225728.26779-2-rth@twiddle.net
State New, archived

Commit Message

Richard Henderson June 8, 2017, 10:57 p.m. UTC
  Using the cache hierarchy linesize minimum in CTR_EL0.
See the comment within the code for rationale.

	* sysdeps/unix/sysv/linux/aarch64/sysconf.c: New file.

Cc: Marcus Shawcroft <marcus.shawcroft@arm.com>
---
 sysdeps/unix/sysv/linux/aarch64/sysconf.c | 55 +++++++++++++++++++++++++++++++
 1 file changed, 55 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/sysconf.c
  

Comments

Siddhesh Poyarekar June 9, 2017, 5:51 a.m. UTC | #1
On Friday 09 June 2017 04:27 AM, Richard Henderson wrote:
> Using the cache hierarchy linesize minimum in CTR_EL0.
> See the comment within the code for rationale.
> 
> 	* sysdeps/unix/sysv/linux/aarch64/sysconf.c: New file.
> 

Looks good to me.

Thanks,
Siddhesh
  
Andrew Pinski June 9, 2017, 5:52 a.m. UTC | #2
On Thu, Jun 8, 2017 at 10:51 PM, Siddhesh Poyarekar <siddhesh@gotplt.org> wrote:
> On Friday 09 June 2017 04:27 AM, Richard Henderson wrote:
>> Using the cache hierarchy linesize minimum in CTR_EL0.
>> See the comment within the code for rationale.
>>
>>       * sysdeps/unix/sysv/linux/aarch64/sysconf.c: New file.
>>
>
> Looks good to me.

And me too.

Thanks,
Andrew

>
> Thanks,
> Siddhesh
  
Siddhesh Poyarekar Oct. 10, 2017, 7:24 a.m. UTC | #3
On Friday 09 June 2017 04:27 AM, Richard Henderson wrote:
> Using the cache hierarchy linesize minimum in CTR_EL0.
> See the comment within the code for rationale.
> 
> 	* sysdeps/unix/sysv/linux/aarch64/sysconf.c: New file.

Hi Richard,

I noticed that you did not push this patch yet.  Are you waiting for
additional review?

Siddhesh
  
Szabolcs Nagy Oct. 10, 2017, 10:19 a.m. UTC | #4
On 08/06/17 23:57, Richard Henderson wrote:
> +  /* Unfortunately, the registers that contain the actual cache info
> +     (CCSIDR_EL1, CLIDR_EL1, and CSSELR_EL1) are protected by the Linux
> +     kernel (though they need not have been).  However, CTR_EL0 contains
> +     the *minimum* linesize in the entire cache hierarchy, and is
> +     accessible to userland, for use in __aarch64_sync_cache_range,
> +     and it is a reasonable assumption that the L1 cache will have that
> +     minimum line size.  */

maybe

> +    case _SC_LEVEL1_ICACHE_LINESIZE:
> +    case _SC_LEVEL1_DCACHE_LINESIZE:

i can't find documentation for these, what meaning do users expect?
  
Siddhesh Poyarekar Oct. 10, 2017, 10:37 a.m. UTC | #5
On Tuesday 10 October 2017 03:49 PM, Szabolcs Nagy wrote:
> On 08/06/17 23:57, Richard Henderson wrote:
>> +  /* Unfortunately, the registers that contain the actual cache info
>> +     (CCSIDR_EL1, CLIDR_EL1, and CSSELR_EL1) are protected by the Linux
>> +     kernel (though they need not have been).  However, CTR_EL0 contains
>> +     the *minimum* linesize in the entire cache hierarchy, and is
>> +     accessible to userland, for use in __aarch64_sync_cache_range,
>> +     and it is a reasonable assumption that the L1 cache will have that
>> +     minimum line size.  */
> 
> maybe

Right, but that's an architectural detail that may not be relevant for
sysconf.  That is, the assumption may be suitable for the way the
sysconf is typically used.

>> +    case _SC_LEVEL1_ICACHE_LINESIZE:
>> +    case _SC_LEVEL1_DCACHE_LINESIZE:
> 
> i can't find documentation for these, what meaning do users expect?

Applications may use these hints to try and align their code/data
suitably or read/write data in an optimal manner.  It needs to be
documented and I hope to have a patch ready for it soon, but I wanted to
be sure that this patch was in place since otherwise the documentation
does not make sense.

Siddhesh
  
Szabolcs Nagy Oct. 10, 2017, 11:01 a.m. UTC | #6
On 10/10/17 11:37, Siddhesh Poyarekar wrote:
> On Tuesday 10 October 2017 03:49 PM, Szabolcs Nagy wrote:
>> On 08/06/17 23:57, Richard Henderson wrote:
>>> +    case _SC_LEVEL1_ICACHE_LINESIZE:
>>> +    case _SC_LEVEL1_DCACHE_LINESIZE:
>>
>> i can't find documentation for these, what meaning do users expect?
> 
> Applications may use these hints to try and align their code/data
> suitably or read/write data in an optimal manner.  It needs to be

that's different from the given libgcc clear_cache example

> documented and I hope to have a patch ready for it soon, but I wanted to
> be sure that this patch was in place since otherwise the documentation
> does not make sense.

either there is existing meaning or it's a new api with
some proposed meaning, i wanted to look at that to tell
if the implementation is acceptable.
  
Siddhesh Poyarekar Oct. 10, 2017, 11:56 a.m. UTC | #7
On Tuesday 10 October 2017 04:31 PM, Szabolcs Nagy wrote:
>> Applications may use these hints to try and align their code/data
>> suitably or read/write data in an optimal manner.  It needs to be
> 
> that's different from the given libgcc clear_cache example

Line sizes reported by ctr_el0 must be usable for clearing/invalidating
cache lines in a loop so it should be compatible with that use case too.

> either there is existing meaning or it's a new api with
> some proposed meaning, i wanted to look at that to tell
> if the implementation is acceptable.

This is an old API and the existing meaning is literally what it says,
the size of the L1 cache line.  There is no specification defining what
it can or cannot be used for since it is a GNU extension.

To comply with the name 1:1 we would have to emulate reading clidr_el1,
ccsidr_el1, etc. which is overkill given that the value returned is
valid for almost everything.  The only place it goes wrong is where an
application might use it to report system architecture and that's where
we need to add a documentation snippet stating that it isn't quite what
it says it is, but is close.

The other alternative is to never implement this information on aarch64,
which is potentially sub-optimal for all of the other use cases.

Siddhesh
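
The point about clearing or invalidating cache lines in a loop can be illustrated with a minimal sketch. It is not the libgcc implementation: for_each_cache_line and line_op_fn are hypothetical names, and the callback stands in for a per-line maintenance instruction. Stepping by the minimum line size is safe for this use because a smaller stride still touches every larger line at least once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback standing in for a per-line maintenance
   instruction such as "dc cvau" or "ic ivau".  */
typedef void (*line_op_fn) (uintptr_t addr, void *ctx);

/* Visit every cache line that overlaps [START, END), stepping by
   LINESIZE, which must be a power of two.  Returns the number of
   lines visited.  */
static size_t
for_each_cache_line (uintptr_t start, uintptr_t end, size_t linesize,
                     line_op_fn op, void *ctx)
{
  assert (linesize != 0 && (linesize & (linesize - 1)) == 0);

  size_t count = 0;
  /* Round down so the first, possibly partial, line is included.  */
  for (uintptr_t addr = start & ~(uintptr_t) (linesize - 1);
       addr < end; addr += linesize)
    {
      if (op != NULL)
        op (addr, ctx);
      count++;
    }
  return count;
}
```

With a 64-byte stride, the range 0x1004..0x1100 is covered by four lines starting at 0x1000; with a 128-byte stride the same range needs two.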
  
Richard Henderson Oct. 10, 2017, 2:20 p.m. UTC | #8
On 10/10/2017 03:19 AM, Szabolcs Nagy wrote:
> On 08/06/17 23:57, Richard Henderson wrote:
>> +  /* Unfortunately, the registers that contain the actual cache info
>> +     (CCSIDR_EL1, CLIDR_EL1, and CSSELR_EL1) are protected by the Linux
>> +     kernel (though they need not have been).  However, CTR_EL0 contains
>> +     the *minimum* linesize in the entire cache hierarchy, and is
>> +     accessible to userland, for use in __aarch64_sync_cache_range,
>> +     and it is a reasonable assumption that the L1 cache will have that
>> +     minimum line size.  */
> 
> maybe

Have you ever seen a system for which this was not true?
I don't believe I have.

>> +    case _SC_LEVEL1_ICACHE_LINESIZE:
>> +    case _SC_LEVEL1_DCACHE_LINESIZE:
> 
> i can't find documentation for these, what meaning do users expect?

They expect them to be exactly what they say they are.
The question is, what do they expect to be able to do with that information.

Speaking for myself, I expect to be able to dynamically align objects on this
linesize and for that to have some predictable effect on performance.


r~
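
The use case described above, dynamically aligning objects on the reported line size, might look like the following sketch. The helper names (dcache_linesize, alloc_line_aligned, is_line_aligned) and the 64-byte fallback are assumptions made for illustration; only sysconf, _SC_LEVEL1_DCACHE_LINESIZE and posix_memalign are existing interfaces.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Assumed fallback when the C library cannot report a line size.  */
#define FALLBACK_LINESIZE 64

/* Return the L1 data cache line size as reported by sysconf, or the
   fallback if the answer is missing or unusable.  */
static size_t
dcache_linesize (void)
{
#ifdef _SC_LEVEL1_DCACHE_LINESIZE
  long int n = sysconf (_SC_LEVEL1_DCACHE_LINESIZE);
  /* Reject errors, "unknown" (<= 0) and non-powers of two.  */
  if (n > 0 && (n & (n - 1)) == 0)
    return (size_t) n;
#endif
  return FALLBACK_LINESIZE;
}

/* Allocate SIZE bytes starting on a cache line boundary.  The result
   is released with free.  Returns NULL on failure.  */
static void *
alloc_line_aligned (size_t size)
{
  void *p = NULL;
  size_t align = dcache_linesize ();

  /* posix_memalign needs a power of two that is at least
     sizeof (void *).  */
  if (align < sizeof (void *))
    align = sizeof (void *);
  assert ((align & (align - 1)) == 0);

  if (posix_memalign (&p, align, size) != 0)
    return NULL;
  return p;
}

/* Nonzero if P starts on a cache line boundary.  */
static int
is_line_aligned (const void *p)
{
  return ((uintptr_t) p & (dcache_linesize () - 1)) == 0;
}
```

On aarch64 with this patch applied, the sysconf call would return the CTR_EL0 minimum, so the alignment is a lower bound on the true L1 line size rather than a guaranteed match.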
  
Szabolcs Nagy Oct. 10, 2017, 5:19 p.m. UTC | #9
On 08/06/17 23:57, Richard Henderson wrote:
> Using the cache hierarchy linesize minimum in CTR_EL0.
> See the comment within the code for rationale.
> 
> 	* sysdeps/unix/sysv/linux/aarch64/sysconf.c: New file.

OK.
  
Siddhesh Poyarekar Oct. 11, 2017, 5:28 a.m. UTC | #10
On Tuesday 10 October 2017 07:50 PM, Richard Henderson wrote:
>> maybe
> 
> Have you ever seen a system for which this was not true?
> I don't believe I have.

The Qualcomm Falkor chip shows the L2 dcache line size here instead of L1.

> They expect them to be exactly what they say they are.
> The question is, what do they expect to be able to do with that information.
> 
> Speaking for myself, I expect to be able to dynamically align objects on this
> linesize and for that to have some predictable effect on performance.

This should continue to work well even for Falkor.  In fact, if a core
does this (i.e. deviates from the L1 line size in ctr_el0) and does not
also perform better (or at least not regress) then the designers should
understand that such a design would be sub-optimal.

Siddhesh
  

Patch

diff --git a/sysdeps/unix/sysv/linux/aarch64/sysconf.c b/sysdeps/unix/sysv/linux/aarch64/sysconf.c
new file mode 100644
index 0000000..30608dd
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/aarch64/sysconf.c
@@ -0,0 +1,55 @@ 
+/* Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <assert.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+
+static long int linux_sysconf (int name);
+
+/* Get the value of the system variable NAME.  */
+long int
+__sysconf (int name)
+{
+  unsigned ctr;
+
+  /* Unfortunately, the registers that contain the actual cache info
+     (CCSIDR_EL1, CLIDR_EL1, and CSSELR_EL1) are protected by the Linux
+     kernel (though they need not have been).  However, CTR_EL0 contains
+     the *minimum* linesize in the entire cache hierarchy, and is
+     accessible to userland, for use in __aarch64_sync_cache_range,
+     and it is a reasonable assumption that the L1 cache will have that
+     minimum line size.  */
+  switch (name)
+    {
+    case _SC_LEVEL1_ICACHE_LINESIZE:
+      asm("mrs\t%0, ctr_el0" : "=r"(ctr));
+      return 4 << (ctr & 0xf);
+    case _SC_LEVEL1_DCACHE_LINESIZE:
+      asm("mrs\t%0, ctr_el0" : "=r"(ctr));
+      return 4 << ((ctr >> 16) & 0xf);
+    }
+
+  return linux_sysconf (name);
+}
+
+/* Now the generic Linux version.  */
+#undef __sysconf
+#define __sysconf static linux_sysconf
+#include <sysdeps/unix/sysv/linux/sysconf.c>
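
The two shift expressions in the patch decode the IminLine (bits [3:0]) and DminLine (bits [19:16]) fields of CTR_EL0. Each field holds log2 of the smallest line size counted in 4-byte words, hence the `4 <<`. The sketch below separates that decoding from the register read so it can be checked on any host. The helper names are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: decode a 4-bit log2(words) field of a CTR_EL0
   value, located at bit SHIFT, into a line size in bytes.  */
static inline unsigned int
ctr_line_field (uint64_t ctr, unsigned int shift)
{
  assert (shift <= 60);
  return 4u << ((ctr >> shift) & 0xf);
}

/* IminLine, bits [3:0]: smallest instruction cache line.  */
static inline unsigned int
ctr_icache_linesize (uint64_t ctr)
{
  return ctr_line_field (ctr, 0);
}

/* DminLine, bits [19:16]: smallest data or unified cache line.  */
static inline unsigned int
ctr_dcache_linesize (uint64_t ctr)
{
  return ctr_line_field (ctr, 16);
}
```

For example, a field value of 4 decodes to 64 bytes and a field value of 5 decodes to 128 bytes.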