[3/5] Linux: Use rseq to accelerate sched_getcpu

Message ID 31d8f14f77f950ae8cc6c1c921032a5d2ff3c8a1.1638798186.git.fweimer@redhat.com
State Superseded
Delegated to: Szabolcs Nagy
Headers
Series Extensible rseq support for glibc |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent

Commit Message

Florian Weimer Dec. 6, 2021, 1:46 p.m. UTC
  Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
---
 sysdeps/unix/sysv/linux/sched_getcpu.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)
  

Comments

Szabolcs Nagy Dec. 6, 2021, 4:50 p.m. UTC | #1
The 12/06/2021 14:46, Florian Weimer via Libc-alpha wrote:
> Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> ---
>  sysdeps/unix/sysv/linux/sched_getcpu.c | 19 +++++++++++++++++--
>  1 file changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/sysdeps/unix/sysv/linux/sched_getcpu.c b/sysdeps/unix/sysv/linux/sched_getcpu.c
> index c41e986f2c..91250f9d0c 100644
> --- a/sysdeps/unix/sysv/linux/sched_getcpu.c
> +++ b/sysdeps/unix/sysv/linux/sched_getcpu.c
> @@ -20,8 +20,8 @@
>  #include <sysdep.h>
>  #include <sysdep-vdso.h>
>  
> -int
> -sched_getcpu (void)
> +static int
> +vsyscall_sched_getcpu (void)
>  {
>    unsigned int cpu;
>    int r = -1;
> @@ -32,3 +32,18 @@ sched_getcpu (void)
>  #endif
>    return r == -1 ? r : cpu;
>  }
> +
> +#ifdef RSEQ_SIG
> +int
> +sched_getcpu (void)
> +{
> +  int cpu_id = THREAD_GETMEM (THREAD_SELF, rseq_area.cpu_id);
> +  return __glibc_likely (cpu_id >= 0) ? cpu_id : vsyscall_sched_getcpu ();
> +}

i think there is a formal memory model issue here since
tp->rseq_area.cpu_id can be modified by the kernel asynchronously
but we don't use volatile or relaxed_mo atomic access.
(assuming THREAD_GETMEM is defined as a normal access)

i think this is fine here (unlikely to cause miscompilation),
but may be worth a note?

> +#else /* RSEQ_SIG */
> +int
> +sched_getcpu (void)
> +{
> +  return vsyscall_sched_getcpu ();
> +}
> +#endif /* RSEQ_SIG */
> -- 
> 2.33.1
  
Florian Weimer Dec. 6, 2021, 5:06 p.m. UTC | #2
* Szabolcs Nagy:

> The 12/06/2021 14:46, Florian Weimer via Libc-alpha wrote:
>> Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> ---
>>  sysdeps/unix/sysv/linux/sched_getcpu.c | 19 +++++++++++++++++--
>>  1 file changed, 17 insertions(+), 2 deletions(-)
>> 
>> diff --git a/sysdeps/unix/sysv/linux/sched_getcpu.c b/sysdeps/unix/sysv/linux/sched_getcpu.c
>> index c41e986f2c..91250f9d0c 100644
>> --- a/sysdeps/unix/sysv/linux/sched_getcpu.c
>> +++ b/sysdeps/unix/sysv/linux/sched_getcpu.c
>> @@ -20,8 +20,8 @@
>>  #include <sysdep.h>
>>  #include <sysdep-vdso.h>
>>  
>> -int
>> -sched_getcpu (void)
>> +static int
>> +vsyscall_sched_getcpu (void)
>>  {
>>    unsigned int cpu;
>>    int r = -1;
>> @@ -32,3 +32,18 @@ sched_getcpu (void)
>>  #endif
>>    return r == -1 ? r : cpu;
>>  }
>> +
>> +#ifdef RSEQ_SIG
>> +int
>> +sched_getcpu (void)
>> +{
>> +  int cpu_id = THREAD_GETMEM (THREAD_SELF, rseq_area.cpu_id);
>> +  return __glibc_likely (cpu_id >= 0) ? cpu_id : vsyscall_sched_getcpu ();
>> +}
>
> i think there is a formal memory model issue here since
> tp->rseq_area.cpu_id can be modified by the kernel asynchronously
> but we don't use volatile or relaxed_mo atomic access.
> (assuming THREAD_GETMEM is defined as a normal access)

The kernel only modifies cpu_id if the user code is not running on the
thread, so I don't think this doesn't matter.

> i think this is fine here (unlikely to cause miscompilation),
> but may be worth a note?

The kernel doesn't use the C memory model, so using that terminology in
the kernel context would only be more confusing.

Thanks,
Florian
  
Szabolcs Nagy Dec. 6, 2021, 5:45 p.m. UTC | #3
The 12/06/2021 18:06, Florian Weimer wrote:
> * Szabolcs Nagy:
> 
> > The 12/06/2021 14:46, Florian Weimer via Libc-alpha wrote:
> >> Co-Authored-By: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> >> ---
> >>  sysdeps/unix/sysv/linux/sched_getcpu.c | 19 +++++++++++++++++--
> >>  1 file changed, 17 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/sysdeps/unix/sysv/linux/sched_getcpu.c b/sysdeps/unix/sysv/linux/sched_getcpu.c
> >> index c41e986f2c..91250f9d0c 100644
> >> --- a/sysdeps/unix/sysv/linux/sched_getcpu.c
> >> +++ b/sysdeps/unix/sysv/linux/sched_getcpu.c
> >> @@ -20,8 +20,8 @@
> >>  #include <sysdep.h>
> >>  #include <sysdep-vdso.h>
> >>  
> >> -int
> >> -sched_getcpu (void)
> >> +static int
> >> +vsyscall_sched_getcpu (void)
> >>  {
> >>    unsigned int cpu;
> >>    int r = -1;
> >> @@ -32,3 +32,18 @@ sched_getcpu (void)
> >>  #endif
> >>    return r == -1 ? r : cpu;
> >>  }
> >> +
> >> +#ifdef RSEQ_SIG
> >> +int
> >> +sched_getcpu (void)
> >> +{
> >> +  int cpu_id = THREAD_GETMEM (THREAD_SELF, rseq_area.cpu_id);
> >> +  return __glibc_likely (cpu_id >= 0) ? cpu_id : vsyscall_sched_getcpu ();
> >> +}
> >
> > i think there is a formal memory model issue here since
> > tp->rseq_area.cpu_id can be modified by the kernel asynchronously
> > but we don't use volatile or relaxed_mo atomic access.
> > (assuming THREAD_GETMEM is defined as a normal access)
> 
> The kernel only modifies cpu_id if the user code is not running on the
> thread, so I don't think this doesn't matter.

it does not look very different from async signal handler
modifying a global. there we would use volatile eg. to
avoid splitting a read access into multiple reads.

> > i think this is fine here (unlikely to cause miscompilation),
> > but may be worth a note?
> 
> The kernel doesn't use the C memory model, so using that terminology in
> the kernel context would only be more confusing.

ok, i'm happy with the code, i just noticed that the
previous patchset used relaxed mo.
  
Florian Weimer Dec. 7, 2021, 3:48 p.m. UTC | #4
* Szabolcs Nagy:

>> > i think there is a formal memory model issue here since
>> > tp->rseq_area.cpu_id can be modified by the kernel asynchronously
>> > but we don't use volatile or relaxed_mo atomic access.
>> > (assuming THREAD_GETMEM is defined as a normal access)
>> 
>> The kernel only modifies cpu_id if the user code is not running on the
>> thread, so I don't think this doesn't matter.
>
> it does not look very different from async signal handler
> modifying a global. there we would use volatile eg. to
> avoid splitting a read access into multiple reads.

I added the necessary scaffolding to use a voltile read here in the v2
series.

Thanks,
Florian
  

Patch

diff --git a/sysdeps/unix/sysv/linux/sched_getcpu.c b/sysdeps/unix/sysv/linux/sched_getcpu.c
index c41e986f2c..91250f9d0c 100644
--- a/sysdeps/unix/sysv/linux/sched_getcpu.c
+++ b/sysdeps/unix/sysv/linux/sched_getcpu.c
@@ -20,8 +20,8 @@ 
 #include <sysdep.h>
 #include <sysdep-vdso.h>
 
-int
-sched_getcpu (void)
+static int
+vsyscall_sched_getcpu (void)
 {
   unsigned int cpu;
   int r = -1;
@@ -32,3 +32,18 @@  sched_getcpu (void)
 #endif
   return r == -1 ? r : cpu;
 }
+
+#ifdef RSEQ_SIG
+int
+sched_getcpu (void)
+{
+  int cpu_id = THREAD_GETMEM (THREAD_SELF, rseq_area.cpu_id);
+  return __glibc_likely (cpu_id >= 0) ? cpu_id : vsyscall_sched_getcpu ();
+}
+#else /* RSEQ_SIG */
+int
+sched_getcpu (void)
+{
+  return vsyscall_sched_getcpu ();
+}
+#endif /* RSEQ_SIG */