[v1,2/6] nptl: Introduce futex_lock_pi2()

Message ID 20210625081104.1134598-3-kurt@linutronix.de
State Superseded
Series nptl: Introduce and use FUTEX_LOCK_PI2

Checks

Context                 Check    Description
dj/TryBot-apply_patch   success  Patch applied to master at the time it was sent

Commit Message

Kurt Kanzenbach June 25, 2021, 8:11 a.m. UTC
This variant of futex_lock() has support for selectable clocks and priority
inheritance. The underlying FUTEX_LOCK_PI2 operation has been recently
introduced into the Linux kernel.

It can be used for implementing pthread_mutex_clocklock(MONOTONIC)/PI.

The code works like this:

 * If kernel support is assumed, then use FUTEX_LOCK_PI2
 * If not, distuingish between clockid:
   * For realtime use FUTEX_LOCK_PI
   * For monotonic try to use FUTEX_LOCK_PI2 which might not be available

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
---
 sysdeps/nptl/futex-internal.h     | 131 ++++++++++++++++++++++++++++++
 sysdeps/nptl/lowlevellock-futex.h |   1 +
 2 files changed, 132 insertions(+)
  

Comments

Adhemerval Zanella Netto July 9, 2021, 2:13 p.m. UTC | #1
On 25/06/2021 05:11, Kurt Kanzenbach wrote:
> This variant of futex_lock() has support for selectable clocks and priority
> inheritance. The underlying FUTEX_LOCK_PI2 operation has been recently
> introduced into the Linux kernel.
> 
> It can be used for implementing pthread_mutex_clocklock(MONOTONIC)/PI.
> 
> The code works like this:
> 
>  * If kernel support is assumed, then use FUTEX_LOCK_PI2
>  * If not, distuingish between clockid:

s/distuingish/distinguish

>    * For realtime use FUTEX_LOCK_PI
>    * For monotonic try to use FUTEX_LOCK_PI2 which might not be available
> 
> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>


> ---
>  sysdeps/nptl/futex-internal.h     | 131 ++++++++++++++++++++++++++++++
>  sysdeps/nptl/lowlevellock-futex.h |   1 +
>  2 files changed, 132 insertions(+)
> 
> diff --git a/sysdeps/nptl/futex-internal.h b/sysdeps/nptl/futex-internal.h
> index 79a366604d9e..38c969831276 100644
> --- a/sysdeps/nptl/futex-internal.h
> +++ b/sysdeps/nptl/futex-internal.h
> @@ -303,6 +303,137 @@ futex_lock_pi64 (int *futex_word, const struct __timespec64 *abstime,
>      }
>  }
>  
> +/* The operation checks the value of the futex, if the value is 0, then
> +   it is atomically set to the caller's thread ID.  If the futex value is
> +   nonzero, it is atomically sets the FUTEX_WAITERS bit, which signals wrt
> +   other futex owner that it cannot unlock the futex in user space by
> +   atomically by setting its value to 0.
> +
> +   If more than one wait operations is issued, the enqueueing of the waiters
> +   are done in descending priority order.
> +
> +   The ABSTIME arguments provides an absolute timeout (measured against the
> +   CLOCK_REALTIME or CLOCK_MONOTONIC clock).  If TIMEOUT is NULL, the operation
> +   will block indefinitely.
> +
> +   Returns:
> +
> +     - 0 if woken by a PI unlock operation or spuriously.
> +     - EAGAIN if the futex owner thread ID is about to exit, but has not yet
> +       handled the state cleanup.
> +     - EDEADLK if the futex is already locked by the caller.
> +     - ESRCH if the thread ID int he futex does not exist.
> +     - EINVAL is the state is corrupted or if there is a waiter on the
> +       futex or if the clockid is invalid.
> +     - ETIMEDOUT if the ABSTIME expires.
> +*/
> +static __always_inline int
> +futex_lock_pi2_64 (int *futex_word, clockid_t clockid,
> +                   const struct __timespec64 *abstime, int private)
> +{

This routine has become quite complex, and I think it should move to
sysdeps/nptl/futex-internal.c, since it is now called in two places
(pthread_mutex_lock.c and pthread_mutex_timedlock.c).  I also think we
should move futex_lock_pi2() to futex-internal.c, so that any other
implementation that wants a PI futex will use the proper implementation.

> +  unsigned int clockbit;
> +  int err;
> +
> +  if (! lll_futex_supported_clockid (clockid))
> +    return EINVAL;
> +
> +  clockbit = (clockid == CLOCK_REALTIME) ? FUTEX_CLOCK_REALTIME : 0;
> +  int op = __lll_private_flag (FUTEX_LOCK_PI2 | clockbit, private);
> +
> +  /* FUTEX_LOCK_PI2 is a new futex operation. It supports selectable clocks
> +     whereas the old FUTEX_LOCK_PI does only support CLOCK_REALTIME.
> +
> +     Therefore, the code works like this:
> +
> +     - If kernel support is available, then use FUTEX_LOCK_PI2
> +     - If not, distuingish between clockid: For realtime use FUTEX_LOCK_PI and
> +       for monotonic try to use FUTEX_LOCK_PI2 which might not be available.  */
> +
> +#if __ASSUME_FUTEX_LOCK_PI2
> +
> +# ifdef __ASSUME_TIME64_SYSCALLS
> +  err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0, abstime);
> +# else
> +
> +  bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec);
> +  if (need_time64)
> +    {
> +      err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0, abstime);
> +      if (err == -ENOSYS)
> +	err = -EOVERFLOW;
> +    }
> +  else
> +    {
> +      struct timespec ts32;
> +
> +      if (abstime != NULL)
> +	ts32 = valid_timespec64_to_timespec (*abstime);
> +
> +      err = INTERNAL_SYSCALL_CALL (futex, futex_word, op, 0,
> +                                   abstime != NULL ? &ts32 : NULL);
> +    }
> +# endif	 /* __ASSUME_TIME64_SYSCALLS */
> +

I think we can assume that if FUTEX_LOCK_PI2 is supported, 64-bit time_t syscalls
are also supported.
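
For context, futex_time64 appeared in Linux 5.1 and FUTEX_LOCK_PI2 in Linux 5.14, so a kernel that has the latter should also have the former. The branch this assumption would remove picks the syscall variant from the timeout value. A minimal sketch of that decision, with hypothetical sketch_* names standing in for glibc's in_time_t_range and the two syscall variants:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for glibc's in_time_t_range: true if SEC fits
   in a signed 32-bit time_t.  */
static bool
sketch_fits_time32 (int64_t sec)
{
  return sec >= INT32_MIN && sec <= INT32_MAX;
}

/* The two syscall variants the non-time64 path chooses between.  */
enum sketch_variant { SKETCH_FUTEX32, SKETCH_FUTEX_TIME64 };

/* Mirrors the need_time64 test in the patch: only a non-NULL timeout
   whose seconds overflow 32 bits forces the 64-bit syscall.  */
static enum sketch_variant
sketch_pick_variant (bool has_timeout, int64_t sec)
{
  if (has_timeout && !sketch_fits_time32 (sec))
    return SKETCH_FUTEX_TIME64;
  return SKETCH_FUTEX32;
}
```

If 64-bit time_t syscalls can be assumed, this selection disappears and only the futex_time64 call remains.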

> +#else
> +
> +  /* For CLOCK_MONOTONIC the only option is to use FUTEX_LOCK_PI2 */
> +  if (abstime != NULL && clockid != CLOCK_REALTIME)
> +    {
> +# ifdef __ASSUME_TIME64_SYSCALLS
> +      err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0, abstime);
> +# else
> +      bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec);
> +      if (need_time64)
> +	{
> +	  err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0,
> +				       abstime);
> +	}
> +      else
> +	{
> +	  struct timespec ts32;
> +
> +	  if (abstime != NULL)
> +	    ts32 = valid_timespec64_to_timespec (*abstime);
> +
> +	  err = INTERNAL_SYSCALL_CALL (futex, futex_word, op, 0,
> +				       abstime != NULL ? &ts32 : NULL);

We only get here if the timeout is not null, so there is no need to
check it again.

> +	}
> +# endif	 /* __ASSUME_TIME64_SYSCALLS */
> +
> +      /* FUTEX_LOCK_PI2 is not available on this kernel */
> +      if (err == -ENOSYS)
> +	return EINVAL;
> +    }
> +  else
> +    {
> +      /* Otherwise use CLOCK_REALTIME and FUTEX_LOCK_PI */
> +      return futex_lock_pi64 (futex_word, abstime, private);
> +    }
> +#endif 	/* __ASSUME_FUTEX_LOCK_PI2  */
> +

I think we can simplify this to a single futex PI operation, something like:

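int
futex_lock_pi64 (int *futex_word, clockid_t clockid,
                 const struct __timespec64 *abstime, int private)
{
  int err;
  unsigned int clockbit = clockid == CLOCK_REALTIME ? FUTEX_CLOCK_REALTIME : 0;
  int op_pi2 = __lll_private_flag (FUTEX_LOCK_PI2 | clockbit, private);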
#if __ASSUME_FUTEX_LOCK_PI2
  /* Assume __ASSUME_TIME64_SYSCALLS.  */
  err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op_pi2, 0, abstime);
#else
  /* FUTEX_LOCK_PI takes no clock flag: it always uses CLOCK_REALTIME.  */
  int op_pi1 = __lll_private_flag (FUTEX_LOCK_PI, private);
  
  /* For CLOCK_MONOTONIC the only option is to use FUTEX_LOCK_PI2.  */
  int op_pi = abstime != NULL && clockid != CLOCK_REALTIME ? op_pi2 : op_pi1;
  
# ifdef __ASSUME_TIME64_SYSCALLS
  err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op_pi, 0, abstime);
# else
  bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec);
  if (need_time64)
    err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op_pi, 0, abstime);
  else
    {
      struct timespec ts32, *pts32 = NULL;
      if (abstime != NULL)
      	{
      	  ts32 = valid_timespec64_to_timespec (*abstime);
      	  pts32 = &ts32;
      	}
      err = INTERNAL_SYSCALL_CALL (futex, futex_word, op_pi, 0, pts32);
    }
# endif	 /* __ASSUME_TIME64_SYSCALLS */
  /* FUTEX_LOCK_PI2 is not available on this kernel.  */
  if (err == -ENOSYS)
    err = -EINVAL;
#endif /* __ASSUME_FUTEX_LOCK_PI2  */

  switch (err)
    {
    case 0:
    case -EAGAIN:
    case -EINTR:
    case -ETIMEDOUT:
    case -ESRCH:
    case -EDEADLK:
    case -EINVAL: /* This indicates either state corruption or that the kernel
                     found a waiter on futex address which is waiting via
                     FUTEX_WAIT or FUTEX_WAIT_BITSET.  This is reported on
                     some futex_lock_pi usage (pthread_mutex_timedlock for
                     instance).  */
      return -err;

    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
    case -ENOSYS: /* Must have been caused by a glibc bug.  */
    /* No other errors are documented at this time.  */
    default:
      futex_fatal_error ();
    }
}
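
The switch above relies on the kernel convention of returning 0 or a negative errno, which the function hands back to callers as a positive errno. A minimal user-space sketch of that mapping follows. The name sketch_map_futex_err and the FATAL out-parameter are hypothetical, the latter standing in for the call to futex_fatal_error ():

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: map a raw futex result (0 or -errno) to the
   positive errno value returned to callers.  *FATAL is set for results
   the real code treats as a glibc or application bug.  */
static int
sketch_map_futex_err (int err, bool *fatal)
{
  *fatal = false;
  switch (err)
    {
    case 0:
    case -EAGAIN:
    case -EINTR:
    case -ETIMEDOUT:
    case -ESRCH:
    case -EDEADLK:
    case -EINVAL:
      return -err;

    default:
      /* The real code calls futex_fatal_error () here.  */
      *fatal = true;
      return err < 0 ? -err : err;
    }
}
```

Note that only negative values match the expected cases: a positive EINVAL assigned to err before the switch would land in the default branch.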


I think there is not much gain in adding another PI futex internal
function, so we can just use the same name.
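
The op selection discussed in this thread can be exercised in user space without issuing any syscall. The sketch below is illustrative only. The SKETCH_* constants mirror the futex op values, the clock enum replaces clockid_t, and a plain bool replaces glibc's LLL_PRIVATE/LLL_SHARED encoding. All of these names are assumptions, not glibc API.

```c
#include <stdbool.h>

/* Local stand-ins for the futex op constants used in the patch.  */
enum
{
  SKETCH_FUTEX_LOCK_PI = 6,
  SKETCH_FUTEX_LOCK_PI2 = 13,
  SKETCH_FUTEX_PRIVATE_FLAG = 128,
  SKETCH_FUTEX_CLOCK_REALTIME = 256
};

/* Hypothetical replacement for clockid_t in this sketch.  */
enum sketch_clock { SKETCH_CLOCK_REALTIME, SKETCH_CLOCK_MONOTONIC };

/* Returns the futex op word.  ASSUME_PI2 models __ASSUME_FUTEX_LOCK_PI2.
   Without it, FUTEX_LOCK_PI2 is only tried for a timed wait on a
   non-realtime clock; everything else uses FUTEX_LOCK_PI, which takes
   no clock flag.  */
static int
sketch_select_pi_op (bool assume_pi2, enum sketch_clock clock,
                     bool has_timeout, bool is_private)
{
  int priv = is_private ? SKETCH_FUTEX_PRIVATE_FLAG : 0;
  int clockbit = (clock == SKETCH_CLOCK_REALTIME
                  ? SKETCH_FUTEX_CLOCK_REALTIME : 0);

  if (assume_pi2 || (has_timeout && clock != SKETCH_CLOCK_REALTIME))
    return SKETCH_FUTEX_LOCK_PI2 | clockbit | priv;

  return SKETCH_FUTEX_LOCK_PI | priv;
}
```

An untimed lock never needs a clock, which is why the fallback path can keep using FUTEX_LOCK_PI for it on kernels without FUTEX_LOCK_PI2.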


> +  switch (err)
> +    {
> +    case 0:
> +    case -EAGAIN:
> +    case -EINTR:
> +    case -ETIMEDOUT:
> +    case -ESRCH:
> +    case -EDEADLK:
> +    case -EINVAL: /* This indicates either state corruption or that the kernel
> +		     found a waiter on futex address which is waiting via
> +		     FUTEX_WAIT or FUTEX_WAIT_BITSET.  This is reported on
> +		     some futex_lock_pi usage (pthread_mutex_timedlock for
> +		     instance).  */
> +      return -err;
> +
> +    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
> +    case -ENOSYS: /* Must have been caused by a glibc bug.  */
> +    /* No other errors are documented at this time.  */
> +    default:
> +      futex_fatal_error ();
> +    }
> +}
> +
>  /* Wakes the top priority waiter that called a futex_lock_pi operation on
>     the futex.
>  
> diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
> index 66ebfe50f4c1..abda179e0de2 100644
> --- a/sysdeps/nptl/lowlevellock-futex.h
> +++ b/sysdeps/nptl/lowlevellock-futex.h
> @@ -38,6 +38,7 @@
>  #define FUTEX_WAKE_BITSET	10
>  #define FUTEX_WAIT_REQUEUE_PI   11
>  #define FUTEX_CMP_REQUEUE_PI    12
> +#define FUTEX_LOCK_PI2		13
>  #define FUTEX_PRIVATE_FLAG	128
>  #define FUTEX_CLOCK_REALTIME	256
>  
> 

Ok.
  

Patch

diff --git a/sysdeps/nptl/futex-internal.h b/sysdeps/nptl/futex-internal.h
index 79a366604d9e..38c969831276 100644
--- a/sysdeps/nptl/futex-internal.h
+++ b/sysdeps/nptl/futex-internal.h
@@ -303,6 +303,137 @@  futex_lock_pi64 (int *futex_word, const struct __timespec64 *abstime,
     }
 }
 
+/* The operation checks the value of the futex, if the value is 0, then
+   it is atomically set to the caller's thread ID.  If the futex value is
+   nonzero, it is atomically sets the FUTEX_WAITERS bit, which signals wrt
+   other futex owner that it cannot unlock the futex in user space by
+   atomically by setting its value to 0.
+
+   If more than one wait operations is issued, the enqueueing of the waiters
+   are done in descending priority order.
+
+   The ABSTIME arguments provides an absolute timeout (measured against the
+   CLOCK_REALTIME or CLOCK_MONOTONIC clock).  If TIMEOUT is NULL, the operation
+   will block indefinitely.
+
+   Returns:
+
+     - 0 if woken by a PI unlock operation or spuriously.
+     - EAGAIN if the futex owner thread ID is about to exit, but has not yet
+       handled the state cleanup.
+     - EDEADLK if the futex is already locked by the caller.
+     - ESRCH if the thread ID int he futex does not exist.
+     - EINVAL is the state is corrupted or if there is a waiter on the
+       futex or if the clockid is invalid.
+     - ETIMEDOUT if the ABSTIME expires.
+*/
+static __always_inline int
+futex_lock_pi2_64 (int *futex_word, clockid_t clockid,
+                   const struct __timespec64 *abstime, int private)
+{
+  unsigned int clockbit;
+  int err;
+
+  if (! lll_futex_supported_clockid (clockid))
+    return EINVAL;
+
+  clockbit = (clockid == CLOCK_REALTIME) ? FUTEX_CLOCK_REALTIME : 0;
+  int op = __lll_private_flag (FUTEX_LOCK_PI2 | clockbit, private);
+
+  /* FUTEX_LOCK_PI2 is a new futex operation. It supports selectable clocks
+     whereas the old FUTEX_LOCK_PI does only support CLOCK_REALTIME.
+
+     Therefore, the code works like this:
+
+     - If kernel support is available, then use FUTEX_LOCK_PI2
+     - If not, distuingish between clockid: For realtime use FUTEX_LOCK_PI and
+       for monotonic try to use FUTEX_LOCK_PI2 which might not be available.  */
+
+#if __ASSUME_FUTEX_LOCK_PI2
+
+# ifdef __ASSUME_TIME64_SYSCALLS
+  err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0, abstime);
+# else
+
+  bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec);
+  if (need_time64)
+    {
+      err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0, abstime);
+      if (err == -ENOSYS)
+	err = -EOVERFLOW;
+    }
+  else
+    {
+      struct timespec ts32;
+
+      if (abstime != NULL)
+	ts32 = valid_timespec64_to_timespec (*abstime);
+
+      err = INTERNAL_SYSCALL_CALL (futex, futex_word, op, 0,
+                                   abstime != NULL ? &ts32 : NULL);
+    }
+# endif	 /* __ASSUME_TIME64_SYSCALLS */
+
+#else
+
+  /* For CLOCK_MONOTONIC the only option is to use FUTEX_LOCK_PI2 */
+  if (abstime != NULL && clockid != CLOCK_REALTIME)
+    {
+# ifdef __ASSUME_TIME64_SYSCALLS
+      err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0, abstime);
+# else
+      bool need_time64 = abstime != NULL && !in_time_t_range (abstime->tv_sec);
+      if (need_time64)
+	{
+	  err = INTERNAL_SYSCALL_CALL (futex_time64, futex_word, op, 0,
+				       abstime);
+	}
+      else
+	{
+	  struct timespec ts32;
+
+	  if (abstime != NULL)
+	    ts32 = valid_timespec64_to_timespec (*abstime);
+
+	  err = INTERNAL_SYSCALL_CALL (futex, futex_word, op, 0,
+				       abstime != NULL ? &ts32 : NULL);
+	}
+# endif	 /* __ASSUME_TIME64_SYSCALLS */
+
+      /* FUTEX_LOCK_PI2 is not available on this kernel */
+      if (err == -ENOSYS)
+	return EINVAL;
+    }
+  else
+    {
+      /* Otherwise use CLOCK_REALTIME and FUTEX_LOCK_PI */
+      return futex_lock_pi64 (futex_word, abstime, private);
+    }
+#endif 	/* __ASSUME_FUTEX_LOCK_PI2  */
+
+  switch (err)
+    {
+    case 0:
+    case -EAGAIN:
+    case -EINTR:
+    case -ETIMEDOUT:
+    case -ESRCH:
+    case -EDEADLK:
+    case -EINVAL: /* This indicates either state corruption or that the kernel
+		     found a waiter on futex address which is waiting via
+		     FUTEX_WAIT or FUTEX_WAIT_BITSET.  This is reported on
+		     some futex_lock_pi usage (pthread_mutex_timedlock for
+		     instance).  */
+      return -err;
+
+    case -EFAULT: /* Must have been caused by a glibc or application bug.  */
+    case -ENOSYS: /* Must have been caused by a glibc bug.  */
+    /* No other errors are documented at this time.  */
+    default:
+      futex_fatal_error ();
+    }
+}
+
 /* Wakes the top priority waiter that called a futex_lock_pi operation on
    the futex.
 
diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
index 66ebfe50f4c1..abda179e0de2 100644
--- a/sysdeps/nptl/lowlevellock-futex.h
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -38,6 +38,7 @@ 
 #define FUTEX_WAKE_BITSET	10
 #define FUTEX_WAIT_REQUEUE_PI   11
 #define FUTEX_CMP_REQUEUE_PI    12
+#define FUTEX_LOCK_PI2		13
 #define FUTEX_PRIVATE_FLAG	128
 #define FUTEX_CLOCK_REALTIME	256