[RFC,BZ,19944] ntpl: pthread_mutex_lock() use a wrapper for error checking

Message ID 20160531164025.GA19018@breakpoint.cc
State New, archived
Headers

Commit Message

Sebastian Andrzej Siewior May 31, 2016, 4:40 p.m. UTC
  Implement a wrapper around kernel's FUTEX_LOCK_PI opcode / syscall to
filter error codes. Returned to caller:
- EDEADLOCK: detected a deadlock
- EINVAL: if uaddr's address is not a multiple of four.
- ENOMEM: failed to allocate memory for PI state.
          I am not sure if it is better to return it to the caller or
	  retry again. A sleep() between invocation is probably a bad
	  idea if the process is RT. Trying again is probably a bad idea
	  if the process is RT with the highest priority.
- EAGAIN: The owner is about to terminate. Same as with ENOMEM (not sure
          what is the best action here).
- ESRCH: Happens also for non-robust futex if the owner exits. The lock
         value becomes FUTEX_OWNER_DIED set. Returned as EOWNERDEAD.

Error codes which should not happen invoke futex_fatal_error().
2016-04-13  Sebastian Andrzej Siewior  <bigeasy@linutronix.de>
	[BZ #19944] (partly)
	* sysdeps/nptl/futex-internal.h: futex_lock_pi () prototype
	* sysdeps/unix/sysv/linux/futex-internal.h: implement futex_lock_pi()
	* nptl/pthread_mutex_lock.c: use new wrapper futex_lock_pi ()
---

Compile tested only.
Could we extend futex_fatal_error() a little to pass the opcode + error
code? If it is not fully reproducible it would be nice to know what led
to it.

 nptl/pthread_mutex_lock.c                | 31 ++++++---------------------
 sysdeps/nptl/futex-internal.h            |  3 +++
 sysdeps/unix/sysv/linux/futex-internal.h | 36 ++++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+), 25 deletions(-)
  

Comments

Torvald Riegel June 1, 2016, 10:30 a.m. UTC | #1
On Tue, 2016-05-31 at 18:40 +0200, Sebastian Andrzej Siewior wrote:
> Implement a wrapper around kernel's FUTEX_LOCK_PI opcode / syscall to
> filter error codes. Returned to caller:
> - EDEADLOCK: detected a deadlock
> - EINVAL: if uaddr's address is not a multiple of four.
> - ENOMEM: failed to allocate memory for PI state.
>           I am not sure if it is better to return it to the caller or
> 	  retry again. A sleep() between invocation is probably a bad
> 	  idea if the process is RT. Trying again is probably a bad idea
> 	  if the process is RT with the highest priority.
> - EAGAIN: The owner is about to terminate. Same as with ENOMEM (not sure
>           what is the best action here).
> - ESRCH: Happens also for non-robust futex if the owner exits. The lock
>          value becomes FUTEX_OWNER_DIED set. Returned as EOWNERDEAD.
> 
> Error codes which should not happen invoke futex_fatal_error().

Thanks for trying to improve how we handle errors.

There are two levels of interfaces we need to consider.  First, the
futex interface in futex-internal.h should return errors that a caller
of a futex operation would have to deal with; thus, this is basically
the errors the kernel can return, with the exception that we abort early
on errors that can only arise due to bugs in the kernel or glibc.

Thus, adding a futex_lock_pi to futex-internal.h seems fine, but then
this should be an abstraction for the FUTEX_LOCK_PI operation: It should
not need to be aware of pthread_mutex_t internals, and it should not
convert error codes.

The patch should also add the matching FUTEX_UNLOCK_PI operations.  

Furthermore, it should define these operations for nacl too, I believe,
but just abort when they are called as they aren't supported there, and
should never be called in the first place.  Another option would be to
leave them simply declared but not defined.

The second interface is pthread mutexes.  The error codes we can return
for these should follow what's defined by POSIX (although we already
deviate from that sometimes); if this doesn't work in practice, this
should be discussed with the Austin Group.  If we deviate, we need to
document it.
The pthread mutex code would then call the futex interface, and
transform errors as required or abort on others that we cannot return.

More detailed comments below.

> 2016-04-13  Sebastian Andrzej Siewior  <bigeasy@linutronix.de>
> 	[BZ #19944] (partly)
> 	* sysdeps/nptl/futex-internal.h: futex_lock_pi () prototype
> 	* sysdeps/unix/sysv/linux/futex-internal.h: implement futex_lock_pi()
> 	* nptl/pthread_mutex_lock.c: use new wrapper futex_lock_pi ()
> ---
> 
> Compile tested only.

Tests for this would be good.

> Could we extend futex_fatal_error() a little to pass the opcode + error
> code? If it is not fully reproducible it would be nice to know what led
> to it.
> 
>  nptl/pthread_mutex_lock.c                | 31 ++++++---------------------
>  sysdeps/nptl/futex-internal.h            |  3 +++
>  sysdeps/unix/sysv/linux/futex-internal.h | 36 ++++++++++++++++++++++++++++++++
>  3 files changed, 45 insertions(+), 25 deletions(-)
> 
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index bdfa529f639b..4ecd2d8a91ec 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -26,6 +26,7 @@
>  #include <atomic.h>
>  #include <lowlevellock.h>
>  #include <stap-probe.h>
> +#include <futex-internal.h>
>  
>  #ifndef lll_lock_elision
>  #define lll_lock_elision(lock, try_lock, private)	({ \
> @@ -332,36 +333,16 @@ __pthread_mutex_lock_full (pthread_mutex_t *mutex)
>  	  {
>  	    /* The mutex is locked.  The kernel will now take care of
>  	       everything.  */
> -	    int private = (robust
> -			   ? PTHREAD_ROBUST_MUTEX_PSHARED (mutex)
> -			   : PTHREAD_MUTEX_PSHARED (mutex));
> -	    INTERNAL_SYSCALL_DECL (__err);
> -	    int e = INTERNAL_SYSCALL (futex, __err, 4, &mutex->__data.__lock,
> -				      __lll_private_flag (FUTEX_LOCK_PI,
> -							  private), 1, 0);
> +	    int ret;
>  
> -	    if (INTERNAL_SYSCALL_ERROR_P (e, __err)
> -		&& (INTERNAL_SYSCALL_ERRNO (e, __err) == ESRCH
> -		    || INTERNAL_SYSCALL_ERRNO (e, __err) == EDEADLK))
> -	      {
> -		assert (INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK
> -			|| (kind != PTHREAD_MUTEX_ERRORCHECK_NP
> -			    && kind != PTHREAD_MUTEX_RECURSIVE_NP));
> -		/* ESRCH can happen only for non-robust PI mutexes where
> -		   the owner of the lock died.  */
> -		assert (INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust);
> -
> -		/* Delay the thread indefinitely.  */
> -		while (1)
> -		  pause_not_cancel ();
> -	      }
> +	    ret = futex_lock_pi(mutex);
> +	    if (ret)
> +		    return ret;
>  
>  	    oldval = mutex->__data.__lock;
> -
> -	    assert (robust || (oldval & FUTEX_OWNER_DIED) == 0);
>  	  }
>  
> -	if (__glibc_unlikely (oldval & FUTEX_OWNER_DIED))
> +	if (__glibc_unlikely (oldval & FUTEX_OWNER_DIED) && robust)
>  	  {
>  	    atomic_and (&mutex->__data.__lock, ~FUTEX_OWNER_DIED);
>  
> diff --git a/sysdeps/nptl/futex-internal.h b/sysdeps/nptl/futex-internal.h
> index d798b6970893..3c376788437a 100644
> --- a/sysdeps/nptl/futex-internal.h
> +++ b/sysdeps/nptl/futex-internal.h
> @@ -192,6 +192,9 @@ futex_abstimed_wait_cancelable (unsigned int* futex_word,
>  static __always_inline void
>  futex_wake (unsigned int* futex_word, int processes_to_wake, int private);
>  
> +/* Invoke kernel's FUTEX_LOCK_PI syscall. */

This needs the same level of detail in the documentation as we have for
other futex operations.  Also, two spaces behind the . please.

> +static __always_inline int futex_lock_pi(pthread_mutex_t *mutex);
> +
>  /* Calls __libc_fatal with an error message.  Convenience function for
>     concrete implementations of the futex interface.  */
>  static __always_inline __attribute__ ((__noreturn__)) void
> diff --git a/sysdeps/unix/sysv/linux/futex-internal.h b/sysdeps/unix/sysv/linux/futex-internal.h
> index 1add836ebc2f..6b571ae82e33 100644
> --- a/sysdeps/unix/sysv/linux/futex-internal.h
> +++ b/sysdeps/unix/sysv/linux/futex-internal.h
> @@ -248,4 +248,40 @@ futex_wake (unsigned int *futex_word, int processes_to_wake, int private)
>      }
>  }
>  
> +static __always_inline int futex_lock_pi(pthread_mutex_t *mutex)

See above.

> +{
> +  INTERNAL_SYSCALL_DECL (err);
> +  int robust = mutex->__data.__kind & PTHREAD_MUTEX_ROBUST_NORMAL_NP;
> +  int private;
> +  int ret;
> +
> +  private = (robust
> +	     ? PTHREAD_ROBUST_MUTEX_PSHARED (mutex)
> +	     : PTHREAD_MUTEX_PSHARED (mutex));

This belongs into the mutex code, not the futex code.

> +
> +  ret = INTERNAL_SYSCALL (futex, err, 4, &mutex->__data.__lock,
> +		    __lll_private_flag (FUTEX_LOCK_PI,
> +					private), 1, 0);
> +  if (!INTERNAL_SYSCALL_ERROR_P(ret, err))
> +    return 0;
> +  ret = INTERNAL_SYSCALL_ERRNO (ret, err);
> +  switch (ret)
> +    {
> +    case EDEADLOCK:	/* dead lock detected */
> +    case EINVAL:	/* the __lock variable is not properly aligned */
> +    case ENOMEM:	/* internal memory allocation failed */
> +    case EAGAIN:	/* lock owner is terminating */
> +      return ret;
> +    case ESRCH:		/* The process holding the lock is gone */
> +      return EOWNERDEAD;
> +    default:
> +    case EPERM:		/* not possible (unlock) or PID of owner belongs to a
> +			   kernel thread */
> +    case EFAULT:	/* we dereferenced the pointer, kernel should be able to
> +			   do, too */
> +    case ETIMEDOUT:	/* didn't ask for timeout */

Formatting of comments as elsewhere in the file, please.  Some of these
also seem to need more detail.
  

Patch

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index bdfa529f639b..4ecd2d8a91ec 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -26,6 +26,7 @@ 
 #include <atomic.h>
 #include <lowlevellock.h>
 #include <stap-probe.h>
+#include <futex-internal.h>
 
 #ifndef lll_lock_elision
 #define lll_lock_elision(lock, try_lock, private)	({ \
@@ -332,36 +333,16 @@  __pthread_mutex_lock_full (pthread_mutex_t *mutex)
 	  {
 	    /* The mutex is locked.  The kernel will now take care of
 	       everything.  */
-	    int private = (robust
-			   ? PTHREAD_ROBUST_MUTEX_PSHARED (mutex)
-			   : PTHREAD_MUTEX_PSHARED (mutex));
-	    INTERNAL_SYSCALL_DECL (__err);
-	    int e = INTERNAL_SYSCALL (futex, __err, 4, &mutex->__data.__lock,
-				      __lll_private_flag (FUTEX_LOCK_PI,
-							  private), 1, 0);
+	    int ret;
 
-	    if (INTERNAL_SYSCALL_ERROR_P (e, __err)
-		&& (INTERNAL_SYSCALL_ERRNO (e, __err) == ESRCH
-		    || INTERNAL_SYSCALL_ERRNO (e, __err) == EDEADLK))
-	      {
-		assert (INTERNAL_SYSCALL_ERRNO (e, __err) != EDEADLK
-			|| (kind != PTHREAD_MUTEX_ERRORCHECK_NP
-			    && kind != PTHREAD_MUTEX_RECURSIVE_NP));
-		/* ESRCH can happen only for non-robust PI mutexes where
-		   the owner of the lock died.  */
-		assert (INTERNAL_SYSCALL_ERRNO (e, __err) != ESRCH || !robust);
-
-		/* Delay the thread indefinitely.  */
-		while (1)
-		  pause_not_cancel ();
-	      }
+	    ret = futex_lock_pi(mutex);
+	    if (ret)
+		    return ret;
 
 	    oldval = mutex->__data.__lock;
-
-	    assert (robust || (oldval & FUTEX_OWNER_DIED) == 0);
 	  }
 
-	if (__glibc_unlikely (oldval & FUTEX_OWNER_DIED))
+	if (__glibc_unlikely (oldval & FUTEX_OWNER_DIED) && robust)
 	  {
 	    atomic_and (&mutex->__data.__lock, ~FUTEX_OWNER_DIED);
 
diff --git a/sysdeps/nptl/futex-internal.h b/sysdeps/nptl/futex-internal.h
index d798b6970893..3c376788437a 100644
--- a/sysdeps/nptl/futex-internal.h
+++ b/sysdeps/nptl/futex-internal.h
@@ -192,6 +192,9 @@  futex_abstimed_wait_cancelable (unsigned int* futex_word,
 static __always_inline void
 futex_wake (unsigned int* futex_word, int processes_to_wake, int private);
 
+/* Invoke kernel's FUTEX_LOCK_PI syscall. */
+static __always_inline int futex_lock_pi(pthread_mutex_t *mutex);
+
 /* Calls __libc_fatal with an error message.  Convenience function for
    concrete implementations of the futex interface.  */
 static __always_inline __attribute__ ((__noreturn__)) void
diff --git a/sysdeps/unix/sysv/linux/futex-internal.h b/sysdeps/unix/sysv/linux/futex-internal.h
index 1add836ebc2f..6b571ae82e33 100644
--- a/sysdeps/unix/sysv/linux/futex-internal.h
+++ b/sysdeps/unix/sysv/linux/futex-internal.h
@@ -248,4 +248,40 @@  futex_wake (unsigned int *futex_word, int processes_to_wake, int private)
     }
 }
 
+static __always_inline int futex_lock_pi(pthread_mutex_t *mutex)
+{
+  INTERNAL_SYSCALL_DECL (err);
+  int robust = mutex->__data.__kind & PTHREAD_MUTEX_ROBUST_NORMAL_NP;
+  int private;
+  int ret;
+
+  private = (robust
+	     ? PTHREAD_ROBUST_MUTEX_PSHARED (mutex)
+	     : PTHREAD_MUTEX_PSHARED (mutex));
+
+  ret = INTERNAL_SYSCALL (futex, err, 4, &mutex->__data.__lock,
+		    __lll_private_flag (FUTEX_LOCK_PI,
+					private), 1, 0);
+  if (!INTERNAL_SYSCALL_ERROR_P(ret, err))
+    return 0;
+  ret = INTERNAL_SYSCALL_ERRNO (ret, err);
+  switch (ret)
+    {
+    case EDEADLOCK:	/* dead lock detected */
+    case EINVAL:	/* the __lock variable is not properly aligned */
+    case ENOMEM:	/* internal memory allocation failed */
+    case EAGAIN:	/* lock owner is terminating */
+      return ret;
+    case ESRCH:		/* The process holding the lock is gone */
+      return EOWNERDEAD;
+    default:
+    case EPERM:		/* not possible (unlock) or PID of owner belongs to a
+			   kernel thread */
+    case EFAULT:	/* we dereferenced the pointer, kernel should be able to
+			   do, too */
+    case ETIMEDOUT:	/* didn't ask for timeout */
+      futex_fatal_error ();
+    }
+}
+
 #endif  /* futex-internal.h */