: Fix blocking pthread_join.

  On 04/25/2018 05:25 PM, Torvald Riegel wrote:
> On Wed, 2018-04-25 at 07:39 -0500, Carlos O'Donell wrote:
>> On 04/25/2018 06:27 AM, Stefan Liebler wrote:
>>> With this patch, the tid is loaded by dereferencing a volatile pointer.
>>> Then the compiler is not allowed to reload the value for __tid from memory.
> 
> We always use atomic accesses when it comes to concurrently accessed
> data (there are exceptions, but these are tightly controlled).
> We never use volatile to "fix" concurrent accesses.
> 
>>> Okay to commit?
>>
>> Would using an atomic type and an atomic load MO relaxed prevent the
>> compiler from reloading from memory?
> 
> That's the right fix, and it should be an acquire MO load to synchronize
> with the kernel's store to 0.  (We should make it a requirement for the
> kernel to use a release store; IIRC, it is on many archs, but it isn't
> documented.)
> 
See the attached patch for lll_wait_tid.
It prevents the compiler from reloading the tid from memory when glibc is
built with -Os on s390 (31bit).
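
To illustrate the idea outside of glibc, here is a minimal sketch using plain C11 atomics rather than the glibc-internal atomic macros. The names fake_thread_t and wait_tid_sketch are made up for this example, and the sched_yield() call only stands in for the futex wait that the real lll_wait_tid would perform. The point is that the tid is loaded exactly once per iteration into a local variable, and only that local is used afterwards:

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the tid field of struct pthread.  */
typedef struct
{
  _Atomic int tid;
} fake_thread_t;

/* Wait until the tid becomes 0.  Returns the number of iterations in
   which a nonzero tid was observed.  Each iteration performs a single
   acquire load; the compiler may not split it into two reads, so the
   value that is compared against 0 is the same value that would be
   handed to the futex wait.  */
static int
wait_tid_sketch (fake_thread_t *pd)
{
  assert (pd != NULL);

  int spins = 0;
  int tid;
  while ((tid = atomic_load_explicit (&pd->tid, memory_order_acquire)) != 0)
    {
      /* Assumption: real code would block here in a futex wait on
         &pd->tid with the expected value 'tid' (the local copy, never
         a second read of pd->tid).  Yielding keeps the sketch simple.  */
      sched_yield ();
      ++spins;
    }
  return spins;
}
```

The acquire MO is meant to pair with the kernel's store of 0 as discussed above. Whether that store is a release store on every architecture is not documented, so this is an assumption of the sketch.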

> The accesses to the TID should be changed to use atomics everywhere, and
> some (simple) concurrency notes should be added.

There are some functions which pass the loaded pd->tid on as an argument,
e.g. to a syscall.
Such a syscall then operates on the thread with the given tid, or on the
calling thread if zero was specified, e.g.:
- nptl/pthread_setschedparam.c: The INVALID_TD_P macro is used to check
  whether pd->tid is valid, but pd->tid is reloaded before the call to
  __sched_setscheduler().
- sysdeps/unix/sysv/linux/pthread_[s|g]etaffinity.c: pd is not checked
  with the INVALID_TD_P macro in order to return ESRCH. If the thread has
  already exited, these functions won't fail with ESRCH.

Can we enhance the INVALID_TD_P macro so that it additionally stores the
evaluated tid in a local variable?
Then we could e.g. pass this tid value to the mentioned syscalls.
Is atomic_load_relaxed enough for loading pd->tid within INVALID_TD_P?
In the examples above, the syscall will fail if the thread has just exited.
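
A rough sketch of what I have in mind, again with C11 atomics instead of the glibc-internal macros. All names here (fake_pthread, invalid_td_load_tid, fake_setscheduler, setschedparam_sketch) are hypothetical, and the "tid <= 0" check is only my assumption of what the validity test looks like:

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct pthread.  */
struct fake_pthread
{
  _Atomic int tid;
};

/* Variant of the validity check that loads pd->tid exactly once and
   hands the loaded value back to the caller.  Returns true if the
   descriptor looks invalid (assumed here: tid <= 0).  */
static inline bool
invalid_td_load_tid (struct fake_pthread *pd, int *tidp)
{
  *tidp = atomic_load_explicit (&pd->tid, memory_order_relaxed);
  return *tidp <= 0;
}

/* Hypothetical stand-in for the syscall: succeeds for a positive tid,
   fails otherwise.  A real syscall would treat 0 as "calling thread",
   which is exactly what must not be passed by accident.  */
static int
fake_setscheduler (int tid)
{
  return tid > 0 ? 0 : -1;
}

static int
setschedparam_sketch (struct fake_pthread *pd)
{
  int tid;
  if (invalid_td_load_tid (pd, &tid))
    return ESRCH;

  /* Use the local snapshot, never a fresh read of pd->tid.  If the
     thread exits in between, the snapshot is stale but still nonzero,
     so the call cannot silently act on the calling thread.  */
  return fake_setscheduler (tid) == 0 ? 0 : EINVAL;
}
```

With the snapshot, a thread that has already exited yields ESRCH from the check, and a thread that exits right after the check leaves the error handling to the syscall.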

>> I'm unhappy with the use of volatile here because it's not quite
>> the real semantics. Sure, the memory is volatile, it may change at
>> any point, but that's not what matters. What matters is that we load
>> from that memory once and only once.
> 
> It's a normal concurrent access, so we're using atomics for it.
> Volatile but non-atomic is for cases where one would communicate with an
> external device or sth like that, and those device's memory accesses
> would appear to interrupt the thread that's using the volatile accesses.
> IOW, it's like sequential code from a memory-model perspective, just
> that the device's accesses can interleave with the CPU thread's
> accesses.  There's no such simple interleaving when it comes to
> concurrent accesses.
