diff mbox series

Avoid RMW of flags2 outside lock (BZ #27842)

Message ID DB6PR0801MB187958646D1DCDD920EA00F183D09@DB6PR0801MB1879.eurprd08.prod.outlook.com
State Superseded
Headers show
Series Avoid RMW of flags2 outside lock (BZ #27842) | expand

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Wilco Dijkstra May 19, 2022, 3:10 p.m. UTC
Remove an unconditional RMW on flags2 in flockfile - if _IO_FLAGS2_NEED_LOCK
is set, we are single-threaded and we can safely clear the flag. This fixes
BZ #27842.

---

Comments

Andreas Schwab May 19, 2022, 3:25 p.m. UTC | #1
On Mai 19 2022, Wilco Dijkstra via Libc-alpha wrote:

> Remove an unconditional RMW on flags2 in flockfile - if _IO_FLAGS2_NEED_LOCK
> is set, we are single-threaded and we can safely clear the flag.

Isn't that backwards?  In single-theaded case, the flags is normally not
set, and it can safely be set.
Adhemerval Zanella May 19, 2022, 3:47 p.m. UTC | #2
On 19/05/2022 12:10, Wilco Dijkstra wrote:
> 
> Remove an unconditional RMW on flags2 in flockfile - if _IO_FLAGS2_NEED_LOCK
> is set, we are single-threaded and we can safely clear the flag. This fixes
> BZ #27842.
> 

I don't think this is correct because if the caller issues pthread_create 
after flockfile, funlockfile will not issues the correct operations.  I
have a fix that uses a different locking mechanism where the _IO_FLAGS2_NEED_LOCK
is removed by moving both the thread id and single-thread optimization to the
locks itself (on Linux tid has at maximum 30-bits, we can use 1 bits for the
single-thread optimization and 1 bits for congestion optimization).

The issue is now requires slight more memory operations to check if you
can use single-thread optimization, since the FILE lock is not always
present and you need to first check __flags.

I would say that with currency scheme where _IO_FLAGS2_NEED_LOCK is stick,
this is a benign data race (although still undesirable). 

> ---
> 
> diff --git a/stdio-common/flockfile.c b/stdio-common/flockfile.c
> index a5decb450f8d477e3105d02661282afeab58f88b..7ba9ab59082d2c1dfba7d5e9a91175c9dec7ec49 100644
> --- a/stdio-common/flockfile.c
> +++ b/stdio-common/flockfile.c
> @@ -22,7 +22,10 @@
>  void
>  __flockfile (FILE *stream)
>  {
> -  stream->_flags2 |= _IO_FLAGS2_NEED_LOCK;
> +  /* If we're single-threaded, turn off single-thread optimizations
> +     when locking a file.  Avoid updating flags2 if multi-threaded.  */
> +  if (!_IO_need_lock (stream))
> +    stream->_flags2 |= _IO_FLAGS2_NEED_LOCK;
>    _IO_lock_lock (*stream->_lock);
>  }
>  weak_alias (__flockfile, flockfile);
Wilco Dijkstra May 19, 2022, 4:26 p.m. UTC | #3
Hi Adhemerval,
 
> I don't think this is correct because if the caller issues pthread_create 
> after flockfile, funlockfile will not issues the correct operations.  I

No, the idea of switching off the single-threaded optimization before the lock is
precisely to ensure that you never could get that situation. Note that neither of
the locks in flockfile and funlockfile use _IO_FLAGS2_NEED_LOCK currently, so 
this is just being extremely conservative - in principle we could remove the
update or move it after the lock.

> have a fix that uses a different locking mechanism where the _IO_FLAGS2_NEED_LOCK
> is removed by moving both the thread id and single-thread optimization to the
> locks itself (on Linux tid has at maximum 30-bits, we can use 1 bits for the
> single-thread optimization and 1 bits for congestion optimization).

Right so you mean moving NEED_LOCK bit into the lock variable?

> I would say that with currency scheme where _IO_FLAGS2_NEED_LOCK is stick,
> this is a benign data race (although still undesirable). 

You mean as it is now? It is a real bug since various functions update flags2
behind a lock, so it is possible for this RMW to cause corruption (but only if
you are already multithreaded, which the update is pointless anyway and we
can just skip it).

Cheers,
Wilco
Adhemerval Zanella May 19, 2022, 4:57 p.m. UTC | #4
On 19/05/2022 13:26, Wilco Dijkstra wrote:
> Hi Adhemerval,
>  
>> I don't think this is correct because if the caller issues pthread_create 
>> after flockfile, funlockfile will not issues the correct operations.  I
> 
> No, the idea of switching off the single-threaded optimization before the lock is
> precisely to ensure that you never could get that situation. Note that neither of
> the locks in flockfile and funlockfile use _IO_FLAGS2_NEED_LOCK currently, so 
> this is just being extremely conservative - in principle we could remove the
> update or move it after the lock.

I am trying to see why exactly we need to disable single-thread optimization
on flockfile, since there is no FILE operation that takes a callback where
pthread_create might be called beween _IO_acquire_lock.  Can't we just remove
the _IO_FLAGS2_NEED_LOCK set on flockfile?

> 
>> have a fix that uses a different locking mechanism where the _IO_FLAGS2_NEED_LOCK
>> is removed by moving both the thread id and single-thread optimization to the
>> locks itself (on Linux tid has at maximum 30-bits, we can use 1 bits for the
>> single-thread optimization and 1 bits for congestion optimization).
> 
> Right so you mean moving NEED_LOCK bit into the lock variable?

Yes, and making the lock smaller on linux (just a word plus the recursive counter).

> 
>> I would say that with currency scheme where _IO_FLAGS2_NEED_LOCK is stick,
>> this is a benign data race (although still undesirable). 
> 
> You mean as it is now? It is a real bug since various functions update flags2
> behind a lock, so it is possible for this RMW to cause corruption (but only if
> you are already multithreaded, which the update is pointless anyway and we
> can just skip it).

I don't think it would be possible to corrupt because once pthread_create
is called, _IO_FLAGS2_NEED_LOCK will be always set (so RMW won't see a
__flags2 without _IO_FLAGS2_NEED_LOCK being set).
Wilco Dijkstra May 20, 2022, 10:34 a.m. UTC | #5
Hi Adhemerval,

> I am trying to see why exactly we need to disable single-thread optimization
> on flockfile, since there is no FILE operation that takes a callback where
> pthread_create might be called beween _IO_acquire_lock.  Can't we just remove
> the _IO_FLAGS2_NEED_LOCK set on flockfile?

Yes we could. I don't believe you could actually notice the difference - recursion
or starting a thread before funlockfile would work fine.

>> Right so you mean moving NEED_LOCK bit into the lock variable?
>
> Yes, and making the lock smaller on linux (just a word plus the recursive counter).

OK, it will be interesting to see how that works out. Any lock speedup will be
good, however I guess that recursive locks will remain quite expensive. By design
the single-thread optimization does not need to support things like trylock,
recursion or starting a new thread, and that is a huge advantage.

> You mean as it is now? It is a real bug since various functions update flags2
> behind a lock, so it is possible for this RMW to cause corruption (but only if
> you are already multithreaded, which the update is pointless anyway and we
> can just skip it).

> I don't think it would be possible to corrupt because once pthread_create
> is called, _IO_FLAGS2_NEED_LOCK will be always set (so RMW won't see a
> __flags2 without _IO_FLAGS2_NEED_LOCK being set).

The issue is not corruption of _IO_FLAGS2_NEED_LOCK but of a different flag.
Several IO functions may set or clear bits in flag2. So we could get a race
between RMW sequences setting/clearing different bits - if there is any overlap
one of the updates may be lost.

Cheers,
Wilco
diff mbox series

Patch

diff --git a/stdio-common/flockfile.c b/stdio-common/flockfile.c
index a5decb450f8d477e3105d02661282afeab58f88b..7ba9ab59082d2c1dfba7d5e9a91175c9dec7ec49 100644
--- a/stdio-common/flockfile.c
+++ b/stdio-common/flockfile.c
@@ -22,7 +22,10 @@ 
 void
 __flockfile (FILE *stream)
 {
-  stream->_flags2 |= _IO_FLAGS2_NEED_LOCK;
+  /* If we're single-threaded, turn off single-thread optimizations
+     when locking a file.  Avoid updating flags2 if multi-threaded.  */
+  if (!_IO_need_lock (stream))
+    stream->_flags2 |= _IO_FLAGS2_NEED_LOCK;
   _IO_lock_lock (*stream->_lock);
 }
 weak_alias (__flockfile, flockfile);