[1/2] libstdc++: Fix proxy wait detection in atomic wait

Message ID 20260108214524.2292515-1-jwakely@redhat.com
State New
Headers
Series [1/2] libstdc++: Fix proxy wait detection in atomic wait |

Commit Message

Jonathan Wakely Jan. 8, 2026, 9:44 p.m. UTC
  A failed assertion was observed with std::atomic<bool>::wait when the
loop in __atomic_wait_address is entered and calls _M_setup_wait a
second time. When the first call to _M_setup_wait makes a call to
_M_setup_proxy_wait it decides that a proxy wait is needed for bool, and
updates the _M_obj and _M_obj_size members to refer to the futex in the
proxy state, instead of referring to the bool object. The next time
_M_setup_wait is called it calls _M_setup_proxy_wait again but now it
sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
check.

The problem is that _M_setup_wait is calling _M_setup_proxy_wait twice,
when it should avoid the second call because the decision to use a proxy
wait has already been made. The caller should avoid making the second
call by checking whether _M_obj != addr, because that implies that a
decision to make a proxy wait has already happened.

This change fixes the bug, by preventing a second call to
_M_setup_proxy_wait after we've changed _M_obj and _M_obj_size. But it
doesn't prevent a second call in the case where _M_setup_proxy_wait
returned false because we made a runtime decision to do a non-proxy
wait. That is suboptimal, because it means that code which can wait on
the atomic variable directly makes redundant calls into the library just
to be told the same answer every time. That's not a problem for now
because firstly, it's only a performance degradation not a correctness
bug, and secondly because we currently have no targets that ever make a
runtime decision to do a non-proxy wait. Linux makes a compile-time
decision to do non-proxy waits for futex-sized objects and does proxy
waits for everything else, and all non-Linux targets always do proxy
waits. So solving the performance problem for non-proxy waits can happen
later. We might want to set a bit in __wait_args::_M_flags to say "I
already told you this isn't a proxy wait, stop asking me".

libstdc++-v3/ChangeLog:

	* include/bits/atomic_wait.h (__wait_args::_M_setup_wait): Do
	not call _M_setup_proxy_wait again if a proxy wait was already
	set up.
---

Tested x86_64-linux and x86_64-freebsd14 (just the parts that use atomic
waits).

 libstdc++-v3/include/bits/atomic_wait.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Tomasz Kaminski Jan. 9, 2026, 8:55 a.m. UTC | #1
On Thu, Jan 8, 2026 at 10:50 PM Jonathan Wakely <jwakely@redhat.com> wrote:

> A failed assertion was observed with std::atomic<bool>::wait when the
> loop in __atomic_wait_address is entered and calls _M_setup_wait a
> second time. When the first call to _M_setup_wait makes a call to
> _M_setup_proxy_wait it decides that a proxy wait is needed for bool, and
> updates the _M_obj and _M_obj_size members to refer to the futex in the
> proxy state, instead of referring to the bool object. The next time
> _M_setup_wait is called it calls _M_setup_proxy_wait again but now it
> sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
> is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
> check.
>
> The problem is that _M_setup_wait is calling _M_setup_proxy_wait twice,
> when it should avoid the second call because the decision to use a proxy
> wait has already been made. The caller should avoid making the second
> call by checking whether _M_obj != addr, because that implies that a
> decision to make a proxy wait has already happened.
>
I do not think this is correct behavior, the _M_setup_wait have two
responsibilities now:
* deciding it the proxy wait should be used
* updating the _M_old value with the most recent _M_ver versions, I am
referring to this line
  // Read the value of the _M_ver counter.
  _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
To hit this assertion, this means that there was contention on the proxy
wait slot
or had a spurious wake, and in this case _M_ver was changed, and we should
still update the _M_old to the latest value.

I think the proper fix matching the approach would be to have:
bool
__wait_args::_M_setup_proxy_wait(const void* addr)
{
  if (_M_obj !=  addr) {
    // We already decided to use proxy wait, read the value of the _M_ver
counter.
    _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
  }

     if (!use_proxy_wait(*this)) // We can wait on this address directly.
         return false;

    // This will be a proxy wait, so get a waitable state.
    auto state = set_wait_state(addr, *this);

    // The address we will wait on is the version count of the waitable
state:
    _M_obj = &state->_M_ver;
    // __wait_impl and __wait_until_impl need to know this size:
    _M_obj_size = sizeof(state->_M_ver);
  }
}

  // Read the value of the _M_ver counter.
  _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);

  return true;
}
// See below for alternative

>
> This change fixes the bug, by preventing a second call to
> _M_setup_proxy_wait after we've changed _M_obj and _M_obj_size. But it
> doesn't prevent a second call in the case where _M_setup_proxy_wait
> returned false because we made a runtime decision to do a non-proxy
> wait. That is suboptimal, because it means that code which can wait on
> the atomic variable directly makes redundant calls into the library just
> to be told the same answer every time. That's not a problem for now
> because firstly, it's only a performance degradation not a correctness
> bug, and secondly because we currently have no targets that ever make a
> runtime decision to do a non-proxy wait. Linux makes a compile-time
> decision to do non-proxy waits for futex-sized objects and does proxy
> waits for everything else, and all non-Linux targets always do proxy
> waits. So solving the performance problem for non-proxy waits can happen
> later. We might want to set a bit in __wait_args::_M_flags to say "I
> already told you this isn't a proxy wait, stop asking me".
>
I think the best option would be to separate _M_setup_wait (initial setups)
and later updates into separate functions. Like _M_setup_wait and
_M_update_wait.
I will post a patch implementing those suggestion.



>
> libstdc++-v3/ChangeLog:
>
>         * include/bits/atomic_wait.h (__wait_args::_M_setup_wait): Do
>         not call _M_setup_proxy_wait again if a proxy wait was already
>         set up.
> ---
>
> Tested x86_64-linux and x86_64-freebsd14 (just the parts that use atomic
> waits).
>
>  libstdc++-v3/include/bits/atomic_wait.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/bits/atomic_wait.h
> b/libstdc++-v3/include/bits/atomic_wait.h
> index b6240a95370f..a677dbf0a040 100644
> --- a/libstdc++-v3/include/bits/atomic_wait.h
> +++ b/libstdc++-v3/include/bits/atomic_wait.h
> @@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>             }
>
>           if constexpr (!__platform_wait_uses_type<_Tp>)
> -           if (_M_setup_proxy_wait(__addr))
> +           if (_M_obj != __addr || _M_setup_proxy_wait(__addr))
>
Or to reload _M_wait here.

>               {
>                 // We will use a proxy wait for this object.
>                 // The library has set _M_obj and _M_obj_size and _M_old.
> --
> 2.52.0
>
>
  
Jonathan Wakely Jan. 9, 2026, 9:26 a.m. UTC | #2
On Fri, 9 Jan 2026 at 08:56, Tomasz Kaminski <tkaminsk@redhat.com> wrote:
>
>
>
> On Thu, Jan 8, 2026 at 10:50 PM Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> A failed assertion was observed with std::atomic<bool>::wait when the
>> loop in __atomic_wait_address is entered and calls _M_setup_wait a
>> second time. When the first call to _M_setup_wait makes a call to
>> _M_setup_proxy_wait it decides that a proxy wait is needed for bool, and
>> updates the _M_obj and _M_obj_size members to refer to the futex in the
>> proxy state, instead of referring to the bool object. The next time
>> _M_setup_wait is called it calls _M_setup_proxy_wait again but now it
>> sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
>> is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
>> check.
>>
>> The problem is that _M_setup_wait is calling _M_setup_proxy_wait twice,
>> when it should avoid the second call because the decision to use a proxy
>> wait has already been made. The caller should avoid making the second
>> call by checking whether _M_obj != addr, because that implies that a
>> decision to make a proxy wait has already happened.
>
> I do not think this is correct behavior, the _M_setup_wait have two responsibilities now:
> * deciding it the proxy wait should be used
> * updating the _M_old value with the most recent _M_ver versions, I am referring to this line
>   // Read the value of the _M_ver counter.
>   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
> To hit this assertion, this means that there was contention on the proxy wait slot
> or had a spurious wake,

It happens on a non-spurious wake as well, the current code always
calls _M_setup_wait after __wait_impl, even if it woke up because the
value changed and a notify call was done.


> still update the _M_old to the latest value.
>
> I think the proper fix matching the approach would be to have:
> bool
> __wait_args::_M_setup_proxy_wait(const void* addr)
> {
>   if (_M_obj !=  addr) {
>     // We already decided to use proxy wait, read the value of the _M_ver counter.
>     _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
>   }

If the condition above is true, the we don't need to call
use_proxy_wait, so this should be if-else:

>      if (!use_proxy_wait(*this)) // We can wait on this address directly.
>          return false;

Or maybe you meant to do return true above, after setting _M_old


>     // This will be a proxy wait, so get a waitable state.
>     auto state = set_wait_state(addr, *this);
>
>     // The address we will wait on is the version count of the waitable state:
>     _M_obj = &state->_M_ver;
>     // __wait_impl and __wait_until_impl need to know this size:
>     _M_obj_size = sizeof(state->_M_ver);
>   }
> }
>
>   // Read the value of the _M_ver counter.
>   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
>
>   return true;
> }
> // See below for alternative
>>
>>
>> This change fixes the bug, by preventing a second call to
>> _M_setup_proxy_wait after we've changed _M_obj and _M_obj_size. But it
>> doesn't prevent a second call in the case where _M_setup_proxy_wait
>> returned false because we made a runtime decision to do a non-proxy
>> wait. That is suboptimal, because it means that code which can wait on
>> the atomic variable directly makes redundant calls into the library just
>> to be told the same answer every time. That's not a problem for now
>> because firstly, it's only a performance degradation not a correctness
>> bug, and secondly because we currently have no targets that ever make a
>> runtime decision to do a non-proxy wait. Linux makes a compile-time
>> decision to do non-proxy waits for futex-sized objects and does proxy
>> waits for everything else, and all non-Linux targets always do proxy
>> waits. So solving the performance problem for non-proxy waits can happen
>> later. We might want to set a bit in __wait_args::_M_flags to say "I
>> already told you this isn't a proxy wait, stop asking me".
>
> I think the best option would be to separate _M_setup_wait (initial setups)
> and later updates into separate functions. Like _M_setup_wait and _M_update_wait.
> I will post a patch implementing those suggestion.
>
>
>>
>>
>> libstdc++-v3/ChangeLog:
>>
>>         * include/bits/atomic_wait.h (__wait_args::_M_setup_wait): Do
>>         not call _M_setup_proxy_wait again if a proxy wait was already
>>         set up.
>> ---
>>
>> Tested x86_64-linux and x86_64-freebsd14 (just the parts that use atomic
>> waits).
>>
>>  libstdc++-v3/include/bits/atomic_wait.h | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
>> index b6240a95370f..a677dbf0a040 100644
>> --- a/libstdc++-v3/include/bits/atomic_wait.h
>> +++ b/libstdc++-v3/include/bits/atomic_wait.h
>> @@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>             }
>>
>>           if constexpr (!__platform_wait_uses_type<_Tp>)
>> -           if (_M_setup_proxy_wait(__addr))
>> +           if (_M_obj != __addr || _M_setup_proxy_wait(__addr))
>
> Or to reload _M_wait here.
>>
>>               {
>>                 // We will use a proxy wait for this object.
>>                 // The library has set _M_obj and _M_obj_size and _M_old.
>> --
>> 2.52.0
>>
  
Tomasz Kaminski Jan. 9, 2026, 9:40 a.m. UTC | #3
On Fri, Jan 9, 2026 at 10:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:

> On Fri, 9 Jan 2026 at 08:56, Tomasz Kaminski <tkaminsk@redhat.com> wrote:
> >
> >
> >
> > On Thu, Jan 8, 2026 at 10:50 PM Jonathan Wakely <jwakely@redhat.com>
> wrote:
> >>
> >> A failed assertion was observed with std::atomic<bool>::wait when the
> >> loop in __atomic_wait_address is entered and calls _M_setup_wait a
> >> second time. When the first call to _M_setup_wait makes a call to
> >> _M_setup_proxy_wait it decides that a proxy wait is needed for bool, and
> >> updates the _M_obj and _M_obj_size members to refer to the futex in the
> >> proxy state, instead of referring to the bool object. The next time
> >> _M_setup_wait is called it calls _M_setup_proxy_wait again but now it
> >> sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
> >> is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
> >> check.
> >>
> >> The problem is that _M_setup_wait is calling _M_setup_proxy_wait twice,
> >> when it should avoid the second call because the decision to use a proxy
> >> wait has already been made. The caller should avoid making the second
> >> call by checking whether _M_obj != addr, because that implies that a
> >> decision to make a proxy wait has already happened.
> >
> > I do not think this is correct behavior, the _M_setup_wait have two
> responsibilities now:
> > * deciding it the proxy wait should be used
> > * updating the _M_old value with the most recent _M_ver versions, I am
> referring to this line
> >   // Read the value of the _M_ver counter.
> >   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
> > To hit this assertion, this means that there was contention on the proxy
> wait slot
> > or had a spurious wake,
>
> It happens on a non-spurious wake as well, the current code always
> calls _M_setup_wait after __wait_impl, even if it woke up because the
> value changed and a notify call was done.
>
That what I mean by contention on the wait slot, _M_ver in __watiable_state
was
notified, but the value of our variable hasn't changed. That most likely
mean other
variable in the same slot was updated.

>
>
> > still update the _M_old to the latest value.
> >
> > I think the proper fix matching the approach would be to have:
> > bool
> > __wait_args::_M_setup_proxy_wait(const void* addr)
> > {
> >   if (_M_obj !=  addr) {
> >     // We already decided to use proxy wait, read the value of the
> _M_ver counter.
> >     _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
> >   }
>
> If the condition above is true, the we don't need to call
> use_proxy_wait, so this should be if-else:

Yes, just return true.

>


> >      if (!use_proxy_wait(*this)) // We can wait on this address directly.
> >          return false;
>
> Or maybe you meant to do return true above, after setting _M_old
>
>
> >     // This will be a proxy wait, so get a waitable state.
> >     auto state = set_wait_state(addr, *this);
> >
> >     // The address we will wait on is the version count of the waitable
> state:
> >     _M_obj = &state->_M_ver;
> >     // __wait_impl and __wait_until_impl need to know this size:
> >     _M_obj_size = sizeof(state->_M_ver);
> >   }
> > }
> >
> >   // Read the value of the _M_ver counter.
> >   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
> >
> >   return true;
> > }
> > // See below for alternative
> >>
> >>
> >> This change fixes the bug, by preventing a second call to
> >> _M_setup_proxy_wait after we've changed _M_obj and _M_obj_size. But it
> >> doesn't prevent a second call in the case where _M_setup_proxy_wait
> >> returned false because we made a runtime decision to do a non-proxy
> >> wait. That is suboptimal, because it means that code which can wait on
> >> the atomic variable directly makes redundant calls into the library just
> >> to be told the same answer every time. That's not a problem for now
> >> because firstly, it's only a performance degradation not a correctness
> >> bug, and secondly because we currently have no targets that ever make a
> >> runtime decision to do a non-proxy wait. Linux makes a compile-time
> >> decision to do non-proxy waits for futex-sized objects and does proxy
> >> waits for everything else, and all non-Linux targets always do proxy
> >> waits. So solving the performance problem for non-proxy waits can happen
> >> later. We might want to set a bit in __wait_args::_M_flags to say "I
> >> already told you this isn't a proxy wait, stop asking me".
> >
> > I think the best option would be to separate _M_setup_wait (initial
> setups)
> > and later updates into separate functions. Like _M_setup_wait and
> _M_update_wait.
> > I will post a patch implementing those suggestion.
> >
> >
> >>
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >>         * include/bits/atomic_wait.h (__wait_args::_M_setup_wait): Do
> >>         not call _M_setup_proxy_wait again if a proxy wait was already
> >>         set up.
> >> ---
> >>
> >> Tested x86_64-linux and x86_64-freebsd14 (just the parts that use atomic
> >> waits).
> >>
> >>  libstdc++-v3/include/bits/atomic_wait.h | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/libstdc++-v3/include/bits/atomic_wait.h
> b/libstdc++-v3/include/bits/atomic_wait.h
> >> index b6240a95370f..a677dbf0a040 100644
> >> --- a/libstdc++-v3/include/bits/atomic_wait.h
> >> +++ b/libstdc++-v3/include/bits/atomic_wait.h
> >> @@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>             }
> >>
> >>           if constexpr (!__platform_wait_uses_type<_Tp>)
> >> -           if (_M_setup_proxy_wait(__addr))
> >> +           if (_M_obj != __addr || _M_setup_proxy_wait(__addr))
> >
> > Or to reload _M_wait here.
> >>
> >>               {
> >>                 // We will use a proxy wait for this object.
> >>                 // The library has set _M_obj and _M_obj_size and
> _M_old.
> >> --
> >> 2.52.0
> >>
>
>
  
Jonathan Wakely Jan. 9, 2026, 9:44 a.m. UTC | #4
On Fri, 9 Jan 2026 at 09:40, Tomasz Kaminski <tkaminsk@redhat.com> wrote:
>
>
>
> On Fri, Jan 9, 2026 at 10:26 AM Jonathan Wakely <jwakely@redhat.com> wrote:
>>
>> On Fri, 9 Jan 2026 at 08:56, Tomasz Kaminski <tkaminsk@redhat.com> wrote:
>> >
>> >
>> >
>> > On Thu, Jan 8, 2026 at 10:50 PM Jonathan Wakely <jwakely@redhat.com> wrote:
>> >>
>> >> A failed assertion was observed with std::atomic<bool>::wait when the
>> >> loop in __atomic_wait_address is entered and calls _M_setup_wait a
>> >> second time. When the first call to _M_setup_wait makes a call to
>> >> _M_setup_proxy_wait it decides that a proxy wait is needed for bool, and
>> >> updates the _M_obj and _M_obj_size members to refer to the futex in the
>> >> proxy state, instead of referring to the bool object. The next time
>> >> _M_setup_wait is called it calls _M_setup_proxy_wait again but now it
>> >> sees _M_obj_size == sizeof(futex) and so this time decides a proxy wait
>> >> is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
>> >> check.
>> >>
>> >> The problem is that _M_setup_wait is calling _M_setup_proxy_wait twice,
>> >> when it should avoid the second call because the decision to use a proxy
>> >> wait has already been made. The caller should avoid making the second
>> >> call by checking whether _M_obj != addr, because that implies that a
>> >> decision to make a proxy wait has already happened.
>> >
>> > I do not think this is correct behavior, the _M_setup_wait have two responsibilities now:
>> > * deciding it the proxy wait should be used
>> > * updating the _M_old value with the most recent _M_ver versions, I am referring to this line
>> >   // Read the value of the _M_ver counter.
>> >   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
>> > To hit this assertion, this means that there was contention on the proxy wait slot
>> > or had a spurious wake,
>>
>> It happens on a non-spurious wake as well, the current code always
>> calls _M_setup_wait after __wait_impl, even if it woke up because the
>> value changed and a notify call was done.
>
> That what I mean by contention on the wait slot, _M_ver in __watiable_state was
> notified, but the value of our variable hasn't changed. That most likely mean other
> variable in the same slot was updated.

No. The case where the assertion failure was observed is a proxy wait
where there is a normal, non-contended wake-up. The value of the
variable changed.



>>
>>
>>
>> > still update the _M_old to the latest value.
>> >
>> > I think the proper fix matching the approach would be to have:
>> > bool
>> > __wait_args::_M_setup_proxy_wait(const void* addr)
>> > {
>> >   if (_M_obj !=  addr) {
>> >     // We already decided to use proxy wait, read the value of the _M_ver counter.
>> >     _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
>> >   }
>>
>> If the condition above is true, the we don't need to call
>> use_proxy_wait, so this should be if-else:
>
> Yes, just return true.
>>
>>
>>
>>
>> >      if (!use_proxy_wait(*this)) // We can wait on this address directly.
>> >          return false;
>>
>> Or maybe you meant to do return true above, after setting _M_old
>>
>>
>> >     // This will be a proxy wait, so get a waitable state.
>> >     auto state = set_wait_state(addr, *this);
>> >
>> >     // The address we will wait on is the version count of the waitable state:
>> >     _M_obj = &state->_M_ver;
>> >     // __wait_impl and __wait_until_impl need to know this size:
>> >     _M_obj_size = sizeof(state->_M_ver);
>> >   }
>> > }
>> >
>> >   // Read the value of the _M_ver counter.
>> >   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
>> >
>> >   return true;
>> > }
>> > // See below for alternative
>> >>
>> >>
>> >> This change fixes the bug, by preventing a second call to
>> >> _M_setup_proxy_wait after we've changed _M_obj and _M_obj_size. But it
>> >> doesn't prevent a second call in the case where _M_setup_proxy_wait
>> >> returned false because we made a runtime decision to do a non-proxy
>> >> wait. That is suboptimal, because it means that code which can wait on
>> >> the atomic variable directly makes redundant calls into the library just
>> >> to be told the same answer every time. That's not a problem for now
>> >> because firstly, it's only a performance degradation not a correctness
>> >> bug, and secondly because we currently have no targets that ever make a
>> >> runtime decision to do a non-proxy wait. Linux makes a compile-time
>> >> decision to do non-proxy waits for futex-sized objects and does proxy
>> >> waits for everything else, and all non-Linux targets always do proxy
>> >> waits. So solving the performance problem for non-proxy waits can happen
>> >> later. We might want to set a bit in __wait_args::_M_flags to say "I
>> >> already told you this isn't a proxy wait, stop asking me".
>> >
>> > I think the best option would be to separate _M_setup_wait (initial setups)
>> > and later updates into separate functions. Like _M_setup_wait and _M_update_wait.
>> > I will post a patch implementing those suggestion.
>> >
>> >
>> >>
>> >>
>> >> libstdc++-v3/ChangeLog:
>> >>
>> >>         * include/bits/atomic_wait.h (__wait_args::_M_setup_wait): Do
>> >>         not call _M_setup_proxy_wait again if a proxy wait was already
>> >>         set up.
>> >> ---
>> >>
>> >> Tested x86_64-linux and x86_64-freebsd14 (just the parts that use atomic
>> >> waits).
>> >>
>> >>  libstdc++-v3/include/bits/atomic_wait.h | 2 +-
>> >>  1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
>> >> index b6240a95370f..a677dbf0a040 100644
>> >> --- a/libstdc++-v3/include/bits/atomic_wait.h
>> >> +++ b/libstdc++-v3/include/bits/atomic_wait.h
>> >> @@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> >>             }
>> >>
>> >>           if constexpr (!__platform_wait_uses_type<_Tp>)
>> >> -           if (_M_setup_proxy_wait(__addr))
>> >> +           if (_M_obj != __addr || _M_setup_proxy_wait(__addr))
>> >
>> > Or to reload _M_wait here.
>> >>
>> >>               {
>> >>                 // We will use a proxy wait for this object.
>> >>                 // The library has set _M_obj and _M_obj_size and _M_old.
>> >> --
>> >> 2.52.0
>> >>
>>
  
Tomasz Kaminski Jan. 9, 2026, 10 a.m. UTC | #5
On Fri, Jan 9, 2026 at 10:44 AM Jonathan Wakely <jwakely@redhat.com> wrote:

> On Fri, 9 Jan 2026 at 09:40, Tomasz Kaminski <tkaminsk@redhat.com> wrote:
> >
> >
> >
> > On Fri, Jan 9, 2026 at 10:26 AM Jonathan Wakely <jwakely@redhat.com>
> wrote:
> >>
> >> On Fri, 9 Jan 2026 at 08:56, Tomasz Kaminski <tkaminsk@redhat.com>
> wrote:
> >> >
> >> >
> >> >
> >> > On Thu, Jan 8, 2026 at 10:50 PM Jonathan Wakely <jwakely@redhat.com>
> wrote:
> >> >>
> >> >> A failed assertion was observed with std::atomic<bool>::wait when the
> >> >> loop in __atomic_wait_address is entered and calls _M_setup_wait a
> >> >> second time. When the first call to _M_setup_wait makes a call to
> >> >> _M_setup_proxy_wait it decides that a proxy wait is needed for bool,
> and
> >> >> updates the _M_obj and _M_obj_size members to refer to the futex in
> the
> >> >> proxy state, instead of referring to the bool object. The next time
> >> >> _M_setup_wait is called it calls _M_setup_proxy_wait again but now it
> >> >> sees _M_obj_size == sizeof(futex) and so this time decides a proxy
> wait
> >> >> is *not* needed, and then fails the __glibcxx_assert(_M_obj == addr)
> >> >> check.
> >> >>
> >> >> The problem is that _M_setup_wait is calling _M_setup_proxy_wait
> twice,
> >> >> when it should avoid the second call because the decision to use a
> proxy
> >> >> wait has already been made. The caller should avoid making the second
> >> >> call by checking whether _M_obj != addr, because that implies that a
> >> >> decision to make a proxy wait has already happened.
> >> >
> >> > I do not think this is correct behavior, the _M_setup_wait have two
> responsibilities now:
> >> > * deciding it the proxy wait should be used
> >> > * updating the _M_old value with the most recent _M_ver versions, I
> am referring to this line
> >> >   // Read the value of the _M_ver counter.
> >> >   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
> >> > To hit this assertion, this means that there was contention on the
> proxy wait slot
> >> > or had a spurious wake,
> >>
> >> It happens on a non-spurious wake as well, the current code always
> >> calls _M_setup_wait after __wait_impl, even if it woke up because the
> >> value changed and a notify call was done.
> >
> > That what I mean by contention on the wait slot, _M_ver in
> __watiable_state was
> > notified, but the value of our variable hasn't changed. That most likely
> mean other
> > variable in the same slot was updated.
>
> No. The case where the assertion failure was observed is a proxy wait
> where there is a normal, non-contended wake-up. The value of the
> variable changed.
>
Ah yes, update wait is always called after successful wait, before the
predicate is checked.

>
>
>
> >>
> >>
> >>
> >> > still update the _M_old to the latest value.
> >> >
> >> > I think the proper fix matching the approach would be to have:
> >> > bool
> >> > __wait_args::_M_setup_proxy_wait(const void* addr)
> >> > {
> >> >   if (_M_obj !=  addr) {
> >> >     // We already decided to use proxy wait, read the value of the
> _M_ver counter.
> >> >     _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
> >> >   }
> >>
> >> If the condition above is true, the we don't need to call
> >> use_proxy_wait, so this should be if-else:
> >
> > Yes, just return true.
> >>
> >>
> >>
> >>
> >> >      if (!use_proxy_wait(*this)) // We can wait on this address
> directly.
> >> >          return false;
> >>
> >> Or maybe you meant to do return true above, after setting _M_old
> >>
> >>
> >> >     // This will be a proxy wait, so get a waitable state.
> >> >     auto state = set_wait_state(addr, *this);
> >> >
> >> >     // The address we will wait on is the version count of the
> waitable state:
> >> >     _M_obj = &state->_M_ver;
> >> >     // __wait_impl and __wait_until_impl need to know this size:
> >> >     _M_obj_size = sizeof(state->_M_ver);
> >> >   }
> >> > }
> >> >
> >> >   // Read the value of the _M_ver counter.
> >> >   _M_old = __atomic_load_n(&state->_M_ver, __ATOMIC_ACQUIRE);
> >> >
> >> >   return true;
> >> > }
> >> > // See below for alternative
> >> >>
> >> >>
> >> >> This change fixes the bug, by preventing a second call to
> >> >> _M_setup_proxy_wait after we've changed _M_obj and _M_obj_size. But
> it
> >> >> doesn't prevent a second call in the case where _M_setup_proxy_wait
> >> >> returned false because we made a runtime decision to do a non-proxy
> >> >> wait. That is suboptimal, because it means that code which can wait
> on
> >> >> the atomic variable directly makes redundant calls into the library
> just
> >> >> to be told the same answer every time. That's not a problem for now
> >> >> because firstly, it's only a performance degradation not a
> correctness
> >> >> bug, and secondly because we currently have no targets that ever
> make a
> >> >> runtime decision to do a non-proxy wait. Linux makes a compile-time
> >> >> decision to do non-proxy waits for futex-sized objects and does proxy
> >> >> waits for everything else, and all non-Linux targets always do proxy
> >> >> waits. So solving the performance problem for non-proxy waits can
> happen
> >> >> later. We might want to set a bit in __wait_args::_M_flags to say "I
> >> >> already told you this isn't a proxy wait, stop asking me".
> >> >
> >> > I think the best option would be to separate _M_setup_wait (initial
> setups)
> >> > and later updates into separate functions. Like _M_setup_wait and
> _M_update_wait.
> >> > I will post a patch implementing those suggestion.
> >> >
> >> >
> >> >>
> >> >>
> >> >> libstdc++-v3/ChangeLog:
> >> >>
> >> >>         * include/bits/atomic_wait.h (__wait_args::_M_setup_wait): Do
> >> >>         not call _M_setup_proxy_wait again if a proxy wait was
> already
> >> >>         set up.
> >> >> ---
> >> >>
> >> >> Tested x86_64-linux and x86_64-freebsd14 (just the parts that use
> atomic
> >> >> waits).
> >> >>
> >> >>  libstdc++-v3/include/bits/atomic_wait.h | 2 +-
> >> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >>
> >> >> diff --git a/libstdc++-v3/include/bits/atomic_wait.h
> b/libstdc++-v3/include/bits/atomic_wait.h
> >> >> index b6240a95370f..a677dbf0a040 100644
> >> >> --- a/libstdc++-v3/include/bits/atomic_wait.h
> >> >> +++ b/libstdc++-v3/include/bits/atomic_wait.h
> >> >> @@ -230,7 +230,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >> >>             }
> >> >>
> >> >>           if constexpr (!__platform_wait_uses_type<_Tp>)
> >> >> -           if (_M_setup_proxy_wait(__addr))
> >> >> +           if (_M_obj != __addr || _M_setup_proxy_wait(__addr))
> >> >
> >> > Or to reload _M_wait here.
> >> >>
> >> >>               {
> >> >>                 // We will use a proxy wait for this object.
> >> >>                 // The library has set _M_obj and _M_obj_size and
> _M_old.
> >> >> --
> >> >> 2.52.0
> >> >>
> >>
>
>
  

Patch

diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
index b6240a95370f..a677dbf0a040 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -230,7 +230,7 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	    }
 
 	  if constexpr (!__platform_wait_uses_type<_Tp>)
-	    if (_M_setup_proxy_wait(__addr))
+	    if (_M_obj != __addr || _M_setup_proxy_wait(__addr))
 	      {
 		// We will use a proxy wait for this object.
 		// The library has set _M_obj and _M_obj_size and _M_old.