[libatomic] Fix return value in libat_test_and_set

Message ID 20220324082813.GA6633@delia.home
State New
Headers
Series [libatomic] Fix return value in libat_test_and_set |

Commit Message

Tom de Vries March 24, 2022, 8:28 a.m. UTC
  Hi,

On nvptx (using a Quadro K2000 with driver 470.103.01) I ran into this:
...
FAIL: gcc.dg/atomic/stdatomic-flag-2.c -O1 execution test
...
which mimimized to:
...
  #include <stdatomic.h>
  atomic_flag a = ATOMIC_FLAG_INIT;
  int main () {
    if ((atomic_flag_test_and_set) (&a))
      __builtin_abort ();
    return 0;
  }
...

The atomic_flag_test_and_set is implemented using __atomic_test_and_set_1,
which corresponds to the "word-sized compare-and-swap loop" version of
libat_test_and_set in libatomic/tas_n.c.

The semantics of a test-and-set is that the return value is "true if and only
if the previous contents were 'set'".

But the code uses:
...
  return woldval != 0;
...
which means it doesn't look only at the byte that was either set or not set,
but at the entire word.

Fix this by using instead:
...
  return (woldval & wval) == wval;
...

Tested on nvptx.

OK for trunk?

Thanks,
- Tom

[libatomic] Fix return value in libat_test_and_set

---
 libatomic/tas_n.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Jakub Jelinek March 24, 2022, 9:02 a.m. UTC | #1
On Thu, Mar 24, 2022 at 09:28:15AM +0100, Tom de Vries via Gcc-patches wrote:
> Hi,
> 
> On nvptx (using a Quadro K2000 with driver 470.103.01) I ran into this:
> ...
> FAIL: gcc.dg/atomic/stdatomic-flag-2.c -O1 execution test
> ...
> which mimimized to:
> ...
>   #include <stdatomic.h>
>   atomic_flag a = ATOMIC_FLAG_INIT;
>   int main () {
>     if ((atomic_flag_test_and_set) (&a))
>       __builtin_abort ();
>     return 0;
>   }
> ...
> 
> The atomic_flag_test_and_set is implemented using __atomic_test_and_set_1,
> which corresponds to the "word-sized compare-and-swap loop" version of
> libat_test_and_set in libatomic/tas_n.c.
> 
> The semantics of a test-and-set is that the return value is "true if and only
> if the previous contents were 'set'".
> 
> But the code uses:
> ...
>   return woldval != 0;
> ...
> which means it doesn't look only at the byte that was either set or not set,
> but at the entire word.
> 
> Fix this by using instead:
> ...
>   return (woldval & wval) == wval;

Shouldn't that be instead
  return (woldval & ((UWORD) -1 << shift)) != 0;
or
  return (woldval & ((UWORD) ~(UWORD) 0 << shift)) != 0;
?
The exact __GCC_ATOMIC_TEST_AND_SET_TRUEVAL varies (the most usual
value is 1, but sparc uses 0xff and m68k/sh use 0x80), falseval is
always 0 though and (woldval & wval) == wval
is testing whether some bits of the oldval are all set rather than
whether the old byte was 0.
Say for trueval 1 it tests whether the least significant bit is set,
for 0x80 if the most significant bit of the byte is set, for
0xff whether all bits are set.

	Jakub
  
Tom de Vries March 24, 2022, 10:01 a.m. UTC | #2
On 3/24/22 10:02, Jakub Jelinek wrote:
> On Thu, Mar 24, 2022 at 09:28:15AM +0100, Tom de Vries via Gcc-patches wrote:
>> Hi,
>>
>> On nvptx (using a Quadro K2000 with driver 470.103.01) I ran into this:
>> ...
>> FAIL: gcc.dg/atomic/stdatomic-flag-2.c -O1 execution test
>> ...
>> which mimimized to:
>> ...
>>    #include <stdatomic.h>
>>    atomic_flag a = ATOMIC_FLAG_INIT;
>>    int main () {
>>      if ((atomic_flag_test_and_set) (&a))
>>        __builtin_abort ();
>>      return 0;
>>    }
>> ...
>>
>> The atomic_flag_test_and_set is implemented using __atomic_test_and_set_1,
>> which corresponds to the "word-sized compare-and-swap loop" version of
>> libat_test_and_set in libatomic/tas_n.c.
>>
>> The semantics of a test-and-set is that the return value is "true if and only
>> if the previous contents were 'set'".
>>
>> But the code uses:
>> ...
>>    return woldval != 0;
>> ...
>> which means it doesn't look only at the byte that was either set or not set,
>> but at the entire word.
>>
>> Fix this by using instead:
>> ...
>>    return (woldval & wval) == wval;
> 
> Shouldn't that be instead
>    return (woldval & ((UWORD) -1 << shift)) != 0;
> or
>    return (woldval & ((UWORD) ~(UWORD) 0 << shift)) != 0;
> ?

Well, I used '(woldval & wval) == wval' based on the fact that the set 
operation uses a bitor:
...
   wval = (UWORD)__GCC_ATOMIC_TEST_AND_SET_TRUEVAL << shift;
   woldval = __atomic_load_n (wptr, __ATOMIC_RELAXED);
   do
     {
       t = woldval | wval;
...
so apparently we do not care here about bits not in 
__GCC_ATOMIC_TEST_AND_SET_TRUEVAL (or alternatively, we care but assume 
that they're 0).

AFAIU, it would have been more precise to compare the entire byte with 
__GCC_ATOMIC_TEST_AND_SET_TRUEVAL, but then it would have made sense to 
set the entire byte in the set part as well.

Anyway, that doesn't seem to be what you're proposing.  During 
investigation of the failure I found that the address used is 
word-aligned, so shift becomes 0 in that case.  AFAICT, the fix you're 
proposing is a nop for shift == 0, and indeed, it doesn't fix the 
failure I'm observing.

> The exact __GCC_ATOMIC_TEST_AND_SET_TRUEVAL varies (the most usual
> value is 1, but sparc uses 0xff and m68k/sh use 0x80), falseval is
> always 0 though and (woldval & wval) == wval
> is testing whether some bits of the oldval are all set rather than
> whether the old byte was 0.
> Say for trueval 1 it tests whether the least significant bit is set,
> for 0x80 if the most significant bit of the byte is set, for
> 0xff whether all bits are set.

Yes, I noticed that.

AFAIU, the proposed patch ddrt under the assumption that we don't care 
about bits not set in __GCC_ATOMIC_TEST_AND_SET_TRUEVAL.

If that's not acceptable, I can submit a patch that doesn't have that 
assumption, and tests the entire byte (but should I also fix the set 
operation then?).

Thanks,
- Tom
  
Jakub Jelinek March 24, 2022, 10:59 a.m. UTC | #3
On Thu, Mar 24, 2022 at 11:01:30AM +0100, Tom de Vries wrote:
> > Shouldn't that be instead
> >    return (woldval & ((UWORD) -1 << shift)) != 0;
> > or
> >    return (woldval & ((UWORD) ~(UWORD) 0 << shift)) != 0;
> > ?
> 
> Well, I used '(woldval & wval) == wval' based on the fact that the set
> operation uses a bitor:
> ...
>   wval = (UWORD)__GCC_ATOMIC_TEST_AND_SET_TRUEVAL << shift;
>   woldval = __atomic_load_n (wptr, __ATOMIC_RELAXED);
>   do
>     {
>       t = woldval | wval;
> ...
> so apparently we do not care here about bits not in
> __GCC_ATOMIC_TEST_AND_SET_TRUEVAL (or alternatively, we care but assume that
> they're 0).
> 
> AFAIU, it would have been more precise to compare the entire byte with
> __GCC_ATOMIC_TEST_AND_SET_TRUEVAL, but then it would have made sense to set
> the entire byte in the set part as well.
> 
> Anyway, that doesn't seem to be what you're proposing.  During investigation
> of the failure I found that the address used is word-aligned, so shift
> becomes 0 in that case.  AFAICT, the fix you're proposing is a nop for shift
> == 0, and indeed, it doesn't fix the failure I'm observing.

Ah, sorry, I certainly meant
  return (woldval & ((UTYPE) -1 << shift)) != 0;
or
  return (woldval & ((UTYPE) ~(UTYPE) 0 << shift)) != 0;
i.e. more portable ways of
  return (woldval & (0xff << shift)) != 0;
which don't hardcode that UTYPE is 8-bit unsigned char.

If one uses just __atomic_test_and_set and __atomic_clear, then I think
it makes no difference.
But testing whether the old byte was non-zero more matches the previous
intent in case the previous value is neither 0 nor __GCC_ATOMIC_TEST_AND_SET_TRUEVAL
and treats it as "set" as well.
I think we don't need to change the loop, woldval | wval even for woldval
byte containing say 42 the or will make it still non-zero.

The documentation argues against using those atomics on types other than
bool and {,{un,}signed }char but libatomic still supports those, I believe
when one doesn't have hw specific support for these, __atomic_clear will
clear the entire UTYPE.

	Jakub
  
Tom de Vries March 24, 2022, 12:08 p.m. UTC | #4
On 3/24/22 11:59, Jakub Jelinek wrote:
> On Thu, Mar 24, 2022 at 11:01:30AM +0100, Tom de Vries wrote:
>>> Shouldn't that be instead
>>>     return (woldval & ((UWORD) -1 << shift)) != 0;
>>> or
>>>     return (woldval & ((UWORD) ~(UWORD) 0 << shift)) != 0;
>>> ?
>>
>> Well, I used '(woldval & wval) == wval' based on the fact that the set
>> operation uses a bitor:
>> ...
>>    wval = (UWORD)__GCC_ATOMIC_TEST_AND_SET_TRUEVAL << shift;
>>    woldval = __atomic_load_n (wptr, __ATOMIC_RELAXED);
>>    do
>>      {
>>        t = woldval | wval;
>> ...
>> so apparently we do not care here about bits not in
>> __GCC_ATOMIC_TEST_AND_SET_TRUEVAL (or alternatively, we care but assume that
>> they're 0).
>>
>> AFAIU, it would have been more precise to compare the entire byte with
>> __GCC_ATOMIC_TEST_AND_SET_TRUEVAL, but then it would have made sense to set
>> the entire byte in the set part as well.
>>
>> Anyway, that doesn't seem to be what you're proposing.  During investigation
>> of the failure I found that the address used is word-aligned, so shift
>> becomes 0 in that case.  AFAICT, the fix you're proposing is a nop for shift
>> == 0, and indeed, it doesn't fix the failure I'm observing.
> 
> Ah, sorry, I certainly meant
>    return (woldval & ((UTYPE) -1 << shift)) != 0;
> or
>    return (woldval & ((UTYPE) ~(UTYPE) 0 << shift)) != 0;
> i.e. more portable ways of
>    return (woldval & (0xff << shift)) != 0;
> which don't hardcode that UTYPE is 8-bit unsigned char.
> 

I see, that makes sense.

> If one uses just __atomic_test_and_set and __atomic_clear, then I think
> it makes no difference.
> But testing whether the old byte was non-zero more matches the previous
> intent in case the previous value is neither 0 nor __GCC_ATOMIC_TEST_AND_SET_TRUEVAL
> and treats it as "set" as well.
> I think we don't need to change the loop, woldval | wval even for woldval
> byte containing say 42 the or will make it still non-zero.
> 
> The documentation argues against using those atomics on types other than
> bool and {,{un,}signed }char but libatomic still supports those, I believe
> when one doesn't have hw specific support for these, __atomic_clear will
> clear the entire UTYPE.

Ack, updated patch, added missing changelog contribution.

OK for trunk?

Thanks,
- Tom
  
Jakub Jelinek March 24, 2022, 12:25 p.m. UTC | #5
On Thu, Mar 24, 2022 at 01:08:56PM +0100, Tom de Vries wrote:
> Ack, updated patch, added missing changelog contribution.
> 
> OK for trunk?

Yes.  I guess it is a backport candidate to release branches as well
(after a while).

	Jakub
  

Patch

diff --git a/libatomic/tas_n.c b/libatomic/tas_n.c
index d0d8c283b49..65eaa7753a5 100644
--- a/libatomic/tas_n.c
+++ b/libatomic/tas_n.c
@@ -73,7 +73,7 @@  SIZE(libat_test_and_set) (UTYPE *mptr, int smodel)
 				     __ATOMIC_RELAXED, __ATOMIC_RELAXED));
 
   post_barrier (smodel);
-  return woldval != 0;
+  return (woldval & wval) == wval;
 }
 
 #define DONE 1