nptl: Effectively skip CAS in spinlock loop

Message ID 20211205152814.13597-1-6812skiii@gmail.com
State Superseded
Delegated to: Maxim Kuvyrkov
Headers
Series nptl: Effectively skip CAS in spinlock loop |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Jangwoong Kim Dec. 5, 2021, 3:28 p.m. UTC
  The commit:
"Add LLL_MUTEX_READ_LOCK [BZ #28537]"
SHA1: d672a98a1af106bd68deb15576710cd61363f7a6

introduced LLL_MUTEX_READ_LOCK, to skip CAS in spinlock loop
if atmoic load fails. But, "continue" inside of do-while loop
does not skip the evaluation of escape expression, thus CAS
is not skipped.

Replace do-while with while and break if LLL_MUTEX_READ_LOCK fails.
---
 nptl/pthread_mutex_lock.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
  

Comments

Maxim Kuvyrkov Dec. 6, 2021, 2:11 p.m. UTC | #1
> On 5 Dec 2021, at 18:28, Jangwoong Kim via Libc-alpha <libc-alpha@sourceware.org> wrote:
> 
> The commit:
> "Add LLL_MUTEX_READ_LOCK [BZ #28537]"
> SHA1: d672a98a1af106bd68deb15576710cd61363f7a6
> 
> introduced LLL_MUTEX_READ_LOCK, to skip CAS in spinlock loop
> if atmoic load fails. But, "continue" inside of do-while loop
> does not skip the evaluation of escape expression, thus CAS
> is not skipped.
> 
> Replace do-while with while and break if LLL_MUTEX_READ_LOCK fails.
> ---
> nptl/pthread_mutex_lock.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)

Good catch!  Looks good to me.

--
Maxim Kuvyrkov
https://www.linaro.org


> 
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index 47b88a6b5b..24936c8d56 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -138,7 +138,7 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> 	  int cnt = 0;
> 	  int max_cnt = MIN (max_adaptive_count (),
> 			     mutex->__data.__spins * 2 + 10);
> -	  do
> +	  while (1)
> 	    {
> 	      if (cnt++ >= max_cnt)
> 		{
> @@ -148,8 +148,9 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> 	      atomic_spin_nop ();
> 	      if (LLL_MUTEX_READ_LOCK (mutex) != 0)
> 		continue;
> +	      if (LLL_MUTEX_TRYLOCK (mutex) == 0)
> +		break;
> 	    }
> -	  while (LLL_MUTEX_TRYLOCK (mutex) != 0);
> 
> 	  mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
> 	}
> -- 
> 2.25.1
>
  
Maxim Kuvyrkov Dec. 6, 2021, 2:26 p.m. UTC | #2
> On 5 Dec 2021, at 18:28, Jangwoong Kim via Libc-alpha <libc-alpha@sourceware.org> wrote:
> 
> The commit:
> "Add LLL_MUTEX_READ_LOCK [BZ #28537]"
> SHA1: d672a98a1af106bd68deb15576710cd61363f7a6
> 
> introduced LLL_MUTEX_READ_LOCK, to skip CAS in spinlock loop
> if atmoic load fails. But, "continue" inside of do-while loop
> does not skip the evaluation of escape expression, thus CAS
> is not skipped.
> 
> Replace do-while with while and break if LLL_MUTEX_READ_LOCK fails.
> ---
> nptl/pthread_mutex_lock.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> index 47b88a6b5b..24936c8d56 100644
> --- a/nptl/pthread_mutex_lock.c
> +++ b/nptl/pthread_mutex_lock.c
> @@ -138,7 +138,7 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> 	  int cnt = 0;
> 	  int max_cnt = MIN (max_adaptive_count (),
> 			     mutex->__data.__spins * 2 + 10);
> -	  do
> +	  while (1)
> 	    {
> 	      if (cnt++ >= max_cnt)
> 		{
> @@ -148,8 +148,9 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> 	      atomic_spin_nop ();
> 	      if (LLL_MUTEX_READ_LOCK (mutex) != 0)
> 		continue;
> +	      if (LLL_MUTEX_TRYLOCK (mutex) == 0)
> +		break;
> 	    }
> -	  while (LLL_MUTEX_TRYLOCK (mutex) != 0);

Giving this a second look, would you please add a comment before “break” to explain why do-while loop structure would cause an extra compare-and-swap?

--
Maxim Kuvyrkov
https://www.linaro.org

> 
> 	  mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
> 	}
> -- 
> 2.25.1
>
  
Jangwoong Kim Dec. 6, 2021, 6:03 p.m. UTC | #3
Hi Maxim,

Did you mean editing the commit message?

If so, I will resend a patch right away.

Or, if you meant adding a comment to source code, I'm not clear on your request.

The reason do-while loop structure causes an extra CAS is "continue"
expression does not skip the conditional expression do-while loop.

My patch replaces the do-while loop, so there isn't an extra CAS.

Thus, adding a comment why the previous code was wrong doesn't seem proper.

--
Thank you.
Jangwoong Kim

2021년 12월 6일 (월) 오후 11:26, Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>님이 작성:
>
>
> > On 5 Dec 2021, at 18:28, Jangwoong Kim via Libc-alpha <libc-alpha@sourceware.org> wrote:
> >
> > The commit:
> > "Add LLL_MUTEX_READ_LOCK [BZ #28537]"
> > SHA1: d672a98a1af106bd68deb15576710cd61363f7a6
> >
> > introduced LLL_MUTEX_READ_LOCK, to skip CAS in spinlock loop
> > if atmoic load fails. But, "continue" inside of do-while loop
> > does not skip the evaluation of escape expression, thus CAS
> > is not skipped.
> >
> > Replace do-while with while and break if LLL_MUTEX_READ_LOCK fails.
> > ---
> > nptl/pthread_mutex_lock.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
> > index 47b88a6b5b..24936c8d56 100644
> > --- a/nptl/pthread_mutex_lock.c
> > +++ b/nptl/pthread_mutex_lock.c
> > @@ -138,7 +138,7 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> >         int cnt = 0;
> >         int max_cnt = MIN (max_adaptive_count (),
> >                            mutex->__data.__spins * 2 + 10);
> > -       do
> > +       while (1)
> >           {
> >             if (cnt++ >= max_cnt)
> >               {
> > @@ -148,8 +148,9 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
> >             atomic_spin_nop ();
> >             if (LLL_MUTEX_READ_LOCK (mutex) != 0)
> >               continue;
> > +           if (LLL_MUTEX_TRYLOCK (mutex) == 0)
> > +             break;
> >           }
> > -       while (LLL_MUTEX_TRYLOCK (mutex) != 0);
>
> Giving this a second look, would you please add a comment before “break” to explain why do-while loop structure would cause an extra compare-and-swap?
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org
>
> >
> >         mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
> >       }
> > --
> > 2.25.1
> >
>
  
Maxim Kuvyrkov Dec. 13, 2021, 1:42 p.m. UTC | #4
Hi Jangwoong,

I needed a minute to stare at your patch to understand where extra CAS execution comes from, and I meant adding a comment like the following to make code easier to understand.
+	      if (LLL_MUTEX_TRYLOCK (mutex) == 0)
                /* Do not be tempted to move this check to the loop condition.
                   See commit that added this comment for details.  */
+		break;

Feel free to merge your patch as-is, I may be overthinking this.

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org

> On 6 Dec 2021, at 21:03, Jangwoong Kim via Libc-alpha <libc-alpha@sourceware.org> wrote:
> 
> Hi Maxim,
> 
> Did you mean editing the commit message?
> 
> If so, I will resend a patch right away.
> 
> Or, if you meant adding a comment to source code, I'm not clear on your request.
> 
> The reason do-while loop structure causes an extra CAS is "continue"
> expression does not skip the conditional expression do-while loop.
> 
> My patch replaces the do-while loop, so there isn't an extra CAS.
> 
> Thus, adding a comment why the previous code was wrong doesn't seem proper.
> 
> --
> Thank you.
> Jangwoong Kim
> 
> 2021년 12월 6일 (월) 오후 11:26, Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>님이 작성:
>> 
>> 
>>> On 5 Dec 2021, at 18:28, Jangwoong Kim via Libc-alpha <libc-alpha@sourceware.org> wrote:
>>> 
>>> The commit:
>>> "Add LLL_MUTEX_READ_LOCK [BZ #28537]"
>>> SHA1: d672a98a1af106bd68deb15576710cd61363f7a6
>>> 
>>> introduced LLL_MUTEX_READ_LOCK, to skip CAS in spinlock loop
>>> if atmoic load fails. But, "continue" inside of do-while loop
>>> does not skip the evaluation of escape expression, thus CAS
>>> is not skipped.
>>> 
>>> Replace do-while with while and break if LLL_MUTEX_READ_LOCK fails.
>>> ---
>>> nptl/pthread_mutex_lock.c | 5 +++--
>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
>>> index 47b88a6b5b..24936c8d56 100644
>>> --- a/nptl/pthread_mutex_lock.c
>>> +++ b/nptl/pthread_mutex_lock.c
>>> @@ -138,7 +138,7 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
>>>        int cnt = 0;
>>>        int max_cnt = MIN (max_adaptive_count (),
>>>                           mutex->__data.__spins * 2 + 10);
>>> -       do
>>> +       while (1)
>>>          {
>>>            if (cnt++ >= max_cnt)
>>>              {
>>> @@ -148,8 +148,9 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
>>>            atomic_spin_nop ();
>>>            if (LLL_MUTEX_READ_LOCK (mutex) != 0)
>>>              continue;
>>> +           if (LLL_MUTEX_TRYLOCK (mutex) == 0)
>>> +             break;
>>>          }
>>> -       while (LLL_MUTEX_TRYLOCK (mutex) != 0);
>> 
>> Giving this a second look, would you please add a comment before “break” to explain why do-while loop structure would cause an extra compare-and-swap?
>> 
>> --
>> Maxim Kuvyrkov
>> https://www.linaro.org
>> 
>>> 
>>>        mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
>>>      }
>>> --
>>> 2.25.1
>>> 
>>
  
Andreas Schwab Dec. 13, 2021, 2:42 p.m. UTC | #5
On Dez 06 2021, Jangwoong Kim via Libc-alpha wrote:

> @@ -148,8 +148,9 @@ PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
>  	      atomic_spin_nop ();
>  	      if (LLL_MUTEX_READ_LOCK (mutex) != 0)
>  		continue;
> +	      if (LLL_MUTEX_TRYLOCK (mutex) == 0)
> +		break;

Please combine the two conditions to make it a less confusing control flow.
  

Patch

diff --git a/nptl/pthread_mutex_lock.c b/nptl/pthread_mutex_lock.c
index 47b88a6b5b..24936c8d56 100644
--- a/nptl/pthread_mutex_lock.c
+++ b/nptl/pthread_mutex_lock.c
@@ -138,7 +138,7 @@  PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
 	  int cnt = 0;
 	  int max_cnt = MIN (max_adaptive_count (),
 			     mutex->__data.__spins * 2 + 10);
-	  do
+	  while (1)
 	    {
 	      if (cnt++ >= max_cnt)
 		{
@@ -148,8 +148,9 @@  PTHREAD_MUTEX_LOCK (pthread_mutex_t *mutex)
 	      atomic_spin_nop ();
 	      if (LLL_MUTEX_READ_LOCK (mutex) != 0)
 		continue;
+	      if (LLL_MUTEX_TRYLOCK (mutex) == 0)
+		break;
 	    }
-	  while (LLL_MUTEX_TRYLOCK (mutex) != 0);
 
 	  mutex->__data.__spins += (cnt - mutex->__data.__spins) / 8;
 	}