[2/2] S390: Use generic spinlock code.
Commit Message
On 02/18/2017 06:05 PM, Torvald Riegel wrote:
> On Wed, 2017-02-15 at 17:26 +0100, Stefan Liebler wrote:
>> On 02/13/2017 09:39 PM, Torvald Riegel wrote:
>>> On Wed, 2017-02-08 at 15:49 +0100, Stefan Liebler wrote:
>>>> This is an updated version of the patch, which adjusts the s390 specific
>>>> atomic-macros in the same way as in include/atomic.h.
>>>> Thus passing a volatile int pointer is fine, too.
>>>
>>> The general direction of this is okay.
>>> Some of my comments for patch 1/2 apply here as well (eg, volatile vs.
>>> atomics).
>>>
>> See answer in patch 1/2.
>>
>>> What I don't like is the choice of 1000 for
>>> SPIN_LOCK_READS_BETWEEN_CMPXCHG. Have you run benchmarks to come up
>>> with this value, or is it a guess? Why isn't it documented how you end
>>> up with this number?
>>> We can keep these with a choice such as this, but then we need to have a
>>> FIXME comment in the code, explaining that this is just an arbitrary
>>> choice.
>>>
>>> I would guess that just spinning forever is sufficient, and that we
>>> don't need to throw in a CAS every now and then; using randomized
>>> exponential back-off might be more important. This is something that we
>>> would be in a better position to answer if you'd provide a
>>> microbenchmark for this choice too.
>>> At the end of 2016, I've posted a draft of a microbenchmark for rwlocks.
>>> Maybe you can use this as a start and run a few experiments?
>> >
>> I've run own benchmarks in the same manner as your mentioned
>> microbenchmark for rwlocks below.
>> You are right, I can't recognize a real difference between
>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 1000
>> and
>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG -1
>>
>> As it does not hurt, I prefer to use a CAS every 1000 plain reads.
>> A CAS is not necessary on current CPUs but from architecture
>> perspective, it is more correct if there is such a serialization
>> instruction.
>
> What do you mean by "more correct"? I'm not aware of an architecture
> that would not ensure that stores on one CPU will eventually become
> visible to other CPUs.
>
> Thus, as I wrote in my review of patch 1/2, I think we should just
> remove the occassional CAS (ie, just do the equivalent of the -1
> setting, always).
>
>> There is a difference between
>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 0
>> and one of the others.
>
> Right. We do want to spin if the lock is acquired by another thread.
> What we should do in a future patch is to experiment with the back-off
> between loads. tile already has some code for this, which we at least
> need to integrate at some point.
>
>> The same applies to
>> #define SPIN_TRYLOCK_LOAD_AND_TEST_BEFORE_XCHG 1
>> It does not hurt if the lock is free, but there is a difference if the
>> lock is already acquired and trylock is called often.
>
> Yes. I've replied to this point in the 1/2 patch thread.
>
>> I've saw your microbenchmark-post and added some notes.
>
> Thanks. I'll get to them later.
>
>> I added a FIXME comment to re-evaluate the choice once we have the
>> appropriate microbenchmarks.
>>
>>> Also, I'm not quite sure whether this number is really
>>> spinlock-specific, and I would like to find a better place for these.
>>> IMO, they should be in some header that contains default tuning
>>> parameters for synchronization code, which is provided by each
>>> architecture that uses the generic spinlock; we'd have no #ifdef for the
>>> tuning parameters, so we'd catch typos in those headers.
>>>
>> See pthread_spin_parameters.h in updated patch 1/2.
>
> I suggest that we'll work towards consensus on patch 1/2 first. I
> believe once that is done, patch 2/2 would likely just remove s390 code.
>
I've attached an updated patch which just removes the s390 code.
> Thanks for continuing the work on this. I know we have some back and
> forth here in terms of overall direction, but I also think we're making
> progress and are continually improving the changes.
>
Comments
On 03/14/2017 04:55 PM, Stefan Liebler wrote:
> On 02/18/2017 06:05 PM, Torvald Riegel wrote:
>> On Wed, 2017-02-15 at 17:26 +0100, Stefan Liebler wrote:
>>> On 02/13/2017 09:39 PM, Torvald Riegel wrote:
>>>> On Wed, 2017-02-08 at 15:49 +0100, Stefan Liebler wrote:
>>>>> This is an updated version of the patch, which adjusts the s390
>>>>> specific
>>>>> atomic-macros in the same way as in include/atomic.h.
>>>>> Thus passing a volatile int pointer is fine, too.
>>>>
>>>> The general direction of this is okay.
>>>> Some of my comments for patch 1/2 apply here as well (eg, volatile vs.
>>>> atomics).
>>>>
>>> See answer in patch 1/2.
>>>
>>>> What I don't like is the choice of 1000 for
>>>> SPIN_LOCK_READS_BETWEEN_CMPXCHG. Have you run benchmarks to come up
>>>> with this value, or is it a guess? Why isn't it documented how you end
>>>> up with this number?
>>>> We can keep these with a choice such as this, but then we need to
>>>> have a
>>>> FIXME comment in the code, explaining that this is just an arbitrary
>>>> choice.
>>>>
>>>> I would guess that just spinning forever is sufficient, and that we
>>>> don't need to throw in a CAS every now and then; using randomized
>>>> exponential back-off might be more important. This is something
>>>> that we
>>>> would be in a better position to answer if you'd provide a
>>>> microbenchmark for this choice too.
>>>> At the end of 2016, I've posted a draft of a microbenchmark for
>>>> rwlocks.
>>>> Maybe you can use this as a start and run a few experiments?
>>> >
>>> I've run own benchmarks in the same manner as your mentioned
>>> microbenchmark for rwlocks below.
>>> You are right, I can't recognize a real difference between
>>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 1000
>>> and
>>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG -1
>>>
>>> As it does not hurt, I prefer to use a CAS every 1000 plain reads.
>>> A CAS is not necessary on current CPUs but from architecture
>>> perspective, it is more correct if there is such a serialization
>>> instruction.
>>
>> What do you mean by "more correct"? I'm not aware of an architecture
>> that would not ensure that stores on one CPU will eventually become
>> visible to other CPUs.
>>
>> Thus, as I wrote in my review of patch 1/2, I think we should just
>> remove the occassional CAS (ie, just do the equivalent of the -1
>> setting, always).
>>
>>> There is a difference between
>>> #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 0
>>> and one of the others.
>>
>> Right. We do want to spin if the lock is acquired by another thread.
>> What we should do in a future patch is to experiment with the back-off
>> between loads. tile already has some code for this, which we at least
>> need to integrate at some point.
>>
>>> The same applies to
>>> #define SPIN_TRYLOCK_LOAD_AND_TEST_BEFORE_XCHG 1
>>> It does not hurt if the lock is free, but there is a difference if the
>>> lock is already acquired and trylock is called often.
>>
>> Yes. I've replied to this point in the 1/2 patch thread.
>>
>>> I've saw your microbenchmark-post and added some notes.
>>
>> Thanks. I'll get to them later.
>>
>>> I added a FIXME comment to re-evaluate the choice once we have the
>>> appropriate microbenchmarks.
>>>
>>>> Also, I'm not quite sure whether this number is really
>>>> spinlock-specific, and I would like to find a better place for these.
>>>> IMO, they should be in some header that contains default tuning
>>>> parameters for synchronization code, which is provided by each
>>>> architecture that uses the generic spinlock; we'd have no #ifdef for
>>>> the
>>>> tuning parameters, so we'd catch typos in those headers.
>>>>
>>> See pthread_spin_parameters.h in updated patch 1/2.
>>
>> I suggest that we'll work towards consensus on patch 1/2 first. I
>> believe once that is done, patch 2/2 would likely just remove s390 code.
>>
> I've attached an updated patch which just removes the s390 code.
>
>
PING
>> Thanks for continuing the work on this. I know we have some back and
>> forth here in terms of overall direction, but I also think we're making
>> progress and are continually improving the changes.
>>
On Tue, 2017-03-14 at 16:55 +0100, Stefan Liebler wrote:
> On 02/18/2017 06:05 PM, Torvald Riegel wrote:
> > On Wed, 2017-02-15 at 17:26 +0100, Stefan Liebler wrote:
> >> On 02/13/2017 09:39 PM, Torvald Riegel wrote:
> >>> On Wed, 2017-02-08 at 15:49 +0100, Stefan Liebler wrote:
> >>> Also, I'm not quite sure whether this number is really
> >>> spinlock-specific, and I would like to find a better place for these.
> >>> IMO, they should be in some header that contains default tuning
> >>> parameters for synchronization code, which is provided by each
> >>> architecture that uses the generic spinlock; we'd have no #ifdef for the
> >>> tuning parameters, so we'd catch typos in those headers.
> >>>
> >> See pthread_spin_parameters.h in updated patch 1/2.
> >
> > I suggest that we'll work towards consensus on patch 1/2 first. I
> > believe once that is done, patch 2/2 would likely just remove s390 code.
> >
> I've attached an updated patch which just removes the s390 code.
Okay.
From 78888be8fab0f3e988360c77240d8aa08fcc980c Mon Sep 17 00:00:00 2001
From: Stefan Liebler <stli@linux.vnet.ibm.com>
Date: Tue, 14 Mar 2017 14:10:05 +0100
Subject: [PATCH 2/2] S390: Use generic spinlock code.
This patch removes the s390 specific implementation of spinlock code
and is now using the generic one.
ChangeLog:
* sysdeps/s390/nptl/pthread_spin_init.c: Delete File.
* sysdeps/s390/nptl/pthread_spin_lock.c: Likewise.
* sysdeps/s390/nptl/pthread_spin_trylock.c: Likewise.
* sysdeps/s390/nptl/pthread_spin_unlock.c: Likewise.
---
sysdeps/s390/nptl/pthread_spin_init.c | 19 -------------------
sysdeps/s390/nptl/pthread_spin_lock.c | 32 --------------------------------
sysdeps/s390/nptl/pthread_spin_trylock.c | 32 --------------------------------
sysdeps/s390/nptl/pthread_spin_unlock.c | 32 --------------------------------
4 files changed, 115 deletions(-)
delete mode 100644 sysdeps/s390/nptl/pthread_spin_init.c
delete mode 100644 sysdeps/s390/nptl/pthread_spin_lock.c
delete mode 100644 sysdeps/s390/nptl/pthread_spin_trylock.c
delete mode 100644 sysdeps/s390/nptl/pthread_spin_unlock.c
deleted file mode 100644
@@ -1,19 +0,0 @@
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
- Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-/* Not needed. pthread_spin_init is an alias for pthread_spin_unlock. */
deleted file mode 100644
@@ -1,32 +0,0 @@
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
- Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-#include "pthreadP.h"
-
-int
-pthread_spin_lock (pthread_spinlock_t *lock)
-{
- int oldval;
-
- __asm__ __volatile__ ("0: lhi %0,0\n"
- " cs %0,%2,%1\n"
- " jl 0b"
- : "=&d" (oldval), "=Q" (*lock)
- : "d" (1), "m" (*lock) : "cc" );
- return 0;
-}
deleted file mode 100644
@@ -1,32 +0,0 @@
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
- Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-#include <errno.h>
-#include "pthreadP.h"
-
-int
-pthread_spin_trylock (pthread_spinlock_t *lock)
-{
- int old;
-
- __asm__ __volatile__ ("cs %0,%3,%1"
- : "=d" (old), "=Q" (*lock)
- : "0" (0), "d" (1), "m" (*lock) : "cc" );
-
- return old != 0 ? EBUSY : 0;
-}
deleted file mode 100644
@@ -1,32 +0,0 @@
-/* Copyright (C) 2003-2017 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
- Contributed by Martin Schwidefsky <schwidefsky@de.ibm.com>, 2003.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-/* Ugly hack to avoid the declaration of pthread_spin_init. */
-#define pthread_spin_init pthread_spin_init_XXX
-#include "pthreadP.h"
-#undef pthread_spin_init
-
-int
-pthread_spin_unlock (pthread_spinlock_t *lock)
-{
- __asm__ __volatile__ (" xc %O0(4,%R0),%0\n"
- " bcr 15,0"
- : "=Q" (*lock) : "m" (*lock) : "cc" );
- return 0;
-}
-strong_alias (pthread_spin_unlock, pthread_spin_init)
--
2.7.4