From patchwork Wed Feb 15 16:26:09 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Liebler X-Patchwork-Id: 19265 Received: (qmail 119205 invoked by alias); 15 Feb 2017 16:26:27 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 119139 invoked by uid 89); 15 Feb 2017 16:26:26 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=0.9 required=5.0 tests=AWL, BAYES_50, KAM_LAZY_DOMAIN_SECURITY, RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 spammy=bye, acquired, 1, 19, ebusy X-HELO: mx0a-001b2d01.pphosted.com From: Stefan Liebler Subject: Re: [PATCH 2/2] S390: Use generic spinlock code. To: libc-alpha@sourceware.org References: <1481905917-15654-1-git-send-email-stli@linux.vnet.ibm.com> <1481905917-15654-2-git-send-email-stli@linux.vnet.ibm.com> <1487018364.16322.88.camel@redhat.com> Date: Wed, 15 Feb 2017 17:26:09 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0 MIME-Version: 1.0 In-Reply-To: <1487018364.16322.88.camel@redhat.com> X-TM-AS-GCONF: 00 X-Content-Scanned: Fidelis XPS MAILER x-cbid: 17021516-0020-0000-0000-00000274B89D X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17021516-0021-0000-0000-00001F640F62 Message-Id: X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-02-15_08:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=3 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1612050000 definitions=main-1702150156 On 02/13/2017 09:39 PM, Torvald Riegel wrote: > On Wed, 2017-02-08 at 15:49 +0100, Stefan Liebler wrote: >> This is an updated version of the patch, which adjusts the s390 specific >> atomic-macros in the same way as in include/atomic.h. >> Thus passing a volatile int pointer is fine, too. > > The general direction of this is okay. > Some of my comments for patch 1/2 apply here as well (eg, volatile vs. > atomics). > See answer in patch 1/2. > What I don't like is the choice of 1000 for > SPIN_LOCK_READS_BETWEEN_CMPXCHG. Have you run benchmarks to come up > with this value, or is it a guess? Why isn't it documented how you end > up with this number? > We can keep these with a choice such as this, but then we need to have a > FIXME comment in the code, explaining that this is just an arbitrary > choice. > > I would guess that just spinning forever is sufficient, and that we > don't need to throw in a CAS every now and then; using randomized > exponential back-off might be more important. This is something that we > would be in a better position to answer if you'd provide a > microbenchmark for this choice too. > At the end of 2016, I've posted a draft of a microbenchmark for rwlocks. > Maybe you can use this as a start and run a few experiments? > I've run own benchmarks in the same manner as your mentioned microbenchmark for rwlocks below. You are right, I can't recognize a real difference between #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 1000 and #define SPIN_LOCK_READS_BETWEEN_CMPXCHG -1 As it does not hurt, I prefer to use a CAS every 1000 plain reads. A CAS is not necessary on current CPUs but from architecture perspective, it is more correct if there is such a serialization instruction. There is a difference between #define SPIN_LOCK_READS_BETWEEN_CMPXCHG 0 and one of the others. The same applies to #define SPIN_TRYLOCK_LOAD_AND_TEST_BEFORE_XCHG 1 It does not hurt if the lock is free, but there is a difference if the lock is already acquired and trylock is called often. I've saw your microbenchmark-post and added some notes. I added a FIXME comment to re-evaluate the choice once we have the appropriate microbenchmarks. > Also, I'm not quite sure whether this number is really > spinlock-specific, and I would like to find a better place for these. > IMO, they should be in some header that contains default tuning > parameters for synchronization code, which is provided by each > architecture that uses the generic spinlock; we'd have no #ifdef for the > tuning parameters, so we'd catch typos in those headers. > See pthread_spin_parameters.h in updated patch 1/2. I've attached an updated patch due to the changes in patch 1/2 and added comments to the macro definitions. Bye. Stefan ChangeLog: * sysdeps/s390/atomic-machine.h: (__arch_compare_and_exchange_val_32_acq): Cast type to omit volatile qualifier. (__arch_compare_and_exchange_val_64_acq): Likewise. (atomic_exchange_acq): Likewise. * sysdeps/s390/nptl/pthread_spin_init.c: Delete File. * sysdeps/s390/nptl/pthread_spin_lock.c: Likewise. * sysdeps/s390/nptl/pthread_spin_trylock.c: Likewise. * sysdeps/s390/nptl/pthread_spin_unlock.c: Likewise. * sysdeps/s390/nptl/pthread_spin_parameters.h: New File. commit 6dc9da4a9c99524d5d6deb841811f1a8cc37c5b2 Author: Stefan Liebler Date: Wed Feb 15 17:22:38 2017 +0100 S390: Use generic spinlock code. This patch removes the s390 specific implementation of spinlock code and is now using the generic one. For pthread_spin_trylock an explicit load and test before executing compare and swap instruction is done as it is an interlock update even if the lock is already acquired. The macros in s390 specific atomic-machine.h are aligned in order to omit the storing to and loading from stack. ChangeLog: * sysdeps/s390/atomic-machine.h: (__arch_compare_and_exchange_val_32_acq): Cast type to omit volatile qualifier. (__arch_compare_and_exchange_val_64_acq): Likewise. (atomic_exchange_acq): Likewise. * sysdeps/s390/nptl/pthread_spin_init.c: Delete File. * sysdeps/s390/nptl/pthread_spin_lock.c: Likewise. * sysdeps/s390/nptl/pthread_spin_trylock.c: Likewise. * sysdeps/s390/nptl/pthread_spin_unlock.c: Likewise. * sysdeps/s390/nptl/pthread_spin_parameters.h: New File. diff --git a/sysdeps/s390/atomic-machine.h b/sysdeps/s390/atomic-machine.h index 211d3d6..d98c836 100644 --- a/sysdeps/s390/atomic-machine.h +++ b/sysdeps/s390/atomic-machine.h @@ -54,7 +54,7 @@ typedef uintmax_t uatomic_max_t; #define __arch_compare_and_exchange_val_32_acq(mem, newval, oldval) \ ({ __typeof (mem) __archmem = (mem); \ - __typeof (*mem) __archold = (oldval); \ + __typeof ((__typeof (*(mem))) *(mem)) __archold = (oldval); \ __asm__ __volatile__ ("cs %0,%2,%1" \ : "+d" (__archold), "=Q" (*__archmem) \ : "d" (newval), "m" (*__archmem) : "cc", "memory" ); \ @@ -64,7 +64,7 @@ typedef uintmax_t uatomic_max_t; # define __HAVE_64B_ATOMICS 1 # define __arch_compare_and_exchange_val_64_acq(mem, newval, oldval) \ ({ __typeof (mem) __archmem = (mem); \ - __typeof (*mem) __archold = (oldval); \ + __typeof ((__typeof (*(mem))) *(mem)) __archold = (oldval); \ __asm__ __volatile__ ("csg %0,%2,%1" \ : "+d" (__archold), "=Q" (*__archmem) \ : "d" ((long) (newval)), "m" (*__archmem) : "cc", "memory" ); \ @@ -86,8 +86,8 @@ typedef uintmax_t uatomic_max_t; #ifdef __s390x__ # define atomic_exchange_acq(mem, newvalue) \ ({ __typeof (mem) __atg5_memp = (mem); \ - __typeof (*(mem)) __atg5_oldval = *__atg5_memp; \ - __typeof (*(mem)) __atg5_value = (newvalue); \ + __typeof ((__typeof (*(mem))) *(mem)) __atg5_oldval = *__atg5_memp; \ + __typeof ((__typeof (*(mem))) *(mem)) __atg5_value = (newvalue); \ if (sizeof (*mem) == 4) \ __asm__ __volatile__ ("0: cs %0,%2,%1\n" \ " jl 0b" \ @@ -106,8 +106,8 @@ typedef uintmax_t uatomic_max_t; #else # define atomic_exchange_acq(mem, newvalue) \ ({ __typeof (mem) __atg5_memp = (mem); \ - __typeof (*(mem)) __atg5_oldval = *__atg5_memp; \ - __typeof (*(mem)) __atg5_value = (newvalue); \ + __typeof ((__typeof (*(mem))) *(mem)) __atg5_oldval = *__atg5_memp; \ + __typeof ((__typeof (*(mem))) *(mem)) __atg5_value = (newvalue); \ if (sizeof (*mem) == 4) \ __asm__ __volatile__ ("0: cs %0,%2,%1\n" \ " jl 0b" \ diff --git a/sysdeps/s390/nptl/pthread_spin_init.c b/sysdeps/s390/nptl/pthread_spin_init.c deleted file mode 100644 index d826871..0000000 --- a/sysdeps/s390/nptl/pthread_spin_init.c +++ /dev/null @@ -1,19 +0,0 @@ -/* Copyright (C) 2003-2017 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Martin Schwidefsky , 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -/* Not needed. pthread_spin_init is an alias for pthread_spin_unlock. */ diff --git a/sysdeps/s390/nptl/pthread_spin_lock.c b/sysdeps/s390/nptl/pthread_spin_lock.c deleted file mode 100644 index 7349940..0000000 --- a/sysdeps/s390/nptl/pthread_spin_lock.c +++ /dev/null @@ -1,32 +0,0 @@ -/* Copyright (C) 2003-2017 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Martin Schwidefsky , 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include "pthreadP.h" - -int -pthread_spin_lock (pthread_spinlock_t *lock) -{ - int oldval; - - __asm__ __volatile__ ("0: lhi %0,0\n" - " cs %0,%2,%1\n" - " jl 0b" - : "=&d" (oldval), "=Q" (*lock) - : "d" (1), "m" (*lock) : "cc" ); - return 0; -} diff --git a/sysdeps/s390/nptl/pthread_spin_parameters.h b/sysdeps/s390/nptl/pthread_spin_parameters.h new file mode 100644 index 0000000..b2ff166 --- /dev/null +++ b/sysdeps/s390/nptl/pthread_spin_parameters.h @@ -0,0 +1,36 @@ +/* Parameters for generic pthread_spinlock_t implementation. s390 version. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* FIXME: Re-evaluate the defines below once we have an appropriate + microbenchmark. + See e.g. "RFC: shared-memory synchronization benchmarking in glibc" + (https://sourceware.org/ml/libc-alpha/2017-01/msg00168.html) */ + +/* An own benchmark shows no difference between defining to '1000' + and '-1', but a positive result compared to '0' as a CAS instruction + would be used in a loop, which should be avoided on s390 due to cache-line + ping-pong. */ +#define SPIN_LOCK_READS_BETWEEN_CMPXCHG 1000 + +/* An own benchmark showed a positive result if defined to '1' + and trylock is called often by different CPUs. + If the lock is not acquired and no other CPU is trying to + acquire the lock, then the additional load does not hurt. */ +#define SPIN_TRYLOCK_LOAD_AND_TEST_BEFORE_XCHG 1 + +#include_next diff --git a/sysdeps/s390/nptl/pthread_spin_trylock.c b/sysdeps/s390/nptl/pthread_spin_trylock.c deleted file mode 100644 index 0e848da..0000000 --- a/sysdeps/s390/nptl/pthread_spin_trylock.c +++ /dev/null @@ -1,32 +0,0 @@ -/* Copyright (C) 2003-2017 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Martin Schwidefsky , 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -#include -#include "pthreadP.h" - -int -pthread_spin_trylock (pthread_spinlock_t *lock) -{ - int old; - - __asm__ __volatile__ ("cs %0,%3,%1" - : "=d" (old), "=Q" (*lock) - : "0" (0), "d" (1), "m" (*lock) : "cc" ); - - return old != 0 ? EBUSY : 0; -} diff --git a/sysdeps/s390/nptl/pthread_spin_unlock.c b/sysdeps/s390/nptl/pthread_spin_unlock.c deleted file mode 100644 index 54e7378..0000000 --- a/sysdeps/s390/nptl/pthread_spin_unlock.c +++ /dev/null @@ -1,32 +0,0 @@ -/* Copyright (C) 2003-2017 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Martin Schwidefsky , 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -/* Ugly hack to avoid the declaration of pthread_spin_init. */ -#define pthread_spin_init pthread_spin_init_XXX -#include "pthreadP.h" -#undef pthread_spin_init - -int -pthread_spin_unlock (pthread_spinlock_t *lock) -{ - __asm__ __volatile__ (" xc %O0(4,%R0),%0\n" - " bcr 15,0" - : "=Q" (*lock) : "m" (*lock) : "cc" ); - return 0; -} -strong_alias (pthread_spin_unlock, pthread_spin_init)