From patchwork Fri Sep 21 00:39:29 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Adhemerval Zanella Netto X-Patchwork-Id: 29497 Received: (qmail 53189 invoked by alias); 21 Sep 2018 00:39:37 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 53000 invoked by uid 89); 21 Sep 2018 00:39:36 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.4 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_PASS, URIBL_DBL_ABUSE_MALW autolearn=ham version=3.3.2 spammy=HX-Received:sk:b1-v6mr X-HELO: mail-pl1-f194.google.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=subject:from:to:cc:references:openpgp:autocrypt:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=/xhiOWYuYMoBrm8wGSqK8Ues6EEPCiSGxIPSWxH7FGw=; b=id00Lx5aFXwLgZ9Me0RHov9bwS0rSXC1QdXycgfIUFWmjPfB2UajLMTClH05RuqQJZ IIrqw4DpUMzHTvaLGcHrbI/S8rHVMkA9Dsjh66bbsrtICrBoJvZXuDibh4pKnlBXsaw/ Myx/p+YydWuWHF20FxrGR2gD5mxqPsYa0aYy8= Return-Path: Subject: Re: [PATCH] powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC From: Adhemerval Zanella To: Tulio Magno Quites Machado Filho , libc-alpha@sourceware.org Cc: Breno Leitao References: <1535403247-27306-1-git-send-email-adhemerval.zanella@linaro.org> <87mut7bfb9.fsf@linux.ibm.com> <87h8ilm5w8.fsf@linux.ibm.com> Openpgp: preference=signencrypt Message-ID: <4e214751-9ceb-28d4-8255-7bb70409e48c@linaro.org> Date: Thu, 20 Sep 2018 17:39:29 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1 MIME-Version: 1.0 In-Reply-To: On 20/09/2018 09:29, Adhemerval Zanella wrote: > > > On 19/09/2018 16:20, Tulio Magno Quites Machado Filho wrote: >> Adhemerval Zanella writes: >> >>> On 27/08/2018 19:44, Tulio Magno Quites Machado Filho wrote: >>>> Adhemerval Zanella writes: >>>> >>>>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c >>>>> index 906882a..508b917 100644 >>>>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c >>>>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c >>>>> @@ -127,6 +127,30 @@ elision_init (int argc __attribute__ ((unused)), >>>>> TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort)); >>>>> #endif >>>>> >>>>> + /* Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls, >>> >>> In fact it should be 4.1, 4.2 is the first release which actually abort transactions >>> on syscalls. >> >> Ack. >> >>>>> + instead it suspend and resume it when leaving the kernel. The >>>>> + side-effects of the syscall will always remain visible, even if the >>>>> + transaction is aborted. This is an issue when transaction is used along >>>>> + with futex syscall, on pthread_cond_wait for instance, where the futex >>>>> + call might succeed but the transaction is rolled back leading the >>>>> + pthread_cond object in an inconsistent state. >>>>> + >>>>> + GLIBC used to prevent it by always aborting a transaction before issuing >>>>> + a syscall. Linux 4.2 also decided to abort active transaction in >>>>> + syscalls which makes the GLIBC workaround superflours. Worse, GLIBC >>>>> + transaction abortion leads to a performance issue on recent kernels >>>>> + where the HTM state is saved/restore lazily. By aborting a transaction >>>>> + on every syscalls, regardless whether a transaction has being initiated >>>>> + before, glibc make the kernel always save/restore HTM state (it can not >>>>> + even lazily disable it after a certain number of syscall iterations). >>>>> + >>>>> + Because of this shortcoming, Lock Elision is just enabled when it has >>>>> + been explictly set (either by tunables of by a configure switch) and if >>>>> + kernel aborts HTM transactions on syscalls (PPC_FEATURE2_HTM_NOSC) */ >>>>> + >>>>> + __pthread_force_elision = __pthread_force_elision && >>>>> + GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC; >>>> >>>> In other words: this patch is requiring Linux 4.3 for TLE. >>>> >>>> Can't ABORT_TRANSACTION check for PPC_FEATURE2_HTM_NOSC in tcbhead_t.hwcap and >>>> continue to support older Linux versions? >>>> >>> >>> I don't have a strong opinion here, but chatting with Breno I had the impression >>> that TLE (and HTM in general) is only used on kernel with this recent fix. This >>> patch is really a hack insertion that penalizes all syscall for an specific >>> feature which may not be present in the kernel. It is unfortunate that kernel >>> developers did not realize at feature inclusion this design for transaction >>> state is not the best one for library usage. >> >> My point was actually related to the fact that INSTALL still mentions Linux 3.2. >> >> I completely agree this solution is better in terms of performance and >> maintenance in the long term. >> Maybe we (me?) could document this problem somewhere (INSTALL?) pointing >> users that want to run the latest glibc and use HTM on Linux to use >> Linux >= 4.2. > > The INSTALL file still lacks to mention that powerpc64le still requires > at least a 3.10 kernel (which is the minimum version where the ABI is > actually supported). And there still some other ABIs that does not > have support for 3.2 (aarch64 for instance). > > At least, I think we start to document powerpc64le requirements: a minimum > Linux version of 3.10 and 4.2 for TLE. > >> >> With the removal of ABORT_TRANSACTION, it's also necessary to remove >> sysdeps/unix/sysv/linux/powerpc/not-errno.h. >> >> With that said, my questions have already been answered and this patch looks >> good to me with the fix in the comment you mentioned and the removal of >> not-errno.h.> > > I will update the patch with this removal. > Below is an updated patch, I decided to advertise the TLE support change on NEWS instead of adding on INSTALL (it have information only for glibc build/installation instead of expected runtime support). --- * NEWS: Add note about new TLE support on powerpc64le. * sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove. * sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to __ununsed1. (TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup. (THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros. * sysdeps/powerpc/powerpc32/sysdep.h, sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL, ABORT_TRANSACTION): Remove macros. * sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise. * sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set __pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set. * sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h, sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove usage. * sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file. Reported-by: Breno Leitão --- diff --git a/NEWS b/NEWS index 53d7bd09b3..0701d2d10b 100644 --- a/NEWS +++ b/NEWS @@ -21,6 +21,15 @@ Major new features: * The reallocarray function is now declared under _DEFAULT_SOURCE, not just for _GNU_SOURCE, to match BSD environments. +* For powercp64le ABI, Transactional Lock Elision is now enabled iff kernel + aborts the transaction prior kernel execution (PPC_FEATURE2_HTM_NOSC on + hwcap2). On older kernels the transaction is suspended, and this caused + some undefined side-effects issues when transactions relies on other + synchronization mechanisms (futex for instance). GLIBC avoided it by + abort transactions manually on each syscall, but it lead to performance + issues on newer kernels where the HTM state is saved and restore lazily + (the state being saved even when the process actually does not use HTM). + Deprecated and removed features, and other changes affecting compatibility: * The glibc.tune tunable namespace has been renamed to glibc.cpu and the diff --git a/sysdeps/powerpc/nptl/tcb-offsets.sym b/sysdeps/powerpc/nptl/tcb-offsets.sym index e5bb2b376d..4c01615ad0 100644 --- a/sysdeps/powerpc/nptl/tcb-offsets.sym +++ b/sysdeps/powerpc/nptl/tcb-offsets.sym @@ -21,7 +21,6 @@ DSO_SLOT2 (offsetof (tcbhead_t, dso_slot2) - TLS_TCB_OFFSET - sizeof (tcbhead_ #ifdef __powerpc64__ TCB_AT_PLATFORM (offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t)) #endif -TM_CAPABLE (offsetof (tcbhead_t, tm_capable) - TLS_TCB_OFFSET - sizeof (tcbhead_t)) #ifndef __powerpc64__ TCB_AT_PLATFORM (offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t)) PADDING (offsetof (tcbhead_t, padding) - TLS_TCB_OFFSET - sizeof(tcbhead_t)) diff --git a/sysdeps/powerpc/nptl/tls.h b/sysdeps/powerpc/nptl/tls.h index f88fed5ecf..8317ca7cfa 100644 --- a/sysdeps/powerpc/nptl/tls.h +++ b/sysdeps/powerpc/nptl/tls.h @@ -67,8 +67,7 @@ typedef struct uint32_t padding; uint32_t at_platform; #endif - /* Indicate if HTM capable (ISA 2.07). */ - uint32_t tm_capable; + uint32_t __unused; /* Reservation for AT_PLATFORM data - powerpc64. */ #ifdef __powerpc64__ uint32_t at_platform; @@ -142,7 +141,6 @@ register void *__thread_register __asm__ ("r13"); # define TLS_INIT_TP(tcbp) \ ({ \ __thread_register = (void *) (tcbp) + TLS_TCB_OFFSET; \ - THREAD_SET_TM_CAPABLE (__tcb_hwcap & PPC_FEATURE2_HAS_HTM ? 1 : 0); \ THREAD_SET_HWCAP (__tcb_hwcap); \ THREAD_SET_AT_PLATFORM (__tcb_platform); \ NULL; \ @@ -151,8 +149,6 @@ register void *__thread_register __asm__ ("r13"); /* Value passed to 'clone' for initialization of the thread register. */ # define TLS_DEFINE_INIT_TP(tp, pd) \ void *tp = (void *) (pd) + TLS_TCB_OFFSET + TLS_PRE_TCB_SIZE; \ - (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].tm_capable) = \ - THREAD_GET_TM_CAPABLE (); \ (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].hwcap) = \ THREAD_GET_HWCAP (); \ (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].at_platform) = \ @@ -210,13 +206,6 @@ register void *__thread_register __asm__ ("r13"); + TLS_PRE_TCB_SIZE))[-1].pointer_guard \ = THREAD_GET_POINTER_GUARD()) -/* tm_capable field in TCB head. */ -# define THREAD_GET_TM_CAPABLE() \ - (((tcbhead_t *) ((char *) __thread_register \ - - TLS_TCB_OFFSET))[-1].tm_capable) -# define THREAD_SET_TM_CAPABLE(value) \ - (THREAD_GET_TM_CAPABLE () = (value)) - /* hwcap field in TCB head. */ # define THREAD_GET_HWCAP() \ (((tcbhead_t *) ((char *) __thread_register \ diff --git a/sysdeps/powerpc/powerpc32/sysdep.h b/sysdeps/powerpc/powerpc32/sysdep.h index 5f1294ead3..93097c5459 100644 --- a/sysdeps/powerpc/powerpc32/sysdep.h +++ b/sysdeps/powerpc/powerpc32/sysdep.h @@ -90,24 +90,7 @@ GOT_LABEL: ; \ cfi_endproc; \ ASM_SIZE_DIRECTIVE(name) -#if !IS_IN(rtld) && !defined(__SPE__) -# define ABORT_TRANSACTION_IMPL \ - cmpwi 2,0; \ - beq 1f; \ - lwz 0,TM_CAPABLE(2); \ - cmpwi 0,0; \ - beq 1f; \ - li 11,_ABORT_SYSCALL; \ - tabort. 11; \ - .align 4; \ -1: -#else -# define ABORT_TRANSACTION_IMPL -#endif -#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL - #define DO_CALL(syscall) \ - ABORT_TRANSACTION \ li 0,syscall; \ sc diff --git a/sysdeps/powerpc/powerpc64/sysdep.h b/sysdeps/powerpc/powerpc64/sysdep.h index 2df1d9b6e6..50e64f9ce2 100644 --- a/sysdeps/powerpc/powerpc64/sysdep.h +++ b/sysdeps/powerpc/powerpc64/sysdep.h @@ -263,24 +263,7 @@ LT_LABELSUFFIX(name,_name_end): ; \ TRACEBACK_MASK(name,mask); \ END_2(name) -#if !IS_IN(rtld) -# define ABORT_TRANSACTION_IMPL \ - cmpdi 13,0; \ - beq 1f; \ - lwz 0,TM_CAPABLE(13); \ - cmpwi 0,0; \ - beq 1f; \ - li 11,_ABORT_SYSCALL; \ - tabort. 11; \ - .p2align 4; \ -1: -#else -# define ABORT_TRANSACTION_IMPL -#endif -#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL - #define DO_CALL(syscall) \ - ABORT_TRANSACTION \ li 0,syscall; \ sc diff --git a/sysdeps/powerpc/sysdep.h b/sysdeps/powerpc/sysdep.h index 8a6d236caa..c8bf25e870 100644 --- a/sysdeps/powerpc/sysdep.h +++ b/sysdeps/powerpc/sysdep.h @@ -21,8 +21,6 @@ */ #define _SYSDEPS_SYSDEP_H 1 #include -#include -#include #define PPC_FEATURE_970 (PPC_FEATURE_POWER4 + PPC_FEATURE_HAS_ALTIVEC) @@ -166,22 +164,4 @@ #define ALIGNARG(log2) log2 #define ASM_SIZE_DIRECTIVE(name) .size name,.-name -#else - -/* Linux kernel powerpc documentation [1] states issuing a syscall inside a - transaction is not recommended and may lead to undefined behavior. It - also states syscalls do not abort transactions. To avoid such traps, - we abort transaction just before syscalls. - - [1] Documentation/powerpc/transactional_memory.txt [Syscalls] */ -#if !IS_IN(rtld) && !defined(__SPE__) -# define ABORT_TRANSACTION \ - ({ \ - if (THREAD_GET_TM_CAPABLE ()) \ - __libc_tabort (_ABORT_SYSCALL); \ - }) -#else -# define ABORT_TRANSACTION -#endif - #endif /* __ASSEMBLER__ */ diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c index 906882a65e..60b19482bb 100644 --- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c +++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c @@ -70,7 +70,7 @@ do_set_elision_enable (int32_t elision_enable) & PPC_FEATURE2_HAS_HTM) ? 1 : 0; } -/* The pthread->elision_enable tunable is 0 or 1 indicating that elision +/* The pthread->elision_enable tunable is 0 or 1 indicating that elisionGLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC; should be disabled or enabled respectively. The feature will only be used if it's supported by the hardware. */ @@ -127,6 +127,26 @@ elision_init (int argc __attribute__ ((unused)), TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort)); #endif + /* Linux from 3.9 through 4.2 do not abort HTM transaction on syscalls, + instead it suspend the transaction and resumes it when returning to + usercode. The side-effects of the syscall will always remain visible, + even if the transaction is aborted. This is an issue when transaction + is used along with futex syscall, on pthread_cond_wait for instance, + where futex might succeed but the transaction is rolled back leading + the pthread_cond object in an inconsistent state. + + GLIBC used to prevent it by always aborting a transaction before issuing + a syscall. Linux 4.2 also decided to abort active transaction in + syscalls which makes the GLIBC workaround superflouts. Worse, GLIBC + transaction abortion leads to a performance issue on recent kernels. + + So Lock Elision is just enabled when it has been explict set (either + by tunables of by a configure switch) and if kernel aborts HTM + transactions on syscalls (PPC_FEATURE2_HTM_NOSC) */ + + __pthread_force_elision = __pthread_force_elision && + GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC; + if (!__pthread_force_elision) __elision_aconf.try_tbegin = 0; /* Disable elision on rwlocks. */ } diff --git a/sysdeps/unix/sysv/linux/powerpc/not-errno.h b/sysdeps/unix/sysv/linux/powerpc/not-errno.h deleted file mode 100644 index 27da21bdf1..0000000000 --- a/sysdeps/unix/sysv/linux/powerpc/not-errno.h +++ /dev/null @@ -1,30 +0,0 @@ -/* Syscall wrapper that do not set errno. Linux powerpc version. - Copyright (C) 2018 Free Software Foundation, Inc. - This file is part of the GNU C Library. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - . */ - -/* __access_noerrno is used during process initialization in elf/dl-tunables.c - before the TCB is initialized, prohibiting the usage of - ABORT_TRANSACTION. */ -#undef ABORT_TRANSACTION -#define ABORT_TRANSACTION - -#include "sysdeps/unix/sysv/linux/not-errno.h" - -/* Recover ABORT_TRANSACTION's previous value, in order to not affect - other syscalls. */ -#undef ABORT_TRANSACTION -#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h b/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h index f7277d59e1..ec5c5250f8 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h @@ -109,7 +109,6 @@ register long int r11 __asm__ ("r11"); \ register long int r12 __asm__ ("r12"); \ LOADARGS_##nr(name, args); \ - ABORT_TRANSACTION; \ __asm__ __volatile__ \ ("sc \n\t" \ "mfcr %0" \ diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h b/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h index 0956cf04a7..1f17f7bd5f 100644 --- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h +++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h @@ -131,7 +131,6 @@ register long int r7 __asm__ ("r7"); \ register long int r8 __asm__ ("r8"); \ LOADARGS_##nr (name, ##args); \ - ABORT_TRANSACTION; \ __asm__ __volatile__ \ ("sc\n\t" \ "mfcr %0\n\t" \ diff --git a/sysdeps/unix/sysv/linux/powerpc/syscall.S b/sysdeps/unix/sysv/linux/powerpc/syscall.S index 2da91721be..bbab613aca 100644 --- a/sysdeps/unix/sysv/linux/powerpc/syscall.S +++ b/sysdeps/unix/sysv/linux/powerpc/syscall.S @@ -18,7 +18,6 @@ #include ENTRY (syscall) - ABORT_TRANSACTION mr r0,r3 mr r3,r4 mr r4,r5