powerpc: Only enable TLE with PPC_FEATURE2_HTM_NOSC
Commit Message
On 20/09/2018 09:29, Adhemerval Zanella wrote:
>
>
> On 19/09/2018 16:20, Tulio Magno Quites Machado Filho wrote:
>> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
>>
>>> On 27/08/2018 19:44, Tulio Magno Quites Machado Filho wrote:
>>>> Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:
>>>>
>>>>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>>>>> index 906882a..508b917 100644
>>>>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>>>>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>>>>> @@ -127,6 +127,30 @@ elision_init (int argc __attribute__ ((unused)),
>>>>> TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort));
>>>>> #endif
>>>>>
>>>>> + /* Linux from 3.9 through 4.2 does not abort HTM transaction on syscalls,
>>>
>>> In fact it should be 4.1, 4.2 is the first release which actually abort transactions
>>> on syscalls.
>>
>> Ack.
>>
>>>>> + instead it suspend and resume it when leaving the kernel. The
>>>>> + side-effects of the syscall will always remain visible, even if the
>>>>> + transaction is aborted. This is an issue when transaction is used along
>>>>> + with futex syscall, on pthread_cond_wait for instance, where the futex
>>>>> + call might succeed but the transaction is rolled back leading the
>>>>> + pthread_cond object in an inconsistent state.
>>>>> +
>>>>> + GLIBC used to prevent it by always aborting a transaction before issuing
>>>>> + a syscall. Linux 4.2 also decided to abort active transaction in
>>>>> + syscalls which makes the GLIBC workaround superflours. Worse, GLIBC
>>>>> + transaction abortion leads to a performance issue on recent kernels
>>>>> + where the HTM state is saved/restore lazily. By aborting a transaction
>>>>> + on every syscalls, regardless whether a transaction has being initiated
>>>>> + before, glibc make the kernel always save/restore HTM state (it can not
>>>>> + even lazily disable it after a certain number of syscall iterations).
>>>>> +
>>>>> + Because of this shortcoming, Lock Elision is just enabled when it has
>>>>> + been explictly set (either by tunables of by a configure switch) and if
>>>>> + kernel aborts HTM transactions on syscalls (PPC_FEATURE2_HTM_NOSC) */
>>>>> +
>>>>> + __pthread_force_elision = __pthread_force_elision &&
>>>>> + GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
>>>>
>>>> In other words: this patch is requiring Linux 4.3 for TLE.
>>>>
>>>> Can't ABORT_TRANSACTION check for PPC_FEATURE2_HTM_NOSC in tcbhead_t.hwcap and
>>>> continue to support older Linux versions?
>>>>
>>>
>>> I don't have a strong opinion here, but chatting with Breno I had the impression
>>> that TLE (and HTM in general) is only used on kernel with this recent fix. This
>>> patch is really a hack insertion that penalizes all syscall for an specific
>>> feature which may not be present in the kernel. It is unfortunate that kernel
>>> developers did not realize at feature inclusion this design for transaction
>>> state is not the best one for library usage.
>>
>> My point was actually related to the fact that INSTALL still mentions Linux 3.2.
>>
>> I completely agree this solution is better in terms of performance and
>> maintenance in the long term.
>> Maybe we (me?) could document this problem somewhere (INSTALL?) pointing
>> users that want to run the latest glibc and use HTM on Linux to use
>> Linux >= 4.2.
>
> The INSTALL file still lacks to mention that powerpc64le still requires
> at least a 3.10 kernel (which is the minimum version where the ABI is
> actually supported). And there still some other ABIs that does not
> have support for 3.2 (aarch64 for instance).
>
> At least, I think we start to document powerpc64le requirements: a minimum
> Linux version of 3.10 and 4.2 for TLE.
>
>>
>> With the removal of ABORT_TRANSACTION, it's also necessary to remove
>> sysdeps/unix/sysv/linux/powerpc/not-errno.h.
>>
>> With that said, my questions have already been answered and this patch looks
>> good to me with the fix in the comment you mentioned and the removal of
>> not-errno.h.>
>
> I will update the patch with this removal.
>
Below is an updated patch, I decided to advertise the TLE support change
on NEWS instead of adding on INSTALL (it have information only for glibc
build/installation instead of expected runtime support).
---
* NEWS: Add note about new TLE support on powerpc64le.
* sysdeps/powerpc/nptl/tcb-offsets.sym (TM_CAPABLE): Remove.
* sysdeps/powerpc/nptl/tls.h (tcbhead_t): Rename tm_capable to
__ununsed1.
(TLS_INIT_TP, TLS_DEFINE_INIT_TP): Remove tm_capable setup.
(THREAD_GET_TM_CAPABLE, THREAD_SET_TM_CAPABLE): Remove macros.
* sysdeps/powerpc/powerpc32/sysdep.h,
sysdeps/powerpc/powerpc64/sysdep.h (ABORT_TRANSACTION_IMPL,
ABORT_TRANSACTION): Remove macros.
* sysdeps/powerpc/sysdep.h (ABORT_TRANSACTION): Likewise.
* sysdeps/unix/sysv/linux/powerpc/elision-conf.c (elision_init): Set
__pthread_force_elision iff PPC_FEATURE2_HTM_NOSC is set.
* sysdeps/unix/sysv/linux/powerpc/powerpc32/sysdep.h,
sysdeps/unix/sysv/linux/powerpc/powerpc64/sysdep.h
sysdeps/unix/sysv/linux/powerpc/syscall.S (ABORT_TRANSACTION): Remove
usage.
* sysdeps/unix/sysv/linux/powerpc/not-errno.h: Remove file.
Reported-by: Breno Leitão <leitao@debian.org>
---
Comments
* Adhemerval Zanella:
> +* For powercp64le ABI, Transactional Lock Elision is now enabled iff kernel
> + aborts the transaction prior kernel execution (PPC_FEATURE2_HTM_NOSC on
> + hwcap2). On older kernels the transaction is suspended, and this caused
> + some undefined side-effects issues when transactions relies on other
> + synchronization mechanisms (futex for instance). GLIBC avoided it by
> + abort transactions manually on each syscall, but it lead to performance
> + issues on newer kernels where the HTM state is saved and restore lazily
> + (the state being saved even when the process actually does not use HTM).
I'd prefer “enabled iff the kernel indicates that it will abort the
transaction prior to entering the kernel (PPC_FEATURE2_HTM_NOSC on
hwcap2)” and “when transactions relie*d*”, “by aborting transactions
manually”, “but it led”, “saved and restored”.
And “GLIBC” should perhaps be spelled “glibc”.
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
> index 906882a65e..60b19482bb 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
> @@ -70,7 +70,7 @@ do_set_elision_enable (int32_t elision_enable)
> & PPC_FEATURE2_HAS_HTM) ? 1 : 0;
> }
>
> -/* The pthread->elision_enable tunable is 0 or 1 indicating that elision
> +/* The pthread->elision_enable tunable is 0 or 1 indicating that elisionGLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
> should be disabled or enabled respectively. The feature will only be used
> if it's supported by the hardware. */
Unrelated accidental change.
>
> @@ -127,6 +127,26 @@ elision_init (int argc __attribute__ ((unused)),
> TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort));
> #endif
>
> + /* Linux from 3.9 through 4.2 do not abort HTM transaction on syscalls,
> + instead it suspend the transaction and resumes it when returning to
“instead it suspends”
> + usercode. The side-effects of the syscall will always remain visible,
> + even if the transaction is aborted. This is an issue when transaction
“when a transaction”
> + is used along with futex syscall, on pthread_cond_wait for instance,
> + where futex might succeed but the transaction is rolled back leading
> + the pthread_cond object in an inconsistent state.
“pthread_cond_t” (or “condition variable”).
> + GLIBC used to prevent it by always aborting a transaction before issuing
> + a syscall. Linux 4.2 also decided to abort active transaction in
> + syscalls which makes the GLIBC workaround superflouts. Worse, GLIBC
> + transaction abortion leads to a performance issue on recent kernels.
See above about GLIBC vs glibc. And “superfluous”, “glibc transaction
abort*s* lead**”.
The actual change looks good to me, based on the explanation in the
patch.
Thanks,
Florian
On Thu, 20 Sep 2018, Adhemerval Zanella wrote:
> + __pthread_force_elision = __pthread_force_elision &&
> + GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
Break lines before an operator, not after, and use parentheses around the
expression in that case (the operator should be in the column immediately
after the open parenthesis).
On 20/09/2018 23:13, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> +* For powercp64le ABI, Transactional Lock Elision is now enabled iff kernel
>> + aborts the transaction prior kernel execution (PPC_FEATURE2_HTM_NOSC on
>> + hwcap2). On older kernels the transaction is suspended, and this caused
>> + some undefined side-effects issues when transactions relies on other
>> + synchronization mechanisms (futex for instance). GLIBC avoided it by
>> + abort transactions manually on each syscall, but it lead to performance
>> + issues on newer kernels where the HTM state is saved and restore lazily
>> + (the state being saved even when the process actually does not use HTM).
>
> I'd prefer “enabled iff the kernel indicates that it will abort the
> transaction prior to entering the kernel (PPC_FEATURE2_HTM_NOSC on
> hwcap2)” and “when transactions relie*d*”, “by aborting transactions
> manually”, “but it led”, “saved and restored”.
>
> And “GLIBC” should perhaps be spelled “glibc”.
>
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>> index 906882a65e..60b19482bb 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-conf.c
>> @@ -70,7 +70,7 @@ do_set_elision_enable (int32_t elision_enable)
>> & PPC_FEATURE2_HAS_HTM) ? 1 : 0;
>> }
>>
>> -/* The pthread->elision_enable tunable is 0 or 1 indicating that elision
>> +/* The pthread->elision_enable tunable is 0 or 1 indicating that elisionGLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
>> should be disabled or enabled respectively. The feature will only be used
>> if it's supported by the hardware. */
>
> Unrelated accidental change.
>
>>
>> @@ -127,6 +127,26 @@ elision_init (int argc __attribute__ ((unused)),
>> TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort));
>> #endif
>>
>> + /* Linux from 3.9 through 4.2 do not abort HTM transaction on syscalls,
>> + instead it suspend the transaction and resumes it when returning to
>
> “instead it suspends”
>
>> + usercode. The side-effects of the syscall will always remain visible,
>> + even if the transaction is aborted. This is an issue when transaction
>
> “when a transaction”
>
>> + is used along with futex syscall, on pthread_cond_wait for instance,
>> + where futex might succeed but the transaction is rolled back leading
>> + the pthread_cond object in an inconsistent state.
>
> “pthread_cond_t” (or “condition variable”).
>
>> + GLIBC used to prevent it by always aborting a transaction before issuing
>> + a syscall. Linux 4.2 also decided to abort active transaction in
>> + syscalls which makes the GLIBC workaround superflouts. Worse, GLIBC
>> + transaction abortion leads to a performance issue on recent kernels.
>
> See above about GLIBC vs glibc. And “superfluous”, “glibc transaction
> abort*s* lead**”.
>
> The actual change looks good to me, based on the explanation in the
> patch.
Based on previous Tulio's ack, I fixed the points you and Joseph raised
and push upstream.
@@ -21,6 +21,15 @@ Major new features:
* The reallocarray function is now declared under _DEFAULT_SOURCE, not just
for _GNU_SOURCE, to match BSD environments.
+* For powercp64le ABI, Transactional Lock Elision is now enabled iff kernel
+ aborts the transaction prior kernel execution (PPC_FEATURE2_HTM_NOSC on
+ hwcap2). On older kernels the transaction is suspended, and this caused
+ some undefined side-effects issues when transactions relies on other
+ synchronization mechanisms (futex for instance). GLIBC avoided it by
+ abort transactions manually on each syscall, but it lead to performance
+ issues on newer kernels where the HTM state is saved and restore lazily
+ (the state being saved even when the process actually does not use HTM).
+
Deprecated and removed features, and other changes affecting compatibility:
* The glibc.tune tunable namespace has been renamed to glibc.cpu and the
@@ -21,7 +21,6 @@ DSO_SLOT2 (offsetof (tcbhead_t, dso_slot2) - TLS_TCB_OFFSET - sizeof (tcbhead_
#ifdef __powerpc64__
TCB_AT_PLATFORM (offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
#endif
-TM_CAPABLE (offsetof (tcbhead_t, tm_capable) - TLS_TCB_OFFSET - sizeof (tcbhead_t))
#ifndef __powerpc64__
TCB_AT_PLATFORM (offsetof (tcbhead_t, at_platform) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
PADDING (offsetof (tcbhead_t, padding) - TLS_TCB_OFFSET - sizeof(tcbhead_t))
@@ -67,8 +67,7 @@ typedef struct
uint32_t padding;
uint32_t at_platform;
#endif
- /* Indicate if HTM capable (ISA 2.07). */
- uint32_t tm_capable;
+ uint32_t __unused;
/* Reservation for AT_PLATFORM data - powerpc64. */
#ifdef __powerpc64__
uint32_t at_platform;
@@ -142,7 +141,6 @@ register void *__thread_register __asm__ ("r13");
# define TLS_INIT_TP(tcbp) \
({ \
__thread_register = (void *) (tcbp) + TLS_TCB_OFFSET; \
- THREAD_SET_TM_CAPABLE (__tcb_hwcap & PPC_FEATURE2_HAS_HTM ? 1 : 0); \
THREAD_SET_HWCAP (__tcb_hwcap); \
THREAD_SET_AT_PLATFORM (__tcb_platform); \
NULL; \
@@ -151,8 +149,6 @@ register void *__thread_register __asm__ ("r13");
/* Value passed to 'clone' for initialization of the thread register. */
# define TLS_DEFINE_INIT_TP(tp, pd) \
void *tp = (void *) (pd) + TLS_TCB_OFFSET + TLS_PRE_TCB_SIZE; \
- (((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].tm_capable) = \
- THREAD_GET_TM_CAPABLE (); \
(((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].hwcap) = \
THREAD_GET_HWCAP (); \
(((tcbhead_t *) ((char *) tp - TLS_TCB_OFFSET))[-1].at_platform) = \
@@ -210,13 +206,6 @@ register void *__thread_register __asm__ ("r13");
+ TLS_PRE_TCB_SIZE))[-1].pointer_guard \
= THREAD_GET_POINTER_GUARD())
-/* tm_capable field in TCB head. */
-# define THREAD_GET_TM_CAPABLE() \
- (((tcbhead_t *) ((char *) __thread_register \
- - TLS_TCB_OFFSET))[-1].tm_capable)
-# define THREAD_SET_TM_CAPABLE(value) \
- (THREAD_GET_TM_CAPABLE () = (value))
-
/* hwcap field in TCB head. */
# define THREAD_GET_HWCAP() \
(((tcbhead_t *) ((char *) __thread_register \
@@ -90,24 +90,7 @@ GOT_LABEL: ; \
cfi_endproc; \
ASM_SIZE_DIRECTIVE(name)
-#if !IS_IN(rtld) && !defined(__SPE__)
-# define ABORT_TRANSACTION_IMPL \
- cmpwi 2,0; \
- beq 1f; \
- lwz 0,TM_CAPABLE(2); \
- cmpwi 0,0; \
- beq 1f; \
- li 11,_ABORT_SYSCALL; \
- tabort. 11; \
- .align 4; \
-1:
-#else
-# define ABORT_TRANSACTION_IMPL
-#endif
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
-
#define DO_CALL(syscall) \
- ABORT_TRANSACTION \
li 0,syscall; \
sc
@@ -263,24 +263,7 @@ LT_LABELSUFFIX(name,_name_end): ; \
TRACEBACK_MASK(name,mask); \
END_2(name)
-#if !IS_IN(rtld)
-# define ABORT_TRANSACTION_IMPL \
- cmpdi 13,0; \
- beq 1f; \
- lwz 0,TM_CAPABLE(13); \
- cmpwi 0,0; \
- beq 1f; \
- li 11,_ABORT_SYSCALL; \
- tabort. 11; \
- .p2align 4; \
-1:
-#else
-# define ABORT_TRANSACTION_IMPL
-#endif
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
-
#define DO_CALL(syscall) \
- ABORT_TRANSACTION \
li 0,syscall; \
sc
@@ -21,8 +21,6 @@
*/
#define _SYSDEPS_SYSDEP_H 1
#include <bits/hwcap.h>
-#include <tls.h>
-#include <htm.h>
#define PPC_FEATURE_970 (PPC_FEATURE_POWER4 + PPC_FEATURE_HAS_ALTIVEC)
@@ -166,22 +164,4 @@
#define ALIGNARG(log2) log2
#define ASM_SIZE_DIRECTIVE(name) .size name,.-name
-#else
-
-/* Linux kernel powerpc documentation [1] states issuing a syscall inside a
- transaction is not recommended and may lead to undefined behavior. It
- also states syscalls do not abort transactions. To avoid such traps,
- we abort transaction just before syscalls.
-
- [1] Documentation/powerpc/transactional_memory.txt [Syscalls] */
-#if !IS_IN(rtld) && !defined(__SPE__)
-# define ABORT_TRANSACTION \
- ({ \
- if (THREAD_GET_TM_CAPABLE ()) \
- __libc_tabort (_ABORT_SYSCALL); \
- })
-#else
-# define ABORT_TRANSACTION
-#endif
-
#endif /* __ASSEMBLER__ */
@@ -70,7 +70,7 @@ do_set_elision_enable (int32_t elision_enable)
& PPC_FEATURE2_HAS_HTM) ? 1 : 0;
}
-/* The pthread->elision_enable tunable is 0 or 1 indicating that elision
+/* The pthread->elision_enable tunable is 0 or 1 indicating that elisionGLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
should be disabled or enabled respectively. The feature will only be used
if it's supported by the hardware. */
@@ -127,6 +127,26 @@ elision_init (int argc __attribute__ ((unused)),
TUNABLE_CALLBACK (set_elision_skip_trylock_internal_abort));
#endif
+ /* Linux from 3.9 through 4.2 do not abort HTM transaction on syscalls,
+ instead it suspend the transaction and resumes it when returning to
+ usercode. The side-effects of the syscall will always remain visible,
+ even if the transaction is aborted. This is an issue when transaction
+ is used along with futex syscall, on pthread_cond_wait for instance,
+ where futex might succeed but the transaction is rolled back leading
+ the pthread_cond object in an inconsistent state.
+
+ GLIBC used to prevent it by always aborting a transaction before issuing
+ a syscall. Linux 4.2 also decided to abort active transaction in
+ syscalls which makes the GLIBC workaround superflouts. Worse, GLIBC
+ transaction abortion leads to a performance issue on recent kernels.
+
+ So Lock Elision is just enabled when it has been explict set (either
+ by tunables of by a configure switch) and if kernel aborts HTM
+ transactions on syscalls (PPC_FEATURE2_HTM_NOSC) */
+
+ __pthread_force_elision = __pthread_force_elision &&
+ GLRO (dl_hwcap2) & PPC_FEATURE2_HTM_NOSC;
+
if (!__pthread_force_elision)
__elision_aconf.try_tbegin = 0; /* Disable elision on rwlocks. */
}
deleted file mode 100644
@@ -1,30 +0,0 @@
-/* Syscall wrapper that do not set errno. Linux powerpc version.
- Copyright (C) 2018 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <http://www.gnu.org/licenses/>. */
-
-/* __access_noerrno is used during process initialization in elf/dl-tunables.c
- before the TCB is initialized, prohibiting the usage of
- ABORT_TRANSACTION. */
-#undef ABORT_TRANSACTION
-#define ABORT_TRANSACTION
-
-#include "sysdeps/unix/sysv/linux/not-errno.h"
-
-/* Recover ABORT_TRANSACTION's previous value, in order to not affect
- other syscalls. */
-#undef ABORT_TRANSACTION
-#define ABORT_TRANSACTION ABORT_TRANSACTION_IMPL
@@ -109,7 +109,6 @@
register long int r11 __asm__ ("r11"); \
register long int r12 __asm__ ("r12"); \
LOADARGS_##nr(name, args); \
- ABORT_TRANSACTION; \
__asm__ __volatile__ \
("sc \n\t" \
"mfcr %0" \
@@ -131,7 +131,6 @@
register long int r7 __asm__ ("r7"); \
register long int r8 __asm__ ("r8"); \
LOADARGS_##nr (name, ##args); \
- ABORT_TRANSACTION; \
__asm__ __volatile__ \
("sc\n\t" \
"mfcr %0\n\t" \
@@ -18,7 +18,6 @@
#include <sysdep.h>
ENTRY (syscall)
- ABORT_TRANSACTION
mr r0,r3
mr r3,r4
mr r4,r5