From patchwork Fri Dec 19 07:37:45 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andi Kleen
X-Patchwork-Id: 4357
Received: (qmail 15188 invoked by alias); 19 Dec 2014 07:38:08 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 15098 invoked by uid 89); 19 Dec 2014 07:38:08 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=AWL, BAYES_00, LUCRATIVE, T_RP_MATCHES_RCVD autolearn=no version=3.3.2
X-HELO: mga09.intel.com
X-ExtLoop1: 1
From: Andi Kleen
To: libc-alpha@sourceware.org
Cc: Andi Kleen
Subject: [PATCH 5/7] Add manual for lock elision
Date: Thu, 18 Dec 2014 23:37:45 -0800
Message-Id: <1418974667-32587-6-git-send-email-andi@firstfloor.org>
In-Reply-To: <1418974667-32587-1-git-send-email-andi@firstfloor.org>
References: <1418974667-32587-1-git-send-email-andi@firstfloor.org>

From: Andi Kleen

This adds the original manual (from the original code submission) for
elision, with a description of all the new tuning options.

2014-12-17  Andi Kleen

	* manual/Makefile: Add elision.texi.
	* manual/threads.texi: Link to elision.
	* manual/elision.texi: New file.
	* manual/intro.texi: Link to elision.
	* manual/lang.texi: Ditto.
---
 manual/Makefile     |   3 +-
 manual/elision.texi | 264 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 manual/intro.texi   |   3 +
 manual/lang.texi    |   2 +-
 4 files changed, 270 insertions(+), 2 deletions(-)
 create mode 100644 manual/elision.texi

diff --git a/manual/Makefile b/manual/Makefile
index 1f481f2..dccebfe 100644
--- a/manual/Makefile
+++ b/manual/Makefile
@@ -38,7 +38,8 @@ chapters = $(addsuffix .texi, \
 		       message search pattern io stdio llio filesys \
 		       pipe socket terminal syslog math arith time \
 		       resource setjmp signal startup process ipc job \
-		       nss users sysinfo conf crypt debug threads probes)
+		       nss users sysinfo conf crypt debug threads probes \
+		       elision)
 add-chapters = $(wildcard $(foreach d, $(add-ons), ../$d/$d.texi))
 appendices = lang.texi header.texi install.texi maint.texi platform.texi \
 	     contrib.texi
diff --git a/manual/elision.texi b/manual/elision.texi
new file mode 100644
index 0000000..8f0b3e5
--- /dev/null
+++ b/manual/elision.texi
@@ -0,0 +1,264 @@
+@node Lock elision, Language Features, POSIX Threads, Top
+@c %MENU% Lock elision
+@chapter Lock elision
+
+@c create the bizarre situation that lock elision is documented, but pthreads isn't
+
+This chapter describes the elided lock implementation for POSIX thread locks.
+
+@menu
+* Lock elision introduction:: What is lock elision?
+* Semantic differences of elided locks::
+* Tuning lock elision::
+* Setting elision for individual @code{pthread_mutex_t}::
+* Setting @code{pthread_mutex_t} elision using environment variables::
+* Setting @code{pthread_rwlock_t} elision using environment variables::
+@end menu
+
+@node Lock elision introduction
+@section Lock elision introduction
+
+Lock elision is a technique to improve lock scaling. It runs
+lock regions in parallel using hardware support for a transactional execution
+mode. The lock region is executed speculatively, and as long
+as there is no conflict or other reason for a transaction abort the lock
+region will be executed in parallel.
+If a transaction abort occurs, any
+side effect of the speculative execution is undone, the lock is taken
+for real and the lock region is re-executed. This improves scalability
+of the program because locks do not need to wait for each other.
+
+The standard @code{pthread_mutex_t} mutexes and @code{pthread_rwlock_t} rwlocks
+can be transparently elided by @theglibc{}.
+
+Lock elision may lower performance if transaction aborts occur too frequently.
+In this case it is recommended to first use a PMU profiler to find the causes
+of the aborts and try to eliminate them. If that is not possible,
+elision can be disabled for a specific lock or for the whole program.
+Alternatively, elision can be disabled completely and only enabled for
+specific locks that are known to be elision friendly.
+
+The default locks are adaptive. The library decides whether elision
+is profitable based on the abort rates, and automatically disables
+elision for a lock when it aborts too often. After some time elision
+is retried, in case the workload changed.
+
+Lock elision is currently supported for default (timed) mutexes and
+rwlocks. Other lock types (including @code{PTHREAD_MUTEX_NORMAL}) do not elide.
+Condition variables also do not elide. This may change in future versions.
+
+@node Semantic differences of elided locks
+@section Semantic differences of elided locks
+
+Elided locks have some semantic (visible) differences to classic locks. These differences
+are only visible when the lock is successfully elided. Since elision may always
+fail, a program cannot rely on any of these semantics.
+
+@itemize
+@item
+Timed locks may not time out.
+
+@smallexample
+pthread_mutex_lock (&lock);
+if (pthread_mutex_timedlock (&lock, &timeout) == 0)
+  @{
+    /* Reached when the first lock was elided: the lock appears free.  */
+  @}
+else
+  @{
+    /* Reached without elision, once the timeout expires.  */
+  @}
+@end smallexample
+
+Similar semantic changes apply to @code{pthread_rwlock_trywrlock} and
+@code{pthread_rwlock_timedwrlock}.
+
+@item
+Relocking a non-recursive lock in the same thread may not deadlock.
+
+A program like
+
+@smallexample
+/* lock is not a recursive lock type */
+pthread_mutex_lock (&lock);
+/* Relock same lock in same thread */
+pthread_mutex_lock (&lock);
+@end smallexample
+
+will immediately hang on the second lock (deadlock) without elision. With
+elision the deadlock will only happen on an abort, which can happen
+early or could happen later, but will likely not happen every time.
+
+This behavior is allowed in POSIX for @code{PTHREAD_MUTEX_DEFAULT}, but not for
+@code{PTHREAD_MUTEX_NORMAL}. When @code{PTHREAD_MUTEX_NORMAL} is
+set for a mutex using @code{pthread_mutexattr_settype}, elision is implicitly
+disabled. Note that @code{PTHREAD_MUTEX_INITIALIZER} sets a
+@code{PTHREAD_MUTEX_DEFAULT} type, and thus allows elision.
+
+Depending on the ABI version @theglibc{} may not distinguish between
+@code{PTHREAD_MUTEX_NORMAL} and @code{PTHREAD_MUTEX_DEFAULT}, as they may
+have the same numerical value. If that is the case any call to
+@code{pthread_mutexattr_settype} with either type will disable elision.
+
+@item
+@code{pthread_mutex_destroy} does not return an error when the lock is locked
+and will clear the lock state.
+
+@item
+An elided @code{pthread_mutex_t} or @code{pthread_rwlock_t} appears free to other threads.
+
+This can be visible through trylock or timedlock.
+In most cases a program that checks this has an existing latent race, but there may
+be cases when it does not.
+
+@item
+@code{EAGAIN} and @code{EDEADLK} in rwlocks will not happen under elision.
+
+@item
+@code{pthread_mutex_unlock} does not return an error when unlocking a free lock.
+
+@item
+Elision changes timing because lock regions now run in parallel.
+Timing differences may expose latent race bugs in the program. Programs using time-based synchronization
+(as opposed to using data dependencies) may change behavior.
+
+@end itemize
+
+@node Tuning lock elision
+@section Tuning lock elision
+
+Critical regions may need some tuning to get the benefit of lock elision.
+This tuning is based on the abort rates, which can be determined by a PMU profiler
+(e.g. perf on @gnulinuxsystems{}). When the abort rate is too high lock
+scaling will not improve. Generally lock elision tuning should be done
+only based on profile feedback.
+
+Most of these optimizations will improve performance even without lock elision
+because they will minimize cache line bouncing between threads or make
+lock regions smaller.
+
+Common causes of transactional aborts:
+
+@itemize
+@item
+Operations that cannot be elided, such as system calls, I/O, and CPU exceptions.
+
+Try to move them out of the critical section when they are common. Note that these often happen at program startup only.
+@item
+Global statistic counts
+
+Global statistic variables tend to cause conflicts. Either disable them, make them per thread, or as a last resort
+sample them (do not update them on every operation).
+@item
+False sharing of variables or data structures causing conflicts with other threads
+
+Add padding as needed.
+@item
+Other conflicts on the same cache lines with other threads
+
+Minimize conflicts with other threads. This may require changes to the data structures.
+@item
+Capacity overflow
+
+The memory transaction used for lock elision has a limited capacity. Make the critical region smaller
+or move operations that do not need to be protected by the lock outside.
+
+@item
+Rewriting already set flags
+
+Setting flags or variables in shared objects that are already set may cause conflicts. Add a check
+to only write when the value changed.
+
+@item
+Using @code{pthread_mutex_trylock} or @code{pthread_rwlock_trywrlock}
+nested in another elided lock.
+
+@end itemize
+
+@node Setting elision for individual @code{pthread_mutex_t}
+@section Setting elision for individual @code{pthread_mutex_t}
+
+Elision can be disabled for an individual @code{pthread_mutex_t} in the program
+by setting its type to @code{PTHREAD_MUTEX_NORMAL} using @code{pthread_mutexattr_settype}.
+
+@smallexample
+/* Force no lock elision for a mutex */
+pthread_mutexattr_t attr;
+pthread_mutexattr_init (&attr);
+pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_NORMAL);
+pthread_mutex_init (&object->mylock, &attr);
+/* The attribute object is no longer needed once the mutex is initialized */
+pthread_mutexattr_destroy (&attr);
+@end smallexample
+
+@node Setting @code{pthread_mutex_t} elision using environment variables
+@section Setting @code{pthread_mutex_t} elision using environment variables
+The elision of @code{pthread_mutex_t} mutexes can be configured at runtime with the @code{GLIBC_PTHREAD_MUTEX}
+environment variable. This will force a specific lock type for all
+mutexes in the program that did not explicitly disable elision
+by setting a non-default type.
+
+@smallexample
+# run myprogram with no elision
+GLIBC_PTHREAD_MUTEX=none myprogram
+@end smallexample
+
+The default depends on the @theglibc{} build configuration and whether the hardware
+supports lock elision.
+
+@itemize
+@item
+@code{GLIBC_PTHREAD_MUTEX=elision}
+Use elided mutexes, unless explicitly disabled in the program.
+
+@item
+@code{GLIBC_PTHREAD_MUTEX=none}
+Don't use elided mutexes, unless explicitly enabled in the program.
+@end itemize
+
+Additional tunables can be configured through the environment variable,
+like this:
+
+@smallexample
+GLIBC_PTHREAD_MUTEX=elision:skip_lock_busy=10,skip_lock_internal_abort=20
+@end smallexample
+
+Note that these parameters do not constitute an ABI and may change or disappear
+at any time as the lock elision algorithm evolves.
+
+Currently supported parameters are:
+
+@itemize
+@item
+@code{skip_lock_busy}
+How often to skip attempting a transaction when the lock is seen as busy.
+Expressed in number of lock attempts.
+
+@item
+@code{skip_lock_internal_abort}
+How often to skip attempting a transaction after an internal abort is seen.
+Expressed in number of lock attempts.
+
+@item
+@code{retry_try_xbegin}
+How often to retry the transaction on external aborts.
+Expressed in number of transaction starts.
+
+@item
+@code{skip_trylock_internal_abort}
+How often to skip doing a transaction on internal aborts during trylock.
+This setting is also used for adaptive locks.
+Expressed in number of transaction starts.
+
+@end itemize
+
+@node Setting @code{pthread_rwlock_t} elision using environment variables
+@section Setting @code{pthread_rwlock_t} elision using environment variables
+The elision of @code{pthread_rwlock_t} rwlocks can be configured at
+runtime with the @code{GLIBC_PTHREAD_RWLOCK} environment variable.
+
+@smallexample
+# run myprogram with no elision
+GLIBC_PTHREAD_RWLOCK=none myprogram
+@end smallexample
+
+The default depends on the @theglibc{} build configuration and whether the hardware
+supports lock elision.
+
+@itemize
+@item
+@code{GLIBC_PTHREAD_RWLOCK=elision}
+Use elided rwlocks, unless explicitly disabled in the program.
+
+@item
+@code{GLIBC_PTHREAD_RWLOCK=none}
+Don't use elided rwlocks, unless explicitly enabled in the program.
+@end itemize
+
+The same tunable parameters as documented for @code{GLIBC_PTHREAD_MUTEX} can
+also be used for @code{GLIBC_PTHREAD_RWLOCK}.
diff --git a/manual/intro.texi b/manual/intro.texi
index d4045f2..c54d050 100644
--- a/manual/intro.texi
+++ b/manual/intro.texi
@@ -1470,6 +1470,9 @@ information about the hardware and software configuration your program
 is executing under.
 
 @item
+@ref{Lock elision}, describes elided locks in POSIX threads.
+
+@item
 @ref{System Configuration}, tells you how you can get information about
 various operating system limits.  Most of these parameters are provided for
 compatibility with POSIX.
diff --git a/manual/lang.texi b/manual/lang.texi
index 28b21cb..f88f4fd 100644
--- a/manual/lang.texi
+++ b/manual/lang.texi
@@ -1,6 +1,6 @@
 @c This node must have no pointers.
 @node Language Features
-@c @node Language Features, Library Summary, , Top
+@c @node Language Features, Library Summary, Lock elision, Top
 @c %MENU% C language features provided by the library
 @appendix C Language Facilities in the Library
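
Illustration only, not part of the diff: a minimal C sketch of two of the
hints from the "Tuning lock elision" section of manual/elision.texi, namely
per-thread statistic counters kept on separate cache lines and checking a
flag before rewriting it. The names padded_counter, count_hit, total_hits
and set_flag_once, the 64-byte CACHE_LINE_SIZE and the MAX_THREADS bound are
all assumptions made for this sketch; none of them come from glibc.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; 64 bytes is common on x86 but not guaranteed.  */
#define CACHE_LINE_SIZE 64
/* Hypothetical upper bound on worker threads, for illustration only.  */
#define MAX_THREADS 8

/* One counter per thread.  The alignment gives each counter its own
   cache line, so updates from different threads do not conflict.  */
struct padded_counter
{
  _Alignas (CACHE_LINE_SIZE) unsigned long value;
};

static struct padded_counter hits[MAX_THREADS];

/* Meant to be called inside the critical region: it only touches the
   cache line that belongs to the calling thread's index.  */
static void
count_hit (size_t thread_index)
{
  hits[thread_index].value++;
}

/* Meant to be called outside the hot path to sum the per-thread counts.  */
static unsigned long
total_hits (void)
{
  unsigned long sum = 0;
  for (size_t i = 0; i < MAX_THREADS; i++)
    sum += hits[i].value;
  return sum;
}

/* Write the shared flag only when it is not set yet, so that the common
   case is a read and not a store.  Returns 1 if the flag was written,
   0 if it was already set.  */
static int
set_flag_once (int *flag)
{
  if (*flag)
    return 0;
  *flag = 1;
  return 1;
}
```

Each thread would call count_hit with its own index while holding the lock,
and a reporting thread would call total_hits when it needs the aggregate.
Whether either change lowers the abort rate for a given workload still has
to be confirmed with a PMU profiler, as the manual text recommends.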