[v2,2/2] manual: Document __libc_single_threaded
Commit Message
---
V2: Fix typos. Add the requested documentation of implementation details.
manual/threads.texi | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 113 insertions(+)
Comments
The 06/24/2020 15:03, Florian Weimer via Libc-alpha wrote:
> ---
> V2: Fix typos. Add the requested documentation of implementation details.
this spec looks ok to me.
Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
one comment:
linux has several syscalls that change some property of
the current thread and get inherited by new threads
(e.g. setxid, various prctls, signal mask) and a user
may want to set such property for the entire process
and bail out if that cannot work, so the question is
if __libc_single_threaded works for this? even if this
is done in a signal handler? e.g.
  if (__libc_single_threaded)
    prctl(set_property,a2,a3,a4,a5); // entire process ok
  else
    bailout(); // there might be other threads
i think this may be another valid usage (not related
to atomics).
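
For illustration, the pattern asked about above could be wrapped in a helper like the one below. This is only a sketch under assumptions: `set_no_new_privs_process_wide` is a hypothetical name, and `PR_SET_NO_NEW_PRIVS` is picked as one example of a per-thread property that new threads inherit. Whether `__libc_single_threaded` is meant to support this use, in particular from a signal handler, is exactly the open question here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <sys/prctl.h>
#include <sys/single_threaded.h>

/* Hypothetical helper: set a per-thread property only when glibc reports
   that the process is definitely single-threaded, so that the setting
   effectively covers the whole process.  Returns true on success and
   false when the caller has to bail out.  */
bool
set_no_new_privs_process_wide (void)
{
  if (!__libc_single_threaded)
    return false;               /* There might be other threads.  */

  if (prctl (PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
    return false;               /* The kernel rejected the request.  */

  /* Sanity check: the flag is now visible for the calling thread.  */
  assert (prctl (PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1);
  return true;
}
```

A false return covers both "possibly multi-threaded" and "prctl failed"; what the caller can usefully do in that case is left open.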
>
> manual/threads.texi | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 113 insertions(+)
>
> diff --git a/manual/threads.texi b/manual/threads.texi
> index bb7a42c655..009026deb8 100644
> --- a/manual/threads.texi
> +++ b/manual/threads.texi
> @@ -628,6 +628,7 @@ the standard.
> * Initial Thread Signal Mask:: Setting the initial mask of threads.
> * Waiting with Explicit Clocks:: Functions for waiting with an
> explicit clock specification.
> +* Single-Threaded:: Detecting single-threaded execution.
> @end menu
>
> @node Default Thread Attributes
> @@ -843,6 +844,118 @@ Behaves like @code{pthread_timedjoin_np} except that the absolute time in
> @var{abstime} is measured against the clock specified by @var{clockid}.
> @end deftypefun
>
> +@node Single-Threaded
> +@subsubsection Detecting Single-Threaded Execution
> +
> +Multi-threaded programs require synchronization among threads. This
> +synchronization can be costly even if there is just a single thread
> +and no data is shared between multiple processors. @Theglibc{} offers
> +an interface to detect whether the process is in single-threaded mode.
> +Applications can use this information to avoid synchronization, for
> +example by using regular instructions to load and store memory instead
> +of atomic instructions, or using relaxed memory ordering instead of
> +stronger memory ordering.
> +
> +@deftypevar char __libc_single_threaded
> +@standards{GNU, sys/single_threaded.h}
> +This variable is non-zero if the current process is definitely
> +single-threaded. If it is zero, the process may be multi-threaded,
> +or @theglibc{} cannot determine at this point of the program execution
> +whether the process is single-threaded or not.
> +
> +Applications must never write to this variable.
> +@end deftypevar
> +
> +Most applications should perform the same actions whether or not
> +@code{__libc_single_threaded} is true, except with less
> +synchronization. If this rule is followed, a process that
> +subsequently becomes multi-threaded is already in a consistent state.
> +For example, in order to increment a reference count, the following
> +code can be used:
> +
> +@smallexample
> +if (__libc_single_threaded)
> +  atomic_fetch_add_explicit (&reference_count, 1, memory_order_relaxed);
> +else
> +  atomic_fetch_add_explicit (&reference_count, 1, memory_order_acq_rel);
> +@end smallexample
> +
> +@c Note: No memory order on __libc_single_threaded. The
> +@c implementation must ensure that exit of the critical
> +@c (second-to-last) thread happens-before setting
> +@c __libc_single_threaded to true. Otherwise, acquire MO might be
> +@c needed for reading the variable in some scenarios, and that would
> +@c completely defeat its purpose.
> +
> +This still requires some form of synchronization on the
> +single-threaded branch, so it can be beneficial not to declare the
> +reference count as @code{_Atomic}, and use the GCC @code{__atomic}
> +built-ins. @xref{__atomic Builtins,, Built-in Functions for Memory
> +Model Aware Atomic Operations, gcc, Using the GNU Compiler Collection
> +(GCC)}. Then the code to increment a reference count looks like this:
> +
> +@smallexample
> +if (__libc_single_threaded)
> +  ++reference_count;
> +else
> + __atomic_fetch_add (&reference_count, 1, __ATOMIC_ACQ_REL);
> +@end smallexample
> +
> +(Depending on the data associated with the reference count, it may be
> +possible to use the weaker @code{__ATOMIC_RELAXED} memory ordering on
> +the multi-threaded branch.)
> +
> +Several functions in @theglibc{} can change the value of the
> +@code{__libc_single_threaded} variable. For example, creating new
> +threads using the @code{pthread_create} or @code{thrd_create} function
> +sets the variable to false. This can also happen indirectly, say via
> +a call to @code{dlopen}. Therefore, applications need to make a copy
> +of the value of @code{__libc_single_threaded} if after such a function
> +call, behavior must match the value as it was before the call, like
> +this:
> +
> +@smallexample
> +bool single_threaded = __libc_single_threaded;
> +if (single_threaded)
> + prepare_single_threaded ();
> +else
> + prepare_multi_thread ();
> +
> +void *handle = dlopen (shared_library_name, RTLD_NOW);
> +lookup_symbols (handle);
> +
> +if (single_threaded)
> + cleanup_single_threaded ();
> +else
> + cleanup_multi_thread ();
> +@end smallexample
> +
> +Since the value of @code{__libc_single_threaded} can change from true
> +to false during the execution of the program, it is not useful for
> +selecting optimized function implementations in IFUNC resolvers.
> +
> +Atomic operations can also be used on mappings shared among
> +single-threaded processes. This means that a compiler cannot use
> +@code{__libc_single_threaded} to optimize atomic operations, unless it
> +is able to prove that the memory is not shared.
> +
> +@strong{Implementation Note:} The @code{__libc_single_threaded}
> +variable is not declared as @code{volatile} because it is expected
> +that compilers optimize a sequence of single-threaded checks into one
> +check, for example if several reference counts are updated. The
> +current implementation in @theglibc{} does not set the
> +@code{__libc_single_threaded} variable to a true value if a process
> +turns single-threaded again. Future versions of @theglibc{} may do
> +this, but only as the result of function calls which imply an acquire
> +(compiler) barrier. (Some compilers assume that well-known functions
> +such as @code{malloc} do not write to global variables, and setting
> +@code{__libc_single_threaded} would introduce a data race and
> +undefined behavior.) In any case, an application must not write to
> +@code{__libc_single_threaded} even if it has joined the last
> +application-created thread because future versions of @theglibc{} may
> +create background threads after the first thread has been created, and
> +the application has no way of knowing that these threads are present.
> +
> @c FIXME these are undocumented:
> @c pthread_atfork
> @c pthread_attr_destroy
>
--
* Szabolcs Nagy:
> linux has several syscalls that change some property of
> the current thread and get inherited by new threads
> (e.g. setxid, various prctls, signal mask) and a user
> may want to set such property for the entire process
> and bail out if that cannot work, so the question is
> if __libc_single_threaded works for this? even if this
> is done in a signal handler? e.g.
>
>   if (__libc_single_threaded)
>     prctl(set_property,a2,a3,a4,a5); // entire process ok
>   else
>     bailout(); // there might be other threads
>
> i think this may be another valid usage (not related
> to atomics).
In theory, that could be the case, but the question is what you do in
the bailout case.
Thanks,
Florian
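
As a reading aid for the patch, here is a minimal sketch of how the `__atomic` built-in variant of the reference-count example might look as a pair of helpers. The `struct object`, `object_ref`, and `object_unref` names are hypothetical and not part of the patch. The counter is deliberately a plain `unsigned int`, not `_Atomic`, so the single-threaded branch can use an ordinary increment or decrement.

```c
#include <assert.h>
#include <sys/single_threaded.h>

/* Hypothetical reference-counted object.  The counter is not _Atomic so
   that the single-threaded branch compiles to a plain load and store.  */
struct object
{
  unsigned int reference_count;
};

/* Acquire one reference.  */
void
object_ref (struct object *obj)
{
  if (__libc_single_threaded)
    ++obj->reference_count;
  else
    __atomic_fetch_add (&obj->reference_count, 1, __ATOMIC_ACQ_REL);
}

/* Drop one reference.  Returns nonzero if it was the last one, in which
   case the caller may release the object.  */
int
object_unref (struct object *obj)
{
  unsigned int old;

  if (__libc_single_threaded)
    old = obj->reference_count--;
  else
    old = __atomic_fetch_sub (&obj->reference_count, 1, __ATOMIC_ACQ_REL);

  assert (old > 0);             /* Catch unbalanced unref calls.  */
  return old == 1;
}
```

Both branches perform the same logical update, so a process that becomes multi-threaded between two calls still sees a consistent count.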
@@ -628,6 +628,7 @@ the standard.
* Initial Thread Signal Mask:: Setting the initial mask of threads.
* Waiting with Explicit Clocks:: Functions for waiting with an
explicit clock specification.
+* Single-Threaded:: Detecting single-threaded execution.
@end menu
@node Default Thread Attributes
@@ -843,6 +844,118 @@ Behaves like @code{pthread_timedjoin_np} except that the absolute time in
@var{abstime} is measured against the clock specified by @var{clockid}.
@end deftypefun
+@node Single-Threaded
+@subsubsection Detecting Single-Threaded Execution
+
+Multi-threaded programs require synchronization among threads. This
+synchronization can be costly even if there is just a single thread
+and no data is shared between multiple processors. @Theglibc{} offers
+an interface to detect whether the process is in single-threaded mode.
+Applications can use this information to avoid synchronization, for
+example by using regular instructions to load and store memory instead
+of atomic instructions, or using relaxed memory ordering instead of
+stronger memory ordering.
+
+@deftypevar char __libc_single_threaded
+@standards{GNU, sys/single_threaded.h}
+This variable is non-zero if the current process is definitely
+single-threaded. If it is zero, the process may be multi-threaded,
+or @theglibc{} cannot determine at this point of the program execution
+whether the process is single-threaded or not.
+
+Applications must never write to this variable.
+@end deftypevar
+
+Most applications should perform the same actions whether or not
+@code{__libc_single_threaded} is true, except with less
+synchronization. If this rule is followed, a process that
+subsequently becomes multi-threaded is already in a consistent state.
+For example, in order to increment a reference count, the following
+code can be used:
+
+@smallexample
+if (__libc_single_threaded)
+  atomic_fetch_add_explicit (&reference_count, 1, memory_order_relaxed);
+else
+  atomic_fetch_add_explicit (&reference_count, 1, memory_order_acq_rel);
+@end smallexample
+
+@c Note: No memory order on __libc_single_threaded. The
+@c implementation must ensure that exit of the critical
+@c (second-to-last) thread happens-before setting
+@c __libc_single_threaded to true. Otherwise, acquire MO might be
+@c needed for reading the variable in some scenarios, and that would
+@c completely defeat its purpose.
+
+This still requires some form of synchronization on the
+single-threaded branch, so it can be beneficial not to declare the
+reference count as @code{_Atomic}, and use the GCC @code{__atomic}
+built-ins. @xref{__atomic Builtins,, Built-in Functions for Memory
+Model Aware Atomic Operations, gcc, Using the GNU Compiler Collection
+(GCC)}. Then the code to increment a reference count looks like this:
+
+@smallexample
+if (__libc_single_threaded)
+  ++reference_count;
+else
+ __atomic_fetch_add (&reference_count, 1, __ATOMIC_ACQ_REL);
+@end smallexample
+
+(Depending on the data associated with the reference count, it may be
+possible to use the weaker @code{__ATOMIC_RELAXED} memory ordering on
+the multi-threaded branch.)
+
+Several functions in @theglibc{} can change the value of the
+@code{__libc_single_threaded} variable. For example, creating new
+threads using the @code{pthread_create} or @code{thrd_create} function
+sets the variable to false. This can also happen indirectly, say via
+a call to @code{dlopen}. Therefore, applications need to make a copy
+of the value of @code{__libc_single_threaded} if after such a function
+call, behavior must match the value as it was before the call, like
+this:
+
+@smallexample
+bool single_threaded = __libc_single_threaded;
+if (single_threaded)
+ prepare_single_threaded ();
+else
+ prepare_multi_thread ();
+
+void *handle = dlopen (shared_library_name, RTLD_NOW);
+lookup_symbols (handle);
+
+if (single_threaded)
+ cleanup_single_threaded ();
+else
+ cleanup_multi_thread ();
+@end smallexample
+
+Since the value of @code{__libc_single_threaded} can change from true
+to false during the execution of the program, it is not useful for
+selecting optimized function implementations in IFUNC resolvers.
+
+Atomic operations can also be used on mappings shared among
+single-threaded processes. This means that a compiler cannot use
+@code{__libc_single_threaded} to optimize atomic operations, unless it
+is able to prove that the memory is not shared.
+
+@strong{Implementation Note:} The @code{__libc_single_threaded}
+variable is not declared as @code{volatile} because it is expected
+that compilers optimize a sequence of single-threaded checks into one
+check, for example if several reference counts are updated. The
+current implementation in @theglibc{} does not set the
+@code{__libc_single_threaded} variable to a true value if a process
+turns single-threaded again. Future versions of @theglibc{} may do
+this, but only as the result of function calls which imply an acquire
+(compiler) barrier. (Some compilers assume that well-known functions
+such as @code{malloc} do not write to global variables, and setting
+@code{__libc_single_threaded} would introduce a data race and
+undefined behavior.) In any case, an application must not write to
+@code{__libc_single_threaded} even if it has joined the last
+application-created thread because future versions of @theglibc{} may
+create background threads after the first thread has been created, and
+the application has no way of knowing that these threads are present.
+
@c FIXME these are undocumented:
@c pthread_atfork
@c pthread_attr_destroy
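
The paragraph in the patch about shared mappings can be illustrated with a small sketch. It assumes Linux with anonymous `MAP_SHARED` mappings, and `count_in_shared_mapping` is a hypothetical name. Parent and child may each be single-threaded, yet they update the same memory concurrently, so the increment has to stay atomic regardless of what `__libc_single_threaded` reports in either process.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: parent and child each add ITERATIONS to a counter
   that lives in a MAP_SHARED mapping.  Returns the final count, or -1
   on error.  */
long
count_in_shared_mapping (unsigned int iterations)
{
  assert (iterations > 0);      /* Precondition of this sketch.  */

  unsigned int *counter = mmap (NULL, sizeof *counter,
                                PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (counter == MAP_FAILED)
    return -1;
  *counter = 0;

  pid_t pid = fork ();
  if (pid < 0)
    {
      munmap (counter, sizeof *counter);
      return -1;
    }

  /* Both processes run this loop at the same time.  A plain ++*counter
     could lose updates here even if neither process has a second
     thread.  */
  for (unsigned int i = 0; i < iterations; ++i)
    __atomic_fetch_add (counter, 1, __ATOMIC_RELAXED);

  if (pid == 0)
    _exit (0);                  /* Child: done, skip stdio flushing.  */

  int status;
  if (waitpid (pid, &status, 0) != pid)
    {
      munmap (counter, sizeof *counter);
      return -1;
    }

  long result = *counter;
  munmap (counter, sizeof *counter);
  return result;
}
```

With the atomic increment, the result is twice `iterations` whenever the function succeeds; replacing it with a plain increment guarded by `__libc_single_threaded` would make that guarantee disappear.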