[v2,2/2] manual: Document __libc_single_threaded

Message ID 99b00049f399fd0acffb2c27530f8fd7a64018f8.1593003514.git.fweimer@redhat.com
State Superseded
Series [v2,1/2] Add the __libc_single_threaded variable

Commit Message

Florian Weimer June 24, 2020, 1:03 p.m. UTC
---
V2: Fix typos.  Add the requested documentation of implementation details.

 manual/threads.texi | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 113 insertions(+)

Comments

Szabolcs Nagy June 25, 2020, 12:08 p.m. UTC | #1
The 06/24/2020 15:03, Florian Weimer via Libc-alpha wrote:
> ---
> V2: Fix typos.  Add the requested documentation of implementation details.

this spec looks ok to me.

Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

one comment:

linux has several syscalls that change some property of
the current thread and get inherited by new threads
(e.g. setxid, various prctls, signal mask) and a user
may want to set such property for the entire process
and bail out if that cannot work, so the question is
if __libc_single_threaded works for this? even if this
is done in a signal handler? e.g.

if (__libc_single_threaded)
  prctl(set_property,a2,a3,a4,a5); // entire process ok
else
  bailout(); // there might be other threads

i think this may be another valid usage (not related
to atomics).

> 
>  manual/threads.texi | 113 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 113 insertions(+)
> 
> diff --git a/manual/threads.texi b/manual/threads.texi
> index bb7a42c655..009026deb8 100644
> --- a/manual/threads.texi
> +++ b/manual/threads.texi
> @@ -628,6 +628,7 @@ the standard.
>  * Initial Thread Signal Mask::            Setting the initial mask of threads.
>  * Waiting with Explicit Clocks::          Functions for waiting with an
>                                            explicit clock specification.
> +* Single-Threaded::     Detecting single-threaded execution.
>  @end menu
>  
>  @node Default Thread Attributes
> @@ -843,6 +844,118 @@ Behaves like @code{pthread_timedjoin_np} except that the absolute time in
>  @var{abstime} is measured against the clock specified by @var{clockid}.
>  @end deftypefun
>  
> +@node Single-Threaded
> +@subsubsection Detecting Single-Threaded Execution
> +
> +Multi-threaded programs require synchronization among threads.  This
> +synchronization can be costly even if there is just a single thread
> +and no data is shared between multiple processors.  @Theglibc{} offers
> +an interface to detect whether the process is in single-threaded mode.
> +Applications can use this information to avoid synchronization, for
> +example by using regular instructions to load and store memory instead
> +of atomic instructions, or using relaxed memory ordering instead of
> +stronger memory ordering.
> +
> +@deftypevar char __libc_single_threaded
> +@standards{GNU, sys/single_threaded.h}
> +This variable is non-zero if the current process is definitely
> +single-threaded.  If it is zero, the process may be multi-threaded,
> +or @theglibc{} cannot determine at this point of the program execution
> +whether the process is single-threaded or not.
> +
> +Applications must never write to this variable.
> +@end deftypevar
> +
> +Most applications should perform the same actions whether or not
> +@code{__libc_single_threaded} is true, except with less
> +synchronization.  If this rule is followed, a process that
> +subsequently becomes multi-threaded is already in a consistent state.
> +For example, in order to increment a reference count, the following
> +code can be used:
> +
> +@smallexample
> +if (__libc_single_threaded)
> +  atomic_fetch_add_explicit (&reference_count, 1, memory_order_relaxed);
> +else
> +  atomic_fetch_add_explicit (&reference_count, 1, memory_order_acq_rel);
> +@end smallexample
> +
> +@c Note: No memory order on __libc_single_threaded.  The
> +@c implementation must ensure that exit of the critical
> +@c (second-to-last) thread happens-before setting
> +@c __libc_single_threaded to true.  Otherwise, acquire MO might be
> +@c needed for reading the variable in some scenarios, and that would
> +@c completely defeat its purpose.
> +
> +This still requires some form of synchronization on the
> +single-threaded branch, so it can be beneficial not to declare the
> +reference count as @code{_Atomic}, and use the GCC @code{__atomic}
> +built-ins.  @xref{__atomic Builtins,, Built-in Functions for Memory
> +Model Aware Atomic Operations, gcc, Using the GNU Compiler Collection
> +(GCC)}.  Then the code to increment a reference count looks like this:
> +
> +@smallexample
> +if (__libc_single_threaded)
> +  ++reference_count;
> +else
> +  __atomic_fetch_add (&reference_count, 1, __ATOMIC_ACQ_REL);
> +@end smallexample
> +
> +(Depending on the data associated with the reference count, it may be
> +possible to use the weaker @code{__ATOMIC_RELAXED} memory ordering on
> +the multi-threaded branch.)
> +
> +Several functions in @theglibc{} can change the value of the
> +@code{__libc_single_threaded} variable.  For example, creating new
> +threads using the @code{pthread_create} or @code{thrd_create} function
> +sets the variable to false.  This can also happen indirectly, say via
> +a call to @code{dlopen}.  Therefore, applications need to make a copy
> +of the value of @code{__libc_single_threaded} if after such a function
> +call, behavior must match the value as it was before the call, like
> +this:
> +
> +@smallexample
> +bool single_threaded = __libc_single_threaded;
> +if (single_threaded)
> +  prepare_single_threaded ();
> +else
> +  prepare_multi_thread ();
> +
> +void *handle = dlopen (shared_library_name, RTLD_NOW);
> +lookup_symbols (handle);
> +
> +if (single_threaded)
> +  cleanup_single_threaded ();
> +else
> +  cleanup_multi_thread ();
> +@end smallexample
> +
> +Since the value of @code{__libc_single_threaded} can change from true
> +to false during the execution of the program, it is not useful for
> +selecting optimized function implementations in IFUNC resolvers.
> +
> +Atomic operations can also be used on mappings shared among
> +single-threaded processes.  This means that a compiler cannot use
> +@code{__libc_single_threaded} to optimize atomic operations, unless it
> +is able to prove that the memory is not shared.
> +
> +@strong{Implementation Note:} The @code{__libc_single_threaded}
> +variable is not declared as @code{volatile} because it is expected
> +that compilers optimize a sequence of single-threaded checks into one
> +check, for example if several reference counts are updated.  The
> +current implementation in @theglibc{} does not set the
> +@code{__libc_single_threaded} variable to a true value if a process
> +turns single-threaded again.  Future versions of @theglibc{} may do
> +this, but only as the result of function calls which imply an acquire
> +(compiler) barrier.  (Some compilers assume that well-known functions
> +such as @code{malloc} do not write to global variables, and setting
> +@code{__libc_single_threaded} would introduce a data race and
> +undefined behavior.)  In any case, an application must not write to
> +@code{__libc_single_threaded} even if it has joined the last
> +application-created thread because future versions of @theglibc{} may
> +create background threads after the first thread has been created, and
> +the application has no way of knowing that these threads are present.
> +
>  @c FIXME these are undocumented:
>  @c pthread_atfork
>  @c pthread_attr_destroy
> 

--
Florian Weimer June 30, 2020, 8:56 a.m. UTC | #2
* Szabolcs Nagy:

> linux has several syscalls that change some property of
> the current thread and get inherited by new threads
> (e.g. setxid, various prctls, signal mask) and a user
> may want to set such property for the entire process
> and bail out if that cannot work, so the question is
> if __libc_single_threaded works for this? even if this
> is done in a signal handler? e.g.
>
> if (__libc_single_threaded)
>   prctl(set_property,a2,a3,a4,a5); // entire process ok
> else
>   bailout(); // there might be other threads
>
> i think this may be another valid usage (not related
> to atomics).

In theory, that could be the case, but the question is what you do in
the bailout case.

Thanks,
Florian
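
The pattern raised in comment #1 could be sketched roughly as below.
This is only an illustration under stated assumptions: glibc 2.32 or
later (for <sys/single_threaded.h>), PR_SET_NO_NEW_PRIVS picked as a
stand-in for the unspecified "set_property", and a hypothetical helper
name, set_no_new_privs_whole_process.  The thread does not settle
whether this use is supported or what the bailout path should do, so
the sketch merely reports EBUSY.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/single_threaded.h>  /* Assumes glibc 2.32 or later.  */

/* Hypothetical helper, not part of glibc: apply a per-thread setting
   only when glibc reports that no other thread can exist, so that the
   setting covers the whole process.  Returns 0 on success and -1 with
   errno set on failure.  */
static int
set_no_new_privs_whole_process (void)
{
  if (!__libc_single_threaded)
    {
      /* Other threads may exist and would keep the old setting.  */
      errno = EBUSY;
      return -1;
    }
  /* Single-threaded: threads created later inherit the setting.  */
  return prctl (PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}
```

A caller still has to decide how to handle the EBUSY case, which is
the open question in the reply above.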

Patch

diff --git a/manual/threads.texi b/manual/threads.texi
index bb7a42c655..009026deb8 100644
--- a/manual/threads.texi
+++ b/manual/threads.texi
@@ -628,6 +628,7 @@  the standard.
 * Initial Thread Signal Mask::            Setting the initial mask of threads.
 * Waiting with Explicit Clocks::          Functions for waiting with an
                                           explicit clock specification.
+* Single-Threaded::     Detecting single-threaded execution.
 @end menu
 
 @node Default Thread Attributes
@@ -843,6 +844,118 @@  Behaves like @code{pthread_timedjoin_np} except that the absolute time in
 @var{abstime} is measured against the clock specified by @var{clockid}.
 @end deftypefun
 
+@node Single-Threaded
+@subsubsection Detecting Single-Threaded Execution
+
+Multi-threaded programs require synchronization among threads.  This
+synchronization can be costly even if there is just a single thread
+and no data is shared between multiple processors.  @Theglibc{} offers
+an interface to detect whether the process is in single-threaded mode.
+Applications can use this information to avoid synchronization, for
+example by using regular instructions to load and store memory instead
+of atomic instructions, or using relaxed memory ordering instead of
+stronger memory ordering.
+
+@deftypevar char __libc_single_threaded
+@standards{GNU, sys/single_threaded.h}
+This variable is non-zero if the current process is definitely
+single-threaded.  If it is zero, the process may be multi-threaded,
+or @theglibc{} cannot determine at this point of the program execution
+whether the process is single-threaded or not.
+
+Applications must never write to this variable.
+@end deftypevar
+
+Most applications should perform the same actions whether or not
+@code{__libc_single_threaded} is true, except with less
+synchronization.  If this rule is followed, a process that
+subsequently becomes multi-threaded is already in a consistent state.
+For example, in order to increment a reference count, the following
+code can be used:
+
+@smallexample
+if (__libc_single_threaded)
+  atomic_fetch_add_explicit (&reference_count, 1, memory_order_relaxed);
+else
+  atomic_fetch_add_explicit (&reference_count, 1, memory_order_acq_rel);
+@end smallexample
+
+@c Note: No memory order on __libc_single_threaded.  The
+@c implementation must ensure that exit of the critical
+@c (second-to-last) thread happens-before setting
+@c __libc_single_threaded to true.  Otherwise, acquire MO might be
+@c needed for reading the variable in some scenarios, and that would
+@c completely defeat its purpose.
+
+This still requires some form of synchronization on the
+single-threaded branch, so it can be beneficial not to declare the
+reference count as @code{_Atomic}, and use the GCC @code{__atomic}
+built-ins.  @xref{__atomic Builtins,, Built-in Functions for Memory
+Model Aware Atomic Operations, gcc, Using the GNU Compiler Collection
+(GCC)}.  Then the code to increment a reference count looks like this:
+
+@smallexample
+if (__libc_single_threaded)
+  ++reference_count;
+else
+  __atomic_fetch_add (&reference_count, 1, __ATOMIC_ACQ_REL);
+@end smallexample
+
+(Depending on the data associated with the reference count, it may be
+possible to use the weaker @code{__ATOMIC_RELAXED} memory ordering on
+the multi-threaded branch.)
+
+Several functions in @theglibc{} can change the value of the
+@code{__libc_single_threaded} variable.  For example, creating new
+threads using the @code{pthread_create} or @code{thrd_create} function
+sets the variable to false.  This can also happen indirectly, say via
+a call to @code{dlopen}.  Therefore, applications need to make a copy
+of the value of @code{__libc_single_threaded} if after such a function
+call, behavior must match the value as it was before the call, like
+this:
+
+@smallexample
+bool single_threaded = __libc_single_threaded;
+if (single_threaded)
+  prepare_single_threaded ();
+else
+  prepare_multi_thread ();
+
+void *handle = dlopen (shared_library_name, RTLD_NOW);
+lookup_symbols (handle);
+
+if (single_threaded)
+  cleanup_single_threaded ();
+else
+  cleanup_multi_thread ();
+@end smallexample
+
+Since the value of @code{__libc_single_threaded} can change from true
+to false during the execution of the program, it is not useful for
+selecting optimized function implementations in IFUNC resolvers.
+
+Atomic operations can also be used on mappings shared among
+single-threaded processes.  This means that a compiler cannot use
+@code{__libc_single_threaded} to optimize atomic operations, unless it
+is able to prove that the memory is not shared.
+
+@strong{Implementation Note:} The @code{__libc_single_threaded}
+variable is not declared as @code{volatile} because it is expected
+that compilers optimize a sequence of single-threaded checks into one
+check, for example if several reference counts are updated.  The
+current implementation in @theglibc{} does not set the
+@code{__libc_single_threaded} variable to a true value if a process
+turns single-threaded again.  Future versions of @theglibc{} may do
+this, but only as the result of function calls which imply an acquire
+(compiler) barrier.  (Some compilers assume that well-known functions
+such as @code{malloc} do not write to global variables, and setting
+@code{__libc_single_threaded} would introduce a data race and
+undefined behavior.)  In any case, an application must not write to
+@code{__libc_single_threaded} even if it has joined the last
+application-created thread because future versions of @theglibc{} may
+create background threads after the first thread has been created, and
+the application has no way of knowing that these threads are present.
+
 @c FIXME these are undocumented:
 @c pthread_atfork
 @c pthread_attr_destroy
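
For readers who want to try the reference-count pattern documented in
the patch, here is a minimal sketch.  It assumes glibc 2.32 or later
and a compiler that provides the GCC __atomic built-ins; struct object,
object_ref and object_unref are hypothetical names used only for
illustration.  The counter is a plain long rather than _Atomic, so the
single-threaded branch can use ordinary arithmetic.

```c
#include <stdbool.h>
#include <sys/single_threaded.h>  /* Assumes glibc 2.32 or later.  */

/* Hypothetical reference-counted object.  */
struct object
{
  long refcount;
};

/* Take one reference.  Both branches perform the same increment, so
   the count stays consistent if the process later creates threads.  */
static void
object_ref (struct object *obj)
{
  if (__libc_single_threaded)
    ++obj->refcount;
  else
    __atomic_fetch_add (&obj->refcount, 1, __ATOMIC_ACQ_REL);
}

/* Drop one reference.  Returns true if it was the last one, in which
   case the caller may free the object.  */
static bool
object_unref (struct object *obj)
{
  if (__libc_single_threaded)
    return --obj->refcount == 0;
  return __atomic_sub_fetch (&obj->refcount, 1, __ATOMIC_ACQ_REL) == 0;
}
```

The variable is read anew on every call rather than cached, since
thread creation can flip it from true to false at any time.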