[02/15] elf: Fix data race in _dl_name_match_p [BZ #21349]

Message ID 3721784a79c9d2040297304b2a7216d7072ea838.1613390045.git.szabolcs.nagy@arm.com
State Committed
Commit 395be7c2184645320c955b0ba214af9fa1ea9675
Delegated to: Adhemerval Zanella Netto
Series Dynamic TLS related data race fixes

Commit Message

Szabolcs Nagy Feb. 15, 2021, 11:56 a.m. UTC
  From: Maninder Singh <maninder1.s@samsung.com>

dlopen updates libname_list by writing to lastp->next, but concurrent
reads in _dl_name_match_p were not synchronized when it was called
without holding GL(dl_load_lock), which can happen during lazy symbol
resolution.

This patch fixes the race between _dl_name_match_p reading lastp->next
and add_name_to_object writing to it. This could cause segfault on
targets with weak memory order when lastp->next->name is read, which
was observed on an arm system. Fixes bug 21349.

(Code is from Maninder Singh, comments and description are from Szabolcs
Nagy.)

Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
---
 elf/dl-load.c | 18 +++++++++++++++++-
 elf/dl-misc.c |  4 +++-
 2 files changed, 20 insertions(+), 2 deletions(-)
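
The publication pattern described in the commit message can be sketched outside glibc as follows. This is only an illustration under stated assumptions: it uses C11 `<stdatomic.h>` in place of glibc's internal atomic_store_release and atomic_load_acquire macros, and `struct name_node`, `add_name` and `name_match_p` are hypothetical stand-ins for struct libname_list, add_name_to_object and _dl_name_match_p, not the glibc code itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for glibc's struct libname_list.  */
struct name_node
{
  const char *name;
  _Atomic (struct name_node *) next;
};

/* Writer side, modelled on add_name_to_object.  The caller is assumed to
   serialize writers (glibc holds GL(dl_load_lock) at this point).
   Returns 0 on success, -1 on allocation failure.  */
static int
add_name (struct name_node *head, const char *name)
{
  assert (head != NULL && name != NULL);

  /* Writers are serialized, so relaxed loads are enough to find the tail.  */
  struct name_node *lastp = head;
  for (struct name_node *p
         = atomic_load_explicit (&head->next, memory_order_relaxed);
       p != NULL;
       p = atomic_load_explicit (&p->next, memory_order_relaxed))
    lastp = p;

  size_t name_len = strlen (name) + 1;
  struct name_node *newname = malloc (sizeof *newname + name_len);
  if (newname == NULL)
    return -1;

  /* Initialize the node completely before it becomes reachable.  */
  newname->name = memcpy (newname + 1, name, name_len);
  atomic_init (&newname->next, NULL);

  /* Release store: a reader that observes newname through an acquire
     load of lastp->next also observes the initialization above.  */
  atomic_store_explicit (&lastp->next, newname, memory_order_release);
  return 0;
}

/* Reader side, modelled on _dl_name_match_p.  May run without the lock.  */
static int
name_match_p (struct name_node *head, const char *name)
{
  for (struct name_node *runp = head; runp != NULL;
       runp = atomic_load_explicit (&runp->next, memory_order_acquire))
    if (strcmp (name, runp->name) == 0)
      return 1;
  return 0;
}
```

With a plain store in `add_name`, a weakly ordered CPU (or the compiler) may make the new pointer visible before the write to `newname->name`, so a concurrent reader could dereference an uninitialized name. The release/acquire pair rules that ordering out.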
  

Comments

develop--- via Libc-alpha Feb. 15, 2021, 12:11 p.m. UTC | #1
Hi Szabolcs,

Thanks for trying this patch again.

Please also add Vaneet's name as a co-author.

>From: Maninder Singh <maninder1.s@samsung.com>
> 
>dlopen updates libname_list by writing to lastp->next, but concurrent
>reads in _dl_name_match_p were not synchronized when it was called
>without holding GL(dl_load_lock), which can happen during lazy symbol
>resolution.
> 
>This patch fixes the race between _dl_name_match_p reading lastp->next
>and add_name_to_object writing to it. This could cause segfault on
>targets with weak memory order when lastp->next->name is read, which
>was observed on an arm system. Fixes bug 21349.
> 
>(Code is from Maninder Singh, comments and description is from Szabolcs
>Nagy.)
> 
>Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Co-authored-by: Vaneet Narang <v.narang@samsung.com>

because the issue was first analysed by Vaneet.

Thanks Again,

Maninder Singh
  
Adhemerval Zanella April 1, 2021, 2:01 p.m. UTC | #2
On 15/02/2021 08:56, Szabolcs Nagy via Libc-alpha wrote:
> From: Maninder Singh <maninder1.s@samsung.com>
> 
> dlopen updates libname_list by writing to lastp->next, but concurrent
> reads in _dl_name_match_p were not synchronized when it was called
> without holding GL(dl_load_lock), which can happen during lazy symbol
> resolution.
> 
> This patch fixes the race between _dl_name_match_p reading lastp->next
> and add_name_to_object writing to it. This could cause segfault on
> targets with weak memory order when lastp->next->name is read, which
> was observed on an arm system. Fixes bug 21349.
> 
> (Code is from Maninder Singh, comments and description is from Szabolcs
> Nagy.)
> 
> Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

I couldn't reproduce it with the example provided in the bugzilla (on both
aarch64 and arm machines), but the explanation and the fix sound logical.
I guess a testcase that exercises the issue will be hard to create.

LGTM, thanks.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

> ---
>  elf/dl-load.c | 18 +++++++++++++++++-
>  elf/dl-misc.c |  4 +++-
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 9e2089cfaa..be54bafad5 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -438,7 +438,23 @@ add_name_to_object (struct link_map *l, const char *name)
>    newname->name = memcpy (newname + 1, name, name_len);
>    newname->next = NULL;
>    newname->dont_free = 0;
> -  lastp->next = newname;
> +  /* CONCURRENCY NOTES:
> +
> +     Make sure the initialization of newname happens before its address is
> +     read from the lastp->next store below.
> +
> +     GL(dl_load_lock) is held here (and by other writers, e.g. dlclose), so
> +     readers of libname_list->next (e.g. _dl_check_caller or the reads above)
> +     can use that for synchronization, however the read in _dl_name_match_p
> +     may be executed without holding the lock during _dl_runtime_resolve
> +     (i.e. lazy symbol resolution when a function of library l is called).
> +
> +     The release MO store below synchronizes with the acquire MO load in
> +     _dl_name_match_p.  Other writes need to synchronize with that load too,
> +     however those happen either early when the process is single threaded
> +     (dl_main) or when the library is unloaded (dlclose) and the user has to
> +     synchronize library calls with unloading.  */
> +  atomic_store_release (&lastp->next, newname);
>  }
>  
>  /* Standard search directories.  */
> diff --git a/elf/dl-misc.c b/elf/dl-misc.c
> index 082f75f459..d4803bba4e 100644
> --- a/elf/dl-misc.c
> +++ b/elf/dl-misc.c
> @@ -347,7 +347,9 @@ _dl_name_match_p (const char *name, const struct link_map *map)
>      if (strcmp (name, runp->name) == 0)
>        return 1;
>      else
> -      runp = runp->next;
> +      /* Synchronize with the release MO store in add_name_to_object.
> +	 See CONCURRENCY NOTES in add_name_to_object in dl-load.c.  */
> +      runp = atomic_load_acquire (&runp->next);
>  
>    return 0;
>  }
>
  
Szabolcs Nagy April 6, 2021, 4:41 p.m. UTC | #3
The 04/01/2021 11:01, Adhemerval Zanella wrote:
> On 15/02/2021 08:56, Szabolcs Nagy via Libc-alpha wrote:
> > From: Maninder Singh <maninder1.s@samsung.com>
> > 
> > dlopen updates libname_list by writing to lastp->next, but concurrent
> > reads in _dl_name_match_p were not synchronized when it was called
> > without holding GL(dl_load_lock), which can happen during lazy symbol
> > resolution.
> > 
> > This patch fixes the race between _dl_name_match_p reading lastp->next
> > and add_name_to_object writing to it. This could cause segfault on
> > targets with weak memory order when lastp->next->name is read, which
> > was observed on an arm system. Fixes bug 21349.
> > 
> > (Code is from Maninder Singh, comments and description is from Szabolcs
> > Nagy.)
> > 
> > Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> 
> I couldn't reproduce it with the example provided in the bugzilla (on both
> aarch64 and arm machines), but the explanation and the fix sound logical.
> I guess a testcase that exercises the issue will be hard to create.
> 
> LGTM, thanks.
> 
> Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

thanks, committed at 395be7c2184645320c955b0ba214af9fa1ea9675
  

Patch

diff --git a/elf/dl-load.c b/elf/dl-load.c
index 9e2089cfaa..be54bafad5 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -438,7 +438,23 @@  add_name_to_object (struct link_map *l, const char *name)
   newname->name = memcpy (newname + 1, name, name_len);
   newname->next = NULL;
   newname->dont_free = 0;
-  lastp->next = newname;
+  /* CONCURRENCY NOTES:
+
+     Make sure the initialization of newname happens before its address is
+     read from the lastp->next store below.
+
+     GL(dl_load_lock) is held here (and by other writers, e.g. dlclose), so
+     readers of libname_list->next (e.g. _dl_check_caller or the reads above)
+     can use that for synchronization, however the read in _dl_name_match_p
+     may be executed without holding the lock during _dl_runtime_resolve
+     (i.e. lazy symbol resolution when a function of library l is called).
+
+     The release MO store below synchronizes with the acquire MO load in
+     _dl_name_match_p.  Other writes need to synchronize with that load too,
+     however those happen either early when the process is single threaded
+     (dl_main) or when the library is unloaded (dlclose) and the user has to
+     synchronize library calls with unloading.  */
+  atomic_store_release (&lastp->next, newname);
 }
 
 /* Standard search directories.  */
diff --git a/elf/dl-misc.c b/elf/dl-misc.c
index 082f75f459..d4803bba4e 100644
--- a/elf/dl-misc.c
+++ b/elf/dl-misc.c
@@ -347,7 +347,9 @@  _dl_name_match_p (const char *name, const struct link_map *map)
     if (strcmp (name, runp->name) == 0)
       return 1;
     else
-      runp = runp->next;
+      /* Synchronize with the release MO store in add_name_to_object.
+	 See CONCURRENCY NOTES in add_name_to_object in dl-load.c.  */
+      runp = atomic_load_acquire (&runp->next);
 
   return 0;
 }
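
The review notes that a test case exercising this race is hard to create. As a rough sketch of what a stress harness for the pattern might look like (not a glibc test, and not part of this patch), the following runs one lock-free reader against one appending writer. All names here (`struct node`, `struct shared`, `reader`, `run_stress`) are hypothetical, and C11 atomics plus POSIX threads are assumed in place of glibc internals.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node; name stays NULL until the writer sets it.  */
struct node
{
  const char *name;
  _Atomic (struct node *) next;
};

struct shared
{
  struct node head;
  atomic_int stop;
  long bad;   /* Written only by the reader, read after pthread_join.  */
};

/* Reader: walks the list without a lock and counts reachable nodes whose
   name is not initialized yet.  Always completes at least one pass.  */
static void *
reader (void *arg)
{
  struct shared *s = arg;
  do
    for (struct node *runp = &s->head; runp != NULL;
         runp = atomic_load_explicit (&runp->next, memory_order_acquire))
      if (runp->name == NULL)
        s->bad++;
  while (!atomic_load_explicit (&s->stop, memory_order_acquire));
  return NULL;
}

/* Writer: appends COUNT nodes while the reader runs.  Returns the number
   of uninitialized nodes the reader saw, or -1 on setup failure.  */
static long
run_stress (int count)
{
  assert (count >= 0);
  struct shared s = { .head = { "head", NULL }, .stop = 0, .bad = 0 };
  pthread_t tid;
  if (pthread_create (&tid, NULL, reader, &s) != 0)
    return -1;

  struct node *lastp = &s.head;
  int failed = 0;
  for (int i = 0; i < count; i++)
    {
      struct node *n = calloc (1, sizeof *n);
      if (n == NULL)
        {
          failed = 1;
          break;
        }
      n->name = "lib";
      atomic_init (&n->next, NULL);
      /* Publish only after the node is fully initialized.  */
      atomic_store_explicit (&lastp->next, n, memory_order_release);
      lastp = n;
    }

  atomic_store_explicit (&s.stop, 1, memory_order_release);
  pthread_join (tid, NULL);

  /* The reader has exited, so the nodes can be freed safely.  */
  struct node *p = atomic_load_explicit (&s.head.next, memory_order_relaxed);
  while (p != NULL)
    {
      struct node *next = atomic_load_explicit (&p->next, memory_order_relaxed);
      free (p);
      p = next;
    }
  return failed ? -1 : s.bad;
}
```

With the release/acquire pair in place, `run_stress` cannot report an uninitialized node. Weakening both operations would turn the read of `name` into a data race, but whether a failure is ever observed depends on hardware and timing, which is consistent with the reviewer being unable to reproduce the bug.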