[BZ,#13862] Reuse of cached stack can cause bounds overrun of thread DTV
Commit Message
On Thu, Nov 27, 2014 at 04:51:08PM -0500, Carlos O'Donell wrote:
>
> >> Q: A dlclose() may modify GL(dl_tls_max_dtv_idx) concurrently with another
> >> thread starting up. Is it safe to read GL(dl_tls_max_dtv_idx) more than
> >> once in a function since they may differ? Does reading GL(dl_tls_max_dtv_idx)
> >> require an atomic load? We are trying to be explicit about this in glibc.
> >
> > There is a race between DTV access and dlclose:
>
> That doesn't answer my question.
>
> Alex Oliva and I went through the code and we have come to the conclusion
> that it is *not* safe to read GL(dl_tls_max_dtv_idx) more than once if the
> code doing the reading expects to get the same value. The same value is
> only guaranteed if you hold GL(dl_load_lock). However, it is safe to read
> an old value, since _dl_update_slotinfo will resize the DTV again if you need
> to access a TLS variable of a dlopen that was in-flight during thread creation.
>
> Therefore, please make the read of GL(dl_tls_max_dtv_idx) an atomic access
> to prevent the compiler from reloading this and to document that it is
> written to by other threads concurrently.
Done. I used atomic_load_acquire now.
>
> >> Q: If GL(dl_tls_max_dtv_idx) may change at any time during thread startup
> >> does it make sense to reallocate the dtv? Why not leave it uninitialized
> >> and allow _dl_update_slotinfo to reallocate it?
> >
> > Are you suggesting folding _dl_allocate_tls_init into _dl_update_slotinfo
> > and remove _dl_allocate_tls_init? I can give it a try.
>
> I had not thought of that, but it's a good idea.
>
> I don't know how you'd make it work though.
>
> I think _dl_allocate_tls_init needs to do some book keeping, but the
> allocation of the DTV could be deferred to __tls_get_addr time
> (_dl_update_slotinfo).
I tried and it doesn't work.
> >
> > There is a race condition. But it is a separate issue. They should
> > be fixed together.
>
> You are adding new code. That new code should be correct. This needs
> an atomic load.
Done. See above.
> >> Missing license.
> >
> > What is the license policy on testcases?
>
> Same license as the project with the appropriate header.
Added.
> >>> +
> >>> +#define DSO_SHARED_FILES 20
> >>> +#define DSO_OPEN_THREADS 20
> >>> +#define DSO_EXEC_THREADS 2
> >>
> >> Why these particular constants? If you chose them because
> >> they happened to work on a particular system, please comment
> >> that.
> >
> > They came from:
> >
> > https://sourceware.org/bugzilla/attachment.cgi?id=6290
>
> You don't answer the question.
>
> If you are submitting the patch you have to be ready to defend it.
>
> If you can't defend it you must at least add a comment saying the
> choices are arbitrary.
>
> /* The choices of thread count, and file counts are arbitary.
> The point is simply to run enough threads that an exiting
> thread has it's stack reused by another thread at the same
> time as new libraries have been loaded. */
Thanks for the suggestion. I copied it to the testcase.
> >> Use /* */.
> >
> > We have been using // for comments in glibc for a while. Why not here?
>
> glibc follows the GNU Coding Standard which says to use /* */?
>
Done. Changed to /**/.
> >>
> >> This test needs a verbose, minimum one paragraph, comment explaining
> >> what the test is testing.
>
> No comment?
I added one paragraph for each file.
> >>> +
> >>> + for (j = 0; j < 100; j++)
> >>
> >> Why 100 times? Please comment.
>
> Comment stating this is arbitrary.
Done.
> >>> +void
> >>> +function (void)
> >>> +{
> >>> + int i;
> >>> + for (i = 0; i < 256; i++)
> >>
> >> Why 256 times? Please comment.
> >>> + var[i] = i;
> >>> +}
> >>>
> >
> > I don't have answers for any particular values in this
> > testcase.
>
> Then add a comment saying they are arbitrary. Such that
> future readers know they are not special values.
>
I added some comments.
Here is the updated patch. OK to install?
Thanks.
H.J.
---
From f08bbe49c8820576e2149a5be44af0ce2ddb355b Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Thu, 27 Nov 2014 05:42:23 -0800
Subject: [PATCH] Resize DTV if the current DTV isn't big enough
This patch changes _dl_allocate_tls_init to resize DTV if the current DTV
isn't big enough. Tested on X86-64, x32 and ia32.
[BZ #13862]
* elf/dl-tls.c: Include <atomic.h>.
(oom): Remove #ifdef SHARED/#endif.
(_dl_static_dtv, _dl_initial_dtv): Moved before ...
(_dl_resize_dtv): This. Extracted from _dl_update_slotinfo.
(_dl_allocate_tls_init): Resize DTV if the current DTV isn't
big enough.
(_dl_update_slotinfo): Call _dl_resize_dtv to resize DTV.
* nptl/Makefile (tests): Add tst-stack4.
(modules-names): Add tst-stack4mod.
($(objpfx)tst-stack4): New.
(tst-stack4mod.sos): Likewise.
($(objpfx)tst-stack4.out): Likewise.
($(tst-stack4mod.sos)): Likewise.
(clean): Likewise.
* nptl/tst-stack4.c: New file.
* nptl/tst-stack4mod.c: Likewise.
---
ChangeLog | 20 +++++++
elf/dl-tls.c | 102 ++++++++++++++++++++-------------
nptl/Makefile | 17 +++++-
nptl/tst-stack4.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++++++
nptl/tst-stack4mod.c | 28 +++++++++
5 files changed, 283 insertions(+), 43 deletions(-)
create mode 100644 nptl/tst-stack4.c
create mode 100644 nptl/tst-stack4mod.c
Comments
On 11/27/2014 06:55 PM, H.J. Lu wrote:
> On Thu, Nov 27, 2014 at 04:51:08PM -0500, Carlos O'Donell wrote:
>>
>>>> Q: A dlclose() may modify GL(dl_tls_max_dtv_idx) concurrently with another
>>>> thread starting up. Is it safe to read GL(dl_tls_max_dtv_idx) more than
>>>> once in a function since they may differ? Does reading GL(dl_tls_max_dtv_idx)
>>>> require an atomic load? We are trying to be explicit about this in glibc.
>>>
>>> There is a race between DTV access and dlclose:
>>
>> That doesn't answer my question.
>>
>> Alex Oliva and I went through the code and we have come to the conclusion
>> that it is *not* safe to read GL(dl_tls_max_dtv_idx) more than once if the
>> code doing the reading expects to get the same value. The same value is
>> only guaranteed if you hold GL(dl_load_lock). However, it is safe to read
>> an old value, since _dl_update_slotinfo will resize the DTV again if you need
>> to access a TLS variable of a dlopen that was in-flight during thread creation.
>>
>> Therefore, please make the read of GL(dl_tls_max_dtv_idx) an atomic access
>> to prevent the compiler from reloading this and to document that it is
>> written to by other threads concurrently.
>
> Done. I used atomic_load_acquire now.
Thanks.
>>
>>>> Q: If GL(dl_tls_max_dtv_idx) may change at any time during thread startup
>>>> does it make sense to reallocate the dtv? Why not leave it uninitialized
>>>> and allow _dl_update_slotinfo to reallocate it?
>>>
>>> Are you suggesting folding _dl_allocate_tls_init into _dl_update_slotinfo
>>> and remove _dl_allocate_tls_init? I can give it a try.
>>
>> I had not thought of that, but it's a good idea.
>>
>> I don't know how you'd make it work though.
>>
>> I think _dl_allocate_tls_init needs to do some book keeping, but the
>> allocation of the DTV could be deferred to __tls_get_addr time
>> (_dl_update_slotinfo).
>
> I tried and it doesn't work.
Thanks for trying. I appreciate that.
>>>
>>> There is a race condition. But it is a separate issue. They should
>>> be fixed together.
>>
>> You are adding new code. That new code should be correct. This needs
>> an atomic load.
>
> Done. See above.
Thanks.
>>>> Missing license.
>>>
>>> What is the license policy on testcases?
>>
>> Same license as the project with the appropriate header.
>
> Added.
Thanks.
>>>>> +
>>>>> +#define DSO_SHARED_FILES 20
>>>>> +#define DSO_OPEN_THREADS 20
>>>>> +#define DSO_EXEC_THREADS 2
>>>>
>>>> Why these particular constants? If you chose them because
>>>> they happened to work on a particular system, please comment
>>>> that.
>>>
>>> They came from:
>>>
>>> https://sourceware.org/bugzilla/attachment.cgi?id=6290
>>
>> You don't answer the question.
>>
>> If you are submitting the patch you have to be ready to defend it.
>>
>> If you can't defend it you must at least add a comment saying the
>> choices are arbitrary.
>>
>> /* The choices of thread count, and file counts are arbitary.
>> The point is simply to run enough threads that an exiting
>> thread has it's stack reused by another thread at the same
>> time as new libraries have been loaded. */
>
> Thanks for the suggestion. I copied it to the testcase.
Thanks for applying.
>>>> Use /* */.
>>>
>>> We have been using // for comments in glibc for a while. Why not here?
>>
>> glibc follows the GNU Coding Standard which says to use /* */?
>>
>
> Done. Changed to /**/.
Thanks.
>>>>
>>>> This test needs a verbose, minimum one paragraph, comment explaining
>>>> what the test is testing.
>>
>> No comment?
>
> I added one paragraph for each file.
Thanks.
>>>>> +
>>>>> + for (j = 0; j < 100; j++)
>>>>
>>>> Why 100 times? Please comment.
>>
>> Comment stating this is arbitrary.
>
> Done.
>
>>>>> +void
>>>>> +function (void)
>>>>> +{
>>>>> + int i;
>>>>> + for (i = 0; i < 256; i++)
>>>>
>>>> Why 256 times? Please comment.
>>>>> + var[i] = i;
>>>>> +}
>>>>>
>>>
>>> I don't have answers for any particular values in this
>>> testcase.
>>
>> Then add a comment saying they are arbitrary. Such that
>> future readers know they are not special values.
>>
>
> I added some comments.
>
> Here is the updated patch. OK to install?
Yes, but only after you fix one typo. I made a suggested change
to the test case descriptive sentence, but it's your choice to
accept tha tor not.
Thank you for the very quick responses.
> Thanks.
>
>
> H.J.
> ---
> From f08bbe49c8820576e2149a5be44af0ce2ddb355b Mon Sep 17 00:00:00 2001
> From: "H.J. Lu" <hjl.tools@gmail.com>
> Date: Thu, 27 Nov 2014 05:42:23 -0800
> Subject: [PATCH] Resize DTV if the current DTV isn't big enough
>
> This patch changes _dl_allocate_tls_init to resize DTV if the current DTV
> isn't big enough. Tested on X86-64, x32 and ia32.
>
> [BZ #13862]
> * elf/dl-tls.c: Include <atomic.h>.
> (oom): Remove #ifdef SHARED/#endif.
> (_dl_static_dtv, _dl_initial_dtv): Moved before ...
> (_dl_resize_dtv): This. Extracted from _dl_update_slotinfo.
> (_dl_allocate_tls_init): Resize DTV if the current DTV isn't
> big enough.
> (_dl_update_slotinfo): Call _dl_resize_dtv to resize DTV.
> * nptl/Makefile (tests): Add tst-stack4.
> (modules-names): Add tst-stack4mod.
> ($(objpfx)tst-stack4): New.
> (tst-stack4mod.sos): Likewise.
> ($(objpfx)tst-stack4.out): Likewise.
> ($(tst-stack4mod.sos)): Likewise.
> (clean): Likewise.
> * nptl/tst-stack4.c: New file.
> * nptl/tst-stack4mod.c: Likewise.
> ---
> ChangeLog | 20 +++++++
> elf/dl-tls.c | 102 ++++++++++++++++++++-------------
> nptl/Makefile | 17 +++++-
> nptl/tst-stack4.c | 159 +++++++++++++++++++++++++++++++++++++++++++++++++++
> nptl/tst-stack4mod.c | 28 +++++++++
> 5 files changed, 283 insertions(+), 43 deletions(-)
> create mode 100644 nptl/tst-stack4.c
> create mode 100644 nptl/tst-stack4mod.c
>
> diff --git a/ChangeLog b/ChangeLog
> index be49dba..6a535e7 100644
> --- a/ChangeLog
> +++ b/ChangeLog
> @@ -1,3 +1,23 @@
> +2014-11-27 H.J. Lu <hongjiu.lu@intel.com>
> +
> + [BZ #13862]
> + * elf/dl-tls.c: Include <atomic.h>.
> + (oom): Remove #ifdef SHARED/#endif.
> + (_dl_static_dtv, _dl_initial_dtv): Moved before ...
> + (_dl_resize_dtv): This. Extracted from _dl_update_slotinfo.
> + (_dl_allocate_tls_init): Resize DTV if the current DTV isn't
> + big enough.
> + (_dl_update_slotinfo): Call _dl_resize_dtv to resize DTV.
> + * nptl/Makefile (tests): Add tst-stack4.
> + (modules-names): Add tst-stack4mod.
> + ($(objpfx)tst-stack4): New.
> + (tst-stack4mod.sos): Likewise.
> + ($(objpfx)tst-stack4.out): Likewise.
> + ($(tst-stack4mod.sos)): Likewise.
> + (clean): Likewise.
> + * nptl/tst-stack4.c: New file.
> + * nptl/tst-stack4mod.c: Likewise.
> +
> 2014-11-27 J. Brown <jb999@gmx.de>
>
> * sysdeps/x86/bits/string.h: Add recent CPUs.
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index 5204fda..8dd4992 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -23,6 +23,7 @@
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/param.h>
> +#include <atomic.h>
>
> #include <tls.h>
> #include <dl-tls.h>
> @@ -34,14 +35,12 @@
>
>
> /* Out-of-memory handler. */
> -#ifdef SHARED
> static void
> __attribute__ ((__noreturn__))
> oom (void)
> {
> _dl_fatal_printf ("cannot allocate memory for thread-local data: ABORT\n");
> }
> -#endif
>
>
> size_t
> @@ -397,6 +396,53 @@ _dl_allocate_tls_storage (void)
> }
>
>
> +#ifndef SHARED
> +extern dtv_t _dl_static_dtv[];
> +# define _dl_initial_dtv (&_dl_static_dtv[1])
> +#endif
> +
> +static dtv_t *
> +_dl_resize_dtv (dtv_t *dtv)
> +{
> + /* Resize the dtv. */
> + dtv_t *newp;
> + /* Load GL(dl_tls_max_dtv_idx) atomically since it may be written to by
> + other threads oncurrently. */
s/oncurrently/concurrently/g
> + size_t newsize
> + = atomic_load_acquire (&GL(dl_tls_max_dtv_idx)) + DTV_SURPLUS;
> + size_t oldsize = dtv[-1].counter;
> +
> + if (dtv == GL(dl_initial_dtv))
> + {
> + /* This is the initial dtv that was either statically allocated in
> + __libc_setup_tls or allocated during rtld startup using the
> + dl-minimal.c malloc instead of the real malloc. We can't free
> + it, we have to abandon the old storage. */
> +
> + newp = malloc ((2 + newsize) * sizeof (dtv_t));
> + if (newp == NULL)
> + oom ();
> + memcpy (newp, &dtv[-1], (2 + oldsize) * sizeof (dtv_t));
> + }
> + else
> + {
> + newp = realloc (&dtv[-1],
> + (2 + newsize) * sizeof (dtv_t));
> + if (newp == NULL)
> + oom ();
> + }
> +
> + newp[0].counter = newsize;
> +
> + /* Clear the newly allocated part. */
> + memset (newp + 2 + oldsize, '\0',
> + (newsize - oldsize) * sizeof (dtv_t));
> +
> + /* Return the generation counter. */
> + return &newp[1];
> +}
> +
> +
> void *
> internal_function
> _dl_allocate_tls_init (void *result)
> @@ -410,6 +456,16 @@ _dl_allocate_tls_init (void *result)
> size_t total = 0;
> size_t maxgen = 0;
>
> + /* Check if the current dtv is big enough. */
> + if (dtv[-1].counter < GL(dl_tls_max_dtv_idx))
> + {
> + /* Resize the dtv. */
> + dtv = _dl_resize_dtv (dtv);
> +
> + /* Install this new dtv in the thread data structures. */
> + INSTALL_DTV (result, &dtv[-1]);
> + }
> +
> /* We have to prepare the dtv for all currently loaded modules using
> TLS. For those which are dynamically loaded we add the values
> indicating deferred allocation. */
> @@ -492,11 +548,6 @@ _dl_allocate_tls (void *mem)
> rtld_hidden_def (_dl_allocate_tls)
>
>
> -#ifndef SHARED
> -extern dtv_t _dl_static_dtv[];
> -# define _dl_initial_dtv (&_dl_static_dtv[1])
> -#endif
> -
> void
> internal_function
> _dl_deallocate_tls (void *tcb, bool dealloc_tcb)
> @@ -645,41 +696,10 @@ _dl_update_slotinfo (unsigned long int req_modid)
> assert (total + cnt == modid);
> if (dtv[-1].counter < modid)
> {
> - /* Reallocate the dtv. */
> - dtv_t *newp;
> - size_t newsize = GL(dl_tls_max_dtv_idx) + DTV_SURPLUS;
> - size_t oldsize = dtv[-1].counter;
> -
> - assert (map->l_tls_modid <= newsize);
> -
> - if (dtv == GL(dl_initial_dtv))
> - {
> - /* This is the initial dtv that was allocated
> - during rtld startup using the dl-minimal.c
> - malloc instead of the real malloc. We can't
> - free it, we have to abandon the old storage. */
> -
> - newp = malloc ((2 + newsize) * sizeof (dtv_t));
> - if (newp == NULL)
> - oom ();
> - memcpy (newp, &dtv[-1], (2 + oldsize) * sizeof (dtv_t));
> - }
> - else
> - {
> - newp = realloc (&dtv[-1],
> - (2 + newsize) * sizeof (dtv_t));
> - if (newp == NULL)
> - oom ();
> - }
> -
> - newp[0].counter = newsize;
> -
> - /* Clear the newly allocated part. */
> - memset (newp + 2 + oldsize, '\0',
> - (newsize - oldsize) * sizeof (dtv_t));
> + /* Resize the dtv. */
> + dtv = _dl_resize_dtv (dtv);
>
> - /* Point dtv to the generation counter. */
> - dtv = &newp[1];
> + assert (modid <= dtv[-1].counter);
>
> /* Install this new dtv in the thread data
> structures. */
> diff --git a/nptl/Makefile b/nptl/Makefile
> index 86a1b0c..ac76596 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -254,7 +254,7 @@ tests = tst-typesizes \
> tst-exec1 tst-exec2 tst-exec3 tst-exec4 \
> tst-exit1 tst-exit2 tst-exit3 \
> tst-stdio1 tst-stdio2 \
> - tst-stack1 tst-stack2 tst-stack3 tst-pthread-getattr \
> + tst-stack1 tst-stack2 tst-stack3 tst-stack4 tst-pthread-getattr \
> tst-pthread-attr-affinity \
> tst-unload \
> tst-dlsym1 \
> @@ -311,7 +311,7 @@ endif
>
> modules-names = tst-atfork2mod tst-tls3mod tst-tls4moda tst-tls4modb \
> tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
> - tst-tls5modd tst-tls5mode tst-tls5modf \
> + tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
> tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod
> extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) tst-cleanup4aux.o
> test-extras += $(modules-names) tst-cleanup4aux
> @@ -479,6 +479,19 @@ $(objpfx)tst-stack3-mem.out: $(objpfx)tst-stack3.out
> $(evaluate-test)
> generated += tst-stack3-mem.out tst-stack3.mtrace
>
> +$(objpfx)tst-stack4: $(libdl) $(shared-thread-library)
> +tst-stack4mod.sos=$(shell for i in 0 1 2 3 4 5 6 7 8 9 10 \
> + 11 12 13 14 15 16 17 18 19; do \
> + for j in 0 1 2 3 4 5 6 7 8 9 10 \
> + 11 12 13 14 15 16 17 18 19; do \
> + echo $(objpfx)tst-stack4mod-$$i-$$j.so; \
> + done; done)
> +$(objpfx)tst-stack4.out: $(tst-stack4mod.sos)
> +$(tst-stack4mod.sos): $(objpfx)tst-stack4mod.so
> + cp -f $< $@
> +clean:
> + rm -f $(tst-stack4mod.sos)
> +
> $(objpfx)tst-cleanup4: $(objpfx)tst-cleanup4aux.o $(shared-thread-library)
> $(objpfx)tst-cleanupx4: $(objpfx)tst-cleanup4aux.o $(shared-thread-library)
>
> diff --git a/nptl/tst-stack4.c b/nptl/tst-stack4.c
> new file mode 100644
> index 0000000..344c658
> --- /dev/null
> +++ b/nptl/tst-stack4.c
> @@ -0,0 +1,159 @@
> +/* Test DTV size overflow with pthread_create and TLS in dlopened shared
> + object.
Suggest:
Test DTV size oveflow when pthread_create reuses old DTV and TLS is used
by dlopened shared object.
> + Copyright (C) 2014 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +#include <stdio.h>
> +#include <stdint.h>
> +#include <dlfcn.h>
> +#include <assert.h>
> +#include <pthread.h>
> +
> +/* The choices of thread count, and file counts are arbitary.
> + The point is simply to run enough threads that an exiting
> + thread has it's stack reused by another thread at the same
> + time as new libraries have been loaded. */
> +#define DSO_SHARED_FILES 20
> +#define DSO_OPEN_THREADS 20
> +#define DSO_EXEC_THREADS 2
> +
> +/* Used to make sure that only one thread is calling dlopen and dlclose
> + at a time. */
> +pthread_mutex_t g_lock;
> +
> +typedef void (*function) (void);
> +
> +void *
> +dso_invoke(void *dso_fun)
> +{
> + function *fun_vec = (function *) dso_fun;
> + int dso;
> +
> + for (dso = 0; dso < DSO_SHARED_FILES; dso++)
> + (*fun_vec[dso]) ();
> +
> + pthread_exit (NULL);
> +}
> +
> +void *
> +dso_process (void * p)
> +{
> + void *handle[DSO_SHARED_FILES];
> + function fun_vec[DSO_SHARED_FILES];
> + char dso_path[DSO_SHARED_FILES][100];
> + int dso;
> + uintptr_t t = (uintptr_t) p;
> +
> + /* Open DSOs and get a function. */
> + for (dso = 0; dso < DSO_SHARED_FILES; dso++)
> + {
> + sprintf (dso_path[dso], "tst-stack4mod-%i-%i.so", t, dso);
> +
> + pthread_mutex_lock (&g_lock);
> +
> + handle[dso] = dlopen (dso_path[dso], RTLD_NOW);
> + assert (handle[dso]);
> +
> + fun_vec[dso] = (function) dlsym (handle[dso], "function");
> + assert (fun_vec[dso]);
> +
> + pthread_mutex_unlock (&g_lock);
> + }
> +
> + /* Spawn workers. */
> + pthread_t thread[DSO_EXEC_THREADS];
> + int i, ret;
> + uintptr_t result = 0;
> + for (i = 0; i < DSO_EXEC_THREADS; i++)
> + {
> + pthread_mutex_lock (&g_lock);
> + ret = pthread_create (&thread[i], NULL, dso_invoke, (void *) fun_vec);
> + if (ret != 0)
> + {
> + printf ("pthread_create failed: %d\n", ret);
> + result = 1;
> + }
> + pthread_mutex_unlock (&g_lock);
> + }
> +
> + if (!result)
> + for (i = 0; i < DSO_EXEC_THREADS; i++)
> + {
> + ret = pthread_join (thread[i], NULL);
> + if (ret != 0)
> + {
> + printf ("pthread_join failed: %d\n", ret);
> + result = 1;
> + }
> + }
> +
> + /* Close all DSOs. */
> + for (dso = 0; dso < DSO_SHARED_FILES; dso++)
> + {
> + pthread_mutex_lock (&g_lock);
> + dlclose (handle[dso]);
> + pthread_mutex_unlock (&g_lock);
> + }
> +
> + /* Exit. */
> + pthread_exit ((void *) result);
> +}
> +
> +static int
> +do_test (void)
> +{
> + pthread_t thread[DSO_OPEN_THREADS];
> + int i,j;
> + int ret;
> + int result = 0;
> +
> + pthread_mutex_init (&g_lock, NULL);
> +
> + /* 100 is arbitrary here and is known to trigger PR 13862. */
> + for (j = 0; j < 100; j++)
> + {
> + for (i = 0; i < DSO_OPEN_THREADS; i++)
> + {
> + ret = pthread_create (&thread[i], NULL, dso_process,
> + (void *) (uintptr_t) i);
> + if (ret != 0)
> + {
> + printf ("pthread_create failed: %d\n", ret);
> + result = 1;
> + }
> + }
> +
> + if (result)
> + break;
> +
> + for (i = 0; i < DSO_OPEN_THREADS; i++)
> + {
> + ret = pthread_join (thread[i], NULL);
> + if (ret != 0)
> + {
> + printf ("pthread_join failed: %d\n", ret);
> + result = 1;
> + }
> + }
> + }
> +
> + return result;
> +}
> +
> +#define TEST_FUNCTION do_test ()
> +#define TIMEOUT 100
> +#include "../test-skeleton.c"
> diff --git a/nptl/tst-stack4mod.c b/nptl/tst-stack4mod.c
> new file mode 100644
> index 0000000..a75aa09
> --- /dev/null
> +++ b/nptl/tst-stack4mod.c
> @@ -0,0 +1,28 @@
> +/* This tests DTV usage with TLS in dlopened shared object.
> + Copyright (C) 2014 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <http://www.gnu.org/licenses/>. */
> +
> +/* 256 is arbitrary here and is known to trigger PR 13862. */
> +__thread int var[256] attribute_hidden = {0};
> +
> +void
> +function (void)
> +{
> + int i;
> + for (i = 0; i < sizeof (var) / sizeof (int); i++)
> + var[i] = i;
> +}
>
Cheers,
Carlos.
@@ -1,3 +1,23 @@
+2014-11-27 H.J. Lu <hongjiu.lu@intel.com>
+
+ [BZ #13862]
+ * elf/dl-tls.c: Include <atomic.h>.
+ (oom): Remove #ifdef SHARED/#endif.
+ (_dl_static_dtv, _dl_initial_dtv): Moved before ...
+ (_dl_resize_dtv): This. Extracted from _dl_update_slotinfo.
+ (_dl_allocate_tls_init): Resize DTV if the current DTV isn't
+ big enough.
+ (_dl_update_slotinfo): Call _dl_resize_dtv to resize DTV.
+ * nptl/Makefile (tests): Add tst-stack4.
+ (modules-names): Add tst-stack4mod.
+ ($(objpfx)tst-stack4): New.
+ (tst-stack4mod.sos): Likewise.
+ ($(objpfx)tst-stack4.out): Likewise.
+ ($(tst-stack4mod.sos)): Likewise.
+ (clean): Likewise.
+ * nptl/tst-stack4.c: New file.
+ * nptl/tst-stack4mod.c: Likewise.
+
2014-11-27 J. Brown <jb999@gmx.de>
* sysdeps/x86/bits/string.h: Add recent CPUs.
@@ -23,6 +23,7 @@
#include <stdlib.h>
#include <unistd.h>
#include <sys/param.h>
+#include <atomic.h>
#include <tls.h>
#include <dl-tls.h>
@@ -34,14 +35,12 @@
/* Out-of-memory handler. */
-#ifdef SHARED
static void
__attribute__ ((__noreturn__))
oom (void)
{
_dl_fatal_printf ("cannot allocate memory for thread-local data: ABORT\n");
}
-#endif
size_t
@@ -397,6 +396,53 @@ _dl_allocate_tls_storage (void)
}
+#ifndef SHARED
+extern dtv_t _dl_static_dtv[];
+# define _dl_initial_dtv (&_dl_static_dtv[1])
+#endif
+
+static dtv_t *
+_dl_resize_dtv (dtv_t *dtv)
+{
+ /* Resize the dtv. */
+ dtv_t *newp;
+ /* Load GL(dl_tls_max_dtv_idx) atomically since it may be written to by
+ other threads oncurrently. */
+ size_t newsize
+ = atomic_load_acquire (&GL(dl_tls_max_dtv_idx)) + DTV_SURPLUS;
+ size_t oldsize = dtv[-1].counter;
+
+ if (dtv == GL(dl_initial_dtv))
+ {
+ /* This is the initial dtv that was either statically allocated in
+ __libc_setup_tls or allocated during rtld startup using the
+ dl-minimal.c malloc instead of the real malloc. We can't free
+ it, we have to abandon the old storage. */
+
+ newp = malloc ((2 + newsize) * sizeof (dtv_t));
+ if (newp == NULL)
+ oom ();
+ memcpy (newp, &dtv[-1], (2 + oldsize) * sizeof (dtv_t));
+ }
+ else
+ {
+ newp = realloc (&dtv[-1],
+ (2 + newsize) * sizeof (dtv_t));
+ if (newp == NULL)
+ oom ();
+ }
+
+ newp[0].counter = newsize;
+
+ /* Clear the newly allocated part. */
+ memset (newp + 2 + oldsize, '\0',
+ (newsize - oldsize) * sizeof (dtv_t));
+
+ /* Return the generation counter. */
+ return &newp[1];
+}
+
+
void *
internal_function
_dl_allocate_tls_init (void *result)
@@ -410,6 +456,16 @@ _dl_allocate_tls_init (void *result)
size_t total = 0;
size_t maxgen = 0;
+ /* Check if the current dtv is big enough. */
+ if (dtv[-1].counter < GL(dl_tls_max_dtv_idx))
+ {
+ /* Resize the dtv. */
+ dtv = _dl_resize_dtv (dtv);
+
+ /* Install this new dtv in the thread data structures. */
+ INSTALL_DTV (result, &dtv[-1]);
+ }
+
/* We have to prepare the dtv for all currently loaded modules using
TLS. For those which are dynamically loaded we add the values
indicating deferred allocation. */
@@ -492,11 +548,6 @@ _dl_allocate_tls (void *mem)
rtld_hidden_def (_dl_allocate_tls)
-#ifndef SHARED
-extern dtv_t _dl_static_dtv[];
-# define _dl_initial_dtv (&_dl_static_dtv[1])
-#endif
-
void
internal_function
_dl_deallocate_tls (void *tcb, bool dealloc_tcb)
@@ -645,41 +696,10 @@ _dl_update_slotinfo (unsigned long int req_modid)
assert (total + cnt == modid);
if (dtv[-1].counter < modid)
{
- /* Reallocate the dtv. */
- dtv_t *newp;
- size_t newsize = GL(dl_tls_max_dtv_idx) + DTV_SURPLUS;
- size_t oldsize = dtv[-1].counter;
-
- assert (map->l_tls_modid <= newsize);
-
- if (dtv == GL(dl_initial_dtv))
- {
- /* This is the initial dtv that was allocated
- during rtld startup using the dl-minimal.c
- malloc instead of the real malloc. We can't
- free it, we have to abandon the old storage. */
-
- newp = malloc ((2 + newsize) * sizeof (dtv_t));
- if (newp == NULL)
- oom ();
- memcpy (newp, &dtv[-1], (2 + oldsize) * sizeof (dtv_t));
- }
- else
- {
- newp = realloc (&dtv[-1],
- (2 + newsize) * sizeof (dtv_t));
- if (newp == NULL)
- oom ();
- }
-
- newp[0].counter = newsize;
-
- /* Clear the newly allocated part. */
- memset (newp + 2 + oldsize, '\0',
- (newsize - oldsize) * sizeof (dtv_t));
+ /* Resize the dtv. */
+ dtv = _dl_resize_dtv (dtv);
- /* Point dtv to the generation counter. */
- dtv = &newp[1];
+ assert (modid <= dtv[-1].counter);
/* Install this new dtv in the thread data
structures. */
@@ -254,7 +254,7 @@ tests = tst-typesizes \
tst-exec1 tst-exec2 tst-exec3 tst-exec4 \
tst-exit1 tst-exit2 tst-exit3 \
tst-stdio1 tst-stdio2 \
- tst-stack1 tst-stack2 tst-stack3 tst-pthread-getattr \
+ tst-stack1 tst-stack2 tst-stack3 tst-stack4 tst-pthread-getattr \
tst-pthread-attr-affinity \
tst-unload \
tst-dlsym1 \
@@ -311,7 +311,7 @@ endif
modules-names = tst-atfork2mod tst-tls3mod tst-tls4moda tst-tls4modb \
tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
- tst-tls5modd tst-tls5mode tst-tls5modf \
+ tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
tst-_res1mod1 tst-_res1mod2 tst-execstack-mod tst-fini1mod
extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) tst-cleanup4aux.o
test-extras += $(modules-names) tst-cleanup4aux
@@ -479,6 +479,19 @@ $(objpfx)tst-stack3-mem.out: $(objpfx)tst-stack3.out
$(evaluate-test)
generated += tst-stack3-mem.out tst-stack3.mtrace
+$(objpfx)tst-stack4: $(libdl) $(shared-thread-library)
+tst-stack4mod.sos=$(shell for i in 0 1 2 3 4 5 6 7 8 9 10 \
+ 11 12 13 14 15 16 17 18 19; do \
+ for j in 0 1 2 3 4 5 6 7 8 9 10 \
+ 11 12 13 14 15 16 17 18 19; do \
+ echo $(objpfx)tst-stack4mod-$$i-$$j.so; \
+ done; done)
+$(objpfx)tst-stack4.out: $(tst-stack4mod.sos)
+$(tst-stack4mod.sos): $(objpfx)tst-stack4mod.so
+ cp -f $< $@
+clean:
+ rm -f $(tst-stack4mod.sos)
+
$(objpfx)tst-cleanup4: $(objpfx)tst-cleanup4aux.o $(shared-thread-library)
$(objpfx)tst-cleanupx4: $(objpfx)tst-cleanup4aux.o $(shared-thread-library)
new file mode 100644
@@ -0,0 +1,159 @@
+/* Test DTV size overflow with pthread_create and TLS in dlopened shared
+ object.
+ Copyright (C) 2014 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+#include <stdio.h>
+#include <stdint.h>
+#include <dlfcn.h>
+#include <assert.h>
+#include <pthread.h>
+
+/* The choices of thread count, and file counts are arbitary.
+ The point is simply to run enough threads that an exiting
+ thread has it's stack reused by another thread at the same
+ time as new libraries have been loaded. */
+#define DSO_SHARED_FILES 20
+#define DSO_OPEN_THREADS 20
+#define DSO_EXEC_THREADS 2
+
+/* Used to make sure that only one thread is calling dlopen and dlclose
+ at a time. */
+pthread_mutex_t g_lock;
+
+typedef void (*function) (void);
+
+void *
+dso_invoke(void *dso_fun)
+{
+ function *fun_vec = (function *) dso_fun;
+ int dso;
+
+ for (dso = 0; dso < DSO_SHARED_FILES; dso++)
+ (*fun_vec[dso]) ();
+
+ pthread_exit (NULL);
+}
+
+void *
+dso_process (void * p)
+{
+ void *handle[DSO_SHARED_FILES];
+ function fun_vec[DSO_SHARED_FILES];
+ char dso_path[DSO_SHARED_FILES][100];
+ int dso;
+ uintptr_t t = (uintptr_t) p;
+
+ /* Open DSOs and get a function. */
+ for (dso = 0; dso < DSO_SHARED_FILES; dso++)
+ {
+ sprintf (dso_path[dso], "tst-stack4mod-%i-%i.so", t, dso);
+
+ pthread_mutex_lock (&g_lock);
+
+ handle[dso] = dlopen (dso_path[dso], RTLD_NOW);
+ assert (handle[dso]);
+
+ fun_vec[dso] = (function) dlsym (handle[dso], "function");
+ assert (fun_vec[dso]);
+
+ pthread_mutex_unlock (&g_lock);
+ }
+
+ /* Spawn workers. */
+ pthread_t thread[DSO_EXEC_THREADS];
+ int i, ret;
+ uintptr_t result = 0;
+ for (i = 0; i < DSO_EXEC_THREADS; i++)
+ {
+ pthread_mutex_lock (&g_lock);
+ ret = pthread_create (&thread[i], NULL, dso_invoke, (void *) fun_vec);
+ if (ret != 0)
+ {
+ printf ("pthread_create failed: %d\n", ret);
+ result = 1;
+ }
+ pthread_mutex_unlock (&g_lock);
+ }
+
+ if (!result)
+ for (i = 0; i < DSO_EXEC_THREADS; i++)
+ {
+ ret = pthread_join (thread[i], NULL);
+ if (ret != 0)
+ {
+ printf ("pthread_join failed: %d\n", ret);
+ result = 1;
+ }
+ }
+
+ /* Close all DSOs. */
+ for (dso = 0; dso < DSO_SHARED_FILES; dso++)
+ {
+ pthread_mutex_lock (&g_lock);
+ dlclose (handle[dso]);
+ pthread_mutex_unlock (&g_lock);
+ }
+
+ /* Exit. */
+ pthread_exit ((void *) result);
+}
+
+static int
+do_test (void)
+{
+ pthread_t thread[DSO_OPEN_THREADS];
+ int i,j;
+ int ret;
+ int result = 0;
+
+ pthread_mutex_init (&g_lock, NULL);
+
+ /* 100 is arbitrary here and is known to trigger PR 13862. */
+ for (j = 0; j < 100; j++)
+ {
+ for (i = 0; i < DSO_OPEN_THREADS; i++)
+ {
+ ret = pthread_create (&thread[i], NULL, dso_process,
+ (void *) (uintptr_t) i);
+ if (ret != 0)
+ {
+ printf ("pthread_create failed: %d\n", ret);
+ result = 1;
+ }
+ }
+
+ if (result)
+ break;
+
+ for (i = 0; i < DSO_OPEN_THREADS; i++)
+ {
+ ret = pthread_join (thread[i], NULL);
+ if (ret != 0)
+ {
+ printf ("pthread_join failed: %d\n", ret);
+ result = 1;
+ }
+ }
+ }
+
+ return result;
+}
+
+#define TEST_FUNCTION do_test ()
+#define TIMEOUT 100
+#include "../test-skeleton.c"
new file mode 100644
@@ -0,0 +1,28 @@
+/* This tests DTV usage with TLS in dlopened shared object.
+ Copyright (C) 2014 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <http://www.gnu.org/licenses/>. */
+
+/* 256 is arbitrary here and is known to trigger PR 13862. */
+__thread int var[256] attribute_hidden = {0};
+
+void
+function (void)
+{
+ int i;
+ for (i = 0; i < sizeof (var) / sizeof (int); i++)
+ var[i] = i;
+}