Message ID | or60pqngb7.fsf@livre.home |
---|---|
State | New, archived |
Headers |
Received: (qmail 96125 invoked by alias); 21 Sep 2016 02:22:15 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 96091 invoked by uid 89); 21 Sep 2016 02:22:11 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-4.0 required=5.0 tests=BAYES_00, KAM_LAZY_DOMAIN_SECURITY, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy=brasil, Brasil, H*MI:home, free! X-HELO: mx1.redhat.com From: Alexandre Oliva <aoliva@redhat.com> To: libc-alpha@sourceware.org Subject: [PR19826] fix non-LE TLS in static programs Date: Tue, 20 Sep 2016 23:21:32 -0300 Message-ID: <or60pqngb7.fsf@livre.home> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain |
Commit Message
Alexandre Oliva
Sept. 21, 2016, 2:21 a.m. UTC
An earlier fix for TLS dropped early initialization of DTV entries for modules using static TLS, leaving it for __tls_get_addr to set them up. That worked on platforms that require the GD access model to be relaxed to LE in the main executable, but it caused a regression on platforms that allow GD in the main executable, particularly in statically-linked programs: they use a custom __tls_get_addr that does not update the DTV, which fails when the DTV early initialization is not performed. In static programs, __libc_setup_tls performs the DTV initialization for the main thread, but the DTV of other threads is set up in _dl_allocate_tls_init, so that's the fix that matters. Restoring the initialization in the remaining functions modified by this patch was just for uniformity. It's not clear that it is ever needed: even on platforms that allow GD in the main executable, the dynamically-linked version of __tls_get_addr would set up the DTV entries, even for static TLS modules, while updating the DTV counter. Thanks to Andreas Schwab for the hints and the testing on ARM, and Joseph Myers for the hint. I've regression-tested this myself on x86_64-linux-gnu, but the problem couldn't be triggered on that platform to begin with. Ok to install? for ChangeLog [BZ #19826] * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early initialization of static TLS entries. * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. * nptl/allocatestack.c (init_one_static_tls): Likewise. --- elf/dl-reloc.c | 6 ++++++ elf/dl-tls.c | 4 ++++ nptl/allocatestack.c | 9 ++++++--- 3 files changed, 16 insertions(+), 3 deletions(-)
Comments
On Sep 21 2016, Alexandre Oliva <aoliva@redhat.com> wrote: > [BZ #19826] > * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early > initialization of static TLS entries. > * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. > * nptl/allocatestack.c (init_one_static_tls): Likewise. Ok. Andreas.
On 09/20/2016 10:21 PM, Alexandre Oliva wrote: > Ok to install? > > for ChangeLog > > [BZ #19826] > * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early > initialization of static TLS entries. > * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. > * nptl/allocatestack.c (init_one_static_tls): Likewise. Alex is going to hand this patch off to me to champion and finish iterating on with the rest of the interested developers (Andreas, Torvald, Florian). I'll work up a version 2 right now and post it shortly.
On 09/29/2016 09:04 PM, Carlos O'Donell wrote: > On 09/20/2016 10:21 PM, Alexandre Oliva wrote: >> Ok to install? >> >> for ChangeLog >> >> [BZ #19826] >> * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early >> initialization of static TLS entries. >> * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. >> * nptl/allocatestack.c (init_one_static_tls): Likewise. > > Alex is going to hand this patch off to me to champion and finish > iterating on with the rest of the interested developers (Andreas, Torvald, > Florian). > > I'll work up a version 2 right now and post it shortly. What's the status here? It seems that Alexandre committed it on September 22nd, despite Torvald's objections. This patch broke non-optimized global-dynamic TLS on aarch64 (for shared builds), as shown by my new tst-tls-manydynamic test case. Florian
On 12/02/2016 10:25 AM, Florian Weimer wrote: > On 09/29/2016 09:04 PM, Carlos O'Donell wrote: >> On 09/20/2016 10:21 PM, Alexandre Oliva wrote: >>> Ok to install? >>> >>> for ChangeLog >>> >>> [BZ #19826] >>> * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early >>> initialization of static TLS entries. >>> * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. >>> * nptl/allocatestack.c (init_one_static_tls): Likewise. >> >> Alex is going to hand this patch off to me to champion and finish >> iterating on with the rest of the interested developers (Andreas, Torvald, >> Florian). >> >> I'll work up a version 2 right now and post it shortly. > > What's the status here? It seems that Alexandre committed it on > September 22nd, despite Torvald's objections. Alex should not have committed this patch without consensus. It was my intent to rework the patches to the point where they meet the community standard for P&C work. In fact I'm reviewing Szabolcs Nagy's high-level design on how to fix the P&C issues in TLS. In this case Andreas Schwab gave an ACK for the patch but Torvald had requested further clarification. Alex should have waited to resolve the issues with Torvald or asked Torvald if his objections constituted a sustained objection (blocks consensus and the patch). > This patch broke non-optimized global-dynamic TLS on aarch64 (for > shared builds), as shown by my new tst-tls-manydynamic test case. Does reverting Alex's patch fix tst-tls-manydynamic?
On 12/02/2016 05:25 PM, Carlos O'Donell wrote: >> This patch broke non-optimized global-dynamic TLS on aarch64 (for >> shared builds), as shown by my new tst-tls-manydynamic test case. > > Does reverting Alex's patch fix tst-tls-manydynamic? It's sufficient to revert the patch to nptl/allocatestack.c. I don't know if it is really needed, I and I have no idea how to make initialization at this point safe because the threads themselves could be using the DTV concurrently, so we cannot enlarge it. I filed <https://sourceware.org/bugzilla/show_bug.cgi?id=20915> so that we keep track of this regression and fix it before the release. Thanks, Florian
On 12/02/2016 11:32 AM, Florian Weimer wrote: > On 12/02/2016 05:25 PM, Carlos O'Donell wrote: > >>> This patch broke non-optimized global-dynamic TLS on aarch64 >>> (for shared builds), as shown by my new tst-tls-manydynamic test >>> case. >> >> Does reverting Alex's patch fix tst-tls-manydynamic? > > It's sufficient to revert the patch to nptl/allocatestack.c. I don't > know if it is really needed, I and I have no idea how to make > initialization at this point safe because the threads themselves > could be using the DTV concurrently, so we cannot enlarge it. > > I filed <https://sourceware.org/bugzilla/show_bug.cgi?id=20915> so > that we keep track of this regression and fix it before the release. You don't quite answer my question. If bug 20915 is a release blocker because of the new regression, and I think it should be, then we should immediately add it to the release blockers list for 2.25: https://sourceware.org/glibc/wiki/Release/2.25
On 12/02/2016 05:45 PM, Carlos O'Donell wrote: > On 12/02/2016 11:32 AM, Florian Weimer wrote: >> On 12/02/2016 05:25 PM, Carlos O'Donell wrote: >> >>>> This patch broke non-optimized global-dynamic TLS on aarch64 >>>> (for shared builds), as shown by my new tst-tls-manydynamic test >>>> case. >>> >>> Does reverting Alex's patch fix tst-tls-manydynamic? >> >> It's sufficient to revert the patch to nptl/allocatestack.c. I don't >> know if it is really needed, I and I have no idea how to make >> initialization at this point safe because the threads themselves >> could be using the DTV concurrently, so we cannot enlarge it. >> >> I filed <https://sourceware.org/bugzilla/show_bug.cgi?id=20915> so >> that we keep track of this regression and fix it before the release. > > You don't quite answer my question. Well, obviously, because as I wrote, the patch broke the tst-tls-manydynamic test case. > If bug 20915 is a release blocker because of the new regression, and > I think it should be, then we should immediately add it to the release > blockers list for 2.25: > > https://sourceware.org/glibc/wiki/Release/2.25 Thanks, I added it. Florian
On Dec 2, 2016, Florian Weimer <fweimer@redhat.com> wrote: > On 09/29/2016 09:04 PM, Carlos O'Donell wrote: >> On 09/20/2016 10:21 PM, Alexandre Oliva wrote: >>> Ok to install? >>> >>> for ChangeLog >>> >>> [BZ #19826] >>> * elf/dl-tls.c (_dl_allocate_tls_init): Restore DTV early >>> initialization of static TLS entries. >>> * elf/dl-reloc.c (_dl_nothread_init_static_tls): Likewise. >>> * nptl/allocatestack.c (init_one_static_tls): Likewise. >> >> Alex is going to hand this patch off to me to champion and finish >> iterating on with the rest of the interested developers (Andreas, Torvald, >> Florian). >> >> I'll work up a version 2 right now and post it shortly. > What's the status here? It seems that Alexandre committed it on > September 22nd, despite Torvald's objections. No, I committed the *first* of two patches, that I'd posted on 2016-09-20. https://sourceware.org/ml/libc-alpha/2016-09/msg00365.html That patch was approved by Andreas on 2016-09-21. https://sourceware.org/ml/libc-alpha/2016-09/msg00378.html Not having got any objections, I pushed it on 2016-09-22. https://sourceware.org/bugzilla/show_bug.cgi?id=19826#c14 On 2016-09-24, *I* was the first and only person to object to this first patch, when I realized it reintroduced the bug that an earlier (zeroth) patch of mine had fixed back in March, 2015. Along with the objection, I posted a second patch in this thread, containing a comment that explained why that part of the first patch I'd installed was incorrect, in more detail than what I'd incorrently taken out in the first patch, so that we wouldn't make the same mistake again. https://sourceware.org/ml/libc-alpha/2016-09/msg00512.html Here's the zeroth patch. https://sourceware.org/ml/libc-alpha/2015-03/msg00563.html Also on 2016-09-24, Torvald objected to the *second* patch, because he wasn't happy with the wording of the comments, and that I wasn't adding documentation for all concurrency issues in the TLS implementation along with it, and later than I wasn't revamping it into using atomics, and instead I just wanted to revert this small change that I had mistakenly made but later realized why I shouldn't have. https://sourceware.org/ml/libc-alpha/2016-09/msg00513.html When Carlos announced he'd pick that up, after I asked him for help because the thread was turning into yet another infinite distraction, he responded to the first patch, rather than the second. I suppose that's the source of your confusion. Maybe he was already confused about which patch was pending back then, or he just went for the top of the thread or something. https://sourceware.org/ml/libc-alpha/2016-09/msg00593.html > This patch broke non-optimized global-dynamic TLS on aarch64 (for > shared builds), as shown by my new tst-tls-manydynamic test case. Would you please try the second patch and confirm that it fixes the problem? What you're hitting is most likely the DTV resizing race condition described in the comments there, at the '(**)' note. It's quite unfortunate that this known race has been left unfixed for so long because of objections to the wording of comments.
On Dec 2, 2016, "Carlos O'Donell" <carlos@redhat.com> wrote: >> This patch broke non-optimized global-dynamic TLS on aarch64 (for >> shared builds), as shown by my new tst-tls-manydynamic test case. > Does reverting Alex's patch fix tst-tls-manydynamic? That would break TLS on static programs on platforms that don't mandate GD-to-LE relaxations again. We just have to revert the two pieces of the first patch that are broken, which is what the second patch I posted does. It was a follow up to Andreas' approval for the first. We wouldn't really have revert both of them, as Florian afonud out, but since they play similar roles in threaded and non-threaded programs, it would make more sense to keep their behavior in sync.
On Dec 2, 2016, Florian Weimer <fweimer@redhat.com> wrote: > I don't know if it is really needed, No, it's wrong, it had been fixed before. > I and I have no idea how to make initialization at this point safe It's safe to leave that initialization out. The checks for whether the DTV needs updating are "safe enough". As in, a thread will take care of its own DTV, so there won't be races with other threads. However, it's not AS-Safe: if DTV resizing is interrupted by a signal whose handler triggers another DTV resizing, we'll have trouble, as we always had. That's one of the reasons why GD TLS is not AS-Safe. Another is that dynamic TLS blocks may have to be allocated, and we use(d?) AS-Unsafe means for the allocation. There may be other reasons that don't come to mind right now. > I filed <https://sourceware.org/bugzilla/show_bug.cgi?id=20915> so > that we keep track of this regression and fix it before the release. Thanks. Carlos, please add a '[BZ #20915]' to the ChangeLog of the revised version of the second patch when you get to it.
On 12/06/2016 08:15 AM, Alexandre Oliva wrote: > On Dec 2, 2016, Florian Weimer <fweimer@redhat.com> wrote: > >> I don't know if it is really needed, > > No, it's wrong, it had been fixed before. > >> I and I have no idea how to make initialization at this point safe > > It's safe to leave that initialization out. The checks for whether the > DTV needs updating are "safe enough". As in, a thread will take care of > its own DTV, so there won't be races with other threads. But it's also called for the thread which invoked dlopen. Does _dl_update_slotinfo (called from _dl_try_allocate_static_tls) take care of that? > However, it's > not AS-Safe: if DTV resizing is interrupted by a signal whose handler > triggers another DTV resizing, we'll have trouble, as we always had. Can we block signals during DTV resizing? > That's one of the reasons why GD TLS is not AS-Safe. Another is that > dynamic TLS blocks may have to be allocated, and we use(d?) AS-Unsafe > means for the allocation. There may be other reasons that don't come to > mind right now. We still have malloc calls in there. It's not just a problem for asynchronous signals, we do not have a way to report allocation failures, either. Thanks, Florian
On 12/06/2016 07:56 AM, Alexandre Oliva wrote: > On Dec 2, 2016, Florian Weimer <fweimer@redhat.com> wrote: >> What's the status here? It seems that Alexandre committed it on >> September 22nd, despite Torvald's objections. > > No, I committed the *first* of two patches, that I'd posted on 2016-09-20. > https://sourceware.org/ml/libc-alpha/2016-09/msg00365.html I'm sorry I reconstructed the history here incorrectly. >> This patch broke non-optimized global-dynamic TLS on aarch64 (for >> shared builds), as shown by my new tst-tls-manydynamic test case. > > Would you please try the second patch and confirm that it fixes the > problem? If I revert the change to allocatestack.c, the test case passes. > What you're hitting is most likely the DTV resizing race > condition described in the comments there, at the '(**)' note. It's not even a race. The test case serializes DTV access externally, and still fails. So the comment about a race is rather misleading. Thanks, Florian
On Dec 6, 2016, Florian Weimer <fweimer@redhat.com> wrote: > On 12/06/2016 07:56 AM, Alexandre Oliva wrote: >> On Dec 2, 2016, Florian Weimer <fweimer@redhat.com> wrote: >>> What's the status here? It seems that Alexandre committed it on >>> September 22nd, despite Torvald's objections. >> >> No, I committed the *first* of two patches, that I'd posted on 2016-09-20. >> https://sourceware.org/ml/libc-alpha/2016-09/msg00365.html > I'm sorry I reconstructed the history here incorrectly. No problem, I recall the whole thing was already confusing back when I tried to explain it to Carlos for him to pick it up ;-) > If I revert the change to allocatestack.c, the test case passes. Thanks >> What you're hitting is most likely the DTV resizing race >> condition described in the comments there, at the '(**)' note. > It's not even a race. The test case serializes DTV access externally, > and still fails. So the comment about a race is rather misleading. Ah, sorry, I haven't looked at the testcase yet, but from your description, I suppose you eliminated the race. We still access the DTV past the end, though. Earlier (before the zeroth patch installed on March, 2015), we had a limit on the Static TLS modid, so that the DTV entry initialization wouldn't write past the end of the DTV. That patch lifted that limit, and removed the DTV entry initialization from that point. The first patch (2016-09-20's) mistakenly brought that code back. We have to revert that. We should probably mention the writing past the end of the DTV in the comments too.
On Dec 6, 2016, Florian Weimer <fweimer@redhat.com> wrote: > On 12/06/2016 08:15 AM, Alexandre Oliva wrote: >> On Dec 2, 2016, Florian Weimer <fweimer@redhat.com> wrote: >> >>> I don't know if it is really needed, >> >> No, it's wrong, it had been fixed before. >> >>> I and I have no idea how to make initialization at this point safe >> >> It's safe to leave that initialization out. The checks for whether the >> DTV needs updating are "safe enough". As in, a thread will take care of >> its own DTV, so there won't be races with other threads. > But it's also called for the thread which invoked dlopen. Does > _dl_update_slotinfo (called from _dl_try_allocate_static_tls) take > care of that? Yeah, exactly. The DTV update performed by the thread itself will take care of it, we don't need to set up the DTV entries (other than those for the LE module, that the code in the first patch that's not reverted in the second takes care of) as we bump the DTV generation: that will cause tls_get_addr to perform the update. >> However, it's >> not AS-Safe: if DTV resizing is interrupted by a signal whose handler >> triggers another DTV resizing, we'll have trouble, as we always had. > Can we block signals during DTV resizing? That would be a reasonable way to address this problem. >> That's one of the reasons why GD TLS is not AS-Safe. Another is that >> dynamic TLS blocks may have to be allocated, and we use(d?) AS-Unsafe >> means for the allocation. There may be other reasons that don't come to >> mind right now. > We still have malloc calls in there. *nod*. I think we could make malloc AS-Safe by having an alternate arena to fallback to, if the preferred arena is already locked, and if we realize we have to fallback to the alternate arena, we block signals before locking it and unblock them once the allocation is done. This would probably be better than using an alternate allocator for the DTV and for Static TLS and whatnot. > It's not just a problem for asynchronous signals, we do not have a way > to report allocation failures, either. Yeah, that's an essential consequence of deferred allocation of dynamic TLS blocks and of per-thread DTV resizing. We could avoid part of it by always allocating TLS blocks statically, with some additional synchronization at dlopen, but dealing with the DTV could be a lot more interesting. Ensuring one thread won't access its DTV while it's being resized could be tricky, and ensuring the accesses by the owner thread are well-defined after another thread resized and updated the DTV, rather than races under the chosen memory model, could become a performance nightmare.
On 12/06/2016 10:30 AM, Alexandre Oliva wrote: > *nod*. I think we could make malloc AS-Safe by having an alternate > arena to fallback to, if the preferred arena is already locked, and if > we realize we have to fallback to the alternate arena, we block signals > before locking it and unblock them once the allocation is done. This > would probably be better than using an alternate allocator for the DTV > and for Static TLS and whatnot. We need a separate allocator for thread-local data to get predictable performance, especially in NUMA environments. Otherwise, application could get unlucky and put TLS data structures on the same page as application data used by a completely different thread, and performance will be horrible. >> It's not just a problem for asynchronous signals, we do not have a way >> to report allocation failures, either. > > Yeah, that's an essential consequence of deferred allocation of dynamic > TLS blocks and of per-thread DTV resizing. We could avoid part of it by > always allocating TLS blocks statically, with some additional > synchronization at dlopen, but dealing with the DTV could be a lot more > interesting. But we know its size at dlopen time, right? So we could perform the allocation then and just defer installing it. Thanks, Florian
On Tue, 2016-12-06 at 04:56 -0200, Alexandre Oliva wrote: > It's quite unfortunate that this known race has been left unfixed for so > long because of objections to the wording of comments. Don't you think it would be easier to understand what's going on for all of us if the comments were better? There would likely be less need to have a long discussions -- such as the one we're having right now -- about what is what and why things are the way they are.
On 12/06/2016 03:03 AM, Florian Weimer wrote: > On 12/06/2016 07:56 AM, Alexandre Oliva wrote: >> On Dec 2, 2016, Florian Weimer <fweimer@redhat.com> wrote: >>> What's the status here? It seems that Alexandre committed it on >>> September 22nd, despite Torvald's objections. >> >> No, I committed the *first* of two patches, that I'd posted on 2016-09-20. >> https://sourceware.org/ml/libc-alpha/2016-09/msg00365.html > > I'm sorry I reconstructed the history here incorrectly. So did I. My deepest apologies Alex, you did indeed have consensus from Andreas over the first of two patches.
On Dec 6, 2016, Florian Weimer <fweimer@redhat.com> wrote: > On 12/06/2016 10:30 AM, Alexandre Oliva wrote: >> *nod*. I think we could make malloc AS-Safe by having an alternate >> arena to fallback to, if the preferred arena is already locked, and if >> we realize we have to fallback to the alternate arena, we block signals >> before locking it and unblock them once the allocation is done. This >> would probably be better than using an alternate allocator for the DTV >> and for Static TLS and whatnot. > We need a separate allocator for thread-local data to get predictable > performance, especially in NUMA environments. Sure, but that's not an argument against making malloc AS-Safe, is it? There may be an argument for making malloc behave in a different way within signal handlers, assuming allocations within them might end up being frequently used by them in other threads as well, except when allocating for TLS, because then we know the allocation is thread local. >>> It's not just a problem for asynchronous signals, we do not have a way >>> to report allocation failures, either. >> Yeah, that's an essential consequence of deferred allocation of dynamic >> TLS blocks and of per-thread DTV resizing. We could avoid part of it by >> always allocating TLS blocks statically, with some additional >> synchronization at dlopen, but dealing with the DTV could be a lot more >> interesting. > But we know its size at dlopen time, right? So we could perform the > allocation then and just defer installing it. Yeah, I suppose that may work. It could even make TLS allocations more efficient and thread-aware, say, by allocating pages and using the same layout for the DTV and for corresponding TLS blocks for all threads. I'm not sure how to make such allocations are placed efficiently in NUMA settings, but that can be worked out. Synchronization with threads created or terminated while the creation is ongoing might be an interesting nut to crack, especially in order to avoid priority inversion, but my intuition suggests none of these issues are insurmountable. I'm just a bit worried about the overhead of allocating and deallocating TLS blocks for all threads, in processes with thousands of threads, when a temporarily loaded library brings in a TLS block that is to be discarded shortly thereafter, when the library is unloaded. I don't know how common that is, or whether we want to discourage that, but it would surely be a major change to the execution profile. Not much more different than the change I made years ago that turned formerly dynamic TLS blocks into late-loaded static TLS ones, with similar consequences, and I didn't see anyone complaining about that... So maybe it's not something to worry too much over, after all. I suppose dynamic TLS not promoted to static TLS has become such an uncommon case on platforms that use TLS descriptors that it's nearly a corner case that those who end up exercising it get what they deserve ;-) (though we could offer means to adjust the initial static TLS block to enable undeserving users to avoid this undesirable scenario :-) Anyway... This is all way outside the work I'm scoped to do, or my responsibility to fixing the bugs I introduced, so I'll leave it to you all and get my focus back on the things I'm supposed to be doing.
On Dec 6, 2016, "Carlos O'Donell" <carlos@redhat.com> wrote: > My deepest apologies Alex, you did indeed have consensus from Andreas > over the first of two patches. Apologies accepted, no hard feelings, and thanks for the confirmation.
diff --git a/elf/dl-reloc.c b/elf/dl-reloc.c index 42bddc1..dcab666 100644 --- a/elf/dl-reloc.c +++ b/elf/dl-reloc.c @@ -137,6 +137,12 @@ _dl_nothread_init_static_tls (struct link_map *map) # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined" #endif + /* Fill in the DTV slot so that a later LD/GD access will find it. */ + dtv_t *dtv = THREAD_DTV (); + assert (map->l_tls_modid <= dtv[-1].counter); + dtv[map->l_tls_modid].pointer.to_free = NULL; + dtv[map->l_tls_modid].pointer.val = dest; + /* Initialize the memory. */ memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size), '\0', map->l_tls_blocksize - map->l_tls_initimage_size); diff --git a/elf/dl-tls.c b/elf/dl-tls.c index 17567ad..60f4c1d 100644 --- a/elf/dl-tls.c +++ b/elf/dl-tls.c @@ -538,6 +538,10 @@ _dl_allocate_tls_init (void *result) # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined" #endif + /* Set up the DTV entry. The simplified __tls_get_addr that + some platforms use in static programs requires it. */ + dtv[map->l_tls_modid].pointer.val = dest; + /* Copy the initialization image and clear the BSS part. */ memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size), '\0', diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c index 60b34dc..3016a2e 100644 --- a/nptl/allocatestack.c +++ b/nptl/allocatestack.c @@ -1207,9 +1207,12 @@ init_one_static_tls (struct pthread *curp, struct link_map *map) # error "Either TLS_TCB_AT_TP or TLS_DTV_AT_TP must be defined" # endif - /* We cannot delay the initialization of the Static TLS area, since - it can be accessed with LE or IE, but since the DTV is only used - by GD and LD, we can delay its update to avoid a race. */ + /* Fill in the DTV slot so that a later LD/GD access will find it. */ + dtv_t *dtv = GET_DTV (TLS_TPADJ (curp)); + dtv[map->l_tls_modid].pointer.to_free = NULL; + dtv[map->l_tls_modid].pointer.val = dest; + + /* Initialize the memory. */ memset (__mempcpy (dest, map->l_tls_initimage, map->l_tls_initimage_size), '\0', map->l_tls_blocksize - map->l_tls_initimage_size); }