Message ID | 87k1i6cah6.fsf@oldenburg2.str.redhat.com |
---|---|
State | Superseded |
Headers |
Received: (qmail 64413 invoked by alias); 11 Feb 2019 20:54:18 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <libc-alpha.sourceware.org> List-Unsubscribe: <mailto:libc-alpha-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 64405 invoked by uid 89); 11 Feb 2019 20:54:18 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_PASS autolearn=ham version=3.3.2 spammy= X-HELO: mx1.redhat.com From: Florian Weimer <fweimer@redhat.com> To: libc-alpha@sourceware.org Subject: [PATCH] nptl: Fix invalid Systemtap probe in pthread_join [BZ #24211] Date: Mon, 11 Feb 2019 21:54:13 +0100 Message-ID: <87k1i6cah6.fsf@oldenburg2.str.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain |
Commit Message
Florian Weimer
Feb. 11, 2019, 8:54 p.m. UTC
After commit f1ac7455831546e5dca0ed98fe8af2686fae7ce6 ("arm: Use "nr" constraint for Systemtap probes [BZ #24164]"), we load pd->result into a register in the probe below: /* Free the TCB. */ __free_tcb (pd); } else pd->joinid = NULL; LIBC_PROBE (pthread_join_ret, 3, threadid, result, result); However, at this point, the thread descriptor has been freed. If the thread stack does not fit into the thread stack cache, the memory will have been unmapped, and the program will crash in the probe. 2019-02-11 Florian Weimer <fweimer@redhat.com> [BZ #24211] * nptl/pthread_join_common.c (__pthread_timedjoin_ex): Do not read pd->result again after the thread descriptor has been freed.
Comments
On Feb 11 2019, Florian Weimer <fweimer@redhat.com> wrote: > diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c > index ecb78ffba5..45deba6a74 100644 > --- a/nptl/pthread_join_common.c > +++ b/nptl/pthread_join_common.c > @@ -101,7 +101,7 @@ __pthread_timedjoin_ex (pthread_t threadid, void **thread_return, > else > pd->joinid = NULL; > > - LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result); > + LIBC_PROBE (pthread_join_ret, 3, threadid, result, result); result and pd->result are not the same thing. Andreas.
On 2/11/19 3:54 PM, Florian Weimer wrote: > After commit f1ac7455831546e5dca0ed98fe8af2686fae7ce6 ("arm: Use "nr" > constraint for Systemtap probes [BZ #24164]"), we load pd->result into > a register in the probe below: > > /* Free the TCB. */ > __free_tcb (pd); > } > else > pd->joinid = NULL; > > LIBC_PROBE (pthread_join_ret, 3, threadid, result, result); > > However, at this point, the thread descriptor has been freed. If the > thread stack does not fit into the thread stack cache, the memory will > have been unmapped, and the program will crash in the probe. > > 2019-02-11 Florian Weimer <fweimer@redhat.com> > > [BZ #24211] > * nptl/pthread_join_common.c (__pthread_timedjoin_ex): Do not read > pd->result again after the thread descriptor has been freed. > > diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c > index ecb78ffba5..45deba6a74 100644 > --- a/nptl/pthread_join_common.c > +++ b/nptl/pthread_join_common.c > @@ -101,7 +101,7 @@ __pthread_timedjoin_ex (pthread_t threadid, void **thread_return, > else > pd->joinid = NULL; > > - LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result); > + LIBC_PROBE (pthread_join_ret, 3, threadid, result, result); Why not move the probe up? 77 if (block) 78 { 79 /* If abstime is NULL we switch to asynchronous cancellation. If we 80 are cancelled then the thread we are waiting for must be marked as 81 un-wait-ed for again. */ 82 pthread_cleanup_push (cleanup, &pd->joinid); 83 84 result = lll_wait_tid (pd->tid, abstime); 85 86 pthread_cleanup_pop (0); 87 } 88 LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result); ^^^^ At this point the join is complete. We have the threadid (constant). We have the result (just returned from lll_wait_tid). We have pd->result (the thread returned). We aren't waiting for anything else? 89 if (__glibc_likely (result == 0)) 90 { 91 /* We mark the thread as terminated and as joined. */ 92 pd->tid = -1; 93 94 /* Store the return value if the caller is interested. */ 95 if (thread_return != NULL) 96 *thread_return = pd->result; 97 98 /* Free the TCB. */ 99 __free_tcb (pd); 100 } 101 else 102 pd->joinid = NULL; 103 104 LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result); > > return result; > } > As an orthogonal issue the probe needs documenting in manual/probes.texi under a new "POSIX Thread Probes" section ;-)
* Carlos O'Donell: > Why not move the probe up? > > 77 if (block) > 78 { > 79 /* If abstime is NULL we switch to asynchronous cancellation. If we > 80 are cancelled then the thread we are waiting for must be marked as > 81 un-wait-ed for again. */ > 82 pthread_cleanup_push (cleanup, &pd->joinid); > 83 > 84 result = lll_wait_tid (pd->tid, abstime); > 85 > 86 pthread_cleanup_pop (0); > 87 } > 88 > > LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result); > ^^^^ I think it should come after the stack has been freed. It's explicitly documented as a return probe (“probe for pthread_join return”). > At this point the join is complete. > > We have the threadid (constant). > > We have the result (just returned from lll_wait_tid). > > We have pd->result (the thread returned). > > We aren't waiting for anything else? Something might be interested in the fact that the stack is actually gone. I'm not sure if this is a change we should make when backporting. Thanks, Florian
On 2/11/19 4:26 PM, Florian Weimer wrote: > * Carlos O'Donell: > >> Why not move the probe up? >> >> 77 if (block) >> 78 { >> 79 /* If abstime is NULL we switch to asynchronous cancellation. If we >> 80 are cancelled then the thread we are waiting for must be marked as >> 81 un-wait-ed for again. */ >> 82 pthread_cleanup_push (cleanup, &pd->joinid); >> 83 >> 84 result = lll_wait_tid (pd->tid, abstime); >> 85 >> 86 pthread_cleanup_pop (0); >> 87 } >> 88 >> >> LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result); >> ^^^^ > > I think it should come after the stack has been freed. It's explicitly > documented as a return probe (“probe for pthread_join return”). Can you tell the difference? The compiler might move all sorts of things after the probe that will happen before the final return insruction. >> At this point the join is complete. >> >> We have the threadid (constant). >> >> We have the result (just returned from lll_wait_tid). >> >> We have pd->result (the thread returned). >> >> We aren't waiting for anything else? > > Something might be interested in the fact that the stack is actually > gone. I'm not sure if this is a change we should make when backporting. Nobody guarantees the stack is gone, and in fact with the stack cache, and a future configurable stack cache, the cache may be around forever until the program exits. The probes are explicitly off the list of things which are ABI/API unless we document them as such. Therefore I think they can reasonable change in their exact triggering position. If you change anything else, like semantics or arguments then you need a new name, and you can do that if you want, because this master is development for glibc.
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c index ecb78ffba5..45deba6a74 100644 --- a/nptl/pthread_join_common.c +++ b/nptl/pthread_join_common.c @@ -101,7 +101,7 @@ __pthread_timedjoin_ex (pthread_t threadid, void **thread_return, else pd->joinid = NULL; - LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd->result); + LIBC_PROBE (pthread_join_ret, 3, threadid, result, result); return result; }