Message ID | 87zh9kuuw1.fsf@oldenburg2.str.redhat.com |
---|---|
State | Superseded |
Headers |
Return-Path: <libc-alpha-bounces@sourceware.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D0C15388A802; Wed, 3 Jun 2020 13:44:04 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D0C15388A802 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1591191844; bh=WcK39xk9ybO47PeQFslZdLD6krFvvtURlqLeBSLi2Ks=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=IXmgUUB+RJw72cag+Q3aYKwrE0+XrrAA6lIUD2HjzKQ768pj9nLUfqfK+oET1MXjI 23DKUX/xcEOitQ0JMezZ5+tfkgs/s3+oEQGR7fjTun12jDbmY2kiHmqL716BilszK7 ICGenYYihh37MvY/fE9iJaGSv8oUp4F5NPEEsuWo= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from us-smtp-1.mimecast.com (us-smtp-delivery-1.mimecast.com [207.211.31.120]) by sourceware.org (Postfix) with ESMTP id 002CE383E826 for <libc-alpha@sourceware.org>; Wed, 3 Jun 2020 13:44:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 002CE383E826 Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-203-_YgKbKRMNM6vSnHoeCDNlA-1; Wed, 03 Jun 2020 09:44:01 -0400 X-MC-Unique: _YgKbKRMNM6vSnHoeCDNlA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4829491164 for <libc-alpha@sourceware.org>; Wed, 3 Jun 2020 13:44:00 +0000 (UTC) Received: from oldenburg2.str.redhat.com (ovpn-112-181.ams2.redhat.com [10.36.112.181]) by smtp.corp.redhat.com (Postfix) with ESMTPS id BFBC7100164C for <libc-alpha@sourceware.org>; Wed, 3 Jun 2020 13:43:59 +0000 (UTC) To: libc-alpha@sourceware.org Subject: [PATCH] elf: Fix dlclose of an empty namespace in auditing mode (bug 26076) Date: Wed, 03 Jun 2020 15:43:58 +0200 Message-ID: <87zh9kuuw1.fsf@oldenburg2.str.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org> List-Unsubscribe: <http://sourceware.org/mailman/options/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe> List-Archive: <https://sourceware.org/pipermail/libc-alpha/> List-Post: <mailto:libc-alpha@sourceware.org> List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help> List-Subscribe: <http://sourceware.org/mailman/listinfo/libc-alpha>, <mailto:libc-alpha-request@sourceware.org?subject=subscribe> From: Florian Weimer via Libc-alpha <libc-alpha@sourceware.org> Reply-To: Florian Weimer <fweimer@redhat.com> Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" <libc-alpha-bounces@sourceware.org> |
Series |
elf: Fix dlclose of an empty namespace in auditing mode (bug 26076)
|
|
Commit Message
Florian Weimer
June 3, 2020, 1:43 p.m. UTC
ns->_ns_loaded is NULL if nothing has been loaded into the namespace. It seems difficult to hit this bug reliably, so this change does not come with a test case. It was trigger by accident, due to TLS exhaustion. --- elf/dl-close.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On 6/3/20 9:43 AM, Florian Weimer via Libc-alpha wrote: > ns->_ns_loaded is NULL if nothing has been loaded into the namespace. > > It seems difficult to hit this bug reliably, so this change does not > come with a test case. It was trigger by accident, due to TLS > exhaustion. I think this should fail catastrophically and quickly. > --- > elf/dl-close.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/elf/dl-close.c b/elf/dl-close.c > index 73b2817bbf..896e59e42e 100644 > --- a/elf/dl-close.c > +++ b/elf/dl-close.c > @@ -782,7 +782,7 @@ _dl_close_worker (struct link_map *map, bool force) > { > struct link_map *head = ns->_ns_loaded; > /* Do not call the functions for any auditing object. */ > - if (head->l_auditing == 0) > + if (head != NULL && head->l_auditing == 0) > { > struct audit_ifaces *afct = GLRO(dl_audit); > for (unsigned int cnt = 0; cnt < GLRO(dl_naudit); ++cnt) > Use _dl_signal_error to indicate an internal error?
* Carlos O'Donell: > On 6/3/20 9:43 AM, Florian Weimer via Libc-alpha wrote: >> ns->_ns_loaded is NULL if nothing has been loaded into the namespace. >> >> It seems difficult to hit this bug reliably, so this change does not >> come with a test case. It was trigger by accident, due to TLS >> exhaustion. > > I think this should fail catastrophically and quickly. Why should this be fatal to the process? >> elf/dl-close.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/elf/dl-close.c b/elf/dl-close.c >> index 73b2817bbf..896e59e42e 100644 >> --- a/elf/dl-close.c >> +++ b/elf/dl-close.c >> @@ -782,7 +782,7 @@ _dl_close_worker (struct link_map *map, bool force) >> { >> struct link_map *head = ns->_ns_loaded; >> /* Do not call the functions for any auditing object. */ >> - if (head->l_auditing == 0) >> + if (head != NULL && head->l_auditing == 0) >> { >> struct audit_ifaces *afct = GLRO(dl_audit); >> for (unsigned int cnt = 0; cnt < GLRO(dl_naudit); ++cnt) >> > > Use _dl_signal_error to indicate an internal error? This *is* on the cleanup path after an error already happened. This is the backtrace I saw: (gdb) bt #0 _dl_close_worker (map=<optimized out>, force=force@entry=true) at dl-close.c:785 #1 0x00007fbe468d3eb3 in _dl_open (file=0x7ffde8ed3f30 "\253\003\aE\276\177", mode=mode@entry=-1946157055, caller_dlopen=caller_dlopen@entry=0x7fbe468c2330 <dl_main>, nsid=9, nsid@entry=-1, argc=1, argv=<optimized out>, env=0x7ffde8ed4550) at dl-open.c:906 #2 0x00007fbe468c120e in dlmopen_doit (a=a@entry=0x7ffde8ed41e0) at rtld.c:660 #3 0x00007fbe468dbd5e in _dl_catch_exception ( exception=exception@entry=0x7ffde8ed4120, operate=operate@entry=0x7fbe468c11d0 <dlmopen_doit>, args=args@entry=0x7ffde8ed41e0) at dl-error-skeleton.c:208 #4 0x00007fbe468dbe03 in _dl_catch_error (objname=objname@entry=0x7ffde8ed41d0, errstring=errstring@entry=0x7ffde8ed41d8, mallocedp=mallocedp@entry=0x7ffde8ed41cf, operate=operate@entry=0x7fbe468c11d0 <dlmopen_doit>, args=args@entry=0x7ffde8ed41e0) at dl-error-skeleton.c:227 #5 0x00007fbe468c48e4 in load_audit_module (last_audit=<synthetic pointer>, name=0x7ffde8ed42b8 "tst-auditmanymod9.so") at rtld.c:973 #6 load_audit_modules (audit_list=0x7ffde8ed4220, main_map=<optimized out>) at rtld.c:1114 #7 dl_main (phdr=<optimized out>, phnum=<optimized out>, user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:1679 #8 0x00007fbe468da84b in _dl_sysdep_start ( start_argptr=start_argptr@entry=0x7ffde8ed4520, dl_main=dl_main@entry=0x7fbe468c2330 <dl_main>) at ../elf/dl-sysdep.c:252 #9 0x00007fbe468c1e81 in _dl_start_final (arg=0x7ffde8ed4520) at rtld.c:489 #10 _dl_start (arg=0x7ffde8ed4520) at rtld.c:582 #11 0x00007fbe468c1098 in _start () from /home/fweimer/src/gnu/glibc/build/elf/ld-linux-x86-64.so.2 (gdb) print exception $1 = {objname = 0x7fbe450703ab "/home/fweimer/src/gnu/glibc/build/libc.so.6", errstring = 0x7fbe45070380 "cannot allocate memory in static TLS block", message_buffer = 0x0} We cannot report another error at this point. Thanks, Florian
On 6/3/20 5:00 PM, Florian Weimer wrote: > * Carlos O'Donell: > >> On 6/3/20 9:43 AM, Florian Weimer via Libc-alpha wrote: >>> ns->_ns_loaded is NULL if nothing has been loaded into the namespace. >>> >>> It seems difficult to hit this bug reliably, so this change does not >>> come with a test case. It was trigger by accident, due to TLS >>> exhaustion. >> >> I think this should fail catastrophically and quickly. > > Why should this be fatal to the process? Your backtrace does a good job explaining how you got to this point. I appreciate it, it is very illuminating. >>> elf/dl-close.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/elf/dl-close.c b/elf/dl-close.c >>> index 73b2817bbf..896e59e42e 100644 >>> --- a/elf/dl-close.c >>> +++ b/elf/dl-close.c >>> @@ -782,7 +782,7 @@ _dl_close_worker (struct link_map *map, bool force) >>> { >>> struct link_map *head = ns->_ns_loaded; >>> /* Do not call the functions for any auditing object. */ >>> - if (head->l_auditing == 0) >>> + if (head != NULL && head->l_auditing == 0) >>> { >>> struct audit_ifaces *afct = GLRO(dl_audit); >>> for (unsigned int cnt = 0; cnt < GLRO(dl_naudit); ++cnt) >>> >> >> Use _dl_signal_error to indicate an internal error? > > This *is* on the cleanup path after an error already happened. This is > the backtrace I saw: I'm going to ignore the oddity of auditors auditing themselves, I guess it's logically consistent that the first loaded auditor won't see any of this, but that subsequently loaded auditors will. Once auditor loading fails in your case we begin unloading that object. What I don't understand though is why is GLRO(dl_naudit) non-zero? If this is the first loaded auditor we don't increment GLRO(dl_naudit) until after we return from load_audit_module, so do_audit is false, and we never get here If this is the second loaded auditor we do set do_audit to true, but this means we already dereferenced ns->_ns_loaded->l_auditing already once to compute do_audit. What has subsequently happened to ns->_ns_loaded? For it to be null means we removed *every* object from the namespace, but we just established we had at least one already loaded successfully? Why would we remove more than just the current object that failed to load? Why would the current audit modules failure result in ns->_ns_loaded to be null? I *do* see other code that checks for nulls, for example: elf/dl-object.c: 28 /* Add the new link_map NEW to the end of the namespace list. */ 29 void 30 _dl_add_to_namespace_list (struct link_map *new, Lmid_t nsid) 31 { 32 /* We modify the list of loaded objects. */ 33 __rtld_lock_lock_recursive (GL(dl_load_write_lock)); 34 35 if (GL(dl_ns)[nsid]._ns_loaded != NULL) 36 { 37 struct link_map *l = GL(dl_ns)[nsid]._ns_loaded; 38 while (l->l_next != NULL) 39 l = l->l_next; 40 new->l_prev = l; 41 /* new->l_next = NULL; Would be necessary but we use calloc. */ 42 l->l_next = new; 43 } 44 else 45 GL(dl_ns)[nsid]._ns_loaded = new; Here we check to see if _ns_loaded is non-null and so something different, but here we might expect that it's the first link map being added to this namespace. I'd like to hear what you think might be going wrong here. > (gdb) bt > #0 _dl_close_worker (map=<optimized out>, force=force@entry=true) > at dl-close.c:785 > #1 0x00007fbe468d3eb3 in _dl_open (file=0x7ffde8ed3f30 "\253\003\aE\276\177", > mode=mode@entry=-1946157055, > caller_dlopen=caller_dlopen@entry=0x7fbe468c2330 <dl_main>, nsid=9, > nsid@entry=-1, argc=1, argv=<optimized out>, env=0x7ffde8ed4550) > at dl-open.c:906 > #2 0x00007fbe468c120e in dlmopen_doit (a=a@entry=0x7ffde8ed41e0) at rtld.c:660 > #3 0x00007fbe468dbd5e in _dl_catch_exception ( > exception=exception@entry=0x7ffde8ed4120, > operate=operate@entry=0x7fbe468c11d0 <dlmopen_doit>, > args=args@entry=0x7ffde8ed41e0) at dl-error-skeleton.c:208 > #4 0x00007fbe468dbe03 in _dl_catch_error (objname=objname@entry=0x7ffde8ed41d0, > errstring=errstring@entry=0x7ffde8ed41d8, > mallocedp=mallocedp@entry=0x7ffde8ed41cf, > operate=operate@entry=0x7fbe468c11d0 <dlmopen_doit>, > args=args@entry=0x7ffde8ed41e0) at dl-error-skeleton.c:227 > #5 0x00007fbe468c48e4 in load_audit_module (last_audit=<synthetic pointer>, > name=0x7ffde8ed42b8 "tst-auditmanymod9.so") at rtld.c:973 > #6 load_audit_modules (audit_list=0x7ffde8ed4220, main_map=<optimized out>) > at rtld.c:1114 > #7 dl_main (phdr=<optimized out>, phnum=<optimized out>, > user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:1679 > #8 0x00007fbe468da84b in _dl_sysdep_start ( > start_argptr=start_argptr@entry=0x7ffde8ed4520, > dl_main=dl_main@entry=0x7fbe468c2330 <dl_main>) at ../elf/dl-sysdep.c:252 > #9 0x00007fbe468c1e81 in _dl_start_final (arg=0x7ffde8ed4520) at rtld.c:489 > #10 _dl_start (arg=0x7ffde8ed4520) at rtld.c:582 > #11 0x00007fbe468c1098 in _start () > from /home/fweimer/src/gnu/glibc/build/elf/ld-linux-x86-64.so.2 > > (gdb) print exception > $1 = {objname = 0x7fbe450703ab "/home/fweimer/src/gnu/glibc/build/libc.so.6", > errstring = 0x7fbe45070380 "cannot allocate memory in static TLS block", > message_buffer = 0x0} > > We cannot report another error at this point. Agreed. Thanks.
* Carlos O'Donell: >>>> elf/dl-close.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/elf/dl-close.c b/elf/dl-close.c >>>> index 73b2817bbf..896e59e42e 100644 >>>> --- a/elf/dl-close.c >>>> +++ b/elf/dl-close.c >>>> @@ -782,7 +782,7 @@ _dl_close_worker (struct link_map *map, bool force) >>>> { >>>> struct link_map *head = ns->_ns_loaded; >>>> /* Do not call the functions for any auditing object. */ >>>> - if (head->l_auditing == 0) >>>> + if (head != NULL && head->l_auditing == 0) >>>> { >>>> struct audit_ifaces *afct = GLRO(dl_audit); >>>> for (unsigned int cnt = 0; cnt < GLRO(dl_naudit); ++cnt) >>>> >>> >>> Use _dl_signal_error to indicate an internal error? >> >> This *is* on the cleanup path after an error already happened. This is >> the backtrace I saw: > > I'm going to ignore the oddity of auditors auditing themselves, I guess it's > logically consistent that the first loaded auditor won't see any of this, but > that subsequently loaded auditors will. > > Once auditor loading fails in your case we begin unloading that object. What I > don't understand though is why is GLRO(dl_naudit) non-zero? > > If this is the first loaded auditor we don't increment GLRO(dl_naudit) until > after we return from load_audit_module, so do_audit is false, and we never > get here Agreed. > If this is the second loaded auditor we do set do_audit to true, but this > means we already dereferenced ns->_ns_loaded->l_auditing already once to > compute do_audit. What has subsequently happened to ns->_ns_loaded? For it > to be null means we removed *every* object from the namespace, but we just > established we had at least one already loaded successfully? It looks like the namespace is not actually empty initially: (gdb) print _rtld_global._dl_ns[9] $10 = {_ns_loaded = 0x7ffff7ee7590, _ns_nloaded = 3, _ns_main_searchlist = 0x0, _ns_global_scope_alloc = 0, _ns_global_scope_pending_adds = 0, libc_map = 0x0, _ns_unique_sym_table = { lock = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 1, __spins = 0, __elision = 0, __list = { __prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 16 times>, "\001", '\000' <repeats 22 times>, __align = 0}}, entries = 0x0, size = 0, n_elements = 0, free = 0x0}, _ns_debug = {r_version = 1, r_map = 0x7ffff7ee7590, r_brk = 140737351924736, r_state = RT_CONSISTENT, r_ldbase = 140737351860224}} (gdb) print _rtld_global._dl_ns[9]._ns_loaded->l_name $13 = 0x7ffff7ee7570 "./elf/tst-auditmanymod9.so" (gdb) print _rtld_global._dl_ns[9]._ns_loaded->l_next->l_name $14 = 0x7ffff7ee7b70 "./libc.so.6" (gdb) print _rtld_global._dl_ns[9]._ns_loaded->l_next->l_next->l_name $15 = 0x7ffff7ee8140 "./elf/ld-linux-x86-64.so.2" So the commit message is wrong: the namespace doesn't start out as empty, it becomes empty during dlclose. It should be fairly easy to write a test for this. I'll post a new patch shortly. Thanks, Florian
diff --git a/elf/dl-close.c b/elf/dl-close.c index 73b2817bbf..896e59e42e 100644 --- a/elf/dl-close.c +++ b/elf/dl-close.c @@ -782,7 +782,7 @@ _dl_close_worker (struct link_map *map, bool force) { struct link_map *head = ns->_ns_loaded; /* Do not call the functions for any auditing object. */ - if (head->l_auditing == 0) + if (head != NULL && head->l_auditing == 0) { struct audit_ifaces *afct = GLRO(dl_audit); for (unsigned int cnt = 0; cnt < GLRO(dl_naudit); ++cnt)