[BZ,#17833] _dl_close_worker() does not release inconsistent objects.
Commit Message
Hello,
I faced with the following problem. [See test case:
https://sourceware.org/bugzilla/show_bug.cgi?id=17833]
I've a shared library that contains both undefined and unique symbols.
Then I try to call the following sequence of dlopen:
1. dlopen("./libfoo.so", RTLD_NOW)
2. dlopen("./libfoo.so", RTLD_LAZY | RTLD_GLOBAL)
First dlopen call terminates with error because of undefined symbols,
but STB_GNU_UNIQUE ones set DF_1_NODELETE flag and hence block library
in the memory.
The library goes into inconsistent state as several structures remain
uninitialized. For instance, relocations for GOT table were not performed.
By the time of second dlopen call this library looks like as it would be
fully initialized but this is not true: any call through incorrect GOT
table leads to segmentation fault.
On some systems this inconsistency triggers assertions in the dynamic
linker.
Suggested patch forces object deletion in case of dlopen() fail:
1. Clears DF_1_NODELETE flag if dlopen() fails, allowing library to be
deleted from memory.
2. For each unique symbol that is defined in this object clears
appropriate entry in _ns_unique_sym_table.
Thanks
Pavel
Comments
On 01/19/2015 09:39 PM, Pavel Kopyl wrote:
> Suggested patch forces object deletion in case of dlopen() fail:
Cc Mike. Could you add a testcase?
-Y
On 01/19/2015 01:39 PM, Pavel Kopyl wrote:
> Hello,
>
> I faced with the following problem. [See test case:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17833]
>
> I've a shared library that contains both undefined and unique
> symbols. Then I try to call the following sequence of dlopen:
>
> 1. dlopen("./libfoo.so", RTLD_NOW)
> 2. dlopen("./libfoo.so", RTLD_LAZY | RTLD_GLOBAL)
>
> First dlopen call terminates with error because of undefined symbols,
> but STB_GNU_UNIQUE ones set DF_1_NODELETE flag and hence block
> library in the memory. The library goes into inconsistent state as
> several structures remain uninitialized. For instance, relocations
> for GOT table were not performed. By the time of second dlopen call
> this library looks like as it would be fully initialized but this is
> not true: any call through incorrect GOT table leads to segmentation
> fault. On some systems this inconsistency triggers assertions in the
> dynamic linker.
>
> Suggested patch forces object deletion in case of dlopen() fail:
>
> 1. Clears DF_1_NODELETE flag if dlopen() fails, allowing library to be
> deleted from memory.
> 2. For each unique symbol that is defined in this object clears
> appropriate entry in _ns_unique_sym_table.
From a high level perspective your patch looks plausibly correct.
I expect no other threads can have possibly seen the result of the
dlopen() because it's not yet complete, and so it's OK to clear
the STB_GNU_UNIQUE symbols list because none are yet used. Did you
verify this?
The solution you've chosen looks good, failing at dlopen time and
cleaning up the resulting changes is good.
This change should go into glibc 2.22 (February when branch opens).
I don't want to make this kind of internals change in dlopen right
now without having a full release to make sure we didn't break anything.
As another reviewer points out, we need a test case for this.
Please repost a v2 with a regression test.
Good work on the patch.
Cheers,
Carlos.
commit 10e53df26cdd232bfe38b449add8cb238ee108b8
Author: Pavel Kopyl <p.kopyl@samsung.com>
Date: Mon Jan 19 18:05:02 2015 +0300
2015-01-15 Pavel Kopyl <p.kopyl@samsung.com>
Mikhail Ilin m.ilin@samsung.com
[BZ #17833]
* elf/dl-open.c (_dl_open): Forced _dl_close_worker call.
(_dl_close): default _dl_close_worker call.
* include/dlfcn.h (_dl_close_worker): Add new parameter.
@@ -108,7 +108,7 @@ remove_slotinfo (size_t idx, struct dtv_slotinfo_list *listp, size_t disp,
void
-_dl_close_worker (struct link_map *map)
+_dl_close_worker (struct link_map *map, int force)
{
/* One less direct use. */
--map->l_direct_opencount;
@@ -152,6 +152,10 @@ _dl_close_worker (struct link_map *map)
l->l_idx = idx;
maps[idx] = l;
++idx;
+
+ /* clear DF_1_NODELETE to force object deletion. */
+ if (force)
+ l->l_flags_1 &= ~DF_1_NODELETE;
}
assert (idx == nloaded);
@@ -635,6 +639,31 @@ _dl_close_worker (struct link_map *map)
}
}
+ /* Reset unique symbols if forced. */
+ if (force)
+ {
+ struct unique_sym_table *tab = &ns->_ns_unique_sym_table;
+ __rtld_lock_lock_recursive (tab->lock);
+ struct unique_sym *entries = tab->entries;
+ if (entries != NULL)
+ {
+ size_t idx, size = tab->size;
+ for (idx = 0; idx < size; ++idx)
+ {
+ /* Clear unique symbol entries
+ that belong to this object. */
+ if (entries[idx].name != NULL
+ && entries[idx].map == imap)
+ {
+ entries[idx].name = NULL;
+ entries[idx].hashval = 0;
+ tab->n_elements--;
+ }
+ }
+ }
+ __rtld_lock_unlock_recursive (tab->lock);
+ }
+
/* We can unmap all the maps at once. We determined the
start address and length when we loaded the object and
the `munmap' call does the rest. */
@@ -772,7 +801,7 @@ _dl_close (void *_map)
/* Acquire the lock. */
__rtld_lock_lock_recursive (GL(dl_load_lock));
- _dl_close_worker (map);
+ _dl_close_worker (map, 0);
__rtld_lock_unlock_recursive (GL(dl_load_lock));
}
@@ -672,7 +672,7 @@ no more namespaces available for dlmopen()"));
if ((mode & __RTLD_AUDIT) == 0)
GL(dl_tls_dtv_gaps) = true;
- _dl_close_worker (args.map);
+ _dl_close_worker (args.map, 1);
}
assert (_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT);
@@ -54,7 +54,8 @@ struct link_map;
extern void _dl_close (void *map) attribute_hidden;
/* Same as above, but without locking and safety checks for user
provided map arguments. */
-extern void _dl_close_worker (struct link_map *map) attribute_hidden;
+extern void _dl_close_worker (struct link_map *map, int force)
+ attribute_hidden;
/* Look up NAME in shared object HANDLE (which may be RTLD_DEFAULT or
RTLD_NEXT). WHO is the calling function, for RTLD_NEXT. Returns