[BZ,#17833] _dl_close_worker() does not release inconsistent objects.

Message ID 54BD4F65.2090108@samsung.com
State Superseded
Delegated to: Carlos O'Donell
Headers

Commit Message

Pavel Kopyl Jan. 19, 2015, 6:39 p.m. UTC
  Hello,

I faced with the following problem. [See test case: 
https://sourceware.org/bugzilla/show_bug.cgi?id=17833]

I've a shared library that contains both undefined and unique symbols. 
Then I try to call the following sequence of dlopen:

1. dlopen("./libfoo.so", RTLD_NOW)
2. dlopen("./libfoo.so", RTLD_LAZY | RTLD_GLOBAL)

First dlopen call terminates with error because of undefined symbols, 
but STB_GNU_UNIQUE ones set DF_1_NODELETE flag and hence block library 
in the memory.
The library goes into inconsistent state as several structures remain 
uninitialized. For instance, relocations for GOT table were not performed.
By the time of second dlopen call this library looks like as it would be 
fully initialized but this is not true: any call through incorrect GOT 
table leads to segmentation fault.
On some systems this inconsistency triggers assertions in the dynamic 
linker.

Suggested patch forces object deletion in case of dlopen() fail:

 1. Clears DF_1_NODELETE flag if dlopen() fails, allowing library to be
    deleted from memory.
 2. For each unique symbol that is defined in this object clears
    appropriate entry in _ns_unique_sym_table.


Thanks

Pavel
  

Comments

Yury Gribov Jan. 20, 2015, 6:58 a.m. UTC | #1
On 01/19/2015 09:39 PM, Pavel Kopyl wrote:
> Suggested patch forces object deletion in case of dlopen() fail:

Cc Mike. Could you add a testcase?

-Y
  
Carlos O'Donell Jan. 21, 2015, 12:52 a.m. UTC | #2
On 01/19/2015 01:39 PM, Pavel Kopyl wrote:
> Hello,
> 
> I faced with the following problem. [See test case:
> https://sourceware.org/bugzilla/show_bug.cgi?id=17833]
> 
> I've a shared library that contains both undefined and unique
> symbols. Then I try to call the following sequence of dlopen:
> 
> 1. dlopen("./libfoo.so", RTLD_NOW)
> 2. dlopen("./libfoo.so", RTLD_LAZY | RTLD_GLOBAL)
> 
> First dlopen call terminates with error because of undefined symbols,
> but STB_GNU_UNIQUE ones set DF_1_NODELETE flag and hence block
> library in the memory. The library goes into inconsistent state as
> several structures remain uninitialized. For instance, relocations
> for GOT table were not performed. By the time of second dlopen call
> this library looks like as it would be fully initialized but this is
> not true: any call through incorrect GOT table leads to segmentation
> fault. On some systems this inconsistency triggers assertions in the
> dynamic linker.
>
> Suggested patch forces object deletion in case of dlopen() fail:
> 
> 1. Clears DF_1_NODELETE flag if dlopen() fails, allowing library to be
>    deleted from memory.
> 2. For each unique symbol that is defined in this object clears
>    appropriate entry in _ns_unique_sym_table.

From a high level perspective your patch looks plausibly correct.

I expect no other threads can have possibly seen the result of the
dlopen() because it's not yet complete, and so it's OK to clear
the STB_GNU_UNIQUE symbols list because none are yet used. Did you
verify this?

The solution you've chosen looks good, failing at dlopen time and
cleaning up the resulting changes is good.

This change should go into glibc 2.22 (February when branch opens).
I don't want to make this kind of internals change in dlopen right
now without having a full release to make sure we didn't break anything.

As another reviewer points out, we need a test case for this.

Please repost a v2 with a regression test.

Good work on the patch.

Cheers,
Carlos.
  

Patch

commit 10e53df26cdd232bfe38b449add8cb238ee108b8
Author: Pavel Kopyl <p.kopyl@samsung.com>
Date:   Mon Jan 19 18:05:02 2015 +0300

    2015-01-15  Pavel Kopyl  <p.kopyl@samsung.com>
    	    Mikhail Ilin  m.ilin@samsung.com
    
    	[BZ #17833]
    	* elf/dl-open.c (_dl_open): Forced _dl_close_worker call.
    	(_dl_close): default _dl_close_worker call.
    	* include/dlfcn.h (_dl_close_worker): Add new parameter.

diff --git a/elf/dl-close.c b/elf/dl-close.c
index cf8f9e0..3728d6d 100644
--- a/elf/dl-close.c
+++ b/elf/dl-close.c
@@ -108,7 +108,7 @@  remove_slotinfo (size_t idx, struct dtv_slotinfo_list *listp, size_t disp,
 
 
 void
-_dl_close_worker (struct link_map *map)
+_dl_close_worker (struct link_map *map, int force)
 {
   /* One less direct use.  */
   --map->l_direct_opencount;
@@ -152,6 +152,10 @@  _dl_close_worker (struct link_map *map)
       l->l_idx = idx;
       maps[idx] = l;
       ++idx;
+
+      /* clear DF_1_NODELETE to force object deletion.  */
+      if (force)
+	l->l_flags_1 &= ~DF_1_NODELETE;
     }
   assert (idx == nloaded);
 
@@ -635,6 +639,31 @@  _dl_close_worker (struct link_map *map)
 		}
 	    }
 
+	  /* Reset unique symbols if forced.  */
+	  if (force)
+	    {
+	      struct unique_sym_table *tab = &ns->_ns_unique_sym_table;
+	      __rtld_lock_lock_recursive (tab->lock);
+	      struct unique_sym *entries = tab->entries;
+	      if (entries != NULL)
+		{
+		  size_t idx, size = tab->size;
+		  for (idx = 0; idx < size; ++idx)
+		    {
+		      /* Clear unique symbol entries
+		       that belong to this object.  */
+		      if (entries[idx].name != NULL
+			  && entries[idx].map == imap)
+			{
+			  entries[idx].name = NULL;
+			  entries[idx].hashval = 0;
+			  tab->n_elements--;
+			}
+		    }
+		}
+		__rtld_lock_unlock_recursive (tab->lock);
+	    }
+
 	  /* We can unmap all the maps at once.  We determined the
 	     start address and length when we loaded the object and
 	     the `munmap' call does the rest.  */
@@ -772,7 +801,7 @@  _dl_close (void *_map)
   /* Acquire the lock.  */
   __rtld_lock_lock_recursive (GL(dl_load_lock));
 
-  _dl_close_worker (map);
+  _dl_close_worker (map, 0);
 
   __rtld_lock_unlock_recursive (GL(dl_load_lock));
 }
diff --git a/elf/dl-open.c b/elf/dl-open.c
index c358fff..2e245e6 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -672,7 +672,7 @@  no more namespaces available for dlmopen()"));
 	  if ((mode & __RTLD_AUDIT) == 0)
 	    GL(dl_tls_dtv_gaps) = true;
 
-	  _dl_close_worker (args.map);
+	  _dl_close_worker (args.map, 1);
 	}
 
       assert (_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT);
diff --git a/include/dlfcn.h b/include/dlfcn.h
index a67b2e3..cc1e2c0 100644
--- a/include/dlfcn.h
+++ b/include/dlfcn.h
@@ -54,7 +54,8 @@  struct link_map;
 extern void _dl_close (void *map) attribute_hidden;
 /* Same as above, but without locking and safety checks for user
    provided map arguments.  */
-extern void _dl_close_worker (struct link_map *map) attribute_hidden;
+extern void _dl_close_worker (struct link_map *map, int force)
+    attribute_hidden;
 
 /* Look up NAME in shared object HANDLE (which may be RTLD_DEFAULT or
    RTLD_NEXT).  WHO is the calling function, for RTLD_NEXT.  Returns