diff mbox

Bug 11941: Improper assert map->l_init_called in dlclose

Message ID b754d691-9293-0130-327a-5918b04a4118@redhat.com
State Superseded

Commit Message

Carlos O'Donell Dec. 22, 2016, 5:36 a.m. UTC
The following patch fixes bug 11941 and adds a regression
test case for the particular plugin scenario that triggers
this assertion.

The simplest case of this problem can be seen in a plugin
framework which uses dlopen calls. A regression test case
looks like this:

Dependencies:
* Application Z -depends-> library Y.
* Library X -depends-> library Y.

Runtime:
* Application Z starts and configures library Y.
* Library Y dlopen's library X at the request of the application.
  * The dlopen is with RTLD_NODELETE and this is important.
* Application Z shuts down.

At this point glibc must shut down X first since it depends
on Y, but because Y dlopen'd X it also has a responsibility
to dlclose() the handle it opened.

* Library X is shut down.
* Library Y is shut down.
* Library Y's destructor calls dlclose() on X, not knowing X
  has already been shut down.

The cycle created by the X-depends->Y and Y-dlopens->X needs
to be broken for the application to have some kind of orderly
shutdown, and RTLD_NODELETE does not matter during exit
processing.

The deterministic place to break the cycle is the dlopen edge
in the dependency graph. That means we have to make it safe
to minimally call dlclose on X. We can never make it fully safe
to call functions in X, though, and library Y must expect that
libraries may already be shut down even if it dlopened them,
since the order of destructors is not guaranteed. However, a
call to dlclose is still expected to work so that the API
behaves properly.

In _dl_close we have this code:

   /* First see whether we can remove the object at all.  */
   if (__glibc_unlikely (map->l_flags_1 & DF_1_NODELETE))
     {
       assert (map->l_init_called);
       /* Nope.  Do nothing.  */
       return;
     }

When Y tries to dlclose X, X already has map->l_init_called
cleared because its destructors have already been run.

The assert here is wrong for exactly the reason shown in the
regression test case: there is a valid scenario where a library
dependency combined with dlopen results in an object that is
marked DF_1_NODELETE but has l_init_called set to zero.

The assert does not consider a DSO calling dlclose on a library
that has already been shut down.
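
To make the state concrete, here is a minimal sketch of the condition
the assert trips on. It is a toy model, not glibc code: struct toy_obj
and both helper functions are hypothetical names that only mirror the
DF_1_NODELETE bit and the l_init_called field described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two link_map fields discussed above.  */
struct toy_obj
{
  bool nodelete;     /* Models DF_1_NODELETE being set in l_flags_1.  */
  bool init_called;  /* Models l_init_called.  */
};

/* Models exit processing running the object's destructors, which
   leaves l_init_called cleared.  */
static void
toy_run_destructors (struct toy_obj *obj)
{
  assert (obj != NULL);
  obj->init_called = false;
}

/* Returns true if the removed assert (map->l_init_called) would have
   fired on the DF_1_NODELETE path for this object.  */
static bool
toy_old_assert_would_fire (const struct toy_obj *obj)
{
  assert (obj != NULL);
  return obj->nodelete && !obj->init_called;
}
```

For an object that starts with both fields set, the second function
returns false before toy_run_destructors is called and true afterwards,
which is the situation Y's destructor ends up in when it calls dlclose
on X.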

There are in fact two issues with _dl_close:

(a) map->l_flags_1 and l_direct_opencount are examined without
    dl_load_lock held, which is a data race. We could switch to
    atomic accesses, but it is not easy to audit that this is
    exactly a case of double-checked locking for both variables.
    Taking the lock at function entry is simpler, and so is
    reasoning about holding dl_load_lock.

(b) the assert is wrong because dlclose can be called on an
    object that has been destructed and it should be valid to
    do that without failure.

The patch addresses both by:

(1) Moving the locking to the start of the function.

(2) Removing the invalid assert. We still do nothing in DF_1_NODELETE
    case, but we allow a dlclose() to return success when closing
    an object that has already been destructed during exit processing.

No regressions in x86_64.

This is a real issue reported in F25, where the plugins in question are
related to kerberos authentication:
https://bugzilla.redhat.com/show_bug.cgi?id=1398370

OK for master?

2016-12-21  Carlos O'Donell  <carlos@redhat.com>

	[BZ #11941]
	* elf/dl-close.c (_dl_close): Take dl_load_lock to examine map.
	Remove assert (map->l_init_called); if DF_1_NODELETE is set.
	* elf/Makefile [ifeq (yes,$(build-shared))] (tests): Add
	tst-nodelete-dlclose.
	(modules-names): Add tst-nodelete-dlclose-dso and
	tst-nodelete-dlclose-plugin.
	($(objpfx)tst-nodelete-dlclose-dso.so): Define.
	($(objpfx)tst-nodelete-dlclose-plugin.so): Define.
	($(objpfx)tst-nodelete-dlclose): Define.
	($(objpfx)tst-nodelete-dlclose.out): Define.

---

Cheers,
Carlos.

Comments

Florian Weimer Dec. 22, 2016, 9:36 a.m. UTC | #1
On 12/22/2016 06:36 AM, Carlos O'Donell wrote:

> +  /* We must take the lock to examine the contents of map whose
> +     l_flags_1 or l_direct_opencount may be modified by concurrent
> +     dlopen calls.  */
> +  __rtld_lock_lock_recursive (GL(dl_load_lock));
> +
>    /* First see whether we can remove the object at all.  */
>    if (__glibc_unlikely (map->l_flags_1 & DF_1_NODELETE))
>      {
> -      assert (map->l_init_called);
>        /* Nope.  Do nothing.  */
> +      __rtld_lock_unlock_recursive (GL(dl_load_lock));
>        return;
>      }
>
>    if (__builtin_expect (map->l_direct_opencount, 1) == 0)
>      _dl_signal_error (0, map->l_name, NULL, N_("shared object not open"));

Missing unlock before non-local exit.
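
As an illustration only, a sketch of the shape being asked for, where
every exit path releases the lock taken at function entry. This is a
toy model under stated assumptions: toy_lock, toy_unlock, toy_dl_close
and the result enum are hypothetical, the lock is modelled as a depth
counter rather than the real __rtld_lock_* macros, and an ordinary
return stands in for the non-local exit through _dl_signal_error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of dl_load_lock: counts recursive acquisitions.  */
static int toy_lock_depth;

static void
toy_lock (void)
{
  ++toy_lock_depth;
}

static void
toy_unlock (void)
{
  assert (toy_lock_depth > 0);
  --toy_lock_depth;
}

/* Hypothetical stand-in for the link_map fields the function reads.  */
struct toy_map
{
  bool nodelete;
  unsigned int direct_opencount;
};

enum toy_result { TOY_NOTHING_TO_DO, TOY_NOT_OPEN, TOY_CLOSED };

static enum toy_result
toy_dl_close (struct toy_map *map)
{
  toy_lock ();

  if (map->nodelete)
    {
      /* Nothing to do, but the lock must still be dropped.  */
      toy_unlock ();
      return TOY_NOTHING_TO_DO;
    }

  if (map->direct_opencount == 0)
    {
      /* Release before the error exit.  In this model the error is a
         plain return; the quoted code leaves non-locally instead.  */
      toy_unlock ();
      return TOY_NOT_OPEN;
    }

  --map->direct_opencount;  /* Stands in for _dl_close_worker.  */
  toy_unlock ();
  return TOY_CLOSED;
}
```

The point of the sketch is only that toy_lock_depth is back to zero
after each of the three paths, including the "not open" one.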

The plugin should have some reference to a symbol defined by the other 
DSO, so that if ld applies --as-needed by default for some reason, the 
test still has the expected behavior.

> diff --git a/elf/tst-nodelete-dlclose-dso.c b/elf/tst-nodelete-dlclose-dso.c

> +void (*plugin_func)(void);

Missing space before parameter list.  Those variables could be static.

> +#define LIB_PLUGIN "tst-nodelete-dlclose-plugin.so"
> +
> +void
> +primary(void)

Likewise.

> +{
> +  char *error;
> +
> +  plugin_lib = dlopen (LIB_PLUGIN, RTLD_NOW | RTLD_LOCAL | RTLD_NODELETE);
> +  if (!plugin_lib)

Comparison against NULL is required by the style guide.

> +__attribute__((destructor))
> +void
> +primary_dtor(void)

Missing space before parameter list.  Function could be static.

> diff --git a/elf/tst-nodelete-dlclose-plugin.c b/elf/tst-nodelete-dlclose-plugin.c

> +plugin(void)

Likewise.

> +{
> +  printf("INFO: Calling plugin function.\n");

Likewise.

> +}
> +
> +__attribute__((destructor))

Likewise.

> +static void
> +plugin_dtor(void)

Likewise.

> +{
> +  printf("INFO: Calling plugin destructor.\n");

Likewise.

> diff --git a/elf/tst-nodelete-dlclose.c b/elf/tst-nodelete-dlclose.c

> +extern void primary(void);

Likewise.

> +
> +static int
> +do_test(void)

Likewise.

> +{
> +  printf ("INFO: Starting applicaiton.\n");

Typo: application

Thanks,
Florian

Patch

diff --git a/elf/Makefile b/elf/Makefile
index 330397e..cd26e16 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -152,7 +152,7 @@  tests += loadtest restest1 preloadtest loadfail multiload origtest resolvfail \
 	 tst-initorder tst-initorder2 tst-relsort1 tst-null-argv \
 	 tst-ptrguard1 tst-tlsalign tst-tlsalign-extern tst-nodelete-opened \
 	 tst-nodelete2 tst-audit11 tst-audit12 tst-dlsym-error tst-noload \
-	 tst-latepthread tst-tls-manydynamic
+	 tst-latepthread tst-tls-manydynamic tst-nodelete-dlclose
 #	 reldep9
 ifeq ($(build-hardcoded-path-in-tests),yes)
 tests += tst-dlopen-aout
@@ -231,7 +231,8 @@  modules-names = testobj1 testobj2 testobj3 testobj4 testobj5 testobj6 \
 		tst-tlsalign-lib tst-nodelete-opened-lib tst-nodelete2mod \
 		tst-audit11mod1 tst-audit11mod2 tst-auditmod11 \
 		tst-audit12mod1 tst-audit12mod2 tst-audit12mod3 tst-auditmod12 \
-		tst-latepthreadmod $(tst-tls-many-dynamic-modules)
+		tst-latepthreadmod $(tst-tls-many-dynamic-modules) \
+		tst-nodelete-dlclose-dso tst-nodelete-dlclose-plugin
 ifeq (yes,$(have-mtls-dialect-gnu2))
 tests += tst-gnu2-tls1
 modules-names += tst-gnu2-tls1mod
@@ -1345,3 +1346,12 @@  ifeq (no,$(nss-crypt))
 $(objpfx)tst-linkall-static: \
   $(common-objpfx)crypt/libcrypt.a
 endif
+
+# The application depends on the DSO, and the DSO loads the plugin.
+# The plugin also depends on the DSO. This creates the circular
+# dependency via dlopen that we're testing to make sure works.
+$(objpfx)tst-nodelete-dlclose-dso.so: $(libdl)
+$(objpfx)tst-nodelete-dlclose-plugin.so: $(objpfx)tst-nodelete-dlclose-dso.so
+$(objpfx)tst-nodelete-dlclose: $(objpfx)tst-nodelete-dlclose-dso.so
+$(objpfx)tst-nodelete-dlclose.out: $(objpfx)tst-nodelete-dlclose-dso.so \
+				   $(objpfx)tst-nodelete-dlclose-plugin.so
diff --git a/elf/dl-close.c b/elf/dl-close.c
index 6489703..ed532e3 100644
--- a/elf/dl-close.c
+++ b/elf/dl-close.c
@@ -805,20 +805,22 @@  _dl_close (void *_map)
 {
   struct link_map *map = _map;
 
+  /* We must take the lock to examine the contents of map whose
+     l_flags_1 or l_direct_opencount may be modified by concurrent
+     dlopen calls.  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+
   /* First see whether we can remove the object at all.  */
   if (__glibc_unlikely (map->l_flags_1 & DF_1_NODELETE))
     {
-      assert (map->l_init_called);
       /* Nope.  Do nothing.  */
+      __rtld_lock_unlock_recursive (GL(dl_load_lock));
       return;
     }
 
   if (__builtin_expect (map->l_direct_opencount, 1) == 0)
     _dl_signal_error (0, map->l_name, NULL, N_("shared object not open"));
 
-  /* Acquire the lock.  */
-  __rtld_lock_lock_recursive (GL(dl_load_lock));
-
   _dl_close_worker (map, false);
 
   __rtld_lock_unlock_recursive (GL(dl_load_lock));
diff --git a/elf/tst-nodelete-dlclose-dso.c b/elf/tst-nodelete-dlclose-dso.c
new file mode 100644
index 0000000..28a58ff
--- /dev/null
+++ b/elf/tst-nodelete-dlclose-dso.c
@@ -0,0 +1,81 @@ 
+/* Bug 11941: Improper assert map->l_init_called in dlclose.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This is the primary DSO that is loaded by the application.  This DSO
+   then loads a plugin with RTLD_NODELETE.  This plugin depends on this
+   DSO.  This dependency chain means that at application shutdown the
+   plugin will be destructed first.  Thus by the time this DSO is
+   destructed we will be calling dlclose() on an object that has already
+   been destructed.  It is allowed to call dlclose() in this way and
+   should not assert.  */
+#include <stdio.h>
+#include <stdlib.h>
+#include <dlfcn.h>
+
+/* Plugin to load.  */
+void *plugin_lib = NULL;
+/* Plugin function.  */
+void (*plugin_func)(void);
+#define LIB_PLUGIN "tst-nodelete-dlclose-plugin.so"
+
+void
+primary(void)
+{
+  char *error;
+
+  plugin_lib = dlopen (LIB_PLUGIN, RTLD_NOW | RTLD_LOCAL | RTLD_NODELETE);
+  if (!plugin_lib)
+    {
+      printf ("ERROR: Unable to load plugin library.\n");
+      exit (EXIT_FAILURE);
+    }
+
+  dlerror ();
+
+  plugin_func = (void (*)(void)) dlsym (plugin_lib, "plugin_func");
+
+  error = dlerror ();
+
+  if (error != NULL)
+    {
+      printf ("ERROR: Unable to find symbol with error \"%s\".",
+	      error);
+      exit (EXIT_FAILURE);
+    }
+
+  return;
+}
+
+__attribute__((destructor))
+void
+primary_dtor(void)
+{
+  int ret;
+  printf ("INFO: Calling primary destructor.\n");
+  /* The destructor runs in the test driver also, which
+     hasn't called primary(), in that case do nothing.  */
+  if (plugin_lib == NULL)
+    return;
+  ret = dlclose (plugin_lib);
+  if (ret != 0)
+    {
+      printf ("ERROR: Calling dlclose failed with \"%s\"\n",
+	      dlerror ());
+      exit (EXIT_FAILURE);
+    }
+}
diff --git a/elf/tst-nodelete-dlclose-plugin.c b/elf/tst-nodelete-dlclose-plugin.c
new file mode 100644
index 0000000..0159d5e
--- /dev/null
+++ b/elf/tst-nodelete-dlclose-plugin.c
@@ -0,0 +1,34 @@ 
+/* Bug 11941: Improper assert map->l_init_called in dlclose.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This DSO simulates a plugin with a dependency on the
+   primary DSO loaded by the application.  */
+#include <stdio.h>
+
+void
+plugin(void)
+{
+  printf("INFO: Calling plugin function.\n");
+}
+
+__attribute__((destructor))
+static void
+plugin_dtor(void)
+{
+  printf("INFO: Calling plugin destructor.\n");
+}
diff --git a/elf/tst-nodelete-dlclose.c b/elf/tst-nodelete-dlclose.c
new file mode 100644
index 0000000..861074e
--- /dev/null
+++ b/elf/tst-nodelete-dlclose.c
@@ -0,0 +1,35 @@ 
+/* Bug 11941: Improper assert map->l_init_called in dlclose.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This simulates an application using the primary DSO which loads the
+   plugin DSO.  */
+#include <stdio.h>
+#include <stdlib.h>
+
+extern void primary(void);
+
+static int
+do_test(void)
+{
+  printf ("INFO: Starting applicaiton.\n");
+  primary();
+  printf ("INFO: Exiting application.\n");
+  return 0;
+}
+
+#include <support/test-driver.c>