elf: Assign TLS modid later during dlopen [BZ #24930]

Message ID 87lfvlw35n.fsf@oldenburg2.str.redhat.com
State Committed
Headers

Commit Message

Florian Weimer Aug. 22, 2019, 4:55 p.m. UTC
  Commit a42faf59d6d9f82e5293a9ebcc26d9c9e562b12b ("Fix BZ #16634.")
attempted to fix a TLS modid consistency issue by adding additional
checks to the open_verify function.  However, this is fragile
because open_verify cannot reliably predict whether
_dl_map_object_from_fd will later fail in the more complex cases
(such as memory allocation failures).  Therefore, this commit
assigns the TLS modid as late as possible.  At that point, the link
map pointer will eventually be passed to _dl_close, which will undo
the TLS modid assignment.

2019-08-22  Florian Weimer  <fweimer@redhat.com>

	[BZ #24930]
	* elf/dl-load.c (_dl_map_object_from_fd): Only assign TLS modid if
	the link map will be returned to the caller.
	* elf/Makefile [$(have-fpie) && $(build-shared)] (tests, tests-pie):
	Add tst-dlopen-aout-pie.
	(tst-tst-dlopen-aout-no-pie): Set.
	(CFLAGS-tst-dlopen-aout-pie.c): Build with -fpie.
	(tst-dlopen-aout-pie): Link with -ldl -lpthread.
	* elf/tst-dlopen-aout-pie.c: New file.
  

Comments

Gabriel F. T. Gomes Oct. 4, 2019, 1:08 a.m. UTC | #1
Hi, Florian,

I'm sorry that I took so long to review this patch after I mentioned I
would try to.  It was, as you have told me, indeed hard to review.

The patch looks good to me, and I made a lot of comments below in order
to explain why I think that is (if you think I got it wrong, then maybe I
need to review it again).

Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

On Thu, 22 Aug 2019, Florian Weimer wrote:

>Commit a42faf59d6d9f82e5293a9ebcc26d9c9e562b12b ("Fix BZ #16634.")
>attempted to fix a TLS modid consistency issue by adding additional
>checks to the open_verify function.  However, this is fragile
>because open_verify cannot reliably predict whether
>_dl_map_object_from_fd will later fail in the more complex cases
>(such as memory allocation failures).

>Therefore, this commit assigns the TLS modid as late as possible.

I believe this is correct, because, as far as I could see, the code
between the old and new locations (where the assignment is made) does not
rely on l->l_tls_modid being previously set.

In elf/tst-dlopen-aout.c, _dl_map_object_from_fd exits before the new
location is reached, around line 1165:

  /* dlopen of an executable is not valid because it is not possible
     to perform proper relocations, handle static TLS, or run the   
     ELF constructors.  For PIE, the check needs the dynamic        
     section, so there is another check below.  */     
  if (__glibc_unlikely (type != ET_DYN)     
      && __glibc_unlikely ((mode & __RTLD_OPENEXEC) == 0))     
    {     
      /* This object is loaded at a fixed address.  This must never 
         happen for objects loaded with dlopen.  */     
      errstring = N_("cannot dynamically load executable");         
      goto call_lose;     
    }     

In elf/tst-dlopen-aout-pie.c, _dl_map_object_from_fd also exits before the
new location is reached, around line 1205:

  /* Make sure we are not dlopen'ing an object that has the
     DF_1_NOOPEN flag set, or a PIE object.  */
  if ((__glibc_unlikely (l->l_flags_1 & DF_1_NOOPEN)
       && (mode & __RTLD_DLOPEN))
      || (__glibc_unlikely (l->l_flags_1 & DF_1_PIE)
          && __glibc_unlikely ((mode & __RTLD_OPENEXEC) == 0)))
    {
      /* We are not supposed to load this object.  Free all resources.  */
      _dl_unmap_segments (l);

      if (!l->l_libname->dont_free)
        free (l->l_libname);

      if (l->l_phdr_allocated)
        free ((void *) l->l_phdr);

      if (l->l_flags_1 & DF_1_PIE)
        errstring
          = N_("cannot dynamically load position-independent executable");
      else
        errstring = N_("shared object cannot be dlopen()ed");
      goto call_lose;
    }

(Note: I saw both calls to 'lose' in GDB (one in each test case) and these
are the exact code paths that the test case is trying to stress, but I only
noticed that after actually reaching these places with GDB).

Anyhow, even if these paths are not hit, I don't see anywhere else in
_dl_map_object_from_fd that would need l->l_tls_modid to be set earlier
than the new location, so I think this patch is good.  :)

>At that point, the link
>map pointer will eventually be passed to _dl_close, which will undo
>the TLS modid assignment.

If I understood this correctly, you are saying that _dl_close will
eventually make a call to remove_slotinfo, which removes the TLS modid
assignment.  I don't understand yet when _dl_close gets called, but I
suppose the two test cases I mentioned above will not cause this.

Still, the change looks good to me, because I don't think _dl_close will
be called before l->l_tls_modid is set (in the new location).

>-tests += tst-pie1 tst-pie2 tst-dlopen-pie
>-tests-pie += tst-pie1 tst-pie2
>+tests += tst-pie1 tst-pie2 tst-dlopen-pie tst-dlopen-aout-pie
>+tests-pie += tst-pie1 tst-pie2 tst-dlopen-aout-pie

OK, stress another use case (mentioned above).

>diff --git a/elf/dl-load.c b/elf/dl-load.c
>index 7bf0ca563e..2cc842f641 100644
>--- a/elf/dl-load.c
>+++ b/elf/dl-load.c
>@@ -1118,27 +1118,21 @@ _dl_map_object_from_fd (const char *name, const char *origname, int fd,
> 	     offset.  We will adjust it later.  */
> 	  l->l_tls_initimage = (void *) ph->p_vaddr;
> 
>-	  /* If not loading the initial set of shared libraries,
>-	     check whether we should permit loading a TLS segment.  */
>-	  if (__glibc_likely (l->l_type == lt_library)
>-	      /* If GL(dl_tls_dtv_slotinfo_list) == NULL, then rtld.c did
>-		 not set up TLS data structures, so don't use them now.  */
>-	      || __glibc_likely (GL(dl_tls_dtv_slotinfo_list) != NULL))
>-	    {
>-	      /* Assign the next available module ID.  */
>-	      l->l_tls_modid = _dl_next_tls_modid ();
>-	      break;
>-	    }
>+	  /* l->l_tls_modid is assigned below, once there is no
>+	     possibility for failure.  */

OK. Good placeholder comment. 

>+	  if (l->l_type != lt_library
>+	      && GL(dl_tls_dtv_slotinfo_list) == NULL)
>+	    {

OK, this preserves the previous code path where, after the TLS modid
assignment, the break statement would avoid executing the asserts below.

> #ifdef SHARED
>-	  /* We are loading the executable itself when the dynamic
>-	     linker was executed directly.  The setup will happen
>-	     later.  Otherwise, the TLS data structures are already
>-	     initialized, and we assigned a TLS modid above.  */
>-	  assert (l->l_prev == NULL || (mode & __RTLD_AUDIT) != 0);
>+	      /* We are loading the executable itself when the dynamic
>+		 linker was executed directly.  The setup will happen
>+		 later.  */
>+	      assert (l->l_prev == NULL || (mode & __RTLD_AUDIT) != 0);
> #else
>-	  assert (false && "TLS not initialized in static application");
>+	      assert (false && "TLS not initialized in static application");

OK, just white-space changes and the update of the comment.

>+  /* _dl_close can only eventually undo the module ID assignment (via
>+     remove_slotinfo) if this function returns a pointer to a link
>+     map.  Therefore, delay this step until all possibilities for
>+     failure have been excluded.  */
>+  if (l->l_tls_blocksize > 0

Since l was calloc'ed (in _dl_new_object), l->l_tls_blocksize is only
different than zero if it has been previously assigned, which happens in
this function in the PT_TLS case.

This correctly reflects the conditions of the old location of the
assignment of TLS modids.

>@@ -1646,17 +1652,6 @@ open_verify (const char *name, int fd,
> 	  errstring = N_("only ET_DYN and ET_EXEC can be loaded");
> 	  goto call_lose;
> 	}
>-      else if (__glibc_unlikely (ehdr->e_type == ET_EXEC
>-				 && (mode & __RTLD_OPENEXEC) == 0))
>-	{
>-	  /* BZ #16634. It is an error to dlopen ET_EXEC (unless
>-	     __RTLD_OPENEXEC is explicitly set).  We return error here
>-	     so that code in _dl_map_object_from_fd does not try to set
>-	     l_tls_modid for this module.  */
>-
>-	  errstring = N_("cannot dynamically load executable");
>-	  goto call_lose;
>-	}

As noted before, this situation is triggered in _dl_map_object_from_fd,
around line 1165, and now that the assignment of the TLS modid is done
after that line, there's no need to check for this situation here.

Cheers,
Gabriel
  
Florian Weimer Oct. 4, 2019, 9:22 a.m. UTC | #2
* Gabriel F. T. Gomes:

> The patch looks good to me, and I made a lot of comments below in order
> to explain why I think that is (if you think I got it wrong, then maybe I
> need to review it again).
>
> Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>

Thanks, your comments match may understanding when I wrote the patch.

I do not particularly like the patch, but I think it is the best we can
do given the current implementation.  Do you see an alternative?

>>At that point, the link
>>map pointer will eventually be passed to _dl_close, which will undo
>>the TLS modid assignment.
>
> If I understood this correctly, you are saying that _dl_close will
> eventually make a call to remove_slotinfo, which removes the TLS modid
> assignment.  I don't understand yet when _dl_close gets called, but I
> suppose the two test cases I mentioned above will not cause this.

No, because the link map pointer is never passed to the caller.  We fail
too early for that.  In this case, we deallocate the link map locally in
_dl_map_object_from_fd (via the call_lose label, and finally free (l) in
the lose function).  _dl_close is never called to free the new object.

Thanks,
Florian
  
Gabriel F. T. Gomes Oct. 4, 2019, 7:07 p.m. UTC | #3
On Fri, 04 Oct 2019, Florian Weimer wrote:

>I do not particularly like the patch, but I think it is the best we can
>do given the current implementation.  Do you see an alternative?

I don't, and I actually like the delaying of the assignment.  I see no
point in making the assignment only to remove it later on.
  
Florian Weimer Oct. 4, 2019, 9:06 p.m. UTC | #4
* Gabriel F. T. Gomes:

> On Fri, 04 Oct 2019, Florian Weimer wrote:
>
>>I do not particularly like the patch, but I think it is the best we can
>>do given the current implementation.  Do you see an alternative?
>
> I don't, and I actually like the delaying of the assignment.  I see no
> point in making the assignment only to remove it later on.

Thanks, I've pushed both patches now.

Florian
  

Patch

diff --git a/elf/Makefile b/elf/Makefile
index d470e41402..a62d620e20 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -308,8 +308,8 @@  test-xfail-tst-protected1b = yes
 endif
 ifeq (yesyes,$(have-fpie)$(build-shared))
 modules-names += tst-piemod1
-tests += tst-pie1 tst-pie2 tst-dlopen-pie
-tests-pie += tst-pie1 tst-pie2
+tests += tst-pie1 tst-pie2 tst-dlopen-pie tst-dlopen-aout-pie
+tests-pie += tst-pie1 tst-pie2 tst-dlopen-aout-pie
 ifeq (yes,$(have-protected-data))
 tests += vismain
 tests-pie += vismain
@@ -1267,7 +1267,11 @@  tst-leaks1-static-ENV = MALLOC_TRACE=$(objpfx)tst-leaks1-static.mtrace
 $(objpfx)tst-addr1: $(libdl)
 
 $(objpfx)tst-thrlock: $(libdl) $(shared-thread-library)
+
+tst-tst-dlopen-aout-no-pie = yes
 $(objpfx)tst-dlopen-aout: $(libdl) $(shared-thread-library)
+CFLAGS-tst-dlopen-aout-pie.c += $(pie-ccflag)
+$(objpfx)tst-dlopen-aout-pie: $(libdl) $(shared-thread-library)
 $(objpfx)tst-dlopen-aout-container: $(libdl) $(shared-thread-library)
 LDFLAGS-tst-dlopen-aout-container += -Wl,-rpath,\$$ORIGIN
 
diff --git a/elf/dl-load.c b/elf/dl-load.c
index 7bf0ca563e..2cc842f641 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1118,27 +1118,21 @@  _dl_map_object_from_fd (const char *name, const char *origname, int fd,
 	     offset.  We will adjust it later.  */
 	  l->l_tls_initimage = (void *) ph->p_vaddr;
 
-	  /* If not loading the initial set of shared libraries,
-	     check whether we should permit loading a TLS segment.  */
-	  if (__glibc_likely (l->l_type == lt_library)
-	      /* If GL(dl_tls_dtv_slotinfo_list) == NULL, then rtld.c did
-		 not set up TLS data structures, so don't use them now.  */
-	      || __glibc_likely (GL(dl_tls_dtv_slotinfo_list) != NULL))
-	    {
-	      /* Assign the next available module ID.  */
-	      l->l_tls_modid = _dl_next_tls_modid ();
-	      break;
-	    }
+	  /* l->l_tls_modid is assigned below, once there is no
+	     possibility for failure.  */
 
+	  if (l->l_type != lt_library
+	      && GL(dl_tls_dtv_slotinfo_list) == NULL)
+	    {
 #ifdef SHARED
-	  /* We are loading the executable itself when the dynamic
-	     linker was executed directly.  The setup will happen
-	     later.  Otherwise, the TLS data structures are already
-	     initialized, and we assigned a TLS modid above.  */
-	  assert (l->l_prev == NULL || (mode & __RTLD_AUDIT) != 0);
+	      /* We are loading the executable itself when the dynamic
+		 linker was executed directly.  The setup will happen
+		 later.  */
+	      assert (l->l_prev == NULL || (mode & __RTLD_AUDIT) != 0);
 #else
-	  assert (false && "TLS not initialized in static application");
+	      assert (false && "TLS not initialized in static application");
 #endif
+	    }
 	  break;
 
 	case PT_GNU_STACK:
@@ -1379,6 +1373,18 @@  cannot enable executable stack as shared object requires");
     add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
 			    + l->l_info[DT_SONAME]->d_un.d_val));
 
+  /* _dl_close can only eventually undo the module ID assignment (via
+     remove_slotinfo) if this function returns a pointer to a link
+     map.  Therefore, delay this step until all possibilities for
+     failure have been excluded.  */
+  if (l->l_tls_blocksize > 0
+      && (__glibc_likely (l->l_type == lt_library)
+	  /* If GL(dl_tls_dtv_slotinfo_list) == NULL, then rtld.c did
+	     not set up TLS data structures, so don't use them now.  */
+	  || __glibc_likely (GL(dl_tls_dtv_slotinfo_list) != NULL)))
+    /* Assign the next available module ID.  */
+    l->l_tls_modid = _dl_next_tls_modid ();
+
 #ifdef DL_AFTER_LOAD
   DL_AFTER_LOAD (l);
 #endif
@@ -1646,17 +1652,6 @@  open_verify (const char *name, int fd,
 	  errstring = N_("only ET_DYN and ET_EXEC can be loaded");
 	  goto call_lose;
 	}
-      else if (__glibc_unlikely (ehdr->e_type == ET_EXEC
-				 && (mode & __RTLD_OPENEXEC) == 0))
-	{
-	  /* BZ #16634. It is an error to dlopen ET_EXEC (unless
-	     __RTLD_OPENEXEC is explicitly set).  We return error here
-	     so that code in _dl_map_object_from_fd does not try to set
-	     l_tls_modid for this module.  */
-
-	  errstring = N_("cannot dynamically load executable");
-	  goto call_lose;
-	}
       else if (__glibc_unlikely (ehdr->e_phentsize != sizeof (ElfW(Phdr))))
 	{
 	  errstring = N_("ELF file's phentsize not the expected size");
diff --git a/elf/tst-dlopen-aout-pie.c b/elf/tst-dlopen-aout-pie.c
new file mode 100644
index 0000000000..8d2009c0f3
--- /dev/null
+++ b/elf/tst-dlopen-aout-pie.c
@@ -0,0 +1,19 @@ 
+/* Test case for BZ #16634 and BZ#24900.  PIE version.
+   Copyright (C) 2014-2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "tst-dlopen-aout.c"