elf: Self-dlopen failure with explict loader invocation [BZ #24900]

Message ID 87o90q61ud.fsf@oldenburg2.str.redhat.com
State Committed
Headers

Commit Message

Florian Weimer Aug. 15, 2019, 12:38 p.m. UTC
  * Carlos O'Donell:

> On 8/12/19 3:15 PM, Florian Weimer wrote:
>> In case of an explicit loader invocation, ld.so essentially performs
>> a dlopen call to load the main executable.  Since the pathname of
>> the executable is known at this point, it gets stored in the link
>> map.  In regular mode, the pathname is not known and "" is used
>> instead.
>>
>> As a result, if a program calls dlopen on the pathname of the main
>> program, the dlopen call succeeds and returns a handle for the main
>> map.  This results in an unnecessary difference between glibc
>> testing (without --enable-hardcoded-path-in-tests) and production
>> usage.
>>
>> This commit discards the names when building the link map in
>> _dl_new_object for the main executable, but it still determines
>> the origin at this point in case of an explict loader invocation.
>> The reason is that the specified pathname has to be used; the kernel
>> has a different notion of the main executable.
>>
>> Tested on x86-64 with and without --enable-hardcoded-path-in-tests.
>
> Thanks for v2 of this patch.
>
> Overall I think harmonizing the linker behaviour has long-term benefit,
> so I approve of the approach, much in the vein of static vs. dynamic tests.
>
> If the strategy is to make direct linker invocation the same as non-direct
> linker invocation then you need to test both:
>
> (a) Normal test with the test framework.
> (b) Test-in-container test running the binary directly.
>
> Could you add the test-in-container test that runs the same test but
> in the container directly under the linker in question?

Here's a new patch with another test.

Thanks,
Florian

elf: Self-dlopen failure with explict loader invocation [BZ #24900]

In case of an explicit loader invocation, ld.so essentially performs
a dlopen call to load the main executable.  Since the pathname of
the executable is known at this point, it gets stored in the link
map.  In regular mode, the pathname is not known and "" is used
instead.

As a result, if a program calls dlopen on the pathname of the main
program, the dlopen call succeeds and returns a handle for the main
map.  This results in an unnecessary difference between glibc
testing (without --enable-hardcoded-path-in-tests) and production
usage.

This commit discards the names when building the link map in
_dl_new_object for the main executable, but it still determines
the origin at this point in case of an explict loader invocation.
The reason is that the specified pathname has to be used; the kernel
has a different notion of the main executable.

2019-08-15  Florian Weimer  <fweimer@redhat.com>

	[BZ #24900]
	* elf/dl-object.c (_dl_new_object): Do not store pathnames in the
	new object in __RTLD_OPENEXEC mode (except for the origin).
	* elf/tst-dlopen-aout.c (check_dlopen_failure): New function with
	check for the error message.
	(do_test): Call it.  Add check using relative path.
	* elf/Makefile (tests-container): Add tst-dlopen-aout-container.
	(tst-dlopen-aout-container): Link with libpthread.
	(LDFLAGS-tst-dlopen-aout-container): Set RPATH to $ORIGIN.
  

Comments

Carlos O'Donell Aug. 15, 2019, 4:09 p.m. UTC | #1
On 8/15/19 8:38 AM, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> On 8/12/19 3:15 PM, Florian Weimer wrote:
>>> In case of an explicit loader invocation, ld.so essentially performs
>>> a dlopen call to load the main executable.  Since the pathname of
>>> the executable is known at this point, it gets stored in the link
>>> map.  In regular mode, the pathname is not known and "" is used
>>> instead.
>>>
>>> As a result, if a program calls dlopen on the pathname of the main
>>> program, the dlopen call succeeds and returns a handle for the main
>>> map.  This results in an unnecessary difference between glibc
>>> testing (without --enable-hardcoded-path-in-tests) and production
>>> usage.
>>>
>>> This commit discards the names when building the link map in
>>> _dl_new_object for the main executable, but it still determines
>>> the origin at this point in case of an explict loader invocation.
>>> The reason is that the specified pathname has to be used; the kernel
>>> has a different notion of the main executable.
>>>
>>> Tested on x86-64 with and without --enable-hardcoded-path-in-tests.
>>
>> Thanks for v2 of this patch.
>>
>> Overall I think harmonizing the linker behaviour has long-term benefit,
>> so I approve of the approach, much in the vein of static vs. dynamic tests.
>>
>> If the strategy is to make direct linker invocation the same as non-direct
>> linker invocation then you need to test both:
>>
>> (a) Normal test with the test framework.
>> (b) Test-in-container test running the binary directly.
>>
>> Could you add the test-in-container test that runs the same test but
>> in the container directly under the linker in question?
> 
> Here's a new patch with another test.
> 
> Thanks,
> Florian
> 
> elf: Self-dlopen failure with explict loader invocation [BZ #24900]
> 
> In case of an explicit loader invocation, ld.so essentially performs
> a dlopen call to load the main executable.  Since the pathname of
> the executable is known at this point, it gets stored in the link
> map.  In regular mode, the pathname is not known and "" is used
> instead.
> 
> As a result, if a program calls dlopen on the pathname of the main
> program, the dlopen call succeeds and returns a handle for the main
> map.  This results in an unnecessary difference between glibc
> testing (without --enable-hardcoded-path-in-tests) and production
> usage.
> 
> This commit discards the names when building the link map in
> _dl_new_object for the main executable, but it still determines
> the origin at this point in case of an explict loader invocation.
> The reason is that the specified pathname has to be used; the kernel
> has a different notion of the main executable.
> 

OK for master.

Reviewed-by: Carlos O'Donell <carlos@redhat.com>

> 2019-08-15  Florian Weimer  <fweimer@redhat.com>
> 
> 	[BZ #24900]
> 	* elf/dl-object.c (_dl_new_object): Do not store pathnames in the
> 	new object in __RTLD_OPENEXEC mode (except for the origin).
> 	* elf/tst-dlopen-aout.c (check_dlopen_failure): New function with
> 	check for the error message.
> 	(do_test): Call it.  Add check using relative path.
> 	* elf/Makefile (tests-container): Add tst-dlopen-aout-container.
> 	(tst-dlopen-aout-container): Link with libpthread.
> 	(LDFLAGS-tst-dlopen-aout-container): Set RPATH to $ORIGIN.
> 
> diff --git a/elf/Makefile b/elf/Makefile
> index e8c3458963..d470e41402 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -199,7 +199,7 @@ tests-internal += loadtest unload unload2 circleload1 \
>   	 tst-tls3 tst-tls6 tst-tls7 tst-tls8 tst-dlmopen2 \
>   	 tst-ptrguard1 tst-stackguard1 tst-libc_dlvsym \
>   	 tst-create_format1
> -tests-container += tst-pldd
> +tests-container += tst-pldd tst-dlopen-aout-container

OK.

>   test-srcs = tst-pathopt
>   selinux-enabled := $(shell cat /selinux/enforce 2> /dev/null)
>   ifneq ($(selinux-enabled),1)
> @@ -1268,6 +1268,8 @@ $(objpfx)tst-addr1: $(libdl)
>   
>   $(objpfx)tst-thrlock: $(libdl) $(shared-thread-library)
>   $(objpfx)tst-dlopen-aout: $(libdl) $(shared-thread-library)
> +$(objpfx)tst-dlopen-aout-container: $(libdl) $(shared-thread-library)
> +LDFLAGS-tst-dlopen-aout-container += -Wl,-rpath,\$$ORIGIN

OK. Use DST of $ORIGIN allows the binary to be found.

>   
>   CFLAGS-ifuncmain1pic.c += $(pic-ccflag)
>   CFLAGS-ifuncmain1picstatic.c += $(pic-ccflag)
> diff --git a/elf/dl-object.c b/elf/dl-object.c
> index c3f5455ab4..e9520583cf 100644
> --- a/elf/dl-object.c
> +++ b/elf/dl-object.c
> @@ -57,14 +57,30 @@ struct link_map *
>   _dl_new_object (char *realname, const char *libname, int type,
>   		struct link_map *loader, int mode, Lmid_t nsid)
>   {
> +#ifdef SHARED
> +  unsigned int naudit;
> +  if (__glibc_unlikely ((mode & __RTLD_OPENEXEC) != 0))
> +    {
> +      assert (type == lt_executable);
> +      assert (nsid == LM_ID_BASE);
> +
> +      /* Ignore the specified libname for the main executable.  It is
> +	 only known with an explicit loader invocation.  */
> +      libname = "";
> +
> +      /* We create the map for the executable before we know whether
> +	 we have auditing libraries and if yes, how many.  Assume the
> +	 worst.  */
> +      naudit = DL_NNS;
> +    }
> +  else
> +    naudit = GLRO (dl_naudit);
> +#endif


OK.

> +
>     size_t libname_len = strlen (libname) + 1;
>     struct link_map *new;
>     struct libname_list *newname;
>   #ifdef SHARED
> -  /* We create the map for the executable before we know whether we have
> -     auditing libraries and if yes, how many.  Assume the worst.  */
> -  unsigned int naudit = GLRO(dl_naudit) ?: ((mode & __RTLD_OPENEXEC)
> -					    ? DL_NNS : 0);
>     size_t audit_space = naudit * sizeof (new->l_audit[0]);
>   #else
>   # define audit_space 0
> @@ -91,8 +107,20 @@ _dl_new_object (char *realname, const char *libname, int type,
>        and won't get dumped during core file generation. Therefore to assist
>        gdb and to create more self-contained core files we adjust l_name to
>        point at the newly allocated copy (which will get dumped) instead of
> -     the ld.so rodata copy.  */
> -  new->l_name = *realname ? realname : (char *) newname->name + libname_len - 1;
> +     the ld.so rodata copy.
> +
> +     Furthermore, in case of explicit loader invocation, discard the
> +     name of the main executable, to match the regular behavior, where
> +     name of the executable is not known.  */
> +#ifdef SHARED
> +  if (*realname != '\0' && (mode & __RTLD_OPENEXEC) == 0)
> +#else
> +  if (*realname != '\0')
> +#endif
> +    new->l_name = realname;
> +  else
> +    new->l_name = (char *) newname->name + libname_len - 1;
> +

OK.

>     new->l_type = type;
>     /* If we set the bit now since we know it is never used we avoid
>        dirtying the cache line later.  */
> @@ -149,7 +177,14 @@ _dl_new_object (char *realname, const char *libname, int type,
>   
>     new->l_local_scope[0] = &new->l_searchlist;
>   
> -  /* Don't try to find the origin for the main map which has the name "".  */
> +  /* Determine the origin.  If allocating the link map for the main
> +     executable, the realname is not known and "".  In this case, the
> +     origin needs to be determined by other means.  However, in case
> +     of an explicit loader invocation, the pathname of the main
> +     executable is known and needs to be processed here: From the
> +     point of view of the kernel, the main executable is the
> +     dynamic loader, and this would lead to a computation of the wrong
> +     origin.  */

OK.

>     if (realname[0] != '\0')
>       {
>         size_t realname_len = strlen (realname) + 1;
> diff --git a/elf/tst-dlopen-aout-container.c b/elf/tst-dlopen-aout-container.c
> new file mode 100644
> index 0000000000..9b9f86133d
> --- /dev/null
> +++ b/elf/tst-dlopen-aout-container.c
> @@ -0,0 +1,19 @@
> +/* Test case for BZ #16634 and BZ#24900.  Container version.
> +   Copyright (C) 2014-2019 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include "tst-dlopen-aout.c"

OK. Perfect, exactly what I was thinking.

> diff --git a/elf/tst-dlopen-aout.c b/elf/tst-dlopen-aout.c
> index 25cfe2f740..3e39fc6067 100644
> --- a/elf/tst-dlopen-aout.c
> +++ b/elf/tst-dlopen-aout.c
> @@ -1,7 +1,8 @@
> -/* Test case for BZ #16634.
> +/* Test case for BZ #16634 and BZ#24900.
>   
>      Verify that incorrectly dlopen()ing an executable without
> -   __RTLD_OPENEXEC does not cause assertion in ld.so.
> +   __RTLD_OPENEXEC does not cause assertion in ld.so, and that it
> +   actually results in an error.

OK.

>   
>      Copyright (C) 2014-2019 Free Software Foundation, Inc.
>      This file is part of the GNU C Library.
> @@ -24,6 +25,8 @@
>   #include <pthread.h>
>   #include <stdio.h>
>   #include <stdlib.h>
> +#include <string.h>
> +#include <support/check.h>

OK.

>   #include <support/support.h>
>   #include <support/xthread.h>
>   
> @@ -35,28 +38,35 @@ fn (void *p)
>     return p;
>   }
>   
> +/* Call dlopen on PATH and check that fails with an error message
> +   indicating an attempt to open an ET_EXEC or PIE object.  */
> +static void
> +check_dlopen_failure (const char *path)
> +{
> +  void *handle = dlopen (path, RTLD_LAZY);
> +  if (handle != NULL)
> +    FAIL_EXIT1 ("dlopen succeeded unexpectedly: %s", path);
> +
> +  const char *message = dlerror ();
> +  TEST_VERIFY_EXIT (message != NULL);
> +  if ((strstr (message,
> +	       "cannot dynamically load position-independent executable")
> +       == NULL)
> +      && strstr (message, "cannot dynamically load executable") == NULL)
> +    FAIL_EXIT1 ("invalid dlopen error message: \"%s\"", message);
> +}

OK.

> +
>   static int
>   do_test (int argc, char *argv[])
>   {
>     int j;
>   
> -  /* Use the full path so that the dynamic loader does not recognize
> -     the main program as already loaded (even with an explicit ld.so
> -     invocation).  */
> -  char *path = xasprintf ("%s/%s", support_objdir_root, "tst-dlopen-aout");
> -  printf ("info: dlopen object: %s\n", path);
> -
>     for (j = 0; j < 100; ++j)
>       {
>         pthread_t thr;
> -      void *p;
>   
> -      p = dlopen (path, RTLD_LAZY);
> -      if (p != NULL)
> -        {
> -          puts ("error: dlopen succeeded unexpectedly");
> -          return 1;
> -        }
> +      check_dlopen_failure (argv[0]);
> +
>         /* We create threads to force TLS allocation, which triggers
>   	 the original bug i.e. running out of surplus slotinfo entries
>   	 for TLS.  */
> @@ -64,7 +74,10 @@ do_test (int argc, char *argv[])
>         xpthread_join (thr);
>       }
>   
> -  free (path);
> +  /* The elf subdirectory (or $ORIGIN in the container case) is on the
> +     library search path.  */
> +  check_dlopen_failure ("tst-dlopen-aout");

OK.

> +
>     return 0;
>   }
>   
>
  

Patch

diff --git a/elf/Makefile b/elf/Makefile
index e8c3458963..d470e41402 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -199,7 +199,7 @@  tests-internal += loadtest unload unload2 circleload1 \
 	 tst-tls3 tst-tls6 tst-tls7 tst-tls8 tst-dlmopen2 \
 	 tst-ptrguard1 tst-stackguard1 tst-libc_dlvsym \
 	 tst-create_format1
-tests-container += tst-pldd
+tests-container += tst-pldd tst-dlopen-aout-container
 test-srcs = tst-pathopt
 selinux-enabled := $(shell cat /selinux/enforce 2> /dev/null)
 ifneq ($(selinux-enabled),1)
@@ -1268,6 +1268,8 @@  $(objpfx)tst-addr1: $(libdl)
 
 $(objpfx)tst-thrlock: $(libdl) $(shared-thread-library)
 $(objpfx)tst-dlopen-aout: $(libdl) $(shared-thread-library)
+$(objpfx)tst-dlopen-aout-container: $(libdl) $(shared-thread-library)
+LDFLAGS-tst-dlopen-aout-container += -Wl,-rpath,\$$ORIGIN
 
 CFLAGS-ifuncmain1pic.c += $(pic-ccflag)
 CFLAGS-ifuncmain1picstatic.c += $(pic-ccflag)
diff --git a/elf/dl-object.c b/elf/dl-object.c
index c3f5455ab4..e9520583cf 100644
--- a/elf/dl-object.c
+++ b/elf/dl-object.c
@@ -57,14 +57,30 @@  struct link_map *
 _dl_new_object (char *realname, const char *libname, int type,
 		struct link_map *loader, int mode, Lmid_t nsid)
 {
+#ifdef SHARED
+  unsigned int naudit;
+  if (__glibc_unlikely ((mode & __RTLD_OPENEXEC) != 0))
+    {
+      assert (type == lt_executable);
+      assert (nsid == LM_ID_BASE);
+
+      /* Ignore the specified libname for the main executable.  It is
+	 only known with an explicit loader invocation.  */
+      libname = "";
+
+      /* We create the map for the executable before we know whether
+	 we have auditing libraries and if yes, how many.  Assume the
+	 worst.  */
+      naudit = DL_NNS;
+    }
+  else
+    naudit = GLRO (dl_naudit);
+#endif
+
   size_t libname_len = strlen (libname) + 1;
   struct link_map *new;
   struct libname_list *newname;
 #ifdef SHARED
-  /* We create the map for the executable before we know whether we have
-     auditing libraries and if yes, how many.  Assume the worst.  */
-  unsigned int naudit = GLRO(dl_naudit) ?: ((mode & __RTLD_OPENEXEC)
-					    ? DL_NNS : 0);
   size_t audit_space = naudit * sizeof (new->l_audit[0]);
 #else
 # define audit_space 0
@@ -91,8 +107,20 @@  _dl_new_object (char *realname, const char *libname, int type,
      and won't get dumped during core file generation. Therefore to assist
      gdb and to create more self-contained core files we adjust l_name to
      point at the newly allocated copy (which will get dumped) instead of
-     the ld.so rodata copy.  */
-  new->l_name = *realname ? realname : (char *) newname->name + libname_len - 1;
+     the ld.so rodata copy.
+
+     Furthermore, in case of explicit loader invocation, discard the
+     name of the main executable, to match the regular behavior, where
+     name of the executable is not known.  */
+#ifdef SHARED
+  if (*realname != '\0' && (mode & __RTLD_OPENEXEC) == 0)
+#else
+  if (*realname != '\0')
+#endif
+    new->l_name = realname;
+  else
+    new->l_name = (char *) newname->name + libname_len - 1;
+
   new->l_type = type;
   /* If we set the bit now since we know it is never used we avoid
      dirtying the cache line later.  */
@@ -149,7 +177,14 @@  _dl_new_object (char *realname, const char *libname, int type,
 
   new->l_local_scope[0] = &new->l_searchlist;
 
-  /* Don't try to find the origin for the main map which has the name "".  */
+  /* Determine the origin.  If allocating the link map for the main
+     executable, the realname is not known and "".  In this case, the
+     origin needs to be determined by other means.  However, in case
+     of an explicit loader invocation, the pathname of the main
+     executable is known and needs to be processed here: From the
+     point of view of the kernel, the main executable is the
+     dynamic loader, and this would lead to a computation of the wrong
+     origin.  */
   if (realname[0] != '\0')
     {
       size_t realname_len = strlen (realname) + 1;
diff --git a/elf/tst-dlopen-aout-container.c b/elf/tst-dlopen-aout-container.c
new file mode 100644
index 0000000000..9b9f86133d
--- /dev/null
+++ b/elf/tst-dlopen-aout-container.c
@@ -0,0 +1,19 @@ 
+/* Test case for BZ #16634 and BZ#24900.  Container version.
+   Copyright (C) 2014-2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "tst-dlopen-aout.c"
diff --git a/elf/tst-dlopen-aout.c b/elf/tst-dlopen-aout.c
index 25cfe2f740..3e39fc6067 100644
--- a/elf/tst-dlopen-aout.c
+++ b/elf/tst-dlopen-aout.c
@@ -1,7 +1,8 @@ 
-/* Test case for BZ #16634.
+/* Test case for BZ #16634 and BZ#24900.
 
    Verify that incorrectly dlopen()ing an executable without
-   __RTLD_OPENEXEC does not cause assertion in ld.so.
+   __RTLD_OPENEXEC does not cause assertion in ld.so, and that it
+   actually results in an error.
 
    Copyright (C) 2014-2019 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
@@ -24,6 +25,8 @@ 
 #include <pthread.h>
 #include <stdio.h>
 #include <stdlib.h>
+#include <string.h>
+#include <support/check.h>
 #include <support/support.h>
 #include <support/xthread.h>
 
@@ -35,28 +38,35 @@  fn (void *p)
   return p;
 }
 
+/* Call dlopen on PATH and check that fails with an error message
+   indicating an attempt to open an ET_EXEC or PIE object.  */
+static void
+check_dlopen_failure (const char *path)
+{
+  void *handle = dlopen (path, RTLD_LAZY);
+  if (handle != NULL)
+    FAIL_EXIT1 ("dlopen succeeded unexpectedly: %s", path);
+
+  const char *message = dlerror ();
+  TEST_VERIFY_EXIT (message != NULL);
+  if ((strstr (message,
+	       "cannot dynamically load position-independent executable")
+       == NULL)
+      && strstr (message, "cannot dynamically load executable") == NULL)
+    FAIL_EXIT1 ("invalid dlopen error message: \"%s\"", message);
+}
+
 static int
 do_test (int argc, char *argv[])
 {
   int j;
 
-  /* Use the full path so that the dynamic loader does not recognize
-     the main program as already loaded (even with an explicit ld.so
-     invocation).  */
-  char *path = xasprintf ("%s/%s", support_objdir_root, "tst-dlopen-aout");
-  printf ("info: dlopen object: %s\n", path);
-
   for (j = 0; j < 100; ++j)
     {
       pthread_t thr;
-      void *p;
 
-      p = dlopen (path, RTLD_LAZY);
-      if (p != NULL)
-        {
-          puts ("error: dlopen succeeded unexpectedly");
-          return 1;
-        }
+      check_dlopen_failure (argv[0]);
+
       /* We create threads to force TLS allocation, which triggers
 	 the original bug i.e. running out of surplus slotinfo entries
 	 for TLS.  */
@@ -64,7 +74,10 @@  do_test (int argc, char *argv[])
       xpthread_join (thr);
     }
 
-  free (path);
+  /* The elf subdirectory (or $ORIGIN in the container case) is on the
+     library search path.  */
+  check_dlopen_failure ("tst-dlopen-aout");
+
   return 0;
 }