From patchwork Tue Aug 12 16:16:49 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brennan Shacklett X-Patchwork-Id: 2373 Received: (qmail 25822 invoked by alias); 12 Aug 2014 16:18:45 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 25810 invoked by uid 89); 12 Aug 2014 16:18:44 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.6 required=5.0 tests=BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mail-pa0-f45.google.com X-Received: by 10.70.89.43 with SMTP id bl11mr1816100pdb.163.1407860318985; Tue, 12 Aug 2014 09:18:38 -0700 (PDT) From: Brennan Shacklett To: libc-alpha@sourceware.org Cc: Brennan Shacklett Subject: [PATCH][BZ 17251] Calculate RPATH $ORIGIN from absolute path Date: Tue, 12 Aug 2014 09:16:49 -0700 Message-Id: <1407860209-21797-1-git-send-email-bpshacklett@gmail.com> Bug 17251 deals with (what I believe is) the incorrect calculation of $ORIGIN for shared libraries. Currently $ORIGIN is calculated correctly for executables on linux, because readlink is called on /proc/self/exe, which means the resulting path is absolute and has no symlinks. Shared libraries with relative paths on the other hand are based off of appending the name / path of the library to the current working directory, which means if the library is a symlink, it is not followed, which breaks RPATH $ORIGIN in the following scenario: libone.so and libtwo.so are both in ~/lib libone.so needs libtwo.so, so libone.so has an RPATH of $ORIGIN. If I run ldd on ~/lib/libone.so, libtwo.so is found and all is good. If I create a symlink named ~/libone.so to ~/lib/libone.so, and run ldd on it libtwo is not found, because $ORIGIN for the library is calculated as ~ instead of ~/lib. If I was to repeat the above test but instead of libone.so use an executable, everything would work as expected, which is why I think the shared library behavior is a bug. I ran into this in the real world when attempting to dynamically load a .so for python SWIG bindings, because the .so which python was loading was symlinked to the actual library directory where all of the .so's dependencies were located. The attached patch fixes this behavior by having realname evaluated with __realpath in _dl_map_object_from_fd, before _dl_new_object is called (which is where l_origin is assigned). This also somewhat simplifies the code in elf/dl-object.c, because realname is guaranteed to be an absolute path generated by realpath, so I was able to remove the code dealing with relative paths. I used the implementation of dl_realpath from sysdeps/tile/dl-runtime.c as the generic implementation, and extended it with lstat and readlink in the linux version. This means the bug is still present on other systems than linux (the generic implementation only returns an absolute path, it doesn't do anything with symlinks), but if there is a way to get the generic version to follow symlinks on all systems please let me know. I tested and ran the elf test suite on Gentoo Linux x86-64. If this change is wanted, I will happily write a test to go along with it. I don't have any sort of copyright attribution set up with the FSF, let me know if it is necessary for this change. Thanks, Brennan Shacklett 2014-08-12 Brennan Shacklett [BZ #17251] * elf/Makefile (rtld-routines): Add dl-realpath. * elf/dl-load.c (_dl_map_object_from_fd): Calculate realname with __realpath. * elf/dl-object.c (_dl_new_object): Remove relative path handling. * elf/dl-realpath.c: New file. Generic implementation of __realpath. * elf/tile/dl-runtime.c (_dl_after_load): Remove dl_realpath and change call to __realpath. * sysdeps/unix/sysv/linux/Makefile (sysdep-rtld-routines): Add dl-lxstat64. * sysdeps/unix/sysv/linux/dl-lxstat64.c: New file. * sysdeps/unix/sysv/linux/dl-realpath.c: New file. Handle symlinks unlike generic implementation. --- elf/Makefile | 3 +- elf/dl-load.c | 18 +++++ elf/dl-object.c | 54 ++------------- elf/dl-realpath.c | 85 ++++++++++++++++++++++++ sysdeps/tile/dl-runtime.c | 66 +------------------ sysdeps/unix/sysv/linux/Makefile | 2 +- sysdeps/unix/sysv/linux/dl-lxstat64.c | 1 + sysdeps/unix/sysv/linux/dl-realpath.c | 121 ++++++++++++++++++++++++++++++++++ 8 files changed, 236 insertions(+), 114 deletions(-) create mode 100644 elf/dl-realpath.c create mode 100644 sysdeps/unix/sysv/linux/dl-lxstat64.c create mode 100644 sysdeps/unix/sysv/linux/dl-realpath.c diff --git a/elf/Makefile b/elf/Makefile index 25012cc..c130f21 100644 --- a/elf/Makefile +++ b/elf/Makefile @@ -43,7 +43,8 @@ shared-only-routines += dl-caller # ld.so uses those routines, plus some special stuff for being the program # interpreter and operating independent of libc. -rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal +rtld-routines := rtld $(dl-routines) dl-sysdep dl-environ dl-minimal \ + dl-realpath all-rtld-routines = $(rtld-routines) $(sysdep-rtld-routines) CFLAGS-dl-runtime.c = -fexceptions -fasynchronous-unwind-tables diff --git a/elf/dl-load.c b/elf/dl-load.c index 016a99c..7353837 100644 --- a/elf/dl-load.c +++ b/elf/dl-load.c @@ -891,6 +891,7 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp, int errval = 0; struct r_debug *r = _dl_debug_initialize (0, nsid); bool make_consistent = false; + char *absolutename = NULL; /* Get file information. */ if (__glibc_unlikely (__fxstat64 (_STAT_VER, fd, &st) < 0)) @@ -902,6 +903,23 @@ _dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp, lose (errval, fd, name, realname, l, errstring, make_consistent ? r : NULL, nsid); } + /* Find absolute pathname for object */ + absolutename = (char *) malloc (PATH_MAX); + if (!absolutename) + { + errstring = N_("cannot allocate memory for absolute path"); + goto call_lose_errno; + } + + if (!__realpath (realname, absolutename)) + { + free (absolutename); + errstring = N_("cannot find absolute path of shared object"); + goto call_lose_errno; + } + + free (realname); + realname = absolutename; /* Look again to see if the real name matched another already loaded. */ for (l = GL(dl_ns)[nsid]._ns_loaded; l; l = l->l_next) diff --git a/elf/dl-object.c b/elf/dl-object.c index afd80a6..d93935b 100644 --- a/elf/dl-object.c +++ b/elf/dl-object.c @@ -158,55 +158,15 @@ _dl_new_object (char *realname, const char *libname, int type, char *origin; char *cp; - if (realname[0] == '/') + /* It is an absolute path, calculated by realpath. Use it. + * But we have to make a copy since we strip out the trailing slash. */ + assert (realname[0] == '/'); + cp = origin = (char *) malloc (realname_len); + if (origin == NULL) { - /* It is an absolute path. Use it. But we have to make a - copy since we strip out the trailing slash. */ - cp = origin = (char *) malloc (realname_len); - if (origin == NULL) - { - origin = (char *) -1; - goto out; - } + origin = (char *) -1; + goto out; } - else - { - size_t len = realname_len; - char *result = NULL; - - /* Get the current directory name. */ - origin = NULL; - do - { - char *new_origin; - - len += 128; - new_origin = (char *) realloc (origin, len); - if (new_origin == NULL) - /* We exit the loop. Note that result == NULL. */ - break; - origin = new_origin; - } - while ((result = __getcwd (origin, len - realname_len)) == NULL - && errno == ERANGE); - - if (result == NULL) - { - /* We were not able to determine the current directory. - Note that free(origin) is OK if origin == NULL. */ - free (origin); - origin = (char *) -1; - goto out; - } - - /* Find the end of the path and see whether we have to add a - slash. We could use rawmemchr but this need not be - fast. */ - cp = (strchr) (origin, '\0'); - if (cp[-1] != '/') - *cp++ = '/'; - } - /* Add the real file name. */ cp = __mempcpy (cp, realname, realname_len); diff --git a/elf/dl-realpath.c b/elf/dl-realpath.c new file mode 100644 index 0000000..2549bb3 --- /dev/null +++ b/elf/dl-realpath.c @@ -0,0 +1,85 @@ +/* Dynamic loader version of realpath. + Copyright (C) 2014 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include + +/* Simplified implementation of realpath(): no dynamic memory use, no lstat(), + no set_errno(), no valid "rpath" on error, etc. This simplifies cases + involving relative paths, specifically where $ORIGIN needs to be + calculated. For this relatively rare case, one could also imagine using + link_map.l_origin to avoid the getcwd() here, but the simpler code + here seems like a better solution. */ +char * internal_function +__realpath (const char *name, char *rpath) +{ + char *dest; + const char *start, *end; + + if (name[0] != '/') + { + if (!__getcwd (rpath, PATH_MAX)) + return NULL; + dest = __rawmemchr (rpath, '\0'); + } + else + { + rpath[0] = '/'; + dest = rpath + 1; + } + + for (start = end = name; *start; start = end) + { + /* Skip sequence of multiple path-separators. */ + while (*start == '/') + ++start; + + /* Find end of path component. */ + for (end = start; *end && *end != '/'; ++end) + /* Nothing. */; + + if (end - start == 0) + break; + else if (end - start == 1 && start[0] == '.') + /* nothing */; + else if (end - start == 2 && start[0] == '.' && start[1] == '.') + { + /* Back up to previous component, ignore if at root already. */ + if (dest > rpath + 1) + while ((--dest)[-1] != '/'); + } + else + { + if (dest[-1] != '/') + *dest++ = '/'; + + if (dest + (end - start) >= rpath + PATH_MAX) + return NULL; + + dest = __mempcpy (dest, start, end - start); + *dest = '\0'; + } + } + if (dest > rpath + 1 && dest[-1] == '/') + --dest; + *dest = '\0'; + + return rpath; +} diff --git a/sysdeps/tile/dl-runtime.c b/sysdeps/tile/dl-runtime.c index bcc00bc..45251d0 100644 --- a/sysdeps/tile/dl-runtime.c +++ b/sysdeps/tile/dl-runtime.c @@ -29,70 +29,6 @@ #include #include -/* Like realpath(), but simplified: no dynamic memory use, no lstat(), - no set_errno(), no valid "rpath" on error, etc. This handles some - simple cases where the simulator might not have a valid entry for - a loaded Elf object, in particular dlopen() with a relative path. - For this relatively rare case, one could also imagine using - link_map.l_origin to avoid the getcwd() here, but the simpler code - here seems like a better solution. */ -static char * -dl_realpath (const char *name, char *rpath) -{ - char *dest; - const char *start, *end; - - if (name[0] != '/') - { - if (!__getcwd (rpath, PATH_MAX)) - return NULL; - dest = __rawmemchr (rpath, '\0'); - } - else - { - rpath[0] = '/'; - dest = rpath + 1; - } - - for (start = end = name; *start; start = end) - { - /* Skip sequence of multiple path-separators. */ - while (*start == '/') - ++start; - - /* Find end of path component. */ - for (end = start; *end && *end != '/'; ++end) - /* Nothing. */; - - if (end - start == 0) - break; - else if (end - start == 1 && start[0] == '.') - /* nothing */; - else if (end - start == 2 && start[0] == '.' && start[1] == '.') - { - /* Back up to previous component, ignore if at root already. */ - if (dest > rpath + 1) - while ((--dest)[-1] != '/'); - } - else - { - if (dest[-1] != '/') - *dest++ = '/'; - - if (dest + (end - start) >= rpath + PATH_MAX) - return NULL; - - dest = __mempcpy (dest, start, end - start); - *dest = '\0'; - } - } - if (dest > rpath + 1 && dest[-1] == '/') - --dest; - *dest = '\0'; - - return rpath; -} - /* Support notifying the simulator about new objects. */ void internal_function _dl_after_load (struct link_map *l) @@ -117,7 +53,7 @@ _dl_after_load (struct link_map *l) DLPUTC (':'); /* Write the library path, including the terminating '\0'. */ - path = dl_realpath (l->l_name, pathbuf) ?: l->l_name; + path = __realpath (l->l_name, pathbuf) ?: l->l_name; for (size_t i = 0;; i++) { DLPUTC (path[i]); diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile index 9ad6d22..4525429 100644 --- a/sysdeps/unix/sysv/linux/Makefile +++ b/sysdeps/unix/sysv/linux/Makefile @@ -176,7 +176,7 @@ endif ifeq ($(subdir),elf) sysdep-rtld-routines += dl-brk dl-sbrk dl-getcwd dl-openat64 dl-opendir \ - dl-fxstatat64 + dl-fxstatat64 dl-lxstat64 CPPFLAGS-lddlibc4 += -DNOT_IN_libc diff --git a/sysdeps/unix/sysv/linux/dl-lxstat64.c b/sysdeps/unix/sysv/linux/dl-lxstat64.c new file mode 100644 index 0000000..63e6800 --- /dev/null +++ b/sysdeps/unix/sysv/linux/dl-lxstat64.c @@ -0,0 +1 @@ +#include diff --git a/sysdeps/unix/sysv/linux/dl-realpath.c b/sysdeps/unix/sysv/linux/dl-realpath.c new file mode 100644 index 0000000..91652dc --- /dev/null +++ b/sysdeps/unix/sysv/linux/dl-realpath.c @@ -0,0 +1,121 @@ +/* Dynamic loader version of realpath for linux. + Copyright (C) 2014 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include + +/* Simplified version of realpath which extends elf/dl-realpath.c to use + * linux syscalls for handling symlinks */ +char * internal_function +__realpath (const char *name, char *rpath) +{ + char *dest; + char extra_buf[PATH_MAX]; + const char *start, *end; + int num_links = 0; + + if (name[0] != '/') + { + if (!__getcwd (rpath, PATH_MAX)) + return NULL; + dest = __rawmemchr (rpath, '\0'); + } + else + { + rpath[0] = '/'; + dest = rpath + 1; + } + + for (start = end = name; *start; start = end) + { + struct stat64 st; + int n; + /* Skip sequence of multiple path-separators. */ + while (*start == '/') + ++start; + + /* Find end of path component. */ + for (end = start; *end && *end != '/'; ++end) + /* Nothing. */; + + if (end - start == 0) + break; + else if (end - start == 1 && start[0] == '.') + /* nothing */; + else if (end - start == 2 && start[0] == '.' && start[1] == '.') + { + /* Back up to previous component, ignore if at root already. */ + if (dest > rpath + 1) + while ((--dest)[-1] != '/'); + } + else + { + if (dest[-1] != '/') + *dest++ = '/'; + + if (dest + (end - start) >= rpath + PATH_MAX) + return NULL; + + dest = __mempcpy (dest, start, end - start); + *dest = '\0'; + if (__lxstat64 (_STAT_VER, rpath, &st) < 0) + return NULL; + + if (S_ISLNK (st.st_mode)) + { + char buf[PATH_MAX]; + size_t len; + + INTERNAL_SYSCALL_DECL (err); + + n = INTERNAL_SYSCALL (readlink, err, 3, rpath, buf, + PATH_MAX - 1); + + if (n <= 0 || buf[0] == '[') + return NULL; + buf[n] = '\0'; + + len = strlen (end); + if ((long int) (n + len) >= PATH_MAX) + return NULL; + + /* Careful here, end may be a pointer into extra_buf... */ + memmove (&extra_buf[n], end, len + 1); + name = end = memcpy (extra_buf, buf, n); + + if (buf[0] == '/') + dest = rpath + 1; /* Absolute symlink */ + else + /* Back up to previous component, ignore if at root already: */ + if (dest > rpath + 1) + while ((--dest)[-1] != '/'); + } + } + } + if (dest > rpath + 1 && dest[-1] == '/') + --dest; + *dest = '\0'; + + return rpath; + +}