From patchwork Tue Aug 18 21:43:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Adhemerval Zanella X-Patchwork-Id: 40281 Return-Path: X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 31EF83857C4E; Tue, 18 Aug 2020 21:43:37 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 31EF83857C4E DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sourceware.org; s=default; t=1597787017; bh=rG6pdF/0718Xtwpw/82hwqVLWFkw4IZEcehVsyJhvos=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=b3t0JjeTShufCbTlEB4MaqTCkVAbgFahcb6ndJPKVMtsgAi1feHuhyMPeERg5Afyx ftbdwG6+FREXTLT4SsR15uRfrBgDv0qr/9WPOstKjHWcqfLxz8osVC0+4idzmXwDji Ru9pC/Gbvypq+lR2XgushCBEY9bJf+P9hmOZz3k8= X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-qv1-xf2d.google.com (mail-qv1-xf2d.google.com [IPv6:2607:f8b0:4864:20::f2d]) by sourceware.org (Postfix) with ESMTPS id 9C97D3858D37 for ; Tue, 18 Aug 2020 21:43:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 9C97D3858D37 Received: by mail-qv1-xf2d.google.com with SMTP id dd12so10347834qvb.0 for ; Tue, 18 Aug 2020 14:43:33 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=rG6pdF/0718Xtwpw/82hwqVLWFkw4IZEcehVsyJhvos=; b=eJzTEsbCUChXDhGF7z6X1B+2EuO6K00wEbzEC9sLezmQNsM/DmbrKbD6WdfYX2UFpm xn6dB2dNjKUAT2OYFW69aT4fbVagBjjE2L18p7olGvs03if+7vkflPuIoqT8PxGk7d7L WEb8VAT3ZchMr7I9wokhsN2S964dHb322szTwBvTg2R3yOwXPaJJ6IuA/pcQZl5AGuDT DXL9bBJuU1td2loNv9OMc3AhYFpbxjxbD2W3Cq5liTBhKq+IXDVg3ty626P6GBo7plK3 TWd0p0Q7iNadKvRrgrQt1tzlr97fa8DsIq3tfPTAMiZDWz7wPO9v0ueyD64Z5V9g66jn ysqw== 
X-Gm-Message-State: AOAM533hsCk7R/3x457i6b7LlC6o9nzqC8fRwYvLKMqGp0b3GLdaxgjA 0RWwgYtfvRiB6Ac1d5DDfcQdnQYS7Py8nD7T X-Google-Smtp-Source: ABdhPJxxRN0ciD972JkJbiF3g5i95GdZ0tJqZmKCLvsM7KaBbhGfgNjbNvVHpN4rD78SV+cdEjbWXw== X-Received: by 2002:a05:6214:18d1:: with SMTP id cy17mr20987309qvb.29.1597787012642; Tue, 18 Aug 2020 14:43:32 -0700 (PDT) Received: from localhost.localdomain ([177.194.48.209]) by smtp.googlemail.com with ESMTPSA id x24sm26015213qtj.8.2020.08.18.14.43.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 18 Aug 2020 14:43:32 -0700 (PDT) To: libc-alpha@sourceware.org Subject: [PATCH v3 1/2] stdlib: Use scratch_buffer on realpath (BZ #26341) Date: Tue, 18 Aug 2020 18:43:26 -0300 Message-Id: <20200818214327.3121808-1-adhemerval.zanella@linaro.org> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 X-Spam-Status: No, score=-13.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Adhemerval Zanella via Libc-alpha From: Adhemerval Zanella Reply-To: Adhemerval Zanella Cc: Xiaoming Ni Errors-To: libc-alpha-bounces@sourceware.org Sender: "Libc-alpha" Changes from previous version [1]: - Use a scratch_buffer for the extra_buffer in case of symlink expansion and for the readlink syscall. [1] https://sourceware.org/pipermail/libc-alpha/2020-August/116968.html --- It limits the total stack allocation to around 1k for default cases and around 2k if the path contains symlinks. Larger path or symlink resolution will trigger dynamic memory allocation up to 2 times PATH_MAX. 
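The bounded stack usage described above comes from reading the link target into a small buffer that only moves to the heap when the target does not fit. A minimal sketch of that grow-and-retry loop in portable C follows. It is an illustration under stated assumptions, not the patch's code: `struct small_buf`, `small_buf_*`, and `read_link_grow` are hypothetical names standing in for glibc's internal scratch_buffer, and the 256-byte inline array is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef PATH_MAX
# define PATH_MAX 1024
#endif

/* Hypothetical stand-in for glibc's internal scratch_buffer: a small
   inline array that is replaced by a heap block when it is too small.  */
struct small_buf
{
  char *data;
  size_t length;
  char space[256];
};

static void
small_buf_init (struct small_buf *b)
{
  b->data = b->space;
  b->length = sizeof b->space;
}

static void
small_buf_free (struct small_buf *b)
{
  if (b->data != b->space)
    free (b->data);
  small_buf_init (b);
}

/* Double the capacity.  The old contents are not preserved.  Returns 1 on
   success and 0 on allocation failure (the old buffer is kept).  */
static int
small_buf_grow (struct small_buf *b)
{
  size_t new_length = b->length * 2;
  char *p = malloc (new_length);
  if (p == NULL)
    return 0;
  if (b->data != b->space)
    free (b->data);
  b->data = p;
  b->length = new_length;
  return 1;
}

/* Read the target of the symlink PATH into B, retrying with a larger
   buffer while readlink fills it completely (which may mean truncation).
   Gives up with ENAMETOOLONG once the buffer already exceeds PATH_MAX.  */
static ssize_t
read_link_grow (const char *path, struct small_buf *b)
{
  assert (path != NULL);
  for (;;)
    {
      ssize_t n = readlink (path, b->data, b->length);
      if (n < 0)
        return -1;
      if ((size_t) n < b->length)
        {
          b->data[n] = '\0';
          return n;
        }
      if (b->length > PATH_MAX)
        {
          errno = ENAMETOOLONG;
          return -1;
        }
      if (!small_buf_grow (b))
        return -1;
    }
}
```

A caller initializes the buffer with `small_buf_init`, calls `read_link_grow`, and releases it with `small_buf_free` on every exit path. Short targets never touch the heap.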
Checked on x86_64-linux-gnu and i686-linux-gnu. --- stdlib/Makefile | 3 +- stdlib/canonicalize.c | 119 +++++++++++++----- stdlib/tst-canon-bz26341.c | 108 ++++++++++++++++ support/support_set_small_thread_stack_size.c | 12 +- support/xthread.h | 2 + 5 files changed, 207 insertions(+), 37 deletions(-) create mode 100644 stdlib/tst-canon-bz26341.c diff --git a/stdlib/Makefile b/stdlib/Makefile index 4615f6dfe7..7093b8a584 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -87,7 +87,7 @@ tests := tst-strtol tst-strtod testmb testrand testsort testdiv \ tst-makecontext-align test-bz22786 tst-strtod-nan-sign \ tst-swapcontext1 tst-setcontext4 tst-setcontext5 \ tst-setcontext6 tst-setcontext7 tst-setcontext8 \ - tst-setcontext9 tst-bz20544 + tst-setcontext9 tst-bz20544 tst-canon-bz26341 tests-internal := tst-strtod1i tst-strtod3 tst-strtod4 tst-strtod5i \ tst-tls-atexit tst-tls-atexit-nodelete @@ -102,6 +102,7 @@ LDLIBS-test-atexit-race = $(shared-thread-library) LDLIBS-test-at_quick_exit-race = $(shared-thread-library) LDLIBS-test-cxa_atexit-race = $(shared-thread-library) LDLIBS-test-on_exit-race = $(shared-thread-library) +LDLIBS-tst-canon-bz26341 = $(shared-thread-library) LDLIBS-test-dlclose-exit-race = $(shared-thread-library) $(libdl) LDFLAGS-test-dlclose-exit-race = $(LDFLAGS-rdynamic) diff --git a/stdlib/canonicalize.c b/stdlib/canonicalize.c index cbd885a3c5..43454f140c 100644 --- a/stdlib/canonicalize.c +++ b/stdlib/canonicalize.c @@ -25,9 +25,81 @@ #include #include +#include #include #include +#ifndef PATH_MAX +# ifdef MAXPATHLEN +# define PATH_MAX MAXPATHLEN +# else +# define PATH_MAX 1024 +# endif +#endif + +static ssize_t +resolve_readlink (const char *rpath, struct scratch_buffer *out) +{ + do + { + ssize_t n = __readlink (rpath, out->data, out->length); + if (n == -1) + return -1; + else if (n == out->length) + { + if (out->length > PATH_MAX) + { + __set_errno (ENAMETOOLONG); + return -1; + } + if (!scratch_buffer_grow (out)) + return -1; + } + else 
+ { + ((char*)out->data)[n] = '\0'; + return n; + } + } + while (true); +} + +static bool +realpath_readlink (const char *rpath, const char *end, size_t path_max, + size_t st_size, struct scratch_buffer *out) +{ + struct scratch_buffer buf; + scratch_buffer_init (&buf); + + if (!scratch_buffer_set_array_size (&buf, st_size + 1, sizeof (char *))) + return false; + + bool r = false; + + ssize_t n = resolve_readlink (rpath, &buf); + if (n == -1) + goto out; + + size_t len = strlen (end); + if (path_max - buf.length <= len) + { + __set_errno (ENAMETOOLONG); + goto out; + } + + if (! scratch_buffer_set_array_size (out, n + len + 1, sizeof (char *))) + goto out; + + memmove (out->data + n, end, len + 1); + memcpy (out->data, buf.data, n); + + r = true; + +out: + scratch_buffer_free (&buf); + return r; +} + /* Return the canonical absolute name of file NAME. A canonical name does not contain any `.', `..' components nor any repeated path separators ('/') or symlinks. All path components must exist. 
If @@ -42,10 +114,13 @@ char * __realpath (const char *name, char *resolved) { - char *rpath, *dest, *extra_buf = NULL; + char *rpath, *dest; const char *start, *end, *rpath_limit; - long int path_max; + const size_t path_max = PATH_MAX; int num_links = 0; + struct scratch_buffer extra_buf; + + scratch_buffer_init (&extra_buf); if (name == NULL) { @@ -65,14 +140,6 @@ __realpath (const char *name, char *resolved) return NULL; } -#ifdef PATH_MAX - path_max = PATH_MAX; -#else - path_max = __pathconf (name, _PC_PATH_MAX); - if (path_max <= 0) - path_max = 1024; -#endif - if (resolved == NULL) { rpath = malloc (path_max); @@ -85,7 +152,7 @@ __realpath (const char *name, char *resolved) if (name[0] != '/') { - if (!__getcwd (rpath, path_max)) + if (__getcwd (rpath, path_max) == NULL) { rpath[0] = '\0'; goto error; @@ -101,7 +168,6 @@ __realpath (const char *name, char *resolved) for (start = end = name; *start; start = end) { struct stat64 st; - int n; /* Skip sequence of multiple path-separators. */ while (*start == '/') @@ -158,40 +224,24 @@ __realpath (const char *name, char *resolved) dest = __mempcpy (dest, start, end - start); *dest = '\0'; - if (__lxstat64 (_STAT_VER, rpath, &st) < 0) + if (__lstat64 (rpath, &st) < 0) goto error; if (S_ISLNK (st.st_mode)) { - char *buf = __alloca (path_max); - size_t len; - if (++num_links > __eloop_threshold ()) { __set_errno (ELOOP); goto error; } - n = __readlink (rpath, buf, path_max - 1); - if (n < 0) + if (! realpath_readlink (rpath, end, path_max, st.st_size, + &extra_buf)) goto error; - buf[n] = '\0'; - if (!extra_buf) - extra_buf = __alloca (path_max); + name = end = extra_buf.data; - len = strlen (end); - if (path_max - n <= len) - { - __set_errno (ENAMETOOLONG); - goto error; - } - - /* Careful here, end may be a pointer into extra_buf... 
*/ - memmove (&extra_buf[n], end, len + 1); - name = end = memcpy (extra_buf, buf, n); - - if (buf[0] == '/') + if (((char *)extra_buf.data)[0] == '/') dest = rpath + 1; /* It's an absolute symlink */ else /* Back up to previous component, ignore if at root already: */ @@ -209,6 +259,8 @@ __realpath (const char *name, char *resolved) --dest; *dest = '\0'; + scratch_buffer_free (&extra_buf); + assert (resolved == NULL || resolved == rpath); return rpath; @@ -216,6 +268,7 @@ error: assert (resolved == NULL || resolved == rpath); if (resolved == NULL) free (rpath); + scratch_buffer_free (&extra_buf); return NULL; } libc_hidden_def (__realpath) diff --git a/stdlib/tst-canon-bz26341.c b/stdlib/tst-canon-bz26341.c new file mode 100644 index 0000000000..63474bddaa --- /dev/null +++ b/stdlib/tst-canon-bz26341.c @@ -0,0 +1,108 @@ +/* Check if realpath does not consume extra stack space based on symlink + existance in the path (BZ #26341) + Copyright (C) 2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+#include
+#include
+
+static char *filename;
+static size_t filenamelen;
+static char *linkname;
+
+static int
+maxsymlinks (void)
+{
+#ifdef MAXSYMLINKS
+  return MAXSYMLINKS;
+#else
+  long int sysconf_symloop_max = sysconf (_SC_SYMLOOP_MAX);
+  return sysconf_symloop_max <= 0
+         ? _POSIX_SYMLOOP_MAX
+         : sysconf_symloop_max;
+#endif
+}
+
+#ifndef PATH_MAX
+# define PATH_MAX 1024
+#endif
+
+static void
+create_link (void)
+{
+  int fd = create_temp_file ("tst-canon-bz26341", &filename);
+  TEST_VERIFY_EXIT (fd != -1);
+  xclose (fd);
+
+  char *prevlink = filename;
+  int maxlinks = maxsymlinks ();
+  for (int i = 0; i < maxlinks; i++)
+    {
+      linkname = xasprintf ("%s%d", filename, i);
+      xsymlink (prevlink, linkname);
+      add_temp_file (linkname);
+      prevlink = linkname;
+    }
+
+  filenamelen = strlen (filename);
+}
+
+static void *
+do_realpath (void *arg)
+{
+  /* The old implementation of realpath allocates PATH_MAX bytes with alloca
+     for each symlink in the path, leading to a maximum stack usage of
+     MAXSYMLINKS times PATH_MAX.
+     This stack allocation tries to fill the thread's stack, minus the space
+     reserved for the thread itself (plus some slack) and for realpath (plus
+     some slack).  If realpath uses more than 2 * PATH_MAX plus some slack,
+     it will trigger a stack overflow.  */
+
+  const size_t realpath_usage = 2 * PATH_MAX + 1024;
+  const size_t thread_usage = 1 * PATH_MAX + 1024;
+  size_t stack_size = support_small_thread_stack_size ()
+                      - realpath_usage - thread_usage;
+  char stack[stack_size];
+  char *resolved = stack + stack_size - thread_usage + 1024;
+
+  char *p = realpath (linkname, resolved);
+  TEST_VERIFY (p != NULL);
+  TEST_COMPARE_BLOB (resolved, filenamelen, filename, filenamelen);
+
+  return NULL;
+}
+
+static int
+do_test (void)
+{
+  create_link ();
+
+  pthread_t th = xpthread_create (support_small_stack_thread_attribute (),
+                                  do_realpath, NULL);
+  xpthread_join (th);
+
+  return 0;
+}
+
+#include
diff --git a/support/support_set_small_thread_stack_size.c b/support/support_set_small_thread_stack_size.c
index 69d66e97db..74a0e38a72 100644
--- a/support/support_set_small_thread_stack_size.c
+++ b/support/support_set_small_thread_stack_size.c
@@ -20,8 +20,8 @@
 #include
 #include

-void
-support_set_small_thread_stack_size (pthread_attr_t *attr)
+size_t
+support_small_thread_stack_size (void)
 {
   /* Some architectures have too small values for PTHREAD_STACK_MIN
      which cannot be used for creating threads.  Ensure that the stack
@@ -31,5 +31,11 @@ support_set_small_thread_stack_size (pthread_attr_t *attr)
   if (stack_size < PTHREAD_STACK_MIN)
     stack_size = PTHREAD_STACK_MIN;
 #endif
-  xpthread_attr_setstacksize (attr, stack_size);
+  return stack_size;
+}
+
+void
+support_set_small_thread_stack_size (pthread_attr_t *attr)
+{
+  xpthread_attr_setstacksize (attr, support_small_thread_stack_size ());
 }
diff --git a/support/xthread.h b/support/xthread.h
index 05f8d4a7d9..6ba2f5a18b 100644
--- a/support/xthread.h
+++ b/support/xthread.h
@@ -78,6 +78,8 @@ void xpthread_attr_setguardsize (pthread_attr_t *attr,
 /* Set the stack size in ATTR to a small value, but still large enough
    to cover most internal glibc stack usage.  
*/
 void support_set_small_thread_stack_size (pthread_attr_t *attr);
+/* Return the stack size used on support_set_small_thread_stack_size.  */
+size_t support_small_thread_stack_size (void);

 /* Return a pointer to a thread attribute which requests a small
    stack.  The caller must not free this pointer.  */

From patchwork Tue Aug 18 21:43:27 2020
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 40282
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Xiaoming Ni
Subject: [PATCH v3 2/2] linux: Optimize realpath
Date: Tue, 18 Aug 2020 18:43:27 -0300
Message-Id: <20200818214327.3121808-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20200818214327.3121808-1-adhemerval.zanella@linaro.org>
References: <20200818214327.3121808-1-adhemerval.zanella@linaro.org>

Changes from previous version [1]:

  - Use a scratch_buffer for the readlink call.
  - Fall back to the generic implementation if the readlink syscall fails
    (for the case where /proc is not mounted).

[1] https://sourceware.org/pipermail/libc-alpha/2020-August/116969.html

---

The Linux implementation uses the trick of opening the provided path and
reading the symlink pointed to by its entry under '/proc/self/fd/'.  As in
the generic implementation, the link is read using a scratch_buffer, so the
default stack usage is limited to around 1k (with dynamic allocation bounded
by PATH_MAX).

Regarding syscall usage, for a successful call on a path without symlinks it
trades 2 syscalls (getcwd/lstat) for 3 (openat, readlink, and close).  The
optimization pays off when the input contains multiple symlinks (where it
replaces multiple lstat calls with a single readlink).  For the failure case
the cost depends on whether the 'resolved' buffer is provided, since that
triggers the generic strategy (and thus requires more syscalls in general).

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 include/scratch_buffer.h                |  13 ++
 include/stdlib.h                        |  16 ++
 stdlib/Makefile                         |   2 +-
 stdlib/canonicalize.c                   | 219 +----------------------
 stdlib/realpath-impl.c                  |  25 +++
 stdlib/resolve-path.c                   | 227 ++++++++++++++++++++++++
 sysdeps/unix/sysv/linux/realpath-impl.c |  69 +++++++
 7 files changed, 352 insertions(+), 219 deletions(-)
 create mode 100644 stdlib/realpath-impl.c
 create mode 100644 stdlib/resolve-path.c
 create mode 100644 sysdeps/unix/sysv/linux/realpath-impl.c

diff --git a/include/scratch_buffer.h b/include/scratch_buffer.h
index c39da78629..abf44d2860 100644
--- a/include/scratch_buffer.h
+++ b/include/scratch_buffer.h
@@ -86,6 +86,19 @@ scratch_buffer_free (struct scratch_buffer *buffer)
     free (buffer->data);
 }

+/* Return BUFFER->data and reinitialize the internal state if the data was
+   heap allocated; return NULL otherwise.  
*/ +static inline void * +scratch_buffer_finalize (struct scratch_buffer *buffer) +{ + if (buffer->data == buffer->__space.__c) + return NULL; + + void *r = buffer->data; + scratch_buffer_init (buffer); + return r; +} + /* Grow *BUFFER by some arbitrary amount. The buffer contents is NOT preserved. Return true on success, false on allocation failure (in which case the old buffer is freed). On success, the new buffer is diff --git a/include/stdlib.h b/include/stdlib.h index ffcefd7b85..182de52f83 100644 --- a/include/stdlib.h +++ b/include/stdlib.h @@ -20,6 +20,14 @@ # include +# ifndef PATH_MAX +# ifdef MAXPATHLEN +# define PATH_MAX MAXPATHLEN +# else +# define PATH_MAX 1024 +# endif +# endif + extern __typeof (strtol_l) __strtol_l; extern __typeof (strtoul_l) __strtoul_l; extern __typeof (strtoll_l) __strtoll_l; @@ -92,6 +100,14 @@ extern int __unsetenv (const char *__name) attribute_hidden; extern int __clearenv (void) attribute_hidden; extern char *__mktemp (char *__template) __THROW __nonnull ((1)); extern char *__canonicalize_file_name (const char *__name); +struct scratch_buffer; +extern ssize_t __resolve_readlink (const char *rpath, + struct scratch_buffer *out) + attribute_hidden; +extern char *__resolve_path (const char *name, char *resolved) + attribute_hidden; +extern char *__realpath_impl (const char *name, char *resolved) + attribute_hidden; extern char *__realpath (const char *__name, char *__resolved); libc_hidden_proto (__realpath) extern int __ptsname_r (int __fd, char *__buf, size_t __buflen) diff --git a/stdlib/Makefile b/stdlib/Makefile index 7093b8a584..b5d806696d 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -53,7 +53,7 @@ routines := \ strtof strtod strtold \ strtof_l strtod_l strtold_l \ strtof_nan strtod_nan strtold_nan \ - system canonicalize \ + system canonicalize resolve-path realpath-impl \ a64l l64a \ rpmatch strfmon strfmon_l getsubopt xpg_basename fmtmsg \ strtoimax strtoumax wcstoimax wcstoumax \ diff --git 
a/stdlib/canonicalize.c b/stdlib/canonicalize.c index 43454f140c..c068b29043 100644 --- a/stdlib/canonicalize.c +++ b/stdlib/canonicalize.c @@ -16,90 +16,10 @@ License along with the GNU C Library; if not, see . */ -#include #include -#include -#include -#include -#include #include -#include - -#include -#include #include -#ifndef PATH_MAX -# ifdef MAXPATHLEN -# define PATH_MAX MAXPATHLEN -# else -# define PATH_MAX 1024 -# endif -#endif - -static ssize_t -resolve_readlink (const char *rpath, struct scratch_buffer *out) -{ - do - { - ssize_t n = __readlink (rpath, out->data, out->length); - if (n == -1) - return -1; - else if (n == out->length) - { - if (out->length > PATH_MAX) - { - __set_errno (ENAMETOOLONG); - return -1; - } - if (!scratch_buffer_grow (out)) - return -1; - } - else - { - ((char*)out->data)[n] = '\0'; - return n; - } - } - while (true); -} - -static bool -realpath_readlink (const char *rpath, const char *end, size_t path_max, - size_t st_size, struct scratch_buffer *out) -{ - struct scratch_buffer buf; - scratch_buffer_init (&buf); - - if (!scratch_buffer_set_array_size (&buf, st_size + 1, sizeof (char *))) - return false; - - bool r = false; - - ssize_t n = resolve_readlink (rpath, &buf); - if (n == -1) - goto out; - - size_t len = strlen (end); - if (path_max - buf.length <= len) - { - __set_errno (ENAMETOOLONG); - goto out; - } - - if (! scratch_buffer_set_array_size (out, n + len + 1, sizeof (char *))) - goto out; - - memmove (out->data + n, end, len + 1); - memcpy (out->data, buf.data, n); - - r = true; - -out: - scratch_buffer_free (&buf); - return r; -} - /* Return the canonical absolute name of file NAME. A canonical name does not contain any `.', `..' components nor any repeated path separators ('/') or symlinks. All path components must exist. 
If @@ -114,14 +34,6 @@ out: char * __realpath (const char *name, char *resolved) { - char *rpath, *dest; - const char *start, *end, *rpath_limit; - const size_t path_max = PATH_MAX; - int num_links = 0; - struct scratch_buffer extra_buf; - - scratch_buffer_init (&extra_buf); - if (name == NULL) { /* As per Single Unix Specification V2 we must return an error if @@ -140,136 +52,7 @@ __realpath (const char *name, char *resolved) return NULL; } - if (resolved == NULL) - { - rpath = malloc (path_max); - if (rpath == NULL) - return NULL; - } - else - rpath = resolved; - rpath_limit = rpath + path_max; - - if (name[0] != '/') - { - if (__getcwd (rpath, path_max) == NULL) - { - rpath[0] = '\0'; - goto error; - } - dest = __rawmemchr (rpath, '\0'); - } - else - { - rpath[0] = '/'; - dest = rpath + 1; - } - - for (start = end = name; *start; start = end) - { - struct stat64 st; - - /* Skip sequence of multiple path-separators. */ - while (*start == '/') - ++start; - - /* Find end of path component. */ - for (end = start; *end && *end != '/'; ++end) - /* Nothing. */; - - if (end - start == 0) - break; - else if (end - start == 1 && start[0] == '.') - /* nothing */; - else if (end - start == 2 && start[0] == '.' && start[1] == '.') - { - /* Back up to previous component, ignore if at root already. 
*/ - if (dest > rpath + 1) - while ((--dest)[-1] != '/'); - } - else - { - size_t new_size; - - if (dest[-1] != '/') - *dest++ = '/'; - - if (dest + (end - start) >= rpath_limit) - { - ptrdiff_t dest_offset = dest - rpath; - char *new_rpath; - - if (resolved) - { - __set_errno (ENAMETOOLONG); - if (dest > rpath + 1) - dest--; - *dest = '\0'; - goto error; - } - new_size = rpath_limit - rpath; - if (end - start + 1 > path_max) - new_size += end - start + 1; - else - new_size += path_max; - new_rpath = (char *) realloc (rpath, new_size); - if (new_rpath == NULL) - goto error; - rpath = new_rpath; - rpath_limit = rpath + new_size; - - dest = rpath + dest_offset; - } - - dest = __mempcpy (dest, start, end - start); - *dest = '\0'; - - if (__lstat64 (rpath, &st) < 0) - goto error; - - if (S_ISLNK (st.st_mode)) - { - if (++num_links > __eloop_threshold ()) - { - __set_errno (ELOOP); - goto error; - } - - if (! realpath_readlink (rpath, end, path_max, st.st_size, - &extra_buf)) - goto error; - - name = end = extra_buf.data; - - if (((char *)extra_buf.data)[0] == '/') - dest = rpath + 1; /* It's an absolute symlink */ - else - /* Back up to previous component, ignore if at root already: */ - if (dest > rpath + 1) - while ((--dest)[-1] != '/'); - } - else if (!S_ISDIR (st.st_mode) && *end != '\0') - { - __set_errno (ENOTDIR); - goto error; - } - } - } - if (dest > rpath + 1 && dest[-1] == '/') - --dest; - *dest = '\0'; - - scratch_buffer_free (&extra_buf); - - assert (resolved == NULL || resolved == rpath); - return rpath; - -error: - assert (resolved == NULL || resolved == rpath); - if (resolved == NULL) - free (rpath); - scratch_buffer_free (&extra_buf); - return NULL; + return __realpath_impl (name, resolved); } libc_hidden_def (__realpath) versioned_symbol (libc, __realpath, realpath, GLIBC_2_3); diff --git a/stdlib/realpath-impl.c b/stdlib/realpath-impl.c new file mode 100644 index 0000000000..e099e02c0d --- /dev/null +++ b/stdlib/realpath-impl.c @@ -0,0 +1,25 @@ +/* 
realpath internal implementation. + Copyright (C) 2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include + +char * +__realpath_impl (const char *name, char *resolved) +{ + return __resolve_path (name, resolved); +} diff --git a/stdlib/resolve-path.c b/stdlib/resolve-path.c new file mode 100644 index 0000000000..bfceaef160 --- /dev/null +++ b/stdlib/resolve-path.c @@ -0,0 +1,227 @@ +/* Internal realpath function. + Copyright (C) 1996-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#include +#include +#include +#include +#include +#include + +ssize_t +__resolve_readlink (const char *rpath, struct scratch_buffer *out) +{ + do + { + ssize_t n = __readlink (rpath, out->data, out->length); + if (n == -1) + return -1; + else if (n == out->length) + { + if (out->length > PATH_MAX) + { + __set_errno (ENAMETOOLONG); + return -1; + } + if (!scratch_buffer_grow (out)) + return -1; + } + else + { + ((char*)out->data)[n] = '\0'; + return n; + } + } + while (true); +} + +static bool +realpath_readlink (const char *rpath, const char *end, size_t path_max, + size_t st_size, struct scratch_buffer *out) +{ + struct scratch_buffer buf; + scratch_buffer_init (&buf); + + if (!scratch_buffer_set_array_size (&buf, st_size + 1, sizeof (char *))) + return false; + + bool r = false; + + ssize_t n = __resolve_readlink (rpath, &buf); + if (n == -1) + goto out; + + size_t len = strlen (end); + if (path_max - buf.length <= len) + { + __set_errno (ENAMETOOLONG); + goto out; + } + + if (! scratch_buffer_set_array_size (out, n + len + 1, sizeof (char *))) + goto out; + + memmove (out->data + n, end, len + 1); + memcpy (out->data, buf.data, n); + + r = true; + +out: + scratch_buffer_free (&buf); + return r; +} + +char * +__resolve_path (const char *name, char *resolved) +{ + char *rpath, *dest; + const char *start, *end, *rpath_limit; + const size_t path_max = PATH_MAX; + int num_links = 0; + struct scratch_buffer extra_buf; + + scratch_buffer_init (&extra_buf); + + if (resolved == NULL) + { + rpath = malloc (path_max); + if (rpath == NULL) + return NULL; + } + else + rpath = resolved; + rpath_limit = rpath + path_max; + + if (name[0] != '/') + { + if (__getcwd (rpath, path_max) == NULL) + { + rpath[0] = '\0'; + goto error; + } + dest = __rawmemchr (rpath, '\0'); + } + else + { + rpath[0] = '/'; + dest = rpath + 1; + } + + for (start = end = name; *start; start = end) + { + struct stat64 st; + + /* Skip sequence of multiple path-separators. 
*/ + while (*start == '/') + ++start; + + /* Find end of path component. */ + for (end = start; *end && *end != '/'; ++end) + /* Nothing. */; + + if (end - start == 0) + break; + else if (end - start == 1 && start[0] == '.') + /* nothing */; + else if (end - start == 2 && start[0] == '.' && start[1] == '.') + { + /* Back up to previous component, ignore if at root already. */ + if (dest > rpath + 1) + while ((--dest)[-1] != '/'); + } + else + { + size_t new_size; + + if (dest[-1] != '/') + *dest++ = '/'; + + if (dest + (end - start) >= rpath_limit) + { + ptrdiff_t dest_offset = dest - rpath; + char *new_rpath; + + if (resolved) + { + __set_errno (ENAMETOOLONG); + if (dest > rpath + 1) + dest--; + *dest = '\0'; + goto error; + } + new_size = rpath_limit - rpath; + if (end - start + 1 > path_max) + new_size += end - start + 1; + else + new_size += path_max; + new_rpath = (char *) realloc (rpath, new_size); + if (new_rpath == NULL) + goto error; + rpath = new_rpath; + rpath_limit = rpath + new_size; + + dest = rpath + dest_offset; + } + + dest = __mempcpy (dest, start, end - start); + *dest = '\0'; + + if (__lstat64 (rpath, &st) < 0) + goto error; + + if (S_ISLNK (st.st_mode)) + { + if (++num_links > __eloop_threshold ()) + { + __set_errno (ELOOP); + goto error; + } + + if (! 
realpath_readlink (rpath, end, path_max, st.st_size,
+                                   &extra_buf))
+                goto error;
+
+              name = end = extra_buf.data;
+
+              if (((char *)extra_buf.data)[0] == '/')
+                dest = rpath + 1;       /* It's an absolute symlink */
+              else
+                /* Back up to previous component, ignore if at root already: */
+                if (dest > rpath + 1)
+                  while ((--dest)[-1] != '/');
+            }
+          else if (!S_ISDIR (st.st_mode) && *end != '\0')
+            {
+              __set_errno (ENOTDIR);
+              goto error;
+            }
+        }
+    }
+  if (dest > rpath + 1 && dest[-1] == '/')
+    --dest;
+  *dest = '\0';
+
+  scratch_buffer_free (&extra_buf);
+  return rpath;
+
+error:
+  if (resolved == NULL)
+    free (rpath);
+  scratch_buffer_free (&extra_buf);
+  return NULL;
+}
diff --git a/sysdeps/unix/sysv/linux/realpath-impl.c b/sysdeps/unix/sysv/linux/realpath-impl.c
new file mode 100644
index 0000000000..aa3d0850f0
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/realpath-impl.c
@@ -0,0 +1,69 @@
+/* Return the canonical absolute name of a given file.  Linux version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include
+#include
+#include
+
+/* The Linux implementation optimizes the worst case, where the path contains
+   multiple symlinks, by making the kernel resolve and return the full path:
+   it opens NAME and reads the resulting /proc/self/fd entry.  */
+
+char *
+__realpath_impl (const char *name, char *resolved)
+{
+  int fd = __open64_nocancel (name, O_PATH | O_NONBLOCK | O_CLOEXEC);
+  if (fd == -1)
+    {
+      /* If the call fails with either EACCES or ENOENT and resolved_path is
+         not NULL, then the prefix of path that is not readable or does not
+         exist is returned in resolved_path.  This is a GNU extension.  */
+      if (resolved != NULL)
+        __resolve_path (name, resolved);
+      return NULL;
+    }
+
+  struct fd_to_filename fdfilename;
+  const char *procname = __fd_to_filename (fd, &fdfilename);
+
+  struct scratch_buffer buf;
+  scratch_buffer_init (&buf);
+
+  ssize_t len = __resolve_readlink (procname, &buf);
+  __close_nocancel_nostatus (fd);
+
+  if (len < 0)
+    {
+      /* Fall back to the generic implementation if /proc is not mounted.  */
+      scratch_buffer_free (&buf);
+      return __resolve_path (name, resolved);
+    }
+
+  char *r;
+  if (resolved != NULL)
+    r = strcpy (resolved, buf.data);
+  else
+    {
+      /* If the buffer was allocated, return it instead of duplicating it.  */
+      r = scratch_buffer_finalize (&buf);
+      r = r ?: __strdup (buf.data);
+    }
+  scratch_buffer_free (&buf);
+  return r;
+}
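
For readers outside glibc, the open-then-readlink approach used by the Linux __realpath_impl can be tried from user space with public interfaces only. The following is a rough sketch under stated assumptions, not the patch's code: `resolve_via_proc` is a hypothetical name, it always returns a malloc'ed string (there is no caller-provided `resolved` buffer), and it uses a fixed PATH_MAX array instead of a growable scratch_buffer for brevity.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: let the kernel canonicalize NAME by opening it with
   O_PATH and reading back the /proc/self/fd entry for the descriptor.
   Returns a malloc'ed string, or NULL on failure.  */
static char *
resolve_via_proc (const char *name)
{
  assert (name != NULL);

  /* O_PATH only resolves the path; the file itself is not opened for I/O.  */
  int fd = open (name, O_PATH | O_CLOEXEC);
  if (fd == -1)
    return NULL;

  char procname[64];
  snprintf (procname, sizeof procname, "/proc/self/fd/%d", fd);

  char buf[PATH_MAX];
  ssize_t n = readlink (procname, buf, sizeof buf - 1);
  close (fd);

  if (n < 0)
    /* /proc is probably not mounted: use the generic resolution.  */
    return realpath (name, NULL);

  buf[n] = '\0';
  return strdup (buf);
}
```

The caller owns the returned string and releases it with `free`. For a path without symlinks this costs three syscalls (open, readlink, close) regardless of how many components the path has, which matches the trade-off described in the commit message.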