From patchwork Mon Aug 10 20:48:54 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 40237
To: libc-alpha@sourceware.org
Subject: [PATCH 1/3] stdlib: Use fixed buffer size for realpath (BZ #26341)
Date: Mon, 10 Aug 2020 17:48:54 -0300
Message-Id: <20200810204856.2111211-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Cc: Xiaoming Ni

realpath now uses two fixed-size internal buffers of PATH_MAX bytes each, one to read the result of the readlink call and one to copy it, instead of calling alloca for every symlink. If PATH_MAX is not defined, it falls back to MAXPATHLEN when available and otherwise to a default value of 1024, as other stdlib implementations do.

The expected stack usage is about 8k on Linux, where PATH_MAX is defined as 4096 (plus some internal function usage for local variables).

Checked on x86_64-linux-gnu and i686-linux-gnu.
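As a rough illustration of how the two fixed buffers are used, the sketch below places a symlink target in front of the unresolved tail of the path inside a PATH_MAX-sized array. The helper name splice_link_target and its signature are hypothetical and not part of the patch; only the bounds check and the memmove/memcpy pair mirror what the diff does.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Same fallback idea as the patch: prefer the system PATH_MAX.  */
#ifndef PATH_MAX
# ifdef MAXPATHLEN
#  define PATH_MAX MAXPATHLEN
# else
#  define PATH_MAX 1024
# endif
#endif

/* Hypothetical helper (not in the patch): put TARGET, the N bytes that
   readlink returned, in front of REST, the still unresolved tail of the
   path, inside the fixed-size EXTRA_BUF.  REST may point into EXTRA_BUF
   itself, hence memmove.  Returns 0 on success or ENAMETOOLONG when the
   combined string does not fit in PATH_MAX bytes.  */
static int
splice_link_target (char extra_buf[PATH_MAX], const char *target,
                    size_t n, const char *rest)
{
  assert (extra_buf != NULL && target != NULL && rest != NULL);

  size_t len = strlen (rest);
  if (n >= PATH_MAX || PATH_MAX - n <= len)
    return ENAMETOOLONG;

  memmove (&extra_buf[n], rest, len + 1);   /* Tail plus its NUL.  */
  memcpy (extra_buf, target, n);            /* Link target in front.  */
  return 0;
}
```

With PATH_MAX at 4096, one such array plus the buffer handed to readlink accounts for the roughly 8k of stack mentioned above.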
--- stdlib/Makefile | 3 +- stdlib/canonicalize.c | 38 +++--- stdlib/tst-canon-bz26341.c | 108 ++++++++++++++++++ support/support_set_small_thread_stack_size.c | 12 +- support/xthread.h | 2 + 5 files changed, 138 insertions(+), 25 deletions(-) create mode 100644 stdlib/tst-canon-bz26341.c diff --git a/stdlib/Makefile b/stdlib/Makefile index 4615f6dfe7..7093b8a584 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -87,7 +87,7 @@ tests := tst-strtol tst-strtod testmb testrand testsort testdiv \ tst-makecontext-align test-bz22786 tst-strtod-nan-sign \ tst-swapcontext1 tst-setcontext4 tst-setcontext5 \ tst-setcontext6 tst-setcontext7 tst-setcontext8 \ - tst-setcontext9 tst-bz20544 + tst-setcontext9 tst-bz20544 tst-canon-bz26341 tests-internal := tst-strtod1i tst-strtod3 tst-strtod4 tst-strtod5i \ tst-tls-atexit tst-tls-atexit-nodelete @@ -102,6 +102,7 @@ LDLIBS-test-atexit-race = $(shared-thread-library) LDLIBS-test-at_quick_exit-race = $(shared-thread-library) LDLIBS-test-cxa_atexit-race = $(shared-thread-library) LDLIBS-test-on_exit-race = $(shared-thread-library) +LDLIBS-tst-canon-bz26341 = $(shared-thread-library) LDLIBS-test-dlclose-exit-race = $(shared-thread-library) $(libdl) LDFLAGS-test-dlclose-exit-race = $(LDFLAGS-rdynamic) diff --git a/stdlib/canonicalize.c b/stdlib/canonicalize.c index cbd885a3c5..554ba221e4 100644 --- a/stdlib/canonicalize.c +++ b/stdlib/canonicalize.c @@ -28,6 +28,14 @@ #include #include +#ifndef PATH_MAX +# ifdef MAXPATHLEN +# define PATH_MAX MAXPATHLEN +# else +# define PATH_MAX 1024 +# endif +#endif + /* Return the canonical absolute name of file NAME. A canonical name does not contain any `.', `..' components nor any repeated path separators ('/') or symlinks. All path components must exist. 
If @@ -42,9 +50,8 @@ char * __realpath (const char *name, char *resolved) { - char *rpath, *dest, *extra_buf = NULL; + char *rpath, *dest, extra_buf[PATH_MAX]; const char *start, *end, *rpath_limit; - long int path_max; int num_links = 0; if (name == NULL) @@ -65,27 +72,19 @@ __realpath (const char *name, char *resolved) return NULL; } -#ifdef PATH_MAX - path_max = PATH_MAX; -#else - path_max = __pathconf (name, _PC_PATH_MAX); - if (path_max <= 0) - path_max = 1024; -#endif - if (resolved == NULL) { - rpath = malloc (path_max); + rpath = malloc (PATH_MAX); if (rpath == NULL) return NULL; } else rpath = resolved; - rpath_limit = rpath + path_max; + rpath_limit = rpath + PATH_MAX; if (name[0] != '/') { - if (!__getcwd (rpath, path_max)) + if (!__getcwd (rpath, PATH_MAX)) { rpath[0] = '\0'; goto error; @@ -142,10 +141,10 @@ __realpath (const char *name, char *resolved) goto error; } new_size = rpath_limit - rpath; - if (end - start + 1 > path_max) + if (end - start + 1 > PATH_MAX) new_size += end - start + 1; else - new_size += path_max; + new_size += PATH_MAX; new_rpath = (char *) realloc (rpath, new_size); if (new_rpath == NULL) goto error; @@ -163,7 +162,7 @@ __realpath (const char *name, char *resolved) if (S_ISLNK (st.st_mode)) { - char *buf = __alloca (path_max); + char buf[PATH_MAX]; size_t len; if (++num_links > __eloop_threshold ()) @@ -172,16 +171,13 @@ __realpath (const char *name, char *resolved) goto error; } - n = __readlink (rpath, buf, path_max - 1); + n = __readlink (rpath, buf, sizeof (buf) - 1); if (n < 0) goto error; buf[n] = '\0'; - if (!extra_buf) - extra_buf = __alloca (path_max); - len = strlen (end); - if (path_max - n <= len) + if (PATH_MAX - n <= len) { __set_errno (ENAMETOOLONG); goto error; diff --git a/stdlib/tst-canon-bz26341.c b/stdlib/tst-canon-bz26341.c new file mode 100644 index 0000000000..63474bddaa --- /dev/null +++ b/stdlib/tst-canon-bz26341.c @@ -0,0 +1,108 @@ +/* Check if realpath does not consume extra stack space based on 
symlink + existence in the path (BZ #26341) + Copyright (C) 2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include + +#include +#include +#include +#include +#include + +static char *filename; +static size_t filenamelen; +static char *linkname; + +static int +maxsymlinks (void) +{ +#ifdef MAXSYMLINKS + return MAXSYMLINKS; +#else + long int sysconf_symloop_max = sysconf (_SC_SYMLOOP_MAX); + return sysconf_symloop_max <= 0 + ? _POSIX_SYMLOOP_MAX + : sysconf_symloop_max; +#endif +} + +#ifndef PATH_MAX +# define PATH_MAX 1024 +#endif + +static void +create_link (void) +{ + int fd = create_temp_file ("tst-canon-bz26341", &filename); + TEST_VERIFY_EXIT (fd != -1); + xclose (fd); + + char *prevlink = filename; + int maxlinks = maxsymlinks (); + for (int i = 0; i < maxlinks; i++) + { + linkname = xasprintf ("%s%d", filename, i); + xsymlink (prevlink, linkname); + add_temp_file (linkname); + prevlink = linkname; + } + + filenamelen = strlen (filename); +} + +static void * +do_realpath (void *arg) +{ + /* The old implementation of realpath allocates PATH_MAX bytes using alloca + for each symlink in the path, leading to MAXSYMLINKS times PATH_MAX + maximum stack usage.
+ This stack allocation tries to fill the thread's allocated stack minus + both the thread usage (plus some slack) and the realpath usage (plus some slack). + If realpath uses more than 2 * PATH_MAX plus some slack it will trigger + a stack overflow. */ + + const size_t realpath_usage = 2 * PATH_MAX + 1024; + const size_t thread_usage = 1 * PATH_MAX + 1024; + size_t stack_size = support_small_thread_stack_size () + - realpath_usage - thread_usage; + char stack[stack_size]; + char *resolved = stack + stack_size - thread_usage + 1024; + + char *p = realpath (linkname, resolved); + TEST_VERIFY (p != NULL); + TEST_COMPARE_BLOB (resolved, filenamelen, filename, filenamelen); + + return NULL; +} + +static int +do_test (void) +{ + create_link (); + + pthread_t th = xpthread_create (support_small_stack_thread_attribute (), + do_realpath, NULL); + xpthread_join (th); + + return 0; +} + +#include diff --git a/support/support_set_small_thread_stack_size.c b/support/support_set_small_thread_stack_size.c index 69d66e97db..74a0e38a72 100644 --- a/support/support_set_small_thread_stack_size.c +++ b/support/support_set_small_thread_stack_size.c @@ -20,8 +20,8 @@ #include #include -void -support_set_small_thread_stack_size (pthread_attr_t *attr) +size_t +support_small_thread_stack_size (void) { /* Some architectures have too small values for PTHREAD_STACK_MIN which cannot be used for creating threads.
Ensure that the stack @@ -31,5 +31,11 @@ support_set_small_thread_stack_size (pthread_attr_t *attr) if (stack_size < PTHREAD_STACK_MIN) stack_size = PTHREAD_STACK_MIN; #endif - xpthread_attr_setstacksize (attr, stack_size); + return stack_size; +} + +void +support_set_small_thread_stack_size (pthread_attr_t *attr) +{ + xpthread_attr_setstacksize (attr, support_small_thread_stack_size ()); } diff --git a/support/xthread.h b/support/xthread.h index 05f8d4a7d9..6ba2f5a18b 100644 --- a/support/xthread.h +++ b/support/xthread.h @@ -78,6 +78,8 @@ void xpthread_attr_setguardsize (pthread_attr_t *attr, /* Set the stack size in ATTR to a small value, but still large enough to cover most internal glibc stack usage. */ void support_set_small_thread_stack_size (pthread_attr_t *attr); +/* Return the stack size used on support_set_small_thread_stack_size. */ +size_t support_small_thread_stack_size (void); /* Return a pointer to a thread attribute which requests a small stack. The caller must not free this pointer. 
*/

From patchwork Mon Aug 10 20:48:55 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 40238
To: libc-alpha@sourceware.org
Subject: [PATCH 2/3] stdlib: Enforce PATH_MAX on allocated realpath buffer
Date: Mon, 10 Aug 2020 17:48:55 -0300
Message-Id: <20200810204856.2111211-2-adhemerval.zanella@linaro.org>
In-Reply-To: <20200810204856.2111211-1-adhemerval.zanella@linaro.org>
References: <20200810204856.2111211-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Cc: Xiaoming Ni

This removes the buffer resize for paths longer than PATH_MAX in the case where the 'resolved' buffer is not specified; realpath now fails with ENAMETOOLONG in that case as well. This makes it possible to assume that the realpath limit is PATH_MAX in all cases.

Checked on x86_64-linux-gnu and i686-linux-gnu.
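The uniform limit can be pictured with a small stand-alone sketch of the append step. The helper name append_component and its parameters are hypothetical; the error handling follows the hunk below (set ENAMETOOLONG, drop the separator that was just added, terminate the string), and the capacity would be PATH_MAX in realpath itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in the patch): append the N-byte component
   COMP to the path being built in RPATH, which has room for SIZE bytes
   and already starts with '/'.  DEST points at the current terminating
   NUL.  Returns the new end of the string, or NULL with errno set to
   ENAMETOOLONG when the component does not fit; the buffer is never
   grown, whoever allocated it.  */
static char *
append_component (char *rpath, size_t size, char *dest,
                  const char *comp, size_t n)
{
  assert (rpath != NULL && comp != NULL);
  assert (dest > rpath && dest < rpath + size);

  const char *rpath_limit = rpath + size;

  if (dest[-1] != '/')
    *dest++ = '/';

  if (n >= (size_t) (rpath_limit - dest))
    {
      errno = ENAMETOOLONG;
      if (dest > rpath + 1)
        dest--;                 /* Drop the trailing separator.  */
      *dest = '\0';
      return NULL;
    }

  memcpy (dest, comp, n);
  dest += n;
  *dest = '\0';
  return dest;
}
```

The sketch uses a tiny capacity in its checks only to make the failure easy to trigger.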
--- stdlib/canonicalize.c | 30 +++++------------------------- 1 file changed, 5 insertions(+), 25 deletions(-) diff --git a/stdlib/canonicalize.c b/stdlib/canonicalize.c index 554ba221e4..91c30e38be 100644 --- a/stdlib/canonicalize.c +++ b/stdlib/canonicalize.c @@ -122,36 +122,16 @@ __realpath (const char *name, char *resolved) } else { - size_t new_size; - if (dest[-1] != '/') *dest++ = '/'; if (dest + (end - start) >= rpath_limit) { - ptrdiff_t dest_offset = dest - rpath; - char *new_rpath; - - if (resolved) - { - __set_errno (ENAMETOOLONG); - if (dest > rpath + 1) - dest--; - *dest = '\0'; - goto error; - } - new_size = rpath_limit - rpath; - if (end - start + 1 > PATH_MAX) - new_size += end - start + 1; - else - new_size += PATH_MAX; - new_rpath = (char *) realloc (rpath, new_size); - if (new_rpath == NULL) - goto error; - rpath = new_rpath; - rpath_limit = rpath + new_size; - - dest = rpath + dest_offset; + __set_errno (ENAMETOOLONG); + if (dest > rpath + 1) + dest--; + *dest = '\0'; + goto error; } dest = __mempcpy (dest, start, end - start);

From patchwork Mon Aug 10 20:48:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 40239
To: libc-alpha@sourceware.org
Subject: [PATCH 3/3] linux: Optimize realpath stack usage
Date: Mon, 10 Aug 2020 17:48:56 -0300
Message-Id: <20200810204856.2111211-3-adhemerval.zanella@linaro.org>
In-Reply-To: <20200810204856.2111211-1-adhemerval.zanella@linaro.org>
References: <20200810204856.2111211-1-adhemerval.zanella@linaro.org>
From: Adhemerval Zanella Netto
Reply-To: Adhemerval Zanella
Cc: Xiaoming Ni

This optimizes the stack usage for the success case (from ~8k to ~4k) and for the case where the 'resolved' input buffer is not provided. For the failure case when the 'resolved' buffer is provided, the generic strategy is still required to find the path when EACCES or ENOENT is returned (this is a GNU extension not defined in the standard).

Regarding syscall usage, for a successful path without symlinks it trades 2 syscalls (getcwd/lstat) for 5 (openat, readlink, fstat, stat, and close). It is slightly better if the input contains multiple symlinks (where Linux kernel tricks allow replacing multiple lstat calls with a single readlink). For failure it depends on whether the 'resolved' buffer is provided, in which case the old strategy is called (thus requiring more syscalls in general).

Checked on x86_64-linux-gnu and i686-linux-gnu.
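The syscall sequence described above can be tried out from user space with the sketch below. It is only an approximation under stated assumptions: resolve_via_proc is a hypothetical name, the public open, snprintf, readlink, fstat, stat and close calls stand in for the glibc-internal ones used in the patch, /proc is assumed to be mounted, and the O_PATH fallback value is an assumption that holds on x86 and ARM Linux. The sketch also checks the fstat result.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE            /* O_PATH is a Linux extension.  */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef PATH_MAX
# define PATH_MAX 1024
#endif
#ifndef O_PATH
# define O_PATH 010000000       /* Assumption: x86/ARM Linux value.  */
#endif

/* Hypothetical user-space analogue of the Linux strategy: open NAME
   with O_PATH, read the kernel's view of it through /proc/self/fd,
   then verify that the returned path names the same file.  Returns 0
   and fills OUT (PATH_MAX bytes), or -1 with errno set.  */
static int
resolve_via_proc (const char *name, char out[PATH_MAX])
{
  assert (name != NULL && out != NULL);

  int fd = open (name, O_PATH | O_NONBLOCK | O_CLOEXEC);
  if (fd == -1)
    return -1;

  char procname[sizeof ("/proc/self/fd/") + 3 * sizeof (int)];
  snprintf (procname, sizeof (procname), "/proc/self/fd/%d", fd);

  int result = -1;
  struct stat by_fd, by_path;
  ssize_t len = readlink (procname, out, PATH_MAX - 1);
  if (len >= 0)
    {
      out[len] = '\0';
      if (fstat (fd, &by_fd) == 0 && stat (out, &by_path) == 0)
        {
          /* Same device and inode: the path really names this file.  */
          if (by_fd.st_dev == by_path.st_dev
              && by_fd.st_ino == by_path.st_ino)
            result = 0;
          else
            errno = ELOOP;
        }
    }

  int saved_errno = errno;      /* close must not clobber the error.  */
  close (fd);
  errno = saved_errno;
  return result;
}
```

Only one PATH_MAX-sized buffer is needed on this path, which is where the drop from ~8k to ~4k comes from.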
--- include/stdlib.h | 13 +++ stdlib/Makefile | 2 +- stdlib/canonicalize-internal.c | 136 ++++++++++++++++++++++++++++ stdlib/canonicalize.c | 141 +---------------------------- stdlib/realpath.c | 42 +++++++++ sysdeps/unix/sysv/linux/realpath.c | 65 +++++++++++++ 6 files changed, 258 insertions(+), 141 deletions(-) create mode 100644 stdlib/canonicalize-internal.c create mode 100644 stdlib/realpath.c create mode 100644 sysdeps/unix/sysv/linux/realpath.c diff --git a/include/stdlib.h b/include/stdlib.h index ffcefd7b85..dd51f66b26 100644 --- a/include/stdlib.h +++ b/include/stdlib.h @@ -20,6 +20,14 @@ # include +# ifndef PATH_MAX +# ifdef MAXPATHLEN +# define PATH_MAX MAXPATHLEN +# else +# define PATH_MAX 1024 +# endif +# endif + extern __typeof (strtol_l) __strtol_l; extern __typeof (strtoul_l) __strtoul_l; extern __typeof (strtoll_l) __strtoll_l; @@ -92,6 +100,11 @@ extern int __unsetenv (const char *__name) attribute_hidden; extern int __clearenv (void) attribute_hidden; extern char *__mktemp (char *__template) __THROW __nonnull ((1)); extern char *__canonicalize_file_name (const char *__name); +extern _Bool __resolve_path (const char *name, char *resolved, + size_t path_max) + attribute_hidden; +extern char *__realpath_system (const char *name, char *resolved) + attribute_hidden; extern char *__realpath (const char *__name, char *__resolved); libc_hidden_proto (__realpath) extern int __ptsname_r (int __fd, char *__buf, size_t __buflen) diff --git a/stdlib/Makefile b/stdlib/Makefile index 7093b8a584..35ca04541f 100644 --- a/stdlib/Makefile +++ b/stdlib/Makefile @@ -53,7 +53,7 @@ routines := \ strtof strtod strtold \ strtof_l strtod_l strtold_l \ strtof_nan strtod_nan strtold_nan \ - system canonicalize \ + system realpath canonicalize canonicalize-internal \ a64l l64a \ rpmatch strfmon strfmon_l getsubopt xpg_basename fmtmsg \ strtoimax strtoumax wcstoimax wcstoumax \ diff --git a/stdlib/canonicalize-internal.c b/stdlib/canonicalize-internal.c new file mode 
100644 index 0000000000..1b5f73a1cc --- /dev/null +++ b/stdlib/canonicalize-internal.c @@ -0,0 +1,136 @@ +/* Internal function for canonicalize absolute name of a given file. + Copyright (C) 1996-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include + +_Bool +__resolve_path (const char *name, char *resolved, size_t path_max) +{ + const char *start, *end; + char *rpath = resolved; + char *rpath_limit = rpath + path_max; + char *dest = resolved; + char extra_buf[PATH_MAX]; + int num_links = 0; + + if (name[0] != '/') + { + if (__getcwd (rpath, path_max) == NULL) + { + rpath[0] = '\0'; + return false; + } + dest = __rawmemchr (rpath, '\0'); + } + else + { + rpath[0] = '/'; + dest = rpath + 1; + } + + for (start = end = name; *start; start = end) + { + /* Skip sequence of multiple path-separators. */ + while (*start == '/') + ++start; + + /* Find end of path component. */ + for (end = start; *end && *end != '/'; ++end) + /* Nothing. */; + + if (end - start == 0) + break; + else if (end - start == 1 && start[0] == '.') + /* nothing */; + else if (end - start == 2 && start[0] == '.' && start[1] == '.') + { + /* Back up to previous component, ignore if at root already. 
*/ + if (dest > rpath + 1) + while ((--dest)[-1] != '/'); + } + else + { + struct stat64 st; + + if (dest[-1] != '/') + *dest++ = '/'; + if (dest + (end - start) >= rpath_limit) + { + __set_errno (ENAMETOOLONG); + if (dest > rpath + 1) + dest--; + *dest = '\0'; + return false; + } + + dest = __mempcpy (dest, start, end - start); + *dest = '\0'; + + if (__lstat64 (rpath, &st) < 0) + return false; + + if (S_ISLNK (st.st_mode)) + { + if (++num_links > __eloop_threshold ()) + { + __set_errno (ELOOP); + return false; + } + + char buf[PATH_MAX]; + ssize_t n = __readlink (rpath, buf, sizeof (buf) - 1); + if (n < 0) + return false; + buf[n] = '\0'; + + size_t len = strlen (end); + if (path_max - n <= len) + { + __set_errno (ENAMETOOLONG); + return false; + } + + memmove (&extra_buf[n], end, len + 1); + name = end = memcpy (extra_buf, buf, n); + + if (buf[0] == '/') + dest = rpath + 1; /* It's an absolute symlink */ + else + /* Back up to previous component, ignore if at root already: */ + if (dest > rpath + 1) + while ((--dest)[-1] != '/'); + } + else if (!S_ISDIR (st.st_mode) && *end != '\0') + { + __set_errno (ENOTDIR); + return false; + } + } + } + + if (dest > rpath + 1 && dest[-1] == '/') + --dest; + *dest = '\0'; + + return true; +} diff --git a/stdlib/canonicalize.c b/stdlib/canonicalize.c index 91c30e38be..f4ab528a15 100644 --- a/stdlib/canonicalize.c +++ b/stdlib/canonicalize.c @@ -16,26 +16,11 @@ License along with the GNU C Library; if not, see . */ -#include #include -#include -#include -#include -#include #include -#include -#include #include -#ifndef PATH_MAX -# ifdef MAXPATHLEN -# define PATH_MAX MAXPATHLEN -# else -# define PATH_MAX 1024 -# endif -#endif - /* Return the canonical absolute name of file NAME. A canonical name does not contain any `.', `..' components nor any repeated path separators ('/') or symlinks. All path components must exist. 
If @@ -50,10 +35,6 @@ char * __realpath (const char *name, char *resolved) { - char *rpath, *dest, extra_buf[PATH_MAX]; - const char *start, *end, *rpath_limit; - int num_links = 0; - if (name == NULL) { /* As per Single Unix Specification V2 we must return an error if @@ -72,127 +53,7 @@ __realpath (const char *name, char *resolved) return NULL; } - if (resolved == NULL) - { - rpath = malloc (PATH_MAX); - if (rpath == NULL) - return NULL; - } - else - rpath = resolved; - rpath_limit = rpath + PATH_MAX; - - if (name[0] != '/') - { - if (!__getcwd (rpath, PATH_MAX)) - { - rpath[0] = '\0'; - goto error; - } - dest = __rawmemchr (rpath, '\0'); - } - else - { - rpath[0] = '/'; - dest = rpath + 1; - } - - for (start = end = name; *start; start = end) - { - struct stat64 st; - int n; - - /* Skip sequence of multiple path-separators. */ - while (*start == '/') - ++start; - - /* Find end of path component. */ - for (end = start; *end && *end != '/'; ++end) - /* Nothing. */; - - if (end - start == 0) - break; - else if (end - start == 1 && start[0] == '.') - /* nothing */; - else if (end - start == 2 && start[0] == '.' && start[1] == '.') - { - /* Back up to previous component, ignore if at root already. 
*/ - if (dest > rpath + 1) - while ((--dest)[-1] != '/'); - } - else - { - if (dest[-1] != '/') - *dest++ = '/'; - - if (dest + (end - start) >= rpath_limit) - { - __set_errno (ENAMETOOLONG); - if (dest > rpath + 1) - dest--; - *dest = '\0'; - goto error; - } - - dest = __mempcpy (dest, start, end - start); - *dest = '\0'; - - if (__lxstat64 (_STAT_VER, rpath, &st) < 0) - goto error; - - if (S_ISLNK (st.st_mode)) - { - char buf[PATH_MAX]; - size_t len; - - if (++num_links > __eloop_threshold ()) - { - __set_errno (ELOOP); - goto error; - } - - n = __readlink (rpath, buf, sizeof (buf) - 1); - if (n < 0) - goto error; - buf[n] = '\0'; - - len = strlen (end); - if (PATH_MAX - n <= len) - { - __set_errno (ENAMETOOLONG); - goto error; - } - - /* Careful here, end may be a pointer into extra_buf... */ - memmove (&extra_buf[n], end, len + 1); - name = end = memcpy (extra_buf, buf, n); - - if (buf[0] == '/') - dest = rpath + 1; /* It's an absolute symlink */ - else - /* Back up to previous component, ignore if at root already: */ - if (dest > rpath + 1) - while ((--dest)[-1] != '/'); - } - else if (!S_ISDIR (st.st_mode) && *end != '\0') - { - __set_errno (ENOTDIR); - goto error; - } - } - } - if (dest > rpath + 1 && dest[-1] == '/') - --dest; - *dest = '\0'; - - assert (resolved == NULL || resolved == rpath); - return rpath; - -error: - assert (resolved == NULL || resolved == rpath); - if (resolved == NULL) - free (rpath); - return NULL; + return __realpath_system (name, resolved); } libc_hidden_def (__realpath) versioned_symbol (libc, __realpath, realpath, GLIBC_2_3); diff --git a/stdlib/realpath.c b/stdlib/realpath.c new file mode 100644 index 0000000000..1a70c658b7 --- /dev/null +++ b/stdlib/realpath.c @@ -0,0 +1,42 @@ +/* Return the canonical absolute name of a given file. + Copyright (C) 1996-2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include + +char * +__realpath_system (const char *name, char *resolved) +{ + bool resolved_malloc = false; + if (resolved == NULL) + { + resolved = malloc (PATH_MAX); + if (resolved == NULL) + return NULL; + resolved_malloc = true; + } + + if (! __resolve_path (name, resolved, PATH_MAX)) + { + if (resolved_malloc) + free (resolved); + return NULL; + } + return resolved; +} diff --git a/sysdeps/unix/sysv/linux/realpath.c b/sysdeps/unix/sysv/linux/realpath.c new file mode 100644 index 0000000000..4c69011f92 --- /dev/null +++ b/sysdeps/unix/sysv/linux/realpath.c @@ -0,0 +1,65 @@ +/* Return the canonical absolute name of a given file. Linux version. + Copyright (C) 2020 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include + +char * +__realpath_system (const char *name, char *resolved) +{ + int fd = __open64_nocancel (name, O_PATH | O_NONBLOCK | O_CLOEXEC); + if (fd == -1) + { + /* If the call fails with either EACCES or ENOENT and resolved_path is + not NULL, then the prefix of path that is not readable or does not + exist is returned in resolved_path. This is a GNU extension. */ + if (resolved != NULL) + __resolve_path (name, resolved, PATH_MAX); + return NULL; + } + + char procname[sizeof ("/proc/self/fd/") + 3 * sizeof (int)]; + *_fitoa_word (fd, __stpcpy (procname, "/proc/self/fd/"), 10, 0) = '\0'; + + char path[PATH_MAX]; + ssize_t len = __readlink (procname, path, sizeof (path) - 1); + if (len < 0) + { + __close_nocancel_nostatus (fd); + return NULL; + } + path[len] = '\0'; + + struct stat64 st; + fstat64 (fd, &st); + dev_t st_dev = st.st_dev; + ino_t st_ino = st.st_ino; + int r = stat64 (path, &st); + if (r == -1 || st.st_dev != st_dev || st.st_ino != st_ino) + { + if (r == 0) + __set_errno (ELOOP); + __close_nocancel_nostatus (fd); + return NULL; + } + + __close_nocancel_nostatus (fd); + return resolved != NULL ? strcpy (resolved, path) : __strdup (path); +}