From patchwork Mon Dec 4 04:20:23 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Dmitry Goncharov X-Patchwork-Id: 24702 Received: (qmail 28352 invoked by alias); 4 Dec 2017 04:20:29 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 28342 invoked by uid 89); 4 Dec 2017 04:20:28 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, FREEMAIL_FROM, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=filters X-HELO: mail-qt0-f174.google.com X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition :content-transfer-encoding:in-reply-to:user-agent; bh=d144j+L9r7vTB0METcVtaJlQHjuoZoYZudJCOkajyio=; b=L8HYZOn3b295K67dQriqwVryXL9IZYl8xXe//17fC9eyTR83ylSkAAoNRKqHT3VqTS HUit30eS1kpxSZBBjjlFIwW1dar0kPmxqfMU0knRGxmSrFNQA3zSMXKwdsU/50ZdVdQ4 ER6sc6V0RiJGu1fIrAIq5z8nfsuikl9866oBdTALxnFl5+qPTHufo+yH+ogfpbouKfAM TDmD09z30L4pGdXhw/nfEFeS0OyRbTD8Gxmpb8fFB/XAipEFaIMAdF8CkhKGRUz99MEU EMNg5K784vnhjlisAwvZEgqHzZlXlObhdNeO8T1CB/GwapzCEg+vwNO0OkDva3+g4wzp D8/w== X-Gm-Message-State: AKGB3mJaJ3HDHykCIe4HSSX2McbzgCXbRYvJzhm6MunHtUYdvFamWlFM c8M5xgJrHOPHMNQOHTUREHw= X-Google-Smtp-Source: AGs4zMZQmGONV9CIa548WWNUFG4uGXMHZ3ddkmo5jyXhERjFUrlvWhwF3UdXw4VauouomJovw2O9Eg== X-Received: by 10.237.53.2 with SMTP id a2mr18787350qte.5.1512361224138; Sun, 03 Dec 2017 20:20:24 -0800 (PST) Date: Sun, 3 Dec 2017 23:20:23 -0500 From: Dmitry Goncharov To: Paul Eggert Cc: libc-alpha@sourceware.org, bug-gnulib@gnu.org Subject: Re: [PATCH] posix: if glob has a trailing slash match directories only. Message-ID: <20171204042021.GA3045@madrid> References: <20171128210857.GC2745@madrid> <20171129042114.GA2702@madrid> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) On Wed, Nov 29, 2017 at 03:01:46PM -0800, Paul Eggert wrote: > That would be a better approach. Another possibility might be to change > the last assignment to: > >     flags |= GLOB_ONLYDIR | GLOB_MARK; > > and then at the end, filter out all matches that aren't marked with > trailing '/'. This would avoid creating a new GLOB_XXX option and would > probably be easier to implement. > Please have a look at this implementation of your idea. regards, Dmitry diff --git a/ChangeLog b/ChangeLog index ffeb208132..1b7e3984d2 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,11 @@ +2017-11-28 Dmitry Goncharov + + [BZ #22513] + * posix/glob.c (glob_in_dir): Make pattern with a trailing slash + match directores only. + * posix/globtest.sh: Add tests. + * posix/globtest.c: Print gl_pathv before errors. + 2017-11-13 Florian Weimer * support/next_to_fault.h, support/next_to_fault.c: New files. diff --git a/posix/glob.c b/posix/glob.c index cb39779d07..10756f1f92 100644 --- a/posix/glob.c +++ b/posix/glob.c @@ -1124,8 +1124,11 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int), { /* Append slashes to directory names. */ size_t i; + int filter; + filter = 0; for (i = oldcount; i < pglob->gl_pathc + pglob->gl_offs; ++i) + { if (is_dir (pglob->gl_pathv[i], flags, pglob)) { size_t len = strlen (pglob->gl_pathv[i]) + 2; @@ -1140,6 +1143,77 @@ __glob (const char *pattern, int flags, int (*errfunc) (const char *, int), strcpy (&new[len - 2], "/"); pglob->gl_pathv[i] = new; } + else if (flags & GLOB_ONLYDIR) + { + /* Either the caller specified GLOB_MARK and GLOB_ONLYDIR or the + caller specified the pattern that has a trailing slash. + Filter out everything, but directories. */ + free (pglob->gl_pathv[i]); + pglob->gl_pathv[i] = NULL; + filter = 1; + } + } + if (filter) + { + /* After the first pass above gl_pathv has full elements (directories) + and empty elements. + This algorithm partitions gl_pathv to have all full elements in the + beginning followed by all empty elements at the end and then sets + gl_pathc to the number of the full elements. + + Assume the case of 2 elements offsetted by the caller and 2 more matches + found by prior calls to glob with GLOB_APPEND and 3 more matches found + this time. + offs = 2, oldcount = 4, pathc = 5. + + Initial position. + gl_pathv e f + | | | + v v v + [ ][ ] [ ][ ][ ][ ][ ] + <----><--------------> + offs pathc + <-----------> + oldcount + From this position e will be advancing to the next empty element + and f will be backtracking to the last full element. At each + iteration contents of e and f will be swapped. + */ + + size_t n; + char **e, **f; + n = pglob->gl_pathc + pglob->gl_offs; + assert (oldcount < n); /* Otherwise filter'd be 0. */ + /* f points at the last element of gl_pathv. */ + f = pglob->gl_pathv + n - 1; + /* e points at gl_pathv[oldcount]. */ + e = pglob->gl_pathv + oldcount; + assert (f >= e); + assert (f >= pglob->gl_pathv); /* Otherwise filter'd be 0. */ + for (;;) + { + assert (f >= pglob->gl_pathv); + assert (e < pglob->gl_pathv + n); + /* Advance e to point at the first empty element. */ + while (*e != NULL && e < f) + ++e; + if (e >= f) + break; + /* Backtrack f to point at the last full element. */ + while (*f == NULL && f > e) + --f; + if (e >= f) + break; + assert (f > e); + *e = *f; + *f = NULL; + } + /* If there are full elements in gl_pathv then f points at the last + one. Otherwise f points at gl_pathv[oldcount]. */ + pglob->gl_pathc = f - pglob->gl_pathv - pglob->gl_offs; + if (pglob->gl_pathc == 0) + retval = GLOB_NOMATCH; + } } if (!(flags & GLOB_NOSORT)) diff --git a/posix/globtest.c b/posix/globtest.c index 878ae33044..0a94550d26 100644 --- a/posix/globtest.c +++ b/posix/globtest.c @@ -93,14 +93,6 @@ main (int argc, char *argv[]) glob_flags |= GLOB_APPEND; } - /* Was there an error? */ - if (i == GLOB_NOSPACE) - puts ("GLOB_NOSPACE"); - else if (i == GLOB_ABORTED) - puts ("GLOB_ABORTED"); - else if (i == GLOB_NOMATCH) - puts ("GLOB_NOMATCH"); - /* If we set an offset, fill in the first field. (Unless glob() has filled it in already - which is an error) */ if ((glob_flags & GLOB_DOOFFS) && g.gl_pathv[0] == NULL) @@ -109,10 +101,25 @@ main (int argc, char *argv[]) /* Print out the names. Unless otherwise specified, qoute them. */ if (g.gl_pathv) { - for (i = 0; i < g.gl_offs + g.gl_pathc; ++i) + int k; + for (k = 0; k < g.gl_offs + g.gl_pathc; ++k) printf ("%s%s%s\n", quotes ? "`" : "", - g.gl_pathv[i] ? g.gl_pathv[i] : "(null)", + g.gl_pathv[k] ? g.gl_pathv[k] : "(null)", quotes ? "'" : ""); + } + + /* Was there an error? + * Printing the error after printing g.gl_pathv simplifies the test suite, + * because when -o is used 'abc' is always printed first regardless of + * whether the pattern matches anyting. + */ + if (i == GLOB_NOSPACE) + puts ("GLOB_NOSPACE"); + else if (i == GLOB_ABORTED) + puts ("GLOB_ABORTED"); + else if (i == GLOB_NOMATCH) + puts ("GLOB_NOMATCH"); + return 0; } diff --git a/posix/globtest.sh b/posix/globtest.sh index 73f7ae31cc..bc86fcc232 100755 --- a/posix/globtest.sh +++ b/posix/globtest.sh @@ -43,13 +43,22 @@ export LC_ALL # Create the arena testdir=${common_objpfx}posix/globtest-dir +tdir2=${common_objpfx}posix/globtest-dir2 testout=${common_objpfx}posix/globtest-out -rm -rf $testdir $testout +rm -rf $testdir $tdir2 $testout mkdir $testdir +mkdir $tdir2 +mkdir $tdir2/hello1d +ln -s $tdir2/hello1d $tdir2/hello1ds +mkdir $tdir2/hello2d +echo 1 > $tdir2/hello1f +ln -s $tdir2/hello1f $tdir2/hello1fs +echo 2 > $tdir2/hello2f +ln $tdir2/hello2f $tdir2/hello2fl cleanup() { chmod 777 $testdir/noread - rm -fr $testdir $testout + rm -fr $testdir $tdir2 $testout } trap cleanup 0 HUP INT QUIT TERM @@ -74,6 +83,30 @@ echo 1_1 > $testdir/dir1/file1_1 echo 1_2 > $testdir/dir1/file1_2 ln -fs dir1 $testdir/link1 +trailing_slash_test() +{ + local dir="$1" + local pattern="$2" + local expected="$3" + failed=0 + ${test_program_prefix} \ + ${common_objpfx}posix/globtest -q "$dir" "$pattern" | sort -fd > $testout + echo -e "$expected" | $CMP - $testout >> $logfile || failed=1 + if test $failed -ne 0; then + echo "$pattern test failed" >> $logfile + result=1 + fi + + failed=0 + ${test_program_prefix} \ + ${common_objpfx}posix/globtest -qo "$dir" "$pattern" | sort -fd > $testout + echo -e "abc\n$expected" | $CMP - $testout >> $logfile || failed=1 + if test $failed -ne 0; then + echo "$pattern with dooffs test failed" >> $logfile + result=1 + fi +} + # Run some tests. result=0 rm -f $logfile @@ -692,6 +725,22 @@ if test $failed -ne 0; then result=1 fi +# Try multiple patterns (GLOB_APPEND) with offset (GLOB_DOOFFS) +# The pattern has a trailing slash which eliminates files in the +# subdirectories. +failed=0 +${test_program_prefix} \ +${common_objpfx}posix/globtest -o "$testdir" "file1" "*/*/" | +sort > $testout +cat <<"EOF" | $CMP - $testout >> $logfile || failed=1 +`abc' +`file1' +EOF +if test $failed -ne 0; then + echo "GLOB_APPEND2 with slash test failed" >> $logfile + result=1 +fi + # Test NOCHECK with non-existing file in subdir. failed=0 ${test_program_prefix} \ @@ -815,6 +864,93 @@ if test $failed -ne 0; then result=1 fi +# This code tests the algo that filters out non directories when the pattern +# has a trailing slash. +# There are at least the following cases +# - The pattern matches files and directories. +# - The pattern matches nothing. +# - The pattern matches only files. +# - The pattern matches 1 file. +# - The pattern matches only directories. +# - The pattern matches 1 directory. +# Each test is run twice, with and without a trailing slash to demonstrate +# the differentce that the slash makes. +# Each test is also run from the target directory by doing chdir and from the +# current directory by specifying the full path in the pattern. +# Each test is also run with and without dooffs. + +# The pattern matches files and directories. +pattern='hello*' +expected="hello1d\nhello1ds\nhello1f\nhello1fs\nhello2d\nhello2f\nhello2fl" +trailing_slash_test "$tdir2" "$pattern" "$expected" + +expected="hello1d/\nhello1ds/\nhello2d/" +trailing_slash_test "$tdir2" "$pattern/" "$expected" + +pattern="$tdir2/hello*" +expected="$tdir2/hello1d\n$tdir2/hello1ds\n$tdir2/hello1f\n\ +$tdir2/hello1fs\n$tdir2/hello2d\n$tdir2/hello2f\n$tdir2/hello2fl" +trailing_slash_test . "$pattern" "$expected" + +expected="$tdir2/hello1d/\n$tdir2/hello1ds/\n$tdir2/hello2d/" +trailing_slash_test . "$pattern/" "$expected" + +# The pattern matches nothing. +pattern='nomatch*' +trailing_slash_test "$tdir2" "$pattern" GLOB_NOMATCH +trailing_slash_test "$tdir2" "$pattern/" GLOB_NOMATCH +trailing_slash_test . "$pattern" GLOB_NOMATCH +trailing_slash_test . "$pattern/" GLOB_NOMATCH + +# The pattern matches only files. +pattern='hello?f*' +expected="hello1f\nhello1fs\nhello2f\nhello2fl" +trailing_slash_test "$tdir2" "$pattern" "$expected" +trailing_slash_test "$tdir2" "$pattern/" GLOB_NOMATCH + +pattern="$tdir2/hello?f*" +expected="$tdir2/hello1f\n$tdir2/hello1fs\n$tdir2/hello2f\n$tdir2/hello2fl" +trailing_slash_test . "$pattern" "$expected" +trailing_slash_test . "$pattern/" GLOB_NOMATCH + +# The pattern matches only 1 file. +pattern='hello*fl' +expected="hello2fl" +trailing_slash_test "$tdir2" "$pattern" "$expected" +trailing_slash_test "$tdir2" "$pattern/" GLOB_NOMATCH + +pattern="$tdir2/hello*fl" +expected="$tdir2/hello2fl" +trailing_slash_test . "$pattern" "$expected" +trailing_slash_test . "$pattern/" GLOB_NOMATCH + +# The pattern matches only directores. +pattern='hello?d*' +expected="hello1d\nhello1ds\nhello2d" +trailing_slash_test "$tdir2" "$pattern" "$expected" + +expected="hello1d/\nhello1ds/\nhello2d/" +trailing_slash_test "$tdir2" "$pattern/" "$expected" + +pattern="$tdir2/hello?d*" +expected="$tdir2/hello1d\n$tdir2/hello1ds\n$tdir2/hello2d" +trailing_slash_test . "$pattern" "$expected" + +expected="$tdir2/hello1d/\n$tdir2/hello1ds/\n$tdir2/hello2d/" +trailing_slash_test . "$pattern/" "$expected" + +# The pattern matches only 1 directory. +pattern='hello*ds' +expected="hello1ds" +trailing_slash_test "$tdir2" "$pattern" "$expected" +trailing_slash_test "$tdir2" "$pattern/" "$expected/" + +pattern="$tdir2/hello*ds" +expected="$tdir2/hello1ds" +trailing_slash_test . "$pattern" "$expected" +trailing_slash_test . "$pattern/" "$expected/" + + if test $result -eq 0; then echo "All OK." > $logfile fi