PR28240 patch: debuginfod-client negative-hit caching for root user

Message ID 20210818223907.GD6064@redhat.com
State Superseded
Headers
Series PR28240 patch: debuginfod-client negative-hit caching for root user |

Commit Message

Frank Ch. Eigler Aug. 18, 2021, 10:39 p.m. UTC
  Hi -

Important patch.


commit d2cbc610a9e6552f663e29136d19597b8fdcf832 (HEAD -> master)
Author: Frank Ch. Eigler <fche@redhat.com>
Date:   Wed Aug 18 18:29:34 2021 -0400

    PR28240: debuginfod client root-safe negative caching
    
    Negative cache (000-permission) files were incorrectly treated as
    valid cached files for the root user, because root can open even
    000-perm files without -EACCES.  Corrected this checking sequence.
    
    Fixed the debuginfod testsuite to run to completion as root or
    as an ordinary user, correcting corresponding permission checks:
        stat -c %A $FILE
    is right and
        [ -w $FILE]  [ -r $FILE ]
    were wrong.  Fixed related testsuite inconsistencies to assert
    subprocess errors (rc != 0), where
       ! CMD
    is right and
       CMD && false || true
    and similar were wrong.
    
    Signed-off-by: Frank Ch. Eigler <fche@redhat.com>
  

Comments

Mark Wielaard Aug. 26, 2021, 9:59 p.m. UTC | #1
Hi Frank,

On Wed, Aug 18, 2021 at 06:39:07PM -0400, Frank Ch. Eigler via Elfutils-devel wrote:
> commit d2cbc610a9e6552f663e29136d19597b8fdcf832 (HEAD -> master)
> Author: Frank Ch. Eigler <fche@redhat.com>
> Date:   Wed Aug 18 18:29:34 2021 -0400
> 
>     PR28240: debuginfod client root-safe negative caching
>     
>     Negative cache (000-permission) files were incorrectly treated as
>     valid cached files for the root user, because root can open even
>     000-perm files without -EACCES.  Corrected this checking sequence.

Yeah, technically access isn't really checking the file permissions.

>     Fixed the debuginfod testsuite to run to completion as root or
>     as an ordinary user, correcting corresponding permission checks:
>         stat -c %A $FILE
>     is right and
>         [ -w $FILE]  [ -r $FILE ]
>     were wrong.  Fixed related testsuite inconsistencies to assert
>     subprocess errors (rc != 0), where
>        ! CMD
>     is right and
>        CMD && false || true
>     and similar were wrong.

Thanks. Even though I think people shouldn't run the build/testsuite
as root. But people will... sigh.

> diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
> index 7d4b220f30dc..be15736b3ebd 100644
> --- a/debuginfod/debuginfod-client.c
> +++ b/debuginfod/debuginfod-client.c
> @@ -726,47 +726,49 @@ debuginfod_query_server (debuginfod_client *c,
>  
>    rc = debuginfod_init_cache(cache_path, interval_path, maxage_path);
>    if (rc != 0)
>      goto out;
>    rc = debuginfod_clean_cache(c, cache_path, interval_path, maxage_path);
>    if (rc != 0)
>      goto out;
>  
> -  /* If the target is already in the cache then we are done.  */
> -  int fd = open (target_cache_path, O_RDONLY);
> -  if (fd >= 0)
> -    {
> -      /* Success!!!! */
> -      if (path != NULL)
> -        *path = strdup(target_cache_path);
> -      rc = fd;
> -      goto out;
> -    }
> -
>    struct stat st;
> -  time_t cache_miss;
> -  /* Check if the file exists and it's of permission 000*/
> -  if (errno == EACCES
> -      && stat(target_cache_path, &st) == 0
> +  /* Check if the file exists and it's of permission 000; must check
> +     explicitly rather than trying to open it first (PR28240). */
> +  if (stat(target_cache_path, &st) == 0
>        && (st.st_mode & 0777) == 0)
>      {
> +      time_t cache_miss;
> +
>        rc = debuginfod_config_cache(cache_miss_path, cache_miss_default_s, &st);
>        if (rc < 0)
>          goto out;
>  
>        cache_miss = (time_t)rc;
>        if (time(NULL) - st.st_mtime <= cache_miss)
>          {
>           rc = -ENOENT;
>           goto out;
>         }
>        else
>          unlink(target_cache_path);
>      }
> +  
> +  /* If the target is already in the cache, and not a 000 file (PR28240), 
> +     then we are done. */
> +  int fd = open (target_cache_path, O_RDONLY);
> +  if (fd >= 0)
> +    {
> +      /* Success!!!! */
> +      if (path != NULL)
> +        *path = strdup(target_cache_path);
> +      rc = fd;
> +      goto out;
> +    }

Yes, this seems the right sequence, but don't we still have a little race
condition (if someone runs as root)?

If the timeout triggers, the file gets unlinked, then some other
process comes along, races past this one, does the server call, and
puts back a 000 file, then this process wakes up again and does the
open (as root) getting the empty file...

Cheers,

Mark
  
Frank Ch. Eigler Aug. 29, 2021, 11:43 a.m. UTC | #2
Hi -

On Thu, Aug 26, 2021 at 11:59:24PM +0200, Mark Wielaard wrote:
> [...]
> Yes, this seems the right sequence, but don't we still have a little race
> condition (if someone runs as root)?

Yeah, I see what you mean, good point.  The code should do the open(),
and then fstat() the fd.

- FChE
  

Patch

diff --git a/debuginfod/ChangeLog b/debuginfod/ChangeLog
index 9e82d78d2d4e..76155e69e51c 100644
--- a/debuginfod/ChangeLog
+++ b/debuginfod/ChangeLog
@@ -1,8 +1,14 @@ 
+2021-08-18  Frank Ch. Eigler  <fche@redhat.com>
+
+	PR28240
+	* debuginfod-client.c (debuginfod_query_server): Correct
+	negative-hit cache check sequence for root user.
+
 2021-07-26  Noah Sanci  <nsanci@redhat.com>
 
 	PR27982
 	* debuginfod-client.c (globals): added default_maxsize and
 	default_maxtime.
 	(debuginfod_query_server): Added DEBUGINFOD_MAXSIZE and
 	DEBUGINFOD_MAXTIME envvar processing.
 	* debuginfod.cxx (handler_cb): If the requested file exceeds
diff --git a/debuginfod/debuginfod-client.c b/debuginfod/debuginfod-client.c
index 7d4b220f30dc..be15736b3ebd 100644
--- a/debuginfod/debuginfod-client.c
+++ b/debuginfod/debuginfod-client.c
@@ -726,47 +726,49 @@  debuginfod_query_server (debuginfod_client *c,
 
   rc = debuginfod_init_cache(cache_path, interval_path, maxage_path);
   if (rc != 0)
     goto out;
   rc = debuginfod_clean_cache(c, cache_path, interval_path, maxage_path);
   if (rc != 0)
     goto out;
 
-  /* If the target is already in the cache then we are done.  */
-  int fd = open (target_cache_path, O_RDONLY);
-  if (fd >= 0)
-    {
-      /* Success!!!! */
-      if (path != NULL)
-        *path = strdup(target_cache_path);
-      rc = fd;
-      goto out;
-    }
-
   struct stat st;
-  time_t cache_miss;
-  /* Check if the file exists and it's of permission 000*/
-  if (errno == EACCES
-      && stat(target_cache_path, &st) == 0
+  /* Check if the file exists and it's of permission 000; must check
+     explicitly rather than trying to open it first (PR28240). */
+  if (stat(target_cache_path, &st) == 0
       && (st.st_mode & 0777) == 0)
     {
+      time_t cache_miss;
+
       rc = debuginfod_config_cache(cache_miss_path, cache_miss_default_s, &st);
       if (rc < 0)
         goto out;
 
       cache_miss = (time_t)rc;
       if (time(NULL) - st.st_mtime <= cache_miss)
         {
          rc = -ENOENT;
          goto out;
        }
       else
         unlink(target_cache_path);
     }
+  
+  /* If the target is already in the cache, and not a 000 file (PR28240), 
+     then we are done. */
+  int fd = open (target_cache_path, O_RDONLY);
+  if (fd >= 0)
+    {
+      /* Success!!!! */
+      if (path != NULL)
+        *path = strdup(target_cache_path);
+      rc = fd;
+      goto out;
+    }
 
   long timeout = default_timeout;
   const char* timeout_envvar = getenv(DEBUGINFOD_TIMEOUT_ENV_VAR);
   if (timeout_envvar != NULL)
     timeout = atoi (timeout_envvar);
 
   if (vfd >= 0)
     dprintf (vfd, "using timeout %ld\n", timeout);
diff --git a/tests/ChangeLog b/tests/ChangeLog
index 3bfd1ca224fb..6366bb19bacf 100644
--- a/tests/ChangeLog
+++ b/tests/ChangeLog
@@ -1,8 +1,15 @@ 
+2021-08-18  Frank Ch. Eigler  <fche@redhat.com>
+
+	PR28240
+	* run-debuginfod-find.sh: Correct XFAIL (rc!=0) subprocess
+	invocation syntax throughout.  Correct negative-cache
+	file permission checking.
+
 2021-08-04  Mark Wielaard  <mark@klomp.org>
 
 	PR28190
 	* backtrace.c (callback_verify): Check for pthread_kill as first
 	frame. Change asserts to fprintf plus abort.
 
 2021-07-26  Noah Sanci  <nsanci@redhat.com>
 
diff --git a/tests/run-debuginfod-find.sh b/tests/run-debuginfod-find.sh
index 991d1dc57164..bccd33051969 100755
--- a/tests/run-debuginfod-find.sh
+++ b/tests/run-debuginfod-find.sh
@@ -203,52 +203,52 @@  if [ -f ${DEBUGINFOD_CACHE_PATH}/${BUILDID} ]; then
   echo "File cached after maxtime check"
   err
 fi
 ########################################################################
 # PR25628
 rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests
 
 # The query is designed to fail, while the 000-permission file should be created.
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo 01234567 || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo 01234567
 if [ ! -f $DEBUGINFOD_CACHE_PATH/01234567/debuginfo ]; then
   echo "could not find cache in $DEBUGINFOD_CACHE_PATH"
   err
 fi
 
-if [ -r $DEBUGINFOD_CACHE_PATH/01234567/debuginfo ]; then
+if [ `stat -c "%A" $DEBUGINFOD_CACHE_PATH/01234567/debuginfo` != "----------" ]; then
   echo "The cache $DEBUGINFOD_CACHE_PATH/01234567/debuginfo is readable"
   err
 fi
 
 bytecount_before=`curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_count{code="404"}'`
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo 01234567 || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo 01234567
 bytecount_after=`curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_count{code="404"}'`
 if [ "$bytecount_before" != "$bytecount_after" ]; then
   echo "http_responses_transfer_bytes_count{code="404"} has changed."
   err
 fi
 
 # set cache_miss_s to 0 and sleep 1 to make the mtime expire.
 echo 0 > $DEBUGINFOD_CACHE_PATH/cache_miss_s
 sleep 1
 bytecount_before=`curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_count{code="404"}'`
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo 01234567 || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo 01234567
 bytecount_after=`curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_count{code="404"}'`
 if [ "$bytecount_before" == "$bytecount_after" ]; then
   echo "http_responses_transfer_bytes_count{code="404"} should be incremented."
   err
 fi
 ########################################################################
 
 # Test whether debuginfod-find is able to fetch those files.
 rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests
 filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID`
 cmp $filename F/p+r%o\$g.debug
-if [ -w $filename ]; then
+if [  `stat -c "%A" $filename` != -r-------- ]; then
     echo "cache file writable, boo"
     err
 fi
 
 filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable F/p+r%o\\$g`
 cmp $filename F/p+r%o\$g
 
 # raw source filename
@@ -388,18 +388,18 @@  sourcefiles=$(find -name \*\\.debug \
 cd ..
 rm -rf extracted
 
 wait_ready $PORT1 'found_sourcerefs_total{source=".rpm archive"}' $sourcefiles
 
 ########################################################################
 # PR27983 ensure no duplicate urls are used in when querying servers for files
 rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests
-env DEBUGINFOD_URLS="http://127.0.0.1:$PORT1 http://127.0.0.1:$PORT1 http://127.0.0.1:$PORT1 http:127.0.0.1:7999" \
- LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find -v executable $BUILDID2 > vlog4 2>&1 || true
+! env DEBUGINFOD_URLS="http://127.0.0.1:$PORT1 http://127.0.0.1:$PORT1 http://127.0.0.1:$PORT1 http:127.0.0.1:7999" \
+ LD_LIBRARY_PATH=$ldpath ${abs_top_builddir}/debuginfod/debuginfod-find -v executable $BUILDID2 > vlog4 2>&1
 tempfiles vlog4
 if [ $( grep -c 'duplicate url: http://127.0.0.1:'$PORT1'.*' vlog4 ) -ne 2 ]; then
   echo "Duplicated servers remain";
   err
 fi
 ########################################################################
 # Run a bank of queries against the debuginfod-rpms / debuginfod-debs test cases
 
@@ -471,17 +471,17 @@  kill -USR2 $PID1  # groom cycle
 # 1 groom cycle already took place at/soon-after startup, so -USR2 makes 2
 wait_ready $PORT1 'thread_work_total{role="groom"}' 2
 # Expect 4 rpms containing 2 buildids to be deleted by the groom
 wait_ready $PORT1 'groomed_total{decision="stale"}' 4
 
 rm -rf $DEBUGINFOD_CACHE_PATH # clean it from previous tests
 
 # this is one of the buildids from the groom-deleted rpms
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable $RPM_BUILDID && false || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable $RPM_BUILDID
 # but this one was not deleted so should be still around
 testrun ${abs_top_builddir}/debuginfod/debuginfod-find executable $BUILDID2
 
 ########################################################################
 
 # PR26810: Now rename some files in the R directory, then rescan, so
 # there are two copies of the same buildid in the index, one for the
 # no-longer-existing file name, and one under the new name.
@@ -573,38 +573,38 @@  if type bsdtar 2>/dev/null; then
     archive_test f17a29b5a25bd4960531d82aa6b07c8abe84fa66 "" ""
 fi
 
 rm -rf $DEBUGINFOD_CACHE_PATH
 testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID
 
 # send a request to stress XFF and User-Agent federation relay;
 # we'll grep for the two patterns in vlog$PORT1
-curl -s -H 'User-Agent: TESTCURL' -H 'X-Forwarded-For: TESTXFF' $DEBUGINFOD_URLS/buildid/deaddeadbeef00000000/debuginfo -o /dev/null || true
+! curl -s -H 'User-Agent: TESTCURL' -H 'X-Forwarded-For: TESTXFF' $DEBUGINFOD_URLS/buildid/deaddeadbeef00000000/debuginfo -o /dev/null
 
 grep UA:TESTCURL vlog$PORT1
 grep XFF:TESTXFF vlog$PORT1
 
 
 # confirm that first server can't resolve symlinked info in L/ but second can
 BUILDID=`env LD_LIBRARY_PATH=$ldpath ${abs_builddir}/../src/readelf \
          -a L/foo | grep 'Build ID' | cut -d ' ' -f 7`
 file L/foo
 file -L L/foo
 export DEBUGINFOD_URLS=http://127.0.0.1:$PORT1
 rm -rf $DEBUGINFOD_CACHE_PATH
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID && false || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID
 rm -f $DEBUGINFOD_CACHE_PATH/$BUILDID/debuginfo # drop 000-perm negative-hit file
 export DEBUGINFOD_URLS=http://127.0.0.1:$PORT2
 testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID
 
 # test again with scheme free url
 export DEBUGINFOD_URLS=127.0.0.1:$PORT1
 rm -rf $DEBUGINFOD_CACHE_PATH
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID && false || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID
 rm -f $DEBUGINFOD_CACHE_PATH/$BUILDID/debuginfo # drop 000-perm negative-hit file
 export DEBUGINFOD_URLS=127.0.0.1:$PORT2
 testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID
 
 # test parallel queries in client
 export DEBUGINFOD_CACHE_PATH=${PWD}/.client_cache3
 mkdir -p $DEBUGINFOD_CACHE_PATH
 export DEBUGINFOD_URLS="BAD http://127.0.0.1:$PORT1 127.0.0.1:$PORT1 http://127.0.0.1:$PORT2 DNE"
@@ -625,22 +625,22 @@  curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_duration_millisec
 curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_count'
 curl -s http://127.0.0.1:$PORT1/metrics | grep 'http_responses_transfer_bytes_sum'
 curl -s http://127.0.0.1:$PORT1/metrics | grep 'fdcache_'
 curl -s http://127.0.0.1:$PORT1/metrics | grep 'error_count'
 curl -s http://127.0.0.1:$PORT1/metrics | grep 'traversed_total'
 curl -s http://127.0.0.1:$PORT1/metrics | grep 'scanned_bytes_total'
 
 # And generate a few errors into the second debuginfod's logs, for analysis just below
-curl -s http://127.0.0.1:$PORT2/badapi > /dev/null || true
-curl -s http://127.0.0.1:$PORT2/buildid/deadbeef/debuginfo > /dev/null || true  
+! curl -s http://127.0.0.1:$PORT2/badapi > /dev/null
+! curl -s http://127.0.0.1:$PORT2/buildid/deadbeef/debuginfo > /dev/null
 # NB: this error is used to seed the 404 failure for the survive-404 tests
 
 # Confirm bad artifact types are rejected without leaving trace
-curl -s http://127.0.0.1:$PORT2/buildid/deadbeef/badtype > /dev/null || true
+! curl -s http://127.0.0.1:$PORT2/buildid/deadbeef/badtype > /dev/null
 (curl -s http://127.0.0.1:$PORT2/metrics | grep 'badtype') && false
 
 # DISABLE VALGRIND checking because valgrind might use debuginfod client
 # requests itself, causing confusion about who put what in the cache.
 # It stays disabled till the end of this test.
 unset VALGRIND_CMD
 
 # Confirm that reused curl connections survive 404 errors.
@@ -680,17 +680,17 @@  dd if=/dev/zero of=$DB bs=1 count=1
 ls -al $DB
 # trigger some random activity that's Sure to get sqlite3 upset
 kill -USR1 $PID1
 wait_ready $PORT1 'thread_work_total{role="traverse"}' 9
 wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
 wait_ready $PORT1 'thread_busy{role="scan"}' 0
 kill -USR2 $PID1
 wait_ready $PORT1 'thread_work_total{role="groom"}' 4
-curl -s http://127.0.0.1:$PORT1/buildid/beefbeefbeefd00dd00d/debuginfo > /dev/null || true
+! curl -s http://127.0.0.1:$PORT1/buildid/beefbeefbeefd00dd00d/debuginfo > /dev/null
 curl -s http://127.0.0.1:$PORT1/metrics | grep 'error_count.*sqlite'
 
 ########################################################################
 
 # Run the tests again without the servers running. The target file should
 # be found in the cache.
 
 kill -INT $PID1 $PID2
@@ -722,17 +722,17 @@  touch -d '1970-01-01' $DEBUGINFOD_CACHE_PATH/00000000 # old enough to guarantee
 
 # Trigger a cache clean and run the tests again. The clients should be unable to
 # find the target.
 echo 0 > $DEBUGINFOD_CACHE_PATH/cache_clean_interval_s
 echo 0 > $DEBUGINFOD_CACHE_PATH/max_unused_age_s
 
 testrun ${abs_builddir}/debuginfod_build_id_find -e F/p+r%o\$g 1
 
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID2 && false || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find debuginfo $BUILDID2
 
 if [ ! -f $DEBUGINFOD_CACHE_PATH/malformed0 ] \
     || [ ! -f $DEBUGINFOD_CACHE_PATH/malformed/malformed1 ]; then
   echo "unrelated files did not survive cache cleaning"
   err
 fi
 
 if [ -d $DEBUGINFOD_CACHE_PATH/00000000 ]; then
@@ -754,17 +754,17 @@  PID3=0
 # Test fetching a file using file:// . No debuginfod server needs to be run for
 # this test.
 local_dir=${PWD}/mocktree/buildid/aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd/source/my/path
 mkdir -p ${local_dir}
 echo "int main() { return 0; }" > ${local_dir}/main.c
 
 # first test that is doesn't work, when no DEBUGINFOD_URLS is set
 DEBUGINFOD_URLS=""
-testrun ${abs_top_builddir}/debuginfod/debuginfod-find source aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd /my/path/main.c && false || true
+! testrun ${abs_top_builddir}/debuginfod/debuginfod-find source aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd /my/path/main.c
 
 # Now test is with proper DEBUGINFOD_URLS
 DEBUGINFOD_URLS="file://${PWD}/mocktree/"
 filename=`testrun ${abs_top_builddir}/debuginfod/debuginfod-find source aaaaaaaaaabbbbbbbbbbccccccccccdddddddddd /my/path/main.c`
 cmp $filename ${local_dir}/main.c
 
 ########################################################################
 ## PR27711
@@ -791,17 +791,17 @@  wait_ready $PORT3 'groom{statistic="files scanned (#)"}' 0
 wait_ready $PORT3 'groom{statistic="files scanned (mb)"}' 0
 
 kill $PID4
 wait $PID4
 PID4=0
 
 ########################################################################
 # set up tests for retrying failed queries.
-retry_attempts=`(testrun env DEBUGINFOD_URLS=http://255.255.255.255/JUNKJUNK DEBUGINFOD_RETRY_LIMIT=10 DEBUGINFOD_VERBOSE=1 \
-	${abs_top_builddir}/debuginfod/debuginfod-find debuginfo /bin/ls || true) 2>&1 >/dev/null \
+retry_attempts=`(! testrun env DEBUGINFOD_URLS=http://255.255.255.255/JUNKJUNK DEBUGINFOD_RETRY_LIMIT=10 DEBUGINFOD_VERBOSE=1 \
+	${abs_top_builddir}/debuginfod/debuginfod-find debuginfo /bin/ls) 2>&1 >/dev/null \
 	| grep -c 'Retry failed query'`
 if [ $retry_attempts -ne 10 ]; then
     echo "retry mechanism failed."
     exit 1;
   fi
 
 exit 0