[v2] Add helper script for glibc debugging

Message ID 20190926142112.5299-1-gabriel@inconstante.net.br
State Superseded

Commit Message

Gabriel F. T. Gomes Sept. 26, 2019, 2:21 p.m. UTC
  From: "Gabriel F. T. Gomes" <gabrielftg@linux.ibm.com>

Changes since v1:

  - Added nptl_db to the argument to --library-path, which allows
    debugging with multiple threads (see additional comment below), as
    observed by Carlos;
  - Removed the hardcoded path to GDB, as suggested by Dmitry;
  - Removed the default test case and source directory paths, as
    suggested by Joseph, which removed some of the clutter in the
    script;

Note about multi-threaded test case debugging:

As Carlos correctly noticed, the first version of the script lacked the
required GDB setup for multi-threaded debugging.  This could be noticed
by running a multi-threaded test case, such as nptl/tst-exec1, and
observing the warning message in the output:

  warning: Unable to find libthread_db matching inferior's thread
           library, thread debugging will not be available.

In this second version, which explicitly adds nptl_db to the argument of
--library-path, the warning is gone, replaced by the following
informational message:

  [Thread debugging using libthread_db enabled]
  Using host libthread_db library
  "/home/gabriel/build/x86_64/glibc//nptl_db/libthread_db.so.1".
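
A quick way to confirm that a build tree can provide this library before
starting GDB is sketched below.  It is only an illustration: the function
name check_thread_db is hypothetical and not part of the patch, and the
only thing it assumes is the layout visible in the message above
(nptl_db/libthread_db.so.1 under the build directory).

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): report whether the build
# tree contains the libthread_db that GDB needs for thread debugging.
# Assumes the layout shown above: <build dir>/nptl_db/libthread_db.so.1
check_thread_db ()
{
  local build_dir="$1"
  # Drop one trailing slash so the printed path has no doubled '/'.
  local lib="${build_dir%/}/nptl_db/libthread_db.so.1"

  if [ -e "$lib" ]
  then
    echo "found: $lib"
  else
    echo "missing: $lib" >&2
    return 1
  fi
}
```

Calling check_thread_db with the build directory as its only argument
prints the path when the library exists and returns non-zero otherwise.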

Not-for-commit comment from v1:

At Cauldron, while Florian showed his development workflow to us, we had
a brief discussion about debugging glibc with GDB.  After that, Arjun
and I spent some time sharing our debugging scripts and tricks, and we
concluded that they could be helpful to other people, perhaps even more
so to newcomers.  So I took my out-of-tree script and converted it into
this auto-generated version.

Still at Cauldron, Carlos mentioned that it wasn't trivial to let GDB
know about the symbols within glibc.  This script doesn't solve this for
every glibc debugging need (I suppose it doesn't help at all when
debugging the initial steps of program loading, for example), but it
makes it a little easier to debug test cases, as it automatically
loads the symbols from them.

Maybe there are simpler ways to do it, but this is what I have been
using for some time.
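
For reference, the main GDB steps that the script automates can be
sketched as a small shell function.  This is an illustration rather than
the script itself: gdb_commands is a hypothetical name, the library path
is passed in by the caller instead of coming from the makefile, and only
the GDB commands (break, add-symbol-file, run) follow the template in
the patch.

```shell
#!/bin/bash
# Illustrative sketch only: print the GDB commands for one test case.
# gdb_commands is a hypothetical name; the commands themselves follow
# the template in the patch.
#   $1: test case, $2: library path, remaining arguments: breakpoints.
gdb_commands ()
{
  [ "$#" -ge 2 ] || return 1
  local testcase="$1" library_path="$2"
  shift 2

  # Stop early in the dynamic loader, before the test case runs.
  echo "break _dl_start_user"
  # Load the test case's symbols so that backtraces show its frames.
  echo "add-symbol-file $testcase"
  # GDB is debugging ld.so, so the test case is passed as an argument.
  echo "run --library-path $library_path $testcase --direct"
  # These follow 'run', so GDB executes them once the program has
  # stopped at _dl_start_user.
  local bp
  for bp in "$@"
  do
    echo "break $bp"
  done
}
```

Redirecting the output of this function to a file gives something that
can be handed to GDB with -x, which is what the generated script does
with its own command file.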

-- 8< --
This patch adds a new make rule that generates a helper script for
debugging glibc test cases.  The new script, debugglibc.sh, is similar
to testrun.sh in that it runs the specified test case; however, it
opens the test case in GDB, setting the library path the same way that
testrun.sh does.  The commands are based on the instructions on the wiki
page for glibc debugging [1].

By default, the script tells GDB to load the test case for symbol
information, so that, when a breakpoint is hit, the call stack is
displayed correctly (instead of printing lots of '??'s).  For instance,
after running 'make' and 'make check', one could do the following:

  $ ./debugglibc.sh nptl/tst-exec1 -B pthread_join

  Reading symbols from /home/gabriel/build/powerpc64le/glibc//elf/ld.so...done.
  Breakpoint 1 at 0x1444
  add symbol table from file "nptl/tst-exec1"
  [Thread debugging using libthread_db enabled]
  Using host libthread_db library "/home/gabriel/build/powerpc64le/glibc//nptl_db/libthread_db.so.1".

  Breakpoint 1, 0x00007ffff7fb1444 in _dl_start_user () from /home/gabriel/build/powerpc64le/glibc/elf/ld.so
  Breakpoint 2 at 0x7ffff7f49d48: file pthread_join.c, line 23.

Notice that the script will always start GDB with the program running
and halted at _dl_start_user.  So, in order to reach the actual
breakpoint of interest, one should use 'c' (continue), not 'r' (run):

  >>> c
  Continuing.
  [New Thread 0x7ffff7d1f180 (LWP 76443)]
  [Switching to Thread 0x7ffff7d1f180 (LWP 76443)]

  Thread 2 "ld.so" hit Breakpoint 2, __pthread_join (threadid=140737354087616, thread_return=0x0) at pthread_join.c:24
  24        return __pthread_timedjoin_ex (threadid, thread_return, NULL, true);

Then inspect the call stack with 'bt', as usual, and see symbols from
both the test case and the libraries themselves:

  >>> bt
  #0  __pthread_join (threadid=140737354087616, thread_return=0x0) at pthread_join.c:24
  #1  0x0000000010001f4c in tf (arg=<optimized out>) at tst-exec1.c:37
  #2  0x00007ffff7f487e8 in start_thread (arg=0x7ffff7510000) at pthread_create.c:479
  #3  0x00007ffff7e523a8 in clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:82

Tested for powerpc64le and x86_64.

[1] https://sourceware.org/glibc/wiki/Debugging/Loader_Debugging
---
 Makefile | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 140 insertions(+), 1 deletion(-)
  

Comments

Carlos O'Donell Sept. 26, 2019, 4:29 p.m. UTC | #1
On 9/26/19 10:21 AM, Gabriel F. T. Gomes wrote:
> From: "Gabriel F. T. Gomes" <gabrielftg@linux.ibm.com>
> 
> Changes since v1:
> 
>   - Added nptl_db to the argument to --library-path, which allows
>     debugging with multiple threads (see additional comment below), as
>     observed by Carlos;
>   - Removed the hardcoded path to GDB, as suggested by Dmitry;
>   - Removed the default test case and source directory paths, as
>     suggested by Joseph, which removed some of the clutter in the
>     script;

We are getting close!

(1) Don't run gdb under the ld.so to be tested.

(2) Provide a way to pass env vars.

If you fix those two things I think the script is basically perfect :-)

> Note about multi-threaded test case debugging:
> 
> As Carlos correctly noticed, the first version of the script lacked the
> required GDB setup for multi-threaded debugging.  This could be noticed
> by running a multi-threaded test case, such as nptl/tst-exec1, and
> observing the warning message in the output:
> 
>   warning: Unable to find libthread_db matching inferior's thread
>            library, thread debugging will not be available.
> 
> In this second version, which explicitly adds nptl_db to the argument of
> --library-path, the warning is gone, replaced by the following
> informational message:
> 
>   [Thread debugging using libthread_db enabled]
>   Using host libthread_db library
>   "/home/gabriel/build/x86_64/glibc//nptl_db/libthread_db.so.1".
> 
> Not-for-commit comment from v1:
> 
> At Cauldron, while Florian showed his development workflow to us, we had
> a brief discussion about debugging glibc with GDB.  After that, Arjun
> and I spent some time sharing our debugging scripts and tricks, and we
> concluded that they could be helpful to other people, perhaps even more
> so to newcomers.  So I took my out-of-tree script and converted it into
> this auto-generated version.
> 
> Still at Cauldron, Carlos mentioned that it wasn't trivial to let GDB
> know about the symbols within glibc.  This script doesn't solve this for
> every glibc debugging need (I suppose it doesn't help at all when
> debugging the initial steps of program loading, for example), but it
> makes it a little easier to debug test cases, as it automatically
> loads the symbols from them.
> 
> Maybe there are simpler ways to do it, but this is what I have been
> using for some time.
> 
> -- 8< --
> This patch adds a new make rule that generates a helper script for
> debugging glibc test cases.  The new script, debugglibc.sh, is similar
> to testrun.sh in that it runs the specified test case; however, it
> opens the test case in GDB, setting the library path the same way that
> testrun.sh does.  The commands are based on the instructions on the wiki
> page for glibc debugging [1].
> 
> By default, the script tells GDB to load the test case for symbol
> information, so that, when a breakpoint is hit, the call stack is
> displayed correctly (instead of printing lots of '??'s).  For instance,
> after running 'make' and 'make check', one could do the following:
> 
>   $ ./debugglibc.sh nptl/tst-exec1 -B pthread_join
> 
>   Reading symbols from /home/gabriel/build/powerpc64le/glibc//elf/ld.so...done.
>   Breakpoint 1 at 0x1444
>   add symbol table from file "nptl/tst-exec1"
>   [Thread debugging using libthread_db enabled]
>   Using host libthread_db library "/home/gabriel/build/powerpc64le/glibc//nptl_db/libthread_db.so.1".
> 
>   Breakpoint 1, 0x00007ffff7fb1444 in _dl_start_user () from /home/gabriel/build/powerpc64le/glibc/elf/ld.so
>   Breakpoint 2 at 0x7ffff7f49d48: file pthread_join.c, line 23.
> 
> Notice that the script will always start GDB with the program running
> and halted at _dl_start_user.  So, in order to reach the actual
> breakpoint of interest, one should use 'c' (continue), not 'r' (run):
> 
>   >>> c
>   Continuing.
>   [New Thread 0x7ffff7d1f180 (LWP 76443)]
>   [Switching to Thread 0x7ffff7d1f180 (LWP 76443)]
> 
>   Thread 2 "ld.so" hit Breakpoint 2, __pthread_join (threadid=140737354087616, thread_return=0x0) at pthread_join.c:24
>   24        return __pthread_timedjoin_ex (threadid, thread_return, NULL, true);
> 
> Then inspect the call stack with 'bt', as usual, and see symbols from
> both the test case and the libraries themselves:
> 
>   >>> bt
>   #0  __pthread_join (threadid=140737354087616, thread_return=0x0) at pthread_join.c:24
>   #1  0x0000000010001f4c in tf (arg=<optimized out>) at tst-exec1.c:37
>   #2  0x00007ffff7f487e8 in start_thread (arg=0x7ffff7510000) at pthread_create.c:479
>   #3  0x00007ffff7e523a8 in clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:82
> 
> Tested for powerpc64le and x86_64.
> 
> [1] https://sourceware.org/glibc/wiki/Debugging/Loader_Debugging
> ---
>  Makefile | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 140 insertions(+), 1 deletion(-)
> 
> diff --git a/Makefile b/Makefile
> index 67ddd01bfe..d57f97dcae 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -187,7 +187,146 @@ $(common-objpfx)testrun.sh: $(common-objpfx)config.make \
>  	mv -f $@T $@
>  postclean-generated += testrun.sh
>  
> -others: $(common-objpfx)testrun.sh
> +define debugglibc

OK.

> +#!/bin/bash
> +
> +SOURCE_DIR="$(CURDIR)"
> +BUILD_DIR="$(common-objpfx)"
> +CMD_FILE="$(common-objpfx)debugglibc.gdb"
> +GDB_PATH=`which gdb`
> +DIRECT=true
> +SYMBOLSFILE=true
> +unset TESTCASE
> +unset BREAKPOINTS
> +
> +usage()
> +{
> +cat << EOF
> +Usage: $$0 [OPTIONS] <testcase>
> +
> +  where <testcase> is the path to the program being tested.
> +
> +Options:
> +
> +  -h, --help
> +	Prints this message and leaves.
> +
> +  The following options require one argument:
> +
> +  -B, --breakpoint
> +	Breakpoints to set at the beginning of the execution
> +	(each breakpoint demands its own -B option, e.g. -B foo -B bar)

OK.

> +
> +  The following options do not take arguments:
> +
> +  -I, --no-direct
> +	Selects whether to pass the flag --direct to gdb.
> +	Required for glibc test cases and not allowed for non-glibc tests.
> +	Default behaviour is to pass the flag --direct to gdb.
> +  -S, --no-symbols-file
> +	Do not tell GDB to load debug symbols from the testcase.

OK.

> +EOF
> +}
> +
> +# Parse input options
> +while [[ $$# > 0 ]]
> +do
> +  key="$$1"
> +  case $$key in
> +    -h|--help)
> +      usage
> +      exit 0
> +      ;;
> +    -B|--breakpoint)
> +      BREAKPOINTS="$$BREAKPOINTS\n break $$2"
> +      shift
> +      ;;
> +    -I|--no-direct)
> +      DIRECT=false
> +      ;;
> +    -S|--no-symbols-file)
> +      SYMBOLSFILE=false
> +      ;;
> +    *)
> +      TESTCASE=$$1
> +      ;;
> +  esac
> +  shift
> +done
> +
> +if [ ! -v TESTCASE ]
> +then
> +  usage
> +  exit 1
> +fi
> +
> +# Expand direct argument
> +if [ "$$DIRECT" == true ]
> +then
> +  DIRECT="--direct"
> +else
> +  DIRECT=""
> +fi
> +
> +# Expand symbols loading command
> +if [ "$$SYMBOLSFILE" == true ]
> +then
> +  SYMBOLSFILE="add-symbol-file $${TESTCASE}"
> +else
> +  SYMBOLSFILE=""
> +fi
> +
> +# GDB commands template
> +template ()
> +{
> +cat <<EOF
> +set environment C -E -x c-header
> +break _dl_start_user
> +__SYMBOLSFILE__
> +run --library-path $(rpath-link):$${BUILD_DIR}/nptl_db \
> +__TESTCASE__ __DIRECT__
> +__BREAKPOINTS__
> +EOF
> +}
> +
> +# Generate the commands file for gdb initialization
> +template | sed \
> +  -e "s|__SYMBOLSFILE__|$$SYMBOLSFILE|" \
> +  -e "s|__TESTCASE__|$$TESTCASE|" \
> +  -e "s|__DIRECT__|$$DIRECT|" \
> +  -e "s|__BREAKPOINTS__|$$BREAKPOINTS|" \
> +  > $$CMD_FILE
> +
> +echo
> +echo "Debugging glibc..."
> +echo "Build directory  : $$BUILD_DIR"
> +echo "Source directory : $$SOURCE_DIR"
> +echo "GLIBC Testcase   : $$TESTCASE"
> +echo "GDB Commands     : $$CMD_FILE"
> +echo
> +
> +# We need to make sure that gdb is linked against the standalone glibc
> +# so that it picks up the correct nptl_db/libthread_db.so. So that means
> +# invoking gdb using the standalone glibc's linker and passing nptl_db
> +# in the argument to --library-path.
> +$${BUILD_DIR}/elf/ld.so \
> +  --library-path $(rpath-link):$${BUILD_DIR}/nptl_db \
> +  $${GDB_PATH} -q \
> +    -x $${CMD_FILE} \
> +    -d $${SOURCE_DIR} \
> +    $${BUILD_DIR}/elf/ld.so

This exposes the debugger to all sorts of problems with the bootstrapping
glibc, like bugs or other issues.

Why are we running gdb under the new ld.so?

Why not use:

set auto-load safe-path <path>
set libthread-db-search-path <path>

instead?

What about environment variables exposed to the test case?

Can we have a consistent way to pass them to the gdb we're about to start
without exposing them to gdb?

e.g.

set exec-wrapper env 'LD_PRELOAD=libmalloc-extras.so'

> +endef
> +
> +# This is another handy script for debugging dynamically linked program
> +# against the current libc build for testing.
> +$(common-objpfx)debugglibc.sh: $(common-objpfx)config.make \
> +			    $(..)Makeconfig $(..)Makefile
> +	$(file >$@T,$(debugglibc))
> +	chmod a+x $@T
> +	mv -f $@T $@
> +postclean-generated += debugglibc.sh debugglibc.gdb
> +
> +others: $(common-objpfx)testrun.sh $(common-objpfx)debugglibc.sh
>  
>  # Makerules creates a file `stubs' in each subdirectory, which
>  # contains `#define __stub_FUNCTION' for each function defined in that
>
  
Gabriel F. T. Gomes Sept. 27, 2019, 7:42 p.m. UTC | #2
Hi Carlos,

On Thu, 26 Sep 2019, Carlos O'Donell wrote:

>We are getting close!

Nice! :)

>(1) Don't run gdb under the ld.so to be tested.
>
>(2) Provide a way to pass env vars.
>
>If you fix those two things I think the script is basically perfect :-)
>
> [...]
>
>On 9/26/19 10:21 AM, Gabriel F. T. Gomes wrote:
>>
>> +# We need to make sure that gdb is linked against the standalone glibc
>> +# so that it picks up the correct nptl_db/libthread_db.so. So that means
>> +# invoking gdb using the standalone glibc's linker and passing nptl_db
>> +# in the argument to --library-path.
>> +$${BUILD_DIR}/elf/ld.so \
>> +  --library-path $(rpath-link):$${BUILD_DIR}/nptl_db \
>> +  $${GDB_PATH} -q \
>> +    -x $${CMD_FILE} \
>> +    -d $${SOURCE_DIR} \
>> +    $${BUILD_DIR}/elf/ld.so  
>
>This exposes the debugger to all sorts of problems with the bootstrapping
>glibc, like bugs or other issues.

Right.  I hadn't thought about that before.

>Why are we running gdb under the new ld.so?

I think it's just because I never questioned what I learned from
<https://sourceware.org/glibc/wiki/Debugging/Loader_Debugging>.
I think I understand it better, now.

>Why not use:
>
>set auto-load safe-path <path>
>set libthread-db-search-path <path>
>
>instead?

You had mentioned it previously (on the URL you shared in your previous
review).  Sorry I didn't understand what you meant before; I think I got
it right this time.

>What about environment variables exposed to the test case?
>
>Can we have a consistent way to pass them to the gdb we're about to start
>without exposing them to gdb?
>
>e.g.
>
>set exec-wrapper env 'LD_PRELOAD=libmalloc-extras.so'

OK, I think I also got this right, this time.

I'm posting v3 with these changes.
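
One possible shape for a command file that follows both suggestions is
sketched below.  This is not the v3 patch: the function name
native_gdb_commands and its argument order are hypothetical.  It relies
only on the GDB settings quoted above (set auto-load safe-path,
set libthread-db-search-path and set exec-wrapper), and it assumes that
GDB itself is started normally by the system loader.

```shell
#!/bin/bash
# Sketch under stated assumptions, not the actual v3 script.
# native_gdb_commands is a hypothetical name.
#   $1: build directory, $2: library path, $3: test case,
#   remaining arguments: VAR=value pairs meant for the test case only.
native_gdb_commands ()
{
  [ "$#" -ge 3 ] || return 1
  local build_dir="${1%/}" library_path="$2" testcase="$3"
  shift 3

  # Let the system-linked GDB trust and find the freshly built
  # libthread_db instead of running GDB under the new ld.so.
  echo "set auto-load safe-path $build_dir/nptl_db"
  echo "set libthread-db-search-path $build_dir/nptl_db"

  # Variables go through env(1) so that only the inferior sees them.
  if [ "$#" -gt 0 ]
  then
    local wrapper="set exec-wrapper env" var
    for var in "$@"
    do
      wrapper="$wrapper '$var'"
    done
    echo "$wrapper"
  fi

  echo "break _dl_start_user"
  echo "add-symbol-file $testcase"
  echo "run --library-path $library_path $testcase --direct"
}
```

The resulting file would then be passed with -x to a gdb that is invoked
directly, with the new elf/ld.so as the program to debug, so that bugs
in the glibc under test affect only the inferior and not the debugger.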
  

Patch

diff --git a/Makefile b/Makefile
index 67ddd01bfe..d57f97dcae 100644
--- a/Makefile
+++ b/Makefile
@@ -187,7 +187,146 @@  $(common-objpfx)testrun.sh: $(common-objpfx)config.make \
 	mv -f $@T $@
 postclean-generated += testrun.sh
 
-others: $(common-objpfx)testrun.sh
+define debugglibc
+#!/bin/bash
+
+SOURCE_DIR="$(CURDIR)"
+BUILD_DIR="$(common-objpfx)"
+CMD_FILE="$(common-objpfx)debugglibc.gdb"
+GDB_PATH=`which gdb`
+DIRECT=true
+SYMBOLSFILE=true
+unset TESTCASE
+unset BREAKPOINTS
+
+usage()
+{
+cat << EOF
+Usage: $$0 [OPTIONS] <testcase>
+
+  where <testcase> is the path to the program being tested.
+
+Options:
+
+  -h, --help
+	Prints this message and leaves.
+
+  The following options require one argument:
+
+  -B, --breakpoint
+	Breakpoints to set at the beginning of the execution
+	(each breakpoint demands its own -B option, e.g. -B foo -B bar)
+
+  The following options do not take arguments:
+
+  -I, --no-direct
+	Selects whether to pass the flag --direct to gdb.
+	Required for glibc test cases and not allowed for non-glibc tests.
+	Default behaviour is to pass the flag --direct to gdb.
+  -S, --no-symbols-file
+	Do not tell GDB to load debug symbols from the testcase.
+EOF
+}
+
+# Parse input options
+while [[ $$# > 0 ]]
+do
+  key="$$1"
+  case $$key in
+    -h|--help)
+      usage
+      exit 0
+      ;;
+    -B|--breakpoint)
+      BREAKPOINTS="$$BREAKPOINTS\n break $$2"
+      shift
+      ;;
+    -I|--no-direct)
+      DIRECT=false
+      ;;
+    -S|--no-symbols-file)
+      SYMBOLSFILE=false
+      ;;
+    *)
+      TESTCASE=$$1
+      ;;
+  esac
+  shift
+done
+
+if [ ! -v TESTCASE ]
+then
+  usage
+  exit 1
+fi
+
+# Expand direct argument
+if [ "$$DIRECT" == true ]
+then
+  DIRECT="--direct"
+else
+  DIRECT=""
+fi
+
+# Expand symbols loading command
+if [ "$$SYMBOLSFILE" == true ]
+then
+  SYMBOLSFILE="add-symbol-file $${TESTCASE}"
+else
+  SYMBOLSFILE=""
+fi
+
+# GDB commands template
+template ()
+{
+cat <<EOF
+set environment C -E -x c-header
+break _dl_start_user
+__SYMBOLSFILE__
+run --library-path $(rpath-link):$${BUILD_DIR}/nptl_db \
+__TESTCASE__ __DIRECT__
+__BREAKPOINTS__
+EOF
+}
+
+# Generate the commands file for gdb initialization
+template | sed \
+  -e "s|__SYMBOLSFILE__|$$SYMBOLSFILE|" \
+  -e "s|__TESTCASE__|$$TESTCASE|" \
+  -e "s|__DIRECT__|$$DIRECT|" \
+  -e "s|__BREAKPOINTS__|$$BREAKPOINTS|" \
+  > $$CMD_FILE
+
+echo
+echo "Debugging glibc..."
+echo "Build directory  : $$BUILD_DIR"
+echo "Source directory : $$SOURCE_DIR"
+echo "GLIBC Testcase   : $$TESTCASE"
+echo "GDB Commands     : $$CMD_FILE"
+echo
+
+# We need to make sure that gdb is linked against the standalone glibc
+# so that it picks up the correct nptl_db/libthread_db.so. So that means
+# invoking gdb using the standalone glibc's linker and passing nptl_db
+# in the argument to --library-path.
+$${BUILD_DIR}/elf/ld.so \
+  --library-path $(rpath-link):$${BUILD_DIR}/nptl_db \
+  $${GDB_PATH} -q \
+    -x $${CMD_FILE} \
+    -d $${SOURCE_DIR} \
+    $${BUILD_DIR}/elf/ld.so
+endef
+
+# This is another handy script for debugging dynamically linked program
+# against the current libc build for testing.
+$(common-objpfx)debugglibc.sh: $(common-objpfx)config.make \
+			    $(..)Makeconfig $(..)Makefile
+	$(file >$@T,$(debugglibc))
+	chmod a+x $@T
+	mv -f $@T $@
+postclean-generated += debugglibc.sh debugglibc.gdb
+
+others: $(common-objpfx)testrun.sh $(common-objpfx)debugglibc.sh
 
 # Makerules creates a file `stubs' in each subdirectory, which
 # contains `#define __stub_FUNCTION' for each function defined in that