[PATCHv2] gdb/build-id: protect against weirdly short build-ids
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_build--master-arm |
success
|
Build passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-arm |
success
|
Test passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 |
fail
|
Patch failed to apply
|
Commit Message
In v2:
- Print a debug message when a short build-id is ignored. I did
consider a warning, but I wasn't sure if that was suitable or not.
In the end some debug output means we can at least diagnose why
the debug lookup failed.
- Rebased onto something closer to current master.
---
While looking at build_id_to_bfd_suffix (in gdb/build-id.c) I realised
that GDB would likely not do what we wanted if a build-id was ever a
single byte.
Right now, build-ids generated by the GNU linker are 32 bytes, but
there's nothing that forces this to be the case, it's pretty easy to
create a fake, single byte, build-id. Given that the build-id is an
external input (read from the objfile), GDB should protect itself
against these edge cases.
The problem with build_id_to_bfd_suffix is that this function creates
the path used to lookup either the debug information, or an
executable, based on its build-id. So a 3-byte build-id 0xaabbcc will
look in the path: `$DEBUG_FILE_DIRECTORY/.build-id/aa/bbcc.debug`.
However, a single byte build-id 0xaa, will look in the file:
`$DEBUG_FILE_DIRECTORY/.build-id/aa/.debug` which doesn't seem right.
Worse, when looking for an objfile given a build-id GDB will look for
a file called `$DEBUG_FILE_DIRECTORY/.build-id/aa/` with a trailing
'/' character.
I propose that, in build_id_to_bfd_suffix we just return early if the
build-id is 1 byte (or less) with a return value that indicates no
separate file was found.
For testing I made use of the DWARF assembler. I needed to update the
build-id creation proc, the existing code assumes that the build-id is
a multiple of 4 bytes, so I added some additional padding to ensure
that the generated note was a multiple of 4 bytes, even if the
build-id was not.
I added a test with a 1 byte build-id, and also for the case where the
build-id has zero length. The zero length case already does what
you'd expect (no debug is loaded) as the bfd library rejects the
build-id when loading it from the objfile, but adding this additional
test is pretty cheap.
---
gdb/build-id.c | 26 ++++-
gdb/testsuite/gdb.dwarf2/short-build-id.exp | 119 ++++++++++++++++++++
gdb/testsuite/lib/dwarf.exp | 13 ++-
3 files changed, 149 insertions(+), 9 deletions(-)
create mode 100644 gdb/testsuite/gdb.dwarf2/short-build-id.exp
base-commit: 661611b9d71d9358305332d28a8b9d7bb47a4363
Comments
On Thu, 21 Nov 2024 14:29:05 +0000
Andrew Burgess <aburgess@redhat.com> wrote:
> In v2:
>
> - Print a debug message when a short build-id is ignored. I did
> consider a warning, but I wasn't sure if that was suitable or not.
> In the end some debug output means we can at least diagnose why
> the debug lookup failed.
I think this is fine. I've worked on bugs in the past where GDB was
spamming the user with hundreds (or thousands...) of warnings. If
we were to output a warning for this case, I think it'd be important
to make sure that it doesn't happen more than a few times per session.
In any case, I'm happy to see that you've provided a way to diagnose
the problem.
Approved-by: Kevin Buettner <kevinb@redhat.com>
Kevin Buettner <kevinb@redhat.com> writes:
> On Thu, 21 Nov 2024 14:29:05 +0000
> Andrew Burgess <aburgess@redhat.com> wrote:
>
>> In v2:
>>
>> - Print a debug message when a short build-id is ignored. I did
>> consider a warning, but I wasn't sure if that was suitable or not.
>> In the end some debug output means we can at least diagnose why
>> the debug lookup failed.
>
> I think this is fine. I've worked on bugs in the past where GDB was
> spamming the user with hundreds (or thousands...) of warnings. If
> we were to output a warning for this case, I think it'd be important
> to make sure that it doesn't happen more than a few times per session.
> In any case, I'm happy to see that you've provided a way to diagnose
> the problem.
>
> Approved-by: Kevin Buettner <kevinb@redhat.com>
Pushed.
Thanks,
Andrew
@@ -223,7 +223,13 @@ build_id_to_debug_bfd_1 (const std::string &original_link,
/* Common code for finding BFDs of a given build-id. This function
works with both debuginfo files (SUFFIX == ".debug") and executable
- files (SUFFIX == ""). */
+ files (SUFFIX == "").
+
+ The build-id will be split into a single byte sub-directory, followed by
+ the remaining build-id bytes as the filename, i.e. we use the lookup
+ format: `.build-id/xx/yy....zz`. As a consequence, if BUILD_ID_LEN is
+ less than 2 (bytes), no results will be found as there are not enough
+ bytes to form the `yy....zz` part of the lookup filename. */
static gdb_bfd_ref_ptr
build_id_to_bfd_suffix (size_t build_id_len, const bfd_byte *build_id,
@@ -231,6 +237,16 @@ build_id_to_bfd_suffix (size_t build_id_len, const bfd_byte *build_id,
{
SEPARATE_DEBUG_FILE_SCOPED_DEBUG_ENTER_EXIT;
+ if (build_id_len < 2)
+ {
+ /* Zero length build-ids are ignored by bfd. */
+ gdb_assert (build_id_len > 0);
+ separate_debug_file_debug_printf
+ ("Ignoring short build-id `%s' for build-id based lookup",
+ bin2hex (build_id, build_id_len).c_str ());
+ return {};
+ }
+
/* Keep backward compatibility so that DEBUG_FILE_DIRECTORY being "" will
cause "/.build-id/..." lookups. */
@@ -249,11 +265,9 @@ build_id_to_bfd_suffix (size_t build_id_len, const bfd_byte *build_id,
std::string link = debugdir.get ();
link += "/.build-id/";
- if (size > 0)
- {
- size--;
- string_appendf (link, "%02x/", (unsigned) *data++);
- }
+ gdb_assert (size > 1);
+ size--;
+ string_appendf (link, "%02x/", (unsigned) *data++);
while (size-- > 0)
string_appendf (link, "%02x", (unsigned) *data++);
new file mode 100644
@@ -0,0 +1,119 @@
+# Copyright 2024 Free Software Foundation, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+# Create a file with an artificially short (1-byte) build-id, and
+# check that GDB doesn't try to load debug information. If we do try
+# then we end up loading from: `debug-directory/.build-id/xx/.debug`
+# which isn't right.
+
+load_lib dwarf.exp
+
+# This test can only be run on targets which support DWARF-2 and use gas.
+require dwarf2_support
+
+# No remote host testing either.
+require {!is_remote host}
+
+standard_testfile main.c
+
+# Create an assembler file which encodes BUILDID as the build-id. Compile
+# this along with the global SRCFILE to create a test executable.
+#
+# Split the debug information out from the newly created executable and place
+# it into the debug file directory.
+#
+# Load the executable into GDB and check to see if the debug information was
+# loaded or not. For this test we are expecting that the debug information
+# was not loaded. The reason is that, with short values for BUILDID, GDB ends
+# up looking for the debug information in weird locations.
+proc run_test { buildid } {
+ set len [string length $buildid]
+
+ set asm_file [standard_output_file "$::testfile.$len.S"]
+ Dwarf::assemble $asm_file {
+ declare_labels int_label int_label2
+
+ upvar buildid buildid
+
+ build_id $buildid
+
+ cu { label cu_start } {
+ compile_unit {{language @DW_LANG_C}} {
+ int_label2: base_type {
+ {name int}
+ {byte_size 4 sdata}
+ {encoding @DW_ATE_signed}
+ }
+
+ constant {
+ {name the_int}
+ {type :$int_label2}
+ {const_value 99 data1}
+ }
+ }
+ }
+
+ aranges {} cu_start {
+ arange {} 0 0
+ }
+ }
+
+ set execfile [standard_output_file $::testfile.$len]
+
+ if { [build_executable_from_specs "failed to build" \
+ $execfile {debug no-build-id} \
+ $::srcfile debug \
+ $asm_file {}] } {
+ return
+ }
+
+ # Create the debug directory.
+ set debugdir [standard_output_file "debugdir.$len"]
+ set build_id_dir $debugdir/.build-id/$buildid
+ remote_exec host "mkdir -p $build_id_dir"
+
+ # Split out the debug information.
+ if {[gdb_gnu_strip_debug $execfile no-debuglink]} {
+ unresolved "failed to split out debug information"
+ return
+ }
+
+ # Move the debug information into the debug directory. We place the debug
+ # information into a file called just '.debug'. GDB should not check this
+ # file, but at one point GDB would check this file, even though this
+ # doesn't make much sense.
+ set execfile_debug ${execfile}.debug
+ remote_exec host "mv $execfile_debug $build_id_dir/.debug"
+
+ # Start GDB, set the debug-file-directory, and try loading the file.
+ clean_restart
+
+ gdb_test_no_output "set debug-file-directory $debugdir" \
+ "set debug-file-directory"
+
+ gdb_file_cmd $execfile
+
+ gdb_assert { $::gdb_file_cmd_debug_info eq "nodebug" } \
+ "no debug should be loaded"
+
+ # For sanity, read something that was encoded in the debug
+ # information, this should fail.
+ gdb_test "print the_int" \
+ "(?:No symbol table is loaded|No symbol \"the_int\" in current context).*"
+}
+
+foreach_with_prefix buildid { a4 "" } {
+ run_test $buildid
+}
@@ -3008,25 +3008,32 @@ namespace eval Dwarf {
proc _note {type name hexdata} {
set namelen [expr [string length $name] + 1]
+ set datalen [expr [string length $hexdata] / 2]
# Name size.
_op .4byte $namelen
# Data size.
- _op .4byte [expr [string length $hexdata] / 2]
+ _op .4byte $datalen
# Type.
_op .4byte $type
# The name.
_op .ascii [_quote $name]
- # Alignment.
+ # Alignment (to 4-byte boundary).
set align 2
set total [expr {($namelen + (1 << $align) - 1) & -(1 << $align)}]
for {set i $namelen} {$i < $total} {incr i} {
- _op .byte 0
+ _op .byte 0 padding
}
# The data.
foreach {a b} [split $hexdata {}] {
_op .byte 0x$a$b
}
+ # Alignment (to 4-byte boundary).
+ set align 2
+ set total [expr {($datalen + (1 << $align) - 1) & -(1 << $align)}]
+ for {set i $datalen} {$i < $total} {incr i} {
+ _op .byte 0 padding
+ }
}
# Emit a note section holding the given build-id.