Don't return "(null)" from bfd_elf_sym_name
Checks
Commit Message
A NULL return from bfd_elf_string_from_elf_section indicates an error.
That shouldn't be masked by bfd_elf_sym_name but rather passed up to
callers such as group_signature. If we want to print "(null)" then
that should be done at a higher level. That's what this patch does,
except that I chose to print "<null>" instead, like readelf. If we
see "(null)" we're probably passing a NULL to printf. I haven't
changed aoutx.h or pdp11.c print_symbol functions because they already
handle NULL names by omitting the name. I also haven't changed
mach-o.c, mmo.c, som.c, srec.c, tekhex.c, vms-alpha.c and
wasm-module.c print_symbol function because it looks like they will
never have NULL symbol names.
bfd/
* elf.c (bfd_elf_sym_name): Don't turn a NULL name into a
pointer to "(null)".
(bfd_elf_print_symbol): Print "<null>" for NULL symbol names.
* coffgen.c (coff_print_symbol): Likewise.
* ecoff.c (_bfd_ecoff_print_symbol): Likewise.
* pef.c (bfd_pef_print_symbol): Likewise.
* syms.c (bfd_symbol_info): Return "<null>" in symbol_info.name
if symbol name is NULL.
ld/
* ldlang.c (ld_is_local_symbol): Don't check for "(null)"
symbol name.
Comments
Hello,
Alan Modra <amodra@gmail.com> writes:
> A NULL return from bfd_elf_string_from_elf_section indicates an error.
> That shouldn't be masked by bfd_elf_sym_name but rather passed up to
> callers such as group_signature. If we want to print "(null)" then
> that should be done at a higher level. That's what this patch does,
> except that I chose to print "<null>" instead, like readelf. If we
> see "(null)" we're probably passing a NULL to printf. I haven't
> changed aoutx.h or pdp11.c print_symbol functions because they already
> handle NULL names by omitting the name. I also haven't changed
> mach-o.c, mmo.c, som.c, srec.c, tekhex.c, vms-alpha.c and
> wasm-module.c print_symbol function because it looks like they will
> never have NULL symbol names.
This patch causes GDB to segfault in a test with a mangled symbol table:
Running gdb.base/bfd-errors.exp ...
ERROR: GDB process no longer exists
UNRESOLVED: gdb.base/bfd-errors.exp: load library with add-symbol-file
This is how the test corrupts the symbol table:
# [...] a shared object with at least four symbols is
# created. The .dynsym section is extracted and offsets which should
# refer to strings in the .dynstr section are changed to be
# larger than the size of the .dynstr section.
GDB crashed because it wasn't prepared to get a NULL result from
bfd_asymbol_name () while building the dynamic symbols table. I did the
obvious fix:
diff --git a/gdb/elfread.c b/gdb/elfread.c
index e959d3a2f9d3..6907d4b7e3e3 100644
--- a/gdb/elfread.c
+++ b/gdb/elfread.c
@@ -612,6 +612,8 @@ elf_rel_plt_read (minimal_symbol_reader &reader,
const size_t got_suffix_len = strlen (SYMBOL_GOT_PLT_SUFFIX);
name = bfd_asymbol_name (*relplt->relocation[reloc].sym_ptr_ptr);
+ if (name == nullptr)
+ name = "(null)";
address = relplt->relocation[reloc].address;
asection *msym_section;
But then I get a crash within BFD itself, when GDB passes the dynamic
symbols table it built to bfd_get_synthetic_symtab (). The relevant part
of the backtrace is:
#6 <signal handler called>
#7 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
#8 0x00005802042ce8d1 in _bfd_x86_elf_get_synthetic_symtab
(abfd=0x58020840f8e0, count=2, relsize=72, got_addr=0,
plts=0x7ffcefa985e0, dynsyms=0x580208466f60, ret=0x7ffcefa98718)
at /home/bauermann/src/binutils-gdb-wt-2/bfd/elfxx-x86.c:3713
#9 0x00005802042c4c41 in elf_x86_64_get_synthetic_symtab
(abfd=0x58020840f8e0, symcount=28, syms=0x580208457480,
dynsymcount=9, dynsyms=0x580208466f60, ret=0x7ffcefa98718)
at /home/bauermann/src/binutils-gdb-wt-2/bfd/elf64-x86-64.c:5427
#10 0x0000580203989601 in elf_read_minimal_symbols
(objfile=0x58020840bfd0, symfile_flags=2, ei=0x7ffcefa98810)
at /home/bauermann/src/binutils-gdb-wt-2/gdb/elfread.c:1160
Frame 8's line is:
3713 size += strlen ((*p->sym_ptr_ptr)->name) + sizeof ("@plt");
Where:
(top-gdb) p **p->sym_ptr_ptr
$8 = {
the_bfd = 0x58020840f8e0,
name = 0x0,
value = 0,
flags = 4227073,
section = 0x580205cfcad8 <_bfd_std_section+280>,
udata = {
p = 0x0,
i = 0
}
}
So it looks like bfd_get_synthetic_symtab () isn't prepared to deal with
the existance of symbols without a name. Should it? Or should GDB fix up
the section contents so that they'd point to a "(null)" string?
On Thu, Oct 03, 2024 at 09:45:56PM -0300, Thiago Jung Bauermann wrote:
> Hello,
>
> Alan Modra <amodra@gmail.com> writes:
>
> > A NULL return from bfd_elf_string_from_elf_section indicates an error.
> > That shouldn't be masked by bfd_elf_sym_name but rather passed up to
> > callers such as group_signature. If we want to print "(null)" then
> > that should be done at a higher level. That's what this patch does,
> > except that I chose to print "<null>" instead, like readelf. If we
> > see "(null)" we're probably passing a NULL to printf. I haven't
> > changed aoutx.h or pdp11.c print_symbol functions because they already
> > handle NULL names by omitting the name. I also haven't changed
> > mach-o.c, mmo.c, som.c, srec.c, tekhex.c, vms-alpha.c and
> > wasm-module.c print_symbol function because it looks like they will
> > never have NULL symbol names.
>
> This patch causes GDB to segfault in a test with a mangled symbol table:
>
> Running gdb.base/bfd-errors.exp ...
> ERROR: GDB process no longer exists
> UNRESOLVED: gdb.base/bfd-errors.exp: load library with add-symbol-file
Yes, I was just looking at this and had written the following:
From a88c23dda6360e247fa758f3482ee7cf661a29cf Mon Sep 17 00:00:00 2001
From: Alan Modra <amodra@gmail.com>
Date: Fri, 4 Oct 2024 07:47:05 +0930
Subject: gdb segv in elfread.c:elf_rel_plt_read
After commit 68bbe1183379, ELF symbols read via bfd_canonicalize_symtab
and similar functions which have bad st_name fields will have NULL in
the name rather than "(null)". gdb.base/bfd-errors.exp deliberately
creates a faulty shared library with st_name pointing outside of
.dynsym for two symbols, and thus now results in NULL symbol names.
This triggers a segv on string_buffer.assign(name). Fix that.
diff --git a/gdb/elfread.c b/gdb/elfread.c
index e959d3a2f9d..2e68b0dba1a 100644
--- a/gdb/elfread.c
+++ b/gdb/elfread.c
@@ -612,6 +612,8 @@ elf_rel_plt_read (minimal_symbol_reader &reader,
const size_t got_suffix_len = strlen (SYMBOL_GOT_PLT_SUFFIX);
name = bfd_asymbol_name (*relplt->relocation[reloc].sym_ptr_ptr);
+ if (!name)
+ continue;
address = relplt->relocation[reloc].address;
asection *msym_section;
I think it's better to ignore symbols with NULL names rather than turn
the name into "(null)" as you do in your patch, since you can't really
do anything sensible when looking up "(null)@got.plt" later.
> This is how the test corrupts the symbol table:
>
> # [...] a shared object with at least four symbols is
> # created. The .dynsym section is extracted and offsets which should
> # refer to strings in the .dynstr section are changed to be
> # larger than the size of the .dynstr section.
>
> GDB crashed because it wasn't prepared to get a NULL result from
> bfd_asymbol_name () while building the dynamic symbols table. I did the
> obvious fix:
>
> diff --git a/gdb/elfread.c b/gdb/elfread.c
> index e959d3a2f9d3..6907d4b7e3e3 100644
> --- a/gdb/elfread.c
> +++ b/gdb/elfread.c
> @@ -612,6 +612,8 @@ elf_rel_plt_read (minimal_symbol_reader &reader,
> const size_t got_suffix_len = strlen (SYMBOL_GOT_PLT_SUFFIX);
>
> name = bfd_asymbol_name (*relplt->relocation[reloc].sym_ptr_ptr);
> + if (name == nullptr)
> + name = "(null)";
> address = relplt->relocation[reloc].address;
>
> asection *msym_section;
>
> But then I get a crash within BFD itself, when GDB passes the dynamic
> symbols table it built to bfd_get_synthetic_symtab (). The relevant part
> of the backtrace is:
>
> #6 <signal handler called>
> #7 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex-base.S:81
> #8 0x00005802042ce8d1 in _bfd_x86_elf_get_synthetic_symtab
> (abfd=0x58020840f8e0, count=2, relsize=72, got_addr=0,
> plts=0x7ffcefa985e0, dynsyms=0x580208466f60, ret=0x7ffcefa98718)
> at /home/bauermann/src/binutils-gdb-wt-2/bfd/elfxx-x86.c:3713
> #9 0x00005802042c4c41 in elf_x86_64_get_synthetic_symtab
> (abfd=0x58020840f8e0, symcount=28, syms=0x580208457480,
> dynsymcount=9, dynsyms=0x580208466f60, ret=0x7ffcefa98718)
> at /home/bauermann/src/binutils-gdb-wt-2/bfd/elf64-x86-64.c:5427
> #10 0x0000580203989601 in elf_read_minimal_symbols
> (objfile=0x58020840bfd0, symfile_flags=2, ei=0x7ffcefa98810)
> at /home/bauermann/src/binutils-gdb-wt-2/gdb/elfread.c:1160
>
> Frame 8's line is:
>
> 3713 size += strlen ((*p->sym_ptr_ptr)->name) + sizeof ("@plt");
>
> Where:
>
> (top-gdb) p **p->sym_ptr_ptr
> $8 = {
> the_bfd = 0x58020840f8e0,
> name = 0x0,
> value = 0,
> flags = 4227073,
> section = 0x580205cfcad8 <_bfd_std_section+280>,
> udata = {
> p = 0x0,
> i = 0
> }
> }
>
> So it looks like bfd_get_synthetic_symtab () isn't prepared to deal with
> the existance of symbols without a name. Should it? Or should GDB fix up
> the section contents so that they'd point to a "(null)" string?
gdb definitely shouldn't be trying to fix anything here. The fix
belongs in bfd.
Alan Modra <amodra@gmail.com> writes:
> On Thu, Oct 03, 2024 at 09:45:56PM -0300, Thiago Jung Bauermann wrote:
>> Hello,
>>
>> Alan Modra <amodra@gmail.com> writes:
>>
>> > A NULL return from bfd_elf_string_from_elf_section indicates an error.
>> > That shouldn't be masked by bfd_elf_sym_name but rather passed up to
>> > callers such as group_signature. If we want to print "(null)" then
>> > that should be done at a higher level. That's what this patch does,
>> > except that I chose to print "<null>" instead, like readelf. If we
>> > see "(null)" we're probably passing a NULL to printf. I haven't
>> > changed aoutx.h or pdp11.c print_symbol functions because they already
>> > handle NULL names by omitting the name. I also haven't changed
>> > mach-o.c, mmo.c, som.c, srec.c, tekhex.c, vms-alpha.c and
>> > wasm-module.c print_symbol function because it looks like they will
>> > never have NULL symbol names.
>>
>> This patch causes GDB to segfault in a test with a mangled symbol table:
>>
>> Running gdb.base/bfd-errors.exp ...
>> ERROR: GDB process no longer exists
>> UNRESOLVED: gdb.base/bfd-errors.exp: load library with add-symbol-file
>
> Yes, I was just looking at this and had written the following:
>
> From a88c23dda6360e247fa758f3482ee7cf661a29cf Mon Sep 17 00:00:00 2001
> From: Alan Modra <amodra@gmail.com>
> Date: Fri, 4 Oct 2024 07:47:05 +0930
> Subject: gdb segv in elfread.c:elf_rel_plt_read
>
> After commit 68bbe1183379, ELF symbols read via bfd_canonicalize_symtab
> and similar functions which have bad st_name fields will have NULL in
> the name rather than "(null)". gdb.base/bfd-errors.exp deliberately
> creates a faulty shared library with st_name pointing outside of
> .dynsym for two symbols, and thus now results in NULL symbol names.
> This triggers a segv on string_buffer.assign(name). Fix that.
>
> diff --git a/gdb/elfread.c b/gdb/elfread.c
> index e959d3a2f9d..2e68b0dba1a 100644
> --- a/gdb/elfread.c
> +++ b/gdb/elfread.c
> @@ -612,6 +612,8 @@ elf_rel_plt_read (minimal_symbol_reader &reader,
> const size_t got_suffix_len = strlen (SYMBOL_GOT_PLT_SUFFIX);
>
> name = bfd_asymbol_name (*relplt->relocation[reloc].sym_ptr_ptr);
> + if (!name)
> + continue;
> address = relplt->relocation[reloc].address;
>
> asection *msym_section;
>
> I think it's better to ignore symbols with NULL names rather than turn
> the name into "(null)" as you do in your patch, since you can't really
> do anything sensible when looking up "(null)@got.plt" later.
I agree that makes more sense. Thank you for investigating and fixing
the problem. Will you submit the patch to GDB?
@@ -2161,11 +2161,12 @@ coff_print_symbol (bfd *abfd,
bfd_print_symbol_type how)
{
FILE * file = (FILE *) filep;
+ const char *symname = symbol->name ? symbol->name : "<null>";
switch (how)
{
case bfd_print_symbol_name:
- fprintf (file, "%s", symbol->name);
+ fprintf (file, "%s", symname);
break;
case bfd_print_symbol_more:
@@ -2189,7 +2190,7 @@ coff_print_symbol (bfd *abfd,
if (combined < obj_raw_syments (abfd)
|| combined >= obj_raw_syments (abfd) + obj_raw_syment_count (abfd))
{
- fprintf (file, _("<corrupt info> %s"), symbol->name);
+ fprintf (file, _("<corrupt info> %s"), symname);
break;
}
@@ -2207,7 +2208,7 @@ coff_print_symbol (bfd *abfd,
combined->u.syment.n_sclass,
combined->u.syment.n_numaux);
bfd_fprintf_vma (abfd, file, val);
- fprintf (file, " %s", symbol->name);
+ fprintf (file, " %s", symname);
for (aux = 0; aux < combined->u.syment.n_numaux; aux++)
{
@@ -2297,7 +2298,8 @@ coff_print_symbol (bfd *abfd,
if (l)
{
- fprintf (file, "\n%s :", l->u.sym->name);
+ fprintf (file, "\n%s :",
+ l->u.sym->name ? l->u.sym->name : "<null>");
l++;
while (l->line_number)
{
@@ -2317,7 +2319,7 @@ coff_print_symbol (bfd *abfd,
symbol->section->name,
coffsymbol (symbol)->native ? "n" : "g",
coffsymbol (symbol)->lineno ? "l" : " ",
- symbol->name);
+ symname);
}
}
}
@@ -1452,11 +1452,12 @@ _bfd_ecoff_print_symbol (bfd *abfd,
const struct ecoff_debug_swap * const debug_swap
= &ecoff_backend (abfd)->debug_swap;
FILE *file = (FILE *)filep;
+ const char *symname = symbol->name ? symbol->name : "<null>";
switch (how)
{
case bfd_print_symbol_name:
- fprintf (file, "%s", symbol->name);
+ fprintf (file, "%s", symname);
break;
case bfd_print_symbol_more:
if (ecoffsymbol (symbol)->local)
@@ -1526,7 +1527,7 @@ _bfd_ecoff_print_symbol (bfd *abfd,
(unsigned) ecoff_ext.asym.sc,
(unsigned) ecoff_ext.asym.index,
jmptbl, cobol_main, weakext,
- symbol->name);
+ symname);
if (ecoffsymbol (symbol)->fdr != NULL
&& ecoff_ext.asym.index != indexNil)
@@ -549,9 +549,7 @@ bfd_elf_sym_name (bfd *abfd,
}
name = bfd_elf_string_from_elf_section (abfd, shindex, iname);
- if (name == NULL)
- name = "(null)";
- else if (sym_sec && *name == '\0')
+ if (sym_sec && name && *name == '\0')
name = bfd_section_name (sym_sec);
return name;
@@ -2314,10 +2312,12 @@ bfd_elf_print_symbol (bfd *abfd,
bfd_print_symbol_type how)
{
FILE *file = (FILE *) filep;
+ const char *symname = symbol->name ? symbol->name : "<null>";
+
switch (how)
{
case bfd_print_symbol_name:
- fprintf (file, "%s", symbol->name);
+ fprintf (file, "%s", symname);
break;
case bfd_print_symbol_more:
fprintf (file, "elf ");
@@ -2340,11 +2340,10 @@ bfd_elf_print_symbol (bfd *abfd,
if (bed->elf_backend_print_symbol_all)
name = (*bed->elf_backend_print_symbol_all) (abfd, filep, symbol);
- if (name == NULL)
- {
- name = symbol->name;
- bfd_print_symbol_vandf (abfd, file, symbol);
- }
+ if (name != NULL)
+ symname = name;
+ else
+ bfd_print_symbol_vandf (abfd, file, symbol);
fprintf (file, " %s\t", section_name);
/* Print the "other" value for a symbol. For common symbols,
@@ -2391,7 +2390,7 @@ bfd_elf_print_symbol (bfd *abfd,
fprintf (file, " 0x%02x", (unsigned int) st_other);
}
- fprintf (file, " %s", name);
+ fprintf (file, " %s", symname);
}
break;
}
@@ -210,16 +210,17 @@ bfd_pef_print_symbol (bfd *abfd,
bfd_print_symbol_type how)
{
FILE *file = (FILE *) afile;
+ const char *symname = symbol->name ? symbol->name : "<null>";
switch (how)
{
case bfd_print_symbol_name:
- fprintf (file, "%s", symbol->name);
+ fprintf (file, "%s", symname);
break;
default:
bfd_print_symbol_vandf (abfd, (void *) file, symbol);
- fprintf (file, " %-5s %s", symbol->section->name, symbol->name);
- if (startswith (symbol->name, "__traceback_"))
+ fprintf (file, " %-5s %s", symbol->section->name, symname);
+ if (startswith (symname, "__traceback_"))
{
unsigned char *buf;
size_t offset = symbol->value + 4;
@@ -777,7 +777,7 @@ bfd_symbol_info (asymbol *symbol, symbol_info *ret)
else
ret->value = symbol->value + symbol->section->vma;
- ret->name = symbol->name;
+ ret->name = symbol->name ? symbol->name : "<null>";
}
/*
@@ -4895,9 +4895,6 @@ ld_is_local_symbol (asymbol * sym)
if (name == NULL || *name == 0)
return false;
- if (strcmp (name, "(null)") == 0)
- return false;
-
/* Skip .Lxxx and such like. */
if (bfd_is_local_label (link_info.output_bfd, sym))
return false;