Commit: readelf: warn about local symbols outside of a mergeable section

Message ID 87ikemqyd2.fsf@redhat.com
State New
Headers
Series Commit: readelf: warn about local symbols outside of a mergeable section |

Commit Message

Nick Clifton Dec. 4, 2025, 11:29 a.m. UTC
  Hi Guys,

  Whilst investigating a problem with corrupt Risc-V debug information
  I discovered that some object files contained local symbols in the
  .debug_line_str section that referred to locations beyond the end of
  the section.  This is problematic because when that section is merged
  with other .debug_line_str sections the symbol will no longer
  reference the correct location, and the merging code does not handle
  symbols that extend beyond the end of the section.

  So I have added a small patch to readelf to detect and warn about such
  symbols.  Hopefully this should allow the quick location of
  problematic object files.

Cheers
  Nick
  

Comments

Alan Modra Dec. 8, 2025, 7:11 a.m. UTC | #1
On Thu, Dec 04, 2025 at 11:29:13AM +0000, Nick Clifton wrote:
> Hi Guys,
> 
>   Whilst investigating a problem with corrupt Risc-V debug information
>   I discovered that some object files contained local symbols in the
>   .debug_line_str section that referred to locations beyond the end of
>   the section.  This is problematic because when that section is merged
>   with other .debug_line_str sections the symbol will no longer
>   reference the correct location, and the merging code does not handle
>   symbols that extend beyond the end of the section.
> 
>   So I have added a small patch to readelf to detect and warn about such
>   symbols.  Hopefully this should allow the quick location of
>   problematic object files.

Nick, this is broken for executables and shared libraries, where ELF
syms have st_value as their address.  SHN_COMMON sym st_value also are
not section offsets (but you shouldn't see those in SHF_MERGE
sections).

It show up as arm-linux and aarch64-linux testsuite regressions.
  
Alan Modra Dec. 8, 2025, 9:20 a.m. UTC | #2
On Mon, Dec 08, 2025 at 05:41:04PM +1030, Alan Modra wrote:
> On Thu, Dec 04, 2025 at 11:29:13AM +0000, Nick Clifton wrote:
> > Hi Guys,
> > 
> >   Whilst investigating a problem with corrupt Risc-V debug information
> >   I discovered that some object files contained local symbols in the
> >   .debug_line_str section that referred to locations beyond the end of
> >   the section.  This is problematic because when that section is merged
> >   with other .debug_line_str sections the symbol will no longer
> >   reference the correct location, and the merging code does not handle
> >   symbols that extend beyond the end of the section.
> > 
> >   So I have added a small patch to readelf to detect and warn about such
> >   symbols.  Hopefully this should allow the quick location of
> >   problematic object files.
> 
> Nick, this is broken for executables and shared libraries, where ELF
> syms have st_value as their address.  SHN_COMMON sym st_value also are
> not section offsets (but you shouldn't see those in SHF_MERGE
> sections).
> 
> It show up as arm-linux and aarch64-linux testsuite regressions.

This works for me, and I believe is the only case where anyone really
cares about SHF_MERGE symbols being outside the section.

From 1730f571c2690cdd276ba946540b664968a1c37a Mon Sep 17 00:00:00 2001
From: Alan Modra <amodra@gmail.com>
Date: Mon, 8 Dec 2025 19:44:35 +1030
Subject: Re: Add warning message to readelf for local symbols

Limit the warning to ET_REL.

diff --git a/binutils/readelf.c b/binutils/readelf.c
index 064c16056a2..fb321b713d3 100644
--- a/binutils/readelf.c
+++ b/binutils/readelf.c
@@ -14806,6 +14806,7 @@ print_symbol (Filedata *           filedata,
       && ! is_special
       && is_valid
       && psym->st_shndx < filedata->file_header.e_shnum
+      && filedata->file_header.e_type == ET_REL
       && filedata->section_headers != NULL
       /* FIXME: Should we warn for non-mergeable sections ? */
       && (filedata->section_headers[psym->st_shndx].sh_flags & SHF_MERGE)
  
Nick Clifton Dec. 8, 2025, 2:18 p.m. UTC | #3
Hi Alan,

>> Nick, this is broken for executables and shared libraries, where ELF
>> syms have st_value as their address.  SHN_COMMON sym st_value also are
>> not section offsets (but you shouldn't see those in SHF_MERGE
>> sections).
>>
>> It show up as arm-linux and aarch64-linux testsuite regressions.
> 
> This works for me, and I believe is the only case where anyone really
> cares about SHF_MERGE symbols being outside the section.
> 
>  From 1730f571c2690cdd276ba946540b664968a1c37a Mon Sep 17 00:00:00 2001
> From: Alan Modra <amodra@gmail.com>
> Date: Mon, 8 Dec 2025 19:44:35 +1030
> Subject: Re: Add warning message to readelf for local symbols
> 
> Limit the warning to ET_REL.
> 
> diff --git a/binutils/readelf.c b/binutils/readelf.c
> index 064c16056a2..fb321b713d3 100644
> --- a/binutils/readelf.c
> +++ b/binutils/readelf.c
> @@ -14806,6 +14806,7 @@ print_symbol (Filedata *           filedata,
>         && ! is_special
>         && is_valid
>         && psym->st_shndx < filedata->file_header.e_shnum
> +      && filedata->file_header.e_type == ET_REL
>         && filedata->section_headers != NULL
>         /* FIXME: Should we warn for non-mergeable sections ? */
>         && (filedata->section_headers[psym->st_shndx].sh_flags & SHF_MERGE)

Thanks for this.  I will go ahead and apply this change, along with
my fix for avoiding mapping symbols.

Cheers
   Nick
  
Alan Modra Dec. 8, 2025, 9:09 p.m. UTC | #4
On Mon, Dec 08, 2025 at 02:18:57PM +0000, Nick Clifton wrote:
> Hi Alan,
> 
> > > Nick, this is broken for executables and shared libraries, where ELF
> > > syms have st_value as their address.  SHN_COMMON sym st_value also are
> > > not section offsets (but you shouldn't see those in SHF_MERGE
> > > sections).
> > > 
> > > It show up as arm-linux and aarch64-linux testsuite regressions.
> > 
> > This works for me, and I believe is the only case where anyone really
> > cares about SHF_MERGE symbols being outside the section.
> > 
> >  From 1730f571c2690cdd276ba946540b664968a1c37a Mon Sep 17 00:00:00 2001
> > From: Alan Modra <amodra@gmail.com>
> > Date: Mon, 8 Dec 2025 19:44:35 +1030
> > Subject: Re: Add warning message to readelf for local symbols
> > 
> > Limit the warning to ET_REL.
> > 
> > diff --git a/binutils/readelf.c b/binutils/readelf.c
> > index 064c16056a2..fb321b713d3 100644
> > --- a/binutils/readelf.c
> > +++ b/binutils/readelf.c
> > @@ -14806,6 +14806,7 @@ print_symbol (Filedata *           filedata,
> >         && ! is_special
> >         && is_valid
> >         && psym->st_shndx < filedata->file_header.e_shnum
> > +      && filedata->file_header.e_type == ET_REL
> >         && filedata->section_headers != NULL
> >         /* FIXME: Should we warn for non-mergeable sections ? */
> >         && (filedata->section_headers[psym->st_shndx].sh_flags & SHF_MERGE)
> 
> Thanks for this.  I will go ahead and apply this change, along with
> my fix for avoiding mapping symbols.

Do you really need to exclude mapping symbols?  That just looks wrong
to me.
  
Nick Clifton Dec. 9, 2025, 9:22 a.m. UTC | #5
Hi Alan,

>> Thanks for this.  I will go ahead and apply this change, along with
>> my fix for avoiding mapping symbols.
> 
> Do you really need to exclude mapping symbols?  That just looks wrong
> to me.

I thought so too, but they were showing up in linker tests on AArch64 native
builds so I added the check.

Besides, what exactly does it mean for mapping symbols to be assigned to a
mergeable section ?  Is there any case where you would find both code and
data in such a section ?   I could not think of any scenario for this, but
since it does happen I assumed that I was just missing something and so
added the code to skip that kind of symbol.

Cheers
   Nick
  
Alan Modra Dec. 9, 2025, 9:40 a.m. UTC | #6
On Tue, Dec 09, 2025 at 09:22:16AM +0000, Nick Clifton wrote:
> Hi Alan,
> 
> > > Thanks for this.  I will go ahead and apply this change, along with
> > > my fix for avoiding mapping symbols.
> > 
> > Do you really need to exclude mapping symbols?  That just looks wrong
> > to me.
> 
> I thought so too, but they were showing up in linker tests on AArch64 native
> builds so I added the check.
> 
> Besides, what exactly does it mean for mapping symbols to be assigned to a
> mergeable section ?  Is there any case where you would find both code and
> data in such a section ?   I could not think of any scenario for this, but
> since it does happen I assumed that I was just missing something and so
> added the code to skip that kind of symbol.

Yes, but that was when you incorrectly compared st_value in ET_EXEC and
ET_DYN against section size, wasn't it?  ELF symbol values in
executables are addresses, not section offsets.
  
Jan Beulich Dec. 9, 2025, 10:50 a.m. UTC | #7
On 09.12.2025 10:22, Nick Clifton wrote:
>>> Thanks for this.  I will go ahead and apply this change, along with
>>> my fix for avoiding mapping symbols.
>>
>> Do you really need to exclude mapping symbols?  That just looks wrong
>> to me.
> 
> I thought so too, but they were showing up in linker tests on AArch64 native
> builds so I added the check.
> 
> Besides, what exactly does it mean for mapping symbols to be assigned to a
> mergeable section ?  Is there any case where you would find both code and
> data in such a section ?

Code snippets, e.g. for approaches like Linux'es alternatives patching,
could easily be a mix of code and data, and they could also live in a
mergeable section.

Jan

>   I could not think of any scenario for this, but
> since it does happen I assumed that I was just missing something and so
> added the code to skip that kind of symbol.
> 
> Cheers
>    Nick
> 
>
  

Patch

diff --git a/binutils/readelf.c b/binutils/readelf.c
index 425b7b78653..759c0367c72 100644
--- a/binutils/readelf.c
+++ b/binutils/readelf.c
@@ -14739,6 +14739,8 @@  print_symbol (Filedata *           filedata,
 	}
     }
 
+  bool is_valid = false;
+
   /* Get the symbol's name.  For section symbols without a
      specific name use the (already computed) section name.  */
   if (ELF_ST_TYPE (psym->st_info) == STT_SECTION
@@ -14749,8 +14751,6 @@  print_symbol (Filedata *           filedata,
     }
   else
     {
-      bool is_valid;
-
       is_valid = valid_symbol_name (strtab, strtab_size, psym->st_name);
       sstr = is_valid  ? strtab + psym->st_name : _("<corrupt>");
     }
@@ -14798,6 +14798,23 @@  print_symbol (Filedata *           filedata,
       && filedata->file_header.e_ident[EI_OSABI] != ELFOSABI_SOLARIS)
     warn (_("local symbol %" PRIu64 " found at index >= %s's sh_info value of %u\n"),
 	  symbol_index, printable_section_name (filedata, section), section->sh_info);
+
+  /* Local symbols whose value is larger than their section's size are suspicious
+     especially if that section is mergeable - and hence might change offsets of
+     the contents inside the section.  */
+  if (ELF_ST_BIND (psym->st_info) == STB_LOCAL
+      && ! is_special
+      && is_valid
+      && psym->st_shndx < filedata->file_header.e_shnum
+      && filedata->section_headers != NULL
+      /* FIXME: Should we warn for non-mergeable sections ? */
+      && (filedata->section_headers[psym->st_shndx].sh_flags & SHF_MERGE)
+      && psym->st_value > filedata->section_headers[psym->st_shndx].sh_size)
+    warn (_("local symbol %s has a value (%#" PRIx64 ") which is larger than mergeable section %s's size (%#" PRIx64 ")\n"),
+	  strtab + psym->st_name,
+	  psym->st_value,
+	  printable_section_name_from_index (filedata, psym->st_shndx, NULL),
+	  filedata->section_headers[psym->st_shndx].sh_size);
 }
 
 static const char *