libdwfl: Read no more than required to parse dynamic sections

Message ID 20221129062653.298772-1-gavin@matician.com
State Committed
Series libdwfl: Read no more than required to parse dynamic sections

Commit Message

Gavin Li Nov. 29, 2022, 6:26 a.m. UTC
  From: Gavin Li <gavin@matician.com>

Since size checking has been moved to dwfl_elf_phdr_memory_callback(),
there is no longer a need for dwfl_segment_report_module() to enforce
the same. Reading beyond the end of the dynamic section actually causes
issues when passing the data to elfXX_xlatetom() because it is possible
that src->d_size is not a multiple of recsize (for ELF_T_DYN, recsize is
16 while the minimum required alignment is 8), causing elfXX_xlatetom()
to return ELF_E_INVALID_DATA.

Signed-off-by: Gavin Li <gavin@matician.com>
---
 libdwfl/dwfl_segment_report_module.c | 6 ------
 1 file changed, 6 deletions(-)
  

Comments

Mark Wielaard Nov. 29, 2022, 3:25 p.m. UTC | #1
Hi Gavin,

On Mon, 2022-11-28 at 22:26 -0800, gavin@matician.com wrote:
> Since size checking has been moved to
> dwfl_elf_phdr_memory_callback(),
> there is no longer a need for dwfl_segment_report_module() to enforce
> the same. Reading beyond the end of the dynamic section actually causes
> issues when passing the data to elfXX_xlatetom() because it is possible
> that src->d_size is not a multiple of recsize (for ELF_T_DYN, recsize is
> 16 while the minimum required alignment is 8), causing elfXX_xlatetom()
> to return ELF_E_INVALID_DATA.

I don't fully follow this logic.

The code as written doesn't seem to guarantee that
dwfl_segment_report_module will always be called with
dwfl_elf_phdr_memory_callback as memory_callback. Although it probably
will be in practice.

So you are removing this check:

>       && ! read_portion (&read_state, &dyn_data, &dyn_data_size,
>  			 start, segment, dyn_vaddr, dyn_filesz))
>      {
> -      /* dyn_data_size will be zero if we got everything from the initial
> -         buffer, otherwise it will be the size of the new buffer that
> -         could be read.  */
> -      if (dyn_data_size != 0)
> -	dyn_filesz = dyn_data_size;
> -

Reading read_portion it shows dyn_data_size being set to zero if
read_state->buffer_available has everything (dyn_filesz) requested.
Otherwise memory_callback (we assume dwfl_elf_phdr_memory_callback) is
called with *data_size = dyn_filesz. Which will then be set to the
actual buffer size being read.

So dyn_data_size might be bigger than the dyn_filesz we are requesting?
Or smaller, I assume.

If you are protecting against it becoming bigger, should the check be
changed to:

	if (dyn_data_size != 0 && dyn_data_size < dyn_filesz)
	  dyn_filesz = dyn_data_size;

?
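
The difference between the removed assignment and the clamped variant
proposed above can be sketched as follows. This is a minimal model, not
the libdwfl code: the demo_* helpers are hypothetical, and the only
convention taken from the thread is that a data_size of zero means the
request was served from the initial buffer.

```c
#include <stddef.h>

/* Hypothetical model of the removed code: any non-zero size reported by
   the memory callback replaces the requested size, even when the
   callback reports far more than was asked for.  */
static size_t
demo_old_filesz (size_t filesz, size_t data_size)
{
  return data_size != 0 ? data_size : filesz;
}

/* Hypothetical model of the proposed check: the reported size is only
   used when it is smaller than the request, so the size can shrink but
   never grow.  */
static size_t
demo_clamped_filesz (size_t filesz, size_t data_size)
{
  return (data_size != 0 && data_size < filesz) ? data_size : filesz;
}
```

With an illustrative request of 464 bytes and a callback that reports
130000 bytes, demo_old_filesz yields 130000 while demo_clamped_filesz
keeps 464.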

Thanks,

Mark
  
Gavin Li Nov. 29, 2022, 9:48 p.m. UTC | #2
Hi Mark,

Thanks for looking over this patch. Responses are inline.

> The code as written doesn't seem to guarantee that
> dwfl_segment_report_module will always be called with
> dwfl_elf_phdr_memory_callback as memory_callback. Although it probably
> will be in practice.

All file/line references relate to commit
98bdf533c4990728f0db653ab4e98a503d7654ce.

dwfl_segment_report_module is an internal function that is currently only
called from dwfl_core_file_report. Because of this, I think it would be fine
to require that memory_callback implementations enforce minread or return an
error. dwfl_elf_phdr_memory_callback does return an error at core-file.c:336
if the amount that can be read is less than minread.
dwfl_segment_report_module.c:340 does not check buffer_available either,
since it assumes that memory_callback will return an error if it is unable
to read sizeof(Elf64_Ehdr) bytes.

The main issue I am currently seeing is that dwfl_elf_phdr_memory_callback
can return a *buffer_available that is sometimes much larger than minread,
especially if the ELF file is mmaped (elf->map_address != NULL). See
core-file.c:324-325.

For example, on my core file, opened with elf_begin(fd, ELF_C_READ_MMAP, NULL),
dyn_data_size would be set to about 130000, representing all the data between
the point at which the dynamic section is found in the core file and the end
of the core file itself (which is around 454KB). We then pass the 130KB buffer
to elf64_xlatetom, which, if we're fortunate, returns an error because the
buffer size is not a multiple of sizeof(Elf64_Dyn), but if we're unfortunate,
it treats the whole 130KB buffer as Elf64_Dyn entries and fills xlateto.d_buf
with garbage.
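
The two outcomes can be illustrated with a small size calculation. This is
only a sketch under stated assumptions: struct demo_dyn is a hypothetical
16-byte stand-in for Elf64_Dyn, demo_entry_count is a hypothetical helper
that models a "size must be a multiple of the record size" check rather
than calling libelf, and the byte counts in the note below are
illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same layout size as Elf64_Dyn
   (a 64-bit tag followed by a 64-bit value, 16 bytes in total).  */
struct demo_dyn
{
  int64_t tag;
  uint64_t val;
};

/* Returns the number of whole entries in a buffer of NBYTES bytes, or 0
   when NBYTES is not a multiple of ENTSIZE, which models the case where
   the translation is rejected.  */
static size_t
demo_entry_count (size_t nbytes, size_t entsize)
{
  assert (entsize != 0);
  if (nbytes % entsize != 0)
    return 0;
  return nbytes / entsize;
}
```

For a 464-byte dynamic section this gives 29 entries. A buffer of 130008
bytes is 8-byte aligned but not a multiple of 16, so it is rejected, while
a buffer of exactly 130000 bytes would be accepted as 8125 entries, nearly
all of them unrelated data.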

Similar behavior likely occurs everywhere read_portion() is used:
dwfl_segment_report_module.c lines 447-453, 545-546.

> Reading read_portion it shows dyn_data_size being set to zero if
> read_state->buffer_available has everything (dyn_filesz) requested.
> Otherwise memory_callback (we assume dwfl_elf_phdr_memory_callback) is
> called with *data_size = dyn_filesz. Which will then be set to the
> actual buffer size being read.

dwfl_elf_phdr_memory_callback() may read much more than minread or *buffer_size
if the ELF file is already mapped in as described above.

> If you are protecting against it becoming bigger, should the check be
> changed to:
>
>         if (dyn_data_size != 0 && dyn_data_size < dyn_filesz)
>           dyn_filesz = dyn_data_size;
>

I think for the purposes of reading small segments (like PT_DYNAMIC
and PT_NOTE), we should ignore *buffer_available altogether.

Best,
Gavin
  
Mark Wielaard Nov. 30, 2022, 11:14 p.m. UTC | #3
Hi Gavin,

On Tue, Nov 29, 2022 at 01:48:42PM -0800, Gavin Li wrote:
> I think for the purposes of reading small segments (like PT_DYNAMIC
> and PT_NOTE), we should ignore *buffer_available altogether.

Thanks for walking me through the code. I think you are right and none
of the buffer_available checks are necessary. So I removed them all. I
also adjusted the commit message a bit. Could you look at this patch
and let me know if this works for you?

Cheers,

Mark
  
Gavin Li Dec. 1, 2022, 9:13 p.m. UTC | #4
Awesome, thanks for looking over this. I only have one comment:
there's an extra "xlatefrom.d_size = xlatefrom.d_size;" line that
should be removed.

dwfl_elf_phdr_memory_callback is called from dwfl_link_map_report
but if any issues arise, those could be addressed in a separate patch.

Best,
Gavin

  
Mark Wielaard Dec. 13, 2022, 4:49 p.m. UTC | #5
Hi Gavin,

On Thu, 2022-12-01 at 13:13 -0800, Gavin Li wrote:
> Awesome, thanks for looking over this. I only have one comment:
> there's an extra "xlatefrom.d_size = xlatefrom.d_size;" line that
> should be removed.

Thanks for spotting that. Odd the compiler didn't warn for that. There
is a xlatefrom.d_size = phnum * phentsize; just before this that does
the correct assignment.

Pushed with that line removed.

Cheers,

Mark
  

Patch

diff --git a/libdwfl/dwfl_segment_report_module.c b/libdwfl/dwfl_segment_report_module.c
index 287fc002..08aca0eb 100644
--- a/libdwfl/dwfl_segment_report_module.c
+++ b/libdwfl/dwfl_segment_report_module.c
@@ -821,12 +821,6 @@  dwfl_segment_report_module (Dwfl *dwfl, int ndx, const char *name,
       && ! read_portion (&read_state, &dyn_data, &dyn_data_size,
 			 start, segment, dyn_vaddr, dyn_filesz))
     {
-      /* dyn_data_size will be zero if we got everything from the initial
-         buffer, otherwise it will be the size of the new buffer that
-         could be read.  */
-      if (dyn_data_size != 0)
-	dyn_filesz = dyn_data_size;
-
       if ((dyn_filesz / dyn_entsize) == 0
 	  || dyn_filesz > (SIZE_MAX / dyn_entsize))
 	goto out;