[1/2] libdwfl: handle duplicate ELFs when reporting archives

Message ID 20231117223541.920797-1-vvvvvv@google.com
State Superseded
Headers
Series [1/2] libdwfl: handle duplicate ELFs when reporting archives |

Commit Message

Aleksei Vetrov Nov. 17, 2023, 10:35 p.m. UTC
  From: Aleksei Vetrov <vvvvvv@google.com>

When archive is processed in process_archive (libdwfl/offline.c), it
creates an Elf object for each archive member. Then in
process_archive_member it calls process_file to create a Dwfl_Module
through __libdwfl_report_elf.

The ownership of the Elf object is expected to be:

* either transfered to the Dwfl_Module, if __libdwfl_report_elf returns
  not NULL;

* or handled at the end of process_archive_member by calling elf_end.

Moreover, Elf object is expected to be alive, if __libdwfl_report_elf
returns not NULL, because at the end of process_archive_member it
advances to the next member through the elf_next call.

The problem happens when __libdwfl_report_elf encounters Elf with the
same name and content as it seen before. In that case dwfl_report_module
will reuse existing Dwfl_Module object. This leads to a codepath that
calls elf_end on the Elf object, while returning not NULL, breaking the
elf_next call to the next member.

The fix is to destroy m->main.elf instead and put the new Elf object in
the already existing Dwfl_Module.

    * libdwfl/dwfl_report_elf.c (__libdwfl_report_elf): Replace Elf in
      the Dwfl_Module in case of overlapping or duplicate modules to
      prolong its lifetime for subsequent processing.

Signed-off-by: Aleksei Vetrov <vvvvvv@google.com>
---
 libdwfl/dwfl_report_elf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
  

Comments

Mark Wielaard Nov. 18, 2023, 10:47 p.m. UTC | #1
Hi Aleksei,

On Fri, Nov 17, 2023 at 10:35:40PM +0000, vvvvvv@google.com wrote:
> When archive is processed in process_archive (libdwfl/offline.c), it
> creates an Elf object for each archive member. Then in
> process_archive_member it calls process_file to create a Dwfl_Module
> through __libdwfl_report_elf.
> 
> The ownership of the Elf object is expected to be:
> 
> * either transfered to the Dwfl_Module, if __libdwfl_report_elf returns
>   not NULL;
> 
> * or handled at the end of process_archive_member by calling elf_end.
> 
> Moreover, Elf object is expected to be alive, if __libdwfl_report_elf
> returns not NULL, because at the end of process_archive_member it
> advances to the next member through the elf_next call.
> 
> The problem happens when __libdwfl_report_elf encounters Elf with the
> same name and content as it seen before. In that case dwfl_report_module
> will reuse existing Dwfl_Module object. This leads to a codepath that
> calls elf_end on the Elf object, while returning not NULL, breaking the
> elf_next call to the next member.
> 
> The fix is to destroy m->main.elf instead and put the new Elf object in
> the already existing Dwfl_Module.
> 
>     * libdwfl/dwfl_report_elf.c (__libdwfl_report_elf): Replace Elf in
>       the Dwfl_Module in case of overlapping or duplicate modules to
>       prolong its lifetime for subsequent processing.

Thanks for that analysis and proposed solution. The ownership
reasoning makes sense. But I do have one question.

> diff --git a/libdwfl/dwfl_report_elf.c b/libdwfl/dwfl_report_elf.c
> index 581f4079..58b06aea 100644
> --- a/libdwfl/dwfl_report_elf.c
> +++ b/libdwfl/dwfl_report_elf.c
> @@ -276,7 +276,8 @@ __libdwfl_report_elf (Dwfl *dwfl, const char *name, const char *file_name,
>  	}
>        else
>  	{
> -	  elf_end (elf);
> +	  elf_end (m->main.elf);
> +	  m->main.elf = elf;
>  	  if (m->main_bias != bias
>  	      || m->main.vaddr != vaddr || m->main.address_sync != address_sync)
>  	    goto overlap;

If we goto overlap here don't we still have a problem? overlap will
set m->gc = true; and return NULL. So the caller will think they
still owns the elf handle and will probably close it. But then when
the module is GCed in dwfl_report_end it will close the elf handle
again.

Should we instead move the elf_end and reassignment of main.elf to
after this if statement?

Thanks,

Mark
  
Aleksei Vetrov Nov. 20, 2023, 5:39 p.m. UTC | #2
Hello Mark,

On Sat, Nov 18, 2023 at 10:47 PM Mark Wielaard <mark@klomp.org> wrote:
> If we goto overlap here don't we still have a problem? overlap will
> set m->gc = true; and return NULL. So the caller will think they
> still owns the elf handle and will probably close it. But then when
> the module is GCed in dwfl_report_end it will close the elf handle
> again.

Thank you for noticing! Yes, this is oversight from my side.

> Should we instead move the elf_end and reassignment of main.elf to
> after this if statement?

Will fix exactly like this in [PATCH v2].
  

Patch

diff --git a/libdwfl/dwfl_report_elf.c b/libdwfl/dwfl_report_elf.c
index 581f4079..58b06aea 100644
--- a/libdwfl/dwfl_report_elf.c
+++ b/libdwfl/dwfl_report_elf.c
@@ -276,7 +276,8 @@  __libdwfl_report_elf (Dwfl *dwfl, const char *name, const char *file_name,
 	}
       else
 	{
-	  elf_end (elf);
+	  elf_end (m->main.elf);
+	  m->main.elf = elf;
 	  if (m->main_bias != bias
 	      || m->main.vaddr != vaddr || m->main.address_sync != address_sync)
 	    goto overlap;