[PATCHv2] libelf: Return already gotten Elf_Data from elf_getdata_rawchunk

Message ID: 20220401141525.3384-1-mark@klomp.org
State: Committed
Series: [PATCHv2] libelf: Return already gotten Elf_Data from elf_getdata_rawchunk

Commit Message

Mark Wielaard April 1, 2022, 2:15 p.m. UTC
elf_getdata_rawchunk keeps a list of Elf_Data_Chunk to track which
Elf_Data structures have already been requested. This allows elf_end
to clean up all internal data structures and the Elf_Data d_buf if
it was malloced.

But it didn't check whether a chunk had already been requested
earlier. This meant that if, for example, dwelf_elf_gnu_build_id was
called multiple times to look up a build-id from the phdrs, a new
Elf_Data_Chunk was created on every call. This could slowly leak
memory.

So also keep track of the offset at which rawdata of a given size and
type was requested, so we can return the existing data if it is
requested multiple times.

Note that the current cache is a simple linked list, but the chain
is normally short. It is normally used to get chunks from the phdrs,
and there are normally fewer than 10.
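
The lookup described above can be modelled outside libelf. The
following is a minimal sketch, not the libelf implementation: struct
chunk, struct chunk_cache, get_chunk and free_chunks are hypothetical
names, and the int type field stands in for Elf_Type. It assumes the
same match rule as the patch (same offset, size and type, with
zero-sized requests matching at any offset) and a plain linear scan
of the list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for Elf_Data_Chunk: only the lookup keys.  */
struct chunk
{
  int64_t offset;		/* Raw offset the chunk was requested at.  */
  size_t size;			/* Number of bytes requested.  */
  int type;			/* Stand-in for Elf_Type.  */
  struct chunk *next;
};

/* Hypothetical stand-in for the per-Elf list of handed-out chunks.  */
struct chunk_cache
{
  struct chunk *head;
  size_t count;
};

/* Return the chunk for (offset, size, type).  An existing entry is
   reused; a new one is only allocated on a miss.  Returns NULL if
   the allocation fails.  */
struct chunk *
get_chunk (struct chunk_cache *cache, int64_t offset, size_t size, int type)
{
  /* Linear scan; the list is assumed to stay short.  */
  for (struct chunk *c = cache->head; c != NULL; c = c->next)
    if ((c->offset == offset || size == 0)
	&& c->size == size
	&& c->type == type)
      return c;

  struct chunk *c = malloc (sizeof *c);
  if (c == NULL)
    return NULL;

  c->offset = offset;
  c->size = size;
  c->type = type;
  c->next = cache->head;
  cache->head = c;
  cache->count++;
  return c;
}

/* Release every chunk, the way elf_end cleans up at the end.  */
void
free_chunks (struct chunk_cache *cache)
{
  while (cache->head != NULL)
    {
      struct chunk *next = cache->head->next;
      free (cache->head);
      cache->head = next;
    }
  cache->count = 0;
}
```

With this model, asking twice for the same offset, size and type
returns the same pointer and leaves the list at one entry. Without
the scan at the top of get_chunk, every call would add an entry that
is only released by free_chunks.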

Signed-off-by: Mark Wielaard <mark@klomp.org>
---
 libelf/ChangeLog              |  7 +++++++
 libelf/elf_getdata_rawchunk.c | 16 ++++++++++++++++
 libelf/libelfP.h              |  1 +
 3 files changed, 24 insertions(+)

V2 now with actual code.
  

Comments

Mark Wielaard April 5, 2022, 1:13 p.m. UTC | #1
Hi,

On Fri, 2022-04-01 at 16:15 +0200, Mark Wielaard wrote:
> elf_getdata_rawchunk keeps a list of Elf_Data_Chunk to track which
> Elf_Data structures have already been requested. This allows elf_end
> to clean up all internal data structures and the Elf_Data d_buf if
> it was malloced.
> 
> But it didn't check whether a chunk had already been requested
> earlier. This meant that if, for example, dwelf_elf_gnu_build_id was
> called multiple times to look up a build-id from the phdrs, a new
> Elf_Data_Chunk was created on every call. This could slowly leak
> memory.
> 
> So also keep track of the offset at which rawdata of a given size and
> type was requested, so we can return the existing data if it is
> requested multiple times.
> 
> Note that the current cache is a simple linked list, but the chain
> is normally short. It is normally used to get chunks from the phdrs,
> and there are normally fewer than 10.

I pushed this.

Cheers,

Mark
  

Patch

diff --git a/libelf/ChangeLog b/libelf/ChangeLog
index 299179cb..985f795d 100644
--- a/libelf/ChangeLog
+++ b/libelf/ChangeLog
@@ -1,3 +1,10 @@ 
+2022-04-01  Mark Wielaard  <mark@klomp.org>
+
+	* libelfP.h (struct Elf_Data_Chunk): Add an int64_t offset field.
+	* elf_getdata_rawchunk.c (elf_getdata_rawchunk): Check whether the
+	requested chunk, offset, size and type, was already handed out.
+	Set new Elf_Data_Chunk offset field.
+
 2022-03-29  Mark Wielaard  <mark@klomp.org>
 
 	* gelf_xlate.c (START): Define and use sz variable.
diff --git a/libelf/elf_getdata_rawchunk.c b/libelf/elf_getdata_rawchunk.c
index 1072f7de..2f55cbb4 100644
--- a/libelf/elf_getdata_rawchunk.c
+++ b/libelf/elf_getdata_rawchunk.c
@@ -1,5 +1,6 @@ 
 /* Return converted data from raw chunk of ELF file.
    Copyright (C) 2007, 2014, 2015 Red Hat, Inc.
+   Copyright (C) 2022 Mark J. Wielaard <mark@klomp.org>
    This file is part of elfutils.
 
    This file is free software; you can redistribute it and/or modify
@@ -75,6 +76,20 @@  elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
 
   rwlock_rdlock (elf->lock);
 
+  /* Maybe we already got this chunk?  */
+  Elf_Data_Chunk *rawchunks = elf->state.elf.rawchunks;
+  while (rawchunks != NULL)
+    {
+      if ((rawchunks->offset == offset || size == 0)
+	  && rawchunks->data.d.d_size == size
+	  && rawchunks->data.d.d_type == type)
+	{
+	  result = &rawchunks->data.d;
+	  goto out;
+	}
+      rawchunks = rawchunks->next;
+    }
+
   size_t align = __libelf_type_align (elf->class, type);
   if (elf->map_address != NULL)
     {
@@ -171,6 +186,7 @@  elf_getdata_rawchunk (Elf *elf, int64_t offset, size_t size, Elf_Type type)
   chunk->data.d.d_type = type;
   chunk->data.d.d_align = align;
   chunk->data.d.d_version = EV_CURRENT;
+  chunk->offset = offset;
 
   rwlock_unlock (elf->lock);
   rwlock_wrlock (elf->lock);
diff --git a/libelf/libelfP.h b/libelf/libelfP.h
index 2c6995bb..56331f45 100644
--- a/libelf/libelfP.h
+++ b/libelf/libelfP.h
@@ -266,6 +266,7 @@  typedef struct Elf_Data_Chunk
     Elf_Scn dummy_scn;
     struct Elf_Data_Chunk *next;
   };
+  int64_t offset;		/* The original raw offset in the Elf image.  */
 } Elf_Data_Chunk;