[RFC] readelf: partial support of ZSTD compression
Commit Message
Support decompression of ZSTD sections and add support
for it when -SWz is used:
...
[30] .debug_abbrev PROGBITS 0000000000000000 00001f9d 00000168 0 C 0 0 1
[ELF ZSTD (2) 000002fc 1]
...
One TODO I see is that:
+libelf_so_LDLIBS = $(libelf_so_DEPS) -lz -lzstd
should be conditional based on HAVE_ZSTD. But I don't know how to do that?
And the second part should cover elfcompress where it should support ZSTD compression,
leaving that for now.
Martin
libelf/ChangeLog:
* Makefile.am: Add -lzstd.
* elf.h (ELFCOMPRESS_ZSTD): Add new value.
* elf_compress.c (__libelf_decompress): Dispatch based
on the compression algorithm.
(__libelf_decompress_zlib): New.
(__libelf_decompress_zstd): New.
(__libelf_decompress_elf): Pass type of compression to
__libelf_decompress.
* elf_compress_gnu.c (elf_compress_gnu): Use ELFCOMPRESS_ZLIB
as .z* sections can be only compressed with ZLIB.
* libelfP.h (__libelf_decompress): Change signature.
m4/ChangeLog:
* zstd.m4: New file.
src/ChangeLog:
* readelf.c (elf_ch_type_name): Use switch and support ZSTD.
---
libelf/Makefile.am | 2 +-
libelf/elf.h | 3 +++
libelf/elf_compress.c | 56 ++++++++++++++++++++++++++++++++++++---
libelf/elf_compress_gnu.c | 2 +-
libelf/libelfP.h | 2 +-
m4/zstd.m4 | 23 ++++++++++++++++
src/readelf.c | 18 ++++++++-----
7 files changed, 93 insertions(+), 13 deletions(-)
create mode 100644 m4/zstd.m4
Comments
On Mon, Oct 24, 2022 at 01:09:59PM +0200, Martin Liška wrote:
[...]
> One TODO I see is that:
> +libelf_so_LDLIBS = $(libelf_so_DEPS) -lz -lzstd
>
> should be conditional based on HAVE_ZSTD. But I don't know how to do that?
I suppose you're talking about libzstd_LIBS.
[...]
> diff --git a/m4/zstd.m4 b/m4/zstd.m4
> new file mode 100644
> index 00000000..6da4db68
> --- /dev/null
> +++ b/m4/zstd.m4
> @@ -0,0 +1,23 @@
> +dnl Copyright (C) 2022 Free Software Foundation, Inc.
> +dnl This file is free software, distributed under the terms of the GNU
> +dnl General Public License. As a special exception to the GNU General
> +dnl Public License, this file may be distributed as part of a program
> +dnl that contains a configuration script generated by Autoconf, under
> +dnl the same distribution terms as the rest of that program.
> +
> +dnl Enable features using the zstd library.
> +AC_DEFUN([AC_ZSTD], [
> +AC_ARG_WITH(zstd,
> + [AS_HELP_STRING([--with-zstd], [support zstd compressed debug sections (default=auto)])],
> + [], [with_zstd=auto])
Where does this code come from?
I though the "AC_" prefix is reserved for the GNU Autoconf.
Also, looks like it would be more appropriate to call it --enable-zstd
rather than --with-zstd.
On 10/24/22 13:41, Dmitry V. Levin wrote:
> On Mon, Oct 24, 2022 at 01:09:59PM +0200, Martin Liška wrote:
> [...]
>> One TODO I see is that:
>> +libelf_so_LDLIBS = $(libelf_so_DEPS) -lz -lzstd
>>
>> should be conditional based on HAVE_ZSTD. But I don't know how to do that?
>
> I suppose you're talking about libzstd_LIBS.
Hm, can't see it after autoreconf -fi and ./configure.
>
> [...]
>> diff --git a/m4/zstd.m4 b/m4/zstd.m4
>> new file mode 100644
>> index 00000000..6da4db68
>> --- /dev/null
>> +++ b/m4/zstd.m4
>> @@ -0,0 +1,23 @@
>> +dnl Copyright (C) 2022 Free Software Foundation, Inc.
>> +dnl This file is free software, distributed under the terms of the GNU
>> +dnl General Public License. As a special exception to the GNU General
>> +dnl Public License, this file may be distributed as part of a program
>> +dnl that contains a configuration script generated by Autoconf, under
>> +dnl the same distribution terms as the rest of that program.
>> +
>> +dnl Enable features using the zstd library.
>> +AC_DEFUN([AC_ZSTD], [
>> +AC_ARG_WITH(zstd,
>> + [AS_HELP_STRING([--with-zstd], [support zstd compressed debug sections (default=auto)])],
>> + [], [with_zstd=auto])
>
> Where does this code come from?
> I though the "AC_" prefix is reserved for the GNU Autoconf.
It comes from binutils './config/zstd.m4' file.
> Also, looks like it would be more appropriate to call it --enable-zstd
> rather than --with-zstd.
>
Ah, I see.
Thanks,
Martin
On Mon, Oct 24, 2022 at 02:17:17PM +0200, Martin Liška wrote:
> On 10/24/22 13:41, Dmitry V. Levin wrote:
> > On Mon, Oct 24, 2022 at 01:09:59PM +0200, Martin Liška wrote:
> > [...]
> >> One TODO I see is that:
> >> +libelf_so_LDLIBS = $(libelf_so_DEPS) -lz -lzstd
> >>
> >> should be conditional based on HAVE_ZSTD. But I don't know how to do that?
> >
> > I suppose you're talking about libzstd_LIBS.
>
> Hm, can't see it after autoreconf -fi and ./configure.
That's because you do
PKG_CHECK_MODULES(ZSTD, [libzstd], ...)
and this defines ZSTD_CFLAGS and ZSTD_LIBS instead of libzstd_CFLAGS
and libzstd_LIBS because PKG_CHECK_MODULES() uses its first argument
as the prefix for these variables.
On 10/24/22 18:48, Dmitry V. Levin wrote:
> On Mon, Oct 24, 2022 at 02:17:17PM +0200, Martin Liška wrote:
>> On 10/24/22 13:41, Dmitry V. Levin wrote:
>>> On Mon, Oct 24, 2022 at 01:09:59PM +0200, Martin Liška wrote:
>>> [...]
>>>> One TODO I see is that:
>>>> +libelf_so_LDLIBS = $(libelf_so_DEPS) -lz -lzstd
>>>>
>>>> should be conditional based on HAVE_ZSTD. But I don't know how to do that?
>>>
>>> I suppose you're talking about libzstd_LIBS.
>>
>> Hm, can't see it after autoreconf -fi and ./configure.
>
> That's because you do
> PKG_CHECK_MODULES(ZSTD, [libzstd], ...)
> and this defines ZSTD_CFLAGS and ZSTD_LIBS instead of libzstd_CFLAGS
> and libzstd_LIBS because PKG_CHECK_MODULES() uses its first argument
> as the prefix for these variables.
>
>
Thank you. Apparently, I collided with the existing:
eu_ZIPLIB(zstd,ZSTD,zstd,ZSTD_decompress,[ZSTD (zst)])
Anyway, I'm sending V2 that works fine --with-zstd and --without-zstd as expected.
Ready for master?
Thanks,
Martin
Hi Martin,
On Mon, Oct 24, 2022 at 08:16:09PM +0200, Martin Liška wrote:
> Anyway, I'm sending V2 that works fine --with-zstd and --without-zstd as expected.
>
> Ready for master?
The decompression part looks good. Few small nitpicks below.
Although I like to also have compression working. Then we can also
add support to src/elfcompress, which makes for a good testcase. See
tests/run-compress.sh (which has found some subtle memory issues when
run under valgrind).
> From 4aea412783b9b0dcaf0f887947bf2e8ee6c5368b Mon Sep 17 00:00:00 2001
> From: Martin Liska <mliska@suse.cz>
> Date: Mon, 24 Oct 2022 11:53:13 +0200
> Subject: [PATCH] readelf: partial support of ZSTD compression
>
> Support decompression of ZSTD sections and add support
> for it when -SWz is used:
>
> ...
> [30] .debug_abbrev PROGBITS 0000000000000000 00001f9d 00000168 0 C 0 0 1
> [ELF ZSTD (2) 000002fc 1]
> ...
And for this it would be nice to have a tests/run-readelf-z.sh testcase.
> ChangeLog:
>
> * configure.ac: Add zstd_LIBS.
>
> libelf/ChangeLog:
>
> * Makefile.am: Use zstd_LIBS.
> * elf.h (ELFCOMPRESS_ZSTD): Add new value.
> * elf_compress.c (__libelf_decompress): Dispatch based
> on the compression algorithm.
> (__libelf_decompress_zlib): New.
> (__libelf_decompress_zstd): New.
> (__libelf_decompress_elf): Pass type of compression to
> __libelf_decompress.
> * elf_compress_gnu.c (elf_compress_gnu): Use ELFCOMPRESS_ZLIB
> as .z* sections can be only compressed with ZLIB.
> * libelfP.h (__libelf_decompress): Change signature.
>
> src/ChangeLog:
>
> * readelf.c (elf_ch_type_name): Use switch and support ZSTD.
> diff --git a/libelf/elf.h b/libelf/elf.h
> index 02a1b3f5..f0f0ec7d 100644
> --- a/libelf/elf.h
> +++ b/libelf/elf.h
We normally sync elf.h from glibc. I'll do that in a second.
> +#ifdef USE_ZSTD
> +void *
> +internal_function
> +__libelf_decompress_zstd (void *buf_in, size_t size_in, size_t size_out)
> +{
> + /* Malloc might return NULL when requestion zero size. This is highly
requesting
> + unlikely, it would only happen when the compression was forced.
> + But we do need a non-NULL buffer to return and set as result.
> + Just make sure to always allocate at least 1 byte. */
> +void *
> +internal_function
> +__libelf_decompress (int chtype, void *buf_in, size_t size_in, size_t size_out)
> +{
> + if (chtype == ELFCOMPRESS_ZLIB)
> + return __libelf_decompress_zlib (buf_in, size_in, size_out);
> + else
> + {
> +#ifdef USE_ZSTD
> + return __libelf_decompress_zstd (buf_in, size_in, size_out);
> +#else
> + return 0;
> +#endif
return NULL for consistency?
> + }
> +}
> +
> void *
> internal_function
> __libelf_decompress_elf (Elf_Scn *scn, size_t *size_out, size_t *addralign)
> @@ -268,7 +316,7 @@ __libelf_decompress_elf (Elf_Scn *scn, size_t *size_out, size_t *addralign)
> if (gelf_getchdr (scn, &chdr) == NULL)
> return NULL;
>
> - if (chdr.ch_type != ELFCOMPRESS_ZLIB)
> + if (chdr.ch_type != ELFCOMPRESS_ZLIB && chdr.ch_type != ELFCOMPRESS_ZSTD)
> {
What about the ifndef USE_ZSTD case? Should this then not recognize
ELFCOMPRESS_ZSTD?
Thanks,
Mark
On 10/29/22 00:21, Mark Wielaard wrote:
> Although I like to also have compression working. Then we can also
> add support to src/elfcompress, which makes for a good testcase. See
> tests/run-compress.sh (which has found some subtle memory issues when
> run under valgrind).
Hi.
All right, so I'm preparing a full support for ZSTD (both compression and compression)
and I noticed a refactoring would be handy for compress_section function and callers
of the function.
Note right now, there are basically 3 compression types and process_file
function handles basically all combinations of these (3 x 3 options), while adding ZSTD
support would make it even more complicated. However, ZSTD will behave very similar to ZLIB
(not zlib-gnu), except a different algorithm will be used. Plus, in order to distinguish
ZLIB from ZSTD, we need to read GElf_Chdr.
So what do you think about the refactoring as the first step?
Cheers,
Martin
Hi Martin,
On Mon, Nov 28, 2022 at 02:16:35PM +0100, Martin Liška wrote:
> On 10/29/22 00:21, Mark Wielaard wrote:
> > Although I like to also have compression working. Then we can also
> > add support to src/elfcompress, which makes for a good testcase. See
> > tests/run-compress.sh (which has found some subtle memory issues when
> > run under valgrind).
>
> All right, so I'm preparing a full support for ZSTD (both compression and compression)
> and I noticed a refactoring would be handy for compress_section function and callers
> of the function.
>
> Note right now, there are basically 3 compression types and process_file
> function handles basically all combinations of these (3 x 3 options), while adding ZSTD
> support would make it even more complicated. However, ZSTD will behave very similar to ZLIB
> (not zlib-gnu), except a different algorithm will be used. Plus, in order to distinguish
> ZLIB from ZSTD, we need to read GElf_Chdr.
>
> So what do you think about the refactoring as the first step?
Looks good. I would just add a sanity check that chdr.ch_type is of a
known compression type (it is only checked to be not NONE atm).
Cheers,
Mark
On 11/28/22 23:29, Mark Wielaard wrote:
> Hi Martin,
>
> On Mon, Nov 28, 2022 at 02:16:35PM +0100, Martin Liška wrote:
>> On 10/29/22 00:21, Mark Wielaard wrote:
>>> Although I like to also have compression working. Then we can also
>>> add support to src/elfcompress, which makes for a good testcase. See
>>> tests/run-compress.sh (which has found some subtle memory issues when
>>> run under valgrind).
>>
>> All right, so I'm preparing a full support for ZSTD (both compression and compression)
>> and I noticed a refactoring would be handy for compress_section function and callers
>> of the function.
>>
>> Note right now, there are basically 3 compression types and process_file
>> function handles basically all combinations of these (3 x 3 options), while adding ZSTD
>> support would make it even more complicated. However, ZSTD will behave very similar to ZLIB
>> (not zlib-gnu), except a different algorithm will be used. Plus, in order to distinguish
>> ZLIB from ZSTD, we need to read GElf_Chdr.
>>
>> So what do you think about the refactoring as the first step?
>
> Looks good. I would just add a sanity check that chdr.ch_type is of a
> known compression type (it is only checked to be not NONE atm).
Thanks, fixes that and pushed as 6bb3e0b5. Now I can finish working on the ZSTD support.
Cheers,
Martin
>
> Cheers,
>
> Mark
>
>
@@ -106,7 +106,7 @@ libelf_pic_a_SOURCES =
am_libelf_pic_a_OBJECTS = $(libelf_a_SOURCES:.c=.os)
libelf_so_DEPS = ../lib/libeu.a
-libelf_so_LDLIBS = $(libelf_so_DEPS) -lz
+libelf_so_LDLIBS = $(libelf_so_DEPS) -lz -lzstd
if USE_LOCKS
libelf_so_LDLIBS += -lpthread
endif
@@ -506,6 +506,9 @@ typedef struct
/* Legal values for ch_type (compression algorithm). */
#define ELFCOMPRESS_ZLIB 1 /* ZLIB/DEFLATE algorithm. */
+#define ELFCOMPRESS_ZSTD 2 /* Compressed with zstd */
+ /* (see http://www.zstandard.org). */
+
#define ELFCOMPRESS_LOOS 0x60000000 /* Start of OS-specific. */
#define ELFCOMPRESS_HIOS 0x6fffffff /* End of OS-specific. */
#define ELFCOMPRESS_LOPROC 0x70000000 /* Start of processor-specific. */
@@ -39,6 +39,10 @@
#include <string.h>
#include <zlib.h>
+#ifdef USE_ZSTD
+#include <zstd.h>
+#endif
+
/* Cleanup and return result. Don't leak memory. */
static void *
do_deflate_cleanup (void *result, z_stream *z, void *out_buf,
@@ -207,7 +211,7 @@ __libelf_compress (Elf_Scn *scn, size_t hsize, int ei_data,
void *
internal_function
-__libelf_decompress (void *buf_in, size_t size_in, size_t size_out)
+__libelf_decompress_zlib (void *buf_in, size_t size_in, size_t size_out)
{
/* Catch highly unlikely compression ratios so we don't allocate
some giant amount of memory for nothing. The max compression
@@ -260,6 +264,50 @@ __libelf_decompress (void *buf_in, size_t size_in, size_t size_out)
return buf_out;
}
+#ifdef USE_ZSTD
+void *
+internal_function
+__libelf_decompress_zstd (void *buf_in, size_t size_in, size_t size_out)
+{
+ /* Malloc might return NULL when requestion zero size. This is highly
+ unlikely, it would only happen when the compression was forced.
+ But we do need a non-NULL buffer to return and set as result.
+ Just make sure to always allocate at least 1 byte. */
+ void *buf_out = malloc (size_out ?: 1);
+ if (unlikely (buf_out == NULL))
+ {
+ __libelf_seterrno (ELF_E_NOMEM);
+ return NULL;
+ }
+
+ size_t ret = ZSTD_decompress (buf_out, size_out, buf_in, size_in);
+ if (ZSTD_isError (ret))
+ {
+ free (buf_out);
+ __libelf_seterrno (ELF_E_DECOMPRESS_ERROR);
+ return NULL;
+ }
+ else
+ return buf_out;
+}
+#endif
+
+void *
+internal_function
+__libelf_decompress (int chtype, void *buf_in, size_t size_in, size_t size_out)
+{
+ if (chtype == ELFCOMPRESS_ZLIB)
+ return __libelf_decompress_zlib (buf_in, size_in, size_out);
+ else
+ {
+#ifdef USE_ZSTD
+ return __libelf_decompress_zstd (buf_in, size_in, size_out);
+#else
+ return 0;
+#endif
+ }
+}
+
void *
internal_function
__libelf_decompress_elf (Elf_Scn *scn, size_t *size_out, size_t *addralign)
@@ -268,7 +316,7 @@ __libelf_decompress_elf (Elf_Scn *scn, size_t *size_out, size_t *addralign)
if (gelf_getchdr (scn, &chdr) == NULL)
return NULL;
- if (chdr.ch_type != ELFCOMPRESS_ZLIB)
+ if (chdr.ch_type != ELFCOMPRESS_ZLIB && chdr.ch_type != ELFCOMPRESS_ZSTD)
{
__libelf_seterrno (ELF_E_UNKNOWN_COMPRESSION_TYPE);
return NULL;
@@ -295,7 +343,9 @@ __libelf_decompress_elf (Elf_Scn *scn, size_t *size_out, size_t *addralign)
? sizeof (Elf32_Chdr) : sizeof (Elf64_Chdr));
size_t size_in = data->d_size - hsize;
void *buf_in = data->d_buf + hsize;
- void *buf_out = __libelf_decompress (buf_in, size_in, chdr.ch_size);
+ void *buf_out
+ = __libelf_decompress (chdr.ch_type, buf_in, size_in, chdr.ch_size);
+
*size_out = chdr.ch_size;
*addralign = chdr.ch_addralign;
return buf_out;
@@ -178,7 +178,7 @@ elf_compress_gnu (Elf_Scn *scn, int inflate, unsigned int flags)
size_t size = gsize;
size_t size_in = data->d_size - hsize;
void *buf_in = data->d_buf + hsize;
- void *buf_out = __libelf_decompress (buf_in, size_in, size);
+ void *buf_out = __libelf_decompress (ELFCOMPRESS_ZLIB, buf_in, size_in, size);
if (buf_out == NULL)
return -1;
@@ -577,7 +577,7 @@ extern void * __libelf_compress (Elf_Scn *scn, size_t hsize, int ei_data,
size_t *size, bool force)
internal_function;
-extern void * __libelf_decompress (void *buf_in, size_t size_in,
+extern void * __libelf_decompress (int chtype, void *buf_in, size_t size_in,
size_t size_out) internal_function;
extern void * __libelf_decompress_elf (Elf_Scn *scn,
size_t *size_out, size_t *addralign)
new file mode 100644
@@ -0,0 +1,23 @@
+dnl Copyright (C) 2022 Free Software Foundation, Inc.
+dnl This file is free software, distributed under the terms of the GNU
+dnl General Public License. As a special exception to the GNU General
+dnl Public License, this file may be distributed as part of a program
+dnl that contains a configuration script generated by Autoconf, under
+dnl the same distribution terms as the rest of that program.
+
+dnl Enable features using the zstd library.
+AC_DEFUN([AC_ZSTD], [
+AC_ARG_WITH(zstd,
+ [AS_HELP_STRING([--with-zstd], [support zstd compressed debug sections (default=auto)])],
+ [], [with_zstd=auto])
+
+AS_IF([test "$with_zstd" != no],
+ [PKG_CHECK_MODULES(ZSTD, [libzstd], [
+ AC_DEFINE(HAVE_ZSTD, 1, [Define to 1 if zstd is enabled.])
+ ], [
+ if test "$with_zstd" = yes; then
+ AC_MSG_ERROR([--with-zstd was given, but pkgconfig/libzstd.pc is not found])
+ fi
+ ])
+ ])
+])
@@ -1229,13 +1229,17 @@ get_visibility_type (int value)
static const char *
elf_ch_type_name (unsigned int code)
{
- if (code == 0)
- return "NONE";
-
- if (code == ELFCOMPRESS_ZLIB)
- return "ZLIB";
-
- return "UNKNOWN";
+ switch (code)
+ {
+ case 0:
+ return "NONE";
+ case ELFCOMPRESS_ZLIB:
+ return "ZLIB";
+ case ELFCOMPRESS_ZSTD:
+ return "ZSTD";
+ default:
+ return "UNKNOWN";
+ }
}
/* Print the section headers. */