Use XXH_INLINE_ALL=1 to inline all xxhash functions

Message ID 20240917201627.936841-1-mark@klomp.org
State New
Headers
Series Use XXH_INLINE_ALL=1 to inline all xxhash functions |

Commit Message

Mark Wielaard Sept. 17, 2024, 8:15 p.m. UTC
  XXH_INLINE_ALL: Make all functions inline, implementation is directly
included within xxhash.h. This makes it so that there is no runtime
dependency on libxxhash.so and could result in faster code.

	   * configure: Don't AC_SUBST XXHASH_{VERSION,LIBS,CFLAGS}.
	   Remove XXHASH_LIBS from DEPS_LIBS. Add -DXXH_INLINE_ALL=1
	   to CFLAGS and CXXFLAGS.

Signed-off-by: Mark Wielaard <mark@klomp.org>
---

https://code.wildebeest.org/git/user/mjw/libabigail/commit/?h=xxhash-inline-all

 configure.ac | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
  

Comments

Sam James Sept. 18, 2024, 3:13 a.m. UTC | #1
Mark Wielaard <mark@klomp.org> writes:

> XXH_INLINE_ALL: Make all functions inline, implementation is directly
> included within xxhash.h. This makes it so that there is no runtime
> dependency on libxxhash.so and could result in faster code.

Why is this desirable? I don't think the faster code bit is a given --
we could easily end up with a (much) larger binary, for example.

Having xxhash essentially bundled in all packages by forcing inlining
isn't a win IMO. If it's been measured to provide substantial benefit
for libabigail, then that's different (but please say so!), but I don't
think we should default to doing this whenever xxhash is used.

>
> 	   * configure: Don't AC_SUBST XXHASH_{VERSION,LIBS,CFLAGS}.
> 	   Remove XXHASH_LIBS from DEPS_LIBS. Add -DXXH_INLINE_ALL=1
> 	   to CFLAGS and CXXFLAGS.
>
> Signed-off-by: Mark Wielaard <mark@klomp.org>
> ---
>
> https://code.wildebeest.org/git/user/mjw/libabigail/commit/?h=xxhash-inline-all
>
>  configure.ac | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 07cebf32e46b..6c1a7ce6b85a 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -495,10 +495,6 @@ dnl Check for dependency: xxhash
>  XXHASH_VERSION=0.8.0
>  PKG_CHECK_MODULES(XXHASH, libxxhash >= $XXHASH_VERSION)
>  
> -AC_SUBST(XXHASH_VERSION)
> -AC_SUBST(XXHASH_LIBS)
> -AC_SUBST(XXHASH_CFLAGS)
> -
>  dnl Check for some programs like rm, mkdir, etc ...
>  AC_CHECK_PROG(HAS_RM, rm, yes, no)
>  if test x$HAS_RM = xno; then
> @@ -922,7 +918,7 @@ AX_VALGRIND_CHECK
>  
>  dnl Set the list of libraries libabigail depends on
>  
> -DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $XXHASH_LIBS $CTF_LIBS $BPF_LIBS"
> +DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $CTF_LIBS $BPF_LIBS"
>  AC_SUBST(DEPS_LIBS)
>  
>  if test x$ABIGAIL_DEVEL != x; then
> @@ -960,6 +956,10 @@ if test x$ENABLE_UBSAN = xyes; then
>      CXXFLAGS="$CXXFLAGS -fsanitize=undefined"
>  fi
>  
> +dnl We always want all xxhash functions inlined
> +CFLAGS="$CFLAGS -DXXH_INLINE_ALL=1"
> +CXXFLAGS="$CXXFLAGS -DXXH_INLINE_ALL=1"
> +
>  dnl Set a few Automake conditionals
>  
>  AM_CONDITIONAL([CTF_READER],[test "x$ENABLE_CTF" = "xyes"])
  
Sam James Sept. 18, 2024, 3:16 a.m. UTC | #2
Sam James <sam@gentoo.org> writes:

> Mark Wielaard <mark@klomp.org> writes:
>
>> XXH_INLINE_ALL: Make all functions inline, implementation is directly
>> included within xxhash.h. This makes it so that there is no runtime
>> dependency on libxxhash.so and could result in faster code.
>
> Why is this desirable? I don't think the faster code bit is a given --
> we could easily end up with a (much) larger binary, for example.
>
> Having xxhash essentially bundled in all packages by forcing inlining
> isn't a win IMO. If it's been measured to provide substantial benefit
> for libabigail, then that's different (but please say so!), but I don't
> think we should default to doing this whenever xxhash is used.

Reading https://github.com/Cyan4973/xxHash/issues/902 and
https://github.com/Cyan4973/xxHash/pull/903, it looks like a mess and
it's unclear whether it even guarantees that you *must* use the inline
version or whether it just gives the option (inline vs force inline).

That would mean that dropping the linkage is wrong and we should rely on
as-needed instead, at least.

>
>>
>> 	   * configure: Don't AC_SUBST XXHASH_{VERSION,LIBS,CFLAGS}.
>> 	   Remove XXHASH_LIBS from DEPS_LIBS. Add -DXXH_INLINE_ALL=1
>> 	   to CFLAGS and CXXFLAGS.
>>
>> Signed-off-by: Mark Wielaard <mark@klomp.org>
>> ---
>>
>> https://code.wildebeest.org/git/user/mjw/libabigail/commit/?h=xxhash-inline-all
>>
>>  configure.ac | 10 +++++-----
>>  1 file changed, 5 insertions(+), 5 deletions(-)
>>
>> diff --git a/configure.ac b/configure.ac
>> index 07cebf32e46b..6c1a7ce6b85a 100644
>> --- a/configure.ac
>> +++ b/configure.ac
>> @@ -495,10 +495,6 @@ dnl Check for dependency: xxhash
>>  XXHASH_VERSION=0.8.0
>>  PKG_CHECK_MODULES(XXHASH, libxxhash >= $XXHASH_VERSION)
>>  
>> -AC_SUBST(XXHASH_VERSION)
>> -AC_SUBST(XXHASH_LIBS)
>> -AC_SUBST(XXHASH_CFLAGS)
>> -
>>  dnl Check for some programs like rm, mkdir, etc ...
>>  AC_CHECK_PROG(HAS_RM, rm, yes, no)
>>  if test x$HAS_RM = xno; then
>> @@ -922,7 +918,7 @@ AX_VALGRIND_CHECK
>>  
>>  dnl Set the list of libraries libabigail depends on
>>  
>> -DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $XXHASH_LIBS $CTF_LIBS $BPF_LIBS"
>> +DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $CTF_LIBS $BPF_LIBS"
>>  AC_SUBST(DEPS_LIBS)
>>  
>>  if test x$ABIGAIL_DEVEL != x; then
>> @@ -960,6 +956,10 @@ if test x$ENABLE_UBSAN = xyes; then
>>      CXXFLAGS="$CXXFLAGS -fsanitize=undefined"
>>  fi
>>  
>> +dnl We always want all xxhash functions inlined
>> +CFLAGS="$CFLAGS -DXXH_INLINE_ALL=1"
>> +CXXFLAGS="$CXXFLAGS -DXXH_INLINE_ALL=1"
>> +
>>  dnl Set a few Automake conditionals
>>  
>>  AM_CONDITIONAL([CTF_READER],[test "x$ENABLE_CTF" = "xyes"])
  
Mark Wielaard Sept. 18, 2024, 9:56 a.m. UTC | #3
On Wed, Sep 18, 2024 at 04:13:41AM +0100, Sam James wrote:
> Mark Wielaard <mark@klomp.org> writes:
> 
> > XXH_INLINE_ALL: Make all functions inline, implementation is directly
> > included within xxhash.h. This makes it so that there is no runtime
> > dependency on libxxhash.so and could result in faster code.
> 
> Why is this desirable?

Like the comment says :) Less runtime dependencies and possibly faster
code.

> I don't think the faster code bit is a given --
> we could easily end up with a (much) larger binary, for example.

libabigail.so gets 11KB bigger (on 55MB). But the runtime library is
72K. We are only using three fairly simple function calls for one
algorithm.

> Having xxhash essentially bundled in all packages by forcing inlining
> isn't a win IMO. If it's been measured to provide substantial benefit
> for libabigail, then that's different (but please say so!), but I don't
> think we should default to doing this whenever xxhash is used.

Would you like a configure option instead?

The goal wasn't immediately speed, but getting rid of a runtime
dependency, so I only quickly checked it wasn't getting slower.  I did
only do a quick benchmark with make check. Which resulted in 6m34.733s
(orig) to 6m22.322s (XXH_INLINE) [make -j12 check best out of 3]. So
slightly faster, but really in the noise.

make check probably also isn't really a good benchmark. What would you
run to see if it is beneficial?

If we really want a faster hash we should really look into adopting
XXH3_64bit. But the problem is that libabigail encodes the hash in the
xml output. So one could say adopting a different hash will break
abi... :}

Cheers,

Mark

> >
> > 	   * configure: Don't AC_SUBST XXHASH_{VERSION,LIBS,CFLAGS}.
> > 	   Remove XXHASH_LIBS from DEPS_LIBS. Add -DXXH_INLINE_ALL=1
> > 	   to CFLAGS and CXXFLAGS.
> >
> > Signed-off-by: Mark Wielaard <mark@klomp.org>
> > ---
> >
> > https://code.wildebeest.org/git/user/mjw/libabigail/commit/?h=xxhash-inline-all
  
Mark Wielaard Sept. 18, 2024, 10:01 a.m. UTC | #4
Hi Sam,

On Wed, Sep 18, 2024 at 04:16:06AM +0100, Sam James wrote:
> Reading https://github.com/Cyan4973/xxHash/issues/902 and
> https://github.com/Cyan4973/xxHash/pull/903, it looks like a mess and
> it's unclear whether it even guarantees that you *must* use the inline
> version or whether it just gives the option (inline vs force inline).

I think you are reading too much into one function (which we aren't
even using) having missed the INLINE attribute. The code is still
inlined, and we still get rid of the runtime dependencies. The
discussion is just about whether it is all (recursively) inlined or
not. Which seems like nitpicking imho. Something like using -flto will
have a much bigger impact for example.

Cheers,

Mark
  
Sam James Sept. 18, 2024, 12:56 p.m. UTC | #5
Mark Wielaard <mark@klomp.org> writes:

> On Wed, Sep 18, 2024 at 04:13:41AM +0100, Sam James wrote:
>> Mark Wielaard <mark@klomp.org> writes:
>> 
>> > XXH_INLINE_ALL: Make all functions inline, implementation is directly
>> > included within xxhash.h. This makes it so that there is no runtime
>> > dependency on libxxhash.so and could result in faster code.
>> 
>> Why is this desirable?
>
> Like the comment says :) Less runtime dependencies and possibly faster
> code.

We're talking past each other a bit though, I'm saying that fewer
runtime dependencies isn't necessarily a win. What motivated it?

>
>> I don't think the faster code bit is a given --
>> we could easily end up with a (much) larger binary, for example.
>
> libabigail.so gets 11KB bigger (on 55MB). But the runtime library is
> 72K. We are only using three fairly simple function calls for one
> algorithm.

OK, fair enough.

>
>> Having xxhash essentially bundled in all packages by forcing inlining
>> isn't a win IMO. If it's been measured to provide substantial benefit
>> for libabigail, then that's different (but please say so!), but I don't
>> think we should default to doing this whenever xxhash is used.
>
> Would you like a configure option instead?
>
> The goal wasn't immediately speed, but getting rid of a runtime
> dependency, so I only quickly checked it wasn't getting slower.  I did
> only do a quick benchmark with make check. Which resulted in 6m34.733s
> (orig) to 6m22.322s (XXH_INLINE) [make -j12 check best out of 3]. So
> slightly faster, but really in the noise.
>
> make check probably also isn't really a good benchmark. What would you
> run to see if it is beneficial?
>
> If we really want a faster hash we should really look into adopting
> XXH3_64bit. But the problem is that libabigail encodes the hash in the
> xml output. So one could say adopting a different hash will break
> abi... :}

I think it's okay as it is, it's more that I don't really see the
motivation (the arguments here could apply to every consumer of xxhash),
but it doesn't matter much.

>
> Cheers,
>
> Mark
>
>> >
>> > 	   * configure: Don't AC_SUBST XXHASH_{VERSION,LIBS,CFLAGS}.
>> > 	   Remove XXHASH_LIBS from DEPS_LIBS. Add -DXXH_INLINE_ALL=1
>> > 	   to CFLAGS and CXXFLAGS.
>> >
>> > Signed-off-by: Mark Wielaard <mark@klomp.org>
>> > ---
>> >
>> > https://code.wildebeest.org/git/user/mjw/libabigail/commit/?h=xxhash-inline-all
  
Sam James Sept. 18, 2024, 12:57 p.m. UTC | #6
Mark Wielaard <mark@klomp.org> writes:

> Hi Sam,
>
> On Wed, Sep 18, 2024 at 04:16:06AM +0100, Sam James wrote:
>> Reading https://github.com/Cyan4973/xxHash/issues/902 and
>> https://github.com/Cyan4973/xxHash/pull/903, it looks like a mess and
>> it's unclear whether it even guarantees that you *must* use the inline
>> version or whether it just gives the option (inline vs force inline).
>
> I think you are reading too much into one function (which we aren't
> even using) having missed the INLINE attribute. The code is still
> inlined, and we still get rid of the runtime dependencies. The
> discussion is just about whether it is all (recursively) inlined or
> not. Which seems like nitpicking imho. Something like using -flto will
> have a much bigger impact for example.

What I'm saying is that there appears to be no guarantee of inlining so
you need to keep the -lxxhash bits and you should rely on as-needed
to drop it if it's not needed then.

force inline isn't used, so it entirely depends on if the compiler
chooses to inline it all, so you could end up with it *not* all being
inlined, and then getting a link failure.

>
> Cheers,
>
> Mark
  
Mark Wielaard Sept. 18, 2024, 4:40 p.m. UTC | #7
Hoi Sam,

On Wed, Sep 18, 2024 at 01:56:08PM +0100, Sam James wrote:
> Mark Wielaard <mark@klomp.org> writes:
> > On Wed, Sep 18, 2024 at 04:13:41AM +0100, Sam James wrote:
> >> Mark Wielaard <mark@klomp.org> writes:
> >> 
> >> > XXH_INLINE_ALL: Make all functions inline, implementation is directly
> >> > included within xxhash.h. This makes it so that there is no runtime
> >> > dependency on libxxhash.so and could result in faster code.
> >> 
> >> Why is this desirable?
> >
> > Like the comment says :) Less runtime dependencies and possibly faster
> > code.
> 
> We're talking past each other a bit though, I'm saying that fewer
> runtime dependencies isn't necessarily a win. What motivated it?

Some distros have multiple repositories with different support
arrangements. For example there might be a buildroot repo from which
packages are build. Not all of those packages might (also) be in the
runtime repository. A package that is only available in the buildroot
might have less (or different) support requirements than packages in
the runtime repository which might have "full" support requirements
because they can be used for anything.

> >> Having xxhash essentially bundled in all packages by forcing inlining
> >> isn't a win IMO. If it's been measured to provide substantial benefit
> >> for libabigail, then that's different (but please say so!), but I don't
> >> think we should default to doing this whenever xxhash is used.
> >
> > Would you like a configure option instead?
> >
> > The goal wasn't immediately speed, but getting rid of a runtime
> > dependency, so I only quickly checked it wasn't getting slower.  I did
> > only do a quick benchmark with make check. Which resulted in 6m34.733s
> > (orig) to 6m22.322s (XXH_INLINE) [make -j12 check best out of 3]. So
> > slightly faster, but really in the noise.
> >
> > make check probably also isn't really a good benchmark. What would you
> > run to see if it is beneficial?
> >
> > If we really want a faster hash we should really look into adopting
> > XXH3_64bit. But the problem is that libabigail encodes the hash in the
> > xml output. So one could say adopting a different hash will break
> > abi... :}
> 
> I think it's okay as it is, it's more that I don't really see the
> motivation (the arguments here could apply to every consumer of xxhash),
> but it doesn't matter much.

Yes, I am going over some other consumers of xxhash to see if they can
use it in XXH_INLINE "mode". Most already do, libabigail is the
exception as far as I have seen. But if that doesn't make sense to
others I don't want to force that. It can certainly be a configure
option.

Cheers,

Mark
  
Mark Wielaard Sept. 18, 2024, 4:48 p.m. UTC | #8
Hi Sam,

On Wed, Sep 18, 2024 at 01:57:22PM +0100, Sam James wrote:
> What I'm saying is that there appears to be no guarantee of inlining so
> you need to keep the -lxxhash bits and you should rely on as-needed
> to drop it if it's not needed then.
> 
> force inline isn't used, so it entirely depends on if the compiler
> chooses to inline it all, so you could end up with it *not* all being
> inlined, and then getting a link failure.

I don't think that is true, only if there is a bug in the header
file. Such a bug would manifest itself as the linker being unable link
because of a missing dynamic symbol. But the effect of XXH_INLINE_ALL
(or XXH_PRIVATE_API) is (or should be) that all functions are static
and no dynamic symbols will be used, so no linking to any dynamic
library is necessary

Cheers,

Mark
  
Sam James Sept. 18, 2024, 4:55 p.m. UTC | #9
Mark Wielaard <mark@klomp.org> writes:

> Hi Sam,
>
> On Wed, Sep 18, 2024 at 01:57:22PM +0100, Sam James wrote:
>> What I'm saying is that there appears to be no guarantee of inlining so
>> you need to keep the -lxxhash bits and you should rely on as-needed
>> to drop it if it's not needed then.
>> 
>> force inline isn't used, so it entirely depends on if the compiler
>> chooses to inline it all, so you could end up with it *not* all being
>> inlined, and then getting a link failure.
>
> I don't think that is true, only if there is a bug in the header
> file. Such a bug would manifest itself as the linker being unable link
> because of a missing dynamic symbol. But the effect of XXH_INLINE_ALL
> (or XXH_PRIVATE_API) is (or should be) that all functions are static
> and no dynamic symbols will be used, so no linking to any dynamic
> library is necessary

Yes, that's the case I was worried about. But if they're all marked
correctly, I guess it's indeed fine. We'll see ;)

So no objection (even if I don't love doing this in every xxhash consumer). Thanks!

>
> Cheers,
>
> Mark
  
Dodji Seketeli Sept. 27, 2024, 3:02 p.m. UTC | #10
Mark Wielaard <mark@klomp.org> a écrit:

> XXH_INLINE_ALL: Make all functions inline, implementation is directly
> included within xxhash.h. This makes it so that there is no runtime
> dependency on libxxhash.so and could result in faster code.
>
> 	   * configure: Don't AC_SUBST XXHASH_{VERSION,LIBS,CFLAGS}.
> 	   Remove XXHASH_LIBS from DEPS_LIBS. Add -DXXH_INLINE_ALL=1
> 	   to CFLAGS and CXXFLAGS.
>
> Signed-off-by: Mark Wielaard <mark@klomp.org>

Applied to the mainline, thanks!

[...]

Cheers,
  

Patch

diff --git a/configure.ac b/configure.ac
index 07cebf32e46b..6c1a7ce6b85a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -495,10 +495,6 @@  dnl Check for dependency: xxhash
 XXHASH_VERSION=0.8.0
 PKG_CHECK_MODULES(XXHASH, libxxhash >= $XXHASH_VERSION)
 
-AC_SUBST(XXHASH_VERSION)
-AC_SUBST(XXHASH_LIBS)
-AC_SUBST(XXHASH_CFLAGS)
-
 dnl Check for some programs like rm, mkdir, etc ...
 AC_CHECK_PROG(HAS_RM, rm, yes, no)
 if test x$HAS_RM = xno; then
@@ -922,7 +918,7 @@  AX_VALGRIND_CHECK
 
 dnl Set the list of libraries libabigail depends on
 
-DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $XXHASH_LIBS $CTF_LIBS $BPF_LIBS"
+DEPS_LIBS="$XML_LIBS $ELF_LIBS $DW_LIBS $CTF_LIBS $BPF_LIBS"
 AC_SUBST(DEPS_LIBS)
 
 if test x$ABIGAIL_DEVEL != x; then
@@ -960,6 +956,10 @@  if test x$ENABLE_UBSAN = xyes; then
     CXXFLAGS="$CXXFLAGS -fsanitize=undefined"
 fi
 
+dnl We always want all xxhash functions inlined
+CFLAGS="$CFLAGS -DXXH_INLINE_ALL=1"
+CXXFLAGS="$CXXFLAGS -DXXH_INLINE_ALL=1"
+
 dnl Set a few Automake conditionals
 
 AM_CONDITIONAL([CTF_READER],[test "x$ENABLE_CTF" = "xyes"])