elf: Add the soname to the libname_list eagerly on loading a library

Message ID 20230428062634.2152615-1-maskray@google.com
State Changes Requested
Headers
Series elf: Add the soname to the libname_list eagerly on loading a library |

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Fangrui Song April 28, 2023, 6:26 a.m. UTC
  Original author is Ambrose Feinstein while working at Google.

_dl_map_object iterates over loaded objects and calls _dl_name_match_p.
If l->l_soname_added is 0, we incur two costs.

First, loading l->l_info[DT_SONAME]->d_un.d_val has TLB pressure as
every library's string table is in a different page.  The cost will be
avoided if the string is on the heap.

Second, add_name_to_object repeats the l_libname comparison already done
by the _dl_name_match_p call.

To remove these costs, we eagerly add the SONAME to the libname_list.
l_soname_added is typically 1, so laziness doesn't provide savings.
---
 elf/dl-load.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
  

Comments

Florian Weimer April 28, 2023, 8:42 a.m. UTC | #1
* Fangrui Song via Libc-alpha:

> Original author is Ambrose Feinstein while working at Google.
>
> _dl_map_object iterates over loaded objects and calls _dl_name_match_p.
> If l->l_soname_added is 0, we incur two costs.
>
> First, loading l->l_info[DT_SONAME]->d_un.d_val has TLB pressure as
> every library's string table is in a different page.  The cost will be
> avoided if the string is on the heap.
>
> Second, add_name_to_object repeats the l_libname comparison already done
> by the _dl_name_match_p call.
>
> To remove these costs, we eagerly add the SONAME to the libname_list.
> l_soname_added is typically 1, so laziness doesn't provide savings.
> ---
>  elf/dl-load.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 9a0e40c0e9..1b17410ce0 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -1451,10 +1451,11 @@ cannot enable executable stack as shared object requires");
>  
>    /* When we profile the SONAME might be needed for something else but
>       loading.  Add it right away.  */
> -  if (__glibc_unlikely (GLRO(dl_profile) != NULL)
> -      && l->l_info[DT_SONAME] != NULL)
> +  if (l->l_info[DT_SONAME] != NULL) {
>      add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
>  			    + l->l_info[DT_SONAME]->d_un.d_val));
> +    l->l_soname_added = 1;
> +  }
>  
>    /* If we have newly loaded libc.so, update the namespace
>       description.  */

The comment is now outdated.

Not sure if this kind of micro-optimization makes sense.  Maybe until we
add a hash table here …

Thanks,
Florian
  
Fangrui Song May 2, 2023, 11:08 p.m. UTC | #2
On Fri, Apr 28, 2023 at 1:42 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Fangrui Song via Libc-alpha:
>
> > Original author is Ambrose Feinstein while working at Google.
> >
> > _dl_map_object iterates over loaded objects and calls _dl_name_match_p.
> > If l->l_soname_added is 0, we incur two costs.
> >
> > First, loading l->l_info[DT_SONAME]->d_un.d_val has TLB pressure as
> > every library's string table is in a different page.  The cost will be
> > avoided if the string is on the heap.
> >
> > Second, add_name_to_object repeats the l_libname comparison already done
> > by the _dl_name_match_p call.
> >
> > To remove these costs, we eagerly add the SONAME to the libname_list.
> > l_soname_added is typically 1, so laziness doesn't provide savings.
> > ---
> >  elf/dl-load.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/elf/dl-load.c b/elf/dl-load.c
> > index 9a0e40c0e9..1b17410ce0 100644
> > --- a/elf/dl-load.c
> > +++ b/elf/dl-load.c
> > @@ -1451,10 +1451,11 @@ cannot enable executable stack as shared object requires");
> >
> >    /* When we profile the SONAME might be needed for something else but
> >       loading.  Add it right away.  */
> > -  if (__glibc_unlikely (GLRO(dl_profile) != NULL)
> > -      && l->l_info[DT_SONAME] != NULL)
> > +  if (l->l_info[DT_SONAME] != NULL) {
> >      add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
> >                           + l->l_info[DT_SONAME]->d_un.d_val));
> > +    l->l_soname_added = 1;
> > +  }
> >
> >    /* If we have newly loaded libc.so, update the namespace
> >       description.  */
>
> The comment is now outdated.
>
> Not sure if this kind of micro-optimization makes sense.  Maybe until we
> add a hash table here …
>
> Thanks,
> Florian
>

Perhaps just remove the comment?

I think there is some merit replacing a tricky condition
__glibc_unlikely (GLRO(dl_profile) != NULL  to a simple one `
l->l_soname_added = 1;`.

This optimization matters for an executable with O(1000) shared object
dependencies.
The quadratic time DSO comparison multiplied by the strcmp dominates
the load time. This patch removes one bottleneck.
  
Florian Weimer July 5, 2023, 6:17 a.m. UTC | #3
* Florian Weimer:

> * Fangrui Song via Libc-alpha:
>
>> Original author is Ambrose Feinstein while working at Google.
>>
>> _dl_map_object iterates over loaded objects and calls _dl_name_match_p.
>> If l->l_soname_added is 0, we incur two costs.
>>
>> First, loading l->l_info[DT_SONAME]->d_un.d_val has TLB pressure as
>> every library's string table is in a different page.  The cost will be
>> avoided if the string is on the heap.
>>
>> Second, add_name_to_object repeats the l_libname comparison already done
>> by the _dl_name_match_p call.
>>
>> To remove these costs, we eagerly add the SONAME to the libname_list.
>> l_soname_added is typically 1, so laziness doesn't provide savings.
>> ---
>>  elf/dl-load.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/elf/dl-load.c b/elf/dl-load.c
>> index 9a0e40c0e9..1b17410ce0 100644
>> --- a/elf/dl-load.c
>> +++ b/elf/dl-load.c
>> @@ -1451,10 +1451,11 @@ cannot enable executable stack as shared object requires");
>>  
>>    /* When we profile the SONAME might be needed for something else but
>>       loading.  Add it right away.  */
>> -  if (__glibc_unlikely (GLRO(dl_profile) != NULL)
>> -      && l->l_info[DT_SONAME] != NULL)
>> +  if (l->l_info[DT_SONAME] != NULL) {
>>      add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
>>  			    + l->l_info[DT_SONAME]->d_un.d_val));
>> +    l->l_soname_added = 1;
>> +  }
>>  
>>    /* If we have newly loaded libc.so, update the namespace
>>       description.  */
>
> The comment is now outdated.
>
> Not sure if this kind of micro-optimization makes sense.  Maybe until we
> add a hash table here …

The hash table is implemented here:

  [PATCH 32/33] elf: Add hash tables to speed up DT_NEEDED, dlopen lookups
  <https://sourceware.org/pipermail/libc-alpha/2023-July/149669.html>

Thanks,
Florian
  
Fangrui Song July 5, 2023, 6:22 p.m. UTC | #4
On Tue, Jul 4, 2023 at 11:17 PM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Florian Weimer:
>
> > * Fangrui Song via Libc-alpha:
> >
> >> Original author is Ambrose Feinstein while working at Google.
> >>
> >> _dl_map_object iterates over loaded objects and calls _dl_name_match_p.
> >> If l->l_soname_added is 0, we incur two costs.
> >>
> >> First, loading l->l_info[DT_SONAME]->d_un.d_val has TLB pressure as
> >> every library's string table is in a different page.  The cost will be
> >> avoided if the string is on the heap.
> >>
> >> Second, add_name_to_object repeats the l_libname comparison already done
> >> by the _dl_name_match_p call.
> >>
> >> To remove these costs, we eagerly add the SONAME to the libname_list.
> >> l_soname_added is typically 1, so laziness doesn't provide savings.
> >> ---
> >>  elf/dl-load.c | 5 +++--
> >>  1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/elf/dl-load.c b/elf/dl-load.c
> >> index 9a0e40c0e9..1b17410ce0 100644
> >> --- a/elf/dl-load.c
> >> +++ b/elf/dl-load.c
> >> @@ -1451,10 +1451,11 @@ cannot enable executable stack as shared object requires");
> >>
> >>    /* When we profile the SONAME might be needed for something else but
> >>       loading.  Add it right away.  */
> >> -  if (__glibc_unlikely (GLRO(dl_profile) != NULL)
> >> -      && l->l_info[DT_SONAME] != NULL)
> >> +  if (l->l_info[DT_SONAME] != NULL) {
> >>      add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
> >>                          + l->l_info[DT_SONAME]->d_un.d_val));
> >> +    l->l_soname_added = 1;
> >> +  }
> >>
> >>    /* If we have newly loaded libc.so, update the namespace
> >>       description.  */
> >
> > The comment is now outdated.
> >
> > Not sure if this kind of micro-optimization makes sense.  Maybe until we
> > add a hash table here …
>
> The hash table is implemented here:
>
>   [PATCH 32/33] elf: Add hash tables to speed up DT_NEEDED, dlopen lookups
>   <https://sourceware.org/pipermail/libc-alpha/2023-July/149669.html>
>
> Thanks,
> Florian
>

Thank you for improving the dynamic loader performance and obsoleting
this patch! :)
  

Patch

diff --git a/elf/dl-load.c b/elf/dl-load.c
index 9a0e40c0e9..1b17410ce0 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -1451,10 +1451,11 @@  cannot enable executable stack as shared object requires");
 
   /* When we profile the SONAME might be needed for something else but
      loading.  Add it right away.  */
-  if (__glibc_unlikely (GLRO(dl_profile) != NULL)
-      && l->l_info[DT_SONAME] != NULL)
+  if (l->l_info[DT_SONAME] != NULL) {
     add_name_to_object (l, ((const char *) D_PTR (l, l_info[DT_STRTAB])
 			    + l->l_info[DT_SONAME]->d_un.d_val));
+    l->l_soname_added = 1;
+  }
 
   /* If we have newly loaded libc.so, update the namespace
      description.  */