[RFC] elf: Implement filtering of symbols historically defined in libpthread

Message ID 87h7jqguew.fsf@oldenburg.str.redhat.com
State Superseded
Series [RFC] elf: Implement filtering of symbols historically defined in libpthread

Commit Message

Florian Weimer April 28, 2021, 6:01 p.m. UTC
Definitions of these symbols in libc expose bugs in existing binaries
linked against earlier glibc versions.  Therefore, hide these symbols
for old binaries.

The symbol list in sysdeps/nptl/dl-pthread-weak.c contains some
symbols which have not been moved yet, but that is harmless because
the function is only invoked if the symbol is found in libc.so.

The test suite passes on i686-linux-gnu, powerpc64-linux-gnu,
x86_64-linux-gnu with these changes.

Is this the direction we want to go in?  Then I'm going to add test
cases, probably using assembler.

Personally I think it's not *too* bad, also not particularly nice
either.  elf/dl-pthread-weak.os brings in 2-3 KiB of code (but few
run-time relocations).  One possibility I have not mentioned in the
comment is to put the moved symbols into a GLIBCPTHREAD_2.34 symbol
version and use the presence of this version on the chain as an
indicator that the symbol uses special treatment.  This eliminates the
separate string table.  The downside is that we cannot easily add more
symbols if we discover some are missing.  This happened to me during
development with pthread_mutexattr_gettype, which is a GLIBC_2.1 symbol
and therefore not actually suitable for detecting the presence of
libpthread (historically speaking).  And it could happen again with
thrd_exit (which is of course much younger).

Thanks,
Florian

---
 elf/Makefile                      |   2 +-
 elf/dl-lookup.c                   |  13 +++-
 elf/dl-open.c                     |   3 +
 elf/dl-pthread-weak.c             |  20 +++++
 elf/dl-version.c                  |   2 +
 sysdeps/generic/dl-pthread-weak.h |  67 +++++++++++++++++
 sysdeps/nptl/dl-pthread-weak.c    | 153 ++++++++++++++++++++++++++++++++++++++
 sysdeps/nptl/dl-pthread-weak.h    | 107 ++++++++++++++++++++++++++
 8 files changed, 363 insertions(+), 4 deletions(-)

Comments

Florian Weimer May 4, 2021, 6:55 a.m. UTC | #1
* Florian Weimer via Libc-alpha:

> Definitions of these symbols in libc expose bugs in existing binaries
> linked against earlier glibc versions.  Therefore, hide these symbols
> for old binaries.
>
> The symbol list in sysdeps/nptl/dl-pthread-weak.c contains some
> symbols which have not been moved yet, but that is harmless because
> the function is only invoked if the symbol is found in libc.so.
>
> The test suite passes on i686-gnu-linux, powerpc64-linux-gnu,
> x86_64-linux-gnu with these changes.
>
> Is this the direction we want to go in?  Then I'm going to add test
> cases, probably using assembler.
>
> Personally I think it's not *too* bad, also not particularly nice
> either.  elf/dl-pthread-weak.os brings in 2-3 KiB of code (but few
> run-time relocations).  One possibility I have not mentioned in the
> comment is to put the moved symbols into a GLIBCPTHREAD_2.34 symbol
> version and use the presence of this version on the chain as an
> indicator that the symbol uses special treatment.  This eliminates the
> separate string table.  The downside is that we cannot easily add more
> symbols if we discover some are missing.  This happened to me during
> development with pthread_mutexattr_gettype, which is a GLIBC_2.1 symbol
> and therefore not actually suitable for detecting the presence of
> libpthread (historically speaking).  And it could happen again with
> thrd_exit (which is of course much younger).

I'd appreciate it if someone could have a look at this patch and tell me if
the approach is reasonable, all things considered.

  <https://sourceware.org/pipermail/libc-alpha/2021-April/125564.html>

Thanks,
Florian

H.J. Lu May 4, 2021, 11:47 a.m. UTC | #2
On Wed, Apr 28, 2021 at 1:37 PM Florian Weimer via Libc-alpha
<libc-alpha@sourceware.org> wrote:
>
> Definitions of these symbols in libc expose bugs in existing binaries
> linked against earlier glibc versions.  Therefore, hide these symbols
> for old binaries.
>
> The symbol list in sysdeps/nptl/dl-pthread-weak.c contains some
> symbols which have not been moved yet, but that is harmless because
> the function is only invoked if the symbol is found in libc.so.
>
> The test suite passes on i686-gnu-linux, powerpc64-linux-gnu,
> x86_64-linux-gnu with these changes.
>
> Is this the direction we want to go in?  Then I'm going to add test
> cases, probably using assembler.
>
> Personally I think it's not *too* bad, also not particularly nice
> either.  elf/dl-pthread-weak.os brings in 2-3 KiB of code (but few
> run-time relocations).  One possibility I have not mentioned in the
> comment is to put the moved symbols into a GLIBCPTHREAD_2.34 symbol
> version and use the presence of this version on the chain as an
> indicator that the symbol uses special treatment.  This eliminates the
> separate string table.  The downside is that we cannot easily add more
> symbols if we discover some are missing.  This happened to me during
> development with pthread_mutexattr_gettype, which is a GLIBC_2.1 symbol
> and therefore not actually suitable for detecting the presence of
> libpthread (historically speaking).  And it could happen again with
> thrd_exit (which is of course much younger).
>

Since we want to detect the binaries which were linked against glibc
older than 2.34, we should use the glibc version to check the old
binaries.  Of course, we should make a complete list of functions
which are really implemented in libpthread.so and will be moved to
libc.so in glibc 2.34.

H.J.

Florian Weimer May 4, 2021, 12:04 p.m. UTC | #3
* H. J. Lu:

> Since we want to detect the binaries which were linked against glibc
> older than 2.34, we should use the glibc version to check the old
> binaries.

The patch attempts to detect old main programs by looking for the
GLIBC_2.34 symbol version.  Since we added __libc_start_main@GLIBC_2.34
(which is called from our version of _start), all standard main programs
linked with glibc 2.34 or later will have this symbol version.
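
For illustration only, the detection idea described above can be sketched
in C as follows.  This is not the code from the patch: the function name
main_program_looks_old and the flat array of version names are
assumptions made for the example, whereas the dynamic loader works on
the ELF version-need records directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of glibc.  NEEDED_VERSIONS stands in
   for the symbol version names the main program requires from libc.
   Returns true if GLIBC_2.34 is absent, which is the heuristic for an
   "old" binary discussed in this thread.  */
bool
main_program_looks_old (const char *const *needed_versions, size_t count)
{
  for (size_t i = 0; i < count; ++i)
    /* Exact match only: "GLIBC_2.3.4" must not count as "GLIBC_2.34".  */
    if (strcmp (needed_versions[i], "GLIBC_2.34") == 0)
      return false;
  return true;
}
```

A program linked against glibc 2.34 or later references
__libc_start_main at GLIBC_2.34, so its version list would contain that
name and the helper would return false.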

> Of course, we should make a complete list of functions which are
> really implemented in libpthread.so and will be moved to libc.so in
> glibc 2.34.

I'm not sure if it is necessary to do this for *all* symbols.  Formally,
a symbol added after glibc 2.0 is not suitable for detecting the
potential multi-threaded nature of a process because the application
could be built for an earlier glibc version, where the symbol is not
available, but pthread_create can still create new threads.  The use of
pthread_mutexattr_gettype@@GLIBC_2.1 as the detection symbol does not
follow the rule, of course.  With the explicit table approach (as in the
patch), we can add additional lookup symbols later, so I think we can
start out with the GLIBC_2.0 and GLIBC_2.1 symbols plus thrd_exit (the
latter is also used by gnulib for some reason).
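
A minimal sketch of what an explicit table lookup could look like, under
stated assumptions: the three names below are only the symbols mentioned
in this thread, not the list from sysdeps/nptl/dl-pthread-weak.c, and
is_legacy_pthread_symbol and compare_names are hypothetical names chosen
for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative subset only.  The array must stay sorted in strcmp
   order because bsearch is used for the lookup.  */
static const char *const legacy_pthread_symbols[] =
  {
    "pthread_mutexattr_gettype",
    "pthread_once",
    "thrd_exit",
  };

/* bsearch callback: KEY is the name being looked up, ELEMENT points to
   an entry of the table.  */
static int
compare_names (const void *key, const void *element)
{
  return strcmp (key, *(const char *const *) element);
}

/* Hypothetical helper: returns true if NAME is in the table.  */
bool
is_legacy_pthread_symbol (const char *name)
{
  size_t count = (sizeof (legacy_pthread_symbols)
                  / sizeof (legacy_pthread_symbols[0]));
  return bsearch (name, legacy_pthread_symbols, count,
                  sizeof (legacy_pthread_symbols[0]),
                  compare_names) != NULL;
}
```

Adding a symbol that turns out to be missing then only requires a new
table entry in sorted position, which is the flexibility argued for
above.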

Thanks,
Florian

H.J. Lu May 4, 2021, 12:24 p.m. UTC | #4
On Tue, May 4, 2021 at 5:04 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> > Since we want to detect the binaries which were linked against glibc
> > older than 2.34, we should use the glibc version to check the old
> > binaries.
>
> The patch attempts to detect old main programs by looking for the
> GLIBC_2.34 symbol version.  Since we added __libc_start_main@GLIBC_2.34
> (which is called from our version of _start), all standard main programs
> linked with glibc 2.34 or later will have this symbol version.
>
> > Of course, we should make a complete list of functions which are
> > really implemented in libpthread.so and will be moved to libc.so in
> > glibc 2.34.
>
> I'm not sure if it is necessary to do this for *all* symbols.  Formally,
> a symbol added after glibc 2.0 is not suitable for detecting the
> potential multi-threaded nature of a process because the application
> could be built for an earlier glibc version, where the symbol is not
> available, but pthread_create can still create new threads.  The use of
> pthread_mutexattr_gettype@@GLIBC_2.1 as the detection symbol does not
> follow the rule, of course.  With the explicit table approach (as in the
> patch), we can add additional lookup symbols later, so I think we can
> start out with the GLIBC_2.0 and GLIBC_2.1 symbols plus thrd_exit (the
> latter is also used by gnulib for some reason).
>

Can we invent a symbol or version to detect the older binaries?
If not, can GNU property, ABI note, .... help here?

Florian Weimer May 4, 2021, 12:27 p.m. UTC | #5
* H. J. Lu:

>> The patch attempts to detect old main programs by looking for the
>> GLIBC_2.34 symbol version.  Since we added __libc_start_main@@GLIBC_2.34
>> (which is called from our version of _start), all standard main programs
>> linked with glibc 2.34 or later will have this symbol version.

> Can we invent a symbol or version to detect the older binaries?
> If not, can GNU property, ABI note, .... help here?

I think we have all we need due to __libc_start_main@@GLIBC_2.34.  It
was an unrelated change, but it helps here as well.

Thanks,
Florian

H.J. Lu May 4, 2021, 12:30 p.m. UTC | #6
On Tue, May 4, 2021 at 5:27 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> >> The patch attempts to detect old main programs by looking for the
> >> GLIBC_2.34 symbol version.  Since we added __libc_start_main@@GLIBC_2.34
> >> (which is called from our version of _start), all standard main programs
> >> linked with glibc 2.34 or later will have this symbol version.
>
> > Can we invent a symbol or version to detect the older binaries?
> > If not, can GNU property, ABI note, .... help here?
>
> I think we have all we need due to __libc_start_main@@GLIBC_2.34.  It
> was an unrelated change, but it helps here as well.

So your patch isn't required?  Can you add some tests to verify it?

Florian Weimer May 4, 2021, 12:33 p.m. UTC | #7
* H. J. Lu:

> On Tue, May 4, 2021 at 5:27 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * H. J. Lu:
>>
>> >> The patch attempts to detect old main programs by looking for the
>> >> GLIBC_2.34 symbol version.  Since we added __libc_start_main@@GLIBC_2.34
>> >> (which is called from our version of _start), all standard main programs
>> >> linked with glibc 2.34 or later will have this symbol version.
>>
>> > Can we invent a symbol or version to detect the older binaries?
>> > If not, can GNU property, ABI note, .... help here?
>>
>> I think we have all we need due to __libc_start_main@@GLIBC_2.34.  It
>> was an unrelated change, but it helps here as well.
>
> So your patch isn't required?  Can you add some tests to verify it?

No, the patch uses the absence of GLIBC_2.34 for detecting old binaries.

We cannot use symbol versions in other ways because one key
characteristic of weak references and underlinking is the lack of
version information on the symbol itself.
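
To make the weak-reference pattern concrete, here is a minimal C sketch
of the kind of application code under discussion.  It is an assumption
for illustration, not code from any particular binary: the helper names
threading_detected and initialize are hypothetical, and the extra check
on pthread_once is added only so that the sketch is safe to run
everywhere.

```c
#include <pthread.h>
#include <stddef.h>

/* Weak, unversioned references: if no loaded object defines these
   symbols, their addresses are null at run time.  */
extern int pthread_mutexattr_gettype (const pthread_mutexattr_t *, int *)
  __attribute__ ((weak));
extern int pthread_once (pthread_once_t *, void (*) (void))
  __attribute__ ((weak));

static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static int initialized;

static void
initializer_function (void)
{
  initialized = 1;
}

/* Hypothetical helper: nonzero if the detection symbol is bound.  Once
   the symbol is defined in libc itself, the reference binds even in
   programs that never linked against libpthread.  */
int
threading_detected (void)
{
  return pthread_mutexattr_gettype != NULL;
}

/* Hypothetical helper: take the pthread_once path only if threading
   was detected, otherwise run the initializer directly.  */
int
initialize (void)
{
  if (threading_detected () && pthread_once != NULL)
    pthread_once (&once_control, initializer_function);
  else if (!initialized)
    initializer_function ();
  return initialized;
}
```

Because the references carry no symbol version, the loader cannot tell
from the reference alone which glibc the program was linked against.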

Thanks,
Florian

H.J. Lu May 4, 2021, 12:36 p.m. UTC | #8
On Tue, May 4, 2021 at 5:33 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> > On Tue, May 4, 2021 at 5:27 AM Florian Weimer <fweimer@redhat.com> wrote:
> >>
> >> * H. J. Lu:
> >>
> >> >> The patch attempts to detect old main programs by looking for the
> >> >> GLIBC_2.34 symbol version.  Since we added __libc_start_main@@GLIBC_2.34
> >> >> (which is called from our version of _start), all standard main programs
> >> >> linked with glibc 2.34 or later will have this symbol version.
> >>
> >> > Can we invent a symbol or version to detect the older binaries?
> >> > If not, can GNU property, ABI note, .... help here?
> >>
> >> I think we have all we need due to __libc_start_main@@GLIBC_2.34.  It
> >> was an unrelated change, but it helps here as well.
> >
> > So your patch isn't required?  Can you add some tests to verify it?
>
> No, the patch uses the absence of GLIBC_2.34 for detecting old binaries.
>
> We cannot use symbol versions in other ways because one key
> characteristic of weak references and underlinking is the lack of
> version information on the symbol itself.

Can we add support for binary testcases like this, even if it can only
run on a single target?  It shouldn't be too hard.

Florian Weimer May 4, 2021, 12:43 p.m. UTC | #9
* H. J. Lu:

> On Tue, May 4, 2021 at 5:33 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * H. J. Lu:
>>
>> > On Tue, May 4, 2021 at 5:27 AM Florian Weimer <fweimer@redhat.com> wrote:
>> >>
>> >> * H. J. Lu:
>> >>
>> >> >> The patch attempts to detect old main programs by looking for the
>> >> >> GLIBC_2.34 symbol version.  Since we added __libc_start_main@@GLIBC_2.34
>> >> >> (which is called from our version of _start), all standard main programs
>> >> >> linked with glibc 2.34 or later will have this symbol version.
>> >>
>> >> > Can we invent a symbol or version to detect the older binaries?
>> >> > If not, can GNU property, ABI note, .... help here?
>> >>
>> >> I think we have all we need due to __libc_start_main@@GLIBC_2.34.  It
>> >> was an unrelated change, but it helps here as well.
>> >
>> > So your patch isn't required?  Can you add some tests to verify it?
>>
>> No, the patch uses the absence of GLIBC_2.34 for detecting old binaries.
>>
>> We cannot use symbol versions in other ways because one key
>> characteristic of weak references and underlinking is the lack of
>> version information on the symbol itself.
>
> Can we add support for binary testcases like this, even if it can only
> run on a single target?  It shouldn't be too hard.

Yes, we can verify the binding status of weak symbols with a smaller
assembler program that was linked against a stub libc.so.6 library that
only contains the soname and no symbols.

As the code is architecture-agnostic, testing on e.g. x86-64 should be
sufficient.

We cannot easily verify the behavior of real-world binaries because
re-linking them with current binutils probably changes behavior.

Thanks,
Florian

Adhemerval Zanella May 4, 2021, 12:59 p.m. UTC | #10
On 28/04/2021 15:01, Florian Weimer via Libc-alpha wrote:
> Definitions of these symbols in libc expose bugs in existing binaries
> linked against earlier glibc versions.  Therefore, hide these symbols
> for old binaries.

It is unfortunate that we never specified that redefining libc symbol
visibility with regard to ELF linking should be considered undefined
behaviour, so programs now rely on such behaviour (bug-for-bug forward
compatibility is such a messy constraint...).  Had we known earlier, a
single-thread API indication might have avoided this mess.

In any case I don't see a better way than moving the logic to the loader
to handle such cases; LD_PRELOAD is not really a solution.

> 
> The symbol list in sysdeps/nptl/dl-pthread-weak.c contains some
> symbols which have not been moved yet, but that is harmless because
> the function is only invoked if the symbol is found in libc.so.
> 
> The test suite passes on i686-gnu-linux, powerpc64-linux-gnu,
> x86_64-linux-gnu with these changes.
> 
> Is this the direction we want to go in?  Then I'm going to add test
> cases, probably using assembler.
> 
> Personally I think it's not *too* bad, also not particularly nice
> either.  elf/dl-pthread-weak.os brings in 2-3 KiB of code (but few
> run-time relocations).  One possibility I have not mentioned in the
> comment is to put the moved symbols into a GLIBCPTHREAD_2.34 symbol
> version and use the presence of this version on the chain as an
> indicator that the symbol uses special treatment.  This eliminates the
> separate string table.  The downside is that we cannot easily add more
> symbols if we discover some are missing.  This happened to me during
> development with pthread_mutexattr_gettype, which is a GLIBC_2.1 symbol
> and therefore not actually suitable for detecting the presence of
> libpthread (historically speaking).  And it could happen again with
> thrd_exit (which is of course much younger).

My take on this is to postpone the 2.34 release until we have
libpthread/librt *fully* integrated into libc.so.  The halfway state
only adds unnecessary complexity, and full integration might avoid the
2-3 KiB increase in code size.

H.J. Lu May 4, 2021, 1:06 p.m. UTC | #11
On Tue, May 4, 2021 at 5:43 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> > On Tue, May 4, 2021 at 5:33 AM Florian Weimer <fweimer@redhat.com> wrote:
> >>
> >> * H. J. Lu:
> >>
> >> > On Tue, May 4, 2021 at 5:27 AM Florian Weimer <fweimer@redhat.com> wrote:
> >> >>
> >> >> * H. J. Lu:
> >> >>
> >> >> >> The patch attempts to detect old main programs by looking for the
> >> >> >> GLIBC_2.34 symbol version.  Since we added __libc_start_main@@GLIBC_2.34
> >> >> >> (which is called from our version of _start), all standard main programs
> >> >> >> linked with glibc 2.34 or later will have this symbol version.
> >> >>
> >> >> > Can we invent a symbol or version to detect the older binaries?
> >> >> > If not, can GNU property, ABI note, .... help here?
> >> >>
> >> >> I think we have all we need due to __libc_start_main@@GLIBC_2.34.  It
> >> >> was an unrelated change, but it helps here as well.
> >> >
> >> > So your patch isn't required?  Can you add some tests to verify it?
> >>
> >> No, the patch uses the absence of GLIBC_2.34 for detecting old binaries.
> >>
> >> We cannot use symbol versions in other ways because one key
> >> characteristic of weak references and underlinking is the lack of
> >> version information on the symbol itself.
> >
> > Can we add support for binary testcases like this, even if it can only
> > run on a single target?  It shouldn't be too hard.
>
> Yes, we can verify the binding status of weak symbols with a smaller
> assembler program that was linked against a stub libc.so.6 library that
> only contains the soname and no symbols.
>
> As the code is architecture-agnostic, testing on e.g. x86-64 should be
> sufficient.
>
> We cannot easily verify the behavior of real-world binaries because
> re-linking them with current binutils probably changes behavior.


Why is re-linking needed for the binary executable test?

Andreas Schwab May 4, 2021, 1:08 p.m. UTC | #12
https://build.opensuse.org/package/live_build_log/home:Andreas_Schwab:glibc:test/glibc/p/ppc64le

make  -C localedata install-locales
make[2]: Entering directory '/home/abuild/rpmbuild/BUILD/glibc-2.33.9000.493.g2a76821c30/localedata'
.././scripts/mkinstalldirs /home/abuild/rpmbuild/BUILDROOT/glibc-2.33.9000.493.g2a76821c30-2839.1.ppc64le/usr/lib/locale
mkdir -p -- /home/abuild/rpmbuild/BUILDROOT/glibc-2.33.9000.493.g2a76821c30-2839.1.ppc64le/usr/lib/locale
aa_DJ.UTF-8aa_ER.UTF-8C.UTF-8...aa_DJ.ISO-8859-1.........make[2]: *** [Makefile:449: install-archive-aa_DJ.UTF-8/UTF-8] Error 132
make[2]: *** Waiting for unfinished jobs....
make[2]: *** [Makefile:449: install-archive-C.UTF-8/UTF-8] Error 132
make[2]: *** [Makefile:449: install-archive-aa_ER/UTF-8] Error 132
make[2]: *** [Makefile:449: install-archive-aa_DJ/ISO-8859-1] Error 132
make[2]: Leaving directory '/home/abuild/rpmbuild/BUILD/glibc-2.33.9000.493.g2a76821c30/localedata'
make[1]: *** [Makefile:731: localedata/install-locales] Error 2
make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/glibc-2.33.9000.493.g2a76821c30'
make: *** [Makefile:9: localedata/install-locales] Error 2

Andreas.

Florian Weimer May 4, 2021, 1:13 p.m. UTC | #13
* H. J. Lu:

> Why is re-linking needed for the binary executable test?

I don't want us to bundle the sources of an old glibc version, just to
comply with the LGPL requirements for one single test (or talk to legal
folks to figure out if this is required).

Thanks,
Florian

H.J. Lu May 4, 2021, 1:28 p.m. UTC | #14
On Tue, May 4, 2021 at 6:13 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * H. J. Lu:
>
> > Why is re-linking needed for the binary executable test?
>
> I don't want us to bundle the sources of an old glibc version, just to
> comply with the LGPL requirements for one single test (or talk to legal
> folks to figure out if this is required).

We can include both the source and the binary.  We just use the pre-built
binary built against the older glibc.  I don't see an issue here.

Florian Weimer May 4, 2021, 4:52 p.m. UTC | #15
* Andreas Schwab:

> https://build.opensuse.org/package/live_build_log/home:Andreas_Schwab:glibc:test/glibc/p/ppc64le
>
> make  -C localedata install-locales
> make[2]: Entering directory '/home/abuild/rpmbuild/BUILD/glibc-2.33.9000.493.g2a76821c30/localedata'
> .././scripts/mkinstalldirs /home/abuild/rpmbuild/BUILDROOT/glibc-2.33.9000.493.g2a76821c30-2839.1.ppc64le/usr/lib/locale
> mkdir -p -- /home/abuild/rpmbuild/BUILDROOT/glibc-2.33.9000.493.g2a76821c30-2839.1.ppc64le/usr/lib/locale
> aa_DJ.UTF-8aa_ER.UTF-8C.UTF-8...aa_DJ.ISO-8859-1.........make[2]: *** [Makefile:449: install-archive-aa_DJ.UTF-8/UTF-8] Error 132
> make[2]: *** Waiting for unfinished jobs....
> make[2]: *** [Makefile:449: install-archive-C.UTF-8/UTF-8] Error 132
> make[2]: *** [Makefile:449: install-archive-aa_ER/UTF-8] Error 132
> make[2]: *** [Makefile:449: install-archive-aa_DJ/ISO-8859-1] Error 132
> make[2]: Leaving directory '/home/abuild/rpmbuild/BUILD/glibc-2.33.9000.493.g2a76821c30/localedata'
> make[1]: *** [Makefile:731: localedata/install-locales] Error 2
> make[1]: Leaving directory '/home/abuild/rpmbuild/BUILD/glibc-2.33.9000.493.g2a76821c30'
> make: *** [Makefile:9: localedata/install-locales] Error 2

Thanks, I'll try to reproduce this locally tomorrow.  I assume this
command is failing:

	$(LOCALEDEF) $$flags --alias-file=../intl/locale.alias \
		     -i locales/$$input -f charmaps/$$charset \
		     $(addprefix --prefix=,$(install_root)) $$locale \

Note that if you have existing binaries that use weak symbols *and*
GLIBC_2.34 (likely due to __libc_start_main@@GLIBC_2.34), you will have
to do something about them.  The compatibility kludge will not work if
programs are linked against certain intermediate glibc 2.34 snapshots.

Thanks,
Florian

Carlos O'Donell May 4, 2021, 9:08 p.m. UTC | #16
On 5/4/21 8:59 AM, Adhemerval Zanella via Libc-alpha wrote:
> My take on this is to postpone the 2.34 release until we have
> libpthread/librt *fully* integrated into libc.so.  The halfway state
> only adds unnecessary complexity, and full integration might avoid the
> 2-3 KiB increase in code size.

Glass half-full: I will accelerate my reviews :-)

Carlos O'Donell May 4, 2021, 9:28 p.m. UTC | #17
On 4/28/21 2:01 PM, Florian Weimer via Libc-alpha wrote:
> Definitions of these symbols in libc expose bugs in existing binaries
> linked against earlier glibc versions.  Therefore, hide these symbols
> for old binaries.

Overall this looks good to me.

(1) High level approach is good.

I like that this is not on the hot path and is called only for weak
unversioned symbol references via check_match().  As approaches go, I
think this is light and fast.

(2) Other solutions to the problem.

I don't see any other simple solution for the effectively underlinked binary
that finds libpthread loaded with libc. I like that we can contain the solution
within libc itself without needing to do any preloading.
 
> The symbol list in sysdeps/nptl/dl-pthread-weak.c contains some
> symbols which have not been moved yet, but that is harmless because
> the function is only invoked if the symbol is found in libc.so.
> 
> The test suite passes on i686-gnu-linux, powerpc64-linux-gnu,
> x86_64-linux-gnu with these changes.
> 
> Is this the direction we want to go in?  Then I'm going to add test
> cases, probably using assembler.
> 
> Personally I think it's not *too* bad, also not particularly nice
> either.  elf/dl-pthread-weak.os brings in 2-3 KiB of code (but few
> run-time relocations).  One possibility I have not mentioned in the
> comment is to put the moved symbols into a GLIBCPTHREAD_2.34 symbol
> version and use the presence of this version on the chain as an
> indicator that the symbol uses special treatment.  This eliminates the
> separate string table.  The downside is that we cannot easily add more
> symbols if we discover some are missing.  This happened to me during
> development with pthread_mutexattr_gettype, which is a GLIBC_2.1 symbol
> and therefore not actually suitable for detecting the presence of
> libpthread (historically speaking).  And it could happen again with
> thrd_exit (which is of course much younger).

(3) Keep it flexible.

I think we need to keep the flexibility of the table lookup and avoid
adding more versioned symbols.


> ---
>  elf/Makefile                      |   2 +-
>  elf/dl-lookup.c                   |  13 +++-
>  elf/dl-open.c                     |   3 +
>  elf/dl-pthread-weak.c             |  20 +++++
>  elf/dl-version.c                  |   2 +
>  sysdeps/generic/dl-pthread-weak.h |  67 +++++++++++++++++
>  sysdeps/nptl/dl-pthread-weak.c    | 153 ++++++++++++++++++++++++++++++++++++++
>  sysdeps/nptl/dl-pthread-weak.h    | 107 ++++++++++++++++++++++++++
>  8 files changed, 363 insertions(+), 4 deletions(-)
> 
> diff --git a/elf/Makefile b/elf/Makefile
> index f09988f7d2..0dd430366f 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -34,7 +34,7 @@ dl-routines	= $(addprefix dl-,load lookup object reloc deps \
>  				  version profile tls origin scope \
>  				  execstack open close trampoline \
>  				  exception sort-maps lookup-direct \
> -				  call-libc-early-init write \
> +				  call-libc-early-init write pthread-weak \

OK. Add pthread-weak.

>  				  thread_gscope_wait tls_init_tp)
>  ifeq (yes,$(use-ldconfig))
>  dl-routines += dl-cache
> diff --git a/elf/dl-lookup.c b/elf/dl-lookup.c
> index eea217eb28..2a2b46f85c 100644
> --- a/elf/dl-lookup.c
> +++ b/elf/dl-lookup.c
> @@ -29,6 +29,7 @@
>  #include <tls.h>
>  #include <atomic.h>
>  #include <elf_machine_sym_no_match.h>
> +#include <dl-pthread-weak.h>

OK. Include during lookup.

>  
>  #include <assert.h>
>  
> @@ -64,6 +65,7 @@ check_match (const char *const undef_name,
>  	     const Elf_Symndx symidx,
>  	     const char *const strtab,
>  	     const struct link_map *const map,
> +	     const struct link_map *undef_map,

OK. Pass new argument to check_match.

>  	     const ElfW(Sym) **const versioned_sym,
>  	     int *const num_versions)
>  {
> @@ -142,6 +144,11 @@ check_match (const char *const undef_name,
>  	 public interface should be returned.  */
>        if (verstab != NULL)
>  	{
> +	  /* Check if this is a legacy pthread weak symbol reference.
> +	     If yes, then do not bind to this symbol.  */
> +	  if (dl_pthread_hide_symbol (undef_map, undef_name, ref, map))
> +	    return NULL;
> +
>  	  if ((verstab[symidx] & 0x7fff)
>  	      >= ((flags & DL_LOOKUP_RETURN_NEWEST) ? 2 : 3))
>  	    {
> @@ -429,8 +436,8 @@ do_lookup_x (const char *undef_name, uint_fast32_t new_hash,
>  			symidx = ELF_MACHINE_HASH_SYMIDX (map, hasharr);
>  			sym = check_match (undef_name, ref, version, flags,
>  					   type_class, &symtab[symidx], symidx,
> -					   strtab, map, &versioned_sym,
> -					   &num_versions);
> +					   strtab, map, undef_map,
> +					   &versioned_sym, &num_versions);

OK. Pass recursively.

>  			if (sym != NULL)
>  			  goto found_it;
>  		      }
> @@ -454,7 +461,7 @@ do_lookup_x (const char *undef_name, uint_fast32_t new_hash,
>  	    {
>  	      sym = check_match (undef_name, ref, version, flags,
>  				 type_class, &symtab[symidx], symidx,
> -				 strtab, map, &versioned_sym,
> +				 strtab, map, undef_map, &versioned_sym,
>  				 &num_versions);
>  	      if (sym != NULL)
>  		goto found_it;
> diff --git a/elf/dl-open.c b/elf/dl-open.c
> index ab7aaa345e..4389159717 100644
> --- a/elf/dl-open.c
> +++ b/elf/dl-open.c
> @@ -35,6 +35,7 @@
>  #include <libc-internal.h>
>  #include <array_length.h>
>  #include <libc-early-init.h>
> +#include <dl-pthread-weak.h>
>  
>  #include <dl-dst.h>
>  #include <dl-prop.h>
> @@ -743,6 +744,8 @@ dl_open_worker (void *a)
>         on memory allocation failure.  See bug 16134.  */
>      update_tls_slotinfo (new);
>  
> +  dl_pthread_record_dlopen (new);

OK. Opening.

> +
>    /* Notify the debugger all new objects have been relocated.  */
>    if (relocation_in_progress)
>      LIBC_PROBE (reloc_complete, 3, args->nsid, r, new);
> diff --git a/elf/dl-pthread-weak.c b/elf/dl-pthread-weak.c
> new file mode 100644
> index 0000000000..aff80d4177
> --- /dev/null
> +++ b/elf/dl-pthread-weak.c
> @@ -0,0 +1,20 @@
> +/* Weak references to symbols formerly in libpthread.  Generic version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +/* The generic version is a header-only implementation.  */
> +#include <dl-pthread-weak.h>
> diff --git a/elf/dl-version.c b/elf/dl-version.c
> index 914955c2a8..d4c3b24a76 100644
> --- a/elf/dl-version.c
> +++ b/elf/dl-version.c
> @@ -24,6 +24,7 @@
>  #include <string.h>
>  #include <ldsodefs.h>
>  #include <_itoa.h>
> +#include <dl-pthread-weak.h>
>  
>  #include <assert.h>
>  
> @@ -220,6 +221,7 @@ _dl_check_map_versions (struct link_map *map, int verbose, int trace_mode)
>  					  strtab + aux->vna_name,
>  					  needed->l_real, verbose,
>  					  aux->vna_flags & VER_FLG_WEAK);
> +		  dl_pthread_record_version (map, aux);
>  
>  		  /* Compare the version index.  */
>  		  if ((unsigned int) (aux->vna_other & 0x7fff) > ndx_high)
> diff --git a/sysdeps/generic/dl-pthread-weak.h b/sysdeps/generic/dl-pthread-weak.h
> new file mode 100644
> index 0000000000..109cb4264b
> --- /dev/null
> +++ b/sysdeps/generic/dl-pthread-weak.h
> @@ -0,0 +1,67 @@
> +/* Weak references to symbols formerly in libpthread.  Generic version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +/* A lot of applications contain code like this:
> +
> +   if (pthread_mutexattr_gettype != NULL)
> +     pthread_once (&once_control, initializer_function);
> +
> +   pthread_mutexattr_gettype and pthread_once are declared as weak in
> +   the application.  Traditionally, link editors apply various forms
> +   of relaxations to a call to a weak function symbol if the symbol is
> +   undefined at static link time.  This eliminates the symbol
> +   reference, but the relevant code path cannot be executed anymore.
> +   Such code paths become active after symbols like
> +   pthread_mutexattr_gettype are moved into libc, so it is necessary
> +   to mask the existence of the symbol for old binaries.  */

OK. Good comment. Agreed.

> +
> +#ifndef _DL_PTHREAD_WEAK_H
> +#define _DL_PTHREAD_WEAK_H
> +
> +#include <ldsodefs.h>
> +#include <link.h>
> +#include <stdbool.h>
> +
> +/* Returns true if check_match in elf/dl-lookup.c should not resolve
> +   the symbol.  Called only if an unversioned symbol is about to be
> +   bound to a versioned symbol.  */
> +static inline bool
> +dl_pthread_hide_symbol (const struct link_map *undef_map,
> +                        const char *undef_name,
> +                        const ElfW(Sym) *undef_sym,
> +                        const struct link_map *defining_map)
> +{
> +  return false;
> +}
> +
> +/* Called during dlopen in the base namespace.  This can be used to
> +   detect a reference to libpthread.  */
> +static inline void
> +dl_pthread_record_dlopen (const struct link_map *map)
> +{
> +}
> +
> +/* Called for each needed version during symbol version information
> +   processing as part of dlopen.  */
> +static inline void
> +dl_pthread_record_version (const struct link_map *map,
> +                           const ElfW(Vernaux) *aux)
> +{
> +}
> +
> +#endif /* _DL_PTHREAD_WEAK_H */
> diff --git a/sysdeps/nptl/dl-pthread-weak.c b/sysdeps/nptl/dl-pthread-weak.c
> new file mode 100644
> index 0000000000..ff32939253
> --- /dev/null
> +++ b/sysdeps/nptl/dl-pthread-weak.c
> @@ -0,0 +1,153 @@
> +/* Weak references to symbols formerly in libpthread.  NPTL version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <array_length.h>
> +#include <dl-pthread-weak.h>
> +
> +#if DL_PTHREAD_WEAK_NEEDED
> +bool _dl_pthread_weak_symbols;
> +
> +/* There are several ways to implement the check to identify the
> +   relevant symbols.  For example, we could use the otherwise unused
> +   weak symbol status within libc.so.  The set representation is
> +   reasonably small and fast.  This function is called only for weak
> +   unversioned symbol references already found in libc.so, which is an
> +   unusual case and therefore not on the fast path for symbol
> +   lookup.  */

OK. Agreed.

> +
> +
> +/* Lexicographically ordered list of symbols originally at the
> +   GLIBC_2.0 and GLIBC_2.1 versions.  Later symbols (including C11
> +   symbols) will give false negatives on earlier glibc versions and
> +   are thus unsuitable for libpthread detection.  Even GLIBC_2.1 is
> +   problematic in this regard, but actual binaries use
> +   pthread_mutexattr_gettype as the detector symbol.  */
> +enum { maximum_length = 21 };

Please clarify in the comment that the entries of this array are not
necessarily NUL-terminated and that you must not run strlen on them.
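
To make the concern concrete, here is a minimal sketch of the same technique with a toy table. The names `width`, `table` and `table_contains` are made up for this example and the entries are not from the patch. An entry that fills its slot exactly has no terminating NUL, so only bounded comparisons such as strncmp are safe on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Toy table: sorted, fixed-width entries.  "alpha" and "gamma" use
   all 5 bytes of their slot, so they have no terminating NUL.  C
   permits this initialization; some compilers may warn about it.  */
enum { width = 5 };
static const char table[][width] =
  {
    "alpha",
    "beta",
    "gamma",
  };

/* Bounded comparison: never reads past the end of a table entry.  */
static int
compare (const void *a, const void *b)
{
  return strncmp (a, b, width);
}

/* Returns true if KEY (an ordinary NUL-terminated string) is in the
   table.  strlen is applied to KEY only, never to a table entry.  */
bool
table_contains (const char *key)
{
  assert (key != NULL);
  if (strlen (key) > width)
    return false;
  return bsearch (key, table, sizeof (table) / sizeof (table[0]),
                  width, compare) != NULL;
}
```

With this table, `table_contains ("alpha")` and `table_contains ("beta")` are true, while `table_contains ("alph")` and `table_contains ("alphabet")` are false.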

> +static const char symbols[][maximum_length] =
> +  {
> +    "_cleanup_pop",
> +    "_cleanup_pop_restore",
> +    "_cleanup_push",
> +    "_cleanup_push_defer",
> +    "_getspecific",
> +    "_key_create",
> +    "_mutex_destroy",
> +    "_mutex_init",
> +    "_mutex_lock",
> +    "_mutex_trylock",
> +    "_mutex_unlock",
> +    "_mutexattr_destroy",
> +    "_mutexattr_init",
> +    "_mutexattr_settype",
> +    "_once",
> +    "_setspecific",
> +    "atfork",
> +    "attr_getguardsize",
> +    "attr_getstackaddr",
> +    "attr_getstacksize",
> +    "attr_setguardsize",
> +    "attr_setstackaddr",
> +    "attr_setstacksize",
> +    "cancel",
> +    "create",
> +    "detach",
> +    "getconcurrency",
> +    "getspecific",
> +    "join",
> +    "key_create",
> +    "key_delete",
> +    "kill",
> +    "kill_other_threads_np",
> +    "mutex_trylock",
> +    "mutexattr_destroy",
> +    "mutexattr_getkind_np",
> +    "mutexattr_gettype",
> +    "mutexattr_init",
> +    "mutexattr_setkind_np",
> +    "mutexattr_settype",
> +    "once",
> +    "rwlock_destroy",
> +    "rwlock_init",
> +    "rwlock_rdlock",
> +    "rwlock_tryrdlock",
> +    "rwlock_trywrlock",
> +    "rwlock_unlock",
> +    "rwlock_wrlock",
> +    "rwlockattr_destroy",
> +    "rwlockattr_getkind_np",
> +    "rwlockattr_getpshared",
> +    "rwlockattr_init",
> +    "rwlockattr_setkind_np",
> +    "rwlockattr_setpshared",
> +    "sem_destroy",
> +    "sem_getvalue",
> +    "sem_init",
> +    "sem_post",
> +    "sem_trywait",
> +    "sem_wait",
> +    "setconcurrency",
> +    "setspecific",
> +    "sigmask",
> +    "testcancel",
> +  };
> +
> +static inline int
> +compare (const void *a, const void *b)
> +{
> +  return strncmp (a, b, maximum_length);
> +}
> +
> +bool
> +_dl_pthread_hidden_symbol (const char *undef_name)
> +{
> +  /* Turn the __pthread and _pthread prefixes into a _ prefix.  This
> +     allows us to use a single lookup table.  (The difference between
> +     __pthread_mutex_lock and pthread_mutex_lock is significant, for
> +     example.)  */
> +  const char *key = NULL;
> +  if (strncmp (undef_name, "__pthread_", strlen ("__pthread_")) == 0)
> +    key = undef_name + strlen ("__pthread");
> +  else if (strncmp (undef_name, "_pthread_", strlen ("_pthread_")) == 0)
> +    key = undef_name + strlen ("_pthread");
> +  else if (strncmp (undef_name, "pthread_", strlen ("pthread_")) == 0)
> +    key = undef_name + strlen ("pthread_");
> +  else if (strncmp (undef_name, "sem_", strlen ("sem_")) == 0)
> +    /* Do not remove the sem_ prefix.  This would result in false
> +       matches for symbols such as pthread_sem_post, but no such
> +       symbols exist.  */
> +    key = undef_name;
> +
> +  if (key == NULL || strlen (key) > maximum_length)
> +    /* The prefix of undef_name is not recognized, or the string is
> +       not in the table because it is too long.  */
> +    return false;
> +
> +  if (bsearch (key, symbols, array_length (symbols), maximum_length,
> +               compare) != NULL)
> +    {
> +      if (__glibc_unlikely (GLRO (dl_debug_mask) & DL_DEBUG_BINDINGS))
> +        _dl_debug_printf ("\
> +not binding legacy weak reference in main program to %s\n",
> +                          undef_name);

OK. Excellent use of _dl_debug_printf with DL_DEBUG_BINDINGS.
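
For reference, the prefix handling earlier in this function can be tried out in isolation with a sketch like the following. It mirrors the logic of the quoted code, but `has_prefix` and `normalize_key` are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if NAME starts with PREFIX.  */
static int
has_prefix (const char *name, const char *prefix)
{
  return strncmp (name, prefix, strlen (prefix)) == 0;
}

/* Maps a symbol name to its lookup key, or NULL if the prefix is not
   recognized.  The __pthread_ and _pthread_ forms keep one leading
   underscore so that they stay distinct from the plain pthread_
   form.  */
const char *
normalize_key (const char *name)
{
  assert (name != NULL);
  if (has_prefix (name, "__pthread_"))
    return name + strlen ("__pthread");
  if (has_prefix (name, "_pthread_"))
    return name + strlen ("_pthread");
  if (has_prefix (name, "pthread_"))
    return name + strlen ("pthread_");
  if (has_prefix (name, "sem_"))
    return name;   /* The sem_ prefix is kept as part of the key.  */
  return NULL;
}
```

For example, `__pthread_mutex_lock` maps to `_mutex_lock`, `pthread_once` maps to `once`, `sem_post` stays `sem_post`, and `printf` yields NULL.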

> +      return true;
> +    }
> +
> +  return false;
> +}
> +
> +#endif /* DL_PTHREAD_WEAK_NEEDED */
> diff --git a/sysdeps/nptl/dl-pthread-weak.h b/sysdeps/nptl/dl-pthread-weak.h
> new file mode 100644
> index 0000000000..f252abcafe
> --- /dev/null
> +++ b/sysdeps/nptl/dl-pthread-weak.h
> @@ -0,0 +1,107 @@
> +/* Weak references to symbols formerly in libpthread.  NPTL version.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#include <shlib-compat.h>
> +#if OTHER_SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_34)
> +
> +/* For context, refer to <sysdeps/generic/dl-pthread-weak.h>.  */
> +
> +# include <gnu/lib-names.h>
> +# include <ldsodefs.h>
> +# include <link.h>
> +# include <stdbool.h>
> +# include <string.h>
> +
> +/* The implementation file needs to be compiled.  */
> +#define DL_PTHREAD_WEAK_NEEDED 1
> +
> +/* If false, weak symbols to old libpthread-only functions are hidden
> +   from symbol resolution in the main program.  Set to true if
> +   libpthread is loaded by the main program (via
> +   dl_pthread_record_dlopen), or if the main program references
> +   GLIBC_2.34 (via dl_pthread_record_version).  */
> +extern bool _dl_pthread_weak_symbols attribute_hidden;
> +
> +/* Returns true if UNDEF_NAME refers to a libpthread symbol that is
> +   hidden (in case of certain weak symbol references from the main
> +   program).  */
> +bool _dl_pthread_hidden_symbol (const char *undef_name) attribute_hidden;
> +
> +static inline bool
> +dl_pthread_hide_symbol (const struct link_map *undef_map,
> +                        const char *undef_name,
> +                        const ElfW(Sym) *undef_sym,
> +                        const struct link_map *defining_map)
> +{
> +  /* Check if symbol hiding has been disabled.  */
> +  if (_dl_pthread_weak_symbols)
> +    return false;
> +
> +  /* Symbol hiding only applies to weak symbol references.  */
> +  if (undef_sym == NULL || ELFW(ST_BIND) (undef_sym->st_info) != STB_WEAK)
> +    return false;

OK.
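
As a side note, the binding test can be exercised outside the dynamic loader with the public <elf.h> definitions. The sketch below assumes a 64-bit target and therefore uses ELF64_ST_BIND directly where the patch uses the ELFW wrapper. `is_weak_reference`, `make_symbol` and `binding_counts_as_weak` are hypothetical helpers for this example.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Same condition as in the quoted code, negated: true only for a
   non-NULL symbol whose binding is STB_WEAK.  */
bool
is_weak_reference (const Elf64_Sym *sym)
{
  return sym != NULL && ELF64_ST_BIND (sym->st_info) == STB_WEAK;
}

/* Builds a zeroed symbol table entry with the given binding and type.  */
Elf64_Sym
make_symbol (unsigned char binding, unsigned char type)
{
  /* Binding and type each occupy 4 bits of st_info.  */
  assert (binding <= 0xf && type <= 0xf);
  Elf64_Sym sym = { 0 };
  sym.st_info = ELF64_ST_INFO (binding, type);
  return sym;
}

/* Convenience wrapper: does a function symbol with BINDING count as a
   weak reference?  */
bool
binding_counts_as_weak (unsigned char binding)
{
  Elf64_Sym sym = make_symbol (binding, STT_FUNC);
  return is_weak_reference (&sym);
}
```

Only `binding_counts_as_weak (STB_WEAK)` is true. STB_GLOBAL, STB_LOCAL and a NULL symbol pointer all fall through to "do not hide".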

> +
> +  /* Only symbols in the main map are potentially hidden.  Shared
> +     objects are compiled as PIC and are not affected by link editor
> +     optimizations.  This implies a check that we are in the base
> +     namespace.  */
> +  const struct link_map *main_map = GL (dl_ns)[LM_ID_BASE]._ns_loaded;
> +  if (undef_map != main_map)
> +    return false;
> +
> +  /* Symbol hiding only applies to symbols in libc.so.  */
> +  if (defining_map != GL (dl_ns)[LM_ID_BASE].libc_map)
> +    return false;
> +
> +  /* Delegate to the out-of-line name checking function.  */
> +  return _dl_pthread_hidden_symbol (undef_name);
> +}
> +
> +static inline void
> +dl_pthread_record_dlopen (const struct link_map *map)
> +{
> +  /* This assumes that our libpthread has soname (and still exists as
> +     a separate shared object).  */
> +  const char *strtab = (const char *) D_PTR (map, l_info[DT_STRTAB]);
> +  if (map->l_info[DT_SONAME] != NULL
> +      && strcmp (strtab + map->l_info[DT_SONAME]->d_un.d_val,
> +                 LIBPTHREAD_SO) == 0)
> +    _dl_pthread_weak_symbols = true;

OK.

> +}
> +
> +static inline void
> +dl_pthread_record_version (const struct link_map *map,
> +                           const ElfW(Vernaux) *aux)
> +{
> +  /* Only GLIBC_2.34 references from the main map disable weak symbol
> +     hiding.  */
> +  const struct link_map *main_map = GL (dl_ns)[LM_ID_BASE]._ns_loaded;
> +  if (map != main_map)
> +    return;
> +
> +  const char *strtab = (const char *) D_PTR (map, l_info[DT_STRTAB]);
> +  if (strcmp (strtab + aux->vna_name, "GLIBC_2.34") == 0)
> +    _dl_pthread_weak_symbols = true;

OK.
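
The version name lookup is an offset into the string table of the referencing object. A small stand-alone model of that check, with a made-up string table and assuming a 64-bit target, could look like this. `vernaux_names_version`, `example_strtab` and `example_offset_is_2_34` are hypothetical names.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the version name that AUX refers to, looked up in
   STRTAB, equals VERSION.  */
bool
vernaux_names_version (const char *strtab, const Elf64_Vernaux *aux,
                       const char *version)
{
  assert (strtab != NULL && aux != NULL && version != NULL);
  return strcmp (strtab + aux->vna_name, version) == 0;
}

/* Made-up string table: offset 0 is the empty string, offset 1 is
   "GLIBC_2.2.5", offset 13 is "GLIBC_2.34".  */
static const char example_strtab[] = "\0GLIBC_2.2.5\0GLIBC_2.34";

/* Does the name at OFFSET in the example table equal "GLIBC_2.34"?  */
bool
example_offset_is_2_34 (unsigned int offset)
{
  Elf64_Vernaux aux = { 0 };
  aux.vna_name = offset;
  return vernaux_names_version (example_strtab, &aux, "GLIBC_2.34");
}
```

In this model, an entry whose vna_name is 13 would disable the hiding, while one with vna_name 1 (the older version) would not.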

> +}
> +
> +#else
> +/* For static build and glibc after 2.34, it is possible to use the
> +   no-op default version.  */
> +# include <sysdeps/generic/dl-pthread-weak.h>
> +# define DL_PTHREAD_WEAK_NEEDED 0
> +#endif
>

Patch

diff --git a/elf/Makefile b/elf/Makefile
index f09988f7d2..0dd430366f 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -34,7 +34,7 @@  dl-routines	= $(addprefix dl-,load lookup object reloc deps \
 				  version profile tls origin scope \
 				  execstack open close trampoline \
 				  exception sort-maps lookup-direct \
-				  call-libc-early-init write \
+				  call-libc-early-init write pthread-weak \
 				  thread_gscope_wait tls_init_tp)
 ifeq (yes,$(use-ldconfig))
 dl-routines += dl-cache
diff --git a/elf/dl-lookup.c b/elf/dl-lookup.c
index eea217eb28..2a2b46f85c 100644
--- a/elf/dl-lookup.c
+++ b/elf/dl-lookup.c
@@ -29,6 +29,7 @@ 
 #include <tls.h>
 #include <atomic.h>
 #include <elf_machine_sym_no_match.h>
+#include <dl-pthread-weak.h>
 
 #include <assert.h>
 
@@ -64,6 +65,7 @@  check_match (const char *const undef_name,
 	     const Elf_Symndx symidx,
 	     const char *const strtab,
 	     const struct link_map *const map,
+	     const struct link_map *undef_map,
 	     const ElfW(Sym) **const versioned_sym,
 	     int *const num_versions)
 {
@@ -142,6 +144,11 @@  check_match (const char *const undef_name,
 	 public interface should be returned.  */
       if (verstab != NULL)
 	{
+	  /* Check if this is a legacy pthread weak symbol reference.
+	     If yes, then do not bind to this symbol.  */
+	  if (dl_pthread_hide_symbol (undef_map, undef_name, ref, map))
+	    return NULL;
+
 	  if ((verstab[symidx] & 0x7fff)
 	      >= ((flags & DL_LOOKUP_RETURN_NEWEST) ? 2 : 3))
 	    {
@@ -429,8 +436,8 @@  do_lookup_x (const char *undef_name, uint_fast32_t new_hash,
 			symidx = ELF_MACHINE_HASH_SYMIDX (map, hasharr);
 			sym = check_match (undef_name, ref, version, flags,
 					   type_class, &symtab[symidx], symidx,
-					   strtab, map, &versioned_sym,
-					   &num_versions);
+					   strtab, map, undef_map,
+					   &versioned_sym, &num_versions);
 			if (sym != NULL)
 			  goto found_it;
 		      }
@@ -454,7 +461,7 @@  do_lookup_x (const char *undef_name, uint_fast32_t new_hash,
 	    {
 	      sym = check_match (undef_name, ref, version, flags,
 				 type_class, &symtab[symidx], symidx,
-				 strtab, map, &versioned_sym,
+				 strtab, map, undef_map, &versioned_sym,
 				 &num_versions);
 	      if (sym != NULL)
 		goto found_it;
diff --git a/elf/dl-open.c b/elf/dl-open.c
index ab7aaa345e..4389159717 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -35,6 +35,7 @@ 
 #include <libc-internal.h>
 #include <array_length.h>
 #include <libc-early-init.h>
+#include <dl-pthread-weak.h>
 
 #include <dl-dst.h>
 #include <dl-prop.h>
@@ -743,6 +744,8 @@  dl_open_worker (void *a)
        on memory allocation failure.  See bug 16134.  */
     update_tls_slotinfo (new);
 
+  dl_pthread_record_dlopen (new);
+
   /* Notify the debugger all new objects have been relocated.  */
   if (relocation_in_progress)
     LIBC_PROBE (reloc_complete, 3, args->nsid, r, new);
diff --git a/elf/dl-pthread-weak.c b/elf/dl-pthread-weak.c
new file mode 100644
index 0000000000..aff80d4177
--- /dev/null
+++ b/elf/dl-pthread-weak.c
@@ -0,0 +1,20 @@ 
+/* Weak references to symbols formerly in libpthread.  Generic version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* The generic version is a header-only implementation.  */
+#include <dl-pthread-weak.h>
diff --git a/elf/dl-version.c b/elf/dl-version.c
index 914955c2a8..d4c3b24a76 100644
--- a/elf/dl-version.c
+++ b/elf/dl-version.c
@@ -24,6 +24,7 @@ 
 #include <string.h>
 #include <ldsodefs.h>
 #include <_itoa.h>
+#include <dl-pthread-weak.h>
 
 #include <assert.h>
 
@@ -220,6 +221,7 @@  _dl_check_map_versions (struct link_map *map, int verbose, int trace_mode)
 					  strtab + aux->vna_name,
 					  needed->l_real, verbose,
 					  aux->vna_flags & VER_FLG_WEAK);
+		  dl_pthread_record_version (map, aux);
 
 		  /* Compare the version index.  */
 		  if ((unsigned int) (aux->vna_other & 0x7fff) > ndx_high)
diff --git a/sysdeps/generic/dl-pthread-weak.h b/sysdeps/generic/dl-pthread-weak.h
new file mode 100644
index 0000000000..109cb4264b
--- /dev/null
+++ b/sysdeps/generic/dl-pthread-weak.h
@@ -0,0 +1,67 @@ 
+/* Weak references to symbols formerly in libpthread.  Generic version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* A lot of applications contain code like this:
+
+   if (pthread_mutexattr_gettype != NULL)
+     pthread_once (&once_control, initializer_function);
+
+   pthread_mutexattr_gettype and pthread_once are declared as weak in
+   the application.  Traditionally, link editors apply various forms
+   of relaxations to a call to a weak function symbol if the symbol is
+   undefined at static link time.  This eliminates the symbol
+   reference, but the relevant code path cannot be executed anymore.
+   Such code paths become active after symbols like
+   pthread_mutexattr_gettype are moved into libc, so it is necessary
+   to mask the existence of the symbol for old binaries.  */
+
+#ifndef _DL_PTHREAD_WEAK_H
+#define _DL_PTHREAD_WEAK_H
+
+#include <ldsodefs.h>
+#include <link.h>
+#include <stdbool.h>
+
+/* Returns true if check_match in elf/dl-lookup.c should not resolve
+   the symbol.  Called only if an unversioned symbol is about to be
+   bound to a versioned symbol.  */
+static inline bool
+dl_pthread_hide_symbol (const struct link_map *undef_map,
+                        const char *undef_name,
+                        const ElfW(Sym) *undef_sym,
+                        const struct link_map *defining_map)
+{
+  return false;
+}
+
+/* Called during dlopen in the base namespace.  This can be used to
+   detect a reference to libpthread.  */
+static inline void
+dl_pthread_record_dlopen (const struct link_map *map)
+{
+}
+
+/* Called for each needed version during symbol version information
+   processing as part of dlopen.  */
+static inline void
+dl_pthread_record_version (const struct link_map *map,
+                           const ElfW(Vernaux) *aux)
+{
+}
+
+#endif /* _DL_PTHREAD_WEAK_H */
diff --git a/sysdeps/nptl/dl-pthread-weak.c b/sysdeps/nptl/dl-pthread-weak.c
new file mode 100644
index 0000000000..ff32939253
--- /dev/null
+++ b/sysdeps/nptl/dl-pthread-weak.c
@@ -0,0 +1,153 @@ 
+/* Weak references to symbols formerly in libpthread.  NPTL version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <dl-pthread-weak.h>
+
+#if DL_PTHREAD_WEAK_NEEDED
+bool _dl_pthread_weak_symbols;
+
+/* There are several ways to implement the check to identify the
+   relevant symbols.  For example, we could use the otherwise unused
+   weak symbol status within libc.so.  The set representation is
+   reasonably small and fast.  This function is called only for weak
+   unversioned symbol references already found in libc.so, which is an
+   unusual case and therefore not on the fast path for symbol
+   lookup.  */
+
+
+/* Lexicographically ordered list of symbols originally at the
+   GLIBC_2.0 and GLIBC_2.1 versions.  Later symbols (including C11
+   symbols) will give false negatives on earlier glibc versions and
+   are thus unsuitable for libpthread detection.  Even GLIBC_2.1 is
+   problematic in this regard, but actual binaries use
+   pthread_mutexattr_gettype as the detector symbol.  */
+enum { maximum_length = 21 };
+static const char symbols[][maximum_length] =
+  {
+    "_cleanup_pop",
+    "_cleanup_pop_restore",
+    "_cleanup_push",
+    "_cleanup_push_defer",
+    "_getspecific",
+    "_key_create",
+    "_mutex_destroy",
+    "_mutex_init",
+    "_mutex_lock",
+    "_mutex_trylock",
+    "_mutex_unlock",
+    "_mutexattr_destroy",
+    "_mutexattr_init",
+    "_mutexattr_settype",
+    "_once",
+    "_setspecific",
+    "atfork",
+    "attr_getguardsize",
+    "attr_getstackaddr",
+    "attr_getstacksize",
+    "attr_setguardsize",
+    "attr_setstackaddr",
+    "attr_setstacksize",
+    "cancel",
+    "create",
+    "detach",
+    "getconcurrency",
+    "getspecific",
+    "join",
+    "key_create",
+    "key_delete",
+    "kill",
+    "kill_other_threads_np",
+    "mutex_trylock",
+    "mutexattr_destroy",
+    "mutexattr_getkind_np",
+    "mutexattr_gettype",
+    "mutexattr_init",
+    "mutexattr_setkind_np",
+    "mutexattr_settype",
+    "once",
+    "rwlock_destroy",
+    "rwlock_init",
+    "rwlock_rdlock",
+    "rwlock_tryrdlock",
+    "rwlock_trywrlock",
+    "rwlock_unlock",
+    "rwlock_wrlock",
+    "rwlockattr_destroy",
+    "rwlockattr_getkind_np",
+    "rwlockattr_getpshared",
+    "rwlockattr_init",
+    "rwlockattr_setkind_np",
+    "rwlockattr_setpshared",
+    "sem_destroy",
+    "sem_getvalue",
+    "sem_init",
+    "sem_post",
+    "sem_trywait",
+    "sem_wait",
+    "setconcurrency",
+    "setspecific",
+    "sigmask",
+    "testcancel",
+  };
+
+static inline int
+compare (const void *a, const void *b)
+{
+  return strncmp (a, b, maximum_length);
+}
+
+bool
+_dl_pthread_hidden_symbol (const char *undef_name)
+{
+  /* Turn the __pthread and _pthread prefixes into a _ prefix.  This
+     allows us to use a single lookup table.  (The difference between
+     __pthread_mutex_lock and pthread_mutex_lock is significant, for
+     example.)  */
+  const char *key = NULL;
+  if (strncmp (undef_name, "__pthread_", strlen ("__pthread_")) == 0)
+    key = undef_name + strlen ("__pthread");
+  else if (strncmp (undef_name, "_pthread_", strlen ("_pthread_")) == 0)
+    key = undef_name + strlen ("_pthread");
+  else if (strncmp (undef_name, "pthread_", strlen ("pthread_")) == 0)
+    key = undef_name + strlen ("pthread_");
+  else if (strncmp (undef_name, "sem_", strlen ("sem_")) == 0)
+    /* Do not remove the sem_ prefix.  This would result in false
+       matches for symbols such as pthread_sem_post, but no such
+       symbols exist.  */
+    key = undef_name;
+
+  if (key == NULL || strlen (key) > maximum_length)
+    /* The prefix of undef_name is not recognized, or the string is
+       not in the table because it is too long.  */
+    return false;
+
+  if (bsearch (key, symbols, array_length (symbols), maximum_length,
+               compare) != NULL)
+    {
+      if (__glibc_unlikely (GLRO (dl_debug_mask) & DL_DEBUG_BINDINGS))
+        _dl_debug_printf ("\
+not binding legacy weak reference in main program to %s\n",
+                          undef_name);
+      return true;
+    }
+
+  return false;
+}
+
+#endif /* DL_PTHREAD_WEAK_NEEDED */
diff --git a/sysdeps/nptl/dl-pthread-weak.h b/sysdeps/nptl/dl-pthread-weak.h
new file mode 100644
index 0000000000..f252abcafe
--- /dev/null
+++ b/sysdeps/nptl/dl-pthread-weak.h
@@ -0,0 +1,107 @@ 
+/* Weak references to symbols formerly in libpthread.  NPTL version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <shlib-compat.h>
+#if OTHER_SHLIB_COMPAT (libpthread, GLIBC_2_0, GLIBC_2_34)
+
+/* For context, refer to <sysdeps/generic/dl-pthread-weak.h>.  */
+
+# include <gnu/lib-names.h>
+# include <ldsodefs.h>
+# include <link.h>
+# include <stdbool.h>
+# include <string.h>
+
+/* The implementation file needs to be compiled.  */
+#define DL_PTHREAD_WEAK_NEEDED 1
+
+/* If false, weak symbols to old libpthread-only functions are hidden
+   from symbol resolution in the main program.  Set to true if
+   libpthread is loaded by the main program (via
+   dl_pthread_record_dlopen), or if the main program references
+   GLIBC_2.34 (via dl_pthread_record_version).  */
+extern bool _dl_pthread_weak_symbols attribute_hidden;
+
+/* Returns true if UNDEF_NAME refers to a libpthread symbol that is
+   hidden (in case of certain weak symbol references from the main
+   program).  */
+bool _dl_pthread_hidden_symbol (const char *undef_name) attribute_hidden;
+
+static inline bool
+dl_pthread_hide_symbol (const struct link_map *undef_map,
+                        const char *undef_name,
+                        const ElfW(Sym) *undef_sym,
+                        const struct link_map *defining_map)
+{
+  /* Check if symbol hiding has been disabled.  */
+  if (_dl_pthread_weak_symbols)
+    return false;
+
+  /* Symbol hiding only applies to weak symbol references.  */
+  if (undef_sym == NULL || ELFW(ST_BIND) (undef_sym->st_info) != STB_WEAK)
+    return false;
+
+  /* Only symbols in the main map are potentially hidden.  Shared
+     objects are compiled as PIC and are not affected by link editor
+     optimizations.  This implies a check that we are in the base
+     namespace.  */
+  const struct link_map *main_map = GL (dl_ns)[LM_ID_BASE]._ns_loaded;
+  if (undef_map != main_map)
+    return false;
+
+  /* Symbol hiding only applies to symbols in libc.so.  */
+  if (defining_map != GL (dl_ns)[LM_ID_BASE].libc_map)
+    return false;
+
+  /* Delegate to the out-of-line name checking function.  */
+  return _dl_pthread_hidden_symbol (undef_name);
+}
+
+static inline void
+dl_pthread_record_dlopen (const struct link_map *map)
+{
+  /* This assumes that our libpthread has soname (and still exists as
+     a separate shared object).  */
+  const char *strtab = (const char *) D_PTR (map, l_info[DT_STRTAB]);
+  if (map->l_info[DT_SONAME] != NULL
+      && strcmp (strtab + map->l_info[DT_SONAME]->d_un.d_val,
+                 LIBPTHREAD_SO) == 0)
+    _dl_pthread_weak_symbols = true;
+}
+
+static inline void
+dl_pthread_record_version (const struct link_map *map,
+                           const ElfW(Vernaux) *aux)
+{
+  /* Only GLIBC_2.34 references from the main map disable weak symbol
+     hiding.  */
+  const struct link_map *main_map = GL (dl_ns)[LM_ID_BASE]._ns_loaded;
+  if (map != main_map)
+    return;
+
+  const char *strtab = (const char *) D_PTR (map, l_info[DT_STRTAB]);
+  if (strcmp (strtab + aux->vna_name, "GLIBC_2.34") == 0)
+    _dl_pthread_weak_symbols = true;
+}
+
+#else
+/* For static build and glibc after 2.34, it is possible to use the
+   no-op default version.  */
+# include <sysdeps/generic/dl-pthread-weak.h>
+# define DL_PTHREAD_WEAK_NEEDED 0
+#endif