[BZ,#17250] Static dlopen default library search path fix

Message ID alpine.LFD.2.20.1508041948170.1410@eddie.linux-mips.org
State Committed
Headers

Commit Message

Maciej W. Rozycki Aug. 4, 2015, 8:52 p.m. UTC
  On Tue, 4 Aug 2015, Andreas Schwab wrote:

> >  What problem are you seeing or trying to solve?
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=17250

 Umm, thanks.  Having looked through the patch again it looks to me like I 
got confused with the negative `__builtin_expect' expression.  My change 
was not neutral as intended as it's where DF_1_NODEFLIB is *set* that the 
conditional executes.

 So the solution is to leave `.l_flags_1' clear in the static link map, 
and I can see `__builtin_expect' has been since changed in elf/dl-load.c 
to `__glibc_unlikely' already so nothing to do here; obviously I must have 
not been the only one getting confused here.

 I'm not prepared to properly regression-test a change right away, but 
meanwhile can you try the below as a proposed fix?  Also I'm not sure 
offhand how to make a test case that covers this issue, but I'll be happy 
to accept ideas and implement them.

2015-08-04  Maciej W. Rozycki  <macro@linux-mips.org>

	[BZ #17250]
	* elf/dl-support.c (_dl_main_map): Don't initialize l_flags_1
	member.

  Maciej

glibc-static-dlopen-17250.diff
  

Comments

Andreas Schwab Aug. 5, 2015, 12:09 p.m. UTC | #1
"Maciej W. Rozycki" <macro@linux-mips.org> writes:

>  So the solution is to leave `.l_flags_1' clear in the static link map, 
> and I can see `__builtin_expect' has been since changed in elf/dl-load.c 
> to `__glibc_unlikely' already so nothing to do here; obviously I must have 
> not been the only one getting confused here.

The __builtin_expect -> __glibc_(un)likely change was a cleanup, nothing
about getting confused.

>  I'm not prepared to properly regression-test a change right away, but 
> meanwhile can you try the below as a proposed fix?

This fixes the testcase.

> Also I'm not sure offhand how to make a test case that covers this
> issue, but I'll be happy to accept ideas and implement them.

A testcase will have to make sure nscd isn't involved, which makes it
more difficult to implement.

Andreas.
  
Maciej W. Rozycki Aug. 5, 2015, 12:46 p.m. UTC | #2
On Wed, 5 Aug 2015, Andreas Schwab wrote:

> >  So the solution is to leave `.l_flags_1' clear in the static link map, 
> > and I can see `__builtin_expect' has been since changed in elf/dl-load.c 
> > to `__glibc_unlikely' already so nothing to do here; obviously I must have 
> > not been the only one getting confused here.
> 
> The __builtin_expect -> __glibc_(un)likely change was a cleanup, nothing
> about getting confused.

 Hmm, what was the purpose of introducing `__glibc_(un)likely' then?

  Maciej
  
Andreas Schwab Aug. 5, 2015, 1 p.m. UTC | #3
"Maciej W. Rozycki" <macro@linux-mips.org> writes:

> On Wed, 5 Aug 2015, Andreas Schwab wrote:
>
>> >  So the solution is to leave `.l_flags_1' clear in the static link map, 
>> > and I can see `__builtin_expect' has been since changed in elf/dl-load.c 
>> > to `__glibc_unlikely' already so nothing to do here; obviously I must have 
>> > not been the only one getting confused here.
>> 
>> The __builtin_expect -> __glibc_(un)likely change was a cleanup, nothing
>> about getting confused.
>
>  Hmm, what was the purpose of introducing `__glibc_(un)likely' then?

A nicer interface.

Andreas.
  
Mike Frysinger Aug. 6, 2015, 6:04 a.m. UTC | #4
On 05 Aug 2015 15:00, Andreas Schwab wrote:
> "Maciej W. Rozycki" <macro@linux-mips.org> writes:
> > On Wed, 5 Aug 2015, Andreas Schwab wrote:
> >> >  So the solution is to leave `.l_flags_1' clear in the static link map, 
> >> > and I can see `__builtin_expect' has been since changed in elf/dl-load.c 
> >> > to `__glibc_unlikely' already so nothing to do here; obviously I must have 
> >> > not been the only one getting confused here.
> >> 
> >> The __builtin_expect -> __glibc_(un)likely change was a cleanup, nothing
> >> about getting confused.
> >
> >  Hmm, what was the purpose of introducing `__glibc_(un)likely' then?
> 
> A nicer interface.

... because the raw gcc builtin is confusing ;)
-mike
  
Mike Frysinger Aug. 6, 2015, 6:06 a.m. UTC | #5
On 04 Aug 2015 21:52, Maciej W. Rozycki wrote:
> On Tue, 4 Aug 2015, Andreas Schwab wrote:
> > >  What problem are you seeing or trying to solve?
> > 
> > https://sourceware.org/bugzilla/show_bug.cgi?id=17250
> 
>  Umm, thanks.  Having looked through the patch again it looks to me like I 
> got confused with the negative `__builtin_expect' expression.  My change 
> was not neutral as intended as it's where DF_1_NODEFLIB is *set* that the 
> conditional executes.

we cc-ed you on it originally ... do you not see bugzilla e-mails ?
i'm just wondering how to address the visibility aspect.

>  I'm not prepared to properly regression-test a change right away, but 
> meanwhile can you try the below as a proposed fix?  Also I'm not sure 
> offhand how to make a test case that covers this issue, but I'll be happy 
> to accept ideas and implement them.

two were posted to the bug.  if you use tests-static in the Makefile,
should be fairly easy to add to the tree and check the result.
-mike
  
Maciej W. Rozycki Aug. 6, 2015, 9:52 p.m. UTC | #6
On Thu, 6 Aug 2015, Mike Frysinger wrote:

> >  Umm, thanks.  Having looked through the patch again it looks to me like I 
> > got confused with the negative `__builtin_expect' expression.  My change 
> > was not neutral as intended as it's where DF_1_NODEFLIB is *set* that the 
> > conditional executes.
> 
> we cc-ed you on it originally ... do you not see bugzilla e-mails ?
> i'm just wondering how to address the visibility aspect.

 I checked and I still have the bugzilla e-mails in my mailbox unarchived.  
Which means I saw them and marked for later processing.  I must have lost 
them from my radar then, sorry about that.  Last autumn was hectic for me.

 In any case pinging does help, as it did this time (thanks, Andreas!).

> >  I'm not prepared to properly regression-test a change right away, but 
> > meanwhile can you try the below as a proposed fix?  Also I'm not sure 
> > offhand how to make a test case that covers this issue, but I'll be happy 
> > to accept ideas and implement them.
> 
> two were posted to the bug.  if you use tests-static in the Makefile,
> should be fairly easy to add to the tree and check the result.

 Except that they cover the usual system-installed use case and we run our 
test suite over uninstalled libraries.  Normally that does not matter, but 
here it does.  Unless I'm missing something that is, am I?

  Maciej
  
Mike Frysinger Aug. 7, 2015, 3:15 a.m. UTC | #7
On 06 Aug 2015 22:52, Maciej W. Rozycki wrote:
> On Thu, 6 Aug 2015, Mike Frysinger wrote:
> > >  I'm not prepared to properly regression-test a change right away, but 
> > > meanwhile can you try the below as a proposed fix?  Also I'm not sure 
> > > offhand how to make a test case that covers this issue, but I'll be happy 
> > > to accept ideas and implement them.
> > 
> > two were posted to the bug.  if you use tests-static in the Makefile,
> > should be fairly easy to add to the tree and check the result.
> 
>  Except that they cover the usual system-installed use case and we run our 
> test suite over uninstalled libraries.  Normally that does not matter, but 
> here it does.  Unless I'm missing something that is, am I?

the test requires:
 - it be statically linked
   -> use tests-static in the makefile
 - the db accessed contains the data we request
   -> nptl/tst-setuid1.c seems to already assume "nobody" is a valid acct
 - it have access to the nss libs to load
   -> set LD_LIBRARY_PATH to the local paths using the xxx-env hooks if
      existing run logic doesn't already do it

what other external deps did you have in mind ?

i just tested it over here and it semed fine:
$ gcc test.c -B. -static
$ LD_LIBRARY_PATH=$PWD/nis:$PWD/nss:$PWD:$PWD/elf strace -e file ./a.out
<all libs are loaded from $PWD>
/etc/nsswitch.conf & /etc/passwd are read, but i think our tests already
rely on that ...
-mike
  
Andreas Schwab Aug. 10, 2015, 8:44 a.m. UTC | #8
Mike Frysinger <vapier@gentoo.org> writes:

> the test requires:
>  - it be statically linked
>    -> use tests-static in the makefile
>  - the db accessed contains the data we request
>    -> nptl/tst-setuid1.c seems to already assume "nobody" is a valid acct
>  - it have access to the nss libs to load
>    -> set LD_LIBRARY_PATH to the local paths using the xxx-env hooks if
>       existing run logic doesn't already do it

   - make sure to bypass a running nscd

Andreas.
  
Mike Frysinger Aug. 10, 2015, 9:53 a.m. UTC | #9
On 10 Aug 2015 10:44, Andreas Schwab wrote:
> Mike Frysinger <vapier@gentoo.org> writes:
> > the test requires:
> >  - it be statically linked
> >    -> use tests-static in the makefile
> >  - the db accessed contains the data we request
> >    -> nptl/tst-setuid1.c seems to already assume "nobody" is a valid acct
> >  - it have access to the nss libs to load
> >    -> set LD_LIBRARY_PATH to the local paths using the xxx-env hooks if
> >       existing run logic doesn't already do it
> 
>    - make sure to bypass a running nscd

that would be nice, but i don't think it's a blocker to adding the test.
is there even such a mechanism in place now ?
-mike
  
Andreas Schwab Aug. 10, 2015, 9:58 a.m. UTC | #10
Mike Frysinger <vapier@gentoo.org> writes:

> is there even such a mechanism in place now ?

Not AFAIK.

Andreas.
  
Maciej W. Rozycki Aug. 12, 2015, 2:50 p.m. UTC | #11
On Thu, 6 Aug 2015, Mike Frysinger wrote:

> > > >  I'm not prepared to properly regression-test a change right away, but 
> > > > meanwhile can you try the below as a proposed fix?  Also I'm not sure 
> > > > offhand how to make a test case that covers this issue, but I'll be happy 
> > > > to accept ideas and implement them.
> > > 
> > > two were posted to the bug.  if you use tests-static in the Makefile,
> > > should be fairly easy to add to the tree and check the result.
> > 
> >  Except that they cover the usual system-installed use case and we run our 
> > test suite over uninstalled libraries.  Normally that does not matter, but 
> > here it does.  Unless I'm missing something that is, am I?
> 
> the test requires:
>  - it be statically linked
>    -> use tests-static in the makefile
>  - the db accessed contains the data we request
>    -> nptl/tst-setuid1.c seems to already assume "nobody" is a valid acct
>  - it have access to the nss libs to load
>    -> set LD_LIBRARY_PATH to the local paths using the xxx-env hooks if
>       existing run logic doesn't already do it
> 
> what other external deps did you have in mind ?
> 
> i just tested it over here and it semed fine:
> $ gcc test.c -B. -static
> $ LD_LIBRARY_PATH=$PWD/nis:$PWD/nss:$PWD:$PWD/elf strace -e file ./a.out
> <all libs are loaded from $PWD>
> /etc/nsswitch.conf & /etc/passwd are read, but i think our tests already
> rely on that ...

 The bug affects any modules loaded dynamically from a static executable, 
so a dependency upon NSS having been configured in a particular way (mind 
the `--enable-static-nss' `configure' option) is not required.  An 
explicit test case triggering the problem is better than one indirectly 
relying on other subsystems anyway.  Such code can be easily written based 
on the nature of the problem.

 However according to both the bug report and how code affected looks like 
the problem only triggers for dynamic shared objects to be pulled from 
system directories, such as /lib or /usr/lib (assuming the usual system 
prefix), and I gather any further ones listed in /etc/ld.so.conf.  These 
are only populated at the install stage and therefore not relevant for 
regression testing.

 Furthermore the bug report explicitly refers to the setting of 
LD_LIBRARY_PATH as a workaround for the problem, so such an arrangement 
obviously cannot be used to trigger the bug.  So while code to trigger the 
problem can be easily made and run standalone, we currently have no 
provisions available to integrate it into our test suite.

 There was a discussion at the Glibc BOF at this year's GNU Tools Cauldron 
as to how to make the testing of libraries possible in cases where the 
root privilege or operation in a fully installed environment would 
normally be required to cover a problem being concerned, however at this 
point we are quite far from getting there I believe.

 Have I missed anything?

  Maciej
  
Mike Frysinger Aug. 12, 2015, 4:42 p.m. UTC | #12
On 12 Aug 2015 15:50, Maciej W. Rozycki wrote:
>  However according to both the bug report and how code affected looks like 
> the problem only triggers for dynamic shared objects to be pulled from 
> system directories, such as /lib or /usr/lib (assuming the usual system 
> prefix), and I gather any further ones listed in /etc/ld.so.conf.  These 
> are only populated at the install stage and therefore not relevant for 
> regression testing.
> 
>  Furthermore the bug report explicitly refers to the setting of 
> LD_LIBRARY_PATH as a workaround for the problem, so such an arrangement 
> obviously cannot be used to trigger the bug.  So while code to trigger the 
> problem can be easily made and run standalone, we currently have no 
> provisions available to integrate it into our test suite.

these two points makes writing a test case currently generally infeasible.
it could be added under a xfail section so that if we ever get to a place
where it could be exercised, we have it at hand.

having a testcase is not a hard requirement.  we really want one, but we
are also realistic -- if coming up with one is infeasible relative to the
fixing of the bug, then we just fix the bug.  i think this is such a case.
-mike
  
Maciej W. Rozycki Aug. 17, 2015, 2:08 p.m. UTC | #13
On Wed, 12 Aug 2015, Mike Frysinger wrote:

> having a testcase is not a hard requirement.  we really want one, but we
> are also realistic -- if coming up with one is infeasible relative to the
> fixing of the bug, then we just fix the bug.  i think this is such a case.

 Does anyone else have anything to add?  Otherwise with 2.22 out I'd like 
to push the fix to our trunk now.

 Also I gather we want backports to release branches.  The issue is 
currently present on 2.22, 2.21, 2.20 and 2.19 branches.  Any reason to 
omit any of them?

  Maciej
  
Andreas Schwab Sept. 7, 2015, 2:14 p.m. UTC | #14
"Maciej W. Rozycki" <macro@linux-mips.org> writes:

>  Does anyone else have anything to add?  Otherwise with 2.22 out I'd like 
> to push the fix to our trunk now.

Please go ahead.

Andreas.
  
Maciej W. Rozycki Sept. 25, 2015, 9:22 a.m. UTC | #15
On Mon, 7 Sep 2015, Andreas Schwab wrote:

> >  Does anyone else have anything to add?  Otherwise with 2.22 out I'd like 
> > to push the fix to our trunk now.
> 
> Please go ahead.

 Applied now, thanks for patience.

  Maciej
  
Mike Frysinger Sept. 25, 2015, 2:06 p.m. UTC | #16
On 25 Sep 2015 10:22, Maciej W. Rozycki wrote:
> On Mon, 7 Sep 2015, Andreas Schwab wrote:
> > >  Does anyone else have anything to add?  Otherwise with 2.22 out I'd like 
> > > to push the fix to our trunk now.
> > 
> > Please go ahead.
> 
>  Applied now, thanks for patience.

how confident are we with stability here ?  i.e. backporting to 2.2[012] ?
-mike
  

Patch

Index: glibc/elf/dl-support.c
===================================================================
--- glibc.orig/elf/dl-support.c	2015-08-04 20:10:10.297536823 +0100
+++ glibc/elf/dl-support.c	2015-08-04 20:10:33.180771532 +0100
@@ -91,7 +91,6 @@  static struct link_map _dl_main_map =
     .l_scope = _dl_main_map.l_scope_mem,
     .l_local_scope = { &_dl_main_map.l_searchlist },
     .l_used = 1,
-    .l_flags_1 = DF_1_NODEFLIB,
     .l_tls_offset = NO_TLS_OFFSET,
     .l_serial = 1,
   };