[BZ,#18422] elf/tst-audit tests fail without PLT entries

Message ID 20150523131408.GA18203@gmail.com
State Superseded
Headers

Commit Message

H.J. Lu May 23, 2015, 1:14 p.m. UTC
  PLT entries aren't required when -z now used.  Linker on master with:

commit 25070364b0ce33eed46aa5d78ebebbec6accec7e
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sat May 16 07:00:21 2015 -0700

    Don't generate PLT relocations for now binding

    There is no need for PLT relocations with -z now. We can use GOT
    relocations, which take less space, instead and replace 16-byte .plt
    entres with 8-byte .plt.got entries.

    bfd/

      * elf32-i386.c (elf_i386_check_relocs): Create .plt.got section
      for now binding.
      (elf_i386_allocate_dynrelocs): Use .plt.got section for now
      binding.
      * elf64-x86-64.c (elf_x86_64_check_relocs): Create .plt.got
      section for now binding.
      (elf_x86_64_allocate_dynrelocs): Use .plt.got section for now
      binding.

won't generate PLT entries with -z now.  elf/tst-audit2.c has

/* This calloc definition will be called by the dynamic linker itself.
   We test that it has initialized our TLS block by the time it does so.
*/

void *
calloc (size_t n, size_t m)
{
  if (magic[0] != MAGIC1 || magic[1] != MAGIC2)
    {
      printf ("{%x, %x} != {%x, %x}\n", magic[0], magic[1], MAGIC1, MAGIC2);
      abort ();
    }
  magic[0] = MAGIC2;
  magic[1] = MAGIC1;

Since ld.so is built with -z now, there are no PLT relocations and this
calloc won't be used:

Relocation section '.rela.dyn' at offset 0x66c contains 16 entries:
 Offset     Info    Type            Sym.Value  Sym. Name + Addend
00222e88  00000008 R_X86_64_RELATIVE            10970
00222e8c  00000008 R_X86_64_RELATIVE            f8f0
00222e90  00000008 R_X86_64_RELATIVE            f6c0
00222e94  00000008 R_X86_64_RELATIVE            11bd0
00222e98  00000008 R_X86_64_RELATIVE            a1b0
00222e9c  00000008 R_X86_64_RELATIVE            130c0
00222ea0  00000008 R_X86_64_RELATIVE            13c50
00222ea4  00000008 R_X86_64_RELATIVE            15fc0
00222ea8  00000008 R_X86_64_RELATIVE            12cd0
00222eac  00000008 R_X86_64_RELATIVE            17f90
00222fbc  00000a06 R_X86_64_GLOB_DAT 00018230	__libc_memalign@@GLIBC_2.16 + 0
00222fc4  00001506 R_X86_64_GLOB_DAT 00018300   malloc@@GLIBC_2.16 + 0
00222fcc  00000d06 R_X86_64_GLOB_DAT 00018310   calloc@@GLIBC_2.16 + 0
00222fd4  00000506 R_X86_64_GLOB_DAT 000184a0   realloc@@GLIBC_2.16 + 0
00222fdc  00000706 R_X86_64_GLOB_DAT 002239a0   _r_debug@@GLIBC_2.16 + 0
00222fe4  00000406 R_X86_64_GLOB_DAT 00018340   free@@GLIBC_2.16 + 0

Assuming we do want to keep PLT relocations in ld.so so that malloc
functions in ld.so can be overridden, ld.so should be built with -z now.
There is no reason to build ld.so with -z now since ld.so is the one
doing BIND_NOW.  The only thing we get with -z now on ld.so is DT tag:

 0x0000000000000018 (BIND_NOW)
 0x000000006ffffffb (FLAGS_1)            Flags: NOW

This patch removes -Wl,-z,now from ld.so build.

OK for master?

H.J.
	[BZ #18422]
	* elf/Makefile (z-now-yes): Removed.
	($(objpfx)ld.so): Remove $(z-now-$(bind-now)).
---
 elf/Makefile | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
  

Comments

Carlos O'Donell May 24, 2015, 3:37 a.m. UTC | #1
On 05/23/2015 09:14 AM, H.J. Lu wrote:
> Since ld.so is built with -z now, there are no PLT relocations and this
> calloc won't be used:

Which is bad. We always want these functions to be interposable by all
of the analysis tools that want and need to track memory allocations.
Thus it is not just the test that matters.

> Relocation section '.rela.dyn' at offset 0x66c contains 16 entries:
>  Offset     Info    Type            Sym.Value  Sym. Name + Addend
> 00222e88  00000008 R_X86_64_RELATIVE            10970
> 00222e8c  00000008 R_X86_64_RELATIVE            f8f0
> 00222e90  00000008 R_X86_64_RELATIVE            f6c0
> 00222e94  00000008 R_X86_64_RELATIVE            11bd0
> 00222e98  00000008 R_X86_64_RELATIVE            a1b0
> 00222e9c  00000008 R_X86_64_RELATIVE            130c0
> 00222ea0  00000008 R_X86_64_RELATIVE            13c50
> 00222ea4  00000008 R_X86_64_RELATIVE            15fc0
> 00222ea8  00000008 R_X86_64_RELATIVE            12cd0
> 00222eac  00000008 R_X86_64_RELATIVE            17f90
> 00222fbc  00000a06 R_X86_64_GLOB_DAT 00018230	__libc_memalign@@GLIBC_2.16 + 0
> 00222fc4  00001506 R_X86_64_GLOB_DAT 00018300   malloc@@GLIBC_2.16 + 0
> 00222fcc  00000d06 R_X86_64_GLOB_DAT 00018310   calloc@@GLIBC_2.16 + 0
> 00222fd4  00000506 R_X86_64_GLOB_DAT 000184a0   realloc@@GLIBC_2.16 + 0
> 00222fdc  00000706 R_X86_64_GLOB_DAT 002239a0   _r_debug@@GLIBC_2.16 + 0
> 00222fe4  00000406 R_X86_64_GLOB_DAT 00018340   free@@GLIBC_2.16 + 0

Doesn't this also cause check-localplt to fail for ld.so given
that calloc and others are no longer R_X86_64_JUMP_SLOT?

> Assuming we do want to keep PLT relocations in ld.so so that malloc
> functions in ld.so can be overridden, ld.so should be built with -z now.
> There is no reason to build ld.so with -z now since ld.so is the one
> doing BIND_NOW.  The only thing we get with -z now on ld.so is DT tag:
> 
>  0x0000000000000018 (BIND_NOW)
>  0x000000006ffffffb (FLAGS_1)            Flags: NOW
> 
> This patch removes -Wl,-z,now from ld.so build.
> 
> OK for master?

No. I'd like to see more discussion on this.

I don't see any other way forward, and I agree that DT_BIND_NOW seems
a bit silly for the linker since it itself is the component responsible
for that binding.

My worry is that the missing DT tag is going to have security implications.
I'm including several other distro people on the TO.

The first thing I'll have to explain is "Why doesn't ld.so meet full RELRO?"
Since full RELRO requires DT_BIND_NOW + RO segments. Does this mean ld.so
with your patch will by lazily bound and not mark it's own PLT immediately RO?
That seems wrong.

Given the wrongness I'd like to see more discussion, and it's late in my TZ
so I'm not up for a detailed response yet.

However, the first thing that pops into mind is that this wrong, but wrong
in the sense that I don't know if the static linker can even make this choice
(removing the PLT) without information from the user.

The notion of -z,now means "Bind symbols now, not lazily", but that doesn't
give you enough information to elide the PLT and use the GOT directly since
the interposition is still useful and relied upon semantic in ELF.

Cheers,
Carlos.

> H.J.
> 	[BZ #18422]
> 	* elf/Makefile (z-now-yes): Removed.
> 	($(objpfx)ld.so): Remove $(z-now-$(bind-now)).
> ---
>  elf/Makefile | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/elf/Makefile b/elf/Makefile
> index 34450ea..324b4a2 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -349,13 +349,11 @@ $(objpfx)librtld.os: $(objpfx)dl-allobjs.os $(objpfx)rtld-libc.a
>  
>  generated += librtld.map librtld.mk rtld-libc.a librtld.os.map
>  
> -z-now-yes = -Wl,-z,now
> -
>  $(objpfx)ld.so: $(objpfx)librtld.os $(ld-map)
>  # Link into a temporary file so that we don't touch $@ at all
>  # if the sanity check below fails.
>  	$(LINK.o) -nostdlib -nostartfiles -shared -o $@.new		\
> -		  $(LDFLAGS-rtld) -Wl,-z,defs $(z-now-$(bind-now))	\
> +		  $(LDFLAGS-rtld) -Wl,-z,defs				\
>  		  $(filter-out $(map-file),$^) $(load-map-file)		\
>  		  -Wl,-soname=$(rtld-installed-name)			\
>  		  -Wl,-defsym=_begin=0
>
  
H.J. Lu May 24, 2015, 12:45 p.m. UTC | #2
On Sat, May 23, 2015 at 8:37 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 05/23/2015 09:14 AM, H.J. Lu wrote:
>> Since ld.so is built with -z now, there are no PLT relocations and this
>> calloc won't be used:
>
> Which is bad. We always want these functions to be interposable by all
> of the analysis tools that want and need to track memory allocations.
> Thus it is not just the test that matters.
>
...
>> 00222fc4  00001506 R_X86_64_GLOB_DAT 00018300   malloc@@GLIBC_2.16 + 0
>> 00222fcc  00000d06 R_X86_64_GLOB_DAT 00018310   calloc@@GLIBC_2.16 + 0
>> 00222fd4  00000506 R_X86_64_GLOB_DAT 000184a0   realloc@@GLIBC_2.16 + 0
>> 00222fdc  00000706 R_X86_64_GLOB_DAT 002239a0   _r_debug@@GLIBC_2.16 + 0
>> 00222fe4  00000406 R_X86_64_GLOB_DAT 00018340   free@@GLIBC_2.16 + 0
>
> Doesn't this also cause check-localplt to fail for ld.so given
> that calloc and others are no longer R_X86_64_JUMP_SLOT?

Yes.

>> Assuming we do want to keep PLT relocations in ld.so so that malloc
>> functions in ld.so can be overridden, ld.so should be built with -z now.
>> There is no reason to build ld.so with -z now since ld.so is the one
>> doing BIND_NOW.  The only thing we get with -z now on ld.so is DT tag:
>>
>>  0x0000000000000018 (BIND_NOW)
>>  0x000000006ffffffb (FLAGS_1)            Flags: NOW
>>
>> This patch removes -Wl,-z,now from ld.so build.
>>
>> OK for master?
>
> No. I'd like to see more discussion on this.
>
> I don't see any other way forward, and I agree that DT_BIND_NOW seems
> a bit silly for the linker since it itself is the component responsible
> for that binding.
>
> My worry is that the missing DT tag is going to have security implications.
> I'm including several other distro people on the TO.
>
> The first thing I'll have to explain is "Why doesn't ld.so meet full RELRO?"
> Since full RELRO requires DT_BIND_NOW + RO segments. Does this mean ld.so
> with your patch will by lazily bound and not mark it's own PLT immediately RO?
> That seems wrong.

My patch just doesn't use -z now to build ld.so when --enable-bind-now is
used and ld.so always gets PLT with/without  --enable-bind-now.  It is the
same as before when building glibc with the older ld.  The only difference now
is we no long generate DT_BIND_NOW even when --enable-bind-now is used.
I don't believe DT_BIND_NOW should make a difference on ld.so.  If it is, it is
a separate bug and we should fix it.

> Given the wrongness I'd like to see more discussion, and it's late in my TZ
> so I'm not up for a detailed response yet.
>
> However, the first thing that pops into mind is that this wrong, but wrong
> in the sense that I don't know if the static linker can even make this choice
> (removing the PLT) without information from the user.

Passing -z now tells ld that we don't do lazy bind, the only thing which PLT
is used for.  ld also implemented an optimization to generate .plt.got, which
uses the GOT slot, instead of the GOTPLT slot, when we need the PLT entry
without PLT relocation.

https://groups.google.com/forum/#!topic/x86-64-abi/LDuYGdXGskY

> The notion of -z,now means "Bind symbols now, not lazily", but that doesn't
> give you enough information to elide the PLT and use the GOT directly since
> the interposition is still useful and relied upon semantic in ELF.

You still get the interposition with GOT, just not lazy bind, which needs PLT
relocation. GCC 6 even has a new option, -fno-plt, to avoid PLT:

https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00231.html

ld.so is a special case and needs PLT for the interposition unless
ld.so re-applies
GOT relocations on itself after all modules are loaded.
  
Carlos O'Donell May 25, 2015, 2:06 a.m. UTC | #3
On 05/24/2015 08:45 AM, H.J. Lu wrote:
> On Sat, May 23, 2015 at 8:37 PM, Carlos O'Donell <carlos@redhat.com> wrote:
>> On 05/23/2015 09:14 AM, H.J. Lu wrote:
>>> Since ld.so is built with -z now, there are no PLT relocations and this
>>> calloc won't be used:
>>
>> Which is bad. We always want these functions to be interposable by all
>> of the analysis tools that want and need to track memory allocations.
>> Thus it is not just the test that matters.
>>
> ...
>>> 00222fc4  00001506 R_X86_64_GLOB_DAT 00018300   malloc@@GLIBC_2.16 + 0
>>> 00222fcc  00000d06 R_X86_64_GLOB_DAT 00018310   calloc@@GLIBC_2.16 + 0
>>> 00222fd4  00000506 R_X86_64_GLOB_DAT 000184a0   realloc@@GLIBC_2.16 + 0
>>> 00222fdc  00000706 R_X86_64_GLOB_DAT 002239a0   _r_debug@@GLIBC_2.16 + 0
>>> 00222fe4  00000406 R_X86_64_GLOB_DAT 00018340   free@@GLIBC_2.16 + 0
>>
>> Doesn't this also cause check-localplt to fail for ld.so given
>> that calloc and others are no longer R_X86_64_JUMP_SLOT?
> 
> Yes.

Good.

>>> Assuming we do want to keep PLT relocations in ld.so so that malloc
>>> functions in ld.so can be overridden, ld.so should be built with -z now.
>>> There is no reason to build ld.so with -z now since ld.so is the one
>>> doing BIND_NOW.  The only thing we get with -z now on ld.so is DT tag:
>>>
>>>  0x0000000000000018 (BIND_NOW)
>>>  0x000000006ffffffb (FLAGS_1)            Flags: NOW
>>>
>>> This patch removes -Wl,-z,now from ld.so build.
>>>
>>> OK for master?
>>
>> No. I'd like to see more discussion on this.
>>
>> I don't see any other way forward, and I agree that DT_BIND_NOW seems
>> a bit silly for the linker since it itself is the component responsible
>> for that binding.
>>
>> My worry is that the missing DT tag is going to have security implications.
>> I'm including several other distro people on the TO.
>>
>> The first thing I'll have to explain is "Why doesn't ld.so meet full RELRO?"
>> Since full RELRO requires DT_BIND_NOW + RO segments. Does this mean ld.so
>> with your patch will by lazily bound and not mark it's own PLT immediately RO?
>> That seems wrong.
> 
> My patch just doesn't use -z now to build ld.so when --enable-bind-now is
> used and ld.so always gets PLT with/without  --enable-bind-now.  It is the
> same as before when building glibc with the older ld.  The only difference now
> is we no long generate DT_BIND_NOW even when --enable-bind-now is used.
> I don't believe DT_BIND_NOW should make a difference on ld.so.  If it is, it is
> a separate bug and we should fix it.

The patch is a hack.

- ld.so should continue to have DT_BIND_NOW since we are requesting that
  any symbol binding done in ld.so be done at startup and not lazily, and
  generic security tools should be able to check all binaries including
  ld.so to see that this is enabled and requested without making special
  exceptions.

- ld.so can have no PLT if that is what happens as a result of a dynamic
  loader optimization triggered by -Wl,-z,now. That seems like a useful
  optimization which would be disabled by your suggested patch?

- ld.so should still support ELF symbol interposition, which is orthogonal
  to lazy binding. Lazy binding and ELF symbol interposition should not be
  conflated.

>> The notion of -z,now means "Bind symbols now, not lazily", but that doesn't
>> give you enough information to elide the PLT and use the GOT directly since
>> the interposition is still useful and relied upon semantic in ELF.
> 
> You still get the interposition with GOT, just not lazy bind, which needs PLT
> relocation.

Good point.

> GCC 6 even has a new option, -fno-plt, to avoid PLT:
> 
> https://gcc.gnu.org/ml/gcc-patches/2015-05/msg00231.html

GCC has many options that must be used with care and foresight.
 
> ld.so is a special case and needs PLT for the interposition unless
> ld.so re-applies GOT relocations on itself after all modules are loaded.

And that is the *real* bug. Please fix that instead of removing -Wl,-z,now.
It should be a net-zero performance chance since either you are doing
relocations for the GOT and then the PLT, or the GOT twice?

I think it is important we continue to build ld.so with -Wl,-z,now, both
for the optimizations, and for consistency among our security tooling.

I'm happy to hear input from others.

Have you checked to see how your patch to remove the PLT impacts analysis
tooling like Asan and Valgrind?

Cheers,
Carlos.
  
Florian Weimer May 26, 2015, 11:41 a.m. UTC | #4
On 05/24/2015 05:37 AM, Carlos O'Donell wrote:

> My worry is that the missing DT tag is going to have security implications.
> I'm including several other distro people on the TO.

I tried to raise awareness of this change here:

  <http://www.openwall.com/lists/oss-security/2015/05/26/2>

Personally, I cannot offer much guidance on the potential impact of this
proposal at present, I'm afraid.
  

Patch

diff --git a/elf/Makefile b/elf/Makefile
index 34450ea..324b4a2 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -349,13 +349,11 @@  $(objpfx)librtld.os: $(objpfx)dl-allobjs.os $(objpfx)rtld-libc.a
 
 generated += librtld.map librtld.mk rtld-libc.a librtld.os.map
 
-z-now-yes = -Wl,-z,now
-
 $(objpfx)ld.so: $(objpfx)librtld.os $(ld-map)
 # Link into a temporary file so that we don't touch $@ at all
 # if the sanity check below fails.
 	$(LINK.o) -nostdlib -nostartfiles -shared -o $@.new		\
-		  $(LDFLAGS-rtld) -Wl,-z,defs $(z-now-$(bind-now))	\
+		  $(LDFLAGS-rtld) -Wl,-z,defs				\
 		  $(filter-out $(map-file),$^) $(load-map-file)		\
 		  -Wl,-soname=$(rtld-installed-name)			\
 		  -Wl,-defsym=_begin=0