[BZ,#18422] elf/tst-audit tests fail without PLT entries

Message ID CAMe9rOqAzx2xAVeOcgU0qpaHj4=+wWQzhyB3OVTq514ZVBPyyA@mail.gmail.com
State Committed

Commit Message

H.J. Lu May 25, 2015, 6:54 p.m. UTC
  On Sun, May 24, 2015 at 7:06 PM, Carlos O'Donell <carlos@redhat.com> wrote:

> And that is the *real* bug. Please fix that instead of removing -Wl,-z,now.
> It should be a net-zero performance change since either you are doing
> relocations for the GOT and then the PLT, or the GOT twice?

I took a closer look.  Nothing is wrong, except that tst-audit2 expects a
certain order in ld.so.  With a JUMP_SLOT relocation, the GOTPLT entry of
calloc is updated to the calloc defined in tst-audit2:

(gdb) bt
#0  0xf7fe56bd in elf_machine_rel (reloc=<optimized out>,
    skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
    version=<optimized out>, sym=<optimized out>, map=<optimized out>)
    at ../sysdeps/i386/dl-machine.h:329
#1  elf_dynamic_do_Rel (skip_ifunc=<optimized out>, lazy=<optimized out>,
    nrelative=<optimized out>, relsize=<optimized out>,
    reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
#2  _dl_relocate_object (scope=0xf7ffdab8, reloc_mode=reloc_mode@entry=0,
    consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
#3  0xf7fdc648 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
#4  0xf7ff0de7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfb0,
    dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at ../elf/dl-sysdep.c:249
#5  0xf7fddd05 in _dl_start_final (arg=0xffffcfb0) at rtld.c:308
#6  _dl_start (arg=0xffffcfb0) at rtld.c:414
#7  0xf7fd9a87 in _start ()
   from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)

and then calloc is called:

(gdb) c
Continuing.

Breakpoint 4, calloc (n=n@entry=20, m=4) at tst-audit2.c:18
18 {
(gdb) bt
#0  calloc (n=n@entry=20, m=4) at tst-audit2.c:18
#1  0xf7fe668d in _dl_relocate_object (scope=0xf7ffdab8,
    reloc_mode=reloc_mode@entry=0, consider_profiling=1,
    consider_profiling@entry=0) at dl-reloc.c:272
#2  0xf7fdc648 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
#3  0xf7ff0de7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfb0,
    dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at ../elf/dl-sysdep.c:249
#4  0xf7fddd05 in _dl_start_final (arg=0xffffcfb0) at rtld.c:308
#5  _dl_start (arg=0xffffcfb0) at rtld.c:414
#6  0xf7fd9a87 in _start ()
   from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)

With GLOB_DAT relocation, calloc in ld.so is called first:

(gdb) bt
#0  calloc (nmemb=20, size=4) at dl-minimal.c:102
#1  0xf7fe665d in _dl_relocate_object (scope=0xf7fcfb20, reloc_mode=1,
    consider_profiling=1) at dl-reloc.c:272
#2  0xf7fdc500 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2074
#3  0xf7ff0db7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfa0,
    dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at ../elf/dl-sysdep.c:249
#4  0xf7fddcd5 in _dl_start_final (arg=0xffffcfa0) at rtld.c:308
#5  _dl_start (arg=0xffffcfa0) at rtld.c:414
#6  0xf7fd9a57 in _start ()
   from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)

and then the GOT entry of calloc is updated:

(gdb) bt
#0  0xf7fe568d in elf_machine_rel (reloc=<optimized out>,
    skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
    version=<optimized out>, sym=<optimized out>, map=<optimized out>)
    at ../sysdeps/i386/dl-machine.h:329
#1  elf_dynamic_do_Rel (skip_ifunc=<optimized out>, lazy=<optimized out>,
    nrelative=<optimized out>, relsize=<optimized out>,
    reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
#2  _dl_relocate_object (scope=0xf7ffdab8, reloc_mode=reloc_mode@entry=0,
    consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
#3  0xf7fdc618 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2133
#4  0xf7ff0db7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfa0,
    dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at ../elf/dl-sysdep.c:249
#5  0xf7fddcd5 in _dl_start_final (arg=0xffffcfa0) at rtld.c:308
#6  _dl_start (arg=0xffffcfa0) at rtld.c:414
#7  0xf7fd9a57 in _start ()
   from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)

After that, calloc isn't called, so magic in tst-audit2 isn't updated.  Both
orders are correct.  Here is a patch to make sure that calloc in tst-audit2.c is
called at least once from ld.so.
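
The two orderings can be modelled outside ld.so. What follows is a minimal sketch, not glibc code: a function pointer stands in for the GOT/GOTPLT slot of calloc, `minimal_calloc` stands in for the dl-minimal.c version, and `test_calloc` for the interposed one in tst-audit2.c. All of these names, and `run_startup`, are hypothetical.

```c
/* Hypothetical stand-ins; none of these names exist in glibc.  */
typedef int (*calloc_slot_t) (void);

/* Models "magic" being rewritten by the test's calloc.  */
static int magic_updated;

/* Models calloc from dl-minimal.c: it does not touch the test's state.  */
static int
minimal_calloc (void)
{
  return 0;
}

/* Models the interposed calloc defined in tst-audit2.c.  */
static int
test_calloc (void)
{
  magic_updated = 1;
  return 1;
}

/* Perform "relocate the slot" and "call calloc once" in either order
   and report whether the interposed calloc was reached.  */
int
run_startup (int relocate_before_call)
{
  calloc_slot_t slot = minimal_calloc;  /* Slot content before relocation.  */
  magic_updated = 0;

  if (relocate_before_call)
    {
      slot = test_calloc;   /* JUMP_SLOT case: the slot is updated first...  */
      slot ();              /* ...and then ld.so calls calloc.  */
    }
  else
    {
      slot ();              /* GLOB_DAT case: ld.so calls calloc first...  */
      slot = test_calloc;   /* ...and then the slot is updated.  */
    }
  return magic_updated;
}
```

In this model `run_startup (1)` reaches the interposed function and `run_startup (0)` does not, which mirrors why the test only fails with the second ordering even though both are valid.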

> I think it is important we continue to build ld.so with -Wl,-z,now, both
> for the optimizations, and for consistency among our security tooling.
>
> I'm happy to hear input from others.
>
> Have you checked to see how your patch to remove the PLT impacts analysis
> tooling like Asan and Valgrind?
>

No, I didn't test them, nor do I expect any problems.
  

Comments

Carlos O'Donell May 25, 2015, 8:01 p.m. UTC | #1
On 05/25/2015 02:54 PM, H.J. Lu wrote:
> On Sun, May 24, 2015 at 7:06 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> 
>> And that is the *real* bug. Please fix that instead of removing -Wl,-z,now.
>> It should be a net-zero performance change since either you are doing
>> relocations for the GOT and then the PLT, or the GOT twice?
> I took a closer look.  Nothing is wrong, except that tst-audit2 expects a
> certain order in ld.so.  With a JUMP_SLOT relocation, the GOTPLT entry of
> calloc is updated to the calloc defined in tst-audit2:

Thank you very much for looking into this.

I'm glad to see that it does work, but that it is an ordering issue
between JUMP_SLOT and GOTPLT processing.

I have one question, please see below.

> 
> (gdb) bt
> #0  0xf7fe56bd in elf_machine_rel (reloc=<optimized out>,
>     skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
>     version=<optimized out>, sym=<optimized out>, map=<optimized out>)
>     at ../sysdeps/i386/dl-machine.h:329
> #1  elf_dynamic_do_Rel (skip_ifunc=<optimized out>, lazy=<optimized out>,
>     nrelative=<optimized out>, relsize=<optimized out>,
>     reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
> #2  _dl_relocate_object (scope=0xf7ffdab8, reloc_mode=reloc_mode@entry=0,
>     consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
> #3  0xf7fdc648 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
>     user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
> #4  0xf7ff0de7 in _dl_sysdep_start (
>     start_argptr=start_argptr@entry=0xffffcfb0,
>     dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at ../elf/dl-sysdep.c:249
> #5  0xf7fddd05 in _dl_start_final (arg=0xffffcfb0) at rtld.c:308
> #6  _dl_start (arg=0xffffcfb0) at rtld.c:414
> #7  0xf7fd9a87 in _start ()
>    from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
> (gdb)
> 
> and then calloc is called:
> 
> (gdb) c
> Continuing.
> 
> Breakpoint 4, calloc (n=n@entry=20, m=4) at tst-audit2.c:18
> 18 {
> (gdb) bt
> #0  calloc (n=n@entry=20, m=4) at tst-audit2.c:18
> #1  0xf7fe668d in _dl_relocate_object (scope=0xf7ffdab8,
>     reloc_mode=reloc_mode@entry=0, consider_profiling=1,
>     consider_profiling@entry=0) at dl-reloc.c:272
> #2  0xf7fdc648 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
>     user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
> #3  0xf7ff0de7 in _dl_sysdep_start (
>     start_argptr=start_argptr@entry=0xffffcfb0,
>     dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at ../elf/dl-sysdep.c:249
> #4  0xf7fddd05 in _dl_start_final (arg=0xffffcfb0) at rtld.c:308
> #5  _dl_start (arg=0xffffcfb0) at rtld.c:414
> #6  0xf7fd9a87 in _start ()
>    from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
> (gdb)
> 
> With GLOB_DAT relocation, calloc in ld.so is called first:

This results in one more calloc from dl-minimal.c which has to be tracked
and not freed.

> (gdb) bt
> #0  calloc (nmemb=20, size=4) at dl-minimal.c:102
> #1  0xf7fe665d in _dl_relocate_object (scope=0xf7fcfb20, reloc_mode=1,
>     consider_profiling=1) at dl-reloc.c:272
> #2  0xf7fdc500 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
>     user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2074
> #3  0xf7ff0db7 in _dl_sysdep_start (
>     start_argptr=start_argptr@entry=0xffffcfa0,
>     dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at ../elf/dl-sysdep.c:249
> #4  0xf7fddcd5 in _dl_start_final (arg=0xffffcfa0) at rtld.c:308
> #5  _dl_start (arg=0xffffcfa0) at rtld.c:414
> #6  0xf7fd9a57 in _start ()
>    from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
> (gdb)
> 
> and then the GOT entry of calloc is updated:

OK.

> (gdb) bt
> #0  0xf7fe568d in elf_machine_rel (reloc=<optimized out>,
>     skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
>     version=<optimized out>, sym=<optimized out>, map=<optimized out>)
>     at ../sysdeps/i386/dl-machine.h:329
> #1  elf_dynamic_do_Rel (skip_ifunc=<optimized out>, lazy=<optimized out>,
>     nrelative=<optimized out>, relsize=<optimized out>,
>     reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
> #2  _dl_relocate_object (scope=0xf7ffdab8, reloc_mode=reloc_mode@entry=0,
>     consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
> #3  0xf7fdc618 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
>     user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2133
> #4  0xf7ff0db7 in _dl_sysdep_start (
>     start_argptr=start_argptr@entry=0xffffcfa0,
>     dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at ../elf/dl-sysdep.c:249
> #5  0xf7fddcd5 in _dl_start_final (arg=0xffffcfa0) at rtld.c:308
> #6  _dl_start (arg=0xffffcfa0) at rtld.c:414
> #7  0xf7fd9a57 in _start ()
>    from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
> (gdb)
> 
> After that, calloc isn't called, so magic in tst-audit2 isn't updated.  Both
> orders are correct.  Here is a patch to make sure that calloc in tst-audit2.c is
> called at least once from ld.so.

In the past before your ld optimization the JUMP_SLOT would have been
relocated to libc.so.6's version of calloc, or the test version of calloc,
correct?

The change from JUMP_SLOT -> GLOB_DAT for calloc means that we call calloc
one more time using the dl-minimal.c implementation. This seems dangerous to
me since in ld.so we take extreme caution not to attempt to free this result
because dl-minimal.c doesn't support freeing anything but the most recent
allocation. We also don't want to call free() from libc.so.6 with data that
was calloc'd from the dl-minimal.c implementation.

Are we certain that this additional calloc, now using dl-minimal.c, isn't
going to cause problems? When is it freed? What implementation of free is used?
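
To make the concern concrete, here is a minimal sketch of an allocator with the property described above: only the most recent allocation can be given back. It is an illustration under that assumption, not the dl-minimal.c source; `pool`, `bump_alloc` and `bump_free` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical bump allocator; this is not the dl-minimal.c source.  */
static _Alignas (max_align_t) unsigned char pool[4096];
static size_t pool_used;
static void *last_block;      /* Start of the most recent allocation.  */

void *
bump_alloc (size_t n)
{
  size_t align = sizeof (max_align_t);

  if (n > sizeof pool)
    return NULL;
  /* Round up so that every block stays suitably aligned.  */
  n = (n + align - 1) / align * align;
  if (n > sizeof pool - pool_used)
    return NULL;
  last_block = pool + pool_used;
  pool_used += n;
  return last_block;
}

/* Returns 1 if the block was reclaimed, 0 if it was left in place.
   Only the most recent allocation can be reclaimed.  */
int
bump_free (void *p)
{
  if (p != NULL && p == last_block)
    {
      pool_used = (size_t) ((unsigned char *) p - pool);
      last_block = NULL;
      return 1;
    }
  return 0;
}
```

With such an allocator, a pointer that is later handed to a different `free` implementation would not belong to that implementation's heap at all, which is the failure mode the questions above are probing.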
 
>> I think it is important we continue to build ld.so with -Wl,-z,now, both
>> for the optimizations, and for consistency among our security tooling.
>>
>> I'm happy to hear input from others.
>>
>> Have you checked to see how your patch to remove the PLT impacts analysis
>> tooling like Asan and Valgrind?
>>
> No, I didn't test them, nor do I expect any problems.

Well, right away if we have one more call of calloc to dl-minimal.c, that's
another allocation the tool can't track. It's not hugely problematic because
they already can't track all the early allocations via dl-minimal.c.

In summary:
- My next worry is about free of calloc'd data that is now using dl-minimal.c

If we can answer that question, then I think this patch to adjust tst-audit2.c
is the best solution.

Cheers,
Carlos.
  
Andreas Schwab May 26, 2015, 8 a.m. UTC | #2
"Carlos O'Donell" <carlos@redhat.com> writes:

> In summary:
> - My next worry is about free of calloc'd data that is now using dl-minimal.c

Anything allocated with dl-minimal must strictly be kept inside ld.so
and never be freed.

Andreas.
  
H.J. Lu May 26, 2015, 11:19 a.m. UTC | #3
On Tue, May 26, 2015 at 1:00 AM, Andreas Schwab <schwab@suse.de> wrote:
> "Carlos O'Donell" <carlos@redhat.com> writes:
>
>> In summary:
>> - My next worry is about free of calloc'd data that is now using dl-minimal.c
>
> Anything allocated with dl-minimal must strictly be kept inside ld.so
> and never be freed.

The calloc call is made at:

   if (__glibc_unlikely (consider_profiling)
        && l->l_info[DT_PLTRELSZ] != NULL)
      {
        /* Allocate the array which will contain the already found
           relocations.  If the shared object lacks a PLT (for example
           if it only contains lead function) the l_info[DT_PLTRELSZ]
           will be NULL.  */
        size_t sizeofrel = l->l_info[DT_PLTREL]->d_un.d_val == DT_RELA
                           ? sizeof (ElfW(Rela))
                           : sizeof (ElfW(Rel));
        size_t relcount = l->l_info[DT_PLTRELSZ]->d_un.d_val / sizeofrel;
        l->l_reloc_result = calloc (sizeof (l->l_reloc_result[0]), relcount);

        if (l->l_reloc_result == NULL)
          {
            errstring = N_("\
%s: out of memory to store relocation results for %s\n");
            _dl_fatal_printf (errstring, RTLD_PROGNAME, l->l_name);
          }
      }

ld.so never frees l->l_reloc_result.
  
Carlos O'Donell May 26, 2015, 11:54 p.m. UTC | #4
On 05/26/2015 04:00 AM, Andreas Schwab wrote:
> "Carlos O'Donell" <carlos@redhat.com> writes:
> 
>> In summary:
>> - My next worry is about free of calloc'd data that is now using dl-minimal.c
> 
> Anything allocated with dl-minimal must strictly be kept inside ld.so
> and never be freed.

I agree.

However, now that we are delaying the interposition until we process the
GOT relocs, the tst-audit2 test fails because what was once a call to libc.so's
calloc is now a call to dl-minimal. I haven't debugged this so I don't know if
we're tracking that calloc correctly such that we don't attempt to free it.

Did I get something wrong?

Cheers,
Carlos.
  
H.J. Lu May 27, 2015, 12:30 a.m. UTC | #5
On Tue, May 26, 2015 at 4:54 PM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 05/26/2015 04:00 AM, Andreas Schwab wrote:
>> "Carlos O'Donell" <carlos@redhat.com> writes:
>>
>>> In summary:
>>> - My next worry is about free of calloc'd data that is now using dl-minimal.c
>>
>> Anything allocated with dl-minimal must strictly be kept inside ld.so
>> and never be freed.
>
> I agree.
>
> However, now that we are delaying the interposition until we process the
> GOT relocs, the tst-audit2 test fails because what was once a call to libc.so's
> calloc is now a call to dl-minimal. I haven't debugged this so I don't know if
> we're tracking that calloc correctly such that we don't attempt to free it.

See:

https://sourceware.org/ml/libc-alpha/2015-05/msg00632.html
  
Carlos O'Donell May 27, 2015, 12:44 a.m. UTC | #6
On 05/26/2015 07:19 AM, H.J. Lu wrote:
> On Tue, May 26, 2015 at 1:00 AM, Andreas Schwab <schwab@suse.de> wrote:
>> "Carlos O'Donell" <carlos@redhat.com> writes:
>>
>>> In summary:
>>> - My next worry is about free of calloc'd data that is now using dl-minimal.c
>>
>> Anything allocated with dl-minimal must strictly be kept inside ld.so
>> and never be freed.
> 
> The calloc call is made at:
> 
>    if (__glibc_unlikely (consider_profiling)
>         && l->l_info[DT_PLTRELSZ] != NULL)
>       {
>         /* Allocate the array which will contain the already found
>            relocations.  If the shared object lacks a PLT (for example
>            if it only contains lead function) the l_info[DT_PLTRELSZ]
>            will be NULL.  */
>         size_t sizeofrel = l->l_info[DT_PLTREL]->d_un.d_val == DT_RELA
>                            ? sizeof (ElfW(Rela))
>                            : sizeof (ElfW(Rel));
>         size_t relcount = l->l_info[DT_PLTRELSZ]->d_un.d_val / sizeofrel;
>         l->l_reloc_result = calloc (sizeof (l->l_reloc_result[0]), relcount);
> 
>         if (l->l_reloc_result == NULL)
>           {
>             errstring = N_("\
> %s: out of memory to store relocation results for %s\n");
>             _dl_fatal_printf (errstring, RTLD_PROGNAME, l->l_name);
>           }
>       }
> 
> ld.so never frees l->l_reloc_result.

Thanks.

The only other place I was worried about was TLS data structures, but
there we already use dl_initial_tls to indicate the data structure was
allocated early (specifically for use by auditors) and we do not pass
it to realloc because it was allocated by dl-minimal, thus we are OK
there also.
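
The guard described here can be sketched as follows. This is an illustration of the pattern only, not glibc's TLS code: `early_storage`, `early_block` and `grow_block` are hypothetical names. A block that came from the early allocator is recognised by pointer comparison and copied into a fresh malloc'd block instead of being passed to realloc.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure that was allocated before
   libc's malloc became available.  */
static int early_storage[4];
static int *const early_block = early_storage;

/* Grow BLOCK from OLD_COUNT to NEW_COUNT ints (NEW_COUNT must not be
   smaller than OLD_COUNT).  A block that came from the early allocator
   is never handed to realloc; its contents are copied and the original
   is left in place.  */
int *
grow_block (int *block, size_t old_count, size_t new_count)
{
  if (block == early_block)
    {
      int *fresh = malloc (new_count * sizeof *fresh);
      if (fresh != NULL)
        memcpy (fresh, block, old_count * sizeof *fresh);
      return fresh;
    }
  return realloc (block, new_count * sizeof *block);
}
```

The cost of this design is that the early block is never reclaimed, which matches the rule stated earlier in the thread that such memory must never be freed.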

After your changes in binutils, is the test at all useful?

We are no longer able to interpose calloc to catch early TLS init,
therefore we are no longer testing early TLS init and the comments
in the test need to be changed to match.

The new test is:
"Test that calloc is called at least once after dlopen and initialization
 of TLS variables in the DSO."

Why do we care about this?

Is there any way to still test that early TLS initialization has occurred
when using LD_AUDIT?

Wouldn't such a test look like this?

- Create an auditor that uses TLS in an audit function.
- Interpose calloc.
- Check that things don't crash.
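
As a rough illustration of the first step only, an audit module that touches TLS from an audit hook might look like the sketch below. The `la_version` signature follows rtld-audit(7); the counter name `audit_calls` is hypothetical, and returning the offered version (rather than `LAV_CURRENT` from `<link.h>`) is a simplification assumed here. This is not the source of any existing glibc test.

```c
/* Thread-local state touched from inside the audit hook.  If TLS were
   not yet set up for the audit module, this access would fault.  */
static __thread unsigned int audit_calls;

/* la_version is the rtld-audit handshake: the dynamic linker calls it
   before any other hook and uses the return value as the interface
   version the module wants to use.  */
unsigned int
la_version (unsigned int version)
{
  ++audit_calls;    /* The TLS access under test.  */
  return version;   /* Accept whatever version the loader offers.  */
}
```

Such a module would be built as a shared object and named in `LD_AUDIT`, as the existing `tst-audit*-ENV` settings in elf/Makefile do; the interposed calloc and the crash check would live in the test program itself.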

We already have a test for this, it's tst-audit9 (Bug 16613).

OK to checkin your change to tst-audit2 if you change the test comment
to reflect the change in what is being tested:

"Test that interposed calloc is called by the dynamic loader, and that
 TLS is fully initialized by then."

Thanks for working through this.

Cheers,
Carlos.
  

Patch

From 31d9cfee87edf3acf95640db262b950e3b0514f3 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" <hjl.tools@gmail.com>
Date: Mon, 25 May 2015 11:30:57 -0700
Subject: [PATCH] Make sure that calloc is called at least once
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PLT relocations aren't required when -z now is used.  The linker on master with:

commit 25070364b0ce33eed46aa5d78ebebbec6accec7e
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Sat May 16 07:00:21 2015 -0700

    Don't generate PLT relocations for now binding

    There is no need for PLT relocations with -z now. We can use GOT
    relocations, which take less space, instead and replace 16-byte .plt
    entries with 8-byte .plt.got entries.

    bfd/

      * elf32-i386.c (elf_i386_check_relocs): Create .plt.got section
      for now binding.
      (elf_i386_allocate_dynrelocs): Use .plt.got section for now
      binding.
      * elf64-x86-64.c (elf_x86_64_check_relocs): Create .plt.got
      section for now binding.
      (elf_x86_64_allocate_dynrelocs): Use .plt.got section for now
      binding.

won't generate PLT relocations with -z now.  elf/tst-audit2.c expects a
certain order of execution in ld.so.  With PLT relocations, the GOTPLT
entry of calloc is updated to the calloc defined in tst-audit2:

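(gdb) bt
#0  0xf7fe56bd in elf_machine_rel (reloc=<optimized out>,
    skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
    version=<optimized out>, sym=<optimized out>, map=<optimized out>)
    at ../sysdeps/i386/dl-machine.h:329
#1  elf_dynamic_do_Rel (skip_ifunc=<optimized out>, lazy=<optimized out>,
    nrelative=<optimized out>, relsize=<optimized out>,
    reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
#2  _dl_relocate_object (scope=0xf7ffdab8, reloc_mode=reloc_mode@entry=0,
    consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
#3  0xf7fdc648 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
#4  0xf7ff0de7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfb0,
    dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at ../elf/dl-sysdep.c:249
#5  0xf7fddd05 in _dl_start_final (arg=0xffffcfb0) at rtld.c:308
#6  _dl_start (arg=0xffffcfb0) at rtld.c:414
#7  0xf7fd9a87 in _start ()
   from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)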

and then calloc is called:

(gdb) c
Continuing.

Breakpoint 4, calloc (n=n@entry=20, m=4) at tst-audit2.c:18
18 {
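(gdb) bt
#0  calloc (n=n@entry=20, m=4) at tst-audit2.c:18
#1  0xf7fe668d in _dl_relocate_object (scope=0xf7ffdab8,
    reloc_mode=reloc_mode@entry=0, consider_profiling=1,
    consider_profiling@entry=0) at dl-reloc.c:272
#2  0xf7fdc648 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf1c, auxv=0xffffd0a8) at rtld.c:2133
#3  0xf7ff0de7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfb0,
    dl_main=dl_main@entry=0xf7fda6f0 <dl_main>) at ../elf/dl-sysdep.c:249
#4  0xf7fddd05 in _dl_start_final (arg=0xffffcfb0) at rtld.c:308
#5  _dl_start (arg=0xffffcfb0) at rtld.c:414
#6  0xf7fd9a87 in _start ()
   from /export/build/gnu/glibc-32bit/build-i686-linux/elf/ld.so
(gdb)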

With GOT relocation, calloc in ld.so is called first:

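(gdb) bt
#0  calloc (nmemb=20, size=4) at dl-minimal.c:102
#1  0xf7fe665d in _dl_relocate_object (scope=0xf7fcfb20, reloc_mode=1,
    consider_profiling=1) at dl-reloc.c:272
#2  0xf7fdc500 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2074
#3  0xf7ff0db7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfa0,
    dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at ../elf/dl-sysdep.c:249
#4  0xf7fddcd5 in _dl_start_final (arg=0xffffcfa0) at rtld.c:308
#5  _dl_start (arg=0xffffcfa0) at rtld.c:414
#6  0xf7fd9a57 in _start ()
   from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)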

and then the GOT entry of calloc is updated:

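(gdb) bt
#0  0xf7fe568d in elf_machine_rel (reloc=<optimized out>,
    skip_ifunc=<optimized out>, reloc_addr_arg=<optimized out>,
    version=<optimized out>, sym=<optimized out>, map=<optimized out>)
    at ../sysdeps/i386/dl-machine.h:329
#1  elf_dynamic_do_Rel (skip_ifunc=<optimized out>, lazy=<optimized out>,
    nrelative=<optimized out>, relsize=<optimized out>,
    reladdr=<optimized out>, map=<optimized out>) at do-rel.h:137
#2  _dl_relocate_object (scope=0xf7ffdab8, reloc_mode=reloc_mode@entry=0,
    consider_profiling=1, consider_profiling@entry=0) at dl-reloc.c:258
#3  0xf7fdc618 in dl_main (phdr=<optimized out>, phnum=<optimized out>,
    user_entry=0xffffcf0c, auxv=0xffffd098) at rtld.c:2133
#4  0xf7ff0db7 in _dl_sysdep_start (
    start_argptr=start_argptr@entry=0xffffcfa0,
    dl_main=dl_main@entry=0xf7fda6c0 <dl_main>) at ../elf/dl-sysdep.c:249
#5  0xf7fddcd5 in _dl_start_final (arg=0xffffcfa0) at rtld.c:308
#6  _dl_start (arg=0xffffcfa0) at rtld.c:414
#7  0xf7fd9a57 in _start ()
   from /export/build/gnu/glibc-32bit-test/build-i686-linux/elf/ld.so
(gdb)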

After that, since calloc isn't called from ld.so nor any other modules,
magic in tst-audit2 isn't updated.  Both orders are correct.  This patch
makes sure that calloc in tst-audit2.c is called at least once from ld.so.

	[BZ #18422]
	* Makefile ($(objpfx)tst-audit2): Depend on $(libdl).
	($(objpfx)tst-audit2.out): Also depend on
	$(objpfx)tst-auditmod9b.so.
	* elf/tst-audit2.c: Include <dlfcn.h>.
	(calloc_called): New.
	(calloc): Allow to be called more than once.
	(do_test): dlopen/dlclose $ORIGIN/tst-auditmod9b.so.
---
 elf/Makefile     |  3 ++-
 elf/tst-audit2.c | 23 ++++++++++++++++++-----
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/elf/Makefile b/elf/Makefile
index b06e0a7..dedf3c7 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1034,7 +1034,8 @@  $(objpfx)tst-dlmopen3.out: $(objpfx)tst-dlmopen1mod.so
 $(objpfx)tst-audit1.out: $(objpfx)tst-auditmod1.so
 tst-audit1-ENV = LD_AUDIT=$(objpfx)tst-auditmod1.so
 
-$(objpfx)tst-audit2.out: $(objpfx)tst-auditmod1.so
+$(objpfx)tst-audit2: $(libdl)
+$(objpfx)tst-audit2.out: $(objpfx)tst-auditmod1.so $(objpfx)tst-auditmod9b.so
 # Prevent GCC-5 from translating a malloc/memset pair into calloc
 CFLAGS-tst-audit2.c += -fno-builtin
 tst-audit2-ENV = LD_AUDIT=$(objpfx)tst-auditmod1.so
diff --git a/elf/tst-audit2.c b/elf/tst-audit2.c
index acad1b0..44c74d4 100644
--- a/elf/tst-audit2.c
+++ b/elf/tst-audit2.c
@@ -3,10 +3,12 @@ 
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <dlfcn.h>
 
 #define MAGIC1 0xabcdef72
 #define MAGIC2 0xd8675309
 static __thread unsigned int magic[] = { MAGIC1, MAGIC2 };
+static __thread int calloc_called;
 
 #undef calloc
 
@@ -16,13 +18,19 @@  static __thread unsigned int magic[] = { MAGIC1, MAGIC2 };
 void *
 calloc (size_t n, size_t m)
 {
-  if (magic[0] != MAGIC1 || magic[1] != MAGIC2)
+  if (!calloc_called)
     {
-      printf ("{%x, %x} != {%x, %x}\n", magic[0], magic[1], MAGIC1, MAGIC2);
-      abort ();
+      /* Allow our calloc to be called more than once.  */
+      calloc_called = 1;
+      if (magic[0] != MAGIC1 || magic[1] != MAGIC2)
+	{
+	  printf ("{%x, %x} != {%x, %x}\n",
+		  magic[0], magic[1], MAGIC1, MAGIC2);
+	  abort ();
+	}
+      magic[0] = MAGIC2;
+      magic[1] = MAGIC1;
     }
-  magic[0] = MAGIC2;
-  magic[1] = MAGIC1;
 
   n *= m;
   void *ptr = malloc (n);
@@ -34,6 +42,11 @@  calloc (size_t n, size_t m)
 static int
 do_test (void)
 {
+  /* Make sure that our calloc is called from the dynamic linker at least
+     once.  */
+  void *h = dlopen("$ORIGIN/tst-auditmod9b.so", RTLD_LAZY);
+  if (h != NULL)
+    dlclose (h);
   if (magic[1] != MAGIC1 || magic[0] != MAGIC2)
     {
       printf ("{%x, %x} != {%x, %x}\n", magic[0], magic[1], MAGIC2, MAGIC1);
-- 
2.1.0