building GLIBC with -fsanitize=address

Message ID CAGQ9bdzRLKBsAr5wq_9JMZQ388n84dfW+N9ZDuXfFABdbBbpWw@mail.gmail.com
State Not applicable
Commit Message

Kostya Serebryany Aug. 26, 2014, 10:27 p.m. UTC
  Hello,

I would like to ask for some assistance with building GLIBC with
AddressSanitizer (aka asan, GCC's -fsanitize=address flag).
The end goal is to have libc-asan.so, a shared libc library with code
instrumented by asan.

Simply doing
  CFLAGS='-g -O1 -fsanitize=address' ../glibc/configure ..
does not work -- it fails at configure time with
  configure: error: Need linker with .init_array/.fini_array support.
Even if we bypass this error there will be others and, in any case, we
don't want to instrument every file we build (e.g. there are object files
that go into ld.so which we don't want to instrument).

One of the things we discussed at the Cauldron was to add a configure
option (e.g. --enable-asan) similar to the existing --enable-profile,
which adds the -pg build flag. This is indeed similar, with one major
difference: --enable-profile builds a static libc, whereas for asan we
need a DSO. I've managed to make a patch that builds libc-asan.a
(attached), but I don't see how to get libc-asan.so from it.

Thoughts?

Thanks,

--kcc
  

Comments

Roland McGrath Aug. 27, 2014, 10:45 p.m. UTC | #1
The machinery based on object-suffixes takes care of having separate object
file builds and archive libraries.  But if you want to make a separate
flavor of shared objects from those, you'll need to duplicate (refactor)
a bunch more rules for doing the actual ld -shared step.
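
As a rough illustration of the object-suffixes idea, here is a toy model in
Python (not glibc's actual make logic). Each object suffix maps to an archive
name pattern, and nothing in that mapping produces a shared object. The
lib%_p.a and lib%_asan.a patterns and the .op/.oasan suffixes are taken from
the patch in this thread; the plain .o entry and the function names are
assumptions made for illustration.

```python
# Toy model of per-flavor archive naming; not the real Makeconfig logic.
LIBTYPE = {
    ".o": "lib%.a",            # assumed plain static flavor
    ".op": "lib%_p.a",         # profiled flavor (--enable-profile)
    ".oasan": "lib%_asan.a",   # ASan flavor as proposed in the patch
}


def archive_name(lib, suffix):
    """Expand the '%' stem the way a make pattern substitution would."""
    return LIBTYPE[suffix].replace("%", lib, 1)


def flavors(lib, object_suffixes):
    """Map every enabled object suffix to the archive it would produce."""
    return {s: archive_name(lib, s) for s in object_suffixes}


# Every flavor ends in ".a"; a shared flavor would need its own link rules.
print(flavors("c", [".o", ".op", ".oasan"]))
```

The point of the sketch is only that adding a suffix yields another archive;
the ld -shared step is outside this mapping, which is why it needs separate
rules.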

I'm inclined to think that this is not really the right model for an ASan
build.  But to be sure, we need to dig deeper into exactly what you/we
envision an ASan build containing and how it would be used.

The tack you've attempted leads to the single canonical glibc build
(optionally) producing another library variant.  The alternate approach is
a configure switch that changes what the "primary" build does rather than
adding a variant.  That is, you would configure with --enable-asan (or
perhaps this should be --with-asan for this approach) as an orthogonal
switch to --{en,dis}able-{static,shared,profile}.  This build would produce
a libc.so (and libc.a too unless --disable-static) and ld.so built with
-fsanitize=address.  You'd configure this for a different --prefix or
something like that, and the canonical absolute dynamic linker file name
for PT_INTERP would be a different one (perhaps just different because of
the different prefix, or perhaps explicitly modified to be ld...-asan).
Then to build applications with this library, you'd point the compiler
driver's paths at the differently prefixed lib directory and pass the
different name to the linker's -dynamic-linker.
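
A minimal sketch of what "pointing the compiler driver at the differently
prefixed lib directory" could look like, written as a small Python helper
that assembles the extra gcc arguments. The prefix /opt/glibc-asan, the
helper name link_args, and the lib/ layout under the prefix are assumptions;
-L, -Wl,-rpath= and -Wl,--dynamic-linker= are the usual GCC/GNU ld options.

```python
import posixpath


def link_args(prefix, loader):
    """Extra gcc arguments for linking against a glibc installed in `prefix`.

    `loader` is the file name of the dynamic linker inside <prefix>/lib.
    """
    libdir = posixpath.join(prefix, "lib")
    return [
        "-L" + libdir,                     # link-time library search path
        "-Wl,-rpath=" + libdir,            # run-time library search path
        "-Wl,--dynamic-linker=" + posixpath.join(libdir, loader),  # PT_INTERP
    ]


# Hypothetical prefix and program names, shown only as a usage example.
cmd = ["gcc", "hello.c", "-o", "hello"] + link_args(
    "/opt/glibc-asan", "ld-linux-x86-64.so.2")
print(" ".join(cmd))
```

Every application build would need these flags, which is the cost of this
approach compared with a drop-in library.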

Before getting into more details, I think we need to be clear on the vision
of what you want to produce and how it would be used.


Thanks,
Roland
  
Kostya Serebryany Aug. 27, 2014, 11:08 p.m. UTC | #2
On Wed, Aug 27, 2014 at 3:45 PM, Roland McGrath <roland@hack.frob.com> wrote:
> The machinery based on object-suffixes takes care of having separate object
> file builds and archive libraries.  But if you want to make a separate
> flavor of shared objects from those, you'll need to duplicate (refactor)
> a bunch more rules for doing the actual ld -shared step.
>
> I'm inclined to think that this is not really the right model for an ASan
> build.  But to be sure, we need to dig deeper into exactly what you/we
> envision an ASan build containing and how it would be used.
>
> The tack you've attempted leads to the single canonical glibc build
> (optionally) producing another library variant.  The alternate approach is
> a configure switch that changes what the "primary" build does rather than
> adding a variant.  That is, you would configure with --enable-asan (or
> perhaps this should be --with-asan for this approach) as an orthogonal
> switch to --{en,dis}able-{static,shared,profile}.  This build would produce
> a libc.so (and libc.a too unless --disable-static) and ld.so built with
> -fsanitize=address.  You'd configure this for a different --prefix or
> something like that, and the canonical absolute dynamic linker file name
> for PT_INTERP would be a different one (perhaps just different because of
> the different prefix, or perhaps explicitly modified to be ld...-asan).
> Then to build applications with this library, you'd point the compiler
> driver's paths at the differently prefixed lib directory and pass the
> different name to the linker's -dynamic-linker.
>
> Before getting into more details, I think we need to be clear on the vision
> of what you want to produce and how it would be used.

As long as I can link the resulting instrumented libc.so to binaries
and run them, either way works for me. The idea of reusing the -pg
approach (which I think came from Carlos?) is nice because -pg has
similar properties, i.e. we cannot add -pg to objects going into ld.so
(right?).
A primary build option (--with-asan) is fine too, as long as we can
omit -fsanitize=address when building some objects (e.g. those for
ld.so, maybe a few more).

--kcc
  
Carlos O'Donell Sept. 4, 2014, 1:41 a.m. UTC | #3
On 08/27/2014 07:08 PM, Konstantin Serebryany wrote:
> On Wed, Aug 27, 2014 at 3:45 PM, Roland McGrath <roland@hack.frob.com> wrote:
>> The machinery based on object-suffixes takes care of having separate object
>> file builds and archive libraries.  But if you want to make a separate
>> flavor of shared objects from those, you'll need to duplicate (refactor)
>> a bunch more rules for doing the actual ld -shared step.
>>
>> I'm inclined to think that this is not really the right model for an ASan
>> build.  But to be sure, we need to dig deeper into exactly what you/we
>> envision an ASan build containing and how it would be used.
>>
>> The tack you've attempted leads to the single canonical glibc build
>> (optionally) producing another library variant.  The alternate approach is
>> a configure switch that changes what the "primary" build does rather than
>> adding a variant.  That is, you would configure with --enable-asan (or
>> perhaps this should be --with-asan for this approach) as an orthogonal
>> switch to --{en,dis}able-{static,shared,profile}.  This build would produce
>> a libc.so (and libc.a too unless --disable-static) and ld.so built with
>> -fsanitize=address.  You'd configure this for a different --prefix or
>> something like that, and the canonical absolute dynamic linker file name
>> for PT_INTERP would be a different one (perhaps just different because of
>> the different prefix, or perhaps explicitly modified to be ld...-asan).
>> Then to build applications with this library, you'd point the compiler
>> driver's paths at the differently prefixed lib directory and pass the
>> different name to the linker's -dynamic-linker.

Configuring with an alternate --prefix defeats one of the main reasons
I want this. I would like to run an entire Fedora bootstrap, or part of
one, with ASan turned on in glibc. Therefore it has to look like a drop-in
libc.so.6. Nothing in your plan prevents me from doing that, but I just
wanted to point out that an alternate --prefix and building everything with
-Wl,--dynamic-linker= goes against my use case. Even though I'm
bootstrapping, it's always a problem to get flags down into all of the
right places to make the system use a non-default loader (without wrapping
gcc itself).

>> Before getting into more details, I think we need to be clear on the vision
>> of what you want to produce and how it would be used.
> 
> As long as I can link the resulting instrumented libc.so to binaries
> and run them, either way works for me. The idea of reusing the -pg
> approach (which I think came from Carlos?) is nice because -pg has
> similar properties, i.e. we cannot add -pg to objects going into ld.so
> (right?).

Correct, which is why I recommended the -pg approach.

> A primary build option (--with-asan) is fine too, as long as we can
> omit -fsanitize=address when building some objects (e.g. those for
> ld.so, maybe a few more).

We can, but then it becomes a set of more custom rules for
building ld.so.

I'm still inclined to think the -pg route is the least work
for ASan, but Roland actually has the best experience to
comment on the amount of work he thinks is required from
each solution.

Cheers,
Carlos.
  
Kostya Serebryany Sept. 15, 2014, 10:11 p.m. UTC | #4
So, Roland, what is your final recommendation?
If you still suggest using a separate build, do you know if there is
anything similar already in the build system (so that I can duplicate
that)?

Thanks,

--kcc

  
Kostya Serebryany Sept. 24, 2014, 12:41 a.m. UTC | #5
FTR:
here is the compiler wrapper script that tampers with the compiler
flags to build an ASan-instrumented glibc w/o changing the glibc sources:
https://code.google.com/p/address-sanitizer/source/browse/trunk/asan-glibc/asan-glibc-gcc-wrapper.py

With this I can build the full glibc (tested on 2.19 and 2.20) with
asan instrumentation and find injected bugs.
Now I can move further; however, it would still be great if someone could
assist me with properly patching the glibc build system to support an
asan build.
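
For readers who cannot open the link, the general shape of such a wrapper
can be sketched as follows. This is not the linked script's logic: the
SKIP_MARKERS path fragments, REAL_CC, and all function names are
hypothetical, chosen only to show the idea of adding -fsanitize=address to
compile steps while leaving objects destined for ld.so untouched.

```python
import os
import sys

REAL_CC = "gcc"                      # assumed name of the real compiler
ASAN_FLAGS = ["-fsanitize=address"]
# Hypothetical path fragments marking objects that must stay uninstrumented.
SKIP_MARKERS = ("/elf/rtld", "/elf/dl-")


def output_path(args):
    """Return the argument that follows -o, or None if there is none."""
    for i, arg in enumerate(args[:-1]):
        if arg == "-o":
            return args[i + 1]
    return None


def rewrite(args):
    """Append ASan flags to compile steps unless the output looks like ld.so."""
    if "-c" not in args:             # link or preprocess step: leave alone
        return list(args)
    out = output_path(args) or ""
    if any(marker in out for marker in SKIP_MARKERS):
        return list(args)
    return list(args) + ASAN_FLAGS


def main(argv):
    """Replace this process with the real compiler and rewritten arguments."""
    os.execvp(REAL_CC, [REAL_CC] + rewrite(argv))
```

To use it as a wrapper, a script would call main(sys.argv[1:]) and be passed
to configure as CC; a real wrapper needs a far more careful exclusion list
than the two markers above.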

--kcc



  
Carlos O'Donell Sept. 25, 2014, 5:57 p.m. UTC | #6
On 09/23/2014 08:41 PM, Konstantin Serebryany wrote:
> FTR:
> here is the compiler wrapper script that tampers with the compiler
> flags to build
> ASan-instrumented glibc w/o changing the glibc sources:
> https://code.google.com/p/address-sanitizer/source/browse/trunk/asan-glibc/asan-glibc-gcc-wrapper.py
> 
> With this I can build the full glibc (tested on 2.19 and 2.20) with
> asan instrumentation and find injected bugs.
> Now I can move further; however it would still be great if someone can
> assist me with properly patching the glibc build system to support
> asan build.

Have you been able to make progress on logging instead of faulting on
error?

We can help with glibc builds, but it would be nice to make sure that,
if we were to turn it on, it won't cause the system to immediately
abort at boot time when it detects errors.

Recall our conversation at Cauldron 2014, in that ASan + glibc needs to
log errors as precisely as possible for later offline analysis by the
distribution team maintaining glibc.

Does that goal still make sense?

Cheers,
Carlos.
  
Kostya Serebryany Sept. 26, 2014, 9:21 p.m. UTC | #7
On Thu, Sep 25, 2014 at 10:57 AM, Carlos O'Donell <carlos@redhat.com> wrote:
> On 09/23/2014 08:41 PM, Konstantin Serebryany wrote:
>> FTR:
>> here is the compiler wrapper script that tampers with the compiler
>> flags to build
>> ASan-instrumented glibc w/o changing the glibc sources:
>> https://code.google.com/p/address-sanitizer/source/browse/trunk/asan-glibc/asan-glibc-gcc-wrapper.py
>>
>> With this I can build the full glibc (tested on 2.19 and 2.20) with
>> asan instrumentation and find injected bugs.
>> Now I can move further; however it would still be great if someone can
>> assist me with properly patching the glibc build system to support
>> asan build.
>
> Have you been able to make progress on logging instead of faulting on
> error?
>
> We can help with glibc builds, but it would be nice to make sure that
> if we were to turn it on that it won't cause the system to immediately
> abort at boot time when it detects errors.
>
> Recall our conversation at Cauldron 2014, in that ASan + glibc needs to
> log errors as precisely as possible for later offline analysis by the
> distribution team maintaining glibc.
>
> Does that goal still make sense?

Yes, the goal still makes sense.
Today, both the GCC and Clang asan implementations have flags to emit
instrumentation via callbacks instead of inline code.
It makes asan somewhat slower, but allows execution to continue after the
first report.

% cat load.c
int load(int *a) { return *a; }
% ~/gcc-inst/bin/gcc -fsanitize=address -O -S -o - \
    --param asan-instrumentation-with-call-threshold=0 load.c
pushq %rbx
movq %rdi, %rbx
call __asan_load4
movl (%rbx), %eax
popq %rbx
ret

We may need very minor additional changes in asan-run-time, but we
will need them anyway to support whatever kind of logging you need for
glibc.

--kcc

  
Yury Gribov Oct. 1, 2014, 9:55 a.m. UTC | #8
On 09/27/2014 01:21 AM, Konstantin Serebryany wrote:
> Today, both GCC and Clang asan implementations have flags to emit
> instrumentation via callbacks instead of inline code.
> It makes asan somewhat slower, but allows execution to continue after the
> first report.

And there is always the opportunity to enable logging in inline
instrumentation as well, if outline instrumentation proves to be too slow.

-Y
  
Mike Frysinger March 7, 2015, 9:54 p.m. UTC | #9
On 03 Sep 2014 21:41, Carlos O'Donell wrote:
> On 08/27/2014 07:08 PM, Konstantin Serebryany wrote:
> > On Wed, Aug 27, 2014 at 3:45 PM, Roland McGrath wrote:
> >> Before getting into more details, I think we need to be clear on the vision
> >> of what you want to produce and how it would be used.
> > 
> > As long as I can link the resulting instrumented libc.so to binaries
> > and run them, either way works for me. The idea of reusing the -pg
> > approach (which I think came from Carlos?) is nice because -pg has
> > similar properties, i.e. we cannot add -pg to objects going into ld.so
> > (right?).
> 
> Correct, which is why I recommended the -pg approach.

iiuc, the profile option creates a bunch of static libs with _p suffixes, right?
so instead of -lc, you use -lc_p?  i don't think that really works for asan.

crazy idea: what about pseudo hwcaps?  that way you could have asan-enabled
shared libs in an asan/ subdir and the ldso would select them on the fly based
on your main executable.
-mike
  

Patch

diff --git a/Makeconfig b/Makeconfig
index cef0f06..d2f4506 100644
--- a/Makeconfig
+++ b/Makeconfig
@@ -846,7 +846,7 @@  endif
 # The compilation rules use $(CPPFLAGS-${SUFFIX}) and $(CFLAGS-${SUFFIX})
 # to pass different flags for each flavor.
 libtypes = $(foreach o,$(object-suffixes-for-libc),$(libtype$o))
-all-object-suffixes := .o .os .op .og .oS
+all-object-suffixes := .o .os .op .og .oS .oasan
 object-suffixes :=
 CPPFLAGS-.o = $(pic-default)
 CFLAGS-.o = $(filter %frame-pointer,$(+cflags))
@@ -877,6 +877,15 @@  CFLAGS-.op = -pg
 libtype.op = lib%_p.a
 endif
 
+ifeq (yes,$(build-asan))
+# Under --enable-asan, TODO.
+object-suffixes += .oasan
+CPPFLAGS-.oasan = $(pic-default)
+CFLAGS-.oasan = -fsanitize=address
+libtype.oasan = lib%_asan.a
+endif
+
+
 # Convenience variable for when we want to treat shared-library cases
 # differently from the rest.
 object-suffixes-noshared := $(filter-out .os,$(object-suffixes))
diff --git a/config.make.in b/config.make.in
index 6bcab8a..eef62a9 100644
--- a/config.make.in
+++ b/config.make.in
@@ -82,6 +82,7 @@  nss-crypt = @libc_cv_nss_crypt@
 build-shared = @shared@
 build-pic-default= @libc_cv_pic_default@
 build-profile = @profile@
+build-asan = @asan@
 build-static-nss = @static_nss@
 add-ons = @add_ons@
 add-on-subdirs = @add_on_subdirs@
diff --git a/configure b/configure
index c8d2967..c928def 100755
--- a/configure
+++ b/configure
@@ -576,6 +576,7 @@  mach_interface_list
 DEFINES
 static_nss
 profile
+asan
 libc_cv_pic_default
 shared
 static
@@ -737,6 +738,7 @@  with_default_link
 enable_sanity_checks
 enable_shared
 enable_profile
+enable_asan
 enable_oldest_abi
 enable_hardcoded_path_in_tests
 enable_stackguard_randomization
@@ -1390,6 +1392,8 @@  Optional Features:
                           in special situations) [default=yes]
   --enable-shared         build shared library [default=yes if GNU ld]
   --enable-profile        build profiled library [default=no]
+  --enable-asan           build library instrumented with AddressSanitizer
+                          (ASan) [default=no]
   --enable-oldest-abi=ABI configure the oldest ABI supported [e.g. 2.2]
                           [default=glibc default]
   --enable-hardcoded-path-in-tests
@@ -3431,6 +3435,13 @@  else
   profile=no
 fi
 
+# Check whether --enable-asan was given.
+if test "${enable_asan+set}" = set; then :
+  enableval=$enable_asan; asan=$enableval
+else
+  asan=no
+fi
+
 
 # Check whether --enable-oldest-abi was given.
 if test "${enable_oldest_abi+set}" = set; then :
diff --git a/configure.ac b/configure.ac
index 566ecb2..cb4bae4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -150,6 +150,11 @@  AC_ARG_ENABLE([profile],
 			     [build profiled library @<:@default=no@:>@]),
 	      [profile=$enableval],
 	      [profile=no])
+AC_ARG_ENABLE([asan],
+	      AC_HELP_STRING([--enable-asan],
+			     [build library instrumented with AddressSanitizer (ASan) @<:@default=no@:>@]),
+	      [asan=$enableval],
+	      [asan=no])
 
 AC_ARG_ENABLE([oldest-abi],
 	      AC_HELP_STRING([--enable-oldest-abi=ABI],
@@ -1996,6 +2001,7 @@  rm -f conftest.*])
 AC_SUBST(libc_cv_pic_default)
 
 AC_SUBST(profile)
+AC_SUBST(asan)
 AC_SUBST(static_nss)
 
 AC_SUBST(DEFINES)