[BZ#20422] Do not allow asan/msan/tsan and fortify at the same time.

Message ID 57CDAB08.8060601@samsung.com
State New, archived
Headers

Commit Message

Maxim Ostapenko Sept. 5, 2016, 5:27 p.m. UTC
  Hi!

When fortify is used with MSan it will cause MSan false positives.

#include <stdio.h>
#include <string.h>
int main()
{
         char text[100];
         sprintf(text, "hello");
         printf("%lu\n", strlen(text));
}

% clang test.c -fsanitize=memory   -O3 && ./a.out
5
% clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
Uninitialized bytes in __interceptor_strlen at offset 0 inside 
[0x7ffe259e4d20, 6)
==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
     #0 0x4869cc in main

With ASan, this will not cause false positives, but may case false 
negatives or just confuse people with "wrong" reports when fortify 
catches the error.

Although fortify is good thing as it (and it's enabled by default on 
some major distros e.g. Ubuntu and Gentoo), people still complain about 
{A, M}San vs fortify interaction, see e.g. 
https://github.com/google/sanitizers/issues/689. One possible solution 
would be to extend {A, M}San to support foo_chk() functions, but this 
would increase the complexity of sanitizer tools with quite small 
benefit. Another choice would be to warn users when they compile their 
code with {A, M, T}San and fortify enabled.

This patch implements the second approach. The simplest way to warn is 
to modify the Glibc headers to check if fortify and one of the 
sanitizers is enabled. Does this look reasonable?

I've tried to add a testcase for new warning into Glibc testsuite, but 
failed to see how exactly I can do it. Does Glibc have some framework 
for compilation tests? Could someone help me with this issue?
For now, I've tested this patch locally with GCC 4.8, fresh GCC and 
fresh Clang on my Ubuntu box:

gcc test.c -fsanitize=address -D_FORTIFY_SOURCE=2  -O3 
-L${SYSROOT}/usr/lib -I${SYSROOT}/include -Wl,-rpath=${SYSROOT}/lib 
-Wl,--dynamic-linker=${SYSROOT}/lib/ld-2.24.90.so  -S
In file included from 
/home/max/install/glibc//include/bits/libc-header-start.h:33:0,
                  from /home/max/install/glibc//include/stdio.h:28,
                  from test.c:1:
/home/max/install/glibc//include/features.h:374:3: warning: #warning 
_FORTIFY_SOURCE is not compatible with sanitizer [-Wcpp]
  # warning _FORTIFY_SOURCE is not compatible with sanitizer


~/install/master/bin/gcc test.c -fsanitize=address -D_FORTIFY_SOURCE=2  
-O3 -L${SYSROOT}/usr/lib -I${SYSROOT}/include -Wl,-rpath=${SYSROOT}/lib 
-Wl,--dynamic-linker=${SYSROOT}/lib/ld-2.24.90.so  -S
In file included from 
/home/max/install/glibc//include/bits/libc-header-start.h:33:0,
                  from /home/max/install/glibc//include/stdio.h:28,
                  from test.c:1:
/home/max/install/glibc//include/features.h:374:3: warning: #warning 
_FORTIFY_SOURCE is not compatible with sanitizer [-Wcpp]
  # warning _FORTIFY_SOURCE is not compatible with sanitizer

clang  test.c -fsanitize=address -D_FORTIFY_SOURCE=2  -O3 
-L${SYSROOT}/usr/lib -I${SYSROOT}/include -Wl,-rpath=${SYSROOT}/lib 
-Wl,--dynamic-linker=${SYSROOT}/lib/ld-2.24.90.so
In file included from test.c:1:
In file included from /home/max/install/glibc//include/stdio.h:28:
In file included from 
/home/max/install/glibc//include/bits/libc-header-start.h:33:
/home/max/install/glibc//include/features.h:374:3: warning: 
_FORTIFY_SOURCE is not compatible with sanitizer [-W#warnings]
# warning _FORTIFY_SOURCE is not compatible with sanitizer
   ^
1 warning generated.


-Maxim
  

Comments

Andrew Pinski Sept. 5, 2016, 7:47 p.m. UTC | #1
On Mon, Sep 5, 2016 at 10:27 AM, Maxim Ostapenko
<m.ostapenko@samsung.com> wrote:
> Hi!
>
> When fortify is used with MSan it will cause MSan false positives.
>
> #include <stdio.h>
> #include <string.h>
> int main()
> {
>         char text[100];
>         sprintf(text, "hello");
>         printf("%lu\n", strlen(text));
> }
>
> % clang test.c -fsanitize=memory   -O3 && ./a.out
> 5
> % clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
> Uninitialized bytes in __interceptor_strlen at offset 0 inside
> [0x7ffe259e4d20, 6)
> ==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x4869cc in main
>
> With ASan, this will not cause false positives, but may case false negatives
> or just confuse people with "wrong" reports when fortify catches the error.
>
> Although fortify is good thing as it (and it's enabled by default on some
> major distros e.g. Ubuntu and Gentoo), people still complain about {A, M}San
> vs fortify interaction, see e.g.
> https://github.com/google/sanitizers/issues/689. One possible solution would
> be to extend {A, M}San to support foo_chk() functions, but this would
> increase the complexity of sanitizer tools with quite small benefit. Another
> choice would be to warn users when they compile their code with {A, M, T}San
> and fortify enabled.


This is the only solution in my mind.  And it does not expand the
complexity as much as complex as the Sanitizers already complex and
needs a lot of porting to new target already.

Thanks,
Andrew

>
> This patch implements the second approach. The simplest way to warn is to
> modify the Glibc headers to check if fortify and one of the sanitizers is
> enabled. Does this look reasonable?
>
> I've tried to add a testcase for new warning into Glibc testsuite, but
> failed to see how exactly I can do it. Does Glibc have some framework for
> compilation tests? Could someone help me with this issue?
> For now, I've tested this patch locally with GCC 4.8, fresh GCC and fresh
> Clang on my Ubuntu box:
>
> gcc test.c -fsanitize=address -D_FORTIFY_SOURCE=2  -O3 -L${SYSROOT}/usr/lib
> -I${SYSROOT}/include -Wl,-rpath=${SYSROOT}/lib
> -Wl,--dynamic-linker=${SYSROOT}/lib/ld-2.24.90.so  -S
> In file included from
> /home/max/install/glibc//include/bits/libc-header-start.h:33:0,
>                  from /home/max/install/glibc//include/stdio.h:28,
>                  from test.c:1:
> /home/max/install/glibc//include/features.h:374:3: warning: #warning
> _FORTIFY_SOURCE is not compatible with sanitizer [-Wcpp]
>  # warning _FORTIFY_SOURCE is not compatible with sanitizer
>
>
> ~/install/master/bin/gcc test.c -fsanitize=address -D_FORTIFY_SOURCE=2  -O3
> -L${SYSROOT}/usr/lib -I${SYSROOT}/include -Wl,-rpath=${SYSROOT}/lib
> -Wl,--dynamic-linker=${SYSROOT}/lib/ld-2.24.90.so  -S
> In file included from
> /home/max/install/glibc//include/bits/libc-header-start.h:33:0,
>                  from /home/max/install/glibc//include/stdio.h:28,
>                  from test.c:1:
> /home/max/install/glibc//include/features.h:374:3: warning: #warning
> _FORTIFY_SOURCE is not compatible with sanitizer [-Wcpp]
>  # warning _FORTIFY_SOURCE is not compatible with sanitizer
>
> clang  test.c -fsanitize=address -D_FORTIFY_SOURCE=2  -O3
> -L${SYSROOT}/usr/lib -I${SYSROOT}/include -Wl,-rpath=${SYSROOT}/lib
> -Wl,--dynamic-linker=${SYSROOT}/lib/ld-2.24.90.so
> In file included from test.c:1:
> In file included from /home/max/install/glibc//include/stdio.h:28:
> In file included from
> /home/max/install/glibc//include/bits/libc-header-start.h:33:
> /home/max/install/glibc//include/features.h:374:3: warning: _FORTIFY_SOURCE
> is not compatible with sanitizer [-W#warnings]
> # warning _FORTIFY_SOURCE is not compatible with sanitizer
>   ^
> 1 warning generated.
>
>
> -Maxim
  
Florian Weimer Sept. 6, 2016, 8:39 a.m. UTC | #2
On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
> Hi!
>
> When fortify is used with MSan it will cause MSan false positives.
>
> #include <stdio.h>
> #include <string.h>
> int main()
> {
>         char text[100];
>         sprintf(text, "hello");
>         printf("%lu\n", strlen(text));
> }
>
> % clang test.c -fsanitize=memory   -O3 && ./a.out
> 5
> % clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
> Uninitialized bytes in __interceptor_strlen at offset 0 inside
> [0x7ffe259e4d20, 6)
> ==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x4869cc in main
>
> With ASan, this will not cause false positives, but may case false
> negatives or just confuse people with "wrong" reports when fortify
> catches the error.

Why does the above cause a warning, but this does not, and happily 
prints the undefined array members?

#include <stdio.h>
#include <pwd.h>
#include <unistd.h>
#include <string.h>
#include <err.h>
#include <grp.h>

int main()
{
   struct passwd *pw = getpwuid (getuid ());
   if (pw == NULL)
     err (1, "getpwuid");
   gid_t groups[50];
   int ngroups = 50;
   if (getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups) < 0)
     errx (1, "getgrouplist");
   for (int i = 0; i < 50; ++i)
     printf ("group %d: %lld\n", i + 1, (long long) groups[i]);
}


I find this rather confusing.  -D_FORTIFY_SOURCE=2 does not make a 
difference here.

> Although fortify is good thing as it (and it's enabled by default on
> some major distros e.g. Ubuntu and Gentoo), people still complain about
> {A, M}San vs fortify interaction, see e.g.
> https://github.com/google/sanitizers/issues/689. One possible solution
> would be to extend {A, M}San to support foo_chk() functions, but this
> would increase the complexity of sanitizer tools with quite small
> benefit. Another choice would be to warn users when they compile their
> code with {A, M, T}San and fortify enabled.

At least for Memory Sanitizer, the documentation clearly says that 
*everything* has to be compiled with it.  I read that as as meaning that 
the interceptors are just a kludge.

If you do not intended to implement further interceptors, I expect you 
don't want to implement those for open/openat either, other compile-time 
fortification, or whatever other fortify functions we might add which 
are not directly related to memory safety.  This means that for full 
coverage, a developer would have to compile to test with the different 
sanitizers *and* _FORTIFY_SOURCE.  I'm not sure if that leads to a good 
developer experience.

We could conceivably guard every _chk wrapper (say __sprintf_chk) with

#ifndef __fortify_no___sprintf_chk
…
#endif

.  In sanitizer mode, for those wrappers you have deemed to be 
unnecessary, the compiler would define these macros, so that glibc 
wouldn't install the wrapper.

Another option would be to provide a glibc which has been compiled with 
the required sanitizers, so that most interceptors become unnecessary.

> I've tried to add a testcase for new warning into Glibc testsuite, but
> failed to see how exactly I can do it. Does Glibc have some framework
> for compilation tests?

It does not.  Carlos proposed something earlier this year, but it was 
not able to check for the presence expected warnings.  We need something 
like this for the compile-time checks.

Florian
  
Yuri Gribov Sept. 6, 2016, 8:58 a.m. UTC | #3
On Tue, Sep 6, 2016 at 9:39 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
>>
>> Hi!
>>
>> When fortify is used with MSan it will cause MSan false positives.
>>
>> #include <stdio.h>
>> #include <string.h>
>> int main()
>> {
>>         char text[100];
>>         sprintf(text, "hello");
>>         printf("%lu\n", strlen(text));
>> }
>>
>> % clang test.c -fsanitize=memory   -O3 && ./a.out
>> 5
>> % clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
>> Uninitialized bytes in __interceptor_strlen at offset 0 inside
>> [0x7ffe259e4d20, 6)
>> ==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
>>     #0 0x4869cc in main
>>
>> With ASan, this will not cause false positives, but may case false
>> negatives or just confuse people with "wrong" reports when fortify
>> catches the error.
>
>
> Why does the above cause a warning, but this does not, and happily prints
> the undefined array members?
>
> #include <stdio.h>
> #include <pwd.h>
> #include <unistd.h>
> #include <string.h>
> #include <err.h>
> #include <grp.h>
>
> int main()
> {
>   struct passwd *pw = getpwuid (getuid ());
>   if (pw == NULL)
>     err (1, "getpwuid");
>   gid_t groups[50];
>   int ngroups = 50;
>   if (getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups) < 0)
>     errx (1, "getgrouplist");
>   for (int i = 0; i < 50; ++i)
>     printf ("group %d: %lld\n", i + 1, (long long) groups[i]);
> }
>
> I find this rather confusing.  -D_FORTIFY_SOURCE=2 does not make a
> difference here.
>
>> Although fortify is good thing as it (and it's enabled by default on
>> some major distros e.g. Ubuntu and Gentoo), people still complain about
>> {A, M}San vs fortify interaction, see e.g.
>> https://github.com/google/sanitizers/issues/689. One possible solution
>> would be to extend {A, M}San to support foo_chk() functions, but this
>> would increase the complexity of sanitizer tools with quite small
>> benefit. Another choice would be to warn users when they compile their
>> code with {A, M, T}San and fortify enabled.
>
>
> At least for Memory Sanitizer, the documentation clearly says that
> *everything* has to be compiled with it.  I read that as as meaning that the
> interceptors are just a kludge.

MSan is special in this regard. All other tools (T, A and UBSan)
support separate sanitization as this tremendously simplifies
integration in production environments.

> If you do not intended to implement further interceptors, I expect you don't
> want to implement those for open/openat either, other compile-time
> fortification, or whatever other fortify functions we might add which are
> not directly related to memory safety.  This means that for full coverage, a
> developer would have to compile to test with the different sanitizers *and*
> _FORTIFY_SOURCE.  I'm not sure if that leads to a good developer experience.

Even now developer has to run his program with all sanitizers
separately (MSan, ASan+UBSan, TSan and many still add Valgrind) to get
full coverage so we aren't adding much to the zoo of security tools.

> We could conceivably guard every _chk wrapper (say __sprintf_chk) with
>
> #ifndef __fortify_no___sprintf_chk
> …
> #endif
>
> .  In sanitizer mode, for those wrappers you have deemed to be unnecessary,
> the compiler would define these macros, so that glibc wouldn't install the
> wrapper.

Or just undefine _FORTIFY_SOURCE?

> Another option would be to provide a glibc which has been compiled with the
> required sanitizers, so that most interceptors become unnecessary.

Two important things to consider:
* sanitized Glibc will be slower (potentially much slower) than
current approach (because interceptors are heavily optimized)
* I'm unaware of any plans to implement sanitized Glibc (even though
prototype has been available for 1+ year)

>> I've tried to add a testcase for new warning into Glibc testsuite, but
>> failed to see how exactly I can do it. Does Glibc have some framework
>> for compilation tests?
>
>
> It does not.  Carlos proposed something earlier this year, but it was not
> able to check for the presence expected warnings.  We need something like
> this for the compile-time checks.
>
> Florian
  
Florian Weimer Sept. 6, 2016, 9:13 a.m. UTC | #4
On 09/06/2016 10:58 AM, Yuri Gribov wrote:

>> We could conceivably guard every _chk wrapper (say __sprintf_chk) with
>>
>> #ifndef __fortify_no___sprintf_chk
>> …
>> #endif
>>
>> .  In sanitizer mode, for those wrappers you have deemed to be unnecessary,
>> the compiler would define these macros, so that glibc wouldn't install the
>> wrapper.
>
> Or just undefine _FORTIFY_SOURCE?

Well, but this way, you'll lose the completely unrelated alternative 
implementations which might still be useful because none of the 
sanitizers cover these aspects.

>> Another option would be to provide a glibc which has been compiled with the
>> required sanitizers, so that most interceptors become unnecessary.
>
> Two important things to consider:
> * sanitized Glibc will be slower (potentially much slower) than
> current approach (because interceptors are heavily optimized)

But in turn, you might uncover more bugs (mostly in glibc, but also in 
applications because the interceptors happen to be slightly off in the 
behavior they model).

> * I'm unaware of any plans to implement sanitized Glibc (even though
> prototype has been available for 1+ year)

I hope to get remove the last remaining nested functions soon, if that's 
any help.

Florian
  
Maxim Ostapenko Sept. 6, 2016, 9:16 a.m. UTC | #5
On 06/09/16 11:39, Florian Weimer wrote:
> On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
>> Hi!
>>
>> When fortify is used with MSan it will cause MSan false positives.
>>
>> #include <stdio.h>
>> #include <string.h>
>> int main()
>> {
>>         char text[100];
>>         sprintf(text, "hello");
>>         printf("%lu\n", strlen(text));
>> }
>>
>> % clang test.c -fsanitize=memory   -O3 && ./a.out
>> 5
>> % clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
>> Uninitialized bytes in __interceptor_strlen at offset 0 inside
>> [0x7ffe259e4d20, 6)
>> ==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
>>     #0 0x4869cc in main
>>
>> With ASan, this will not cause false positives, but may case false
>> negatives or just confuse people with "wrong" reports when fortify
>> catches the error.
>
> Why does the above cause a warning, but this does not, and happily 
> prints the undefined array members?
>
> #include <stdio.h>
> #include <pwd.h>
> #include <unistd.h>
> #include <string.h>
> #include <err.h>
> #include <grp.h>
>
> int main()
> {
>   struct passwd *pw = getpwuid (getuid ());
>   if (pw == NULL)
>     err (1, "getpwuid");
>   gid_t groups[50];
>   int ngroups = 50;
>   if (getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups) < 0)
>     errx (1, "getgrouplist");
>   for (int i = 0; i < 50; ++i)
>     printf ("group %d: %lld\n", i + 1, (long long) groups[i]);
> }
>
>
> I find this rather confusing.  -D_FORTIFY_SOURCE=2 does not make a 
> difference here.


I think because MSan doesn't have getgrouplist interceptor.

>
>> Although fortify is good thing as it (and it's enabled by default on
>> some major distros e.g. Ubuntu and Gentoo), people still complain about
>> {A, M}San vs fortify interaction, see e.g.
>> https://github.com/google/sanitizers/issues/689. One possible solution
>> would be to extend {A, M}San to support foo_chk() functions, but this
>> would increase the complexity of sanitizer tools with quite small
>> benefit. Another choice would be to warn users when they compile their
>> code with {A, M, T}San and fortify enabled.
>
> At least for Memory Sanitizer, the documentation clearly says that 
> *everything* has to be compiled with it.  I read that as as meaning 
> that the interceptors are just a kludge.

I thought we should compile everything except Glibc, no? Compiling Glibc 
with MSan would be tricky, that's why it needs interceptors.

>
> If you do not intended to implement further interceptors, I expect you 
> don't want to implement those for open/openat either, other 
> compile-time fortification, or whatever other fortify functions we 
> might add which are not directly related to memory safety.  This means 
> that for full coverage, a developer would have to compile to test with 
> the different sanitizers *and* _FORTIFY_SOURCE.  I'm not sure if that 
> leads to a good developer experience.
>
> We could conceivably guard every _chk wrapper (say __sprintf_chk) with
>
> #ifndef __fortify_no___sprintf_chk
> …
> #endif
>
> .  In sanitizer mode, for those wrappers you have deemed to be 
> unnecessary, the compiler would define these macros, so that glibc 
> wouldn't install the wrapper.
>
> Another option would be to provide a glibc which has been compiled 
> with the required sanitizers, so that most interceptors become 
> unnecessary.
>
>> I've tried to add a testcase for new warning into Glibc testsuite, but
>> failed to see how exactly I can do it. Does Glibc have some framework
>> for compilation tests?
>
> It does not.  Carlos proposed something earlier this year, but it was 
> not able to check for the presence expected warnings.  We need 
> something like this for the compile-time checks.
>
> Florian
>
>
  
Yuri Gribov Sept. 6, 2016, 9:20 a.m. UTC | #6
On Tue, Sep 6, 2016 at 10:13 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 09/06/2016 10:58 AM, Yuri Gribov wrote:
>
>>> We could conceivably guard every _chk wrapper (say __sprintf_chk) with
>>>
>>> #ifndef __fortify_no___sprintf_chk
>>> …
>>> #endif
>>>
>>> .  In sanitizer mode, for those wrappers you have deemed to be
>>> unnecessary,
>>> the compiler would define these macros, so that glibc wouldn't install
>>> the
>>> wrapper.
>>
>>
>> Or just undefine _FORTIFY_SOURCE?
>
>
> Well, but this way, you'll lose the completely unrelated alternative
> implementations which might still be useful because none of the sanitizers
> cover these aspects.

But isn't it the same with Valgrind - it doesn't play nice with
sanitizers (because both tools are far too complicated) but noone
really cares to fix this and just people just perform separate
verification runs.

>>> Another option would be to provide a glibc which has been compiled with
>>> the
>>> required sanitizers, so that most interceptors become unnecessary.
>>
>>
>> Two important things to consider:
>> * sanitized Glibc will be slower (potentially much slower) than
>> current approach (because interceptors are heavily optimized)
>
>
> But in turn, you might uncover more bugs (mostly in glibc, but also in
> applications because the interceptors happen to be slightly off in the
> behavior they model).

Definitely. Sanitized Glibc would be great but it would be alternative
to interceptor-based approach, not a replacement.

>> * I'm unaware of any plans to implement sanitized Glibc (even though
>> prototype has been available for 1+ year)
>
>
> I hope to get remove the last remaining nested functions soon, if that's any
> help.
>
> Florian
  
Florian Weimer Sept. 6, 2016, 9:44 a.m. UTC | #7
On 09/06/2016 11:16 AM, Maxim Ostapenko wrote:

>> At least for Memory Sanitizer, the documentation clearly says that
>> *everything* has to be compiled with it.  I read that as as meaning
>> that the interceptors are just a kludge.
>
> I thought we should compile everything except Glibc, no? Compiling Glibc
> with MSan would be tricky, that's why it needs interceptors.

There are some parts of glibc which cannot be instrumented, like most 
things involved early in process startup.  But there is nothing really 
special about, say, getpwuid, or even strerror.

Florian
  
Florian Weimer Sept. 6, 2016, 9:51 a.m. UTC | #8
On 09/06/2016 11:20 AM, Yuri Gribov wrote:

> But isn't it the same with Valgrind - it doesn't play nice with
> sanitizers (because both tools are far too complicated) but noone
> really cares to fix this and just people just perform separate
> verification runs.

Sure, we can pile workarounds on workarounds, but we get maintenance 
problems (like the dlsym error reporting change which broke ASan earlier 
this year), and misleading results because the interceptors do not 
correctly model the glibc behavior.

For example, the following program has a data race on glibc, but not on 
other libcs, and the interceptors appear to assume the other libc's 
behavior, and -fsanitize=thread does not report anything.

#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <unistd.h>

__attribute__ ((noinline, noclone))
static void
check_value (const char *s, int expected)
{
   int actual;
   if (sscanf (s, "Unknown error %d", &actual) != 1)
     {
       printf ("error: scanf parsing failed: [[%s]]\n", s);
       _exit (1);
     }
   if (actual != expected)
     {
       printf ("error: scanf returned wrong value\n");
       printf ("error:  expected: %d\n", expected);
       printf ("error:  actual:   %d\n", actual);
       _exit (1);
     }
}

static void *
thread_func (void *closure)
{
   int *pval = closure;
   while (true)
     {
       check_value (strerror (*pval), *pval);
     }
   return NULL;
}

int
main ()
{
   int val1000 = 1000;
   pthread_t thr;
   if (pthread_create (&thr, NULL, thread_func, &val1000) != 0)
     abort ();
   int val9999 = 9999;
   thread_func (&val9999);
}


This also applies to the NSS functions (such as getpwuid) and the locale 
handling.  All these data races are currently obscured by the 
interceptors, I think.  We can fix the strerror and getpwuid races with 
more TLS data, but for locale, it's not really an option.

Florian
  
Kostya Serebryany Sept. 9, 2016, 10:35 p.m. UTC | #9
>
> I hope to get remove the last remaining nested functions soon, if that's any
> help.
This will help to compile Glibc with clang.
But to compile Glibc with asan (or other *san) we'll need makefile
surgery (similar to what has been done for profiler support)
  
Kostya Serebryany Sept. 9, 2016, 10:36 p.m. UTC | #10
> I thought we should compile everything except Glibc, no? Compiling Glibc
> with MSan would be tricky, that's why it needs interceptors.

Correct.
  
Florian Weimer Sept. 12, 2016, 9:30 a.m. UTC | #11
On 09/10/2016 12:36 AM, Kostya Serebryany wrote:
>> I thought we should compile everything except Glibc, no? Compiling Glibc
>> with MSan would be tricky, that's why it needs interceptors.
>
> Correct.

Why?  Large parts of glibc are just another library and not that low-level.

Sure, it is quite a bit of work to disentangle the low-level bits from 
those that can be instrumented, but so is writing correct interceptors 
which actually follow glibc behavior and are not merely approximation of 
the core functionality.  (I'm not talking about memset, but high-level 
things like getpwuid or glob.)

Florian
  
Florian Weimer Sept. 12, 2016, 9:36 a.m. UTC | #12
On 09/10/2016 12:34 AM, Kostya Serebryany wrote:

>> I hope to […] remove the last remaining nested functions soon, if that's
>> any help.

> This will help to compile Glibc with clang.
> But to compile Glibc with asan (or other *san) we'll need makefile surgery
> (similar to what has been done for profiler support)

Sure.  I had hoped for some collaboration here, so that we end up with 
something that works for all of us (and our users). :)

Florian
  
Kostya Serebryany Sept. 12, 2016, 6:54 p.m. UTC | #13
On Mon, Sep 12, 2016 at 2:30 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 09/10/2016 12:36 AM, Kostya Serebryany wrote:
>>>
>>> I thought we should compile everything except Glibc, no? Compiling Glibc
>>> with MSan would be tricky, that's why it needs interceptors.
>>
>>
>> Correct.
>
>
> Why?  Large parts of glibc are just another library and not that low-level.

[re-sending in plain text]
I don't say the current situation is ideal.
It is as it is because we simply can't build Glibc with clang and msan today.

>
> Sure, it is quite a bit of work to disentangle the low-level bits from those
> that can be instrumented, but so is writing correct interceptors which
> actually follow glibc behavior and are not merely approximation of the core
> functionality.  (I'm not talking about memset, but high-level things like
> getpwuid or glob.)
>
> Florian
>
  
Yuri Gribov Sept. 17, 2016, 9 a.m. UTC | #14
On Tue, Sep 6, 2016 at 9:39 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
>>
>> Hi!
>>
>> When fortify is used with MSan it will cause MSan false positives.
>>
>> #include <stdio.h>
>> #include <string.h>
>> int main()
>> {
>>         char text[100];
>>         sprintf(text, "hello");
>>         printf("%lu\n", strlen(text));
>> }
>>
>> % clang test.c -fsanitize=memory   -O3 && ./a.out
>> 5
>> % clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
>> Uninitialized bytes in __interceptor_strlen at offset 0 inside
>> [0x7ffe259e4d20, 6)
>> ==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
>>     #0 0x4869cc in main
>>
>> With ASan, this will not cause false positives, but may case false
>> negatives or just confuse people with "wrong" reports when fortify
>> catches the error.
>
>
> Why does the above cause a warning, but this does not, and happily prints
> the undefined array members?
>
> #include <stdio.h>
> #include <pwd.h>
> #include <unistd.h>
> #include <string.h>
> #include <err.h>
> #include <grp.h>
>
> int main()
> {
>   struct passwd *pw = getpwuid (getuid ());
>   if (pw == NULL)
>     err (1, "getpwuid");
>   gid_t groups[50];
>   int ngroups = 50;
>   if (getgrouplist (pw->pw_name, pw->pw_gid, groups, &ngroups) < 0)
>     errx (1, "getgrouplist");
>   for (int i = 0; i < 50; ++i)
>     printf ("group %d: %lld\n", i + 1, (long long) groups[i]);
> }
>
>
> I find this rather confusing.  -D_FORTIFY_SOURCE=2 does not make a
> difference here.
>
>> Although fortify is good thing as it (and it's enabled by default on
>> some major distros e.g. Ubuntu and Gentoo), people still complain about
>> {A, M}San vs fortify interaction, see e.g.
>> https://github.com/google/sanitizers/issues/689. One possible solution
>> would be to extend {A, M}San to support foo_chk() functions, but this
>> would increase the complexity of sanitizer tools with quite small
>> benefit. Another choice would be to warn users when they compile their
>> code with {A, M, T}San and fortify enabled.
>
>
> At least for Memory Sanitizer, the documentation clearly says that
> *everything* has to be compiled with it.  I read that as as meaning that the
> interceptors are just a kludge.
>
> If you do not intended to implement further interceptors, I expect you don't
> want to implement those for open/openat either, other compile-time
> fortification, or whatever other fortify functions we might add which are
> not directly related to memory safety.  This means that for full coverage, a
> developer would have to compile to test with the different sanitizers *and*
> _FORTIFY_SOURCE.  I'm not sure if that leads to a good developer experience.
>
> We could conceivably guard every _chk wrapper (say __sprintf_chk) with
>
> #ifndef __fortify_no___sprintf_chk
> …
> #endif
>
> .  In sanitizer mode, for those wrappers you have deemed to be unnecessary,
> the compiler would define these macros, so that glibc wouldn't install the
> wrapper.

Hi Florian,

Would the above approach be accepted for trunk? The reason for me
pushing this is because FORTIFY_SOURCE is now enabled by default in
major distros and this start to be a detrimental factor for ASan
efficiency (there are already 3 open bugs related to this in tracker
and they keep coming).

Sanitized Glibc may be a solution for some users (although I suspect
many users would keep using current interceptor-based approach as it
less invasive and much easier to integrate) but I'm not aware of
anyone working or planning to start work on it in near future.

> Another option would be to provide a glibc which has been compiled with the
> required sanitizers, so that most interceptors become unnecessary.
>
>> I've tried to add a testcase for new warning into Glibc testsuite, but
>> failed to see how exactly I can do it. Does Glibc have some framework
>> for compilation tests?
>
>
> It does not.  Carlos proposed something earlier this year, but it was not
> able to check for the presence expected warnings.  We need something like
> this for the compile-time checks.
>
> Florian
  
Florian Weimer Sept. 29, 2016, 8:08 a.m. UTC | #15
* Yuri Gribov:

> Would the above approach be accepted for trunk? The reason for me
> pushing this is because FORTIFY_SOURCE is now enabled by default in
> major distros and this start to be a detrimental factor for ASan
> efficiency (there are already 3 open bugs related to this in tracker
> and they keep coming).

We have received a related feature request:

  <https://sourceware.org/bugzilla/show_bug.cgi?id=20644>

Do the sanitizers depend on direct calls to the interceptors from
application code, or can we add an indirection which has been compiled
*without* sanitizer support?

If the indirection is acceptable, we could perhaps provided a DSO
which maps back the fortify wrappers to the unfortified versions.
libasan could link against that, for valgrind, it could be preloaded.

memstop could use this as well:

  <https://bugzilla.redhat.com/show_bug.cgi?id=1034894>

The advantage of the unfortify library is that it keeps the knowledge
about fortify wrappers in glibc.
  
Yuri Gribov Sept. 29, 2016, 9:47 a.m. UTC | #16
On Thu, Sep 29, 2016 at 9:08 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Yuri Gribov:
>
>> Would the above approach be accepted for trunk? The reason for me
>> pushing this is because FORTIFY_SOURCE is now enabled by default in
>> major distros and this start to be a detrimental factor for ASan
>> efficiency (there are already 3 open bugs related to this in tracker
>> and they keep coming).
>
> We have received a related feature request:
>
>   <https://sourceware.org/bugzilla/show_bug.cgi?id=20644>
>
> Do the sanitizers depend on direct calls to the interceptors from
> application code, or can we add an indirection which has been compiled
> *without* sanitizer support?

Cc Jakub.

No, there is not dependency on direct calls so indirection should work fine.

> If the indirection is acceptable, we could perhaps provided a DSO
> which maps back the fortify wrappers to the unfortified versions.
> libasan could link against that, for valgrind, it could be preloaded.

Disabling Glibc fortification through indirection should generally
work (Kostya, Jakub -please comment if it's not). Actually it would be
even better than asking users to disable Fortify when building with
Asan because the tool will then be able to intercept calls to
fortified functions from unsanitized parts of the program (and detect
errors in such calls).

I'm just wondering if there are less invasive solutions e.g. providing
a runtime Glibc setting (e.g. through environment variable) to disable
fortification at startup? This shouldn't introduce significant
performance penalty.

> memstop could use this as well:
>
>   <https://bugzilla.redhat.com/show_bug.cgi?id=1034894>
>
> The advantage of the unfortify library is that it keeps the knowledge
> about fortify wrappers in glibc.

Yes, duplicating this information in all existing instrumentation
tools would be inefficient and error-prone.

-Y
  
Jakub Jelinek Sept. 29, 2016, 10:04 a.m. UTC | #17
On Thu, Sep 29, 2016 at 10:47:28AM +0100, Yuri Gribov wrote:
> > Do the sanitizers depend on direct calls to the interceptors from
> > application code, or can we add an indirection which has been compiled
> > *without* sanitizer support?
> 
> Cc Jakub.
> 
> No, there is not dependency on direct calls so indirection should work fine.
> 
> > If the indirection is acceptable, we could perhaps provided a DSO
> > which maps back the fortify wrappers to the unfortified versions.
> > libasan could link against that, for valgrind, it could be preloaded.
> 
> Disabling Glibc fortification through indirection should generally
> work (Kostya, Jakub -please comment if it's not). Actually it would be
> even better than asking users to disable Fortify when building with
> Asan because the tool will then be able to intercept calls to
> fortified functions from unsanitized parts of the program (and detect
> errors in such calls).
> 
> I'm just wondering if there are less invasive solutions e.g. providing
> a runtime Glibc setting (e.g. through environment variable) to disable
> fortification at startup? This shouldn't introduce significant
> performance penalty.

Ugh, so you want to slow down the performance critical common path of some
of the most often used functions just to avoid adding small code to the
sanitizers and valgrind?

> 
> > memstop could use this as well:
> >
> >   <https://bugzilla.redhat.com/show_bug.cgi?id=1034894>
> >
> > The advantage of the unfortify library is that it keeps the knowledge
> > about fortify wrappers in glibc.
> 
> Yes, duplicating this information in all existing instrumentation
> tools would be inefficient and error-prone.

Why?  There are not many of the __*_chk entrypoints, and it is fairly easy
to handle them.  E.g. asan has hundreds to thousands of wrappers, and most
of the *_chk entrypoints could be easily macrofied and the tool (asan,
valgrind) can then decide what it prefers to do with them (ignore the __bos
supplied argument, or add another diagnostics for the UB, etc.).

	Jakub
  
Yuri Gribov Sept. 29, 2016, 10:32 a.m. UTC | #18
On Thu, Sep 29, 2016 at 11:04 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Sep 29, 2016 at 10:47:28AM +0100, Yuri Gribov wrote:
>> > Do the sanitizers depend on direct calls to the interceptors from
>> > application code, or can we add an indirection which has been compiled
>> > *without* sanitizer support?
>>
>> Cc Jakub.
>>
>> No, there is not dependency on direct calls so indirection should work fine.
>>
>> > If the indirection is acceptable, we could perhaps provided a DSO
>> > which maps back the fortify wrappers to the unfortified versions.
>> > libasan could link against that, for valgrind, it could be preloaded.
>>
>> Disabling Glibc fortification through indirection should generally
>> work (Kostya, Jakub -please comment if it's not). Actually it would be
>> even better than asking users to disable Fortify when building with
>> Asan because the tool will then be able to intercept calls to
>> fortified functions from unsanitized parts of the program (and detect
>> errors in such calls).
>>
>> I'm just wondering if there are less invasive solutions e.g. providing
>> a runtime Glibc setting (e.g. through environment variable) to disable
>> fortification at startup? This shouldn't introduce significant
>> performance penalty.
>
> Ugh, so you want to slow down the performance critical common path of some
> of the most often used functions just to avoid adding small code to the
> sanitizers and valgrind?

Well, I think it's not just ASan and valgrind but rather all (or most
of) dynamic instrumentation tools (TSan, MSan, dmalloc, etc.). We
suggest to handle forwarding in one place rather than ask all tools
developers to re-implement it.

Note that Florian's suggestion would not incur penalties in normal
(i.e. non-sanitized) case and sanitized executables will suffer one
additional indirect jump per call.

>> > memstop could use this as well:
>> >
>> >   <https://bugzilla.redhat.com/show_bug.cgi?id=1034894>
>> >
>> > The advantage of the unfortify library is that it keeps the knowledge
>> > about fortify wrappers in glibc.
>>
>> Yes, duplicating this information in all existing instrumentation
>> tools would be inefficient and error-prone.
>
> Why?  There are not many of the __*_chk entrypoints, and it is fairly easy
> to handle them.

I can only access 2.12 now (yeah...) but it has 90+ *_chk APIs. Surely
not all are necessary but that's still a lot of new interceptors.

> E.g. asan has hundreds to thousands of wrappers, and most
> of the *_chk entrypoints could be easily macrofied and the tool (asan,
> valgrind) can then decide what it prefers to do with them (ignore the __bos
> supplied argument, or add another diagnostics for the UB, etc.).

It's not just about implementing them but also synchronizing when
Glibc adds new fortified implementations or changes their interfaces.

-Y
  
Jakub Jelinek Sept. 29, 2016, 10:44 a.m. UTC | #19
On Thu, Sep 29, 2016 at 11:32:19AM +0100, Yuri Gribov wrote:
> > Ugh, so you want to slow down the performance critical common path of some
> > of the most often used functions just to avoid adding small code to the
> > sanitizers and valgrind?
> 
> Well, I think it's not just ASan and valgrind but rather all (or most
> of) dynamic instrumentation tools (TSan, MSan, dmalloc, etc.). We
> suggest to handle forwarding in one place rather than ask all tools
> developers to re-implement it.

ASan/TSan/MSan can all share the same implementation, like they share
various other wrappers.  And the needs of the different tools are so
different that adding a preload DSO that just ignores it will not help at
all, usually you really don't want to ignore it, but complain using your
tools diagnostic framework, for that you need to know what argument is
objsz, against what it is compared (some other argument, strlen of some
argument, something else) etc.

> > Why?  There are not many of the __*_chk entrypoints, and it is fairly easy
> > to handle them.
> 
> I can only access 2.12 now (yeah...) but it has 90+ *_chk APIs. Surely
> not all are necessary but that's still a lot of new interceptors.

So what?  In most cases with sufficient macros you'll need just one or three
lines for each.  Basically return type, function name, arguments and their
types, how they map to the underlying non-fortified fn, what is the objsz
argument and against what it is compared.
BTW, I only count 80 on my box with glibc 2.21:
nm -D /lib64/libc.so.6 | awk '/_chk/{print $3}' | sort -u | wc -l

> > E.g. asan has hundreds to thousands of wrappers, and most
> > of the *_chk entrypoints could be easily macrofied and the tool (asan,
> > valgrind) can then decide what it prefers to do with them (ignore the __bos
> > supplied argument, or add another diagnostics for the UB, etc.).
> 
> It's not just about implementing them but also synchronizing when
> Glibc adds new fortified implementations or changes their interfaces.

It adds them very rarely, and never changes, it has to maintain the ABI.

	Jakub
  
Andrew Pinski Sept. 29, 2016, 10:52 a.m. UTC | #20
On Thu, Sep 29, 2016 at 6:44 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Sep 29, 2016 at 11:32:19AM +0100, Yuri Gribov wrote:
>> > Ugh, so you want to slow down the performance critical common path of some
>> > of the most often used functions just to avoid adding small code to the
>> > sanitizers and valgrind?
>>
>> Well, I think it's not just ASan and valgrind but rather all (or most
>> of) dynamic instrumentation tools (TSan, MSan, dmalloc, etc.). We
>> suggest to handle forwarding in one place rather than ask all tools
>> developers to re-implement it.
>
> ASan/TSan/MSan can all share the same implementation, like they share
> various other wrappers.  And the needs of the different tools are so
> different that adding a preload DSO that just ignores it will not help at
> all, usually you really don't want to ignore it, but complain using your
> tools diagnostic framework, for that you need to know what argument is
> objsz, against what it is compared (some other argument, strlen of some
> argument, something else) etc.

Plus the other argument doing it not in glibc is that some distros are
not going to support this until much later so it will be still broken
on those.  Plus what about FreeBSD, etc.

>
>> > Why?  There are not many of the __*_chk entrypoints, and it is fairly easy
>> > to handle them.
>>
>> I can only access 2.12 now (yeah...) but it has 90+ *_chk APIs. Surely
>> not all are necessary but that's still a lot of new interceptors.
>
> So what?  In most cases with sufficient macros you'll need just one or three
> lines for each.  Basically return type, function name, arguments and their
> types, how they map to the underlying non-fortified fn, what is the objsz
> argument and against what it is compared.
> BTW, I only count 80 on my box with glibc 2.21:
> nm -D /lib64/libc.so.6 | awk '/_chk/{print $3}' | sort -u | wc -l

And for 2.23 on ARM64 I get only 81 (well 80 if you don't count
__chk_fail which the above command still counts).
In one year none was added.


Thanks,
Andrew Pinski

>
>> > E.g. asan has hundreds to thousands of wrappers, and most
>> > of the *_chk entrypoints could be easily macrofied and the tool (asan,
>> > valgrind) can then decide what it prefers to do with them (ignore the __bos
>> > supplied argument, or add another diagnostics for the UB, etc.).
>>
>> It's not just about implementing them but also synchronizing when
>> Glibc adds new fortified implementations or changes their interfaces.
>
> It adds them very rarely, and never changes, it has to maintain the ABI.
>
>         Jakub
  
Kostya Serebryany Sept. 29, 2016, 9:23 p.m. UTC | #21
80 interceptors to support *san and fortification is 80 too many, IMHO.
The fact that other pre-compiled libraries use fortify by default is very sad.
I think this is a clear case of misuse of fortify because now users of
the library can't opt out.

If someone is willing to provide a patch to sanitizers that is
  * in sanitizer_common,
  * uses a separate file for all of these 80 functions,
  * does not touch any other file (in a significant way, at least)
  * has tests
I'll most likely accept it.

(Yuri, thanks for all this work!  )

--kcc


On Thu, Sep 29, 2016 at 3:52 AM, Andrew Pinski <pinskia@gmail.com> wrote:
> On Thu, Sep 29, 2016 at 6:44 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Thu, Sep 29, 2016 at 11:32:19AM +0100, Yuri Gribov wrote:
>>> > Ugh, so you want to slow down the performance critical common path of some
>>> > of the most often used functions just to avoid adding small code to the
>>> > sanitizers and valgrind?
>>>
>>> Well, I think it's not just ASan and valgrind but rather all (or most
>>> of) dynamic instrumentation tools (TSan, MSan, dmalloc, etc.). We
>>> suggest to handle forwarding in one place rather than ask all tools
>>> developers to re-implement it.
>>
>> ASan/TSan/MSan can all share the same implementation, like they share
>> various other wrappers.  And the needs of the different tools are so
>> different that adding a preload DSO that just ignores it will not help at
>> all, usually you really don't want to ignore it, but complain using your
>> tools diagnostic framework, for that you need to know what argument is
>> objsz, against what it is compared (some other argument, strlen of some
>> argument, something else) etc.
>
> Plus the other argument doing it not in glibc is that some distros are
> not going to support this until much later so it will be still broken
> on those.  Plus what about FreeBSD, etc.
>
>>
>>> > Why?  There are not many of the __*_chk entrypoints, and it is fairly easy
>>> > to handle them.
>>>
>>> I can only access 2.12 now (yeah...) but it has 90+ *_chk APIs. Surely
>>> not all are necessary but that's still a lot of new interceptors.
>>
>> So what?  In most cases with sufficient macros you'll need just one or three
>> lines for each.  Basically return type, function name, arguments and their
>> types, how they map to the underlying non-fortified fn, what is the objsz
>> argument and against what it is compared.
>> BTW, I only count 80 on my box with glibc 2.21:
>> nm -D /lib64/libc.so.6 | awk '/_chk/{print $3}' | sort -u | wc -l
>
> And for 2.23 on ARM64 I get only 81 (well 80 if you don't count
> __chk_fail which the above command still counts).
> In one year none was added.
>
>
> Thanks,
> Andrew Pinski
>
>>
>>> > E.g. asan has hundreds to thousands of wrappers, and most
>>> > of the *_chk entrypoints could be easily macrofied and the tool (asan,
>>> > valgrind) can then decide what it prefers to do with them (ignore the __bos
>>> > supplied argument, or add another diagnostics for the UB, etc.).
>>>
>>> It's not just about implementing them but also synchronizing when
>>> Glibc adds new fortified implementations or changes their interfaces.
>>
>> It adds them very rarely, and never changes, it has to maintain the ABI.
>>
>>         Jakub
  
Andrew Pinski Oct. 1, 2016, 9:38 p.m. UTC | #22
On Thu, Sep 29, 2016 at 2:23 PM, Kostya Serebryany <kcc@google.com> wrote:
> 80 interceptors to support *san and fortification is 80 too many, IMHO.
> The fact that other pre-compiled libraries use fortify by default is very sad.
> I think this is a clear case of misuse of fortify because now users of
> the library can't opt out.

Why do you think this is very sad?  This is not a misuse of fortify
but rather santizers not adapting to the changing environments and
handling how people are treating security issues now.  Remember 10
years ago there was no such thing as fortification and security was
not treated as a first class citizen.  Now Security is treated as a
first class and people are requiring security before anything else.

Thanks,
Andrew Pinski


>
> If someone is willing to provide a patch to sanitizers that is
>   * in sanitizer_common,
>   * uses a separate file for all of these 80 functions,
>   * does not touch any other file (in a significant way, at least)
>   * has tests
> I'll most likely accept it.
>
> (Yuri, thanks for all this work!  )
>
> --kcc
>
>
> On Thu, Sep 29, 2016 at 3:52 AM, Andrew Pinski <pinskia@gmail.com> wrote:
>> On Thu, Sep 29, 2016 at 6:44 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Thu, Sep 29, 2016 at 11:32:19AM +0100, Yuri Gribov wrote:
>>>> > Ugh, so you want to slow down the performance critical common path of some
>>>> > of the most often used functions just to avoid adding small code to the
>>>> > sanitizers and valgrind?
>>>>
>>>> Well, I think it's not just ASan and valgrind but rather all (or most
>>>> of) dynamic instrumentation tools (TSan, MSan, dmalloc, etc.). We
>>>> suggest to handle forwarding in one place rather than ask all tools
>>>> developers to re-implement it.
>>>
>>> ASan/TSan/MSan can all share the same implementation, like they share
>>> various other wrappers.  And the needs of the different tools are so
>>> different that adding a preload DSO that just ignores it will not help at
>>> all, usually you really don't want to ignore it, but complain using your
>>> tools diagnostic framework, for that you need to know what argument is
>>> objsz, against what it is compared (some other argument, strlen of some
>>> argument, something else) etc.
>>
>> Plus the other argument doing it not in glibc is that some distros are
>> not going to support this until much later so it will be still broken
>> on those.  Plus what about FreeBSD, etc.
>>
>>>
>>>> > Why?  There are not many of the __*_chk entrypoints, and it is fairly easy
>>>> > to handle them.
>>>>
>>>> I can only access 2.12 now (yeah...) but it has 90+ *_chk APIs. Surely
>>>> not all are necessary but that's still a lot of new interceptors.
>>>
>>> So what?  In most cases with sufficient macros you'll need just one or three
>>> lines for each.  Basically return type, function name, arguments and their
>>> types, how they map to the underlying non-fortified fn, what is the objsz
>>> argument and against what it is compared.
>>> BTW, I only count 80 on my box with glibc 2.21:
>>> nm -D /lib64/libc.so.6 | awk '/_chk/{print $3}' | sort -u | wc -l
>>
>> And for 2.23 on ARM64 I get only 81 (well 80 if you don't count
>> __chk_fail which the above command still counts).
>> In one year none was added.
>>
>>
>> Thanks,
>> Andrew Pinski
>>
>>>
>>>> > E.g. asan has hundreds to thousands of wrappers, and most
>>>> > of the *_chk entrypoints could be easily macrofied and the tool (asan,
>>>> > valgrind) can then decide what it prefers to do with them (ignore the __bos
>>>> > supplied argument, or add another diagnostics for the UB, etc.).
>>>>
>>>> It's not just about implementing them but also synchronizing when
>>>> Glibc adds new fortified implementations or changes their interfaces.
>>>
>>> It adds them very rarely, and never changes, it has to maintain the ABI.
>>>
>>>         Jakub
  
Jakub Jelinek Oct. 1, 2016, 9:50 p.m. UTC | #23
On Sat, Oct 01, 2016 at 02:38:24PM -0700, Andrew Pinski wrote:
> On Thu, Sep 29, 2016 at 2:23 PM, Kostya Serebryany <kcc@google.com> wrote:
> > 80 interceptors to support *san and fortification is 80 too many, IMHO.
> > The fact that other pre-compiled libraries use fortify by default is very sad.
> > I think this is a clear case of misuse of fortify because now users of
> > the library can't opt out.
> 
> Why do you think this is very sad?  This is not a misuse of fortify
> but rather santizers not adapting to the changing environments and
> handling how people are treating security issues now.  Remember 10
> years ago there was no such thing as fortification and security was

Well, -D_FORTIFY_SOURCE=2 has been used heavily already 10 years ago.
But I certainly don't find anything sad on the fact that many programs
are fortified, it is a good thing.

> not treated as a first class citizen.  Now Security is treated as a
> first class and people are requiring security before anything else.

	Jakub
  
Florian Weimer Oct. 2, 2016, 7:51 a.m. UTC | #24
* Kostya Serebryany:

> 80 interceptors to support *san and fortification is 80 too many, IMHO.
> The fact that other pre-compiled libraries use fortify by default is very sad.
> I think this is a clear case of misuse of fortify because now users of
> the library can't opt out.

If we had some libc-unfortify.so DSO, you *could* opt out.

> If someone is willing to provide a patch to sanitizers that is
>   * in sanitizer_common,
>   * uses a separate file for all of these 80 functions,
>   * does not touch any other file (in a significant way, at least)
>   * has tests
> I'll most likely accept it.

Why do you want to put this into sanitizer_common?

It would make more sense to maintain this inside glibc, so that it can
be updated in sync with fortification development.  We can keep an eye
on the ability to build the sources separately from glibc, to bridge
the time until this DSO is routinely shipped as part of glibc.
  
Jakub Jelinek Oct. 2, 2016, 9:39 a.m. UTC | #25
On Sun, Oct 02, 2016 at 09:51:17AM +0200, Florian Weimer wrote:
> * Kostya Serebryany:
> 
> > 80 interceptors to support *san and fortification is 80 too many, IMHO.
> > The fact that other pre-compiled libraries use fortify by default is very sad.
> > I think this is a clear case of misuse of fortify because now users of
> > the library can't opt out.
> 
> If we had some libc-unfortify.so DSO, you *could* opt out.
> 
> > If someone is willing to provide a patch to sanitizers that is
> >   * in sanitizer_common,
> >   * uses a separate file for all of these 80 functions,
> >   * does not touch any other file (in a significant way, at least)
> >   * has tests
> > I'll most likely accept it.
> 
> Why do you want to put this into sanitizer_common?
> 
> It would make more sense to maintain this inside glibc, so that it can
> be updated in sync with fortification development.  We can keep an eye
> on the ability to build the sources separately from glibc, to bridge
> the time until this DSO is routinely shipped as part of glibc.

Because you really don't know what kind of information will each tool want
to know, and that can significantly differ between valgrind, [amt]san etc.
In sanitizer_common, you can come up with some macros that will serve the
needs of all the tools, and have each tool use those macros, other than
that, it is a trivial 3 liner wrapper for each fortification function.

	Jakub
  
Florian Weimer Oct. 2, 2016, 9:43 a.m. UTC | #26
* Jakub Jelinek:

> Because you really don't know what kind of information will each tool want
> to know, and that can significantly differ between valgrind, [amt]san etc.
> In sanitizer_common, you can come up with some macros that will serve the
> needs of all the tools, and have each tool use those macros, other than
> that, it is a trivial 3 liner wrapper for each fortification function.

Uh-oh, would you subject matter experts please come up with a
consistent opinion what is *actually* needed?

Further up thread, in
<CAJOtW+7xjtx=DxEOSnaPfpU708RdUJYLRX8prv0bFW=x47+tmA@mail.gmail.com>,
Yuri Gribov said that the sanitizers will work fine despite the
additional indirection.

If this is not actually true, then of course it does not make sense to
maintain the unfortify bits in glibc.
  
Yuri Gribov Oct. 2, 2016, 2:02 p.m. UTC | #27
On Sun, Oct 2, 2016 at 10:43 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
> * Jakub Jelinek:
>
>> Because you really don't know what kind of information will each tool want
>> to know, and that can significantly differ between valgrind, [amt]san etc.

Frankly I'm not sure why an arbitrary dynamic tool would want to
handle fortified APIs differently from to their unfortified
counterparts. It's certainly not the case for sanitizers.

>> In sanitizer_common, you can come up with some macros that will serve the
>> needs of all the tools, and have each tool use those macros, other than
>> that, it is a trivial 3 liner wrapper for each fortification function.
>
> Uh-oh, would you subject matter experts please come up with a
> consistent opinion what is *actually* needed?
>
> Further up thread, in
> <CAJOtW+7xjtx=DxEOSnaPfpU708RdUJYLRX8prv0bFW=x47+tmA@mail.gmail.com>,
> Yuri Gribov said that the sanitizers will work fine despite the
> additional indirection.
>
> If this is not actually true, then of course it does not make sense to
> maintain the unfortify bits in glibc.

Well, simple dynamic redirection of fortified APIs would be enough for
all sanitizer tools (ASan, MSan, TSan) and probably also Valgrind. I
suggested that it's common across majority of dynamic tools (which
would justify it's centralized implementation, be it in Glibc itself
or outside) but this has been debated by Jakub above.

-Y
  
Kostya Serebryany Oct. 4, 2016, 12:52 a.m. UTC | #28
>> Why do you think this is very sad?  This is not a misuse of fortify
>> but rather santizers not adapting to the changing environments and
>> handling how people are treating security issues now.  Remember 10
>> years ago there was no such thing as fortification and security was
>> not treated as a first class citizen.  Now Security is treated as a
>> first class and people are requiring security before anything else.

>> Well, -D_FORTIFY_SOURCE=2 has been used heavily already 10 years ago.
>> But I certainly don't find anything sad on the fact that many programs
>> are fortified, it is a good thing.

I am sad because the current state causes other security/testing
measures to silently fail.
Fortify is a security measure for corner cases, it does not protect
from 95% of bugs that asan/etc find,
but as we see it actively hurts asan.

An unrelated sadness is caused by popular user attitude, which I've
heard several times:
"I am using Fortify, an advertised security feature, so I am safe. "
(Admittedly, this is not related to the current discussion)

>> Why do you want to put this into sanitizer_common?

I don't! I pretty much hate this idea. But do I have any other choice?

>> It would make more sense to maintain this inside glibc, so that it can
>> be updated in sync with fortification development.  We can keep an eye
>> on the ability to build the sources separately from glibc, to bridge
>> the time until this DSO is routinely shipped as part of glibc.

Yes, please.
More broadly, it would be nice if glibc had some process to test that
the sanitizers still work.
Since sanitizers are largely based on *assumptions* about how glibc
works and not on contracts,
glibc changes has broken *san a few times in the past.
My attempt (suggested by Jakub in some other discussion) to have such
a contract did not end with anything,
https://sourceware.org/glibc/wiki/ThreadPropertiesAPI, so I've never
made another attempt.
I may be asking too much here...

>> Because you really don't know what kind of information will each tool want
>> to know, and that can significantly differ between valgrind, [amt]san etc.
>> In sanitizer_common, you can come up with some macros that will serve the
>> needs of all the tools, and have each tool use those macros, other than
>> that, it is a trivial 3 liner wrapper for each fortification function

Disagree. all the sanitizers will want to be directed to the original functions.
Valgrind most likely too.

>> Frankly I'm not sure why an arbitrary dynamic tool would want to
>> handle fortified APIs differently from to their unfortified
>> counterparts. It's certainly not the case for sanitizers.

Yes.

>> Well, simple dynamic redirection of fortified APIs would be enough for
>> all sanitizer tools (ASan, MSan, TSan) and probably also Valgrind. I
>> suggested that it's common across majority of dynamic tools (which
>> would justify it's centralized implementation, be it in Glibc itself

Yes.


>> or outside) but this has been debated by Jakub above.

... and I frankly don't understand why.

--kcc

On Sun, Oct 2, 2016 at 7:02 AM, Yuri Gribov <tetra2005@gmail.com> wrote:
> On Sun, Oct 2, 2016 at 10:43 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
>> * Jakub Jelinek:
>>
>>> Because you really don't know what kind of information will each tool want
>>> to know, and that can significantly differ between valgrind, [amt]san etc.
>
> Frankly I'm not sure why an arbitrary dynamic tool would want to
> handle fortified APIs differently from to their unfortified
> counterparts. It's certainly not the case for sanitizers.
>
>>> In sanitizer_common, you can come up with some macros that will serve the
>>> needs of all the tools, and have each tool use those macros, other than
>>> that, it is a trivial 3 liner wrapper for each fortification function.
>>
>> Uh-oh, would you subject matter experts please come up with a
>> consistent opinion what is *actually* needed?
>>
>> Further up thread, in
>> <CAJOtW+7xjtx=DxEOSnaPfpU708RdUJYLRX8prv0bFW=x47+tmA@mail.gmail.com>,
>> Yuri Gribov said that the sanitizers will work fine despite the
>> additional indirection.
>>
>> If this is not actually true, then of course it does not make sense to
>> maintain the unfortify bits in glibc.
>
> Well, simple dynamic redirection of fortified APIs would be enough for
> all sanitizer tools (ASan, MSan, TSan) and probably also Valgrind. I
> suggested that it's common across majority of dynamic tools (which
> would justify it's centralized implementation, be it in Glibc itself
> or outside) but this has been debated by Jakub above.
>
> -Y
  
Jakub Jelinek Oct. 4, 2016, 6:46 a.m. UTC | #29
On Mon, Oct 03, 2016 at 05:52:58PM -0700, Kostya Serebryany wrote:
> >> Because you really don't know what kind of information will each tool want
> >> to know, and that can significantly differ between valgrind, [amt]san etc.
> >> In sanitizer_common, you can come up with some macros that will serve the
> >> needs of all the tools, and have each tool use those macros, other than
> >> that, it is a trivial 3 liner wrapper for each fortification function
> 
> Disagree. all the sanitizers will want to be directed to the original functions.
> Valgrind most likely too.

At least in my understanding, valgrind doesn't want that, it wants to be
able to nicely diagnose the error and for that it needs all the information
about the reason.  It has the ability to override calls that don't go through
PLT symbols already.

	Jakub
  
Mark Wielaard Oct. 4, 2016, 12:15 p.m. UTC | #30
On Tue, 2016-10-04 at 08:46 +0200, Jakub Jelinek wrote:
> On Mon, Oct 03, 2016 at 05:52:58PM -0700, Kostya Serebryany wrote:
> > >> Because you really don't know what kind of information will each tool want
> > >> to know, and that can significantly differ between valgrind, [amt]san etc.
> > >> In sanitizer_common, you can come up with some macros that will serve the
> > >> needs of all the tools, and have each tool use those macros, other than
> > >> that, it is a trivial 3 liner wrapper for each fortification function
> > 
> > Disagree. all the sanitizers will want to be directed to the original functions.
> > Valgrind most likely too.
> 
> At least in my understanding, valgrind doesn't want that, it wants to be
> able to nicely diagnose the error and for that it needs all the information
> about the reason.  It has the ability to override calls that don't go through
> PLT symbols already.

I haven't followed this thread but noticed it was mentioned in the bug
report "RFE: valgrind interoperation with fortification"
https://sourceware.org/bugzilla/show_bug.cgi?id=20644

I think Jakub is right, valgrind is fine with fortified code, it will
just be handled like any other code. The fortify checks might catch some
issues that valgrind/memcheck wouldn't because it has stricter bounds
(valgrind/memcheck only knows about memory blocks as a whole, so
wouldn't for example catch some buffer/stack overflows if it sees it as
one continuous block). On the other hand valgrind/memcheck will catch
other issues like using a recently freed block or dependencies not
initialized memory in the code.

So in general fortified code and valgrind memcheck compliment each
other. But valgrind could probably provide some extra information if it
understood why __chk_fail was called. Then it could intercept __chk_fail
and report what it knows about the address that was about to be accessed
(but that fortification prevented, so valgrind doesn't see it). For
example it could report where the block was created (with backtrace),
whether it was recently freed (with backtrace), how far beyond/inside a
block it was, etc.

My proposal was to extend __chk_fail (or introduce a new __chk_fail_addr
function) that provides the address that would have been accessed. Then
valgrind just intercepts __chk_fail and uses that address to provide
some additional information.

This might however be a slightly naive approach. It might create memory
pressure in the fast path because the address needs to be preserved
across the __chk_fail call (although the address was just checked, so is
probably easily available). It also might not provide enough information
(it doesn't say anything about the actual block/bounds that the user
intended to use, we just hope that the address is near enough for
valgrind to deduce the correct information).

Alternatively we might just add intercepts for all _chk functions and
write out the checks and report information ourselves. In theory this
would be easy, maybe a couple of days work since all those functions are
relatively trivial. Which would give much more accurate reports if a
fortification check triggers under valgrind. But there doesn't seem to
be an official list of _chk functions, it seems to involve just grepping
around in the glibc sources. And I am not sure how it would interact
with the gcc builtins
https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
We don't want to have to recompile a program to run under valgrind (the
code you run under valgrind should be what you would run in production
whenever possible). It seems we cannot intercept those builtin_chks and
have to rely on catching/intercepting the __chk_fail call they produce.
Which would call for adding the address/__chk_fail_addr again.

Cheers,

Mark
  
Florian Weimer Oct. 5, 2016, 11:49 a.m. UTC | #31
On 10/04/2016 02:15 PM, Mark Wielaard wrote:
> My proposal was to extend __chk_fail (or introduce a new __chk_fail_addr
> function) that provides the address that would have been accessed. Then
> valgrind just intercepts __chk_fail and uses that address to provide
> some additional information.

There many different reasons why __chk_fail might be called.  For 
example, you get a fortify failure if you call snprintf with a buffer 
size that is larger than what is inferred by the compiler.  This happens 
even if the actual written value fits within the shorter space.  So you 
have a fortify failure without any invalid memory accesses.

Florian
  
Mark Wielaard Oct. 5, 2016, 12:02 p.m. UTC | #32
On Wed, 2016-10-05 at 13:49 +0200, Florian Weimer wrote:
> On 10/04/2016 02:15 PM, Mark Wielaard wrote:
> > My proposal was to extend __chk_fail (or introduce a new __chk_fail_addr
> > function) that provides the address that would have been accessed. Then
> > valgrind just intercepts __chk_fail and uses that address to provide
> > some additional information.
> 
> There many different reasons why __chk_fail might be called.  For 
> example, you get a fortify failure if you call snprintf with a buffer 
> size that is larger than what is inferred by the compiler.  This happens 
> even if the actual written value fits within the shorter space.  So you 
> have a fortify failure without any invalid memory accesses.

Right. And this is where I think valgrind can augment the fortification
check if given the address. You don't want to rely on valgrind/memcheck
noticing an invalid access, because in such cases it might not actually
be. You would call __chk_fail_addr (s + slen) and valgrind can tell you
where the block associated with that pointer came from (backtrace),
whether it was recently freed, how far inside/beyond an existing block
it is, etc.

Cheers,

Mark
  
Florian Weimer Oct. 5, 2016, 2:26 p.m. UTC | #33
On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
> Although fortify is good thing as it (and it's enabled by default on
> some major distros e.g. Ubuntu and Gentoo), people still complain about
> {A, M}San vs fortify interaction, see e.g.
> https://github.com/google/sanitizers/issues/689. One possible solution
> would be to extend {A, M}San to support foo_chk() functions, but this
> would increase the complexity of sanitizer tools with quite small benefit.

I'm not sure if there is a small benefit.  Based on Mark's feedback on 
valgrind's capabilities compared to source fortification, I ran an 
experiment.

Consider this example program:

#include <stdlib.h>
#include <string.h>

struct foo {
   int a;
   char b[4];
   void *c;
};

int
main (int argc, char **argv)
{
   struct foo *p = malloc (sizeof (*p));
   strcpy (p->b, argv[1]);
   asm volatile ("" ::: "memory");
   free (p);
}

Compiled with -D_FORTIFY_SOURCE=2, passing a four-character string yields:

$ ./a.out aaaa
*** buffer overflow detected ***: ./a.out terminated
[…]

But compiling with clang 3.8 and -fsanitize=address (without source 
fortification), I can pass a longer string before Address Sanitizer 
detects the overflow:

$ ./a.out aaaaaaaaaaa
$ ./a.out aaaaaaaaaaaa
=================================================================
==21921==ERROR: AddressSanitizer: heap-buffer-overflow on address 
0x60200000f000 at pc 0x000000488d77 bp 0x7ffe8eb00d50 sp 0x7ffe8eb00500

Depending on the application, intra-object overflows can be quite 
relevant.  Is this perhaps a sufficient case for Address Sanitizer using 
this data as well?

Thanks,
Florian
  
Maxim Ostapenko Oct. 5, 2016, 3:46 p.m. UTC | #34
On 05/10/16 17:26, Florian Weimer wrote:
> On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
>> Although fortify is good thing as it (and it's enabled by default on
>> some major distros e.g. Ubuntu and Gentoo), people still complain about
>> {A, M}San vs fortify interaction, see e.g.
>> https://github.com/google/sanitizers/issues/689. One possible solution
>> would be to extend {A, M}San to support foo_chk() functions, but this
>> would increase the complexity of sanitizer tools with quite small 
>> benefit.
>
> I'm not sure if there is a small benefit.  Based on Mark's feedback on 
> valgrind's capabilities compared to source fortification, I ran an 
> experiment.
>
> Consider this example program:
>
> #include <stdlib.h>
> #include <string.h>
>
> struct foo {
>   int a;
>   char b[4];
>   void *c;
> };
>
> int
> main (int argc, char **argv)
> {
>   struct foo *p = malloc (sizeof (*p));
>   strcpy (p->b, argv[1]);
>   asm volatile ("" ::: "memory");
>   free (p);
> }
>
> Compiled with -D_FORTIFY_SOURCE=2, passing a four-character string 
> yields:
>
> $ ./a.out aaaa
> *** buffer overflow detected ***: ./a.out terminated
> […]
>
> But compiling with clang 3.8 and -fsanitize=address (without source 
> fortification), I can pass a longer string before Address Sanitizer 
> detects the overflow:
>
> $ ./a.out aaaaaaaaaaa
> $ ./a.out aaaaaaaaaaaa
> =================================================================
> ==21921==ERROR: AddressSanitizer: heap-buffer-overflow on address 
> 0x60200000f000 at pc 0x000000488d77 bp 0x7ffe8eb00d50 sp 0x7ffe8eb00500
>
> Depending on the application, intra-object overflows can be quite 
> relevant.  Is this perhaps a sufficient case for Address Sanitizer 
> using this data as well?

Well, this is a valid point, ASan doesn't poison intra objects, but I 
wonder how ASan can reuse it? Perhaps poison dest + bos0(dest) at 
interceptor entry and unpoison at exit?

-Maxim

>
> Thanks,
> Florian
>
>
>
  
Maxim Ostapenko Oct. 5, 2016, 4 p.m. UTC | #35
On 05/10/16 18:46, Maxim Ostapenko wrote:
> On 05/10/16 17:26, Florian Weimer wrote:
>> On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
>>> Although fortify is good thing as it (and it's enabled by default on
>>> some major distros e.g. Ubuntu and Gentoo), people still complain about
>>> {A, M}San vs fortify interaction, see e.g.
>>> https://github.com/google/sanitizers/issues/689. One possible solution
>>> would be to extend {A, M}San to support foo_chk() functions, but this
>>> would increase the complexity of sanitizer tools with quite small 
>>> benefit.
>>
>> I'm not sure if there is a small benefit.  Based on Mark's feedback 
>> on valgrind's capabilities compared to source fortification, I ran an 
>> experiment.
>>
>> Consider this example program:
>>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> struct foo {
>>   int a;
>>   char b[4];
>>   void *c;
>> };
>>
>> int
>> main (int argc, char **argv)
>> {
>>   struct foo *p = malloc (sizeof (*p));
>>   strcpy (p->b, argv[1]);
>>   asm volatile ("" ::: "memory");
>>   free (p);
>> }
>>
>> Compiled with -D_FORTIFY_SOURCE=2, passing a four-character string 
>> yields:
>>
>> $ ./a.out aaaa
>> *** buffer overflow detected ***: ./a.out terminated
>> […]
>>
>> But compiling with clang 3.8 and -fsanitize=address (without source 
>> fortification), I can pass a longer string before Address Sanitizer 
>> detects the overflow:
>>
>> $ ./a.out aaaaaaaaaaa
>> $ ./a.out aaaaaaaaaaaa
>> =================================================================
>> ==21921==ERROR: AddressSanitizer: heap-buffer-overflow on address 
>> 0x60200000f000 at pc 0x000000488d77 bp 0x7ffe8eb00d50 sp 0x7ffe8eb00500
>>
>> Depending on the application, intra-object overflows can be quite 
>> relevant.  Is this perhaps a sufficient case for Address Sanitizer 
>> using this data as well?
>
> Well, this is a valid point, ASan doesn't poison intra objects, but I 
> wonder how ASan can reuse it? Perhaps poison dest + bos0(dest) at 
> interceptor entry and unpoison at exit?

Hm, no, this wouldn't work correctly in concurrent environment. The 
clearest solution would be to forward the call to real _chk function 
(because ASan can't poison memory here and I'm not sure that 
reimplementing fortify logic into sanitizers is acceptable), but why do 
we need interceptor in this case?

>
> -Maxim
>
>>
>> Thanks,
>> Florian
>>
>>
>>
>
  
Kostya Serebryany Oct. 5, 2016, 4:01 p.m. UTC | #36
On Wed, Oct 5, 2016 at 8:46 AM, Maxim Ostapenko <m.ostapenko@samsung.com> wrote:
> On 05/10/16 17:26, Florian Weimer wrote:
>>
>> On 09/05/2016 07:27 PM, Maxim Ostapenko wrote:
>>>
>>> Although fortify is good thing as it (and it's enabled by default on
>>> some major distros e.g. Ubuntu and Gentoo), people still complain about
>>> {A, M}San vs fortify interaction, see e.g.
>>> https://github.com/google/sanitizers/issues/689. One possible solution
>>> would be to extend {A, M}San to support foo_chk() functions, but this
>>> would increase the complexity of sanitizer tools with quite small
>>> benefit.
>>
>>
>> I'm not sure if there is a small benefit.  Based on Mark's feedback on
>> valgrind's capabilities compared to source fortification, I ran an
>> experiment.
>>
>> Consider this example program:
>>
>> #include <stdlib.h>
>> #include <string.h>
>>
>> struct foo {
>>   int a;
>>   char b[4];
>>   void *c;
>> };
>>
>> int
>> main (int argc, char **argv)
>> {
>>   struct foo *p = malloc (sizeof (*p));
>>   strcpy (p->b, argv[1]);
>>   asm volatile ("" ::: "memory");
>>   free (p);
>> }
>>
>> Compiled with -D_FORTIFY_SOURCE=2, passing a four-character string yields:
>>
>> $ ./a.out aaaa
>> *** buffer overflow detected ***: ./a.out terminated
>> […]
>>
>> But compiling with clang 3.8 and -fsanitize=address (without source
>> fortification), I can pass a longer string before Address Sanitizer detects
>> the overflow:
>>
>> $ ./a.out aaaaaaaaaaa
>> $ ./a.out aaaaaaaaaaaa
>> =================================================================
>> ==21921==ERROR: AddressSanitizer: heap-buffer-overflow on address
>> 0x60200000f000 at pc 0x000000488d77 bp 0x7ffe8eb00d50 sp 0x7ffe8eb00500
>>
>> Depending on the application, intra-object overflows can be quite
>> relevant.  Is this perhaps a sufficient case for Address Sanitizer using
>> this data as well?

Indeed, asan does not handle intra-object overflow, good point.
(fortify also doesn't handle the general case, only when a libc
function is involved and the bounds are known at compile-time)

asan handles all kinds of signals and prints stack traces for them, so
fortify+asan could print a nice stack trace when fortify finds
intra-object overflow.
Also, using __asan_describe_address fortify could print what asan
knows about the address
(e.g. N bytes inside M-byte object allocated at <STACK_TRACE>).
Makes sense.

Now the question where to do this.
To me it sounds more like a fortify extension.



>
>
> Well, this is a valid point, ASan doesn't poison intra objects, but I wonder
> how ASan can reuse it? Perhaps poison dest + bos0(dest) at interceptor entry
> and unpoison at exit?

This would be racy.

--kcc
>
> -Maxim
>
>>
>> Thanks,
>> Florian
>>
>>
>>
>
  
Zack Weinberg Oct. 5, 2016, 4:06 p.m. UTC | #37
On Mon, Sep 5, 2016 at 1:27 PM, Maxim Ostapenko <m.ostapenko@samsung.com> wrote:
> When fortify is used with MSan it will cause MSan false positives.

I feel like this discussion has gone way into the weeds.  Your
original problem report ...

> #include <stdio.h>
> #include <string.h>
> int main()
> {
>         char text[100];
>         sprintf(text, "hello");
>         printf("%lu\n", strlen(text));
> }
>
> % clang test.c -fsanitize=memory   -O3 && ./a.out
> 5
> % clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
> Uninitialized bytes in __interceptor_strlen at offset 0 inside
> [0x7ffe259e4d20, 6)
> ==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x4869cc in main

... appears to me to be a plain old bug.  Either the fortify shims are
actually using an uninitialized value, in which case they should be
fixed, or MSan has misunderstood the code generated in _FORTIFY_SOURCE
mode, in which case MSan should be fixed.

You understand what is going on better than anyone else here, I think
- can you please write up a detailed description of exactly why this
goes wrong?

zw
  
Kostya Serebryany Oct. 5, 2016, 4:10 p.m. UTC | #38
On Wed, Oct 5, 2016 at 9:06 AM, Zack Weinberg <zackw@panix.com> wrote:
> On Mon, Sep 5, 2016 at 1:27 PM, Maxim Ostapenko <m.ostapenko@samsung.com> wrote:
>> When fortify is used with MSan it will cause MSan false positives.
>
> I feel like this discussion has gone way into the weeds.  Your
> original problem report ...
>
>> #include <stdio.h>
>> #include <string.h>
>> int main()
>> {
>>         char text[100];
>>         sprintf(text, "hello");
>>         printf("%lu\n", strlen(text));
>> }
>>
>> % clang test.c -fsanitize=memory   -O3 && ./a.out
>> 5
>> % clang test.c -fsanitize=memory -D_FORTIFY_SOURCE=2  -O3 && ./a.out
>> Uninitialized bytes in __interceptor_strlen at offset 0 inside
>> [0x7ffe259e4d20, 6)
>> ==26297==WARNING: MemorySanitizer: use-of-uninitialized-value
>>     #0 0x4869cc in main
>
> ... appears to me to be a plain old bug.  Either the fortify shims are
> actually using an uninitialized value, in which case they should be
> fixed, or MSan has misunderstood the code generated in _FORTIFY_SOURCE
> mode, in which case MSan should be fixed.
>
> You understand what is going on better than anyone else here, I think
> - can you please write up a detailed description of exactly why this
> goes wrong?

* fortify replaces sprintf with sprintf_chk
* msan does not know about sprintf_chk
* sprintf_chk initializes 'text' but msan does not know that
* in strlen(text) msan thinks that 'text' has uninitialized bits.


>
> zw
  
Zack Weinberg Oct. 5, 2016, 4:46 p.m. UTC | #39
On Wed, Oct 5, 2016 at 12:10 PM, Kostya Serebryany <kcc@google.com> wrote:
> On Wed, Oct 5, 2016 at 9:06 AM, Zack Weinberg <zackw@panix.com> wrote:
>> On Mon, Sep 5, 2016 at 1:27 PM, Maxim Ostapenko <m.ostapenko@samsung.com> wrote:
>>> When fortify is used with MSan it will cause MSan false positives.
>> ... appears to me to be a plain old bug.  Either the fortify shims are
>> actually using an uninitialized value, in which case they should be
>> fixed, or MSan has misunderstood the code generated in _FORTIFY_SOURCE
>> mode, in which case MSan should be fixed.
>>
>> You understand what is going on better than anyone else here, I think
>> - can you please write up a detailed description of exactly why this
>> goes wrong?
>
> * fortify replaces sprintf with sprintf_chk
> * msan does not know about sprintf_chk
> * sprintf_chk initializes 'text' but msan does not know that

OK, that's the bug right there, msan should know that.

Why is it hard to fix that bug?  Why are we instead arguing over
whether we should prevent people from enabling both defensive measures
at the same time?

zw
  
Yuri Gribov Oct. 5, 2016, 5:58 p.m. UTC | #40
On Wed, Oct 5, 2016 at 5:46 PM, Zack Weinberg <zackw@panix.com> wrote:
> On Wed, Oct 5, 2016 at 12:10 PM, Kostya Serebryany <kcc@google.com> wrote:
>> On Wed, Oct 5, 2016 at 9:06 AM, Zack Weinberg <zackw@panix.com> wrote:
>>> On Mon, Sep 5, 2016 at 1:27 PM, Maxim Ostapenko <m.ostapenko@samsung.com> wrote:
>>>> When fortify is used with MSan it will cause MSan false positives.
>>> ... appears to me to be a plain old bug.  Either the fortify shims are
>>> actually using an uninitialized value, in which case they should be
>>> fixed, or MSan has misunderstood the code generated in _FORTIFY_SOURCE
>>> mode, in which case MSan should be fixed.
>>>
>>> You understand what is going on better than anyone else here, I think
>>> - can you please write up a detailed description of exactly why this
>>> goes wrong?
>>
>> * fortify replaces sprintf with sprintf_chk
>> * msan does not know about sprintf_chk
>> * sprintf_chk initializes 'text' but msan does not know that
>
> OK, that's the bug right there, msan should know that.
>
> Why is it hard to fix that bug?  Why are we instead arguing over
> whether we should prevent people from enabling both defensive measures
> at the same time?

Not that it's hard, rather what's the appropriate place to fix it. I
think the discussion has mostly died out so there is no point keep it
going.

> zw
  

Patch

commit b153492ed25b1bc70141566e6644897b89a82e26
Author: Maxim Ostapenko <m.ostapenko@samsung.com>
Date:   Wed Aug 31 15:29:49 2016 +0300

    Do not allow asan/msan/tsan and fortify at the same time.

    When fortify is used with msan it will cause msan false positives. This is
    happening because fortify replaces libc functions (e.g. sprintf) with its own
    variants (__sprintf_chk) and the sanitizers don't know about these variants.
    Supporting fortify in *san makes little sense because fortify does not add
    anything to the sanitizers and it will only increase the complexity. So, the
    better way is to warn the user that the sanitizers and fortify are
    incompatible.

	[BZ #20422]
	* include/features.h (__SANITIZER_ENABLED): New macros. Define it if
	compiler has {A, T, M}San enabled. Warn if __SANITIZER_ENABLED and
	_FORTIFY_SOURCE are both defined.

diff --git a/ChangeLog b/ChangeLog
index d24536b..ee513cd 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@ 
+2016-09-05  Maxim Ostapenko  <m.ostapenko@samsung.com>
+
+	[BZ #20422]
+	* include/features.h (__SANITIZER_ENABLED): New macros. Define it if
+	compiler has {A, T, M}San enabled. Warn if __SANITIZER_ENABLED and
+	_FORTIFY_SOURCE are both defined.
+
 2016-08-30  Svante Signell  <svante.signell@gmail.com>
 
 	* sysdeps/mach/hurd/adjtime.c (__adjtime): When OLDDELTA is NULL, make
diff --git a/include/features.h b/include/features.h
index 650d4c5..1158e5a 100644
--- a/include/features.h
+++ b/include/features.h
@@ -355,11 +355,23 @@ 
 # define __USE_REENTRANT	1
 #endif
 
+#if !defined(__has_feature)
+# define __has_feature(x) 0
+#endif
+
+#if defined __SANITIZE_ADDRESS__ || defined __SANITIZE_THREAD__		    \
+    || __has_feature(address_sanitizer) || __has_feature(thread_sanitizer)  \
+    || __has_feature(memory_sanitizer)
+# define __SANITIZER_ENABLED 1
+#endif
+
 #if defined _FORTIFY_SOURCE && _FORTIFY_SOURCE > 0
 # if !defined __OPTIMIZE__ || __OPTIMIZE__ <= 0
 #  warning _FORTIFY_SOURCE requires compiling with optimization (-O)
 # elif !__GNUC_PREREQ (4, 1)
 #  warning _FORTIFY_SOURCE requires GCC 4.1 or later
+# elif defined __SANITIZER_ENABLED
+# warning _FORTIFY_SOURCE is not compatible with sanitizer
 # elif _FORTIFY_SOURCE > 1
 #  define __USE_FORTIFY_LEVEL 2
 # else