malloc: Abort on heap corruption, without a backtrace [BZ #21754]

Message ID 2fdd6ddc-b59d-c9c7-52f4-4899159e1516@redhat.com
State Committed
Headers

Commit Message

Florian Weimer Aug. 18, 2017, 11:52 a.m. UTC
  I really	 think we should avoid generating backtraces on heap
corruption.  The process is in a precarious state at this point, and
doing more work at this point risks obscuring the root cause of the
corruption or enabling code execution exploits.

When it came to backporting the corrupted arena avoidance patches, we
asked internally if our support folks needed the backtraces, and they
said they could live without it.

The attached patch is just the minimum change required to get going.  We
can do quite a few cleanups afterwards (including removal of the corrupt
arena avoidance logic, which does not work reliable anyway and never can).

Thanks,
Florian
  

Comments

Carlos O'Donell Aug. 18, 2017, 1:52 p.m. UTC | #1
On 08/18/2017 07:52 AM, Florian Weimer wrote:
> I really	 think we should avoid generating backtraces on heap
> corruption.  The process is in a precarious state at this point, and
> doing more work at this point risks obscuring the root cause of the
> corruption or enabling code execution exploits.
> 
> When it came to backporting the corrupted arena avoidance patches, we
> asked internally if our support folks needed the backtraces, and they
> said they could live without it.
> 
> The attached patch is just the minimum change required to get going.  We
> can do quite a few cleanups afterwards (including removal of the corrupt
> arena avoidance logic, which does not work reliable anyway and never can).
> 
> Thanks,
> Florian

This looks good to me, but I would give it a week before you commit to allow
other developers to comment.

It took me a long time of talking to support people to become convinced that
what we provide here is relatively useless. The large-scale SOS reports we
collect at Red Hat are much much more comprehensive than the dump glibc
provides. Your pressure in this area has spurred me to these discussions,
and what I thought was a semi-useful feature in the past is really looking
like an unused security hazard. Thank you for changing my mind :-)

I've suggested a longer worded NEWS entry.

Will you be updating the linux man pages?
 
> malloc-abort.patch
> 
> malloc: Abort on heap corruption, without a backtrace [BZ #21754]
> 
> The stack trace printing caused deadlocks and has been itself been
> targeted by code execution exploits.
> 
> 2017-08-18  Florian Weimer  <fweimer@redhat.com>
> 
> 	[BZ #21754]
> 	* malloc/malloc.c (malloc_printerr): Always terminate the process,
> 	without printing a backtrace.  Do not leak any information in the
> 	error message.
> 	* manual/memory.texi (Heap Consistency Checking): Update.
> 	* manual/tunables.texi (Memory Allocation Tunables): Likewise.
> 
> diff --git a/NEWS b/NEWS
> index 6639633c2b..03282126d6 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -12,6 +12,9 @@ Major new features:
>  * Optimized x86-64 asin, atan2, exp, expf, log, pow, atan, sin and tan
>    with FMA, contributed by Arjan van de Ven and H.J. Lu from Intel.
>  
> +* malloc, free, and related functions no longer print addresses and a stack
> +  backtrace after detecting heap corruption.
> +

Suggest:

* In order to support faster and safer process termination the malloc API family
  of functions will no longer print a failure address and stack backtrace after
  detecting heap corruption. The goal is to minimize the amount of work done
  after corruption is detected and to avoid potential security issues in continued
  process execution. Reducing shutdown time leads to lower overall process restart
  latency, so there is benefit both from a security and performance perspective.

>  Deprecated and removed features, and other changes affecting compatibility:
>  
>  * On GNU/Linux, the obsolete Linux constant PTRACE_SEIZE_DEVEL is no longer
> diff --git a/malloc/malloc.c b/malloc/malloc.c
> index e3ff778113..c158c63b1c 100644
> --- a/malloc/malloc.c
> +++ b/malloc/malloc.c
> @@ -1019,7 +1019,8 @@ static void*  _int_realloc(mstate, mchunkptr, INTERNAL_SIZE_T,
>  static void*  _int_memalign(mstate, size_t, size_t);
>  static void*  _mid_memalign(size_t, size_t, void *);
>  
> -static void malloc_printerr(int action, const char *str, void *ptr, mstate av);
> +static void malloc_printerr(int action, const char *str, void *ptr, mstate av)
> +  __attribute__ ((noreturn));

OK.

>  
>  static void* internal_function mem2mem_check(void *p, size_t sz);
>  static int internal_function top_check(void);
> @@ -5408,24 +5409,8 @@ malloc_printerr (int action, const char *str, void *ptr, mstate ar_ptr)
>    if (ar_ptr)
>      set_arena_corrupt (ar_ptr);
>  
> -  if ((action & 5) == 5)
> -    __libc_message ((action & 2) ? (do_abort | do_backtrace) : do_message,
> -		    "%s\n", str);
> -  else if (action & 1)
> -    {
> -      char buf[2 * sizeof (uintptr_t) + 1];
> -
> -      buf[sizeof (buf) - 1] = '\0';
> -      char *cp = _itoa_word ((uintptr_t) ptr, &buf[sizeof (buf) - 1], 16, 0);
> -      while (cp > buf)
> -        *--cp = '0';
> -
> -      __libc_message ((action & 2) ? (do_abort | do_backtrace) : do_message,
> -		      "*** Error in `%s': %s: 0x%s ***\n",
> -                      __libc_argv[0] ? : "<unknown>", str, cp);
> -    }
> -  else if (action & 2)
> -    abort ();
> +  __libc_message (do_abort, "Fatal glibc error in malloc: %s\n", str);
> +  __builtin_unreachable ();

OK.

>  }
>  
>  /* We need a wrapper function for one of the additions of POSIX.  */
> diff --git a/manual/memory.texi b/manual/memory.texi
> index 82f473806c..13cce7a750 100644
> --- a/manual/memory.texi
> +++ b/manual/memory.texi
> @@ -1309,17 +1309,15 @@ The block was already freed.
>  
>  Another possibility to check for and guard against bugs in the use of
>  @code{malloc}, @code{realloc} and @code{free} is to set the environment
> -variable @code{MALLOC_CHECK_}.  When @code{MALLOC_CHECK_} is set, a
> -special (less efficient) implementation is used which is designed to be
> -tolerant against simple errors, such as double calls of @code{free} with
> -the same argument, or overruns of a single byte (off-by-one bugs).  Not
> -all such errors can be protected against, however, and memory leaks can
> -result.  If @code{MALLOC_CHECK_} is set to @code{0}, any detected heap
> -corruption is silently ignored; if set to @code{1}, a diagnostic is
> -printed on @code{stderr}; if set to @code{2}, @code{abort} is called
> -immediately.  This can be useful because otherwise a crash may happen
> -much later, and the true cause for the problem is then very hard to
> -track down.
> +variable @code{MALLOC_CHECK_}.  When @code{MALLOC_CHECK_} is set to a
> +non-zero value, a special (less efficient) implementation is used which
> +is designed to be tolerant against simple errors, such as double calls
> +of @code{free} with the same argument, or overruns of a single byte
> +(off-by-one bugs).  Not all such errors can be protected against,
> +however, and memory leaks can result.
> +
> +Any detected heap corruption results in immediate termination of the
> +process.

OK.

>  
>  There is one problem with @code{MALLOC_CHECK_}: in SUID or SGID binaries
>  it could possibly be exploited since diverging from the normal programs
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index 3c19567a28..b09e3fe791 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -71,27 +71,13 @@ following tunables in the @code{malloc} namespace:
>  This tunable supersedes the @env{MALLOC_CHECK_} environment variable and is
>  identical in features.
>  
> -Setting this tunable enables a special (less efficient) memory allocator for
> -the malloc family of functions that is designed to be tolerant against simple
> -errors such as double calls of free with the same argument, or overruns of a
> -single byte (off-by-one bugs). Not all such errors can be protected against,
> -however, and memory leaks can result.  The following list describes the values
> -that this tunable can take and the effect they have on malloc functionality:
> -
> -@itemize @bullet
> -@item @code{0} Ignore all errors.  The default allocator continues to be in
> -use, but all errors are silently ignored.
> -@item @code{1} Report errors.  The alternate allocator is selected and heap
> -corruption, if detected, is reported as diagnostic messages to @code{stderr}
> -and the program continues execution.
> -@item @code{2} Abort on errors.  The alternate allocator is selected and if
> -heap corruption is detected, the program is ended immediately by calling
> -@code{abort}.
> -@item @code{3} Fully enabled.  The alternate allocator is selected and is fully
> -functional.  That is, if heap corruption is detected, a verbose diagnostic
> -message is printed to @code{stderr} and the program is ended by calling
> -@code{abort}.
> -@end itemize
> +Setting this tunable to a non-zero value enables a special (less
> +efficient) memory allocator for the malloc family of functions that is
> +designed to be tolerant against simple errors such as double calls of
> +free with the same argument, or overruns of a single byte (off-by-one
> +bugs). Not all such errors can be protected against, however, and memory
> +leaks can result.  Any detected heap corruption results in immediate
> +termination of the process.

OK.

>  
>  Like @env{MALLOC_CHECK_}, @code{glibc.malloc.check} has a problem in that it
>  diverges from normal program behavior by writing to @code{stderr}, which could
  
Paul Eggert Aug. 18, 2017, 3:29 p.m. UTC | #2
Thanks, I like the overall change too. One quibble:

> +  __libc_message (do_abort, "Fatal glibc error in malloc: %s\n", str);

This change to the message wording won't work well, as the user will see 
redundant text like "Fatal glibc error in malloc: malloc(): memory corruption 
(fast)". Also, the wording "glibc error" suggests that there must be a fault in 
glibc, whereas it's more likely that the fault is in the application. I suggest 
changing the above line to:

   __libc_message (do_abort, "%s\n", str);

This is simpler, and this way glibc will keep using the same wording that it's 
been using for a while.
  
Florian Weimer Aug. 30, 2017, 2:44 p.m. UTC | #3
I'm going to push this with the NEWS edit suggested by Carlos, and with
the error message change reverted, per Paul's recommendation.

I still think we need to adopt a more consistent formatting for
process-terminating error messages and document them in the manual, but
this change is not the place for this discussion.

Thanks,
Florian
  

Patch

malloc: Abort on heap corruption, without a backtrace [BZ #21754]

The stack trace printing caused deadlocks and has been itself been
targeted by code execution exploits.

2017-08-18  Florian Weimer  <fweimer@redhat.com>

	[BZ #21754]
	* malloc/malloc.c (malloc_printerr): Always terminate the process,
	without printing a backtrace.  Do not leak any information in the
	error message.
	* manual/memory.texi (Heap Consistency Checking): Update.
	* manual/tunables.texi (Memory Allocation Tunables): Likewise.

diff --git a/NEWS b/NEWS
index 6639633c2b..03282126d6 100644
--- a/NEWS
+++ b/NEWS
@@ -12,6 +12,9 @@  Major new features:
 * Optimized x86-64 asin, atan2, exp, expf, log, pow, atan, sin and tan
   with FMA, contributed by Arjan van de Ven and H.J. Lu from Intel.
 
+* malloc, free, and related functions no longer print addresses and a stack
+  backtrace after detecting heap corruption.
+
 Deprecated and removed features, and other changes affecting compatibility:
 
 * On GNU/Linux, the obsolete Linux constant PTRACE_SEIZE_DEVEL is no longer
diff --git a/malloc/malloc.c b/malloc/malloc.c
index e3ff778113..c158c63b1c 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -1019,7 +1019,8 @@  static void*  _int_realloc(mstate, mchunkptr, INTERNAL_SIZE_T,
 static void*  _int_memalign(mstate, size_t, size_t);
 static void*  _mid_memalign(size_t, size_t, void *);
 
-static void malloc_printerr(int action, const char *str, void *ptr, mstate av);
+static void malloc_printerr(int action, const char *str, void *ptr, mstate av)
+  __attribute__ ((noreturn));
 
 static void* internal_function mem2mem_check(void *p, size_t sz);
 static int internal_function top_check(void);
@@ -5408,24 +5409,8 @@  malloc_printerr (int action, const char *str, void *ptr, mstate ar_ptr)
   if (ar_ptr)
     set_arena_corrupt (ar_ptr);
 
-  if ((action & 5) == 5)
-    __libc_message ((action & 2) ? (do_abort | do_backtrace) : do_message,
-		    "%s\n", str);
-  else if (action & 1)
-    {
-      char buf[2 * sizeof (uintptr_t) + 1];
-
-      buf[sizeof (buf) - 1] = '\0';
-      char *cp = _itoa_word ((uintptr_t) ptr, &buf[sizeof (buf) - 1], 16, 0);
-      while (cp > buf)
-        *--cp = '0';
-
-      __libc_message ((action & 2) ? (do_abort | do_backtrace) : do_message,
-		      "*** Error in `%s': %s: 0x%s ***\n",
-                      __libc_argv[0] ? : "<unknown>", str, cp);
-    }
-  else if (action & 2)
-    abort ();
+  __libc_message (do_abort, "Fatal glibc error in malloc: %s\n", str);
+  __builtin_unreachable ();
 }
 
 /* We need a wrapper function for one of the additions of POSIX.  */
diff --git a/manual/memory.texi b/manual/memory.texi
index 82f473806c..13cce7a750 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -1309,17 +1309,15 @@  The block was already freed.
 
 Another possibility to check for and guard against bugs in the use of
 @code{malloc}, @code{realloc} and @code{free} is to set the environment
-variable @code{MALLOC_CHECK_}.  When @code{MALLOC_CHECK_} is set, a
-special (less efficient) implementation is used which is designed to be
-tolerant against simple errors, such as double calls of @code{free} with
-the same argument, or overruns of a single byte (off-by-one bugs).  Not
-all such errors can be protected against, however, and memory leaks can
-result.  If @code{MALLOC_CHECK_} is set to @code{0}, any detected heap
-corruption is silently ignored; if set to @code{1}, a diagnostic is
-printed on @code{stderr}; if set to @code{2}, @code{abort} is called
-immediately.  This can be useful because otherwise a crash may happen
-much later, and the true cause for the problem is then very hard to
-track down.
+variable @code{MALLOC_CHECK_}.  When @code{MALLOC_CHECK_} is set to a
+non-zero value, a special (less efficient) implementation is used which
+is designed to be tolerant against simple errors, such as double calls
+of @code{free} with the same argument, or overruns of a single byte
+(off-by-one bugs).  Not all such errors can be protected against,
+however, and memory leaks can result.
+
+Any detected heap corruption results in immediate termination of the
+process.
 
 There is one problem with @code{MALLOC_CHECK_}: in SUID or SGID binaries
 it could possibly be exploited since diverging from the normal programs
diff --git a/manual/tunables.texi b/manual/tunables.texi
index 3c19567a28..b09e3fe791 100644
--- a/manual/tunables.texi
+++ b/manual/tunables.texi
@@ -71,27 +71,13 @@  following tunables in the @code{malloc} namespace:
 This tunable supersedes the @env{MALLOC_CHECK_} environment variable and is
 identical in features.
 
-Setting this tunable enables a special (less efficient) memory allocator for
-the malloc family of functions that is designed to be tolerant against simple
-errors such as double calls of free with the same argument, or overruns of a
-single byte (off-by-one bugs). Not all such errors can be protected against,
-however, and memory leaks can result.  The following list describes the values
-that this tunable can take and the effect they have on malloc functionality:
-
-@itemize @bullet
-@item @code{0} Ignore all errors.  The default allocator continues to be in
-use, but all errors are silently ignored.
-@item @code{1} Report errors.  The alternate allocator is selected and heap
-corruption, if detected, is reported as diagnostic messages to @code{stderr}
-and the program continues execution.
-@item @code{2} Abort on errors.  The alternate allocator is selected and if
-heap corruption is detected, the program is ended immediately by calling
-@code{abort}.
-@item @code{3} Fully enabled.  The alternate allocator is selected and is fully
-functional.  That is, if heap corruption is detected, a verbose diagnostic
-message is printed to @code{stderr} and the program is ended by calling
-@code{abort}.
-@end itemize
+Setting this tunable to a non-zero value enables a special (less
+efficient) memory allocator for the malloc family of functions that is
+designed to be tolerant against simple errors such as double calls of
+free with the same argument, or overruns of a single byte (off-by-one
+bugs). Not all such errors can be protected against, however, and memory
+leaks can result.  Any detected heap corruption results in immediate
+termination of the process.
 
 Like @env{MALLOC_CHECK_}, @code{glibc.malloc.check} has a problem in that it
 diverges from normal program behavior by writing to @code{stderr}, which could