malloc tcache: Debugger now sees the address of the corrupted chunk.

Message ID CAHQ-jgjWjknn9TV6vhiB4pwTo2=M8wn4LJN4NdYa7pLTGZ1Tdw@mail.gmail.com
State Changes Requested, archived
Headers
Series malloc tcache: Debugger now sees the address of the corrupted chunk. |

Commit Message

Adder Dec. 7, 2020, 12:57 a.m. UTC
  If the user overwrites the "next" field of the tcache_entry
(i.e. the first 4 or 8 bytes at the address returned by malloc)
illegally (after having freed the chunk),
it is possible (and actually probable, especially on 64-bit platforms)
that the resulting pointer cannot be dereferenced.

Thus, a crash is caused upon a later allocation of similar size
(in our tcache_get function).

But at that point, only the "next" field is available to the debugger
(it has been copied in tcache->entries[tc_idx] and is being dereferenced).
In particular, the address of the corrupted chunk is not available.

To make it available and therefore to ease debugging,
we attempt to dereference (currently for reading) the "next" field earlier,
while the address of the chunk is still available.
---
 malloc/malloc.c | 41 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 40 insertions(+), 1 deletion(-)

+{
+  tcache_entry *const e_next = REVEAL_PTR (e->next);
+  __libc_malloc_tcache_dbg_peek (dbg_tcache, dbg_tc_idx, e, e_next);
+  return e_next;
+}
+
 /* Caller must ensure that we know tc_idx is valid and there's
    available chunks to remove.  */
 static __always_inline void *
@@ -2954,7 +2993,7 @@ tcache_get (size_t tc_idx)
   tcache_entry *e = tcache->entries[tc_idx];
   if (__glibc_unlikely (!aligned_OK (e)))
     malloc_printerr ("malloc(): unaligned tcache chunk detected");
-  tcache->entries[tc_idx] = REVEAL_PTR (e->next);
+  tcache->entries[tc_idx] = tcache_dbg_get_next (tcache, tc_idx, e);
   --(tcache->counts[tc_idx]);
   e->key = NULL;
   return (void *) e;
  

Comments

Adder Dec. 11, 2020, 11:48 p.m. UTC | #1
On Sat, Dec 12, 2020 at 12:51 AM DJ Delorie <dj@redhat.com> wrote:
>
> Adder <adder.thief@gmail.com> writes:
> > But at that point, only the "next" field is available to the debugger
> > (it has been copied in tcache->entries[tc_idx] and is being dereferenced).
> > In particular, the address of the corrupted chunk is not available.
>
> Er, the address of the corrupted chunk is the 'e' value that we're
> returning from that function.  Why can't the debugger see that?

Because the crash occurs on the second (not the first) malloc after corruption.

Example: The crash occurs on the malloc for p9 (not the one for p7):

  free (p1);
  ...

  strcpy (p1, "Wonder!");
  ...

  // We copy the corrupted pointer, ((tcache_entry *) p1)->next,
  // to our array, as tcache->entries [tc_idx].
  // Yes, we now return p1 as e, as pointed out in the question.
  // (And, the user saves it in p7, but I doubt that is part of the question.)
  p7 = malloc (64);
  ...

  // We only crash now, when dereferencing the copy of the corrupted pointer.
  p9 = malloc (64);

(The core file does not generally capture p7 or, equivalently, p1.
The malloc for p7 generally occurs in a nested function call.)

The proposal is to dereference the-corrupted-pointer earlier,
while we still have the-pointer-to-the-corrupted-pointer (as e),
not just the-copy-of-the-corrupted-pointer (as tcache->entries [tc_idx]).
  
Adder Dec. 12, 2020, 5:18 a.m. UTC | #2
On Sat, Dec 12, 2020 at 5:17 AM DJ Delorie <dj@redhat.com> wrote:

> Hmm... OK, I think I get it. It's not the 'e' we know, its the 'e' from
> the previous call to tcache_get().

Yes ! I am so happy !


> So basically, when we remove a chunk from the tcache, we want to
> validate the pointer we're leaving behind?

Correct ! And the result is crash on the first malloc, not the second one.

This has the additional benefit of capturing (in the core file)
the illegally-overwritten-via-p1 contents of the chunk,
without further (legal) write results via p7 (which is equal to p1),
which may occur before the second malloc (which currently crashes).


>  static __always_inline void *
>  tcache_get (size_t tc_idx)
>  {
>    tcache_entry *e = tcache->entries[tc_idx];
>    if (__glibc_unlikely (!aligned_OK (e)))
>      malloc_printerr ("malloc(): unaligned tcache chunk detected");
>    tcache->entries[tc_idx] = REVEAL_PTR (e->next);
> +  /* Validate the pointer we're leaving behind, while we still know
> +     where it came from, in case a use-after-free corrupted it.  */
> +  if (tcache->entries[tc_idx])
> +    * (volatile char **) tcache->entries[tc_idx];
>    --(tcache->counts[tc_idx]);
>    e->key = NULL;
>    return (void *) e;
>  }

This looks cleaner and more efficient than my attempt.
Thank you for the "volatile" tip !

But I am getting:

  malloc.c:2961:5: error: statement with no effect [-Werror=unused-value]
       * (volatile char **) tcache->entries[tc_idx];

Even if I #pragma out the warning, no machine code seems to be generated
(at least for x86_64).

I would like to suggest:

static __always_inline void *
tcache_get (size_t tc_idx)
{
  tcache_entry *e = tcache->entries[tc_idx];
  if (__glibc_unlikely (!aligned_OK (e)))
    malloc_printerr ("malloc(): unaligned tcache chunk detected");

-    tcache->entries[tc_idx] = REVEAL_PTR (e->next);
+  tcache_entry *const e_next = REVEAL_PTR (e->next);
+
+  /* Validate the pointer we're leaving behind, while we still know
+     where it came from, in case a write-after-free has corrupted it.
+     The write operation (via pointer-to-volatile) serves two purposes:
+     it prevents the code from being optimized out
+     and it validates the pointer for writing (not just for reading). */
+  if (e_next)
+    ((volatile tcache_entry *) e_next)->next = e_next->next;

  tcache->entries[tc_idx] = e_next;
  --(tcache->counts[tc_idx]);
  e->key = NULL;
  return (void *) e;
}



It crashes as desired, and we can see the value of e in the core file.

The extra generated machine code is, again,
much simpler than my original version:

  cmp r9, rdi
  je @@new_label_1
  mov rdi, [rax]
  mov [rax], rdi
 @@ new_label_1:
  ...
  
Adder Dec. 13, 2020, 2:16 a.m. UTC | #3
On Sat, Dec 12, 2020 at 7:32 AM DJ Delorie <dj@redhat.com> wrote:
>
> Adder <adder.thief@gmail.com> writes:
> >
> >   malloc.c:2961:5: error: statement with no effect [-Werror=unused-value]
> >        * (volatile char **) tcache->entries[tc_idx];
>
> Sorry, add (void) in front of it:
>
>         (void) * (volatile char **) tcache->entries[tc_idx];

This eliminates the "no effect" warning (=> successful build).
But the desired crash is not produced.
No machine code is generated.

On the bright side, I have just learned to use (void) to avoid a warning.



> The volatile might be in the wrong spot.  Try:
>
>         (void) * (char * volatile *) tcache->entries[tc_idx];

Hurray !

This eliminates the "no effect" warning (=> successful build).

And shuffling the qualifier eliminates the need for casting to void.
I.e. the following also produces no compiler warning:

                 * (char * volatile *) tcache->entries[tc_idx];

The desired crash is produced (=> we have the address of the corrupted chunk).

The generated machine code is lean -- just three extra instructions:

    ...
    cmp rdi, rsi  ; This tests whether r9 is zero. Trust me.
    je @@Passed
    mov rsi, [r9] ; Crash here as desired. r9 is e: '(gdb) print e'.
    @@Passed:
    ...



We can also use the following:

  ((volatile tcache_entry *) tcache->entries[tc_idx])->next;

It produces the exact same machine code, but it is more resilient
to changes in the definition of our tcache_entry struct
(example: to a field being added before our first field, i.e. before "next").



But we may still want to change this to:

    ...
    cmp rdi, rsi  ; This tests whether r9 is zero. Trust me.
    je @@Passed
    mov rsi, [r9] ; Crash here if e is not a pointer.
    mov [r9], rsi ; Crash here if e happens to be an only-allow-reading pointer.
    @@Passed:
    ...

I.e. we may still want to add the extra "mov [r9], rsi" instruction.

This helps us when the user illegally overwrites our tcache_entry's "next" field
with a value which (unluckily) happens to REVEAL_PTR an address in a
present page,
but (luckily) that page is not writable.



We can perform the "Is this page writable ?" test using:

  ((volatile tcache_entry *) e_next)->next = e_next->next;

where "e_next" is the variable I have been trying to introduce
in order to avoid repeated typing of "tcache->entries[tc_idx]":

  tcache_entry *const e_next = REVEAL_PTR (e->next);
  ...
  tcache->entries[tc_idx] = e_next;



Let us add a test which detects, while we still have the important "e",
whether the "key" field in our tcache_entry has been corrupted,
for example after such usage (on a 64-bit platform):

  size_t *const p1 = (size_t *) malloc (8 * sizeof (size_t));
  ...
  free (p1);
  p1 [1] = 0x647952657A696C45;
  ...
  char *const p7 = (char *) malloc (64);

This is useful when p1 [0] is not illegally modified, but p1 [1] is.

Normally, if we detect this corruption, we should call malloc_printerr.
But with all the stack frames resulting from that call,
our important value "e" value is lost to the debugger (at least on x86_64).
Therefore, I suggest we crash right away.



The resulting function is:

 static __always_inline void *
 tcache_get (size_t tc_idx)
 {
   tcache_entry *e = tcache->entries[tc_idx];
+
   if (__glibc_unlikely (!aligned_OK (e)))
     malloc_printerr ("malloc(): unaligned tcache chunk detected");
-  tcache->entries[tc_idx] = REVEAL_PTR (e->next);
+
+  tcache_entry *const e_next = REVEAL_PTR (e->next);
+
+  if (e_next)
+    ((volatile tcache_entry *) e_next)->next = e_next->next;
+
+  if (__glibc_unlikely (e->key != tcache))
+    * ((tcache_entry *volatile *) 0) = e;
+
+  tcache->entries[tc_idx] = e_next;
   --(tcache->counts[tc_idx]);
   e->key = NULL;
   return (void *) e;
 }



I wish to suggest that we quarrel on Comments after we agree on Code. (-:



> The (void) thing is a common idiom.

Thank you for the "(void)" thing and for the "* (volatile T *)" thing.
  
Adder Dec. 17, 2020, 2:08 a.m. UTC | #4
On Thu, Dec 17, 2020 at 12:30 AM DJ Delorie <dj@redhat.com> wrote:
>
> Adder <adder.thief@gmail.com> writes:
>
> > And shuffling the qualifier eliminates the need for casting to void.
>
> The (void) is a common idiom; we should leave it even if it has no
> effect *at the moment*.  It tells future code readers our current
> intentions, and is more resilient to compiler changes as the compiler
> devs know about it too.

I see. Okay.



> > We can also use the following:
> >
> >   ((volatile tcache_entry *) tcache->entries[tc_idx])->next;
> >
> > It produces the exact same machine code, but it is more resilient
> > to changes in the definition of our tcache_entry struct
> > (example: to a field being added before our first field, i.e. before "next").
>
> For validating the "next" pointer, it doesn't matter what the "next"
> tcache_entry struct looks like; it only matters if it's in valid memory.
> Reading the first word of the entry is sufficient.  tcache->entries[] is
> a pointer, e->next is a pointer, either way we need to read *one* word
> to validate that pointer.  We don't need to read the whole struct it
> points to, nor does it matter which word in the struct is read.
>
> I would argue that casting to a non-struct pointer emphasizes that we
> care about the pointer more than what it points to, but a sufficiently
> verbose comment would suffice as well.

After reading the explanation (thanks), I can understand the casts.



> Whether we use e->next or tcache->entries for this purpose is
> irrelevent, I think.  Whichever results in a smaller line of code is
> probably a bit cleaner ;-)

Then does the version without casts win ?  [-:



> >     mov [r9], rsi ; Crash here if e happens to be an only-allow-reading pointer.
>
> Given the size of a 64-bit address space, the odds of randomly hitting a
> read-only page is negligible, compared to the odds of hitting an
> unmapped page, which is very high.  Given that this code is on the fast
> path, I would argue against this as "not useful enough".

I see. I agree about probability.

But perhaps the speed cost is lower than it appears.
I have always been bad at evaluating time.
I am going to think about a way to benchmark this.
Suggestions are warmly welcome.

In the meanwhile, could we also consider the following alternative ?

  if (e_next)
    {
      if (__glibc_unlikely (e_next->key != tcache))
        please_crash_in_a_way_which_allows_debugger_to_print_e ();
    }

Advantages (YMMV):

  - no (volatile T *), no (void), no other cast;
  - easier to understand (even without "corrupted pointer" in mind);
  - only loads, no stores;
  - probability of e_next accidentally being valid and having a good key is low.

Disadvantages:

  - conditional jump;
  - a bit slower than just reading (Really ? Should we benchmark ?).



> > +  if (__glibc_unlikely (e->key != tcache))
> > +    * ((tcache_entry *volatile *) 0) = e;
>
> This is where malloc_printerr should be called.  Even if the data is
> corrupt, this is what we do elsewhere in these cases.

Purpose is to give "e" to the debugger (including the human debugger).
In my testing (on x86_64), adding malloc_printerr here loses "e".

For clarity and consistence with usage of malloc_printerr elsewhere,
I wish to suggest adding a function malloc_printerr_4
which is given the pointer to the string and 4 additional uint64_t args.

It is not necessary to hex-dump the 4 values in the output message
(though it could be done). Just to make them available to the debugger.

Then this could be used to complete other messages
where the address of the corrupted chunk is/might-be lost to the debugger,
e.g. "corrupted size vs. prev_size", "corrupted double-linked list".

(Currently, I am not saying that the address is lost there. To be tested.)



> > I wish to suggest that we quarrel on Comments after we agree on Code. (-:
>
> Perhaps, but good comments will be required ;-)

I agree !

Let us just postpone them for a little bit, if you will, as it helps
me more easily read our emails with versions of Code.
  
Adder Dec. 22, 2020, 2:15 a.m. UTC | #5
On Thu, Dec 17, 2020 at 6:18 AM DJ Delorie <dj@redhat.com> wrote:
>
> Adder <adder.thief@gmail.com> writes:
> > But perhaps the speed cost is lower than it appears.
> > I have always been bad at evaluating time.
> > I am going to think about a way to benchmark this.
> > Suggestions are warmly welcome.
>
> https://developers.redhat.com/blog/2016/03/11/practical-micro-benchmarking-with-ltrace-and-sched/

Thank you ! I just need some time to start applying. I hope to get back on this.



> > In the meanwhile, could we also consider the following alternative ?
> >
> >   if (e_next)
> >     {
> >       if (__glibc_unlikely (e_next->key != tcache))
> >         please_crash_in_a_way_which_allows_debugger_to_print_e ();
> >     }
> >
> > Advantages (YMMV):
> >
> >   - no (volatile T *), no (void), no other cast;
> >   - easier to understand (even without "corrupted pointer" in mind);
> >   - only loads, no stores;
> >   - probability of e_next accidentally being valid and having a good key is low.
>
> Note that it's checking the next ptr in one chunk, and the key in a
> *different* chunk.

Affirmative.

We are validating e->next by looking at "REVEAL_PTR (e->next)->key".

This can *crash* if e->next points within a page we cannot read.

Otherwise, the test can still (correctly) *fail*
in a large number of cases when the pointer has been overwritten
(and just happens to REVEAL_PTR a readable page).

Finally, the test can also *pass*. (-:



> I think that's OK, corruption can happen anywhere,
> except that the please_crash would need to decide which 'e' is the
> relevent one for the message it's printing.

If a choice has to be made, I would strive to preserve the original 'e'.
Then the human debugger can walk the singly-linked list using gdb commands
(and pen-and-paper-or-more-gdb-magic to decode the encrypted pointers).

(The proposal below does not make a choice, it simply prints 4 pointers.
This is convenient for the human debugger,
at the expense of code bloat, albeit on the __glibc_unlikely path.)



> But I think we need to ask the larger glibc group if there's a preferred
> idiom for "make this variable available to the debugger"

Could we all have a look at malloc_printerr_ex below, please ?

I have tried other versions such as:

   __asm__ ("int $3");
   * (char *) NULL = 0;
   * (volatile const char *) NULL;

Either of these versions crashes as desired.

But the latter two (at least when factored out in a separate function)
cause tcache_get to miss out from "(gdb) backtrace".
And we cannot easily access "e" or any other locals for tcache_get.

We have to look at the CPU registers
(which are indeed better preserved than with malloc_printerr_ex).

This is a bit cumbersome since the code is far away from the main code
of tcache_get, which, by the way, cannot easily be reached any more
(now that tcache_get is missing out from the backtrace, methinks)
(e.g. "disas tcache_get" or "x/10i tcache_get" no longer works for me).

Please let me know if more details are useful regarding these other versions.



> > Disadvantages:
> >
> >   - conditional jump;
>
> That's what __glibc_unlikely() is for - it tells gcc to optimize the
> conditional for the common path, often negating the costs of the jump
> completely.

Indeed, I can now see that __glibc_unlikely does modify the machine code.
The conditional jump instruction points to the "unlikely" branch.
The "likely" branch simply follows after.

(FWIW, I have not personally benchmarked the result yet.)



> >> This is where malloc_printerr should be called.  Even if the data is
> >> corrupt, this is what we do elsewhere in these cases.
> >
> > Purpose is to give "e" to the debugger (including the human debugger).
> > In my testing (on x86_64), adding malloc_printerr here loses "e".
> >
> > For clarity and consistence with usage of malloc_printerr elsewhere,
> > I wish to suggest adding a function malloc_printerr_4
> > which is given the pointer to the string and 4 additional uint64_t args.
>
> I was going to suggest passing a chunk_ptr to malloc_printerr as that's
> what we usually have (or pass NULL if we don't).

Could we introduce a function which calls vsnprintf and malloc_printerr ?

We are going to be able to easily deploy it in other places where we find
that the debugger does not have access to important variables
(e.g. "corrupted size vs. prev_size", "corrupted double-linked list"
-- if so proven).

Behold a new proposal:

 static void malloc_printerr(const char *str) __attribute__ ((noreturn));
+static void malloc_printerr_ex(const char *format, ...) __attribute__
((noreturn));

 [...]

 static __always_inline void *
 tcache_get (size_t tc_idx)
 {
   tcache_entry *e = tcache->entries[tc_idx];
+
   if (__glibc_unlikely (!aligned_OK (e)))
     malloc_printerr ("malloc(): unaligned tcache chunk detected");
-  tcache->entries[tc_idx] = REVEAL_PTR (e->next);
+
+  tcache_entry *const e_next = REVEAL_PTR (e->next);
+
+  /* Validate e->next. It should point within another freed chunk (or
be NULL). */
+  if (e_next)
+    {
+      /* Test ability to read. Might succeed or crash or message-and-abort. */
+      if (__glibc_unlikely (e_next->key != tcache))
+        malloc_printerr_ex ("malloc tcache_get: Corrupted e->next"
+                            " (tcache %p, e %p, e_next %p) (e_next->key %p).",
+                            tcache, e, e_next, e_next->key);
+
+      /* Test ability to write. TODO: Benchmark, consider probability. */
+      ((volatile tcache_entry *) e_next)->key = e_next->key;
+    }
+
+  /* Validate e->key. */
+  if (__glibc_unlikely (e->key != tcache))
+    malloc_printerr_ex ("malloc tcache_get: Corrupted e->key"
+                        " (tcache %p, e %p, e_next %p) (e->key %p).",
+                        tcache, e, e_next, e->key);
+
+  tcache->entries[tc_idx] = e_next;
   --(tcache->counts[tc_idx]);
   e->key = NULL;
   return (void *) e;
 }

 [...]

+static void
+malloc_printerr_ex (const char *format, ...)
+{
+  va_list ap;
+  va_start (ap, format);
+
+  char message [0x1000];
+  {
+    vsnprintf (message, sizeof (message) / sizeof (message [0]), format, ap);
+  }
+
+  va_end (ap);
+
+  malloc_printerr (message);
+}



> We have macros to convert tcache pointers to chunk pointers.

Thank you. I can now see mem2chunk and chunk2mem ! :)
  

Patch

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 5b87bdb0..6e1cae1c 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -2946,6 +2946,45 @@  tcache_put (mchunkptr chunk, size_t tc_idx)
   ++(tcache->counts[tc_idx]);
 }

+/* Upon free, we overwrite the beginning of the memory block with a
tcache_entry.
+   If the application (illegally) then overwrites the "next" field of
our tcache_entry,
+   a later tcache_get might cause a page fault (SIGSEGV)
+   when that field is (revealed and) dereferenced.
+
+   But at that point we are not going to see (in the debugger)
+   the address of the corrupted chunk -- only the (revealed) "next" field.
+
+   Let us dereference the "next" field while we still have the chunk address.
+   This consumes a few CPU cycles, but the extra information may help
debugging.
+
+   The "dbg_" parameters might seem redundant, but they are useful
for the debugger
+   (who might not otherwise be able to obtain their "optimized out" values).
+
+   This function is not inlined and not static in order to prevent
optimizations
+   and ensure all parameters are available.
+*/
+
+__attribute__ ((noinline)) tcache_entry *
+__libc_malloc_tcache_dbg_peek (const tcache_perthread_struct
*dbg_tcache, size_t dbg_tc_idx,
+                               const tcache_entry *dbg_e, const
tcache_entry *e_next)
+{
+  /* The caller does not use the result.
+     Therefore, the REVEAL_PTR computation is only for clarity.
+     But it might be useful later if we decide to walk for more than one step,
+     for earlier detection of the write-after-free bug.
+     (We do return a result in order to help prevent unwanted
optimizations.) */
+  return e_next ? REVEAL_PTR (e_next->next) : NULL;
+}
+
+static __always_inline tcache_entry *
+tcache_dbg_get_next (const tcache_perthread_struct *dbg_tcache,
size_t dbg_tc_idx,
+                     const tcache_entry *e)