Harden tcache double-free check

Message ID 20210707012919.1298612-1-siddhesh@sourceware.org
State Superseded
Series: Harden tcache double-free check

Checks

Context Check Description
dj/TryBot-apply_patch success Patch applied to master at the time it was sent
dj/TryBot-32bit success Build for i686

Commit Message

Siddhesh Poyarekar July 7, 2021, 1:29 a.m. UTC
  The tcache allocator layer uses the tcache pointer as a key to
identify a block that may be freed twice.  Since this is in the
application data area, an attacker exploiting a use-after-free could
potentially get access to the entire tcache structure through this
key.  A detailed write-up was provided by Awarau here:

https://awaraucom.wordpress.com/2020/07/19/house-of-io-remastered/

Replace this static pointer with a key generated at malloc
initialization.  The first attempt is through getrandom, with a
fallback to random_bits(), a simple pseudo-random number generator
based on the clock.  The fallback ought
to be sufficient since the goal of the randomness is only to make the
key arbitrary enough that it is very unlikely to collide with user
data.

Co-authored-by: Eyal Itkin <eyalit@checkpoint.com>
---
 malloc/arena.c  |  8 ++++++++
 malloc/malloc.c | 32 ++++++++++++++++++++++++++++----
 2 files changed, 36 insertions(+), 4 deletions(-)
  

Comments

Florian Weimer July 7, 2021, 5:17 p.m. UTC | #1
* Siddhesh Poyarekar:

> +/* Process-wide key to try and catch a double-free in the same thread.  */
> +static uintptr_t tcache_key;
> +
> +/* The value of tcache_key does not really have to be a cryptographically
> +   secure random number.  It only needs to be arbitrary enough so that it does
> +   not collide with values present in applications, which would be quite rare,
> +   about 1 in 2^wordsize.  */

The comment should say that a collision could result in performance
problems.

Apart from that, the patch looks okay to me.

Thanks,
Florian
  
Adhemerval Zanella July 7, 2021, 5:35 p.m. UTC | #2
On 06/07/2021 22:29, Siddhesh Poyarekar via Libc-alpha wrote:

> +/* Process-wide key to try and catch a double-free in the same thread.  */
> +static uintptr_t tcache_key;
> +
> +/* The value of tcache_key does not really have to be a cryptographically
> +   secure random number.  It only needs to be arbitrary enough so that it does
> +   not collide with values present in applications, which would be quite rare,
> +   about 1 in 2^wordsize.  */
> +static void
> +tcache_key_initialize (void)
> +{
> +  if (__getrandom (&tcache_key, sizeof(tcache_key), GRND_NONBLOCK)
> +      != sizeof (tcache_key))
> +    {
> +      tcache_key = random_bits ();
> +#if __WORDSIZE == 64
> +      tcache_key = (tcache_key << 32) | random_bits ();
> +#endif

The other usage of random_bits at sysdeps/posix/tempname.c already uses
a uint_fast64_t, so I think it would be better to make random_bits()
return 64 bits instead.

Also, calling random_bits() in a sequence will result in quite low
entropy, since it uses __clock_gettime64().  Maybe you could XOR in
the pid for the higher bits:

  static inline uint64_t
  random_bits (void)
  {
    struct __timespec64 tv;
    __clock_gettime64 (CLOCK_MONOTONIC, &tv);
    /* Shuffle the lower bits to minimize the clock bias.  */
    uint32_t ret_low = tv.tv_nsec ^ tv.tv_sec;
    ret_low ^= (ret_low << 24) | (ret_low >> 8);
    /* And add a little more entropy for the higher bits.  */
    uint32_t ret_high = ret_low ^ (uint32_t) __getpid ();
    return ((uint64_t) ret_high << 32) | ret_low;
  }
  
Siddhesh Poyarekar July 7, 2021, 5:58 p.m. UTC | #3
On 7/7/21 11:05 PM, Adhemerval Zanella via Libc-alpha wrote:
> 
> 
> On 06/07/2021 22:29, Siddhesh Poyarekar via Libc-alpha wrote:
> 
>> +/* Process-wide key to try and catch a double-free in the same thread.  */
>> +static uintptr_t tcache_key;
>> +
>> +/* The value of tcache_key does not really have to be a cryptographically
>> +   secure random number.  It only needs to be arbitrary enough so that it does
>> +   not collide with values present in applications, which would be quite rare,
>> +   about 1 in 2^wordsize.  */
>> +static void
>> +tcache_key_initialize (void)
>> +{
>> +  if (__getrandom (&tcache_key, sizeof(tcache_key), GRND_NONBLOCK)
>> +      != sizeof (tcache_key))
>> +    {
>> +      tcache_key = random_bits ();
>> +#if __WORDSIZE == 64
>> +      tcache_key = (tcache_key << 32) | random_bits ();
>> +#endif
> 
> The other usage of random_bits at sysdeps/posix/tempname.c already uses

tempname.c seems to have its own random_bits function (i.e. it just 
happens to have the same name); it doesn't use the one in random-bits.h 
AFAICT.  All other users of random_bits() use 32 bits.  Entropy isn't 
really a concern in this use case; it's just a key to avoid collisions.

Siddhesh
  
Adhemerval Zanella July 7, 2021, 6:09 p.m. UTC | #4
On 07/07/2021 14:58, Siddhesh Poyarekar wrote:
> On 7/7/21 11:05 PM, Adhemerval Zanella via Libc-alpha wrote:
>>
>>
>> On 06/07/2021 22:29, Siddhesh Poyarekar via Libc-alpha wrote:
>>
>>> +/* Process-wide key to try and catch a double-free in the same thread.  */
>>> +static uintptr_t tcache_key;
>>> +
>>> +/* The value of tcache_key does not really have to be a cryptographically
>>> +   secure random number.  It only needs to be arbitrary enough so that it does
>>> +   not collide with values present in applications, which would be quite rare,
>>> +   about 1 in 2^wordsize.  */
>>> +static void
>>> +tcache_key_initialize (void)
>>> +{
>>> +  if (__getrandom (&tcache_key, sizeof(tcache_key), GRND_NONBLOCK)
>>> +      != sizeof (tcache_key))
>>> +    {
>>> +      tcache_key = random_bits ();
>>> +#if __WORDSIZE == 64
>>> +      tcache_key = (tcache_key << 32) | random_bits ();
>>> +#endif
>>
>> The other usage of random_bits at sysdeps/posix/tempname.c already uses
> 
> tempname.c seems to have its own random_bits function (i.e. it just happens to have the same name); it doesn't use the one in random-bits.h AFAICT.  All other users of random_bits() use 32 bits.  Entropy isn't really a concern in this use case; it's just a key to avoid collisions.
Yes, but it is essentially what you are doing here: query getrandom
with GRND_NONBLOCK, falling back to clock_gettime.  And we also
have the random_bits from include/random-bits.h.

I think it would be better to consolidate this into a single implementation.
  
Siddhesh Poyarekar July 7, 2021, 6:12 p.m. UTC | #5
On 7/7/21 11:39 PM, Adhemerval Zanella via Libc-alpha wrote:
> Yes, but it is essentially what you are doing here: query getrandom
> with GRND_NONBLOCK, falling back to clock_gettime.  And we also
> have the random_bits from include/random-bits.h.
> 
> I think it would be better to consolidate this into a single implementation.

The randomness needs are quite different, but I see your point.  We 
could make a separate random_bits64 to cater to these two use cases. 
Can it be a separate patch though?  I'll put it on top of this one.

Thanks,
Siddhesh
  
Adhemerval Zanella July 7, 2021, 6:27 p.m. UTC | #6
On 07/07/2021 15:12, Siddhesh Poyarekar wrote:
> On 7/7/21 11:39 PM, Adhemerval Zanella via Libc-alpha wrote:
>> Yes, but it is essentially what you are doing here: query getrandom
>> with GRND_NONBLOCK, falling back to clock_gettime.  And we also
>> have the random_bits from include/random-bits.h.
>>
>> I think it would be better to consolidate this into a single implementation.
> 
> The randomness needs are quite different, but I see your point.  We could make a separate random_bits64 to cater to these two use cases. Can it be a separate patch though?  I'll put it on top of this one.

I think it can be a separate consolidation; what I am trying to avoid is
multiple random_bits implementations with different semantics within
glibc.
  
Siddhesh Poyarekar July 7, 2021, 6:28 p.m. UTC | #7
On 7/7/21 11:57 PM, Adhemerval Zanella wrote:
> I think it can be a separate consolidation; what I am trying to avoid is
> multiple random_bits implementations with different semantics within
> glibc.
> 

Thanks, I'll push this if there are no further objections.

Siddhesh
  

Patch

diff --git a/malloc/arena.c b/malloc/arena.c
index 7eb110445e..991fc21a7e 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -287,6 +287,10 @@  extern struct dl_open_hook *_dl_open_hook;
 libc_hidden_proto (_dl_open_hook);
 #endif
 
+#if USE_TCACHE
+static void tcache_key_initialize (void);
+#endif
+
 static void
 ptmalloc_init (void)
 {
@@ -295,6 +299,10 @@  ptmalloc_init (void)
 
   __malloc_initialized = 0;
 
+#if USE_TCACHE
+  tcache_key_initialize ();
+#endif
+
 #ifdef USE_MTAG
   if ((TUNABLE_GET_FULL (glibc, mem, tagging, int32_t, NULL) & 1) != 0)
     {
diff --git a/malloc/malloc.c b/malloc/malloc.c
index bb9a1642aa..68dc18dd03 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -252,6 +252,10 @@ 
 
 #include <libc-internal.h>
 
+/* For tcache double-free check.  */
+#include <random-bits.h>
+#include <sys/random.h>
+
 /*
   Debugging:
 
@@ -3091,7 +3095,7 @@  typedef struct tcache_entry
 {
   struct tcache_entry *next;
   /* This field exists to detect double frees.  */
-  struct tcache_perthread_struct *key;
+  uintptr_t key;
 } tcache_entry;
 
 /* There is one of these for each thread, which contains the
@@ -3108,6 +3112,26 @@  typedef struct tcache_perthread_struct
 static __thread bool tcache_shutting_down = false;
 static __thread tcache_perthread_struct *tcache = NULL;
 
+/* Process-wide key to try and catch a double-free in the same thread.  */
+static uintptr_t tcache_key;
+
+/* The value of tcache_key does not really have to be a cryptographically
+   secure random number.  It only needs to be arbitrary enough so that it does
+   not collide with values present in applications, which would be quite rare,
+   about 1 in 2^wordsize.  */
+static void
+tcache_key_initialize (void)
+{
+  if (__getrandom (&tcache_key, sizeof(tcache_key), GRND_NONBLOCK)
+      != sizeof (tcache_key))
+    {
+      tcache_key = random_bits ();
+#if __WORDSIZE == 64
+      tcache_key = (tcache_key << 32) | random_bits ();
+#endif
+    }
+}
+
 /* Caller must ensure that we know tc_idx is valid and there's room
    for more chunks.  */
 static __always_inline void
@@ -3117,7 +3141,7 @@  tcache_put (mchunkptr chunk, size_t tc_idx)
 
   /* Mark this chunk as "in the tcache" so the test in _int_free will
      detect a double free.  */
-  e->key = tcache;
+  e->key = tcache_key;
 
   e->next = PROTECT_PTR (&e->next, tcache->entries[tc_idx]);
   tcache->entries[tc_idx] = e;
@@ -3134,7 +3158,7 @@  tcache_get (size_t tc_idx)
     malloc_printerr ("malloc(): unaligned tcache chunk detected");
   tcache->entries[tc_idx] = REVEAL_PTR (e->next);
   --(tcache->counts[tc_idx]);
-  e->key = NULL;
+  e->key = 0;
   return (void *) e;
 }
 
@@ -4437,7 +4461,7 @@  _int_free (mstate av, mchunkptr p, int have_lock)
 	   trust it (it also matches random payload data at a 1 in
 	   2^<size_t> chance), so verify it's not an unlikely
 	   coincidence before aborting.  */
-	if (__glibc_unlikely (e->key == tcache))
+	if (__glibc_unlikely (e->key == tcache_key))
 	  {
 	    tcache_entry *tmp;
 	    size_t cnt = 0;