malloc: check tcache mem size in tcache_get_n to avoid arbitrary mem allocation

Message ID 20250409082617.112696-1-dudududuMaxVer@gmail.com (mailing list archive)
State Under Review
Delegated to: Wilco Dijkstra
Headers
Series malloc: check tcache mem size in tcache_get_n to avoid arbitrary mem allocation |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit fail Patch caused testsuite regressions
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 success Test passed

Commit Message

dbgbgtf April 9, 2025, 8:26 a.m. UTC
  Hi,

> What do you mean with "arb alloc"? Never heard of the term...

`arb alloc` is abbreviation of `arbitrary alloc`.

> If the malloc succeeds despite corruption, it will likely fail if that block is freed
> again, or on the next malloc that uses the now corrupted freelist. Adding free(ptr[0])
> at the end of your test causes "double free or corruption".

`ptr[0]` is the address of `ptr[2]` at the end of the
test.
At this time, attackers already got an arbitrary mem alloc, we shouldn't
leave it to `free` or another `malloc`.

> You can statically link with GLIBC, but that's not needed here - "make bench" will generate
> benchmarks which will always dynamically link with your local GLIBC build.

Is that true?
Because on my machine ldd shows it will use my system
library as default, while the GLIBC build libc.so.6 is in
`/home/dbgbgtf/Main/work/build/libc.so.6`

```
❯ ldd benchtests/bench-malloc-simple
	linux-vdso.so.1 (0x000077ae74ebf000)
	libc.so.6 => /usr/lib/libc.so.6 (0x000077ae74c98000)
	/usr/lib/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x000077ae74ec1000)
```
But anyway, `make bench` will use the correct library to run benchtests.
If you're using `make bench`, it's fine.

> There is a 3.8% difference between those runs. Which one is old, which one is new?

You can tell the old and new ones from the directory, `mybuild` was
built with my patch, while `build` was built with the `origin/master`.

So you can see my patch is faster, which make me confused.
(maybe because I remove the fastbin checks in _int_malloc?)

> Note those results seem slow given the i7-13700H is quite new. Are you running on
> the fast cores? Does the generated code look well optimized?

Those results came from my laptop, so it could be slower than desktop.
I didn't specific the tests to use fast cores. But the code is well optimized.

So I rerun the benchtests like `taskset -c 0 make bench BENCHSET="malloc-simple"`
The result was not different compare to the result in my last email.

Also I rerun `make bench BENCHSET="malloc-thread"`

This is `origin/master`
```
❯ cat benchtests/bench-malloc-thread-32.out
{
 "timing_type": "hp_timing",
 "functions": {
  "malloc": {
   "": {
    "duration": 9.33561e+11,
    "iterations": 1.10199e+10,
    "time_per_iteration": 84.7163,
    "max_rss": 7284,
    "threads": 32,
    "min_size": 4,
    "max_size": 32768,
    "random_seed": 88
   }
  }
 }
}%
```

This is `origin/master` with my patch
```
❯ cat benchtests/bench-malloc-thread-32.out
{
 "timing_type": "hp_timing",
 "functions": {
  "malloc": {
   "": {
    "duration": 9.33958e+11,
    "iterations": 1.07042e+10,
    "time_per_iteration": 87.2519,
    "max_rss": 6820,
    "threads": 32,
    "min_size": 4,
    "max_size": 32768,
    "random_seed": 88
   }
  }
 }
}%
```

Thanks,
dbgbgtf

Signed-off-by: dbgbgtf <dudududuMaxVer@gmail.com>
---
 malloc/malloc.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
  

Patch

diff --git a/malloc/malloc.c b/malloc/malloc.c
index a0bc733482..74f9668d3b 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -3186,6 +3186,8 @@  tcache_get_n (size_t tc_idx, tcache_entry **ep)
 
   if (__glibc_unlikely (!aligned_OK (e)))
     malloc_printerr ("malloc(): unaligned tcache chunk detected");
+  if (__glibc_unlikely (tc_idx != csize2tidx(chunksize_nomask (mem2chunk(e)))))
+    malloc_printerr ("malloc(): tcache mem size vs request size");
 
   if (ep == &(tcache->entries[tc_idx]))
       *ep = REVEAL_PTR (e->next);
@@ -4009,11 +4011,6 @@  _int_malloc (mstate av, size_t bytes)
 		  while (tcache->counts[tc_idx] < mp_.tcache_count
 			 && (tc_victim = *fb) != NULL)
 		    {
-		      if (__glibc_unlikely (misaligned_chunk (tc_victim)))
-			malloc_printerr ("malloc(): unaligned fastbin chunk detected 3");
-		      size_t victim_tc_idx = csize2tidx (chunksize (tc_victim));
-		      if (__glibc_unlikely (tc_idx != victim_tc_idx))
-			malloc_printerr ("malloc(): chunk size mismatch in fastbin");
 		      if (SINGLE_THREAD_P)
 			*fb = REVEAL_PTR (tc_victim->fd);
 		      else