malloc: Trim unused arenas on thread exit
Commit Message
I tried the attached patch to trim unused arenas on thread exit. The
trimming actually happens (the heap consolidation is visible in the
malloc_info output from tst-malloc_info), but the arena heaps aren't
deallocated.
I think trimming unused arenas as much as possible is a good heuristic
to minimize RSS, so getting this to work might be worthwhile.
Thanks,
Florian
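For readers who want to observe the effect described above outside the glibc test suite, here is a minimal sketch. It is not tst-malloc_info itself; worker and run_worker_and_report are hypothetical names, and the block count and size are arbitrary. It assumes glibc, since malloc_info is a GNU extension. A thread allocates and frees memory, exits, and the allocator state is then dumped as XML so the per-arena heap sizes can be compared with and without the patch.

```c
#include <malloc.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

enum { BLOCKS = 1000, BLOCK_SIZE = 4096 };

/* Allocate from the calling thread's arena, then free everything
   before the thread exits, so the arena holds only free memory.  */
static void *
worker (void *arg)
{
  (void) arg;
  void *blocks[BLOCKS];
  int n = 0;
  while (n < BLOCKS && (blocks[n] = malloc (BLOCK_SIZE)) != NULL)
    ++n;
  while (n > 0)
    free (blocks[--n]);
  return NULL;
}

/* Run one worker to completion, then write the allocator state to OUT
   as XML.  Returns 0 on success and nonzero on failure.  */
static int
run_worker_and_report (FILE *out)
{
  pthread_t thr;
  if (pthread_create (&thr, NULL, worker, NULL) != 0)
    return -1;
  if (pthread_join (thr, NULL) != 0)
    return -1;
  return malloc_info (0, out);
}
```

Calling run_worker_and_report (stdout) from main prints one <heap> element per arena; the totals reported there are what one would compare before and after applying the patch.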
Comments
On Thursday 09 November 2017 04:25 PM, Florian Weimer wrote:
> I tried the attached patch to trim unused arenas on thread exit. The
> trimming actually happens (the heap consolidation is visible in the
> malloc_info output from tst-malloc_info), but the arena heaps aren't
> deallocated.
>
> I think trimming unused arenas as much as possible is a good heuristic
> to minimize RSS, so getting this to work might be worthwhile.
I wonder if the overhead of unmapping the arena heaps is worthwhile.
For cases where it matters (overcommit is a ratio or is disabled or for
__libc_enable_secure), trimming of the heaps should reduce the commit
charge already since we do remap the pages as PROT_NONE.
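The remapping mentioned above can be illustrated in isolation. The following is only a sketch of the general technique, not glibc's heap-shrinking code; reserve_region, commit_prefix and decommit_tail are hypothetical names, and the offsets passed in are assumed to be multiples of the page size. Mapping a fresh PROT_NONE region over the tail of a heap drops its pages and its commit charge while keeping the address range reserved.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Reserve SIZE bytes of address space without committing memory.
   Returns NULL on failure.  */
static void *
reserve_region (size_t size)
{
  void *p = mmap (NULL, size, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
  return p == MAP_FAILED ? NULL : p;
}

/* Make the first LEN bytes of the region readable and writable.
   Returns 0 on success.  */
static int
commit_prefix (void *base, size_t len)
{
  return mprotect (base, len, PROT_READ | PROT_WRITE);
}

/* Give back the range [BASE + KEEP, BASE + OLD_LEN) by replacing it
   with a fresh inaccessible mapping at the same address.  The old
   pages are discarded, but the address range stays reserved.
   Returns 0 on success.  */
static int
decommit_tail (void *base, size_t keep, size_t old_len)
{
  void *tail = (char *) base + keep;
  void *p = mmap (tail, old_len - keep, PROT_NONE,
                  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED,
                  -1, 0);
  return p == tail ? 0 : -1;
}
```

The address range is never unmapped here, which is the distinction being drawn in this reply: the memory is returned, but the mapping for the heap stays in place.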
The patch seems OK, I'm just wondering if the additional work is worth
the effort because it will also hurt applications that spawn threads
frequently and have similar resource usage; they'll miss out on the
caching effect.
Siddhesh
> Thanks,
> Florian
>
> mtrim.patch
>
>
> diff --git a/malloc/arena.c b/malloc/arena.c
> index 85b985e193..758226c222 100644
> --- a/malloc/arena.c
> +++ b/malloc/arena.c
> @@ -953,12 +953,22 @@ arena_thread_freeres (void)
> /* If this was the last attached thread for this arena, put the
> arena on the free list. */
> assert (a->attached_threads > 0);
> - if (--a->attached_threads == 0)
> + bool arena_is_unused = --a->attached_threads == 0;
> + if (arena_is_unused)
> {
> a->next_free = free_list;
> free_list = a;
> }
> __libc_lock_unlock (free_list_lock);
> +
> + /* If there are no more users, compact the arena as much as
> + possible. */
> + if (arena_is_unused)
> + {
> + __libc_lock_lock (a->mutex);
> + mtrim (a, 0);
> + __libc_lock_unlock (a->mutex);
> + }
> }
> }
> text_set_element (__libc_thread_subfreeres, arena_thread_freeres);
> diff --git a/malloc/malloc.c b/malloc/malloc.c
> index 1f003d2ef0..a0b11784d2 100644
> --- a/malloc/malloc.c
> +++ b/malloc/malloc.c
> @@ -1831,7 +1831,7 @@ malloc_init_state (mstate av)
> static void *sysmalloc (INTERNAL_SIZE_T, mstate);
> static int systrim (size_t, mstate);
> static void malloc_consolidate (mstate);
> -
> +static int mtrim (mstate av, size_t pad);
>
> /* -------------- Early definitions for debugging hooks ---------------- */
>
>
On 11/10/2017 11:52 AM, Siddhesh Poyarekar wrote:
> On Thursday 09 November 2017 04:25 PM, Florian Weimer wrote:
>> I tried the attached patch to trim unused arenas on thread exit. The
>> trimming actually happens (the heap consolidation is visible in the
>> malloc_info output from tst-malloc_info), but the arena heaps aren't
>> deallocated.
>>
>> I think trimming unused arenas as much as possible is a good heuristic
>> to minimize RSS, so getting this to work might be worthwhile.
>
> I wonder if the overhead of unmapping the arena heaps is worthwhile.
> For cases where it matters (overcommit is a ratio or is disabled or for
> __libc_enable_secure), trimming of the heaps should reduce the commit
> charge already since we do remap the pages as PROT_NONE.
I would have expected that the test makes the second sub-heap completely
unused, so that the existing logic would unmap it. But that does not
seem to happen.
> The patch seems OK, I'm just wondering if the additional work is worth
> the effort because it will also hurt applications that spawn threads
> frequently and have similar resource usage; they'll miss out on the
> caching effect.
We could do this not for the current arena, but the arena at the top of
the free list. Then we'd retain some caching, and you'd get a ping-pong
effect only if you stop and start more than one thread.
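One possible reading of this suggestion, as a toy model: when an arena becomes unused, push it on the free list untrimmed and instead hand back the arena that was previously at the top as the candidate for trimming. Everything below (toy_arena, toy_detach, toy_reuse) is hypothetical and single-threaded; it leaves out free_list_lock and the actual trimming, and it is not taken from glibc.

```c
#include <stddef.h>

/* Hypothetical stand-in for an arena; only the fields needed for
   the free-list bookkeeping are modelled.  */
struct toy_arena
{
  struct toy_arena *next_free;
  unsigned int attached_threads;
};

static struct toy_arena *toy_free_list;

/* Detach one thread from A.  If A becomes unused, push it on the free
   list and return the arena that was at the top before the push,
   which is the one to trim.  Returns NULL if A is still in use or the
   list was empty.  */
static struct toy_arena *
toy_detach (struct toy_arena *a)
{
  if (--a->attached_threads != 0)
    return NULL;
  struct toy_arena *to_trim = toy_free_list;
  a->next_free = toy_free_list;
  toy_free_list = a;
  return to_trim;
}

/* Attach a new thread to the arena at the top of the free list,
   or return NULL if the list is empty.  */
static struct toy_arena *
toy_reuse (void)
{
  struct toy_arena *a = toy_free_list;
  if (a != NULL)
    {
      toy_free_list = a->next_free;
      a->next_free = NULL;
      a->attached_threads = 1;
    }
  return a;
}
```

In this model a single thread that exits and restarts gets its untrimmed arena back from toy_reuse, so the cache is kept. Only when two arenas are released in a row does the first one become a trimming candidate.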
The consolidation we should still perform on the current arena, maybe
even if it is not unused. I assume that before the thread exits, it
deallocates some resources, and we should make sure that malloc sees
them once the arena is reused, even from a thread which has a vastly
different allocation pattern.
Thanks,
Florian
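diff --git a/malloc/arena.c b/malloc/arena.c
index 85b985e193..758226c222 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -953,12 +953,22 @@ arena_thread_freeres (void)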
/* If this was the last attached thread for this arena, put the
arena on the free list. */
assert (a->attached_threads > 0);
- if (--a->attached_threads == 0)
+ bool arena_is_unused = --a->attached_threads == 0;
+ if (arena_is_unused)
{
a->next_free = free_list;
free_list = a;
}
__libc_lock_unlock (free_list_lock);
+
+ /* If there are no more users, compact the arena as much as
+ possible. */
+ if (arena_is_unused)
+ {
+ __libc_lock_lock (a->mutex);
+ mtrim (a, 0);
+ __libc_lock_unlock (a->mutex);
+ }
}
}
text_set_element (__libc_thread_subfreeres, arena_thread_freeres);
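diff --git a/malloc/malloc.c b/malloc/malloc.c
index 1f003d2ef0..a0b11784d2 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -1831,7 +1831,7 @@ malloc_init_state (mstate av)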
static void *sysmalloc (INTERNAL_SIZE_T, mstate);
static int systrim (size_t, mstate);
static void malloc_consolidate (mstate);
-
+static int mtrim (mstate av, size_t pad);
/* -------------- Early definitions for debugging hooks ---------------- */
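The locking shape of the first hunk can be shown on its own: the decision is recorded in a local variable while the list lock is held, and the expensive work runs afterwards under the object's own lock. This is a sketch with hypothetical names (struct counted, release_user), using pthread mutexes in place of the __libc_lock macros and a counter in place of the call to mtrim.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical reference-counted object standing in for an arena.  */
struct counted
{
  unsigned int users;        /* Protected by list_lock.  */
  int cleanups;              /* Protected by own_lock.  */
  pthread_mutex_t own_lock;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Drop one user of C.  Returns true if C became unused, in which case
   the cleanup step has been run once.  */
static bool
release_user (struct counted *c)
{
  /* Decide under the list lock, but do not do the slow work here.  */
  pthread_mutex_lock (&list_lock);
  bool unused = --c->users == 0;
  pthread_mutex_unlock (&list_lock);

  /* Act on the recorded decision under the object's own lock.  */
  if (unused)
    {
      pthread_mutex_lock (&c->own_lock);
      ++c->cleanups;   /* Stands in for the call to mtrim.  */
      pthread_mutex_unlock (&c->own_lock);
    }
  return unused;
}
```

One consequence of this shape, at least in the sketch, is that the object may already have been picked up by another thread by the time the cleanup runs; the second lock keeps the cleanup safe but does not guarantee the object is still unused.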