[v3] Don't fall back to mmap if there are still locked non-avoided arenas to try.

Message ID 55F1F212.1040101@redhat.com
State Dropped

Commit Message

Carlos O'Donell Sept. 10, 2015, 9:11 p.m. UTC
  On 08/19/2015 01:25 PM, Siddhesh Poyarekar wrote:
> On Wed, Aug 19, 2015 at 09:50:29AM -0400, Mike Frysinger wrote:
>> verified how ?
> 
> Sorry, I meant to write "verified that there were no regressions on
> x86_64".  Testing this behaviour itself is tricky since you'll need to
> generate the right amount of load to cause contention, with a single
> arena or in a retry path.
> 
>> doesn't this comment explicitly say you don't want to use the avoid arena ?
>> doesn't it need updating now with this change in logic ?
> 
> Right, updated patch with comment:

The logic in reused_arena is still confused. For example, we don't
need a begin variable, since next_to_use is the start of our loop.
Rather than correct this, I'm going to give you an alternate suggestion.

From first principles we need only two loops.

Loop 1: Hot loop.

Find the first non-corrupt, non-locked, non-avoided arena.
-> Return it (since we just locked it with trylock).

Failing that.

Loop 2: Slow loop.

Find the first non-corrupt, non-avoided arena.
-> Lock it (potentially waiting).
-> Return it.

Note: We do not inline the slow loop into the hot loop, even though we
      could by remembering the last non-corrupt, non-avoided arena,
      because that complicates the logic significantly.

Failing that, all arenas are corrupt or avoided, so we return NULL to
signal that other allocation schemes need to be used, e.g. falling back
to mmap.

Note: avoid_arena is not locked, despite what the comments say; it is
      always unlocked by arena_get_retry before we look to reuse an arena.
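
In rough, untested form, the shape I have in mind is something like the
following sketch (it reuses the existing arena_is_corrupt, mutex_trylock
and mutex_lock helpers and leaves out the LIBC_PROBE and thread-local
bookkeeping the real function does at "out"):

static mstate
reused_arena (mstate avoid_arena)
{
  mstate result = next_to_use;

  /* Hot loop: the first non-corrupt, non-avoided arena whose lock we
     can take without waiting.  */
  do
    {
      if (result != avoid_arena
          && !arena_is_corrupt (result)
          && !mutex_trylock (&result->mutex))
        goto out;
      result = result->next;
    }
  while (result != next_to_use);

  /* Slow loop: the first non-corrupt, non-avoided arena; wait for its
     lock to become available.  */
  do
    {
      if (result != avoid_arena && !arena_is_corrupt (result))
        {
          (void) mutex_lock (&result->mutex);
          goto out;
        }
      result = result->next;
    }
  while (result != next_to_use);

  /* Every arena is corrupt or avoided; tell the caller to use another
     allocation scheme.  */
  return NULL;

out:
  next_to_use = result->next;
  return result;
}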

What do you think about this?

v3
- Split into hot/slow loops.
- Rewrite logic.

---

Cheers,
Carlos.
  

Comments

Siddhesh Poyarekar Sept. 16, 2015, 1:24 p.m. UTC | #1
On Thu, 2015-09-10 at 17:11 -0400, Carlos O'Donell wrote:
> diff --git a/malloc/arena.c b/malloc/arena.c
> index b44e307..da94816 100644
> --- a/malloc/arena.c
> +++ b/malloc/arena.c
> @@ -797,7 +804,7 @@ get_free_list (void)
>  
>  /* Lock and return an arena that can be reused for memory allocation.
>     Avoid AVOID_ARENA as we have already failed to allocate memory in
> -   it and it is currently locked.  */
> +   that arena.  */

Right, Jeff had fixed this so that we always come here without holding
any locks, which made the earlier comment incorrect.

>  static mstate
>  reused_arena (mstate avoid_arena)
>  {
> @@ -809,32 +816,34 @@ reused_arena (mstate avoid_arena)
>    result = next_to_use;
>    do
>      {
> -      if (!arena_is_corrupt (result) && !mutex_trylock (&result->mutex))
> +      if (!arena_is_corrupt (result) &&
> +	  !mutex_trylock (&result->mutex) &&
> +	  result != avoid_arena)

This could result in taking the lock when result == avoid_arena and
then continuing on to try another arena, which could then result in a
deadlock since we don't release that lock anywhere.  The correct thing
to do here would be:

  if (result != avoid_arena && !arena_is_corrupt (result) &&
      !mutex_trylock (&result->mutex))

>          goto out;
>  
>        result = result->next;
>      }
>    while (result != next_to_use);
>  
> -  /* Avoid AVOID_ARENA as we have already failed to allocate memory
> -     in that arena and it is currently locked.   */
> -  if (result == avoid_arena)
> -    result = result->next;
> -
> -  /* Make sure that the arena we get is not corrupted.  */
> -  mstate begin = result;
> -  while (arena_is_corrupt (result) || result == avoid_arena)
> +  /* All arenas are locked, corrupted, or avoided. We start again looking for
> +     any arena to use.  This time we consider locked non-avoided arenas and
> +     will wait for them instead of falling back to other allocation schemes.
> +     We will not retry the avoided arena which just failed.  */
> +  do
>      {
> +      if (!arena_is_corrupt (result) &&
> +	  result != avoid_arena)

Swap the order of the conditionals so that they're more intuitive -
first check the mstate and then check a member of the mstate.  That's
how the code ought to be generated anyhow.
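
Applied to the second hunk above, that would read something like:

      if (result != avoid_arena && !arena_is_corrupt (result))
	goto lock;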

Looks good to me with those changes.

Thanks,
Siddhesh

> +	goto lock;
> +
>        result = result->next;
> -      if (result == begin)
> -	break;
>      }
> +  while (result != next_to_use);
>  
> -  /* We could not find any arena that was either not corrupted or not the one
> -     we wanted to avoid.  */
> -  if (result == begin || result == avoid_arena)
> -    return NULL;
> +  /* All the arenas are corrupted or avoided. We have no choice but to return
> +     NULL to signal that other fallbacks should be used.  */
> +  return NULL;
>  
> +lock:
>    /* No arena available without contention.  Wait for the next in line.  */
>    LIBC_PROBE (memory_arena_reuse_wait, 3, &result->mutex, result, avoid_arena);
>    (void) mutex_lock (&result->mutex);
> ---
> 
> Cheers,
> Carlos.
  
Florian Weimer Oct. 15, 2015, 1:47 p.m. UTC | #2
On 09/10/2015 11:11 PM, Carlos O'Donell wrote:
> The logic in reused_arena is still confused. For example, we don't
> need a begin variable, since next_to_use is the start of our loop.
> Rather than correct this, I'm going to give you an alternate suggestion.

I think the begin variable was an attempt to deal with the concurrency
issue in this function.  Please keep it and use relaxed loads and stores
to access next_to_use.  I don't think it is necessary to change it with
a CAS because the variable is just a hint anyway.  The important point
is that every invocation of reused_arena terminates, and with the
current code, it's far from obvious that it does when the function is
called concurrently.
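
One possible shape, as a sketch only (using the atomic_load_relaxed and
atomic_store_relaxed macros from include/atomic.h, and eliding the
second loop):

static mstate
reused_arena (mstate avoid_arena)
{
  /* Take one relaxed snapshot of the hint.  Comparing against the local
     copy bounds the loop, so it terminates even if other threads update
     next_to_use concurrently.  */
  mstate begin = atomic_load_relaxed (&next_to_use);
  mstate result = begin;

  do
    {
      if (result != avoid_arena
          && !arena_is_corrupt (result)
          && !mutex_trylock (&result->mutex))
        goto out;
      result = result->next;
    }
  while (result != begin);

  /* ... second loop over the same snapshot, then return NULL ... */
  return NULL;

out:
  /* Publish the new hint with a relaxed store; losing a race here only
     means the next caller starts from a slightly staler position.  */
  atomic_store_relaxed (&next_to_use, result->next);
  return result;
}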

Florian
  

Patch

diff --git a/malloc/arena.c b/malloc/arena.c
index b44e307..da94816 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -797,7 +804,7 @@  get_free_list (void)
 
 /* Lock and return an arena that can be reused for memory allocation.
    Avoid AVOID_ARENA as we have already failed to allocate memory in
-   it and it is currently locked.  */
+   that arena.  */
 static mstate
 reused_arena (mstate avoid_arena)
 {
@@ -809,32 +816,34 @@  reused_arena (mstate avoid_arena)
   result = next_to_use;
   do
     {
-      if (!arena_is_corrupt (result) && !mutex_trylock (&result->mutex))
+      if (!arena_is_corrupt (result) &&
+	  !mutex_trylock (&result->mutex) &&
+	  result != avoid_arena)
         goto out;
 
       result = result->next;
     }
   while (result != next_to_use);
 
-  /* Avoid AVOID_ARENA as we have already failed to allocate memory
-     in that arena and it is currently locked.   */
-  if (result == avoid_arena)
-    result = result->next;
-
-  /* Make sure that the arena we get is not corrupted.  */
-  mstate begin = result;
-  while (arena_is_corrupt (result) || result == avoid_arena)
+  /* All arenas are locked, corrupted, or avoided. We start again looking for
+     any arena to use.  This time we consider locked non-avoided arenas and
+     will wait for them instead of falling back to other allocation schemes.  
+     We will not retry the avoided arena which just failed.  */
+  do
     {
+      if (!arena_is_corrupt (result) &&
+	  result != avoid_arena)
+	goto lock;
+
       result = result->next;
-      if (result == begin)
-	break;
     }
+  while (result != next_to_use);
 
-  /* We could not find any arena that was either not corrupted or not the one
-     we wanted to avoid.  */
-  if (result == begin || result == avoid_arena)
-    return NULL;
+  /* All the arenas are corrupted or avoided. We have no choice but to return
+     NULL to signal that other fallbacks should be used.  */
+  return NULL;
 
+lock:
   /* No arena available without contention.  Wait for the next in line.  */
   LIBC_PROBE (memory_arena_reuse_wait, 3, &result->mutex, result, avoid_arena);
   (void) mutex_lock (&result->mutex);