[2/2] abg-workers: guard bring_workers_down to avoid deadlock

Message ID 20200312113158.24055-2-maennich@google.com
State Committed
Series [1/2] configure: add support for thread sanitizer (--enable-tsan)

Commit Message

Matthias Männich March 12, 2020, 11:31 a.m. UTC
Since bring_workers_down is not atomic, a worker thread can perform a racy
read of it while do_bring_workers_down() writes it. That can lead to a
deadlock: the worker thread waits for more work while
do_bring_workers_down() waits for the worker to finish.
Address this by guarding all accesses to bring_workers_down with
tasks_todo_mutex. This guard is likely to be dropped after migrating to a
newer C++ standard that supports std::atomic.

	* src/abg-workers.cc (do_bring_workers_down): Keep
	tasks_todo_mutex locked while writing bring_workers_down.
	(wait_to_execute_a_task): Rewrite the loop condition to ensure
	safe access to bring_workers_down.

Signed-off-by: Matthias Maennich <maennich@google.com>
---
 src/abg-workers.cc | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)
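
For readers following along, the guarded-flag pattern described in the commit message can be sketched roughly as below. This is an illustration only, not the libabigail code: mini_queue, worker_main and run_and_shut_down are hypothetical names, and unlike the real queue the workers here drain any remaining tasks themselves before exiting. The point it tries to show is that the shutdown flag is written and read only while the same mutex that protects the task list is held.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <queue>
#include <vector>

// Hypothetical stand-in for queue::priv; all names are illustrative.
struct mini_queue
{
  pthread_mutex_t todo_mutex;
  pthread_cond_t  todo_cond;
  std::queue<int> todo;      // guarded by todo_mutex
  bool            shut_down; // guarded by todo_mutex
  long            sum;       // guarded by todo_mutex

  mini_queue() : shut_down(false), sum(0)
  {
    pthread_mutex_init(&todo_mutex, 0);
    pthread_cond_init(&todo_cond, 0);
  }

  ~mini_queue()
  {
    pthread_cond_destroy(&todo_cond);
    pthread_mutex_destroy(&todo_mutex);
  }
};

// Worker body: every read of shut_down happens with todo_mutex held.
static void*
worker_main(void* arg)
{
  mini_queue* q = static_cast<mini_queue*>(arg);
  pthread_mutex_lock(&q->todo_mutex);
  for (;;)
    {
      // Sleep only while there is no work and no shutdown request.
      while (q->todo.empty() && !q->shut_down)
        pthread_cond_wait(&q->todo_cond, &q->todo_mutex);
      if (q->todo.empty())
        break; // shut_down is set and nothing is left to do
      int v = q->todo.front();
      q->todo.pop();
      pthread_mutex_unlock(&q->todo_mutex);
      long r = static_cast<long>(v) * v; // "execute" the task unlocked
      pthread_mutex_lock(&q->todo_mutex);
      q->sum += r;
    }
  pthread_mutex_unlock(&q->todo_mutex);
  return 0;
}

// Start workers, queue the tasks 1..num_tasks, then shut down.
long
run_and_shut_down(std::size_t num_workers, int num_tasks)
{
  mini_queue q;
  std::vector<pthread_t> threads(num_workers);
  for (std::size_t i = 0; i < num_workers; ++i)
    {
      int rc = pthread_create(&threads[i], 0, worker_main, &q);
      assert(rc == 0);
      (void) rc;
    }

  pthread_mutex_lock(&q.todo_mutex);
  for (int i = 1; i <= num_tasks; ++i)
    q.todo.push(i);
  pthread_mutex_unlock(&q.todo_mutex);
  pthread_cond_broadcast(&q.todo_cond);

  // The write to shut_down is guarded, mirroring the patch.
  pthread_mutex_lock(&q.todo_mutex);
  q.shut_down = true;
  pthread_mutex_unlock(&q.todo_mutex);
  pthread_cond_broadcast(&q.todo_cond);

  for (std::size_t i = 0; i < num_workers; ++i)
    pthread_join(threads[i], 0);
  return q.sum; // safe to read: all workers have been joined
}
```

Because the flag is set under the mutex, a worker is either already waiting on the condition (and is woken by the broadcast) or has not yet tested the flag (and sees it set when it does). A call such as run_and_shut_down(4, 100) should return once all four workers have exited.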
  

Comments

Giuliano Procida March 16, 2020, 3:13 p.m. UTC | #1
Hi.

This looks good to me.

On Thu, 12 Mar 2020 at 11:32, 'Matthias Maennich' via kernel-team
<kernel-team@android.com> wrote:
>
> Since bring_workers_down is not atomic, a worker thread can make a racy
> read on it while do_bring_workers_down() writes it. That can lead to a
> deadlock between the worker thread waiting for more work and
> do_bring_workers_down() waiting for the worker to finish.
> Address this by guarding all access to bring_workers_down with locking
> tasks_todo_mutex. This is likely to be dropped after migrating to newer
> C++ standards supporting std::atomic.
>
>         * src/abg-workers.cc(do_bring_workers_down): keep
>         tasks_todo_mutex locked while writing bring_workers_down,
>         (wait_to_execute_a_task): rewrite the loop condition to ensure
>         safe access to bring_workers_down.
>

Reviewed-by: Giuliano Procida <gprocida@google.com>

Regards,
Giuliano.

> Signed-off-by: Matthias Maennich <maennich@google.com>
> ---
>  src/abg-workers.cc | 17 +++++++++++++----
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/abg-workers.cc b/src/abg-workers.cc
> index eb892688c489..9035854df958 100644
> --- a/src/abg-workers.cc
> +++ b/src/abg-workers.cc
> @@ -114,7 +114,9 @@ struct worker
>  struct queue::priv
>  {
>    // A boolean to say if the user wants to shutdown the worker
> -  // threads.
> +  // threads. guarded by tasks_todo_mutex.
> +  // TODO: once we have std::atomic<bool>, use it and reconsider the
> +  // synchronization around its reads and writes
>    bool                         bring_workers_down;
>    // The number of worker threads.
>    size_t                       num_workers;
> @@ -249,11 +251,11 @@ struct queue::priv
>      while (!tasks_todo.empty())
>        pthread_cond_wait(&tasks_done_cond, &tasks_todo_mutex);
>
> +    bring_workers_down = true;
>      pthread_mutex_unlock(&tasks_todo_mutex);
>
>      // Now that the task queue is empty, drain the workers by waking them up,
>      // letting them finish their final task before termination.
> -    bring_workers_down = true;
>      ABG_ASSERT(pthread_cond_broadcast(&tasks_todo_cond) == 0);
>
>      for (std::vector<worker>::const_iterator i = workers.begin();
> @@ -388,7 +390,7 @@ queue::task_done_notify::operator()(const task_sptr&/*task_done*/)
>  queue::priv*
>  worker::wait_to_execute_a_task(queue::priv* p)
>  {
> -  do
> +  while (true)
>      {
>        pthread_mutex_lock(&p->tasks_todo_mutex);
>        // If there is no more tasks to perform and the queue is not to
> @@ -426,8 +428,15 @@ worker::wait_to_execute_a_task(queue::priv* p)
>           pthread_mutex_unlock(&p->tasks_done_mutex);
>           pthread_cond_signal(&p->tasks_done_cond);
>         }
> +
> +      // ensure we access bring_workers_down always guarded
> +      bool drop_out = false;
> +      pthread_mutex_lock(&p->tasks_todo_mutex);
> +      drop_out = p->bring_workers_down;
> +      pthread_mutex_unlock(&p->tasks_todo_mutex);
> +      if (drop_out)
> +       break;
>      }
> -  while (!p->bring_workers_down);
>
>    return p;
>  }
> --
> 2.25.1.481.gfbce0eb801-goog
>
  
Dodji Seketeli March 18, 2020, 1:41 p.m. UTC | #2
Hello Matthias,

Matthias Maennich <maennich@google.com> wrote:

> Since bring_workers_down is not atomic, a worker thread can make a racy
> read on it while do_bring_workers_down() writes it. That can lead to a
> deadlock between the worker thread waiting for more work and
> do_bring_workers_down() waiting for the worker to finish.
> Address this by guarding all access to bring_workers_down with locking
> tasks_todo_mutex. This is likely to be dropped after migrating to newer
> C++ standards supporting std::atomic.
>
> 	* src/abg-workers.cc(do_bring_workers_down): keep
>         tasks_todo_mutex locked while writing bring_workers_down,
> 	(wait_to_execute_a_task): rewrite the loop condition to ensure
> 	safe access to bring_workers_down.

Applied to master.

Thanks!
  

Patch

diff --git a/src/abg-workers.cc b/src/abg-workers.cc
index eb892688c489..9035854df958 100644
--- a/src/abg-workers.cc
+++ b/src/abg-workers.cc
@@ -114,7 +114,9 @@  struct worker
 struct queue::priv
 {
   // A boolean to say if the user wants to shutdown the worker
-  // threads.
+  // threads. guarded by tasks_todo_mutex.
+  // TODO: once we have std::atomic<bool>, use it and reconsider the
+  // synchronization around its reads and writes
   bool				bring_workers_down;
   // The number of worker threads.
   size_t			num_workers;
@@ -249,11 +251,11 @@  struct queue::priv
     while (!tasks_todo.empty())
       pthread_cond_wait(&tasks_done_cond, &tasks_todo_mutex);
 
+    bring_workers_down = true;
     pthread_mutex_unlock(&tasks_todo_mutex);
 
     // Now that the task queue is empty, drain the workers by waking them up,
     // letting them finish their final task before termination.
-    bring_workers_down = true;
     ABG_ASSERT(pthread_cond_broadcast(&tasks_todo_cond) == 0);
 
     for (std::vector<worker>::const_iterator i = workers.begin();
@@ -388,7 +390,7 @@  queue::task_done_notify::operator()(const task_sptr&/*task_done*/)
 queue::priv*
 worker::wait_to_execute_a_task(queue::priv* p)
 {
-  do
+  while (true)
     {
       pthread_mutex_lock(&p->tasks_todo_mutex);
       // If there is no more tasks to perform and the queue is not to
@@ -426,8 +428,15 @@  worker::wait_to_execute_a_task(queue::priv* p)
 	  pthread_mutex_unlock(&p->tasks_done_mutex);
 	  pthread_cond_signal(&p->tasks_done_cond);
 	}
+
+      // ensure we access bring_workers_down always guarded
+      bool drop_out = false;
+      pthread_mutex_lock(&p->tasks_todo_mutex);
+      drop_out = p->bring_workers_down;
+      pthread_mutex_unlock(&p->tasks_todo_mutex);
+      if (drop_out)
+	break;
     }
-  while (!p->bring_workers_down);
 
   return p;
 }
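
The TODO in the patch mentions moving to std::atomic<bool> later. A rough sketch of what that could look like follows, assuming C++11 and using hypothetical names (shutdown_state, idle_worker, start_then_stop) that do not exist in libabigail. An atomic flag makes unguarded reads well-defined, but a flag that is also used as a condition-variable predicate should still be stored while holding the mutex; otherwise a worker could test the flag, miss the notification, and sleep forever.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the shutdown-related part of queue::priv.
struct shutdown_state
{
  std::mutex               mutex;
  std::condition_variable  cond;
  std::atomic<bool>        bring_down;
  std::atomic<std::size_t> exited;

  shutdown_state() : bring_down(false), exited(0) {}
};

// A worker with no tasks: it sleeps until shutdown is requested.
void
idle_worker(shutdown_state& s)
{
  {
    std::unique_lock<std::mutex> lock(s.mutex);
    while (!s.bring_down.load())
      s.cond.wait(lock);
  }
  // An unguarded read is well-defined here because the flag is atomic.
  if (s.bring_down.load())
    s.exited.fetch_add(1);
}

// Start the workers, request shutdown, and report how many exited.
std::size_t
start_then_stop(std::size_t num_workers)
{
  shutdown_state s;
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < num_workers; ++i)
    threads.emplace_back(idle_worker, std::ref(s));

  {
    // Store under the mutex so no waiter can miss the wake-up.
    std::lock_guard<std::mutex> lock(s.mutex);
    s.bring_down.store(true);
  }
  s.cond.notify_all();

  for (std::size_t i = 0; i < threads.size(); ++i)
    threads[i].join();
  assert(s.exited.load() == num_workers);
  return s.exited.load();
}
```

In this sketch only the final loop-exit check would lose its lock/unlock pair compared with the patch; the store in the shutdown path stays under the mutex.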