[2/2] abg-workers: guard bring_workers_down to avoid dead lock
Commit Message
Since bring_workers_down is not atomic, a worker thread can make a racy
read on it while do_bring_workers_down() writes it. That can lead to a
deadlock between the worker thread waiting for more work and
do_bring_workers_down() waiting for the worker to finish.
Address this by guarding all access to bring_workers_down with locking
tasks_todo_mutex. This is likely to be dropped after migrating to newer
C++ standards supporting std::atomic.
* src/abg-workers.cc(do_bring_workers_down): keep
task_todo_mutex locked while writing bring_workers_down,
(wait_to_execute_a_task): rewrite the loop condition to ensure
safe access to bring_workers_down.
Signed-off-by: Matthias Maennich <maennich@google.com>
---
src/abg-workers.cc | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
Comments
Hi.
This looks good to me.
On Thu, 12 Mar 2020 at 11:32, 'Matthias Maennich' via kernel-team
<kernel-team@android.com> wrote:
>
> Since bring_workers_down is not atomic, a worker thread can make a racy
> read on it while do_bring_workers_down() writes it. That can lead to a
> deadlock between the worker thread waiting for more work and
> do_bring_workers_down() waiting for the worker to finish.
> Address this by guarding all access to bring_workers_down with locking
> tasks_todo_mutex. This is likely to be dropped after migrating to newer
> C++ standards supporting std::atomic.
>
> * src/abg-workers.cc(do_bring_workers_down): keep
> task_todo_mutex locked while writing bring_workers_down,
> (wait_to_execute_a_task): rewrite the loop condition to ensure
> safe access to bring_workers_down.
>
Reviewed-by: Giuliano Procida <gprocida@google.com>
Regards,
Giuliano.
> Signed-off-by: Matthias Maennich <maennich@google.com>
> ---
> src/abg-workers.cc | 17 +++++++++++++----
> 1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/src/abg-workers.cc b/src/abg-workers.cc
> index eb892688c489..9035854df958 100644
> --- a/src/abg-workers.cc
> +++ b/src/abg-workers.cc
> @@ -114,7 +114,9 @@ struct worker
> struct queue::priv
> {
> // A boolean to say if the user wants to shutdown the worker
> - // threads.
> + // threads. guarded by tasks_todo_mutex.
> + // TODO: once we have std::atomic<bool>, use it and reconsider the
> + // synchronization around its reads and writes
> bool bring_workers_down;
> // The number of worker threads.
> size_t num_workers;
> @@ -249,11 +251,11 @@ struct queue::priv
> while (!tasks_todo.empty())
> pthread_cond_wait(&tasks_done_cond, &tasks_todo_mutex);
>
> + bring_workers_down = true;
> pthread_mutex_unlock(&tasks_todo_mutex);
>
> // Now that the task queue is empty, drain the workers by waking them up,
> // letting them finish their final task before termination.
> - bring_workers_down = true;
> ABG_ASSERT(pthread_cond_broadcast(&tasks_todo_cond) == 0);
>
> for (std::vector<worker>::const_iterator i = workers.begin();
> @@ -388,7 +390,7 @@ queue::task_done_notify::operator()(const task_sptr&/*task_done*/)
> queue::priv*
> worker::wait_to_execute_a_task(queue::priv* p)
> {
> - do
> + while (true)
> {
> pthread_mutex_lock(&p->tasks_todo_mutex);
> // If there is no more tasks to perform and the queue is not to
> @@ -426,8 +428,15 @@ worker::wait_to_execute_a_task(queue::priv* p)
> pthread_mutex_unlock(&p->tasks_done_mutex);
> pthread_cond_signal(&p->tasks_done_cond);
> }
> +
> + // ensure we access bring_workers_down always guarded
> + bool drop_out = false;
> + pthread_mutex_lock(&p->tasks_todo_mutex);
> + drop_out = p->bring_workers_down;
> + pthread_mutex_unlock(&p->tasks_todo_mutex);
> + if (drop_out)
> + break;
> }
> - while (!p->bring_workers_down);
>
> return p;
> }
> --
> 2.25.1.481.gfbce0eb801-goog
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
Hello Matthias,
Matthias Maennich <maennich@google.com> a ?crit:
> Since bring_workers_down is not atomic, a worker thread can make a racy
> read on it while do_bring_workers_down() writes it. That can lead to a
> deadlock between the worker thread waiting for more work and
> do_bring_workers_down() waiting for the worker to finish.
> Address this by guarding all access to bring_workers_down with locking
> tasks_todo_mutex. This is likely to be dropped after migrating to newer
> C++ standards supporting std::atomic.
>
> * src/abg-workers.cc(do_bring_workers_down): keep
> task_todo_mutex locked while writing bring_workers_down,
> (wait_to_execute_a_task): rewrite the loop condition to ensure
> safe access to bring_workers_down.
Applied to master.
Thanks!
@@ -114,7 +114,9 @@ struct worker
struct queue::priv
{
// A boolean to say if the user wants to shutdown the worker
- // threads.
+ // threads. guarded by tasks_todo_mutex.
+ // TODO: once we have std::atomic<bool>, use it and reconsider the
+ // synchronization around its reads and writes
bool bring_workers_down;
// The number of worker threads.
size_t num_workers;
@@ -249,11 +251,11 @@ struct queue::priv
while (!tasks_todo.empty())
pthread_cond_wait(&tasks_done_cond, &tasks_todo_mutex);
+ bring_workers_down = true;
pthread_mutex_unlock(&tasks_todo_mutex);
// Now that the task queue is empty, drain the workers by waking them up,
// letting them finish their final task before termination.
- bring_workers_down = true;
ABG_ASSERT(pthread_cond_broadcast(&tasks_todo_cond) == 0);
for (std::vector<worker>::const_iterator i = workers.begin();
@@ -388,7 +390,7 @@ queue::task_done_notify::operator()(const task_sptr&/*task_done*/)
queue::priv*
worker::wait_to_execute_a_task(queue::priv* p)
{
- do
+ while (true)
{
pthread_mutex_lock(&p->tasks_todo_mutex);
// If there is no more tasks to perform and the queue is not to
@@ -426,8 +428,15 @@ worker::wait_to_execute_a_task(queue::priv* p)
pthread_mutex_unlock(&p->tasks_done_mutex);
pthread_cond_signal(&p->tasks_done_cond);
}
+
+ // ensure we access bring_workers_down always guarded
+ bool drop_out = false;
+ pthread_mutex_lock(&p->tasks_todo_mutex);
+ drop_out = p->bring_workers_down;
+ pthread_mutex_unlock(&p->tasks_todo_mutex);
+ if (drop_out)
+ break;
}
- while (!p->bring_workers_down);
return p;
}