[1/4] Avoid submitting empty tasks in parallel_for_each
Commit Message
I found that parallel_for_each would submit empty tasks to the thread
pool. For example, this can happen if the number of tasks is smaller
than the number of available threads. In the DWARF reader, this
resulted in the cooked index containing empty sub-indices. This patch
arranges to instead shrink the result vector and process the trailing
entries in the calling thread.
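To see where the empty tasks come from, here is a minimal standalone sketch (hypothetical names, not gdb's actual chunking code) of splitting COUNT elements into one batch per worker thread. With fewer elements than threads, most batches come out empty:

```cpp
#include <cstddef>
#include <vector>

/* Hypothetical sketch (not gdb's actual code): split COUNT elements
   into one batch per thread -- an even share, with the first
   COUNT % N_THREADS batches one element larger.  Assumes
   N_THREADS > 0.  */
std::vector<std::size_t>
chunk_sizes (std::size_t count, std::size_t n_threads)
{
  std::size_t per = count / n_threads;
  std::size_t extra = count % n_threads;
  std::vector<std::size_t> sizes;
  for (std::size_t i = 0; i < n_threads; ++i)
    sizes.push_back (per + (i < extra ? 1 : 0));
  return sizes;
}
```

Under this model, chunk_sizes (2, 8) yields {1, 1, 0, 0, 0, 0, 0, 0}: six of the eight batches are empty, and each empty batch would previously still have been submitted to the thread pool as a task.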
---
gdbsupport/parallel-for.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
Comments
Hi Tom,
On Thu, Dec 15, 2022 at 12:07:56PM -0700, Tom Tromey via Gdb-patches wrote:
> I found that parallel_for_each would submit empty tasks to the thread
> pool. For example, this can happen if the number of tasks is smaller
> than the number of available threads. In the DWARF reader, this
> resulted in the cooked index containing empty sub-indices. This patch
> arranges to instead shrink the result vector and process the trailing
> entries in the calling thread.
This patch looks good to me.
If I'm allowed to nitpick...
> ---
> gdbsupport/parallel-for.h | 23 +++++++++++++++++++++++
> 1 file changed, 23 insertions(+)
>
> diff --git a/gdbsupport/parallel-for.h b/gdbsupport/parallel-for.h
> index acd9137efbd..ed460f31f63 100644
> --- a/gdbsupport/parallel-for.h
> +++ b/gdbsupport/parallel-for.h
> @@ -70,6 +70,12 @@ struct par_for_accumulator
> return result;
> }
>
> + /* Resize the results to N. */
> + void resize (size_t n)
> + {
> + m_futures.resize (n);
> + }
> +
> private:
>
> /* A vector of futures coming from the tasks run in the
> @@ -108,6 +114,12 @@ struct par_for_accumulator<void>
> }
> }
>
> + /* Resize the results to N. */
> + void resize (size_t n)
> + {
> + m_futures.resize (n);
> + }
> +
> private:
>
> std::vector<gdb::future<void>> m_futures;
> @@ -232,6 +244,17 @@ parallel_for_each (unsigned n, RandomIt first, RandomIt last,
> end = j;
> remaining_size -= chunk_size;
> }
> +
> + if (end == last)
> + {
> + /* We're about to dispatch the last batch of elements, which
> + we normally process in the main thread. So just truncate
> + the result list here. */
... I'd expand the last sentence to explain that this is to avoid
submitting empty tasks to the thread pool.
But I'll leave it up to you to decide whether that's a useful suggestion
or not ;-).
Thanks for that patch!
> + count = i;
> + results.resize (count);
> + break;
> + }
> +
> if (parallel_for_each_debug)
> {
> debug_printf (_("Parallel for: elements on worker thread %i\t: %zu"),
> --
> 2.34.3
>
>>>>> "Joel" == Joel Brobecker <brobecker@adacore.com> writes:
>> + /* We're about to dispatch the last batch of elements, which
>> + we normally process in the main thread. So just truncate
>> + the result list here. */
Joel> ... I'd expand the last sentence to explain that this is to avoid
Joel> submitting empty tasks to the thread pool.
Joel> But I'll leave it up to you to decide whether that's a useful suggestion
Joel> or not ;-).
Makes sense to me, I did this.
Tom
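The shape of the fix can be sketched standalone (a simplified model with hypothetical names; batches are only counted, not submitted to gdb's real thread pool): once a batch's end reaches the last element, every later batch would be empty, so the loop stops submitting and leaves the tail for the calling thread:

```cpp
#include <cstddef>

/* Hypothetical sketch (simplified; not gdb's actual code) of the
   fixed dispatch loop.  Returns how many batches would be submitted
   to worker threads.  Assumes N_THREADS > 0.  */
std::size_t
dispatch (std::size_t count, std::size_t n_threads)
{
  std::size_t per = count / n_threads;
  std::size_t extra = count % n_threads;
  std::size_t submitted = 0;
  std::size_t pos = 0;
  for (std::size_t i = 0; i < n_threads; ++i)
    {
      std::size_t chunk = per + (i < extra ? 1 : 0);
      std::size_t end = pos + chunk;
      if (end == count)
	{
	  /* This batch reaches the last element, so every later
	     batch would be empty.  Stop submitting; the tail
	     [pos, count) is processed in the calling thread.  */
	  break;
	}
      ++submitted;  /* Batch [pos, end) would go to a worker thread.  */
      pos = end;
    }
  return submitted;
}
```

With 2 elements and 8 threads this submits exactly one worker task; the remaining element is handled in the calling thread and no empty batches are ever submitted, matching the behavior the patch describes.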