[v3,0/2] Write DWARF index cache files in background

Message ID 20221219164558.378363-1-tromey@adacore.com
Headers
Series Write DWARF index cache files in background |

Message

Tom Tromey Dec. 19, 2022, 4:45 p.m. UTC
  Here's v3 of the patches to write the DWARF index in the background.

v2 was here:

https://sourceware.org/pipermail/gdb-patches/2022-June/189889.html

This hides a user-noticeable pause when the feature is enabled.  (Note
that, when the feature is used in batch mode, the pause is still
seen.)

This version of the series fixes a possible race that I found in v2.
Essentially, the index-writing background job must be started after
all the finalization tasks have started.  There's a comment in the
code explaining this.

Let me know what you think.

Tom
  

Comments

Tom Tromey Feb. 24, 2023, 6:46 p.m. UTC | #1
>>>>> "Tom" == Tom Tromey via Gdb-patches <gdb-patches@sourceware.org> writes:

Tom> Here's v3 of the patches to write the DWARF index in the background.
Tom> v2 was here:

Tom> https://sourceware.org/pipermail/gdb-patches/2022-June/189889.html

Tom> This hides a user-noticeable pause when the feature is enabled.  (Note
Tom> that, when the feature is used in batch mode, the pause is still
Tom> seen.)

Tom> This version of the series fixes a possible race that I found in v2.
Tom> Essentially, the index-writing background job must be started after
Tom> all the finalization tasks have started.  There's a comment in the
Tom> code explaining this.

I'm checking this in now.

Tom
  
Simon Marchi Feb. 24, 2023, 9:44 p.m. UTC | #2
On 2/24/23 13:46, Tom Tromey via Gdb-patches wrote:
>>>>>> "Tom" == Tom Tromey via Gdb-patches <gdb-patches@sourceware.org> writes:
> 
> Tom> Here's v3 of the patches to write the DWARF index in the background.
> Tom> v2 was here:
> 
> Tom> https://sourceware.org/pipermail/gdb-patches/2022-June/189889.html
> 
> Tom> This hides a user-noticeable pause when the feature is enabled.  (Note
> Tom> that, when the feature is used in batch mode, the pause is still
> Tom> seen.)
> 
> Tom> This version of the series fixes a possible race that I found in v2.
> Tom> Essentially, the index-writing background job must be started after
> Tom> all the finalization tasks have started.  There's a comment in the
> Tom> code explaining this.
> 
> I'm checking this in now.
> 
> Tom

I started seeing this, I guess it's related:

FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: at least one file was created
FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: expected file is there

Simon
  
Tom Tromey March 2, 2023, 3:51 p.m. UTC | #3
>>>>> "Simon" == Simon Marchi <simark@simark.ca> writes:

Simon> I started seeing this, I guess it's related:

Simon> FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: at least one file was created
Simon> FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: expected file is there

Yeah, I will look into it soon.

Tom
  
Tom Tromey March 3, 2023, 4:41 p.m. UTC | #4
Simon> FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: at least one file was created
Simon> FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: expected file is there

I can't reliably reproduce this.

However, I believe the problem is that because the cache file is written
in the background, the test is racy.

I think the appended should fix it.  This just mirrors something done in
test_cache_enabled_hit, and the idea is that waiting for gdb to exit
ensures that the cache file will have been written.

Tom

diff --git a/gdb/testsuite/gdb.base/index-cache.exp b/gdb/testsuite/gdb.base/index-cache.exp
index 0614d4ee2db..d9a6145b5df 100644
--- a/gdb/testsuite/gdb.base/index-cache.exp
+++ b/gdb/testsuite/gdb.base/index-cache.exp
@@ -157,6 +157,9 @@ proc_with_prefix test_cache_enabled_miss { cache_dir } {
 
     lassign [ls_host $cache_dir] ret files_before
 
+    # Just to populate the cache.
+    run_test_with_flags $cache_dir on {}
+
     run_test_with_flags $cache_dir on {
 
 	lassign [ls_host $cache_dir] ret files_after
  
Simon Marchi March 3, 2023, 5:22 p.m. UTC | #5
On 3/3/23 11:41, Tom Tromey wrote:
> Simon> FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: at least one file was created
> Simon> FAIL: gdb.base/index-cache.exp: test_cache_enabled_miss: expected file is there
> 
> I can't reliably reproduce this.
> 
> However, I believe the problem is that because the cache file is written
> in the background, the test is racy.

Ok, thanks for the tip.  It should be possible to reliably reproduce it
with that change then, which makes GDB take more time to write the
index:


diff --git a/gdb/dwarf2/index-write.c b/gdb/dwarf2/index-write.c
index 62c2cc6ac7de..d3bad4aaa0c9 100644
--- a/gdb/dwarf2/index-write.c
+++ b/gdb/dwarf2/index-write.c
@@ -1485,6 +1485,7 @@ write_dwarf_index (dwarf2_per_bfd *per_bfd, const char *dir,
 		    (dwz_index_wip.has_value ()
 		     ? dwz_index_wip->out_file.get () : NULL));

+  sleep(3);
   objfile_index_wip.finalize ();

   if (dwz_index_wip.has_value ())

> I think the appended should fix it.  This just mirrors something done in
> test_cache_enabled_hit, and the idea is that waiting for gdb to exit
> ensures that the cache file will have been written.
> 
> Tom
> 
> diff --git a/gdb/testsuite/gdb.base/index-cache.exp b/gdb/testsuite/gdb.base/index-cache.exp
> index 0614d4ee2db..d9a6145b5df 100644
> --- a/gdb/testsuite/gdb.base/index-cache.exp
> +++ b/gdb/testsuite/gdb.base/index-cache.exp
> @@ -157,6 +157,9 @@ proc_with_prefix test_cache_enabled_miss { cache_dir } {
>  
>      lassign [ls_host $cache_dir] ret files_before
>  
> +    # Just to populate the cache.
> +    run_test_with_flags $cache_dir on {}
> +
>      run_test_with_flags $cache_dir on {
>  
>  	lassign [ls_host $cache_dir] ret files_after

I don't understand this.  The first run_test_with_flags will populate
the cache, so the second run will use the cache and be a cache hit.  The
tests that checks that the expected file is there will work.  But the
check_cache_stats test will operate the GDB that will have had a cache
hit, so we won't get the expected result (a cache miss).

I think this could be solved with a maintenance command to explicitly
wait for index cache creation (calls wait_for_index_cache).  I don't see
another reliable way.  What do you think?  I can implement it if you
think it would work.

Simon
  
Tom Tromey March 3, 2023, 5:29 p.m. UTC | #6
Simon> I don't understand this.

Yeah, sorry, I didn't properly understand the test.

Simon> I think this could be solved with a maintenance command to explicitly
Simon> wait for index cache creation (calls wait_for_index_cache).  I don't see
Simon> another reliable way.  What do you think?  I can implement it if you
Simon> think it would work.

I think you're right.  I can handle it.

Tom
  
Simon Marchi March 3, 2023, 5:52 p.m. UTC | #7
On 3/3/23 12:29, Tom Tromey via Gdb-patches wrote:
> Simon> I don't understand this.
> 
> Yeah, sorry, I didn't properly understand the test.
> 
> Simon> I think this could be solved with a maintenance command to explicitly
> Simon> wait for index cache creation (calls wait_for_index_cache).  I don't see
> Simon> another reliable way.  What do you think?  I can implement it if you
> Simon> think it would work.
> 
> I think you're right.  I can handle it.

I spotted another failure, when running with native-extended-gdbserver:


254 (gdb) file /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/index-cache/index-cache^M
255 Reading symbols from /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.base/index-cache/index-cache...^M
256 /home/smarchi/src/binutils-gdb/gdb/gdbtypes.h:967: internal-error: field: Assertion `idx >= 0 && idx < num_fields ()' failed.^M
257 A problem internal to GDB has been detected,^M
258 further debugging may prove unreliable.^M
259 ERROR: Couldn't load index-cache into GDB (GDB internal error).

I don't know why it would be specific to native-extended-gdbserver,
maybe it changes the timing in just the right (wrong) way.  I haven't
dug into it yet.

Simon