[RFC,v2,01/21] gdb: update is_addr_in_objfile to support "dynamic" objfiles

Message ID 20241121124714.419946-2-jan.vrany@labware.com
State New
Headers
Series Add Python "JIT" API |

Commit Message

Jan Vrany Nov. 21, 2024, 12:46 p.m. UTC
  While working with objfiles in Python I noticed that
gdb.Progspace.objfile_for_address () does not return "dynamic" objfile
create by (for example) GDB's JIT reader API.

This is because is_addr_in_objfile() checks if given address falls into
any (mappped) section of that objfile. However objfiles created by JIT
reader API do not have sections.

To solve this issue, this commit updates is_addr_in_objfile() to also
check if address fall into any compunit if that objfile. It does so only
if objfile has no sections.
---
 gdb/objfiles.c                        | 16 ++++++++++++++++
 gdb/testsuite/gdb.base/jit-reader.exp |  9 +++++++++
 2 files changed, 25 insertions(+)
  

Comments

Andrew Burgess Dec. 7, 2024, 11:45 a.m. UTC | #1
Jan, thanks for working on this fix.

Jan Vrany <jan.vrany@labware.com> writes:

> While working with objfiles in Python I noticed that
> gdb.Progspace.objfile_for_address () does not return "dynamic" objfile

s/objfile/objfiles/.

> create by (for example) GDB's JIT reader API.
>
> This is because is_addr_in_objfile() checks if given address falls into

"checks if a given address"

> any (mappped) section of that objfile. However objfiles created by JIT
> reader API do not have sections.
>
> To solve this issue, this commit updates is_addr_in_objfile() to also
> check if address fall into any compunit if that objfile. It does so only

"check if the address falls into any compunit in that objfile."

> if objfile has no sections.

"if the objfile has no sections".

> ---
>  gdb/objfiles.c                        | 16 ++++++++++++++++
>  gdb/testsuite/gdb.base/jit-reader.exp |  9 +++++++++
>  2 files changed, 25 insertions(+)
>
> diff --git a/gdb/objfiles.c b/gdb/objfiles.c
> index 555195dc61f..0bb578fa6a8 100644
> --- a/gdb/objfiles.c
> +++ b/gdb/objfiles.c
> @@ -1194,6 +1194,22 @@ is_addr_in_objfile (CORE_ADDR addr, const struct objfile *objfile)
>        if (osect->contains (addr))
>  	return true;
>      }
> +  /* Objfiles created dynamically by JIT reader API (and possibly by
> +     other means too) do not have sections and therefore the above
> +     check never succeeds.
> +
> +     For such "dynamic" objfiles walk over all compunits and check
> +     if given ADDR falls into compunit's global block.  */

"into a compunit's global block."

> +  if (! objfile->sections_start)

This should be:

  if (objfile->sections_start == nullptr)

> +    {
> +      for (compunit_symtab *cu
> +	     : const_cast<struct objfile *>(objfile)->compunits ())

I think this can be 'const compunit_symtab *cu' here.  All the methods
called below have 'const' variants available.

> +	{
> +	  global_block *gb = cu->blockvector ()->global_block ();
> +	  if (gb->start () <= addr && addr <= gb->end ())

I believe that the block's end address is the first address outside the
block.  So I think this should be:

  if (gb->start () <= addr && addr < gb->end ())

There are two problems with this approach, one I'm not sure how to deal
with is that the blocks start/end only reflects the text range of the
compunit_symtab, so while is_addr_in_objfile will return an objfile
based on a data address for "normal" objfiles, for dynamic objfiles this
is not true, only a lookup with a text address will work.

I'm not sure if there's a solution to this problem, I suspect you
understand dynamic objfiles better than me.  But if there's not good
solution maybe the right solution is to just document this in the
function comment, and leave it be for now.

The second issue is that it is possible for there to be a gap in an
objfile's range.  This is described in a comment inside the function
find_pc_sect_compunit_symtab (symtab.c), which is worth a read.  But
basically a CU might cover 0x1000..0x2000 then 0x3000..0x4000 which a
different CU from a different objfile might cover 0x2000..0x3000.  The
global block for the first CU will have a start/end of 0x1000..0x4000,
but shouldn't claim addresses in the range 0x2000..0x3000.

You might need to factor out some of the logic from
find_pc_sect_compunit_symtab and do an address to compunit_symtab
lookup, then check if the objfile associated with the compunit_symtab is
the one being checked for?

Thanks,
Andrew

> +	    return true;
> +	}
> +    }
>    return false;
>  }
>  
> diff --git a/gdb/testsuite/gdb.base/jit-reader.exp b/gdb/testsuite/gdb.base/jit-reader.exp
> index 62f6af29ac1..9a873c77909 100644
> --- a/gdb/testsuite/gdb.base/jit-reader.exp
> +++ b/gdb/testsuite/gdb.base/jit-reader.exp
> @@ -229,6 +229,15 @@ proc jit_reader_test {} {
>  		gdb_test "python print( \[o for o in gdb.objfiles() if o.filename.startswith('<< JIT compiled code')\]\[0\].build_id )" \
>  		    "None" \
>  		    "python gdb.Objfile.build_id"
> +
> +		# Check that Progspace.objfile_for_address () finds "jitted"
> +		# objfile
> +		gdb_test "frame 0" \
> +		    "#0  $hex in jit_function_stack_mangle ()$any" \
> +		    "select frame 0"
> +		gdb_test "python print( gdb.current_progspace().objfile_for_address(gdb.parse_and_eval('\$pc')) )" \
> +		    "<gdb.Objfile filename=<< JIT compiled code at $hex >>>" \
> +		    "python gdb.Progspace.objfile_for_address"
>  	    }
>  	}
>      }
> -- 
> 2.45.2
  
Jan Vrany Dec. 11, 2024, 5:06 p.m. UTC | #2
On Sat, 2024-12-07 at 11:45 +0000, Andrew Burgess wrote:
> 
> Jan, thanks for working on this fix.
> 
> Jan Vrany <jan.vrany@labware.com> writes:
> 
> > While working with objfiles in Python I noticed that
> > gdb.Progspace.objfile_for_address () does not return "dynamic"
> > objfile
> 
> s/objfile/objfiles/.
> 
> > create by (for example) GDB's JIT reader API.
> > 
> > This is because is_addr_in_objfile() checks if given address falls
> > into
> 
> "checks if a given address"
> 
> > any (mappped) section of that objfile. However objfiles created by
> > JIT
> > reader API do not have sections.
> > 
> > To solve this issue, this commit updates is_addr_in_objfile() to
> > also
> > check if address fall into any compunit if that objfile. It does so
> > only
> 
> "check if the address falls into any compunit in that objfile."
> 
> > if objfile has no sections.
> 
> "if the objfile has no sections".
> 
> > ---
> >  gdb/objfiles.c                        | 16 ++++++++++++++++
> >  gdb/testsuite/gdb.base/jit-reader.exp |  9 +++++++++
> >  2 files changed, 25 insertions(+)
> > 
> > diff --git a/gdb/objfiles.c b/gdb/objfiles.c
> > index 555195dc61f..0bb578fa6a8 100644
> > --- a/gdb/objfiles.c
> > +++ b/gdb/objfiles.c
> > @@ -1194,6 +1194,22 @@ is_addr_in_objfile (CORE_ADDR addr, const
> > struct objfile *objfile)
> >        if (osect->contains (addr))
> >  	return true;
> >      }
> > +  /* Objfiles created dynamically by JIT reader API (and possibly
> > by
> > +     other means too) do not have sections and therefore the above
> > +     check never succeeds.
> > +
> > +     For such "dynamic" objfiles walk over all compunits and check
> > +     if given ADDR falls into compunit's global block.  */
> 
> "into a compunit's global block."
> 
> > +  if (! objfile->sections_start)
> 
> This should be:
> 
>   if (objfile->sections_start == nullptr)
> 
> > +    {
> > +      for (compunit_symtab *cu
> > +	     : const_cast<struct objfile *>(objfile)->compunits
> > ())
> 
> I think this can be 'const compunit_symtab *cu' here.  All the
> methods
> called below have 'const' variants available.
> 
> > +	{
> > +	  global_block *gb = cu->blockvector ()->global_block ();
> > +	  if (gb->start () <= addr && addr <= gb->end ())
> 
> I believe that the block's end address is the first address outside
> the
> block.  So I think this should be:
> 
>   if (gb->start () <= addr && addr < gb->end ())

Thanks, I'll fix this and all other things mentioned above. 

> 
> There are two problems with this approach, one I'm not sure how to
> deal
> with is that the blocks start/end only reflects the text range of the
> compunit_symtab, so while is_addr_in_objfile will return an objfile
> based on a data address for "normal" objfiles, for dynamic objfiles
> this
> is not true, only a lookup with a text address will work.

Sorry, I do not understand what do you mean by this. For "normal"
objfiles, I can give it both, a code (text) address or address of
some static data:

(top-gdb) bt
#0  init_page_info () at utils.c:1130
#1  0x00005555574ed6ea in gdb_init () at top.c:2305
#2  0x00005555570817f2 in captured_main_1 (context=0x7fffffffdc80) at
main.c:1040
#3  0x000055555708263d in captured_main (data=0x7fffffffdc80) at
main.c:1330
#4  0x00005555570826dc in gdb_main (args=0x7fffffffdc80) at main.c:1359
#5  0x0000555556999908 in main (argc=1, argv=0x7fffffffddb8) at
gdb.c:38

(top-gdb) python
print(gdb.selected_inferior().progspace.objfile_for_address(int(gdb.par
se_and_eval('&objfiles_pspace_data'))))
<gdb.Objfile filename=/home/.../gdb/gdb>

(top-gdb) python
print(gdb.selected_inferior().progspace.objfile_for_address(int(gdb.par
se_and_eval('&objfiles_pspace_data'))))
<gdb.Objfile filename=/home/.../gdb/gdb>

> 
> I'm not sure if there's a solution to this problem, I suspect you
> understand dynamic objfiles better than me.  
> But if there's not good
> solution maybe the right solution is to just document this in the
> function comment, and leave it be for now.
> 
> The second issue is that it is possible for there to be a gap in an
> objfile's range.  This is described in a comment inside the function
> find_pc_sect_compunit_symtab (symtab.c), which is worth a read.  But
> basically a CU might cover 0x1000..0x2000 then 0x3000..0x4000 which a
> different CU from a different objfile might cover 0x2000..0x3000. 
> The
> global block for the first CU will have a start/end of
> 0x1000..0x4000,
> but shouldn't claim addresses in the range 0x2000..0x3000.

Yes. My thinking was that it is not possible by using JIT APIs
(current JIT-reader or the one introduced in this series) to 
create a cu with "holes" like the above. So for any objfile,
it is either a normal one and the existing code would care for 
it, or it is a "dynamic" one (which has no sections) and then
it cannot have CUs with holes and the simple check would do.

> 
> You might need to factor out some of the logic from
> find_pc_sect_compunit_symtab and do an address to compunit_symtab
> lookup, then check if the objfile associated with the compunit_symtab
> is
> the one being checked for?

Yes, this makes sense even if such dynamic objfile cannot 
be created right now. It may not stay so forever though. 
I'll have a look. 

Thanks a lot! 

Jan

> 
> Thanks,
> Andrew
> 
> > +	    return true;
> > +	}
> > +    }
> >    return false;
> >  }
> >  
> > diff --git a/gdb/testsuite/gdb.base/jit-reader.exp
> > b/gdb/testsuite/gdb.base/jit-reader.exp
> > index 62f6af29ac1..9a873c77909 100644
> > --- a/gdb/testsuite/gdb.base/jit-reader.exp
> > +++ b/gdb/testsuite/gdb.base/jit-reader.exp
> > @@ -229,6 +229,15 @@ proc jit_reader_test {} {
> >  		gdb_test "python print( \[o for o in
> > gdb.objfiles() if o.filename.startswith('<< JIT compiled
> > code')\]\[0\].build_id )" \
> >  		    "None" \
> >  		    "python gdb.Objfile.build_id"
> > +
> > +		# Check that Progspace.objfile_for_address ()
> > finds "jitted"
> > +		# objfile
> > +		gdb_test "frame 0" \
> > +		    "#0  $hex in jit_function_stack_mangle ()$any"
> > \
> > +		    "select frame 0"
> > +		gdb_test "python print(
> > gdb.current_progspace().objfile_for_address(gdb.parse_and_eval('\$p
> > c')) )" \
> > +		    "<gdb.Objfile filename=<< JIT compiled code at
> > $hex >>>" \
> > +		    "python gdb.Progspace.objfile_for_address"
> >  	    }
> >  	}
> >      }
> > -- 
> > 2.45.2
>
  
Jan Vrany Dec. 12, 2024, 2:43 p.m. UTC | #3
On Sat, 2024-12-07 at 11:45 +0000, Andrew Burgess wrote:
> 
> The second issue is that it is possible for there to be a gap in an
> objfile's range.  This is described in a comment inside the function
> find_pc_sect_compunit_symtab (symtab.c), which is worth a read.  But
> basically a CU might cover 0x1000..0x2000 then 0x3000..0x4000 which a
> different CU from a different objfile might cover 0x2000..0x3000. 
> The
> global block for the first CU will have a start/end of
> 0x1000..0x4000,
> but shouldn't claim addresses in the range 0x2000..0x3000.
> 
> You might need to factor out some of the logic from
> find_pc_sect_compunit_symtab and do an address to compunit_symtab
> lookup, then check if the objfile associated with the compunit_symtab
> is
> the one being checked for?

On a second thought, in what situations there can be 2 CUs, each 
from different objfile, interleaved like the comment says? 

The comment says:

  This happens for native ecoff format, where code from included files
  gets its own symtab.

In this case, symtabs for included files would belong to the same 
objfile, no? 

In the other cases:

  It also happens for objfiles that have their functions reordered.

Again, all functions (possibly from different CUs) would belong to
the same objfile, no?

Jan
> 
> Thanks,
> Andrew
> 
> > +	    return true;
> > +	}
> > +    }
> >    return false;
> >  }
> >  
> > diff --git a/gdb/testsuite/gdb.base/jit-reader.exp
> > b/gdb/testsuite/gdb.base/jit-reader.exp
> > index 62f6af29ac1..9a873c77909 100644
> > --- a/gdb/testsuite/gdb.base/jit-reader.exp
> > +++ b/gdb/testsuite/gdb.base/jit-reader.exp
> > @@ -229,6 +229,15 @@ proc jit_reader_test {} {
> >  		gdb_test "python print( \[o for o in
> > gdb.objfiles() if o.filename.startswith('<< JIT compiled
> > code')\]\[0\].build_id )" \
> >  		    "None" \
> >  		    "python gdb.Objfile.build_id"
> > +
> > +		# Check that Progspace.objfile_for_address ()
> > finds "jitted"
> > +		# objfile
> > +		gdb_test "frame 0" \
> > +		    "#0  $hex in jit_function_stack_mangle ()$any"
> > \
> > +		    "select frame 0"
> > +		gdb_test "python print(
> > gdb.current_progspace().objfile_for_address(gdb.parse_and_eval('\$p
> > c')) )" \
> > +		    "<gdb.Objfile filename=<< JIT compiled code at
> > $hex >>>" \
> > +		    "python gdb.Progspace.objfile_for_address"
> >  	    }
> >  	}
> >      }
> > -- 
> > 2.45.2
>
  
Tom Tromey Dec. 19, 2024, 7:43 p.m. UTC | #4
>> +	  global_block *gb = cu->blockvector ()->global_block ();
>> +	  if (gb->start () <= addr && addr <= gb->end ())

Andrew> I believe that the block's end address is the first address outside the
Andrew> block.  So I think this should be:

Andrew>   if (gb->start () <= addr && addr < gb->end ())

If this will still be in the final patch, it may make sense to have a
'block' method for this.

thanks,
Tom
  
Jan Vrany Dec. 19, 2024, 10:27 p.m. UTC | #5
On Thu, 2024-12-19 at 12:43 -0700, Tom Tromey wrote:
> > > +	  global_block *gb = cu->blockvector ()->global_block
> > > ();
> > > +	  if (gb->start () <= addr && addr <= gb->end ())
> 
> Andrew> I believe that the block's end address is the first address
> outside the
> Andrew> block.  So I think this should be:
> 
> Andrew>   if (gb->start () <= addr && addr < gb->end ())
> 
> If this will still be in the final patch, it may make sense to have a
> 'block' method for this.
> 
Yes, in my work-in-progress v3 I did exactly this. Accounting for 
(at least some cases of) non-continguous compunits caused more 
changes not only in this patch but also in others.

Thanks, 
Jan

> thanks,
> Tom
>
  

Patch

diff --git a/gdb/objfiles.c b/gdb/objfiles.c
index 555195dc61f..0bb578fa6a8 100644
--- a/gdb/objfiles.c
+++ b/gdb/objfiles.c
@@ -1194,6 +1194,22 @@  is_addr_in_objfile (CORE_ADDR addr, const struct objfile *objfile)
       if (osect->contains (addr))
 	return true;
     }
+  /* Objfiles created dynamically by JIT reader API (and possibly by
+     other means too) do not have sections and therefore the above
+     check never succeeds.
+
+     For such "dynamic" objfiles walk over all compunits and check
+     if given ADDR falls into compunit's global block.  */
+  if (! objfile->sections_start)
+    {
+      for (compunit_symtab *cu
+	     : const_cast<struct objfile *>(objfile)->compunits ())
+	{
+	  global_block *gb = cu->blockvector ()->global_block ();
+	  if (gb->start () <= addr && addr <= gb->end ())
+	    return true;
+	}
+    }
   return false;
 }
 
diff --git a/gdb/testsuite/gdb.base/jit-reader.exp b/gdb/testsuite/gdb.base/jit-reader.exp
index 62f6af29ac1..9a873c77909 100644
--- a/gdb/testsuite/gdb.base/jit-reader.exp
+++ b/gdb/testsuite/gdb.base/jit-reader.exp
@@ -229,6 +229,15 @@  proc jit_reader_test {} {
 		gdb_test "python print( \[o for o in gdb.objfiles() if o.filename.startswith('<< JIT compiled code')\]\[0\].build_id )" \
 		    "None" \
 		    "python gdb.Objfile.build_id"
+
+		# Check that Progspace.objfile_for_address () finds "jitted"
+		# objfile
+		gdb_test "frame 0" \
+		    "#0  $hex in jit_function_stack_mangle ()$any" \
+		    "select frame 0"
+		gdb_test "python print( gdb.current_progspace().objfile_for_address(gdb.parse_and_eval('\$pc')) )" \
+		    "<gdb.Objfile filename=<< JIT compiled code at $hex >>>" \
+		    "python gdb.Progspace.objfile_for_address"
 	    }
 	}
     }