Use function entry point record only for entry values

Message ID 20231207124745.1362-1-ssbssa@yahoo.de
State New
Series Use function entry point record only for entry values

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Hannes Domani Dec. 7, 2023, 12:47 p.m. UTC
  PR28987 notes that optimized code sometimes shows the wrong
value of variables at the entry point of a function, if some
code was optimized away and the variable has multiple values
stored in the debug info for this location:
```
(gdb) info address i
Symbol "i" is multi-location:
  Base address 0x140001600  Range 0x13fd41600-0x13fd41600: the constant 0
  Range 0x13fd41600-0x13fd41600: the constant 1
  Range 0x13fd41600-0x13fd41600: the constant 2
  Range 0x13fd41600-0x13fd41600: the constant 3
  Range 0x13fd41600-0x13fd41600: the constant 4
  Range 0x13fd41600-0x13fd41600: the constant 5
  Range 0x13fd41600-0x13fd41600: the constant 6
  Range 0x13fd41600-0x13fd41600: the constant 7
  Range 0x13fd41600-0x13fd4160f: the constant 8
(gdb) p i
$1 = 0
```

Currently, when stopped at the entry point of a function, GDB
always shows the initial value (here 0), while the user would
expect the last value (here 8).
This logic was introduced to show the entry values of
function arguments when they are available, but for some
reason it was applied to non-entry values as well.

One of the tests of amd64-entry-value.exp shows the same
problem for function arguments:
```
s1=s1@entry=11, s2=s2@entry=12, ..., d9=d9@entry=11.5, da=da@entry=12.5
```

I've fixed this by only using the initial values when
explicitly looking for entry values.
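
The selection rule described here can be modeled roughly as follows. This is a simplified sketch, not GDB's code: `LocRecord`, `find_constant`, and `example_records` are hypothetical names, the records carry plain constants instead of DWARF expressions, and the function entry address is passed in directly rather than looked up from the block. Only the `at_entry` flag corresponds to the parameter added by the patch.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Simplified stand-in for one location-list record (hypothetical, not
// GDB's real data structure).  The range is half-open: [low, high).
struct LocRecord {
  std::uint64_t low;
  std::uint64_t high;
  int constant;  // the value this record describes
};

// Hypothetical model of the lookup after the patch.
std::optional<int> find_constant(const std::vector<LocRecord>& records,
                                 std::uint64_t pc, std::uint64_t func_entry,
                                 bool at_entry = false) {
  for (const LocRecord& r : records) {
    // An empty record (low == high) at PC is only honored when the caller
    // explicitly asks for entry values and PC really is the entry point.
    if (r.low == r.high && pc == r.low && at_entry) {
      if (pc == func_entry)
        return r.constant;
    }
    // Ordinary half-open range match; an empty range never matches here.
    if (pc >= r.low && pc < r.high)
      return r.constant;
  }
  return std::nullopt;
}

// The nine records from the `info address i` listing: eight empty records
// holding the constants 0..7, then one real range holding the constant 8.
std::vector<LocRecord> example_records() {
  const std::uint64_t entry = 0x13fd41600;
  std::vector<LocRecord> records;
  for (int value = 0; value < 8; ++value)
    records.push_back({entry, entry, value});
  records.push_back({entry, 0x13fd4160f, 8});
  return records;
}
```

In this model, a lookup at the entry PC with `at_entry` left false skips the empty records and returns 8, while the same lookup with `at_entry` set to true returns 0 from the first empty record.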

Now the local variable is as expected:
```
(gdb) p i
$1 = 8
```

And the test of amd64-entry-value.exp shows the expected
current and entry values of the function arguments:
```
s1=3, s1@entry=11, s2=4, s2@entry=12, ..., d9=3.5, d9@entry=11.5, da=4.5, da@entry=12.5
```

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28987
---
 gdb/dwarf2/loc.c                             | 7 ++++---
 gdb/dwarf2/loc.h                             | 3 ++-
 gdb/testsuite/gdb.arch/amd64-entry-value.exp | 2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)
  

Comments

Guinevere Larsen Dec. 13, 2023, 5:39 p.m. UTC | #1
On 07/12/2023 13:47, Hannes Domani wrote:
> PR28987 notes that optimized code sometimes shows the wrong
> value of variables at the entry point of a function, if some
> code was optimized away and the variable has multiple values
> stored in the debug info for this location:
> ```
> (gdb) info address i
> Symbol "i" is multi-location:
>    Base address 0x140001600  Range 0x13fd41600-0x13fd41600: the constant 0
>    Range 0x13fd41600-0x13fd41600: the constant 1
>    Range 0x13fd41600-0x13fd41600: the constant 2
>    Range 0x13fd41600-0x13fd41600: the constant 3
>    Range 0x13fd41600-0x13fd41600: the constant 4
>    Range 0x13fd41600-0x13fd41600: the constant 5
>    Range 0x13fd41600-0x13fd41600: the constant 6
>    Range 0x13fd41600-0x13fd41600: the constant 7
>    Range 0x13fd41600-0x13fd4160f: the constant 8
> (gdb) p i
> $1 = 0
> ```
>
> Currently, when at the entry point of a function, it will
> always show the initial value (here 0), while the user would
> expect the last value (here 8).
> This logic was introduced for showing the entry-values of
> function arguments if they are available, but for some
> reason this was added for non-entry-values as well.
>
> One of the tests of amd64-entry-value.exp shows the same
> problem for function arguments:
> ```
> s1=s1@entry=11, s2=s2@entry=12, ..., d9=d9@entry=11.5, da=da@entry=12.5
> ```
>
> I've fixed this by only using the initial values when
> explicitly looking for entry values.
>
> Now the local variable is as expected:
> ```
> (gdb) p i
> $1 = 8
> ```
>
> And the test of amd64-entry-value.exp shows the expected
> current and entry values of the function arguments:
> ```
> s1=3, s1@entry=11, s2=4, s2@entry=12, ..., d9=3.5, d9@entry=11.5, da=4.5, da@entry=12.5
> ```
>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28987
> ---
I'm not sure what's going on in this code, but I tested this on Fedora
37 with both gcc and clang and this patch fixes the issue without adding
regressions.

Tested-By: Guinevere Larsen <blarsen@redhat.com>
  
Tom Tromey Dec. 13, 2023, 8:48 p.m. UTC | #2
>>>>> "Hannes" == Hannes Domani <ssbssa@yahoo.de> writes:

Hannes> PR28987 notes that optimized code sometimes shows the wrong
Hannes> value of variables at the entry point of a function, if some
Hannes> code was optimized away and the variable has multiple values
Hannes> stored in the debug info for this location:

Hannes> ```
Hannes> (gdb) info address i
Hannes> Symbol "i" is multi-location:
Hannes>   Base address 0x140001600  Range 0x13fd41600-0x13fd41600: the constant 0
Hannes>   Range 0x13fd41600-0x13fd41600: the constant 1
Hannes>   Range 0x13fd41600-0x13fd41600: the constant 2
Hannes>   Range 0x13fd41600-0x13fd41600: the constant 3
Hannes>   Range 0x13fd41600-0x13fd41600: the constant 4
Hannes>   Range 0x13fd41600-0x13fd41600: the constant 5
Hannes>   Range 0x13fd41600-0x13fd41600: the constant 6
Hannes>   Range 0x13fd41600-0x13fd41600: the constant 7
Hannes>   Range 0x13fd41600-0x13fd4160f: the constant 8

This seems super pathological.  I don't really understand why one value
would be preferred over any other value here, at least without that
"views" feature that AFAIK was never implemented.

Hannes> One of the tests of amd64-entry-value.exp shows the same
Hannes> problem for function arguments:
Hannes> ```
Hannes> s1=s1@entry=11, s2=s2@entry=12, ..., d9=d9@entry=11.5, da=da@entry=12.5
Hannes> ```

Hannes> I've fixed this by only using the initial values when
Hannes> explicitly looking for entry values.

Hannes> Now the local variable is as expected:

Hannes> ```
Hannes> (gdb) p i
Hannes> $1 = 8
Hannes> ```

I didn't understand this part, where is this?
That function doesn't seem to have an 'i'.

Hannes>  gdb_test "bt" \
Hannes>      [multi_line \
Hannes> -	 "^#0 +stacktest *\\(r1=r1@entry=1, r2=r2@entry=2, \[^\r\n\]+, s1=s1@entry=11, s2=s2@entry=12, \[^\r\n\]+, d9=d9@entry=11\\.5, da=da@entry=12\\.5\\) \[^\r\n\]*" \
Hannes> +	 "^#0 +stacktest *\\(r1=r1@entry=1, r2=r2@entry=2, \[^\r\n\]+, s1=3, s1@entry=11, s2=4, s2@entry=12, \[^\r\n\]+, d9=3\\.5, d9@entry=11\\.5, da=4\\.5, da@entry=12\\.5\\) \[^\r\n\]*" \
Hannes>  	 "#1 +0x\[0-9a-f\]+ in main .*"] \
Hannes>      "entry_stack: bt at entry"
 
So this appears to be at the prologue of this function:

    static void __attribute__((noinline, noclone))
    stacktest (int r1, int r2, int r3, int r4, int r5, int r6, int s1, int s2,
               double d1, double d2, double d3, double d4, double d5, double d6,
               double d7, double d8, double d9, double da)
    {
      s1 = 3;
      s2 = 4;
      d9 = 3.5;
      da = 4.5;
      e (v, v);
    asm ("breakhere_stacktest:");
      e (v, v);
    }

... but surely the original result here is more correct?  If you "break
func" and then stop there, all arguments should have their entry values.

I suppose the idea is that this is not happening due to optimization?
Like, those initial assignments are smeared into the prologue and gdb
can't see that?

I tend to think your patch is probably alright, but I don't really
understand it -- I also don't understand the pre-existing comment in
dwarf2_find_location_expression.  So maybe more justification would be
helpful.

Tom
  
Hannes Domani Dec. 14, 2023, 6:29 a.m. UTC | #3
On Wednesday, December 13, 2023 at 21:48:07 CET, Tom Tromey <tom@tromey.com> wrote:

> >>>>> "Hannes" == Hannes Domani <ssbssa@yahoo.de> writes:
>
> Hannes> PR28987 notes that optimized code sometimes shows the wrong
> Hannes> value of variables at the entry point of a function, if some
> Hannes> code was optimized away and the variable has multiple values
> Hannes> stored in the debug info for this location:
>
> Hannes> ```
> Hannes> (gdb) info address i
> Hannes> Symbol "i" is multi-location:
> Hannes>  Base address 0x140001600  Range 0x13fd41600-0x13fd41600: the constant 0
> Hannes>  Range 0x13fd41600-0x13fd41600: the constant 1
> Hannes>  Range 0x13fd41600-0x13fd41600: the constant 2
> Hannes>  Range 0x13fd41600-0x13fd41600: the constant 3
> Hannes>  Range 0x13fd41600-0x13fd41600: the constant 4
> Hannes>  Range 0x13fd41600-0x13fd41600: the constant 5
> Hannes>  Range 0x13fd41600-0x13fd41600: the constant 6
> Hannes>  Range 0x13fd41600-0x13fd41600: the constant 7
> Hannes>  Range 0x13fd41600-0x13fd4160f: the constant 8
>
> This seems super pathological.  I don't really understand why one value
> would be preferred over any other value here, at least without that
> "views" feature that AFAIK was never implemented.
>
> Hannes> One of the tests of amd64-entry-value.exp shows the same
> Hannes> problem for function arguments:
> Hannes> ```
> Hannes> s1=s1@entry=11, s2=s2@entry=12, ..., d9=d9@entry=11.5, da=da@entry=12.5
> Hannes> ```
>
> Hannes> I've fixed this by only using the initial values when
> Hannes> explicitly looking for entry values.
>
> Hannes> Now the local variable is as expected:
>
> Hannes> ```
> Hannes> (gdb) p i
> Hannes> $1 = 8
> Hannes> ```
>
> I didn't understand this part, where is this?
> That function doesn't seem to have an 'i'.
>
> Hannes>  gdb_test "bt" \
> Hannes>      [multi_line \
> Hannes> -    "^#0 +stacktest *\\(r1=r1@entry=1, r2=r2@entry=2, \[^\r\n\]+, s1=s1@entry=11, s2=s2@entry=12, \[^\r\n\]+, d9=d9@entry=11\\.5, da=da@entry=12\\.5\\) \[^\r\n\]*" \
> Hannes> +    "^#0 +stacktest *\\(r1=r1@entry=1, r2=r2@entry=2, \[^\r\n\]+, s1=3, s1@entry=11, s2=4, s2@entry=12, \[^\r\n\]+, d9=3\\.5, d9@entry=11\\.5, da=4\\.5, da@entry=12\\.5\\) \[^\r\n\]*" \
> Hannes>      "#1 +0x\[0-9a-f\]+ in main .*"] \
> Hannes>      "entry_stack: bt at entry"
>
> So this appears to be at the prologue of this function:
>
>     static void __attribute__((noinline, noclone))
>     stacktest (int r1, int r2, int r3, int r4, int r5, int r6, int s1, int s2,
>               double d1, double d2, double d3, double d4, double d5, double d6,
>               double d7, double d8, double d9, double da)
>
>     {
>       s1 = 3;
>       s2 = 4;
>       d9 = 3.5;
>       da = 4.5;
>
>       e (v, v);
>     asm ("breakhere_stacktest:");
>       e (v, v);
>     }
>
> ... but surely the original result here is more correct?  If you "break
> func" and then stop there, all arguments should have their entry values.
>
> I suppose the idea is that this is not happening due to optimization?
> Like, those initial assignments are smeared into the prologue and gdb
> can't see that?
>
> I tend to think your patch is probably alright, but I don't really
> understand it -- I also don't understand the pre-existing comment in
> dwarf2_find_location_expression.  So maybe more justification would be
> helpful.

Yes, it is about optimization.
I should have added more context, since I was even talking about
2 different examples.
If I change the commit message to the following, maybe the
behavior is more clear:


PR28987 notes that optimized code sometimes shows the wrong
value of variables at the entry point of a function, if some
code was optimized away and the variable has multiple values
stored in the debug info for this location.

In this example:
```
void foo()
{
   int l_3 = 5, i = 0;
   for (; i < 8; i++)
       ;
   test(l_3, i);
}
```
When compiled with optimization, the entry point of foo is at
the test() function call, since everything else is optimized
away.
The debug info of i looks like this:
```
(gdb) info address i
Symbol "i" is multi-location:
  Base address 0x140001600  Range 0x13fd41600-0x13fd41600: the constant 0
  Range 0x13fd41600-0x13fd41600: the constant 1
  Range 0x13fd41600-0x13fd41600: the constant 2
  Range 0x13fd41600-0x13fd41600: the constant 3
  Range 0x13fd41600-0x13fd41600: the constant 4
  Range 0x13fd41600-0x13fd41600: the constant 5
  Range 0x13fd41600-0x13fd41600: the constant 6
  Range 0x13fd41600-0x13fd41600: the constant 7
  Range 0x13fd41600-0x13fd4160f: the constant 8
(gdb) p i
$1 = 0
```

Currently, when stopped at the entry point of a function, GDB
always shows the initial value (here 0), while the user would
expect the last value (here 8).
This logic was introduced to show the entry values of
function arguments when they are available, but for some
reason it was applied to non-entry values as well.

One of the tests of amd64-entry-value.exp shows the same
problem for function arguments. If you "break stacktest"
in the following example, you stop at this line:
```
124     static void __attribute__((noinline, noclone))
125     stacktest (int r1, int r2, int r3, int r4, int r5, int r6, int s1, int s2,
126                double d1, double d2, double d3, double d4, double d5, double d6,
127                double d7, double d8, double d9, double da)
128     {
129       s1 = 3;
130       s2 = 4;
131       d9 = 3.5;
132       da = 4.5;
133 ->    e (v, v);
134     asm ("breakhere_stacktest:");
135       e (v, v);
136     }
```
But `bt` still shows the entry values:
```
s1=s1@entry=11, s2=s2@entry=12, ..., d9=d9@entry=11.5, da=da@entry=12.5
```

I've fixed this by only using the initial values when
explicitly looking for entry values.

Now the local variable of the first example is as expected:
```
(gdb) p i
$1 = 8
```

And the test of amd64-entry-value.exp shows the expected
current and entry values of the function arguments:
```
s1=3, s1@entry=11, s2=4, s2@entry=12, ..., d9=3.5, d9@entry=11.5, da=4.5, da@entry=12.5
```
  
Tom Tromey Dec. 16, 2023, 12:32 a.m. UTC | #4
>>>>> "Hannes" == Hannes Domani <ssbssa@yahoo.de> writes:

Hannes> Yes, it is about optimization.
Hannes> I should have added more context, since I was even talking about
Hannes> 2 different examples.
Hannes> If I change the commit message to the following, maybe the
Hannes> behavior is more clear:

Thank you.  I re-read the code and your description, and I think it is
clear now.

Hannes>   Range 0x13fd41600-0x13fd41600: the constant 1

... the start==end case is the sign that this is an entry value, IIUC.
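
One way to see why start==end stands out: under the usual half-open range test, an empty range can never contain any PC, so such a record only has an effect if the lookup treats it specially. A minimal sketch, assuming C++17 and using the two hypothetical helpers `range_contains` and `is_empty_range` (neither exists in GDB):

```cpp
#include <cstdint>

// Hypothetical helper: ordinary half-open range test, [low, high).
constexpr bool range_contains(std::uint64_t low, std::uint64_t high,
                              std::uint64_t pc) {
  return pc >= low && pc < high;
}

// Hypothetical helper: the start == end case discussed in this thread.
constexpr bool is_empty_range(std::uint64_t low, std::uint64_t high) {
  return low == high;
}

// Addresses taken from the `info address i` listing.
static_assert(is_empty_range(0x13fd41600, 0x13fd41600));
static_assert(!range_contains(0x13fd41600, 0x13fd41600, 0x13fd41600));
static_assert(range_contains(0x13fd41600, 0x13fd4160f, 0x13fd41600));
```

The first two assertions cover a record such as "the constant 1" above, and the last covers the final record with the constant 8, which is the only one a plain range test can select.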

Anyway I think your patch is ok.
Approved-By: Tom Tromey <tom@tromey.com>

Tom
  
Hannes Domani Dec. 16, 2023, 10:29 a.m. UTC | #5
On Saturday, December 16, 2023, 01:32:06 CET, Tom Tromey <tom@tromey.com> wrote:

> >>>>> "Hannes" == Hannes Domani <ssbssa@yahoo.de> writes:
>
> Hannes> Yes, it is about optimization.
> Hannes> I should have added more context, since I was even talking about
> Hannes> 2 different examples.
> Hannes> If I change the commit message to the following, maybe the
> Hannes> behavior is more clear:
>
> Thank you.  I re-read the code and your description, and I think it is
> clear now.
>
> Hannes>   Range 0x13fd41600-0x13fd41600: the constant 1
>
> ... the start==end case is the sign that this is an entry value, IIUC.

Yes.


> Anyway I think your patch is ok.
> Approved-By: Tom Tromey <tom@tromey.com>

Pushed, thanks.
  

Patch

diff --git a/gdb/dwarf2/loc.c b/gdb/dwarf2/loc.c
index 5b2d58ab44e..c15221eb7a2 100644
--- a/gdb/dwarf2/loc.c
+++ b/gdb/dwarf2/loc.c
@@ -363,7 +363,8 @@  decode_debug_loc_dwo_addresses (dwarf2_per_cu_data *per_cu,
 
 const gdb_byte *
 dwarf2_find_location_expression (const dwarf2_loclist_baton *baton,
-				 size_t *locexpr_length, const CORE_ADDR pc)
+				 size_t *locexpr_length, const CORE_ADDR pc,
+				 bool at_entry)
 {
   dwarf2_per_objfile *per_objfile = baton->per_objfile;
   struct objfile *objfile = per_objfile->objfile;
@@ -456,7 +457,7 @@  dwarf2_find_location_expression (const dwarf2_loclist_baton *baton,
 	  loc_ptr += bytes_read;
 	}
 
-      if (low == high && unrel_pc == low)
+      if (low == high && unrel_pc == low && at_entry)
 	{
 	  /* This is entry PC record present only at entry point
 	     of a function.  Verify it is really the function entry point.  */
@@ -3920,7 +3921,7 @@  loclist_read_variable_at_entry (struct symbol *symbol, frame_info_ptr frame)
   if (frame == NULL || !get_frame_func_if_available (frame, &pc))
     return value::allocate_optimized_out (symbol->type ());
 
-  data = dwarf2_find_location_expression (dlbaton, &size, pc);
+  data = dwarf2_find_location_expression (dlbaton, &size, pc, true);
   if (data == NULL)
     return value::allocate_optimized_out (symbol->type ());
 
diff --git a/gdb/dwarf2/loc.h b/gdb/dwarf2/loc.h
index 5cf824d3ae2..94e1fbe517e 100644
--- a/gdb/dwarf2/loc.h
+++ b/gdb/dwarf2/loc.h
@@ -39,7 +39,8 @@  extern unsigned int entry_values_debug;
 const gdb_byte *dwarf2_find_location_expression
   (const dwarf2_loclist_baton *baton,
    size_t *locexpr_length,
-   CORE_ADDR pc);
+   CORE_ADDR pc,
+   bool at_entry = false);
 
 /* Find the frame base information for FRAMEFUNC at PC.  START is an
    out parameter which is set to point to the DWARF expression to
diff --git a/gdb/testsuite/gdb.arch/amd64-entry-value.exp b/gdb/testsuite/gdb.arch/amd64-entry-value.exp
index 3c666acc117..c7fea226df7 100644
--- a/gdb/testsuite/gdb.arch/amd64-entry-value.exp
+++ b/gdb/testsuite/gdb.arch/amd64-entry-value.exp
@@ -77,7 +77,7 @@  gdb_continue_to_breakpoint "entry_stack: stacktest"
 
 gdb_test "bt" \
     [multi_line \
-	 "^#0 +stacktest *\\(r1=r1@entry=1, r2=r2@entry=2, \[^\r\n\]+, s1=s1@entry=11, s2=s2@entry=12, \[^\r\n\]+, d9=d9@entry=11\\.5, da=da@entry=12\\.5\\) \[^\r\n\]*" \
+	 "^#0 +stacktest *\\(r1=r1@entry=1, r2=r2@entry=2, \[^\r\n\]+, s1=3, s1@entry=11, s2=4, s2@entry=12, \[^\r\n\]+, d9=3\\.5, d9@entry=11\\.5, da=4\\.5, da@entry=12\\.5\\) \[^\r\n\]*" \
 	 "#1 +0x\[0-9a-f\]+ in main .*"] \
     "entry_stack: bt at entry"