[1/2] PowerPC, Fix-test-gdb.base-store.exp

Message ID 76b8ed7b93608d40ab42b0538319f78eaf7d621c.camel@us.ibm.com
State New
Headers
Series [1/2] PowerPC, Fix-test-gdb.base-store.exp |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Carl Love Oct. 12, 2023, 2:58 p.m. UTC
  GDB maintainers:

This is the first patch in the series which fixes the DWWARF register
mapping and signal handling issues on PowerPC.

                  Carl

-----------------------------------------------

rs6000, Fix Linux DWARF register mapping

The PowerPC DWARF register mapping is the same for the .eh_frame and
.debug_frame on Linux.  PowerPC uses different mapping for .eh_frame and
.debug_frame on other operating systems.  The current GDB support for
mapping the DWARF registers in rs6000_linux_dwarf2_reg_to_regnum and
rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not correct for Linux.
The files have some legacy mappings for spe_acc, spefscr, EV which was
removed from GCC in 2017.

This patch adds a two new functions rs6000_linux_dwarf2_reg_to_regnum,
and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-tdep.c to handle
the DWARF register mappings on Linux.  Function
rs6000_linux_dwarf2_reg_to_regnum is installed for both gdb_dwarf_to_regnum
and gdbarch_stab_reg_to_regnum since the mappings are the same.

The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is updated to
call set_gdbarch_dwarf2_reg_to_regnum map the new function
rs6000_linux_dwarf2_reg_to_regnum for the architecture.  Similarly,
dwarf2_frame_set_adjust_regnum is called to map
rs6000_linux_adjust_frame_regnum into the architecture.

The second issue fixed by this patch is the check for subroutine
process_event_stop_test.  Need to make sure the frame is not the
SIGTRAMP_FRAME.  The sequence of events on most platforms is:

  1) Some code is running when a signal arrives.
  2) The kernel handles the signal and dispatches to the handler.
    ...

However on PowerPC the sequence of events is:

  1) Some code is running when a signal arrives.
  2) The kernel handles the signal and dispatches to the trampoline.
  3) The trampoline performs a normal function call to the handler.
      ...

We want "nexti" to step into, not over, signal handlers invoked
by the kernel.  This is the case most platforms as the kernel puts a
signal trampoline frame onto the stack to handle proper return after the
handler.  However, on some platforms such as PowerPC, the kernel actually
uses a trampoline to handle *invocation* of the handler.

The issue is fixed in function process_event_stop_test by adding a check
that the frame is not a SIGTRAMP_FRAME to the if statement to stop in
a subroutine call.  This prevents GDB from erroneously detecting the
trampoline invocation as a subroutine call.

This patch fixes two regression test failures in gdb.base/store.exp.  It
also fixes two regression failures in gdb.python/py-thread-exited.exp.

Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10 with no
new regressions.
---
 gdb/infrun.c         | 13 ++++++++++
 gdb/ppc-linux-tdep.c | 56 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)
  

Comments

Keith Seitz Oct. 13, 2023, 8:34 p.m. UTC | #1
Carl Love wrote:

> This patch fixes two regression test failures in gdb.base/store.exp.  It
> also fixes two regression failures in gdb.python/py-thread-exited.exp.

I did not notice any failures on HEAD in this test? But I also don't see
any new regressions in any test with your patch.  Just FAIL -> PASS.

> Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10 with no
> new regressions.

I've read through this patch (and tested it), and I only have a few very
trivial fixes to request.  I don't think there's any reason to repost to
fix a couple of typos, so just await a proper maintainers approval and
commit with that approval.

I don't know this code sufficiently well to give a proper approval, but
I did not notice anything egregiously wrong.

Tested-by: Keith Seitz <keiths@redhat.com>

Thanks for the patch!
Keith

---
 gdb/infrun.c         | 13 ++++++++++
 gdb/ppc-linux-tdep.c | 56 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
index 784dafa59db..7fb90799dff 100644
--- a/gdb/ppc-linux-tdep.c
+++ b/gdb/ppc-linux-tdep.c
@@ -83,6 +83,7 @@
 #include "features/rs6000/powerpc-isa207-vsx64l.c"
 #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
 #include "features/rs6000/powerpc-e500l.c"
+#include "dwarf2/frame.h"
 
 /* Shared library operations for PowerPC-Linux.  */
 static struct target_so_ops powerpc_so_ops;
@@ -2088,6 +2089,52 @@ ppc_linux_displaced_step_prepare  (gdbarch *arch, thread_info *thread,
   return per_inferior->disp_step_buf->prepare (thread, displaced_pc);
 }
 
+/* Convert a Dwarf 2 register number to a GDB register number for Linux.  */
+static int
+rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch, int num)
+{
+  ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
+
+  if (0 <= num && num <= 31)
+    return tdep->ppc_gp0_regnum + num;
+  else if (32 <= num && num <= 63)
+    /* FIXME: jimb/2004-05-05: What should we do when the debug info
+       specifies registers the architecture doesn't have?  Our
+       callers don't check the value we return.  */
+    return tdep->ppc_fp0_regnum + (num - 32);
+  else if (77 <= num && num < 77 + 32)
+    return tdep->ppc_vr0_regnum + (num - 77);
+  else
+    switch (num)
+      {
+      case 65:
+	return tdep->ppc_lr_regnum;
+      case 66:
+	return tdep->ppc_ctr_regnum;
+      case 76:
+	return tdep->ppc_xer_regnum;
+      case 109:
+	return tdep->ppc_vrsave_regnum;
+      case 110:
+	return tdep->ppc_vrsave_regnum - 1; /* vscr */
+      }
+
+  /* Unknown DWARF register number.  */
+  return -1;
+}
+

The spacing here is inconsistent. In the function above, there is no
newline between the comment and the definition.  Here there are two
newlines:

+/* Translate a .eh_frame register to DWARF register, or adjust a
+   .debug_frame register.  */
+
+
+static int
+rs6000_linux_adjust_frame_regnum (struct gdbarch *gdbarch, int num,
+				  int eh_frame_p)
+{
+  /* Linux uses the same numbering for .debug_frame numbering as .eh_frame.  */
+  return num;
+}
+
 static void
 ppc_linux_init_abi (struct gdbarch_info info,
 		    struct gdbarch *gdbarch)
@@ -2135,6 +2182,15 @@ ppc_linux_init_abi (struct gdbarch_info info,
   set_gdbarch_stap_is_single_operand (gdbarch, ppc_stap_is_single_operand);
   set_gdbarch_stap_parse_special_token (gdbarch,
 					ppc_stap_parse_special_token);
+  /* Linux DWARF register mapping is different from the othe OS's.  */

Note the typo, "othe[r]". The correct plural form is "OSes".

+  set_gdbarch_dwarf2_reg_to_regnum (gdbarch,
+				    rs6000_linux_dwarf2_reg_to_regnum);
+  /* Note on Linux the mapping for the DWARF registers and the stab registers
+     use the same numbers.  Install rs6000_linux_dwarf2_reg_to_regnum for the
+     stab register mappings as well.  */
+  set_gdbarch_stab_reg_to_regnum (gdbarch,
+				    rs6000_linux_dwarf2_reg_to_regnum);
+  dwarf2_frame_set_adjust_regnum (gdbarch, rs6000_linux_adjust_frame_regnum);
 
   if (tdep->wordsize == 4)
     {
  
Carl Love Oct. 13, 2023, 9 p.m. UTC | #2
Keith:

Thanks for the review.  I have fixed the typos as noted below.

                 Carl 
----------

On Fri, 2023-10-13 at 13:34 -0700, Keith Seitz wrote:
> Carl Love wrote:
> 
> > This patch fixes two regression test failures in
> > gdb.base/store.exp.  It
> > also fixes two regression failures in gdb.python/py-thread-
> > exited.exp.
> 
> I did not notice any failures on HEAD in this test? But I also don't
> see
> any new regressions in any test with your patch.  Just FAIL -> PASS.
> 
> > Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10
> > with no
> > new regressions.
> 
> I've read through this patch (and tested it), and I only have a few
> very
> trivial fixes to request.  I don't think there's any reason to repost
> to
> fix a couple of typos, so just await a proper maintainers approval
> and
> commit with that approval.
> 
> I don't know this code sufficiently well to give a proper approval,
> but
> I did not notice anything egregiously wrong.
> 
> Tested-by: Keith Seitz <keiths@redhat.com>
> 
> Thanks for the patch!
> Keith
> 
> ---
>  gdb/infrun.c         | 13 ++++++++++
>  gdb/ppc-linux-tdep.c | 56
> ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
> 
> diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
> index 784dafa59db..7fb90799dff 100644
> --- a/gdb/ppc-linux-tdep.c
> +++ b/gdb/ppc-linux-tdep.c
> @@ -83,6 +83,7 @@
>  #include "features/rs6000/powerpc-isa207-vsx64l.c"
>  #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
>  #include "features/rs6000/powerpc-e500l.c"
> +#include "dwarf2/frame.h"
>  
>  /* Shared library operations for PowerPC-Linux.  */
>  static struct target_so_ops powerpc_so_ops;
> @@ -2088,6 +2089,52 @@ ppc_linux_displaced_step_prepare  (gdbarch
> *arch, thread_info *thread,
>    return per_inferior->disp_step_buf->prepare (thread,
> displaced_pc);
>  }
>  
> +/* Convert a Dwarf 2 register number to a GDB register number for
> Linux.  */
> +static int
> +rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch, int num)
> +{
> +  ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
> +
> +  if (0 <= num && num <= 31)
> +    return tdep->ppc_gp0_regnum + num;
> +  else if (32 <= num && num <= 63)
> +    /* FIXME: jimb/2004-05-05: What should we do when the debug info
> +       specifies registers the architecture doesn't have?  Our
> +       callers don't check the value we return.  */
> +    return tdep->ppc_fp0_regnum + (num - 32);
> +  else if (77 <= num && num < 77 + 32)
> +    return tdep->ppc_vr0_regnum + (num - 77);
> +  else
> +    switch (num)
> +      {
> +      case 65:
> +	return tdep->ppc_lr_regnum;
> +      case 66:
> +	return tdep->ppc_ctr_regnum;
> +      case 76:
> +	return tdep->ppc_xer_regnum;
> +      case 109:
> +	return tdep->ppc_vrsave_regnum;
> +      case 110:
> +	return tdep->ppc_vrsave_regnum - 1; /* vscr */
> +      }
> +
> +  /* Unknown DWARF register number.  */
> +  return -1;
> +}
> +
> 
> The spacing here is inconsistent. In the function above, there is no
> newline between the comment and the definition.  Here there are two
> newlines:

Added a blank line between comment and
rs6000_linux_dwarf2_reg_to_regnum function definition.

> 
> +/* Translate a .eh_frame register to DWARF register, or adjust a
> +   .debug_frame register.  */
> +
> +

Removed the extra line.

> +static int
> +rs6000_linux_adjust_frame_regnum (struct gdbarch *gdbarch, int num,
> +				  int eh_frame_p)
> +{
> +  /* Linux uses the same numbering for .debug_frame numbering as
> .eh_frame.  */
> +  return num;
> +}
> +
>  static void
>  ppc_linux_init_abi (struct gdbarch_info info,
>  		    struct gdbarch *gdbarch)
> @@ -2135,6 +2182,15 @@ ppc_linux_init_abi (struct gdbarch_info info,
>    set_gdbarch_stap_is_single_operand (gdbarch,
> ppc_stap_is_single_operand);
>    set_gdbarch_stap_parse_special_token (gdbarch,
>  					ppc_stap_parse_special_token);
> +  /* Linux DWARF register mapping is different from the othe
> OS's.  */
> 
> Note the typo, "othe[r]". The correct plural form is "OSes".

Ah, thanks for catching the missing r and the spelling error.  Fixed.

> 
> +  set_gdbarch_dwarf2_reg_to_regnum (gdbarch,
> +				    rs6000_linux_dwarf2_reg_to_regnum);
> +  /* Note on Linux the mapping for the DWARF registers and the stab
> registers
> +     use the same numbers.  Install
> rs6000_linux_dwarf2_reg_to_regnum for the
> +     stab register mappings as well.  */
> +  set_gdbarch_stab_reg_to_regnum (gdbarch,
> +				    rs6000_linux_dwarf2_reg_to_regnum);
> +  dwarf2_frame_set_adjust_regnum (gdbarch,
> rs6000_linux_adjust_frame_regnum);
>  
>    if (tdep->wordsize == 4)
>      {
  
Ulrich Weigand Oct. 16, 2023, 11:12 a.m. UTC | #3
Carl Love <cel@us.ibm.com> wrote:

>Thanks for the review.  I have fixed the typos as noted below.

This patch (including fixed for the issues pointed out by
Keith) is OK.

Thanks,
Ulrich
  
Andrew Burgess Oct. 16, 2023, 2:31 p.m. UTC | #4
Carl Love <cel@us.ibm.com> writes:

> GDB maintainers:
>
> This is the first patch in the series which fixes the DWWARF register
> mapping and signal handling issues on PowerPC.
>
>                   Carl
>
> -----------------------------------------------
>
> rs6000, Fix Linux DWARF register mapping
>
> The PowerPC DWARF register mapping is the same for the .eh_frame and
> .debug_frame on Linux.  PowerPC uses different mapping for .eh_frame and
> .debug_frame on other operating systems.  The current GDB support for
> mapping the DWARF registers in rs6000_linux_dwarf2_reg_to_regnum and
> rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not correct for Linux.
> The files have some legacy mappings for spe_acc, spefscr, EV which was
> removed from GCC in 2017.
>
> This patch adds a two new functions rs6000_linux_dwarf2_reg_to_regnum,
> and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-tdep.c to handle
> the DWARF register mappings on Linux.  Function
> rs6000_linux_dwarf2_reg_to_regnum is installed for both gdb_dwarf_to_regnum
> and gdbarch_stab_reg_to_regnum since the mappings are the same.
>
> The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is updated to
> call set_gdbarch_dwarf2_reg_to_regnum map the new function
> rs6000_linux_dwarf2_reg_to_regnum for the architecture.  Similarly,
> dwarf2_frame_set_adjust_regnum is called to map
> rs6000_linux_adjust_frame_regnum into the architecture.
>
> The second issue fixed by this patch is the check for subroutine
> process_event_stop_test.  Need to make sure the frame is not the
> SIGTRAMP_FRAME.  The sequence of events on most platforms is:

Usually for GDB we avoid bundling unrelated changes into a single
commit.  Each commit should address one self contained issue (as far as
possible).

I really struggling to see any connection between the two fixes you have
here.

>
>   1) Some code is running when a signal arrives.
>   2) The kernel handles the signal and dispatches to the handler.
>     ...
>
> However on PowerPC the sequence of events is:
>
>   1) Some code is running when a signal arrives.
>   2) The kernel handles the signal and dispatches to the trampoline.
>   3) The trampoline performs a normal function call to the handler.
>       ...
>
> We want "nexti" to step into, not over, signal handlers invoked
> by the kernel.  This is the case most platforms as the kernel puts a
> signal trampoline frame onto the stack to handle proper return after the
> handler.  However, on some platforms such as PowerPC, the kernel actually
> uses a trampoline to handle *invocation* of the handler.
>
> The issue is fixed in function process_event_stop_test by adding a check
> that the frame is not a SIGTRAMP_FRAME to the if statement to stop in
> a subroutine call.  This prevents GDB from erroneously detecting the
> trampoline invocation as a subroutine call.
>
> This patch fixes two regression test failures in gdb.base/store.exp.  It
> also fixes two regression failures in gdb.python/py-thread-exited.exp.

On the one random PPC box I tried this patch on, I'm not seeing any
failures in gdb.python/py-thread-exited.exp either before, or after this
commit.

Which tests in gdb.python/py-thread-exited.exp are you seeing as broken?
And which of the two fixes in this commit fix the problems you're seeing?

>
> Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10 with no
> new regressions.
> ---
>  gdb/infrun.c         | 13 ++++++++++
>  gdb/ppc-linux-tdep.c | 56 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 69 insertions(+)
>
> diff --git a/gdb/infrun.c b/gdb/infrun.c
> index 4730d290442..922d8a6913d 100644
> --- a/gdb/infrun.c
> +++ b/gdb/infrun.c
> @@ -7334,8 +7334,21 @@ process_event_stop_test (struct execution_control_state *ecs)
>       initial outermost frame, before sp was valid, would
>       have code_addr == &_start.  See the comment in frame_id::operator==
>       for more.  */
> +
> +  /* We want "nexti" to step into, not over, signal handlers invoked
> +     by the kernel, therefore this subroutine check should not trigger
> +     for a signal handler invocation.  On most platforms, this is already
> +     not the case, as the kernel puts a signal trampoline frame onto the
> +     stack to handle proper return after the handler, and therefore at this
> +     point, the current frame is a grandchild of the step frame, not a
> +     child.  However, on some platforms, the kernel actually uses a
> +     trampoline to handle *invocation* of the handler.  In that case,
> +     when executing the first instruction of the trampoline, this check
> +     would erroneously detect the trampoline invocation as a subroutine
> +     call.  Fix this by checking for SIGTRAMP_FRAME.  */
>    if ((get_stack_frame_id (frame)
>         != ecs->event_thread->control.step_stack_frame_id)
> +      && get_frame_type (frame) != SIGTRAMP_FRAME
>        && ((frame_unwind_caller_id (get_current_frame ())
>  	   == ecs->event_thread->control.step_stack_frame_id)
>  	  && ((ecs->event_thread->control.step_stack_frame_id
> diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
> index 784dafa59db..7fb90799dff 100644
> --- a/gdb/ppc-linux-tdep.c
> +++ b/gdb/ppc-linux-tdep.c
> @@ -83,6 +83,7 @@
>  #include "features/rs6000/powerpc-isa207-vsx64l.c"
>  #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
>  #include "features/rs6000/powerpc-e500l.c"
> +#include "dwarf2/frame.h"
>  
>  /* Shared library operations for PowerPC-Linux.  */
>  static struct target_so_ops powerpc_so_ops;
> @@ -2088,6 +2089,52 @@ ppc_linux_displaced_step_prepare  (gdbarch *arch, thread_info *thread,
>    return per_inferior->disp_step_buf->prepare (thread, displaced_pc);
>  }
>  
> +/* Convert a Dwarf 2 register number to a GDB register number for Linux.  */
> +static int
> +rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch, int num)
> +{
> +  ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
> +
> +  if (0 <= num && num <= 31)
> +    return tdep->ppc_gp0_regnum + num;
> +  else if (32 <= num && num <= 63)
> +    /* FIXME: jimb/2004-05-05: What should we do when the debug info
> +       specifies registers the architecture doesn't have?  Our
> +       callers don't check the value we return.  */

I see this comment was just copied from else where, but isn't the answer
just: return -1 ?

The comment about the 'return -1' at the trail of this function seems to
suggest that would be the correct thing to do.

I guess I'm asking: do we need to add another copy of this (I think out
of date) fixme?

Thanks,
Andrew

> +    return tdep->ppc_fp0_regnum + (num - 32);
> +  else if (77 <= num && num < 77 + 32)
> +    return tdep->ppc_vr0_regnum + (num - 77);
> +  else
> +    switch (num)
> +      {
> +      case 65:
> +	return tdep->ppc_lr_regnum;
> +      case 66:
> +	return tdep->ppc_ctr_regnum;
> +      case 76:
> +	return tdep->ppc_xer_regnum;
> +      case 109:
> +	return tdep->ppc_vrsave_regnum;
> +      case 110:
> +	return tdep->ppc_vrsave_regnum - 1; /* vscr */
> +      }
> +
> +  /* Unknown DWARF register number.  */
> +  return -1;
> +}
> +
> +/* Translate a .eh_frame register to DWARF register, or adjust a
> +   .debug_frame register.  */
> +
> +
> +static int
> +rs6000_linux_adjust_frame_regnum (struct gdbarch *gdbarch, int num,
> +				  int eh_frame_p)
> +{
> +  /* Linux uses the same numbering for .debug_frame numbering as .eh_frame.  */
> +  return num;
> +}
> +
>  static void
>  ppc_linux_init_abi (struct gdbarch_info info,
>  		    struct gdbarch *gdbarch)
> @@ -2135,6 +2182,15 @@ ppc_linux_init_abi (struct gdbarch_info info,
>    set_gdbarch_stap_is_single_operand (gdbarch, ppc_stap_is_single_operand);
>    set_gdbarch_stap_parse_special_token (gdbarch,
>  					ppc_stap_parse_special_token);
> +  /* Linux DWARF register mapping is different from the othe OS's.  */
> +  set_gdbarch_dwarf2_reg_to_regnum (gdbarch,
> +				    rs6000_linux_dwarf2_reg_to_regnum);
> +  /* Note on Linux the mapping for the DWARF registers and the stab registers
> +     use the same numbers.  Install rs6000_linux_dwarf2_reg_to_regnum for the
> +     stab register mappings as well.  */
> +  set_gdbarch_stab_reg_to_regnum (gdbarch,
> +				    rs6000_linux_dwarf2_reg_to_regnum);
> +  dwarf2_frame_set_adjust_regnum (gdbarch, rs6000_linux_adjust_frame_regnum);
>  
>    if (tdep->wordsize == 4)
>      {
> -- 
> 2.37.2
  
Carl Love Oct. 16, 2023, 3:51 p.m. UTC | #5
Andrew:

Thanks for the review and comments.  See responses below.

On Mon, 2023-10-16 at 15:31 +0100, Andrew Burgess wrote:
> Carl Love <cel@us.ibm.com> writes:
> 
> > GDB maintainers:
> > 
> > This is the first patch in the series which fixes the DWWARF
> > register
> > mapping and signal handling issues on PowerPC.
> > 
> >                   Carl
> > 
> > -----------------------------------------------
> > 
> > rs6000, Fix Linux DWARF register mapping
> > 
> > The PowerPC DWARF register mapping is the same for the .eh_frame
> > and
> > .debug_frame on Linux.  PowerPC uses different mapping for
> > .eh_frame and
> > .debug_frame on other operating systems.  The current GDB support
> > for
> > mapping the DWARF registers in rs6000_linux_dwarf2_reg_to_regnum
> > and
> > rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not correct
> > for Linux.
> > The files have some legacy mappings for spe_acc, spefscr, EV which
> > was
> > removed from GCC in 2017.
> > 
> > This patch adds a two new functions
> > rs6000_linux_dwarf2_reg_to_regnum,
> > and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-tdep.c
> > to handle
> > the DWARF register mappings on Linux.  Function
> > rs6000_linux_dwarf2_reg_to_regnum is installed for both
> > gdb_dwarf_to_regnum
> > and gdbarch_stab_reg_to_regnum since the mappings are the same.
> > 
> > The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is updated
> > to
> > call set_gdbarch_dwarf2_reg_to_regnum map the new function
> > rs6000_linux_dwarf2_reg_to_regnum for the architecture.  Similarly,
> > dwarf2_frame_set_adjust_regnum is called to map
> > rs6000_linux_adjust_frame_regnum into the architecture.
> > 
> > The second issue fixed by this patch is the check for subroutine
> > process_event_stop_test.  Need to make sure the frame is not the
> > SIGTRAMP_FRAME.  The sequence of events on most platforms is:
> 
> Usually for GDB we avoid bundling unrelated changes into a single
> commit.  Each commit should address one self contained issue (as far
> as
> possible).
> 
> I really struggling to see any connection between the two fixes you
> have
> here.

As mentioned in the commit log, the first patch which fixes the DWARF
register mapping fixes two of the 5 failures seen in store.exp. 
Specifically the patch fixes the issue of GDB not stopping in the
handler which results in the step.exp test failing.  The second patch
then fixes the rest of the issues with the step.exp test.
> 
> >   1) Some code is running when a signal arrives.
> >   2) The kernel handles the signal and dispatches to the handler.
> >     ...
> > 
> > However on PowerPC the sequence of events is:
> > 
> >   1) Some code is running when a signal arrives.
> >   2) The kernel handles the signal and dispatches to the
> > trampoline.
> >   3) The trampoline performs a normal function call to the handler.
> >       ...
> > 
> > We want "nexti" to step into, not over, signal handlers invoked
> > by the kernel.  This is the case most platforms as the kernel puts
> > a
> > signal trampoline frame onto the stack to handle proper return
> > after the
> > handler.  However, on some platforms such as PowerPC, the kernel
> > actually
> > uses a trampoline to handle *invocation* of the handler.
> > 
> > The issue is fixed in function process_event_stop_test by adding a
> > check
> > that the frame is not a SIGTRAMP_FRAME to the if statement to stop
> > in
> > a subroutine call.  This prevents GDB from erroneously detecting
> > the
> > trampoline invocation as a subroutine call.
> > 
> > This patch fixes two regression test failures in
> > gdb.base/store.exp.  It
> > also fixes two regression failures in gdb.python/py-thread-
> > exited.exp.
> 
> On the one random PPC box I tried this patch on, I'm not seeing any
> failures in gdb.python/py-thread-exited.exp either before, or after
> this
> commit.
> 
> Which tests in gdb.python/py-thread-exited.exp are you seeing as
> broken?
> And which of the two fixes in this commit fix the problems you're
> seeing?

The comment about the py-thread-exited.exp should have been removed.  I
found that sometimes that test fails and sometime passes.  I have yet
to dig into it and figure out why the test is inconsistent.  The
inconsistent behavior exists with and without these patches.

I will remove the comment about the gdb.python/py-thread-exited.exp
fixes as that is erroneous.


> 
> > Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10
> > with no
> > new regressions.
> > ---
> >  gdb/infrun.c         | 13 ++++++++++
> >  gdb/ppc-linux-tdep.c | 56
> > ++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 69 insertions(+)
> > 
> > diff --git a/gdb/infrun.c b/gdb/infrun.c
> > index 4730d290442..922d8a6913d 100644
> > --- a/gdb/infrun.c
> > +++ b/gdb/infrun.c
> > @@ -7334,8 +7334,21 @@ process_event_stop_test (struct
> > execution_control_state *ecs)
> >       initial outermost frame, before sp was valid, would
> >       have code_addr == &_start.  See the comment in
> > frame_id::operator==
> >       for more.  */
> > +
> > +  /* We want "nexti" to step into, not over, signal handlers
> > invoked
> > +     by the kernel, therefore this subroutine check should not
> > trigger
> > +     for a signal handler invocation.  On most platforms, this is
> > already
> > +     not the case, as the kernel puts a signal trampoline frame
> > onto the
> > +     stack to handle proper return after the handler, and
> > therefore at this
> > +     point, the current frame is a grandchild of the step frame,
> > not a
> > +     child.  However, on some platforms, the kernel actually uses
> > a
> > +     trampoline to handle *invocation* of the handler.  In that
> > case,
> > +     when executing the first instruction of the trampoline, this
> > check
> > +     would erroneously detect the trampoline invocation as a
> > subroutine
> > +     call.  Fix this by checking for SIGTRAMP_FRAME.  */
> >    if ((get_stack_frame_id (frame)
> >         != ecs->event_thread->control.step_stack_frame_id)
> > +      && get_frame_type (frame) != SIGTRAMP_FRAME
> >        && ((frame_unwind_caller_id (get_current_frame ())
> >  	   == ecs->event_thread->control.step_stack_frame_id)
> >  	  && ((ecs->event_thread->control.step_stack_frame_id
> > diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
> > index 784dafa59db..7fb90799dff 100644
> > --- a/gdb/ppc-linux-tdep.c
> > +++ b/gdb/ppc-linux-tdep.c
> > @@ -83,6 +83,7 @@
> >  #include "features/rs6000/powerpc-isa207-vsx64l.c"
> >  #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
> >  #include "features/rs6000/powerpc-e500l.c"
> > +#include "dwarf2/frame.h"
> >  
> >  /* Shared library operations for PowerPC-Linux.  */
> >  static struct target_so_ops powerpc_so_ops;
> > @@ -2088,6 +2089,52 @@ ppc_linux_displaced_step_prepare  (gdbarch
> > *arch, thread_info *thread,
> >    return per_inferior->disp_step_buf->prepare (thread,
> > displaced_pc);
> >  }
> >  
> > +/* Convert a Dwarf 2 register number to a GDB register number for
> > Linux.  */
> > +static int
> > +rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch, int
> > num)
> > +{
> > +  ppc_gdbarch_tdep *tdep =
> > gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
> > +
> > +  if (0 <= num && num <= 31)
> > +    return tdep->ppc_gp0_regnum + num;
> > +  else if (32 <= num && num <= 63)
> > +    /* FIXME: jimb/2004-05-05: What should we do when the debug
> > info
> > +       specifies registers the architecture doesn't have?  Our
> > +       callers don't check the value we return.  */
> 
> I see this comment was just copied from else where, but isn't the
> answer
> just: return -1 ?
> 
> The comment about the 'return -1' at the trail of this function seems
> to
> suggest that would be the correct thing to do.
> 
> I guess I'm asking: do we need to add another copy of this (I think
> out
> of date) fixme?

Yes, the function is based on rs6000_dwarf2_reg_to_regnum with the
specific changes for the Linux DWARF mappings.  The comment looked out
of date to me, but I left it for consistency with the other two
functions that have the same comment. It would probably be best to
investigate the comment further and update it in all places in a
separate patch. 

Would it be acceptable to leave the comment as is for now, for
consistency sake, and I will work on a separate cleanup patch to
address removing the comment in the original two places and this
additional place?  I will want to look into it a bit more but I think
we can just remove the comment.  But I do want to verify that the
return value is never used first.

                          Carl
  
Carl Love Oct. 19, 2023, 3:54 p.m. UTC | #6
Andrew:

Just wanted to make sure you were OK with response, particularly about
creating a new patch to address the issues with the comment in the code
before I commit the two patches.

Sorry for the slow response.  My email is being migrated from 
cel@us.ibm.com to cel@linux.ibm.com.  There have been some issues with
the migration.  I don't seem to be getting email at the old address but
the new address does seem to work.  So any replies need to include the
new email address.  Thanks.

                  Carl 

On Mon, 2023-10-16 at 08:51 -0700, Carl Love wrote:
> Andrew:
> 
> Thanks for the review and comments.  See responses below.
> 
> On Mon, 2023-10-16 at 15:31 +0100, Andrew Burgess wrote:
> > Carl Love <cel@us.ibm.com> writes:
> > 
> > > GDB maintainers:
> > > 
> > > This is the first patch in the series which fixes the DWWARF
> > > register
> > > mapping and signal handling issues on PowerPC.
> > > 
> > >                   Carl
> > > 
> > > -----------------------------------------------
> > > 
> > > rs6000, Fix Linux DWARF register mapping
> > > 
> > > The PowerPC DWARF register mapping is the same for the .eh_frame
> > > and
> > > .debug_frame on Linux.  PowerPC uses different mapping for
> > > .eh_frame and
> > > .debug_frame on other operating systems.  The current GDB support
> > > for
> > > mapping the DWARF registers in rs6000_linux_dwarf2_reg_to_regnum
> > > and
> > > rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not correct
> > > for Linux.
> > > The files have some legacy mappings for spe_acc, spefscr, EV
> > > which
> > > was
> > > removed from GCC in 2017.
> > > 
> > > This patch adds a two new functions
> > > rs6000_linux_dwarf2_reg_to_regnum,
> > > and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-tdep.c
> > > to handle
> > > the DWARF register mappings on Linux.  Function
> > > rs6000_linux_dwarf2_reg_to_regnum is installed for both
> > > gdb_dwarf_to_regnum
> > > and gdbarch_stab_reg_to_regnum since the mappings are the same.
> > > 
> > > The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is
> > > updated
> > > to
> > > call set_gdbarch_dwarf2_reg_to_regnum map the new function
> > > rs6000_linux_dwarf2_reg_to_regnum for the
> > > architecture.  Similarly,
> > > dwarf2_frame_set_adjust_regnum is called to map
> > > rs6000_linux_adjust_frame_regnum into the architecture.
> > > 
> > > The second issue fixed by this patch is the check for subroutine
> > > process_event_stop_test.  Need to make sure the frame is not the
> > > SIGTRAMP_FRAME.  The sequence of events on most platforms is:
> > 
> > Usually for GDB we avoid bundling unrelated changes into a single
> > commit.  Each commit should address one self contained issue (as
> > far
> > as
> > possible).
> > 
> > I really struggling to see any connection between the two fixes you
> > have
> > here.
> 
> As mentioned in the commit log, the first patch which fixes the DWARF
> register mapping fixes two of the 5 failures seen in store.exp. 
> Specifically the patch fixes the issue of GDB not stopping in the
> handler which results in the step.exp test failing.  The second patch
> then fixes the rest of the issues with the step.exp test.
> > >   1) Some code is running when a signal arrives.
> > >   2) The kernel handles the signal and dispatches to the handler.
> > >     ...
> > > 
> > > However on PowerPC the sequence of events is:
> > > 
> > >   1) Some code is running when a signal arrives.
> > >   2) The kernel handles the signal and dispatches to the
> > > trampoline.
> > >   3) The trampoline performs a normal function call to the
> > > handler.
> > >       ...
> > > 
> > > We want "nexti" to step into, not over, signal handlers invoked
> > > by the kernel.  This is the case most platforms as the kernel
> > > puts
> > > a
> > > signal trampoline frame onto the stack to handle proper return
> > > after the
> > > handler.  However, on some platforms such as PowerPC, the kernel
> > > actually
> > > uses a trampoline to handle *invocation* of the handler.
> > > 
> > > The issue is fixed in function process_event_stop_test by adding
> > > a
> > > check
> > > that the frame is not a SIGTRAMP_FRAME to the if statement to
> > > stop
> > > in
> > > a subroutine call.  This prevents GDB from erroneously detecting
> > > the
> > > trampoline invocation as a subroutine call.
> > > 
> > > This patch fixes two regression test failures in
> > > gdb.base/store.exp.  It
> > > also fixes two regression failures in gdb.python/py-thread-
> > > exited.exp.
> > 
> > On the one random PPC box I tried this patch on, I'm not seeing any
> > failures in gdb.python/py-thread-exited.exp either before, or after
> > this
> > commit.
> > 
> > Which tests in gdb.python/py-thread-exited.exp are you seeing as
> > broken?
> > And which of the two fixes in this commit fix the problems you're
> > seeing?
> 
> The comment about the py-thread-exited.exp should have been
> removed.  I
> found that sometimes that test fails and sometime passes.  I have yet
> to dig into it and figure out why the test is inconsistent.  The
> inconsistent behavior exists with and without these patches.
> 
> I will remove the comment about the gdb.python/py-thread-exited.exp
> fixes as that is erroneous.
> 
> 
> > > Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10
> > > with no
> > > new regressions.
> > > ---
> > >  gdb/infrun.c         | 13 ++++++++++
> > >  gdb/ppc-linux-tdep.c | 56
> > > ++++++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 69 insertions(+)
> > > 
> > > diff --git a/gdb/infrun.c b/gdb/infrun.c
> > > index 4730d290442..922d8a6913d 100644
> > > --- a/gdb/infrun.c
> > > +++ b/gdb/infrun.c
> > > @@ -7334,8 +7334,21 @@ process_event_stop_test (struct
> > > execution_control_state *ecs)
> > >       initial outermost frame, before sp was valid, would
> > >       have code_addr == &_start.  See the comment in
> > > frame_id::operator==
> > >       for more.  */
> > > +
> > > +  /* We want "nexti" to step into, not over, signal handlers
> > > invoked
> > > +     by the kernel, therefore this subroutine check should not
> > > trigger
> > > +     for a signal handler invocation.  On most platforms, this
> > > is
> > > already
> > > +     not the case, as the kernel puts a signal trampoline frame
> > > onto the
> > > +     stack to handle proper return after the handler, and
> > > therefore at this
> > > +     point, the current frame is a grandchild of the step frame,
> > > not a
> > > +     child.  However, on some platforms, the kernel actually
> > > uses
> > > a
> > > +     trampoline to handle *invocation* of the handler.  In that
> > > case,
> > > +     when executing the first instruction of the trampoline,
> > > this
> > > check
> > > +     would erroneously detect the trampoline invocation as a
> > > subroutine
> > > +     call.  Fix this by checking for SIGTRAMP_FRAME.  */
> > >    if ((get_stack_frame_id (frame)
> > >         != ecs->event_thread->control.step_stack_frame_id)
> > > +      && get_frame_type (frame) != SIGTRAMP_FRAME
> > >        && ((frame_unwind_caller_id (get_current_frame ())
> > >  	   == ecs->event_thread->control.step_stack_frame_id)
> > >  	  && ((ecs->event_thread->control.step_stack_frame_id
> > > diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
> > > index 784dafa59db..7fb90799dff 100644
> > > --- a/gdb/ppc-linux-tdep.c
> > > +++ b/gdb/ppc-linux-tdep.c
> > > @@ -83,6 +83,7 @@
> > >  #include "features/rs6000/powerpc-isa207-vsx64l.c"
> > >  #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
> > >  #include "features/rs6000/powerpc-e500l.c"
> > > +#include "dwarf2/frame.h"
> > >  
> > >  /* Shared library operations for PowerPC-Linux.  */
> > >  static struct target_so_ops powerpc_so_ops;
> > > @@ -2088,6 +2089,52 @@ ppc_linux_displaced_step_prepare  (gdbarch
> > > *arch, thread_info *thread,
> > >    return per_inferior->disp_step_buf->prepare (thread,
> > > displaced_pc);
> > >  }
> > >  
> > > +/* Convert a Dwarf 2 register number to a GDB register number
> > > for
> > > Linux.  */
> > > +static int
> > > +rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch, int
> > > num)
> > > +{
> > > +  ppc_gdbarch_tdep *tdep =
> > > gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
> > > +
> > > +  if (0 <= num && num <= 31)
> > > +    return tdep->ppc_gp0_regnum + num;
> > > +  else if (32 <= num && num <= 63)
> > > +    /* FIXME: jimb/2004-05-05: What should we do when the debug
> > > info
> > > +       specifies registers the architecture doesn't have?  Our
> > > +       callers don't check the value we return.  */
> > 
> > I see this comment was just copied from else where, but isn't the
> > answer
> > just: return -1 ?
> > 
> > The comment about the 'return -1' at the trail of this function
> > seems
> > to
> > suggest that would be the correct thing to do.
> > 
> > I guess I'm asking: do we need to add another copy of this (I think
> > out
> > of date) fixme?
> 
> Yes, the function is based on rs6000_dwarf2_reg_to_regnum with the
> specific changes for the Linux DWARF mappings.  The comment looked
> out
> of date to me, but I left it for consistency with the other two
> functions that have the same comment. It would probably be best to
> investigate the comment further and update it in all places in a
> separate patch. 
> 
> Would it be acceptable to leave the comment as is for now, for
> consistency sake, and I will work on a separate cleanup patch to
> address removing the comment in the original two places and this
> additional place?  I will want to look into it a bit more but I think
> we can just remove the comment.  But I do want to verify that the
> return value is never used first.
> 
>                           Carl 
>
  
Andrew Burgess Oct. 24, 2023, 8:50 a.m. UTC | #7
Carl Love <cel@us.ibm.com> writes:

> Andrew:
>
> Thanks for the review and comments.  See responses below.

Carl,

Sorry for my slow reply.

>
> On Mon, 2023-10-16 at 15:31 +0100, Andrew Burgess wrote:
>> Carl Love <cel@us.ibm.com> writes:
>> 
>> > GDB maintainers:
>> > 
>> > This is the first patch in the series which fixes the DWWARF
>> > register
>> > mapping and signal handling issues on PowerPC.
>> > 
>> >                   Carl
>> > 
>> > -----------------------------------------------
>> > 
>> > rs6000, Fix Linux DWARF register mapping
>> > 
>> > The PowerPC DWARF register mapping is the same for the .eh_frame
>> > and
>> > .debug_frame on Linux.  PowerPC uses different mapping for
>> > .eh_frame and
>> > .debug_frame on other operating systems.  The current GDB support
>> > for
>> > mapping the DWARF registers in rs6000_linux_dwarf2_reg_to_regnum
>> > and
>> > rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not correct
>> > for Linux.
>> > The files have some legacy mappings for spe_acc, spefscr, EV which
>> > was
>> > removed from GCC in 2017.
>> > 
>> > This patch adds a two new functions
>> > rs6000_linux_dwarf2_reg_to_regnum,
>> > and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-tdep.c
>> > to handle
>> > the DWARF register mappings on Linux.  Function
>> > rs6000_linux_dwarf2_reg_to_regnum is installed for both
>> > gdb_dwarf_to_regnum
>> > and gdbarch_stab_reg_to_regnum since the mappings are the same.
>> > 
>> > The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is updated
>> > to
>> > call set_gdbarch_dwarf2_reg_to_regnum map the new function
>> > rs6000_linux_dwarf2_reg_to_regnum for the architecture.  Similarly,
>> > dwarf2_frame_set_adjust_regnum is called to map
>> > rs6000_linux_adjust_frame_regnum into the architecture.
>> > 
>> > The second issue fixed by this patch is the check for subroutine
>> > process_event_stop_test.  Need to make sure the frame is not the
>> > SIGTRAMP_FRAME.  The sequence of events on most platforms is:
>> 
>> Usually for GDB we avoid bundling unrelated changes into a single
>> commit.  Each commit should address one self contained issue (as far
>> as
>> possible).
>> 
>> I really struggling to see any connection between the two fixes you
>> have
>> here.
>
> As mentioned in the commit log, the first patch which fixes the DWARF
> register mapping fixes two of the 5 failures seen in store.exp. 
> Specifically the patch fixes the issue of GDB not stopping in the
> handler which results in the step.exp test failing.  The second patch
> then fixes the rest of the issues with the step.exp test.

I'm now even more confused.

The commit message for the first patch suggests that it contains two
separate fixes, but the commit message, even your updated commit message
for 'ver2' only mentions store.exp.

To be more specific, in patch 1/2, the changes in ppc-linux-tdep.c seem
unrelated to the changes in infrun.c.  And I'm confused about how the
changes in infrun.c can fix anything in the store.exp test script -- the
comments relating to that part of the patch specifically talk about
'nexti' and signal handlers, but I don't see either of these being used
in store.exp (we do use 'next' though, so maybe that's what you mean?)
But I'm not clear if a signal is invoked in that test at all.

In the text above you do mention step.exp, so maybe that's what the
infrun.c changes are fixing?  I'll grab a PPC machine and have a play,
but your 'ver 2' commit message has no mention of step.exp, so if that
test is important you might need to update the text.

Usually in GDB when we talk about each patch containing a single fix
we're talking about changes to GDB functionality, not fixing a test
script.  So it's not: 'Patch 1/1 fixes all of store.exp' but rather,
'Patch 1/2 fixes PPC DWARF register mapping' and 'Patch 2/2 fixes nexti
blah blah signal frames'.

>> 
>> >   1) Some code is running when a signal arrives.
>> >   2) The kernel handles the signal and dispatches to the handler.
>> >     ...
>> > 
>> > However on PowerPC the sequence of events is:
>> > 
>> >   1) Some code is running when a signal arrives.
>> >   2) The kernel handles the signal and dispatches to the
>> > trampoline.
>> >   3) The trampoline performs a normal function call to the handler.
>> >       ...
>> > 
>> > We want "nexti" to step into, not over, signal handlers invoked
>> > by the kernel.  This is the case most platforms as the kernel puts
>> > a
>> > signal trampoline frame onto the stack to handle proper return
>> > after the
>> > handler.  However, on some platforms such as PowerPC, the kernel
>> > actually
>> > uses a trampoline to handle *invocation* of the handler.
>> > 
>> > The issue is fixed in function process_event_stop_test by adding a
>> > check
>> > that the frame is not a SIGTRAMP_FRAME to the if statement to stop
>> > in
>> > a subroutine call.  This prevents GDB from erroneously detecting
>> > the
>> > trampoline invocation as a subroutine call.
>> > 
>> > This patch fixes two regression test failures in
>> > gdb.base/store.exp.  It
>> > also fixes two regression failures in gdb.python/py-thread-
>> > exited.exp.
>> 
>> On the one random PPC box I tried this patch on, I'm not seeing any
>> failures in gdb.python/py-thread-exited.exp either before, or after
>> this
>> commit.
>> 
>> Which tests in gdb.python/py-thread-exited.exp are you seeing as
>> broken?
>> And which of the two fixes in this commit fix the problems you're
>> seeing?
>
> The comment about the py-thread-exited.exp should have been removed.  I
> found that sometimes that test fails and sometime passes.  I have yet
> to dig into it and figure out why the test is inconsistent.  The
> inconsistent behavior exists with and without these patches.
>
> I will remove the comment about the gdb.python/py-thread-exited.exp
> fixes as that is erroneous.

Thanks.

>
>
>> 
>> > Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10
>> > with no
>> > new regressions.
>> > ---
>> >  gdb/infrun.c         | 13 ++++++++++
>> >  gdb/ppc-linux-tdep.c | 56
>> > ++++++++++++++++++++++++++++++++++++++++++++
>> >  2 files changed, 69 insertions(+)
>> > 
>> > diff --git a/gdb/infrun.c b/gdb/infrun.c
>> > index 4730d290442..922d8a6913d 100644
>> > --- a/gdb/infrun.c
>> > +++ b/gdb/infrun.c
>> > @@ -7334,8 +7334,21 @@ process_event_stop_test (struct
>> > execution_control_state *ecs)
>> >       initial outermost frame, before sp was valid, would
>> >       have code_addr == &_start.  See the comment in
>> > frame_id::operator==
>> >       for more.  */
>> > +
>> > +  /* We want "nexti" to step into, not over, signal handlers
>> > invoked
>> > +     by the kernel, therefore this subroutine check should not
>> > trigger
>> > +     for a signal handler invocation.  On most platforms, this is
>> > already
>> > +     not the case, as the kernel puts a signal trampoline frame
>> > onto the
>> > +     stack to handle proper return after the handler, and
>> > therefore at this
>> > +     point, the current frame is a grandchild of the step frame,
>> > not a
>> > +     child.  However, on some platforms, the kernel actually uses
>> > a
>> > +     trampoline to handle *invocation* of the handler.  In that
>> > case,
>> > +     when executing the first instruction of the trampoline, this
>> > check
>> > +     would erroneously detect the trampoline invocation as a
>> > subroutine
>> > +     call.  Fix this by checking for SIGTRAMP_FRAME.  */
>> >    if ((get_stack_frame_id (frame)
>> >         != ecs->event_thread->control.step_stack_frame_id)
>> > +      && get_frame_type (frame) != SIGTRAMP_FRAME
>> >        && ((frame_unwind_caller_id (get_current_frame ())
>> >  	   == ecs->event_thread->control.step_stack_frame_id)
>> >  	  && ((ecs->event_thread->control.step_stack_frame_id
>> > diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
>> > index 784dafa59db..7fb90799dff 100644
>> > --- a/gdb/ppc-linux-tdep.c
>> > +++ b/gdb/ppc-linux-tdep.c
>> > @@ -83,6 +83,7 @@
>> >  #include "features/rs6000/powerpc-isa207-vsx64l.c"
>> >  #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
>> >  #include "features/rs6000/powerpc-e500l.c"
>> > +#include "dwarf2/frame.h"
>> >  
>> >  /* Shared library operations for PowerPC-Linux.  */
>> >  static struct target_so_ops powerpc_so_ops;
>> > @@ -2088,6 +2089,52 @@ ppc_linux_displaced_step_prepare  (gdbarch
>> > *arch, thread_info *thread,
>> >    return per_inferior->disp_step_buf->prepare (thread,
>> > displaced_pc);
>> >  }
>> >  
>> > +/* Convert a Dwarf 2 register number to a GDB register number for
>> > Linux.  */
>> > +static int
>> > +rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch, int
>> > num)
>> > +{
>> > +  ppc_gdbarch_tdep *tdep =
>> > gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
>> > +
>> > +  if (0 <= num && num <= 31)
>> > +    return tdep->ppc_gp0_regnum + num;
>> > +  else if (32 <= num && num <= 63)
>> > +    /* FIXME: jimb/2004-05-05: What should we do when the debug
>> > info
>> > +       specifies registers the architecture doesn't have?  Our
>> > +       callers don't check the value we return.  */
>> 
>> I see this comment was just copied from else where, but isn't the
>> answer
>> just: return -1 ?
>> 
>> The comment about the 'return -1' at the trail of this function seems
>> to
>> suggest that would be the correct thing to do.
>> 
>> I guess I'm asking: do we need to add another copy of this (I think
>> out
>> of date) fixme?
>
> Yes, the function is based on rs6000_dwarf2_reg_to_regnum with the
> specific changes for the Linux DWARF mappings.  The comment looked out
> of date to me, but I left it for consistency with the other two
> functions that have the same comment. It would probably be best to
> investigate the comment further and update it in all places in a
> separate patch. 
>
> Would it be acceptable to leave the comment as is for now, for
> consistency sake, and I will work on a separate cleanup patch to
> address removing the comment in the original two places and this
> additional place?  I will want to look into it a bit more but I think
> we can just remove the comment.  But I do want to verify that the
> return value is never used first.

I took a quick look.  There are three callers: 1 in dwarf2/loc.c, and 2
in jit.c.  One of the callers in jit.c does fail to check the return
value, but this looks like a bug to me.  The other callers do check for
a -1 return value.  The comment in gdbarch-gen.h is also explicit that
the right thing to do is return -1.

Coupled with the fact that the comment, in its current location just
makes no sense I think the best idea would be to drop it from your new
function.  Cleaning up the older code in the future would be nice, but
is not required.  But I don't think we need to introduce more incorrect
comments just for consistency.

Thanks,
Andrew
  
Carl Love Oct. 24, 2023, 4:05 p.m. UTC | #8
Andrew:

On Tue, 2023-10-24 at 09:50 +0100, Andrew Burgess wrote:
> Carl Love <cel@us.ibm.com> writes:
> 
> > Andrew:
> > 
> > Thanks for the review and comments.  See responses below.
> 
> Carl,
> 
> Sorry for my slow reply.
> 
> > On Mon, 2023-10-16 at 15:31 +0100, Andrew Burgess wrote:
> > > Carl Love <cel@us.ibm.com> writes:
> > > 
> > > > GDB maintainers:
> > > > 
> > > > This is the first patch in the series which fixes the DWWARF
> > > > register
> > > > mapping and signal handling issues on PowerPC.
> > > > 
> > > >                   Carl
> > > > 
> > > > -----------------------------------------------
> > > > 
> > > > rs6000, Fix Linux DWARF register mapping
> > > > 
> > > > The PowerPC DWARF register mapping is the same for the
> > > > .eh_frame
> > > > and
> > > > .debug_frame on Linux.  PowerPC uses different mapping for
> > > > .eh_frame and
> > > > .debug_frame on other operating systems.  The current GDB
> > > > support
> > > > for
> > > > mapping the DWARF registers in
> > > > rs6000_linux_dwarf2_reg_to_regnum
> > > > and
> > > > rs6000_adjust_frame_regnum file gdb/rs6000-tdep.c is not
> > > > correct
> > > > for Linux.
> > > > The files have some legacy mappings for spe_acc, spefscr, EV
> > > > which
> > > > was
> > > > removed from GCC in 2017.
> > > > 
> > > > This patch adds a two new functions
> > > > rs6000_linux_dwarf2_reg_to_regnum,
> > > > and rs6000_linux_adjust_frame_regnum in file gdb/ppc-linux-
> > > > tdep.c
> > > > to handle
> > > > the DWARF register mappings on Linux.  Function
> > > > rs6000_linux_dwarf2_reg_to_regnum is installed for both
> > > > gdb_dwarf_to_regnum
> > > > and gdbarch_stab_reg_to_regnum since the mappings are the same.
> > > > 
> > > > The ppc_linux_init_abi function in gdb/ppc-linux-tdep.c is
> > > > updated
> > > > to
> > > > call set_gdbarch_dwarf2_reg_to_regnum map the new function
> > > > rs6000_linux_dwarf2_reg_to_regnum for the
> > > > architecture.  Similarly,
> > > > dwarf2_frame_set_adjust_regnum is called to map
> > > > rs6000_linux_adjust_frame_regnum into the architecture.
> > > > 
> > > > The second issue fixed by this patch is the check for
> > > > subroutine
> > > > process_event_stop_test.  Need to make sure the frame is not
> > > > the
> > > > SIGTRAMP_FRAME.  The sequence of events on most platforms is:
> > > 
> > > Usually for GDB we avoid bundling unrelated changes into a single
> > > commit.  Each commit should address one self contained issue (as
> > > far
> > > as
> > > possible).
> > > 
> > > I really struggling to see any connection between the two fixes
> > > you
> > > have
> > > here.
> > 
> > As mentioned in the commit log, the first patch which fixes the
> > DWARF
> > register mapping fixes two of the 5 failures seen in store.exp. 
> > Specifically the patch fixes the issue of GDB not stopping in the
> > handler which results in the step.exp test failing.  The second
> > patch
> > then fixes the rest of the issues with the step.exp test.
> 
> I'm now even more confused.
> 
> The commit message for the first patch suggests that it contains two
> separate fixes, but the commit message, even your updated commit
> message
> for 'ver2' only mentions store.exp.
> 
> To be more specific, in patch 1/2, the changes in ppc-linux-tdep.c
> seem
> unrelated to the changes in infrun.c.  And I'm confused about how the
> changes in infrun.c can fix anything in the store.exp test script --
> the
> comments relating to that part of the patch specifically talk about
> 'nexti' and signal handlers, but I don't see either of these being
> used
> in store.exp (we do use 'next' though, so maybe that's what you
> mean?)
> But I'm not clear if a signal is invoked in that test at all.
> 
> In the text above you do mention step.exp, so maybe that's what the
> infrun.c changes are fixing?  I'll grab a PPC machine and have a
> play,
> but your 'ver 2' commit message has no mention of step.exp, so if
> that
> test is important you might need to update the text.
> 

The 128-bit floating point issues in store.exp will need a Power10
system which supports the 128-bit floating point types.  A Power 9
system is not sufficient.  

OK, I think I see what is missing in the explanation. So initially, I
fixed the DWARF register mapping for the store.exp test.  This fixed a
couple of the issues with the store.exp test.  Then in the second patch
we fixed the DWARF register mapping for the 128-bit floating point
values.  Specifically, DWARF used the same register numbers for 64-bit
and 128-bit floating point values.  However, the 128-bit floating point
values are stored in the VSR register file while the 64-bit values are
stored in the floating point registers.  The function
ieee_128_float_regnum_adjust takes care of that fix.  So at this point,
the store.exp test works.  

But now I have a regression failure on gdb.base/sigstep.exp.  This test
is specifically dealing with signal handlers.  Fixing the register
mapping, in patch 1, exposed an underlying bug in the signal handler
code.  The fixes for the nexti, stepi in the signal handling fixes the
regression error.  The signal handling fix is needed as part of the
DWARF register mapping fix so we don't have any regression failures and
so these fixes got rolled into the first patch but it was not really
explained in the original patch description.  So yes, there are really
two things getting fixed in the first patch. But you don't see the
signal handler issue until after the register mapping is fixed for
Linux.

The signal handler issues should be exposed on Power 9 if we split out
the DWARF register mapping from the signal handler fixes.  I didn't
play with these individually on Power 9 as part of the patch work.  All
of the patch work was done on Power 10.

So maybe it will be best if I redo the two patch sequence into three
patches.  Patch 1, fix the mapping; Patch 2 fix the signal handling
issue exposed once the register mapping is fixed; Patch 3 fix the
handling of the 128-bit float DWARF register mapping.  This should then
make it clearer what is going on in each patch.  Thoughts?

                      Carl 

> Usually in GDB when we talk about each patch containing a single fix
> we're talking about changes to GDB functionality, not fixing a test
> script.  So it's not: 'Patch 1/1 fixes all of store.exp' but rather,
> 'Patch 1/2 fixes PPC DWARF register mapping' and 'Patch 2/2 fixes
> nexti
> blah blah signal frames'.
> 
> > > >   1) Some code is running when a signal arrives.
> > > >   2) The kernel handles the signal and dispatches to the
> > > > handler.
> > > >     ...
> > > > 
> > > > However on PowerPC the sequence of events is:
> > > > 
> > > >   1) Some code is running when a signal arrives.
> > > >   2) The kernel handles the signal and dispatches to the
> > > > trampoline.
> > > >   3) The trampoline performs a normal function call to the
> > > > handler.
> > > >       ...
> > > > 
> > > > We want "nexti" to step into, not over, signal handlers invoked
> > > > by the kernel.  This is the case most platforms as the kernel
> > > > puts
> > > > a
> > > > signal trampoline frame onto the stack to handle proper return
> > > > after the
> > > > handler.  However, on some platforms such as PowerPC, the
> > > > kernel
> > > > actually
> > > > uses a trampoline to handle *invocation* of the handler.
> > > > 
> > > > The issue is fixed in function process_event_stop_test by
> > > > adding a
> > > > check
> > > > that the frame is not a SIGTRAMP_FRAME to the if statement to
> > > > stop
> > > > in
> > > > a subroutine call.  This prevents GDB from erroneously
> > > > detecting
> > > > the
> > > > trampoline invocation as a subroutine call.
> > > > 
> > > > This patch fixes two regression test failures in
> > > > gdb.base/store.exp.  It
> > > > also fixes two regression failures in gdb.python/py-thread-
> > > > exited.exp.
> > > 
> > > On the one random PPC box I tried this patch on, I'm not seeing
> > > any
> > > failures in gdb.python/py-thread-exited.exp either before, or
> > > after
> > > this
> > > commit.
> > > 
> > > Which tests in gdb.python/py-thread-exited.exp are you seeing as
> > > broken?
> > > And which of the two fixes in this commit fix the problems you're
> > > seeing?
> > 
> > The comment about the py-thread-exited.exp should have been
> > removed.  I
> > found that sometimes that test fails and sometime passes.  I have
> > yet
> > to dig into it and figure out why the test is inconsistent.  The
> > inconsistent behavior exists with and without these patches.
> > 
> > I will remove the comment about the gdb.python/py-thread-exited.exp
> > fixes as that is erroneous.
> 
> Thanks.
> 
> > 
> > > > Patch has been tested on Power 8 LE/BE, Power 9 LE/BE, Power 10
> > > > with no
> > > > new regressions.
> > > > ---
> > > >  gdb/infrun.c         | 13 ++++++++++
> > > >  gdb/ppc-linux-tdep.c | 56
> > > > ++++++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 69 insertions(+)
> > > > 
> > > > diff --git a/gdb/infrun.c b/gdb/infrun.c
> > > > index 4730d290442..922d8a6913d 100644
> > > > --- a/gdb/infrun.c
> > > > +++ b/gdb/infrun.c
> > > > @@ -7334,8 +7334,21 @@ process_event_stop_test (struct
> > > > execution_control_state *ecs)
> > > >       initial outermost frame, before sp was valid, would
> > > >       have code_addr == &_start.  See the comment in
> > > > frame_id::operator==
> > > >       for more.  */
> > > > +
> > > > +  /* We want "nexti" to step into, not over, signal handlers
> > > > invoked
> > > > +     by the kernel, therefore this subroutine check should not
> > > > trigger
> > > > +     for a signal handler invocation.  On most platforms, this
> > > > is
> > > > already
> > > > +     not the case, as the kernel puts a signal trampoline
> > > > frame
> > > > onto the
> > > > +     stack to handle proper return after the handler, and
> > > > therefore at this
> > > > +     point, the current frame is a grandchild of the step
> > > > frame,
> > > > not a
> > > > +     child.  However, on some platforms, the kernel actually
> > > > uses
> > > > a
> > > > +     trampoline to handle *invocation* of the handler.  In
> > > > that
> > > > case,
> > > > +     when executing the first instruction of the trampoline,
> > > > this
> > > > check
> > > > +     would erroneously detect the trampoline invocation as a
> > > > subroutine
> > > > +     call.  Fix this by checking for SIGTRAMP_FRAME.  */
> > > >    if ((get_stack_frame_id (frame)
> > > >         != ecs->event_thread->control.step_stack_frame_id)
> > > > +      && get_frame_type (frame) != SIGTRAMP_FRAME
> > > >        && ((frame_unwind_caller_id (get_current_frame ())
> > > >  	   == ecs->event_thread->control.step_stack_frame_id)
> > > >  	  && ((ecs->event_thread->control.step_stack_frame_id
> > > > diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
> > > > index 784dafa59db..7fb90799dff 100644
> > > > --- a/gdb/ppc-linux-tdep.c
> > > > +++ b/gdb/ppc-linux-tdep.c
> > > > @@ -83,6 +83,7 @@
> > > >  #include "features/rs6000/powerpc-isa207-vsx64l.c"
> > > >  #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
> > > >  #include "features/rs6000/powerpc-e500l.c"
> > > > +#include "dwarf2/frame.h"
> > > >  
> > > >  /* Shared library operations for PowerPC-Linux.  */
> > > >  static struct target_so_ops powerpc_so_ops;
> > > > @@ -2088,6 +2089,52 @@
> > > > ppc_linux_displaced_step_prepare  (gdbarch
> > > > *arch, thread_info *thread,
> > > >    return per_inferior->disp_step_buf->prepare (thread,
> > > > displaced_pc);
> > > >  }
> > > >  
> > > > +/* Convert a Dwarf 2 register number to a GDB register number
> > > > for
> > > > Linux.  */
> > > > +static int
> > > > +rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch,
> > > > int
> > > > num)
> > > > +{
> > > > +  ppc_gdbarch_tdep *tdep =
> > > > gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
> > > > +
> > > > +  if (0 <= num && num <= 31)
> > > > +    return tdep->ppc_gp0_regnum + num;
> > > > +  else if (32 <= num && num <= 63)
> > > > +    /* FIXME: jimb/2004-05-05: What should we do when the
> > > > debug
> > > > info
> > > > +       specifies registers the architecture doesn't have?  Our
> > > > +       callers don't check the value we return.  */
> > > 
> > > I see this comment was just copied from else where, but isn't the
> > > answer
> > > just: return -1 ?
> > > 
> > > The comment about the 'return -1' at the trail of this function
> > > seems
> > > to
> > > suggest that would be the correct thing to do.
> > > 
> > > I guess I'm asking: do we need to add another copy of this (I
> > > think
> > > out
> > > of date) fixme?
> > 
> > Yes, the function is based on rs6000_dwarf2_reg_to_regnum with the
> > specific changes for the Linux DWARF mappings.  The comment looked
> > out
> > of date to me, but I left it for consistency with the other two
> > functions that have the same comment. It would probably be best to
> > investigate the comment further and update it in all places in a
> > separate patch. 
> > 
> > Would it be acceptable to leave the comment as is for now, for
> > consistency sake, and I will work on a separate cleanup patch to
> > address removing the comment in the original two places and this
> > additional place?  I will want to look into it a bit more but I
> > think
> > we can just remove the comment.  But I do want to verify that the
> > return value is never used first.
> 
> I took a quick look.  There are three callers: 1 in dwarf2/loc.c, and
> 2
> in jit.c.  One of the callers in jit.c does fail to check the return
> value, but this looks like a bug to me.  The other callers do check
> for
> a -1 return value.  The comment in gdbarch-gen.h is also explicit
> that
> the right thing to do is return -1.
> 
> Coupled with the fact that the comment, in its current location just
> makes no sense I think the best idea would be to drop it from your
> new
> function.  Cleaning up the older code in the future would be nice,
> but
> is not required.  But I don't think we need to introduce more
> incorrect
> comments just for consistency.
> 
> Thanks,
> Andrew
>
  

Patch

diff --git a/gdb/infrun.c b/gdb/infrun.c
index 4730d290442..922d8a6913d 100644
--- a/gdb/infrun.c
+++ b/gdb/infrun.c
@@ -7334,8 +7334,21 @@  process_event_stop_test (struct execution_control_state *ecs)
      initial outermost frame, before sp was valid, would
      have code_addr == &_start.  See the comment in frame_id::operator==
      for more.  */
+
+  /* We want "nexti" to step into, not over, signal handlers invoked
+     by the kernel, therefore this subroutine check should not trigger
+     for a signal handler invocation.  On most platforms, this is already
+     not the case, as the kernel puts a signal trampoline frame onto the
+     stack to handle proper return after the handler, and therefore at this
+     point, the current frame is a grandchild of the step frame, not a
+     child.  However, on some platforms, the kernel actually uses a
+     trampoline to handle *invocation* of the handler.  In that case,
+     when executing the first instruction of the trampoline, this check
+     would erroneously detect the trampoline invocation as a subroutine
+     call.  Fix this by checking for SIGTRAMP_FRAME.  */
   if ((get_stack_frame_id (frame)
        != ecs->event_thread->control.step_stack_frame_id)
+      && get_frame_type (frame) != SIGTRAMP_FRAME
       && ((frame_unwind_caller_id (get_current_frame ())
 	   == ecs->event_thread->control.step_stack_frame_id)
 	  && ((ecs->event_thread->control.step_stack_frame_id
diff --git a/gdb/ppc-linux-tdep.c b/gdb/ppc-linux-tdep.c
index 784dafa59db..7fb90799dff 100644
--- a/gdb/ppc-linux-tdep.c
+++ b/gdb/ppc-linux-tdep.c
@@ -83,6 +83,7 @@ 
 #include "features/rs6000/powerpc-isa207-vsx64l.c"
 #include "features/rs6000/powerpc-isa207-htm-vsx64l.c"
 #include "features/rs6000/powerpc-e500l.c"
+#include "dwarf2/frame.h"
 
 /* Shared library operations for PowerPC-Linux.  */
 static struct target_so_ops powerpc_so_ops;
@@ -2088,6 +2089,52 @@  ppc_linux_displaced_step_prepare  (gdbarch *arch, thread_info *thread,
   return per_inferior->disp_step_buf->prepare (thread, displaced_pc);
 }
 
+/* Convert a Dwarf 2 register number to a GDB register number for Linux.  */
+static int
+rs6000_linux_dwarf2_reg_to_regnum (struct gdbarch *gdbarch, int num)
+{
+  ppc_gdbarch_tdep *tdep = gdbarch_tdep<ppc_gdbarch_tdep>(gdbarch);
+
+  if (0 <= num && num <= 31)
+    return tdep->ppc_gp0_regnum + num;
+  else if (32 <= num && num <= 63)
+    /* FIXME: jimb/2004-05-05: What should we do when the debug info
+       specifies registers the architecture doesn't have?  Our
+       callers don't check the value we return.  */
+    return tdep->ppc_fp0_regnum + (num - 32);
+  else if (77 <= num && num < 77 + 32)
+    return tdep->ppc_vr0_regnum + (num - 77);
+  else
+    switch (num)
+      {
+      case 65:
+	return tdep->ppc_lr_regnum;
+      case 66:
+	return tdep->ppc_ctr_regnum;
+      case 76:
+	return tdep->ppc_xer_regnum;
+      case 109:
+	return tdep->ppc_vrsave_regnum;
+      case 110:
+	return tdep->ppc_vrsave_regnum - 1; /* vscr */
+      }
+
+  /* Unknown DWARF register number.  */
+  return -1;
+}
+
+/* Translate a .eh_frame register to DWARF register, or adjust a
+   .debug_frame register.  */
+
+
+static int
+rs6000_linux_adjust_frame_regnum (struct gdbarch *gdbarch, int num,
+				  int eh_frame_p)
+{
+  /* Linux uses the same numbering for .debug_frame numbering as .eh_frame.  */
+  return num;
+}
+
 static void
 ppc_linux_init_abi (struct gdbarch_info info,
 		    struct gdbarch *gdbarch)
@@ -2135,6 +2182,15 @@  ppc_linux_init_abi (struct gdbarch_info info,
   set_gdbarch_stap_is_single_operand (gdbarch, ppc_stap_is_single_operand);
   set_gdbarch_stap_parse_special_token (gdbarch,
 					ppc_stap_parse_special_token);
+  /* Linux DWARF register mapping is different from the othe OS's.  */
+  set_gdbarch_dwarf2_reg_to_regnum (gdbarch,
+				    rs6000_linux_dwarf2_reg_to_regnum);
+  /* Note on Linux the mapping for the DWARF registers and the stab registers
+     use the same numbers.  Install rs6000_linux_dwarf2_reg_to_regnum for the
+     stab register mappings as well.  */
+  set_gdbarch_stab_reg_to_regnum (gdbarch,
+				    rs6000_linux_dwarf2_reg_to_regnum);
+  dwarf2_frame_set_adjust_regnum (gdbarch, rs6000_linux_adjust_frame_regnum);
 
   if (tdep->wordsize == 4)
     {