[master,+,gdb-13] Fix Solaris regression (PR tdep/30252)

Message ID 20230324132215.1070558-1-pedro@palves.net
State New
Headers
Series [master,+,gdb-13] Fix Solaris regression (PR tdep/30252) |

Commit Message

Pedro Alves March 24, 2023, 1:22 p.m. UTC
  PR tdep/30252 reports that using GDB on Solaris fails an assertion in
target_resume:

 target.c:2648: internal-error: target_resume: Assertion `inferior_ptid != null_ptid' failed.
 A problem internal to GDB has been detected,
 further debugging may prove unreliable.
 Quit this debugging session? (y or n)

The backtrace, after running it through c++filt, looks like:

 ----- Backtrace -----
 0xa18914 gdb_internal_backtrace_1
	 /root/binutils-gdb/gdb/bt-utils.c:122
 0xa18914 gdb_internal_backtrace()
	 /root/binutils-gdb/gdb/bt-utils.c:168
 0xdec834 internal_vproblem
	 /root/binutils-gdb/gdb/utils.c:401
 0xdecad8 internal_verror(char const*, int, char const*, __va_list_tag*)
	 /root/binutils-gdb/gdb/utils.c:481
 0xf3638c internal_error_loc(char const*, int, char const*, ...)
	 /root/binutils-gdb/gdbsupport/errors.cc:58
 0xd70580 target_resume(ptid_t, int, gdb_signal)
	 /root/binutils-gdb/gdb/target.c:2648
 0xc59e85 procfs_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>)
	 /root/binutils-gdb/gdb/procfs.c:2187
 0xcf6da7 sol_thread_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>)
	 /root/binutils-gdb/gdb/sol-thread.c:442
 0xd73711 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>)
	 /root/binutils-gdb/gdb/target.c:2586
 ...

The problem is that the procfs backend, while inside target_wait,
called target_resume without switching to the leader thread of that
resumption.

The target_resume interface is:

 /* Resume execution (or prepare for execution) of the current thread
    (INFERIOR_PTID), while optionally letting other threads of the
    current process or all processes run free.
    ...

Thus calling target_resume with inferior_ptid == null_ptid is bogus.

target_wait (which leads to procfs_target::wait on Solaris) is called
with inferior_ptid == null_ptid on entry exactly to help catch such
bogus uses.

From the backtrace, it seems that the relevant line in question is
procfs.c:2187:

2186  /* How to keep going without returning to wfi: */
2187  target_continue_no_signal (ptid);
2188  goto wait_again;

target_continue_no_signal is a small wrapper around target_resume,
which would make sense.

The fix is to not call target_resume or go via the target stack at
all.  Instead, factor out a new proc_resume function out of
procfs_target::resume, and call that.  The new function does not rely
on inferior_ptid.

I've not been able to test it myself, but Petr confirmed it fixes the
assertion failure with his test case.

Tested-By: Petr Šumbera <petr.sumbera@oracle.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30252
Change-Id: I9657f9a96f2bc7fded72b8a771e592b6b91efea3
---
 gdb/procfs.c | 46 +++++++++++++++++++++++++---------------------
 1 file changed, 25 insertions(+), 21 deletions(-)


base-commit: 80251d4185048c6391b74abe96c754e8a536b35f
  

Comments

Pedro Alves July 6, 2023, 2:03 p.m. UTC | #1
Hi!

I almost forgot about this.

There's been another confirmation in Bugzilla that this fixes the issue, so I'm going
ahead and merging it.

Pedro Alves

On 2023-03-24 13:22, Pedro Alves wrote:
> PR tdep/30252 reports that using GDB on Solaris fails an assertion in
> target_resume:
> 
>  target.c:2648: internal-error: target_resume: Assertion `inferior_ptid != null_ptid' failed.
>  A problem internal to GDB has been detected,
>  further debugging may prove unreliable.
>  Quit this debugging session? (y or n)
> 
> The backtrace, after running it through c++filt, looks like:
> 
>  ----- Backtrace -----
>  0xa18914 gdb_internal_backtrace_1
> 	 /root/binutils-gdb/gdb/bt-utils.c:122
>  0xa18914 gdb_internal_backtrace()
> 	 /root/binutils-gdb/gdb/bt-utils.c:168
>  0xdec834 internal_vproblem
> 	 /root/binutils-gdb/gdb/utils.c:401
>  0xdecad8 internal_verror(char const*, int, char const*, __va_list_tag*)
> 	 /root/binutils-gdb/gdb/utils.c:481
>  0xf3638c internal_error_loc(char const*, int, char const*, ...)
> 	 /root/binutils-gdb/gdbsupport/errors.cc:58
>  0xd70580 target_resume(ptid_t, int, gdb_signal)
> 	 /root/binutils-gdb/gdb/target.c:2648
>  0xc59e85 procfs_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>)
> 	 /root/binutils-gdb/gdb/procfs.c:2187
>  0xcf6da7 sol_thread_target::wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>)
> 	 /root/binutils-gdb/gdb/sol-thread.c:442
>  0xd73711 target_wait(ptid_t, target_waitstatus*, enum_flags<target_wait_flag>)
> 	 /root/binutils-gdb/gdb/target.c:2586
>  ...
> 
> The problem is that the procfs backend, while inside target_wait,
> called target_resume without switching to the leader thread of that
> resumption.
> 
> The target_resume interface is:
> 
>  /* Resume execution (or prepare for execution) of the current thread
>     (INFERIOR_PTID), while optionally letting other threads of the
>     current process or all processes run free.
>     ...
> 
> Thus calling target_resume with inferior_ptid == null_ptid is bogus.
> 
> target_wait (which leads to procfs_target::wait on Solaris) is called
> with inferior_ptid == null_ptid on entry exactly to help catch such
> bogus uses.
> 
> From the backtrace, it seems that the relevant line in question is
> procfs.c:2187:
> 
> 2186  /* How to keep going without returning to wfi: */
> 2187  target_continue_no_signal (ptid);
> 2188  goto wait_again;
> 
> target_continue_no_signal is a small wrapper around target_resume,
> which would make sense.
> 
> The fix is to not call target_resume or go via the target stack at
> all.  Instead, factor out a new proc_resume function out of
> procfs_target::resume, and call that.  The new function does not rely
> on inferior_ptid.
> 
> I've not been able to test it myself, but Petr confirmed it fixes the
> assertion failure with his test case.
> 
> Tested-By: Petr Šumbera <petr.sumbera@oracle.com>
> Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30252
> Change-Id: I9657f9a96f2bc7fded72b8a771e592b6b91efea3
> ---
>  gdb/procfs.c | 46 +++++++++++++++++++++++++---------------------
>  1 file changed, 25 insertions(+), 21 deletions(-)
> 
> diff --git a/gdb/procfs.c b/gdb/procfs.c
> index 741e62a2402..75f3a182754 100644
> --- a/gdb/procfs.c
> +++ b/gdb/procfs.c
> @@ -589,6 +589,8 @@ static int proc_iterate_over_threads
>    (procinfo *pi,
>     int (*func) (procinfo *, procinfo *, void *),
>     void *ptr);
> +static void proc_resume (procinfo *pi, ptid_t scope_ptid,
> +			 int step, enum gdb_signal signo);
>  
>  static void
>  proc_warn (procinfo *pi, const char *func, int line)
> @@ -2119,7 +2121,7 @@ procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
>  		      gdb_printf (_("[%s exited]\n"),
>  				  target_pid_to_str (retval).c_str ());
>  		    delete_thread (find_thread_ptid (this, retval));
> -		    target_continue_no_signal (ptid);
> +		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
>  		    goto wait_again;
>  		  }
>  		else if (what == SYS_exit)
> @@ -2183,8 +2185,7 @@ procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
>  				      i, sysargs[i]);
>  		      }
>  
> -		    /* How to keep going without returning to wfi: */
> -		    target_continue_no_signal (ptid);
> +		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
>  		    goto wait_again;
>  		  }
>  		break;
> @@ -2217,7 +2218,7 @@ procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
>  		    if (!in_thread_list (this, temp_ptid))
>  		      add_thread (this, temp_ptid);
>  
> -		    target_continue_no_signal (ptid);
> +		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
>  		    goto wait_again;
>  		  }
>  		else if (what == SYS_lwp_exit)
> @@ -2249,7 +2250,7 @@ procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
>  				      i, sysargs[i]);
>  		      }
>  
> -		    target_continue_no_signal (ptid);
> +		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
>  		    goto wait_again;
>  		  }
>  		break;
> @@ -2428,20 +2429,16 @@ invalidate_cache (procinfo *parent, procinfo *pi, void *ptr)
>    return 0;
>  }
>  
> -/* Make the child process runnable.  Normally we will then call
> -   procfs_wait and wait for it to stop again (unless gdb is async).
> +/* Make child process PI runnable.
>  
>     If STEP is true, then arrange for the child to stop again after
> -   executing a single instruction.  If SIGNO is zero, then cancel any
> -   pending signal; if non-zero, then arrange for the indicated signal
> -   to be delivered to the child when it runs.  If PID is -1, then
> -   allow any child thread to run; if non-zero, then allow only the
> -   indicated thread to run.  (not implemented yet).  */
> +   executing a single instruction.  SCOPE_PTID, STEP and SIGNO are
> +   like in the target_resume interface.  */
>  
> -void
> -procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
> +static void
> +proc_resume (procinfo *pi, ptid_t scope_ptid, int step, enum gdb_signal signo)
>  {
> -  procinfo *pi, *thread;
> +  procinfo *thread;
>    int native_signo;
>  
>    /* FIXME: Check/reword.  */
> @@ -2453,10 +2450,6 @@ procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
>       So basically PR_STEP is the sole argument that must be passed
>       to proc_run_process.  */
>  
> -  /* Find procinfo for main process.  */
> -  pi = find_procinfo_or_die (inferior_ptid.pid (), 0);
> -
> -  /* First cut: ignore pid argument.  */
>    errno = 0;
>  
>    /* Convert signal to host numbering.  */
> @@ -2473,11 +2466,11 @@ procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
>    /* Void the process procinfo's caches.  */
>    invalidate_cache (NULL, pi, NULL);
>  
> -  if (ptid.pid () != -1)
> +  if (scope_ptid.pid () != -1)
>      {
>        /* Resume a specific thread, presumably suppressing the
>  	 others.  */
> -      thread = find_procinfo (ptid.pid (), ptid.lwp ());
> +      thread = find_procinfo (scope_ptid.pid (), scope_ptid.lwp ());
>        if (thread != NULL)
>  	{
>  	  if (thread->tid != 0)
> @@ -2502,6 +2495,17 @@ procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
>      }
>  }
>  
> +/* Implementation of target_ops::resume.  */
> +
> +void
> +procfs_target::resume (ptid_t scope_ptid, int step, enum gdb_signal signo)
> +{
> +  /* Find procinfo for main process.  */
> +  procinfo *pi = find_procinfo_or_die (inferior_ptid.pid (), 0);
> +
> +  proc_resume (pi, scope_ptid, step, signo);
> +}
> +
>  /* Set up to trace signals in the child process.  */
>  
>  void
> 
> base-commit: 80251d4185048c6391b74abe96c754e8a536b35f
  

Patch

diff --git a/gdb/procfs.c b/gdb/procfs.c
index 741e62a2402..75f3a182754 100644
--- a/gdb/procfs.c
+++ b/gdb/procfs.c
@@ -589,6 +589,8 @@  static int proc_iterate_over_threads
   (procinfo *pi,
    int (*func) (procinfo *, procinfo *, void *),
    void *ptr);
+static void proc_resume (procinfo *pi, ptid_t scope_ptid,
+			 int step, enum gdb_signal signo);
 
 static void
 proc_warn (procinfo *pi, const char *func, int line)
@@ -2119,7 +2121,7 @@  procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
 		      gdb_printf (_("[%s exited]\n"),
 				  target_pid_to_str (retval).c_str ());
 		    delete_thread (find_thread_ptid (this, retval));
-		    target_continue_no_signal (ptid);
+		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
 		    goto wait_again;
 		  }
 		else if (what == SYS_exit)
@@ -2183,8 +2185,7 @@  procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
 				      i, sysargs[i]);
 		      }
 
-		    /* How to keep going without returning to wfi: */
-		    target_continue_no_signal (ptid);
+		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
 		    goto wait_again;
 		  }
 		break;
@@ -2217,7 +2218,7 @@  procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
 		    if (!in_thread_list (this, temp_ptid))
 		      add_thread (this, temp_ptid);
 
-		    target_continue_no_signal (ptid);
+		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
 		    goto wait_again;
 		  }
 		else if (what == SYS_lwp_exit)
@@ -2249,7 +2250,7 @@  procfs_target::wait (ptid_t ptid, struct target_waitstatus *status,
 				      i, sysargs[i]);
 		      }
 
-		    target_continue_no_signal (ptid);
+		    proc_resume (pi, ptid, 0, GDB_SIGNAL_0);
 		    goto wait_again;
 		  }
 		break;
@@ -2428,20 +2429,16 @@  invalidate_cache (procinfo *parent, procinfo *pi, void *ptr)
   return 0;
 }
 
-/* Make the child process runnable.  Normally we will then call
-   procfs_wait and wait for it to stop again (unless gdb is async).
+/* Make child process PI runnable.
 
    If STEP is true, then arrange for the child to stop again after
-   executing a single instruction.  If SIGNO is zero, then cancel any
-   pending signal; if non-zero, then arrange for the indicated signal
-   to be delivered to the child when it runs.  If PID is -1, then
-   allow any child thread to run; if non-zero, then allow only the
-   indicated thread to run.  (not implemented yet).  */
+   executing a single instruction.  SCOPE_PTID, STEP and SIGNO are
+   like in the target_resume interface.  */
 
-void
-procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
+static void
+proc_resume (procinfo *pi, ptid_t scope_ptid, int step, enum gdb_signal signo)
 {
-  procinfo *pi, *thread;
+  procinfo *thread;
   int native_signo;
 
   /* FIXME: Check/reword.  */
@@ -2453,10 +2450,6 @@  procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
      So basically PR_STEP is the sole argument that must be passed
      to proc_run_process.  */
 
-  /* Find procinfo for main process.  */
-  pi = find_procinfo_or_die (inferior_ptid.pid (), 0);
-
-  /* First cut: ignore pid argument.  */
   errno = 0;
 
   /* Convert signal to host numbering.  */
@@ -2473,11 +2466,11 @@  procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
   /* Void the process procinfo's caches.  */
   invalidate_cache (NULL, pi, NULL);
 
-  if (ptid.pid () != -1)
+  if (scope_ptid.pid () != -1)
     {
       /* Resume a specific thread, presumably suppressing the
 	 others.  */
-      thread = find_procinfo (ptid.pid (), ptid.lwp ());
+      thread = find_procinfo (scope_ptid.pid (), scope_ptid.lwp ());
       if (thread != NULL)
 	{
 	  if (thread->tid != 0)
@@ -2502,6 +2495,17 @@  procfs_target::resume (ptid_t ptid, int step, enum gdb_signal signo)
     }
 }
 
+/* Implementation of target_ops::resume.  */
+
+void
+procfs_target::resume (ptid_t scope_ptid, int step, enum gdb_signal signo)
+{
+  /* Find procinfo for main process.  */
+  procinfo *pi = find_procinfo_or_die (inferior_ptid.pid (), 0);
+
+  proc_resume (pi, scope_ptid, step, signo);
+}
+
 /* Set up to trace signals in the child process.  */
 
 void