[v2] Fix sporadic XFAILs in gdb.threads/attach-many-short-lived-threads.exp

Message ID AS8P193MB1285AC3E42F9767F851AE50BE4022@AS8P193MB1285.EURP193.PROD.OUTLOOK.COM
State New
Series [v2] Fix sporadic XFAILs in gdb.threads/attach-many-short-lived-threads.exp

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed

Commit Message

Bernd Edlinger April 6, 2024, 4:46 a.m. UTC
This addresses sporadic test failures like these:

XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 6: attach (EPERM)
XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 7: attach (EPERM)
XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (EPERM)
XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 9: attach (EPERM)
XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 10: attach (EPERM)

The reason for this effect is apparently as follows:

There is a race condition when gdb tries to attach to a thread while the
thread is exiting.  Normally when that happens ptrace(PTRACE_ATTACH, x)
fails with errno set to EPERM, which can also have other causes.  To
detect the true cause, we try to open /proc/<pid>/status, which normally
fails in that situation.  However, the fopen may succeed and the thread
may disappear while the content is being read, in which case the read
fails with errno=ESRCH.  Use that as an indication that the thread has
exited between the fopen and the reading of the status file.
---
 gdb/nat/linux-procfs.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

v2: From reviewing the kernel code, it seems a missing "State:"
 line can only happen if the thread has disappeared, so there is
 no need to look at errno at all here.
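
For readers who want to see the decision in isolation, here is a minimal
standalone sketch of the v2 logic.  It is not the gdb implementation: the
sketch_* names are hypothetical, and the state is reduced to the single
letter that follows "State:" in /proc/<pid>/status ('Z' for zombie, 'X'
for dead).  It only assumes that a failed open and a missing "State:"
line should both be treated as "thread is gone".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: scan an already-open status stream for the
   "State:" line.  Returns 1 and stores the state letter if found,
   0 if the stream ends without a usable "State:" line.  */
int
sketch_read_state (FILE *f, char *state_out)
{
  char line[256];

  assert (f != NULL && state_out != NULL);
  while (fgets (line, sizeof line, f) != NULL)
    {
      if (strncmp (line, "State:", 6) == 0)
        {
          const char *p = line + 6;

          /* Skip the blanks between "State:" and the state letter.  */
          while (*p == ' ' || *p == '\t')
            p++;
          if (*p == '\0' || *p == '\n')
            return 0;
          *state_out = *p;
          return 1;
        }
    }
  return 0;
}

/* Hypothetical helper: returns -1 if the status file cannot be opened,
   0 if it has no "State:" line, 1 if a state letter was read.  */
int
sketch_pid_get_state (pid_t pid, char *state_out)
{
  char path[64];
  FILE *f;
  int found;

  snprintf (path, sizeof path, "/proc/%ld/status", (long) pid);
  f = fopen (path, "r");
  if (f == NULL)
    return -1;
  found = sketch_read_state (f, state_out);
  fclose (f);
  return found;
}

/* Decision as in v2: open failure and missing "State:" line are both
   taken to mean the thread has disappeared.  */
int
sketch_pid_is_gone (pid_t pid)
{
  char state = 0;
  int have_state = sketch_pid_get_state (pid, &state);

  if (have_state <= 0)
    return 1;
  return state == 'Z' || state == 'X';
}
```

A caller would use sketch_pid_is_gone (tid) after a failed attach to tell
"the thread exited under us" apart from a real permission problem.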
  

Comments

Andrew Burgess April 6, 2024, 5:40 p.m. UTC | #1
Bernd Edlinger <bernd.edlinger@hotmail.de> writes:

Thanks for looking into these failures.

> This addresses sporadic test failures like these:
>
> XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 6: attach (EPERM)
> XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 7: attach (EPERM)
> XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 8: attach (EPERM)
> XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 9: attach (EPERM)
> XFAIL: gdb.threads/attach-many-short-lived-threads.exp: iter 10: attach (EPERM)
>
> The reason for this effect is apparently as follows:
>
> There is a race condition when gdb tries to attach to a thread while the
> thread is exiting.  Normally when that happens ptrace(PTRACE_ATTACH, x)
> fails with errno set to EPERM, which can also have other causes.  To
> detect the true cause, we try to open /proc/<pid>/status, which normally
> fails in that situation.  However, the fopen may succeed and the thread
> may disappear while the content is being read, in which case the read
> fails with errno=ESRCH.  Use that as an indication that the thread has
> exited between the fopen and the reading of the status file.

I assume this would be the commit message for this change?  This text
seems to be out of date given the changes in V2 and probably needs
updating.

Thanks,
Andrew



> ---
>  gdb/nat/linux-procfs.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> v2: From reviewing the kernel code, it seems a missing "State:"
>  line can only happen if the thread has disappeared, so there is
>  no need to look at errno at all here.
>
> diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
> index e2086952ce6..8d46d5bf289 100644
> --- a/gdb/nat/linux-procfs.c
> +++ b/gdb/nat/linux-procfs.c
> @@ -157,17 +157,12 @@ linux_proc_pid_is_gone (pid_t pid)
>    enum proc_state state;
>  
>    have_state = linux_proc_pid_get_state (pid, 0, &state);
> -  if (have_state < 0)
> +  if (have_state <= 0)
>      {
> -      /* If we can't open the status file, assume the thread has
> -	 disappeared.  */
> +      /* If we can't open the status file or there is no "State:" line,
> +	 assume the thread has disappeared.  */
>        return 1;
>      }
> -  else if (have_state == 0)
> -    {
> -      /* No "State:" line, assume thread is alive.  */
> -      return 0;
> -    }
>    else
>      return (state == PROC_STATE_ZOMBIE || state == PROC_STATE_DEAD);
>  }
> -- 
> 2.39.2
  

Patch

diff --git a/gdb/nat/linux-procfs.c b/gdb/nat/linux-procfs.c
index e2086952ce6..8d46d5bf289 100644
--- a/gdb/nat/linux-procfs.c
+++ b/gdb/nat/linux-procfs.c
@@ -157,17 +157,12 @@  linux_proc_pid_is_gone (pid_t pid)
   enum proc_state state;
 
   have_state = linux_proc_pid_get_state (pid, 0, &state);
-  if (have_state < 0)
+  if (have_state <= 0)
     {
-      /* If we can't open the status file, assume the thread has
-	 disappeared.  */
+      /* If we can't open the status file or there is no "State:" line,
+	 assume the thread has disappeared.  */
       return 1;
     }
-  else if (have_state == 0)
-    {
-      /* No "State:" line, assume thread is alive.  */
-      return 0;
-    }
   else
     return (state == PROC_STATE_ZOMBIE || state == PROC_STATE_DEAD);
 }