[2/2] Fix AIX core file handling: prevent crashes during GDB quit

Message ID 20260324095330.46665-2-akamath996@gmail.com
State New
Series [1/2] Fix assertion failure while analysing core files in AIX with terminated threads.

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Build passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Build passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Test passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Test passed

Commit Message

Aditya Vidyadhar Kamath March 24, 2026, 9:53 a.m. UTC
  From: Aditya Vidyadhar Kamath <aditya.kamath1@ibm.com>

This patch fixes two root causes of GDB crashing on AIX when
quitting the debugger after analysing an AIX core file.

They are:
1: gdb/aix-thread.c (pd_disable): When debugging core files,
	all threads are terminated and the pthdb session data is invalid.
	Calling pthdb_session_destroy() on this invalid session causes
	undefined behaviour.  So destroy the session only when the
	inferior is a running process, and skip it for core files.

2: bfd/coffgen.c (_bfd_coff_free_cached_info): Exclude core files.
	AIX core files use the XCOFF format, whose tdata structure
	(core_dumpxx) differs from that of COFF objects (coff_tdata).  The
	function was incorrectly treating core file tdata as COFF data
	and attempting to free hash tables that don't exist in the core
	dump structure, resulting in segmentation faults.  So restrict
	the cleanup to the bfd_object format only.
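
The second point can be illustrated with a minimal, self-contained
sketch.  It is not BFD code: every name prefixed with sketch_ or
SKETCH_ is a hypothetical stand-in, and the two structures only mimic
the idea that one tdata slot holds differently shaped data depending
on the file format.  The cleanup routine checks the format before it
interprets the slot as object-file data.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real BFD declarations.  */
enum sketch_format { SKETCH_OBJECT, SKETCH_CORE };

/* Per-file data kept for object files: owns a heap allocation.  */
struct sketch_coff_tdata { void *section_by_index; };

/* Per-file data kept for core files: a different layout with no
   heap-owned members.  */
struct sketch_core_tdata { long signal_number; };

struct sketch_bfd
{
  enum sketch_format format;
  union
  {
    struct sketch_coff_tdata *coff;
    struct sketch_core_tdata *core;
  } tdata;
};

/* Release cached object-file data.  Returns 1 if the cache was
   released and 0 if ABFD was skipped.  The format check keeps core
   file data from being read through the object-file layout.  */
int
sketch_free_cached_info (struct sketch_bfd *abfd)
{
  assert (abfd != NULL);

  if (abfd->format != SKETCH_OBJECT || abfd->tdata.coff == NULL)
    return 0;

  free (abfd->tdata.coff->section_by_index);
  abfd->tdata.coff->section_by_index = NULL;
  return 1;
}
```

Without the format check, a core file's signal_number would be read
as a pointer and passed to free(), which is the kind of type
confusion described above.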

This problem shows up when quitting GDB after debugging a large core
file on AIX 7.3.

Ex:
Program terminated with signal SIGSEGV, Segmentation fault.
   from /opt/freeware/lib/libpython3.9.a(libpython3.9.so)
(gdb)q

Fatal signal: Segmentation fault
----- Backtrace -----
0x1009fbffb ???
0x1009fc11f ???
0x1005c3587 ???
0x1005c3833 ???
0x4fdf ???
---
 bfd/coffgen.c    | 3 +--
 gdb/aix-thread.c | 7 ++++++-
 2 files changed, 7 insertions(+), 3 deletions(-)
  

Comments

Ulrich Weigand March 24, 2026, 12:17 p.m. UTC | #1
Aditya Vidyadhar Kamath <akamath996@gmail.com> wrote:

>1: gdb/aix-thread.c: (pd_disable function) When debugging core files,
>	all threads are terminated and the pthdb session data is invalid.
>	Calling pthdb_session_destroy() on this invalid session causes
>	undefined behaviour. So check if it is a running process and then
>	only clear the session data. For core files do not do it.

I do not understand this.  Why is the session invalid?  We created
that session by calling pthdb_session_init in pd_activate, and that
call must have returned PTHDB_SUCCESS.  Why would the resulting
session be invalid in a way that pthdb_session_destroy crashes?
Not calling this is simply a memory leak here ...

>2: bfd/coffgen.c: (_bfd_coff_free_cached_info): Exclude core files
>	AIX core files use the XCOFF format which have different tdata
>	structure (core_dumpxx) than COFF objects (coff_tdata). The
>	function was incorrectly treating core file tdata as COFF data
>	and attempting to free hash tables that don't exist in the core
>	dump structure, resulting in segment faults.  So restrict to
>	bfd_object format only.

Doesn't AIX use XCOFF for everything, objects and core files?  Why
would this COFF-specific routine be invoked at all on AIX?  Conversely,
for actual COFF platforms, don't we need to do this cleanup also for
core files?

In any case, this part of the patch touches binutils and needs to
be reviewed on the binutils mailing list.

Bye,
Ulrich
  

Patch

diff --git a/bfd/coffgen.c b/bfd/coffgen.c
index 030dbc1dc79..98d6f98d575 100644
--- a/bfd/coffgen.c
+++ b/bfd/coffgen.c
@@ -3310,8 +3310,7 @@  _bfd_coff_free_cached_info (bfd *abfd)
   struct coff_tdata *tdata;
 
   if (bfd_family_coff (abfd)
-      && (bfd_get_format (abfd) == bfd_object
-	  || bfd_get_format (abfd) == bfd_core)
+      && bfd_get_format (abfd) == bfd_object
       && (tdata = coff_data (abfd)) != NULL)
     {
       if (tdata->section_by_index)
diff --git a/gdb/aix-thread.c b/gdb/aix-thread.c
index e891f510e08..e80e4db0d79 100644
--- a/gdb/aix-thread.c
+++ b/gdb/aix-thread.c
@@ -1003,7 +1003,12 @@  pd_disable (inferior *inf)
     return;
   if (!data->pd_active)
     return;
-  pthdb_session_destroy (data->pd_session);
+  /* For core files, all threads are terminated. Calling
+     pthdb_session_destroy (data->pd_session) on it is incorrect
+     and will cause undefined behaviour.  So skip it since we
+     are cleaning.  For binaries clean up.  */
+  if (target_has_execution ())
+    pthdb_session_destroy (data->pd_session);
 
   pid_to_prc (&inferior_ptid);
   data->pd_active = 0;