[1/1,V5] gdb : Signal to pstack/gdb kills the attached process.
Checks
Context |
Check |
Description |
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gdb_build--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-arm |
success
|
Testing passed
|
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 |
success
|
Testing passed
|
Commit Message
From: Partha Sarathi Satapathy <partha.satapathy@oracle.com>
Problem: While gdb is attaching an inferior, if ctrl-c is pressed in the
middle of the process attach, the sigint is passed to the debugged
process. This triggers the exit of the inferior. For example in pstack,
printing a stack can take significant time, and ctrl-c is pressed to
abort the pstack/gdb application. This in turn kills the debugged
process, which can be critical for the system. In this case, the
intention of ctrl+c is to kill pstack/gdb, but not the inferior
application.
gdb -p <<pid>>
or gdb /proc/<<pid>>/exe pid
Attaching to process
<< ctrl+c is pressed during attach
(gdb) q
<<<< inferior process exited >>>>
A Ctrl-C/sigint received by gdb during the attachment of an inferior
passed to the debugged at some definite points during the window of
process attachment. The process of attaching an inferior is a multistep
process, and it takes time to get ready with the GDB prompt. As the
debugger and debugger are not fully attached during this period, the
sigint takes its default action to terminate the process.
Solution: While GDB attaches processes, the inferior is not the current
session leader. Hence, until attach is complete and the GDB prompt is
available, the sigint should not be passed to the inferior.
The signal should be skipped if the process runs in the background. With
this approach, we can skip passing the signature if the process is
attached to the GDB and the process attach is not complete.
attach_flag : Set if process is attached
sync_flag : Set if attach is complete
If attached and sync_flag is not set, dont kill attached process
---
gdb/infcmd.c | 2 ++
gdb/inferior.h | 3 +++
gdb/inflow.c | 2 ++
3 files changed, 7 insertions(+)
Comments
>>>>> "Partha" == Partha Satapathy <partha.satapathy@oracle.com> writes:
IMO it would be best if Pedro reviewed this, but...
Partha> diff --git a/gdb/inflow.c b/gdb/inflow.c
Partha> index 773ac0ba4997..381e6e4c22dd 100644
Partha> --- a/gdb/inflow.c
Partha> +++ b/gdb/inflow.c
Partha> @@ -585,6 +585,8 @@ child_pass_ctrlc (struct target_ops *self)
Partha> if (inf->terminal_state != target_terminal_state::is_ours)
Partha> {
Partha> gdb_assert (inf->pid != 0);
Partha> + if ((inf->attach_flag) && !(inf->sync_flag))
Partha> + return;
... if this code is run, doesn't it mean a C-c will be ignored?
Also earlier:
Partha> + check_quit_flag();
I think this just returns true/false and clears the flag.
That doesn't seem right.
Maybe check_quit_flag should be marked [[nodiscard]].
Tom
Hi.
Just wanted to let you know that I've read all the discussion around this until this
email I'm replying to, and started thinking about it a bit. Unfortunately this is one of
those areas in GDB where the right change is rarely immediately obvious (to me).
Some questions:
- If you ctrl-c to abort the attach, do we really abort the
attach properly? Or do we stay attached in some half broken state?
- Below you mention pstack, where can we find it? And you mention
that ctrl-c is pressed while that is printing a stack. I'm assuming
that's a backtrace command. I'm confused in that case, as if that is
so, then we should already be past the initial attach. The question
would then becomes, shouldn't gdb have the terminal at that point?
How come it does not?
I'm wondering whether Baris's patch to eliminate the inferior
continuations would help with this, as it probably makes the attaching
sequence synchronous. I should probably look at that one.
Pedro Alves
Adding another question to the list below. (I haven't tried to reproduce this yet myself, btw.)
On 2024-05-10 21:19, Pedro Alves wrote:
> Hi.
>
> Just wanted to let you know that I've read all the discussion around this until this
> email I'm replying to, and started thinking about it a bit. Unfortunately this is one of
> those areas in GDB where the right change is rarely immediately obvious (to me).
>
> Some questions:
>
> - If you ctrl-c to abort the attach, do we really abort the
> attach properly? Or do we stay attached in some half broken state?
>
> - Below you mention pstack, where can we find it? And you mention
> that ctrl-c is pressed while that is printing a stack. I'm assuming
> that's a backtrace command. I'm confused in that case, as if that is
> so, then we should already be past the initial attach. The question
> would then becomes, shouldn't gdb have the terminal at that point?
> How come it does not?
#3 - The patch description states:
> Problem: While gdb is attaching an inferior, if ctrl-c is pressed in the
> middle of the process attach, the sigint is passed to the debugged
> process. This triggers the exit of the inferior.
This SIGINT passing is done with "kill(-pgrp, SIGINT)". How does that manage
to trigger the exit of the inferior at all? ptrace should intercept the
SIGINT before the inferior ever sees it. Did it not?
Or could it be that the real issue is that because that sends the SIGINT
to all the processes in the inferior's pgrp, we kill more processes than
the one we're attaching to, and those processes exiting cause the inferior
to exit as well. If so, then this is orthogonal to the initial attach,
and can happen after the attach as well. There is a bug open about this
on bugzilla.
Pedro Alves
>
> I'm wondering whether Baris's patch to eliminate the inferior
> continuations would help with this, as it probably makes the attaching
> sequence synchronous. I should probably look at that one.
>
> Pedro Alves
>
On 5/13/2024 8:19 PM, Pedro Alves wrote:
> Adding another question to the list below. (I haven't tried to reproduce this yet myself, btw.)
>
> On 2024-05-10 21:19, Pedro Alves wrote:
>> Hi.
>>
>> Just wanted to let you know that I've read all the discussion around this until this
>> email I'm replying to, and started thinking about it a bit. Unfortunately this is one of
>> those areas in GDB where the right change is rarely immediately obvious (to me).
>>
>> Some questions:
>>
>> - If you ctrl-c to abort the attach, do we really abort the
>> attach properly? Or do we stay attached in some half broken state?
>>
>> - Below you mention pstack, where can we find it? And you mention
>> that ctrl-c is pressed while that is printing a stack. I'm assuming
>> that's a backtrace command. I'm confused in that case, as if that is
>> so, then we should already be past the initial attach. The question
>> would then becomes, shouldn't gdb have the terminal at that point?
>> How come it does not?
>
> #3 - The patch description states:
>
> > Problem: While gdb is attaching an inferior, if ctrl-c is pressed in the
> > middle of the process attach, the sigint is passed to the debugged
> > process. This triggers the exit of the inferior.
>
> This SIGINT passing is done with "kill(-pgrp, SIGINT)". How does that manage
> to trigger the exit of the inferior at all? ptrace should intercept the
> SIGINT before the inferior ever sees it. Did it not?
>
> Or could it be that the real issue is that because that sends the SIGINT
> to all the processes in the inferior's pgrp, we kill more processes than
> the one we're attaching to, and those processes exiting cause the inferior
> to exit as well. If so, then this is orthogonal to the initial attach,
> and can happen after the attach as well. There is a bug open about this
> on bugzilla.
>
> Pedro Alves
>
>>
>> I'm wondering whether Baris's patch to eliminate the inferior
>> continuations would help with this, as it probably makes the attaching
>> sequence synchronous. I should probably look at that one.
>>
>> Pedro Alves
>>
Thanks Pedro and Tom for reviewing the problem.
Problem :
pstack, dumps the stack of all threads in a process. In some cases
printing of stack can take significant time and ctrl-c is pressed to
abort pstack/gdb application. This in turn kills the debugged process,
which can be critical for the system. In this case the intention of
“ctrl+c” to kill pstack/gdb, but not the target application.
# tail pstack -n 12
# Run GDB, strip out unwanted noise.
# --readnever is no longer used since .gdb_index is now in use.
$GDB --quiet -nx $GDBARGS /proc/$1/exe $1 <<EOF 2>&1 |
set width 0
set height 0
set pagination no
$backtrace
EOF
/bin/sed -n \
-e 's/^\((gdb) \)*//' \
-e '/^#/p' \
-e '/^Thread/p'
This is the interest part in the pstack, rest is cosmetic.
pstack uses:
# pstack 1
#0 0x00007fa18cf44017 in epoll_wait () from /lib64/libc.so.6
#1 0x00007fa18e67e036 in sd_event_wait () from
/usr/lib/systemd/libsystemd-shared-239.so
#2 0x00007fa18e67f33b in sd_event_run () from
/usr/lib/systemd/libsystemd-shared-239.so
#3 0x000055c155da8c22 in manager_loop ()
#4 0x000055c155d5f133 in main ()
Reproduction:
The debugged application generally attached to process by:
gdb -p <<pid>>
or gdb /proc/<<pid>>/exe pid
pstack uses the latter method to attach the debugged to gdb. If the
application is large or process of reading symbols is slow, gives a good
window to press the ctrl+c during attach. Spawning "gdb" under "strace
-k" makes gdb a lot slower and gives a larger window to easily press the
ctrl+c at the precise period i.e. during the attach of the debugged
process. The above strace hack will enhance rate of reproduction of the
issue. Testcase:
With GDB 13.1
ps aux | grep abrtd
root 2195168 /usr/sbin/abrtd -d -s
#strace -k -o log gdb -p 2195168
Attaching to process 2195168
[New LWP 2195177]
[New LWP 2195179]
^C[Thread debugging using libthread_db enabled]
<<<< Note the ctrl+c is pressed after attach is initiated and it’s
still reading the symbols from library >>>> Using host libthread_db
library "/lib64/libthread_db.so.1".
0x00007fe3ed6d70d1 in poll () from /lib64/libc.so.6
(gdb) q
A debugging session is active.
Inferior 1 [process 2195168] will be detached Quit anyway? (y
or n) y Detaching from program: /usr/sbin/abrtd, process 2195168
# ps aux | grep 2195168
<<<< Process exited >>>>
This is having a very narrow window to press the ctrlc.
Session1 :
]$ ps aux | grep abrtd
root 1329 0.0 0.0 602624 13076 ? Ssl May03 0:00
/usr/sbin/abrtd -d -s
Session2:
# ./tpstack 1329
+ strace -o omlog -k ./gdb --quiet -nx -ex 'set width 0' -ex 'set height
0' -ex 'set pagination no' -ex 'set confirm off' -ex 'thread apply all
bt' -ex quit /proc/1329/exe 1329
Reading symbols from /proc/1329/exe...
Python Exception <class 'AttributeError'>: module 'gdb' has no attribute
'_handle_missing_debuginfo'
Reading symbols from .gnu_debugdata for /usr/sbin/abrtd...
(No debugging symbols found in .gnu_debugdata for /usr/sbin/abrtd)
Attaching to program: /proc/1329/exe, process 1329
[New LWP 1399]
[New LWP 1349] ^C
Session1:
[opc@pssatapa-ol8 TEST]$ ps aux | grep abrtd
<<<1329 Is killed >>>
This is a very small window, so a heavy application is good for
reproduction. I modified the the last part of pstack like:
# Run GDB, strip out unwanted noise.
# --readnever is no longer used since .gdb_index is now in use.
strace -o omlog -k ./gdb --quiet -nx -ex 'set width 0' -ex 'set
height 0' -ex 'set pagination no' -ex 'set confirm off' -ex 'thread
apply all bt' -ex quit /proc/$1/exe $1
The strace with -k on gdb make gdb slow and we get a window to press
Ctrl+c. otherwise the window is very small to time the signal. We
observe the problem while the FileStsyem or Kernel or proc FS is slow.
The signal is not intended to the inferior.
The signal is passed from "gdb" to the inferior.
The SIGINT handler in gdb, marks the QUIT flag and
in some paths we check the quit flag and pass the signal to inferior.
That is killing the inferior.
On :
+ check_quit_flag();
This should be set only when inf->attach_flag is true.
I will add the check in next iteration.
The idea here is to clear any pending QUIT flag set by sigint
else, post we set the sync_flag , a check to QUIT Flag
and can kill the inferior.
Thanks
Partha
@@ -2510,6 +2510,8 @@ setup_inferior (int from_tty)
target_post_attach (inferior_ptid.pid ());
post_create_inferior (from_tty);
+ check_quit_flag();
+ current_inferior ()->sync_flag = true;
}
/* What to do after the first program stops after attaching. */
@@ -603,6 +603,9 @@ class inferior : public refcounted_object,
/* True if this child process was attached rather than forked. */
bool attach_flag = false;
+ /* True if inferior has been fully attached*/
+ bool sync_flag = false;
+
/* If this inferior is a vfork child, then this is the pointer to
its vfork parent, if GDB is still attached to it. */
inferior *vfork_parent = NULL;
@@ -585,6 +585,8 @@ child_pass_ctrlc (struct target_ops *self)
if (inf->terminal_state != target_terminal_state::is_ours)
{
gdb_assert (inf->pid != 0);
+ if ((inf->attach_flag) && !(inf->sync_flag))
+ return;
#ifndef _WIN32
kill (inf->pid, SIGINT);