[RFC,AArch64] Fix some erroneous behavior in gdb.base/step-over-syscall.exp

I've been chasing this particular bug for a while now, and this is a
tentative patch to fix the problem.

Although this may not be aarch64-specific, I can reproduce it consistently on
this architecture.

In summary, this bug is a mix of random signal handling (SIGCHLD) and
fork/vfork handling when single-stepping.

Forks are handled in 3 steps and vforks in 4.

Fork

1 - We get a PTRACE_EVENT_FORK event.
2 - We single-step the inferior to get a SIGTRAP.
3 - Inferior is ready to be resumed as we wish.

Vfork

1 - We get a PTRACE_EVENT_VFORK event.
2 - We continue the inferior to get a PTRACE_EVENT_VFORK_DONE event.
3 - We single-step the inferior to get a SIGTRAP.
4 - Inferior is ready to be resumed as we wish.
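
As a rough illustration of the two sequences above, here is a minimal toy
model, not GDB code. The names stop_event, resume_kind and next_resume are
made up for this sketch. It only maps each stop seen during fork/vfork
handling to the way the inferior is resumed next.

```cpp
#include <cassert>

// Hypothetical names for illustration only; none of these exist in GDB.
enum class stop_event { fork, vfork, vfork_done, sigtrap };
enum class resume_kind { step, cont, user_choice };

// Given the stop just reported while handling a fork or vfork, return how
// the toy debugger resumes the inferior next.
resume_kind
next_resume (stop_event ev)
{
  switch (ev)
    {
    case stop_event::fork:
      return resume_kind::step;		// Fork, step 2.
    case stop_event::vfork:
      return resume_kind::cont;		// Vfork, step 2.
    case stop_event::vfork_done:
      return resume_kind::step;		// Vfork, step 3.
    case stop_event::sigtrap:
      return resume_kind::user_choice;	// Last step of either list.
    }

  assert (false && "unknown stop event");
  return resume_kind::user_choice;
}
```

In both sequences the single-step is the last thing GDB does on its own
before the SIGTRAP hands control back to the user.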

The problem manifests itself when we are sitting at a syscall instruction and
we try to instruction-step past it. There may or may not be a breakpoint
inserted at the syscall instruction location.

The expected outcome is that we step a single instruction and the resulting PC
points at the next instruction ($pc + 4 in aarch64).

What I see is that we end up at $pc + 8, that is, one instruction further than
we should have been.

This happens because, when forking/vforking, the child exits too quickly (in the
particular case of gdb.base/step-over-syscall.exp) and a SIGCHLD signal is
issued to the parent. Sometimes this signal arrives before GDB is done handling
the fork/vfork situation, usually in step 2 for fork and step 3 for vfork.

In turn, this causes GDB to handle that SIGCHLD as a random signal that must be
passed to the inferior, which is correct. But the random signal handling code
in infrun.c doesn't cope with this situation: GDB inserts a HP step-resume
breakpoint and registers its wish for the inferior to be stepped yet again.

Thus we end up at $pc + 8, even though we shouldn't have stepped again.

The delivery of SIGCHLD in between fork/vfork handling steps is sensitive to
timing. If we tweak things a little, enable debugging output or do any other
thing that affects the timing, we may not see this behavior.

This bug doesn't show up as failures in gdb.base/step-over-syscall.exp due to
the way the test is written. As long as the PC is the same from the previous
test to the next, the test considers it good. The proposed patch doesn't cause
a change in the number of PASSes/FAILs.

The proposed patch adds a variable waiting_for_fork_sigtrap that gets set just
before the last step of fork/vfork handling and gets reset when we end up
seeing that SIGTRAP we wanted.

This variable is used in the random_signal handling of handle_signal_stop,
where there are a couple of conditional blocks that handle signals reaching the
inferior at an unexpected time. If waiting_for_fork_sigtrap is set, we don't
set GDB up to do the additional single-step it wants to do.
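
To make the intended effect concrete, here is a minimal toy model, not the
actual infrun.c change. Only the flag name waiting_for_fork_sigtrap comes from
the patch; toy_inferior, stepi_over_fork and both of its boolean parameters
are made up for this sketch, and the step-resume breakpoint machinery is
reduced to a plain PC increment.

```cpp
#include <cassert>
#include <cstdint>

// AArch64 instructions are 4 bytes long.
constexpr std::uint64_t insn_size = 4;

// Hypothetical stand-in for the per-inferior state; only the flag name
// matches the proposed patch.
struct toy_inferior
{
  std::uint64_t pc = 0;
  bool waiting_for_fork_sigtrap = false;
};

// Model a stepi over a fork/vfork syscall at START_PC.
// SIGCHLD_DURING_STEP says whether SIGCHLD shows up while we wait for the
// final SIGTRAP.  USE_GUARD selects the patched behavior.  Returns the PC
// at which the step ends.
std::uint64_t
stepi_over_fork (std::uint64_t start_pc, bool sigchld_during_step,
		 bool use_guard)
{
  assert (start_pc % insn_size == 0);

  toy_inferior inf;
  inf.pc = start_pc;

  // Last step of fork/vfork handling: single-step past the syscall.
  inf.waiting_for_fork_sigtrap = use_guard;
  inf.pc += insn_size;

  // Random signal path: without the guard, the inferior is stepped once
  // more after the signal is passed on.
  if (sigchld_during_step && !inf.waiting_for_fork_sigtrap)
    inf.pc += insn_size;

  // The step is done, so the flag is reset.
  inf.waiting_for_fork_sigtrap = false;
  return inf.pc;
}
```

With start_pc set to 0x1000, the unguarded call with a SIGCHLD returns 0x1008
($pc + 8), while the guarded call returns 0x1004 ($pc + 4), the same as when
no SIGCHLD arrives at all.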

I gave this some thought and tried to find a better place for the check,
but this seemed the most appropriate place to me.

I also tried to solve this without having to introduce a new variable.
But it seems we can't distinguish between a random signal reaching GDB in
the middle of single-stepping and a random signal reaching GDB in the middle
of single-stepping AND handling fork/vfork.

Thoughts?

gdb/ChangeLog:

2019-12-30  Luis Machado  <luis.machado@linaro.org>

	* inferior.h (class inferior) <waiting_for_fork_sigtrap>: New member.
	Set to false by default.
	* infrun.c (handle_inferior_event): Set waiting_for_fork_sigtrap when
	a fork or vfork_done is detected.
	(handle_signal_stop): Don't step again if waiting_for_fork_sigtrap is
	true.
	Reset waiting_for_fork_sigtrap to false when nexti/stepi is detected.

Change-Id: I2849e0dc49ad9c0a026daa8ced4610aa0ddbe637
---
 gdb/inferior.h | 12 ++++++++++++
 gdb/infrun.c   | 31 +++++++++++++++++++++++++++++--
 2 files changed, 41 insertions(+), 2 deletions(-)
