gdb: Improve syscall entry/return tracking on Linux

Message ID 1444353736-14451-1-git-send-email-jistone@redhat.com
State New, archived
Headers

Commit Message

Josh Stone Oct. 9, 2015, 1:22 a.m. UTC
  The existing logic was simply to flip syscall entry/return state when a
syscall trap was seen, and even then only with active 'catch syscall'.
That can get out of sync if 'catch syscall' is toggled at odd times.

This patch updates the entry/return state for all syscall traps,
regardless of catching state, and also updates known syscall state for
other kinds of traps.  Almost all PTRACE_EVENT stops are delivered from
the middle of a syscall, so this can act like an entry.  Every other
kind of ptrace stop is only delivered outside of syscall event pairs, so
marking them ignored ensures the next syscall trap looks like an entry.

Three new test scenarios are added to catch-syscall.exp:

- Disable 'catch syscall' from an entry to deliberately miss the return
  event, then re-enable to make sure a new entry is recognized.

- Enable 'catch syscall' for the first time from a fork event, which is
  a PTRACE_EVENT_FORK in the middle of the syscall.  Make sure the next
  syscall event is recognized as the return.

- Make sure entry and return are recognized for an ENOSYS syscall.  This
  is to defeat a common x86 hack that uses the pre-filled ENOSYS return
  value as a sign of being on the entry side.

gdb/ChangeLog:

2015-10-08  Josh Stone  <jistone@redhat.com>

	* linux-nat.c (linux_handle_syscall_trap): Always update entry/
	return state, even when not actively catching syscalls at all.
	(linux_handle_extended_wait): Mark syscall_state like an entry.
	(wait_lwp): Set syscall_state ignored for other traps.
	(linux_nat_filter_event): Likewise.

gdb/testsuite/ChangeLog:

2015-10-08  Josh Stone  <jistone@redhat.com>

	* gdb.base/catch-syscall.c (fork_syscall): New variable.
	(unknown_syscall): Likewise.
	(main): Trigger a fork and an unknown syscall.
	* gdb.base/catch-syscall.exp (unknown_syscall_number): New variable.
	(test_catch_syscall_without_args): Check the unknown syscall.
	(test_catch_syscall_skipping_return): New test toggling off 'catch
	syscall' to step over the syscall return, then toggling back on.
	(test_catch_syscall_mid_fork): New test turning on 'catch syscall'
	during a PTRACE_EVENT_FORK stop, in the middle of a fork syscall.
	(do_syscall_tests): Call test_catch_syscall_without_args and
	test_catch_syscall_mid_fork.
	(test_catch_syscall_without_args_noxml): Check the unknown syscall.
	(fill_all_syscalls_numbers): Initialize unknown_syscall_number.
	(setup_all_syscalls): Add "fork" to all_syscalls.
---
 gdb/linux-nat.c                          | 31 ++++++++----
 gdb/testsuite/gdb.base/catch-syscall.c   |  9 ++++
 gdb/testsuite/gdb.base/catch-syscall.exp | 86 ++++++++++++++++++++++++++++++--
 3 files changed, 113 insertions(+), 13 deletions(-)
  

Comments

Pedro Alves Oct. 9, 2015, 10:48 a.m. UTC | #1
Hi Josh,

This looks generally good to me.  A couple comments below.

On 10/09/2015 02:22 AM, Josh Stone wrote:

> @@ -2324,6 +2329,10 @@ wait_lwp (struct lwp_info *lp)
>        if (linux_handle_syscall_trap (lp, 1))
>  	return wait_lwp (lp);
>      }
> +  else
> +    /* Almost all other ptrace-stops are known to be outside of system
> +       calls, with further exceptions in linux_handle_extended_wait.  */
> +    lp->syscall_state = TARGET_WAITKIND_IGNORE;

Our coding conventions state that this should be wrapped in braces:

 https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards

(look for "braces")

>  
>    /* Handle GNU/Linux's extended waitstatus for trace events.  */
>    if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP
> @@ -3126,6 +3135,10 @@ linux_nat_filter_event (int lwpid, int status)
>        if (linux_handle_syscall_trap (lp, 0))
>  	return NULL;
>      }
> +  else
> +    /* Almost all other ptrace-stops are known to be outside of system
> +       calls, with further exceptions in linux_handle_extended_wait.  */
> +    lp->syscall_state = TARGET_WAITKIND_IGNORE;

Ditto.

>  
>    /* Handle GNU/Linux's extended waitstatus for trace events.  */
>    if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP
> diff --git a/gdb/testsuite/gdb.base/catch-syscall.c b/gdb/testsuite/gdb.base/catch-syscall.c
> index 4d0131c0d733..35955fe4a078 100644
> --- a/gdb/testsuite/gdb.base/catch-syscall.c
> +++ b/gdb/testsuite/gdb.base/catch-syscall.c
> @@ -27,6 +27,8 @@ int pipe_syscall = SYS_pipe;
>  int pipe2_syscall = SYS_pipe2;
>  #endif
>  int write_syscall = SYS_write;
> +int fork_syscall = SYS_fork;

no-mmu / uclinux systems don't have fork.  I'm not sure whether
fork returns ENOSYS or SYS_fork isn't even defined there.
Maybe just switch to vfork so we can keep catch syscall
coverage on those systems?

> +int unknown_syscall = 123456789;
>  int exit_group_syscall = SYS_exit_group;
>  
>  int
> @@ -47,6 +49,13 @@ main (void)
>  	write (fd[1], buf1, sizeof (buf1));
>  	read (fd[0], buf2, sizeof (buf2));
>  
> +	/* Test fork-event interactions.  Child exits immediately.
> +	   NB: glibc actually uses clone(), so force a fork.  */
> +	if (syscall (fork_syscall) == 0) _exit (0);

We've recently agreed that tests should follow the coding conventions too,
unless there's a good reason otherwise.  Can you put the _exit on
its own line?

Thanks,
Pedro Alves
  
Josh Stone Oct. 9, 2015, 5:57 p.m. UTC | #2
On 10/09/2015 03:48 AM, Pedro Alves wrote:
> Hi Josh,
> 
> This looks generally good to me.  A couple comments below.

Thanks for reviewing!

> On 10/09/2015 02:22 AM, Josh Stone wrote:
> 
>> @@ -2324,6 +2329,10 @@ wait_lwp (struct lwp_info *lp)
>>        if (linux_handle_syscall_trap (lp, 1))
>>  	return wait_lwp (lp);
>>      }
>> +  else
>> +    /* Almost all other ptrace-stops are known to be outside of system
>> +       calls, with further exceptions in linux_handle_extended_wait.  */
>> +    lp->syscall_state = TARGET_WAITKIND_IGNORE;
> 
> Our coding conventions state that this should be wrapped in braces:
> 
>  https://sourceware.org/gdb/wiki/Internals%20GDB-C-Coding-Standards
> 
> (look for "braces")

Ok, I will update these.

>>  
>>    /* Handle GNU/Linux's extended waitstatus for trace events.  */
>>    if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP
>> @@ -3126,6 +3135,10 @@ linux_nat_filter_event (int lwpid, int status)
>>        if (linux_handle_syscall_trap (lp, 0))
>>  	return NULL;
>>      }
>> +  else
>> +    /* Almost all other ptrace-stops are known to be outside of system
>> +       calls, with further exceptions in linux_handle_extended_wait.  */
>> +    lp->syscall_state = TARGET_WAITKIND_IGNORE;
> 
> Ditto.
> 
>>  
>>    /* Handle GNU/Linux's extended waitstatus for trace events.  */
>>    if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP
>> diff --git a/gdb/testsuite/gdb.base/catch-syscall.c b/gdb/testsuite/gdb.base/catch-syscall.c
>> index 4d0131c0d733..35955fe4a078 100644
>> --- a/gdb/testsuite/gdb.base/catch-syscall.c
>> +++ b/gdb/testsuite/gdb.base/catch-syscall.c
>> @@ -27,6 +27,8 @@ int pipe_syscall = SYS_pipe;
>>  int pipe2_syscall = SYS_pipe2;
>>  #endif
>>  int write_syscall = SYS_write;
>> +int fork_syscall = SYS_fork;
> 
> no-mmu / uclinux systems don't have fork.  I'm not sure whether
> fork returns ENOSYS or SYS_fork isn't even defined there.
> Maybe just switch to vfork so we can keep catch syscall
> coverage on those systems?

In kernel/fork.c I see that lacking CONFIG_MMU returns EINVAL.

But it appears a few archs don't implement fork/vfork syscalls at all,
only clone.  Maybe I should use CLONE_VFORK for broadest coverage?

>> +int unknown_syscall = 123456789;
>>  int exit_group_syscall = SYS_exit_group;
>>  
>>  int
>> @@ -47,6 +49,13 @@ main (void)
>>  	write (fd[1], buf1, sizeof (buf1));
>>  	read (fd[0], buf2, sizeof (buf2));
>>  
>> +	/* Test fork-event interactions.  Child exits immediately.
>> +	   NB: glibc actually uses clone(), so force a fork.  */
>> +	if (syscall (fork_syscall) == 0) _exit (0);
> 
> We've recently agreed that tests should follow the coding conventions too,
> unless there's a good reason otherwise.  Can you put the _exit on
> its own line?

No problem.
  
Pedro Alves Oct. 9, 2015, 6:09 p.m. UTC | #3
On 10/09/2015 06:57 PM, Josh Stone wrote:
> On 10/09/2015 03:48 AM, Pedro Alves wrote:

>> no-mmu / uclinux systems don't have fork.  I'm not sure whether
>> fork returns ENOSYS or SYS_fork isn't even defined there.
>> Maybe just switch to vfork so we can keep catch syscall
>> coverage on those systems?
> 
> In kernel/fork.c I see that lacking CONFIG_MMU returns EINVAL.
> 
> But it appears a few archs don't implement fork/vfork syscalls at all,
> only clone.  

Ah, yeah.  Even on x86 glibc doesn't really implement fork
with the fork syscall.

> Maybe I should use CLONE_VFORK for broadest coverage?

That does sound the best.

Thanks,
Pedro Alves
  
Josh Stone Oct. 9, 2015, 11:52 p.m. UTC | #4
On 10/09/2015 11:09 AM, Pedro Alves wrote:
> On 10/09/2015 06:57 PM, Josh Stone wrote:
>> On 10/09/2015 03:48 AM, Pedro Alves wrote:
> 
>>> no-mmu / uclinux systems don't have fork.  I'm not sure whether
>>> fork returns ENOSYS or SYS_fork isn't even defined there.
>>> Maybe just switch to vfork so we can keep catch syscall
>>> coverage on those systems?
>>
>> In kernel/fork.c I see that lacking CONFIG_MMU returns EINVAL.
>>
>> But it appears a few archs don't implement fork/vfork syscalls at all,
>> only clone.  
> 
> Ah, yeah.  Even on x86 glibc doesn't really implement fork
> with the fork syscall.
> 
>> Maybe I should use CLONE_VFORK for broadest coverage?
> 
> That does sound the best.

I tried, but this area is fraught with idiosyncrasies.  I'm going with a
direct vfork() call and loose pattern matching for what syscall results,
since I don't actually care about that detail for this test.  Hope that
works for you.
  
Pedro Alves Oct. 10, 2015, 2:46 p.m. UTC | #5
On 10/10/2015 12:52 AM, Josh Stone wrote:

> I tried, but this area is fraught with idiosyncrasies.  I'm going with a
> direct vfork() call and loose pattern matching for what syscall results,
> since I don't actually care about that detail for this test.  Hope that
> works for you.

Certainly does.

Thanks,
Pedro Alves
  

Patch

diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index eb9f5bb91f6d..4cbb079de3f6 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1916,17 +1916,17 @@  linux_handle_syscall_trap (struct lwp_info *lp, int stopping)
       return 1;
     }
 
+  /* Always update the entry/return state, even if this particular
+     syscall isn't interesting to the core now.  In async mode,
+     the user could install a new catchpoint for this syscall
+     between syscall enter/return, and we'll need to know to
+     report a syscall return if that happens.  */
+  lp->syscall_state = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
+		       ? TARGET_WAITKIND_SYSCALL_RETURN
+		       : TARGET_WAITKIND_SYSCALL_ENTRY);
+
   if (catch_syscall_enabled ())
     {
-      /* Always update the entry/return state, even if this particular
-	 syscall isn't interesting to the core now.  In async mode,
-	 the user could install a new catchpoint for this syscall
-	 between syscall enter/return, and we'll need to know to
-	 report a syscall return if that happens.  */
-      lp->syscall_state = (lp->syscall_state == TARGET_WAITKIND_SYSCALL_ENTRY
-			   ? TARGET_WAITKIND_SYSCALL_RETURN
-			   : TARGET_WAITKIND_SYSCALL_ENTRY);
-
       if (catching_syscall_number (syscall_number))
 	{
 	  /* Alright, an event to report.  */
@@ -2006,6 +2006,11 @@  linux_handle_extended_wait (struct lwp_info *lp, int status)
   struct target_waitstatus *ourstatus = &lp->waitstatus;
   int event = linux_ptrace_get_extended_event (status);
 
+  /* All extended events we currently use are mid-syscall.  Only
+     PTRACE_EVENT_STOP is delivered more like a signal-stop, but
+     you have to be using PTRACE_SEIZE to get that.  */
+  lp->syscall_state = TARGET_WAITKIND_SYSCALL_ENTRY;
+
   if (event == PTRACE_EVENT_FORK || event == PTRACE_EVENT_VFORK
       || event == PTRACE_EVENT_CLONE)
     {
@@ -2324,6 +2329,10 @@  wait_lwp (struct lwp_info *lp)
       if (linux_handle_syscall_trap (lp, 1))
 	return wait_lwp (lp);
     }
+  else
+    /* Almost all other ptrace-stops are known to be outside of system
+       calls, with further exceptions in linux_handle_extended_wait.  */
+    lp->syscall_state = TARGET_WAITKIND_IGNORE;
 
   /* Handle GNU/Linux's extended waitstatus for trace events.  */
   if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP
@@ -3126,6 +3135,10 @@  linux_nat_filter_event (int lwpid, int status)
       if (linux_handle_syscall_trap (lp, 0))
 	return NULL;
     }
+  else
+    /* Almost all other ptrace-stops are known to be outside of system
+       calls, with further exceptions in linux_handle_extended_wait.  */
+    lp->syscall_state = TARGET_WAITKIND_IGNORE;
 
   /* Handle GNU/Linux's extended waitstatus for trace events.  */
   if (WIFSTOPPED (status) && WSTOPSIG (status) == SIGTRAP
diff --git a/gdb/testsuite/gdb.base/catch-syscall.c b/gdb/testsuite/gdb.base/catch-syscall.c
index 4d0131c0d733..35955fe4a078 100644
--- a/gdb/testsuite/gdb.base/catch-syscall.c
+++ b/gdb/testsuite/gdb.base/catch-syscall.c
@@ -27,6 +27,8 @@  int pipe_syscall = SYS_pipe;
 int pipe2_syscall = SYS_pipe2;
 #endif
 int write_syscall = SYS_write;
+int fork_syscall = SYS_fork;
+int unknown_syscall = 123456789;
 int exit_group_syscall = SYS_exit_group;
 
 int
@@ -47,6 +49,13 @@  main (void)
 	write (fd[1], buf1, sizeof (buf1));
 	read (fd[0], buf2, sizeof (buf2));
 
+	/* Test fork-event interactions.  Child exits immediately.
+	   NB: glibc actually uses clone(), so force a fork.  */
+	if (syscall (fork_syscall) == 0) _exit (0);
+
+	/* Trigger an intentional ENOSYS.  */
+	syscall (unknown_syscall);
+
 	/* The last syscall.  Do not change this.  */
 	_exit (0);
 }
diff --git a/gdb/testsuite/gdb.base/catch-syscall.exp b/gdb/testsuite/gdb.base/catch-syscall.exp
index 499da322267c..644ab8a687e6 100644
--- a/gdb/testsuite/gdb.base/catch-syscall.exp
+++ b/gdb/testsuite/gdb.base/catch-syscall.exp
@@ -51,6 +51,8 @@  set all_syscalls_numbers { }
 set last_syscall "exit_group"
 set last_syscall_number { }
 
+set unknown_syscall_number { }
+
 # Internal procedure used to check if, after issuing a 'catch syscall'
 # command (without arguments), the 'info breakpoints' command displays
 # that '"any syscall"' is to be caught.
@@ -154,7 +156,7 @@  proc check_for_program_end {} {
 }
 
 proc test_catch_syscall_without_args {} {
-    global all_syscalls last_syscall decimal
+    global all_syscalls last_syscall unknown_syscall_number decimal
 
     with_test_prefix "without arguments" {
 	# Trying to set the syscall.
@@ -167,6 +169,10 @@  proc test_catch_syscall_without_args {} {
 	    check_continue $name
 	}
 
+	with_test_prefix "ENOSYS" {
+	    check_continue $unknown_syscall_number
+	}
+
 	# At last but not least, we check if the inferior has called
 	# the last (exit) syscall.
 	check_call_to_syscall $last_syscall
@@ -243,6 +249,65 @@  proc test_catch_syscall_restarting_inferior {} {
     }
 }
 
+proc test_catch_syscall_skipping_return {} {
+    with_test_prefix "skipping return" {
+	with_test_prefix "entry" {
+	    set syscall_name "write"
+
+	    insert_catch_syscall_with_arg $syscall_name
+
+	    # Let's first reach the entry of the syscall.
+	    check_call_to_syscall $syscall_name
+
+	    # Now purposely skip the syscall return.
+	    delete_breakpoints
+	    gdb_test "stepi" ".*" "step over syscall return"
+	}
+
+	# With a naive entry/return toggle, gdb will still think
+	# the target is due for a syscall return.
+
+	with_test_prefix "entry/return" {
+	    set syscall_name "read"
+
+	    insert_catch_syscall_with_arg $syscall_name
+
+	    # Check for entry first, then return.
+	    check_continue $syscall_name
+
+	    # Can we finish?
+	    check_for_program_end
+	}
+    }
+}
+
+proc test_catch_syscall_mid_fork {} {
+    global gdb_prompt decimal
+
+    with_test_prefix "mid-fork" {
+	set syscall_name "fork"
+
+	# Verify that the system supports "catch fork".
+	gdb_test "catch fork" "Catchpoint $decimal \\(fork\\)" "insert first fork catchpoint"
+	gdb_test_multiple "continue" "continue to first fork catchpoint" {
+	    -re ".*Your system does not support this type\r\nof catchpoint.*$gdb_prompt $" {
+		unsupported "continue to first fork catchpoint"
+		return
+	    }
+	    -re ".*Catchpoint $decimal \\(forked process $decimal\\).*$gdb_prompt $" {
+		pass "continue to first fork catchpoint"
+	    }
+	}
+
+	# Check that we now reach fork return only
+	insert_catch_syscall_with_arg $syscall_name
+	check_return_from_syscall $syscall_name
+
+	# Can we finish?
+	check_for_program_end
+    }
+}
+
 proc test_catch_syscall_fail_nodatadir {} {
     with_test_prefix "fail no datadir" {
 	# Sanitizing.
@@ -302,10 +367,17 @@  proc do_syscall_tests {} {
     # This test should not trigger any catchpoints.
     if [runto_main] then { test_catch_syscall_with_wrong_args }
 
-    # Testing the 'catch' syscall command during a restart of
+    # Testing the 'catch syscall' command during a restart of
     # the inferior.
     if [runto_main] then { test_catch_syscall_restarting_inferior }
 
+    # Testing the 'catch syscall' command toggling off past a
+    # syscall return, then resuming entry/return as normal.
+    if [runto_main] then { test_catch_syscall_skipping_return }
+
+    # Testing the 'catch syscall' command starting mid-fork.
+    if [runto_main] then { test_catch_syscall_mid_fork }
+
     # Testing if the 'catch syscall' command works when switching to
     # different architectures on-the-fly (PR gdb/10737).
     if [runto_main] then { test_catch_syscall_multi_arch }
@@ -315,7 +387,7 @@  proc test_catch_syscall_without_args_noxml {} {
     with_test_prefix "without args noxml" {
 	# We will need the syscall names even not using it because we
 	# need to know know many syscalls are in the example file.
-	global all_syscalls last_syscall_number all_syscalls_numbers
+	global all_syscalls last_syscall_number unknown_syscall_number all_syscalls_numbers
 
 	delete_breakpoints
 
@@ -329,6 +401,10 @@  proc test_catch_syscall_without_args_noxml {} {
 	    }
 	}
 
+	with_test_prefix "ENOSYS" {
+	    check_continue $unknown_syscall_number
+	}
+
 	# At last but not least, we check if the inferior has called
 	# the last (exit) syscall.
 	check_call_to_syscall $last_syscall_number
@@ -466,13 +542,14 @@  proc do_syscall_tests_without_xml {} {
 # This procedure fills the vector "all_syscalls_numbers" with the proper
 # numbers for the used syscalls according to the architecture.
 proc fill_all_syscalls_numbers {} {
-    global all_syscalls_numbers last_syscall_number all_syscalls
+    global all_syscalls_numbers last_syscall_number unknown_syscall_number all_syscalls
 
     foreach syscall $all_syscalls {
 	lappend all_syscalls_numbers [get_integer_valueof "${syscall}_syscall" -1]
     }
 
     set last_syscall_number [get_integer_valueof "exit_group_syscall" -1]
+    set unknown_syscall_number [get_integer_valueof "unknown_syscall" -1]
 }
 
 # Set up the vector all_syscalls.
@@ -501,6 +578,7 @@  proc setup_all_syscalls {} {
 
     lappend all_syscalls "write"
     lappend all_syscalls "read"
+    lappend all_syscalls "fork"
 }
 
 setup_all_syscalls