[v2] Improve ptrace-error detection on Linux targets

Message ID ca41b317-cf1c-dfde-9f54-f990acf19dc5@redhat.com
State New, archived
Headers

Commit Message

Pedro Alves Aug. 30, 2019, 12:45 p.m. UTC
  On 8/29/19 8:27 PM, Sergio Durigan Junior wrote:
> On Thursday, August 29 2019, Pedro Alves wrote:

>>> diff --git a/gdb/nat/linux-ptrace.h b/gdb/nat/linux-ptrace.h
>>> index fd2f12a342..04ada53bf6 100644
>>> --- a/gdb/nat/linux-ptrace.h
>>> +++ b/gdb/nat/linux-ptrace.h
>>> @@ -176,13 +176,26 @@ struct buffer;
>>>  # define TRAP_HWBKPT 4
>>>  #endif
>>>  
>>> -extern std::string linux_ptrace_attach_fail_reason (pid_t pid);
>>> +/* Find all possible reasons we could fail to attach PID and return
>>> +   these as a string.  An empty string is returned if we didn't find
>>> +   any reason.  If ERR is EACCES or EPERM, we also add a warning about
>>> +   possible restrictions to use ptrace.  */
>>> +extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err = -1);
>>
>> If ERR is an errno number, then it's a bit odd to use -1 for default,
>> since errno == 0 is the traditional "no error" number.  Pedantically, I believe
>> there's no garantee that a valid error number must be a positive integer.
>>
>> But, why the default argument in the first place?  What calls this
>> without passing an error?
> 
> So, the only place that calls linux_ptrace_attach_fail_reason without
> passing the ERR argument is linux_ptrace_attach_fail_reason_string:
> 
>   std::string
>   linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err)
>   {
>     long lwpid = ptid.lwp ();
>     std::string reason = linux_ptrace_attach_fail_reason (lwpid);
> 
>     if (!reason.empty ())
>       return string_printf ("%s (%d), %s", safe_strerror (err), err,
>                             reason.c_str ());
>     else
>       return string_printf ("%s (%d)", safe_strerror (err), err);
>   }
> 
> In this case, I opted to keep it as is because the function will compose
> a string contaning like:
> 
>   A (B)[: C]
> 
> Where:
> 
>  A = safe_strerror
>  B = errno
>  C = fail reason (optional)
> 
> This function (linux_ptrace_attach_fail_reason_string) is called in
> three places:
> 
> gdb/linux-nat.c:
> 
> 	      std::string reason
> 		= linux_ptrace_attach_fail_reason_string (ptid, err);
> 
> 	      warning (_("Cannot attach to lwp %d: %s"),
> 		       lwpid, reason.c_str ());
> 
> gdb/gdbserver/linux-low.c:
> 
> 	  std::string reason
> 	    = linux_ptrace_attach_fail_reason_string (ptid, err);
> 
> 	  warning (_("Cannot attach to lwp %d: %s"), lwpid, reason.c_str ());
> 
> and
> 
>       std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
>       error ("Cannot attach to process %ld: %s", pid, reason.c_str ());
> 
> 
> It seems to me like these error messages are expecting a short string to
> just append to their existing strings, so I didn't think it made much
> sense to extend the ptrace error checking here as well.  That's why I
> didn't extend linux_ptrace_attach_fail_reason_string to pass ERR down to
> linux_ptrace_attach_fail_reason.

OK, I think that's just a bit too messy.  Let's take a fresh look at the
result:

 /* Find all possible reasons we could fail to attach PID and return
    these as a string.  An empty string is returned if we didn't find
    any reason.  If ERR is EACCES or EPERM, we also add a warning about
    possible restrictions to use ptrace.  */
 extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err = -1);

 /* Find all possible reasons we could have failed to attach to PTID
    and return them as a string.  ERR is the error PTRACE_ATTACH failed
    with (an errno).  */
 extern std::string linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err);

Both the functions have the exact same prototype (same parameters, same return).
And since they both return std::string, why is linux_ptrace_attach_fail_reason_string
called "xxx_string"?  From the function names alone, I'd think
linux_ptrace_attach_fail_reason_string returned a string, while
linux_ptrace_attach_fail_reason returned something else, or printed
directly.  But that's not the case.

So linux_ptrace_attach_fail_reason_string is used when we're attaching
to an lwp, other than the thread group leader (the main thread).
IMO, it'd be better to rename that function accordingly, and update
its comments accordingly too.

Then, it becomes clearer that linux_ptrace_attach_fail_reason is to
be used in the initial process attach, in which case we want to
check the ptrace permissions.

Also, we can factor out things such that we don't need the
default parameter.  I tend to prefer avoiding mode-setting default
parameters, preferring differently named functions, because default
parameters make it harder to grep around the sources, distinguishing the
cases where you passed an argument or not.  In this case,
the linux_ptrace_attach_fail_reason / linux_ptrace_attach_fail_reason_string
functions serve different purposes, so decoupling them via not using
default parameters is better, IMO, it avoids missing some conversion
in some spot.

Which is exactly what happened.  Here's a patch implementing
the idea.  Notice how gdbserver/linux-low.c:linux_attach
has a spot where it is currently using linux_ptrace_attach_fail_reason_string
after attaching to the main thread fails.  That spot should be including
the new ptrace restrictions fail reasons, but it wasn't due to use of
linux_ptrace_attach_fail_reason_string:

 @@ -1202,7 +1202,7 @@ linux_attach (unsigned long pid)
     {
       remove_process (proc);
 
 -      std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
 +      std::string reason = linux_ptrace_attach_fail_reason (pid, err);
       error ("Cannot attach to process %ld: %s", pid, reason.c_str ());
     }

I haven't checked whether the resulting error message looks good as is,
or whether we need to tweak that error call in addition.  Can you take it
from here?

Here's the patch on top of yours.

From a3e8744c1ed7fa8c702009fe3caf8578bb1785ea Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Fri, 30 Aug 2019 13:15:12 +0100
Subject: [PATCH] linux_ptrace_attach_fail_reason

---
 gdb/gdbserver/linux-low.c |  4 ++--
 gdb/gdbserver/thread-db.c |  2 +-
 gdb/linux-nat.c           |  2 +-
 gdb/nat/linux-ptrace.c    | 24 ++++++++++++++++++------
 gdb/nat/linux-ptrace.h    | 10 ++++++----
 5 files changed, 28 insertions(+), 14 deletions(-)
  

Comments

Sergio Durigan Junior Sept. 4, 2019, 7:21 p.m. UTC | #1
On Friday, August 30 2019, Pedro Alves wrote:

> On 8/29/19 8:27 PM, Sergio Durigan Junior wrote:
>> On Thursday, August 29 2019, Pedro Alves wrote:
>
>>>> diff --git a/gdb/nat/linux-ptrace.h b/gdb/nat/linux-ptrace.h
>>>> index fd2f12a342..04ada53bf6 100644
>>>> --- a/gdb/nat/linux-ptrace.h
>>>> +++ b/gdb/nat/linux-ptrace.h
>>>> @@ -176,13 +176,26 @@ struct buffer;
>>>>  # define TRAP_HWBKPT 4
>>>>  #endif
>>>>  
>>>> -extern std::string linux_ptrace_attach_fail_reason (pid_t pid);
>>>> +/* Find all possible reasons we could fail to attach PID and return
>>>> +   these as a string.  An empty string is returned if we didn't find
>>>> +   any reason.  If ERR is EACCES or EPERM, we also add a warning about
>>>> +   possible restrictions to use ptrace.  */
>>>> +extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err = -1);
>>>
>>> If ERR is an errno number, then it's a bit odd to use -1 for default,
>>> since errno == 0 is the traditional "no error" number.  Pedantically, I believe
>>> there's no garantee that a valid error number must be a positive integer.
>>>
>>> But, why the default argument in the first place?  What calls this
>>> without passing an error?
>> 
>> So, the only place that calls linux_ptrace_attach_fail_reason without
>> passing the ERR argument is linux_ptrace_attach_fail_reason_string:
>> 
>>   std::string
>>   linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err)
>>   {
>>     long lwpid = ptid.lwp ();
>>     std::string reason = linux_ptrace_attach_fail_reason (lwpid);
>> 
>>     if (!reason.empty ())
>>       return string_printf ("%s (%d), %s", safe_strerror (err), err,
>>                             reason.c_str ());
>>     else
>>       return string_printf ("%s (%d)", safe_strerror (err), err);
>>   }
>> 
>> In this case, I opted to keep it as is because the function will compose
>> a string contaning like:
>> 
>>   A (B)[: C]
>> 
>> Where:
>> 
>>  A = safe_strerror
>>  B = errno
>>  C = fail reason (optional)
>> 
>> This function (linux_ptrace_attach_fail_reason_string) is called in
>> three places:
>> 
>> gdb/linux-nat.c:
>> 
>> 	      std::string reason
>> 		= linux_ptrace_attach_fail_reason_string (ptid, err);
>> 
>> 	      warning (_("Cannot attach to lwp %d: %s"),
>> 		       lwpid, reason.c_str ());
>> 
>> gdb/gdbserver/linux-low.c:
>> 
>> 	  std::string reason
>> 	    = linux_ptrace_attach_fail_reason_string (ptid, err);
>> 
>> 	  warning (_("Cannot attach to lwp %d: %s"), lwpid, reason.c_str ());
>> 
>> and
>> 
>>       std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
>>       error ("Cannot attach to process %ld: %s", pid, reason.c_str ());
>> 
>> 
>> It seems to me like these error messages are expecting a short string to
>> just append to their existing strings, so I didn't think it made much
>> sense to extend the ptrace error checking here as well.  That's why I
>> didn't extend linux_ptrace_attach_fail_reason_string to pass ERR down to
>> linux_ptrace_attach_fail_reason.
>
> OK, I think that's just a bit too messy.  Let's take a fresh look at the
> result:
>
>  /* Find all possible reasons we could fail to attach PID and return
>     these as a string.  An empty string is returned if we didn't find
>     any reason.  If ERR is EACCES or EPERM, we also add a warning about
>     possible restrictions to use ptrace.  */
>  extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err = -1);
>
>  /* Find all possible reasons we could have failed to attach to PTID
>     and return them as a string.  ERR is the error PTRACE_ATTACH failed
>     with (an errno).  */
>  extern std::string linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err);
>
> Both the functions have the exact same prototype (same parameters, same return).
> And since they both return std::string, why is linux_ptrace_attach_fail_reason_string
> called "xxx_string"?  From the function names alone, I'd think
> linux_ptrace_attach_fail_reason_string returned a string, while
> linux_ptrace_attach_fail_reason returned something else, or printed
> directly.  But that's not the case.
>
> So linux_ptrace_attach_fail_reason_string is used when we're attaching
> to an lwp, other than the thread group leader (the main thread).
> IMO, it'd be better to rename that function accordingly, and update
> its comments accordingly too.
>
> Then, it becomes clearer that linux_ptrace_attach_fail_reason is to
> be used in the initial process attach, in which case we want to
> check the ptrace permissions.
>
> Also, we can factor out things such that we don't need the
> default parameter.  I tend to prefer avoiding mode-setting default
> parameters, preferring differently named functions, because default
> parameters make it harder to grep around the sources, distinguishing the
> cases where you passed an argument or not.  In this case,
> the linux_ptrace_attach_fail_reason / linux_ptrace_attach_fail_reason_string
> functions serve different purposes, so decoupling them via not using
> default parameters is better, IMO, it avoids missing some conversion
> in some spot.
>
> Which is exactly what happened.  Here's a patch implementing
> the idea.  Notice how gdbserver/linux-low.c:linux_attach
> has a spot where it is currently using linux_ptrace_attach_fail_reason_string
> after attaching to the main thread fails.  That spot should be including
> the new ptrace restrictions fail reasons, but it wasn't due to use of
> linux_ptrace_attach_fail_reason_string:
>
>  @@ -1202,7 +1202,7 @@ linux_attach (unsigned long pid)
>      {
>        remove_process (proc);
>  
>  -      std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
>  +      std::string reason = linux_ptrace_attach_fail_reason (pid, err);
>        error ("Cannot attach to process %ld: %s", pid, reason.c_str ());
>      }
>
> I haven't checked whether the resulting error message looks good as is,
> or whether we need to tweak that error call in addition.  Can you take it
> from here?
>
> Here's the patch on top of yours.

Sure, thanks for the patch.  I've tried it here and it seems fine.  I'll
submit it integrated into v3 soon.

>
> From a3e8744c1ed7fa8c702009fe3caf8578bb1785ea Mon Sep 17 00:00:00 2001
> From: Pedro Alves <palves@redhat.com>
> Date: Fri, 30 Aug 2019 13:15:12 +0100
> Subject: [PATCH] linux_ptrace_attach_fail_reason
>
> ---
>  gdb/gdbserver/linux-low.c |  4 ++--
>  gdb/gdbserver/thread-db.c |  2 +-
>  gdb/linux-nat.c           |  2 +-
>  gdb/nat/linux-ptrace.c    | 24 ++++++++++++++++++------
>  gdb/nat/linux-ptrace.h    | 10 ++++++----
>  5 files changed, 28 insertions(+), 14 deletions(-)
>
> diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
> index 1e0a5cbf54d..45cf43ad9ed 100644
> --- a/gdb/gdbserver/linux-low.c
> +++ b/gdb/gdbserver/linux-low.c
> @@ -1170,7 +1170,7 @@ attach_proc_task_lwp_callback (ptid_t ptid)
>        else if (err != 0)
>  	{
>  	  std::string reason
> -	    = linux_ptrace_attach_fail_reason_string (ptid, err);
> +	    = linux_ptrace_attach_fail_reason_lwp (ptid, err);
>  
>  	  warning (_("Cannot attach to lwp %d: %s"), lwpid, reason.c_str ());
>  	}
> @@ -1202,7 +1202,7 @@ linux_attach (unsigned long pid)
>      {
>        remove_process (proc);
>  
> -      std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
> +      std::string reason = linux_ptrace_attach_fail_reason (pid, err);
>        error ("Cannot attach to process %ld: %s", pid, reason.c_str ());
>      }
>  
> diff --git a/gdb/gdbserver/thread-db.c b/gdb/gdbserver/thread-db.c
> index b2791b0513a..cfba05977e6 100644
> --- a/gdb/gdbserver/thread-db.c
> +++ b/gdb/gdbserver/thread-db.c
> @@ -225,7 +225,7 @@ attach_thread (const td_thrhandle_t *th_p, td_thrinfo_t *ti_p)
>    err = linux_attach_lwp (ptid);
>    if (err != 0)
>      {
> -      std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
> +      std::string reason = linux_ptrace_attach_fail_reason_lwp (ptid, err);
>  
>        warning ("Could not attach to thread %ld (LWP %d): %s",
>  	       (unsigned long) ti_p->ti_tid, ti_p->ti_lid, reason.c_str ());
> diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
> index b5a9eaf72e3..22cb55e6dcc 100644
> --- a/gdb/linux-nat.c
> +++ b/gdb/linux-nat.c
> @@ -1136,7 +1136,7 @@ attach_proc_task_lwp_callback (ptid_t ptid)
>  	  else
>  	    {
>  	      std::string reason
> -		= linux_ptrace_attach_fail_reason_string (ptid, err);
> +		= linux_ptrace_attach_fail_reason_lwp (ptid, err);
>  
>  	      warning (_("Cannot attach to lwp %d: %s"),
>  		       lwpid, reason.c_str ());
> diff --git a/gdb/nat/linux-ptrace.c b/gdb/nat/linux-ptrace.c
> index 599d9cfb550..2201a24d812 100644
> --- a/gdb/nat/linux-ptrace.c
> +++ b/gdb/nat/linux-ptrace.c
> @@ -92,10 +92,13 @@ for more details.\n");
>    return ret;
>  }
>  
> -/* See declaration in linux-ptrace.h.  */
> +/* Find all possible reasons we could fail to attach PID and return
> +   these as a string.  An empty string is returned if we didn't find
> +   any reason.  Helper for linux_ptrace_attach_fail_reason and
> +   linux_ptrace_attach_fail_reason_lwp.  */
>  
> -std::string
> -linux_ptrace_attach_fail_reason (pid_t pid, int err)
> +static std::string
> +linux_ptrace_attach_fail_reason_1 (pid_t pid)
>  {
>    pid_t tracerpid = linux_proc_get_tracerpid_nowarn (pid);
>    std::string result;
> @@ -111,8 +114,17 @@ linux_ptrace_attach_fail_reason (pid_t pid, int err)
>  		      "terminated"),
>  		    (int) pid);
>  
> -  std::string ptrace_restrict = linux_ptrace_restricted_fail_reason (err);
> +  return result;
> +}
>  
> +/* See declaration in linux-ptrace.h.  */
> +
> +std::string
> +linux_ptrace_attach_fail_reason (pid_t pid, int err)
> +{
> +  std::string result = linux_ptrace_attach_fail_reason_1 (pid);
> +
> +  std::string ptrace_restrict = linux_ptrace_restricted_fail_reason (err);
>    if (!ptrace_restrict.empty ())
>      result += "\n" + ptrace_restrict;
>  
> @@ -122,10 +134,10 @@ linux_ptrace_attach_fail_reason (pid_t pid, int err)
>  /* See linux-ptrace.h.  */
>  
>  std::string
> -linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err)
> +linux_ptrace_attach_fail_reason_lwp (ptid_t ptid, int err)
>  {
>    long lwpid = ptid.lwp ();
> -  std::string reason = linux_ptrace_attach_fail_reason (lwpid);
> +  std::string reason = linux_ptrace_attach_fail_reason_1 (lwpid);
>  
>    if (!reason.empty ())
>      return string_printf ("%s (%d), %s", safe_strerror (err), err,
> diff --git a/gdb/nat/linux-ptrace.h b/gdb/nat/linux-ptrace.h
> index 04ada53bf69..94c9ba48ba5 100644
> --- a/gdb/nat/linux-ptrace.h
> +++ b/gdb/nat/linux-ptrace.h
> @@ -180,12 +180,14 @@ struct buffer;
>     these as a string.  An empty string is returned if we didn't find
>     any reason.  If ERR is EACCES or EPERM, we also add a warning about
>     possible restrictions to use ptrace.  */
> -extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err = -1);
> +extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err);
>  
> -/* Find all possible reasons we could have failed to attach to PTID
> +/* Find all possible reasons we could have failed to attach to LWPID
>     and return them as a string.  ERR is the error PTRACE_ATTACH failed
> -   with (an errno).  */
> -extern std::string linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err);
> +   with (an errno).  Unlike linux_ptrace_attach_fail_reason, this
> +   function should be used when attaching to an LWP other than the
> +   leader; it does not warn about ptrace restrictions.  */
> +extern std::string linux_ptrace_attach_fail_reason_lwp (ptid_t lwpid, int err);
>  
>  /* When the call to 'ptrace (PTRACE_TRACEME...' fails, and we have
>     already forked, this function can be called in order to try to
> -- 
> 2.14.5
  
Sergio Durigan Junior Sept. 4, 2019, 7:31 p.m. UTC | #2
On Friday, August 30 2019, Pedro Alves wrote:

> Which is exactly what happened.  Here's a patch implementing
> the idea.  Notice how gdbserver/linux-low.c:linux_attach
> has a spot where it is currently using linux_ptrace_attach_fail_reason_string
> after attaching to the main thread fails.  That spot should be including
> the new ptrace restrictions fail reasons, but it wasn't due to use of
> linux_ptrace_attach_fail_reason_string:
>
>  @@ -1202,7 +1202,7 @@ linux_attach (unsigned long pid)
>      {
>        remove_process (proc);
>  
>  -      std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
>  +      std::string reason = linux_ptrace_attach_fail_reason (pid, err);
>        error ("Cannot attach to process %ld: %s", pid, reason.c_str ());
>      }
>
> I haven't checked whether the resulting error message looks good as is,
> or whether we need to tweak that error call in addition.  Can you take it
> from here?

I forgot to say, but the way my patch works now prevents the code path
above to be executed.  My patch catches the ptrace failure very early in
gdbserver initialization the process, when linux_child_function is
called during the execution of linux_check_ptrace_features.  Since
linux_child_function will try to perform a PTRACE_TRACEME (and fail if
there are any restrictions in place), gdbserver will error out before it
even tries to spawn/attach.

Nevertheless, during my tests I bypassed linux_check_ptrace_features and
confirmed that the error message from linux_attach (above) is correctly
displayed:

  [sergio@fedora-rawhide build]$ sleep 120 &
  [5] 2732388
  [sergio@fedora-rawhide build]$ ./gdb/gdbserver/gdbserver :9911 --attach 2732388
  gdbserver: linux_ptrace_test_ret_to_nx: Cannot PTRACE_TRACEME: Operation not permitted
  gdbserver: linux_ptrace_test_ret_to_nx: status 256 is not WIFSTOPPED!
  gdbserver: linux_ptrace_test_ret_to_nx: failed to kill child pid 2732391 No such process
  Cannot attach to process 2732388: 

  The SELinux 'deny_ptrace' option is enabled and preventing GDB
  from using 'ptrace'.  You can disable it by executing (as root):

    setsebool deny_ptrace off

  The Linux kernel's Yama ptrace scope is in effect, which can prevent
  GDB from using 'ptrace'.  You can disable it by executing (as root):

    echo 0 > /proc/sys/kernel/yama/ptrace_scope

  Exiting

Thanks,
  
Pedro Alves Sept. 4, 2019, 7:58 p.m. UTC | #3
On 9/4/19 8:31 PM, Sergio Durigan Junior wrote:
> I forgot to say, but the way my patch works now prevents the code path
> above to be executed.  My patch catches the ptrace failure very early in
> gdbserver initialization the process, when linux_child_function is
> called during the execution of linux_check_ptrace_features.  Since
> linux_child_function will try to perform a PTRACE_TRACEME (and fail if
> there are any restrictions in place), gdbserver will error out before it
> even tries to spawn/attach.
> 
> Nevertheless, during my tests I bypassed linux_check_ptrace_features and
> confirmed that the error message from linux_attach (above) is correctly
> displayed:
> 
>   [sergio@fedora-rawhide build]$ sleep 120 &
>   [5] 2732388
>   [sergio@fedora-rawhide build]$ ./gdb/gdbserver/gdbserver :9911 --attach 2732388
>   gdbserver: linux_ptrace_test_ret_to_nx: Cannot PTRACE_TRACEME: Operation not permitted
>   gdbserver: linux_ptrace_test_ret_to_nx: status 256 is not WIFSTOPPED!
>   gdbserver: linux_ptrace_test_ret_to_nx: failed to kill child pid 2732391 No such process
>   Cannot attach to process 2732388: 
> 
>   The SELinux 'deny_ptrace' option is enabled and preventing GDB
>   from using 'ptrace'.  You can disable it by executing (as root):
> 
>     setsebool deny_ptrace off
> 
>   The Linux kernel's Yama ptrace scope is in effect, which can prevent
>   GDB from using 'ptrace'.  You can disable it by executing (as root):
> 
>     echo 0 > /proc/sys/kernel/yama/ptrace_scope
> 
>   Exiting

So if you _don't_ hack linux_check_ptrace_features, what does the user
see, with either

        gdbserver [OPTIONS] --attach COMM PID

or extended-remote + "(gdb) attach" ?

Thanks,
Pedro Alves
  
Sergio Durigan Junior Sept. 4, 2019, 8:21 p.m. UTC | #4
On Wednesday, September 04 2019, Pedro Alves wrote:

> On 9/4/19 8:31 PM, Sergio Durigan Junior wrote:
>> I forgot to say, but the way my patch works now prevents the code path
>> above to be executed.  My patch catches the ptrace failure very early in
>> gdbserver initialization the process, when linux_child_function is
>> called during the execution of linux_check_ptrace_features.  Since
>> linux_child_function will try to perform a PTRACE_TRACEME (and fail if
>> there are any restrictions in place), gdbserver will error out before it
>> even tries to spawn/attach.
>> 
>> Nevertheless, during my tests I bypassed linux_check_ptrace_features and
>> confirmed that the error message from linux_attach (above) is correctly
>> displayed:
>> 
>>   [sergio@fedora-rawhide build]$ sleep 120 &
>>   [5] 2732388
>>   [sergio@fedora-rawhide build]$ ./gdb/gdbserver/gdbserver :9911 --attach 2732388
>>   gdbserver: linux_ptrace_test_ret_to_nx: Cannot PTRACE_TRACEME: Operation not permitted
>>   gdbserver: linux_ptrace_test_ret_to_nx: status 256 is not WIFSTOPPED!
>>   gdbserver: linux_ptrace_test_ret_to_nx: failed to kill child pid 2732391 No such process
>>   Cannot attach to process 2732388: 
>> 
>>   The SELinux 'deny_ptrace' option is enabled and preventing GDB
>>   from using 'ptrace'.  You can disable it by executing (as root):
>> 
>>     setsebool deny_ptrace off
>> 
>>   The Linux kernel's Yama ptrace scope is in effect, which can prevent
>>   GDB from using 'ptrace'.  You can disable it by executing (as root):
>> 
>>     echo 0 > /proc/sys/kernel/yama/ptrace_scope
>> 
>>   Exiting
>
> So if you _don't_ hack linux_check_ptrace_features, what does the user
> see, with either
>
>         gdbserver [OPTIONS] --attach COMM PID

Basically the same thing:

  [sergio@fedora-rawhide build]$ ./gdb/gdbserver/gdbserver :9911 --attach 2733600
  gdbserver: linux_ptrace_test_ret_to_nx: Cannot PTRACE_TRACEME: Operation not permitted
  gdbserver: linux_ptrace_test_ret_to_nx: status 256 is not WIFSTOPPED!
  gdbserver: linux_ptrace_test_ret_to_nx: failed to kill child pid 2733602 No such process
  gdbserver: Could not trace the inferior process.
  gdbserver: ptrace: Operation not permitted
  The SELinux 'deny_ptrace' option is enabled and preventing GDB
  from using 'ptrace'.  You can disable it by executing (as root):

    setsebool deny_ptrace off

  The Linux kernel's Yama ptrace scope is in effect, which can prevent
  GDB from using 'ptrace'.  You can disable it by executing (as root):

    echo 0 > /proc/sys/kernel/yama/ptrace_scope

  linux_check_ptrace_features: waitpid: unexpected status 32512.
  Exiting

The only difference is that we don't see the "Cannot attach to process
PID:" message.

> or extended-remote + "(gdb) attach" ?

I'm trying to come up with a way to test this.  The only way would be to
make gdbserver successfully start so that we could perform an
extended-remote connection from GDB.  However, if gdbserver starts
without problems, this means that the ptrace check succeeded and that
there are no ptrace restrictions in place.  Therefore, an "attach" would
succeed as well.  Which means that even if we're running GDB in a
ptrace-restricted system, things will go OK as long as gdbserver is not
restricted.  In this case, we wouldn't see any error messages IIUC.

Thanks,
  
Pedro Alves Sept. 4, 2019, 8:35 p.m. UTC | #5
On 9/4/19 9:21 PM, Sergio Durigan Junior wrote:
>> or extended-remote + "(gdb) attach" ?
> I'm trying to come up with a way to test this.  The only way would be to
> make gdbserver successfully start so that we could perform an
> extended-remote connection from GDB.  However, if gdbserver starts
> without problems, this means that the ptrace check succeeded and that
> there are no ptrace restrictions in place.  Therefore, an "attach" would
> succeed as well.  Which means that even if we're running GDB in a
> ptrace-restricted system, things will go OK as long as gdbserver is not
> restricted.  In this case, we wouldn't see any error messages IIUC.

So

 $ gdbserver --multi :9999

exits with error immediately?

You could start gdbserver with the restrictions off (like a long
lived daemon), and then while gdbserver is running enable
restrictions, I suppose.

Thanks,
Pedro Alves
  
Sergio Durigan Junior Sept. 4, 2019, 8:56 p.m. UTC | #6
On Wednesday, September 04 2019, Pedro Alves wrote:

> On 9/4/19 9:21 PM, Sergio Durigan Junior wrote:
>>> or extended-remote + "(gdb) attach" ?
>> I'm trying to come up with a way to test this.  The only way would be to
>> make gdbserver successfully start so that we could perform an
>> extended-remote connection from GDB.  However, if gdbserver starts
>> without problems, this means that the ptrace check succeeded and that
>> there are no ptrace restrictions in place.  Therefore, an "attach" would
>> succeed as well.  Which means that even if we're running GDB in a
>> ptrace-restricted system, things will go OK as long as gdbserver is not
>> restricted.  In this case, we wouldn't see any error messages IIUC.
>
> So
>
>  $ gdbserver --multi :9999
>
> exits with error immediately?

Yeah.

> You could start gdbserver with the restrictions off (like a long
> lived daemon), and then while gdbserver is running enable
> restrictions, I suppose.

Ah, right, I think that would also work.

Thanks,
  
Pedro Alves Sept. 4, 2019, 9:23 p.m. UTC | #7
On 9/4/19 9:56 PM, Sergio Durigan Junior wrote:
>> You could start gdbserver with the restrictions off (like a long
>> lived daemon), and then while gdbserver is running enable
>> restrictions, I suppose.
> Ah, right, I think that would also work.

I'm interested in knowing whether the error percolates to
the gdb side in a reasonable form.

Thanks,
Pedro Alves
  
Sergio Durigan Junior Sept. 4, 2019, 9:36 p.m. UTC | #8
On Wednesday, September 04 2019, Pedro Alves wrote:

> On 9/4/19 9:56 PM, Sergio Durigan Junior wrote:
>>> You could start gdbserver with the restrictions off (like a long
>>> lived daemon), and then while gdbserver is running enable
>>> restrictions, I suppose.
>> Ah, right, I think that would also work.
>
> I'm interested in knowing whether the error percolates to
> the gdb side in a reasonable form.

The attach error is displayed by GDB, but the information about ptrace
restrictions is not.

This is what I see on GDB:

  (gdb) attach 32378
  Attaching to process 32378
  Attaching to process 32378 failed

And this is what I see on gdbserver:

  gdbserver: Cannot attach to process 32378: Permission denied (13), the SELinux boolean 'deny_ptrace' is enabled, you can disable this process attach protection by: (gdb) shell sudo setsebool deny_ptrace=0


I think there's something wrong with the terminal settings because
newlines aren't being respected here.  But the message is still there,
and the user can still understand it, I think.

Ideally we'd have GDB also display the ptrace restriction warning, but I
think that if the user can see that the attach failed, she will likely
look at what happened with gdbserver, and will see the error there.  If
we were to show the message on GDB, we'd also have to rewrite it in a
way that explains that the command needs to be run at the target, which
may confuse things.

Thanks,
  
Pedro Alves Sept. 5, 2019, 12:19 p.m. UTC | #9
On 9/4/19 10:36 PM, Sergio Durigan Junior wrote:
> On Wednesday, September 04 2019, Pedro Alves wrote:
> 
>> On 9/4/19 9:56 PM, Sergio Durigan Junior wrote:
>>>> You could start gdbserver with the restrictions off (like a long
>>>> lived daemon), and then while gdbserver is running enable
>>>> restrictions, I suppose.
>>> Ah, right, I think that would also work.
>>
>> I'm interested in knowing whether the error percolates to
>> the gdb side in a reasonable form.
> 
> The attach error is displayed by GDB, but the information about ptrace
> restrictions is not.
> 
> This is what I see on GDB:
> 
>   (gdb) attach 32378
>   Attaching to process 32378
>   Attaching to process 32378 failed
> 
> And this is what I see on gdbserver:
> 
>   gdbserver: Cannot attach to process 32378: Permission denied (13), the SELinux boolean 'deny_ptrace' is enabled, you can disable this process attach protection by: (gdb) shell sudo setsebool deny_ptrace=0
> 
> 
> I think there's something wrong with the terminal settings because
> newlines aren't being respected here.  But the message is still there,
> and the user can still understand it, I think.

Odd.  We should fix that...

But this made me realize something.  Here:

  static int
  linux_child_function (void *child_stack)
  {
 -  ptrace (PTRACE_TRACEME, 0, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0);
 +  if (ptrace (PTRACE_TRACEME, 0, (PTRACE_TYPE_ARG3) 0,
 +             (PTRACE_TYPE_ARG4) 0) != 0)
 +    trace_start_error_with_name ("ptrace",
 +                                linux_ptrace_me_fail_reason (errno).c_str ());
 +

This is code that is run by the fork child.  Recall that we 
had introduced trace_start_error_with_name instead of using
perror_with_name / error, because between after fork and before
exec, we can only safely use async-signal-safe functions.

But that linux_ptrace_me_fail_reason call is doing a lot of
non-async-signal safe things...

I think the right approach would be to make the child pass back the errno
value to the parent, and then have the parent, do the dlopens, the
std::string/malloc uses, etc. and print the error messages.

The traditional way to pass data around like this is via a pipe.
darwin-nat.c uses a pipe around fork, see ptrace_fds, though for a
different reason.

> Ideally we'd have GDB also display the ptrace restriction warning, but I
> think that if the user can see that the attach failed, she will likely
> look at what happened with gdbserver, and will see the error there.  

I don't think that assuming that the user has easy/convenient access
to gdbserver's terminal is a great assumption.  Consider IDEs automating
things, or gdbserver really being used as a service in a container, etc.

Note that at least for Yama, you can have ptrace allow PTRACE_ME, while
PTRACE_ATTACH gets refused.  So it's not guaranteed that gdbserver would
exit out early on start up, meaning, I think that handling this
gdbserver + "(gdb) attach" error case isn't being pedantic.

> If
> we were to show the message on GDB, we'd also have to rewrite it in a
> way that explains that the command needs to be run at the target, which
> may confuse things.
Or it may not confuse things?  I don't think that should prevent
improving things.  Surely we can find some way to express the message
without causing confusion.  I'm not sure it does, but if it does,
maybe we can just append a "on the target system" somewhere?

Thanks,
Pedro Alves
  
Sergio Durigan Junior Sept. 5, 2019, 5:58 p.m. UTC | #10
On Thursday, September 05 2019, Pedro Alves wrote:

> On 9/4/19 10:36 PM, Sergio Durigan Junior wrote:
>> On Wednesday, September 04 2019, Pedro Alves wrote:
>> 
>>> On 9/4/19 9:56 PM, Sergio Durigan Junior wrote:
>>>>> You could start gdbserver with the restrictions off (like a long
>>>>> lived daemon), and then while gdbserver is running enable
>>>>> restrictions, I suppose.
>>>> Ah, right, I think that would also work.
>>>
>>> I'm interested in knowing whether the error percolates to
>>> the gdb side in a reasonable form.
>> 
>> The attach error is displayed by GDB, but the information about ptrace
>> restrictions is not.
>> 
>> This is what I see on GDB:
>> 
>>   (gdb) attach 32378
>>   Attaching to process 32378
>>   Attaching to process 32378 failed
>> 
>> And this is what I see on gdbserver:
>> 
>>   gdbserver: Cannot attach to process 32378: Permission denied (13),
>> the SELinux boolean 'deny_ptrace' is enabled, you can disable this
>> process attach protection by: (gdb) shell sudo setsebool
>> deny_ptrace=0
>> 
>> 
>> I think there's something wrong with the terminal settings because
>> newlines aren't being respected here.  But the message is still there,
>> and the user can still understand it, I think.
>
> Odd.  We should fix that...

OK, then.

> But this made me realize something.  Here:
>
>   static int
>   linux_child_function (void *child_stack)
>   {
>  -  ptrace (PTRACE_TRACEME, 0, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0);
>  +  if (ptrace (PTRACE_TRACEME, 0, (PTRACE_TYPE_ARG3) 0,
>  +             (PTRACE_TYPE_ARG4) 0) != 0)
>  +    trace_start_error_with_name ("ptrace",
>  +                                linux_ptrace_me_fail_reason (errno).c_str ());
>  +
>
> This is code that is run by the fork child.  Recall that we 
> had introduced trace_start_error_with_name instead of using
> perror_with_name / error, because between after fork and before
> exec, we can only safely use async-signal-safe functions.

Ah, I didn't remember the exact reason, thanks for refreshing my
memory.

> But that linux_ptrace_me_fail_reason call is doing a lot of
> non-async-signal safe things...

Fair enough.

> I think the right approach would be to make the child pass back the errno
> value to the parent, and then have the parent, do the dlopens, the
> std::string/malloc uses, etc. and print the error messages.
>
> The traditional way to pass data around like this is via a pipe.
> darwin-nat.c uses a pipe around fork, see ptrace_fds, though for a
> different reason.

Funny, at first (when I was trying out the patch) I noticed that this
ptrace call was failing, but thought that, because it is being called
inside the child, it would be good if we could pass errno back to the
parent, as you said.  But then I just decided to go the easy way and
call trace_start_error_with_name directly there.

I'll look into passing the errno back to the parent via pipes, as
suggested.

>> Ideally we'd have GDB also display the ptrace restriction warning, but I
>> think that if the user can see that the attach failed, she will likely
>> look at what happened with gdbserver, and will see the error there.  
>
> I don't think that assuming that the user has easy/convenient access
> to gdbserver's terminal is a great assumption.  Consider IDEs automating
> things, or gdbserver really being used as a service in a container, etc.
>
> Note that at least for Yama, you can have ptrace allow PTRACE_ME, while
> PTRACE_ATTACH gets refused.  So it's not guaranteed that gdbserver would
> exit out early on start up, meaning, I think that handling this
> gdbserver + "(gdb) attach" error case isn't being pedantic.
>
>> If
>> we were to show the message on GDB, we'd also have to rewrite it in a
>> way that explains that the command needs to be run at the target, which
>> may confuse things.
> Or it may not confuse things?  I don't think that should prevent
> improving things.  Surely we can find some way to express the message
> without causing confusion.  I'm not sure it does, but if it does,
> maybe we can just append a "on the target system" somewhere?

"on the target system" can be confusing for newcomers.  But sure, I can
certainly find a way to write the warning in a non-confusing way.  I'm
just thinking about the extra work involved to implement this.  I'll
design something here and send a v4 when ready.

Thanks,
  

Patch

diff --git a/gdb/gdbserver/linux-low.c b/gdb/gdbserver/linux-low.c
index 1e0a5cbf54d..45cf43ad9ed 100644
--- a/gdb/gdbserver/linux-low.c
+++ b/gdb/gdbserver/linux-low.c
@@ -1170,7 +1170,7 @@  attach_proc_task_lwp_callback (ptid_t ptid)
       else if (err != 0)
 	{
 	  std::string reason
-	    = linux_ptrace_attach_fail_reason_string (ptid, err);
+	    = linux_ptrace_attach_fail_reason_lwp (ptid, err);
 
 	  warning (_("Cannot attach to lwp %d: %s"), lwpid, reason.c_str ());
 	}
@@ -1202,7 +1202,7 @@  linux_attach (unsigned long pid)
     {
       remove_process (proc);
 
-      std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
+      std::string reason = linux_ptrace_attach_fail_reason (pid, err);
       error ("Cannot attach to process %ld: %s", pid, reason.c_str ());
     }
 
diff --git a/gdb/gdbserver/thread-db.c b/gdb/gdbserver/thread-db.c
index b2791b0513a..cfba05977e6 100644
--- a/gdb/gdbserver/thread-db.c
+++ b/gdb/gdbserver/thread-db.c
@@ -225,7 +225,7 @@  attach_thread (const td_thrhandle_t *th_p, td_thrinfo_t *ti_p)
   err = linux_attach_lwp (ptid);
   if (err != 0)
     {
-      std::string reason = linux_ptrace_attach_fail_reason_string (ptid, err);
+      std::string reason = linux_ptrace_attach_fail_reason_lwp (ptid, err);
 
       warning ("Could not attach to thread %ld (LWP %d): %s",
 	       (unsigned long) ti_p->ti_tid, ti_p->ti_lid, reason.c_str ());
diff --git a/gdb/linux-nat.c b/gdb/linux-nat.c
index b5a9eaf72e3..22cb55e6dcc 100644
--- a/gdb/linux-nat.c
+++ b/gdb/linux-nat.c
@@ -1136,7 +1136,7 @@  attach_proc_task_lwp_callback (ptid_t ptid)
 	  else
 	    {
 	      std::string reason
-		= linux_ptrace_attach_fail_reason_string (ptid, err);
+		= linux_ptrace_attach_fail_reason_lwp (ptid, err);
 
 	      warning (_("Cannot attach to lwp %d: %s"),
 		       lwpid, reason.c_str ());
diff --git a/gdb/nat/linux-ptrace.c b/gdb/nat/linux-ptrace.c
index 599d9cfb550..2201a24d812 100644
--- a/gdb/nat/linux-ptrace.c
+++ b/gdb/nat/linux-ptrace.c
@@ -92,10 +92,13 @@  for more details.\n");
   return ret;
 }
 
-/* See declaration in linux-ptrace.h.  */
+/* Find all possible reasons we could fail to attach PID and return
+   these as a string.  An empty string is returned if we didn't find
+   any reason.  Helper for linux_ptrace_attach_fail_reason and
+   linux_ptrace_attach_fail_reason_lwp.  */
 
-std::string
-linux_ptrace_attach_fail_reason (pid_t pid, int err)
+static std::string
+linux_ptrace_attach_fail_reason_1 (pid_t pid)
 {
   pid_t tracerpid = linux_proc_get_tracerpid_nowarn (pid);
   std::string result;
@@ -111,8 +114,17 @@  linux_ptrace_attach_fail_reason (pid_t pid, int err)
 		      "terminated"),
 		    (int) pid);
 
-  std::string ptrace_restrict = linux_ptrace_restricted_fail_reason (err);
+  return result;
+}
 
+/* See declaration in linux-ptrace.h.  */
+
+std::string
+linux_ptrace_attach_fail_reason (pid_t pid, int err)
+{
+  std::string result = linux_ptrace_attach_fail_reason_1 (pid);
+
+  std::string ptrace_restrict = linux_ptrace_restricted_fail_reason (err);
   if (!ptrace_restrict.empty ())
     result += "\n" + ptrace_restrict;
 
@@ -122,10 +134,10 @@  linux_ptrace_attach_fail_reason (pid_t pid, int err)
 /* See linux-ptrace.h.  */
 
 std::string
-linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err)
+linux_ptrace_attach_fail_reason_lwp (ptid_t ptid, int err)
 {
   long lwpid = ptid.lwp ();
-  std::string reason = linux_ptrace_attach_fail_reason (lwpid);
+  std::string reason = linux_ptrace_attach_fail_reason_1 (lwpid);
 
   if (!reason.empty ())
     return string_printf ("%s (%d), %s", safe_strerror (err), err,
diff --git a/gdb/nat/linux-ptrace.h b/gdb/nat/linux-ptrace.h
index 04ada53bf69..94c9ba48ba5 100644
--- a/gdb/nat/linux-ptrace.h
+++ b/gdb/nat/linux-ptrace.h
@@ -180,12 +180,14 @@  struct buffer;
    these as a string.  An empty string is returned if we didn't find
    any reason.  If ERR is EACCES or EPERM, we also add a warning about
    possible restrictions to use ptrace.  */
-extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err = -1);
+extern std::string linux_ptrace_attach_fail_reason (pid_t pid, int err);
 
-/* Find all possible reasons we could have failed to attach to PTID
+/* Find all possible reasons we could have failed to attach to LWPID
    and return them as a string.  ERR is the error PTRACE_ATTACH failed
-   with (an errno).  */
-extern std::string linux_ptrace_attach_fail_reason_string (ptid_t ptid, int err);
+   with (an errno).  Unlike linux_ptrace_attach_fail_reason, this
+   function should be used when attaching to an LWP other than the
+   leader; it does not warn about ptrace restrictions.  */
+extern std::string linux_ptrace_attach_fail_reason_lwp (ptid_t lwpid, int err);
 
 /* When the call to 'ptrace (PTRACE_TRACEME...' fails, and we have
    already forked, this function can be called in order to try to