gdb: relax requirement for the map_failed stap probe to be present

Message ID daf50f6f4ee3d18635d72cdbe5af189b85f3010b.1669129770.git.aburgess@redhat.com
State New
Headers
Series gdb: relax requirement for the map_failed stap probe to be present |

Commit Message

Andrew Burgess Nov. 22, 2022, 3:09 p.m. UTC
  From glibc 2.35 and later, the "map_failed" stap probe is no longer
included in glibc.  The removal of the probe looks like an accident,
but it was caused by a glibc commit which meant that the "map_failed"
probe could no longer be reached; the compiler than helpfully
optimised out the probe.

In GDB, in solib-svr4.c, we have a list of probes that we look for
related to the shared library loading detection.  If any of these
probes are missing then GDB will fall back to the non-probe based
mechanism for detecting shared library loading.  The "map_failed"
probe is include in the list of required probes.

This means that on glibc 2.35 (or later) systems, GDB is going to
always fall back to the non-probes based mechanism for detecting
shared library loading.

I raised a glibc bug to discuss this issue:

  https://sourceware.org/bugzilla/show_bug.cgi?id=29818

But, whatever the ultimate decision from the glibc team, given there
are version of glibc in the wild without the "map_failed" probe, we
probably should update GDB to handle this situation.

The "map_failed" probe is already a little strange, very early
versions of glibc didn't include this probe, so, in some cases, if
this probe is missing GDB is happy to ignore it.  In this commit I
just expand this logic to make the "map_failed" probe fully optional.

With this commit in place, then, when using a glibc 2.35 or later
system, GDB will once again use the stap probes for shared library
detection.
---
 gdb/solib-svr4.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)


base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
  

Comments

Simon Marchi Nov. 22, 2022, 3:31 p.m. UTC | #1
On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
> From glibc 2.35 and later, the "map_failed" stap probe is no longer
> included in glibc.  The removal of the probe looks like an accident,
> but it was caused by a glibc commit which meant that the "map_failed"
> probe could no longer be reached; the compiler than helpfully
> optimised out the probe.
> 
> In GDB, in solib-svr4.c, we have a list of probes that we look for
> related to the shared library loading detection.  If any of these
> probes are missing then GDB will fall back to the non-probe based
> mechanism for detecting shared library loading.  The "map_failed"
> probe is include in the list of required probes.
> 
> This means that on glibc 2.35 (or later) systems, GDB is going to
> always fall back to the non-probes based mechanism for detecting
> shared library loading.
> 
> I raised a glibc bug to discuss this issue:
> 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=29818
> 
> But, whatever the ultimate decision from the glibc team, given there
> are version of glibc in the wild without the "map_failed" probe, we
> probably should update GDB to handle this situation.
> 
> The "map_failed" probe is already a little strange, very early
> versions of glibc didn't include this probe, so, in some cases, if
> this probe is missing GDB is happy to ignore it.  In this commit I
> just expand this logic to make the "map_failed" probe fully optional.
> 
> With this commit in place, then, when using a glibc 2.35 or later
> system, GDB will once again use the stap probes for shared library
> detection.
> ---
>  gdb/solib-svr4.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
> index 6acaf87960b..87cd06f251a 100644
> --- a/gdb/solib-svr4.c
> +++ b/gdb/solib-svr4.c
> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>  
>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>  
> -      /* The "map_failed" probe did not exist in early
> -	 versions of the probes code in which the probes'
> -	 names were prefixed with "rtld_".  */
> -      if (with_prefix && streq (name, "rtld_map_failed"))
> +      /* The "map_failed" probe did not exist in early versions of the
> +	 probes code in which the probes' names were prefixed with
> +	 "rtld_".
> +
> +	 Additionally, the "map_failed" probe was accidentally removed
> +	 from glibc 2.35 and later, when changes in glibc meant the probe
> +	 could no longer be reached.  In this case the probe name doesn't
> +	 have the "rtld_" prefix.  */
> +      if (streq (probe_info[i].name, "map_failed"))
>  	continue;
>  
>        /* Ensure at least one probe for the current name was found.  */
> 
> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
> -- 
> 2.25.4

Hi,

I looked at this separately, and this was one of the fixes I considered.

Another option was to make GDB not give up on the probes interface if
failing to look up a probe whose action is DO_NOTHING.  Probes with that
action are not used by GDB for solib bookkeeping, but can be used to
stop on solib events, with "set stop-on-solib-events".  I was just
worried if there was some cases where a probe would be missing, but the
corresponding event could be caught if using the original interface.  In
that case, using the probes interface would be a regression.  But it's
probably not worth wondering about.  If that happens it's just a bug
that needs to be fixed.  In the case we are looking at, if the
map_failed probe gets optimized out, then surely the corresponding call
to the r_brk function would also be optimized out.

So, the patch LGTM:

Approved-By: Simon Marchi <simon.marchi@efficios.com>

Simon
  
Lancelot SIX Nov. 22, 2022, 3:33 p.m. UTC | #2
On Tue, Nov 22, 2022 at 03:09:40PM +0000, Andrew Burgess via Gdb-patches wrote:
> From glibc 2.35 and later, the "map_failed" stap probe is no longer
> included in glibc.  The removal of the probe looks like an accident,
> but it was caused by a glibc commit which meant that the "map_failed"
> probe could no longer be reached; the compiler than helpfully
> optimised out the probe.
> 
> In GDB, in solib-svr4.c, we have a list of probes that we look for
> related to the shared library loading detection.  If any of these
> probes are missing then GDB will fall back to the non-probe based
> mechanism for detecting shared library loading.  The "map_failed"
> probe is include in the list of required probes.
> 
> This means that on glibc 2.35 (or later) systems, GDB is going to
> always fall back to the non-probes based mechanism for detecting
> shared library loading.
> 
> I raised a glibc bug to discuss this issue:
> 
>   https://sourceware.org/bugzilla/show_bug.cgi?id=29818
> 
> But, whatever the ultimate decision from the glibc team, given there
> are version of glibc in the wild without the "map_failed" probe, we
> probably should update GDB to handle this situation.
> 
> The "map_failed" probe is already a little strange, very early
> versions of glibc didn't include this probe, so, in some cases, if
> this probe is missing GDB is happy to ignore it.  In this commit I
> just expand this logic to make the "map_failed" probe fully optional.
> 
> With this commit in place, then, when using a glibc 2.35 or later
> system, GDB will once again use the stap probes for shared library
> detection.
> ---
>  gdb/solib-svr4.c | 13 +++++++++----
>  1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
> index 6acaf87960b..87cd06f251a 100644
> --- a/gdb/solib-svr4.c
> +++ b/gdb/solib-svr4.c
> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>  
>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>  
> -      /* The "map_failed" probe did not exist in early
> -	 versions of the probes code in which the probes'
> -	 names were prefixed with "rtld_".  */
> -      if (with_prefix && streq (name, "rtld_map_failed"))
> +      /* The "map_failed" probe did not exist in early versions of the
> +	 probes code in which the probes' names were prefixed with
> +	 "rtld_".
> +
> +	 Additionally, the "map_failed" probe was accidentally removed
> +	 from glibc 2.35 and later, when changes in glibc meant the probe
> +	 could no longer be reached.  In this case the probe name doesn't
> +	 have the "rtld_" prefix.  */
> +      if (streq (probe_info[i].name, "map_failed"))
>  	continue;

Hi,

Wouldn't this just disable the "map_failed" probe for everyone, even if
it is available for the current glibc?

Should this become:

  if (streq (probe_info[i].name, "map_failed") && probes[i].empty ())
    continue

to only ignore this probe if it is missing?

Best,
Lancelot.
>  
>        /* Ensure at least one probe for the current name was found.  */
> 
> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
> -- 
> 2.25.4
>
  
Andrew Burgess Nov. 24, 2022, 10:46 a.m. UTC | #3
Simon Marchi <simark@simark.ca> writes:

> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>> From glibc 2.35 and later, the "map_failed" stap probe is no longer
>> included in glibc.  The removal of the probe looks like an accident,
>> but it was caused by a glibc commit which meant that the "map_failed"
>> probe could no longer be reached; the compiler than helpfully
>> optimised out the probe.
>> 
>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>> related to the shared library loading detection.  If any of these
>> probes are missing then GDB will fall back to the non-probe based
>> mechanism for detecting shared library loading.  The "map_failed"
>> probe is include in the list of required probes.
>> 
>> This means that on glibc 2.35 (or later) systems, GDB is going to
>> always fall back to the non-probes based mechanism for detecting
>> shared library loading.
>> 
>> I raised a glibc bug to discuss this issue:
>> 
>>   https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>> 
>> But, whatever the ultimate decision from the glibc team, given there
>> are version of glibc in the wild without the "map_failed" probe, we
>> probably should update GDB to handle this situation.
>> 
>> The "map_failed" probe is already a little strange, very early
>> versions of glibc didn't include this probe, so, in some cases, if
>> this probe is missing GDB is happy to ignore it.  In this commit I
>> just expand this logic to make the "map_failed" probe fully optional.
>> 
>> With this commit in place, then, when using a glibc 2.35 or later
>> system, GDB will once again use the stap probes for shared library
>> detection.
>> ---
>>  gdb/solib-svr4.c | 13 +++++++++----
>>  1 file changed, 9 insertions(+), 4 deletions(-)
>> 
>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>> index 6acaf87960b..87cd06f251a 100644
>> --- a/gdb/solib-svr4.c
>> +++ b/gdb/solib-svr4.c
>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>  
>>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>  
>> -      /* The "map_failed" probe did not exist in early
>> -	 versions of the probes code in which the probes'
>> -	 names were prefixed with "rtld_".  */
>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>> +      /* The "map_failed" probe did not exist in early versions of the
>> +	 probes code in which the probes' names were prefixed with
>> +	 "rtld_".
>> +
>> +	 Additionally, the "map_failed" probe was accidentally removed
>> +	 from glibc 2.35 and later, when changes in glibc meant the probe
>> +	 could no longer be reached.  In this case the probe name doesn't
>> +	 have the "rtld_" prefix.  */
>> +      if (streq (probe_info[i].name, "map_failed"))
>>  	continue;
>>  
>>        /* Ensure at least one probe for the current name was found.  */
>> 
>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>> -- 
>> 2.25.4
>
> Hi,
>
> I looked at this separately, and this was one of the fixes I considered.
>
> Another option was to make GDB not give up on the probes interface if
> failing to look up a probe whose action is DO_NOTHING.  Probes with that
> action are not used by GDB for solib bookkeeping, but can be used to
> stop on solib events, with "set stop-on-solib-events".  I was just
> worried if there was some cases where a probe would be missing, but the
> corresponding event could be caught if using the original interface.  In
> that case, using the probes interface would be a regression.  But it's
> probably not worth wondering about.  If that happens it's just a bug
> that needs to be fixed.  In the case we are looking at, if the
> map_failed probe gets optimized out, then surely the corresponding call
> to the r_brk function would also be optimized out.

I also considered just ignoring any probe was (a) missing, and (b) had a
DO_NOTHING action.  The reason I didn't post this patch was because, at
the time, my thinking was: if we don't care about any probe with a
DO_NOTHING action, why even look for those probes, why not just remove
them from the list?

I think you've (partially) convinced me that the user might be
interested in seeing a stop at these probes even if GDB's action is
DO_NOTHING.

I say partially above because GDB doesn't really do anything to tell the
user which probe we stopped at, e.g. was is "init_start", "map_start",
"map_failed", etc.  The user might be able to figure it out from the
backtrace, but I still think it's not going to be trivial in all cases,
e.g. "map_start" and "map_failed" are both located in the same function,
so I think the user would need to lookup the probe address in the ELF,
then compare that to the stop address.  Not impossible, but, I suspect,
the complexity is an indication that users are not doing this much.
Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
probes.

However, I think there is enough of a justification there for keeping
the probes in the list, and just skipping any DO_NOTHING probes that
don't exist.

Below then, is an alternative patch.  I don't have a strong preference
between this one, and the original[1], but I thought I'd post this for
discussion.  If this is preferred then I can just merge this.

[1] I'll also post an update to the original patch shortly that
addresses Lancelot's feedback.

Thanks,
Andrew

---

commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Tue Nov 22 12:45:56 2022 +0000

    gdb: relax requirement for the map_failed stap probe to be present
    
    From glibc 2.35 and later, the "map_failed" stap probe is no longer
    included in glibc.  The removal of the probe looks like an accident,
    but it was caused by a glibc commit which meant that the "map_failed"
    probe could no longer be reached; the compiler than helpfully
    optimised out the probe.
    
    In GDB, in solib-svr4.c, we have a list of probes that we look for
    related to the shared library loading detection.  If any of these
    probes are missing then GDB will fall back to the non-probe based
    mechanism for detecting shared library loading.  The "map_failed"
    probe is include in the list of required probes.
    
    This means that on glibc 2.35 (or later) systems, GDB is going to
    always fall back to the non-probes based mechanism for detecting
    shared library loading.
    
    I raised a glibc bug to discuss this issue:
    
      https://sourceware.org/bugzilla/show_bug.cgi?id=29818
    
    But, whatever the ultimate decision from the glibc team, given there
    are version of glibc in the wild without the "map_failed" probe, we
    probably should update GDB to handle this situation.
    
    The "map_failed" probe is already a little strange, very early
    versions of glibc didn't include this probe, so, in some cases, if
    this probe is missing GDB is happy to ignore it.  This is fine, the
    action associated with this probe inside GDB is DO_NOTHING, this means
    the probe isn't actually required in order for GDB to correctly detect
    the loading of shared libraries.
    
    In this commit I propose changing the rules so that any probe whose
    action is DO_NOTHING, is optional.
    
    There is one possible downside to this change, and that concerns 'set
    stop-on-solib-events on'.  If a probe is removed from glibc, but the
    old style breakpoint based mechanism is still in place within glibc
    for that same event, then GDB will stop when using the old style
    non-probe based mechanism, but not when using the probes based
    mechanism.
    
    For the map_failed case this is not a problem, both the map_failed
    probe, and the call to the old style breakpoint location were
    optimised out, and so neither event (probes based, or breakpoint
    based) will trigger.  This would only become an issue if glibc removed
    a probe, but left the breakpoint in place (this would almost certainly
    be a bug in glibc).
    
    For now, I'm proposing that we just don't worry about this.  Because
    some probes have actions that are not DO_NOTHING, then we know the
    user will always seem _some_ stops when a shared library is
    loaded/unloaded, and (I'm guessing), in most cases, that's all they
    care about.  I figure when someone complains then we can figure out
    what the right solution is then.
    
    With this commit in place, then, when using a glibc 2.35 or later
    system, GDB will once again use the stap probes for shared library
    detection.

diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 6acaf87960b..10e446af908 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
 
       probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
 
-      /* The "map_failed" probe did not exist in early
-	 versions of the probes code in which the probes'
-	 names were prefixed with "rtld_".  */
-      if (with_prefix && streq (name, "rtld_map_failed"))
-	continue;
-
       /* Ensure at least one probe for the current name was found.  */
       if (probes[i].empty ())
-	return false;
+	{
+	  /* The "map_failed" probe did not exist in early versions of the
+	     probes code in which the probes' names were prefixed with
+	     "rtld_".
+
+	     Additionally, the "map_failed" probe was accidentally removed
+	     from glibc 2.35 and later, when changes in glibc meant the
+	     probe could no longer be reached, and the compiler optimized
+	     the probe away.  In this case the probe name doesn't have the
+	     "rtld_" prefix.
+
+	     To handle this, and give GDB as much flexibility as possible,
+	     we make the rule that, if a probe isn't required for the
+	     correct operation of GDB (i.e. it's action is DO_NOTHING),
+	     then we will still use the probes interface, even if that
+	     probe is missing.
+
+	     The only (possible) downside of this is that, if the user has
+	     'set stop-on-solib-events on' in effect, then they might get
+	     fewer events using the probes interface than with the classic
+	     non-probes interface.  */
+	  if (prove_info[i].action == DO_NOTHING)
+	    continue;
+	  else
+	    return false;
+	}
 
       /* Ensure probe arguments can be evaluated.  */
       for (probe *p : probes[i])
  
Andrew Burgess Nov. 24, 2022, 11:39 a.m. UTC | #4
Lancelot SIX <lsix@lancelotsix.com> writes:

> On Tue, Nov 22, 2022 at 03:09:40PM +0000, Andrew Burgess via Gdb-patches wrote:
>> From glibc 2.35 and later, the "map_failed" stap probe is no longer
>> included in glibc.  The removal of the probe looks like an accident,
>> but it was caused by a glibc commit which meant that the "map_failed"
>> probe could no longer be reached; the compiler than helpfully
>> optimised out the probe.
>> 
>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>> related to the shared library loading detection.  If any of these
>> probes are missing then GDB will fall back to the non-probe based
>> mechanism for detecting shared library loading.  The "map_failed"
>> probe is include in the list of required probes.
>> 
>> This means that on glibc 2.35 (or later) systems, GDB is going to
>> always fall back to the non-probes based mechanism for detecting
>> shared library loading.
>> 
>> I raised a glibc bug to discuss this issue:
>> 
>>   https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>> 
>> But, whatever the ultimate decision from the glibc team, given there
>> are version of glibc in the wild without the "map_failed" probe, we
>> probably should update GDB to handle this situation.
>> 
>> The "map_failed" probe is already a little strange, very early
>> versions of glibc didn't include this probe, so, in some cases, if
>> this probe is missing GDB is happy to ignore it.  In this commit I
>> just expand this logic to make the "map_failed" probe fully optional.
>> 
>> With this commit in place, then, when using a glibc 2.35 or later
>> system, GDB will once again use the stap probes for shared library
>> detection.
>> ---
>>  gdb/solib-svr4.c | 13 +++++++++----
>>  1 file changed, 9 insertions(+), 4 deletions(-)
>> 
>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>> index 6acaf87960b..87cd06f251a 100644
>> --- a/gdb/solib-svr4.c
>> +++ b/gdb/solib-svr4.c
>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>  
>>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>  
>> -      /* The "map_failed" probe did not exist in early
>> -	 versions of the probes code in which the probes'
>> -	 names were prefixed with "rtld_".  */
>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>> +      /* The "map_failed" probe did not exist in early versions of the
>> +	 probes code in which the probes' names were prefixed with
>> +	 "rtld_".
>> +
>> +	 Additionally, the "map_failed" probe was accidentally removed
>> +	 from glibc 2.35 and later, when changes in glibc meant the probe
>> +	 could no longer be reached.  In this case the probe name doesn't
>> +	 have the "rtld_" prefix.  */
>> +      if (streq (probe_info[i].name, "map_failed"))
>>  	continue;
>
> Hi,
>
> Wouldn't this just disable the "map_failed" probe for everyone, even if
> it is available for the current glibc?
>
> Should this become:
>
>   if (streq (probe_info[i].name, "map_failed") && probes[i].empty ())
>     continue
>
> to only ignore this probe if it is missing?

That's a very good point.

In the update below, I've moved the name check inside the immediately
following 'if (probes[i].empty)' block.  I've also added an assert that
the action for the probe is DO_NOTHING - if that action did ever change,
and we then skipped the probe we'd be in trouble.

However, I've also replied to Simon with an alternative approach to this
fix.  I'm not really sure which of the two solutions I prefer, so I'm
posting both to see if there's a preference in the group.

Thanks,
Andrew

---

commit 137177ba43c07aa3e33f5244fcc826e26f2474eb
Author: Andrew Burgess <aburgess@redhat.com>
Date:   Tue Nov 22 12:45:56 2022 +0000

    gdb: relax requirement for the map_failed stap probe to be present
    
    From glibc 2.35 and later, the "map_failed" stap probe is no longer
    included in glibc.  The removal of the probe looks like an accident,
    but it was caused by a glibc commit which meant that the "map_failed"
    probe could no longer be reached; the compiler than helpfully
    optimised out the probe.
    
    In GDB, in solib-svr4.c, we have a list of probes that we look for
    related to the shared library loading detection.  If any of these
    probes are missing then GDB will fall back to the non-probe based
    mechanism for detecting shared library loading.  The "map_failed"
    probe is include in the list of required probes.
    
    This means that on glibc 2.35 (or later) systems, GDB is going to
    always fall back to the non-probes based mechanism for detecting
    shared library loading.
    
    I raised a glibc bug to discuss this issue:
    
      https://sourceware.org/bugzilla/show_bug.cgi?id=29818
    
    But, whatever the ultimate decision from the glibc team, given there
    are version of glibc in the wild without the "map_failed" probe, we
    probably should update GDB to handle this situation.
    
    The "map_failed" probe is already a little strange, very early
    versions of glibc didn't include this probe, so, in some cases, if
    this probe is missing GDB is happy to ignore it.  In this commit I
    just expand this logic to make the "map_failed" probe fully optional.
    
    With this commit in place, then, when using a glibc 2.35 or later
    system, GDB will once again use the stap probes for shared library
    detection.

diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 6acaf87960b..c948808f853 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -2205,15 +2205,26 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
 
       probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
 
-      /* The "map_failed" probe did not exist in early
-	 versions of the probes code in which the probes'
-	 names were prefixed with "rtld_".  */
-      if (with_prefix && streq (name, "rtld_map_failed"))
-	continue;
-
       /* Ensure at least one probe for the current name was found.  */
       if (probes[i].empty ())
-	return false;
+	{
+	  /* The "map_failed" probe did not exist in early versions of the
+	     probes code in which the probes' names were prefixed with
+	     "rtld_".
+
+	     Additionally, the "map_failed" probe was accidentally removed
+	     from glibc 2.35 and later, when changes in glibc meant the
+	     probe could no longer be reached, and the compiler optimized
+	     the probe away.  In this case the probe name doesn't have the
+	     "rtld_" prefix.  */
+	  if (streq (probe_info[i].name, "map_failed"))
+	    {
+	      gdb_assert (probe_info[i].action == DO_NOTHING);
+	      continue;
+	    }
+
+	  return false;
+	}
 
       /* Ensure probe arguments can be evaluated.  */
       for (probe *p : probes[i])
  
Simon Marchi Nov. 24, 2022, 3:10 p.m. UTC | #5
On 11/24/22 05:46, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi <simark@simark.ca> writes:
> 
>> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>>> From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>> included in glibc.  The removal of the probe looks like an accident,
>>> but it was caused by a glibc commit which meant that the "map_failed"
>>> probe could no longer be reached; the compiler than helpfully
>>> optimised out the probe.
>>>
>>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>>> related to the shared library loading detection.  If any of these
>>> probes are missing then GDB will fall back to the non-probe based
>>> mechanism for detecting shared library loading.  The "map_failed"
>>> probe is include in the list of required probes.
>>>
>>> This means that on glibc 2.35 (or later) systems, GDB is going to
>>> always fall back to the non-probes based mechanism for detecting
>>> shared library loading.
>>>
>>> I raised a glibc bug to discuss this issue:
>>>
>>>   https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>
>>> But, whatever the ultimate decision from the glibc team, given there
>>> are version of glibc in the wild without the "map_failed" probe, we
>>> probably should update GDB to handle this situation.
>>>
>>> The "map_failed" probe is already a little strange, very early
>>> versions of glibc didn't include this probe, so, in some cases, if
>>> this probe is missing GDB is happy to ignore it.  In this commit I
>>> just expand this logic to make the "map_failed" probe fully optional.
>>>
>>> With this commit in place, then, when using a glibc 2.35 or later
>>> system, GDB will once again use the stap probes for shared library
>>> detection.
>>> ---
>>>  gdb/solib-svr4.c | 13 +++++++++----
>>>  1 file changed, 9 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>> index 6acaf87960b..87cd06f251a 100644
>>> --- a/gdb/solib-svr4.c
>>> +++ b/gdb/solib-svr4.c
>>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>  
>>>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>  
>>> -      /* The "map_failed" probe did not exist in early
>>> -	 versions of the probes code in which the probes'
>>> -	 names were prefixed with "rtld_".  */
>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>> +      /* The "map_failed" probe did not exist in early versions of the
>>> +	 probes code in which the probes' names were prefixed with
>>> +	 "rtld_".
>>> +
>>> +	 Additionally, the "map_failed" probe was accidentally removed
>>> +	 from glibc 2.35 and later, when changes in glibc meant the probe
>>> +	 could no longer be reached.  In this case the probe name doesn't
>>> +	 have the "rtld_" prefix.  */
>>> +      if (streq (probe_info[i].name, "map_failed"))
>>>  	continue;
>>>  
>>>        /* Ensure at least one probe for the current name was found.  */
>>>
>>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>>> -- 
>>> 2.25.4
>>
>> Hi,
>>
>> I looked at this separately, and this was one of the fixes I considered.
>>
>> Another option was to make GDB not give up on the probes interface if
>> failing to look up a probe whose action is DO_NOTHING.  Probes with that
>> action are not used by GDB for solib bookkeeping, but can be used to
>> stop on solib events, with "set stop-on-solib-events".  I was just
>> worried if there was some cases where a probe would be missing, but the
>> corresponding event could be caught if using the original interface.  In
>> that case, using the probes interface would be a regression.  But it's
>> probably not worth wondering about.  If that happens it's just a bug
>> that needs to be fixed.  In the case we are looking at, if the
>> map_failed probe gets optimized out, then surely the corresponding call
>> to the r_brk function would also be optimized out.
> 
> I also considered just ignoring any probe was (a) missing, and (b) had a
> DO_NOTHING action.  The reason I didn't post this patch was because, at
> the time, my thinking was: if we don't care about any probe with a
> DO_NOTHING action, why even look for those probes, why not just remove
> them from the list?
> 
> I think you've (partially) convinced me that the user might be
> interested in seeing a stop at these probes even if GDB's action is
> DO_NOTHING.
> 
> I say partially above because GDB doesn't really do anything to tell the
> user which probe we stopped at, e.g. was is "init_start", "map_start",
> "map_failed", etc.  The user might be able to figure it out from the
> backtrace, but I still think it's not going to be trivial in all cases,
> e.g. "map_start" and "map_failed" are both located in the same function,
> so I think the user would need to lookup the probe address in the ELF,
> then compare that to the stop address.  Not impossible, but, I suspect,
> the complexity is an indication that users are not doing this much.
> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
> probes.

That's a good point.

> However, I think there is enough of a justification there for keeping
> the probes in the list, and just skipping any DO_NOTHING probes that
> don't exist.
> 
> Below then, is an alternative patch.  I don't have a strong preference
> between this one, and the original[1], but I thought I'd post this for
> discussion.  If this is preferred then I can just merge this.

You convinced me that this is fine, I think I am really over-worrying.

I like this version of the patch.

> [1] I'll also post an update to the original patch shortly that
> addresses Lancelot's feedback.
> 
> Thanks,
> Andrew
> 
> ---
> 
> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Tue Nov 22 12:45:56 2022 +0000
> 
>     gdb: relax requirement for the map_failed stap probe to be present
>     
>     From glibc 2.35 and later, the "map_failed" stap probe is no longer
>     included in glibc.  The removal of the probe looks like an accident,
>     but it was caused by a glibc commit which meant that the "map_failed"
>     probe could no longer be reached; the compiler than helpfully
>     optimised out the probe.
>     
>     In GDB, in solib-svr4.c, we have a list of probes that we look for
>     related to the shared library loading detection.  If any of these
>     probes are missing then GDB will fall back to the non-probe based
>     mechanism for detecting shared library loading.  The "map_failed"
>     probe is include in the list of required probes.
>     
>     This means that on glibc 2.35 (or later) systems, GDB is going to
>     always fall back to the non-probes based mechanism for detecting
>     shared library loading.
>     
>     I raised a glibc bug to discuss this issue:
>     
>       https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>     
>     But, whatever the ultimate decision from the glibc team, given there
>     are version of glibc in the wild without the "map_failed" probe, we
>     probably should update GDB to handle this situation.
>     
>     The "map_failed" probe is already a little strange, very early
>     versions of glibc didn't include this probe, so, in some cases, if
>     this probe is missing GDB is happy to ignore it.  This is fine, the
>     action associated with this probe inside GDB is DO_NOTHING, this means
>     the probe isn't actually required in order for GDB to correctly detect
>     the loading of shared libraries.
>     
>     In this commit I propose changing the rules so that any probe whose
>     action is DO_NOTHING, is optional.
>     
>     There is one possible downside to this change, and that concerns 'set
>     stop-on-solib-events on'.  If a probe is removed from glibc, but the
>     old style breakpoint based mechanism is still in place within glibc
>     for that same event, then GDB will stop when using the old style
>     non-probe based mechanism, but not when using the probes based
>     mechanism.
>     
>     For the map_failed case this is not a problem, both the map_failed
>     probe, and the call to the old style breakpoint location were
>     optimised out, and so neither event (probes based, or breakpoint
>     based) will trigger.  This would only become an issue if glibc removed
>     a probe, but left the breakpoint in place (this would almost certainly
>     be a bug in glibc).
>     
>     For now, I'm proposing that we just don't worry about this.  Because
>     some probes have actions that are not DO_NOTHING, then we know the
>     user will always seem _some_ stops when a shared library is
>     loaded/unloaded, and (I'm guessing), in most cases, that's all they
>     care about.  I figure when someone complains then we can figure out
>     what the right solution is then.
>     
>     With this commit in place, then, when using a glibc 2.35 or later
>     system, GDB will once again use the stap probes for shared library
>     detection.
> 
> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
> index 6acaf87960b..10e446af908 100644
> --- a/gdb/solib-svr4.c
> +++ b/gdb/solib-svr4.c
> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>  
>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>  
> -      /* The "map_failed" probe did not exist in early
> -	 versions of the probes code in which the probes'
> -	 names were prefixed with "rtld_".  */
> -      if (with_prefix && streq (name, "rtld_map_failed"))
> -	continue;
> -
>        /* Ensure at least one probe for the current name was found.  */
>        if (probes[i].empty ())
> -	return false;
> +	{
> +	  /* The "map_failed" probe did not exist in early versions of the
> +	     probes code in which the probes' names were prefixed with
> +	     "rtld_".
> +
> +	     Additionally, the "map_failed" probe was accidentally removed
> +	     from glibc 2.35 and later, when changes in glibc meant the

Nit: I would maybe just say "removed in glibc 2.35" and not imply
anything about later versions, as we don't know whether it will be there
or not.

> +	     probe could no longer be reached, and the compiler optimized
> +	     the probe away.  In this case the probe name doesn't have the
> +	     "rtld_" prefix.
> +
> +	     To handle this, and give GDB as much flexibility as possible,
> +	     we make the rule that, if a probe isn't required for the
> +	     correct operation of GDB (i.e. it's action is DO_NOTHING),

it's -> its, I think

Simon
  
Lancelot SIX Nov. 24, 2022, 4:13 p.m. UTC | #6
On Thu, Nov 24, 2022 at 10:46:45AM +0000, Andrew Burgess via Gdb-patches wrote:
> Simon Marchi <simark@simark.ca> writes:
> 
> > On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
> >> From glibc 2.35 and later, the "map_failed" stap probe is no longer
> >> included in glibc.  The removal of the probe looks like an accident,
> >> but it was caused by a glibc commit which meant that the "map_failed"
> >> probe could no longer be reached; the compiler than helpfully
> >> optimised out the probe.
> >> 
> >> In GDB, in solib-svr4.c, we have a list of probes that we look for
> >> related to the shared library loading detection.  If any of these
> >> probes are missing then GDB will fall back to the non-probe based
> >> mechanism for detecting shared library loading.  The "map_failed"
> >> probe is include in the list of required probes.
> >> 
> >> This means that on glibc 2.35 (or later) systems, GDB is going to
> >> always fall back to the non-probes based mechanism for detecting
> >> shared library loading.
> >> 
> >> I raised a glibc bug to discuss this issue:
> >> 
> >>   https://sourceware.org/bugzilla/show_bug.cgi?id=29818
> >> 
> >> But, whatever the ultimate decision from the glibc team, given there
> >> are version of glibc in the wild without the "map_failed" probe, we
> >> probably should update GDB to handle this situation.
> >> 
> >> The "map_failed" probe is already a little strange, very early
> >> versions of glibc didn't include this probe, so, in some cases, if
> >> this probe is missing GDB is happy to ignore it.  In this commit I
> >> just expand this logic to make the "map_failed" probe fully optional.
> >> 
> >> With this commit in place, then, when using a glibc 2.35 or later
> >> system, GDB will once again use the stap probes for shared library
> >> detection.
> >> ---
> >>  gdb/solib-svr4.c | 13 +++++++++----
> >>  1 file changed, 9 insertions(+), 4 deletions(-)
> >> 
> >> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
> >> index 6acaf87960b..87cd06f251a 100644
> >> --- a/gdb/solib-svr4.c
> >> +++ b/gdb/solib-svr4.c
> >> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
> >>  
> >>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
> >>  
> >> -      /* The "map_failed" probe did not exist in early
> >> -	 versions of the probes code in which the probes'
> >> -	 names were prefixed with "rtld_".  */
> >> -      if (with_prefix && streq (name, "rtld_map_failed"))
> >> +      /* The "map_failed" probe did not exist in early versions of the
> >> +	 probes code in which the probes' names were prefixed with
> >> +	 "rtld_".
> >> +
> >> +	 Additionally, the "map_failed" probe was accidentally removed
> >> +	 from glibc 2.35 and later, when changes in glibc meant the probe
> >> +	 could no longer be reached.  In this case the probe name doesn't
> >> +	 have the "rtld_" prefix.  */
> >> +      if (streq (probe_info[i].name, "map_failed"))
> >>  	continue;
> >>  
> >>        /* Ensure at least one probe for the current name was found.  */
> >> 
> >> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
> >> -- 
> >> 2.25.4
> >
> > Hi,
> >
> > I looked at this separately, and this was one of the fixes I considered.
> >
> > Another option was to make GDB not give up on the probes interface if
> > failing to look up a probe whose action is DO_NOTHING.  Probes with that
> > action are not used by GDB for solib bookkeeping, but can be used to
> > stop on solib events, with "set stop-on-solib-events".  I was just
> > worried if there was some cases where a probe would be missing, but the
> > corresponding event could be caught if using the original interface.  In
> > that case, using the probes interface would be a regression.  But it's
> > probably not worth wondering about.  If that happens it's just a bug
> > that needs to be fixed.  In the case we are looking at, if the
> > map_failed probe gets optimized out, then surely the corresponding call
> > to the r_brk function would also be optimized out.
> 
> I also considered just ignoring any probe was (a) missing, and (b) had a
> DO_NOTHING action.  The reason I didn't post this patch was because, at
> the time, my thinking was: if we don't care about any probe with a
> DO_NOTHING action, why even look for those probes, why not just remove
> them from the list?
> 
> I think you've (partially) convinced me that the user might be
> interested in seeing a stop at these probes even if GDB's action is
> DO_NOTHING.
> 
> I say partially above because GDB doesn't really do anything to tell the
> user which probe we stopped at, e.g. was is "init_start", "map_start",
> "map_failed", etc.  The user might be able to figure it out from the
> backtrace, but I still think it's not going to be trivial in all cases,
> e.g. "map_start" and "map_failed" are both located in the same function,
> so I think the user would need to lookup the probe address in the ELF,
> then compare that to the stop address.  Not impossible, but, I suspect,
> the complexity is an indication that users are not doing this much.
> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
> probes.
> 
> However, I think there is enough of a justification there for keeping
> the probes in the list, and just skipping any DO_NOTHING probes that
> don't exist.
> 
> Below then, is an alternative patch.  I don't have a strong preference
> between this one, and the original[1], but I thought I'd post this for
> discussion.  If this is preferred then I can just merge this.
> 
> [1] I'll also post an update to the original patch shortly that
> addresses Lancelot's feedback.
> 
> Thanks,
> Andrew
> 
> ---
> 
> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Tue Nov 22 12:45:56 2022 +0000
> 
>     gdb: relax requirement for the map_failed stap probe to be present
>     
>     From glibc 2.35 and later, the "map_failed" stap probe is no longer
>     included in glibc.  The removal of the probe looks like an accident,
>     but it was caused by a glibc commit which meant that the "map_failed"
>     probe could no longer be reached; the compiler than helpfully
>     optimised out the probe.
>     
>     In GDB, in solib-svr4.c, we have a list of probes that we look for
>     related to the shared library loading detection.  If any of these
>     probes are missing then GDB will fall back to the non-probe based
>     mechanism for detecting shared library loading.  The "map_failed"
>     probe is include in the list of required probes.
>     
>     This means that on glibc 2.35 (or later) systems, GDB is going to
>     always fall back to the non-probes based mechanism for detecting
>     shared library loading.
>     
>     I raised a glibc bug to discuss this issue:
>     
>       https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>     
>     But, whatever the ultimate decision from the glibc team, given there
>     are version of glibc in the wild without the "map_failed" probe, we
>     probably should update GDB to handle this situation.
>     
>     The "map_failed" probe is already a little strange, very early
>     versions of glibc didn't include this probe, so, in some cases, if
>     this probe is missing GDB is happy to ignore it.  This is fine, the
>     action associated with this probe inside GDB is DO_NOTHING, this means
>     the probe isn't actually required in order for GDB to correctly detect
>     the loading of shared libraries.
>     
>     In this commit I propose changing the rules so that any probe whose
>     action is DO_NOTHING, is optional.
>     
>     There is one possible downside to this change, and that concerns 'set
>     stop-on-solib-events on'.  If a probe is removed from glibc, but the
>     old style breakpoint based mechanism is still in place within glibc
>     for that same event, then GDB will stop when using the old style
>     non-probe based mechanism, but not when using the probes based
>     mechanism.
>     
>     For the map_failed case this is not a problem, both the map_failed
>     probe, and the call to the old style breakpoint location were
>     optimised out, and so neither event (probes based, or breakpoint
>     based) will trigger.  This would only become an issue if glibc removed
>     a probe, but left the breakpoint in place (this would almost certainly
>     be a bug in glibc).
>     
>     For now, I'm proposing that we just don't worry about this.  Because
>     some probes have actions that are not DO_NOTHING, then we know the
>     user will always seem _some_ stops when a shared library is
>     loaded/unloaded, and (I'm guessing), in most cases, that's all they
>     care about.  I figure when someone complains then we can figure out
>     what the right solution is then.
>     
>     With this commit in place, then, when using a glibc 2.35 or later
>     system, GDB will once again use the stap probes for shared library
>     detection.
> 
> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
> index 6acaf87960b..10e446af908 100644
> --- a/gdb/solib-svr4.c
> +++ b/gdb/solib-svr4.c
> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>  
>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>  
> -      /* The "map_failed" probe did not exist in early
> -	 versions of the probes code in which the probes'
> -	 names were prefixed with "rtld_".  */
> -      if (with_prefix && streq (name, "rtld_map_failed"))
> -	continue;
> -
>        /* Ensure at least one probe for the current name was found.  */
>        if (probes[i].empty ())
> -	return false;
> +	{
> +	  /* The "map_failed" probe did not exist in early versions of the
> +	     probes code in which the probes' names were prefixed with
> +	     "rtld_".
> +
> +	     Additionally, the "map_failed" probe was accidentally removed
> +	     from glibc 2.35 and later, when changes in glibc meant the
> +	     probe could no longer be reached, and the compiler optimized
> +	     the probe away.  In this case the probe name doesn't have the
> +	     "rtld_" prefix.
> +
> +	     To handle this, and give GDB as much flexibility as possible,
> +	     we make the rule that, if a probe isn't required for the
> +	     correct operation of GDB (i.e. it's action is DO_NOTHING),
> +	     then we will still use the probes interface, even if that
> +	     probe is missing.
> +
> +	     The only (possible) downside of this is that, if the user has
> +	     'set stop-on-solib-events on' in effect, then they might get
> +	     fewer events using the probes interface than with the classic
> +	     non-probes interface.  */
> +	  if (prove_info[i].action == DO_NOTHING)

I guess you have a typo here.

s/prove_info/probe_info/ here

Other than that and considering the remarks Simon already did, and for
what it is worth, this looks good to me.  I do not really have a strong
opinion between this version and the alternate you sent.  My instinct is
to have the minimal change to current behavior (so the other patch).
However as you point out the extra stops a user can get compared to the
traditional interface is not very informative anyway.

Whichever version you end up using, feel free to add my RB tag:

Reviewed-By: Lancelot SIX <lancelot.six@amd.com>

Best,
Lancelot.

> +	    continue;
> +	  else
> +	    return false;
> +	}
>  
>        /* Ensure probe arguments can be evaluated.  */
>        for (probe *p : probes[i])
>
  
Pedro Alves Nov. 28, 2022, 3:47 p.m. UTC | #7
On 2022-11-24 10:46 a.m., Andrew Burgess via Gdb-patches wrote:
> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Tue Nov 22 12:45:56 2022 +0000
> 
>     gdb: relax requirement for the map_failed stap probe to be present
>     
>     From glibc 2.35 and later, the "map_failed" stap probe is no longer
>     included in glibc.  The removal of the probe looks like an accident,
>     but it was caused by a glibc commit which meant that the "map_failed"
>     probe could no longer be reached; the compiler than helpfully

than -> then

>     optimised out the probe.
>
  
Andrew Burgess Nov. 28, 2022, 5:18 p.m. UTC | #8
Thanks you all for your review feedback.

I've now pushed this version of the patch (inc Pedro's suggested typo
fix).

Thanks,
Andrew


Andrew Burgess <aburgess@redhat.com> writes:

> Simon Marchi <simark@simark.ca> writes:
>
>> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>>> From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>> included in glibc.  The removal of the probe looks like an accident,
>>> but it was caused by a glibc commit which meant that the "map_failed"
>>> probe could no longer be reached; the compiler than helpfully
>>> optimised out the probe.
>>> 
>>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>>> related to the shared library loading detection.  If any of these
>>> probes are missing then GDB will fall back to the non-probe based
>>> mechanism for detecting shared library loading.  The "map_failed"
>>> probe is include in the list of required probes.
>>> 
>>> This means that on glibc 2.35 (or later) systems, GDB is going to
>>> always fall back to the non-probes based mechanism for detecting
>>> shared library loading.
>>> 
>>> I raised a glibc bug to discuss this issue:
>>> 
>>>   https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>> 
>>> But, whatever the ultimate decision from the glibc team, given there
>>> are version of glibc in the wild without the "map_failed" probe, we
>>> probably should update GDB to handle this situation.
>>> 
>>> The "map_failed" probe is already a little strange, very early
>>> versions of glibc didn't include this probe, so, in some cases, if
>>> this probe is missing GDB is happy to ignore it.  In this commit I
>>> just expand this logic to make the "map_failed" probe fully optional.
>>> 
>>> With this commit in place, then, when using a glibc 2.35 or later
>>> system, GDB will once again use the stap probes for shared library
>>> detection.
>>> ---
>>>  gdb/solib-svr4.c | 13 +++++++++----
>>>  1 file changed, 9 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>> index 6acaf87960b..87cd06f251a 100644
>>> --- a/gdb/solib-svr4.c
>>> +++ b/gdb/solib-svr4.c
>>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>  
>>>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>  
>>> -      /* The "map_failed" probe did not exist in early
>>> -	 versions of the probes code in which the probes'
>>> -	 names were prefixed with "rtld_".  */
>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>> +      /* The "map_failed" probe did not exist in early versions of the
>>> +	 probes code in which the probes' names were prefixed with
>>> +	 "rtld_".
>>> +
>>> +	 Additionally, the "map_failed" probe was accidentally removed
>>> +	 from glibc 2.35 and later, when changes in glibc meant the probe
>>> +	 could no longer be reached.  In this case the probe name doesn't
>>> +	 have the "rtld_" prefix.  */
>>> +      if (streq (probe_info[i].name, "map_failed"))
>>>  	continue;
>>>  
>>>        /* Ensure at least one probe for the current name was found.  */
>>> 
>>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>>> -- 
>>> 2.25.4
>>
>> Hi,
>>
>> I looked at this separately, and this was one of the fixes I considered.
>>
>> Another option was to make GDB not give up on the probes interface if
>> failing to look up a probe whose action is DO_NOTHING.  Probes with that
>> action are not used by GDB for solib bookkeeping, but can be used to
>> stop on solib events, with "set stop-on-solib-events".  I was just
>> worried if there was some cases where a probe would be missing, but the
>> corresponding event could be caught if using the original interface.  In
>> that case, using the probes interface would be a regression.  But it's
>> probably not worth wondering about.  If that happens it's just a bug
>> that needs to be fixed.  In the case we are looking at, if the
>> map_failed probe gets optimized out, then surely the corresponding call
>> to the r_brk function would also be optimized out.
>
> I also considered just ignoring any probe was (a) missing, and (b) had a
> DO_NOTHING action.  The reason I didn't post this patch was because, at
> the time, my thinking was: if we don't care about any probe with a
> DO_NOTHING action, why even look for those probes, why not just remove
> them from the list?
>
> I think you've (partially) convinced me that the user might be
> interested in seeing a stop at these probes even if GDB's action is
> DO_NOTHING.
>
> I say partially above because GDB doesn't really do anything to tell the
> user which probe we stopped at, e.g. was is "init_start", "map_start",
> "map_failed", etc.  The user might be able to figure it out from the
> backtrace, but I still think it's not going to be trivial in all cases,
> e.g. "map_start" and "map_failed" are both located in the same function,
> so I think the user would need to lookup the probe address in the ELF,
> then compare that to the stop address.  Not impossible, but, I suspect,
> the complexity is an indication that users are not doing this much.
> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
> probes.
>
> However, I think there is enough of a justification there for keeping
> the probes in the list, and just skipping any DO_NOTHING probes that
> don't exist.
>
> Below then, is an alternative patch.  I don't have a strong preference
> between this one, and the original[1], but I thought I'd post this for
> discussion.  If this is preferred then I can just merge this.
>
> [1] I'll also post an update to the original patch shortly that
> addresses Lancelot's feedback.
>
> Thanks,
> Andrew
>
> ---
>
> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
> Author: Andrew Burgess <aburgess@redhat.com>
> Date:   Tue Nov 22 12:45:56 2022 +0000
>
>     gdb: relax requirement for the map_failed stap probe to be present
>     
>     From glibc 2.35 and later, the "map_failed" stap probe is no longer
>     included in glibc.  The removal of the probe looks like an accident,
>     but it was caused by a glibc commit which meant that the "map_failed"
>     probe could no longer be reached; the compiler than helpfully
>     optimised out the probe.
>     
>     In GDB, in solib-svr4.c, we have a list of probes that we look for
>     related to the shared library loading detection.  If any of these
>     probes are missing then GDB will fall back to the non-probe based
>     mechanism for detecting shared library loading.  The "map_failed"
>     probe is include in the list of required probes.
>     
>     This means that on glibc 2.35 (or later) systems, GDB is going to
>     always fall back to the non-probes based mechanism for detecting
>     shared library loading.
>     
>     I raised a glibc bug to discuss this issue:
>     
>       https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>     
>     But, whatever the ultimate decision from the glibc team, given there
>     are version of glibc in the wild without the "map_failed" probe, we
>     probably should update GDB to handle this situation.
>     
>     The "map_failed" probe is already a little strange, very early
>     versions of glibc didn't include this probe, so, in some cases, if
>     this probe is missing GDB is happy to ignore it.  This is fine, the
>     action associated with this probe inside GDB is DO_NOTHING, this means
>     the probe isn't actually required in order for GDB to correctly detect
>     the loading of shared libraries.
>     
>     In this commit I propose changing the rules so that any probe whose
>     action is DO_NOTHING, is optional.
>     
>     There is one possible downside to this change, and that concerns 'set
>     stop-on-solib-events on'.  If a probe is removed from glibc, but the
>     old style breakpoint based mechanism is still in place within glibc
>     for that same event, then GDB will stop when using the old style
>     non-probe based mechanism, but not when using the probes based
>     mechanism.
>     
>     For the map_failed case this is not a problem, both the map_failed
>     probe, and the call to the old style breakpoint location were
>     optimised out, and so neither event (probes based, or breakpoint
>     based) will trigger.  This would only become an issue if glibc removed
>     a probe, but left the breakpoint in place (this would almost certainly
>     be a bug in glibc).
>     
>     For now, I'm proposing that we just don't worry about this.  Because
>     some probes have actions that are not DO_NOTHING, then we know the
>     user will always seem _some_ stops when a shared library is
>     loaded/unloaded, and (I'm guessing), in most cases, that's all they
>     care about.  I figure when someone complains then we can figure out
>     what the right solution is then.
>     
>     With this commit in place, then, when using a glibc 2.35 or later
>     system, GDB will once again use the stap probes for shared library
>     detection.
>
> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
> index 6acaf87960b..10e446af908 100644
> --- a/gdb/solib-svr4.c
> +++ b/gdb/solib-svr4.c
> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>  
>        probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>  
> -      /* The "map_failed" probe did not exist in early
> -	 versions of the probes code in which the probes'
> -	 names were prefixed with "rtld_".  */
> -      if (with_prefix && streq (name, "rtld_map_failed"))
> -	continue;
> -
>        /* Ensure at least one probe for the current name was found.  */
>        if (probes[i].empty ())
> -	return false;
> +	{
> +	  /* The "map_failed" probe did not exist in early versions of the
> +	     probes code in which the probes' names were prefixed with
> +	     "rtld_".
> +
> +	     Additionally, the "map_failed" probe was accidentally removed
> +	     from glibc 2.35 and later, when changes in glibc meant the
> +	     probe could no longer be reached, and the compiler optimized
> +	     the probe away.  In this case the probe name doesn't have the
> +	     "rtld_" prefix.
> +
> +	     To handle this, and give GDB as much flexibility as possible,
> +	     we make the rule that, if a probe isn't required for the
> +	     correct operation of GDB (i.e. it's action is DO_NOTHING),
> +	     then we will still use the probes interface, even if that
> +	     probe is missing.
> +
> +	     The only (possible) downside of this is that, if the user has
> +	     'set stop-on-solib-events on' in effect, then they might get
> +	     fewer events using the probes interface than with the classic
> +	     non-probes interface.  */
> +	  if (prove_info[i].action == DO_NOTHING)
> +	    continue;
> +	  else
> +	    return false;
> +	}
>  
>        /* Ensure probe arguments can be evaluated.  */
>        for (probe *p : probes[i])
  
Luis Machado Nov. 29, 2022, 8:27 a.m. UTC | #9
Hi Andrew,

This seems to have broken armhf on Ubuntu 22.04.

https://builder.sourceware.org/buildbot/#/builders/169/builds/1163

On 11/28/22 17:18, Andrew Burgess via Gdb-patches wrote:
> 
> Thanks you all for your review feedback.
> 
> I've now pushed this version of the patch (inc Pedro's suggested typo
> fix).
> 
> Thanks,
> Andrew
> 
> 
> Andrew Burgess <aburgess@redhat.com> writes:
> 
>> Simon Marchi <simark@simark.ca> writes:
>>
>>> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>>>>  From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>> included in glibc.  The removal of the probe looks like an accident,
>>>> but it was caused by a glibc commit which meant that the "map_failed"
>>>> probe could no longer be reached; the compiler than helpfully
>>>> optimised out the probe.
>>>>
>>>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>> related to the shared library loading detection.  If any of these
>>>> probes are missing then GDB will fall back to the non-probe based
>>>> mechanism for detecting shared library loading.  The "map_failed"
>>>> probe is include in the list of required probes.
>>>>
>>>> This means that on glibc 2.35 (or later) systems, GDB is going to
>>>> always fall back to the non-probes based mechanism for detecting
>>>> shared library loading.
>>>>
>>>> I raised a glibc bug to discuss this issue:
>>>>
>>>>    https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>
>>>> But, whatever the ultimate decision from the glibc team, given there
>>>> are version of glibc in the wild without the "map_failed" probe, we
>>>> probably should update GDB to handle this situation.
>>>>
>>>> The "map_failed" probe is already a little strange, very early
>>>> versions of glibc didn't include this probe, so, in some cases, if
>>>> this probe is missing GDB is happy to ignore it.  In this commit I
>>>> just expand this logic to make the "map_failed" probe fully optional.
>>>>
>>>> With this commit in place, then, when using a glibc 2.35 or later
>>>> system, GDB will once again use the stap probes for shared library
>>>> detection.
>>>> ---
>>>>   gdb/solib-svr4.c | 13 +++++++++----
>>>>   1 file changed, 9 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>> index 6acaf87960b..87cd06f251a 100644
>>>> --- a/gdb/solib-svr4.c
>>>> +++ b/gdb/solib-svr4.c
>>>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>   
>>>>         probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>>   
>>>> -      /* The "map_failed" probe did not exist in early
>>>> -	 versions of the probes code in which the probes'
>>>> -	 names were prefixed with "rtld_".  */
>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>> +	 probes code in which the probes' names were prefixed with
>>>> +	 "rtld_".
>>>> +
>>>> +	 Additionally, the "map_failed" probe was accidentally removed
>>>> +	 from glibc 2.35 and later, when changes in glibc meant the probe
>>>> +	 could no longer be reached.  In this case the probe name doesn't
>>>> +	 have the "rtld_" prefix.  */
>>>> +      if (streq (probe_info[i].name, "map_failed"))
>>>>   	continue;
>>>>   
>>>>         /* Ensure at least one probe for the current name was found.  */
>>>>
>>>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>>>> -- 
>>>> 2.25.4
>>>
>>> Hi,
>>>
>>> I looked at this separately, and this was one of the fixes I considered.
>>>
>>> Another option was to make GDB not give up on the probes interface if
>>> failing to look up a probe whose action is DO_NOTHING.  Probes with that
>>> action are not used by GDB for solib bookkeeping, but can be used to
>>> stop on solib events, with "set stop-on-solib-events".  I was just
>>> worried if there was some cases where a probe would be missing, but the
>>> corresponding event could be caught if using the original interface.  In
>>> that case, using the probes interface would be a regression.  But it's
>>> probably not worth wondering about.  If that happens it's just a bug
>>> that needs to be fixed.  In the case we are looking at, if the
>>> map_failed probe gets optimized out, then surely the corresponding call
>>> to the r_brk function would also be optimized out.
>>
>> I also considered just ignoring any probe was (a) missing, and (b) had a
>> DO_NOTHING action.  The reason I didn't post this patch was because, at
>> the time, my thinking was: if we don't care about any probe with a
>> DO_NOTHING action, why even look for those probes, why not just remove
>> them from the list?
>>
>> I think you've (partially) convinced me that the user might be
>> interested in seeing a stop at these probes even if GDB's action is
>> DO_NOTHING.
>>
>> I say partially above because GDB doesn't really do anything to tell the
>> user which probe we stopped at, e.g. was is "init_start", "map_start",
>> "map_failed", etc.  The user might be able to figure it out from the
>> backtrace, but I still think it's not going to be trivial in all cases,
>> e.g. "map_start" and "map_failed" are both located in the same function,
>> so I think the user would need to lookup the probe address in the ELF,
>> then compare that to the stop address.  Not impossible, but, I suspect,
>> the complexity is an indication that users are not doing this much.
>> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
>> probes.
>>
>> However, I think there is enough of a justification there for keeping
>> the probes in the list, and just skipping any DO_NOTHING probes that
>> don't exist.
>>
>> Below then, is an alternative patch.  I don't have a strong preference
>> between this one, and the original[1], but I thought I'd post this for
>> discussion.  If this is preferred then I can just merge this.
>>
>> [1] I'll also post an update to the original patch shortly that
>> addresses Lancelot's feedback.
>>
>> Thanks,
>> Andrew
>>
>> ---
>>
>> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
>> Author: Andrew Burgess <aburgess@redhat.com>
>> Date:   Tue Nov 22 12:45:56 2022 +0000
>>
>>      gdb: relax requirement for the map_failed stap probe to be present
>>      
>>      From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>      included in glibc.  The removal of the probe looks like an accident,
>>      but it was caused by a glibc commit which meant that the "map_failed"
>>      probe could no longer be reached; the compiler than helpfully
>>      optimised out the probe.
>>      
>>      In GDB, in solib-svr4.c, we have a list of probes that we look for
>>      related to the shared library loading detection.  If any of these
>>      probes are missing then GDB will fall back to the non-probe based
>>      mechanism for detecting shared library loading.  The "map_failed"
>>      probe is include in the list of required probes.
>>      
>>      This means that on glibc 2.35 (or later) systems, GDB is going to
>>      always fall back to the non-probes based mechanism for detecting
>>      shared library loading.
>>      
>>      I raised a glibc bug to discuss this issue:
>>      
>>        https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>      
>>      But, whatever the ultimate decision from the glibc team, given there
>>      are version of glibc in the wild without the "map_failed" probe, we
>>      probably should update GDB to handle this situation.
>>      
>>      The "map_failed" probe is already a little strange, very early
>>      versions of glibc didn't include this probe, so, in some cases, if
>>      this probe is missing GDB is happy to ignore it.  This is fine, the
>>      action associated with this probe inside GDB is DO_NOTHING, this means
>>      the probe isn't actually required in order for GDB to correctly detect
>>      the loading of shared libraries.
>>      
>>      In this commit I propose changing the rules so that any probe whose
>>      action is DO_NOTHING, is optional.
>>      
>>      There is one possible downside to this change, and that concerns 'set
>>      stop-on-solib-events on'.  If a probe is removed from glibc, but the
>>      old style breakpoint based mechanism is still in place within glibc
>>      for that same event, then GDB will stop when using the old style
>>      non-probe based mechanism, but not when using the probes based
>>      mechanism.
>>      
>>      For the map_failed case this is not a problem, both the map_failed
>>      probe, and the call to the old style breakpoint location were
>>      optimised out, and so neither event (probes based, or breakpoint
>>      based) will trigger.  This would only become an issue if glibc removed
>>      a probe, but left the breakpoint in place (this would almost certainly
>>      be a bug in glibc).
>>      
>>      For now, I'm proposing that we just don't worry about this.  Because
>>      some probes have actions that are not DO_NOTHING, then we know the
>>      user will always seem _some_ stops when a shared library is
>>      loaded/unloaded, and (I'm guessing), in most cases, that's all they
>>      care about.  I figure when someone complains then we can figure out
>>      what the right solution is then.
>>      
>>      With this commit in place, then, when using a glibc 2.35 or later
>>      system, GDB will once again use the stap probes for shared library
>>      detection.
>>
>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>> index 6acaf87960b..10e446af908 100644
>> --- a/gdb/solib-svr4.c
>> +++ b/gdb/solib-svr4.c
>> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>   
>>         probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>   
>> -      /* The "map_failed" probe did not exist in early
>> -	 versions of the probes code in which the probes'
>> -	 names were prefixed with "rtld_".  */
>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>> -	continue;
>> -
>>         /* Ensure at least one probe for the current name was found.  */
>>         if (probes[i].empty ())
>> -	return false;
>> +	{
>> +	  /* The "map_failed" probe did not exist in early versions of the
>> +	     probes code in which the probes' names were prefixed with
>> +	     "rtld_".
>> +
>> +	     Additionally, the "map_failed" probe was accidentally removed
>> +	     from glibc 2.35 and later, when changes in glibc meant the
>> +	     probe could no longer be reached, and the compiler optimized
>> +	     the probe away.  In this case the probe name doesn't have the
>> +	     "rtld_" prefix.
>> +
>> +	     To handle this, and give GDB as much flexibility as possible,
>> +	     we make the rule that, if a probe isn't required for the
>> +	     correct operation of GDB (i.e. it's action is DO_NOTHING),
>> +	     then we will still use the probes interface, even if that
>> +	     probe is missing.
>> +
>> +	     The only (possible) downside of this is that, if the user has
>> +	     'set stop-on-solib-events on' in effect, then they might get
>> +	     fewer events using the probes interface than with the classic
>> +	     non-probes interface.  */
>> +	  if (prove_info[i].action == DO_NOTHING)
>> +	    continue;
>> +	  else
>> +	    return false;
>> +	}
>>   
>>         /* Ensure probe arguments can be evaluated.  */
>>         for (probe *p : probes[i])
>
  
Luis Machado Nov. 29, 2022, 8:38 a.m. UTC | #10
On 11/29/22 08:27, Luis Machado wrote:
> Hi Andrew,
> 
> This seems to have broken armhf on Ubuntu 22.04.
> 
> https://builder.sourceware.org/buildbot/#/builders/169/builds/1163
> 

I haven't investigated this. I only spotted it in the sourceware buildbot page. But I'd guess it is something to do
with thumb mode detection early in the startup.

Ubuntu has moved to not stripping ld.so for armhf because the probe mechanism is not capable of conveying the thumb mode
information, so gdb has to rely on symbols instead. If the changes have touched this fragile area, it may have broken the
delicate balance.


> On 11/28/22 17:18, Andrew Burgess via Gdb-patches wrote:
>>
>> Thanks you all for your review feedback.
>>
>> I've now pushed this version of the patch (inc Pedro's suggested typo
>> fix).
>>
>> Thanks,
>> Andrew
>>
>>
>> Andrew Burgess <aburgess@redhat.com> writes:
>>
>>> Simon Marchi <simark@simark.ca> writes:
>>>
>>>> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>>>>>  From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>>> included in glibc.  The removal of the probe looks like an accident,
>>>>> but it was caused by a glibc commit which meant that the "map_failed"
>>>>> probe could no longer be reached; the compiler than helpfully
>>>>> optimised out the probe.
>>>>>
>>>>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>>> related to the shared library loading detection.  If any of these
>>>>> probes are missing then GDB will fall back to the non-probe based
>>>>> mechanism for detecting shared library loading.  The "map_failed"
>>>>> probe is include in the list of required probes.
>>>>>
>>>>> This means that on glibc 2.35 (or later) systems, GDB is going to
>>>>> always fall back to the non-probes based mechanism for detecting
>>>>> shared library loading.
>>>>>
>>>>> I raised a glibc bug to discuss this issue:
>>>>>
>>>>>    https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>>
>>>>> But, whatever the ultimate decision from the glibc team, given there
>>>>> are version of glibc in the wild without the "map_failed" probe, we
>>>>> probably should update GDB to handle this situation.
>>>>>
>>>>> The "map_failed" probe is already a little strange, very early
>>>>> versions of glibc didn't include this probe, so, in some cases, if
>>>>> this probe is missing GDB is happy to ignore it.  In this commit I
>>>>> just expand this logic to make the "map_failed" probe fully optional.
>>>>>
>>>>> With this commit in place, then, when using a glibc 2.35 or later
>>>>> system, GDB will once again use the stap probes for shared library
>>>>> detection.
>>>>> ---
>>>>>   gdb/solib-svr4.c | 13 +++++++++----
>>>>>   1 file changed, 9 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>>> index 6acaf87960b..87cd06f251a 100644
>>>>> --- a/gdb/solib-svr4.c
>>>>> +++ b/gdb/solib-svr4.c
>>>>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>>         probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>>> -      /* The "map_failed" probe did not exist in early
>>>>> -     versions of the probes code in which the probes'
>>>>> -     names were prefixed with "rtld_".  */
>>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>>> +     probes code in which the probes' names were prefixed with
>>>>> +     "rtld_".
>>>>> +
>>>>> +     Additionally, the "map_failed" probe was accidentally removed
>>>>> +     from glibc 2.35 and later, when changes in glibc meant the probe
>>>>> +     could no longer be reached.  In this case the probe name doesn't
>>>>> +     have the "rtld_" prefix.  */
>>>>> +      if (streq (probe_info[i].name, "map_failed"))
>>>>>       continue;
>>>>>         /* Ensure at least one probe for the current name was found.  */
>>>>>
>>>>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>>>>> -- 
>>>>> 2.25.4
>>>>
>>>> Hi,
>>>>
>>>> I looked at this separately, and this was one of the fixes I considered.
>>>>
>>>> Another option was to make GDB not give up on the probes interface if
>>>> failing to look up a probe whose action is DO_NOTHING.  Probes with that
>>>> action are not used by GDB for solib bookkeeping, but can be used to
>>>> stop on solib events, with "set stop-on-solib-events".  I was just
>>>> worried if there was some cases where a probe would be missing, but the
>>>> corresponding event could be caught if using the original interface.  In
>>>> that case, using the probes interface would be a regression.  But it's
>>>> probably not worth wondering about.  If that happens it's just a bug
>>>> that needs to be fixed.  In the case we are looking at, if the
>>>> map_failed probe gets optimized out, then surely the corresponding call
>>>> to the r_brk function would also be optimized out.
>>>
>>> I also considered just ignoring any probe was (a) missing, and (b) had a
>>> DO_NOTHING action.  The reason I didn't post this patch was because, at
>>> the time, my thinking was: if we don't care about any probe with a
>>> DO_NOTHING action, why even look for those probes, why not just remove
>>> them from the list?
>>>
>>> I think you've (partially) convinced me that the user might be
>>> interested in seeing a stop at these probes even if GDB's action is
>>> DO_NOTHING.
>>>
>>> I say partially above because GDB doesn't really do anything to tell the
>>> user which probe we stopped at, e.g. was is "init_start", "map_start",
>>> "map_failed", etc.  The user might be able to figure it out from the
>>> backtrace, but I still think it's not going to be trivial in all cases,
>>> e.g. "map_start" and "map_failed" are both located in the same function,
>>> so I think the user would need to lookup the probe address in the ELF,
>>> then compare that to the stop address.  Not impossible, but, I suspect,
>>> the complexity is an indication that users are not doing this much.
>>> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
>>> probes.
>>>
>>> However, I think there is enough of a justification there for keeping
>>> the probes in the list, and just skipping any DO_NOTHING probes that
>>> don't exist.
>>>
>>> Below then, is an alternative patch.  I don't have a strong preference
>>> between this one, and the original[1], but I thought I'd post this for
>>> discussion.  If this is preferred then I can just merge this.
>>>
>>> [1] I'll also post an update to the original patch shortly that
>>> addresses Lancelot's feedback.
>>>
>>> Thanks,
>>> Andrew
>>>
>>> ---
>>>
>>> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
>>> Author: Andrew Burgess <aburgess@redhat.com>
>>> Date:   Tue Nov 22 12:45:56 2022 +0000
>>>
>>>      gdb: relax requirement for the map_failed stap probe to be present
>>>      From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>      included in glibc.  The removal of the probe looks like an accident,
>>>      but it was caused by a glibc commit which meant that the "map_failed"
>>>      probe could no longer be reached; the compiler than helpfully
>>>      optimised out the probe.
>>>      In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>      related to the shared library loading detection.  If any of these
>>>      probes are missing then GDB will fall back to the non-probe based
>>>      mechanism for detecting shared library loading.  The "map_failed"
>>>      probe is include in the list of required probes.
>>>      This means that on glibc 2.35 (or later) systems, GDB is going to
>>>      always fall back to the non-probes based mechanism for detecting
>>>      shared library loading.
>>>      I raised a glibc bug to discuss this issue:
>>>        https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>      But, whatever the ultimate decision from the glibc team, given there
>>>      are version of glibc in the wild without the "map_failed" probe, we
>>>      probably should update GDB to handle this situation.
>>>      The "map_failed" probe is already a little strange, very early
>>>      versions of glibc didn't include this probe, so, in some cases, if
>>>      this probe is missing GDB is happy to ignore it.  This is fine, the
>>>      action associated with this probe inside GDB is DO_NOTHING, this means
>>>      the probe isn't actually required in order for GDB to correctly detect
>>>      the loading of shared libraries.
>>>      In this commit I propose changing the rules so that any probe whose
>>>      action is DO_NOTHING, is optional.
>>>      There is one possible downside to this change, and that concerns 'set
>>>      stop-on-solib-events on'.  If a probe is removed from glibc, but the
>>>      old style breakpoint based mechanism is still in place within glibc
>>>      for that same event, then GDB will stop when using the old style
>>>      non-probe based mechanism, but not when using the probes based
>>>      mechanism.
>>>      For the map_failed case this is not a problem, both the map_failed
>>>      probe, and the call to the old style breakpoint location were
>>>      optimised out, and so neither event (probes based, or breakpoint
>>>      based) will trigger.  This would only become an issue if glibc removed
>>>      a probe, but left the breakpoint in place (this would almost certainly
>>>      be a bug in glibc).
>>>      For now, I'm proposing that we just don't worry about this.  Because
>>>      some probes have actions that are not DO_NOTHING, then we know the
>>>      user will always seem _some_ stops when a shared library is
>>>      loaded/unloaded, and (I'm guessing), in most cases, that's all they
>>>      care about.  I figure when someone complains then we can figure out
>>>      what the right solution is then.
>>>      With this commit in place, then, when using a glibc 2.35 or later
>>>      system, GDB will once again use the stap probes for shared library
>>>      detection.
>>>
>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>> index 6acaf87960b..10e446af908 100644
>>> --- a/gdb/solib-svr4.c
>>> +++ b/gdb/solib-svr4.c
>>> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>         probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>> -      /* The "map_failed" probe did not exist in early
>>> -     versions of the probes code in which the probes'
>>> -     names were prefixed with "rtld_".  */
>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>> -    continue;
>>> -
>>>         /* Ensure at least one probe for the current name was found.  */
>>>         if (probes[i].empty ())
>>> -    return false;
>>> +    {
>>> +      /* The "map_failed" probe did not exist in early versions of the
>>> +         probes code in which the probes' names were prefixed with
>>> +         "rtld_".
>>> +
>>> +         Additionally, the "map_failed" probe was accidentally removed
>>> +         from glibc 2.35 and later, when changes in glibc meant the
>>> +         probe could no longer be reached, and the compiler optimized
>>> +         the probe away.  In this case the probe name doesn't have the
>>> +         "rtld_" prefix.
>>> +
>>> +         To handle this, and give GDB as much flexibility as possible,
>>> +         we make the rule that, if a probe isn't required for the
>>> +         correct operation of GDB (i.e. it's action is DO_NOTHING),
>>> +         then we will still use the probes interface, even if that
>>> +         probe is missing.
>>> +
>>> +         The only (possible) downside of this is that, if the user has
>>> +         'set stop-on-solib-events on' in effect, then they might get
>>> +         fewer events using the probes interface than with the classic
>>> +         non-probes interface.  */
>>> +      if (prove_info[i].action == DO_NOTHING)
>>> +        continue;
>>> +      else
>>> +        return false;
>>> +    }
>>>         /* Ensure probe arguments can be evaluated.  */
>>>         for (probe *p : probes[i])
>>
>
  
Andrew Burgess Dec. 5, 2022, 10:09 a.m. UTC | #11
Luis Machado <luis.machado@arm.com> writes:

> On 11/29/22 08:27, Luis Machado wrote:
>> Hi Andrew,
>> 
>> This seems to have broken armhf on Ubuntu 22.04.
>> 
>> https://builder.sourceware.org/buildbot/#/builders/169/builds/1163
>> 
>
> I haven't investigated this. I only spotted it in the sourceware buildbot page. But I'd guess it is something to do
> with thumb mode detection early in the startup.
>
> Ubuntu has moved to not stripping ld.so for armhf because the probe mechanism is not capable of conveying the thumb mode
> information, so gdb has to rely on symbols instead. If the changes have touched this fragile area, it may have broken the
> delicate balance.

I somehow missed this last week.  I'll take a look today.

Sorry for the breakage.

Andrew


>
>
>> On 11/28/22 17:18, Andrew Burgess via Gdb-patches wrote:
>>>
>>> Thanks you all for your review feedback.
>>>
>>> I've now pushed this version of the patch (inc Pedro's suggested typo
>>> fix).
>>>
>>> Thanks,
>>> Andrew
>>>
>>>
>>> Andrew Burgess <aburgess@redhat.com> writes:
>>>
>>>> Simon Marchi <simark@simark.ca> writes:
>>>>
>>>>> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>>>>>>  From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>>>> included in glibc.  The removal of the probe looks like an accident,
>>>>>> but it was caused by a glibc commit which meant that the "map_failed"
>>>>>> probe could no longer be reached; the compiler than helpfully
>>>>>> optimised out the probe.
>>>>>>
>>>>>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>>>> related to the shared library loading detection.  If any of these
>>>>>> probes are missing then GDB will fall back to the non-probe based
>>>>>> mechanism for detecting shared library loading.  The "map_failed"
>>>>>> probe is include in the list of required probes.
>>>>>>
>>>>>> This means that on glibc 2.35 (or later) systems, GDB is going to
>>>>>> always fall back to the non-probes based mechanism for detecting
>>>>>> shared library loading.
>>>>>>
>>>>>> I raised a glibc bug to discuss this issue:
>>>>>>
>>>>>>    https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>>>
>>>>>> But, whatever the ultimate decision from the glibc team, given there
>>>>>> are version of glibc in the wild without the "map_failed" probe, we
>>>>>> probably should update GDB to handle this situation.
>>>>>>
>>>>>> The "map_failed" probe is already a little strange, very early
>>>>>> versions of glibc didn't include this probe, so, in some cases, if
>>>>>> this probe is missing GDB is happy to ignore it.  In this commit I
>>>>>> just expand this logic to make the "map_failed" probe fully optional.
>>>>>>
>>>>>> With this commit in place, then, when using a glibc 2.35 or later
>>>>>> system, GDB will once again use the stap probes for shared library
>>>>>> detection.
>>>>>> ---
>>>>>>   gdb/solib-svr4.c | 13 +++++++++----
>>>>>>   1 file changed, 9 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>>>> index 6acaf87960b..87cd06f251a 100644
>>>>>> --- a/gdb/solib-svr4.c
>>>>>> +++ b/gdb/solib-svr4.c
>>>>>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>>>         probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>>>> -      /* The "map_failed" probe did not exist in early
>>>>>> -     versions of the probes code in which the probes'
>>>>>> -     names were prefixed with "rtld_".  */
>>>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>>>> +     probes code in which the probes' names were prefixed with
>>>>>> +     "rtld_".
>>>>>> +
>>>>>> +     Additionally, the "map_failed" probe was accidentally removed
>>>>>> +     from glibc 2.35 and later, when changes in glibc meant the probe
>>>>>> +     could no longer be reached.  In this case the probe name doesn't
>>>>>> +     have the "rtld_" prefix.  */
>>>>>> +      if (streq (probe_info[i].name, "map_failed"))
>>>>>>       continue;
>>>>>>         /* Ensure at least one probe for the current name was found.  */
>>>>>>
>>>>>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>>>>>> -- 
>>>>>> 2.25.4
>>>>>
>>>>> Hi,
>>>>>
>>>>> I looked at this separately, and this was one of the fixes I considered.
>>>>>
>>>>> Another option was to make GDB not give up on the probes interface if
>>>>> failing to look up a probe whose action is DO_NOTHING.  Probes with that
>>>>> action are not used by GDB for solib bookkeeping, but can be used to
>>>>> stop on solib events, with "set stop-on-solib-events".  I was just
>>>>> worried if there was some cases where a probe would be missing, but the
>>>>> corresponding event could be caught if using the original interface.  In
>>>>> that case, using the probes interface would be a regression.  But it's
>>>>> probably not worth wondering about.  If that happens it's just a bug
>>>>> that needs to be fixed.  In the case we are looking at, if the
>>>>> map_failed probe gets optimized out, then surely the corresponding call
>>>>> to the r_brk function would also be optimized out.
>>>>
>>>> I also considered just ignoring any probe was (a) missing, and (b) had a
>>>> DO_NOTHING action.  The reason I didn't post this patch was because, at
>>>> the time, my thinking was: if we don't care about any probe with a
>>>> DO_NOTHING action, why even look for those probes, why not just remove
>>>> them from the list?
>>>>
>>>> I think you've (partially) convinced me that the user might be
>>>> interested in seeing a stop at these probes even if GDB's action is
>>>> DO_NOTHING.
>>>>
>>>> I say partially above because GDB doesn't really do anything to tell the
>>>> user which probe we stopped at, e.g. was is "init_start", "map_start",
>>>> "map_failed", etc.  The user might be able to figure it out from the
>>>> backtrace, but I still think it's not going to be trivial in all cases,
>>>> e.g. "map_start" and "map_failed" are both located in the same function,
>>>> so I think the user would need to lookup the probe address in the ELF,
>>>> then compare that to the stop address.  Not impossible, but, I suspect,
>>>> the complexity is an indication that users are not doing this much.
>>>> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
>>>> probes.
>>>>
>>>> However, I think there is enough of a justification there for keeping
>>>> the probes in the list, and just skipping any DO_NOTHING probes that
>>>> don't exist.
>>>>
>>>> Below then, is an alternative patch.  I don't have a strong preference
>>>> between this one, and the original[1], but I thought I'd post this for
>>>> discussion.  If this is preferred then I can just merge this.
>>>>
>>>> [1] I'll also post an update to the original patch shortly that
>>>> addresses Lancelot's feedback.
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>> ---
>>>>
>>>> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
>>>> Author: Andrew Burgess <aburgess@redhat.com>
>>>> Date:   Tue Nov 22 12:45:56 2022 +0000
>>>>
>>>>      gdb: relax requirement for the map_failed stap probe to be present
>>>>      From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>>      included in glibc.  The removal of the probe looks like an accident,
>>>>      but it was caused by a glibc commit which meant that the "map_failed"
>>>>      probe could no longer be reached; the compiler than helpfully
>>>>      optimised out the probe.
>>>>      In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>>      related to the shared library loading detection.  If any of these
>>>>      probes are missing then GDB will fall back to the non-probe based
>>>>      mechanism for detecting shared library loading.  The "map_failed"
>>>>      probe is include in the list of required probes.
>>>>      This means that on glibc 2.35 (or later) systems, GDB is going to
>>>>      always fall back to the non-probes based mechanism for detecting
>>>>      shared library loading.
>>>>      I raised a glibc bug to discuss this issue:
>>>>        https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>      But, whatever the ultimate decision from the glibc team, given there
>>>>      are version of glibc in the wild without the "map_failed" probe, we
>>>>      probably should update GDB to handle this situation.
>>>>      The "map_failed" probe is already a little strange, very early
>>>>      versions of glibc didn't include this probe, so, in some cases, if
>>>>      this probe is missing GDB is happy to ignore it.  This is fine, the
>>>>      action associated with this probe inside GDB is DO_NOTHING, this means
>>>>      the probe isn't actually required in order for GDB to correctly detect
>>>>      the loading of shared libraries.
>>>>      In this commit I propose changing the rules so that any probe whose
>>>>      action is DO_NOTHING, is optional.
>>>>      There is one possible downside to this change, and that concerns 'set
>>>>      stop-on-solib-events on'.  If a probe is removed from glibc, but the
>>>>      old style breakpoint based mechanism is still in place within glibc
>>>>      for that same event, then GDB will stop when using the old style
>>>>      non-probe based mechanism, but not when using the probes based
>>>>      mechanism.
>>>>      For the map_failed case this is not a problem, both the map_failed
>>>>      probe, and the call to the old style breakpoint location were
>>>>      optimised out, and so neither event (probes based, or breakpoint
>>>>      based) will trigger.  This would only become an issue if glibc removed
>>>>      a probe, but left the breakpoint in place (this would almost certainly
>>>>      be a bug in glibc).
>>>>      For now, I'm proposing that we just don't worry about this.  Because
>>>>      some probes have actions that are not DO_NOTHING, then we know the
>>>>      user will always seem _some_ stops when a shared library is
>>>>      loaded/unloaded, and (I'm guessing), in most cases, that's all they
>>>>      care about.  I figure when someone complains then we can figure out
>>>>      what the right solution is then.
>>>>      With this commit in place, then, when using a glibc 2.35 or later
>>>>      system, GDB will once again use the stap probes for shared library
>>>>      detection.
>>>>
>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>> index 6acaf87960b..10e446af908 100644
>>>> --- a/gdb/solib-svr4.c
>>>> +++ b/gdb/solib-svr4.c
>>>> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>         probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>> -      /* The "map_failed" probe did not exist in early
>>>> -     versions of the probes code in which the probes'
>>>> -     names were prefixed with "rtld_".  */
>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>> -    continue;
>>>> -
>>>>         /* Ensure at least one probe for the current name was found.  */
>>>>         if (probes[i].empty ())
>>>> -    return false;
>>>> +    {
>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>> +         probes code in which the probes' names were prefixed with
>>>> +         "rtld_".
>>>> +
>>>> +         Additionally, the "map_failed" probe was accidentally removed
>>>> +         from glibc 2.35 and later, when changes in glibc meant the
>>>> +         probe could no longer be reached, and the compiler optimized
>>>> +         the probe away.  In this case the probe name doesn't have the
>>>> +         "rtld_" prefix.
>>>> +
>>>> +         To handle this, and give GDB as much flexibility as possible,
>>>> +         we make the rule that, if a probe isn't required for the
>>>> +         correct operation of GDB (i.e. it's action is DO_NOTHING),
>>>> +         then we will still use the probes interface, even if that
>>>> +         probe is missing.
>>>> +
>>>> +         The only (possible) downside of this is that, if the user has
>>>> +         'set stop-on-solib-events on' in effect, then they might get
>>>> +         fewer events using the probes interface than with the classic
>>>> +         non-probes interface.  */
>>>> +      if (prove_info[i].action == DO_NOTHING)
>>>> +        continue;
>>>> +      else
>>>> +        return false;
>>>> +    }
>>>>         /* Ensure probe arguments can be evaluated.  */
>>>>         for (probe *p : probes[i])
>>>
>>
  
Luis Machado Dec. 5, 2022, 10:27 a.m. UTC | #12
On 12/5/22 10:09, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> On 11/29/22 08:27, Luis Machado wrote:
>>> Hi Andrew,
>>>
>>> This seems to have broken armhf on Ubuntu 22.04.
>>>
>>> https://builder.sourceware.org/buildbot/#/builders/169/builds/1163
>>>
>>
>> I haven't investigated this. I only spotted it in the sourceware buildbot page. But I'd guess it is something to do
>> with thumb mode detection early in the startup.
>>
>> Ubuntu has moved to not stripping ld.so for armhf because the probe mechanism is not capable of conveying the thumb mode
>> information, so gdb has to rely on symbols instead. If the changes have touched this fragile area, it may have broken the
>> delicate balance.
> 
> I somehow missed this last week.  I'll take a look today.
> 
> Sorry for the breakage.

No problem. For the record, I haven't managed to reproduce this on my Ubuntu 20.04/22.04 setup. I'll have to give it a try on the
buildbot machine.

Just a heads-up in case this doesn't reproduce for you either.

> 
> Andrew
> 
> 
>>
>>
>>> On 11/28/22 17:18, Andrew Burgess via Gdb-patches wrote:
>>>>
>>>> Thanks you all for your review feedback.
>>>>
>>>> I've now pushed this version of the patch (inc Pedro's suggested typo
>>>> fix).
>>>>
>>>> Thanks,
>>>> Andrew
>>>>
>>>>
>>>> Andrew Burgess <aburgess@redhat.com> writes:
>>>>
>>>>> Simon Marchi <simark@simark.ca> writes:
>>>>>
>>>>>> On 11/22/22 10:09, Andrew Burgess via Gdb-patches wrote:
>>>>>>>   From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>>>>> included in glibc.  The removal of the probe looks like an accident,
>>>>>>> but it was caused by a glibc commit which meant that the "map_failed"
>>>>>>> probe could no longer be reached; the compiler than helpfully
>>>>>>> optimised out the probe.
>>>>>>>
>>>>>>> In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>>>>> related to the shared library loading detection.  If any of these
>>>>>>> probes are missing then GDB will fall back to the non-probe based
>>>>>>> mechanism for detecting shared library loading.  The "map_failed"
>>>>>>> probe is include in the list of required probes.
>>>>>>>
>>>>>>> This means that on glibc 2.35 (or later) systems, GDB is going to
>>>>>>> always fall back to the non-probes based mechanism for detecting
>>>>>>> shared library loading.
>>>>>>>
>>>>>>> I raised a glibc bug to discuss this issue:
>>>>>>>
>>>>>>>     https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>>>>
>>>>>>> But, whatever the ultimate decision from the glibc team, given there
>>>>>>> are version of glibc in the wild without the "map_failed" probe, we
>>>>>>> probably should update GDB to handle this situation.
>>>>>>>
>>>>>>> The "map_failed" probe is already a little strange, very early
>>>>>>> versions of glibc didn't include this probe, so, in some cases, if
>>>>>>> this probe is missing GDB is happy to ignore it.  In this commit I
>>>>>>> just expand this logic to make the "map_failed" probe fully optional.
>>>>>>>
>>>>>>> With this commit in place, then, when using a glibc 2.35 or later
>>>>>>> system, GDB will once again use the stap probes for shared library
>>>>>>> detection.
>>>>>>> ---
>>>>>>>    gdb/solib-svr4.c | 13 +++++++++----
>>>>>>>    1 file changed, 9 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>>>>> index 6acaf87960b..87cd06f251a 100644
>>>>>>> --- a/gdb/solib-svr4.c
>>>>>>> +++ b/gdb/solib-svr4.c
>>>>>>> @@ -2205,10 +2205,15 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>>>>          probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>>>>> -      /* The "map_failed" probe did not exist in early
>>>>>>> -     versions of the probes code in which the probes'
>>>>>>> -     names were prefixed with "rtld_".  */
>>>>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>>>>> +     probes code in which the probes' names were prefixed with
>>>>>>> +     "rtld_".
>>>>>>> +
>>>>>>> +     Additionally, the "map_failed" probe was accidentally removed
>>>>>>> +     from glibc 2.35 and later, when changes in glibc meant the probe
>>>>>>> +     could no longer be reached.  In this case the probe name doesn't
>>>>>>> +     have the "rtld_" prefix.  */
>>>>>>> +      if (streq (probe_info[i].name, "map_failed"))
>>>>>>>        continue;
>>>>>>>          /* Ensure at least one probe for the current name was found.  */
>>>>>>>
>>>>>>> base-commit: 84f9fbe90e5429adb9dee68f04f44c92fa9e2345
>>>>>>> -- 
>>>>>>> 2.25.4
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I looked at this separately, and this was one of the fixes I considered.
>>>>>>
>>>>>> Another option was to make GDB not give up on the probes interface if
>>>>>> failing to look up a probe whose action is DO_NOTHING.  Probes with that
>>>>>> action are not used by GDB for solib bookkeeping, but can be used to
>>>>>> stop on solib events, with "set stop-on-solib-events".  I was just
>>>>>> worried if there was some cases where a probe would be missing, but the
>>>>>> corresponding event could be caught if using the original interface.  In
>>>>>> that case, using the probes interface would be a regression.  But it's
>>>>>> probably not worth wondering about.  If that happens it's just a bug
>>>>>> that needs to be fixed.  In the case we are looking at, if the
>>>>>> map_failed probe gets optimized out, then surely the corresponding call
>>>>>> to the r_brk function would also be optimized out.
>>>>>
>>>>> I also considered just ignoring any probe was (a) missing, and (b) had a
>>>>> DO_NOTHING action.  The reason I didn't post this patch was because, at
>>>>> the time, my thinking was: if we don't care about any probe with a
>>>>> DO_NOTHING action, why even look for those probes, why not just remove
>>>>> them from the list?
>>>>>
>>>>> I think you've (partially) convinced me that the user might be
>>>>> interested in seeing a stop at these probes even if GDB's action is
>>>>> DO_NOTHING.
>>>>>
>>>>> I say partially above because GDB doesn't really do anything to tell the
>>>>> user which probe we stopped at, e.g. was is "init_start", "map_start",
>>>>> "map_failed", etc.  The user might be able to figure it out from the
>>>>> backtrace, but I still think it's not going to be trivial in all cases,
>>>>> e.g. "map_start" and "map_failed" are both located in the same function,
>>>>> so I think the user would need to lookup the probe address in the ELF,
>>>>> then compare that to the stop address.  Not impossible, but, I suspect,
>>>>> the complexity is an indication that users are not doing this much.
>>>>> Thus, I suspect, in reality, nobody really cares about the DO_NOTHING
>>>>> probes.
>>>>>
>>>>> However, I think there is enough of a justification there for keeping
>>>>> the probes in the list, and just skipping any DO_NOTHING probes that
>>>>> don't exist.
>>>>>
>>>>> Below then, is an alternative patch.  I don't have a strong preference
>>>>> between this one, and the original[1], but I thought I'd post this for
>>>>> discussion.  If this is preferred then I can just merge this.
>>>>>
>>>>> [1] I'll also post an update to the original patch shortly that
>>>>> addresses Lancelot's feedback.
>>>>>
>>>>> Thanks,
>>>>> Andrew
>>>>>
>>>>> ---
>>>>>
>>>>> commit 11be1f25f446e68c23d0709cde46e32ff24b7eb9
>>>>> Author: Andrew Burgess <aburgess@redhat.com>
>>>>> Date:   Tue Nov 22 12:45:56 2022 +0000
>>>>>
>>>>>       gdb: relax requirement for the map_failed stap probe to be present
>>>>>       From glibc 2.35 and later, the "map_failed" stap probe is no longer
>>>>>       included in glibc.  The removal of the probe looks like an accident,
>>>>>       but it was caused by a glibc commit which meant that the "map_failed"
>>>>>       probe could no longer be reached; the compiler than helpfully
>>>>>       optimised out the probe.
>>>>>       In GDB, in solib-svr4.c, we have a list of probes that we look for
>>>>>       related to the shared library loading detection.  If any of these
>>>>>       probes are missing then GDB will fall back to the non-probe based
>>>>>       mechanism for detecting shared library loading.  The "map_failed"
>>>>>       probe is include in the list of required probes.
>>>>>       This means that on glibc 2.35 (or later) systems, GDB is going to
>>>>>       always fall back to the non-probes based mechanism for detecting
>>>>>       shared library loading.
>>>>>       I raised a glibc bug to discuss this issue:
>>>>>         https://sourceware.org/bugzilla/show_bug.cgi?id=29818
>>>>>       But, whatever the ultimate decision from the glibc team, given there
>>>>>       are version of glibc in the wild without the "map_failed" probe, we
>>>>>       probably should update GDB to handle this situation.
>>>>>       The "map_failed" probe is already a little strange, very early
>>>>>       versions of glibc didn't include this probe, so, in some cases, if
>>>>>       this probe is missing GDB is happy to ignore it.  This is fine, the
>>>>>       action associated with this probe inside GDB is DO_NOTHING, this means
>>>>>       the probe isn't actually required in order for GDB to correctly detect
>>>>>       the loading of shared libraries.
>>>>>       In this commit I propose changing the rules so that any probe whose
>>>>>       action is DO_NOTHING, is optional.
>>>>>       There is one possible downside to this change, and that concerns 'set
>>>>>       stop-on-solib-events on'.  If a probe is removed from glibc, but the
>>>>>       old style breakpoint based mechanism is still in place within glibc
>>>>>       for that same event, then GDB will stop when using the old style
>>>>>       non-probe based mechanism, but not when using the probes based
>>>>>       mechanism.
>>>>>       For the map_failed case this is not a problem, both the map_failed
>>>>>       probe, and the call to the old style breakpoint location were
>>>>>       optimised out, and so neither event (probes based, or breakpoint
>>>>>       based) will trigger.  This would only become an issue if glibc removed
>>>>>       a probe, but left the breakpoint in place (this would almost certainly
>>>>>       be a bug in glibc).
>>>>>       For now, I'm proposing that we just don't worry about this.  Because
>>>>>       some probes have actions that are not DO_NOTHING, then we know the
>>>>>       user will always seem _some_ stops when a shared library is
>>>>>       loaded/unloaded, and (I'm guessing), in most cases, that's all they
>>>>>       care about.  I figure when someone complains then we can figure out
>>>>>       what the right solution is then.
>>>>>       With this commit in place, then, when using a glibc 2.35 or later
>>>>>       system, GDB will once again use the stap probes for shared library
>>>>>       detection.
>>>>>
>>>>> diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
>>>>> index 6acaf87960b..10e446af908 100644
>>>>> --- a/gdb/solib-svr4.c
>>>>> +++ b/gdb/solib-svr4.c
>>>>> @@ -2205,15 +2205,34 @@ svr4_find_and_create_probe_breakpoints (svr4_info *info,
>>>>>          probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
>>>>> -      /* The "map_failed" probe did not exist in early
>>>>> -     versions of the probes code in which the probes'
>>>>> -     names were prefixed with "rtld_".  */
>>>>> -      if (with_prefix && streq (name, "rtld_map_failed"))
>>>>> -    continue;
>>>>> -
>>>>>          /* Ensure at least one probe for the current name was found.  */
>>>>>          if (probes[i].empty ())
>>>>> -    return false;
>>>>> +    {
>>>>> +      /* The "map_failed" probe did not exist in early versions of the
>>>>> +         probes code in which the probes' names were prefixed with
>>>>> +         "rtld_".
>>>>> +
>>>>> +         Additionally, the "map_failed" probe was accidentally removed
>>>>> +         from glibc 2.35 and later, when changes in glibc meant the
>>>>> +         probe could no longer be reached, and the compiler optimized
>>>>> +         the probe away.  In this case the probe name doesn't have the
>>>>> +         "rtld_" prefix.
>>>>> +
>>>>> +         To handle this, and give GDB as much flexibility as possible,
>>>>> +         we make the rule that, if a probe isn't required for the
>>>>> +         correct operation of GDB (i.e. it's action is DO_NOTHING),
>>>>> +         then we will still use the probes interface, even if that
>>>>> +         probe is missing.
>>>>> +
>>>>> +         The only (possible) downside of this is that, if the user has
>>>>> +         'set stop-on-solib-events on' in effect, then they might get
>>>>> +         fewer events using the probes interface than with the classic
>>>>> +         non-probes interface.  */
>>>>> +      if (prove_info[i].action == DO_NOTHING)
>>>>> +        continue;
>>>>> +      else
>>>>> +        return false;
>>>>> +    }
>>>>>          /* Ensure probe arguments can be evaluated.  */
>>>>>          for (probe *p : probes[i])
>>>>
>>>
>
  
Andrew Burgess Dec. 5, 2022, 12:04 p.m. UTC | #13
Luis Machado <luis.machado@arm.com> writes:

> On 12/5/22 10:09, Andrew Burgess wrote:
>> Luis Machado <luis.machado@arm.com> writes:
>> 
>>> On 11/29/22 08:27, Luis Machado wrote:
>>>> Hi Andrew,
>>>>
>>>> This seems to have broken armhf on Ubuntu 22.04.
>>>>
>>>> https://builder.sourceware.org/buildbot/#/builders/169/builds/1163
>>>>
>>>
>>> I haven't investigated this. I only spotted it in the sourceware buildbot page. But I'd guess it is something to do
>>> with thumb mode detection early in the startup.
>>>
>>> Ubuntu has moved to not stripping ld.so for armhf because the probe mechanism is not capable of conveying the thumb mode
>>> information, so gdb has to rely on symbols instead. If the changes have touched this fragile area, it may have broken the
>>> delicate balance.
>> 
>> I somehow missed this last week.  I'll take a look today.
>> 
>> Sorry for the breakage.
>
> No problem. For the record, I haven't managed to reproduce this on my Ubuntu 20.04/22.04 setup. I'll have to give it a try on the
> buildbot machine.
>
> Just a heads-up in case this doesn't reproduce for you either.

I don't have immediate access to a suitable ARM machine.  I'm trying to
set something up that might work, but given what you report, I'm not too
hopeful.

My guess for what is happening though is this:

  - Previously, when Ubuntu was using glibc 2.35 / 2.36, one of the
    shared library probes was missing.  As a result GDB would fall back
    to use the symbol/breakpoint approach.  This worked out just fine
    for the ARM case as is appears we shouldn't be using the probes
    anyway.

  - After my patch GDB no longer cares about the missing probe, and so
    started trying to use the probes based interface, as this is the
    preferred strategy.  Unfortunately for ARM this doesn't work :/

I guess the ideal solution would be that glibc didn't include probe
information if those probes can't then be used for some reason.  I guess
that will need someone who understands the problem to either raise a
glibc bug, or propose a glibc fix.

Until then we probably need to consider GDB based solutions.

Currently we don't have anyway to avoid particular probes on a
per-architecture basis, this will probably need a new gdbarch hook
adding.  I'll see if I can write something like this and test it on the
buildbot.


Thanks,
Andrew
  
Luis Machado Dec. 5, 2022, 12:55 p.m. UTC | #14
On 12/5/22 12:04, Andrew Burgess wrote:
> Luis Machado <luis.machado@arm.com> writes:
> 
>> On 12/5/22 10:09, Andrew Burgess wrote:
>>> Luis Machado <luis.machado@arm.com> writes:
>>>
>>>> On 11/29/22 08:27, Luis Machado wrote:
>>>>> Hi Andrew,
>>>>>
>>>>> This seems to have broken armhf on Ubuntu 22.04.
>>>>>
>>>>> https://builder.sourceware.org/buildbot/#/builders/169/builds/1163
>>>>>
>>>>
>>>> I haven't investigated this. I only spotted it in the sourceware buildbot page. But I'd guess it is something to do
>>>> with thumb mode detection early in the startup.
>>>>
>>>> Ubuntu has moved to not stripping ld.so for armhf because the probe mechanism is not capable of conveying the thumb mode
>>>> information, so gdb has to rely on symbols instead. If the changes have touched this fragile area, it may have broken the
>>>> delicate balance.
>>>
>>> I somehow missed this last week.  I'll take a look today.
>>>
>>> Sorry for the breakage.
>>
>> No problem. For the record, I haven't managed to reproduce this on my Ubuntu 20.04/22.04 setup. I'll have to give it a try on the
>> buildbot machine.
>>
>> Just a heads-up in case this doesn't reproduce for you either.
> 
> I don't have immediate access to a suitable ARM machine.  I'm trying to
> set something up that might work, but given what you report, I'm not too
> hopeful.
> 
> My guess for what is happening though is this:
> 
>    - Previously, when Ubuntu was using glibc 2.35 / 2.36, one of the
>      shared library probes was missing.  As a result GDB would fall back
>      to use the symbol/breakpoint approach.  This worked out just fine
>      for the ARM case as is appears we shouldn't be using the probes
>      anyway.
> 
>    - After my patch GDB no longer cares about the missing probe, and so
>      started trying to use the probes based interface, as this is the
>      preferred strategy.  Unfortunately for ARM this doesn't work :/

It is not exactly that ARM can't use the probe mechanism, but rather that when using the probe mechanism ARM gdb needs to be able to
find some dynamic linker symbols to determine the thumb mode. I'll let you know what I find when trying things on the buildbot
machine.

> 
> I guess the ideal solution would be that glibc didn't include probe
> information if those probes can't then be used for some reason.  I guess
> that will need someone who understands the problem to either raise a
> glibc bug, or propose a glibc fix.
> 
> Until then we probably need to consider GDB based solutions.
> 
> Currently we don't have anyway to avoid particular probes on a
> per-architecture basis, this will probably need a new gdbarch hook
> adding.  I'll see if I can write something like this and test it on the
> buildbot.
> 
> 
> Thanks,
> Andrew
>
  

Patch

diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index 6acaf87960b..87cd06f251a 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -2205,10 +2205,15 @@  svr4_find_and_create_probe_breakpoints (svr4_info *info,
 
       probes[i] = find_probes_in_objfile (os->objfile, "rtld", name);
 
-      /* The "map_failed" probe did not exist in early
-	 versions of the probes code in which the probes'
-	 names were prefixed with "rtld_".  */
-      if (with_prefix && streq (name, "rtld_map_failed"))
+      /* The "map_failed" probe did not exist in early versions of the
+	 probes code in which the probes' names were prefixed with
+	 "rtld_".
+
+	 Additionally, the "map_failed" probe was accidentally removed
+	 from glibc 2.35 and later, when changes in glibc meant the probe
+	 could no longer be reached.  In this case the probe name doesn't
+	 have the "rtld_" prefix.  */
+      if (streq (probe_info[i].name, "map_failed"))
 	continue;
 
       /* Ensure at least one probe for the current name was found.  */