[RFA,2/5] Darwin: Handle unrelocated dyld.

Message ID 87lg7ysdpb.fsf@tromey.com
State New, archived
Headers

Commit Message

Tom Tromey Sept. 18, 2018, 9:22 p.m. UTC
  >>>>> "Simon" == Simon Marchi <simon.marchi@polymtl.ca> writes:

Simon> With this patch, there seems to be two way the notifier breakpoint can
Simon> be set: in darwin_solib_create_inferior_hook and in
Simon> darwin_handle_solib_event.  Can you explain why do we need both?

Simon> In particular, I don't understand when the one in
Simon> darwin_handle_solib_event can be useful.  There are two ways
Simon> darwin_handle_solib_event can be called:
[...]

Today I took Xavier's patch and instrumented it with some printfs.

What happens here is that at startup, darwin_solib_create_inferior_hook
clears the notifier_set flag:

+              notifier += dyld_relocated_base_address;
+              info->notifier_set = false;

... but it still creates an solib event breakpoint.

This later causes darwin_handle_solib_event to be called.

In my case the address of the solib event breakpoint is at the same
location in both cases.  But, there is also this comment:

+	 (Apparently dyld doesn't need to relocate itself on x86-64 darwin,
+	 but don't assume that).

What this says to me is that perhaps there is an architecture where
darwin_handle_solib_event computes a different solib breakpoint address.

My conclusion is that the patch is generally ok (certainly it works)
and, while this one part is unusual, it isn't fatally so.

>> +  /* True when a breakpoint has been added on dyld notifier.  */
>> +  unsigned int notifier_set;

Simon> Use bool?

I did this.

Simon> Also, from what I understand from the rest of the patch (I might be
Simon> wrong), this is also set if we have tried to add it but failed
Simon> (because we failed to get info about the dynamic loader, for example).
Simon> If this is right, can you update the comment to reflect this?

I don't think this is quite accurate; and the existing comments were
enough for me.

But there's one case where it sets notifier=0 -- this is the one I don't
understand.

>> +static gdb_bfd_ref_ptr
>> +darwin_get_dyld_bfd (void)

Simon> The comment above the function would need to be updated.

I did it.

>> {
>> char *interp_name;
>> -  CORE_ADDR load_addr = 0;
>> 
>> /* This method doesn't work with an attached process.  */
>> if (current_inferior ()->attach_flag)
>> -    return;
>> +    return NULL;

Simon> Should this last snippet (check for attach_flag) be here or in
Simon> darwin_solib_get_all_image_info_addr_at_init?  From what I understand,
Simon> there is nothing preventing darwin_get_dyld_bfd to work when working
Simon> with an attached process.

I didn't really understand this choice either, but all the calls to
darwin_get_dyld_bfd ultimately come from
darwin_solib_create_inferior_hook anyway, so maybe it is intentional.

>> +  CORE_ADDR notifier;

Simon> Style nit: you can declare variables where you are using them the
Simon> first time.

I made this change too.

Tom

commit 71b7becde98dd066de4e1e3a5142233c4dd16581
Author: Xavier Roirand <roirand@adacore.com>
Date:   Wed Aug 22 12:11:14 2018 +0200

    Darwin: Handle unrelocated dyld.
    
    On Darwin, debugging an helloworld program with GDB does
    not work and ends with:
    
      (gdb) set startup-with-shell off
      (gdb) start
      Temporary breakpoint 1 at 0x100000fb4: file /tmp/helloworld.c, line 1.
      Starting program: /private/tmp/helloworld
      [New Thread 0x2703 of process 18906]
      [New Thread 0x2603 of process 18906]
    
      [1]+  Stopped                 ./gdb/gdb /tmp/helloworld
    
    When debugging with lldb, instead of having the STOP signal, we can
    see that a breakpoint is not set to a proper location:
    
      Warning:
      Cannot insert breakpoint -1.
      Cannot access memory at address 0xf726
    
      Command aborted.
    
    The inserted breakpoint is the one used when GDB has to stop the target
    when a shared library is loaded or unloaded. The notifier address used
    for adding the breakpoint is wrong thus the above failure.
    This notifier address is an offset relative to dyld base address, so
    the value calculation has to be updated to reflect this.
    This patch fix this.
    
    gdb/ChangeLog:
    
            * solib-darwin.c (struct darwin_info): Add notifier_set field.
            (darwin_get_dyld_bfd): New function.
            (darwin_solib_get_all_image_info_addr_at_init): Update call.
            (darwin_handle_solib_event): New function.
            (darwin_solib_create_inferior_hook): Handle unrelocated dyld.
            (_initialize_darwin_solib): Add solib_event.
    
    Change-Id: I7dde5008c9158f17b78dc89bd7f4bd8a12d4a6e1
  

Comments

Joel Brobecker Sept. 19, 2018, 1:40 p.m. UTC | #1
> In my case the address of the solib event breakpoint is at the same
> location in both cases.  But, there is also this comment:
> 
> +	 (Apparently dyld doesn't need to relocate itself on x86-64 darwin,
> +	 but don't assume that).
> 
> What this says to me is that perhaps there is an architecture where
> darwin_handle_solib_event computes a different solib breakpoint address.
> 
> My conclusion is that the patch is generally ok (certainly it works)
> and, while this one part is unusual, it isn't fatally so.

I am wondering whether the difference in what you are seeing
might be explained by a difference in MacOS X version; if I were
to guess, I would say that Xavier was running on Mac OS X Sierra.
What version were you running on?

Or perhaps the intent is to be extra careful meaning that while
today the relocation is not necessary, we still handle it so that
it continues working the day it becomes so?

If the comment above is confusing, I would vote for removing it.
To me, this is like Windows, where DLLs have prefered base addresses
where they get loaded, but we still need to do the reloc just in case,
because the loader may have to load it elsewhere. So what this is
doing here is somewhat "classic".

That makes me realize (again) that, for MacOS X, we should be more
proactive at specificying which version a patch we are submitting
was tested on, and some information about which versions of MacOS X
a given patch helps. A fair amount of work that Tristan did once
the initial port was created was to adapt it to subsequent versions
of Darwin. Nearly every new version of Darwin introduced its new
set of changes requiring additional adaptations.
  
Simon Marchi Sept. 19, 2018, 2:14 p.m. UTC | #2
On 2018-09-19 09:40, Joel Brobecker wrote:
> That makes me realize (again) that, for MacOS X, we should be more
> proactive at specificying which version a patch we are submitting
> was tested on, and some information about which versions of MacOS X
> a given patch helps. A fair amount of work that Tristan did once
> the initial port was created was to adapt it to subsequent versions
> of Darwin. Nearly every new version of Darwin introduced its new
> set of changes requiring additional adaptations.

Given the low amount of resources we have to work on GDB for macOS and 
the fact that it's difficult for one person to test on different 
versions of macOS (you would need multiple Mac computers, AFAIK), I 
suggest that we only aim at supporting the last two released versions of 
macOS.  I say two, because there will always be an overlap after a new 
version is released, where some people use the new version and others 
have not upgraded yet.

If some people use older versions of macOS and want to contribute 
patches and help testing, that's fine too, but if everybody contributing 
is on latest or latest-1, we can't realistically ensure it works on 
previous versions.

That said, I agree that it would be very helpful if commit messages and 
comments in the code mentioned the version of macOS that was used at the 
time that commit message or comment was written.

Simon
  
Joel Brobecker Sept. 19, 2018, 2:28 p.m. UTC | #3
> Given the low amount of resources we have to work on GDB for macOS and the
> fact that it's difficult for one person to test on different versions of
> macOS (you would need multiple Mac computers, AFAIK), I suggest that we only
> aim at supporting the last two released versions of macOS.  I say two,
> because there will always be an overlap after a new version is released,
> where some people use the new version and others have not upgraded yet.

That makes a lot of sense to me.

If we agree, let's document that somewhere. I'm not sure where
the best place might be. Maybe this wiki page?

        https://sourceware.org/gdb/wiki/BuildingOnDarwin

> If some people use older versions of macOS and want to contribute patches
> and help testing, that's fine too, but if everybody contributing is on
> latest or latest-1, we can't realistically ensure it works on previous
> versions.
> 
> That said, I agree that it would be very helpful if commit messages and
> comments in the code mentioned the version of macOS that was used at the
> time that commit message or comment was written.
  
Tom Tromey Sept. 19, 2018, 2:36 p.m. UTC | #4
>>>>> "Joel" == Joel Brobecker <brobecker@adacore.com> writes:

>> +	 (Apparently dyld doesn't need to relocate itself on x86-64 darwin,
>> +	 but don't assume that).

Joel> I am wondering whether the difference in what you are seeing
Joel> might be explained by a difference in MacOS X version; if I were
Joel> to guess, I would say that Xavier was running on Mac OS X Sierra.
Joel> What version were you running on?

High Sierra.

Joel> Or perhaps the intent is to be extra careful meaning that while
Joel> today the relocation is not necessary, we still handle it so that
Joel> it continues working the day it becomes so?

I couldn't really say, I just concluded what I did based on the mention
of the specific architecture there.

Joel> If the comment above is confusing, I would vote for removing it.

I think we need more information.

If we remove the comment because we think it is untrue, then Simon's
original critique of the patch -- that there is no reason for a second
solib breakpoint -- seems correct.  And in this case we should remove a
chunk of the patch.

Based on my testing thus far, this would be fine.  But I don't know what
testing Xavier and Tristan did, or with what architectures.

Also my testing hasn't been exactly exhaustive.  I just try simple
things because today those break.

Now, for my purposes, it would be fine to land the more minimal patch.
That would not preclude adding this code back later.  If that seems ok,
I'm happy to do it.  I'd like to get this working so I can make some
progress on other patches, and anyway un-break the Mac port.

Joel> That makes me realize (again) that, for MacOS X, we should be more
Joel> proactive at specificying which version a patch we are submitting
Joel> was tested on, and some information about which versions of MacOS X
Joel> a given patch helps. A fair amount of work that Tristan did once
Joel> the initial port was created was to adapt it to subsequent versions
Joel> of Darwin. Nearly every new version of Darwin introduced its new
Joel> set of changes requiring additional adaptations.

Wholly agreed.

Tom
  
Simon Marchi Sept. 19, 2018, 2:44 p.m. UTC | #5
On 2018-09-19 10:36, Tom Tromey wrote:
>>>>>> "Joel" == Joel Brobecker <brobecker@adacore.com> writes:
> 
>>> +	 (Apparently dyld doesn't need to relocate itself on x86-64 darwin,
>>> +	 but don't assume that).
> 
> Joel> I am wondering whether the difference in what you are seeing
> Joel> might be explained by a difference in MacOS X version; if I were
> Joel> to guess, I would say that Xavier was running on Mac OS X Sierra.
> Joel> What version were you running on?
> 
> High Sierra.
> 
> Joel> Or perhaps the intent is to be extra careful meaning that while
> Joel> today the relocation is not necessary, we still handle it so that
> Joel> it continues working the day it becomes so?
> 
> I couldn't really say, I just concluded what I did based on the mention
> of the specific architecture there.
> 
> Joel> If the comment above is confusing, I would vote for removing it.
> 
> I think we need more information.
> 
> If we remove the comment because we think it is untrue, then Simon's
> original critique of the patch -- that there is no reason for a second
> solib breakpoint -- seems correct.  And in this case we should remove a
> chunk of the patch.
> 
> Based on my testing thus far, this would be fine.  But I don't know 
> what
> testing Xavier and Tristan did, or with what architectures.
> 
> Also my testing hasn't been exactly exhaustive.  I just try simple
> things because today those break.
> 
> Now, for my purposes, it would be fine to land the more minimal patch.
> That would not preclude adding this code back later.  If that seems ok,
> I'm happy to do it.  I'd like to get this working so I can make some
> progress on other patches, and anyway un-break the Mac port.
> 
> Joel> That makes me realize (again) that, for MacOS X, we should be 
> more
> Joel> proactive at specificying which version a patch we are submitting
> Joel> was tested on, and some information about which versions of MacOS 
> X
> Joel> a given patch helps. A fair amount of work that Tristan did once
> Joel> the initial port was created was to adapt it to subsequent 
> versions
> Joel> of Darwin. Nearly every new version of Darwin introduced its new
> Joel> set of changes requiring additional adaptations.
> 
> Wholly agreed.
> 
> Tom

I would vote for only checking in the code you know is necessary for 
now, otherwise it will just be more confusing in the future, trying to 
figure out what is needed and what isn't.

Simon
  
Joel Brobecker Sept. 19, 2018, 3:32 p.m. UTC | #6
> I would vote for only checking in the code you know is necessary for now,
> otherwise it will just be more confusing in the future, trying to figure out
> what is needed and what isn't.

I agree.
  

Patch

diff --git a/gdb/solib-darwin.c b/gdb/solib-darwin.c
index ed8e0c13365..2633dc04d43 100644
--- a/gdb/solib-darwin.c
+++ b/gdb/solib-darwin.c
@@ -77,6 +77,10 @@  struct darwin_info
 
   /* Gdb copy of dyld_all_info_infos.  */
   struct gdb_dyld_all_image_infos all_image;
+
+  /* True when a breakpoint has been added on dyld notifier, or when
+     an attempt to do so has been made and failed.  */
+  bool notifier_set;
 };
 
 /* Per-program-space data key.  */
@@ -429,23 +433,21 @@  gdb_bfd_mach_o_fat_extract (bfd *abfd, bfd_format format,
   return gdb_bfd_ref_ptr (result);
 }
 
-/* Extract dyld_all_image_addr when the process was just created, assuming the
-   current PC is at the entry of the dynamic linker.  */
+/* Return the BFD for the program interpreter.  */
 
-static void
-darwin_solib_get_all_image_info_addr_at_init (struct darwin_info *info)
+static gdb_bfd_ref_ptr
+darwin_get_dyld_bfd ()
 {
   char *interp_name;
-  CORE_ADDR load_addr = 0;
 
   /* This method doesn't work with an attached process.  */
   if (current_inferior ()->attach_flag)
-    return;
+    return NULL;
 
   /* Find the program interpreter.  */
   interp_name = find_program_interpreter ();
   if (!interp_name)
-    return;
+    return NULL;
 
   /* Create a bfd for the interpreter.  */
   gdb_bfd_ref_ptr dyld_bfd (gdb_bfd_open (interp_name, gnutarget, -1));
@@ -459,6 +461,18 @@  darwin_solib_get_all_image_info_addr_at_init (struct darwin_info *info)
       else
 	dyld_bfd.release ();
     }
+  return dyld_bfd;
+}
+
+/* Extract dyld_all_image_addr when the process was just created, assuming the
+   current PC is at the entry of the dynamic linker.  */
+
+static void
+darwin_solib_get_all_image_info_addr_at_init (struct darwin_info *info)
+{
+  CORE_ADDR load_addr = 0;
+  gdb_bfd_ref_ptr dyld_bfd (darwin_get_dyld_bfd ());
+
   if (dyld_bfd == NULL)
     return;
 
@@ -502,6 +516,37 @@  darwin_solib_read_all_image_info_addr (struct darwin_info *info)
   info->all_image_addr = extract_unsigned_integer (buf, len, BFD_ENDIAN_BIG);
 }
 
+/* Callback called on solib event.
+   Used when a breakpoint has to be added on the entry point, to wait for
+   dyld initialization.  */
+
+static void
+darwin_handle_solib_event (void)
+{
+  struct darwin_info *info = get_darwin_info ();
+
+  /* Nothing to do if notifier was already set.  */
+  if (info->notifier_set)
+    return;
+  info->notifier_set = true;
+
+  /* Remove breakpoint on entry, but not now (as our caller is iterating on
+     breakpoints).  */
+  remove_solib_event_breakpoints_at_next_stop ();
+
+  /* Reload dyld infos.  */
+  darwin_load_image_infos (info);
+
+  if (info->all_image.count == 0)
+    {
+      /* Not expected: no dylibs.  */
+      return;
+    }
+
+  /* Insert the real solib event on the dyld notifier.  */
+  create_solib_event_breakpoint (target_gdbarch (), info->all_image.notifier);
+}
+
 /* Shared library startup support.  See documentation in solib-svr4.c.  */
 
 static void
@@ -528,10 +573,6 @@  darwin_solib_create_inferior_hook (int from_tty)
       return;
     }
 
-  /* Add the breakpoint which is hit by dyld when the list of solib is
-     modified.  */
-  create_solib_event_breakpoint (target_gdbarch (), info->all_image.notifier);
-
   if (info->all_image.count != 0)
     {
       /* Possible relocate the main executable (PIE).  */
@@ -558,6 +599,54 @@  darwin_solib_create_inferior_hook (int from_tty)
       if (vmaddr != load_addr)
 	objfile_rebase (symfile_objfile, load_addr - vmaddr);
     }
+
+  /* Set solib notifier (to reload list of shared libraries).  */
+  CORE_ADDR notifier = info->all_image.notifier;
+  info->notifier_set = true;
+
+  if (info->all_image.count == 0)
+    {
+      /* Dyld hasn't yet relocated itself, so the notifier address may be
+	 incorrect (as it has to be relocated).
+	 (Apparently dyld doesn't need to relocate itself on x86-64 darwin,
+	 but don't assume that).
+	 Set an event breakpoint at the entry point.  */
+      CORE_ADDR start = bfd_get_start_address (exec_bfd);
+      if (start == 0)
+	notifier = 0;
+      else
+        {
+          gdb_bfd_ref_ptr dyld_bfd (darwin_get_dyld_bfd ());
+          if (dyld_bfd != NULL)
+            {
+              CORE_ADDR dyld_bfd_start_address;
+              CORE_ADDR dyld_relocated_base_address;
+              CORE_ADDR pc;
+
+              dyld_bfd_start_address = bfd_get_start_address (dyld_bfd.get());
+
+              /* We find the dynamic linker's base address by examining
+                 the current pc (which should point at the entry point
+                 for the dynamic linker) and subtracting the offset of
+                 the entry point.  */
+
+              pc = regcache_read_pc (get_current_regcache ());
+              dyld_relocated_base_address = pc - dyld_bfd_start_address;
+
+              /* We get the proper notifier relocated address by
+                 adding the dyld relocated base address to the current
+                 notifier offset value.  */
+
+              notifier += dyld_relocated_base_address;
+              info->notifier_set = false;
+            }
+        }
+    }
+
+  /* Add the breakpoint which is hit by dyld when the list of solib is
+     modified.  */
+  if (notifier != 0)
+    create_solib_event_breakpoint (target_gdbarch (), notifier);
 }
 
 static void
@@ -658,4 +747,5 @@  _initialize_darwin_solib (void)
   darwin_so_ops.in_dynsym_resolve_code = darwin_in_dynsym_resolve_code;
   darwin_so_ops.lookup_lib_global_symbol = darwin_lookup_lib_symbol;
   darwin_so_ops.bfd_open = darwin_bfd_open;
+  darwin_so_ops.handle_event = darwin_handle_solib_event;
 }