Cell multi-arch broken (Re: [PATCH 2/2] GNU/Linux: Stop using libthread_db/td_ta_thr_iter)

Message ID 20150826190157.EE3D939FA@oc7340732750.ibm.com
State New, archived

Commit Message

Ulrich Weigand Aug. 26, 2015, 7:01 p.m. UTC
Pedro Alves wrote:
> On 08/26/2015 06:39 PM, Ulrich Weigand wrote:
> 
> > In fact, it is so broken that the test suite assumes we're not even
> > on a Cell/B.E. (since it can't debug the trivial test program), and
> > silently skips all Cell tests, so I didn't notice in the daily build
> > reports ...
> 
> (Sounds like the testsuite could be improved to better detect this.)

Yes, I think I'll at least set the test case to UNRESOLVED if
something unexpected happens while attempting to detect whether
we have Cell/B.E. hardware.

> > The reason why we're running into the abort is that the multi-arch
> > debugging logic attempts to resolve a thread-local variable from
> > inside the frame unwinders (which is probably not done elsewhere).
> > This uncovers a code path where the above assertion is wrong:
> 
> Curious.  Could you point me at this code path?  I can't seem
> to find it.  I wonder whether we can trigger this assertion
> by stopping the inferior before thread_db is initialized (e.g.,
> entry point), and then trying to print a tls variable?  (In order
> to construct a test case).

This is where it triggers for me:

#0  internal_error (file=0x107e0ce0 "/home/uweigand/fsf/binutils-gdb/gdb/linux-thread-db.c", line=1663, fmt=0x107dacd8 "%s: Assertion `%s' failed.")
    at /home/uweigand/fsf/binutils-gdb/gdb/common/errors.c:51
#1  0x00000000100ee6e0 in find_new_threads_once (info=0x10f00de0, iteration=0, errp=0xfffffc6d380) at /home/uweigand/fsf/binutils-gdb/gdb/linux-thread-db.c:1663
#2  0x00000000100ee9ac in thread_db_find_new_threads_2 (ptid=..., until_no_new=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/linux-thread-db.c:1729
#3  0x00000000100eec10 in thread_db_get_thread_local_address (ops=0x10a59fa8, ptid=..., lm=552219129840, offset=4398046642176)
    at /home/uweigand/fsf/binutils-gdb/gdb/linux-thread-db.c:1856
#4  0x00000000102b0cfc in delegate_get_thread_local_address (self=<value optimized out>, arg1=..., arg2=<value optimized out>, arg3=<value optimized out>)
    at /home/uweigand/fsf/binutils-gdb/gdb/target-delegates.c:1931
#5  0x00000000100c5540 in ppc_linux_spe_context (wordsize=0, byte_order=BFD_ENDIAN_BIG, n=0, id=0xfffffc6d660, npc=0xfffffc6d664)
    at /home/uweigand/fsf/binutils-gdb/gdb/ppc-linux-tdep.c:1205
#6  0x00000000100c58f4 in ppu2spu_sniffer (self=<value optimized out>, this_frame=0x10c35fe0, this_prologue_cache=0x10c35ff8)
    at /home/uweigand/fsf/binutils-gdb/gdb/ppc-linux-tdep.c:1341
#7  0x00000000103a276c in frame_unwind_try_unwinder (this_frame=0x10c35fe0, this_cache=0x10c35ff8, unwinder=0x10942b70) at /home/uweigand/fsf/binutils-gdb/gdb/frame-unwind.c:105
#8  0x00000000103a28dc in frame_unwind_find_by_frame (this_frame=0x10c35fe0, this_cache=0x10c35ff8) at /home/uweigand/fsf/binutils-gdb/gdb/frame-unwind.c:160
#9  0x00000000103a0f70 in compute_frame_id (this_frame=0x10c35f10) at /home/uweigand/fsf/binutils-gdb/gdb/frame.c:454
#10 get_prev_frame_if_no_cycle (this_frame=0x10c35f10) at /home/uweigand/fsf/binutils-gdb/gdb/frame.c:1781
#11 0x00000000103a1260 in get_prev_frame_always_1 (this_frame=0x10c35f10) at /home/uweigand/fsf/binutils-gdb/gdb/frame.c:1955
#12 get_prev_frame_always (this_frame=0x10c35f10) at /home/uweigand/fsf/binutils-gdb/gdb/frame.c:1971
#13 0x00000000103a1788 in get_prev_frame (this_frame=0x10c35f10) at /home/uweigand/fsf/binutils-gdb/gdb/frame.c:2213
#14 0x00000000103a19ac in unwind_to_current_frame (ui_out=<value optimized out>, args=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/frame.c:1450
#15 0x000000001027707c in catch_exceptions_with_msg (func_uiout=<value optimized out>, func=@0x10a1b980: 0x103a1990 <unwind_to_current_frame>, func_args=0x10c35f10,
    gdberrmsg=0x0, mask=0) at /home/uweigand/fsf/binutils-gdb/gdb/exceptions.c:187
#16 0x000000001039f2a8 in get_current_frame () at /home/uweigand/fsf/binutils-gdb/gdb/frame.c:1489
#17 0x0000000010264408 in process_event_stop_test (ecs=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/infrun.c:5877
#18 0x0000000010265fac in handle_inferior_event_1 (ecs=0xfffffc6ea08) at /home/uweigand/fsf/binutils-gdb/gdb/infrun.c:5857
#19 handle_inferior_event (ecs=0xfffffc6ea08) at /home/uweigand/fsf/binutils-gdb/gdb/infrun.c:5053
#20 0x0000000010267978 in fetch_inferior_event (client_data=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/infrun.c:3785
#21 0x00000000102864b8 in inferior_event_handler (event_type=INF_REG_EVENT, client_data=0x0) at /home/uweigand/fsf/binutils-gdb/gdb/inf-loop.c:56
#22 0x00000000100f4bb8 in handle_target_event (error=<value optimized out>, client_data=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/linux-nat.c:4706
#23 0x0000000010284374 in handle_file_event (file_ptr=0x10ddc670, ready_mask=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/event-loop.c:708
#24 0x00000000102845dc in gdb_wait_for_event (block=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/event-loop.c:859
#25 0x00000000102849c8 in gdb_do_one_event () at /home/uweigand/fsf/binutils-gdb/gdb/event-loop.c:298
#26 0x0000000010284af4 in start_event_loop () at /home/uweigand/fsf/binutils-gdb/gdb/event-loop.c:347
#27 0x0000000010286408 in cli_command_loop (data=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/event-top.c:186
#28 0x000000001027ae28 in current_interp_command_loop () at /home/uweigand/fsf/binutils-gdb/gdb/interps.c:317
#29 0x000000001027bfc4 in captured_command_loop (data=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/main.c:318
#30 0x0000000010276e14 in catch_errors (func=@0x10a0c950: 0x1027bfa0 <captured_command_loop>, func_args=0x0, errstring=0x108d1078 "", mask=0)
    at /home/uweigand/fsf/binutils-gdb/gdb/exceptions.c:240
#31 0x000000001027cd84 in captured_main (data=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/main.c:1157
#32 0x0000000010276e14 in catch_errors (func=@0x10a0c980: 0x1027c270 <captured_main>, func_args=0xfffffc6f3c0, errstring=0x108d1078 "", mask=0)
    at /home/uweigand/fsf/binutils-gdb/gdb/exceptions.c:240
#33 0x000000001027ba48 in gdb_main (args=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/main.c:1165
#34 0x0000000010093644 in main (argc=<value optimized out>, argv=<value optimized out>) at /home/uweigand/fsf/binutils-gdb/gdb/gdb.c:32

I haven't really tried to construct a test case yet ...

> Try doing it like gdbserver's thread_db_get_tls_address.
> 
> ...
>   lwp = get_thread_lwp (thread);
>   if (!lwp->thread_known)
>     find_one_thread (thread->entry.id);
> ...
> 
> That is, here what we're really after is the td_thrhandle_t
> of the current thread, in order to be able to call
> td_thr_tls_get_addr.  There's no need to walk thread_db's
> thread list to find that for a single thread.
> 
> find_one_thread makes use of td_ta_map_lwp2thr for the
> mapping we're after.
> On the GDB side, the equivalent is linux-thread-db.c:thread_from_lwp.

Ah, indeed that works for me.  The attached patch also fixes the
problem for me.
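
For illustration only, the control flow of that fix can be modelled outside
GDB roughly as below.  This is a toy sketch, not GDB code: every toy_* name
is hypothetical, and toy_attach_one merely stands in for a
td_ta_map_lwp2thr-style lookup of one thread.  The point it tries to show is
that the TLS path only needs to fill in the one thread it was asked about,
without walking the whole thread list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread data a thread_db-style
   library would provide; this is not GDB's real structure.  */
struct toy_priv
{
  long handle;
};

struct toy_thread
{
  int lwp;
  struct toy_priv *priv;	/* NULL until the thread has been looked up.  */
  struct toy_priv storage;	/* Backing store for PRIV in this toy model.  */
};

#define TOY_MAX_THREADS 3

/* Threads already known to the core, but not yet to "thread_db".  */
struct toy_thread toy_threads[TOY_MAX_THREADS] = {
  { 100, NULL, { 0 } },
  { 101, NULL, { 0 } },
  { 102, NULL, { 0 } },
};

/* Counts single-thread lookups, so laziness can be observed.  */
int toy_lookups = 0;

/* Return the table entry for LWP, or NULL if there is none.  */
struct toy_thread *
toy_find_thread (int lwp)
{
  for (int i = 0; i < TOY_MAX_THREADS; i++)
    if (toy_threads[i].lwp == lwp)
      return &toy_threads[i];
  return NULL;
}

/* Attach per-thread data to exactly one thread.  A real debugger
   would ask the thread library here; we just make up a handle.  */
void
toy_attach_one (struct toy_thread *t)
{
  t->storage.handle = 0x1000 + t->lwp;
  t->priv = &t->storage;
  toy_lookups++;
}

/* Same shape as the patched function: find the thread, fill in its
   private data on demand, then look it up again before using it.
   Returns -1 if the thread is unknown.  */
long
toy_get_tls_handle (int lwp)
{
  struct toy_thread *t = toy_find_thread (lwp);

  /* We may not have discovered the thread yet.  */
  if (t != NULL && t->priv == NULL)
    {
      toy_attach_one (t);
      t = toy_find_thread (lwp);
    }

  if (t != NULL && t->priv != NULL)
    return t->priv->handle;

  return -1;
}
```

In this model, asking for thread 101 twice performs a single lookup and
leaves threads 100 and 102 untouched, which is the property that keeps the
frame-unwinder path away from the full thread-list walk.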

Bye,
Ulrich
  

Comments

Pedro Alves Aug. 26, 2015, 7:42 p.m. UTC | #1
On 08/26/2015 08:01 PM, Ulrich Weigand wrote:
> Pedro Alves wrote:

>> (Sounds like the testsuite could be improved to better detect this.)
> 
> Yes, I think I'll at least set the test case to UNRESOLVED if
> something unexpected happens while attempting to detect whether
> we have Cell/B.E. hardware.

Sounds good to me.

> This is where it triggers for me:
> 

Thanks, I see.

>> find_one_thread makes use of td_ta_map_lwp2thr for the
>> mapping we're after.
>> On the GDB side, the equivalent is linux-thread-db.c:thread_from_lwp.
> 
> Ah, indeed that works for me.  The attached patch also fixes the
> problem for me.
> 

LGTM.

Thanks,
Pedro Alves
  

Patch

Index: binutils-gdb/gdb/linux-thread-db.c
===================================================================
--- binutils-gdb.orig/gdb/linux-thread-db.c
+++ binutils-gdb/gdb/linux-thread-db.c
@@ -1851,13 +1851,16 @@  thread_db_get_thread_local_address (stru
   struct thread_info *thread_info;
   struct target_ops *beneath;
 
-  /* If we have not discovered any threads yet, check now.  */
-  if (!have_threads (ptid))
-    thread_db_find_new_threads_1 (ptid);
-
   /* Find the matching thread.  */
   thread_info = find_thread_ptid (ptid);
 
+  /* We may not have discovered the thread yet.  */
+  if (thread_info != NULL && thread_info->priv == NULL)
+    {
+      thread_from_lwp (ptid);
+      thread_info = find_thread_ptid (ptid);
+    }
+
   if (thread_info != NULL && thread_info->priv != NULL)
     {
       td_err_e err;