Message ID | 20150325191418.GA32233@adacore.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 34114 invoked by alias); 25 Mar 2015 19:14:25 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 34105 invoked by uid 89); 25 Mar 2015 19:14:24 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.8 required=5.0 tests=AWL, BAYES_00, LIKELY_SPAM_BODY autolearn=no version=3.3.2 X-HELO: rock.gnat.com Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Wed, 25 Mar 2015 19:14:22 +0000 Received: from localhost (localhost.localdomain [127.0.0.1]) by filtered-rock.gnat.com (Postfix) with ESMTP id 56EF5D3867; Wed, 25 Mar 2015 15:14:20 -0400 (EDT) Received: from rock.gnat.com ([127.0.0.1]) by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 3oGx6AJKL4Rh; Wed, 25 Mar 2015 15:14:20 -0400 (EDT) Received: from joel.gnat.com (localhost.localdomain [127.0.0.1]) by rock.gnat.com (Postfix) with ESMTP id 14FCAD368C; Wed, 25 Mar 2015 15:14:20 -0400 (EDT) Received: by joel.gnat.com (Postfix, from userid 1000) id BF7C340EAD; Wed, 25 Mar 2015 12:14:18 -0700 (PDT) Date: Wed, 25 Mar 2015 12:14:18 -0700 From: Joel Brobecker <brobecker@adacore.com> To: Sergio Durigan Junior <sergiodj@redhat.com> Cc: "Jose E. Marchesi" <jose.marchesi@oracle.com>, gdb-patches@sourceware.org Subject: Re: [PATCH V4 5/9] New probe type: DTrace USDT probes. Message-ID: <20150325191418.GA32233@adacore.com> References: <1422874968-382-1-git-send-email-jose.marchesi@oracle.com> <1422874968-382-6-git-send-email-jose.marchesi@oracle.com> <87r3tp722i.fsf@redhat.com> MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="9amGYk9869ThD9tj" Content-Disposition: inline In-Reply-To: <87r3tp722i.fsf@redhat.com> User-Agent: Mutt/1.5.21 (2010-09-15) |
Commit Message
Joel Brobecker
March 25, 2015, 7:14 p.m. UTC
Jose, > > 2015-02-02 Jose E. Marchesi <jose.marchesi@oracle.com> > > > > * breakpoint.c (BREAK_ARGS_HELP): help string updated to mention > > the -probe-dtrace new vpossible value for PROBE_MODIFIER. > > * configure.ac (CONFIG_OBS): dtrace-probe.o added if BFD can > > handle ELF files. > > * Makefile.in (SFILES): dtrace-probe.c added. > > * configure: Regenerate. > > * dtrace-probe.c: New file. > > (SHT_SUNW_dof): New constant. > > (dtrace_probe_type): New enum. > > (dtrace_probe_arg): New struct. > > (dtrace_probe_arg_s): New typedef. > > (struct dtrace_probe_enabler): New struct. > > (dtrace_probe_enabler_s): New typedef. > > (dtrace_probe): New struct. > > (dtrace_probe_is_linespec): New function. > > (dtrace_dof_sect_type): New enum. > > (dtrace_dof_dofh_ident): Likewise. > > (dtrace_dof_encoding): Likewise. > > (DTRACE_DOF_ENCODE_LSB): Likewise. > > (DTRACE_DOF_ENCODE_MSB): Likewise. > > (dtrace_dof_hdr): New struct. > > (dtrace_dof_sect): Likewise. > > (dtrace_dof_provider): Likewise. > > (dtrace_dof_probe): Likewise. > > (DOF_UINT): New macro. > > (DTRACE_DOF_PTR): Likewise. > > (DTRACE_DOF_SECT): Likewise. > > (dtrace_process_dof_probe): New function. > > (dtrace_process_dof): Likewise. > > (dtrace_build_arg_exprs): Likewise. > > (dtrace_get_arg): Likewise. > > (dtrace_get_probes): Likewise. > > (dtrace_get_probe_argument_count): Likewise. > > (dtrace_can_evaluate_probe_arguments): Likewise. > > (dtrace_evaluate_probe_argument): Likewise. > > (dtrace_compile_to_ax): Likewise. > > (dtrace_probe_destroy): Likewise. > > (dtrace_gen_info_probes_table_header): Likewise. > > (dtrace_gen_info_probes_table_values): Likewise. > > (dtrace_probe_is_enabled): Likewise. > > (dtrace_probe_ops): New variable. > > (info_probes_dtrace_command): New function. > > (_initialize_dtrace_probe): Likewise. > > (dtrace_type_name): Likewise. Unfortunately, this patch is breaking GDB completely on Solaris. The first issue I started investigating is the following, which happens with nearly any unthreaded program: (gdb) start Temporary breakpoint 1 at 0x80593bc: file simple_main.adb, line 4. Starting program: /[...]/simple_main [Thread debugging using libthread_db enabled] No definition of "mutex_t" in current context. The error happens in... > > +static void > > +dtrace_process_dof_probe (struct objfile *objfile, ... and more precisely... > > + /* Store argument type descriptions. A description of the type > > + of the argument is in the (J+1)th null-terminated string > > + starting at 'strtab' + 'probe->dofpr_nargv'. */ > > + ret->args = NULL; > > + p = strtab + DOF_UINT (dof, probe->dofpr_nargv); > > + for (j = 0; j < ret->probe_argc; j++) > > + { > > + struct dtrace_probe_arg arg; > > + struct expression *expr; > > + > > + arg.type_str = xstrdup (p); > > + > > + /* Use strtab_size as a sentinel. */ > > + while (*p++ != '\0' && p - strtab < strtab_size); > > + > > + /* Try to parse a type expression from the type string. If > > + this does not work then we set the type to `long > > + int'. */ > > + arg.type = builtin_type (gdbarch)->builtin_long; > > + expr = parse_expression (arg.type_str); ^^^^^^^^^^^^^^^^ there The most obvious issue is that type "mutex_t" does not exist in my program, so I think that the code should handle that case. The second, less obvious issue, is that the parsing is done using the current language, which I don't think is a good idea. Attached is small prototype that attempts to do something like that. I think we need a function that parses an expression using a specific language, but I went for short prototype to test my theory first. That solves most of the issues, even if I don't really know if it's not just sweeping them under the carpet instead. And once I had that fixed, the next issue that I looked at was: (gdb) b adainit Breakpoint 1 at 0x8051f03 (gdb) run Starting program: /[...]/a [Thread debugging using libthread_db enabled] zsh: 12378 segmentation fault (core dumped) /[...]/gdb -q a This is where I'm getting even more out of my league, here. The SEGV happens on the following line: 377 uint32_t enabler_offset 378 = ((uint32_t *) eofftab)[DOF_UINT (dof, probe->dofpr_enoffidx) + i]; ... and eofftab seems to have a bogus value, because: (gdb) p eofftab $6 = 0x9479fb2 <error: Cannot access memory at address 0x9479fb2> For me to investigate further, I think I'd have to investigate how probes work and what's where, and I don't really have the time or interest at the moment. So, I'm hoping that you have access to some Solaris systems and you could look into this? I can send you reproducers, if you need. I investigated the issue on x86-solaris 2.10, but the same happens on sparc-solaris 2.10 and sparc64-solaris 2.10. Thank you,
Comments
Hi Joel! > > + /* Try to parse a type expression from the type string. If > > + this does not work then we set the type to `long > > + int'. */ > > + arg.type = builtin_type (gdbarch)->builtin_long; > > + expr = parse_expression (arg.type_str); ^^^^^^^^^^^^^^^^ there The most obvious issue is that type "mutex_t" does not exist in my program, so I think that the code should handle that case. Certainly. The second, less obvious issue, is that the parsing is done using the current language, which I don't think is a good idea. Attached is small prototype that attempts to do something like that. I think we need a function that parses an expression using a specific language, but I went for short prototype to test my theory first. That solves most of the issues, even if I don't really know if it's not just sweeping them under the carpet instead. Well, the TRY..CATCH in your prototype would catch any error that may be thrown in parse_expression, and the `set_language' must take care of changing the language, so I would say that is enough... And once I had that fixed, the next issue that I looked at was: (gdb) b adainit Breakpoint 1 at 0x8051f03 (gdb) run Starting program: /[...]/a [Thread debugging using libthread_db enabled] zsh: 12378 segmentation fault (core dumped) /[...]/gdb -q a This is where I'm getting even more out of my league, here. The SEGV happens on the following line: 377 uint32_t enabler_offset 378 = ((uint32_t *) eofftab)[DOF_UINT (dof, probe->dofpr_enoffidx) + i]; I will debug that SIGSEGV in solaris, but the problem seems to be related to the DOF program embedded in your program, more than to the platform. Could you please send me your sparc-solaris reproducer? Salud!
> Well, the TRY..CATCH in your prototype would catch any error that may be > thrown in parse_expression, and the `set_language' must take care of > changing the language, so I would say that is enough... OK - I will send an updated patch that makes things a little cleaner. I didn't know whether it was OK to default to type long makes much sense when the probe says the parameter should be using type "mutex_t". > And once I had that fixed, the next issue that I looked at was: > > (gdb) b adainit > Breakpoint 1 at 0x8051f03 > (gdb) run > Starting program: /[...]/a > [Thread debugging using libthread_db enabled] > zsh: 12378 segmentation fault (core dumped) /[...]/gdb -q a > > This is where I'm getting even more out of my league, here. > The SEGV happens on the following line: > > 377 uint32_t enabler_offset > 378 = ((uint32_t *) eofftab)[DOF_UINT (dof, probe->dofpr_enoffidx) + i]; > > I will debug that SIGSEGV in solaris, but the problem seems to be > related to the DOF program embedded in your program, more than to the > platform. > > Could you please send me your sparc-solaris reproducer? Thanks for the offer to help! Sadly, the SEGV doesn't seem to happen on sparc-solaris, it seems. Once I apply the patch above, I pretty much get normal results back (yay!). So, the problem appears to be specific to x86-solaris. I didn't know the DOF program was embedded in the executable, but I suspect there is a problem in its contents. How do you think we should proceed? Thanks!
> Could you please send me your sparc-solaris reproducer?
Thanks for the offer to help! Sadly, the SEGV doesn't seem to
happen on sparc-solaris, it seems. Once I apply the patch above,
I pretty much get normal results back (yay!).
So, the problem appears to be specific to x86-solaris. I didn't know
the DOF program was embedded in the executable, but I suspect there is
a problem in its contents. How do you think we should proceed?
Hmm, this may be an endianness problem:
/* DOF supports two different encodings: MSB (big-endian) and LSB
(little-endian). The encoding is itself encoded in the DOF header.
The following function returns an unsigned value in the host
endianness. */
#define DOF_UINT(dof, field) \
extract_unsigned_integer ((gdb_byte *) &(field), \
sizeof ((field)), \
(((dof)->dofh_ident[DTRACE_DOF_ID_ENCODING] \
== DTRACE_DOF_ENCODE_MSB) \
? BFD_ENDIAN_BIG : BFD_ENDIAN_LITTLE))
We know for a fact that the implementation works properly on a
little-endian host (x86) and little-endian DOF data (generated by either
pdtrace or dtrace).
I would need to look at the x86-solaris binaries you are using. Maybe
dtrace on that platform is generating big-endian DOF data in the
little-endian platform and the implementation is not handling that
properly.
Hi Joel. > And once I had that fixed, the next issue that I looked at was: > > (gdb) b adainit > Breakpoint 1 at 0x8051f03 > (gdb) run > Starting program: /[...]/a > [Thread debugging using libthread_db enabled] > zsh: 12378 segmentation fault (core dumped) /[...]/gdb -q a > > This is where I'm getting even more out of my league, here. > The SEGV happens on the following line: > > 377 uint32_t enabler_offset > 378 = ((uint32_t *) eofftab)[DOF_UINT (dof, probe->dofpr_enoffidx) + i]; > > I will debug that SIGSEGV in solaris, but the problem seems to be > related to the DOF program embedded in your program, more than to the > platform. > > Could you please send me your sparc-solaris reproducer? Thanks for the offer to help! Sadly, the SEGV doesn't seem to happen on sparc-solaris, it seems. Once I apply the patch above, I pretty much get normal results back (yay!). So, the problem appears to be specific to x86-solaris. I didn't know the DOF program was embedded in the executable, but I suspect there is a problem in its contents. I just tried the following in a x86-solaris machine, using todays git GDB: $ uname -a SunOS solaris-x86 5.11 11.1 i86pc i386 i86pc $ cat > foo.d <<EOF provider test { probe progress__counter (char *, int); }; EOF $ cat > foo.c <<EOF #include "foo.h" int main () { char *name = "application"; int i = 0; while (i < 10) { i++; if (TEST_PROGRESS_COUNTER_ENABLED()) TEST_PROGRESS_COUNTER (name, i); } return 0; } EOF $ dtrace -h -s foo.d -o foo.h $ gcc -c foo.c $ dtrace -G -s foo.d -o foo-p.o foo.o $ gcc -o foo foo.o foo-p.o $ gdb foo [...] (gdb) info probes Type Provider Name Where Enabled Object dtrace test progress-counter 0x08051024 unknown /export/home/jose/binutils-gdb/build/foo (gdb) disas main Dump of assembler code for function main: 0x08050fec <+0>: push %ebp 0x08050fed <+1>: mov %esp,%ebp 0x08050fef <+3>: and $0xfffffff0,%esp 0x08050ff2 <+6>: sub $0x20,%esp 0x08050ff5 <+9>: movl $0x80514f0,0x18(%esp) 0x08050ffd <+17>: movl $0x0,0x1c(%esp) 0x08051005 <+25>: jmp 0x8051029 <main+61> 0x08051007 <+27>: addl $0x1,0x1c(%esp) 0x0805100c <+32>: xor %eax,%eax 0x0805100e <+34>: nop 0x0805100f <+35>: nop 0x08051010 <+36>: nop 0x08051011 <+37>: test %eax,%eax 0x08051013 <+39>: je 0x8051029 <main+61> 0x08051015 <+41>: mov 0x1c(%esp),%eax 0x08051019 <+45>: mov %eax,0x4(%esp) 0x0805101d <+49>: mov 0x18(%esp),%eax 0x08051021 <+53>: mov %eax,(%esp) 0x08051024 <+56>: nop 0x08051025 <+57>: nop 0x08051026 <+58>: nop 0x08051027 <+59>: nop 0x08051028 <+60>: nop 0x08051029 <+61>: cmpl $0x9,0x1c(%esp) 0x0805102e <+66>: jle 0x8051007 <main+27> 0x08051030 <+68>: mov $0x0,%eax 0x08051035 <+73>: leave 0x08051036 <+74>: ret End of assembler dump. (gdb) In this system DTrace generates little-endian DOF data, and it looks like GDB handles it just fine. I still think the problem you observed is related to endianness, but I cannot be sure without looking at the contents of the DOF program embedded in your binaries...
Hi Jose, > I just tried the following in a x86-solaris machine, using todays git > GDB: [...] > In this system DTrace generates little-endian DOF data, and it looks > like GDB handles it just fine. I still think the problem you observed > is related to endianness, but I cannot be sure without looking at the > contents of the DOF program embedded in your binaries... Sorry, I was going to send you an email this morning, and then got carried away by other duties... :-( I've been able to reproduce the problem, but when I checked again with a non-AdaCore debugger, the problem went away. I'm not really sure whether this indicates whether this is an issue caused by some of our local changes, or whether this is just a latent bug that gets triggered by those local changes. Either way, it felt like I would have no choice but to get more familiar with how those probes work. The problem is that I'm short on time again. But, if you have any kind of document describing how this data is laid out in the binary, I'm very interested. Sorry I waited so long before getting back to you!
I've been able to reproduce the problem, but when I checked again with a non-AdaCore debugger, the problem went away. I'm not really sure whether this indicates whether this is an issue caused by some of our local changes, or whether this is just a latent bug that gets triggered by those local changes. Either way, it felt like I would have no choice but to get more familiar with how those probes work. The problem is that I'm short on time again. But, if you have any kind of document describing how this data is laid out in the binary, I'm very interested. I don't know of any document describing the DOF layout. I would suggest you to take a look to the script gdb/testsuite/lib/pdtrace.in. It creates and populates DOF sections and it documents the structure of the generated data. Feel free to send me any question you may have regarding DOF. I will try to help even if I am not a DOF expert myself :)
diff --git a/gdb/dtrace-probe.c b/gdb/dtrace-probe.c index 491d853..5d5c5b2 100644 --- a/gdb/dtrace-probe.c +++ b/gdb/dtrace-probe.c @@ -414,6 +414,7 @@ dtrace_process_dof_probe (struct objfile *objfile, { struct dtrace_probe_arg arg; struct expression *expr; + struct cleanup *old_chain = NULL; /* Set arg.expr to ensure all fields in expr are initialized and the compiler will not warn when arg is used. */ @@ -427,8 +428,24 @@ dtrace_process_dof_probe (struct objfile *objfile, this does not work then we set the type to `long int'. */ arg.type = builtin_type (gdbarch)->builtin_long; - expr = parse_expression (arg.type_str); - if (expr->elts[0].opcode == OP_TYPE) + + if (current_language->la_language != language_c) + { + old_chain = make_cleanup_restore_current_language (); + set_language (language_c); + } + TRY + { + expr = parse_expression (arg.type_str); + } + CATCH (ex, RETURN_MASK_ERROR) + { + expr = NULL; + } + END_CATCH + if (old_chain != NULL) + do_cleanups (old_chain); + if (expr != NULL && expr->elts[0].opcode == OP_TYPE) arg.type = expr->elts[1].type; VEC_safe_push (dtrace_probe_arg_s, ret->args, &arg);