Message ID | 556DB1BB.50601@codesourcery.com |
---|---|
State | New, archived |
Headers |
Message-ID: <556DB1BB.50601@codesourcery.com> Date: Tue, 2 Jun 2015 18:38:03 +0500 From: Taimoor <tmirza@codesourcery.com> To: "gdb-patches@sourceware.org" <gdb-patches@sourceware.org> Subject: Improving GDB's mechanism to check if function is GC'ed |
Commit Message
Taimoor Mirza
June 2, 2015, 1:38 p.m. UTC
Hi,

GDB currently uses the following mechanism to check whether a function has been GC'ed by the linker: for any function whose address is 0x0, if the 'textlow' field of the partial symbol table is not zero, the function is considered to be GC'ed by the linker. Below is the code doing this:

    case DW_LNE_set_address:
      address = read_address (abfd, line_ptr, cu, &bytes_read);

      /* If address < lowpc then it's not a usable value, it's
         outside the pc range of the CU.  However, we restrict
         the test to only address values of zero to preserve
         GDB's previous behaviour which is to handle the specific
         case of a function being GC'd by the linker.  */
      if (address == 0 && address < lowpc)
        {
          /* This line table is for a function which has been
             GCd by the linker.  Ignore it.  PR gdb/12528 */

This change was made in
https://sourceware.org/ml/gdb-patches/2014-08/msg00468.html

This does not work when symbols are loaded manually with the add-symbol-file command. An incrementally loaded objfile whose symbols are added with add-symbol-file can have a function at 0x0 in its debug info and still have a non-zero lowpc, because add-symbol-file lets the user provide the section addresses.

Current Problem
===============

We are currently using GDB to debug a Nucleus-based bare-metal system that can also dynamically load and unload Nucleus process modules during system execution.
We load the symbols of a module using add-symbol-file whenever the module is loaded at runtime. It is very common to have functions at address 0x0 in the debug information while lowpc in the symbol table is non-zero, since lowpc depends on the section addresses given in the add-symbol-file command.

With the GC'ing mechanism described above, GDB assumes that all these functions have been GC'ed by the linker. Because of this, breakpoints do not work properly in the debug session.

Possible Solution
=================

* Modify the GC checking mechanism so that it marks a function as GC'ed only if the objfile is not dynamically loaded. So, a function with address 0x0 will be marked GC'ed only if lowpc is not zero and the objfile is the main symbol file.

For this I have made the following modification to the if condition:

    if (address == 0 && address < lowpc
        && (objfile->flags & OBJF_MAINLINE))
      {

I have regression tested this change and it seems to work fine.
The only downside is that it is possible (though not common) to load the main symbol file using the add-symbol-file command. In that case, GDB will not check for GC'ed functions.

Attached is a patch to better highlight this solution.

I am open to any other suggestions for improving this GC'ing mechanism and solving this problem.

Thanks,
Taimoor
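To make the two conditions in the message above easier to compare, here is a minimal standalone sketch in C. It only models the boolean tests; it is not GDB code. The names `sketch_gc_check_current`, `sketch_gc_check_proposed` and `SKETCH_OBJF_MAINLINE` are hypothetical stand-ins introduced for illustration, and the flag value is arbitrary rather than the one GDB uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for GDB's OBJF_MAINLINE flag; the bit value is
   illustrative only and is not taken from the GDB sources.  */
#define SKETCH_OBJF_MAINLINE (1u << 0)

/* Current check as quoted in the message: a line-table address of zero
   that lies below the CU's lowpc is treated as a GC'ed function.  */
bool
sketch_gc_check_current (uint64_t address, uint64_t lowpc)
{
  return address == 0 && address < lowpc;
}

/* Proposed check: the same test, but applied only when the objfile is
   flagged as the main symbol file.  */
bool
sketch_gc_check_proposed (uint64_t address, uint64_t lowpc,
                          unsigned int objfile_flags)
{
  return address == 0 && address < lowpc
         && (objfile_flags & SKETCH_OBJF_MAINLINE) != 0;
}
```

With an address of 0 and a non-zero lowpc, the current check reports the function as GC'ed regardless of how the objfile was loaded, while the proposed check does so only when the mainline flag is set. That is also the trade-off raised in the message: a main symbol file loaded through add-symbol-file would no longer be checked.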
Comments
ping.

Yao,

As you implemented the current GC'ing mechanism, can you kindly take a look at the problem and proposed solution from my original message?

Thanks,
Taimoor
Hi Taimoor,

Sorry for the delayed reply.

On 09/06/15 07:16, Taimoor wrote:
> As you implemented current GC'ing mechanism, Can you kindly take a look
> at below mentioned problem and proposed solution.

I'll take a look next week when I am back in the office, if nobody reviews this patch before then. I am travelling for the Linaro Toolchain Sprint this week.
Taimoor <tmirza@codesourcery.com> writes:

Hi Taimoor,
I happen to have some time today to read your patch; here are my comments below.

> Current Problem
> ===============
>
> We are currently using GDB to debug Nucleus based bare-metal system
> that also allows to dynamically load and unload Nucleus process
> modules during system execution.
> We currently load symbols of a modules using add-symbol-file whenever
> a module is loaded at runtime. It is very common to have functions at
> address 0x0 in debug information and then lowpc in symbol table to be
> non-zero as it depends on section addresses given in add-symbol-file
> command.

GDB just uses some heuristics to determine whether a function has been GC'ed by the linker, so they may not be perfect. However, GDB doesn't support Nucleus, so it isn't a valid case to me. Do we have other cases, on platforms GDB supports, where we use add-symbol-file and a function address is at 0x0?

If the problem only exists on Nucleus, I am afraid I don't agree with accepting this change, because GDB doesn't support Nucleus. Sorry.

> Possible Solution
> =================
>
> * Modify GC checking mechanism to mark any function GC'ed using above
> mentioned mechanism only if objfile is not dynamically loaded. So, for
> any function with address 0x0, it'll be marked GC'ed only if lowpc is
> not zero and objfile is main symbol file.
>
> For this I have made following modifications in if condition:
>
>     if (address == 0 && address < lowpc
>         && (objfile->flags & OBJF_MAINLINE))
>       {
>
> I have regression tested this change and it seems to work fine.
> Only downside that it is possible (though not common) to load main
> symbol file using add-symbol-file command. In that case, GDB will not
> check for GC'ed functions.

With this change, GDB no longer ignores GC'ed functions from dynamically loaded objects, which is a regression to me.
On 06/10/2015 09:53 AM, Yao Qi wrote:
> If the problem only exists on Nucleus, I am afraid I don't agree with
> accepting this change, because GDB doesn't support Nucleus. Sorry.

Hmm, does it really need to, though? We expose mechanisms like add-symbol-file, xml library list with qXfer:libraries:read (the default solib provider), xml target descriptions, "info os", etc., exactly so that GDB doesn't have to learn about the myriad of random RTOS's out there.

That said, I don't really understand the patch. How can you have real code at address 0, but then _not_ have address 0 covered by a section?

Thanks,
Pedro Alves
On 06/10/2015 05:53 AM, Yao Qi wrote:
> GDB just uses some heuristics to determine whether the function is GC'ed
> by linker, so they may not be perfect. However, GDB doesn't support
> Nucleus, so it isn't a valid case to me. Do we have other cases that we
> add-symbol-file in which function address is at 0x0 on platforms GDB
> supports?
>
> If the problem only exists on Nucleus, I am afraid I don't agree with
> accepting this change, because GDB doesn't support Nucleus. Sorry.

add-symbol-file can cause things to get weird with addresses given the user can specify the base address as it pleases. I don't think this is Nucleus-specific at all, but more generally bare-metal-specific.

I take it DWARF says 0x0 and GDB relocates the symbol file/addresses based on the provided base address? Taimoor?
On 06/10/2015 04:18 PM, Luis Machado wrote:
> add-symbol-file can cause things to get weird with addresses given the
> user can specify the base address as it pleases. I don't think this is
> Nucleus-specific at all, but more generally bare-metal-specific.
>
> I take it DWARF says 0x0 and GDB relocates the symbol file/addresses
> based on the provided base address? Taimoor?

Yes. It's not something specific to Nucleus. The only issue is that the function address in DWARF is 0x0, because this object file is loaded at runtime and its symbols are added using the "add-symbol-file" command on the basis of the address space where it is loaded.

IMO, for any dynamic relocatable object, GDB provides a mechanism to load its symbols using add-symbol-file. But if that object file has a function at 0x0, the current mechanism considers it GC'ed.

Thanks,
Taimoor
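The situation described in this exchange can be sketched with a little address arithmetic in C. This is a toy model under stated assumptions, not GDB's implementation: it assumes the line-table address is still the raw DWARF value (0x0) when the test runs, that lowpc already reflects the section address given to add-symbol-file, and that the function sits at the lowest address of its CU. The helper names `sketch_relocate` and `sketch_flagged_as_gced` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: apply the load offset that the user supplied for
   a section to an unrelocated address from the debug info.  */
uint64_t
sketch_relocate (uint64_t unrelocated_addr, uint64_t section_offset)
{
  return unrelocated_addr + section_offset;
}

/* Toy model of the check as described in the thread: the raw line-table
   address is compared against a lowpc that already includes the load
   offset.  Assumes the function is the lowest address in its CU.  */
bool
sketch_flagged_as_gced (uint64_t raw_addr, uint64_t section_offset)
{
  uint64_t lowpc = sketch_relocate (raw_addr, section_offset);

  return raw_addr == 0 && raw_addr < lowpc;
}
```

Under these assumptions, a function at raw address 0x0 in a module loaded at any non-zero section address is flagged as GC'ed, while the same function loaded at offset 0 is not. Nothing in this model depends on the RTOS, which fits the point that the behaviour is tied to relocatable objects and add-symbol-file rather than to Nucleus.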
On 06/10/2015 03:17 PM, Pedro Alves wrote:
> Hmm, does it really need to, though? We expose mechanisms like
> add-symbol-file, xml library list with qXfer:libraries:read (the default
> solib provider), xml target descriptions, "info os", etc., exactly so
> that GDB doesn't have to learn about the myriad of random RTOS's out there.
>
> That said, I don't really understand the patch. How can you have
> real code at address 0, but then _not_ have address 0 covered
> by a section?

This code is a relocatable object that is dynamically loaded by a dynamic loader which is part of the loaded image. So you load your main kernel image, but you can load and unload relocatable objects to add further enhancements to it. This allows RTOS application developers to add new features or provide bug fixes in their applications without rebuilding and reflashing everything.

We currently use add-symbol-file for these relocatable objects. The previous GC'ing mechanism of GDB was working fine for us, as it used the flag has_section_at_zero to determine whether address zero in the debug info means the corresponding code has been GC'ed. But this mechanism was changed in
https://sourceware.org/ml/gdb-patches/2014-08/msg00468.html
and now functions at address 0x0 in the DWARF of a relocatable object are marked GC'ed by GDB.

Thanks,
Taimoor
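The difference between the earlier flag-based heuristic and the lowpc-based one can be put side by side in a short C sketch. The exact form of the older condition is an assumption here: the message only says that has_section_at_zero was used to decide whether address zero means GC'ed, so `sketch_gced_by_section_flag` is a guess at its shape, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed shape of the earlier heuristic: address zero is treated as
   GC'ed only when the objfile has no section starting at zero.  */
bool
sketch_gced_by_section_flag (uint64_t address, bool has_section_at_zero)
{
  return address == 0 && !has_section_at_zero;
}

/* The later heuristic quoted earlier in the thread: address zero is
   treated as GC'ed when it lies below the CU's lowpc.  */
bool
sketch_gced_by_lowpc (uint64_t address, uint64_t lowpc)
{
  return address == 0 && address < lowpc;
}
```

For a relocatable object whose sections start at zero on file, and whose lowpc is non-zero after add-symbol-file, the first test keeps the function while the second one discards it. That matches the regression reported in the message, as far as this simplified model goes.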
On 10/06/15 11:17, Pedro Alves wrote:
> Hmm, does it really need to, though? We expose mechanisms like
> add-symbol-file, xml library list with qXfer:libraries:read (the default
> solib provider), xml target descriptions, "info os", etc., exactly so
> that GDB doesn't have to learn about the myriad of random RTOS's out there.

If these things (add-symbol-file, xml library list, etc.) work well for the given RTOS, and GDB doesn't need to know about the RTOS, that is fine. However, in this case, Nucleus needs some changes in GDB's C code while GDB doesn't support Nucleus. I can't see how this patch benefits any target GDB supports. This is the reason I think this patch is not acceptable.
On 10/06/15 12:45, Taimoor wrote:
>
> Yes. Its not something specific to Nucleus. Only issue is that function
> address in DWARF is 0x0 as this object file is loaded at runtime and its
> symbols are added using "add-symbol-file" command on the basis of
> address space where it is loaded.
>
> IMO, for any dynamic relocatable object, GDB provides a mechanism to
> load its symbols using add-symbol-file. But if that object file has
> function at 0x0, current mechanism consider it GC'ed.

Could you help me understand how to reproduce this problem on any GDB-supported target? Your fix may not be Nucleus-specific, but I think the problem only exists on Nucleus; please correct me if I am wrong.
On 06/10/2015 01:06 PM, Yao Qi wrote:
> On 10/06/15 11:17, Pedro Alves wrote:
>> Hmm, does it really need to, though? We expose mechanisms like
>> add-symbol-file, xml library list with qXfer:libraries:read (the default
>> solib provider), xml target descriptions, "info os", etc., exactly so
>> that GDB doesn't have to learn about the myriad of random RTOS's out there.
>
> If these stuffs (add-symbol-file, xml library list, etc) works well for
> the given RTOS, and GDB doesn't need to know about the RTOS, that is
> fine. However, in this case, Nucleus needs some changes in GDB c code
> while GDB doesn't support Nucleus. I can't see how this patch benefits
> any targets GDB supported. This is the reason I think this patch is
> not acceptable.

This strikes me as an odd position, given the whole point of those commands and features is letting GDB support the target without builtin knowledge of them, so it's natural that GDB didn't have built-in support for the target thus far. But now we broke one of the mechanisms for some use case. Put another way, if we added support for --target=$cpu-nucleus, just as a configure alias for --target=$cpu-elf, so that we could say we supported Nucleus, how would we go about fixing this?

I think we need to look and understand _why_ Nucleus' binaries trigger the problem. If they're standard elf, it just looks to me that Nucleus is a red herring.

IIUC, this triggers on use of add-symbol-file with relocatable objects, when there's real code at address 0. I think that's the angle we should look at things.

I'd suspect that things are broken too when targets list relocatable objects in the qXfer:libraries:read list (in which case the target will list "section" elements instead of "segment" elements for each "library" element), and that's definitely meant to work. ("If the target supports dynamic linking of a relocatable object file, its library XML element should instead include a list of allocated sections.")

Thanks,
Pedro Alves
On 10/06/15 15:43, Pedro Alves wrote:
> This strikes me as an odd position, given the whole point of those commands
> and features is letting GDB support the target without builtin knowledge of
> them, so it's natural that GDB didn't have built-in support for the target
> thus far. But now we broke one of the mechanisms for some use case.
> Put another way, if we added support for --target=$cpu-nucleus, just as a
> configure alias for --target=$cpu-elf, so that we could say we supported
> Nucleus, how would we go about fixing this?

Even if Nucleus were supported in GDB, I don't know how to fix this problem either. If this problem only exists on Nucleus, do we need to fix it?

> I think we need to look and understand _why_ Nucleous' binaries trigger
> the problem. If they're standard elf, it just looks to me that Nucleous
> is a red herring.

I agree that we need to look deeper and understand why Nucleus binaries trigger this problem, to see whether it is Nucleus-specific or not.

> IIUC, this triggers on use of add-symbol-file with relocatable
> objects, when there's real code at address 0. I think that's the angle
> we should look at things.

Yes, that is the problem. GDB thinks the function has been GC'ed by the linker, but in fact it hasn't.
On 06/11/2015 09:30 AM, Yao Qi wrote:
>> IIUC, this triggers on use of add-symbol-file with relocatable
>> objects, when there's real code at address 0. I think that's the angle
>> we should look at things.
>
> Yes, that is the problem. GDB thinks the function is GC'ed by linker
> but in fact it isn't.

The key point here is _relocatable_ objects. These are special in that they aren't fully linked, so they have relocations in the debug info, and their sections on file start at address zero. See default_symfile_offsets.

I still don't understand how lowpc doesn't end up as 0 in the case at hand, meaning there's really code at 0. Or does lowpc really end up as 0, but the real issue is that "lowpc==0" is ambiguous, and we need to track a separate "have pc range" flag?

Taimoor, I'd expect you're passing a "0" to add-symbol-file, like:

  (gdb) add-symbol-file ... -s .some_section 0 ...

I think it'd help to see an example invocation of the command.

I'd think it possible to expand gdb.base/relocate.exp (or create a new test based on it) to cover this scenario everywhere.

Thanks,
Pedro Alves
diff --git a/gdb/dwarf2read.c b/gdb/dwarf2read.c
index f6b0c01..4f84b40 100644
--- a/gdb/dwarf2read.c
+++ b/gdb/dwarf2read.c
@@ -17665,7 +17665,8 @@ dwarf_decode_lines_1 (struct line_header *lh, struct dwarf2_cu *cu,
                      the test to only address values of zero to preserve
                      GDB's previous behaviour which is to handle the specific
                      case of a function being GC'd by the linker.  */
-                  if (address == 0 && address < lowpc)
+                  if (address == 0 && address < lowpc
+                      && (objfile->flags & OBJF_MAINLINE))
                     {
                       /* This line table is for a function which has been
                          GCd by the linker.  Ignore it.  PR gdb/12528 */