Message ID | 87vbg1eg08.fsf@igel.home |
---|---|
State | New, archived |
Headers |
Received: (qmail 7747 invoked by alias); 9 May 2015 18:57:03 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 7738 invoked by uid 89); 9 May 2015 18:57:02 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.5 required=5.0 tests=AWL, BAYES_20, KAM_LAZY_DOMAIN_SECURITY, RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 X-HELO: mail-out.m-online.net Received: from mail-out.m-online.net (HELO mail-out.m-online.net) (212.18.0.9) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (CAMELLIA256-SHA encrypted) ESMTPS; Sat, 09 May 2015 18:57:00 +0000 Received: from frontend01.mail.m-online.net (unknown [192.168.8.182]) by mail-out.m-online.net (Postfix) with ESMTP id 3lkd6w4sQtz3hjTb for <gdb-patches@sourceware.org>; Sat, 9 May 2015 20:56:56 +0200 (CEST) Received: from localhost (dynscan1.mnet-online.de [192.168.6.68]) by mail.m-online.net (Postfix) with ESMTP id 3lkd6w31Q9zvjDh for <gdb-patches@sourceware.org>; Sat, 9 May 2015 20:56:56 +0200 (CEST) Received: from mail.mnet-online.de ([192.168.8.182]) by localhost (dynscan1.mail.m-online.net [192.168.6.68]) (amavisd-new, port 10024) with ESMTP id c5z0TAd-ra5V for <gdb-patches@sourceware.org>; Sat, 9 May 2015 20:56:55 +0200 (CEST) X-Auth-Info: 2Ra903M1HChbLwYNo6YA1FK3I0xsitsiitpR+VbkwUFaMTRnCbfhd9x/FNtDgrKi Received: from igel.home (ppp-93-104-92-148.dynamic.mnet-online.de [93.104.92.148]) by mail.mnet-online.de (Postfix) with ESMTPA for <gdb-patches@sourceware.org>; Sat, 9 May 2015 20:56:55 +0200 (CEST) Received: by igel.home (Postfix, from userid 1000) id 47DEA2C0C67; Sat, 9 May 2015 20:56:55 +0200 (CEST) From: Andreas Schwab <schwab@linux-m68k.org> To: gdb-patches@sourceware.org Subject: [PATCH] Fix wrong assertions X-Yow: Was my SOY LOAF left out in th'RAIN? It tastes REAL GOOD!! Date: Sat, 09 May 2015 20:56:55 +0200 Message-ID: <87vbg1eg08.fsf@igel.home> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain |
Commit Message
Andreas Schwab
May 9, 2015, 6:56 p.m. UTC
Both callers and callees are lengths, and the sum of them can validly equal the total length. Andreas. PR symtab/18392 * dwarf2-frame-tailcall.c (pretended_chain_levels): Correct assertion. * dwarf2loc.c (chain_candidate): Likewise. --- gdb/dwarf2-frame-tailcall.c | 2 +- gdb/dwarf2loc.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
Comments
On Sat, 09 May 2015 20:56:55 +0200, Andreas Schwab wrote: > Both callers and callees are lengths, and the sum of them can validly > equal the total length. That '<' and not '<=' was there intentional. Personally I think it needs more investigation why that can happen. The idea was that if two solutions exist neither can be perfect so there have to be some ambiguous enties so there will be '<' and not '<=' (to fit the ambiguous entries between). But creating artifical reproducers is a bit difficult and you haven't provided a reproducer so I cannot say anything much specific. Personally I do not mind, it is up to the maintainers whether the goal is just to remove the assertion or verify there aren't some other bugs causing it. Jan
Jan Kratochvil <jan.kratochvil@redhat.com> writes: > But creating artifical reproducers is a bit difficult and you haven't provided > a reproducer so I cannot say anything much specific. Break on reload1.c:2917 (which is "return replace_equiv_address_nv (x, new_rtx);") and try to step into it. Andreas.
Jan Kratochvil <jan.kratochvil@redhat.com> writes: > That '<' and not '<=' was there intentional. Personally I think it needs more > investigation why that can happen. The idea was that if two solutions exist > neither can be perfect so there have to be some ambiguous enties so there will > be '<' and not '<=' (to fit the ambiguous entries between). > > But creating artifical reproducers is a bit difficult and you haven't provided > a reproducer so I cannot say anything much specific. > > Personally I do not mind, it is up to the maintainers whether the goal is just > to remove the assertion or verify there aren't some other bugs causing it. I can reproduce this internal error via Jan's test case, and spend some time investigating it, but still unable to fully understand the code. Jan, since you wrote this piece of code, please help me to understand it. (gdb) set debug entry-values 1 (gdb) bt tailcall: initial: 0x40052e(a) tailcall: compare: 0x400527(a) 0x40052e(a) tailcall: reduced: | 0x40052e(a) gdb/git/gdb/dwarf2loc.c:834: internal-error: chain_candidate: Assertion `result->callers + result->callees < result->length' failed. A problem internal to GDB has been detected, further debugging may prove unreliable. Quit this debugging session? (y or n) y I don't know why we need do intersection in chain_candidate, as the comments say: /* Intersect RESULTP with CHAIN to keep RESULTP unambiguous, keep in RESULTP only top callers and bottom callees which are present in both. GDBARCH is used only for ENTRY_VALUES_DEBUG. RESULTP is NULL after return if there are no remaining possibilities to provide unambiguous non-trivial result. RESULTP should point to NULL on the first (initialization) call. Caller is responsible for xfree of any RESULTP data. */ What do you mean by "ambiguous" here? Is it ambiguous if we can get more than one call chain path from caller_pc to callee_pc? For example, main tail calls a, a tail call b and c, b and c tail call d, when GDB unwinds from d, there are two chains, main -> a -> b -> d, and main -> a -> c -> d. Are they ambiguous by your definition? Further, what is "partially ambiguous result" in the comments below? /* Determined tail calls for constructing virtual tail call frames. */ struct call_site_chain { /* Initially CALLERS == CALLEES == LENGTH. For partially ambiguous result CALLERS + CALLEES < LENGTH. */ int callers, callees, length; /* Variably sized array with LENGTH elements. Later [0..CALLERS-1] contain top (GDB "prev") sites and [LENGTH-CALLEES..LENGTH-1] contain bottom (GDB "next") sites. One is interested primarily in the PC field. */ struct call_site *call_site[1]; }; I am confused by the usage of the variable-sized array call_site, elements from 0 to CALLERS-1 are top sites, and elements from LENGTH-CALLEES to LENGTH-1 are bottom sites, so I conclude that CALLERS-1 < LENGTH-CALLEES, then CALLERS + CALLEES < LENGTH + 1, then CALLERS + CALLEES =< LENGTH. Is it right? I still have other questions, but I'd like to stop here, otherwise this mail will be too long. Your answers to these questions may help to answer the rest of my questions.
On Fri, 29 May 2015 11:31:18 +0200, Yao Qi wrote: > spend some > time investigating it, but still unable to fully understand the code. I admit the code comments are not too great, I did not notice that when writing them. > (gdb) set debug entry-values 1 > (gdb) bt > tailcall: initial: 0x40052e(a) > tailcall: compare: 0x400527(a) 0x40052e(a) > tailcall: reduced: | 0x40052e(a) > gdb/git/gdb/dwarf2loc.c:834: internal-error: chain_candidate: Assertion `result->callers + result->callees < result->length' failed. > A problem internal to GDB has been detected, > further debugging may prove unreliable. > Quit this debugging session? (y or n) y > > I don't know why we need do intersection in chain_candidate, as the > comments say: > > /* Intersect RESULTP with CHAIN to keep RESULTP unambiguous, keep in RESULTP > only top callers and bottom callees which are present in both. GDBARCH is > used only for ENTRY_VALUES_DEBUG. RESULTP is NULL after return if there are > no remaining possibilities to provide unambiguous non-trivial result. > RESULTP should point to NULL on the first (initialization) call. Caller is > responsible for xfree of any RESULTP data. */ > > What do you mean by "ambiguous" here? Is it ambiguous if we can get > more than one call chain path from caller_pc to callee_pc? Yes. > For example, > main tail calls a, a tail call b and c, b and c tail call d, when GDB > unwinds from d, there are two chains, main -> a -> b -> d, and main -> a > -> c -> d. Are they ambiguous by your definition? Those two chains are ambigous as it could be also the other chain. Chain intersecting those two chains is: main -> a -> <???> -> d > Further, what is "partially ambiguous result" in the comments below? The terminology seems bogus there. "partially ambiguous" was meant the chain: main -> a -> <???> -> d An intersection of all possible chains. > /* Determined tail calls for constructing virtual tail call frames. */ > > struct call_site_chain > { > /* Initially CALLERS == CALLEES == LENGTH. For partially ambiguous result > CALLERS + CALLEES < LENGTH. */ > int callers, callees, length; > > /* Variably sized array with LENGTH elements. Later [0..CALLERS-1] contain > top (GDB "prev") sites and [LENGTH-CALLEES..LENGTH-1] contain bottom > (GDB "next") sites. One is interested primarily in the PC field. */ > struct call_site *call_site[1]; > }; > > I am confused by the usage of the variable-sized array call_site, > elements from 0 to CALLERS-1 are top sites, and elements from > LENGTH-CALLEES to LENGTH-1 are bottom sites, so I conclude that > CALLERS-1 < LENGTH-CALLEES, then CALLERS + CALLEES < LENGTH + 1, > then CALLERS + CALLEES =< LENGTH. Is it right? Yes, that is right. Initially there is some chain (let's say the longest one but that doe snot matter). Consequently its elements from the middle are being removed and there remains only some few unambiguous top and bottom ones. The original idea why the comparison should be sharp ("<") was that if there are multiple chains like (0xaddr show jmp instruction address): main(0x100) -> a(0x200) -> d(0x400) main(0x100) -> a(0x200) -> c(0x300) -> d(0x400) then - such situation cannot exist - if two jmp instructions in "a" have the same address they must also jump to the same address (*). (*) jump to a computed address would be never considered for the DWARF tail-call records. So there could be: main(0x100) -> a(0x200) -> d(0x400) main(0x100) -> a(0x270) -> c(0x300) -> d(0x400) But then "a" frame itself is ambiguous and it must not be displayed. I did not realize that there can be self-tail-call: main(0x100) -> a(0x200) -> d(0x400) main(0x100) -> a(0x280) -> a(0x200) -> d(0x400) which intersects to: main(0x100) -> <???>? -> a(0x200) -> d(0x400) And so if the first chain was chosen the main(0x100) -> a(0x200) -> d(0x400) then the final intersection has callers+callees==length. Originally the patchset tried to display the "ambiguous" part <???> in backtrace creating a bogus frame there but GDB had too many problems with such a frame. So currently no such frame is created although still backtrace could annotate it somehow there are "ambiguous" frames between these two frames. Jan
Jan Kratochvil <jan.kratochvil@redhat.com> writes: Hi, Jan, thanks for your explanations... they are very helpful. >> Further, what is "partially ambiguous result" in the comments below? > > The terminology seems bogus there. > > "partially ambiguous" was meant the chain: > main -> a -> <???> -> d > An intersection of all possible chains. > Sounds like "partially ambiguous" is equivalent to "ambiguous". > >> /* Determined tail calls for constructing virtual tail call frames. */ >> >> struct call_site_chain >> { >> /* Initially CALLERS == CALLEES == LENGTH. For partially ambiguous result >> CALLERS + CALLEES < LENGTH. */ >> int callers, callees, length; >> >> /* Variably sized array with LENGTH elements. Later [0..CALLERS-1] contain >> top (GDB "prev") sites and [LENGTH-CALLEES..LENGTH-1] contain bottom >> (GDB "next") sites. One is interested primarily in the PC field. */ >> struct call_site *call_site[1]; >> }; >> >> I am confused by the usage of the variable-sized array call_site, >> elements from 0 to CALLERS-1 are top sites, and elements from >> LENGTH-CALLEES to LENGTH-1 are bottom sites, so I conclude that >> CALLERS-1 < LENGTH-CALLEES, then CALLERS + CALLEES < LENGTH + 1, >> then CALLERS + CALLEES =< LENGTH. Is it right? > > Yes, that is right. Initially there is some chain (let's say the longest one If that is right, the assert below is too strict, isn't? /* See call_site_find_chain_1 why there is no way to reach the bottom callee PC again. In such case there must be two different code paths to reach it, therefore some of the former determined intermediate PCs must differ and the unambiguous chain gets shortened. */ gdb_assert (result->callers + result->callees < result->length); > but that doe snot matter). Consequently its elements from the middle are > being removed and there remains only some few unambiguous top and > bottom ones. If there is no call sites removed from the chain during the intersection, CALLERS + CALLEES == LENGTH, right? in function chain_candidate, result->length is set by the length of a chain. If this chain is the shortest one, CALLERS + CALLEES == LENGTH otherwise, CALLERS + CALLEES < LENGTH. Is it right? If so, we need to relax the condition in the assert and update the comments. > > The original idea why the comparison should be sharp ("<") was that if there > are multiple chains like (0xaddr show jmp instruction address): > main(0x100) -> a(0x200) -> d(0x400) > main(0x100) -> a(0x200) -> c(0x300) -> d(0x400) > then - such situation cannot exist - if two jmp instructions in "a" have the > same address they must also jump to the same address (*). > > (*) jump to a computed address would be never considered for the DWARF > tail-call records. > > So there could be: > main(0x100) -> a(0x200) -> d(0x400) > main(0x100) -> a(0x270) -> c(0x300) -> d(0x400) > But then "a" frame itself is ambiguous and it must not be displayed. > > I did not realize that there can be self-tail-call: > main(0x100) -> a(0x200) -> d(0x400) > main(0x100) -> a(0x280) -> a(0x200) -> d(0x400) > which intersects to: > main(0x100) -> <???>? -> a(0x200) -> d(0x400) > And so if the first chain was chosen the > main(0x100) -> a(0x200) -> d(0x400) > then the final intersection has callers+callees==length. What are the definitions of CALLERS, CALLEES, top and bottom? given this example?
diff --git a/gdb/dwarf2-frame-tailcall.c b/gdb/dwarf2-frame-tailcall.c index b412a5b..f964ab2 100644 --- a/gdb/dwarf2-frame-tailcall.c +++ b/gdb/dwarf2-frame-tailcall.c @@ -197,7 +197,7 @@ pretended_chain_levels (struct call_site_chain *chain) return chain->length; chain_levels = chain->callers + chain->callees; - gdb_assert (chain_levels < chain->length); + gdb_assert (chain_levels <= chain->length); return chain_levels; } diff --git a/gdb/dwarf2loc.c b/gdb/dwarf2loc.c index d811106..29b265b 100644 --- a/gdb/dwarf2loc.c +++ b/gdb/dwarf2loc.c @@ -827,7 +827,7 @@ chain_candidate (struct gdbarch *gdbarch, struct call_site_chain **resultp, PC again. In such case there must be two different code paths to reach it, therefore some of the former determined intermediate PCs must differ and the unambiguous chain gets shortened. */ - gdb_assert (result->callers + result->callees < result->length); + gdb_assert (result->callers + result->callees <= result->length); } /* Create and return call_site_chain for CALLER_PC and CALLEE_PC. All the