[0/3] Fix issues with writing Linux core PRSTATUS note on MIPS o32, n32 and n64 into core file
Message ID | 64ad38a4-b8ae-912e-45d6-7048135ada2e@rt-rk.com |
---|---|
State | Not applicable |
Headers |
Received: (qmail 66084 invoked by alias); 26 Oct 2017 11:22:00 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 66042 invoked by uid 89); 26 Oct 2017 11:21:59 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-15.8 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, RCVD_IN_DNSWL_NONE, RP_MATCHES_RCVD, SPAM_URI autolearn=ham version=3.3.2 spammy=reproduced, 12552, 12556 X-Spam-User: qpsmtpd, 2 recipients X-HELO: mail.rt-rk.com Received: from mx2.rt-rk.com (HELO mail.rt-rk.com) (89.216.37.149) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 26 Oct 2017 11:21:57 +0000 Received: from localhost (localhost [127.0.0.1]) by mail.rt-rk.com (Postfix) with ESMTP id 66D0F1A4996; Thu, 26 Oct 2017 13:21:54 +0200 (CEST) Received: from [10.10.13.94] (djole.domain.local [10.10.13.94]) by mail.rt-rk.com (Postfix) with ESMTPSA id 234671A4988; Thu, 26 Oct 2017 13:21:54 +0200 (CEST) Subject: Re: [PATCH 0/3] Fix issues with writing Linux core PRSTATUS note on MIPS o32, n32 and n64 into core file To: "Maciej W. Rozycki" <macro@mips.com>, Pedro Alves <palves@redhat.com> Cc: binutils@sourceware.org, gdb-patches@sourceware.org, "nemanja.popov@rt-rk.com" <nemanja.popov@rt-rk.com>, petar.jovanovic@rt-rk.com, "Ananthakrishna Sowda (asowda)" <asowda@cisco.com>, Nikola Prica <nikola.prica@rt-rk.com> References: <74618d56-fa31-4cfe-329f-6a9078bac92b@rt-rk.com> <alpine.DEB.2.00.1710171449200.3886@tp.orcam.me.uk> <724f0bc9-6744-a915-d19d-77db7e9ce514@rt-rk.com> <alpine.DEB.2.00.1710252126220.3886@tp.orcam.me.uk> From: Djordje Todorovic <djordje.todorovic@rt-rk.com> Message-ID: <64ad38a4-b8ae-912e-45d6-7048135ada2e@rt-rk.com> Date: Thu, 26 Oct 2017 13:21:46 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: <alpine.DEB.2.00.1710252126220.3886@tp.orcam.me.uk> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable |
Commit Message
Djordje Todorovic
Oct. 26, 2017, 11:21 a.m. UTC
Hi Maciej, > Thank you. There's still a small nit remaining WRT 1/3, but as I have > noted in a separate reply I'll handle it myself when committing. Thanks a lot! I really appreciate it! >> Regarding patch 3/3, as Pedro said, currently “thread” function has a >> piece of code that is never going to be executed since core file is >> going to be generated as soon as the first thread reaches the thread >> function. So, since we are testing fetching value of TLS variable from >> core file, creating just a one thread in test case would be enough and >> regarding that, I have cut unnecessary code from the function. > > Hmm, to understand how exactly the bug triggers (as you may recall long > ago I observed GDB prefers LWPID over PID if available) I ran your test > case with the code changes removed and the test has still passed across > the three ABIs. So it does not actually cover the case concerned here. > I still would like to keep it to verify the consistency between core files > written and read (following the various issues discovered in the course of > this review, including the n32 BFD fix I have cc-ed you on lately), > however that brings the question again about how you originally observed > the problem you've addressed. Is there another test scenario that can be > created? Thanks a lot, I saw that n32 BFD fix, and it is great! Actually, I have faced it when I tried to read value of TLS variable on MIPS platforms (native). I agree with you that GDB prefers LWPID over PID if available, but with no PID from core file GDB, at least on my boards, can not read value of TLS variable. I have tried to do some observation on x86_64 native platform and commented piece of code for fetching PID from core file like this: and rebuilt GDB (with "make clean" and "make"), the test case was FAIL: (gdb) file /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core Reading symbols from /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core...done. (gdb) core /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/gcore.test [New LWP 12556] [New LWP 12552] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `/home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core'. Program terminated with signal SIGTRAP, Trace/breakpoint trap. #0 thread_proc (arg=0x0) at /home/rtrk/gdb/build_native/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.threads/tls-core.c:25 25 return arg; [Current thread is 1 (LWP 12556)] (gdb) PASS: gdb.threads/tls-core.exp: load generated corefile p/x foo Cannot find user-level thread for LWP 12556: generic error When I reverted this observation changes and rebuilt it again, GDB was able to read value of TLS variable: (gdb) file /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core Reading symbols from /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core...done. (gdb) core /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/gcore.test [New LWP 11099] [New LWP 11095] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `/home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core'. Program terminated with signal SIGTRAP, Trace/breakpoint trap. #0 thread_proc (arg=0x0) at /home/rtrk/gdb/build_native/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.threads/tls-core.c:25 25 return arg; [Current thread is 1 (Thread 0x7ffff77ef700 (LWP 11099))] (gdb) PASS: gdb.threads/tls-core.exp: load generated corefile p/x foo $1 = 0xdeadbeef Exactly the same situation I have reproduced on MIPS platforms. Have I missed something? Regarding fetching value of TLS variable with cross or Multiarch GDB, we actually had proposal for resolving that issue in GDB, where we proposed that GDB can have its own "libthread_db" functions for getting TLS in case of cross debugging, where it is going to "simulate" "target" architecture, but that code would be hard to maintain, because every changes in "libthread_db" functions regarding particular architecture in GLIBC would take also changes in GDB, so it was not accepted by community, reasonably (please briefly see that thread: http://sourceware-org.1504.n7.nabble.com/PATCH-TLS-access-support-in-cross-Linux-GDB-td452155.html). But yes, I also think in those cases, test should be marked as known failure. Thanks, Djordje
Comments
Hi Djordje, > > Hmm, to understand how exactly the bug triggers (as you may recall long > > ago I observed GDB prefers LWPID over PID if available) I ran your test > > case with the code changes removed and the test has still passed across > > the three ABIs. So it does not actually cover the case concerned here. > > I still would like to keep it to verify the consistency between core files > > written and read (following the various issues discovered in the course of > > this review, including the n32 BFD fix I have cc-ed you on lately), > > however that brings the question again about how you originally observed > > the problem you've addressed. Is there another test scenario that can be > > created? > > Thanks a lot, I saw that n32 BFD fix, and it is great! > > Actually, I have faced it when I tried to read value of TLS variable on MIPS platforms (native). I agree with you that GDB prefers LWPID over PID if available, but with no PID from core file GDB, at least on my boards, can not read value of TLS variable. I have tried to do some observation on x86_64 native platform and commented piece of code for fetching PID from core file like this: > > diff --git a/bfd/elf64-x86-64.c b/bfd/elf64-x86-64.c > index 6a5159d..9ee9231 100644 > --- a/bfd/elf64-x86-64.c > +++ b/bfd/elf64-x86-64.c > @@ -415,8 +415,8 @@ elf_x86_64_grok_psinfo (bfd *abfd, Elf_Internal_Note *note) > break; > > case 136: /* sizeof(struct elf_prpsinfo) on Linux/x86_64 */ > - elf_tdata (abfd)->core->pid > - = bfd_get_32 (abfd, note->descdata + 24); > + //elf_tdata (abfd)->core->pid > + // = bfd_get_32 (abfd, note->descdata + 24); > elf_tdata (abfd)->core->program > = _bfd_elfcore_strndup (abfd, note->descdata + 40, 16); > elf_tdata (abfd)->core->command > > and rebuilt GDB (with "make clean" and "make"), the test case was FAIL: > > (gdb) file /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core > Reading symbols from /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core...done. > (gdb) core /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/gcore.test > [New LWP 12556] > [New LWP 12552] > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". > Core was generated by `/home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core'. > Program terminated with signal SIGTRAP, Trace/breakpoint trap. > #0 thread_proc (arg=0x0) at /home/rtrk/gdb/build_native/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.threads/tls-core.c:25 > 25 return arg; > [Current thread is 1 (LWP 12556)] > (gdb) PASS: gdb.threads/tls-core.exp: load generated corefile > p/x foo > Cannot find user-level thread for LWP 12556: generic error > > When I reverted this observation changes and rebuilt it again, GDB was > able to read value of TLS variable: > > (gdb) file /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core > Reading symbols from /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core...done. > (gdb) core /home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/gcore.test > [New LWP 11099] > [New LWP 11095] > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". > Core was generated by `/home/rtrk/gdb/build_native/gdb/testsuite/outputs/gdb.threads/tls-core/tls-core'. > Program terminated with signal SIGTRAP, Trace/breakpoint trap. > #0 thread_proc (arg=0x0) at /home/rtrk/gdb/build_native/gdb/testsuite/../../../binutils-gdb/gdb/testsuite/gdb.threads/tls-core.c:25 > 25 return arg; > [Current thread is 1 (Thread 0x7ffff77ef700 (LWP 11099))] > (gdb) PASS: gdb.threads/tls-core.exp: load generated corefile > p/x foo > $1 = 0xdeadbeef > > Exactly the same situation I have reproduced on MIPS platforms. Have I > missed something? I'll see if I can reproduce it with the x86-64 target and that may help me understand why I don't see it with the MIPS one. > Regarding fetching value of TLS variable with cross or Multiarch GDB, we > actually had proposal for resolving that issue in GDB, where we proposed > that GDB can have its own "libthread_db" functions for getting TLS in > case of cross debugging, where it is going to "simulate" "target" > architecture, but that code would be hard to maintain, because every > changes in "libthread_db" functions regarding particular architecture in > GLIBC would take also changes in GDB, so it was not accepted by > community, reasonably (please briefly see that thread: > http://sourceware-org.1504.n7.nabble.com/PATCH-TLS-access-support-in-cross-Linux-GDB-td452155.html). > But yes, I also think in those cases, test should be marked as known > failure. Thank you for the reference, and indeed in this case, with the problem being dealt with already, I'm even more confident that having it recorded as a known failure is the right way. But that needs an existing issue report in our Bugzilla to refer to. As you have been handling this already can you file one please? Once we have an issue number, we can then: if [is_remote target] { setup_kfail "gdb/..." "*-*-*" } ahead of the offending case. Overall we must never gratuitously cause new FAILs to appear in testing with new test cases -- either the causing issue should be fixed, or the problem KFAILed or XFAILed as appropriate, depending on the circumstances. I'm off throughout this week and this is my last reply in this thread until I am back. Can you please use the time to enter the bug report, and can you also tell me if your copyright assignment has been sorted out with FSF? Maciej
On Mon, 30 Oct 2017, Maciej W. Rozycki wrote: > Once we have an issue number, we can then: > > if [is_remote target] { > setup_kfail "gdb/..." "*-*-*" > } > > ahead of the offending case. Or: if {![string match $host_triplet $target_triplet]} { setup_kfail "gdb/..." "*-*-*" } really (`isnative' doesn't help here as it's too restrictive: we could have a native GDB built on a different system; this is actually what I use as I don't currently have an adequate version of native MIPS GCC set up). Maciej
Hi Maciej, >Can you please use the time to enter the bug report, and >can you also tell me if your copyright assignment has been sorted out with >FSF? The bug report is created and it can be found on the following link: https://sourceware.org/bugzilla/show_bug.cgi?id=22381 Also, the copyright assignment with FSF has been sorted out. > if {![string match $host_triplet $target_triplet]} { > setup_kfail "gdb/..." "*-*-*" > } > >really (`isnative' doesn't help here as it's too restrictive: we could >have a native GDB built on a different system; this is actually what I use >as I don't currently have an adequate version of native MIPS GCC set up). I agree with you that this is more appropriate way to do it. So, could you please, tell me do you want from me to resubmit the whole patch series or just the last one? Thanks, Djordje
On Mon, 30 Oct 2017, Maciej W. Rozycki wrote: > > Exactly the same situation I have reproduced on MIPS platforms. Have I > > missed something? > > I'll see if I can reproduce it with the x86-64 target and that may help > me understand why I don't see it with the MIPS one. So I have looked into it now and tracked down `libthread_db' rather than GDB to be the component requiring PID retrieval from a core file for TLS access to work, and then only before glibc commit c579f48edba8 ("Remove cached PID/TID in clone"), which was first included in 2.25 glibc release. Given I have been using recent glibc checkouts for MIPS verification I avoided the requirement, but I was indeed able to reproduce it natively with x86-64 and the system-supplied `libthread_db', and then with my MIPS environment as well, once I went with my glibc build back to commit c579f48edba8^. I will be adding a reference to said glibc commit when pushing your 2/3 change. Maciej
Hi Djordje, > >Can you please use the time to enter the bug report, and > >can you also tell me if your copyright assignment has been sorted out with > >FSF? > > The bug report is created and it can be found on the following link: > https://sourceware.org/bugzilla/show_bug.cgi?id=22381 Thank you, I have updated it with some pointers, based on your earlier reference. > Also, the copyright assignment with FSF has been sorted out. Great! > > if {![string match $host_triplet $target_triplet]} { > > setup_kfail "gdb/..." "*-*-*" > > } > > > >really (`isnative' doesn't help here as it's too restrictive: we could > >have a native GDB built on a different system; this is actually what I use > >as I don't currently have an adequate version of native MIPS GCC set up). > > I agree with you that this is more appropriate way to do it. > > So, could you please, tell me do you want from me to resubmit the whole patch series > or just the last one? I think there is no point in resending, as long as Pedro is fine with and approves your most recent version of the test case patch posted (NB it helps if on resubmission you number patch revisions in the subject line like [PATCH v3 2/3]). Pedro? Once we've got the final ack from Pedro, I'll proceed with the right steps to push your changes, with the tweaks noted in the final review round applied. I'll post the actual patches committed, for posterity. Maciej
On 11/07/2017 09:29 PM, Maciej W. Rozycki wrote: > On Mon, 30 Oct 2017, Maciej W. Rozycki wrote: > >>> Exactly the same situation I have reproduced on MIPS platforms. Have I >>> missed something? >> >> I'll see if I can reproduce it with the x86-64 target and that may help >> me understand why I don't see it with the MIPS one. > > So I have looked into it now and tracked down `libthread_db' rather > than GDB to be the component requiring PID retrieval from a core file > for TLS access to work, and then only before glibc commit c579f48edba8 > ("Remove cached PID/TID in clone"), which was first included in 2.25 > glibc release. > > Given I have been using recent glibc checkouts for MIPS verification I > avoided the requirement, but I was indeed able to reproduce it natively > with x86-64 and the system-supplied `libthread_db', and then with my > MIPS environment as well, once I went with my glibc build back to commit > c579f48edba8^. > > I will be adding a reference to said glibc commit when pushing your 2/3 > change. Interesting, hadn't realized libthread_db stopped requiring this. I wonder whether it'd be a good idea to mention it in the code itself too, say, in proc-service.:ps_getpid. Thanks, Pedro Alves
On 11/07/2017 09:58 PM, Maciej W. Rozycki wrote: >> So, could you please, tell me do you want from me to resubmit the whole patch series >> or just the last one? > > I think there is no point in resending, as long as Pedro is fine with > and approves your most recent version of the test case patch posted ... (NB it helps if on resubmission you number patch revisions in the subject > line like [PATCH v3 2/3]). Pedro? Agreed. And threading the patch series emails under the cover letter also helps a lot. In fact, I'm a bit lost and am not sure which is the last version of the test patch that I should be looking at. :-P :-) I think... this one: https://sourceware.org/ml/gdb-patches/2017-10/msg00770.html ? That looks fine to me. > > Once we've got the final ack from Pedro, I'll proceed with the right > steps to push your changes, with the tweaks noted in the final review > round applied. I'll post the actual patches committed, for posterity. Thanks, Pedro Alves
On Wed, 8 Nov 2017, Pedro Alves wrote: > > So I have looked into it now and tracked down `libthread_db' rather > > than GDB to be the component requiring PID retrieval from a core file > > for TLS access to work, and then only before glibc commit c579f48edba8 > > ("Remove cached PID/TID in clone"), which was first included in 2.25 > > glibc release. > > > > Given I have been using recent glibc checkouts for MIPS verification I > > avoided the requirement, but I was indeed able to reproduce it natively > > with x86-64 and the system-supplied `libthread_db', and then with my > > MIPS environment as well, once I went with my glibc build back to commit > > c579f48edba8^. > > > > I will be adding a reference to said glibc commit when pushing your 2/3 > > change. > > Interesting, hadn't realized libthread_db stopped requiring this. > I wonder whether it'd be a good idea to mention it in the code itself > too, say, in proc-service.:ps_getpid. This might not be the best place. Calls to `ps_getpid' are still made from places throughout `libthread_db', and it's only one from `td_ta_thr_iter' (which matters here) whose result is discarded, by only passing it down to `iterate_thread_list' to become its ignored `match_pid' argument. This looks like an oversight of some kind to me, which Adhemerval might be able to shed some light on. Adhemerval, in your commit referred above, corresponding to the change held at: <https://sourceware.org/ml/libc-alpha/2016-11/msg00282.html>, you made the `match_pid' argument of `iterate_thread_list' unused. The function is static, called from `td_ta_thr_iter' only (and I actually wonder why the build does not trip on an unused argument here). Is it OK then to remove the argument then along with the `ps_getpid' call in `td_ta_thr_iter' used to set the corresponding parameter, or is there something missing from there you intended to do and `match_pid' was meant to remain used? NB I would post a clean-up proposal, but regrettably my company FSF copyright assignment status is in a limbo state right now, after the transition from Imagination to the new MIPS company, and I'd rather not block the cleanup by posting a patch that cannot move forward. Or would it be small enough to qualify as not needing an assignment? Hmm... As to choosing the right place to comment on this `libthread_db' semantics change, I think either the call to `bfd_core_file_pid' made from `add_to_thread_list' in corelow.c or the heading of the latter function might be better, as this is where the consumer of BFD's core file data PID is. Adding a comment like this might be small/trivial enough for me to sneak in while the FSF paperwork is still pending. Maciej
On Wed, 8 Nov 2017, Pedro Alves wrote: > > I think there is no point in resending, as long as Pedro is fine with > > and approves your most recent version of the test case patch posted > > ... > > (NB it helps if on resubmission you number patch revisions in the subject > > line like [PATCH v3 2/3]). Pedro? > > Agreed. And threading the patch series emails under the cover letter > also helps a lot. In fact, I'm a bit lost and am not sure which is the > last version of the test patch that I should be looking at. :-P :-) > > I think... this one: > https://sourceware.org/ml/gdb-patches/2017-10/msg00770.html > ? Yes. Apologies for not pointing you explicitly at it. > That looks fine to me. Great. In reply to this message I'm following up with the final versions committed. These are v7 according to my notes. Maciej
Hi Maciej, Thanks for committing this, I appreciate it. Best regards, Djordje
Hi Djordje,
> Thanks for committing this, I appreciate it.
You are welcome, it is my duty after all.
Maciej
diff --git a/bfd/elf64-x86-64.c b/bfd/elf64-x86-64.c index 6a5159d..9ee9231 100644 --- a/bfd/elf64-x86-64.c +++ b/bfd/elf64-x86-64.c @@ -415,8 +415,8 @@ elf_x86_64_grok_psinfo (bfd *abfd, Elf_Internal_Note *note) break; case 136: /* sizeof(struct elf_prpsinfo) on Linux/x86_64 */ - elf_tdata (abfd)->core->pid - = bfd_get_32 (abfd, note->descdata + 24); + //elf_tdata (abfd)->core->pid + // = bfd_get_32 (abfd, note->descdata + 24); elf_tdata (abfd)->core->program = _bfd_elfcore_strndup (abfd, note->descdata + 40, 16); elf_tdata (abfd)->core->command