Message ID | 1410786062-19274-1-git-send-email-brobecker@adacore.com |
---|---|
State | New, archived |
Headers |
Received: (qmail 26779 invoked by alias); 15 Sep 2014 13:01:17 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: <gdb-patches.sourceware.org> List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org> List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org> List-Archive: <http://sourceware.org/ml/gdb-patches/> List-Post: <mailto:gdb-patches@sourceware.org> List-Help: <mailto:gdb-patches-help@sourceware.org>, <http://sourceware.org/ml/#faqs> Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 26765 invoked by uid 89); 15 Sep 2014 13:01:15 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 X-HELO: rock.gnat.com Received: from rock.gnat.com (HELO rock.gnat.com) (205.232.38.15) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Mon, 15 Sep 2014 13:01:09 +0000 Received: from localhost (localhost.localdomain [127.0.0.1]) by filtered-rock.gnat.com (Postfix) with ESMTP id CE848116301 for <gdb-patches@sourceware.org>; Mon, 15 Sep 2014 09:01:07 -0400 (EDT) Received: from rock.gnat.com ([127.0.0.1]) by localhost (rock.gnat.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id yVKzV68CJnd1 for <gdb-patches@sourceware.org>; Mon, 15 Sep 2014 09:01:07 -0400 (EDT) Received: from joel.gnat.com (localhost.localdomain [127.0.0.1]) by rock.gnat.com (Postfix) with ESMTP id 9FFE91162FE for <gdb-patches@sourceware.org>; Mon, 15 Sep 2014 09:01:07 -0400 (EDT) Received: by joel.gnat.com (Postfix, from userid 1000) id 5F47940E17; Mon, 15 Sep 2014 06:01:06 -0700 (PDT) From: Joel Brobecker <brobecker@adacore.com> To: gdb-patches@sourceware.org Subject: [RFA/commit] arm-tdep.c: Do not single-step after hitting a watchpoint. Date: Mon, 15 Sep 2014 06:01:02 -0700 Message-Id: <1410786062-19274-1-git-send-email-brobecker@adacore.com> |
Commit Message
Joel Brobecker
Sept. 15, 2014, 1:01 p.m. UTC
Hello! Re: question about ARM watchpoints https://www.sourceware.org/ml/gdb/2014-09/msg00000.html This patch fixes an issue with watchpoints on ARM targets, where the debugger stops 2 instructions after the instruction causing the watchpoint. GDB is expected to stop at the next instruction. The problem is caused by the fact that GDB does an extra single-step after receiving the watchpoint notification, because the have_nonsteppable_watchpoint gdbarch attribute is set for ARM targets. Our experiments indicate that this is incorrect, at least for the versions of ARM that we tested on (ARMv7). We tried to get confirmation of this through the ARM documentation, but did not manage to get a clear answer. So, in light of evidence that the current code is wrong for some versions of ARM, and with the lack of evidence regarding the versions of ARM we could not test on, this patch simply unsets this gdbarch attribute for all versions of ARM. The plan is to refine this later on if/when we find that some systems behave differently. gdb/ChangeLog: * arm-tdep.c (arm_gdbarch_init): Remove call to set_gdbarch_have_nonsteppable_watchpoint. Unless there are comments, I will commit the patch in a week. Thank you! --- gdb/arm-tdep.c | 3 --- 1 file changed, 3 deletions(-)
Comments
Joel Brobecker <brobecker@adacore.com> writes: > Hello! > > Re: question about ARM watchpoints > https://www.sourceware.org/ml/gdb/2014-09/msg00000.html > > This patch fixes an issue with watchpoints on ARM targets, where > the debugger stops 2 instructions after the instruction causing > the watchpoint. GDB is expected to stop at the next instruction. > > The problem is caused by the fact that GDB does an extra single-step > after receiving the watchpoint notification, because the > have_nonsteppable_watchpoint gdbarch attribute is set for ARM > targets. Our experiments indicate that this is incorrect, at > least for the versions of ARM that we tested on (ARMv7). We tried Joel, Can you elaborate your experiments? Do you do experiments on qemu, arm bare metal targets or arm linux targets? I find Peter tries to fix the same problem we encounter in qemu side, [Qemu-devel] [PATCH] gdbstub: Allow target CPUs to specify watchpoint STOP_BEFORE_ACCESS flag http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02665.html and this patch isn't accepted yet. Without this patch, program stops two instructions after the variable is updated on qemu trunk, 0x000001ae <+10>: str r3, [r7, #12] 0x000001b0 <+12>: ldr r3, [r7, #4] => 0x000001b2 <+14>: cmp r3, #1 0x000001b4 <+16>: bne.n 0x1ba <recurse+22> however, with Peter's patch applied, program stops one instruction after the variable is updated, (gdb) watch b Hardware watchpoint 3: b (gdb) c Continuing. Hardware watchpoint 3: b Old value = 1283 New value = 0 recurse (a=10) at ../../../../git/gdb/testsuite/gdb.base/recurse.c:15 15 if (a == 1) (gdb) disassemble recurse Dump of assembler code for function recurse: 0x000001a4 <+0>: push {r7, lr} 0x000001a6 <+2>: sub sp, #16 0x000001a8 <+4>: add r7, sp, #0 0x000001aa <+6>: str r0, [r7, #4] 0x000001ac <+8>: movs r3, #0 0x000001ae <+10>: str r3, [r7, #12] => 0x000001b0 <+12>: ldr r3, [r7, #4] 0x000001b2 <+14>: cmp r3, #1 Note that with patched qemu, two fails in gdb.base/recurse.exp are fixed. At least, gdb and qemu should be in sync on this.
> Can you elaborate your experiments? Do you do experiments on qemu, arm > bare metal targets or arm linux targets? My experiments were on QEMU, but others tried on bare-metal. I tried on GNU/Linux targets as well, but hardware watchpoints simply did not work (no signal). > I find Peter tries to fix the same problem we encounter in qemu side, > > [Qemu-devel] [PATCH] gdbstub: Allow target CPUs to specify watchpoint STOP_BEFORE_ACCESS flag > http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02665.html > > and this patch isn't accepted yet. [...] > Note that with patched qemu, two fails in gdb.base/recurse.exp are > fixed. At least, gdb and qemu should be in sync on this. I think the experiments that were run showed that QEMU is in fact correct and should NOT be changed.
On 09/16/2014 08:59 AM, Joel Brobecker wrote: >> Can you elaborate your experiments? Do you do experiments on qemu, arm >> bare metal targets or arm linux targets? > > My experiments were on QEMU, but others tried on bare-metal. I tried > on GNU/Linux targets as well, but hardware watchpoints simply did > not work (no signal). > >> I find Peter tries to fix the same problem we encounter in qemu side, >> >> [Qemu-devel] [PATCH] gdbstub: Allow target CPUs to specify watchpoint STOP_BEFORE_ACCESS flag >> http://lists.nongnu.org/archive/html/qemu-devel/2014-09/msg02665.html >> >> and this patch isn't accepted yet. > [...] >> Note that with patched qemu, two fails in gdb.base/recurse.exp are >> fixed. At least, gdb and qemu should be in sync on this. > > I think the experiments that were run showed that QEMU is in fact > correct and should NOT be changed. > Do we know what the Linux kernel's behavior on this one is? I wonder what the stopped data address shows. Someone with access to a board with a relatively new kernel could try that and rule it out, otherwise we risk fixing something for QEMU/bare metal and breaking things for Linux. Luis
> >I think the experiments that were run showed that QEMU is in fact > >correct and should NOT be changed. > > Do we know what the Linux kernel's behavior on this one is? I wonder > what the stopped data address shows. > > Someone with access to a board with a relatively new kernel could > try that and rule it out, otherwise we risk fixing something for > QEMU/bare metal and breaking things for Linux. When I tested on GNU/Linux, watchpoints simply did not work (silently ignored, no signal). I was using an old kernel (2012), though; but that's all I had access to. But, all in all, baremetal should be our most reliable source of info, though,no? - no software layer to murky the waters.
On 09/16/2014 09:48 AM, Joel Brobecker wrote: >>> I think the experiments that were run showed that QEMU is in fact >>> correct and should NOT be changed. >> >> Do we know what the Linux kernel's behavior on this one is? I wonder >> what the stopped data address shows. >> >> Someone with access to a board with a relatively new kernel could >> try that and rule it out, otherwise we risk fixing something for >> QEMU/bare metal and breaking things for Linux. > > When I tested on GNU/Linux, watchpoints simply did not work > (silently ignored, no signal). I was using an old kernel (2012), > though; but that's all I had access to. But, all in all, baremetal > should be our most reliable source of info, though,no? - no software > layer to murky the waters. > It is hard to tell. ARM's documentation is not clear. For example, this is probably where the stopped data address should be coming from: -- WFAR - Watchpoint Fault Address Register The WFAR is updated to indicate the address of the instruction that accessed the watchpointed address: - the address of the instruction + 8 in ARM state - the address of the instruction + 4 in Thumb® state -- So it seems in line with what we are seeing? The program being trapped two instructions after the data fault? If it stops just a single instruction after the data fault, then someone (probe, emulator or kernel) may be trying to help GDB by decrementing the data fault address. Luis
Hi Terry, Marcus, Can someone at ARM shed some light on this, please? This thread is here: https://sourceware.org/ml/gdb-patches/2014-09/msg00498.html And the discussion started in another thread here: https://sourceware.org/ml/gdb/2014-09/msg00000.html I've just added a test that hopefully helps with this, btw: https://sourceware.org/ml/gdb-patches/2014-09/msg00535.html I'm also wondering whether Aarch64 needs adjustment as well. Thanks, Pedro Alves On 09/16/2014 02:09 PM, Luis Machado wrote: > On 09/16/2014 09:48 AM, Joel Brobecker wrote: >>>> I think the experiments that were run showed that QEMU is in fact >>>> correct and should NOT be changed. >>> >>> Do we know what the Linux kernel's behavior on this one is? I wonder >>> what the stopped data address shows. >>> >>> Someone with access to a board with a relatively new kernel could >>> try that and rule it out, otherwise we risk fixing something for >>> QEMU/bare metal and breaking things for Linux. >> >> When I tested on GNU/Linux, watchpoints simply did not work >> (silently ignored, no signal). I was using an old kernel (2012), >> though; but that's all I had access to. But, all in all, baremetal >> should be our most reliable source of info, though,no? - no software >> layer to murky the waters. >> > > It is hard to tell. ARM's documentation is not clear. For example, this > is probably where the stopped data address should be coming from: > > -- > > WFAR - Watchpoint Fault Address Register > > The WFAR is updated to indicate the address of the instruction that > accessed the watchpointed address: > > - the address of the instruction + 8 in ARM state > - the address of the instruction + 4 in Thumb® state > > -- > > So it seems in line with what we are seeing? The program being trapped > two instructions after the data fault? > > If it stops just a single instruction after the data fault, then someone > (probe, emulator or kernel) may be trying to help GDB by decrementing > the data fault address. > > Luis >
On 16 September 2014 16:21, Pedro Alves <palves@redhat.com> wrote: > Hi Terry, Marcus, > > Can someone at ARM shed some light on this, please? > > This thread is here: > > https://sourceware.org/ml/gdb-patches/2014-09/msg00498.html > > And the discussion started in another thread here: > > https://sourceware.org/ml/gdb/2014-09/msg00000.html > > I've just added a test that hopefully helps with this, btw: > > https://sourceware.org/ml/gdb-patches/2014-09/msg00535.html > > I'm also wondering whether Aarch64 needs adjustment as well. > > Thanks, > Pedro Alves Hi, In aarch32 execution state a watch point event is taken as a data abort with the PC containing the address of the faulting instruction + 8 irrespective of thumb mode. The linux kernel adjusts the reported PC by subtracting 8 such that the ptrace interface will indicate the address of the faulting instruction. Peter Maydell's proposed qemu patch referenced in the thread above appears to me to align the gdbstub behaviour in qemu with the linux kernel ptrace() interface behaviour. w.r.t DBGWFAR, it's use is described as deprecated in ARM ARMv7-A&R Issue C.c c11.11.45. It is not used by linux kernel. Cheers /Marcus
Hi Marcus, On 09/18/2014 12:40 PM, Marcus Shawcroft wrote: > On 16 September 2014 16:21, Pedro Alves <palves@redhat.com> wrote: >> Hi Terry, Marcus, >> >> Can someone at ARM shed some light on this, please? >> >> This thread is here: >> >> https://sourceware.org/ml/gdb-patches/2014-09/msg00498.html >> >> And the discussion started in another thread here: >> >> https://sourceware.org/ml/gdb/2014-09/msg00000.html >> >> I've just added a test that hopefully helps with this, btw: >> >> https://sourceware.org/ml/gdb-patches/2014-09/msg00535.html >> >> I'm also wondering whether Aarch64 needs adjustment as well. >> > In aarch32 execution state a watch point event is taken as a data > abort with the PC containing the address of the faulting instruction + > 8 irrespective of thumb mode. So the documentation is actually wrong for thumb? Or you're saying that for Thumb2, while it'd be 4 for Thumb 1? If you're talking about something else, not DBGWFAR (what I think Luis pointed out before), then can you give a pointer to where these other watch point events are documented? > > The linux kernel adjusts the reported PC by subtracting 8 such that > the ptrace interface will indicate the address of the faulting > instruction. Hmm. So when the data abort triggers at fault+8, the instruction that triggered the abort hasn't actually completed, right? No memory has changed yet. So if nothing does the adjustment, like Gareth found out happens with the Black Magic Probe, then we'll resume execution from the wrong address/instruction (with the effects of the skipped instructions missing, including the memory write...). Did I understand that right? (Gareth, is that what you see?) > Peter Maydell's proposed qemu patch referenced in the thread above > appears to me to align the gdbstub behaviour in qemu with the linux > kernel ptrace() interface behaviour. > > w.r.t DBGWFAR, it's use is described as deprecated in ARM ARMv7-A&R > Issue C.c c11.11.45. It is not used by linux kernel. Thanks, Pedro Alves
Hello all, > Hmm. So when the data abort triggers at fault+8, the instruction > that triggered the abort hasn't actually completed, right? No memory > has changed yet. > > So if nothing does the adjustment, like Gareth found out happens with > the Black Magic Probe, then we'll resume execution from the > wrong address/instruction (with the effects of the skipped instructions > missing, including the memory write...). Did I understand that > right? (Gareth, is that what you see?) I have been trying to understand the various contributions, and I admit I am still not quite sure... Does it look like the patch I proposed is correct? It seems to be supported by Terry Guo's experiments as well... Thanks!
On 09/29/2014 02:51 PM, Joel Brobecker wrote: > Hello all, > >> Hmm. So when the data abort triggers at fault+8, the instruction >> that triggered the abort hasn't actually completed, right? No memory >> has changed yet. >> >> So if nothing does the adjustment, like Gareth found out happens with >> the Black Magic Probe, then we'll resume execution from the >> wrong address/instruction (with the effects of the skipped instructions >> missing, including the memory write...). Did I understand that >> right? (Gareth, is that what you see?) > > I have been trying to understand the various contributions, and > I admit I am still not quite sure... > > Does it look like the patch I proposed is correct? It seems to be > supported by Terry Guo's experiments as well... > > Thanks! > From previous mails, it does not seem to be correct for Linux, where the ptrace interface adjusts the data fault address to point to the address of the instruction that caused the trigger. So it looks like the current behavior of GDB is correct for Linux, though it may not be correct for QEMU or bare metal. Luis
On 09/29/2014 06:51 PM, Joel Brobecker wrote: > Hello all, > >> Hmm. So when the data abort triggers at fault+8, the instruction >> that triggered the abort hasn't actually completed, right? No memory >> has changed yet. >> >> So if nothing does the adjustment, like Gareth found out happens with >> the Black Magic Probe, then we'll resume execution from the >> wrong address/instruction (with the effects of the skipped instructions >> missing, including the memory write...). Did I understand that >> right? (Gareth, is that what you see?) > > I have been trying to understand the various contributions, and > I admit I am still not quite sure... > > Does it look like the patch I proposed is correct? It seems to be > supported by Terry Guo's experiments as well... Nope, Terry's experiments supported the current code. The experiments (which were on Linux) showed that the watchpoint was reported to GDB first with the PC pointing at the instruction that accessed memory, and then GDB single-stepped once, and the PC ends up pointing at one instruction after the instruction that changed memory. Thanks, Pedro Alves
On Mon, Sep 29, 2014 at 10:04:11PM +0100, Pedro Alves wrote: > On 09/29/2014 06:51 PM, Joel Brobecker wrote: > > Hello all, > > > >> Hmm. So when the data abort triggers at fault+8, the instruction > >> that triggered the abort hasn't actually completed, right? No memory > >> has changed yet. > >> > >> So if nothing does the adjustment, like Gareth found out happens with > >> the Black Magic Probe, then we'll resume execution from the > >> wrong address/instruction (with the effects of the skipped instructions > >> missing, including the memory write...). Did I understand that > >> right? (Gareth, is that what you see?) > > > > I have been trying to understand the various contributions, and > > I admit I am still not quite sure... > > > > Does it look like the patch I proposed is correct? It seems to be > > supported by Terry Guo's experiments as well... > > Nope, Terry's experiments supported the current code. > > The experiments (which were on Linux) showed that the watchpoint was > reported to GDB first with the PC pointing at the instruction that > accessed memory, and then GDB single-stepped once, and the PC ends up > pointing at one instruction after the instruction that changed memory. FWIW, that also matches the intention of the kernel-side code. The same logic applies to arm64, despite the availability of hardware single-step there (the PTRACE_SINGLESTEP request can be used to access that feature). Furthermore, this also matches the ARMv7/8 debug architectures; a watchpoint data abort will be taken before the faulting instruction has executed. Will
diff --git a/gdb/arm-tdep.c b/gdb/arm-tdep.c index f9feb52..990c4ad 100644 --- a/gdb/arm-tdep.c +++ b/gdb/arm-tdep.c @@ -10446,9 +10446,6 @@ arm_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches) if (tdep->arm_abi == ARM_ABI_AUTO) tdep->arm_abi = ARM_ABI_APCS; - /* Watchpoints are not steppable. */ - set_gdbarch_have_nonsteppable_watchpoint (gdbarch, 1); - /* We used to default to FPA for generic ARM, but almost nobody uses that now, and we now provide a way for the user to force the model. So default to the most useful variant. */