[1/2] This patch fixes GDBServer's run control for single stepping

Message ID 20170127150139.GB24676@E107787-LIN
State New, archived
Headers

Commit Message

Yao Qi Jan. 27, 2017, 3:01 p.m. UTC
  On 16-11-29 07:07:01, Antoine Tremblay wrote:
> ** Note these patches depend on:
> https://sourceware.org/ml/gdb-patches/2016-11/msg00914.html
> 
> Before, installing single-step breakpoints was done in proceed_one_lwp as
> each thread was started.
> 
> This caused a problem on ARM, which is the only architecture using
> software single-step on which it is not safe to modify an instruction
> to insert a breakpoint in one thread while the other threads are running.
> See the architecture manual section "A3.5.4 Concurrent modification and
> execution of instructions":
> 
> "The ARMv7 architecture limits the set of instructions that can be
> executed by one thread of execution as they are being modified by another
> thread of execution without requiring explicit synchronization.  Except
> for the instructions identified in this section, the effect of the
> concurrent modification and execution of an instruction is UNPREDICTABLE
> ."

This doesn't apply to ptrace.  PTRACE_POKETEXT on a word aligned address
is atomic, so threads observe the coherent result, either breakpoint
instruction or the original instruction.  We don't see any fails in -marm
mode, do we?

In -mthumb mode, breakpoint can be 16-bit or 32-bit, but GDBserver still
PTRACE_POKETEXT on a word aligned address (in linux-low.c:linux_write_memory)
and write a word each time.  It should be atomic, however, if the 32-bit
thumb-2 instruction is 2-byte aligned, we end up two PTRACE_POKETEXT,
which is non-atomic.  That is the root cause of the problem, AFAICS.

32-bit thumb-2 breakpoint was invented for single step 32-bit thumb-2
instruction in IT block,
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/007488.html
but we can use 16-bit thumb breakpoint instruction anywhere else.  If I
force GDBserver to use 16-bit thumb breakpoint instruction even on 32-bit
thumb-2 instruction, I can't see any fails in schedlock.exp anymore!  See
the patch below.

Out of IT block, we can use 16-bit thumb breakpoint for any thumb code.  In
IT block, we can use 16-bit thumb breakpoint for 16-bit instruction, and
32-bit thumb-2 breakpoint for 4-byte aligned 32-bit thumb-2 instruction.
However, we can not use either 16-bit or 32-bit breakpoint for 2-byte
aligned 32-bit thumb-2 instruction in IT block.

The reason we can't use 16-bit breakpoint on a 32-bit instruction in IT
block is that 16-bit breakpoint instruction makes the second half of
32-bit instruction to another different 16-bit instruction, and the 16-bit
breakpoint instruction may not be executed at all, so the second half
of instruction may be executed, and the result is completely unknown.
GDB sets two breakpoints on two branches, "if" branch and "else" branch,
because GDB doesn't know how does the instruction to be executed affect
the flag.

	      /* There are conditional instructions after this one.
		 If this instruction modifies the flags, then we can
		 not predict what the next executed instruction will
		 be.  Fortunately, this instruction is architecturally
		 forbidden to branch; we know it will fall through.
		 Start by skipping past it.  */

GDB/GDBserver has to emulate the instruction on how does it affect the
flag, and only insert the breakpoint on the "true" branch.  Since the
target instruction will be definitely executed, we can safely use
16-bit breakpoint instruction.
  

Comments

Antoine Tremblay Jan. 27, 2017, 4:06 p.m. UTC | #1
Yao Qi writes:

> On 16-11-29 07:07:01, Antoine Tremblay wrote:
>> ** Note these patches depend on:
>> https://sourceware.org/ml/gdb-patches/2016-11/msg00914.html
>>
>> Before, installing single-step breakpoints was done in proceed_one_lwp as
>> each thread was started.
>>
>> This caused a problem on ARM, which is the only architecture using
>> software single-step on which it is not safe to modify an instruction
>> to insert a breakpoint in one thread while the other threads are running.
>> See the architecture manual section "A3.5.4 Concurrent modification and
>> execution of instructions":
>>
>> "The ARMv7 architecture limits the set of instructions that can be
>> executed by one thread of execution as they are being modified by another
>> thread of execution without requiring explicit synchronization.  Except
>> for the instructions identified in this section, the effect of the
>> concurrent modification and execution of an instruction is UNPREDICTABLE
>> ."
>
> This doesn't apply to ptrace.  PTRACE_POKETEXT on a word aligned address
> is atomic, so threads observe the coherent result, either breakpoint
> instruction or the original instruction.  We don't see any fails in -marm
> mode, do we?
>

Indeed.

> In -mthumb mode, breakpoint can be 16-bit or 32-bit, but GDBserver still
> PTRACE_POKETEXT on a word aligned address (in linux-low.c:linux_write_memory)
> and write a word each time.  It should be atomic, however, if the 32-bit
> thumb-2 instruction is 2-byte aligned, we end up two PTRACE_POKETEXT,
> which is non-atomic.  That is the root cause of the problem, AFAICS.
>

Haaa makes total sense now :) thx!

> 32-bit thumb-2 breakpoint was invented for single step 32-bit thumb-2
> instruction in IT block,
> http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/007488.html
> but we can use 16-bit thumb breakpoint instruction anywhere else.  If I
> force GDBserver to use 16-bit thumb breakpoint instruction even on 32-bit
> thumb-2 instruction, I can't see any fails in schedlock.exp anymore!  See
> the patch below.
>
> Out of IT block, we can use 16-bit thumb breakpoint for any thumb code.  In
> IT block, we can use 16-bit thumb breakpoint for 16-bit instruction, and
> 32-bit thumb-2 breakpoint for 4-byte aligned 32-bit thumb-2 instruction.
> However, we can not use either 16-bit or 32-bit breakpoint for 2-byte
> aligned 32-bit thumb-2 instruction in IT block.
>
> The reason we can't use 16-bit breakpoint on a 32-bit instruction in IT
> block is that 16-bit breakpoint instruction makes the second half of
> 32-bit instruction to another different 16-bit instruction, and the 16-bit
> breakpoint instruction may not be executed at all, so the second half
> of instruction may be executed, and the result is completely unknown.
> GDB sets two breakpoints on two branches, "if" branch and "else" branch,
> because GDB doesn't know how does the instruction to be executed affect
> the flag.
>
> 	      /* There are conditional instructions after this one.
> 		 If this instruction modifies the flags, then we can
> 		 not predict what the next executed instruction will
> 		 be.  Fortunately, this instruction is architecturally
> 		 forbidden to branch; we know it will fall through.
> 		 Start by skipping past it.  */
>
> GDB/GDBserver has to emulate the instruction on how does it affect the
> flag, and only insert the breakpoint on the "true" branch.  Since the
> target instruction will be definitely executed, we can safely use
> 16-bit breakpoint instruction.

Ouch, reading the kernel thread it looks like this emulation would be
complex to say the least.

I think it would be better to get the current single stepping working
with the stop all threads logic since GDBServer was working like that
when GDB was doing the single stepping anyway. This would fix the current
regression.

Then work could be done to improve GDBServer to be better at
non-stopping.

WDYT ?

Thanks,
Antoine
  
Yao Qi Jan. 27, 2017, 5 p.m. UTC | #2
On Fri, Jan 27, 2017 at 4:06 PM, Antoine Tremblay
<antoine.tremblay@ericsson.com> wrote:
>> GDB/GDBserver has to emulate the instruction on how does it affect the
>> flag, and only insert the breakpoint on the "true" branch.  Since the
>> target instruction will be definitely executed, we can safely use
>> 16-bit breakpoint instruction.
>
> Ouch, reading the kernel thread it looks like this emulation would be
> complex to say the least.
>

There are some other ideas discussed in the kernel threads, but I didn't
go through them.  They may work.  If emulation is complex, probably
we can partially fix this problem by "always using 16-bit thumb instruction
if program is out of IT block".

> I think it would be better to get the current single stepping working
> with the stop all threads logic since GDBServer was working like that
> when GDB was doing the single stepping anyway. This would fix the current
> regression.
>
> Then work could be done to improve GDBServer to be better at
> non-stopping.
>
> WDYT ?
>

Sounds like we are applying the ARM linux limitation to a general place.
Other software single step targets may write breakpoint in atomic way,
so we don't need to stop all threads.  Even in -marm mode, we don't
have to stop all threads on inserting breakpoints.
  
Antoine Tremblay Jan. 27, 2017, 6:24 p.m. UTC | #3
Yao Qi writes:

> On Fri, Jan 27, 2017 at 4:06 PM, Antoine Tremblay
> <antoine.tremblay@ericsson.com> wrote:
>>> GDB/GDBserver has to emulate the instruction on how does it affect the
>>> flag, and only insert the breakpoint on the "true" branch.  Since the
>>> target instruction will be definitely executed, we can safely use
>>> 16-bit breakpoint instruction.
>>
>> Ouch, reading the kernel thread it looks like this emulation would be
>> complex to say the least.
>>
>
> There are some other ideas discussed in the kernel threads, but I didn't
> go through them.  They may work.

There was this one
http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/007515.html

But it got discarded: http://lists.infradead.org/pipermail/linux-arm-kernel/2010-January/007517.html

> If emulation is complex, probably
> we can partially fix this problem by "always using 16-bit thumb instruction
> if program is out of IT block".
>

I would be against that since it would leave the feature partially
working and crashing the program when it fails...

It would also be a regression compared to previous GDBServer.
Also, IT blocks are a common place to have a breakpoint/tracepoint.

>> I think it would be better to get the current single stepping working
>> with the stop all threads logic since GDBServer was working like that
>> when GDB was doing the single stepping anyway. This would fix the current
>> regression.
>>
>> Then work could be done to improve GDBServer to be better at
>> non-stopping.
>>
>> WDYT ?
>>
>
> Sounds like we are applying the ARM linux limitation to a general
> place.
> Other software single step targets may write breakpoint in atomic way,
> so we don't need to stop all threads.  Even in -marm mode, we don't
> have to stop all threads on inserting breakpoints.

Well ARM is the only software single stepping target, while I agreee
that we would be affecting general code, I think that since there is
no software single step breakpoint target that supports atomic
breakpoint writes at the moment it's normal that the run control
doesn't support that.

I don't count -marm as an arch since there no way to check that all the
program including shared libs etc is -marm, I don't think we could make
the distinction in GDBServer AFAIK.

Should an arch support this in the future we could adapt the run control
to support both cases possibly via some arch specific property.

I think also that given the current trends in architecture design the
odds of another arch needing that in the future are also unlikely ?

WDYT ?
  
Yao Qi Jan. 29, 2017, 9:40 p.m. UTC | #4
On Fri, Jan 27, 2017 at 6:24 PM, Antoine Tremblay
<antoine.tremblay@ericsson.com> wrote:
>
> Yao Qi writes:
>
>> On Fri, Jan 27, 2017 at 4:06 PM, Antoine Tremblay
>> <antoine.tremblay@ericsson.com> wrote:

>> If emulation is complex, probably
>> we can partially fix this problem by "always using 16-bit thumb instruction
>> if program is out of IT block".
>>
>
> I would be against that since it would leave the feature partially
> working and crashing the program when it fails...
>
> It would also be a regression compared to previous GDBServer.

There won't be any regression.  16-bit thumb breakpoint works pretty well
for any thumb instructions (16-bit and 32-bit) if program is out of IT block.
The 32-bit thumb-2 breakpoint was added for single step IT block.

> Also, IT blocks are a common place to have a breakpoint/tracepoint.
>

We don't change anything when setting breakpoint inside IT block.

>>> I think it would be better to get the current single stepping working
>>> with the stop all threads logic since GDBServer was working like that
>>> when GDB was doing the single stepping anyway. This would fix the current
>>> regression.
>>>
>>> Then work could be done to improve GDBServer to be better at
>>> non-stopping.
>>>
>>> WDYT ?
>>>
>>
>> Sounds like we are applying the ARM linux limitation to a general
>> place.
>> Other software single step targets may write breakpoint in atomic way,
>> so we don't need to stop all threads.  Even in -marm mode, we don't
>> have to stop all threads on inserting breakpoints.
>
> Well ARM is the only software single stepping target, while I agreee
> that we would be affecting general code, I think that since there is
> no software single step breakpoint target that supports atomic
> breakpoint writes at the moment it's normal that the run control
> doesn't support that.

No, ARM Linux ptrace POKETEXT is _atomic_ if the address is 4-byte
aligned, so 32-bit arm breakpoint and 16-bit thumb breakpoint can be
written atomically.  32-bit thumb-2 breakpoint can be written atomically
too if the address is 4-byte aligned.

The only problem we have is inserting a breakpoint on a 2-byte aligned
32-bit thumb-2 instruction inside IT block, we can not use neither 16-bit thumb
breakpoint nor 32-bit thumb breakpoint.  Everything works fine in other
cases.

>
> I don't count -marm as an arch since there no way to check that all the
> program including shared libs etc is -marm, I don't think we could make
> the distinction in GDBServer AFAIK.

We can check with -mthumb.  My hack in last email fixes fails in
schedlock.exp in thumb mode.
  
Antoine Tremblay Jan. 30, 2017, 1:28 p.m. UTC | #5
Yao Qi writes:

> On Fri, Jan 27, 2017 at 6:24 PM, Antoine Tremblay
> <antoine.tremblay@ericsson.com> wrote:
>>
>> Yao Qi writes:
>>
>>> On Fri, Jan 27, 2017 at 4:06 PM, Antoine Tremblay
>>> <antoine.tremblay@ericsson.com> wrote:
>
>>> If emulation is complex, probably
>>> we can partially fix this problem by "always using 16-bit thumb instruction
>>> if program is out of IT block".
>>>
>>
>> I would be against that since it would leave the feature partially
>> working and crashing the program when it fails...
>>
>> It would also be a regression compared to previous GDBServer.
>
> There won't be any regression.  16-bit thumb breakpoint works pretty well
> for any thumb instructions (16-bit and 32-bit) if program is out of IT block.
> The 32-bit thumb-2 breakpoint was added for single step IT block.

Yes so there will be a regression for single step inside an IT block if
we keep using the 32 bit breakpoint since this one can be non atomic if
it's not aligned properly.

We could write a test case for it and it would fail like schedlock
did. But it would not with the single stepping controlled by GDB like it
was before.

>
>> Also, IT blocks are a common place to have a breakpoint/tracepoint.
>>
>
> We don't change anything when setting breakpoint inside IT block.

Well that's a problem if we write a 32 bit thumb2 breakpoint aligned on
2 bytes like discussed before.

>
>>>> I think it would be better to get the current single stepping working
>>>> with the stop all threads logic since GDBServer was working like that
>>>> when GDB was doing the single stepping anyway. This would fix the current
>>>> regression.
>>>>
>>>> Then work could be done to improve GDBServer to be better at
>>>> non-stopping.
>>>>
>>>> WDYT ?
>>>>
>>>
>>> Sounds like we are applying the ARM linux limitation to a general
>>> place.
>>> Other software single step targets may write breakpoint in atomic way,
>>> so we don't need to stop all threads.  Even in -marm mode, we don't
>>> have to stop all threads on inserting breakpoints.
>>
>> Well ARM is the only software single stepping target, while I agreee
>> that we would be affecting general code, I think that since there is
>> no software single step breakpoint target that supports atomic
>> breakpoint writes at the moment it's normal that the run control
>> doesn't support that.
>
> No, ARM Linux ptrace POKETEXT is _atomic_ if the address is 4-byte
> aligned, so 32-bit arm breakpoint and 16-bit thumb breakpoint can be
> written atomically.  32-bit thumb-2 breakpoint can be written atomically
> too if the address is 4-byte aligned.
>
> The only problem we have is inserting a breakpoint on a 2-byte aligned
> 32-bit thumb-2 instruction inside IT block, we can not use neither 16-bit thumb
> breakpoint nor 32-bit thumb breakpoint.  Everything works fine in other
> cases.
>

I mean software single stepping as a whole, which means considering all
cases, is not atomic. I don't see how we could leave that case
unaddressed ?

Especially since it will crash the inferior ?

>>
>> I don't count -marm as an arch since there no way to check that all the
>> program including shared libs etc is -marm, I don't think we could make
>> the distinction in GDBServer AFAIK.
>
> We can check with -mthumb.  My hack in last email fixes fails in
> schedlock.exp in thumb mode.

Yes but schedlock.exp is not made to test the 2 byte aligned thumb2
32-bit breakpoint IIRC.
  
Pedro Alves Feb. 3, 2017, 4:13 p.m. UTC | #6
On 01/30/2017 01:28 PM, Antoine Tremblay wrote:

>> We don't change anything when setting breakpoint inside IT block.
> 
> Well that's a problem if we write a 32 bit thumb2 breakpoint aligned on
> 2 bytes like discussed before.

Can we restrict the stopping-all-threads to the case that
needs it, only?

An optimization that would avoid this would be to use a
hardware watchpoint/breakpoint (if available) for single-stepping.
IIRC, you can ARM a breakpoint (or was it a watchpoint) on ARM for 
triggering when the PC is different from the current PC (or really,
some specified address).

In IT blocks, this would probably make the thread stop on
instructions that aren't really executed (e.g., the then/else
branches when the condition is correspondingly false/true),
unlike the current solution where breakpoint instructions are
not executed by the CPU when it falls on an instruction that
isn't executed (because the breakpoint opcode we use it just
some magic invalid instruction, node the BKPT instruction), but
I think that when the thread stops, and we're stepping an IT
block, we could look at the status registers and decide whether
to single-step again.

TBC, I'm not suggesting that we need to do that right now.

The emulation solution discussed on the lkml thread made
me thing of displaced stepping -- the ARM implementation
emulates some instructions.  I wonder whether displaced
stepping over IT blocks is already handled correctly.

Thanks,
Pedro Alves
  
Yao Qi Feb. 16, 2017, 10:31 p.m. UTC | #7
On 17-01-30 08:28:29, Antoine Tremblay wrote:
>
> Yao Qi writes:
>
> > On Fri, Jan 27, 2017 at 6:24 PM, Antoine Tremblay
> > <antoine.tremblay@ericsson.com> wrote:
> >>
> >> Yao Qi writes:
> >>
> >>> On Fri, Jan 27, 2017 at 4:06 PM, Antoine Tremblay
> >>> <antoine.tremblay@ericsson.com> wrote:
> >
> >>> If emulation is complex, probably
> >>> we can partially fix this problem by "always using 16-bit thumb instruction
> >>> if program is out of IT block".
> >>>
> >>
> >> I would be against that since it would leave the feature partially
> >> working and crashing the program when it fails...
> >>
> >> It would also be a regression compared to previous GDBServer.
> >
> > There won't be any regression.  16-bit thumb breakpoint works pretty well
> > for any thumb instructions (16-bit and 32-bit) if program is out of IT block.
> > The 32-bit thumb-2 breakpoint was added for single step IT block.
>
> Yes so there will be a regression for single step inside an IT block if
> we keep using the 32 bit breakpoint since this one can be non atomic if
> it's not aligned properly.
>

It does fail, but not a regression, because current GDBserver fails to
write 32-bit thumb breakpoint to 2-byte aligned 32-bit instruction
atomically, regardless the program is within IT block or not.  My
suggestion is that "let us fix this problem when program is out of IT
block by using 16-bit thumb breakpoint".  That will fix the issue
of atomic write when program is out of IT block, and leave the problem
there when program is within IT block.  Why is it a regression?

> We could write a test case for it and it would fail like schedlock
> did. But it would not with the single stepping controlled by GDB like it
> was before.
>
> >
> >> Also, IT blocks are a common place to have a breakpoint/tracepoint.
> >>
> >
> > We don't change anything when setting breakpoint inside IT block.
>
> Well that's a problem if we write a 32 bit thumb2 breakpoint aligned on
> 2 bytes like discussed before.
>

That is just the sub-set of the original problem.

> >
> >>>> I think it would be better to get the current single stepping working
> >>>> with the stop all threads logic since GDBServer was working like that
> >>>> when GDB was doing the single stepping anyway. This would fix the current
> >>>> regression.
> >>>>
> >>>> Then work could be done to improve GDBServer to be better at
> >>>> non-stopping.
> >>>>
> >>>> WDYT ?
> >>>>
> >>>
> >>> Sounds like we are applying the ARM linux limitation to a general
> >>> place.
> >>> Other software single step targets may write breakpoint in atomic way,
> >>> so we don't need to stop all threads.  Even in -marm mode, we don't
> >>> have to stop all threads on inserting breakpoints.
> >>
> >> Well ARM is the only software single stepping target, while I agreee
> >> that we would be affecting general code, I think that since there is
> >> no software single step breakpoint target that supports atomic
> >> breakpoint writes at the moment it's normal that the run control
> >> doesn't support that.
> >
> > No, ARM Linux ptrace POKETEXT is _atomic_ if the address is 4-byte
> > aligned, so 32-bit arm breakpoint and 16-bit thumb breakpoint can be
> > written atomically.  32-bit thumb-2 breakpoint can be written atomically
> > too if the address is 4-byte aligned.
> >
> > The only problem we have is inserting a breakpoint on a 2-byte aligned
> > 32-bit thumb-2 instruction inside IT block, we can not use neither 16-bit thumb
> > breakpoint nor 32-bit thumb breakpoint.  Everything works fine in other
> > cases.
> >
>
> I mean software single stepping as a whole, which means considering all
> cases, is not atomic. I don't see how we could leave that case
> unaddressed ?
>
> Especially since it will crash the inferior ?
>

I am thinking how do we make progress here.  Nowadays, the multi-threaded
program may crash almost everywhere, but we can partially fix it to an
extent that the multi-threaded program may only crash in side IT block.
Is it a progress?  I feel it is a good example to apply "divide and
conquer" to a complicated engineering problem.  We can easily fix the
first half of this problem, and then think about the second half.

> >>
> >> I don't count -marm as an arch since there no way to check that all the
> >> program including shared libs etc is -marm, I don't think we could make
> >> the distinction in GDBServer AFAIK.
> >
> > We can check with -mthumb.  My hack in last email fixes fails in
> > schedlock.exp in thumb mode.
>
> Yes but schedlock.exp is not made to test the 2 byte aligned thumb2
> 32-bit breakpoint IIRC.
>

We can write one test for single step 2-byte aligned thumb-2 32-bit
instruction.  No problem.
  
Antoine Tremblay Feb. 17, 2017, 1:41 a.m. UTC | #8
Pedro Alves writes:

> On 01/30/2017 01:28 PM, Antoine Tremblay wrote:
>
>>> We don't change anything when setting breakpoint inside IT block.
>>
>> Well that's a problem if we write a 32 bit thumb2 breakpoint aligned on
>> 2 bytes like discussed before.
>

Sorry for the delay I just saw your mail...

> Can we restrict the stopping-all-threads to the case that
> needs it, only?

Possibly but I would like to avoid introducing 2 execution paths in the
run control, it's already hard to follow as it is and it would introduce
a lot of code in the arch independant code just for this case, that's
something I would like to avoid too.

>
> An optimization that would avoid this would be to use a
> hardware watchpoint/breakpoint (if available) for single-stepping.
> IIRC, you can ARM a breakpoint (or was it a watchpoint) on ARM for
> triggering when the PC is different from the current PC (or really,
> some specified address).

I did not know that but I'm worried about the limited number of hardware
watchpoints available. IIRC on my board I had only 4, if GDBServer can't
find one available would it refuse to single step ? That would be
weird... ?

It's an interesting approch however I'll dig about it more.

>
> In IT blocks, this would probably make the thread stop on
> instructions that aren't really executed (e.g., the then/else
> branches when the condition is correspondingly false/true),

Really ? I need to find something about that in the arch man...

> unlike the current solution where breakpoint instructions are
> not executed by the CPU when it falls on an instruction that
> isn't executed (because the breakpoint opcode we use it just
> some magic invalid instruction, node the BKPT instruction), but
> I think that when the thread stops, and we're stepping an IT
> block, we could look at the status registers and decide whether
> to single-step again.
>
> TBC, I'm not suggesting that we need to do that right now.
>
> The emulation solution discussed on the lkml thread made
> me thing of displaced stepping -- the ARM implementation
> emulates some instructions.  I wonder whether displaced
> stepping over IT blocks is already handled correctly.
>

It does look like displaced stepping would be preferable to trying to
emulate and validate that the emulation is correct.

Simon and I are looking into displaced stepping but we've yet to have
this discussion with Yao...
  
Pedro Alves Feb. 17, 2017, 2:04 a.m. UTC | #9
On 02/17/2017 01:41 AM, Antoine Tremblay wrote:
> 
> Pedro Alves writes:
> 
>> On 01/30/2017 01:28 PM, Antoine Tremblay wrote:
>>
>>>> We don't change anything when setting breakpoint inside IT block.
>>>
>>> Well that's a problem if we write a 32 bit thumb2 breakpoint aligned on
>>> 2 bytes like discussed before.
>>
> 
> Sorry for the delay I just saw your mail...
> 
>> Can we restrict the stopping-all-threads to the case that
>> needs it, only?
> 
> Possibly but I would like to avoid introducing 2 execution paths in the
> run control, it's already hard to follow as it is and it would introduce
> a lot of code in the arch independant code just for this case, that's
> something I would like to avoid too.

I don't immediately see how this would imply introducing lots of code
in the run control.  We shouldn't be imposing this stop-all-threads
on all archs, since it adds a lot of slowness (and
the more threads the inferior has, the worse it gets).  So if
we already need the 2 execution paths, making the condition that
determines whether to pause all threads consider more state
doesn't seem to add that much complexity to the run control
part itself.

>> An optimization that would avoid this would be to use a
>> hardware watchpoint/breakpoint (if available) for single-stepping.
>> IIRC, you can ARM a breakpoint (or was it a watchpoint) on ARM for
>> triggering when the PC is different from the current PC (or really,
>> some specified address).
> 
> I did not know that but I'm worried about the limited number of hardware
> watchpoints available. IIRC on my board I had only 4, if GDBServer can't
> find one available would it refuse to single step ? That would be
> weird... ?

My thought was that we'd give preference to user-requested
watchpoints.  I.e., treat this as an optimization.  We always need to
pause all threads in order to install watchpoints in all
threads (watchpoints must be inserted with the thread paused, and
we do that on thread resume).  So if a request for a user-watchpoints
comes in, we'd just interrupt the current single-step as we currently
do, install the watchpoints, and go back to single-stepping, if it didn't
manage to finish, as we currently do.  At this point, we notice that we
don't have free hardware watchpoints left, and fallback to do
the slow software single-step as before.

> 
> It's an interesting approch however I'll dig about it more.
> 
>>
>> In IT blocks, this would probably make the thread stop on
>> instructions that aren't really executed (e.g., the then/else
>> branches when the condition is correspondingly false/true),
> 
> Really ? I need to find something about that in the arch man...

AFAIK, in IT blocks, all instructions always "execute", but, when
the condition is false, the instruction becomes as-if a nop.
The only reason breakpoints don't stop there currently is that
breakpoints are just another instruction (actually an undefined
instruction the kernel is aware of, that causes an undefined
instruction exception that the kernel translates to a SIGTRAP
instead of a SIGILL), and when the condition is false, the breakpoint
instruction becomes a nop too...  If you have a hardware trap
set to trap at such an address though, then it should trap, I believe,
as if you had armed a hardware breakpoint to trap on the address
of a real nop instruction.

Thanks,
Pedro Alves
  
Antoine Tremblay Feb. 17, 2017, 2:16 a.m. UTC | #10
Yao Qi writes:

> On 17-01-30 08:28:29, Antoine Tremblay wrote:
>>
>> Yao Qi writes:
>>
>> > On Fri, Jan 27, 2017 at 6:24 PM, Antoine Tremblay
>> > <antoine.tremblay@ericsson.com> wrote:
>> >>
>> >> Yao Qi writes:
>> >>
>> >>> On Fri, Jan 27, 2017 at 4:06 PM, Antoine Tremblay
>> >>> <antoine.tremblay@ericsson.com> wrote:
>> >
>> >>> If emulation is complex, probably
>> >>> we can partially fix this problem by "always using 16-bit thumb instruction
>> >>> if program is out of IT block".
>> >>>
>> >>
>> >> I would be against that since it would leave the feature partially
>> >> working and crashing the program when it fails...
>> >>
>> >> It would also be a regression compared to previous GDBServer.
>> >
>> > There won't be any regression.  16-bit thumb breakpoint works pretty well
>> > for any thumb instructions (16-bit and 32-bit) if program is out of IT block.
>> > The 32-bit thumb-2 breakpoint was added for single step IT block.
>>
>> Yes so there will be a regression for single step inside an IT block if
>> we keep using the 32 bit breakpoint since this one can be non atomic if
>> it's not aligned properly.
>>
>
> It does fail, but not a regression, because current GDBserver fails to
> write 32-bit thumb breakpoint to 2-byte aligned 32-bit instruction
> atomically, regardless the program is within IT block or not.

I mean it's a regression compared to GDB 7.11. In 7.11 GDBServer will
stop all threads write the 4 bytes and then restart all threads so the
issue is not present.

Yes it wasn't atomic before either but it did not create an issue due to
the stopping of the threads.

Or maybe I misunderstand what you mean... ?

> My
> suggestion is that "let us fix this problem when program is out of IT
> block by using 16-bit thumb breakpoint".  That will fix the issue
> of atomic write when program is out of IT block, and leave the problem
> there when program is within IT block.  Why is it a regression?
>

Well like I said before in 7.11 because GDBServer stopped the threads to
write the 4 bytes, it did not have to be atomic and schedlock.exp etc
passed, single stepping a program even with a IT block did not fail.

I would like to re-validate this since you're introducing doubt into my
mind but I can't at the moment, I hope my memory serves me right but I
have not retested this now.

>> We could write a test case for it and it would fail like schedlock
>> did. But it would not with the single stepping controlled by GDB like it
>> was before.
>>
>> >
>> >> Also, IT blocks are a common place to have a breakpoint/tracepoint.
>> >>
>> >
>> > We don't change anything when setting breakpoint inside IT block.
>>
>> Well that's a problem if we write a 32 bit thumb2 breakpoint aligned on
>> 2 bytes like discussed before.
>>
>
> That is just the sub-set of the original problem.
>
>> >
>> >>>> I think it would be better to get the current single stepping working
>> >>>> with the stop all threads logic since GDBServer was working like that
>> >>>> when GDB was doing the single stepping anyway. This would fix the current
>> >>>> regression.
>> >>>>
>> >>>> Then work could be done to improve GDBServer to be better at
>> >>>> non-stopping.
>> >>>>
>> >>>> WDYT ?
>> >>>>
>> >>>
>> >>> Sounds like we are applying the ARM linux limitation to a general
>> >>> place.
>> >>> Other software single step targets may write breakpoint in atomic way,
>> >>> so we don't need to stop all threads.  Even in -marm mode, we don't
>> >>> have to stop all threads on inserting breakpoints.
>> >>
>> >> Well ARM is the only software single stepping target, while I agreee
>> >> that we would be affecting general code, I think that since there is
>> >> no software single step breakpoint target that supports atomic
>> >> breakpoint writes at the moment it's normal that the run control
>> >> doesn't support that.
>> >
>> > No, ARM Linux ptrace POKETEXT is _atomic_ if the address is 4-byte
>> > aligned, so 32-bit arm breakpoint and 16-bit thumb breakpoint can be
>> > written atomically.  32-bit thumb-2 breakpoint can be written atomically
>> > too if the address is 4-byte aligned.
>> >
>> > The only problem we have is inserting a breakpoint on a 2-byte aligned
>> > 32-bit thumb-2 instruction inside IT block, we can not use neither 16-bit thumb
>> > breakpoint nor 32-bit thumb breakpoint.  Everything works fine in other
>> > cases.
>> >
>>
>> I mean software single stepping as a whole, which means considering all
>> cases, is not atomic. I don't see how we could leave that case
>> unaddressed ?
>>
>> Especially since it will crash the inferior ?
>>
>
> I am thinking how do we make progress here.  Nowadays, the multi-threaded
> program may crash almost everywhere, but we can partially fix it to an
> extent that the multi-threaded program may only crash in side IT block.
> Is it a progress?  I feel it is a good example to apply "divide and
> conquer" to a complicated engineering problem.  We can easily fix the
> first half of this problem, and then think about the second half.
>

Well I agree with the divide an conquer approach, I would just have
divided it another way by stopping all threads so that we fix all the
problem right now, and then think about the second half which would be
to allow non-stop operations.

The solution to the program crashing in the IT block seems non-trivial
and I don't know how much time will pass before a fix is done.  I'm
afraid this would linger for a long time but maybe I'm wrong, do you plan
to address the second part for 8.0 too ? 

I would feel better if GDB worked for all cases meanwhile.

This is more important to me than not stopping the threads, especially
since in 7.11 the threads were stopped.

My point is that if we can fix the problem completely now while we think
about a better solution isn't that preferable to leaving GDB partially
fixed ?

>> >>
>> >> I don't count -marm as an arch since there no way to check that all the
>> >> program including shared libs etc is -marm, I don't think we could make
>> >> the distinction in GDBServer AFAIK.
>> >
>> > We can check with -mthumb.  My hack in last email fixes fails in
>> > schedlock.exp in thumb mode.
>>
>> Yes but schedlock.exp is not made to test the 2 byte aligned thumb2
>> 32-bit breakpoint IIRC.
>>
>
> We can write one test for single step 2-byte aligned thumb-2 32-bit
> instruction.  No problem.
  
Antoine Tremblay Feb. 17, 2017, 3:05 a.m. UTC | #11
Pedro Alves writes:

> On 02/17/2017 01:41 AM, Antoine Tremblay wrote:
>>
>> Pedro Alves writes:
>>
>>> On 01/30/2017 01:28 PM, Antoine Tremblay wrote:
>>>
>>>>> We don't change anything when setting breakpoint inside IT block.
>>>>
>>>> Well that's a problem if we write a 32 bit thumb2 breakpoint aligned on
>>>> 2 bytes like discussed before.
>>>
>>
>> Sorry for the delay I just saw your mail...
>>
>>> Can we restrict the stopping-all-threads to the case that
>>> needs it, only?
>>
>> Possibly but I would like to avoid introducing 2 execution paths in the
>> run control, it's already hard to follow as it is and it would introduce
>> a lot of code in the arch independant code just for this case, that's
>> something I would like to avoid too.
>
> I don't immediately see how this would imply introducing lots of code
> in the run control.

Well lots can be debatable, but at first glance you would basically need
this current posted patch + what it removes and a switch between the 2.

And I'm not sure how the IT block detection would be done.

It just seems like much to me, I'm actually very surprised that you're
suggesting having those two paths.

It seems like it creates much to maintain to support this particular
problem with the ARM breakpoints.

Maybe it's just that I had such a hard time getting all that run control
right, it's full of traps and corner cases with all-stop/non-stop,
stopping, suspending, deleting the breakpoints, re-adding them at the
right time etc... Adding more state is giving me quite a headache.

> We shouldn't be imposing this stop-all-threads
> on all archs, since it adds a lot of slowness (and
> the more threads the inferior has, the worse it gets).

I would be the first the agree usually that's something I would like to
avoid but considering that it was crashing the inferior in the only
architecture using this, not stopping seemed less important.

Also I'm not arguing to keep it like this forever but until there's a
better solution to the problem it seemed reasonable to me to take the
performance hit.

And I was pretty much certain that before another arch uses this we
would have figured it out.

Is there another arch on the horizon that would use this ?

>So if
> we already need the 2 execution paths, making the condition that
> determines whether to pause all threads consider more state
> doesn't seem to add that much complexity to the run control
> part itself.
>
>>> An optimization that would avoid this would be to use a
>>> hardware watchpoint/breakpoint (if available) for single-stepping.
>>> IIRC, you can ARM a breakpoint (or was it a watchpoint) on ARM for
>>> triggering when the PC is different from the current PC (or really,
>>> some specified address).
>>
>> I did not know that but I'm worried about the limited number of hardware
>> watchpoints available. IIRC on my board I had only 4, if GDBServer can't
>> find one available would it refuse to single step ? That would be
>> weird... ?
>
> My thought was that we'd give preference to user-requested
> watchpoints.  I.e., treat this as an optimization.  We always need to
> pause all threads in order to install watchpoints in all
> threads (watchpoints must be inserted with the thread paused, and
> we do that on thread resume).  So if a request for a user-watchpoints
> comes in, we'd just interrupt the current single-step as we currently
> do, install the watchpoints, and go back to single-stepping, if it didn't
> manage to finish, as we currently do.  At this point, we notice that we
> don't have free hardware watchpoints left, and fallback to do
> the slow software single-step as before.
>

OK I see. Interesting.

>>
>> It's an interesting approch however I'll dig about it more.
>>
>>>
>>> In IT blocks, this would probably make the thread stop on
>>> instructions that aren't really executed (e.g., the then/else
>>> branches when the condition is correspondingly false/true),
>>
>> Really ? I need to find something about that in the arch man...
>
> AFAIK, in IT blocks, all instructions always "execute", but, when
> the condition is false, the instruction becomes as-if a nop.
> The only reason breakpoints don't stop there currently is that
> breakpoints are just another instruction (actually an undefined
> instruction the kernel is aware of, that causes an undefined
> instruction exception that the kernel translates to a SIGTRAP
> instead of a SIGILL), and when the condition is false, the breakpoint
> instruction becomes a nop too...  If you have a hardware trap
> set to trap at such an address though, then it should trap, I believe,
> as if you had armed a hardware breakpoint to trap on the address
> of a real nop instruction.
>

Seems to make sense :) I'll test it out with a hardware breakpoint.

Thanks!,
Antoine
  
Yao Qi Feb. 17, 2017, 10:19 p.m. UTC | #12
On Fri, Feb 17, 2017 at 3:05 AM, Antoine Tremblay
<antoine.tremblay@ericsson.com> wrote:
>
> And I'm not sure how the IT block detection would be done.
>

In ARM ARM, we have the pseudo code,

boolean InITBlock()
return (ITSTATE.IT<3:0> != ‘0000’);

ITSTATE can be got from CPSR.
  
Antoine Tremblay Feb. 18, 2017, 12:17 a.m. UTC | #13
Yao Qi writes:

> On Fri, Feb 17, 2017 at 3:05 AM, Antoine Tremblay
> <antoine.tremblay@ericsson.com> wrote:
>>
>> And I'm not sure how the IT block detection would be done.
>>
>
> In ARM ARM, we have the pseudo code,
>
> boolean InITBlock()
> return (ITSTATE.IT<3:0> != ‘0000’);
>
> ITSTATE can be got from CPSR.

Yes that's good if you're inserting a breakpoint at current PC but
otherwise you will need something else...
  
Yao Qi Feb. 18, 2017, 10:49 p.m. UTC | #14
On 17-02-17 19:17:56, Antoine Tremblay wrote:
> > In ARM ARM, we have the pseudo code,
> >
> > boolean InITBlock()
> > return (ITSTATE.IT<3:0> != ‘0000’);
> >
> > ITSTATE can be got from CPSR.
>
> Yes that's good if you're inserting a breakpoint at current PC but
> otherwise you will need something else...

In software single step, we calculate the next pcs, and select
breakpoint kinds of them, according to current pc.  If current
pc is not within IT block (!InITBlock ()) or the last instruction
in IT block (LastInITBlock ()), we can safely use 16-bit thumb
breakpoint for any thumb instruction.  That is, in
gdbserver/linux-aarch32-low.c:arm_breakpoint_kind_from_current_state,
we can return ARM_BP_KIND_THUMB if (!InITBlock () || LastInITBlock ()).

Then, in some level, when installing software single step breakpoints,
if one breakpoint type is ARM_BP_KIND_THUMB2 and installed
address is 2-byte aligned, stop all threads.
  
Antoine Tremblay Feb. 19, 2017, 7:39 p.m. UTC | #15
Yao Qi writes:

> On 17-02-17 19:17:56, Antoine Tremblay wrote:
>> > In ARM ARM, we have the pseudo code,
>> >
>> > boolean InITBlock()
>> > return (ITSTATE.IT<3:0> != ‘0000’);
>> >
>> > ITSTATE can be got from CPSR.
>>
>> Yes that's good if you're inserting a breakpoint at current PC but
>> otherwise you will need something else...
>
> In software single step, we calculate the next pcs, and select
> breakpoint kinds of them, according to current pc.  If current
> pc is not within IT block (!InITBlock ()) or the last instruction
> in IT block (LastInITBlock ()), we can safely use 16-bit thumb
> breakpoint for any thumb instruction.  That is, in
> gdbserver/linux-aarch32-low.c:arm_breakpoint_kind_from_current_state,
> we can return ARM_BP_KIND_THUMB if (!InITBlock () || LastInITBlock ()).
>
> Then, in some level, when installing software single step breakpoints,
> if one breakpoint type is ARM_BP_KIND_THUMB2 and installed
> address is 2-byte aligned, stop all threads.

Yes that would make sense but I think we can be calling get_next_pc
to insert a software single step breakpoint at an address different then
the current PCs next address.

See: https://sourceware.org/ml/gdb-patches/2016-06/msg00268.html

"> If we only remove before reporting an event to gdb, then I don't
> understand this.  We already insert single-step breakpoints when
> we process the resume request from gdb, no?

We insert single-step breakpoints when we process the resume requests
and threads are about to be resumed.  If threads still have pending
status, single-step breakpoints are not installed, so we need to install
them in proceed_all_lwps."

This is still true AFAIK so GDBServer may be at any PC stopped on a
pending event and need to insert a single step breakpoint at an address
unrelated to that event so in that case CPSR can't be used...

Thanks,
Antoine
  
Antoine Tremblay Feb. 19, 2017, 8:31 p.m. UTC | #16
Antoine Tremblay writes:

> Yao Qi writes:
>
>> On 17-02-17 19:17:56, Antoine Tremblay wrote:
>>> > In ARM ARM, we have the pseudo code,
>>> >
>>> > boolean InITBlock()
>>> > return (ITSTATE.IT<3:0> != ‘0000’);
>>> >
>>> > ITSTATE can be got from CPSR.
>>>
>>> Yes that's good if you're inserting a breakpoint at current PC but
>>> otherwise you will need something else...
>>
>> In software single step, we calculate the next pcs, and select
>> breakpoint kinds of them, according to current pc.  If current
>> pc is not within IT block (!InITBlock ()) or the last instruction
>> in IT block (LastInITBlock ()), we can safely use 16-bit thumb
>> breakpoint for any thumb instruction.  That is, in
>> gdbserver/linux-aarch32-low.c:arm_breakpoint_kind_from_current_state,
>> we can return ARM_BP_KIND_THUMB if (!InITBlock () || LastInITBlock ()).
>>
>> Then, in some level, when installing software single step breakpoints,
>> if one breakpoint type is ARM_BP_KIND_THUMB2 and installed
>> address is 2-byte aligned, stop all threads.
>
> Yes that would make sense but I think we can be calling get_next_pc
> to insert a software single step breakpoint at an address different then
> the current PCs next address.
>
> See: https://sourceware.org/ml/gdb-patches/2016-06/msg00268.html
>
> "> If we only remove before reporting an event to gdb, then I don't
>> understand this.  We already insert single-step breakpoints when
>> we process the resume request from gdb, no?
>
> We insert single-step breakpoints when we process the resume requests
> and threads are about to be resumed.  If threads still have pending
> status, single-step breakpoints are not installed, so we need to install
> them in proceed_all_lwps."
>
> This is still true AFAIK so GDBServer may be at any PC stopped on a
> pending event and need to insert a single step breakpoint at an address
> unrelated to that event so in that case CPSR can't be used...
>
> Thanks,
> Antoine

Oops looks like it had been a while since I checked the get_next_pc
code, so we switch the current thread and use that PC so it should work.

Disregard my last mail :)

Looks like this solution is a good way forward.
  
Antoine Tremblay March 29, 2017, 12:40 p.m. UTC | #17
Yao Qi writes:

> On 17-02-17 19:17:56, Antoine Tremblay wrote:
>> > In ARM ARM, we have the pseudo code,
>> >
>> > boolean InITBlock()
>> > return (ITSTATE.IT<3:0> != ‘0000’);
>> >
>> > ITSTATE can be got from CPSR.
>>
>> Yes that's good if you're inserting a breakpoint at current PC but
>> otherwise you will need something else...
>
> In software single step, we calculate the next pcs, and select
> breakpoint kinds of them, according to current pc.  If current
> pc is not within IT block (!InITBlock ()) or the last instruction
> in IT block (LastInITBlock ()), we can safely use 16-bit thumb
> breakpoint for any thumb instruction.  That is, in
> gdbserver/linux-aarch32-low.c:arm_breakpoint_kind_from_current_state,
> we can return ARM_BP_KIND_THUMB if (!InITBlock () || LastInITBlock ()).

This is not entirely true since we have to check if the next pcs are in
an IT block rather then only the current one, so there's multiple
scenarios.

Consider if current PC is the IT instruction for example, then there's
at least 2 next pcs inside the IT block where we will need to install an THUMB2
breakpoint and get_next_pcs will return that.

Now I find myself trying to replicate the logic in thumb_get_next_pcs_raw
for IT blocks in arm_breakpoint_kind_from_current_state and it doesn't
feel right.

It gives something like:

void
set_single_step_breakpoint (CORE_ADDR stop_at, ptid_t ptid)
{
  struct single_step_breakpoint *bp;

  gdb_assert (ptid_get_pid (current_ptid) == ptid_get_pid (ptid));

  CORE_ADDR placed_address = stop_at;
  int breakpoint_kind
    = target_breakpoint_kind_from_current_state (&placed_address);

  bp = (struct single_step_breakpoint *) set_breakpoint_type_at_with_kind
    (single_step_breakpoint, placed_address, NULL, breakpoint_kind);
  bp->ptid = ptid;
}

int
arm_breakpoint_kind_from_current_state (CORE_ADDR *pcptr)
{
  if (arm_is_thumb_mode ())
    {

    if ( current_pcs_or_expected_next_pcs_are_in_it_block  ??? /* That's
    not right... */

	{
	  *pcptr = UNMAKE_THUMB_ADDR (*pcptr);
	  return ARM_BP_KIND_THUMB;
	}
      else
	{
	  *pcptr = MAKE_THUMB_ADDR (*pcptr);
	  return arm_breakpoint_kind_from_pc (pcptr);
	}
    }
  else
    {
      return arm_breakpoint_kind_from_pc (pcptr);
    }
}

It would be much better if get_next_pcs could select the breakpoint kind
itself and hint that to set_single_step_breakpoint like :

 VEC (struct { next_pc, breakpoint_kind })  = (*the_low_target.get_next_pcs) (regcache);

  for (i = 0; VEC_iterate (struct  { CORE_ADDR, kind }, next_pcs, i, pc); ++i)
    set_single_step_breakpoint (pc, kind, current_ptid);

But of course that means changing that virtual function for all archs
etc... :(

I'm thinking of going with that approch but I would like to know if you
have any other solutions to propose or if that sounds OK before writing
all that code ?

Thanks,
Antoine
  
Antoine Tremblay March 29, 2017, 2:11 p.m. UTC | #18
Antoine Tremblay writes:

> Yao Qi writes:
>
>> On 17-02-17 19:17:56, Antoine Tremblay wrote:
>>> > In ARM ARM, we have the pseudo code,
>>> >
>>> > boolean InITBlock()
>>> > return (ITSTATE.IT<3:0> != ‘0000’);
>>> >
>>> > ITSTATE can be got from CPSR.
>>>
>>> Yes that's good if you're inserting a breakpoint at current PC but
>>> otherwise you will need something else...
>>
>> In software single step, we calculate the next pcs, and select
>> breakpoint kinds of them, according to current pc.  If current
>> pc is not within IT block (!InITBlock ()) or the last instruction
>> in IT block (LastInITBlock ()), we can safely use 16-bit thumb
>> breakpoint for any thumb instruction.  That is, in
>> gdbserver/linux-aarch32-low.c:arm_breakpoint_kind_from_current_state,
>> we can return ARM_BP_KIND_THUMB if (!InITBlock () || LastInITBlock ()).
>
> This is not entirely true since we have to check if the next pcs are in
> an IT block rather then only the current one, so there's multiple
> scenarios.
>
> Consider if current PC is the IT instruction for example, then there's
> at least 2 next pcs inside the IT block where we will need to install an THUMB2
> breakpoint and get_next_pcs will return that.
>
> Now I find myself trying to replicate the logic in thumb_get_next_pcs_raw
> for IT blocks in arm_breakpoint_kind_from_current_state and it doesn't
> feel right.
>
> It gives something like:
>
> void
> set_single_step_breakpoint (CORE_ADDR stop_at, ptid_t ptid)
> {
>   struct single_step_breakpoint *bp;
>
>   gdb_assert (ptid_get_pid (current_ptid) == ptid_get_pid (ptid));
>
>   CORE_ADDR placed_address = stop_at;
>   int breakpoint_kind
>     = target_breakpoint_kind_from_current_state (&placed_address);
>
>   bp = (struct single_step_breakpoint *) set_breakpoint_type_at_with_kind
>     (single_step_breakpoint, placed_address, NULL, breakpoint_kind);
>   bp->ptid = ptid;
> }
>
> int
> arm_breakpoint_kind_from_current_state (CORE_ADDR *pcptr)
> {
>   if (arm_is_thumb_mode ())
>     {
>
>     if ( current_pcs_or_expected_next_pcs_are_in_it_block  ??? /* That's
>     not right... */
>
> 	{
> 	  *pcptr = UNMAKE_THUMB_ADDR (*pcptr);
> 	  return ARM_BP_KIND_THUMB;
> 	}
>       else
> 	{
> 	  *pcptr = MAKE_THUMB_ADDR (*pcptr);
> 	  return arm_breakpoint_kind_from_pc (pcptr);
> 	}
>     }
>   else
>     {
>       return arm_breakpoint_kind_from_pc (pcptr);
>     }
> }
>
> It would be much better if get_next_pcs could select the breakpoint kind
> itself and hint that to set_single_step_breakpoint like :
>
>  VEC (struct { next_pc, breakpoint_kind })  = (*the_low_target.get_next_pcs) (regcache);
>
>   for (i = 0; VEC_iterate (struct  { CORE_ADDR, kind }, next_pcs, i, pc); ++i)
>     set_single_step_breakpoint (pc, kind, current_ptid);
>
> But of course that means changing that virtual function for all archs
> etc... :(
>
> I'm thinking of going with that approch but I would like to know if you
> have any other solutions to propose or if that sounds OK before writing
> all that code ?
>
> Thanks,
> Antoine


Just a small follow-up on this idea, I think it would simplify GDB's
implementation too, it would have to change anyway since the interface
is shared.

See commit 833b7ab5008b769dca6db6d5ee1d21d33e730132
introduces a special case in arm_breakpoint_kind_from_current_state
where it's checked if GDB is single stepping and if so redoes the
arm_get_next_pc to get the execution mode of the next instruction.

I could do the same kind of thing in GDBServer recall get_next_pcs and
if I get > 1 PC in the vect it would mean I have an IT block, but it
sounds like a hack.

I also find that confusing given that the documentation for
breakpoint_kind_from_current_state is:

# Return the breakpoint kind for this target based on the current
# processor state (e.g. the current instruction mode on ARM) and the
# *PCPTR.  In default, it is gdbarch->breakpoint_kind_from_pc.

Finding the next PCs kind with this function seems like adding another
meaning to it...
  
Antoine Tremblay March 29, 2017, 5:54 p.m. UTC | #19
Antoine Tremblay writes:

>> Consider if current PC is the IT instruction for example, then there's
>> at least 2 next pcs inside the IT block where we will need to install an THUMB2
>> breakpoint and get_next_pcs will return that.

Oops please read that "Consider if the current instruction is the CMP instruction
before an IT instruction...."

Basically so that we get into the arm-get-next-pcs.c:351 case... in fact
now that I think of it maybe that would be OK if I were to add that check in
breakpoint_kind_from_current_state also but the previous comments still
apply about the possibly hackish state of this..

Thanks,
Antoine
  

Patch

diff --git a/gdb/gdbserver/linux-aarch32-low.c b/gdb/gdbserver/linux-aarch32-low.c
index 2b710ba..789e0e6 100644
--- a/gdb/gdbserver/linux-aarch32-low.c
+++ b/gdb/gdbserver/linux-aarch32-low.c
@@ -243,7 +243,7 @@  arm_breakpoint_kind_from_pc (CORE_ADDR *pcptr)
 
 	  target_read_memory (*pcptr, (gdb_byte *) &inst1, 2);
 	  if (thumb_insn_size (inst1) == 4)
-	    return ARM_BP_KIND_THUMB2;
+	    return ARM_BP_KIND_THUMB;
 	}
       return ARM_BP_KIND_THUMB;
     }