[2/2,gdb/tdep] Fix gdb.base/watch-bitfields.exp on aarch64

Message ID 20240220205425.13587-2-tdevries@suse.de
State Committed
Headers
Series [1/2,gdb/tdep] Fix gdb.base/watchpoint-unaligned.exp on aarch64 |

Checks

Context Check Description
linaro-tcwg-bot/tcwg_gdb_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_gdb_check--master-arm success Testing passed

Commit Message

Tom de Vries Feb. 20, 2024, 8:54 p.m. UTC
  On aarch64-linux, with test-case gdb.base/watch-bitfields.exp I run into:
...
(gdb) continue^M
Continuing.^M
^M
Hardware watchpoint 2: -location q.a^M
^M
Old value = 1^M
New value = 0^M
main () at watch-bitfields.c:42^M
42        q.h--;^M
(gdb) FAIL: $exp: -location watch against bitfields: q.e: 0->5: continue
...

In a minimal form, if we step past line 37 which sets q.e, and we have a
watchpoint set on q.e, it triggers:
...
$ gdb -q -batch watch-bitfields -ex "b 37" -ex run -ex "watch q.e" -ex step
Breakpoint 1 at 0x410204: file watch-bitfields.c, line 37.

Breakpoint 1, main () at watch-bitfields.c:37
37        q.e = 5;
Hardware watchpoint 2: q.e

Hardware watchpoint 2: q.e

Old value = 0
New value = 5
main () at /home/vries/gdb/src/gdb/testsuite/gdb.base/watch-bitfields.c:38
38        q.f = 6;
...

However, if we set in addition a watchpoint on q.a, the watchpoint on q.e
doesn't trigger.

How does this happen?

Bitfield q.a is just bit 0 of byte 0, and bitfield q.e is bit 4..7 of byte 1
and bit 1 of byte 2.  So, watch q.a should watch byte 0, and watch q.e should
watch bytes 1 and 2.

Using "maint set show-debug-regs on" (and some more detailed debug prints) we
get:
...
WP2: addr=0x440028 (orig=0x440029), ctrl=0x000000d5, ref.count=1
  ctrl: enabled=1, offset=1, len=2
WP3: addr=0x440028 (orig=0x440028), ctrl=0x00000035, ref.count=1
  ctrl: enabled=1, offset=0, len=1
...
which matches that.

When executing line 37, a hardware watchpoint trap triggers and we hit
aarch64_stopped_data_address with addr_trap == 0x440028:
...
(gdb) p /x addr_trap
$1 = 0x440028
....
and since the loop in aarch64_stopped_data_address walks backward, we check
WP3 first, which matches, and consequently target_stopped_by_watchpoint
returns true in watchpoints_triggered.

Likewise for target_stopped_data_address, which also returns addr == 0x440028.
Watchpoints_triggered matches watchpoint q.a to that address, and sets
watch_triggered_yes.

However, subsequently the value of q.a is checked, and it's the same value as
before (becase the insn in line 37 didn't change q.a), so the watchpoint
hardware trap is not reported to the user.

The problem originates from that fact that aarch64_stopped_data_address picked
WP3 instead of WP2.

There's something we can do about this.  In the example above, both
target_stopped_by_watchpoint and target_stopped_data_address returned true.
Instead we can return true in target_stopped_by_watchpoint but false in
target_stopped_data_address.  This lets watchpoints_triggered known that a
watchpoint was triggered, but we don't know where, and both watchpoints
get set to watch_triggered_unknown.

Subsequently, the values of both q.a and q.e are checked, and since q.e is not
the same value as before, the watchpoint hardware trap is reported to the user.

Note that this works well for regular (write) watchpoints (watch command), but
not for read watchpoints (rwatch command), because for those no value is
checked.  Likewise for access watchpoints (awatch command).

So, fix this by:
- passing a nullptr in aarch64_fbsd_nat_target::stopped_by_watchpoint and
  aarch64_linux_nat_target::stopped_by_watchpoint to make clear we're not
  interested in the stop address,
- introducing a two-phase approach in aarch64_stopped_data_address, where:
  - phase one handles access and read watchpoints, as before, and
  - phase two handles write watchpoints, where multiple matches cause:
    - return true if addr_p == null, and
    - return false if addr_p != null.

Tested on aarch64-linux.
---
 gdb/aarch64-fbsd-nat.c     |   4 +-
 gdb/aarch64-linux-nat.c    |   4 +-
 gdb/aarch64-nat.c          | 133 ++++++++++++++++++++++++++-----------
 gdb/nat/aarch64-hw-point.c |  25 +++++++
 gdb/nat/aarch64-hw-point.h |   2 +
 5 files changed, 123 insertions(+), 45 deletions(-)
  

Comments

Luis Machado March 7, 2024, 12:11 p.m. UTC | #1
Hi Tom,

On 2/20/24 20:54, Tom de Vries wrote:
> On aarch64-linux, with test-case gdb.base/watch-bitfields.exp I run into:
> ...
> (gdb) continue^M
> Continuing.^M
> ^M
> Hardware watchpoint 2: -location q.a^M
> ^M
> Old value = 1^M
> New value = 0^M
> main () at watch-bitfields.c:42^M
> 42        q.h--;^M
> (gdb) FAIL: $exp: -location watch against bitfields: q.e: 0->5: continue
> ...


Unfortunately I don't see the same failure on a graviton (aarch64). It is usually how these
hardware watchpoint issues go. Things behave differently depending on the setup you
have, since the reported trap address may be different.

> 
> In a minimal form, if we step past line 37 which sets q.e, and we have a
> watchpoint set on q.e, it triggers:
> ...
> $ gdb -q -batch watch-bitfields -ex "b 37" -ex run -ex "watch q.e" -ex step
> Breakpoint 1 at 0x410204: file watch-bitfields.c, line 37.
> 
> Breakpoint 1, main () at watch-bitfields.c:37
> 37        q.e = 5;
> Hardware watchpoint 2: q.e
> 
> Hardware watchpoint 2: q.e
> 
> Old value = 0
> New value = 5
> main () at /home/vries/gdb/src/gdb/testsuite/gdb.base/watch-bitfields.c:38
> 38        q.f = 6;
> ...
> 
> However, if we set in addition a watchpoint on q.a, the watchpoint on q.e
> doesn't trigger.
> 
> How does this happen?
> 
> Bitfield q.a is just bit 0 of byte 0, and bitfield q.e is bit 4..7 of byte 1
> and bit 1 of byte 2.  So, watch q.a should watch byte 0, and watch q.e should
> watch bytes 1 and 2.
> 
> Using "maint set show-debug-regs on" (and some more detailed debug prints) we
> get:
> ...
> WP2: addr=0x440028 (orig=0x440029), ctrl=0x000000d5, ref.count=1
>   ctrl: enabled=1, offset=1, len=2
> WP3: addr=0x440028 (orig=0x440028), ctrl=0x00000035, ref.count=1
>   ctrl: enabled=1, offset=0, len=1
> ...
> which matches that.
> 
> When executing line 37, a hardware watchpoint trap triggers and we hit
> aarch64_stopped_data_address with addr_trap == 0x440028:
> ...
> (gdb) p /x addr_trap
> $1 = 0x440028
> ....
> and since the loop in aarch64_stopped_data_address walks backward, we check
> WP3 first, which matches, and consequently target_stopped_by_watchpoint
> returns true in watchpoints_triggered.
> 
> Likewise for target_stopped_data_address, which also returns addr == 0x440028.
> Watchpoints_triggered matches watchpoint q.a to that address, and sets
> watch_triggered_yes.
> 
> However, subsequently the value of q.a is checked, and it's the same value as
> before (becase the insn in line 37 didn't change q.a), so the watchpoint
> hardware trap is not reported to the user.
> 
> The problem originates from that fact that aarch64_stopped_data_address picked
> WP3 instead of WP2.
> 
> There's something we can do about this.  In the example above, both
> target_stopped_by_watchpoint and target_stopped_data_address returned true.
> Instead we can return true in target_stopped_by_watchpoint but false in
> target_stopped_data_address.  This lets watchpoints_triggered known that a
> watchpoint was triggered, but we don't know where, and both watchpoints
> get set to watch_triggered_unknown.
> 
> Subsequently, the values of both q.a and q.e are checked, and since q.e is not
> the same value as before, the watchpoint hardware trap is reported to the user.
> 
> Note that this works well for regular (write) watchpoints (watch command), but
> not for read watchpoints (rwatch command), because for those no value is
> checked.  Likewise for access watchpoints (awatch command).
> 
> So, fix this by:
> - passing a nullptr in aarch64_fbsd_nat_target::stopped_by_watchpoint and
>   aarch64_linux_nat_target::stopped_by_watchpoint to make clear we're not
>   interested in the stop address,
> - introducing a two-phase approach in aarch64_stopped_data_address, where:
>   - phase one handles access and read watchpoints, as before, and
>   - phase two handles write watchpoints, where multiple matches cause:
>     - return true if addr_p == null, and
>     - return false if addr_p != null.


To make sure I understand the approach, does the case where we have a write hardware watchpoint
and addr_p != null (thus returning false for aarch64_stopped_data_address) mean we don't report
any stopped data addresses, but we will still report a hardware watchpoint trigger internally and
remove said hardware watchpoints before attempting to step-over them?

I think one of the more critical issues is gdb getting stuck on hardware watchpoint triggers it is
not noticing, thus causing an endless loop of trying to step-over them and failing.

> 
> Tested on aarch64-linux.
> ---
>  gdb/aarch64-fbsd-nat.c     |   4 +-
>  gdb/aarch64-linux-nat.c    |   4 +-
>  gdb/aarch64-nat.c          | 133 ++++++++++++++++++++++++++-----------
>  gdb/nat/aarch64-hw-point.c |  25 +++++++
>  gdb/nat/aarch64-hw-point.h |   2 +
>  5 files changed, 123 insertions(+), 45 deletions(-)
> 
> diff --git a/gdb/aarch64-fbsd-nat.c b/gdb/aarch64-fbsd-nat.c
> index d7aa819023b..89ed12bb8e5 100644
> --- a/gdb/aarch64-fbsd-nat.c
> +++ b/gdb/aarch64-fbsd-nat.c
> @@ -164,9 +164,7 @@ aarch64_fbsd_nat_target::stopped_data_address (CORE_ADDR *addr_p)
>  bool
>  aarch64_fbsd_nat_target::stopped_by_watchpoint ()
>  {
> -  CORE_ADDR addr;
> -
> -  return stopped_data_address (&addr);
> +  return stopped_data_address (nullptr);
>  }
>  
>  /* Implement the "stopped_by_hw_breakpoint" target_ops method.  */
> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
> index 11a41e1afae..7dc53ff9e91 100644
> --- a/gdb/aarch64-linux-nat.c
> +++ b/gdb/aarch64-linux-nat.c
> @@ -972,9 +972,7 @@ aarch64_linux_nat_target::stopped_data_address (CORE_ADDR *addr_p)
>  bool
>  aarch64_linux_nat_target::stopped_by_watchpoint ()
>  {
> -  CORE_ADDR addr;
> -
> -  return stopped_data_address (&addr);
> +  return stopped_data_address (nullptr);
>  }
>  
>  /* Implement the "can_do_single_step" target_ops method.  */
> diff --git a/gdb/aarch64-nat.c b/gdb/aarch64-nat.c
> index 5d7d2be2827..9e859cf28cc 100644
> --- a/gdb/aarch64-nat.c
> +++ b/gdb/aarch64-nat.c
> @@ -231,47 +231,102 @@ bool
>  aarch64_stopped_data_address (const struct aarch64_debug_reg_state *state,
>  			      CORE_ADDR addr_trap, CORE_ADDR *addr_p)
>  {
> -  int i;
> -
> -  for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
> -    {
> -      const unsigned int offset
> -	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> -      const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> -      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> -      const CORE_ADDR addr_watch_aligned
> -	= align_down (state->dr_addr_wp[i], 16);
> -      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> -
> -      if (state->dr_ref_count_wp[i]
> -	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
> -	  && addr_trap >= addr_watch_aligned
> -	  && addr_trap < addr_watch + len)
> -	{
> -	  /* ADDR_TRAP reports the first address of the memory range
> -	     accessed by the CPU, regardless of what was the memory
> -	     range watched.  Thus, a large CPU access that straddles
> -	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> -	     ADDR_TRAP that is lower than the
> -	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
> -
> -	     addr: |   4   |   5   |   6   |   7   |   8   |
> -				   |---- range watched ----|
> -		   |----------- range accessed ------------|
> -
> -	     In this case, ADDR_TRAP will be 4.
> -
> -	     To match a watchpoint known to GDB core, we must never
> -	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> -	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> -	     positive on kernels older than 4.10.  See PR
> -	     external/20207.  */
> +  bool found = false;
> +  for (int phase = 0; phase <= 1; ++phase)
> +    for (int i = aarch64_num_wp_regs - 1; i >= 0; --i)
> +      {
> +	if (!(state->dr_ref_count_wp[i]
> +	      && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])))
> +	  {
> +	    /* Watchpoint disabled.  */
> +	    continue;
> +	  }
> +
> +	const enum target_hw_bp_type type
> +	  = aarch64_watchpoint_type (state->dr_ctrl_wp[i]);
> +	if (type == hw_execute)
> +	  {
> +	    /* Watchpoint disabled.  */
> +	    continue;
> +	  }
> +
> +	if (phase == 0)
> +	  {
> +	    /* Phase 0: No hw_write.  */
> +	    if (type == hw_write)
> +	      continue;
> +	  }
> +	else
> +	  {
> +	    /* Phase 1: Only hw_write.  */
> +	    if (type != hw_write)
> +	      continue;
> +	  }
> +
> +	const unsigned int offset
> +	  = aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
> +	const unsigned int len
> +	  = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
> +	const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
> +	const CORE_ADDR addr_watch_aligned
> +	  = align_down (state->dr_addr_wp[i], 16);
> +	const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
> +
> +	/* ADDR_TRAP reports the first address of the memory range
> +	   accessed by the CPU, regardless of what was the memory
> +	   range watched.  Thus, a large CPU access that straddles
> +	   the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
> +	   ADDR_TRAP that is lower than the
> +	   ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
> +
> +	   addr: |   4	 |   5	 |   6	 |   7	 |   8	 |
> +				 |---- range watched ----|
> +		 |----------- range accessed ------------|
> +
> +	   In this case, ADDR_TRAP will be 4.  */
> +	if (!(addr_trap >= addr_watch_aligned
> +	      && addr_trap < addr_watch + len))
> +	  {
> +	    /* Not a match.  */
> +	    continue;
> +	  }
> +
> +	/* To match a watchpoint known to GDB core, we must never
> +	   report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
> +	   range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
> +	   positive on kernels older than 4.10.  See PR
> +	   external/20207.  */
> +	if (addr_p != nullptr)
>  	  *addr_p = addr_orig;
> -	  return true;
> -	}
> -    }
>  
> -  return false;
> +	if (phase == 0)
> +	  {
> +	    /* Phase 0: Return first match.  */
> +	    return true;
> +	  }
> +
> +	/* Phase 1.  */
> +	if (addr_p == nullptr)
> +	  {
> +	    /* First match, and we don't need to report an address.  No need
> +	       to look for other matches.  */
> +	    return true;
> +	  }
> +
> +	if (!found)
> +	  {
> +	    /* First match, and we need to report an address.  Look for other
> +	       matches.  */
> +	    found = true;
> +	    continue;
> +	  }
> +
> +	/* More that one match, and we need to return an address.  No need to


s/More that/More than

> +	   look for further matches.  */
> +	return false;
> +      }
> +
> +  return found;
>  }
>  
>  /* Define AArch64 maintenance commands.  */
> diff --git a/gdb/nat/aarch64-hw-point.c b/gdb/nat/aarch64-hw-point.c
> index 04674dea2a6..08fd230b71f 100644
> --- a/gdb/nat/aarch64-hw-point.c
> +++ b/gdb/nat/aarch64-hw-point.c
> @@ -73,6 +73,31 @@ aarch64_watchpoint_length (unsigned int ctrl)
>    return retval;
>  }
>  
> +/* Utility function that returns the type of a watchpoint according to the
> +   content of a hardware debug control register CTRL.  */
> +
> +enum target_hw_bp_type
> +aarch64_watchpoint_type (unsigned int ctrl)
> +{
> +  unsigned int type = DR_CONTROL_TYPE (ctrl);
> +
> +  switch (type)
> +    {
> +    case 1:
> +      return hw_read;
> +    case 2:
> +      return hw_write;
> +    case 3:
> +      return hw_access;
> +    case 0:
> +      /* Reserved for a watchpoint.  It must behave as if the watchpoint is
> +	 disabled.  */
> +      return hw_execute;
> +    default:
> +      gdb_assert_not_reached ("");
> +    }
> +}
> +
>  /* Given the hardware breakpoint or watchpoint type TYPE and its
>     length LEN, return the expected encoding for a hardware
>     breakpoint/watchpoint control register.  */
> diff --git a/gdb/nat/aarch64-hw-point.h b/gdb/nat/aarch64-hw-point.h
> index 70f71db7520..bdcca932e57 100644
> --- a/gdb/nat/aarch64-hw-point.h
> +++ b/gdb/nat/aarch64-hw-point.h
> @@ -73,6 +73,7 @@
>  
>  #define DR_CONTROL_ENABLED(ctrl)	(((ctrl) & 0x1) == 1)
>  #define DR_CONTROL_MASK(ctrl)		(((ctrl) >> 5) & 0xff)
> +#define DR_CONTROL_TYPE(ctrl)		(((ctrl) >> 3) & 0x3)
>  
>  /* Structure for managing the hardware breakpoint/watchpoint resources.
>     DR_ADDR_* stores the address, DR_CTRL_* stores the control register
> @@ -107,6 +108,7 @@ void aarch64_notify_debug_reg_change (ptid_t ptid, int is_watchpoint,
>  
>  unsigned int aarch64_watchpoint_offset (unsigned int ctrl);
>  unsigned int aarch64_watchpoint_length (unsigned int ctrl);
> +enum target_hw_bp_type aarch64_watchpoint_type (unsigned int ctrl);
>  
>  int aarch64_handle_breakpoint (enum target_hw_bp_type type, CORE_ADDR addr,
>  			       int len, int is_insert, ptid_t ptid,

I think this is a reasonable workaround for this situation you describe. Testing on my end doesn't
seem to cause any issues, so I think this is OK.

Out of curiosity, what is your aarch64 setup where you can reproduce this failure?

Approved-By: Luis Machado <luis.machado@arm.com>
  
Tom de Vries March 11, 2024, 3:04 p.m. UTC | #2
On 3/7/24 13:11, Luis Machado wrote:
> Unfortunately I don't see the same failure on a graviton (aarch64). It is usually how these
> hardware watchpoint issues go. Things behave differently depending on the setup you
> have, since the reported trap address may be different.
> 

I managed to reproduce the fails in both gdb.base/watch-bitfields.exp 
and gdb.base/watchpoint-unaligned.exp on cfarm103 ( 
https://portal.cfarm.net/machines/list/ ), so if you want to reproduce 
and investigate the behaviour, you could try on that machine.

Thanks,
- Tom
  
Luis Machado March 11, 2024, 3:12 p.m. UTC | #3
On 3/11/24 15:04, Tom de Vries wrote:
> On 3/7/24 13:11, Luis Machado wrote:
>> Unfortunately I don't see the same failure on a graviton (aarch64). It is usually how these
>> hardware watchpoint issues go. Things behave differently depending on the setup you
>> have, since the reported trap address may be different.
>>
> 
> I managed to reproduce the fails in both gdb.base/watch-bitfields.exp and gdb.base/watchpoint-unaligned.exp on cfarm103 ( https://portal.cfarm.net/machines/list/ ), so if you want to reproduce and investigate the behaviour, you could try on that machine.

Thanks for the info.

Does patch 2/2 rely on patch 1/2 to work properly? Otherwise I think patch 2/2 is good to push.

I'm a bit nervous about 1/2 though, as that may also widen the window for false positives in terms of hardware
watchpoint triggers.
  
Tom de Vries March 12, 2024, 4:01 p.m. UTC | #4
On 3/7/24 13:11, Luis Machado wrote:
> Hi Tom,
> 
> On 2/20/24 20:54, Tom de Vries wrote:
>> On aarch64-linux, with test-case gdb.base/watch-bitfields.exp I run into:
>> ...
>> (gdb) continue^M
>> Continuing.^M
>> ^M
>> Hardware watchpoint 2: -location q.a^M
>> ^M
>> Old value = 1^M
>> New value = 0^M
>> main () at watch-bitfields.c:42^M
>> 42        q.h--;^M
>> (gdb) FAIL: $exp: -location watch against bitfields: q.e: 0->5: continue
>> ...
> 
> 
> Unfortunately I don't see the same failure on a graviton (aarch64). It is usually how these
> hardware watchpoint issues go. Things behave differently depending on the setup you
> have, since the reported trap address may be different.
> 
>>
>> In a minimal form, if we step past line 37 which sets q.e, and we have a
>> watchpoint set on q.e, it triggers:
>> ...
>> $ gdb -q -batch watch-bitfields -ex "b 37" -ex run -ex "watch q.e" -ex step
>> Breakpoint 1 at 0x410204: file watch-bitfields.c, line 37.
>>
>> Breakpoint 1, main () at watch-bitfields.c:37
>> 37        q.e = 5;
>> Hardware watchpoint 2: q.e
>>
>> Hardware watchpoint 2: q.e
>>
>> Old value = 0
>> New value = 5
>> main () at /home/vries/gdb/src/gdb/testsuite/gdb.base/watch-bitfields.c:38
>> 38        q.f = 6;
>> ...
>>
>> However, if we set in addition a watchpoint on q.a, the watchpoint on q.e
>> doesn't trigger.
>>
>> How does this happen?
>>
>> Bitfield q.a is just bit 0 of byte 0, and bitfield q.e is bit 4..7 of byte 1
>> and bit 1 of byte 2.  So, watch q.a should watch byte 0, and watch q.e should
>> watch bytes 1 and 2.
>>
>> Using "maint set show-debug-regs on" (and some more detailed debug prints) we
>> get:
>> ...
>> WP2: addr=0x440028 (orig=0x440029), ctrl=0x000000d5, ref.count=1
>>    ctrl: enabled=1, offset=1, len=2
>> WP3: addr=0x440028 (orig=0x440028), ctrl=0x00000035, ref.count=1
>>    ctrl: enabled=1, offset=0, len=1
>> ...
>> which matches that.
>>
>> When executing line 37, a hardware watchpoint trap triggers and we hit
>> aarch64_stopped_data_address with addr_trap == 0x440028:
>> ...
>> (gdb) p /x addr_trap
>> $1 = 0x440028
>> ....
>> and since the loop in aarch64_stopped_data_address walks backward, we check
>> WP3 first, which matches, and consequently target_stopped_by_watchpoint
>> returns true in watchpoints_triggered.
>>
>> Likewise for target_stopped_data_address, which also returns addr == 0x440028.
>> Watchpoints_triggered matches watchpoint q.a to that address, and sets
>> watch_triggered_yes.
>>
>> However, subsequently the value of q.a is checked, and it's the same value as
>> before (becase the insn in line 37 didn't change q.a), so the watchpoint
>> hardware trap is not reported to the user.
>>
>> The problem originates from that fact that aarch64_stopped_data_address picked
>> WP3 instead of WP2.
>>
>> There's something we can do about this.  In the example above, both
>> target_stopped_by_watchpoint and target_stopped_data_address returned true.
>> Instead we can return true in target_stopped_by_watchpoint but false in
>> target_stopped_data_address.  This lets watchpoints_triggered known that a
>> watchpoint was triggered, but we don't know where, and both watchpoints
>> get set to watch_triggered_unknown.
>>
>> Subsequently, the values of both q.a and q.e are checked, and since q.e is not
>> the same value as before, the watchpoint hardware trap is reported to the user.
>>
>> Note that this works well for regular (write) watchpoints (watch command), but
>> not for read watchpoints (rwatch command), because for those no value is
>> checked.  Likewise for access watchpoints (awatch command).
>>
>> So, fix this by:
>> - passing a nullptr in aarch64_fbsd_nat_target::stopped_by_watchpoint and
>>    aarch64_linux_nat_target::stopped_by_watchpoint to make clear we're not
>>    interested in the stop address,
>> - introducing a two-phase approach in aarch64_stopped_data_address, where:
>>    - phase one handles access and read watchpoints, as before, and
>>    - phase two handles write watchpoints, where multiple matches cause:
>>      - return true if addr_p == null, and
>>      - return false if addr_p != null.
> 
> 
> To make sure I understand the approach, does the case where we have a write hardware watchpoint
> and addr_p != null (thus returning false for aarch64_stopped_data_address) mean we don't report
> any stopped data addresses, but we will still report a hardware watchpoint trigger internally and
> remove said hardware watchpoints before attempting to step-over them?
> 

Hi Luis,

thanks for the review.

So, we're talking about the situation that:
- target_stopped_by_watchpoint () == true, and
- target_stopped_data_address () == false.

If you look for the uses of target_stopped_data_address, there are only 
two, with one in infrun for debugging purposes, so the only real use is 
in watchpoints_triggered.

That one returns 1 if target_stopped_by_watchpoint () == true, 
regardless of target_stopped_data_address, and that's what causes 
stopped_by_watchpoint to be set to 1 in handle_signal_stop, which does 
al the processing related to stepping over watchpoints.

The only thing that target_stopped_data_address influences is the 
watch_triggered_{no,yes,unknown} state of individual watchpoints.

So, I think the answer to your question is yes.

> I think one of the more critical issues is gdb getting stuck on hardware watchpoint triggers it is
> not noticing, thus causing an endless loop of trying to step-over them and failing.
> 
>>
>> Tested on aarch64-linux.
>> ---
>>   gdb/aarch64-fbsd-nat.c     |   4 +-
>>   gdb/aarch64-linux-nat.c    |   4 +-
>>   gdb/aarch64-nat.c          | 133 ++++++++++++++++++++++++++-----------
>>   gdb/nat/aarch64-hw-point.c |  25 +++++++
>>   gdb/nat/aarch64-hw-point.h |   2 +
>>   5 files changed, 123 insertions(+), 45 deletions(-)
>>
>> diff --git a/gdb/aarch64-fbsd-nat.c b/gdb/aarch64-fbsd-nat.c
>> index d7aa819023b..89ed12bb8e5 100644
>> --- a/gdb/aarch64-fbsd-nat.c
>> +++ b/gdb/aarch64-fbsd-nat.c
>> @@ -164,9 +164,7 @@ aarch64_fbsd_nat_target::stopped_data_address (CORE_ADDR *addr_p)
>>   bool
>>   aarch64_fbsd_nat_target::stopped_by_watchpoint ()
>>   {
>> -  CORE_ADDR addr;
>> -
>> -  return stopped_data_address (&addr);
>> +  return stopped_data_address (nullptr);
>>   }
>>   
>>   /* Implement the "stopped_by_hw_breakpoint" target_ops method.  */
>> diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
>> index 11a41e1afae..7dc53ff9e91 100644
>> --- a/gdb/aarch64-linux-nat.c
>> +++ b/gdb/aarch64-linux-nat.c
>> @@ -972,9 +972,7 @@ aarch64_linux_nat_target::stopped_data_address (CORE_ADDR *addr_p)
>>   bool
>>   aarch64_linux_nat_target::stopped_by_watchpoint ()
>>   {
>> -  CORE_ADDR addr;
>> -
>> -  return stopped_data_address (&addr);
>> +  return stopped_data_address (nullptr);
>>   }
>>   
>>   /* Implement the "can_do_single_step" target_ops method.  */
>> diff --git a/gdb/aarch64-nat.c b/gdb/aarch64-nat.c
>> index 5d7d2be2827..9e859cf28cc 100644
>> --- a/gdb/aarch64-nat.c
>> +++ b/gdb/aarch64-nat.c
>> @@ -231,47 +231,102 @@ bool
>>   aarch64_stopped_data_address (const struct aarch64_debug_reg_state *state,
>>   			      CORE_ADDR addr_trap, CORE_ADDR *addr_p)
>>   {
>> -  int i;
>> -
>> -  for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
>> -    {
>> -      const unsigned int offset
>> -	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
>> -      const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
>> -      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
>> -      const CORE_ADDR addr_watch_aligned
>> -	= align_down (state->dr_addr_wp[i], 16);
>> -      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
>> -
>> -      if (state->dr_ref_count_wp[i]
>> -	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
>> -	  && addr_trap >= addr_watch_aligned
>> -	  && addr_trap < addr_watch + len)
>> -	{
>> -	  /* ADDR_TRAP reports the first address of the memory range
>> -	     accessed by the CPU, regardless of what was the memory
>> -	     range watched.  Thus, a large CPU access that straddles
>> -	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
>> -	     ADDR_TRAP that is lower than the
>> -	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
>> -
>> -	     addr: |   4   |   5   |   6   |   7   |   8   |
>> -				   |---- range watched ----|
>> -		   |----------- range accessed ------------|
>> -
>> -	     In this case, ADDR_TRAP will be 4.
>> -
>> -	     To match a watchpoint known to GDB core, we must never
>> -	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
>> -	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
>> -	     positive on kernels older than 4.10.  See PR
>> -	     external/20207.  */
>> +  bool found = false;
>> +  for (int phase = 0; phase <= 1; ++phase)
>> +    for (int i = aarch64_num_wp_regs - 1; i >= 0; --i)
>> +      {
>> +	if (!(state->dr_ref_count_wp[i]
>> +	      && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])))
>> +	  {
>> +	    /* Watchpoint disabled.  */
>> +	    continue;
>> +	  }
>> +
>> +	const enum target_hw_bp_type type
>> +	  = aarch64_watchpoint_type (state->dr_ctrl_wp[i]);
>> +	if (type == hw_execute)
>> +	  {
>> +	    /* Watchpoint disabled.  */
>> +	    continue;
>> +	  }
>> +
>> +	if (phase == 0)
>> +	  {
>> +	    /* Phase 0: No hw_write.  */
>> +	    if (type == hw_write)
>> +	      continue;
>> +	  }
>> +	else
>> +	  {
>> +	    /* Phase 1: Only hw_write.  */
>> +	    if (type != hw_write)
>> +	      continue;
>> +	  }
>> +
>> +	const unsigned int offset
>> +	  = aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
>> +	const unsigned int len
>> +	  = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
>> +	const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
>> +	const CORE_ADDR addr_watch_aligned
>> +	  = align_down (state->dr_addr_wp[i], 16);
>> +	const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
>> +
>> +	/* ADDR_TRAP reports the first address of the memory range
>> +	   accessed by the CPU, regardless of what was the memory
>> +	   range watched.  Thus, a large CPU access that straddles
>> +	   the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
>> +	   ADDR_TRAP that is lower than the
>> +	   ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
>> +
>> +	   addr: |   4	 |   5	 |   6	 |   7	 |   8	 |
>> +				 |---- range watched ----|
>> +		 |----------- range accessed ------------|
>> +
>> +	   In this case, ADDR_TRAP will be 4.  */
>> +	if (!(addr_trap >= addr_watch_aligned
>> +	      && addr_trap < addr_watch + len))
>> +	  {
>> +	    /* Not a match.  */
>> +	    continue;
>> +	  }
>> +
>> +	/* To match a watchpoint known to GDB core, we must never
>> +	   report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
>> +	   range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
>> +	   positive on kernels older than 4.10.  See PR
>> +	   external/20207.  */
>> +	if (addr_p != nullptr)
>>   	  *addr_p = addr_orig;
>> -	  return true;
>> -	}
>> -    }
>>   
>> -  return false;
>> +	if (phase == 0)
>> +	  {
>> +	    /* Phase 0: Return first match.  */
>> +	    return true;
>> +	  }
>> +
>> +	/* Phase 1.  */
>> +	if (addr_p == nullptr)
>> +	  {
>> +	    /* First match, and we don't need to report an address.  No need
>> +	       to look for other matches.  */
>> +	    return true;
>> +	  }
>> +
>> +	if (!found)
>> +	  {
>> +	    /* First match, and we need to report an address.  Look for other
>> +	       matches.  */
>> +	    found = true;
>> +	    continue;
>> +	  }
>> +
>> +	/* More that one match, and we need to return an address.  No need to
> 
> 
> s/More that/More than
> 

Fixed.

>> +	   look for further matches.  */
>> +	return false;
>> +      }
>> +
>> +  return found;
>>   }
>>   
>>   /* Define AArch64 maintenance commands.  */
>> diff --git a/gdb/nat/aarch64-hw-point.c b/gdb/nat/aarch64-hw-point.c
>> index 04674dea2a6..08fd230b71f 100644
>> --- a/gdb/nat/aarch64-hw-point.c
>> +++ b/gdb/nat/aarch64-hw-point.c
>> @@ -73,6 +73,31 @@ aarch64_watchpoint_length (unsigned int ctrl)
>>     return retval;
>>   }
>>   
>> +/* Utility function that returns the type of a watchpoint according to the
>> +   content of a hardware debug control register CTRL.  */
>> +
>> +enum target_hw_bp_type
>> +aarch64_watchpoint_type (unsigned int ctrl)
>> +{
>> +  unsigned int type = DR_CONTROL_TYPE (ctrl);
>> +
>> +  switch (type)
>> +    {
>> +    case 1:
>> +      return hw_read;
>> +    case 2:
>> +      return hw_write;
>> +    case 3:
>> +      return hw_access;
>> +    case 0:
>> +      /* Reserved for a watchpoint.  It must behave as if the watchpoint is
>> +	 disabled.  */
>> +      return hw_execute;
>> +    default:
>> +      gdb_assert_not_reached ("");
>> +    }
>> +}
>> +
>>   /* Given the hardware breakpoint or watchpoint type TYPE and its
>>      length LEN, return the expected encoding for a hardware
>>      breakpoint/watchpoint control register.  */
>> diff --git a/gdb/nat/aarch64-hw-point.h b/gdb/nat/aarch64-hw-point.h
>> index 70f71db7520..bdcca932e57 100644
>> --- a/gdb/nat/aarch64-hw-point.h
>> +++ b/gdb/nat/aarch64-hw-point.h
>> @@ -73,6 +73,7 @@
>>   
>>   #define DR_CONTROL_ENABLED(ctrl)	(((ctrl) & 0x1) == 1)
>>   #define DR_CONTROL_MASK(ctrl)		(((ctrl) >> 5) & 0xff)
>> +#define DR_CONTROL_TYPE(ctrl)		(((ctrl) >> 3) & 0x3)
>>   
>>   /* Structure for managing the hardware breakpoint/watchpoint resources.
>>      DR_ADDR_* stores the address, DR_CTRL_* stores the control register
>> @@ -107,6 +108,7 @@ void aarch64_notify_debug_reg_change (ptid_t ptid, int is_watchpoint,
>>   
>>   unsigned int aarch64_watchpoint_offset (unsigned int ctrl);
>>   unsigned int aarch64_watchpoint_length (unsigned int ctrl);
>> +enum target_hw_bp_type aarch64_watchpoint_type (unsigned int ctrl);
>>   
>>   int aarch64_handle_breakpoint (enum target_hw_bp_type type, CORE_ADDR addr,
>>   			       int len, int is_insert, ptid_t ptid,
> 
> I think this is a reasonable workaround for this situation you describe. Testing on my end doesn't
> seem to cause any issues, so I think this is OK.
> 

Thanks, I'll commit shortly.

> Out of curiosity, what is your aarch64 setup where you can reproduce this failure?
> 

M1 macbook air with fedora asahi remix, and M1 mac mini with debian 
trixie (on the compile farm, as mentioned earlier).

Thanks,
- Tom

> Approved-By: Luis Machado <luis.machado@arm.com>
  
Tom de Vries March 12, 2024, 4:19 p.m. UTC | #5
On 3/11/24 16:12, Luis Machado wrote:
> On 3/11/24 15:04, Tom de Vries wrote:
>> On 3/7/24 13:11, Luis Machado wrote:
>>> Unfortunately I don't see the same failure on a graviton (aarch64). It is usually how these
>>> hardware watchpoint issues go. Things behave differently depending on the setup you
>>> have, since the reported trap address may be different.
>>>
>>
>> I managed to reproduce the fails in both gdb.base/watch-bitfields.exp and gdb.base/watchpoint-unaligned.exp on cfarm103 ( https://portal.cfarm.net/machines/list/ ), so if you want to reproduce and investigate the behaviour, you could try on that machine.
> 
> Thanks for the info.
> 
> Does patch 2/2 rely on patch 1/2 to work properly? Otherwise I think patch 2/2 is good to push.
> 

Patch 2/2 works fine without patch 1/2, I've pushed it.

> I'm a bit nervous about 1/2 though, as that may also widen the window for false positives in terms of hardware
> watchpoint triggers.

Ack, I'll try to follow up in that thread.

Thanks,
- Tom
  

Patch

diff --git a/gdb/aarch64-fbsd-nat.c b/gdb/aarch64-fbsd-nat.c
index d7aa819023b..89ed12bb8e5 100644
--- a/gdb/aarch64-fbsd-nat.c
+++ b/gdb/aarch64-fbsd-nat.c
@@ -164,9 +164,7 @@  aarch64_fbsd_nat_target::stopped_data_address (CORE_ADDR *addr_p)
 bool
 aarch64_fbsd_nat_target::stopped_by_watchpoint ()
 {
-  CORE_ADDR addr;
-
-  return stopped_data_address (&addr);
+  return stopped_data_address (nullptr);
 }
 
 /* Implement the "stopped_by_hw_breakpoint" target_ops method.  */
diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index 11a41e1afae..7dc53ff9e91 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -972,9 +972,7 @@  aarch64_linux_nat_target::stopped_data_address (CORE_ADDR *addr_p)
 bool
 aarch64_linux_nat_target::stopped_by_watchpoint ()
 {
-  CORE_ADDR addr;
-
-  return stopped_data_address (&addr);
+  return stopped_data_address (nullptr);
 }
 
 /* Implement the "can_do_single_step" target_ops method.  */
diff --git a/gdb/aarch64-nat.c b/gdb/aarch64-nat.c
index 5d7d2be2827..9e859cf28cc 100644
--- a/gdb/aarch64-nat.c
+++ b/gdb/aarch64-nat.c
@@ -231,47 +231,102 @@  bool
 aarch64_stopped_data_address (const struct aarch64_debug_reg_state *state,
 			      CORE_ADDR addr_trap, CORE_ADDR *addr_p)
 {
-  int i;
-
-  for (i = aarch64_num_wp_regs - 1; i >= 0; --i)
-    {
-      const unsigned int offset
-	= aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
-      const unsigned int len = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
-      const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
-      const CORE_ADDR addr_watch_aligned
-	= align_down (state->dr_addr_wp[i], 16);
-      const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
-
-      if (state->dr_ref_count_wp[i]
-	  && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])
-	  && addr_trap >= addr_watch_aligned
-	  && addr_trap < addr_watch + len)
-	{
-	  /* ADDR_TRAP reports the first address of the memory range
-	     accessed by the CPU, regardless of what was the memory
-	     range watched.  Thus, a large CPU access that straddles
-	     the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
-	     ADDR_TRAP that is lower than the
-	     ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
-
-	     addr: |   4   |   5   |   6   |   7   |   8   |
-				   |---- range watched ----|
-		   |----------- range accessed ------------|
-
-	     In this case, ADDR_TRAP will be 4.
-
-	     To match a watchpoint known to GDB core, we must never
-	     report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
-	     range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
-	     positive on kernels older than 4.10.  See PR
-	     external/20207.  */
+  bool found = false;
+  for (int phase = 0; phase <= 1; ++phase)
+    for (int i = aarch64_num_wp_regs - 1; i >= 0; --i)
+      {
+	if (!(state->dr_ref_count_wp[i]
+	      && DR_CONTROL_ENABLED (state->dr_ctrl_wp[i])))
+	  {
+	    /* Watchpoint disabled.  */
+	    continue;
+	  }
+
+	const enum target_hw_bp_type type
+	  = aarch64_watchpoint_type (state->dr_ctrl_wp[i]);
+	if (type == hw_execute)
+	  {
+	    /* Watchpoint disabled.  */
+	    continue;
+	  }
+
+	if (phase == 0)
+	  {
+	    /* Phase 0: No hw_write.  */
+	    if (type == hw_write)
+	      continue;
+	  }
+	else
+	  {
+	    /* Phase 1: Only hw_write.  */
+	    if (type != hw_write)
+	      continue;
+	  }
+
+	const unsigned int offset
+	  = aarch64_watchpoint_offset (state->dr_ctrl_wp[i]);
+	const unsigned int len
+	  = aarch64_watchpoint_length (state->dr_ctrl_wp[i]);
+	const CORE_ADDR addr_watch = state->dr_addr_wp[i] + offset;
+	const CORE_ADDR addr_watch_aligned
+	  = align_down (state->dr_addr_wp[i], 16);
+	const CORE_ADDR addr_orig = state->dr_addr_orig_wp[i];
+
+	/* ADDR_TRAP reports the first address of the memory range
+	   accessed by the CPU, regardless of what was the memory
+	   range watched.  Thus, a large CPU access that straddles
+	   the ADDR_WATCH..ADDR_WATCH+LEN range may result in an
+	   ADDR_TRAP that is lower than the
+	   ADDR_WATCH..ADDR_WATCH+LEN range.  E.g.:
+
+	   addr: |   4	 |   5	 |   6	 |   7	 |   8	 |
+				 |---- range watched ----|
+		 |----------- range accessed ------------|
+
+	   In this case, ADDR_TRAP will be 4.  */
+	if (!(addr_trap >= addr_watch_aligned
+	      && addr_trap < addr_watch + len))
+	  {
+	    /* Not a match.  */
+	    continue;
+	  }
+
+	/* To match a watchpoint known to GDB core, we must never
+	   report *ADDR_P outside of any ADDR_WATCH..ADDR_WATCH+LEN
+	   range.  ADDR_WATCH <= ADDR_TRAP < ADDR_ORIG is a false
+	   positive on kernels older than 4.10.  See PR
+	   external/20207.  */
+	if (addr_p != nullptr)
 	  *addr_p = addr_orig;
-	  return true;
-	}
-    }
 
-  return false;
+	if (phase == 0)
+	  {
+	    /* Phase 0: Return first match.  */
+	    return true;
+	  }
+
+	/* Phase 1.  */
+	if (addr_p == nullptr)
+	  {
+	    /* First match, and we don't need to report an address.  No need
+	       to look for other matches.  */
+	    return true;
+	  }
+
+	if (!found)
+	  {
+	    /* First match, and we need to report an address.  Look for other
+	       matches.  */
+	    found = true;
+	    continue;
+	  }
+
+	/* More that one match, and we need to return an address.  No need to
+	   look for further matches.  */
+	return false;
+      }
+
+  return found;
 }
 
 /* Define AArch64 maintenance commands.  */
diff --git a/gdb/nat/aarch64-hw-point.c b/gdb/nat/aarch64-hw-point.c
index 04674dea2a6..08fd230b71f 100644
--- a/gdb/nat/aarch64-hw-point.c
+++ b/gdb/nat/aarch64-hw-point.c
@@ -73,6 +73,31 @@  aarch64_watchpoint_length (unsigned int ctrl)
   return retval;
 }
 
+/* Utility function that returns the type of a watchpoint according to the
+   content of a hardware debug control register CTRL.  */
+
+enum target_hw_bp_type
+aarch64_watchpoint_type (unsigned int ctrl)
+{
+  unsigned int type = DR_CONTROL_TYPE (ctrl);
+
+  switch (type)
+    {
+    case 1:
+      return hw_read;
+    case 2:
+      return hw_write;
+    case 3:
+      return hw_access;
+    case 0:
+      /* Reserved for a watchpoint.  It must behave as if the watchpoint is
+	 disabled.  */
+      return hw_execute;
+    default:
+      gdb_assert_not_reached ("");
+    }
+}
+
 /* Given the hardware breakpoint or watchpoint type TYPE and its
    length LEN, return the expected encoding for a hardware
    breakpoint/watchpoint control register.  */
diff --git a/gdb/nat/aarch64-hw-point.h b/gdb/nat/aarch64-hw-point.h
index 70f71db7520..bdcca932e57 100644
--- a/gdb/nat/aarch64-hw-point.h
+++ b/gdb/nat/aarch64-hw-point.h
@@ -73,6 +73,7 @@ 
 
 #define DR_CONTROL_ENABLED(ctrl)	(((ctrl) & 0x1) == 1)
 #define DR_CONTROL_MASK(ctrl)		(((ctrl) >> 5) & 0xff)
+#define DR_CONTROL_TYPE(ctrl)		(((ctrl) >> 3) & 0x3)
 
 /* Structure for managing the hardware breakpoint/watchpoint resources.
    DR_ADDR_* stores the address, DR_CTRL_* stores the control register
@@ -107,6 +108,7 @@  void aarch64_notify_debug_reg_change (ptid_t ptid, int is_watchpoint,
 
 unsigned int aarch64_watchpoint_offset (unsigned int ctrl);
 unsigned int aarch64_watchpoint_length (unsigned int ctrl);
+enum target_hw_bp_type aarch64_watchpoint_type (unsigned int ctrl);
 
 int aarch64_handle_breakpoint (enum target_hw_bp_type type, CORE_ADDR addr,
 			       int len, int is_insert, ptid_t ptid,