[AArch64,Linux] Get rid of top byte from tagged address

Message ID 1508400527-20718-1-git-send-email-yao.qi@linaro.org
State New, archived
Headers

Commit Message

Yao Qi Oct. 19, 2017, 8:08 a.m. UTC
  ARMv8 supports tagged address, that is, the top one byte in address
is ignored.  It is always enabled on aarch64-linux.  The patch clear
the top byte of the virtual address, at the point before GDB/GDBserver
pass the address to /proc or ptrace syscall.  The top byte of address is
still retained in the rest of GDB, because these bits can be used by
different applications in different ways.  That is reason I didn't
implement gdbarch method addr_bits_remove to get rid of them.

Before this patch,
(gdb) x/x 0x0000000000411030
0x411030 <global>:	0x00000000
(gdb) x/x 0xf000000000411030
0xf000000000411030:	Cannot access memory at address 0xf000000000411030

After this patch,

(gdb) x/x 0x0000000000411030
0x411030 <global>:	0x00000000
(gdb) x/x 0xf000000000411030
0xf000000000411030:	0x00000000

gdb:

2017-10-19  Yao Qi  <yao.qi@linaro.org>

	* aarch64-linux-nat.c (super_xfer_partial): New function pointer.
	(aarch64_linux_xfer_partial): New function.
	(_initialize_aarch64_linux_nat): Initialize super_xfer_partial,
	and update to_xfer_partial.

gdb/gdbserver:

2017-10-19  Yao Qi  <yao.qi@linaro.org>

	* linux-aarch64-low.c (super_read_memory): New function pointer.
	(super_write_memory): Likewise.
	(aarch64_linux_read_memory): New function.
	(aarch64_linux_write_memory): New function.
	(initialize_low_arch): Initialize super_read_memory and
	super_write_memory.  Update read_memory and write_memory.
---
 gdb/aarch64-linux-nat.c           | 25 +++++++++++++++++++++++++
 gdb/gdbserver/linux-aarch64-low.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)
  

Comments

Ramana Radhakrishnan Oct. 19, 2017, 9:50 a.m. UTC | #1
On 10/19/17 9:08 AM, Yao Qi wrote:
> ARMv8 supports tagged address, that is, the top one byte in address
> is ignored.  It is always enabled on aarch64-linux.  The patch clear
> the top byte of the virtual address, at the point before GDB/GDBserver
> pass the address to /proc or ptrace syscall.  The top byte of address is
> still retained in the rest of GDB, because these bits can be used by
> different applications in different ways.  That is reason I didn't
> implement gdbarch method addr_bits_remove to get rid of them.

I think you also want to point people at ...

https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt

regards
Ramana
  
Pedro Alves Oct. 19, 2017, 9:52 a.m. UTC | #2
On 10/19/2017 09:08 AM, Yao Qi wrote:
> ARMv8 supports tagged address, that is, the top one byte in address
> is ignored.  It is always enabled on aarch64-linux.

In that case, why isn't the kernel itself stripping the top byte?

OK, looking around, I found:

 https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt

where it's documented that the top byte must be 0 when calling
into the kernel.

Having this reference in the log is helpful.

> The patch clear
> the top byte of the virtual address, at the point before GDB/GDBserver
> pass the address to /proc or ptrace syscall.  The top byte of address is
> still retained in the rest of GDB, because these bits can be used by
> different applications in different ways.  That is reason I didn't
> implement gdbarch method addr_bits_remove to get rid of them.

I'm fine with doing this if it's what arm/linaro folks want,
though personally (with absolutely no experience in this) I have
reservations about whether stripping the top byte in the special
case of memory accesses is a good idea, since it may puzzle folks
when they pass such pointers/addresses in registers/structures and
things don't magically work then (and then gdb masks the problem when
folks try to diagnose it, as in "but I can access the object
via "p *s->ptr", why isn't this working???  bad gdb.").

So I think this should be documented in the manual somewhere.

> Before this patch,
> (gdb) x/x 0x0000000000411030
> 0x411030 <global>:	0x00000000
> (gdb) x/x 0xf000000000411030
> 0xf000000000411030:	Cannot access memory at address 0xf000000000411030
> 
> After this patch,
> 
> (gdb) x/x 0x0000000000411030
> 0x411030 <global>:	0x00000000
> (gdb) x/x 0xf000000000411030
> 0xf000000000411030:	0x00000000

I think we should have a testsuite test for this.
Could be something like the above (though I'd suggest making
that 'global' variable have some value other than 0s to make
sure we're reading the right memory).

And/or something like:

  uint32_t global = 0x11223344;

  gdb_test "p *(uint32_t *) (((uintptr_t) 0xf << 60) | (uintptr_t) &global) == global" \
    " = 1" \
    "top byte ignored"

> +  if (object == TARGET_OBJECT_MEMORY)
> +    {
> +      /* ARMv8 supports tagged address, that is, the top one byte in
> +	 virtual address is ignored.  */
> +      offset = offset & 0x0fffffffffffffffULL;

In several places, instead of:
   V = V & MASK;
you can write:
   V &= MASK;
i.e here, write instead:
   offset &= 0x0fffffffffffffffULL;
which is I think more usual.

Thanks,
Pedro Alves
  
Ramana Radhakrishnan Oct. 19, 2017, 9:55 a.m. UTC | #3
On 10/19/17 10:52 AM, Pedro Alves wrote:
> On 10/19/2017 09:08 AM, Yao Qi wrote:
>> ARMv8 supports tagged address, that is, the top one byte in address
>> is ignored.  It is always enabled on aarch64-linux.
> 
> In that case, why isn't the kernel itself stripping the top byte?
> 
> OK, looking around, I found:
> 
>   https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt
> 
> where it's documented that the top byte must be 0 when calling
> into the kernel.
> 
> Having this reference in the log is helpful.
> 
>> The patch clear
>> the top byte of the virtual address, at the point before GDB/GDBserver
>> pass the address to /proc or ptrace syscall.  The top byte of address is
>> still retained in the rest of GDB, because these bits can be used by
>> different applications in different ways.  That is reason I didn't
>> implement gdbarch method addr_bits_remove to get rid of them.
> 
> I'm fine with doing this if it's what arm/linaro folks want,
> though personally (with absolutely no experience in this) I have
> reservations about whether stripping the top byte in the special
> case of memory accesses is a good idea, since it may puzzle folks
> when they pass such pointers/addresses in registers/structures and
> things don't magically work then (and then gdb masks the problem when
> folks try to diagnose it, as in "but I can access the object
> via "p *s->ptr", why isn't this working???  bad gdb.").

Yeah that thought crossed my mind too whether it makes a better debug 
experience keeping the top byte in the debug view but only stripping it 
off in the ptrace interface or wherever you have to respect the kernel 
interface.


regards
Ramana
  
Yao Qi Oct. 19, 2017, 10:51 a.m. UTC | #4
Pedro Alves <palves@redhat.com> writes:

> I'm fine with doing this if it's what arm/linaro folks want,
> though personally (with absolutely no experience in this) I have
> reservations about whether stripping the top byte in the special
> case of memory accesses is a good idea, since it may puzzle folks
> when they pass such pointers/addresses in registers/structures and
> things don't magically work then (and then gdb masks the problem when
> folks try to diagnose it, as in "but I can access the object
> via "p *s->ptr", why isn't this working???  bad gdb.").
>
> So I think this should be documented in the manual somewhere.

I don't understand how does GDB affect the program.  ARMv8 architecture
supports tagged address for data values, and top byte of virtual address
is ignored in some cases,

  struct s s1;
  s1.i = 1234;
  struct s *p1 = &s1;
  struct s *p2 = &s1;
  uint64_t t = (uint64_t) p2;

  t |= 0xf000000000000000ULL;
  p2 = (struct s *) t;

  printf ("%p %d\n", p2, p2->i);

The program output is "0xf000ffffc2e51720 1234".

However, linux kernel applies an restriction that top one byte should be
zero when user space passes address to syscall or access /proc file
system.  When GDB debugs inferior, it needs to either pass address to
kernel through syscall (ptrace) or access /proc, kernel complains on the
address.  The point of this patch is to keep tagged bits in VA, and only
get rid of them at the point when the address is to be passed to kernel.
GDB has no problem debugging the example above,

(gdb) p p2
$1 = (struct s *) 0xf000fffffffff500
(gdb) p *p2
$2 = {i = 1234}
(gdb) p p2->i
$3 = 1234
  
Yao Qi Oct. 19, 2017, 10:53 a.m. UTC | #5
Ramana Radhakrishnan <ramana.radhakrishnan@foss.arm.com> writes:

> I think you also want to point people at ...
>
> https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt

OK, I'll update commit log to mention it.
  
Pedro Alves Oct. 19, 2017, 11:08 a.m. UTC | #6
> Pedro Alves <palves@redhat.com> writes:
> 
>> I'm fine with doing this if it's what arm/linaro folks want,
>> though personally (with absolutely no experience in this) I have
>> reservations about whether stripping the top byte in the special
>> case of memory accesses is a good idea, since it may puzzle folks
>> when they pass such pointers/addresses in registers/structures and
>> things don't magically work then (and then gdb masks the problem when
>> folks try to diagnose it, as in "but I can access the object
>> via "p *s->ptr", why isn't this working???  bad gdb.").
>>
>> So I think this should be documented in the manual somewhere.
> 
> I don't understand how does GDB affect the program.  ARMv8 architecture
> supports tagged address for data values, and top byte of virtual address
> is ignored in some cases,

I didn't say that GDB affects the program.  From the kernel document:

~~~
All interpretation of userspace memory addresses by the kernel assumes
an address tag of 0x00.

This includes, but is not limited to, addresses found in:

 - pointer arguments to system calls, including pointers in structures
   passed to system calls,
~~~

This means with something like:

#define tagptr(PTR) \
 ((typeof (PTR)) ((uintptr_t) (PTR) | 0xf000000000000000ULL))

  strcat (buf, "hello\n");

  char *ptr = tagptr(buf);  // assume this is hidden from view.

  write (1, ptr, 6);  // kernel rejects this.

and then the user might be puzzled because stepping through
that code:

  (gdb) print ptr
  (gdb) print ptr[0]

etc. works without error.

Same with iovec/readv, ioctl, etc., any system call that takes
a pointer argument.

Thanks,
Pedro Alves
  
Pedro Alves Oct. 19, 2017, 11:21 a.m. UTC | #7
On 10/19/2017 09:08 AM, Yao Qi wrote:
> ARMv8 supports tagged address, that is, the top one byte in address
> is ignored.  It is always enabled on aarch64-linux.  The patch clear
> the top byte of the virtual address, at the point before GDB/GDBserver
> pass the address to /proc or ptrace syscall.  The top byte of address is
> still retained in the rest of GDB, because these bits can be used by
> different applications in different ways.  That is reason I didn't
> implement gdbarch method addr_bits_remove to get rid of them.

Something else that came to mind -- do we need to make GDB's
stack/code caches aware of this?  I.e., for the case of
writing through a tagged address and reading via a non-tagged
address, and vice-versa.

Thanks,
Pedro Alves
  
Yao Qi Oct. 19, 2017, 1:17 p.m. UTC | #8
Pedro Alves <palves@redhat.com> writes:

> This means with something like:
>
> #define tagptr(PTR) \
>  ((typeof (PTR)) ((uintptr_t) (PTR) | 0xf000000000000000ULL))
>
>   strcat (buf, "hello\n");
>
>   char *ptr = tagptr(buf);  // assume this is hidden from view.
>
>   write (1, ptr, 6);  // kernel rejects this.
>

Right, it returns -1, and errno is EFAULT.

> and then the user might be puzzled because stepping through
> that code:
>
>   (gdb) print ptr
>   (gdb) print ptr[0]
>
> etc. works without error.

That is right/expected to me, because in the c code, we can still access
ptr[0] without any error, like "char c = ptr[0]", so it is reasonable
that we can access them in GDB.  Kernel rejects that address, doesn't
mean we can't access that address.

>
> Same with iovec/readv, ioctl, etc., any system call that takes
> a pointer argument.
  
Pedro Alves Oct. 19, 2017, 1:22 p.m. UTC | #9
On 10/19/2017 02:17 PM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
> 
>> This means with something like:
>>
>> #define tagptr(PTR) \
>>  ((typeof (PTR)) ((uintptr_t) (PTR) | 0xf000000000000000ULL))
>>
>>   strcat (buf, "hello\n");
>>
>>   char *ptr = tagptr(buf);  // assume this is hidden from view.
>>
>>   write (1, ptr, 6);  // kernel rejects this.
>>
> 
> Right, it returns -1, and errno is EFAULT.
> 
>> and then the user might be puzzled because stepping through
>> that code:
>>
>>   (gdb) print ptr
>>   (gdb) print ptr[0]
>>
>> etc. works without error.
> 
> That is right/expected to me, because in the c code, we can still access
> ptr[0] without any error, like "char c = ptr[0]", so it is reasonable
> that we can access them in GDB.  Kernel rejects that address, doesn't
> mean we can't access that address.

OK, that's reasonable.

Thanks,
Pedro Alves
  
Yao Qi Oct. 19, 2017, 1:25 p.m. UTC | #10
Pedro Alves <palves@redhat.com> writes:

> Something else that came to mind -- do we need to make GDB's
> stack/code caches aware of this?  I.e., for the case of
> writing through a tagged address and reading via a non-tagged
> address, and vice-versa.

Yes, we need take care of cache in gdb.  Let me figure it out.
  

Patch

diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index 4feaec3..3c6fe2c 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -780,6 +780,27 @@  aarch64_linux_can_do_single_step (struct target_ops *target)
   return 1;
 }
 
+static target_xfer_partial_ftype *super_xfer_partial;
+
+/* Implement the "to_xfer_partial" target_ops method.  */
+
+static enum target_xfer_status
+aarch64_linux_xfer_partial (struct target_ops *ops, enum target_object object,
+			    const char *annex, gdb_byte *readbuf,
+			    const gdb_byte *writebuf, ULONGEST offset,
+			    ULONGEST len, ULONGEST *xfered_len)
+{
+  if (object == TARGET_OBJECT_MEMORY)
+    {
+      /* ARMv8 supports tagged address, that is, the top one byte in
+	 virtual address is ignored.  */
+      offset = offset & 0x0fffffffffffffffULL;
+    }
+
+  return super_xfer_partial (ops, object, annex, readbuf, writebuf,
+			     offset, len, xfered_len);
+}
+
 /* Define AArch64 maintenance commands.  */
 
 static void
@@ -834,6 +855,10 @@  _initialize_aarch64_linux_nat (void)
   super_post_startup_inferior = t->to_post_startup_inferior;
   t->to_post_startup_inferior = aarch64_linux_child_post_startup_inferior;
 
+  /* Override the default to_xfer_partial.  */
+  super_xfer_partial = t->to_xfer_partial;
+  t->to_xfer_partial = aarch64_linux_xfer_partial;
+
   /* Register the target.  */
   linux_nat_add_target (t);
   linux_nat_set_new_thread (t, aarch64_linux_new_thread);
diff --git a/gdb/gdbserver/linux-aarch64-low.c b/gdb/gdbserver/linux-aarch64-low.c
index 6d5c4e5..a78f023 100644
--- a/gdb/gdbserver/linux-aarch64-low.c
+++ b/gdb/gdbserver/linux-aarch64-low.c
@@ -3015,6 +3015,31 @@  struct linux_target_ops the_low_target =
   aarch64_get_syscall_trapinfo,
 };
 
+static int (*super_read_memory) (CORE_ADDR memaddr, unsigned char *myaddr,
+				 int len);
+
+static  int (*super_write_memory) (CORE_ADDR memaddr,
+				   const unsigned char *myaddr, int len);
+
+static int
+aarch64_linux_read_memory (CORE_ADDR memaddr, unsigned char *myaddr, int len)
+{
+  /* ARMv8 supports tagged address, that is, the top one byte in
+     virtual address is ignored.  */
+  memaddr = memaddr & 0x0fffffffffffffffULL;
+  return super_read_memory (memaddr, myaddr, len);
+}
+
+static int
+aarch64_linux_write_memory (CORE_ADDR memaddr, const unsigned char *myaddr,
+			    int len)
+{
+  /* ARMv8 supports tagged address, that is, the top one byte in
+     virtual address is ignored.  */
+  memaddr = memaddr & 0x0fffffffffffffffULL;
+  return super_write_memory (memaddr, myaddr, len);
+}
+
 void
 initialize_low_arch (void)
 {
@@ -3023,4 +3048,12 @@  initialize_low_arch (void)
   initialize_low_arch_aarch32 ();
 
   initialize_regsets_info (&aarch64_regsets_info);
+
+  /* Override the default read_memory.  */
+  super_read_memory = the_target->read_memory;
+  the_target->read_memory = aarch64_linux_read_memory;
+
+  /* Override the default write_memory.  */
+  super_write_memory = the_target->write_memory;
+  the_target->write_memory = aarch64_linux_write_memory;
 }