Patch 2 of 2 for aarch64 ILP32 support in gdb

Message ID 1485904559.22118.17.camel@caviumnetworks.com
State New, archived
Headers

Commit Message

Steve Ellcey Jan. 31, 2017, 11:15 p.m. UTC
  On Sun, 2017-01-29 at 22:40 +0000, Yao Qi wrote:

> This changes the word size and address size, which is used to determine
> the inferior is arm or aarch64 by checking "bits_per_word == 32" in
> multi-arch debugging.  You can find some instances of such check in
> aarch64-linux-nat.c.  If the bits_per_word is 32 in ILP32, GDB thinks
> the inferior is an ARM one, rather than an AArch64 ILP32 one. which is
> wrong.  We need to tweak the condition above to correctly identify the
> ARM inferior.

> Do you compare the ILP32 gdb test result with normal aarch64 one?

OK, I fixed up arch64-linux-nat.c and didn't find any other places were
the bits_per_word was being checked and I fixed the TRUE/true
FALSE/false constants.  If I run gdb without any changes or with this
patch but debugging 64 bit code I get around 500 failures.  The number
doesn't seem to be constant, I have gotten anywhere from 488 to 503
failures testing the same code.  My last run had:

# of expected passes            30428
# of unexpected failures        503
# of expected failures          53
# of unknown successes          2
# of known failures             64
# of untested testcases         66
# of unresolved testcases       6
# of unsupported tests          293

Many of the failures seem to be thread related and I see a lot of
messages about timeouts and 'program is no longer running'.  When
I run gdb on 32 bit programs with:

	RUNTESTFLAGS="--target_board=unix/-mabi=ilp32"

I get 700+ failures.  My last run had:

# of expected passes            29482
# of unexpected failures        740
# of expected failures          30
# of unknown successes          2
# of known failures             65
# of untested testcases         79
# of unresolved testcases       4
# of unsupported tests          288

So there are more failures in ILP32 mode but there seem to be enough
passes that I think it still makes sense to check in the patch to get
some gdb functionality working.

Here is the latest patch.

Steve Ellcey
sellcey@cavium.com


gdb ChangeLog:

2017-01-31  Andrew Pinski  <apinski@cavium.com>
	    Steve Ellcey  <sellcey@cavium.com>
	    Yao Qi  <yao.qi@linaro.org>

	* aarch64-linux-nat.c (IS_ARM32): New macro.
	(fetch_gregs_from_thread): Use IS_ARM32 macro.
	(store_gregs_to_thread): Ditto.
	(fetch_fpregs_from_thread): Ditto.
	(store_fpregs_to_thread): Ditto.
	(ps_get_thread_area): Ditto.
	(aarch64_linux_siginfo_fixup): Ditto.
	* aarch64-linux-tdep.c (aarch64_linux_init_abi): Set link
	map offsets to 32 or 64 bits.
	* aarch64-tdep.c (aarch64_ilp32_register_type): New function.
	(aarch64_gdbarch_init): Setup ILP32 support.
	Make sure the gdbarches have compatible ilp32 flags.
	Set long and ptr sizes correctly for ilp32.
	* aarch64-tdep.h (gdbarch_tdep) <ilp32>: New field.

gdbserver ChangeLog:

2017-01-31  Andrew Pinski  <apinski@cavium.com>
	    Steve Ellcey  <sellcey@cavium.com>

	* linux-aarch64-low.c (aarch64_linux_read_description): Use
	machine instead of is_elf64 to determine architecture.  Give an
	error when using 32 bit gdbserver on 64 bit program.
  

Comments

Yao Qi Feb. 1, 2017, 10:20 p.m. UTC | #1
On 17-01-31 15:15:59, Steve Ellcey wrote:
> On Sun, 2017-01-29 at 22:40 +0000, Yao Qi wrote:
>
> > This changes the word size and address size, which is used to determine
> > the inferior is arm or aarch64 by checking "bits_per_word == 32" in
> > multi-arch debugging.????You can find some instances of such check in
> > aarch64-linux-nat.c.????If the bits_per_word is 32 in ILP32, GDB thinks
> > the inferior is an ARM one, rather than an AArch64 ILP32 one. which is
> > wrong.????We need to tweak the condition above to correctly identify the
> > ARM inferior.
> >??
> > Do you compare the ILP32 gdb test result with normal aarch64 one?
>
> OK, I fixed up??arch64-linux-nat.c and didn't find any other places were
> the bits_per_word was being checked and I fixed the TRUE/true
> FALSE/false constants. ??If I run gdb without any changes or with this
> patch but debugging 64 bit code I get around 500 failures. ??The number
> doesn't seem to be constant, I have gotten anywhere from 488 to 503
> failures testing the same code. ??My last run had:
>
> # of expected passes????????????????????????30428
> # of unexpected failures????????????????503
> # of expected failures????????????????????53
> # of unknown successes????????????????????2
> # of known failures??????????????????????????64
> # of untested testcases??????????????????66
> # of unresolved testcases??????????????6
> # of unsupported tests????????????????????293
>
> Many of the failures seem to be thread related and I see a lot of
> messages about timeouts and 'program is no longer running'. ??When

If you get 500 fails in vanilla GDB on aarch64-linux, there must be
something wrong.  We only have 36 fails in buildbot slave.

What is your compiler, library and kernel?  They may have some local
ilp32 patches.

> I run gdb on 32 bit programs with:
>
> RUNTESTFLAGS="--target_board=unix/-mabi=ilp32"
>
> I get 700+ failures. ??My last run had:
>
> # of expected passes????????????????????????29482
> # of unexpected failures????????????????740
> # of expected failures????????????????????30
> # of unknown successes????????????????????2
> # of known failures??????????????????????????65
> # of untested testcases??????????????????79
> # of unresolved testcases??????????????4
> # of unsupported tests????????????????????288
>
> So there are more failures in ILP32 mode but there seem to be enough
> passes that I think it still makes sense to check in the patch to get
> some gdb functionality working.
>

"even if GDB is horribly broken, many tests will still pass."
https://sourceware.org/gdb/wiki/TestingGDB#Caveat_emptor

500 fails and 240 regressions are not a good signal.
I don't see anything wrong from ilp32 gdb patch here, but I suspect
these 500 fails are caused by some bugs or oversights in other parts,
such as kernel or gliblc.  aarch64-linux GDB test shouldn't have 500 fails on
aarch64-linux system with ilp32 enabled.  Could you dig it out?
  
Yao Qi Feb. 1, 2017, 10:23 p.m. UTC | #2
On Wed, Feb 1, 2017 at 10:20 PM, Yao Qi <qiyaoltc@gmail.com> wrote:
> something wrong.  We only have 36 fails in buildbot slave.
>

https://gdb-build.sergiodj.net/builders/Ubuntu-AArch64-m64/builds/1337/steps/regressions/logs/stdio
  
Steve Ellcey Feb. 2, 2017, 12:27 a.m. UTC | #3
On Wed, 2017-02-01 at 22:20 +0000, Yao Qi wrote:

> If you get 500 fails in vanilla GDB on aarch64-linux, there must be
> something wrong.  We only have 36 fails in buildbot slave.
> 
> What is your compiler, library and kernel?  They may have some local
> ilp32 patches.

My compiler, library, and kernel do have ILP32 patches, though I am
building gdb in LP64 mode.  Those patches haven't caused new failures
in the GCC testsuite and only a couple of regressions in the glibc
testsuite.

I have heard there are some kernel bugs, fixed in 4.10, that could be
causing some of these failures.  I am using a 4.9 kernel with ILP32
patches.  I was running the testsuite in parallel mode and I have also
heard that could cause problems, I just redid it in serial mode to see
if that helps.  I still got 460 failures when compiling the tests in
LP64 mode.

347 of them are in gdb.threads, with 207 being messages like:

FAIL: gdb.threads/clone-attach-detach.exp: fg attach 2: detach (the program is no longer running)

I.e. a message containing '(the program is no longer running)'  I also
noticed that most of the tests that are failing have the name 'attach'
in them.  There are 97 tests in gdb.base failing and about 2/3 of them
also have some reference to 'attach'. So I think I have some issue
withgdb attaching to a process.  For example, the first failures are:

FAIL: gdb.base/attach-pie-noexec.exp: info shared
FAIL: gdb.base/attach-twice.exp: attach
FAIL: gdb.base/attach.exp: attach1, after setting file
FAIL: gdb.base/attach.exp: attach1 detach (the program is no longer
running)
FAIL: gdb.base/attach.exp: attach2, with no file
FAIL: gdb.base/attach.exp: after attach2, set should_exit
FAIL: gdb.base/attach.exp: continue to breakpoint: postloop (the
program is no longer running)
FAIL: gdb.base/attach.exp: continue until exit at after attach2, exit
(the program is no longer running)
FAIL: gdb.base/attach.exp: attach when process' a.out not in cwd
FAIL: gdb.base/attach.exp: continue until exit (the program is no
longer running)
FAIL: gdb.base/attach.exp: starting with --pid
FAIL: gdb.base/attach.exp: cmdline attach run: run to prompt


Steve Ellcey
sellcey@cavium.com
  
Steve Ellcey Feb. 2, 2017, 12:39 a.m. UTC | #4
It looks like I am having a problem with ptrace.  From gdb.log:

Attaching to program: /home/ubuntu/sellcey/gdb-std/obj/binutils/gdb/testsuite/outputs/gdb.base/attach/attach, process 7631
ptrace: Operation not permitted.
(gdb) FAIL: gdb.base/attach.exp: attach1, after setting file
print should_exit
$1 = 0


Steve Ellcey
sellcey@cavium.com
  
Yao Qi Feb. 2, 2017, 9:51 a.m. UTC | #5
On 17-02-01 16:39:56, Steve Ellcey wrote:
> It looks like I am having a problem with ptrace.  From gdb.log:
> 
> Attaching to program: /home/ubuntu/sellcey/gdb-std/obj/binutils/gdb/testsuite/outputs/gdb.base/attach/attach, process 7631
> ptrace: Operation not permitted.

Hi Steve,
This is a common ptrace attach permission issue.  You can
"echo 0 > /proc/sys/kernel/yama/ptrace_scope" or modify it in some config
file /etc/sysctl.d/10-ptrace.conf (different distro may have different
path).
  
Steve Ellcey Feb. 2, 2017, 10:03 p.m. UTC | #6
On Thu, 2017-02-02 at 09:51 +0000, Yao Qi wrote:
> On 17-02-01 16:39:56, Steve Ellcey wrote:
> > 
> > It looks like I am having a problem with ptrace.  From gdb.log:
> > 
> > Attaching to program: /home/ubuntu/sellcey/gdb-
> > std/obj/binutils/gdb/testsuite/outputs/gdb.base/attach/attach,
> > process 7631
> > ptrace: Operation not permitted.
> Hi Steve,
> This is a common ptrace attach permission issue.  You can
> "echo 0 > /proc/sys/kernel/yama/ptrace_scope" or modify it in some config
> file /etc/sysctl.d/10-ptrace.conf (different distro may have different
> path).


OK, the pthread permission problem was the main cause of my LP64
failures.  I fixed that, ran the standard gdb testsuite on unpatched
code in parallel mode and got 99 failures.  If I run in sequential mode
I got 74 failures.  Updating the kernel got me to 72 failures.  Many of
these are timeouts so I may need to increase that.  If I ignore the
timeouts I have 39 failures, which is pretty close to the 36 from the
buildbot testing though the list of what is and is not failing is
different.

I then ran the testing with 64 bit programs using the patched sources
and got about the same results as the unpatched sources on my
machine.  There was one failure that went away (a timeout):

FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: displaced=off: no thread-specific bp: continue: continue (timeout)

And two that appeared:

FAIL: gdb.base/watchpoint.exp: next over buffer set
FAIL: gdb.base/watchpoint.exp: next over ptr init

When I tested gdb on ILP32 executables I got 235 failures, only 3 of
which were timeouts.  So there are definitely some issues there.  One
thing I noticed is references to ldd.  The ldd I have in my path does
not understand ILP32 executables so that is probably causing some
failures.  I will fix that and see what else I can find to explain the
ILP32 failures.

Steve Ellcey
sellcey@cavium.com
  
Yao Qi Feb. 3, 2017, 9:17 a.m. UTC | #7
On 17-02-02 14:03:41, Steve Ellcey wrote:
> OK, the pthread permission problem was the main cause of my LP64
> failures.  I fixed that, ran the standard gdb testsuite on unpatched
> code in parallel mode and got 99 failures.  If I run in sequential mode
> I got 74 failures.  Updating the kernel got me to 72 failures.  Many of
> these are timeouts so I may need to increase that.  If I ignore the
> timeouts I have 39 failures, which is pretty close to the 36 from the
> buildbot testing though the list of what is and is not failing is
> different.

The result looks reasonable now.

> 
> I then ran the testing with 64 bit programs using the patched sources
> and got about the same results as the unpatched sources on my
> machine.  There was one failure that went away (a timeout):
> 
> FAIL: gdb.threads/step-over-trips-on-watchpoint.exp: displaced=off: no thread-specific bp: continue: continue (timeout)
> 
> And two that appeared:
> 
> FAIL: gdb.base/watchpoint.exp: next over buffer set
> FAIL: gdb.base/watchpoint.exp: next over ptr init
> 
> When I tested gdb on ILP32 executables I got 235 failures, only 3 of
> which were timeouts.  So there are definitely some issues there.  One
> thing I noticed is references to ldd.  The ldd I have in my path does
> not understand ILP32 executables so that is probably causing some
> failures.  I will fix that and see what else I can find to explain the
> ILP32 failures.
> 

OK, great!
  

Patch

diff --git a/gdb/aarch64-linux-nat.c b/gdb/aarch64-linux-nat.c
index 0d472e2..f6fd19a 100644
--- a/gdb/aarch64-linux-nat.c
+++ b/gdb/aarch64-linux-nat.c
@@ -49,6 +49,9 @@ 
 #define TRAP_HWBKPT 0x0004
 #endif
 
+/* Check if we are on arm (as opposed to aarch64).  */
+#define IS_ARM32(arch) (gdbarch_bfd_arch_info(arch)->mach == bfd_mach_arm)
+
 /* Per-process data.  We don't bind this to a per-inferior registry
    because of targets like x86 GNU/Linux that need to keep track of
    processes that aren't bound to any inferior (e.g., fork children,
@@ -166,7 +169,7 @@  fetch_gregs_from_thread (struct regcache *regcache)
   tid = ptid_get_lwp (inferior_ptid);
 
   iovec.iov_base = &regs;
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     iovec.iov_len = 18 * 4;
   else
     iovec.iov_len = sizeof (regs);
@@ -175,7 +178,7 @@  fetch_gregs_from_thread (struct regcache *regcache)
   if (ret < 0)
     perror_with_name (_("Unable to fetch general registers."));
 
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     aarch32_gp_regcache_supply (regcache, (uint32_t *) regs, 1);
   else
     {
@@ -203,7 +206,7 @@  store_gregs_to_thread (const struct regcache *regcache)
   tid = ptid_get_lwp (inferior_ptid);
 
   iovec.iov_base = &regs;
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     iovec.iov_len = 18 * 4;
   else
     iovec.iov_len = sizeof (regs);
@@ -212,7 +215,7 @@  store_gregs_to_thread (const struct regcache *regcache)
   if (ret < 0)
     perror_with_name (_("Unable to fetch general registers."));
 
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     aarch32_gp_regcache_collect (regcache, (uint32_t *) regs, 1);
   else
     {
@@ -248,7 +251,7 @@  fetch_fpregs_from_thread (struct regcache *regcache)
 
   iovec.iov_base = &regs;
 
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     {
       iovec.iov_len = VFP_REGS_SIZE;
 
@@ -295,7 +298,7 @@  store_fpregs_to_thread (const struct regcache *regcache)
 
   iovec.iov_base = &regs;
 
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     {
       iovec.iov_len = VFP_REGS_SIZE;
 
@@ -328,7 +331,7 @@  store_fpregs_to_thread (const struct regcache *regcache)
 			      (char *) &regs.fpcr);
     }
 
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     {
       ret = ptrace (PTRACE_SETREGSET, tid, NT_ARM_VFP, &iovec);
       if (ret < 0)
@@ -460,8 +463,7 @@  ps_err_e
 ps_get_thread_area (struct ps_prochandle *ph,
 		    lwpid_t lwpid, int idx, void **base)
 {
-  int is_64bit_p
-    = (gdbarch_bfd_arch_info (target_gdbarch ())->bits_per_word == 64);
+  int is_64bit_p = !IS_ARM32 (target_gdbarch ());
 
   return aarch64_ps_get_thread_area (ph, lwpid, idx, base, is_64bit_p);
 }
@@ -517,7 +519,7 @@  aarch64_linux_siginfo_fixup (siginfo_t *native, gdb_byte *inf, int direction)
 
   /* Is the inferior 32-bit?  If so, then do fixup the siginfo
      object.  */
-  if (gdbarch_bfd_arch_info (gdbarch)->bits_per_word == 32)
+  if (IS_ARM32 (gdbarch))
     {
       if (direction == 0)
 	aarch64_compat_siginfo_from_siginfo ((struct compat_siginfo *) inf,
diff --git a/gdb/aarch64-linux-tdep.c b/gdb/aarch64-linux-tdep.c
index b94ccb2..6d0c3bd 100644
--- a/gdb/aarch64-linux-tdep.c
+++ b/gdb/aarch64-linux-tdep.c
@@ -1000,8 +1000,12 @@  aarch64_linux_init_abi (struct gdbarch_info info, struct gdbarch *gdbarch)
 
   linux_init_abi (info, gdbarch);
 
-  set_solib_svr4_fetch_link_map_offsets (gdbarch,
-					 svr4_lp64_fetch_link_map_offsets);
+  if (tdep->ilp32)
+    set_solib_svr4_fetch_link_map_offsets (gdbarch,
+					   svr4_ilp32_fetch_link_map_offsets);
+  else
+    set_solib_svr4_fetch_link_map_offsets (gdbarch,
+					   svr4_lp64_fetch_link_map_offsets);
 
   /* Enable TLS support.  */
   set_gdbarch_fetch_tls_load_module_address (gdbarch,
diff --git a/gdb/aarch64-tdep.c b/gdb/aarch64-tdep.c
index 801c03d..67f1a2e 100644
--- a/gdb/aarch64-tdep.c
+++ b/gdb/aarch64-tdep.c
@@ -2091,6 +2091,22 @@  aarch64_gen_return_address (struct gdbarch *gdbarch,
 }
 
 
+/* Implement the "register_type" gdbarch method.
+   Adjust the register type of $PC and $SP on ILP32.  */
+
+static struct type *
+aarch64_ilp32_register_type (struct gdbarch *gdbarch, int regnum)
+{
+  struct gdbarch_tdep *tdep = gdbarch_tdep (gdbarch);
+
+  gdb_assert (tdep->ilp32);
+
+  if (regnum == AARCH64_SP_REGNUM || regnum == AARCH64_PC_REGNUM)
+    return builtin_type (gdbarch)->builtin_uint64;
+  else
+    return tdesc_register_type (gdbarch, regnum);
+}
+
 /* Return the pseudo register name corresponding to register regnum.  */
 
 static const char *
@@ -2851,6 +2867,10 @@  aarch64_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   const struct tdesc_feature *feature;
   int num_regs = 0;
   int num_pseudo_regs = 0;
+  bool ilp32 = false;
+
+  if (info.bfd_arch_info->mach == bfd_mach_aarch64_ilp32)
+    ilp32 = true;
 
   /* Ensure we always have a target descriptor.  */
   if (!tdesc_has_registers (tdesc))
@@ -2908,6 +2928,9 @@  aarch64_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
        best_arch != NULL;
        best_arch = gdbarch_list_lookup_by_info (best_arch->next, &info))
     {
+      /* ILP32 and LP64 are incompatible. */
+      if (gdbarch_tdep (arches->gdbarch)->ilp32 != ilp32)
+	continue;
       /* Found a match.  */
       break;
     }
@@ -2926,6 +2949,7 @@  aarch64_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   tdep->lowest_pc = 0x20;
   tdep->jb_pc = -1;		/* Longjump support not enabled by default.  */
   tdep->jb_elt_size = 8;
+  tdep->ilp32 = ilp32;
 
   set_gdbarch_push_dummy_call (gdbarch, aarch64_push_dummy_call);
   set_gdbarch_frame_align (gdbarch, aarch64_frame_align);
@@ -2968,9 +2992,9 @@  aarch64_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
   set_gdbarch_float_bit (gdbarch, 32);
   set_gdbarch_double_bit (gdbarch, 64);
   set_gdbarch_long_double_bit (gdbarch, 128);
-  set_gdbarch_long_bit (gdbarch, 64);
+  set_gdbarch_long_bit (gdbarch, ilp32 ? 32 : 64);
   set_gdbarch_long_long_bit (gdbarch, 64);
-  set_gdbarch_ptr_bit (gdbarch, 64);
+  set_gdbarch_ptr_bit (gdbarch, ilp32 ? 32 : 64);
   set_gdbarch_char_signed (gdbarch, 0);
   set_gdbarch_float_format (gdbarch, floatformats_ieee_single);
   set_gdbarch_double_format (gdbarch, floatformats_ieee_double);
@@ -3012,6 +3036,13 @@  aarch64_gdbarch_init (struct gdbarch_info info, struct gdbarch_list *arches)
 
   tdesc_use_registers (gdbarch, tdesc, tdesc_data);
 
+  if (ilp32)
+    {
+      /* Override tdesc_register_type to adjust the types of $PC and
+	 $SP in ILP32.  */
+      set_gdbarch_register_type (gdbarch, aarch64_ilp32_register_type);
+    }
+
   /* Add standard register aliases.  */
   for (i = 0; i < ARRAY_SIZE (aarch64_register_aliases); i++)
     user_reg_add (gdbarch, aarch64_register_aliases[i].name,
diff --git a/gdb/aarch64-tdep.h b/gdb/aarch64-tdep.h
index 85c6a97..87d36b6 100644
--- a/gdb/aarch64-tdep.h
+++ b/gdb/aarch64-tdep.h
@@ -97,6 +97,9 @@  struct gdbarch_tdep
 
   /* syscall record.  */
   int (*aarch64_syscall_record) (struct regcache *regcache, unsigned long svc_number);
+  /* If this is ILP32 or LP64.  */
+  bool ilp32;
+
 };
 
 extern struct target_desc *tdesc_aarch64;
diff --git a/gdb/gdbserver/linux-aarch64-low.c b/gdb/gdbserver/linux-aarch64-low.c
index 334310b..e31d5c5 100644
--- a/gdb/gdbserver/linux-aarch64-low.c
+++ b/gdb/gdbserver/linux-aarch64-low.c
@@ -484,7 +484,12 @@  aarch64_linux_read_description (void)
 
   is_elf64 = linux_pid_exe_is_elf_64_file (tid, &machine);
 
-  if (is_elf64)
+  /* There are problems with ptrace when gdbserver is 32 bits and the
+     program being debugged is 64 bits.  */
+  if (sizeof (void *) == 4 && is_elf64)
+    error (_("Can't debug 64-bit process with 32-bit GDBserver"));
+
+  if (machine == EM_AARCH64)
     return tdesc_aarch64;
   else
     return tdesc_arm_with_neon;