[2/4] Fall back to a default value of 0 for the MISA register.

Message ID 20180919231950.22634-3-jhb@FreeBSD.org
State New, archived
Headers

Commit Message

John Baldwin Sept. 19, 2018, 11:19 p.m. UTC
  The riscv architecture supports multiple architectures with differing
register sizes.  If the MISA register is present, the specific
architecture is chosen based on its value, otherwise properties are
inferred from flags in the ELF header.  However, the code to read MISA
throws an exception if the register is not present.  This does not
trip when cross-debugging a core dump, but does trigger when
attempting to debug native processes if the native target does not
provide MISA.

MISA is a machine-mode register in RISC-V which may or may not be
available to supervisor operating systems.  Rather than requiring
targets to always provide a fake MISA value, fallback to 0 for now.
(The Linux native target currently always provides a fake value of 0
for MISA.)

Longer term, the riscv architecture should perhaps add target
descriptions for the various sub-architectures and permit targets to
set a description.  It would then only use MISA as a fallback if an
explicit description is not provided.  This will permit the proper
register set to be used when debugging a RV32 process (where U-XLEN in
SSTATUS is set to 32 bits) on an RV64 host (where XLEN in MISA
indicates 64 bits) for example by using the U-XLEN field in SSTATUS to
set the target description (RV32 vs RV64) for individual processes.

gdb/ChangeLog:

	* riscv-tdep.c (riscv_read_misa_reg): Fall back to a default value
	of zero.
---
 gdb/ChangeLog    |  5 +++++
 gdb/riscv-tdep.c | 13 +++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)
  

Comments

Jim Wilson Sept. 20, 2018, 12:09 a.m. UTC | #1
On Wed, Sep 19, 2018 at 4:20 PM John Baldwin <jhb@freebsd.org> wrote:
> MISA is a machine-mode register in RISC-V which may or may not be
> available to supervisor operating systems.  Rather than requiring
> targets to always provide a fake MISA value, fallback to 0 for now.
> (The Linux native target currently always provides a fake value of 0
> for MISA.)

The linux target provides a fake value for misa because gdb is a side
project for me, and I haven't had time to add ptrace support for it
yet.  Too many binutils and gcc problems I need to fix first.

I think providing a default misa value in riscv-tdep.c is a mistake.
One major problem is with the use of compressed breakpoints.  Consider
the case where the target has compressed instruction support, the C
library was compiled to use compressed instructions, and the user
program was compiled without compressed instruction support.  If we
can't read misa, then we check the program ELF headers, see that there
is no compressed instruction support, and start using 4-byte
breakpoints.  Storing a 4 byte breakpoint into a library compiled with
compressed instructions means that we will be clobbering instructions
and things start breaking badly.  I ran into this earlier before
fixing some bugs in the breakpoint support in riscv-tdep.c.  Glibc
provides a stub function for gdb support that happens to be exactly 2
bytes long.  Storing a 4-byte breakpoint there clobbers the first
instruction of the next function, which happens to be an important
function for program startup, which means we get an illegal
instruction error before even reaching main which is very confusing.

If we can read misa, then we don't have this problem, as we will
always know if the hardware supports compressed breakpoints, and we
can avoid this problem.  If we can't read misa, then we will have to
check the ELF headers of every shared library used by a program,
including the interpreter ld.so, to see if any of them use compressed
instructions.  We will also have to check every library loaded at
run-time via dlload.  This seems like a hassle.  Another option might
be to try to execute a 2-byte breakpoint to see if it works or not,
and then use compressed breakpoints if they work, but this also seems
like a hassle.  Reading misa seems like the easy solution, all I need
is a linux kernel patch to add ptrace support for reading it.  If we
have access to misa, then we also know whether FP registers are
present, and their size, which is also important info that we can't
easily get any other way.  misa is roughly equivalent to x86 cpuid.  I
don't know of any reason why misa wouldn't be available to the OS.

I suppose another option here might be the auxvec/hwcap info.  Maybe
info about compressed instructions and FP support can be passed from
the kernel to the user program this way, and then maybe gdb can get
the info from there, if gdb has some way to read and parse
auxvec/hwcap info passed to the user program.

Jim
  
John Baldwin Sept. 20, 2018, 12:40 a.m. UTC | #2
On 9/19/18 5:09 PM, Jim Wilson wrote:
> On Wed, Sep 19, 2018 at 4:20 PM John Baldwin <jhb@freebsd.org> wrote:
>> MISA is a machine-mode register in RISC-V which may or may not be
>> available to supervisor operating systems.  Rather than requiring
>> targets to always provide a fake MISA value, fallback to 0 for now.
>> (The Linux native target currently always provides a fake value of 0
>> for MISA.)
> 
> The linux target provides a fake value for misa because gdb is a side
> project for me, and I haven't had time to add ptrace support for it
> yet.  Too many binutils and gcc problems I need to fix first.
> 
> I think providing a default misa value in riscv-tdep.c is a mistake.
> One major problem is with the use of compressed breakpoints.  Consider
> the case where the target has compressed instruction support, the C
> library was compiled to use compressed instructions, and the user
> program was compiled without compressed instruction support.  If we
> can't read misa, then we check the program ELF headers, see that there
> is no compressed instruction support, and start using 4-byte
> breakpoints.  Storing a 4 byte breakpoint into a library compiled with
> compressed instructions means that we will be clobbering instructions
> and things start breaking badly.  I ran into this earlier before
> fixing some bugs in the breakpoint support in riscv-tdep.c.  Glibc
> provides a stub function for gdb support that happens to be exactly 2
> bytes long.  Storing a 4-byte breakpoint there clobbers the first
> instruction of the next function, which happens to be an important
> function for program startup, which means we get an illegal
> instruction error before even reaching main which is very confusing.
> 
> If we can read misa, then we don't have this problem, as we will
> always know if the hardware supports compressed breakpoints, and we
> can avoid this problem.  If we can't read misa, then we will have to
> check the ELF headers of every shared library used by a program,
> including the interpreter ld.so, to see if any of them use compressed
> instructions.  We will also have to check every library loaded at
> run-time via dlload.  This seems like a hassle.  Another option might
> be to try to execute a 2-byte breakpoint to see if it works or not,
> and then use compressed breakpoints if they work, but this also seems
> like a hassle.  Reading misa seems like the easy solution, all I need
> is a linux kernel patch to add ptrace support for reading it.  If we
> have access to misa, then we also know whether FP registers are
> present, and their size, which is also important info that we can't
> easily get any other way.  misa is roughly equivalent to x86 cpuid.  I
> don't know of any reason why misa wouldn't be available to the OS.
> 
> I suppose another option here might be the auxvec/hwcap info.  Maybe
> info about compressed instructions and FP support can be passed from
> the kernel to the user program this way, and then maybe gdb can get
> the info from there, if gdb has some way to read and parse
> auxvec/hwcap info passed to the user program.

GDB certainly can examine AT_ values (it already has to read AT_BASE
to find the runtime linker and from there the list of shared libraries,
etc.).  For some things like F vs D vs Q we probably want to have different
target descriptions (this is how x86 manages x86 vs SSE vs AVX) and have
the native target's ::read_description method return a suitable description.

Whether or not 'C' is present is a bit more subtle as it isn't something
you could infer just from a register being present or not.  On the other
hand, for breakpoints, I wonder if a more straightforward approach would
be to examine the opcode of the existing instruction and use that to
decide which breakpoint to use?  That is, read the first byte of the
existing instruction and if the low 2 bits indicate a 16-bit instruction
use C.BREAK and otherwise use BREAK?
  

Patch

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index e1892f531a..9b8a5175b0 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,8 @@ 
+2018-09-19  John Baldwin  <jhb@FreeBSD.org>
+
+	* riscv-tdep.c (riscv_read_misa_reg): Fall back to a default value
+	of zero.
+
 2018-09-19  John Baldwin  <jhb@FreeBSD.org>
 
 	* trad-frame.c (trad_frame_set_regmap, trad_frame_set_reg_regmap):
diff --git a/gdb/riscv-tdep.c b/gdb/riscv-tdep.c
index 254914c9c7..4b385ed5da 100644
--- a/gdb/riscv-tdep.c
+++ b/gdb/riscv-tdep.c
@@ -317,9 +317,18 @@  riscv_read_misa_reg (bool *read_p)
 	}
       CATCH (ex, RETURN_MASK_ERROR)
 	{
-	  /* Old cores might have MISA located at a different offset.  */
-	  value = get_frame_register_unsigned (frame,
+	  TRY
+	    {
+	      /* Old cores might have MISA located at a different offset.  */
+	      value
+		= get_frame_register_unsigned (frame,
 					       RISCV_CSR_LEGACY_MISA_REGNUM);
+	    }
+	  CATCH (ex, RETURN_MASK_ERROR)
+	    {
+	      value = 0;
+	    }
+	  END_CATCH
 	}
       END_CATCH
     }