[3/4] Explicit fixed-width integral File I/O protocol types

Message ID 20180428011940.115515-4-julio@farjump.io
State New, archived
Headers

Commit Message

Julio Guerra April 28, 2018, 1:19 a.m. UTC
  The File I/O extension defines portable types of there host-specific
counterparts, such as `struct stat` and `struct timeval`. This patch improves
these type definitions in `include/gdb/fileio.h` to make possible sharing them
with target programs, and avoid redefining them by being able to include this
header, even with cross-compiled programs.

The patch thus removes several drawbacks:
- avoid implicit pointers when defining fixed-width integers as array typedefs.
- explicitly state the sizes of fixed-width integers (e.g. fio_ulong_t becomes
  fio_uint64_t).
It also renames a few misnamed conversion functions with the convention
`host_to_fileio_*` used everywhere else.

Note that fixed-width integer types are defined using GCC's preprocessor builtin
macros to avoid using the libc's stdint.h which might not be available on the
target compiler. Therefore, `include/gdb/fileio.h` is standalone.

Tested with https://github.com/farjump/raspberry-pi/blob/ea31c48d7c7eed27d39fb1bec2d3a1d308cb8ae7/sdk/libalpha/include/alpha/fileio.h#L13

2018-04-28  Julio Guerra  <julio@farjump.io>

	* include/gdb/fileio.h: explicit fixed-width integral File I/O protocol types.
	* gdb/common/fileio.c: explicit fixed-width integral File I/O protocol types.
	* gdb/common/fileio.h: explicit fixed-width integral File I/O protocol types.
	* gdb/remote-fileio.c: explicit fixed-width integral File I/O protocol types.

Signed-off-by: Julio Guerra <julio@farjump.io>
---
 gdb/ChangeLog        |  7 ++++++
 gdb/common/fileio.c  | 39 +++++++++++++++--------------
 gdb/common/fileio.h  |  8 +++---
 gdb/remote-fileio.c  | 26 ++++++++++----------
 include/gdb/fileio.h | 58 +++++++++++++++-----------------------------
 5 files changed, 63 insertions(+), 75 deletions(-)
  

Comments

Pedro Alves May 2, 2018, 10:45 a.m. UTC | #1
On 04/28/2018 02:19 AM, Julio Guerra wrote:
> The File I/O extension defines portable types of there host-specific
> counterparts, such as `struct stat` and `struct timeval`. This patch improves
> these type definitions in `include/gdb/fileio.h` to make possible sharing them
> with target programs, and avoid redefining them by being able to include this
> header, even with cross-compiled programs.
> 
> The patch thus removes several drawbacks:
> - avoid implicit pointers when defining fixed-width integers as array typedefs.
> - explicitly state the sizes of fixed-width integers (e.g. fio_ulong_t becomes
>   fio_uint64_t).
> It also renames a few misnamed conversion functions with the convention
> `host_to_fileio_*` used everywhere else.
> 
> Note that fixed-width integer types are defined using GCC's preprocessor builtin
> macros to avoid using the libc's stdint.h which might not be available on the
> target compiler. Therefore, `include/gdb/fileio.h` is standalone.

Sorry, but NAK.

I don't see how using GCC preprocessor builtins would be more
portable than stdint.h.  That doesn't make the file
standalone -- it makes it dependent on gcc instead.

But regardless, the major problem here is that this approach
does not work, because it ignores the issue of alignment and
padding holes.
With the current approach, all structure fields are aligned
to 1 byte.  With the explicit types approach, fields are
aligned to their natural alignment.  E.g., from a glance,
it seems like we'd be creating a 4-byte padding hole after
fst_rdev.  Even if by chance the fields end up aligned,
that's not something we should be relying on.  Both host
and client must agree on the layout of the data structures.
fio_stat objects are copied directly into target memory,
see tail end of remote-fileio.c:remote_fileio_func_stat,
for example:

  if (statptr)
    {
      host_to_fileio_stat (&st, &fst);
      host_to_fileio_uint (0, fst.fst_dev);

      errno = target_write_memory (statptr, (gdb_byte *) &fst, sizeof fst);
      if (errno != 0)

As for the "int"/"long"/etc naming vs explicitly-sized uint32_t etc.,
note that the File I/O protocol's "int" "long" etc. types are defined here:

 https://sourceware.org/gdb/current/onlinedocs/gdb/Integral-Datatypes.html#Integral-Datatypes

I don't have a strong opinion, but I'm not sure it's worth it to
deviate from the documentation.

Thanks,
Pedro Alves
  
Julio Guerra May 2, 2018, 3:18 p.m. UTC | #2
Hello,

Since I am talking a lot of embedded software, I CC'd Joel Brobecker
from AdaCore to maybe give the final direction for a v2.

On 05/02/2018 12:45, Pedro Alves wrote :
> I don't see how using GCC preprocessor builtins would be more

> portable than stdint.h.  That doesn't make the file

> standalone -- it makes it dependent on gcc instead.

I was indeed writing a GCC-based solution. Isn't GDB only compiled with GCC?

When compiling embedded programs doing File IOs, I have always been in
bare-metal cases, thus using CFLAGS "-ffreestanding -nostdinc
-nostdlib". So including "stdint.h" would fail because of -nostdinc. And
since my end goal here is to make fileio.h the single source of truth of
File IO protocol types, to then be able to include it in other C/C++
programs, I need it be compiled with these flags, i.e. without the
standard library. And in my opinion, using the File IO protocol in
bare-metal programs is where File IOs shine the most :)

So, if other compilers than GCC are required, I can rewrite it using
standard unsigned int, etc.
> But regardless, the major problem here is that this approach

> does not work, because it ignores the issue of alignment and

> padding holes.

> With the current approach, all structure fields are aligned

> to 1 byte.  With the explicit types approach, fields are

> aligned to their natural alignment.  E.g., from a glance,

> it seems like we'd be creating a 4-byte padding hole after

> fst_rdev.  Even if by chance the fields end up aligned,

> that's not something we should be relying on.  Both host

> and client must agree on the layout of the data structures.

> fio_stat objects are copied directly into target memory,

> see tail end of remote-fileio.c:remote_fileio_func_stat,

> for example:

>

>   if (statptr)

>     {

>       host_to_fileio_stat (&st, &fst);

>       host_to_fileio_uint (0, fst.fst_dev);

>

>       errno = target_write_memory (statptr, (gdb_byte *) &fst, sizeof fst);

>       if (errno != 0)


Yes, and this is why I rather prefer this fileio.h file to become the
single source of truth instead of redefining them and ending with
mistakes such as memory overflows. For example, note the "sizeof fst",
where we are assuming that the target program has correctly redefined
it. Sharing fileio.h with the target will most likely limit the problem.

I indeed missed the alignment. Can't we simply pack the structure
"__attribute__ ((packed))"? I find these implicit "address-of" used
everywhere in GDB's code very opaque, and I am not sure how it would end
up when included in other C/C++ programs. It is anyway a source of
errors since no one expects integral types to be defined as an array, at
least not named like that (for example, I had problems because I wrote
fio_uint* to pass a File IO integer pointer, which ends up as a
"char(*)[32]"...).
> As for the "int"/"long"/etc naming vs explicitly-sized uint32_t etc.,

> note that the File I/O protocol's "int" "long" etc. types are defined here:

>

>  https://sourceware.org/gdb/current/onlinedocs/gdb/Integral-Datatypes.html#Integral-Datatypes

>

> I don't have a strong opinion, but I'm not sure it's worth it to

> deviate from the documentation.

Sorry, I missed this documentation entry. I'm totally fine in keeping
current type names since if fielio.h ends up installed, we would still
be able to avoid redefiniton mistakes and reuse them.

-- 
Julio Guerra
Co-founder & CTO of Farjump
Mobile: +33 618 644 164
LinkedIn: https://linkedin.com/in/guerrajulio
Slack: farjump.slack.com
  
Joel Brobecker May 2, 2018, 4:14 p.m. UTC | #3
> Since I am talking a lot of embedded software, I CC'd Joel Brobecker
> from AdaCore to maybe give the final direction for a v2.

I'm not really sure what is expected of me, can you clarify? If you
are thinking of my experience with baremetal targets, that's no problem.
But, although I do work with C code on baremetal targets, it tends
to be in situations where we have a C library (based on newlib, for
the most part).

Are we talking about the remote File-IO extension? I should say up front
that my knowledge of this extension was nearly null until I read the
documentation in the manual. I think Pedro is the expert on that.

If I understand correctly, you are arguing that we should be providing
fileio.h so that programs being debugged can use it? I don't understand
how this would integrate with GDB, though. If I understand correctly,
this file is used for the definition of some structures used within
the remote protocol, so the information exchanged between GDB and
GDBserver. How would you exploit that? (what this tells me is that
your patch series deserves a documentation update)
  
Julio Guerra May 2, 2018, 4:31 p.m. UTC | #4
Le 02/05/2018 à 18:14, Joel Brobecker a écrit :
>> Since I am talking a lot of embedded software, I CC'd Joel Brobecker

>> from AdaCore to maybe give the final direction for a v2.

> I'm not really sure what is expected of me, can you clarify? If you

> are thinking of my experience with baremetal targets, that's no problem.

> But, although I do work with C code on baremetal targets, it tends

> to be in situations where we have a C library (based on newlib, for

> the most part).

Yes, this is why I thought of you.
> Are we talking about the remote File-IO extension? I should say up front

> that my knowledge of this extension was nearly null until I read the

> documentation in the manual. I think Pedro is the expert on that.

Yes, we are talking about the File IO extension.
> If I understand correctly, you are arguing that we should be providing

> fileio.h so that programs being debugged can use it? I don't understand

> how this would integrate with GDB, though. If I understand correctly,

> this file is used for the definition of some structures used within

> the remote protocol, so the information exchanged between GDB and

> GDBserver. How would you exploit that? (what this tells me is that

> your patch series deserves a documentation update)

No, these structures in fileio.h are passed from GDB to the target
program being debugged through its GDBserver.

When encapsulated in the newlib, it requires to stub the available
syscalls using this extension (that's what we do) . The situation is the
same at the stub level (redefining File IO structures, changing the
endianness if necessary, and copying them into standard structures (for
struct stat and struct timeval).

Ok, so let's wait Pedro's answer ;)

Thank you,

-- 
Julio Guerra
  

Patch

diff --git a/gdb/ChangeLog b/gdb/ChangeLog
index 6986798f08..dc7f7cd8ac 100644
--- a/gdb/ChangeLog
+++ b/gdb/ChangeLog
@@ -1,3 +1,10 @@ 
+2018-04-28  Julio Guerra  <julio@farjump.io>
+
+	* include/gdb/fileio.h: explicit fixed-width integral File I/O protocol types.
+	* gdb/common/fileio.c: explicit fixed-width integral File I/O protocol types.
+	* gdb/common/fileio.h: explicit fixed-width integral File I/O protocol types.
+	* gdb/remote-fileio.c: explicit fixed-width integral File I/O protocol types.
+
 2018-04-28  Julio Guerra  <julio@farjump.io>
 
 	* gdb/remote-fileio.c: do not clear the value of st_dev in File I/O's stat().
diff --git a/gdb/common/fileio.c b/gdb/common/fileio.c
index 912a7ede3c..71b2745434 100644
--- a/gdb/common/fileio.c
+++ b/gdb/common/fileio.c
@@ -205,17 +205,17 @@  fileio_mode_pack (mode_t mode)
 /* Pack a host-format mode_t into an fio_mode_t.  */
 
 static void
-host_to_fileio_mode (mode_t num, fio_mode_t fnum)
+host_to_fileio_mode (mode_t num, fio_mode_t *fnum)
 {
-  host_to_bigendian (fileio_mode_pack (num), (char *) fnum, 4);
+  host_to_bigendian (fileio_mode_pack (num), (char *) fnum, sizeof (fio_mode_t));
 }
 
 /* Pack a host-format integer into an fio_ulong_t.  */
 
 static void
-host_to_fileio_ulong (LONGEST num, fio_ulong_t fnum)
+host_to_fileio_uint64 (LONGEST num, fio_uint64_t* fnum)
 {
-  host_to_bigendian (num, (char *) fnum, 8);
+  host_to_bigendian (num, (char *) fnum, sizeof (fio_uint64_t));
 }
 
 /* See fileio.h.  */
@@ -225,31 +225,30 @@  host_to_fileio_stat (struct stat *st, struct fio_stat *fst)
 {
   LONGEST blksize;
 
-  host_to_fileio_uint ((long) st->st_dev, fst->fst_dev);
-  host_to_fileio_uint ((long) st->st_ino, fst->fst_ino);
-  host_to_fileio_mode (st->st_mode, fst->fst_mode);
-  host_to_fileio_uint ((long) st->st_nlink, fst->fst_nlink);
-  host_to_fileio_uint ((long) st->st_uid, fst->fst_uid);
-  host_to_fileio_uint ((long) st->st_gid, fst->fst_gid);
-  host_to_fileio_uint ((long) st->st_rdev, fst->fst_rdev);
-  host_to_fileio_ulong ((LONGEST) st->st_size, fst->fst_size);
+  host_to_fileio_uint32 ((long) st->st_dev, &fst->fst_dev);
+  host_to_fileio_uint32 ((long) st->st_ino, &fst->fst_ino);
+  host_to_fileio_mode (st->st_mode, &fst->fst_mode);
+  host_to_fileio_uint32 ((long) st->st_nlink, &fst->fst_nlink);
+  host_to_fileio_uint32 ((long) st->st_uid, &fst->fst_uid);
+  host_to_fileio_uint32 ((long) st->st_gid, &fst->fst_gid);
+  host_to_fileio_uint32 ((long) st->st_rdev, &fst->fst_rdev);
+  host_to_fileio_uint64 ((LONGEST) st->st_size, &fst->fst_size);
 #ifdef HAVE_STRUCT_STAT_ST_BLKSIZE
   blksize = st->st_blksize;
 #else
   blksize = 512;
 #endif
-  host_to_fileio_ulong (blksize, fst->fst_blksize);
+  host_to_fileio_uint64 ((LONGEST) blksize, &fst->fst_blksize);
 #if HAVE_STRUCT_STAT_ST_BLOCKS
-  host_to_fileio_ulong ((LONGEST) st->st_blocks, fst->fst_blocks);
+  host_to_fileio_uint64 ((LONGEST) st->st_blocks, &fst->fst_blocks);
 #else
   /* FIXME: This is correct for DJGPP, but other systems that don't
      have st_blocks, if any, might prefer 512 instead of st_blksize.
      (eliz, 30-12-2003)  */
-  host_to_fileio_ulong (((LONGEST) st->st_size + blksize - 1)
-			/ blksize,
-			fst->fst_blocks);
+  host_to_fileio_uint64 (((LONGEST) st->st_size + blksize - 1) / blksize,
+			 &fst->fst_blocks);
 #endif
-  host_to_fileio_time (st->st_atime, fst->fst_atime);
-  host_to_fileio_time (st->st_mtime, fst->fst_mtime);
-  host_to_fileio_time (st->st_ctime, fst->fst_ctime);
+  host_to_fileio_time (st->st_atime, &fst->fst_atime);
+  host_to_fileio_time (st->st_mtime, &fst->fst_mtime);
+  host_to_fileio_time (st->st_ctime, &fst->fst_ctime);
 }
diff --git a/gdb/common/fileio.h b/gdb/common/fileio.h
index 92d26742a7..85155ccf9a 100644
--- a/gdb/common/fileio.h
+++ b/gdb/common/fileio.h
@@ -53,17 +53,17 @@  host_to_bigendian (LONGEST num, char *buf, int bytes)
 /* Pack a host-format integer into an fio_uint_t.  */
 
 static inline void
-host_to_fileio_uint (long num, fio_uint_t fnum)
+host_to_fileio_uint32 (long num, fio_uint32_t *fnum)
 {
-  host_to_bigendian ((LONGEST) num, (char *) fnum, 4);
+  host_to_bigendian ((LONGEST) num, (char *) fnum, sizeof (fio_uint32_t));
 }
 
 /* Pack a host-format time_t into an fio_time_t.  */
 
 static inline void
-host_to_fileio_time (time_t num, fio_time_t fnum)
+host_to_fileio_time (time_t num, fio_time_t *fnum)
 {
-  host_to_bigendian ((LONGEST) num, (char *) fnum, 4);
+  host_to_bigendian ((LONGEST) num, (char *) fnum, sizeof (fio_time_t));
 }
 
 /* Pack a host-format struct stat into a struct fio_stat.  */
diff --git a/gdb/remote-fileio.c b/gdb/remote-fileio.c
index e855c682a0..2237787f72 100644
--- a/gdb/remote-fileio.c
+++ b/gdb/remote-fileio.c
@@ -286,16 +286,16 @@  remote_fileio_extract_ptr_w_len (char **buf, CORE_ADDR *ptrval, int *length)
 }
 
 static void
-remote_fileio_to_fio_long (LONGEST num, fio_long_t fnum)
+host_to_fileio_int64 (LONGEST num, fio_int64_t *fnum)
 {
-  host_to_bigendian (num, (char *) fnum, 8);
+  host_to_bigendian (num, (char *) fnum, sizeof (fio_int64_t));
 }
 
 static void
-remote_fileio_to_fio_timeval (struct timeval *tv, struct fio_timeval *ftv)
+host_to_fileio_timeval (struct timeval *tv, struct fio_timeval *ftv)
 {
-  host_to_fileio_time (tv->tv_sec, ftv->ftv_sec);
-  remote_fileio_to_fio_long (tv->tv_usec, ftv->ftv_usec);
+  host_to_fileio_time (tv->tv_sec, &ftv->ftv_sec);
+  host_to_fileio_int64 (tv->tv_usec, &ftv->ftv_usec);
 }
 
 /* The quit handler originally installed.  */
@@ -914,7 +914,7 @@  remote_fileio_func_fstat (char *buf)
 
   if (fd == FIO_FD_CONSOLE_IN || fd == FIO_FD_CONSOLE_OUT)
     {
-      host_to_fileio_uint (1, fst.fst_dev);
+      host_to_fileio_uint32 (1, &fst.fst_dev);
       memset (&st, 0, sizeof (st));
       st.st_mode = S_IFCHR | (fd == FIO_FD_CONSOLE_IN ? S_IRUSR : S_IWUSR);
       st.st_nlink = 1;
@@ -997,7 +997,7 @@  remote_fileio_func_gettimeofday (char *buf)
 
   if (ptrval)
     {
-      remote_fileio_to_fio_timeval (&tv, &ftv);
+      host_to_fileio_timeval (&tv, &ftv);
 
       errno = target_write_memory (ptrval, (gdb_byte *) &ftv, sizeof ftv);
       if (errno != 0)
@@ -1178,21 +1178,21 @@  remote_fileio_request (char *buf, int ctrlc_pending_p)
 }
 
 
-/* Unpack an fio_uint_t.  */
+/* Unpack an fio_uint32_t.  */
 
 static unsigned int
-remote_fileio_to_host_uint (fio_uint_t fnum)
+remote_fileio_to_host_uint (fio_uint32_t fnum)
 {
-  return extract_unsigned_integer ((gdb_byte *) fnum, 4,
+  return extract_unsigned_integer ((gdb_byte *) &fnum, 4,
 				   BFD_ENDIAN_BIG);
 }
 
-/* Unpack an fio_ulong_t.  */
+/* Unpack an fio_uint64_t.  */
 
 static ULONGEST
-remote_fileio_to_host_ulong (fio_ulong_t fnum)
+remote_fileio_to_host_ulong (fio_uint64_t fnum)
 {
-  return extract_unsigned_integer ((gdb_byte *) fnum, 8,
+  return extract_unsigned_integer ((gdb_byte *) &fnum, 8,
 				   BFD_ENDIAN_BIG);
 }
 
diff --git a/include/gdb/fileio.h b/include/gdb/fileio.h
index 7bb55f579f..8b37df4dfe 100644
--- a/include/gdb/fileio.h
+++ b/include/gdb/fileio.h
@@ -95,50 +95,32 @@ 
 #define FILEIO_ULONG_MAX   18446744073709551615ULL
 
 /* Integral types as used in protocol. */
-#if 0
-typedef __int32_t fio_int_t;
-typedef __uint32_t fio_uint_t, fio_mode_t, fio_time_t;
-typedef __int64_t fio_long_t;
-typedef __uint64_t fio_ulong_t;
-#endif
-
-#define FIO_INT_LEN   4
-#define FIO_UINT_LEN  4
-#define FIO_MODE_LEN  4
-#define FIO_TIME_LEN  4
-#define FIO_LONG_LEN  8
-#define FIO_ULONG_LEN 8
-
-typedef char fio_int_t[FIO_INT_LEN];   
-typedef char fio_uint_t[FIO_UINT_LEN];
-typedef char fio_mode_t[FIO_MODE_LEN];
-typedef char fio_time_t[FIO_TIME_LEN];
-typedef char fio_long_t[FIO_LONG_LEN];
-typedef char fio_ulong_t[FIO_ULONG_LEN];
-
-/* Struct stat as used in protocol.  For complete independence
-   of host/target systems, it's defined as an array with offsets
-   to the members. */
+typedef __INT32_TYPE__  fio_int32_t;
+typedef __UINT32_TYPE__ fio_uint32_t, fio_mode_t, fio_time_t;
+typedef __INT64_TYPE__  fio_int64_t;
+typedef __UINT64_TYPE__ fio_uint64_t;
 
+/* Struct stat as used in protocol. */
 struct fio_stat {
-  fio_uint_t  fst_dev;
-  fio_uint_t  fst_ino;
-  fio_mode_t  fst_mode;
-  fio_uint_t  fst_nlink;
-  fio_uint_t  fst_uid;
-  fio_uint_t  fst_gid;
-  fio_uint_t  fst_rdev;
-  fio_ulong_t fst_size;
-  fio_ulong_t fst_blksize;
-  fio_ulong_t fst_blocks;
-  fio_time_t  fst_atime;
-  fio_time_t  fst_mtime;
-  fio_time_t  fst_ctime;
+  fio_uint32_t fst_dev;
+  fio_uint32_t fst_ino;
+  fio_mode_t   fst_mode;
+  fio_uint32_t fst_nlink;
+  fio_uint32_t fst_uid;
+  fio_uint32_t fst_gid;
+  fio_uint32_t fst_rdev;
+  fio_uint64_t fst_size;
+  fio_uint64_t fst_blksize;
+  fio_uint64_t fst_blocks;
+  fio_time_t   fst_atime;
+  fio_time_t   fst_mtime;
+  fio_time_t   fst_ctime;
 };
 
+/* Struct timeval as used in protocol. */
 struct fio_timeval {
   fio_time_t  ftv_sec;
-  fio_long_t  ftv_usec;
+  fio_int64_t ftv_usec;
 };
 
 #endif /* GDB_FILEIO_H_ */