[RFC,v4,08/24] sysdeps/stat: Copy the statx struct to stat instead of stat64

Message ID b0eda5aa19f0dec0916c2f187f9382183b198e85.1565398513.git.alistair.francis@wdc.com
State New, archived
Headers

Commit Message

Alistair Francis Aug. 10, 2019, 1 a.m. UTC
  When copying the statx struct to the stat stuct use the original stat
struct instead of the stat64 struct. As the padding in the original is
type 'unsigned short int' but the padding in the stat64 is 'unsigned int'
the copy  can result in misallgined data. This would then incorrectly
trigger the stat_overflow() failure.

This would be very obvious when using a 64-bit ino_t type on a 32-bit
system, such as the RV32 port.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
---
 .../sysv/linux/generic/wordsize-32/fxstat.c   |  2 +-
 .../sysv/linux/generic/wordsize-32/fxstatat.c |  2 +-
 .../sysv/linux/generic/wordsize-32/lxstat.c   |  2 +-
 .../sysv/linux/generic/wordsize-32/xstat.c    |  2 +-
 sysdeps/unix/sysv/linux/statx_cp.c            | 24 +++++++++++++++++++
 sysdeps/unix/sysv/linux/statx_cp.h            |  3 +++
 6 files changed, 31 insertions(+), 4 deletions(-)
  

Comments

Joseph Myers Aug. 12, 2019, 8:01 p.m. UTC | #1
On Fri, 9 Aug 2019, Alistair Francis wrote:

> When copying the statx struct to the stat stuct use the original stat
> struct instead of the stat64 struct. As the padding in the original is
> type 'unsigned short int' but the padding in the stat64 is 'unsigned int'
> the copy  can result in misallgined data. This would then incorrectly
> trigger the stat_overflow() failure.

This indicates there's something else wrong with the port.  By design, the 
following apply for linux/generic/wordsize-32 ports of glibc:

* The layout of struct stat and struct stat64 is identical, except that 
some bytes that are padding in struct stat serve as high parts of fields 
that are wider in struct stat64 (and thus have endian-dependent positions 
as determined by the __field64 macro in bits/stat.h).

* Conversions from statx have to go to stat64, including setting those 
high parts as appropriate, so that the subsequent overflow checks (which 
work by examining those padding fields) can correctly detect whether 
overflow occurred and set errno to EOVERFLOW accordingly.  See my C-SKY 
port reviews that resulted in the code we have now 
<https://sourceware.org/ml/libc-alpha/2018-11/msg00624.html> 
<https://sourceware.org/ml/libc-alpha/2018-11/msg00668.html>.

> This would be very obvious when using a 64-bit ino_t type on a 32-bit
> system, such as the RV32 port.

If those types are 64-bit, you should not have padding around them in 
struct stat, so as to preserve the property that struct stat and struct 
stat64 have the same layout.  I suppose this means bits/stat.h needs to 
check further macros such as __OFF_T_MATCHES_OFF64_T.

You'll also need to ensure that XSTAT_IS_XSTAT64 is defined to 1.  And 
you'll need to make wordsize-32/overflow.h define trivial versions of the 
*_overflow functions in cases where the types match (this should be done 
in that file, rather than making an RV32-specific copy, to benefit future 
ports that make the same choices as RV32).
  
Alistair Francis Aug. 12, 2019, 10:22 p.m. UTC | #2
On Mon, Aug 12, 2019 at 1:01 PM Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Fri, 9 Aug 2019, Alistair Francis wrote:
>
> > When copying the statx struct to the stat stuct use the original stat
> > struct instead of the stat64 struct. As the padding in the original is
> > type 'unsigned short int' but the padding in the stat64 is 'unsigned int'
> > the copy  can result in misallgined data. This would then incorrectly
> > trigger the stat_overflow() failure.
>
> This indicates there's something else wrong with the port.  By design, the
> following apply for linux/generic/wordsize-32 ports of glibc:
>
> * The layout of struct stat and struct stat64 is identical, except that
> some bytes that are padding in struct stat serve as high parts of fields
> that are wider in struct stat64 (and thus have endian-dependent positions
> as determined by the __field64 macro in bits/stat.h).

__ino_t is a 64-bit value in RV32, so in both stat and stat64 it's the
same thing. The __pad1 changes in size between the two structs as one
is a short and one is just an int. I don't see how they are identical.

>
> * Conversions from statx have to go to stat64, including setting those
> high parts as appropriate, so that the subsequent overflow checks (which
> work by examining those padding fields) can correctly detect whether
> overflow occurred and set errno to EOVERFLOW accordingly.  See my C-SKY
> port reviews that resulted in the code we have now
> <https://sourceware.org/ml/libc-alpha/2018-11/msg00624.html>
> <https://sourceware.org/ml/libc-alpha/2018-11/msg00668.html>.

Ok, this makes sense.

>
> > This would be very obvious when using a 64-bit ino_t type on a 32-bit
> > system, such as the RV32 port.
>
> If those types are 64-bit, you should not have padding around them in
> struct stat, so as to preserve the property that struct stat and struct
> stat64 have the same layout.  I suppose this means bits/stat.h needs to
> check further macros such as __OFF_T_MATCHES_OFF64_T.

Changing the padding would fix the problem. Just to be clear is that
what you are suggesting?

>
> You'll also need to ensure that XSTAT_IS_XSTAT64 is defined to 1.  And
> you'll need to make wordsize-32/overflow.h define trivial versions of the
> *_overflow functions in cases where the types match (this should be done
> in that file, rather than making an RV32-specific copy, to benefit future
> ports that make the same choices as RV32).

I think I tested that and it didn't fix the problem so I dropped it.
I'll add the XSTAT_IS_XSTAT64 define back.

I'm not sure what you mean by then the types match?

Alistair

>
> --
> Joseph S. Myers
> joseph@codesourcery.com
  
Joseph Myers Aug. 12, 2019, 11:02 p.m. UTC | #3
On Mon, 12 Aug 2019, Alistair Francis wrote:

> > * The layout of struct stat and struct stat64 is identical, except that
> > some bytes that are padding in struct stat serve as high parts of fields
> > that are wider in struct stat64 (and thus have endian-dependent positions
> > as determined by the __field64 macro in bits/stat.h).
> 
> __ino_t is a 64-bit value in RV32, so in both stat and stat64 it's the
> same thing. The __pad1 changes in size between the two structs as one
> is a short and one is just an int. I don't see how they are identical.

What file are you looking at, in what version of the source tree?

I'm looking at sysdeps/unix/sysv/linux/generic/bits/stat.h.  As far as I 
can tell, your patch series doesn't change that file.  That's the version 
used by RV64 and it should be used by RV32 (the headers installed for RV32 
and RV64 should be identical apart from the special cases of 
gnu/lib-names-*.h and gnu/stubs-*.h; if it isn't, that needs to be fixed).

In that file, __pad1 is of type __dev_t in both struct stat and struct 
stat64.

> > If those types are 64-bit, you should not have padding around them in
> > struct stat, so as to preserve the property that struct stat and struct
> > stat64 have the same layout.  I suppose this means bits/stat.h needs to
> > check further macros such as __OFF_T_MATCHES_OFF64_T.
> 
> Changing the padding would fix the problem. Just to be clear is that
> what you are suggesting?

Yes.

Either change the definition of __field64, or the uses of it.

Changing the definition runs into the complication that it's used three 
times and each has its own macro to say whether the two types in question 
match, but you could always have a #error if some match and others don't.

> > You'll also need to ensure that XSTAT_IS_XSTAT64 is defined to 1.  And
> > you'll need to make wordsize-32/overflow.h define trivial versions of the
> > *_overflow functions in cases where the types match (this should be done
> > in that file, rather than making an RV32-specific copy, to benefit future
> > ports that make the same choices as RV32).
> 
> I think I tested that and it didn't fix the problem so I dropped it.
> I'll add the XSTAT_IS_XSTAT64 define back.
> 
> I'm not sure what you mean by then the types match?

off_t and off64_t match if they have the same size and alignment (if 
__OFF_T_MATCHES_OFF64_T is defined).  Likewise for the other pairs.

If off_t and off64_t match, there is no need for the "buf->__st_size_pad 
== 0" check in stat_overflow (and once you've removed the padding in that 
case, that check won't even compile because __st_size_pad won't exist).  
Likewise for the other pairs.  In practice, you can probably just make 
that function return 0 if all three macros are defined, rather than 
allowing for arbitrary subsets of the types matching or not matching.

Similar considerations apply to statfs_overflow if the types and padding 
used in struct statfs match those used in struct statfs64.  So apply 
similar considerations to the two versions of that structure.

You probably don't need to do anything special with lseek_overflow because 
the compiler should just optimize that away when off_t and loff_t are the 
same type.
  
Alistair Francis Aug. 12, 2019, 11:06 p.m. UTC | #4
On Mon, Aug 12, 2019 at 4:02 PM Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Mon, 12 Aug 2019, Alistair Francis wrote:
>
> > > * The layout of struct stat and struct stat64 is identical, except that
> > > some bytes that are padding in struct stat serve as high parts of fields
> > > that are wider in struct stat64 (and thus have endian-dependent positions
> > > as determined by the __field64 macro in bits/stat.h).
> >
> > __ino_t is a 64-bit value in RV32, so in both stat and stat64 it's the
> > same thing. The __pad1 changes in size between the two structs as one
> > is a short and one is just an int. I don't see how they are identical.
>
> What file are you looking at, in what version of the source tree?

I'm looking at sysdeps/unix/sysv/linux/bits/stat.h, it has a different
definition then the one you are looking at.

>
> I'm looking at sysdeps/unix/sysv/linux/generic/bits/stat.h.  As far as I
> can tell, your patch series doesn't change that file.  That's the version
> used by RV64 and it should be used by RV32 (the headers installed for RV32
> and RV64 should be identical apart from the special cases of
> gnu/lib-names-*.h and gnu/stubs-*.h; if it isn't, that needs to be fixed).

It looks like __field64(__ino_t, __ino64_t, st_ino) will use type int
for RV32 (I don't think __USE_FILE_OFFSET64 is defined) which might
need some changes.

>
> In that file, __pad1 is of type __dev_t in both struct stat and struct
> stat64.

In the file you are looking at that is correct.

>
> > > If those types are 64-bit, you should not have padding around them in
> > > struct stat, so as to preserve the property that struct stat and struct
> > > stat64 have the same layout.  I suppose this means bits/stat.h needs to
> > > check further macros such as __OFF_T_MATCHES_OFF64_T.
> >
> > Changing the padding would fix the problem. Just to be clear is that
> > what you are suggesting?
>
> Yes.
>
> Either change the definition of __field64, or the uses of it.

Yep, in the file you are looking at that seems the best way to go.

>
> Changing the definition runs into the complication that it's used three
> times and each has its own macro to say whether the two types in question
> match, but you could always have a #error if some match and others don't.
>
> > > You'll also need to ensure that XSTAT_IS_XSTAT64 is defined to 1.  And
> > > you'll need to make wordsize-32/overflow.h define trivial versions of the
> > > *_overflow functions in cases where the types match (this should be done
> > > in that file, rather than making an RV32-specific copy, to benefit future
> > > ports that make the same choices as RV32).
> >
> > I think I tested that and it didn't fix the problem so I dropped it.
> > I'll add the XSTAT_IS_XSTAT64 define back.
> >
> > I'm not sure what you mean by then the types match?
>
> off_t and off64_t match if they have the same size and alignment (if
> __OFF_T_MATCHES_OFF64_T is defined).  Likewise for the other pairs.
>
> If off_t and off64_t match, there is no need for the "buf->__st_size_pad
> == 0" check in stat_overflow (and once you've removed the padding in that
> case, that check won't even compile because __st_size_pad won't exist).
> Likewise for the other pairs.  In practice, you can probably just make
> that function return 0 if all three macros are defined, rather than
> allowing for arbitrary subsets of the types matching or not matching.

Ok, I think I understand. I'll update the patch series.

>
> Similar considerations apply to statfs_overflow if the types and padding
> used in struct statfs match those used in struct statfs64.  So apply
> similar considerations to the two versions of that structure.

Ok

Alistair

>
> You probably don't need to do anything special with lseek_overflow because
> the compiler should just optimize that away when off_t and loff_t are the
> same type.
>
> --
> Joseph S. Myers
> joseph@codesourcery.com
  
Joseph Myers Aug. 12, 2019, 11:31 p.m. UTC | #5
On Mon, 12 Aug 2019, Alistair Francis wrote:

> I'm looking at sysdeps/unix/sysv/linux/bits/stat.h, it has a different
> definition then the one you are looking at.

That file is not relevant for any Linux kernel port added in the past 
several years.  All recent ports should use the asm-generic structure 
layouts.
  
Alistair Francis Aug. 13, 2019, 8:36 p.m. UTC | #6
On Mon, Aug 12, 2019 at 4:31 PM Joseph Myers <joseph@codesourcery.com> wrote:
>
> On Mon, 12 Aug 2019, Alistair Francis wrote:
>
> > I'm looking at sysdeps/unix/sysv/linux/bits/stat.h, it has a different
> > definition then the one you are looking at.
>
> That file is not relevant for any Linux kernel port added in the past
> several years.  All recent ports should use the asm-generic structure
> layouts.

Ok, now I understand. I have made all of the changes you requested.

Alistair

>
> --
> Joseph S. Myers
> joseph@codesourcery.com
  

Patch

diff --git a/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstat.c b/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstat.c
index 81e137c585b..234098b5199 100644
--- a/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstat.c
+++ b/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstat.c
@@ -43,7 +43,7 @@  __fxstat (int vers, int fd, struct stat *buf)
       int rc = INLINE_SYSCALL (statx, 5, fd, "", AT_EMPTY_PATH,
                                STATX_BASIC_STATS, &tmp);
       if (rc == 0)
-        __cp_stat64_statx ((struct stat64 *)buf, &tmp);
+        __cp_stat_statx (buf, &tmp);
 # endif
       return rc ?: stat_overflow (buf);
     }
diff --git a/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstatat.c b/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstatat.c
index e203260485c..bd3bfc101cd 100644
--- a/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstatat.c
+++ b/sysdeps/unix/sysv/linux/generic/wordsize-32/fxstatat.c
@@ -44,7 +44,7 @@  __fxstatat (int vers, int fd, const char *file, struct stat *buf, int flag)
                                AT_NO_AUTOMOUNT | flag,
                                STATX_BASIC_STATS, &tmp);
       if (rc == 0)
-        __cp_stat64_statx ((struct stat64 *)buf, &tmp);
+        __cp_stat_statx (buf, &tmp);
 # endif
       return rc ?: stat_overflow (buf);
     }
diff --git a/sysdeps/unix/sysv/linux/generic/wordsize-32/lxstat.c b/sysdeps/unix/sysv/linux/generic/wordsize-32/lxstat.c
index 2d4dfbaf527..01edf9f22cc 100644
--- a/sysdeps/unix/sysv/linux/generic/wordsize-32/lxstat.c
+++ b/sysdeps/unix/sysv/linux/generic/wordsize-32/lxstat.c
@@ -44,7 +44,7 @@  __lxstat (int vers, const char *name, struct stat *buf)
                                AT_NO_AUTOMOUNT | AT_SYMLINK_NOFOLLOW,
                                STATX_BASIC_STATS, &tmp);
       if (rc == 0)
-        __cp_stat64_statx ((struct stat64 *)buf, &tmp);
+        __cp_stat_statx (buf, &tmp);
 #endif
       return rc ?: stat_overflow (buf);
     }
diff --git a/sysdeps/unix/sysv/linux/generic/wordsize-32/xstat.c b/sysdeps/unix/sysv/linux/generic/wordsize-32/xstat.c
index 5f1f53c0f3f..dea6ec525e9 100644
--- a/sysdeps/unix/sysv/linux/generic/wordsize-32/xstat.c
+++ b/sysdeps/unix/sysv/linux/generic/wordsize-32/xstat.c
@@ -42,7 +42,7 @@  __xstat (int vers, const char *name, struct stat *buf)
       int rc = INLINE_SYSCALL (statx, 5, AT_FDCWD, name, AT_NO_AUTOMOUNT,
                                STATX_BASIC_STATS, &tmp);
       if (rc == 0)
-        __cp_stat64_statx ((struct stat64 *)buf, &tmp);
+        __cp_stat_statx (buf, &tmp);
 # endif
       return rc ?: stat_overflow (buf);
     }
diff --git a/sysdeps/unix/sysv/linux/statx_cp.c b/sysdeps/unix/sysv/linux/statx_cp.c
index 3b4e5583e9a..2574d05563c 100644
--- a/sysdeps/unix/sysv/linux/statx_cp.c
+++ b/sysdeps/unix/sysv/linux/statx_cp.c
@@ -46,4 +46,28 @@  __cp_stat64_statx (struct stat64 *to, struct statx *from)
   to->st_blocks = from->stx_blocks;
   to->st_blksize = from->stx_blksize;
 }
+
+void
+__cp_stat_statx (struct stat *to, struct statx *from)
+{
+  memset (to, 0, sizeof (struct stat64));
+  to->st_dev = ((from->stx_dev_minor & 0xff) | (from->stx_dev_major << 8)
+    | ((from->stx_dev_minor & ~0xff) << 12));
+  to->st_rdev = ((from->stx_rdev_minor & 0xff) | (from->stx_rdev_major << 8)
+     | ((from->stx_rdev_minor & ~0xff) << 12));
+  to->st_ino = from->stx_ino;
+  to->st_mode = from->stx_mode;
+  to->st_nlink = from->stx_nlink;
+  to->st_uid = from->stx_uid;
+  to->st_gid = from->stx_gid;
+  to->st_atime = from->stx_atime.tv_sec;
+  to->st_atim.tv_nsec = from->stx_atime.tv_nsec;
+  to->st_mtime = from->stx_mtime.tv_sec;
+  to->st_mtim.tv_nsec = from->stx_mtime.tv_nsec;
+  to->st_ctime = from->stx_ctime.tv_sec;
+  to->st_ctim.tv_nsec = from->stx_ctime.tv_nsec;
+  to->st_size = from->stx_size;
+  to->st_blocks = from->stx_blocks;
+  to->st_blksize = from->stx_blksize;
+}
 #endif
diff --git a/sysdeps/unix/sysv/linux/statx_cp.h b/sysdeps/unix/sysv/linux/statx_cp.h
index f08a7a8dcc8..2bb1d0605fd 100644
--- a/sysdeps/unix/sysv/linux/statx_cp.h
+++ b/sysdeps/unix/sysv/linux/statx_cp.h
@@ -18,3 +18,6 @@ 
 
 extern void __cp_stat64_statx (struct stat64 *to, struct statx *from)
   attribute_hidden;
+
+extern void __cp_stat_statx (struct stat *to, struct statx *from)
+  attribute_hidden;