[v2] Fix 32-bit getcwd() on filesystems with 64-bit inodes.

Message ID 87606j5nkn.fsf_-_@esperi.org.uk
State New, archived
Headers

Commit Message

Nix Feb. 26, 2018, 11:38 p.m. UTC
  Here's the latest patch round, with your review comments incorporated,
and retested on trunk. Thanks for the review!

On 26 Feb 2018, Adhemerval Zanella said:

> Thanks for checking on it and I see no reason to continue using non-LFS
> calls on loader for 32-bits architectures.

I'm sure there are more. I might do a big audit for all remaining
external references to non-LFS stuff from stuff that is not itself a
non-LFS implementation, and squash them all at once. (But maybe we
should do this differently, by internally defining __readdir() etc to be
__readdir64() as is true when _FILE_OFFSET_BITS=64, and adding an
internal __readdir32() for the non-LFS version? That way you'd have to
do something unusual to call the non-LFS variant, which is surely the
right way around.)

I audited all other readdir64() implementations: no others define any
non-default symbol versions except by #including the i386 one, so I
think we're fine on other arches. The Hurd version calls __dir_readdir()
which is not in glibc so I just have to hope that it doesn't have the
same limitation as Linux getdents(). :)

(I hope scissors lines work here: git am --scissors.)

8<---------------------->8
From: Nick Alcock <nick.alcock@oracle.com>

When compiling a 32-bit glibc on a filesystem with 64-bit inodes,
we observe failures of io/tst-getcwd-abspath:

tst-getcwd-abspath.c:42: numeric comparison failure
   left: 75 (0x4b); from: errno
  right: 2 (0x2); from: ENOENT
tst-getcwd-abspath.c:47: numeric comparison failure
   left: 75 (0x4b); from: errno
  right: 2 (0x2); from: ENOENT
error: 2 test failures

Errno 75 is EOVERFLOW, which is most unexpected from getcwd()! Having
had experience with this class of pain in zic before (a patch which I
should perhaps resubmit or combine with this one), the cause is clear:
the getcwd() implementation was calling readdir() because it was in a
disconnected subtree, and in glibc that is the non-LFS implementation
so it ends up calling getdents(), which falls over with just that
error if it sees a single inode with a value nonrepresentable in 32
bits in the directories it fetches, and it scans all parents of the
current directory so it has a lot of opportunities to hit such an
inode.

getcwd() is used in the dynamic linker as part of $ORIGIN support, so
the usual SHLIB_COMPAT dance is needed there to prevent versioned symbols
getting into it and causing disaster.

	[BZ #22899]
	* include/dirent.h: Make __readdir64 hidden in rtld too.
	* sysdeps/unix/sysv/linux/i386/readdir64.c: Mark SHLIB_COMPAT
	in libc only.
	* posix/getcwd.c [__GNU_LIBRARY__] (__readdir64): Define.
	(__getcwd): Use dirent64 and __readdir64.
---
 include/dirent.h                         |  4 +++-
 sysdeps/posix/getcwd.c                   |  5 +++--
 sysdeps/unix/sysv/linux/i386/readdir64.c | 14 ++++++++------
 3 files changed, 14 insertions(+), 9 deletions(-)
  

Comments

Adhemerval Zanella Netto Feb. 27, 2018, 1:21 p.m. UTC | #1
On 26/02/2018 20:38, Nick Alcock wrote:
> Here's the latest patch round, with your review comments incorporated,
> and retested on trunk. Thanks for the review!
> 
> On 26 Feb 2018, Adhemerval Zanella said:
> 
>> Thanks for checking on it and I see no reason to continue using non-LFS
>> calls on loader for 32-bits architectures.
> 
> I'm sure there are more. I might do a big audit for all remaining
> external references to non-LFS stuff from stuff that is not itself a
> non-LFS implementation, and squash them all at once. (But maybe we
> should do this differently, by internally defining __readdir() etc to be
> __readdir64() as is true when _FILE_OFFSET_BITS=64, and adding an
> internal __readdir32() for the non-LFS version? That way you'd have to
> do something unusual to call the non-LFS variant, which is surely the
> right way around.)

Usually the internal naming would be __function64 for LFS and __function
for non-LFS, and recently we started to reuse the __libc_function64 and 
__libc_function for newer symbols.  I would prefer to explicit call 
__function64 instead of redefine it with macros, we usually do it on some
implementation because they came from gnulib and it uses this approach
because of its constraints.

> 
> I audited all other readdir64() implementations: no others define any
> non-default symbol versions except by #including the i386 one, so I
> think we're fine on other arches. The Hurd version calls __dir_readdir()
> which is not in glibc so I just have to hope that it doesn't have the
> same limitation as Linux getdents(). :)

I will try to spend some time today to consolidate the readdir64 Linux
implementation.

> 
> (I hope scissors lines work here: git am --scissors.)
> 
> 8<---------------------->8
> From: Nick Alcock <nick.alcock@oracle.com>
> 
> When compiling a 32-bit glibc on a filesystem with 64-bit inodes,
> we observe failures of io/tst-getcwd-abspath:
> 
> tst-getcwd-abspath.c:42: numeric comparison failure
>    left: 75 (0x4b); from: errno
>   right: 2 (0x2); from: ENOENT
> tst-getcwd-abspath.c:47: numeric comparison failure
>    left: 75 (0x4b); from: errno
>   right: 2 (0x2); from: ENOENT
> error: 2 test failures
> 
> Errno 75 is EOVERFLOW, which is most unexpected from getcwd()! Having
> had experience with this class of pain in zic before (a patch which I
> should perhaps resubmit or combine with this one), the cause is clear:
> the getcwd() implementation was calling readdir() because it was in a
> disconnected subtree, and in glibc that is the non-LFS implementation
> so it ends up calling getdents(), which falls over with just that
> error if it sees a single inode with a value nonrepresentable in 32
> bits in the directories it fetches, and it scans all parents of the
> current directory so it has a lot of opportunities to hit such an
> inode.
> 
> getcwd() is used in the dynamic linker as part of $ORIGIN support, so
> the usual SHLIB_COMPAT dance is needed there to prevent versioned symbols
> getting into it and causing disaster.

We usually add where we actually tested the patch on commit message,
so if you may please add the explanation you wrote before about the
XFS inode limitations along where you actually tested (for instance
'checked on x86_64-linux-gnu').

> 
> 	[BZ #22899]
> 	* include/dirent.h: Make __readdir64 hidden in rtld too.
> 	* sysdeps/unix/sysv/linux/i386/readdir64.c: Mark SHLIB_COMPAT
> 	in libc only.
> 	* posix/getcwd.c [__GNU_LIBRARY__] (__readdir64): Define.
> 	(__getcwd): Use dirent64 and __readdir64.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

> ---
>  include/dirent.h                         |  4 +++-
>  sysdeps/posix/getcwd.c                   |  5 +++--
>  sysdeps/unix/sysv/linux/i386/readdir64.c | 14 ++++++++------
>  3 files changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/include/dirent.h b/include/dirent.h
> index caaeb0be85..9ca588a4e3 100644
> --- a/include/dirent.h
> +++ b/include/dirent.h
> @@ -21,7 +21,9 @@ extern DIR *__fdopendir (int __fd) attribute_hidden;
>  extern int __closedir (DIR *__dirp) attribute_hidden;
>  extern struct dirent *__readdir (DIR *__dirp) attribute_hidden;
>  extern struct dirent64 *__readdir64 (DIR *__dirp);
> -libc_hidden_proto (__readdir64)
> +#  if IS_IN (libc) || (IS_IN (rtld) && !defined NO_RTLD_HIDDEN)
> +   hidden_proto (__readdir64)
> +#  endif
>  extern int __readdir_r (DIR *__dirp, struct dirent *__entry,
>  			struct dirent **__result);
>  extern int __readdir64_r (DIR *__dirp, struct dirent64 *__entry,

Ok.

> diff --git a/sysdeps/posix/getcwd.c b/sysdeps/posix/getcwd.c
> index b53433a2dc..f547cd6612 100644
> --- a/sysdeps/posix/getcwd.c
> +++ b/sysdeps/posix/getcwd.c
> @@ -194,6 +194,7 @@ extern char *alloca ();
>  
>  #ifndef __GNU_LIBRARY__
>  # define __lstat64	stat64
> +# define __readdir64	readdir64
>  #endif
>  
>  #ifndef _LIBC
> @@ -366,14 +367,14 @@ __getcwd (char *buf, size_t size)
>  	goto lose;
>        fd_needs_closing = false;
>  
> -      struct dirent *d;
> +      struct dirent64 *d;
>        bool use_d_ino = true;
>        while (1)
>  	{
>  	  /* Clear errno to distinguish EOF from error if readdir returns
>  	     NULL.  */
>  	  __set_errno (0);
> -	  d = __readdir (dirstream);
> +	  d = __readdir64 (dirstream);
>  	  if (d == NULL)
>  	    {
>  	      if (errno == 0)

Ok.

> diff --git a/sysdeps/unix/sysv/linux/i386/readdir64.c b/sysdeps/unix/sysv/linux/i386/readdir64.c
> index 42b73023e0..e2981c7f9c 100644
> --- a/sysdeps/unix/sysv/linux/i386/readdir64.c
> +++ b/sysdeps/unix/sysv/linux/i386/readdir64.c
> @@ -28,19 +28,21 @@
>  #undef DIRENT_TYPE
>  
>  libc_hidden_def (__readdir64)
> +#if IS_IN (libc)
>  versioned_symbol (libc, __readdir64, readdir64, GLIBC_2_2);
>  
> -#if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
> +#  if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)

The indentation is done with one extra space.

>  
> -#include <olddirent.h>
> +#  include <olddirent.h>
>  
> -#define __READDIR attribute_compat_text_section __old_readdir64
> -#define __GETDENTS __old_getdents64
> -#define DIRENT_TYPE struct __old_dirent64
> +#  define __READDIR attribute_compat_text_section __old_readdir64
> +#  define __GETDENTS __old_getdents64
> +#  define DIRENT_TYPE struct __old_dirent64
>  
> -#include <sysdeps/posix/readdir.c>
> +#  include <sysdeps/posix/readdir.c>
>  
>  libc_hidden_def (__old_readdir64)
>  
>  compat_symbol (libc, __old_readdir64, readdir64, GLIBC_2_1);
>  #endif
> +#endif
> 

Ok with the removal of the extra space.
  
Nix Feb. 27, 2018, 4:33 p.m. UTC | #2
On 27 Feb 2018, Adhemerval Zanella outgrape:

> On 26/02/2018 20:38, Nick Alcock wrote:
>> getcwd() is used in the dynamic linker as part of $ORIGIN support, so
>> the usual SHLIB_COMPAT dance is needed there to prevent versioned symbols
>> getting into it and causing disaster.
>
> We usually add where we actually tested the patch on commit message,
> so if you may please add the explanation you wrote before about the
> XFS inode limitations along where you actually tested (for instance
> 'checked on x86_64-linux-gnu').

Will add.

>> diff --git a/sysdeps/unix/sysv/linux/i386/readdir64.c b/sysdeps/unix/sysv/linux/i386/readdir64.c
>> index 42b73023e0..e2981c7f9c 100644
>> --- a/sysdeps/unix/sysv/linux/i386/readdir64.c
>> +++ b/sysdeps/unix/sysv/linux/i386/readdir64.c
>> @@ -28,19 +28,21 @@
>>  #undef DIRENT_TYPE
>>  
>>  libc_hidden_def (__readdir64)
>> +#if IS_IN (libc)
>>  versioned_symbol (libc, __readdir64, readdir64, GLIBC_2_2);
>>  
>> -#if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
>> +#  if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
>
> The indentation is done with one extra space.

We are already inside one level of #if here: #ifndef _DIRENT_H at the
top. There is a probable error there: # ifndef _ISOMAC is followed by
no identation change at all, so frankly I have no *idea* what level is
appropriate. Perhaps two, perhaps three, certainly not one.

>>  
>> -#include <olddirent.h>
>> +#  include <olddirent.h>
>>  
>> -#define __READDIR attribute_compat_text_section __old_readdir64
>> -#define __GETDENTS __old_getdents64
>> -#define DIRENT_TYPE struct __old_dirent64
>> +#  define __READDIR attribute_compat_text_section __old_readdir64
>> +#  define __GETDENTS __old_getdents64
>> +#  define DIRENT_TYPE struct __old_dirent64
>>  
>> -#include <sysdeps/posix/readdir.c>
>> +#  include <sysdeps/posix/readdir.c>
>>  
>>  libc_hidden_def (__old_readdir64)
>>  
>>  compat_symbol (libc, __old_readdir64, readdir64, GLIBC_2_1);
>>  #endif
>> +#endif
>
> Ok with the removal of the extra space.

this is a double-level of #if, too, though the indentation on the
innermost #if itself, and on its corresponding #endif, is wrong: will
fix.
  
Zack Weinberg Feb. 27, 2018, 6:19 p.m. UTC | #3
On Tue, Feb 27, 2018 at 11:33 AM, Nick Alcock <nix@esperi.org.uk> wrote:
>>> -#if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
>>> +#  if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
>>
>> The indentation is done with one extra space.
>
> We are already inside one level of #if here: #ifndef _DIRENT_H at the
> top. There is a probable error there: # ifndef _ISOMAC is followed by
> no identation change at all, so frankly I have no *idea* what level is
> appropriate. Perhaps two, perhaps three, certainly not one.

The policy as I understand it is that every level of #if nesting
should increase preprocessor-directive indentation by one space,
*except* for multiple-include guards (the #ifndef _DIRENT_H).

If the file is internally inconsistent, please fix it up as a separate
commit.  Corrections to indentation (and other corrections that only
change white space) do not require prior review - just post the patch
and then check it in.  You do need to add a ChangeLog entry, but it
just needs to say "* whatever.h: Fix indentation."

zw
  

Patch

diff --git a/include/dirent.h b/include/dirent.h
index caaeb0be85..9ca588a4e3 100644
--- a/include/dirent.h
+++ b/include/dirent.h
@@ -21,7 +21,9 @@  extern DIR *__fdopendir (int __fd) attribute_hidden;
 extern int __closedir (DIR *__dirp) attribute_hidden;
 extern struct dirent *__readdir (DIR *__dirp) attribute_hidden;
 extern struct dirent64 *__readdir64 (DIR *__dirp);
-libc_hidden_proto (__readdir64)
+#  if IS_IN (libc) || (IS_IN (rtld) && !defined NO_RTLD_HIDDEN)
+   hidden_proto (__readdir64)
+#  endif
 extern int __readdir_r (DIR *__dirp, struct dirent *__entry,
 			struct dirent **__result);
 extern int __readdir64_r (DIR *__dirp, struct dirent64 *__entry,
diff --git a/sysdeps/posix/getcwd.c b/sysdeps/posix/getcwd.c
index b53433a2dc..f547cd6612 100644
--- a/sysdeps/posix/getcwd.c
+++ b/sysdeps/posix/getcwd.c
@@ -194,6 +194,7 @@  extern char *alloca ();
 
 #ifndef __GNU_LIBRARY__
 # define __lstat64	stat64
+# define __readdir64	readdir64
 #endif
 
 #ifndef _LIBC
@@ -366,14 +367,14 @@  __getcwd (char *buf, size_t size)
 	goto lose;
       fd_needs_closing = false;
 
-      struct dirent *d;
+      struct dirent64 *d;
       bool use_d_ino = true;
       while (1)
 	{
 	  /* Clear errno to distinguish EOF from error if readdir returns
 	     NULL.  */
 	  __set_errno (0);
-	  d = __readdir (dirstream);
+	  d = __readdir64 (dirstream);
 	  if (d == NULL)
 	    {
 	      if (errno == 0)
diff --git a/sysdeps/unix/sysv/linux/i386/readdir64.c b/sysdeps/unix/sysv/linux/i386/readdir64.c
index 42b73023e0..e2981c7f9c 100644
--- a/sysdeps/unix/sysv/linux/i386/readdir64.c
+++ b/sysdeps/unix/sysv/linux/i386/readdir64.c
@@ -28,19 +28,21 @@ 
 #undef DIRENT_TYPE
 
 libc_hidden_def (__readdir64)
+#if IS_IN (libc)
 versioned_symbol (libc, __readdir64, readdir64, GLIBC_2_2);
 
-#if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
+#  if SHLIB_COMPAT(libc, GLIBC_2_1, GLIBC_2_2)
 
-#include <olddirent.h>
+#  include <olddirent.h>
 
-#define __READDIR attribute_compat_text_section __old_readdir64
-#define __GETDENTS __old_getdents64
-#define DIRENT_TYPE struct __old_dirent64
+#  define __READDIR attribute_compat_text_section __old_readdir64
+#  define __GETDENTS __old_getdents64
+#  define DIRENT_TYPE struct __old_dirent64
 
-#include <sysdeps/posix/readdir.c>
+#  include <sysdeps/posix/readdir.c>
 
 libc_hidden_def (__old_readdir64)
 
 compat_symbol (libc, __old_readdir64, readdir64, GLIBC_2_1);
 #endif
+#endif