ftw.c: Check for "." and ".." branchlessly

Message ID 20230924064516.17398-1-tirtajames45@gmail.com
State Dropped
Headers
Series ftw.c: Check for "." and ".." branchlessly |

Checks

Context Check Description
redhat-pt-bot/TryBot-apply_patch success Patch applied to master at the time it was sent
redhat-pt-bot/TryBot-32bit fail Patch caused testsuite regressions
linaro-tcwg-bot/tcwg_glibc_build--master-arm success Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64 success Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm fail Testing failed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64 fail Testing failed

Commit Message

James Tirta Halim Sept. 24, 2023, 6:45 a.m. UTC
  ---
 io/ftw.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)
  

Comments

Xi Ruoyao Sept. 24, 2023, 11:29 a.m. UTC | #1
On Sun, 2023-09-24 at 13:45 +0700, James Tirta Halim wrote:
> +static int
> +is_relative (const char *fname)
> +{
> +	typedef uint16_t w2 __attribute__ ((__may_alias__));
> +	typedef uint32_t w4 __attribute__ ((__may_alias__));
> +	const uint16_t n1 = '.' | '\0' SH 8;
> +	const uint32_t n2 = '.' | '.' SH 8 | '\0' SH 16;
> +	return (*(w2 *)fname == n1) | ((*(w4 *)fname | '\0' SH 24) ==
> n2);
> +}

No.  The unaligned access will cause segmentation fault or bus error on
some architectures.
  
Xi Ruoyao Sept. 24, 2023, 11:43 a.m. UTC | #2
On Sun, 2023-09-24 at 19:29 +0800, Xi Ruoyao wrote:
> On Sun, 2023-09-24 at 13:45 +0700, James Tirta Halim wrote:
> > +static int
> > +is_relative (const char *fname)
> > +{
> > +	typedef uint16_t w2 __attribute__ ((__may_alias__));
> > +	typedef uint32_t w4 __attribute__ ((__may_alias__));
> > +	const uint16_t n1 = '.' | '\0' SH 8;
> > +	const uint32_t n2 = '.' | '.' SH 8 | '\0' SH 16;
> > +	return (*(w2 *)fname == n1) | ((*(w4 *)fname | '\0' SH 24) ==
> > n2);
> > +}
> 
> No.  The unaligned access will cause segmentation fault or bus error on
> some architectures.

There are also other issues:

1. If "fname" is an empty string, there will be a buffer overread.  I
don't know if fname can be empty string here though.

2. If "fname" is "..", then the bytes in "w4" can be:

'.', '.', '\0', **anything**

The last byte is completely arbitrary (and it may be even out of a
mapped memory range and then reading it will immediately cause a
segfault).  Again I don't know if fname can be just ".." here though.

3. We should let the compiler decide to use the branchless operation or
not.  If the compiler does not do the correct thing we should fix the
compiler instead of writing some nasty code here because fixing the
compiler will benefit both Glibc and other packages.  There is some
related existing GCC bug report (https://gcc.gnu.org/PR109555), the
problem is using a branchless operation here is not always a win.  See
the comments in the bug report for reference.
  

Patch

diff --git a/io/ftw.c b/io/ftw.c
index a72c7d5171..742369a2d2 100644
--- a/io/ftw.c
+++ b/io/ftw.c
@@ -66,6 +66,7 @@  char *alloca ();
 #include <unistd.h>
 #include <not-cancel.h>
 #include <sys/param.h>
+#include <endian.h>
 #ifdef _LIBC
 # include <include/sys/stat.h>
 #else
@@ -388,6 +389,25 @@  open_dir_stream (int *dfdp, struct ftw_data *data, struct dir_data *dirp)
 }
 
 
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#  define SH <<
+#else
+#  define SH >>
+#endif
+
+static int
+is_relative (const char *fname)
+{
+	typedef uint16_t w2 __attribute__ ((__may_alias__));
+	typedef uint32_t w4 __attribute__ ((__may_alias__));
+	const uint16_t n1 = '.' | '\0' SH 8;
+	const uint32_t n2 = '.' | '.' SH 8 | '\0' SH 16;
+	return (*(w2 *)fname == n1) | ((*(w4 *)fname | '\0' SH 24) == n2);
+}
+
+#undef SH
+
+
 static int
 process_entry (struct ftw_data *data, struct dir_data *dir, const char *name,
 	       size_t namlen, int d_type)
@@ -397,9 +417,8 @@  process_entry (struct ftw_data *data, struct dir_data *dir, const char *name,
   int flag = 0;
   size_t new_buflen;
 
-  if (name[0] == '.' && (name[1] == '\0'
-			 || (name[1] == '.' && name[2] == '\0')))
-    /* Don't process the "." and ".." entries.  */
+  /* Don't process the "." and ".." entries.  */
+  if (is_relative(name))
     return 0;
 
   new_buflen = data->ftw.base + namlen + 2;