[v2,4/4] stdlib: Remove lstat usage from realpath [BZ #24970]

Message ID 20201027143531.2448132-4-adhemerval.zanella@linaro.org
State Superseded
Headers
Series [v2,1/4] Sync canonicalize with gnulib [BZ #10635] |

Commit Message

Adhemerval Zanella Oct. 27, 2020, 2:35 p.m. UTC
  The readlink already tells whether the file is a symlink, so there is
no need to call lstat to check it.  However for '..' it requires an
extra readlink check if the previous component can be really accessed,
otherwise the next iteration will check a possible valid path and end
early.  It should performance-wise acceptable and a gain over lstat,
afaik symlink should not update any inode information.

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 stdlib/canonicalize.c | 52 ++++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 25 deletions(-)
  

Comments

Florian Weimer Oct. 27, 2020, 3:36 p.m. UTC | #1
* Adhemerval Zanella via Libc-alpha:

> The readlink already tells whether the file is a symlink, so there is
> no need to call lstat to check it.  However for '..' it requires an
> extra readlink check if the previous component can be really accessed,
> otherwise the next iteration will check a possible valid path and end
> early.  It should performance-wise acceptable and a gain over lstat,
> afaik symlink should not update any inode information.

Should this go into gnulib first?

Thanks,
Florian
  
Adhemerval Zanella Oct. 27, 2020, 5:30 p.m. UTC | #2
On 27/10/2020 12:36, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> The readlink already tells whether the file is a symlink, so there is
>> no need to call lstat to check it.  However for '..' it requires an
>> extra readlink check if the previous component can be really accessed,
>> otherwise the next iteration will check a possible valid path and end
>> early.  It should performance-wise acceptable and a gain over lstat,
>> afaik symlink should not update any inode information.
> 
> Should this go into gnulib first?

I can send to gnulib, although glibc implementation is not really in
sync with gnulib one's.  Also on first patch of this set I explicit
did not sync some gnulib features that I think only clutter the
implementation with code that either won't be used by any system glibc
supports or will be removed in subsequent patches.

Paul, do you think this could be included in gnulib?
  
Paul Eggert Oct. 27, 2020, 7:11 p.m. UTC | #3
On 10/27/20 10:30 AM, Adhemerval Zanella wrote:
> Paul, do you think this could be included in gnulib?

I'd rather keep the two in sync. This should be doable by putting the 
Gnulib-specific stuff in "#ifndef _LIBC" areas that shouldn't bother Glibc 
maintenance significantly. We've had a reasonable amount of luck doing that in 
other library code. I could give a shot at doing this, install the result into 
Gnulib, and having you look at the result (we could think of this as being part 
of the glibc review process :-).
  

Patch

diff --git a/stdlib/canonicalize.c b/stdlib/canonicalize.c
index 9aa69676e4..952f4dca41 100644
--- a/stdlib/canonicalize.c
+++ b/stdlib/canonicalize.c
@@ -55,6 +55,7 @@  __realpath (const char *name, char *resolved)
   const char *start, *end, *rpath_limit;
   const size_t path_max = PATH_MAX;
   int num_links = 0;
+  char buf[PATH_MAX];
   char extra_buf[PATH_MAX];
 
   if (name == NULL)
@@ -104,12 +105,6 @@  __realpath (const char *name, char *resolved)
 
   for (end = start; *start; start = end)
     {
-#ifdef _LIBC
-      struct stat64 st;
-#else
-      struct stat st;
-#endif
-
       /* Skip sequence of multiple path-separators.  */
       while (*start == '/')
         ++start;
@@ -118,12 +113,25 @@  __realpath (const char *name, char *resolved)
       for (end = start; *end && *end != '/'; ++end)
         /* Nothing.  */;
 
-      if (end - start == 0)
-        break;
-      else if (end - start == 1 && start[0] == '.')
+      if (end - start == 1 && start[0] == '.')
         /* nothing */;
       else if (end - start == 2 && start[0] == '.' && start[1] == '.')
         {
+          ssize_t n;
+
+          if (dest[-1] != '/')
+            *dest++ = '/';
+          *dest = '\0';
+
+          n = __readlink (rpath, buf, path_max - 1);
+          if (n == -1)
+            {
+              if (errno == ENOTDIR && dest[-1] == '/')
+                dest[-1] = '\0';
+              if (errno != EINVAL)
+                goto error;
+            }
+
           /* Back up to previous component, ignore if at root already.  */
           if (dest > rpath + 1)
             for (--dest; dest > rpath && dest[-1] != '/'; --dest)
@@ -132,6 +140,7 @@  __realpath (const char *name, char *resolved)
       else
         {
           size_t new_size;
+          ssize_t n;
 
           if (dest[-1] != '/')
             *dest++ = '/';
@@ -166,25 +175,23 @@  __realpath (const char *name, char *resolved)
           dest = __mempcpy (dest, start, end - start);
           *dest = '\0';
 
-          if (__lstat64 (rpath, &st) < 0)
-            goto error;
-
-          if (S_ISLNK (st.st_mode))
+          n = __readlink (rpath, buf, path_max - 1);
+          if (n < 0)
+            {
+              if (errno == ENOTDIR && dest[-1] == '/')
+                dest[-1] = '\0';
+              if (errno != EINVAL)
+                goto error;
+            }
+          else
             {
-              char buf[PATH_MAX];
               size_t len;
-              ssize_t n;
 
               if (++num_links > __eloop_threshold ())
                 {
                   __set_errno (ELOOP);
                   goto error;  }
 
-              n = __readlink (rpath, buf, path_max - 1);
-              if (n < 0)
-                goto error;
-              buf[n] = '\0';
-
               len = strlen (end);
               /* Check that n + len + 1 doesn't overflow and is <= path_max. */
               if (n >= SIZE_MAX - len || n + len >= path_max)
@@ -211,11 +218,6 @@  __realpath (const char *name, char *resolved)
                       continue;
                 }
             }
-          else if (!S_ISDIR (st.st_mode) && *end != '\0')
-            {
-              __set_errno (ENOTDIR);
-              goto error;
-            }
         }
     }
   if (dest > rpath + 1 && dest[-1] == '/')