[3/3] linux: Optimize realpath stack usage

Message ID 20200810204856.2111211-3-adhemerval.zanella@linaro.org
State Superseded
Headers
Series [1/3] stdlib: Use fixed buffer size for realpath (BZ #26241) |

Commit Message

Adhemerval Zanella Aug. 10, 2020, 8:48 p.m. UTC
  This optimizes the stack usage for success case (from ~8K to ~4k) and
where 'resolved' input buffer is not provided.  For ithe failure case
when the 'resolved' buffer is provided, it requires use the generic
strategy to find the path when EACESS or ENOENT is returned (this is
a GNU extension not defined in the standard).

Regarding syscalls usage, for a sucessful path without symlinks it
trades 2 syscalls (getcwd/lstat) for 5 (openat, readlink, fstat, stat,
 and close).  Its is slighter better if the input contains multiple
symlinks (where Linux kernel tricks allows replace multiple lstats by
only one readlink).  For failure it depends whether the 'resolved' buffer
is provided, which will call the old strategy (and thus requiring more
syscalls in general).

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 include/stdlib.h                   |  13 +++
 stdlib/Makefile                    |   2 +-
 stdlib/canonicalize-internal.c     | 136 ++++++++++++++++++++++++++++
 stdlib/canonicalize.c              | 141 +----------------------------
 stdlib/realpath.c                  |  42 +++++++++
 sysdeps/unix/sysv/linux/realpath.c |  65 +++++++++++++
 6 files changed, 258 insertions(+), 141 deletions(-)
 create mode 100644 stdlib/canonicalize-internal.c
 create mode 100644 stdlib/realpath.c
 create mode 100644 sysdeps/unix/sysv/linux/realpath.c
  

Comments

Paul Eggert Aug. 10, 2020, 9:25 p.m. UTC | #1
On 8/10/20 1:48 PM, Adhemerval Zanella wrote:

> Regarding syscalls usage, for a sucessful path without symlinks it
> trades 2 syscalls (getcwd/lstat) for 5 (openat, readlink, fstat, stat,
>   and close). 

Thanks for looking into this.

There's no need for the new __resolve_path function to call __lstat64. It does 
that only to determine whether the file is a symlink or a directory, and it can 
do the former with __readlink (which it's gonna do already) and the latter by 
going into the next iteration and seeing whether it gets NOTDIR. Avoiding 
__lstat64 means we don't have to worry about irrelevant EOVERFLOW hassles.

In the Linux __realpath_system, the code should be prepared for the /proc 
syscall to fail because /proc is not mounted. It can fall back on __resolve_path 
in that case. The code should be prepared for the fstat64 to fail due to 
EOVERFLOW. And I'm not quite getting why the code is comparing fstat64's result 
to stat64's result -- perhaps explain this in a comment and how EOVERFLOW is 
dealt with?
  
Andreas Schwab Aug. 11, 2020, 9:46 a.m. UTC | #2
On Aug 10 2020, Adhemerval Zanella via Libc-alpha wrote:

> This optimizes the stack usage for success case (from ~8K to ~4k) and
> where 'resolved' input buffer is not provided.  For ithe failure case

s/ithe/the/

Andreas.
  
Adhemerval Zanella Aug. 11, 2020, 2:14 p.m. UTC | #3
On 10/08/2020 18:25, Paul Eggert wrote:
> On 8/10/20 1:48 PM, Adhemerval Zanella wrote:
> 
>> Regarding syscalls usage, for a sucessful path without symlinks it
>> trades 2 syscalls (getcwd/lstat) for 5 (openat, readlink, fstat, stat,
>>   and close). 
> 
> Thanks for looking into this.
> 
> There's no need for the new __resolve_path function to call __lstat64. It does that only to determine whether the file is a symlink or a directory, and it can do the former with __readlink (which it's gonna do already) and the latter by going into the next iteration and seeing whether it gets NOTDIR. Avoiding __lstat64 means we don't have to worry about irrelevant EOVERFLOW hassles.

I am not sure we can skip the __lstat64 because we can't check if a subpath
passed on __readlink is a directory or not (so we can move to next iteration
in some cases).

For instance the test 30 from stdlib/test-canon.c which issues:

  realpath ("./doesExist/someFile/", ...)

On the directory with:

  mkdir ("doesExist", 0777)
  creat ("doesExist/someFile", 0777)

By just issuing the readlink on (strace output):

  readlink(".../doesExist", 0x7fff32598060, 4095) = -1 EINVAL

We can't see that is a S_ISDIR to go for next iteration (and then fail on next
iteration since "../doesExist/someFile' is not a directory end *end != '\0').

> 
> In the Linux __realpath_system, the code should be prepared for the /proc syscall to fail because /proc is not mounted. It can fall back on __resolve_path in that case. The code should be prepared for the fstat64 to fail due to EOVERFLOW. And I'm not quite getting why the code is comparing fstat64's result to stat64's result -- perhaps explain this in a comment and how EOVERFLOW is dealt with?

Right, I though about the '/proc' failure and decided to fail based on 
on other implementations (fexecve fallback, fchmodat), but the 
__resolve_path should be ok to use in such case.

For fstat64, afaik EOVERFLOW is returned only when called non-LFS interface
which is not this case. Do you think re really should handle EOVERFLOW ?

The fstat64/stat64 comparison is just a small check to see if the
file was removed between calls.  Thinking twice I am not sure how
effective this really is, maybe remove it?
  
Adhemerval Zanella Aug. 11, 2020, 3:18 p.m. UTC | #4
On 11/08/2020 11:14, Adhemerval Zanella wrote:
> 
> 
> On 10/08/2020 18:25, Paul Eggert wrote:
>> On 8/10/20 1:48 PM, Adhemerval Zanella wrote:
>>
>>> Regarding syscalls usage, for a sucessful path without symlinks it
>>> trades 2 syscalls (getcwd/lstat) for 5 (openat, readlink, fstat, stat,
>>>   and close). 
>>
>> Thanks for looking into this.
>>
>> There's no need for the new __resolve_path function to call __lstat64. It does that only to determine whether the file is a symlink or a directory, and it can do the former with __readlink (which it's gonna do already) and the latter by going into the next iteration and seeing whether it gets NOTDIR. Avoiding __lstat64 means we don't have to worry about irrelevant EOVERFLOW hassles.
> 
> I am not sure we can skip the __lstat64 because we can't check if a subpath
> passed on __readlink is a directory or not (so we can move to next iteration
> in some cases).
> 
> For instance the test 30 from stdlib/test-canon.c which issues:
> 
>   realpath ("./doesExist/someFile/", ...)

And I just checked your patch on BZ#24970 and it shows the issues:

$ grep ^FAIL stdlib/subdir-tests.sum 
FAIL: stdlib/test-canon
$ cat stdlib/test-canon.out 
/home/azanella/Projects/glibc/build/x86_64-linux-gnu/stdlib/test-canon: flunked test 30 (expected `NULL', got `/home/azanella/Projects/glibc/build/x86_64-linux-gnu/stdlib/doesExist/someFile')
/home/azanella/Projects/glibc/build/x86_64-linux-gnu/stdlib/test-canon: flunked test 31 (expected `NULL', got `/home/azanella/Projects/glibc/build/x86_64-linux-gnu/stdlib/doesExist')
2 errors.

I couldn't figure out a way to have the expected realpath semantic by
just using readlink.
  
Paul Eggert Aug. 11, 2020, 3:52 p.m. UTC | #5
On 8/11/20 7:14 AM, Adhemerval Zanella wrote:

> For instance the test 30 from stdlib/test-canon.c which issues:
> 
>    realpath ("./doesExist/someFile/", ...)
> 
> On the directory with:
> 
>    mkdir ("doesExist", 0777)
>    creat ("doesExist/someFile", 0777)
> 
> By just issuing the readlink on (strace output):
> 
>    readlink(".../doesExist", 0x7fff32598060, 4095) = -1 EINVAL
> 
> We can't see that is a S_ISDIR to go for next iteration (and then fail on next
> iteration since "../doesExist/someFile' is not a directory end *end != '\0').

We don't need to test whether "doesExist" is a directory. The EINVAL tells us 
that it exists and is not a symlink. Therefore it is either a directory or a 
non-(symlink-or-directory). To discover whether it is a directory or a 
non-(symlink-or-directory), we go ahead with the next iteration:

     readlink ("doesExist/someFile", ...)

If this fails with ENOTDIR, then "doesExist" was a non-(symlink-or-directory); 
if it succeeds (or fails with EINVAL) doesExist was a directory; otherwise can 
simply fail with whatever errno it fails with.

> For fstat64, afaik EOVERFLOW is returned only when called non-LFS interface
> which is not this case. Do you think re really should handle EOVERFLOW ?

Ah, I hadn't thought of that. If EOVERFLOW is indeed impossible with fstat64 
then we needn't worry about EOVERFLOW.

> The fstat64/stat64 comparison is just a small check to see if the
> file was removed between calls.  Thinking twice I am not sure how
> effective this really is, maybe remove it?

Yes, let's remove that. realpath cannot possibly be atomic when it is issuing 
multiple syscalls, so there's no point to it trying to be "atomic" here.
  
Andreas Schwab Aug. 11, 2020, 4:46 p.m. UTC | #6
On Aug 10 2020, Paul Eggert wrote:

> In the Linux __realpath_system, the code should be prepared for the /proc
> syscall to fail because /proc is not mounted.

We already depend very much on /proc to be mounted, so this can be
ignored.

Andreas.
  
Adhemerval Zanella Aug. 11, 2020, 7:01 p.m. UTC | #7
On 11/08/2020 12:52, Paul Eggert wrote:
> On 8/11/20 7:14 AM, Adhemerval Zanella wrote:
> 
>> For instance the test 30 from stdlib/test-canon.c which issues:
>>
>>    realpath ("./doesExist/someFile/", ...)
>>
>> On the directory with:
>>
>>    mkdir ("doesExist", 0777)
>>    creat ("doesExist/someFile", 0777)
>>
>> By just issuing the readlink on (strace output):
>>
>>    readlink(".../doesExist", 0x7fff32598060, 4095) = -1 EINVAL
>>
>> We can't see that is a S_ISDIR to go for next iteration (and then fail on next
>> iteration since "../doesExist/someFile' is not a directory end *end != '\0').
> 
> We don't need to test whether "doesExist" is a directory. The EINVAL tells us that it exists and is not a symlink. Therefore it is either a directory or a non-(symlink-or-directory). To discover whether it is a directory or a non-(symlink-or-directory), we go ahead with the next iteration:
> 
>     readlink ("doesExist/someFile", ...)
> 
> If this fails with ENOTDIR, then "doesExist" was a non-(symlink-or-directory); if it succeeds (or fails with EINVAL) doesExist was a directory; otherwise can simply fail with whatever errno it fails with.

The issue seems that readlink does not fail with ENOTDIR if 'doesExist/someFile'
is indeed a file:

  readlink("/home/azanella/Projects/glibc/build/x86_64-linux-gnu/stdlib/doesExist/someFile", 0x7ffe2d3efb70, 4096) = -1 EINVAL (Invalid argument)

We need to check with a final '/':

  readlink("/home/azanella/Projects/glibc/build/x86_64-linux-gnu/stdlib/doesExist/someFile/", 0x7ffee1893aa0, 4096) = -1 ENOTDIR (Not a directory)

This is turn requires to change the loop to make this extra check with
readlink with expected path (by appending the '/' when required). There
is another issue which we also need to handle "./doesExist/someFile/..".

But I have spent too much time for an small optimization not directly
related to the Linux optimization.  If you could make either current
algorithm or the one on stdlib/canonicalize-internal.c no call lstat
I can add on series (I don't have plan to spend more time on this
readlink optimization).

> 
>> For fstat64, afaik EOVERFLOW is returned only when called non-LFS interface
>> which is not this case. Do you think re really should handle EOVERFLOW ?
> 
> Ah, I hadn't thought of that. If EOVERFLOW is indeed impossible with fstat64 then we needn't worry about EOVERFLOW.
> 
>> The fstat64/stat64 comparison is just a small check to see if the
>> file was removed between calls.  Thinking twice I am not sure how
>> effective this really is, maybe remove it?
> 
> Yes, let's remove that. realpath cannot possibly be atomic when it is issuing multiple syscalls, so there's no point to it trying to be "atomic" here.

Ack.
  
Dmitry V. Levin Aug. 17, 2020, 2 p.m. UTC | #8
On Tue, Aug 11, 2020 at 06:46:24PM +0200, Andreas Schwab wrote:
> On Aug 10 2020, Paul Eggert wrote:
> 
> > In the Linux __realpath_system, the code should be prepared for the /proc
> > syscall to fail because /proc is not mounted.
> 
> We already depend very much on /proc to be mounted, so this can be
> ignored.

Do we?
  
Andreas Schwab Aug. 17, 2020, 3:13 p.m. UTC | #9
On Aug 17 2020, Dmitry V. Levin wrote:

> On Tue, Aug 11, 2020 at 06:46:24PM +0200, Andreas Schwab wrote:
>> On Aug 10 2020, Paul Eggert wrote:
>> 
>> > In the Linux __realpath_system, the code should be prepared for the /proc
>> > syscall to fail because /proc is not mounted.
>> 
>> We already depend very much on /proc to be mounted, so this can be
>> ignored.
>
> Do we?

Yes, see <87blk9ry5d.fsf@oldenburg2.str.redhat.com> for example.

Andreas.
  
Dmitry V. Levin Aug. 17, 2020, 4:17 p.m. UTC | #10
On Mon, Aug 17, 2020 at 05:13:44PM +0200, Andreas Schwab wrote:
> On Aug 17 2020, Dmitry V. Levin wrote:
> > On Tue, Aug 11, 2020 at 06:46:24PM +0200, Andreas Schwab wrote:
> >> On Aug 10 2020, Paul Eggert wrote:
> >> 
> >> > In the Linux __realpath_system, the code should be prepared for the /proc
> >> > syscall to fail because /proc is not mounted.
> >> 
> >> We already depend very much on /proc to be mounted, so this can be
> >> ignored.
> >
> > Do we?
> 
> Yes, see <87blk9ry5d.fsf@oldenburg2.str.redhat.com> for example.

That's a regression, I filed a bug at
https://sourceware.org/bugzilla/show_bug.cgi?id=26401

Thanks,
  

Patch

diff --git a/include/stdlib.h b/include/stdlib.h
index ffcefd7b85..dd51f66b26 100644
--- a/include/stdlib.h
+++ b/include/stdlib.h
@@ -20,6 +20,14 @@ 
 
 # include <rtld-malloc.h>
 
+# ifndef PATH_MAX
+#  ifdef MAXPATHLEN
+#   define PATH_MAX MAXPATHLEN
+#  else
+#   define PATH_MAX 1024
+#  endif
+# endif
+
 extern __typeof (strtol_l) __strtol_l;
 extern __typeof (strtoul_l) __strtoul_l;
 extern __typeof (strtoll_l) __strtoll_l;
@@ -92,6 +100,11 @@  extern int __unsetenv (const char *__name) attribute_hidden;
 extern int __clearenv (void) attribute_hidden;
 extern char *__mktemp (char *__template) __THROW __nonnull ((1));
 extern char *__canonicalize_file_name (const char *__name);
+extern _Bool __resolve_path (const char *name, char *resolved,
+			     size_t path_max)
+     attribute_hidden;
+extern char *__realpath_system (const char *name, char *resolved)
+     attribute_hidden;
 extern char *__realpath (const char *__name, char *__resolved);
 libc_hidden_proto (__realpath)
 extern int __ptsname_r (int __fd, char *__buf, size_t __buflen)
diff --git a/stdlib/Makefile b/stdlib/Makefile
index 7093b8a584..35ca04541f 100644
--- a/stdlib/Makefile
+++ b/stdlib/Makefile
@@ -53,7 +53,7 @@  routines	:=							      \
 	strtof strtod strtold						      \
 	strtof_l strtod_l strtold_l					      \
 	strtof_nan strtod_nan strtold_nan				      \
-	system canonicalize						      \
+	system realpath canonicalize canonicalize-internal		      \
 	a64l l64a							      \
 	rpmatch strfmon strfmon_l getsubopt xpg_basename fmtmsg		      \
 	strtoimax strtoumax wcstoimax wcstoumax				      \
diff --git a/stdlib/canonicalize-internal.c b/stdlib/canonicalize-internal.c
new file mode 100644
index 0000000000..1b5f73a1cc
--- /dev/null
+++ b/stdlib/canonicalize-internal.c
@@ -0,0 +1,136 @@ 
+/* Internal function for canonicalize absolute name of a given file.
+   Copyright (C) 1996-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <stdbool.h>
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <unistd.h>
+#include <eloop-threshold.h>
+
+_Bool
+__resolve_path (const char *name, char *resolved, size_t path_max)
+{
+  const char *start, *end;
+  char *rpath = resolved;
+  char *rpath_limit = rpath + path_max;
+  char *dest = resolved;
+  char extra_buf[PATH_MAX];
+  int num_links = 0;
+
+  if (name[0] != '/')
+    {
+      if (__getcwd (rpath, path_max) == NULL)
+	{
+	  rpath[0] = '\0';
+	  return false;
+	}
+      dest = __rawmemchr (rpath, '\0');
+    }
+  else
+    {
+      rpath[0] = '/';
+      dest = rpath + 1;
+    }
+
+  for (start = end = name; *start; start = end)
+    {
+      /* Skip sequence of multiple path-separators.  */
+      while (*start == '/')
+	++start;
+
+      /* Find end of path component.  */
+      for (end = start; *end && *end != '/'; ++end)
+	/* Nothing.  */;
+
+      if (end - start == 0)
+	break;
+      else if (end - start == 1 && start[0] == '.')
+	/* nothing */;
+      else if (end - start == 2 && start[0] == '.' && start[1] == '.')
+	{
+	  /* Back up to previous component, ignore if at root already.  */
+	  if (dest > rpath + 1)
+	    while ((--dest)[-1] != '/');
+	}
+      else
+	{
+	  struct stat64 st;
+
+	  if (dest[-1] != '/')
+	    *dest++ = '/';
+	  if (dest + (end - start) >= rpath_limit)
+	    {
+	      __set_errno (ENAMETOOLONG);
+	      if (dest > rpath + 1)
+		dest--;
+	      *dest = '\0';
+	      return false;
+	    }
+
+	  dest = __mempcpy (dest, start, end - start);
+	  *dest = '\0';
+
+	  if (__lstat64 (rpath, &st) < 0)
+	    return false;
+
+	  if (S_ISLNK (st.st_mode))
+	    {
+	      if (++num_links > __eloop_threshold ())
+		{
+		  __set_errno (ELOOP);
+		  return false;
+		}
+
+	      char buf[PATH_MAX];
+	      ssize_t n = __readlink (rpath, buf, sizeof (buf) - 1);
+	      if (n < 0)
+		return false;
+	      buf[n] = '\0';
+
+	      size_t len = strlen (end);
+	      if (path_max - n <= len)
+		{
+		  __set_errno (ENAMETOOLONG);
+		  return false;
+		}
+
+	      memmove (&extra_buf[n], end, len + 1);
+	      name = end = memcpy (extra_buf, buf, n);
+
+	      if (buf[0] == '/')
+		dest = rpath + 1;	/* It's an absolute symlink */
+	      else
+		/* Back up to previous component, ignore if at root already: */
+		if (dest > rpath + 1)
+		  while ((--dest)[-1] != '/');
+	    }
+	  else if (!S_ISDIR (st.st_mode) && *end != '\0')
+	    {
+	      __set_errno (ENOTDIR);
+	      return false;
+	    }
+	}
+    }
+
+  if (dest > rpath + 1 && dest[-1] == '/')
+    --dest;
+  *dest = '\0';
+
+  return true;
+}
diff --git a/stdlib/canonicalize.c b/stdlib/canonicalize.c
index 91c30e38be..f4ab528a15 100644
--- a/stdlib/canonicalize.c
+++ b/stdlib/canonicalize.c
@@ -16,26 +16,11 @@ 
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <assert.h>
 #include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <limits.h>
-#include <sys/stat.h>
 #include <errno.h>
-#include <stddef.h>
 
-#include <eloop-threshold.h>
 #include <shlib-compat.h>
 
-#ifndef PATH_MAX
-# ifdef MAXPATHLEN
-#  define PATH_MAX MAXPATHLEN
-# else
-#  define PATH_MAX 1024
-# endif
-#endif
-
 /* Return the canonical absolute name of file NAME.  A canonical name
    does not contain any `.', `..' components nor any repeated path
    separators ('/') or symlinks.  All path components must exist.  If
@@ -50,10 +35,6 @@ 
 char *
 __realpath (const char *name, char *resolved)
 {
-  char *rpath, *dest, extra_buf[PATH_MAX];
-  const char *start, *end, *rpath_limit;
-  int num_links = 0;
-
   if (name == NULL)
     {
       /* As per Single Unix Specification V2 we must return an error if
@@ -72,127 +53,7 @@  __realpath (const char *name, char *resolved)
       return NULL;
     }
 
-  if (resolved == NULL)
-    {
-      rpath = malloc (PATH_MAX);
-      if (rpath == NULL)
-	return NULL;
-    }
-  else
-    rpath = resolved;
-  rpath_limit = rpath + PATH_MAX;
-
-  if (name[0] != '/')
-    {
-      if (!__getcwd (rpath, PATH_MAX))
-	{
-	  rpath[0] = '\0';
-	  goto error;
-	}
-      dest = __rawmemchr (rpath, '\0');
-    }
-  else
-    {
-      rpath[0] = '/';
-      dest = rpath + 1;
-    }
-
-  for (start = end = name; *start; start = end)
-    {
-      struct stat64 st;
-      int n;
-
-      /* Skip sequence of multiple path-separators.  */
-      while (*start == '/')
-	++start;
-
-      /* Find end of path component.  */
-      for (end = start; *end && *end != '/'; ++end)
-	/* Nothing.  */;
-
-      if (end - start == 0)
-	break;
-      else if (end - start == 1 && start[0] == '.')
-	/* nothing */;
-      else if (end - start == 2 && start[0] == '.' && start[1] == '.')
-	{
-	  /* Back up to previous component, ignore if at root already.  */
-	  if (dest > rpath + 1)
-	    while ((--dest)[-1] != '/');
-	}
-      else
-	{
-	  if (dest[-1] != '/')
-	    *dest++ = '/';
-
-	  if (dest + (end - start) >= rpath_limit)
-	    {
-	      __set_errno (ENAMETOOLONG);
-	      if (dest > rpath + 1)
-		dest--;
-	      *dest = '\0';
-	      goto error;
-	    }
-
-	  dest = __mempcpy (dest, start, end - start);
-	  *dest = '\0';
-
-	  if (__lxstat64 (_STAT_VER, rpath, &st) < 0)
-	    goto error;
-
-	  if (S_ISLNK (st.st_mode))
-	    {
-	      char buf[PATH_MAX];
-	      size_t len;
-
-	      if (++num_links > __eloop_threshold ())
-		{
-		  __set_errno (ELOOP);
-		  goto error;
-		}
-
-	      n = __readlink (rpath, buf, sizeof (buf) - 1);
-	      if (n < 0)
-		goto error;
-	      buf[n] = '\0';
-
-	      len = strlen (end);
-	      if (PATH_MAX - n <= len)
-		{
-		  __set_errno (ENAMETOOLONG);
-		  goto error;
-		}
-
-	      /* Careful here, end may be a pointer into extra_buf... */
-	      memmove (&extra_buf[n], end, len + 1);
-	      name = end = memcpy (extra_buf, buf, n);
-
-	      if (buf[0] == '/')
-		dest = rpath + 1;	/* It's an absolute symlink */
-	      else
-		/* Back up to previous component, ignore if at root already: */
-		if (dest > rpath + 1)
-		  while ((--dest)[-1] != '/');
-	    }
-	  else if (!S_ISDIR (st.st_mode) && *end != '\0')
-	    {
-	      __set_errno (ENOTDIR);
-	      goto error;
-	    }
-	}
-    }
-  if (dest > rpath + 1 && dest[-1] == '/')
-    --dest;
-  *dest = '\0';
-
-  assert (resolved == NULL || resolved == rpath);
-  return rpath;
-
-error:
-  assert (resolved == NULL || resolved == rpath);
-  if (resolved == NULL)
-    free (rpath);
-  return NULL;
+  return __realpath_system (name, resolved);
 }
 libc_hidden_def (__realpath)
 versioned_symbol (libc, __realpath, realpath, GLIBC_2_3);
diff --git a/stdlib/realpath.c b/stdlib/realpath.c
new file mode 100644
index 0000000000..1a70c658b7
--- /dev/null
+++ b/stdlib/realpath.c
@@ -0,0 +1,42 @@ 
+/* Return the canonical absolute name of a given file.
+   Copyright (C) 1996-2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <errno.h>
+#include <shlib-compat.h>
+
+char *
+__realpath_system (const char *name, char *resolved)
+{
+  bool resolved_malloc = false;
+  if (resolved == NULL)
+    {
+      resolved = malloc (PATH_MAX);
+      if (resolved == NULL)
+	return NULL;
+      resolved_malloc = true;
+    }
+
+  if (! __resolve_path (name, resolved, PATH_MAX))
+    {
+      if (resolved_malloc)
+      	free (resolved);
+      return NULL;
+    }
+  return resolved;
+}
diff --git a/sysdeps/unix/sysv/linux/realpath.c b/sysdeps/unix/sysv/linux/realpath.c
new file mode 100644
index 0000000000..4c69011f92
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/realpath.c
@@ -0,0 +1,65 @@ 
+/* Return the canonical absolute name of a given file.  Linux version.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <stdlib.h>
+#include <errno.h>
+#include <not-cancel.h>
+#include <shlib-compat.h>
+
+char *
+__realpath_system (const char *name, char *resolved)
+{
+  int fd = __open64_nocancel (name, O_PATH | O_NONBLOCK | O_CLOEXEC);
+  if (fd == -1)
+    {
+      /* If the call fails with either EACCES or ENOENT and resolved_path is
+	 not NULL, then the prefix of path that is not readable or does not
+	 exist is returned in resolved_path.  This is a GNU extension.  */
+      if (resolved != NULL)
+	__resolve_path (name, resolved, PATH_MAX);
+      return NULL;
+    }
+
+  char procname[sizeof ("/proc/self/fd/") + 3 * sizeof (int)];
+  *_fitoa_word (fd, __stpcpy (procname, "/proc/self/fd/"), 10, 0) = '\0';
+
+  char path[PATH_MAX];
+  ssize_t len = __readlink (procname, path, sizeof (path) - 1);
+  if (len < 0)
+    {
+      __close_nocancel_nostatus (fd);
+      return NULL;
+    }
+  path[len] = '\0';
+
+  struct stat64 st;
+  fstat64 (fd, &st);
+  dev_t st_dev = st.st_dev;
+  ino_t st_ino = st.st_ino;
+  int r = stat64 (path, &st);
+  if (r == -1 || st.st_dev != st_dev || st.st_ino != st_ino)
+    {
+      if (r == 0)
+	__set_errno (ELOOP);
+      __close_nocancel_nostatus (fd);
+      return NULL;
+    }
+
+  __close_nocancel_nostatus (fd);
+  return resolved != NULL ? strcpy (resolved, path) : __strdup (path);
+}