[02/10] linux: Simplify opendir buffer allocation

Message ID 20200417132209.22065-2-adhemerval.zanella@linaro.org
State Superseded
Headers
Series [01/10] linux: Move posix dir implementations to Linux |

Commit Message

Adhemerval Zanella Netto April 17, 2020, 1:22 p.m. UTC
  THe fallback allocation is removed, so the possible size constraint
should be analized just once; __alloc_dir assumes that 'statp'
argument is non-null, and the max_buffer_size move to close its
used.

Checked on x86_64-linux-gnu and i686-linux-gnu.
---
 include/dirent.h                  |  3 +-
 sysdeps/unix/sysv/linux/opendir.c | 52 +++++++++++--------------------
 2 files changed, 21 insertions(+), 34 deletions(-)
  

Comments

Florian Weimer April 21, 2020, 10:28 a.m. UTC | #1
* Adhemerval Zanella via Libc-alpha:

> THe fallback allocation is removed, so the possible size constraint
> should be analized just once; __alloc_dir assumes that 'statp'
> argument is non-null, and the max_buffer_size move to close its
> used.

Type: analized

> +    return NULL;
> +
> +  /* The st_blksize value of the directory is used as a hint for the
> +     size of the buffer which receives struct dirent values from the
> +     kernel.  st_blksize is limited to max_buffer_size, in case the
> +     file system provides a bogus value.  */
> +  enum { max_buffer_size = 1U << 20 };
> +
> +  const size_t allocation_size = 4 * BUFSIZ;
> +  _Static_assert (allocation_size >= sizeof (struct dirent64),
> +		  "allocation_size < sizeof (struct dirent64)");
> +
>    /* Increase allocation if requested, but not if the value appears to
> +     be bogus.  It will be between 32Kb (for blocksizes smaller than BUFSIZ)
> +     up to 1Mb.  */
> +  size_t allocation = MIN (MAX ((size_t) statp->st_blksize, allocation_size),
> +			   max_buffer_size);
>  
>    DIR *dirp = (DIR *) malloc (sizeof (DIR) + allocation);
>    if (dirp == NULL)
>      {
> +      if (close_fd)
> +	__close_nocancel_nostatus (fd);
> +      return NULL;
>      }
>  
>    dirp->fd = fd;

This looks okay to me.  I wonder if we should just use 32 KiB
uncodintionally, though?
  
Rafal Luzynski April 23, 2020, 9:27 p.m. UTC | #2
21.04.2020 12:28 Florian Weimer <fw@deneb.enyo.de> wrote:
> 
> * Adhemerval Zanella via Libc-alpha:
> 
> > THe fallback allocation is removed, so the possible size constraint
> > should be analized just once; __alloc_dir assumes that 'statp'
> > argument is non-null, and the max_buffer_size move to close its
> > used.
> 
> Type: analized

True, also (which is maybe obvious but maybe not) "THe" -> "The".

Regards,

Rafal
  
Adhemerval Zanella Netto April 23, 2020, 9:39 p.m. UTC | #3
On 21/04/2020 07:28, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> THe fallback allocation is removed, so the possible size constraint
>> should be analized just once; __alloc_dir assumes that 'statp'
>> argument is non-null, and the max_buffer_size move to close its
>> used.
> 
> Type: analized
> 
>> +    return NULL;
>> +
>> +  /* The st_blksize value of the directory is used as a hint for the
>> +     size of the buffer which receives struct dirent values from the
>> +     kernel.  st_blksize is limited to max_buffer_size, in case the
>> +     file system provides a bogus value.  */
>> +  enum { max_buffer_size = 1U << 20 };
>> +
>> +  const size_t allocation_size = 4 * BUFSIZ;
>> +  _Static_assert (allocation_size >= sizeof (struct dirent64),
>> +		  "allocation_size < sizeof (struct dirent64)");
>> +
>>    /* Increase allocation if requested, but not if the value appears to
>> +     be bogus.  It will be between 32Kb (for blocksizes smaller than BUFSIZ)
>> +     up to 1Mb.  */
>> +  size_t allocation = MIN (MAX ((size_t) statp->st_blksize, allocation_size),
>> +			   max_buffer_size);
>>  
>>    DIR *dirp = (DIR *) malloc (sizeof (DIR) + allocation);
>>    if (dirp == NULL)
>>      {
>> +      if (close_fd)
>> +	__close_nocancel_nostatus (fd);
>> +      return NULL;
>>      }
>>  
>>    dirp->fd = fd;
> 
> This looks okay to me.  I wonder if we should just use 32 KiB
> uncodintionally, though?

Do you mean instead of '4 * BUFSIZ'?
  
Florian Weimer April 24, 2020, 10:11 a.m. UTC | #4
* Adhemerval Zanella via Libc-alpha:

>> This looks okay to me.  I wonder if we should just use 32 KiB
>> uncodintionally, though?
>
> Do you mean instead of '4 * BUFSIZ'?

Yes.  I assumed BUFSIZ was 8192.
  
Adhemerval Zanella Netto April 24, 2020, 12:08 p.m. UTC | #5
On 24/04/2020 07:11, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>>> This looks okay to me.  I wonder if we should just use 32 KiB
>>> uncodintionally, though?
>>
>> Do you mean instead of '4 * BUFSIZ'?
> 
> Yes.  I assumed BUFSIZ was 8192.
> 

Alright, I can make this change. Any opinion whether we should keep
32 KiB or increase/decrease it?
  
Florian Weimer April 24, 2020, 1:08 p.m. UTC | #6
* Adhemerval Zanella:

> On 24/04/2020 07:11, Florian Weimer wrote:
>> * Adhemerval Zanella via Libc-alpha:
>> 
>>>> This looks okay to me.  I wonder if we should just use 32 KiB
>>>> uncodintionally, though?
>>>
>>> Do you mean instead of '4 * BUFSIZ'?
>> 
>> Yes.  I assumed BUFSIZ was 8192.
>> 
>
> Alright, I can make this change. Any opinion whether we should keep
> 32 KiB or increase/decrease it?

Sorry, I think it should be a separate change.  Lowering the buffer
size will unfortunately expose application bugs when applications
modify the directory while iterating through it using readdir.  (musl
sees more such bugs due to its smaller buffer size.)
  

Patch

diff --git a/include/dirent.h b/include/dirent.h
index 2b1cdcf8bd..fdf4c4a2f1 100644
--- a/include/dirent.h
+++ b/include/dirent.h
@@ -48,7 +48,8 @@  extern int __versionsort64 (const struct dirent64 **a,
 			    const struct dirent64 **b)
      __attribute_pure__;
 extern DIR *__alloc_dir (int fd, bool close_fd, int flags,
-			 const struct stat64 *statp) attribute_hidden;
+			 const struct stat64 *statp)
+     __nonnull (4) attribute_hidden;
 extern __typeof (rewinddir) __rewinddir;
 extern __typeof (seekdir) __seekdir;
 extern __typeof (dirfd) __dirfd;
diff --git a/sysdeps/unix/sysv/linux/opendir.c b/sysdeps/unix/sysv/linux/opendir.c
index c6ab79246c..765c8104b3 100644
--- a/sysdeps/unix/sysv/linux/opendir.c
+++ b/sysdeps/unix/sysv/linux/opendir.c
@@ -23,12 +23,6 @@ 
 
 #include <not-cancel.h>
 
-/* The st_blksize value of the directory is used as a hint for the
-   size of the buffer which receives struct dirent values from the
-   kernel.  st_blksize is limited to MAX_DIR_BUFFER_SIZE, in case the
-   file system provides a bogus value.  */
-#define MAX_DIR_BUFFER_SIZE 1048576U
-
 enum {
   opendir_oflags = O_RDONLY|O_NDELAY|O_DIRECTORY|O_LARGEFILE|O_CLOEXEC
 };
@@ -100,38 +94,30 @@  __alloc_dir (int fd, bool close_fd, int flags, const struct stat64 *statp)
      file descriptor.  */
   if (!close_fd
       && __glibc_unlikely (__fcntl64_nocancel (fd, F_SETFD, FD_CLOEXEC) < 0))
-	goto lose;
-
-  const size_t default_allocation = (4 * BUFSIZ < sizeof (struct dirent64)
-				     ? sizeof (struct dirent64) : 4 * BUFSIZ);
-  const size_t small_allocation = (BUFSIZ < sizeof (struct dirent64)
-				   ? sizeof (struct dirent64) : BUFSIZ);
-  size_t allocation = default_allocation;
-#ifdef _STATBUF_ST_BLKSIZE
+    return NULL;
+
+  /* The st_blksize value of the directory is used as a hint for the
+     size of the buffer which receives struct dirent values from the
+     kernel.  st_blksize is limited to max_buffer_size, in case the
+     file system provides a bogus value.  */
+  enum { max_buffer_size = 1U << 20 };
+
+  const size_t allocation_size = 4 * BUFSIZ;
+  _Static_assert (allocation_size >= sizeof (struct dirent64),
+		  "allocation_size < sizeof (struct dirent64)");
+
   /* Increase allocation if requested, but not if the value appears to
-     be bogus.  */
-  if (statp != NULL)
-    allocation = MIN (MAX ((size_t) statp->st_blksize, default_allocation),
-		      MAX_DIR_BUFFER_SIZE);
-#endif
+     be bogus.  It will be between 32Kb (for blocksizes smaller than BUFSIZ)
+     up to 1Mb.  */
+  size_t allocation = MIN (MAX ((size_t) statp->st_blksize, allocation_size),
+			   max_buffer_size);
 
   DIR *dirp = (DIR *) malloc (sizeof (DIR) + allocation);
   if (dirp == NULL)
     {
-      allocation = small_allocation;
-      dirp = (DIR *) malloc (sizeof (DIR) + allocation);
-
-      if (dirp == NULL)
-      lose:
-	{
-	  if (close_fd)
-	    {
-	      int save_errno = errno;
-	      __close_nocancel_nostatus (fd);
-	      __set_errno (save_errno);
-	    }
-	  return NULL;
-	}
+      if (close_fd)
+	__close_nocancel_nostatus (fd);
+      return NULL;
     }
 
   dirp->fd = fd;