glob: Avoid copying the d_name field of struct dirent [BZ #19779]

  On 03/11/2016 11:27 PM, Roland McGrath wrote:
>> --- a/posix/bug-glob2.c
>> +++ b/posix/bug-glob2.c
> [...]
>> @@ -75,7 +87,7 @@ typedef struct
>>    int level;
>>    int idx;
>>    struct dirent d;
>> -  char room_for_dirent[NAME_MAX];
>> +  char room_for_dirent[1000];
>>  } my_DIR;
> 
> There should be some comment about where this arbitrary size comes from.

I changed it to a sizeof.

>> +/* A representation of a directory entry which does not depend on the
>> +   layout of struct dirent, or the size of ino_t.  */
>> +struct abstract_dirent
>> +{
>> +  const char *d_name;
>> +  int d_type;
>> +  bool skip_entry;
>> +};
> 
> I'm not convinced that this is a good name for the struct.  It's not an
> abstract form of 'struct dirent', because the name member points into some
> other buffer whose lifetime is not intrinsically related to this struct.

It's now struct readdir_result.

> uint8_t is big enough for d_type, and will make the struct smaller on
> ILP32.

I assume the struct is turned into scalars anyway, but I added the
__typeof__.

>> +/* Extract name and type from directory entry.  No copy of the name is
>> +   made.  */
>> +static void
>> +convert_dirent (const struct dirent *source, struct abstract_dirent *target)
> 
> I like Paul's suggestion of using a struct-returning function here.  The
> comment should be explicit that the result will point to SOURCE->d_name and
> so SOURCE must be kept live.

I made these changes.

>> +#if defined _DIRENT_HAVE_D_TYPE || defined HAVE_STRUCT_DIRENT_D_TYPE
>> +  target->d_type = source->d_type;
>> +#else
>> +  target->d_type = 0;
>> +#endif
>> +#if (defined POSIX || defined WINDOWS32) && !defined __GNU_LIBRARY__
>> +  /* Posix does not require that the d_ino field be present, and some
>> +     systems do not provide it. */
>> +  target->skip_entry = false;
>> +#else
>> +  target->skip_entry = source->d_ino == 0;
>> +#endif
> 
> For both of these I'd prefer to see an inline or macro that encapsulates
> the #if stuff without other repetition, e.g.:

There are now separate macros for that.

> 	target->type = D_TYPE (source);
> 	target->skip_entry = D_SKIP_ME (source);
> 
>> +	      struct abstract_dirent e;
>> +	      {
>> +		struct dirent *d = (__builtin_expect (flags & GLOB_ALTDIRFUNC, 0)
>> +				    ? ((struct dirent *)
>> +				       (*pglob->gl_readdir) (stream))
>> +				    : __readdir (stream));
> 
> Convert to __glibc_unlikely while you're here (unless the gnulib sharing
> rules say not to, not clear on that off hand).

>> +		if (d == NULL)
>> +		  break;
>> +		convert_dirent (d, &e);
>> +	      }
> 
> This could all be nicely broken out into a function that takes STREAM,
> FLAGS, PGLOB, and returns E.

It's slightly more complicated than that because I accidentally removed
the interesting parts.

Further cleanups should wait until we have a decision whether we will
ever try to synchronize with gnulib again.

>> -		      len = NAMLEN (d);
>> -		      names->name[cur] = (char *) malloc (len + 1);
>> +		      names->name[cur] = strdup (e.d_name);
>>  		      if (names->name[cur] == NULL)
>>  			goto memory_error;
>> -		      *((char *) mempcpy (names->name[cur++], name, len))
>> -			= '\0';
>> +		      ++cur;
> 
> In the _DIRENT_HAVE_D_NAMLEN case, this assumes that strdup is as efficient
> as malloc+mempcpy (with no strlen required).  Perhaps it's close enough,
> but that is a subtle change you didn't mention as intended.

I want to avoid conditionalized code because these things tend to break
after a while (or never work in the first place).

I have tested the attached patch with fake-disabled d_type support, and
on i386-redhat-linux-gnu and x86_64-redhat-linux-gnu.

Florian

glob: Avoid copying the d_name field of struct dirent [BZ #19779]

Commit Message

Comments

Patch