[0/8] Remove alloca usage from glob

Message ID 20210105185820.3796657-1-adhemerval.zanella@linaro.org
Headers
Series Remove alloca usage from glob |

Message

Adhemerval Zanella Netto Jan. 5, 2021, 6:58 p.m. UTC
  This is a respin on an old patchset [1] which is no longer on patchwork.
The main change is just a rebase against master since glibc version is
now on sync with gnulib.

I am sending it also to gnulib to get feedback from the project,
specifically for the part not specific for the system support by glibc
and extra requirements.

The idea of removing the alloca allows a slight better code generation,
simplifies the boilerplate code to avoid the unbounded alloca usage,
and it plays better with security compiler mitigation tools (such as the
ones for stack clash).

On x86_64 with gcc 9.2.1 build glob for shared object, the total stack
usage went from:

glob.c:261:1:next_brace_sub     8       static
glob.c:1205:1:prefix_array      96      static
glob.c:1185:1:collated_compare  8       static
glob.c:250:1:is_dir.isra.0      160     static
glob.c:1259:1:glob_in_dir       864     dynamic
glob.c:297:1:__glob     1360    dynamic

to

glob.c:231:1:next_brace_sub     8       static
dynarray-skeleton.c:203:1:globnames_array_free  32      static
glob.c:1131:1:prefix_array      96      static
glob.c:1111:1:collated_compare  8       static
dynarray-skeleton.c:275:1:globnames_array_add__ 48      static
dynarray-skeleton.c:373:1:char_array_resize.part.0      48      static
char_array-skeleton.c:274:1:char_array_replace_str_pos  48      static
char_array-skeleton.c:204:1:char_array_crop     32      static
char_array-skeleton.c:240:1:char_array_append_str       48      static
char_array-skeleton.c:118:1:char_array_set_str_size     32      static
char_array-skeleton.c:161:1:char_array_init_str 32      static
glob.c:220:1:is_dir.isra.0      160     static
glob.c:1200:1:glob_in_dir       1088    static
glob.c:267:1:__glob     1824    static

(it was obtained with -fstack-usage gcc option).

It is a slight increase and due GNU extension where glob might be called
recursivelly internally the total stack usage might be still slight
larger than current implementation.  But I still think it is a worth
tradeoff.

[1] https://sourceware.org/pipermail/libc-alpha/2018-February/091326.html

Adhemerval Zanella (8):
  malloc: Add specialized dynarray for C strings
  posix: Use char_array for internal glob dirname
  posix: Remove alloca usage for GLOB_BRACE on glob
  posix: Remove alloca usage on glob dirname
  posix: Use dynarray for globname in glob
  posix: Remove alloca usage on glob user_name
  posix: Use char_array for home_dir in glob
  posix: Remove all alloca usage in glob

 malloc/Makefile                    |   4 +-
 malloc/Versions                    |   7 +
 malloc/char_array-impl.c           |  57 +++
 malloc/char_array-skeleton.c       | 288 +++++++++++++
 malloc/char_array.h                |  53 +++
 malloc/dynarray.h                  |   9 +
 malloc/dynarray_overflow_failure.c |  31 ++
 malloc/malloc-internal.h           |  14 +
 malloc/tst-char_array.c            | 112 ++++++
 posix/glob.c                       | 622 +++++++++++------------------
 10 files changed, 817 insertions(+), 380 deletions(-)
 create mode 100644 malloc/char_array-impl.c
 create mode 100644 malloc/char_array-skeleton.c
 create mode 100644 malloc/char_array.h
 create mode 100644 malloc/dynarray_overflow_failure.c
 create mode 100644 malloc/tst-char_array.c
  

Comments

Paul Eggert Jan. 13, 2021, 7:36 p.m. UTC | #1
On 1/5/21 10:58 AM, Adhemerval Zanella wrote:
> The idea of removing the alloca allows a slight better code generation,
> simplifies the boilerplate code to avoid the unbounded alloca usage,
> and it plays better with security compiler mitigation tools (such as the
> ones for stack clash).

Instead of complicating dynarray by adding char_array stuff, I suggest 
adding any primitives just to glob for now, as it's not clear that 
they're generally useful.

I do see a problem with the proposed patch set, in that it creates 
several different dynarrays when glob really needs only one or two. The 
existing code is full of memory-allocation gotchas and would be 
simplified if it treated the memory it allocates as a first-class part 
of the problem rather than as some sort of cranky subsidiary that needs 
to be babied and its diaper changed at random intervals.

Attached is a draft set of patches against Gnulib commit 
6a00fdb4bb105697aa27ba97ef7ec33287790ad3 which gives a hint about the 
sort of thing that I mean here. This patch set is not complete (it does 
only the equivalent of the first four of the patches you proposed) and 
it's not exactly the form that I want, so I haven't installed it into 
Gnulib. However, I hope it shows the sort of thing I have in mind. So 
far, it's needed only one scratch buffer.

In this patch set, glob.c continues to use scratch_buffer.h because 
scratch buffers are good enough for all the changes needed so far. I 
suppose dynarrays will be helpful for later patches and that we can 
switch to them as needed, but I wanted to focus on the actual problem 
first rather than worrying about scratch buffers vs dynarrays.