[07/17] Regex: Additional memory management checks.
Commit Message
This patch adds several small memory management safety checks.
The last one is particularly important.
2017-11-27 Arnold D. Robbins <arnold@skeeve.com>
* posix/regcomp.c (analyze): Additional memory management
safety checks.
* posix/regexec.c (re_search_internal): Ditto.
* posix/regex_internal.c (re_node_set_alloc): Ditto.
Comments
On 12/08/2017 01:16 AM in
<https://sourceware.org/ml/libc-alpha/2017-12/msg00242.html> Arnold
Robbins wrote:
> + /* some malloc()-checkers don't like zero allocations */
> + if (preg->re_nsub > 0)
> dfa->subexp_map = re_malloc (int, preg->re_nsub);
> + else
> + dfa->subexp_map = NULL;
Which checkers are these? Can they be told that 'malloc (0)' is OK?
Since the code here is fine (i.e. it will work even if 'malloc (0)'
succeeds and returns NULL) it may be better to leave the code alone and
to fix the checkers instead.
> + * ADR: valgrind says size can be 0, which then doesn't
> + * free the block of size 0. Harumph. This seems
> + * to work ok, though.
> + */
> + if (size == 0)
> + {
> + memset(set, 0, sizeof(*set));
> + return REG_NOERROR;
> + }
> set->alloc = size;
> set->nelem = 0;
> set->elems = re_malloc (int, size);
For this, how about if we use the corresponding Gnulib fix instead? An
advantage of the Gnulib fix is that it doesn't introduce runtime
overhead when glibc is used. Here is a URL:
https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=bbf0d723ed2335add96bcc0f842885d8a5d8b6da
> diff --git a/posix/regexec.c b/posix/regexec.c
> index 2d2bc46..8573765 100644
> --- a/posix/regexec.c
> +++ b/posix/regexec.c
> @@ -605,7 +605,7 @@ re_search_internal (const regex_t *preg, const char *string, int length,
> nmatch -= extra_nmatch;
>
> /* Check if the DFA haven't been compiled. */
> - if (BE (preg->used == 0 || dfa->init_state == NULL
> + if (BE (preg->used == 0 || dfa == NULL || dfa->init_state == NULL
> || dfa->init_state_word == NULL || dfa->init_state_nl == NULL
> || dfa->init_state_begbuf == NULL, 0))
> return REG_NOMATCH;
Why is this change needed? I couldn't see a code path that would trigger
it. Obviously I'm missing something. I suggest mentioning the reason in
the comment.
Hi Paul.
Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 12/08/2017 01:16 AM in
> <https://sourceware.org/ml/libc-alpha/2017-12/msg00242.html> Arnold
> Robbins wrote:
> > + /* some malloc()-checkers don't like zero allocations */
>
> Which checkers are these?
Lord only knows. That change has been in gawk's regex for years and
years and I don't remember. So:
> Can they be told that 'malloc (0)' is OK?
Practically speaking, no.
> > + * ADR: valgrind says size can be 0, which then doesn't
> > + * free the block of size 0. Harumph. This seems
> > + * to work ok, though.
> > + */
> > + if (size == 0)
> > + {
> > + memset(set, 0, sizeof(*set));
> > + return REG_NOERROR;
> > + }
> > set->alloc = size;
> > set->nelem = 0;
> > set->elems = re_malloc (int, size);
>
> For this, how about if we use the corresponding Gnulib fix instead? An
> advantage of the Gnulib fix is that it doesn't introduce runtime
> overhead when glibc is used. Here is a URL:
I think that the runtime overhead is so small that it cannot be
measured. I don't want to pull in more gnulib m4 goop for this.
The GLIBC guys can do as they wish, of course. :-)
> > diff --git a/posix/regexec.c b/posix/regexec.c
> > index 2d2bc46..8573765 100644
> > --- a/posix/regexec.c
> > +++ b/posix/regexec.c
> > @@ -605,7 +605,7 @@ re_search_internal (const regex_t *preg, const char *string, int length,
> > nmatch -= extra_nmatch;
> >
> > /* Check if the DFA haven't been compiled. */
> > - if (BE (preg->used == 0 || dfa->init_state == NULL
> > + if (BE (preg->used == 0 || dfa == NULL || dfa->init_state == NULL
> > || dfa->init_state_word == NULL || dfa->init_state_nl == NULL
> > || dfa->init_state_begbuf == NULL, 0))
> > return REG_NOMATCH;
>
> Why is this change needed? I couldn't see a code path that would trigger
> it.
I managed once while doing some changes to cause dfa to be NULL. So
I added the check. I don't remember what I did.
Thanks,
Arnold
@@ -1157,7 +1157,11 @@ analyze (regex_t *preg)
|| dfa->eclosures == NULL, 0))
return REG_ESPACE;
+ /* some malloc()-checkers don't like zero allocations */
+ if (preg->re_nsub > 0)
dfa->subexp_map = re_malloc (int, preg->re_nsub);
+ else
+ dfa->subexp_map = NULL;
if (dfa->subexp_map != NULL)
{
int i;
@@ -958,6 +958,16 @@ static reg_errcode_t
__attribute_warn_unused_result__
re_node_set_alloc (re_node_set *set, int size)
{
+ /*
+ * ADR: valgrind says size can be 0, which then doesn't
+ * free the block of size 0. Harumph. This seems
+ * to work ok, though.
+ */
+ if (size == 0)
+ {
+ memset(set, 0, sizeof(*set));
+ return REG_NOERROR;
+ }
set->alloc = size;
set->nelem = 0;
set->elems = re_malloc (int, size);
@@ -605,7 +605,7 @@ re_search_internal (const regex_t *preg, const char *string, int length,
nmatch -= extra_nmatch;
/* Check if the DFA haven't been compiled. */
- if (BE (preg->used == 0 || dfa->init_state == NULL
+ if (BE (preg->used == 0 || dfa == NULL || dfa->init_state == NULL
|| dfa->init_state_word == NULL || dfa->init_state_nl == NULL
|| dfa->init_state_begbuf == NULL, 0))
return REG_NOMATCH;