[07/17] Regex: Additional memory management checks.

Message ID 201712080916.vB89GxcF005503@skeeve.com
State New, archived
Headers

Commit Message

Arnold Robbins Dec. 8, 2017, 9:16 a.m. UTC
  This patch adds several small memory management safety checks.
The last one is particularly important.

2017-11-27         Arnold D. Robbins     <arnold@skeeve.com>

	* posix/regcomp.c (analyze): Additional memory management
	safety checks.
	* posix/regexec.c (re_search_internal): Ditto.
	* posix/regex_internal.c (re_node_set_alloc): Ditto.
  

Comments

Paul Eggert Dec. 20, 2017, 1:04 a.m. UTC | #1
On 12/08/2017 01:16 AM in 
<https://sourceware.org/ml/libc-alpha/2017-12/msg00242.html> Arnold 
Robbins wrote:
> +  /* some malloc()-checkers don't like zero allocations */
> +  if (preg->re_nsub > 0)
>     dfa->subexp_map = re_malloc (int, preg->re_nsub);
> +  else
> +    dfa->subexp_map = NULL;

Which checkers are these? Can they be told that 'malloc (0)' is OK? 
Since the code here is fine (i.e. it will work even if 'malloc (0)' 
succeeds and returns NULL) it may be better to leave the code alone and 
to fix the checkers instead.

> +   * ADR: valgrind says size can be 0, which then doesn't
> +   * free the block of size 0.  Harumph. This seems
> +   * to work ok, though.
> +   */
> +  if (size == 0)
> +    {
> +       memset(set, 0, sizeof(*set));
> +       return REG_NOERROR;
> +    }
>     set->alloc = size;
>     set->nelem = 0;
>     set->elems = re_malloc (int, size);

For this, how about if we use the corresponding Gnulib fix instead? An 
advantage of the Gnulib fix is that it doesn't introduce runtime 
overhead when glibc is used. Here is a URL:

https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=bbf0d723ed2335add96bcc0f842885d8a5d8b6da

> diff --git a/posix/regexec.c b/posix/regexec.c
> index 2d2bc46..8573765 100644
> --- a/posix/regexec.c
> +++ b/posix/regexec.c
> @@ -605,7 +605,7 @@ re_search_internal (const regex_t *preg, const char *string, int length,
>     nmatch -= extra_nmatch;
>   
>     /* Check if the DFA haven't been compiled.  */
> -  if (BE (preg->used == 0 || dfa->init_state == NULL
> +  if (BE (preg->used == 0 || dfa == NULL || dfa->init_state == NULL
>   	  || dfa->init_state_word == NULL || dfa->init_state_nl == NULL
>   	  || dfa->init_state_begbuf == NULL, 0))
>       return REG_NOMATCH;

Why is this change needed? I couldn't see a code path that would trigger 
it. Obviously I'm missing something. I suggest mentioning the reason in 
the comment.
  
Arnold Robbins Dec. 20, 2017, 6:58 p.m. UTC | #2
Hi Paul.

Paul Eggert <eggert@cs.ucla.edu> wrote:

> On 12/08/2017 01:16 AM in 
> <https://sourceware.org/ml/libc-alpha/2017-12/msg00242.html> Arnold 
> Robbins wrote:
> > +  /* some malloc()-checkers don't like zero allocations */
>
> Which checkers are these?

Lord only knows. That change has been in gawk's regex for years and
years and I don't remember. So:

> Can they be told that 'malloc (0)' is OK? 

Practically speaking, no.

> > +   * ADR: valgrind says size can be 0, which then doesn't
> > +   * free the block of size 0.  Harumph. This seems
> > +   * to work ok, though.
> > +   */
> > +  if (size == 0)
> > +    {
> > +       memset(set, 0, sizeof(*set));
> > +       return REG_NOERROR;
> > +    }
> >     set->alloc = size;
> >     set->nelem = 0;
> >     set->elems = re_malloc (int, size);
>
> For this, how about if we use the corresponding Gnulib fix instead? An 
> advantage of the Gnulib fix is that it doesn't introduce runtime 
> overhead when glibc is used. Here is a URL:

I think that the runtime overhead is so small that it cannot be
measured.  I don't want to pull in more gnulib m4 goop for this.
The GLIBC guys can do as they wish, of course. :-)

> > diff --git a/posix/regexec.c b/posix/regexec.c
> > index 2d2bc46..8573765 100644
> > --- a/posix/regexec.c
> > +++ b/posix/regexec.c
> > @@ -605,7 +605,7 @@ re_search_internal (const regex_t *preg, const char *string, int length,
> >     nmatch -= extra_nmatch;
> >   
> >     /* Check if the DFA haven't been compiled.  */
> > -  if (BE (preg->used == 0 || dfa->init_state == NULL
> > +  if (BE (preg->used == 0 || dfa == NULL || dfa->init_state == NULL
> >   	  || dfa->init_state_word == NULL || dfa->init_state_nl == NULL
> >   	  || dfa->init_state_begbuf == NULL, 0))
> >       return REG_NOMATCH;
>
> Why is this change needed? I couldn't see a code path that would trigger 
> it.

I managed once while doing some changes to cause dfa to be NULL. So
I added the check.  I don't remember what I did.

Thanks,

Arnold
  

Patch

diff --git a/posix/regcomp.c b/posix/regcomp.c
index c1fd23b..83fcc40 100644
--- a/posix/regcomp.c
+++ b/posix/regcomp.c
@@ -1157,7 +1157,11 @@  analyze (regex_t *preg)
 	  || dfa->eclosures == NULL, 0))
     return REG_ESPACE;
 
+  /* some malloc()-checkers don't like zero allocations */
+  if (preg->re_nsub > 0)
   dfa->subexp_map = re_malloc (int, preg->re_nsub);
+  else
+    dfa->subexp_map = NULL;
   if (dfa->subexp_map != NULL)
     {
       int i;
diff --git a/posix/regex_internal.c b/posix/regex_internal.c
index 506ccad..968fd77 100644
--- a/posix/regex_internal.c
+++ b/posix/regex_internal.c
@@ -958,6 +958,16 @@  static reg_errcode_t
 __attribute_warn_unused_result__
 re_node_set_alloc (re_node_set *set, int size)
 {
+  /*
+   * ADR: valgrind says size can be 0, which then doesn't
+   * free the block of size 0.  Harumph. This seems
+   * to work ok, though.
+   */
+  if (size == 0)
+    {
+       memset(set, 0, sizeof(*set));
+       return REG_NOERROR;
+    }
   set->alloc = size;
   set->nelem = 0;
   set->elems = re_malloc (int, size);
diff --git a/posix/regexec.c b/posix/regexec.c
index 2d2bc46..8573765 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -605,7 +605,7 @@  re_search_internal (const regex_t *preg, const char *string, int length,
   nmatch -= extra_nmatch;
 
   /* Check if the DFA haven't been compiled.  */
-  if (BE (preg->used == 0 || dfa->init_state == NULL
+  if (BE (preg->used == 0 || dfa == NULL || dfa->init_state == NULL
 	  || dfa->init_state_word == NULL || dfa->init_state_nl == NULL
 	  || dfa->init_state_begbuf == NULL, 0))
     return REG_NOMATCH;