[15/17] Regex: Diagnose ERE "()|\1".

Message ID 201712080919.vB89JOMM005649@skeeve.com
State New, archived

Commit Message

Arnold Robbins Dec. 8, 2017, 9:19 a.m. UTC
  This patch fixes a bug where the invalid back-reference in the
ERE '()|\1' was not diagnosed.  See http://bugs.gnu.org/21513.
I pulled this in from Gnulib.

2017-11-30         Paul Eggert  <eggert@cs.ucla.edu>

	Diagnose ERE '()|\1'
	Problem reported by Hanno Boeck in: http://bugs.gnu.org/21513

	* posix/regcomp.c (parse_reg_exp): While parsing alternatives, keep
	track of the set of previously-completed subexpressions available
	before the first alternative, and restore this set just before
	parsing each subsequent alternative.  This lets us diagnose the
	invalid back-reference in the ERE '()|\1'.
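
The save/restore idea described in the ChangeLog entry can be sketched
outside of regcomp.c.  The following is a toy model only, not the glibc
code: it validates a tiny ERE subset (literals, groups, '|' and \1..\9),
and every name in it (toy_parser, toy_branch, toy_alt,
toy_has_bad_bkref) is hypothetical.  The field "completed" plays the
role that dfa->completed_bkref_map plays in the patch.

```c
/* Toy model (assumption: not glibc code).  Tracks which groups are
   closed, and flags a back-reference to a group that is not available
   at that point in the pattern.  */
typedef struct
{
  const char *p;       /* cursor into the pattern */
  unsigned completed;  /* bit i set => group i+1 is closed and usable */
  int next_group;      /* number of groups opened so far */
  int bad_bkref;       /* set to 1 on an invalid back-reference */
} toy_parser;

static void toy_alt (toy_parser *s, int nest);

/* Parse one branch: everything up to '|', end of string, or the ')'
   that closes the enclosing group.  */
static void
toy_branch (toy_parser *s, int nest)
{
  while (*s->p && *s->p != '|' && !(nest > 0 && *s->p == ')'))
    {
      if (*s->p == '(')
        {
          int idx = s->next_group++;
          s->p++;
          toy_alt (s, nest + 1);
          if (*s->p == ')')
            s->p++;
          if (idx < 9)
            s->completed |= 1u << idx;  /* group is now complete */
        }
      else if (*s->p == '\\' && s->p[1] >= '1' && s->p[1] <= '9')
        {
          int n = s->p[1] - '1';
          if (!(s->completed & (1u << n)))
            s->bad_bkref = 1;           /* refers to an unavailable group */
          s->p += 2;
        }
      else if (*s->p == '\\' && s->p[1])
        s->p += 2;                      /* other escape: skip both chars */
      else
        s->p++;                         /* ordinary character */
    }
}

/* Parse alternatives.  Each alternative after the first starts from
   the set of groups that was available before the first one; groups
   completed in earlier alternatives are merged back in afterwards.  */
static void
toy_alt (toy_parser *s, int nest)
{
  unsigned initial = s->completed;
  toy_branch (s, nest);
  while (*s->p == '|')
    {
      unsigned accumulated = s->completed;
      s->p++;
      s->completed = initial;
      toy_branch (s, nest);
      s->completed |= accumulated;
    }
}

/* Return 1 if PATTERN contains an invalid back-reference, else 0.  */
static int
toy_has_bad_bkref (const char *pattern)
{
  toy_parser s = { pattern, 0, 0, 0 };
  toy_alt (&s, 0);
  return s.bad_bkref;
}
```

In this model, toy_has_bad_bkref ("()|\\1") returns 1 because the second
alternative starts with an empty set of completed groups, while
toy_has_bad_bkref ("(a)(b|\\1)") returns 0 because group 1 was already
complete before the inner alternation began.  Without the reset to
"initial", the first case would be accepted.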
  

Patch

diff --git a/posix/regcomp.c b/posix/regcomp.c
index 8920cf1..e63c258 100644
--- a/posix/regcomp.c
+++ b/posix/regcomp.c
@@ -2157,6 +2157,7 @@  parse_reg_exp (re_string_t *regexp, regex_t *preg, re_token_t *token,
 {
   re_dfa_t *dfa = (re_dfa_t *) preg->buffer;
   bin_tree_t *tree, *branch = NULL;
+  bitset_word_t initial_bkref_map = dfa->completed_bkref_map;
   tree = parse_branch (regexp, preg, token, syntax, nest, err);
   if (BE (*err != REG_NOERROR && tree == NULL, 0))
     return NULL;
@@ -2167,6 +2168,8 @@  parse_reg_exp (re_string_t *regexp, regex_t *preg, re_token_t *token,
       if (token->type != OP_ALT && token->type != END_OF_RE
 	  && (nest == 0 || token->type != OP_CLOSE_SUBEXP))
 	{
+	  bitset_word_t accumulated_bkref_map = dfa->completed_bkref_map;
+	  dfa->completed_bkref_map = initial_bkref_map;
 	  branch = parse_branch (regexp, preg, token, syntax, nest, err);
 	  if (BE (*err != REG_NOERROR && branch == NULL, 0))
 	    {
@@ -2174,6 +2177,7 @@  parse_reg_exp (re_string_t *regexp, regex_t *preg, re_token_t *token,
 		postorder (tree, free_tree, NULL);
 	      return NULL;
 	    }
+	  dfa->completed_bkref_map |= accumulated_bkref_map;
 	}
       else
 	branch = NULL;