wcsmbs: Avoid escaped character literals in <wchar.h>

Message ID 87blpxcrdx.fsf@mid.deneb.enyo.de
State Committed

Commit Message

Florian Weimer Feb. 17, 2020, 12:47 p.m. UTC
  * Andreas Schwab:

> On Feb 17 2020, Florian Weimer wrote:
>
>> They confuse scripts/conformtest.py because it treats the L and the
>
> s/scripts/conform/
>
>> x7f as namespace-violating identifiers.
>
> Can the script be fixed not to do that?

Like this?

A more elaborate alternative would be to use Zack's C tokenizer in the
conform tests, but I don't know if its feature set is aligned with
what we need in the conform tests.

Subject: Add wide and character literal support to conform/conformtest.py

Without this change, tokens such as L'\x7f' are recognized as the
identifiers L and x7f, which are not in the implementation namespace and
therefore trigger failures.
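
To make the failure mode concrete, here is a minimal sketch that mirrors the literal-stripping and token-splitting steps visible in the patch context. The helper name `identifiers` and the sample line are hypothetical and only serve the illustration; they are not taken from conform/conformtest.py.

```python
import re

# Pattern used before the patch: strips only double-quoted string literals.
OLD_LITERAL = r'"[^"]*"'
# Pattern from the patch: also strips character literals and an optional
# L prefix in front of either kind of literal.
NEW_LITERAL = r'(?:\bL)?(?:"[^"]*"|\'[^\']*\')'


def identifiers(line, literal_pattern):
    """Return identifier-like tokens left after removing literals.

    Hypothetical helper: it follows the same two steps as the loop in
    the patch context (re.sub, then re.split and re.match).
    """
    line = re.sub(literal_pattern, '', line).strip()
    return [token for token in re.split(r'[^A-Za-z0-9_]+', line)
            if re.match(r'[A-Za-z_]', token)]


# Hypothetical line with wide character literals, for illustration only.
sample = r"return __c >= L'\0' && __c <= L'\x7f';"

print(identifiers(sample, OLD_LITERAL))
# ['return', '__c', 'L', '__c', 'L', 'x7f']
print(identifiers(sample, NEW_LITERAL))
# ['return', '__c', '__c']
```

With the old pattern, L and x7f show up as separate tokens and would be checked against the allowed namespace. With the new pattern, the whole literal is removed before splitting.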
  

Comments

Andreas Schwab Feb. 17, 2020, 4:06 p.m. UTC | #1
On Feb 17 2020, Florian Weimer wrote:

> Subject: Add wide and character literal support to conform/conformtest.py
>
> Without this change, tokens such as L'\x7f' are recognized as the
> identifiers L and x7f, which are not in the implementation namespace and
> therefore trigger failures.
>
> diff --git a/conform/conformtest.py b/conform/conformtest.py
> index 951e3b2420..3bdc2a8e57 100644
> --- a/conform/conformtest.py
> +++ b/conform/conformtest.py
> @@ -638,7 +638,7 @@ class HeaderTests(object):
>                  # constants, and hex floats may be wrongly split into
>                  # tokens including identifiers, but this is sufficient
>                  # in practice and matches the old perl script.
> -                line = re.sub(r'"[^"]*"', '', line)
> +                line = re.sub(r'(?:\bL)?(?:"[^"]*"|\'[^\']*\')', '', line)
>                  line = line.strip()
>                  for token in re.split(r'[^A-Za-z0-9_]+', line):
>                      if re.match(r'[A-Za-z_]', token):

Ok if you update the comment.

Andreas.
  

Patch

diff --git a/conform/conformtest.py b/conform/conformtest.py
index 951e3b2420..3bdc2a8e57 100644
--- a/conform/conformtest.py
+++ b/conform/conformtest.py
@@ -638,7 +638,7 @@  class HeaderTests(object):
                 # constants, and hex floats may be wrongly split into
                 # tokens including identifiers, but this is sufficient
                 # in practice and matches the old perl script.
-                line = re.sub(r'"[^"]*"', '', line)
+                line = re.sub(r'(?:\bL)?(?:"[^"]*"|\'[^\']*\')', '', line)
                 line = line.strip()
                 for token in re.split(r'[^A-Za-z0-9_]+', line):
                     if re.match(r'[A-Za-z_]', token):
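
The surrounding comment already notes that this tokenization is approximate. As a rough sketch of where the new pattern still falls short, the following applies it to a few hypothetical input lines. The helper `identifiers` and all sample lines are assumptions made for illustration, not part of the test suite.

```python
import re

# Literal-stripping pattern from the patch.
LITERAL = r'(?:\bL)?(?:"[^"]*"|\'[^\']*\')'


def identifiers(line):
    """Hypothetical helper mirroring the strip/split/match steps above."""
    line = re.sub(LITERAL, '', line).strip()
    return [token for token in re.split(r'[^A-Za-z0-9_]+', line)
            if re.match(r'[A-Za-z_]', token)]


# Wide string literals are removed together with their L prefix.
print(identifiers('const wchar_t *s = L"abc";'))
# ['const', 'wchar_t', 's']

# Other prefixes (here the C11 u prefix) are not covered, so the prefix
# is still reported as an identifier.
print(identifiers("char16_t c = u'x';"))
# ['char16_t', 'c', 'u']

# Escaped quotes are not understood: the pattern pairs up the wrong
# quote characters and swallows the identifier between the literals.
print(identifiers(r"'\'' + bad_name + '\''"))
# []
```

Whether these cases matter depends on what the installed headers actually contain. A full C tokenizer would handle them, at the cost of more machinery.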