[v7,1/3] posix: regcomp(): clear RE_DOT_NOT_NULL

Message ID aeffa34acdcb8da840639ba67cf8984d2bd343d6.1686530834.git.nabijaczleweli@nabijaczleweli.xyz
State Superseded
Series [v7,1/3] posix: regcomp(): clear RE_DOT_NOT_NULL

Checks

Context                                           Check    Description
redhat-pt-bot/TryBot-apply_patch                  success  Patch applied to master at the time it was sent
linaro-tcwg-bot/tcwg_glibc_build--master-aarch64  success  Testing passed
linaro-tcwg-bot/tcwg_glibc_build--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-arm      success  Testing passed
linaro-tcwg-bot/tcwg_glibc_check--master-aarch64  success  Testing passed

Commit Message

Ahelenia Ziemiańska June 12, 2023, 12:47 a.m. UTC

The POSIX API always stops at first NUL so there's no change for that.

The BSD REG_STARTEND API, with its explicit range, can include NULs
within that range, and those NULs are matched with . and [^].

Heretofore, for a string of "a\0c", glibc would match "[^q]c", but not
".c". This is both inconsistent and nonconformant to BSD REG_STARTEND.

With this patch, they're identical like you'd expect, and the
  tst-reg-startend.c: ..c: a^@c: no match$
failure is removed.

Another approach would be to remove it from _RE_SYNTAX_POSIX_COMMON,
but it's unclear to me what the custody chain is like for that and what
other regex APIs glibc offers that could be affected by this.

Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
---
No changes (rebased cleanly); full resend.

 posix/regcomp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
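
The inconsistency described in the commit message can be reproduced with a small standalone helper. The sketch below is not part of the patch: `match_range` is a hypothetical name, and it assumes a libc that implements the REG_STARTEND extension (glibc and the BSDs do).

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of the patch: compile PATTERN as a POSIX
   basic regex and search the LEN bytes starting at BUF.  REG_STARTEND makes
   regexec() take the range from pm[0] instead of stopping at the first NUL.
   Returns 0 on a match, REG_NOMATCH on no match, or a regcomp() error code.  */
int
match_range (const char *pattern, const char *buf, size_t len)
{
  regex_t re;
  regmatch_t pm[1];
  int rc;

  assert (pattern != NULL && buf != NULL);

  rc = regcomp (&re, pattern, 0);
  if (rc != 0)
    return rc;

  /* With REG_STARTEND, pm[0] is an input: it delimits the bytes to search.  */
  pm[0].rm_so = 0;
  pm[0].rm_eo = (regoff_t) len;

  rc = regexec (&re, buf, 1, pm, REG_STARTEND);
  regfree (&re);
  return rc;
}
```

Per the description above, `match_range ("[^q]c", "a\0c", 3)` should return 0, while `match_range (".c", "a\0c", 3)` should return REG_NOMATCH on a glibc without this patch and 0 once RE_DOT_NOT_NULL is cleared.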
  

Comments

Carlos O'Donell June 12, 2023, 1:11 p.m. UTC | #1
On 6/11/23 20:47, наб wrote:
> The POSIX API always stops at first NUL so there's no change for that.
> 
> The BSD REG_STARTEND API, with its explicit range, can include NULs
> within that range, and those NULs are matched with . and [^].
> 
> Heretofore, for a string of "a\0c", glibc would match "[^q]c", but not
> ".c". This is both inconsistent and nonconformant to BSD REG_STARTEND.
> 
> With this patch, they're identical like you'd expect, and the
>   tst-reg-startend.c: ..c: a^@c: no match$
> failure is removed.
> 
> Another approach would be to remove it from _RE_SYNTAX_POSIX_COMMON,
> but it's unclear to me what the custody chain is like for that and what
> other regex APIs glibc offers that could be affected by this.
> 
> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>

These changes are being made to sources shared between gnulib and glibc.

As the files are listed in SHARED-SOURCES we cannot easily accept changes to them
since they should be shared with gnulib.

Would you be willing to disclaim these changes or assign copyright?

> ---
> No changes (rebased cleanly); full resend.
> 
>  posix/regcomp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/posix/regcomp.c b/posix/regcomp.c
> index 12650714c0..a928ef6c2d 100644
> --- a/posix/regcomp.c
> +++ b/posix/regcomp.c
> @@ -462,7 +462,7 @@ regcomp (regex_t *__restrict preg, const char *__restrict pattern, int cflags)
>  {
>    reg_errcode_t ret;
>    reg_syntax_t syntax = ((cflags & REG_EXTENDED) ? RE_SYNTAX_POSIX_EXTENDED
> -			 : RE_SYNTAX_POSIX_BASIC);
> +			 : RE_SYNTAX_POSIX_BASIC) & ~RE_DOT_NOT_NULL;
>  
>    preg->buffer = NULL;
>    preg->allocated = 0;
  

Patch

diff --git a/posix/regcomp.c b/posix/regcomp.c
index 12650714c0..a928ef6c2d 100644
--- a/posix/regcomp.c
+++ b/posix/regcomp.c
@@ -462,7 +462,7 @@ regcomp (regex_t *__restrict preg, const char *__restrict pattern, int cflags)
 {
   reg_errcode_t ret;
   reg_syntax_t syntax = ((cflags & REG_EXTENDED) ? RE_SYNTAX_POSIX_EXTENDED
-			 : RE_SYNTAX_POSIX_BASIC);
+			 : RE_SYNTAX_POSIX_BASIC) & ~RE_DOT_NOT_NULL;
 
   preg->buffer = NULL;
   preg->allocated = 0;
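
For readers unfamiliar with the GNU syntax bits, the one-line change can be isolated as below. This is a sketch, not glibc source: `posix_syntax_for` is a hypothetical name, and it assumes glibc's <regex.h>, which exposes RE_SYNTAX_POSIX_BASIC, RE_SYNTAX_POSIX_EXTENDED and RE_DOT_NOT_NULL only when _GNU_SOURCE is defined.

```c
#define _GNU_SOURCE   /* glibc only exposes the RE_* syntax bits with this */
#include <assert.h>
#include <regex.h>

/* Hypothetical helper mirroring the patched expression in regcomp():
   pick the POSIX basic or extended syntax from CFLAGS, then clear
   RE_DOT_NOT_NULL so that '.' is allowed to match a NUL byte.  */
reg_syntax_t
posix_syntax_for (int cflags)
{
  reg_syntax_t syntax = ((cflags & REG_EXTENDED) ? RE_SYNTAX_POSIX_EXTENDED
                         : RE_SYNTAX_POSIX_BASIC) & ~RE_DOT_NOT_NULL;

  /* Only the one bit is removed; every other syntax bit is left as it was.  */
  assert ((syntax & RE_DOT_NOT_NULL) == 0);
  return syntax;
}
```

Both stock POSIX syntaxes carry RE_DOT_NOT_NULL through _RE_SYNTAX_POSIX_COMMON, so the mask affects basic and extended patterns alike without touching the shared definition.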