regex: fix buffer read overrun in search [BZ#28470]
Checks
Context | Check | Description
dj/TryBot-apply_patch | success | Patch applied to master at the time it was sent
dj/TryBot-32bit | success | Build for i686
Commit Message
Problem reported by Benno Schulenberg in:
https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* posix/regexec.c (re_search_internal): Use better bounds check.
---
posix/regexec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Oct 18 2021, Paul Eggert wrote:
> /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
> Note that MATCH_FIRST must not be smaller than 0. */
> - ch = (match_first >= length
> + ch = (mctx.input.valid_len <= offset
That needs to update the comment.
Andreas.
On 10/19/21 00:17, Andreas Schwab wrote:
> That needs to update the comment.
Thanks, revised patch attached.
On Oct 19 2021, Paul Eggert wrote:
> + ch = (mctx.input.valid_len <= offset
This is backwards.
Andreas.
On 10/19/21 01:25, Andreas Schwab wrote:
> On Oct 19 2021, Paul Eggert wrote:
>
>> + ch = (mctx.input.valid_len <= offset
>
> This is backwards.
It's correct as-is, so that comment is merely about style. I revamped
the patch to turn the comparison around; see attached. Let's not have
our longstanding style disagreement distract us from the fix.
On Oct 19 2021, Paul Eggert wrote:
> diff --git a/posix/regexec.c b/posix/regexec.c
> index 83e9aaf8ca..6aeba3c0b4 100644
> --- a/posix/regexec.c
> +++ b/posix/regexec.c
> @@ -758,10 +758,9 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
>
> offset = match_first - mctx.input.raw_mbs_idx;
> }
> - /* If MATCH_FIRST is out of the buffer, leave it as '\0'.
> - Note that MATCH_FIRST must not be smaller than 0. */
> - ch = (match_first >= length
> - ? 0 : re_string_byte_at (&mctx.input, offset));
> + /* Use buffer byte if OFFSET is in buffer, otherwise '\0'. */
> + ch = (offset < mctx.input.valid_len
> + ? re_string_byte_at (&mctx.input, offset) : 0);
Why is the bug not in re_string_reconstruct? Since string[match_first]
exists, so should re_string_byte_at (&mctx.input, offset).
Andreas.
On 10/19/21 08:09, Andreas Schwab wrote:
> Why is the bug not in re_string_reconstruct? Since string[match_first]
> exists, so should re_string_byte_at (&mctx.input, offset).
I don't know, as I lacked the time to investigate re_string_reconstruct.
Although the patch I proposed fixes the test case that prompted it,
possibly it is only a partial fix for a more-general problem.
There was no further comment, and the patch is safe and has been used in
Gnulib for some time, even if it does not necessarily fix the whole
underlying problem, so I installed it. Tests pass on x86-64.
On Nov 24 2021, Paul Eggert wrote:
> the patch is safe
Is it? Why?
Andreas.
On 11/24/21 14:45, Andreas Schwab wrote:
> Is it? Why?
Partly because it refuses to read past the bounds of an array, where the
old code would. And partly because it's been run through several tests
- not just glibc tests, but also grep and coreutils and probably some
others by now.
Of course this is not a 100% guarantee of safety, but it's close enough.
On Nov 24 2021, Paul Eggert wrote:
> On 11/24/21 14:45, Andreas Schwab wrote:
>> Is it? Why?
>
> Partly because it refuses to read past the bounds of an array, where the
> old code would.
That's just papering over a bug, not fixing it.
> And partly because it's been run through several tests - not just
> glibc tests, but also grep and coreutils and probably some others by
> now.
How much coverage do they provide?
Also, you failed to add a test.
Andreas.
On 11/25/21 01:01, Andreas Schwab wrote:
>> Partly because it refuses to read past the bounds of an array, where the
>> old code would.
>
> That's just papering over a bug, not fixing it.
That's not clear to me. Perhaps you're right, but perhaps it really does
fix the bug.
>> And partly because it's been run through several tests - not just
>> glibc tests, but also grep and coreutils and probably some others by
>> now.
>
> How much coverage do they provide?
Someone who has more time could presumably determine this by looking at
the respective test suites. I forgot to mention, Gnulib also has its own
regex tests (which also pass).
> Also, you failed to add a test.
Yes, that's correct. It would be nice if someone could do that. However,
it'd be some work and like you I'm pressed for time.
On Nov 26 2021, Paul Eggert wrote:
> On 11/25/21 01:01, Andreas Schwab wrote:
>
>>> Partly because it refuses to read past the bounds of an array, where the
>>> old code would.
>> That's just papering over a bug, not fixing it.
>
> That's not clear to me. Perhaps you're right, but perhaps it really does
> fix the bug.
That's why we need a proper test case. Not voodoo programming.
Andreas.
@@ -760,7 +760,7 @@ re_search_internal (const regex_t *preg, const char *string, Idx length,
}
/* If MATCH_FIRST is out of the buffer, leave it as '\0'.
Note that MATCH_FIRST must not be smaller than 0. */
- ch = (match_first >= length
+ ch = (mctx.input.valid_len <= offset
? 0 : re_string_byte_at (&mctx.input, offset));
if (fastmap[ch])
break;
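
The difference between the two checks in the hunk above can be sketched in isolation. The following is a minimal illustration only, not glibc code: `struct window`, `old_check_allows_read` and `guarded_byte_at` are hypothetical names, and the struct is a simplified stand-in for the matcher's input buffer. It assumes a state in which the window holds fewer valid bytes than remain in the full string; whether such a state should be reachable at all is the open question raised in the thread about re_string_reconstruct.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the matcher's input window.
   This is NOT glibc's re_string_t; it only models the fields needed
   to compare the two bounds checks.  */
struct window
{
  const unsigned char *bytes; /* Bytes currently held in the window.  */
  size_t start;               /* Index in the full string of bytes[0].  */
  size_t valid_len;           /* Number of valid bytes in the window.  */
};

/* Old-style check: compare the absolute index with the length of the
   full string.  It says nothing about whether the window actually
   holds that byte.  */
static int
old_check_allows_read (size_t match_first, size_t length)
{
  return match_first < length;
}

/* New-style check: compare the window-relative offset with the number
   of valid bytes, and fall back to 0 when the offset is outside.  */
static unsigned char
guarded_byte_at (const struct window *w, size_t offset)
{
  assert (w != NULL);
  return offset < w->valid_len ? w->bytes[offset] : 0;
}
```

For example, with a 10-byte string and a window that starts at index 4 and holds 3 valid bytes, match_first = 8 passes the old check (8 < 10), while the window-relative offset 8 - 4 = 4 is outside the 3 valid bytes, so the guarded read yields 0 instead of touching memory beyond the window.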