From patchwork Fri Jun 29 23:10:03 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Paul Eggert X-Patchwork-Id: 28166 Received: (qmail 78358 invoked by alias); 29 Jun 2018 23:10:10 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 78129 invoked by uid 89); 29 Jun 2018 23:10:09 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-24.6 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS autolearn=ham version=3.3.2 spammy=herring, Properties, mathematically X-HELO: zimbra.cs.ucla.edu Subject: Re: [PATCH] posix: Sync gnulib regex implementation To: Adhemerval Zanella , Florian Weimer , libc-alpha@sourceware.org References: <1530204275-31537-1-git-send-email-adhemerval.zanella@linaro.org> <135856c7-fee5-a5de-4ccb-12bf0b8fe166@linaro.org> <17feeb08-1478-21ae-e581-6ffecfddab5c@cs.ucla.edu> <50b3b084-4049-0ed8-bdf5-3a75547c0894@redhat.com> From: Paul Eggert Openpgp: preference=signencrypt Message-ID: <09fa88fe-87e5-2c98-43e7-2dfeffabffbf@cs.ucla.edu> Date: Fri, 29 Jun 2018 16:10:03 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: On 06/29/2018 09:00 AM, Adhemerval Zanella wrote: > Paul, could you check what I am missing on the overflow check so we can > move forward? I really don't want to get this sync stalled because of > intprops.h addition as a pre-requisite. I can add comments to  INT_ADD_WRAPV, if that would help. Please ask questions about it; that would help me understand what parts need better commenting. I'm afraid that Florian's comment about testability is a bit of a red herring, as intprops.h is an internal header, not a published one. Glibc typically doesn't have test cases for macros defined in internal include files, and there's no need to test intprops.h directly. Besides, intprops.h and INT_ADD_WRAPV have been tested widely in commonly-used GNU applications like Coreutils, so it's not like I'm proposing anything flaky here. That being said, this use of INT_ADD_WRAPV is unimportant (as far as I know, no Gnu apps use this two-buffer regex searching code any more), so I guess it's OK here to write a simple substitute that's good enough for regex even if it's not robust or general. The goal should be simplicity, portability, and meeting regex's needs. Proposed further Glibc patch attached (it assumes your patch). I've installed a similar patch into Gnulib. I really don't want to get into the habit of doing this sort of thing, though. There's a lot of hard-won integer overflow expertise in intprops.h and it's a waste of time to continue to maintain poor imitations of it. > I tried to check the pos-processed implementation using intprops.h for > the snippet ... And its results are pretty much unreadable. Trying to audit this by looking at the preprocessed output is a bit like auditing glibc by looking at the generated machine code. It can be done, but you can't expect it to be easy. For your particular compiler I expect that the assembly-language output is more readable than the preprocessor output, so if you want to look at low level stuff I suggest looking at assembly language. But really, you should be better off reading the original source (and if it needs more comments, please say where and why and l'll add them). From 614946f802c3a7c2b11be14f85d3ed80f5e7ab57 Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Fri, 29 Jun 2018 15:42:16 -0700 Subject: [PATCH] Merge better with Gnulib * posix/regcomp.c (TYPE_SIGNED): Remove; now done by regex_internal.h. * posix/regex_internal.h [_LIBC]: Do not include intprops.h. (TYPE_SIGNED, INT_ADD_WRAPV) [_LIBC]: New macros. * posix/regexec.c (check_add_overflow_idx): Remove. (re_search_2_stub): Use INT_ADD_WRAPV insead of check_add_overflow_idx. --- ChangeLog | 9 +++++++++ posix/regcomp.c | 4 ---- posix/regex_internal.h | 18 ++++++++++++++++++ posix/regexec.c | 18 +----------------- 4 files changed, 28 insertions(+), 21 deletions(-) diff --git a/ChangeLog b/ChangeLog index c51c651220..4e665873ff 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,12 @@ +2018-06-29 Paul Eggert + + Merge better with Gnulib + * posix/regcomp.c (TYPE_SIGNED): Remove; now done by regex_internal.h. + * posix/regex_internal.h [_LIBC]: Do not include intprops.h. + (TYPE_SIGNED, INT_ADD_WRAPV) [_LIBC]: New macros. + * posix/regexec.c (check_add_overflow_idx): Remove. + (re_search_2_stub): Use INT_ADD_WRAPV insead of check_add_overflow_idx. + 2018-06-29 Adhemerval Zanella [BZ #23233] diff --git a/posix/regcomp.c b/posix/regcomp.c index 6aaa54f66d..7b5ddaad0c 100644 --- a/posix/regcomp.c +++ b/posix/regcomp.c @@ -2650,10 +2650,6 @@ parse_dup_op (bin_tree_t *elem, re_string_t *regexp, re_dfa_t *dfa, if (BE (tree == NULL, 0)) goto parse_dup_op_espace; -/* From gnulib's "intprops.h": - True if the arithmetic type T is signed. */ -#define TYPE_SIGNED(t) (! ((t) 0 < (t) -1)) - /* This loop is actually executed only when end != -1, to rewrite {0,n} as ((...?)?)?... We have already created the start+1-th copy. */ diff --git a/posix/regex_internal.h b/posix/regex_internal.h index f38f6b3d80..3b836ed206 100644 --- a/posix/regex_internal.h +++ b/posix/regex_internal.h @@ -33,6 +33,24 @@ #include #include +/* Properties of integers. Although Gnulib has intprops.h, glibc does + without for now. */ +#ifndef _LIBC +# include "intprops.h" +#else +/* True if the real type T is signed. */ +# define TYPE_SIGNED(t) (! ((t) 0 < (t) -1)) + +/* True if adding the nonnegative Idx values A and B would overflow. + If false, set *R to A + B. A, B, and R may be evaluated more than + once, or zero times. Although this is not a full implementation of + Gnulib INT_ADD_WRAPV, it is good enough for glibc regex code. + FIXME: This implementation is a fragile stopgap, and this file would + be simpler and more robust if intprops.h were migrated into glibc. */ +# define INT_ADD_WRAPV(a, b, r) \ + (IDX_MAX - (a) < (b) ? true : (*(r) = (a) + (b), false)) +#endif + #ifdef _LIBC # include # define lock_define(name) __libc_lock_define (, name) diff --git a/posix/regexec.c b/posix/regexec.c index d64b025d12..63aef97535 100644 --- a/posix/regexec.c +++ b/posix/regexec.c @@ -317,22 +317,6 @@ re_search_2 (struct re_pattern_buffer *bufp, const char *string1, Idx length1, weak_alias (__re_search_2, re_search_2) #endif -/* Set *R = A + B. Return true if the answer is mathematically incorrect due - to overflow; in this case, *R is the low order bits of the correct - answer. */ -static inline bool -check_add_overflow_idx (Idx a, Idx b, Idx *r) -{ -#if __GNUC__ >= 5 - return __builtin_add_overflow (a, b, r); -#else - *r = a + b; - if (a < 0 && b < IDX_MAX -a) - return false; - return true; -#endif -} - static regoff_t re_search_2_stub (struct re_pattern_buffer *bufp, const char *string1, Idx length1, const char *string2, Idx length2, Idx start, @@ -345,7 +329,7 @@ re_search_2_stub (struct re_pattern_buffer *bufp, const char *string1, char *s = NULL; if (BE ((length1 < 0 || length2 < 0 || stop < 0 - || check_add_overflow_idx (length1, length2, &len)), + || INT_ADD_WRAPV (length1, length2, &len)), 0)) return -2; -- 2.17.1