From patchwork Thu Sep 11 23:19:53 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roland McGrath X-Patchwork-Id: 2777 Received: (qmail 6435 invoked by alias); 11 Sep 2014 23:19:59 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 6420 invoked by uid 89); 11 Sep 2014 23:19:59 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.2 required=5.0 tests=AWL, BAYES_00 autolearn=ham version=3.3.2 X-HELO: topped-with-meat.com MIME-Version: 1.0 From: Roland McGrath To: Konstantin Serebryany Cc: Will Newton , Andreas Schwab , GNU C Library Subject: Re: [PATCH] un-nest findidx() In-Reply-To: Konstantin Serebryany's message of Tuesday, 9 September 2014 20:03:04 -0700 References: <20140909215726.B0AE92C39D8@topped-with-meat.com> Message-Id: <20140911231953.941812C3979@topped-with-meat.com> Date: Thu, 11 Sep 2014 16:19:53 -0700 (PDT) X-CMAE-Score: 0 X-CMAE-Analysis: v=2.1 cv=SvUDtp+0 c=1 sm=1 tr=0 a=WkljmVdYkabdwxfqvArNOQ==:117 a=14OXPxybAAAA:8 a=qIT-BNRQUkwA:10 a=6PtDOmPyWx4A:10 a=Z6MIti7PxpgA:10 a=kj9zAlcOel0A:10 a=hOe2yjtxAAAA:8 a=pGLkceISAAAA:8 a=mDV3o1hIAAAA:8 a=Z13e-Ksw0J8LeKJ99RIA:9 a=CjuIK1q_8ugA:10 a=MSl-tDqOz04A:10 > Here it is, verified against fresh trunk. When I applied the patch there was some offset in one of the files, which suggests this wasn't a freshly-generated patch. I've fixed up various style nits and committed it for you. I'll describe what I changed in detail so that you'll know how to do things better next time. I'll append the patch as I committed it just so you can see easily. > 2014-09-09 Kostya Serebryany > > * locale/weight.h: add include guard. These need to be tabs, not leading spaces. Capitalize and punctuate sentences properly in log entries. > (findidx): un-nest, make it static inline, add parameters. "Un-nest" is not an established term. Don't use it. Describe exactly what you did. When you change the signature of a function, describe exactly what changed. (Thereafter it's OK to describe an exactly analogous change with "Likewise", and to describe corresponding changes to call sites with just "Add new parameters" or "Update callers".) > * posix/fnmatch_loop.c: include weightwc.h or weight.h depending on > WIDE_CHAR_VERSION. Define FINDIDX as findidxwc or findidx. When what you're touching is inside #if stuff, then identify it with [...] notation. > (FCT): change type of 'extra' to wint_t; [...] Mention a local as VARNAME, not 'varname'. The int32_t->wint_t change seems wholly unrelated to the rest of this change and really should have been separate. But it's harmless enough that I left it in just so we can move on already. I found the renaming to findidxwc and the macroification with FINDIDX to be unnecessary everywhere except for fnmatch.c, which is the only place that includes both versions. It is doing a lot of local macro hooey to include fnmatch_loop.c twice, so it seemed right just to add to that rather than doing something broader that requires touching other users. > +#ifndef _WEIGHT_H_ > +#define _WEIGHT_H_ It certainly doesn't matter in practice, but our style is to always define these to 1 rather than empty. > /* Find index of weight. */ > -auto inline int32_t > +static inline int32_t > __attribute ((always_inline)) > -findidx (const unsigned char **cpp, size_t len) This wasn't an error you introduced, but I fixed it since I saw it: we always use __attribute__ rather than __attribute; it can go on the end of the type line rather than having its own. > --- a/posix/regcomp.c > +++ b/posix/regcomp.c > @@ -3389,6 +3389,8 @@ parse_bracket_symbol (bracket_elem_t *elem, re_string_t *regexp, > return REG_NOERROR; > } > > +#include When an #include is at top level and not inside an #if condition that is easily shared, then it should always be at the top of the file rather than between functions. > --- a/posix/regex_internal.h > +++ b/posix/regex_internal.h > @@ -733,6 +733,8 @@ re_string_wchar_at (const re_string_t *pstr, int idx) > } > > # ifndef NOT_IN_libc > +# include Any nested # directive gets one additional space from its containing #if block: # ifndef NOT_IN_libc # include There was another case where you moved an '# if ...' block out of the context where that indentation was right, but failed to adjust it to be just '#if ...' (and correspondingly one space fewer in the directives nested inside). > --- a/posix/regexec.c > +++ b/posix/regexec.c > @@ -3749,6 +3749,8 @@ group_nodes_into_DFAstates (const re_dfa_t *dfa, const re_dfastate_t *state, > one collating element like '.', '[a-z]', opposite to the other nodes > can only accept one byte. */ > > +#include This one is not misplaced, because it's inside of [RE_ENABLE_I18N]. But it also needs to be inside #ifdef _LIBC to match its use inside the function below. > --- a/string/strcoll_l.c > +++ b/string/strcoll_l.c > @@ -146,13 +147,14 @@ get_next_seq_cached (coll_seq *seq, int nrules, int pass, > seq->idxnow = idxnow; > } > > +#include WEIGHT_H Belongs at top of file. What I've committed is below for reference on how I got all the nits the way I like them. Thanks, Roland 2014-09-11 Kostya Serebryany Roland McGrath * locale/weight.h: Add include guard. (findidx): Make static rather than auto; take new parameters TABLE, INDIRECT, and EXTRA instead of getting them as outer locals. * locale/weightwc.h: Likewise. * posix/fnmatch_loop.c (FCT): Change type of EXTRA from int32_t to wint_t. Don't include either header inside the function. Call FINDIDX rather than findidx, and pass new arguments. #undef FINDIDX at the end of the file. * posix/fnmatch.c [_LIBC]: #include and define FINDIDX before including fnmatch_loop.c for the non-wide version. [_LIBC] [HANDLE_MULTIBYTE]: #define findidx to findidxwc around #include , and define FINDIDX to findidxwc for the wide version. * posix/regcomp.c [_LIBC]: #include . (build_equiv_class) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx. * posix/regexec.c [RE_ENABLE_I18N] [_LIBC]: #include . [RE_ENABLE_I18N] (check_node_accept_bytes) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx. * posix/regex_internal.h [!NOT_IN_libc] [_LIBC]: #include . (re_string_elem_size_at): Don't #include it inside the function. Pass new arguments to findidx. * string/strcoll_l.c: #include WEIGHT_H at top level. (get_next_seq): Don't #include it inside the function. Pass new arguments to findidx. (get_next_seq_nocache): Likewise. * string/strxfrm_l.c: #include WEIGHT_H at top level. (STRXFRM): Don't #include it inside the function. Pass new arguments to findidx. --- a/locale/weight.h +++ b/locale/weight.h @@ -16,10 +16,15 @@ License along with the GNU C Library; if not, see . */ +#ifndef _WEIGHT_H_ +#define _WEIGHT_H_ 1 + /* Find index of weight. */ -auto inline int32_t -__attribute ((always_inline)) -findidx (const unsigned char **cpp, size_t len) +static inline int32_t __attribute__ ((always_inline)) +findidx (const int32_t *table, + const int32_t *indirect, + const unsigned char *extra, + const unsigned char **cpp, size_t len) { int_fast32_t i = table[*(*cpp)++]; const unsigned char *cp; @@ -130,3 +135,5 @@ findidx (const unsigned char **cpp, size_t len) /* NOTREACHED */ return 0x43219876; } + +#endif /* weight.h */ --- a/locale/weightwc.h +++ b/locale/weightwc.h @@ -16,10 +16,15 @@ License along with the GNU C Library; if not, see . */ +#ifndef _WEIGHTWC_H_ +#define _WEIGHTWC_H_ 1 + /* Find index of weight. */ -auto inline int32_t -__attribute ((always_inline)) -findidx (const wint_t **cpp, size_t len) +static inline int32_t __attribute__ ((always_inline)) +findidx (const int32_t *table, + const int32_t *indirect, + const wint_t *extra, + const wint_t **cpp, size_t len) { wint_t ch = *(*cpp)++; int32_t i = __collidx_table_lookup ((const char *) table, ch); @@ -109,3 +114,5 @@ findidx (const wint_t **cpp, size_t len) /* NOTREACHED */ return 0x43219876; } + +#endif /* weightwc.h */ --- a/posix/fnmatch.c +++ b/posix/fnmatch.c @@ -221,6 +221,8 @@ __wcschrnul (s, c) # define MEMCHR(S, C, N) memchr (S, C, N) # define STRCOLL(S1, S2) strcoll (S1, S2) # define WIDE_CHAR_VERSION 0 +# include +# define FINDIDX findidx # include "fnmatch_loop.c" @@ -246,6 +248,12 @@ __wcschrnul (s, c) # define MEMCHR(S, C, N) wmemchr (S, C, N) # define STRCOLL(S1, S2) wcscoll (S1, S2) # define WIDE_CHAR_VERSION 1 +/* Change the name the header defines so it doesn't conflict with + the version included above. */ +# define findidx findidxwc +# include +# undef findidx +# define FINDIDX findidxwc # undef IS_CHAR_CLASS /* We have to convert the wide character string in a multibyte string. But --- a/posix/fnmatch_loop.c +++ b/posix/fnmatch_loop.c @@ -376,7 +376,7 @@ FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used) const int32_t *table; # if WIDE_CHAR_VERSION const int32_t *weights; - const int32_t *extra; + const wint_t *extra; # else const unsigned char *weights; const unsigned char *extra; @@ -385,19 +385,12 @@ FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used) int32_t idx; const UCHAR *cp = (const UCHAR *) str; - /* This #include defines a local function! */ -# if WIDE_CHAR_VERSION -# include -# else -# include -# endif - # if WIDE_CHAR_VERSION table = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_TABLEWC); weights = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_WEIGHTWC); - extra = (const int32_t *) + extra = (const wint_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_EXTRAWC); indirect = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTWC); @@ -412,7 +405,7 @@ FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); # endif - idx = findidx (&cp, 1); + idx = FINDIDX (table, indirect, extra, &cp, 1); if (idx != 0) { /* We found a table entry. Now see whether the @@ -422,7 +415,8 @@ FCT (pattern, string, string_end, no_leading_period, flags, ends, alloca_used) int32_t idx2; const UCHAR *np = (const UCHAR *) n; - idx2 = findidx (&np, string_end - n); + idx2 = FINDIDX (table, indirect, extra, + &np, string_end - n); if (idx2 != 0 && (idx >> 24) == (idx2 >> 24) && len == weights[idx2 & 0xffffff]) @@ -1277,3 +1271,4 @@ EXT (INT opt, const CHAR *pattern, const CHAR *string, const CHAR *string_end, #undef L #undef BTOWC #undef WIDE_CHAR_VERSION +#undef FINDIDX --- a/posix/regcomp.c +++ b/posix/regcomp.c @@ -19,6 +19,10 @@ #include +#ifdef _LIBC +# include +#endif + static reg_errcode_t re_compile_internal (regex_t *preg, const char * pattern, size_t length, reg_syntax_t syntax); static void re_compile_fastmap_iter (regex_t *bufp, @@ -3426,8 +3430,6 @@ build_equiv_class (bitset_t sbcset, const unsigned char *name) int32_t idx1, idx2; unsigned int ch; size_t len; - /* This #include defines a local function! */ -# include /* Calculate the index for equivalence class. */ cp = name; table = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_TABLEMB); @@ -3437,7 +3439,7 @@ build_equiv_class (bitset_t sbcset, const unsigned char *name) _NL_COLLATE_EXTRAMB); indirect = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); - idx1 = findidx (&cp, -1); + idx1 = findidx (table, indirect, extra, &cp, -1); if (BE (idx1 == 0 || *cp != '\0', 0)) /* This isn't a valid character. */ return REG_ECOLLATE; @@ -3448,7 +3450,7 @@ build_equiv_class (bitset_t sbcset, const unsigned char *name) { char_buf[0] = ch; cp = char_buf; - idx2 = findidx (&cp, 1); + idx2 = findidx (table, indirect, extra, &cp, 1); /* idx2 = table[ch]; */ --- a/posix/regex_internal.h +++ b/posix/regex_internal.h @@ -733,6 +733,10 @@ re_string_wchar_at (const re_string_t *pstr, int idx) } # ifndef NOT_IN_libc +# ifdef _LIBC +# include +# endif + static int internal_function __attribute__ ((pure, unused)) re_string_elem_size_at (const re_string_t *pstr, int idx) @@ -740,7 +744,6 @@ re_string_elem_size_at (const re_string_t *pstr, int idx) # ifdef _LIBC const unsigned char *p, *extra; const int32_t *table, *indirect; -# include uint_fast32_t nrules = _NL_CURRENT_WORD (LC_COLLATE, _NL_COLLATE_NRULES); if (nrules != 0) @@ -751,7 +754,7 @@ re_string_elem_size_at (const re_string_t *pstr, int idx) indirect = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); p = pstr->mbs + idx; - findidx (&p, pstr->len - idx); + findidx (table, indirect, extra, &p, pstr->len - idx); return p - pstr->mbs - idx; } else --- a/posix/regexec.c +++ b/posix/regexec.c @@ -3749,6 +3749,10 @@ group_nodes_into_DFAstates (const re_dfa_t *dfa, const re_dfastate_t *state, one collating element like '.', '[a-z]', opposite to the other nodes can only accept one byte. */ +# ifdef _LIBC +# include +# endif + static int internal_function check_node_accept_bytes (const re_dfa_t *dfa, int node_idx, @@ -3868,8 +3872,6 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx, const int32_t *table, *indirect; const unsigned char *weights, *extra; const char *collseqwc; - /* This #include defines a local function! */ -# include /* match with collating_symbol? */ if (cset->ncoll_syms) @@ -3925,7 +3927,7 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx, _NL_CURRENT (LC_COLLATE, _NL_COLLATE_EXTRAMB); indirect = (const int32_t *) _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB); - int32_t idx = findidx (&cp, elem_len); + int32_t idx = findidx (table, indirect, extra, &cp, elem_len); if (idx > 0) for (i = 0; i < cset->nequiv_classes; ++i) { --- a/string/strcoll_l.c +++ b/string/strcoll_l.c @@ -41,6 +41,7 @@ #define CONCAT1(a,b) a##b #include "../locale/localeinfo.h" +#include WEIGHT_H /* Track status while looking for sequences in a string. */ typedef struct @@ -152,7 +153,6 @@ get_next_seq (coll_seq *seq, int nrules, const unsigned char *rulesets, const USTRING_TYPE *weights, const int32_t *table, const USTRING_TYPE *extra, const int32_t *indirect) { -#include WEIGHT_H size_t val = seq->val = 0; int len = seq->len; size_t backw_stop = seq->backw_stop; @@ -194,7 +194,7 @@ get_next_seq (coll_seq *seq, int nrules, const unsigned char *rulesets, while (*us != L('\0')) { - int32_t tmp = findidx (&us, -1); + int32_t tmp = findidx (table, indirect, extra, &us, -1); rulearr[idxmax] = tmp >> 24; idxarr[idxmax] = tmp & 0xffffff; idxcnt = idxmax++; @@ -242,7 +242,6 @@ get_next_seq_nocache (coll_seq *seq, int nrules, const unsigned char *rulesets, const USTRING_TYPE *extra, const int32_t *indirect, int pass) { -#include WEIGHT_H size_t val = seq->val = 0; int len = seq->len; size_t backw_stop = seq->backw_stop; @@ -285,7 +284,7 @@ get_next_seq_nocache (coll_seq *seq, int nrules, const unsigned char *rulesets, us = seq->back_us; while (i < backw) { - int32_t tmp = findidx (&us, -1); + int32_t tmp = findidx (table, indirect, extra, &us, -1); idx = tmp & 0xffffff; i++; } @@ -300,7 +299,7 @@ get_next_seq_nocache (coll_seq *seq, int nrules, const unsigned char *rulesets, while (*us != L('\0')) { - int32_t tmp = findidx (&us, -1); + int32_t tmp = findidx (table, indirect, extra, &us, -1); unsigned char rule = tmp >> 24; prev_idx = idx; idx = tmp & 0xffffff; --- a/string/strxfrm_l.c +++ b/string/strxfrm_l.c @@ -41,6 +41,7 @@ #define CONCAT1(a,b) a##b #include "../locale/localeinfo.h" +#include WEIGHT_H #ifndef WIDE_CHAR_VERSION @@ -104,8 +105,6 @@ STRXFRM (STRING_TYPE *dest, const STRING_TYPE *src, size_t n, __locale_t l) size_t idxcnt; int use_malloc; -#include WEIGHT_H - if (nrules == 0) { if (n != 0) @@ -174,7 +173,7 @@ STRXFRM (STRING_TYPE *dest, const STRING_TYPE *src, size_t n, __locale_t l) idxmax = 0; do { - int32_t tmp = findidx (&usrc, -1); + int32_t tmp = findidx (table, indirect, extra, &usrc, -1); rulearr[idxmax] = tmp >> 24; idxarr[idxmax] = tmp & 0xffffff;