From patchwork Mon Jun 10 21:30:17 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shawn Landden
X-Patchwork-Id: 33075
From: Shawn Landden
To: gdb-patches@sourceware.org
Cc: jhb@freebsd.org, eliz@gnu.org, Shawn Landden
Subject: [PATCH v2] Fix slow and non-deterministic behavior of isspace() and tolower()
Date: Mon, 10 Jun 2019 16:30:17 -0500
Message-Id: <20190610213017.2021-1-shawn@git.icu>
In-Reply-To: <20190609151704.16061-1-shawn@git.icu>
References: <20190609151704.16061-1-shawn@git.icu>

I was getting 8% and 6% cpu usage in tolower() and isspace(),
respectively, waiting for a breakpoint
on ppc64el.

Also, gdb doesn't want non-deterministic behavior here.

v2: do not clash with C99 names
---
 gdb/utils.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/gdb/utils.c b/gdb/utils.c
index 9686927473..0b68fabe4d 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -2626,10 +2626,29 @@ strcmp_iw (const char *string1, const char *string2)
    user searches for "foo", then strcmp will sort "foo" before "foo$".
    Then lookup_partial_symbol will notice that strcmp_iw("foo$",
    "foo") is false, so it won't proceed to the actual match of
    "foo(int)" with "foo".  */
 
+/* glibc versions of these have non-deterministic locale-dependent behavior,
+   and are very slow, taking 8% and 6% of total CPU time with some use-cases */
+
+static inline int isspace_inline(int c)
+{
+  return c == ' ' || (unsigned)c-'\t' < 5;
+}
+
+static inline int isupper_inline(int c)
+{
+  return (unsigned)c-'A' < 26;
+}
+
+static inline int tolower_inline(int c)
+{
+  if (isupper_inline(c)) return c | 32;
+  return c;
+}
+
 int
 strcmp_iw_ordered (const char *string1, const char *string2)
 {
   const char *saved_string1 = string1, *saved_string2 = string2;
   enum case_sensitivity case_pass = case_sensitive_off;
@@ -2641,20 +2660,20 @@ strcmp_iw_ordered (const char *string1, const char *string2)
      strings.  */
   char c1 = 'X', c2 = 'X';
 
   while (*string1 != '\0' && *string2 != '\0')
     {
-      while (isspace (*string1))
+      while (isspace_inline (*string1))
         string1++;
-      while (isspace (*string2))
+      while (isspace_inline (*string2))
         string2++;
 
       switch (case_pass)
       {
         case case_sensitive_off:
-          c1 = tolower ((unsigned char) *string1);
-          c2 = tolower ((unsigned char) *string2);
+          c1 = tolower_inline ((unsigned char) *string1);
+          c2 = tolower_inline ((unsigned char) *string2);
           break;
         case case_sensitive_on:
           c1 = *string1;
           c2 = *string2;
           break;