From patchwork Mon Jun 10 21:30:17 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shawn Landden <shawn@git.icu>
X-Patchwork-Id: 33075
Received: (qmail 18342 invoked by alias); 10 Jun 2019 21:30:55 -0000
Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <gdb-patches.sourceware.org>
List-Unsubscribe: <mailto:gdb-patches-unsubscribe-##L=##H@sourceware.org>
List-Subscribe: <mailto:gdb-patches-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/gdb-patches/>
List-Post: <mailto:gdb-patches@sourceware.org>
List-Help: <mailto:gdb-patches-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: gdb-patches-owner@sourceware.org
Delivered-To: mailing list gdb-patches@sourceware.org
Received: (qmail 18173 invoked by uid 89); 10 Jun 2019 21:30:42 -0000
Authentication-Results: sourceware.org; auth=none
X-Spam-SWARE-Status: No, score=-25.4 required=5.0 tests=AWL, BAYES_00,
	FREEMAIL_FROM, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2,
	GIT_PATCH_3, RCVD_IN_DNSWL_NONE, RCVD_IN_SORBS_WEB,
	SPF_PASS autolearn=ham version=3.3.1 spammy=H*F:D*icu,
	H*Ad:D*icu, HContent-Transfer-Encoding:8bit
X-HELO: mail-vk1-f196.google.com
Received: from mail-vk1-f196.google.com (HELO mail-vk1-f196.google.com)
	(209.85.221.196) by sourceware.org
	(qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Mon, 10 Jun 2019 21:30:40 +0000
Received: by mail-vk1-f196.google.com with SMTP id b69so2073187vkb.3 for
	<gdb-patches@sourceware.org>; Mon, 10 Jun 2019 14:30:25 -0700 (PDT)
Return-Path: <slandden@gmail.com>
Received: from shawn-ThinkPad-L420.hitronhub.home ([190.233.49.202]) by
	smtp.gmail.com with ESMTPSA id
	185sm5543258vst.8.2019.06.10.14.30.22 (version=TLS1_3
	cipher=AEAD-AES256-GCM-SHA384 bits=256/256);
	Mon, 10 Jun 2019 14:30:23 -0700 (PDT)
From: Shawn Landden <shawn@git.icu>
To: gdb-patches@sourceware.org
Cc: jhb@freebsd.org,	eliz@gnu.org,	Shawn Landden <shawn@git.icu>
Subject: [PATCH v2] Fix slow and non-deterministic behavior of isspace() and
	tolower()
Date: Mon, 10 Jun 2019 16:30:17 -0500
Message-Id: <20190610213017.2021-1-shawn@git.icu>
In-Reply-To: <20190609151704.16061-1-shawn@git.icu>
References: <20190609151704.16061-1-shawn@git.icu>
MIME-Version: 1.0

While waiting for a breakpoint on ppc64el, I was seeing 8% and 6% of
CPU time spent in tolower() and isspace(), respectively.

Also, gdb does not want locale-dependent, and therefore
non-deterministic, behavior here.

v2: do not clash with C99 names
---
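Not part of the patch: a minimal sketch of how the ASCII-only helpers
could be sanity-checked against the ctype functions in the "C" locale.
The helper bodies mirror the ones added to gdb/utils.c, on the assumption
that tolower_inline is meant to call isupper_inline. count_mismatches is
a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>

/* ASCII-only whitespace test: ' ' plus '\t', '\n', '\v', '\f', '\r'.
   The unsigned subtraction wraps for values below '\t', so a single
   comparison covers the whole 9..13 range.  */
static inline int isspace_inline (int c)
{
  return c == ' ' || (unsigned) c - '\t' < 5;
}

/* ASCII-only uppercase test, using the same wrap-around trick.  */
static inline int isupper_inline (int c)
{
  return (unsigned) c - 'A' < 26;
}

/* ASCII-only lowercase conversion: setting bit 5 maps 'A'..'Z' to
   'a'..'z'.  Assumption: this uses isupper_inline, not isupper.  */
static inline int tolower_inline (int c)
{
  return isupper_inline (c) ? (c | 32) : c;
}

/* Hypothetical checker: count the disagreements between the helpers
   and the "C"-locale ctype functions over every unsigned char value.  */
static int count_mismatches (void)
{
  int mismatches = 0;

  setlocale (LC_ALL, "C");
  for (int c = 0; c < 256; c++)
    {
      if (!!isspace (c) != isspace_inline (c))
        mismatches++;
      if (!!isupper (c) != isupper_inline (c))
        mismatches++;
      if (tolower (c) != tolower_inline (c))
        mismatches++;
    }
  return mismatches;
}
```

Calling count_mismatches () from a small main and checking that it
returns 0 confirms that the helpers agree with the "C" locale. Under any
other locale they intentionally differ, which is the point of the change.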
 gdb/utils.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/gdb/utils.c b/gdb/utils.c
index 9686927473..0b68fabe4d 100644
--- a/gdb/utils.c
+++ b/gdb/utils.c
@@ -2626,10 +2626,29 @@ strcmp_iw (const char *string1, const char *string2)
    user searches for "foo", then strcmp will sort "foo" before "foo$".
    Then lookup_partial_symbol will notice that strcmp_iw("foo$",
    "foo") is false, so it won't proceed to the actual match of
    "foo(int)" with "foo".  */
 
+/* glibc versions of these have non-deterministic locale-dependent behavior,
+   and are very slow, taking 8% and 6% of total CPU time in some use cases.  */
+
+static inline int isspace_inline(int c)
+{
+  return c == ' ' || (unsigned)c-'\t' < 5;
+}
+
+static inline int isupper_inline(int c)
+{
+  return (unsigned)c-'A' < 26;
+}
+
+static inline int tolower_inline(int c)
+{
+  if (isupper_inline(c)) return c | 32;
+  return c;
+}
+
 int
 strcmp_iw_ordered (const char *string1, const char *string2)
 {
   const char *saved_string1 = string1, *saved_string2 = string2;
   enum case_sensitivity case_pass = case_sensitive_off;
@@ -2641,20 +2660,20 @@ strcmp_iw_ordered (const char *string1, const char *string2)
 	 strings.  */
       char c1 = 'X', c2 = 'X';
 
       while (*string1 != '\0' && *string2 != '\0')
 	{
-	  while (isspace (*string1))
+	  while (isspace_inline (*string1))
 	    string1++;
-	  while (isspace (*string2))
+	  while (isspace_inline (*string2))
 	    string2++;
 
 	  switch (case_pass)
 	  {
 	    case case_sensitive_off:
-	      c1 = tolower ((unsigned char) *string1);
-	      c2 = tolower ((unsigned char) *string2);
+	      c1 = tolower_inline ((unsigned char) *string1);
+	      c2 = tolower_inline ((unsigned char) *string2);
 	      break;
 	    case case_sensitive_on:
 	      c1 = *string1;
 	      c2 = *string2;
 	      break;