From patchwork Fri Jun 21 20:12:00 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pedro Alves X-Patchwork-Id: 33260 Received: (qmail 26756 invoked by alias); 21 Jun 2019 20:12:06 -0000 Mailing-List: contact gdb-patches-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: gdb-patches-owner@sourceware.org Delivered-To: mailing list gdb-patches@sourceware.org Received: (qmail 26748 invoked by uid 89); 21 Jun 2019 20:12:06 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 spammy=hosts, 22797 X-HELO: mail-wm1-f67.google.com Received: from mail-wm1-f67.google.com (HELO mail-wm1-f67.google.com) (209.85.128.67) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 21 Jun 2019 20:12:04 +0000 Received: by mail-wm1-f67.google.com with SMTP id s3so7357570wms.2 for ; Fri, 21 Jun 2019 13:12:03 -0700 (PDT) Return-Path: Received: from ?IPv6:2001:8a0:f913:f700:56ee:75ff:fe8d:232b? ([2001:8a0:f913:f700:56ee:75ff:fe8d:232b]) by smtp.gmail.com with ESMTPSA id x129sm2776402wmg.44.2019.06.21.13.12.00 (version=TLS1_3 cipher=AEAD-AES128-GCM-SHA256 bits=128/128); Fri, 21 Jun 2019 13:12:01 -0700 (PDT) Subject: Re: [PATCH] Fix slow and non-deterministic behavior of isspace() and tolower() To: Shawn Landden , gdb-patches@sourceware.org References: <20190610213017.2021-1-shawn@git.icu> <20190621191959.24450-1-shawn@git.icu> Cc: jhb@freebsd.org, eliz@gnu.org From: Pedro Alves Message-ID: <325f6653-a7ae-1242-53f6-b004f6edca37@redhat.com> Date: Fri, 21 Jun 2019 21:12:00 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: <20190621191959.24450-1-shawn@git.icu> On 6/21/19 8:19 PM, Shawn Landden wrote: > I was getting 8% and 6% cpu usage in tolower() and isspace(), > respectively, waiting for a breakpoint on ppc64el. > > Also, gdb doesn't want non-deterministic behavior here. > > v2: do not touch C namespace > v3: Turns out there are two places using these in performance-critical > parts. > Follow GNU coding standards. This doesn't address my comments about . Something like this. How does this compare? Any significant difference in your benchmark? And again, how are you benchmarking this? From 1bd3203a26f989ba146376842f5b6a78bdfae181 Mon Sep 17 00:00:00 2001 From: Pedro Alves Date: Fri, 21 Jun 2019 19:12:30 +0100 Subject: [PATCH] Use ISSPACE etc --- gdb/common/gdb-safe-ctype.h | 46 ++++++++++++++++++++++++++++++++++++++++++++ gdb/utils.c | 47 +++++++++++++++++++++++---------------------- 2 files changed, 70 insertions(+), 23 deletions(-) create mode 100644 gdb/common/gdb-safe-ctype.h diff --git a/gdb/common/gdb-safe-ctype.h b/gdb/common/gdb-safe-ctype.h new file mode 100644 index 00000000000..01a97b7ce39 --- /dev/null +++ b/gdb/common/gdb-safe-ctype.h @@ -0,0 +1,46 @@ +/* Wrapper around libiberty's safe-ctype.h for GDB, the GNU debugger. + + Copyright (C) 2019 Free Software Foundation, Inc. + + This file is part of GDB. + + This program is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3 of the License, or + (at your option) any later version. + + This program is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + GNU General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program. If not, see . */ + +#ifndef GDB_SAFE_CTYPE_H +#define GDB_SAFE_CTYPE_H + +/* After safe-ctype.h is included, we can no longer use the host's + ctype routines. Trying to do so results in compile errors. Code + that uses safe-ctype.h that wants to refer to the locale-dependent + ctype functions must call these wrapper versions instead. */ + +static inline int +gdb_isprint (int ch) +{ + return isprint (ch); +} + +/* readline.h defines these symbols too, but we want libiberty's + versions. */ +#undef ISALPHA +#undef ISALNUM +#undef ISDIGIT +#undef ISLOWER +#undef ISPRINT +#undef ISUPPER +#undef ISXDIGIT + +#include "safe-ctype.h" + +#endif diff --git a/gdb/utils.c b/gdb/utils.c index c7922cf7f56..367ea277c7d 100644 --- a/gdb/utils.c +++ b/gdb/utils.c @@ -73,6 +73,7 @@ #include "common/pathstuff.h" #include "cli/cli-style.h" #include "common/scope-exit.h" +#include "common/gdb-safe-ctype.h" void (*deprecated_error_begin_hook) (void); @@ -1057,7 +1058,7 @@ parse_escape (struct gdbarch *gdbarch, const char **string_ptr) while (++count < 3) { c = (**string_ptr); - if (isdigit (c) && c != '8' && c != '9') + if (ISDIGIT (c) && c != '8' && c != '9') { (*string_ptr)++; i *= 8; @@ -1985,7 +1986,7 @@ puts_debug (char *prefix, char *string, char *suffix) switch (ch) { default: - if (isprint (ch)) + if (gdb_isprint (ch)) fputc_unfiltered (ch, gdb_stdlog); else @@ -2268,7 +2269,7 @@ fprintf_symbol_filtered (struct ui_file *stream, const char *name, static bool valid_identifier_name_char (int ch) { - return (isalnum (ch) || ch == '_'); + return (ISALNUM (ch) || ch == '_'); } /* Skip to end of token, or to END, whatever comes first. Input is @@ -2278,7 +2279,7 @@ static const char * cp_skip_operator_token (const char *token, const char *end) { const char *p = token; - while (p != end && !isspace (*p) && *p != '(') + while (p != end && !ISSPACE (*p) && *p != '(') { if (valid_identifier_name_char (*p)) { @@ -2332,9 +2333,9 @@ cp_skip_operator_token (const char *token, const char *end) static void skip_ws (const char *&string1, const char *&string2, const char *end_str2) { - while (isspace (*string1)) + while (ISSPACE (*string1)) string1++; - while (string2 < end_str2 && isspace (*string2)) + while (string2 < end_str2 && ISSPACE (*string2)) string2++; } @@ -2396,8 +2397,8 @@ strncmp_iw_with_mode (const char *string1, const char *string2, while (1) { if (skip_spaces - || ((isspace (*string1) && !valid_identifier_name_char (*string2)) - || (isspace (*string2) && !valid_identifier_name_char (*string1)))) + || ((ISSPACE (*string1) && !valid_identifier_name_char (*string2)) + || (ISSPACE (*string2) && !valid_identifier_name_char (*string1)))) { skip_ws (string1, string2, end_str2); skip_spaces = false; @@ -2430,7 +2431,7 @@ strncmp_iw_with_mode (const char *string1, const char *string2, if (match_for_lcd != NULL && abi_start != string1) match_for_lcd->mark_ignored_range (abi_start, string1); - while (isspace (*string1)) + while (ISSPACE (*string1)) string1++; } @@ -2455,9 +2456,9 @@ strncmp_iw_with_mode (const char *string1, const char *string2, string1++; string2++; - while (isspace (*string1)) + while (ISSPACE (*string1)) string1++; - while (string2 < end_str2 && isspace (*string2)) + while (string2 < end_str2 && ISSPACE (*string2)) string2++; continue; } @@ -2551,14 +2552,14 @@ strncmp_iw_with_mode (const char *string1, const char *string2, if (case_sensitivity == case_sensitive_on && *string1 != *string2) break; if (case_sensitivity == case_sensitive_off - && (tolower ((unsigned char) *string1) - != tolower ((unsigned char) *string2))) + && (TOLOWER ((unsigned char) *string1) + != TOLOWER ((unsigned char) *string2))) break; /* If we see any non-whitespace, non-identifier-name character (any of "()<>*&" etc.), then skip spaces the next time around. */ - if (!isspace (*string1) && !valid_identifier_name_char (*string1)) + if (!ISSPACE (*string1) && !valid_identifier_name_char (*string1)) skip_spaces = true; string1++; @@ -2679,16 +2680,16 @@ strcmp_iw_ordered (const char *string1, const char *string2) while (*string1 != '\0' && *string2 != '\0') { - while (isspace (*string1)) + while (ISSPACE (*string1)) string1++; - while (isspace (*string2)) + while (ISSPACE (*string2)) string2++; switch (case_pass) { case case_sensitive_off: - c1 = tolower ((unsigned char) *string1); - c2 = tolower ((unsigned char) *string2); + c1 = TOLOWER ((unsigned char) *string1); + c2 = TOLOWER ((unsigned char) *string2); break; case case_sensitive_on: c1 = *string1; @@ -2927,17 +2928,17 @@ string_to_core_addr (const char *my_string) { CORE_ADDR addr = 0; - if (my_string[0] == '0' && tolower (my_string[1]) == 'x') + if (my_string[0] == '0' && TOLOWER (my_string[1]) == 'x') { /* Assume that it is in hex. */ int i; for (i = 2; my_string[i] != '\0'; i++) { - if (isdigit (my_string[i])) + if (ISDIGIT (my_string[i])) addr = (my_string[i] - '0') + (addr * 16); - else if (isxdigit (my_string[i])) - addr = (tolower (my_string[i]) - 'a' + 0xa) + (addr * 16); + else if (ISXDIGIT (my_string[i])) + addr = (TOLOWER (my_string[i]) - 'a' + 0xa) + (addr * 16); else error (_("invalid hex \"%s\""), my_string); } @@ -2949,7 +2950,7 @@ string_to_core_addr (const char *my_string) for (i = 0; my_string[i] != '\0'; i++) { - if (isdigit (my_string[i])) + if (ISDIGIT (my_string[i])) addr = (my_string[i] - '0') + (addr * 10); else error (_("invalid decimal \"%s\""), my_string);