From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: Re: [PATCH] Improve strtok(_r) performance
Date: Mon, 14 Nov 2016 12:20:46 +0000
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 17450

ping

From: Wilco Dijkstra
Sent: 28 October 2016 12:35
To: libc-alpha@sourceware.org
Cc: nd
Subject: [PATCH] Improve strtok(_r) performance

Improve strtok(_r) performance.  Instead of calling strpbrk which calls
strcspn, call strcspn directly so we get the end of the token without an
extra call to rawmemchr.  Also avoid an unnecessary call to strcspn after
the last token by adding an early exit for an empty string.  The result is
a ~2x speedup of strtok on most inputs in bench-strtok.

Passes regression tests, OK for commit?

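To illustrate the first point of the description, here is a minimal sketch comparing the two ways of locating the end of a token. The helper names token_end_old and token_end_new are hypothetical and exist only for this illustration, and strchr (s, '\0') stands in for glibc's internal __rawmemchr, which is not a public interface.

```c
#include <stddef.h>
#include <string.h>

/* Old approach: strpbrk looks for the first delimiter; when there is
   none, a second scan is needed to find the terminating NUL.
   strchr (s, '\0') is used here in place of the internal __rawmemchr.  */
char *
token_end_old (char *s, const char *delim)
{
  char *p = strpbrk (s, delim);
  return p != NULL ? p : strchr (s, '\0');
}

/* New approach: strcspn returns the length of the initial segment that
   contains no delimiter, so S plus that length is the end of the token
   whether it stops at a delimiter or at the terminating NUL.  */
char *
token_end_new (char *s, const char *delim)
{
  return s + strcspn (s, delim);
}
```

Both helpers return the same pointer for any NUL-terminated input; the second simply reaches it with one library call instead of up to two.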
ChangeLog:
2016-10-28  Wilco Dijkstra

        * string/strtok.c (STRTOK): Optimize for performance.
        * string/strtok_r.c (__strtok_r): Likewise.

diff --git a/string/strtok.c b/string/strtok.c
index 7a4574db5c80501e47d045ad4347e8a287b32191..b1ed48c24c8d20706b7d05481a138b18a01ff802 100644
--- a/string/strtok.c
+++ b/string/strtok.c
@@ -38,11 +38,18 @@ static char *olds;
 char *
 STRTOK (char *s, const char *delim)
 {
-  char *token;
+  char *end;
 
   if (s == NULL)
     s = olds;
 
+  /* Return immediately at end of string.  */
+  if (*s == '\0')
+    {
+      olds = s;
+      return NULL;
+    }
+
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
   if (*s == '\0')
@@ -52,16 +59,15 @@ STRTOK (char *s, const char *delim)
     }
 
   /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    olds = __rawmemchr (token, '\0');
-  else
+  end = s + strcspn (s, delim);
+  if (*end == '\0')
     {
-      /* Terminate the token and make OLDS point past it.  */
-      *s = '\0';
-      olds = s + 1;
+      olds = end;
+      return s;
     }
-  return token;
+
+  /* Terminate the token and make OLDS point past it.  */
+  *end = '\0';
+  olds = end + 1;
+  return s;
 }
diff --git a/string/strtok_r.c b/string/strtok_r.c
index f351304766108dad2c1cff881ad3bebae821b2a0..e049a5c82e026a3b6c1ba5da16ce81743717805e 100644
--- a/string/strtok_r.c
+++ b/string/strtok_r.c
@@ -45,11 +45,17 @@
 char *
 __strtok_r (char *s, const char *delim, char **save_ptr)
 {
-  char *token;
+  char *end;
 
   if (s == NULL)
     s = *save_ptr;
 
+  if (*s == '\0')
+    {
+      *save_ptr = s;
+      return NULL;
+    }
+
   /* Scan leading delimiters.  */
   s += strspn (s, delim);
   if (*s == '\0')
@@ -59,18 +65,17 @@ __strtok_r (char *s, const char *delim, char **save_ptr)
     }
 
   /* Find the end of the token.  */
-  token = s;
-  s = strpbrk (token, delim);
-  if (s == NULL)
-    /* This token finishes the string.  */
-    *save_ptr = __rawmemchr (token, '\0');
-  else
+  end = s + strcspn (s, delim);
+  if (*end == '\0')
     {
-      /* Terminate the token and make *SAVE_PTR point past it.  */
-      *s = '\0';
-      *save_ptr = s + 1;
+      *save_ptr = end;
+      return s;
     }
-  return token;
+
+  /* Terminate the token and make *SAVE_PTR point past it.  */
+  *end = '\0';
+  *save_ptr = end + 1;
+  return s;
 }
 #ifdef weak_alias
 libc_hidden_def (__strtok_r)
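
For readers who want to try the patched control flow outside glibc, the following is a stand-alone sketch of it. The names tok_r and count_tokens are hypothetical and not glibc symbols. The body of the branch after strspn is not shown in the diff context above, so the version here (store the position, return NULL) is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative copy of the patched __strtok_r logic under a made-up name.  */
char *
tok_r (char *s, const char *delim, char **save_ptr)
{
  char *end;

  if (s == NULL)
    s = *save_ptr;

  /* Early exit added by the patch: nothing left to scan.  */
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* Scan leading delimiters.  The branch body is assumed, since the
     diff only shows its first and last lines.  */
  s += strspn (s, delim);
  if (*s == '\0')
    {
      *save_ptr = s;
      return NULL;
    }

  /* Find the end of the token with a single strcspn call.  */
  end = s + strcspn (s, delim);
  if (*end == '\0')
    {
      *save_ptr = end;
      return s;
    }

  /* Terminate the token and make *SAVE_PTR point past it.  */
  *end = '\0';
  *save_ptr = end + 1;
  return s;
}

/* Count the tokens in BUF, which is modified in place.  */
size_t
count_tokens (char *buf, const char *delim)
{
  size_t n = 0;
  char *save;
  char *t;

  for (t = tok_r (buf, delim, &save); t != NULL;
       t = tok_r (NULL, delim, &save))
    n++;
  return n;
}
```

After the last token has been returned, *save_ptr points at the terminating NUL, so every further call takes the early exit without calling strspn or strcspn at all.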