From patchwork Fri May 6 12:49:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Siddhesh Poyarekar
X-Patchwork-Id: 53550
To:
libc-alpha@sourceware.org
Subject: [committed] benchtests: Add wcrtomb microbenchmark
Date: Fri, 6 May 2022 18:19:58 +0530
Message-Id: <20220506124958.16717-1-siddhesh@sourceware.org>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <87v8ujjabr.fsf@oldenburg.str.redhat.com>
References: <87v8ujjabr.fsf@oldenburg.str.redhat.com>
List-Id: Libc-alpha mailing list
X-Patchwork-Original-From: Siddhesh Poyarekar via Libc-alpha
From: Siddhesh Poyarekar
Reply-To: Siddhesh Poyarekar
Cc: Florian Weimer

Add a simple benchmark that measures wcrtomb performance in various
locales whose characters encode to 1-4 bytes.

Signed-off-by: Siddhesh Poyarekar
Reviewed-by: Florian Weimer
---
 benchtests/Makefile        |   1 +
 benchtests/bench-wcrtomb.c | 139 +++++++++++++++++++++++++++++++++++++
 2 files changed, 140 insertions(+)
 create mode 100644 benchtests/bench-wcrtomb.c

diff --git a/benchtests/Makefile b/benchtests/Makefile
index 149d87e22e..de9de5cf58 100644
--- a/benchtests/Makefile
+++ b/benchtests/Makefile
@@ -171,6 +171,7 @@ ifeq (no,$(cross-compiling))
 wcsmbs-benchset := \
   wcpcpy \
   wcpncpy \
+  wcrtomb \
   wcscat \
   wcschr \
   wcschrnul \
diff --git a/benchtests/bench-wcrtomb.c b/benchtests/bench-wcrtomb.c
new file mode 100644
index 0000000000..232a7d59de
--- /dev/null
+++ b/benchtests/bench-wcrtomb.c
@@ -0,0 +1,139 @@
+/* Measure wcrtomb function.
+   Copyright The GNU Toolchain Authors.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <array_length.h>
+#include <limits.h>
+#include <locale.h>
+#include <string.h>
+#include <wchar.h>
+
+#include "bench-timing.h"
+#include "json-lib.h"
+
+#define NITERS 100000
+
+struct test_inputs
+{
+  const char *locale;
+  const wchar_t *input_chars;
+};
+
+/* The inputs represent different types of characters, e.g. RTL, 1 byte, 2
+   byte, 3 byte and 4 byte chars.  The exact number of inputs per locale
+   doesn't really matter because we're not looking to compare performance
+   between locales.  */
+struct test_inputs inputs[] =
+{
+  /* RTL.  */
+  {"ar_SA.UTF-8",
+   L",-.،؟ـًُّ٠٢٣٤ءآأؤإئابةتثجحخدذرزسشصضطظعغفقكلمنهوىي"},
+
+  /* Various mixes of 1 and 2 byte chars.  */
+  {"cs_CZ.UTF-8",
+   L",.aAábcCčdDďeEéÉěĚfFghHiIíJlLmMnNňŇoóÓpPqQrřsSšŠTťuUúÚůŮvVWxyýz"},
+
+  {"el_GR.UTF-8",
+   L",.αΑβγδΔεΕζηΗθΘιΙκΚλμΜνΝξοΟπΠρΡσΣςτυΥφΦχψω"},
+
+  {"en_GB.UTF-8",
+   L",.aAāĀæÆǽǣǢbcCċdDðÐeEēĒfFgGġhHiIīĪlLmMnNoōpPqQrsSTuUūŪvVwxyȝzþÞƿǷ"},
+
+  {"fr_FR.UTF-8",
+   L",.aAàâbcCçdDeEéèêëfFghHiIîïjlLmMnNoOôœpPqQrRsSTuUùûvVwxyz"},
+
+  {"he_IL.UTF-8",
+   L"',.ִאבגדהוזחטיכךלמםנןסעפףצץקרשת"},
+
+  /* Devanagari, Japanese, 3-byte chars.  */
+  {"hi_IN.UTF-8",
+   L"(।ं०४५७अआइईउऎएओऔकखगघचछजञटडढणतथदधनपफ़बभमयरलवशषसहािीुूृेैोौ्"},
+
+  {"ja_JP.UTF-8",
+   L".ー0123456789あアいイうウえエおオかカがきキぎくクぐけケげこコごさサざ"},
+
+  /* More mixtures of 1 and 2 byte chars.  */
+  {"ru_RU.UTF-8",
+   L",.аАбвВгдДеЕёЁжЖзЗийЙкКлЛмМнНоОпПрстТуУфФхХЦчшШщъыЫьэЭюЮя"},
+
+  {"sr_RS.UTF-8",
+   L",.aAbcCćčdDđĐeEfgGhHiIlLmMnNoOpPqQrsSšŠTuUvVxyzZž"},
+
+  {"sv_SE.UTF-8",
+   L",.aAåÅäÄæÆbBcCdDeEfFghHiIjlLmMnNoOöÖpPqQrsSTuUvVwxyz"},
+
+  /* Chinese, 3-byte chars */
+  {"zh_CN.UTF-8",
+   L"一七三下不与世両並中串主乱予事二五亡京人今仕付以任企伎会伸住佐体作使"},
+
+  /* 4-byte chars, because smileys are the universal language and we want to
+     ensure optimal performance with them ?.  */
+  {"en_US.UTF-8",
+   L"??????????????????????????????????"}
+};
+
+char buf[MB_LEN_MAX];
+size_t ret;
+
+int
+main (int argc, char **argv)
+{
+  json_ctx_t json_ctx;
+  json_init (&json_ctx, 0, stdout);
+  json_document_begin (&json_ctx);
+
+  json_attr_string (&json_ctx, "timing_type", TIMING_TYPE);
+  json_attr_object_begin (&json_ctx, "functions");
+  json_attr_object_begin (&json_ctx, "wcrtomb");
+
+  for (size_t i = 0; i < array_length (inputs); i++)
+    {
+      json_attr_object_begin (&json_ctx, inputs[i].locale);
+      setlocale (LC_ALL, inputs[i].locale);
+
+      timing_t min = 0x7fffffffffffffff, max = 0, total = 0;
+      const wchar_t *inp = inputs[i].input_chars;
+      const size_t len = wcslen (inp);
+      mbstate_t s;
+
+      memset (&s, '\0', sizeof (s));
+
+      for (size_t n = 0; n < NITERS; n++)
+        {
+          timing_t start, end, elapsed;
+
+          TIMING_NOW (start);
+          for (size_t j = 0; j < len; j++)
+            ret = wcrtomb (buf, inp[j], &s);
+          TIMING_NOW (end);
+          TIMING_DIFF (elapsed, start, end);
+          if (min > elapsed)
+            min = elapsed;
+          if (max < elapsed)
+            max = elapsed;
+          TIMING_ACCUM (total, elapsed);
+        }
+      json_attr_double (&json_ctx, "max", max);
+      json_attr_double (&json_ctx, "min", min);
+      json_attr_double (&json_ctx, "mean", total / NITERS);
+      json_attr_object_end (&json_ctx);
+    }
+
+  json_attr_object_end (&json_ctx);
+  json_attr_object_end (&json_ctx);
+  json_document_end (&json_ctx);
+}
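
A note on the "1-4 byte" grouping used in the inputs[] comments: it refers to
how many bytes each wide character occupies once converted to UTF-8. The
sketch below shows that mapping, assuming a UTF-8 locale (as in every locale
named in the patch) and a wchar_t that holds Unicode code points.
utf8_encoded_len is a hypothetical helper written for illustration; it is not
part of the patch or of glibc. For a valid code point it should agree with the
byte count wcrtomb returns in such a locale.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: return the number of bytes
   UTF-8 needs to encode code point CP, or 0 if CP is a surrogate or lies
   beyond U+10FFFF and therefore has no UTF-8 encoding.  */
size_t
utf8_encoded_len (uint32_t cp)
{
  if (cp < 0x80)
    return 1;                   /* ASCII, e.g. ',' and '.'.  */
  if (cp < 0x800)
    return 2;                   /* Latin supplements, Greek, Cyrillic, ...  */
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;                   /* UTF-16 surrogates are not encodable.  */
  if (cp < 0x10000)
    return 3;                   /* Rest of the BMP: Devanagari, kana, CJK.  */
  if (cp <= 0x10FFFF)
    return 4;                   /* Supplementary planes, e.g. emoji.  */
  return 0;
}
```

For example, ',' (U+002C) takes 1 byte, 'č' (U+010D) takes 2, '一' (U+4E00)
takes 3, and U+1F600 takes 4, which matches the four size classes the
benchmark inputs are meant to cover.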