From patchwork Tue Jun 28 09:40:23 2022
X-Patchwork-Submitter: Yang Yanchao
X-Patchwork-Id: 55477
From: Yang Yanchao
Reply-To: Yang Yanchao
Cc: ldv@altlinux.org, linfeilong@huawei.com, siddhesh@sourceware.org
Subject: malloc: Optimize the number of arenas for better application performance
Date: Tue, 28 Jun 2022 17:40:23 +0800
Message-ID: <4ffb458f-9c32-f0e5-8503-44838d87defe@huawei.com>
List-Id: Libc-alpha mailing list

On the Kunpeng 920 platform, tpcc-mysql scores dropped by about 11.2%
going from glibc-2.28 to glibc-2.36. Comparing the code, I found that
two commits cause the performance degradation:

  11a02b035b46 (misc: Add __get_nprocs_sched)
  97ba273b5057 (linux: __get_nprocs_sched: do not feed CPU_COUNT_S
                with garbage [BZ #28850])

These two patches change the default behavior: the arena limit is now
derived from the number of CPUs in the scheduler affinity mask rather
than from the number of online CPUs. However, my machine has 96 cores
and I have bound 91 of them. This suggests that the current way of
computing the number of arenas may not be optimal, so I roll back part
of the change made by 11a02b035 (misc: Add __get_nprocs_sched).
---
 malloc/arena.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/malloc/arena.c b/malloc/arena.c
index 0a684a720d..a1ee7928d3 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -937,7 +937,7 @@ arena_get2 (size_t size, mstate avoid_arena)
             narenas_limit = mp_.arena_max;
           else if (narenas > mp_.arena_test)
             {
-              int n = __get_nprocs_sched ();
+              int n = __get_nprocs ();
 
               if (n >= 1)
                 narenas_limit = NARENAS_FROM_NCORES (n);
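
For reference, here is a minimal user-space sketch of what the one-line change affects. It is not the glibc implementation: arena_limit_for, cpus_in_affinity_mask and report_arena_limits are hypothetical names used only for illustration, the public get_nprocs () stands in for the internal __get_nprocs (), and a plain sched_getaffinity () count stands in for __get_nprocs_sched (). The macro body is reproduced from malloc/arena.c as I remember it, and the fallback to two cores for n < 1 is this sketch's assumption, since that branch is not part of the hunk. On a 64-bit target, 96 cores would give a limit of 768 arenas and 91 cores a limit of 728.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE            /* for sched_getaffinity and CPU_COUNT */
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/sysinfo.h>

/* Scaling used by malloc/arena.c (reproduced from memory):
   2 arenas per core with a 32-bit long, 8 per core otherwise.  */
#define NARENAS_FROM_NCORES(n) ((n) * (sizeof (long) == 4 ? 2 : 8))

static_assert (sizeof (long) == 4 || sizeof (long) == 8,
               "sketch assumes a 32-bit or 64-bit long");

/* Hypothetical helper: arena limit for a given core count.
   Assumption: a count below 1 is treated as two cores.  */
size_t
arena_limit_for (int ncores)
{
  size_t n = ncores >= 1 ? (size_t) ncores : 2;
  return NARENAS_FROM_NCORES (n);
}

/* Hypothetical stand-in for __get_nprocs_sched: number of CPUs in the
   calling thread's affinity mask, or -1 on error.  A fixed-size
   cpu_set_t only covers CPU numbers below CPU_SETSIZE.  */
int
cpus_in_affinity_mask (void)
{
  cpu_set_t set;

  if (sched_getaffinity (0, sizeof set, &set) != 0)
    return -1;
  return CPU_COUNT (&set);
}

/* Print the limit each counting method would produce.  */
void
report_arena_limits (void)
{
  int online = get_nprocs ();             /* online CPUs */
  int bound = cpus_in_affinity_mask ();   /* CPUs this thread may use */

  printf ("online CPUs:   %d -> arena limit %zu\n",
          online, arena_limit_for (online));
  printf ("affinity CPUs: %d -> arena limit %zu\n",
          bound, arena_limit_for (bound));
}
```

Calling report_arena_limits () from a process started under taskset (or any other affinity restriction) should show the two limits diverging, which is the difference this patch is about. Setting the glibc.malloc.arena_max tunable bypasses this computation, since arena_get2 checks mp_.arena_max first.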