Message ID | AM5PR0801MB16680E6787BC0553E1847A0B83649@AM5PR0801MB1668.eurprd08.prod.outlook.com |
---|---|
State | Superseded |
Headers | From: Wilco Dijkstra via Libc-alpha <libc-alpha@sourceware.org> Reply-To: Wilco Dijkstra <Wilco.Dijkstra@arm.com> To: 'GNU C Library' <libc-alpha@sourceware.org> Subject: [PATCH] Improve performance of libc locks Date: Thu, 11 Aug 2022 16:22:56 +0000 List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org> |
Series | Improve performance of libc locks |
Checks
Context | Check | Description |
---|---|---|
dj/TryBot-apply_patch | success | Patch applied to master at the time it was sent |
dj/TryBot-32bit | success | Build for i686 |
Commit Message
Wilco Dijkstra
Aug. 11, 2022, 4:22 p.m. UTC
Improve performance of libc locks by adding a fast path for the
single-threaded case.

On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
performance is unchanged.

Passes regress on AArch64, OK for commit?

---
Comments
On 8/11/22 12:22, Wilco Dijkstra via Libc-alpha wrote:
> Improve performance of libc locks by adding a fast path for the
> single-threaded case.
>
> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> performance is unchanged.
>
> Passes regress on AArch64, OK for commit?

This impacts all architectures.

Are you able to run the microbenchmarks to show that this improves one of them?

If you can, then we can at least ask the other machine maintainers to test the patch shows gains there too.

Conceptually I don't see why it wouldn't improve the performance of all architectures, but having a baseline at the point of the patch is good for recording the performance for future discussions.

If we don't have a benchmark that shows this specific base of ST vs MT and internal __libc_lock_lock-locks then we should add one. Improving the internal locking for our algorithms is always going to be a point of interest for IHVs.

Thanks.

> ---
>
> diff --git a/sysdeps/nptl/libc-lockP.h b/sysdeps/nptl/libc-lockP.h
> index d3a6837fd212f3f5dfd80f46d0e9ce365042ae0c..ccdb11fee6f14069d0b936be93d0f0fa6d8bc30b 100644
> --- a/sysdeps/nptl/libc-lockP.h
> +++ b/sysdeps/nptl/libc-lockP.h
> @@ -108,7 +108,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
>  #define __libc_rwlock_fini(NAME) ((void) 0)
>
>  /* Lock the named lock variable. */
> -#define __libc_lock_lock(NAME) ({ lll_lock (NAME, LLL_PRIVATE); 0; })
> +#define __libc_lock_lock(NAME) \
> +  ({ \
> +    if (SINGLE_THREAD_P) \
> +      (NAME) = LLL_LOCK_INITIALIZER_LOCKED; \
> +    else \
> +      lll_lock (NAME, LLL_PRIVATE); \
> +    0; \
> +  })
>  #define __libc_rwlock_rdlock(NAME) __pthread_rwlock_rdlock (&(NAME))
>  #define __libc_rwlock_wrlock(NAME) __pthread_rwlock_wrlock (&(NAME))
>
> @@ -116,7 +123,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
>  #define __libc_lock_trylock(NAME) lll_trylock (NAME)
>
>  /* Unlock the named lock variable. */
> -#define __libc_lock_unlock(NAME) lll_unlock (NAME, LLL_PRIVATE)
> +#define __libc_lock_unlock(NAME) \
> +  ({ \
> +    if (SINGLE_THREAD_P) \
> +      (NAME) = LLL_LOCK_INITIALIZER; \
> +    else \
> +      lll_unlock (NAME, LLL_PRIVATE); \
> +    0; \
> +  })
>  #define __libc_rwlock_unlock(NAME) __pthread_rwlock_unlock (&(NAME))
>
>  #if IS_IN (rtld)
Hi Carlos,

> This impacts all architectures.

That was the goal indeed - we should add single-threaded optimizations in a
generic way.

> If we don't have a benchmark that shows this specific base of ST vs MT and
> internal __libc_lock_lock-locks then we should add one. Improving the internal
> locking for our algorithms is always going to be a point of interest for IHVs.

I can easily wrap my rand() microbench in json and add it to the benchtests.
I think it would be harder to do more tests on internal locks/headers since they
are not easily usable from benchtest infrastructure (just including libc-lock.h
results in lots of errors...).

Cheers,
Wilco
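The thread does not include the rand() microbenchmark itself, so the following is only a guess at its general shape: a timed loop over rand(). The function name `bench_rand`, the use of `clock()`, and the checksum parameter are assumptions made for illustration; this is not the glibc benchtests format.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper, not part of glibc or its benchtests.
   Calls rand() ITERATIONS times and returns the CPU time consumed,
   in seconds.  The sum of all results is stored in *CHECKSUM so the
   compiler cannot discard the calls.  */
static double
bench_rand (unsigned long iterations, unsigned long *checksum)
{
  unsigned long sum = 0;
  clock_t start = clock ();

  for (unsigned long i = 0; i < iterations; i++)
    sum += (unsigned long) rand ();

  clock_t end = clock ();
  *checksum = sum;
  return (double) (end - start) / CLOCKS_PER_SEC;
}
```

Running the same loop in a process with and without a second thread alive would give a rough single-threaded versus multithreaded comparison of the locking cost inside rand().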
On Tue, Aug 16, 2022 at 1:35 AM Wilco Dijkstra via Libc-alpha <libc-alpha@sourceware.org> wrote:
>
> Hi Carlos,
>
> > This impacts all architectures.
>
> That was the goal indeed - we should add single-threaded optimizations in a
> generic way.
>
> > If we don't have a benchmark that shows this specific base of ST vs MT and
> > internal __libc_lock_lock-locks then we should add one. Improving the internal
> > locking for our algorithms is always going to be a point of interest for IHVs.
>
> I can easily wrap my rand() microbench in json and add it to the benchtests.

Think that would be good so we can easily measure on other architectures.

> I think it would be harder to do more tests on internal locks/headers since they
> are not easily usable from benchtest infrastructure (just including libc-lock.h
> results in lots of errors...).
>
> Cheers,
> Wilco
On Thu, Aug 11, 2022 at 12:23 PM Wilco Dijkstra via Libc-alpha <libc-alpha@sourceware.org> wrote:
>
> Improve performance of libc locks by adding a fast path for the
> single-threaded case.
>
> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> performance is unchanged.
>
> Passes regress on AArch64, OK for commit?
>
> ---
>

Ping ? I saw the stdio one was committed but what happened with this one ?
Hi Cristian,

>> Improve performance of libc locks by adding a fast path for the
>> single-threaded case.
>>
>> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
>> performance is unchanged.
>>
>> Passes regress on AArch64, OK for commit?
>
> Ping ? I saw the stdio one was committed but what happened with this one ?

It is waiting on the locking benchmark being accepted. I've pinged that
(https://sourceware.org/pipermail/libc-alpha/2022-December/143944.html)
since it would be great to get all this in the next GLIBC.

Cheers,
Wilco
On Fri, Dec 9, 2022 at 11:10 AM Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Hi Cristian,
>
> >> Improve performance of libc locks by adding a fast path for the
> >> single-threaded case.
> >>
> >> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> >> performance is unchanged.
> >>
> >> Passes regress on AArch64, OK for commit?
> >
> > Ping ? I saw the stdio one was committed but what happened with this one ?
>
> It is waiting on the locking benchmark being accepted. I've pinged that
> (https://sourceware.org/pipermail/libc-alpha/2022-December/143944.html)
> since it would be great to get all this in the next GLIBC.
>
> Cheers,
> Wilco

Ping ? did this move forward?
* Wilco Dijkstra via Libc-alpha:

> Improve performance of libc locks by adding a fast path for the
> single-threaded case.
>
> On Neoverse V1, a loop using rand() improved 3.6 times. Multithreaded
> performance is unchanged.
>
> Passes regress on AArch64, OK for commit?
>
> ---
>
> diff --git a/sysdeps/nptl/libc-lockP.h b/sysdeps/nptl/libc-lockP.h
> index d3a6837fd212f3f5dfd80f46d0e9ce365042ae0c..ccdb11fee6f14069d0b936be93d0f0fa6d8bc30b 100644
> --- a/sysdeps/nptl/libc-lockP.h
> +++ b/sysdeps/nptl/libc-lockP.h
> @@ -108,7 +108,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
>  #define __libc_rwlock_fini(NAME) ((void) 0)
>
>  /* Lock the named lock variable. */
> -#define __libc_lock_lock(NAME) ({ lll_lock (NAME, LLL_PRIVATE); 0; })
> +#define __libc_lock_lock(NAME) \
> +  ({ \
> +    if (SINGLE_THREAD_P) \
> +      (NAME) = LLL_LOCK_INITIALIZER_LOCKED; \
> +    else \
> +      lll_lock (NAME, LLL_PRIVATE); \
> +    0; \
> +  })

We already have SINGLE_THREAD_P checks around locking in several places.
This makes the __libc_lock_lock check redundant in those cases. I
believe this was done deliberately because in many cases, we can also
skip cancellation handling at the same time.

The rand performance issue could be addressed with a similar localized
change. I believe that would be far less controversial.

Thanks,
Florian
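As an illustration of what a localized change of this kind might look like, here is a minimal sketch in which the single-thread check sits in the caller instead of in the lock macro. None of this is glibc code: `demo_single_threaded` is a hypothetical stand-in for SINGLE_THREAD_P, the `atomic_flag` spinlock stands in for glibc's internal lock, and the linear congruential step is a placeholder, not glibc's actual rand() algorithm.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for glibc's SINGLE_THREAD_P.  */
static bool demo_single_threaded = true;

/* Placeholder lock and generator state; names are illustrative only.  */
static atomic_flag demo_rand_lock = ATOMIC_FLAG_INIT;
static uint32_t demo_rand_state = 1;

/* Advance the placeholder generator and return a 15-bit value.  */
static uint32_t
demo_rand_step (void)
{
  demo_rand_state = demo_rand_state * 1103515245u + 12345u;
  return (demo_rand_state >> 16) & 0x7fff;
}

static uint32_t
demo_rand (void)
{
  /* Localized fast path: with a single thread there is nobody to
     race with, so the lock is skipped entirely.  */
  if (demo_single_threaded)
    return demo_rand_step ();

  /* Slow path: take the lock around the state update.  */
  while (atomic_flag_test_and_set_explicit (&demo_rand_lock,
                                            memory_order_acquire))
    ;
  uint32_t result = demo_rand_step ();
  atomic_flag_clear_explicit (&demo_rand_lock, memory_order_release);
  return result;
}
```

Because the check lives in `demo_rand` itself, other users of the lock are unaffected, which is the property the localized approach is meant to keep.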
Hi Florian,

> We already have SINGLE_THREAD_P checks around locking in several places.
> This makes the __libc_lock_lock check redundant in those cases. I
> believe this was done deliberately because in many cases, we can also
> skip cancellation handling at the same time.

Yes, this is best for the most performance critical cases. However there are lots of
functions that use locks and many will improve with a single thread check at a
higher level. You're right that would add extra checks in cases that are not
performance critical.

Maybe a solution would be to introduce __libc_lock_fast() or similar? That way one
can improve performance critical code easily without having to add special fast
paths. Today it would use SINGLE_THREAD_P, but it could potentially use RSEQ in
the future.

> The rand performance issue could be addressed with a similar localized
> change. I believe that would be far less controversial.

I can send a patch that adds fast paths to rand() if that helps unblocking this.

Cheers,
Wilco
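The thread never spells out what an opt-in __libc_lock_fast() would look like, so the sketch below is purely hypothetical. It keeps the ordinary lock functions unchanged and adds separate `_fast` variants that callers choose explicitly. All names (`sketch_lock`, `sketch_lock_fast`, `sketch_single_threaded`) are invented for illustration, the spinning slow path replaces glibc's futex-based lll_lock, and only the SINGLE_THREAD_P-style check is modelled, not RSEQ.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for glibc's SINGLE_THREAD_P.  */
static bool sketch_single_threaded = true;

/* 0 = unlocked, 1 = locked.  */
typedef atomic_int sketch_lock_t;

/* Ordinary variant: always takes the atomic path.  A real lock
   would block on a futex instead of spinning.  */
static void
sketch_lock (sketch_lock_t *lock)
{
  int expected = 0;
  while (!atomic_compare_exchange_weak_explicit (lock, &expected, 1,
                                                 memory_order_acquire,
                                                 memory_order_relaxed))
    expected = 0;
}

static void
sketch_unlock (sketch_lock_t *lock)
{
  atomic_store_explicit (lock, 0, memory_order_release);
}

/* Opt-in variant: performance-critical callers use this explicitly,
   so the single-thread check is visible at the call site.  */
static void
sketch_lock_fast (sketch_lock_t *lock)
{
  if (sketch_single_threaded)
    atomic_store_explicit (lock, 1, memory_order_relaxed);
  else
    sketch_lock (lock);
}

static void
sketch_unlock_fast (sketch_lock_t *lock)
{
  if (sketch_single_threaded)
    atomic_store_explicit (lock, 0, memory_order_relaxed);
  else
    sketch_unlock (lock);
}
```

Keeping the two variants apart means each call site can be audited and migrated individually, instead of every existing user of the lock picking up the extra branch.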
On 11/24/23 08:47, Wilco Dijkstra wrote:
> Hi Florian,
>
>> We already have SINGLE_THREAD_P checks around locking in several places.
>> This makes the __libc_lock_lock check redundant in those cases. I
>> believe this was done deliberately because in many cases, we can also
>> skip cancellation handling at the same time.
>
> Yes, this is best for the most performance critical cases. However there are lots of
> functions that use locks and many will improve with a single thread check at a
> higher level. You're right that would add extra checks in cases that are not
> performance critical.

Right.

> Maybe a solution would be to introduce __libc_lock_fast() or similar? That way one
> can improve performance critical code easily without having to add special fast
> paths. Today it would use SINGLE_THREAD_P, but it could potentially use RSEQ in
> the future.

I would prefer a solution like this because you can actively audit and migrate the callers rather than adding a hidden (to the developer) change in the macro.

>> The rand performance issue could be addressed with a similar localized
>> change. I believe that would be far less controversial.
>
> I can send a patch that adds fast paths to rand() if that helps unblocking this.

I think it would.

Add the fast path to rand(), and a microbenchmark to show that this is good for performance on your machine, that way we don't regress this change in the future when we work on rand().

I'm amenable to not having a microbenchmark, but every time we talk about performance adding a little bit to the corpus helps ensure we don't lose track of the gains.
diff --git a/sysdeps/nptl/libc-lockP.h b/sysdeps/nptl/libc-lockP.h
index d3a6837fd212f3f5dfd80f46d0e9ce365042ae0c..ccdb11fee6f14069d0b936be93d0f0fa6d8bc30b 100644
--- a/sysdeps/nptl/libc-lockP.h
+++ b/sysdeps/nptl/libc-lockP.h
@@ -108,7 +108,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
 #define __libc_rwlock_fini(NAME) ((void) 0)
 
 /* Lock the named lock variable. */
-#define __libc_lock_lock(NAME) ({ lll_lock (NAME, LLL_PRIVATE); 0; })
+#define __libc_lock_lock(NAME) \
+  ({ \
+    if (SINGLE_THREAD_P) \
+      (NAME) = LLL_LOCK_INITIALIZER_LOCKED; \
+    else \
+      lll_lock (NAME, LLL_PRIVATE); \
+    0; \
+  })
 #define __libc_rwlock_rdlock(NAME) __pthread_rwlock_rdlock (&(NAME))
 #define __libc_rwlock_wrlock(NAME) __pthread_rwlock_wrlock (&(NAME))
 
@@ -116,7 +123,14 @@ _Static_assert (LLL_LOCK_INITIALIZER == 0, "LLL_LOCK_INITIALIZER != 0");
 #define __libc_lock_trylock(NAME) lll_trylock (NAME)
 
 /* Unlock the named lock variable. */
-#define __libc_lock_unlock(NAME) lll_unlock (NAME, LLL_PRIVATE)
+#define __libc_lock_unlock(NAME) \
+  ({ \
+    if (SINGLE_THREAD_P) \
+      (NAME) = LLL_LOCK_INITIALIZER; \
+    else \
+      lll_unlock (NAME, LLL_PRIVATE); \
+    0; \
+  })
 #define __libc_rwlock_unlock(NAME) __pthread_rwlock_unlock (&(NAME))
 
 #if IS_IN (rtld)