From patchwork Tue Aug 13 13:10:49 2019
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 34070
Delivered-To: mailing list libc-alpha@sourceware.org
From: Wilco Dijkstra
To: Feng Xue OS, Siddhesh Poyarekar, 'GNU C Library'
CC: nd
Subject: Re: [PATCH] aarch64: Add tunable glibc.memset.dc_zva_threshold
Date: Tue, 13 Aug 2019 13:10:49 +0000

Hi
Feng,

> This version disable DC ZVA in emag.

That looks good to me. OK

Wilco

diff --git a/sysdeps/aarch64/multiarch/memset_base64.S b/sysdeps/aarch64/multiarch/memset_base64.S
index 9a62325..c0cccba 100644
--- a/sysdeps/aarch64/multiarch/memset_base64.S
+++ b/sysdeps/aarch64/multiarch/memset_base64.S
@@ -23,6 +23,7 @@
 # define MEMSET __memset_base64
 #endif
 
+/* To disable DC ZVA, set this threshold to 0. */
 #ifndef DC_ZVA_THRESHOLD
 # define DC_ZVA_THRESHOLD 512
 #endif
@@ -91,11 +92,12 @@ L(set96):
 	.p2align 4
 L(set_long):
 	stp	val, val, [dstin]
+	bic	dst, dstin, 15
+#if DC_ZVA_THRESHOLD
 	cmp	count, DC_ZVA_THRESHOLD
 	ccmp	val, 0, 0, cs
-	bic	dst, dstin, 15
 	b.eq	L(zva_64)
-
+#endif
 	/* Small-size or non-zero memset does not use DC ZVA. */
 	sub	count, dstend, dst
 
@@ -105,7 +107,11 @@ L(set_long):
 	 * count is less than 33 bytes, so as to bypass 2 unneccesary stps.
 	 */
 	sub	count, count, 64+16+1
+
+#if DC_ZVA_THRESHOLD
+	/* Align loop on 16-byte boundary, this might be friendly to i-cache. */
 	nop
+#endif
 
 1:	stp	val, val, [dst, 16]
 	stp	val, val, [dst, 32]
@@ -121,6 +127,7 @@ L(set_long):
 	stp	val, val, [dstend, -16]
 	ret
 
+#if DC_ZVA_THRESHOLD
 	.p2align 3
 L(zva_64):
 	stp	val, val, [dst, 16]
@@ -173,6 +180,7 @@ L(zva_64):
 1:	stp	val, val, [dstend, -32]
 	stp	val, val, [dstend, -16]
 	ret
+#endif
 
 END (MEMSET)
 libc_hidden_builtin_def (MEMSET)
diff --git a/sysdeps/aarch64/multiarch/memset_emag.S b/sysdeps/aarch64/multiarch/memset_emag.S
index 1c1fabc..c2aed62 100644
--- a/sysdeps/aarch64/multiarch/memset_emag.S
+++ b/sysdeps/aarch64/multiarch/memset_emag.S
@@ -21,12 +21,14 @@
 # define MEMSET __memset_emag
 
 /*
- * Using dc zva to zero memory does not produce better performance if
+ * Using DC ZVA to zero memory does not produce better performance if
  * memory size is not very large, especially when there are multiple
- * processes/threads contending memory/cache. Here we use a somewhat
- * large threshold to trigger usage of dc zva.
-*/
-# define DC_ZVA_THRESHOLD 1024
+ * processes/threads contending memory/cache. Here we set threshold to
+ * zero to disable using DC ZVA, which is good for multi-process/thread
+ * workloads.
+ */
+
+# define DC_ZVA_THRESHOLD 0
 
 # include "./memset_base64.S"
 #endif
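
For readers less familiar with the cmp/ccmp/b.eq sequence in L(set_long), the branch condition that the patch places under "#if DC_ZVA_THRESHOLD" can be read roughly as the C sketch below. This is an illustration only, not code from glibc: the function name takes_zva_path and the run-time threshold parameter are hypothetical, and in the patch itself the "threshold is 0" case is resolved at preprocessing time, so the compare and the whole L(zva_64) block are simply not assembled.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of glibc): returns 1 when the long-memset
 * path would branch to L(zva_64), 0 when it would fall through to the
 * plain stp loop.
 *
 * Assumption: threshold stands in for DC_ZVA_THRESHOLD.  In the patch it
 * is a preprocessor constant; it is a run-time argument here only so that
 * both the base64 setting (512) and the emag setting (0) can be tried.
 */
static int
takes_zva_path (size_t count, unsigned char val, size_t threshold)
{
  /* Mirrors "#if DC_ZVA_THRESHOLD": a zero threshold disables DC ZVA.  */
  if (threshold == 0)
    return 0;

  /* Mirrors "cmp count, DC_ZVA_THRESHOLD; ccmp val, 0, 0, cs; b.eq":
     the branch is taken only if count >= threshold (unsigned compare)
     and the fill value is zero.  */
  return count >= threshold && val == 0;
}
```

With threshold 512 a zero fill of 512 bytes or more takes the DC ZVA path, while any non-zero fill does not; with threshold 0, as memset_emag.S now sets it, no call takes it.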