From patchwork Thu Aug 20 11:46:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 40313
From: Wilco Dijkstra
To: 'GNU C Library'
Subject: [PATCH] AArch64: Improve backwards memmove performance
Date: Thu, 20 Aug 2020 11:46:20 +0000

On some microarchitectures performance of the backwards memmove improves if
the stores use STR with decreasing addresses.  So change the memmove loop
in memcpy_advsimd.S to use 2x STR rather than STP.

Passes GLIBC regression test, OK for commit?

Reviewed-by: Adhemerval Zanella

diff --git a/sysdeps/aarch64/multiarch/memcpy_advsimd.S b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
index d4ba74777744c8bb5a83e43ab2d63ad8dab35203..48bb6d7ca425197907eaef2307fb3939e69baa15 100644
--- a/sysdeps/aarch64/multiarch/memcpy_advsimd.S
+++ b/sysdeps/aarch64/multiarch/memcpy_advsimd.S
@@ -223,12 +223,13 @@ L(copy_long_backwards):
 	b.ls	L(copy64_from_start)
 
 L(loop64_backwards):
-	stp	A_q, B_q, [dstend, -32]
+	str	B_q, [dstend, -16]
+	str	A_q, [dstend, -32]
 	ldp	A_q, B_q, [srcend, -96]
-	stp	C_q, D_q, [dstend, -64]
+	str	D_q, [dstend, -48]
+	str	C_q, [dstend, -64]!
 	ldp	C_q, D_q, [srcend, -128]
 	sub	srcend, srcend, 64
-	sub	dstend, dstend, 64
 	subs	count, count, 64
 	b.hi	L(loop64_backwards)
 
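
For readers less familiar with the addressing modes, the store order of the patched loop can be modelled in C. This is only a rough sketch under stated assumptions: `copy64_backwards_model` is a hypothetical name that does not exist in glibc, `count` is assumed to be a multiple of 64, and the software pipelining of the real loop (where the loads for the next iteration are interleaved with the stores) is left out. It shows the four 16-byte stores being issued at strictly decreasing addresses, with the `dstend` update that the pre-indexed `str C_q, [dstend, -64]!` folds into the final store.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not glibc code: copy `count` bytes backwards in
   64-byte blocks.  Assumes count is a multiple of 64 and, if the buffers
   overlap, that dst is above src (the backwards memmove case).  */
static void copy64_backwards_model(unsigned char *dst,
                                   const unsigned char *src, size_t count)
{
    unsigned char *dstend = dst + count;
    const unsigned char *srcend = src + count;

    assert(count % 64 == 0);

    while (count != 0) {
        unsigned char a[16], b[16], c[16], d[16];

        /* Load the whole block before storing, so overlap stays safe.  */
        memcpy(a, srcend - 32, 16);
        memcpy(b, srcend - 16, 16);
        memcpy(c, srcend - 64, 16);
        memcpy(d, srcend - 48, 16);

        /* Four single stores, highest address first (B, A, D, C).  */
        memcpy(dstend - 16, b, 16);
        memcpy(dstend - 32, a, 16);
        memcpy(dstend - 48, d, 16);
        memcpy(dstend - 64, c, 16);

        /* In the patch the last store uses pre-index writeback, so this
           dstend update needs no separate SUB instruction.  */
        dstend -= 64;
        srcend -= 64;
        count -= 64;
    }
}
```

The model keeps the same register-to-offset mapping as the diff (A at -32, B at -16, C at -64, D at -48), so the only behavioural difference from the old STP form is the order in which the 16-byte halves are written.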