From patchwork Thu Jul 22 16:03:20 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 44460
To: "naohirot@fujitsu.com"
Subject: [PATCH v3 4/5] AArch64: Improve A64FX memset
Date: Thu, 22 Jul 2021 16:03:20 +0000
From: Wilco Dijkstra
Reply-To: Wilco Dijkstra
Cc: 'GNU C Library'
List-Id: Libc-alpha mailing list

Remove unroll32 code since it doesn't improve performance.

Reviewed-by: Naohiro Tamura
Tested-by: Naohiro Tamura

diff --git a/sysdeps/aarch64/multiarch/memset_a64fx.S b/sysdeps/aarch64/multiarch/memset_a64fx.S
index fce257fa68120c2b101f29b438c397e10b4c275e..8665c272431b46dadea53c63ab74829c3aa99312 100644
--- a/sysdeps/aarch64/multiarch/memset_a64fx.S
+++ b/sysdeps/aarch64/multiarch/memset_a64fx.S
@@ -102,22 +102,6 @@ L(vl_agnostic): // VL Agnostic
 	ccmp	vector_length, tmp1, 0, cs
 	b.eq	L(L1_prefetch)
 
-L(unroll32):
-	lsl	tmp1, vector_length, 3	// vector_length * 8
-	lsl	tmp2, vector_length, 5	// vector_length * 32
-	.p2align 3
-1:	cmp	rest, tmp2
-	b.cc	L(unroll8)
-	st1b_unroll
-	add	dst, dst, tmp1
-	st1b_unroll
-	add	dst, dst, tmp1
-	st1b_unroll
-	add	dst, dst, tmp1
-	st1b_unroll
-	add	dst, dst, tmp1
-	sub	rest, rest, tmp2
-	b	1b
 
 L(unroll8):
 	lsl	tmp1, vector_length, 3
@@ -155,7 +139,7 @@ L(L1_prefetch): // if rest >= L1_SIZE
 	sub	rest, rest, CACHE_LINE_SIZE * 2
 	cmp	rest, L1_SIZE
 	b.ge	1b
-	cbnz	rest, L(unroll32)
+	cbnz	rest, L(unroll8)
 	ret
 
 // count >= L2_SIZE