From patchwork Mon Oct 31 15:45:13 2022
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 59674
X-Patchwork-Delegate: rearnsha@gcc.gnu.org
From: Daniel Engel
To: Richard Earnshaw, gcc-patches@gcc.gnu.org
Cc: Daniel Engel, Christophe Lyon
Subject: [PATCH v7 18/34] Merge Thumb-2 optimizations for 64-bit comparison
Date: Mon, 31 Oct 2022 08:45:13 -0700
Message-Id: <20221031154529.3627576-19-gnu@danielengel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20221031154529.3627576-1-gnu@danielengel.com>
References: <20221031154529.3627576-1-gnu@danielengel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

This effectively merges support for all architecture variants into a
common function path with appropriate build conditions.
ARM performance is 1-2 instructions faster; Thumb-2 is about 50% faster.

gcc/libgcc/ChangeLog:
2022-10-09 Daniel Engel

        * config/arm/bpabi.S (__aeabi_lcmp, __aeabi_ulcmp): Removed.
        * config/arm/eabi/lcmp.S (__aeabi_lcmp, __aeabi_ulcmp): Added
        conditional execution on supported architectures (__HAVE_FEATURE_IT).
        * config/arm/lib1funcs.S: Moved #include scope of eabi/lcmp.S.
---
 libgcc/config/arm/bpabi.S     | 42 -------------------------------
 libgcc/config/arm/eabi/lcmp.S | 47 ++++++++++++++++++++++++++++++++++-
 libgcc/config/arm/lib1funcs.S |  2 +-
 3 files changed, 47 insertions(+), 44 deletions(-)

diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 17fe707ddf3..531a64fa98d 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -34,48 +34,6 @@
         .eabi_attribute 25, 1
 #endif /* __ARM_EABI__ */
 
-#ifdef L_aeabi_lcmp
-
-ARM_FUNC_START aeabi_lcmp
-        cmp     xxh, yyh
-        do_it   lt
-        movlt   r0, #-1
-        do_it   gt
-        movgt   r0, #1
-        do_it   ne
-        RETc(ne)
-        subs    r0, xxl, yyl
-        do_it   lo
-        movlo   r0, #-1
-        do_it   hi
-        movhi   r0, #1
-        RET
-        FUNC_END aeabi_lcmp
-
-#endif /* L_aeabi_lcmp */
-
-#ifdef L_aeabi_ulcmp
-
-ARM_FUNC_START aeabi_ulcmp
-        cmp     xxh, yyh
-        do_it   lo
-        movlo   r0, #-1
-        do_it   hi
-        movhi   r0, #1
-        do_it   ne
-        RETc(ne)
-        cmp     xxl, yyl
-        do_it   lo
-        movlo   r0, #-1
-        do_it   hi
-        movhi   r0, #1
-        do_it   eq
-        moveq   r0, #0
-        RET
-        FUNC_END aeabi_ulcmp
-
-#endif /* L_aeabi_ulcmp */
-
 .macro test_div_by_zero signed
 /* Tail-call to divide-by-zero handlers which may be overridden by the user,
    so unwinding works properly.  */
diff --git a/libgcc/config/arm/eabi/lcmp.S b/libgcc/config/arm/eabi/lcmp.S
index 99c7970ecba..d397325cbef 100644
--- a/libgcc/config/arm/eabi/lcmp.S
+++ b/libgcc/config/arm/eabi/lcmp.S
@@ -46,6 +46,19 @@ FUNC_START_SECTION LCMP_NAME LCMP_SECTION
         subs    xxl, yyl
         sbcs    xxh, yyh
 
+    #ifdef __HAVE_FEATURE_IT
+        do_it   lt,t
+
+      #ifdef L_aeabi_lcmp
+        movlt   r0, #-1
+      #else
+        movlt   r0, #0
+      #endif
+
+        // Early return on '<'.
+        RETc(lt)
+
+    #else /* !__HAVE_FEATURE_IT */
         // With $r2 free, create a known offset value without affecting
         //  the N or Z flags.
         // BUG? The originally unified instruction for v6m was 'mov r2, r3'.
@@ -62,17 +75,27 @@ FUNC_START_SECTION LCMP_NAME LCMP_SECTION
         //  argument is larger, otherwise the offset value remains 0.
         adds    r2, #2
 
+    #endif
+
         // Check for zero (equality in 64 bits).
         // It doesn't matter which register was originally "hi".
         orrs    r0, r1
 
+    #ifdef __HAVE_FEATURE_IT
+        // The result is already 0 on equality.
+        // -1 already returned, so just force +1.
+        do_it   ne
+        movne   r0, #1
+
+    #else /* !__HAVE_FEATURE_IT */
         // The result is already 0 on equality.
         beq     LLSYM(__lcmp_return)
 
-  LLSYM(__lcmp_lt):
+    LLSYM(__lcmp_lt):
         // Create +1 or -1 from the offset value defined earlier.
         adds    r3, #1
         subs    r0, r2, r3
+    #endif
 
     LLSYM(__lcmp_return):
       #ifdef L_cmpdi2
@@ -111,21 +134,43 @@ FUNC_START_SECTION ULCMP_NAME ULCMP_SECTION
         subs    xxl, yyl
         sbcs    xxh, yyh
 
+    #ifdef __HAVE_FEATURE_IT
+        do_it   lo,t
+
+      #ifdef L_aeabi_ulcmp
+        movlo   r0, #-1
+      #else
+        movlo   r0, #0
+      #endif
+
+        // Early return on '<'.
+        RETc(lo)
+
+    #else
         // Capture the carry flg.
         // $r2 will contain -1 if the first value is smaller,
         //  0 if the first value is larger or equal.
         sbcs    r2, r2
+    #endif
 
         // Check for zero (equality in 64 bits).
         // It doesn't matter which register was originally "hi".
         orrs    r0, r1
 
+    #ifdef __HAVE_FEATURE_IT
+        // The result is already 0 on equality.
+        // -1 already returned, so just force +1.
+        do_it   ne
+        movne   r0, #1
+
+    #else /* !__HAVE_FEATURE_IT */
         // The result is already 0 on equality.
         beq     LLSYM(__ulcmp_return)
 
         // Assume +1.  If -1 is correct, $r2 will override.
         movs    r0, #1
         orrs    r0, r2
+    #endif
 
     LLSYM(__ulcmp_return):
       #ifdef L_ucmpdi2
diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index d85a20252d9..796f6f30ed9 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -1991,6 +1991,6 @@ LSYM(Lchange_\register):
 #include "bpabi.S"
 #else /* NOT_ISA_TARGET_32BIT */
 #include "bpabi-v6m.S"
-#include "eabi/lcmp.S"
 #endif /* NOT_ISA_TARGET_32BIT */
+#include "eabi/lcmp.S"
 #endif /* !__symbian__ */
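
For readers checking the flag logic of the __HAVE_FEATURE_IT path, the
following is a rough C model of what the L_aeabi_lcmp and L_aeabi_ulcmp
builds appear to compute. It is a sketch and not part of the patch: the
names model_lcmp, model_ulcmp and model_check are hypothetical, and the
mapping of each C statement to an instruction is an interpretation of the
diff. The non-AEABI builds, which load #0 instead of #-1 on the early
return, are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the signed IT path; not code from the patch. */
int model_lcmp(int64_t x, int64_t y)
{
    /* subs/sbcs leave the 'lt' condition true when x < y (signed). */
    if (x < y)
        return -1;                  /* movlt r0, #-1 ; RETc(lt) */

    /* The register pair now holds the 64-bit difference x - y. */
    uint64_t diff = (uint64_t)x - (uint64_t)y;

    /* orrs r0, r1: the OR of both halves is zero only on equality. */
    uint32_t r0 = (uint32_t)diff | (uint32_t)(diff >> 32);

    /* movne r0, #1: any nonzero value collapses to +1. */
    return r0 != 0 ? 1 : 0;
}

/* Hypothetical model of the unsigned IT path; not code from the patch. */
int model_ulcmp(uint64_t x, uint64_t y)
{
    /* 'lo' holds when the subs/sbcs pair produced a borrow, i.e. x < y. */
    if (x < y)
        return -1;                  /* movlo r0, #-1 ; RETc(lo) */

    uint64_t diff = x - y;
    uint32_t r0 = (uint32_t)diff | (uint32_t)(diff >> 32);
    return r0 != 0 ? 1 : 0;
}

/* Compare both models against C's relational operators for one pair. */
void model_check(int64_t x, int64_t y)
{
    uint64_t ux = (uint64_t)x;
    uint64_t uy = (uint64_t)y;

    assert(model_lcmp(x, y) == (x > y) - (x < y));
    assert(model_ulcmp(ux, uy) == (ux > uy) - (ux < uy));
}
```

Calling model_check() over a range of boundary values (INT64_MIN,
INT64_MAX, 0, and operands that differ only in the upper word) is one way
to confirm that the early return plus the single 'orrs' test covers all
three outcomes.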