From patchwork Mon Dec 27 19:05:14 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 49301
X-Patchwork-Delegate: rearnsha@gcc.gnu.org
From: Daniel Engel
To: Richard Earnshaw, gcc-patches@gcc.gnu.org
Cc: Daniel Engel, Christophe Lyon
Subject: [PATCH v6 18/34] Merge Thumb-2 optimizations for 64-bit comparison
Date: Mon, 27 Dec 2021 11:05:14 -0800
Message-Id: <20211227190530.3136549-19-gnu@danielengel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20211227190530.3136549-1-gnu@danielengel.com>
References: <20211227190530.3136549-1-gnu@danielengel.com>

This effectively merges support for all architecture variants into a
common function path with appropriate build conditions.
ARM performance is 1-2 instructions faster; Thumb-2 is about 50% faster.

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

	* config/arm/bpabi.S (__aeabi_lcmp, __aeabi_ulcmp): Removed.
	* config/arm/eabi/lcmp.S (__aeabi_lcmp, __aeabi_ulcmp): Added
	conditional execution on supported architectures (__ARM_FEATURE_IT).
	* config/arm/lib1funcs.S: Moved #include scope of eabi/lcmp.S.
---
 libgcc/config/arm/bpabi.S     | 42 -------------------------------
 libgcc/config/arm/eabi/lcmp.S | 47 ++++++++++++++++++++++++++++++++++-
 libgcc/config/arm/lib1funcs.S |  2 +-
 3 files changed, 47 insertions(+), 44 deletions(-)

diff --git a/libgcc/config/arm/bpabi.S b/libgcc/config/arm/bpabi.S
index 2cbb67d54ad..4281a2be594 100644
--- a/libgcc/config/arm/bpabi.S
+++ b/libgcc/config/arm/bpabi.S
@@ -34,48 +34,6 @@
 	.eabi_attribute 25, 1
 #endif /* __ARM_EABI__ */
 
-#ifdef L_aeabi_lcmp
-
-ARM_FUNC_START aeabi_lcmp
-	cmp	xxh, yyh
-	do_it	lt
-	movlt	r0, #-1
-	do_it	gt
-	movgt	r0, #1
-	do_it	ne
-	RETc(ne)
-	subs	r0, xxl, yyl
-	do_it	lo
-	movlo	r0, #-1
-	do_it	hi
-	movhi	r0, #1
-	RET
-	FUNC_END aeabi_lcmp
-
-#endif /* L_aeabi_lcmp */
-
-#ifdef L_aeabi_ulcmp
-
-ARM_FUNC_START aeabi_ulcmp
-	cmp	xxh, yyh
-	do_it	lo
-	movlo	r0, #-1
-	do_it	hi
-	movhi	r0, #1
-	do_it	ne
-	RETc(ne)
-	cmp	xxl, yyl
-	do_it	lo
-	movlo	r0, #-1
-	do_it	hi
-	movhi	r0, #1
-	do_it	eq
-	moveq	r0, #0
-	RET
-	FUNC_END aeabi_ulcmp
-
-#endif /* L_aeabi_ulcmp */
-
 .macro test_div_by_zero signed
 /* Tail-call to divide-by-zero handlers which may be overridden by the user,
    so unwinding works properly.  */
diff --git a/libgcc/config/arm/eabi/lcmp.S b/libgcc/config/arm/eabi/lcmp.S
index 2ac9d178b34..f1a9c3b8fe0 100644
--- a/libgcc/config/arm/eabi/lcmp.S
+++ b/libgcc/config/arm/eabi/lcmp.S
@@ -46,6 +46,19 @@ FUNC_START_SECTION LCMP_NAME LCMP_SECTION
         subs    xxl, yyl
         sbcs    xxh, yyh
 
+  #ifdef __HAVE_FEATURE_IT
+        do_it   lt,t
+
+    #ifdef L_aeabi_lcmp
+        movlt   r0, #-1
+    #else
+        movlt   r0, #0
+    #endif
+
+        // Early return on '<'.
+        RETc(lt)
+
+  #else /* !__HAVE_FEATURE_IT */
         // With $r2 free, create a known offset value without affecting
         //  the N or Z flags.
         // BUG? The originally unified instruction for v6m was 'mov r2, r3'.
@@ -62,17 +75,27 @@ FUNC_START_SECTION LCMP_NAME LCMP_SECTION
         //  argument is larger, otherwise the offset value remains 0.
         adds    r2, #2
 
+  #endif
+
         // Check for zero (equality in 64 bits).
         // It doesn't matter which register was originally "hi".
         orrs    r0, r1
 
+  #ifdef __HAVE_FEATURE_IT
+        // The result is already 0 on equality.
+        // -1 already returned, so just force +1.
+        do_it   ne
+        movne   r0, #1
+
+  #else /* !__HAVE_FEATURE_IT */
         // The result is already 0 on equality.
         beq     LLSYM(__lcmp_return)
 
-    LLSYM(__lcmp_lt):
+  LLSYM(__lcmp_lt):
         // Create +1 or -1 from the offset value defined earlier.
         adds    r3, #1
         subs    r0, r2, r3
+  #endif
 
   LLSYM(__lcmp_return):
     #ifdef L_cmpdi2
@@ -111,21 +134,43 @@ FUNC_START_SECTION ULCMP_NAME ULCMP_SECTION
         subs    xxl, yyl
         sbcs    xxh, yyh
 
+  #ifdef __HAVE_FEATURE_IT
+        do_it   lo,t
+
+    #ifdef L_aeabi_ulcmp
+        movlo   r0, -1
+    #else
+        movlo   r0, #0
+    #endif
+
+        // Early return on '<'.
+        RETc(lo)
+
+  #else
         // Capture the carry flg.
         // $r2 will contain -1 if the first value is smaller,
         //  0 if the first value is larger or equal.
         sbcs    r2, r2
+  #endif
 
         // Check for zero (equality in 64 bits).
         // It doesn't matter which register was originally "hi".
         orrs    r0, r1
 
+  #ifdef __HAVE_FEATURE_IT
+        // The result is already 0 on equality.
+        // -1 already returned, so just force +1.
+        do_it   ne
+        movne   r0, #1
+
+  #else /* !__HAVE_FEATURE_IT */
         // The result is already 0 on equality.
         beq     LLSYM(__ulcmp_return)
 
         // Assume +1. If -1 is correct, $r2 will override.
         movs    r0, #1
         orrs    r0, r2
+  #endif
 
   LLSYM(__ulcmp_return):
     #ifdef L_ucmpdi2
diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index 5e24d0a6749..f41354f811e 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -1991,6 +1991,6 @@ LSYM(Lchange_\register):
 #include "bpabi.S"
 #else /* NOT_ISA_TARGET_32BIT */
 #include "bpabi-v6m.S"
-#include "eabi/lcmp.S"
 #endif /* NOT_ISA_TARGET_32BIT */
+#include "eabi/lcmp.S"
 #endif /* !__symbian__ */
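
For review purposes, the __HAVE_FEATURE_IT path above can be cross-checked
against a small C model. This is only a sketch under stated assumptions, not
code from libgcc: the names ref_lcmp, ref_ulcmp, model_lcmp, model_ulcmp and
check_models are hypothetical, and the -1/0/+1 result contract is inferred
from the 'movlt r0, #-1' and 'movne r0, #1' instructions in the patch. The
__cmpdi2/__ucmpdi2 builds (the 'movlt r0, #0' branch) are not modeled, since
the code following LLSYM(__lcmp_return) is not part of this diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reference semantics assumed for this sketch: -1, 0 or +1. */
static int ref_lcmp(int64_t a, int64_t b)    { return (a > b) - (a < b); }
static int ref_ulcmp(uint64_t a, uint64_t b) { return (a > b) - (a < b); }

/* Hypothetical model of the IT-capable signed path (L_aeabi_lcmp build). */
static int model_lcmp(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t diff = ua - ub;               /* subs xxl, yyl ; sbcs xxh, yyh */
    unsigned n = (unsigned)(diff >> 63);   /* N flag: sign of the result */
    /* V flag: signed overflow of the 64-bit subtraction. */
    unsigned v = (unsigned)(((ua ^ ub) & (ua ^ diff)) >> 63);

    if (n != v)                            /* condition 'lt' */
        return -1;                         /* movlt r0, #-1 ; RETc(lt) */

    /* orrs r0, r1: zero only if both halves of the difference are zero. */
    uint32_t r0 = (uint32_t)diff | (uint32_t)(diff >> 32);
    return r0 != 0;                        /* do_it ne ; movne r0, #1 */
}

/* Hypothetical model of the IT-capable unsigned path (L_aeabi_ulcmp build). */
static int model_ulcmp(uint64_t a, uint64_t b)
{
    uint64_t diff = a - b;                 /* subs xxl, yyl ; sbcs xxh, yyh */
    unsigned carry = (a >= b);             /* C flag set means "no borrow" */

    if (!carry)                            /* condition 'lo' */
        return -1;                         /* movlo r0, -1 ; RETc(lo) */

    uint32_t r0 = (uint32_t)diff | (uint32_t)(diff >> 32);
    return r0 != 0;                        /* do_it ne ; movne r0, #1 */
}

/* Compare both models against the reference over boundary values. */
void check_models(void)
{
    static const uint64_t vals[] = {
        UINT64_C(0), UINT64_C(1), UINT64_C(2),
        UINT64_C(0x7fffffff), UINT64_C(0x80000000), UINT64_C(0xffffffff),
        UINT64_C(0x100000000), UINT64_C(0x7fffffffffffffff),
        UINT64_C(0x8000000000000000), UINT64_C(0xffffffffffffffff)
    };
    const size_t count = sizeof vals / sizeof vals[0];

    for (size_t i = 0; i < count; i++) {
        for (size_t j = 0; j < count; j++) {
            /* The unsigned-to-signed conversion assumes two's complement. */
            int64_t sa = (int64_t)vals[i], sb = (int64_t)vals[j];
            assert(model_lcmp(sa, sb) == ref_lcmp(sa, sb));
            assert(model_ulcmp(vals[i], vals[j]) == ref_ulcmp(vals[i], vals[j]));
        }
    }
}
```

Calling check_models() from a host-side test program exercises the sign,
overflow and word-boundary cases. The model also shows why the 'orrs' step is
needed: the Z flag left by 'sbcs' reflects only the high word, so 64-bit
equality has to be tested on both halves of the difference.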