From patchwork Mon Dec 27 19:05:13 2021
X-Patchwork-Submitter: Daniel Engel
X-Patchwork-Id: 49300
X-Patchwork-Delegate: rearnsha@gcc.gnu.org
From: Daniel Engel
To: Richard Earnshaw, gcc-patches@gcc.gnu.org
Cc: Daniel Engel, Christophe Lyon
Subject: [PATCH v6 17/34] Import 64-bit comparison from CM0 library
Date: Mon, 27 Dec 2021 11:05:13 -0800
Message-Id: <20211227190530.3136549-18-gnu@danielengel.com>
In-Reply-To: <20211227190530.3136549-1-gnu@danielengel.com>
References: <20211227190530.3136549-1-gnu@danielengel.com>

These are 2-5 instructions smaller and just as fast.  Branches are
minimized, which will allow easier adaptation to Thumb-2/ARM mode.

gcc/libgcc/ChangeLog:
2021-01-13 Daniel Engel

	* config/arm/eabi/lcmp.S (__aeabi_lcmp, __aeabi_ulcmp): Replaced;
	add macro configuration to build __cmpdi2() and __ucmpdi2().
	* config/arm/t-elf (LIB1ASMFUNCS): Added _cmpdi2 and _ucmpdi2.
---
 libgcc/config/arm/eabi/lcmp.S | 151 +++++++++++++++++++++++++---------
 libgcc/config/arm/t-elf       |   2 +
 2 files changed, 112 insertions(+), 41 deletions(-)

diff --git a/libgcc/config/arm/eabi/lcmp.S b/libgcc/config/arm/eabi/lcmp.S
index 336db1d398c..2ac9d178b34 100644
--- a/libgcc/config/arm/eabi/lcmp.S
+++ b/libgcc/config/arm/eabi/lcmp.S
@@ -1,8 +1,7 @@
-/* Miscellaneous BPABI functions.  Thumb-1 implementation, suitable for ARMv4T,
-   ARMv6-M and ARMv8-M Baseline like ISA variants.
+/* lcmp.S: Thumb-1 optimized 64-bit integer comparison

-   Copyright (C) 2006-2020 Free Software Foundation, Inc.
-   Contributed by CodeSourcery.
+   Copyright (C) 2018-2021 Free Software Foundation, Inc.
+   Contributed by Daniel Engel, Senva Inc (gnu@danielengel.com)

    This file is free software; you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by the
@@ -24,50 +23,120 @@
    .  */


+#if defined(L_aeabi_lcmp) || defined(L_cmpdi2)
+
 #ifdef L_aeabi_lcmp
+  #define LCMP_NAME aeabi_lcmp
+  #define LCMP_SECTION .text.sorted.libgcc.lcmp
+#else
+  #define LCMP_NAME cmpdi2
+  #define LCMP_SECTION .text.sorted.libgcc.cmpdi2
+#endif
+
+// int __aeabi_lcmp(long long, long long)
+// int __cmpdi2(long long, long long)
+// Compares the 64 bit signed values in $r1:$r0 and $r3:$r2.
+// lcmp() returns $r0 = { -1, 0, +1 } for orderings { <, ==, > } respectively.
+// cmpdi2() returns $r0 = { 0, 1, 2 } for orderings { <, ==, > } respectively.
+// Object file duplication assumes typical programs follow one runtime ABI.
+FUNC_START_SECTION LCMP_NAME LCMP_SECTION
+    CFI_START_FUNCTION
+
+        // Calculate the difference $r1:$r0 - $r3:$r2.
+        subs    xxl, yyl
+        sbcs    xxh, yyh
+
+        // With $r2 free, create a known offset value without affecting
+        //  the N or Z flags.
+        // BUG? The originally unified instruction for v6m was 'mov r2, r3'.
+        //  However, this resulted in a compile error with -mthumb:
+        //    "MOV Rd, Rs with two low registers not permitted".
+        // Since unified syntax deprecates the "cpy" instruction, shouldn't
+        //  there be a backwards-compatible translation available?
+        cpy     r2, r3
+
+        // Evaluate the comparison result.
+        blt     LLSYM(__lcmp_lt)
+
+        // The reference offset ($r2 - $r3) will be +2 iff the first
+        //  argument is larger, otherwise the offset value remains 0.
+        adds    r2, #2
+
+        // Check for zero (equality in 64 bits).
+        // It doesn't matter which register was originally "hi".
+        orrs    r0, r1
+
+        // The result is already 0 on equality.
+        beq     LLSYM(__lcmp_return)
+
+    LLSYM(__lcmp_lt):
+        // Create +1 or -1 from the offset value defined earlier.
+        adds    r3, #1
+        subs    r0, r2, r3
+
+    LLSYM(__lcmp_return):
+      #ifdef L_cmpdi2
+        // Offset to the correct output specification.
+        adds    r0, #1
+      #endif

-FUNC_START aeabi_lcmp
-        cmp     xxh, yyh
-        beq     1f
-        bgt     2f
-        movs    r0, #1
-        negs    r0, r0
-        RET
-2:
-        movs    r0, #1
-        RET
-1:
-        subs    r0, xxl, yyl
-        beq     1f
-        bhi     2f
-        movs    r0, #1
-        negs    r0, r0
-        RET
-2:
-        movs    r0, #1
-1:
         RET
-        FUNC_END aeabi_lcmp

-#endif /* L_aeabi_lcmp */
+    CFI_END_FUNCTION
+FUNC_END LCMP_NAME
+
+#endif /* L_aeabi_lcmp || L_cmpdi2 */
+
+
+#if defined(L_aeabi_ulcmp) || defined(L_ucmpdi2)

 #ifdef L_aeabi_ulcmp
+  #define ULCMP_NAME aeabi_ulcmp
+  #define ULCMP_SECTION .text.sorted.libgcc.ulcmp
+#else
+  #define ULCMP_NAME ucmpdi2
+  #define ULCMP_SECTION .text.sorted.libgcc.ucmpdi2
+#endif
+
+// int __aeabi_ulcmp(unsigned long long, unsigned long long)
+// int __ucmpdi2(unsigned long long, unsigned long long)
+// Compares the 64 bit unsigned values in $r1:$r0 and $r3:$r2.
+// ulcmp() returns $r0 = { -1, 0, +1 } for orderings { <, ==, > } respectively.
+// ucmpdi2() returns $r0 = { 0, 1, 2 } for orderings { <, ==, > } respectively.
+// Object file duplication assumes typical programs follow one runtime ABI.
+FUNC_START_SECTION ULCMP_NAME ULCMP_SECTION
+    CFI_START_FUNCTION
+
+        // Calculate the 'C' flag.
+        subs    xxl, yyl
+        sbcs    xxh, yyh
+
+        // Capture the carry flag.
+        // $r2 will contain -1 if the first value is smaller,
+        //  0 if the first value is larger or equal.
+        sbcs    r2, r2
+
+        // Check for zero (equality in 64 bits).
+        // It doesn't matter which register was originally "hi".
+        orrs    r0, r1
+
+        // The result is already 0 on equality.
+        beq     LLSYM(__ulcmp_return)
+
+        // Assume +1.  If -1 is correct, $r2 will override.
+        movs    r0, #1
+        orrs    r0, r2
+
+    LLSYM(__ulcmp_return):
+      #ifdef L_ucmpdi2
+        // Offset to the correct output specification.
+        adds    r0, #1
+      #endif

-FUNC_START aeabi_ulcmp
-        cmp     xxh, yyh
-        bne     1f
-        subs    r0, xxl, yyl
-        beq     2f
-1:
-        bcs     1f
-        movs    r0, #1
-        negs    r0, r0
-        RET
-1:
-        movs    r0, #1
-2:
         RET
-        FUNC_END aeabi_ulcmp

-#endif /* L_aeabi_ulcmp */
+    CFI_END_FUNCTION
+FUNC_END ULCMP_NAME
+
+#endif /* L_aeabi_ulcmp || L_ucmpdi2 */

diff --git a/libgcc/config/arm/t-elf b/libgcc/config/arm/t-elf
index 2e3f04aa2f0..83325410097 100644
--- a/libgcc/config/arm/t-elf
+++ b/libgcc/config/arm/t-elf
@@ -41,6 +41,8 @@ LIB1ASMFUNCS += \
 	_ffsdi2 \
 	_paritydi2 \
 	_popcountdi2 \
+	_cmpdi2 \
+	_ucmpdi2 \
 	_dvmd_tls \
 	_divsi3 \
 	_modsi3 \
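
Note (not part of the patch): the flag arithmetic in the two new routines can be checked without a Thumb-1 toolchain using a rough register-level model. The Python sketch below is only an illustration written from the comments in lcmp.S above. The names `sub_with_borrow`, `split64`, `lcmp_model` and `ulcmp_model` are hypothetical and do not exist in libgcc, and the model assumes the little-endian register assignment implied by the comments (xxl = r0, xxh = r1, yyl = r2, yyh = r3).

```python
MASK32 = 0xFFFFFFFF


def sub_with_borrow(a, b, carry_in):
    """Model of 32-bit SUBS (carry_in=1) / SBCS (carry_in=C flag).

    Returns (result, N, C, V), with C set when no borrow occurred.
    """
    total = a + (~b & MASK32) + carry_in
    result = total & MASK32
    n = result >> 31
    c = total >> 32
    v = (((a ^ b) & (a ^ result)) >> 31) & 1
    return result, n, c, v


def split64(value):
    """Split a 64-bit value into (low word, high word)."""
    value &= 0xFFFFFFFFFFFFFFFF
    return value & MASK32, value >> 32


def to_signed32(word):
    """Interpret a 32-bit register value as a signed int."""
    return word - (1 << 32) if word & 0x80000000 else word


def lcmp_model(x, y, cmpdi2=False):
    """Signed compare; cmpdi2=True models the L_cmpdi2 build."""
    r0, r1 = split64(x)
    r2, r3 = split64(y)
    r0, _, c, _ = sub_with_borrow(r0, r2, 1)   # subs xxl, yyl
    r1, n, _, v = sub_with_borrow(r1, r3, c)   # sbcs xxh, yyh
    r2 = r3                                    # cpy  r2, r3
    less_than = n != v                         # blt  taken when N != V
    if not less_than:
        r2 = (r2 + 2) & MASK32                 # adds r2, #2
        r0 |= r1                               # orrs r0, r1
    if less_than or r0 != 0:                   # beq skips this on equality
        r3 = (r3 + 1) & MASK32                 # adds r3, #1
        r0 = (r2 - r3) & MASK32                # subs r0, r2, r3
    if cmpdi2:
        r0 = (r0 + 1) & MASK32                 # adds r0, #1
    return to_signed32(r0)


def ulcmp_model(x, y, ucmpdi2=False):
    """Unsigned compare; ucmpdi2=True models the L_ucmpdi2 build."""
    r0, r1 = split64(x)
    r2, r3 = split64(y)
    r0, _, c, _ = sub_with_borrow(r0, r2, 1)   # subs xxl, yyl
    r1, _, c, _ = sub_with_borrow(r1, r3, c)   # sbcs xxh, yyh
    r2, _, _, _ = sub_with_borrow(r2, r2, c)   # sbcs r2, r2 (0 or all ones)
    r0 |= r1                                   # orrs r0, r1
    if r0 != 0:                                # beq skips this on equality
        r0 = 1 | r2                            # movs r0, #1 ; orrs r0, r2
    if ucmpdi2:
        r0 = (r0 + 1) & MASK32                 # adds r0, #1
    return to_signed32(r0)
```

In the signed model the +1 or -1 result comes entirely from the difference between r2 and r3, so the value of the high word of the second argument drops out. In the unsigned model the borrow from the 64-bit subtraction becomes an all-ones mask that turns the assumed +1 into -1.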