From patchwork Fri Jul 29 06:10:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 56410
From: "Roger Sayle"
To:
Subject: [x86_64 PATCH] Add rotl64ti2_doubleword pattern to i386.md
Date: Fri, 29 Jul 2022 07:10:33 +0100
Message-ID: <035101d8a311$ea4f0b90$beed22b0$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

This patch adds rot[lr]64ti2_doubleword patterns to the x86_64 backend,
to move splitting of 128-bit TImode rotates by 64 bits after reload,
matching what we now do for 64-bit DImode rotations by 32 bits with -m32.

In theory, moving when this rotation is split should have little
influence on code generation, but in practice "reload" sometimes
decides to make use of the increased flexibility to reduce the number
of registers used, and the code size, by using xchg.

For example:

__int128 x;
__int128 y;
__int128 a;
__int128 b;

void foo()
{
    unsigned __int128 t = x;
    t ^= a;
    t = (t<<64) | (t>>64);
    t ^= b;
    y = t;
}

Before:
	movq	x(%rip), %rsi
	movq	x+8(%rip), %rdi
	xorq	a(%rip), %rsi
	xorq	a+8(%rip), %rdi
	movq	%rdi, %rax
	movq	%rsi, %rdx
	xorq	b(%rip), %rax
	xorq	b+8(%rip), %rdx
	movq	%rax, y(%rip)
	movq	%rdx, y+8(%rip)
	ret

After:
	movq	x(%rip), %rax
	movq	x+8(%rip), %rdx
	xorq	a(%rip), %rax
	xorq	a+8(%rip), %rdx
	xchgq	%rdx, %rax
	xorq	b(%rip), %rax
	xorq	b+8(%rip), %rdx
	movq	%rax, y(%rip)
	movq	%rdx, y+8(%rip)
	ret

On some modern architectures this is a small win; on some older
architectures it is a small loss.  The decision about which code to
generate is made in "reload", and could probably be tweaked by register
preferencing.  The much bigger win is that (eventually) all TImode
shifts and rotates by constants will become potential candidates for
TImode STV.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2022-07-29  Roger Sayle

gcc/ChangeLog
	* config/i386/i386.md (define_expand <insn>ti3): For rotations by
	64 bits use new rot[lr]64ti2_doubleword pattern.
	(rot[lr]64ti2_doubleword): New post-reload splitter.

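As a side note on why a single xchg (or a pair of moves) is enough here:
rotating a 128-bit value by 64 bits, in either direction, only exchanges
its two 64-bit halves.  The following is a minimal C sketch of that
identity, not part of the patch; it assumes a compiler that provides
unsigned __int128 (as GCC does on x86_64), and the helper names rot64,
swap_halves and check_rot64 are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 128-bit value by 64 bits, written as in foo above.
   A left and a right rotate by 64 give the same result.  */
static inline unsigned __int128 rot64(unsigned __int128 t)
{
    return (t << 64) | (t >> 64);
}

/* Hypothetical helper: build the same result by exchanging the
   low and high 64-bit halves explicitly.  */
static inline unsigned __int128 swap_halves(unsigned __int128 t)
{
    uint64_t lo = (uint64_t)t;
    uint64_t hi = (uint64_t)(t >> 64);
    return ((unsigned __int128)lo << 64) | hi;
}

/* Self-check on one sample value; aborts via assert on mismatch.  */
static inline void check_rot64(void)
{
    unsigned __int128 t =
        ((unsigned __int128)0x0123456789abcdefULL << 64)
        | 0xfedcba9876543210ULL;

    assert(rot64(t) == swap_halves(t));  /* rotate == half swap */
    assert(rot64(rot64(t)) == t);        /* rotating twice is the identity */
}
```

Calling check_rot64() from a test driver exercises both properties.
When source and destination occupy the same register pair, the half
swap is an in-place exchange, which is the case the new splitter
handles with gen_swapdi.
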
Thanks again,
Roger

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index fab6aed..f1158e1 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -13820,6 +13820,8 @@
       if (const_1_to_63_operand (operands[2], VOIDmode))
         emit_insn (gen_ix86_<insn>ti3_doubleword
                    (operands[0], operands[1], operands[2]));
+      else if (CONST_INT_P (operands[2]) && INTVAL (operands[2]) == 64)
+        emit_insn (gen_<insn>64ti2_doubleword (operands[0], operands[1]));
       else
         {
           rtx amount = force_reg (QImode, operands[2]);
@@ -14045,6 +14047,24 @@
     }
 })
 
+(define_insn_and_split "<insn>64ti2_doubleword"
+ [(set (match_operand:TI 0 "register_operand" "=r,r,r")
+       (any_rotate:TI (match_operand:TI 1 "nonimmediate_operand" "0,r,o")
+                      (const_int 64)))]
+ "TARGET_64BIT"
+ "#"
+ "&& reload_completed"
+ [(set (match_dup 0) (match_dup 3))
+  (set (match_dup 2) (match_dup 1))]
+{
+  split_double_mode (TImode, &operands[0], 2, &operands[0], &operands[2]);
+  if (rtx_equal_p (operands[0], operands[1]))
+    {
+      emit_insn (gen_swapdi (operands[0], operands[2]));
+      DONE;
+    }
+})
+
 (define_mode_attr rorx_immediate_operand
   [(SI "const_0_to_31_operand")
    (DI "const_0_to_63_operand")])