From patchwork Tue Apr 19 11:58:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 53036
X-Original-To: gcc-patches@gcc.gnu.org
From: "Roger Sayle"
Subject: [x86_64 PATCH] PR middle-end/105135: Catch more cmov idioms in combine.
Date: Tue, 19 Apr 2022 12:58:41 +0100
Message-ID: <004601d853e4$d11a99e0$734fcda0$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

This patch addresses PR middle-end/105135, a missed-optimization
regression affecting mainline.  I agree with Jakub's comment that the
middle-end optimizations are sound, reducing basic blocks and conditional
expressions at the tree level, but they require backends to recognize
conditional move instructions/idioms if/when beneficial.  This patch
introduces two new define_insn_and_split patterns in i386.md to recognize
two additional cmov idioms.

The first recognizes (PR105135's):

int foo(int x, int y, int z)
{
  return ((x < y) << 5) + z;
}

and transforms the (6 insns, 13 bytes):

        xorl    %eax, %eax              ;; 2 bytes
        cmpl    %esi, %edi              ;; 2 bytes
        setl    %al                     ;; 3 bytes
        sall    $5, %eax                ;; 3 bytes
        addl    %edx, %eax              ;; 2 bytes
        ret                             ;; 1 byte

into (4 insns, 9 bytes):

        cmpl    %esi, %edi              ;; 2 bytes
        leal    32(%rdx), %eax          ;; 3 bytes
        cmovge  %edx, %eax              ;; 3 bytes
        ret                             ;; 1 byte

The second catches the very closely related (from PR 98865):

int bar(int x, int y, int z)
{
  return -(x < y) & z;
}

and transforms the (6 insns, 12 bytes):

        xorl    %eax, %eax              ;; 2 bytes
        cmpl    %esi, %edi              ;; 2 bytes
        setl    %al                     ;; 3 bytes
        negl    %eax                    ;; 2 bytes
        andl    %edx, %eax              ;; 2 bytes
        ret                             ;; 1 byte

into (4 insns, 8 bytes):

        xorl    %eax, %eax              ;; 2 bytes
        cmpl    %esi, %edi              ;; 2 bytes
        cmovl   %edx, %eax              ;; 3 bytes
        ret                             ;; 1 byte

Both patterns recognize a setcc followed by two instructions, and replace
the three with one instruction and a cmov, which is typically a
performance win, and always a size win.  Fine-tuning these decisions based
on microarchitecture is much easier in the backend than in the middle-end.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-04-19  Roger Sayle

gcc/ChangeLog
        PR target/105135
        * config/i386/i386.md (*xor_cmov): Transform setcc, negate
        then and into mov $0, followed by a cmov.
        (*lea_cmov): Transform setcc, ashift const then plus into
        lea followed by cmov.

gcc/testsuite/ChangeLog
        PR target/105135
        * gcc.target/i386/cmov10.c: New test case.
        * gcc.target/i386/cmov11.c: New test case.
        * gcc.target/i386/pr105135.c: New test case.

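As a source-level sanity check of the two rewrites described above, here
is a minimal sketch in C.  The names foo_select, bar_select and
idioms_agree are hypothetical and not part of the patch; the select forms
are hand-written stand-ins for what the lea+cmov and xor+cmov sequences
compute, and only a small input range is checked so that z + 32 stays
clear of signed overflow.

```c
#include <assert.h>

/* The two source idioms from the patch description.  */
int foo_setcc (int x, int y, int z) { return ((x < y) << 5) + z; }
int bar_setcc (int x, int y, int z) { return -(x < y) & z; }

/* Hypothetical select forms (assumption: these mirror what the
   lea+cmov and xor+cmov sequences compute).  */
int foo_select (int x, int y, int z) { return (x < y) ? z + 32 : z; }
int bar_select (int x, int y, int z) { return (x < y) ? z : 0; }

/* Return 1 if each idiom matches its select form over a small range.  */
int idioms_agree (void)
{
  for (int x = -4; x <= 4; x++)
    for (int y = -4; y <= 4; y++)
      for (int z = -64; z <= 64; z++)
        if (foo_setcc (x, y, z) != foo_select (x, y, z)
            || bar_setcc (x, y, z) != bar_select (x, y, z))
          return 0;
  return 1;
}

/* Convenience wrapper: aborts if the two forms ever disagree.  */
void check_idioms (void)
{
  assert (idioms_agree ());
}
```

Calling check_idioms () from a small driver is enough to exercise it.
This only illustrates the arithmetic identity; it says nothing about
which instruction sequence a given compiler version actually emits.
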
Thanks in advance,
Roger

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index c74edd1..5887688 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -20751,6 +20751,52 @@
   operands[9] = replace_rtx (operands[6], operands[0], operands[1], true);
 })
 
+;; Transform setcc;negate;and into mov_zero;cmov
+(define_insn_and_split "*xor_cmov"
+  [(set (match_operand:SWI248 0 "register_operand")
+        (and:SWI248
+          (neg:SWI248 (match_operator:SWI248 1 "ix86_comparison_operator"
+                        [(match_operand 2 "flags_reg_operand")
+                         (const_int 0)]))
+          (match_operand:SWI248 3 "register_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_CMOVE && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 4) (const_int 0))
+   (set (match_dup 0)
+        (if_then_else:SWI248 (match_op_dup 1 [(match_dup 2) (const_int 0)])
+                             (match_dup 3) (match_dup 4)))]
+{
+  operands[4] = gen_reg_rtx (mode);
+})
+
+;; Transform setcc;ashift_const;plus into lea_const;cmov
+(define_insn_and_split "*lea_cmov"
+  [(set (match_operand:SWI 0 "register_operand")
+        (plus:SWI (ashift:SWI (match_operator:SWI 1 "ix86_comparison_operator"
+                                [(match_operand 2 "flags_reg_operand")
+                                 (const_int 0)])
+                              (match_operand:SWI 3 "const_int_operand"))
+                  (match_operand:SWI 4 "register_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_CMOVE && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 5) (plus: (match_dup 4) (match_dup 6)))
+   (set (match_dup 0)
+        (if_then_else: (match_op_dup 1 [(match_dup 2) (const_int 0)])
+                       (match_dup 5) (match_dup 4)))]
+{
+  operands[5] = gen_reg_rtx (mode);
+  operands[6] = GEN_INT (1 << INTVAL (operands[3]));
+  if (mode != mode)
+    {
+      operands[0] = gen_lowpart (mode, operands[0]);
+      operands[4] = gen_lowpart (mode, operands[4]);
+    }
+})
+
 (define_insn "movhf_mask"
   [(set (match_operand:HF 0 "nonimmediate_operand" "=v,m,v")
         (unspec:HF
diff --git a/gcc/testsuite/gcc.target/i386/cmov10.c b/gcc/testsuite/gcc.target/i386/cmov10.c
new file mode 100644
index 0000000..c04fdd8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cmov10.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+
+int foo(int x, int y, int z)
+{
+  return ((x < y) << 5) + z;
+}
+
+/* { dg-final { scan-assembler "cmovge" } } */
diff --git a/gcc/testsuite/gcc.target/i386/cmov11.c b/gcc/testsuite/gcc.target/i386/cmov11.c
new file mode 100644
index 0000000..65f2bfc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/cmov11.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+
+int foo(int x, int y, int z)
+{
+  return -(x < y) & z;
+}
+
+/* { dg-final { scan-assembler "cmovl" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr105135.c b/gcc/testsuite/gcc.target/i386/pr105135.c
new file mode 100644
index 0000000..3ed3c9e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105135.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2" } */
+
+char to_lower_1(const char c) { return c + ((c >= 'A' && c <= 'Z') * 32); }
+
+char to_lower_2(const char c) { return c + (((c >= 'A') & (c <= 'Z')) * 32); }
+
+char to_lower_3(const char c) {
+    if (c >= 'A' && c <= 'Z') {
+        return c + 32;
+    }
+    return c;
+}
+
+/* { dg-final { scan-assembler-not "setbe" } } */
+/* { dg-final { scan-assembler-not "sall" } } */
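
The three to_lower variants in pr105135.c are written differently but are
intended to compute the same result, which is why the test expects the
same cmov-style code for all of them.  The following is a minimal sketch
in C that checks this at run time; variants_agree and check_variants are
hypothetical helpers and are not part of the patch or the testsuite.

```c
#include <assert.h>
#include <limits.h>

/* The three variants, copied from pr105135.c.  */
char to_lower_1 (const char c) { return c + ((c >= 'A' && c <= 'Z') * 32); }

char to_lower_2 (const char c) { return c + (((c >= 'A') & (c <= 'Z')) * 32); }

char to_lower_3 (const char c)
{
  if (c >= 'A' && c <= 'Z')
    return c + 32;
  return c;
}

/* Hypothetical helper: return 1 if all three variants agree for every
   value of char, using the branchy form as the reference.  */
int variants_agree (void)
{
  for (int i = CHAR_MIN; i <= CHAR_MAX; i++)
    {
      char c = (char) i;
      char expected = to_lower_3 (c);
      if (to_lower_1 (c) != expected || to_lower_2 (c) != expected)
        return 0;
    }
  return 1;
}

/* Convenience wrapper: aborts if any variant disagrees.  */
void check_variants (void)
{
  assert (variants_agree ());
}
```

This checks only the functional equivalence of the sources; the
scan-assembler-not directives in the test remain the actual check that
setbe and sall no longer appear in the generated code.
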