From patchwork Sat Oct 1 04:52:12 2022
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 58252
Message-ID: <36f8c642-9cc5-9fb5-5e76-e01a001f57f7@gmail.com>
Date: Fri, 30 Sep 2022 22:52:12 -0600
Subject: [committed][PATCH] Improve Z flag handling on H8
To: "gcc-patches@gcc.gnu.org"
From: Jeff Law

This patch improves handling of the Z bit in the status register in a variety of ways to improve either the code size or code speed on various H8 subtargets.
For example, we can test the zero/nonzero status of the upper byte of a 16-bit register using mov.b, and we can move Z or an inverted Z into a QImode register profitably on some subtargets. We can move Z or an inverted Z into the sign bit on the H8/SX profitably, etc.

I've actually had this patch in my tester for over a year, but got crazy busy and hadn't bothered to upstream it until now. Naturally it has been working all that time without regressions.

Pushed to the trunk.

Jeff

commit 2555071c954aa5796eb3432c15739dcaae457bc3
Author: Jeff Law
Date:   Sat Oct 1 00:42:15 2022 -0400

    Improve Z flag handling on H8

    This patch improves handling of the Z bit in the status register in a
    variety of ways to improve either the code size or code speed on various
    H8 subtargets.  For example, we can test the zero/nonzero status of the
    upper byte of a 16 bit register using mov.b, we can move the Z or an
    inverted Z into a QImode register profitably on some subtargets.  We can
    move Z or an inverted Z into the sign bit on the H8/SX profitably, etc.

    gcc/
            * config/h8300/h8300.md (HSI2): New iterator.
            (eqne_invert): Similarly.
            * config/h8300/testcompare.md (testhi_upper_z): New pattern.
            (cmpqi_z, cmphi_z, cmpsi_z): Likewise.
            (store_z_qi, store_z_i_qi, store_z_hi, store_z_hi_sb): New
            define_insn_and_splits and/or define_insns.
            (store_z_hi_neg, store_z_hi_and, store_z_<mode>): Likewise.
            (store_z_<mode>_neg, store_z_<mode>_and, store_z): Likewise.
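As a rough illustration of what the description above is getting at, the C sketch below lists source-level shapes that plausibly correspond to the new patterns (a comparison result stored in a byte or word, negated, shifted into the sign bit, masked, or taken from the upper byte), followed by a small model of the stc/shift/mask idea. All function names and the `CCR_Z_BIT` macro are hypothetical and not part of the patch. Whether GCC actually selects a given pattern for a given function depends on the subtarget and options, so treat the mapping in the comments as an assumption. The model relies only on Z living in bit 2 of the CCR, which is what the `bld #2` and `shar #2` sequences in the patch operate on.

```c
#include <stdint.h>

/* Source-level shapes (hypothetical names).  The pattern named in each
   comment is an assumption about what the compiler might match.  */

/* Z stored into a byte: compare *store_z_qi.  */
uint8_t eq_to_byte(int16_t a, int16_t b) { return a == b; }

/* Inverted Z stored into a byte: compare *store_z_i_qi.  */
uint8_t ne_to_byte(int16_t a, int16_t b) { return a != b; }

/* Z stored into a 16-bit value: compare *store_z_hi.  */
int16_t eq_to_word(int16_t a, int16_t b) { return a == b; }

/* Negated result, 0 or -1: compare *store_z_hi_neg.  */
int16_t eq_negated(int16_t a, int16_t b) { return (int16_t)-(a == b); }

/* Result placed in the sign bit: compare *store_z_hi_sb.  */
uint16_t eq_to_sign_bit(int16_t a, int16_t b)
{
  return (uint16_t)((uint16_t)(a == b) << 15);
}

/* Result ANDed with another value: compare *store_z_hi_and.  */
int16_t eq_masked(int16_t a, int16_t b, int16_t m)
{
  return (int16_t)((a == b) & m);
}

/* Upper-byte test; the mask 0xff00 is (const_int -256) in HImode.  */
uint16_t upper_byte_nonzero(uint16_t x) { return (x & 0xff00u) != 0; }

/* Model of "stc ccr,r; shift right by 2; and #1": isolate the Z bit
   from a copy of the condition code register.  */
#define CCR_Z_BIT 2

uint8_t z_from_ccr(uint8_t ccr)
{
  return (uint8_t)((ccr >> CCR_Z_BIT) & 1u);
}

/* Inverted Z, as the bild/bist variants produce.  */
uint8_t inverted_z_from_ccr(uint8_t ccr)
{
  return (uint8_t)(z_from_ccr(ccr) ^ 1u);
}
```

For instance, `z_from_ccr(0x04)` yields 1 and `eq_to_sign_bit(1, 1)` yields 0x8000; the other CCR bits do not affect the extracted value.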
diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index 2da7e7d8a7d..f592af1d5f7 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -221,6 +221,7 @@
 (define_mode_iterator QHI [QI HI])
 
 (define_mode_iterator HSI [HI SI])
+(define_mode_iterator HSI2 [HI SI])
 
 (define_mode_iterator QHSI [QI HI SI])
 (define_mode_iterator QHSI2 [QI HI SI])
@@ -236,6 +237,7 @@
 (define_code_iterator ors [ior xor])
 
 (define_code_iterator eqne [eq ne])
+(define_code_attr eqne_invert [(eq "ne") (ne "eq")])
 
 ;; For storing the C flag, map from the unsigned comparison to the right
 ;; code for testing the C bit.
diff --git a/gcc/config/h8300/testcompare.md b/gcc/config/h8300/testcompare.md
index 0ee3e360bea..81dce1d0bc1 100644
--- a/gcc/config/h8300/testcompare.md
+++ b/gcc/config/h8300/testcompare.md
@@ -61,6 +61,15 @@
   "mov.b	%t0,%t0"
   [(set_attr "length" "2")])
 
+(define_insn "*tsthi_upper_z"
+  [(set (reg:CCZ CC_REG)
+        (compare (and:HI (match_operand:HI 0 "register_operand" "r")
+                         (const_int -256))
+                 (const_int 0)))]
+  "reload_completed"
+  "mov.b	%t0,%t0"
+  [(set_attr "length" "2")])
+
 (define_insn "*tstsi_upper"
   [(set (reg:CCZN CC_REG)
         (compare (and:SI (match_operand:SI 0 "register_operand" "r")
@@ -86,6 +95,30 @@
   }
   [(set_attr "length_table" "add")])
 
+(define_insn "*cmpqi_z"
+  [(set (reg:CCZ CC_REG)
+        (eq (match_operand:QI 0 "h8300_dst_operand" "rQ")
+            (match_operand:QI 1 "h8300_src_operand" "rQi")))]
+  "reload_completed"
+  { return "cmp.b	%X1,%X0"; }
+  [(set_attr "length_table" "add")])
+
+(define_insn "*cmphi_z"
+  [(set (reg:CCZ CC_REG)
+        (eq (match_operand:HI 0 "h8300_dst_operand" "rQ")
+            (match_operand:HI 1 "h8300_src_operand" "rQi")))]
+  "reload_completed"
+  { return "cmp.w	%T1,%T0"; }
+  [(set_attr "length_table" "add")])
+
+(define_insn "*cmpsi_z"
+  [(set (reg:CCZ CC_REG)
+        (eq (match_operand:SI 0 "h8300_dst_operand" "rQ")
+            (match_operand:SI 1 "h8300_src_operand" "rQi")))]
+  "reload_completed"
+  { return "cmp.l	%S1,%S0"; }
+  [(set_attr "length_table" "add")])
+
 (define_insn "*cmpqi"
   [(set (reg:CC CC_REG)
         (compare (match_operand:QI 0 "h8300_dst_operand" "rQ")
@@ -209,6 +242,8 @@
           return "xor.l\t%S0,%S0\;bist\t#0,%w0";
         gcc_unreachable ();
       }
+    else
+      gcc_unreachable ();
   }
   [(set (attr "length") (symbol_ref "<MODE>mode == SImode ? 6 : 4"))])
 
@@ -340,3 +375,235 @@
         (ashift:QHSI (:QHSI (reg:CCC CC_REG) (const_int 0))
                      (match_dup 3)))])
 
+;; Storing Z into a QImode destination is fairly easy on the H8/S and
+;; newer as the stc; shift; mask is just 3 insns/6 bytes. On the H8/300H
+;; it is 4 insns/8 bytes which is a speed improvement, but a size
+;; regression relative to the branchy sequence
+;;
+;; Storing inverted Z in QImode is not profitable on the H8/300H, but
+;; is a speed improvement on the H8S.
+(define_insn_and_split "*store_z_qi"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (eq:QI (match_operand:HI 1 "register_operand" "r")
+               (match_operand:HI 2 "register_operand" "r")))]
+  "TARGET_H8300S || !optimize_size"
+  "#"
+  "&& reload_completed"
+  [(set (reg:CCZ CC_REG)
+        (eq:CCZ (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (ne:QI (reg:CCZ CC_REG) (const_int 0)))])
+
+(define_insn_and_split "*store_z_i_qi"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (ne:QI (match_operand:HI 1 "register_operand" "r")
+               (match_operand:HI 2 "register_operand" "r")))]
+  "TARGET_H8300S"
+  "#"
+  "&& reload_completed"
+  [(set (reg:CCZ CC_REG)
+        (eq:CCZ (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (eq:QI (reg:CCZ CC_REG) (const_int 0)))])
+
+(define_insn "*store_z_qi"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (ne:QI (reg:CCZ CC_REG) (const_int 0)))]
+  "(TARGET_H8300S || !optimize_size) && reload_completed"
+  {
+    if (TARGET_H8300S)
+      return "stc\tccr,%X0\;shar\t#2,%X0\;and\t#0x1,%X0";
+    else
+      return "stc\tccr,%X0\;shar\t%X0\;shar\t%X0\;and\t#0x1,%X0";
+  }
+  [(set (attr "length") (symbol_ref "TARGET_H8300S ? 6 : 8"))])
+
+(define_insn "*store_z_i_qi"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+        (eq:QI (reg:CCZ CC_REG) (const_int 0)))]
+  "(TARGET_H8300S || !optimize_size) && reload_completed"
+  "stc\tccr,%X0\;bld\t#2,%X0\;xor.w\t%T0,%T0\;bist\t#0,%X0";
+  [(set_attr "length" "8")])
+
+;; Storing Z or an inverted Z into a HImode destination is
+;; profitable on the H8/S and older variants, but not on the
+;; H8/SX where the branchy sequence can use the two-byte
+;; mov-immediate that is specific to the H8/SX
+(define_insn_and_split "*store_z_hi"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (eqne:HSI (match_operand:HSI2 1 "register_operand" "r")
+                  (match_operand:HSI2 2 "register_operand" "r")))]
+  "!TARGET_H8300SX"
+  "#"
+  "&& reload_completed"
+  [(set (reg:CCZ CC_REG)
+        (eq:CCZ (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (<eqne_invert>:HSI (reg:CCZ CC_REG) (const_int 0)))])
+
+;; Similar, but putting the result into the sign bit
+(define_insn_and_split "*store_z_hi_sb"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (ashift:HSI (eqne:HSI (match_operand:HSI2 1 "register_operand" "r")
+                              (match_operand:HSI2 2 "register_operand" "r"))
+                    (const_int 15)))]
+  "!TARGET_H8300SX"
+  "#"
+  "&& reload_completed"
+  [(set (reg:CCZ CC_REG)
+        (eq:CCZ (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (ashift:HSI (<eqne_invert>:HSI (reg:CCZ CC_REG) (const_int 0))
+                    (const_int 15)))])
+
+;; Similar, but negating the result
+(define_insn_and_split "*store_z_hi_neg"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (neg:HSI (eqne:HSI (match_operand:HSI2 1 "register_operand" "r")
+                           (match_operand:HSI2 2 "register_operand" "r"))))]
+  "!TARGET_H8300SX"
+  "#"
+  "&& reload_completed"
+  [(set (reg:CCZ CC_REG)
+        (eq:CCZ (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (neg:HSI (<eqne_invert>:HSI (reg:CCZ CC_REG) (const_int 0))))])
+
+(define_insn_and_split "*store_z_hi_and"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (and:HSI (eqne:HSI (match_operand:HSI2 1 "register_operand" "r")
+                           (match_operand:HSI2 2 "register_operand" "r"))
+                 (match_operand:HSI 3 "register_operand" "r")))]
+  "!TARGET_H8300SX"
+  "#"
+  "&& reload_completed"
+  [(set (reg:CCZ CC_REG)
+        (eq:CCZ (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (and:HSI (<eqne_invert>:HSI (reg:CCZ CC_REG) (const_int 0))
+                 (match_dup 3)))])
+
+(define_insn "*store_z_<mode>"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (eqne:HSI (reg:CCZ CC_REG) (const_int 0)))]
+  "!TARGET_H8300SX"
+  {
+    if (<MODE>mode == HImode)
+      {
+        if (<CODE> == NE)
+          {
+            if (TARGET_H8300S)
+              return "stc\tccr,%X0\;shlr.b\t#2,%X0\;and.w\t#1,%T0";
+            return "stc\tccr,%X0\;bld\t#2,%X0\;xor.w\t%T0,%T0\;bst\t#0,%X0";
+          }
+        else
+          return "stc\tccr,%X0\;bld\t#2,%X0\;xor.w\t%T0,%T0\;bist\t#0,%X0";
+      }
+    else if (<MODE>mode == SImode)
+      {
+        if (<CODE> == NE)
+          {
+            if (TARGET_H8300S)
+              return "stc\tccr,%X0\;shlr.b\t#2,%X0\;and.l\t#1,%S0";
+            return "stc\tccr,%X0\;bld\t#2,%X0\;xor.l\t%S0,%S0\;bst\t#0,%X0";
+          }
+        else
+          return "stc\tccr,%X0\;bld\t#2,%X0\;xor.l\t%S0,%S0\;bist\t#0,%X0";
+      }
+    gcc_unreachable ();
+  }
+;; XXXSImode is 2 bytes longer
+  [(set_attr "length" "8")])
+
+(define_insn "*store_z_<mode>_sb"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (ashift:HSI (eqne:HSI (reg:CCZ CC_REG) (const_int 0))
+                    (const_int 15)))]
+  "!TARGET_H8300SX"
+  {
+    if (<MODE>mode == HImode)
+      {
+        if (<CODE> == NE)
+          return "stc\tccr,%X0\;bld\t#2,%X0\;xor.w\t%T0,%T0\;bst\t#7,%t0";
+        else
+          return "stc\tccr,%X0\;bld\t#2,%X0\;xor.w\t%T0,%T0\;bist\t#7,%t0";
+      }
+    else if (<MODE>mode == SImode)
+      {
+        if (<CODE> == NE)
+          return "stc\tccr,%X0\;bld\t#2,%X0\;xor.l\t%T0,%T0\;rotxr.l\t%S0";
+        else
+          return "stc\tccr,%X0\;bild\t#2,%X0\;xor.l\t%T0,%T0\;rotxr.l\t%S0";
+      }
+    gcc_unreachable ();
+  }
+  ;; XXX SImode is larger
+  [(set_attr "length" "8")])
+
+(define_insn "*store_z_<mode>_neg"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (neg:HSI (eqne:HSI (reg:CCZ CC_REG) (const_int 0))))]
+  "!TARGET_H8300SX"
+  {
+    if (<MODE>mode == HImode)
+      {
+        if (<CODE> == NE)
+          return "stc\tccr,%X0\;bld\t#2,%X0\;subx.b\t%X0,%X0\;exts.w\t%T0";
+        else
+          return "stc\tccr,%X0\;bild\t#2,%X0\;subx.b\t%X0,%X0\;exts.w\t%T0";
+      }
+    else if (<MODE>mode == SImode)
+      {
+        if (<CODE> == NE)
+          return "stc\tccr,%X0\;bld\t#2,%X0\;subx.b\t%X0,%X0\;exts.w\t%T0\;exts.l\t%S0";
+        else
+          return "stc\tccr,%X0\;bild\t#2,%X0\;subx.b\t%X0,%X0\;exts.w\t%T0\;exts.l\t%S0";
+      }
+    gcc_unreachable ();
+  }
+  ;; XXX simode is an instruction longer
+  [(set_attr "length" "8")])
+
+(define_insn "*store_z_<mode>_and"
+  [(set (match_operand:HSI 0 "register_operand" "=r")
+        (and:HSI (eqne:HSI (reg:CCZ CC_REG) (const_int 0))
+                 (match_operand:HSI 1 "register_operand" "r")))]
+  "!TARGET_H8300SX"
+  {
+    if (<MODE>mode == HImode)
+      {
+        if (<CODE> == NE)
+          return "bld\t#0,%X1\;stc\tccr,%X0\;band\t#2,%X0\;xor.w\t%T0,%T0\;bst\t#0,%X0";
+        else
+          return "bild\t#0,%X1\;stc\tccr,%X0\;band\t#2,%X0\;xor.w\t%T0,%T0\;bist\t#0,X0";
+      }
+    else if (<MODE>mode == SImode)
+      {
+        if (<CODE> == NE)
+          return "bld\t#0,%X1\;stc\tccr,%X0\;band\t#2,%X0\;xor.l\t%S0,%S0\;bst\t#0,%X0";
+        else
+          return "bild\t#0,%X1\;stc\tccr,%X0\;band\t#2,%X0\;xor.l\t%S0,%S0\;bist\t#0,X0";
+      }
+    gcc_unreachable ();
+  }
+  ;; XXX simode is an instruction longer
+  [(set_attr "length" "8")])
+
+;; We can test the upper byte of a HImode register and the upper word
+;; of a SImode register
+
+;; We can test the upper byte of a HImode register and the upper word
+;; of a SImode register
+(define_insn_and_split "*store_z"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (eqne:HI (and:HI (match_operand:HI 1 "register_operand" "r")
+                         (const_int -256))
+                 (const_int 0)))]
+  "!TARGET_H8300SX"
+  "#"
+  "&& reload_completed"
+  [(set (reg:CCZ CC_REG)
+        (compare (and:HI (match_dup 1) (const_int -256))
+                 (const_int 0)))
+   (set (match_dup 0)
+        (<eqne_invert>:HI (reg:CCZ CC_REG) (const_int 0)))])