From patchwork Fri Jun 10 04:17:40 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takayuki 'January June' Suwa
X-Patchwork-Id: 54999
Date: Fri, 10 Jun 2022 13:17:40 +0900
To: GCC Patches
Subject: [PATCH 1/4] xtensa: Tweak some widen multiplications
From: Takayuki 'January June' Suwa

umulsidi3 is faster than umuldi3 even when it is a library call, and it is
also a prerequisite for fast constant division by multiplication.

gcc/ChangeLog:

	* config/xtensa/xtensa.md (mulsidi3, umulsidi3):
	Split into individual signedness, in order to use libcall
	"__umulsidi3" but not the other.
	(<u>mulhisi3): Merge into one by using code iterator.
	(<u>mulsidi3, mulhisi3, umulhisi3): Remove.
---
 gcc/config/xtensa/xtensa.md | 56 +++++++++++++++++++++----------------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 8ff6f9a95fe..33cbd546de3 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -224,20 +224,42 @@

 ;; Multiplication.

-(define_expand "<u>mulsidi3"
+(define_expand "mulsidi3"
   [(set (match_operand:DI 0 "register_operand")
-        (mult:DI (any_extend:DI (match_operand:SI 1 "register_operand"))
-                 (any_extend:DI (match_operand:SI 2 "register_operand"))))]
+        (mult:DI (sign_extend:DI (match_operand:SI 1 "register_operand"))
+                 (sign_extend:DI (match_operand:SI 2 "register_operand"))))]
   "TARGET_MUL32_HIGH"
 {
   rtx temp = gen_reg_rtx (SImode);
   emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
-  emit_insn (gen_<u>mulsi3_highpart (gen_highpart (SImode, operands[0]),
-                                     operands[1], operands[2]));
+  emit_insn (gen_mulsi3_highpart (gen_highpart (SImode, operands[0]),
+                                  operands[1], operands[2]));
   emit_insn (gen_movsi (gen_lowpart (SImode, operands[0]), temp));
   DONE;
 })
 
+(define_expand "umulsidi3"
+  [(set (match_operand:DI 0 "register_operand")
+        (mult:DI (zero_extend:DI (match_operand:SI 1 "register_operand"))
+                 (zero_extend:DI (match_operand:SI 2 "register_operand"))))]
+  ""
+{
+  if (TARGET_MUL32_HIGH)
+    {
+      rtx temp = gen_reg_rtx (SImode);
+      emit_insn (gen_mulsi3 (temp, operands[1], operands[2]));
+      emit_insn (gen_umulsi3_highpart (gen_highpart (SImode, operands[0]),
+                                       operands[1], operands[2]));
+      emit_insn (gen_movsi (gen_lowpart (SImode, operands[0]), temp));
+    }
+  else
+    emit_library_call_value (gen_rtx_SYMBOL_REF (Pmode, "__umulsidi3"),
+                             operands[0], LCT_NORMAL, DImode,
+                             operands[1], SImode,
+                             operands[2], SImode);
+  DONE;
+})
+
 (define_insn "<u>mulsi3_highpart"
   [(set (match_operand:SI 0 "register_operand" "=a")
         (truncate:SI
@@ -261,30 +283,16 @@
    (set_attr "mode" "SI")
    (set_attr "length" "3")])
 
-(define_insn "mulhisi3"
-  [(set (match_operand:SI 0 "register_operand" "=C,A")
-        (mult:SI (sign_extend:SI
-                  (match_operand:HI 1 "register_operand" "%r,r"))
-                 (sign_extend:SI
-                  (match_operand:HI 2 "register_operand" "r,r"))))]
-  "TARGET_MUL16 || TARGET_MAC16"
-  "@
-   mul16s\t%0, %1, %2
-   mul.aa.ll\t%1, %2"
-  [(set_attr "type" "mul16,mac16")
-   (set_attr "mode" "SI")
-   (set_attr "length" "3,3")])
-
-(define_insn "umulhisi3"
+(define_insn "<u>mulhisi3"
   [(set (match_operand:SI 0 "register_operand" "=C,A")
-        (mult:SI (zero_extend:SI
+        (mult:SI (any_extend:SI
                   (match_operand:HI 1 "register_operand" "%r,r"))
-                 (zero_extend:SI
+                 (any_extend:SI
                   (match_operand:HI 2 "register_operand" "r,r"))))]
   "TARGET_MUL16 || TARGET_MAC16"
   "@
-   mul16u\t%0, %1, %2
-   umul.aa.ll\t%1, %2"
+   mul16<su>\t%0, %1, %2
+   <u>mul.aa.ll\t%1, %2"
   [(set_attr "type" "mul16,mac16")
    (set_attr "mode" "SI")
    (set_attr "length" "3,3")])

From patchwork Fri Jun 10 04:18:24 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takayuki 'January June' Suwa
X-Patchwork-Id: 55001
Message-ID: <842bf876-2f00-7216-b6f4-e625a13868dc@yahoo.co.jp>
Date: Fri, 10 Jun 2022 13:18:24 +0900
To: GCC Patches
Subject: [PATCH 2/4] xtensa: Consider the Loop Option when setmemsi is
 expanded to small loop
From: Takayuki 'January June' Suwa

When the Loop Option is configured and active, the small loop expansion
now applies to aligned blocks of almost any size.

gcc/ChangeLog:

	* config/xtensa/xtensa.cc (xtensa_expand_block_set_small_loop):
	Pass through the block length / loop count conditions if
	zero-overhead looping is configured and active.
---
 gcc/config/xtensa/xtensa.cc | 65 +++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 20 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index c7b54babc37..616ced3ed38 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -1483,7 +1483,7 @@ xtensa_expand_block_set_unrolled_loop (rtx *operands)
 int
 xtensa_expand_block_set_small_loop (rtx *operands)
 {
-  HOST_WIDE_INT bytes, value, align;
+  HOST_WIDE_INT bytes, value, align, count;
   int expand_len, funccall_len;
   rtx x, dst, end, reg;
   machine_mode unit_mode;
@@ -1503,17 +1503,25 @@ xtensa_expand_block_set_small_loop (rtx *operands)
   /* Totally-aligned block only.  */
   if (bytes % align != 0)
     return 0;
+  count = bytes / align;
 
-  /* If 4-byte aligned, small loop substitution is almost optimal, thus
-     limited to only offset to the end address for ADDI/ADDMI instruction.  */
-  if (align == 4
-      && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
-    return 0;
+  /* If the Loop Option (zero-overhead looping) is configured and active,
+     almost no restrictions about the length of the block.  */
+  if (! (TARGET_LOOPS && optimize))
+    {
+      /* If 4-byte aligned, small loop substitution is almost optimal,
+         thus limited to only offset to the end address for ADDI/ADDMI
+         instruction.  */
+      if (align == 4
+          && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
+        return 0;
 
-  /* If no 4-byte aligned, loop count should be treated as the constraint.  */
-  if (align != 4
-      && bytes / align > ((optimize > 1 && !optimize_size) ? 8 : 15))
-    return 0;
+      /* If no 4-byte aligned, loop count should be treated as the
+         constraint.  */
+      if (align != 4
+          && count > ((optimize > 1 && !optimize_size) ? 8 : 15))
+        return 0;
+    }
 
   /* Insn expansion: holding the init value.
      Either MOV(.N) or L32R w/litpool.  */
@@ -1523,16 +1531,33 @@ xtensa_expand_block_set_small_loop (rtx *operands)
     expand_len = TARGET_DENSITY ? 2 : 3;
   else
     expand_len = 3 + 4;
-  /* Insn expansion: Either ADDI(.N) or ADDMI for the end address.  */
-  expand_len += bytes > 127 ? 3
-                            : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
-
-  /* Insn expansion: the loop body and branch instruction.
-     For store, one of S8I, S16I or S32I(.N).
-     For advance, ADDI(.N).
-     For branch, BNE.  */
-  expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
-                + (TARGET_DENSITY ? 2 : 3) + 3;
+  if (TARGET_LOOPS && optimize)  /* zero-overhead looping */
+    {
+      /* Insn translation: Either MOV(.N) or L32R w/litpool for the
+         loop count.  */
+      expand_len += xtensa_simm12b (count) ? xtensa_sizeof_MOVI (count)
+                                           : 3 + 4;
+      /* Insn translation: LOOP, the zero-overhead looping setup
+         instruction.  */
+      expand_len += 3;
+      /* Insn expansion: the loop body instructions.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N).  */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3);
+    }
+  else  /* NO zero-overhead looping */
+    {
+      /* Insn expansion: Either ADDI(.N) or ADDMI for the end address.  */
+      expand_len += bytes > 127 ? 3
+                                : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
+      /* Insn expansion: the loop body and branch instruction.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N).
+         For branch, BNE.  */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3) + 3;
+    }
 
   /* Function call: preparing two arguments.  */
   funccall_len = xtensa_sizeof_MOVI (value);

From patchwork Fri Jun 10 04:19:32 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takayuki 'January June' Suwa
X-Patchwork-Id: 55000
Message-ID: <90af87ac-56ce-a620-ce14-ecbb524b4e57@yahoo.co.jp>
Date: Fri, 10 Jun 2022 13:19:32 +0900
To: GCC Patches
Subject: [PATCH 3/4] xtensa: Improve instruction cost estimation and
 suggestion
From: Takayuki 'January June' Suwa

This patch implements a new target-specific relative RTL insn cost
function, because the default cost estimation is suboptimal, and fixes
several "length" insn attributes (related to the cost estimation).

It also introduces a new machine-dependent option "-mextra-l32r-costs="
that tells the compiler the implementation-specific InstRAM/ROM access
penalty for the L32R instruction (in clock-cycle units, 0 by default).

gcc/ChangeLog:

	* config/xtensa/xtensa.cc (xtensa_rtx_costs): Correct wrong case
	for ABS and NEG, add missing case for BSWAP and CLRSB, and
	double the costs for integer divisions using libfuncs if
	optimizing for speed, in order to take advantage of fast constant
	division by multiplication.
	(TARGET_INSN_COST): New macro definition.
	(xtensa_is_insn_L32R_p, xtensa_insn_cost): New functions for
	calculating relative costs of RTL insns, for both speed and size.
	* config/xtensa/xtensa.md (return, nop, trap): Correct values of
	the attribute "length" that depends on TARGET_DENSITY.
	(define_asm_attributes, blockage, frame_blockage): Add missing
	attributes.
	* config/xtensa/xtensa.opt (-mextra-l32r-costs=): New
	machine-dependent option, however, preparatory work for now.
---
 gcc/config/xtensa/xtensa.cc  | 116 ++++++++++++++++++++++++++++++++---
 gcc/config/xtensa/xtensa.md  |  29 ++++++---
 gcc/config/xtensa/xtensa.opt |   4 ++
 3 files changed, 134 insertions(+), 15 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 616ced3ed38..1769e43c7b5 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dumpfile.h"
 #include "hw-doloop.h"
 #include "rtl-iter.h"
+#include "insn-attr.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -134,6 +135,7 @@ static unsigned int xtensa_multibss_section_type_flags (tree, const char *,
 static section *xtensa_select_rtx_section (machine_mode, rtx,
                                            unsigned HOST_WIDE_INT);
 static bool xtensa_rtx_costs (rtx, machine_mode, int, int, int *, bool);
+static int xtensa_insn_cost (rtx_insn *, bool);
 static int xtensa_register_move_cost (machine_mode, reg_class_t,
                                       reg_class_t);
 static int xtensa_memory_move_cost (machine_mode, reg_class_t, bool);
@@ -212,6 +214,8 @@ static rtx xtensa_delegitimize_address (rtx);
 #define TARGET_MEMORY_MOVE_COST xtensa_memory_move_cost
 #undef TARGET_RTX_COSTS
 #define TARGET_RTX_COSTS xtensa_rtx_costs
+#undef TARGET_INSN_COST
+#define TARGET_INSN_COST xtensa_insn_cost
 #undef TARGET_ADDRESS_COST
 #define TARGET_ADDRESS_COST hook_int_rtx_mode_as_bool_0
 
@@ -3933,7 +3937,7 @@ xtensa_memory_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
 static bool
 xtensa_rtx_costs (rtx x, machine_mode mode, int outer_code,
                   int opno ATTRIBUTE_UNUSED,
-                  int *total, bool speed ATTRIBUTE_UNUSED)
+                  int *total, bool speed)
 {
   int code = GET_CODE (x);
 
@@ -4021,9 +4025,14 @@ xtensa_rtx_costs (rtx x, machine_mode mode, int outer_code,
       return true;
 
     case CLZ:
+    case CLRSB:
       *total = COSTS_N_INSNS (TARGET_NSA ? 1 : 50);
       return true;
 
+    case BSWAP:
+      *total = COSTS_N_INSNS (mode == HImode ? 3 : 5);
+      return true;
+
     case NOT:
       *total = COSTS_N_INSNS (mode == DImode ? 3 : 2);
       return true;
@@ -4047,13 +4056,16 @@ xtensa_rtx_costs (rtx x, machine_mode mode, int outer_code,
       return true;
 
     case ABS:
+    case NEG:
       {
         if (mode == SFmode)
           *total = COSTS_N_INSNS (TARGET_HARD_FLOAT ? 1 : 50);
         else if (mode == DFmode)
           *total = COSTS_N_INSNS (50);
-        else
+        else if (mode == DImode)
           *total = COSTS_N_INSNS (4);
+        else
+          *total = COSTS_N_INSNS (1);
         return true;
       }
 
@@ -4069,10 +4081,6 @@ xtensa_rtx_costs (rtx x, machine_mode mode, int outer_code,
         return true;
       }
 
-    case NEG:
-      *total = COSTS_N_INSNS (mode == DImode ? 4 : 2);
-      return true;
-
     case MULT:
       {
         if (mode == SFmode)
@@ -4112,11 +4120,11 @@ xtensa_rtx_costs (rtx x, machine_mode mode, int outer_code,
     case UMOD:
       {
         if (mode == DImode)
-          *total = COSTS_N_INSNS (50);
+          *total = COSTS_N_INSNS (speed ? 100 : 50);
         else if (TARGET_DIV32)
           *total = COSTS_N_INSNS (32);
         else
-          *total = COSTS_N_INSNS (50);
+          *total = COSTS_N_INSNS (speed ? 100 : 50);
         return true;
       }
 
@@ -4149,6 +4157,98 @@ xtensa_rtx_costs (rtx x, machine_mode mode, int outer_code,
     }
 }
 
+static bool
+xtensa_is_insn_L32R_p(const rtx_insn *insn)
+{
+  rtx x = PATTERN (insn);
+
+  if (GET_CODE (x) == SET)
+    {
+      x = XEXP (x, 1);
+      if (GET_CODE (x) == MEM)
+        {
+          x = XEXP (x, 0);
+          return (GET_CODE (x) == SYMBOL_REF || CONST_INT_P (x))
+                 && CONSTANT_POOL_ADDRESS_P (x);
+        }
+    }
+
+  return false;
+}
+
+/* Compute a relative costs of RTL insns.  This is necessary in order to
+   achieve better RTL insn splitting/combination result.  */
+
+static int
+xtensa_insn_cost (rtx_insn *insn, bool speed)
+{
+  if (!(recog_memoized (insn) < 0))
+    {
+      int len = get_attr_length (insn), n = (len + 2) / 3;
+
+      if (len == 0)
+        return COSTS_N_INSNS (0);
+
+      if (speed)  /* For speed cost.  */
+        {
+          /* "L32R" may be particular slow (implementation-dependent).  */
+          if (xtensa_is_insn_L32R_p (insn))
+            return COSTS_N_INSNS (1 + xtensa_extra_l32r_costs);
+
+          /* Cost based on the pipeline model.  */
+          switch (get_attr_type (insn))
+            {
+            case TYPE_STORE:
+            case TYPE_MOVE:
+            case TYPE_ARITH:
+            case TYPE_MULTI:
+            case TYPE_NOP:
+            case TYPE_FSTORE:
+              return COSTS_N_INSNS (n);
+
+            case TYPE_LOAD:
+              return COSTS_N_INSNS (n - 1 + 2);
+
+            case TYPE_JUMP:
+            case TYPE_CALL:
+              return COSTS_N_INSNS (n - 1 + 3);
+
+            case TYPE_FCONV:
+            case TYPE_FLOAD:
+            case TYPE_MUL16:
+            case TYPE_MUL32:
+            case TYPE_RSR:
+              return COSTS_N_INSNS (n * 2);
+
+            case TYPE_FMADD:
+              return COSTS_N_INSNS (n * 4);
+
+            case TYPE_DIV32:
+              return COSTS_N_INSNS (n * 16);
+
+            default:
+              break;
+            }
+        }
+      else  /* For size cost.  */
+        {
+          /* Cost based on the instruction length.  */
+          if (get_attr_type (insn) != TYPE_UNKNOWN)
+            {
+              /* "L32R" itself plus constant in litpool.  */
+              if (xtensa_is_insn_L32R_p (insn))
+                return COSTS_N_INSNS (2) + 1;
+
+              /* Consider ".n" short instructions.  */
+              return COSTS_N_INSNS (n) - (n * 3 - len);
+            }
+        }
+    }
+
+  /* Fall back.  */
+  return pattern_cost (PATTERN (insn), speed);
+}
+
 /* Worker function for TARGET_RETURN_IN_MEMORY.  */
 
 static bool
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 33cbd546de3..f6c6be4af24 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -98,7 +98,10 @@
 
 ;; Describe a user's asm statement.
 (define_asm_attributes
-  [(set_attr "type" "multi")])
+  [(set_attr "type" "multi")
+   (set_attr "mode" "none")
+   (set_attr "length" "3")])    ;; Should be the maximum possible length
+                                ;; of a single machine instruction.
 
 ;; Pipeline model.
 
@@ -1879,7 +1882,10 @@
 }
   [(set_attr "type" "jump")
    (set_attr "mode" "none")
-   (set_attr "length" "2")])
+   (set (attr "length")
+        (if_then_else (match_test "TARGET_DENSITY")
+                      (const_int 2)
+                      (const_int 3)))])
 
 
 ;; Miscellaneous instructions.
@@ -1934,7 +1940,10 @@
 }
   [(set_attr "type" "nop")
    (set_attr "mode" "none")
-   (set_attr "length" "3")])
+   (set (attr "length")
+        (if_then_else (match_test "TARGET_DENSITY")
+                      (const_int 2)
+                      (const_int 3)))])
 
 (define_expand "nonlocal_goto"
   [(match_operand:SI 0 "general_operand" "")
@@ -1998,8 +2007,9 @@
   [(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
   ""
   ""
-  [(set_attr "length" "0")
-   (set_attr "type" "nop")])
+  [(set_attr "type" "nop")
+   (set_attr "mode" "none")
+   (set_attr "length" "0")])
 
 ;; Do not schedule instructions accessing memory before this point.
 
@@ -2018,7 +2028,9 @@
         (unspec:BLK [(match_operand:SI 1 "" "")] UNSPEC_FRAME_BLOCKAGE))]
   ""
   ""
-  [(set_attr "length" "0")])
+  [(set_attr "type" "nop")
+   (set_attr "mode" "none")
+   (set_attr "length" "0")])
 
 (define_insn "trap"
   [(trap_if (const_int 1) (const_int 0))]
@@ -2031,7 +2043,10 @@
 }
   [(set_attr "type" "trap")
    (set_attr "mode" "none")
-   (set_attr "length" "3")])
+   (set (attr "length")
+        (if_then_else (match_test "!TARGET_DEBUG && TARGET_DENSITY")
+                      (const_int 2)
+                      (const_int 3)))])
 
 ;; Setting up a frame pointer is tricky for Xtensa because GCC doesn't
 ;; know if a frame pointer is required until the reload pass, and
diff --git a/gcc/config/xtensa/xtensa.opt b/gcc/config/xtensa/xtensa.opt
index 1fc68a3d994..08338e39060 100644
--- a/gcc/config/xtensa/xtensa.opt
+++ b/gcc/config/xtensa/xtensa.opt
@@ -30,6 +30,10 @@ mlongcalls
 Target Mask(LONGCALLS)
 Use indirect CALLXn instructions for large programs.
 
+mextra-l32r-costs=
+Target RejectNegative Joined UInteger Var(xtensa_extra_l32r_costs) Init(0)
+Set extra memory access cost for L32R instruction, in clock-cycle units.
+
 mtarget-align
 Target
 Automatically align branch targets to reduce branch penalties.
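
Note (illustration only, not part of the patch): the cost arithmetic in
xtensa_insn_cost() above can be tried outside of GCC with a small
stand-alone sketch. It assumes COSTS_N_INSNS (N) expands to (N) * 4 as in
GCC's rtl.h; the helper names sketch_size_cost and sketch_l32r_speed_cost
are hypothetical, and the sketch skips the recog_memoized and
TYPE_UNKNOWN checks that the real function performs.

```c
/* Stand-alone sketch of the arithmetic used by xtensa_insn_cost().
   Assumption: COSTS_N_INSNS matches GCC's rtl.h definition.
   The function names below are hypothetical, for illustration only.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Size cost of a recognized insn whose "length" attribute is LEN bytes.
   IS_L32R is nonzero when the insn loads from the literal pool.  */
static int
sketch_size_cost (int len, int is_l32r)
{
  int n = (len + 2) / 3;    /* number of 3-byte instruction slots */

  if (len == 0)
    return COSTS_N_INSNS (0);

  /* "L32R" itself plus the 4-byte constant in the litpool.  */
  if (is_l32r)
    return COSTS_N_INSNS (2) + 1;

  /* Subtract one unit per byte saved by ".n" short instructions.  */
  return COSTS_N_INSNS (n) - (n * 3 - len);
}

/* Speed cost of an L32R when -mextra-l32r-costs=EXTRA is given.  */
static int
sketch_l32r_speed_cost (int extra)
{
  return COSTS_N_INSNS (1 + extra);
}
```

With these definitions a 2-byte ".n" instruction gets size cost 3 and a
3-byte instruction gets 4, so the narrow form is slightly preferred, while
an L32R is charged 9 for size because of its litpool entry.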

From patchwork Fri Jun 10 04:20:44 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takayuki 'January June' Suwa
X-Patchwork-Id: 55002
Message-ID: <18c02d84-2ea9-53e0-80c2-4ed1746b6b70@yahoo.co.jp>
Date: Fri, 10 Jun 2022 13:20:44 +0900
To: GCC Patches
Subject: [PATCH 4/4] xtensa: Improve constant synthesis for both integer
 and floating-point
From: Takayuki 'January June' Suwa

This patch revises the previous implementation of constant synthesis.

First, it is changed to use a define_split machine description pattern and
to run after the reload pass, in order not to interfere with some
optimizations such as loop invariant motion.

Second, not only integer but also floating-point constants are subject to
processing.

Third, several new synthesis patterns are added, for when the constant
cannot fit into a "MOVI Ax, simm12" instruction, but:

I.   can be represented as a power of two minus one
     (e.g. 32767, 65535 or 0x7fffffffUL)
       => "MOVI(.N) Ax, -1" + "SRLI Ax, Ax, 1 ... 31" (or "EXTUI")
II.  is between -34816 and 34559
       => "MOVI(.N) Ax, -2048 ... 2047" + "ADDMI Ax, Ax, -32768 ... 32512"
III. (existing case) can fit into a signed 12-bit if the trailing zero bits
     are stripped
       => "MOVI(.N) Ax, -2048 ... 2047" + "SLLI Ax, Ax, 1 ... 31"

The above sequences consist of 5 or 6 bytes and have a latency of 2 clock
cycles, in contrast with "L32R Ax, <litpool>" (3 bytes and one clock
latency, but it may suffer an additional one-clock pipeline stall and an
implementation-specific InstRAM/ROM access penalty) plus 4 bytes of
constant value.

In addition, 3-instruction synthesis patterns (8 or 9 bytes, 3 clock
latency) are also provided when optimizing for speed and the L32R
instruction has a considerable access penalty:

IV.  2-instruction synthesis (any of I ... III) followed by
     "SLLI Ax, Ax, 1 ... 31"
V.   2-instruction synthesis followed by either "ADDX[248] Ax, Ax, Ax"
     or "SUBX8 Ax, Ax, Ax" (multiplying by 3, 5, 7 or 9)

gcc/ChangeLog:

	* config/xtensa/xtensa-protos.h (xtensa_constantsynth):
	New prototype.
	* config/xtensa/xtensa.cc (xtensa_emit_constantsynth,
	xtensa_constantsynth_2insn, xtensa_constantsynth_rtx_SLLI,
	xtensa_constantsynth_rtx_ADDSUBX, xtensa_constantsynth):
	New backend functions that process the abovementioned logic.
	(xtensa_emit_move_sequence): Revert the previous changes.
	* config/xtensa/xtensa.md (): New split patterns for integer
	and floating-point, as the frontend part.

gcc/testsuite/ChangeLog:

	* gcc.target/xtensa/constsynth_2insns.c: New.
	* gcc.target/xtensa/constsynth_3insns.c: Ditto.
	* gcc.target/xtensa/constsynth_double.c: Ditto.
---
 gcc/config/xtensa/xtensa-protos.h             |   1 +
 gcc/config/xtensa/xtensa.cc                   | 144 ++++++++++++++++--
 gcc/config/xtensa/xtensa.md                   |  50 ++++++
 .../gcc.target/xtensa/constsynth_2insns.c     |  44 ++++++
 .../gcc.target/xtensa/constsynth_3insns.c     |  24 +++
 .../gcc.target/xtensa/constsynth_double.c     |  11 ++
 6 files changed, 258 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/xtensa/constsynth_2insns.c
 create mode 100644 gcc/testsuite/gcc.target/xtensa/constsynth_3insns.c
 create mode 100644 gcc/testsuite/gcc.target/xtensa/constsynth_double.c

diff --git a/gcc/config/xtensa/xtensa-protos.h b/gcc/config/xtensa/xtensa-protos.h
index 30e4b54394a..c2fd750cd3a 100644
--- a/gcc/config/xtensa/xtensa-protos.h
+++ b/gcc/config/xtensa/xtensa-protos.h
@@ -44,6 +44,7 @@ extern int xtensa_expand_block_move (rtx *);
 extern int xtensa_expand_block_set_unrolled_loop (rtx *);
 extern int xtensa_expand_block_set_small_loop (rtx *);
 extern void xtensa_split_operand_pair (rtx *, machine_mode);
+extern int xtensa_constantsynth (rtx, HOST_WIDE_INT);
 extern int xtensa_emit_move_sequence (rtx *, machine_mode);
 extern rtx xtensa_copy_incoming_a7 (rtx);
 extern void xtensa_expand_nonlocal_goto (rtx *);
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 1769e43c7b5..2febea0eb3d 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -1037,6 +1037,134 @@ xtensa_split_operand_pair (rtx operands[4], machine_mode mode)
 }
 
 
+/* Try to emit insns to load srcval (that cannot fit into signed 12-bit)
+   into dst with synthesizing a such constant value from a sequence of
+   load-immediate / arithmetic ones, instead of a L32R instruction
+   (plus a constant in litpool).  */
+
+static void
+xtensa_emit_constantsynth (rtx dst, enum rtx_code code,
+                           HOST_WIDE_INT imm0, HOST_WIDE_INT imm1,
+                           rtx (*gen_op)(rtx, HOST_WIDE_INT),
+                           HOST_WIDE_INT imm2)
+{
+  if (REG_P (dst))
+    {
+      emit_move_insn (dst, GEN_INT (imm0));
+      emit_move_insn (dst, gen_rtx_fmt_ee (code, SImode,
+                                           dst, GEN_INT (imm1)));
+      if (gen_op)
+        emit_move_insn (dst, gen_op (dst, imm2));
+    }
+  else
+    {
+      rtx r = gen_reg_rtx (SImode);
+
+      emit_move_insn (r, GEN_INT (imm0));
+      emit_move_insn (r, gen_rtx_fmt_ee (code, SImode,
+                                         r, GEN_INT (imm1)));
+      emit_move_insn (dst, gen_op ?
gen_op (r, imm2) : r); + } +} + +static int +xtensa_constantsynth_2insn (rtx dst, HOST_WIDE_INT srcval, + rtx (*gen_op)(rtx, HOST_WIDE_INT), + HOST_WIDE_INT op_imm) +{ + int shift = exact_log2 (srcval + 1); + + if (IN_RANGE (shift, 1, 31)) + { + xtensa_emit_constantsynth (dst, LSHIFTRT, -1, 32 - shift, + gen_op, op_imm); + return 1; + } + + if (IN_RANGE (srcval, (-2048 - 32768), (2047 + 32512))) + { + HOST_WIDE_INT imm0, imm1; + + if (srcval < -32768) + imm1 = -32768; + else if (srcval > 32512) + imm1 = 32512; + else + imm1 = srcval & ~255; + imm0 = srcval - imm1; + if (TARGET_DENSITY && imm1 < 32512 && IN_RANGE (imm0, 224, 255)) + imm0 -= 256, imm1 += 256; + xtensa_emit_constantsynth (dst, PLUS, imm0, imm1, gen_op, op_imm); + return 1; + } + + shift = ctz_hwi (srcval); + if (xtensa_simm12b (srcval >> shift)) + { + xtensa_emit_constantsynth (dst, ASHIFT, srcval >> shift, shift, + gen_op, op_imm); + return 1; + } + + return 0; +} + +static rtx +xtensa_constantsynth_rtx_SLLI (rtx reg, HOST_WIDE_INT imm) +{ + return gen_rtx_ASHIFT (SImode, reg, GEN_INT (imm)); +} + +static rtx +xtensa_constantsynth_rtx_ADDSUBX (rtx reg, HOST_WIDE_INT imm) +{ + return imm == 7 + ? gen_rtx_MINUS (SImode, gen_rtx_ASHIFT (SImode, reg, GEN_INT (3)), + reg) + : gen_rtx_PLUS (SImode, gen_rtx_ASHIFT (SImode, reg, + GEN_INT (floor_log2 (imm - 1))), + reg); +} + +int +xtensa_constantsynth (rtx dst, HOST_WIDE_INT srcval) +{ + /* No need for synthesizing for what fits into MOVI instruction. */ + if (xtensa_simm12b (srcval)) + return 0; + + /* 2-insns substitution. */ + if ((optimize_size || (optimize && xtensa_extra_l32r_costs >= 1)) + && xtensa_constantsynth_2insn (dst, srcval, NULL, 0)) + return 1; + + /* 3-insns substitution. */ + if (optimize > 1 && !optimize_size && xtensa_extra_l32r_costs >= 2) + { + int shift, divisor; + + /* 2-insns substitution followed by SLLI. 
*/ + shift = ctz_hwi (srcval); + if (IN_RANGE (shift, 1, 31) && + xtensa_constantsynth_2insn (dst, srcval >> shift, + xtensa_constantsynth_rtx_SLLI, + shift)) + return 1; + + /* 2-insns substitution followed by ADDX[248] or SUBX8. */ + if (TARGET_ADDX) + for (divisor = 3; divisor <= 9; divisor += 2) + if (srcval % divisor == 0 && + xtensa_constantsynth_2insn (dst, srcval / divisor, + xtensa_constantsynth_rtx_ADDSUBX, + divisor)) + return 1; + } + + return 0; +} + + /* Emit insns to move operands[1] into operands[0]. Return 1 if we have written out everything that needs to be done to do the move. Otherwise, return 0 and the caller will emit the move @@ -1074,22 +1202,6 @@ xtensa_emit_move_sequence (rtx *operands, machine_mode mode) if (! TARGET_AUTO_LITPOOLS && ! TARGET_CONST16) { - /* Try to emit MOVI + SLLI sequence, that is smaller - than L32R + literal. */ - if (optimize_size && mode == SImode && CONST_INT_P (src) - && register_operand (dst, mode)) - { - HOST_WIDE_INT srcval = INTVAL (src); - int shift = ctz_hwi (srcval); - - if (xtensa_simm12b (srcval >> shift)) - { - emit_move_insn (dst, GEN_INT (srcval >> shift)); - emit_insn (gen_ashlsi3_internal (dst, dst, GEN_INT (shift))); - return 1; - } - } - src = force_const_mem (SImode, src); operands[1] = src; } diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md index f6c6be4af24..7cb566dfc53 100644 --- a/gcc/config/xtensa/xtensa.md +++ b/gcc/config/xtensa/xtensa.md @@ -937,6 +937,19 @@ (set_attr "mode" "SI") (set_attr "length" "2,2,2,2,2,2,3,3,3,3,6,3,3,3,3,3")]) +(define_split + [(set (match_operand:SI 0 "register_operand") + (match_operand:SI 1 "constantpool_operand"))] + "! optimize_debug && reload_completed" + [(const_int 0)] +{ + rtx x = avoid_constant_pool_reference (operands[1]); + if (! CONST_INT_P (x)) + FAIL; + if (! 
xtensa_constantsynth (operands[0], INTVAL (x))) + emit_move_insn (operands[0], x); +}) + ;; 16-bit Integer moves (define_expand "movhi" @@ -1139,6 +1152,43 @@ (set_attr "mode" "SF") (set_attr "length" "3")]) +(define_split + [(set (match_operand:SF 0 "register_operand") + (match_operand:SF 1 "constantpool_operand"))] + "! optimize_debug && reload_completed" + [(const_int 0)] +{ + int i = 0; + rtx x = XEXP (operands[1], 0); + long l[2]; + if (GET_CODE (x) == SYMBOL_REF + && CONSTANT_POOL_ADDRESS_P (x)) + x = get_pool_constant (x); + else if (GET_CODE (x) == CONST) + { + x = XEXP (x, 0); + gcc_assert (GET_CODE (x) == PLUS + && GET_CODE (XEXP (x, 0)) == SYMBOL_REF + && CONSTANT_POOL_ADDRESS_P (XEXP (x, 0)) + && CONST_INT_P (XEXP (x, 1))); + i = INTVAL (XEXP (x, 1)); + gcc_assert (i == 0 || i == 4); + i /= 4; + x = get_pool_constant (XEXP (x, 0)); + } + else + gcc_unreachable (); + if (GET_MODE (x) == SFmode) + REAL_VALUE_TO_TARGET_SINGLE (*CONST_DOUBLE_REAL_VALUE (x), l[0]); + else if (GET_MODE (x) == DFmode) + REAL_VALUE_TO_TARGET_DOUBLE (*CONST_DOUBLE_REAL_VALUE (x), l); + else + gcc_unreachable (); + x = gen_rtx_REG (SImode, REGNO (operands[0])); + if (! 
xtensa_constantsynth (x, l[i])) + emit_move_insn (x, GEN_INT (l[i])); +}) + ;; 64-bit floating point moves (define_expand "movdf" diff --git a/gcc/testsuite/gcc.target/xtensa/constsynth_2insns.c b/gcc/testsuite/gcc.target/xtensa/constsynth_2insns.c new file mode 100644 index 00000000000..ec2606ed11a --- /dev/null +++ b/gcc/testsuite/gcc.target/xtensa/constsynth_2insns.c @@ -0,0 +1,44 @@ +/* { dg-do compile } */ +/* { dg-options "-Os } */ + +int test_0(void) +{ + return 4095; +} + +int test_1(void) +{ + return 2147483647; +} + +int test_2(void) +{ + return -34816; +} + +int test_3(void) +{ + return -2049; +} + +int test_4(void) +{ + return 2048; +} + +int test_5(void) +{ + return 34559; +} + +int test_6(void) +{ + return 43680; +} + +void test_7(int *p) +{ + *p = -1432354816; +} + +/* { dg-final { scan-assembler-not "l32r" } } */ diff --git a/gcc/testsuite/gcc.target/xtensa/constsynth_3insns.c b/gcc/testsuite/gcc.target/xtensa/constsynth_3insns.c new file mode 100644 index 00000000000..f3c4a1c7c15 --- /dev/null +++ b/gcc/testsuite/gcc.target/xtensa/constsynth_3insns.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mextra-l32r-costs=3" } */ + +int test_0(void) +{ + return 134217216; +} + +int test_1(void) +{ + return -27604992; +} + +int test_2(void) +{ + return -162279; +} + +void test_3(int *p) +{ + *p = 192437; +} + +/* { dg-final { scan-assembler-not "l32r" } } */ diff --git a/gcc/testsuite/gcc.target/xtensa/constsynth_double.c b/gcc/testsuite/gcc.target/xtensa/constsynth_double.c new file mode 100644 index 00000000000..11e5d524283 --- /dev/null +++ b/gcc/testsuite/gcc.target/xtensa/constsynth_double.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-Os } */ + +void test(unsigned int count, double array[]) +{ + unsigned int i; + for (i = 0; i < count; ++i) + array[i] = 1.0; +} + +/* { dg-final { scan-assembler-not "l32r" } } */
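
For readers who want to check the two-instruction cases I to III by hand, the selection can be modelled outside GCC as a small standalone C sketch. This is an illustration only, not the backend code: the names `synth_kind`, `synth2`, `fits_simm12`, `classify_2insn` and `evaluate` are hypothetical, the model works on 32-bit values rather than HOST_WIDE_INT, and it leaves out both the TARGET_DENSITY adjustment and the optional third instruction passed as gen_op.

```c
#include <stdint.h>

/* Hypothetical tags for patterns I, II and III of the commit message.  */
enum synth_kind { SYNTH_NONE, SYNTH_SRLI, SYNTH_ADDMI, SYNTH_SLLI };

struct synth2
{
  enum synth_kind kind;
  int32_t imm0;  /* immediate for the leading MOVI */
  int32_t imm1;  /* shift count, or the ADDMI immediate */
};

static int
fits_simm12 (int64_t v)
{
  return v >= -2048 && v <= 2047;
}

/* Pick a two-instruction sequence for srcval, trying I, II, III in the
   same order as the patch.  Returns SYNTH_NONE if a plain MOVI suffices
   or if no pattern applies.  */
static struct synth2
classify_2insn (int32_t srcval)
{
  struct synth2 r = { SYNTH_NONE, 0, 0 };
  uint32_t u = (uint32_t) srcval;

  if (fits_simm12 (srcval))
    return r;

  /* I. 2^n - 1: MOVI -1, then SRLI by 32 - n.  */
  if ((u & (u + 1)) == 0)
    {
      int n = 0;
      for (; u != 0; u >>= 1)
        n++;
      r.kind = SYNTH_SRLI;
      r.imm0 = -1;
      r.imm1 = 32 - n;
      return r;
    }

  /* II. -34816 ... 34559: MOVI imm0, then ADDMI imm1 (multiple of 256).  */
  if (srcval >= -2048 - 32768 && srcval <= 2047 + 32512)
    {
      if (srcval < -32768)
        r.imm1 = -32768;
      else if (srcval > 32512)
        r.imm1 = 32512;
      else
        r.imm1 = srcval - ((srcval % 256 + 256) % 256);  /* round down */
      r.imm0 = srcval - r.imm1;
      r.kind = SYNTH_ADDMI;
      return r;
    }

  /* III. Strip trailing zero bits: MOVI imm0, then SLLI by their count.
     srcval is nonzero here, so the loop terminates.  */
  {
    int32_t v = srcval;
    int shift = 0;
    while (v % 2 == 0)
      {
        v /= 2;
        shift++;
      }
    if (fits_simm12 (v))
      {
        r.kind = SYNTH_SLLI;
        r.imm0 = v;
        r.imm1 = shift;
      }
  }
  return r;
}

/* Recompute the value that the chosen sequence would load.  */
static int64_t
evaluate (struct synth2 s)
{
  switch (s.kind)
    {
    case SYNTH_SRLI:
      return (int64_t) (UINT32_MAX >> s.imm1);
    case SYNTH_ADDMI:
      return (int64_t) s.imm0 + s.imm1;
    case SYNTH_SLLI:
      return (int64_t) s.imm0 * ((int64_t) 1 << s.imm1);
    default:
      return 0;
    }
}
```

Feeding the constants from constsynth_2insns.c through `classify_2insn` and comparing `evaluate` against the input is a quick way to see which pattern each test is meant to exercise; for instance 4095 falls under I, -34816 under II and 43680 under III.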