Message ID | 842bf876-2f00-7216-b6f4-e625a13868dc@yahoo.co.jp |
---|---|
State | New |
Headers |
Message-ID: <842bf876-2f00-7216-b6f4-e625a13868dc@yahoo.co.jp> Date: Fri, 10 Jun 2022 13:18:24 +0900 To: GCC Patches <gcc-patches@gcc.gnu.org> Subject: [PATCH 2/4] xtensa: Consider the Loop Option when setmemsi is expanded to small loop From: Takayuki 'January June' Suwa via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp> |
Series |
[1/4] xtensa: Tweak some widen multiplications
Commit Message
Takayuki 'January June' Suwa
June 10, 2022, 4:18 a.m. UTC
Now apply to almost any size of aligned block under such circumstances.

gcc/ChangeLog:

        * config/xtensa/xtensa.cc (xtensa_expand_block_set_small_loop):
        Pass through the block length / loop count conditions if
        zero-overhead looping is configured and active,
---
 gcc/config/xtensa/xtensa.cc | 65 +++++++++++++++++++++++++------------
 1 file changed, 45 insertions(+), 20 deletions(-)

   /* Insn expansion: holding the init value.
      Either MOV(.N) or L32R w/litpool.  */
@@ -1523,16 +1531,33 @@ xtensa_expand_block_set_small_loop (rtx *operands)
     expand_len = TARGET_DENSITY ? 2 : 3;
   else
     expand_len = 3 + 4;
-  /* Insn expansion: Either ADDI(.N) or ADDMI for the end address.  */
-  expand_len += bytes > 127 ? 3
-                            : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
-
-  /* Insn expansion: the loop body and branch instruction.
-     For store, one of S8I, S16I or S32I(.N).
-     For advance, ADDI(.N).
-     For branch, BNE.  */
-  expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
-                + (TARGET_DENSITY ? 2 : 3) + 3;
+  if (TARGET_LOOPS && optimize) /* zero-overhead looping */
+    {
+      /* Insn translation: Either MOV(.N) or L32R w/litpool for the
+         loop count.  */
+      expand_len += xtensa_simm12b (count) ? xtensa_sizeof_MOVI (count)
+                                           : 3 + 4;
+      /* Insn translation: LOOP, the zero-overhead looping setup
+         instruction.  */
+      expand_len += 3;
+      /* Insn expansion: the loop body instructions.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N).  */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3);
+    }
+  else /* NO zero-overhead looping */
+    {
+      /* Insn expansion: Either ADDI(.N) or ADDMI for the end address.  */
+      expand_len += bytes > 127 ? 3
+                                : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
+      /* Insn expansion: the loop body and branch instruction.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N).
+         For branch, BNE.  */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3) + 3;
+    }

   /* Function call: preparing two arguments.  */
   funccall_len = xtensa_sizeof_MOVI (value);
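To make the size accounting in this hunk easier to follow, here is a minimal standalone C sketch of the two branches. It is not GCC code: `simm12b` and `sizeof_movi` are hypothetical stand-ins for `xtensa_simm12b` and `xtensa_sizeof_MOVI`, the MOVI.N range of -32..95 is an assumption, and the `density` and `zol` flags stand in for `TARGET_DENSITY` and `TARGET_LOOPS && optimize`. The cost of holding the init value, which the hunk computes just before, is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for xtensa_simm12b: signed 12-bit immediate. */
static bool simm12b(long v) { return v >= -2048 && v <= 2047; }

/* Hypothetical stand-in for xtensa_sizeof_MOVI.  Assumes the narrow
   MOVI.N form (2 bytes) covers -32..95 when the density option is on. */
static int sizeof_movi(long v, bool density)
{
    return (density && v >= -32 && v <= 95) ? 2 : 3;
}

/* Bytes of code added by the loop part of the expansion, mirroring the
   two branches of the hunk above. */
static int loop_expand_len(long bytes, long align, bool density, bool zol)
{
    long count = bytes / align;
    int len = 0;

    if (zol) {
        /* Loop count: MOVI(.N), or L32R (3 bytes) plus a 4-byte literal. */
        len += simm12b(count) ? sizeof_movi(count, density) : 3 + 4;
        /* LOOP setup instruction. */
        len += 3;
        /* Store plus advance; no branch instruction is needed. */
        len += (density && align == 4 ? 2 : 3) + (density ? 2 : 3);
    } else {
        /* End address: ADDI(.N) or ADDMI. */
        len += bytes > 127 ? 3 : (density && bytes <= 15) ? 2 : 3;
        /* Store, advance, and the closing BNE. */
        len += (density && align == 4 ? 2 : 3) + (density ? 2 : 3) + 3;
    }
    return len;
}
```

Under these assumptions, a 262140-byte word-aligned block with the density option costs 14 bytes on the zero-overhead path (7 for the loop count, 3 for LOOP, 4 for the body), while a 64-byte block costs 9 bytes with LOOP and 10 without.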
Comments
Hi Suwa-san,

On Thu, Jun 9, 2022 at 9:26 PM Takayuki 'January June' Suwa
<jjsuwa_sys3175@yahoo.co.jp> wrote:
>
> Now apply to almost any size of aligned block under such circumstances.
>
> gcc/ChangeLog:
>
>         * config/xtensa/xtensa.cc (xtensa_expand_block_set_small_loop):
>         Pass through the block length / loop count conditions if
>         zero-overhead looping is configured and active,
> ---
>  gcc/config/xtensa/xtensa.cc | 65 +++++++++++++++++++++++++------------
>  1 file changed, 45 insertions(+), 20 deletions(-)

This change results in a bunch of ICEs in tests that look like this:

gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c: In function 'main':
gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c:28:1: error: unrecognizable insn:
(insn 7 6 8 2 (set (reg:SI 45)
        (plus:SI (reg:SI 44)
            (const_int 262144 [0x40000]))) "gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c":23:3 -1
     (nil))
during RTL pass: vregs
gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c:28:1: internal compiler error: in extract_insn, at recog.cc:2791
0x6a21cf _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
        gcc/gcc/rtl-error.cc:108
0x6a2252 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
        gcc/gcc/rtl-error.cc:116
0x693824 extract_insn(rtx_insn*)
        gcc/gcc/recog.cc:2791
0xb27647 instantiate_virtual_regs_in_insn
        gcc/gcc/function.cc:1611
0xb27647 instantiate_virtual_regs
        gcc/gcc/function.cc:1985
0xb27647 execute
        gcc/gcc/function.cc:2034
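The constant in the rejected insn looks like the block length being added directly to the destination address. A rough sketch of why such an add may not be recognizable, assuming the Xtensa add pattern only takes immediates that ADDI (signed 8-bit) or ADDMI (signed 8-bit scaled by 256) can encode. The helper names are hypothetical, and this is not the actual predicate from the machine description:

```c
#include <stdbool.h>

/* ADDI: signed 8-bit immediate, assumed range -128..127. */
static bool fits_addi(long v)
{
    return v >= -128 && v <= 127;
}

/* ADDMI: signed 8-bit immediate shifted left by 8,
   assumed range -32768..32512 in steps of 256. */
static bool fits_addmi(long v)
{
    return v % 256 == 0 && v >= -32768 && v <= 32512;
}

/* True if the constant could be added with a single ADDI or ADDMI. */
static bool fits_add_immediate(long v)
{
    return fits_addi(v) || fits_addmi(v);
}
```

With these ranges, 262144 (0x40000) fits neither form, so the length would have to be loaded into a register before the addition.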
On 2022/06/11 9:12, Max Filippov wrote:
> Hi Suwa-san,

hi!

> This change results in a bunch of ICEs in tests that look like this:
>
> gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c: In function 'main':
> gcc/gcc/testsuite/gcc.c-torture/compile/memtst.c:28:1: error:
> unrecognizable insn:
> (insn 7 6 8 2 (set (reg:SI 45)
>         (plus:SI (reg:SI 44)
>             (const_int 262144 [0x40000])))

Oh, what a mistake of mine... it's so RISCy!

    int array[65535];
    void test(void) {
      __builtin_memset(array, 0, sizeof(array));
    }

        .literal_position
        .literal .LC0, array
        .literal .LC2, 65535
    test:
        l32r    a3, .LC0
        l32r    a2, .LC2
        movi.n  a4, 0
        loop    a2, .L2_LEND
    .L2:
        s32i.n  a4, a3, 0
        addi.n  a3, a3, 4
        .L2_LEND:
        ret.n

---
 gcc/config/xtensa/xtensa.cc | 71 ++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 21 deletions(-)

diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index c7b54babc37..bc3330f836f 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -1483,7 +1483,7 @@ xtensa_expand_block_set_unrolled_loop (rtx *operands)
 int
 xtensa_expand_block_set_small_loop (rtx *operands)
 {
-  HOST_WIDE_INT bytes, value, align;
+  HOST_WIDE_INT bytes, value, align, count;
   int expand_len, funccall_len;
   rtx x, dst, end, reg;
   machine_mode unit_mode;
@@ -1503,17 +1503,25 @@ xtensa_expand_block_set_small_loop (rtx *operands)
   /* Totally-aligned block only.  */
   if (bytes % align != 0)
     return 0;
+  count = bytes / align;

-  /* If 4-byte aligned, small loop substitution is almost optimal, thus
-     limited to only offset to the end address for ADDI/ADDMI instruction.  */
-  if (align == 4
-      && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
-    return 0;
+  /* If the Loop Option (zero-overhead looping) is configured and active,
+     almost no restrictions about the length of the block.  */
+  if (! (TARGET_LOOPS && optimize))
+    {
+      /* If 4-byte aligned, small loop substitution is almost optimal,
+         thus limited to only offset to the end address for ADDI/ADDMI
+         instruction.  */
+      if (align == 4
+          && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
+        return 0;

-  /* If no 4-byte aligned, loop count should be treated as the constraint.  */
-  if (align != 4
-      && bytes / align > ((optimize > 1 && !optimize_size) ? 8 : 15))
-    return 0;
+      /* If no 4-byte aligned, loop count should be treated as the
+         constraint.  */
+      if (align != 4
+          && count > ((optimize > 1 && !optimize_size) ? 8 : 15))
+        return 0;
+    }

   /* Insn expansion: holding the init value.
      Either MOV(.N) or L32R w/litpool.  */
@@ -1523,16 +1531,33 @@ xtensa_expand_block_set_small_loop (rtx *operands)
     expand_len = TARGET_DENSITY ? 2 : 3;
   else
     expand_len = 3 + 4;
-  /* Insn expansion: Either ADDI(.N) or ADDMI for the end address.  */
-  expand_len += bytes > 127 ? 3
-                            : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
-
-  /* Insn expansion: the loop body and branch instruction.
-     For store, one of S8I, S16I or S32I(.N).
-     For advance, ADDI(.N).
-     For branch, BNE.  */
-  expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
-                + (TARGET_DENSITY ? 2 : 3) + 3;
+  if (TARGET_LOOPS && optimize) /* zero-overhead looping */
+    {
+      /* Insn translation: Either MOV(.N) or L32R w/litpool for the
+         loop count.  */
+      expand_len += xtensa_simm12b (count) ? xtensa_sizeof_MOVI (count)
+                                           : 3 + 4;
+      /* Insn translation: LOOP, the zero-overhead looping setup
+         instruction.  */
+      expand_len += 3;
+      /* Insn expansion: the loop body instructions.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N).  */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3);
+    }
+  else /* NO zero-overhead looping */
+    {
+      /* Insn expansion: Either ADDI(.N) or ADDMI for the end address.  */
+      expand_len += bytes > 127 ? 3
+                                : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
+      /* Insn expansion: the loop body and branch instruction.
+         For store, one of S8I, S16I or S32I(.N).
+         For advance, ADDI(.N).
+         For branch, BNE.  */
+      expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                    + (TARGET_DENSITY ? 2 : 3) + 3;
+    }

   /* Function call: preparing two arguments.  */
   funccall_len = xtensa_sizeof_MOVI (value);
@@ -1555,7 +1580,11 @@ xtensa_expand_block_set_small_loop (rtx *operands)
   dst = gen_reg_rtx (SImode);
   emit_move_insn (dst, x);
   end = gen_reg_rtx (SImode);
-  emit_insn (gen_addsi3 (end, dst, operands[1] /* the length */));
+  if (TARGET_LOOPS && optimize)
+    x = force_reg (SImode, operands[1] /* the length */);
+  else
+    x = operands[1];
+  emit_insn (gen_addsi3 (end, dst, x));
   switch (align)
     {
     case 1:
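As a reading aid for the last hunk and the assembly in this message, here is a C-level model of what the expanded loop computes in the word-aligned case: an end address formed from the destination and the length, then a store and an advance repeated until the end address is reached. The function name is hypothetical, and the sketch assumes `bytes` is a nonzero multiple of 4, which the "totally-aligned block" check guarantees for this path.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the small-loop expansion for align == 4.
   bytes must be a nonzero multiple of 4. */
static void block_set_words(uint32_t *dst, uint32_t value, size_t bytes)
{
    /* end = dst + length (the addsi3 in the patch). */
    uint32_t *end = dst + bytes / 4;

    do {
        *dst = value;   /* store: S32I(.N) */
        dst++;          /* advance: ADDI(.N) by 4 bytes */
    } while (dst != end);   /* BNE, or implicit when LOOP is used */
}
```

The bottom-tested loop mirrors the store, advance, branch order listed in the patch comments. With the Loop Option active the closing comparison does not appear as an instruction in the assembly above, because the hardware loop counter takes its place.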
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index c7b54babc37..616ced3ed38 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -1483,7 +1483,7 @@ xtensa_expand_block_set_unrolled_loop (rtx *operands)
 int
 xtensa_expand_block_set_small_loop (rtx *operands)
 {
-  HOST_WIDE_INT bytes, value, align;
+  HOST_WIDE_INT bytes, value, align, count;
   int expand_len, funccall_len;
   rtx x, dst, end, reg;
   machine_mode unit_mode;
@@ -1503,17 +1503,25 @@ xtensa_expand_block_set_small_loop (rtx *operands)
   /* Totally-aligned block only.  */
   if (bytes % align != 0)
     return 0;
+  count = bytes / align;

-  /* If 4-byte aligned, small loop substitution is almost optimal, thus
-     limited to only offset to the end address for ADDI/ADDMI instruction.  */
-  if (align == 4
-      && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
-    return 0;
+  /* If the Loop Option (zero-overhead looping) is configured and active,
+     almost no restrictions about the length of the block.  */
+  if (! (TARGET_LOOPS && optimize))
+    {
+      /* If 4-byte aligned, small loop substitution is almost optimal,
+         thus limited to only offset to the end address for ADDI/ADDMI
+         instruction.  */
+      if (align == 4
+          && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
+        return 0;

-  /* If no 4-byte aligned, loop count should be treated as the constraint.  */
-  if (align != 4
-      && bytes / align > ((optimize > 1 && !optimize_size) ? 8 : 15))
-    return 0;
+      /* If no 4-byte aligned, loop count should be treated as the
+         constraint.  */
+      if (align != 4
+          && count > ((optimize > 1 && !optimize_size) ? 8 : 15))
+        return 0;
+    }
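The block-size gate in this hunk can be exercised in isolation. The following is a minimal sketch, not the GCC function itself: `small_loop_size_ok` is a hypothetical name, and the plain parameters `optimize`, `optimize_size` and `target_loops` stand in for the GCC globals and the `TARGET_LOOPS` macro. It models only the length and loop-count conditions shown above, not the later code-size comparison against a function call.

```c
#include <stdbool.h>

/* Returns true if the block passes the size checks of the hunk above. */
static bool small_loop_size_ok(long bytes, long align, int optimize,
                               bool optimize_size, bool target_loops)
{
    /* Totally-aligned block only. */
    if (bytes % align != 0)
        return false;
    long count = bytes / align;

    /* With zero-overhead looping active, the limits below are skipped. */
    if (!(target_loops && optimize)) {
        /* Word-aligned: the length must fit the ADDI/ADDMI offset. */
        if (align == 4
            && !(bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
            return false;
        /* Otherwise the loop count is the constraint. */
        if (align != 4
            && count > ((optimize > 1 && !optimize_size) ? 8 : 15))
            return false;
    }
    return true;
}
```

For example, the 262140-byte array from the test case earlier in the thread passes only when `target_loops` and `optimize` are both set. A 16-byte block with byte alignment is rejected without the Loop Option at `-O1` (count 16 exceeds 15) but accepted with it.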