From patchwork Tue May 17 10:29:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Takayuki 'January June' Suwa
X-Patchwork-Id: 54097
Message-ID: <2bd352b2-3a62-74d5-3949-2805bed2238e@yahoo.co.jp>
Date: Tue, 17 May 2022 19:29:01 +0900
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2 4/5] xtensa: Add setmemsi insn pattern
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Takayuki 'January June' Suwa via Gcc-patches
From: Takayuki 'January June' Suwa
Reply-To: Takayuki 'January June' Suwa
Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org
Sender: "Gcc-patches"

This patch introduces a setmemsi insn pattern with two kinds of
expansion, an unrolled loop and a small loop, for blocks of small
constant length and a constant initialization value.

gcc/ChangeLog:

	* gcc/config/xtensa/xtensa-protos.h
	(xtensa_expand_block_set_unrolled_loop,
	xtensa_expand_block_set_small_loop): New prototypes.
	* gcc/config/xtensa/xtensa.c (xtensa_sizeof_MOVI,
	xtensa_expand_block_set_unrolled_loop,
	xtensa_expand_block_set_small_loop): New functions.
	* gcc/config/xtensa/xtensa.md (setmemsi): New expansion pattern.
	* gcc/config/xtensa/xtensa.opt (mlongcalls): Add target mask.
---
 gcc/config/xtensa/xtensa-protos.h |   2 +
 gcc/config/xtensa/xtensa.c        | 208 ++++++++++++++++++++++++++++++
 gcc/config/xtensa/xtensa.md       |  16 +++
 gcc/config/xtensa/xtensa.opt      |   2 +-
 4 files changed, 227 insertions(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa-protos.h b/gcc/config/xtensa/xtensa-protos.h
index 18d803581..80b1da2bb 100644
--- a/gcc/config/xtensa/xtensa-protos.h
+++ b/gcc/config/xtensa/xtensa-protos.h
@@ -41,6 +41,8 @@ extern void xtensa_expand_conditional_branch (rtx *, machine_mode);
 extern int xtensa_expand_conditional_move (rtx *, int);
 extern int xtensa_expand_scc (rtx *, machine_mode);
 extern int xtensa_expand_block_move (rtx *);
+extern int xtensa_expand_block_set_unrolled_loop (rtx *);
+extern int xtensa_expand_block_set_small_loop (rtx *);
 extern void xtensa_split_operand_pair (rtx *, machine_mode);
 extern int xtensa_emit_move_sequence (rtx *, machine_mode);
 extern rtx xtensa_copy_incoming_a7 (rtx);
diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index d3405beb6..fb398d00c 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -1363,6 +1363,214 @@ xtensa_expand_block_move (rtx *operands)
 }
 
 
+/* Try to expand a block set operation to a sequence of RTL move
+   instructions. If not optimizing, or if the block size is not a
+   constant, or if the block is too large, or if the value to
+   initialize the block with is not a constant, the expansion
+   fails and GCC falls back to calling memset().
+
+   operands[0] is the destination
+   operands[1] is the length
+   operands[2] is the initialization value
+   operands[3] is the alignment */
+
+static int
+xtensa_sizeof_MOVI (HOST_WIDE_INT imm)
+{
+  return (TARGET_DENSITY && IN_RANGE (imm, -32, 95)) ? 2 : 3;
+}
+
+int
+xtensa_expand_block_set_unrolled_loop (rtx *operands)
+{
+  rtx dst_mem = operands[0];
+  HOST_WIDE_INT bytes, value, align;
+  int expand_len, funccall_len;
+  rtx x, reg;
+  int offset;
+
+  if (!CONST_INT_P (operands[1]) || !CONST_INT_P (operands[2]))
+    return 0;
+
+  bytes = INTVAL (operands[1]);
+  if (bytes <= 0)
+    return 0;
+  value = (int8_t)INTVAL (operands[2]);
+  align = INTVAL (operands[3]);
+  if (align > MOVE_MAX)
+    align = MOVE_MAX;
+
+  /* Insn expansion: holding the init value.
+     Either MOV(.N) or L32R w/litpool. */
+  if (align == 1)
+    expand_len = xtensa_sizeof_MOVI (value);
+  else if (value == 0 || value == -1)
+    expand_len = TARGET_DENSITY ? 2 : 3;
+  else
+    expand_len = 3 + 4;
+  /* Insn expansion: a series of aligned memory stores.
+     Consist of S8I, S16I or S32I(.N). */
+  expand_len += (bytes / align) * (TARGET_DENSITY
+                                   && align == 4 ? 2 : 3);
+  /* Insn expansion: the remainder, sub-aligned memory stores.
+     A combination of S8I and S16I as needed. */
+  expand_len += ((bytes % align + 1) / 2) * 3;
+
+  /* Function call: preparing two arguments. */
+  funccall_len = xtensa_sizeof_MOVI (value);
+  funccall_len += xtensa_sizeof_MOVI (bytes);
+  /* Function call: calling memset(). */
+  funccall_len += TARGET_LONGCALLS ? (3 + 4 + 3) : 3;
+
+  /* Apply expansion bonus (2x) if optimizing for speed. */
+  if (optimize > 1 && !optimize_size)
+    funccall_len *= 2;
+
+  /* Decide whether to expand or not, based on the sum of the length
+     of instructions. */
+  if (expand_len > funccall_len)
+    return 0;
+
+  x = XEXP (dst_mem, 0);
+  if (!REG_P (x))
+    dst_mem = replace_equiv_address (dst_mem, force_reg (Pmode, x));
+  switch (align)
+    {
+    case 1:
+      break;
+    case 2:
+      value = (int16_t)((uint8_t)value * 0x0101U);
+      break;
+    case 4:
+      value = (int32_t)((uint8_t)value * 0x01010101U);
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  reg = force_reg (SImode, GEN_INT (value));
+
+  offset = 0;
+  do
+    {
+      int unit_size = MIN (bytes, align);
+      machine_mode unit_mode = (unit_size >= 4 ? SImode :
+                               (unit_size >= 2 ? HImode :
+                                QImode));
+      unit_size = GET_MODE_SIZE (unit_mode);
+
+      emit_move_insn (adjust_address (dst_mem, unit_mode, offset),
+                      unit_mode == SImode ? reg
+                      : convert_to_mode (unit_mode, reg, true));
+
+      offset += unit_size;
+      bytes -= unit_size;
+    }
+  while (bytes > 0);
+
+  return 1;
+}
+
+int
+xtensa_expand_block_set_small_loop (rtx *operands)
+{
+  HOST_WIDE_INT bytes, value, align;
+  int expand_len, funccall_len;
+  rtx dst, end, reg;
+  machine_mode unit_mode;
+  rtx_code_label *label;
+
+  if (!CONST_INT_P (operands[1]) || !CONST_INT_P (operands[2]))
+    return 0;
+
+  bytes = INTVAL (operands[1]);
+  if (bytes <= 0)
+    return 0;
+  value = (int8_t)INTVAL (operands[2]);
+  align = INTVAL (operands[3]);
+  if (align > MOVE_MAX)
+    align = MOVE_MAX;
+
+  /* Totally-aligned block only. */
+  if (bytes % align != 0)
+    return 0;
+
+  /* If 4-byte aligned, small loop substitution is almost optimal, thus
+     limited to only offset to the end address for ADDI/ADDMI instruction. */
+  if (align == 4
+      && ! (bytes <= 127 || (bytes <= 32512 && bytes % 256 == 0)))
+    return 0;
+
+  /* If no 4-byte aligned, loop count should be treated as the constraint. */
+  if (align != 4
+      && bytes / align > ((optimize > 1 && !optimize_size) ? 8 : 15))
+    return 0;
+
+  /* Insn expansion: holding the init value.
+     Either MOV(.N) or L32R w/litpool. */
+  if (align == 1)
+    expand_len = xtensa_sizeof_MOVI (value);
+  else if (value == 0 || value == -1)
+    expand_len = TARGET_DENSITY ? 2 : 3;
+  else
+    expand_len = 3 + 4;
+  /* Insn expansion: Either ADDI(.N) or ADDMI for the end address. */
+  expand_len += bytes > 127 ? 3
+                            : (TARGET_DENSITY && bytes <= 15) ? 2 : 3;
+
+  /* Insn expansion: the loop body and branch instruction.
+     For store, one of S8I, S16I or S32I(.N).
+     For advance, ADDI(.N).
+     For branch, BNE. */
+  expand_len += (TARGET_DENSITY && align == 4 ? 2 : 3)
+                + (TARGET_DENSITY ? 2 : 3) + 3;
+
+  /* Function call: preparing two arguments. */
+  funccall_len = xtensa_sizeof_MOVI (value);
+  funccall_len += xtensa_sizeof_MOVI (bytes);
+  /* Function call: calling memset(). */
+  funccall_len += TARGET_LONGCALLS ? (3 + 4 + 3) : 3;
+
+  /* Apply expansion bonus (2x) if optimizing for speed. */
+  if (optimize > 1 && !optimize_size)
+    funccall_len *= 2;
+
+  /* Decide whether to expand or not, based on the sum of the length
+     of instructions. */
+  if (expand_len > funccall_len)
+    return 0;
+
+  dst = gen_reg_rtx (SImode);
+  emit_move_insn (dst, XEXP (operands[0], 0));
+  end = gen_reg_rtx (SImode);
+  emit_insn (gen_addsi3 (end, dst, operands[1] /* the length */));
+  switch (align)
+    {
+    case 1:
+      unit_mode = QImode;
+      break;
+    case 2:
+      value = (int16_t)((uint8_t)value * 0x0101U);
+      unit_mode = HImode;
+      break;
+    case 4:
+      value = (int32_t)((uint8_t)value * 0x01010101U);
+      unit_mode = SImode;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+  reg = force_reg (unit_mode, GEN_INT (value));
+
+  label = gen_label_rtx ();
+  emit_label (label);
+  emit_move_insn (gen_rtx_MEM (unit_mode, dst), reg);
+  emit_insn (gen_addsi3 (dst, dst, GEN_INT (align)));
+  emit_cmp_and_jump_insns (dst, end, NE, const0_rtx, SImode, true, label);
+
+  return 1;
+}
+
+
 void
 xtensa_expand_nonlocal_goto (rtx *operands)
 {
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 251c313d5..9eb689efa 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -1085,6 +1085,22 @@
   DONE;
 })
 
+;; Block sets
+
+(define_expand "setmemsi"
+  [(match_operand:BLK 0 "memory_operand")
+   (match_operand:SI 1 "")
+   (match_operand:SI 2 "")
+   (match_operand:SI 3 "const_int_operand")]
+  "!optimize_debug && optimize"
+{
+  if (xtensa_expand_block_set_unrolled_loop (operands))
+    DONE;
+  if (xtensa_expand_block_set_small_loop (operands))
+    DONE;
+  FAIL;
+})
+
 
 ;; Shift instructions.
 
diff --git a/gcc/config/xtensa/xtensa.opt b/gcc/config/xtensa/xtensa.opt
index aef67970b..e1d992f5d 100644
--- a/gcc/config/xtensa/xtensa.opt
+++ b/gcc/config/xtensa/xtensa.opt
@@ -27,7 +27,7 @@ Target Report Mask(FORCE_NO_PIC)
 Disable position-independent code (PIC) for use in OS kernel code.
 
 mlongcalls
-Target
+Target Mask(LONGCALLS)
 Use indirect CALLXn instructions for large programs.
 
 mtarget-align
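
For anyone who wants to check the size heuristic by hand, the comparison made in xtensa_expand_block_set_unrolled_loop can be reproduced outside GCC. The following is a minimal standalone sketch and not part of the patch: struct cfg and its fields are hypothetical stand-ins for TARGET_DENSITY, TARGET_LONGCALLS and the "optimize > 1 && !optimize_size" condition, and the function names sizeof_movi and unrolled_is_cheaper are made up for illustration. It assumes the alignment has already been clamped to 1, 2 or 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for GCC's target flags and optimization state.  */
struct cfg
{
  bool density;    /* models TARGET_DENSITY */
  bool longcalls;  /* models TARGET_LONGCALLS */
  bool speed;      /* models optimize > 1 && !optimize_size */
};

/* Same arithmetic as xtensa_sizeof_MOVI in the patch: 2 bytes when the
   narrow encoding is usable, otherwise 3.  */
static int
sizeof_movi (const struct cfg *c, long imm)
{
  return (c->density && imm >= -32 && imm <= 95) ? 2 : 3;
}

/* Returns true when the estimated size of the unrolled stores does not
   exceed the estimated size of a memset() call, following the same
   sums as the patch.  ALIGN must be 1, 2 or 4.  */
static bool
unrolled_is_cheaper (const struct cfg *c, long bytes, int value, int align)
{
  int expand_len, funccall_len;

  assert (align == 1 || align == 2 || align == 4);
  value = (int8_t) value;

  /* Cost of materializing the fill value in a register.  */
  if (align == 1)
    expand_len = sizeof_movi (c, value);
  else if (value == 0 || value == -1)
    expand_len = c->density ? 2 : 3;
  else
    expand_len = 3 + 4;

  /* Aligned stores, then the sub-aligned remainder.  */
  expand_len += (bytes / align) * (c->density && align == 4 ? 2 : 3);
  expand_len += ((bytes % align + 1) / 2) * 3;

  /* Two argument loads plus the call itself.  */
  funccall_len = sizeof_movi (c, value) + sizeof_movi (c, bytes);
  funccall_len += c->longcalls ? (3 + 4 + 3) : 3;

  /* The call estimate is doubled when optimizing for speed.  */
  if (c->speed)
    funccall_len *= 2;

  return expand_len <= funccall_len;
}
```

With density enabled, no longcalls and no speed bonus, an 8-byte zero fill at 4-byte alignment is estimated at 2 + 2*2 = 6 bytes against 2 + 2 + 3 = 7 for the call, so the sketch reports expansion. At 16 bytes the stores are estimated at 10 bytes and the call wins unless the 2x speed bonus applies.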