From patchwork Thu Jan 13 09:48:35 2022
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 49955
Message-ID: <51da8b34-511c-4359-b7d8-786935417c70@arm.com>
Date: Thu, 13 Jan 2022 09:48:35 +0000
From: "Andre Vieira (lists)"
To: "gcc-patches@gcc.gnu.org"
Cc: Richard Sandiford, Richard Biener
Subject: [vect] PR103997: Fix epilogue mode skipping

This time to the list too (sorry for the double email).

Hi,

The original patch '[vect] Re-analyze all modes for epilogues' skipped
modes that should not be skipped, because it used the vector mode
provided by autovectorize_vector_modes to derive the minimum VF required
for it. However, those modes should only really be used to dictate
vector size. This patch instead looks for the mode in
'used_vector_modes' with the largest element size and constructs a
vector mode with that element type and the same size as the current
vector_modes[mode_i]. Since we are using the largest element size, the
NUNITS of this mode is the smallest possible VF required for an epilogue
with this mode, so we skip only the modes we are certain cannot be used.

Passes bootstrap and regression on x86_64 and aarch64.

gcc/ChangeLog:

        PR 103997
        * tree-vect-loop.c (vect_analyze_loop): Fix mode skipping for epilogue
        vectorization.

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index ba67de490bbd033b6db6217c8f9f9ca04cec323b..87b5ec5b4c6cb40e922b1e04bb7777ce74233af8 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3038,12 +3038,37 @@ vect_analyze_loop (class loop *loop, vec_info_shared *shared)
          would be at least as high as the main loop's and we would be
          vectorizing for more scalar iterations than there would be left.  */
       if (!supports_partial_vectors
-          && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
-        {
-          mode_i++;
-          if (mode_i == vector_modes.length ())
-            break;
-          continue;
+          && VECTOR_MODE_P (vector_modes[mode_i]))
+        {
+          /* To make sure we are conservative as to what modes we skip, we
+             should check the smallest possible NUNITS which would be
+             derived from the mode in USED_VECTOR_MODES with the largest
+             element size.  */
+          scalar_mode max_elsize_mode = GET_MODE_INNER (vector_modes[mode_i]);
+          for (vec_info::mode_set::iterator i =
+                 first_loop_vinfo->used_vector_modes.begin ();
+               i != first_loop_vinfo->used_vector_modes.end (); ++i)
+            {
+              if (VECTOR_MODE_P (*i)
+                  && GET_MODE_SIZE (GET_MODE_INNER (*i))
+                     > GET_MODE_SIZE (max_elsize_mode))
+                max_elsize_mode = GET_MODE_INNER (*i);
+            }
+          /* After finding the largest element size used in the main loop, find
+             the related vector mode with the same size as the mode
+             corresponding to the current MODE_I.  */
+          machine_mode max_elsize_vector_mode =
+            related_vector_mode (vector_modes[mode_i], max_elsize_mode,
+                                 0).else_void ();
+          if (VECTOR_MODE_P (max_elsize_vector_mode)
+              && maybe_ge (GET_MODE_NUNITS (max_elsize_vector_mode),
+                           first_vinfo_vf))
+            {
+              mode_i++;
+              if (mode_i == vector_modes.length ())
+                break;
+              continue;
+            }
         }
 
       if (dump_enabled_p ())
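
For readers who want to see the arithmetic behind the new skip condition,
here is a minimal standalone sketch. It is not GCC code and makes several
simplifying assumptions: VecMode, nunits and can_skip_epilogue_mode are
hypothetical names invented for this illustration, vectors are modelled as
fixed-length with sizes in bytes, and the main loop's VF is a plain integer
rather than a poly_uint64.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vector machine mode: total vector size and
// element size, both in bytes.
struct VecMode {
  std::size_t vector_bytes;
  std::size_t element_bytes;
};

// Number of elements in the vector (a simplified GET_MODE_NUNITS).
inline std::size_t nunits(VecMode m) {
  assert(m.element_bytes != 0);
  return m.vector_bytes / m.element_bytes;
}

// Returns true when the candidate epilogue mode can safely be skipped.
// The candidate only dictates the vector size; the element size that
// matters is the largest one the main loop actually used, because it
// yields the smallest VF an epilogue with this vector size could have.
inline bool can_skip_epilogue_mode(VecMode candidate,
                                   const std::vector<VecMode>& used_modes,
                                   std::size_t main_loop_vf) {
  std::size_t max_elem = candidate.element_bytes;
  for (const VecMode& m : used_modes)
    if (m.element_bytes > max_elem)
      max_elem = m.element_bytes;

  // Assumption of this model: if no vector of this size exists for the
  // widest element, stay conservative and do not skip.
  if (max_elem > candidate.vector_bytes)
    return false;

  VecMode widest{candidate.vector_bytes, max_elem};
  return nunits(widest) >= main_loop_vf;
}
```

Take a main loop at VF 16 that used both 1-byte and 4-byte elements in
16-byte vectors. In this model a 16-byte candidate is not skipped, because
the smallest epilogue VF it allows is 16 / 4 = 4, whereas checking the
candidate's own NUNITS (16 / 1 = 16) would have skipped it. A 64-byte
candidate is still skipped, since 64 / 4 = 16 is not below the main loop's
VF.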