From patchwork Fri Mar 25 14:26:16 2022
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 52350
Date: Fri, 25 Mar 2022 15:26:16 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/105053 - fix reduction chain epilogue generation
Message-Id: <20220325142616.A480D132E9@imap2.suse-dmz.suse.de>

When we optimize permutations in a reduction chain we have to be careful
to select the correct live-out stmt, otherwise the reduction result will
be unused and the retained scalar code will execute only the number of
vector iterations.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.

2022-03-25  Richard Biener

	PR tree-optimization/105053
	* tree-vect-loop.cc (vect_create_epilog_for_reduction): Pick
	the correct live-out stmt for a reduction chain.

	* g++.dg/vect/pr105053.cc: New testcase.
---
 gcc/testsuite/g++.dg/vect/pr105053.cc | 25 +++++++++++++++++++++++++
 gcc/tree-vect-loop.cc                 | 14 +++++++++++---
 2 files changed, 36 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr105053.cc

diff --git a/gcc/testsuite/g++.dg/vect/pr105053.cc b/gcc/testsuite/g++.dg/vect/pr105053.cc
new file mode 100644
index 00000000000..6deef8458fc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr105053.cc
@@ -0,0 +1,25 @@
+// { dg-require-effective-target c++11 }
+// { dg-require-effective-target int32plus }
+
+#include <vector>
+#include <tuple>
+#include <algorithm>
+
+int main()
+{
+  const int n = 4;
+  std::vector<std::tuple<int,int,double>> vec
+      = { { 1597201307, 1817606674, 0. },
+            { 1380347796, 1721941769, 0.},
+            {837975613, 1032707773, 0.},
+            {1173654292, 2020064272, 0.} } ;
+  int sup1 = 0;
+  for(int i=0;i<n;++i)
+    sup1=std::max(sup1,std::max(std::get<0>(vec[i]),std::get<1>(vec[i])));
+  int sup2 = 0;
+  for(int i=0;i<n;++i)
+    sup2=std::max(std::max(sup2,std::get<0>(vec[i])),std::get<1>(vec[i]));
+  if (sup1 != sup2)
+    std::abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 7a74633e0b4..d7bc34636bd 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -5271,9 +5271,17 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
     /* All statements produce live-out values.  */
     live_out_stmts = SLP_TREE_SCALAR_STMTS (slp_node);
   else if (slp_node)
-    /* The last statement in the reduction chain produces the live-out
-       value.  */
-    single_live_out_stmt[0] = SLP_TREE_SCALAR_STMTS (slp_node)[group_size - 1];
+    {
+      /* The last statement in the reduction chain produces the live-out
+         value.  Note SLP optimization can shuffle scalar stmts to
+         optimize permutations so we have to search for the last stmt.  */
+      for (k = 0; k < group_size; ++k)
+        if (!REDUC_GROUP_NEXT_ELEMENT (SLP_TREE_SCALAR_STMTS (slp_node)[k]))
+          {
+            single_live_out_stmt[0] = SLP_TREE_SCALAR_STMTS (slp_node)[k];
+            break;
+          }
+    }
 
   unsigned vec_num;
   int ncopies;
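
For readers unfamiliar with the vectorizer internals, the idea behind the
tree-vect-loop.cc hunk can be sketched outside of GCC.  The following is
only an illustration under stated assumptions, not GCC code: ChainStmt,
find_live_out_index and live_out_index_after_shuffle are hypothetical
names, and the `next' pointer merely stands in for what
REDUC_GROUP_NEXT_ELEMENT returns.  It shows why indexing the stored stmts
with group_size - 1 can pick the wrong one once the storage order has been
permuted, while searching for the stmt without a successor does not depend
on that order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scalar stmt in a reduction chain.  `next`
// plays the role of REDUC_GROUP_NEXT_ELEMENT: null only for the last stmt.
struct ChainStmt {
  int id;
  const ChainStmt *next;
};

// Return the position in `stmts` of the stmt that ends the chain,
// independent of storage order; returns stmts.size() if there is none.
std::size_t find_live_out_index(const std::vector<const ChainStmt *> &stmts) {
  for (std::size_t k = 0; k < stmts.size(); ++k) {
    assert(stmts[k] != nullptr);
    if (stmts[k]->next == nullptr)
      return k;
  }
  return stmts.size();
}

// Build the chain s0 -> s1 -> s2, store it in a permuted order (an
// assumption mimicking what permutation optimization may do), and
// report where the chain's last stmt ended up.
std::size_t live_out_index_after_shuffle() {
  static const ChainStmt s2{2, nullptr};
  static const ChainStmt s1{1, &s2};
  static const ChainStmt s0{0, &s1};
  const std::vector<const ChainStmt *> shuffled = {&s0, &s2, &s1};
  return find_live_out_index(shuffled);
}
```

In this sketch the last stmt of the chain sits at position 1 of the
shuffled storage, whereas position group_size - 1 (here 2) holds s1, an
inner stmt of the chain.  Picking that one would correspond to the wrong
live-out stmt described in the commit message.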