From patchwork Fri Nov 17 13:36:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 80125
Date: Fri, 17 Nov 2023 14:36:36 +0100
From: Jakub Jelinek
To: Richard Biener
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] vect: Fix check_reduction_path [PR112374]
List-Id: Gcc-patches mailing list

Hi!

As mentioned in the PR, the intent of the r14-5076 changes was that it
doesn't count one of the uses on the use_stmt, but what actually got
implemented is that it does this processing on any op_use_stmt, even if
it is not the use_stmt statement, which means that it can increase count
even on debug stmts (-fcompare-debug failures), or if there would be some
other use stmt with 2+ uses it could count that as a single use.  Though,
because it fails whenever cnt != 1 and I believe use_stmt must be one of
the uses, it would probably fail in the latter case anyway.

The following patch fixes that by doing this extra processing only when
op_use_stmt is use_stmt, and using the normal processing otherwise
(so ignore debug stmts, and increase on any uses on the stmt).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-11-17  Jakub Jelinek

	PR tree-optimization/112374
	* tree-vect-loop.cc (check_reduction_path): Perform the cond_fn_p
	special case only if op_use_stmt == use_stmt, use as_a rather than
	dyn_cast in that case.

	* gcc.dg/pr112374-1.c: New test.
	* gcc.dg/pr112374-2.c: New test.
	* g++.dg/opt/pr112374.C: New test.

	Jakub

--- gcc/tree-vect-loop.cc.jj	2023-11-14 10:35:52.000000000 +0100
+++ gcc/tree-vect-loop.cc	2023-11-15 22:42:32.782007408 +0100
@@ -4105,9 +4105,9 @@ pop:
 	      /* In case of a COND_OP (mask, op1, op2, op1) reduction we might have
 		 op1 twice (once as definition, once as else) in the same operation.
 		 Allow this.  */
-	      if (cond_fn_p)
+	      if (cond_fn_p && op_use_stmt == use_stmt)
 		{
-		  gcall *call = dyn_cast <gcall *> (use_stmt);
+		  gcall *call = as_a <gcall *> (use_stmt);
 		  unsigned else_pos
 		    = internal_fn_else_index (internal_fn (op.code));
 
--- gcc/testsuite/gcc.dg/pr112374-1.c.jj	2023-11-16 10:27:12.064849600 +0100
+++ gcc/testsuite/gcc.dg/pr112374-1.c	2023-11-16 10:23:57.949586509 +0100
@@ -0,0 +1,20 @@
+/* PR tree-optimization/112374 */
+/* { dg-do compile } */
+/* { dg-options "-fcompare-debug -gno-statement-frontiers -O2 -w" } */
+/* { dg-additional-options "-march=skylake-avx512" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } } */
+
+void foo (int, int);
+struct S { char s[4]; };
+int a, b, c;
+
+void
+bar ()
+{
+  struct S d;
+  long e = 0;
+  for (c = 0; c < 4; ++c)
+    e |= (d.s[c] ? 3 : 0) << c;
+  if (e)
+    foo (a, b);
+}
--- gcc/testsuite/gcc.dg/pr112374-2.c.jj	2023-11-16 10:27:15.341803394 +0100
+++ gcc/testsuite/gcc.dg/pr112374-2.c	2023-11-16 10:24:11.705392556 +0100
@@ -0,0 +1,33 @@
+/* PR tree-optimization/112374 */
+/* { dg-do compile } */
+/* { dg-options "-fcompare-debug -gno-statement-frontiers -O2" } */
+/* { dg-additional-options "-march=skylake-avx512" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } } */
+
+void foo (int, int);
+struct S { char s[64]; } *p;
+char a, b;
+unsigned char c;
+int d, e;
+
+void
+bar (void)
+{
+  unsigned i;
+  long j = 0;
+  for (i = 0; i < b; ++i)
+    j |= (p->s[i] ? 3 : 0) << i;
+  if (p->s[i + 1])
+  lab:
+    for (;;)
+      ;
+  for (i = 0; i < 4; ++i)
+    j |= p->s[i] << i;
+  for (; i; i += 2)
+    if (c + 1 != a)
+      goto lab;
+  for (; i < 8; ++i)
+    j |= p->s[i] >= 6;
+  if (j)
+    foo (d, e);
+}
--- gcc/testsuite/g++.dg/opt/pr112374.C.jj	2023-11-16 10:27:52.626277708 +0100
+++ gcc/testsuite/g++.dg/opt/pr112374.C	2023-11-16 10:28:04.914104455 +0100
@@ -0,0 +1,24 @@
+// PR tree-optimization/112374
+// { dg-do compile { target c++11 } }
+// { dg-options "-fcompare-debug -gno-statement-frontiers -O2" }
+// { dg-additional-options "-march=skylake-avx512" { target i?86-*-* x86_64-*-* } }
+// { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } }
+
+struct t
+{
+  long coef[1];
+  t(const unsigned long &a) : coef{(long)a} {};
+  t(const t &a);
+};
+extern void gen_int_mode(t, int);
+struct expand_vec_perm_d {
+  unsigned char perm[64];
+  int vmode;
+  unsigned char nelt;
+};
+void expand_vec_perm_blend(struct expand_vec_perm_d *d) {
+  unsigned long mask = 0;
+  for (unsigned i = 0; i < 4; ++i)
+    mask |= (d->perm[i] >= 4 ? 3 : 0) << (i * 2);
+  gen_int_mode(mask, 0);
+}
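
Illustrative note (not part of the patch): the counting problem described
in the message can be modelled outside GCC.  The sketch below is a
simplified stand-in, assuming that each statement is just a list of integer
operand ids; Stmt, count_uses and count_for_scenario are hypothetical names
invented for this illustration and are not GCC APIs.  It only tries to show
why applying the COND_OP special case to every user makes the count depend
on whether a debug stmt is present.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GIMPLE statement; not GCC's real data structure.
struct Stmt {
  bool is_debug;          // plays the role of is_gimple_debug ()
  std::vector<int> args;  // operands, identified by small integers
  int else_pos;           // index of the "else" operand, or -1 if none
};

// Count the uses of OP over all statements in USERS.
// FIXED == false models the behaviour described as the bug: the COND_OP
// special case runs for every user.  FIXED == true models the patched
// behaviour: the special case runs only when the user is USE_STMT.
int count_uses(const std::vector<const Stmt *> &users, const Stmt *use_stmt,
               int op, bool cond_fn_p, bool fixed) {
  int cnt = 0;
  for (const Stmt *s : users) {
    if (cond_fn_p && (!fixed || s == use_stmt)) {
      // Special case: count OP among use_stmt's operands, skipping "else".
      for (std::size_t j = 0; j < use_stmt->args.size(); ++j) {
        if (static_cast<int>(j) == use_stmt->else_pos)
          continue;
        if (use_stmt->args[j] == op)
          ++cnt;
      }
    } else if (!s->is_debug) {
      // Normal processing: ignore debug stmts, count every use on the stmt.
      for (int a : s->args)
        if (a == op)
          ++cnt;
    }
  }
  return cnt;
}

// Build a COND_OP-like user (mask, op, other, op-as-else) and, optionally,
// a debug stmt that also mentions OP, then count the uses of OP.
int count_for_scenario(bool with_debug_stmt, bool fixed) {
  const int op = 1;
  Stmt cond_op{false, {10, op, 2, op}, 3};
  Stmt debug{true, {op}, -1};
  std::vector<const Stmt *> users{&cond_op};
  if (with_debug_stmt)
    users.push_back(&debug);
  return count_uses(users, &cond_op, op, /*cond_fn_p=*/true, fixed);
}
```

In this model the unpatched logic yields a count of 1 without the debug
stmt and 2 with it, which is the kind of divergence -fcompare-debug
reports; the patched logic yields 1 in both cases.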