From patchwork Fri Mar 7 13:36:08 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 107507
Date: Fri, 7 Mar 2025 14:36:08 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: richard.sandiford@arm.com, RISC-V
Subject: [PATCH] tree-optimization/119155 - wrong aligned access for vectorized packed access
Message-Id: <20250307133608.A307C13A22@imap1.dmz-prg2.suse.org>

When doing strided SLP vectorization we use the wrong alignment for
the possibly piecewise access of the vector elements for loads and
stores.  While we are carefully using element-aligned loads and
stores, that isn't enough when the original scalar accesses are
packed.  The following instead honors larger alignment when present
but correctly falls back to the alignment of the original scalar
access.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

	PR tree-optimization/119155
	* tree-vect-stmts.cc (vectorizable_store): Do not always use
	vector element alignment for VMAT_STRIDED_SLP but a more
	correct alignment towards both ends.
	(vectorizable_load): Likewise.

	* gcc.dg/vect/pr119155.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr119155.c | 26 ++++++++++++++++++++++++++
 gcc/tree-vect-stmts.cc               | 21 ++++++++++++++++++---
 2 files changed, 44 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr119155.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr119155.c b/gcc/testsuite/gcc.dg/vect/pr119155.c
new file mode 100644
index 00000000000..b860cf24b0f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr119155.c
@@ -0,0 +1,26 @@
+#include <stdlib.h>
+#include "tree-vect.h"
+
+struct s { int x; } __attribute__((packed));
+
+void __attribute__((noipa))
+f (char *xc, char *yc, int z)
+{
+  for (int i = 0; i < 100; ++i)
+    {
+      struct s *x = (struct s *) xc;
+      struct s *y = (struct s *) yc;
+      x->x += y->x;
+      xc += z;
+      yc += z;
+    }
+}
+
+int main ()
+{
+  check_vect ();
+  char *x = malloc (100 * sizeof (struct s) + 1);
+  char *y = malloc (100 * sizeof (struct s) + 1);
+  f (x + 1, y + 1, sizeof (struct s));
+  return 0;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 6bbb16beff2..7d0a7fc4033 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -8782,7 +8782,15 @@ vectorizable_store (vec_info *vinfo,
                         }
                     }
                 }
-              ltype = build_aligned_type (ltype, TYPE_ALIGN (elem_type));
+              unsigned align;
+              if (alignment_support_scheme == dr_aligned)
+                align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info));
+              else
+                align = dr_alignment (vect_dr_behavior (vinfo, first_dr_info));
+              /* Alignment is at most the access size if we do multiple stores.  */
+              if (nstores > 1)
+                align = MIN (tree_to_uhwi (TYPE_SIZE_UNIT (ltype)), align);
+              ltype = build_aligned_type (ltype, align * BITS_PER_UNIT);
               ncopies = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
             }
 
@@ -10750,8 +10758,15 @@ vectorizable_load (vec_info *vinfo,
                         }
                     }
                 }
-              /* Else fall back to the default element-wise access.  */
-              ltype = build_aligned_type (ltype, TYPE_ALIGN (TREE_TYPE (vectype)));
+              unsigned align;
+              if (alignment_support_scheme == dr_aligned)
+                align = known_alignment (DR_TARGET_ALIGNMENT (first_dr_info));
+              else
+                align = dr_alignment (vect_dr_behavior (vinfo, first_dr_info));
+              /* Alignment is at most the access size if we do multiple loads.  */
+              if (nloads > 1)
+                align = MIN (tree_to_uhwi (TYPE_SIZE_UNIT (ltype)), align);
+              ltype = build_aligned_type (ltype, align * BITS_PER_UNIT);
             }
 
           if (slp)
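
For readers following the alignment choice made in both hunks, here is a
minimal standalone sketch of the same selection in plain C.  It is not
GCC code: piece_alignment and its parameters are hypothetical names,
with aligned_scheme standing in for alignment_support_scheme ==
dr_aligned, target_align for DR_TARGET_ALIGNMENT, scalar_align for the
dr_alignment of the original access, and piece_size/npieces for the
size of ltype and nstores/nloads.  All values are assumed to be in
bytes.

```c
#include <assert.h>

/* Same packed wrapper as in the testcase: the int member is only
   guaranteed byte alignment (GCC/Clang attribute syntax).  */
struct s { int x; } __attribute__((packed));

_Static_assert (_Alignof (struct s) == 1, "packed struct is byte aligned");
_Static_assert (sizeof (struct s) == sizeof (int), "no padding added");

/* Hypothetical model of the alignment chosen for each piecewise
   access; every argument is an illustrative stand-in, in bytes.  */
unsigned
piece_alignment (int aligned_scheme, unsigned target_align,
                 unsigned scalar_align, unsigned piece_size,
                 unsigned npieces)
{
  assert (npieces >= 1);

  /* Use the larger target alignment only when the access is known to
     be aligned; otherwise fall back to the scalar access alignment.  */
  unsigned align = aligned_scheme ? target_align : scalar_align;

  /* Mirror the patch: with multiple pieces the alignment is at most
     the size of one piece.  */
  if (npieces > 1 && piece_size < align)
    align = piece_size;

  return align;
}
```

For the packed testcase the scalar alignment is 1 byte, so
piece_alignment (0, 16, 1, 4, 4) yields 1 rather than the 4 bytes that
plain element alignment would have claimed.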