Message ID: 1637286835-24590-1-git-send-email-apinski@marvell.com
State: New
Headers:
From: apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org>
Reply-To: apinski@marvell.com
Cc: Andrew Pinski <apinski@marvell.com>
To: <gcc-patches@gcc.gnu.org>
Subject: [PATCH v2] [AARCH64] Fix PR target/103100 -mstrict-align and memset on not aligned buffers
Date: Thu, 18 Nov 2021 17:53:55 -0800
Message-ID: <1637286835-24590-1-git-send-email-apinski@marvell.com>
Series: [v2,AARCH64] Fix PR target/103100 -mstrict-align and memset on not aligned buffers
Commit Message
apinski--- via Gcc-patches
Nov. 19, 2021, 1:53 a.m. UTC
From: Andrew Pinski <apinski@marvell.com>
The problem here is that aarch64_expand_setmem does not change the alignment
for strict alignment case. This is a simplified patch from what I had previously.
So constraining copy_limit to the alignment of the mem in the case of strict align
fixes the issue without checking to many other changes to the code.
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions/
gcc/ChangeLog:
* config/aarch64/aarch64.c (aarch64_expand_setmem): Constrain
copy_limit to the alignment of the mem if STRICT_ALIGNMENT is
true.
---
gcc/config/aarch64/aarch64.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
Comments
On Thu, Nov 18, 2021 at 5:55 PM apinski--- via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> From: Andrew Pinski <apinski@marvell.com>
>
> The problem here is that aarch64_expand_setmem does not change the alignment
> for strict alignment case. This is a simplified patch from what I had previously.
> So constraining copy_limit to the alignment of the mem in the case of strict align
> fixes the issue without checking to many other changes to the code.
>
> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.

Ping?

>
> gcc/ChangeLog:
>
>       * config/aarch64/aarch64.c (aarch64_expand_setmem): Constraint
>       copy_limit to the alignment of the mem if STRICT_ALIGNMENT is
>       true.
> ---
>  gcc/config/aarch64/aarch64.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 7389b5953dc..e9c2e89d8ce 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -23744,9 +23744,16 @@ aarch64_expand_setmem (rtx *operands)
>    /* Maximum amount to copy in one go.  We allow 256-bit chunks based on the
>       AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS tuning parameter.  setmem expand
>       pattern is only turned on for TARGET_SIMD.  */
> -  const int copy_limit = (aarch64_tune_params.extra_tuning_flags
> -                          & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)
> -                         ? GET_MODE_BITSIZE (TImode) : 256;
> +  int copy_limit;
> +
> +  if (aarch64_tune_params.extra_tuning_flags
> +      & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)
> +    copy_limit = GET_MODE_BITSIZE (TImode);
> +  else
> +    copy_limit = 256;
> +
> +  if (STRICT_ALIGNMENT)
> +    copy_limit = MIN (copy_limit, (int)MEM_ALIGN (dst));
>
>    while (n > 0)
>      {
> --
> 2.17.1
>
Andrew Pinski via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Thu, Nov 18, 2021 at 5:55 PM apinski--- via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> From: Andrew Pinski <apinski@marvell.com>
>>
>> The problem here is that aarch64_expand_setmem does not change the alignment
>> for strict alignment case. This is a simplified patch from what I had previously.
>> So constraining copy_limit to the alignment of the mem in the case of strict align
>> fixes the issue without checking to many other changes to the code.
>>
>> OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
>
> Ping?
>
>>
>> gcc/ChangeLog:
>>
>>       * config/aarch64/aarch64.c (aarch64_expand_setmem): Constraint
>>       copy_limit to the alignment of the mem if STRICT_ALIGNMENT is
>>       true.

This looks correct, but for a modified version of the testcase in the PR:

void f(char *x) { __builtin_memset (x, 0, 216); }

we'll now emit 216 STRB instructions, which seems a bit excessive.
I think in practice the code has only been tuned on targets that
support LDP/STP Q, so how about moving the copy_limit calculation
further up and doing:

  unsigned max_set_size = (copy_limit * 8) / BITS_PER_UNIT;

?  It would be good to have a scan-assembler-not testcase for the
testsuite.

Thanks,
Richard

>> ---
>>  gcc/config/aarch64/aarch64.c | 13 ++++++++++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 7389b5953dc..e9c2e89d8ce 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -23744,9 +23744,16 @@ aarch64_expand_setmem (rtx *operands)
>>    /* Maximum amount to copy in one go.  We allow 256-bit chunks based on the
>>       AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS tuning parameter.  setmem expand
>>       pattern is only turned on for TARGET_SIMD.  */
>> -  const int copy_limit = (aarch64_tune_params.extra_tuning_flags
>> -                          & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)
>> -                         ? GET_MODE_BITSIZE (TImode) : 256;
>> +  int copy_limit;
>> +
>> +  if (aarch64_tune_params.extra_tuning_flags
>> +      & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)
>> +    copy_limit = GET_MODE_BITSIZE (TImode);
>> +  else
>> +    copy_limit = 256;
>> +
>> +  if (STRICT_ALIGNMENT)
>> +    copy_limit = MIN (copy_limit, (int)MEM_ALIGN (dst));
>>
>>    while (n > 0)
>>      {
>> --
>> 2.17.1
>>
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 7389b5953dc..e9c2e89d8ce 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -23744,9 +23744,16 @@ aarch64_expand_setmem (rtx *operands)
   /* Maximum amount to copy in one go.  We allow 256-bit chunks based on the
      AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS tuning parameter.  setmem expand
      pattern is only turned on for TARGET_SIMD.  */
-  const int copy_limit = (aarch64_tune_params.extra_tuning_flags
-			  & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)
-			 ? GET_MODE_BITSIZE (TImode) : 256;
+  int copy_limit;
+
+  if (aarch64_tune_params.extra_tuning_flags
+      & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)
+    copy_limit = GET_MODE_BITSIZE (TImode);
+  else
+    copy_limit = 256;
+
+  if (STRICT_ALIGNMENT)
+    copy_limit = MIN (copy_limit, (int)MEM_ALIGN (dst));
 
   while (n > 0)
     {