Message ID | 20221007040325.21276-1-kito.cheng@sifive.com |
---|---|
State | New |
Headers |
Return-Path: <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> X-Original-To: patchwork@sourceware.org Delivered-To: patchwork@sourceware.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 348413898C6D for <patchwork@sourceware.org>; Fri, 7 Oct 2022 04:04:15 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) by sourceware.org (Postfix) with ESMTPS id 5E2713898C77 for <gcc-patches@gcc.gnu.org>; Fri, 7 Oct 2022 04:03:58 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 5E2713898C77 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=sifive.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=sifive.com Received: by mail-pl1-x636.google.com with SMTP id z20so3474750plb.10 for <gcc-patches@gcc.gnu.org>; Thu, 06 Oct 2022 21:03:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=8TgUasoTzW94qmSkYXxx1NlWUmpGqm2UMtNCEOTJNaY=; b=SQHFijfJWfZlTDHDvhAe8VGB4lTiHFo1BUxe7I+z2WA79nTpH1ZNTaJ9aB/dm+P0rn eZBXik43vypWcxpfHg64aPqfHjiT9UeSGN8N5pPUfh/mSCOfzNT5eZVc72yHXR9VjXmy GK1BvGPDsc0DK34neH9SJJSSlQiy63Si5zWKn5cc+iG6EPde9Furg4anW8NwZ2PXAuAo Njo2QvaxLgstiR1KMdyOtwx4KszyPfOlPzgbhPYTPCWC2T3I8asECYM19E1vPuWJsDMv /48aZKxCWMoBzaj+BcK/W/A5tEWJpluvP/UCd1iK378md7sLD1OlsSpUHgmgz/nQhyHv ZWDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=8TgUasoTzW94qmSkYXxx1NlWUmpGqm2UMtNCEOTJNaY=; b=rQFKkeaOJspF8MZBdPnqFLg259eMFYG+IUd4+u9hdUKnWcK/LMJNnkFd0OKeHXMzq8 osMpY8ywNGQEWJmKpPAPZ2rmzKa5JtSYZIYY8x9zxp8yb+UWC8UbnzXmF3WJRiAU0UJk jPlQFHoxC59JTMfDJ14Y7Fv26do4BmsUJ81ylJjilR/aja/EAPDkGGH0vSz9ZXeLpgUW faX6gNU2LpUd02T7q7P8ZePcf2Vmgj53uO/CHoiJiiHFAmBsobko1LItiiIEDV/HhpKY 1uzd7fGoR91mCegxLWDjpp0xjgeK7UI2Rie388S7cX3dEsOIsBO/XP6EOGOi1QDfavgJ U1JQ== X-Gm-Message-State: ACrzQf2yc1hG05rpvnzNKj5Evn1P5EPCdnxNA7Y1i0I/Krcu1FONPMif FSICSVBotACv+JMQ9T86FWa2aENKLD4ACkiEHVSzXTv2n4AVVkYGp+HPLOociK3rSVgahot4YH6 TlCEdFwgLP8dYr6FJ/U2kSONdyQxwGZdRLhopRT+TdxZDxv79cNZ1SJyhdcs045qAg7Bob9BpgQ == X-Google-Smtp-Source: AMsMyM44laxlAukRc/QndFFQRwJc6x19O9FOSBSfdRmp/ctGC+imIte7rKuXcAw5+WiY7zyVG7Q8tA== X-Received: by 2002:a17:90a:1c1:b0:20a:e745:bc30 with SMTP id 1-20020a17090a01c100b0020ae745bc30mr14245103pjd.131.1665115436709; Thu, 06 Oct 2022 21:03:56 -0700 (PDT) Received: from hsinchu02.internal.sifive.com (59-124-168-89.hinet-ip.hinet.net. [59.124.168.89]) by smtp.gmail.com with ESMTPSA id p16-20020a170903249000b0017f864355aasm416947plw.164.2022.10.06.21.03.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Oct 2022 21:03:56 -0700 (PDT) From: Kito Cheng <kito.cheng@sifive.com> To: gcc-patches@gcc.gnu.org, kito.cheng@gmail.com, msebor@gcc.gnu.org, jakub@gcc.gnu.org, rguenth@gcc.gnu.org, koen.zandberg@inria.fr, andy.chiu@sifive.com, guoren@kernel.org, greentime.hu@sifive.com, palmer@dabbelt.com Subject: [PATCH] PR middle-end/88345: Honor -falign-functions=N even optimized for size. Date: Fri, 7 Oct 2022 12:03:25 +0800 Message-Id: <20221007040325.21276-1-kito.cheng@sifive.com> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-13.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> List-Post: <mailto:gcc-patches@gcc.gnu.org> List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help> List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>, <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe> Cc: Monk Chiang <monk.chiang@sifive.com> Errors-To: gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org Sender: "Gcc-patches" <gcc-patches-bounces+patchwork=sourceware.org@gcc.gnu.org> |
Series |
PR middle-end/88345: Honor -falign-functions=N even optimized for size.
|
|
Commit Message
Kito Cheng
Oct. 7, 2022, 4:03 a.m. UTC
From: Monk Chiang <monk.chiang@sifive.com>
Currnetly setting of -falign-functions=N will be ignored if the function
is optimized for size or marked as cold function.
However function alignment requirement is needed even optimized for
size in some situations, RISC-V target is an example, RISC-V kernel implement
patchable function that require function must be align to at least 4 bytes for
atomicly patch the function prologue.
Here is 4 way to fix that:
1. Fix -falign-functions=N, let it work as expect on -Os and all cold
functions, which is this patch.
2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
RISC-V's FUNCTION_BOUNDARY.
3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
And this patch is the first approach.
gcc/ChangeLog:
PR middle-end/88345
* varasm.cc (assemble_start_function): Adjust function align
even optimized for size.
* doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
list of -Os.
gcc/testsuite/ChangeLog:
PR middle-end/88345
* gcc.target/i386/pr88345-1.c: New.
* gcc.target/i386/pr88345-2.c: Ditto.
* gcc.target/riscv/pr88345-1.c: Ditto.
* gcc.target/riscv/pr88345-2.c: Ditto.
---
gcc/doc/invoke.texi | 2 +-
gcc/testsuite/gcc.target/i386/pr88345-1.c | 5 +++++
gcc/testsuite/gcc.target/i386/pr88345-2.c | 5 +++++
gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
gcc/varasm.cc | 3 +--
6 files changed, 22 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
Comments
On Thu, 06 Oct 2022 21:03:25 PDT (-0700), kito.cheng@sifive.com wrote: > From: Monk Chiang <monk.chiang@sifive.com> > > Currnetly setting of -falign-functions=N will be ignored if the function > is optimized for size or marked as cold function. > > However function alignment requirement is needed even optimized for > size in some situations, RISC-V target is an example, RISC-V kernel implement > patchable function that require function must be align to at least 4 bytes for > atomicly patch the function prologue. > > Here is 4 way to fix that: > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold > functions, which is this patch. > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust > RISC-V's FUNCTION_BOUNDARY. > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N. > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that. > > And this patch is the first approach. > > gcc/ChangeLog: > > PR middle-end/88345 > * varasm.cc (assemble_start_function): Adjust function align > even optimized for size. > * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion > list of -Os. > > gcc/testsuite/ChangeLog: > > PR middle-end/88345 > * gcc.target/i386/pr88345-1.c: New. > * gcc.target/i386/pr88345-2.c: Ditto. > * gcc.target/riscv/pr88345-1.c: Ditto. > * gcc.target/riscv/pr88345-2.c: Ditto. > --- > gcc/doc/invoke.texi | 2 +- > gcc/testsuite/gcc.target/i386/pr88345-1.c | 5 +++++ > gcc/testsuite/gcc.target/i386/pr88345-2.c | 5 +++++ > gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++ > gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++ > gcc/varasm.cc | 3 +-- > 6 files changed, 22 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index a2b0b9636f0..acf98c68825 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -11381,7 +11381,7 @@ results. This is the default. > Optimize for size. @option{-Os} enables all @option{-O2} optimizations > except those that often increase code size: > > -@gccoptlist{-falign-functions -falign-jumps @gol > +@gccoptlist{-falign-jumps @gol > -falign-labels -falign-loops @gol > -fprefetch-loop-arrays -freorder-blocks-algorithm=stc} > > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c > new file mode 100644 > index 00000000000..226bb9ffc82 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128" } */ > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > + > +__attribute__((__cold__)) void a() {} > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c > new file mode 100644 > index 00000000000..a7fc3b162dd > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128 -Os" } */ > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > + > +void a() {} > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > new file mode 100644 > index 00000000000..7d5afe683eb > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128" } */ > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > + > +__attribute__((__cold__)) void a() {} > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > new file mode 100644 > index 00000000000..d4fc89d86d8 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128 -Os" } */ > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > + > +void a() {} > diff --git a/gcc/varasm.cc b/gcc/varasm.cc > index 423f3f91af8..4001648b214 100644 > --- a/gcc/varasm.cc > +++ b/gcc/varasm.cc > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname) > Note that we still need to align to DECL_ALIGN, as above, > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ > if (! DECL_USER_ALIGN (decl) > - && align_functions.levels[0].log > align > - && optimize_function_for_speed_p (cfun)) > + && align_functions.levels[0].log > align) > { > #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > int align_log = align_functions.levels[0].log; Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Though I'm not a global reviewer, so not sure how much that helps...
On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote: > > From: Monk Chiang <monk.chiang@sifive.com> > > Currnetly setting of -falign-functions=N will be ignored if the function > is optimized for size or marked as cold function. > > However function alignment requirement is needed even optimized for > size in some situations, RISC-V target is an example, RISC-V kernel implement > patchable function that require function must be align to at least 4 bytes for > atomicly patch the function prologue. > > Here is 4 way to fix that: > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold > functions, which is this patch. > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust > RISC-V's FUNCTION_BOUNDARY. > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N. > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that. > > And this patch is the first approach. The behavior changed with r0-42853-g194734e9e5501f but documentation wasn't changed to reflect that -falign-functions=N is now only a hint. I'm not sure in what circumstances users are expected to use -falign-functions and whether if it is for ABI reasons (as in this case) we should instead have a -malign-functions or other magic. Honza - any comment on your change? Thanks, Richard. > gcc/ChangeLog: > > PR middle-end/88345 > * varasm.cc (assemble_start_function): Adjust function align > even optimized for size. > * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion > list of -Os. > > gcc/testsuite/ChangeLog: > > PR middle-end/88345 > * gcc.target/i386/pr88345-1.c: New. > * gcc.target/i386/pr88345-2.c: Ditto. > * gcc.target/riscv/pr88345-1.c: Ditto. > * gcc.target/riscv/pr88345-2.c: Ditto. > --- > gcc/doc/invoke.texi | 2 +- > gcc/testsuite/gcc.target/i386/pr88345-1.c | 5 +++++ > gcc/testsuite/gcc.target/i386/pr88345-2.c | 5 +++++ > gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++ > gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++ > gcc/varasm.cc | 3 +-- > 6 files changed, 22 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index a2b0b9636f0..acf98c68825 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -11381,7 +11381,7 @@ results. This is the default. > Optimize for size. @option{-Os} enables all @option{-O2} optimizations > except those that often increase code size: > > -@gccoptlist{-falign-functions -falign-jumps @gol > +@gccoptlist{-falign-jumps @gol > -falign-labels -falign-loops @gol > -fprefetch-loop-arrays -freorder-blocks-algorithm=stc} > > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c > new file mode 100644 > index 00000000000..226bb9ffc82 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128" } */ > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > + > +__attribute__((__cold__)) void a() {} > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c > new file mode 100644 > index 00000000000..a7fc3b162dd > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128 -Os" } */ > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > + > +void a() {} > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > new file mode 100644 > index 00000000000..7d5afe683eb > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128" } */ > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > + > +__attribute__((__cold__)) void a() {} > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > new file mode 100644 > index 00000000000..d4fc89d86d8 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > @@ -0,0 +1,5 @@ > +/* { dg-do compile } */ > +/* { dg-options "-falign-functions=128 -Os" } */ > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > + > +void a() {} > diff --git a/gcc/varasm.cc b/gcc/varasm.cc > index 423f3f91af8..4001648b214 100644 > --- a/gcc/varasm.cc > +++ b/gcc/varasm.cc > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname) > Note that we still need to align to DECL_ALIGN, as above, > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ > if (! DECL_USER_ALIGN (decl) > - && align_functions.levels[0].log > align > - && optimize_function_for_speed_p (cfun)) > + && align_functions.levels[0].log > align) > { > #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > int align_log = align_functions.levels[0].log; > -- > 2.37.2 >
> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote: > > > > From: Monk Chiang <monk.chiang@sifive.com> > > > > Currnetly setting of -falign-functions=N will be ignored if the function > > is optimized for size or marked as cold function. > > > > However function alignment requirement is needed even optimized for > > size in some situations, RISC-V target is an example, RISC-V kernel implement > > patchable function that require function must be align to at least 4 bytes for > > atomicly patch the function prologue. > > > > Here is 4 way to fix that: > > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold > > functions, which is this patch. > > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust > > RISC-V's FUNCTION_BOUNDARY. > > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N. > > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that. > > > > And this patch is the first approach. > > The behavior changed with r0-42853-g194734e9e5501f but documentation > wasn't changed to reflect that -falign-functions=N is now only a hint. > > I'm not sure in what circumstances users are expected to use -falign-functions > and whether if it is for ABI reasons (as in this case) we should instead have > a -malign-functions or other magic. > > Honza - any comment on your change? This was done a long time ago and -falign-functions was/is about performance. The basic idea of the patch was to use minimal required alignment for -Os and cold functions while use -falign-functions for functions optimized for speed. This is not different what we do for other optimization options. i386 targets define quite large alignments (especially for older ones that needed function entry to be separated by given number of instructions from cache page boundary) so enabling it for all functions unconditionally is expensive (in code size). We have FUNCTION_BOUNDARY to specify minimal function alignment required by the ABI. So I think if live patches requires bigger alignment, I would go with adjusting FUNCTION_BOUNDARY with -flive-patching. Alternatively we could introduce -malign-all-function or -falign-all-functions to adjust minimal alignment if this is specific to a particular implementation of live patching in the kernel. Honza > > Thanks, > Richard. > > > gcc/ChangeLog: > > > > PR middle-end/88345 > > * varasm.cc (assemble_start_function): Adjust function align > > even optimized for size. > > * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion > > list of -Os. > > > > gcc/testsuite/ChangeLog: > > > > PR middle-end/88345 > > * gcc.target/i386/pr88345-1.c: New. > > * gcc.target/i386/pr88345-2.c: Ditto. > > * gcc.target/riscv/pr88345-1.c: Ditto. > > * gcc.target/riscv/pr88345-2.c: Ditto. > > --- > > gcc/doc/invoke.texi | 2 +- > > gcc/testsuite/gcc.target/i386/pr88345-1.c | 5 +++++ > > gcc/testsuite/gcc.target/i386/pr88345-2.c | 5 +++++ > > gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++ > > gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++ > > gcc/varasm.cc | 3 +-- > > 6 files changed, 22 insertions(+), 3 deletions(-) > > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c > > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c > > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > > index a2b0b9636f0..acf98c68825 100644 > > --- a/gcc/doc/invoke.texi > > +++ b/gcc/doc/invoke.texi > > @@ -11381,7 +11381,7 @@ results. This is the default. > > Optimize for size. @option{-Os} enables all @option{-O2} optimizations > > except those that often increase code size: > > > > -@gccoptlist{-falign-functions -falign-jumps @gol > > +@gccoptlist{-falign-jumps @gol > > -falign-labels -falign-loops @gol > > -fprefetch-loop-arrays -freorder-blocks-algorithm=stc} > > > > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c > > new file mode 100644 > > index 00000000000..226bb9ffc82 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128" } */ > > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > > + > > +__attribute__((__cold__)) void a() {} > > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c > > new file mode 100644 > > index 00000000000..a7fc3b162dd > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128 -Os" } */ > > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ > > + > > +void a() {} > > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > > new file mode 100644 > > index 00000000000..7d5afe683eb > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128" } */ > > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > > + > > +__attribute__((__cold__)) void a() {} > > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > > new file mode 100644 > > index 00000000000..d4fc89d86d8 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c > > @@ -0,0 +1,5 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-falign-functions=128 -Os" } */ > > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ > > + > > +void a() {} > > diff --git a/gcc/varasm.cc b/gcc/varasm.cc > > index 423f3f91af8..4001648b214 100644 > > --- a/gcc/varasm.cc > > +++ b/gcc/varasm.cc > > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname) > > Note that we still need to align to DECL_ALIGN, as above, > > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ > > if (! DECL_USER_ALIGN (decl) > > - && align_functions.levels[0].log > align > > - && optimize_function_for_speed_p (cfun)) > > + && align_functions.levels[0].log > align) > > { > > #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN > > int align_log = align_functions.levels[0].log; > > -- > > 2.37.2 > >
On Fri, 07 Oct 2022 05:56:39 PDT (-0700), hubicka@ucw.cz wrote: >> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote: >> > >> > From: Monk Chiang <monk.chiang@sifive.com> >> > >> > Currnetly setting of -falign-functions=N will be ignored if the function >> > is optimized for size or marked as cold function. >> > >> > However function alignment requirement is needed even optimized for >> > size in some situations, RISC-V target is an example, RISC-V kernel implement >> > patchable function that require function must be align to at least 4 bytes for >> > atomicly patch the function prologue. >> > >> > Here is 4 way to fix that: >> > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold >> > functions, which is this patch. >> > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust >> > RISC-V's FUNCTION_BOUNDARY. >> > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N. >> > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that. >> > >> > And this patch is the first approach. >> >> The behavior changed with r0-42853-g194734e9e5501f but documentation >> wasn't changed to reflect that -falign-functions=N is now only a hint. >> >> I'm not sure in what circumstances users are expected to use -falign-functions >> and whether if it is for ABI reasons (as in this case) we should instead have >> a -malign-functions or other magic. >> >> Honza - any comment on your change? > > This was done a long time ago and -falign-functions was/is about > performance. > > The basic idea of the patch was to use minimal required alignment for > -Os and cold functions while use -falign-functions for functions > optimized for speed. This is not different what we do for other > optimization options. I'd also noticed last night that -falign-functions doesn't override __attribute__((align(N))), but got too tired to write up the patch. I wasn't sure it was the desired behavior at the time, but sounds like it is? I just send a doc patch to mention that. > i386 targets define quite large alignments (especially for older ones > that needed function entry to be separated by given number of > instructions from cache page boundary) so enabling it for all functions > unconditionally is expensive (in code size). > > We have FUNCTION_BOUNDARY to specify minimal function alignment required > by the ABI. So I think if live patches requires bigger alignment, I > would go with adjusting FUNCTION_BOUNDARY with -flive-patching. > Alternatively we could introduce -malign-all-function or > -falign-all-functions to adjust minimal alignment if this is specific to > a particular implementation of live patching in the kernel. > > Honza >> >> Thanks, >> Richard. >> >> > gcc/ChangeLog: >> > >> > PR middle-end/88345 >> > * varasm.cc (assemble_start_function): Adjust function align >> > even optimized for size. >> > * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion >> > list of -Os. >> > >> > gcc/testsuite/ChangeLog: >> > >> > PR middle-end/88345 >> > * gcc.target/i386/pr88345-1.c: New. >> > * gcc.target/i386/pr88345-2.c: Ditto. >> > * gcc.target/riscv/pr88345-1.c: Ditto. >> > * gcc.target/riscv/pr88345-2.c: Ditto. >> > --- >> > gcc/doc/invoke.texi | 2 +- >> > gcc/testsuite/gcc.target/i386/pr88345-1.c | 5 +++++ >> > gcc/testsuite/gcc.target/i386/pr88345-2.c | 5 +++++ >> > gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++ >> > gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++ >> > gcc/varasm.cc | 3 +-- >> > 6 files changed, 22 insertions(+), 3 deletions(-) >> > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c >> > create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c >> > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c >> > create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c >> > >> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> > index a2b0b9636f0..acf98c68825 100644 >> > --- a/gcc/doc/invoke.texi >> > +++ b/gcc/doc/invoke.texi >> > @@ -11381,7 +11381,7 @@ results. This is the default. >> > Optimize for size. @option{-Os} enables all @option{-O2} optimizations >> > except those that often increase code size: >> > >> > -@gccoptlist{-falign-functions -falign-jumps @gol >> > +@gccoptlist{-falign-jumps @gol >> > -falign-labels -falign-loops @gol >> > -fprefetch-loop-arrays -freorder-blocks-algorithm=stc} >> > >> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c >> > new file mode 100644 >> > index 00000000000..226bb9ffc82 >> > --- /dev/null >> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c >> > @@ -0,0 +1,5 @@ >> > +/* { dg-do compile } */ >> > +/* { dg-options "-falign-functions=128" } */ >> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ >> > + >> > +__attribute__((__cold__)) void a() {} >> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c >> > new file mode 100644 >> > index 00000000000..a7fc3b162dd >> > --- /dev/null >> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c >> > @@ -0,0 +1,5 @@ >> > +/* { dg-do compile } */ >> > +/* { dg-options "-falign-functions=128 -Os" } */ >> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ >> > + >> > +void a() {} >> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c >> > new file mode 100644 >> > index 00000000000..7d5afe683eb >> > --- /dev/null >> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c >> > @@ -0,0 +1,5 @@ >> > +/* { dg-do compile } */ >> > +/* { dg-options "-falign-functions=128" } */ >> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ >> > + >> > +__attribute__((__cold__)) void a() {} >> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c >> > new file mode 100644 >> > index 00000000000..d4fc89d86d8 >> > --- /dev/null >> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c >> > @@ -0,0 +1,5 @@ >> > +/* { dg-do compile } */ >> > +/* { dg-options "-falign-functions=128 -Os" } */ >> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ >> > + >> > +void a() {} >> > diff --git a/gcc/varasm.cc b/gcc/varasm.cc >> > index 423f3f91af8..4001648b214 100644 >> > --- a/gcc/varasm.cc >> > +++ b/gcc/varasm.cc >> > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname) >> > Note that we still need to align to DECL_ALIGN, as above, >> > because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ >> > if (! DECL_USER_ALIGN (decl) >> > - && align_functions.levels[0].log > align >> > - && optimize_function_for_speed_p (cfun)) >> > + && align_functions.levels[0].log > align) >> > { >> > #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN >> > int align_log = align_functions.levels[0].log; >> > -- >> > 2.37.2 >> >
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index a2b0b9636f0..acf98c68825 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -11381,7 +11381,7 @@ results. This is the default. Optimize for size. @option{-Os} enables all @option{-O2} optimizations except those that often increase code size: -@gccoptlist{-falign-functions -falign-jumps @gol +@gccoptlist{-falign-jumps @gol -falign-labels -falign-loops @gol -fprefetch-loop-arrays -freorder-blocks-algorithm=stc} diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c new file mode 100644 index 00000000000..226bb9ffc82 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c @@ -0,0 +1,5 @@ +/* { dg-do compile } */ +/* { dg-options "-falign-functions=128" } */ +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ + +__attribute__((__cold__)) void a() {} diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c new file mode 100644 index 00000000000..a7fc3b162dd --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c @@ -0,0 +1,5 @@ +/* { dg-do compile } */ +/* { dg-options "-falign-functions=128 -Os" } */ +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */ + +void a() {} diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c new file mode 100644 index 00000000000..7d5afe683eb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c @@ -0,0 +1,5 @@ +/* { dg-do compile } */ +/* { dg-options "-falign-functions=128" } */ +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ + +__attribute__((__cold__)) void a() {} diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c new file mode 100644 index 00000000000..d4fc89d86d8 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c @@ -0,0 +1,5 @@ +/* { dg-do compile } */ +/* { dg-options "-falign-functions=128 -Os" } */ +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */ + +void a() {} diff --git a/gcc/varasm.cc b/gcc/varasm.cc index 423f3f91af8..4001648b214 100644 --- a/gcc/varasm.cc +++ b/gcc/varasm.cc @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname) Note that we still need to align to DECL_ALIGN, as above, because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all. */ if (! DECL_USER_ALIGN (decl) - && align_functions.levels[0].log > align - && optimize_function_for_speed_p (cfun)) + && align_functions.levels[0].log > align) { #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN int align_log = align_functions.levels[0].log;