PR middle-end/88345: Honor -falign-functions=N even optimized for size.

Message ID 20221007040325.21276-1-kito.cheng@sifive.com
State New
Headers
Series PR middle-end/88345: Honor -falign-functions=N even optimized for size. |

Commit Message

Kito Cheng Oct. 7, 2022, 4:03 a.m. UTC
  From: Monk Chiang <monk.chiang@sifive.com>

Currnetly setting of -falign-functions=N will be ignored if the function
is optimized for size or marked as cold function.

However function alignment requirement is needed even optimized for
size in some situations, RISC-V target is an example, RISC-V kernel implement
patchable function that require function must be align to at least 4 bytes for
atomicly patch the function prologue.

Here is 4 way to fix that:
1. Fix -falign-functions=N, let it work as expect on -Os and all cold
functions, which is this patch.
2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
RISC-V's FUNCTION_BOUNDARY.
3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.

And this patch is the first approach.

gcc/ChangeLog:

	PR middle-end/88345
	* varasm.cc (assemble_start_function): Adjust function align
	even optimized for size.
	* doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
	list of -Os.

gcc/testsuite/ChangeLog:

	PR middle-end/88345
	* gcc.target/i386/pr88345-1.c: New.
	* gcc.target/i386/pr88345-2.c: Ditto.
	* gcc.target/riscv/pr88345-1.c: Ditto.
	* gcc.target/riscv/pr88345-2.c: Ditto.
---
 gcc/doc/invoke.texi                        | 2 +-
 gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
 gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
 gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
 gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
 gcc/varasm.cc                              | 3 +--
 6 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
  

Comments

Palmer Dabbelt Oct. 7, 2022, 4:52 a.m. UTC | #1
On Thu, 06 Oct 2022 21:03:25 PDT (-0700), kito.cheng@sifive.com wrote:
> From: Monk Chiang <monk.chiang@sifive.com>
>
> Currnetly setting of -falign-functions=N will be ignored if the function
> is optimized for size or marked as cold function.
>
> However function alignment requirement is needed even optimized for
> size in some situations, RISC-V target is an example, RISC-V kernel implement
> patchable function that require function must be align to at least 4 bytes for
> atomicly patch the function prologue.
>
> Here is 4 way to fix that:
> 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
> functions, which is this patch.
> 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
> RISC-V's FUNCTION_BOUNDARY.
> 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
> 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
>
> And this patch is the first approach.
>
> gcc/ChangeLog:
>
> 	PR middle-end/88345
> 	* varasm.cc (assemble_start_function): Adjust function align
> 	even optimized for size.
> 	* doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
> 	list of -Os.
>
> gcc/testsuite/ChangeLog:
>
> 	PR middle-end/88345
> 	* gcc.target/i386/pr88345-1.c: New.
> 	* gcc.target/i386/pr88345-2.c: Ditto.
> 	* gcc.target/riscv/pr88345-1.c: Ditto.
> 	* gcc.target/riscv/pr88345-2.c: Ditto.
> ---
>  gcc/doc/invoke.texi                        | 2 +-
>  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
>  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
>  gcc/varasm.cc                              | 3 +--
>  6 files changed, 22 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index a2b0b9636f0..acf98c68825 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11381,7 +11381,7 @@ results.  This is the default.
>  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
>  except those that often increase code size:
>
> -@gccoptlist{-falign-functions  -falign-jumps @gol
> +@gccoptlist{-falign-jumps @gol
>  -falign-labels  -falign-loops @gol
>  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> new file mode 100644
> index 00000000000..226bb9ffc82
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> new file mode 100644
> index 00000000000..a7fc3b162dd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> new file mode 100644
> index 00000000000..7d5afe683eb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> new file mode 100644
> index 00000000000..d4fc89d86d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 423f3f91af8..4001648b214 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
>       Note that we still need to align to DECL_ALIGN, as above,
>       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
>    if (! DECL_USER_ALIGN (decl)
> -      && align_functions.levels[0].log > align
> -      && optimize_function_for_speed_p (cfun))
> +      && align_functions.levels[0].log > align)
>      {
>  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
>        int align_log = align_functions.levels[0].log;

Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>

Though I'm not a global reviewer, so not sure how much that helps...
  
Richard Biener Oct. 7, 2022, 6:32 a.m. UTC | #2
On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote:
>
> From: Monk Chiang <monk.chiang@sifive.com>
>
> Currnetly setting of -falign-functions=N will be ignored if the function
> is optimized for size or marked as cold function.
>
> However function alignment requirement is needed even optimized for
> size in some situations, RISC-V target is an example, RISC-V kernel implement
> patchable function that require function must be align to at least 4 bytes for
> atomicly patch the function prologue.
>
> Here is 4 way to fix that:
> 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
> functions, which is this patch.
> 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
> RISC-V's FUNCTION_BOUNDARY.
> 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
> 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
>
> And this patch is the first approach.

The behavior changed with r0-42853-g194734e9e5501f but documentation
wasn't changed to reflect that -falign-functions=N is now only a hint.

I'm not sure in what circumstances users are expected to use -falign-functions
and whether if it is for ABI reasons (as in this case) we should instead have
a -malign-functions or other magic.

Honza - any comment on your change?

Thanks,
Richard.

> gcc/ChangeLog:
>
>         PR middle-end/88345
>         * varasm.cc (assemble_start_function): Adjust function align
>         even optimized for size.
>         * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
>         list of -Os.
>
> gcc/testsuite/ChangeLog:
>
>         PR middle-end/88345
>         * gcc.target/i386/pr88345-1.c: New.
>         * gcc.target/i386/pr88345-2.c: Ditto.
>         * gcc.target/riscv/pr88345-1.c: Ditto.
>         * gcc.target/riscv/pr88345-2.c: Ditto.
> ---
>  gcc/doc/invoke.texi                        | 2 +-
>  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
>  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
>  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
>  gcc/varasm.cc                              | 3 +--
>  6 files changed, 22 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index a2b0b9636f0..acf98c68825 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -11381,7 +11381,7 @@ results.  This is the default.
>  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
>  except those that often increase code size:
>
> -@gccoptlist{-falign-functions  -falign-jumps @gol
> +@gccoptlist{-falign-jumps @gol
>  -falign-labels  -falign-loops @gol
>  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
>
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> new file mode 100644
> index 00000000000..226bb9ffc82
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> new file mode 100644
> index 00000000000..a7fc3b162dd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> new file mode 100644
> index 00000000000..7d5afe683eb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +__attribute__((__cold__)) void a() {}
> diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> new file mode 100644
> index 00000000000..d4fc89d86d8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> @@ -0,0 +1,5 @@
> +/* { dg-do compile } */
> +/* { dg-options "-falign-functions=128 -Os" } */
> +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> +
> +void a() {}
> diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> index 423f3f91af8..4001648b214 100644
> --- a/gcc/varasm.cc
> +++ b/gcc/varasm.cc
> @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
>       Note that we still need to align to DECL_ALIGN, as above,
>       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
>    if (! DECL_USER_ALIGN (decl)
> -      && align_functions.levels[0].log > align
> -      && optimize_function_for_speed_p (cfun))
> +      && align_functions.levels[0].log > align)
>      {
>  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
>        int align_log = align_functions.levels[0].log;
> --
> 2.37.2
>
  
Jan Hubicka Oct. 7, 2022, 12:56 p.m. UTC | #3
> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote:
> >
> > From: Monk Chiang <monk.chiang@sifive.com>
> >
> > Currnetly setting of -falign-functions=N will be ignored if the function
> > is optimized for size or marked as cold function.
> >
> > However function alignment requirement is needed even optimized for
> > size in some situations, RISC-V target is an example, RISC-V kernel implement
> > patchable function that require function must be align to at least 4 bytes for
> > atomicly patch the function prologue.
> >
> > Here is 4 way to fix that:
> > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
> > functions, which is this patch.
> > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
> > RISC-V's FUNCTION_BOUNDARY.
> > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
> > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
> >
> > And this patch is the first approach.
> 
> The behavior changed with r0-42853-g194734e9e5501f but documentation
> wasn't changed to reflect that -falign-functions=N is now only a hint.
> 
> I'm not sure in what circumstances users are expected to use -falign-functions
> and whether if it is for ABI reasons (as in this case) we should instead have
> a -malign-functions or other magic.
> 
> Honza - any comment on your change?

This was done a long time ago and -falign-functions was/is about
performance.

The basic idea of the patch was to use minimal required alignment for
-Os and cold functions while use -falign-functions for functions
optimized for speed.  This is not different what we do for other
optimization options.

i386 targets define quite large alignments (especially for older ones
that needed function entry to be separated by given number of
instructions from cache page boundary) so enabling it for all functions
unconditionally is expensive (in code size).

We have FUNCTION_BOUNDARY to specify minimal function alignment required
by the ABI.  So I think if live patches requires bigger alignment, I
would go with adjusting FUNCTION_BOUNDARY with -flive-patching.
Alternatively we could introduce -malign-all-function or
-falign-all-functions to adjust minimal alignment if this is specific to
a particular implementation of live patching in the kernel.

Honza
> 
> Thanks,
> Richard.
> 
> > gcc/ChangeLog:
> >
> >         PR middle-end/88345
> >         * varasm.cc (assemble_start_function): Adjust function align
> >         even optimized for size.
> >         * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
> >         list of -Os.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR middle-end/88345
> >         * gcc.target/i386/pr88345-1.c: New.
> >         * gcc.target/i386/pr88345-2.c: Ditto.
> >         * gcc.target/riscv/pr88345-1.c: Ditto.
> >         * gcc.target/riscv/pr88345-2.c: Ditto.
> > ---
> >  gcc/doc/invoke.texi                        | 2 +-
> >  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
> >  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
> >  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
> >  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
> >  gcc/varasm.cc                              | 3 +--
> >  6 files changed, 22 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index a2b0b9636f0..acf98c68825 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -11381,7 +11381,7 @@ results.  This is the default.
> >  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
> >  except those that often increase code size:
> >
> > -@gccoptlist{-falign-functions  -falign-jumps @gol
> > +@gccoptlist{-falign-jumps @gol
> >  -falign-labels  -falign-loops @gol
> >  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
> >
> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> > new file mode 100644
> > index 00000000000..226bb9ffc82
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128" } */
> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> > +
> > +__attribute__((__cold__)) void a() {}
> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> > new file mode 100644
> > index 00000000000..a7fc3b162dd
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128 -Os" } */
> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
> > +
> > +void a() {}
> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> > new file mode 100644
> > index 00000000000..7d5afe683eb
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128" } */
> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> > +
> > +__attribute__((__cold__)) void a() {}
> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> > new file mode 100644
> > index 00000000000..d4fc89d86d8
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
> > @@ -0,0 +1,5 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-falign-functions=128 -Os" } */
> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
> > +
> > +void a() {}
> > diff --git a/gcc/varasm.cc b/gcc/varasm.cc
> > index 423f3f91af8..4001648b214 100644
> > --- a/gcc/varasm.cc
> > +++ b/gcc/varasm.cc
> > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
> >       Note that we still need to align to DECL_ALIGN, as above,
> >       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
> >    if (! DECL_USER_ALIGN (decl)
> > -      && align_functions.levels[0].log > align
> > -      && optimize_function_for_speed_p (cfun))
> > +      && align_functions.levels[0].log > align)
> >      {
> >  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
> >        int align_log = align_functions.levels[0].log;
> > --
> > 2.37.2
> >
  
Palmer Dabbelt Oct. 7, 2022, 1:50 p.m. UTC | #4
On Fri, 07 Oct 2022 05:56:39 PDT (-0700), hubicka@ucw.cz wrote:
>> On Fri, Oct 7, 2022 at 6:04 AM Kito Cheng <kito.cheng@sifive.com> wrote:
>> >
>> > From: Monk Chiang <monk.chiang@sifive.com>
>> >
>> > Currnetly setting of -falign-functions=N will be ignored if the function
>> > is optimized for size or marked as cold function.
>> >
>> > However function alignment requirement is needed even optimized for
>> > size in some situations, RISC-V target is an example, RISC-V kernel implement
>> > patchable function that require function must be align to at least 4 bytes for
>> > atomicly patch the function prologue.
>> >
>> > Here is 4 way to fix that:
>> > 1. Fix -falign-functions=N, let it work as expect on -Os and all cold
>> > functions, which is this patch.
>> > 2. Force align to 4 byte if -fpatchable-function-entry is given by adjust
>> > RISC-V's FUNCTION_BOUNDARY.
>> > 3. Adjust RISC-V's FUNCTION_BOUNDARY to let it honor to -falign-functions=N.
>> > 4. Adding a -malign-functions=N for RISC-V...which x86 already deprecated that.
>> >
>> > And this patch is the first approach.
>>
>> The behavior changed with r0-42853-g194734e9e5501f but documentation
>> wasn't changed to reflect that -falign-functions=N is now only a hint.
>>
>> I'm not sure in what circumstances users are expected to use -falign-functions
>> and whether if it is for ABI reasons (as in this case) we should instead have
>> a -malign-functions or other magic.
>>
>> Honza - any comment on your change?
>
> This was done a long time ago and -falign-functions was/is about
> performance.
>
> The basic idea of the patch was to use minimal required alignment for
> -Os and cold functions while use -falign-functions for functions
> optimized for speed.  This is not different what we do for other
> optimization options.

I'd also noticed last night that -falign-functions doesn't override 
__attribute__((align(N))), but got too tired to write up the patch.  I 
wasn't sure it was the desired behavior at the time, but sounds like it 
is?  I just send a doc patch to mention that.

> i386 targets define quite large alignments (especially for older ones
> that needed function entry to be separated by given number of
> instructions from cache page boundary) so enabling it for all functions
> unconditionally is expensive (in code size).
>
> We have FUNCTION_BOUNDARY to specify minimal function alignment required
> by the ABI.  So I think if live patches requires bigger alignment, I
> would go with adjusting FUNCTION_BOUNDARY with -flive-patching.
> Alternatively we could introduce -malign-all-function or
> -falign-all-functions to adjust minimal alignment if this is specific to
> a particular implementation of live patching in the kernel.
>
> Honza
>>
>> Thanks,
>> Richard.
>>
>> > gcc/ChangeLog:
>> >
>> >         PR middle-end/88345
>> >         * varasm.cc (assemble_start_function): Adjust function align
>> >         even optimized for size.
>> >         * doc/invoke.texi (Os): Remove -falign-functions= from the exclusion
>> >         list of -Os.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >         PR middle-end/88345
>> >         * gcc.target/i386/pr88345-1.c: New.
>> >         * gcc.target/i386/pr88345-2.c: Ditto.
>> >         * gcc.target/riscv/pr88345-1.c: Ditto.
>> >         * gcc.target/riscv/pr88345-2.c: Ditto.
>> > ---
>> >  gcc/doc/invoke.texi                        | 2 +-
>> >  gcc/testsuite/gcc.target/i386/pr88345-1.c  | 5 +++++
>> >  gcc/testsuite/gcc.target/i386/pr88345-2.c  | 5 +++++
>> >  gcc/testsuite/gcc.target/riscv/pr88345-1.c | 5 +++++
>> >  gcc/testsuite/gcc.target/riscv/pr88345-2.c | 5 +++++
>> >  gcc/varasm.cc                              | 3 +--
>> >  6 files changed, 22 insertions(+), 3 deletions(-)
>> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-1.c
>> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr88345-2.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-1.c
>> >  create mode 100644 gcc/testsuite/gcc.target/riscv/pr88345-2.c
>> >
>> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> > index a2b0b9636f0..acf98c68825 100644
>> > --- a/gcc/doc/invoke.texi
>> > +++ b/gcc/doc/invoke.texi
>> > @@ -11381,7 +11381,7 @@ results.  This is the default.
>> >  Optimize for size.  @option{-Os} enables all @option{-O2} optimizations
>> >  except those that often increase code size:
>> >
>> > -@gccoptlist{-falign-functions  -falign-jumps @gol
>> > +@gccoptlist{-falign-jumps @gol
>> >  -falign-labels  -falign-loops @gol
>> >  -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
>> >
>> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
>> > new file mode 100644
>> > index 00000000000..226bb9ffc82
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128" } */
>> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
>> > +
>> > +__attribute__((__cold__)) void a() {}
>> > diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
>> > new file mode 100644
>> > index 00000000000..a7fc3b162dd
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128 -Os" } */
>> > +/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
>> > +
>> > +void a() {}
>> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
>> > new file mode 100644
>> > index 00000000000..7d5afe683eb
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128" } */
>> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
>> > +
>> > +__attribute__((__cold__)) void a() {}
>> > diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
>> > new file mode 100644
>> > index 00000000000..d4fc89d86d8
>> > --- /dev/null
>> > +++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
>> > @@ -0,0 +1,5 @@
>> > +/* { dg-do compile } */
>> > +/* { dg-options "-falign-functions=128 -Os" } */
>> > +/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
>> > +
>> > +void a() {}
>> > diff --git a/gcc/varasm.cc b/gcc/varasm.cc
>> > index 423f3f91af8..4001648b214 100644
>> > --- a/gcc/varasm.cc
>> > +++ b/gcc/varasm.cc
>> > @@ -1923,8 +1923,7 @@ assemble_start_function (tree decl, const char *fnname)
>> >       Note that we still need to align to DECL_ALIGN, as above,
>> >       because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
>> >    if (! DECL_USER_ALIGN (decl)
>> > -      && align_functions.levels[0].log > align
>> > -      && optimize_function_for_speed_p (cfun))
>> > +      && align_functions.levels[0].log > align)
>> >      {
>> >  #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
>> >        int align_log = align_functions.levels[0].log;
>> > --
>> > 2.37.2
>> >
  

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a2b0b9636f0..acf98c68825 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -11381,7 +11381,7 @@  results.  This is the default.
 Optimize for size.  @option{-Os} enables all @option{-O2} optimizations 
 except those that often increase code size:
 
-@gccoptlist{-falign-functions  -falign-jumps @gol
+@gccoptlist{-falign-jumps @gol
 -falign-labels  -falign-loops @gol
 -fprefetch-loop-arrays  -freorder-blocks-algorithm=stc}
 
diff --git a/gcc/testsuite/gcc.target/i386/pr88345-1.c b/gcc/testsuite/gcc.target/i386/pr88345-1.c
new file mode 100644
index 00000000000..226bb9ffc82
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88345-1.c
@@ -0,0 +1,5 @@ 
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128" } */
+/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
+
+__attribute__((__cold__)) void a() {}
diff --git a/gcc/testsuite/gcc.target/i386/pr88345-2.c b/gcc/testsuite/gcc.target/i386/pr88345-2.c
new file mode 100644
index 00000000000..a7fc3b162dd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr88345-2.c
@@ -0,0 +1,5 @@ 
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128 -Os" } */
+/* { dg-final { scan-assembler-times "\.p2align\t7" 1 } } */
+
+void a() {}
diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-1.c b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
new file mode 100644
index 00000000000..7d5afe683eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr88345-1.c
@@ -0,0 +1,5 @@ 
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128" } */
+/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
+
+__attribute__((__cold__)) void a() {}
diff --git a/gcc/testsuite/gcc.target/riscv/pr88345-2.c b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
new file mode 100644
index 00000000000..d4fc89d86d8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr88345-2.c
@@ -0,0 +1,5 @@ 
+/* { dg-do compile } */
+/* { dg-options "-falign-functions=128 -Os" } */
+/* { dg-final { scan-assembler-times "\.align\t7" 1 } } */
+
+void a() {}
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 423f3f91af8..4001648b214 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -1923,8 +1923,7 @@  assemble_start_function (tree decl, const char *fnname)
      Note that we still need to align to DECL_ALIGN, as above,
      because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
   if (! DECL_USER_ALIGN (decl)
-      && align_functions.levels[0].log > align
-      && optimize_function_for_speed_p (cfun))
+      && align_functions.levels[0].log > align)
     {
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
       int align_log = align_functions.levels[0].log;