[0/2,aarch64] sme/nonlocal_goto_* tests fail remat with PIE

Message ID orldmix3cw.fsf@lxoliva.fsfla.org
Headers
Series sme/nonlocal_goto_* tests fail remat with PIE |

Message

Alexandre Oliva Sept. 13, 2025, 6:39 a.m. UTC
  gcc.target/aarch64/sme/nonlocal_goto_[123].c fail on aarch64 targets
configured with --enable-default-pie.

Instead of expected 'add' insns that would compute the address of a
stack slot to pass to a function, after setting up a trampoline and
calling the instruction cache sync function with the same address, when
PIE is enabled we get an 'ldr' instead.

That's because we fail to rematerialize the address of the stack slot,
and so we allocate another pseudo to hold the address across the cache
sync call, and then restore it, as it happens, from memory.  I suspect
that's a register allocation bug, and at least part of it seems to be
related with dropping function_invariant equivalences when flag_pic is
nonzero in setup_reg_equiv, but the upcoming patch for that is not
enough for rematerialization to succeed in lra, but it is enough to
avoid the allocation of yet another pseudo in ira.

Anyhow, I am going to propose two different versions of workarounds for
the testcases, one (1v1) that tolerates the missed optimization, and
another (1v2) that forces PIE off to avoid running into the problem.
The second patch is for ira to retain the equivalence but, as mentioned
above, it is *not* enough for us to output the expected code, for
reasons I'm yet to investigate.
  

Comments

Alexandre Oliva Sept. 15, 2025, 10:57 p.m. UTC | #1
On Sep 13, 2025, Alexandre Oliva <oliva@adacore.com> wrote:

> gcc.target/aarch64/sme/nonlocal_goto_[123].c fail on aarch64 targets
> configured with --enable-default-pie.

FTR, Martin's commit afa74d37e8170d696f97424da7ab0f71883aac70 regressed
nonlocal_goto_2.c in a slightly different way, even in a standard
aarch64-elf build.  I was surprised that the warning patch caused any
codegen changes, but it seems to prevent collapsing the nonlocal label
with the return block it falls through to, and the expected code pattern
no longer matches because the separation survives through to the end of
compilation, with an extra smstart and padding.
  
Andrew Pinski Sept. 15, 2025, 11:04 p.m. UTC | #2
On Mon, Sep 15, 2025 at 3:59 PM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On Sep 13, 2025, Alexandre Oliva <oliva@adacore.com> wrote:
>
> > gcc.target/aarch64/sme/nonlocal_goto_[123].c fail on aarch64 targets
> > configured with --enable-default-pie.
>
> FTR, Martin's commit afa74d37e8170d696f97424da7ab0f71883aac70 regressed
> nonlocal_goto_2.c in a slightly different way, even in a standard
> aarch64-elf build.  I was surprised that the warning patch caused any
> codegen changes, but it seems to prevent collapsing the nonlocal label
> with the return block it falls through to, and the expected code pattern
> no longer matches because the separation survives through to the end of
> compilation, with an extra smstart and padding.

Yes the problem with non-local lable/goto's is recorded already as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121933 .
It caused gcs-nonlocal-1-track-speculation.c to fail too.

Thanks,
Andrew

>
> --
> Alexandre Oliva, happy hacker            https://blog.lx.oliva.nom.br/
> Free Software Activist     FSFLA co-founder     GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity.
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive!
  
Martin Uecker Sept. 16, 2025, 12:09 p.m. UTC | #3
Am Montag, dem 15.09.2025 um 16:04 -0700 schrieb Andrew Pinski:
> On Mon, Sep 15, 2025 at 3:59 PM Alexandre Oliva <oliva@adacore.com> wrote:
> > 
> > On Sep 13, 2025, Alexandre Oliva <oliva@adacore.com> wrote:
> > 
> > > gcc.target/aarch64/sme/nonlocal_goto_[123].c fail on aarch64 targets
> > > configured with --enable-default-pie.
> > 
> > FTR, Martin's commit afa74d37e8170d696f97424da7ab0f71883aac70 regressed
> > nonlocal_goto_2.c in a slightly different way, even in a standard
> > aarch64-elf build.  I was surprised that the warning patch caused any
> > codegen changes, but it seems to prevent collapsing the nonlocal label
> > with the return block it falls through to, and the expected code pattern
> > no longer matches because the separation survives through to the end of
> > compilation, with an extra smstart and padding.
> 
> Yes the problem with non-local lable/goto's is recorded already as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121933 .
> It caused gcs-nonlocal-1-track-speculation.c to fail too.

Sorry, for that.  The fix shown there is bootstrapped and regression
tested on aarch64.  I am travelling now but plan to submit it at the
end of the week. 

But it would certainly be nice if we *could* use the DECL_NONLOCAL bit
in the FE to track this information for decls.  So I am do not know what
consequences this might have, but if it could be made to work I would
prefer this.

Martin