diff mbox series

arm64: bti: Set PROT_BTI on all BTI executables mapped by the kernel

Message ID 20210205173837.39315-1-broonie@kernel.org
State Not Applicable
Headers show
Series arm64: bti: Set PROT_BTI on all BTI executables mapped by the kernel | expand

Commit Message

Mark Brown Feb. 5, 2021, 5:38 p.m. UTC
Currently for dynamically linked executables the kernel only enables
PROT_BTI for the interpreter, the interpreter is responsible for
enabling it for everything else including the main executable.
Unfortunately this interacts poorly with systemd's
MemoryDenyWriteExecute feature which uses a seccomp filter to prevent
setting PROT_EXEC on already mapped memory via mprotect(), it lacks the
context to detect that PROT_EXEC is already set and so refuses to allow
the mprotect() on the main executable which the kernel has already
mapped.

Since we don't want to force users to choose between having MDWX and BTI
as these are othogonal features have the kernel enable PROT_BTI for all
the ELF objects it loads, not just the dynamic linker.  This means that
if there is a problem with BTI it will be harder to disable at the
executable level but we currently have no conditional support for this
in any libc anyway so that would be new development.  Ideally we would
have interfaces that allowed us to more clearly specify what is enabled
and disabled by a given syscall but this would be a far more difficult
change to deploy.

Reported-by: Jeremy Linton <jeremy.linton@arm.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
Cc: Dave Martin <dave.martin@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: libc-alpha@sourceware.org
---

This solution was proposed by Catalin, I'm just writing it up into a
patch since it looks to be what we've converged on as the most practical
solution and but things seemed to have stalled out.

 arch/arm64/kernel/process.c | 8 --------
 1 file changed, 8 deletions(-)

Comments

Catalin Marinas Feb. 5, 2021, 5:51 p.m. UTC | #1
On Fri, Feb 05, 2021 at 05:38:37PM +0000, Mark Brown wrote:
> Currently for dynamically linked executables the kernel only enables
> PROT_BTI for the interpreter, the interpreter is responsible for
> enabling it for everything else including the main executable.
> Unfortunately this interacts poorly with systemd's
> MemoryDenyWriteExecute feature which uses a seccomp filter to prevent
> setting PROT_EXEC on already mapped memory via mprotect(), it lacks the
> context to detect that PROT_EXEC is already set and so refuses to allow
> the mprotect() on the main executable which the kernel has already
> mapped.
> 
> Since we don't want to force users to choose between having MDWX and BTI
> as these are othogonal features have the kernel enable PROT_BTI for all
> the ELF objects it loads, not just the dynamic linker.  This means that
> if there is a problem with BTI it will be harder to disable at the
> executable level but we currently have no conditional support for this
> in any libc anyway so that would be new development.  Ideally we would
> have interfaces that allowed us to more clearly specify what is enabled
> and disabled by a given syscall but this would be a far more difficult
> change to deploy.
> 
> Reported-by: Jeremy Linton <jeremy.linton@arm.com>
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
> Cc: Dave Martin <dave.martin@arm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: libc-alpha@sourceware.org

Thanks Mark for putting the patch together. You may want to add a
reference to some of the discussions around the ABI, one of them:

Link: https://lore.kernel.org/r/20201207200338.GB24625@arm.com/

(so we can keep Szabolcs accountable if something breaks ;))

For this patch:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

I wouldn't merge it as a fix yet but I'm ok with getting in 5.12 if Will
is ok. It would give us some time to revert.
Mark Brown Feb. 5, 2021, 7:01 p.m. UTC | #2
On Fri, Feb 05, 2021 at 05:51:29PM +0000, Catalin Marinas wrote:

> Thanks Mark for putting the patch together. You may want to add a
> reference to some of the discussions around the ABI, one of them:

> Link: https://lore.kernel.org/r/20201207200338.GB24625@arm.com/

> (so we can keep Szabolcs accountable if something breaks ;))

Done locally.

> For this patch:

> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

> I wouldn't merge it as a fix yet but I'm ok with getting in 5.12 if Will
> is ok. It would give us some time to revert.

Yes, I was expecting v5.12 at the earliest.
Will Deacon Feb. 8, 2021, 12:44 p.m. UTC | #3
On Fri, Feb 05, 2021 at 05:51:29PM +0000, Catalin Marinas wrote:
> On Fri, Feb 05, 2021 at 05:38:37PM +0000, Mark Brown wrote:
> > Currently for dynamically linked executables the kernel only enables
> > PROT_BTI for the interpreter, the interpreter is responsible for
> > enabling it for everything else including the main executable.
> > Unfortunately this interacts poorly with systemd's
> > MemoryDenyWriteExecute feature which uses a seccomp filter to prevent
> > setting PROT_EXEC on already mapped memory via mprotect(), it lacks the
> > context to detect that PROT_EXEC is already set and so refuses to allow
> > the mprotect() on the main executable which the kernel has already
> > mapped.
> > 
> > Since we don't want to force users to choose between having MDWX and BTI
> > as these are othogonal features have the kernel enable PROT_BTI for all
> > the ELF objects it loads, not just the dynamic linker.  This means that
> > if there is a problem with BTI it will be harder to disable at the
> > executable level but we currently have no conditional support for this
> > in any libc anyway so that would be new development.  Ideally we would
> > have interfaces that allowed us to more clearly specify what is enabled
> > and disabled by a given syscall but this would be a far more difficult
> > change to deploy.
> > 
> > Reported-by: Jeremy Linton <jeremy.linton@arm.com>
> > Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Mark Brown <broonie@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
> > Cc: Dave Martin <dave.martin@arm.com>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: libc-alpha@sourceware.org
> 
> Thanks Mark for putting the patch together. You may want to add a
> reference to some of the discussions around the ABI, one of them:
> 
> Link: https://lore.kernel.org/r/20201207200338.GB24625@arm.com/
> 
> (so we can keep Szabolcs accountable if something breaks ;))
> 
> For this patch:
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> I wouldn't merge it as a fix yet but I'm ok with getting in 5.12 if Will
> is ok. It would give us some time to revert.

I'd like an Ack from Szabolcs before we queue this.

Will
Szabolcs Nagy Feb. 8, 2021, 2:13 p.m. UTC | #4
The 02/08/2021 12:44, Will Deacon wrote:
> On Fri, Feb 05, 2021 at 05:51:29PM +0000, Catalin Marinas wrote:
> > On Fri, Feb 05, 2021 at 05:38:37PM +0000, Mark Brown wrote:
> > > Currently for dynamically linked executables the kernel only enables
> > > PROT_BTI for the interpreter, the interpreter is responsible for
> > > enabling it for everything else including the main executable.
> > > Unfortunately this interacts poorly with systemd's
> > > MemoryDenyWriteExecute feature which uses a seccomp filter to prevent
> > > setting PROT_EXEC on already mapped memory via mprotect(), it lacks the
> > > context to detect that PROT_EXEC is already set and so refuses to allow
> > > the mprotect() on the main executable which the kernel has already
> > > mapped.
> > > 
> > > Since we don't want to force users to choose between having MDWX and BTI
> > > as these are othogonal features have the kernel enable PROT_BTI for all
> > > the ELF objects it loads, not just the dynamic linker.  This means that
> > > if there is a problem with BTI it will be harder to disable at the
> > > executable level but we currently have no conditional support for this
> > > in any libc anyway so that would be new development.  Ideally we would
> > > have interfaces that allowed us to more clearly specify what is enabled
> > > and disabled by a given syscall but this would be a far more difficult
> > > change to deploy.
> > > 
> > > Reported-by: Jeremy Linton <jeremy.linton@arm.com>
> > > Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> > > Signed-off-by: Mark Brown <broonie@kernel.org>
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
> > > Cc: Dave Martin <dave.martin@arm.com>
> > > Cc: Kees Cook <keescook@chromium.org>
> > > Cc: libc-alpha@sourceware.org
> > 
> > Thanks Mark for putting the patch together. You may want to add a
> > reference to some of the discussions around the ABI, one of them:
> > 
> > Link: https://lore.kernel.org/r/20201207200338.GB24625@arm.com/
> > 
> > (so we can keep Szabolcs accountable if something breaks ;))
> > 
> > For this patch:
> > 
> > Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> > 
> > I wouldn't merge it as a fix yet but I'm ok with getting in 5.12 if Will
> > is ok. It would give us some time to revert.
> 
> I'd like an Ack from Szabolcs before we queue this.
> 
> Will

i'm ok with this in principle, but i will rerun
the glibc tests over night to be sure.
Dave Martin Feb. 8, 2021, 2:53 p.m. UTC | #5
On Fri, Feb 05, 2021 at 05:38:37PM +0000, Mark Brown via Libc-alpha wrote:
> Currently for dynamically linked executables the kernel only enables
> PROT_BTI for the interpreter, the interpreter is responsible for
> enabling it for everything else including the main executable.
> Unfortunately this interacts poorly with systemd's
> MemoryDenyWriteExecute feature which uses a seccomp filter to prevent
> setting PROT_EXEC on already mapped memory via mprotect(), it lacks the
> context to detect that PROT_EXEC is already set and so refuses to allow
> the mprotect() on the main executable which the kernel has already
> mapped.
> 
> Since we don't want to force users to choose between having MDWX and BTI
> as these are othogonal features have the kernel enable PROT_BTI for all
> the ELF objects it loads, not just the dynamic linker.  This means that
> if there is a problem with BTI it will be harder to disable at the
> executable level but we currently have no conditional support for this
> in any libc anyway so that would be new development.  Ideally we would
> have interfaces that allowed us to more clearly specify what is enabled
> and disabled by a given syscall but this would be a far more difficult
> change to deploy.
> 
> Reported-by: Jeremy Linton <jeremy.linton@arm.com>
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com>
> Cc: Dave Martin <dave.martin@arm.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: libc-alpha@sourceware.org
> ---
> 
> This solution was proposed by Catalin, I'm just writing it up into a
> patch since it looks to be what we've converged on as the most practical
> solution and but things seemed to have stalled out.
> 
>  arch/arm64/kernel/process.c | 8 --------
>  1 file changed, 8 deletions(-)
> 
> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> index 71c8265b9139..0967f9e1f9fd 100644
> --- a/arch/arm64/kernel/process.c
> +++ b/arch/arm64/kernel/process.c
> @@ -717,14 +717,6 @@ asmlinkage void __sched arm64_preempt_schedule_irq(void)
>  int arch_elf_adjust_prot(int prot, const struct arch_elf_state *state,
>  			 bool has_interp, bool is_interp)
>  {
> -	/*
> -	 * For dynamically linked executables the interpreter is
> -	 * responsible for setting PROT_BTI on everything except
> -	 * itself.
> -	 */
> -	if (is_interp != has_interp)
> -		return prot;
> -

Reviewed-by: Dave Martin <Dave.Martin@arm.com>

>  	if (!(state->flags & ARM64_ELF_BTI))
>  		return prot;

The original idea was to interfere with userspace as little as possible,
and leave this until/unless there was a clear need for it and a clear
understanding that it wouldn't break anything.

Looks like we have both of those now -- I'll leave it to Szabolcs to
confirm the userspace view of this.

Question: will this change prevent BTI executables from working under a
non-BTI-aware ld.so?  And do we care?  (I think "probably not" for both,
but I'd be interested in others' views.)

Cheers
---Dave
Mark Brown Feb. 8, 2021, 3:06 p.m. UTC | #6
On Mon, Feb 08, 2021 at 02:53:16PM +0000, Dave Martin wrote:

> Question: will this change prevent BTI executables from working under a
> non-BTI-aware ld.so?  And do we care?  (I think "probably not" for both,
> but I'd be interested in others' views.)

Yes, I'd expect that we'd end up allowing a BTIed main executable
without a BTI ld.so.  We already have the case where we will enable BTI
for the vDSO even if nothing outside the kernel supports BTI and BTI is
intended to interwork happily so I'd be surprised if there were
problems.
Szabolcs Nagy Feb. 8, 2021, 4:47 p.m. UTC | #7
The 02/08/2021 14:13, Szabolcs Nagy via Libc-alpha wrote:
> The 02/08/2021 12:44, Will Deacon wrote:
> > I'd like an Ack from Szabolcs before we queue this.
> 
> i'm ok with this in principle, but i will rerun
> the glibc tests over night to be sure.


the patch applied cleanly on top of arm64 for-next/core

but it does not work as i expected:

executables that do not have the bti property note
set seems to get bti guarded by the kernel.

at least i see crash in _start when the dynamic
linker (which has bti marking) transfers control
to it and the start code has no bti c.

(according to strace the dynamic linker did not
remap/mprotect the main exe with bti so i assume
this is the kernel's doing)

can somebody verify that the notes are checked
on the executable too and not just on ld.so?
Szabolcs Nagy Feb. 8, 2021, 4:50 p.m. UTC | #8
The 02/08/2021 15:06, Mark Brown via Libc-alpha wrote:
> On Mon, Feb 08, 2021 at 02:53:16PM +0000, Dave Martin wrote:
> 
> > Question: will this change prevent BTI executables from working under a
> > non-BTI-aware ld.so?  And do we care?  (I think "probably not" for both,
> > but I'd be interested in others' views.)
> 
> Yes, I'd expect that we'd end up allowing a BTIed main executable
> without a BTI ld.so.  We already have the case where we will enable BTI
> for the vDSO even if nothing outside the kernel supports BTI and BTI is
> intended to interwork happily so I'd be surprised if there were
> problems.

note that this is unlikely to happen (as opposed
to the other way: bti marked ld.so & non-bti exe)
because if the libc is not built with bti then
the start code (crt1.o) has no bti marking either
so the exe will not get bti marked.
Dave Martin Feb. 8, 2021, 5:40 p.m. UTC | #9
On Mon, Feb 08, 2021 at 04:47:45PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> The 02/08/2021 14:13, Szabolcs Nagy via Libc-alpha wrote:
> > The 02/08/2021 12:44, Will Deacon wrote:
> > > I'd like an Ack from Szabolcs before we queue this.
> > 
> > i'm ok with this in principle, but i will rerun
> > the glibc tests over night to be sure.
> 
> 
> the patch applied cleanly on top of arm64 for-next/core
> 
> but it does not work as i expected:
> 
> executables that do not have the bti property note
> set seems to get bti guarded by the kernel.
> 
> at least i see crash in _start when the dynamic
> linker (which has bti marking) transfers control
> to it and the start code has no bti c.
> 
> (according to strace the dynamic linker did not
> remap/mprotect the main exe with bti so i assume
> this is the kernel's doing)
> 
> can somebody verify that the notes are checked
> on the executable too and not just on ld.so?

Reviewed-by bites the dust...


Aha, looking at the ELF code in the kernel, it looks like some extra
refactoring is needed.

We do the heavy lifting only for the image containing the userspace
entry point -- i.e., ld.so in the dynamically linked case.  This
includes the ELF property handling.  When ld.so is present, the main
executable is just data so we map it in but don't do a whole lot else
with it:


static int load_elf_binary(struct linux_binprm *bprm)
{
	/* ... */

	retval = parse_elf_properties(interpreter ?: bprm->file,
				      elf_property_phdata, &arch_state);



The way I originally integrated this therefore just tracks the BTI-ness
(and/or equivalently how to mutate PROT_EXEC) once.  Looks like we need
to do that independently for ld.so and for the executable instead.

We could simplify things by treating it as an error if the executable
and ld.so have different BTI properties, but that seems a bit of an own
goal, since it breaks foreseeable backwards compatibility / hybrid use
cases.

Cheers
---Dave
Catalin Marinas Feb. 8, 2021, 6:49 p.m. UTC | #10
On Mon, Feb 08, 2021 at 05:40:29PM +0000, Dave P Martin wrote:
> On Mon, Feb 08, 2021 at 04:47:45PM +0000, Szabolcs Nagy via Libc-alpha wrote:
> > The 02/08/2021 14:13, Szabolcs Nagy via Libc-alpha wrote:
> > > The 02/08/2021 12:44, Will Deacon wrote:
> > > > I'd like an Ack from Szabolcs before we queue this.
> > > 
> > > i'm ok with this in principle, but i will rerun
> > > the glibc tests over night to be sure.
> > 
> > 
> > the patch applied cleanly on top of arm64 for-next/core
> > 
> > but it does not work as i expected:
> > 
> > executables that do not have the bti property note
> > set seems to get bti guarded by the kernel.
> > 
> > at least i see crash in _start when the dynamic
> > linker (which has bti marking) transfers control
> > to it and the start code has no bti c.
> > 
> > (according to strace the dynamic linker did not
> > remap/mprotect the main exe with bti so i assume
> > this is the kernel's doing)
> > 
> > can somebody verify that the notes are checked
> > on the executable too and not just on ld.so?

Thanks Szabolcs for testing.

> Reviewed-by bites the dust...
> 
> Aha, looking at the ELF code in the kernel, it looks like some extra
> refactoring is needed.
> 
> We do the heavy lifting only for the image containing the userspace
> entry point -- i.e., ld.so in the dynamically linked case.  This
> includes the ELF property handling.  When ld.so is present, the main
> executable is just data so we map it in but don't do a whole lot else
> with it:
> 
> static int load_elf_binary(struct linux_binprm *bprm)
> {
> 	/* ... */
> 
> 	retval = parse_elf_properties(interpreter ?: bprm->file,
> 				      elf_property_phdata, &arch_state);
> 
> The way I originally integrated this therefore just tracks the BTI-ness
> (and/or equivalently how to mutate PROT_EXEC) once.  Looks like we need
> to do that independently for ld.so and for the executable instead.
> 
> We could simplify things by treating it as an error if the executable
> and ld.so have different BTI properties, but that seems a bit of an own
> goal, since it breaks foreseeable backwards compatibility / hybrid use
> cases.

Ah, I haven't thought of these cases either.

In theory, if the dynamic loader hasn't been compiled with
-mbranch-protection=bti, it does not mean it won't be able to use
PROT_BTI on the mapped executable/libraries. The same goes for
BTI-compiled loader with a non-BTI executable, it should be allowed.

So I think decoupling the two makes sense but the patch is no longer as
simple as the ELF loader now needs to parse the main executable. If
that's doable and the code looks sane, I'd go with it. Otherwise, we
scrap the idea of the kernel setting PROT_BTI on the main executable.
diff mbox series

Patch

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 71c8265b9139..0967f9e1f9fd 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -717,14 +717,6 @@  asmlinkage void __sched arm64_preempt_schedule_irq(void)
 int arch_elf_adjust_prot(int prot, const struct arch_elf_state *state,
 			 bool has_interp, bool is_interp)
 {
-	/*
-	 * For dynamically linked executables the interpreter is
-	 * responsible for setting PROT_BTI on everything except
-	 * itself.
-	 */
-	if (is_interp != has_interp)
-		return prot;
-
 	if (!(state->flags & ARM64_ELF_BTI))
 		return prot;