From: Topi Miettinen <toiwoton@gmail.com>
To: Dave Martin <Dave.Martin@arm.com>, Jeremy Linton <jeremy.linton@arm.com>
Cc: "linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	libc-alpha@sourceware.org, systemd-devel@lists.freedesktop.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Kees Cook <keescook@chromium.org>,
	Catalin Marinas <Catalin.Marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Mark Brown <broonie@kernel.org>
Subject: Re: BTI interaction between seccomp filters in systemd and glibc mprotect calls, causing service failures
Date: Mon, 26 Oct 2020 18:39:55 +0200
Message-ID: <ed3407a9-8479-edf7-23eb-5354e77d2a58@gmail.com>
In-Reply-To: <20201026162410.GB27285@arm.com>

On 26.10.2020 18.24, Dave Martin wrote:
> On Wed, Oct 21, 2020 at 10:44:46PM -0500, Jeremy Linton via Libc-alpha wrote:
>> Hi,
>>
>> There is a problem with glibc+systemd on BTI enabled systems. Systemd
>> has a service flag "MemoryDenyWriteExecute" which uses seccomp to deny
>> PROT_EXEC changes. Glibc enables BTI only on segments which are marked as
>> being BTI compatible by calling mprotect PROT_EXEC|PROT_BTI. That call is
>> caught by the seccomp filter, resulting in service failures.
>>
>> So, at the moment one has to pick either denying PROT_EXEC changes, or BTI.
>> This is obviously not desirable.
>>
>> Various changes have been suggested: replacing the mprotect with mmap calls
>> having PROT_BTI set on the original mapping, re-mmapping the segments,
>> implying PROT_EXEC on mprotect PROT_BTI calls when VM_EXEC is already set,
>> and various modifications to seccomp to allow particular mprotect cases to
>> bypass the filters. In each case there seems to be an undesirable attribute
>> to the solution.
>>
>> So, what's the best solution?
> 
> Unrolling this discussion a bit, this problem comes from a few sources:
> 
> 1) systemd is trying to implement a policy that doesn't fit SECCOMP
> syscall filtering very well.
> 
> 2) The program is trying to do something not expressible through the
> syscall interface: really the intent is to set PROT_BTI on the page,
> with no intent to set PROT_EXEC on any page that didn't already have it
> set.
> 
> 
> This limitation of mprotect() was known when I originally added PROT_BTI,
> but at that time we weren't aware of a clear use case that would fail.
> 
> 
> Would it now help to add something like:
> 
> int mchangeprot(void *addr, size_t len, int old_flags, int new_flags)
> {
> 	int ret = -EINVAL;
> 	mmap_write_lock(current->mm);
> 	if (all vmas in [addr .. addr + len) have
> 			their mprotect flags set to old_flags) {
> 
> 		ret = mprotect(addr, len, new_flags);
> 	}
> 	
> 	mmap_write_unlock(current->mm);
> 	return ret;
> }
> 
> 
> libc would now be able to do
> 
> 	mchangeprot(addr, len, PROT_EXEC | PROT_READ,
> 		PROT_EXEC | PROT_READ | PROT_BTI);
> 
> while systemd's MDWX filter would reject the call if
> 
> 	(new_flags & PROT_EXEC) &&
> 		(!(old_flags & PROT_EXEC) || (new_flags & PROT_WRITE))
> 
> 
> 
> This won't magically fix current code, but something along these lines
> might be better going forward.
> 
> 
> Thoughts?

Looks good to me.

-Topi
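
For readers following the thread, here is a minimal sketch of the two
filter policies discussed above, written as plain C predicates rather
than as an actual seccomp/BPF program. The function names
mdwx_rejects_mprotect(), mdwx_rejects_mchangeprot() and
mdwx_self_check() are hypothetical and exist only for illustration.
The first predicate models the current behaviour as described in the
original report (any mprotect() requesting PROT_EXEC is denied); the
second is the condition Dave proposes for a filter on mchangeprot().
The PROT_BTI fallback uses the arm64 value and is only there so the
sketch also builds on other architectures.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/mman.h>

#ifndef PROT_BTI
#define PROT_BTI 0x10	/* arm64 uapi value; fallback so this builds elsewhere */
#endif

/*
 * Model of the existing policy: a filter on mprotect() sees only the
 * requested protection, so it cannot tell "add BTI to an executable
 * page" apart from "make a page executable".
 */
static bool mdwx_rejects_mprotect(int new_flags)
{
	return (new_flags & PROT_EXEC) != 0;
}

/*
 * Model of the proposed policy: with both old and new flags visible,
 * reject only if PROT_EXEC is being newly gained, or if the result
 * would be writable and executable at the same time.
 */
static bool mdwx_rejects_mchangeprot(int old_flags, int new_flags)
{
	return (new_flags & PROT_EXEC) &&
	       (!(old_flags & PROT_EXEC) || (new_flags & PROT_WRITE));
}

/* A few cases from the thread, checked with assertions. */
static void mdwx_self_check(void)
{
	int rx = PROT_READ | PROT_EXEC;

	/* The libc BTI case: denied today, allowed under the proposal. */
	assert(mdwx_rejects_mprotect(rx | PROT_BTI));
	assert(!mdwx_rejects_mchangeprot(rx, rx | PROT_BTI));

	/* Gaining PROT_EXEC on a non-executable page is still denied. */
	assert(mdwx_rejects_mchangeprot(PROT_READ | PROT_WRITE, rx));

	/* Writable and executable together is still denied. */
	assert(mdwx_rejects_mchangeprot(rx, rx | PROT_WRITE));
}
```

Calling mdwx_self_check() from a test program exercises the cases
above. A real filter would additionally depend on the kernel verifying
old_flags against the VMAs, as in the mchangeprot() pseudocode; the
predicate alone only trusts what the caller passes in.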


