linux-kernel.vger.kernel.org archive mirror
From: Andy Lutomirski <luto@kernel.org>
To: Paul Gofman <gofmanp@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>,
	Gabriel Krisman Bertazi <krisman@collabora.com>,
	Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel@collabora.com, Thomas Gleixner <tglx@linutronix.de>,
	Kees Cook <keescook@chromium.org>, Will Drewry <wad@chromium.org>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Zebediah Figura <zfigura@codeweavers.com>
Subject: Re: [PATCH RFC] seccomp: Implement syscall isolation based on memory areas
Date: Sun, 31 May 2020 11:57:02 -0700
Message-ID: <CALCETrV+rYnUnve09=n+Zb8BR8mDBq6txX9LmEw7r8tAA7d+2Q@mail.gmail.com>
In-Reply-To: <a14be8b0-a9a2-cf96-939e-cedf7e0e669a@gmail.com>

On Sun, May 31, 2020 at 11:36 AM Paul Gofman <gofmanp@gmail.com> wrote:
>
> On 5/31/20 21:10, Andy Lutomirski wrote:
> >
> > That's not what I meant.  I meant that you would set the kernel up to
> > redirect *all* syscalls from the thread with the sole exception of one
> > syscall instruction in the thunk.  This would catch Windows syscalls
> > and Linux syscalls.  The thunk would determine whether the original
> > syscall was Linux or Windows and handle it accordingly.
> >
> > This may interact poorly with the DRM scheme.  The redzone might need
> > to be respected, or stack switching might be needed.
>
> Oh yeah, I see now, thanks. Sure, we could trap every syscall and have a
> Seccomp-allowed trampoline for executing native ones with the existing
> Seccomp implementation. But this is going to have a prohibitive
> performance impact. The specifics of our present use case are that the
> vast majority of syscalls do not need to be emulated; they are native.
> Just a few come from the Windows application, which we need to trap and
> route to our handler to let the program continue, and we do not care too
> much about the overhead for those few. So the hope was that the kernel
> could route the majority of native Linux syscalls internally with only
> minor overhead. I've read the suggestion to use SECCOMP_RET_USER_NOTIF
> instead of SECCOMP_RET_TRAP. Is handling the trap this way supposed to be
> much quicker than handling the SIGSYS from SECCOMP_RET_TRAP? More
> specifically, would SECCOMP_RET_USER_NOTIF not effectively serialize all
> the syscalls waiting in a single queue for processing, while
> SECCOMP_RET_TRAP can be processed without exclusive locking?
>
>

Using SECCOMP_RET_USER_NOTIF is likely to be considerably more
expensive than my scheme.  On a non-PTI system, my approach will add a
few tens of ns to each syscall.  On a PTI system, it will be worse.
But using any kind of notifier for all syscalls will cause a context
switch to a different user program for each syscall, and that will be
much slower.

I think that the implementation may well want to live in seccomp, but
doing this as a seccomp filter isn't quite right.  It's not a security
thing -- it's an emulation thing.  Seccomp is all about making
inescapable sandboxes, but that's not what you're doing at all, and
the fact that seccomp filters are preserved across execve() sounds
like it'll be annoying for you.

What if there was a special filter type that ran a BPF program on each
syscall, and the program was allowed to access user memory to make its
decisions, e.g. to look at some list of memory addresses?  This
would explicitly *not* be a security feature -- execve() would remove
the filter, and the filter's outcome would be one of redirecting
execution or allowing the syscall.  If the "allow" outcome occurs,
then regular seccomp filters run.  Obviously the exact semantics here
would need some care.
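
The two-outcome decision described above could be sketched roughly as
follows. This is plain user-space C, not BPF and not kernel code, and it
only models the "redirect or allow" choice. All names here (emu_action,
addr_range, emu_filter) are hypothetical, and so is the assumption that
the "list of memory addresses" is a set of half-open ranges matched
against the address of the syscall instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical outcomes of the non-security filter sketched above. */
enum emu_action {
	EMU_ALLOW,	/* syscall proceeds; regular seccomp filters run next */
	EMU_REDIRECT	/* execution is diverted to a user-space handler */
};

/* Hypothetical half-open address range [start, end). */
struct addr_range {
	uintptr_t start;
	uintptr_t end;
};

/* Return 1 if ip falls inside any of the n ranges, 0 otherwise. */
static int ip_in_ranges(const struct addr_range *ranges, size_t n,
			uintptr_t ip)
{
	assert(ranges != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (ip >= ranges[i].start && ip < ranges[i].end)
			return 1;
	}
	return 0;
}

/*
 * Assumed policy: a syscall issued from a listed ("foreign") range is
 * redirected; a syscall issued from anywhere else is allowed.
 */
static enum emu_action emu_filter(const struct addr_range *foreign,
				  size_t n, uintptr_t ip)
{
	return ip_in_ranges(foreign, n, ip) ? EMU_REDIRECT : EMU_ALLOW;
}
```

With a list holding the single range [0x1000, 0x2000), a syscall
instruction at 0x1800 would be redirected and one at 0x2000 would be
allowed. What a real implementation would read from user memory, and how
it would do the redirection, is exactly the part that needs care.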
