linux-kernel.vger.kernel.org archive mirror
From: Gabriel Krisman Bertazi <krisman@collabora.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: Paul Gofman <gofmanp@gmail.com>, Linux-MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel@collabora.com, Thomas Gleixner <tglx@linutronix.de>,
	Kees Cook <keescook@chromium.org>, Will Drewry <wad@chromium.org>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Zebediah Figura <zfigura@codeweavers.com>
Subject: Re: [PATCH RFC] seccomp: Implement syscall isolation based on memory areas
Date: Mon, 01 Jun 2020 14:06:30 -0400
Message-ID: <85y2p664pl.fsf@collabora.com>
In-Reply-To: <CALCETrWr_B-quNckFksTP1W-Ww71uQgCrR-o9QWdQ-Gi8p1r9A@mail.gmail.com> (Andy Lutomirski's message of "Sun, 31 May 2020 14:03:48 -0700")

Andy Lutomirski <luto@kernel.org> writes:

> On Sun, May 31, 2020 at 11:57 AM Andy Lutomirski <luto@kernel.org> wrote:
>>
>>
>> What if there was a special filter type that ran a BPF program on each
>> syscall, and the program was allowed to access user memory to make its
>> decisions, e.g. to look at some list of memory addresses.  But this
>> would explicitly *not* be a security feature -- execve() would remove
>> the filter, and the filter's outcome would be one of redirecting
>> execution or allowing the syscall.  If the "allow" outcome occurs,
>> then regular seccomp filters run.  Obviously the exact semantics here
>> would need some care.
>
> Let me try to flesh this out a little.
>
> A task could install a syscall emulation filter (maybe using the
> seccomp() syscall, maybe using something else).  There would be at
> most one such filter per process.  Upon doing a syscall, the kernel
> will first do initial syscall fixups (e.g. SYSENTER/SYSCALL32 magic
> argument translation) and would then invoke the filter.  The filter is
> an eBPF program (sorry Kees) and, as input, it gets access to the
> task's register state and to an indication of which type of syscall
> entry this was.  This will inherently be rather architecture specific
> -- x86 choices could be int80, int80(translated), and syscall64.  (We
> could expose SYSCALL32 separately, I suppose, but SYSENTER is such a
> mess that I'm not sure this would be productive.)  The program can
> access user memory, and it returns one of two results: allow the
> syscall or send SIGSYS.  If the program tries to access user memory
> and faults, the result is SIGSYS.
>
> (I would love to do this with cBPF, but I'm not sure how to pull this
> off.  Accessing user memory is handy for making the lookup flexible
> enough to detect Windows vs Linux.  It would be *really* nice to
> finally settle the unprivileged eBPF subset discussion so that we can
> figure out how to make eBPF work here.)
>
> execve() clears the filter.  clone() copies the filter.
>
> Does this seem reasonable?  Is the implementation complexity small
> enough?  Is the eBPF thing going to be a showstopper?
>
> Using a signal instead of a bespoke thunk simplifies a lot of thorny
> details but is also enough slower that catching all syscalls might be
> a performance problem.

If we can get something close to the numbers you shared, that would be
good enough for us.  Using a thunk instead of a signal looks very
attractive for performance.

That said, I'm not convinced that this should live outside seccomp just
because it is not a security feature.  Kees's suggestion to convert
seccomp to eBPF filters and stack them would provide similar semantics
and reuse the existing infrastructure.

Finally, as you said, I'm afraid that eBPF will be a showstopper
unless unprivileged eBPF becomes a thing.  Wine cannot count on
CAP_SYS_ADMIN.
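
To make the semantics quoted above concrete, here is a minimal userspace
model of the decision such a filter would make: allow the syscall, or
send SIGSYS, with a fault on user memory also meaning SIGSYS.  This is
only a sketch under assumptions.  Every name in it (emulation_filter,
regs_model, addr_range, entry_type) is hypothetical and not a kernel
API, and the "allow if the instruction pointer lies in a listed range"
policy is just one plausible way to tell Windows code from Linux code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout: none of these exist in the kernel. */

enum filter_result { FILTER_ALLOW, FILTER_SIGSYS };

/* Which entry path was used; the x86 choices listed in the proposal. */
enum entry_type { ENTRY_INT80, ENTRY_INT80_TRANSLATED, ENTRY_SYSCALL64 };

/* Minimal stand-in for the register state the filter would see. */
struct regs_model {
	uint64_t ip;	/* address of the syscall instruction */
	uint64_t nr;	/* syscall number */
};

/* Half-open range [start, end) of code allowed to make native syscalls. */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Model of the filter decision.  A NULL table stands in for a faulting
 * user memory access, which the proposal maps to SIGSYS.
 */
static enum filter_result
emulation_filter(const struct regs_model *regs, enum entry_type entry,
		 const struct addr_range *table, size_t n)
{
	assert(regs != NULL);
	(void)entry;	/* unused here; a real filter could branch on it */

	if (table == NULL)
		return FILTER_SIGSYS;	/* "fault" reading user memory */

	for (size_t i = 0; i < n; i++) {
		if (regs->ip >= table[i].start && regs->ip < table[i].end)
			return FILTER_ALLOW;	/* seccomp filters run next */
	}
	return FILTER_SIGSYS;		/* hand the syscall to the emulator */
}
```

A caller would build the table from the address ranges of its native
code and pass the trapping instruction pointer in regs_model.  The model
says nothing about cost, which is the open question between the signal
and the thunk approach.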

-- 
Gabriel Krisman Bertazi

