linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Robert O'Callahan" <robert@ocallahan.org>
To: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: Andy Lutomirski <luto@amacapital.net>,
	Linux-MM <linux-mm@kvack.org>,
	open list <linux-kernel@vger.kernel.org>,
	kernel@collabora.com, Thomas Gleixner <tglx@linutronix.de>,
	Kees Cook <keescook@chromium.org>, Will Drewry <wad@chromium.org>,
	"H . Peter Anvin" <hpa@zytor.com>,
	Paul Gofman <gofmanp@gmail.com>
Subject: Re: [PATCH RFC] seccomp: Implement syscall isolation based on memory areas
Date: Fri, 26 Jun 2020 13:03:47 +1200	[thread overview]
Message-ID: <CAOp6jLZrzmKh34L5f=RFOW7kNKraqer7M0cuQPv-aeUNc+AcpA@mail.gmail.com> (raw)
In-Reply-To: <877dvuemfp.fsf@collabora.com>

On Fri, Jun 26, 2020 at 11:49 AM Gabriel Krisman Bertazi
<krisman@collabora.com> wrote:
> We couldn't patch Windows code because of the aforementioned DRM and
> anti-cheat mechanisms, but I suppose this limitation doesn't apply to
> Wine/native code, and if this assumption is correct, this approach could
> work.
>
> One complexity might be the consistent model for the syscall live
> patching.  I don't know how much of the problem is diminished from the
> original userspace live-patching problem, but I believe at least part of
> it applies.  And fencing every thread to patch would kill performance.
> Also, we cannot just patch everything at the beginning.  How does rr
> handle that?

That's a good point. rr only allows one tracee thread to run at a time
for other reasons, so when we consider patching a syscall instruction,
we inspect all thread states to see if the patch would interfere with
any other thread, and skip patching it in that unlikely case. (We'll
try to patch it again next time that instruction is executed.) Wine
would need some other solution, but indeed that could be a
showstopper.

> Another problem is that we will want to support i386 and other
> architectures.  For int 0x80, it is trickier to encode a branch to
> another region, given the limited instruction space, and the patching
> might not be possible in hot paths.

This is no worse than for x86-64 `syscall`, which is also two bytes.
We have pretty much always been able to patch the frequently executed
syscalls by replacing both the syscall instruction and an instruction
before or after the syscall with a five-byte jump, and folding the
extra instruction into the stub.

>I did port libsyscall-intercept to
> x86-32 once and I could correctly patch glibc, but it's not guaranteed
> that an updated libc or something else won't break it.

We haven't found this to be much of a problem in rr. From time to time
we have to add new patch patterns. The good news is that if things
change so a syscall can't be patched, the result is degraded
performance, not functional breakage.

> I'm not sure the benefit of not needing enhanced kernel support
> justifies the complexity and performance cost required to make this work
> reliably, in particular since the semantics for a kernel implementation
> that we are discussing doesn't seem overly intrusive and might have
> other applications like in the generic filter Andy mentioned.

That's fair --- our solution is complex. (But worth it --- for us,
it's valuable that rr works on quite old Linux kernels.)

As for performance, it performs well for us. I think we'd prefer our
current approach to Andy's hypothetical PR_SET_SYSCALL_THUNK because
the latter would have higher overhead (two trips into the kernel per
syscall). We definitely don't want to require CAP_SYS_ADMIN so that
rules out any eBPF-based alternative too. I would love to see a
low-overhead unprivileged syscall interception mechanism that would
obsolete our patching approach --- preferably one that's also
stackable so rr can record and replay processes that use the new
mechanism --- but I don't think any of the proposals here are that,
yet, unfortunately.

Rob
-- 
Su ot deraeppa sah dna Rehtaf eht htiw saw hcihw, efil lanrete eht uoy
ot mialcorp ew dna, ti ot yfitset dna ti nees evah ew; deraeppa efil
eht. Efil fo Drow eht gninrecnoc mialcorp ew siht - dehcuot evah sdnah
ruo dna ta dekool evah ew hcihw, seye ruo htiw nees evah ew hcihw,
draeh evah ew hcihw, gninnigeb eht morf saw hcihw taht.

  reply	other threads:[~2020-06-26  1:05 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-30  5:59 [PATCH RFC] seccomp: Implement syscall isolation based on memory areas Gabriel Krisman Bertazi
2020-05-30 17:30 ` Kees Cook
2020-05-31  5:56   ` Gabriel Krisman Bertazi
2020-05-31 12:39     ` Paul Gofman
2020-05-31 16:49       ` Matthew Wilcox
2020-05-31 17:10         ` Paul Gofman
2020-05-31 17:31           ` Matthew Wilcox
2020-05-31 18:01             ` Paul Gofman
2020-06-01 17:54               ` Gabriel Krisman Bertazi
2020-06-01 17:53         ` Gabriel Krisman Bertazi
2020-05-30 22:09 ` Andy Lutomirski
2020-05-31  0:26   ` Gabriel Krisman Bertazi
2020-05-31  0:59     ` Andy Lutomirski
2020-05-31 12:56       ` Paul Gofman
2020-05-31 18:10         ` Andy Lutomirski
2020-05-31 18:36           ` Paul Gofman
2020-05-31 18:57             ` Andy Lutomirski
2020-05-31 19:37               ` Paul Gofman
2020-05-31 21:03               ` Andy Lutomirski
2020-06-01 18:06                 ` Gabriel Krisman Bertazi
2020-06-01 20:08                 ` Kees Cook
2020-06-01 23:18                   ` Andy Lutomirski
2020-06-11 19:38                 ` Gabriel Krisman Bertazi
2020-05-31 23:33               ` Brendan Shanks
2020-06-01  1:51                 ` Andy Lutomirski
2020-06-25 23:14     ` Robert O'Callahan
2020-06-25 23:48       ` Gabriel Krisman Bertazi
2020-06-26  1:03         ` Robert O'Callahan [this message]
2020-06-05  6:06 ` Sargun Dhillon
2020-06-01  9:23 Billy Laws
2020-06-01 13:59 ` Andy Lutomirski
2020-06-01 17:48   ` hpa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOp6jLZrzmKh34L5f=RFOW7kNKraqer7M0cuQPv-aeUNc+AcpA@mail.gmail.com' \
    --to=robert@ocallahan.org \
    --cc=gofmanp@gmail.com \
    --cc=hpa@zytor.com \
    --cc=keescook@chromium.org \
    --cc=kernel@collabora.com \
    --cc=krisman@collabora.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=tglx@linutronix.de \
    --cc=wad@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).