From: Stefan Hajnoczi <stefanha@gmail.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: "Avi Kivity" <avi@redhat.com>,
	"Pekka Enberg" <penberg@cs.helsinki.fi>,
	"Tom Zanussi" <tzanussi@gmail.com>,
	"Frédéric Weisbecker" <fweisbec@gmail.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Arnaldo Carvalho de Melo" <acme@redhat.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	linux-perf-users@vger.kernel.org,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: disabling group leader perf_event
Date: Tue, 7 Sep 2010 09:33:12 +0100
Message-ID: <AANLkTik0d=d4VfWy0WFDpsQttbZ9cFTVjqmRjgY4+7v1@mail.gmail.com>
In-Reply-To: <20100907034417.GA14046@elte.hu>

On Tue, Sep 7, 2010 at 4:44 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Avi Kivity <avi@redhat.com> wrote:
>
>>  On 09/06/2010 06:47 PM, Ingo Molnar wrote:
>> >
>> >>The actual language doesn't really matter.
>> >There are 3 basic categories:
>> >
>> >  1- Most specific (least abstract) code: a block of bytecode in the form
>> >     of a simplified, executable, kernel-checked x86 machine code block -
>> >     this is also the fastest form. [yes, this is actually possible.]
>>
>> Do you then recompile it? [...]
>
> No, it's machine code. It's 'safe x86 bytecode executed natively by the
> kernel as a function'.
>
> It needs a verification pass (because the code can come from untrusted
> apps) so that we can copy, verify and trust it (so obviously it's not
> _arbitrary_ x86 machine code - a safe subset of x86) - maybe with a sha1
> based cache for already-verified snippets (or a fast verifier).
>
>> x86 is quite unpleasant.
>
> Any machine code that is fast and compact is unpleasant almost by
> definition: it's a rather non-obvious Huffman encoding embedded in an
> instruction architecture.
>
> But that's the life of kernel hackers, we deal with difficult things.
> (We could have made a career choice of selling ice cream instead, but
> it's too late I suspect.)
>
>> >  2- Least specific (most abstract) code: A subset/sideset of C - as it's
>> >     the most kernel-developer-trustable/debuggable form.
>> >
>> >  3- Everything else: little more than a dot on the spectrum between the
>> >     first two points.
>> >
>> > I lean towards #2 - but #1 looks interesting too. #3 is distinctly
>> > uninteresting as it cannot be as fast as #1 and cannot be as
>> > convenient as #2.
>>
>> Curious - how do you guarantee safety of #1 or even #2? [...]
>
> Safety of #1 (x86 bytecode passed in by untrusted user-space, verified
> and saved by the kernel and executed natively as an x86 function if it
> passes the security checks) is trivial but obviously needs quite a bit
> of work.
>
> We start with a trivial (and useless) special case of something like:
>
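> #include <linux/errno.h>	/* EINVAL */
>
> #define MAX_BYTECODE_SIZE 256
>
> int x86_bytecode_verify(const unsigned char *opcodes, unsigned int len)
> {
>        /* Rejects len == 0 (len-1 wraps to UINT_MAX) and len > MAX_BYTECODE_SIZE. */
>        if (len-1 > MAX_BYTECODE_SIZE-1)
>                return -EINVAL;
>
>        /* unsigned char: plain char is signed on x86 and would never equal 0xc3. */
>        if (opcodes[0] != 0xc3) /* RET instruction */
>                return -EINVAL;
>
>        return 0;
> }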
>
> ... and then we add checks for accepted/safe x86 patterns of
> instructions step by step - always keeping it 100% correct.
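
To make the "step by step" part concrete, here is a minimal userspace sketch of a table-driven version of the check above. The function name, the pattern table and its three entries are my own assumptions for illustration, not anything that exists in the kernel; each new table entry would be one more "step".

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_BYTECODE_SIZE 256

/* One accepted instruction encoding, matched byte for byte. */
struct safe_pattern {
	unsigned char bytes[4];
	size_t len;
};

/* Hypothetical starter allowlist; extend it one verified pattern at a time. */
static const struct safe_pattern safe_patterns[] = {
	{ { 0x90 },       1 },	/* nop */
	{ { 0x31, 0xc0 }, 2 },	/* xor %eax,%eax */
	{ { 0xc3 },       1 },	/* ret */
};

#define NR_PATTERNS (sizeof(safe_patterns) / sizeof(safe_patterns[0]))

/* Return the length of the known pattern at code[0], or 0 if none matches. */
static size_t match_pattern(const unsigned char *code, size_t remaining)
{
	size_t i;

	for (i = 0; i < NR_PATTERNS; i++) {
		const struct safe_pattern *p = &safe_patterns[i];

		if (p->len <= remaining && !memcmp(code, p->bytes, p->len))
			return p->len;
	}
	return 0;
}

int x86_bytecode_verify_patterns(const unsigned char *code, size_t len)
{
	size_t pos = 0, last = 0;

	if (len == 0 || len > MAX_BYTECODE_SIZE)
		return -EINVAL;

	while (pos < len) {
		size_t n = match_pattern(code + pos, len - pos);

		if (!n)
			return -EINVAL;	/* unknown pattern: punt */
		last = pos;
		pos += n;
	}

	/* The final instruction must be RET so execution cannot run off the end. */
	if (code[last] != 0xc3)
		return -EINVAL;

	return 0;
}
```

Anything that does not match a table entry exactly is rejected, so the accepted language only ever grows by explicit additions.
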
>
> Initially it would only allow general register operations with some
> input and output parameters in registers, and a wrapper would
> save/restore those general registers - later on stack operands and
> globals could be added too.
>
> That's not yet Turing complete but already quite functional: an amazing
> amount of logic can be expressed via generic register ops only - I think
> the filter engine could be implemented via that for example.
>
> We'd eventually make it Turing complete in the operations space we care
> about: a fixed-size stack sandbox and a virtual memory window sandbox
> area, allow conditional jumps (only to instruction boundaries).
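
The "only to instruction boundaries" rule can be checked in two passes: decode linearly to find where instructions start, then require every jump target to be one of those starts. A rough userspace sketch follows, assuming a deliberately tiny allowlist (nop, ret and short jcc rel8 only); the function names are hypothetical. It checks jump targets only and does nothing about loops or run time.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BYTECODE_SIZE 256

/*
 * Length of the instruction at code[pos], or 0 if it is not on the
 * (hypothetical) allowlist or is truncated.
 */
static size_t insn_len(const unsigned char *code, size_t pos, size_t len)
{
	unsigned char op = code[pos];

	if (op == 0x90 || op == 0xc3)		/* nop, ret */
		return 1;
	if (op >= 0x70 && op <= 0x7f)		/* jcc rel8 */
		return pos + 2 <= len ? 2 : 0;
	return 0;
}

int x86_bytecode_verify_jumps(const unsigned char *code, size_t len)
{
	bool boundary[MAX_BYTECODE_SIZE] = { false };
	size_t pos, n;

	if (len == 0 || len > MAX_BYTECODE_SIZE)
		return -EINVAL;

	/* Pass 1: decode linearly and record where each instruction starts. */
	for (pos = 0; pos < len; pos += n) {
		n = insn_len(code, pos, len);
		if (!n)
			return -EINVAL;
		boundary[pos] = true;
	}

	/* Pass 2: every jcc rel8 must land on a recorded boundary. */
	for (pos = 0; pos < len; pos += n) {
		n = insn_len(code, pos, len);
		if (n == 2) {
			/* rel8 is signed and relative to the next instruction. */
			long rel = code[pos + 1];
			long target;

			if (rel >= 128)
				rel -= 256;
			target = (long)pos + 2 + rel;

			if (target < 0 || target >= (long)len || !boundary[target])
				return -EINVAL;
		}
	}
	return 0;
}
```

A jump into the rel8 byte of another jcc, or past the end of the buffer, fails pass 2 even though every individual instruction is on the allowlist.
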
>
> The code itself is copied into kernel-space and immutable after it has
> been verified.
>
> The point is to decode only safe instructions we know, and to always
> have a 'safe' core of checking code we can extend safely and
> iteratively.
>
> Safety of #2 (C code) is like the filter engine: it's safe right now, as
> it parses the ASCII expression in-kernel, compiles it into predicates
> and executes those predicates (which are baby instructions really)
> safely.
>
> Every extension needs to be done safely, of course - and more complex
> language constructs will complicate matters for sure.
>
> Note that we have (small) bits of #1 done already in the kernel: the x86
> disassembler. Any instruction pattern we don't know or don't trust we punt
> on.
>
> ( Also note that beyond native execution this 'x86 bytecode' approach
>  would still allow JIT techniques, if we are so inclined: x86 bytecode,
>  because we fully verify it and fully know its structure (and exclude
>  nasties like self-modifying code) can be re-JIT-ed just fine.
>
>  Common sequences might even be pre-JIT-ed and cached in a hash. That
>  way we could make sequences faster post facto, via a kernel change
>  only, without impacting any user-space which only passes in the 'old'
>  sequence. Lots of flexibility. )
>
>> Can you point me to any research?
>
> Nope, haven't seen this 'safe native x86 bytecode' idea
> mentioned/researched anywhere yet.

Native Client: A Sandbox for Portable, Untrusted x86 Native Code, IEEE
Symposium on Security and Privacy, May 2009
http://nativeclient.googlecode.com/svn/data/docs_tarball/nacl/googleclient/native_client/documentation/nacl_paper.pdf

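The "Inner Sandbox" they talk about verifies a subset of x86 code.
For indirect control flow (computed jumps), they introduce a
pseudo-instruction (a mask of the destination address followed by the
jump itself), so the destination is constrained at run time to an
aligned address that the verifier has already checked.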
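
As I read the paper, the run-time part is just an address mask applied right before the indirect jump. A sketch of the idea in C, where the 32-byte bundle size is my recollection of the paper and the helper name is made up:

```c
#include <stdint.h>

#define BUNDLE_SIZE 32u	/* assumption: 32-byte bundles, as I recall the paper */

/* Force an indirect jump target onto a bundle boundary before jumping. */
static inline uintptr_t sandbox_mask_target(uintptr_t target)
{
	return target & ~(uintptr_t)(BUNDLE_SIZE - 1);
}
```

The verifier's side of the bargain is to make sure no instruction straddles a bundle boundary, so every masked target is guaranteed to be the start of a checked instruction.
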

IIRC they have a patched gcc toolchain that can compile to this subset of x86.

Stefan
