From: Andy Lutomirski <luto@kernel.org>
To: Elena Reshetova <elena.reshetova@intel.com>
Cc: Andrew Lutomirski <luto@kernel.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Kees Cook <keescook@chromium.org>, Jann Horn <jannh@google.com>,
	"Perla, Enrico" <enrico.perla@intel.com>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Greg KH <gregkh@linuxfoundation.org>
Subject: Re: [RFC PATCH] x86/entry/64: randomize kernel stack offset upon syscall
Date: Mon, 18 Mar 2019 13:15:44 -0700
Message-ID: <CALCETrUxhzHyUQCAjPQcPNWwAw5UTxUX4ZaeGxpbf9VSCDdcPg@mail.gmail.com>
In-Reply-To: <20190318094128.1488-1-elena.reshetova@intel.com>

On Mon, Mar 18, 2019 at 2:41 AM Elena Reshetova
<elena.reshetova@intel.com> wrote:
>
> If CONFIG_RANDOMIZE_KSTACK_OFFSET is selected,
> the kernel stack offset is randomized upon each
> entry to a system call, after the fixed location of the
> pt_regs struct.
>
> This feature is based on the original idea from
> the PaX's RANDKSTACK feature:
> https://pax.grsecurity.net/docs/randkstack.txt
> All credit for the original idea goes to the PaX team.
> However, the design and implementation of
> RANDOMIZE_KSTACK_OFFSET differs greatly from the RANDKSTACK
> feature (see below).
>
> Reasoning for the feature:
>
> This feature aims to make various stack-based attacks
> that rely on a deterministic stack structure considerably
> harder.
> We have seen many such attacks in the past [1],[2],[3]
> (just to name a few), and as Linux kernel stack protections
> have been constantly improving (vmap-based stack
> allocation with guard pages, removal of thread_info,
> STACKLEAK), attackers have to find new ways for their
> exploits to work.
>
> It is important to note that we currently cannot show
> a concrete attack that would be stopped by this new
> feature (given that other existing stack protections
> are enabled), so this is an attempt to be proactive
> rather than catch up with existing successful exploits.
>
> The main idea is that since the stack offset is
> randomized upon each system call, it is very hard for
> an attacker to reliably land in any particular place on
> the thread stack when an attack is performed.
> Also, since randomization is performed *after* pt_regs,
> the ptrace-based approach of discovering the randomized
> offset during a long-running syscall should not be
> possible.
>
> [1] jon.oberheide.org/files/infiltrate12-thestackisback.pdf
> [2] jon.oberheide.org/files/stackjacking-infiltrate11.pdf
> [3] googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html
>
> Design description:
>
> During most of the kernel's execution, it runs on the "thread
> stack", which is allocated in fork.c/dup_task_struct() and stored in
> a per-task variable (tsk->stack). Since the stack grows downward,
> the stack top can always be calculated using the task_top_of_stack(tsk)
> function, which essentially returns the address of tsk->stack + stack
> size. When VMAP_STACK is enabled, the thread stack is allocated from
> vmalloc space.
>
> The thread stack is quite deterministic in its structure: it is fixed
> in size, and upon every entry from userspace to the kernel on a
> syscall, the thread stack is constructed starting from an address
> fetched from the per-cpu cpu_current_top_of_stack variable.
> The first element pushed to the thread stack is the pt_regs struct,
> which stores all required CPU registers and syscall parameters.
>
> The goal of the RANDOMIZE_KSTACK_OFFSET feature is to add a random
> offset between the pt_regs struct pushed to the stack and the rest of
> the thread stack (used during syscall processing) every time a process
> issues a syscall. The source of randomness can be either rdtsc or
> rdrand, with the performance implications listed below. The random
> offset is stored in a callee-saved register (currently r15), and the
> maximum size of the random offset is defined by the
> __MAX_STACK_RANDOM_OFFSET value, which currently equals 0xFF0.
>
> As a result this patch introduces 8 bits of randomness
> (bits 4-11 are randomized; bits 0-3 must be zero due to stack
> alignment) after the pt_regs location on the thread stack.
> The amount of randomness can be adjusted based on how much of the
> stack space we wish/can trade for security.
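For reference, my reading of the masking described above, as a user-space sketch (the macro name here is illustrative, not the kernel's):

```c
#include <stdint.h>

/* Illustrative only: derive a stack offset from a raw random value,
 * keeping bits 4-11 and zeroing bits 0-3 for 16-byte alignment.
 * Mirrors the quoted __MAX_STACK_RANDOM_OFFSET == 0xFF0. */
#define MAX_STACK_RANDOM_OFFSET 0xFF0UL

static inline unsigned long stack_random_offset(uint64_t raw)
{
	return raw & MAX_STACK_RANDOM_OFFSET;
}
```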

Why do you need four zero bits at the bottom?  x86_64 Linux only
maintains 8 byte stack alignment.

>
> The main issue with this approach is that it slightly breaks the
> processing of the last frame in the unwinder, so I have made a simple
> fix to the frame pointer unwinder (I guess the others should be fixed
> similarly) and to the stack dump functionality, to "jump" over the
> random hole at the end. My way of solving this is probably far from
> ideal, so I would really appreciate feedback on how to improve it.

That's probably a question for Josh :)

Another way to do the dirty work would be to do:

    char *ptr = alloca(offset);
    asm volatile ("" :: "m" (*ptr));

in do_syscall_64() and adjust compiler flags as needed to avoid warnings.  Hmm.
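Untested sketch of what I mean, in user-space form (rand() and the empty body stand in for the kernel's entropy source and the real syscall dispatch):

```c
#include <alloca.h>
#include <stdlib.h>

/* Illustrative sketch of the alloca() trick: carve a random-sized,
 * 16-byte-aligned hole on the stack before running the syscall body. */
static unsigned long get_random_offset(void)
{
	return (unsigned long)rand() & 0xFF0;	/* bits 4-11 random, 0-3 zero */
}

unsigned long do_syscall_sketch(void)
{
	unsigned long offset = get_random_offset();
	char *ptr = alloca(offset);

	/* Touch the allocation so the compiler cannot drop it. */
	__asm__ volatile ("" :: "m" (*ptr));

	/* ...actual syscall dispatch would run here... */
	return offset;
}
```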

>
> Performance:
>
> 1) lmbench: ./lat_syscall -N 1000000 null
>     base:                   Simple syscall: 0.1774 microseconds
>     random_offset (rdtsc):  Simple syscall: 0.1803 microseconds
>     random_offset (rdrand): Simple syscall: 0.3702 microseconds
>
> 2) Andy's tests, misc-tests: ./timing_test_64 10M sys_enosys
>     base:                   10000000 loops in 1.62224s = 162.22 nsec / loop
>     random_offset (rdtsc):  10000000 loops in 1.64660s = 164.66 nsec / loop
>     random_offset (rdrand): 10000000 loops in 3.51315s = 351.32 nsec / loop
>

Egads!  RDTSC is nice and fast but probably fairly easy to defeat.
RDRAND is awful.  I had hoped for better.

So perhaps we need a little percpu buffer that collects 64 bits of
randomness at a time, shifts out the needed bits, and refills the
buffer when we run out.
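In user-space pseudo-form (rand() standing in for the expensive entropy source, plain statics standing in for percpu variables, and no thought given to locking):

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch of the buffered scheme: pull 64 bits of entropy at a time,
 * hand out 8 bits per call, refill when the buffer runs dry. */
static uint64_t entropy_buf;
static unsigned int entropy_bits;

static uint64_t get_raw_entropy(void)	/* stand-in for RDRAND etc. */
{
	return ((uint64_t)rand() << 32) ^ (uint64_t)rand();
}

uint8_t get_stack_random_byte(void)
{
	uint8_t r;

	if (entropy_bits < 8) {
		entropy_buf = get_raw_entropy();	/* one expensive refill */
		entropy_bits = 64;
	}
	r = entropy_buf & 0xFF;				/* shift out 8 bits */
	entropy_buf >>= 8;
	entropy_bits -= 8;
	return r;
}
```

That amortizes one RDRAND over eight syscalls, which should land much closer to the rdtsc numbers above.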

>  /*
>   * This does 'call enter_from_user_mode' unless we can avoid it based on
>   * kernel config or using the static jump infrastructure.
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 1f0efdb7b629..0816ec680c21 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -167,13 +167,19 @@ GLOBAL(entry_SYSCALL_64_after_hwframe)
>
>         PUSH_AND_CLEAR_REGS rax=$-ENOSYS
>
> +       RANDOMIZE_KSTACK                /* stores randomized offset in r15 */
> +
>         TRACE_IRQS_OFF
>
>         /* IRQs are off. */
>         movq    %rax, %rdi
>         movq    %rsp, %rsi
> +       sub     %r15, %rsp          /* subtract random offset from rsp */
>         call    do_syscall_64           /* returns with IRQs disabled */
>
> +       /* need to restore the gap */
> +       add     %r15, %rsp       /* add random offset back to rsp */

Off the top of my head, the nicer way to approach this would be to
change this such that mov %rbp, %rsp; popq %rbp or something like that
will do the trick.  Then the unwinder could just see it as a regular
frame.  Maybe Josh will have a better idea.

