linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Eiichi Tsukata <devel@etsukata.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Borislav Petkov <bp@alien8.de>, Ingo Molnar <mingo@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andrew Lutomirski <luto@kernel.org>, Peter Anvin <hpa@zytor.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Juergen Gross <jgross@suse.com>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	He Zhe <zhe.he@windriver.com>,
	Joel Fernandes <joel@joelfernandes.org>
Subject: Re: [PATCH v2 5/7] x86/mm, tracing: Fix CR2 corruption
Date: Mon, 8 Jul 2019 09:48:23 +0200	[thread overview]
Message-ID: <20190708074823.GV3402@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <a4a50e28-d972-3cef-b668-1e49d5b5496f@etsukata.com>

On Sat, Jul 06, 2019 at 08:07:22PM +0900, Eiichi Tsukata wrote:
> 
> 
> On 2019/07/05 11:18, Linus Torvalds wrote:
> > On Fri, Jul 5, 2019 at 5:03 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >>
> >> Despire the current efforts to read CR2 before tracing happens there
> >> still exist a number of possible holes:
> > 
> > So this whole series disturbs me for the simple reason that I thought
> > tracing was supposed to save/restore cr2 and make it unnecessary to
> > worry about this in non-tracing code.
> > 
> > That is very much what the NMI code explicitly does. Why shouldn't all
> > the other tracing code do the same thing in case they can take page
> > faults?
> > 
> > So I don't think the patches are wrong per se, but this seems to solve
> > it at the wrong level.

This solves it all at one site. If we're going to sprinkle CR2
save/restore over 'all' sites we're bound to miss some and we'll be
playing catch-up forever :/

> Steven previously tried to fix it by saving CR2 in TRACE_IRQS_OFF:
> https://lore.kernel.org/lkml/20190320221534.165ab87b@oasis.local.home/

That misses the context tracking tracepoint, which is also before we
load CR2.

> To prevent userstack trace code from reading user stack before it becomes ready,
> checking current->in_execve in stack_trace_save_user() can help Steven's approach,
> though trace_sched_process_exec() is called before current->in_execve = 0 so it changes
> current behavior.
> 
> The PoC code is as follows:
> 
> diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
> index 2abf27d7df6b..30fa6e1b7a87 100644
> --- a/arch/x86/kernel/stacktrace.c
> +++ b/arch/x86/kernel/stacktrace.c
> @@ -116,10 +116,12 @@ void arch_stack_walk_user(stack_trace_consume_fn consume_entry, void *cookie,
>                           const struct pt_regs *regs)
>  {
>         const void __user *fp = (const void __user *)regs->bp;
> +       unsigned long address;
>  
>         if (!consume_entry(cookie, regs->ip, false))
>                 return;
>  
> +       address = read_cr2();
>         while (1) {
>                 struct stack_frame_user frame;
>  
> @@ -131,11 +133,14 @@ void arch_stack_walk_user(stack_trace_consume_fn consume_entry, void *cookie,
>                         break;
>                 if (frame.ret_addr) {
>                         if (!consume_entry(cookie, frame.ret_addr, false))
> -                               return;
> +                               break;
>                 }
>                 if (fp == frame.next_fp)
>                         break;
>                 fp = frame.next_fp;
>         }
> +
> +       if (address != read_cr2())
> +               write_cr2(address);
>  }
>  

And this misses the perf unwinder, which we can reach if we attach perf
to the tracepoint. We can most likely also do user accesses from
kprobes, which we can similarly attach to tracepoints, and there's eBPF,
and who knows what else...

Do we really want to go put CR2 save/restore around every single thing
that can cause a fault (any fault) and is reachable from a tracepoint?
That's going to be a long list of things ... :/

Or are we going to put the CR2 save/restore on every single tracepoint?
But then we also need it on the mcount/fentry stubs and we again have
multiple places.

Whereas if we stick it in the entry path, like I proposed, we fix it in
one location and we're done.

  reply	other threads:[~2019-07-08  7:49 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-04 19:55 [PATCH v2 0/7] Tracing vs CR2 (and cleanups) Peter Zijlstra
2019-07-04 19:55 ` [PATCH v2 1/7] x86/paravirt: Make read_cr2() CALLEE_SAVE Peter Zijlstra
2019-07-04 21:49   ` Andy Lutomirski
2019-07-10 19:53   ` Steven Rostedt
2019-07-04 19:55 ` [PATCH v2 2/7] x86/entry/32: Simplify common_exception Peter Zijlstra
2019-07-04 21:51   ` Andy Lutomirski
2019-07-10 20:11   ` Steven Rostedt
2019-07-10 20:14     ` Peter Zijlstra
2019-07-04 19:55 ` [PATCH v2 3/7] x86/entry/64: Simplify idtentry a little Peter Zijlstra
2019-07-04 21:54   ` Andy Lutomirski
2019-07-10 20:23   ` Steven Rostedt
2019-07-04 19:55 ` [PATCH v2 4/7] x86/entry/64: Update comments and sanity tests for create_gap Peter Zijlstra
2019-07-04 21:55   ` Andy Lutomirski
2019-07-10 20:24   ` Steven Rostedt
2019-07-04 19:56 ` [PATCH v2 5/7] x86/mm, tracing: Fix CR2 corruption Peter Zijlstra
2019-07-05  2:18   ` Linus Torvalds
2019-07-05  3:16     ` Andy Lutomirski
2019-07-05  3:27       ` Linus Torvalds
2019-07-05 13:49     ` Peter Zijlstra
2019-07-06 21:41       ` Linus Torvalds
2019-07-06 22:27         ` Steven Rostedt
2019-07-06 22:41           ` Linus Torvalds
2019-07-07  0:08             ` Linus Torvalds
2019-07-07  0:36               ` Andy Lutomirski
2019-07-06 23:50           ` Andy Lutomirski
2019-07-07  3:44           ` Eiichi Tsukata
2019-07-06 11:07     ` Eiichi Tsukata
2019-07-08  7:48       ` Peter Zijlstra [this message]
2019-07-08  8:58         ` Eiichi Tsukata
2019-07-08  9:42           ` Eiichi Tsukata
2019-07-09  5:17             ` Eiichi Tsukata
2019-07-07 15:10   ` Andy Lutomirski
2019-07-07 15:11     ` Andy Lutomirski
2019-07-07 18:17       ` Linus Torvalds
2019-07-10 20:27         ` Steven Rostedt
2019-07-11  6:45           ` Greg Kroah-Hartman
2019-07-11 12:12           ` Sasha Levin
2019-07-11 12:21           ` Peter Zijlstra
2019-07-04 19:56 ` [PATCH v2 6/7] x86/entry/64: Remove TRACE_IRQS_*_DEBUG Peter Zijlstra
2019-07-11  3:24   ` Steven Rostedt
2019-07-11  8:05     ` Peter Zijlstra
2019-07-04 19:56 ` [RFC][PATCH v2 7/7] x86/entry/64: Pull bits into C Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190708074823.GV3402@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=devel@etsukata.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=zhe.he@windriver.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).