All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Peter Zijlstra <peterz@infradead.org>, Jan Beulich <jbeulich@suse.com>
Cc: "Stephane Eranian" <eranian@google.com>,
	"Ingo Molnar" <mingo@redhat.com>, "Jiri Olsa" <jolsa@redhat.com>,
	root <chenggang.qin@gmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"秦承刚(承刚)" <chenggang.qcg@taobao.com>,
	"Wu Fengguang" <fengguang.wu@intel.com>,
	"Namhyung Kim" <namhyung@gmail.com>,
	"Mike Galbraith" <efault@gmx.de>,
	"Arjan van de Ven" <arjan@linux.intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	"David Ahern" <dsahern@gmail.com>,
	"Paul Mackerras" <paulus@samba.org>,
	"秦承刚(承刚)" <chenggang.qcg@alibaba-inc.com>,
	"Yanmin Zhang" <yanmin.zhang@intel.com>
Subject: Re: 答复:[PATCH] perf core: Use KSTK_ESP() instead of pt_regs->sp while output user regs
Date: Tue, 30 Dec 2014 18:00:30 -0800	[thread overview]
Message-ID: <CALCETrUEDuWoXyN+JrB5nWfL6B5odQJU37_y=-tzXN0vxr4tOQ@mail.gmail.com> (raw)
In-Reply-To: <CALCETrWxPBdjY0+gCmPSwjYLB0u4Z4JgGtuR-XU=qwVzoDEBpA@mail.gmail.com>

On Tue, Dec 30, 2014 at 3:29 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Dec 30, 2014 11:03 AM, "Peter Zijlstra" <peterz@infradead.org> wrote:
>>
>> On Thu, Dec 25, 2014 at 07:48:28AM -0800, Andy Lutomirski wrote:
>> > On a quick look, there are plenty of other bugs in there besides just
>> > the stack pointer issue.  The ABI check that uses TIF_IA32 in the perf
>> > core is completely wrong.  TIF_IA32 may be equal to the actual
>> > userspace bitness by luck, but, if so, that's more or less just luck.
>> > And there's a user_mode test that should be user_mode_vm.
>> >
>> > Also, it's not just sp that's wrong.  There are various places that
>> > you can interrupt in which many of the registers have confusing
>> > locations.  You could try using the cfi unwind data, but that's
>> > unlikely to work for regs like cs and ss, and, during context switch,
>> > this has very little chance of working.
>> >
>> > What's the point of this feature?  Honestly, my suggestion would be to
>> > delete it instead of trying to fix it.  It's also not clear to me that
>> > there aren't serious security problems here -- it's entirely possible
>> > for sensitive *kernel* values to and up in task_pt_regs at certain
>> > times, and if you run during context switch and there's no code to
>> > suppress this dump during context switch, then you could be showing
>> > regs that belong to the wrong task.
>>
>> Of course the people who actually wrote the code are not on CC :/
>>
>> There's two users of this iirc;
>>
>>  1) the dwarf stack unwinder thingy, which basically dumps the userspace
>>  regs and the top of userspace stack on 'event'.
>>
>
> Given how the x86_64* entry code works, using task_pt_regs from
> anywhere except explicitly supported contexts (including exceptions
> that originated in userspace and a small handful of system calls) is
> asking for trouble.  NMI context is especially bad.
>
> How important is this feature, and which registers matter?  It might
> be possible to use a dwarf unwinder on the kernel call stack to get
> most of the regs from most contexts, and it might also be possible to
> make small changes to the entry code to make it possible to get some
> of the registers reliably, but it's not currently possible to safely
> use task_pt_regs *at all* from NMI context unless you've at least
> blacklisted a handful of origin RIP values that give dangerously bogus
> results.  (Using do_nmi's regs parameter if user_mode_vm(regs) is a
> different story.)

It's actually worse than just knowing the interrupted kernel RIP.  If
the call chain goes usermode -> IST exception -> NMI, then
task_pt_regs is entirely uninitialized.  Assuming all the CFI
annotations are correct, the unwinder could still do it from the
kernel.

Note that, as far as I know, Jan Beulich is the only person who uses
the unwinder on kernel code.  Jan, how do you do this?

>
> * I'm not nearly as familiar with the 32-bit entry code, so I don't
> know whether we have the same issues there.
>
>>  2) the recent sample_regs_intr, which dumps the register set at
>>  'event', be it kernel or userspace.
>>
>
> What's wrong with the PMI's pt_regs for that?  If we interrupted the
> kernel, they'll be kernel regs (with all their attendant security
> issues) and, if we interrupted userspace, then they'll be the full,
> correct userspace registers.
>
> --Andy
>
>>
>> The first is somewhat usable when lacking framepointers while still
>> desiring some unwind information, the second is useful to things like
>> call argument profiling and the like.



-- 
Andy Lutomirski
AMA Capital Management, LLC

  reply	other threads:[~2014-12-31  2:00 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-23  6:22 [PATCH] perf core: Use KSTK_ESP() instead of pt_regs->sp while output user regs root
2014-12-23  8:30 ` Andy Lutomirski
     [not found] ` <c027bde0-5f4f-441f-8d45-3e7f6f702231@alibaba-inc.com>
2014-12-25 15:48   ` 答复:[PATCH] " Andy Lutomirski
2014-12-25 16:21     ` Andy Lutomirski
2014-12-30 19:03     ` Peter Zijlstra
2014-12-30 23:29       ` Andy Lutomirski
2014-12-31  2:00         ` Andy Lutomirski [this message]
2015-01-02 16:11           ` Jan Beulich
2015-01-02 18:03             ` Andy Lutomirski
2015-01-05  8:47               ` Jan Beulich
2015-01-04 16:10       ` Jiri Olsa
2015-01-04 17:18         ` Andy Lutomirski
2015-01-04 17:41           ` Jiri Olsa
2015-01-04 18:36             ` [PATCH 0/2] perf: Improve user regs sampling Andy Lutomirski
2015-01-04 18:36               ` [PATCH 1/2] perf: Move task_pt_regs sampling into arch code Andy Lutomirski
2015-01-05 14:07                 ` Peter Zijlstra
2015-01-05 16:13                   ` Andy Lutomirski
2015-01-05 16:44                     ` Peter Zijlstra
2015-01-05 18:28                       ` Andy Lutomirski
2015-01-09 12:32                 ` [tip:perf/urgent] " tip-bot for Andy Lutomirski
2015-01-04 18:36               ` [PATCH 2/2] x86_64, perf: Improve user regs sampling Andy Lutomirski
2015-01-09 12:32                 ` [tip:perf/urgent] perf/x86_64: " tip-bot for Andy Lutomirski
2015-01-05 10:46               ` [PATCH 0/2] perf: " Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CALCETrUEDuWoXyN+JrB5nWfL6B5odQJU37_y=-tzXN0vxr4tOQ@mail.gmail.com' \
    --to=luto@amacapital.net \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@linux.intel.com \
    --cc=chenggang.qcg@alibaba-inc.com \
    --cc=chenggang.qcg@taobao.com \
    --cc=chenggang.qin@gmail.com \
    --cc=dsahern@gmail.com \
    --cc=efault@gmx.de \
    --cc=eranian@google.com \
    --cc=fengguang.wu@intel.com \
    --cc=jbeulich@suse.com \
    --cc=jolsa@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@gmail.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=yanmin.zhang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.