From: Andy Lutomirski <luto@amacapital.net>
To: Peter Zijlstra <peterz@infradead.org>, Jan Beulich <jbeulich@suse.com>
Cc: "Stephane Eranian" <eranian@google.com>,
"Ingo Molnar" <mingo@redhat.com>, "Jiri Olsa" <jolsa@redhat.com>,
root <chenggang.qin@gmail.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"秦承刚(承刚)" <chenggang.qcg@taobao.com>,
"Wu Fengguang" <fengguang.wu@intel.com>,
"Namhyung Kim" <namhyung@gmail.com>,
"Mike Galbraith" <efault@gmx.de>,
"Arjan van de Ven" <arjan@linux.intel.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
"David Ahern" <dsahern@gmail.com>,
"Paul Mackerras" <paulus@samba.org>,
"秦承刚(承刚)" <chenggang.qcg@alibaba-inc.com>,
"Yanmin Zhang" <yanmin.zhang@intel.com>
Subject: Re: Reply: [PATCH] perf core: Use KSTK_ESP() instead of pt_regs->sp while output user regs
Date: Tue, 30 Dec 2014 18:00:30 -0800
Message-ID: <CALCETrUEDuWoXyN+JrB5nWfL6B5odQJU37_y=-tzXN0vxr4tOQ@mail.gmail.com>
In-Reply-To: <CALCETrWxPBdjY0+gCmPSwjYLB0u4Z4JgGtuR-XU=qwVzoDEBpA@mail.gmail.com>
On Tue, Dec 30, 2014 at 3:29 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Dec 30, 2014 11:03 AM, "Peter Zijlstra" <peterz@infradead.org> wrote:
>>
>> On Thu, Dec 25, 2014 at 07:48:28AM -0800, Andy Lutomirski wrote:
>> > On a quick look, there are plenty of other bugs in there besides just
>> > the stack pointer issue. The ABI check that uses TIF_IA32 in the perf
>> > core is completely wrong. TIF_IA32 may be equal to the actual
>> > userspace bitness by luck, but, if so, that's more or less just luck.
>> > And there's a user_mode test that should be user_mode_vm.
>> >
>> > Also, it's not just sp that's wrong. There are various places that
>> > you can interrupt in which many of the registers have confusing
>> > locations. You could try using the cfi unwind data, but that's
>> > unlikely to work for regs like cs and ss, and, during context switch,
>> > this has very little chance of working.
>> >
>> > What's the point of this feature? Honestly, my suggestion would be to
>> > delete it instead of trying to fix it. It's also not clear to me that
>> > there aren't serious security problems here -- it's entirely possible
> >> > for sensitive *kernel* values to end up in task_pt_regs at certain
>> > times, and if you run during context switch and there's no code to
>> > suppress this dump during context switch, then you could be showing
>> > regs that belong to the wrong task.
>>
>> Of course the people who actually wrote the code are not on CC :/
>>
>> There's two users of this iirc;
>>
>> 1) the dwarf stack unwinder thingy, which basically dumps the userspace
>> regs and the top of userspace stack on 'event'.
>>
>
> Given how the x86_64* entry code works, using task_pt_regs from
> anywhere except explicitly supported contexts (including exceptions
> that originated in userspace and a small handful of system calls) is
> asking for trouble. NMI context is especially bad.
>
> How important is this feature, and which registers matter? It might
> be possible to use a dwarf unwinder on the kernel call stack to get
> most of the regs from most contexts, and it might also be possible to
> make small changes to the entry code to make it possible to get some
> of the registers reliably, but it's not currently possible to safely
> use task_pt_regs *at all* from NMI context unless you've at least
> blacklisted a handful of origin RIP values that give dangerously bogus
> results. (Using do_nmi's regs parameter if user_mode_vm(regs) is a
> different story.)
It's actually worse than that: knowing the interrupted kernel RIP isn't
enough. If the call chain goes usermode -> IST exception -> NMI, then
task_pt_regs is entirely uninitialized. Assuming all the CFI
annotations are correct, an unwinder could still recover the user regs
by walking the kernel stack.
Note that, as far as I know, Jan Beulich is the only person who uses
the unwinder on kernel code. Jan, how do you do this?
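To make the "use the handler's own regs only if they came from user mode"
idea concrete, here is a rough user-space model. It is a sketch under
assumptions, not kernel code: struct fake_regs, fake_user_mode() and
pick_user_regs() are hypothetical names invented for illustration, and the
privilege check is reduced to looking at the low two bits of cs. The point
is only that, when the interrupt did not come from user mode, the model
reports "no user regs" instead of falling back to a saved frame that may be
stale or uninitialized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs; only what the sketch needs. */
struct fake_regs {
	unsigned long ip;
	unsigned long sp;
	unsigned long cs;	/* low two bits model the privilege level */
};

/* Model of a user-mode test: privilege level 3 means userspace. */
static bool fake_user_mode(const struct fake_regs *regs)
{
	return (regs->cs & 3) == 3;
}

/*
 * Choose the registers to report as "user regs" for one sample.
 *
 * intr_regs models the regs parameter handed to the NMI/PMI handler.
 * If they describe user mode, they are the full, correct user
 * registers and can be reported directly.  Otherwise return NULL
 * ("nothing trustworthy") rather than guessing from a saved frame.
 */
static const struct fake_regs *pick_user_regs(const struct fake_regs *intr_regs)
{
	if (intr_regs != NULL && fake_user_mode(intr_regs))
		return intr_regs;
	return NULL;
}
```

A caller would treat a NULL result as "skip the user-register dump for this
sample". Recovering user regs in the kernel-interrupted case would need
either entry-code changes or an unwinder, as discussed above, and is
deliberately left out of this model.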
>
> * I'm not nearly as familiar with the 32-bit entry code, so I don't
> know whether we have the same issues there.
>
>> 2) the recent sample_regs_intr, which dumps the register set at
>> 'event', be it kernel or userspace.
>>
>
> What's wrong with the PMI's pt_regs for that? If we interrupted the
> kernel, they'll be kernel regs (with all their attendant security
> issues) and, if we interrupted userspace, then they'll be the full,
> correct userspace registers.
>
> --Andy
>
>>
>> The first is somewhat usable when lacking framepointers while still
>> desiring some unwind information, the second is useful to things like
>> call argument profiling and the like.
--
Andy Lutomirski
AMA Capital Management, LLC
Thread overview: 23+ messages
2014-12-23 6:22 [PATCH] perf core: Use KSTK_ESP() instead of pt_regs->sp while output user regs root
2014-12-23 8:30 ` Andy Lutomirski
[not found] ` <c027bde0-5f4f-441f-8d45-3e7f6f702231@alibaba-inc.com>
2014-12-25 15:48 ` Reply: [PATCH] " Andy Lutomirski
2014-12-25 16:21 ` Andy Lutomirski
2014-12-30 19:03 ` Peter Zijlstra
2014-12-30 23:29 ` Andy Lutomirski
2014-12-31 2:00 ` Andy Lutomirski [this message]
2015-01-02 16:11 ` Jan Beulich
2015-01-02 18:03 ` Andy Lutomirski
2015-01-05 8:47 ` Jan Beulich
2015-01-04 16:10 ` Jiri Olsa
2015-01-04 17:18 ` Andy Lutomirski
2015-01-04 17:41 ` Jiri Olsa
2015-01-04 18:36 ` [PATCH 0/2] perf: Improve user regs sampling Andy Lutomirski
2015-01-04 18:36 ` [PATCH 1/2] perf: Move task_pt_regs sampling into arch code Andy Lutomirski
2015-01-05 14:07 ` Peter Zijlstra
2015-01-05 16:13 ` Andy Lutomirski
2015-01-05 16:44 ` Peter Zijlstra
2015-01-05 18:28 ` Andy Lutomirski
2015-01-09 12:32 ` [tip:perf/urgent] " tip-bot for Andy Lutomirski
2015-01-04 18:36 ` [PATCH 2/2] x86_64, perf: Improve user regs sampling Andy Lutomirski
2015-01-09 12:32 ` [tip:perf/urgent] perf/x86_64: " tip-bot for Andy Lutomirski
2015-01-05 10:46 ` [PATCH 0/2] perf: " Jiri Olsa