linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] perf, x86: Disable sanity check
@ 2012-04-18 23:24 Arun Sharma
  2012-04-19  5:22 ` Frederic Weisbecker
  2012-04-20  9:11 ` Peter Zijlstra
  0 siblings, 2 replies; 6+ messages in thread
From: Arun Sharma @ 2012-04-18 23:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Arun Sharma, Ingo Molnar, Arnaldo Carvalho de Melo,
	Frederic Weisbecker, Mike Galbraith, Paul Mackerras,
	Peter Zijlstra, Stephane Eranian, Namhyung Kim, Tom Zanussi,
	linux-perf-users

Without this patch, applications with two different stack
regions (eg: native stack vs JIT stack) get truncated 
callchains even when RBP chaining is present. GDB shows proper
stack traces and the frame pointer chaining is intact.

This patch disables the (fp < RSP) check, hoping that other checks
in the code save the day for us. In our limited testing, this
didn't seem to break anything.

In the long term, we could potentially have userspace advise
the kernel on the range of valid stack addresses, so we don't 
spend a lot of time unwinding from bogus addresses.

Signed-off-by: Arun Sharma <asharma@fb.com>
Cc: Ingo Molnar <mingo@elte.hu>
CC: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org

---
 arch/x86/kernel/cpu/perf_event.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 07f46ba..87d9abd 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -1596,9 +1596,6 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
 		if (bytes != sizeof(frame))
 			break;
 
-		if ((unsigned long)fp < regs->sp)
-			break;
-
 		perf_callchain_store(entry, frame.return_address);
 		fp = frame.next_frame;
 	}
-- 
1.7.8.4


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf, x86: Disable sanity check
  2012-04-18 23:24 [PATCH] perf, x86: Disable sanity check Arun Sharma
@ 2012-04-19  5:22 ` Frederic Weisbecker
  2012-04-19 17:33   ` Arun Sharma
  2012-04-20  9:11 ` Peter Zijlstra
  1 sibling, 1 reply; 6+ messages in thread
From: Frederic Weisbecker @ 2012-04-19  5:22 UTC (permalink / raw)
  To: Arun Sharma
  Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mike Galbraith, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Namhyung Kim, Tom Zanussi, linux-perf-users

On Wed, Apr 18, 2012 at 04:24:09PM -0700, Arun Sharma wrote:
> Without this patch, applications with two different stack
> regions (eg: native stack vs JIT stack) get truncated 
> callchains even when RBP chaining is present. GDB shows proper
> stack traces and the frame pointer chaining is intact.
> 
> This patch disables the (fp < RSP) check, hoping that other checks
> in the code save the day for us. In our limited testing, this
> didn't seem to break anything.
> 
> In the long term, we could potentially have userspace advise
> the kernel on the range of valid stack addresses, so we don't 
> spend a lot of time unwinding from bogus addresses.

So rbp is part of the JIT stack but not rsp?
Do you have a practical example of that? I must confess I don't know
much about JIT stack.

Thanks.

> 
> Signed-off-by: Arun Sharma <asharma@fb.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> CC: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Mike Galbraith <efault@gmx.de>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Stephane Eranian <eranian@google.com>
> Cc: Namhyung Kim <namhyung.kim@lge.com>
> Cc: Tom Zanussi <tzanussi@gmail.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-perf-users@vger.kernel.org
> 
> ---
>  arch/x86/kernel/cpu/perf_event.c |    3 ---
>  1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 07f46ba..87d9abd 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1596,9 +1596,6 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
>  		if (bytes != sizeof(frame))
>  			break;
>  
> -		if ((unsigned long)fp < regs->sp)
> -			break;
> -
>  		perf_callchain_store(entry, frame.return_address);
>  		fp = frame.next_frame;
>  	}
> -- 
> 1.7.8.4
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf, x86: Disable sanity check
  2012-04-19  5:22 ` Frederic Weisbecker
@ 2012-04-19 17:33   ` Arun Sharma
  0 siblings, 0 replies; 6+ messages in thread
From: Arun Sharma @ 2012-04-19 17:33 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mike Galbraith, Paul Mackerras, Peter Zijlstra, Stephane Eranian,
	Namhyung Kim, Tom Zanussi, linux-perf-users

On 4/18/12 10:22 PM, Frederic Weisbecker wrote:

> So rbp is part of the JIT stack but not rsp?
> Do you have a practical example of that? I must confess I don't know
> much about JIT stack.

Nothing specific to JITs here. Any time an app has two stacks S1 and S2 
(with S1 at a lower address and S2 at a higher address) and %rsp at the 
time of a perf event is pointing to S2, we don't get traces beyond S2.

               |                 |
               |                 |
               |   0x1000000     |  <-- %rbp
               |                 |
               |                 |
S2: 0x2000000 |                 |  <-- %rsp
               |                 |
               |                 |
               |                 |
               |    frame3       |
               |                 |
               |    frame2       |
               |                 |
S1: 0x1000000 |    frame1       |
               |                 |
               |                 |
               |                 |
               |                 |

gdb is able to show:

(gdb) bt
<frame at 0x2000000>
frame1
frame2
frame3
frame4
..

just fine.

In our use case, there may be multiple transitions between S1 and S2. 
For eg: frame4 could be on S2 (i.e 0x2xxxxxx range).

  -Arun

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf, x86: Disable sanity check
  2012-04-18 23:24 [PATCH] perf, x86: Disable sanity check Arun Sharma
  2012-04-19  5:22 ` Frederic Weisbecker
@ 2012-04-20  9:11 ` Peter Zijlstra
  2012-04-20 16:16   ` Linus Torvalds
  2012-04-20 18:18   ` Arun Sharma
  1 sibling, 2 replies; 6+ messages in thread
From: Peter Zijlstra @ 2012-04-20  9:11 UTC (permalink / raw)
  To: Arun Sharma
  Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Frederic Weisbecker, Mike Galbraith, Paul Mackerras,
	Stephane Eranian, Namhyung Kim, Tom Zanussi, linux-perf-users,
	Linus Torvalds

On Wed, 2012-04-18 at 16:24 -0700, Arun Sharma wrote:
> Without this patch, applications with two different stack
> regions (eg: native stack vs JIT stack) get truncated 
> callchains even when RBP chaining is present. GDB shows proper
> stack traces and the frame pointer chaining is intact.
> 
> This patch disables the (fp < RSP) check, hoping that other checks
> in the code save the day for us. In our limited testing, this
> didn't seem to break anything.
> 
> In the long term, we could potentially have userspace advise
> the kernel on the range of valid stack addresses, so we don't 
> spend a lot of time unwinding from bogus addresses.

Makes me really nervous.. Ingo, Linus ?

'normal' usespace can suffer from this too with signal stacks. Arun's
JIT case seems particularly weird in that his stacks don't strictly nest
but can cross over multiple times (makes one wonder why they bother with
multiple stacks..).

>  arch/x86/kernel/cpu/perf_event.c |    3 ---
>  1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
> index 07f46ba..87d9abd 100644
> --- a/arch/x86/kernel/cpu/perf_event.c
> +++ b/arch/x86/kernel/cpu/perf_event.c
> @@ -1596,9 +1596,6 @@ perf_callchain_user(struct perf_callchain_entry *entry, struct pt_regs *regs)
>  		if (bytes != sizeof(frame))
>  			break;
>  
> -		if ((unsigned long)fp < regs->sp)
> -			break;
> -
>  		perf_callchain_store(entry, frame.return_address);
>  		fp = frame.next_frame;
>  	}




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf, x86: Disable sanity check
  2012-04-20  9:11 ` Peter Zijlstra
@ 2012-04-20 16:16   ` Linus Torvalds
  2012-04-20 18:18   ` Arun Sharma
  1 sibling, 0 replies; 6+ messages in thread
From: Linus Torvalds @ 2012-04-20 16:16 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Arun Sharma, linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Frederic Weisbecker, Mike Galbraith, Paul Mackerras,
	Stephane Eranian, Namhyung Kim, Tom Zanussi, linux-perf-users

On Fri, Apr 20, 2012 at 2:11 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> Makes me really nervous.. Ingo, Linus ?

I don't care as long as this only *ever* triggers for user stacks, and
the code verifies that. And I'm not sure it does, actually.

Why am I not sure? Because it uses copy_from_user_nmi(), which in turn
uses "access_ok()". But can we perhaps have the perf event happen
*while* the kernel has done a "set_fs(KERNEL_DS)" - and we just
happen to follow the user stack too? In which case we may be copying
kernel memory.

So I think this user stack following code is buggy in *other* ways.

Guys: stack following has to be *f^&%ing* careful! This shows yet
again how people blithely follow frame pointers without verifying
everything they damn well can.

Also, I note that the deepest stack chain allowed is something
*ridiculously* deep (like 255), and we use copy_from_user_nmi() for
this each entry. Which is slow as hell. So I would suggest at least
considering limiting that depth more.

End result: I'm ok with removing that one test. But I want more tests
to replace it. The user frame pointer had damn well better be in user
space (and no, "access_ok()" is not valid or sufficient in interrupt
context!) and I suspect there are other things you could check.

                  Linus

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] perf, x86: Disable sanity check
  2012-04-20  9:11 ` Peter Zijlstra
  2012-04-20 16:16   ` Linus Torvalds
@ 2012-04-20 18:18   ` Arun Sharma
  1 sibling, 0 replies; 6+ messages in thread
From: Arun Sharma @ 2012-04-20 18:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Arnaldo Carvalho de Melo,
	Frederic Weisbecker, Mike Galbraith, Paul Mackerras,
	Stephane Eranian, Namhyung Kim, Tom Zanussi, linux-perf-users,
	Linus Torvalds

On 4/20/12 2:11 AM, Peter Zijlstra wrote:
> Makes me really nervous.. Ingo, Linus ?
>
> 'normal' usespace can suffer from this too with signal stacks. Arun's
> JIT case seems particularly weird in that his stacks don't strictly nest
> but can cross over multiple times (makes one wonder why they bother with
> multiple stacks..).

It's a tracing JIT, which dynamically chooses between interpreted mode 
and JIT mode. Translation is not necessarily a function at a time and 
has to be guarded by types inferred at runtime. Each time they switch 
between the two modes, they need to examine the non-native frames on the 
stack. With a single stack design, they'd have to unwind the stack 
looking for native frames vs JIT frames which adds complexity and cost.

Like you observed, this is needed for other less complex cases as well.

  -Arun


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2012-04-20 18:19 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-18 23:24 [PATCH] perf, x86: Disable sanity check Arun Sharma
2012-04-19  5:22 ` Frederic Weisbecker
2012-04-19 17:33   ` Arun Sharma
2012-04-20  9:11 ` Peter Zijlstra
2012-04-20 16:16   ` Linus Torvalds
2012-04-20 18:18   ` Arun Sharma

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).