* Re: perf PPC: kernel panic with callchains and context switch events
[not found] <4E274F5F.7000604@gmail.com>
@ 2011-07-24 17:18 ` David Ahern
2011-07-25 0:05 ` [PATCH] perf: powerpc: Disable pagefaults during callchain stack read Anton Blanchard
2011-07-25 1:55 ` perf PPC: kernel panic with callchains and context switch events Benjamin Herrenschmidt
0 siblings, 2 replies; 8+ messages in thread
From: David Ahern @ 2011-07-24 17:18 UTC (permalink / raw)
To: Anton Blanchard, Paul Mackerras, linux-perf-users, LKML, linuxppc-dev
On 07/20/2011 03:57 PM, David Ahern wrote:
> I am hoping someone familiar with PPC can help understand a panic that
> is generated when capturing callchains with context switch events.
>
> Call trace is below. The short of it is that walking the callchain
> generates a page fault. To handle the page fault the mmap_sem is needed,
> but it is currently held by setup_arg_pages. setup_arg_pages calls
> shift_arg_pages with the mmap_sem held. shift_arg_pages then calls
> move_page_tables which has a cond_resched at the top of its for loop. If
> the cond_resched() is removed from move_page_tables everything works
> beautifully - no panics.
>
> So, the question: is it normal for walking the stack to trigger a page
> fault on PPC? The panic is not seen on x86 based systems.
Can anyone confirm whether page faults while walking the stack are
normal for PPC? We really want to use the context switch event with
callchains and need to understand whether this behavior is normal. Of
course if it is normal, a way to address the problem without a panic
will be needed.
Thanks,
David
>
> [<b0180e00>]rb_erase+0x1b4/0x3e8
> [<b00430f4>]__dequeue_entity+0x50/0xe8
> [<b0043304>]set_next_entity+0x178/0x1bc
> [<b0043440>]pick_next_task_fair+0xb0/0x118
> [<b02ada80>]schedule+0x500/0x614
> [<b02afaa8>]rwsem_down_failed_common+0xf0/0x264
> [<b02afca0>]rwsem_down_read_failed+0x34/0x54
> [<b02aed4c>]down_read+0x3c/0x54
> [<b0023b58>]do_page_fault+0x114/0x5e8
> [<b001e350>]handle_page_fault+0xc/0x80
> [<b0022dec>]perf_callchain+0x224/0x31c
> [<b009ba70>]perf_prepare_sample+0x240/0x2fc
> [<b009d760>]__perf_event_overflow+0x280/0x398
> [<b009d914>]perf_swevent_overflow+0x9c/0x10c
> [<b009db54>]perf_swevent_ctx_event+0x1d0/0x230
> [<b009dc38>]do_perf_sw_event+0x84/0xe4
> [<b009dde8>]perf_sw_event_context_switch+0x150/0x1b4
> [<b009de90>]perf_event_task_sched_out+0x44/0x2d4
> [<b02ad840>]schedule+0x2c0/0x614
> [<b0047dc0>]__cond_resched+0x34/0x90
> [<b02adcc8>]_cond_resched+0x4c/0x68
> [<b00bccf8>]move_page_tables+0xb0/0x418
> [<b00d7ee0>]setup_arg_pages+0x184/0x2a0
> [<b0110914>]load_elf_binary+0x394/0x1208
> [<b00d6e28>]search_binary_handler+0xe0/0x2c4
> [<b00d834c>]do_execve+0x1bc/0x268
> [<b0015394>]sys_execve+0x84/0xc8
> [<b001df10>]ret_from_syscall+0x0/0x3c
>
> Thanks,
> David
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] perf: powerpc: Disable pagefaults during callchain stack read
2011-07-24 17:18 ` perf PPC: kernel panic with callchains and context switch events David Ahern
@ 2011-07-25 0:05 ` Anton Blanchard
2011-07-25 1:55 ` perf PPC: kernel panic with callchains and context switch events Benjamin Herrenschmidt
1 sibling, 0 replies; 8+ messages in thread
From: Anton Blanchard @ 2011-07-25 0:05 UTC (permalink / raw)
To: David Ahern; +Cc: linux-perf-users, linuxppc-dev, Paul Mackerras, LKML
Hi David,
> > I am hoping someone familiar with PPC can help understand a panic
> > that is generated when capturing callchains with context switch
> > events.
> >
> > Call trace is below. The short of it is that walking the callchain
> > generates a page fault. To handle the page fault the mmap_sem is
> > needed, but it is currently held by setup_arg_pages.
> > setup_arg_pages calls shift_arg_pages with the mmap_sem held.
> > shift_arg_pages then calls move_page_tables which has a
> > cond_resched at the top of its for loop. If the cond_resched() is
> > removed from move_page_tables everything works beautifully - no
> > panics.
> >
> > So, the question: is it normal for walking the stack to trigger a
> > page fault on PPC? The panic is not seen on x86 based systems.
>
> Can anyone confirm whether page faults while walking the stack are
> normal for PPC? We really want to use the context switch event with
> callchains and need to understand whether this behavior is normal. Of
> course if it is normal, a way to address the problem without a panic
> will be needed.
I talked to Ben about this last week and he pointed me at
pagefault_disable/enable. Untested patch below.
Anton
--
We need to disable pagefaults when reading the stack otherwise
we can lock up trying to take the mmap_sem when the code we are
profiling already has a write lock taken.
This will not happen for hardware events, but could for software
events.
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
---
Index: linux-powerpc/arch/powerpc/kernel/perf_callchain.c
===================================================================
--- linux-powerpc.orig/arch/powerpc/kernel/perf_callchain.c 2011-07-25 09:54:27.296757427 +1000
+++ linux-powerpc/arch/powerpc/kernel/perf_callchain.c 2011-07-25 09:56:08.828367882 +1000
@@ -154,8 +154,12 @@ static int read_user_stack_64(unsigned l
 	    ((unsigned long)ptr & 7))
 		return -EFAULT;
 
-	if (!__get_user_inatomic(*ret, ptr))
+	pagefault_disable();
+	if (!__get_user_inatomic(*ret, ptr)) {
+		pagefault_enable();
 		return 0;
+	}
+	pagefault_enable();
 
 	return read_user_stack_slow(ptr, ret, 8);
 }
@@ -166,8 +170,12 @@ static int read_user_stack_32(unsigned i
 	    ((unsigned long)ptr & 3))
 		return -EFAULT;
 
-	if (!__get_user_inatomic(*ret, ptr))
+	pagefault_disable();
+	if (!__get_user_inatomic(*ret, ptr)) {
+		pagefault_enable();
 		return 0;
+	}
+	pagefault_enable();
 
 	return read_user_stack_slow(ptr, ret, 4);
 }
* Re: perf PPC: kernel panic with callchains and context switch events
2011-07-24 17:18 ` perf PPC: kernel panic with callchains and context switch events David Ahern
2011-07-25 0:05 ` [PATCH] perf: powerpc: Disable pagefaults during callchain stack read Anton Blanchard
@ 2011-07-25 1:55 ` Benjamin Herrenschmidt
2011-07-25 15:38 ` David Ahern
1 sibling, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2011-07-25 1:55 UTC (permalink / raw)
To: David Ahern, Kumar Gala, Becky Bruce
Cc: linux-perf-users, linuxppc-dev, Paul Mackerras, Anton Blanchard, LKML
On Sun, 2011-07-24 at 11:18 -0600, David Ahern wrote:
> On 07/20/2011 03:57 PM, David Ahern wrote:
> > I am hoping someone familiar with PPC can help understand a panic that
> > is generated when capturing callchains with context switch events.
> >
> > Call trace is below. The short of it is that walking the callchain
> > generates a page fault. To handle the page fault the mmap_sem is needed,
> > but it is currently held by setup_arg_pages. setup_arg_pages calls
> > shift_arg_pages with the mmap_sem held. shift_arg_pages then calls
> > move_page_tables which has a cond_resched at the top of its for loop. If
> > the cond_resched() is removed from move_page_tables everything works
> > beautifully - no panics.
> >
> > So, the question: is it normal for walking the stack to trigger a page
> > fault on PPC? The panic is not seen on x86 based systems.
>
> Can anyone confirm whether page faults while walking the stack are
> normal for PPC? We really want to use the context switch event with
> callchains and need to understand whether this behavior is normal. Of
> course if it is normal, a way to address the problem without a panic
> will be needed.
Now that leads to interesting discoveries :-) Becky, can you read all
the way and let me know what you think ?
So, trying to walk the user stack directly will potentially cause page
faults if it's done by direct access. So if you're going to do it in a
spot where you can't afford it, you need to pagefault_disable() I
suppose. I think the problem with our existing code is that it's missing
those around __get_user_inatomic().
In fact, arguably, we don't want the hash code to modify the hash
either (or even hashing things in). Our 64-bit code handles it today in
perf_callchain.c in a way that involves pretty much duplicating the
functionality of __get_user_pages_fast() as used by x86 (see below), but
as a fallback from a direct access which misses the pagefault_disable()
as well.
I think it comes from an old assumption that this would always be called
from an NMI, and the explicit tracepoints broke that assumption.
In fact we probably want to bump the NMI count, not just the IRQ count
as pagefault_disable() does, to make sure we prevent hashing.
x86 does things differently, using __get_user_pages_fast() (a variant of
get_user_pages_fast() that doesn't fall back to normal get_user_pages()).
Now, we could do the same (use __gup_fast too), but I can see a
potential issue with ppc 32-bit platforms that have 64-bit PTEs, since
we could end up GUP'ing in the middle of the two accesses.
Becky: I think gup_fast is generally broken on 32-bit with 64-bit PTE
because of that, the problem isn't specific to perf backtraces, I'll
propose a solution further down.
Now, on x86, there is a similar problem with PAE, which is handled by
- having gup disable IRQs
- rely on the fact that to change from a valid value to another valid
value, the PTE will first get invalidated, which requires an IPI
and thus will be blocked by our interrupts being off
We do the first part, but the second part will break if we use HW TLB
invalidation broadcast (yet another reason why those are bad, I think I
will write a blog entry about it one of these days).
I think we can work around this while keeping our broadcast TLB
invalidations by having the invalidation code also increment a global
generation count (using the existing lock used by the invalidation code,
all 32-bit platforms have such a lock).
From there, gup_fast can be changed to check, with proper ordering, the
generation count around the loading of the PTE and to loop if it has
changed, kind of like a seqlock.
We also need the NMI count bump if we are going to try to keep the
attempt at doing a direct access first for perf.
Becky, do you feel like giving that a shot or should I find another
victim ? (Or even do it myself ... ) :-)
Cheers,
Ben.
> Thanks,
> David
>
> >
> > [<b0180e00>]rb_erase+0x1b4/0x3e8
> > [<b00430f4>]__dequeue_entity+0x50/0xe8
> > [<b0043304>]set_next_entity+0x178/0x1bc
> > [<b0043440>]pick_next_task_fair+0xb0/0x118
> > [<b02ada80>]schedule+0x500/0x614
> > [<b02afaa8>]rwsem_down_failed_common+0xf0/0x264
> > [<b02afca0>]rwsem_down_read_failed+0x34/0x54
> > [<b02aed4c>]down_read+0x3c/0x54
> > [<b0023b58>]do_page_fault+0x114/0x5e8
> > [<b001e350>]handle_page_fault+0xc/0x80
> > [<b0022dec>]perf_callchain+0x224/0x31c
> > [<b009ba70>]perf_prepare_sample+0x240/0x2fc
> > [<b009d760>]__perf_event_overflow+0x280/0x398
> > [<b009d914>]perf_swevent_overflow+0x9c/0x10c
> > [<b009db54>]perf_swevent_ctx_event+0x1d0/0x230
> > [<b009dc38>]do_perf_sw_event+0x84/0xe4
> > [<b009dde8>]perf_sw_event_context_switch+0x150/0x1b4
> > [<b009de90>]perf_event_task_sched_out+0x44/0x2d4
> > [<b02ad840>]schedule+0x2c0/0x614
> > [<b0047dc0>]__cond_resched+0x34/0x90
> > [<b02adcc8>]_cond_resched+0x4c/0x68
> > [<b00bccf8>]move_page_tables+0xb0/0x418
> > [<b00d7ee0>]setup_arg_pages+0x184/0x2a0
> > [<b0110914>]load_elf_binary+0x394/0x1208
> > [<b00d6e28>]search_binary_handler+0xe0/0x2c4
> > [<b00d834c>]do_execve+0x1bc/0x268
> > [<b0015394>]sys_execve+0x84/0xc8
> > [<b001df10>]ret_from_syscall+0x0/0x3c
> >
> > Thanks,
> > David
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
* Re: perf PPC: kernel panic with callchains and context switch events
2011-07-25 1:55 ` perf PPC: kernel panic with callchains and context switch events Benjamin Herrenschmidt
@ 2011-07-25 15:38 ` David Ahern
0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2011-07-25 15:38 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Anton Blanchard
Cc: LKML, linux-perf-users, Paul Mackerras, linuxppc-dev
Hi Ben:
On 07/24/2011 07:55 PM, Benjamin Herrenschmidt wrote:
> On Sun, 2011-07-24 at 11:18 -0600, David Ahern wrote:
>> On 07/20/2011 03:57 PM, David Ahern wrote:
>>> I am hoping someone familiar with PPC can help understand a panic that
>>> is generated when capturing callchains with context switch events.
>>>
>>> Call trace is below. The short of it is that walking the callchain
>>> generates a page fault. To handle the page fault the mmap_sem is needed,
>>> but it is currently held by setup_arg_pages. setup_arg_pages calls
>>> shift_arg_pages with the mmap_sem held. shift_arg_pages then calls
>>> move_page_tables which has a cond_resched at the top of its for loop. If
>>> the cond_resched() is removed from move_page_tables everything works
>>> beautifully - no panics.
>>>
>>> So, the question: is it normal for walking the stack to trigger a page
>>> fault on PPC? The panic is not seen on x86 based systems.
>>
>> Can anyone confirm whether page faults while walking the stack are
>> normal for PPC? We really want to use the context switch event with
>> callchains and need to understand whether this behavior is normal. Of
>> course if it is normal, a way to address the problem without a panic
>> will be needed.
>
> Now that leads to interesting discoveries :-) Becky, can you read all
> the way and let me know what you think ?
>
> So, trying to walk the user stack directly will potentially cause page
> faults if it's done by direct access. So if you're going to do it in a
> spot where you can't afford it, you need to pagefault_disable() I
> suppose. I think the problem with our existing code is that it's missing
> those around __get_user_inatomic().
>
> In fact, arguably, we don't want the hash code to modify the hash
> either (or even hashing things in). Our 64-bit code handles it today in
> perf_callchain.c in a way that involves pretty much duplicating the
> functionality of __get_user_pages_fast() as used by x86 (see below), but
> as a fallback from a direct access which misses the pagefault_disable()
> as well.
>
> I think it comes from an old assumption that this would always be called
> from an NMI, and the explicit tracepoints broke that assumption.
>
> In fact we probably want to bump the NMI count, not just the IRQ count
> as pagefault_disable() does, to make sure we prevent hashing.
>
> x86 does things differently, using __get_user_pages_fast() (a variant of
> get_user_pages_fast() that doesn't fall back to normal get_user_pages()).
>
> Now, we could do the same (use __gup_fast too), but I can see a
> potential issue with ppc 32-bit platforms that have 64-bit PTEs, since
> we could end up GUP'ing in the middle of the two accesses.
>
> Becky: I think gup_fast is generally broken on 32-bit with 64-bit PTE
> because of that, the problem isn't specific to perf backtraces, I'll
> propose a solution further down.
>
> Now, on x86, there is a similar problem with PAE, which is handled by
>
> - having gup disable IRQs
> - rely on the fact that to change from a valid value to another valid
> value, the PTE will first get invalidated, which requires an IPI
> and thus will be blocked by our interrupts being off
>
> We do the first part, but the second part will break if we use HW TLB
> invalidation broadcast (yet another reason why those are bad, I think I
> will write a blog entry about it one of these days).
>
> I think we can work around this while keeping our broadcast TLB
> invalidations by having the invalidation code also increment a global
> generation count (using the existing lock used by the invalidation code,
> all 32-bit platforms have such a lock).
>
> From there, gup_fast can be changed to check, with proper ordering, the
> generation count around the loading of the PTE and to loop if it has
> changed, kind of like a seqlock.
>
> We also need the NMI count bump if we are going to try to keep the
> attempt at doing a direct access first for perf.
>
> Becky, do you feel like giving that a shot or should I find another
> victim ? (Or even do it myself ... ) :-)
Did you have something in mind besides the patch Anton sent? We'll give
that one a try and see how it works. (Thanks, Anton!)
David
>
> Cheers,
> Ben.
>
>> Thanks,
>> David
>>
>>>
>>> [<b0180e00>]rb_erase+0x1b4/0x3e8
>>> [<b00430f4>]__dequeue_entity+0x50/0xe8
>>> [<b0043304>]set_next_entity+0x178/0x1bc
>>> [<b0043440>]pick_next_task_fair+0xb0/0x118
>>> [<b02ada80>]schedule+0x500/0x614
>>> [<b02afaa8>]rwsem_down_failed_common+0xf0/0x264
>>> [<b02afca0>]rwsem_down_read_failed+0x34/0x54
>>> [<b02aed4c>]down_read+0x3c/0x54
>>> [<b0023b58>]do_page_fault+0x114/0x5e8
>>> [<b001e350>]handle_page_fault+0xc/0x80
>>> [<b0022dec>]perf_callchain+0x224/0x31c
>>> [<b009ba70>]perf_prepare_sample+0x240/0x2fc
>>> [<b009d760>]__perf_event_overflow+0x280/0x398
>>> [<b009d914>]perf_swevent_overflow+0x9c/0x10c
>>> [<b009db54>]perf_swevent_ctx_event+0x1d0/0x230
>>> [<b009dc38>]do_perf_sw_event+0x84/0xe4
>>> [<b009dde8>]perf_sw_event_context_switch+0x150/0x1b4
>>> [<b009de90>]perf_event_task_sched_out+0x44/0x2d4
>>> [<b02ad840>]schedule+0x2c0/0x614
>>> [<b0047dc0>]__cond_resched+0x34/0x90
>>> [<b02adcc8>]_cond_resched+0x4c/0x68
>>> [<b00bccf8>]move_page_tables+0xb0/0x418
>>> [<b00d7ee0>]setup_arg_pages+0x184/0x2a0
>>> [<b0110914>]load_elf_binary+0x394/0x1208
>>> [<b00d6e28>]search_binary_handler+0xe0/0x2c4
>>> [<b00d834c>]do_execve+0x1bc/0x268
>>> [<b0015394>]sys_execve+0x84/0xc8
>>> [<b001df10>]ret_from_syscall+0x0/0x3c
>>>
>>> Thanks,
>>> David
>
>
* [PATCH] perf: powerpc: Disable pagefaults during callchain stack read
@ 2011-07-30 20:53 David Ahern
2011-08-01 9:59 ` Peter Zijlstra
0 siblings, 1 reply; 8+ messages in thread
From: David Ahern @ 2011-07-30 20:53 UTC (permalink / raw)
To: benh, anton
Cc: Peter Zijlstra, peterz, linux-kernel, paulus, acme, David Ahern,
mingo, linuxppc-dev
Panic observed on an older kernel when collecting call chains for
the context-switch software event:
[<b0180e00>]rb_erase+0x1b4/0x3e8
[<b00430f4>]__dequeue_entity+0x50/0xe8
[<b0043304>]set_next_entity+0x178/0x1bc
[<b0043440>]pick_next_task_fair+0xb0/0x118
[<b02ada80>]schedule+0x500/0x614
[<b02afaa8>]rwsem_down_failed_common+0xf0/0x264
[<b02afca0>]rwsem_down_read_failed+0x34/0x54
[<b02aed4c>]down_read+0x3c/0x54
[<b0023b58>]do_page_fault+0x114/0x5e8
[<b001e350>]handle_page_fault+0xc/0x80
[<b0022dec>]perf_callchain+0x224/0x31c
[<b009ba70>]perf_prepare_sample+0x240/0x2fc
[<b009d760>]__perf_event_overflow+0x280/0x398
[<b009d914>]perf_swevent_overflow+0x9c/0x10c
[<b009db54>]perf_swevent_ctx_event+0x1d0/0x230
[<b009dc38>]do_perf_sw_event+0x84/0xe4
[<b009dde8>]perf_sw_event_context_switch+0x150/0x1b4
[<b009de90>]perf_event_task_sched_out+0x44/0x2d4
[<b02ad840>]schedule+0x2c0/0x614
[<b0047dc0>]__cond_resched+0x34/0x90
[<b02adcc8>]_cond_resched+0x4c/0x68
[<b00bccf8>]move_page_tables+0xb0/0x418
[<b00d7ee0>]setup_arg_pages+0x184/0x2a0
[<b0110914>]load_elf_binary+0x394/0x1208
[<b00d6e28>]search_binary_handler+0xe0/0x2c4
[<b00d834c>]do_execve+0x1bc/0x268
[<b0015394>]sys_execve+0x84/0xc8
[<b001df10>]ret_from_syscall+0x0/0x3c
A page fault occurred walking the callchain while creating a perf
sample for the context-switch event. To handle the page fault the
mmap_sem is needed, but it is currently held by setup_arg_pages.
(setup_arg_pages calls shift_arg_pages with the mmap_sem held.
shift_arg_pages then calls move_page_tables which has a cond_resched
at the top of its for loop - hitting that cond_resched is what caused
the context switch.)
This is an extension of Anton's proposed patch:
https://lkml.org/lkml/2011/7/24/151
adding case for 32-bit ppc.
Tested on the system that first generated the panic and then again
with latest kernel using a PPC VM. I am not able to test the 64-bit
path - I do not have H/W for it and 64-bit PPC VMs (qemu on Intel)
is horribly slow.
Signed-off-by: David Ahern <dsahern@gmail.com>
CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Anton Blanchard <anton@samba.org>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Paul Mackerras <paulus@samba.org>
CC: Ingo Molnar <mingo@elte.hu>
CC: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-kernel@vger.kernel.org
---
arch/powerpc/kernel/perf_callchain.c | 20 +++++++++++++++++---
1 files changed, 17 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/perf_callchain.c b/arch/powerpc/kernel/perf_callchain.c
index d05ae42..564c1d8 100644
--- a/arch/powerpc/kernel/perf_callchain.c
+++ b/arch/powerpc/kernel/perf_callchain.c
@@ -154,8 +154,12 @@ static int read_user_stack_64(unsigned long __user *ptr, unsigned long *ret)
 	    ((unsigned long)ptr & 7))
 		return -EFAULT;
 
-	if (!__get_user_inatomic(*ret, ptr))
+	pagefault_disable();
+	if (!__get_user_inatomic(*ret, ptr)) {
+		pagefault_enable();
 		return 0;
+	}
+	pagefault_enable();
 
 	return read_user_stack_slow(ptr, ret, 8);
 }
@@ -166,8 +170,12 @@ static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret)
 	    ((unsigned long)ptr & 3))
 		return -EFAULT;
 
-	if (!__get_user_inatomic(*ret, ptr))
+	pagefault_disable();
+	if (!__get_user_inatomic(*ret, ptr)) {
+		pagefault_enable();
 		return 0;
+	}
+	pagefault_enable();
 
 	return read_user_stack_slow(ptr, ret, 4);
 }
@@ -294,11 +302,17 @@ static inline int current_is_64bit(void)
  */
 static int read_user_stack_32(unsigned int __user *ptr, unsigned int *ret)
 {
+	int rc;
+
 	if ((unsigned long)ptr > TASK_SIZE - sizeof(unsigned int) ||
 	    ((unsigned long)ptr & 3))
 		return -EFAULT;
 
-	return __get_user_inatomic(*ret, ptr);
+	pagefault_disable();
+	rc = __get_user_inatomic(*ret, ptr);
+	pagefault_enable();
+
+	return rc;
 }
 
 static inline void perf_callchain_user_64(struct perf_callchain_entry *entry,
--
1.7.6
* Re: [PATCH] perf: powerpc: Disable pagefaults during callchain stack read
2011-07-30 20:53 [PATCH] perf: powerpc: Disable pagefaults during callchain stack read David Ahern
@ 2011-08-01 9:59 ` Peter Zijlstra
2011-08-01 10:39 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 8+ messages in thread
From: Peter Zijlstra @ 2011-08-01 9:59 UTC (permalink / raw)
To: David Ahern; +Cc: linux-kernel, paulus, anton, acme, mingo, linuxppc-dev
On Sat, 2011-07-30 at 14:53 -0600, David Ahern wrote:
> A page fault occurred walking the callchain while creating a perf
> sample for the context-switch event. To handle the page fault the
> mmap_sem is needed, but it is currently held by setup_arg_pages.
> (setup_arg_pages calls shift_arg_pages with the mmap_sem held.
> shift_arg_pages then calls move_page_tables which has a cond_resched
> at the top of its for loop - hitting that cond_resched is what caused
> the context switch.)
>
> This is an extension of Anton's proposed patch:
> https://lkml.org/lkml/2011/7/24/151
> adding case for 32-bit ppc.
>
> Tested on the system that first generated the panic and then again
> with latest kernel using a PPC VM. I am not able to test the 64-bit
> path - I do not have H/W for it and 64-bit PPC VMs (qemu on Intel)
> is horribly slow.
>
> Signed-off-by: David Ahern <dsahern@gmail.com>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> CC: Anton Blanchard <anton@samba.org>
Hmm, Paul, didn't you fix something like this early on? Anyway, I've no
objections since I'm really not familiar enough with the PPC side of
things.
* Re: [PATCH] perf: powerpc: Disable pagefaults during callchain stack read
2011-08-01 9:59 ` Peter Zijlstra
@ 2011-08-01 10:39 ` Benjamin Herrenschmidt
2011-08-01 12:44 ` David Ahern
0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Herrenschmidt @ 2011-08-01 10:39 UTC (permalink / raw)
To: Peter Zijlstra
Cc: linux-kernel, paulus, anton, acme, David Ahern, mingo, linuxppc-dev
On Mon, 2011-08-01 at 11:59 +0200, Peter Zijlstra wrote:
> > Signed-off-by: David Ahern <dsahern@gmail.com>
> > CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > CC: Anton Blanchard <anton@samba.org>
>
> Hmm, Paul, didn't you fix something like this early on? Anyway, I've
> no
> objections since I'm really not familiar enough with the PPC side of
> things.
I'm travelling so I haven't had a chance to review properly or even test
but it looks like an ad-hoc fix for the immediate problem.
Ultimately, I want to rework that stuff to do a __gup_fast like x86 does
(maybe as a fallback from an attempt at direct access first) so we work
around access permissions blocked by lack of dirty/accessed bits, but in
the meantime this should fix the immediate issue.
Cheers,
Ben.
* Re: [PATCH] perf: powerpc: Disable pagefaults during callchain stack read
2011-08-01 10:39 ` Benjamin Herrenschmidt
@ 2011-08-01 12:44 ` David Ahern
0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2011-08-01 12:44 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Peter Zijlstra, linux-kernel, paulus, anton, acme, mingo, linuxppc-dev
On 08/01/2011 04:39 AM, Benjamin Herrenschmidt wrote:
> On Mon, 2011-08-01 at 11:59 +0200, Peter Zijlstra wrote:
>>> Signed-off-by: David Ahern <dsahern@gmail.com>
>>> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> CC: Anton Blanchard <anton@samba.org>
>>
>> Hmm, Paul, didn't you fix something like this early on? Anyway, I've
>> no
>> objections since I'm really not familiar enough with the PPC side of
>> things.
>
> I'm travelling so I haven't had a chance to review properly or even test
> but it looks like an ad-hoc fix for the immediate problem.
>
> Ultimately, I want to rework that stuff to do a __gup_fast like x86 does
> (maybe as a fallback from an attempt at access first) so we work around
> access permissions blocked by lack of dirty/accessed bits but in the
> meantime, this should fix the immediate issue.
The problem goes back to all kernel releases with perf, so this patch
should get applied to the stable trains too.
David
>
> Cheers,
> Ben.
>
>