* [PATCH] Sometimes, there is OOPS happened when we use oprofile.
@ 2012-10-29 2:33 Zhang, Jun
2012-10-31 21:05 ` Robert Richter
0 siblings, 1 reply; 5+ messages in thread
From: Zhang, Jun @ 2012-10-29 2:33 UTC
To: Robert Richter, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
x86, oprofile-list, linux-kernel
Cc: Zhang, Jun
From fff479313342940372444797814edee996b18fc9 Mon Sep 17 00:00:00 2001
From: jzha144 <jun.zhang@intel.com>
Date: Mon, 29 Oct 2012 09:07:22 +0800
Subject: [PATCH] Sometimes, there is OOPS happened when we use oprofile.
An OOPS sometimes happens when we use oprofile. The call stack is shown
below. From the call stack, we find that if an NMI arrives in
call_on_stack between "xchgl %%ebx,%%esp" and "call *%%edi", the system
will OOPS.
BUG: unable to handle kernel paging request at ff06383f
IP: [<c12051cd>] print_context_stack+0x4d/0x100
*pde = 00000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: wl12xx_sdio wl12xx mac80211 cfg80211
compat btwilink atomisp lm3554 mt9m114 mt9e013 videobuf2_memops videobuf2_core st_drv matrix(C)
Pid: 162, comm: adbd Tainted: G WC 3.0.34-140446-g9e77874-dirty #1 Intel Corporation
EIP: 0060:[<c12051cd>] EFLAGS: 00010083 CPU: 1
EIP is at print_context_stack+0x4d/0x100
EAX: ff063ffc EBX: ff06383f ECX: f4a0bd74 EDX: ff06383f
ESI: 00000000 EDI: ffffe000 EBP: f58dbe48 ESP: f58dbe24
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process adbd (pid: 162, ti=f58da000 task=f430a730 task.ti=f4a0a000)
Stack:
0000000c ff063ffc f4a0bd74 ffffe000 ff062000 f4a0bd74 ff06383f c1b2b1c0
ff062000 f58dbe74 c120428f c1b2b1c0 f58dbe98 00000000 f58dbe60 00000000
00000000 f4a0bd74 f58dbfc4 00000005 f58dbebc c172d52f f4a0bd74 c1b2b1c0
Call Trace:
[<c120428f>] dump_trace+0x7f/0xf0
[<c172d52f>] x86_backtrace+0x13f/0x150
[<c172b504>] ? op_cpu_buffer_write_commit+0x14/0x20
[<c172b66e>] ? log_sample+0x8e/0xb0
[<c172b8ca>] oprofile_add_sample+0x9a/0xc0
[<c172f09e>] ppro_check_ctrs+0x8e/0x110
[<c12a31ce>] ? rb_reserve_next_event+0x3e/0x370
[<c172d8d7>] profile_exceptions_notify+0x67/0x70
[<c18694c7>] notifier_call_chain+0x47/0x90
[<c1869548>] __atomic_notifier_call_chain+0x38/0x50
[<c1250930>] ? remote_softirq_receive+0x110/0x110
[<c186957f>] atomic_notifier_call_chain+0x1f/0x30
[<c18695bd>] notify_die+0x2d/0x30
[<c1867390>] do_nmi+0xb0/0x300
[<c124fcef>] ? __local_bh_enable+0x4f/0xa0
[<c1866f95>] nmi_stack_correct+0x28/0x2d
[<c1250930>] ? remote_softirq_receive+0x110/0x110
[<c120412f>] ? do_softirq+0x8f/0xe0
<IRQ>
[<c1250e26>] irq_exit+0x86/0xd0
[<c186cb49>] smp_apic_timer_interrupt+0x59/0x88
[<c1496738>] ? trace_hardirqs_off_thunk+0xc/0x14
[<c1866ca7>] apic_timer_interrupt+0x2f/0x34
[<c122007b>] ? handle_vm86_fault+0x78b/0x9b0
[<c186661f>] ? _raw_spin_unlock_irqrestore+0x3f/0x50
[<c1230d3c>] __wake_up_sync_key+0x4c/0x60
[<c17353f0>] sock_def_readable+0x40/0x70
[<c17d050d>] unix_stream_sendmsg+0x22d/0x390
[<c173103b>] sock_aio_write+0x11b/0x140
[<c186375d>] ? __schedule+0x23d/0x8d0
[<c1866f95>] ? nmi_stack_correct+0x28/0x2d
[<c12feaf9>] do_sync_write+0xa9/0xe0
[<c186942d>] ? sub_preempt_count+0x3d/0x50
[<c12ff321>] vfs_write+0x151/0x160
[<c1300798>] ? fget_light+0x58/0xd0
[<c12ff53d>] sys_write+0x3d/0x70
[<c18669a1>] syscall_call+0x7/0xb
Code: f6 89 4d f0 89 4d e4 89 45 e0 89 7d e8 74 5e 8d b4 26 00 00 00 00 39
f3 72 0c 8b 45 f0 83 c4 18 5b 5e 5f 5d c3 90 3b 5d e8 72 ef <8b> 3b 89 f8
89 7d dc e8 c7 07 06 00 85 c0 74 2b 8b 45 f0 83 c0
EIP: [<c12051cd>] print_context_stack+0x4d/0x100 SS:ESP 0068:f58dbe24
CR2: 00000000ff06383f
Signed-off-by: jzha144 <jun.zhang@intel.com>
---
arch/x86/oprofile/backtrace.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/arch/x86/oprofile/backtrace.c b/arch/x86/oprofile/backtrace.c
index d6aa6e8..c1af4f0 100644
--- a/arch/x86/oprofile/backtrace.c
+++ b/arch/x86/oprofile/backtrace.c
@@ -113,6 +113,10 @@ x86_backtrace(struct pt_regs * const regs, unsigned int depth)
 	if (!user_mode_vm(regs)) {
 		unsigned long stack = kernel_stack_pointer(regs);
+
+		if (!((unsigned long)stack & (THREAD_SIZE - 1)))
+			stack = 0;
+
 		if (depth)
 			dump_trace(NULL, regs, (unsigned long *)stack, 0,
 				   &backtrace_ops, &depth);
--
1.7.6
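The check the patch adds can be tried out in user space. The sketch below is only an illustration of the mask arithmetic, not kernel code: it assumes 8 KiB kernel stacks (THREAD_SIZE = 0x2000, which is consistent with the ffffe000 mask visible in EDI in the oops above), and the helper names sp_on_stack_boundary() and sanitize_stack() are made up for this example. The idea, as far as the patch description goes, is that a stack pointer sitting exactly on a THREAD_SIZE boundary points just past the top of an empty stack, so it is not a usable starting point for a backtrace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 8 KiB kernel stacks; the real value depends on the kernel config. */
#define THREAD_SIZE 0x2000u

/*
 * Hypothetical helper: true when sp lies exactly on a THREAD_SIZE
 * boundary, i.e. one byte past the last valid slot of the stack below it.
 */
static bool sp_on_stack_boundary(uint32_t sp)
{
    /* The mask trick only works for power-of-two stack sizes. */
    assert((THREAD_SIZE & (THREAD_SIZE - 1)) == 0);
    return (sp & (THREAD_SIZE - 1)) == 0;
}

/*
 * Hypothetical helper mirroring the effect of the patch: a boundary
 * stack pointer is replaced by 0, any other value is passed through.
 */
static uint32_t sanitize_stack(uint32_t sp)
{
    return sp_on_stack_boundary(sp) ? 0 : sp;
}
```

For instance, sanitize_stack(0xff064000) yields 0, while an address in the middle of a stack such as 0xf58dbe24 (the ESP from the oops) is returned unchanged.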
* Re: [PATCH] Sometimes, there is OOPS happened when we use oprofile.
From: Robert Richter @ 2012-10-31 21:05 UTC
To: Zhang, Jun, Ingo Molnar, H. Peter Anvin
Cc: Thomas Gleixner, x86, oprofile-list, linux-kernel
Jun,
On 29.10.12 02:33:54, Zhang, Jun wrote:
> Sometimes, there is OOPS happened when we use oprofile. next
> is the call stack. From call stack, we find in
> call_on_stack if there is a nmi interrupt between "xchgl
> %%ebx,%%esp" and "call *%%edi", system will OOPS.
This should be related to, and fixed by:
https://lkml.org/lkml/2012/9/12/269
Ingo, HPA,
please apply the fix for kernel_stack_pointer().
Thanks,
-Robert
* Re: [PATCH] Sometimes, there is OOPS happened when we use oprofile.
From: H. Peter Anvin @ 2012-10-31 21:27 UTC
To: Robert Richter
Cc: Zhang, Jun, Ingo Molnar, Thomas Gleixner, x86, oprofile-list,
linux-kernel
On 10/31/2012 02:05 PM, Robert Richter wrote:
> Jun,
>
> On 29.10.12 02:33:54, Zhang, Jun wrote:
>> Sometimes, there is OOPS happened when we use oprofile. next
>> is the call stack. From call stack, we find in
>> call_on_stack if there is a nmi interrupt between "xchgl
>> %%ebx,%%esp" and "call *%%edi", system will OOPS.
>
> this should be related and fixed with:
>
> https://lkml.org/lkml/2012/9/12/269
>
> Ingo, HPA,
>
> please apply the fix of kernel_stack_pointer().
>
Thanks for the reminder. Ingo bounced this one to me for review while I
was away and it fell between the cracks.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH] Sometimes, there is OOPS happened when we use oprofile.
From: H. Peter Anvin @ 2012-10-31 21:33 UTC
To: Robert Richter
Cc: Zhang, Jun, Ingo Molnar, Thomas Gleixner, x86, oprofile-list,
linux-kernel
On 10/31/2012 02:05 PM, Robert Richter wrote:
> Jun,
>
> On 29.10.12 02:33:54, Zhang, Jun wrote:
>> Sometimes, there is OOPS happened when we use oprofile. next
>> is the call stack. From call stack, we find in
>> call_on_stack if there is a nmi interrupt between "xchgl
>> %%ebx,%%esp" and "call *%%edi", system will OOPS.
>
> this should be related and fixed with:
>
> https://lkml.org/lkml/2012/9/12/269
>
> Ingo, HPA,
>
> please apply the fix of kernel_stack_pointer().
>
I'm vaguely concerned about the following:
+ * To always return a non-null
+ * stack pointer we fall back to regs as stack if no previous stack
+ * exists.
Is the logic that, if there is no previous stack pointer and the stack
is empty, we simply assume regs points to the top of the stack? Can this
ever actually be seen?
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH] Sometimes, there is OOPS happened when we use oprofile.
From: Robert Richter @ 2012-10-31 22:45 UTC
To: H. Peter Anvin
Cc: Zhang, Jun, Ingo Molnar, Thomas Gleixner, x86, oprofile-list,
linux-kernel, Steven Rostedt
On 31.10.12 14:33:17, H. Peter Anvin wrote:
> I'm vaguely concerned about the following:
>
> + * To always return a non-null
> + * stack pointer we fall back to regs as stack if no previous stack
> + * exists.
>
> The logic being that if there is no stack pointer and the stack is
> too empty, to simply assume regs point to the top of the stack? Is
> this possible to ever be actually seen?
I discussed this with Steven too (https://lkml.org/lkml/2012/9/6/322),
and we both had a bad feeling about kernel_stack_pointer() returning a
null pointer (as implemented in version 1 of this patch). It could be
null if tinfo->previous_esp is null (last stack). I am not sure when
this may happen.
So using regs as a fallback seemed OK, since that behavior had been in
place for years:
7b6c6c7 x86, 32-bit: fix kernel_trap_sp()
-Robert
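The fallback order discussed above can be modelled in user space. This is only a sketch of the logic as described in this thread, not the text of the linked kernel_stack_pointer() fix: addresses are plain 32-bit integers, THREAD_SIZE is assumed to be 8 KiB, and struct model_thread_info, stack_base() and model_kernel_stack_pointer() are hypothetical names. The background assumption is that on 32-bit x86 a trap taken in kernel mode does not push a stack pointer, so the interrupted stack pointer is taken to be the address of the sp slot in regs, which can land past the end of an empty stack.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8 KiB kernel stacks. */
#define THREAD_SIZE 0x2000u

/* Hypothetical stand-in for the one thread_info field the thread mentions. */
struct model_thread_info {
    uint32_t previous_esp; /* 0 models "no previous stack" */
};

/* Base address of the THREAD_SIZE-aligned stack that contains addr. */
static uint32_t stack_base(uint32_t addr)
{
    return addr & ~(THREAD_SIZE - 1);
}

/*
 * Model of the fallback order:
 *  1. if the sp slot lies in the same stack as regs, use it;
 *  2. otherwise use the saved pointer to the previous stack, if any;
 *  3. otherwise fall back to regs itself, so the result is never 0.
 */
static uint32_t model_kernel_stack_pointer(uint32_t regs_addr,
                                           uint32_t sp_slot_addr,
                                           const struct model_thread_info *tinfo)
{
    assert(tinfo);

    if (stack_base(regs_addr) == stack_base(sp_slot_addr))
        return sp_slot_addr;

    if (tinfo->previous_esp)
        return tinfo->previous_esp;

    return regs_addr;
}
```

With regs near the top of an empty stack (say regs at 0xff063fc4 and the sp slot at 0xff064000), the first test fails because the two addresses fall into different stacks, and the result depends on whether previous_esp is set; the regs fallback is the case hpa asks about.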