* [BUG] 3.13-rc1 kernel crash when enable all tracepoints
@ 2013-11-26 4:38 Jovi Zhangwei
2013-11-26 6:33 ` Masami Hiramatsu
2013-11-26 12:58 ` Steven Rostedt
0 siblings, 2 replies; 9+ messages in thread
From: Jovi Zhangwei @ 2013-11-26 4:38 UTC (permalink / raw)
To: Steven Rostedt, Masami Hiramatsu, Oleg Nesterov; +Cc: LKML
Hi,
I'm not sure this issue already be fixed or not, it can be reproduced
permanently.
(I didn't use git-bisect yet, you guys might can understand it quickly)
#echo 1 > /sys/kernel/debug/tracing/events/enable
[ 160.472176] BUG: unable to handle kernel NULL pointer dereference
at (null)
[ 160.472176] IP: [<ffffffff8132ae40>] strlen+0x0/0x30
[ 160.472176] PGD 7b67a067 PUD 788c6067 PMD 0
[ 160.472176] Oops: 0000 [#12] SMP
[ 160.472176] Modules linked in: ebtable_nat ebtables joydev e1000
[ 160.472176] CPU: 0 PID: 741 Comm: bash Tainted: G D 3.13.0-rc1+ #16
[ 160.472176] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 160.472176] task: ffff88007891e000 ti: ffff880074d04000 task.ti:
ffff880074d04000
[ 160.472176] RIP: 0010:[<ffffffff8132ae40>] [<ffffffff8132ae40>]
strlen+0x0/0x30
[ 160.472176] RSP: 0018:ffff88007fc03e40 EFLAGS: 00010006
[ 160.472176] RAX: 0000000000000002 RBX: ffff88007bb19d80 RCX: 0000000000000002
[ 160.472176] RDX: ffffffff8107a5ac RSI: ffffffff826eac88 RDI: 0000000000000000
[ 160.472176] RBP: ffff88007fc03e88 R08: 0000000000000002 R09: 0000000000000000
[ 160.472176] R10: ffff88007891e000 R11: 0000000000000000 R12: ffffffff826eac88
[ 160.472176] R13: 0000000000010000 R14: 0000000000000046 R15: ffffffff81c47880
[ 160.472176] FS: 00007fef34515740(0000) GS:ffff88007fc00000(0000)
knlGS:0000000000000000
[ 160.472176] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 160.472176] CR2: 0000000000000000 CR3: 000000007a492000 CR4: 00000000000006f0
[ 160.472176] Stack:
[ 160.472176] ffffffff810999e0 0000000000000002 0000000000000046
ffff88007fc03ec0
[ 160.472176] ffff8800788031a0 ffffffff826eac88 ffffffff8107a5ac
0000000000000046
[ 160.472176] 0000000000000001 ffff88007fc03ec0 ffffffff8109e840
0000000000000082
[ 160.472176] Call Trace:
[ 160.472176] <IRQ>
[ 160.472176] [<ffffffff810999e0>] ? ftrace_raw_event_lock+0x50/0xe0
[ 160.472176] [<ffffffff8107a5ac>] ? hrtimer_interrupt+0x7c/0x240
[ 160.472176] [<ffffffff8109e840>] lock_release+0xe0/0x1f0
[ 160.472176] [<ffffffff810bb612>] ktime_get_update_offsets+0x62/0x150
[ 160.472176] [<ffffffff8107a5ac>] ? hrtimer_interrupt+0x7c/0x240
[ 160.472176] [<ffffffff8107a5ac>] hrtimer_interrupt+0x7c/0x240
[ 160.472176] [<ffffffff811016d6>] ? trace_event_buffer_lock_reserve+0x26/0x30
[ 160.472176] [<ffffffff810301d7>] local_apic_timer_interrupt+0x37/0x60
[ 160.472176] [<ffffffff81643b4a>] smp_trace_apic_timer_interrupt+0x4a/0xc9
[ 160.472176] [<ffffffff816427ef>] trace_apic_timer_interrupt+0x6f/0x80
[ 160.472176] <EOI>
[ 160.472176] [<ffffffff810c89ba>] ? smp_call_function_many+0x25a/0x2c0
[ 160.472176] [<ffffffff81008b80>] ? setup_data_read+0xa0/0xa0
[ 160.472176] [<ffffffff8109e7a6>] ? lock_release+0x46/0x1f0
[ 160.472176] [<ffffffff81008b80>] ? setup_data_read+0xa0/0xa0
[ 160.472176] [<ffffffff8109e7a7>] ? lock_release+0x47/0x1f0
[ 160.472176] [<ffffffff810c8a7d>] on_each_cpu+0x2d/0x60
[ 160.472176] [<ffffffff8109e7a6>] ? lock_release+0x46/0x1f0
[ 160.472176] [<ffffffff810092e8>] text_poke_bp+0xa8/0xc0
[ 160.472176] [<ffffffff8109e7a6>] ? lock_release+0x46/0x1f0
[ 160.472176] [<ffffffff8100685e>] arch_jump_label_transform+0x8e/0x100
[ 160.472176] [<ffffffff8112b73f>] __jump_label_update+0x5f/0x80
[ 160.472176] [<ffffffff8112b7fd>] jump_label_update+0x9d/0xb0
[ 160.472176] [<ffffffff8112ba3d>] static_key_slow_inc+0xad/0xb0
[ 160.472176] [<ffffffff810f2979>] tracepoint_update_probe_range+0x139/0x160
[ 160.472176] [<ffffffff810f32d4>] tracepoint_probe_register+0x54/0xa0
[ 160.472176] [<ffffffff81110b19>] ftrace_event_reg+0x69/0x90
[ 160.472176] [<ffffffff8110fb9f>] __ftrace_event_enable_disable+0x5f/0x1b0
[ 160.472176] [<ffffffff8110fdb7>] __ftrace_set_clr_event_nolock+0xc7/0xf0
[ 160.472176] [<ffffffff8111049b>] system_enable_write+0x9b/0xe0
[ 160.472176] [<ffffffff8118d494>] vfs_write+0xb4/0x1f0
[ 160.472176] [<ffffffff8118dec9>] SyS_write+0x49/0xa0
[ 160.472176] [<ffffffff810e9d7c>] ? __audit_syscall_entry+0x9c/0xf0
[ 160.472176] [<ffffffff81641b12>] system_call_fastpath+0x16/0x1b
[ 160.472176] Code: 89 f8 48 89 e5 f6 82 e0 c7 85 81 20 74 15 0f 1f
44 00 00 48 83 c0 01 0f b6 10 f6 82 e0 c7 85 81 20 75 f0 5d c3 66 0f
1f 44 00 00 <80> 3f 00 55 48 89 e5 74 15 48 89 f8 0f 1f 40 00 48 83 c0
01 80
[ 160.472176] RIP [<ffffffff8132ae40>] strlen+0x0/0x30
[ 160.472176] RSP <ffff88007fc03e40>
[ 160.472176] CR2: 0000000000000000
[ 160.472176] ---[ end trace cadf03f6f81c6eb9 ]---
[ 160.472277] Shutting down cpus with NMI
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 4:38 [BUG] 3.13-rc1 kernel crash when enable all tracepoints Jovi Zhangwei
@ 2013-11-26 6:33 ` Masami Hiramatsu
2013-11-26 7:20 ` Jovi Zhangwei
2013-11-26 7:26 ` (ltc-kernel 7837) " Masami Hiramatsu
2013-11-26 12:58 ` Steven Rostedt
1 sibling, 2 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2013-11-26 6:33 UTC (permalink / raw)
To: Jovi Zhangwei, Steven Rostedt; +Cc: Oleg Nesterov, LKML
(2013/11/26 13:38), Jovi Zhangwei wrote:
> Hi,
>
> I'm not sure this issue already be fixed or not, it can be reproduced
> permanently.
>
> (I didn't use git-bisect yet, you guys might can understand it quickly)
>
> #echo 1 > /sys/kernel/debug/tracing/events/enable
Thanks for reporting. I could reproduce it.
To narrow this down, I tried to run the below command
[tracing]# for i in events/*/*/enable ; do echo $i; echo 1 > $i; done
And it ran through the end without any problem.
Hm, next I checked the difference of available_events and set_event.
[tracing]# diff available_events set_event
283d282
< ftrace:function
So, I guess it was caused by enabling ftrace:function, and
it is unable to do that via set_event, nor events/ftrace/enable
I'm not sure how, but it seems that ftrace:function can be
enabled by the events/enable.
Thank you,
--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 6:33 ` Masami Hiramatsu
@ 2013-11-26 7:20 ` Jovi Zhangwei
2013-11-26 8:57 ` Masami Hiramatsu
2013-11-26 7:26 ` (ltc-kernel 7837) " Masami Hiramatsu
1 sibling, 1 reply; 9+ messages in thread
From: Jovi Zhangwei @ 2013-11-26 7:20 UTC (permalink / raw)
To: Masami Hiramatsu; +Cc: Steven Rostedt, Oleg Nesterov, LKML
On Tue, Nov 26, 2013 at 2:33 PM, Masami Hiramatsu
<masami.hiramatsu.pt@hitachi.com> wrote:
> (2013/11/26 13:38), Jovi Zhangwei wrote:
>> Hi,
>>
>> I'm not sure this issue already be fixed or not, it can be reproduced
>> permanently.
>>
>> (I didn't use git-bisect yet, you guys might can understand it quickly)
>>
>> #echo 1 > /sys/kernel/debug/tracing/events/enable
>
> Thanks for reporting. I could reproduce it.
>
> To narrow this down, I tried to run the below command
>
> [tracing]# for i in events/*/*/enable ; do echo $i; echo 1 > $i; done
>
> And it ran through the end without any problem.
> Hm, next I checked the difference of available_events and set_event.
>
> [tracing]# diff available_events set_event
> 283d282
> < ftrace:function
>
> So, I guess it was caused by enabling ftrace:function, and
> it is unable to do that via set_event, nor events/ftrace/enable
> I'm not sure how, but it seems that ftrace:function can be
> enabled by the events/enable.
>
I'm not sure this relate with ftrace:function event, from the
crash log, it seems more like caused by lock events.
Now I only enable lock events (lockdep compiled in my box), below
two commands both can crash system.
#echo 1 > /sys/kernel/debug/tracing/events/lock/lock_acquire/enable
or.
#echo 1 > /sys/kernel/debug/tracing/events/lock/lock_release/enable
According to the log, it seems like a recursion tracing bug.
register lock event -> jump_label_update -> text_poke_bp ->
on_each_cpu -> local_apic_timer_interrupt -> ktime_get_update_offsets
-> lock_release
(Perhaps the crash you reproduced is another crash? similar crash log?)
Jovi
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: (ltc-kernel 7837) Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 6:33 ` Masami Hiramatsu
2013-11-26 7:20 ` Jovi Zhangwei
@ 2013-11-26 7:26 ` Masami Hiramatsu
1 sibling, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2013-11-26 7:26 UTC (permalink / raw)
To: Jovi Zhangwei, Steven Rostedt; +Cc: Oleg Nesterov, LKML, yrl.pp-manager.tt
(2013/11/26 15:33), Masami Hiramatsu wrote:
> (2013/11/26 13:38), Jovi Zhangwei wrote:
>> Hi,
>>
>> I'm not sure this issue already be fixed or not, it can be reproduced
>> permanently.
>>
>> (I didn't use git-bisect yet, you guys might can understand it quickly)
>>
>> #echo 1 > /sys/kernel/debug/tracing/events/enable
>
> Thanks for reporting. I could reproduce it.
>
> To narrow this down, I tried to run the below command
>
> [tracing]# for i in events/*/*/enable ; do echo $i; echo 1 > $i; done
>
> And it ran through the end without any problem.
> Hm, next I checked the difference of available_events and set_event.
>
> [tracing]# diff available_events set_event
> 283d282
> < ftrace:function
>
> So, I guess it was caused by enabling ftrace:function, and
> it is unable to do that via set_event, nor events/ftrace/enable
> I'm not sure how, but it seems that ftrace:function can be
> enabled by the events/enable.
Oops, No, forget it. When I added a printk in
ftrace_set_clr_event_nolock(), it didn't happen and ftrace/function
is not enabled.
I'm trying to reproduce it on the latest tip kernel again on KVM,
but not yet reproduced.
Thank you,
--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 7:20 ` Jovi Zhangwei
@ 2013-11-26 8:57 ` Masami Hiramatsu
2013-11-26 10:27 ` Masami Hiramatsu
0 siblings, 1 reply; 9+ messages in thread
From: Masami Hiramatsu @ 2013-11-26 8:57 UTC (permalink / raw)
To: Jovi Zhangwei; +Cc: Steven Rostedt, Oleg Nesterov, LKML, yrl.pp-manager.tt
(2013/11/26 16:20), Jovi Zhangwei wrote:
> I'm not sure this relate with ftrace:function event, from the
> crash log, it seems more like caused by lock events.
>
> Now I only enable lock events (lockdep compiled in my box), below
> two commands both can crash system.
That is not the lockdep, but the lockstat. :)
>
> #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_acquire/enable
> or.
> #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_release/enable
>
> According to the log, it seems like a recursion tracing bug.
>
> register lock event -> jump_label_update -> text_poke_bp ->
> on_each_cpu -> local_apic_timer_interrupt -> ktime_get_update_offsets
> -> lock_release
OK, the problem is in the lock_acquire/release events.
> (Perhaps the crash you reproduced is another crash? similar crash log?)
Right, that was my mistake. Now I got same error log on kvm too.
This should be fixed.
Thank you,
--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 8:57 ` Masami Hiramatsu
@ 2013-11-26 10:27 ` Masami Hiramatsu
0 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2013-11-26 10:27 UTC (permalink / raw)
To: Jovi Zhangwei; +Cc: Steven Rostedt, Oleg Nesterov, LKML, yrl.pp-manager.tt
(2013/11/26 17:57), Masami Hiramatsu wrote:
> (2013/11/26 16:20), Jovi Zhangwei wrote:
>> I'm not sure this relate with ftrace:function event, from the
>> crash log, it seems more like caused by lock events.
>>
>> Now I only enable lock events (lockdep compiled in my box), below
>> two commands both can crash system.
>
> That is not the lockdep, but the lockstat. :)
>
>>
>> #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_acquire/enable
>> or.
>> #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_release/enable
>>
>> According to the log, it seems like a recursion tracing bug.
>>
>> register lock event -> jump_label_update -> text_poke_bp ->
>> on_each_cpu -> local_apic_timer_interrupt -> ktime_get_update_offsets
>> -> lock_release
>
> OK, the problem is in the lock_acquire/release events.
Hm, it seems that the bug is caused because lock->name is NULL in
ftrace_raw_event_lock(_acquire). To allocate a buffer on ring_buffer,
ftrace_raw_event_##call tries to measure the length of lock->name
but it causes a memory access violation in ftrace_get_offsets_##call
because lock->name is NULL. :(
I could fix this by checking lock->name in include/trace/events/lock.h
e.g.
TP_STRUCT__entry(
__field(unsigned int, flags)
__string(name, lock->name ?: "NULL")
__field(void *, lockdep_addr)
),
TP_fast_assign(
__entry->flags = (trylock ? 1 : 0) | (read ? 2 : 0);
__assign_str(name, lock->name ?: "NULL");
__entry->lockdep_addr = lock;
),
Thank you,
--
Masami HIRAMATSU
IT Management Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu.pt@hitachi.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 4:38 [BUG] 3.13-rc1 kernel crash when enable all tracepoints Jovi Zhangwei
2013-11-26 6:33 ` Masami Hiramatsu
@ 2013-11-26 12:58 ` Steven Rostedt
2013-11-26 14:32 ` Shuah Khan
1 sibling, 1 reply; 9+ messages in thread
From: Steven Rostedt @ 2013-11-26 12:58 UTC (permalink / raw)
To: Jovi Zhangwei; +Cc: Masami Hiramatsu, Oleg Nesterov, LKML
On Tue, 26 Nov 2013 12:38:53 +0800
Jovi Zhangwei <jovi.zhangwei@gmail.com> wrote:
> Hi,
>
> I'm not sure this issue already be fixed or not, it can be reproduced
> permanently.
>
> (I didn't use git-bisect yet, you guys might can understand it quickly)
>
> #echo 1 > /sys/kernel/debug/tracing/events/enable
>
> [ 160.472176] BUG: unable to handle kernel NULL pointer dereference
> at (null)
> [ 160.472176] IP: [<ffffffff8132ae40>] strlen+0x0/0x30
I hit this bug last night when testing a fix for another bug. I have a
patch for this too, and will send it out after I put it through my
tests.
You can try it out.
-- Steve
diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
index 52594b2..bdac88c 100644
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -372,7 +372,8 @@ ftrace_define_fields_##call(struct ftrace_event_call *event_call) \
__data_size += (len) * sizeof(type);
#undef __string
-#define __string(item, src) __dynamic_array(char, item, strlen(src) + 1)
+#define __string(item, src) __dynamic_array(char, item, \
+ strlen((src) ? (src) : "(null)") + 1)
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
@@ -501,7 +502,7 @@ static inline notrace int ftrace_get_offsets_##call( \
#undef __assign_str
#define __assign_str(dst, src) \
- strcpy(__get_str(dst), src);
+ strcpy(__get_str(dst), (src) ? (src) : "(null)");
#undef TP_fast_assign
#define TP_fast_assign(args...) args
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 12:58 ` Steven Rostedt
@ 2013-11-26 14:32 ` Shuah Khan
2013-11-26 14:44 ` Steven Rostedt
0 siblings, 1 reply; 9+ messages in thread
From: Shuah Khan @ 2013-11-26 14:32 UTC (permalink / raw)
To: Steven Rostedt; +Cc: Jovi Zhangwei, Masami Hiramatsu, Oleg Nesterov, LKML
I sent in a patch to fix it. You can try the patch out if you like,
https://lkml.org/lkml/2013/11/22/337
thanks,
-- Shuah
On Tue, Nov 26, 2013 at 5:58 AM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 26 Nov 2013 12:38:53 +0800
> Jovi Zhangwei <jovi.zhangwei@gmail.com> wrote:
>
>> Hi,
>>
>> I'm not sure this issue already be fixed or not, it can be reproduced
>> permanently.
>>
>> (I didn't use git-bisect yet, you guys might can understand it quickly)
>>
>> #echo 1 > /sys/kernel/debug/tracing/events/enable
>>
>> [ 160.472176] BUG: unable to handle kernel NULL pointer dereference
>> at (null)
>> [ 160.472176] IP: [<ffffffff8132ae40>] strlen+0x0/0x30
>
> I hit this bug last night when testing a fix for another bug. I have a
> patch for this too, and will send it out after I put it through my
> tests.
>
> You can try it out.
>
> -- Steve
>
> diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h
> index 52594b2..bdac88c 100644
> --- a/include/trace/ftrace.h
> +++ b/include/trace/ftrace.h
> @@ -372,7 +372,8 @@ ftrace_define_fields_##call(struct ftrace_event_call *event_call) \
> __data_size += (len) * sizeof(type);
>
> #undef __string
> -#define __string(item, src) __dynamic_array(char, item, strlen(src) + 1)
> +#define __string(item, src) __dynamic_array(char, item, \
> + strlen((src) ? (src) : "(null)") + 1)
>
> #undef DECLARE_EVENT_CLASS
> #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
> @@ -501,7 +502,7 @@ static inline notrace int ftrace_get_offsets_##call( \
>
> #undef __assign_str
> #define __assign_str(dst, src) \
> - strcpy(__get_str(dst), src);
> + strcpy(__get_str(dst), (src) ? (src) : "(null)");
>
> #undef TP_fast_assign
> #define TP_fast_assign(args...) args
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints
2013-11-26 14:32 ` Shuah Khan
@ 2013-11-26 14:44 ` Steven Rostedt
0 siblings, 0 replies; 9+ messages in thread
From: Steven Rostedt @ 2013-11-26 14:44 UTC (permalink / raw)
To: Shuah Khan; +Cc: Jovi Zhangwei, Masami Hiramatsu, Oleg Nesterov, LKML
On Tue, 26 Nov 2013 07:32:43 -0700
Shuah Khan <shuahkhan@gmail.com> wrote:
> I sent in a patch to fix it. You can try the patch out if you like,
>
> https://lkml.org/lkml/2013/11/22/337
That was the patch I sent to you.
https://lkml.org/lkml/2013/11/20/637
I wanted you to test it, not submit it as your own. I have it in my
queue.
-- Steve
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-11-26 14:44 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-26 4:38 [BUG] 3.13-rc1 kernel crash when enable all tracepoints Jovi Zhangwei
2013-11-26 6:33 ` Masami Hiramatsu
2013-11-26 7:20 ` Jovi Zhangwei
2013-11-26 8:57 ` Masami Hiramatsu
2013-11-26 10:27 ` Masami Hiramatsu
2013-11-26 7:26 ` (ltc-kernel 7837) " Masami Hiramatsu
2013-11-26 12:58 ` Steven Rostedt
2013-11-26 14:32 ` Shuah Khan
2013-11-26 14:44 ` Steven Rostedt
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).