* [BUG] 3.13-rc1 kernel crash when enable all tracepoints @ 2013-11-26 4:38 Jovi Zhangwei 2013-11-26 6:33 ` Masami Hiramatsu 2013-11-26 12:58 ` Steven Rostedt 0 siblings, 2 replies; 9+ messages in thread From: Jovi Zhangwei @ 2013-11-26 4:38 UTC (permalink / raw) To: Steven Rostedt, Masami Hiramatsu, Oleg Nesterov; +Cc: LKML Hi, I'm not sure this issue already be fixed or not, it can be reproduced permanently. (I didn't use git-bisect yet, you guys might can understand it quickly) #echo 1 > /sys/kernel/debug/tracing/events/enable [ 160.472176] BUG: unable to handle kernel NULL pointer dereference at (null) [ 160.472176] IP: [<ffffffff8132ae40>] strlen+0x0/0x30 [ 160.472176] PGD 7b67a067 PUD 788c6067 PMD 0 [ 160.472176] Oops: 0000 [#12] SMP [ 160.472176] Modules linked in: ebtable_nat ebtables joydev e1000 [ 160.472176] CPU: 0 PID: 741 Comm: bash Tainted: G D 3.13.0-rc1+ #16 [ 160.472176] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 160.472176] task: ffff88007891e000 ti: ffff880074d04000 task.ti: ffff880074d04000 [ 160.472176] RIP: 0010:[<ffffffff8132ae40>] [<ffffffff8132ae40>] strlen+0x0/0x30 [ 160.472176] RSP: 0018:ffff88007fc03e40 EFLAGS: 00010006 [ 160.472176] RAX: 0000000000000002 RBX: ffff88007bb19d80 RCX: 0000000000000002 [ 160.472176] RDX: ffffffff8107a5ac RSI: ffffffff826eac88 RDI: 0000000000000000 [ 160.472176] RBP: ffff88007fc03e88 R08: 0000000000000002 R09: 0000000000000000 [ 160.472176] R10: ffff88007891e000 R11: 0000000000000000 R12: ffffffff826eac88 [ 160.472176] R13: 0000000000010000 R14: 0000000000000046 R15: ffffffff81c47880 [ 160.472176] FS: 00007fef34515740(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000 [ 160.472176] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 160.472176] CR2: 0000000000000000 CR3: 000000007a492000 CR4: 00000000000006f0 [ 160.472176] Stack: [ 160.472176] ffffffff810999e0 0000000000000002 0000000000000046 ffff88007fc03ec0 [ 160.472176] ffff8800788031a0 ffffffff826eac88 ffffffff8107a5ac 0000000000000046 [ 160.472176] 0000000000000001 ffff88007fc03ec0 ffffffff8109e840 0000000000000082 [ 160.472176] Call Trace: [ 160.472176] <IRQ> [ 160.472176] [<ffffffff810999e0>] ? ftrace_raw_event_lock+0x50/0xe0 [ 160.472176] [<ffffffff8107a5ac>] ? hrtimer_interrupt+0x7c/0x240 [ 160.472176] [<ffffffff8109e840>] lock_release+0xe0/0x1f0 [ 160.472176] [<ffffffff810bb612>] ktime_get_update_offsets+0x62/0x150 [ 160.472176] [<ffffffff8107a5ac>] ? hrtimer_interrupt+0x7c/0x240 [ 160.472176] [<ffffffff8107a5ac>] hrtimer_interrupt+0x7c/0x240 [ 160.472176] [<ffffffff811016d6>] ? trace_event_buffer_lock_reserve+0x26/0x30 [ 160.472176] [<ffffffff810301d7>] local_apic_timer_interrupt+0x37/0x60 [ 160.472176] [<ffffffff81643b4a>] smp_trace_apic_timer_interrupt+0x4a/0xc9 [ 160.472176] [<ffffffff816427ef>] trace_apic_timer_interrupt+0x6f/0x80 [ 160.472176] <EOI> [ 160.472176] [<ffffffff810c89ba>] ? smp_call_function_many+0x25a/0x2c0 [ 160.472176] [<ffffffff81008b80>] ? setup_data_read+0xa0/0xa0 [ 160.472176] [<ffffffff8109e7a6>] ? lock_release+0x46/0x1f0 [ 160.472176] [<ffffffff81008b80>] ? setup_data_read+0xa0/0xa0 [ 160.472176] [<ffffffff8109e7a7>] ? lock_release+0x47/0x1f0 [ 160.472176] [<ffffffff810c8a7d>] on_each_cpu+0x2d/0x60 [ 160.472176] [<ffffffff8109e7a6>] ? lock_release+0x46/0x1f0 [ 160.472176] [<ffffffff810092e8>] text_poke_bp+0xa8/0xc0 [ 160.472176] [<ffffffff8109e7a6>] ? lock_release+0x46/0x1f0 [ 160.472176] [<ffffffff8100685e>] arch_jump_label_transform+0x8e/0x100 [ 160.472176] [<ffffffff8112b73f>] __jump_label_update+0x5f/0x80 [ 160.472176] [<ffffffff8112b7fd>] jump_label_update+0x9d/0xb0 [ 160.472176] [<ffffffff8112ba3d>] static_key_slow_inc+0xad/0xb0 [ 160.472176] [<ffffffff810f2979>] tracepoint_update_probe_range+0x139/0x160 [ 160.472176] [<ffffffff810f32d4>] tracepoint_probe_register+0x54/0xa0 [ 160.472176] [<ffffffff81110b19>] ftrace_event_reg+0x69/0x90 [ 160.472176] [<ffffffff8110fb9f>] __ftrace_event_enable_disable+0x5f/0x1b0 [ 160.472176] [<ffffffff8110fdb7>] __ftrace_set_clr_event_nolock+0xc7/0xf0 [ 160.472176] [<ffffffff8111049b>] system_enable_write+0x9b/0xe0 [ 160.472176] [<ffffffff8118d494>] vfs_write+0xb4/0x1f0 [ 160.472176] [<ffffffff8118dec9>] SyS_write+0x49/0xa0 [ 160.472176] [<ffffffff810e9d7c>] ? __audit_syscall_entry+0x9c/0xf0 [ 160.472176] [<ffffffff81641b12>] system_call_fastpath+0x16/0x1b [ 160.472176] Code: 89 f8 48 89 e5 f6 82 e0 c7 85 81 20 74 15 0f 1f 44 00 00 48 83 c0 01 0f b6 10 f6 82 e0 c7 85 81 20 75 f0 5d c3 66 0f 1f 44 00 00 <80> 3f 00 55 48 89 e5 74 15 48 89 f8 0f 1f 40 00 48 83 c0 01 80 [ 160.472176] RIP [<ffffffff8132ae40>] strlen+0x0/0x30 [ 160.472176] RSP <ffff88007fc03e40> [ 160.472176] CR2: 0000000000000000 [ 160.472176] ---[ end trace cadf03f6f81c6eb9 ]--- [ 160.472277] Shutting down cpus with NMI ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 4:38 [BUG] 3.13-rc1 kernel crash when enable all tracepoints Jovi Zhangwei @ 2013-11-26 6:33 ` Masami Hiramatsu 2013-11-26 7:20 ` Jovi Zhangwei 2013-11-26 7:26 ` (ltc-kernel 7837) " Masami Hiramatsu 2013-11-26 12:58 ` Steven Rostedt 1 sibling, 2 replies; 9+ messages in thread From: Masami Hiramatsu @ 2013-11-26 6:33 UTC (permalink / raw) To: Jovi Zhangwei, Steven Rostedt; +Cc: Oleg Nesterov, LKML (2013/11/26 13:38), Jovi Zhangwei wrote: > Hi, > > I'm not sure this issue already be fixed or not, it can be reproduced > permanently. > > (I didn't use git-bisect yet, you guys might can understand it quickly) > > #echo 1 > /sys/kernel/debug/tracing/events/enable Thanks for reporting. I could reproduce it. To narrow this down, I tried to run the below command [tracing]# for i in events/*/*/enable ; do echo $i; echo 1 > $i; done And it ran through the end without any problem. Hm, next I checked the difference of available_events and set_event. [tracing]# diff available_events set_event 283d282 < ftrace:function So, I guess it was caused by enabling ftrace:function, and it is unable to do that via set_event, nor events/ftrace/enable I'm not sure how, but it seems that ftrace:function can be enabled by the events/enable. Thank you, -- Masami HIRAMATSU IT Management Research Dept. Linux Technology Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu.pt@hitachi.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 6:33 ` Masami Hiramatsu @ 2013-11-26 7:20 ` Jovi Zhangwei 2013-11-26 8:57 ` Masami Hiramatsu 2013-11-26 7:26 ` (ltc-kernel 7837) " Masami Hiramatsu 1 sibling, 1 reply; 9+ messages in thread From: Jovi Zhangwei @ 2013-11-26 7:20 UTC (permalink / raw) To: Masami Hiramatsu; +Cc: Steven Rostedt, Oleg Nesterov, LKML On Tue, Nov 26, 2013 at 2:33 PM, Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> wrote: > (2013/11/26 13:38), Jovi Zhangwei wrote: >> Hi, >> >> I'm not sure this issue already be fixed or not, it can be reproduced >> permanently. >> >> (I didn't use git-bisect yet, you guys might can understand it quickly) >> >> #echo 1 > /sys/kernel/debug/tracing/events/enable > > Thanks for reporting. I could reproduce it. > > To narrow this down, I tried to run the below command > > [tracing]# for i in events/*/*/enable ; do echo $i; echo 1 > $i; done > > And it ran through the end without any problem. > Hm, next I checked the difference of available_events and set_event. > > [tracing]# diff available_events set_event > 283d282 > < ftrace:function > > So, I guess it was caused by enabling ftrace:function, and > it is unable to do that via set_event, nor events/ftrace/enable > I'm not sure how, but it seems that ftrace:function can be > enabled by the events/enable. > I'm not sure this relate with ftrace:function event, from the crash log, it seems more like caused by lock events. Now I only enable lock events (lockdep compiled in my box), below two commands both can crash system. #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_acquire/enable or. #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_release/enable According to the log, it seems like a recursion tracing bug. register lock event -> jump_label_update -> text_poke_bp -> on_each_cpu -> local_apic_timer_interrupt -> ktime_get_update_offsets -> lock_release (Perhaps the crash you reproduced is another crash? similar crash log?) Jovi ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 7:20 ` Jovi Zhangwei @ 2013-11-26 8:57 ` Masami Hiramatsu 2013-11-26 10:27 ` Masami Hiramatsu 0 siblings, 1 reply; 9+ messages in thread From: Masami Hiramatsu @ 2013-11-26 8:57 UTC (permalink / raw) To: Jovi Zhangwei; +Cc: Steven Rostedt, Oleg Nesterov, LKML, yrl.pp-manager.tt (2013/11/26 16:20), Jovi Zhangwei wrote: > I'm not sure this relate with ftrace:function event, from the > crash log, it seems more like caused by lock events. > > Now I only enable lock events (lockdep compiled in my box), below > two commands both can crash system. That is not the lockdep, but the lockstat. :) > > #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_acquire/enable > or. > #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_release/enable > > According to the log, it seems like a recursion tracing bug. > > register lock event -> jump_label_update -> text_poke_bp -> > on_each_cpu -> local_apic_timer_interrupt -> ktime_get_update_offsets > -> lock_release OK, the problem is in the lock_acquire/release events. > (Perhaps the crash you reproduced is another crash? similar crash log?) Right, that was my mistake. Now I got same error log on kvm too. This should be fixed. Thank you, -- Masami HIRAMATSU IT Management Research Dept. Linux Technology Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu.pt@hitachi.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 8:57 ` Masami Hiramatsu @ 2013-11-26 10:27 ` Masami Hiramatsu 0 siblings, 0 replies; 9+ messages in thread From: Masami Hiramatsu @ 2013-11-26 10:27 UTC (permalink / raw) To: Jovi Zhangwei; +Cc: Steven Rostedt, Oleg Nesterov, LKML, yrl.pp-manager.tt (2013/11/26 17:57), Masami Hiramatsu wrote: > (2013/11/26 16:20), Jovi Zhangwei wrote: >> I'm not sure this relate with ftrace:function event, from the >> crash log, it seems more like caused by lock events. >> >> Now I only enable lock events (lockdep compiled in my box), below >> two commands both can crash system. > > That is not the lockdep, but the lockstat. :) > >> >> #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_acquire/enable >> or. >> #echo 1 > /sys/kernel/debug/tracing/events/lock/lock_release/enable >> >> According to the log, it seems like a recursion tracing bug. >> >> register lock event -> jump_label_update -> text_poke_bp -> >> on_each_cpu -> local_apic_timer_interrupt -> ktime_get_update_offsets >> -> lock_release > > OK, the problem is in the lock_acquire/release events. Hm, it seems that the bug is caused because lock->name is NULL in ftrace_raw_event_lock(_acquire). To allocate a buffer on ring_buffer, ftrace_raw_event_##call tries to measure the length of lock->name but it causes a memory access violation in ftrace_get_offsets_##call because lock->name is NULL. :( I could fix this by checking lock->name in include/trace/events/lock.h e.g. TP_STRUCT__entry( __field(unsigned int, flags) __string(name, lock->name ?: "NULL") __field(void *, lockdep_addr) ), TP_fast_assign( __entry->flags = (trylock ? 1 : 0) | (read ? 2 : 0); __assign_str(name, lock->name ?: "NULL"); __entry->lockdep_addr = lock; ), Thank you, -- Masami HIRAMATSU IT Management Research Dept. Linux Technology Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu.pt@hitachi.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: (ltc-kernel 7837) Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 6:33 ` Masami Hiramatsu 2013-11-26 7:20 ` Jovi Zhangwei @ 2013-11-26 7:26 ` Masami Hiramatsu 1 sibling, 0 replies; 9+ messages in thread From: Masami Hiramatsu @ 2013-11-26 7:26 UTC (permalink / raw) To: Jovi Zhangwei, Steven Rostedt; +Cc: Oleg Nesterov, LKML, yrl.pp-manager.tt (2013/11/26 15:33), Masami Hiramatsu wrote: > (2013/11/26 13:38), Jovi Zhangwei wrote: >> Hi, >> >> I'm not sure this issue already be fixed or not, it can be reproduced >> permanently. >> >> (I didn't use git-bisect yet, you guys might can understand it quickly) >> >> #echo 1 > /sys/kernel/debug/tracing/events/enable > > Thanks for reporting. I could reproduce it. > > To narrow this down, I tried to run the below command > > [tracing]# for i in events/*/*/enable ; do echo $i; echo 1 > $i; done > > And it ran through the end without any problem. > Hm, next I checked the difference of available_events and set_event. > > [tracing]# diff available_events set_event > 283d282 > < ftrace:function > > So, I guess it was caused by enabling ftrace:function, and > it is unable to do that via set_event, nor events/ftrace/enable > I'm not sure how, but it seems that ftrace:function can be > enabled by the events/enable. Oops, No, forget it. When I added a printk in ftrace_set_clr_event_nolock(), it didn't happen and ftrace/function is not enabled. I'm trying to reproduce it on the latest tip kernel again on KVM, but not yet reproduced. Thank you, -- Masami HIRAMATSU IT Management Research Dept. Linux Technology Center Hitachi, Ltd., Yokohama Research Laboratory E-mail: masami.hiramatsu.pt@hitachi.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 4:38 [BUG] 3.13-rc1 kernel crash when enable all tracepoints Jovi Zhangwei 2013-11-26 6:33 ` Masami Hiramatsu @ 2013-11-26 12:58 ` Steven Rostedt 2013-11-26 14:32 ` Shuah Khan 1 sibling, 1 reply; 9+ messages in thread From: Steven Rostedt @ 2013-11-26 12:58 UTC (permalink / raw) To: Jovi Zhangwei; +Cc: Masami Hiramatsu, Oleg Nesterov, LKML On Tue, 26 Nov 2013 12:38:53 +0800 Jovi Zhangwei <jovi.zhangwei@gmail.com> wrote: > Hi, > > I'm not sure this issue already be fixed or not, it can be reproduced > permanently. > > (I didn't use git-bisect yet, you guys might can understand it quickly) > > #echo 1 > /sys/kernel/debug/tracing/events/enable > > [ 160.472176] BUG: unable to handle kernel NULL pointer dereference > at (null) > [ 160.472176] IP: [<ffffffff8132ae40>] strlen+0x0/0x30 I hit this bug last night when testing a fix for another bug. I have a patch for this too, and will send it out after I put it through my tests. You can try it out. -- Steve diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h index 52594b2..bdac88c 100644 --- a/include/trace/ftrace.h +++ b/include/trace/ftrace.h @@ -372,7 +372,8 @@ ftrace_define_fields_##call(struct ftrace_event_call *event_call) \ __data_size += (len) * sizeof(type); #undef __string -#define __string(item, src) __dynamic_array(char, item, strlen(src) + 1) +#define __string(item, src) __dynamic_array(char, item, \ + strlen((src) ? (src) : "(null)") + 1) #undef DECLARE_EVENT_CLASS #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \ @@ -501,7 +502,7 @@ static inline notrace int ftrace_get_offsets_##call( \ #undef __assign_str #define __assign_str(dst, src) \ - strcpy(__get_str(dst), src); + strcpy(__get_str(dst), (src) ? (src) : "(null)"); #undef TP_fast_assign #define TP_fast_assign(args...) args ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 12:58 ` Steven Rostedt @ 2013-11-26 14:32 ` Shuah Khan 2013-11-26 14:44 ` Steven Rostedt 0 siblings, 1 reply; 9+ messages in thread From: Shuah Khan @ 2013-11-26 14:32 UTC (permalink / raw) To: Steven Rostedt; +Cc: Jovi Zhangwei, Masami Hiramatsu, Oleg Nesterov, LKML I sent in a patch to fix it. You can try the patch out if you like, https://lkml.org/lkml/2013/11/22/337 thanks, -- Shuah On Tue, Nov 26, 2013 at 5:58 AM, Steven Rostedt <rostedt@goodmis.org> wrote: > On Tue, 26 Nov 2013 12:38:53 +0800 > Jovi Zhangwei <jovi.zhangwei@gmail.com> wrote: > >> Hi, >> >> I'm not sure this issue already be fixed or not, it can be reproduced >> permanently. >> >> (I didn't use git-bisect yet, you guys might can understand it quickly) >> >> #echo 1 > /sys/kernel/debug/tracing/events/enable >> >> [ 160.472176] BUG: unable to handle kernel NULL pointer dereference >> at (null) >> [ 160.472176] IP: [<ffffffff8132ae40>] strlen+0x0/0x30 > > I hit this bug last night when testing a fix for another bug. I have a > patch for this too, and will send it out after I put it through my > tests. > > You can try it out. > > -- Steve > > diff --git a/include/trace/ftrace.h b/include/trace/ftrace.h > index 52594b2..bdac88c 100644 > --- a/include/trace/ftrace.h > +++ b/include/trace/ftrace.h > @@ -372,7 +372,8 @@ ftrace_define_fields_##call(struct ftrace_event_call *event_call) \ > __data_size += (len) * sizeof(type); > > #undef __string > -#define __string(item, src) __dynamic_array(char, item, strlen(src) + 1) > +#define __string(item, src) __dynamic_array(char, item, \ > + strlen((src) ? (src) : "(null)") + 1) > > #undef DECLARE_EVENT_CLASS > #define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \ > @@ -501,7 +502,7 @@ static inline notrace int ftrace_get_offsets_##call( \ > > #undef __assign_str > #define __assign_str(dst, src) \ > - strcpy(__get_str(dst), src); > + strcpy(__get_str(dst), (src) ? (src) : "(null)"); > > #undef TP_fast_assign > #define TP_fast_assign(args...) args > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [BUG] 3.13-rc1 kernel crash when enable all tracepoints 2013-11-26 14:32 ` Shuah Khan @ 2013-11-26 14:44 ` Steven Rostedt 0 siblings, 0 replies; 9+ messages in thread From: Steven Rostedt @ 2013-11-26 14:44 UTC (permalink / raw) To: Shuah Khan; +Cc: Jovi Zhangwei, Masami Hiramatsu, Oleg Nesterov, LKML On Tue, 26 Nov 2013 07:32:43 -0700 Shuah Khan <shuahkhan@gmail.com> wrote: > I sent in a patch to fix it. You can try the patch out if you like, > > https://lkml.org/lkml/2013/11/22/337 That was the patch I sent to you. https://lkml.org/lkml/2013/11/20/637 I wanted you to test it, not submit it as your own. I have it in my queue. -- Steve ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-11-26 14:44 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-11-26 4:38 [BUG] 3.13-rc1 kernel crash when enable all tracepoints Jovi Zhangwei 2013-11-26 6:33 ` Masami Hiramatsu 2013-11-26 7:20 ` Jovi Zhangwei 2013-11-26 8:57 ` Masami Hiramatsu 2013-11-26 10:27 ` Masami Hiramatsu 2013-11-26 7:26 ` (ltc-kernel 7837) " Masami Hiramatsu 2013-11-26 12:58 ` Steven Rostedt 2013-11-26 14:32 ` Shuah Khan 2013-11-26 14:44 ` Steven Rostedt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).