linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* perf: fuzzer gets CPU stuck in perf_callchain()
@ 2015-02-19 16:54 Vince Weaver
  2015-02-19 16:57 ` Peter Zijlstra
  2015-02-19 17:03 ` Peter Zijlstra
  0 siblings, 2 replies; 8+ messages in thread
From: Vince Weaver @ 2015-02-19 16:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Steven Rostedt


Another bug found by the perf_fuzzer().  I think this one is different
than the one I sent the other day, it looks like something is going
very wrong in perf_callchain().

This one is reasonably reproducible, if there's any extra debugging that I 
can add.  This is on a Haswell machine with git as of a few days ago.

There are two traces below; they were generated by the same random seed
across different reboots and the trigger is slightly different (although 
both look like they hit the same bug in the callchain code).

[ 7938.495396] ------------[ cut here ]------------
[ 7938.500373] WARNING: CPU: 2 PID: 5692 at kernel/watchdog.c:290 watchdog_overflow_callback+0x92/0xc0()
[ 7938.510337] Watchdog detected hard LOCKUP on cpu 2
[ 7938.515301] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_controller snd_hda_codec i915 aes_x86_64 lrw gf128mul ppdev iTCO_wdt xhci_pci snd_hwdep snd_pcm tpm_tis evdev glue_helper iTCO_vendor_support xhci_hcd snd_timer mei_me drm_kms_helper ablk_helper snd tpm mei drm psmouse serio_raw pcspkr parport_pc cryptd soundcore lpc_ich parport wmi mfd_core battery i2c_i801 i2c_algo_bit button processor video sg sr_mod sd_mod cdrom ahci libahci ehci_pci libata ehci_hcd e1000e ptp crc32c_intel scsi_mod usbcore usb_common pps_core fan thermal thermal_sys
[ 7938.586970] CPU: 2 PID: 5692 Comm: perf_fuzzer Not tainted 3.19.0+ #127
[ 7938.594053] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 7938.601990]  ffffffff81a41afe ffff88011ea85aa0 ffffffff816bfed5 0000000000000000
[ 7938.610056]  ffff88011ea85af0 ffff88011ea85ae0 ffffffff8106ddfa 0000000000000000
[ 7938.618102]  ffff8801195a1800 0000000000000000 ffff88011ea85c40 ffff88011ea85ef8
[ 7938.626105] Call Trace:
[ 7938.628735]  <NMI>  [<ffffffff816bfed5>] dump_stack+0x45/0x57
[ 7938.634940]  [<ffffffff8106ddfa>] warn_slowpath_common+0x8a/0xc0
[ 7938.641394]  [<ffffffff8106de76>] warn_slowpath_fmt+0x46/0x50
[ 7938.647566]  [<ffffffff8111cde2>] watchdog_overflow_callback+0x92/0xc0
[ 7938.654617]  [<ffffffff8115dbe1>] __perf_event_overflow+0x91/0x270
[ 7938.661233]  [<ffffffff8102b56a>] ? x86_perf_event_set_period+0xca/0x170
[ 7938.668419]  [<ffffffff8115e6c9>] perf_event_overflow+0x19/0x20
[ 7938.674764]  [<ffffffff81032a7a>] intel_pmu_handle_irq+0x1ba/0x3a0
[ 7938.681432]  [<ffffffff8102a45b>] perf_event_nmi_handler+0x2b/0x50
[ 7938.688055]  [<ffffffff81018760>] nmi_handle+0xa0/0x150
[ 7938.693681]  [<ffffffff810186c5>] ? nmi_handle+0x5/0x150
[ 7938.699405]  [<ffffffff81018a0a>] default_do_nmi+0x4a/0x140
[ 7938.705429]  [<ffffffff81018b98>] do_nmi+0x98/0xe0
[ 7938.710573]  [<ffffffff816ca3d1>] end_repeat_nmi+0x1e/0x2e
[ 7938.716470]  [<ffffffff8140b300>] ? copy_from_user_nmi+0x20/0x70
[ 7938.722929]  [<ffffffff8140b300>] ? copy_from_user_nmi+0x20/0x70
[ 7938.729386]  [<ffffffff8140b300>] ? copy_from_user_nmi+0x20/0x70
[ 7938.735862]  <<EOE>>  <IRQ>  [<ffffffff8102c210>] ? perf_callchain_user+0xd0/0x240
[ 7938.744067]  [<ffffffff81161508>] perf_callchain+0x1f8/0x220
[ 7938.750152]  [<ffffffff8115da21>] perf_prepare_sample+0x291/0x3c0
[ 7938.756700]  [<ffffffff8115dc5e>] __perf_event_overflow+0x10e/0x270
[ 7938.763412]  [<ffffffff8115dc29>] ? __perf_event_overflow+0xd9/0x270
[ 7938.770254]  [<ffffffff8115e114>] ? perf_tp_event+0xc4/0x210
[ 7938.776302]  [<ffffffff8115da7c>] ? perf_prepare_sample+0x2ec/0x3c0
[ 7938.783021]  [<ffffffff816ca018>] ? trace_page_fault+0x28/0x30
[ 7938.789319]  [<ffffffff8115df89>] perf_swevent_overflow+0xa9/0xe0
[ 7938.795845]  [<ffffffff8115e027>] perf_swevent_event+0x67/0x90
[ 7938.802139]  [<ffffffff8115e114>] perf_tp_event+0xc4/0x210
[ 7938.808025]  [<ffffffff810b7f10>] ? lock_release+0x130/0x260
[ 7938.814132]  [<ffffffff810b643f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[ 7938.821124]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[ 7938.827926]  [<ffffffff810b4f8a>] ? perf_trace_lock+0x12a/0x160
[ 7938.834291]  [<ffffffff81079c05>] ? __sigqueue_alloc+0x5/0x150
[ 7938.840541]  [<ffffffff81079d34>] ? __sigqueue_alloc+0x134/0x150
[ 7938.847009]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[ 7938.853835]  [<ffffffff8107ba4e>] ? __lock_task_sighand+0x6e/0x110
[ 7938.861174]  [<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
[ 7938.867972]  [<ffffffff8107ba58>] ? __lock_task_sighand+0x78/0x110
[ 7938.875256]  [<ffffffff8107bb4d>] ? do_send_sig_info+0x5d/0x80
[ 7938.882197]  [<ffffffff810b7f10>] lock_release+0x130/0x260
[ 7938.888754]  [<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40
[ 7938.896510]  [<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80
[ 7938.903240]  [<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0
[ 7938.910365]  [<ffffffff811f68b5>] ? send_sigio_to_task+0x5/0x1a0
[ 7938.917455]  [<ffffffff811f7176>] ? send_sigio+0x56/0x100
[ 7938.923894]  [<ffffffff811f71ce>] send_sigio+0xae/0x100
[ 7938.930132]  [<ffffffff811f72b7>] kill_fasync+0x97/0xf0
[ 7938.936359]  [<ffffffff811f722f>] ? kill_fasync+0xf/0xf0
[ 7938.942707]  [<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
[ 7938.949522]  [<ffffffff8115cfe5>] ? perf_event_wakeup+0x5/0xf0
[ 7938.956367]  [<ffffffff8115d103>] perf_pending_event+0x33/0x60
[ 7938.963199]  [<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
[ 7938.969932]  [<ffffffff8114e448>] irq_work_run+0x18/0x40
[ 7938.976212]  [<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
[ 7938.983957]  [<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80
[ 7938.991362]  <EOI>  [<ffffffff816c743b>] ? _raw_spin_unlock+0x2b/0x40
[ 7938.998883]  [<ffffffff810f0993>] ? generic_exec_single+0x133/0x190
[ 7939.006127]  [<ffffffff81156100>] ? task_clock_event_add+0x40/0x40
[ 7939.013297]  [<ffffffff810f0a57>] smp_call_function_single+0x67/0xa0
[ 7939.020568]  [<ffffffff81154d69>] task_function_call+0x49/0x60
[ 7939.027277]  [<ffffffff8115b400>] ? perf_event_sched_in+0xa0/0xa0
[ 7939.034270]  [<ffffffff81154e10>] _perf_event_enable+0x90/0xf0
[ 7939.040963]  [<ffffffff81154d80>] ? task_function_call+0x60/0x60
[ 7939.047863]  [<ffffffff81154baf>] perf_event_for_each_child+0x3f/0xb0
[ 7939.055259]  [<ffffffff8115c4ac>] perf_event_task_enable+0x7c/0xd0
[ 7939.062321]  [<ffffffff81082d4d>] SyS_prctl+0x26d/0x4d0
[ 7939.068359]  [<ffffffff816c7ead>] system_call_fastpath+0x16/0x1b
[ 7939.075261] ---[ end trace 8e72f55d0f5cbae4 ]---
[ 7939.080732] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 585.271 msecs
[ 7939.090016] perf interrupt took too long (4572449 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[ 7940.065563] INFO: rcu_sched detected stalls on CPUs/tasks: { 2} (detected by 3, t=5252 jiffies, g=10488, c=10487, q=196)
[ 7940.077820] Task dump for CPU 2:
[ 7940.081779] perf_fuzzer     R  running task        0  5692   2549 0x0000000c
[ 7940.089901]  00007ffff86e99b8 000000000000002b 0000000000000003 ffff8800a4103cd8
[ 7940.098442]  ffffffff810b643f ffff88011897a4d0 0000000000000003 ffff88011897a4d0
[ 7940.106955]  ffff8800ced65020 ffff8800a4103c98 ffffffff810b52b8 ffff8800a4103cb8
[ 7940.115473] Call Trace:
[ 7940.118593]  [<ffffffff810b643f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[ 7940.126120]  [<ffffffff810b52b8>] ? __lock_is_held+0x58/0x80
[ 7940.132660]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[ 7940.139989]  [<ffffffff810b4f8a>] ? perf_trace_lock+0x12a/0x160
[ 7940.146848]  [<ffffffff8115b485>] ? __perf_event_enable+0x85/0x2a0
[ 7940.153971]  [<ffffffff816c743b>] ? _raw_spin_unlock+0x2b/0x40
[ 7940.160738]  [<ffffffff810f0993>] ? generic_exec_single+0x133/0x190
[ 7940.167964]  [<ffffffff81156100>] ? task_clock_event_add+0x40/0x40
[ 7940.175029]  [<ffffffff810f0a57>] ? smp_call_function_single+0x67/0xa0
[ 7940.182518]  [<ffffffff81154d69>] ? task_function_call+0x49/0x60
[ 7940.189447]  [<ffffffff8115b400>] ? perf_event_sched_in+0xa0/0xa0
[ 7940.196485]  [<ffffffff81154e10>] ? _perf_event_enable+0x90/0xf0
[ 7940.203430]  [<ffffffff81154d80>] ? task_function_call+0x60/0x60
[ 7940.210386]  [<ffffffff81154baf>] ? perf_event_for_each_child+0x3f/0xb0
[ 7940.217945]  [<ffffffff8115c4ac>] ? perf_event_task_enable+0x7c/0xd0
[ 7940.225255]  [<ffffffff81082d4d>] ? SyS_prctl+0x26d/0x4d0
[ 7940.231557]  [<ffffffff816c7ead>] ? system_call_fastpath+0x16/0x1b
[ 7947.852948] perf interrupt took too long (4536733 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[ 7956.620503] perf interrupt took too long (4501296 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[ 7965.387462] perf interrupt took too long (4466134 > 38461), lowering kernel.perf_event_max_sample_rate to 3250


[145965.719358] Call Trace:
[145965.722845]  <NMI>  [<ffffffff816bfed5>] dump_stack+0x45/0x57
[145965.729907]  [<ffffffff8106ddfa>] warn_slowpath_common+0x8a/0xc0
[145965.737170]  [<ffffffff8106de76>] warn_slowpath_fmt+0x46/0x50
[145965.744103]  [<ffffffff8101db0a>] ? native_sched_clock+0x2a/0x90
[145965.751356]  [<ffffffff8111cde2>] watchdog_overflow_callback+0x92/0xc0
[145965.759125]  [<ffffffff8115dbe1>] __perf_event_overflow+0x91/0x270
[145965.766466]  [<ffffffff8102b56a>] ? x86_perf_event_set_period+0xca/0x170
[145965.774393]  [<ffffffff8115e6c9>] perf_event_overflow+0x19/0x20
[145965.781478]  [<ffffffff81032a7a>] intel_pmu_handle_irq+0x1ba/0x3a0
[145965.788793]  [<ffffffff816ca020>] ? trace_page_fault+0x30/0x30
[145965.795777]  [<ffffffff8102a45b>] perf_event_nmi_handler+0x2b/0x50
[145965.803049]  [<ffffffff81018760>] nmi_handle+0xa0/0x150
[145965.809335]  [<ffffffff810186c5>] ? nmi_handle+0x5/0x150
[145965.815730]  [<ffffffff81018a95>] default_do_nmi+0xd5/0x140
[145965.822428]  [<ffffffff81018b98>] do_nmi+0x98/0xe0
[145965.828254]  [<ffffffff816ca3d1>] end_repeat_nmi+0x1e/0x2e
[145965.834803]  [<ffffffff816c9ff0>] ? general_protection+0x30/0x30
[145965.841912]  [<ffffffff816c9ff0>] ? general_protection+0x30/0x30
[145965.849073]  [<ffffffff816c9ff0>] ? general_protection+0x30/0x30
[145965.856187]  <<EOE>>  <IRQ>  [<ffffffff8140b46b>] ? copy_user_handle_tail+0x3b/0x80
[145965.865154]  [<ffffffff8140b343>] copy_from_user_nmi+0x63/0x70
[145965.872114]  [<ffffffff8102c210>] perf_callchain_user+0xd0/0x240
[145965.879227]  [<ffffffff81161508>] perf_callchain+0x1f8/0x220
[145965.885923]  [<ffffffff8115da21>] perf_prepare_sample+0x291/0x3c0
[145965.893151]  [<ffffffff8115dc5e>] __perf_event_overflow+0x10e/0x270
[145965.900562]  [<ffffffff8115dc29>] ? __perf_event_overflow+0xd9/0x270
[145965.908078]  [<ffffffff810b7f10>] ? lock_release+0x130/0x260
[145965.914842]  [<ffffffff8115df89>] perf_swevent_overflow+0xa9/0xe0
[145965.922072]  [<ffffffff8115e027>] perf_swevent_event+0x67/0x90
[145965.929025]  [<ffffffff8115e114>] perf_tp_event+0xc4/0x210
[145965.935615]  [<ffffffff810b7f10>] ? lock_release+0x130/0x260
[145965.942395]  [<ffffffff8107ba4e>] ? __lock_task_sighand+0x6e/0x110
[145965.949648]  [<ffffffff810b4f8a>] ? perf_trace_lock+0x12a/0x160
[145965.956671]  [<ffffffff8107ba58>] ? __lock_task_sighand+0x78/0x110
[145965.963919]  [<ffffffff8107bb4d>] ? do_send_sig_info+0x5d/0x80
[145965.970858]  [<ffffffff8107bb4d>] ? do_send_sig_info+0x5d/0x80
[145965.977849]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[145965.985330]  [<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
[145965.992195]  [<ffffffff811f7176>] ? send_sigio+0x56/0x100
[145965.998659]  [<ffffffff811f71f0>] ? send_sigio+0xd0/0x100
[145966.005169]  [<ffffffff810b7f10>] lock_release+0x130/0x260
[145966.011716]  [<ffffffff816c76e3>] _raw_read_unlock+0x23/0x40
[145966.018442]  [<ffffffff811f71f0>] send_sigio+0xd0/0x100
[145966.024700]  [<ffffffff811f72b7>] kill_fasync+0x97/0xf0
[145966.030968]  [<ffffffff811f722f>] ? kill_fasync+0xf/0xf0
[145966.037365]  [<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
[145966.044210]  [<ffffffff8115cfe5>] ? perf_event_wakeup+0x5/0xf0
[145966.051060]  [<ffffffff8115d103>] perf_pending_event+0x33/0x60
[145966.057950]  [<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
[145966.064798]  [<ffffffff8114e448>] irq_work_run+0x18/0x40
[145966.071127]  [<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
[145966.078929]  [<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80
[145966.086359]  <EOI>  [<ffffffff816c74c2>] ? _raw_spin_unlock_irq+0x32/0x40
[145966.094337]  [<ffffffff816c74bb>] ? _raw_spin_unlock_irq+0x2b/0x40
[145966.101634]  [<ffffffff81092f7c>] finish_task_switch+0x8c/0x140
[145966.108630]  [<ffffffff81092f3d>] ? finish_task_switch+0x4d/0x140
[145966.115793]  [<ffffffff816c1c06>] __schedule+0x2d6/0x9f0
[145966.122115]  [<ffffffff810b4e89>] ? perf_trace_lock+0x29/0x160
[145966.128909]  [<ffffffff816c2357>] schedule+0x37/0x90
[145966.134802]  [<ffffffff816c67ac>] schedule_timeout+0x1bc/0x250
[145966.141578]  [<ffffffff816c37f8>] ? wait_for_completion+0x28/0x100
[145966.148660]  [<ffffffff816c3894>] ? wait_for_completion+0xc4/0x100
[145966.155795]  [<ffffffff816c389c>] wait_for_completion+0xcc/0x100
[145966.162731]  [<ffffffff8109ae80>] ? wake_up_state+0x20/0x20
[145966.169205]  [<ffffffff810d4830>] ? call_rcu_bh+0x20/0x20
[145966.175464]  [<ffffffff810d12fd>] wait_rcu_gp+0x5d/0x80
[145966.181586]  [<ffffffff810d1240>] ? ftrace_raw_output_rcu_utilization+0x70/0x70
[145966.189890]  [<ffffffff810d352a>] synchronize_sched+0x3a/0x50
[145966.196573]  [<ffffffff8114083b>] perf_trace_event_unreg.isra.1+0x3b/0x90
[145966.204345]  [<ffffffff81140bac>] perf_trace_destroy+0x3c/0x50
[145966.211075]  [<ffffffff81155aee>] tp_perf_event_destroy+0xe/0x10
[145966.217976]  [<ffffffff8115721c>] __free_event+0x2c/0x80
[145966.224102]  [<ffffffff8115f5f9>] SYSC_perf_event_open+0x629/0xc80
[145966.231246]  [<ffffffff8116009e>] SyS_perf_event_open+0xe/0x10
[145966.237982]  [<ffffffff816c7ead>] system_call_fastpath+0x16/0x1b
[145966.244932] ---[ end trace 7088940f25646d60 ]---

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-03-23 12:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-19 16:54 perf: fuzzer gets CPU stuck in perf_callchain() Vince Weaver
2015-02-19 16:57 ` Peter Zijlstra
2015-02-19 17:05   ` Vince Weaver
2015-02-19 17:24     ` Vince Weaver
2015-02-19 17:32       ` Vince Weaver
2015-02-19 17:03 ` Peter Zijlstra
2015-02-24  3:33   ` Vince Weaver
2015-03-23 12:23   ` [tip:perf/urgent] perf: Fix irq_work 'tail' recursion tip-bot for Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).