All of lore.kernel.org
 help / color / mirror / Atom feed
* perf: fuzzer gets CPU stuck in perf_callchain()
@ 2015-02-19 16:54 Vince Weaver
  2015-02-19 16:57 ` Peter Zijlstra
  2015-02-19 17:03 ` Peter Zijlstra
  0 siblings, 2 replies; 8+ messages in thread
From: Vince Weaver @ 2015-02-19 16:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Jiri Olsa, Steven Rostedt


Another bug found by the perf_fuzzer().  I think this one is different
than the one I sent the other day, it looks like something is going
very wrong in perf_callchain().

This one is reasonably reproducible, if there's any extra debugging that I 
can add.  This is on a Haswell machine with git as of a few days ago.

There are two traces below; they were generated by the same random seed
across different reboots and the trigger is slightly different (although 
both look like they hit the same bug in the callchain code).

[ 7938.495396] ------------[ cut here ]------------
[ 7938.500373] WARNING: CPU: 2 PID: 5692 at kernel/watchdog.c:290 watchdog_overflow_callback+0x92/0xc0()
[ 7938.510337] Watchdog detected hard LOCKUP on cpu 2
[ 7938.515301] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_controller snd_hda_codec i915 aes_x86_64 lrw gf128mul ppdev iTCO_wdt xhci_pci snd_hwdep snd_pcm tpm_tis evdev glue_helper iTCO_vendor_support xhci_hcd snd_timer mei_me drm_kms_helper ablk_helper snd tpm mei drm psmouse serio_raw pcspkr parport_pc cryptd soundcore lpc_ich parport wmi mfd_core battery i2c_i801 i2c_algo_bit button processor video sg sr_mod sd_mod cdrom ahci libahci ehci_pci libata ehci_hcd e1000e ptp crc32c_intel scsi_mod usbcore usb_common pps_core fan thermal thermal_sys
[ 7938.586970] CPU: 2 PID: 5692 Comm: perf_fuzzer Not tainted 3.19.0+ #127
[ 7938.594053] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 7938.601990]  ffffffff81a41afe ffff88011ea85aa0 ffffffff816bfed5 0000000000000000
[ 7938.610056]  ffff88011ea85af0 ffff88011ea85ae0 ffffffff8106ddfa 0000000000000000
[ 7938.618102]  ffff8801195a1800 0000000000000000 ffff88011ea85c40 ffff88011ea85ef8
[ 7938.626105] Call Trace:
[ 7938.628735]  <NMI>  [<ffffffff816bfed5>] dump_stack+0x45/0x57
[ 7938.634940]  [<ffffffff8106ddfa>] warn_slowpath_common+0x8a/0xc0
[ 7938.641394]  [<ffffffff8106de76>] warn_slowpath_fmt+0x46/0x50
[ 7938.647566]  [<ffffffff8111cde2>] watchdog_overflow_callback+0x92/0xc0
[ 7938.654617]  [<ffffffff8115dbe1>] __perf_event_overflow+0x91/0x270
[ 7938.661233]  [<ffffffff8102b56a>] ? x86_perf_event_set_period+0xca/0x170
[ 7938.668419]  [<ffffffff8115e6c9>] perf_event_overflow+0x19/0x20
[ 7938.674764]  [<ffffffff81032a7a>] intel_pmu_handle_irq+0x1ba/0x3a0
[ 7938.681432]  [<ffffffff8102a45b>] perf_event_nmi_handler+0x2b/0x50
[ 7938.688055]  [<ffffffff81018760>] nmi_handle+0xa0/0x150
[ 7938.693681]  [<ffffffff810186c5>] ? nmi_handle+0x5/0x150
[ 7938.699405]  [<ffffffff81018a0a>] default_do_nmi+0x4a/0x140
[ 7938.705429]  [<ffffffff81018b98>] do_nmi+0x98/0xe0
[ 7938.710573]  [<ffffffff816ca3d1>] end_repeat_nmi+0x1e/0x2e
[ 7938.716470]  [<ffffffff8140b300>] ? copy_from_user_nmi+0x20/0x70
[ 7938.722929]  [<ffffffff8140b300>] ? copy_from_user_nmi+0x20/0x70
[ 7938.729386]  [<ffffffff8140b300>] ? copy_from_user_nmi+0x20/0x70
[ 7938.735862]  <<EOE>>  <IRQ>  [<ffffffff8102c210>] ? perf_callchain_user+0xd0/0x240
[ 7938.744067]  [<ffffffff81161508>] perf_callchain+0x1f8/0x220
[ 7938.750152]  [<ffffffff8115da21>] perf_prepare_sample+0x291/0x3c0
[ 7938.756700]  [<ffffffff8115dc5e>] __perf_event_overflow+0x10e/0x270
[ 7938.763412]  [<ffffffff8115dc29>] ? __perf_event_overflow+0xd9/0x270
[ 7938.770254]  [<ffffffff8115e114>] ? perf_tp_event+0xc4/0x210
[ 7938.776302]  [<ffffffff8115da7c>] ? perf_prepare_sample+0x2ec/0x3c0
[ 7938.783021]  [<ffffffff816ca018>] ? trace_page_fault+0x28/0x30
[ 7938.789319]  [<ffffffff8115df89>] perf_swevent_overflow+0xa9/0xe0
[ 7938.795845]  [<ffffffff8115e027>] perf_swevent_event+0x67/0x90
[ 7938.802139]  [<ffffffff8115e114>] perf_tp_event+0xc4/0x210
[ 7938.808025]  [<ffffffff810b7f10>] ? lock_release+0x130/0x260
[ 7938.814132]  [<ffffffff810b643f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[ 7938.821124]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[ 7938.827926]  [<ffffffff810b4f8a>] ? perf_trace_lock+0x12a/0x160
[ 7938.834291]  [<ffffffff81079c05>] ? __sigqueue_alloc+0x5/0x150
[ 7938.840541]  [<ffffffff81079d34>] ? __sigqueue_alloc+0x134/0x150
[ 7938.847009]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[ 7938.853835]  [<ffffffff8107ba4e>] ? __lock_task_sighand+0x6e/0x110
[ 7938.861174]  [<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
[ 7938.867972]  [<ffffffff8107ba58>] ? __lock_task_sighand+0x78/0x110
[ 7938.875256]  [<ffffffff8107bb4d>] ? do_send_sig_info+0x5d/0x80
[ 7938.882197]  [<ffffffff810b7f10>] lock_release+0x130/0x260
[ 7938.888754]  [<ffffffff816c7474>] _raw_spin_unlock_irqrestore+0x24/0x40
[ 7938.896510]  [<ffffffff8107bb4d>] do_send_sig_info+0x5d/0x80
[ 7938.903240]  [<ffffffff811f69df>] send_sigio_to_task+0x12f/0x1a0
[ 7938.910365]  [<ffffffff811f68b5>] ? send_sigio_to_task+0x5/0x1a0
[ 7938.917455]  [<ffffffff811f7176>] ? send_sigio+0x56/0x100
[ 7938.923894]  [<ffffffff811f71ce>] send_sigio+0xae/0x100
[ 7938.930132]  [<ffffffff811f72b7>] kill_fasync+0x97/0xf0
[ 7938.936359]  [<ffffffff811f722f>] ? kill_fasync+0xf/0xf0
[ 7938.942707]  [<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
[ 7938.949522]  [<ffffffff8115cfe5>] ? perf_event_wakeup+0x5/0xf0
[ 7938.956367]  [<ffffffff8115d103>] perf_pending_event+0x33/0x60
[ 7938.963199]  [<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
[ 7938.969932]  [<ffffffff8114e448>] irq_work_run+0x18/0x40
[ 7938.976212]  [<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
[ 7938.983957]  [<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80
[ 7938.991362]  <EOI>  [<ffffffff816c743b>] ? _raw_spin_unlock+0x2b/0x40
[ 7938.998883]  [<ffffffff810f0993>] ? generic_exec_single+0x133/0x190
[ 7939.006127]  [<ffffffff81156100>] ? task_clock_event_add+0x40/0x40
[ 7939.013297]  [<ffffffff810f0a57>] smp_call_function_single+0x67/0xa0
[ 7939.020568]  [<ffffffff81154d69>] task_function_call+0x49/0x60
[ 7939.027277]  [<ffffffff8115b400>] ? perf_event_sched_in+0xa0/0xa0
[ 7939.034270]  [<ffffffff81154e10>] _perf_event_enable+0x90/0xf0
[ 7939.040963]  [<ffffffff81154d80>] ? task_function_call+0x60/0x60
[ 7939.047863]  [<ffffffff81154baf>] perf_event_for_each_child+0x3f/0xb0
[ 7939.055259]  [<ffffffff8115c4ac>] perf_event_task_enable+0x7c/0xd0
[ 7939.062321]  [<ffffffff81082d4d>] SyS_prctl+0x26d/0x4d0
[ 7939.068359]  [<ffffffff816c7ead>] system_call_fastpath+0x16/0x1b
[ 7939.075261] ---[ end trace 8e72f55d0f5cbae4 ]---
[ 7939.080732] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 585.271 msecs
[ 7939.090016] perf interrupt took too long (4572449 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
[ 7940.065563] INFO: rcu_sched detected stalls on CPUs/tasks: { 2} (detected by 3, t=5252 jiffies, g=10488, c=10487, q=196)
[ 7940.077820] Task dump for CPU 2:
[ 7940.081779] perf_fuzzer     R  running task        0  5692   2549 0x0000000c
[ 7940.089901]  00007ffff86e99b8 000000000000002b 0000000000000003 ffff8800a4103cd8
[ 7940.098442]  ffffffff810b643f ffff88011897a4d0 0000000000000003 ffff88011897a4d0
[ 7940.106955]  ffff8800ced65020 ffff8800a4103c98 ffffffff810b52b8 ffff8800a4103cb8
[ 7940.115473] Call Trace:
[ 7940.118593]  [<ffffffff810b643f>] ? __lock_acquire.isra.31+0x3af/0xfe0
[ 7940.126120]  [<ffffffff810b52b8>] ? __lock_is_held+0x58/0x80
[ 7940.132660]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[ 7940.139989]  [<ffffffff810b4f8a>] ? perf_trace_lock+0x12a/0x160
[ 7940.146848]  [<ffffffff8115b485>] ? __perf_event_enable+0x85/0x2a0
[ 7940.153971]  [<ffffffff816c743b>] ? _raw_spin_unlock+0x2b/0x40
[ 7940.160738]  [<ffffffff810f0993>] ? generic_exec_single+0x133/0x190
[ 7940.167964]  [<ffffffff81156100>] ? task_clock_event_add+0x40/0x40
[ 7940.175029]  [<ffffffff810f0a57>] ? smp_call_function_single+0x67/0xa0
[ 7940.182518]  [<ffffffff81154d69>] ? task_function_call+0x49/0x60
[ 7940.189447]  [<ffffffff8115b400>] ? perf_event_sched_in+0xa0/0xa0
[ 7940.196485]  [<ffffffff81154e10>] ? _perf_event_enable+0x90/0xf0
[ 7940.203430]  [<ffffffff81154d80>] ? task_function_call+0x60/0x60
[ 7940.210386]  [<ffffffff81154baf>] ? perf_event_for_each_child+0x3f/0xb0
[ 7940.217945]  [<ffffffff8115c4ac>] ? perf_event_task_enable+0x7c/0xd0
[ 7940.225255]  [<ffffffff81082d4d>] ? SyS_prctl+0x26d/0x4d0
[ 7940.231557]  [<ffffffff816c7ead>] ? system_call_fastpath+0x16/0x1b
[ 7947.852948] perf interrupt took too long (4536733 > 10000), lowering kernel.perf_event_max_sample_rate to 12500
[ 7956.620503] perf interrupt took too long (4501296 > 20000), lowering kernel.perf_event_max_sample_rate to 6250
[ 7965.387462] perf interrupt took too long (4466134 > 38461), lowering kernel.perf_event_max_sample_rate to 3250


[145965.719358] Call Trace:
[145965.722845]  <NMI>  [<ffffffff816bfed5>] dump_stack+0x45/0x57
[145965.729907]  [<ffffffff8106ddfa>] warn_slowpath_common+0x8a/0xc0
[145965.737170]  [<ffffffff8106de76>] warn_slowpath_fmt+0x46/0x50
[145965.744103]  [<ffffffff8101db0a>] ? native_sched_clock+0x2a/0x90
[145965.751356]  [<ffffffff8111cde2>] watchdog_overflow_callback+0x92/0xc0
[145965.759125]  [<ffffffff8115dbe1>] __perf_event_overflow+0x91/0x270
[145965.766466]  [<ffffffff8102b56a>] ? x86_perf_event_set_period+0xca/0x170
[145965.774393]  [<ffffffff8115e6c9>] perf_event_overflow+0x19/0x20
[145965.781478]  [<ffffffff81032a7a>] intel_pmu_handle_irq+0x1ba/0x3a0
[145965.788793]  [<ffffffff816ca020>] ? trace_page_fault+0x30/0x30
[145965.795777]  [<ffffffff8102a45b>] perf_event_nmi_handler+0x2b/0x50
[145965.803049]  [<ffffffff81018760>] nmi_handle+0xa0/0x150
[145965.809335]  [<ffffffff810186c5>] ? nmi_handle+0x5/0x150
[145965.815730]  [<ffffffff81018a95>] default_do_nmi+0xd5/0x140
[145965.822428]  [<ffffffff81018b98>] do_nmi+0x98/0xe0
[145965.828254]  [<ffffffff816ca3d1>] end_repeat_nmi+0x1e/0x2e
[145965.834803]  [<ffffffff816c9ff0>] ? general_protection+0x30/0x30
[145965.841912]  [<ffffffff816c9ff0>] ? general_protection+0x30/0x30
[145965.849073]  [<ffffffff816c9ff0>] ? general_protection+0x30/0x30
[145965.856187]  <<EOE>>  <IRQ>  [<ffffffff8140b46b>] ? copy_user_handle_tail+0x3b/0x80
[145965.865154]  [<ffffffff8140b343>] copy_from_user_nmi+0x63/0x70
[145965.872114]  [<ffffffff8102c210>] perf_callchain_user+0xd0/0x240
[145965.879227]  [<ffffffff81161508>] perf_callchain+0x1f8/0x220
[145965.885923]  [<ffffffff8115da21>] perf_prepare_sample+0x291/0x3c0
[145965.893151]  [<ffffffff8115dc5e>] __perf_event_overflow+0x10e/0x270
[145965.900562]  [<ffffffff8115dc29>] ? __perf_event_overflow+0xd9/0x270
[145965.908078]  [<ffffffff810b7f10>] ? lock_release+0x130/0x260
[145965.914842]  [<ffffffff8115df89>] perf_swevent_overflow+0xa9/0xe0
[145965.922072]  [<ffffffff8115e027>] perf_swevent_event+0x67/0x90
[145965.929025]  [<ffffffff8115e114>] perf_tp_event+0xc4/0x210
[145965.935615]  [<ffffffff810b7f10>] ? lock_release+0x130/0x260
[145965.942395]  [<ffffffff8107ba4e>] ? __lock_task_sighand+0x6e/0x110
[145965.949648]  [<ffffffff810b4f8a>] ? perf_trace_lock+0x12a/0x160
[145965.956671]  [<ffffffff8107ba58>] ? __lock_task_sighand+0x78/0x110
[145965.963919]  [<ffffffff8107bb4d>] ? do_send_sig_info+0x5d/0x80
[145965.970858]  [<ffffffff8107bb4d>] ? do_send_sig_info+0x5d/0x80
[145965.977849]  [<ffffffff811406d8>] ? perf_trace_buf_prepare+0x98/0xf0
[145965.985330]  [<ffffffff810b4f8a>] perf_trace_lock+0x12a/0x160
[145965.992195]  [<ffffffff811f7176>] ? send_sigio+0x56/0x100
[145965.998659]  [<ffffffff811f71f0>] ? send_sigio+0xd0/0x100
[145966.005169]  [<ffffffff810b7f10>] lock_release+0x130/0x260
[145966.011716]  [<ffffffff816c76e3>] _raw_read_unlock+0x23/0x40
[145966.018442]  [<ffffffff811f71f0>] send_sigio+0xd0/0x100
[145966.024700]  [<ffffffff811f72b7>] kill_fasync+0x97/0xf0
[145966.030968]  [<ffffffff811f722f>] ? kill_fasync+0xf/0xf0
[145966.037365]  [<ffffffff8115d0b4>] perf_event_wakeup+0xd4/0xf0
[145966.044210]  [<ffffffff8115cfe5>] ? perf_event_wakeup+0x5/0xf0
[145966.051060]  [<ffffffff8115d103>] perf_pending_event+0x33/0x60
[145966.057950]  [<ffffffff8114e3fc>] irq_work_run_list+0x4c/0x80
[145966.064798]  [<ffffffff8114e448>] irq_work_run+0x18/0x40
[145966.071127]  [<ffffffff810196af>] smp_trace_irq_work_interrupt+0x3f/0xc0
[145966.078929]  [<ffffffff816c99bd>] trace_irq_work_interrupt+0x6d/0x80
[145966.086359]  <EOI>  [<ffffffff816c74c2>] ? _raw_spin_unlock_irq+0x32/0x40
[145966.094337]  [<ffffffff816c74bb>] ? _raw_spin_unlock_irq+0x2b/0x40
[145966.101634]  [<ffffffff81092f7c>] finish_task_switch+0x8c/0x140
[145966.108630]  [<ffffffff81092f3d>] ? finish_task_switch+0x4d/0x140
[145966.115793]  [<ffffffff816c1c06>] __schedule+0x2d6/0x9f0
[145966.122115]  [<ffffffff810b4e89>] ? perf_trace_lock+0x29/0x160
[145966.128909]  [<ffffffff816c2357>] schedule+0x37/0x90
[145966.134802]  [<ffffffff816c67ac>] schedule_timeout+0x1bc/0x250
[145966.141578]  [<ffffffff816c37f8>] ? wait_for_completion+0x28/0x100
[145966.148660]  [<ffffffff816c3894>] ? wait_for_completion+0xc4/0x100
[145966.155795]  [<ffffffff816c389c>] wait_for_completion+0xcc/0x100
[145966.162731]  [<ffffffff8109ae80>] ? wake_up_state+0x20/0x20
[145966.169205]  [<ffffffff810d4830>] ? call_rcu_bh+0x20/0x20
[145966.175464]  [<ffffffff810d12fd>] wait_rcu_gp+0x5d/0x80
[145966.181586]  [<ffffffff810d1240>] ? ftrace_raw_output_rcu_utilization+0x70/0x70
[145966.189890]  [<ffffffff810d352a>] synchronize_sched+0x3a/0x50
[145966.196573]  [<ffffffff8114083b>] perf_trace_event_unreg.isra.1+0x3b/0x90
[145966.204345]  [<ffffffff81140bac>] perf_trace_destroy+0x3c/0x50
[145966.211075]  [<ffffffff81155aee>] tp_perf_event_destroy+0xe/0x10
[145966.217976]  [<ffffffff8115721c>] __free_event+0x2c/0x80
[145966.224102]  [<ffffffff8115f5f9>] SYSC_perf_event_open+0x629/0xc80
[145966.231246]  [<ffffffff8116009e>] SyS_perf_event_open+0xe/0x10
[145966.237982]  [<ffffffff816c7ead>] system_call_fastpath+0x16/0x1b
[145966.244932] ---[ end trace 7088940f25646d60 ]---

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2015-03-23 12:24 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-19 16:54 perf: fuzzer gets CPU stuck in perf_callchain() Vince Weaver
2015-02-19 16:57 ` Peter Zijlstra
2015-02-19 17:05   ` Vince Weaver
2015-02-19 17:24     ` Vince Weaver
2015-02-19 17:32       ` Vince Weaver
2015-02-19 17:03 ` Peter Zijlstra
2015-02-24  3:33   ` Vince Weaver
2015-03-23 12:23   ` [tip:perf/urgent] perf: Fix irq_work 'tail' recursion tip-bot for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.