From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752958AbbBSQv2 (ORCPT ); Thu, 19 Feb 2015 11:51:28 -0500 Received: from mail-qg0-f48.google.com ([209.85.192.48]:51430 "EHLO mail-qg0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751783AbbBSQvY (ORCPT ); Thu, 19 Feb 2015 11:51:24 -0500 From: Vince Weaver X-Google-Original-From: Vince Weaver Date: Thu, 19 Feb 2015 11:54:40 -0500 (EST) To: linux-kernel@vger.kernel.org cc: Peter Zijlstra , Paul Mackerras , Ingo Molnar , Arnaldo Carvalho de Melo , Jiri Olsa , Steven Rostedt Subject: perf: fuzzer gets CPU stuck in perf_callchain() Message-ID: User-Agent: Alpine 2.11 (DEB 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Another bug found by the perf_fuzzer(). I think this one is different than the one I sent the other day, it looks like something is going very wrong in perf_callchain(). This one is reasonably reproducible, if there's any extra debugging that I can add. This is on a Haswell machine with git as of a few days ago. There are two traces below; they were generated by the same random seed across different reboots and the trigger is slightly different (although both look like they hit the same bug in the callchain code). [ 7938.495396] ------------[ cut here ]------------ [ 7938.500373] WARNING: CPU: 2 PID: 5692 at kernel/watchdog.c:290 watchdog_overflow_callback+0x92/0xc0() [ 7938.510337] Watchdog detected hard LOCKUP on cpu 2 [ 7938.515301] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel snd_hda_controller snd_hda_codec i915 aes_x86_64 lrw gf128mul ppdev iTCO_wdt xhci_pci snd_hwdep snd_pcm tpm_tis evdev glue_helper iTCO_vendor_support xhci_hcd snd_timer mei_me drm_kms_helper ablk_helper snd tpm mei drm psmouse serio_raw pcspkr parport_pc cryptd soundcore lpc_ich parport wmi mfd_core battery i2c_i801 i2c_algo_bit button processor video sg sr_mod sd_mod cdrom ahci libahci ehci_pci libata ehci_hcd e1000e ptp crc32c_intel scsi_mod usbcore usb_common pps_core fan thermal thermal_sys [ 7938.586970] CPU: 2 PID: 5692 Comm: perf_fuzzer Not tainted 3.19.0+ #127 [ 7938.594053] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014 [ 7938.601990] ffffffff81a41afe ffff88011ea85aa0 ffffffff816bfed5 0000000000000000 [ 7938.610056] ffff88011ea85af0 ffff88011ea85ae0 ffffffff8106ddfa 0000000000000000 [ 7938.618102] ffff8801195a1800 0000000000000000 ffff88011ea85c40 ffff88011ea85ef8 [ 7938.626105] Call Trace: [ 7938.628735] [] dump_stack+0x45/0x57 [ 7938.634940] [] warn_slowpath_common+0x8a/0xc0 [ 7938.641394] [] warn_slowpath_fmt+0x46/0x50 [ 7938.647566] [] watchdog_overflow_callback+0x92/0xc0 [ 7938.654617] [] __perf_event_overflow+0x91/0x270 [ 7938.661233] [] ? x86_perf_event_set_period+0xca/0x170 [ 7938.668419] [] perf_event_overflow+0x19/0x20 [ 7938.674764] [] intel_pmu_handle_irq+0x1ba/0x3a0 [ 7938.681432] [] perf_event_nmi_handler+0x2b/0x50 [ 7938.688055] [] nmi_handle+0xa0/0x150 [ 7938.693681] [] ? nmi_handle+0x5/0x150 [ 7938.699405] [] default_do_nmi+0x4a/0x140 [ 7938.705429] [] do_nmi+0x98/0xe0 [ 7938.710573] [] end_repeat_nmi+0x1e/0x2e [ 7938.716470] [] ? copy_from_user_nmi+0x20/0x70 [ 7938.722929] [] ? copy_from_user_nmi+0x20/0x70 [ 7938.729386] [] ? copy_from_user_nmi+0x20/0x70 [ 7938.735862] <> [] ? perf_callchain_user+0xd0/0x240 [ 7938.744067] [] perf_callchain+0x1f8/0x220 [ 7938.750152] [] perf_prepare_sample+0x291/0x3c0 [ 7938.756700] [] __perf_event_overflow+0x10e/0x270 [ 7938.763412] [] ? __perf_event_overflow+0xd9/0x270 [ 7938.770254] [] ? perf_tp_event+0xc4/0x210 [ 7938.776302] [] ? perf_prepare_sample+0x2ec/0x3c0 [ 7938.783021] [] ? trace_page_fault+0x28/0x30 [ 7938.789319] [] perf_swevent_overflow+0xa9/0xe0 [ 7938.795845] [] perf_swevent_event+0x67/0x90 [ 7938.802139] [] perf_tp_event+0xc4/0x210 [ 7938.808025] [] ? lock_release+0x130/0x260 [ 7938.814132] [] ? __lock_acquire.isra.31+0x3af/0xfe0 [ 7938.821124] [] ? perf_trace_buf_prepare+0x98/0xf0 [ 7938.827926] [] ? perf_trace_lock+0x12a/0x160 [ 7938.834291] [] ? __sigqueue_alloc+0x5/0x150 [ 7938.840541] [] ? __sigqueue_alloc+0x134/0x150 [ 7938.847009] [] ? perf_trace_buf_prepare+0x98/0xf0 [ 7938.853835] [] ? __lock_task_sighand+0x6e/0x110 [ 7938.861174] [] perf_trace_lock+0x12a/0x160 [ 7938.867972] [] ? __lock_task_sighand+0x78/0x110 [ 7938.875256] [] ? do_send_sig_info+0x5d/0x80 [ 7938.882197] [] lock_release+0x130/0x260 [ 7938.888754] [] _raw_spin_unlock_irqrestore+0x24/0x40 [ 7938.896510] [] do_send_sig_info+0x5d/0x80 [ 7938.903240] [] send_sigio_to_task+0x12f/0x1a0 [ 7938.910365] [] ? send_sigio_to_task+0x5/0x1a0 [ 7938.917455] [] ? send_sigio+0x56/0x100 [ 7938.923894] [] send_sigio+0xae/0x100 [ 7938.930132] [] kill_fasync+0x97/0xf0 [ 7938.936359] [] ? kill_fasync+0xf/0xf0 [ 7938.942707] [] perf_event_wakeup+0xd4/0xf0 [ 7938.949522] [] ? perf_event_wakeup+0x5/0xf0 [ 7938.956367] [] perf_pending_event+0x33/0x60 [ 7938.963199] [] irq_work_run_list+0x4c/0x80 [ 7938.969932] [] irq_work_run+0x18/0x40 [ 7938.976212] [] smp_trace_irq_work_interrupt+0x3f/0xc0 [ 7938.983957] [] trace_irq_work_interrupt+0x6d/0x80 [ 7938.991362] [] ? _raw_spin_unlock+0x2b/0x40 [ 7938.998883] [] ? generic_exec_single+0x133/0x190 [ 7939.006127] [] ? task_clock_event_add+0x40/0x40 [ 7939.013297] [] smp_call_function_single+0x67/0xa0 [ 7939.020568] [] task_function_call+0x49/0x60 [ 7939.027277] [] ? perf_event_sched_in+0xa0/0xa0 [ 7939.034270] [] _perf_event_enable+0x90/0xf0 [ 7939.040963] [] ? task_function_call+0x60/0x60 [ 7939.047863] [] perf_event_for_each_child+0x3f/0xb0 [ 7939.055259] [] perf_event_task_enable+0x7c/0xd0 [ 7939.062321] [] SyS_prctl+0x26d/0x4d0 [ 7939.068359] [] system_call_fastpath+0x16/0x1b [ 7939.075261] ---[ end trace 8e72f55d0f5cbae4 ]--- [ 7939.080732] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 585.271 msecs [ 7939.090016] perf interrupt took too long (4572449 > 5000), lowering kernel.perf_event_max_sample_rate to 25000 [ 7940.065563] INFO: rcu_sched detected stalls on CPUs/tasks: { 2} (detected by 3, t=5252 jiffies, g=10488, c=10487, q=196) [ 7940.077820] Task dump for CPU 2: [ 7940.081779] perf_fuzzer R running task 0 5692 2549 0x0000000c [ 7940.089901] 00007ffff86e99b8 000000000000002b 0000000000000003 ffff8800a4103cd8 [ 7940.098442] ffffffff810b643f ffff88011897a4d0 0000000000000003 ffff88011897a4d0 [ 7940.106955] ffff8800ced65020 ffff8800a4103c98 ffffffff810b52b8 ffff8800a4103cb8 [ 7940.115473] Call Trace: [ 7940.118593] [] ? __lock_acquire.isra.31+0x3af/0xfe0 [ 7940.126120] [] ? __lock_is_held+0x58/0x80 [ 7940.132660] [] ? perf_trace_buf_prepare+0x98/0xf0 [ 7940.139989] [] ? perf_trace_lock+0x12a/0x160 [ 7940.146848] [] ? __perf_event_enable+0x85/0x2a0 [ 7940.153971] [] ? _raw_spin_unlock+0x2b/0x40 [ 7940.160738] [] ? generic_exec_single+0x133/0x190 [ 7940.167964] [] ? task_clock_event_add+0x40/0x40 [ 7940.175029] [] ? smp_call_function_single+0x67/0xa0 [ 7940.182518] [] ? task_function_call+0x49/0x60 [ 7940.189447] [] ? perf_event_sched_in+0xa0/0xa0 [ 7940.196485] [] ? _perf_event_enable+0x90/0xf0 [ 7940.203430] [] ? task_function_call+0x60/0x60 [ 7940.210386] [] ? perf_event_for_each_child+0x3f/0xb0 [ 7940.217945] [] ? perf_event_task_enable+0x7c/0xd0 [ 7940.225255] [] ? SyS_prctl+0x26d/0x4d0 [ 7940.231557] [] ? system_call_fastpath+0x16/0x1b [ 7947.852948] perf interrupt took too long (4536733 > 10000), lowering kernel.perf_event_max_sample_rate to 12500 [ 7956.620503] perf interrupt took too long (4501296 > 20000), lowering kernel.perf_event_max_sample_rate to 6250 [ 7965.387462] perf interrupt took too long (4466134 > 38461), lowering kernel.perf_event_max_sample_rate to 3250 [145965.719358] Call Trace: [145965.722845] [] dump_stack+0x45/0x57 [145965.729907] [] warn_slowpath_common+0x8a/0xc0 [145965.737170] [] warn_slowpath_fmt+0x46/0x50 [145965.744103] [] ? native_sched_clock+0x2a/0x90 [145965.751356] [] watchdog_overflow_callback+0x92/0xc0 [145965.759125] [] __perf_event_overflow+0x91/0x270 [145965.766466] [] ? x86_perf_event_set_period+0xca/0x170 [145965.774393] [] perf_event_overflow+0x19/0x20 [145965.781478] [] intel_pmu_handle_irq+0x1ba/0x3a0 [145965.788793] [] ? trace_page_fault+0x30/0x30 [145965.795777] [] perf_event_nmi_handler+0x2b/0x50 [145965.803049] [] nmi_handle+0xa0/0x150 [145965.809335] [] ? nmi_handle+0x5/0x150 [145965.815730] [] default_do_nmi+0xd5/0x140 [145965.822428] [] do_nmi+0x98/0xe0 [145965.828254] [] end_repeat_nmi+0x1e/0x2e [145965.834803] [] ? general_protection+0x30/0x30 [145965.841912] [] ? general_protection+0x30/0x30 [145965.849073] [] ? general_protection+0x30/0x30 [145965.856187] <> [] ? copy_user_handle_tail+0x3b/0x80 [145965.865154] [] copy_from_user_nmi+0x63/0x70 [145965.872114] [] perf_callchain_user+0xd0/0x240 [145965.879227] [] perf_callchain+0x1f8/0x220 [145965.885923] [] perf_prepare_sample+0x291/0x3c0 [145965.893151] [] __perf_event_overflow+0x10e/0x270 [145965.900562] [] ? __perf_event_overflow+0xd9/0x270 [145965.908078] [] ? lock_release+0x130/0x260 [145965.914842] [] perf_swevent_overflow+0xa9/0xe0 [145965.922072] [] perf_swevent_event+0x67/0x90 [145965.929025] [] perf_tp_event+0xc4/0x210 [145965.935615] [] ? lock_release+0x130/0x260 [145965.942395] [] ? __lock_task_sighand+0x6e/0x110 [145965.949648] [] ? perf_trace_lock+0x12a/0x160 [145965.956671] [] ? __lock_task_sighand+0x78/0x110 [145965.963919] [] ? do_send_sig_info+0x5d/0x80 [145965.970858] [] ? do_send_sig_info+0x5d/0x80 [145965.977849] [] ? perf_trace_buf_prepare+0x98/0xf0 [145965.985330] [] perf_trace_lock+0x12a/0x160 [145965.992195] [] ? send_sigio+0x56/0x100 [145965.998659] [] ? send_sigio+0xd0/0x100 [145966.005169] [] lock_release+0x130/0x260 [145966.011716] [] _raw_read_unlock+0x23/0x40 [145966.018442] [] send_sigio+0xd0/0x100 [145966.024700] [] kill_fasync+0x97/0xf0 [145966.030968] [] ? kill_fasync+0xf/0xf0 [145966.037365] [] perf_event_wakeup+0xd4/0xf0 [145966.044210] [] ? perf_event_wakeup+0x5/0xf0 [145966.051060] [] perf_pending_event+0x33/0x60 [145966.057950] [] irq_work_run_list+0x4c/0x80 [145966.064798] [] irq_work_run+0x18/0x40 [145966.071127] [] smp_trace_irq_work_interrupt+0x3f/0xc0 [145966.078929] [] trace_irq_work_interrupt+0x6d/0x80 [145966.086359] [] ? _raw_spin_unlock_irq+0x32/0x40 [145966.094337] [] ? _raw_spin_unlock_irq+0x2b/0x40 [145966.101634] [] finish_task_switch+0x8c/0x140 [145966.108630] [] ? finish_task_switch+0x4d/0x140 [145966.115793] [] __schedule+0x2d6/0x9f0 [145966.122115] [] ? perf_trace_lock+0x29/0x160 [145966.128909] [] schedule+0x37/0x90 [145966.134802] [] schedule_timeout+0x1bc/0x250 [145966.141578] [] ? wait_for_completion+0x28/0x100 [145966.148660] [] ? wait_for_completion+0xc4/0x100 [145966.155795] [] wait_for_completion+0xcc/0x100 [145966.162731] [] ? wake_up_state+0x20/0x20 [145966.169205] [] ? call_rcu_bh+0x20/0x20 [145966.175464] [] wait_rcu_gp+0x5d/0x80 [145966.181586] [] ? ftrace_raw_output_rcu_utilization+0x70/0x70 [145966.189890] [] synchronize_sched+0x3a/0x50 [145966.196573] [] perf_trace_event_unreg.isra.1+0x3b/0x90 [145966.204345] [] perf_trace_destroy+0x3c/0x50 [145966.211075] [] tp_perf_event_destroy+0xe/0x10 [145966.217976] [] __free_event+0x2c/0x80 [145966.224102] [] SYSC_perf_event_open+0x629/0xc80 [145966.231246] [] SyS_perf_event_open+0xe/0x10 [145966.237982] [] system_call_fastpath+0x16/0x1b [145966.244932] ---[ end trace 7088940f25646d60 ]---