From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752889AbbBBGcQ (ORCPT ); Mon, 2 Feb 2015 01:32:16 -0500
Received: from mail-qg0-f53.google.com ([209.85.192.53]:58155 "EHLO mail-qg0-f53.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751257AbbBBGcO (ORCPT ); Mon, 2 Feb 2015 01:32:14 -0500
From: Vince Weaver
X-Google-Original-From: Vince Weaver
Date: Mon, 2 Feb 2015 01:33:14 -0500 (EST)
To: Peter Zijlstra
cc: mingo@kernel.org, linux-kernel@vger.kernel.org, vincent.weaver@maine.edu, eranian@gmail.com, jolsa@redhat.com, mark.rutland@arm.com, torvalds@linux-foundation.org, tglx@linutronix.de
Subject: Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
In-Reply-To: <20150129144749.GC24151@twins.programming.kicks-ass.net>
Message-ID:
References: <20150123125159.696530128@infradead.org> <20150123125834.150481799@infradead.org> <20150126162639.GA21418@twins.programming.kicks-ass.net> <20150129144749.GC24151@twins.programming.kicks-ass.net>
User-Agent: Alpine 2.11 (DEB 23 2013-08-11)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 29 Jan 2015, Peter Zijlstra wrote:

> That said, it does need to do that sibling first leaders later install
> order too. So I've put the below on top.

So I've lost track of exactly which patches I should be running (do I need
to apply both of the additional patches?)

Meanwhile my Haswell was still fuzzing away (without those two updated
patches) until it triggered the below and crashed.

[407484.309136] ------------[ cut here ]------------
[407484.314590] WARNING: CPU: 3 PID: 27209 at kernel/watchdog.c:290 watchdog_overflow_callback+0x92/0xc0()
[407484.325090] Watchdog detected hard LOCKUP on cpu 3
[407484.330093] Modules linked in: btrfs xor raid6_pq ntfs vfat msdos fat dm_mod fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
[407484.408496] CPU: 3 PID: 27209 Comm: perf_fuzzer Tainted: G W 3.19.0-rc6+ #126
[407484.417914] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[407484.426497] ffffffff81a3d3da ffff88011eac5aa0 ffffffff816b6761 0000000000000000
[407484.435161] ffff88011eac5af0 ffff88011eac5ae0 ffffffff8106dcda ffff88011eac5b00
[407484.443900] ffff8801195f9800 0000000000000001 ffff88011eac5c40 ffff88011eac5ef8
[407484.452588] Call Trace:
[407484.455862] [] dump_stack+0x45/0x57
[407484.462741] [] warn_slowpath_common+0x8a/0xc0
[407484.469851] [] warn_slowpath_fmt+0x46/0x50
[407484.476743] [] ? native_sched_clock+0x2a/0x90
[407484.483888] [] ? native_sched_clock+0x2a/0x90
[407484.490999] [] watchdog_overflow_callback+0x92/0xc0
[407484.498672] [] __perf_event_overflow+0x91/0x270
[407484.505984] [] ? x86_perf_event_set_period+0xca/0x170
[407484.513834] [] perf_event_overflow+0x19/0x20
[407484.520812] [] intel_pmu_handle_irq+0x1ba/0x3a0
[407484.528119] [] perf_event_nmi_handler+0x2b/0x50
[407484.535402] [] nmi_handle+0xa0/0x150
[407484.541701] [] ? nmi_handle+0x5/0x150
[407484.548069] [] default_do_nmi+0x4a/0x140
[407484.554692] [] do_nmi+0x98/0xe0
[407484.560517] [] end_repeat_nmi+0x1e/0x2e
[407484.567054] [] ? perf_get_regs_user+0xbf/0x190
[407484.574256] [] ? perf_get_regs_user+0xbf/0x190
[407484.581431] [] ? perf_get_regs_user+0xbf/0x190
[407484.588602] <> [] perf_prepare_sample+0x2ec/0x3c0
[407484.597358] [] __perf_event_overflow+0x10e/0x270
[407484.604708] [] ? __perf_event_overflow+0xd9/0x270
[407484.612215] [] ? perf_tp_event+0xc4/0x210
[407484.619000] [] ? __perf_sw_event+0x72/0x1f0
[407484.625937] [] ? perf_swevent_overflow+0xa9/0xe0
[407484.633287] [] perf_swevent_overflow+0xa9/0xe0
[407484.640467] [] perf_swevent_event+0x67/0x90
[407484.647343] [] perf_tp_event+0xc4/0x210
[407484.653923] [] ? lock_acquire+0x119/0x130
[407484.660606] [] ? perf_trace_lock_acquire+0x146/0x180
[407484.668332] [] ? __lock_acquire.isra.31+0x3af/0xfe0
[407484.675962] [] perf_trace_lock_acquire+0x146/0x180
[407484.683490] [] ? lock_acquire+0x119/0x130
[407484.690211] [] lock_acquire+0x119/0x130
[407484.696750] [] ? perf_event_wakeup+0x5/0xf0
[407484.703640] [] ? kill_fasync+0xf/0xf0
[407484.710008] [] perf_event_wakeup+0x38/0xf0
[407484.716798] [] ? perf_event_wakeup+0x5/0xf0
[407484.723696] [] perf_pending_event+0x33/0x60
[407484.730570] [] irq_work_run_list+0x4c/0x80
[407484.737392] [] irq_work_run+0x18/0x40
[407484.743765] [] smp_trace_irq_work_interrupt+0x3f/0xc0
[407484.751579] [] trace_irq_work_interrupt+0x6d/0x80
[407484.759046] [] ? lock_acquire+0xbd/0x130
[407484.766380] [] ? SyS_fcntl+0x5b2/0x650
[407484.772786] [] _raw_spin_lock+0x31/0x40
[407484.779321] [] ? SyS_fcntl+0x5b2/0x650
[407484.785813] [] SyS_fcntl+0x5b2/0x650
[407484.792109] [] system_call_fastpath+0x16/0x1b
[407484.799195] ---[ end trace 55752a03ec8ab979 ]---
[407490.307881] INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 4, t=5252 jiffies, g=36308737, c=36308736, q=207)
[407490.320908] Task dump for CPU 3:
[407490.325104] perf_fuzzer R running task 0 27209 2304 0x00000008
[407490.333439] ffffffff8115c837 ffff8800c4293cc8 ffffffff8115c924 ffff880000000024
[407490.342163] 0000000000000001 ffff8800c4293b80 ffff8800c4293d90 ffff8800ceb81000
[407490.350899] ffffe8ffffcca880 ffff8800c4293b10 ffffffff8115c799 ffff8800ceb81000
[407490.359619] Call Trace:
[407490.362932] [] ? perf_swevent_event+0x67/0x90
[407490.370033] [] ? perf_tp_event+0xc4/0x210
[407490.376770] [] ? perf_swevent_overflow+0xa9/0xe0
[407490.384068] [] ? perf_swevent_event+0x67/0x90
[407490.391073] [] ? perf_tp_event+0xc4/0x210
[407490.397696] [] ? lock_acquire+0x119/0x130
[407490.404328] [] ? __lock_acquire.isra.31+0x3af/0xfe0
[407490.411921] [] ? perf_trace_lock_acquire+0x146/0x180
[407490.419567] [] ? __lock_acquire.isra.31+0x3af/0xfe0
[407490.427133] [] ? lock_acquire+0xbd/0x130
[407490.433668] [] ? SyS_fcntl+0x5b2/0x650
[407490.440062] [] ? _raw_spin_lock+0x31/0x40
[407490.446600] [] ? SyS_fcntl+0x5b2/0x650
[407490.452913] [] ? SyS_fcntl+0x5b2/0x650
[407490.459253] [] ? system_call_fastpath+0x16/0x1b

(repeat forever filling up the disk with these messages)