From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754328AbbBBPmt (ORCPT ); Mon, 2 Feb 2015 10:42:49 -0500
Received: from casper.infradead.org ([85.118.1.10]:60386 "EHLO casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753430AbbBBPmq (ORCPT ); Mon, 2 Feb 2015 10:42:46 -0500
Date: Mon, 2 Feb 2015 16:42:40 +0100
From: Peter Zijlstra
To: Vince Weaver
Cc: mingo@kernel.org, linux-kernel@vger.kernel.org, eranian@gmail.com, jolsa@redhat.com, mark.rutland@arm.com, torvalds@linux-foundation.org, tglx@linutronix.de
Subject: Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
Message-ID: <20150202154240.GG26304@twins.programming.kicks-ass.net>
References: <20150123125159.696530128@infradead.org> <20150123125834.150481799@infradead.org> <20150126162639.GA21418@twins.programming.kicks-ass.net> <20150129144749.GC24151@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 02, 2015 at 01:33:14AM -0500, Vince Weaver wrote:
> On Thu, 29 Jan 2015, Peter Zijlstra wrote:
>
> > That said, it does need to do that sibling first leaders later install
> > order too. So I've put the below on top.
>
> so I've lost track of exactly which patches I should be running (do I need
> to apply both of the additional patches?)

Probably, lemme try and get all of the current stuff in tip/master to
make for easier testing.

> Meanwhile my haswell was still fuzzing away (without those two
> updated patches) until it triggered the below and crashed.
>
> [407484.309136] ------------[ cut here ]------------
> [407484.314590] WARNING: CPU: 3 PID: 27209 at kernel/watchdog.c:290 watchdog_overflow_callback+0x92/0xc0()
> [407484.325090] Watchdog detected hard LOCKUP on cpu 3
> [407484.330093] Modules linked in: btrfs xor raid6_pq ntfs vfat msdos fat dm_mod fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm crct10dif_pclmul snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic crc32_pclmul snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 snd_hwdep lrw gf128mul snd_pcm ppdev glue_helper xhci_pci mei_me iTCO_wdt iTCO_vendor_support i915 snd_timer drm_kms_helper snd drm ablk_helper lpc_ich mfd_core evdev pcspkr parport_pc psmouse cryptd soundcore i2c_i801 serio_raw parport xhci_hcd mei wmi tpm_tis tpm video battery button processor i2c_algo_bit sg sr_mod sd_mod cdrom ahci libahci e1000e ehci_pci libata ptp ehci_hcd crc32c_intel usbcore scsi_mod usb_common pps_core thermal fan thermal_sys
> [407484.408496] CPU: 3 PID: 27209 Comm: perf_fuzzer Tainted: G W 3.19.0-rc6+ #126
> [407484.417914] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
> [407484.426497] ffffffff81a3d3da ffff88011eac5aa0 ffffffff816b6761 0000000000000000
> [407484.435161] ffff88011eac5af0 ffff88011eac5ae0 ffffffff8106dcda ffff88011eac5b00
> [407484.443900] ffff8801195f9800 0000000000000001 ffff88011eac5c40 ffff88011eac5ef8
> [407484.452588] Call Trace:
> [407484.455862] [] dump_stack+0x45/0x57
> [407484.462741] [] warn_slowpath_common+0x8a/0xc0
> [407484.469851] [] warn_slowpath_fmt+0x46/0x50
> [407484.476743] [] ? native_sched_clock+0x2a/0x90
> [407484.483888] [] ? native_sched_clock+0x2a/0x90
> [407484.490999] [] watchdog_overflow_callback+0x92/0xc0
> [407484.498672] [] __perf_event_overflow+0x91/0x270
> [407484.505984] [] ? x86_perf_event_set_period+0xca/0x170
> [407484.513834] [] perf_event_overflow+0x19/0x20
> [407484.520812] [] intel_pmu_handle_irq+0x1ba/0x3a0
> [407484.528119] [] perf_event_nmi_handler+0x2b/0x50
> [407484.535402] [] nmi_handle+0xa0/0x150
> [407484.541701] [] ? nmi_handle+0x5/0x150
> [407484.548069] [] default_do_nmi+0x4a/0x140
> [407484.554692] [] do_nmi+0x98/0xe0
> [407484.560517] [] end_repeat_nmi+0x1e/0x2e
> [407484.567054] [] ? perf_get_regs_user+0xbf/0x190
> [407484.574256] [] ? perf_get_regs_user+0xbf/0x190
> [407484.581431] [] ? perf_get_regs_user+0xbf/0x190
> [407484.588602] <> [] perf_prepare_sample+0x2ec/0x3c0
> [407484.597358] [] __perf_event_overflow+0x10e/0x270
> [407484.604708] [] ? __perf_event_overflow+0xd9/0x270
> [407484.612215] [] ? perf_tp_event+0xc4/0x210
> [407484.619000] [] ? __perf_sw_event+0x72/0x1f0
> [407484.625937] [] ? perf_swevent_overflow+0xa9/0xe0
> [407484.633287] [] perf_swevent_overflow+0xa9/0xe0
> [407484.640467] [] perf_swevent_event+0x67/0x90
> [407484.647343] [] perf_tp_event+0xc4/0x210
> [407484.653923] [] ? lock_acquire+0x119/0x130
> [407484.660606] [] ? perf_trace_lock_acquire+0x146/0x180
> [407484.668332] [] ? __lock_acquire.isra.31+0x3af/0xfe0
> [407484.675962] [] perf_trace_lock_acquire+0x146/0x180
> [407484.683490] [] ? lock_acquire+0x119/0x130
> [407484.690211] [] lock_acquire+0x119/0x130
> [407484.696750] [] ? perf_event_wakeup+0x5/0xf0
> [407484.703640] [] ? kill_fasync+0xf/0xf0
> [407484.710008] [] perf_event_wakeup+0x38/0xf0
> [407484.716798] [] ? perf_event_wakeup+0x5/0xf0
> [407484.723696] [] perf_pending_event+0x33/0x60
> [407484.730570] [] irq_work_run_list+0x4c/0x80
> [407484.737392] [] irq_work_run+0x18/0x40
> [407484.743765] [] smp_trace_irq_work_interrupt+0x3f/0xc0
> [407484.751579] [] trace_irq_work_interrupt+0x6d/0x80
> [407484.759046] [] ? lock_acquire+0xbd/0x130
> [407484.766380] [] ? SyS_fcntl+0x5b2/0x650
> [407484.772786] [] _raw_spin_lock+0x31/0x40
> [407484.779321] [] ? SyS_fcntl+0x5b2/0x650
> [407484.785813] [] SyS_fcntl+0x5b2/0x650
> [407484.792109] [] system_call_fastpath+0x16/0x1b
> [407484.799195] ---[ end trace 55752a03ec8ab979 ]---

That looks like tail recursive fun! An irq work that raises an irq work
ad infinitum. Lemme see if I can squash that.. didn't we have something
like this before... /me goes look.
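
To make the suspected pattern concrete, here is a toy user-space model of
a work item whose handler queues the same work again. It is only a sketch
and not the kernel's irq_work code: IrqWorkQueue, self_raising_work,
one_shot_work and the max_callbacks cap are all hypothetical names made up
for illustration. The cap exists purely so the demo terminates; without it
the drain loop would never see an empty list, which is roughly what a hard
lockup would look like.

```python
from collections import deque


class IrqWorkQueue:
    """Toy stand-in for a per-CPU pending-work list (hypothetical, not kernel code)."""

    def __init__(self):
        self.pending = deque()

    def queue(self, func):
        # Append a callback to the pending list.
        self.pending.append(func)

    def run(self, max_callbacks):
        """Drain the list, giving up after max_callbacks callbacks.

        Returns the number of callbacks actually executed.
        """
        executed = 0
        while self.pending and executed < max_callbacks:
            func = self.pending.popleft()
            func(self)  # the callback may queue more work on this same list
            executed += 1
        return executed


def self_raising_work(q):
    # Assumed failure mode: handling the work generates a new event,
    # which queues the same work again, so the list never empties.
    q.queue(self_raising_work)


def one_shot_work(q):
    # Well-behaved handler: does its job and queues nothing.
    pass


if __name__ == "__main__":
    bad = IrqWorkQueue()
    bad.queue(self_raising_work)
    print("self-raising: ran", bad.run(max_callbacks=1000),
          "callbacks,", len(bad.pending), "still pending")

    good = IrqWorkQueue()
    good.queue(one_shot_work)
    print("one-shot: ran", good.run(max_callbacks=1000),
          "callbacks,", len(good.pending), "still pending")
```

With the self-raising handler the drain loop always hits the cap and still
leaves work pending, while the one-shot handler drains after a single
callback.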