linux-kernel.vger.kernel.org archive mirror
From: Vince Weaver <vincent.weaver@maine.edu>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Vince Weaver <vincent.weaver@maine.edu>,
	linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Stephane Eranian <eranian@gmail.com>,
	kan.liang@intel.com
Subject: Re: perf: fuzzer triggered warning in intel_pmu_drain_pebs_nhm()
Date: Fri, 3 Jul 2015 15:03:05 -0400 (EDT)
Message-ID: <alpine.DEB.2.20.1507031459340.18107@vincent-weaver-1.umelst.maine.edu>
In-Reply-To: <20150703131336.GI19282@twins.programming.kicks-ass.net>

On Fri, 3 Jul 2015, Peter Zijlstra wrote:

> On Thu, Jul 02, 2015 at 11:18:10AM -0400, Vince Weaver wrote:
> > 
> > So sad to say the lack of fuzzer reports was because I was out of town for 
> > a bit, not due to the kernel suddenly getting amazingly better.
> > 
> > In any case I am running against current git and getting a lot of 
> > warnings, but most of them seem to be old ones.  This following one looks 
> > new though.
> > 
> > This is current linus-git on a Haswell machine with peterz's patch to fix 
> > the aux buffer spinlock recursion (I can still crash the kernel if that 
> > patch is not applied).
> > 
> > It corresponds to:
> > 
> > 	WARN_ON_ONCE(!event->attr.precise_ip);
> > 
> > [  584.352324] WARNING: CPU: 2 PID: 18924 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
> 
> I've not yet tried to reproduce, but the below could explain things.
> 
> On disabling an event we first clear our cpuc->pebs_enabled bits, only
> to then check them to see if there are any set, and if so, drain the
> buffer.
> 
> If we just cleared the last bit, we'll fail to drain the buffer.
> 
> If we then program another event on that counter and another PEBS event,
> we can hit the above WARN with the 'stale' entries left over from the
> previous event.
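
The ordering problem described in the quoted text can be illustrated with a
toy model. This is a minimal sketch, not kernel code: `struct toy_cpuc`,
`toy_drain()` and the `toy_disable_*` helpers are hypothetical names made up
for illustration, and the drain-first variant is only one possible
reordering, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU PEBS state; not the kernel's struct. */
struct toy_cpuc {
	uint64_t pebs_enabled;	/* one bit per counter with an active PEBS event */
	int buffered;		/* records still sitting in the PEBS buffer */
};

/* Pretend to consume every buffered record. */
static void toy_drain(struct toy_cpuc *c)
{
	c->buffered = 0;
}

/* Ordering described above: clear the bit first, then test the mask. */
static void toy_disable_clear_first(struct toy_cpuc *c, unsigned int bit)
{
	assert(bit < 64);
	c->pebs_enabled &= ~(1ULL << bit);
	if (c->pebs_enabled)	/* false when we just cleared the last bit */
		toy_drain(c);
}

/* Assumed alternative: decide whether to drain before clearing the bit. */
static void toy_disable_drain_first(struct toy_cpuc *c, unsigned int bit)
{
	assert(bit < 64);
	if (c->pebs_enabled)
		toy_drain(c);
	c->pebs_enabled &= ~(1ULL << bit);
}

/* Disable the only PEBS event and report how many records are left behind. */
static int toy_stale_after_disable(bool drain_first)
{
	struct toy_cpuc c = { .pebs_enabled = 1ULL << 0, .buffered = 3 };

	if (drain_first)
		toy_disable_drain_first(&c, 0);
	else
		toy_disable_clear_first(&c, 0);
	return c.buffered;
}
```

In this model, `toy_stale_after_disable(false)` leaves the three buffered
records in place, which is the kind of stale entry a later event on the same
counter could trip over, while `toy_stale_after_disable(true)` leaves none.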

With that patch applied, I still managed to hit this:

	WARN_ON_ONCE(!event->attr.precise_ip);

I'll let it run some more and see if the watchdog still gets triggered.

[ 2217.544901] ------------[ cut here ]------------
[ 2217.550351] WARNING: CPU: 2 PID: 9136 at arch/x86/kernel/cpu/perf_event_intel_ds.c:1198 intel_pmu_drain_pebs_nhm+0x283/0x2e0()
[ 2217.563534] Modules linked in: fuse snd_hda_codec_hdmi i915 x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel psmouse hmac drbg evdev serio_raw ansi_cprng snd_hda_codec_realtek drm_kms_helper snd_hda_codec_generic ppdev iTCO_wdt iTCO_vendor_support pcspkr drm i2c_algo_bit aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper snd_hda_intel cryptd mei_me mei snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer tpm_tis tpm wmi button processor video battery i2c_i801 parport_pc parport snd lpc_ich mfd_core soundcore sg sr_mod sd_mod cdrom ehci_pci ehci_hcd ahci libahci xhci_pci xhci_hcd e1000e libata ptp crc32c_intel scsi_mod pps_core usbcore usb_common fan thermal thermal_sys
[ 2217.640998] CPU: 2 PID: 9136 Comm: perf_fuzzer Tainted: G        W       4.1.0+ #163
[ 2217.649810] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
[ 2217.658281]  ffffffff81a105a0 ffff88011ea85b10 ffffffff8169f823 0000000000000000
[ 2217.666818]  0000000000000000 ffff88011ea85b50 ffffffff8106ec8a ffff88011ea85ba0
[ 2217.675329]  0000000000000002 0000000000000001 ffff88011ea8bd80 ffff8801190400c0
[ 2217.683821] Call Trace:
[ 2217.686960]  <NMI>  [<ffffffff8169f823>] dump_stack+0x45/0x57
[ 2217.693638]  [<ffffffff8106ec8a>] warn_slowpath_common+0x8a/0xc0
[ 2217.700549]  [<ffffffff8106ed7a>] warn_slowpath_null+0x1a/0x20
[ 2217.707296]  [<ffffffff8102f783>] intel_pmu_drain_pebs_nhm+0x283/0x2e0
[ 2217.714775]  [<ffffffff81031834>] ? intel_pmu_disable_event+0xa4/0x130
[ 2217.722216]  [<ffffffff81032235>] intel_pmu_handle_irq+0x255/0x440
[ 2217.729339]  [<ffffffff8115413e>] ? perf_event_ctx_lock_nested+0x5e/0xf0
[ 2217.737026]  [<ffffffff81028e76>] perf_event_nmi_handler+0x26/0x40
[ 2217.744070]  [<ffffffff810181ad>] nmi_handle+0x9d/0x140
[ 2217.750160]  [<ffffffff81018115>] ? nmi_handle+0x5/0x140
[ 2217.756290]  [<ffffffff8101843a>] default_do_nmi+0x4a/0x120
[ 2217.762688]  [<ffffffff8101859d>] do_nmi+0x8d/0xc0
[ 2217.768280]  [<ffffffff816a979f>] end_repeat_nmi+0x1e/0x2e
[ 2217.774627]  [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 2217.781894]  [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 2217.789153]  [<ffffffff810309ba>] ? __intel_pmu_enable_all+0x5a/0xc0
[ 2217.796415]  <<EOE>>  <IRQ>  [<ffffffff81030a30>] intel_pmu_enable_all+0x10/0x20
[ 2217.804847]  [<ffffffff8102a95c>] x86_pmu_enable+0x25c/0x2e0
[ 2217.811383]  [<ffffffff811560e2>] perf_pmu_enable+0x22/0x30
[ 2217.817837]  [<ffffffff81157a80>] perf_mux_hrtimer_handler+0x120/0x1f0
[ 2217.825316]  [<ffffffff81157960>] ? perf_event_context_sched_in+0x150/0x150
[ 2217.833239]  [<ffffffff810dcf43>] __hrtimer_run_queues+0xd3/0x260
[ 2217.840239]  [<ffffffff810dd4bb>] hrtimer_interrupt+0xab/0x1b0
[ 2217.846930]  [<ffffffff8104b32c>] local_apic_timer_interrupt+0x3c/0x70
[ 2217.854367]  [<ffffffff816aa1a1>] smp_apic_timer_interrupt+0x41/0x60
[ 2217.861630]  [<ffffffff816a83eb>] apic_timer_interrupt+0x6b/0x70
[ 2217.868540]  <EOI> 
[ 2217.870633] ---[ end trace 3a31b4d07b4f3450 ]---
[ 2353.824071] Uhhuh. NMI received for unknown reason 31 on CPU 1.
[ 2353.831238] Do you have a strange power saving mode enabled?
[ 2353.838120] Dazed and confused, but trying to continue

