All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Shishkin <alexander.shishkin@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Vince Weaver <vincent.weaver@maine.edu>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Stephane Eranian <eranian@gmail.com>
Subject: Re: perf: aux area related crash and warnings
Date: Mon, 15 Jun 2015 15:49:19 +0300	[thread overview]
Message-ID: <87egld2l2o.fsf@ashishki-desk.ger.corp.intel.com> (raw)
In-Reply-To: <20150615122054.GY3644@twins.programming.kicks-ass.net>

Peter Zijlstra <peterz@infradead.org> writes:

> Alex, any clue?

Let me look into it. Definitely haven't seen anything like that in my
tests.

> On Fri, Jun 12, 2015 at 02:42:36PM -0400, Vince Weaver wrote:
>> On Thu, 11 Jun 2015, Vince Weaver wrote:
>> 
>> > and while I was trying to cut and paste that, the locked haswell just took 
>> > down the network switch so I can't get the rest until tomorrow.
>> 
>> here are the full dumps if anyone is interested
>> 
>> the warning is reproducible, the spinlock disaster is not.
>> 
>> [36298.986117] BUG: spinlock recursion on CPU#4, perf_fuzzer/3410
>> [36298.992915]  lock: 0xffff88011edf7cd0, .magic: dead4ead, .owner: perf_fuzzer/3410, .owner_cpu: 4
>> [36299.002919] CPU: 4 PID: 3410 Comm: perf_fuzzer Tainted: G        W       4.1.0-rc7+ #155
>> [36299.012152] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
>> [36299.020606]  ffff88011edf7cd0 ffff88011eb059a0 ffffffff816d7229 0000000000000054
>> [36299.029199]  ffff8800c2f4ac50 ffff88011eb059c0 ffffffff810c2895 ffff88011edf7cd0
>> [36299.037796]  ffffffff81a1e481 ffff88011eb059e0 ffffffff810c2916 ffff88011edf7cd0
>> [36299.046338] Call Trace:
>> [36299.049501]  <NMI>  [<ffffffff816d7229>] dump_stack+0x45/0x57
>> [36299.056284]  [<ffffffff810c2895>] spin_dump+0x85/0xe0
>> [36299.062282]  [<ffffffff810c2916>] spin_bug+0x26/0x30
>> [36299.068111]  [<ffffffff810c2acf>] do_raw_spin_lock+0x13f/0x180
>> [36299.074897]  [<ffffffff816de6e9>] _raw_spin_lock+0x39/0x40
>> [36299.081276]  [<ffffffff8117a039>] ? free_pcppages_bulk+0x39/0x620
>> [36299.088340]  [<ffffffff8117a039>] free_pcppages_bulk+0x39/0x620
>> [36299.095182]  [<ffffffff81177e14>] ? free_pages_prepare+0x3a4/0x550
>> [36299.102291]  [<ffffffff811c9936>] ? kfree_debugcheck+0x16/0x40
>> [36299.108987]  [<ffffffff8117a938>] free_hot_cold_page+0x178/0x1a0
>> [36299.115850]  [<ffffffff8117aa47>] __free_pages+0x37/0x50
>> [36299.121991]  [<ffffffff8116ae0a>] rb_free_aux+0xba/0xf0

This one goes to free aux pages from nmi context, looks like aux buffer
was unmapped while the event was running, so here it dropped the last
reference.

>> [36299.128034]  [<ffffffff8116b0e7>] perf_aux_output_end+0xb7/0xf0
>> [36299.134793]  [<ffffffff81037b0e>] intel_bts_interrupt+0x8e/0xd0
>> [36299.141543]  [<ffffffff810338bf>] intel_pmu_handle_irq+0x4f/0x450
>> [36299.148482]  [<ffffffff810bc288>] ? check_chain_key+0x128/0x1e0
>> [36299.155249]  [<ffffffff8102a4ab>] perf_event_nmi_handler+0x2b/0x50
>> [36299.162273]  [<ffffffff810185d0>] nmi_handle+0xa0/0x150

[snip]

>> 
>> and
>> 
>> [27716.785131] WARNING: CPU: 2 PID: 17655 at kernel/events/ring_buffer.c:282 perf_aux_output_begin+0x1ce/0x1f0()
>> [27716.796157] Modules linked in: fuse x86_pkg_temp_thermal intel_powerclamp intel_rapl iosf_mbi coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_hdmi snd_hda_codec_generic snd_hda_intel snd_hda_controller crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec aesni_intel snd_hda_core aes_x86_64 iTCO_wdt lrw snd_hwdep gf128mul snd_pcm iTCO_vendor_support ppdev i915snd_timer glue_helper evdev ablk_helper snd cryptd mei_me mei psmouse pcspkr soundcore serio_raw lpc_ich drm_kms_helper parport_pc battery tpm_tis mfd_core parport drm video tpm i2c_i801 i2c_algo_bit wmi processor button sg sr_mod cdrom sd_mod ehci_pci ehci_hcd ahci libahci libata xhci_pci xhci_hcd e1000e ptp scsi_mod crc32c_intel usbcore pps_core usb_common thermal fan thermal_sys
>> [27716.873350] CPU: 2 PID: 17655 Comm: perf_fuzzer Tainted: G        W       4.1.0-rc7+ #155
>> [27716.882671] Hardware name: LENOVO 10AM000AUS/SHARKBAY, BIOS FBKT72AUS 01/26/2014
>> [27716.891168]  ffffffff81a230f2 ffff8800cf9afc68 ffffffff816d7229 0000000000000000
>> [27716.899782]  0000000000000000 ffff8800cf9afca8 ffffffff81072eba 000000000000cf80
>> [27716.908372]  ffff880036d85a00 ffff88011ea8cf80 ffff880119106000 000000000000cf80
>> [27716.916981] Call Trace:
>> [27716.920180]  [<ffffffff816d7229>] dump_stack+0x45/0x57
>> [27716.926250]  [<ffffffff81072eba>] warn_slowpath_common+0x8a/0xc0
>> [27716.933244]  [<ffffffff81072faa>] warn_slowpath_null+0x1a/0x20
>> [27716.939999]  [<ffffffff8116b00e>] perf_aux_output_begin+0x1ce/0x1f0
>> [27716.947225]  [<ffffffff81164785>] ? perf_event_update_userpage+0x5/0x170
>> [27716.954875]  [<ffffffff8103767d>] bts_event_add+0x6d/0x150
>> [27716.961259]  [<ffffffff81162ddb>] event_sched_in.isra.85+0xfb/0x330
>> [27716.968495]  [<ffffffff811630ca>] group_sched_in+0xba/0x1e0
>> [27716.974957]  [<ffffffff8101df2a>] ? native_sched_clock+0x2a/0x90
>> [27716.981888]  [<ffffffff811633c0>] ctx_sched_in+0x1d0/0x3f0
>> [27716.988274]  [<ffffffff81163649>] perf_event_sched_in+0x69/0xa0
>> [27716.995082]  [<ffffffff81163bc0>] perf_event_context_sched_in+0x90/0x160
>> [27717.002714]  [<ffffffff81164148>] __perf_event_task_sched_in+0xa8/0xc0
>> [27717.010160]  [<ffffffff8109897d>] finish_task_switch+0xed/0x140
>> [27717.016988]  [<ffffffff81012723>] ? __switch_to+0x263/0x5d0
>> [27717.023400]  [<ffffffff8109e5a1>] schedule_tail+0x11/0xa0
>> [27717.029652]  [<ffffffff816df57f>] ret_from_fork+0xf/0x70
>> [27717.035806] ---[ end trace bdbfeef5d81e5fa4 ]---
>> 
>> which corresponds to 
>>         /*
>>          * Nesting is not supported for AUX area, make sure nested
>>          * writers are caught early
>>          */
>>         if (WARN_ON_ONCE(local_xchg(&rb->aux_nest, 1)))
>>                 goto err_put;
>> 

This really shouldn't happen, I'd need to check with the fuzzer.

Thanks,
--
Alex

  reply	other threads:[~2015-06-15 12:49 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-12  3:15 perf: aux area related crash and warnings Vince Weaver
2015-06-12 18:42 ` Vince Weaver
2015-06-15 12:20   ` Peter Zijlstra
2015-06-15 12:49     ` Alexander Shishkin [this message]
2015-06-15 20:24       ` Ingo Molnar
2015-06-16 11:37       ` Alexander Shishkin
2015-06-18  9:09         ` Ingo Molnar
2015-06-18  9:35           ` Alexander Shishkin
2015-06-18 10:24             ` Peter Zijlstra
2015-06-18 16:30           ` Paul E. McKenney
2015-06-18 10:32         ` Peter Zijlstra
2015-06-18 13:11           ` Alexander Shishkin
2015-06-19 14:21           ` Vince Weaver
2015-07-06 15:34           ` [tip:perf/urgent] perf: Fix AUX buffer refcounting tip-bot for Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87egld2l2o.fsf@ashishki-desk.ger.corp.intel.com \
    --to=alexander.shishkin@linux.intel.com \
    --cc=acme@kernel.org \
    --cc=eranian@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=vincent.weaver@maine.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.