* Perf events warning..
@ 2012-05-11 15:43 Linus Torvalds
  2012-05-14 22:20 ` Peter Zijlstra
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2012-05-11 15:43 UTC
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo; +Cc: linux-kernel

Peter, Ingo, Arnaldo,
 google doesn't find a warning like this, so it can't be *too*
commonly reported.

Anyway, doing some profiling of git "make test" (wonderful load for
doing lots of small processes that do lots of page faults etc), this
WARN_ON_ONCE() triggered:

    ------------[ cut here ]------------
    WARNING: at kernel/events/core.c:2066 task_ctx_sched_out+0x63/0x70()
    Hardware name: System Product Name
    Pid: 18120, comm: git Not tainted 3.4.0-rc6-00089-g4a01c681d58f-dirty #3
    Call Trace:
     [<ffffffff810308c5>] warn_slowpath_common+0x75/0xb0
     [<ffffffff810309c5>] warn_slowpath_null+0x15/0x20
     [<ffffffff81096d63>] task_ctx_sched_out+0x63/0x70
     [<ffffffff8109af06>] perf_event_comm+0x1d6/0x2e0
     [<ffffffff81368ba8>] ? get_random_int+0x88/0xb0
     [<ffffffff8102e1e2>] ? __mmdrop+0x62/0x90
     [<ffffffff810e1f93>] set_task_comm+0x63/0x80
     [<ffffffff810e3206>] setup_new_exec+0x86/0x250
     [<ffffffff81126b69>] load_elf_binary+0x389/0x1930
     [<ffffffff81125222>] ? load_misc_binary+0x292/0x390
     [<ffffffff810bcbdd>] ? get_user_pages+0x4d/0x50
     [<ffffffff810e13b2>] ? get_arg_page+0xa2/0xd0
     [<ffffffff810e2b65>] search_binary_handler+0xd5/0x200
     [<ffffffff811267e0>] ? elf_map+0x170/0x170
     [<ffffffff810e300d>] do_execve_common.clone.39+0x37d/0x430
     [<ffffffff810e30d6>] do_execve+0x16/0x20
     [<ffffffff8100ab75>] sys_execve+0x45/0x70
     [<ffffffff816b10ac>] stub_execve+0x6c/0xc0
    ---[ end trace 6fccf3db70f1b560 ]---

any comments/ideas?

(That kernel version isn't one you'd find in any git tree - it's a
throw-away tree with some experimental patches for dcache cleanups
etc. But those patches should not matter at all for this kind of
thing)

                          Linus


* Re: Perf events warning..
  2012-05-11 15:43 Perf events warning Linus Torvalds
@ 2012-05-14 22:20 ` Peter Zijlstra
  2012-05-14 22:25   ` David Ahern
  2012-05-14 22:25   ` Linus Torvalds
  0 siblings, 2 replies; 12+ messages in thread
From: Peter Zijlstra @ 2012-05-14 22:20 UTC
  To: Linus Torvalds; +Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel

On Fri, 2012-05-11 at 08:43 -0700, Linus Torvalds wrote:
> Peter, Ingo, Arnaldo,
>  google doesn't find a warning like this, so it can't be *too*
> commonly reported.
> 
> Anyway, doing some profiling of git "make test" (wonderful load for
> doing lots of small processes that do lots of page faults etc), this
> WARN_ON_ONCE() triggered:
> 
>     ------------[ cut here ]------------
>     WARNING: at kernel/events/core.c:2066 task_ctx_sched_out+0x63/0x70()
>     Hardware name: System Product Name
>     Pid: 18120, comm: git Not tainted 3.4.0-rc6-00089-g4a01c681d58f-dirty #3
>     Call Trace:
>      [<ffffffff810308c5>] warn_slowpath_common+0x75/0xb0
>      [<ffffffff810309c5>] warn_slowpath_null+0x15/0x20
>      [<ffffffff81096d63>] task_ctx_sched_out+0x63/0x70
>      [<ffffffff8109af06>] perf_event_comm+0x1d6/0x2e0
>      [<ffffffff81368ba8>] ? get_random_int+0x88/0xb0
>      [<ffffffff8102e1e2>] ? __mmdrop+0x62/0x90
>      [<ffffffff810e1f93>] set_task_comm+0x63/0x80
>      [<ffffffff810e3206>] setup_new_exec+0x86/0x250
>      [<ffffffff81126b69>] load_elf_binary+0x389/0x1930
>      [<ffffffff81125222>] ? load_misc_binary+0x292/0x390
>      [<ffffffff810bcbdd>] ? get_user_pages+0x4d/0x50
>      [<ffffffff810e13b2>] ? get_arg_page+0xa2/0xd0
>      [<ffffffff810e2b65>] search_binary_handler+0xd5/0x200
>      [<ffffffff811267e0>] ? elf_map+0x170/0x170
>      [<ffffffff810e300d>] do_execve_common.clone.39+0x37d/0x430
>      [<ffffffff810e30d6>] do_execve+0x16/0x20
>      [<ffffffff8100ab75>] sys_execve+0x45/0x70
>      [<ffffffff816b10ac>] stub_execve+0x6c/0xc0
>     ---[ end trace 6fccf3db70f1b560 ]---
> 
> any comments/ideas?

So far I can't make any of the things I came up with stick. You ran something
simple like: 'perf record -e cycles:pp -F 20000 make test' ? Or did you
do something more interesting?



* Re: Perf events warning..
  2012-05-14 22:20 ` Peter Zijlstra
@ 2012-05-14 22:25   ` David Ahern
  2012-05-14 22:25   ` Linus Torvalds
  1 sibling, 0 replies; 12+ messages in thread
From: David Ahern @ 2012-05-14 22:25 UTC
  To: Peter Zijlstra
  Cc: Linus Torvalds, Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel

On 5/14/12 4:20 PM, Peter Zijlstra wrote:
> On Fri, 2012-05-11 at 08:43 -0700, Linus Torvalds wrote:
>> Peter, Ingo, Arnaldo,
>>   google doesn't find a warning like this, so it can't be *too*
>> commonly reported.
>>
>> Anyway, doing some profiling of git "make test" (wonderful load for
>> doing lots of small processes that do lots of page faults etc), this
>> WARN_ON_ONCE() triggered:
>>
>>      ------------[ cut here ]------------
>>      WARNING: at kernel/events/core.c:2066 task_ctx_sched_out+0x63/0x70()
>>      Hardware name: System Product Name
>>      Pid: 18120, comm: git Not tainted 3.4.0-rc6-00089-g4a01c681d58f-dirty #3
>>      Call Trace:
>>       [<ffffffff810308c5>] warn_slowpath_common+0x75/0xb0
>>       [<ffffffff810309c5>] warn_slowpath_null+0x15/0x20
>>       [<ffffffff81096d63>] task_ctx_sched_out+0x63/0x70
>>       [<ffffffff8109af06>] perf_event_comm+0x1d6/0x2e0
>>       [<ffffffff81368ba8>] ? get_random_int+0x88/0xb0
>>       [<ffffffff8102e1e2>] ? __mmdrop+0x62/0x90
>>       [<ffffffff810e1f93>] set_task_comm+0x63/0x80
>>       [<ffffffff810e3206>] setup_new_exec+0x86/0x250
>>       [<ffffffff81126b69>] load_elf_binary+0x389/0x1930
>>       [<ffffffff81125222>] ? load_misc_binary+0x292/0x390
>>       [<ffffffff810bcbdd>] ? get_user_pages+0x4d/0x50
>>       [<ffffffff810e13b2>] ? get_arg_page+0xa2/0xd0
>>       [<ffffffff810e2b65>] search_binary_handler+0xd5/0x200
>>       [<ffffffff811267e0>] ? elf_map+0x170/0x170
>>       [<ffffffff810e300d>] do_execve_common.clone.39+0x37d/0x430
>>       [<ffffffff810e30d6>] do_execve+0x16/0x20
>>       [<ffffffff8100ab75>] sys_execve+0x45/0x70
>>       [<ffffffff816b10ac>] stub_execve+0x6c/0xc0
>>      ---[ end trace 6fccf3db70f1b560 ]---
>>
>> any comments/ideas?
>
> So far I can't make any of the things I came up with stick. You ran something
> simple like: 'perf record -e cycles:pp -F 20000 make test' ? Or did you
> do something more interesting?

By chance I just hit this:
[   31.528795] WARNING: at /mnt/sw/kernel-2.6.git/arch/x86/kernel/cpu/perf_event.c:1054 x86_pmu_start+0xdc/0x110()
[   31.528799] Hardware name: Bochs
[   31.528801] Modules linked in: nfs fscache auth_rpcgss nfs_acl lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 xt_state ip6table_filter nf_conntrack ip6_tables virtio_net ppdev i2c_piix4 parport_pc parport i2c_core sunrpc virtio_blk [last unloaded: scsi_wait_scan]
[   31.528825] Pid: 959, comm: find Not tainted 3.4.0-rc6+ #1
[   31.528827] Call Trace:
[   31.528835]  [<ffffffff8105792f>] warn_slowpath_common+0x7f/0xc0
[   31.528839]  [<ffffffff8105798a>] warn_slowpath_null+0x1a/0x20
[   31.528843]  [<ffffffff8102543c>] x86_pmu_start+0xdc/0x110
[   31.528847]  [<ffffffff81025af2>] x86_pmu_enable+0x212/0x270
[   31.528854]  [<ffffffff81116466>] perf_event_context_sched_in+0xe6/0x100
[   31.528857]  [<ffffffff81118083>] perf_event_comm+0x103/0x2b0
[   31.528863]  [<ffffffff81184342>] set_task_comm+0x72/0xe0
[   31.528867]  [<ffffffff81184a1b>] setup_new_exec+0x8b/0x240
[   31.528873]  [<ffffffff811cc8b7>] load_elf_binary+0x3e7/0x19a0
[   31.528879]  [<ffffffff811443c2>] ? get_user_pages+0x52/0x60
[   31.528883]  [<ffffffff81182708>] ? get_user_arg_ptr+0x38/0x80
[   31.528887]  [<ffffffff81182bae>] search_binary_handler+0xee/0x340
[   31.528891]  [<ffffffff811cc4d0>] ? load_elf_library+0x230/0x230
[   31.528895]  [<ffffffff811847ff>] do_execve_common+0x36f/0x410
[   31.528899]  [<ffffffff811848da>] do_execve+0x3a/0x40
[   31.528903]  [<ffffffff8101d477>] sys_execve+0x47/0x70
[   31.528908]  [<ffffffff815f226c>] stub_execve+0x6c/0xc0
[   31.528911] ---[ end trace ba6387065a9b9696 ]---

As for the something interesting part - not really: debugging the 
regression on 'perf stat --group -- find /usr >/dev/null'

David


* Re: Perf events warning..
  2012-05-14 22:20 ` Peter Zijlstra
  2012-05-14 22:25   ` David Ahern
@ 2012-05-14 22:25   ` Linus Torvalds
  2012-05-15 10:49     ` Peter Zijlstra
  1 sibling, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2012-05-14 22:25 UTC
  To: Peter Zijlstra; +Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel

On Mon, May 14, 2012 at 3:20 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>
> So far I can't make any of the things I came up with stick. You ran something
> simple like: 'perf record -e cycles:pp -F 20000 make test' ? Or did you
> do something more interesting?

It was not much more complex than that.

It did use "make -j64 test" to make the load a *bit* more interesting
(and go noticeably faster), but other than that you got it.

                     Linus


* Re: Perf events warning..
  2012-05-14 22:25   ` Linus Torvalds
@ 2012-05-15 10:49     ` Peter Zijlstra
  2012-05-15 15:25       ` David Ahern
  2012-05-15 15:48       ` Linus Torvalds
  0 siblings, 2 replies; 12+ messages in thread
From: Peter Zijlstra @ 2012-05-15 10:49 UTC
  To: Linus Torvalds; +Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel

On Mon, 2012-05-14 at 15:25 -0700, Linus Torvalds wrote:
> On Mon, May 14, 2012 at 3:20 PM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> >
> > So far I can't make any of the things I came up with stick. You ran something
> > simple like: 'perf record -e cycles:pp -F 20000 make test' ? Or did you
> > do something more interesting?
> 
> It was not much more complex than that.
> 
> It did use "make -j64 test" to make the load a *bit* more interesting
> (and go noticeably faster), but other than that you got it.

OK, that limits the scope of crazy scenarios I have to consider, still
no immediate clue though..

I think I've found a possible race, but I can't make it work with that
workload. I've also let your workload run for 2+ hours in trying to
reproduce, but no luck, it must be a very narrow window indeed.

I'll keep prodding at it..


* Re: Perf events warning..
  2012-05-15 10:49     ` Peter Zijlstra
@ 2012-05-15 15:25       ` David Ahern
  2012-05-15 15:28         ` Peter Zijlstra
  2012-05-15 15:46         ` Arnaldo Carvalho de Melo
  2012-05-15 15:48       ` Linus Torvalds
  1 sibling, 2 replies; 12+ messages in thread
From: David Ahern @ 2012-05-15 15:25 UTC
  To: Peter Zijlstra, Arnaldo Carvalho de Melo
  Cc: Linus Torvalds, Ingo Molnar, linux-kernel

On 5/15/12 4:49 AM, Peter Zijlstra wrote:
> On Mon, 2012-05-14 at 15:25 -0700, Linus Torvalds wrote:
>> On Mon, May 14, 2012 at 3:20 PM, Peter Zijlstra<a.p.zijlstra@chello.nl>  wrote:
>>>
>>> So far I can't make any of the things I came up with stick. You ran something
>>> simple like: 'perf record -e cycles:pp -F 20000 make test' ? Or did you
>>> do something more interesting?
>>
>> It was not much more complex than that.
>>
>> It did use "make -j64 test" to make the load a *bit* more interesting
>> (and go noticeably faster), but other than that you got it.
>
> OK, that limits the scope of crazy scenarios I have to consider, still
> no immediate clue though..
>
> I think I've found a possible race, but I can't make it work with that
> workload. I've also let your workload run for 2+ hours in trying to
> reproduce, but no luck, it must be a very narrow window indeed.
>
> I'll keep prodding at it..

Perhaps it is specific to processor generation? Yesterday I noted that 
perf-stat -g trips a WARNING only on Nehalem. Westmere works fine - 
perf-stat -g generates output and no warning is triggered. Arnaldo is 
using Sandy Bridge - though it's not clear if his success (3.4.0-rc3 on 
server named sandy) or fail (3.4.0-rc4-uprobes on felicio) was on a SNB.

David


* Re: Perf events warning..
  2012-05-15 15:25       ` David Ahern
@ 2012-05-15 15:28         ` Peter Zijlstra
  2012-05-15 15:37           ` David Ahern
  2012-05-15 15:46         ` Arnaldo Carvalho de Melo
  1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2012-05-15 15:28 UTC
  To: David Ahern
  Cc: Arnaldo Carvalho de Melo, Linus Torvalds, Ingo Molnar, linux-kernel

On Tue, 2012-05-15 at 09:25 -0600, David Ahern wrote:
> 
> Perhaps it is specific to processor generation? 

Your error is distinctly different from Linus' in that it came from
within the arch code, Linus' was core code.

Furthermore, the error you sent had:

 [   31.528799] Hardware name: Bochs

Which is some virt crap.. so I wouldn't trust the 'hardware' anyway.
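
The core-versus-arch distinction drawn above can be read straight off the
"WARNING: at <file>:<line>" header of each splat. What follows is only a
minimal sketch, assuming the 3.4-era splat layout quoted in this thread; the
helper names parse_warning() and classify() are hypothetical, and the "arch"
path test is a rough heuristic rather than anything the kernel defines. It
also drops frames marked "?", which the kernel prints for stack entries it
could not confirm.

```python
import re

# Matches e.g. "WARNING: at kernel/events/core.c:2066 task_ctx_sched_out+0x63/0x70()"
WARN_RE = re.compile(r"WARNING: at\s+(\S+):(\d+)\s+(\w+)\+")
# Matches e.g. "[<ffffffff81096d63>] task_ctx_sched_out+0x63/0x70" and captures
# the optional "?" that marks an unreliable frame.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x")


def parse_warning(text):
    """Return (path, line, function, reliable_frames) for one WARNING splat."""
    flat = " ".join(text.split())  # undo mail line-wrapping
    m = WARN_RE.search(flat)
    if m is None:
        raise ValueError("no WARNING header found")
    frames = [name for unreliable, name in FRAME_RE.findall(flat) if not unreliable]
    return m.group(1), int(m.group(2)), m.group(3), frames


def classify(path):
    """Heuristic (assumption): a path component named 'arch' means arch code."""
    return "arch" if "arch" in path.split("/") else "core"


# Excerpts copied from the two reports in this thread.
LINUS = """
WARNING: at kernel/events/core.c:2066 task_ctx_sched_out+0x63/0x70()
 [<ffffffff81096d63>] task_ctx_sched_out+0x63/0x70
 [<ffffffff8109af06>] perf_event_comm+0x1d6/0x2e0
 [<ffffffff81368ba8>] ? get_random_int+0x88/0xb0
 [<ffffffff810e1f93>] set_task_comm+0x63/0x80
"""
DAVID = """
[   31.528795] WARNING: at
/mnt/sw/kernel-2.6.git/arch/x86/kernel/cpu/perf_event.c:1054
x86_pmu_start+0xdc/0x110()
[   31.528843]  [<ffffffff8102543c>] x86_pmu_start+0xdc/0x110
[   31.528847]  [<ffffffff81025af2>] x86_pmu_enable+0x212/0x270
[   31.528854]  [<ffffffff81116466>] perf_event_context_sched_in+0xe6/0x100
[   31.528857]  [<ffffffff81118083>] perf_event_comm+0x103/0x2b0
"""

if __name__ == "__main__":
    for name, splat in (("Linus", LINUS), ("David", DAVID)):
        path, line, func, frames = parse_warning(splat)
        print(name, classify(path), f"{func} ({path}:{line})", "<-", frames[-1])
```

Both chains pass through perf_event_comm(), but they diverge below it: one
ends in task_ctx_sched_out() and the other in x86_pmu_start().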


* Re: Perf events warning..
  2012-05-15 15:28         ` Peter Zijlstra
@ 2012-05-15 15:37           ` David Ahern
  2012-05-16  1:38             ` Namhyung Kim
  0 siblings, 1 reply; 12+ messages in thread
From: David Ahern @ 2012-05-15 15:37 UTC
  To: Peter Zijlstra
  Cc: Arnaldo Carvalho de Melo, Linus Torvalds, Ingo Molnar, linux-kernel

On 5/15/12 9:28 AM, Peter Zijlstra wrote:
> On Tue, 2012-05-15 at 09:25 -0600, David Ahern wrote:
>>
>> Perhaps it is specific to processor generation?
>
> Your error is distinctly different from Linus' in that it came from
> within the arch code, Linus' was core code.
>
> Furthermore, the error you sent had:
>
>   [   31.528799] Hardware name: Bochs
>
> Which is some virt crap.. so I wouldn't trust the 'hardware' anyway.

:-) Right, KVM and the vPMU added in 3.3. That said, it is recognized as 
a Nehalem and perf walks the Nehalem events path.

So if VM based WARNING is not to your liking, here's a baremetal version:

[   84.388495] ------------[ cut here ]------------
[   84.388554] WARNING: at /opt/sw/ahern/kernels/kernel-2.6.git/arch/x86/kernel/cpu/perf_event.c:1054 x86_pmu_start+0xdc/0x110()
[   84.388613] Hardware name: ProLiant DL380 G6
[   84.388663] Modules linked in: nfs fscache bridge stp llc ipt_MASQUERADE iptable_nat nf_nat xt_physdev nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_multiport nfsd lockd nfs_acl auth_rpcgss sunrpc coretemp ipmi_si ipmi_msghandler bnx2 i7core_edac edac_core hpilo hpwdt acpi_power_meter crc32c_intel microcode iTCO_wdt iTCO_vendor_support vhost_net pcspkr macvtap macvlan tun virtio_net kvm_intel kvm usb_storage hpsa radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
[   84.390624] Pid: 1806, comm: find Not tainted 3.4.0-rc7+ #1
[   84.390671] Call Trace:
[   84.390719]  [<ffffffff810579df>] warn_slowpath_common+0x7f/0xc0
[   84.390769]  [<ffffffff81057a3a>] warn_slowpath_null+0x1a/0x20
[   84.390831]  [<ffffffff8102546c>] x86_pmu_start+0xdc/0x110
[   84.390880]  [<ffffffff81025b22>] x86_pmu_enable+0x212/0x270
[   84.390996]  [<ffffffff81116496>] perf_event_context_sched_in+0xe6/0x100
[   84.391113]  [<ffffffff811180b3>] perf_event_comm+0x103/0x2b0
[   84.391232]  [<ffffffff81186732>] set_task_comm+0x72/0xe0
[   84.391361]  [<ffffffff81186e0b>] setup_new_exec+0x8b/0x240
[   84.391480]  [<ffffffff811ceca7>] load_elf_binary+0x3e7/0x19a0
[   84.391600]  [<ffffffff81145ac2>] ? get_user_pages+0x52/0x60
[   84.391716]  [<ffffffff81184af8>] ? get_user_arg_ptr+0x38/0x80
[   84.391833]  [<ffffffff81184f9e>] search_binary_handler+0xee/0x340
[   84.391963]  [<ffffffff811ce8c0>] ? load_elf_library+0x230/0x230
[   84.392080]  [<ffffffff81186bef>] do_execve_common+0x36f/0x410
[   84.392196]  [<ffffffff81186cca>] do_execve+0x3a/0x40
[   84.392328]  [<ffffffff8101d4a7>] sys_execve+0x47/0x70
[   84.392445]  [<ffffffff816002ec>] stub_execve+0x6c/0xc0
[   84.392558] ---[ end trace 78e50a201158fd5d ]---


Though this one is an HP server with the lovely:

[    0.143910] Performance Events: PEBS fmt1+, 16-deep LBR, Nehalem events, Broken BIOS detected, complain to your hardware vendor.
[    0.144351] [Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)
[    0.144627] Intel PMU driver.
[    0.144777] CPU erratum AAJ80 worked around

David


* Re: Perf events warning..
  2012-05-15 15:25       ` David Ahern
  2012-05-15 15:28         ` Peter Zijlstra
@ 2012-05-15 15:46         ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2012-05-15 15:46 UTC
  To: David Ahern; +Cc: Peter Zijlstra, Linus Torvalds, Ingo Molnar, linux-kernel

Em Tue, May 15, 2012 at 09:25:19AM -0600, David Ahern escreveu:
> Perhaps it is specific to processor generation? Yesterday I noted
> that perf-stat -g trips a WARNING only on Nehalem. Westmere works
> fine - perf-stat -g generates output and no warning is triggered.
> Arnaldo is using Sandy Bridge - though it's not clear if his success
> (3.4.0-rc3 on server named sandy) or fail (3.4.0-rc4-uprobes on
> felicio) was on a SNB.

Both are sandy bridges, sandy is a notebook (Intel(R) Core(TM)
i7-2920XM), felicio is a desktop (Intel(R) Core(TM) i5-2400).
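
When comparing reports across machines, the generation can be checked from
/proc/cpuinfo instead of the host name. A minimal sketch, assuming an Intel
family 6 part; the model table is a partial list written from memory for the
generations named in this thread (not copied from the kernel), and
cpu_generation() is a hypothetical helper.

```python
# Family 6 model numbers (decimal) for the generations discussed here.
# Assumption: partial, hand-written table; anything else maps to "unknown".
MODELS = {
    26: "Nehalem", 30: "Nehalem", 46: "Nehalem-EX",
    37: "Westmere", 44: "Westmere", 47: "Westmere-EX",
    42: "Sandy Bridge", 45: "Sandy Bridge-E/EP",
}


def cpu_generation(cpuinfo_text):
    """Guess the generation from the first CPU entry of /proc/cpuinfo text."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key not in fields:  # keep the first CPU's values only
            fields[key] = value.strip()
    if fields.get("vendor_id") != "GenuineIntel" or fields.get("cpu family") != "6":
        return "unknown"
    return MODELS.get(int(fields.get("model", "-1")), "unknown")


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(cpu_generation(f.read()))
```

Note that "model" and "model name" are separate keys in /proc/cpuinfo; only
the numeric "model" field is used for the lookup.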

- Arnaldo


* Re: Perf events warning..
  2012-05-15 10:49     ` Peter Zijlstra
  2012-05-15 15:25       ` David Ahern
@ 2012-05-15 15:48       ` Linus Torvalds
  1 sibling, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2012-05-15 15:48 UTC
  To: Peter Zijlstra; +Cc: Ingo Molnar, Arnaldo Carvalho de Melo, linux-kernel

On Tue, May 15, 2012 at 3:49 AM, Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
>>
>> It did use "make -j64 test" to make the load a *bit* more interesting
>> (and go noticeably faster), but other than that you got it.
>
> OK, that limits the scope of crazy scenarios I have to consider, still
> no immediate clue though..

Actually, looking into my history, some of them had "-fg", and some of
them used "cycles:p". It looks like I had three different combinations
I used:

 - the one I already mentioned:

        perf record -f -e cycles:pp make -j64 test

 - two variations of the above:

        perf record -fg -e cycles:p make -j64 test
        perf record -fg -e cycles:pp make -j64 test

and I don't know which of these caused the warning.

I did profile some other things too (I commonly do profiles of "git
diff" and "make -j" on a fully built kernel), but they used the same
flags, so from a perf standpoint they shouldn't be all that different.
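
One way to find out which of the three invocations listed above trips the
warning is to run them in turn and watch the kernel log. This is only a
sketch under several assumptions: the flags are copied verbatim from this
message and may not exist in other perf versions, the "WARNING: at " string
matches the 3.4-era log format shown in this thread, and the helper names are
hypothetical. Since the check is a WARN_ON_ONCE(), it fires at most once per
boot, so the loop stops at the first hit and a reboot is needed before trying
the remaining variants.

```python
import subprocess

WORKLOAD = ["make", "-j64", "test"]

# (flags, event) pairs copied verbatim from the message above.
VARIANTS = [("-f", "cycles:pp"), ("-fg", "cycles:p"), ("-fg", "cycles:pp")]


def perf_command(flags, event):
    """Build the argv for one 'perf record' variant."""
    return ["perf", "record", flags, "-e", event] + WORKLOAD


def count_warnings(kernel_log):
    """Count WARNING headers (3.4-era format) in a kernel log dump."""
    return kernel_log.count("WARNING: at ")


def read_dmesg():
    """Return the current kernel log; may need root on some systems."""
    return subprocess.run(["dmesg"], capture_output=True, text=True).stdout


def first_triggering_variant():
    """Run variants in order; return the first one that adds a WARNING."""
    before = count_warnings(read_dmesg())
    for flags, event in VARIANTS:
        cmd = perf_command(flags, event)
        subprocess.run(cmd, check=False)  # test failures are not our concern
        if count_warnings(read_dmesg()) > before:
            return cmd
    return None


if __name__ == "__main__":
    # Needs perf, a git source tree as the working directory, and dmesg access.
    print(first_triggering_variant())
```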

                     Linus


* Re: Perf events warning..
  2012-05-15 15:37           ` David Ahern
@ 2012-05-16  1:38             ` Namhyung Kim
  2012-05-21  6:06               ` Namhyung Kim
  0 siblings, 1 reply; 12+ messages in thread
From: Namhyung Kim @ 2012-05-16  1:38 UTC
  To: David Ahern
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Linus Torvalds,
	Ingo Molnar, linux-kernel

Hi,

On Tue, 15 May 2012 09:37:37 -0600, David Ahern wrote:
> :-) Right, KVM and the vPMU added in 3.3. That said, it is recognized
> as a Nehalem and perf walks the Nehalem events path.
>
> So if VM based WARNING is not to your liking, here's a baremetal version:
>
> [   84.388495] ------------[ cut here ]------------
> [   84.388554] WARNING: at
> /opt/sw/ahern/kernels/kernel-2.6.git/arch/x86/kernel/cpu/perf_event.c:1054
> x86_pmu_start+0xdc/0x110()
> [   84.388613] Hardware name: ProLiant DL380 G6
> [   84.388663] Modules linked in: nfs fscache bridge stp llc
> ipt_MASQUERADE iptable_nat nf_nat xt_physdev nf_conntrack_ipv4
> nf_defrag_ipv4 xt_state nf_conntrack xt_multiport nfsd lockd nfs_acl
> auth_rpcgss sunrpc coretemp ipmi_si ipmi_msghandler bnx2 i7core_edac
> edac_core hpilo hpwdt acpi_power_meter crc32c_intel microcode iTCO_wdt
> iTCO_vendor_support vhost_net pcspkr macvtap macvlan tun virtio_net
> kvm_intel kvm usb_storage hpsa radeon ttm drm_kms_helper drm
> i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
> [   84.390624] Pid: 1806, comm: find Not tainted 3.4.0-rc7+ #1
> [   84.390671] Call Trace:
> [   84.390719]  [<ffffffff810579df>] warn_slowpath_common+0x7f/0xc0
> [   84.390769]  [<ffffffff81057a3a>] warn_slowpath_null+0x1a/0x20
> [   84.390831]  [<ffffffff8102546c>] x86_pmu_start+0xdc/0x110
> [   84.390880]  [<ffffffff81025b22>] x86_pmu_enable+0x212/0x270
> [   84.390996]  [<ffffffff81116496>] perf_event_context_sched_in+0xe6/0x100
> [   84.391113]  [<ffffffff811180b3>] perf_event_comm+0x103/0x2b0
> [   84.391232]  [<ffffffff81186732>] set_task_comm+0x72/0xe0
> [   84.391361]  [<ffffffff81186e0b>] setup_new_exec+0x8b/0x240
> [   84.391480]  [<ffffffff811ceca7>] load_elf_binary+0x3e7/0x19a0
> [   84.391600]  [<ffffffff81145ac2>] ? get_user_pages+0x52/0x60
> [   84.391716]  [<ffffffff81184af8>] ? get_user_arg_ptr+0x38/0x80
> [   84.391833]  [<ffffffff81184f9e>] search_binary_handler+0xee/0x340
> [   84.391963]  [<ffffffff811ce8c0>] ? load_elf_library+0x230/0x230
> [   84.392080]  [<ffffffff81186bef>] do_execve_common+0x36f/0x410
> [   84.392196]  [<ffffffff81186cca>] do_execve+0x3a/0x40
> [   84.392328]  [<ffffffff8101d4a7>] sys_execve+0x47/0x70
> [   84.392445]  [<ffffffff816002ec>] stub_execve+0x6c/0xc0
> [   84.392558] ---[ end trace 78e50a201158fd5d ]---
>
>
> Though this one is an HP server with the lovely:
>
> [    0.143910] Performance Events: PEBS fmt1+, 16-deep LBR, Nehalem
> events, Broken BIOS detected, complain to your hardware vendor.
> [    0.144351] [Firmware Bug]: the BIOS has corrupted hw-PMU resources
> (MSR 38d is 330)
> [    0.144627] Intel PMU driver.
> [    0.144777] CPU erratum AAJ80 worked around
>
> David

I got a similar warning on my SNB (i7-3930K) desktop.

Thanks,
Namhyung


* Re: Perf events warning..
  2012-05-16  1:38             ` Namhyung Kim
@ 2012-05-21  6:06               ` Namhyung Kim
  0 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2012-05-21  6:06 UTC
  To: David Ahern
  Cc: Peter Zijlstra, Arnaldo Carvalho de Melo, Linus Torvalds,
	Ingo Molnar, linux-kernel

Hi,

On Wed, 16 May 2012 10:38:33 +0900, Namhyung Kim wrote:
> On Tue, 15 May 2012 09:37:37 -0600, David Ahern wrote:
>> :-) Right, KVM and the vPMU added in 3.3. That said, it is recognized
>> as a Nehalem and perf walks the Nehalem events path.
>>
>> So if VM based WARNING is not to your liking, here's a baremetal version:
>>
>> [   84.388495] ------------[ cut here ]------------
>> [   84.388554] WARNING: at
>> /opt/sw/ahern/kernels/kernel-2.6.git/arch/x86/kernel/cpu/perf_event.c:1054
>> x86_pmu_start+0xdc/0x110()
>> [   84.388613] Hardware name: ProLiant DL380 G6
>> [   84.388663] Modules linked in: nfs fscache bridge stp llc
>> ipt_MASQUERADE iptable_nat nf_nat xt_physdev nf_conntrack_ipv4
>> nf_defrag_ipv4 xt_state nf_conntrack xt_multiport nfsd lockd nfs_acl
>> auth_rpcgss sunrpc coretemp ipmi_si ipmi_msghandler bnx2 i7core_edac
>> edac_core hpilo hpwdt acpi_power_meter crc32c_intel microcode iTCO_wdt
>> iTCO_vendor_support vhost_net pcspkr macvtap macvlan tun virtio_net
>> kvm_intel kvm usb_storage hpsa radeon ttm drm_kms_helper drm
>> i2c_algo_bit i2c_core [last unloaded: scsi_wait_scan]
>> [   84.390624] Pid: 1806, comm: find Not tainted 3.4.0-rc7+ #1
>> [   84.390671] Call Trace:
>> [   84.390719]  [<ffffffff810579df>] warn_slowpath_common+0x7f/0xc0
>> [   84.390769]  [<ffffffff81057a3a>] warn_slowpath_null+0x1a/0x20
>> [   84.390831]  [<ffffffff8102546c>] x86_pmu_start+0xdc/0x110
>> [   84.390880]  [<ffffffff81025b22>] x86_pmu_enable+0x212/0x270
>> [   84.390996]  [<ffffffff81116496>] perf_event_context_sched_in+0xe6/0x100
>> [   84.391113]  [<ffffffff811180b3>] perf_event_comm+0x103/0x2b0
>> [   84.391232]  [<ffffffff81186732>] set_task_comm+0x72/0xe0
>> [   84.391361]  [<ffffffff81186e0b>] setup_new_exec+0x8b/0x240
>> [   84.391480]  [<ffffffff811ceca7>] load_elf_binary+0x3e7/0x19a0
>> [   84.391600]  [<ffffffff81145ac2>] ? get_user_pages+0x52/0x60
>> [   84.391716]  [<ffffffff81184af8>] ? get_user_arg_ptr+0x38/0x80
>> [   84.391833]  [<ffffffff81184f9e>] search_binary_handler+0xee/0x340
>> [   84.391963]  [<ffffffff811ce8c0>] ? load_elf_library+0x230/0x230
>> [   84.392080]  [<ffffffff81186bef>] do_execve_common+0x36f/0x410
>> [   84.392196]  [<ffffffff81186cca>] do_execve+0x3a/0x40
>> [   84.392328]  [<ffffffff8101d4a7>] sys_execve+0x47/0x70
>> [   84.392445]  [<ffffffff816002ec>] stub_execve+0x6c/0xc0
>> [   84.392558] ---[ end trace 78e50a201158fd5d ]---
>>
>>
>> Though this one is an HP server with the lovely:
>>
>> [    0.143910] Performance Events: PEBS fmt1+, 16-deep LBR, Nehalem
>> events, Broken BIOS detected, complain to your hardware vendor.
>> [    0.144351] [Firmware Bug]: the BIOS has corrupted hw-PMU resources
>> (MSR 38d is 330)
>> [    0.144627] Intel PMU driver.
>> [    0.144777] CPU erratum AAJ80 worked around
>>
>> David
>
> I got a similar warning on my SNB (i7-3930K) desktop.
>

The git bisect told me (sigh):

a34668f6beb4ab01e07683276d6a24bab6c175e0 is the first bad commit
commit a34668f6beb4ab01e07683276d6a24bab6c175e0
Author: Youquan Song <youquan.song@intel.com>
Date:   Tue Aug 2 14:01:35 2011 +0800

    perf, x86: Add model 45 SandyBridge support
    
    Add support to Romely-EP SandyBridge.
    
    Signed-off-by: Youquan Song <youquan.song@intel.com>
    Signed-off-by: Anhua Xu <anhua.xu@intel.com>
    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/1312264895-2010-1-git-send-email-youquan.song@intel.com
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

:040000 040000 c8302c68b80d3657cfd6afa5eb51300c0996baf1 ba78ebeb8ad6cce92fba8f643b4c5b1007d0e336 M	arch
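
For context on how a "first bad commit" like the one above is found: a bisect
is essentially a binary search between a known-good and a known-bad point. The
sketch below shows that idea on a plain linear list, assuming that every
commit after the first bad one is also bad; real git bisect works on the
commit graph, and the is_bad callback here is a hypothetical stand-in for
"build, boot, run the workload, check for the WARNING".

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by halving the search range.

    commits is ordered oldest to newest; commits[0] is assumed good and
    commits[-1] is assumed bad, so neither endpoint is tested again.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


if __name__ == "__main__":
    history = list(range(1000))          # hypothetical commit ids
    tested = []

    def is_bad(commit):
        tested.append(commit)
        return commit >= 617             # hypothetical regression point

    print(first_bad(history, is_bad), "found in", len(tested), "tests")
```

The number of test runs grows with the logarithm of the history length, which
is why bisecting is practical even across many kernel commits.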


Thanks,
Namhyung

