linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Oprofile bug ?
@ 2010-11-12 21:11 Sylvain GENEVES
  2010-11-12 21:45 ` Frank Ch. Eigler
  2010-11-13 16:18 ` Peter Zijlstra
  0 siblings, 2 replies; 3+ messages in thread
From: Sylvain GENEVES @ 2010-11-12 21:11 UTC (permalink / raw)
  To: oprofile-list; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 677 bytes --]

Hello,

I'm encountering unexpected behaviour with OProfile when the profiled
system is under heavy load : "BUG: unable to handle kernel paging request
at 0000000000004cc3" (full console message is attached).

I'm using a linux kernel 2.6.32 and both OProfile 0.9.6 from Debian
repositories and OProfile 0.9.7 from CVS on Intel Xeon E5440 harpertown.

It happens when I profile a webserver under heavy load, with some cores
disabled, and with the following event :
--event=CPU_CLK_UNHALTED:1600000000000

I also use the --separate=all --callgraph=3 and my kernel has debug
symbols and frame pointers enabled.

Anyone has any idea on what is happening ?
Thanks,
Regards,
Sylvain

[-- Attachment #2: oprofile_bug_stacktrace --]
[-- Type: application/octet-stream, Size: 8743 bytes --]

oprofile: using NMI interrupt.
BUG: unable to handle kernel paging request at 0000000000004cc3
IP: [<ffffffff8100eda8>] profile_pc+0x2b/0x37
PGD 1d9cae067 PUD 22f15a067 PMD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu1/online
CPU 0 
Modules linked in: oprofile mx_driver(P) mx_mcp(P) video output sbs sbshc battery ac cpufreq_userspace cpufreq_stats cpufreq_ondemand freq_table cpufreq_powersave cpufreq_conservative nfs lockd nfs_acl auth_rpcgss sunrpc ext3 jbd tcp_cubic dm_snapshot dm_mirror dm_region_hash dm_log dm_mod ib_uverbs ib_ipoib ib_cm ib_sa ib_mad ib_core ipv6 sg rtc_cmos rtc_core psmouse i2c_i801 rtc_lib i5000_edac sr_mod cdrom shpchp serio_raw edac_core tpm_tis tpm tpm_bios evdev joydev i2c_core pcspkr iTCO_wdt container pci_hotplug i5k_amb processor button ext2 mbcache usbhid hid usb_storage sd_mod uhci_hcd ehci_hcd e1000e thermal fan thermal_sys bnx2 ahci tg3 libphy forcedeth e1000 cciss ata_piix megaraid_sas mptsas scsi_transport_sas sata_svw sata_sil sata_nv libata mptspi mptscsih mptbase scsi_transport_spi scsi_mod [last unloaded: oprofile]
Pid: 5008, comm: userver Tainted: P        W  2.6.32 #3 AltixXE310
RIP: 0010:[<ffffffff8100eda8>]  [<ffffffff8100eda8>] profile_pc+0x2b/0x37
RSP: 0018:ffff880028207de8  EFLAGS: 00010002
RAX: 0000000000004cbb RBX: ffffffff812d1ab0 RCX: 0000000000000027
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff812d1ab0
RBP: ffff880028207df8 R08: 0000000000000000 R09: 0000000000000002
R10: 0000000000000001 R11: ffffffff8118cbe4 R12: ffff880028207f58
R13: 0000000000000000 R14: ffff880028207f58 R15: 0000000000000001
FS:  00007ffff7fef700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000004cc3 CR3: 0000000202a55000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process userver (pid: 5008, threadinfo ffff880223cc2000, task ffff880223d8c2c0)
Stack:
 0000000000000000 0000000000000000 ffff880028207e38 ffffffffa02535fb
<0> 0000000000000000 0000000000000000 0000000000000000 0000000000000000
<0> ffff880028213950 ffff880028207f58 ffff880028207e78 ffffffffa0255d35
Call Trace:
 <NMI> 
 [<ffffffffa02535fb>] oprofile_add_sample+0x2a/0x8a [oprofile]
 [<ffffffffa0255d35>] ppro_check_ctrs+0x6b/0xd3 [oprofile]
 [<ffffffffa02550af>] profile_exceptions_notify+0x39/0x40 [oprofile]
 [<ffffffff8105bed3>] notifier_call_chain+0x33/0x5b
 [<ffffffff8105bf1d>] atomic_notifier_call_chain+0x13/0x15
 [<ffffffff8105bfbf>] notify_die+0x2e/0x30
 [<ffffffff8100d96c>] do_nmi+0x9a/0x266
 [<ffffffff812d20fa>] nmi+0x1a/0x20
 [<ffffffff8118cbe4>] ? is_swiotlb_buffer+0x2b/0x39
 [<ffffffff812d1ab0>] ? _spin_lock_irqsave+0x0/0x1d
 <<EOE>> 
 <IRQ> 
 [<ffffffffa052cfd9>] ? mx_gw_lookup+0x2e/0x96 [mx_driver]
 [<ffffffffa0522757>] ? mx_ether_xmit+0x130/0x668 [mx_driver]
 [<ffffffff81271368>] ? ip_rcv_finish+0x33c/0x356
 [<ffffffff8105f946>] ? getnstimeofday+0x5f/0xbf
 [<ffffffff8118cbe4>] ? is_swiotlb_buffer+0x2b/0x39
 [<ffffffff8124d80d>] ? dev_hard_start_xmit+0x224/0x2d0
 [<ffffffff81260ae9>] ? sch_direct_xmit+0x5a/0x157
 [<ffffffff81260ca2>] ? __qdisc_run+0xbc/0xdc
 [<ffffffff8124af32>] ? net_tx_action+0xc2/0x120
 [<ffffffff81046767>] ? __do_softirq+0xd8/0x195
 [<ffffffff8100cb5c>] ? call_softirq+0x1c/0x28
 [<ffffffff8100e5ef>] ? do_softirq+0x33/0x6b
 [<ffffffff8104654b>] ? irq_exit+0x36/0x75
 [<ffffffff8100dcf1>] ? do_IRQ+0xa3/0xba
 [<ffffffff8100c3d3>] ? ret_from_intr+0x0/0xa
 <EOI> 
 [<ffffffff810fe25e>] ? generic_file_splice_read+0x0/0x72
 [<ffffffff810fe1ac>] ? __generic_file_splice_read+0x393/0x445
 [<ffffffff810fdee5>] ? __generic_file_splice_read+0xcc/0x445
 [<ffffffff81286780>] ? tcp_init_tso_segs+0x32/0x49
 [<ffffffff8104dc29>] ? mod_timer+0x188/0x19a
 [<ffffffff8124655b>] ? __alloc_skb+0x6b/0x164
 [<ffffffff812d1c78>] ? _spin_unlock_bh+0xf/0x11
 [<ffffffff812412f3>] ? release_sock+0xa6/0xae
 [<ffffffff8127db78>] ? tcp_sendmsg+0x930/0xa2e
 [<ffffffff8123e4c2>] ? sock_aio_write+0x10f/0x123
 [<ffffffff810fc1e0>] ? pipe_to_sendpage+0x0/0x69
 [<ffffffff810fc571>] ? __splice_from_pipe+0x3f/0x6d
 [<ffffffff810fc902>] ? spd_release_page+0x0/0x14
 [<ffffffff810fe2a1>] ? generic_file_splice_read+0x43/0x72
 [<ffffffff810fca88>] ? do_splice_to+0x6c/0x7f
 [<ffffffff810fcb62>] ? splice_direct_to_actor+0xc7/0x193
 [<ffffffff810fd174>] ? direct_splice_actor+0x0/0x1d
 [<ffffffff812d1be6>] ? _spin_lock_bh+0x11/0x29
 [<ffffffff810fcc7f>] ? do_splice_direct+0x51/0x6c
 [<ffffffff810df5a3>] ? do_sendfile+0x151/0x1d5
 [<ffffffff810df676>] ? sys_sendfile64+0x4f/0x8c
 [<ffffffff810dfde4>] ? sys_write+0x47/0x6e
 [<ffffffff8100baab>] ? system_call_fastpath+0x16/0x1b
Code: 55 48 89 e5 41 54 49 89 fc 53 f6 87 88 00 00 00 03 48 8b 9f 80 00 00 00 75 15 48 89 df e8 85 9e 05 00 85 c0 74 09 49 8b 44 24 20 <48> 8b 58 08 48 89 d8 5b 41 5c c9 c3 55 83 ff 03 ba ea ff ff ff 
RIP  [<ffffffff8100eda8>] profile_pc+0x2b/0x37
 RSP <ffff880028207de8>
CR2: 0000000000004cc3
---[ end trace 4eaa2a86a8e2da24 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 5008, comm: userver Tainted: P      D W  2.6.32 #3
Call Trace:
 <NMI>  [<ffffffff812cf4f0>] panic+0xa0/0x157
 [<ffffffff811a25e2>] ? vgacon_set_cursor_size+0xfc/0x108
 [<ffffffff8105bd7c>] ? up+0x34/0x39
 [<ffffffff81040f5f>] ? release_console_sem+0x18a/0x1bf
 [<ffffffff8104118e>] ? console_unblank+0x65/0x6a
 [<ffffffff8100f869>] oops_end+0xaa/0xba
 [<ffffffff81028622>] no_context+0x1ea/0x1f9
 [<ffffffff810287e4>] __bad_area_nosemaphore+0x1b3/0x1d9
 [<ffffffff81091aae>] ? ring_buffer_lock_reserve+0x103/0x145
 [<ffffffff8100baab>] ? system_call_fastpath+0x16/0x1b
 [<ffffffff81090aea>] ? ring_buffer_unlock_commit+0x5b/0x5f
 [<ffffffff8106a9eb>] ? __module_text_address+0xd/0x5b
 [<ffffffff8106aa42>] ? is_module_text_address+0x9/0x11
 [<ffffffff81028818>] bad_area_nosemaphore+0xe/0x10
 [<ffffffff81028ab8>] do_page_fault+0x114/0x24a
 [<ffffffff812d1eaf>] page_fault+0x1f/0x30
 [<ffffffff812d1ab0>] ? _spin_lock_irqsave+0x0/0x1d
 [<ffffffff8118cbe4>] ? is_swiotlb_buffer+0x2b/0x39
 [<ffffffff812d1ab0>] ? _spin_lock_irqsave+0x0/0x1d
 [<ffffffff8100eda8>] ? profile_pc+0x2b/0x37
 [<ffffffff8100ed9f>] ? profile_pc+0x22/0x37
 [<ffffffffa02535fb>] oprofile_add_sample+0x2a/0x8a [oprofile]
 [<ffffffffa0255d35>] ppro_check_ctrs+0x6b/0xd3 [oprofile]
 [<ffffffffa02550af>] profile_exceptions_notify+0x39/0x40 [oprofile]
 [<ffffffff8105bed3>] notifier_call_chain+0x33/0x5b
 [<ffffffff8105bf1d>] atomic_notifier_call_chain+0x13/0x15
 [<ffffffff8105bfbf>] notify_die+0x2e/0x30
 [<ffffffff8100d96c>] do_nmi+0x9a/0x266
 [<ffffffff812d20fa>] nmi+0x1a/0x20
 [<ffffffff8118cbe4>] ? is_swiotlb_buffer+0x2b/0x39
 [<ffffffff812d1ab0>] ? _spin_lock_irqsave+0x0/0x1d
 <<EOE>>  <IRQ>  [<ffffffffa052cfd9>] ? mx_gw_lookup+0x2e/0x96 [mx_driver]
 [<ffffffffa0522757>] ? mx_ether_xmit+0x130/0x668 [mx_driver]
 [<ffffffff81271368>] ? ip_rcv_finish+0x33c/0x356
 [<ffffffff8105f946>] ? getnstimeofday+0x5f/0xbf
 [<ffffffff8118cbe4>] ? is_swiotlb_buffer+0x2b/0x39
 [<ffffffff8124d80d>] ? dev_hard_start_xmit+0x224/0x2d0
 [<ffffffff81260ae9>] ? sch_direct_xmit+0x5a/0x157
 [<ffffffff81260ca2>] ? __qdisc_run+0xbc/0xdc
 [<ffffffff8124af32>] ? net_tx_action+0xc2/0x120
 [<ffffffff81046767>] ? __do_softirq+0xd8/0x195
 [<ffffffff8100cb5c>] ? call_softirq+0x1c/0x28
 [<ffffffff8100e5ef>] ? do_softirq+0x33/0x6b
 [<ffffffff8104654b>] ? irq_exit+0x36/0x75
 [<ffffffff8100dcf1>] ? do_IRQ+0xa3/0xba
 [<ffffffff8100c3d3>] ? ret_from_intr+0x0/0xa
 <EOI>  [<ffffffff810fe25e>] ? generic_file_splice_read+0x0/0x72
 [<ffffffff810fe1ac>] ? __generic_file_splice_read+0x393/0x445
 [<ffffffff810fdee5>] ? __generic_file_splice_read+0xcc/0x445
 [<ffffffff81286780>] ? tcp_init_tso_segs+0x32/0x49
 [<ffffffff8104dc29>] ? mod_timer+0x188/0x19a
 [<ffffffff8124655b>] ? __alloc_skb+0x6b/0x164
 [<ffffffff812d1c78>] ? _spin_unlock_bh+0xf/0x11
 [<ffffffff812412f3>] ? release_sock+0xa6/0xae
 [<ffffffff8127db78>] ? tcp_sendmsg+0x930/0xa2e
 [<ffffffff8123e4c2>] ? sock_aio_write+0x10f/0x123
 [<ffffffff810fc1e0>] ? pipe_to_sendpage+0x0/0x69
 [<ffffffff810fc571>] ? __splice_from_pipe+0x3f/0x6d
 [<ffffffff810fc902>] ? spd_release_page+0x0/0x14
 [<ffffffff810fe2a1>] ? generic_file_splice_read+0x43/0x72
 [<ffffffff810fca88>] ? do_splice_to+0x6c/0x7f
 [<ffffffff810fcb62>] ? splice_direct_to_actor+0xc7/0x193
 [<ffffffff810fd174>] ? direct_splice_actor+0x0/0x1d
 [<ffffffff812d1be6>] ? _spin_lock_bh+0x11/0x29
 [<ffffffff810fcc7f>] ? do_splice_direct+0x51/0x6c
 [<ffffffff810df5a3>] ? do_sendfile+0x151/0x1d5
 [<ffffffff810df676>] ? sys_sendfile64+0x4f/0x8c
 [<ffffffff810dfde4>] ? sys_write+0x47/0x6e
 [<ffffffff8100baab>] ? system_call_fastpath+0x16/0x1b


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Oprofile bug ?
  2010-11-12 21:11 Oprofile bug ? Sylvain GENEVES
@ 2010-11-12 21:45 ` Frank Ch. Eigler
  2010-11-13 16:18 ` Peter Zijlstra
  1 sibling, 0 replies; 3+ messages in thread
From: Frank Ch. Eigler @ 2010-11-12 21:45 UTC (permalink / raw)
  To: Sylvain GENEVES; +Cc: oprofile-list, linux-kernel

"Sylvain GENEVES" <Sylvain.Geneves@inrialpes.fr> writes:

> [...]
> I'm encountering unexpected behaviour with OProfile when the profiled
> system is under heavy load : "BUG: unable to handle kernel paging request
> at 0000000000004cc3" (full console message is attached).
> [...]
> Anyone has any idea on what is happening ?

Just glancing at that oops & my local random kernel build, it appears
as though this part of arch/x86/kernel/time.c:profile_pc is failing:

unsigned long profile_pc(struct pt_regs *regs)
{
        unsigned long pc = instruction_pointer(regs);

        if (!user_mode_vm(regs) && in_lock_functions(pc)) {
#ifdef CONFIG_FRAME_POINTER
                return *(unsigned long *)(regs->bp + sizeof(long));
#else
                       ^^^^^^^^^^^^^^^^^^
[...]

regs->bp must have been 0x4cbb, which this code turns into an
unchecked dereferences at 0x4cbb+8 = 0x4cc3.  I don't have a theory
as to why regs->bp should have that value in it, but the kernel
should probably use probe_kernel_read() or somesuch to validate the
value before dereferencing it.

- FChE

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Oprofile bug ?
  2010-11-12 21:11 Oprofile bug ? Sylvain GENEVES
  2010-11-12 21:45 ` Frank Ch. Eigler
@ 2010-11-13 16:18 ` Peter Zijlstra
  1 sibling, 0 replies; 3+ messages in thread
From: Peter Zijlstra @ 2010-11-13 16:18 UTC (permalink / raw)
  To: Sylvain GENEVES; +Cc: oprofile-list, linux-kernel, robert.richter

On Fri, 2010-11-12 at 22:11 +0100, Sylvain GENEVES wrote:
> Anyone has any idea on what is happening ?

Yeah, the oprofile code is terminally broken, it uses
__copy_from_user_inatomic() from NMI context.



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-11-13 16:18 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-12 21:11 Oprofile bug ? Sylvain GENEVES
2010-11-12 21:45 ` Frank Ch. Eigler
2010-11-13 16:18 ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).