linux-kernel.vger.kernel.org archive mirror
* Kernel 2.6.38.6 page allocation failure (ixgbe)
@ 2011-05-10 14:04 Stefan Majer
  2011-05-10 14:20 ` Yehuda Sadeh Weinraub
       [not found] ` <BANLkTik=FM5LJs8JUKHR2S+r41vi94Z7pw@mail.gmail.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Stefan Majer @ 2011-05-10 14:04 UTC (permalink / raw)
  To: linux-net, linux-kernel; +Cc: ceph-devel

Hi,

I'm running 4 nodes with Ceph on top of btrfs, each with a dual-port
Intel X520 10Gb Ethernet card and the latest 3.3.9 ixgbe driver.
During benchmarks I get the stack trace below.
I can easily reproduce this by simply running rados bench from a fast
machine against these 4 nodes as the Ceph cluster.
We saw this both with the stock ixgbe driver from 2.6.38.6 and with
the latest 3.3.9 ixgbe.
This kernel is tainted because we use Fusion-io ioDrives as journal
devices for btrfs.

Any hints to nail this down are welcome.

Greetings Stefan Majer

May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
failure. order:2, mode:0x4020
May 10 15:26:40 os02 kernel: [ 3652.485223] kswapd0: page allocation
failure. order:2, mode:0x4020
May 10 15:26:40 os02 kernel: [ 3652.485228] Pid: 57, comm: kswapd0
Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
May 10 15:26:40 os02 kernel: [ 3652.485230] Call Trace:
May 10 15:26:40 os02 kernel: [ 3652.485232]  <IRQ>
[<ffffffff81108ce7>] ? __alloc_pages_nodemask+0x6f7/0x8a0
May 10 15:26:40 os02 kernel: [ 3652.485247]  [<ffffffff814b0ad0>] ?
ip_local_deliver+0x80/0x90
May 10 15:26:40 os02 kernel: [ 3652.485250] cosd: page allocation
failure. order:2, mode:0x4020
May 10 15:26:40 os02 kernel: [ 3652.485256]  [<ffffffff81146cd2>] ?
kmalloc_large_node+0x62/0xb0
May 10 15:26:40 os02 kernel: [ 3652.485259] Pid: 1849, comm: cosd
Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
May 10 15:26:40 os02 kernel: [ 3652.485261] Call Trace:
May 10 15:26:40 os02 kernel: [ 3652.485264]  [<ffffffff8114becb>] ?
__kmalloc_node_track_caller+0x15b/0x1d0
May 10 15:26:40 os02 kernel: [ 3652.485266]  <IRQ>
[<ffffffff81466f74>] ? __netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3652.485274]  [<ffffffff81108ce7>] ?
__alloc_pages_nodemask+0x6f7/0x8a0
May 10 15:26:40 os02 kernel: [ 3652.485277]  [<ffffffff81466713>] ?
__alloc_skb+0x83/0x170
May 10 15:26:40 os02 kernel: [ 3652.485281]  [<ffffffff814b0ad0>] ?
ip_local_deliver+0x80/0x90
May 10 15:26:40 os02 kernel: [ 3652.485283]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3652.485287]  [<ffffffff81146cd2>] ?
kmalloc_large_node+0x62/0xb0
May 10 15:26:40 os02 kernel: [ 3652.485297]  [<ffffffffa005d9aa>] ?
ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3652.485300]  [<ffffffff8114becb>] ?
__kmalloc_node_track_caller+0x15b/0x1d0
May 10 15:26:40 os02 kernel: [ 3652.485305]  [<ffffffff812b79e0>] ?
swiotlb_map_page+0x0/0x110
May 10 15:26:40 os02 kernel: [ 3652.485308]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3652.485315]  [<ffffffffa0060930>] ?
ixgbe_poll+0x1140/0x1670 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3652.485318]  [<ffffffff81466713>] ?
__alloc_skb+0x83/0x170
May 10 15:26:40 os02 kernel: [ 3652.485323]  [<ffffffff810f33eb>] ?
perf_pmu_enable+0x2b/0x40
May 10 15:26:40 os02 kernel: [ 3652.485326]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3652.485330]  [<ffffffff81474eb2>] ?
net_rx_action+0x102/0x2a0
May 10 15:26:40 os02 kernel: [ 3652.485336]  [<ffffffffa005d9aa>] ?
ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3652.485341]  [<ffffffff8106b745>] ?
__do_softirq+0xb5/0x210
May 10 15:26:40 os02 kernel: [ 3652.485344]  [<ffffffff81474840>] ?
napi_skb_finish+0x50/0x70
May 10 15:26:40 os02 kernel: [ 3652.485348]  [<ffffffff810c7ca4>] ?
handle_IRQ_event+0x54/0x180
May 10 15:26:40 os02 kernel: [ 3652.485354]  [<ffffffffa0060930>] ?
ixgbe_poll+0x1140/0x1670 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3652.485357]  [<ffffffff8106b7bd>] ?
__do_softirq+0x12d/0x210
May 10 15:26:40 os02 kernel: [ 3652.485360]  [<ffffffff810f33eb>] ?
perf_pmu_enable+0x2b/0x40
May 10 15:26:40 os02 kernel: [ 3652.485364]  [<ffffffff8100cf3c>] ?
call_softirq+0x1c/0x30
May 10 15:26:40 os02 kernel: [ 3652.485367]  [<ffffffff81474eb2>] ?
net_rx_action+0x102/0x2a0
May 10 15:26:40 os02 kernel: [ 3652.485369]  [<ffffffff8100e975>] ?
do_softirq+0x65/0xa0
May 10 15:26:40 os02 kernel: [ 3652.485372]  [<ffffffff8106b745>] ?
__do_softirq+0xb5/0x210
May 10 15:26:40 os02 kernel: [ 3652.485375]  [<ffffffff8106b605>] ?
irq_exit+0x95/0xa0
May 10 15:26:40 os02 kernel: [ 3652.485379]  [<ffffffff810c7ca4>] ?
handle_IRQ_event+0x54/0x180
May 10 15:26:40 os02 kernel: [ 3652.485383]  [<ffffffff8154a276>] ?
do_IRQ+0x66/0xe0
May 10 15:26:40 os02 kernel: [ 3652.485386]  [<ffffffff8106b7bd>] ?
__do_softirq+0x12d/0x210
May 10 15:26:40 os02 kernel: [ 3652.485389]  [<ffffffff81542a53>] ?
ret_from_intr+0x0/0x15
May 10 15:26:40 os02 kernel: [ 3652.485391]  <EOI>
[<ffffffff8100cf3c>] ? call_softirq+0x1c/0x30
May 10 15:26:40 os02 kernel: [ 3652.485397]  [<ffffffff81110a54>] ?
shrink_inactive_list+0x164/0x460
May 10 15:26:40 os02 kernel: [ 3652.485400]  [<ffffffff8100e975>] ?
do_softirq+0x65/0xa0
May 10 15:26:40 os02 kernel: [ 3652.485404]  [<ffffffff8153facc>] ?
schedule+0x44c/0xa10
May 10 15:26:40 os02 kernel: [ 3652.485407]  [<ffffffff8106b605>] ?
irq_exit+0x95/0xa0
May 10 15:26:40 os02 kernel: [ 3652.485412]  [<ffffffff81109b1a>] ?
determine_dirtyable_memory+0x1a/0x30
May 10 15:26:40 os02 kernel: [ 3652.485416]  [<ffffffff8154a276>] ?
do_IRQ+0x66/0xe0
May 10 15:26:40 os02 kernel: [ 3652.485419]  [<ffffffff81111453>] ?
shrink_zone+0x3d3/0x530
May 10 15:26:40 os02 kernel: [ 3652.485422]  [<ffffffff81542a53>] ?
ret_from_intr+0x0/0x15
May 10 15:26:40 os02 kernel: [ 3652.485423]  <EOI>
[<ffffffff81074a4a>] ? del_timer_sync+0x3a/0x60
May 10 15:26:40 os02 kernel: [ 3652.485430]  [<ffffffff812a774d>] ?
copy_user_generic_string+0x2d/0x40
May 10 15:26:40 os02 kernel: [ 3652.485435]  [<ffffffff811054a5>] ?
zone_watermark_ok_safe+0xb5/0xd0
May 10 15:26:40 os02 kernel: [ 3652.485439]  [<ffffffff810ff351>] ?
iov_iter_copy_from_user_atomic+0x101/0x170
May 10 15:26:40 os02 kernel: [ 3652.485442]  [<ffffffff81112a69>] ?
kswapd+0x889/0xb20
May 10 15:26:40 os02 kernel: [ 3652.485457]  [<ffffffffa026c91d>] ?
btrfs_copy_from_user+0xcd/0x130 [btrfs]
May 10 15:26:40 os02 kernel: [ 3652.485460]  [<ffffffff811121e0>] ?
kswapd+0x0/0xb20
May 10 15:26:40 os02 kernel: [ 3652.485472]  [<ffffffffa026d844>] ?
__btrfs_buffered_write+0x1a4/0x330 [btrfs]
May 10 15:26:40 os02 kernel: [ 3652.485476]  [<ffffffff810862b6>] ?
kthread+0x96/0xa0
May 10 15:26:40 os02 kernel: [ 3652.485479]  [<ffffffff8117151f>] ?
file_update_time+0x5f/0x170
May 10 15:26:40 os02 kernel: [ 3652.485482]  [<ffffffff8100ce44>] ?
kernel_thread_helper+0x4/0x10
May 10 15:26:40 os02 kernel: [ 3652.485493]  [<ffffffffa026dc08>] ?
btrfs_file_aio_write+0x238/0x4e0 [btrfs]
May 10 15:26:40 os02 kernel: [ 3652.485496]  [<ffffffff81086220>] ?
kthread+0x0/0xa0
May 10 15:26:40 os02 kernel: [ 3652.485507]  [<ffffffffa026d9d0>] ?
btrfs_file_aio_write+0x0/0x4e0 [btrfs]
May 10 15:26:40 os02 kernel: [ 3652.485511]  [<ffffffff8100ce40>] ?
kernel_thread_helper+0x0/0x10
May 10 15:26:40 os02 kernel: [ 3652.485515]  [<ffffffff81158ff3>] ?
do_sync_readv_writev+0xd3/0x110
May 10 15:26:40 os02 kernel: [ 3652.485516] Mem-Info:
May 10 15:26:40 os02 kernel: [ 3652.485519]  [<ffffffff81163d42>] ?
path_put+0x22/0x30
May 10 15:26:40 os02 kernel: [ 3652.485521] Node 0 DMA per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.485525]  [<ffffffff812584a3>] ?
selinux_file_permission+0xf3/0x150
May 10 15:26:40 os02 kernel: [ 3652.485528] CPU    0: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485530] CPU    1: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485534]  [<ffffffff81251583>] ?
security_file_permission+0x23/0x90
May 10 15:26:40 os02 kernel: [ 3652.485535] CPU    2: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485538] CPU    3: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485542]  [<ffffffff81159f14>] ?
do_readv_writev+0xd4/0x1e0
May 10 15:26:40 os02 kernel: [ 3652.485544] CPU    4: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485547] CPU    5: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485550]  [<ffffffff81540d91>] ?
mutex_lock+0x31/0x60
May 10 15:26:40 os02 kernel: [ 3652.485552] CPU    6: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485554] CPU    7: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485557]  [<ffffffff8115a066>] ?
vfs_writev+0x46/0x60
May 10 15:26:40 os02 kernel: [ 3652.485558] Node 0 DMA32 per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.485562]  [<ffffffff8115a1a1>] ?
sys_writev+0x51/0xc0
May 10 15:26:40 os02 kernel: [ 3652.485564] CPU    0: hi:  186, btch:
31 usd: 144
May 10 15:26:40 os02 kernel: [ 3652.485567] CPU    1: hi:  186, btch:
31 usd: 198
May 10 15:26:40 os02 kernel: [ 3652.485571]  [<ffffffff8100c002>] ?
system_call_fastpath+0x16/0x1b
May 10 15:26:40 os02 kernel: [ 3652.485573] CPU    2: hi:  186, btch:
31 usd: 180
May 10 15:26:40 os02 kernel: [ 3652.485574] Mem-Info:
May 10 15:26:40 os02 kernel: [ 3652.485576] CPU    3: hi:  186, btch:
31 usd: 171
May 10 15:26:40 os02 kernel: [ 3652.485578] Node 0 CPU    4: hi:  186,
btch:  31 usd: 159
May 10 15:26:40 os02 kernel: [ 3652.485581] DMA per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.485582] CPU    5: hi:  186, btch:
31 usd:  69
May 10 15:26:40 os02 kernel: [ 3652.485585] CPU    0: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485587] CPU    6: hi:  186, btch:
31 usd: 180
May 10 15:26:40 os02 kernel: [ 3652.485589] CPU    1: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485591] CPU    7: hi:  186, btch:
31 usd: 184
May 10 15:26:40 os02 kernel: [ 3652.485593] CPU    2: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485594] Node 0 CPU    3: hi:    0,
btch:   1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485597] Normal per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.485598] CPU    4: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485600] CPU    0: hi:  186, btch:
31 usd: 100
May 10 15:26:40 os02 kernel: [ 3652.485602] CPU    5: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485604] CPU    1: hi:  186, btch:
31 usd:  47
May 10 15:26:40 os02 kernel: [ 3652.485606] CPU    6: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485608] CPU    2: hi:  186, btch:
31 usd: 168
May 10 15:26:40 os02 kernel: [ 3652.485610] CPU    7: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.485612] CPU    3: hi:  186, btch:
31 usd: 140
May 10 15:26:40 os02 kernel: [ 3652.485614] Node 0 CPU    4: hi:  186,
btch:  31 usd: 177
May 10 15:26:40 os02 kernel: [ 3652.485617] DMA32 per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.485618] CPU    5: hi:  186, btch:
31 usd:  77
May 10 15:26:40 os02 kernel: [ 3652.485621] CPU    0: hi:  186, btch:
31 usd: 144
May 10 15:26:40 os02 kernel: [ 3652.485623] CPU    6: hi:  186, btch:
31 usd: 168
May 10 15:26:40 os02 kernel: [ 3652.485625] CPU    1: hi:  186, btch:
31 usd: 198
May 10 15:26:40 os02 kernel: [ 3652.485627] CPU    7: hi:  186, btch:
31 usd:  68
May 10 15:26:40 os02 kernel: [ 3652.485629] CPU    2: hi:  186, btch:
31 usd: 180
May 10 15:26:40 os02 kernel: [ 3652.485634] active_anon:255806
inactive_anon:19454 isolated_anon:0
May 10 15:26:40 os02 kernel: [ 3652.485636]  active_file:420093
inactive_file:5180559 isolated_file:0
May 10 15:26:40 os02 kernel: [ 3652.485637]  unevictable:50582
dirty:314034 writeback:8484 unstable:0
May 10 15:26:40 os02 kernel: [ 3652.485639]  free:30074
slab_reclaimable:35739 slab_unreclaimable:13526
May 10 15:26:40 os02 kernel: [ 3652.485641]  mapped:3440 shmem:51
pagetables:1342 bounce:0
May 10 15:26:40 os02 kernel: [ 3652.485643] CPU    3: hi:  186, btch:
31 usd: 171
May 10 15:26:40 os02 kernel: [ 3652.485644] Node 0 CPU    4: hi:  186,
btch:  31 usd: 159
May 10 15:26:40 os02 kernel: [ 3652.485652] DMA free:15852kB min:12kB
low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB
inactive_file:0kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15660kB mlocked:0kB dirty:0kB writeback:0kB
mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May 10 15:26:40 os02 kernel: [ 3652.485659] CPU    5: hi:  186, btch:
31 usd:  69
May 10 15:26:40 os02 kernel: [ 3652.485661] lowmem_reserve[]:CPU    6:
hi:  186, btch:  31 usd: 180
May 10 15:26:40 os02 kernel: [ 3652.485663]  0CPU    7: hi:  186,
btch:  31 usd: 184
May 10 15:26:40 os02 kernel: [ 3652.485665]  2991Node 0  24201Normal per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.485668]  24201CPU    0: hi:  186,
btch:  31 usd: 100
May 10 15:26:40 os02 kernel: [ 3652.485671]
May 10 15:26:40 os02 kernel: [ 3652.485672] CPU    1: hi:  186, btch:
31 usd:  47
May 10 15:26:40 os02 kernel: [ 3652.485674] Node 0 CPU    2: hi:  186,
btch:  31 usd: 168
May 10 15:26:40 os02 kernel: [ 3652.485682] DMA32 free:85748kB
min:2460kB low:3072kB high:3688kB active_anon:20480kB
inactive_anon:5268kB active_file:151588kB inactive_file:2645188kB
unevictable:72kB isolated(anon):0kB isolated(file):0kB
present:3063392kB mlocked:0kB dirty:210820kB writeback:0kB
mapped:648kB shmem:0kB slab_reclaimable:28400kB
slab_unreclaimable:2152kB kernel_stack:520kB pagetables:100kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
May 10 15:26:40 os02 kernel: [ 3652.485690] CPU    3: hi:  186, btch:
31 usd: 140
May 10 15:26:40 os02 kernel: [ 3652.485691] lowmem_reserve[]:CPU    4:
hi:  186, btch:  31 usd: 177
May 10 15:26:40 os02 kernel: [ 3652.485693]  0CPU    5: hi:  186,
btch:  31 usd:  77
May 10 15:26:40 os02 kernel: [ 3652.485696]  0CPU    6: hi:  186,
btch:  31 usd: 168
May 10 15:26:40 os02 kernel: [ 3652.485698]  21210CPU    7: hi:  186,
btch:  31 usd:  68
May 10 15:26:40 os02 kernel: [ 3652.485701]  21210active_anon:255806
inactive_anon:19454 isolated_anon:0
May 10 15:26:40 os02 kernel: [ 3652.485705]  active_file:420093
inactive_file:5180559 isolated_file:0
May 10 15:26:40 os02 kernel: [ 3652.485706]  unevictable:50582
dirty:314034 writeback:8484 unstable:0
May 10 15:26:40 os02 kernel: [ 3652.485707]  free:30074
slab_reclaimable:35739 slab_unreclaimable:13526
May 10 15:26:40 os02 kernel: [ 3652.485708]  mapped:3440 shmem:51
pagetables:1342 bounce:0
May 10 15:26:40 os02 kernel: [ 3652.485709]
May 10 15:26:40 os02 kernel: [ 3652.485710] Node 0 Node 0 DMA
free:15852kB min:12kB low:12kB high:16kB active_anon:0kB
inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:15660kB mlocked:0kB
dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May 10 15:26:40 os02 kernel: [ 3652.485724] Normal free:18696kB
min:17440kB low:21800kB high:26160kB active_anon:1002744kB
inactive_anon:72548kB active_file:1528784kB inactive_file:18077048kB
unevictable:202256kB isolated(anon):0kB isolated(file):0kB
present:21719040kB mlocked:0kB dirty:1045316kB writeback:33936kB
mapped:13112kB shmem:204kB slab_reclaimable:114556kB
slab_unreclaimable:51952kB kernel_stack:3768kB pagetables:5268kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32
all_unreclaimable? no
May 10 15:26:40 os02 kernel: [ 3652.485731]
lowmem_reserve[]:lowmem_reserve[]: 0 0 2991 0 24201 0 24201 0
May 10 15:26:40 os02 kernel: [ 3652.485737]
May 10 15:26:40 os02 kernel: [ 3652.485738] Node 0 Node 0 DMA32
free:85748kB min:2460kB low:3072kB high:3688kB active_anon:20480kB
inactive_anon:5268kB active_file:151588kB inactive_file:2645188kB
unevictable:72kB isolated(anon):0kB isolated(file):0kB
present:3063392kB mlocked:0kB dirty:210820kB writeback:0kB
mapped:648kB shmem:0kB slab_reclaimable:28400kB
slab_unreclaimable:2152kB kernel_stack:520kB pagetables:100kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
May 10 15:26:40 os02 kernel: [ 3652.485747] DMA:
lowmem_reserve[]:1*4kB  01*8kB  00*16kB  212101*32kB  212101*64kB
May 10 15:26:40 os02 kernel: [ 3652.485754] 1*128kB Node 0 1*256kB
Normal free:18696kB min:17440kB low:21800kB high:26160kB
active_anon:1002744kB inactive_anon:72548kB active_file:1528784kB
inactive_file:18077048kB unevictable:202256kB isolated(anon):0kB
isolated(file):0kB present:21719040kB mlocked:0kB dirty:1045316kB
writeback:33936kB mapped:13112kB shmem:204kB slab_reclaimable:114556kB
slab_unreclaimable:51952kB kernel_stack:3768kB pagetables:5268kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32
all_unreclaimable? no
May 10 15:26:40 os02 kernel: [ 3652.485764] 0*512kB
lowmem_reserve[]:1*1024kB  01*2048kB  03*4096kB  0= 15852kB
May 10 15:26:40 os02 kernel: [ 3652.485771]  0Node 0
May 10 15:26:40 os02 kernel: [ 3652.485773] DMA32: Node 0 59*4kB DMA:
125*8kB 1*4kB 66*16kB 1*8kB 80*32kB 0*16kB 188*64kB 1*32kB 51*128kB
1*64kB 15*256kB 1*128kB 40*512kB 1*256kB 31*1024kB 0*512kB 1*2048kB
1*1024kB 1*4096kB 1*2048kB = 85620kB
May 10 15:26:40 os02 kernel: [ 3652.485789] 3*4096kB Node 0 = 15852kB
May 10 15:26:40 os02 kernel: [ 3652.485791] Normal: Node 0 3930*4kB
DMA32: 0*8kB 59*4kB 1*16kB 125*8kB 0*32kB 66*16kB 0*64kB 80*32kB
0*128kB 188*64kB 1*256kB 51*128kB 1*512kB 15*256kB 0*1024kB 40*512kB
1*2048kB 31*1024kB 0*4096kB 1*2048kB = 18552kB
May 10 15:26:40 os02 kernel: [ 3652.485807] 1*4096kB 5651289 total
pagecache pages
May 10 15:26:40 os02 kernel: [ 3652.485809] = 85620kB
May 10 15:26:40 os02 kernel: [ 3652.485810] 0 pages in swap cache
May 10 15:26:40 os02 kernel: [ 3652.485811] Node 0 Swap cache stats:
add 0, delete 0, find 0/0
May 10 15:26:40 os02 kernel: [ 3652.485814] Normal: Free swap  = 1048572kB
May 10 15:26:40 os02 kernel: [ 3652.485815] 3930*4kB Total swap = 1048572kB
May 10 15:26:40 os02 kernel: [ 3652.485817] 0*8kB 1*16kB 0*32kB 0*64kB
0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 18552kB
May 10 15:26:40 os02 kernel: [ 3652.485822] 5651289 total pagecache pages
May 10 15:26:40 os02 kernel: [ 3652.485823] 0 pages in swap cache
May 10 15:26:40 os02 kernel: [ 3652.485824] Swap cache stats: add 0,
delete 0, find 0/0
May 10 15:26:40 os02 kernel: [ 3652.485825] Free swap  = 1048572kB
May 10 15:26:40 os02 kernel: [ 3652.485826] Total swap = 1048572kB
May 10 15:26:40 os02 kernel: [ 3652.486439] kworker/0:1: page
allocation failure. order:2, mode:0x4020
May 10 15:26:40 os02 kernel: [ 3652.486443] Pid: 0, comm: kworker/0:1
Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
May 10 15:26:40 os02 kernel: [ 3652.486446] Call Trace:
May 10 15:26:40 os02 kernel: [ 3652.486448]  <IRQ>
[<ffffffff81108ce7>] ? __alloc_pages_nodemask+0x6f7/0x8a0
May 10 15:26:40 os02 kernel: [ 3652.486459]  [<ffffffff814b0ad0>] ?
ip_local_deliver+0x80/0x90
May 10 15:26:40 os02 kernel: [ 3652.486464]  [<ffffffff81146cd2>] ?
kmalloc_large_node+0x62/0xb0
May 10 15:26:40 os02 kernel: [ 3652.486468]  [<ffffffff8114becb>] ?
__kmalloc_node_track_caller+0x15b/0x1d0
May 10 15:26:40 os02 kernel: [ 3652.486473]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3652.486476]  [<ffffffff81466713>] ?
__alloc_skb+0x83/0x170
May 10 15:26:40 os02 kernel: [ 3652.486479]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3652.486489]  [<ffffffffa005d9aa>] ?
ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3652.486494]  [<ffffffff81474840>] ?
napi_skb_finish+0x50/0x70
May 10 15:26:40 os02 kernel: [ 3652.486501]  [<ffffffffa0060930>] ?
ixgbe_poll+0x1140/0x1670 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3652.486506]  [<ffffffff81013379>] ?
sched_clock+0x9/0x10
May 10 15:26:40 os02 kernel: [ 3652.486510]  [<ffffffff81474eb2>] ?
net_rx_action+0x102/0x2a0
May 10 15:26:40 os02 kernel: [ 3652.486514]  [<ffffffff8106b745>] ?
__do_softirq+0xb5/0x210
May 10 15:26:40 os02 kernel: [ 3652.486520]  [<ffffffff8108aec4>] ?
hrtimer_interrupt+0x134/0x240
May 10 15:26:40 os02 kernel: [ 3652.486523]  [<ffffffff8100cf3c>] ?
call_softirq+0x1c/0x30
May 10 15:26:40 os02 kernel: [ 3652.486526]  [<ffffffff8100e975>] ?
do_softirq+0x65/0xa0
May 10 15:26:40 os02 kernel: [ 3652.486529]  [<ffffffff8106b605>] ?
irq_exit+0x95/0xa0
May 10 15:26:40 os02 kernel: [ 3652.486533]  [<ffffffff8154a360>] ?
smp_apic_timer_interrupt+0x70/0x9b
May 10 15:26:40 os02 kernel: [ 3652.486536]  [<ffffffff8100c9f3>] ?
apic_timer_interrupt+0x13/0x20
May 10 15:26:40 os02 kernel: [ 3652.486538]  <EOI>
[<ffffffff812db311>] ? intel_idle+0xc1/0x120
May 10 15:26:40 os02 kernel: [ 3652.486544]  [<ffffffff812db2f4>] ?
intel_idle+0xa4/0x120
May 10 15:26:40 os02 kernel: [ 3652.486549]  [<ffffffff8143bca5>] ?
cpuidle_idle_call+0xb5/0x240
May 10 15:26:40 os02 kernel: [ 3652.486554]  [<ffffffff8100aa87>] ?
cpu_idle+0xb7/0x110
May 10 15:26:40 os02 kernel: [ 3652.486558]  [<ffffffff81538ffe>] ?
start_secondary+0x21f/0x221
May 10 15:26:40 os02 kernel: [ 3652.486561] Mem-Info:
May 10 15:26:40 os02 kernel: [ 3652.486562] Node 0 DMA per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.486564] CPU    0: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486567] CPU    1: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486569] CPU    2: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486571] CPU    3: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486573] CPU    4: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486575] CPU    5: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486578] CPU    6: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486580] CPU    7: hi:    0, btch:
 1 usd:   0
May 10 15:26:40 os02 kernel: [ 3652.486581] Node 0 DMA32 per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.486584] CPU    0: hi:  186, btch:
31 usd: 144
May 10 15:26:40 os02 kernel: [ 3652.486586] CPU    1: hi:  186, btch:
31 usd: 198
May 10 15:26:40 os02 kernel: [ 3652.486588] CPU    2: hi:  186, btch:
31 usd: 180
May 10 15:26:40 os02 kernel: [ 3652.486590] CPU    3: hi:  186, btch:
31 usd: 172
May 10 15:26:40 os02 kernel: [ 3652.486593] CPU    4: hi:  186, btch:
31 usd: 159
May 10 15:26:40 os02 kernel: [ 3652.486595] CPU    5: hi:  186, btch:
31 usd:  69
May 10 15:26:40 os02 kernel: [ 3652.486597] CPU    6: hi:  186, btch:
31 usd: 180
May 10 15:26:40 os02 kernel: [ 3652.486599] CPU    7: hi:  186, btch:
31 usd: 184
May 10 15:26:40 os02 kernel: [ 3652.486601] Node 0 Normal per-cpu:
May 10 15:26:40 os02 kernel: [ 3652.486603] CPU    0: hi:  186, btch:
31 usd: 162
May 10 15:26:40 os02 kernel: [ 3652.486605] CPU    1: hi:  186, btch:
31 usd:  47
May 10 15:26:40 os02 kernel: [ 3652.486608] CPU    2: hi:  186, btch:
31 usd: 168
May 10 15:26:40 os02 kernel: [ 3652.486610] CPU    3: hi:  186, btch:
31 usd: 141
May 10 15:26:40 os02 kernel: [ 3652.486612] CPU    4: hi:  186, btch:
31 usd: 177
May 10 15:26:40 os02 kernel: [ 3652.486614] CPU    5: hi:  186, btch:
31 usd:  77
May 10 15:26:40 os02 kernel: [ 3652.486616] CPU    6: hi:  186, btch:
31 usd: 168
May 10 15:26:40 os02 kernel: [ 3652.486618] CPU    7: hi:  186, btch:
31 usd: 174
May 10 15:26:40 os02 kernel: [ 3652.486624] active_anon:255806
inactive_anon:19454 isolated_anon:0
May 10 15:26:40 os02 kernel: [ 3652.486625]  active_file:420093
inactive_file:5180745 isolated_file:0
May 10 15:26:40 os02 kernel: [ 3652.486627]  unevictable:50582
dirty:314470 writeback:8484 unstable:0
May 10 15:26:40 os02 kernel: [ 3652.486628]  free:29795
slab_reclaimable:35739 slab_unreclaimable:13526
May 10 15:26:40 os02 kernel: [ 3652.486629]  mapped:3440 shmem:51
pagetables:1342 bounce:0
May 10 15:26:40 os02 kernel: [ 3652.486631] Node 0 DMA free:15852kB
min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15660kB mlocked:0kB dirty:0kB writeback:0kB
mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May 10 15:26:40 os02 kernel: [ 3652.486642] lowmem_reserve[]: 0 2991 24201 24201
May 10 15:26:40 os02 kernel: [ 3652.486645] Node 0 DMA32 free:85748kB
min:2460kB low:3072kB high:3688kB active_anon:20480kB
inactive_anon:5268kB active_file:151588kB inactive_file:2645188kB
unevictable:72kB isolated(anon):0kB isolated(file):0kB
present:3063392kB mlocked:0kB dirty:210820kB writeback:0kB
mapped:648kB shmem:0kB slab_reclaimable:28400kB
slab_unreclaimable:2152kB kernel_stack:520kB pagetables:100kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
May 10 15:26:40 os02 kernel: [ 3652.486657] lowmem_reserve[]: 0 0 21210 21210
May 10 15:26:40 os02 kernel: [ 3652.486660] Node 0 Normal free:17580kB
min:17440kB low:21800kB high:26160kB active_anon:1002744kB
inactive_anon:72548kB active_file:1528784kB inactive_file:18077792kB
unevictable:202256kB isolated(anon):0kB isolated(file):0kB
present:21719040kB mlocked:0kB dirty:1047060kB writeback:33936kB
mapped:13112kB shmem:204kB slab_reclaimable:114556kB
slab_unreclaimable:51952kB kernel_stack:3768kB pagetables:5268kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:64
all_unreclaimable? no
May 10 15:26:40 os02 kernel: [ 3652.486673] lowmem_reserve[]: 0 0 0 0
May 10 15:26:40 os02 kernel: [ 3652.486675] Node 0 DMA: 1*4kB 1*8kB
0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB
3*4096kB = 15852kB
May 10 15:26:40 os02 kernel: [ 3652.486684] Node 0 DMA32: 59*4kB
125*8kB 66*16kB 80*32kB 188*64kB 51*128kB 15*256kB 40*512kB 31*1024kB
1*2048kB 1*4096kB = 85620kB
May 10 15:26:40 os02 kernel: [ 3652.486692] Node 0 Normal: 3705*4kB
12*8kB 16*16kB 4*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB
0*4096kB = 18180kB
May 10 15:26:40 os02 kernel: [ 3652.486700] 5651289 total pagecache pages
May 10 15:26:40 os02 kernel: [ 3652.486702] 0 pages in swap cache
May 10 15:26:40 os02 kernel: [ 3652.486704] Swap cache stats: add 0,
delete 0, find 0/0
May 10 15:26:40 os02 kernel: [ 3652.486705] Free swap  = 1048572kB
May 10 15:26:40 os02 kernel: [ 3652.486707] Total swap = 1048572kB
May 10 15:26:40 os02 kernel: [ 3652.562795] 6291440 pages RAM
May 10 15:26:40 os02 kernel: [ 3652.562798] 108688 pages reserved
May 10 15:26:40 os02 kernel: [ 3652.562799] 5429575 pages shared
May 10 15:26:40 os02 kernel: [ 3652.562801] 783596 pages non-shared
May 10 15:26:40 os02 kernel: [ 3652.651570] 6291440 pages RAM
May 10 15:26:40 os02 kernel: [ 3652.651572] 108688 pages reserved
May 10 15:26:40 os02 kernel: [ 3652.651573] 5430055 pages shared
May 10 15:26:40 os02 kernel: [ 3652.651575] 782974 pages non-shared
May 10 15:26:40 os02 kernel: [ 3652.721553] 6291440 pages RAM
May 10 15:26:40 os02 kernel: [ 3652.721555] 108688 pages reserved
May 10 15:26:40 os02 kernel: [ 3652.721556] 5430961 pages shared
May 10 15:26:40 os02 kernel: [ 3652.721557] 781496 pages non-shared
May 10 15:26:40 os02 kernel: [ 3654.349865] Pid: 1846, comm: cosd
Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
May 10 15:26:40 os02 kernel: [ 3654.358792] Call Trace:
May 10 15:26:40 os02 kernel: [ 3654.361519]  <IRQ>
[<ffffffff81108ce7>] ? __alloc_pages_nodemask+0x6f7/0x8a0
May 10 15:26:40 os02 kernel: [ 3654.369495]  [<ffffffff814b0ad0>] ?
ip_local_deliver+0x80/0x90
May 10 15:26:40 os02 kernel: [ 3654.376005]  [<ffffffff81146cd2>] ?
kmalloc_large_node+0x62/0xb0
May 10 15:26:40 os02 kernel: [ 3654.382703]  [<ffffffff8114becb>] ?
__kmalloc_node_track_caller+0x15b/0x1d0
May 10 15:26:40 os02 kernel: [ 3654.390464]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3654.397163]  [<ffffffff81466713>] ?
__alloc_skb+0x83/0x170
May 10 15:26:40 os02 kernel: [ 3654.403277]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 10 15:26:40 os02 kernel: [ 3654.409970]  [<ffffffffa005d9aa>] ?
ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3654.417926]  [<ffffffff812b79e0>] ?
swiotlb_map_page+0x0/0x110
May 10 15:26:40 os02 kernel: [ 3654.424432]  [<ffffffffa0060930>] ?
ixgbe_poll+0x1140/0x1670 [ixgbe]
May 10 15:26:40 os02 kernel: [ 3654.431518]  [<ffffffff810f33eb>] ?
perf_pmu_enable+0x2b/0x40
May 10 15:26:40 os02 kernel: [ 3654.437924]  [<ffffffff81474eb2>] ?
net_rx_action+0x102/0x2a0
May 10 15:26:40 os02 kernel: [ 3654.444329]  [<ffffffff8106b745>] ?
__do_softirq+0xb5/0x210
May 10 15:26:40 os02 kernel: [ 3654.450541]  [<ffffffff810c7ca4>] ?
handle_IRQ_event+0x54/0x180
May 10 15:26:40 os02 kernel: [ 3654.457138]  [<ffffffff8106b7bd>] ?
__do_softirq+0x12d/0x210
May 10 15:26:40 os02 kernel: [ 3654.463446]  [<ffffffff8100cf3c>] ?
call_softirq+0x1c/0x30
May 10 15:26:40 os02 kernel: [ 3654.469562]  [<ffffffff8100e975>] ?
do_softirq+0x65/0xa0
May 10 15:26:40 os02 kernel: [ 3654.475484]  [<ffffffff8106b605>] ?
irq_exit+0x95/0xa0
May 10 15:26:40 os02 kernel: [ 3654.481218]  [<ffffffff8154a276>] ?
do_IRQ+0x66/0xe0
May 10 15:26:40 os02 kernel: [ 3654.486754]  [<ffffffff81542a53>] ?
ret_from_intr+0x0/0x15
May 10 15:26:40 os02 kernel: [ 3654.492867]  <EOI>
[<ffffffff81286919>] ? __make_request+0x149/0x4c0
May 10 15:26:40 os02 kernel: [ 3654.500061]  [<ffffffff812868e4>] ?
__make_request+0x114/0x4c0
May 10 15:26:41 os02 kernel: [ 3654.506565]  [<ffffffff812841bd>] ?
generic_make_request+0x2fd/0x5e0
May 10 15:26:41 os02 kernel: [ 3654.513649]  [<ffffffff8142742b>] ?
dm_get_live_table+0x4b/0x60
May 10 15:26:41 os02 kernel: [ 3654.520248]  [<ffffffff81427bc1>] ?
dm_merge_bvec+0xc1/0x140
May 10 15:26:41 os02 kernel: [ 3654.526555]  [<ffffffff81284526>] ?
submit_bio+0x86/0x110
May 10 15:26:41 os02 kernel: [ 3654.532574]  [<ffffffff8118deac>] ?
dio_bio_submit+0xbc/0xc0
May 10 15:26:41 os02 kernel: [ 3654.538881]  [<ffffffff8118df40>] ?
dio_send_cur_page+0x90/0xc0
May 10 15:26:41 os02 kernel: [ 3654.545478]  [<ffffffff8118dfd5>] ?
submit_page_section+0x65/0x180
May 10 15:26:41 os02 kernel: [ 3654.552370]  [<ffffffff8118e918>] ?
__blockdev_direct_IO+0x678/0xb30
May 10 15:26:41 os02 kernel: [ 3654.559454]  [<ffffffff81250eaf>] ?
security_inode_getsecurity+0x1f/0x30
May 10 15:26:41 os02 kernel: [ 3654.566924]  [<ffffffff8118c627>] ?
blkdev_direct_IO+0x57/0x60
May 10 15:26:41 os02 kernel: [ 3654.573414]  [<ffffffff8118b760>] ?
blkdev_get_blocks+0x0/0xc0
May 10 15:26:41 os02 kernel: [ 3654.579954]  [<ffffffff811008f2>] ?
generic_file_direct_write+0xc2/0x190
May 10 15:26:41 os02 kernel: [ 3654.587424]  [<ffffffff811715b6>] ?
file_update_time+0xf6/0x170
May 10 15:26:41 os02 kernel: [ 3654.594025]  [<ffffffff811023eb>] ?
__generic_file_aio_write+0x32b/0x460
May 10 15:26:41 os02 kernel: [ 3654.601494]  [<ffffffff8105c9e0>] ?
wake_up_state+0x10/0x20



and so on.
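
For what it's worth, the failing request decodes as follows against
include/linux/gfp.h in 2.6.38 (my annotation; the log itself only
prints the raw values):

  order:2      # 4 contiguous pages, i.e. a 16kB allocation
  mode:0x4020  # ___GFP_COMP (0x4000) | ___GFP_HIGH (0x20), which is
               # GFP_ATOMIC | __GFP_COMP: the softirq RX path may not
               # sleep, so the allocator cannot do direct reclaim

So __netdev_alloc_skb() on the ixgbe receive path is asking for 16kB
of physically contiguous memory from atomic context (typical when a
large MTU pushes the linear skb buffer past 8kB), and that fails as
soon as free memory is fragmented.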

-- 
Stefan Majer


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
  2011-05-10 14:04 Kernel 2.6.38.6 page allocation failure (ixgbe) Stefan Majer
@ 2011-05-10 14:20 ` Yehuda Sadeh Weinraub
  2011-05-10 14:26   ` Yehuda Sadeh Weinraub
  2011-05-10 15:55   ` Stefan Majer
       [not found] ` <BANLkTik=FM5LJs8JUKHR2S+r41vi94Z7pw@mail.gmail.com>
  1 sibling, 2 replies; 9+ messages in thread
From: Yehuda Sadeh Weinraub @ 2011-05-10 14:20 UTC (permalink / raw)
  To: Stefan Majer; +Cc: linux-net, linux-kernel, ceph-devel

On Tue, May 10, 2011 at 7:04 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
> Hi,
>
> im running 4 nodes with ceph on top of btrfs with a dualport Intel
> X520 10Gb Ethernet Card with the latest 3.3.9 ixgbe driver.
> during benchmarks i get the following stack.
> I can easily reproduce this by simply running rados bench from a fast
> machine using this 4 nodes as ceph cluster.
> We saw this with stock ixgbe driver from 2.6.38.6 and with the latest
> 3.3.9 ixgbe.
> This kernel is tainted because we use fusion-io iodrives as journal
> devices for btrfs.
>
> Any hints to nail this down are welcome.
>
> Greetings Stefan Majer
>
> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
> failure. order:2, mode:0x4020

It looks like the machine running the cosd is crashing; is that the case?
Are you running both ceph kernel module on the same machine by any
chance? If not, it could be some other fs bug (e.g., in the underlying
btrfs). Also, the stack here is quite deep, so there's a chance of a
stack overflow.
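
If you want to rule a stack overflow in or out, the stack tracer is
one way to watch the worst-case depth (a sketch, assuming
CONFIG_STACK_TRACER is enabled in your build and debugfs is mounted
at /sys/kernel/debug):

  sysctl -w kernel.stack_tracer_enabled=1
  cat /sys/kernel/debug/tracing/stack_max_size  # deepest stack seen, in bytes
  cat /sys/kernel/debug/tracing/stack_trace     # call chain that produced it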

Thanks,
Yehuda


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
  2011-05-10 14:20 ` Yehuda Sadeh Weinraub
@ 2011-05-10 14:26   ` Yehuda Sadeh Weinraub
  2011-05-10 15:55   ` Stefan Majer
  1 sibling, 0 replies; 9+ messages in thread
From: Yehuda Sadeh Weinraub @ 2011-05-10 14:26 UTC (permalink / raw)
  To: Stefan Majer; +Cc: linux-net, linux-kernel, ceph-devel

On Tue, May 10, 2011 at 7:20 AM, Yehuda Sadeh Weinraub
<yehudasa@gmail.com> wrote:
> On Tue, May 10, 2011 at 7:04 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
>> Hi,
>>
>> im running 4 nodes with ceph on top of btrfs with a dualport Intel
>> X520 10Gb Ethernet Card with the latest 3.3.9 ixgbe driver.
>> during benchmarks i get the following stack.
>> I can easily reproduce this by simply running rados bench from a fast
>> machine using this 4 nodes as ceph cluster.
>> We saw this with stock ixgbe driver from 2.6.38.6 and with the latest
>> 3.3.9 ixgbe.
>> This kernel is tainted because we use fusion-io iodrives as journal
>> devices for btrfs.
>>
>> Any hints to nail this down are welcome.
>>
>> Greetings Stefan Majer
>>
>> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
>> failure. order:2, mode:0x4020
>
> It looks like the machine running the cosd is crashing, is that the case?
> Are you running both ceph kernel module on the same machine by any

that should be "both the osd and the kernel module"

> chance? If not, it can be some other fs bug (e.g., the underlying
> btrfs). Also, the stack here is quite deep, there's a chance for a
> stack overflow.
>
> Thanks,
> Yehuda
>


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
  2011-05-10 14:20 ` Yehuda Sadeh Weinraub
  2011-05-10 14:26   ` Yehuda Sadeh Weinraub
@ 2011-05-10 15:55   ` Stefan Majer
  2011-05-10 16:02     ` Sage Weil
  1 sibling, 1 reply; 9+ messages in thread
From: Stefan Majer @ 2011-05-10 15:55 UTC (permalink / raw)
  To: Yehuda Sadeh Weinraub; +Cc: linux-net, linux-kernel, ceph-devel

Hi,

On Tue, May 10, 2011 at 4:20 PM, Yehuda Sadeh Weinraub
<yehudasa@gmail.com> wrote:
> On Tue, May 10, 2011 at 7:04 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
>> Hi,
>>
>> im running 4 nodes with ceph on top of btrfs with a dualport Intel
>> X520 10Gb Ethernet Card with the latest 3.3.9 ixgbe driver.
>> during benchmarks i get the following stack.
>> I can easily reproduce this by simply running rados bench from a fast
>> machine using this 4 nodes as ceph cluster.
>> We saw this with stock ixgbe driver from 2.6.38.6 and with the latest
>> 3.3.9 ixgbe.
>> This kernel is tainted because we use fusion-io iodrives as journal
>> devices for btrfs.
>>
>> Any hints to nail this down are welcome.
>>
>> Greetings Stefan Majer
>>
>> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
>> failure. order:2, mode:0x4020
>
> It looks like the machine running the cosd is crashing, is that the case?

No, the machine is still running; even the cosd is still there.

> Are you running both ceph kernel module on the same machine by any
> chance? If not, it can be some other fs bug (e.g., the underlying
> btrfs). Also, the stack here is quite deep, there's a chance for a
> stack overflow.

Only the cosd is running on these machines. We have 3 separate
mons, and the clients use qemu-rbd.


> Thanks,
> Yehuda
>


Greetings
-- 
Stefan Majer


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
  2011-05-10 15:55   ` Stefan Majer
@ 2011-05-10 16:02     ` Sage Weil
  2011-05-10 16:06       ` Stefan Majer
  0 siblings, 1 reply; 9+ messages in thread
From: Sage Weil @ 2011-05-10 16:02 UTC (permalink / raw)
  To: Stefan Majer; +Cc: Yehuda Sadeh Weinraub, linux-net, linux-kernel, ceph-devel

Hi Stefan,

On Tue, 10 May 2011, Stefan Majer wrote:
> Hi,
> 
> On Tue, May 10, 2011 at 4:20 PM, Yehuda Sadeh Weinraub
> <yehudasa@gmail.com> wrote:
> > On Tue, May 10, 2011 at 7:04 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
> >> Hi,
> >>
> >> im running 4 nodes with ceph on top of btrfs with a dualport Intel
> >> X520 10Gb Ethernet Card with the latest 3.3.9 ixgbe driver.
> >> during benchmarks i get the following stack.
> >> I can easily reproduce this by simply running rados bench from a fast
> >> machine using this 4 nodes as ceph cluster.
> >> We saw this with stock ixgbe driver from 2.6.38.6 and with the latest
> >> 3.3.9 ixgbe.
> >> This kernel is tainted because we use fusion-io iodrives as journal
> >> devices for btrfs.
> >>
> >> Any hints to nail this down are welcome.
> >>
> >> Greetings Stefan Majer
> >>
> >> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
> >> failure. order:2, mode:0x4020
> >
> > It looks like the machine running the cosd is crashing, is that the case?
> 
> No the machine is still running. Even the cosd is still there.

How much memory is (was?) cosd using?  Is it possible for you to watch RSS 
under load when the errors trigger?
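
Something as simple as

  # sample the resident set of every cosd every 5 seconds (rss is in kB)
  while sleep 5; do ps -C cosd -o pid,rss,vsz,comm; done

left running during the bench would be enough.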

The osd throttles incoming client bandwidth, but it doesn't throttle 
inter-osd traffic yet because it's not obvious how to avoid deadlock.  
It's possible that one node is getting significantly behind the 
others on the replicated writes and that is blowing up its memory 
footprint.  There are a few ways we can address that, but I'd like to make 
sure we understand the problem first.

Thanks!
sage


 
> > Are you running both ceph kernel module on the same machine by any
> > chance? If not, it can be some other fs bug (e.g., the underlying
> > btrfs). Also, the stack here is quite deep, there's a chance for a
> > stack overflow.
> 
> There is only the cosd running on these machines. We have 3 seperate
> mons and clients which uses qemu-rbd.
> 
> 
> > Thanks,
> > Yehuda
> >
> 
> 
> Greetings
> -- 
> Stefan Majer
> 
> 


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
  2011-05-10 16:02     ` Sage Weil
@ 2011-05-10 16:06       ` Stefan Majer
  2011-05-11  6:58         ` Stefan Majer
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Majer @ 2011-05-10 16:06 UTC (permalink / raw)
  To: Sage Weil; +Cc: Yehuda Sadeh Weinraub, linux-kernel, ceph-devel

Hi Sage,


On Tue, May 10, 2011 at 6:02 PM, Sage Weil <sage@newdream.net> wrote:
> Hi Stefan,
>
> On Tue, 10 May 2011, Stefan Majer wrote:
>> Hi,
>>
>> On Tue, May 10, 2011 at 4:20 PM, Yehuda Sadeh Weinraub
>> <yehudasa@gmail.com> wrote:
>> > On Tue, May 10, 2011 at 7:04 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
>> >> Hi,
>> >>
>> >> im running 4 nodes with ceph on top of btrfs with a dualport Intel
>> >> X520 10Gb Ethernet Card with the latest 3.3.9 ixgbe driver.
>> >> during benchmarks i get the following stack.
>> >> I can easily reproduce this by simply running rados bench from a fast
>> >> machine using this 4 nodes as ceph cluster.
>> >> We saw this with stock ixgbe driver from 2.6.38.6 and with the latest
>> >> 3.3.9 ixgbe.
>> >> This kernel is tainted because we use fusion-io iodrives as journal
>> >> devices for btrfs.
>> >>
>> >> Any hints to nail this down are welcome.
>> >>
>> >> Greetings Stefan Majer
>> >>
>> >> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
>> >> failure. order:2, mode:0x4020
>> >
>> > It looks like the machine running the cosd is crashing, is that the case?
>>
>> No the machine is still running. Even the cosd is still there.
>
> How much memory is (was?) cosd using?  Is it possible for you to watch RSS
> under load when the errors trigger?

I will look into this tomorrow.
Just for the record:
each machine has 24GB of RAM and 4 cosd, each with 1 btrfs-formatted
disk, which is a RAID5 over three 2TB spindles.

The rados bench reaches a constant rate of about 1000 MB/s!

Greetings

Stefan
> The osd throttles incoming client bandwidth, but it doesn't throttle
> inter-osd traffic yet because it's not obvious how to avoid deadlock.
> It's possible that one node is getting significantly behind the
> others on the replicated writes and that is blowing up its memory
> footprint.  There are a few ways we can address that, but I'd like to make
> sure we understand the problem first.
>
> Thanks!
> sage
>
>
>
>> > Are you running both ceph kernel module on the same machine by any
>> > chance? If not, it can be some other fs bug (e.g., the underlying
>> > btrfs). Also, the stack here is quite deep, there's a chance for a
>> > stack overflow.
>>
>> There is only the cosd running on these machines. We have 3 seperate
>> mons and clients which uses qemu-rbd.
>>
>>
>> > Thanks,
>> > Yehuda
>> >
>>
>>
>> Greetings
>> --
>> Stefan Majer
>>
>>
>



-- 
Stefan Majer


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
  2011-05-10 16:06       ` Stefan Majer
@ 2011-05-11  6:58         ` Stefan Majer
  2011-05-11  7:36           ` Stefan Majer
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Majer @ 2011-05-11  6:58 UTC (permalink / raw)
  To: Sage Weil; +Cc: Yehuda Sadeh Weinraub, linux-kernel, ceph-devel

Hi Sage,

We were running rados bench like this:
# rados -p data bench 60 write -t 128
Maintaining 128 concurrent writes of 4194304 bytes for at least 60 seconds.
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    0       0         0         0         0         0         -         0
    1     128       296       168   671.847       672  0.051857  0.131839
    2     127       537       410   819.838       968  0.052679  0.115476
    3     128       772       644   858.516       936  0.043241  0.114372
    4     128       943       815   814.865       684  0.799326  0.121142
    5     128      1114       986   788.673       684  0.082748   0.13059
    6     128      1428      1300   866.526      1256  0.065376  0.119083
    7     127      1716      1589   907.859      1156  0.037958   0.11151
    8     127      1986      1859    929.36      1080  0.063171   0.11077
    9     128      2130      2002   889.645       572  0.048705  0.109477
   10     127      2333      2206   882.269       816  0.062555  0.115842
   11     127      2466      2339   850.419       532  0.051618  0.117356
   12     128      2602      2474   824.545       540   0.06113  0.124453
   13     128      2807      2679   824.187       820  0.075126  0.125108
   14     127      2897      2770   791.312       364  0.077479  0.125009
   15     127      2955      2828   754.023       232  0.084222  0.123814
   16     127      2973      2846   711.393        72  0.078568  0.123562
   17     127      2975      2848   670.011         8  0.923208  0.124123

As you can see, the transfer rate suddenly drops to 8 MB/s and then even to 0.

Memory consumption during this is low:

top - 08:52:24 up 18:12,  1 user,  load average: 0.64, 3.35, 4.17
Tasks: 203 total,   1 running, 202 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  24731008k total, 24550172k used,   180836k free,    79136k buffers
Swap:        0k total,        0k used,        0k free, 22574812k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
22203 root      20   0  581m 284m 2232 S  0.0  1.2   0:44.34 cosd
21922 root      20   0  577m 281m 2148 S  0.0  1.2   0:39.91 cosd
22788 root      20   0  576m 213m 2084 S  0.0  0.9   0:44.10 cosd
22476 root      20   0  509m 204m 2156 S  0.0  0.8   0:33.92 cosd

And after we hit this, ceph -w still reports a clean state and all
cosd are still running.

We have no clue :-(

Greetings
Stefan Majer


On Tue, May 10, 2011 at 6:06 PM, Stefan Majer <stefan.majer@gmail.com> wrote:
> Hi Sage,
>
>
> On Tue, May 10, 2011 at 6:02 PM, Sage Weil <sage@newdream.net> wrote:
>> Hi Stefan,
>>
>> On Tue, 10 May 2011, Stefan Majer wrote:
>>> Hi,
>>>
>>> On Tue, May 10, 2011 at 4:20 PM, Yehuda Sadeh Weinraub
>>> <yehudasa@gmail.com> wrote:
>>> > On Tue, May 10, 2011 at 7:04 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
>>> >> Hi,
>>> >>
>>> >> im running 4 nodes with ceph on top of btrfs with a dualport Intel
>>> >> X520 10Gb Ethernet Card with the latest 3.3.9 ixgbe driver.
>>> >> during benchmarks i get the following stack.
>>> >> I can easily reproduce this by simply running rados bench from a fast
>>> >> machine using this 4 nodes as ceph cluster.
>>> >> We saw this with stock ixgbe driver from 2.6.38.6 and with the latest
>>> >> 3.3.9 ixgbe.
>>> >> This kernel is tainted because we use fusion-io iodrives as journal
>>> >> devices for btrfs.
>>> >>
>>> >> Any hints to nail this down are welcome.
>>> >>
>>> >> Greetings Stefan Majer
>>> >>
>>> >> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
>>> >> failure. order:2, mode:0x4020
>>> >
>>> > It looks like the machine running the cosd is crashing, is that the case?
>>>
>>> No the machine is still running. Even the cosd is still there.
>>
>> How much memory is (was?) cosd using?  Is it possible for you to watch RSS
>> under load when the errors trigger?
>
> I will look on this tomorrow
> just for the record:
> each machine has 24GB of RAM and 4 cosd with 1 btrfs formated disks
> each, which is a raid5 over 3 2TB spindles.
>
> The rados bench reaches a constant rate of about 1000Mb/sec !
>
> Greetings
>
> Stefan
>> The osd throttles incoming client bandwidth, but it doesn't throttle
>> inter-osd traffic yet because it's not obvious how to avoid deadlock.
>> It's possible that one node is getting significantly behind the
>> others on the replicated writes and that is blowing up its memory
>> footprint.  There are a few ways we can address that, but I'd like to make
>> sure we understand the problem first.
>>
>> Thanks!
>> sage
>>
>>
>>
>>> > Are you running both ceph kernel module on the same machine by any
>>> > chance? If not, it can be some other fs bug (e.g., the underlying
>>> > btrfs). Also, the stack here is quite deep, there's a chance for a
>>> > stack overflow.
>>>
>>> There is only the cosd running on these machines. We have 3 seperate
>>> mons and clients which uses qemu-rbd.
>>>
>>>
>>> > Thanks,
>>> > Yehuda
>>> >
>>>
>>>
>>> Greetings
>>> --
>>> Stefan Majer
>>>
>>>
>>
>
>
>
> --
> Stefan Majer
>



-- 
Stefan Majer


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
  2011-05-11  6:58         ` Stefan Majer
@ 2011-05-11  7:36           ` Stefan Majer
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Majer @ 2011-05-11  7:36 UTC (permalink / raw)
  To: Sage Weil; +Cc: Yehuda Sadeh Weinraub, linux-kernel, ceph-devel

Hi Sage,

After some digging we set

  sysctl -w vm.min_free_kbytes=262144

(the default was around 16000).

This solved our problem, and rados bench survived a 5-minute torture
run without a single failure:

min lat: 0.036177 max lat: 299.924 avg lat: 0.553904
  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
  300      40     61736     61696   822.498      1312   299.602  0.553904
Total time run:        300.421378
Total writes made:     61736
Write size:            4194304
Bandwidth (MB/sec):    821.992

Average Latency:       0.621895
Max latency:           300.362
Min latency:           0.036177

Sorry for the noise, but I think you should mention this sysctl
modification in the Ceph wiki (at least for 10Gb/s deployments).
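
For the record, the full change on our side, with one way to persist
it across reboots (a sketch; assumes the stock /etc/sysctl.conf setup):

  sysctl -w vm.min_free_kbytes=262144             # takes effect immediately
  echo 'vm.min_free_kbytes = 262144' >> /etc/sysctl.conf
  sysctl -p                                       # reload and verify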

thanks

Stefan Majer


On Wed, May 11, 2011 at 8:58 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
> Hi Sage,
>
> we were running rados bench like this:
> # rados -p data bench 60 write -t 128
> Maintaining 128 concurrent writes of 4194304 bytes for at least 60 seconds.
>  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>    0       0         0         0         0         0         -         0
>    1     128       296       168   671.847       672  0.051857  0.131839
>    2     127       537       410   819.838       968  0.052679  0.115476
>    3     128       772       644   858.516       936  0.043241  0.114372
>    4     128       943       815   814.865       684  0.799326  0.121142
>    5     128      1114       986   788.673       684  0.082748   0.13059
>    6     128      1428      1300   866.526      1256  0.065376  0.119083
>    7     127      1716      1589   907.859      1156  0.037958   0.11151
>    8     127      1986      1859    929.36      1080  0.063171   0.11077
>    9     128      2130      2002   889.645       572  0.048705  0.109477
>   10     127      2333      2206   882.269       816  0.062555  0.115842
>   11     127      2466      2339   850.419       532  0.051618  0.117356
>   12     128      2602      2474   824.545       540   0.06113  0.124453
>   13     128      2807      2679   824.187       820  0.075126  0.125108
>   14     127      2897      2770   791.312       364  0.077479  0.125009
>   15     127      2955      2828   754.023       232  0.084222  0.123814
>   16     127      2973      2846   711.393        72  0.078568  0.123562
>   17     127      2975      2848   670.011         8  0.923208  0.124123
>
> as you can see, the transferrate drops suddenly down to 8 and even to 0.
>
> Memory consumption during this is low:
>
> top - 08:52:24 up 18:12,  1 user,  load average: 0.64, 3.35, 4.17
> Tasks: 203 total,   1 running, 202 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.0%us,  0.3%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:  24731008k total, 24550172k used,   180836k free,    79136k buffers
> Swap:        0k total,        0k used,        0k free, 22574812k cached
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 22203 root      20   0  581m 284m 2232 S  0.0  1.2   0:44.34 cosd
> 21922 root      20   0  577m 281m 2148 S  0.0  1.2   0:39.91 cosd
> 22788 root      20   0  576m 213m 2084 S  0.0  0.9   0:44.10 cosd
> 22476 root      20   0  509m 204m 2156 S  0.0  0.8   0:33.92 cosd
>
> And after we hit this, ceph -w still reports clean state, all cosd are
> still running.
>
> We have no clue :-(
>
> Greetings
> Stefan Majer
>
>
> On Tue, May 10, 2011 at 6:06 PM, Stefan Majer <stefan.majer@gmail.com> wrote:
>> Hi Sage,
>>
>>
>> On Tue, May 10, 2011 at 6:02 PM, Sage Weil <sage@newdream.net> wrote:
>>> Hi Stefan,
>>>
>>> On Tue, 10 May 2011, Stefan Majer wrote:
>>>> Hi,
>>>>
>>>> On Tue, May 10, 2011 at 4:20 PM, Yehuda Sadeh Weinraub
>>>> <yehudasa@gmail.com> wrote:
>>>> > On Tue, May 10, 2011 at 7:04 AM, Stefan Majer <stefan.majer@gmail.com> wrote:
>>>> >> Hi,
>>>> >>
>>>> >> im running 4 nodes with ceph on top of btrfs with a dualport Intel
>>>> >> X520 10Gb Ethernet Card with the latest 3.3.9 ixgbe driver.
>>>> >> during benchmarks i get the following stack.
>>>> >> I can easily reproduce this by simply running rados bench from a fast
>>>> >> machine using this 4 nodes as ceph cluster.
>>>> >> We saw this with stock ixgbe driver from 2.6.38.6 and with the latest
>>>> >> 3.3.9 ixgbe.
>>>> >> This kernel is tainted because we use fusion-io iodrives as journal
>>>> >> devices for btrfs.
>>>> >>
>>>> >> Any hints to nail this down are welcome.
>>>> >>
>>>> >> Greetings Stefan Majer
>>>> >>
>>>> >> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
>>>> >> failure. order:2, mode:0x4020
>>>> >
>>>> > It looks like the machine running the cosd is crashing, is that the case?
>>>>
>>>> No the machine is still running. Even the cosd is still there.
>>>
>>> How much memory is (was?) cosd using?  Is it possible for you to watch RSS
>>> under load when the errors trigger?
>>
>> I will look on this tomorrow
>> just for the record:
>> each machine has 24GB of RAM and 4 cosd with 1 btrfs formated disks
>> each, which is a raid5 over 3 2TB spindles.
>>
>> The rados bench reaches a constant rate of about 1000Mb/sec !
>>
>> Greetings
>>
>> Stefan
>>> The osd throttles incoming client bandwidth, but it doesn't throttle
>>> inter-osd traffic yet because it's not obvious how to avoid deadlock.
>>> It's possible that one node is getting significantly behind the
>>> others on the replicated writes and that is blowing up its memory
>>> footprint.  There are a few ways we can address that, but I'd like to make
>>> sure we understand the problem first.
>>>
>>> Thanks!
>>> sage
>>>
>>>
>>>
>>>> > Are you running both ceph kernel module on the same machine by any
>>>> > chance? If not, it can be some other fs bug (e.g., the underlying
>>>> > btrfs). Also, the stack here is quite deep, there's a chance for a
>>>> > stack overflow.
>>>>
>>>> There is only the cosd running on these machines. We have 3 seperate
>>>> mons and clients which uses qemu-rbd.
>>>>
>>>>
>>>> > Thanks,
>>>> > Yehuda
>>>> >
>>>>
>>>>
>>>> Greetings
>>>> --
>>>> Stefan Majer
>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Stefan Majer
>>
>
>
>
> --
> Stefan Majer
>



-- 
Stefan Majer


* Re: Kernel 2.6.38.6 page allocation failure (ixgbe)
       [not found]   ` <F169D4F5E1F1974DBFAFABF47F60C10AF23DC8FF@orsmsx507.amr.corp.intel.com>
@ 2011-05-16  8:28     ` Stefan Majer
  0 siblings, 0 replies; 9+ messages in thread
From: Stefan Majer @ 2011-05-16  8:28 UTC (permalink / raw)
  To: e1000-devel, netdev, linux-kernel

Hi,

After enlarging vm.min_free_kbytes to 524288 we survived almost a
week, but today I got this again:

May 16 09:18:13 os03 kernel: [331036.332001] kworker/0:1: page
allocation failure. order:2, mode:0x4020
May 16 09:18:13 os03 kernel: [331036.332005] Pid: 0, comm: kworker/0:1
Tainted: P        W   2.6.38.6-1.fits.3.el6.x86_64 #1
May 16 09:18:13 os03 kernel: [331036.332009] Call Trace:
May 16 09:18:13 os03 kernel: [331036.332011]  <IRQ>
[<ffffffff81108ce7>] ? __alloc_pages_nodemask+0x6f7/0x8a0
May 16 09:18:13 os03 kernel: [331036.332024]  [<ffffffff81146cd2>] ?
kmalloc_large_node+0x62/0xb0
May 16 09:18:13 os03 kernel: [331036.332028]  [<ffffffff8114becb>] ?
__kmalloc_node_track_caller+0x15b/0x1d0
May 16 09:18:13 os03 kernel: [331036.332033]  [<ffffffff814b06ed>] ?
ip_rcv+0x23d/0x310
May 16 09:18:13 os03 kernel: [331036.332038]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 16 09:18:13 os03 kernel: [331036.332042]  [<ffffffff81466713>] ?
__alloc_skb+0x83/0x170
May 16 09:18:13 os03 kernel: [331036.332045]  [<ffffffff81466f74>] ?
__netdev_alloc_skb+0x24/0x50
May 16 09:18:13 os03 kernel: [331036.332054]  [<ffffffffa0170217>] ?
ixgbe_alloc_rx_buffers+0x2b7/0x370 [ixgbe]
May 16 09:18:13 os03 kernel: [331036.332059]  [<ffffffff8108d29d>] ?
sched_clock_cpu+0xcd/0x110
May 16 09:18:13 os03 kernel: [331036.332063]  [<ffffffff81474840>] ?
napi_skb_finish+0x50/0x70
May 16 09:18:13 os03 kernel: [331036.332069]  [<ffffffffa0172678>] ?
ixgbe_clean_rx_irq+0x828/0x890 [ixgbe]
May 16 09:18:13 os03 kernel: [331036.332076]  [<ffffffffa01747cf>] ?
ixgbe_clean_rxtx_many+0x10f/0x220 [ixgbe]
May 16 09:18:13 os03 kernel: [331036.332080]  [<ffffffff81474eb2>] ?
net_rx_action+0x102/0x2a0
May 16 09:18:13 os03 kernel: [331036.332084]  [<ffffffff8106b745>] ?
__do_softirq+0xb5/0x210
May 16 09:18:13 os03 kernel: [331036.332089]  [<ffffffff810c7ca4>] ?
handle_IRQ_event+0x54/0x180
May 16 09:18:13 os03 kernel: [331036.332094]  [<ffffffff8100cf3c>] ?
call_softirq+0x1c/0x30
May 16 09:18:13 os03 kernel: [331036.332097]  [<ffffffff8100e975>] ?
do_softirq+0x65/0xa0
May 16 09:18:13 os03 kernel: [331036.332100]  [<ffffffff8106b605>] ?
irq_exit+0x95/0xa0
May 16 09:18:13 os03 kernel: [331036.332105]  [<ffffffff8154a276>] ?
do_IRQ+0x66/0xe0
May 16 09:18:13 os03 kernel: [331036.332108]  [<ffffffff81542a53>] ?
ret_from_intr+0x0/0x15
May 16 09:18:13 os03 kernel: [331036.332110]  <EOI>
[<ffffffff812db311>] ? intel_idle+0xc1/0x120
May 16 09:18:13 os03 kernel: [331036.332116]  [<ffffffff812db2f4>] ?
intel_idle+0xa4/0x120
May 16 09:18:13 os03 kernel: [331036.332121]  [<ffffffff8143bca5>] ?
cpuidle_idle_call+0xb5/0x240
May 16 09:18:13 os03 kernel: [331036.332125]  [<ffffffff8100aa87>] ?
cpu_idle+0xb7/0x110
May 16 09:18:13 os03 kernel: [331036.332129]  [<ffffffff81538ffe>] ?
start_secondary+0x21f/0x221
May 16 09:18:13 os03 kernel: [331036.332131] Mem-Info:
May 16 09:18:13 os03 kernel: [331036.332132] Node 0 DMA per-cpu:
May 16 09:18:13 os03 kernel: [331036.332135] CPU    0: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332137] CPU    1: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332140] CPU    2: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332142] CPU    3: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332144] CPU    4: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332146] CPU    5: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332148] CPU    6: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332150] CPU    7: hi:    0, btch:
  1 usd:   0
May 16 09:18:13 os03 kernel: [331036.332152] Node 0 DMA32 per-cpu:
May 16 09:18:13 os03 kernel: [331036.332155] CPU    0: hi:  186, btch:
 31 usd: 163
May 16 09:18:13 os03 kernel: [331036.332157] CPU    1: hi:  186, btch:
 31 usd:  31
May 16 09:18:13 os03 kernel: [331036.332159] CPU    2: hi:  186, btch:
 31 usd: 182
May 16 09:18:13 os03 kernel: [331036.332162] CPU    3: hi:  186, btch:
 31 usd:  37
May 16 09:18:13 os03 kernel: [331036.332164] CPU    4: hi:  186, btch:
 31 usd:  13
May 16 09:18:13 os03 kernel: [331036.332166] CPU    5: hi:  186, btch:
 31 usd: 180
May 16 09:18:13 os03 kernel: [331036.332168] CPU    6: hi:  186, btch:
 31 usd: 159
May 16 09:18:13 os03 kernel: [331036.332170] CPU    7: hi:  186, btch:
 31 usd: 180
May 16 09:18:13 os03 kernel: [331036.332172] Node 0 Normal per-cpu:
May 16 09:18:13 os03 kernel: [331036.332174] CPU    0: hi:  186, btch:
 31 usd: 156
May 16 09:18:13 os03 kernel: [331036.332177] CPU    1: hi:  186, btch:
 31 usd: 160
May 16 09:18:13 os03 kernel: [331036.332179] CPU    2: hi:  186, btch:
 31 usd: 163
May 16 09:18:13 os03 kernel: [331036.332181] CPU    3: hi:  186, btch:
 31 usd: 168
May 16 09:18:13 os03 kernel: [331036.332183] CPU    4: hi:  186, btch:
 31 usd: 163
May 16 09:18:13 os03 kernel: [331036.332185] CPU    5: hi:  186, btch:
 31 usd: 180
May 16 09:18:13 os03 kernel: [331036.332187] CPU    6: hi:  186, btch:
 31 usd: 156
May 16 09:18:13 os03 kernel: [331036.332189] CPU    7: hi:  186, btch:
 31 usd: 182
May 16 09:18:13 os03 kernel: [331036.332195] active_anon:389538
inactive_anon:91572 isolated_anon:0
May 16 09:18:13 os03 kernel: [331036.332196]  active_file:2597361
inactive_file:2476894 isolated_file:0
May 16 09:18:13 os03 kernel: [331036.332198]  unevictable:123699
dirty:66164 writeback:11426 unstable:0
May 16 09:18:13 os03 kernel: [331036.332199]  free:254614
slab_reclaimable:53393 slab_unreclaimable:15304
May 16 09:18:13 os03 kernel: [331036.332201]  mapped:1251 shmem:91580
pagetables:1404 bounce:0
May 16 09:18:13 os03 kernel: [331036.332203] Node 0 DMA free:15852kB
min:328kB low:408kB high:492kB active_anon:0kB inactive_anon:0kB
active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:15660kB mlocked:0kB dirty:0kB writeback:0kB
mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
May 16 09:18:13 os03 kernel: [331036.332214] lowmem_reserve[]: 0 2991
24201 24201
May 16 09:18:13 os03 kernel: [331036.332217] Node 0 DMA32
free:426948kB min:64764kB low:80952kB high:97144kB
active_anon:135484kB inactive_anon:0kB active_file:1133516kB
inactive_file:1093352kB unevictable:49788kB isolated(anon):0kB
isolated(file):0kB present:3063392kB mlocked:0kB dirty:110436kB
writeback:284kB mapped:432kB shmem:0kB slab_reclaimable:46680kB
slab_unreclaimable:5268kB kernel_stack:152kB pagetables:324kB
unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
May 16 09:18:13 os03 kernel: [331036.332229] lowmem_reserve[]: 0 0 21210 21210
May 16 09:18:13 os03 kernel: [331036.332232] Node 0 Normal
free:575656kB min:459188kB low:573984kB high:688780kB
active_anon:1422668kB inactive_anon:366288kB active_file:9255928kB
inactive_file:8814224kB unevictable:445008kB isolated(anon):0kB
isolated(file):0kB present:21719040kB mlocked:0kB dirty:154220kB
writeback:45420kB mapped:4572kB shmem:366320kB
slab_reclaimable:166892kB slab_unreclaimable:55948kB
kernel_stack:3928kB pagetables:5292kB unstable:0kB bounce:0kB
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
May 16 09:18:13 os03 kernel: [331036.332245] lowmem_reserve[]: 0 0 0 0
May 16 09:18:13 os03 kernel: [331036.332248] Node 0 DMA: 1*4kB 1*8kB
0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB
3*4096kB = 15852kB
May 16 09:18:13 os03 kernel: [331036.332256] Node 0 DMA32: 55808*4kB
24890*8kB 3*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
0*2048kB 1*4096kB = 426496kB
May 16 09:18:13 os03 kernel: [331036.332264] Node 0 Normal: 142372*4kB
49*8kB 67*16kB 48*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
0*2048kB 1*4096kB = 576648kB
May 16 09:18:13 os03 kernel: [331036.332272] 5289868 total pagecache pages
May 16 09:18:13 os03 kernel: [331036.332274] 0 pages in swap cache


Is there any way to further identify what is causing this bug? Any
help is appreciated. One thing that can be watched between failures
is per-order fragmentation, as in the sketch that follows.
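
A low-overhead monitoring sketch, assuming the standard
/proc/buddyinfo layout (one line per zone, free-block counts per
order); the order-2 columns for DMA32 and Normal dropping toward zero
is the precondition for the failures above:

/* print free blocks at order >= 2 per zone from /proc/buddyinfo */
#include <stdio.h>

int main(void)
{
        char zone[32];
        int node, count, order;
        FILE *f = fopen("/proc/buddyinfo", "r");

        if (!f) {
                perror("buddyinfo");
                return 1;
        }
        while (fscanf(f, " Node %d, zone %31s", &node, zone) == 2) {
                long order2_plus = 0;

                for (order = 0; fscanf(f, "%d", &count) == 1; order++)
                        if (order >= 2)
                                order2_plus += count;
                printf("node %d zone %-8s free blocks at order>=2: %ld\n",
                       node, zone, order2_plus);
        }
        fclose(f);
        return 0;
}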

Greetings Stefan

On Tue, May 10, 2011 at 9:06 PM, Brandeburg, Jesse
<jesse.brandeburg@intel.com> wrote:
> Adding e1000-devel, our list for the out-of-tree ixgbe driver (the issue reported below occurs with both the upstream and the out-of-tree driver).
>
> do you have jumbo frames enabled?
>
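
An aside on why the jumbo-frames question matters here: kmalloc
rounds sizes up to the next power of two, so with a 9000-byte MTU a
receive buffer no longer fits the 8 KiB bucket and each skb becomes a
16 KiB, i.e. order-2, contiguous request under GFP_ATOMIC; that would
match both the failing "order:2" allocations and the
kmalloc_large_node frames in the traces. A back-of-the-envelope
sketch (the 512-byte overhead is an assumed stand-in for
skb_shared_info and padding, not the driver's exact arithmetic):

/* MTU 9000 implies an order-2 allocation: anything over 8 KiB
 * rounds up to 16 KiB, i.e. 4 contiguous 4 KiB pages (order 2). */
#include <stdio.h>

static int order_for(unsigned long bytes, unsigned long page)
{
        int order = 0;
        unsigned long span = page;

        while (span < bytes) {
                span <<= 1;
                order++;
        }
        return order;
}

int main(void)
{
        unsigned long mtu = 9000, overhead = 512; /* assumed overhead */

        printf("MTU %lu -> order-%d allocation\n",
               mtu, order_for(mtu + overhead, 4096));
        return 0;
}

This prints "MTU 9000 -> order-2 allocation".
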
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On Behalf Of Stefan Majer
> Sent: Tuesday, May 10, 2011 9:03 AM
> To: netdev@vger.kernel.org
> Subject: Kernel 2.6.38.6 page allocation failure (ixgbe)
>
> Hi,
>
> I'm running 4 nodes with ceph on top of btrfs with a dual-port Intel
> X520 10Gb Ethernet card with the latest 3.3.9 ixgbe driver.
> During benchmarks I get the following stack trace.
> I can easily reproduce this by simply running rados bench from a fast
> machine against these 4 nodes as a ceph cluster.
> We saw this with the stock ixgbe driver from 2.6.38.6 and with the
> latest 3.3.9 ixgbe.
> This kernel is tainted because we use Fusion-io ioDrives as journal
> devices for btrfs.
>
> Any hints to nail this down are welcome.
>
> Greetings Stefan Majer
>
> May 10 15:26:40 os02 kernel: [ 3652.485219] cosd: page allocation
> failure. order:2, mode:0x4020
> May 10 15:26:40 os02 kernel: [ 3652.485223] kswapd0: page allocation
> failure. order:2, mode:0x4020
> May 10 15:26:40 os02 kernel: [ 3652.485228] Pid: 57, comm: kswapd0
> Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
> May 10 15:26:40 os02 kernel: [ 3652.485230] Call Trace:
> May 10 15:26:40 os02 kernel: [ 3652.485232]  <IRQ>
> [<ffffffff81108ce7>] ? __alloc_pages_nodemask+0x6f7/0x8a0
> May 10 15:26:40 os02 kernel: [ 3652.485247]  [<ffffffff814b0ad0>] ?
> ip_local_deliver+0x80/0x90
> May 10 15:26:40 os02 kernel: [ 3652.485250] cosd: page allocation
> failure. order:2, mode:0x4020
> May 10 15:26:40 os02 kernel: [ 3652.485256]  [<ffffffff81146cd2>] ?
> kmalloc_large_node+0x62/0xb0
> May 10 15:26:40 os02 kernel: [ 3652.485259] Pid: 1849, comm: cosd
> Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
> May 10 15:26:40 os02 kernel: [ 3652.485261] Call Trace:
> May 10 15:26:40 os02 kernel: [ 3652.485264]  [<ffffffff8114becb>] ?
> __kmalloc_node_track_caller+0x15b/0x1d0
> May 10 15:26:40 os02 kernel: [ 3652.485266]  <IRQ>
> [<ffffffff81466f74>] ? __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3652.485274]  [<ffffffff81108ce7>] ?
> __alloc_pages_nodemask+0x6f7/0x8a0
> May 10 15:26:40 os02 kernel: [ 3652.485277]  [<ffffffff81466713>] ?
> __alloc_skb+0x83/0x170
> May 10 15:26:40 os02 kernel: [ 3652.485281]  [<ffffffff814b0ad0>] ?
> ip_local_deliver+0x80/0x90
> May 10 15:26:40 os02 kernel: [ 3652.485283]  [<ffffffff81466f74>] ?
> __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3652.485287]  [<ffffffff81146cd2>] ?
> kmalloc_large_node+0x62/0xb0
> May 10 15:26:40 os02 kernel: [ 3652.485297]  [<ffffffffa005d9aa>] ?
> ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3652.485300]  [<ffffffff8114becb>] ?
> __kmalloc_node_track_caller+0x15b/0x1d0
> May 10 15:26:40 os02 kernel: [ 3652.485305]  [<ffffffff812b79e0>] ?
> swiotlb_map_page+0x0/0x110
> May 10 15:26:40 os02 kernel: [ 3652.485308]  [<ffffffff81466f74>] ?
> __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3652.485315]  [<ffffffffa0060930>] ?
> ixgbe_poll+0x1140/0x1670 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3652.485318]  [<ffffffff81466713>] ?
> __alloc_skb+0x83/0x170
> May 10 15:26:40 os02 kernel: [ 3652.485323]  [<ffffffff810f33eb>] ?
> perf_pmu_enable+0x2b/0x40
> May 10 15:26:40 os02 kernel: [ 3652.485326]  [<ffffffff81466f74>] ?
> __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3652.485330]  [<ffffffff81474eb2>] ?
> net_rx_action+0x102/0x2a0
> May 10 15:26:40 os02 kernel: [ 3652.485336]  [<ffffffffa005d9aa>] ?
> ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3652.485341]  [<ffffffff8106b745>] ?
> __do_softirq+0xb5/0x210
> May 10 15:26:40 os02 kernel: [ 3652.485344]  [<ffffffff81474840>] ?
> napi_skb_finish+0x50/0x70
> May 10 15:26:40 os02 kernel: [ 3652.485348]  [<ffffffff810c7ca4>] ?
> handle_IRQ_event+0x54/0x180
> May 10 15:26:40 os02 kernel: [ 3652.485354]  [<ffffffffa0060930>] ?
> ixgbe_poll+0x1140/0x1670 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3652.485357]  [<ffffffff8106b7bd>] ?
> __do_softirq+0x12d/0x210
> May 10 15:26:40 os02 kernel: [ 3652.485360]  [<ffffffff810f33eb>] ?
> perf_pmu_enable+0x2b/0x40
> May 10 15:26:40 os02 kernel: [ 3652.485364]  [<ffffffff8100cf3c>] ?
> call_softirq+0x1c/0x30
> May 10 15:26:40 os02 kernel: [ 3652.485367]  [<ffffffff81474eb2>] ?
> net_rx_action+0x102/0x2a0
> May 10 15:26:40 os02 kernel: [ 3652.485369]  [<ffffffff8100e975>] ?
> do_softirq+0x65/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.485372]  [<ffffffff8106b745>] ?
> __do_softirq+0xb5/0x210
> May 10 15:26:40 os02 kernel: [ 3652.485375]  [<ffffffff8106b605>] ?
> irq_exit+0x95/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.485379]  [<ffffffff810c7ca4>] ?
> handle_IRQ_event+0x54/0x180
> May 10 15:26:40 os02 kernel: [ 3652.485383]  [<ffffffff8154a276>] ?
> do_IRQ+0x66/0xe0
> May 10 15:26:40 os02 kernel: [ 3652.485386]  [<ffffffff8106b7bd>] ?
> __do_softirq+0x12d/0x210
> May 10 15:26:40 os02 kernel: [ 3652.485389]  [<ffffffff81542a53>] ?
> ret_from_intr+0x0/0x15
> May 10 15:26:40 os02 kernel: [ 3652.485391]  <EOI>
> [<ffffffff8100cf3c>] ? call_softirq+0x1c/0x30
> May 10 15:26:40 os02 kernel: [ 3652.485397]  [<ffffffff81110a54>] ?
> shrink_inactive_list+0x164/0x460
> May 10 15:26:40 os02 kernel: [ 3652.485400]  [<ffffffff8100e975>] ?
> do_softirq+0x65/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.485404]  [<ffffffff8153facc>] ?
> schedule+0x44c/0xa10
> May 10 15:26:40 os02 kernel: [ 3652.485407]  [<ffffffff8106b605>] ?
> irq_exit+0x95/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.485412]  [<ffffffff81109b1a>] ?
> determine_dirtyable_memory+0x1a/0x30
> May 10 15:26:40 os02 kernel: [ 3652.485416]  [<ffffffff8154a276>] ?
> do_IRQ+0x66/0xe0
> May 10 15:26:40 os02 kernel: [ 3652.485419]  [<ffffffff81111453>] ?
> shrink_zone+0x3d3/0x530
> May 10 15:26:40 os02 kernel: [ 3652.485422]  [<ffffffff81542a53>] ?
> ret_from_intr+0x0/0x15
> May 10 15:26:40 os02 kernel: [ 3652.485423]  <EOI>
> [<ffffffff81074a4a>] ? del_timer_sync+0x3a/0x60
> May 10 15:26:40 os02 kernel: [ 3652.485430]  [<ffffffff812a774d>] ?
> copy_user_generic_string+0x2d/0x40
> May 10 15:26:40 os02 kernel: [ 3652.485435]  [<ffffffff811054a5>] ?
> zone_watermark_ok_safe+0xb5/0xd0
> May 10 15:26:40 os02 kernel: [ 3652.485439]  [<ffffffff810ff351>] ?
> iov_iter_copy_from_user_atomic+0x101/0x170
> May 10 15:26:40 os02 kernel: [ 3652.485442]  [<ffffffff81112a69>] ?
> kswapd+0x889/0xb20
> May 10 15:26:40 os02 kernel: [ 3652.485457]  [<ffffffffa026c91d>] ?
> btrfs_copy_from_user+0xcd/0x130 [btrfs]
> May 10 15:26:40 os02 kernel: [ 3652.485460]  [<ffffffff811121e0>] ?
> kswapd+0x0/0xb20
> May 10 15:26:40 os02 kernel: [ 3652.485472]  [<ffffffffa026d844>] ?
> __btrfs_buffered_write+0x1a4/0x330 [btrfs]
> May 10 15:26:40 os02 kernel: [ 3652.485476]  [<ffffffff810862b6>] ?
> kthread+0x96/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.485479]  [<ffffffff8117151f>] ?
> file_update_time+0x5f/0x170
> May 10 15:26:40 os02 kernel: [ 3652.485482]  [<ffffffff8100ce44>] ?
> kernel_thread_helper+0x4/0x10
> May 10 15:26:40 os02 kernel: [ 3652.485493]  [<ffffffffa026dc08>] ?
> btrfs_file_aio_write+0x238/0x4e0 [btrfs]
> May 10 15:26:40 os02 kernel: [ 3652.485496]  [<ffffffff81086220>] ?
> kthread+0x0/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.485507]  [<ffffffffa026d9d0>] ?
> btrfs_file_aio_write+0x0/0x4e0 [btrfs]
> May 10 15:26:40 os02 kernel: [ 3652.485511]  [<ffffffff8100ce40>] ?
> kernel_thread_helper+0x0/0x10
> May 10 15:26:40 os02 kernel: [ 3652.485515]  [<ffffffff81158ff3>] ?
> do_sync_readv_writev+0xd3/0x110
> May 10 15:26:40 os02 kernel: [ 3652.485516] Mem-Info:
> May 10 15:26:40 os02 kernel: [ 3652.485519]  [<ffffffff81163d42>] ?
> path_put+0x22/0x30
> May 10 15:26:40 os02 kernel: [ 3652.485521] Node 0 DMA per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.485525]  [<ffffffff812584a3>] ?
> selinux_file_permission+0xf3/0x150
> May 10 15:26:40 os02 kernel: [ 3652.485528] CPU    0: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485530] CPU    1: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485534]  [<ffffffff81251583>] ?
> security_file_permission+0x23/0x90
> May 10 15:26:40 os02 kernel: [ 3652.485535] CPU    2: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485538] CPU    3: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485542]  [<ffffffff81159f14>] ?
> do_readv_writev+0xd4/0x1e0
> May 10 15:26:40 os02 kernel: [ 3652.485544] CPU    4: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485547] CPU    5: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485550]  [<ffffffff81540d91>] ?
> mutex_lock+0x31/0x60
> May 10 15:26:40 os02 kernel: [ 3652.485552] CPU    6: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485554] CPU    7: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485557]  [<ffffffff8115a066>] ?
> vfs_writev+0x46/0x60
> May 10 15:26:40 os02 kernel: [ 3652.485558] Node 0 DMA32 per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.485562]  [<ffffffff8115a1a1>] ?
> sys_writev+0x51/0xc0
> May 10 15:26:40 os02 kernel: [ 3652.485564] CPU    0: hi:  186, btch:
> 31 usd: 144
> May 10 15:26:40 os02 kernel: [ 3652.485567] CPU    1: hi:  186, btch:
> 31 usd: 198
> May 10 15:26:40 os02 kernel: [ 3652.485571]  [<ffffffff8100c002>] ?
> system_call_fastpath+0x16/0x1b
> May 10 15:26:40 os02 kernel: [ 3652.485573] CPU    2: hi:  186, btch:
> 31 usd: 180
> May 10 15:26:40 os02 kernel: [ 3652.485574] Mem-Info:
> May 10 15:26:40 os02 kernel: [ 3652.485576] CPU    3: hi:  186, btch:
> 31 usd: 171
> May 10 15:26:40 os02 kernel: [ 3652.485578] Node 0 CPU    4: hi:  186,
> btch:  31 usd: 159
> May 10 15:26:40 os02 kernel: [ 3652.485581] DMA per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.485582] CPU    5: hi:  186, btch:
> 31 usd:  69
> May 10 15:26:40 os02 kernel: [ 3652.485585] CPU    0: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485587] CPU    6: hi:  186, btch:
> 31 usd: 180
> May 10 15:26:40 os02 kernel: [ 3652.485589] CPU    1: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485591] CPU    7: hi:  186, btch:
> 31 usd: 184
> May 10 15:26:40 os02 kernel: [ 3652.485593] CPU    2: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485594] Node 0 CPU    3: hi:    0,
> btch:   1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485597] Normal per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.485598] CPU    4: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485600] CPU    0: hi:  186, btch:
> 31 usd: 100
> May 10 15:26:40 os02 kernel: [ 3652.485602] CPU    5: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485604] CPU    1: hi:  186, btch:
> 31 usd:  47
> May 10 15:26:40 os02 kernel: [ 3652.485606] CPU    6: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485608] CPU    2: hi:  186, btch:
> 31 usd: 168
> May 10 15:26:40 os02 kernel: [ 3652.485610] CPU    7: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.485612] CPU    3: hi:  186, btch:
> 31 usd: 140
> May 10 15:26:40 os02 kernel: [ 3652.485614] Node 0 CPU    4: hi:  186,
> btch:  31 usd: 177
> May 10 15:26:40 os02 kernel: [ 3652.485617] DMA32 per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.485618] CPU    5: hi:  186, btch:
> 31 usd:  77
> May 10 15:26:40 os02 kernel: [ 3652.485621] CPU    0: hi:  186, btch:
> 31 usd: 144
> May 10 15:26:40 os02 kernel: [ 3652.485623] CPU    6: hi:  186, btch:
> 31 usd: 168
> May 10 15:26:40 os02 kernel: [ 3652.485625] CPU    1: hi:  186, btch:
> 31 usd: 198
> May 10 15:26:40 os02 kernel: [ 3652.485627] CPU    7: hi:  186, btch:
> 31 usd:  68
> May 10 15:26:40 os02 kernel: [ 3652.485629] CPU    2: hi:  186, btch:
> 31 usd: 180
> May 10 15:26:40 os02 kernel: [ 3652.485634] active_anon:255806
> inactive_anon:19454 isolated_anon:0
> May 10 15:26:40 os02 kernel: [ 3652.485636]  active_file:420093
> inactive_file:5180559 isolated_file:0
> May 10 15:26:40 os02 kernel: [ 3652.485637]  unevictable:50582
> dirty:314034 writeback:8484 unstable:0
> May 10 15:26:40 os02 kernel: [ 3652.485639]  free:30074
> slab_reclaimable:35739 slab_unreclaimable:13526
> May 10 15:26:40 os02 kernel: [ 3652.485641]  mapped:3440 shmem:51
> pagetables:1342 bounce:0
> May 10 15:26:40 os02 kernel: [ 3652.485643] CPU    3: hi:  186, btch:
> 31 usd: 171
> May 10 15:26:40 os02 kernel: [ 3652.485644] Node 0 CPU    4: hi:  186,
> btch:  31 usd: 159
> May 10 15:26:40 os02 kernel: [ 3652.485652] DMA free:15852kB min:12kB
> low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB
> inactive_file:0kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:15660kB mlocked:0kB dirty:0kB writeback:0kB
> mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
> kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> May 10 15:26:40 os02 kernel: [ 3652.485659] CPU    5: hi:  186, btch:
> 31 usd:  69
> May 10 15:26:40 os02 kernel: [ 3652.485661] lowmem_reserve[]:CPU    6:
> hi:  186, btch:  31 usd: 180
> May 10 15:26:40 os02 kernel: [ 3652.485663]  0CPU    7: hi:  186,
> btch:  31 usd: 184
> May 10 15:26:40 os02 kernel: [ 3652.485665]  2991Node 0  24201Normal per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.485668]  24201CPU    0: hi:  186,
> btch:  31 usd: 100
> May 10 15:26:40 os02 kernel: [ 3652.485671]
> May 10 15:26:40 os02 kernel: [ 3652.485672] CPU    1: hi:  186, btch:
> 31 usd:  47
> May 10 15:26:40 os02 kernel: [ 3652.485674] Node 0 CPU    2: hi:  186,
> btch:  31 usd: 168
> May 10 15:26:40 os02 kernel: [ 3652.485682] DMA32 free:85748kB
> min:2460kB low:3072kB high:3688kB active_anon:20480kB
> inactive_anon:5268kB active_file:151588kB inactive_file:2645188kB
> unevictable:72kB isolated(anon):0kB isolated(file):0kB
> present:3063392kB mlocked:0kB dirty:210820kB writeback:0kB
> mapped:648kB shmem:0kB slab_reclaimable:28400kB
> slab_unreclaimable:2152kB kernel_stack:520kB pagetables:100kB
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> May 10 15:26:40 os02 kernel: [ 3652.485690] CPU    3: hi:  186, btch:
> 31 usd: 140
> May 10 15:26:40 os02 kernel: [ 3652.485691] lowmem_reserve[]:CPU    4:
> hi:  186, btch:  31 usd: 177
> May 10 15:26:40 os02 kernel: [ 3652.485693]  0CPU    5: hi:  186,
> btch:  31 usd:  77
> May 10 15:26:40 os02 kernel: [ 3652.485696]  0CPU    6: hi:  186,
> btch:  31 usd: 168
> May 10 15:26:40 os02 kernel: [ 3652.485698]  21210CPU    7: hi:  186,
> btch:  31 usd:  68
> May 10 15:26:40 os02 kernel: [ 3652.485701]  21210active_anon:255806
> inactive_anon:19454 isolated_anon:0
> May 10 15:26:40 os02 kernel: [ 3652.485705]  active_file:420093
> inactive_file:5180559 isolated_file:0
> May 10 15:26:40 os02 kernel: [ 3652.485706]  unevictable:50582
> dirty:314034 writeback:8484 unstable:0
> May 10 15:26:40 os02 kernel: [ 3652.485707]  free:30074
> slab_reclaimable:35739 slab_unreclaimable:13526
> May 10 15:26:40 os02 kernel: [ 3652.485708]  mapped:3440 shmem:51
> pagetables:1342 bounce:0
> May 10 15:26:40 os02 kernel: [ 3652.485709]
> May 10 15:26:40 os02 kernel: [ 3652.485710] Node 0 Node 0 DMA
> free:15852kB min:12kB low:12kB high:16kB active_anon:0kB
> inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB
> isolated(anon):0kB isolated(file):0kB present:15660kB mlocked:0kB
> dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB
> slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB
> bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> May 10 15:26:40 os02 kernel: [ 3652.485724] Normal free:18696kB
> min:17440kB low:21800kB high:26160kB active_anon:1002744kB
> inactive_anon:72548kB active_file:1528784kB inactive_file:18077048kB
> unevictable:202256kB isolated(anon):0kB isolated(file):0kB
> present:21719040kB mlocked:0kB dirty:1045316kB writeback:33936kB
> mapped:13112kB shmem:204kB slab_reclaimable:114556kB
> slab_unreclaimable:51952kB kernel_stack:3768kB pagetables:5268kB
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32
> all_unreclaimable? no
> May 10 15:26:40 os02 kernel: [ 3652.485731]
> lowmem_reserve[]:lowmem_reserve[]: 0 0 2991 0 24201 0 24201 0
> May 10 15:26:40 os02 kernel: [ 3652.485737]
> May 10 15:26:40 os02 kernel: [ 3652.485738] Node 0 Node 0 DMA32
> free:85748kB min:2460kB low:3072kB high:3688kB active_anon:20480kB
> inactive_anon:5268kB active_file:151588kB inactive_file:2645188kB
> unevictable:72kB isolated(anon):0kB isolated(file):0kB
> present:3063392kB mlocked:0kB dirty:210820kB writeback:0kB
> mapped:648kB shmem:0kB slab_reclaimable:28400kB
> slab_unreclaimable:2152kB kernel_stack:520kB pagetables:100kB
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> May 10 15:26:40 os02 kernel: [ 3652.485747] DMA:
> lowmem_reserve[]:1*4kB  01*8kB  00*16kB  212101*32kB  212101*64kB
> May 10 15:26:40 os02 kernel: [ 3652.485754] 1*128kB Node 0 1*256kB
> Normal free:18696kB min:17440kB low:21800kB high:26160kB
> active_anon:1002744kB inactive_anon:72548kB active_file:1528784kB
> inactive_file:18077048kB unevictable:202256kB isolated(anon):0kB
> isolated(file):0kB present:21719040kB mlocked:0kB dirty:1045316kB
> writeback:33936kB mapped:13112kB shmem:204kB slab_reclaimable:114556kB
> slab_unreclaimable:51952kB kernel_stack:3768kB pagetables:5268kB
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:32
> all_unreclaimable? no
> May 10 15:26:40 os02 kernel: [ 3652.485764] 0*512kB
> lowmem_reserve[]:1*1024kB  01*2048kB  03*4096kB  0= 15852kB
> May 10 15:26:40 os02 kernel: [ 3652.485771]  0Node 0
> May 10 15:26:40 os02 kernel: [ 3652.485773] DMA32: Node 0 59*4kB DMA:
> 125*8kB 1*4kB 66*16kB 1*8kB 80*32kB 0*16kB 188*64kB 1*32kB 51*128kB
> 1*64kB 15*256kB 1*128kB 40*512kB 1*256kB 31*1024kB 0*512kB 1*2048kB
> 1*1024kB 1*4096kB 1*2048kB = 85620kB
> May 10 15:26:40 os02 kernel: [ 3652.485789] 3*4096kB Node 0 = 15852kB
> May 10 15:26:40 os02 kernel: [ 3652.485791] Normal: Node 0 3930*4kB
> DMA32: 0*8kB 59*4kB 1*16kB 125*8kB 0*32kB 66*16kB 0*64kB 80*32kB
> 0*128kB 188*64kB 1*256kB 51*128kB 1*512kB 15*256kB 0*1024kB 40*512kB
> 1*2048kB 31*1024kB 0*4096kB 1*2048kB = 18552kB
> May 10 15:26:40 os02 kernel: [ 3652.485807] 1*4096kB 5651289 total
> pagecache pages
> May 10 15:26:40 os02 kernel: [ 3652.485809] = 85620kB
> May 10 15:26:40 os02 kernel: [ 3652.485810] 0 pages in swap cache
> May 10 15:26:40 os02 kernel: [ 3652.485811] Node 0 Swap cache stats:
> add 0, delete 0, find 0/0
> May 10 15:26:40 os02 kernel: [ 3652.485814] Normal: Free swap  = 1048572kB
> May 10 15:26:40 os02 kernel: [ 3652.485815] 3930*4kB Total swap = 1048572kB
> May 10 15:26:40 os02 kernel: [ 3652.485817] 0*8kB 1*16kB 0*32kB 0*64kB
> 0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 18552kB
> May 10 15:26:40 os02 kernel: [ 3652.485822] 5651289 total pagecache pages
> May 10 15:26:40 os02 kernel: [ 3652.485823] 0 pages in swap cache
> May 10 15:26:40 os02 kernel: [ 3652.485824] Swap cache stats: add 0,
> delete 0, find 0/0
> May 10 15:26:40 os02 kernel: [ 3652.485825] Free swap  = 1048572kB
> May 10 15:26:40 os02 kernel: [ 3652.485826] Total swap = 1048572kB
> May 10 15:26:40 os02 kernel: [ 3652.486439] kworker/0:1: page
> allocation failure. order:2, mode:0x4020
> May 10 15:26:40 os02 kernel: [ 3652.486443] Pid: 0, comm: kworker/0:1
> Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
> May 10 15:26:40 os02 kernel: [ 3652.486446] Call Trace:
> May 10 15:26:40 os02 kernel: [ 3652.486448]  <IRQ>
> [<ffffffff81108ce7>] ? __alloc_pages_nodemask+0x6f7/0x8a0
> May 10 15:26:40 os02 kernel: [ 3652.486459]  [<ffffffff814b0ad0>] ?
> ip_local_deliver+0x80/0x90
> May 10 15:26:40 os02 kernel: [ 3652.486464]  [<ffffffff81146cd2>] ?
> kmalloc_large_node+0x62/0xb0
> May 10 15:26:40 os02 kernel: [ 3652.486468]  [<ffffffff8114becb>] ?
> __kmalloc_node_track_caller+0x15b/0x1d0
> May 10 15:26:40 os02 kernel: [ 3652.486473]  [<ffffffff81466f74>] ?
> __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3652.486476]  [<ffffffff81466713>] ?
> __alloc_skb+0x83/0x170
> May 10 15:26:40 os02 kernel: [ 3652.486479]  [<ffffffff81466f74>] ?
> __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3652.486489]  [<ffffffffa005d9aa>] ?
> ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3652.486494]  [<ffffffff81474840>] ?
> napi_skb_finish+0x50/0x70
> May 10 15:26:40 os02 kernel: [ 3652.486501]  [<ffffffffa0060930>] ?
> ixgbe_poll+0x1140/0x1670 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3652.486506]  [<ffffffff81013379>] ?
> sched_clock+0x9/0x10
> May 10 15:26:40 os02 kernel: [ 3652.486510]  [<ffffffff81474eb2>] ?
> net_rx_action+0x102/0x2a0
> May 10 15:26:40 os02 kernel: [ 3652.486514]  [<ffffffff8106b745>] ?
> __do_softirq+0xb5/0x210
> May 10 15:26:40 os02 kernel: [ 3652.486520]  [<ffffffff8108aec4>] ?
> hrtimer_interrupt+0x134/0x240
> May 10 15:26:40 os02 kernel: [ 3652.486523]  [<ffffffff8100cf3c>] ?
> call_softirq+0x1c/0x30
> May 10 15:26:40 os02 kernel: [ 3652.486526]  [<ffffffff8100e975>] ?
> do_softirq+0x65/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.486529]  [<ffffffff8106b605>] ?
> irq_exit+0x95/0xa0
> May 10 15:26:40 os02 kernel: [ 3652.486533]  [<ffffffff8154a360>] ?
> smp_apic_timer_interrupt+0x70/0x9b
> May 10 15:26:40 os02 kernel: [ 3652.486536]  [<ffffffff8100c9f3>] ?
> apic_timer_interrupt+0x13/0x20
> May 10 15:26:40 os02 kernel: [ 3652.486538]  <EOI>
> [<ffffffff812db311>] ? intel_idle+0xc1/0x120
> May 10 15:26:40 os02 kernel: [ 3652.486544]  [<ffffffff812db2f4>] ?
> intel_idle+0xa4/0x120
> May 10 15:26:40 os02 kernel: [ 3652.486549]  [<ffffffff8143bca5>] ?
> cpuidle_idle_call+0xb5/0x240
> May 10 15:26:40 os02 kernel: [ 3652.486554]  [<ffffffff8100aa87>] ?
> cpu_idle+0xb7/0x110
> May 10 15:26:40 os02 kernel: [ 3652.486558]  [<ffffffff81538ffe>] ?
> start_secondary+0x21f/0x221
> May 10 15:26:40 os02 kernel: [ 3652.486561] Mem-Info:
> May 10 15:26:40 os02 kernel: [ 3652.486562] Node 0 DMA per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.486564] CPU    0: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486567] CPU    1: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486569] CPU    2: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486571] CPU    3: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486573] CPU    4: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486575] CPU    5: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486578] CPU    6: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486580] CPU    7: hi:    0, btch:
>  1 usd:   0
> May 10 15:26:40 os02 kernel: [ 3652.486581] Node 0 DMA32 per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.486584] CPU    0: hi:  186, btch:
> 31 usd: 144
> May 10 15:26:40 os02 kernel: [ 3652.486586] CPU    1: hi:  186, btch:
> 31 usd: 198
> May 10 15:26:40 os02 kernel: [ 3652.486588] CPU    2: hi:  186, btch:
> 31 usd: 180
> May 10 15:26:40 os02 kernel: [ 3652.486590] CPU    3: hi:  186, btch:
> 31 usd: 172
> May 10 15:26:40 os02 kernel: [ 3652.486593] CPU    4: hi:  186, btch:
> 31 usd: 159
> May 10 15:26:40 os02 kernel: [ 3652.486595] CPU    5: hi:  186, btch:
> 31 usd:  69
> May 10 15:26:40 os02 kernel: [ 3652.486597] CPU    6: hi:  186, btch:
> 31 usd: 180
> May 10 15:26:40 os02 kernel: [ 3652.486599] CPU    7: hi:  186, btch:
> 31 usd: 184
> May 10 15:26:40 os02 kernel: [ 3652.486601] Node 0 Normal per-cpu:
> May 10 15:26:40 os02 kernel: [ 3652.486603] CPU    0: hi:  186, btch:
> 31 usd: 162
> May 10 15:26:40 os02 kernel: [ 3652.486605] CPU    1: hi:  186, btch:
> 31 usd:  47
> May 10 15:26:40 os02 kernel: [ 3652.486608] CPU    2: hi:  186, btch:
> 31 usd: 168
> May 10 15:26:40 os02 kernel: [ 3652.486610] CPU    3: hi:  186, btch:
> 31 usd: 141
> May 10 15:26:40 os02 kernel: [ 3652.486612] CPU    4: hi:  186, btch:
> 31 usd: 177
> May 10 15:26:40 os02 kernel: [ 3652.486614] CPU    5: hi:  186, btch:
> 31 usd:  77
> May 10 15:26:40 os02 kernel: [ 3652.486616] CPU    6: hi:  186, btch:
> 31 usd: 168
> May 10 15:26:40 os02 kernel: [ 3652.486618] CPU    7: hi:  186, btch:
> 31 usd: 174
> May 10 15:26:40 os02 kernel: [ 3652.486624] active_anon:255806
> inactive_anon:19454 isolated_anon:0
> May 10 15:26:40 os02 kernel: [ 3652.486625]  active_file:420093
> inactive_file:5180745 isolated_file:0
> May 10 15:26:40 os02 kernel: [ 3652.486627]  unevictable:50582
> dirty:314470 writeback:8484 unstable:0
> May 10 15:26:40 os02 kernel: [ 3652.486628]  free:29795
> slab_reclaimable:35739 slab_unreclaimable:13526
> May 10 15:26:40 os02 kernel: [ 3652.486629]  mapped:3440 shmem:51
> pagetables:1342 bounce:0
> May 10 15:26:40 os02 kernel: [ 3652.486631] Node 0 DMA free:15852kB
> min:12kB low:12kB high:16kB active_anon:0kB inactive_anon:0kB
> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:15660kB mlocked:0kB dirty:0kB writeback:0kB
> mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB
> kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
> May 10 15:26:40 os02 kernel: [ 3652.486642] lowmem_reserve[]: 0 2991 24201 24201
> May 10 15:26:40 os02 kernel: [ 3652.486645] Node 0 DMA32 free:85748kB
> min:2460kB low:3072kB high:3688kB active_anon:20480kB
> inactive_anon:5268kB active_file:151588kB inactive_file:2645188kB
> unevictable:72kB isolated(anon):0kB isolated(file):0kB
> present:3063392kB mlocked:0kB dirty:210820kB writeback:0kB
> mapped:648kB shmem:0kB slab_reclaimable:28400kB
> slab_unreclaimable:2152kB kernel_stack:520kB pagetables:100kB
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> May 10 15:26:40 os02 kernel: [ 3652.486657] lowmem_reserve[]: 0 0 21210 21210
> May 10 15:26:40 os02 kernel: [ 3652.486660] Node 0 Normal free:17580kB
> min:17440kB low:21800kB high:26160kB active_anon:1002744kB
> inactive_anon:72548kB active_file:1528784kB inactive_file:18077792kB
> unevictable:202256kB isolated(anon):0kB isolated(file):0kB
> present:21719040kB mlocked:0kB dirty:1047060kB writeback:33936kB
> mapped:13112kB shmem:204kB slab_reclaimable:114556kB
> slab_unreclaimable:51952kB kernel_stack:3768kB pagetables:5268kB
> unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:64
> all_unreclaimable? no
> May 10 15:26:40 os02 kernel: [ 3652.486673] lowmem_reserve[]: 0 0 0 0
> May 10 15:26:40 os02 kernel: [ 3652.486675] Node 0 DMA: 1*4kB 1*8kB
> 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB
> 3*4096kB = 15852kB
> May 10 15:26:40 os02 kernel: [ 3652.486684] Node 0 DMA32: 59*4kB
> 125*8kB 66*16kB 80*32kB 188*64kB 51*128kB 15*256kB 40*512kB 31*1024kB
> 1*2048kB 1*4096kB = 85620kB
> May 10 15:26:40 os02 kernel: [ 3652.486692] Node 0 Normal: 3705*4kB
> 12*8kB 16*16kB 4*32kB 1*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB
> 0*4096kB = 18180kB
> May 10 15:26:40 os02 kernel: [ 3652.486700] 5651289 total pagecache pages
> May 10 15:26:40 os02 kernel: [ 3652.486702] 0 pages in swap cache
> May 10 15:26:40 os02 kernel: [ 3652.486704] Swap cache stats: add 0,
> delete 0, find 0/0
> May 10 15:26:40 os02 kernel: [ 3652.486705] Free swap  = 1048572kB
> May 10 15:26:40 os02 kernel: [ 3652.486707] Total swap = 1048572kB
> May 10 15:26:40 os02 kernel: [ 3652.562795] 6291440 pages RAM
> May 10 15:26:40 os02 kernel: [ 3652.562798] 108688 pages reserved
> May 10 15:26:40 os02 kernel: [ 3652.562799] 5429575 pages shared
> May 10 15:26:40 os02 kernel: [ 3652.562801] 783596 pages non-shared
> May 10 15:26:40 os02 kernel: [ 3652.651570] 6291440 pages RAM
> May 10 15:26:40 os02 kernel: [ 3652.651572] 108688 pages reserved
> May 10 15:26:40 os02 kernel: [ 3652.651573] 5430055 pages shared
> May 10 15:26:40 os02 kernel: [ 3652.651575] 782974 pages non-shared
> May 10 15:26:40 os02 kernel: [ 3652.721553] 6291440 pages RAM
> May 10 15:26:40 os02 kernel: [ 3652.721555] 108688 pages reserved
> May 10 15:26:40 os02 kernel: [ 3652.721556] 5430961 pages shared
> May 10 15:26:40 os02 kernel: [ 3652.721557] 781496 pages non-shared
> May 10 15:26:40 os02 kernel: [ 3654.349865] Pid: 1846, comm: cosd
> Tainted: P        W   2.6.38.6-1.fits.1.el6.x86_64 #1
> May 10 15:26:40 os02 kernel: [ 3654.358792] Call Trace:
> May 10 15:26:40 os02 kernel: [ 3654.361519]  <IRQ>
> [<ffffffff81108ce7>] ? __alloc_pages_nodemask+0x6f7/0x8a0
> May 10 15:26:40 os02 kernel: [ 3654.369495]  [<ffffffff814b0ad0>] ?
> ip_local_deliver+0x80/0x90
> May 10 15:26:40 os02 kernel: [ 3654.376005]  [<ffffffff81146cd2>] ?
> kmalloc_large_node+0x62/0xb0
> May 10 15:26:40 os02 kernel: [ 3654.382703]  [<ffffffff8114becb>] ?
> __kmalloc_node_track_caller+0x15b/0x1d0
> May 10 15:26:40 os02 kernel: [ 3654.390464]  [<ffffffff81466f74>] ?
> __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3654.397163]  [<ffffffff81466713>] ?
> __alloc_skb+0x83/0x170
> May 10 15:26:40 os02 kernel: [ 3654.403277]  [<ffffffff81466f74>] ?
> __netdev_alloc_skb+0x24/0x50
> May 10 15:26:40 os02 kernel: [ 3654.409970]  [<ffffffffa005d9aa>] ?
> ixgbe_alloc_rx_buffers+0x9a/0x450 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3654.417926]  [<ffffffff812b79e0>] ?
> swiotlb_map_page+0x0/0x110
> May 10 15:26:40 os02 kernel: [ 3654.424432]  [<ffffffffa0060930>] ?
> ixgbe_poll+0x1140/0x1670 [ixgbe]
> May 10 15:26:40 os02 kernel: [ 3654.431518]  [<ffffffff810f33eb>] ?
> perf_pmu_enable+0x2b/0x40
> May 10 15:26:40 os02 kernel: [ 3654.437924]  [<ffffffff81474eb2>] ?
> net_rx_action+0x102/0x2a0
> May 10 15:26:40 os02 kernel: [ 3654.444329]  [<ffffffff8106b745>] ?
> __do_softirq+0xb5/0x210
> May 10 15:26:40 os02 kernel: [ 3654.450541]  [<ffffffff810c7ca4>] ?
> handle_IRQ_event+0x54/0x180
> May 10 15:26:40 os02 kernel: [ 3654.457138]  [<ffffffff8106b7bd>] ?
> __do_softirq+0x12d/0x210
> May 10 15:26:40 os02 kernel: [ 3654.463446]  [<ffffffff8100cf3c>] ?
> call_softirq+0x1c/0x30
> May 10 15:26:40 os02 kernel: [ 3654.469562]  [<ffffffff8100e975>] ?
> do_softirq+0x65/0xa0
> May 10 15:26:40 os02 kernel: [ 3654.475484]  [<ffffffff8106b605>] ?
> irq_exit+0x95/0xa0
> May 10 15:26:40 os02 kernel: [ 3654.481218]  [<ffffffff8154a276>] ?
> do_IRQ+0x66/0xe0
> May 10 15:26:40 os02 kernel: [ 3654.486754]  [<ffffffff81542a53>] ?
> ret_from_intr+0x0/0x15
> May 10 15:26:40 os02 kernel: [ 3654.492867]  <EOI>
> [<ffffffff81286919>] ? __make_request+0x149/0x4c0
> May 10 15:26:40 os02 kernel: [ 3654.500061]  [<ffffffff812868e4>] ?
> __make_request+0x114/0x4c0
> May 10 15:26:41 os02 kernel: [ 3654.506565]  [<ffffffff812841bd>] ?
> generic_make_request+0x2fd/0x5e0
> May 10 15:26:41 os02 kernel: [ 3654.513649]  [<ffffffff8142742b>] ?
> dm_get_live_table+0x4b/0x60
> May 10 15:26:41 os02 kernel: [ 3654.520248]  [<ffffffff81427bc1>] ?
> dm_merge_bvec+0xc1/0x140
> May 10 15:26:41 os02 kernel: [ 3654.526555]  [<ffffffff81284526>] ?
> submit_bio+0x86/0x110
> May 10 15:26:41 os02 kernel: [ 3654.532574]  [<ffffffff8118deac>] ?
> dio_bio_submit+0xbc/0xc0
> May 10 15:26:41 os02 kernel: [ 3654.538881]  [<ffffffff8118df40>] ?
> dio_send_cur_page+0x90/0xc0
> May 10 15:26:41 os02 kernel: [ 3654.545478]  [<ffffffff8118dfd5>] ?
> submit_page_section+0x65/0x180
> May 10 15:26:41 os02 kernel: [ 3654.552370]  [<ffffffff8118e918>] ?
> __blockdev_direct_IO+0x678/0xb30
> May 10 15:26:41 os02 kernel: [ 3654.559454]  [<ffffffff81250eaf>] ?
> security_inode_getsecurity+0x1f/0x30
> May 10 15:26:41 os02 kernel: [ 3654.566924]  [<ffffffff8118c627>] ?
> blkdev_direct_IO+0x57/0x60
> May 10 15:26:41 os02 kernel: [ 3654.573414]  [<ffffffff8118b760>] ?
> blkdev_get_blocks+0x0/0xc0
> May 10 15:26:41 os02 kernel: [ 3654.579954]  [<ffffffff811008f2>] ?
> generic_file_direct_write+0xc2/0x190
> May 10 15:26:41 os02 kernel: [ 3654.587424]  [<ffffffff811715b6>] ?
> file_update_time+0xf6/0x170
> May 10 15:26:41 os02 kernel: [ 3654.594025]  [<ffffffff811023eb>] ?
> __generic_file_aio_write+0x32b/0x460
> May 10 15:26:41 os02 kernel: [ 3654.601494]  [<ffffffff8105c9e0>] ?
> wake_up_state+0x10/0x20
>
>
>
> and so on.
>
> --
> Stefan Majer
>
>
>
> --
> Stefan Majer
>



-- 
Stefan Majer


end of thread, other threads:[~2011-05-16  8:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-10 14:04 Kernel 2.6.38.6 page allocation failure (ixgbe) Stefan Majer
2011-05-10 14:20 ` Yehuda Sadeh Weinraub
2011-05-10 14:26   ` Yehuda Sadeh Weinraub
2011-05-10 15:55   ` Stefan Majer
2011-05-10 16:02     ` Sage Weil
2011-05-10 16:06       ` Stefan Majer
2011-05-11  6:58         ` Stefan Majer
2011-05-11  7:36           ` Stefan Majer
     [not found] ` <BANLkTik=FM5LJs8JUKHR2S+r41vi94Z7pw@mail.gmail.com>
     [not found]   ` <F169D4F5E1F1974DBFAFABF47F60C10AF23DC8FF@orsmsx507.amr.corp.intel.com>
2011-05-16  8:28     ` Stefan Majer
