* I have a blaze of 353 page allocation failures, all alike
@ 2011-02-10 15:03 Peter Kruse
  2011-02-14 16:49 ` Christoph Lameter
From: Peter Kruse @ 2011-02-10 15:03 UTC
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1017 bytes --]

Hello,

today one of our servers went berserk and produced 353
page allocation failures in 7 minutes, until it was reset
(sysrq was still working).  I attach one of them as an example.
The failures hit many different processes: sshd, top, java,
tclsh, ypserv, smbd, portmap, kswapd and Xvnc4.
I already reported an earlier incident with this server here:
https://lkml.org/lkml/2011/1/19/145
We have set vm.min_free_kbytes = 2097152, but the problem
obviously did not go away.
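
For scale, the sysctl value above is given in kilobytes. The following is a minimal sketch of the conversion, assuming a Linux host that exposes /proc/sys/vm/min_free_kbytes; the helper names kbytes_to_gib and read_min_free_kbytes are hypothetical and exist only for this illustration.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the sysctl on a Linux host.
MIN_FREE_PATH = "/proc/sys/vm/min_free_kbytes"


def kbytes_to_gib(kbytes: int) -> float:
    """Convert a size in kilobytes (1024 bytes each) to GiB."""
    return kbytes / (1024 * 1024)


def read_min_free_kbytes(path: str = MIN_FREE_PATH) -> Optional[int]:
    """Return the current setting, or None if the file is not available."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


if __name__ == "__main__":
    # Value quoted in the mail.
    print(f"2097152 kB = {kbytes_to_gib(2097152)} GiB")
    current = read_min_free_kbytes()
    if current is not None:
        print(f"this host: {current} kB = {kbytes_to_gib(current):.3f} GiB")
```

So the setting asks the kernel to keep roughly 2 GiB free as a reserve.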
All traces start with one of these three beginnings:

Call Trace:
  <IRQ>  [<ffffffff81071f46>] __alloc_pages_nodemask+0x5ca/0x600
  [<ffffffff81344127>] ? skb_dma_map+0xd2/0x23f

Call Trace:
  <IRQ>  [<ffffffff81071f46>] __alloc_pages_nodemask+0x5ca/0x600
  [<ffffffff8109428b>] kmem_getpages+0x5c/0x127

Call Trace:
  <IRQ>  [<ffffffff81071f46>] __alloc_pages_nodemask+0x5ca/0x600
  [<ffffffffa01418fd>] ? tcp_packet+0xc87/0xcb2 [nf_conntrack]
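
Grouping traces by their first frames, as done by hand above, can be sketched in a few lines of Python. This is only an illustration under assumptions: it expects frames in the "[<address>] symbol+0xoff/0xlen" form shown in this mail, treats any non-frame line as the end of a trace, and the names split_traces and group_by_prefix are hypothetical.

```python
import re
from collections import Counter

# Matches one frame, e.g. "[<ffffffff8109428b>] ? kmem_getpages+0x5c/0x127".
# The optional "? " marks an unreliable frame and is skipped.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([^\s+]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def split_traces(log_text):
    """Return a list of traces, each a list of symbol names in order."""
    traces = []
    current = None
    for line in log_text.splitlines():
        if "Call Trace:" in line:
            current = []
            traces.append(current)
            continue
        if current is None:
            continue
        match = FRAME_RE.search(line)
        if match:
            current.append(match.group(1))
        else:
            # Any line without a frame (blank line, Mem-Info, ...) ends the trace.
            current = None
    return traces


def group_by_prefix(traces, depth=2):
    """Count traces by their first `depth` symbols."""
    return Counter(tuple(t[:depth]) for t in traces if t)


if __name__ == "__main__":
    import sys

    # Usage: python group_traces.py < kernel.log
    for prefix, count in group_by_prefix(split_traces(sys.stdin.read())).most_common():
        print(count, " -> ".join(prefix))
```

Run against the full kernel log, this would show how the 353 failures distribute over the three beginnings.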

Can anybody tell me what causes these failures?

Thanks,

   Peter

[-- Attachment #2: calltrace.1 --]
[-- Type: text/plain, Size: 2896 bytes --]

Call Trace:
 <IRQ>  [<ffffffff81071f46>] __alloc_pages_nodemask+0x5ca/0x600
 [<ffffffff8109428b>] kmem_getpages+0x5c/0x127
 [<ffffffff81094475>] fallback_alloc+0x11f/0x195
 [<ffffffff81094614>] ____cache_alloc_node+0x129/0x138
 [<ffffffff81094fdd>] kmem_cache_alloc+0xd1/0xfe
 [<ffffffff8133c2f9>] sk_prot_alloc+0x2c/0xcd
 [<ffffffff8133c427>] sk_clone+0x1b/0x24b
 [<ffffffff81369ce2>] inet_csk_clone+0x13/0x81
 [<ffffffff8137d698>] tcp_create_openreq_child+0x1d/0x39c
 [<ffffffff8137c309>] tcp_v4_syn_recv_sock+0x57/0x1bc
 [<ffffffff8137d50f>] tcp_check_req+0x210/0x37c
 [<ffffffffa0154423>] ? ipv4_confirm+0x161/0x179 [nf_conntrack_ipv4]
 [<ffffffff8137ba63>] tcp_v4_do_rcv+0xc1/0x1d7
 [<ffffffff8137c021>] tcp_v4_rcv+0x4a8/0x739
 [<ffffffff8135ba27>] ? nf_hook_slow+0x63/0xc3
 [<ffffffff81361bb0>] ? ip_local_deliver_finish+0x0/0x1d0
 [<ffffffff81361ca8>] ip_local_deliver_finish+0xf8/0x1d0
 [<ffffffff81361df2>] ip_local_deliver+0x72/0x7a
 [<ffffffff813618ac>] ip_rcv_finish+0x33c/0x356
 [<ffffffff81361b79>] ip_rcv+0x2b3/0x2ea
 [<ffffffff813a2861>] ? packet_rcv_spkt+0x10f/0x11a
 [<ffffffff8134660a>] netif_receive_skb+0x2cb/0x2ed
 [<ffffffff81346767>] napi_skb_finish+0x28/0x40
 [<ffffffff81346ba5>] napi_gro_receive+0x2a/0x2f
 [<ffffffffa001669d>] igb_poll+0x507/0x86a [igb]
 [<ffffffffa0015ef8>] ? igb_clean_tx_irq+0x1dd/0x47b [igb]
 [<ffffffff81346cb6>] net_rx_action+0xa7/0x178
 [<ffffffff8103bd21>] __do_softirq+0x96/0x119
 [<ffffffff8100bf5c>] call_softirq+0x1c/0x28
 [<ffffffff8100d9e7>] do_softirq+0x33/0x6b
 [<ffffffff8103b844>] irq_exit+0x36/0x38
 [<ffffffff8100d0e9>] do_IRQ+0xa3/0xba
 [<ffffffff8100b7d3>] ret_from_intr+0x0/0xa
 <EOI>  [<ffffffffa00f046f>] ? xfs_reclaim_inode_shrink+0xc3/0x112 [xfs]
 [<ffffffffa00f0451>] ? xfs_reclaim_inode_shrink+0xa5/0x112 [xfs]
 [<ffffffffa00f04bd>] ? xfs_reclaim_inode_shrink+0x111/0x112 [xfs]
 [<ffffffff810770fc>] ? shrink_slab+0xd2/0x154
 [<ffffffff81077e00>] ? try_to_free_pages+0x221/0x31c
 [<ffffffff81074f4a>] ? isolate_pages_global+0x0/0x1f0
 [<ffffffff81071d79>] ? __alloc_pages_nodemask+0x3fd/0x600
 [<ffffffff8109428b>] ? kmem_getpages+0x5c/0x127
 [<ffffffff81094475>] ? fallback_alloc+0x11f/0x195
 [<ffffffff81094614>] ? ____cache_alloc_node+0x129/0x138
 [<ffffffff810a9055>] ? pollwake+0x0/0x5b
 [<ffffffff810946bf>] ? kmem_cache_alloc_node+0x9c/0xc7
 [<ffffffff8109472d>] ? __kmalloc_node+0x43/0x45
 [<ffffffff81340625>] ? __alloc_skb+0x6b/0x164
 [<ffffffff8133bcc1>] ? sock_alloc_send_pskb+0xdd/0x31c
 [<ffffffff8133bf10>] ? sock_alloc_send_skb+0x10/0x12
 [<ffffffff8139e4c2>] ? unix_stream_sendmsg+0x180/0x312
 [<ffffffff81338270>] ? sock_aio_write+0x109/0x122
 [<ffffffff8100b7ce>] ? common_interrupt+0xe/0x13
 [<ffffffff8109a41a>] ? do_sync_write+0xe7/0x12d
 [<ffffffff81049208>] ? autoremove_wake_function+0x0/0x38
 [<ffffffff8100b7ce>] ? common_int[...]
[...]reclaimable:78357
 mapped:11679 shmem:26799 pagetables:13497 bounce:0


Thread overview: 27+ messages
2011-02-10 15:03 I have a blaze of 353 page allocation failures, all alike Peter Kruse
2011-02-14 16:49 ` Christoph Lameter
2011-02-15  7:44   ` Peter Kruse
2011-02-15 17:30     ` Christoph Lameter
2011-02-16 12:22       ` Peter Kruse
2011-02-16 15:59         ` Christoph Lameter
2011-02-16 16:03           ` Peter Kruse
2011-02-16 16:14             ` Christoph Lameter
2011-02-17  7:31               ` Peter Kruse
2011-02-17 17:03                 ` Christoph Lameter
2011-02-18 12:30                   ` Peter Kruse
2011-02-24 12:01                     ` Peter Kruse
2011-04-12 15:01                       ` Peter Kruse
2011-04-12 18:08                         ` Christoph Lameter
2011-04-13  1:34                           ` David Rientjes
2011-04-13  7:13                             ` Peter Kruse
2011-04-13 16:17                               ` Christoph Lameter
2011-05-19 11:56                                 ` Peter Kruse
2011-05-19 16:00                                   ` Christoph Lameter
2011-05-23  6:34                                     ` Peter Kruse
     [not found]                                 ` <4E09BEA1.1080501@q-leap.de>
     [not found]                                   ` <alpine.DEB.2.00.1107051013500.16869@router.home>
2011-07-05 17:20                                     ` Mel Gorman
2011-07-06  4:16                                       ` Dave Chinner
2011-07-06  6:50                                         ` Peter Kruse
2011-07-06 14:31                                           ` Christoph Lameter
2011-07-06 15:15                                             ` Peter Kruse
2011-07-06 15:30                                               ` Christoph Lameter
2011-11-24 10:53                                                 ` Peter Kruse
