* 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
@ 2011-06-18  1:16 Justin Piszcz
  2011-06-18 16:19   ` Mark Lord
  0 siblings, 1 reply; 9+ messages in thread
From: Justin Piszcz @ 2011-06-18  1:16 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-net, e1000-devel

Hi,

Kernel 2.6.39.1, x86_64.
Has anyone seen a page allocation failure on a NIC before?

I was running 3-4 dumps (stdin/stdout) over eth0, but not eth3, when this
happened 1 minute and 14 seconds later. The network card is a 4-port Intel
NIC:

Network card:

03:00.0 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
03:00.1 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
03:00.2 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
03:00.3 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)

-v:

03:00.3 Ethernet controller: Intel Corporation 82580 Gigabit Network Connection (rev 01)
         Subsystem: Intel Corporation Ethernet Server Adapter I340-T4
         Flags: bus master, fast devsel, latency 0, IRQ 19
         Memory at f2200000 (32-bit, non-prefetchable) [size=512K]
         Memory at f2400000 (32-bit, non-prefetchable) [size=16K]
         Capabilities: [40] Power Management version 3
         Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
         Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
         Capabilities: [a0] Express Endpoint, MSI 00
         Capabilities: [100] Advanced Error Reporting
         Capabilities: [140] Device Serial Number hidden
         Capabilities: [1a0] #17
         Kernel driver in use: igb


What is interesting is that eth0 uses e1000e, not igb; the igb ports
typically do not carry much network traffic (< 50 Mbps).

Memory information:
Mem:  16434508k total, 16123848k used,   310660k free,  6837232k buffers
Swap: 31246388k total,      228k used, 31246160k free,  6423724k cached

Here is my udev rule configuration:

# PCI device 0x8086:0x10f0 (e1000e)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="hidden", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x150e (igb) - closest to motherboard
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="hidden", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x8086:0x150e (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="hidden", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"

# PCI device 0x8086:0x150e (igb)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="hidden", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"

# PCI device 0x8086:0x150e (igb) - furthest from motherboard
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="hidden", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

# PCI device 0x8086:0x150b (ixgbe)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="hidden", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth5"

--

Do I need more memory?

--

[60295.925691] irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
[60295.945328] Pid: 2299, comm: irq/64-eth3-TxR Not tainted 2.6.39.1 #1
[60295.945329] Call Trace:
[60295.945330]  <IRQ>  [<ffffffff810882f6>] ? __alloc_pages_nodemask+0x606/0x890
[60295.945341]  [<ffffffff810b1435>] ? cache_alloc_refill+0x2c5/0x530
[60295.945343]  [<ffffffff810b180b>] ? kmem_cache_alloc+0x7b/0xa0
[60295.945347]  [<ffffffff815031ac>] ? sk_prot_alloc.clone.35+0x3c/0x120
[60295.945349]  [<ffffffff81503320>] ? sk_clone+0x10/0x2b0
[60295.945352]  [<ffffffff815580bb>] ? inet_csk_clone+0xb/0x90
[60295.945355]  [<ffffffff8156fa31>] ? tcp_create_openreq_child+0x21/0x4e0
[60295.945357]  [<ffffffff8156cbd3>] ? tcp_v4_syn_recv_sock+0x53/0x250
[60295.945359]  [<ffffffff8156f790>] ? tcp_check_req+0x200/0x480
[60295.945362]  [<ffffffff8156cab1>] ? tcp_v4_do_rcv+0x1c1/0x290
[60295.945365]  [<ffffffff8154dd30>] ? ip_rcv_finish+0x340/0x340
[60295.945367]  [<ffffffff8156f047>] ? tcp_v4_rcv+0x5f7/0x8b0
[60295.945369]  [<ffffffff8154ddf4>] ? ip_local_deliver_finish+0xc4/0x200
[60295.945373]  [<ffffffff8151158b>] ? __netif_receive_skb+0x4eb/0x610
[60295.945375]  [<ffffffff81511898>] ? netif_receive_skb+0x78/0x80
[60295.945377]  [<ffffffff81511f03>] ? napi_gro_receive+0xa3/0xc0
[60295.945379]  [<ffffffff815119b8>] ? napi_skb_finish+0x38/0x50
[60295.945383]  [<ffffffff813e6208>] ? igb_poll+0x8b8/0xd00
[60295.945386]  [<ffffffff8102e5f1>] ? enqueue_task_rt+0x121/0x320
[60295.945388]  [<ffffffff815120c9>] ? net_rx_action+0xf9/0x180
[60295.945391]  [<ffffffff8103df38>] ? __do_softirq+0x98/0x120
[60295.945395]  [<ffffffff81070010>] ? irq_thread_fn+0x40/0x40
[60295.945397]  [<ffffffff81619a4c>] ? call_softirq+0x1c/0x30
[60295.945398]  <EOI>  [<ffffffff81003d8d>] ? do_softirq+0x4d/0x80
[60295.945402]  [<ffffffff8103de94>] ? local_bh_enable+0x94/0xa0
[60295.945405]  [<ffffffff8106ff70>] ? irq_thread+0x150/0x1b0
[60295.945407]  [<ffffffff8106fe20>] ? irq_finalize_oneshot+0x130/0x130
[60295.945409]  [<ffffffff8106fe20>] ? irq_finalize_oneshot+0x130/0x130
[60295.945412]  [<ffffffff81052746>] ? kthread+0x96/0xa0
[60295.945414]  [<ffffffff81619954>] ? kernel_thread_helper+0x4/0x10
[60295.945417]  [<ffffffff810526b0>] ? kthread_worker_fn+0x120/0x120
[60295.945418]  [<ffffffff81619950>] ? gs_change+0xb/0xb
[60295.945419] Mem-Info:
[60295.945420] DMA per-cpu:
[60295.945422] CPU    0: hi:    0, btch:   1 usd:   0
[60295.945423] CPU    1: hi:    0, btch:   1 usd:   0
[60295.945425] CPU    2: hi:    0, btch:   1 usd:   0
[60295.945426] CPU    3: hi:    0, btch:   1 usd:   0
[60295.945427] CPU    4: hi:    0, btch:   1 usd:   0
[60295.945429] CPU    5: hi:    0, btch:   1 usd:   0
[60295.945430] CPU    6: hi:    0, btch:   1 usd:   0
[60295.945431] CPU    7: hi:    0, btch:   1 usd:   0
[60295.945432] DMA32 per-cpu:
[60295.945434] CPU    0: hi:  186, btch:  31 usd: 191
[60295.945435] CPU    1: hi:  186, btch:  31 usd: 162
[60295.945436] CPU    2: hi:  186, btch:  31 usd: 156
[60295.945438] CPU    3: hi:  186, btch:  31 usd: 112
[60295.945439] CPU    4: hi:  186, btch:  31 usd: 175
[60295.945440] CPU    5: hi:  186, btch:  31 usd: 183
[60295.945442] CPU    6: hi:  186, btch:  31 usd: 167
[60295.945443] CPU    7: hi:  186, btch:  31 usd: 161
[60295.945444] Normal per-cpu:
[60295.945445] CPU    0: hi:  186, btch:  31 usd:  68
[60295.945447] CPU    1: hi:  186, btch:  31 usd:  90
[60295.945448] CPU    2: hi:  186, btch:  31 usd:  79
[60295.945449] CPU    3: hi:  186, btch:  31 usd:  94
[60295.945451] CPU    4: hi:  186, btch:  31 usd: 145
[60295.945452] CPU    5: hi:  186, btch:  31 usd: 157
[60295.945453] CPU    6: hi:  186, btch:  31 usd: 210
[60295.945455] CPU    7: hi:  186, btch:  31 usd:  73
[60295.945458] active_anon:503700 inactive_anon:57605 isolated_anon:0
[60295.945459]  active_file:1082465 inactive_file:2046870 isolated_file:0
[60295.945460]  unevictable:0 dirty:44503 writeback:0 unstable:0
[60295.945460]  free:97865 slab_reclaimable:220079 slab_unreclaimable:19352
[60295.945461]  mapped:18737 shmem:1304 pagetables:10293 bounce:0
[60295.945466] DMA free:15860kB min:124kB low:152kB high:184kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15636kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[60295.945469] lowmem_reserve[]: 0 3502 16127 16127
[60295.945474] DMA32 free:136844kB min:29328kB low:36660kB high:43992kB active_anon:80236kB inactive_anon:31460kB active_file:475400kB inactive_file:2655292kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3586912kB mlocked:0kB dirty:22768kB writeback:0kB mapped:484kB shmem:0kB slab_reclaimable:180892kB slab_unreclaimable:1804kB kernel_stack:104kB pagetables:20kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[60295.945478] lowmem_reserve[]: 0 0 12625 12625
[60295.945483] Normal free:238756kB min:105708kB low:132132kB high:158560kB active_anon:1934564kB inactive_anon:198960kB active_file:3854460kB inactive_file:5532188kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:12928000kB mlocked:0kB dirty:155244kB writeback:0kB mapped:74464kB shmem:5216kB slab_reclaimable:699424kB slab_unreclaimable:75604kB kernel_stack:4880kB pagetables:41152kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:2 all_unreclaimable? no
[60295.945487] lowmem_reserve[]: 0 0 0 0
[60295.945489] DMA: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15860kB
[60295.945493] DMA32: 24085*4kB 3687*8kB 372*16kB 94*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 136844kB
[60295.945498] Normal: 39611*4kB 6147*8kB 1178*16kB 294*32kB 45*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 238756kB
[60295.945503] 3130697 total pagecache pages
[60295.945504] 2 pages in swap cache
[60295.945505] Swap cache stats: add 57, delete 55, find 0/0
[60295.945506] Free swap  = 31246160kB
[60295.945507] Total swap = 31246388kB
[60296.002283] 4194288 pages RAM
[60296.002284] 85661 pages reserved
[60296.002285] 2906516 pages shared
[60296.002286] 1304532 pages non-shared
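
For anyone reading the first line of the report above: `order:N` is the base-2 logarithm of the number of contiguous pages requested, and `mode` is the GFP flag mask. The sketch below is only an illustration. It assumes 4 kB pages (x86_64) and uses GFP bit values as they are believed to have been defined in include/linux/gfp.h around 2.6.39; those values changed in later kernels, so the table is an assumption to check against the actual source tree. The function names are made up for this example.

```python
# Assumed GFP bit values for kernels around 2.6.39 (include/linux/gfp.h).
# These changed in later releases; verify against the actual source tree.
GFP_BITS_2_6_39 = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x100: "__GFP_COLD",
    0x200: "__GFP_NOWARN",
    0x400: "__GFP_REPEAT",
    0x800: "__GFP_NOFAIL",
    0x1000: "__GFP_NORETRY",
    0x4000: "__GFP_COMP",
    0x8000: "__GFP_ZERO",
}

PAGE_SIZE_KB = 4  # assumption: 4 kB pages, as on x86_64


def order_to_kb(order):
    """Size in kB of one allocation of the given order (2**order pages)."""
    return (1 << order) * PAGE_SIZE_KB


def decode_gfp(mode):
    """Return names of the known flag bits set in mode, lowest bit first.

    Bits that are not in the table are reported as one hex string at the end.
    """
    names = []
    known = 0
    for bit in sorted(GFP_BITS_2_6_39):
        known |= bit
        if mode & bit:
            names.append(GFP_BITS_2_6_39[bit])
    unknown = mode & ~known
    if unknown:
        names.append(hex(unknown))
    return names


if __name__ == "__main__":
    # Values taken from the failure line in this report.
    print(order_to_kb(1), decode_gfp(0x20))
```

Under these assumptions, `order:1, mode:0x20` is a request for 8 kB of contiguous memory with only `__GFP_HIGH` set, which is what GFP_ATOMIC expanded to in that era. The caller cannot sleep, so the allocator cannot perform direct reclaim on its behalf.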



* Re: 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
  2011-06-18  1:16 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20 Justin Piszcz
@ 2011-06-18 16:19   ` Mark Lord
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Lord @ 2011-06-18 16:19 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, linux-net, e1000-devel, linux-ext4

On 11-06-17 09:16 PM, Justin Piszcz wrote:
> Hi,
>
> Kernel 2.6.39.1, x86_64.
> Has anyone seen a page allocation failure on a NIC before?
..
> [60295.925691] irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
> [60295.945328] Pid: 2299, comm: irq/64-eth3-TxR Not tainted 2.6.39.1 #1
> [60295.945329] Call Trace:
> [60295.945330]  <IRQ>  [<ffffffff810882f6>] ? __alloc_pages_nodemask+0x606/0x890
> [60295.945341]  [<ffffffff810b1435>] ? cache_alloc_refill+0x2c5/0x530
> [60295.945343]  [<ffffffff810b180b>] ? kmem_cache_alloc+0x7b/0xa0
> [60295.945347]  [<ffffffff815031ac>] ? sk_prot_alloc.clone.35+0x3c/0x120
> [60295.945349]  [<ffffffff81503320>] ? sk_clone+0x10/0x2b0
> [60295.945352]  [<ffffffff815580bb>] ? inet_csk_clone+0xb/0x90
> [60295.945355]  [<ffffffff8156fa31>] ? tcp_create_openreq_child+0x21/0x4e0
> [60295.945357]  [<ffffffff8156cbd3>] ? tcp_v4_syn_recv_sock+0x53/0x250
> [60295.945359]  [<ffffffff8156f790>] ? tcp_check_req+0x200/0x480
> [60295.945362]  [<ffffffff8156cab1>] ? tcp_v4_do_rcv+0x1c1/0x290
> [60295.945365]  [<ffffffff8154dd30>] ? ip_rcv_finish+0x340/0x340
> [60295.945367]  [<ffffffff8156f047>] ? tcp_v4_rcv+0x5f7/0x8b0
> [60295.945369]  [<ffffffff8154ddf4>] ? ip_local_deliver_finish+0xc4/0x200
> [60295.945373]  [<ffffffff8151158b>] ? __netif_receive_skb+0x4eb/0x610
> [60295.945375]  [<ffffffff81511898>] ? netif_receive_skb+0x78/0x80
> [60295.945377]  [<ffffffff81511f03>] ? napi_gro_receive+0xa3/0xc0
> [60295.945379]  [<ffffffff815119b8>] ? napi_skb_finish+0x38/0x50
> [60295.945383]  [<ffffffff813e6208>] ? igb_poll+0x8b8/0xd00
> [60295.945386]  [<ffffffff8102e5f1>] ? enqueue_task_rt+0x121/0x320
> [60295.945388]  [<ffffffff815120c9>] ? net_rx_action+0xf9/0x180
> [60295.945391]  [<ffffffff8103df38>] ? __do_softirq+0x98/0x120
> [60295.945395]  [<ffffffff81070010>] ? irq_thread_fn+0x40/0x40
> [60295.945397]  [<ffffffff81619a4c>] ? call_softirq+0x1c/0x30
> [60295.945398]  <EOI>  [<ffffffff81003d8d>] ? do_softirq+0x4d/0x80
> [60295.945402]  [<ffffffff8103de94>] ? local_bh_enable+0x94/0xa0
> [60295.945405]  [<ffffffff8106ff70>] ? irq_thread+0x150/0x1b0
> [60295.945407]  [<ffffffff8106fe20>] ? irq_finalize_oneshot+0x130/0x130
> [60295.945409]  [<ffffffff8106fe20>] ? irq_finalize_oneshot+0x130/0x130
> [60295.945412]  [<ffffffff81052746>] ? kthread+0x96/0xa0
> [60295.945414]  [<ffffffff81619954>] ? kernel_thread_helper+0x4/0x10
> [60295.945417]  [<ffffffff810526b0>] ? kthread_worker_fn+0x120/0x120
> [60295.945418]  [<ffffffff81619950>] ? gs_change+0xb/0xb
..

Not on a NIC, but also with 2.6.39:

[35850.612899] sd 4:0:0:0: [sdc] Attached SCSI disk
[35943.085264] mount: page allocation failure. order:5, mode:0xc0d0
[35943.085277] Pid: 14228, comm: mount Not tainted 2.6.39 #10
[35943.085284] Call Trace:
[35943.085306]  [<ffffffff8106fa96>] ? __alloc_pages_nodemask+0x710/0x74d
[35943.085322]  [<ffffffff8106fb5b>] ? __get_free_pages+0x12/0x50
[35943.085335]  [<ffffffff810f9120>] ? ext4_fill_super+0xe4f/0x20ff
[35943.085347]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e
[35943.085359]  [<ffffffff81148ef0>] ? snprintf+0x36/0x3b
[35943.085371]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e
[35943.085384]  [<ffffffff8109e05e>] ? mount_bdev+0x136/0x17d
[35943.085397]  [<ffffffff8109537d>] ? __kmalloc_track_caller+0xa9/0x116
[35943.085410]  [<ffffffff8109cfa6>] ? mount_fs+0xc/0xa6
[35943.085423]  [<ffffffff810b225d>] ? vfs_kern_mount+0x61/0x97
[35943.085434]  [<ffffffff810b22f2>] ? do_kern_mount+0x49/0xd6
[35943.085445]  [<ffffffff810b2a70>] ? do_mount+0x6f1/0x758
[35943.085457]  [<ffffffff81078f01>] ? memdup_user+0x3f/0x5b
[35943.085468]  [<ffffffff810b2b5f>] ? sys_mount+0x88/0xcd
[35943.085482]  [<ffffffff812cc47b>] ? system_call_fastpath+0x16/0x1b
[35943.085490] Mem-Info:
[35943.085496] DMA per-cpu:
[35943.085503] CPU    0: hi:    0, btch:   1 usd:   0
[35943.085511] CPU    1: hi:    0, btch:   1 usd:   0
[35943.085517] DMA32 per-cpu:
[35943.085524] CPU    0: hi:  186, btch:  31 usd:   0
[35943.085532] CPU    1: hi:  186, btch:  31 usd: 114
[35943.085549] active_anon:64179 inactive_anon:31764 isolated_anon:0
[35943.085554]  active_file:90242 inactive_file:223697 isolated_file:0
[35943.085558]  unevictable:2 dirty:24616 writeback:0 unstable:0
[35943.085562]  free:19204 slab_reclaimable:64266 slab_unreclaimable:6283
[35943.085566]  mapped:7463 shmem:31597 pagetables:5475 bounce:0
[35943.085592] DMA free:8308kB min:340kB low:424kB high:508kB active_anon:0kB
inactive_anon:1056kB active_file:1736kB inactive_file:4712kB unevictable:0kB
isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB
writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:92kB slab_unreclaimable:8kB
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? no
[35943.085612] lowmem_reserve[]: 0 1993 1993 1993
[35943.085643] DMA32 free:68508kB min:44712kB low:55888kB high:67068kB
active_anon:256716kB inactive_anon:126000kB active_file:359232kB
inactive_file:890076kB unevictable:8kB isolated(anon):0kB isolated(file):0kB
present:2041776kB mlocked:0kB dirty:98464kB writeback:0kB mapped:29852kB
shmem:126388kB slab_reclaimable:256972kB slab_unreclaimable:25124kB
kernel_stack:2160kB pagetables:21900kB unstable:0kB bounce:0kB writeback_tmp:0kB
pages_scanned:115 all_unreclaimable? no
[35943.085663] lowmem_reserve[]: 0 0 0 0
[35943.085673] DMA: 3*4kB 3*8kB 3*16kB 15*32kB 17*64kB 6*128kB 3*256kB 4*512kB
1*1024kB 1*2048kB 0*4096kB = 8308kB
[35943.085700] DMA32: 5597*4kB 2221*8kB 948*16kB 330*32kB 33*64kB 4*128kB
0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68508kB
[35943.085727] 345589 total pagecache pages
[35943.085733] 50 pages in swap cache
[35943.085739] Swap cache stats: add 58, delete 8, find 0/0
[35943.085744] Free swap  = 1975060kB
[35943.085749] Total swap = 1975292kB
[35943.113312] 521600 pages RAM
[35943.113318] 9355 pages reserved
[35943.113322] 290443 pages shared
[35943.113326] 248448 pages non-shared
[35943.181471] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts:
(null)
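
The per-zone free-list lines in the log above (`DMA32: 5597*4kB 2221*8kB ...`) list how many free blocks exist at each size. A rough way to see how much of the free memory could satisfy a request of a given order is to parse that line and sum only the blocks that are large enough. The following is a minimal sketch, assuming 4 kB pages so that order:5 corresponds to 128 kB; the helper names are hypothetical and the input is the DMA32 line from the log with the mail line wrap joined and the dmesg timestamp stripped.

```python
import re


def parse_buddy_line(line):
    """Parse 'ZONE: N*SIZEkB ... = TOTALkB' into (zone, {size_kb: count}, total_kb)."""
    zone, rest = line.split(":", 1)
    blocks = {int(size): int(count)
              for count, size in re.findall(r"(\d+)\*(\d+)kB", rest)}
    total = int(re.search(r"=\s*(\d+)kB", rest).group(1))
    return zone.strip(), blocks, total


def free_kb_at_or_above(blocks, min_block_kb):
    """Free memory (kB) held in blocks of at least min_block_kb."""
    return sum(size * count
               for size, count in blocks.items()
               if size >= min_block_kb)


# DMA32 line from the log above (wrap joined, timestamp removed).
LINE = ("DMA32: 5597*4kB 2221*8kB 948*16kB 330*32kB 33*64kB 4*128kB "
        "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68508kB")

zone, blocks, total = parse_buddy_line(LINE)

if __name__ == "__main__":
    print(zone, total, free_kb_at_or_above(blocks, 128))
```

For this line the per-size counts add up to the reported 68508 kB, but only 512 kB of that (four 128 kB blocks) sits in blocks large enough for an order:5 request. The DMA zone lists a few larger blocks, but whether the allocator may use them depends on the zone reserve settings, which this sketch does not model.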



* Re: 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
  2011-06-18 16:19   ` Mark Lord
  (?)
@ 2011-06-18 16:21   ` Mark Lord
  -1 siblings, 0 replies; 9+ messages in thread
From: Mark Lord @ 2011-06-18 16:21 UTC (permalink / raw)
  To: Justin Piszcz; +Cc: linux-kernel, linux-net, e1000-devel, linux-ext4

On 11-06-18 12:19 PM, Mark Lord wrote:
> On 11-06-17 09:16 PM, Justin Piszcz wrote:
>> > Hi,
>> >
>> > Kernel 2.6.39.1, x86_64.
>> > Has anyone seen a page allocation failure on a NIC before?
> ..
>> > [60295.925691] irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
>> > [60295.945328] Pid: 2299, comm: irq/64-eth3-TxR Not tainted 2.6.39.1 #1
>> > [60295.945329] Call Trace:
>> > [60295.945330]  <IRQ>  [<ffffffff810882f6>] ? __alloc_pages_nodemask+0x606/0x890
>> > [60295.945341]  [<ffffffff810b1435>] ? cache_alloc_refill+0x2c5/0x530
>> > [60295.945343]  [<ffffffff810b180b>] ? kmem_cache_alloc+0x7b/0xa0
>> > [60295.945347]  [<ffffffff815031ac>] ? sk_prot_alloc.clone.35+0x3c/0x120
>> > [60295.945349]  [<ffffffff81503320>] ? sk_clone+0x10/0x2b0
>> > [60295.945352]  [<ffffffff815580bb>] ? inet_csk_clone+0xb/0x90
>> > [60295.945355]  [<ffffffff8156fa31>] ? tcp_create_openreq_child+0x21/0x4e0
>> > [60295.945357]  [<ffffffff8156cbd3>] ? tcp_v4_syn_recv_sock+0x53/0x250
>> > [60295.945359]  [<ffffffff8156f790>] ? tcp_check_req+0x200/0x480
>> > [60295.945362]  [<ffffffff8156cab1>] ? tcp_v4_do_rcv+0x1c1/0x290
>> > [60295.945365]  [<ffffffff8154dd30>] ? ip_rcv_finish+0x340/0x340
>> > [60295.945367]  [<ffffffff8156f047>] ? tcp_v4_rcv+0x5f7/0x8b0
>> > [60295.945369]  [<ffffffff8154ddf4>] ? ip_local_deliver_finish+0xc4/0x200
>> > [60295.945373]  [<ffffffff8151158b>] ? __netif_receive_skb+0x4eb/0x610
>> > [60295.945375]  [<ffffffff81511898>] ? netif_receive_skb+0x78/0x80
>> > [60295.945377]  [<ffffffff81511f03>] ? napi_gro_receive+0xa3/0xc0
>> > [60295.945379]  [<ffffffff815119b8>] ? napi_skb_finish+0x38/0x50
>> > [60295.945383]  [<ffffffff813e6208>] ? igb_poll+0x8b8/0xd00
>> > [60295.945386]  [<ffffffff8102e5f1>] ? enqueue_task_rt+0x121/0x320
>> > [60295.945388]  [<ffffffff815120c9>] ? net_rx_action+0xf9/0x180
>> > [60295.945391]  [<ffffffff8103df38>] ? __do_softirq+0x98/0x120
>> > [60295.945395]  [<ffffffff81070010>] ? irq_thread_fn+0x40/0x40
>> > [60295.945397]  [<ffffffff81619a4c>] ? call_softirq+0x1c/0x30
>> > [60295.945398]  <EOI>  [<ffffffff81003d8d>] ? do_softirq+0x4d/0x80
>> > [60295.945402]  [<ffffffff8103de94>] ? local_bh_enable+0x94/0xa0
>> > [60295.945405]  [<ffffffff8106ff70>] ? irq_thread+0x150/0x1b0
>> > [60295.945407]  [<ffffffff8106fe20>] ? irq_finalize_oneshot+0x130/0x130
>> > [60295.945409]  [<ffffffff8106fe20>] ? irq_finalize_oneshot+0x130/0x130
>> > [60295.945412]  [<ffffffff81052746>] ? kthread+0x96/0xa0
>> > [60295.945414]  [<ffffffff81619954>] ? kernel_thread_helper+0x4/0x10
>> > [60295.945417]  [<ffffffff810526b0>] ? kthread_worker_fn+0x120/0x120
>> > [60295.945418]  [<ffffffff81619950>] ? gs_change+0xb/0xb
> ..
>
> Not on a NIC, but also with 2.6.39:
>
> [35850.612899] sd 4:0:0:0: [sdc] Attached SCSI disk
> [35943.085264] mount: page allocation failure. order:5, mode:0xc0d0
> [35943.085277] Pid: 14228, comm: mount Not tainted 2.6.39 #10
> [35943.085284] Call Trace:
> [35943.085306]  [<ffffffff8106fa96>] ? __alloc_pages_nodemask+0x710/0x74d
> [35943.085322]  [<ffffffff8106fb5b>] ? __get_free_pages+0x12/0x50
> [35943.085335]  [<ffffffff810f9120>] ? ext4_fill_super+0xe4f/0x20ff
> [35943.085347]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e
> [35943.085359]  [<ffffffff81148ef0>] ? snprintf+0x36/0x3b
> [35943.085371]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e
> [35943.085384]  [<ffffffff8109e05e>] ? mount_bdev+0x136/0x17d
> [35943.085397]  [<ffffffff8109537d>] ? __kmalloc_track_caller+0xa9/0x116
> [35943.085410]  [<ffffffff8109cfa6>] ? mount_fs+0xc/0xa6
> [35943.085423]  [<ffffffff810b225d>] ? vfs_kern_mount+0x61/0x97
> [35943.085434]  [<ffffffff810b22f2>] ? do_kern_mount+0x49/0xd6
> [35943.085445]  [<ffffffff810b2a70>] ? do_mount+0x6f1/0x758
> [35943.085457]  [<ffffffff81078f01>] ? memdup_user+0x3f/0x5b
> [35943.085468]  [<ffffffff810b2b5f>] ? sys_mount+0x88/0xcd
> [35943.085482]  [<ffffffff812cc47b>] ? system_call_fastpath+0x16/0x1b
> [35943.085490] Mem-Info:
> [35943.085496] DMA per-cpu:
> [35943.085503] CPU    0: hi:    0, btch:   1 usd:   0
> [35943.085511] CPU    1: hi:    0, btch:   1 usd:   0
> [35943.085517] DMA32 per-cpu:
> [35943.085524] CPU    0: hi:  186, btch:  31 usd:   0
> [35943.085532] CPU    1: hi:  186, btch:  31 usd: 114
> [35943.085549] active_anon:64179 inactive_anon:31764 isolated_anon:0
> [35943.085554]  active_file:90242 inactive_file:223697 isolated_file:0
> [35943.085558]  unevictable:2 dirty:24616 writeback:0 unstable:0
> [35943.085562]  free:19204 slab_reclaimable:64266 slab_unreclaimable:6283
> [35943.085566]  mapped:7463 shmem:31597 pagetables:5475 bounce:0
> [35943.085592] DMA free:8308kB min:340kB low:424kB high:508kB active_anon:0kB
> inactive_anon:1056kB active_file:1736kB inactive_file:4712kB unevictable:0kB
> isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB
> writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:92kB slab_unreclaimable:8kB
> kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB
> pages_scanned:0 all_unreclaimable? no
> [35943.085612] lowmem_reserve[]: 0 1993 1993 1993
> [35943.085643] DMA32 free:68508kB min:44712kB low:55888kB high:67068kB
> active_anon:256716kB inactive_anon:126000kB active_file:359232kB
> inactive_file:890076kB unevictable:8kB isolated(anon):0kB isolated(file):0kB
> present:2041776kB mlocked:0kB dirty:98464kB writeback:0kB mapped:29852kB
> shmem:126388kB slab_reclaimable:256972kB slab_unreclaimable:25124kB
> kernel_stack:2160kB pagetables:21900kB unstable:0kB bounce:0kB writeback_tmp:0kB
> pages_scanned:115 all_unreclaimable? no
> [35943.085663] lowmem_reserve[]: 0 0 0 0
> [35943.085673] DMA: 3*4kB 3*8kB 3*16kB 15*32kB 17*64kB 6*128kB 3*256kB 4*512kB
> 1*1024kB 1*2048kB 0*4096kB = 8308kB
> [35943.085700] DMA32: 5597*4kB 2221*8kB 948*16kB 330*32kB 33*64kB 4*128kB
> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68508kB
> [35943.085727] 345589 total pagecache pages
> [35943.085733] 50 pages in swap cache
> [35943.085739] Swap cache stats: add 58, delete 8, find 0/0
> [35943.085744] Free swap  = 1975060kB
> [35943.085749] Total swap = 1975292kB
> [35943.113312] 521600 pages RAM
> [35943.113318] 9355 pages reserved
> [35943.113322] 290443 pages shared
> [35943.113326] 248448 pages non-shared
> [35943.181471] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts:
> (null)

Oh, and another one, not quite identical:

[297246.722573] mount: page allocation failure. order:5, mode:0xc0d0
[297246.722584] Pid: 25863, comm: mount Not tainted 2.6.39 #10
[297246.722590] Call Trace:
[297246.722610]  [<ffffffff8106fa96>] ? __alloc_pages_nodemask+0x710/0x74d
[297246.722622]  [<ffffffff810bbd2c>] ? unmap_underlying_metadata+0x4b/0x4b
[297246.722633]  [<ffffffff8106fb5b>] ? __get_free_pages+0x12/0x50
[297246.722643]  [<ffffffff810f9120>] ? ext4_fill_super+0xe4f/0x20ff
[297246.722652]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e
[297246.722662]  [<ffffffff81148ef0>] ? snprintf+0x36/0x3b
[297246.722670]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e
[297246.722680]  [<ffffffff8109e05e>] ? mount_bdev+0x136/0x17d
[297246.722690]  [<ffffffff8109537d>] ? __kmalloc_track_caller+0xa9/0x116
[297246.722700]  [<ffffffff8109cfa6>] ? mount_fs+0xc/0xa6
[297246.722710]  [<ffffffff810b225d>] ? vfs_kern_mount+0x61/0x97
[297246.722720]  [<ffffffff810b22f2>] ? do_kern_mount+0x49/0xd6
[297246.722729]  [<ffffffff810b2a70>] ? do_mount+0x6f1/0x758
[297246.722740]  [<ffffffff81078f01>] ? memdup_user+0x3f/0x5b
[297246.722749]  [<ffffffff810b2b5f>] ? sys_mount+0x88/0xcd
[297246.722761]  [<ffffffff812cc47b>] ? system_call_fastpath+0x16/0x1b
[297246.722767] Mem-Info:
[297246.722772] DMA per-cpu:
[297246.722778] CPU    0: hi:    0, btch:   1 usd:   0
[297246.722784] CPU    1: hi:    0, btch:   1 usd:   0
[297246.722788] DMA32 per-cpu:
[297246.722794] CPU    0: hi:  186, btch:  31 usd:  14
[297246.722800] CPU    1: hi:  186, btch:  31 usd:   0
[297246.722813] active_anon:73864 inactive_anon:32029 isolated_anon:0
[297246.722817]  active_file:76404 inactive_file:195583 isolated_file:0
[297246.722821]  unevictable:2 dirty:19997 writeback:0 unstable:0
[297246.722824]  free:20421 slab_reclaimable:96332 slab_unreclaimable:4325
[297246.722827]  mapped:7904 shmem:29851 pagetables:5963 bounce:0
[297246.722848] DMA free:8396kB min:340kB low:424kB high:508kB active_anon:0kB inactive_anon:2048kB active_file:3476kB inactive_file:400kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:52kB shmem:0kB slab_reclaimable:1588kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[297246.722864] lowmem_reserve[]: 0 1993 1993 1993
[297246.722888] DMA32 free:73288kB min:44712kB low:55888kB high:67068kB active_anon:295456kB inactive_anon:126068kB active_file:302140kB inactive_file:781932kB unevictable:8kB isolated(anon):0kB isolated(file):0kB present:2041776kB mlocked:0kB dirty:79988kB writeback:0kB mapped:31564kB shmem:119404kB slab_reclaimable:383740kB slab_unreclaimable:17296kB kernel_stack:2248kB pagetables:23852kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[297246.722904] lowmem_reserve[]: 0 0 0 0
[297246.722912] DMA: 57*4kB 29*8kB 10*16kB 1*32kB 3*64kB 3*128kB 4*256kB 2*512kB 3*1024kB 1*2048kB 0*4096kB = 8396kB
[297246.722936] DMA32: 3186*4kB 2220*8kB 936*16kB 455*32kB 191*64kB 8*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 73288kB
[297246.722962] 310158 total pagecache pages
[297246.722968] 8321 pages in swap cache
[297246.722974] Swap cache stats: add 33604, delete 25283, find 46658/48112
[297246.722980] Free swap  = 1899068kB
[297246.722984] Total swap = 1975292kB
[297246.747988] 521600 pages RAM
[297246.747995] 9355 pages reserved
[297246.748000] 258772 pages shared
[297246.748004] 291766 pages non-shared
[297246.815211] EXT4-fs (sdc1): mounted filesystem with ordered data mode. Opts: (null)
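
As a cross-check of the buddy-allocator free lists in the log above, here is a minimal Python sketch that parses a `N*SIZEkB` line, confirms the reported total, and counts the free blocks at least as large as an order:5 request. The helper names are made up for illustration, and the 4 kB page size is an assumption (the usual x86_64 default).

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, the usual x86_64 default


def parse_free_list(line):
    """Return {block_size_kb: free_block_count} from a 'N*SkB ...' buddy line."""
    # The trailing '= 73288kB' total has no '*', so the pattern skips it.
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def total_kb(free_list):
    """Sum of all free memory described by the list, in kB."""
    return sum(size * count for size, count in free_list.items())


def blocks_at_least(free_list, order):
    """Count free blocks big enough for a 2**order page request."""
    need_kb = PAGE_KB << order
    return sum(count for size, count in free_list.items() if size >= need_kb)


# Free lists copied from the log, with the wrapped lines joined.
DMA = ("57*4kB 29*8kB 10*16kB 1*32kB 3*64kB 3*128kB 4*256kB 2*512kB "
       "3*1024kB 1*2048kB 0*4096kB = 8396kB")
DMA32 = ("3186*4kB 2220*8kB 936*16kB 455*32kB 191*64kB 8*128kB "
         "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 73288kB")

for name, line in (("DMA", DMA), ("DMA32", DMA32)):
    free = parse_free_list(line)
    print(name, total_kb(free), "kB free,",
          blocks_at_least(free, 5), "blocks >= order 5")
```

Both totals match the log (8396 kB and 73288 kB). DMA32 had only eight free 128 kB blocks and nothing larger. Free blocks of the right size do not by themselves guarantee success, since the allocator also applies per-zone watermark checks before handing them out.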

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
  2011-06-18 16:19   ` Mark Lord
@ 2011-06-18 17:05     ` Andreas Dilger
  -1 siblings, 0 replies; 9+ messages in thread
From: Andreas Dilger @ 2011-06-18 17:05 UTC (permalink / raw)
  To: Mark Lord; +Cc: Justin Piszcz, linux-kernel, linux-net, e1000-devel, linux-ext4

On 2011-06-18, at 10:19 AM, Mark Lord <kernel@teksavvy.com> wrote:
> On 11-06-17 09:16 PM, Justin Piszcz wrote:
>> 
>> Kernel 2.6.39.1, x86_64.
>> Has anyone seen a page allocation failure on a NIC before?
> ..
>> [60295.925691] irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
>> [60295.945328] Pid: 2299, comm: irq/64-eth3-TxR Not tainted 2.6.39.1 #1
>> [60295.945329] Call Trace:
>> [60295.945330]  <IRQ>  [<ffffffff810882f6>] ? __alloc_pages_nodemask+0x606/0x890
>> [60295.945341]  [<ffffffff810b1435>] ? cache_alloc_refill+0x2c5/0x530
>> [60295.945343]  [<ffffffff810b180b>] ? kmem_cache_alloc+0x7b/0xa0
>> [60295.945347]  [<ffffffff815031ac>] ? sk_prot_alloc.clone.35+0x3c/0x120
>> [60295.945349]  [<ffffffff81503320>] ? sk_clone+0x10/0x2b0
>> [60295.945352]  [<ffffffff815580bb>] 
> 
> Not on a NIC, but also with 2.6.39:
> 
> [35850.612899] sd 4:0:0:0: [sdc] Attached SCSI disk
> [35943.085264] mount: page allocation failure. order:5, mode:0xc0d0
> [35943.085277] Pid: 14228, comm: mount Not tainted 2.6.39 #10
> [35943.085284] Call Trace:
> [35943.085306]  [<ffffffff8106fa96>] ? __alloc_pages_nodemask+0x710/0x74d
> [35943.085322]  [<ffffffff8106fb5b>] ? __get_free_pages+0x12/0x50
> [35943.085335]  [<ffffffff810f9120>] ? ext4_fill_super+0xe4f/0x20ff
> [35943.085347]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e

There are a few places in the ext4 mount that are doing large allocations. In some places they fall back to vmalloc(), so they should really be done with __GFP_NOWARN.

A few places don't yet fall back to vmalloc(), which is a problem with fragmented memory or very large filesystems. We were trying to test a 192TB ext4 filesystem, but were unable to mount it without patching the kernel.

Cheers, Andreas
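
For readers decoding the two failure lines quoted above ("order:1, mode:0x20" and "order:5, mode:0xc0d0"), here is a rough Python sketch. The bit values are an assumption recalled from include/linux/gfp.h of the 2.6.39 era and should be checked against the actual tree; the 4 kB page size and the function names are likewise assumptions for illustration.

```python
PAGE_SIZE = 4096  # assumption: 4 kB pages (x86_64 default)

# Assumed bit values, recalled from include/linux/gfp.h around 2.6.39.
# Only the flags relevant to this thread are listed; verify before relying on them.
GFP_BITS = {
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x200: "__GFP_NOWARN",
    0x4000: "__GFP_COMP",
    0x8000: "__GFP_ZERO",
}


def order_bytes(order):
    """Size in bytes of a physically contiguous 2**order page allocation."""
    return PAGE_SIZE << order


def decode_mode(mode):
    """List the known flag names set in a gfp mask, plus any unknown bits as hex."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]
    known = sum(bit for bit in GFP_BITS if mode & bit)
    leftover = mode & ~known
    if leftover:
        names.append(hex(leftover))
    return names


for order, mode in ((1, 0x20), (5, 0xc0d0)):
    print(f"order:{order} = {order_bytes(order) // 1024} kB,",
          f"mode:{mode:#x} =", " | ".join(decode_mode(mode)))
```

Under these assumptions the NIC failure is an 8 kB atomic request (0x20 is just __GFP_HIGH, i.e. GFP_ATOMIC), while the mount failure is a 128 kB contiguous request with the GFP_KERNEL bits plus __GFP_COMP and __GFP_ZERO. Neither mask carries __GFP_NOWARN (0x200), which is why the warning is printed at all.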

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
  2011-06-18 17:05     ` Andreas Dilger
  (?)
@ 2011-06-18 17:39     ` Stephan Boettcher
  2011-06-18 19:44       ` Andreas Dilger
  -1 siblings, 1 reply; 9+ messages in thread
From: Stephan Boettcher @ 2011-06-18 17:39 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-ext4

Andreas Dilger <aedilger@gmail.com> writes:

> On 2011-06-18, at 10:19 AM, Mark Lord <kernel@teksavvy.com> wrote:
>> On 11-06-17 09:16 PM, Justin Piszcz wrote:
>>> 
>>> Kernel 2.6.39.1, x86_64.
>>> Has anyone seen a page allocation failure on a NIC before?
>> ..
>>> [60295.925691] irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
>>> [60295.945328] Pid: 2299, comm: irq/64-eth3-TxR Not tainted 2.6.39.1 #1
>>> [60295.945329] Call Trace:
>>> [60295.945330]  <IRQ>  [<ffffffff810882f6>] ? __alloc_pages_nodemask+0x606/0x890
>>> [60295.945341]  [<ffffffff810b1435>] ? cache_alloc_refill+0x2c5/0x530
>>> [60295.945343]  [<ffffffff810b180b>] ? kmem_cache_alloc+0x7b/0xa0
>>> [60295.945347]  [<ffffffff815031ac>] ? sk_prot_alloc.clone.35+0x3c/0x120
>>> [60295.945349]  [<ffffffff81503320>] ? sk_clone+0x10/0x2b0
>>> [60295.945352]  [<ffffffff815580bb>] 
>> 
>> Not on a NIC, but also with 2.6.39:
>> 
>> [35850.612899] sd 4:0:0:0: [sdc] Attached SCSI disk
>> [35943.085264] mount: page allocation failure. order:5, mode:0xc0d0
>> [35943.085277] Pid: 14228, comm: mount Not tainted 2.6.39 #10
>> [35943.085284] Call Trace:
>> [35943.085306]  [<ffffffff8106fa96>] ? __alloc_pages_nodemask+0x710/0x74d
>> [35943.085322]  [<ffffffff8106fb5b>] ? __get_free_pages+0x12/0x50
>> [35943.085335]  [<ffffffff810f9120>] ? ext4_fill_super+0xe4f/0x20ff
>> [35943.085347]  [<ffffffff810f82d1>] ? ext4_remount+0x40e/0x40e
>
> There are a few places in the ext4 mount that are doing large
> allocations. In some places they fall back to vmalloc, so they should
> really be done with __GFP_NOWARN.
>
> A few places don't yet fall back to vmalloc(), which is a problem
> with fragmented memory or very large filesystems. We were trying to
> test a 192TB ext4 filesystem, but were unable to mount it without
> patching the kernel.

:-O ...  my puny 20TB ext4 filesystem did not do something like
this, yet.

> Cheers, Andreas

-- 
Stephan 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
  2011-06-18 17:39     ` Stephan Boettcher
@ 2011-06-18 19:44       ` Andreas Dilger
  2011-06-19 11:28         ` Stephan Boettcher
  0 siblings, 1 reply; 9+ messages in thread
From: Andreas Dilger @ 2011-06-18 19:44 UTC (permalink / raw)
  To: Stephan Boettcher; +Cc: ext4 development

On 2011-06-18, at 11:39 AM, Stephan Boettcher wrote:
> Andreas Dilger <adilger@dilger.ca> writes:
>> There are a few places in the ext4 mount that are doing large
>> allocations. In some places they fall back to vmalloc, so they should
>> really be done with __GFP_NOWARN.
>> 
>> A few places don't yet fall back to vmalloc(), which is a problem
>> with fragmented memory or very large filesystems. We were trying to
>> test a 192TB ext4 filesystem, but were unable to mount it without
>> patching the kernel.
> 
> :-O ...  my puny 20TB ext4 filesystem did not do something like
> this, yet.

What sort of experience do you have with using a filesystem > 20TB?
I don't think there are many users out there yet that are doing this
today, so it would be great if you could share some data with us.

So far, we've only been doing testing and benchmarking (mke2fs, e2fsck
times, IO and metadata load tests, etc) and I don't know that all of
the "real world" corner cases have been tested yet.

Cheers, Andreas

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20
  2011-06-18 19:44       ` Andreas Dilger
@ 2011-06-19 11:28         ` Stephan Boettcher
  0 siblings, 0 replies; 9+ messages in thread
From: Stephan Boettcher @ 2011-06-19 11:28 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: ext4 development

Andreas Dilger <adilger@dilger.ca> writes:

> On 2011-06-18, at 11:39 AM, Stephan Boettcher wrote:
>> Andreas Dilger <adilger@dilger.ca> writes:
>>> There are a few places in the ext4 mount that are doing large
>>> allocations. In some places they fall back to vmalloc, so they should
>>> really be done with __GFP_NOWARN.
>>> 
>>> A few places don't yet fall back to vmalloc(), which is a problem
>>> with fragmented memory or very large filesystems. We were trying to
>>> test a 192TB ext4 filesystem, but were unable to mount it without
>>> patching the kernel.
>> 
>> :-O ...  my puny 20TB ext4 filesystem did not do something like
>> this, yet.
>
> What sort of experience do you have with using a filesystem > 20TB?
> I don't think there are many users out there yet that are doing this
> today, so it would be great if you could share some data with us.

I will, as soon as something interesting shows up.  Currently it is
offline, I need to buy some hardware for the frontend.

The setup is nfs-md-nbd-md-sata, RAID5², 3*(6*2TB), mostly for backups.  The
aim is to keep some old solid 32-bit servers useful for a little longer.

Three 32-bit servers each provide a 10TB nbd to the frontend, which
must be 64-bit.  The frontend, which ran the outer md-RAID5 on the three
nbds, was an Atom525, which I had to return to its original duties last week.
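
The capacity arithmetic behind "RAID5², 3*(6*2TB)" can be sketched in a few lines of Python. This is only an illustration: it assumes plain RAID5 (one member's worth of parity per array) at both levels and ignores md and filesystem metadata overhead; the function name is made up.

```python
def raid5_usable(members, member_tb):
    """Usable capacity of a RAID5 array: one member's worth goes to parity."""
    if members < 3:
        raise ValueError("RAID5 needs at least three members")
    return (members - 1) * member_tb


# Inner level: each of the three servers runs RAID5 over six 2TB disks.
inner_tb = raid5_usable(6, 2)
# Outer level: the frontend runs RAID5 over the three exported nbd devices.
outer_tb = raid5_usable(3, inner_tb)
raw_tb = 3 * 6 * 2

print(f"per-server nbd: {inner_tb}TB")
print(f"outer array:    {outer_tb}TB of {raw_tb}TB raw "
      f"({outer_tb / raw_tb:.0%} efficiency)")
```

This gives 10TB per server and 20TB for the outer array, matching the figures in the message, with 36TB of raw disk behind it.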

So far I have filled it to about 25% with backups via rsync.

I did not observe any problems with the filesystem.  I ran fsck several
times, and it was surprisingly fast.  The problems I had were of the kind
that I could not log in to any of the servers while they were busy
rebuilding the RAID.  This will get solved with a little more networking
gear.

As soon as I get new frontend hardware, I can run some tests, if
somebody tells me what to run and how.  The data that is currently on
there is expendable.  The tests should not target performance of any kind,
for obvious reasons.

> So far, we've only been doing testing and benchmarking (mke2fs, e2fsck
> times, IO and metadata load tests, etc) and I don't know that all of
> the "real world" corner cases have been tested yet.

Well, all the real world corner cases will be well out of my reach with
this setup.

-- 
Stephan

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-06-19 11:28 UTC | newest]

Thread overview: 9+ messages
2011-06-18  1:16 2.6.39.1: Intel I340-T4: irq/64-eth3-TxR: page allocation failure. order:1, mode:0x20 Justin Piszcz
2011-06-18 16:19 ` Mark Lord
2011-06-18 16:19   ` Mark Lord
2011-06-18 16:21   ` Mark Lord
2011-06-18 17:05   ` Andreas Dilger
2011-06-18 17:05     ` Andreas Dilger
2011-06-18 17:39     ` Stephan Boettcher
2011-06-18 19:44       ` Andreas Dilger
2011-06-19 11:28         ` Stephan Boettcher
