* 2.6.34.1 page allocation failure
@ 2010-08-22  6:13 Stan Hoeppner
  2010-08-22  6:47   ` Mikael Abrahamsson
  0 siblings, 1 reply; 29+ messages in thread
From: Stan Hoeppner @ 2010-08-22  6:13 UTC (permalink / raw)
  To: Linux Kernel List

I'm not subscribed to lkml so please CC me in replies.  First post.

Mobo:    Abit BP6, dual Celeron 366@500, i440BX chipset, 384MB PC100
Disk:    SiI 3512 PCI (sata_sil, libata), 1 x WD5000AAKS 500 GB SATAII
Kernel:  vanilla 2.6.34.1, 32 bit x86, SMP, Celeron pre Coppermine
OS:      Debian 5.0.5 (Stable)
Build:   kernel configured via make menuconfig
         no modules, no initrd
         built via "make KDEB_PKGVERSION="
         installed via dpkg, bootloader is LILO
Role:    headless SOHO server, run level 2, _very_ light load
         Postfix, pdns-recursor, Dovecot, Lighttpd, Roundcube, Samba
         bulk of system memory (>300MB) is consumed by buffers/cache
Issue:   AFAIK, these errors never occurred with any revisions of
         2.6.26, .31, or .32.  After installing 2.6.34.1 I've noticed
         the following errors in dmesg.  I see 6 of these, including
         two errors each for kswapd0, lighttpd, and smtpd, all not
         tainted.  AFAICT everything is still running fine.  Are these
         critical errors?  If so, how do I fix them?

kswapd0: page allocation failure. order:1, mode:0x20
Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1
Call Trace:
 [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a
 [<c1062ffb>] ? cache_alloc_refill+0x22f/0x422
 [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4
 [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a
 [<c11773a5>] ? sk_prot_alloc+0x19/0x55
 [<c117744b>] ? sk_clone+0x16/0x1cc
 [<c119a71d>] ? inet_csk_clone+0xf/0x80
 [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8
 [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151
 [<c11abf9d>] ? tcp_check_req+0x209/0x335
 [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d
 [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d
 [<c1193ba4>] ? ip_local_deliver+0x76/0xc0
 [<c1193b10>] ? ip_rcv+0x3dc/0x3fa
 [<c103655e>] ? ktime_get_real+0xf/0x2b
 [<c117f8d3>] ? netif_receive_skb+0x219/0x234
 [<c115ff46>] ? e100_poll+0x1d0/0x47e
 [<c117fa98>] ? net_rx_action+0x58/0xf8
 [<c102539c>] ? __do_softirq+0x78/0xe5
 [<c102542c>] ? do_softirq+0x23/0x27
 [<c1003955>] ? do_IRQ+0x7d/0x8e
 [<c1002aa9>] ? common_interrupt+0x29/0x30
 [<c1062870>] ? kmem_cache_free+0xbd/0xc5
 [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f
 [<c1075215>] ? destroy_inode+0x1c/0x2b
 [<c10752ce>] ? dispose_list+0xaa/0xd0
 [<c107548c>] ? shrink_icache_memory+0x198/0x1c5
 [<c104f76b>] ? shrink_slab+0xda/0x12f
 [<c104fc28>] ? kswapd+0x468/0x63b
 [<c104dca3>] ? isolate_pages_global+0x0/0x1bc
 [<c10304d6>] ? autoremove_wake_function+0x0/0x2d
 [<c1018faf>] ? complete+0x28/0x36
 [<c104f7c0>] ? kswapd+0x0/0x63b
 [<c10301cd>] ? kthread+0x61/0x66
 [<c103016c>] ? kthread+0x0/0x66
 [<c1002ab6>] ? kernel_thread_helper+0x6/0x10
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 180
CPU    1: hi:  186, btch:  31 usd:  29
active_anon:646 inactive_anon:4337 isolated_anon:0
 active_file:27189 inactive_file:35957 isolated_file:0
 unevictable:0 dirty:56 writeback:0 unstable:0
 free:1142 slab_reclaimable:25495 slab_unreclaimable:1020
 mapped:3116 shmem:143 pagetables:123 bounce:0
DMA free:1568kB min:100kB low:124kB high:148kB active_anon:0kB
inactive_anon:4kB active_file:5704kB inactive_file:7732kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB
mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:0kB
slab_reclaimable:912kB slab_unreclaimable:52kB kernel_stack:0kB
pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
lowmem_reserve[]: 0 365 365
Normal free:3000kB min:2392kB low:2988kB high:3588kB active_anon:2584kB
inactive_anon:17344kB active_file:103052kB inactive_file:136096kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:373888kB
mlocked:0kB dirty:224kB writeback:0kB mapped:12436kB shmem:572kB
slab_reclaimable:101068kB slab_unreclaimable:4028kB kernel_stack:520kB
pagetables:492kB unstable:0kB bounce:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 391*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
0*2048kB 0*4096kB = 1564kB
Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 3000kB
63342 total pagecache pages
23 pages in swap cache
Swap cache stats: add 159, delete 136, find 401/412
Free swap  = 995636kB
Total swap = 995992kB
98303 pages RAM
1638 pages reserved
22416 pages shared
76947 pages non-shared

Thanks.

-- 
Stan

* Re: 2.6.34.1 page allocation failure
  2010-08-22  6:13 2.6.34.1 page allocation failure Stan Hoeppner
@ 2010-08-22  6:47   ` Mikael Abrahamsson
  0 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-22  6:47 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: Linux Kernel List, linux-mm

On Sun, 22 Aug 2010, Stan Hoeppner wrote:

> I'm not subscribed to lkml so please CC me in replies.  First post.

I'm seeing similar problems on older kernels (.24 up to .32).

<http://www.spinics.net/lists/linux-mm/msg07808.html>

I didn't get any response at all, on either linux-mm or lkml. Our
problems seem very similar, but I'm running 64-bit and I have 8 GB of
RAM.

Personally I can avoid this by tuning down my TCP settings so TCP uses
less memory, but I don't think that workaround is very good; this
shouldn't happen. My machine also sometimes freezes up (pressing Caps
Lock does nothing); at other times it just logs the error.

> Mobo:    Abit BP6, dual Celeron 366@500, i440BX chipset, 384MB PC100
> Disk:    SiI 3512 PCI (sata_sil, libata), 1 x WD5000AAKS 500 GB SATAII
> Kernel:  vanilla 2.6.34.1, 32 bit x86, SMP, Celeron pre Coppermine
> OS:      Debian 5.0.5 (Stable)
> Build:   kernel configured via make menuconfig
>         no modules, no initrd
>         built via "make KDEB_PKGVERSION="
>         installed via dpkg, bootloader is LILO
> Role:    headless SOHO server, run level 2, _very_ light load
>         Postfix, pdns-recursor, Dovecot, Lighttpd, Roundcube, Samba
>         bulk of system memory (>300MB) is consumed by buffers/cache
> Issue:   AFAIK, these errors never occurred with any revisions of
>         2.6.26, .31, or .32.  After installing 2.6.34.1 I've noticed
>         the following errors in dmesg.  I see 6 of these, including
>         two errors each for kswapd0, lighttpd, and smtpd, all not
>         tainted.  AFAICT everything is still running fine.  Are these
>         critical errors?  If so, how do I fix?
>
> kswapd0: page allocation failure. order:1, mode:0x20
> Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1
> Call Trace:
> [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a
> [<c1062ffb>] ? cache_alloc_refill+0x22f/0x422
> [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4
> [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a
> [<c11773a5>] ? sk_prot_alloc+0x19/0x55
> [<c117744b>] ? sk_clone+0x16/0x1cc
> [<c119a71d>] ? inet_csk_clone+0xf/0x80
> [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8
> [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151
> [<c11abf9d>] ? tcp_check_req+0x209/0x335
> [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d
> [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d
> [<c1193ba4>] ? ip_local_deliver+0x76/0xc0
> [<c1193b10>] ? ip_rcv+0x3dc/0x3fa
> [<c103655e>] ? ktime_get_real+0xf/0x2b
> [<c117f8d3>] ? netif_receive_skb+0x219/0x234
> [<c115ff46>] ? e100_poll+0x1d0/0x47e
> [<c117fa98>] ? net_rx_action+0x58/0xf8
> [<c102539c>] ? __do_softirq+0x78/0xe5
> [<c102542c>] ? do_softirq+0x23/0x27
> [<c1003955>] ? do_IRQ+0x7d/0x8e
> [<c1002aa9>] ? common_interrupt+0x29/0x30
> [<c1062870>] ? kmem_cache_free+0xbd/0xc5
> [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f
> [<c1075215>] ? destroy_inode+0x1c/0x2b
> [<c10752ce>] ? dispose_list+0xaa/0xd0
> [<c107548c>] ? shrink_icache_memory+0x198/0x1c5
> [<c104f76b>] ? shrink_slab+0xda/0x12f
> [<c104fc28>] ? kswapd+0x468/0x63b
> [<c104dca3>] ? isolate_pages_global+0x0/0x1bc
> [<c10304d6>] ? autoremove_wake_function+0x0/0x2d
> [<c1018faf>] ? complete+0x28/0x36
> [<c104f7c0>] ? kswapd+0x0/0x63b
> [<c10301cd>] ? kthread+0x61/0x66
> [<c103016c>] ? kthread+0x0/0x66
> [<c1002ab6>] ? kernel_thread_helper+0x6/0x10
> Mem-Info:
> DMA per-cpu:
> CPU    0: hi:    0, btch:   1 usd:   0
> CPU    1: hi:    0, btch:   1 usd:   0
> Normal per-cpu:
> CPU    0: hi:  186, btch:  31 usd: 180
> CPU    1: hi:  186, btch:  31 usd:  29
> active_anon:646 inactive_anon:4337 isolated_anon:0
> active_file:27189 inactive_file:35957 isolated_file:0
> unevictable:0 dirty:56 writeback:0 unstable:0
> free:1142 slab_reclaimable:25495 slab_unreclaimable:1020
> mapped:3116 shmem:143 pagetables:123 bounce:0
> DMA free:1568kB min:100kB low:124kB high:148kB active_anon:0kB
> inactive_anon:4kB active_file:5704kB inactive_file:7732kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB
> mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:0kB
> slab_reclaimable:912kB slab_unreclaimable:52kB kernel_stack:0kB
> pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> lowmem_reserve[]: 0 365 365
> Normal free:3000kB min:2392kB low:2988kB high:3588kB active_anon:2584kB
> inactive_anon:17344kB active_file:103052kB inactive_file:136096kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:373888kB
> mlocked:0kB dirty:224kB writeback:0kB mapped:12436kB shmem:572kB
> slab_reclaimable:101068kB slab_unreclaimable:4028kB kernel_stack:520kB
> pagetables:492kB unstable:0kB bounce:0kB writeback_tmp:0kB
> pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> DMA: 391*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
> 0*2048kB 0*4096kB = 1564kB
> Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
> 0*1024kB 0*2048kB 0*4096kB = 3000kB
> 63342 total pagecache pages
> 23 pages in swap cache
> Swap cache stats: add 159, delete 136, find 401/412
> Free swap  = 995636kB
> Total swap = 995992kB
> 98303 pages RAM
> 1638 pages reserved
> 22416 pages shared
> 76947 pages non-shared
>
> Thanks.
>
> -- 
> Stan

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

* Re: 2.6.34.1 page allocation failure
  2010-08-22  6:47   ` Mikael Abrahamsson
@ 2010-08-22 19:51     ` Pekka Enberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Pekka Enberg @ 2010-08-22 19:51 UTC (permalink / raw)
  To: Mikael Abrahamsson
  Cc: Stan Hoeppner, Linux Kernel List, linux-mm, Mel Gorman,
	Christoph Lameter

On Sun, Aug 22, 2010 at 9:47 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> On Sun, 22 Aug 2010, Stan Hoeppner wrote:
>
>> I'm not subscribed to lkml so please CC me in replies.  First post.
>
> I'm seeing similar problems on older kernels (.24 up to .32).
>
> <http://www.spinics.net/lists/linux-mm/msg07808.html>
>
> I didn't get any response at all, neither on linux-mm or lkml... Our
> problems seem very similar, but I'm running 64bit and I have 8 gigs of ram.
>
> Personally I can avoid this by tuning down my TCP settings so TCP uses less
> memory, but I don't think that workaround is very good, this shouldn't
> happen. My machine also freezes up (pressing caps lock doesn't work)
> sometimes, sometimes it just logs the error.
>
>> Mobo:    Abit BP6, dual Celeron 366@500, i440BX chipset, 384MB PC100
>> Disk:    SiI 3512 PCI (sata_sil, libata), 1 x WD5000AAKS 500 GB SATAII
>> Kernel:  vanilla 2.6.34.1, 32 bit x86, SMP, Celeron pre Coppermine
>> OS:      Debian 5.0.5 (Stable)
>> Build:   kernel configured via make menuconfig
>>        no modules, no initrd
>>        built via "make KDEB_PKGVERSION="
>>        installed via dpkg, bootloader is LILO
>> Role:    headless SOHO server, run level 2, _very_ light load
>>        Postfix, pdns-recursor, Dovecot, Lighttpd, Roundcube, Samba
>>        bulk of system memory (>300MB) is consumed by buffers/cache
>> Issue:   AFAIK, these errors never occurred with any revisions of
>>        2.6.26, .31, or .32.  After installing 2.6.34.1 I've noticed
>>        the following errors in dmesg.  I see 6 of these, including
>>        two errors each for kswapd0, lighttpd, and smtpd, all not
>>        tainted.  AFAICT everything is still running fine.  Are these
>>        critical errors?  If so, how do I fix?
>>
>> kswapd0: page allocation failure. order:1, mode:0x20
>> Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1
>> Call Trace:
>> [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a
>> [<c1062ffb>] ? cache_alloc_refill+0x22f/0x422
>> [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4
>> [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a
>> [<c11773a5>] ? sk_prot_alloc+0x19/0x55
>> [<c117744b>] ? sk_clone+0x16/0x1cc
>> [<c119a71d>] ? inet_csk_clone+0xf/0x80
>> [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8
>> [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151
>> [<c11abf9d>] ? tcp_check_req+0x209/0x335
>> [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d
>> [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d
>> [<c1193ba4>] ? ip_local_deliver+0x76/0xc0
>> [<c1193b10>] ? ip_rcv+0x3dc/0x3fa
>> [<c103655e>] ? ktime_get_real+0xf/0x2b
>> [<c117f8d3>] ? netif_receive_skb+0x219/0x234
>> [<c115ff46>] ? e100_poll+0x1d0/0x47e
>> [<c117fa98>] ? net_rx_action+0x58/0xf8
>> [<c102539c>] ? __do_softirq+0x78/0xe5
>> [<c102542c>] ? do_softirq+0x23/0x27
>> [<c1003955>] ? do_IRQ+0x7d/0x8e
>> [<c1002aa9>] ? common_interrupt+0x29/0x30
>> [<c1062870>] ? kmem_cache_free+0xbd/0xc5
>> [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f
>> [<c1075215>] ? destroy_inode+0x1c/0x2b
>> [<c10752ce>] ? dispose_list+0xaa/0xd0
>> [<c107548c>] ? shrink_icache_memory+0x198/0x1c5
>> [<c104f76b>] ? shrink_slab+0xda/0x12f
>> [<c104fc28>] ? kswapd+0x468/0x63b
>> [<c104dca3>] ? isolate_pages_global+0x0/0x1bc
>> [<c10304d6>] ? autoremove_wake_function+0x0/0x2d
>> [<c1018faf>] ? complete+0x28/0x36
>> [<c104f7c0>] ? kswapd+0x0/0x63b
>> [<c10301cd>] ? kthread+0x61/0x66
>> [<c103016c>] ? kthread+0x0/0x66
>> [<c1002ab6>] ? kernel_thread_helper+0x6/0x10
>> Mem-Info:
>> DMA per-cpu:
>> CPU    0: hi:    0, btch:   1 usd:   0
>> CPU    1: hi:    0, btch:   1 usd:   0
>> Normal per-cpu:
>> CPU    0: hi:  186, btch:  31 usd: 180
>> CPU    1: hi:  186, btch:  31 usd:  29
>> active_anon:646 inactive_anon:4337 isolated_anon:0
>> active_file:27189 inactive_file:35957 isolated_file:0
>> unevictable:0 dirty:56 writeback:0 unstable:0
>> free:1142 slab_reclaimable:25495 slab_unreclaimable:1020
>> mapped:3116 shmem:143 pagetables:123 bounce:0
>> DMA free:1568kB min:100kB low:124kB high:148kB active_anon:0kB
>> inactive_anon:4kB active_file:5704kB inactive_file:7732kB
>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB
>> mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:0kB
>> slab_reclaimable:912kB slab_unreclaimable:52kB kernel_stack:0kB
>> pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
>> all_unreclaimable? no
>> lowmem_reserve[]: 0 365 365
>> Normal free:3000kB min:2392kB low:2988kB high:3588kB active_anon:2584kB
>> inactive_anon:17344kB active_file:103052kB inactive_file:136096kB
>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:373888kB
>> mlocked:0kB dirty:224kB writeback:0kB mapped:12436kB shmem:572kB
>> slab_reclaimable:101068kB slab_unreclaimable:4028kB kernel_stack:520kB
>> pagetables:492kB unstable:0kB bounce:0kB writeback_tmp:0kB
>> pages_scanned:0 all_unreclaimable? no
>> lowmem_reserve[]: 0 0 0
>> DMA: 391*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
>> 0*2048kB 0*4096kB = 1564kB
>> Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
>> 0*1024kB 0*2048kB 0*4096kB = 3000kB
>> 63342 total pagecache pages
>> 23 pages in swap cache
>> Swap cache stats: add 159, delete 136, find 401/412
>> Free swap  = 995636kB
>> Total swap = 995992kB
>> 98303 pages RAM
>> 1638 pages reserved
>> 22416 pages shared
>> 76947 pages non-shared

In Stan's case, it's an order-1 GFP_ATOMIC allocation, but there are
only order-0 pages available. Mel, are there any recent page allocator
fixes in 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?
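
That reading can be checked against the Mem-Info dump. The following is a minimal sketch in Python, not part of the original mail. It assumes the per-zone free-list line has been rejoined onto one line (the mail wrapped it), and it assumes the 2.6.34-era GFP bit values from include/linux/gfp.h as recalled here, which should be verified against the actual tree. The helper names decode_gfp, free_blocks_by_order and can_satisfy are hypothetical.

```python
import re

# Assumed low GFP bits for a 2.6.34-era kernel; check include/linux/gfp.h.
GFP_BITS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}


def decode_gfp(mode):
    """Return the names of the known low GFP bits set in mode."""
    return [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]


def free_blocks_by_order(line, page_kb=4):
    """Parse 'Zone: N*4kB M*8kB ... = TkB' into {order: free block count}."""
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", line):
        # 4kB -> order 0, 8kB -> order 1, ... 4096kB -> order 10
        order = (int(size_kb) // page_kb).bit_length() - 1
        counts[order] = int(count)
    return counts


def can_satisfy(counts, order):
    """True if at least one free block of the requested order or larger exists."""
    return any(n > 0 for o, n in counts.items() if o >= order)


# Normal zone free list from the report, rejoined onto one line.
normal = ("Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
          "0*512kB 0*1024kB 0*2048kB 0*4096kB = 3000kB")

print(decode_gfp(0x20))                                # ['__GFP_HIGH']
counts = free_blocks_by_order(normal)
print(can_satisfy(counts, 0), can_satisfy(counts, 1))  # True False
```

Under those assumed flag values, mode:0x20 has __GFP_HIGH set and __GFP_WAIT clear, so the caller cannot sleep to reclaim or compact memory. The free list shows 750 single pages and nothing larger, so an order-1 request has no contiguous block to use. The sketch ignores zone watermarks, which the real allocator also checks.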

* Re: 2.6.34.1 page allocation failure
  2010-08-22 19:51     ` Pekka Enberg
@ 2010-08-22 22:40       ` Christoph Lameter
  -1 siblings, 0 replies; 29+ messages in thread
From: Christoph Lameter @ 2010-08-22 22:40 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Mikael Abrahamsson, Stan Hoeppner, Linux Kernel List, linux-mm,
	Mel Gorman

On Sun, 22 Aug 2010, Pekka Enberg wrote:

> In Stan's case, it's a order-1 GFP_ATOMIC allocation but there are
> only order-0 pages available. Mel, any recent page allocator fixes in
> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?

This is the TCP slab? Best fix would be in the page allocator. However,
in this particular case the slub allocator would be able to fall back to
an order 0 allocation and still satisfy the request.



* Re: 2.6.34.1 page allocation failure
  2010-08-22 22:40       ` Christoph Lameter
@ 2010-08-23  9:37         ` Pekka Enberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Pekka Enberg @ 2010-08-23  9:37 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Mikael Abrahamsson, Stan Hoeppner, Linux Kernel List, linux-mm,
	Mel Gorman

  On 8/23/10 1:40 AM, Christoph Lameter wrote:
> On Sun, 22 Aug 2010, Pekka Enberg wrote:
>
>> In Stan's case, it's a order-1 GFP_ATOMIC allocation but there are
>> only order-0 pages available. Mel, any recent page allocator fixes in
>> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?
> This is the TCP slab? Best fix would be in the page allocator. However,
> in this particular case the slub allocator would be able to fall back to
> an order 0 allocation and still satisfy the request.
>
Looking at the stack trace of the allocation failure (cache_alloc_refill()
is a SLAB function), I think Stan has CONFIG_SLAB, which doesn't have an
order-0 fallback.
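
The difference between the two allocators, as described in this thread, can be illustrated with a toy model. This is a sketch in Python, not kernel code: take_block and new_slab are hypothetical names, and the free-list numbers are taken from the Normal zone line in Stan's report. It only models "try the preferred order, then optionally retry at the minimum order".

```python
def take_block(free_by_order, order):
    """Toy page allocator: take one free block of exactly this order.

    Splitting larger blocks is not modelled, because the dump in this
    thread shows no free blocks above order 0.
    """
    if free_by_order.get(order, 0) > 0:
        free_by_order[order] -= 1
        return True
    return False


def new_slab(free_by_order, preferred_order, min_order, fallback):
    """Return the order of the slab obtained, or None if allocation failed."""
    if take_block(free_by_order, preferred_order):
        return preferred_order
    # Fallback path: retry with the smallest order that still fits one object.
    if fallback and min_order < preferred_order:
        if take_block(free_by_order, min_order):
            return min_order
    return None


free = {0: 750, 1: 0}  # Normal zone: 750 single pages, no order-1 blocks

print(new_slab(dict(free), 1, 0, fallback=False))  # None
print(new_slab(dict(free), 1, 0, fallback=True))   # 0
```

Without the fallback the order-1 request fails outright, which matches the reported failure. With it, the same request is satisfied from a single page, assuming one object of that cache fits in an order-0 slab.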


* Re: 2.6.34.1 page allocation failure
  2010-08-23  9:37         ` Pekka Enberg
@ 2010-08-23 22:35           ` Stan Hoeppner
  -1 siblings, 0 replies; 29+ messages in thread
From: Stan Hoeppner @ 2010-08-23 22:35 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Mikael Abrahamsson, Linux Kernel List,
	linux-mm, Mel Gorman

Pekka Enberg put forth on 8/23/2010 4:37 AM:
>  On 8/23/10 1:40 AM, Christoph Lameter wrote:
>> On Sun, 22 Aug 2010, Pekka Enberg wrote:
>>
>>> In Stan's case, it's a order-1 GFP_ATOMIC allocation but there are
>>> only order-0 pages available. Mel, any recent page allocator fixes in
>>> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?
>> This is the TCP slab? Best fix would be in the page allocator. However,
>> in this particular case the slub allocator would be able to fall back to
>> an order 0 allocation and still satisfy the request.
>>
> Looking at the stack trace of the oops, I think Stan has CONFIG_SLAB
> which doesn't have order-0 fallback.

That is correct.  The menuconfig help screen led me to believe the SLAB
allocator was the "safe" choice:

"CONFIG_SLAB:
The regular slab allocator that is established and known to work well in
all environments"

Should I be using SLUB instead?  Any downsides to SLUB on an old and
slow (500 MHz) single core dual CPU box with <512MB RAM?

Also, what is the impact of these oopses?  Despite the entries in dmesg,
the system "seems" to be running ok.  Or is this simply the calm before
the impending storm?

-- 
Stan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-23 22:35           ` Stan Hoeppner
@ 2010-08-24 17:13             ` Christoph Lameter
  -1 siblings, 0 replies; 29+ messages in thread
From: Christoph Lameter @ 2010-08-24 17:13 UTC (permalink / raw)
  To: Stan Hoeppner
  Cc: Pekka Enberg, Mikael Abrahamsson, Linux Kernel List, linux-mm,
	Mel Gorman

On Mon, 23 Aug 2010, Stan Hoeppner wrote:

> Should I be using SLUB instead?  Any downsides to SLUB on an old and
> slow (500 MHz) single core dual CPU box with <512MB RAM?

SLUB has a smaller memory footprint so you may come out ahead for
such a small system in particular.

> Also, what is the impact of these oopses?  Despite the entries in dmesg,
> the system "seems" to be running ok.  Or is this simply the calm before
> the impending storm?

The system does not guarantee that GFP_ATOMIC allocations succeed, so any
caller must provide logic to fall back if no memory is allocated. So the
effect may just be that certain OS operations have to be retried.
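
The pattern described above, a non-blocking allocation that may fail plus a caller that copes with the failure, can be sketched as a toy model. Nothing here is kernel API: `alloc_atomic`, `handle_packet`, the `pool` dictionary and the `stats` counters are hypothetical names used only to show the control flow.

```python
def alloc_atomic(pool):
    """Non-blocking allocation from a toy pool: never waits, may return None."""
    if pool["free"] > 0:
        pool["free"] -= 1
        return object()
    return None


def handle_packet(pool, stats):
    """Process one incoming packet, dropping it if no memory is free right now."""
    buf = alloc_atomic(pool)
    if buf is None:
        # Fallback path: give up on this packet and rely on a later retry
        # (for TCP, a retransmission by the peer).
        stats["dropped"] += 1
        return False
    stats["handled"] += 1
    pool["free"] += 1  # buffer released once the packet has been processed
    return True


pool = {"free": 0}
stats = {"dropped": 0, "handled": 0}
print(handle_packet(pool, stats))  # no memory: the packet is dropped
pool["free"] = 1                   # memory has been reclaimed in the meantime
print(handle_packet(pool, stats))  # the retried operation succeeds
```

In this model the failed allocation costs one dropped packet rather than a crash, which is the "operations have to be retried" outcome.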



^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-23 22:35           ` Stan Hoeppner
@ 2010-08-24 18:03             ` Pekka Enberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Pekka Enberg @ 2010-08-24 18:03 UTC (permalink / raw)
  To: Stan Hoeppner
  Cc: Christoph Lameter, Mikael Abrahamsson, Linux Kernel List,
	linux-mm, Mel Gorman, Linux Netdev List

  [ I'm CC'ing netdev. ]

On 24.8.2010 1.35, Stan Hoeppner wrote:
> Pekka Enberg put forth on 8/23/2010 4:37 AM:
>>   On 8/23/10 1:40 AM, Christoph Lameter wrote:
>>> On Sun, 22 Aug 2010, Pekka Enberg wrote:
>>>
>>>> In Stan's case, it's a order-1 GFP_ATOMIC allocation but there are
>>>> only order-0 pages available. Mel, any recent page allocator fixes in
>>>> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?
>>> This is the TCP slab? Best fix would be in the page allocator. However,
>>> in this particular case the slub allocator would be able to fall back to
>>> an order 0 allocation and still satisfy the request.
>> Looking at the stack trace of the oops, I think Stan has CONFIG_SLAB
>> which doesn't have order-0 fallback.
> That is correct.  The menuconfig help screen led me to believe the SLAB
> allocator was the "safe" choice:
>
> "CONFIG_SLAB:
> The regular slab allocator that is established and known to work well in
> all environments"
>
> Should I be using SLUB instead?  Any downsides to SLUB on an old and
> slow (500 MHz) single core dual CPU box with <512MB RAM?

I don't think the problem here is SLAB so it shouldn't matter which one
you use. You might not see the problems with SLUB, though, because it
falls back to 0-order allocations.

> Also, what is the impact of these oopses?  Despite the entries in dmesg,
> the system "seems" to be running ok.  Or is this simply the calm before
> the impending storm?

The page allocation failure in question is this:

kswapd0: page allocation failure. order:1, mode:0x20
Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1
Call Trace:
  [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a
  [<c1062ffb>] ? cache_alloc_refill+0x22f/0x422
  [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4
  [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a
  [<c11773a5>] ? sk_prot_alloc+0x19/0x55
  [<c117744b>] ? sk_clone+0x16/0x1cc
  [<c119a71d>] ? inet_csk_clone+0xf/0x80
  [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8
  [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151
  [<c11abf9d>] ? tcp_check_req+0x209/0x335
  [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d
  [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d
  [<c1193ba4>] ? ip_local_deliver+0x76/0xc0
  [<c1193b10>] ? ip_rcv+0x3dc/0x3fa
  [<c103655e>] ? ktime_get_real+0xf/0x2b
  [<c117f8d3>] ? netif_receive_skb+0x219/0x234
  [<c115ff46>] ? e100_poll+0x1d0/0x47e
  [<c117fa98>] ? net_rx_action+0x58/0xf8
  [<c102539c>] ? __do_softirq+0x78/0xe5
  [<c102542c>] ? do_softirq+0x23/0x27
  [<c1003955>] ? do_IRQ+0x7d/0x8e
  [<c1002aa9>] ? common_interrupt+0x29/0x30
  [<c1062870>] ? kmem_cache_free+0xbd/0xc5
  [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f
  [<c1075215>] ? destroy_inode+0x1c/0x2b
  [<c10752ce>] ? dispose_list+0xaa/0xd0
  [<c107548c>] ? shrink_icache_memory+0x198/0x1c5
  [<c104f76b>] ? shrink_slab+0xda/0x12f
  [<c104fc28>] ? kswapd+0x468/0x63b
  [<c104dca3>] ? isolate_pages_global+0x0/0x1bc
  [<c10304d6>] ? autoremove_wake_function+0x0/0x2d
  [<c1018faf>] ? complete+0x28/0x36
  [<c104f7c0>] ? kswapd+0x0/0x63b
  [<c10301cd>] ? kthread+0x61/0x66
  [<c103016c>] ? kthread+0x0/0x66
  [<c1002ab6>] ? kernel_thread_helper+0x6/0x10

It looks to me as if tcp_create_openreq_child() is able to cope with the 
situation so the warning could be harmless. If that's the case, we 
should probably stick a __GFP_NOWARN there.
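
For reading the first line of such a report, a small decoder can help. This is a sketch under stated assumptions: the bit values in `GFP_BITS` are as I recall them from include/linux/gfp.h of the 2.6.3x era (where GFP_ATOMIC is just `__GFP_HIGH`) and have moved between releases, so check your own tree; 4 KiB pages are assumed; `decode_failure` is a hypothetical helper name.

```python
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages

# Assumption: 2.6.3x-era values; only a subset of the flags is listed.
GFP_BITS = {
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x200: "__GFP_NOWARN",
    0x4000: "__GFP_COMP",
}
KNOWN_MASK = sum(GFP_BITS)

FAIL_RE = re.compile(
    r"(\S+): page allocation failure\. order:(\d+), mode:(0x[0-9a-fA-F]+)"
)


def decode_failure(line):
    """Decode a 'page allocation failure' line, or return None if it does not match."""
    m = FAIL_RE.search(line)
    if m is None:
        return None
    order = int(m.group(2))
    mode = int(m.group(3), 16)
    return {
        "comm": m.group(1),
        "order": order,
        "bytes": PAGE_SIZE << order,          # contiguous bytes requested
        "flags": [name for bit, name in sorted(GFP_BITS.items()) if mode & bit],
        "unknown_bits": mode & ~KNOWN_MASK,   # bits not in the table above
        "may_sleep": bool(mode & 0x10),       # __GFP_WAIT set?
        "warns": not (mode & 0x200),          # __GFP_NOWARN clear?
    }


print(decode_failure("kswapd0: page allocation failure. order:1, mode:0x20"))
```

Under these assumptions the report above is an 8 KiB contiguous request that may not sleep, and setting `__GFP_NOWARN` (mode 0x220 in this table) would suppress the message without changing whether the allocation succeeds.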

                 Pekka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-24 18:03             ` Pekka Enberg
@ 2010-08-24 19:08               ` Stan Hoeppner
  -1 siblings, 0 replies; 29+ messages in thread
From: Stan Hoeppner @ 2010-08-24 19:08 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Christoph Lameter, Mikael Abrahamsson, Linux Kernel List,
	linux-mm, Mel Gorman, Linux Netdev List

Pekka Enberg put forth on 8/24/2010 1:03 PM:

> It looks to me as if tcp_create_openreq_child() is able to cope with the
> situation so the warning could be harmless. If that's the case, we
> should probably stick a __GFP_NOWARN there.

If it would be helpful, here's a complete copy of dmesg:
http://www.hardwarefreak.com/2.6.34.1-dmesg-oopses.txt

Something I forgot to mention earlier is that every now and then I
unmount swap and drop caches to clear things out a bit.  Not sure if
that may be relevant, but since it has to do with memory allocation I
thought I'd mention it.

-- 
Stan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-24 18:03             ` Pekka Enberg
@ 2010-08-24 19:21               ` Mikael Abrahamsson
  -1 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-24 19:21 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm,
	Mel Gorman, Linux Netdev List

On Tue, 24 Aug 2010, Pekka Enberg wrote:

> It looks to me as if tcp_create_openreq_child() is able to cope with the 
> situation so the warning could be harmless. If that's the case, we 
> should probably stick a __GFP_NOWARN there.

What about my situation? (a complete dmesg can be had at 
<http://swm.pp.se/dmesg.100809-2.txt.gz>)

[87578.494471] swapper: page allocation failure. order:0, mode:0x4020
[87578.494476] Pid: 0, comm: swapper Not tainted 2.6.32-24-generic #39-Ubuntu
[87578.494480] Call Trace:
[87578.494483]  <IRQ>  [<ffffffff810fad0e>] __alloc_pages_slowpath+0x56e/0x580
[87578.494499]  [<ffffffff810fae7e>] __alloc_pages_nodemask+0x15e/0x1a0
[87578.494506]  [<ffffffff8112dba7>] alloc_pages_current+0x87/0xd0
[87578.494511]  [<ffffffff81133b17>] new_slab+0x2f7/0x310
[87578.494516]  [<ffffffff811363c1>] __slab_alloc+0x201/0x2d0
[87578.494522]  [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60
[87578.494528]  [<ffffffff81137408>] __kmalloc_node_track_caller+0xb8/0x180
[87578.494532]  [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60
[87578.494536]  [<ffffffff81455ca0>] __alloc_skb+0x80/0x190
[87578.494540]  [<ffffffff81455fe6>] __netdev_alloc_skb+0x36/0x60
[87578.494564]  [<ffffffffa008f5c7>] rtl8169_rx_interrupt+0x247/0x5b0 [r8169]
[87578.494572]  [<ffffffffa008faad>] rtl8169_poll+0x3d/0x270 [r8169]
[87578.494580]  [<ffffffff810397a9>] ? default_spin_lock_flags+0x9/0x10
[87578.494586]  [<ffffffff8146029f>] net_rx_action+0x10f/0x250
[87578.494594]  [<ffffffffa008d54e>] ? rtl8169_interrupt+0xde/0x1e0 [r8169]
[87578.494600]  [<ffffffff8106e467>] __do_softirq+0xb7/0x1e0
[87578.494605]  [<ffffffff810c52c0>] ? handle_IRQ_event+0x60/0x170
[87578.494610]  [<ffffffff810142ec>] call_softirq+0x1c/0x30
[87578.494614]  [<ffffffff81015cb5>] do_softirq+0x65/0xa0
[87578.494618]  [<ffffffff8106e305>] irq_exit+0x85/0x90
[87578.494623]  [<ffffffff81549515>] do_IRQ+0x75/0xf0
[87578.494627]  [<ffffffff81013b13>] ret_from_intr+0x0/0x11
[87578.494629]  <EOI>  [<ffffffff8130f7cb>] ? acpi_idle_enter_c1+0xa3/0xc1
[87578.494639]  [<ffffffff8130f7aa>] ? acpi_idle_enter_c1+0x82/0xc1
[87578.494646]  [<ffffffff8143a5a7>] ? cpuidle_idle_call+0xa7/0x140
[87578.494652]  [<ffffffff81011e73>] ? cpu_idle+0xb3/0x110
[87578.494657]  [<ffffffff8153e27e>] ? start_secondary+0xa8/0xaa
[87578.494660] Mem-Info:
[87578.494662] Node 0 DMA per-cpu:
[87578.494666] CPU    0: hi:    0, btch:   1 usd:   0
[87578.494669] CPU    1: hi:    0, btch:   1 usd:   0
[87578.494672] CPU    2: hi:    0, btch:   1 usd:   0
[87578.494674] CPU    3: hi:    0, btch:   1 usd:   0
[87578.494677] Node 0 DMA32 per-cpu:
[87578.494680] CPU    0: hi:  186, btch:  31 usd: 173
[87578.494683] CPU    1: hi:  186, btch:  31 usd:  87
[87578.494686] CPU    2: hi:  186, btch:  31 usd: 168
[87578.494689] CPU    3: hi:  186, btch:  31 usd:  63
[87578.494691] Node 0 Normal per-cpu:
[87578.494695] CPU    0: hi:  186, btch:  31 usd: 177
[87578.494698] CPU    1: hi:  186, btch:  31 usd: 176
[87578.494700] CPU    2: hi:  186, btch:  31 usd:  82
[87578.494703] CPU    3: hi:  186, btch:  31 usd: 191
[87578.494710] active_anon:22970 inactive_anon:6433 isolated_anon:0
[87578.494711]  active_file:916528 inactive_file:914736 isolated_file:0
[87578.494713]  unevictable:0 dirty:135959 writeback:24423 unstable:0
[87578.494714]  free:9990 slab_reclaimable:59767 slab_unreclaimable:11135
[87578.494716]  mapped:119343 shmem:985 pagetables:2113 bounce:0
[87578.494719] Node 0 DMA free:15860kB min:20kB low:24kB high:28kB 
active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15272kB 
mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 
all_unreclaimable? yes
[87578.494733] lowmem_reserve[]: 0 2866 7852 7852
[87578.494738] Node 0 DMA32 free:21420kB min:4136kB low:5168kB high:6204kB 
active_anon:4056kB inactive_anon:5856kB active_file:1322360kB 
inactive_file:1320432kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:2935456kB mlocked:0kB dirty:190824kB 
writeback:31900kB mapped:157676kB shmem:0kB slab_reclaimable:107316kB 
slab_unreclaimable:15480kB kernel_stack:56kB pagetables:764kB unstable:0kB 
bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[87578.494754] lowmem_reserve[]: 0 0 4986 4986
[87578.494759] Node 0 Normal free:2680kB min:7192kB low:8988kB 
high:10788kB active_anon:87824kB inactive_anon:19876kB 
active_file:2343752kB inactive_file:2338512kB unevictable:0kB 
isolated(anon):0kB isolated(file):0kB present:5105664kB mlocked:0kB 
dirty:353012kB writeback:65792kB mapped:319696kB shmem:3940kB 
slab_reclaimable:131752kB slab_unreclaimable:29060kB kernel_stack:2160kB 
pagetables:7688kB unstable:0kB bounce:0kB writeback_tmp:0kB 
pages_scanned:0 all_unreclaimable? no
[87578.494775] lowmem_reserve[]: 0 0 0 0
[87578.494779] Node 0 DMA: 3*4kB 3*8kB 3*16kB 1*32kB 2*64kB 2*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15860kB
[87578.494792] Node 0 DMA32: 789*4kB 765*8kB 589*16kB 1*32kB 1*64kB 4*128kB 4*256kB 2*512kB 0*1024kB 0*2048kB 0*4096kB = 21356kB
[87578.494805] Node 0 Normal: 374*4kB 4*8kB 20*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2648kB
[87578.494818] 1832322 total pagecache pages
[87578.494820] 0 pages in swap cache
[87578.494823] Swap cache stats: add 0, delete 0, find 0/0
[87578.494825] Free swap  = 0kB
[87578.494827] Total swap = 0kB
[87578.531041] 2064368 pages RAM
[87578.531044] 66019 pages reserved
[87578.531046] 1501227 pages shared
[87578.531048] 619257 pages non-shared
[87578.531053] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[87578.531057]   cache: kmalloc-4096, object size: 4096, buffer size: 4096, default order: 3, min order: 0
[87578.531061]   node 0: slabs: 1322, objs: 4129, free: 0
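
The per-zone free lists in the Mem-Info dump above can be checked mechanically. The sketch below parses one "Node N ZONE:" line, re-adds the block counts and compares the result with the printed total and with the zone's `min` value from the same log. `parse_buddy_line` is a hypothetical helper, and the input strings are copied from the log with the mail client's line wrapping undone.

```python
import re


def parse_buddy_line(line):
    """Return ({block_size_kB: free_count}, reported_total_kB) for one zone line."""
    counts = {int(kb): int(n) for n, kb in re.findall(r"(\d+)\*(\d+)kB", line)}
    total = int(re.search(r"=\s*(\d+)kB", line).group(1))
    return counts, total


normal = ("Node 0 Normal: 374*4kB 4*8kB 20*16kB 1*32kB 0*64kB 0*128kB "
          "1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2648kB")
NORMAL_MIN_KB = 7192  # "min:7192kB" from the Node 0 Normal line of the same dump

counts, total = parse_buddy_line(normal)
computed = sum(kb * n for kb, n in counts.items())
largest = max(kb for kb, n in counts.items() if n > 0)

print("sum matches reported total:", computed == total)
print("largest free block (kB):", largest)
print("free list below min watermark:", total < NORMAL_MIN_KB)
```

For the Normal zone the free list adds up to the reported 2648kB, far below that zone's 7192kB `min` value, and the SLUB message shows the failing request was already at the cache's minimum order of 0, so there was no smaller order left to fall back to.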

This actually made the machine go offline for hours before it for some 
reason came back. The second time this happened it did not come back 
(waited 8 hours).

I also seem to have TCP related problems:

[87578.531806]  [<ffffffff8113651f>] kmem_cache_alloc_node+0x8f/0x160
[87578.531812]  [<ffffffff81455c6f>] __alloc_skb+0x4f/0x190
[87578.531820]  [<ffffffff814acbe0>] ? tcp_delack_timer+0x0/0x270
[87578.531828]  [<ffffffff814ab423>] tcp_send_ack+0x33/0x120
[87578.531834]  [<ffffffff814acd22>] tcp_delack_timer+0x142/0x270
[87578.531842]  [<ffffffff8105a34d>] ? scheduler_tick+0x18d/0x260
[87578.531849]  [<ffffffff8107776b>] run_timer_softirq+0x19b/0x340
[87578.531857]  [<ffffffff81094ac0>] ? tick_sched_timer+0x0/0xc0
[87578.531865]  [<ffffffff8108f723>] ? ktime_get+0x63/0xe0
[87578.531871]  [<ffffffff8106e467>] __do_softirq+0xb7/0x1e0
[87578.531878]  [<ffffffff810946aa>] ? tick_program_event+0x2a/0x30
[87578.531885]  [<ffffffff810142ec>] call_softirq+0x1c/0x30
[87578.531891]  [<ffffffff81015cb5>] do_softirq+0x65/0xa0
[87578.531897]  [<ffffffff8106e305>] irq_exit+0x85/0x90
[87578.531904]  [<ffffffff81549601>] smp_apic_timer_interrupt+0x71/0x9c
[87578.531910]  [<ffffffff81013cb3>] apic_timer_interrupt+0x13/0x20
[87578.531914]  <EOI>  [<ffffffff8130fbbe>] ? acpi_idle_enter_simple+0x117/0x14b
[87578.531928]  [<ffffffff8130fbb7>] ? acpi_idle_enter_simple+0x110/0x14b
[87578.531936]  [<ffffffff8143a5a7>] ? cpuidle_idle_call+0xa7/0x140
[87578.531943]  [<ffffffff81011e73>] ? cpu_idle+0xb3/0x110
[87578.531950]  [<ffffffff8153e27e>] ? start_secondary+0xa8/0xaa


-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-24 19:21               ` Mikael Abrahamsson
@ 2010-08-29 10:49                 ` Pekka Enberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Pekka Enberg @ 2010-08-29 10:49 UTC (permalink / raw)
  To: Mikael Abrahamsson
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm,
	Mel Gorman, Linux Netdev List

  On 24.8.2010 22.21, Mikael Abrahamsson wrote:
> On Tue, 24 Aug 2010, Pekka Enberg wrote:
>
>> It looks to me as if tcp_create_openreq_child() is able to cope with 
>> the situation so the warning could be harmless. If that's the case, 
>> we should probably stick a __GFP_NOWARN there.
>
> What about my situation? (a complete dmesg can be had at 
> <http://swm.pp.se/dmesg.100809-2.txt.gz>)

This looks like something the kernel can't really recover from.

> [87578.494471] swapper: page allocation failure. order:0, mode:0x4020
> [87578.494476] Pid: 0, comm: swapper Not tainted 2.6.32-24-generic 
> #39-Ubuntu
> [87578.494480] Call Trace:
> [87578.494483] <IRQ>  [<ffffffff810fad0e>] 
> __alloc_pages_slowpath+0x56e/0x580
> [87578.494499]  [<ffffffff810fae7e>] __alloc_pages_nodemask+0x15e/0x1a0
> [87578.494506]  [<ffffffff8112dba7>] alloc_pages_current+0x87/0xd0
> [87578.494511]  [<ffffffff81133b17>] new_slab+0x2f7/0x310
> [87578.494516]  [<ffffffff811363c1>] __slab_alloc+0x201/0x2d0
> [87578.494522]  [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60
> [87578.494528]  [<ffffffff81137408>] 
> __kmalloc_node_track_caller+0xb8/0x180
> [87578.494532]  [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60
> [87578.494536]  [<ffffffff81455ca0>] __alloc_skb+0x80/0x190
> [87578.494540]  [<ffffffff81455fe6>] __netdev_alloc_skb+0x36/0x60
> [87578.494564]  [<ffffffffa008f5c7>] rtl8169_rx_interrupt+0x247/0x5b0 
> [r8169]
> [87578.494572]  [<ffffffffa008faad>] rtl8169_poll+0x3d/0x270 [r8169]
> [87578.494580]  [<ffffffff810397a9>] ? default_spin_lock_flags+0x9/0x10
> [87578.494586]  [<ffffffff8146029f>] net_rx_action+0x10f/0x250
> [87578.494594]  [<ffffffffa008d54e>] ? rtl8169_interrupt+0xde/0x1e0 
> [r8169]
> [87578.494600]  [<ffffffff8106e467>] __do_softirq+0xb7/0x1e0
> [87578.494605]  [<ffffffff810c52c0>] ? handle_IRQ_event+0x60/0x170
> [87578.494610]  [<ffffffff810142ec>] call_softirq+0x1c/0x30
> [87578.494614]  [<ffffffff81015cb5>] do_softirq+0x65/0xa0
> [87578.494618]  [<ffffffff8106e305>] irq_exit+0x85/0x90
> [87578.494623]  [<ffffffff81549515>] do_IRQ+0x75/0xf0
> [87578.494627]  [<ffffffff81013b13>] ret_from_intr+0x0/0x11
> [87578.494629] <EOI>  [<ffffffff8130f7cb>] ? acpi_idle_enter_c1+0xa3/0xc1
> [87578.494639]  [<ffffffff8130f7aa>] ? acpi_idle_enter_c1+0x82/0xc1
> [87578.494646]  [<ffffffff8143a5a7>] ? cpuidle_idle_call+0xa7/0x140
> [87578.494652]  [<ffffffff81011e73>] ? cpu_idle+0xb3/0x110
> [87578.494657]  [<ffffffff8153e27e>] ? start_secondary+0xa8/0xaa
> [87578.494660] Mem-Info:
> [87578.494662] Node 0 DMA per-cpu:
> [87578.494666] CPU    0: hi:    0, btch:   1 usd:   0
> [87578.494669] CPU    1: hi:    0, btch:   1 usd:   0
> [87578.494672] CPU    2: hi:    0, btch:   1 usd:   0
> [87578.494674] CPU    3: hi:    0, btch:   1 usd:   0
> [87578.494677] Node 0 DMA32 per-cpu:
> [87578.494680] CPU    0: hi:  186, btch:  31 usd: 173
> [87578.494683] CPU    1: hi:  186, btch:  31 usd:  87
> [87578.494686] CPU    2: hi:  186, btch:  31 usd: 168
> [87578.494689] CPU    3: hi:  186, btch:  31 usd:  63
> [87578.494691] Node 0 Normal per-cpu:
> [87578.494695] CPU    0: hi:  186, btch:  31 usd: 177
> [87578.494698] CPU    1: hi:  186, btch:  31 usd: 176
> [87578.494700] CPU    2: hi:  186, btch:  31 usd:  82
> [87578.494703] CPU    3: hi:  186, btch:  31 usd: 191
> [87578.494710] active_anon:22970 inactive_anon:6433 isolated_anon:0
> [87578.494711]  active_file:916528 inactive_file:914736 isolated_file:0
> [87578.494713]  unevictable:0 dirty:135959 writeback:24423 unstable:0
> [87578.494714]  free:9990 slab_reclaimable:59767 slab_unreclaimable:11135
> [87578.494716]  mapped:119343 shmem:985 pagetables:2113 bounce:0
> [87578.494719] Node 0 DMA free:15860kB min:20kB low:24kB high:28kB 
> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15272kB 
> mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB 
> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
> pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB 
> pages_scanned:0 all_unreclaimable? yes
> [87578.494733] lowmem_reserve[]: 0 2866 7852 7852
> [87578.494738] Node 0 DMA32 free:21420kB min:4136kB low:5168kB 
> high:6204kB active_anon:4056kB inactive_anon:5856kB 
> active_file:1322360kB inactive_file:1320432kB unevictable:0kB 
> isolated(anon):0kB isolated(file):0kB present:2935456kB mlocked:0kB 
> dirty:190824kB writeback:31900kB mapped:157676kB shmem:0kB 
> slab_reclaimable:107316kB slab_unreclaimable:15480kB kernel_stack:56kB 
> pagetables:764kB unstable:0kB bounce:0kB writeback_tmp:0kB 
> pages_scanned:0 all_unreclaimable? no
> [87578.494754] lowmem_reserve[]: 0 0 4986 4986
> [87578.494759] Node 0 Normal free:2680kB min:7192kB low:8988kB 
> high:10788kB active_anon:87824kB inactive_anon:19876kB 
> active_file:2343752kB inactive_file:2338512kB unevictable:0kB 
> isolated(anon):0kB isolated(file):0kB present:5105664kB mlocked:0kB 
> dirty:353012kB writeback:65792kB mapped:319696kB shmem:3940kB 
> slab_reclaimable:131752kB slab_unreclaimable:29060kB 
> kernel_stack:2160kB pagetables:7688kB unstable:0kB bounce:0kB 
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [87578.494775] lowmem_reserve[]: 0 0 0 0
> [87578.494779] Node 0 DMA: 3*4kB 3*8kB 3*16kB 1*32kB 2*64kB 2*128kB 
> 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15860kB
> [87578.494792] Node 0 DMA32: 789*4kB 765*8kB 589*16kB 1*32kB 1*64kB 
> 4*128kB 4*256kB 2*512kB 0*1024kB 0*2048kB 0*4096kB = 21356kB
> [87578.494805] Node 0 Normal: 374*4kB 4*8kB 20*16kB 1*32kB 0*64kB 
> 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2648kB

You seem to have 4K pages available still. I wonder why the page 
allocator isn't giving them to SLUB?

> [87578.494818] 1832322 total pagecache pages
> [87578.494820] 0 pages in swap cache
> [87578.494823] Swap cache stats: add 0, delete 0, find 0/0
> [87578.494825] Free swap  = 0kB
> [87578.494827] Total swap = 0kB
> [87578.531041] 2064368 pages RAM
> [87578.531044] 66019 pages reserved
> [87578.531046] 1501227 pages shared
> [87578.531048] 619257 pages non-shared
> [87578.531053] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> [87578.531057]   cache: kmalloc-4096, object size: 4096, buffer size: 
> 4096, default order: 3, min order: 0
> [87578.531061]   node 0: slabs: 1322, objs: 4129, free: 0
>
> This actually made the machine go offline for hours before it for some 
> reason came back. The second time this happened it did not come back 
> (waited 8 hours).

Do you see these out-of-memory problems with 2.6.35?
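
One possible reading of why the free 4kB pages were not handed out, offered as a rough sketch and not a diagnosis: an atomic allocation is still subject to a reduced min watermark plus the zone's lowmem_reserve. The model below covers only the first comparison of zone_watermark_ok() as it looked in 2.6.32-era mm/page_alloc.c, recalled from memory; the flag values, the halving for __GFP_HIGH and the further quartering for non-sleeping callers are assumptions to verify against the real source. The numbers are copied from the Mem-Info dump quoted above; the Python names are hypothetical.

```python
PAGE_KB = 4  # base page size on x86_64, in kB

# Assumed 2.6.3x-era GFP bit values (from memory, not from this thread).
GFP_WAIT = 0x10   # caller may sleep
GFP_HIGH = 0x20   # may dip into emergency reserves (GFP_ATOMIC)


def effective_min_pages(min_kb, gfp):
    """Min watermark in pages after the reductions an allocation is granted."""
    wmark = min_kb // PAGE_KB
    if gfp & GFP_HIGH:           # modelled on ALLOC_HIGH
        wmark -= wmark // 2
    if not (gfp & GFP_WAIT):     # modelled on ALLOC_HARDER for atomic callers
        wmark -= wmark // 4
    return wmark


def watermark_ok(free_kb, min_kb, reserve_pages, gfp, order=0):
    """True if the zone passes the simplified watermark comparison."""
    free = free_kb // PAGE_KB - (1 << order) + 1
    return free > effective_min_pages(min_kb, gfp) + reserve_pages


# zone: (free kB, min kB, lowmem_reserve[] entry for a Normal-zone request)
ZONES = {
    "DMA": (15860, 20, 7852),
    "DMA32": (21420, 4136, 4986),
    "Normal": (2680, 7192, 0),
}

MODE = 0x4020  # from "page allocation failure. order:0, mode:0x4020"

results = {name: watermark_ok(free, wmin, reserve, MODE)
           for name, (free, wmin, reserve) in ZONES.items()}
for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'below watermark'}")
```

With these inputs all three zones fail the check: Normal has 670 free pages against an effective minimum of 675, while DMA and DMA32 are held back by their lowmem_reserve. The counters in the dump are not sampled atomically, so a margin this thin is only suggestive.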

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-29 10:49                 ` Pekka Enberg
@ 2010-08-29 12:38                   ` Mikael Abrahamsson
  -1 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-29 12:38 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm,
	Mel Gorman, Linux Netdev List

On Sun, 29 Aug 2010, Pekka Enberg wrote:

> Do you see these out-of-memory problems with 2.6.35?

Haven't tried it.

Has there been substantial work done there that changes things so that if 
I reproduce it on 2.6.35, someone will look into the issue in earnest? 
Since I'll most likely have to compile a new kernel, are there any debug 
options I should enable to give more information to aid fault finding?

I'll start with the .config file from Ubuntu 10.04 2.6.32 kernel and 
oldconfig from there.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-29 12:38                   ` Mikael Abrahamsson
@ 2010-08-29 13:17                     ` Pekka Enberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Pekka Enberg @ 2010-08-29 13:17 UTC (permalink / raw)
  To: Mikael Abrahamsson
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm,
	Mel Gorman, Linux Netdev List

On 29.8.2010 15.38, Mikael Abrahamsson wrote:
> On Sun, 29 Aug 2010, Pekka Enberg wrote:
>> Do you see these out-of-memory problems with 2.6.35?
>
> Haven't tried it.
>
> Has there been substantial work done there that changes things so that 
> if I reproduce it on 2.6.35, someone will look into the issue in 
> earnest? Since I'll most likely have to compile a new kernel, are 
> there any debug options I should enable to give more information to 
> aid fault finding?

There aren't any debug options that need to be enabled. The reason I'm 
asking is because we had a bunch of similar issues being reported 
earlier that got fixed and it's been calm for a while. That's why it 
would be interesting to know if 2.6.35 or 2.6.36-rc2 (if it's not too 
unstable to test) fixes things.

             Pekka

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-29 13:17                     ` Pekka Enberg
@ 2010-08-29 15:37                       ` Mikael Abrahamsson
  -1 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-29 15:37 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm,
	Mel Gorman, Linux Netdev List

On Sun, 29 Aug 2010, Pekka Enberg wrote:

> There aren't any debug options that need to be enabled. The reason I'm 
> asking is because we had a bunch of similar issues being reported 
> earlier that got fixed and it's been calm for a while. That's why it 
> would be interesting to know if 2.6.35 or 2.6.36-rc2 (if it's not too 
> unstable to test) fixes things.

Oki, I have installed 2.6.35 now (found backport from ubuntu 10.10 for 
10.04), just need to do a reboot at some convenient time.

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: 2.6.34.1 page allocation failure
  2010-08-29 15:37                       ` Mikael Abrahamsson
@ 2010-08-31 20:28                         ` Mikael Abrahamsson
  -1 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-31 20:28 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm,
	Mel Gorman, Linux Netdev List

On Sun, 29 Aug 2010, Mikael Abrahamsson wrote:

> On Sun, 29 Aug 2010, Pekka Enberg wrote:
>
>> There aren't any debug options that need to be enabled. The reason I'm 
>> asking is because we had a bunch of similar issues being reported earlier 
>> that got fixed and it's been calm for a while. That's why it would be 
>> interesting to know if 2.6.35 or 2.6.36-rc2 (if it's not too unstable to 
>> test) fixes things.
>
> Oki, I have installed 2.6.35 now (found backport from ubuntu 10.10 for 
> 10.04), just need to do a reboot at some convenient time.

I just rebooted and ran a network+disk load similar to the one that made 
the machine give "swapper: page allocation failure" messages before, and 
I couldn't reproduce it with 2.6.35:

2.6.35-19-generic #25~lucid1-Ubuntu SMP Wed Aug 25 03:50:05 UTC 2010 x86_64 GNU/Linux

Running "sync" in the middle of the load took more than 5 minutes to 
complete (2 hung-task messages in dmesg), but at least nothing ran out of 
memory.

Considering the number of people who run 2.6.32 now and will keep running 
it in the future, it still worries me that this is present in 2.6.32 (and 
earlier kernels as well).

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se

^ permalink raw reply	[flat|nested] 29+ messages in thread

Thread overview: 29+ messages
2010-08-22  6:13 2.6.34.1 page allocation failure Stan Hoeppner
2010-08-22  6:47 ` Mikael Abrahamsson
2010-08-22  6:47   ` Mikael Abrahamsson
2010-08-22 19:51   ` Pekka Enberg
2010-08-22 19:51     ` Pekka Enberg
2010-08-22 22:40     ` Christoph Lameter
2010-08-22 22:40       ` Christoph Lameter
2010-08-23  9:37       ` Pekka Enberg
2010-08-23  9:37         ` Pekka Enberg
2010-08-23 22:35         ` Stan Hoeppner
2010-08-23 22:35           ` Stan Hoeppner
2010-08-24 17:13           ` Christoph Lameter
2010-08-24 17:13             ` Christoph Lameter
2010-08-24 18:03           ` Pekka Enberg
2010-08-24 18:03             ` Pekka Enberg
2010-08-24 19:08             ` Stan Hoeppner
2010-08-24 19:08               ` Stan Hoeppner
2010-08-24 19:21             ` Mikael Abrahamsson
2010-08-24 19:21               ` Mikael Abrahamsson
2010-08-29 10:49               ` Pekka Enberg
2010-08-29 10:49                 ` Pekka Enberg
2010-08-29 12:38                 ` Mikael Abrahamsson
2010-08-29 12:38                   ` Mikael Abrahamsson
2010-08-29 13:17                   ` Pekka Enberg
2010-08-29 13:17                     ` Pekka Enberg
2010-08-29 15:37                     ` Mikael Abrahamsson
2010-08-29 15:37                       ` Mikael Abrahamsson
2010-08-31 20:28                       ` Mikael Abrahamsson
2010-08-31 20:28                         ` Mikael Abrahamsson
