* 2.6.34.1 page allocation failure
@ 2010-08-22 6:13 Stan Hoeppner
  2010-08-22 6:47 ` Mikael Abrahamsson
  0 siblings, 1 reply; 29+ messages in thread
From: Stan Hoeppner @ 2010-08-22 6:13 UTC (permalink / raw)
  To: Linux Kernel List

I'm not subscribed to lkml so please CC me in replies. First post.

Mobo:   Abit BP6, dual Celeron 366@500, i440BX chipset, 384MB PC100
Disk:   SiI 3512 PCI (sata_sil, libata), 1 x WD5000AAKS 500 GB SATAII
Kernel: vanilla 2.6.34.1, 32 bit x86, SMP, Celeron pre Coppermine
OS:     Debian 5.0.5 (Stable)
Build:  kernel configured via make menuconfig
        no modules, no initrd
        built via "make KDEB_PKGVERSION="
        installed via dpkg, bootloader is LILO
Role:   headless SOHO server, run level 2, _very_ light load
        Postfix, pdns-recursor, Dovecot, Lighttpd, Roundcube, Samba
        bulk of system memory (>300MB) is consumed by buffers/cache
Issue:  AFAIK, these errors never occurred with any revisions of
        2.6.26, .31, or .32. After installing 2.6.34.1 I've noticed
        the following errors in dmesg. I see 6 of these, including
        two errors each for kswapd0, lighttpd, and smtpd, all not
        tainted. AFAICT everything is still running fine. Are these
        critical errors? If so, how do I fix?

kswapd0: page allocation failure. order:1, mode:0x20
Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1
Call Trace:
 [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a
 [<c1062ffb>] ? cache_alloc_refill+0x22f/0x422
 [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4
 [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a
 [<c11773a5>] ? sk_prot_alloc+0x19/0x55
 [<c117744b>] ? sk_clone+0x16/0x1cc
 [<c119a71d>] ? inet_csk_clone+0xf/0x80
 [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8
 [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151
 [<c11abf9d>] ? tcp_check_req+0x209/0x335
 [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d
 [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d
 [<c1193ba4>] ? ip_local_deliver+0x76/0xc0
 [<c1193b10>] ? ip_rcv+0x3dc/0x3fa
 [<c103655e>] ? ktime_get_real+0xf/0x2b
 [<c117f8d3>] ? netif_receive_skb+0x219/0x234
 [<c115ff46>] ? e100_poll+0x1d0/0x47e
 [<c117fa98>] ? net_rx_action+0x58/0xf8
 [<c102539c>] ? __do_softirq+0x78/0xe5
 [<c102542c>] ? do_softirq+0x23/0x27
 [<c1003955>] ? do_IRQ+0x7d/0x8e
 [<c1002aa9>] ? common_interrupt+0x29/0x30
 [<c1062870>] ? kmem_cache_free+0xbd/0xc5
 [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f
 [<c1075215>] ? destroy_inode+0x1c/0x2b
 [<c10752ce>] ? dispose_list+0xaa/0xd0
 [<c107548c>] ? shrink_icache_memory+0x198/0x1c5
 [<c104f76b>] ? shrink_slab+0xda/0x12f
 [<c104fc28>] ? kswapd+0x468/0x63b
 [<c104dca3>] ? isolate_pages_global+0x0/0x1bc
 [<c10304d6>] ? autoremove_wake_function+0x0/0x2d
 [<c1018faf>] ? complete+0x28/0x36
 [<c104f7c0>] ? kswapd+0x0/0x63b
 [<c10301cd>] ? kthread+0x61/0x66
 [<c103016c>] ? kthread+0x0/0x66
 [<c1002ab6>] ? kernel_thread_helper+0x6/0x10
Mem-Info:
DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
CPU 1: hi: 0, btch: 1 usd: 0
Normal per-cpu:
CPU 0: hi: 186, btch: 31 usd: 180
CPU 1: hi: 186, btch: 31 usd: 29
active_anon:646 inactive_anon:4337 isolated_anon:0
 active_file:27189 inactive_file:35957 isolated_file:0
 unevictable:0 dirty:56 writeback:0 unstable:0
 free:1142 slab_reclaimable:25495 slab_unreclaimable:1020
 mapped:3116 shmem:143 pagetables:123 bounce:0
DMA free:1568kB min:100kB low:124kB high:148kB active_anon:0kB
inactive_anon:4kB active_file:5704kB inactive_file:7732kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB
mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:0kB
slab_reclaimable:912kB slab_unreclaimable:52kB kernel_stack:0kB
pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? no
lowmem_reserve[]: 0 365 365
Normal free:3000kB min:2392kB low:2988kB high:3588kB active_anon:2584kB
inactive_anon:17344kB active_file:103052kB inactive_file:136096kB
unevictable:0kB isolated(anon):0kB isolated(file):0kB present:373888kB
mlocked:0kB dirty:224kB writeback:0kB mapped:12436kB shmem:572kB
slab_reclaimable:101068kB slab_unreclaimable:4028kB kernel_stack:520kB
pagetables:492kB unstable:0kB bounce:0kB writeback_tmp:0kB
pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
DMA: 391*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
0*2048kB 0*4096kB = 1564kB
Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
0*1024kB 0*2048kB 0*4096kB = 3000kB
63342 total pagecache pages
23 pages in swap cache
Swap cache stats: add 159, delete 136, find 401/412
Free swap = 995636kB
Total swap = 995992kB
98303 pages RAM
1638 pages reserved
22416 pages shared
76947 pages non-shared

Thanks.

-- 
Stan

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.34.1 page allocation failure
  2010-08-22 6:13 2.6.34.1 page allocation failure Stan Hoeppner
@ 2010-08-22 6:47 ` Mikael Abrahamsson
From: Mikael Abrahamsson @ 2010-08-22 6:47 UTC
  To: Stan Hoeppner; +Cc: Linux Kernel List, linux-mm

On Sun, 22 Aug 2010, Stan Hoeppner wrote:

> I'm not subscribed to lkml so please CC me in replies. First post.

I'm seeing similar problems on older kernels (.24 up to .32).

<http://www.spinics.net/lists/linux-mm/msg07808.html>

I didn't get any response at all, neither on linux-mm nor lkml... Our
problems seem very similar, but I'm running 64bit and I have 8 gigs of ram.

Personally I can avoid this by tuning down my TCP settings so TCP uses less
memory, but I don't think that workaround is very good, this shouldn't
happen. My machine also freezes up (pressing caps lock doesn't work)
sometimes, sometimes it just logs the error.

> Mobo:   Abit BP6, dual Celeron 366@500, i440BX chipset, 384MB PC100
> Disk:   SiI 3512 PCI (sata_sil, libata), 1 x WD5000AAKS 500 GB SATAII
> Kernel: vanilla 2.6.34.1, 32 bit x86, SMP, Celeron pre Coppermine
> OS:     Debian 5.0.5 (Stable)
> Build:  kernel configured via make menuconfig
>         no modules, no initrd
>         built via "make KDEB_PKGVERSION="
>         installed via dpkg, bootloader is LILO
> Role:   headless SOHO server, run level 2, _very_ light load
>         Postfix, pdns-recursor, Dovecot, Lighttpd, Roundcube, Samba
>         bulk of system memory (>300MB) is consumed by buffers/cache
> Issue:  AFAIK, these errors never occurred with any revisions of
>         2.6.26, .31, or .32. After installing 2.6.34.1 I've noticed
>         the following errors in dmesg. I see 6 of these, including
>         two errors each for kswapd0, lighttpd, and smtpd, all not
>         tainted. AFAICT everything is still running fine. Are these
>         critical errors? If so, how do I fix?
>
> kswapd0: page allocation failure. order:1, mode:0x20
> Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1
> Call Trace:
>  [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a
>  [<c1062ffb>] ? cache_alloc_refill+0x22f/0x422
>  [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4
>  [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a
>  [<c11773a5>] ? sk_prot_alloc+0x19/0x55
>  [<c117744b>] ? sk_clone+0x16/0x1cc
>  [<c119a71d>] ? inet_csk_clone+0xf/0x80
>  [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8
>  [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151
>  [<c11abf9d>] ? tcp_check_req+0x209/0x335
>  [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d
>  [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d
>  [<c1193ba4>] ? ip_local_deliver+0x76/0xc0
>  [<c1193b10>] ? ip_rcv+0x3dc/0x3fa
>  [<c103655e>] ? ktime_get_real+0xf/0x2b
>  [<c117f8d3>] ? netif_receive_skb+0x219/0x234
>  [<c115ff46>] ? e100_poll+0x1d0/0x47e
>  [<c117fa98>] ? net_rx_action+0x58/0xf8
>  [<c102539c>] ? __do_softirq+0x78/0xe5
>  [<c102542c>] ? do_softirq+0x23/0x27
>  [<c1003955>] ? do_IRQ+0x7d/0x8e
>  [<c1002aa9>] ? common_interrupt+0x29/0x30
>  [<c1062870>] ? kmem_cache_free+0xbd/0xc5
>  [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f
>  [<c1075215>] ? destroy_inode+0x1c/0x2b
>  [<c10752ce>] ? dispose_list+0xaa/0xd0
>  [<c107548c>] ? shrink_icache_memory+0x198/0x1c5
>  [<c104f76b>] ? shrink_slab+0xda/0x12f
>  [<c104fc28>] ? kswapd+0x468/0x63b
>  [<c104dca3>] ? isolate_pages_global+0x0/0x1bc
>  [<c10304d6>] ? autoremove_wake_function+0x0/0x2d
>  [<c1018faf>] ? complete+0x28/0x36
>  [<c104f7c0>] ? kswapd+0x0/0x63b
>  [<c10301cd>] ? kthread+0x61/0x66
>  [<c103016c>] ? kthread+0x0/0x66
>  [<c1002ab6>] ? kernel_thread_helper+0x6/0x10
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 0, btch: 1 usd: 0
> CPU 1: hi: 0, btch: 1 usd: 0
> Normal per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 180
> CPU 1: hi: 186, btch: 31 usd: 29
> active_anon:646 inactive_anon:4337 isolated_anon:0
>  active_file:27189 inactive_file:35957 isolated_file:0
>  unevictable:0 dirty:56 writeback:0 unstable:0
>  free:1142 slab_reclaimable:25495 slab_unreclaimable:1020
>  mapped:3116 shmem:143 pagetables:123 bounce:0
> DMA free:1568kB min:100kB low:124kB high:148kB active_anon:0kB
> inactive_anon:4kB active_file:5704kB inactive_file:7732kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB
> mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:0kB
> slab_reclaimable:912kB slab_unreclaimable:52kB kernel_stack:0kB
> pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> lowmem_reserve[]: 0 365 365
> Normal free:3000kB min:2392kB low:2988kB high:3588kB active_anon:2584kB
> inactive_anon:17344kB active_file:103052kB inactive_file:136096kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:373888kB
> mlocked:0kB dirty:224kB writeback:0kB mapped:12436kB shmem:572kB
> slab_reclaimable:101068kB slab_unreclaimable:4028kB kernel_stack:520kB
> pagetables:492kB unstable:0kB bounce:0kB writeback_tmp:0kB
> pages_scanned:0 all_unreclaimable? no
> lowmem_reserve[]: 0 0 0
> DMA: 391*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
> 0*2048kB 0*4096kB = 1564kB
> Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
> 0*1024kB 0*2048kB 0*4096kB = 3000kB
> 63342 total pagecache pages
> 23 pages in swap cache
> Swap cache stats: add 159, delete 136, find 401/412
> Free swap = 995636kB
> Total swap = 995992kB
> 98303 pages RAM
> 1638 pages reserved
> 22416 pages shared
> 76947 pages non-shared
>
> Thanks.
>
> --
> Stan

-- 
Mikael Abrahamsson    email: swmike@swm.pp.se
* Re: 2.6.34.1 page allocation failure
  2010-08-22 6:47 ` Mikael Abrahamsson
@ 2010-08-22 19:51 ` Pekka Enberg
From: Pekka Enberg @ 2010-08-22 19:51 UTC
  To: Mikael Abrahamsson
  Cc: Stan Hoeppner, Linux Kernel List, linux-mm, Mel Gorman, Christoph Lameter

On Sun, Aug 22, 2010 at 9:47 AM, Mikael Abrahamsson <swmike@swm.pp.se> wrote:
> On Sun, 22 Aug 2010, Stan Hoeppner wrote:
>
>> I'm not subscribed to lkml so please CC me in replies. First post.
>
> I'm seeing similar problems on older kernels (.24 up to .32).
>
> <http://www.spinics.net/lists/linux-mm/msg07808.html>
>
> I didn't get any response at all, neither on linux-mm nor lkml... Our
> problems seem very similar, but I'm running 64bit and I have 8 gigs of ram.
>
> Personally I can avoid this by tuning down my TCP settings so TCP uses less
> memory, but I don't think that workaround is very good, this shouldn't
> happen. My machine also freezes up (pressing caps lock doesn't work)
> sometimes, sometimes it just logs the error.
>
>> Mobo:   Abit BP6, dual Celeron 366@500, i440BX chipset, 384MB PC100
>> Disk:   SiI 3512 PCI (sata_sil, libata), 1 x WD5000AAKS 500 GB SATAII
>> Kernel: vanilla 2.6.34.1, 32 bit x86, SMP, Celeron pre Coppermine
>> OS:     Debian 5.0.5 (Stable)
>> Build:  kernel configured via make menuconfig
>>         no modules, no initrd
>>         built via "make KDEB_PKGVERSION="
>>         installed via dpkg, bootloader is LILO
>> Role:   headless SOHO server, run level 2, _very_ light load
>>         Postfix, pdns-recursor, Dovecot, Lighttpd, Roundcube, Samba
>>         bulk of system memory (>300MB) is consumed by buffers/cache
>> Issue:  AFAIK, these errors never occurred with any revisions of
>>         2.6.26, .31, or .32. After installing 2.6.34.1 I've noticed
>>         the following errors in dmesg. I see 6 of these, including
>>         two errors each for kswapd0, lighttpd, and smtpd, all not
>>         tainted. AFAICT everything is still running fine. Are these
>>         critical errors? If so, how do I fix?
>>
>> kswapd0: page allocation failure. order:1, mode:0x20
>> Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1
>> Call Trace:
>>  [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a
>>  [<c1062ffb>] ? cache_alloc_refill+0x22f/0x422
>>  [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4
>>  [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a
>>  [<c11773a5>] ? sk_prot_alloc+0x19/0x55
>>  [<c117744b>] ? sk_clone+0x16/0x1cc
>>  [<c119a71d>] ? inet_csk_clone+0xf/0x80
>>  [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8
>>  [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151
>>  [<c11abf9d>] ? tcp_check_req+0x209/0x335
>>  [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d
>>  [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d
>>  [<c1193ba4>] ? ip_local_deliver+0x76/0xc0
>>  [<c1193b10>] ? ip_rcv+0x3dc/0x3fa
>>  [<c103655e>] ? ktime_get_real+0xf/0x2b
>>  [<c117f8d3>] ? netif_receive_skb+0x219/0x234
>>  [<c115ff46>] ? e100_poll+0x1d0/0x47e
>>  [<c117fa98>] ? net_rx_action+0x58/0xf8
>>  [<c102539c>] ? __do_softirq+0x78/0xe5
>>  [<c102542c>] ? do_softirq+0x23/0x27
>>  [<c1003955>] ? do_IRQ+0x7d/0x8e
>>  [<c1002aa9>] ? common_interrupt+0x29/0x30
>>  [<c1062870>] ? kmem_cache_free+0xbd/0xc5
>>  [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f
>>  [<c1075215>] ? destroy_inode+0x1c/0x2b
>>  [<c10752ce>] ? dispose_list+0xaa/0xd0
>>  [<c107548c>] ? shrink_icache_memory+0x198/0x1c5
>>  [<c104f76b>] ? shrink_slab+0xda/0x12f
>>  [<c104fc28>] ? kswapd+0x468/0x63b
>>  [<c104dca3>] ? isolate_pages_global+0x0/0x1bc
>>  [<c10304d6>] ? autoremove_wake_function+0x0/0x2d
>>  [<c1018faf>] ? complete+0x28/0x36
>>  [<c104f7c0>] ? kswapd+0x0/0x63b
>>  [<c10301cd>] ? kthread+0x61/0x66
>>  [<c103016c>] ? kthread+0x0/0x66
>>  [<c1002ab6>] ? kernel_thread_helper+0x6/0x10
>> Mem-Info:
>> DMA per-cpu:
>> CPU 0: hi: 0, btch: 1 usd: 0
>> CPU 1: hi: 0, btch: 1 usd: 0
>> Normal per-cpu:
>> CPU 0: hi: 186, btch: 31 usd: 180
>> CPU 1: hi: 186, btch: 31 usd: 29
>> active_anon:646 inactive_anon:4337 isolated_anon:0
>>  active_file:27189 inactive_file:35957 isolated_file:0
>>  unevictable:0 dirty:56 writeback:0 unstable:0
>>  free:1142 slab_reclaimable:25495 slab_unreclaimable:1020
>>  mapped:3116 shmem:143 pagetables:123 bounce:0
>> DMA free:1568kB min:100kB low:124kB high:148kB active_anon:0kB
>> inactive_anon:4kB active_file:5704kB inactive_file:7732kB
>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15868kB
>> mlocked:0kB dirty:0kB writeback:0kB mapped:28kB shmem:0kB
>> slab_reclaimable:912kB slab_unreclaimable:52kB kernel_stack:0kB
>> pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
>> all_unreclaimable? no
>> lowmem_reserve[]: 0 365 365
>> Normal free:3000kB min:2392kB low:2988kB high:3588kB active_anon:2584kB
>> inactive_anon:17344kB active_file:103052kB inactive_file:136096kB
>> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:373888kB
>> mlocked:0kB dirty:224kB writeback:0kB mapped:12436kB shmem:572kB
>> slab_reclaimable:101068kB slab_unreclaimable:4028kB kernel_stack:520kB
>> pagetables:492kB unstable:0kB bounce:0kB writeback_tmp:0kB
>> pages_scanned:0 all_unreclaimable? no
>> lowmem_reserve[]: 0 0 0
>> DMA: 391*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB
>> 0*2048kB 0*4096kB = 1564kB
>> Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB
>> 0*1024kB 0*2048kB 0*4096kB = 3000kB
>> 63342 total pagecache pages
>> 23 pages in swap cache
>> Swap cache stats: add 159, delete 136, find 401/412
>> Free swap = 995636kB
>> Total swap = 995992kB
>> 98303 pages RAM
>> 1638 pages reserved
>> 22416 pages shared
>> 76947 pages non-shared

In Stan's case, it's an order-1 GFP_ATOMIC allocation but there are
only order-0 pages available. Mel, any recent page allocator fixes in
2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?
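
Pekka's reading of the report can be reproduced from the per-zone free-list lines in Stan's dmesg output. The following is a minimal sketch, not a kernel tool: the helper names parse_buddy_line and can_satisfy are made up for illustration, 4 kB base pages are assumed, and the check ignores zone watermarks and reserves, which the real allocator also takes into account.

```python
import re

def parse_buddy_line(line):
    """Parse one per-zone free-list line from an allocation failure report.

    Returns (zone_name, {order: free_block_count}), assuming 4 kB base pages.
    """
    zone, _, rest = line.partition(":")
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", rest):
        # 4kB -> order 0, 8kB -> order 1, ..., 4096kB -> order 10
        order = (int(size_kb) // 4).bit_length() - 1
        counts[order] = int(count)
    return zone.strip(), counts

def can_satisfy(counts, order):
    """True if a free block of the requested order or larger exists.

    Simplification: watermarks and lowmem reserves are not modelled.
    """
    return any(n > 0 for o, n in counts.items() if o >= order)

# The "Normal" zone line from Stan's report, joined onto one line.
normal = ("Normal: 750*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
          "0*512kB 0*1024kB 0*2048kB 0*4096kB = 3000kB")
zone, counts = parse_buddy_line(normal)
print(zone, counts[0], can_satisfy(counts, 0), can_satisfy(counts, 1))
# prints: Normal 750 True False
```

Every free block in both zones is a single 4 kB page, so the order-1 request (two physically contiguous pages) has nothing to draw from even though about 3 MB is nominally free.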
* Re: 2.6.34.1 page allocation failure
  2010-08-22 19:51 ` Pekka Enberg
@ 2010-08-22 22:40 ` Christoph Lameter
From: Christoph Lameter @ 2010-08-22 22:40 UTC
  To: Pekka Enberg
  Cc: Mikael Abrahamsson, Stan Hoeppner, Linux Kernel List, linux-mm, Mel Gorman

On Sun, 22 Aug 2010, Pekka Enberg wrote:

> In Stan's case, it's an order-1 GFP_ATOMIC allocation but there are
> only order-0 pages available. Mel, any recent page allocator fixes in
> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?

This is the TCP slab? Best fix would be in the page allocator. However,
in this particular case the slub allocator would be able to fall back to
an order 0 allocation and still satisfy the request.
* Re: 2.6.34.1 page allocation failure
  2010-08-22 22:40 ` Christoph Lameter
@ 2010-08-23 9:37 ` Pekka Enberg
From: Pekka Enberg @ 2010-08-23 9:37 UTC
  To: Christoph Lameter
  Cc: Mikael Abrahamsson, Stan Hoeppner, Linux Kernel List, linux-mm, Mel Gorman

On 8/23/10 1:40 AM, Christoph Lameter wrote:
> On Sun, 22 Aug 2010, Pekka Enberg wrote:
>
>> In Stan's case, it's an order-1 GFP_ATOMIC allocation but there are
>> only order-0 pages available. Mel, any recent page allocator fixes in
>> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?
> This is the TCP slab? Best fix would be in the page allocator. However,
> in this particular case the slub allocator would be able to fall back to
> an order 0 allocation and still satisfy the request.
>
Looking at the stack trace of the oops, I think Stan has CONFIG_SLAB
which doesn't have order-0 fallback.
* Re: 2.6.34.1 page allocation failure
  2010-08-23 9:37 ` Pekka Enberg
@ 2010-08-23 22:35 ` Stan Hoeppner
From: Stan Hoeppner @ 2010-08-23 22:35 UTC
  To: Pekka Enberg
  Cc: Christoph Lameter, Mikael Abrahamsson, Linux Kernel List, linux-mm, Mel Gorman

Pekka Enberg put forth on 8/23/2010 4:37 AM:
> On 8/23/10 1:40 AM, Christoph Lameter wrote:
>> On Sun, 22 Aug 2010, Pekka Enberg wrote:
>>
>>> In Stan's case, it's an order-1 GFP_ATOMIC allocation but there are
>>> only order-0 pages available. Mel, any recent page allocator fixes in
>>> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test?
>> This is the TCP slab? Best fix would be in the page allocator. However,
>> in this particular case the slub allocator would be able to fall back to
>> an order 0 allocation and still satisfy the request.
>>
> Looking at the stack trace of the oops, I think Stan has CONFIG_SLAB
> which doesn't have order-0 fallback.

That is correct. The menuconfig help screen led me to believe the SLAB
allocator was the "safe" choice:

"CONFIG_SLAB:
The regular slab allocator that is established and known to work well in
all environments"

Should I be using SLUB instead? Any downsides to SLUB on an old and
slow (500 MHz) single core dual CPU box with <512MB RAM?

Also, what is the impact of these oopses? Despite the entries in dmesg,
the system "seems" to be running ok. Or is this simply the calm before
the impending storm?

-- 
Stan
* Re: 2.6.34.1 page allocation failure 2010-08-23 22:35 ` Stan Hoeppner @ 2010-08-24 17:13 ` Christoph Lameter -1 siblings, 0 replies; 29+ messages in thread From: Christoph Lameter @ 2010-08-24 17:13 UTC (permalink / raw) To: Stan Hoeppner Cc: Pekka Enberg, Mikael Abrahamsson, Linux Kernel List, linux-mm, Mel Gorman On Mon, 23 Aug 2010, Stan Hoeppner wrote: > Should I be using SLUB instead? Any downsides to SLUB on an old and > slow (500 MHz) single core dual CPU box with <512MB RAM? SLUB has a smaller memory footprint so you may come out ahead for such a small system in particular. > Also, what is the impact of these oopses? Despite the entries in dmesg, > the system "seems" to be running ok. Or is this simply the calm before > the impending storm? The system does not guarantee that GFP_ATOMIC allocation succeed so any caller must provide logic to fall back if no memory is allocated. So the effect may just be that certain OS operations have to be retried. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.34.1 page allocation failure 2010-08-23 22:35 ` Stan Hoeppner @ 2010-08-24 18:03 ` Pekka Enberg -1 siblings, 0 replies; 29+ messages in thread From: Pekka Enberg @ 2010-08-24 18:03 UTC (permalink / raw) To: Stan Hoeppner Cc: Christoph Lameter, Mikael Abrahamsson, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List [ I'm CC'ing netdev. ] On 24.8.2010 1.35, Stan Hoeppner wrote: > Pekka Enberg put forth on 8/23/2010 4:37 AM: >> On 8/23/10 1:40 AM, Christoph Lameter wrote: >>> On Sun, 22 Aug 2010, Pekka Enberg wrote: >>> >>>> In Stan's case, it's a order-1 GFP_ATOMIC allocation but there are >>>> only order-0 pages available. Mel, any recent page allocator fixes in >>>> 2.6.35 or 2.6.36-rc1 that Stan/Mikael should test? >>> This is the TCP slab? Best fix would be in the page allocator. However, >>> in this particular case the slub allocator would be able to fall back to >>> an order 0 allocation and still satisfy the request. >> Looking at the stack trace of the oops, I think Stan has CONFIG_SLAB >> which doesn't have order-0 fallback. > That is correct. The menuconfig help screen led me to believe the SLAB > allocator was the "safe" choice: > > "CONFIG_SLAB: > The regular slab allocator that is established and known to work well in > all environments" > > Should I be using SLUB instead? Any downsides to SLUB on an old and > slow (500 MHz) single core dual CPU box with<512MB RAM? I don't think the problem here is SLAB so it shouldn't matter which one you use. You might not see the problems with SLUB, though, because it falls back to 0-order allocations. > Also, what is the impact of these oopses? Despite the entries in dmesg, > the system "seems" to be running ok. Or is this simply the calm before > the impending storm? The page allocation failure in question is this: kswapd0: page allocation failure. order:1, mode:0x20 Pid: 139, comm: kswapd0 Not tainted 2.6.34.1 #1 Call Trace: [<c104b6b3>] ? __alloc_pages_nodemask+0x448/0x48a [<c1062ffb>] ? 
cache_alloc_refill+0x22f/0x422 [<c11a9a73>] ? tcp_v4_send_check+0x6e/0xa4 [<c10632c3>] ? kmem_cache_alloc+0x41/0x6a [<c11773a5>] ? sk_prot_alloc+0x19/0x55 [<c117744b>] ? sk_clone+0x16/0x1cc [<c119a71d>] ? inet_csk_clone+0xf/0x80 [<c11ac0e3>] ? tcp_create_openreq_child+0x1a/0x3c8 [<c11aaf0a>] ? tcp_v4_syn_recv_sock+0x4b/0x151 [<c11abf9d>] ? tcp_check_req+0x209/0x335 [<c11aa892>] ? tcp_v4_do_rcv+0x8d/0x14d [<c11aacd5>] ? tcp_v4_rcv+0x383/0x56d [<c1193ba4>] ? ip_local_deliver+0x76/0xc0 [<c1193b10>] ? ip_rcv+0x3dc/0x3fa [<c103655e>] ? ktime_get_real+0xf/0x2b [<c117f8d3>] ? netif_receive_skb+0x219/0x234 [<c115ff46>] ? e100_poll+0x1d0/0x47e [<c117fa98>] ? net_rx_action+0x58/0xf8 [<c102539c>] ? __do_softirq+0x78/0xe5 [<c102542c>] ? do_softirq+0x23/0x27 [<c1003955>] ? do_IRQ+0x7d/0x8e [<c1002aa9>] ? common_interrupt+0x29/0x30 [<c1062870>] ? kmem_cache_free+0xbd/0xc5 [<c10fa7d1>] ? __xfs_inode_set_reclaim_tag+0x29/0x2f [<c1075215>] ? destroy_inode+0x1c/0x2b [<c10752ce>] ? dispose_list+0xaa/0xd0 [<c107548c>] ? shrink_icache_memory+0x198/0x1c5 [<c104f76b>] ? shrink_slab+0xda/0x12f [<c104fc28>] ? kswapd+0x468/0x63b [<c104dca3>] ? isolate_pages_global+0x0/0x1bc [<c10304d6>] ? autoremove_wake_function+0x0/0x2d [<c1018faf>] ? complete+0x28/0x36 [<c104f7c0>] ? kswapd+0x0/0x63b [<c10301cd>] ? kthread+0x61/0x66 [<c103016c>] ? kthread+0x0/0x66 [<c1002ab6>] ? kernel_thread_helper+0x6/0x10 It looks to me as if tcp_create_openreq_child() is able to cope with the situation so the warning could be harmless. If that's the case, we should probably stick a __GFP_NOWARN there. Pekka ^ permalink raw reply [flat|nested] 29+ messages in thread
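To make the `mode:0x20` in the report and the `__GFP_NOWARN` suggestion concrete, here is a small decoder. The flag values are an assumption: they are the ones I recall from include/linux/gfp.h in kernels around 2.6.32 to 2.6.34 (where GFP_ATOMIC was defined as just `__GFP_HIGH`), and they differ in other kernel versions. `decode_gfp` is a hypothetical helper, not a kernel or tool API.

```python
# Flag values as recalled from include/linux/gfp.h around 2.6.32-2.6.34.
# ASSUMPTION: verify against the exact kernel source before relying on them.
GFP_FLAGS = {
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x200: "__GFP_NOWARN",
    0x4000: "__GFP_COMP",
}


def decode_gfp(mode):
    """Return the names of the known flags set in `mode`, lowest bit first."""
    names = [name for bit, name in sorted(GFP_FLAGS.items()) if mode & bit]
    known = sum(bit for bit in GFP_FLAGS if mode & bit)
    leftover = mode & ~known
    if leftover:
        # Bits this table does not know about are reported as raw hex.
        names.append(hex(leftover))
    return names


print(decode_gfp(0x20))          # mode from the kswapd0 report
print(decode_gfp(0x20 | 0x200))  # the same request with __GFP_NOWARN added
```

Under these assumed values, 0x20 decodes to `__GFP_HIGH` alone, so the request neither waits nor suppresses the warning; adding `__GFP_NOWARN` would change only whether the failure is logged, not whether it happens.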
* Re: 2.6.34.1 page allocation failure 2010-08-24 18:03 ` Pekka Enberg @ 2010-08-24 19:08 ` Stan Hoeppner -1 siblings, 0 replies; 29+ messages in thread From: Stan Hoeppner @ 2010-08-24 19:08 UTC (permalink / raw) To: Pekka Enberg Cc: Christoph Lameter, Mikael Abrahamsson, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List Pekka Enberg put forth on 8/24/2010 1:03 PM: > It looks to me as if tcp_create_openreq_child() is able to cope with the > situation so the warning could be harmless. If that's the case, we > should probably stick a __GFP_NOWARN there. If it would be helpful, here's a complete copy of dmesg: http://www.hardwarefreak.com/2.6.34.1-dmesg-oopses.txt Something I forgot to mention earlier is that every now and then I unmount swap and drop caches to clear things out a bit. Not sure if that may be relevant, but since it has to do with memory allocation I thought I'd mention it. -- Stan ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.34.1 page allocation failure 2010-08-24 18:03 ` Pekka Enberg @ 2010-08-24 19:21 ` Mikael Abrahamsson -1 siblings, 0 replies; 29+ messages in thread From: Mikael Abrahamsson @ 2010-08-24 19:21 UTC (permalink / raw) To: Pekka Enberg Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List On Tue, 24 Aug 2010, Pekka Enberg wrote: > It looks to me as if tcp_create_openreq_child() is able to cope with the > situation so the warning could be harmless. If that's the case, we > should probably stick a __GFP_NOWARN there. What about my situation? (a complete dmesg can be had at <http://swm.pp.se/dmesg.100809-2.txt.gz>) [87578.494471] swapper: page allocation failure. order:0, mode:0x4020 [87578.494476] Pid: 0, comm: swapper Not tainted 2.6.32-24-generic #39-Ubuntu [87578.494480] Call Trace: [87578.494483] <IRQ> [<ffffffff810fad0e>] __alloc_pages_slowpath+0x56e/0x580 [87578.494499] [<ffffffff810fae7e>] __alloc_pages_nodemask+0x15e/0x1a0 [87578.494506] [<ffffffff8112dba7>] alloc_pages_current+0x87/0xd0 [87578.494511] [<ffffffff81133b17>] new_slab+0x2f7/0x310 [87578.494516] [<ffffffff811363c1>] __slab_alloc+0x201/0x2d0 [87578.494522] [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60 [87578.494528] [<ffffffff81137408>] __kmalloc_node_track_caller+0xb8/0x180 [87578.494532] [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60 [87578.494536] [<ffffffff81455ca0>] __alloc_skb+0x80/0x190 [87578.494540] [<ffffffff81455fe6>] __netdev_alloc_skb+0x36/0x60 [87578.494564] [<ffffffffa008f5c7>] rtl8169_rx_interrupt+0x247/0x5b0 [r8169] [87578.494572] [<ffffffffa008faad>] rtl8169_poll+0x3d/0x270 [r8169] [87578.494580] [<ffffffff810397a9>] ? default_spin_lock_flags+0x9/0x10 [87578.494586] [<ffffffff8146029f>] net_rx_action+0x10f/0x250 [87578.494594] [<ffffffffa008d54e>] ? rtl8169_interrupt+0xde/0x1e0 [r8169] [87578.494600] [<ffffffff8106e467>] __do_softirq+0xb7/0x1e0 [87578.494605] [<ffffffff810c52c0>] ? 
handle_IRQ_event+0x60/0x170 [87578.494610] [<ffffffff810142ec>] call_softirq+0x1c/0x30 [87578.494614] [<ffffffff81015cb5>] do_softirq+0x65/0xa0 [87578.494618] [<ffffffff8106e305>] irq_exit+0x85/0x90 [87578.494623] [<ffffffff81549515>] do_IRQ+0x75/0xf0 [87578.494627] [<ffffffff81013b13>] ret_from_intr+0x0/0x11 [87578.494629] <EOI> [<ffffffff8130f7cb>] ? acpi_idle_enter_c1+0xa3/0xc1 [87578.494639] [<ffffffff8130f7aa>] ? acpi_idle_enter_c1+0x82/0xc1 [87578.494646] [<ffffffff8143a5a7>] ? cpuidle_idle_call+0xa7/0x140 [87578.494652] [<ffffffff81011e73>] ? cpu_idle+0xb3/0x110 [87578.494657] [<ffffffff8153e27e>] ? start_secondary+0xa8/0xaa [87578.494660] Mem-Info: [87578.494662] Node 0 DMA per-cpu: [87578.494666] CPU 0: hi: 0, btch: 1 usd: 0 [87578.494669] CPU 1: hi: 0, btch: 1 usd: 0 [87578.494672] CPU 2: hi: 0, btch: 1 usd: 0 [87578.494674] CPU 3: hi: 0, btch: 1 usd: 0 [87578.494677] Node 0 DMA32 per-cpu: [87578.494680] CPU 0: hi: 186, btch: 31 usd: 173 [87578.494683] CPU 1: hi: 186, btch: 31 usd: 87 [87578.494686] CPU 2: hi: 186, btch: 31 usd: 168 [87578.494689] CPU 3: hi: 186, btch: 31 usd: 63 [87578.494691] Node 0 Normal per-cpu: [87578.494695] CPU 0: hi: 186, btch: 31 usd: 177 [87578.494698] CPU 1: hi: 186, btch: 31 usd: 176 [87578.494700] CPU 2: hi: 186, btch: 31 usd: 82 [87578.494703] CPU 3: hi: 186, btch: 31 usd: 191 [87578.494710] active_anon:22970 inactive_anon:6433 isolated_anon:0 [87578.494711] active_file:916528 inactive_file:914736 isolated_file:0 [87578.494713] unevictable:0 dirty:135959 writeback:24423 unstable:0 [87578.494714] free:9990 slab_reclaimable:59767 slab_unreclaimable:11135 [87578.494716] mapped:119343 shmem:985 pagetables:2113 bounce:0 [87578.494719] Node 0 DMA free:15860kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15272kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB 
kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [87578.494733] lowmem_reserve[]: 0 2866 7852 7852 [87578.494738] Node 0 DMA32 free:21420kB min:4136kB low:5168kB high:6204kB active_anon:4056kB inactive_anon:5856kB active_file:1322360kB inactive_file:1320432kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2935456kB mlocked:0kB dirty:190824kB writeback:31900kB mapped:157676kB shmem:0kB slab_reclaimable:107316kB slab_unreclaimable:15480kB kernel_stack:56kB pagetables:764kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [87578.494754] lowmem_reserve[]: 0 0 4986 4986 [87578.494759] Node 0 Normal free:2680kB min:7192kB low:8988kB high:10788kB active_anon:87824kB inactive_anon:19876kB active_file:2343752kB inactive_file:2338512kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:5105664kB mlocked:0kB dirty:353012kB writeback:65792kB mapped:319696kB shmem:3940kB slab_reclaimable:131752kB slab_unreclaimable:29060kB kernel_stack:2160kB pagetables:7688kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no [87578.494775] lowmem_reserve[]: 0 0 0 0 [87578.494779] Node 0 DMA: 3*4kB 3*8kB 3*16kB 1*32kB 2*64kB 2*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15860kB [87578.494792] Node 0 DMA32: 789*4kB 765*8kB 589*16kB 1*32kB 1*64kB 4*128kB 4*256kB 2*512kB 0*1024kB 0*2048kB 0*4096kB = 21356kB [87578.494805] Node 0 Normal: 374*4kB 4*8kB 20*16kB 1*32kB 0*64kB 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2648kB [87578.494818] 1832322 total pagecache pages [87578.494820] 0 pages in swap cache [87578.494823] Swap cache stats: add 0, delete 0, find 0/0 [87578.494825] Free swap = 0kB [87578.494827] Total swap = 0kB [87578.531041] 2064368 pages RAM [87578.531044] 66019 pages reserved [87578.531046] 1501227 pages shared [87578.531048] 619257 pages non-shared [87578.531053] SLUB: Unable to allocate memory on node -1 (gfp=0x20) [87578.531057] cache: kmalloc-4096, object size: 4096, buffer size: 4096, default order: 3, min order: 0 [87578.531061] node 0: slabs: 1322, objs: 4129, free: 0 This actually made the machine go offline for hours before it for some reason came back. The second time this happened it did not come back (waited 8 hours). I also seem to have TCP related problems: [87578.531806] [<ffffffff8113651f>] kmem_cache_alloc_node+0x8f/0x160 [87578.531812] [<ffffffff81455c6f>] __alloc_skb+0x4f/0x190 [87578.531820] [<ffffffff814acbe0>] ? tcp_delack_timer+0x0/0x270 [87578.531828] [<ffffffff814ab423>] tcp_send_ack+0x33/0x120 [87578.531834] [<ffffffff814acd22>] tcp_delack_timer+0x142/0x270 [87578.531842] [<ffffffff8105a34d>] ? scheduler_tick+0x18d/0x260 [87578.531849] [<ffffffff8107776b>] run_timer_softirq+0x19b/0x340 [87578.531857] [<ffffffff81094ac0>] ? tick_sched_timer+0x0/0xc0 [87578.531865] [<ffffffff8108f723>] ? ktime_get+0x63/0xe0 [87578.531871] [<ffffffff8106e467>] __do_softirq+0xb7/0x1e0 [87578.531878] [<ffffffff810946aa>] ? 
tick_program_event+0x2a/0x30 [87578.531885] [<ffffffff810142ec>] call_softirq+0x1c/0x30 [87578.531891] [<ffffffff81015cb5>] do_softirq+0x65/0xa0 [87578.531897] [<ffffffff8106e305>] irq_exit+0x85/0x90 [87578.531904] [<ffffffff81549601>] smp_apic_timer_interrupt+0x71/0x9c [87578.531910] [<ffffffff81013cb3>] apic_timer_interrupt+0x13/0x20 [87578.531914] <EOI> [<ffffffff8130fbbe>] ? acpi_idle_enter_simple+0x117/0x14b [87578.531928] [<ffffffff8130fbb7>] ? acpi_idle_enter_simple+0x110/0x14b [87578.531936] [<ffffffff8143a5a7>] ? cpuidle_idle_call+0xa7/0x140 [87578.531943] [<ffffffff81011e73>] ? cpu_idle+0xb3/0x110 [87578.531950] [<ffffffff8153e27e>] ? start_secondary+0xa8/0xaa -- Mikael Abrahamsson email: swmike@swm.pp.se ^ permalink raw reply [flat|nested] 29+ messages in thread
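The per-order free lists in the report above ("374*4kB 4*8kB ...") can be checked mechanically. The sketch below parses one such line, copied from the dump, and sums it; `parse_free_list` and `total_kb` are hypothetical helper names, and the regular expression assumes the `count*sizekB` layout shown in this report.

```python
import re


def parse_free_list(line):
    """Return {block_size_kB: count} from a 'count*sizekB ...' free-list line."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def total_kb(blocks):
    """Total free memory in kB represented by the parsed blocks."""
    return sum(size * count for size, count in blocks.items())


normal = ("Node 0 Normal: 374*4kB 4*8kB 20*16kB 1*32kB 0*64kB 0*128kB "
          "1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2648kB")
blocks = parse_free_list(normal)
print(blocks[4], total_kb(blocks))
```

For the Normal zone this yields 374 free 4 kB blocks and a total of 2648 kB, matching the figure printed at the end of the line, so free order-0 pages did exist when the order-0 request failed.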
* Re: 2.6.34.1 page allocation failure
  2010-08-24 19:21 ` Mikael Abrahamsson
@ 2010-08-29 10:49 ` Pekka Enberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Pekka Enberg @ 2010-08-29 10:49 UTC (permalink / raw)
  To: Mikael Abrahamsson
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List

On 24.8.2010 22.21, Mikael Abrahamsson wrote:
> On Tue, 24 Aug 2010, Pekka Enberg wrote:
>
>> It looks to me as if tcp_create_openreq_child() is able to cope with
>> the situation so the warning could be harmless. If that's the case,
>> we should probably stick a __GFP_NOWARN there.
>
> What about my situation? (a complete dmesg can be had at
> <http://swm.pp.se/dmesg.100809-2.txt.gz>)

This looks like something the kernel can't really recover from.

> [87578.494471] swapper: page allocation failure. order:0, mode:0x4020
> [87578.494476] Pid: 0, comm: swapper Not tainted 2.6.32-24-generic
> #39-Ubuntu
> [87578.494480] Call Trace:
> [87578.494483] <IRQ> [<ffffffff810fad0e>]
> __alloc_pages_slowpath+0x56e/0x580
> [87578.494499] [<ffffffff810fae7e>] __alloc_pages_nodemask+0x15e/0x1a0
> [87578.494506] [<ffffffff8112dba7>] alloc_pages_current+0x87/0xd0
> [87578.494511] [<ffffffff81133b17>] new_slab+0x2f7/0x310
> [87578.494516] [<ffffffff811363c1>] __slab_alloc+0x201/0x2d0
> [87578.494522] [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60
> [87578.494528] [<ffffffff81137408>]
> __kmalloc_node_track_caller+0xb8/0x180
> [87578.494532] [<ffffffff81455fe6>] ? __netdev_alloc_skb+0x36/0x60
> [87578.494536] [<ffffffff81455ca0>] __alloc_skb+0x80/0x190
> [87578.494540] [<ffffffff81455fe6>] __netdev_alloc_skb+0x36/0x60
> [87578.494564] [<ffffffffa008f5c7>] rtl8169_rx_interrupt+0x247/0x5b0
> [r8169]
> [87578.494572] [<ffffffffa008faad>] rtl8169_poll+0x3d/0x270 [r8169]
> [87578.494580] [<ffffffff810397a9>] ? default_spin_lock_flags+0x9/0x10
> [87578.494586] [<ffffffff8146029f>] net_rx_action+0x10f/0x250
> [87578.494594] [<ffffffffa008d54e>] ? rtl8169_interrupt+0xde/0x1e0
> [r8169]
> [87578.494600] [<ffffffff8106e467>] __do_softirq+0xb7/0x1e0
> [87578.494605] [<ffffffff810c52c0>] ? handle_IRQ_event+0x60/0x170
> [87578.494610] [<ffffffff810142ec>] call_softirq+0x1c/0x30
> [87578.494614] [<ffffffff81015cb5>] do_softirq+0x65/0xa0
> [87578.494618] [<ffffffff8106e305>] irq_exit+0x85/0x90
> [87578.494623] [<ffffffff81549515>] do_IRQ+0x75/0xf0
> [87578.494627] [<ffffffff81013b13>] ret_from_intr+0x0/0x11
> [87578.494629] <EOI> [<ffffffff8130f7cb>] ? acpi_idle_enter_c1+0xa3/0xc1
> [87578.494639] [<ffffffff8130f7aa>] ? acpi_idle_enter_c1+0x82/0xc1
> [87578.494646] [<ffffffff8143a5a7>] ? cpuidle_idle_call+0xa7/0x140
> [87578.494652] [<ffffffff81011e73>] ? cpu_idle+0xb3/0x110
> [87578.494657] [<ffffffff8153e27e>] ? start_secondary+0xa8/0xaa
> [87578.494660] Mem-Info:
> [87578.494662] Node 0 DMA per-cpu:
> [87578.494666] CPU 0: hi: 0, btch: 1 usd: 0
> [87578.494669] CPU 1: hi: 0, btch: 1 usd: 0
> [87578.494672] CPU 2: hi: 0, btch: 1 usd: 0
> [87578.494674] CPU 3: hi: 0, btch: 1 usd: 0
> [87578.494677] Node 0 DMA32 per-cpu:
> [87578.494680] CPU 0: hi: 186, btch: 31 usd: 173
> [87578.494683] CPU 1: hi: 186, btch: 31 usd: 87
> [87578.494686] CPU 2: hi: 186, btch: 31 usd: 168
> [87578.494689] CPU 3: hi: 186, btch: 31 usd: 63
> [87578.494691] Node 0 Normal per-cpu:
> [87578.494695] CPU 0: hi: 186, btch: 31 usd: 177
> [87578.494698] CPU 1: hi: 186, btch: 31 usd: 176
> [87578.494700] CPU 2: hi: 186, btch: 31 usd: 82
> [87578.494703] CPU 3: hi: 186, btch: 31 usd: 191
> [87578.494710] active_anon:22970 inactive_anon:6433 isolated_anon:0
> [87578.494711] active_file:916528 inactive_file:914736 isolated_file:0
> [87578.494713] unevictable:0 dirty:135959 writeback:24423 unstable:0
> [87578.494714] free:9990 slab_reclaimable:59767 slab_unreclaimable:11135
> [87578.494716] mapped:119343 shmem:985 pagetables:2113 bounce:0
> [87578.494719] Node 0 DMA free:15860kB min:20kB low:24kB high:28kB
> active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15272kB
> mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB
> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB
> pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB
> pages_scanned:0 all_unreclaimable? yes
> [87578.494733] lowmem_reserve[]: 0 2866 7852 7852
> [87578.494738] Node 0 DMA32 free:21420kB min:4136kB low:5168kB
> high:6204kB active_anon:4056kB inactive_anon:5856kB
> active_file:1322360kB inactive_file:1320432kB unevictable:0kB
> isolated(anon):0kB isolated(file):0kB present:2935456kB mlocked:0kB
> dirty:190824kB writeback:31900kB mapped:157676kB shmem:0kB
> slab_reclaimable:107316kB slab_unreclaimable:15480kB kernel_stack:56kB
> pagetables:764kB unstable:0kB bounce:0kB writeback_tmp:0kB
> pages_scanned:0 all_unreclaimable? no
> [87578.494754] lowmem_reserve[]: 0 0 4986 4986
> [87578.494759] Node 0 Normal free:2680kB min:7192kB low:8988kB
> high:10788kB active_anon:87824kB inactive_anon:19876kB
> active_file:2343752kB inactive_file:2338512kB unevictable:0kB
> isolated(anon):0kB isolated(file):0kB present:5105664kB mlocked:0kB
> dirty:353012kB writeback:65792kB mapped:319696kB shmem:3940kB
> slab_reclaimable:131752kB slab_unreclaimable:29060kB
> kernel_stack:2160kB pagetables:7688kB unstable:0kB bounce:0kB
> writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> [87578.494775] lowmem_reserve[]: 0 0 0 0
> [87578.494779] Node 0 DMA: 3*4kB 3*8kB 3*16kB 1*32kB 2*64kB 2*128kB
> 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15860kB
> [87578.494792] Node 0 DMA32: 789*4kB 765*8kB 589*16kB 1*32kB 1*64kB
> 4*128kB 4*256kB 2*512kB 0*1024kB 0*2048kB 0*4096kB = 21356kB
> [87578.494805] Node 0 Normal: 374*4kB 4*8kB 20*16kB 1*32kB 0*64kB
> 0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2648kB

You seem to have 4K pages available still. I wonder why the page
allocator isn't giving them to SLUB?

> [87578.494818] 1832322 total pagecache pages
> [87578.494820] 0 pages in swap cache
> [87578.494823] Swap cache stats: add 0, delete 0, find 0/0
> [87578.494825] Free swap = 0kB
> [87578.494827] Total swap = 0kB
> [87578.531041] 2064368 pages RAM
> [87578.531044] 66019 pages reserved
> [87578.531046] 1501227 pages shared
> [87578.531048] 619257 pages non-shared
> [87578.531053] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> [87578.531057] cache: kmalloc-4096, object size: 4096, buffer size:
> 4096, default order: 3, min order: 0
> [87578.531061] node 0: slabs: 1322, objs: 4129, free: 0
>
> This actually made the machine go offline for hours before it for some
> reason came back. The second time this happened it did not come back
> (waited 8 hours).

Do you see these out-of-memory problems with 2.6.35?

^ permalink raw reply [flat|nested] 29+ messages in thread
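As a rough cross-check of the buddy free lists quoted in the dmesg above, the per-zone lists can be summed and each zone's reported free memory compared against its min watermark. The following is only a minimal Python sketch of that arithmetic, not kernel code; the names BUDDY, WATERMARKS, parse_buddy, total_kb and summarize are made up for illustration, and the numbers are copied by hand from the quoted report.

```python
import re

# Buddy free lists copied from the "Node 0 <zone>:" lines of the quoted dmesg.
BUDDY = {
    "DMA": "3*4kB 3*8kB 3*16kB 1*32kB 2*64kB 2*128kB "
           "0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB",
    "DMA32": "789*4kB 765*8kB 589*16kB 1*32kB 1*64kB "
             "4*128kB 4*256kB 2*512kB 0*1024kB 0*2048kB 0*4096kB",
    "Normal": "374*4kB 4*8kB 20*16kB 1*32kB 0*64kB "
              "0*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB",
}

# (free_kB, min_kB) copied from the "Node 0 <zone> free:... min:..." lines.
WATERMARKS = {
    "DMA": (15860, 20),
    "DMA32": (21420, 4136),
    "Normal": (2680, 7192),
}


def parse_buddy(line):
    """Return (block_size_kB, count) pairs from an 'N*SkB ...' free list."""
    return [(int(size), int(count))
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)]


def total_kb(line):
    """Sum count * block size over every order in the free list."""
    return sum(size * count for size, count in parse_buddy(line))


def summarize():
    """Per zone: (name, buddy total kB, free 4kB blocks, free below min?)."""
    rows = []
    for zone, line in BUDDY.items():
        free_kb, min_kb = WATERMARKS[zone]
        order0_blocks = dict(parse_buddy(line)).get(4, 0)
        rows.append((zone, total_kb(line), order0_blocks, free_kb < min_kb))
    return rows


if __name__ == "__main__":
    for zone, total, order0_blocks, below_min in summarize():
        print(f"{zone:7s} buddy total {total:6d} kB, "
              f"free 4kB blocks {order0_blocks:4d}, "
              f"free below min watermark: {below_min}")
```

With the numbers above, the sums match the totals printed in the dmesg, and the Normal zone is the only one whose reported free memory (2680kB) is below its min watermark (7192kB), even though it still lists 374 free 4kB blocks.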
* Re: 2.6.34.1 page allocation failure
  2010-08-29 10:49 ` Pekka Enberg
@ 2010-08-29 12:38 ` Mikael Abrahamsson
  -1 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-29 12:38 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List

On Sun, 29 Aug 2010, Pekka Enberg wrote:

> Do you see these out-of-memory problems with 2.6.35?

Haven't tried it.

Has there been substantial work done there that changes things so that if
I reproduce it on 2.6.35, someone will look into the issue in earnest?
Since I'll most likely have to compile a new kernel, are there any debug
options I should enable to give more information to aid fault finding?

I'll start with the .config file from Ubuntu 10.04 2.6.32 kernel and
oldconfig from there.

--
Mikael Abrahamsson email: swmike@swm.pp.se

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.34.1 page allocation failure
  2010-08-29 12:38 ` Mikael Abrahamsson
@ 2010-08-29 13:17 ` Pekka Enberg
  -1 siblings, 0 replies; 29+ messages in thread
From: Pekka Enberg @ 2010-08-29 13:17 UTC (permalink / raw)
  To: Mikael Abrahamsson
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List

On Sun, 29 Aug 2010, Pekka Enberg wrote:
>> Do you see these out-of-memory problems with 2.6.35?

On 29.8.2010 15.38, Mikael Abrahamsson wrote:
> Haven't tried it.
>
> Has there been substantial work done there that changes things so that
> if I reproduce it on 2.6.35, someone will look into the issue in
> earnest? Since I'll most likely have to compile a new kernel, are
> there any debug options I should enable to give more information to
> aid fault finding?

There aren't any debug options that need to be enabled. The reason I'm
asking is because we had a bunch of similar issues being reported
earlier that got fixed and it's been calm for a while. That's why it
would be interesting to know if 2.6.35 or 2.6.36-rc2 (if it's not too
unstable to test) fixes things.

Pekka

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.34.1 page allocation failure
  2010-08-29 13:17 ` Pekka Enberg
@ 2010-08-29 15:37 ` Mikael Abrahamsson
  -1 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-29 15:37 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List

On Sun, 29 Aug 2010, Pekka Enberg wrote:

> There aren't any debug options that need to be enabled. The reason I'm
> asking is because we had a bunch of similar issues being reported
> earlier that got fixed and it's been calm for a while. That's why it
> would be interesting to know if 2.6.35 or 2.6.36-rc2 (if it's not too
> unstable to test) fixes things.

Oki, I have installed 2.6.35 now (found backport from ubuntu 10.10 for
10.04), just need to do a reboot at some convenient time.

--
Mikael Abrahamsson email: swmike@swm.pp.se

^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: 2.6.34.1 page allocation failure
  2010-08-29 15:37 ` Mikael Abrahamsson
@ 2010-08-31 20:28 ` Mikael Abrahamsson
  -1 siblings, 0 replies; 29+ messages in thread
From: Mikael Abrahamsson @ 2010-08-31 20:28 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: Stan Hoeppner, Christoph Lameter, Linux Kernel List, linux-mm, Mel Gorman, Linux Netdev List

On Sun, 29 Aug 2010, Mikael Abrahamsson wrote:

> On Sun, 29 Aug 2010, Pekka Enberg wrote:
>
>> There aren't any debug options that need to be enabled. The reason I'm
>> asking is because we had a bunch of similar issues being reported earlier
>> that got fixed and it's been calm for a while. That's why it would be
>> interesting to know if 2.6.35 or 2.6.36-rc2 (if it's not too unstable to
>> test) fixes things.
>
> Oki, I have installed 2.6.35 now (found backport from ubuntu 10.10 for
> 10.04), just need to do a reboot at some convenient time.

I just rebooted and ran a network+disk load similar to the one that made
the machine give "swapper allocation failure" messages before, and I
couldn't reproduce it with 2.6.35:

2.6.35-19-generic #25~lucid1-Ubuntu SMP Wed Aug 25 03:50:05 UTC 2010 x86_64 GNU/Linux

Doing "sync" in the middle made sync take more than 5 minutes to
complete (2 hung-task messages in dmesg), but at least nothing ran out of
memory.

Considering the number of people running 2.6.32 and who will be running it
in the future, it still worries me that this is present in 2.6.32 (and
earlier kernels as well).

--
Mikael Abrahamsson email: swmike@swm.pp.se

^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2010-08-31 20:30 UTC | newest]

Thread overview: 29+ messages
2010-08-22  6:13 2.6.34.1 page allocation failure Stan Hoeppner
2010-08-22  6:47 ` Mikael Abrahamsson
2010-08-22 19:51 ` Pekka Enberg
2010-08-22 22:40 ` Christoph Lameter
2010-08-23  9:37 ` Pekka Enberg
2010-08-23 22:35 ` Stan Hoeppner
2010-08-24 17:13 ` Christoph Lameter
2010-08-24 18:03 ` Pekka Enberg
2010-08-24 19:08 ` Stan Hoeppner
2010-08-24 19:21 ` Mikael Abrahamsson
2010-08-29 10:49 ` Pekka Enberg
2010-08-29 12:38 ` Mikael Abrahamsson
2010-08-29 13:17 ` Pekka Enberg
2010-08-29 15:37 ` Mikael Abrahamsson
2010-08-31 20:28 ` Mikael Abrahamsson