* Memory Problem in 2.4.10-pre2 / __alloc_pages failed
@ 2001-08-29 12:07 Stephan von Krawczynski
  2001-08-29 16:47 ` Roger Larsson
                   ` (4 more replies)
  0 siblings, 5 replies; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-08-29 12:07 UTC (permalink / raw)
  To: roger.larsson; +Cc: linux-kernel

Hello,

I managed it again. As with previous 2.4 releases, I was able to make __alloc_pages fail quite easily with a fairly standard test bed:

Hardware: 2 x P-III 1GHz, 1 GB RAM, 29160 U160 SCSI, 36GB HD
Kernel: 2.4.10-pre2 with trace output in mm/page_alloc.c (thanks Roger)

Test:
An exported reiserfs filesystem, onto which files are simply copied from a 2.2.19 NFS client (the files are big, 10-20 MB each).
At the same time I read a CD to HD via xcdroast and run setiathome at nice level 19.

meminfo before test:

        total:    used:    free:  shared: buffers:  cached:
Mem:  921726976 90714112 831012864        0  6696960 36126720
Swap: 271392768        0 271392768
MemTotal:       900124 kB
MemFree:        811536 kB
MemShared:           0 kB
Buffers:          6540 kB
Cached:          35280 kB
SwapCached:          0 kB
Active:           3640 kB
Inact_dirty:     38180 kB
Inact_clean:         0 kB
Inact_target:      824 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900124 kB
LowFree:        811536 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

Everything runs acceptably, but the CPU load is high (6-8). A simple "cat /proc/meminfo" sometimes takes half a minute during the test, and the mouse cannot be moved smoothly the whole time.
When xcdroast finishes reading the CD (at about 1 MB/sec, not very fast indeed), the following shows up:

Aug 29 13:43:34 admin kernel: >] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec8f5>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec826>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
Aug 29 13:43:34 admin kernel: __alloc_pages: 0-order allocation failed (gfp=0x20/0).
Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=0, ...)
Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
Aug 29 13:43:34 admin kernel:    [system_call+51/56] 

meminfo at this point:
        total:    used:    free:  shared: buffers:  cached:
Mem:  921726976 918597632  3129344        0  8036352 812560384
Swap: 271392768        0 271392768
MemTotal:       900124 kB
MemFree:          3056 kB
MemShared:           0 kB
Buffers:          7848 kB
Cached:         793516 kB
SwapCached:          0 kB
Active:          47396 kB
Inact_dirty:    750120 kB
Inact_clean:      3848 kB
Inact_target:     5736 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900124 kB
LowFree:          3056 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

Unfortunately I cannot tell what pid 1207 is, because it is gone by the time I run ps afterwards. This test setup shows VM errors on every 2.4 kernel I have tested so far, though at different points: everything below 10-pre1 fails during the reading/writing, while all the 10-pre-x kernels fail afterwards. If I can provide additional information, please tell me. I am very willing to test anything you like, as long as there is a chance it doesn't corrupt my filesystems ;-)
Another thing worth mentioning: the NFS server fails in this test if I do not use the export option "no_subtree_check". I found this out with very friendly help from Neil Brown. It fails _only_ on exported reiserfs, though. It would be nice if someone could investigate this (maybe Hans?).

BTW, this same test never fails (memory-wise) on 2.2.19, with far less CPU load and lower memory consumption.

Regards, Stephan


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-29 12:07 Memory Problem in 2.4.10-pre2 / __alloc_pages failed Stephan von Krawczynski
@ 2001-08-29 16:47 ` Roger Larsson
  2001-08-29 19:18 ` Stephan von Krawczynski
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 30+ messages in thread
From: Roger Larsson @ 2001-08-29 16:47 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On Wednesday, 29 August 2001 14:07, Stephan von Krawczynski wrote:

> Aug 29 13:43:34 admin kernel: __alloc_pages: 2-order allocation failed
> (gfp=0x20/0). Aug 29 13:43:34 admin kernel: pid=1207;
> __alloc_pages(gfp=0x20, order=2, ...) 
> Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] 
> [__get_free_pages+10/24] [<fdcec845>]
> [<fdcec913>] [<fdceb7d7>] [<fdcec0f5>] [<fdcea589>]
> [ip_local_deliver_finish+0/368]
I think this is the okfn parameter to nf_hook_slow below (0/368),
and that skb_linearize(skb, GFP_ATOMIC) is a much more likely candidate,
especially since it calls kmalloc...

Why does skb_linearize need to be GFP_ATOMIC? Let's see, further down we have
do_softirq... hmm... not a good place to sleep in...

Next question: do we really need to linearize?

Third question: the order of the packet allocation is 2 => 4096 << 2 = 16k, which is
quite big for a packet... (MRU/MTU network settings?)
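
To make that size arithmetic concrete, here is a rough userspace sketch
(my own illustration, not kernel code) of how a packet length turns into a
page order once the data has to live in one contiguous kmalloc'd buffer.
The 256-byte allowance for headroom plus skb_shared_info is an assumed
round number, not a value taken from the 2.4.10-pre2 source:

#include <stdio.h>

#define PAGE_SIZE      4096u
#define SKB_OVERHEAD    256u  /* assumed headroom + skb_shared_info */

/* smallest order such that (PAGE_SIZE << order) holds 'size' bytes */
static unsigned int size_to_order(unsigned int size)
{
	unsigned int order = 0;

	while ((PAGE_SIZE << order) < size)
		order++;
	return order;
}

int main(void)
{
	unsigned int len[] = { 1500, 4096, 9000, 16000 };
	unsigned int i;

	for (i = 0; i < sizeof(len) / sizeof(len[0]); i++) {
		unsigned int size  = len[i] + SKB_OVERHEAD;
		unsigned int order = size_to_order(size);

		printf("packet %5u bytes -> buffer %5u bytes -> order %u (%u KB contiguous)\n",
		       len[i], size, order, (PAGE_SIZE << order) / 1024);
	}
	return 0;
}

Under those assumptions an order-2 (16k) atomic allocation only happens for
packets well beyond a standard 1500-byte MTU, which is why the MRU/MTU
settings are worth checking.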

> [nf_hook_slow+272/404]
> [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
> [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480]
> [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404]
> [ip_rcv+870/944] [ip_rcv_finish+0/480] [net_rx_action+362/628] 
> [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] 
> [sys_ioctl+443/532] [system_call+51/56]

>
> Unfortunately I cannot tell what pid 1207 is, for it is gone when I do a ps
> afterwards.

Change the printout from 'current->pid' to 'current->comm' (format '%s').
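
For reference, a minimal sketch of what that change might look like in the
trace printk (the trace patch is not mainline, so the exact surrounding
code, variable names and format are assumptions):

/* hypothetical line in the __alloc_pages() trace patch */
printk("pid=%d; __alloc_pages(gfp=0x%x, order=%u, ...)\n",
       current->pid, gfp_mask, order);

/* current->comm is the task's command name, a short fixed-size char[],
 * so '%s' prints e.g. "cdda2wav" instead of a pid that may be gone
 * by the time you run ps */
printk("comm=%s; __alloc_pages(gfp=0x%x, order=%u, ...)\n",
       current->comm, gfp_mask, order);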

> This test setup shows vm errors on every 2.4 I tested so far,
> but on various occasions, all 2.4 below 10-pre1 fail during reading /
> writing. All 10-pre-x fail afterwards. If I can provide additional
> information please tell me. I am very willing to test anything you like
> with chances it doesn't corrupt my filesystems ;-) 

No guarantees...

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-29 12:07 Memory Problem in 2.4.10-pre2 / __alloc_pages failed Stephan von Krawczynski
  2001-08-29 16:47 ` Roger Larsson
@ 2001-08-29 19:18 ` Stephan von Krawczynski
  2001-08-30 14:16   ` Stephan von Krawczynski
  2001-08-29 23:36 ` Daniel Phillips
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-08-29 19:18 UTC (permalink / raw)
  To: Roger Larsson; +Cc: linux-kernel

On Wed, 29 Aug 2001 18:47:54 +0200
Roger Larsson <roger.larsson@skelleftea.mail.telia.com> wrote:

> Change the printout from 'current->pid' to 'current->comm' (format '%s').

Ok. I can present this one:

Aug 29 21:13:27 admin kernel: __alloc_pages: 0-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=0, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8845>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 2-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=2, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8845>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=1, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8845>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 0-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=0, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8845>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf8913>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 
Aug 29 21:13:27 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 29 21:13:27 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 29 21:13:27 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 29 21:13:27 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] 
Aug 29 21:13:27 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164] 
Aug 29 21:13:27 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56] 

Let me see if I can produce some others, too. 
I'll be back.

Regards, Stephan


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-29 12:07 Memory Problem in 2.4.10-pre2 / __alloc_pages failed Stephan von Krawczynski
  2001-08-29 16:47 ` Roger Larsson
  2001-08-29 19:18 ` Stephan von Krawczynski
@ 2001-08-29 23:36 ` Daniel Phillips
  2001-08-30 16:49   ` Roger Larsson
  2001-08-30 14:46 ` Stephan von Krawczynski
  2001-08-31 11:06 ` Stephan von Krawczynski
  4 siblings, 1 reply; 30+ messages in thread
From: Daniel Phillips @ 2001-08-29 23:36 UTC (permalink / raw)
  To: Stephan von Krawczynski, roger.larsson; +Cc: linux-kernel

On August 29, 2001 02:07 pm, Stephan von Krawczynski wrote:
> I managed it again. As with previous 2.4-releases I managed to let __alloc_pages fail quite easily with pretty standard test bed:
> 
> Hardware: 2 x P-III 1GHz, 1 GB RAM, 29160 U160 SCSI, 36GB HD
> Kernel: 2.4.10-pre2 with trace output in mm/page_alloc.c (thanks Roger)
> 
> Test:
> exported reiserfs filesystem, simply copying files on it from a 2.2.19 nfs client (files are big 10-20 MB each).
> at the same time I read a CD to HD via xcdroast and run setiathome on nice-level 19.
> 
> Everything runs acceptable, but CPU-Load is high (6-8). Simply "cat /proc/meminfo" takes half a minute sometimes during test. Mouse cannot be moved smoothly during the whole test.
> When xcdroast is finished with reading CD (at about 1 MB/sec speed, not very fast indeed) the following shows up:
> 
> Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
> Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
> Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
> Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
> Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
> Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
> Aug 29 13:43:34 admin kernel:    [system_call+51/56] 

OK, I see what the problem is.  Regular memory users are consuming memory
right down to the emergency reserve limit, beyond which only PF_MEMALLOC
users can go.  Unfortunately, since atomic memory allocators can't wait,
they tend to fail with high frequency in this state.  Duh.

First, there's an effective way to make these particular atomic failures
go away almost entirely.  The atomic memory user (in this case a network
interrupt handler) keeps a list of pages for its private use, starting with
an empty list.  Each time it needs a page it gets it from its private list,
but if that list is empty it gets it from alloc_pages, and when done with
it, returns it to its private list.  The alloc_pages call can still fail of
course, but now it will only fail a few times as it expands its list up to
the size required for normal traffic.  The effect on throughput should be
roughly nothing.
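
Roughly, the private list idea looks like the untested sketch below.  It is
not taken from any real driver; pool_get_page/pool_put_page and the single
pool_lock are just made-up names to show the shape of it:

#include <linux/mm.h>
#include <linux/spinlock.h>

static unsigned long pool_head;		/* first free page, 0 if the pool is empty */
static spinlock_t pool_lock = SPIN_LOCK_UNLOCKED;

static unsigned long pool_get_page(void)
{
	unsigned long flags, page;

	spin_lock_irqsave(&pool_lock, flags);
	page = pool_head;
	if (page)
		pool_head = *(unsigned long *) page;	/* unlink the head page */
	spin_unlock_irqrestore(&pool_lock, flags);

	if (!page)	/* pool empty: fall back to the normal allocator */
		page = __get_free_page(GFP_ATOMIC);
	return page;	/* may still be 0, the caller has to cope */
}

static void pool_put_page(unsigned long page)
{
	unsigned long flags;

	spin_lock_irqsave(&pool_lock, flags);
	*(unsigned long *) page = pool_head;	/* link it back in front */
	pool_head = page;
	spin_unlock_irqrestore(&pool_lock, flags);
}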

Let's try another way of dealing with it.  What I'm trying to do with the
patch below is leave a small reserve of 1/12 of pages->min, above the
emergency reserve, to be consumed by non-PF_MEMALLOC atomic allocators.
Please bear in mind this is completely untested, but would you try it
please and see if the failure frequency goes down?

--- ../2.4.9.clean/mm/page_alloc.c	Thu Aug 16 12:43:02 2001
+++ ./mm/page_alloc.c	Wed Aug 29 23:47:39 2001
@@ -493,6 +493,9 @@
 		}
 
 		/* XXX: is pages_min/4 a good amount to reserve for this? */
+		if (z->free_pages < z->pages_min / 3 && (gfp_mask & __GFP_WAIT) &&
+				!(current->flags & PF_MEMALLOC))
+			continue;
 		if (z->free_pages < z->pages_min / 4 &&
 				!(current->flags & PF_MEMALLOC))
 			continue;

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-29 19:18 ` Stephan von Krawczynski
@ 2001-08-30 14:16   ` Stephan von Krawczynski
  0 siblings, 0 replies; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-08-30 14:16 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: linux-kernel

On Thu, 30 Aug 2001 05:42:14 +0200 (CEST)
Mike Galbraith <mikeg@wen-online.de> wrote:

> On Wed, 29 Aug 2001, Stephan von Krawczynski wrote:
> A small sample with junk like that order 3 GFP_ATOMIC allocation should
> pin the tail on the donkey.

Ok. I produced another one, here it is. This time I'll send only one :-) If it is not sufficient, tell me, I have some dozens left.

Aug 30 16:05:07 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 30 16:05:07 admin kernel: comm=cdda2wav; __alloc_pages(gfp=0x20, order=3, ...)
Aug 30 16:05:07 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>] 
Aug 30 16:05:07 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [send_sigio_to_task+226/236] [ip_rcv_finish+0/480] [nf_iterate+48/132] [ip_rcv_finish+0/480] 
Aug 30 16:05:07 admin kernel:    [ip_rcv_finish+0/480] [nf_hook_slow+215/404] [send_sigio+88/168] [update_wall_time+11/52] [timer_bh+54/700] [tqueue_bh+22/28] 
Aug 30 16:05:07 admin kernel:    [bh_action+76/132] [tasklet_hi_action+110/156] [update_process_times+32/148] [smp_apic_timer_interrupt+241/276] [sys_ioctl+443/532] [system_call+51/56] 

Trace; c012beee <_alloc_pages+16/18>
Trace; c012c1aa <__get_free_pages+a/18>
Trace; fdcf8826 <[sg]sg_low_malloc+13e/1a4>
Trace; fdcf8913 <[sg]sg_malloc+87/120>
Trace; fdcf77d7 <[sg]sg_build_indi+16f/1a8>
Trace; fdcf80f5 <[sg]sg_build_reserve+25/48>
Trace; fdcf6589 <[sg]sg_ioctl+6c5/ae4>
Trace; c01403d6 <send_sigio_to_task+e2/ec>
Trace; c01d4ba0 <ip_rcv_finish+0/1e0>
Trace; c01cec8c <nf_iterate+30/84>   
Trace; c01d4ba0 <ip_rcv_finish+0/1e0>
Trace; c01d4ba0 <ip_rcv_finish+0/1e0>
Trace; c01cefdf <nf_hook_slow+d7/194>
Trace; c0140438 <send_sigio+58/a8>   
Trace; c011bb87 <update_wall_time+b/34>
Trace; c011bd96 <timer_bh+36/2bc>
Trace; c011b89e <tqueue_bh+16/1c>
Trace; c0118e74 <bh_action+4c/84>
Trace; c0118d5a <tasklet_hi_action+6e/9c>
Trace; c011bca4 <update_process_times+20/94>
Trace; c011019d <smp_apic_timer_interrupt+f1/114>
Trace; c0140937 <sys_ioctl+1bb/214>
Trace; c0106d1b <system_call+33/38>

Regards, Stephan


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-29 12:07 Memory Problem in 2.4.10-pre2 / __alloc_pages failed Stephan von Krawczynski
                   ` (2 preceding siblings ...)
  2001-08-29 23:36 ` Daniel Phillips
@ 2001-08-30 14:46 ` Stephan von Krawczynski
  2001-08-30 18:02   ` Daniel Phillips
                     ` (2 more replies)
  2001-08-31 11:06 ` Stephan von Krawczynski
  4 siblings, 3 replies; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-08-30 14:46 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Thu, 30 Aug 2001 01:36:10 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> > Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
> > Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
> > Aug 29 13:43:34 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>] [<fdcec913>] [<fdceb7d7>] 
> > Aug 29 13:43:34 admin kernel:    [<fdcec0f5>] [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404] [ip_rcv_finish+0/480] [ip_local_deliver+436/444] 
> > Aug 29 13:43:34 admin kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480] [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404] [ip_rcv+870/944] 
> > Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480] [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236] [ret_from_intr+0/7] [sys_ioctl+443/532] 
> > Aug 29 13:43:34 admin kernel:    [system_call+51/56] 
> 
> OK, I see what the problem is.  Regular memory users are consuming memory
> right down to the emergency reserve limit, beyond which only PF_MEMALLOC
> users can go.  Unfortunately, since atomic memory allocators can't wait,
> they tend to fail with high frequency in this state.  Duh.

Aehm, excuse my ignorance, but why is a "regular memory user" effectively _consuming_ the memory? I mean how does CD reading and NFS writing _consume_ memory (to such an extent)? Where _is_ this memory gone? If I stop I/O the memory is still gone. I can make at least quite a bit appear again as free, if I delete all the files I have copied via NFS before. After that I receive lots of free mem right away. Why? I would not expect the memory as validly consumed by knfsd during writing files. I mean what for? I guess from "Inact_dirty" list being _huge_, that the mem is in fact already freed again by the original allocator, but the vm holds it, until ... well I don't know. Or am I wrong?

> First, there's an effective way to make these particular atomic failures
> go away almost entirely.  The atomic memory user (in this case a network
> interrupt handler) keeps a list of pages for its private use, starting with
> an empty list.  Each time it needs a page it gets it from its private list,
> but if that list is empty it gets it from alloc_pages, and when done with
> it, returns it to its private list.  The alloc_pages call can still fail of
> course, but now it will only fail a few times as it expands its list up to
> the size required for normal traffic.  The effect on throughput should be
> roughly nothing.

Uh, I would not do that. To a shared memory pool system this is really contra-productive (is this english?). You simply let the mem vanish in some private pools, so only _one_ process (or whatever) can use it. To tell the full truth, you do not even know, if he really uses it. If he allocated it in a heavy load situation and does not give it back (or has his own weird strategy of returning it) you run out of mem only because of one flaky driver. It will not be easy as external spectator of a driver to find out if it performs well or has some memory leakage inside. You simply can't tell.
In fact I do trust kernel mem management more :-), even if it isn't performing very well currently.

> Let's try another way of dealing with it.  What I'm trying to do with the
> patch below is leave a small reserve of 1/12 of pages->min, above the
> emergency reserve, to be consumed by non-PF_MEMALLOC atomic allocators.
> Please bear in mind this is completely untested, but would you try it
> please and see if the failure frequency goes down?

Well, I will try. But must honestly mention, that the whole idea looks like a patch to patch a patch. Especially because nobody can tell what the right reserve may be. I guess this may very much depend on host layout. How do you want to make an acceptable kernel-patch for all the world out of this idea? The idea sounds obvious, but looks not very helpful for solving the basic problem.

Regards,
Stephan


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-29 23:36 ` Daniel Phillips
@ 2001-08-30 16:49   ` Roger Larsson
  0 siblings, 0 replies; 30+ messages in thread
From: Roger Larsson @ 2001-08-30 16:49 UTC (permalink / raw)
  To: Daniel Phillips, Stephan von Krawczynski; +Cc: linux-kernel

On Thursday, 30 August 2001 01:36, Daniel Phillips wrote:
> On August 29, 2001 02:07 pm, Stephan von Krawczynski wrote:
> > I managed it again. As with previous 2.4-releases I managed to let
> > __alloc_pages fail quite easily with pretty standard test bed:
> >
> > Hardware: 2 x P-III 1GHz, 1 GB RAM, 29160 U160 SCSI, 36GB HD
> > Kernel: 2.4.10-pre2 with trace output in mm/page_alloc.c (thanks Roger)
> >
> > Test:
> > exported reiserfs filesystem, simply copying files on it from a 2.2.19
> > nfs client (files are big 10-20 MB each). at the same time I read a CD to
> > HD via xcdroast and run setiathome on nice-level 19.
> >
> > Everything runs acceptable, but CPU-Load is high (6-8). Simply "cat
> > /proc/meminfo" takes half a minute sometimes during test. Mouse cannot be
> > moved smoothly during the whole test. When xcdroast is finished with
> > reading CD (at about 1 MB/sec speed, not very fast indeed) the following
> > shows up:
> >
> > Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed
> > (gfp=0x20/0). Aug 29 13:43:34 admin kernel: pid=1207;
> > __alloc_pages(gfp=0x20, order=1, ...) Aug 29 13:43:34 admin kernel: Call
> > Trace: [_alloc_pages+22/24] [__get_free_pages+10/24] [<fdcec845>]
> > [<fdcec913>] [<fdceb7d7>] Aug 29 13:43:34 admin kernel:    [<fdcec0f5>]
> > [<fdcea589>] [ip_local_deliver_finish+0/368] [nf_hook_slow+272/404]
> > [ip_rcv_finish+0/480] [ip_local_deliver+436/444] Aug 29 13:43:34 admin
> > kernel:    [ip_local_deliver_finish+0/368] [ip_rcv_finish+0/480]
> > [ip_rcv_finish+413/480] [ip_rcv_finish+0/480] [nf_hook_slow+272/404]
> > [ip_rcv+870/944] Aug 29 13:43:34 admin kernel:    [ip_rcv_finish+0/480]
> > [net_rx_action+362/628] [do_softirq+111/204] [do_IRQ+219/236]
> > [ret_from_intr+0/7] [sys_ioctl+443/532] Aug 29 13:43:34 admin kernel:   
> > [system_call+51/56]
>
> OK, I see what the problem is.  Regular memory users are consuming memory
> right down to the emergency reserve limit, beyond which only PF_MEMALLOC
> users can go.  Unfortunately, since atomic memory allocators can't wait,
> they tend to fail with high frequency in this state.  Duh.
>
> First, there's an effective way to make these particular atomic failures
> go away almost entirely.  The atomic memory user (in this case a network
> interrupt handler) keeps a list of pages for its private use, starting with
> an empty list.  Each time it needs a page it gets it from its private list,
> but if that list is empty it gets it from alloc_pages, and when done with
> it, returns it to its private list.  The alloc_pages call can still fail of
> course, but now it will only fail a few times as it expands its list up to
> the size required for normal traffic.  The effect on throughput should be
> roughly nothing.

Looking at the code - sg already has some private memory... but it does not
use it in this case. (see sg_low_malloc)

But it tries to alloc 8 (2**3) pages atomically. That is quite a bit of memory
(32k!), twice as much memory as the first computer I used had.
Since in this case we do not see errors with lower orders, try 4*4096.

IN FILE sg.h

#define SG_SCATTER_SZ (8 * 4096)  /* PAGE_SIZE not available to user */
/* Largest size (in bytes) a single scatter-gather list element can have.
   The value must be a power of 2 and <= (PAGE_SIZE * 32) [131072 bytes on
   i386]. The minimum value is PAGE_SIZE. If scatter-gather not supported
   by adapter then this value is the largest data block that can be
   read/written by a single scsi command. The user can find the value of
   PAGE_SIZE by calling getpagesize() defined in unistd.h . */
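
Just to illustrate what falling back to lower orders would mean (an untested
sketch, not the actual sg code; alloc_scatter_chunk is a made-up helper):

#include <linux/mm.h>

/* Try the big chunk first, then smaller ones, so the scatter list can be
 * built from less contiguous pieces when memory is fragmented.  Returns 0
 * only if not even a single page is free. */
static unsigned long alloc_scatter_chunk(int max_order, int *order_out)
{
	int order;

	for (order = max_order; order >= 0; order--) {
		unsigned long chunk = __get_free_pages(GFP_ATOMIC, order);
		if (chunk) {
			*order_out = order;	/* caller records the chunk size */
			return chunk;
		}
	}
	return 0;
}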


/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-30 14:46 ` Stephan von Krawczynski
@ 2001-08-30 18:02   ` Daniel Phillips
  2001-08-30 23:53   ` [PATCH] __alloc_pages cleanup -R6 Was: " Roger Larsson
  2001-08-31 10:32   ` Stephan von Krawczynski
  2 siblings, 0 replies; 30+ messages in thread
From: Daniel Phillips @ 2001-08-30 18:02 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On August 30, 2001 04:46 pm, Stephan von Krawczynski wrote:
> On Thu, 30 Aug 2001 01:36:10 +0200
> Daniel Phillips <phillips@bonn-fries.net> wrote:
> 
> > > Aug 29 13:43:34 admin kernel: __alloc_pages: 1-order allocation failed (gfp=0x20/0).
> > > Aug 29 13:43:34 admin kernel: pid=1207; __alloc_pages(gfp=0x20, order=1, ...)
> >
> > OK, I see what the problem is.  Regular memory users are consuming memory
> > right down to the emergency reserve limit, beyond which only PF_MEMALLOC
> > users can go.  Unfortunately, since atomic memory allocators can't wait,
> > they tend to fail with high frequency in this state.  Duh.
> 
> Aehm, excuse my ignorance, but why is a "regular memory user" effectively
> _consuming_ the memory? I mean how does CD reading and NFS writing 
> _consume_ memory (to such an extent)? Where _is_ this memory gone?

In essence, the memory isn't consumed, it's bound to data.  Once we have
data sitting on a page it's to our advantage to try to keep the page
around as long as possible so it doesn't have to be re-read if we need it
again.  So the normal situation is, we run with as little memory actually
free and empty of data as possible, around 1-2%.

But there's a distinction between "free" and "freeable".  Of the remaining
98% of memory, most of it is freeable.  This is fine if your memory
request is able to wait for the vm to go out and free some.  This happens
often, and while the preliminary scanning work is being done the system
tends to sit there in its absolute rock-bottom memory state (zone->pages
== zone->min).  Along comes an interrupt with a memory allocation request
and it must fail, because it can't wait.
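
In code the difference is nothing more than the gfp flags (a trivial,
untested sketch; grab_a_page is a made-up name):

#include <linux/mm.h>

struct page *grab_a_page(int can_sleep)
{
	if (!can_sleep)
		/* interrupt context: take only what is already free,
		 * fail if we are sitting at the rock-bottom state */
		return alloc_pages(GFP_ATOMIC, 0);

	/* process context: may block while the vm turns "freeable"
	 * pages (clean cache and such) into actually free ones */
	return alloc_pages(GFP_KERNEL, 0);
}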

> If I 
> stop I/O the memory is still gone. I can make at least quite a bit appear 
> again as free, if I delete all the files I have copied via NFS before. 
> After that I receive lots of free mem right away. Why?

Because the cached data pages are forcibly removed from the page cache and
freed, since the data can never be referenced again.

> I would not expect 
> the memory as validly consumed by knfsd during writing files. I mean what 
> for? I guess from "Inact_dirty" list being _huge_, that the mem is in fact 
> already freed again by the original allocator, but the vm holds it, until 
> ... well I don't know. Or am I wrong?

I hope it's clearer now.

> > First, there's an effective way to make these particular atomic failures
> > go away almost entirely.  The atomic memory user (in this case a network
> > interrupt handler) keeps a list of pages for its private use, starting with
> > an empty list.  Each time it needs a page it gets it from its private list,
> > but if that list is empty it gets it from alloc_pages, and when done with
> > it, returns it to its private list.  The alloc_pages call can still fail of
> > course, but now it will only fail a few times as it expands its list up to
> > the size required for normal traffic.  The effect on throughput should be
> > roughly nothing.
> 
> Uh, I would not do that. To a shared memory pool system this is really 
> contra-productive (is this english?). You simply let the mem vanish in some 
> private pools, so only _one_ process (or whatever) can use it.

It's not very much memory, that's the point.  But it sure hurts if it's not
available at the time it's needed.

> To tell the 
> full truth, you do not even know, if he really uses it. If he allocated it 
> in a heavy load situation and does not give it back (or has his own weird 
> strategy of returning it) you run out of mem only because of one flaky 
> driver. It will not be easy as external spectator of a driver to find out 
> if it performs well or has some memory leakage inside. You simply can't 
> tell.  In fact I do trust kernel mem management more :-), even if it isn't 
> performing very well currently.

Keep in mind we're solving an impossible problem here.  We have a task that
can't wait, but needs an unknown (to the system) amount of memory.  We dance
around this by decreeing that the allocation can fail, and the interrupt
handler has to be able to deal with that.  Well, that works, but often not
very fast.  In the network subsystem it translates into dropped packets.
That's very, very bad if it happens often.  So it's ok to use a little memory
in a less-than-frugal way if we reduce the frequency of packet dropping.
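
"Dropped" is as blunt as it sounds; a typical rx path can do nothing better
than the untested sketch below when the atomic allocation fails (generic
illustration, not any particular driver; rx_dropped is a made-up counter):

#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/etherdevice.h>
#include <linux/string.h>

static unsigned long rx_dropped;

static void rx_one_frame(struct net_device *dev, void *data, int len)
{
	struct sk_buff *skb = dev_alloc_skb(len + 2);	/* GFP_ATOMIC inside */

	if (!skb) {
		rx_dropped++;		/* can't wait here, so drop the frame */
		return;
	}
	skb->dev = dev;
	skb_reserve(skb, 2);				/* align the IP header */
	memcpy(skb_put(skb, len), data, len);
	skb->protocol = eth_type_trans(skb, dev);
	netif_rx(skb);
}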

> > Let's try another way of dealing with it.  What I'm trying to do with the
> > patch below is leave a small reserve of 1/12 of pages->min, above the
> > emergency reserve, to be consumed by non-PF_MEMALLOC atomic allocators.
> > Please bear in mind this is completely untested, but would you try it
> > please and see if the failure frequency goes down?
> 
> Well, I will try. But must honestly mention, that the whole idea looks like 
> a patch to patch a patch. Especially because nobody can tell what the right 
> reserve may be. I guess this may very much depend on host layout. How do 
> you want to make an acceptable kernel-patch for all the world out of this 
> idea? The idea sounds obvious, but looks not very helpful for solving the 
> basic problem.

Let's see what it does.  I'd be the last to claim it's a pretty or efficient
way to express the idea.  The immediate goal is to determine if I analyzed 
the problem correctly, and if the logic of the patch is correct.

--
Daniel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] __alloc_pages cleanup -R6 Was: Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-30 14:46 ` Stephan von Krawczynski
  2001-08-30 18:02   ` Daniel Phillips
@ 2001-08-30 23:53   ` Roger Larsson
  2001-08-31  7:43     ` Russell King
  2001-08-31 10:32   ` Stephan von Krawczynski
  2 siblings, 1 reply; 30+ messages in thread
From: Roger Larsson @ 2001-08-30 23:53 UTC (permalink / raw)
  To: Stephan von Krawczynski, Daniel Phillips; +Cc: linux-kernel, linux-mm

Hi,

A new version of the __alloc_pages{_limit} cleanup.
This time for 2.4.10-pre2

Some ideas implemented in this code:
* Reserve memory below min for atomic and recursive allocations.
* When being min..low on free pages, free one more than you want to allocate.
* When being low..high on free pages, free one less than wanted.
* When above high - don't free anything.
* First select zones with more than high free memory.
* Then those with more than high 'free + inactive_clean - inactive_target'
* When freeing - do it properly. Don't steal direct reclaimed pages
  (questionable due to locking issues on SMP).

This will "regulate" the number of FREE_PAGES towards PAGES_LOW. The
probability of success for an atomic high-order alloc is in some way
proportional to the number of free pages.

I have not been able to notice any performance degradation with this
patch, but I do not have an SMP PC...

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden


*******************************************
Patch prepared by: roger.larsson@norran.net
Name of file: /home/roger/patches/patch-2.4.10-pre2-alloc_pages_limit-R6b

--- linux/mm/page_alloc.c.orig	Thu Aug 30 23:20:01 2001
+++ linux/mm/page_alloc.c	Fri Aug 31 00:56:20 2001
@@ -212,9 +212,13 @@
 	return NULL;
 }
 
-#define PAGES_MIN	0
-#define PAGES_LOW	1
-#define PAGES_HIGH	2
+#define PAGES_MEMALLOC    0
+#define PAGES_CRITICAL    1
+#define PAGES_MIN_FREE	  2
+#define PAGES_NORMAL_FREE 3
+#define PAGES_HIGH_FREE   4
+#define PAGES_HIGH        5
+#define PAGES_INACTIVE_TARGET    6
 
 /*
  * This function does the dirty work for __alloc_pages
@@ -228,7 +232,7 @@
 
 	for (;;) {
 		zone_t *z = *(zone++);
-		unsigned long water_mark;
+		unsigned long water_mark, free_min, pages_to_reclaim;
 
 		if (!z)
 			break;
@@ -239,26 +243,85 @@
 		 * We allocate if the number of free + inactive_clean
 		 * pages is above the watermark.
 		 */
+
+		free_min = z->pages_min;
+
+
 		switch (limit) {
+			case PAGES_MEMALLOC:
+				free_min = 1;
+				water_mark = 0; /* there might be inactive_clean pages */
+				break;
+			case PAGES_CRITICAL:
+				/* XXX: is pages_min/4 a good amount to reserve for this? */
+				free_min = water_mark = z->pages_min / 4;
+				break;
 			default:
-			case PAGES_MIN:
+				printk(KERN_ERR 
+				       "__alloc_pages_limit unknown limit (%d) using default\n",
+				       limit);
+			case PAGES_MIN_FREE:
 				water_mark = z->pages_min;
 				break;
-			case PAGES_LOW:
-				water_mark = z->pages_low;
+			case PAGES_NORMAL_FREE:
+				water_mark = (z->pages_min + z->pages_low) / 2;
 				break;
-			case PAGES_HIGH:
+			case PAGES_INACTIVE_TARGET:
+				water_mark = z->pages_high +
+					inactive_target -  z->inactive_clean_pages;
+				break;
+			case PAGES_HIGH_FREE:
 				water_mark = z->pages_high;
+				break;
+			case PAGES_HIGH:
+				water_mark = z->pages_high - z->inactive_clean_pages;
+				break;
 		}
 
-		if (z->free_pages + z->inactive_clean_pages >= water_mark) {
-			struct page *page = NULL;
-			/* If possible, reclaim a page directly. */
-			if (direct_reclaim)
-				page = reclaim_page(z);
-			/* If that fails, fall back to rmqueue. */
-			if (!page)
-				page = rmqueue(z, order);
+
+
+		if (z->free_pages < water_mark) 
+			continue;
+		
+
+		/*
+		 * Reclaim a page from the inactive_clean list.
+		 * low water mark. Free all reclaimed pages to
+		 * give them a chance to merge to higher orders.
+		 */
+		if (direct_reclaim) {
+			/* Our goal for free pages is z->pages_low
+			 * if there are less try to free one more than needed
+			 * when more, free one less
+			 */
+			pages_to_reclaim = 1 << order; /* pages to try to reclaim at free_pages level */
+			if (z->free_pages < z->pages_low)
+				pages_to_reclaim++;
+			else if (z->free_pages < z->pages_high)
+				pages_to_reclaim--;
+			else /* free >= high */
+				pages_to_reclaim = 0;
+
+			while (z->inactive_clean_pages &&
+			       (z->free_pages < z->pages_min ||
+				pages_to_reclaim--)) { /* note: lazy evaluation! decr. only when free > min */
+				struct page *reclaim = reclaim_page(z);
+				if (reclaim) {
+					__free_page(reclaim);
+				}
+				else {
+					if (z->inactive_clean_pages > 0)
+						printk(KERN_ERR "reclaim_pages failed but there are inactive_clean_pages\n");
+
+					break;
+				}
+			}
+		}
+				
+		/* Always alloc via rmqueue */
+		if (z->free_pages >= free_min)
+		{
+			struct page *page = rmqueue(z, order);
 			if (page)
 				return page;
 		}
@@ -268,6 +331,7 @@
 	return NULL;
 }
 
+
 #ifndef CONFIG_DISCONTIGMEM
 struct page *_alloc_pages(unsigned int gfp_mask, unsigned long order)
 {
@@ -281,7 +345,6 @@
  */
 struct page * __alloc_pages(unsigned int gfp_mask, unsigned long order, zonelist_t *zonelist)
 {
-	zone_t **zone;
 	int direct_reclaim = 0;
 	struct page * page;
 
@@ -291,6 +354,14 @@
 	memory_pressure++;
 
 	/*
+	 * To get a hint on who is requesting higher order atomically.
+	 */
+	if (order > 0 && !(gfp_mask & __GFP_WAIT)) {
+		printk("%s; __alloc_pages(gfp=0x%x, order=%ld, ...)\n", current->comm, gfp_mask, order);
+		show_trace(NULL);
+	}
+	  
+	/*
 	 * (If anyone calls gfp from interrupts nonatomically then it
 	 * will sooner or later tripped up by a schedule().)
 	 *
@@ -299,70 +370,69 @@
 	 */
 
 	/*
-	 * Can we take pages directly from the inactive_clean
-	 * list?
-	 */
-	if (order == 0 && (gfp_mask & __GFP_WAIT))
-		direct_reclaim = 1;
-
-try_again:
-	/*
 	 * First, see if we have any zones with lots of free memory.
 	 *
 	 * We allocate free memory first because it doesn't contain
 	 * any data ... DUH!
 	 */
-	zone = zonelist->zones;
-	for (;;) {
-		zone_t *z = *(zone++);
-		if (!z)
-			break;
-		if (!z->size)
-			BUG();
+	page = __alloc_pages_limit(zonelist, order, PAGES_HIGH_FREE, 0);
+	if (page)
+		return page;
 
-		if (z->free_pages >= z->pages_low) {
-			page = rmqueue(z, order);
-			if (page)
-				return page;
-		} else if (z->free_pages < z->pages_min &&
-					waitqueue_active(&kreclaimd_wait)) {
-				wake_up_interruptible(&kreclaimd_wait);
-		}
-	}
+	/*
+	 * Can we take pages directly from the inactive_clean
+	 * list? __alloc_pages_limit now handles any 'order'.
+	 */
+	if (gfp_mask & __GFP_WAIT)
+		direct_reclaim = 1;
+
+	/* Lots of free and inactive memory? i.e. more than target for
+	 * the next second.
+	 */
+	page = __alloc_pages_limit(zonelist, order, PAGES_INACTIVE_TARGET, direct_reclaim);
+	if (page)
+		return page;
 
 	/*
-	 * Try to allocate a page from a zone with a HIGH
-	 * amount of free + inactive_clean pages.
+	 * Hmm. Too few pages inactive to reach our inactive_target.
+	 *
+	 * We wake up kswapd, in the hope that kswapd will
+	 * resolve this situation before memory gets tight.
 	 *
-	 * If there is a lot of activity, inactive_target
-	 * will be high and we'll have a good chance of
-	 * finding a page using the HIGH limit.
 	 */
+
+	wakeup_kswapd();
+
+
 	page = __alloc_pages_limit(zonelist, order, PAGES_HIGH, direct_reclaim);
 	if (page)
 		return page;
 
 	/*
-	 * Then try to allocate a page from a zone with more
-	 * than zone->pages_low free + inactive_clean pages.
+	 * Then try to allocate a page from a zone with slightly less
+	 * than zone->pages_low free pages. Since this is the goal
+	 * of free pages this alloc will dynamically change among
+	 * zones.
 	 *
 	 * When the working set is very large and VM activity
 	 * is low, we're most likely to have our allocation
 	 * succeed here.
 	 */
-	page = __alloc_pages_limit(zonelist, order, PAGES_LOW, direct_reclaim);
+try_again:
+	page = __alloc_pages_limit(zonelist, order, PAGES_NORMAL_FREE, direct_reclaim);
 	if (page)
 		return page;
 
-	/*
-	 * OK, none of the zones on our zonelist has lots
-	 * of pages free.
-	 *
-	 * We wake up kswapd, in the hope that kswapd will
-	 * resolve this situation before memory gets tight.
-	 *
-	 * We also yield the CPU, because that:
-	 * - gives kswapd a chance to do something
+
+	/* "all" zones has less than NORMAL free, i.e. our reclaiming in __alloc_pages_limit
+	 * has not kept up with demand, possibly too few allocs with reclaim
+	 */
+	if (waitqueue_active(&kreclaimd_wait)) {
+		wake_up_interruptible(&kreclaimd_wait);
+	}
+
+	/* We also yield the CPU, because that:
+	 * - gives kswapd and kreclaimd a chance to do something
 	 * - slows down allocations, in particular the
 	 *   allocations from the fast allocator that's
 	 *   causing the problems ...
@@ -371,13 +441,13 @@
 	 * - if we don't have __GFP_IO set, kswapd may be
 	 *   able to free some memory we can't free ourselves
 	 */
-	wakeup_kswapd();
 	if (gfp_mask & __GFP_WAIT) {
 		__set_current_state(TASK_RUNNING);
 		current->policy |= SCHED_YIELD;
 		schedule();
 	}
 
+
 	/*
 	 * After waking up kswapd, we try to allocate a page
 	 * from any zone which isn't critical yet.
@@ -385,7 +455,7 @@
 	 * Kswapd should, in most situations, bring the situation
 	 * back to normal in no time.
 	 */
-	page = __alloc_pages_limit(zonelist, order, PAGES_MIN, direct_reclaim);
+	page = __alloc_pages_limit(zonelist, order, PAGES_MIN_FREE, direct_reclaim);
 	if (page)
 		return page;
 
@@ -398,40 +468,21 @@
 	 * - we're /really/ tight on memory
 	 * 	--> try to free pages ourselves with page_launder
 	 */
-	if (!(current->flags & PF_MEMALLOC)) {
+	if (!(current->flags & PF_MEMALLOC) &&
+	    (gfp_mask & __GFP_WAIT)) { /* implies direct_reclaim==1 */
 		/*
-		 * Are we dealing with a higher order allocation?
-		 *
-		 * Move pages from the inactive_clean to the free list
-		 * in the hope of creating a large, physically contiguous
-		 * piece of free memory.
+		 * Move pages from the inactive_dirty to the inactive_clean
 		 */
-		if (order > 0 && (gfp_mask & __GFP_WAIT)) {
-			zone = zonelist->zones;
-			/* First, clean some dirty pages. */
-			current->flags |= PF_MEMALLOC;
-			page_launder(gfp_mask, 1);
-			current->flags &= ~PF_MEMALLOC;
-			for (;;) {
-				zone_t *z = *(zone++);
-				if (!z)
-					break;
-				if (!z->size)
-					continue;
-				while (z->inactive_clean_pages) {
-					struct page * page;
-					/* Move one page to the free list. */
-					page = reclaim_page(z);
-					if (!page)
-						break;
-					__free_page(page);
-					/* Try if the allocation succeeds. */
-					page = rmqueue(z, order);
-					if (page)
-						return page;
-				}
-			}
-		}
+
+		/* First, clean some dirty pages. */
+		current->flags |= PF_MEMALLOC;
+		page_launder(gfp_mask, 1);
+		current->flags &= ~PF_MEMALLOC;
+
+		page = __alloc_pages_limit(zonelist, order, PAGES_MIN_FREE, direct_reclaim);
+		if (page)
+			return page;
+
 		/*
 		 * When we arrive here, we are really tight on memory.
 		 * Since kswapd didn't succeed in freeing pages for us,
@@ -447,22 +498,23 @@
 		 * any progress freeing pages, in that case it's better
 		 * to give up than to deadlock the kernel looping here.
 		 */
-		if (gfp_mask & __GFP_WAIT) {
-			if (!order || free_shortage()) {
-				int progress = try_to_free_pages(gfp_mask);
-				if (progress || (gfp_mask & __GFP_FS))
-					goto try_again;
-				/*
-				 * Fail in case no progress was made and the
-				 * allocation may not be able to block on IO.
-				 */
-				return NULL;
-			}
+		if (!order || free_shortage()) {
+			int progress = try_to_free_pages(gfp_mask);
+			if (progress || (gfp_mask & __GFP_FS))
+				goto try_again;
 		}
+
+		/*
+		 * Fail in case no further progress can be made.
+		 */
+		return NULL;
 	}
 
 	/*
-	 * Final phase: allocate anything we can!
+	 * Final phase: atomic and recursive only - allocate anything we can!
+	 *
+	 * Note: very high order allocs are not that important and are unlikely
+	 * to succeed with this anyway.
 	 *
 	 * Higher order allocations, GFP_ATOMIC allocations and
 	 * recursive allocations (PF_MEMALLOC) end up here.
@@ -471,39 +523,18 @@
 	 * in the system, otherwise it would be just too easy to
 	 * deadlock the system...
 	 */
-	zone = zonelist->zones;
-	for (;;) {
-		zone_t *z = *(zone++);
-		struct page * page = NULL;
-		if (!z)
-			break;
-		if (!z->size)
-			BUG();
-
-		/*
-		 * SUBTLE: direct_reclaim is only possible if the task
-		 * becomes PF_MEMALLOC while looping above. This will
-		 * happen when the OOM killer selects this task for
-		 * instant execution...
-		 */
-		if (direct_reclaim) {
-			page = reclaim_page(z);
-			if (page)
-				return page;
-		}
-
-		/* XXX: is pages_min/4 a good amount to reserve for this? */
-		if (z->free_pages < z->pages_min / 4 &&
-				!(current->flags & PF_MEMALLOC))
-			continue;
-		page = rmqueue(z, order);
-		if (page)
-			return page;
-	}
-
+ 	page = __alloc_pages_limit(zonelist, order,
+ 				   current->flags & PF_MEMALLOC 
+				   ? PAGES_MEMALLOC : PAGES_CRITICAL,
+ 				   direct_reclaim); 
+ 	if (page)
+ 		return page;
+  
 	/* No luck.. */
-	printk(KERN_ERR "__alloc_pages: %lu-order allocation failed (gfp=0x%x/%i).\n",
-		order, gfp_mask, !!(current->flags & PF_MEMALLOC));
+	printk(KERN_ERR
+	       "%s; __alloc_pages: %lu-order allocation failed. (gfp=0x%x/%d)\n",
+	       current->comm, order, gfp_mask,
+	       !!(current->flags & PF_MEMALLOC));
 	return NULL;
 }
 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] __alloc_pages cleanup -R6 Was: Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-30 23:53   ` [PATCH] __alloc_pages cleanup -R6 Was: " Roger Larsson
@ 2001-08-31  7:43     ` Russell King
  2001-08-31 23:22       ` Roger Larsson
  0 siblings, 1 reply; 30+ messages in thread
From: Russell King @ 2001-08-31  7:43 UTC (permalink / raw)
  To: Roger Larsson
  Cc: Stephan von Krawczynski, Daniel Phillips, linux-kernel, linux-mm

On Fri, Aug 31, 2001 at 01:53:24AM +0200, Roger Larsson wrote:
> Some ideas implemented in this code:
> * Reserve memory below min for atomic and recursive allocations.
> * When being min..low on free pages, free one more than you want to allocate.
> * When being low..high on free pages, free one less than wanted.
> * When above high - don't free anything.
> * First select zones with more than high free memory.
> * Then those with more than high 'free + inactive_clean - inactive_target'
> * When freeing - do it properly. Don't steal direct reclaimed pages

Hmm, I wonder.

I have a 1MB DMA zone, and 31MB of normal memory.

The machine has been running lots of programs for some time, but not under
any VM pressure.  I now come to open a device which requires 64K in 8K
pages from the DMA zone.  What happens?

I suspect that the chances of it failing will be significantly higher with
this algorithm - do you have any thoughts for this?

I don't think we should select the allocation zone based purely on
how much free memory it contains, but also on whether it's special (like the DMA zone).

You can't clean in-use slab pages out on demand like you can for fs
cache/user pages.

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-30 14:46 ` Stephan von Krawczynski
  2001-08-30 18:02   ` Daniel Phillips
  2001-08-30 23:53   ` [PATCH] __alloc_pages cleanup -R6 Was: " Roger Larsson
@ 2001-08-31 10:32   ` Stephan von Krawczynski
  2 siblings, 0 replies; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-08-31 10:32 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Thu, 30 Aug 2001 20:02:55 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> In essence, the memory isn't consumed, it's bound to data.  Once we have
> data sitting on a page it's to our advantage to try to keep the page
> around as long as possible so it doesn't have to be re-read if we need it
> again.  So the normal situation is, we run with as little memory actually
> free and empty of data as possible, around 1-2%.
> 
> But there's a distinction between "free" and "freeable".  Of the remaining
> 98% of memory, most of it is freeable.  This is fine if your memory
> request is able to wait for the vm to go out and free some.  This happens
> often, and while the preliminary scanning work is being done the system
> tends to sit there in its absolute rock-bottom memory state (zone->pages
> == zone->min).  Along comes an interrupt with a memory allocation request
> and it must fail, because it can't wait.

Well, I do understand the strategy, only I tend to believe it is bs. The reason is pretty simple: with this strategy you cannot even add physical memory to your host if it runs low, because the system "uses" it for caching that is _mostly not needed_. No matter how much money you spend, you will always run low on memory and will always have allocation failures, and performance drawbacks because of this. This _is_ bs. Furthermore, your 1-2% free mem obviously doesn't take mem fragmentation into account. I checked my setup and found out that I really do have around 30 MB free mem, but the sg driver fails to allocate a ridiculous 32 KB. Tell me honestly: do you think this makes sense?
There is another thing that is not taken into account: what is the runtime situation while performing mem functions?
What I want to say is that it is probably less time-critical to do a _free_ than to do an _alloc_. On _alloc_ somebody needs mem and possibly _waits_ for actions to be done. On _free_ it is far more likely that you are in a cleanup and exit situation. But your strategy moves the work from the _free_ situation to the _alloc_ situation, meaning strategically you have to wait longer, even if _alloc_ and _free_ take the same time summed up. A quick and handsome system should give away mem _fast_ and clean things up when a user _expects_ a cleanup, not vice versa.
This looks like w*doze to me: do the wrong thing at the wrong time.

> > Uh, I would not do that. To a shared memory pool system this is really 
> > contra-productive (is this english?). You simply let the mem vanish in some 
> > private pools, so only _one_ process (or whatever) can use it.
> 
> It's not very much memory, that's the point.  But it sure hurts if it's not
> available at the time it's needed.

Well, you name it.

> Keep in mind we're solving an impossible problem here.  We have a task that
> can't wait, but needs an unknown (to the system) amount of memory.  We dance
> around this by decreeing that the allocation can fail, and the interrupt
> handler has to be able to deal with that.  Well, that works, but often not
> very fast.  In the network subsystem it translates into dropped packets.
> That's very, very bad if it happens often.  So it's ok to use a little memory
> in a less-than-frugal way if we reduce the frequency of packet dropping.

AHHHH! That really hurt me! Can't you see how this hurts: you buy _lots_ of mem, and the system merely throws it away and drops packets anyway, because it runs out of mem. "Help me if you can I'm feeling down, down!" (Lennon/McCartney). 
There is nothing impossible about this situation, only what the vm makes out of it. A simple NFS server with _one_ client runs out of mem on a machine with 1 GB! And the best part: it would be just the same if I had 2 GB!
Well, maybe that is in fact the reason for the DRAM producers' problems, nobody needs to buy it anymore: it does not help anyway.

Regards,
Stephan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-29 12:07 Memory Problem in 2.4.10-pre2 / __alloc_pages failed Stephan von Krawczynski
                   ` (3 preceding siblings ...)
  2001-08-30 14:46 ` Stephan von Krawczynski
@ 2001-08-31 11:06 ` Stephan von Krawczynski
  2001-08-31 19:03   ` Daniel Phillips
  4 siblings, 1 reply; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-08-31 11:06 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Thu, 30 Aug 2001 01:36:10 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> [...]
> Let's try another way of dealing with it.  What I'm trying to do with the
> patch below is leave a small reserve of 1/12 of pages->min, above the
> emergency reserve, to be consumed by non-PF_MEMALLOC atomic allocators.
> Please bear in mind this is completely untested, but would you try it
> please and see if the failure frequency goes down?
> 
> --- ../2.4.9.clean/mm/page_alloc.c	Thu Aug 16 12:43:02 2001
> +++ ./mm/page_alloc.c	Wed Aug 29 23:47:39 2001
> @@ -493,6 +493,9 @@
>  		}
>  
>  		/* XXX: is pages_min/4 a good amount to reserve for this? */
> +		if (z->free_pages < z->pages_min / 3 && (gfp_mask & __GFP_WAIT) &&
> +				!(current->flags & PF_MEMALLOC))
> +			continue;
>  		if (z->free_pages < z->pages_min / 4 &&
>  				!(current->flags & PF_MEMALLOC))
>  			continue;
> 

Hello Daniel,

I tried this patch and it makes _no_ difference. Failures show up in same situation and amount. Do you need traces? They look the same

Regards,
Stephan

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-31 11:06 ` Stephan von Krawczynski
@ 2001-08-31 19:03   ` Daniel Phillips
  0 siblings, 0 replies; 30+ messages in thread
From: Daniel Phillips @ 2001-08-31 19:03 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On August 31, 2001 01:06 pm, Stephan von Krawczynski wrote:
> On Thu, 30 Aug 2001 01:36:10 +0200
> Daniel Phillips <phillips@bonn-fries.net> wrote:
> 
> > [...]
> > Let's try another way of dealing with it.  What I'm trying to do with the
> > patch below is leave a small reserve of 1/12 of pages->min, above the
> > emergency reserve, to be consumed by non-PF_MEMALLOC atomic allocators.
> > Please bear in mind this is completely untested, but would you try it
> > please and see if the failure frequency goes down?
> > 
> > --- ../2.4.9.clean/mm/page_alloc.c	Thu Aug 16 12:43:02 2001
> > +++ ./mm/page_alloc.c	Wed Aug 29 23:47:39 2001
> > @@ -493,6 +493,9 @@
> >  		}
> >  
> >  		/* XXX: is pages_min/4 a good amount to reserve for this? */
> > +		if (z->free_pages < z->pages_min / 3 && (gfp_mask & __GFP_WAIT) &&
> > +				!(current->flags & PF_MEMALLOC))
> > +			continue;
> >  		if (z->free_pages < z->pages_min / 4 &&
> >  				!(current->flags & PF_MEMALLOC))
> >  			continue;
> > 
> 
> Hello Daniel,
> 
> I tried this patch and it makes _no_ difference. Failures show up in same 
> situation and amount. Do you need traces? They look the same

OK, first would you confirm that the frequency of 0 order failures has
stayed the same?

If some other thread is always in PF_MEMALLOC when these failures are 
happening then no, this approach would not be any help.

--
Daniel

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] __alloc_pages cleanup -R6 Was: Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-08-31  7:43     ` Russell King
@ 2001-08-31 23:22       ` Roger Larsson
  0 siblings, 0 replies; 30+ messages in thread
From: Roger Larsson @ 2001-08-31 23:22 UTC (permalink / raw)
  To: Russell King, Roger Larsson
  Cc: Stephan von Krawczynski, Daniel Phillips, linux-kernel, linux-mm

On Friday, 31 August 2001 09:43, Russell King wrote:
> On Fri, Aug 31, 2001 at 01:53:24AM +0200, Roger Larsson wrote:
> > Some ideas implemented in this code:
> > * Reserve memory below min for atomic and recursive allocations.
> > * When being min..low on free pages, free one more than you want to allocate.
> > * When being low..high on free pages, free one less than wanted.
> > * When above high - don't free anything.
> > * First select zones with more than high free memory.
> > * Then those with more than high 'free + inactive_clean - inactive_target'
> > * When freeing - do it properly. Don't steal direct reclaimed pages
>
> Hmm, I wonder.
>
> I have a 1MB DMA zone, and 31MB of normal memory.
>
> The machine has been running lots of programs for some time, but not under
> any VM pressure. 

OK, zones have about PAGES_LOW pages free, and these are buddied together as
well as they possibly can be, since direct reclaims put the page on the free
list first before allocating one.

> I now come to open a device which requires 64K in 8K
> pages from the DMA zone.  What happens?

First - is it atomic or not? (device sounds like atomic)

If it is atomic:
1) you get one chance, no retries in __alloc_pages. [not changed]
2) you get one from those already free, no reclaims possible [not changed]
3) you are allowed to allocate below PAGES_MIN [not changed]
The result will depend on how many pages were free and whether there are
enough order-1 buddies.
With my algorithm the number of free pages in all zones is very likely to
be close to PAGES_LOW, since it tries to move towards it.
The original algorithm is harder to analyze; free pages will not grow unless
a zone hits PAGES_MIN, and kreclaimd then gets started.

As a test I hit Magic-SysRq-M (256 MB RAM):

Free pages:        3836kB (     0kB HighMem) ( Active: 21853, inactive_dirty: 19101, inactive_clean: 392, free: 959 (383 766 1149) )
0*4kB 1*8kB 8*16kB 8*32kB 4*64kB 3*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB
 = 1800kB)
1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB
 = 2036kB)
 = 0kB)

And a while later I hit it again:
( Active: 22300, inactive_dirty: 18742, inactive_clean: 587, free: 947 (383 766 1149) )
3*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB
 = 1028kB)
80*4kB 39*8kB 3*16kB 3*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB
 = 2760kB)

For non-atomic higher-order allocations there are more changes:
1) Tries to free the same number of pages that it wants to alloc later.
2) Does not allow allocs when there are less than PAGES_MIN free. [BUG in
current code, see earlier patch: higher-order non-atomic allocs could drain
the free reserve if there are a lot of inactive clean pages...]
3) Retries only while there is a free shortage. This could be changed...
  until all zones have more than PAGES_HIGH free, or until there are no
  inactive clean pages left. But why favor non-atomic over atomic in this
  way?

>
> I suspect that the chances of it failing will be significantly higher with
> this algorithm - do you have any thoughts for this?
>

Do you still think the risk is higher?
Stephan's problem seems to be that this alloc runs over and over...

> I don't think we should select the allocation zone based purely on
> how much free memory it contains, but also on whether it's special (like the DMA zone).
>

It does prioritize due to the order the zones are checked in.

> You can't clean in-use slab pages out on demand like you can for fs
> cache/user pages.

/RogerL

-- 
Roger Larsson
Skellefteå
Sweden

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
       [not found] ` <20010907154801.028a48e8.skraw@ithnet.com>
@ 2001-09-07 21:13   ` Stephan von Krawczynski
  0 siblings, 0 replies; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-09-07 21:13 UTC (permalink / raw)
  To: Alex Bligh - linux-kernel; +Cc: phillips, roger.larsson, mikeg, linux-kernel

Hello all,

in an effort to try and find the reason for the real bad performance of vm in
my test bed I tried to find the bad guy that holds all my nice mem and makes
the rest of the system crawl on its knees. Most of my testing is somehow
related to network (because of knfsd) and fs (obviously writing to disk and
reading from CD). So I tried to eliminate some of the unknown factors.
I tried something very simple:
Copy a file from disk a to b (on different controllers) which is bigger than
current free mem (I used a 730 MB file).
What you see is (quite as expected) that your free mem gets lower during this
copy, down to the point where you have none left. Up to that point the copy seemed
pretty fast; from then on it seemed really slow.
I ended up with a copy time of 4m14 and a completely filled mem (cache).
Obviously this is the cached file data. Fine. Now you rm the new file and your
memory is back. Fine. As two disks are involved I tried next how long it takes
to read the file from the source disk. Therefore I simply used "cat file
>/dev/null". This takes 24 secs and you end up with a filled mem, obviously
again cached file data. Only this time you cannot get it free, because you have
no destination to remove. Ok, you count on vm to remove it later on
automagically. Next try: how long does it take to write such a file onto the
destination disk? Result at worst: 1m33.
Ok, now the interesting question: 24 secs (read) + 1m33 (write) = 1m57.
This is by far less than 4m14 from the copy test. Why? My simple guess: kernel
needs a damn lot of time to manage the memory mess of this really simple
action.
This brings up the simple question: why the heck does it make sense to cache so
much data of an I/O action that your system comes under heavy pressure? I can't
tell, can you? Obviously you do not gain performance, you lose it. The system
comes up to load 4 with this simple cp.
I really would like to know what part of the kernel holds responsibility for
this tremendous amount of cache pages. How can you influence this (e.g. size)? 
The kernel seems to have a very hard time finding free mem in this situation,
which is probably what eats a lot of time. So a big performance step should be
possible if it could get rid of outdated (call it aged) cache pages more easily. I
don't really care if a cp eats up all mem, as long as it does not eat up the
system's performance along with it.
Of course this is a cry for help from somebody who has come to accept that the
current VM is simply a hack with a lot of lists but no consistent design, and
that it needs cleanup _and_ simplification, _not_ tuning. You can spend years
on tuning with no real result. So please (Alex), do not add Yet Another Ten
Lists, even if you call them zones, to make defragmentation work.
The only thing that will happen if we continue down that road is more
inconsistent stuff like

static unsigned int zone_free_plenty(zone_t *zone)
{
        unsigned int free;

        free = zone->free_pages;
        free += zone->inactive_clean_pages;

        return free > zone->pages_high*2;
}

which shows pretty well (in a simple case) what I mean: if a page is free,
call it free. If it is not, then it is _used_. Otherwise you will end up with
more stuff like the above to work out what _could_ be called free but,
according to Daniel's last answer to my question (what is the difference
between clean and free?), is not really free, except when it is full moon on
the second Monday of the month.
This code is full to the limit with rough guesses about certain parameters,
things that might happen, and workarounds.
If we walk down that road for another 3 months we may well end up with a
system that cannot do a simple cp without an allocation failure. And don't
call this an exaggeration: currently I cannot even burn a simple CD while
doing an NFS copy. That still worked a year ago.
Start from scratch instead: what is the interface definition from the rest of
the kernel to the VM functions? There can't be that many.
We have a lot of brains online here, let's use them.

Regards,
Stephan


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 21:23           ` Daniel Phillips
@ 2001-09-02 21:28             ` Alex Bligh - linux-kernel
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Bligh - linux-kernel @ 2001-09-02 21:28 UTC (permalink / raw)
  To: Daniel Phillips, Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

--On Sunday, 02 September, 2001 11:23 PM +0200 Daniel Phillips 
<phillips@bonn-fries.net> wrote:

> Reposted to include Alex's correction.  Alex, could you please check this?

Thanks, yep, including () optimisation :-)

Dear volunteer/victim: Last time I looked,
  ping -f -s5000 victimip &
  ping -f -s10000 victimip &
  ping -f -s15000 victimip &
  ping -f -s20000 victimip &
  ping -f -s25000 victimip &
  ping -f -s30000 victimip &
  ping -f -s35000 victimip &
while running something buffer-intensive (bonnie etc.)
tended to do a fair job of exercising the machine's
powers of memory fragmentation / defragmentation;
reassembly of IP fragments allocates memory with GFP_ATOMIC.

--
Alex Bligh


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 19:32         ` Alex Bligh - linux-kernel
  2001-09-02 20:24           ` Daniel Phillips
  2001-09-02 20:33           ` Daniel Phillips
@ 2001-09-02 21:23           ` Daniel Phillips
  2001-09-02 21:28             ` Alex Bligh - linux-kernel
  2 siblings, 1 reply; 30+ messages in thread
From: Daniel Phillips @ 2001-09-02 21:23 UTC (permalink / raw)
  To: Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

Reposted to include Alex's correction.  Alex, could you please check this?

On September 2, 2001 09:32 pm, Alex Bligh - linux-kernel wrote:
> IDEA: Attempt not to allocate sets of pages buddied with 'nearly
> free' sets of pages.
> 
> When freeing pages, we work our way up the orders until
> we find a buddy which is non-empty. Let's assume that
> the free area, and our (non-empty) buddy, are of order N.
> Let's look at whether at order N-1, it's merely half full,
> or completely full. If completely full, we guess that
> the buddy is unlikely to become free soon, and thus
> add our area to the front of the memory queue (mem_add_head)
> else we guess it's more likely to have its buddy freed
> and add it to the back of the memory queue (mem_add_tail).
> 
> /Completely/ untested (i.e. uncompiled) patch attached

Rediffed to 2.4.9, whitespace and wrapping problems fixed, and compiled
but not tested.  Now we just need a victi^H^H^H^H^H volunteer to try it...
(Stephan?)

--- ../2.4.9.clean/mm/page_alloc.c	Thu Aug 16 12:43:02 2001
+++ ./mm/page_alloc.c	Sun Sep  2 21:09:05 2001
@@ -69,6 +69,8 @@
 	struct page *base;
 	zone_t *zone;
 
+       int addfront=1;
+
 	if (page->buffers)
 		BUG();
 	if (page->mapping)
@@ -112,10 +114,22 @@
 		if (area >= zone->free_area + MAX_ORDER)
 			BUG();
 		if (!__test_and_change_bit(index, area->map))
-			/*
-			 * the buddy page is still allocated.
-			 */
-			break;
+                 {
+                   /*
+                    * the buddy page is still allocated.
+                    *
+                    * see how many bits are set in its bitmap;
+                    * if 50% or more, we conclude the buddy is
+                    * unlikely to be freed soon, and add the
+                    * area to the head of the queue; else we
+                    * conclude the buddy may be free soon and
+                    * add it to the tail.
+                    */
+                   if (mask & 1) /* not order 0 merge */
+                     addfront = ( !test_bit((index^1)<<1, (area-1)->map) &&
+                                  !test_bit(((index^1)<<1) | 1, (area-1)->map) );
+                   break;
+                 }
 		/*
 		 * Move the buddy up one level.
 		 */
@@ -132,7 +146,11 @@
 		index >>= 1;
 		page_idx &= mask;
 	}
-	memlist_add_head(&(base + page_idx)->list, &area->free_list);
+
+       if (addfront)
+         memlist_add_head(&(base + page_idx)->list, &area->free_list);
+       else
+         memlist_add_tail(&(base + page_idx)->list, &area->free_list);
 
 	spin_unlock_irqrestore(&zone->lock, flags);
 



* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 20:33           ` Daniel Phillips
@ 2001-09-02 21:14             ` Alex Bligh - linux-kernel
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Bligh - linux-kernel @ 2001-09-02 21:14 UTC (permalink / raw)
  To: Daniel Phillips, Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

Daniel,

>> > What do you do when a new module gets inserted, increasing the high
>> > order load and requiring that the slab be expanded? I.e, the need for
>> > dependable  handling of high order physical allocations doesn't go away
>> > entirely.  The slab would help even out the situation with atomic
>> > allocs because it can be expanded to a target size by a normal task,
>> > which can wait.
>>
>> Yes, chew away to disk as this allocation is non atomic.
>
> What I meant was, the new module implements a network driver which
> proceeds  to do atomic allocs.

I think I was agreeing with you, but I think what you meant was
that when you load the module, it expands the slab, but with
a /non-atomic/ alloc. And what I meant was that as the
page reclaim stuff is dumb in terms of recovering buddies
of existing pages, it will chew away for a good while until
it happens to find a large enough hole. But this isn't really
a problem if it happens on module load only.

> One thing I am hearing from some developers is that we don't need to
> solve  the high-order allocation problems because they are really driver
> issues -  all drivers should be changed to use scatter-gather or some
> such.  I don't know if that's correct; it would require knowledge of
> all driver types and archs, which I can't begin to claim.

We have lots of interesting current restrictions, like we assume
IP packets are contiguous in memory, and we assume we can allocate
their buffers atomically. I suspect there will not be many fans
of non-contiguous IP packet models. [Historical note: the buddy system
was originally introduced to cope with fragments which reassembled
to >4k in length, though devices like HSSI with 4470-byte MTUs
are going to have the same problem without fragmentation]

No argument that scatter/gather is better if possible & practical
(SCSI comes to mind).

--
Alex Bligh


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 20:24           ` Daniel Phillips
@ 2001-09-02 21:03             ` Alex Bligh - linux-kernel
  0 siblings, 0 replies; 30+ messages in thread
From: Alex Bligh - linux-kernel @ 2001-09-02 21:03 UTC (permalink / raw)
  To: Daniel Phillips, Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

> +                     addfront = ( !test_bit((index^1)<<1, (area-1)->map)
> +                                  && !test_bit((index^1)<<1,
					                (area-1) -> map) );

Ooops, that second one should be

> +                                  && !test_bit((index^1)<<1 | 1,
> +					                (area-1) -> map) );


--
Alex Bligh


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 19:32         ` Alex Bligh - linux-kernel
  2001-09-02 20:24           ` Daniel Phillips
@ 2001-09-02 20:33           ` Daniel Phillips
  2001-09-02 21:14             ` Alex Bligh - linux-kernel
  2001-09-02 21:23           ` Daniel Phillips
  2 siblings, 1 reply; 30+ messages in thread
From: Daniel Phillips @ 2001-09-02 20:33 UTC (permalink / raw)
  To: Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

On September 2, 2001 09:32 pm, Alex Bligh - linux-kernel wrote:
> Daniel,
> 
> > What do you do when a new module gets inserted, increasing the high order
> > load and requiring that the slab be expanded? I.e, the need for
> > dependable  handling of high order physical allocations doesn't go away
> > entirely.  The slab would help even out the situation with atomic allocs
> > because it can be expanded to a target size by a normal task, which can
> > wait.
> 
> Yes, chew away to disk as this allocation is non atomic.

What I meant was, the new module implements a network driver which proceeds 
to do atomic allocs.

> But this probably still needs something which goes and identifies
> kernel allocated pages with buddies which can be relocated / pushed to
> disk / freed etc.;
> 
> Alternatively something to temporarily hold onto 'nearly freed'
> high-order areas is probably useful. I.e. if there's an order=3
> allocation stuck waiting for a suitable hole, and there's
> a bit of bitmap that looks like '00010000' (i.e. order 1 hole,
> order 0 hole, order 0 used, order 2 hole), I wonder if we can't
> think of some heuristic to avoid allocating the next order 0
> page request (atomic or not) from the order 0 hole even if
> it's at the front of the order (0) free area list.

One thing I am hearing from some developers is that we don't need to solve 
the high-order allocation problems because they are really driver issues - 
all drivers should be changed to use scatter-gather or some such.  I don't 
know if that's correct; it would require knowledge of all driver types and
archs, which I can't begin to claim.

I'm quite sure we can fix this up though, not to the point of being able to 
guarantee all high order allocations, but to the point where we have a high 
probability of success under all loads, as opposed to what we have now which 
is very fragile, and not just in the Linus tree.

--
Daniel


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 19:32         ` Alex Bligh - linux-kernel
@ 2001-09-02 20:24           ` Daniel Phillips
  2001-09-02 21:03             ` Alex Bligh - linux-kernel
  2001-09-02 20:33           ` Daniel Phillips
  2001-09-02 21:23           ` Daniel Phillips
  2 siblings, 1 reply; 30+ messages in thread
From: Daniel Phillips @ 2001-09-02 20:24 UTC (permalink / raw)
  To: Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

On September 2, 2001 09:32 pm, Alex Bligh - linux-kernel wrote:
> IDEA: Attempt not to allocate sets of pages buddied with 'nearly
> free' sets of pages.
> 
> When freeing pages, we work our way up the orders until
> we find a buddy which is non-empty. Let's assume that
> the free area, and our (non-empty) buddy, are of order N.
> Let's look at whether at order N-1, it's merely half full,
> or completely full. If completely full, we guess that
> the buddy is unlikely to become free soon, and thus
> add our area to the front of the memory queue (mem_add_head)
> else we guess it's more likely to have its buddy freed
> and add it to the back of the memory queue (mem_add_tail).
> 
> /Completely/ untested (i.e. uncompiled) patch attached

Rediffed to 2.4.9, whitespace and wrapping problems fixed, and compiled
but not tested.  Now we just need a victi^H^H^H^H^H volunteer to try it...
(Stephan?)

--- ../2.4.9.clean/mm/page_alloc.c	Thu Aug 16 12:43:02 2001
+++ ./mm/page_alloc.c	Sun Sep  2 21:09:05 2001
@@ -69,6 +69,8 @@
 	struct page *base;
 	zone_t *zone;
 
+       int addfront=1;
+
 	if (page->buffers)
 		BUG();
 	if (page->mapping)
@@ -112,10 +114,22 @@
 		if (area >= zone->free_area + MAX_ORDER)
 			BUG();
 		if (!__test_and_change_bit(index, area->map))
-			/*
-			 * the buddy page is still allocated.
-			 */
-			break;
+                 {
+                   /*
+                    * the buddy page is still allocated.
+                    *
+                    * see how many bits are set in its bitmap;
+                    * if 50% or more, we conclude the buddy is
+                    * unlikely to be freed soon, and add the
+                    * area to the head of the queue; else we
+                    * conclude the buddy may be free soon and
+                    * add it to the tail.
+                    */
+                   if (mask & 1) /* not order 0 merge */
+                     addfront = ( !test_bit((index^1)<<1, (area-1)->map)
+                                  && !test_bit((index^1)<<1, (area-1)->map) );
+                   break;
+                 }
 		/*
 		 * Move the buddy up one level.
 		 */
@@ -132,7 +146,11 @@
 		index >>= 1;
 		page_idx &= mask;
 	}
-	memlist_add_head(&(base + page_idx)->list, &area->free_list);
+
+       if (addfront)
+         memlist_add_head(&(base + page_idx)->list, &area->free_list);
+       else
+         memlist_add_tail(&(base + page_idx)->list, &area->free_list);
 
 	spin_unlock_irqrestore(&zone->lock, flags);
 


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 18:26       ` Daniel Phillips
@ 2001-09-02 19:32         ` Alex Bligh - linux-kernel
  2001-09-02 20:24           ` Daniel Phillips
                             ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Alex Bligh - linux-kernel @ 2001-09-02 19:32 UTC (permalink / raw)
  To: Daniel Phillips, Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

Daniel,

> What do you do when a new module gets inserted, increasing the high order
> load and requiring that the slab be expanded? I.e, the need for
> dependable  handling of high order physical allocations doesn't go away
> entirely.  The slab would help even out the situation with atomic allocs
> because it can be expanded to a target size by a normal task, which can
> wait.

Yes, chew away to disk as this allocation is non atomic.
But this probably still needs something which goes and identifies
kernel allocated pages with buddies which can be relocated / pushed to
disk / freed etc.;

Alternatively something to temporarily hold onto 'nearly freed'
high-order areas is probably useful. I.e. if there's an order=3
allocation stuck waiting for a suitable hole, and there's
a bit of bitmap that looks like '00010000' (i.e. order 1 hole,
order 0 hole, order 0 used, order 2 hole), I wonder if we can't
think of some heuristic to avoid allocating the next order 0
page request (atomic or not) from the order 0 hole even if
it's at the front of the order (0) free area list.

IDEA: Attempt not to allocate sets of pages buddied with 'nearly
free' sets of pages.

When freeing pages, we work our way up the orders until
we find a buddy which is non-empty. Let's assume that
the free area, and our (non-empty) buddy, are of order N.
Let's look at whether at order N-1, it's merely half full,
or completely full. If completely full, we guess that
the buddy is unlikely to become free soon, and thus
add our area to the front of the memory queue (mem_add_head)
else we guess it's more likely to have its buddy freed
and add it to the back of the memory queue (mem_add_tail).

/Completely/ untested (i.e. uncompiled) patch attached

> The only  problem with slab allocation is, it has more overhead than
> __alloc_pages  allocation.  For high-performance networking this may be a
> measurable hit.

I meant slab /like/. If all the objects it allocates are the same
size, it can't be slower than the buddy allocator. It could
be simplified (for instance, drop the cache colouring stuff
if that's heavy - everything we allocated is, by definition,
much > 1 page in size).
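
To make the hoarding idea concrete, here is a toy user-space sketch (not
kernel code: malloc() stands in for the page allocator, names like
reserve_pool_alloc() are invented, and a real version would need locking
and per-size pools): grab a handful of fixed-size areas once, early on,
serve callers from that free list, and fall back to the general allocator
only when the pool runs dry.

#include <stdlib.h>

#define AREA_SIZE   (32 * 1024)  /* e.g. the 32 kB areas the SCSI layer wants */
#define POOL_AREAS  16

static void *pool_free_list[POOL_AREAS];
static int   pool_free_count;

/* Fill the pool once, early on, while memory is not yet fragmented. */
static int reserve_pool_init(void)
{
        while (pool_free_count < POOL_AREAS) {
                void *area = malloc(AREA_SIZE);
                if (!area)
                        return -1;
                pool_free_list[pool_free_count++] = area;
        }
        return 0;
}

/* Callers take from the pool; only when it is empty do we fall back to
 * the general allocator (which may fail). */
static void *reserve_pool_alloc(void)
{
        if (pool_free_count > 0)
                return pool_free_list[--pool_free_count];
        return malloc(AREA_SIZE);
}

/* Freed areas go back into the pool instead of the general allocator,
 * so the contiguous chunks are (nearly) never given back or broken up. */
static void reserve_pool_free(void *area)
{
        if (pool_free_count < POOL_AREAS)
                pool_free_list[pool_free_count++] = area;
        else
                free(area);
}

int main(void)
{
        reserve_pool_init();
        void *buf = reserve_pool_alloc();
        reserve_pool_free(buf);
        return 0;
}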

--
Alex Bligh


--- page_alloc.c        Mon Jan 15 20:35:12 2001
+++ /tmp/page_alloc.c   Sun Sep  2 20:28:08 2001
@@ -69,6 +69,8 @@
        struct page *base;
        zone_t *zone;

+       int addfront=1;
+
        if (page->buffers)
                BUG();
        if (page->mapping)
@@ -112,10 +114,22 @@
                if (area >= zone->free_area + MAX_ORDER)
                        BUG();
                if (!test_and_change_bit(index, area->map))
-                       /*
-                        * the buddy page is still allocated.
-                        */
-                       break;
+                 {
+                   /*
+                    * the buddy page is still allocated.
+                    *
+                    * see how many bits are set in its bitmap;
+                    * if 50% or more, we conclude the buddy is
+                    * unlikely to be freed soon, and add the
+                    * area to the head of the queue; else we
+                    * conclude the buddy may be free soon and
+                    * add it to the tail.
+                    */
+                   if (mask & 1) /* not order 0 merge */
+                     addfront = ( !test_bit((index^1)<<1, (area-1)->map)
+                                  && !test_bit((index^1)<<1, 
(area-1)->map) );
+                   break;
+                 }
                /*
                 * Move the buddy up one level.
                 */
@@ -132,7 +146,11 @@
                index >>= 1;
                page_idx &= mask;
        }
-       memlist_add_head(&(base + page_idx)->list, &area->free_list);
+
+       if (addfront)
+         memlist_add_head(&(base + page_idx)->list, &area->free_list);
+       else
+         memlist_add_tail(&(base + page_idx)->list, &area->free_list);

        spin_unlock_irqrestore(&zone->lock, flags);



* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02 13:48     ` Alex Bligh - linux-kernel
@ 2001-09-02 18:26       ` Daniel Phillips
  2001-09-02 19:32         ` Alex Bligh - linux-kernel
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Phillips @ 2001-09-02 18:26 UTC (permalink / raw)
  To: Alex Bligh - linux-kernel, Roger Larsson,
	Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

On September 2, 2001 03:48 pm, Alex Bligh - linux-kernel wrote:
> > Or/and we could remove the sources of higher order allocs, as an example:
> > why is the SCSI layer allowed to allocate order 3 allocs (32 kB) several
> > times per second? Will we really get a performance hit by not allowing
> > higher order allocs in active driver code?
> 
> Or put them in some slab like code, the slab for which gets allocated
> early on when memory is not fragmented, and (nearly) never gets released.

What do you do when a new module gets inserted, increasing the high order
load and requiring that the slab be expanded?  I.e, the need for dependable 
handling of high order physical allocations doesn't go away entirely.  The
slab would help even out the situation with atomic allocs because it can
be expanded to a target size by a normal task, which can wait.

> Most of the stuff that actually NEEDS atomic allocation (as opposed
> to some of the requirements that are bogus) is for packets / data
> in flight. There is probably a finite amount of this at any given time.

There is no bound, but yes it tends to stay limited to a given, small amount 
over long periods of time.  Hoarding schemes are appropriate.  The only 
problem with slab allocation is, it has more overhead than __alloc_pages 
allocation.  For high-performance networking this may be a measurable hit.

--
Daniel


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02  2:21   ` Roger Larsson
  2001-09-02  4:16     ` Daniel Phillips
@ 2001-09-02 13:48     ` Alex Bligh - linux-kernel
  2001-09-02 18:26       ` Daniel Phillips
  1 sibling, 1 reply; 30+ messages in thread
From: Alex Bligh - linux-kernel @ 2001-09-02 13:48 UTC (permalink / raw)
  To: Roger Larsson, Daniel Phillips, Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti, Alex Bligh - linux-kernel

> Or/and we could remove the sources of higher order allocs, as an example:
> why is the SCSI layer allowed to allocate order 3 allocs (32 kB) several
> times per second? Will we really get a performance hit by not allowing
> higher order allocs in active driver code?

Or put them in some slab like code, the slab for which gets allocated
early on when memory is not fragmented, and (nearly) never gets released.
Most of the stuff that actually NEEDS atomic allocation (as opposed
to some of the requirements that are bogus) is for packets / data
in flight. There is probably a finite amount of this at any given time.

--
Alex Bligh


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02  2:21   ` Roger Larsson
@ 2001-09-02  4:16     ` Daniel Phillips
  2001-09-02 13:48     ` Alex Bligh - linux-kernel
  1 sibling, 0 replies; 30+ messages in thread
From: Daniel Phillips @ 2001-09-02  4:16 UTC (permalink / raw)
  To: Roger Larsson, Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti

On September 2, 2001 04:21 am, Roger Larsson wrote:
> On Sunday den 2 September 2001 03:57, Daniel Phillips wrote:
> > In some sense, it's been good to have the issue
> > forced so that we must come up with ways to make atomic and higher order
> > allocations less fragile.
> 
> It might be that the elevator works now... :-)

You think it's the elevator?  It could be, but scanning policy seems much
more likely.

> You will only see it once there are no remaining free pages of an even
> higher order left - then you will start to fail...
> 
> Two things are required:
> 1) You have lots of memory.

Actually, the situation improves a little as you add memory.  I'll show that
mathematically tomorrow.

> 2) You have used it all at some point.

This is the normal case, except for startup and a few special situations such 
as after heavy file deletion or unmounting a volume.

> Another thing to do could be to add an order parameter to free.
> The pages allocated have to be freed sometime... if we make sure that
> they are freed together it could simplify things - no chance that the
> first part gets allocated directly...

We must be getting a little bit of avoidable fragmentation on freeing, but 
the real culprit is allocation, which tends to split up higher order 
allocations rapidly.

> Or/and we could remove the sources of higher order allocs, as an example:
> why is the SCSI layer allowed to allocate order 3 allocs (32 kB) several
> times per second? Will we really get a performance hit by not allowing
> higher order allocs in active driver code?

Yes, well, if we make it work properly that might not be necessary ;-)

I imagine a lot of higher order allocations could be removed without hurting 
performance, for example, where dma can handle non-physically-contiguous 
regions (i.e., scatter/gather).  On the other hand, leaving them just the way 
they are creates more incentive to fix the damn thing, not to mention 
providing the needed test cases.

--
Daniel


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-01 18:54   ` Stephan von Krawczynski
@ 2001-09-02  3:21     ` Mike Galbraith
  0 siblings, 0 replies; 30+ messages in thread
From: Mike Galbraith @ 2001-09-02  3:21 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On Sat, 1 Sep 2001, Stephan von Krawczynski wrote:

> On Sat, 1 Sep 2001 10:26:48 +0200 (CEST) Mike Galbraith <mikeg@wen-online.de>
> wrote:
>
> > On Sat, 1 Sep 2001, Daniel Phillips wrote:
> >
> > > Better go back and read the thread.  The allocation rate is definitely
> > > limited - he's doing a cd burn and some network copies.  [....]
> >                          ^^^^^^^
> > P.S.
> > Stephan:  try unconditionally doing gfp_mask &= ~__GFP_WAIT at the top
> > of page_launder().  I think that will help your problem some.
>
> Hi Mike,
>
> I tried, doesn't make a difference. Same number of alloc-fails.
> Sorry.

Darn.  Scratch one theory.



* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-02  1:57 ` Daniel Phillips
@ 2001-09-02  2:21   ` Roger Larsson
  2001-09-02  4:16     ` Daniel Phillips
  2001-09-02 13:48     ` Alex Bligh - linux-kernel
  0 siblings, 2 replies; 30+ messages in thread
From: Roger Larsson @ 2001-09-02  2:21 UTC (permalink / raw)
  To: Daniel Phillips, Stephan von Krawczynski, linux-kernel
  Cc: Rik van Riel, Marcelo Tosatti

On Sunday den 2 September 2001 03:57, Daniel Phillips wrote:
> On September 1, 2001 08:28 pm, Stephan von Krawczynski wrote:
>
> The next part of the theory says that the higher order allocations are
> failing because of fragmentation.  I put considerable thought into this
> today while wandering around in a dungeon in Berlin watching bats (really)
> and I will post an article to lkml tomorrow with my findings.  To summarize
> briefly here: a Linux system in steady state operation *is* going to show
> physical fragmentation so that the chance of a higher order allocation
> succeeding becomes very small.  The chance of failure increases
> exponentially (or worse) with a) the allocation order and b) the ratio of
> allocated to free memory.  The second of these you can control: the higher
> you set zone->pages_min the better chance your higher order allocations
> will have to succeed.  Do you want a patch for that, to see if this works
> in practice?
>

You beat me to it, by some minutes...
(I sent an email to Stephan...)

> Of course it would be much better if we had some positive mechanism for
> defragging physical memory instead of just relying on chance and hoping
> for the best the way we do now.  IMHO, such a mechanism can be built
> practically and I'm willing to give it a try.  Basically, kswapd would try
> to restore a depleted zone order list by scanning mem_map to find buddy
> partners for free blocks of the next lower order.  This strategy, together
> with the one used in the patch above, could largely eliminate atomic
> allocation failures.  (Although as I mentioned some time ago, getting rid
> of them entirely is an impossible problem.)
>
> The question remains why we suddenly started seeing more atomic allocation
> failures in the recent Linus trees.  I'll guess that the changes in
> scanning strategy have caused the system to spend more time close to the
> zone->pages_min amount of free memory.  This idea seems to be supported by
> your memstat listings.  In some sense, it's been good to have the issue
> forced so that we must come up with ways to make atomic and higher order
> allocations less fragile.

It might be that the elevator works now... :-)

You will only see it once there are no remaining free pages of an even higher
order left - then you will start to fail...

Two things are required:
1) You have lots of memory.
2) You have used it all at some point.

Another thing to do could be to add an order parameter to free.
The pages allocated have to be freed sometime... if we make sure that
they are freed together it could simplify things - no chance that the
first part gets allocated directly...

Or/and we could remove the sources of higher order allocs, as an example:
why is the SCSI layer allowed to allocate order 3 allocs (32 kB) several
times per second? Will we really get a performance hit by not allowing
higher order allocs in active driver code?

/RogerL
-- 
Roger Larsson
Skellefteå
Sweden


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
  2001-09-01 18:28 Stephan von Krawczynski
@ 2001-09-02  1:57 ` Daniel Phillips
  2001-09-02  2:21   ` Roger Larsson
  0 siblings, 1 reply; 30+ messages in thread
From: Daniel Phillips @ 2001-09-02  1:57 UTC (permalink / raw)
  To: Stephan von Krawczynski, linux-kernel; +Cc: Rik van Riel, Marcelo Tosatti

On September 1, 2001 08:28 pm, Stephan von Krawczynski wrote:
> On Fri, 31 Aug 2001 21:03:22 +0200
> Daniel Phillips <phillips@bonn-fries.net> wrote:
> 
> > > >  		/* XXX: is pages_min/4 a good amount to reserve for this? */
> > > > +		if (z->free_pages < z->pages_min / 3 && (gfp_mask & __GFP_WAIT) &&
> > > > +				!(current->flags & PF_MEMALLOC))
> > > > +			continue;
> > > Hello Daniel,
> > > 
> > > I tried this patch and it makes _no_ difference. Failures show up in same 
> > > situation and amount. Do you need traces? They look the same
> > 
> > OK, first would you confirm that the frequency of 0 order failures has
> > stayed the same?
> 
> Hello Daniel (and the rest),
> 
> I redid the test and have the following results, based on a 2.4.10-pre2 with
> above patch:
> [...]
>
> And the traces:
> 83 mostly identical errors showed up, all looking like
> 
> Sep  1 15:17:53 admin kernel: cdda2wav: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
> [...]
> Trace; fdcf80f5 <[sg]sg_build_reserve+25/48>
> Trace; fdcf6589 <[sg]sg_ioctl+6c5/ae4>
> Trace; fdcf76bd <[sg]sg_build_indi+55/1a8>
> 
> So, there are no 0-order allocs failing in this setup.
> 
> Are you content with having no 0-order failures?

It's a start.  The important thing is to have supported my theory of what is
going on here.  What I did there is probably a good thing, it seems quite
effective for combatting 0 order atomic failures.  In this case you have a
driver that uses a fallback allocation strategy, starting with a 3 order
allocation attempt and dropping down to the next lower size on failure.  If
0 order allocation fails the whole operation fails, and maybe you will lose
a packet.  So 0 order allocations are important, we really want them to
succeed.
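
For illustration, this is the shape of such a fallback loop as a
self-contained toy in user-space C (not the actual sg code:
try_alloc_order() just wraps malloc() so the sketch compiles on its own):

#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for __get_free_pages(GFP_ATOMIC, order). */
static void *try_alloc_order(int order)
{
        return malloc((size_t)4096 << order);   /* 2^order pages of 4 kB */
}

/* Start with the biggest request (e.g. order 3 = 32 kB) and walk down;
 * only if order 0 also fails does the whole operation fail. */
static void *alloc_buffer_fallback(int max_order, int *got_order)
{
        for (int order = max_order; order >= 0; order--) {
                void *buf = try_alloc_order(order);
                if (buf) {
                        *got_order = order;   /* caller got 2^order pages */
                        return buf;
                }
                /* this order failed, like the 3-order attempt in the
                 * traces above; retry with the next smaller size */
        }
        return NULL;
}

int main(void)
{
        int order;
        void *buf = alloc_buffer_fallback(3, &order);
        if (buf)
                printf("got %d bytes (order %d)\n", 4096 << order, order);
        free(buf);
        return 0;
}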

The next part of the theory says that the higher order allocations are
failing because of fragmentation.  I put considerable thought into this
today while wandering around in a dungeon in Berlin watching bats (really)
and I will post an article to lkml tomorrow with my findings.  To summarize
briefly here: a Linux system in steady state operation *is* going to show
physical fragmentation so that the chance of a higher order allocation
succeeding becomes very small.  The chance of failure increases
exponentially (or worse) with a) the allocation order and b) the ratio of
allocated to free memory.  The second of these you can control: the higher
you set zone->pages_min the better chance your higher order allocations
will have to succeed.  Do you want a patch for that, to see if this works
in practice?
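
A back-of-envelope way to see the "exponentially (or worse)" part, under the
crude assumption that each page is occupied independently with probability u
(real placement is not independent, so treat this as a sketch only):

  P(an aligned order-n block is entirely free) = (1 - u)^(2^n)

With u = 0.9 and n = 3 that is 0.1^8 = 10^-8 per candidate block; both a
larger order n and a higher ratio of allocated to free memory drive the
failure rate up, the order doubly exponentially.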

Of course it would be much better if we had some positive mechanism for
defragging physical memory instead of just relying on chance and hoping
for the best the way we do now.  IMHO, such a mechanism can be built
practically and I'm willing to give it a try.  Basically, kswapd would try
to restore a depleted zone order list by scanning mem_map to find buddy
partners for free blocks of the next lower order.  This strategy, together
with the one used in the patch above, could largely eliminate atomic
allocation failures.  (Although as I mentioned some time ago, getting rid
of them entirely is an impossible problem.)
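
As a toy model of what that kswapd pass could look like (self-contained
user-space C: page[] stands in for the information that would be read out of
mem_map, "reclaiming" just flips a flag, and every name here is invented):

#include <stdio.h>

#define NPAGES 16

enum pstate { FREE, CLEAN_CACHE, PINNED };   /* crude classification */

/* For every aligned free block of the given order, look at its buddy.
 * If the buddy consists only of easily reclaimable (clean cache) pages,
 * reclaim them so the pair can coalesce into an order+1 block. */
static int restore_order(enum pstate page[], int npages, int order)
{
        int span = 1 << order, restored = 0;

        for (int blk = 0; blk + span <= npages; blk += span) {
                int buddy = blk ^ span;      /* buddy block's first page */
                int ok = 1;

                for (int j = 0; j < span; j++)
                        ok &= (page[blk + j] == FREE);
                if (!ok)
                        continue;            /* this block isn't free */

                for (int j = 0; j < span; j++)
                        ok &= (page[buddy + j] == CLEAN_CACHE);
                if (!ok)
                        continue;            /* buddy can't be reclaimed */

                for (int j = 0; j < span; j++)
                        page[buddy + j] = FREE;   /* "reclaim" the buddy */
                printf("order-%d block restored at page %d\n",
                       order + 1, blk < buddy ? blk : buddy);
                restored++;
        }
        return restored;
}

int main(void)
{
        enum pstate page[NPAGES] = {
                FREE, CLEAN_CACHE, PINNED, FREE,
                FREE, FREE, CLEAN_CACHE, CLEAN_CACHE,
                PINNED, FREE, FREE, CLEAN_CACHE,
                FREE, PINNED, CLEAN_CACHE, FREE,
        };

        restore_order(page, NPAGES, 0);   /* rebuild order-1 blocks */
        restore_order(page, NPAGES, 1);   /* then order-2 blocks */
        return 0;
}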

The question remains why we suddenly started seeing more atomic allocation
failures in the recent Linus trees.  I'll guess that the changes in
scanning strategy have caused the system to spend more time close to the
zone->pages_min amount of free memory.  This idea seems to be supported by
your memstat listings.  In some sense, it's been good to have the issue
forced so that we must come up with ways to make atomic and higher order
allocations less fragile.

--
Daniel


* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
       [not found] ` <20010901055634Z16057-32383+2785@humbolt.nl.linux.org>
@ 2001-09-01 18:54   ` Stephan von Krawczynski
  2001-09-02  3:21     ` Mike Galbraith
  0 siblings, 1 reply; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-09-01 18:54 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: linux-kernel

On Sat, 1 Sep 2001 10:26:48 +0200 (CEST) Mike Galbraith <mikeg@wen-online.de>
wrote:

> On Sat, 1 Sep 2001, Daniel Phillips wrote:
> 
> > Better go back and read the thread.  The allocation rate is definitely
> > limited - he's doing a cd burn and some network copies.  [....]
>                          ^^^^^^^
> P.S.
> Stephan:  try unconditionally doing gfp_mask &= ~__GFP_WAIT at the top
> of page_launder().  I think that will help your problem some.

Hi Mike,

I tried, doesn't make a difference. Same number of alloc-fails.
Sorry. 
Patch looks weird anyway :-)

Regards, Stephan



* Re: Memory Problem in 2.4.10-pre2 / __alloc_pages failed
@ 2001-09-01 18:28 Stephan von Krawczynski
  2001-09-02  1:57 ` Daniel Phillips
  0 siblings, 1 reply; 30+ messages in thread
From: Stephan von Krawczynski @ 2001-09-01 18:28 UTC (permalink / raw)
  To: linux-kernel

On Fri, 31 Aug 2001 21:03:22 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> > >  		/* XXX: is pages_min/4 a good amount to reserve for this? */
> > > +		if (z->free_pages < z->pages_min / 3 && (gfp_mask & __GFP_WAIT) &&
> > > +				!(current->flags & PF_MEMALLOC))
> > > +			continue;
> > Hello Daniel,
> > 
> > I tried this patch and it makes _no_ difference. Failures show up in same 
> > situation and amount. Do you need traces? They look the same
> 
> OK, first would you confirm that the frequency of 0 order failures has
> stayed the same?

Hello Daniel (and the rest),

I redid the test and have the following results, based on a 2.4.10-pre2 with
above patch:

meminfo before:

Sep  1 15:09:40 admin kernel: SysRq: Show Memory
Sep  1 15:09:40 admin kernel: Mem-info:
Sep  1 15:09:40 admin kernel: Free pages:      811288kB (     0kB HighMem)
Sep  1 15:09:40 admin kernel: ( Active: 927, inactive_dirty: 9604, inactive_clean: 0, free: 202822 (383 766 1149) )
Sep  1 15:09:40 admin kernel: 1*4kB 1*8kB 5*16kB 4*32kB 5*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 6*2048kB = 14236kB)
Sep  1 15:09:40 admin kernel: 1*4kB 1*8kB 1*16kB 141*32kB 57*64kB 31*128kB 10*256kB 4*512kB 0*1024kB 381*2048kB = 797052kB)
Sep  1 15:09:40 admin kernel: = 0kB)
Sep  1 15:09:40 admin kernel: Swap cache: add 0, delete 0, find 0/0
Sep  1 15:09:40 admin kernel: Free swap:       265032kB
Sep  1 15:09:40 admin kernel: 229376 pages of RAM
Sep  1 15:09:40 admin kernel: 0 pages of HIGHMEM
Sep  1 15:09:40 admin kernel: 4378 reserved pages
Sep  1 15:09:40 admin kernel: 22056 pages shared
Sep  1 15:09:40 admin kernel: 0 pages swap cached
Sep  1 15:09:40 admin kernel: 0 pages in page table cache
Sep  1 15:09:40 admin kernel: Buffer memory:     6576kB
Sep  1 15:09:40 admin kernel:     CLEAN: 1489 buffers, 5935 kbyte, 513 used (last=1489), 0 locked, 0 protected, 0 dirty

        total:    used:    free:  shared: buffers:  cached:
Mem:  921726976 105066496 816660480        0  6733824 36917248
Swap: 271392768        0 271392768
MemTotal:       900124 kB
MemFree:        797520 kB
MemShared:           0 kB
Buffers:          6576 kB
Cached:          36052 kB
SwapCached:          0 kB
Active:           4076 kB
Inact_dirty:     38552 kB
Inact_clean:         0 kB
Inact_target:      296 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900124 kB
LowFree:        797520 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

(meminfo may differ slightly from the SysRq output; an application (seti) was
started in between)

The usual test was going on, in the end I had:

meminfo:

Sep  1 15:18:27 admin kernel: SysRq: Show Memory
Sep  1 15:18:27 admin kernel: Mem-info:
Sep  1 15:18:27 admin kernel: Free pages:        3056kB (     0kB HighMem)
Sep  1 15:18:27 admin kernel: ( Active: 13965, inactive_dirty: 185778, inactive_clean: 178, free: 764 (383 766 1149) )
Sep  1 15:18:27 admin kernel: 1*4kB 1*8kB 1*16kB 9*32kB 5*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB = 1020kB)
Sep  1 15:18:27 admin kernel: 1*4kB 0*8kB 1*16kB 1*32kB 1*64kB 5*128kB 3*256kB 1*512kB 0*1024kB 0*2048kB = 2036kB)
Sep  1 15:18:27 admin kernel: = 0kB)
Sep  1 15:18:27 admin kernel: Swap cache: add 0, delete 0, find 0/0
Sep  1 15:18:27 admin kernel: Free swap:       265032kB
Sep  1 15:18:27 admin kernel: 229376 pages of RAM
Sep  1 15:18:27 admin kernel: 0 pages of HIGHMEM
Sep  1 15:18:27 admin kernel: 4378 reserved pages
Sep  1 15:18:27 admin kernel: 215286 pages shared
Sep  1 15:18:27 admin kernel: 0 pages swap cached
Sep  1 15:18:27 admin kernel: 0 pages in page table cache
Sep  1 15:18:27 admin kernel: Buffer memory:    15576kB
Sep  1 15:18:27 admin kernel:     CLEAN: 178719 buffers, 714855 kbyte, 536 used (last=178719), 0 locked, 0 protected, 0 dirty
Sep  1 15:18:27 admin kernel:     DIRTY: 10060 buffers, 40240 kbyte, 0 used (last=0), 0 locked, 0 protected, 10060 dirty
        total:    used:    free:  shared: buffers:  cached:
Mem:  921726976 918597632  3129344        0 15958016 802725888
Swap: 271392768        0 271392768
MemTotal:       900124 kB
MemFree:          3056 kB
MemShared:           0 kB
Buffers:         15584 kB
Cached:         783912 kB
SwapCached:          0 kB
Active:          55864 kB
Inact_dirty:    742948 kB
Inact_clean:       684 kB
Inact_target:     8188 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900124 kB
LowFree:          3056 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

And the traces:
83 mostly identical errors showed up, all looking like

Sep  1 15:17:53 admin kernel: cdda2wav: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Sep  1 15:17:53 admin kernel: Call Trace: [_alloc_pages+22/24] [__get_free_pages+10/28] [<fdcf8826>] [<fdcf88f5>] [<fdcf77d7>]
Sep  1 15:17:53 admin kernel:    [<fdcf80f5>] [<fdcf6589>] [_alloc_pages+22/24] [__get_free_pages+10/28] [<fdcf8826>] [<fdcf88f5>]
Sep  1 15:17:53 admin kernel:    [<fdcf76bd>] [filemap_nopage+171/1008] [do_no_page+90/244] [handle_mm_fault+97/192] [<fdcf54aa>] [do_page_fault+0/1164]
Sep  1 15:17:53 admin kernel:    [dentry_open+189/316] [filp_open+82/92] [do_fcntl+370/712] [sys_ioctl+443/532] [system_call+51/56]

Trace; fdcf80f5 <[sg]sg_build_reserve+25/48>
Trace; fdcf6589 <[sg]sg_ioctl+6c5/ae4>
Trace; fdcf76bd <[sg]sg_build_indi+55/1a8>

So, there are no 0-order allocs failing in this setup.

Are you content with having no 0-order failures?

I will try to simplify the test to a case that anybody can check out more
easily.
Stay tuned.

Regards,
Stephan



Thread overview: 30+ messages

2001-08-29 12:07 Memory Problem in 2.4.10-pre2 / __alloc_pages failed Stephan von Krawczynski
2001-08-29 16:47 ` Roger Larsson
2001-08-29 19:18 ` Stephan von Krawczynski
2001-08-30 14:16   ` Stephan von Krawczynski
2001-08-29 23:36 ` Daniel Phillips
2001-08-30 16:49   ` Roger Larsson
2001-08-30 14:46 ` Stephan von Krawczynski
2001-08-30 18:02   ` Daniel Phillips
2001-08-30 23:53   ` [PATCH] __alloc_pages cleanup -R6 Was: " Roger Larsson
2001-08-31  7:43     ` Russell King
2001-08-31 23:22       ` Roger Larsson
2001-08-31 10:32   ` Stephan von Krawczynski
2001-08-31 11:06 ` Stephan von Krawczynski
2001-08-31 19:03   ` Daniel Phillips
2001-09-01 18:28 Stephan von Krawczynski
2001-09-02  1:57 ` Daniel Phillips
2001-09-02  2:21   ` Roger Larsson
2001-09-02  4:16     ` Daniel Phillips
2001-09-02 13:48     ` Alex Bligh - linux-kernel
2001-09-02 18:26       ` Daniel Phillips
2001-09-02 19:32         ` Alex Bligh - linux-kernel
2001-09-02 20:24           ` Daniel Phillips
2001-09-02 21:03             ` Alex Bligh - linux-kernel
2001-09-02 20:33           ` Daniel Phillips
2001-09-02 21:14             ` Alex Bligh - linux-kernel
2001-09-02 21:23           ` Daniel Phillips
2001-09-02 21:28             ` Alex Bligh - linux-kernel
     [not found] <Pine.LNX.4.33.0109011021570.280-100000@mikeg.weiden.de>
     [not found] ` <20010901055634Z16057-32383+2785@humbolt.nl.linux.org>
2001-09-01 18:54   ` Stephan von Krawczynski
2001-09-02  3:21     ` Mike Galbraith
     [not found] <689208719.999883299@[10.132.112.53]>
     [not found] ` <20010907154801.028a48e8.skraw@ithnet.com>
2001-09-07 21:13   ` Stephan von Krawczynski
