linux-kernel.vger.kernel.org archive mirror
* Memory Problem in 2.4.9 ?
@ 2001-08-21 13:46 ` Stephan von Krawczynski
  2001-08-21 14:33   ` Daniel Phillips
  0 siblings, 1 reply; 21+ messages in thread
From: Stephan von Krawczynski @ 2001-08-21 13:46 UTC (permalink / raw)
  To: linux-kernel

Hello all,

can anybody enlighten me about the following kernel-message:

Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
Aug 21 14:37:39 admin kernel: __alloc_pages: 2-order allocation failed. 
Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
Aug 21 14:37:39 admin kernel: __alloc_pages: 2-order allocation failed.
Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
Aug 21 14:37:39 admin last message repeated 69 times

I get tons of them while verifying burned CDs with xcdroast (which takes a really long time, so something must be going wrong). 
I would not bother you if this had not worked with earlier kernel versions, but in fact it did. 

Hardware: (Asus CUV4X-D, yes it works :-)
00:00.0 Host bridge: VIA Technologies, Inc. VT82C693A/694x [Apollo PRO133x] (rev c4)
00:01.0 PCI bridge: VIA Technologies, Inc. VT82C598/694x [Apollo MVP3/Pro133x AGP]
00:04.0 ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 40)
00:04.1 IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 06)
00:04.2 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:04.3 USB Controller: VIA Technologies, Inc. UHCI USB (rev 16)
00:04.4 Host bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 40)
00:09.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03)
00:0a.0 Network controller: Elsa AG QuickStep 1000 (rev 01)
00:0b.0 SCSI storage controller: Adaptec 7892A (rev 02)
00:0d.0 Multimedia audio controller: Creative Labs SB Live! EMU10000 (rev 07)
00:0d.1 Input device controller: Creative Labs SB Live! (rev 07)
01:00.0 VGA compatible controller: nVidia Corporation NV11 (rev b2)
02:04.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:05.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:06.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)
02:07.0 Ethernet controller: Digital Equipment Corporation DECchip 21142/43 (rev 41)

cpu:
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 10
cpu MHz         : 1004.525
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 2005.40

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 8
model name      : Pentium III (Coppermine)
stepping        : 10
cpu MHz         : 1004.525
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr sse
bogomips        : 2005.40

MEM:
        total:    used:    free:  shared: buffers:  cached:
Mem:  1053675520 1047502848  6172672        0 20930560 939307008
Swap: 271392768 15880192 255512576
MemTotal:      1028980 kB
MemFree:          6028 kB
MemShared:           0 kB
Buffers:         20440 kB
Cached:         911860 kB
SwapCached:       5432 kB
Active:         571980 kB
Inact_dirty:    362480 kB
Inact_clean:      3272 kB
Inact_target:     2336 kB
HighTotal:      131056 kB
HighFree:         2036 kB
LowTotal:       897924 kB
LowFree:          3992 kB
SwapTotal:      265032 kB
SwapFree:       249524 kB


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-21 13:46 ` Memory Problem in 2.4.9 ? Stephan von Krawczynski
@ 2001-08-21 14:33   ` Daniel Phillips
       [not found]     ` <20010821194140.43b46b10.skraw@ithnet.com>
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Phillips @ 2001-08-21 14:33 UTC (permalink / raw)
  To: Stephan von Krawczynski, linux-kernel

On August 21, 2001 03:46 pm, Stephan von Krawczynski wrote:
> Hello all,
> 
> can anybody enlighten me about the following kernel-message:
> 
> Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
> Aug 21 14:37:39 admin kernel: __alloc_pages: 2-order allocation failed. 
> Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
> Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
> Aug 21 14:37:39 admin kernel: __alloc_pages: 2-order allocation failed.
> Aug 21 14:37:39 admin kernel: __alloc_pages: 3-order allocation failed.
> Aug 21 14:37:39 admin last message repeated 69 times
> 
> I get tons of them while verifying burned CDs with xcdroast (which takes
> a really long time, so something must be going wrong).  I would not
> bother you if this had not worked with earlier kernel versions, but in
> fact it did.

The following patch will tell us more about the failure, could you please
apply (patch -p0): 

--- ../2.4.9.clean/mm/page_alloc.c	Thu Aug 16 12:43:02 2001
+++ ./mm/page_alloc.c	Mon Aug 20 22:05:40 2001
@@ -502,7 +502,7 @@
 	}
 
 	/* No luck.. */
-	printk(KERN_ERR "__alloc_pages: %lu-order allocation failed.\n", order);
+	printk(KERN_ERR "__alloc_pages: %lu-order allocation failed (gfp=0x%x/%i).\n", order, gfp_mask, !!(current->flags & PF_MEMALLOC));
 	return NULL;
 }
 

^ permalink raw reply	[flat|nested] 21+ messages in thread
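
The patched printk above appends two fields to the message: the raw gfp mask in hex, and the PF_MEMALLOC state of the current task reduced to 0 or 1 by the `!!` idiom. A minimal user-space sketch of that formatting follows. The bit value used for PF_MEMALLOC is a placeholder assumption rather than something taken from the 2.4.9 headers, and format_alloc_failure is a hypothetical helper, not a kernel function.

```c
#include <stdio.h>

/* Placeholder bit value for illustration only; the real definition
 * lives in the kernel headers and may differ. */
#define PF_MEMALLOC_ASSUMED 0x00000800UL

/* Hypothetical helper that mirrors the patched printk format string. */
static int format_alloc_failure(char *buf, size_t len, unsigned long order,
                                unsigned int gfp_mask, unsigned long task_flags)
{
    /* !!x maps any non-zero value to 1 and zero to 0. */
    int memalloc = !!(task_flags & PF_MEMALLOC_ASSUMED);

    return snprintf(buf, len,
                    "__alloc_pages: %lu-order allocation failed (gfp=0x%x/%i).",
                    order, gfp_mask, memalloc);
}
```

Read this way, a trailing "/0" means the failing caller was not itself running with PF_MEMALLOC set, and "/1" means it was.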

* Re: Memory Problem in 2.4.9 ?
       [not found]     ` <20010821194140.43b46b10.skraw@ithnet.com>
@ 2001-08-21 18:17       ` Stephan von Krawczynski
  2001-08-21 19:10         ` Daniel Phillips
                           ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Stephan von Krawczynski @ 2001-08-21 18:17 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Tue, 21 Aug 2001 19:55:49 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> Do you have highmem configged?  Try it with highmem off.

I did. Problem stays:

Aug 21 20:14:51 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
Aug 21 20:14:51 admin last message repeated 146 times

Next idea?

Regards,
Stephan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-21 18:17       ` Stephan von Krawczynski
@ 2001-08-21 19:10         ` Daniel Phillips
  2001-08-22  0:04         ` Stephan von Krawczynski
  2001-08-22 11:52         ` Stephan von Krawczynski
  2 siblings, 0 replies; 21+ messages in thread
From: Daniel Phillips @ 2001-08-21 19:10 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On August 21, 2001 08:17 pm, Stephan von Krawczynski wrote:
> On Tue, 21 Aug 2001 19:55:49 +0200
> Daniel Phillips <phillips@bonn-fries.net> wrote:
> 
> > Do you have highmem configged?  Try it with highmem off.
> 
> I did. Problem stays:
> 
> Aug 21 20:14:51 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
> Aug 21 20:14:51 admin last message repeated 146 times
> 
> Next idea?

It's an atomic allocation, the driver is supposed to be able to handle this, 
and it does since you report that the burn just runs slowly, it does not 
stop.  There is way too much in cache:

>         total:    used:    free:  shared: buffers:  cached:
> Mem:  1053675520 1047502848  6172672        0 20930560 939307008
> Swap: 271392768 15880192 255512576

This is causing the high-order allocation failures: with only a small 
fraction of memory free, the chance that none of it lies in contiguous, 
aligned 8-page units rises dramatically.  It's likely the printk that is 
slowing you down; you could check this by commenting it out (page_alloc.c, 
near the end of __alloc_pages).

Do you have the same problem on 2.4.8, but not in 2.4.7?

--
Daniel

^ permalink raw reply	[flat|nested] 21+ messages in thread
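
An order-n request asks for 2^n physically contiguous pages aligned to a 2^n-page boundary, so the failing order-3 requests each need 8 such pages. Assuming 4 kB pages, the 6172672 bytes reported free amount to 1507 pages, and nothing guarantees that any 8 of them sit together on an aligned boundary. The sketch below scans a free-page map for one aligned free block. The map layout and the name has_free_block are hypothetical and far simpler than the kernel's buddy free lists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if free_map (true = page is free) contains at least one
 * naturally aligned run of (1 << order) free pages. */
static bool has_free_block(const bool *free_map, size_t npages,
                           unsigned int order)
{
    size_t block = (size_t)1 << order;

    /* Only block-aligned start positions can satisfy the request. */
    for (size_t start = 0; start + block <= npages; start += block) {
        bool all_free = true;

        for (size_t i = 0; i < block; i++) {
            if (!free_map[start + i]) {
                all_free = false;
                break;
            }
        }
        if (all_free)
            return true;
    }
    return false;
}
```

A map in which every other page is free has half of its memory free and still fails every request of order 1 or higher, which is the effect described above in miniature.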

* Re: Memory Problem in 2.4.9 ?
  2001-08-21 18:17       ` Stephan von Krawczynski
  2001-08-21 19:10         ` Daniel Phillips
@ 2001-08-22  0:04         ` Stephan von Krawczynski
  2001-08-22  0:43           ` Daniel Phillips
  2001-08-22 11:52         ` Stephan von Krawczynski
  2 siblings, 1 reply; 21+ messages in thread
From: Stephan von Krawczynski @ 2001-08-22  0:04 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Tue, 21 Aug 2001 21:10:44 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> > Aug 21 20:14:51 admin kernel: __alloc_pages: 3-order allocation failed (gfp=0x20/0).
> > Aug 21 20:14:51 admin last message repeated 146 times
> > 
> > Next idea?
> 
> It's an atomic allocation, the driver is supposed to be able to handle this, 
> and it does since you report that the burn just runs slowly, it does not 
> stop.  There is way too much in cache:
> 
> >         total:    used:    free:  shared: buffers:  cached:
> > Mem:  1053675520 1047502848  6172672        0 20930560 939307008
> > Swap: 271392768 15880192 255512576
> 
> This is causing the high order allocation failures - with only a small 
> fraction of memory free the chances of none of it being in contiguous, 
> aligned 8 page units rises dramatically.

I basically thought the same. In fact I do not understand why. Are there any parameters tunable to balance the whole picture a bit more towards the free pages?

>  It's likely the printk that is 
> slowing you down; you could check this by commenting it out (page_alloc.c, 
> near the end of __alloc_pages).

I guess you mean the formerly patched debug output, don't you? I commented it out and saw a much better result than before. In fact I did not manage to break the NFS copy at all, and although I managed to get the CPU load up to about 5, everything worked more smoothly. Only now and then there were moments where the display froze "a bit", but mouse movement continued to work.
Anyway, I am not sure it is intended that my browser gets swapped out merely by copying files via NFS which are altogether smaller than my physical 1 GB of RAM. I do think that there is still a little too much caching going on.

> Do you have the same problem on 2.4.8, but not in 2.4.7?

I am going to check that tomorrow. Downgrading is a bit tricky on this system.

Thanks Daniel, 
I'll be back
:-)


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-22  0:04         ` Stephan von Krawczynski
@ 2001-08-22  0:43           ` Daniel Phillips
  2001-08-22  0:48             ` Rik van Riel
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Phillips @ 2001-08-22  0:43 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On August 22, 2001 02:04 am, Stephan von Krawczynski wrote:
> Daniel Phillips <phillips@bonn-fries.net> wrote:
> 
> > > Aug 21 20:14:51 admin kernel: __alloc_pages: 3-order allocation failed 
> > > (gfp=0x20/0).
> > > Aug 21 20:14:51 admin last message repeated 146 times
> > > 
> > > Next idea?
> > 
> > It's an atomic allocation, the driver is supposed to be able to handle
> > this, and it does since you report that the burn just runs slowly, it 
> > does not stop.  There is way too much in cache:
> > 
> > >         total:    used:    free:  shared: buffers:  cached:
> > > Mem:  1053675520 1047502848  6172672        0 20930560 939307008
> > > Swap: 271392768 15880192 255512576
> > 
> > This is causing the high order allocation failures - with only a small 
> > fraction of memory free the chances of none of it being in contiguous, 
> > aligned 8 page units rises dramatically.
> 
> I basically thought the same. In fact I do not understand why. Are there any
> parameters tunable to balance the whole picture a bit more towards the free
> pages?

I'd like to try to isolate the cause a little more.  Can you please try the 
following patch and see if it improves the cache balance.  (This would not
be a good solution, it will just help show what is happening.)

--- ../2.4.9.clean/mm/filemap.c	Thu Aug 16 14:12:07 2001
+++ ./mm/filemap.c	Wed Aug 22 01:11:44 2001
@@ -980,7 +980,7 @@
 static inline void check_used_once (struct page *page)
 {
 	if (!PageActive(page)) {
-		if (page->age)
+		if (page->age > 8)
 			activate_page(page);
 		else {
 			page->age = PAGE_AGE_START;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-22  0:43           ` Daniel Phillips
@ 2001-08-22  0:48             ` Rik van Riel
  2001-08-22  1:13               ` Daniel Phillips
  2001-08-22 10:43               ` Stephan von Krawczynski
  0 siblings, 2 replies; 21+ messages in thread
From: Rik van Riel @ 2001-08-22  0:48 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Stephan von Krawczynski, linux-kernel

On Wed, 22 Aug 2001, Daniel Phillips wrote:

> --- ../2.4.9.clean/mm/filemap.c	Thu Aug 16 14:12:07 2001
> +++ ./mm/filemap.c	Wed Aug 22 01:11:44 2001
> @@ -980,7 +980,7 @@
>  static inline void check_used_once (struct page *page)
>  {
>  	if (!PageActive(page)) {
> -		if (page->age)
> +		if (page->age > 8)
>  			activate_page(page);
>  		else {
>  			page->age = PAGE_AGE_START;

This makes absolutely no sense since you'll never set the
page age higher than PAGE_AGE_START until the page is actually
on the active list.

Rik
--
IA64: a worthy successor to i860.

http://www.surriel.com/		http://distro.conectiva.com/

Send all your spam to aardvark@nl.linux.org (spam digging piggy)


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-22  0:48             ` Rik van Riel
@ 2001-08-22  1:13               ` Daniel Phillips
  2001-08-22 11:01                 ` Stephan von Krawczynski
  2001-08-22 10:43               ` Stephan von Krawczynski
  1 sibling, 1 reply; 21+ messages in thread
From: Daniel Phillips @ 2001-08-22  1:13 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Stephan von Krawczynski, linux-kernel

On August 22, 2001 02:48 am, Rik van Riel wrote:
> On Wed, 22 Aug 2001, Daniel Phillips wrote:
> 
> > --- ../2.4.9.clean/mm/filemap.c	Thu Aug 16 14:12:07 2001
> > +++ ./mm/filemap.c	Wed Aug 22 01:11:44 2001
> > @@ -980,7 +980,7 @@
> >  static inline void check_used_once (struct page *page)
> >  {
> >  	if (!PageActive(page)) {
> > -		if (page->age)
> > +		if (page->age > 8)
> >  			activate_page(page);
> >  		else {
> >  			page->age = PAGE_AGE_START;
> 
> This makes absolutely no sense since you'll never set the
> page age higher than PAGE_AGE_START until the page is actually
> on the active list.

Oops, yes, I forgot for the moment that we no longer age up in 
__find_page_nolock.  Let's try this instead, which should capture the intended 
effect of requiring 4 hits to activate a page (n.b., it's just a test):

--- ../2.4.9.clean/mm/filemap.c	Thu Aug 16 14:12:07 2001
+++ ./mm/filemap.c	Wed Aug 22 02:02:24 2001
@@ -980,10 +980,9 @@
 static inline void check_used_once (struct page *page)
 {
 	if (!PageActive(page)) {
-		if (page->age)
+		if (++page->age >= 4)
 			activate_page(page);
 		else {
-			page->age = PAGE_AGE_START;
 			ClearPageReferenced(page);
 		}
 	}

^ permalink raw reply	[flat|nested] 21+ messages in thread
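
The three variants of check_used_once discussed so far can be compared in a small user-space model. This is only a sketch under stated assumptions: a fresh page-cache page is taken to start at age 0, PAGE_AGE_START is given the placeholder value 2, and struct toy_page, the policy enum and hits_to_activate are hypothetical names. The referenced bit and the LRU lists are left out entirely.

```c
#include <stdbool.h>

#define PAGE_AGE_START 2   /* assumed value, for illustration only */

struct toy_page {
    unsigned int age;
    bool active;
};

enum policy {
    POLICY_ORIGINAL,    /* if (page->age) activate */
    POLICY_AGE_GT_8,    /* first test patch: if (page->age > 8) */
    POLICY_FOUR_HITS    /* second test patch: if (++page->age >= 4) */
};

static void check_used_once(struct toy_page *page, enum policy p)
{
    if (page->active)
        return;

    switch (p) {
    case POLICY_ORIGINAL:
        if (page->age)
            page->active = true;
        else
            page->age = PAGE_AGE_START;
        break;
    case POLICY_AGE_GT_8:
        if (page->age > 8)
            page->active = true;
        else
            page->age = PAGE_AGE_START;   /* age never climbs past this */
        break;
    case POLICY_FOUR_HITS:
        if (++page->age >= 4)
            page->active = true;
        break;
    }
}

/* Number of hits needed to activate a fresh page, or 0 if the page is
 * still inactive after max_hits. */
static unsigned int hits_to_activate(enum policy p, unsigned int max_hits)
{
    struct toy_page page = { 0, false };

    for (unsigned int hit = 1; hit <= max_hits; hit++) {
        check_used_once(&page, p);
        if (page.active)
            return hit;
    }
    return 0;
}
```

In this model the original code activates a page on its second hit, the "age > 8" variant never activates it (the objection raised above), and the "++age >= 4" variant activates it on the fourth hit.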

* Re: Memory Problem in 2.4.9 ?
  2001-08-22  0:48             ` Rik van Riel
  2001-08-22  1:13               ` Daniel Phillips
@ 2001-08-22 10:43               ` Stephan von Krawczynski
  1 sibling, 0 replies; 21+ messages in thread
From: Stephan von Krawczynski @ 2001-08-22 10:43 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Wed, 22 Aug 2001 03:13:23 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> Oops, yes, I forgot for the moment that we no longer age up in 
> __find_page_nolock.  Let's try this instead, which should capture the intended 
> effect of requiring 4 hits to activate a page (n.b., it's just a test):
> 
> --- ../2.4.9.clean/mm/filemap.c	Thu Aug 16 14:12:07 2001
> +++ ./mm/filemap.c	Wed Aug 22 02:02:24 2001
> @@ -980,10 +980,9 @@
>  static inline void check_used_once (struct page *page)
>  {
>  	if (!PageActive(page)) {
> -		if (page->age)
> +		if (++page->age >= 4)
>  			activate_page(page);
>  		else {
> -			page->age = PAGE_AGE_START;
>  			ClearPageReferenced(page);
>  		}
>  	}
> 

Ok. I applied this patch. What I experience is this:

meminfo Before test:

        total:    used:    free:  shared: buffers:  cached:
Mem:  921726976 87789568 833937408        0  6705152 37306368
Swap: 271392768        0 271392768
MemTotal:       900124 kB
MemFree:        814392 kB
MemShared:           0 kB
Buffers:          6548 kB
Cached:          36432 kB
SwapCached:          0 kB
Active:           2944 kB
Inact_dirty:     40036 kB
Inact_clean:         0 kB
Inact_target:      868 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900124 kB
LowFree:        814392 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

meminfo after test:
        total:    used:    free:  shared: buffers:  cached:
Mem:  921726976 918429696  3297280        0  9211904 792858624
Swap: 271392768        0 271392768
MemTotal:       900124 kB
MemFree:          3220 kB
MemShared:           0 kB
Buffers:          8996 kB
Cached:         774276 kB
SwapCached:          0 kB
Active:          46776 kB
Inact_dirty:    731852 kB
Inact_clean:      4644 kB
Inact_target:     8460 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900124 kB
LowFree:          3220 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

I see the cache grow slowly but constantly during the file copy. I stopped the test when the first NFS errors occurred on the client side (cp: /backup/Aug/day_18_10.gz: Stale NFS file handle). Interestingly, the errors came up when all physical memory was eaten up by the cache and free memory was very low, but no swapping occurred (swap _is_ turned on).
I could copy 766389 kB in total, which is roughly the Cached value. I guess there is simply no release done. Does the aging algorithm really work (as expected)?

Regards, Stephan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-22  1:13               ` Daniel Phillips
@ 2001-08-22 11:01                 ` Stephan von Krawczynski
  2001-08-22 17:22                   ` Mike Galbraith
  2001-08-22 19:18                   ` Stephan von Krawczynski
  0 siblings, 2 replies; 21+ messages in thread
From: Stephan von Krawczynski @ 2001-08-22 11:01 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: linux-kernel

On Wed, 22 Aug 2001 07:33:38 +0200 (CEST)
Mike Galbraith <mikeg@wen-online.de> wrote:

>> HAHAHA.. I was right, hurried whack with my little hammer _did_ bust
> it all to pieces :)
> 
> This is also (very!) hurried and _lightly_ tested, but still cures my
> problem..  what do you think?
> 
> 	-Mike
> 
> 
> --- linux-2.4.9/mm/vmscan.c.org	Sun Aug 19 08:55:24 2001
> +++ linux-2.4.9/mm/vmscan.c	Wed Aug 22 05:03:50 2001
> @@ -506,11 +506,17 @@
>  		}
[...]
> +			if (++page->age > PAGE_AGE_START) {

I am not very experienced with the aging algorithm, but can this statement be false at all? I mean if I get that right page->age starts with PAGE_AGE_START, doesn't it?

> +				del_page_from_inactive_dirty_list(page);
> +				add_page_to_active_list(page);
> +				page->age = PAGE_AGE_START;
> +				continue;
> +			}
> +			list_del(page_lru);
> +			list_add(page_lru, &inactive_dirty_list);
>  			continue;
>  		}
> 
> @@ -927,7 +933,7 @@
>  			recalculate_vm_stats();
>  		}
> 
> -		if (!do_try_to_free_pages(GFP_KSWAPD, 1)) {
> +		if (!do_try_to_free_pages(GFP_KSWAPD, 0)) {
>  			if (out_of_memory())
>  				oom_kill();
>  			continue;
> --- linux-2.4.9/mm/filemap.c.org	Mon Aug 20 17:25:20 2001
> +++ linux-2.4.9/mm/filemap.c	Wed Aug 22 05:07:35 2001
> @@ -980,12 +980,9 @@
>  static inline void check_used_once (struct page *page)
>  {
>  	if (!PageActive(page)) {
> -		if (page->age)
> +		if (++page->age > PAGE_AGE_START)

same here. Am I missing something?

>  			activate_page(page);
> -		else {
> -			page->age = PAGE_AGE_START;
> -			ClearPageReferenced(page);
> -		}
> +		ClearPageReferenced(page);
>  	}
>  }
> 
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-21 18:17       ` Stephan von Krawczynski
  2001-08-21 19:10         ` Daniel Phillips
  2001-08-22  0:04         ` Stephan von Krawczynski
@ 2001-08-22 11:52         ` Stephan von Krawczynski
  2 siblings, 0 replies; 21+ messages in thread
From: Stephan von Krawczynski @ 2001-08-22 11:52 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Tue, 21 Aug 2001 21:10:44 +0200
Daniel Phillips <phillips@bonn-fries.net> wrote:

> Do you have the same problem on 2.4.8, but not in 2.4.7?

I tested the situation with 2.4.7 (straight, no patches) and it looks like this:

meminfo before:

        total:    used:    free:  shared: buffers:  cached:
Mem:  921735168 88141824 833593344        0  6643712 36012032
Swap: 271392768        0 271392768
MemTotal:       900132 kB
MemFree:        814056 kB
MemShared:           0 kB
Buffers:          6488 kB
Cached:          35168 kB
SwapCached:          0 kB
Active:          34920 kB
Inact_dirty:      6736 kB
Inact_clean:         0 kB
Inact_target:      864 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900132 kB
LowFree:        814056 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

meminfo after test:

        total:    used:    free:  shared: buffers:  cached:
Mem:  921735168 917393408  4341760        0 221192192 567046144
Swap: 271392768        0 271392768
MemTotal:       900132 kB
MemFree:          4240 kB
MemShared:           0 kB
Buffers:        216008 kB
Cached:         553756 kB
SwapCached:          0 kB
Active:         107912 kB
Inact_dirty:    658432 kB
Inact_clean:      3420 kB
Inact_target:    11360 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       900132 kB
LowFree:          4240 kB
SwapTotal:      265032 kB
SwapFree:       265032 kB

I can see these:

Aug 22 13:34:53 admin kernel: __alloc_pages: 2-order allocation failed.
Aug 22 13:34:53 admin kernel: __alloc_pages: 3-order allocation failed.
Aug 22 13:34:53 admin last message repeated 21 times

_BUT_ I cannot see any errors during the NFS file copy. I tried hard, but got no errors. Another thing is the CPU load: it is definitely lower than with 2.4.9, staying around 3 to 3.5 and never reaching 4 or above.
Swap is not used, although turned on.
Besides the above kernel-messages I would say that 2.4.7 performs a lot better (and more stable) than 2.4.9 in this test case.
I think a deep look should be taken into this topic, because it makes 2.4.9 pretty unusable for server-environment. I wonder if anybody can produce (low-memory) errors during normal file-operation on localhost (not NFS like me). I would expect that, for it doesn't look NFS specific. I can make the kernel shoot loads of error messages only by reading CDs while copying in the background from the net. The effect can be seen vice versa, too. So you could say it is clearly a memory management problem, and not related to the allocating process.

Should I try with a plain 2.4.8 ?

Regards, Stephan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Memory Problem in 2.4.9 ?
  2001-08-22 11:01                 ` Stephan von Krawczynski
@ 2001-08-22 17:22                   ` Mike Galbraith
  2001-08-22 19:18                   ` Stephan von Krawczynski
  1 sibling, 0 replies; 21+ messages in thread
From: Mike Galbraith @ 2001-08-22 17:22 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel

On Wed, 22 Aug 2001, Stephan von Krawczynski wrote:

> On Wed, 22 Aug 2001 07:33:38 +0200 (CEST)
> Mike Galbraith <mikeg@wen-online.de> wrote:
>
> >> HAHAHA.. I was right, hurried whack with my little hammer _did_ bust
> > it all to pieces :)
> >
> > This is also (very!) hurried and _lightly_ tested, but still cures my
> > problem..  what do you think?
> >
> > 	-Mike
> >
> >
> > --- linux-2.4.9/mm/vmscan.c.org	Sun Aug 19 08:55:24 2001
> > +++ linux-2.4.9/mm/vmscan.c	Wed Aug 22 05:03:50 2001
> > @@ -506,11 +506,17 @@
> >  		}
> [...]
> > +			if (++page->age > PAGE_AGE_START) {
>
> I am not very experienced with the aging algorithm, but can this statement be false at all? I mean if I get that right page->age starts with PAGE_AGE_START, doesn't it?

When a page is added to the pagecache, it begins life with age=0 and
is placed on the inactive_dirty list with use_once.  With the original
aging, it started with PAGE_AGE_START and was placed on the active
list.  The intent of used once (correct me Daniel if I fsck up.. haven't
been able to track vm changes very thoroughly lately [as you can see:])
is to place a new page in the line of fire of page reclamation and only
pull it into the active aging scheme if it is referenced again prior to
consumption.  This is intended to preserve other cached pages in the event
of streaming IO.  Your cache won't be demolished as quickly, the pages
which are only used one time will self destruct instead.  Cool idea.

Unfortunately, with loads like file rewrite, so many (all?) pages meet
the qualification, and are activated and aged up immediately that they
swamp the system.  Background aging can't keep up at all (even if you
accelerate it wildly btw), so you end up swapping needlessly.

This quick hack is intended to do something like use_once in that a page
which has been deactivated does not go back to the active queue merely
because of a single access (etc).  Instead, you get a couple of chances
to stay on your death march.  Often used pages will drop out of line,
seldom used pages won't.

It might be a really rotten way to cure my problem.. I'm not sure yet.

Christ on a crutch.. that sure was a longwinded way to say "yes, the
statement can be false".  Think I'll go turn on the idiot box, crack
a brew and play a round of couch potato :)

	Later,

	-Mike


^ permalink raw reply	[flat|nested] 21+ messages in thread
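
The answer above ("yes, the statement can be false") can be checked with a few lines of arithmetic. The sketch counts how many references it takes before `++page->age > PAGE_AGE_START` first becomes true for a given starting age. PAGE_AGE_START is given the placeholder value 2 here, which is an assumption, and references_until_true is a hypothetical helper.

```c
#define PAGE_AGE_START 2   /* assumed value, for illustration only */

/* Count the references needed until (++age > PAGE_AGE_START) holds,
 * starting from initial_age. */
static unsigned int references_until_true(unsigned int initial_age)
{
    unsigned int age = initial_age;
    unsigned int refs = 0;

    for (;;) {
        refs++;
        if (++age > PAGE_AGE_START)
            return refs;
    }
}
```

Starting from PAGE_AGE_START, as under the original aging scheme, the test is true on the very first reference. Starting from 0, as described for use_once, it is false for the first two references and true on the third.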

* Re: Memory Problem in 2.4.9 ?
  2001-08-22 11:01                 ` Stephan von Krawczynski
  2001-08-22 17:22                   ` Mike Galbraith
@ 2001-08-22 19:18                   ` Stephan von Krawczynski
  2001-08-23  4:57                     ` Mike Galbraith
  1 sibling, 1 reply; 21+ messages in thread
From: Stephan von Krawczynski @ 2001-08-22 19:18 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: linux-kernel, phillips

On Wed, 22 Aug 2001 19:22:35 +0200 (CEST)
Mike Galbraith <mikeg@wen-online.de> wrote:

> When a page is added to the pagecache, it begins life with age=0 and
> is placed on the inactive_dirty list with use_once.  With the original
> aging, it started with PAGE_AGE_START and was placed on the active
> list.  The intent of used once (correct me Daniel if I fsck up.. haven't
> been able to track vm changes very thoroughly lately [as you can see:])
> is to place a new page in the line of fire of page reclamation and only
> pull it into the active aging scheme if it is referenced again prior to
> consumption.  This is intended to preserve other cached pages in the event
> of streaming IO.  Your cache won't be demolished as quickly, the pages
> which are only used one time will self destruct instead.  Cool idea.

Well, maybe I am completely off the road, but the primary problem seems to be that a whole lot of the pages _look_ like they are of the same age, and the algorithm cannot cope with that very well. There is obviously no way out of this problem for the code, and that's basically why it fails to alloc pages with this warning message. So the primary goal should be to refine the algorithm and give it a way to _know_ a way out, and not to _guess_ ("maybe we got some free pages later") or _give up_ on the problem. How about the following (ridiculously simple) approach:
every alloc'ed page gets a "timestamp". If an alloc request reaches the current "dead point", it simply throws out the oldest x pages of the lowest aging level reachable. This is sort of a garbage-collection idea. It does not sound very fast, but it sounds workable, doesn't it?
Best of all, very few changes have to be made to make it work.

Shoot me for this :-)

Regards, Stephan

PS: timestamp could be a simple static int, that is counted up on every successful alloc. Obviously page needs an additional struct member.

^ permalink raw reply	[flat|nested] 21+ messages in thread
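
The timestamp proposal above can be sketched in user space as follows. This is a toy model only: struct toy_page, stamp_page and evict_oldest are hypothetical names, the scan is a plain linear search, and the model ignores whether a page could actually be freed at that moment (dirty, mapped or locked pages).

```c
#include <stddef.h>

struct toy_page {
    unsigned int age;       /* aging level, lower = colder */
    unsigned long stamp;    /* value of the counter at allocation time */
    int in_use;
};

/* Counted up on every successful allocation, as in the proposal. */
static unsigned long alloc_counter;

static void stamp_page(struct toy_page *p)
{
    p->stamp = ++alloc_counter;
    p->in_use = 1;
}

/* Evict up to x pages, always picking the lowest age level still present
 * and, within that level, the smallest (oldest) timestamp.
 * Returns the number of pages evicted. */
static size_t evict_oldest(struct toy_page *pages, size_t n, size_t x)
{
    size_t evicted = 0;

    while (evicted < x) {
        struct toy_page *victim = NULL;

        for (size_t i = 0; i < n; i++) {
            struct toy_page *p = &pages[i];

            if (!p->in_use)
                continue;
            if (!victim || p->age < victim->age ||
                (p->age == victim->age && p->stamp < victim->stamp))
                victim = p;
        }
        if (!victim)
            break;          /* nothing left to evict */
        victim->in_use = 0;
        evicted++;
    }
    return evicted;
}
```

Each eviction costs a full pass over the page array, which matches the remark that the idea does not sound very fast.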

* Re: Memory Problem in 2.4.9 ?
  2001-08-22 19:18                   ` Stephan von Krawczynski
@ 2001-08-23  4:57                     ` Mike Galbraith
  0 siblings, 0 replies; 21+ messages in thread
From: Mike Galbraith @ 2001-08-23  4:57 UTC (permalink / raw)
  To: Stephan von Krawczynski; +Cc: linux-kernel, phillips

On Wed, 22 Aug 2001, Stephan von Krawczynski wrote:

> On Wed, 22 Aug 2001 19:22:35 +0200 (CEST)
> Mike Galbraith <mikeg@wen-online.de> wrote:
>
> > When a page is added to the pagecache, it begins life with age=0 and
> > is placed on the inactive_dirty list with use_once.  With the original
> > aging, it started with PAGE_AGE_START and was placed on the active
> > list.  The intent of used once (correct me Daniel if I fsck up.. haven't
> > been able to track vm changes very thoroughly lately [as you can see:])
> > is to place a new page in the line of fire of page reclamation and only
> > pull it into the active aging scheme if it is referenced again prior to
> > consumption.  This is intended to preserve other cached pages in the event
> > of streaming IO.  Your cache won't be demolished as quickly, the pages
> > which are only used one time will self destruct instead.  Cool idea.
>

(your mailer doesn't wrap lines.. formatting)

> Well, maybe I am completely off the road, but the primary problem seems
> to be that a whole lot of the pages _look_ like being of the same age,
> and the algorithm cannot cope with that very well. There is obviously
> no way out of this problem for the code, and that's basically why it
>  fails to alloc pages with this warning message. So the primary goal

Sure, having a poor distribution of age isn't good (makes vm 'rough'),
but I don't think that's what is causing most of the allocation
failures.  IMHO, the largest problem with these is not keeping our
inactive_clean ammo belt full enough in general.  I bet that changing
page_launder to have a cleaned_pages target instead of limiting the
scan to 1/64 would cure a lot of that.  A 1/64 scan is nice as long
as the list is very long.  As it shrinks though, you do less and less
work.  It needs min and max limits.  IMHO, kswapd should never launder
less than at _least_ freepages.min even if that's the entire list.

>  should be to refine the algorithm and give it a way to _know_ a way
>  out, and not to _guess_ ("maybe we got some free pages later") or
>  _give up_ on the problem. How about the following (ridiculously
>  simple) approach:
> every alloc'ed page gets a "timestamp". If an alloc-request reaches
>  the current "dead point" it simply throws out the oldest x pages of
>  the lowest aging level reachable. This is sort of a garbage-collection
>  idea. It does not sound very fast, but it sounds workable, doesn't it?
> Best of all, very few changes have to be made to make it work.

Many changes are required to make it work.  First of all, it requires
enforced deallocation.

> Shoot me for this :-)

<click click click.. dang, no bullets> :)

	-Mike


^ permalink raw reply	[flat|nested] 21+ messages in thread
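
The point about the 1/64 scan limit is easy to see with numbers: a fixed fraction of a shrinking list tends toward zero work. The sketch below contrasts the plain fraction with a version clamped between a minimum and a maximum, as suggested above. The function names and the limit values used in the example are hypothetical, and this is not the actual page_launder code.

```c
#include <stddef.h>

/* Scan budget as a plain 1/64 fraction of the list length. */
static size_t scan_fraction(size_t list_len)
{
    return list_len / 64;
}

/* Same budget, but never below min_pages and never above max_pages,
 * and never more than the list actually holds. */
static size_t scan_clamped(size_t list_len, size_t min_pages,
                           size_t max_pages)
{
    size_t n = list_len / 64;

    if (n < min_pages)
        n = min_pages;
    if (n > max_pages)
        n = max_pages;
    if (n > list_len)
        n = list_len;   /* "even if that's the entire list" */
    return n;
}
```

With a list of 63 pages the plain fraction yields a budget of 0, while the clamped version with a minimum of 255 scans all 63 pages.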

* Re: Memory Problem in 2.4.9 ?
  2001-08-23  0:10       ` Marcelo Tosatti
@ 2001-08-23  2:29         ` Daniel Phillips
  2001-08-23  1:19           ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Phillips @ 2001-08-23  2:29 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: tommy, Linux Kernel

On August 23, 2001 02:10 am, Marcelo Tosatti wrote:
> On Thu, 23 Aug 2001, Daniel Phillips wrote:
> 
> > On August 22, 2001 09:05 pm, Marcelo Tosatti wrote:
> > > On Wed, 22 Aug 2001, Daniel Phillips wrote:
> > > > What can we do right now?  We could always just comment out the alloc failed 
> > > > message.  The result will be a lot of busy waiting on dirty page writeout 
> > > > which will work but it will keep us from focussing on the question: how did 
> > > > we get so short of bounce buffers?  Well, maybe we are submitting too much IO 
> > > > without intelligent throttling (/me waves at Ben).  That sounds like the 
> > > > place to attack first.
> > > 
> > > We can just wait on the writeout of lowmem buffers at page_launder()
> > > (which will not cause IO buffering since we are doing lowmem IO, duh), and
> > > then we are done.
> > > 
> > > Take a look at the patch I posted before (__GFP_NOBOUNCE). 
> > 
> > A little light reading for a Wednesday afternoon ;-)
> > 
> > Nice hack, way to go.  So this will wait synchronously in try_to_free_buffers
> > if we have to go around twice in alloc_bounce_page or alloc_bounce_bh (the
> > latter eventually resulting in a page_alloc from kmem_cache grow).
> 
> Not synchronously, no. It will just allow allocations trying to get memory
> for bounce buffering to block on lowmem IO.

Whoops, it's been a while since I read page_launder.  Hey!  Major cleanup.
It's much easier to understand what it's doing now.

OK, sync_page_buffers no longer does what its name says, or implements what
its comment says it does.  Now the GFP_WAIT just means wait on already-locked
buffers so that IO can be initiated.  (By the way, there are a bunch of
comments in try_to_free_buffers that lie now.)  OK, so the busy wait is
implemented in alloc_bounce_page, and page_launder is just used to start IO,
fine.  Hmm, I think I will try my semaphore idea, not because you haven't
solved the problem, but because I think a lot of CPU-wasting trips into
page_launder could be eliminated.  (A 2.5 thing of course.)

> With this behaviour, it's safe to completely remove Ingo's emergency scheme.

Yes, so why don't you add that to your patch, and also the correction to the
page->zone test and call it [PATCH]?

> > What does SLAB_LEVEL_MASK do?  Did you find out by hitting the BUG when you
> > tried the patch?  Anyway, it needs a comment.
> 
> SLAB_LEVEL_MASK is the mask for SLAB-valid allocations.

And "LEVEL" means?

--
Daniel


* Re: Memory Problem in 2.4.9 ?
  2001-08-23  2:29         ` Daniel Phillips
@ 2001-08-23  1:19           ` Marcelo Tosatti
  0 siblings, 0 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2001-08-23  1:19 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: tommy, Linux Kernel



On Thu, 23 Aug 2001, Daniel Phillips wrote:

> On August 23, 2001 02:10 am, Marcelo Tosatti wrote:
> > On Thu, 23 Aug 2001, Daniel Phillips wrote:
> > 
> > > On August 22, 2001 09:05 pm, Marcelo Tosatti wrote:
> > > > On Wed, 22 Aug 2001, Daniel Phillips wrote:
> > > > > What can we do right now?  We could always just comment out the alloc failed 
> > > > > message.  The result will be a lot of busy waiting on dirty page writeout 
> > > > > which will work but it will keep us from focussing on the question: how did 
> > > > > we get so short of bounce buffers?  Well, maybe we are submitting too much IO 
> > > > > without intelligent throttling (/me waves at Ben).  That sounds like the 
> > > > > place to attack first.
> > > > 
> > > > We can just wait on the writeout of lowmem buffers at page_launder()
> > > > (which will not cause IO buffering since we are doing lowmem IO, duh), and
> > > > then we are done.
> > > > 
> > > > Take a look at the patch I posted before (__GFP_NOBOUNCE). 
> > > 
> > > A little light reading for a Wednesday afternoon ;-)
> > > 
> > > Nice hack, way to go.  So this will wait synchronously in try_to_free_buffers
> > > if we have to go around twice in alloc_bounce_page or alloc_bounce_bh (the
> > > latter eventually resulting in a page_alloc from kmem_cache grow).
> > 
> > Not synchronously, no. It will just allow allocations trying to get memory
> > for bounce buffering to block on lowmem IO.
> 
> Whoops, it's been a while since I read page_launder.  Hey!  Major cleanup.
> It's much easier to understand what it's doing now.
> 
> OK, sync_page_buffers no longer does what its name says, or implements what
> its comment says it does.  Now the GFP_WAIT just means wait on already-locked
> buffers so that IO can be initiated.  (By the way, there are a bunch of
> comments in try_to_free_buffers that lie now.)  OK, so the busy wait is
> implemented in alloc_bounce_page, and page_launder is just used to start IO,
> fine.  Hmm, I think I will try my semaphore idea, not because you haven't
> solved the problem, but because I think a lot of CPU-wasting trips into
> page_launder could be eliminated.  (A 2.5 thing of course.)

There are no real CPU-wasting trips into page_launder().

As soon as we find a dirty, lowmem, unlocked page with ->buffers, we block
on IO there.

> > With this behaviour, it's safe to completely remove Ingo's emergency scheme.
> 
> Yes, so why don't you add that to your patch, and also the correction to the
> page->zone test and call it [PATCH]?

Will do it soon.
> 
> > > What does SLAB_LEVEL_MASK do?  Did you find out by hitting the BUG when you
> > > tried the patch?  Anyway, it needs a comment.
> > 
> > SLAB_LEVEL_MASK is the mask for SLAB-valid allocations.
> 
> And "LEVEL" means?

No idea. Nothing, probably. 



* Re: Memory Problem in 2.4.9 ?
  2001-08-22 19:05   ` Marcelo Tosatti
@ 2001-08-23  1:11     ` Daniel Phillips
  2001-08-23  0:10       ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Phillips @ 2001-08-23  1:11 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: tommy, Linux Kernel, Ben LaHaise

On August 22, 2001 09:05 pm, Marcelo Tosatti wrote:
> On Wed, 22 Aug 2001, Daniel Phillips wrote:
> > What can we do right now?  We could always just comment out the alloc failed 
> > message.  The result will be a lot of busy waiting on dirty page writeout 
> > which will work but it will keep us from focussing on the question: how did 
> > we get so short of bounce buffers?  Well, maybe we are submitting too much IO 
> > without intelligent throttling (/me waves at Ben).  That sounds like the 
> > place to attack first.
> 
> We can just wait on the writeout of lowmem buffers at page_launder()
> (which will not cause IO buffering since we are doing lowmem IO, duh), and
> then we are done.
> 
> Take a look at the patch I posted before (__GFP_NOBOUNCE). 

A little light reading for a Wednesday afternoon ;-)

Nice hack, way to go.  So this will wait synchronously in try_to_free_buffers
if we have to go around twice in alloc_bounce_page or alloc_bounce_bh (the
latter eventually resulting in a page_alloc from kmem_cache grow).

What does SLAB_LEVEL_MASK do?  Did you find out by hitting the BUG when you
tried the patch?  Anyway, it needs a comment.

I had in mind a completely different approach to try, using a semaphore to
count bounce buffers, and block when they run out.  Your patch fits the
pattern of the current busy-waiting strategy much better.  It's the right
thing to do.
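
[Editor's sketch] The semaphore idea mentioned above, illustrated in
user-space C with POSIX semaphores.  This is not the kernel's semaphore API
and not code from either patch; struct bounce_pool and the bounce_* function
names are hypothetical.  The semaphore's value is the number of free bounce
buffers, so a requester sleeps when the pool is empty instead of
busy-waiting.

```c
#include <semaphore.h>

/* Hypothetical pool: the semaphore counts the free bounce buffers. */
struct bounce_pool {
    sem_t free_slots;
};

/* Start with nr_buffers free buffers.  Returns 0 on success. */
int bounce_pool_init(struct bounce_pool *p, unsigned int nr_buffers)
{
    return sem_init(&p->free_slots, 0, nr_buffers);
}

/* Sleep until a buffer is free, then claim it.  Returns 0 on success;
 * sem_wait can also return -1 (for example when interrupted by a signal). */
int bounce_get(struct bounce_pool *p)
{
    return sem_wait(&p->free_slots);
}

/* Non-blocking variant: 0 if a buffer was claimed, -1 if none is free. */
int bounce_try_get(struct bounce_pool *p)
{
    return sem_trywait(&p->free_slots);
}

/* Release a buffer, e.g. when its writeout completes; wakes one sleeper. */
int bounce_put(struct bounce_pool *p)
{
    return sem_post(&p->free_slots);
}
```

A caller would pair each successful bounce_get() with one bounce_put() once
the IO using that buffer has finished.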

OK, race you to the next bug ;-)

--
Daniel



* Re: Memory Problem in 2.4.9 ?
  2001-08-23  1:11     ` Daniel Phillips
@ 2001-08-23  0:10       ` Marcelo Tosatti
  2001-08-23  2:29         ` Daniel Phillips
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2001-08-23  0:10 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: tommy, Linux Kernel, Ben LaHaise



On Thu, 23 Aug 2001, Daniel Phillips wrote:

> On August 22, 2001 09:05 pm, Marcelo Tosatti wrote:
> > On Wed, 22 Aug 2001, Daniel Phillips wrote:
> > > What can we do right now?  We could always just comment out the alloc failed 
> > > message.  The result will be a lot of busy waiting on dirty page writeout 
> > > which will work but it will keep us from focussing on the question: how did 
> > > we get so short of bounce buffers?  Well, maybe we are submitting too much IO 
> > > without intelligent throttling (/me waves at Ben).  That sounds like the 
> > > place to attack first.
> > 
> > We can just wait on the writeout of lowmem buffers at page_launder()
> > (which will not cause IO buffering since we are doing lowmem IO, duh), and
> > then we are done.
> > 
> > Take a look at the patch I posted before (__GFP_NOBOUNCE). 
> 
> A little light reading for a Wednesday afternoon ;-)
> 
> Nice hack, way to go.  So this will wait synchronously in try_to_free_buffers
> if we have to go around twice in alloc_bounce_page or alloc_bounce_bh (the
> latter eventually resulting in a page_alloc from kmem_cache grow).

Not synchronously, no. It will just allow allocations trying to get memory
for bounce buffering to block on lowmem IO.

With this behaviour, it's safe to completely remove Ingo's emergency scheme.

> What does SLAB_LEVEL_MASK do?  Did you find out by hitting the BUG when you
> tried the patch?  Anyway, it needs a comment.

SLAB_LEVEL_MASK is the mask for SLAB-valid allocations.

> I had in mind a completely different approach to try, using a semaphore to
> count bounce buffers, and block when they run out.  Your patch fits the
> pattern of the current busy-waiting strategy much better.  It's the right
> thing to do.
> 
> OK, race you to the next bug ;-)




* Re: Memory Problem in 2.4.9 ?
  2001-08-22  4:47 Tommy Wu
@ 2001-08-22 19:32 ` Daniel Phillips
  2001-08-22 19:05   ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Daniel Phillips @ 2001-08-22 19:32 UTC (permalink / raw)
  To: tommy, Linux Kernel; +Cc: Ben LaHaise

On August 22, 2001 06:47 am, Tommy Wu wrote:
>    I've tried the patch from the kernel list and got the following result.
>    These messages appeared while running the command:
>    dd if=/dev/zero of=test.dmp bs=1000k count=2500
>    on a PIII 1G SMP box with 1G RAM (HIGHMEM enabled)
>    kernel 2.4.9 with XFS filesystem patch.
>
> Aug 22 11:51:04 standby kernel: __alloc_pages: 0-order allocation failed
> (gfp=0x30/1).
> Aug 22 11:51:11 standby last message repeated 111 times

OK, this is a straight-up design bug.  Although this can also happen with 
normal memory, it's much more likely to happen with highmem because of heavy 
demand for bounce buffers while a process is in PF_MEMALLOC state.  You can 
just turn off highmem and these messages will go away, or become so rare that 
you are unlikely to ever see one.

Now let's chase the real problem.  The gfp=0x30 tells us the requestor is
willing to wait (0x10) and that it is not allowed to do any io (0x40) or call 
->writepage (0x80).  (By process of elimination, it's a bounce buffer.) 
Furthermore, this is a recursive memory request (/1) so __alloc_pages won't 
call page_launder because that could hit another allocation request resulting 
in a fatal infinite recursion (note to self: why couldn't we call 
page_launder here, with NOIO?).
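
[Editor's sketch] The decoding above can be reproduced with a few bit tests.
The bit values (0x10, 0x40, 0x80) are the ones quoted in the message; the
macro and function names are placeholders, not the kernel's own definitions.
Note that 0x30 also has 0x20 set, which the message does not explain, so the
sketch leaves that bit uninterpreted.

```c
#include <stdio.h>

/* Bit values as quoted in the message; the names are placeholders. */
#define GFP_BIT_WAIT      0x10u  /* requestor is willing to wait */
#define GFP_BIT_IO        0x40u  /* requestor may do IO */
#define GFP_BIT_WRITEPAGE 0x80u  /* requestor may call ->writepage */

int gfp_may_wait(unsigned int gfp)      { return (gfp & GFP_BIT_WAIT) != 0; }
int gfp_may_io(unsigned int gfp)        { return (gfp & GFP_BIT_IO) != 0; }
int gfp_may_writepage(unsigned int gfp) { return (gfp & GFP_BIT_WRITEPAGE) != 0; }

/* Print a one-line summary in the style of "gfp=0x30/1"; the number after
 * the slash is treated as the recursive-request flag, as the message says. */
void gfp_describe(unsigned int gfp, int recursive)
{
    printf("gfp=0x%x/%d: wait=%d io=%d writepage=%d recursive=%d\n",
           gfp, recursive, gfp_may_wait(gfp), gfp_may_io(gfp),
           gfp_may_writepage(gfp), recursive != 0);
}
```

Calling gfp_describe(0x30, 1) reports a request that may wait but may
neither do IO nor call ->writepage, which matches the reading given above.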

There are probably dirty pages in flight and __alloc_pages is allowed to wait 
for them, but it doesn't - it tries reclaim_page once (in
__alloc_pages_limit), falls the rest of the way through __alloc_pages and 
gives up with NULL.  This is clearly a bad thing because whoever wanted the 
page needs it to do writeout.  Memory users are supposed to be able to 
tolerate alloc failure, but in a case like this, there isn't much choice 
other than to spin.

So what could we do better here?  Well, obviously when there are writeout 
pages in flight, __alloc_pages should wait and not give up.  Secondly, we 
should be sure that when writeout does complete, the newly freeable page is 
given to a PF_MEMALLOC waiter in preference to a normal user.  We don't have 
mechanisms in place for doing either of those things right now, although some 
preliminary design ideas have been discussed.  This gets way outside the
bounds of what we should be doing in 2.4; we will need such things as
reservations (which Ben has done some work on) and orderly prioritization of 
requests in __alloc_pages, with explicit blocking for low priority requests.  

What can we do right now?  We could always just comment out the alloc failed 
message.  The result will be a lot of busy waiting on dirty page writeout 
which will work but it will keep us from focussing on the question: how did 
we get so short of bounce buffers?  Well, maybe we are submitting too much IO 
without intelligent throttling (/me waves at Ben).  That sounds like the 
place to attack first.

--
Daniel


* Re: Memory Problem in 2.4.9 ?
  2001-08-22 19:32 ` Daniel Phillips
@ 2001-08-22 19:05   ` Marcelo Tosatti
  2001-08-23  1:11     ` Daniel Phillips
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2001-08-22 19:05 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: tommy, Linux Kernel, Ben LaHaise



On Wed, 22 Aug 2001, Daniel Phillips wrote:

> On August 22, 2001 06:47 am, Tommy Wu wrote:
> >    I've tried the patch from the kernel list and got the following result.
> >    These messages appeared while running the command:
> >    dd if=/dev/zero of=test.dmp bs=1000k count=2500
> >    on a PIII 1G SMP box with 1G RAM (HIGHMEM enabled)
> >    kernel 2.4.9 with XFS filesystem patch.
> >
> > Aug 22 11:51:04 standby kernel: __alloc_pages: 0-order allocation failed
> > (gfp=0x30/1).
> > Aug 22 11:51:11 standby last message repeated 111 times
> 
> OK, this is a straight-up design bug.  Although this can also happen with 
> normal memory, it's much more likely to happen with highmem because of heavy 
> demand for bounce buffers while a process is in PF_MEMALLOC state.  You can 
> just turn off highmem and these messages will go away, or become so rare that 
> you are unlikely to ever see one.
> 
> Now let's chase the real problem.  The gfp=0x30 tells us the requestor is
> willing to wait (0x10) and that it is not allowed to do any io (0x40) or call 
> ->writepage (0x80).  (By process of elimination, it's a bounce buffer.) 
> Furthermore, this is a recursive memory request (/1) so __alloc_pages won't 
> call page_launder because that could hit another allocation request resulting 
> in a fatal infinite recursion (note to self: why couldn't we call 
> page_launder here, with NOIO?).
> 
> There are probably dirty pages in flight and __alloc_pages is allowed to wait 
> for them, but it doesn't - it tries reclaim_page once (in
> __alloc_pages_limit), falls the rest of the way through __alloc_pages and 
> gives up with NULL.  This is clearly a bad thing because whoever wanted the 
> page needs it to do writeout.  Memory users are supposed to be able to 
> tolerate alloc failure, but in a case like this, there isn't much choice 
> other than to spin.
> 
> So what could we do better here?  Well, obviously when there are writeout 
> pages in flight, __alloc_pages should wait and not give up.  Secondly, we 
> should be sure that when writeout does complete, the newly freeable page is 
> given to a PF_MEMALLOC waiter in preference to a normal user.  We don't have 
> mechanisms in place for doing either of those things right now, although some 
> preliminary design ideas have been discussed.  This gets way outside the
> bounds of what we should be doing in 2.4; we will need such things as
> reservations (which Ben has done some work on) and orderly prioritization of 
> requests in __alloc_pages, with explicit blocking for low priority requests.  
> 
> What can we do right now?  We could always just comment out the alloc failed 
> message.  The result will be a lot of busy waiting on dirty page writeout 
> which will work but it will keep us from focussing on the question: how did 
> we get so short of bounce buffers?  Well, maybe we are submitting too much IO 
> without intelligent throttling (/me waves at Ben).  That sounds like the 
> place to attack first.

We can just wait on the writeout of lowmem buffers at page_launder()
(which will not cause IO buffering since we are doing lowmem IO, duh), and
then we are done.

Take a look at the patch I posted before (__GFP_NOBOUNCE). 



* Memory Problem in 2.4.9 ?
@ 2001-08-22  4:47 Tommy Wu
  2001-08-22 19:32 ` Daniel Phillips
  0 siblings, 1 reply; 21+ messages in thread
From: Tommy Wu @ 2001-08-22  4:47 UTC (permalink / raw)
  To: Linux Kernel; +Cc: Daniel Phillips

Hi!

   I've tried the patch from the kernel list and got the following result.
   These messages appeared while running the command:
   dd if=/dev/zero of=test.dmp bs=1000k count=2500
   on a PIII 1G SMP box with 1G RAM (HIGHMEM enabled)
   kernel 2.4.9 with XFS filesystem patch.
   
Aug 22 11:51:04 standby kernel: __alloc_pages: 0-order allocation failed (gfp=0x30/1).
Aug 22 11:51:11 standby last message repeated 111 times
Aug 22 11:51:11 standby kernel: cation failed (gfp=0x30/1).
Aug 22 11:51:11 standby kernel: __alloc_pages: 0-order allocation failed (gfp=0x30/1).
Aug 22 11:51:11 standby last message repeated 281 times
Aug 22 11:51:17 standby kernel: cation failed (gfp=0x30/1).
Aug 22 11:51:17 standby kernel: __alloc_pages: 0-order allocation failed (gfp=0x30/1).
Aug 22 11:51:29 standby last message repeated 315 times
Aug 22 11:51:29 standby kernel: cation failed (gfp=0x30/1).
Aug 22 11:51:29 standby kernel: __alloc_pages: 0-order allocation failed (gfp=0x30/1).
Aug 22 11:51:29 standby last message repeated 281 times
Aug 22 11:52:21 standby last message repeated 43 times
Aug 22 11:52:21 standby kernel: cation failed (gfp=0x30/1).
Aug 22 11:52:21 standby kernel: __alloc_pages: 0-order allocation failed (gfp=0x30/1).
Aug 22 11:52:22 standby last message repeated 290 times


-- 

    Tommy Wu
    mailto:tommy@teatime.com.tw
    http://www.teatime.com.tw/~tommy
    ICQ: 22766091
    Mobile Phone: +886 936 909490
    TeaTime BBS +886 2 31515964 24Hrs V.Everything




end of thread, other threads:[~2001-08-23  4:57 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
     [not found] <20010821174918Z16114-32383+718@humbolt.nl.linux.org>
2001-08-21 13:46 ` Memory Problem in 2.4.9 ? Stephan von Krawczynski
2001-08-21 14:33   ` Daniel Phillips
     [not found]     ` <20010821194140.43b46b10.skraw@ithnet.com>
2001-08-21 18:17       ` Stephan von Krawczynski
2001-08-21 19:10         ` Daniel Phillips
2001-08-22  0:04         ` Stephan von Krawczynski
2001-08-22  0:43           ` Daniel Phillips
2001-08-22  0:48             ` Rik van Riel
2001-08-22  1:13               ` Daniel Phillips
2001-08-22 11:01                 ` Stephan von Krawczynski
2001-08-22 17:22                   ` Mike Galbraith
2001-08-22 19:18                   ` Stephan von Krawczynski
2001-08-23  4:57                     ` Mike Galbraith
2001-08-22 10:43               ` Stephan von Krawczynski
2001-08-22 11:52         ` Stephan von Krawczynski
2001-08-22  4:47 Tommy Wu
2001-08-22 19:32 ` Daniel Phillips
2001-08-22 19:05   ` Marcelo Tosatti
2001-08-23  1:11     ` Daniel Phillips
2001-08-23  0:10       ` Marcelo Tosatti
2001-08-23  2:29         ` Daniel Phillips
2001-08-23  1:19           ` Marcelo Tosatti
