linux-kernel.vger.kernel.org archive mirror
* oom-killer 2.6.8.1
@ 2004-08-18 12:55 Anders Saaby
  2004-08-18 13:57 ` Gene Heskett
  2004-08-18 14:05 ` William Lee Irwin III
  0 siblings, 2 replies; 12+ messages in thread
From: Anders Saaby @ 2004-08-18 12:55 UTC (permalink / raw)
  To: linux-kernel

Hi all,

This is a high-volume NFS server running almost no user-space applications. It
serves a handful of web server NFS clients from a ~700G XFS filesystem.

The machine has about 2.5 GB of RAM and 4G of swap (which is almost unused;
it may use 5-10 MB in total, tops).  CONFIG_HIGHMEM and CONFIG_HIGHMEM4G are
enabled, SMP enabled, preempt disabled.

Today the OOM killer kicked in - it seemed that swap was almost unused at the 
time (which is strange, as that should prevent the OOM killer from kicking 
in).

Relevant part of the syslog follows (syslogd was killed too eventually):

Aug 18 14:14:52 st1 kernel: oom-killer: gfp_mask=0xd0
Aug 18 14:14:52 st1 kernel: DMA per-cpu:
Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 2, high 6, batch 1
Aug 18 14:14:52 st1 kernel: cpu 0 cold: low 0, high 2, batch 1
Aug 18 14:14:52 st1 kernel: cpu 1 hot: low 2, high 6, batch 1
Aug 18 14:14:52 st1 kernel: cpu 1 cold: low 0, high 2, batch 1
Aug 18 14:14:52 st1 kernel: Normal per-cpu:
Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 32, high 96, batch 16
Aug 18 14:14:52 st1 kernel: cpu 0 cold: low 0, high 32, batch 16
Aug 18 14:14:52 st1 kernel: cpu 1 hot: low 32, high 96, batch 16
Aug 18 14:14:52 st1 kernel: cpu 1 cold: low 0, high 32, batch 16
Aug 18 14:14:52 st1 kernel: HighMem per-cpu:
Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 32, high 96, batch 16
Aug 18 14:14:53 st1 kernel: cpu 0 cold: low 0, high 32, batch 16
Aug 18 14:14:54 st1 kernel: cpu 1 hot: low 32, high 96, batch 16
Aug 18 14:14:54 st1 kernel: cpu 1 cold: low 0, high 32, batch 16
Aug 18 14:14:54 st1 kernel:
Aug 18 14:14:54 st1 kernel: Free pages:      352376kB (348672kB HighMem)
Aug 18 14:14:54 st1 kernel: Active:18220 inactive:319911 dirty:12154
writeback:0 unstable:0 free:88094 slab:219644 mapped:3231 pagetables:121
Aug 18 14:14:54 st1 kernel: DMA free:1880kB min:16kB low:32kB high:48kB 
active:0kB inactive:0kB present:16384kB
Aug 18 14:14:54 st1 kernel: protections[]: 8 476 732
Aug 18 14:14:54 st1 kernel: Normal free:1824kB min:936kB low:1872kB 
high:2808kB active:84kB inactive:284kB present:901120kB
Aug 18 14:14:54 st1 kernel: protections[]: 0 468 724
Aug 18 14:14:54 st1 kernel: HighMem free:348672kB min:512kB low:1024kB 
high:1536kB active:72796kB inactive:1279360kB present:1703788kB
Aug 18 14:14:54 st1 kernel: protections[]: 0 0 256
Aug 18 14:14:54 st1 kernel: DMA: 6*4kB 2*8kB 1*16kB 1*32kB 0*64kB 0*128kB 
1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1880kB
Aug 18 14:14:54 st1 kernel: Normal: 88*4kB 42*8kB 3*16kB 6*32kB 2*64kB 2*128kB 
0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1824kB
Aug 18 14:14:55 st1 kernel: HighMem: 62500*4kB 11396*8kB 409*16kB 0*32kB 
1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 348672kB
Aug 18 14:14:55 st1 kernel: Swap cache: add 6433, delete 6394, find 
35305/35334, race 0+0
Aug 18 14:14:55 st1 kernel: Out of Memory: Killed process 1049 (rpc.statd).

Any ideas?

Try to disable swap?  Tune magic VM settings?  Any suggestions are highly 
welcome.

Thank you very much,

-- 
Med venlig hilsen - Best regards - Meilleures salutations

Anders Saaby
Systems Engineer
------------------------------------------------
Cohaesio A/S - Maglebjergvej 5D - DK-2800 Lyngby
Phone: +45 45 880 888 - Fax: +45 45 880 777
Mail: as@cohaesio.com - http://www.cohaesio.com
------------------------------------------------

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 12:55 oom-killer 2.6.8.1 Anders Saaby
@ 2004-08-18 13:57 ` Gene Heskett
  2004-08-18 14:11   ` Jakob Oestergaard
  2004-08-18 14:14   ` Anders Saaby
  2004-08-18 14:05 ` William Lee Irwin III
  1 sibling, 2 replies; 12+ messages in thread
From: Gene Heskett @ 2004-08-18 13:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Anders Saaby

On Wednesday 18 August 2004 08:55, Anders Saaby wrote:
>Hi all,
>
>This is a high-volume NFS server running almost no user-space
> applications. It serves a handful of web server NFS clients from a
> ~700G XFS filesystem.
>
>The machine has about 2.5 GB of RAM and 4G of swap (which is almost
> unused; it may use 5-10 MB in total, tops).  CONFIG_HIGHMEM and
> CONFIG_HIGHMEM4G are enabled, SMP enabled, preempt disabled.
>
>Today the OOM killer kicked in - it seemed that swap was almost
> unused at the time (which is strange, as that should prevent the
> OOM killer from kicking in).
>
>Relevant part of the syslog follows (syslogd was killed too
> eventually):
>
>Aug 18 14:14:52 st1 kernel: oom-killer: gfp_mask=0xd0
>Aug 18 14:14:52 st1 kernel: DMA per-cpu:
>Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 2, high 6, batch 1
>Aug 18 14:14:52 st1 kernel: cpu 0 cold: low 0, high 2, batch 1
>Aug 18 14:14:52 st1 kernel: cpu 1 hot: low 2, high 6, batch 1
>Aug 18 14:14:52 st1 kernel: cpu 1 cold: low 0, high 2, batch 1
>Aug 18 14:14:52 st1 kernel: Normal per-cpu:
>Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 32, high 96, batch 16
>Aug 18 14:14:52 st1 kernel: cpu 0 cold: low 0, high 32, batch 16
>Aug 18 14:14:52 st1 kernel: cpu 1 hot: low 32, high 96, batch 16
>Aug 18 14:14:52 st1 kernel: cpu 1 cold: low 0, high 32, batch 16
>Aug 18 14:14:52 st1 kernel: HighMem per-cpu:
>Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 32, high 96, batch 16
>Aug 18 14:14:53 st1 kernel: cpu 0 cold: low 0, high 32, batch 16
>Aug 18 14:14:54 st1 kernel: cpu 1 hot: low 32, high 96, batch 16
>Aug 18 14:14:54 st1 kernel: cpu 1 cold: low 0, high 32, batch 16
>Aug 18 14:14:54 st1 kernel:
>Aug 18 14:14:54 st1 kernel: Free pages:      352376kB (348672kB HighMem)
>Aug 18 14:14:54 st1 kernel: Active:18220 inactive:319911 dirty:12154
> writeback:0 unstable:0 free:88094 slab:219644 mapped:3231 pagetables:121
>Aug 18 14:14:54 st1 kernel: DMA free:1880kB min:16kB low:32kB high:48kB
> active:0kB inactive:0kB present:16384kB
>Aug 18 14:14:54 st1 kernel: protections[]: 8 476 732
>Aug 18 14:14:54 st1 kernel: Normal free:1824kB min:936kB low:1872kB
> high:2808kB active:84kB inactive:284kB present:901120kB
>Aug 18 14:14:54 st1 kernel: protections[]: 0 468 724
>Aug 18 14:14:54 st1 kernel: HighMem free:348672kB min:512kB low:1024kB
> high:1536kB active:72796kB inactive:1279360kB present:1703788kB
>Aug 18 14:14:54 st1 kernel: protections[]: 0 0 256
>Aug 18 14:14:54 st1 kernel: DMA: 6*4kB 2*8kB 1*16kB 1*32kB 0*64kB 0*128kB
> 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1880kB
>Aug 18 14:14:54 st1 kernel: Normal: 88*4kB 42*8kB 3*16kB 6*32kB 2*64kB
> 2*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1824kB
>Aug 18 14:14:55 st1 kernel: HighMem: 62500*4kB 11396*8kB 409*16kB 0*32kB
> 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 348672kB
>Aug 18 14:14:55 st1 kernel: Swap cache: add 6433, delete 6394, find
> 35305/35334, race 0+0
>Aug 18 14:14:55 st1 kernel: Out of Memory: Killed process 1049
> (rpc.statd).
>
>Any ideas?
>
>Try to disable swap?  Tune magic VM settings?  Any suggestions are
> highly welcome.
>
>Thank you very much,

You may be another candidate for the same thing that's troubling me.
Look further back in the log, above that excerpt, for an Oops, and post it please.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.24% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 12:55 oom-killer 2.6.8.1 Anders Saaby
  2004-08-18 13:57 ` Gene Heskett
@ 2004-08-18 14:05 ` William Lee Irwin III
  2004-08-18 14:24   ` Anders Saaby
  1 sibling, 1 reply; 12+ messages in thread
From: William Lee Irwin III @ 2004-08-18 14:05 UTC (permalink / raw)
  To: Anders Saaby; +Cc: linux-kernel

On Wed, Aug 18, 2004 at 02:55:42PM +0200, Anders Saaby wrote:
> This is a high-volume NFS server running almost no user-space
> applications. It serves a handful of web server NFS clients from a
> ~700G XFS filesystem. The machine has about 2.5 GB of RAM and 4G of
> swap (which is almost unused; it may use 5-10 MB in total, tops).
> CONFIG_HIGHMEM and CONFIG_HIGHMEM4G are enabled, SMP enabled, preempt disabled.
> Today the OOM killer kicked in - it seemed that swap was almost unused at the 
> time (which is strange, as that should prevent the OOM killer from kicking 
> in).
> Relevant part of the syslog follows (syslogd was killed too eventually):

This seems to have been meant to resolve laptop_mode issues, but looks
like it didn't get applied. I'm not convinced it will help given that
you appear to have a vanilla ZONE_NORMAL slab OOM (858MB slab), but you
never know. Capturing /proc/slabinfo data may be more helpful.
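
The 858MB figure can be checked against the OOM report. A minimal sketch, assuming 4 kB pages (the usual i386 page size) and treating the DMA and Normal zones together as lowmem; the helper name is made up for illustration:

```python
PAGE_KB = 4  # assumption: 4 kB pages, as on i386


def pages_to_mb(pages, page_kb=PAGE_KB):
    """Convert a page count from the OOM report into megabytes."""
    return pages * page_kb / 1024


slab_mb = pages_to_mb(219644)            # slab:219644 from the report
lowmem_mb = (16384 + 901120) / 1024      # DMA present + Normal present, in kB
print(f"slab: {slab_mb:.0f} MB of {lowmem_mb:.0f} MB lowmem")
```

With the numbers from the report this comes to roughly 858 MB of slab in 896 MB of lowmem, which is why the large amount of free HighMem does not help a lowmem allocation.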


Index: oom-2.6.8-rc1/mm/vmscan.c
===================================================================
--- oom-2.6.8-rc1.orig/mm/vmscan.c	2004-07-14 06:17:13.876343912 -0700
+++ oom-2.6.8-rc1/mm/vmscan.c	2004-07-14 06:22:15.986416200 -0700
@@ -417,7 +417,8 @@
 				goto keep_locked;
 			if (!may_enter_fs)
 				goto keep_locked;
-			if (laptop_mode && !sc->may_writepage)
+			if (laptop_mode && !sc->may_writepage &&
+							!PageSwapCache(page))
 				goto keep_locked;
 
 			/* Page is dirty, try to write it out here */

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 13:57 ` Gene Heskett
@ 2004-08-18 14:11   ` Jakob Oestergaard
  2004-08-18 14:46     ` Hugh Dickins
  2004-08-18 14:14   ` Anders Saaby
  1 sibling, 1 reply; 12+ messages in thread
From: Jakob Oestergaard @ 2004-08-18 14:11 UTC (permalink / raw)
  To: Gene Heskett; +Cc: linux-kernel, Anders Saaby, sct

On Wed, Aug 18, 2004 at 09:57:03AM -0400, Gene Heskett wrote:
...
> >Any ideas?
> >
> >Try to disable swap?  Tune magic VM settings?  Any suggestions are
> > highly welcome.
> >
> >Thank you very much,
> 
> You may be another candidate for the same thing that's troubling me.
> Look further back in the log, above that excerpt, for an Oops, and post it please.

Looking through swapfile.c and the OOM killer code, one thing is
making me scratch my head:

nr_swap_pages is a *signed* integer.  This does not make sense. There
are even tests in swapfile.c that explicitly test "nr_swap_pages <= 0"
instead of simply "!nr_swap_pages" - this does not make sense at all
either - or does it?

Stephen is that your code?

See, if nr_swap_pages can validly be negative and some meaning is
attached to that (some meaning other than "we're out of swap"), the
oom_killer surely misses that, as it tests "nr_swap_pages > 0".

I don't think that nr_swap_pages can be negative (unless one adds a
*lot* of swap, in which case this will unintentionally happen all by
itself), but I felt I should chime in with this comment in case
someone's looking at it anyway  :)

-- 

 / jakob


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 13:57 ` Gene Heskett
  2004-08-18 14:11   ` Jakob Oestergaard
@ 2004-08-18 14:14   ` Anders Saaby
  1 sibling, 0 replies; 12+ messages in thread
From: Anders Saaby @ 2004-08-18 14:14 UTC (permalink / raw)
  To: gene.heskett; +Cc: linux-kernel

On Wednesday 18 August 2004 15:57, Gene Heskett wrote:
> On Wednesday 18 August 2004 08:55, Anders Saaby wrote:
> >Hi all,
> >
> >This is a high-volume NFS server running almost no user-space
> > applications. It serves a handful of web server NFS clients from a
> > ~700G XFS filesystem.
> >
> >The machine has about 2.5 GB of RAM and 4G of swap (which is almost
> > unused; it may use 5-10 MB in total, tops).  CONFIG_HIGHMEM and
> > CONFIG_HIGHMEM4G are enabled, SMP enabled, preempt disabled.
> >
> >Today the OOM killer kicked in - it seemed that swap was almost
> > unused at the time (which is strange, as that should prevent the
> > OOM killer from kicking in).
> >
> >Relevant part of the syslog follows (syslogd was killed too
> > eventually):
> >
> >Aug 18 14:14:52 st1 kernel: oom-killer: gfp_mask=0xd0
> >Aug 18 14:14:52 st1 kernel: DMA per-cpu:
> >Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 2, high 6, batch 1
> >Aug 18 14:14:52 st1 kernel: cpu 0 cold: low 0, high 2, batch 1
> >Aug 18 14:14:52 st1 kernel: cpu 1 hot: low 2, high 6, batch 1
> >Aug 18 14:14:52 st1 kernel: cpu 1 cold: low 0, high 2, batch 1
> >Aug 18 14:14:52 st1 kernel: Normal per-cpu:
> >Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 32, high 96, batch 16
> >Aug 18 14:14:52 st1 kernel: cpu 0 cold: low 0, high 32, batch 16
> >Aug 18 14:14:52 st1 kernel: cpu 1 hot: low 32, high 96, batch 16
> >Aug 18 14:14:52 st1 kernel: cpu 1 cold: low 0, high 32, batch 16
> >Aug 18 14:14:52 st1 kernel: HighMem per-cpu:
> >Aug 18 14:14:52 st1 kernel: cpu 0 hot: low 32, high 96, batch 16
> >Aug 18 14:14:53 st1 kernel: cpu 0 cold: low 0, high 32, batch 16
> >Aug 18 14:14:54 st1 kernel: cpu 1 hot: low 32, high 96, batch 16
> >Aug 18 14:14:54 st1 kernel: cpu 1 cold: low 0, high 32, batch 16
> >Aug 18 14:14:54 st1 kernel:
> >Aug 18 14:14:54 st1 kernel: Free pages:      352376kB (348672kB HighMem)
> >Aug 18 14:14:54 st1 kernel: Active:18220 inactive:319911 dirty:12154
> > writeback:0 unstable:0 free:88094 slab:219644 mapped:3231 pagetables:121
> >Aug 18 14:14:54 st1 kernel: DMA free:1880kB min:16kB low:32kB high:48kB
> > active:0kB inactive:0kB present:16384kB
> >Aug 18 14:14:54 st1 kernel: protections[]: 8 476 732
> >Aug 18 14:14:54 st1 kernel: Normal free:1824kB min:936kB low:1872kB
> > high:2808kB active:84kB inactive:284kB present:901120kB
> >Aug 18 14:14:54 st1 kernel: protections[]: 0 468 724
> >Aug 18 14:14:54 st1 kernel: HighMem free:348672kB min:512kB low:1024kB
> > high:1536kB active:72796kB inactive:1279360kB present:1703788kB
> >Aug 18 14:14:54 st1 kernel: protections[]: 0 0 256
> >Aug 18 14:14:54 st1 kernel: DMA: 6*4kB 2*8kB 1*16kB 1*32kB 0*64kB 0*128kB
> > 1*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 1880kB
> >Aug 18 14:14:54 st1 kernel: Normal: 88*4kB 42*8kB 3*16kB 6*32kB 2*64kB
> > 2*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1824kB
> >Aug 18 14:14:55 st1 kernel: HighMem: 62500*4kB 11396*8kB 409*16kB 0*32kB
> > 1*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 348672kB
> >Aug 18 14:14:55 st1 kernel: Swap cache: add 6433, delete 6394, find
> > 35305/35334, race 0+0
> >Aug 18 14:14:55 st1 kernel: Out of Memory: Killed process 1049
> > (rpc.statd).
> >
> >Any ideas?
> >
> >Try to disable swap?  Tune magic VM settings?  Any suggestions are
> > highly welcome.
> >
> >Thank you very much,
>
> You may be another candidate for the same thing that's troubling me.
> Look further back in the log, above that excerpt, for an Oops, and post it please.

No Oops :-)

/Saaby

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 14:05 ` William Lee Irwin III
@ 2004-08-18 14:24   ` Anders Saaby
  2004-08-18 21:11     ` William Lee Irwin III
  0 siblings, 1 reply; 12+ messages in thread
From: Anders Saaby @ 2004-08-18 14:24 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: linux-kernel

On Wednesday 18 August 2004 16:05, William Lee Irwin III wrote:
> On Wed, Aug 18, 2004 at 02:55:42PM +0200, Anders Saaby wrote:
> > This is a high-volume NFS server running almost no user-space
> > applications. It serves a handful of web server NFS clients from a
> > ~700G XFS filesystem. The machine has about 2.5 GB of RAM and 4G of
> > swap (which is almost unused; it may use 5-10 MB in total, tops).
> > CONFIG_HIGHMEM and CONFIG_HIGHMEM4G are enabled, SMP enabled, preempt
> > disabled. Today the OOM killer kicked in - it seemed that swap was almost
> > unused at the time (which is strange, as that should prevent the OOM
> > killer from kicking in).
> > Relevant part of the syslog follows (syslogd was killed too eventually):
>
> This seems to have been meant to resolve laptop_mode issues, but looks
> like it didn't get applied. I'm not convinced it will help given that
> you appear to have a vanilla ZONE_NORMAL slab OOM (858MB slab), but you
> never know. Capturing /proc/slabinfo data may be more helpful.
>
>
> Index: oom-2.6.8-rc1/mm/vmscan.c
> ===================================================================
> --- oom-2.6.8-rc1.orig/mm/vmscan.c	2004-07-14 06:17:13.876343912 -0700
> +++ oom-2.6.8-rc1/mm/vmscan.c	2004-07-14 06:22:15.986416200 -0700
> @@ -417,7 +417,8 @@
>  				goto keep_locked;
>  			if (!may_enter_fs)
>  				goto keep_locked;
> -			if (laptop_mode && !sc->may_writepage)
> +			if (laptop_mode && !sc->may_writepage &&
> +							!PageSwapCache(page))
>  				goto keep_locked;
>
>  			/* Page is dirty, try to write it out here */

laptop_mode is not set on this server <- :-)

- So I guess this is not relevant for my setup?

/Saaby

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 14:11   ` Jakob Oestergaard
@ 2004-08-18 14:46     ` Hugh Dickins
  0 siblings, 0 replies; 12+ messages in thread
From: Hugh Dickins @ 2004-08-18 14:46 UTC (permalink / raw)
  To: Jakob Oestergaard; +Cc: Gene Heskett, linux-kernel, Anders Saaby, sct

On Wed, 18 Aug 2004, Jakob Oestergaard wrote:
> 
> Looking through swapfile.c and the OOM killer code, one thing is
> making me scratch my head:
> 
> nr_swap_pages is a *signed* integer.  This does not make sense. There
> are even tests in swapfile.c that explicitly test "nr_swap_pages <= 0"
> instead of simply "!nr_swap_pages" - this does not make sense at all
> either - or does it?
> 
> Stephen is that your code?

I'm not Stephen, and it wasn't originally my code, but I do remember
tidying this up to stop /proc/meminfo showing negative SwapFree
(see nr_to_be_unused).

nr_swap_pages _may_ legitimately be negative, during sys_swapoff:
that does "nr_swap_pages -= p->pages", which is liable to send it
negative, before going on to "try_to_unuse", which slowly increments
nr_swap_pages back up to its final value (0 if no other swap areas),
page by page via delete_from_swap_cache's swap_free.

Surprising, I agree, but it allows swap_free to increment
nr_swap_pages without any special casing for swapoff.
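
The sequence described above can be modeled with a toy counter. This is only an illustrative sketch, not kernel code: the function name and the page counts are invented, and a single swap area is assumed.

```python
def swapoff_trace(area_pages, used_pages):
    """Return the values a free-swap counter takes during a modeled swapoff."""
    nr_swap_pages = area_pages - used_pages   # free slots before swapoff
    trace = [nr_swap_pages]
    nr_swap_pages -= area_pages               # models "nr_swap_pages -= p->pages"
    trace.append(nr_swap_pages)
    for _ in range(used_pages):               # models one swap_free per page brought back in
        nr_swap_pages += 1
        trace.append(nr_swap_pages)
    return trace


trace = swapoff_trace(area_pages=1000, used_pages=300)
print(trace[0], trace[1], trace[-1])
```

In this model the counter starts at 700, drops to -300 when the area is subtracted, and climbs back to 0 as the used pages are released, so a test such as "nr_swap_pages > 0" stays false for the whole swapoff.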

Hugh

> See, if nr_swap_pages can validly be negative and some meaning is
> attached to that (some meaning other than "we're out of swap"), the
> oom_killer surely misses that, as it tests "nr_swap_pages > 0".
> 
> I don't think that nr_swap_pages can be negative (unless one adds a
> *lot* of swap, in which case this will unintentionally happen all by
> itself), but I felt I should chime in with this comment in case
> someone's looking at it anyway  :)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 14:24   ` Anders Saaby
@ 2004-08-18 21:11     ` William Lee Irwin III
  2004-08-19 12:58       ` Anders Saaby
  2004-08-24  9:30       ` Anders Saaby
  0 siblings, 2 replies; 12+ messages in thread
From: William Lee Irwin III @ 2004-08-18 21:11 UTC (permalink / raw)
  To: Anders Saaby; +Cc: linux-kernel

On Wednesday 18 August 2004 16:05, William Lee Irwin III wrote:
>> Index: oom-2.6.8-rc1/mm/vmscan.c
>> ===================================================================
>> --- oom-2.6.8-rc1.orig/mm/vmscan.c	2004-07-14 06:17:13.876343912 -0700
>> +++ oom-2.6.8-rc1/mm/vmscan.c	2004-07-14 06:22:15.986416200 -0700
>> @@ -417,7 +417,8 @@
>>  				goto keep_locked;
>>  			if (!may_enter_fs)
>>  				goto keep_locked;
>> -			if (laptop_mode && !sc->may_writepage)
>> +			if (laptop_mode && !sc->may_writepage &&
>> +							!PageSwapCache(page))
>>  				goto keep_locked;
>>
>>  			/* Page is dirty, try to write it out here */

On Wed, Aug 18, 2004 at 04:24:24PM +0200, Anders Saaby wrote:
> laptop_mode is not set on this server <- :-)
> - So I guess this is not relevant for my setup?

Probably not. Please try to collect /proc/slabinfo snapshots while the
system is still functional as it degrades.


-- wli

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 21:11     ` William Lee Irwin III
@ 2004-08-19 12:58       ` Anders Saaby
  2004-08-24  9:30       ` Anders Saaby
  1 sibling, 0 replies; 12+ messages in thread
From: Anders Saaby @ 2004-08-19 12:58 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: linux-kernel

On Wednesday 18 August 2004 23:11, William Lee Irwin III wrote:
> On Wednesday 18 August 2004 16:05, William Lee Irwin III wrote:
> >> Index: oom-2.6.8-rc1/mm/vmscan.c
> >> ===================================================================
> >> --- oom-2.6.8-rc1.orig/mm/vmscan.c	2004-07-14 06:17:13.876343912 -0700
> >> +++ oom-2.6.8-rc1/mm/vmscan.c	2004-07-14 06:22:15.986416200 -0700
> >> @@ -417,7 +417,8 @@
> >>  				goto keep_locked;
> >>  			if (!may_enter_fs)
> >>  				goto keep_locked;
> >> -			if (laptop_mode && !sc->may_writepage)
> >> +			if (laptop_mode && !sc->may_writepage &&
> >> +							!PageSwapCache(page))
> >>  				goto keep_locked;
> >>
> >>  			/* Page is dirty, try to write it out here */
>
> On Wed, Aug 18, 2004 at 04:24:24PM +0200, Anders Saaby wrote:
> > laptop_mode is not set on this server <- :-)
> > - So I guess this is not relevant for my setup?
>
> Probably not. Please try to collect /proc/slabinfo snapshots while the
> system is still functional as it degrades.
>
OK, I am now collecting /proc/slabinfo every hour to some logfiles - I will 
send you the results when I have some interesting data.
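
An hourly collection of this kind could look roughly like the sketch below. The function name, the destination directory, and the file naming scheme are assumptions for illustration, not the script actually used on this server.

```python
import time
from pathlib import Path


def snapshot_slabinfo(dest_dir, source="/proc/slabinfo"):
    """Save the current slabinfo contents to a timestamped file and return its path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = dest / f"slabinfo-{stamp}.txt"
    # Read the whole file and write it out; /proc files report a size of 0,
    # so reading to end-of-file is the portable way to copy them.
    target.write_text(Path(source).read_text())
    return target
```

Calling `snapshot_slabinfo("/var/log/slabinfo")` from an hourly cron job (the directory is hypothetical) would leave one file per hour; reading /proc/slabinfo may require root.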

/Saaby

>
> -- wli

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-18 21:11     ` William Lee Irwin III
  2004-08-19 12:58       ` Anders Saaby
@ 2004-08-24  9:30       ` Anders Saaby
  2004-08-24 15:41         ` William Lee Irwin III
  1 sibling, 1 reply; 12+ messages in thread
From: Anders Saaby @ 2004-08-24  9:30 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: linux-kernel, joe, Gene Heskett, Hugh Dickins

OK - I now have some additional info regarding the slabinfo/oom-killer issue.

As I wrote earlier, this server is a storage server providing NFS storage to a
number of web servers - the on-disk filesystem is XFS, kernel = 2.6.8.1. At 03:00
some logrotate scripts run through a lot of files. It appears that this is
what is using the slabs (see this graph; K = M, so the maximum slab usage is
approx. 700M)

http://saaby.com/slabused.gif (values are from /proc/meminfo)

These are the values (from slabinfo, active_objs), which changed remarkably 
from 03:00 to 06:00:

                  03:00     06:00
xfs_chashlist     91297    151994
xfs_inode        243791    586780
linvfs_icache    243791    586807
dentry_cache     196033    430609
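
For comparing two snapshots, a small script can do the subtraction. This is a sketch under assumptions: it takes the cache name from the first column and active_objs from the second, as in the figures above, and the sample text is trimmed to those two columns. The function names are illustrative.

```python
def parse_active_objs(text):
    """Map cache name -> active_objs (second column) from slabinfo-style text."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the "slabinfo - version" / "# name ..." headers.
        if len(fields) < 2 or line.startswith(("slabinfo", "#")):
            continue
        counts[fields[0]] = int(fields[1])
    return counts


def growth(before, after):
    """Return {cache: (absolute increase, growth factor)} for caches in both snapshots."""
    return {name: (after[name] - before[name], after[name] / before[name])
            for name in before if name in after}


at_0300 = parse_active_objs("""\
xfs_chashlist 91297
xfs_inode 243791
linvfs_icache 243791
dentry_cache 196033
""")
at_0600 = parse_active_objs("""\
xfs_chashlist 151994
xfs_inode 586780
linvfs_icache 586807
dentry_cache 430609
""")

for name, (delta, factor) in growth(at_0300, at_0600).items():
    print(f"{name:14s} +{delta:7d}  x{factor:.2f}")
```

On these numbers the inode and dentry caches more than double over the three hours, while xfs_chashlist grows by about two thirds.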

The server crashed every night at approx. 03:00 to 04:00 - until last night,
when we changed:

vm.min_free_kbytes from default (approx. 900K) to vm.min_free_kbytes=32768 
(32M)
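
For reference, the same setting can be read or changed from a script through procfs. A minimal sketch: /proc/sys/vm/min_free_kbytes is the file behind the vm.min_free_kbytes sysctl, while the function names are made up here. Writing needs root and does not survive a reboot.

```python
from pathlib import Path

SYSCTL_PATH = "/proc/sys/vm/min_free_kbytes"


def read_min_free_kbytes(path=SYSCTL_PATH):
    """Return the current min_free_kbytes value as an integer number of kB."""
    return int(Path(path).read_text().split()[0])


def set_min_free_kbytes(kbytes, path=SYSCTL_PATH):
    """Write a new value; on a real system this requires root privileges."""
    if kbytes <= 0:
        raise ValueError("min_free_kbytes must be positive")
    Path(path).write_text(f"{kbytes}\n")
```

`set_min_free_kbytes(32768)` would correspond to the 32M value mentioned above.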

This seems to have solved the problem. Does this make any sense to you, or is
it just pure luck?

/Saaby

On Wednesday 18 August 2004 23:11, William Lee Irwin III wrote:
> On Wednesday 18 August 2004 16:05, William Lee Irwin III wrote:
> >> Index: oom-2.6.8-rc1/mm/vmscan.c
> >> ===================================================================
> >> --- oom-2.6.8-rc1.orig/mm/vmscan.c	2004-07-14 06:17:13.876343912 -0700
> >> +++ oom-2.6.8-rc1/mm/vmscan.c	2004-07-14 06:22:15.986416200 -0700
> >> @@ -417,7 +417,8 @@
> >>  				goto keep_locked;
> >>  			if (!may_enter_fs)
> >>  				goto keep_locked;
> >> -			if (laptop_mode && !sc->may_writepage)
> >> +			if (laptop_mode && !sc->may_writepage &&
> >> +							!PageSwapCache(page))
> >>  				goto keep_locked;
> >>
> >>  			/* Page is dirty, try to write it out here */
>
> On Wed, Aug 18, 2004 at 04:24:24PM +0200, Anders Saaby wrote:
> > laptop_mode is not set on this server <- :-)
> > - So I guess this is not relevant for my setup?
>
> Probably not. Please try to collect /proc/slabinfo snapshots while the
> system is still functional as it degrades.
>
>
> -- wli

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-24  9:30       ` Anders Saaby
@ 2004-08-24 15:41         ` William Lee Irwin III
  2004-08-24 15:55           ` Anders Saaby
  0 siblings, 1 reply; 12+ messages in thread
From: William Lee Irwin III @ 2004-08-24 15:41 UTC (permalink / raw)
  To: Anders Saaby; +Cc: linux-kernel, joe, Gene Heskett, Hugh Dickins

On Tue, Aug 24, 2004 at 11:30:15AM +0200, Anders Saaby wrote:
> OK - I now have some additional info regarding the slabinfo/oom-killer
> issue.
> As I wrote earlier, this server is a storage server providing NFS storage to a
> number of web servers - the on-disk filesystem is XFS, kernel = 2.6.8.1. At 03:00
> some logrotate scripts run through a lot of files. It appears that this is
> what is using the slabs (see this graph; K = M, so the maximum slab usage is
> approx. 700M)
> http://saaby.com/slabused.gif (values are from /proc/meminfo)
> These are the values (from slabinfo, active_objs), which changed remarkably
> from 03:00 to 06:00:
>                   03:00     06:00
> xfs_chashlist     91297    151994
> xfs_inode        243791    586780
> linvfs_icache    243791    586807
> dentry_cache     196033    430609

xfs has some known bad slab behavior. I think punting this in the
direction of xfs mailing lists may be useful.


On Tue, Aug 24, 2004 at 11:30:15AM +0200, Anders Saaby wrote:
> The server crashed every night at approx. 03:00 to 04:00 - until last night,
> when we changed:
> vm.min_free_kbytes from default (approx. 900K) to vm.min_free_kbytes=32768
> (32M)
> This seems to have solved the problem. Does this make any sense to you, or is
> it just pure luck?

I guess it makes some sense since it refuses to let slab cut into the
very last bits of RAM. If you're getting temporarily heavily fragmented
with active references it may mean the difference between the box
livelocking/deadlocking and making forward progress.
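
The fragmentation remark can be related to the per-order free lists in the original OOM report. The sketch below only shows how to read such a line; the function name is invented, and it makes no claim about what actually triggered the OOM.

```python
import re


def parse_buddy_line(line):
    """Return [(count, block_kb), ...] from a 'N*SkB N*SkB ... = TkB' zone line."""
    return [(int(n), int(kb)) for n, kb in re.findall(r"(\d+)\*(\d+)kB", line)]


normal = ("Normal: 88*4kB 42*8kB 3*16kB 6*32kB 2*64kB 2*128kB "
          "0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 1824kB")
blocks = parse_buddy_line(normal)
total_kb = sum(n * kb for n, kb in blocks)
small_kb = sum(n * kb for n, kb in blocks if kb <= 8)   # 4 kB and 8 kB blocks only
print(total_kb, small_kb)
```

For the Normal zone line this gives 1824 kB free in total, of which 688 kB sits in 4 kB and 8 kB blocks, so little contiguous lowmem was left at the time of the report.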


-- wli

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: oom-killer 2.6.8.1
  2004-08-24 15:41         ` William Lee Irwin III
@ 2004-08-24 15:55           ` Anders Saaby
  0 siblings, 0 replies; 12+ messages in thread
From: Anders Saaby @ 2004-08-24 15:55 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: linux-kernel, joe, Gene Heskett, Hugh Dickins

On Tuesday 24 August 2004 17:41, William Lee Irwin III wrote:
> On Tue, Aug 24, 2004 at 11:30:15AM +0200, Anders Saaby wrote:
> > OK - I now have some additional info regarding the slabinfo/oom-killer
> > issue.
> > As I wrote earlier, this server is a storage server providing NFS storage
> > to a number of web servers - the on-disk filesystem is XFS, kernel = 2.6.8.1.
> > At 03:00 some logrotate scripts run through a lot of files. It appears
> > that this is what is using the slabs (see this graph; K = M, so the maximum
> > slab usage is approx. 700M)
> > http://saaby.com/slabused.gif (values are from /proc/meminfo)
> > These are the values (from slabinfo, active_objs), which changed
> > remarkably from 03:00 to 06:00:
> >                   03:00     06:00
> > xfs_chashlist     91297    151994
> > xfs_inode        243791    586780
> > linvfs_icache    243791    586807
> > dentry_cache     196033    430609
>
> xfs has some known bad slab behavior. I think punting this in the
> direction of xfs mailing lists may be useful.
>
OK, interesting! - I will do that.

> On Tue, Aug 24, 2004 at 11:30:15AM +0200, Anders Saaby wrote:
> > The server crashed every night at approx. 03:00 to 04:00 - until last
> > night, when we changed:
> > vm.min_free_kbytes from default (approx. 900K) to
> > vm.min_free_kbytes=32768 (32M)
> > This seems to have solved the problem. Does this make any sense to you, or
> > is it just pure luck?
>
> I guess it makes some sense since it refuses to let slab cut into the
> very last bits of RAM. If you're getting temporarily heavily fragmented
> with active references it may mean the difference between the box
> livelocking/deadlocking and making forward progress.
>
Sounds right.

Thanks for your help! :-)

/Saaby

>
> -- wli

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2004-08-24 15:55 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2004-08-18 12:55 oom-killer 2.6.8.1 Anders Saaby
2004-08-18 13:57 ` Gene Heskett
2004-08-18 14:11   ` Jakob Oestergaard
2004-08-18 14:46     ` Hugh Dickins
2004-08-18 14:14   ` Anders Saaby
2004-08-18 14:05 ` William Lee Irwin III
2004-08-18 14:24   ` Anders Saaby
2004-08-18 21:11     ` William Lee Irwin III
2004-08-19 12:58       ` Anders Saaby
2004-08-24  9:30       ` Anders Saaby
2004-08-24 15:41         ` William Lee Irwin III
2004-08-24 15:55           ` Anders Saaby
