linux-kernel.vger.kernel.org archive mirror
* [BUG]  oom killer not triggering in 2.6.9-rc3
@ 2004-10-09  0:14 Chris Friesen
  2004-10-09  0:26 ` Chris Wright
  2004-10-11 13:58 ` Alan Cox
  0 siblings, 2 replies; 14+ messages in thread
From: Chris Friesen @ 2004-10-09  0:14 UTC (permalink / raw)
  To: Linux kernel


I have an Xserve running 2.6.9-rc3 and patched to run the ppc kernel rather than 
the ppc64 kernel.  It's configured with 2GB of memory, no swap.

If I run one instance of the following program, it allocates all but about 3MB 
of memory, and the memory hog spins with 100% of the cpu.

If I run two instances of the program, the machine locks up, doesn't respond to 
pings, and is basically dead to the world.

Shouldn't the oom-killer be kicking in?

Chris

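#include <stdlib.h>
#include <unistd.h>

/* Number of pages requested per malloc() call. */
#define PAGES 1000

int main(void)
{
	/* sysconf() returns long, so keep the page size in a long. */
	long pgsz = sysconf(_SC_PAGESIZE);
	size_t blocksize = (size_t)pgsz * PAGES;

	/* Allocate forever and never free: the leak is the point of the test. */
	while (1) {
		char *p = malloc(blocksize);
		if (p) {
			/* Touch one byte per page so every page is really backed. */
			for (int i = 0; i < PAGES; i++)
				p[(size_t)i * pgsz] = 1;
		}
	}
	return 0;
}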


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-09  0:14 [BUG] oom killer not triggering in 2.6.9-rc3 Chris Friesen
@ 2004-10-09  0:26 ` Chris Wright
  2004-10-11 13:58 ` Alan Cox
  1 sibling, 0 replies; 14+ messages in thread
From: Chris Wright @ 2004-10-09  0:26 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Linux kernel, akpm

* Chris Friesen (cfriesen@nortelnetworks.com) wrote:
> 
> I have an Xserve running 2.6.9-rc3 and patched to run the ppc kernel rather than 
> the ppc64 kernel.  It's configured with 2GB of memory, no swap.
> 
> If I run one instance of the following program, it allocates all but about 3MB 
> of memory, and the memory hog spins with 100% of the cpu.
> 
> If I run two instances of the program, the machine locks up, doesn't respond to 
> pings, and is basically dead to the world.
> 
> Shouldn't the oom-killer be kicking in?

Kicks in for me (albeit the world comes to a screeching halt for a few
moments before it kicks in).  I'm using this patch below from Andrew.
Does it help you?

diff -puN mm/vmscan.c~vmscan-handle-empty-zones mm/vmscan.c
--- 25/mm/vmscan.c~vmscan-handle-empty-zones	2004-10-07 19:10:52.844797784 -0700
+++ 25-akpm/mm/vmscan.c	2004-10-07 19:11:49.804138648 -0700
@@ -851,6 +851,9 @@ shrink_caches(struct zone **zones, struc
 	for (i = 0; zones[i] != NULL; i++) {
 		struct zone *zone = zones[i];
 
+		if (zone->present_pages == 0)
+			continue;
+
 		zone->temp_priority = sc->priority;
 		if (zone->prev_priority > sc->priority)
 			zone->prev_priority = sc->priority;
@@ -999,6 +1002,9 @@ static int balance_pgdat(pg_data_t *pgda
 			for (i = pgdat->nr_zones - 1; i >= 0; i--) {
 				struct zone *zone = pgdat->node_zones + i;
 
+				if (zone->present_pages == 0)
+					continue;
+
 				if (zone->all_unreclaimable &&
 						priority != DEF_PRIORITY)
 					continue;
@@ -1033,6 +1039,9 @@ static int balance_pgdat(pg_data_t *pgda
 		for (i = 0; i <= end_zone; i++) {
 			struct zone *zone = pgdat->node_zones + i;
 
+			if (zone->present_pages == 0)
+				continue;
+
 			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
 				continue;
 
_
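
The guard that the patch adds in three places can be illustrated outside the kernel. The following is only a user-space sketch: `struct toy_zone` and `toy_scan_zones` are hypothetical names invented for this note, and the "reclaim work" is a placeholder counter, not anything from mm/vmscan.c.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct zone (hypothetical, illustration only). */
struct toy_zone {
	unsigned long present_pages; /* pages actually present in this zone */
	unsigned long scanned;       /* how many times the zone was scanned */
};

/*
 * Walk a NULL-terminated zone list the way the patched loops do:
 * a zone with no present pages is skipped before any work is done.
 * Returns the number of zones that were actually scanned.
 */
size_t toy_scan_zones(struct toy_zone **zones)
{
	size_t visited = 0;

	assert(zones != NULL);
	for (size_t i = 0; zones[i] != NULL; i++) {
		struct toy_zone *zone = zones[i];

		if (zone->present_pages == 0)
			continue;	/* empty zone: nothing to reclaim */

		zone->scanned++;	/* placeholder for the real reclaim work */
		visited++;
	}
	return visited;
}
```

Given a list with one populated zone between two empty ones, `toy_scan_zones` returns 1 and leaves the `scanned` counters of the empty zones at zero.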



* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-09  0:14 [BUG] oom killer not triggering in 2.6.9-rc3 Chris Friesen
  2004-10-09  0:26 ` Chris Wright
@ 2004-10-11 13:58 ` Alan Cox
  2004-10-12  5:03   ` Chris Friesen
  1 sibling, 1 reply; 14+ messages in thread
From: Alan Cox @ 2004-10-11 13:58 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Linux Kernel Mailing List

The OOM killer is a heuristic. Switch the machine to strict accounting
and it'll kill or block memory access correctly.



* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-11 13:58 ` Alan Cox
@ 2004-10-12  5:03   ` Chris Friesen
  2004-10-12  5:22     ` William Lee Irwin III
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Chris Friesen @ 2004-10-12  5:03 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

Alan Cox wrote:
> The OOM killer is a heuristic. 

Sure, but presumably it's a bad thing for a user with no privileges to be able 
to lock up a machine by running two tasks?  I'm not complaining that it's killing 
the wrong thing, I'm complaining that the machine locked up.

> Switch the machine to strict accounting
> and it'll kill or block memory access correctly.

I must be able to run an app that uses over 90% of system memory, and calls 
fork().  I was under the impression this made strict accounting infeasible?

Chris


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12  5:03   ` Chris Friesen
@ 2004-10-12  5:22     ` William Lee Irwin III
  2004-10-12 15:24       ` Chris Friesen
  2004-10-12  9:44     ` Andries Brouwer
  2004-10-12 21:02     ` [BUG] oom killer not triggering in 2.6.9-rc3 Alan Cox
  2 siblings, 1 reply; 14+ messages in thread
From: William Lee Irwin III @ 2004-10-12  5:22 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Alan Cox, Linux Kernel Mailing List

Alan Cox wrote:
>> The OOM killer is a heuristic. 

On Mon, Oct 11, 2004 at 11:03:16PM -0600, Chris Friesen wrote:
> Sure, but presumably it's a bad thing for a user with no privileges to be 
> able to lock up a machine by running two tasks?  I'm not complaining that 
> it's killing the wrong thing, I'm complaining that the machine locked up.

Alan Cox wrote:
>> Switch the machine to strict accounting
>> and it'll kill or block memory access correctly.

On Mon, Oct 11, 2004 at 11:03:16PM -0600, Chris Friesen wrote:
> I must be able to run an app that uses over 90% of system memory, and calls 
> fork().  I was under the impression this made strict accounting infeasible?

Not so. Just add enough swapspace to act as the backing store for the
aggregate anonymous virtualspace.


-- wli
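
How much swap "enough" means can be estimated from the mode-2 bound that proc(5) gives (quoted later in this thread): SS + RAM*(r/100). The sketch below is arithmetic only. The function name `swap_needed_kb` is hypothetical, and the example figures assume, pessimistically, that after fork() the child's copy of the parent's private writable memory is charged in full.

```c
#include <assert.h>
#include <limits.h>

/*
 * Smallest swap size (in kB) for which committed_kb fits under the
 * mode-2 bound SS + RAM * (r / 100). Hypothetical helper for illustration.
 */
unsigned long long swap_needed_kb(unsigned long long committed_kb,
				  unsigned long long ram_kb,
				  unsigned int ratio_percent)
{
	unsigned long long ram_share;

	/* Refuse inputs that would overflow the multiplication below. */
	assert(ratio_percent == 0 || ram_kb <= ULLONG_MAX / ratio_percent);

	/* Multiply first so integer division does not discard the ratio. */
	ram_share = ram_kb * ratio_percent / 100;

	return committed_kb > ram_share ? committed_kb - ram_share : 0;
}
```

For example, with 2,000,000 kB of RAM, a ratio of 50, and a parent plus child that each commit 1,800,000 kB, `swap_needed_kb(3600000, 2000000, 50)` gives 2,600,000 kB.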


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12  5:03   ` Chris Friesen
  2004-10-12  5:22     ` William Lee Irwin III
@ 2004-10-12  9:44     ` Andries Brouwer
  2004-10-12 15:32       ` Chris Friesen
  2004-10-12 21:02     ` [BUG] oom killer not triggering in 2.6.9-rc3 Alan Cox
  2 siblings, 1 reply; 14+ messages in thread
From: Andries Brouwer @ 2004-10-12  9:44 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Alan Cox, Linux Kernel Mailing List

On Mon, Oct 11, 2004 at 11:03:16PM -0600, Chris Friesen wrote:

> Alan Cox wrote:
> >The OOM killer is a heuristic. 
> 
> Sure, but presumably it's a bad thing for a user with no privileges to be 
> able to lock up a machine by running two tasks?  I'm not complaining that 
> it's killing the wrong thing, I'm complaining that the machine locked up.
> 
> > Switch the machine to strict accounting
> > and it'll kill or block memory access correctly.
> 
> I must be able to run an app that uses over 90% of system memory, and calls 
> fork().  I was under the impression this made strict accounting infeasible?
> 
> Chris

No.

The default allows a job to take ten times what is available,
and bad things happen later.
With overcommit mode 2 there is an upper bound, but you can
twiddle the bound as desired. From proc(5):

       /proc/sys/vm/overcommit_memory
              This file contains the kernel virtual memory
              accounting mode. Values are:
              0: heuristic overcommit (this is the default)
              1: always overcommit, never check
              2: always check, never overcommit
              In mode 0, calls of mmap(2) with MAP_NORESERVE set
              are not checked, and the default check is very
              weak, leading to the risk of getting a process
              "OOM-killed". Under Linux 2.4 any nonzero value
              implies mode 1. In mode 2 (available since Linux
              2.6), the total virtual address space on the system
              is limited to (SS + RAM*(r/100)), where SS is the
              size of the swap space, and RAM is the size of the
              physical memory, and r is the contents of the file
              /proc/sys/vm/overcommit_ratio.

       /proc/sys/vm/overcommit_ratio
              See the description of /proc/sys/vm/overcommit_mem

So, what you have is a bound on virtual memory, and that bound
can very easily be larger than twice physical memory.

Andries
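
The bound quoted from proc(5) is easy to evaluate by hand. As a minimal sketch, the helper below computes SS + RAM*(r/100) in kilobytes. The name `commit_limit_kb` and the sample ratios are assumptions made for this note, and whether a particular kernel accepts a given overcommit_ratio value is not something the thread establishes.

```c
#include <assert.h>
#include <limits.h>

/*
 * Mode-2 bound from proc(5): SS + RAM * (r / 100).
 * swap_kb is SS, ram_kb is RAM, ratio_percent is the value of
 * /proc/sys/vm/overcommit_ratio. Hypothetical helper for illustration.
 */
unsigned long long commit_limit_kb(unsigned long long swap_kb,
				   unsigned long long ram_kb,
				   unsigned int ratio_percent)
{
	/* Refuse inputs that would overflow the multiplication below. */
	assert(ratio_percent == 0 || ram_kb <= ULLONG_MAX / ratio_percent);

	/* Multiply first so integer division does not discard the ratio. */
	return swap_kb + ram_kb * ratio_percent / 100;
}
```

With 2 GB of RAM (2097152 kB) and no swap, a ratio of 50 bounds the total at 1048576 kB, while a ratio of 200 would bound it at 4194304 kB, which is twice physical memory.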


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12  5:22     ` William Lee Irwin III
@ 2004-10-12 15:24       ` Chris Friesen
  2004-10-12 16:28         ` William Lee Irwin III
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Friesen @ 2004-10-12 15:24 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Alan Cox, Linux Kernel Mailing List

William Lee Irwin III wrote:
> On Mon, Oct 11, 2004 at 11:03:16PM -0600, Chris Friesen wrote:
> 
>>I must be able to run an app that uses over 90% of system memory, and calls 
>>fork().  I was under the impression this made strict accounting infeasible?
> 
> 
> Not so. Just add enough swapspace to act as the backing store for the
> aggregate anonymous virtualspace.

In my first message I mentioned that I had no swap.  It's embedded, so I do not 
have the ability to add swap.

Chris


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12  9:44     ` Andries Brouwer
@ 2004-10-12 15:32       ` Chris Friesen
  2004-10-12 20:08         ` Chris Wright
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Friesen @ 2004-10-12 15:32 UTC (permalink / raw)
  To: Andries Brouwer; +Cc: Alan Cox, Linux Kernel Mailing List, wli

Andries Brouwer wrote:

> The default allows a job to take ten times what is available,
> and bad things happen later.
> With overcommit mode 2 there is an upper bound, but you can
> twiddle the bound as desired. From proc(5):

Okay, that may be a possibility.  I'll look into that.

However, isn't it a bad thing that a vanilla 2.6.9-rc3 can be totally locked up 
by an unprivileged user by running two tasks?

It seems to me that the OOM-killer not waking up is a bug.  I should not be able 
to lock up the system by running it out of memory--it should wake up and start 
killing things rather than hang the system.

Chris


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12 15:24       ` Chris Friesen
@ 2004-10-12 16:28         ` William Lee Irwin III
  0 siblings, 0 replies; 14+ messages in thread
From: William Lee Irwin III @ 2004-10-12 16:28 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Alan Cox, Linux Kernel Mailing List

>On Mon, Oct 11, 2004 at 11:03:16PM -0600, Chris Friesen wrote:
>>> I must be able to run an app that uses over 90% of system memory, and 
>>> calls fork().  I was under the impression this made strict accounting 
>>> infeasible?

William Lee Irwin III wrote:
>> Not so. Just add enough swapspace to act as the backing store for the
>> aggregate anonymous virtualspace.

On Tue, Oct 12, 2004 at 09:24:31AM -0600, Chris Friesen wrote:
> In my first message I mentioned that I had no swap.  It's embedded, so I do 
> not have the ability to add swap.

Then the closest thing to a good idea may be to enable overcommitment.
echo 1 > /proc/sys/vm/overcommit_memory to do that.


-- wli
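
A program can make the same change as the echo command by writing the mode to the sysctl file. The following is a rough sketch rather than a recommended tool: `set_overcommit_mode` is a hypothetical helper, the path is passed in as a parameter so the function can be tried on an ordinary file, and writing to the real /proc file normally requires root.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write an overcommit mode (0, 1 or 2) followed by a newline to `path`.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
int set_overcommit_mode(const char *path, int mode)
{
	FILE *f;
	int ok;

	assert(path != NULL);
	assert(mode >= 0 && mode <= 2);

	f = fopen(path, "w");
	if (f == NULL)
		return -1;

	ok = fprintf(f, "%d\n", mode) > 0;

	/* fclose() flushes the buffer; a failed flush means the write was lost. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

Calling `set_overcommit_mode("/proc/sys/vm/overcommit_memory", 1)` as root would correspond to the echo above. The return value should be checked, since the write fails for an unprivileged caller.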


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12 15:32       ` Chris Friesen
@ 2004-10-12 20:08         ` Chris Wright
  2004-10-12 20:41           ` [BUG] oom killer not triggering in 2.6.9-rc3 -- FIXED in -rc4 Chris Friesen
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wright @ 2004-10-12 20:08 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Andries Brouwer, Alan Cox, Linux Kernel Mailing List, wli

* Chris Friesen (cfriesen@nortelnetworks.com) wrote:
> However, isn't it a bad thing that a vanilla 2.6.9-rc3 can be totally locked up 
> by an unprivileged user by running two tasks?

Chris, did you try the patch I sent you (it's in mainline now, so if you
re-test on 2.6.9-rc4 you'd pick it up)?  With that patch, with 2G of
memory and no swap, my machine did not lock up, and the conditions that
the patch protects against were triggered.  And, with the patch backed
out, kswapd spins out of control.  I believe this is fixed.

thanks,
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3  -- FIXED in -rc4
  2004-10-12 20:08         ` Chris Wright
@ 2004-10-12 20:41           ` Chris Friesen
  2004-10-12 20:54             ` Chris Wright
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Friesen @ 2004-10-12 20:41 UTC (permalink / raw)
  To: Chris Wright; +Cc: Andries Brouwer, Alan Cox, Linux Kernel Mailing List, wli

Chris Wright wrote:

> Chris, did you try the patch I sent you (it's in mainline now, so if you
> re-test on 2.6.9-rc4 you'd pick it up)?  

Oops.  No, I didn't--meant to, but then forgot.  I should have; it seems to be 
fixed in -rc4.

> With that patch, with 2G of
> memory and no swap, my machine did not lock up, and the conditions that
> the patch protects against were triggered.  And, with the patch backed
> out, kswapd spins out of control.  I believe this is fixed.

2.6.9-rc4 seems sane again.  Start up two memory hogs, one gets killed immediately.

Sweet.

Thanks for your help,

Chris


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3  -- FIXED in -rc4
  2004-10-12 20:41           ` [BUG] oom killer not triggering in 2.6.9-rc3 -- FIXED in -rc4 Chris Friesen
@ 2004-10-12 20:54             ` Chris Wright
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Wright @ 2004-10-12 20:54 UTC (permalink / raw)
  To: Chris Friesen
  Cc: Chris Wright, Andries Brouwer, Alan Cox, Linux Kernel Mailing List, wli

* Chris Friesen (cfriesen@nortelnetworks.com) wrote:
> 2.6.9-rc4 seems sane again.  Start up two memory hogs, one gets killed immediately.

Good, thanks for checking.
-chris
-- 
Linux Security Modules     http://lsm.immunix.org     http://lsm.bkbits.net


* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12  5:03   ` Chris Friesen
  2004-10-12  5:22     ` William Lee Irwin III
  2004-10-12  9:44     ` Andries Brouwer
@ 2004-10-12 21:02     ` Alan Cox
  2004-10-12 22:25       ` Chris Friesen
  2 siblings, 1 reply; 14+ messages in thread
From: Alan Cox @ 2004-10-12 21:02 UTC (permalink / raw)
  To: Chris Friesen; +Cc: Linux Kernel Mailing List

On Tue, 2004-10-12 at 06:03, Chris Friesen wrote:
> > Switch the machine to strict accounting
> > and it'll kill or block memory access correctly.
> 
> I must be able to run an app that uses over 90% of system memory, and calls 
> fork().  I was under the impression this made strict accounting infeasible?

It's rather smarter than that; you'll probably want swap. The strict
accountant is a virtual address accountant, not a memory accountant. It
knows shared r/o segments don't need charging all the time, etc.




* Re: [BUG]  oom killer not triggering in 2.6.9-rc3
  2004-10-12 21:02     ` [BUG] oom killer not triggering in 2.6.9-rc3 Alan Cox
@ 2004-10-12 22:25       ` Chris Friesen
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Friesen @ 2004-10-12 22:25 UTC (permalink / raw)
  To: Alan Cox; +Cc: Linux Kernel Mailing List

Alan Cox wrote:
> On Tue, 2004-10-12 at 06:03, Chris Friesen wrote:

>>I must be able to run an app that uses over 90% of system memory, and calls 
>>fork().  I was under the impression this made strict accounting infeasible?

> It's rather smarter than that; you'll probably want swap. The strict
> accountant is a virtual address accountant, not a memory accountant. It
> knows shared r/o segments don't need charging all the time, etc.

As I said in the first message, I've got no swap.

In any case, moving to -rc4 seems to have cleared up the issue; the patch Chris 
Wright suggested seems to have worked.  The OOM killer now wakes up immediately and 
kills one of the memory hogs, and the system continues on.

Chris

