From: "Martin MOKREJŠ" <mmokrejs@ribosome.natur.cuni.cz>
To: tglx@linutronix.de
Cc: Andrew Morton <akpm@osdl.org>,
piggin@cyberone.com.au, chris@tebibyte.org,
marcelo.tosatti@cyclades.com, andrea@novell.com,
LKML <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH] fix spurious OOM kills
Date: Tue, 14 Dec 2004 17:04:29 +0100 [thread overview]
Message-ID: <41BF0F0D.4000408@ribosome.natur.cuni.cz> (raw)
In-Reply-To: <1101205649.3888.6.camel@tglx.tec.linutronix.de>
Thomas Gleixner wrote:
Hi,
I went to check what's the status of this. I tested 2.6.10-rc3-bk8
on the same machine, and the parent process still get's killed.
The last patch Thomas has posted to the list in this thread for 2.6.10-rc2-mm3
killed only the application. Maybe it's still in -mm tree?
Anyway, here are results for 2.6.10-rc3-bk8 as I've said:
Free pages: 3924kB (112kB HighMem)
Active:128410 inactive:125323 dirty:0 writeback:0 unstable:0 free:981 slab:1985 mapped:253497 pagetables:739
DMA free:68kB min:68kB low:84kB high:100kB active:5436kB inactive:5512kB present:16384kB pages_scanned:11608 all_unreclaimable
? yes
protections[]: 0 0 0
Normal free:3744kB min:3756kB low:4692kB high:5632kB active:443312kB inactive:430804kB present:901120kB pages_scanned:887679 all_unreclaimable? yes
protections[]: 0 0 0
HighMem free:112kB min:128kB low:160kB high:192kB active:64892kB inactive:64976kB present:131044kB pages_scanned:132923 all_unreclaimable? yes
protections[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68kB
Normal: 0*4kB 0*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3744kB
HighMem: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 112kB
Swap cache: add 294889, delete 294883, find 530/704, race 0+0
Out of Memory: Killed process 6944 (RNAsubopt).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 14, high 42, batch 7
cpu 0 cold: low 0, high 14, batch 7
Free pages: 3924kB (112kB HighMem)
Active:135050 inactive:118681 dirty:0 writeback:0 unstable:0 free:981 slab:1977 mapped:253498 pagetables:739
DMA free:68kB min:68kB low:84kB high:100kB active:5572kB inactive:5368kB present:16384kB pages_scanned:13496 all_unreclaimable ? yes
protections[]: 0 0 0
Normal free:3744kB min:3756kB low:4692kB high:5632kB active:469736kB inactive:404380kB present:901120kB pages_scanned:941233 all_unreclaimable? yes
protections[]: 0 0 0
HighMem free:112kB min:128kB low:160kB high:192kB active:64892kB inactive:64976kB present:131044kB pages_scanned:137915 all_unreclaimable? yes
protections[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68kB
Normal: 0*4kB 0*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3744kB
HighMem: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 112kB
Swap cache: add 294889, delete 294883, find 530/704, race 0+0
Out of Memory: Killed process 6863 (xterm).
I see the machine a lot less responsive when it starts swapping
compared to 2.6.10-rc2-mm3. For example, just moving mouse between
windows takes some 10-12 seconds to fvwm2 to re-focus to another xterm
window.
Martin
> On Tue, 2004-11-23 at 08:41 +0100, Martin MOKREJŠ wrote:
>
>>>One big problem when killing the requesting process or just sending
>>>ENOMEM to the requesting process is, that exactly this process might be
>>>a ssh login, when you try to log into to machine after some application
>>>went crazy and ate up most of the memory. The result is that you
>>>_cannot_ log into the machine, because the login is either killed or
>>>cannot start because it receives ENOMEM.
>>
>>I believe the application is _first_ who will get ENOMEM. It must be
>>terrible luck that it would ask exactly for the size of remaining free
>>memory. Most probably, it will ask for less or more. "Less" in not
>>a problem in this case, so consider it asks for more. Then, OOM killer
>>might well expect the application asking for memory is most probably
>>exactly the application which caused the trouble.
>
>
> For one application, which eats up all memory the 2.4 ENOMEM bahviour
> works.
>
> The scenario which made one of my boxes unusable under 2.4 is a forking
> server, which gets out of control. The last fork gets ENOMEM and does
> not happen, but the other forked processes are still there and consuming
> memory. The server application does the correct thing. It receives
> ENOMEM on fork() and cancels the connection request. On the next request
> the game starts again. Somebody notices that the box is not repsonding
> anymore and tries to login via ssh. Guess what happens. ssh login cannot
> fork due to ENOMEM. The same will happen on 2.6 if we make it behave
> like 2.4.
>
> We have TWO problems in oom handling:
>
> 1. When do we trigger the out of memory killer
>
> As far as my test cases go, 2.6.10-rc2-mm3 does not longer trigger the
> oom without reason.
>
> 2. Which process do we select to kill
>
> The decision is screwed since the oom killer was introduced. Also the
> reentrancy problem and some of the mechanisms in the out_of_memory
> function have to be modified to make it work.
> That's what my patch is addressing.
>
>
>>>Putting hard coded decisions like "prefer sshd, xyz,...", " don't kill
>>>a, b, c" are out of discussion.
>>
>>I'd go for it at least nowadays.
>
>
> Sure, you can do so on your box, but can you accept, that we _CANNOT_
> hard code a list of do not kill apps, except init, into the kernel. I
> don't want to see the mail thread on LKML, where the list of precious
> application is discussed.
>
>
>>>
>>>The ideas which were proposed to have a possibility to set a "don't kill
>>>me" or "yes, I'm a candidate" flag are likely to be a future way to go.
>>>But at the moment we have no way to make this work in current userlands.
>>
>>Do you think login or sshd will ever use flag "yes, I'm a candidate"?
>>I think exactly same bahaviour we get right now with those hard coded decisions
>>you mention above. Otherwise the hard coded decision is programmed into
>>every sshd, init instance anyway. I think it's not necessary to put
>>login and shells on thsi ban list, user will re-login again. ;)
>
>
> Having a generic interface to make this configurable is the only way to
> go. So users can decide what is important in their environment. There is
> more than a desktop PC environment and a lot of embedded boxes need to
> protect special applications.
>
>
>>>I refined the decision, so it does not longer kill the parent, if there
>>>were forked child processes available to kill. So it now should keep
>>>your bash alive.
>>
>>Yes, it doesn't kill parent bash. I don't understand the _doubled_ output
>>in syslog, but maybe you do. Is that related to hyperthreading? ;)
>>Tested on 2.6.10-rc2-mm2.
>
>
>>oom-killer: gfp_mask=0xd2
>>Free pages: 3924kB (112kB HighMem)
>
>
>>oom-killer: gfp_mask=0x1d2
>>Free pages: 3924kB (112kB HighMem)
>
>
> No, it's not related to hyperthreading. It's on the way out.
>
> I put an additional check into the page allocator. Does this help ?
>
> tglx
>
next prev parent reply other threads:[~2004-12-14 16:04 UTC|newest]
Thread overview: 64+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-11-11 11:29 [PATCH] fix spurious OOM kills Marcelo Tosatti
2004-11-11 15:42 ` Andrea Arcangeli
2004-11-11 12:38 ` Marcelo Tosatti
2004-11-11 16:50 ` Andrea Arcangeli
2004-11-11 13:56 ` Marcelo Tosatti
2004-11-11 21:45 ` Andrea Arcangeli
2004-11-11 19:19 ` Marcelo Tosatti
2004-11-11 17:42 ` Martin J. Bligh
2004-11-11 21:50 ` Andrea Arcangeli
2004-11-12 11:13 ` fix for mpol mm corruption on tmpfs Andrea Arcangeli
2004-11-11 21:57 ` [PATCH] fix spurious OOM kills Chris Ross
2004-11-12 16:52 ` Chris Ross
2004-11-12 23:56 ` Nick Piggin
2004-11-13 23:37 ` Andrea Arcangeli
2004-11-14 9:44 ` Marcelo Tosatti
2004-11-14 10:02 ` Marcelo Tosatti
2004-11-14 17:11 ` Andrea Arcangeli
2004-11-14 17:03 ` Andrea Arcangeli
2004-11-14 18:16 ` Martin J. Bligh
2004-11-14 18:27 ` Andrea Arcangeli
2004-11-14 20:21 ` Marcelo Tosatti
2004-11-16 16:30 ` Chris Ross
2004-11-17 9:08 ` Chris Ross
2004-11-17 9:23 ` Andrew Morton
2004-11-17 6:06 ` Marcelo Tosatti
2004-11-17 6:08 ` Marcelo Tosatti
2004-11-17 6:38 ` Marcelo Tosatti
2004-11-17 11:04 ` Chris Ross
2004-11-17 10:26 ` Andrew Morton
2004-11-17 10:50 ` Chris Ross
2004-11-17 7:09 ` Marcelo Tosatti
2004-11-17 11:49 ` Chris Ross
2004-11-17 12:09 ` Rik van Riel
2004-11-17 13:12 ` Chris Ross
[not found] ` <419CD8C1.4030506@ribosome.natur.cuni.cz>
2004-11-18 21:16 ` Andrew Morton
[not found] ` <419D25B5.1060504@ribosome.natur.cuni.cz>
[not found] ` <419D2987.8010305@cyberone.com.au>
2004-11-19 0:03 ` Martin MOKREJŠ
2004-11-19 0:08 ` Andrew Morton
2004-11-19 8:09 ` Marcelo Tosatti
2004-11-19 16:17 ` Thomas Gleixner
[not found] ` <419E821F.7010601@ribosome.natur.cuni.cz>
2004-11-20 10:23 ` Thomas Gleixner
2004-11-20 10:45 ` Martin MOKREJŠ
2004-11-20 11:29 ` Martin MOKREJŠ
2004-11-20 13:29 ` Thomas Gleixner
2004-11-20 21:19 ` Martin MOKREJŠ
2004-11-21 11:53 ` Thomas Gleixner
2004-11-21 12:17 ` Martin MOKREJŠ
2004-11-21 13:57 ` Thomas Gleixner
2004-11-22 10:55 ` Thomas Gleixner
2004-11-23 7:41 ` Martin MOKREJŠ
2004-11-23 10:27 ` Thomas Gleixner
2004-11-24 15:52 ` Martin MOKREJŠ
2004-11-24 16:36 ` Thomas Gleixner
2004-12-14 16:04 ` Martin MOKREJŠ [this message]
2004-12-14 17:38 ` Andrea Arcangeli
2004-12-14 23:30 ` Nick Piggin
2004-12-14 23:55 ` Andrea Arcangeli
2004-12-15 0:16 ` Thomas Gleixner
2004-12-15 0:37 ` Andrea Arcangeli
2004-12-15 0:48 ` Thomas Gleixner
2004-11-21 19:01 ` Chris Ross
2004-11-22 12:15 ` Chris Ross
2004-11-22 8:35 ` Marcelo Tosatti
2004-11-16 8:37 ` Chris Ross
2004-11-17 3:45 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=41BF0F0D.4000408@ribosome.natur.cuni.cz \
--to=mmokrejs@ribosome.natur.cuni.cz \
--cc=akpm@osdl.org \
--cc=andrea@novell.com \
--cc=chris@tebibyte.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=marcelo.tosatti@cyclades.com \
--cc=piggin@cyberone.com.au \
--cc=riel@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).