From: "Martin MOKREJŠ" <mmokrejs@ribosome.natur.cuni.cz>
To: tglx@linutronix.de
Cc: Andrew Morton <akpm@osdl.org>,
	piggin@cyberone.com.au, chris@tebibyte.org,
	marcelo.tosatti@cyclades.com, andrea@novell.com,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH] fix spurious OOM kills
Date: Tue, 14 Dec 2004 17:04:29 +0100
Message-ID: <41BF0F0D.4000408@ribosome.natur.cuni.cz>
In-Reply-To: <1101205649.3888.6.camel@tglx.tec.linutronix.de>

Thomas Gleixner wrote:

Hi,
  I went to check the status of this. I tested 2.6.10-rc3-bk8
on the same machine, and the parent process still gets killed.
The last patch Thomas posted to the list in this thread, for 2.6.10-rc2-mm3,
killed only the application. Maybe it's still only in the -mm tree?
Anyway, here are the results for 2.6.10-rc3-bk8:

Free pages:        3924kB (112kB HighMem)
Active:128410 inactive:125323 dirty:0 writeback:0 unstable:0 free:981 slab:1985 mapped:253497 pagetables:739
DMA free:68kB min:68kB low:84kB high:100kB active:5436kB inactive:5512kB present:16384kB pages_scanned:11608 all_unreclaimable? yes
protections[]: 0 0 0
Normal free:3744kB min:3756kB low:4692kB high:5632kB active:443312kB inactive:430804kB present:901120kB pages_scanned:887679 all_unreclaimable? yes
protections[]: 0 0 0
HighMem free:112kB min:128kB low:160kB high:192kB active:64892kB inactive:64976kB present:131044kB pages_scanned:132923 all_unreclaimable? yes
protections[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68kB
Normal: 0*4kB 0*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3744kB
HighMem: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 112kB
Swap cache: add 294889, delete 294883, find 530/704, race 0+0
Out of Memory: Killed process 6944 (RNAsubopt).
oom-killer: gfp_mask=0xd0
DMA per-cpu:
cpu 0 hot: low 2, high 6, batch 1
cpu 0 cold: low 0, high 2, batch 1
Normal per-cpu:
cpu 0 hot: low 32, high 96, batch 16
cpu 0 cold: low 0, high 32, batch 16
HighMem per-cpu:
cpu 0 hot: low 14, high 42, batch 7
cpu 0 cold: low 0, high 14, batch 7

Free pages:        3924kB (112kB HighMem)
Active:135050 inactive:118681 dirty:0 writeback:0 unstable:0 free:981 slab:1977 mapped:253498 pagetables:739
DMA free:68kB min:68kB low:84kB high:100kB active:5572kB inactive:5368kB present:16384kB pages_scanned:13496 all_unreclaimable ? yes
protections[]: 0 0 0
Normal free:3744kB min:3756kB low:4692kB high:5632kB active:469736kB inactive:404380kB present:901120kB pages_scanned:941233 all_unreclaimable? yes
protections[]: 0 0 0
HighMem free:112kB min:128kB low:160kB high:192kB active:64892kB inactive:64976kB present:131044kB pages_scanned:137915 all_unreclaimable? yes
protections[]: 0 0 0
DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68kB
Normal: 0*4kB 0*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3744kB
HighMem: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 112kB
Swap cache: add 294889, delete 294883, find 530/704, race 0+0
Out of Memory: Killed process 6863 (xterm).
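
As a sanity check on the dump above, the per-order counts on each buddy
free-list line should sum to the zone total printed after the "=", and the
three zones together should match the "Free pages" figure. A minimal Python
sketch, using the lines from the second report; the helper name
buddy_total_kb is made up for illustration and is not part of any kernel tool:

```python
import re

# Buddy free-list lines copied from the second OOM report above.
BUDDY_LINES = [
    "DMA: 1*4kB 0*8kB 0*16kB 0*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 68kB",
    "Normal: 0*4kB 0*8kB 0*16kB 1*32kB 0*64kB 1*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3744kB",
    "HighMem: 0*4kB 0*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 112kB",
]


def buddy_total_kb(line):
    """Return (zone, computed_kb, reported_kb) for one buddy free-list line."""
    zone, rest = line.split(":", 1)
    entries, reported = rest.rsplit("=", 1)
    # Each entry is "<free block count>*<block size>kB".
    computed = sum(int(count) * int(size)
                   for count, size in re.findall(r"(\d+)\*(\d+)kB", entries))
    reported_kb = int(re.search(r"(\d+)kB", reported).group(1))
    return zone.strip(), computed, reported_kb


totals = {}
for line in BUDDY_LINES:
    zone, computed, reported = buddy_total_kb(line)
    totals[zone] = computed
    print(zone, computed, reported, "ok" if computed == reported else "MISMATCH")

print("all zones:", sum(totals.values()), "kB")
```

The zone sums come to 68 + 3744 + 112 = 3924 kB, which agrees with the
"Free pages: 3924kB" line and with free:981 pages of 4 kB each. Normal and
HighMem are below their min watermark and DMA sits exactly at it.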


The machine is a lot less responsive when it starts swapping than it
was under 2.6.10-rc2-mm3. For example, just moving the mouse between
windows takes fvwm2 some 10-12 seconds to re-focus on another xterm
window.

Martin


> On Tue, 2004-11-23 at 08:41 +0100, Martin MOKREJŠ wrote: 
> 
>>>One big problem with killing the requesting process, or just returning
>>>ENOMEM to it, is that exactly this process might be an ssh login, when
>>>you try to log into the machine after some application went crazy and
>>>ate up most of the memory. The result is that you _cannot_ log into the
>>>machine, because the login is either killed or cannot start because it
>>>receives ENOMEM.
>>
>>I believe the application is the _first_ one that will get ENOMEM. It would
>>take terrible luck for it to ask for exactly the size of the remaining free
>>memory. Most probably, it will ask for less or more. "Less" is not
>>a problem in this case, so consider that it asks for more. Then the OOM killer
>>might well assume that the application asking for memory is most probably
>>exactly the application which caused the trouble.
> 
> 
> For one application which eats up all memory, the 2.4 ENOMEM behaviour
> works.
> 
> The scenario which made one of my boxes unusable under 2.4 is a forking
> server, which gets out of control. The last fork gets ENOMEM and does
> not happen, but the other forked processes are still there and consuming
> memory. The server application does the correct thing. It receives
> ENOMEM on fork() and cancels the connection request. On the next request
> the game starts again. Somebody notices that the box is not responding
> anymore and tries to login via ssh. Guess what happens. ssh login cannot
> fork due to ENOMEM. The same will happen on 2.6 if we make it behave
> like 2.4. 
> 
> We have TWO problems in oom handling:
> 
> 1. When do we trigger the out of memory killer
> 
> As far as my test cases go, 2.6.10-rc2-mm3 no longer triggers the
> oom killer without reason.
> 
> 2. Which process do we select to kill
> 
> The decision has been screwed since the oom killer was introduced. Also the
> reentrancy problem and some of the mechanisms in the out_of_memory
> function have to be modified to make it work.
> That's what my patch is addressing.
> 
> 
>>>Hard coded decisions like "prefer sshd, xyz,..." or "don't kill
>>>a, b, c" are out of the question.
>>
>>I'd go for it at least nowadays.
> 
> 
> Sure, you can do so on your box, but can you accept that we _CANNOT_
> hard code a list of do-not-kill apps, except init, into the kernel? I
> don't want to see the mail thread on LKML where the list of precious
> applications is discussed.
> 
> 
>>> 
>>>The ideas which were proposed to have a possibility to set a "don't kill
>>>me" or "yes, I'm a candidate" flag are likely to be a future way to go.
>>>But at the moment we have no way to make this work in current userlands.
>>
>>Do you think login or sshd will ever use the flag "yes, I'm a candidate"?
>>I think we get exactly the same behaviour right now with those hard coded decisions
>>you mention above. Otherwise the hard coded decision is programmed into
>>every sshd, init instance anyway. I think it's not necessary to put
>>login and shells on this ban list, the user will re-login again. ;)
> 
> 
> Having a generic interface to make this configurable is the only way to
> go. So users can decide what is important in their environment. There is
> more than a desktop PC environment and a lot of embedded boxes need to
> protect special applications.
> 
> 
>>>I refined the decision, so it no longer kills the parent if there
>>>were forked child processes available to kill. So it should now keep
>>>your bash alive.
>>
>>Yes, it doesn't kill parent bash. I don't understand the _doubled_ output
>>in syslog, but maybe you do. Is that related to hyperthreading? ;)
>>Tested on 2.6.10-rc2-mm2.
> 
> 
>>oom-killer: gfp_mask=0xd2
>>Free pages:        3924kB (112kB HighMem)
> 
> 
>>oom-killer: gfp_mask=0x1d2
>>Free pages:        3924kB (112kB HighMem)
> 
> 
> No, it's not related to hyperthreading. It's on the way out. 
> 
> I put an additional check into the page allocator. Does this help ?
> 
> tglx
> 

