linux-kernel.vger.kernel.org archive mirror
* Killing/balancing processes when overcommited
@ 2002-09-11 18:08 Jim Sibley
  2002-09-11 18:27 ` Jurriaan
                   ` (4 more replies)
  0 siblings, 5 replies; 43+ messages in thread
From: Jim Sibley @ 2002-09-11 18:08 UTC (permalink / raw)
  To: linux-kernel; +Cc: riel, ltc, Troy Reed

I have run into a situation in a multi-user Linux environment where, when
memory is exhausted, random things happen. The best case is that the
"offending" user's task is killed. Just as likely, another user's task is
killed. In some cases, important tasks such as the telnet daemon are
killed. In extreme cases, something is killed that is critical to the
overall well-being of the system, causing random loops, kernel panics, or
system "autism" (where it is not running and responds to no external
intervention other than a reboot).

Since Rik van Riel is listed as the author of the oom_kill module (which is
still the one being used in Linux 2.5), I contacted him and he suggested I
bring it to the linux-kernel list at vger.kernel.org.

We are running Linux in a multi-user, SMP, large-memory environment (an
LPAR on a zSeries with 2 GB of real memory, but this could just as well
happen on any other hardware platform). We have several Linux systems
running with 4-8 swap volumes, or 10-18 GB of real+swap memory, and we have
run Linux systems with over 40 GB of real+swap for significant periods, so
the paging mechanism seems to be quite robust in general.

However, when the memory+swap space becomes full, we start getting random
problems. I've turned on the debugging in oom_kill and have watched how the
"kill" scores are calculated, and the results seem rather random to me.
When memory is exhausted, Linux needs some attention. In a "well tuned"
system, we are safe, but when the system accidentally (or deliberately)
becomes "detuned", oom_kill is entered and arbitrarily kills a process.

Essentially, Linux has no real conception of "importance" to the
installation. In a single user environment, this is a moot point and
oom_kill works well enough.

In a multi-user environment oom_kill needs to be more selective. There is
no good algorithmic method to decide which process to terminate with
extreme prejudice (a sketch of the current heuristic follows this list):

1 - CPU usage may not be a good measure - the user causing the system to
"become detuned" may use little CPU. The assumption that the memory
offender is consuming CPU, such as being in a loop, is neither necessary
nor sufficient. And if the system has been running for quite a while (days,
weeks, months), important tasks may have accumulated a relatively large
amount of CPU time and become higher-profile targets for oom_kill.

2 - Large memory use may not be a good measure, as some important tasks
often have large memory and working-set requirements, such as a database
server.

3 - Measuring memory by task rather than by the total memory usage of the
user is misleading, because a single task using a moderate amount of memory
scores higher than any one task of a user whose many small tasks together
use more memory.

4 - Niceness is not really useful in a multi-user environment. Each user
gets to determine how nice he is. In a commercial environment, business
needs determine the "niceness": a long-running process that uses a lot of
resources may be very important to the business, but could run at a lower
priority to allow better interactive response for all users. Killing it
just because it has been "niced" is not adequate.

5 - Other numerical limits tend to be arbitrary. Resources should be
allocated by installation need: the installation's most important users
should get them when they need them, and others when the resources are
available.
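
For reference, here is roughly the heuristic being criticized. This is a
simplified sketch of the badness() calculation in mm/oom_kill.c, written
from memory with abbreviated field names - not the verbatim 2.4 source -
but it combines exactly the measures in points 1-4 above:

/*
 * Simplified sketch of the badness() heuristic in mm/oom_kill.c.
 * Not the verbatim kernel source; fields and scaling are abbreviated.
 * The task with the highest score is killed first.
 */
struct task {
	unsigned long total_vm;   /* memory footprint, in pages */
	unsigned long cpu_time;   /* accumulated CPU time */
	unsigned long run_time;   /* wall-clock time since start */
	int nice;                 /* user-settable niceness */
	int is_root;              /* root or CAP_SYS_ADMIN */
};

static unsigned long int_sqrt(unsigned long x)
{
	unsigned long r = 0, bit = 1ul << (sizeof(unsigned long) * 8 - 2);

	while (bit > x)
		bit >>= 2;
	while (bit) {
		if (x >= r + bit) {
			x -= r + bit;
			r = (r >> 1) + bit;
		} else {
			r >>= 1;
		}
		bit >>= 2;
	}
	return r;
}

static unsigned long badness(const struct task *p)
{
	unsigned long points = p->total_vm;      /* point 2: raw size */

	/* point 1: accumulated CPU and run time scale the score - a
	   crude proxy with no necessary relation to who actually
	   caused the memory shortage */
	points /= int_sqrt(p->cpu_time + 1);
	points /= int_sqrt(int_sqrt(p->run_time + 1));

	/* point 4: positively niced tasks score double, i.e. die sooner */
	if (p->nice > 0)
		points *= 2;

	/* root tasks get a discount, but there is no notion of
	   installation-defined importance anywhere in the formula */
	if (p->is_root)
		points /= 4;

	return points;
}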

Since algorithmic methods don't really do the job, I suggest that there
should be some way for the installation to designate importance (not just
for oom_kill, but for overall use of resources).

For example, a file in /etc could be read that lists processes by userid or
group, in order of importance or with a weight for each. When oom_kill or
other code needs to decide whether to limit resources or kill a task, the
users or groups designated least important would be restricted or killed
first.
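
A minimal sketch of what such a policy file and its use could look like
(the /etc/oom_priority name, its format, and every function here are
hypothetical, purely to illustrate the idea):

/*
 * Hypothetical sketch only. A policy file, say /etc/oom_priority,
 * maps users or groups to a kill priority, higher = sacrificed first:
 *
 *     # user-or-group   kill-priority
 *     root              0
 *     payroll           2
 *     testing           5
 */
#include <stdio.h>

struct oom_policy {
	char name[32];
	int kill_prio;          /* higher value = kill sooner */
};

static int load_policy(const char *path, struct oom_policy *tab, int max)
{
	FILE *f = fopen(path, "r");
	char line[128];
	int n = 0;

	if (!f)
		return -1;
	while (n < max && fgets(line, sizeof(line), f)) {
		if (line[0] == '#' || line[0] == '\n')
			continue;       /* skip comments and blank lines */
		if (sscanf(line, "%31s %d", tab[n].name, &tab[n].kill_prio) == 2)
			n++;
	}
	fclose(f);
	return n;
}

/* Pick the least important entry as the first candidate to restrict or
   kill; ties would fall back to a size heuristic within the group. */
static const struct oom_policy *first_victim(const struct oom_policy *tab,
                                             int n)
{
	const struct oom_policy *victim = NULL;
	int i;

	for (i = 0; i < n; i++)
		if (!victim || tab[i].kill_prio > victim->kill_prio)
			victim = &tab[i];
	return victim;
}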

Suppose an installation decided to replace a lot of desktop Linuxes with
thin clients and a large central Linux server (the hardware platform is up
to the installation) running servers, client processes, database services,
data storage and sharing, and backup.

You might see the installation's priority list like this (the lowest value
is most important, the highest value is least important). Kill priority
reflects how important it is to keep the process or task running. I've also
added a resource priority to give an indication of who should get resources
first (such as CPU or devices).

     group                     resource priority    kill priority
     system                    0                    0 - never kill
     support                   1                    1
     payroll                   2                    2
     production                3                    3
     general user              4                    4
     production background     5                    3  <- make sure testing
                                                        and general user are
                                                        killed BEFORE
                                                        production
     testing                   6                    5

Note that in the example above, production background has the second-lowest
resource priority, but a higher kill priority ("we don't care how long it
takes, but it must complete").

In a system with sufficient resources, everyone would get what they needed.
As resources become limited, payroll gets resources first and testing gets
the least. In the extreme case, when the system is overwhelmed, testing is
the first to be removed.

This approach also has the advantage in a multi-user environment that the
system administrator would get phone calls and spam from the less important
users before the important processes are jeopardized, and would hopefully
have time to react.


Regards, Jim
Linux S/390-zSeries Support, SEEL, IBM Silicon Valley Labs
t/l 543-4021, 408-463-4021, jlsibley@us.ibm.com
*** Grace Happens ***




* RE: Killing/balancing processes when overcommited
@ 2002-09-12 18:14 Jim Sibley
  0 siblings, 0 replies; 43+ messages in thread
From: Jim Sibley @ 2002-09-12 18:14 UTC (permalink / raw)
  To: Rik van Riel; +Cc: linux-kernel, Giuliano Pochini, Troy Reed


Agreed, and I think it's up to the installation to decide which process
that is.

Regards, Jim
Linux S/390-zSeries Support, SEEL, IBM Silicon Valley Labs
t/l 543-4021, 408-463-4021, jlsibley@us.ibm.com
*** Grace Happens ***



                                                                                                                  
From: Rik van Riel <riel@conectiva.com.br>
To: Giuliano Pochini <pochini@shiny.it>
Cc: Jim Sibley/San Jose/IBM@IBMUS, Troy Reed/Santa Teresa/IBM@IBMUS,
    linux-kernel@vger.kernel.org
Date: 09/12/02 12:02 PM
Subject: RE: Killing/balancing processes when overcommited



On Thu, 12 Sep 2002, Giuliano Pochini wrote:
> On 11-Sep-2002 Jim Sibley wrote:
> > I have run into a situation in a multi-user Linux environment that when
> > memory is exhausted, random things happen. [...] In a "well tuned"
> > system, we are safe, but when the system accidentally (or deliberately)
> > becomes "detuned", oom_kill is entered and arbitrarily kills a process.
>
> It's not difficult to make the kernel choose the right processes
> to kill. It's impossible.

This assumes there is only 1 "good" process to kill.  In reality
there will often be a number of acceptable candidates, so we just
need to identify one of those ;)

Rik
--
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/                    http://distro.conectiva.com/

Spamtraps of the month:  september@surriel.com trac@trac.org







* RE: Killing/balancing processes when overcommited
@ 2002-09-12 19:00 Jim Sibley
  2002-09-12 19:08 ` Thunder from the hill
                   ` (2 more replies)
  0 siblings, 3 replies; 43+ messages in thread
From: Jim Sibley @ 2002-09-12 19:00 UTC (permalink / raw)
  To: Thunder from the hill; +Cc: linux-kernel, Giuliano Pochini, riel


The memory eaters may not be the ones really "causing the problem" as
determined by the installation. The case where I discovered this was when
someone was asking for a lot of small chunks of memory (legitimately). So
you would need a history and the user's total memory usage to identify who
this is. And this is not really just limited to memory.
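
As a sketch of the kind of bookkeeping that would require (not actual
kernel code; the structures are simplified), usage would be accumulated per
user rather than per task, so many small allocations from one userid add
up:

/*
 * Sketch only: charge memory per uid instead of per task, so a user
 * with many small tasks is visible as one large consumer.
 */
struct task {
	int uid;
	unsigned long total_vm;    /* pages committed by this task */
	struct task *next;
};

/* Total pages committed by one user across all of their tasks. */
static unsigned long user_total_vm(const struct task *tasks, int uid)
{
	unsigned long total = 0;
	const struct task *p;

	for (p = tasks; p; p = p->next)
		if (p->uid == uid)
			total += p->total_vm;
	return total;
}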

I still favor an installation file in /etc specifying the order in which
things are to be killed. Any algorithmic assumptions are bound to fail at
some point, to the dissatisfaction of the installation.

And this is not just limited to memory exhaustion. For example, if I exceed
the maximum number of files, I can't log on to fix the problem. If the
installation could set some priorities, they could say who to sacrifice in
order to keep others running.

Regards, Jim
Linux S/390-zSeries Support, SEEL, IBM Silicon Valley Labs
t/l 543-4021, 408-463-4021, jlsibley@us.ibm.com
*** Grace Happens ***




* RE: Killing/balancing processes when overcommited
@ 2002-09-13 21:13 Jim Sibley
  2002-09-13 22:31 ` Timothy D. Witham
  0 siblings, 1 reply; 43+ messages in thread
From: Jim Sibley @ 2002-09-13 21:13 UTC (permalink / raw)
  To: wookie; +Cc: riel, linux-kernel, thunder



Tim wrote:
>   There is another solution.  And that is never allocate memory unless
> you have swap space.  Yes, the issue is that you need to have lots of
> disk allocated to swap but on a big machine you will have that space.
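
What Tim describes amounts to strict commit accounting: refuse an
allocation up front unless it can be backed by real memory or swap, rather
than overcommitting and killing later. A rough sketch of the idea (the
names and page-granular bookkeeping are illustrative, not any particular
kernel's interface):

/*
 * Sketch of strict commit accounting. Illustrative only; these names
 * do not correspond to an actual kernel interface.
 */
static unsigned long committed_pages;   /* pages promised so far */
static unsigned long ram_pages;         /* total physical memory */
static unsigned long swap_pages;        /* total swap space */

/* Returns 1 if the request can be honored, 0 if it must fail now,
   so malloc() returns NULL instead of oom_kill firing later. */
static int vm_enough_memory(unsigned long pages)
{
	if (committed_pages + pages > ram_pages + swap_pages)
		return 0;
	committed_pages += pages;
	return 1;
}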

How do you predict whether a program is going to ask for more memory? Maybe
it only needs additional memory for a short time, and is a good citizen
that gives it back?

How much disk needs to be allocated for swap? On a 32-bit system, each
logged-in user is limited to 2 GB, so do we need 2 GB for each of them?
250 users, 500 GB of disk?

In a 64 bit system, how much swap would you reserve?

Actually, another OS took this approach in the early 70s, and it was
quickly junked when they found out how much disk they really had to keep in
reserve for paging.

>   This way the offending process that asks for more memory will be
> the one that gets killed.  Even if the 1st couple of ones aren't the
> misbehaving process eventually it will ask for more memory and suffer
> process execution.

It may not be the "offending process" that is asking for more memory. How
do you judge "offending"?








Thread overview: 43+ messages
2002-09-11 18:08 Killing/balancing processes when overcommited Jim Sibley
2002-09-11 18:27 ` Jurriaan
2002-09-12  8:26   ` Helge Hafting
2002-09-11 21:44 ` Alan Cox
2002-09-12  7:06 ` Tim Connors
2002-09-12  7:25 ` Giuliano Pochini
2002-09-12 16:02   ` Rik van Riel
2002-09-12 18:30   ` Thunder from the hill
2002-09-13  8:17     ` Giuliano Pochini
2002-09-13 10:22       ` Thunder from the hill
2002-09-13 12:53 ` Denis Vlasenko
2002-09-13 12:54   ` Jesse Pollard
2002-09-12 18:14 Jim Sibley
2002-09-12 19:00 Jim Sibley
2002-09-12 19:08 ` Thunder from the hill
2002-09-12 20:35   ` Alan Cox
2002-09-12 20:43     ` Thunder from the hill
2002-09-12 20:55       ` Alan Cox
2002-09-12 21:15         ` Thunder from the hill
2002-09-12 21:22           ` Jesse Pollard
2002-09-12 23:08           ` Alan Cox
2002-09-12 23:12             ` Thunder from the hill
2002-09-12 21:19       ` Jesse Pollard
2002-09-12 21:56         ` Thunder from the hill
2002-09-13  7:51           ` Giuliano Pochini
2002-09-13 10:17             ` Thunder from the hill
2002-09-12 19:09 ` Rik van Riel
2002-09-13  8:05 ` Giuliano Pochini
2002-09-13 10:54   ` Helge Hafting
2002-09-13 13:02     ` Giuliano Pochini
2002-09-13 16:40       ` Gerhard Mack
2002-09-13 20:23   ` Timothy D. Witham
2002-09-13 21:13 Jim Sibley
2002-09-13 22:31 ` Timothy D. Witham
2002-09-13 22:38   ` Timothy D. Witham
2002-09-13 22:44   ` Rik van Riel
2002-09-13 23:12     ` Timothy D. Witham
2002-09-16  7:29     ` Helge Hafting
2002-09-16 14:03       ` Rik van Riel
2002-09-16 18:49         ` Timothy D. Witham
2002-09-16 19:11           ` Rik van Riel
2002-09-16 20:27             ` Timothy D. Witham
2002-09-14  0:23   ` Jim Sibley
