linux-kernel.vger.kernel.org archive mirror
* Break 2.4 VM in five easy steps
@ 2001-06-05 22:19 Derek Glidden
From: Derek Glidden @ 2001-06-05 22:19 UTC
  To: linux-kernel


After reading the messages to this list for the last couple of weeks and
playing around on my machine, I'm convinced that the VM system in 2.4 is
still severely broken.  

This isn't trying to test extreme low-memory pressure, just how the
system handles recovering from going somewhat into swap.  That is a real
day-to-day problem for me, because I often run a couple of apps that
live in RAM most of the time, but during heavy computation runs can go
a couple hundred megs into swap for a few minutes at a time.  Whenever
that happens, my machine always starts acting up afterwards, so I
started investigating and found some really strange stuff going on.

To demonstrate this to a co-worker, I cooked up this really simple,
really stupid, very effective test.  (Note that all of this is probably
specific to IA32, which is the platform on which I'm running.)

-- How to Break your 2.4 kernel VM in 5 easy steps

1) compile the following code:

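#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void) {
   size_t count = 100000000;

   /* allocate a buttload of memory and try to touch it all */
   int *ptr = calloc(count, sizeof(int));
   if (ptr == NULL)
      return 1;
   /* calloc may hand back untouched zero pages, so write to them all */
   memset(ptr, 1, count * sizeof(int));

   /* sleep for a bit to let the system quiesce */
   sleep(20);

   /* let it all go away now */
   free(ptr);
   return 0;
}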

2) depending on the amount of RAM/swap available in your machine, you
might need to adjust the calloc to allocate a different amount.  This
allocates about 400MB.  

3) Run the program, or more than one copy at once.  You want to put your
machine somewhat into swap, but not totally overwhelm it.  On the system
I'm using to write this, with 512MB of RAM and 512MB of swap, I run two
copies of this program simultaneously and it puts me a couple hundred
megs into swap.

4) Let the program exit, then run "free" or "cat /proc/meminfo" or
something to make sure your machine has paged a bunch of stuff out into
swap.

5) try to "swapoff" your swap partition and watch the machine become
completely and entirely unresponsive for several minutes.
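
For step 4, here is a rough sketch of checking swap usage from a program
instead of eyeballing "free".  It assumes the Linux /proc/meminfo layout
with "SwapTotal:" and "SwapFree:" lines reported in kB.  The helper names
(meminfo_kb, swap_used_kb, report_swap) are made up for illustration and
are not part of any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find "Key:   12345 kB" in meminfo-style text
   and return the number, or -1 if the key is not present. */
long meminfo_kb(const char *text, const char *key)
{
    size_t keylen;
    const char *line = text;

    assert(text != NULL && key != NULL);
    keylen = strlen(key);

    while (line != NULL && *line != '\0') {
        /* match the key only at the start of a line, followed by ':' */
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            long value;
            if (sscanf(line + keylen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++;            /* step past the newline */
    }
    return -1;
}

/* Hypothetical helper: swap in use, in kB, or -1 if unknown. */
long swap_used_kb(const char *text)
{
    long total = meminfo_kb(text, "SwapTotal");
    long avail = meminfo_kb(text, "SwapFree");

    if (total < 0 || avail < 0)
        return -1;
    return total - avail;
}

/* Read /proc/meminfo and print how far into swap the machine is.
   Returns 0 on success, -1 if the file cannot be opened. */
int report_swap(void)
{
    char buf[8192];
    size_t n;
    FILE *fp = fopen("/proc/meminfo", "r");

    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, fp);
    fclose(fp);
    buf[n] = '\0';

    printf("swap used: %ld kB\n", swap_used_kb(buf));
    return 0;
}
```

Calling report_swap() from main() once before running the "allocate"
program and once after it exits should show whether the machine was
really left with pages sitting in swap.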

--

If I do this on the machine where I'm writing this, a K7-700 on an ASUS
K7M motherboard with 512MB each of swap and RAM, the machine will become
_entirely_ unresponsive for several minutes.  (I can make any machine
running 2.4 behave the same way, with any version I've tried from 2.4.2
on up through most of the -ac kernels too.)  The HD comes on for a few
seconds at the very start of the "swapoff", CPU utilization immediately
pegs at 100% system time, and then for a few minutes after, as far as
anyone can tell, the machine is TOTALLY locked up.  No console response,
no response from anything on the machine.  However, after a few minutes
of TOTAL catatonia, it will mysteriously come back to life, having
finally released all its swap.
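
To put a number on "several minutes", something like the following
sketch could be used in place of the swapoff command.  It assumes Linux
with glibc's <sys/swap.h> and has to run as root.  The names
elapsed_seconds and timed_swapoff are illustrative only, and the device
path you pass in is whatever your own swap partition is.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/swap.h>

/* Seconds between two timespec readings, as a double. */
double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Hypothetical helper: call swapoff(2) on one swap device and report
   how long the call blocked.  Returns whatever swapoff returned. */
int timed_swapoff(const char *device)
{
    struct timespec start, end;
    int rc, saved_errno;

    assert(device != NULL);

    clock_gettime(CLOCK_MONOTONIC, &start);
    rc = swapoff(device);
    saved_errno = errno;       /* keep swapoff's error, if any */
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (rc != 0)
        fprintf(stderr, "swapoff(%s): %s\n", device,
                strerror(saved_errno));
    else
        printf("swapoff(%s) took %.1f s\n", device,
               elapsed_seconds(start, end));
    return rc;
}
```

Since the whole machine stalls while this runs, the printed time is only
an estimate of how long the swapoff call itself blocked, but it gives a
figure that can be compared between 2.2 and 2.4 on the same hardware.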

Now, this is a VERY contrived test, but there are a couple of things
about doing this against 2.4 compared with 2.2 that seem VERY BROKEN to
me.

1) Running this against a machine running a 2.2-series kernel does
nothing out of the ordinary.  You hit a bunch of swap, exit the
"allocate" program, swapoff, and everything is fine after a few seconds
of disk activity as it pages everything back into RAM.  Least surprise. 
Under 2.4, when you "swapoff" it appears as far as anyone can tell that
the machine has locked up completely.  Very surprising.  In fact, the
first time it happened to me, I hit the Big Red Switch thinking the
machine _had_ locked up.  It wasn't until I started playing around with
memory allocation a bit more and read some of the problems on LKML that
I started to realize it wasn't locked up - just spinning.

2) Under 2.2, when the "allocate" programs exit, the amount of mem and
swap that show up in the "used" column are quite small - about what
you'd expect from all the apps that are actually running. No surprise
there.  Under 2.4, after running the "allocate" program, "free" shows
about 200MB each under mem and swap as "used".  A lot of memory shows up
in the "cached" column, so that explains the mem usage (although not
what's cached, unless it's caching swap activity, which is odd), but what
the heck is in that swap space?  Very surprising.

Now, I'm sure some of the responses will be "Don't run 2.4.  If you want
to run a stable kernel run 2.2."  That may be reasonable, but there
are a couple of features and a couple of drivers that make 2.4 very
appealing, and somewhat necessary, to me.  Also, I want to help FIX
these problems.  I don't know if my hokey test is an indication of
something for real, but hopefully it's something that's simple enough
that a lot of people can run it and see if they experience similar
things.  

And, AFAIC, a truly stable kernel (like 2.2) should be able to go deep
into swap, and once the applications taking up the memory have exited,
be able to turn off that swap and not have something utterly surprising,
like the machine becoming comatose for several minutes, happen.  If it
does, that's an indication to me that there is something severely wrong.

Now, with that being said, is there anything I can do to help?  Run
experimental patches?  Try things on different machines?  I have access
to a number of different computers (all IA32) with widely varying memory
configurations and am willing to test patches to help get this working
correctly.

Or am I completely smoking crack and the fact that my machine hoses up
for several minutes after this very contrived test is only an indication
that the test is very contrived and in fact the kernel VM is perfectly
fine and this is totally expected behaviour and I just should never try
to "swapoff" a swap partition under 2.4 if I want my machine to behave
itself?

Please respond to me directly, as I'm not subscribed to the list.  I
have tried to keep current via archives in the last couple of weeks, but
with the PSI/C&W disconnect going down, it seems like I'm unable to
reach some of the online archives.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

