* Break 2.4 VM in five easy steps @ 2001-06-05 22:19 Derek Glidden 2001-06-05 23:38 ` Jeffrey W. Baker ` (8 more replies) 0 siblings, 9 replies; 106+ messages in thread From: Derek Glidden @ 2001-06-05 22:19 UTC (permalink / raw) To: linux-kernel After reading the messages to this list for the last couple of weeks and playing around on my machine, I'm convinced that the VM system in 2.4 is still severely broken. This isn't trying to test extreme low-memory pressure, just how the system handles recovering from going somewhat into swap, which is a real day-to-day problem for me, because I often run a couple of apps that most of the time live in RAM, but during heavy computation runs, can go a couple hundred megs into swap for a few minutes at a time. Whenever that happens, my machine always starts acting up afterwards, so I started investigating and found some really strange stuff going on. To demonstrate this to a co-worker, I cooked up this really simple, really stupid, very effective test. (Note that this is all probably specific to IA32, which is the platform on which I'm running.) -- How to Break your 2.4 kernel VM in 5 easy steps 1) compile the following code:

#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	/* allocate a buttload of memory and try to touch it all */
	void *ptr = calloc(100000000, sizeof(int));

	/* sleep for a bit to let the system quiesce */
	sleep(20);

	/* let it all go away now */
	free(ptr);
	return 0;
}

2) depending on the amount of RAM/swap available in your machine, you might need to adjust the calloc to allocate a different amount. This allocates about 400MB. 3) Run the program, or more than one copy at once. You want to put your machine somewhat into swap, but not totally overwhelmed. On the system I'm using to write this, with 512MB of RAM and 512MB of swap, I run two copies of this program simultaneously and it puts me a couple hundred megs into swap.
4) Let the program exit, run "free" or cat /proc/meminfo or something to make sure your machine has paged a bunch of stuff out into swap. 5) try to "swapoff" your swap partition and watch the machine become completely and entirely unresponsive for several minutes. -- If I do this on my machine, which is a K7-700 on an ASUS K7M motherboard with 512MB each of swap and RAM where I'm writing this (but I can make any machine running 2.4 behave the same way, and any version I've tried it with from 2.4.2 on up through most of the -ac kernels too), the machine will become _entirely_ unresponsive for several minutes. The HD comes on for a few seconds at the very start of the "swapoff", CPU utilization immediately pegs up to 100% system time, and then for a few minutes after, as far as anyone can tell, the machine is TOTALLY locked up. No console response, no response from anything on the machine. However, after a few minutes of TOTAL catatonia, it will mysteriously come back to life, having finally released all its swap. Now, this is a VERY contrived test, but there are a couple of things about doing this against 2.4 compared with 2.2 that seem VERY BROKEN to me. 1) Running this against a machine running a 2.2-series kernel does nothing out of the ordinary. You hit a bunch of swap, exit the "allocate" program, swapoff, and everything is fine after a few seconds of disk activity as it pages everything back into RAM. Least surprise. Under 2.4, when you "swapoff" it appears as far as anyone can tell that the machine has locked up completely. Very surprising. In fact, the first time it happened to me, I hit the Big Red Switch thinking the machine _had_ locked up. It wasn't until I started playing around with memory allocation a bit more and read some of the problems on LKML that I started to realize it wasn't locked up - just spinning.
2) Under 2.2, when the "allocate" programs exit, the amount of mem and swap that show up in the "used" column are quite small - about what you'd expect from all the apps that are actually running. No surprise there. Under 2.4, after running the "allocate" program, "free" shows about 200MB each under mem and swap as "used". A lot of memory shows up in the "cached" column, so that explains the mem usage (although not what's cached, unless it's caching swap activity, which is odd), but what the heck is in that swap space? Very surprising. Now, I'm sure some of the response will be "Don't run 2.4. If you want to run a stable kernel run 2.2." That may be reasonable, but there are a couple of features and a couple of drivers that make 2.4 very appealing, and somewhat necessary, to me. Also, I want to help FIX these problems. I don't know if my hokey test is an indication of something for real, but hopefully it's something that's simple enough that a lot of people can run it and see if they experience similar things. And, AFAIC, a truly stable kernel (like 2.2) should be able to go deep into swap, and once the applications taking up the memory have exited, be able to turn off that swap and not have something utterly surprising, like the machine becoming comatose for several minutes, happen. If it does, that's an indication to me that there is something severely wrong. Now, with that being said, is there anything I can do to help? Run experimental patches? Try things on different machines? I have access to a number of different computers (all IA32) with widely varying memory configurations and am willing to try test patches to try to get this working correctly.
Or am I completely smoking crack and the fact that my machine hoses up for several minutes after this very contrived test is only an indication that the test is very contrived and in fact the kernel VM is perfectly fine and this is totally expected behaviour and I just should never try to "swapoff" a swap partition under 2.4 if I want my machine to behave itself? Please respond to me directly, as I'm not subscribed to the list. I have tried to keep current via archives in the last couple of weeks, but with the PSI/C&W disconnect going down, it seems like I'm unable to reach some of the online archives. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden @ 2001-06-05 23:38 ` Jeffrey W. Baker 2001-06-06 1:42 ` Russell Leighton ` (3 more replies) [not found] ` <m2lmn61ceb.fsf@sympatico.ca> ` (7 subsequent siblings) 8 siblings, 4 replies; 106+ messages in thread From: Jeffrey W. Baker @ 2001-06-05 23:38 UTC (permalink / raw) To: Derek Glidden; +Cc: linux-kernel On Tue, 5 Jun 2001, Derek Glidden wrote: > > After reading the messages to this list for the last couple of weeks and > playing around on my machine, I'm convinced that the VM system in 2.4 is > still severely broken. > > This isn't trying to test extreme low-memory pressure, just how the > system handles recovering from going somewhat into swap, which is a real > day-to-day problem for me, because I often run a couple of apps that > most of the time live in RAM, but during heavy computation runs, can go > a couple hundred megs into swap for a few minutes at a time. Whenever > that happens, my machine always starts acting up afterwards, so I > started investigating and found some really strange stuff going on. I reboot each of my machines every week, to take them offline for intrusion detection. I use 2.4 because I need advanced features of iptables that ipchains lacks. Because the 2.4 VM is so broken, and because my machines are frequently deeply swapped, they can sometimes take over 30 minutes to shutdown. They hang of course when the shutdown rc script turns off the swap. The first few times this happened I assumed they were dead. So, unlike what certain people like to repeatedly claim, the 2.4 VM problems are causing havoc in the real world. -jwb ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-05 23:38 ` Jeffrey W. Baker @ 2001-06-06 1:42 ` Russell Leighton 2001-06-06 7:14 ` Sean Hunter 2001-06-06 2:16 ` Andrew Morton ` (2 subsequent siblings) 3 siblings, 1 reply; 106+ messages in thread From: Russell Leighton @ 2001-06-06 1:42 UTC (permalink / raw) To: linux-kernel I also need some 2.4 features and can't really go to 2.2. I would have to agree that the VM is too broken for production...looking forward to the work that (hopefully) will be in 2.4.6 to resolve these issues. "Jeffrey W. Baker" wrote: > On Tue, 5 Jun 2001, Derek Glidden wrote: > > > > > After reading the messages to this list for the last couple of weeks and > > playing around on my machine, I'm convinced that the VM system in 2.4 is > > still severely broken. > > > > This isn't trying to test extreme low-memory pressure, just how the > > system handles recovering from going somewhat into swap, which is a real > > day-to-day problem for me, because I often run a couple of apps that > > most of the time live in RAM, but during heavy computation runs, can go > > a couple hundred megs into swap for a few minutes at a time. Whenever > > that happens, my machine always starts acting up afterwards, so I > > started investigating and found some really strange stuff going on. > > I reboot each of my machines every week, to take them offline for > intrusion detection. I use 2.4 because I need advanced features of > iptables that ipchains lacks. Because the 2.4 VM is so broken, and > because my machines are frequently deeply swapped, they can sometimes take > over 30 minutes to shutdown. They hang of course when the shutdown rc > script turns off the swap. The first few times this happened I assumed > they were dead. > > So, unlike what certain people like to repeatedly claim, the 2.4 VM > problems are causing havoc in the real world.
> > -jwb > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- --------------------------------------------------- Russell Leighton russell.leighton@247media.com Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning. - Rich Cook --------------------------------------------------- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 1:42 ` Russell Leighton @ 2001-06-06 7:14 ` Sean Hunter 0 siblings, 0 replies; 106+ messages in thread From: Sean Hunter @ 2001-06-06 7:14 UTC (permalink / raw) To: Russell Leighton; +Cc: linux-kernel On Tue, Jun 05, 2001 at 09:42:26PM -0400, Russell Leighton wrote: > > I also need some 2.4 features and can't really goto 2.2. > I would have to agree that the VM is too broken for production...looking > forward to the work that (hopefully) will be in 2.4.6 to resolve these issues. > Boring to do a "me too", but "me too". We have four big production Oracle servers that could use 2.4. However, the test server we have put 2.4 on has no end of ridiculous VM and OOM problems. It seems bizarre that a 4GB machine with a working set _far_ lower than that should be dying from OOM and swapping itself to death, but that's life in 2.4 land. Sean ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-05 23:38 ` Jeffrey W. Baker 2001-06-06 1:42 ` Russell Leighton @ 2001-06-06 2:16 ` Andrew Morton 2001-06-06 3:19 ` Derek Glidden ` (5 more replies) 2001-06-06 7:47 ` Jonathan Morton 2001-06-06 13:08 ` Eric W. Biederman 3 siblings, 6 replies; 106+ messages in thread From: Andrew Morton @ 2001-06-06 2:16 UTC (permalink / raw) To: Jeffrey W. Baker; +Cc: Derek Glidden, linux-kernel "Jeffrey W. Baker" wrote: > > Because the 2.4 VM is so broken, and > because my machines are frequently deeply swapped, The swapoff algorithms in 2.2 and 2.4 are basically identical. The problem *appears* worse in 2.4 because it uses lots more swap. > they can sometimes take over 30 minutes to shutdown. Yes. The sys_swapoff() system call can take many minutes of CPU time. It basically does:

	for (each page in swap device) {
		for (each process) {
			for (each page used by this process)
				stuff
		}
	}

It's interesting that you've found a case where this actually has an operational impact. Haven't looked at it closely, but I think the algorithm could become something like:

	for (each process) {
		for (each page in this process) {
			if (page is on target swap device)
				get_it_off()
		}
	}
	for (each page in swap device) {
		if (it is busy)
			complain()
	}

That's 10^4 to 10^6 times faster. - ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 2:16 ` Andrew Morton @ 2001-06-06 3:19 ` Derek Glidden 2001-06-06 14:16 ` Disconnect [not found] ` <3B1DEAC7.43DEFA1C@idb.hist.no> 2001-06-06 4:03 ` Jeffrey W. Baker ` (4 subsequent siblings) 5 siblings, 2 replies; 106+ messages in thread From: Derek Glidden @ 2001-06-06 3:19 UTC (permalink / raw) To: Andrew Morton; +Cc: Jeffrey W. Baker, linux-kernel On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote: > "Jeffrey W. Baker" wrote: > > > > Because the 2.4 VM is so broken, and > > because my machines are frequently deeply swapped, > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > The problem *appears* worse in 2.4 because it uses lots > more swap. I disagree with the terminology you're using. It *is* worse in 2.4, period. If it only *appears* worse, then if I encounter a situation where a 2.2 box has utilized as much swap as a 2.4 box, I should see the same results. Yet this happens not to be the case. On a 2.2 box that's a hundred or more megs into swap, even when that swap space is "actual programs" rather than just mysteriously allocated swap (as with 2.4), swapoff only takes the time to page all that off disk back into RAM, making the machine a little sluggish while it's happening and definitely not making the machine entirely unresponsive. On the other hand, a 2.4 box that's a hundred or more megs into swap, even when there's nothing actually running to take up that swap space, i.e. it's just "mysteriously allocated for some reason" swap, will take several minutes, while the CPU is pegged, the drive is inactive, and the entire machine is completely unresponsive to anything - for all appearances locked up tight. I have been unable to make a box running the 2.2 kernel behave the same way as 2.4 does, even if I put the 2.2 kernel under severe memory pressure and the 2.4 kernel is entirely idle with no tasks but the most basic system processes running.
So from my perspective, which really is a real-world perspective because I can cause this same behaviour during normal day-to-day desktop operation of my machine, the behaviour of 2.4 *is* different from the behaviour of 2.2 when it comes to memory management. > > they can sometimes take over 30 minutes to shutdown. > > Yes. The sys_swapoff() system call can take many minutes > of CPU time. It basically does: > [...] > It's interesting that you've found a case where this > actually has an operational impact. I can't tell if this is humour or not. I hope it is, because I can sure show you actual operational impact of this mis-behaviour all day long. :) -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 3:19 ` Derek Glidden @ 2001-06-06 14:16 ` Disconnect [not found] ` <3B1DEAC7.43DEFA1C@idb.hist.no> 1 sibling, 0 replies; 106+ messages in thread From: Disconnect @ 2001-06-06 14:16 UTC (permalink / raw) To: linux-kernel On Tue, 05 Jun 2001, Derek Glidden did have cause to say: > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > > The problem *appears* worse in 2.4 because it uses lots > > more swap. > > I disagree with the terminology you're using. It *is* worse in 2.4, > period. If it only *appears* worse, then if I encounter a situation > where a 2.2 box has utilized as much swap as a 2.4 box, I should see the > same results. Yet this happens not to be the case. Ditto here - my box (1.2g tbird, 512M ram, 128M+128M swap, mixed scsi/ide) does the same on swapoff -- 2.2.16 can be 100 megs or more into swap, and it gets sluggish for a bit and then is fine. 2.4.[123] can be only 10 megs into swap and it basically hardlocks for about 5-10 minutes. --- -----BEGIN GEEK CODE BLOCK----- Version: 3.1 [www.ebb.org/ungeek] GIT/CC/CM/AT d--(-)@ s+:-- a-->? C++++$ ULBS*++++$ P- L+++>+++++ E--- W+++ N+@ o+>$ K? w--->+++++ O- M V-- PS+() PE Y+@ PGP++() t 5--- X-- R tv+@ b++++>$ DI++++ D++(+++) G++ e* h(-)* r++ y++ ------END GEEK CODE BLOCK------ ^ permalink raw reply [flat|nested] 106+ messages in thread
[parent not found: <3B1DEAC7.43DEFA1C@idb.hist.no>]
* Re: Break 2.4 VM in five easy steps [not found] ` <3B1DEAC7.43DEFA1C@idb.hist.no> @ 2001-06-06 14:51 ` Derek Glidden 2001-06-06 21:34 ` Alan Cox 2001-06-07 7:23 ` Helge Hafting 0 siblings, 2 replies; 106+ messages in thread From: Derek Glidden @ 2001-06-06 14:51 UTC (permalink / raw) To: Helge Hafting, linux-kernel Helge Hafting wrote: > > The drive is inactive because it isn't needed, the machine is > running loops on data in memory. And it is unresponsive because > nothing else is scheduled, maybe "swapoff" is easier to implement I don't quite get what you're saying. If the system becomes unresponsive because the VM swap recovery parts of the kernel are interfering with the kernel scheduler, then that's also bad because there absolutely *are* other processes that should be getting time, like the console windows/shells at which I'm logged in. If they aren't getting it specifically because the VM is preventing them from receiving execution time, then that's another bug. > when processes cannot try to allocate more or touch pages > while it runs. "swapoff" isn't something you normally do often, > so it don't have to be nice. I'm not familiar enough with the swapping bits of the kernel code, so I could be totally wrong, but turning off a swap file/partition should just call the same parts of the VM subsystem that would normally try to recover swap space under memory pressure. Using "swapoff" to force this behaviour should just force it to happen manually rather than when memory pressure is high enough. Which means that if that's the normal behaviour of the VM subsystem when memory pressure gets high and it needs to recover unused pages from swap - i.e. the machine stops running - then that's still very broken behaviour, no matter what instigated the occurrence. > Still, I find it strange that swapoff should take much more time, > even if you can get 2.2 to have the same amount in swap. So do I. Hence the original report.
-- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- #!/usr/bin/perl -w $_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map {$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110; $t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z) [$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join "",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d= unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d >>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q* 8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]} print+x"C*",@a}';s/x/pack+/g;eval usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \ | extract_mpeg2 | mpeg2dec - http://www.eff.org/ http://www.opendvd.org/ http://www.cs.cmu.edu/~dst/DeCSS/Gallery/ ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 14:51 ` Derek Glidden @ 2001-06-06 21:34 ` Alan Cox 2001-06-09 8:07 ` Rik van Riel 2001-06-07 7:23 ` Helge Hafting 1 sibling, 1 reply; 106+ messages in thread From: Alan Cox @ 2001-06-06 21:34 UTC (permalink / raw) To: Derek Glidden; +Cc: Helge Hafting, linux-kernel > interfering with the kernel scheduler then that's also bad because there > absolutely *are* other processes that should be getting time, like the > console windows/shells at which I'm logged in. If they aren't getting > it specifically because the VM is preventing them from receiving > execution time, then that's another bug. It's in fact very important that the VM interferes with scheduling. When a task is a heavy generator of dirty pages, it has to be throttled to get fair use of disk bandwidth and memory. Similarly, it's desirable as paging rates increase to ensure that everything gets some running time to make progress, even at the cost of interactivity. This is something BSD does that we don't. Arguably, nowadays it's reasonable to claim you should have enough RAM to avoid the total thrash state that BSD handles this way, of course ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 21:34 ` Alan Cox @ 2001-06-09 8:07 ` Rik van Riel 0 siblings, 0 replies; 106+ messages in thread From: Rik van Riel @ 2001-06-09 8:07 UTC (permalink / raw) To: Alan Cox; +Cc: Derek Glidden, Helge Hafting, linux-kernel On Wed, 6 Jun 2001, Alan Cox wrote: > Similarly its desirable as paging rates increase to ensure that > everything gets some running time to make progress even at the cost of > interactivity. This is something BSD does that we don't. Arguably > nowdays its reasonable to claim you should have enough ram to avoid > the total thrash state that BSD handles this way o course During last week's holidays I've started working on some load control code for Linux. The basic mechanisms are working, the only problem is that it doesn't actually prevent thrashing yet ;) http://www.surriel.com/patches/2.4/2.4.5-ac5-swapper regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to aardvark@nl.linux.org (spam digging piggy) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 14:51 ` Derek Glidden 2001-06-06 21:34 ` Alan Cox @ 2001-06-07 7:23 ` Helge Hafting 2001-06-07 16:56 ` Eric W. Biederman 2001-06-07 20:24 ` José Luis Domingo López 1 sibling, 2 replies; 106+ messages in thread From: Helge Hafting @ 2001-06-07 7:23 UTC (permalink / raw) To: Derek Glidden, linux-kernel Derek Glidden wrote: > > Helge Hafting wrote: > > > > The drive is inactive because it isn't needed, the machine is > > running loops on data in memory. And it is unresponsive because > > nothing else is scheduled, maybe "swapoff" is easier to implement > > I don't quite get what you're saying. If the system becomes > unresponsive because the VM swap recovery parts of the kernel are > interfering with the kernel scheduler then that's also bad because there > absolutely *are* other processes that should be getting time, like the > console windows/shells at which I'm logged in. If they aren't getting > it specifically because the VM is preventing them from receiving > execution time, then that's another bug. > Sure. The kernel doing a big job without scheduling anything is a problem. > I'm not familiar enough with the swapping bits of the kernel code, so I > could be totally wrong, but turning off a swap file/partition should > just call the same parts of the VM subsystem that would normally try to > recover swap space under memory pressure. A problem with this is that normal paging-in is allowed to page other things out as well. But you can't have that when swap is about to be turned off. My guess is that swapoff functionality was perceived to be so seldom used that they didn't bother too much with scheduling or efficiency. I don't have the same problem myself though. Shutting down with 30M or so in swap never takes unusual time on 2.4.x kernels here, with a 300MHz processor. I did a test while typing this letter, almost filling the 96M swap partition with 88M. swapoff took 1 minute at 100% CPU.
This is long, but the machine was responsive most of that time, i.e. no worse than during a kernel compile. The machine froze 10 seconds or so at the end of the minute; I can imagine that biting with bigger swap. Helge Hafting ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 7:23 ` Helge Hafting @ 2001-06-07 16:56 ` Eric W. Biederman 2001-06-07 20:24 ` José Luis Domingo López 1 sibling, 0 replies; 106+ messages in thread From: Eric W. Biederman @ 2001-06-07 16:56 UTC (permalink / raw) To: Helge Hafting; +Cc: Derek Glidden, linux-kernel Helge Hafting <helgehaf@idb.hist.no> writes: > A problem with this is that normal paging-in is allowed to page other > things out as well. But you can't have that when swap is about to > be turned off. My guess is that swapoff functionality was perceived to > be so seldom used that they didn't bother too much with scheduling > or efficiency. There is some truth in that. You aren't allowed to allocate new pages in the swap space currently being removed, however. The current swap off code removes pages from the current swap space without breaking any sharing between swap pages. Depending on your load this may be important. Fixing swapoff to be more efficient while at the same time keeping sharing between pages is tricky. The fact that swapoff never sleeps, under loads that are easy to trigger in 2.4, is a big bug. > I don't have the same problem myself though. Shutting down with > 30M or so in swap never take unusual time on 2.4.x kernels here, > with a 300MHz processor. I did a test while typing this letter, > almost filling the 96M swap partition with 88M. swapoff > took 1 minute at 100% cpu. This is long, but the machine was responsive > most of that time. I.e. no worse than during a kernel compile. > The machine froze 10 seconds or so at the end of the minute, I can > imagine that biting with bigger swap. O.k., so at some point you actually wait for I/O and other processes get a chance to run. On the larger machines we never wait for I/O and thus never schedule at all. The problem is now understood. Now we just need to fix it. Eric ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 7:23 ` Helge Hafting 2001-06-07 16:56 ` Eric W. Biederman @ 2001-06-07 20:24 ` José Luis Domingo López 1 sibling, 0 replies; 106+ messages in thread From: José Luis Domingo López @ 2001-06-07 20:24 UTC (permalink / raw) To: linux-kernel On Thursday, 07 June 2001, at 09:23:42 +0200, Helge Hafting wrote: > Derek Glidden wrote: > > > > Helge Hafting wrote: > [...] > The machine froze 10 seconds or so at the end of the minute, I can > imagine that biting with bigger swap. > Same behavior here with a Pentium III 600, 128 MB RAM and 128 MB of swap. Filled mem and swap with the infamous glob() "bug" (ls ../*/.. etc.), ran swapoff, and the machine stayed very responsive except for the last 10-15 seconds before swapoff finished. Even scrolling complex pages with Mozilla 0.9 worked smoothly :). -- José Luis Domingo López Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM) jdomingo EN internautas PUNTO org => ¿ Spam ? Atente a las consecuencias jdomingo AT internautas DOT org => Spam at your own risk ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 2:16 ` Andrew Morton 2001-06-06 3:19 ` Derek Glidden @ 2001-06-06 4:03 ` Jeffrey W. Baker 2001-06-06 8:19 ` Xavier Bestel ` (3 subsequent siblings) 5 siblings, 0 replies; 106+ messages in thread From: Jeffrey W. Baker @ 2001-06-06 4:03 UTC (permalink / raw) To: Andrew Morton; +Cc: linux-kernel On Wed, 6 Jun 2001, Andrew Morton wrote: > "Jeffrey W. Baker" wrote: > > > > Because the 2.4 VM is so broken, and > > because my machines are frequently deeply swapped, > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > The problem *appears* worse in 2.4 because it uses lots > more swap. > > > they can sometimes take over 30 minutes to shutdown. > > Yes. The sys_swapoff() system call can take many minutes > of CPU time. It basically does: > > for (each page in swap device) { > for (each process) { > for (each page used by this process) > stuff Sure, and at shutdown time when swapoff is called, there is only 1 process, init, which isn't swapped out anymore. So this should run like lightning. Repeat: something is horribly wrong with the VM's management of pages, lists, swap, cache, etc. -jwb ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 2:16 ` Andrew Morton 2001-06-06 3:19 ` Derek Glidden 2001-06-06 4:03 ` Jeffrey W. Baker @ 2001-06-06 8:19 ` Xavier Bestel 2001-06-06 8:54 ` Sean Hunter ` (4 more replies) 2001-06-06 13:32 ` Eric W. Biederman ` (2 subsequent siblings) 5 siblings, 5 replies; 106+ messages in thread From: Xavier Bestel @ 2001-06-06 8:19 UTC (permalink / raw) To: Derek Glidden; +Cc: Andrew Morton, Jeffrey W. Baker, linux-kernel On 05 Jun 2001 23:19:08 -0400, Derek Glidden wrote: > On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote: > > "Jeffrey W. Baker" wrote: > > > > > > Because the 2.4 VM is so broken, and > > > because my machines are frequently deeply swapped, > > > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > > The problem *appears* worse in 2.4 because it uses lots > > more swap. > > I disagree with the terminology you're using. It *is* worse in 2.4, > period. If it only *appears* worse, then if I encounter a situation > where a 2.2 box has utilized as much swap as a 2.4 box, I should see the > same results. Yet this happens not to be the case. Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M swapfile to your box) This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that anything less won't do any good: 2.4 overallocates swap even if it doesn't use it all. So in your case you just have enough swap to map your RAM, and nothing to really swap your apps. Xav ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:19 ` Xavier Bestel @ 2001-06-06 8:54 ` Sean Hunter 2001-06-06 9:57 ` Dr S.M. Huen ` (5 more replies) 2001-06-06 9:16 ` Xavier Bestel ` (3 subsequent siblings) 4 siblings, 6 replies; 106+ messages in thread From: Sean Hunter @ 2001-06-06 8:54 UTC (permalink / raw) To: Xavier Bestel; +Cc: linux-kernel On Wed, Jun 06, 2001 at 10:19:30AM +0200, Xavier Bestel wrote: > On 05 Jun 2001 23:19:08 -0400, Derek Glidden wrote: > > On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote: > > > "Jeffrey W. Baker" wrote: > > > > > > > > Because the 2.4 VM is so broken, and > > > > because my machines are frequently deeply swapped, > > > > > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > > > The problem *appears* worse in 2.4 because it uses lots > > > more swap. > > > > I disagree with the terminology you're using. It *is* worse in 2.4, > > period. If it only *appears* worse, then if I encounter a situation > > where a 2.2 box has utilized as much swap as a 2.4 box, I should see the > > same results. Yet this happens not to be the case. > > Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M > swapfile to your box) > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that > anything less won't do any good: 2.4 overallocates swap even if it > doesn't use it all. So in your case you just have enough swap to map > your RAM, and nothing to really swap your apps. > For large memory boxes, this is ridiculous. Should I have 8GB of swap? Sean ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:54 ` Sean Hunter @ 2001-06-06 9:57 ` Dr S.M. Huen 2001-06-06 10:06 ` DBs (ML) ` (6 more replies) 2001-06-06 10:04 ` Jonathan Morton ` (4 subsequent siblings) 5 siblings, 7 replies; 106+ messages in thread From: Dr S.M. Huen @ 2001-06-06 9:57 UTC (permalink / raw) To: Sean Hunter; +Cc: Xavier Bestel, linux-kernel On Wed, 6 Jun 2001, Sean Hunter wrote: > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > Do I understand you correctly? ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB drives. It will cost you 19x as much to put the RAM in as to put the developer's recommended amount of swap space to back up that RAM. The developers gave their reasons for this design some time ago, and if the ONLY problem was that it required you to allocate more swap, why should it be a priority item to fix it for those that refuse to do so? By all means fix it urgently where it doesn't work when used as advised, but demanding priority for fixing a problem encountered when a user refuses to use it in the manner specified seems very unreasonable. If you can afford 4GB RAM, you certainly can afford 8GB swap.
* RE: Break 2.4 VM in five easy steps 2001-06-06 9:57 ` Dr S.M. Huen @ 2001-06-06 10:06 ` DBs (ML) 2001-06-06 10:08 ` Vivek Dasmohapatra ` (5 subsequent siblings) 6 siblings, 0 replies; 106+ messages in thread From: DBs (ML) @ 2001-06-06 10:06 UTC (permalink / raw) To: linux-kernel What happens if the box has no spare disk capacity and you cannot add any more spindles? Then what? Upgrade the whole disk subsystem just to cater for this issue? That would turn out to be a bit more expensive in both money and downtime/labour costs. It really annoys me when people just say "Add more of this then....". Best regards Antonio Covelli > -----Original Message----- > From: linux-kernel-owner@vger.kernel.org > [mailto:linux-kernel-owner@vger.kernel.org]On Behalf Of Dr S.M. Huen > Sent: Wednesday, June 06, 2001 10:58 AM > To: Sean Hunter > Cc: Xavier Bestel; linux-kernel@vger.kernel.org > Subject: Re: Break 2.4 VM in five easy steps > > > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > > > Do I understand you correctly? > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > drives. > > It will cost you 19x as much to put the RAM in as to put the > developer's recommended amount of swap space to back up that RAM. The > developers gave their reasons for this design some time ago and if the > ONLY problem was that it required you to allocate more swap, why should > it be a priority item to fix it for those that refuse to do so? By all > means fix it urgently where it doesn't work when used as advised but > demanding priority to fixing a problem encountered when a user refuses to > use it in the manner specified seems very unreasonable. If you can afford > 4GB RAM, you certainly can afford 8GB swap. 
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 9:57 ` Dr S.M. Huen 2001-06-06 10:06 ` DBs (ML) @ 2001-06-06 10:08 ` Vivek Dasmohapatra 2001-06-06 10:19 ` Lauri Tischler 2001-06-06 10:22 ` Sean Hunter ` (4 subsequent siblings) 6 siblings, 1 reply; 106+ messages in thread From: Vivek Dasmohapatra @ 2001-06-06 10:08 UTC (permalink / raw) To: Dr S.M. Huen; +Cc: Linux Kernel Mailing List On Wed, 6 Jun 2001, Dr S.M. Huen wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > > > Do I understand you correctly? > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > drives. Not the point. It is an absolute pig to have to allocate extra swap just because extra memory was added. You might not have a bay free. You might not have the space knocking around to allocate as swap. It's not about the money, it's about adaptability. 2.2 was perfectly happy before, why this giant leap backwards? If I quadruple the memory in my laptop to 512Mb, do I have to carve up my partitions just to get an extra 768Mb of swap? Or must I turn off swap completely? What if you are working on a device where everything is at a premium, both permanent storage and memory, but you do have a little to spare as swap? It just seems like an overly onerous restriction. -- The time for action is past! Now is the time for senseless bickering. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 10:08 ` Vivek Dasmohapatra @ 2001-06-06 10:19 ` Lauri Tischler 0 siblings, 0 replies; 106+ messages in thread From: Lauri Tischler @ 2001-06-06 10:19 UTC (permalink / raw) To: Linux Kernel Mailing List Vivek Dasmohapatra wrote: > > On Wed, 6 Jun 2001, Dr S.M. Huen wrote: > > > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > > > > > > Do I understand you correctly? > > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > > drives. > > Not the point. It is an absolute pig to have to allocate extra swap just > because extra memory was added. Not to mention that some people stuff their machines with memory just to avoid using swap at all. -- Lauri Tischler, Network Admin Tel: +358-9-47846331 * Mouse movement detected * Fax: +358-9-47846500 * Reboot Windows to activate changes * Mobile: +358-40-5569010 EMail: lauri.tischler@efore.fi ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 9:57 ` Dr S.M. Huen 2001-06-06 10:06 ` DBs (ML) 2001-06-06 10:08 ` Vivek Dasmohapatra @ 2001-06-06 10:22 ` Sean Hunter 2001-06-06 10:48 ` Alexander Viro ` (2 more replies) 2001-06-06 16:47 ` dean gaudet ` (3 subsequent siblings) 6 siblings, 3 replies; 106+ messages in thread From: Sean Hunter @ 2001-06-06 10:22 UTC (permalink / raw) To: Dr S.M. Huen; +Cc: linux-kernel On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > > > Do I understand you correctly? > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > drives. > > It will cost you 19x as much to put the RAM in as to put the > developer's recommended amount of swap space to back up that RAM. The > developers gave their reasons for this design some time ago and if the > ONLY problem was that it required you to allocate more swap, why should > it be a priority item to fix it for those that refuse to do so? By all > means fix it urgently where it doesn't work when used as advised but > demanding priority to fixing a problem encountered when a user refuses to > use it in the manner specified seems very unreasonable. If you can afford > 4GB RAM, you certainly can afford 8GB swap. > This is completely bogus. I am not saying that I can't afford the swap. What I am saying is that it is completely broken to require this amount of swap given the boundaries of efficient use. This is only one of several things which make the 2.4 VM suck for large, small or medium machines at the moment. Until we have a working VM 2.4 can't possibly go into production on my site on these machines. 
A working VM would have several differences from what we have in my opinion, among which are:

 - It wouldn't require 8GB of swap on my large boxes
 - It wouldn't suffer from the "bounce buffer" bug on my large boxes
 - It wouldn't cause the disk drive on my laptop to be _constantly_ in use even when all I have done is spawned a shell session and have no large apps or daemons running.
 - It wouldn't kill things saying it was OOM unless it was OOM.

Furthermore, I am not demanding anything, much less "priority fixing" for this bug. It's my personal opinion that this is the most critical bug in the 2.4 series, and if I had the time and skill, this is what I would be working on. Because I don't have the time and skill, I am perfectly happy to wait until those that do fix the problem. To say it isn't a problem because I can buy more disk is nonsense, and it's that sort of thinking that leads to the constant need to upgrade hardware in the proprietary OS world. Sean ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 10:22 ` Sean Hunter @ 2001-06-06 10:48 ` Alexander Viro 2001-06-06 16:58 ` dean gaudet 2001-06-06 17:10 ` Remi Turk 2001-06-06 22:44 ` Kai Henningsen 2001-06-09 7:17 ` Rik van Riel 2 siblings, 2 replies; 106+ messages in thread From: Alexander Viro @ 2001-06-06 10:48 UTC (permalink / raw) To: Sean Hunter; +Cc: Dr S.M. Huen, linux-kernel On Wed, 6 Jun 2001, Sean Hunter wrote: > This is completely bogus. I am not saying that I can't afford the swap. > What I am saying is that it is completely broken to require this amount > of swap given the boundaries of efficient use. Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD systems I've used were broken, but I've never thought that swap==2*RAM rule was one of them. Not that being more kind on swap would be a bad thing, but that rule for amount of swap is pretty common. ISTR similar for (very old) SCO, so it's not just BSD world. How are modern Missed'em'V variants in that respect, BTW? ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 10:48 ` Alexander Viro @ 2001-06-06 16:58 ` dean gaudet 2001-06-06 17:10 ` Remi Turk 1 sibling, 0 replies; 106+ messages in thread From: dean gaudet @ 2001-06-06 16:58 UTC (permalink / raw) To: Alexander Viro; +Cc: Sean Hunter, Dr S.M. Huen, linux-kernel On Wed, 6 Jun 2001, Alexander Viro wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > This is completely bogus. I am not saying that I can't afford the swap. > > What I am saying is that it is completely broken to require this amount > > of swap given the boundaries of efficient use. > > Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD > systems I've used were broken, but I've never thought that swap==2*RAM rule > was one of them. > > Not that being more kind on swap would be a bad thing, but that rule for > amount of swap is pretty common. ISTR similar for (very old) SCO, so it's > not just BSD world. How are modern Missed'em'V variants in that respect, BTW? frequently when building out a solaris web farm you have to just bite it and throw away half your disk for swap that will never be used. it's got pessimistic memory allocation by default. you can do something with mmap() to get an optimistic allocation, but i didn't trust making this change to apache when i was involved with a farm like this... i didn't want to be debugging any potential low memory problems. -dean ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 10:48 ` Alexander Viro 2001-06-06 16:58 ` dean gaudet @ 2001-06-06 17:10 ` Remi Turk 1 sibling, 0 replies; 106+ messages in thread From: Remi Turk @ 2001-06-06 17:10 UTC (permalink / raw) To: Alexander Viro; +Cc: linux-kernel On Wed, Jun 06, 2001 at 06:48:32AM -0400, Alexander Viro wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > This is completely bogus. I am not saying that I can't afford the swap. > > What I am saying is that it is completely broken to require this amount > > of swap given the boundaries of efficient use. > > Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD > systems I've used were broken, but I've never thought that swap==2*RAM rule > was one of them. > > Not that being more kind on swap would be a bad thing, but that rule for > amount of swap is pretty common. ISTR similar for (very old) SCO, so it's > not just BSD world. How are modern Missed'em'V variants in that respect, BTW? Although I don't have any swap-trouble myself, what I think most people are having problems with is not that Linux doesn't have the "you-dont-need-2xRAM-size-swap-if-you-swap-at-all feature", but that it lost it in 2.4. -- Linux 2.4.5-ac9 #5 Wed Jun 6 18:30:24 CEST 2001 ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 10:22 ` Sean Hunter 2001-06-06 10:48 ` Alexander Viro @ 2001-06-06 22:44 ` Kai Henningsen 2001-06-09 7:17 ` Rik van Riel 2 siblings, 0 replies; 106+ messages in thread From: Kai Henningsen @ 2001-06-06 22:44 UTC (permalink / raw) To: linux-kernel viro@math.psu.edu (Alexander Viro) wrote on 06.06.01 in <Pine.GSO.4.21.0106060637580.7264-100000@weyl.math.psu.edu>: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > This is completely bogus. I am not saying that I can't afford the swap. > > What I am saying is that it is completely broken to require this amount > > of swap given the boundaries of efficient use. > > Funny. I can count many ways in which 4.3BSD, SunOS{3,4} and post-4.4 BSD > systems I've used were broken, but I've never thought that swap==2*RAM rule > was one of them. As a "will break without" rule, I'd consider a kernel with that property completely unsuitable for production use. I certainly don't remember thinking of that as more than a recommendation back when I used commercial Unices (SysVsomething). MfG Kai ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 10:22 ` Sean Hunter 2001-06-06 10:48 ` Alexander Viro 2001-06-06 22:44 ` Kai Henningsen @ 2001-06-09 7:17 ` Rik van Riel 2 siblings, 0 replies; 106+ messages in thread From: Rik van Riel @ 2001-06-09 7:17 UTC (permalink / raw) To: Sean Hunter; +Cc: Dr S.M. Huen, linux-kernel On Wed, 6 Jun 2001, Sean Hunter wrote: > A working VM would have several differences from what we have in my > opinion, among which are: > - It wouldn't require 8GB of swap on my large boxes > - It wouldn't suffer from the "bounce buffer" bug on my > large boxes > - It wouldn't cause the disk drive on my laptop to be > _constantly_ in use even when all I have done is spawned a > shell session and have no large apps or daemons running. > - It wouldn't kill things saying it was OOM unless it was OOM. I fully agree these problems need to be fixed. I just wish I had the time to tackle all of them right now ;) We should be close to getting the 3rd problem fixed and the deadlock problem with the bounce buffers seems to be fixed already. Getting reclaiming of swap space and OOM fixed is a matter of time ... I hope I'll have that time in the near future. regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to aardvark@nl.linux.org (spam digging piggy) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 9:57 ` Dr S.M. Huen ` (2 preceding siblings ...) 2001-06-06 10:22 ` Sean Hunter @ 2001-06-06 16:47 ` dean gaudet 2001-06-06 17:17 ` Kurt Roeckx ` (2 subsequent siblings) 6 siblings, 0 replies; 106+ messages in thread From: dean gaudet @ 2001-06-06 16:47 UTC (permalink / raw) To: Dr S.M. Huen; +Cc: Sean Hunter, Xavier Bestel, linux-kernel On Wed, 6 Jun 2001, Dr S.M. Huen wrote: > If you can afford 4GB RAM, you certainly can afford 8GB swap. this is a completely crap argument. you should study the economics of managing a farm of thousands of machines some day. when you do this, you'll also learn to consider the power requirements (8W+ per 3.5" disk) which you need to bring to each rack, supply backup UPS/generator power for, and exhaust through your air conditioning for each of these useless swap disks. plus you'll also learn to consider the wages for the unlucky person who has to go around to every box in a farm, open it up, and install another disk. plus you'll learn that the time this person spent installing new disks wasn't spent installing new systems, which means you couldn't bring as many customers on line this month, which means you may not make revenue targets. plus you'll learn that every time you open a box that's been in production for a while, there's a small, but noticeable, chance that it won't reboot. so your normal monthly failure rate will go from the 2% range up to the 5% range. -dean ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 9:57 ` Dr S.M. Huen ` (3 preceding siblings ...) 2001-06-06 16:47 ` dean gaudet @ 2001-06-06 17:17 ` Kurt Roeckx 2001-06-06 18:35 ` Dr S.M. Huen 2001-06-07 0:20 ` Mike A. Harris 2001-06-07 21:31 ` Shane Nay 6 siblings, 1 reply; 106+ messages in thread From: Kurt Roeckx @ 2001-06-06 17:17 UTC (permalink / raw) To: Dr S.M. Huen; +Cc: Sean Hunter, Xavier Bestel, linux-kernel On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > > > Do I understand you correctly? > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > drives. Maybe you really should reread the statements people made about this before. One of them being that if you're not using swap in 2.2, you won't need any in 2.4 either. 2.4 will use more swap in case it does use it. It now works more like other UNIX variants where the rule is that swap = 2 * RAM. That swap = 2 * RAM is just a guideline; you really should look at what applications you run, and how much memory they use. If you choose your RAM so that all applications can always be in memory at all times, there is no need for swap. If they can't be, the rule might help you. I think someone said that the swap should be large enough to hold all applications that are running, in case you want to use swap. Disk may be a lot cheaper than RAM, but it's also a lot slower. Kurt ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 17:17 ` Kurt Roeckx @ 2001-06-06 18:35 ` Dr S.M. Huen 2001-06-06 18:40 ` Mark Salisbury 0 siblings, 1 reply; 106+ messages in thread From: Dr S.M. Huen @ 2001-06-06 18:35 UTC (permalink / raw) To: Kurt Roeckx; +Cc: Sean Hunter, Xavier Bestel, linux-kernel On Wed, 6 Jun 2001, Kurt Roeckx wrote: > On Wed, Jun 06, 2001 at 10:57:57AM +0100, Dr S.M. Huen wrote: > > On Wed, 6 Jun 2001, Sean Hunter wrote: > > > > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > > > > > > Do I understand you correctly? > > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > > drives. > > Maybe you really should reread the statements people made about > this before. > I think you might do with a more careful quoting or reading of the thread yourself before casting such aspersions. I did not recommend swap use. I argued that it was not reasonable to reject a 2*RAM swap requirement on cost grounds. There are those who do not think this argument adequate because of grounds other than hardware cost (e.g. retrofitting existing farms, laptops with zillions of OSes etc.) > > That swap = 2 * RAM is just a guideline, you really should look > at what applications you run, and how memory they use. If you > choise your RAM so that all application can always be in memory > at all time, there is no need for swap. If they can't be, the > rule might help you. > I think the whole argument of the thread is against you here. It seems that if you do NOT provide 2*RAM you get into trouble much earlier than you expect (a few argue that even if you do you get trouble). If it were just a guideline that gracefully degraded your performance the other lot wouldn't be screaming. 
The whole screaming match is about whether a drastic degradation when using swap with less than the 2*RAM specified by the developers should lead one to conclude that a kernel is "broken". To conclude, this is not a hypothetical argument about whether to operate completely in core. There's not a person on LKML who doesn't know running in RAM is better than running from swap. It is one where users do swap but allocate a size smaller than that recommended and are adversely affected. It is about whether a kernel that reacts this way could be regarded as stable. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 18:35 ` Dr S.M. Huen @ 2001-06-06 18:40 ` Mark Salisbury 0 siblings, 0 replies; 106+ messages in thread From: Mark Salisbury @ 2001-06-06 18:40 UTC (permalink / raw) To: Dr S.M. Huen, Dr S.M. Huen, Kurt Roeckx Cc: Sean Hunter, Xavier Bestel, linux-kernel On Wed, 06 Jun 2001, Dr S.M. Huen wrote: > The whole screaming match is about whether a drastic degradation on using > swap with less than the 2*RAM swap specified by the developers should lead > one to conclude that a kernel is "broken". I would argue that any system that performs substantially worse with swap==1xRAM than a system with swap==0xRAM is fundamentally broken. it seems that w/ today's 2.4.x kernel, people running programs totalling LESS THAN their physical dram are having swap problems. they should not even be using 1 byte of swap. the whole point of swapping pages is to give you more memory to execute programs. if I want to execute 140MB of programs+kernel on a system with 128 MB of ram, I should be able to do the job effectively with ANY amount of "total memory" exceeding 140MB. not some hokey 128MB RAM + 256MB swap just because the kernel is too fscked up to deal with a small swap file. -- /*------------------------------------------------** ** Mark Salisbury | Mercury Computer Systems ** **------------------------------------------------*/ ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 9:57 ` Dr S.M. Huen ` (4 preceding siblings ...) 2001-06-06 17:17 ` Kurt Roeckx @ 2001-06-07 0:20 ` Mike A. Harris 2001-06-09 8:16 ` Rik van Riel 2001-06-07 21:31 ` Shane Nay 6 siblings, 1 reply; 106+ messages in thread From: Mike A. Harris @ 2001-06-07 0:20 UTC (permalink / raw) To: Dr S.M. Huen; +Cc: Sean Hunter, Xavier Bestel, linux-kernel On Wed, 6 Jun 2001, Dr S.M. Huen wrote: >> For large memory boxes, this is ridiculous. Should I have 8GB of swap? >> > >Do I understand you correctly? >ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even >at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB >drives. Linux is all about technical correctness, and doing the job properly. It isn't about "there is a bug in the kernel, but that is ok because an 8Gb swapfile only costs $2". Why are half the people here trying to hide behind this "disk space is cheap" argument? If we rely on that, then Linux sucks shit. The problem IMHO is widely acknowledged by those who matter as an official BUG, and that is that. It is also acknowledged widely by those who can fix the problem that it will be fixed in time. So technically speaking - the kernel has a widely known bug/misfeature, which is acknowledged by core kernel developers as needing fixing, and which will get fixed at some point. Saying it is a nonissue due to the cost of hardware resources is just plain Microsoft attitude and holds absolutely zero technical merit. It *IS* an issue, because it is making Linux suck, and is causing REAL WORLD PROBLEMS. The "use 2x RAM" rule is nothing more than a bandaid workaround, so don't claim that it is the proper fix due to big wallet size. I have 2.2 doing a software build that takes 40 minutes with 256Mb of RAM, and 1G of swap. The same build on 2.4 takes 60 minutes. That is 4x RAM for swap. 
Lowering the swap down to 2x RAM makes no difference in the numbers, down to 1x RAM the 2.4 build slows down horrendously, and dropping the swap to 20Mb makes it die completely in 2.4. 2.4 is fine for a firewall, or certain other applications, but regardless of the amount of SWAP, I'll take the 40-minute build using 2.2 over the 60-minute build using 2.4 any day. This is the real world. And no, cost isn't an issue to me. Putting another 80Gb drive in this box for swap isn't going to help the work get done any faster. ---------------------------------------------------------------------- Mike A. Harris - Linux advocate - Open Source advocate Opinions and viewpoints expressed are solely my own. ---------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 0:20 ` Mike A. Harris @ 2001-06-09 8:16 ` Rik van Riel 2001-06-09 8:57 ` Mike A. Harris 0 siblings, 1 reply; 106+ messages in thread From: Rik van Riel @ 2001-06-09 8:16 UTC (permalink / raw) To: Mike A. Harris; +Cc: Dr S.M. Huen, Sean Hunter, Xavier Bestel, linux-kernel On Wed, 6 Jun 2001, Mike A. Harris wrote: > Why are half the people here trying to hide behind this diskspace > is cheap argument? If we rely on that, then Linux sucks shit. Never mind them, I haven't seen any of them contribute VM code, even ;) OTOH, disk space _is_ cheap, so the other, performance-related, VM bugs do have a somewhat higher priority at the moment. regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to aardvark@nl.linux.org (spam digging piggy) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-09 8:16 ` Rik van Riel @ 2001-06-09 8:57 ` Mike A. Harris 0 siblings, 0 replies; 106+ messages in thread From: Mike A. Harris @ 2001-06-09 8:57 UTC (permalink / raw) To: Rik van Riel; +Cc: Dr S.M. Huen, Sean Hunter, Xavier Bestel, linux-kernel On Sat, 9 Jun 2001, Rik van Riel wrote: >> Why are half the people here trying to hide behind this diskspace >> is cheap argument? If we rely on that, then Linux sucks shit. > >Never mind them, I haven't seen any of them contribute >VM code, even ;) Nor have I, but I think you guys working on it will get it cleaned up eventually. What bugs me is people trying to pretend that it isn't important to fix, or that spending money to get newer hardware is an acceptable solution. >OTOH, disk space _is_ cheap, so the other VM - performance >related - VM bugs do have a somewhat higher priority at the >moment. Yes, it is cheap. It isn't always an acceptable workaround though, so I'm glad you guys are working on it - even if we have to wait a bit. I have faith in the system. ;o) ---------------------------------------------------------------------- Mike A. Harris - Linux advocate - Open Source advocate Opinions and viewpoints expressed are solely my own. ---------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 9:57 ` Dr S.M. Huen ` (5 preceding siblings ...) 2001-06-07 0:20 ` Mike A. Harris @ 2001-06-07 21:31 ` Shane Nay 2001-06-07 20:00 ` Marcelo Tosatti 6 siblings, 1 reply; 106+ messages in thread From: Shane Nay @ 2001-06-07 21:31 UTC (permalink / raw) To: Dr S.M. Huen, Sean Hunter; +Cc: Xavier Bestel, linux-kernel Uh, last I checked on my linux based embedded device I didn't want to swap to flash. Hmm.., now why was that..., oh, that's right, it's *much* more expensive than memory, oh yes, and it actually gets FRIED when you write to a block more than 100k times. Oh, what was that other thing..., oh yes, and it's SOLDERED ON THE BOARD. Damn..., guess I just lost a grand or so. Seriously folks, Linux isn't just for big webservers... Thanks, Shane Nay. (Oh, BTW, I really appreciate the work that people have done on the VM, but folks that are just talking..., well, think clearly before you impact other people that are writing code.) On Wednesday 06 June 2001 02:57, Dr S.M. Huen wrote: > On Wed, 6 Jun 2001, Sean Hunter wrote: > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > Do I understand you correctly? > ECC grade SDRAM for your 8GB server costs £335 per GB as 512MB sticks even > at today's silly prices (Crucial). Ultra160 SCSI costs £8.93/GB as 73GB > drives. > > It will cost you 19x as much to put the RAM in as to put the > developer's recommended amount of swap space to back up that RAM. The > developers gave their reasons for this design some time ago and if the > ONLY problem was that it required you to allocate more swap, why should > it be a priority item to fix it for those that refuse to do so? By all > means fix it urgently where it doesn't work when used as advised but > demanding priority to fixing a problem encountered when a user refuses to > use it in the manner specified seems very unreasonable. If you can afford > 4GB RAM, you certainly can afford 8GB swap. 
^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 21:31 ` Shane Nay @ 2001-06-07 20:00 ` Marcelo Tosatti 2001-06-07 21:55 ` Shane Nay 0 siblings, 1 reply; 106+ messages in thread From: Marcelo Tosatti @ 2001-06-07 20:00 UTC (permalink / raw) To: Shane Nay; +Cc: Dr S.M. Huen, Sean Hunter, Xavier Bestel, linux-kernel On Thu, 7 Jun 2001, Shane Nay wrote: > (Oh, BTW, I really appreciate the work that people have done on the VM, but > folks that are just talking..., well, think clearly before you impact other > people that are writing code.) If all the people talking were reporting results we would be really happy. Seriously, we really lack VM reports. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 20:00 ` Marcelo Tosatti @ 2001-06-07 21:55 ` Shane Nay 2001-06-07 20:29 ` Marcelo Tosatti 0 siblings, 1 reply; 106+ messages in thread From: Shane Nay @ 2001-06-07 21:55 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Dr S.M. Huen, Sean Hunter, Xavier Bestel, linux-kernel On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote: > On Thu, 7 Jun 2001, Shane Nay wrote: > > (Oh, BTW, I really appreciate the work that people have done on the VM, > > but folks that are just talking..., well, think clearly before you impact > > other people that are writing code.) > > If all the people talking were reporting results we would be really happy. > > Seriously, we really lack VM reports. Okay, I've had some problems with the VM on my machine, what is the most useful way to compile reports for you? I have modified the kernel for a few different ports fixing bugs, and device drivers, etc., but the VM is all Greek to me, I can just see that caching is hyper aggressive and doesn't look like it's going back to the pool..., which results in sluggish performance. Now I know from the work that I've done that anecdotal information is almost never even remotely useful. Therefore is there any body of information that I can read up on to create a useful set of data points for you or other VM hackers to look at? (Or maybe some report in the past that you thought was especially useful?) Thank You, Shane Nay. (I have in the past had many problems with the VM on embedded machines as well, but I'm not actively working on any right this second..., though my Psion is sitting next to me begging for me to run some VM tests on it :) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 21:55 ` Shane Nay @ 2001-06-07 20:29 ` Marcelo Tosatti 2001-06-07 23:29 ` VM Report was:Re: " Shane Nay 2001-06-08 1:18 ` Jonathan Morton 0 siblings, 2 replies; 106+ messages in thread From: Marcelo Tosatti @ 2001-06-07 20:29 UTC (permalink / raw) To: Shane Nay; +Cc: Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml On Thu, 7 Jun 2001, Shane Nay wrote: > On Thursday 07 June 2001 13:00, Marcelo Tosatti wrote: > > On Thu, 7 Jun 2001, Shane Nay wrote: > > > (Oh, BTW, I really appreciate the work that people have done on the VM, > > > but folks that are just talking..., well, think clearly before you impact > > > other people that are writing code.) > > > > If all the people talking were reporting results we would be really happy. > > > > Seriously, we really lack VM reports. > > Okay, I've had some problems with the VM on my machine, what is the most > useful way to compile reports for you? 1) Describe what you're running. (your workload) 2) Describe what you're feeling. (eg "interactivity is crap when I run this or that thing", etc) If we need more info than that I'll request in private. Also send these reports to the linux-mm list, so other VM hackers can also get those reports and we avoid traffic on lk. > I have modified the kernel for a few different ports fixing bugs, and > device drivers, etc., but the VM is all Greek to me, I can just see > that caching is hyper aggressive and doesn't look like it's going back > to the pool..., which results in sluggish performance. By performance you mean interactivity or throughput? > Now I know from the work that I've done that anecdotal information is > almost never even remotely useful. If we need more info, we will request. > Therefore is there any body of information that I can read up on to > create a useful set of data points for you or other VM hackers to > look at? (Or maybe some report in the past that you thought was > especially useful?) 
Just do what I described above. Thanks ^ permalink raw reply [flat|nested] 106+ messages in thread
* VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-07 20:29 ` Marcelo Tosatti @ 2001-06-07 23:29 ` Shane Nay 2001-06-08 1:18 ` Jonathan Morton 1 sibling, 0 replies; 106+ messages in thread From: Shane Nay @ 2001-06-07 23:29 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm (VM report at Marcelo Tosatti's request. He has mentioned that rather than complaining about the VM, people should report what their experiences were. I have tried to do so in the way that he asked.) > 1) Describe what you're running. (your workload) A lot of daemons, all on a private network so there is no throughput load on them. About 13 rxvt's, freeamp actively playing music at all times, xemacs with 25 active buffers, a few instances of vi, opera, no "desktop env", just windowmaker. (Though I have a few KDE2 apps open, and one or two GTK based apps open, so lots of library code swapping in and out I imagine) Now what I've noticed lately is this: with 2.4.2 my machine would lock quite frequently when I was compiling code and had other apps that were allocing memory. With 2.4.5 I haven't had that behaviour, but I've been much lighter on my machine. (I was doing full toolchain builds with 2.4.2 when I had the real problems) But processes were still running when the machine would lock..., like the mp3 player was still playing I noticed one time. With 2.4.5 (not -ac) I haven't had any deadlocks, but the system seems very sluggish at acute moments. While doing absolutely nothing processor intensive (I've been loading up top and ps'ing with regularity when this happens, looking for kswapd going crazy), when I switch between workspaces the refresh is much more sluggish on occasion, like I can watch windows appear. Almost like a micro-freeze really. (AMD T-Bird 1.333GHz 256MB-DDR) > 2) Describe what you're feeling.
(eg "interactivity is crap when I run this or that thing", etc) Freeing memory takes *forever*, but I think that's a function of how I'm allocing in this polygon rendering routine I'm working on. It literally sucks up vast numbers of cycles and makes picogui totally unusable. But I think this is unrelated to the kernel..., I think that's just because I haven't implemented re-use in memory structures for the polygon routine. (It's malloc/freeing massive numbers of small chunks of memory rather than doing its own memory management, probably related to glibc memory organization) Here's a vmstat line after 8 days of uptime and before contrived mem tests:

   procs                     memory     swap         io     system        cpu
 r  b  w   swpd   free   buff  cache  si  so   bi   bo   in   cs  us  sy  id
 1  0  0      0   3056   7856 121872   0   0    7    4   37   16   1   0  40

> If we need more info than that I'll request in private. > > Also send these reports to the linux-mm list, so other VM hackers can also > get those reports and we avoid traffic on lk. > By performance do you mean interactivity or throughput? Interactivity. I don't have any throughput needs to speak of. I just ran a barrage of tests on my machine, and the smallest it would ever make the cache was 16M; it would prefer to kill processes rather than make the cache smaller than that. Contrived stressor program (pseudo-code):

    fork(); fork(); fork(); fork();   /* 16 total processes */
    for (i = 0; i < n; i++) {
        ptr = malloc(1M)
        while (++m < ptrsize)
            ptr[m] = 'b';
        sleep(2);
    }

I would change n such that the total amount of memory was less than the amount of cache plus free memory. Running this put the entire system into chaos in short order. After it had killed off only one of the contrived memory hungry processes and at least two others (MP3 player and opera), the machine was sluggish..., very slow to respond to any key input. It stayed in this near-freeze state for about 20 seconds, after which it started to speed up to user input gradually.
(Probably swapping code from disk into cache or something like that) It took about 5-10 minutes to come back "up to speed". > Just do what I described above. Done :). Thanks, Shane Nay. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-07 20:29 ` Marcelo Tosatti 2001-06-07 23:29 ` VM Report was:Re: " Shane Nay @ 2001-06-08 1:18 ` Jonathan Morton 2001-06-08 12:50 ` Mike Galbraith 1 sibling, 1 reply; 106+ messages in thread From: Jonathan Morton @ 2001-06-08 1:18 UTC (permalink / raw) To: Shane Nay, Marcelo Tosatti Cc: Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm At 12:29 am +0100 8/6/2001, Shane Nay wrote: >(VM report at Marcelo Tosatti's request. He has mentioned that rather than >complaining about the VM that people mention what their experiences were. I >have tried to do so in the way that he asked.) >> By performance you mean interactivity or throughput? > >Interactivity. I don't have any throughput needs to speak of. > >I just ran a barrage of tests on my machine, and the smallest it would ever >make the cache was 16M, it would prefer to kill processes rather than make >the cache smaller than that. http://www.chromatix.uklinux.net/linux-patches/vm-update-2.patch Try this. I can't guarantee it's SMP-safe yet (I'm leaving that to the gurus, but they haven't told me about any errors in the past hour, so I'm assuming they aren't going to find anything glaringly wrong...), but you might like to see if your performance improves with it. It also fixes the OOM-killer bug, which you refer to above. Some measurements, from my own box (1GHz Athlon, 256Mb RAM): For the following benchmarks, physical memory availability was reduced according to the parameter in the left column. The benchmark is the wall-clock time taken to compile MySQL.

    mem=   2.4.5     earlier tweaks   now
    48M    8m30s     6m30s            5m58s
    32M    unknown   2h15m            12m34s

The following was performed with all 256Mb RAM available. This is compilation of MySQL using make -j 15.

    kernel:     2.4.5   now
    time:       6m30s   6m15s
    peak swap:  190M    70M

For the following test, the 256Mb swap partition on my IDE drive was disabled and replaced with a 1Gb swapfile on my Ultra160 SCSI drive.
This is compilation of MySQL using make -j 20.

    kernel:     2.4.5   now
    time:       7m20s   6m30s
    peak swap:  370M    254M

Draw your own conclusions. :) -------------------------------------------------------------- from: Jonathan "Chromatix" Morton mail: chromi@cyberspace.org (not for attachments) The key to knowledge is not to rely on people to teach you it. GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 1:18 ` Jonathan Morton @ 2001-06-08 12:50 ` Mike Galbraith 2001-06-08 14:19 ` Tobias Ringstrom 2001-06-08 15:51 ` John Stoffel 0 siblings, 2 replies; 106+ messages in thread From: Mike Galbraith @ 2001-06-08 12:50 UTC (permalink / raw) To: Jonathan Morton Cc: Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, Jonathan Morton wrote: > http://www.chromatix.uklinux.net/linux-patches/vm-update-2.patch > > Try this. I can't guarantee it's SMP-safe yet (I'm leaving the gurus to > that, but they haven't told me about any errors in the past hour so I'm > assuming they aren't going to find anything glaringly wrong...), but you > might like to see if your performance improves with it. It also fixes the > OOM-killer bug, which you refer to above. > > Some measurements, from my own box (1GHz Athlon, 256Mb RAM): > > For the following benchmarks, physical memory availability was reduced > according to the parameter in the left column. The benchmark is the > wall-clock time taken to compile MySQL. >
> mem=   2.4.5     earlier tweaks   now
> 48M    8m30s     6m30s            5m58s
> 32M    unknown   2h15m            12m34s
>
> The following was performed with all 256Mb RAM available. This is > compilation of MySQL using make -j 15. >
> kernel:     2.4.5   now
> time:       6m30s   6m15s
> peak swap:  190M    70M
>
> For the following test, the 256Mb swap partition on my IDE drive was > disabled and replaced with a 1Gb swapfile on my Ultra160 SCSI drive. This > is compilation of MySQL using make -j 20. >
> kernel:     2.4.5   now
> time:       7m20s   6m30s
> peak swap:  370M    254M
>
> Draw your own conclusions. :) (ok;) Hi, I gave this a shot at my favorite vm beater test (make -j30 bzImage) while testing some other stuff today. Seven identical runs, six slightly different kernels plus yours.
    real    11m23.522s   2.4.5.vm-update-2
    user     7m59.170s
    sys      0m37.030s
    user  : 0:08:07.06  65.6%   page in : 642402
    nice  : 0:00:00.00   0.0%   page out: 676820
    system: 0:02:09.44  17.4%   swap in : 105965
    idle  : 0:02:05.66  16.9%   swap out: 162603

    real    10m9.512s    2.4.5.virgin
    user     7m55.520s
    sys      0m35.460s
    user  : 0:08:02.66  72.2%   page in : 535186
    nice  : 0:00:00.00   0.0%   page out: 377992
    system: 0:01:37.78  14.6%   swap in : 99445
    idle  : 0:01:28.14  13.2%   swap out: 81926

    real    10m48.939s   2.4.5.virgin+reclaim.marcelo
    user     7m54.960s
    sys      0m36.240s
    user  : 0:08:02.33  68.0%   page in : 566239
    nice  : 0:00:00.00   0.0%   page out: 431874
    system: 0:01:56.02  16.4%   swap in : 108633
    idle  : 0:01:50.61  15.6%   swap out: 96415

    real    9m54.466s    2.4.5.virgin+reclaim.mike (icky 'bleeder valve')
    user    7m57.370s
    sys     0m35.890s
    user  : 0:08:04.74  74.1%   page in : 527678
    nice  : 0:00:00.00   0.0%   page out: 405259
    system: 0:01:12.01  11.0%   swap in : 98616
    idle  : 0:01:37.47  14.9%   swap out: 91492

    real    9m12.198s    2.4.5.tweak
    user    7m41.290s
    sys     0m34.840s
    user  : 0:07:47.69  76.8%   page in : 452632
    nice  : 0:00:00.00   0.0%   page out: 399847
    system: 0:01:17.08  12.7%   swap in : 75338
    idle  : 0:01:03.97  10.5%   swap out: 88291

    real    9m41.563s    2.4.5.tweak+reclaim.marcelo
    user    7m59.880s
    sys     0m34.690s
    user  : 0:08:07.22  73.4%   page in : 515433
    nice  : 0:00:00.00   0.0%   page out: 545762
    system: 0:01:35.34  14.4%   swap in : 88425
    idle  : 0:01:21.11  12.2%   swap out: 125967

    real    9m47.682s    2.4.5.tweak+reclaim.mike
    user    8m2.190s
    sys     0m34.550s
    user  : 0:08:09.57  75.7%   page in : 513166
    nice  : 0:00:00.00   0.0%   page out: 473539
    system: 0:01:20.27  12.4%   swap in : 83127
    idle  : 0:01:16.89  11.9%   swap out: 108886

Conclusion: Your patch hits the cache too hard and pays through the nose for doing so.. at least under this heavyweight load it does. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 12:50 ` Mike Galbraith @ 2001-06-08 14:19 ` Tobias Ringstrom 2001-06-08 16:51 ` Mike Galbraith 2001-06-08 15:51 ` John Stoffel 1 sibling, 1 reply; 106+ messages in thread From: Tobias Ringstrom @ 2001-06-08 14:19 UTC (permalink / raw) To: Mike Galbraith Cc: Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, Mike Galbraith wrote: > I gave this a shot at my favorite vm beater test (make -j30 bzImage) > while testing some other stuff today. Could you please explain what is good about this test? I understand that it will stress the VM, but will it do so in a realistic and relevant way? Isn't the interesting case when you have a number of processes using lots of memory, but only a part of all that memory is being actively used, and that memory fits in RAM? In that case, the VM should make sure that the unused memory is swapped out. In RAM you should have the used memory, but also disk cache if there is any RAM left. Does the current VM handle this case fine yet? IMHO, this is the case most people care about. It is definitely the case I care about, at least. :-) I'm not saying that it's a completely uninteresting case when your active memory is bigger than your RAM of course, but perhaps there should be other algorithms handling that case, such as putting some of the swapping processes to sleep for some time, especially if you have lots of processes competing for the memory. I may be wrong, but it seems to me that your testcase falls into this second category (also known as thrashing). And at last, a humble request: Every problem I've had with the VM has been that it either swapped out too many processes and used too much cache, or the other way around. I'd really enjoy a way to tune this behaviour, if possible. /Tobias ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 14:19 ` Tobias Ringstrom @ 2001-06-08 16:51 ` Mike Galbraith 2001-06-08 19:09 ` Tobias Ringstrom 0 siblings, 1 reply; 106+ messages in thread From: Mike Galbraith @ 2001-06-08 16:51 UTC (permalink / raw) To: Tobias Ringstrom Cc: Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, Tobias Ringstrom wrote: > On Fri, 8 Jun 2001, Mike Galbraith wrote: > > I gave this a shot at my favorite vm beater test (make -j30 bzImage) > > while testing some other stuff today. > > Could you please explain what is good about this test? I understand that > it will stress the VM, but will it do so in a realistic and relevant way? Can you explain what is bad about this test? ;) It spins the same VM wheels as any other load does. What's the difference if I have a bunch of httpd allocating or a bunch of cc1/as/ld? This load has a modest cachable data set and is compute bound.. and above all gives very repeatable results. I use it to watch reaction to surge. I watch for the vm to build to a solid maximum throughput without thrashing. That's the portion of VM that I'm interested in, so that's what I test. Besides :) I simply don't have the hardware to try to simulate hairy chested server loads. There are lots of folks with hairy chested boxes.. they should test that stuff. I've been repeating ~this test since 2.0 times, and have noticed a 1:1 relationship. When I notice that my box is ~happy doing this load test, I also notice very few VM gripes hitting the list. > Isn't the interesting case when you have a number of processes using lots > of memory, but only a part of all that memory is beeing actively used, and > that memory fits in RAM. In that case, the VM should make sure that the > not used memory is swapped out. In RAM you should have the used memory, > but also disk cache if there is any RAM left. Does the current VM handle > this case fine yet? 
IMHO, this is the case most people care about. It is > definitely the case I care about, at least. :-) The interesting case is _every_ case. Try seeing my particular test as a simulation of a small classroom box with 30 students compiling their assignments and it'll suddenly become quite realistic. You'll notice by the numbers I post that I was very careful to not overload the box in a ridiculous manner when selecting the total size of the job.. it's just a heavily loaded box. This test does not overload my IO resources, so it tests the VM's ability to choose and move the right stuff at the right time to get the job done with a minimum of additional overhead. The current VM handles things generally well imho, but has problems regulating itself under load. My test load hits the VM right at its weakest point (not _that_ weak, but..) by starting at zero and building rapidly to max.. and keeping it _right there_. > I'm not saying that it's a completely uninteresting case when your active > memory is bigger than your RAM of course, but perhaps there should be other > algorithms handling that case, such as putting some of the swapping > processes to sleep for some time, especially if you have lots of processes > competing for the memory. I may be wrong, but it seems to me that your > testcase falls into this second category (also known as thrashing). Thrashing? Let's look at some numbers. (not the ugly ones, the ~ok ones;)

    real    9m12.198s    make -j 30 bzImage
    user    7m41.290s
    sys     0m34.840s
    user  : 0:07:47.69  76.8%   page in : 452632
    nice  : 0:00:00.00   0.0%   page out: 399847
    system: 0:01:17.08  12.7%   swap in : 75338
    idle  : 0:01:03.97  10.5%   swap out: 88291

    real    8m6.994s     make bzImage
    user    7m34.350s
    sys     0m26.550s
    user  : 0:07:37.52  78.4%   page in : 90546
    nice  : 0:00:00.00   0.0%   page out: 18164
    system: 0:01:26.13  14.8%   swap in : 1
    idle  : 0:00:39.69   6.8%   swap out: 0

...look at cpu utilization. One minute +tiny change to complete the large job vs the small (VM footprint) job.
The box is not thrashing, it's working its little silicon butt off. What I'm testing is the VM's ability to handle load without thrashing so badly that it loses throughput bigtime, stalls itself, whatever.. its ability to regulate itself. I consider a minute and a half to be ~acceptable, a minute to be good, and 30 seconds to be excellent. That's just my own little VM performance thermometer. > And at last, a humble request: Every problem I've had with the VM has been > that it either swapped out too many processes and used too much cache, or > the other way around. I'd really enjoy a way to tune this behaviour, if > possible. Tunables aren't really practical in VM (imho). If there were a dozen knobs, you'd have to turn a dozen knobs a dozen times a day. VM has to be self-regulating. In case you can't tell (from the length of this reply), I like my favorite little generic throughput test a LOT :-) -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 16:51 ` Mike Galbraith @ 2001-06-08 19:09 ` Tobias Ringstrom 2001-06-09 4:36 ` Mike Galbraith 0 siblings, 1 reply; 106+ messages in thread From: Tobias Ringstrom @ 2001-06-08 19:09 UTC (permalink / raw) To: Mike Galbraith Cc: Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, Mike Galbraith wrote: > On Fri, 8 Jun 2001, Tobias Ringstrom wrote: > > On Fri, 8 Jun 2001, Mike Galbraith wrote: > > > I gave this a shot at my favorite vm beater test (make -j30 bzImage) > > > while testing some other stuff today. > > > > Could you please explain what is good about this test? I understand that > > it will stress the VM, but will it do so in a realistic and relevant way? > > Can you explain what is bad about this test? ;) It spins the same VM wheels I think a load of ~30 is quite uncommon, and therefore it is unclear to me that it would be a test that would be representative of most normal loads. > as any other load does. What's the difference if I have a bunch of httpd > allocating or a bunch of cc1/as/ld? This load has a modest cachable data > set and is compute bound.. and above all gives very repeatable results. Not a big difference. The difference I was thinking about is the difference between spawning lots of processes allocating, using and freeing lots of memory, compared to a case where you have a few processes touching a lot of already allocated pages in some pattern. I was wondering whether optimizing for your case would be good or bad for the other case. I know, I know, I should do more testing myself.
On this machine, I do not mind if all shells, daemons and other idle processes are being swapped out in favor of disk cache for the NFS and SMB serving. In fact, that is a very good thing, and I want it that way. Another machine is my desktop machine. When using this machine, I really hate when my emacsen, browsers, xterms, etc. are swapped out just to give me some stupid disk cache for my xmms or compilations. I do not care if a kernel compile is a little slower as long as my applications are snappy. How could Linux predict this? It is a matter of taste, IMHO. > I use it to watch reaction to surge. I watch for the vm to build to a > solid maximum throughput without thrashing. That's the portion of VM > that I'm interested in, so that's what I test. Besides :) I simply don't > have the hardware to try to simulate hairy chested server loads. There > are lots of folks with hairy chested boxes.. they should test that stuff. Agreed. More testing is needed. Now if we had those knobs and wheels to turn, we could perhaps also tune our systems to behave as we like them, and submit that as well. Right now you need to be a kernel hacker, and see through all the magic with shm, mmap, a bunch of caches, page lists, etc. I'd give a lot for a nice picture (or state diagram) showing the lifetime of a page, but I have not found such a picture anywhere.
In RAM you should have the used memory, > > but also disk cache if there is any RAM left. Does the current VM handle > > this case fine yet? IMHO, this is the case most people care about. It is > > definately the case I care about, at least. :-) > > The interesting case is _every_ case. Try seeing my particular test as > a simulation of a small classroom box with 30 students compiling their > assignments and it'll suddenly become quite realistic. You'll notice > by the numbers I post that I was very careful to not overload the box in > a rediculous manner when selecting the total size of the job.. it's just > a heavily loaded box. This test does not overload my IO resources, so > it tests the VM's ability to choose and move the right stuff at the right > time to get the job done with a minimum of additional overhead. I did not understand those numbers when I saw them the first time. Now, I must say that your test does not look as silly as it did before. > The current VM handles things generally well imho, but has problems > regulating itself under load. My test load hits the VM right in it's > weakest point (not _that_ weak, but..) by starting at zero and building > rapidly to max.. and keeping it _right there_. > > > I'm not saying that it's a completely uninteresting case when your active > > memory is bigger than you RAM of course, but perhaps there should be other > > algorithms handling that case, such as putting some of the swapping > > processes to sleep for some time, especially if you have lots of processes > > competing for the memory. I may be wrong, but it seems to me that your > > testcase falls into this second category (also known as thrashing). > > Thrashing? Let's look some numbers. 
(not the ugly ones, the ~ok ones;) >
> real    9m12.198s    make -j 30 bzImage
> user    7m41.290s
> sys     0m34.840s
> user  : 0:07:47.69  76.8%   page in : 452632
> nice  : 0:00:00.00   0.0%   page out: 399847
> system: 0:01:17.08  12.7%   swap in : 75338
> idle  : 0:01:03.97  10.5%   swap out: 88291
>
> real    8m6.994s     make bzImage
> user    7m34.350s
> sys     0m26.550s
> user  : 0:07:37.52  78.4%   page in : 90546
> nice  : 0:00:00.00   0.0%   page out: 18164
> system: 0:01:26.13  14.8%   swap in : 1
> idle  : 0:00:39.69   6.8%   swap out: 0
>
> ...look at cpu utilization. One minute +tiny change to complete the > large job vs the small (VM footprint) job. > > The box is not thrashing, it's working its little silicon butt off. > What I'm testing is the VM's ability to handle load without thrashing > so badly that it loses throughput bigtime, stalls itself whatever.. > its ability to regulate itself. I consider a minute and a half to > be ~acceptable, a minute to be good, and 30 seconds to be excellent. > That's just my own little VM performance thermometer. Why isn't user+system+idle == real? SMP? > > And at last, a humble request: Every problem I've had with the VM has been > that it either swapped out too many processes and used too much cache, or > the other way around. I'd really enjoy a way to tune this behaviour, if > possible. > > Tunables aren't really practical in VM (imho). If there were a dozen > knobs, you'd have to turn a dozen knobs a dozen times a day. VM has > to be self regulating. Yes, that is of course the goal, but I'm suggesting that we would reach the goal of a self-optimizing VM faster if there were tunables to play with. The human brain is a very good optimizer. > In case you can't tell (the length of this reply) I like my favorite > little generic throughput test a LOT :-) Point taken. :-) /Tobias ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 19:09 ` Tobias Ringstrom @ 2001-06-09 4:36 ` Mike Galbraith 0 siblings, 0 replies; 106+ messages in thread From: Mike Galbraith @ 2001-06-09 4:36 UTC (permalink / raw) To: Tobias Ringstrom Cc: Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, Tobias Ringstrom wrote: > On Fri, 8 Jun 2001, Mike Galbraith wrote: > > On Fri, 8 Jun 2001, Tobias Ringstrom wrote: > > > On Fri, 8 Jun 2001, Mike Galbraith wrote: > > > > I gave this a shot at my favorite vm beater test (make -j30 bzImage) > > > > while testing some other stuff today. > > > > > > Could you please explain what is good about this test? I understand that > > > it will stress the VM, but will it do so in a realistic and relevant way? > > > > Can you explain what is bad about this test? ;) It spins the same VM wheels > > I think a load of ~30 is quite uncommon, and therefore it is unclear to me > that it would be a test that would be representative of most normal loads. It's not supposed to be representative. It's supposed to take the box rapidly (but not instantly) from idle through lo->medium->high and maintain solid throughput. > > as any other load does. What's the difference if I have a bunch of httpd > > allocating or a bunch of cc1/as/ld? This load has a modest cachable data > > set and is compute bound.. and above all gives very repeatable results. > > Not a big difference. The difference I was thinking about is the > difference between spawning lots of processes allocating, using and > freeing lots of memory, compared to a case where you have a few processes > touching a lot of already allocated pages in some pattern. I was > wondering whether optimizing for your case would be good or bad for the > other case. I know, I know, I should do more testing myself.
And I > should probably not ask you, since you really really like your test, > and you will probably just say yes... ;-) It's not a matter of optimizing for my case.. that would be horrible. It's a matter of whether the VM is capable of rapid and correct responses. > At home, I'm running a couple of computers. One of them is a slow > computer running Linux, serving mail, NFS, SMB, etc. I'm usually logged > in on a couple of virtual consoles. On this machine, I do not mind if all > shells, daemons and other idle processes are being swapped out in favor > of disk cache for the NFS and SMB serving. In fact, that is a very good > thing, and I want it that way. > > Another machine is my desktop machine. When using this machine, I really > hate when my emacsen, browsers, xterms, etc are swapped out just to give > me some stupid disk cache for my xmms or compilations. I do not care if a > kernel compile is a little slower as long as my applications are snappy. > > How could Linux predict this? It is a matter of taste, IMHO. I have no idea. It would be _wonderful_ if it could detect interactive tasks and give them preferential treatment. > > I use it to watch reaction to surge. I watch for the vm to build to a > > solid maximum throughput without thrashing. That's the portion of VM > > that I'm interested in, so that's what I test. Besides :) I simply don't > > have the hardware to try to simulate hairy chested server loads. There > > are lots of folks with hairy chested boxes.. they should test that stuff. > > Agreed. More testing is needed. Now if we would have those knobs and > wheels to turn, we could perhaps also tune our systems to behave as we > like them, and submit that as well. Right now you need to be a kernel > hacker, and see through all the magic with shm, mmap, a bunch of caches, > page lists, etc. I'd give a lot for a nice picture (or state diagram) > showing the lifetime of a page, but I have not found such a picture > anywhere.
Besides, the VM seems to change every new release anyway. > > > I've been repeating ~this test since 2.0 times, and have noticed a 1:1 > > relationship. When I notice that my box is ~happy doing this load test, > > I also notice very few VM gripes hitting the list. > > Ok, but as you say, we need more tests. > > > > Isn't the interesting case when you have a number of processes using lots > > > of memory, but only a part of all that memory is beeing actively used, and > > > that memory fits in RAM. In that case, the VM should make sure that the > > > not used memory is swapped out. In RAM you should have the used memory, > > > but also disk cache if there is any RAM left. Does the current VM handle > > > this case fine yet? IMHO, this is the case most people care about. It is > > > definately the case I care about, at least. :-) > > > > The interesting case is _every_ case. Try seeing my particular test as > > a simulation of a small classroom box with 30 students compiling their > > assignments and it'll suddenly become quite realistic. You'll notice > > by the numbers I post that I was very careful to not overload the box in > > a rediculous manner when selecting the total size of the job.. it's just > > a heavily loaded box. This test does not overload my IO resources, so > > it tests the VM's ability to choose and move the right stuff at the right > > time to get the job done with a minimum of additional overhead. > > I did not understand those numbers when I saw them the first time. Now, I > must say that your test does not look as silly as it did before. [snip.. save a tree] > Why isn't user+system+idle == real? SMP? Good question, no smp (sniff) here. > > Tunables aren't really practical in VM (imho). If there were a dozen > > knobs, you'd have to turn a dozen knobs a dozen times a day. VM has > > to be self regulating. 
> > Yes, that is of course the goal, but I'm suggesting that we would reach > the goal of a self-optimizing VM faster, if there were tunables to play > with. The human brain is a very good optimizer. You bet! The CPU is a stupid robot. I've tried to think up good generic tunables, and failed. This is something that more folks should give some thought. Maybe someone will think of knobs that _are_ practical. > > In case you can't tell (the length of this reply) I like my fovorite > > little generic throughput test a LOT :-) > > Point taken. :-) Cheers, -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 12:50 ` Mike Galbraith 2001-06-08 14:19 ` Tobias Ringstrom @ 2001-06-08 15:51 ` John Stoffel 2001-06-08 17:01 ` Mike Galbraith 2001-06-08 17:43 ` John Stoffel 1 sibling, 2 replies; 106+ messages in thread From: John Stoffel @ 2001-06-08 15:51 UTC (permalink / raw) To: Tobias Ringstrom Cc: Mike Galbraith, Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm >>>>> "Tobias" == Tobias Ringstrom <tori@unhappy.mine.nu> writes: Tobias> On Fri, 8 Jun 2001, Mike Galbraith wrote: >> I gave this a shot at my favorite vm beater test (make -j30 bzImage) >> while testing some other stuff today. Tobias> Could you please explain what is good about this test? I Tobias> understand that it will stress the VM, but will it do so in a Tobias> realistic and relevant way? I agree, this isn't really a good test case. I'd rather see what happens when you fire up a gimp session to edit an image which is *almost* the size of RAM, or even just 50% the size of ram. Then how does that affect your other processes that are running at the same time? This testing could even be automated with the script-foo stuff to get consistent results across runs, which is the prime requirement of any sort of testing. On another issue, in swap.c we have two defines for buffer_mem and page_cache, but the first maxes out at 60%, while the cache maxes out at 75%. Shouldn't they both be lower numbers? Or at least equally sized? I've set my page_cache maximum to be 60, I'll be trying to test it over the weekend, but good weather will keep me outside doing other stuff... Thanks, John John Stoffel - Senior Unix Systems Administrator - Lucent Technologies stoffel@lucent.com - http://www.lucent.com - 978-952-7548 ^ permalink raw reply [flat|nested] 106+ messages in thread
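For reference, the defines John is talking about look roughly like this in the 2.4-era mm/swap.c. (A from-memory sketch, not a verbatim quote; the struct, field and variable names should be checked against the actual source tree. The three fields are minimum, borrow and maximum percentages of memory, and the maximum values match the 60%/75% figures John cites.)

```c
/* Sketch of the 2.4-era buffer/page-cache balancing knobs. */
typedef struct buffer_mem_v {
    unsigned int min_percent;      /* floor below which the cache is defended */
    unsigned int borrow_percent;   /* level above which pages are borrowed back */
    unsigned int max_percent;      /* hard ceiling as a percentage of memory */
} buffer_mem_t;

buffer_mem_t buffer_mem = { 2, 10, 60 };   /* buffer cache caps at 60% */
buffer_mem_t page_cache = { 2, 15, 75 };   /* page cache caps at 75% */
```

If the corresponding sysctls are wired up in a given tree, the same triples can be adjusted at runtime by writing to /proc/sys/vm/buffermem and /proc/sys/vm/pagecache rather than recompiling, which is presumably how a change like John's 75 -> 60 would normally be tried out.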
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 15:51 ` John Stoffel @ 2001-06-08 17:01 ` Mike Galbraith 2001-06-09 3:34 ` Rik van Riel 2001-06-08 17:43 ` John Stoffel 1 sibling, 1 reply; 106+ messages in thread From: Mike Galbraith @ 2001-06-08 17:01 UTC (permalink / raw) To: John Stoffel Cc: Tobias Ringstrom, Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, John Stoffel wrote: > >>>>> "Tobias" == Tobias Ringstrom <tori@unhappy.mine.nu> writes: > > Tobias> On Fri, 8 Jun 2001, Mike Galbraith wrote: > > >> I gave this a shot at my favorite vm beater test (make -j30 bzImage) > >> while testing some other stuff today. > > Tobias> Could you please explain what is good about this test? I > Tobias> understand that it will stress the VM, but will it do so in a > Tobias> realistic and relevant way? > > I agree, this isn't really a good test case. I'd rather see what > happens when you fire up a gimp session to edit an image which is > *almost* the size of RAM, or even just 50% the size of ram. Then how > does that affect your other processes that are running at the same > time? OK, riddle me this. If this test is a crummy test, just how is it that I was able to warn Rik in advance that when 2.4.5 was released, he should expect complaints? How did I _know_ that? The answer is that I fiddle with Rik's code a lot, and I test with this test because it tells me a lot. It may not tell you anything, but it does me. -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 17:01 ` Mike Galbraith @ 2001-06-09 3:34 ` Rik van Riel 0 siblings, 0 replies; 106+ messages in thread From: Rik van Riel @ 2001-06-09 3:34 UTC (permalink / raw) To: Mike Galbraith Cc: John Stoffel, Tobias Ringstrom, Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, Mike Galbraith wrote: > On Fri, 8 Jun 2001, John Stoffel wrote: > > I agree, this isn't really a good test case. I'd rather see what > > happens when you fire up a gimp session to edit an image which is > > *almost* the size of RAM, or even just 50% the size of ram. > > OK, riddle me this. If this test is a crummy test, just how is it Personally, I'd like to see BOTH of these tests, and many many more. Preferably, handed to the VM hackers in various colourful graphs that allow even severely undercaffeinated hackers to see how things changed for the good or the bad between kernel revisions. cheers, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to aardvark@nl.linux.org (spam digging piggy) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 15:51 ` John Stoffel 2001-06-08 17:01 ` Mike Galbraith @ 2001-06-08 17:43 ` John Stoffel 2001-06-08 17:35 ` Marcelo Tosatti ` (2 more replies) 1 sibling, 3 replies; 106+ messages in thread From: John Stoffel @ 2001-06-08 17:43 UTC (permalink / raw) To: Mike Galbraith Cc: John Stoffel, Tobias Ringstrom, Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm Mike> OK, riddle me this. If this test is a crummy test, just how is Mike> it that I was able to warn Rik in advance that when 2.4.5 was Mike> released, he should expect complaints? How did I _know_ that? Mike> The answer is that I fiddle with Rik's code a lot, and I test Mike> with this test because it tells me a lot. It may not tell you Mike> anything, but it does me. I never said it was a crummy test, please do not read more into my words than was written. What I was trying to get across is that just one test (such as a compile of the kernel) isn't perfect at showing where the problems are with the VM sub-system. Jonathan Morton has been using another large compile to also test the sub-system, and it includes a compile which puts a large, single process pressure on the VM. I consider this to be a more representative test of how the VM deals with pressure. The kernel compile is an ok test of basic VM handling, but from what I've been hearing on linux-kernel and linux-mm is that the VM goes to crap when you have a mix of stuff running, and one (or more) processes starts up or grows much larger and starts impacting the system performance. I'm also not knocking your contributions to this discussion, so stop being so touchy. I was trying to contribute and say (albeit poorly) that a *mix* of tests is needed to test the VM. More importantly, a *repeatable* set of tests is what is needed to test the VM and get consistent results from run to run, so you can see how your changes are impacting performance. 
The kernel compile doesn't really have any one process grow to a large fraction of memory, so dropping in a compile which *does* is a good thing. John John Stoffel - Senior Unix Systems Administrator - Lucent Technologies stoffel@lucent.com - http://www.lucent.com - 978-952-7548 ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 17:43 ` John Stoffel @ 2001-06-08 17:35 ` Marcelo Tosatti 2001-06-09 5:07 ` Mike Galbraith 2001-06-08 18:30 ` Mike Galbraith 2001-06-08 20:58 ` John Stoffel 2 siblings, 1 reply; 106+ messages in thread From: Marcelo Tosatti @ 2001-06-08 17:35 UTC (permalink / raw) To: John Stoffel Cc: Mike Galbraith, Tobias Ringstrom, Jonathan Morton, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, John Stoffel wrote: > > Mike> OK, riddle me this. If this test is a crummy test, just how is > Mike> it that I was able to warn Rik in advance that when 2.4.5 was > Mike> released, he should expect complaints? How did I _know_ that? > Mike> The answer is that I fiddle with Rik's code a lot, and I test > Mike> with this test because it tells me a lot. It may not tell you > Mike> anything, but it does me. > > I never said it was a crummy test, please do not read more into my > words than was written. What I was trying to get across is that just > one test (such as a compile of the kernel) isn't perfect at showing > where the problems are with the VM sub-system. > > Jonathan Morton has been using another large compile to also test the > sub-system, and it includes a compile which puts a large, single > process pressure on the VM. I consider this to be a more > representative test of how the VM deals with pressure. > > The kernel compile is an ok test of basic VM handling, but from what > I've been hearing on linux-kernel and linux-mm is that the VM goes to > crap when you have a mix of stuff running, and one (or more) processes > starts up or grows much larger and starts impacting the system > performance. > > I'm also not knocking your contributions to this discussion, so stop > being so touchy. I was trying to contribute and say (albeit poorly) > that a *mix* of tests is needed to test the VM. 
> > More importantly, a *repeatable* set of tests is what is needed to > test the VM and get consistent results from run to run, so you can see > how your changes are impacting performance. The kernel compile > doesn't really have any one process grow to a large fraction of > memory, so dropping in a compile which *does* is a good thing. I agree with you. Mike, I'm sure you have noticed that the stock kernel gives much better results than mine or Jonathan's patch. Now the stock kernel gives us crappy interactivity compared to my patch. (Note: my patch still does not give me the interactivity I want under high VM loads, but I hope to get there soon). BTW, we are talking with the OSDL (http://www.osdlab.org) guys about the possibility of setting up a test system which would run a variety of different benchmarks to give us results for different kinds of workloads. If that ever happens, we'll probably get rid of most of these testing problems. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 17:35 ` Marcelo Tosatti @ 2001-06-09 5:07 ` Mike Galbraith 0 siblings, 0 replies; 106+ messages in thread From: Mike Galbraith @ 2001-06-09 5:07 UTC (permalink / raw) To: Marcelo Tosatti Cc: John Stoffel, Tobias Ringstrom, Jonathan Morton, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, Marcelo Tosatti wrote: > On Fri, 8 Jun 2001, John Stoffel wrote: > > > More importantly, a *repeatable* set of tests is what is needed to > > test the VM and get consistent results from run to run, so you can see > > how your changes are impacting performance. The kernel compile > > doesn't really have any one process grow to a large fraction of > > memory, so dropping in a compile which *does* is a good thing. > > I agree with you. > > Mike, I'm sure you have noticed that the stock kernel gives much better > results than mine or Jonathan's patch. I noticed that Jonathan brought back waiting.. that (among others) made me veeeeery interested. > Now the stock kernel gives us crappy interactivity compared to my patch. > (Note: my patch still does not give me the interactivity I want under > high VM loads, but I hope to get there soon). (And that's why) Among other things (yes, I do love throughput) I've poked at the interactivity problem. I can't improve it anymore without doing some strategic waiting :( I used to be able to help it a little by doing a careful roll-up in scrub size as load builds.. trying to smooth the transition from latency oriented to hammer down throughput. > BTW, we are talking with the OSDL (http://www.osdlab.org) guys about the > possibility of setting up a test system which would run a variety of different > benchmarks to give us results for different kinds of workloads. If that > ever happens, we'll probably get rid of most of these testing problems. Excellent! -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 17:43 ` John Stoffel 2001-06-08 17:35 ` Marcelo Tosatti @ 2001-06-08 18:30 ` Mike Galbraith 2001-06-09 12:31 ` Zlatko Calusic 2001-06-08 20:58 ` John Stoffel 2 siblings, 1 reply; 106+ messages in thread From: Mike Galbraith @ 2001-06-08 18:30 UTC (permalink / raw) To: John Stoffel Cc: Tobias Ringstrom, Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, John Stoffel wrote: > Mike> OK, riddle me this. If this test is a crummy test, just how is > Mike> it that I was able to warn Rik in advance that when 2.4.5 was > Mike> released, he should expect complaints? How did I _know_ that? > Mike> The answer is that I fiddle with Rik's code a lot, and I test > Mike> with this test because it tells me a lot. It may not tell you > Mike> anything, but it does me. > > I never said it was a crummy test, please do not read more into my > words than was written. What I was trying to get across is that just > one test (such as a compile of the kernel) isn't perfect at showing > where the problems are with the VM sub-system. Hmm... Tobias> Could you please explain what is good about this test? I Tobias> understand that it will stress the VM, but will it do so in a Tobias> realistic and relevant way? I agree, this isn't really a good test case. I'd rather see what ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ happens when you fire up a gimp session to edit an image which is *almost* the size of RAM, or even just 50% the size of ram. Then how does that affect your other processes that are running at the same time? ...but anyway, yes, it's just one test from any number of possibles. > Jonathan Morton has been using another large compile to also test the > sub-system, and it includes a compile which puts a large, single > process pressure on the VM. I consider this to be a more > representative test of how the VM deals with pressure.
What does 'more representative' mean given that the VM must react to every situation it runs into? > The kernel compile is an ok test of basic VM handling, but from what Now we're communicating. I never said it was more than that ;-) > I've been hearing on linux-kernel and linux-mm is that the VM goes to > crap when you have a mix of stuff running, and one (or more) processes > starts up or grows much larger and starts impacting the system > performance. > > I'm also not knocking your contributions to this discussion, so stop > being so touchy. I was trying to contribute and say (albeit poorly) > that a *mix* of tests is needed to test the VM. Yes, more people need to test. I don't need to do all of those other tests (no have right toys), more people need to do repeatable tests. > More importantly, a *repeatable* set of tests is what is needed to > test the VM and get consistent results from run to run, so you can see > how your changes are impacting performance. The kernel compile > doesn't really have any one process grow to a large fraction of > memory, so dropping in a compile which *does* is a good thing. I know I'm only watching basic functionality. I'm watching basic functionality with one very consistent test run very consistently. -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 18:30 ` Mike Galbraith @ 2001-06-09 12:31 ` Zlatko Calusic 0 siblings, 0 replies; 106+ messages in thread From: Zlatko Calusic @ 2001-06-09 12:31 UTC (permalink / raw) To: Mike Galbraith Cc: John Stoffel, Tobias Ringstrom, Jonathan Morton, Shane Nay, Marcelo Tosatti, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm Mike Galbraith <mikeg@wen-online.de> writes: > On Fri, 8 Jun 2001, John Stoffel wrote: > > > Mike> OK, riddle me this. If this test is a crummy test, just how is > > Mike> it that I was able to warn Rik in advance that when 2.4.5 was > > Mike> released, he should expect complaints? How did I _know_ that? > > Mike> The answer is that I fiddle with Rik's code a lot, and I test > > Mike> with this test because it tells me a lot. It may not tell you > > Mike> anything, but it does me. > > > > I never said it was a crummy test, please do not read more into my > > words than was written. What I was trying to get across is that just > > one test (such as a compile of the kernel) isn't perfect at showing > > where the problems are with the VM sub-system. > > Hmm... > > Tobias> Could you please explain what is good about this test? I > Tobias> understand that it will stress the VM, but will it do so in a > Tobias> realistic and relevant way? > > I agree, this isn't really a good test case. I'd rather see what > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > happens when you fire up a gimp session to edit an image which is > *almost* the size of RAM, or even just 50% the size of ram. Then how > does that affect your other processes that are running at the same > time? > > ...but anyway, yes it just one test from any number of possibles. One great test that I'm using regularly to see what's goin' on, is at http://lxr.linux.no/. 
It is a cool utility to cross-reference your Linux kernel source tree, and in the meantime eat gobs of memory, do lots of I/O, and burn many CPU cycles (all at the same time). An ideal test, if you ask me. If anybody has the time, it would be nice to see different timing numbers when run on different kernels. Just make sure you run it on the same kernel tree to get reproducible results. It has three passes, and the third one is the most interesting one (use vmstat 1 to see why). When run with a 64MB RAM configuration, it would swap heavily, with 128MB somewhat, and at 192MB maybe not (depending on the other applications running at the same time). Try it, it is a nice utility, and a great test. :) -- Zlatko ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 17:43 ` John Stoffel 2001-06-08 17:35 ` Marcelo Tosatti 2001-06-08 18:30 ` Mike Galbraith @ 2001-06-08 20:58 ` John Stoffel 2001-06-08 20:04 ` Marcelo Tosatti 2001-06-08 23:44 ` Jonathan Morton 2 siblings, 2 replies; 106+ messages in thread From: John Stoffel @ 2001-06-08 20:58 UTC (permalink / raw) To: Marcelo Tosatti Cc: John Stoffel, Mike Galbraith, Tobias Ringstrom, Jonathan Morton, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm Marcelo> Now the stock kernel gives us crappy interactivity compared Marcelo> to my patch. (Note: my patch still does not give me the Marcelo> interactivity I want under high VM loads, but I hope to get Marcelo> there soon). This raises the important question: how can we objectively measure interactive response in the kernel and relate it to the user's perceived interactive response? If we could come up with some sort of testing system that would show us this, it would help a lot, since we could just have people run tests in a more automatic and repeatable manner. And I think it would also help us automatically tune the kernel, since it would have knowledge of its own performance. There is the problem that some people want pure interactive performance, while others are looking for throughput over all else, but those are both extremes of the spectrum. Though I suspect raw throughput is less wanted (in terms of numbers of systems) than keeping interactive response good during VM pressure. I have zero knowledge of how we could do this, but giving the kernel some counters, even if only for use during debugging runs, which would give us some objective feedback on performance would be a big win. Having people just send in reports of "I ran X,Y,Z and it was slow" doesn't help us, since it's so hard to re-create their environment so you can run tests against it. Anyway, enjoy the weekend all.
John John Stoffel - Senior Unix Systems Administrator - Lucent Technologies stoffel@lucent.com - http://www.lucent.com - 978-952-7548 ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 20:58 ` John Stoffel @ 2001-06-08 20:04 ` Marcelo Tosatti 2001-06-08 23:44 ` Jonathan Morton 1 sibling, 0 replies; 106+ messages in thread From: Marcelo Tosatti @ 2001-06-08 20:04 UTC (permalink / raw) To: John Stoffel Cc: Mike Galbraith, Tobias Ringstrom, Jonathan Morton, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Fri, 8 Jun 2001, John Stoffel wrote: > > Marcelo> Now the stock kernel gives us crappy interactivity compared > Marcelo> to my patch. (Note: my patch still does not give me the > Marcelo> interactivity I want under high VM loads, but I hope to get > Marcelo> there soon). > > This raises the important question: how can we objectively measure > interactive response in the kernel and relate it to the user's > perceived interactive response? If we could come up with some sort of > testing system that would show us this, it would help a lot, since we > could just have people run tests in a more automatic and repeatable > manner. > > And I think it would also help us automatically tune the kernel, since > it would have knowledge of its own performance. > > There is the problem that some people want pure interactive > performance, while others are looking for throughput over all else, > but those are both extremes of the spectrum. Though I suspect > raw throughput is less wanted (in terms of numbers of systems) > than keeping interactive response good during VM pressure. And this raises a very very important point: raw throughput wins enterprise-like benchmarks, and the enterprise people are the ones who pay most of the hackers here. (including me and Rik) We have to be careful about that. > I have zero knowledge of how we could do this, but giving the kernel > some counters, even if only for use during debugging runs, which would > give us some objective feedback on performance would be a big win.
> > Having people just send in reports of "I ran X,Y,Z and it was slow" > doesn't help us, since it's so hard to re-create their environment so > you can run tests against it. Let's wait for some test system to be set up (e.g. the OSDL thing). Once that's done, I'm sure we will find some way of doing it. Well, a good weekend to you too. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 20:58 ` John Stoffel 2001-06-08 20:04 ` Marcelo Tosatti @ 2001-06-08 23:44 ` Jonathan Morton 2001-06-09 2:36 ` Andrew Morton ` (2 more replies) 1 sibling, 3 replies; 106+ messages in thread From: Jonathan Morton @ 2001-06-08 23:44 UTC (permalink / raw) To: Marcelo Tosatti, John Stoffel Cc: Mike Galbraith, Tobias Ringstrom, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm [ Re-entering discussion after too long a day and a long sleep... ] >> There is the problem that some people want pure interactive >> performance, while others are looking for throughput over all else, >> but those are both extremes of the spectrum. Though I suspect >> raw throughput is less wanted (in terms of numbers of systems) >> than keeping interactive response good during VM pressure. > >And this raises a very very important point: raw throughput wins >enterprise-like benchmarks, and the enterprise people are the ones who pay >most of the hackers here. (including me and Rik) Very true. As well as the fact that interactivity is much harder to measure. The question is, what is interactivity (from the kernel's perspective)? It usually means small(ish) processes with intermittent working-set and CPU requirements. These types of process can safely be swapped out when not immediately in use, but the kernel has to be able to page them in quite quickly when needed. Doing that under heavy load is very non-trivial. It can also mean multimedia applications with a continuous (maybe small) working set, a continuous but not 100% CPU usage, and the special property that the user WILL notice if this process gets swapped out even briefly. mpg123 and XMMS fall into this category, and I sometimes tried running these alongside my compilation tests to see how they fared. I think I had it going fairly well towards the end, with mpg123 stuttering relatively rarely and briefly while VM load was high.
On the subject of Mike Galbraith's kernel compilation test, how much physical RAM does he have for his machine, what type of CPU is it, and what (approximate) type of device does he use for swap? I'll see if I can partially duplicate his results at this end. So far all my tests have been done with a fast CPU - perhaps I should try the P166/MMX or even try loading linux-pmac onto my 8100. -------------------------------------------------------------- from: Jonathan "Chromatix" Morton mail: chromi@cyberspace.org (not for attachments) The key to knowledge is not to rely on people to teach you it. GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 23:44 ` Jonathan Morton @ 2001-06-09 2:36 ` Andrew Morton 2001-06-09 6:33 ` Mark Hahn 2001-06-09 3:43 ` Mike Galbraith 2001-06-09 4:05 ` Jonathan Morton 2 siblings, 1 reply; 106+ messages in thread From: Andrew Morton @ 2001-06-09 2:36 UTC (permalink / raw) To: Jonathan Morton; +Cc: lkml, linux-mm Jonathan Morton wrote: > > [ Re-entering discussion after too long a day and a long sleep... ] > > >> There is the problem that some people want pure interactive > >> performance, while others are looking for throughput over all else, > >> but those are both extremes of the spectrum. Though I suspect > >> raw throughput is less wanted (in terms of numbers of systems) > >> than keeping interactive response good during VM pressure. > > > >And this raises a very very important point: raw throughput wins > >enterprise-like benchmarks, and the enterprise people are the ones who pay > >most of the hackers here. (including me and Rik) > > Very true. As well as the fact that interactivity is much harder to > measure. The question is, what is interactivity (from the kernel's > perspective)? It usually means small(ish) processes with intermittent > working-set and CPU requirements. These types of process can safely be > swapped out when not immediately in use, but the kernel has to be able to > page them in quite quickly when needed. Doing that under heavy load is > very non-trivial. For the low-latency stuff, latency can be defined as the worst-case time to schedule a userspace process in response to an interrupt. That metric is also appropriate in this case (latency equals interactivity), although here you don't need to be so fanatical about the *worst case*. A few scheduling blips here are less fatal. I have tools to measure latency (aka interactivity).
At http://www.uow.edu.au/~andrewm/linux/schedlat.html#downloads there is a kernel patch called `rtc-debug' which causes the PC RTC to generate a stream of interrupts. A user-space task called `amlat' responds to those interrupts and reads the RTC device. The patched RTC driver can then measure the elapsed time between the interrupt and the read from userspace. Voila: latency. When you close the RTC device (by killing amlat), the RTC driver will print out a histogram of the latencies. `amlat' at present gives itself SCHED_RR policy and runs under mlockall() - for your testing you'll need to delete those lines. So. Simply apply rtc-debug, run `amlat' and kill it when you've finished the workload. The challenge will be to relate the latency histogram to human-perceived interactivity. I'm not sure of the best way of doing that. Perhaps monitor the 90th percentile, and aim to keep it below 100 milliseconds. Also, `amlat' should do a bit of disk I/O as well. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-09 2:36 ` Andrew Morton @ 2001-06-09 6:33 ` Mark Hahn 0 siblings, 0 replies; 106+ messages in thread From: Mark Hahn @ 2001-06-09 6:33 UTC (permalink / raw) To: lkml > reads the RTC device. The patched RTC driver can then > measure the elapsed time between the interrupt and the > read from userspace. Voila: latency. interesting, but I'm not sure there's much advantage over doing it entirely in user-space with the normal /dev/rtc: http://brain.mcmaster.ca/~hahn/realfeel.c it just prints out the raw time difference from when rtc should have woken up the program. you can do your own histogram; for summary purposes, something like stdev is probably best. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 23:44 ` Jonathan Morton 2001-06-09 2:36 ` Andrew Morton @ 2001-06-09 3:43 ` Mike Galbraith 2001-06-09 4:05 ` Jonathan Morton 2 siblings, 0 replies; 106+ messages in thread From: Mike Galbraith @ 2001-06-09 3:43 UTC (permalink / raw) To: Jonathan Morton Cc: Marcelo Tosatti, John Stoffel, Tobias Ringstrom, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Sat, 9 Jun 2001, Jonathan Morton wrote: > On the subject of Mike Galbraith's kernel compilation test, how much > physical RAM does he have for his machine, what type of CPU is it, and what > (approximate) type of device does he use for swap? I'll see if I can > partially duplicate his results at this end. So far all my tests have been > done with a fast CPU - perhaps I should try the P166/MMX or even try > loading linux-pmac onto my 8100. It's a PIII/500 with one ide disk. -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-08 23:44 ` Jonathan Morton 2001-06-09 2:36 ` Andrew Morton 2001-06-09 3:43 ` Mike Galbraith @ 2001-06-09 4:05 ` Jonathan Morton 2001-06-09 5:09 ` Mike Galbraith 2 siblings, 1 reply; 106+ messages in thread From: Jonathan Morton @ 2001-06-09 4:05 UTC (permalink / raw) To: Mike Galbraith Cc: Marcelo Tosatti, John Stoffel, Tobias Ringstrom, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm >> On the subject of Mike Galbraith's kernel compilation test, how much >> physical RAM does he have for his machine, what type of CPU is it, and what >> (approximate) type of device does he use for swap? I'll see if I can >> partially duplicate his results at this end. So far all my tests have been >> done with a fast CPU - perhaps I should try the P166/MMX or even try >> loading linux-pmac onto my 8100. > >It's a PIII/500 with one ide disk. ...with how much RAM? That's the important bit. -------------------------------------------------------------- from: Jonathan "Chromatix" Morton mail: chromi@cyberspace.org (not for attachments) The key to knowledge is not to rely on people to teach you it. GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: VM Report was:Re: Break 2.4 VM in five easy steps 2001-06-09 4:05 ` Jonathan Morton @ 2001-06-09 5:09 ` Mike Galbraith 0 siblings, 0 replies; 106+ messages in thread From: Mike Galbraith @ 2001-06-09 5:09 UTC (permalink / raw) To: Jonathan Morton Cc: Marcelo Tosatti, John Stoffel, Tobias Ringstrom, Shane Nay, Dr S.M. Huen, Sean Hunter, Xavier Bestel, lkml, linux-mm On Sat, 9 Jun 2001, Jonathan Morton wrote: > >> On the subject of Mike Galbraith's kernel compilation test, how much > >> physical RAM does he have for his machine, what type of CPU is it, and what > >> (approximate) type of device does he use for swap? I'll see if I can > >> partially duplicate his results at this end. So far all my tests have been > >> done with a fast CPU - perhaps I should try the P166/MMX or even try > >> loading linux-pmac onto my 8100. > > > >It's a PIII/500 with one ide disk. > > ...with how much RAM? That's the important bit. Duh! :) I'm a dipstick. 128mb. -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:54 ` Sean Hunter 2001-06-06 9:57 ` Dr S.M. Huen @ 2001-06-06 10:04 ` Jonathan Morton 2001-06-06 11:16 ` Daniel Phillips ` (3 subsequent siblings) 5 siblings, 0 replies; 106+ messages in thread From: Jonathan Morton @ 2001-06-06 10:04 UTC (permalink / raw) To: Sean Hunter, Xavier Bestel; +Cc: linux-kernel >I am waiting patiently for the bug to be fixed. However, it is a real >embarrassment that we can't run this "stable" kernel in production yet >because something as fundamental as this is so badly broken. Rest assured that a fix is in the works. I'm already seeing a big improvement in behaviour on my Athlon (256Mb RAM, but testing using mem=32M and mem=48M), and I strongly believe that we're making progress here. Maybe some of the more significant improvements will find their way into 2.4.6. -------------------------------------------------------------- from: Jonathan "Chromatix" Morton mail: chromi@cyberspace.org (not for attachments) big-mail: chromatix@penguinpowered.com uni-mail: j.d.morton@lancaster.ac.uk The key to knowledge is not to rely on people to teach you it. Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/ -----BEGIN GEEK CODE BLOCK----- Version 3.12 GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*) -----END GEEK CODE BLOCK----- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:54 ` Sean Hunter 2001-06-06 9:57 ` Dr S.M. Huen 2001-06-06 10:04 ` Jonathan Morton @ 2001-06-06 11:16 ` Daniel Phillips 2001-06-06 13:58 ` Gerhard Mack ` (2 subsequent siblings) 5 siblings, 0 replies; 106+ messages in thread From: Daniel Phillips @ 2001-06-06 11:16 UTC (permalink / raw) To: Sean Hunter, Xavier Bestel; +Cc: linux-kernel On Wednesday 06 June 2001 10:54, Sean Hunter wrote: > > Did you try to put twice as much swap as you have RAM ? (e.g. add a > > 512M swapfile to your box) > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying > > that anything less won't do any good: 2.4 overallocates swap even > > if it doesn't use it all. So in your case you just have enough swap > > to map your RAM, and nothing to really swap your apps. > > For large memory boxes, this is ridiculous. Should I have 8GB of > swap? And laptops with big memories and small disks. -- Daniel ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:54 ` Sean Hunter ` (2 preceding siblings ...) 2001-06-06 11:16 ` Daniel Phillips @ 2001-06-06 13:58 ` Gerhard Mack 2001-06-08 4:56 ` C. Martins 2001-06-06 15:28 ` Richard Gooch 2001-06-06 19:11 ` android 5 siblings, 1 reply; 106+ messages in thread From: Gerhard Mack @ 2001-06-06 13:58 UTC (permalink / raw) To: Sean Hunter; +Cc: Xavier Bestel, linux-kernel On Wed, 6 Jun 2001, Sean Hunter wrote: > On Wed, Jun 06, 2001 at 10:19:30AM +0200, Xavier Bestel wrote: > > On 05 Jun 2001 23:19:08 -0400, Derek Glidden wrote: > > > On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote: > > > > "Jeffrey W. Baker" wrote: > > > > > > > > > > Because the 2.4 VM is so broken, and > > > > > because my machines are frequently deeply swapped, > > > > > > > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > > > > The problem *appears* worse in 2.4 because it uses lots > > > > more swap. > > > > > > I disagree with the terminology you're using. It *is* worse in 2.4, > > > period. If it only *appears* worse, then if I encounter a situation > > > where a 2.2 box has utilized as much swap as a 2.4 box, I should see the > > > same results. Yet this happens not to be the case. > > > > Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M > > swapfile to your box) > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that > > anything less won't do any good: 2.4 overallocates swap even if it > > doesn't use it all. So in your case you just have enough swap to map > > your RAM, and nothing to really swap your apps. > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > Sean I have several boxes with 2x ram as swap and performance still sucks compared to 2.2.17. Gerhard -- Gerhard Mack gmack@innerfire.net <>< As a computer I find your faith in technology amusing. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 13:58 ` Gerhard Mack @ 2001-06-08 4:56 ` C. Martins 0 siblings, 0 replies; 106+ messages in thread From: C. Martins @ 2001-06-08 4:56 UTC (permalink / raw) To: linux-kernel My everyday desktop workstation (PII 350) has 64MB of RAM and uses 300MB of swap, 150MB on each hard disk. After upgrading to 2.4, and maintaining the same set of applications (KDE, Netscape & friends), the machine's performance is _definitely_ much worse, in terms of responsiveness and throughput. Most applications just take much longer to load, and once you've done something that required more memory for a while (like compiling a kernel, opening a large JPEG in gimp, etc) it takes a long time to come back to normal. Strangely, with 2.4 the workstation just feels as if someone stole the 64MB DIMM and put in a 16MB one!! One thing I find strange is that with 2.4, if you run top or something similar, you notice that memory allocated for cache is almost always more than half of total RAM. I don't remember seeing this with the 2.2 kernel series... Anyway, I think there is something really broken with respect to the 2.4 VM. It is just NOT acceptable that when you run the same set of apps and the same type of work and upgrade your kernel, your hardware is no longer up to the job, when it fit perfectly well before. This is just the MS way of solving problems. Best regards Claudio Martins On Wed, Jun 06, 2001 at 06:58:39AM -0700, Gerhard Mack wrote: > > I have several boxes with 2x ram as swap and performance still sucks > compared to 2.2.17. > ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:54 ` Sean Hunter ` (3 preceding siblings ...) 2001-06-06 13:58 ` Gerhard Mack @ 2001-06-06 15:28 ` Richard Gooch 2001-06-06 15:42 ` Christian Bornträger 2001-06-06 17:14 ` Break 2.4 VM in five easy steps Ben Greear 2001-06-06 19:11 ` android 5 siblings, 2 replies; 106+ messages in thread From: Richard Gooch @ 2001-06-06 15:28 UTC (permalink / raw) To: Daniel Phillips; +Cc: Sean Hunter, Xavier Bestel, linux-kernel Daniel Phillips writes: > On Wednesday 06 June 2001 10:54, Sean Hunter wrote: > > > > Did you try to put twice as much swap as you have RAM ? (e.g. add a > > > 512M swapfile to your box) > > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying > > > that anything less won't do any good: 2.4 overallocates swap even > > > if it doesn't use it all. So in your case you just have enough swap > > > to map your RAM, and nothing to really swap your apps. > > > > For large memory boxes, this is ridiculous. Should I have 8GB of > > swap? Sure. It's cheap. If you don't mind slumming it, go and buy a 20 GB IDE drive for US$65. I know RAM has gotten a lot cheaper lately (US$66 for a 512 MiB PC133 DIMM), but it's still far more expensive. If you can afford 4 GiB of RAM, you can definitely afford 8 GiB of swap. > And laptops with big memories and small disks. That's not that common, though. Usually you get far more disc than RAM on a laptop, just as with a desktop. Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 15:28 ` Richard Gooch @ 2001-06-06 15:42 ` Christian Bornträger 2001-06-06 15:57 ` Requirement: swap = RAM x 2.5 ?? Jeff Garzik 2001-06-06 16:12 ` Richard Gooch 2001-06-06 17:14 ` Break 2.4 VM in five easy steps Ben Greear 1 sibling, 2 replies; 106+ messages in thread From: Christian Bornträger @ 2001-06-06 15:42 UTC (permalink / raw) To: linux-kernel OK, Linus said that if I use swap it should be at least twice as much as RAM. There will be much more discussion about it; to me this constraint is a very, very bad idea. Have you ever thought about diskless workstations? Swapping over a network sounds ugly. Nevertheless, my question is: what happens if I plan to use no swap? I have enough memory installed for my purposes, and every swapping operation can do only one thing: slow down the system. Is there a different behaviour if I completely disable swap? greetings Christian Bornträger ^ permalink raw reply [flat|nested] 106+ messages in thread
* Requirement: swap = RAM x 2.5 ?? 2001-06-06 15:42 ` Christian Bornträger @ 2001-06-06 15:57 ` Jeff Garzik 2001-06-06 18:42 ` Eric W. Biederman 2001-06-06 16:12 ` Richard Gooch 1 sibling, 1 reply; 106+ messages in thread From: Jeff Garzik @ 2001-06-06 15:57 UTC (permalink / raw) To: linux-kernel, linux-mm; +Cc: Christian Bornträger, Derek Glidden I'm sorry but this is a regression, plain and simple. Previous versions of Linux have worked great on diskless workstations with NO swap. Swap is "extra space to be used if we have it" and nothing else. -- Jeff Garzik | Andre the Giant has a posse. Building 1024 | MandrakeSoft | ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Requirement: swap = RAM x 2.5 ?? 2001-06-06 15:57 ` Requirement: swap = RAM x 2.5 ?? Jeff Garzik @ 2001-06-06 18:42 ` Eric W. Biederman 2001-06-07 1:29 ` Jan Harkes 0 siblings, 1 reply; 106+ messages in thread From: Eric W. Biederman @ 2001-06-06 18:42 UTC (permalink / raw) To: Jeff Garzik; +Cc: linux-kernel, linux-mm, Christian Bornträger, Derek Glidden Jeff Garzik <jgarzik@mandrakesoft.com> writes: > I'm sorry but this is a regression, plain and simple. > > Previous versions of Linux have worked great on diskless workstations > with NO swap. > > Swap is "extra space to be used if we have it" and nothing else. Given the slow speed of disks, some additional rules apply to using them efficiently when swap is in use. In the worst case when swapping is being used you get: Virtual Memory = RAM + (swap - RAM). That cannot be improved. You can increase the likelihood that that case won't come up, but that is a different matter entirely. I suspect in practice that we are suffering more from lazy reclamation of swap pages than from a more aggressive swap cache. Eric ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Requirement: swap = RAM x 2.5 ?? 2001-06-06 18:42 ` Eric W. Biederman @ 2001-06-07 1:29 ` Jan Harkes 0 siblings, 0 replies; 106+ messages in thread From: Jan Harkes @ 2001-06-07 1:29 UTC (permalink / raw) To: Eric W. Biederman Cc: Jeff Garzik, linux-kernel, linux-mm, Christian Bornträger, Derek Glidden On Wed, Jun 06, 2001 at 12:42:03PM -0600, Eric W. Biederman wrote: > Jeff Garzik <jgarzik@mandrakesoft.com> writes: > > I'm sorry but this is a regression, plain and simple. > > > > Previous versions of Linux have worked great on diskless workstations > > with NO swap. > > > > Swap is "extra space to be used if we have it" and nothing else. > > Given the slow speed of disks, some additional rules apply to using > them efficiently when swap is in use. > > In the worst case when swapping is being used you get: > Virtual Memory = RAM + (swap - RAM). > > That cannot be improved. You can increase the likelihood that that > case won't come up, but that is a different matter entirely. I believe you are taking the right approach to the problem, which is not to complain that we need 2*RAM, but to try and figure out _why_ we need 2*RAM. As far as I can make out, any page that at one time got swapped out will remain in swap. It is even there when there are no more references to the page, but will be reclaimed lazily (i.e. when we need to swap something new out). I'm assuming the reason we need SWAP > RAM is that once swap is filled, only the subset of VM users that occupy this space are candidates for further swapping. I'm assuming this probably has a significant impact on long-running processes, which have more chance of being pushed into swap at some point. The advantage of this is that when we need to remove a clean page that is already in swap, we can simply discard the copy in RAM, paying only a swap-in penalty. Dirty pages will have to be re-written, but we don't need to find a place to put them: swap is already reserved.
If we wanted to reclaim swap pages that have been swapped back into RAM, we would need to find a place to swap to, swap the page out, and eventually swap it back in. Obviously a lot more expensive. However, we must have pushed the page into swap because it was not 'pageable', i.e. it got dirtied and there is no underlying file to write it back to: shm, private mmap, or dirty heap. So there is in fact a high likelihood that the page will not be clean when we have to swap it out again. Now if we were to reclaim not just dead swap pages, but also pages that have been swapped in and then dirtied, the 'additional cost' only involves finding a place to swap to. The nice thing is that with fewer swap pages in use, as a result of aggressively reclaiming pages that were swapped out but have in the meantime been swapped back in and dirtied, it should become a lot easier to find a free spot (until we are really overcommitted). I don't know how feasible it is to tell from a given swap page whether there is a dirtied copy present in RAM, but we could drop the swap reference when the copy in RAM is modified, turning the swap page into a dead page and letting the regular reclamation pick it up. Jan ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Requirement: swap = RAM x 2.5 ?? 2001-06-06 15:42 ` Christian Bornträger 2001-06-06 15:57 ` Requirement: swap = RAM x 2.5 ?? Jeff Garzik @ 2001-06-06 16:12 ` Richard Gooch 2001-06-06 16:15 ` Jeff Garzik ` (2 more replies) 1 sibling, 3 replies; 106+ messages in thread From: Richard Gooch @ 2001-06-06 16:12 UTC (permalink / raw) To: Jeff Garzik Cc: linux-kernel, linux-mm, Christian Bornträger, Derek Glidden Jeff Garzik writes: > > I'm sorry but this is a regression, plain and simple. > > Previous versons of Linux have worked great on diskless workstations > with NO swap. > > Swap is "extra space to be used if we have it" and nothing else. Sure. But Linux still works without swap. It's just that if you *do* have swap, it works best with 2* RAM. Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Requirement: swap = RAM x 2.5 ?? 2001-06-06 16:12 ` Richard Gooch @ 2001-06-06 16:15 ` Jeff Garzik 2001-06-06 16:19 ` Richard Gooch 2001-06-06 17:05 ` Greg Hennessy 2 siblings, 0 replies; 106+ messages in thread From: Jeff Garzik @ 2001-06-06 16:15 UTC (permalink / raw) To: Richard Gooch Cc: linux-kernel, linux-mm, Christian Bornträger, Derek Glidden Richard Gooch wrote: > > Jeff Garzik writes: > > > > I'm sorry but this is a regression, plain and simple. > > > > Previous versons of Linux have worked great on diskless workstations > > with NO swap. > > > > Swap is "extra space to be used if we have it" and nothing else. > > Sure. But Linux still works without swap. It's just that if you *do* > have swap, it works best with 2* RAM. Yes, but that's not the point of the discussion. Currently 2*RAM is more of a requirement than a recommendation. -- Jeff Garzik | Andre the Giant has a posse. Building 1024 | MandrakeSoft | ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Requirement: swap = RAM x 2.5 ?? 2001-06-06 16:12 ` Richard Gooch 2001-06-06 16:15 ` Jeff Garzik @ 2001-06-06 16:19 ` Richard Gooch 2001-06-06 16:53 ` Mike Galbraith 2001-06-06 17:05 ` Greg Hennessy 2 siblings, 1 reply; 106+ messages in thread From: Richard Gooch @ 2001-06-06 16:19 UTC (permalink / raw) To: Jeff Garzik Cc: linux-kernel, linux-mm, Christian Bornträger, Derek Glidden Jeff Garzik writes: > Richard Gooch wrote: > > > > Jeff Garzik writes: > > > > > > I'm sorry but this is a regression, plain and simple. > > > > > > Previous versons of Linux have worked great on diskless workstations > > > with NO swap. > > > > > > Swap is "extra space to be used if we have it" and nothing else. > > > > Sure. But Linux still works without swap. It's just that if you *do* > > have swap, it works best with 2* RAM. > > Yes, but that's not the point of the discussion. Currently 2*RAM is > more of a requirement than a recommendation. Um, do you mean "2*RAM is required, always", or "2*RAM or more swap is required if swap != 0"? Regards, Richard.... Permanent: rgooch@atnf.csiro.au Current: rgooch@ras.ucalgary.ca ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Requirement: swap = RAM x 2.5 ?? 2001-06-06 16:19 ` Richard Gooch @ 2001-06-06 16:53 ` Mike Galbraith 0 siblings, 0 replies; 106+ messages in thread From: Mike Galbraith @ 2001-06-06 16:53 UTC (permalink / raw) To: Richard Gooch Cc: Jeff Garzik, linux-kernel, linux-mm, Christian Bornträger, Derek Glidden On Wed, 6 Jun 2001, Richard Gooch wrote: > Jeff Garzik writes: > > Richard Gooch wrote: > > > > > > Jeff Garzik writes: > > > > > > > > I'm sorry but this is a regression, plain and simple. > > > > > > > > Previous versons of Linux have worked great on diskless workstations > > > > with NO swap. > > > > > > > > Swap is "extra space to be used if we have it" and nothing else. > > > > > > Sure. But Linux still works without swap. It's just that if you *do* > > > have swap, it works best with 2* RAM. > > > > Yes, but that's not the point of the discussion. Currently 2*RAM is > > more of a requirement than a recommendation. > > Um, do you mean "2*RAM is required, always", or "2*RAM or more swap is > required if swap != 0"? When Rik starts to reclaim unused swap (didn't he say he was going to do that?) this will instantly revert to the most respected of rules.. rules are made to be b0rken. -Mike ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Requirement: swap = RAM x 2.5 ?? 2001-06-06 16:12 ` Richard Gooch 2001-06-06 16:15 ` Jeff Garzik 2001-06-06 16:19 ` Richard Gooch @ 2001-06-06 17:05 ` Greg Hennessy 2 siblings, 0 replies; 106+ messages in thread From: Greg Hennessy @ 2001-06-06 17:05 UTC (permalink / raw) To: linux-kernel In article <200106061612.f56GCbA14901@vindaloo.ras.ucalgary.ca>, Richard Gooch <rgooch@ras.ucalgary.ca> wrote: > > Swap is "extra space to be used if we have it" and nothing else. > > Sure. But Linux still works without swap. It's just that if you *do* > have swap, it works best with 2* RAM. There is a large difference between saying Linux works best with 2*RAM, and Linux requires 2*RAM. ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 15:28 ` Richard Gooch 2001-06-06 15:42 ` Christian Bornträger @ 2001-06-06 17:14 ` Ben Greear 1 sibling, 0 replies; 106+ messages in thread From: Ben Greear @ 2001-06-06 17:14 UTC (permalink / raw) To: Richard Gooch; +Cc: Daniel Phillips, Sean Hunter, Xavier Bestel, linux-kernel Richard Gooch wrote: > > Daniel Phillips writes: > > On Wednesday 06 June 2001 10:54, Sean Hunter wrote: > > > > > > Did you try to put twice as much swap as you have RAM ? (e.g. add a > > > > 512M swapfile to your box) > > > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying > > > > that anything less won't do any good: 2.4 overallocates swap even > > > > if it doesn't use it all. So in your case you just have enough swap > > > > to map your RAM, and nothing to really swap your apps. > > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of > > > swap? > > Sure. It's cheap. If you don't mind slumming it, go and buy a 20 GB > IDE drive for US$65. I know RAM has gotten a lot cheaper lately (US$66 > for a 512 MiB PC133 DIMM), but it's still far more expensive. If you > can afford 4 GiB of RAM, you can definitely afford 8 GiB of swap. For me, the problem is not the money. If I have a system that needs 4GB of RAM, it is highly unlikely that I would ever want to run this machine with 8GB of swap active. However, I may be willing to tolerate 1GB of swapping before paging to disk slowed things down too much. This is the exact scenario I had when dealing with a large Sun machine running Oracle & some other stuff. Oracle has large amounts of RAM dedicated to it, but if I want to run a quick, memory-intensive program too (and at the moment performance isn't all that big a deal), then using some swap is OK. So, I too cast my vote for the 2*RAM requirement being odious and in need of fixing!!
It could be a suggestion, but I would consider that if not following the suggestion causes more than a 10% slowdown, then things are still broken; optimally, it should work like 2.2 does (in other words, I don't notice, and don't particularly care, how much swap per RAM I need, just how much total RAM-like-stuff I need.) Thanks, Ben -- Ben Greear <greearb@candelatech.com> <Ben_Greear@excite.com> President of Candela Technologies Inc http://www.candelatech.com ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:54 ` Sean Hunter ` (4 preceding siblings ...) 2001-06-06 15:28 ` Richard Gooch @ 2001-06-06 19:11 ` android 2001-06-07 0:27 ` Mike A. Harris 5 siblings, 1 reply; 106+ messages in thread From: android @ 2001-06-06 19:11 UTC (permalink / raw) To: Sean Hunter; +Cc: linux-kernel >Furthermore, I am not demanding anything, much less "priority fixing" >for this bug. It's my personal opinion that this is the most critical bug >in the 2.4 series, and if I had the time and skill, this is what I would >be working on. Because I don't have the time and skill, I am perfectly >happy to wait until those that do fix the problem. To say it isn't a >problem because I can buy more disk is nonsense, and it's that sort of >thinking that leads to a constant need to upgrade hardware in the >proprietary OS world. > >Sean This would reflect the Microsoft way of programming: If there's a bug in the system, don't fix it, but upgrade your hardware. Why do you think the requirements for Windows are so great? Most of their code is very inefficient. I'm sure they programmed their kernel in Visual Basic. The worst part is that they get paid to do this! I program in Linux because I don't want to be associated with that mindset that made Microsoft such a [fill in the blank]. As for the 2.4 VM problem, what are you doing with your machine that's making it use up so much memory? I have several processes running on mine all the time, including a slew in X, and I have yet to see significant swap activity. -- Ted P.S. My faithful Timex Sinclair from the 80's never had swap :-) ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 19:11 ` android @ 2001-06-07 0:27 ` Mike A. Harris 0 siblings, 0 replies; 106+ messages in thread From: Mike A. Harris @ 2001-06-07 0:27 UTC (permalink / raw) To: android; +Cc: Sean Hunter, linux-kernel On Wed, 6 Jun 2001, android wrote: >associated with that mindset that made Microsoft such a [fill in the blank]. >As for the 2.4 VM problem, what are you doing with your machine that's >making it use up so much memory? I have several processes running >on mine all the time, including a slew in X, and I have yet to see >significant swap activity. Try _compiling_ XFree86. Watch the machine nosedive. ---------------------------------------------------------------------- Mike A. Harris - Linux advocate - Open Source advocate Opinions and viewpoints expressed are solely my own. ---------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:19 ` Xavier Bestel 2001-06-06 8:54 ` Sean Hunter @ 2001-06-06 9:16 ` Xavier Bestel 2001-06-06 9:25 ` Sean Hunter 2001-06-06 12:07 ` Jonathan Morton ` (2 subsequent siblings) 4 siblings, 1 reply; 106+ messages in thread From: Xavier Bestel @ 2001-06-06 9:16 UTC (permalink / raw) To: Sean Hunter; +Cc: linux-kernel On 06 Jun 2001 09:54:31 +0100, Sean Hunter wrote: > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that > > anything less won't do any good: 2.4 overallocates swap even if it > > doesn't use it all. So in your case you just have enough swap to map > > your RAM, and nothing to really swap your apps. > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? Life is tough. I guess if you have 4GB RAM, you'd be better off having no swap at all. Or, yes, at least 8GB. Or just wait for this bug to be fixed. But be patient. Xav ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 9:16 ` Xavier Bestel @ 2001-06-06 9:25 ` Sean Hunter 0 siblings, 0 replies; 106+ messages in thread From: Sean Hunter @ 2001-06-06 9:25 UTC (permalink / raw) To: Xavier Bestel; +Cc: linux-kernel On Wed, Jun 06, 2001 at 11:16:27AM +0200, Xavier Bestel wrote: > On 06 Jun 2001 09:54:31 +0100, Sean Hunter wrote: > > > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that > > > anything less won't do any good: 2.4 overallocates swap even if it > > > doesn't use it all. So in your case you just have enough swap to map > > > your RAM, and nothing to really swap your apps. > > > > > > > For large memory boxes, this is ridiculous. Should I have 8GB of swap? > > Life is tough. I guess if you have 4GB RAM, you'd be better off having no > swap at all. Or, yes, at least 8GB. > Or just wait for this bug to be fixed. But be patient. This is just pure bollocks. Virtual memory is one of the killer features of unix. It would be a strange admission to say that our "advanced" 2.4 kernel is so advanced that now you can't use virtual memory at all on large machines. Needing 8GB of swap to prevent a box from committing suicide when it has a working set of less than 512M is crazy. I am waiting patiently for the bug to be fixed. However, it is a real embarrassment that we can't run this "stable" kernel in production yet because something as fundamental as this is so badly broken. Sean ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:19 ` Xavier Bestel 2001-06-06 8:54 ` Sean Hunter 2001-06-06 9:16 ` Xavier Bestel @ 2001-06-06 12:07 ` Jonathan Morton 2001-06-06 14:41 ` Derek Glidden 2001-06-06 20:29 ` José Luis Domingo López 4 siblings, 0 replies; 106+ messages in thread From: Jonathan Morton @ 2001-06-06 12:07 UTC (permalink / raw) To: Daniel Phillips, Sean Hunter, Xavier Bestel; +Cc: linux-kernel >> > Did you try to put twice as much swap as you have RAM ? (e.g. add a >> > 512M swapfile to your box) >> > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying >> > that anything less won't do any good: 2.4 overallocates swap even >> > if it doesn't use it all. So in your case you just have enough swap >> > to map your RAM, and nothing to really swap your apps. >> >> For large memory boxes, this is ridiculous. Should I have 8GB of >> swap? > >And laptops with big memories and small disks. Strongly agree. I have a PowerBook G3 with 320Mb RAM. The 18Gb HD is shared between a total of 4 operating systems. I haven't got space to put 2/3rds of a Gb of swap on it - in fact I use only 128Mb of swap under Linux, and don't usually have a problem. MacOS X uses whatever disk space it needs, from the volumes currently mounted. MacOS 9.0.4 is configured to run totally without swap. Windoze 95 is configured to run in its usual bloated way, from a total of about 1Gb of virtual HD. I'm glad to report that with the new fixes being worked on at present, swap usage is relatively minimal under the test loads I am able to subject my Athlon to. With mem=32M, compiling MySQL goes 65Mb into swap at maximum, during compilation of a particularly massive C++ file. Compilation takes 2h15m under these conditions, which is a little slow but that's what happens when a system starts thrashing heavily. With mem=48M, compilation completes in about 6m30s, which compares well with the 5-minute "best case" compile time with unrestricted memory available.
I didn't check the total swap usage on that run, but it was less than the 65Mb used with mem=32M. After the monster file had completed, the swap balance was largely restored within a few files' compilation - something which doesn't happen with stock 2.4.x. With mem=32M, I can sensibly load XFree86 v4, KDE 1.2, XMMS, a webcam app and Netscape 4.6. XMMS glitches occasionally (not often, and not particularly seriously) as I switch between 1600x1200x24bpp virtual desktops, and swapping gets heavy at times, but the system is essentially usable and avoids thrashing. This weekend, I'll treat a friend with an ageing Cyrix machine to the patches and see if she notices the difference - the answer will probably be yes. It remains to be seen how industrial-sized applications fare with the changes, but I strongly suspect that any reaction will be positive rather than negative. Industrial applications *should* be running as if no swap was available, in any case... -------------------------------------------------------------- from: Jonathan "Chromatix" Morton mail: chromi@cyberspace.org (not for attachments) big-mail: chromatix@penguinpowered.com uni-mail: j.d.morton@lancaster.ac.uk The key to knowledge is not to rely on people to teach you it. Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/ -----BEGIN GEEK CODE BLOCK----- Version 3.12 GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*) -----END GEEK CODE BLOCK----- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:19 ` Xavier Bestel ` (2 preceding siblings ...) 2001-06-06 12:07 ` Jonathan Morton @ 2001-06-06 14:41 ` Derek Glidden 2001-06-06 20:29 ` José Luis Domingo López 4 siblings, 0 replies; 106+ messages in thread From: Derek Glidden @ 2001-06-06 14:41 UTC (permalink / raw) To: Xavier Bestel; +Cc: linux-kernel Xavier Bestel wrote: > > Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M > swapfile to your box) > This is what Linus recommended for 2.4 (swap = 2 * RAM), saying that > anything less won't do any good: 2.4 overallocates swap even if it > doesn't use it all. So in your case you just have enough swap to map > your RAM, and nothing to really swap your apps. Yes, the example given is against the machine at work, which is configured 512/512. My machine at home is configured 512/1024 and has the same problems. Further, this machine *used* to have only 256MB of RAM, and I could still cause the misbehaviour. -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- #!/usr/bin/perl -w $_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map {$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110; $t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z) [$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join "",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d= unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d >>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q* 8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]} print+x"C*",@a}';s/x/pack+/g;eval usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \ | extract_mpeg2 | mpeg2dec - http://www.eff.org/ http://www.opendvd.org/ http://www.cs.cmu.edu/~dst/DeCSS/Gallery/ ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 8:19 ` Xavier Bestel ` (3 preceding siblings ...) 2001-06-06 14:41 ` Derek Glidden @ 2001-06-06 20:29 ` José Luis Domingo López 4 siblings, 0 replies; 106+ messages in thread From: José Luis Domingo López @ 2001-06-06 20:29 UTC (permalink / raw) To: linux-kernel On Wednesday, 06 June 2001, at 10:19:30 +0200, Xavier Bestel wrote: > On 05 Jun 2001 23:19:08 -0400, Derek Glidden wrote: > > On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote: > [...] > Did you try to put twice as much swap as you have RAM ? (e.g. add a 512M > swapfile to your box) > I'm not a kernel guru, nor can I even try to understand how an operating system's memory management is designed or behaves. But I have some questions and thoughts: 1. Is swap=2xRAM a design issue, or just a recommendation to get best results _based_ on the current VM subsystem status? 2. Wouldn't performance drop quickly when the VM starts to swap processes/pages to disk, instead of keeping them in RAM? Maybe having a couple of GB worth of processes on disk is not very wise. 3. Shouldn't an ideal VM manage swap space as an extension of the system's RAM (of course, taking into account that RAM is much faster than HD, and nothing should be on swap if there is enough room in RAM)? 4. Wouldn't you say that "adding more swap" (maybe 2xRAM is a recommendation, maybe a temporary fix, maybe a design decision) is the M$ way of fixing things? If there is a _real_ need for more swap to get a well-behaving system, let's add swap. But we shouldn't hide inner design and/or implementation problems under the "cheap multigigabyte disks" argument. 5. AFAIK, kernel developers are well aware of current 2.4.x problems in some areas. I don't think insisting on certain problems without providing ideas, testing, or support, and limiting oneself to just blaming the authors, is the best way to go. Maybe kernel hackers are the most interested of all in fixing all these issues ASAP.
Just some thoughts from someone unable to write C code and help fix this mess ;). -- José Luis Domingo López Linux Registered User #189436 Debian GNU/Linux Potato (P166 64 MB RAM) jdomingo EN internautas PUNTO org => ¿ Spam ? Atente a las consecuencias jdomingo AT internautas DOT org => Spam at your own risk ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 2:16 ` Andrew Morton ` (2 preceding siblings ...) 2001-06-06 8:19 ` Xavier Bestel @ 2001-06-06 13:32 ` Eric W. Biederman 2001-06-06 14:41 ` Marc Heckmann 2001-06-06 14:51 ` Hugh Dickins 5 siblings, 0 replies; 106+ messages in thread From: Eric W. Biederman @ 2001-06-06 13:32 UTC (permalink / raw) To: Andrew Morton; +Cc: Jeffrey W. Baker, Derek Glidden, linux-kernel Andrew Morton <andrewm@uow.edu.au> writes: > "Jeffrey W. Baker" wrote: > > > > Because the 2.4 VM is so broken, and > > because my machines are frequently deeply swapped, > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > The problem *appears* worse in 2.4 because it uses lots > more swap. And 2.4 does delayed swap deallocation. We don't appear to optimize the case where a page is only used by the swap cache. That should be able to save some cpu overhead if nothing else. And I do know that in the early 2.2 timeframe, swapoff was used to generate an artificially high VM load, for testing the VM. It looks like that testing procedure has been abandoned :) > > they can sometimes take over 30 minutes to shutdown. > > Yes. The sys_swapoff() system call can take many minutes > of CPU time. It basically does: > > for (each page in swap device) { > for (each process) { > for (each page used by this process) > stuff > > It's interesting that you've found a case where this > actually has an operational impact. Agreed. > Haven't looked at it closely, but I think the algorithm > could become something like: > > for (each process) { > for (each page in this process) { > if (page is on target swap device) > get_it_off() > } > } > > for (each page in swap device) { > if (it is busy) > complain() > } You would need to handle the shared memory case as well. But otherwise this looks sound.
I would suggest going through page->address_space->i_mmap_shared to find all of the potential mappings, but the swapper address space is used by all processes that have pages in swap. > That's 10^4 to 10^6 times faster. It looks like it could be. The bottleneck should be disk I/O; if it is not, we have a noticeably inefficient algorithm. Eric ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 2:16 ` Andrew Morton ` (3 preceding siblings ...) 2001-06-06 13:32 ` Eric W. Biederman @ 2001-06-06 14:41 ` Marc Heckmann 2001-06-06 14:51 ` Hugh Dickins 5 siblings, 0 replies; 106+ messages in thread From: Marc Heckmann @ 2001-06-06 14:41 UTC (permalink / raw) To: Andrew Morton; +Cc: Jeffrey W. Baker, Derek Glidden, linux-kernel On Wed, Jun 06, 2001 at 12:16:30PM +1000, Andrew Morton wrote: > "Jeffrey W. Baker" wrote: > > > > Because the 2.4 VM is so broken, and > > because my machines are frequently deeply swapped, > > The swapoff algorithms in 2.2 and 2.4 are basically identical. > The problem *appears* worse in 2.4 because it uses lots > more swap. exactly, I've seen this on a 2.2.16 box that went deep into swap. Although it didn't lock up, kswapd was using most of the CPU time during a swapoff. -mh ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 2:16 ` Andrew Morton ` (4 preceding siblings ...) 2001-06-06 14:41 ` Marc Heckmann @ 2001-06-06 14:51 ` Hugh Dickins 5 siblings, 0 replies; 106+ messages in thread From: Hugh Dickins @ 2001-06-06 14:51 UTC (permalink / raw) To: Andrew Morton; +Cc: Jeffrey W. Baker, Derek Glidden, linux-kernel On Wed, 6 Jun 2001, Andrew Morton wrote: > Yes. The sys_swapoff() system call can take many minutes > Haven't looked at it closely, but I think the algorithm > could become something like: > > for (each process) { > for (each page in this process) { > if (page is on target swap device) > get_it_off() > } > } Substitute "mm" for "process". Yes, of course, that would be vastly better than the present way (or is there some gotcha? it's hard to understand why someone would choose to write it the way it is). However... don't forget that another of the current swap problems is pages being left in the swap cache after they've been unmapped from (all) their mms - those need to be dealt with too. Hugh ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-05 23:38 ` Jeffrey W. Baker 2001-06-06 1:42 ` Russell Leighton 2001-06-06 2:16 ` Andrew Morton @ 2001-06-06 7:47 ` Jonathan Morton 2001-06-06 13:08 ` Eric W. Biederman 3 siblings, 0 replies; 106+ messages in thread From: Jonathan Morton @ 2001-06-06 7:47 UTC (permalink / raw) To: Sean Hunter, Russell Leighton; +Cc: linux-kernel >It seems bizarre that a 4GB machine with a working set _far_ lower than that >should be dying from OOM and swapping itself to death, but that's life in 2.4 >land. I posted a fix for the OOM problem long ago, and it didn't get integrated (even after I sent Alan a separated-out version from the larger patch it was embedded in). I'm going to re-introduce it soon, and hope that it gets a better hearing this time. -------------------------------------------------------------- from: Jonathan "Chromatix" Morton mail: chromi@cyberspace.org (not for attachments) big-mail: chromatix@penguinpowered.com uni-mail: j.d.morton@lancaster.ac.uk The key to knowledge is not to rely on people to teach you it. Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/ -----BEGIN GEEK CODE BLOCK----- Version 3.12 GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*) -----END GEEK CODE BLOCK----- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-05 23:38 ` Jeffrey W. Baker ` (2 preceding siblings ...) 2001-06-06 7:47 ` Jonathan Morton @ 2001-06-06 13:08 ` Eric W. Biederman 2001-06-06 16:48 ` Jeffrey W. Baker 3 siblings, 1 reply; 106+ messages in thread From: Eric W. Biederman @ 2001-06-06 13:08 UTC (permalink / raw) To: Jeffrey W. Baker; +Cc: Derek Glidden, linux-kernel "Jeffrey W. Baker" <jwbaker@acm.org> writes: > On Tue, 5 Jun 2001, Derek Glidden wrote: > > > > > After reading the messages to this list for the last couple of weeks and > > playing around on my machine, I'm convinced that the VM system in 2.4 is > > still severely broken. > > > > This isn't trying to test extreme low-memory pressure, just how the > > system handles recovering from going somewhat into swap, which is a real > > day-to-day problem for me, because I often run a couple of apps that > > most of the time live in RAM, but during heavy computation runs, can go > > a couple hundred megs into swap for a few minutes at a time. Whenever > > that happens, my machine always starts acting up afterwards, so I > > started investigating and found some really strange stuff going on. > > I reboot each of my machines every week, to take them offline for > intrusion detection. I use 2.4 because I need advanced features of > iptables that ipchains lacks. Because the 2.4 VM is so broken, and > because my machines are frequently deeply swapped, they can sometimes take > over 30 minutes to shutdown. They hang of course when the shutdown rc > script turns off the swap. The first few times this happened I assumed > they were dead. Interesting. Is it constant disk I/O, or constant CPU utilization? In any case you should be able to comment that line out of your shutdown rc script and be in perfectly good shape. Eric ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 13:08 ` Eric W. Biederman @ 2001-06-06 16:48 ` Jeffrey W. Baker 0 siblings, 0 replies; 106+ messages in thread From: Jeffrey W. Baker @ 2001-06-06 16:48 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-kernel On 6 Jun 2001, Eric W. Biederman wrote: > "Jeffrey W. Baker" <jwbaker@acm.org> writes: > > > On Tue, 5 Jun 2001, Derek Glidden wrote: > > > > > > > > After reading the messages to this list for the last couple of weeks and > > > playing around on my machine, I'm convinced that the VM system in 2.4 is > > > still severely broken. > > > > > > This isn't trying to test extreme low-memory pressure, just how the > > > system handles recovering from going somewhat into swap, which is a real > > > day-to-day problem for me, because I often run a couple of apps that > > > most of the time live in RAM, but during heavy computation runs, can go > > > a couple hundred megs into swap for a few minutes at a time. Whenever > > > that happens, my machine always starts acting up afterwards, so I > > > started investigating and found some really strange stuff going on. > > > > I reboot each of my machines every week, to take them offline for > > intrusion detection. I use 2.4 because I need advanced features of > > iptables that ipchains lacks. Because the 2.4 VM is so broken, and > > because my machines are frequently deeply swapped, they can sometimes take > > over 30 minutes to shutdown. They hang of course when the shutdown rc > > script turns off the swap. The first few times this happened I assumed > > they were dead. > > Interesting. Is it constant disk I/O? Or constant CPU utilization. > In any case you should be able to comment that line out of your shutdown > rc script and be in perfectly good shape. Well I can't exactly run top(1) at shutdown time, but the disks aren't running at all. Either the system is using the CPUs, or it is blocked waiting for something to happen. 
You're right about swapoff, we removed it from our shutdown script. ^ permalink raw reply [flat|nested] 106+ messages in thread
[parent not found: <m2lmn61ceb.fsf@sympatico.ca>]
* Re: Break 2.4 VM in five easy steps [not found] ` <m2lmn61ceb.fsf@sympatico.ca> @ 2001-06-06 14:37 ` Derek Glidden 2001-06-07 0:34 ` Mike A. Harris 0 siblings, 1 reply; 106+ messages in thread From: Derek Glidden @ 2001-06-06 14:37 UTC (permalink / raw) To: Bill Pringlemeir, linux-kernel Bill Pringlemeir wrote: > > [snip] > Derek> overwhelmed. On the system I'm using to write this, with > Derek> 512MB of RAM and 512MB of swap, I run two copies of this > > Please see the following message on the kernel mailing list, > > 3086:Linus 2.4.0 notes are quite clear that you need at least twice RAM of swap > Message-Id: <E155bG5-0008AX-00@the-village.bc.nu> Yes, I'm aware of this. However, I still believe that my original problem report is a BUG. No matter how much swap I have, or don't have, and how much is or isn't being used, running "swapoff" and forcing the VM subsystem to reclaim unused swap should NOT cause my machine to feign death for several minutes. I can easily take 256MB out of this machine, and then I *will* have twice as much swap as RAM and I can still cause the exact same behaviour. It's a bug, and no number of times saying "You need twice as much swap as RAM" will change that fact. -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 14:37 ` Derek Glidden @ 2001-06-07 0:34 ` Mike A. Harris 2001-06-07 3:13 ` Miles Lane 0 siblings, 1 reply; 106+ messages in thread From: Mike A. Harris @ 2001-06-07 0:34 UTC (permalink / raw) To: Derek Glidden; +Cc: Bill Pringlemeir, linux-kernel On Wed, 6 Jun 2001, Derek Glidden wrote: >> Derek> overwhelmed. On the system I'm using to write this, with >> Derek> 512MB of RAM and 512MB of swap, I run two copies of this >> >> Please see the following message on the kernel mailing list, >> >> 3086:Linus 2.4.0 notes are quite clear that you need at least twice RAM of swap >> Message-Id: <E155bG5-0008AX-00@the-village.bc.nu> > >Yes, I'm aware of this. > >However, I still believe that my original problem report is a BUG. No >matter how much swap I have, or don't have, and how much is or isn't >being used, running "swapoff" and forcing the VM subsystem to reclaim >unused swap should NOT cause my machine to feign death for several >minutes. > >I can easily take 256MB out of this machine, and then I *will* have >twice as much swap as RAM and I can still cause the exact same >behaviour. > >It's a bug, and no number of times saying "You need twice as much swap >as RAM" will change that fact. Precisely. Saying 8x RAM doesn't change it either. Sometime next week I'm going to purposefully put a new 60GB disk in on a separate controller as pure swap on top of 256MB of RAM. My guess is after bootup, and login, I'll have 48GB of stuff in swap "just in case". ---------------------------------------------------------------------- Mike A. Harris - Linux advocate - Open Source advocate Opinions and viewpoints expressed are solely my own. ---------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 0:34 ` Mike A. Harris @ 2001-06-07 3:13 ` Miles Lane 2001-06-07 15:49 ` Derek Glidden ` (2 more replies) 0 siblings, 3 replies; 106+ messages in thread From: Miles Lane @ 2001-06-07 3:13 UTC (permalink / raw) To: Mike A. Harris; +Cc: Derek Glidden, Bill Pringlemeir, linux-kernel On 06 Jun 2001 20:34:49 -0400, Mike A. Harris wrote: > On Wed, 6 Jun 2001, Derek Glidden wrote: > > >> Derek> overwhelmed. On the system I'm using to write this, with > >> Derek> 512MB of RAM and 512MB of swap, I run two copies of this > >> > >> Please see the following message on the kernel mailing list, > >> > >> 3086:Linus 2.4.0 notes are quite clear that you need at least twice RAM of swap > >> Message-Id: <E155bG5-0008AX-00@the-village.bc.nu> > > > >Yes, I'm aware of this. > > > >However, I still believe that my original problem report is a BUG. No > >matter how much swap I have, or don't have, and how much is or isn't > >being used, running "swapoff" and forcing the VM subsystem to reclaim > >unused swap should NOT cause my machine to feign death for several > >minutes. > > > >I can easily take 256MB out of this machine, and then I *will* have > >twice as much swap as RAM and I can still cause the exact same > >behaviour. > > > >It's a bug, and no number of times saying "You need twice as much swap > >as RAM" will change that fact. > > Precicely. Saying 8x RAM doesn't change it either. Sometime > next week I'm going to purposefully put a new 60Gb disk in on a > separate controller as pure swap on top of 256Mb of RAM. My > guess is after bootup, and login, I'll have 48Gb of stuff in > swap "just in case". Mike and others, I am getting tired of your comments. Sheesh. The various developers who actually work on the VM have already acknowledged the issues and are exploring fixes, including at least one patch that already exists. 
It seems clear that the uproar from the people who are having trouble with the new VM's handling of swap space has been heard, and folks are going to fix these problems. It may not happen today or tomorrow, but soon. What the heck else do you want? Making inflammatory remarks about the current situation does nothing to help get the problems fixed; it just wastes our time and bandwidth. So please, if you have new facts that you want to offer that will help us characterize and understand these VM issues better or discover new problems, feel free to share them. But if you just want to rant, I, for one, would rather you didn't. Miles ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 3:13 ` Miles Lane @ 2001-06-07 15:49 ` Derek Glidden 2001-06-07 19:06 ` Miles Lane 2001-06-09 5:57 ` Mike A. Harris 2 siblings, 0 replies; 106+ messages in thread From: Derek Glidden @ 2001-06-07 15:49 UTC (permalink / raw) To: Miles Lane; +Cc: linux-kernel Miles Lane wrote: > > So please, if you have new facts that you want to offer that > will help us characterize and understand these VM issues better > or discover new problems, feel free to share them. But if you > just want to rant, I, for one, would rather you didn't. *sigh* Not to prolong an already pointless thread, but that really was the intent of my original message. I had figured out a specific way, with easy-to-follow steps, to make the VM misbehave under very certain conditions. I even offered to help figure out a solution in any way I could, considering I'm not familiar with kernel code. However, I guess this whole "too much swap" issue has a lot of people on edge and immediately assumed I was talking about this subject, without actually reading my original message. -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- #!/usr/bin/perl -w $_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$t=255;@t=map {$_%16or$t^=$c^=($m=(11,10,116,100,11,122,20,100)[$_/16%8])&110; $t^=(72,@z=(64,72,$a^=12*($_%16-2?0:$m&17)),$b^=$_%64?12:0,@z) [$_%8]}(16..271);if((@a=unx"C*",$_)[20]&48){$h=5;$_=unxb24,join "",@b=map{xB8,unxb8,chr($_^$a[--$h+84])}@ARGV;s/...$/1$&/;$d= unxV,xb25,$_;$e=256|(ord$b[4])<<9|ord$b[3];$d=$d>>8^($f=$t&($d >>12^$d>>4^$d^$d/8))<<17,$e=$e>>8^($t&($g=($q=$e>>14&7^$e)^$q* 8^$q<<6))<<9,$_=$t[$_]^(($h>>=8)+=$f+(~$g&$t))for@a[128..$#a]} print+x"C*",@a}';s/x/pack+/g;eval usage: qrpff 153 2 8 105 225 < /mnt/dvd/VOB_FILENAME \ | extract_mpeg2 | mpeg2dec - http://www.eff.org/ http://www.opendvd.org/ http://www.cs.cmu.edu/~dst/DeCSS/Gallery/ ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 3:13 ` Miles Lane 2001-06-07 15:49 ` Derek Glidden @ 2001-06-07 19:06 ` Miles Lane 2001-06-09 5:57 ` Mike A. Harris 2 siblings, 0 replies; 106+ messages in thread From: Miles Lane @ 2001-06-07 19:06 UTC (permalink / raw) To: Derek Glidden; +Cc: linux-kernel On 07 Jun 2001 11:49:47 -0400, Derek Glidden wrote: > Miles Lane wrote: > > > > So please, if you have new facts that you want to offer that > > will help us characterize and understand these VM issues better > > or discover new problems, feel free to share them. But if you > > just want to rant, I, for one, would rather you didn't. > > *sigh* > > Not to prolong an already pointless thread, but that really was the > intent of my original message. I had figured out a specific way, with > easy-to-follow steps, to make the VM misbehave under very certain > conditions. I even offered to help figure out a solution in any way I > could, considering I'm not familiar with kernel code. > > However, I guess this whole "too much swap" issue has a lot of people on > edge and immediately assumed I was talking about this subject, without > actually reading my original message. Actually, I think your original message was useful. It has spurred a reevaluation of some design assumptions implicit in the VM in the 2.4 series and has also surfaced some bugs. It was not you who I felt was sending inflammatory remarks, it was the folks who have been bellyaching about the current swap disk space requirements without offering any new information to help developers remedy the situation. So, thanks for bringing the topic up. :-) Cheers, Miles ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 3:13 ` Miles Lane 2001-06-07 15:49 ` Derek Glidden 2001-06-07 19:06 ` Miles Lane @ 2001-06-09 5:57 ` Mike A. Harris 2 siblings, 0 replies; 106+ messages in thread From: Mike A. Harris @ 2001-06-09 5:57 UTC (permalink / raw) To: Miles Lane; +Cc: Derek Glidden, Bill Pringlemeir, linux-kernel On 6 Jun 2001, Miles Lane wrote: >> Precisely. Saying 8x RAM doesn't change it either. Sometime >> next week I'm going to purposefully put a new 60GB disk in on a >> separate controller as pure swap on top of 256MB of RAM. My >> guess is after bootup, and login, I'll have 48GB of stuff in >> swap "just in case". > >Mike and others, I am getting tired of your comments. Sheesh. And I'm tired of having people tell me, or tell others, to buy a faster computer or more RAM to work around a real technical problem. If a dual 1GHz system with 1GB of RAM and 60GB of disk space broken across 3 U160 drives is not a modern fast workstation, I don't know what is. My 300MHz system, however, works on its own stuff and doesn't need upgrading. >The various developers who actually work on the VM have already >acknowledged the issues and are exploring fixes, including at >least one patch that already exists. Precisely, which underscores what I'm saying: the problem is acknowledged, and is being worked on by talented hackers who know what they are doing - so why must people keep saying "get more disk space, it is cheap" et al.? That is totally useless advice in most cases. Many have pointed out already, for example, how impossible that would be in a 500-computer webserver farm. >It seems clear that the uproar from the people who are having >trouble with the new VM's handling of swap space has been >heard and folks are going to fix these problems. It may not >happen today or tomorrow, but soon. What the heck else do you >want? I agree with you.
What I want is, when someone talks about this stuff or inquires about it, for people to stop telling them that their computer is out of date and that they should upgrade it, as that is bogus advice. "It worked fine yesterday, why should I upgrade" reigns supreme. >Making inflammatory remarks about the current situation does >nothing to help get the problems fixed, it just wastes our time >and bandwidth. It's not like there is someone forcing you to read it, though. >So please, if you have new facts that you want to offer that >will help us characterize and understand these VM issues better >or discover new problems, feel free to share them. But if you >just want to rant, I, for one, would rather you didn't. Point noted; however, that isn't going to stop anyone from speaking their personal opinion on things. Freedom of speech. ---------------------------------------------------------------------- Mike A. Harris - Linux advocate - Open Source advocate Opinions and viewpoints expressed are solely my own. ---------------------------------------------------------------------- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden 2001-06-05 23:38 ` Jeffrey W. Baker [not found] ` <m2lmn61ceb.fsf@sympatico.ca> @ 2001-06-06 18:59 ` Mike Galbraith 2001-06-06 19:39 ` Derek Glidden 2001-06-06 20:47 ` Linus Torvalds ` (5 subsequent siblings) 8 siblings, 1 reply; 106+ messages in thread From: Mike Galbraith @ 2001-06-06 18:59 UTC (permalink / raw) To: Derek Glidden; +Cc: linux-kernel

On Tue, 5 Jun 2001, Derek Glidden wrote:

> After reading the messages to this list for the last couple of weeks and
> playing around on my machine, I'm convinced that the VM system in 2.4 is
> still severely broken.

...

Hi,

Can you try the patch below to see if it helps? If you watch with vmstat, you should see swap shrinking after your test. Let it shrink a while and then see how long swapoff takes. Under a normal load, it'll munch a handful of them at least once a second and keep them from getting annoying. (theory;)

	-Mike

--- linux-2.4.5.ac5/mm/vmscan.c.org	Sat Jun  2 07:37:16 2001
+++ linux-2.4.5.ac5/mm/vmscan.c	Wed Jun  6 18:29:02 2001
@@ -1005,6 +1005,53 @@
 	return ret;
 }
 
+int deadswap_reclaim(unsigned int priority)
+{
+	struct list_head * page_lru;
+	struct page * page;
+	int maxscan = nr_active_pages >> priority;
+	int nr_reclaim = 0;
+
+	/* Take the lock while messing with the list... */
+	spin_lock(&pagemap_lru_lock);
+	while (maxscan-- > 0 && (page_lru = active_list.prev) != &active_list) {
+		page = list_entry(page_lru, struct page, lru);
+
+		/* Wrong page on list?! (list corruption, should not happen) */
+		if (!PageActive(page)) {
+			printk("VM: refill_inactive, wrong page on list.\n");
+			list_del(page_lru);
+			nr_active_pages--;
+			continue;
+		}
+
+		if (PageSwapCache(page) &&
+		    (page_count(page) - !!page->buffers) == 1 &&
+		    swap_count(page) == 1) {
+			if (page->buffers || TryLockPage(page)) {
+				ClearPageReferenced(page);
+				ClearPageDirty(page);
+				page->age = 0;
+				deactivate_page_nolock(page);
+			} else {
+				page_cache_get(page);
+				spin_unlock(&pagemap_lru_lock);
+				delete_from_swap_cache_nolock(page);
+				spin_lock(&pagemap_lru_lock);
+				UnlockPage(page);
+				page_cache_release(page);
+			}
+			nr_reclaim++;
+			continue;
+		}
+		list_del(page_lru);
+		list_add(page_lru, &active_list);
+	}
+	spin_unlock(&pagemap_lru_lock);
+
+	return nr_reclaim;
+}
+
 DECLARE_WAIT_QUEUE_HEAD(kreclaimd_wait);
 /*
  * Kreclaimd will move pages from the inactive_clean list to the
@@ -1027,7 +1074,7 @@
 	 * We sleep until someone wakes us up from
 	 * page_alloc.c::__alloc_pages().
 	 */
-	interruptible_sleep_on(&kreclaimd_wait);
+	interruptible_sleep_on_timeout(&kreclaimd_wait, HZ);
 
 	/*
 	 * Move some pages from the inactive_clean lists to
@@ -1051,6 +1098,7 @@
 		}
 		pgdat = pgdat->node_next;
 	} while (pgdat);
+	deadswap_reclaim(4);
 }
 }

^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 18:59 ` Mike Galbraith @ 2001-06-06 19:39 ` Derek Glidden 0 siblings, 0 replies; 106+ messages in thread From: Derek Glidden @ 2001-06-06 19:39 UTC (permalink / raw) To: Mike Galbraith; +Cc: linux-kernel Mike Galbraith wrote: > > Can you try the patch below to see if it helps? If you watch > with vmstat, you should see swap shrinking after your test. > Let is shrink a while and then see how long swapoff takes. > Under a normal load, it'll munch a handfull of them at least > once a second and keep them from getting annoying. (theory;) Hi Mike, I'll give that patch a spin this evening after work when I have time to patch and recompile the kernel. -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden ` (2 preceding siblings ...) 2001-06-06 18:59 ` Mike Galbraith @ 2001-06-06 20:47 ` Linus Torvalds 2001-06-07 7:42 ` Eric W. Biederman 2001-06-06 21:39 ` android ` (4 subsequent siblings) 8 siblings, 1 reply; 106+ messages in thread From: Linus Torvalds @ 2001-06-06 20:47 UTC (permalink / raw) To: linux-kernel In article <3B1D5ADE.7FA50CD0@illusionary.com>, Derek Glidden <dglidden@illusionary.com> wrote: > >After reading the messages to this list for the last couple of weeks and >playing around on my machine, I'm convinced that the VM system in 2.4 is >still severely broken. Now, this may well be true, but what you actually demonstrated is that "swapoff()" is extremely (and I mean _EXTREMELY_) inefficient, to the point that it can certainly be called broken. It got worse in 2.4.x not so much due to any generic VM worseness, as due to the fact that the much more persistent swap cache behaviour in 2.4.x just exposes the fundamental inefficiencies of "swapoff()" more clearly. I don't think the swapoff() algorithm itself has changed, it's just that the algorithm was always exponential, I think (and because of the persistent swap cache, the "n" in the algorithm became much bigger). So this is really a separate problem from the general VM balancing issues. Go and look at the "try_to_unuse()" logic, and wince. I'd love to have somebody look a bit more at swap-off. It may well be, for example, that swap-off does not correctly notice dead swap-pages at all - somebody should verify that it doesn't try to read in and "try_to_unuse()" dead swap entries. That would make the inefficiency show up even more clearly. 
(Quick look gives the following: right now try_to_unuse() in mm/swapfile.c does something like

	lock_page(page);
	if (PageSwapCache(page))
		delete_from_swap_cache_nolock(page);
	UnlockPage(page);

	read_lock(&tasklist_lock);
	for_each_task(p)
		unuse_process(p->mm, entry, page);
	read_unlock(&tasklist_lock);

	shmem_unuse(entry, page);

	/* Now get rid of the extra reference to the temporary
	   page we've been using. */
	page_cache_release(page);

and we should trivially notice that if the page count is 1, it cannot be mapped in any process, so we should maybe add something like

	lock_page(page);
	if (PageSwapCache(page))
		delete_from_swap_cache_nolock(page);
	UnlockPage(page);

+	if (page_count(page) == 1)
+		goto nothing_to_do;

	read_lock(&tasklist_lock);
	for_each_task(p)
		unuse_process(p->mm, entry, page);
	read_unlock(&tasklist_lock);

	shmem_unuse(entry, page);
+
+ nothing_to_do:
+
	/* Now get rid of the extra reference to the temporary
	   page we've been using. */
	page_cache_release(page);

which should (assuming I got the page count thing right - I've obviously not tested the above change) make sure that we don't spend tons of time on dead swap pages.

Somebody interested in trying the above add? And looking for other more obvious bandaid fixes. It won't "fix" swapoff per se, but it might make it bearable and bring it to the 2.2.x levels.

The _real_ fix is to really make "swapoff()" work the other way around - go through each process and look for swap entries in the page tables _first_, and bring all entries for that device in sanely, and after everything is brought in just drop all the swap cache pages for that device. The current swapoff() thing is really a quick hack that has lived on since early 1992 with quick hacks to make it work with the big VM changes that have happened since.
That would make swapoff be O(n) in VM size (and you can easily do some further micro-optimizations at that time by avoiding shared mappings with backing store and other things that cannot have swap info involved) Is anybody interested in making "swapoff()" better? Please speak up.. Linus ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-06 20:47 ` Linus Torvalds @ 2001-06-07 7:42 ` Eric W. Biederman 2001-06-07 8:11 ` Linus Torvalds 0 siblings, 1 reply; 106+ messages in thread From: Eric W. Biederman @ 2001-06-07 7:42 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel

torvalds@transmeta.com (Linus Torvalds) writes:
>
> Somebody interested in trying the above add? And looking for other more
> obvious bandaid fixes. It won't "fix" swapoff per se, but it might make
> it bearable and bring it to the 2.2.x levels.

A little bit. The one really bad behavior of not letting any other processes run seems to be fixed with an explicit:

	if (need_resched) {
		schedule();
	}

What I can't figure out is why this is necessary, because we should be sleeping in alloc_pages if nowhere else. I suppose if the bulk of our effort really is freeing dead swap cache pages, we can spin without sleeping and never let another process run, because we are busily recycling dead swap cache pages. Does this sound right?

If this is going on, I think we need to look at our delayed deallocation policy a little more carefully. I suspect we should have code in kswapd actively removing these dead swap cache pages. After we get the latency improvements in exit, these pages do absolutely nothing for us except clog up the whole system and generally give the 2.4 VM a bad name. Anyone care to check my analysis?

> Is anybody interested in making "swapoff()" better? Please speak up..

Interested. But finding the time...

Eric ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 7:42 ` Eric W. Biederman @ 2001-06-07 8:11 ` Linus Torvalds 2001-06-07 8:54 ` Eric W. Biederman 0 siblings, 1 reply; 106+ messages in thread From: Linus Torvalds @ 2001-06-07 8:11 UTC (permalink / raw) To: Eric W. Biederman; +Cc: linux-kernel

On 7 Jun 2001, Eric W. Biederman wrote:

> torvalds@transmeta.com (Linus Torvalds) writes:
> >
> > Somebody interested in trying the above add? And looking for other more
> > obvious bandaid fixes. It won't "fix" swapoff per se, but it might make
> > it bearable and bring it to the 2.2.x levels.
>
> A little bit. The one really bad behavior of not letting any other
> processes run seems to be fixed with an explicit:
>
>	if (need_resched) {
>		schedule();
>	}
>
> What I can't figure out is why this is necessary. Because we should
> be sleeping in alloc_pages if nowhere else.

No - I suspect that we're not actually doing all that much IO at all, and the real reason for the lock-up is just that the current algorithm is so bad that when it starts to act exponentially worse it really _is_ taking minutes of CPU time following pointers and generally not being very nice on the CPU cache etc..

The bulk of the work is walking the process page tables thousands and thousands of times. Expensive.

> If this is going on I think we need to look at our delayed
> deallocation policy a little more carefully.

Agreed. I already talked in private with some people about just re-visiting the issue of the lazy de-allocation. It has nice properties, but it certainly appears as if the nasty cases just plain outweigh the advantages.

Linus ^ permalink raw reply [flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps 2001-06-07 8:11 ` Linus Torvalds @ 2001-06-07 8:54 ` Eric W. Biederman 0 siblings, 0 replies; 106+ messages in thread From: Eric W. Biederman @ 2001-06-07 8:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: linux-kernel

Linus Torvalds <torvalds@transmeta.com> writes:

> On 7 Jun 2001, Eric W. Biederman wrote:
>
> No - I suspect that we're not actually doing all that much IO at all, and
> the real reason for the lock-up is just that the current algorithm is so
> bad that when it starts to act exponentially worse it really _is_ taking
> minutes of CPU time following pointers and generally not being very nice
> on the CPU cache etc..

Hmm. Unless I am mistaken the complexity is O(SwapPages * VMSize), which is very bad, but nowhere near exponentially horrible.

> The bulk of the work is walking the process page tables thousands and
> thousands of times. Expensive.

Definitely. I played with following the page tables the right way a while back, and even when you do it right the process is slow.

Is

	if (need_resched) {
		schedule();
	}

a good idiom to use when you know you have a loop that will take a long time? Because even if we do this right, we should do our best to avoid starving other processes in the system.

Hmm. There is a nasty case with turning the walk inside out. When we read a page into RAM there could still be other users of that page that still refer to the swap entry. So we cannot immediately remove the page from the swap cache - unless we want to break sharing and increase the demands upon the virtual memory when we are shrinking it...

> > If this is going on I think we need to look at our delayed
> > deallocation policy a little more carefully.
>
> Agreed. I already talked in private with some people about just
> re-visiting the issue of the lazy de-allocation. It has nice properties,
> but it certainly appears as if the nasty cases just plain outweigh the
> advantages.

I'm trying to remember the advantages.
Besides not having to care that a page is a swap page in free_pte, I
seem to recall something about pages under I/O. If there really is some
value in not handling the pages there, it might at least be worth
putting the pages on their own LRU list, so that kswapd can crunch
through the list whenever it wakes up and give back a bunch of free
pages.

Eric

^ permalink raw reply	[flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps
  2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden
                   ` (3 preceding siblings ...)
  2001-06-06 20:47 ` Linus Torvalds
@ 2001-06-06 21:39 ` android
  2001-06-06 22:08 ` Jonathan Morton
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 106+ messages in thread
From: android @ 2001-06-06 21:39 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel

> Is anybody interested in making "swapoff()" better? Please speak up..
>
>		Linus

I'd be happy to write a new routine in assembly, if I had a clue as to
how the VM algorithm works in Linux. What should swapoff do if all
physical memory is in use? How does the swapping algorithm balance
against cache memory? Can someone point me to where I can find the
exact details of the VM mechanism in Linux?

Thanks!

-- Ted

^ permalink raw reply	[flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps
  2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden
                   ` (4 preceding siblings ...)
  2001-06-06 21:39 ` android
@ 2001-06-06 22:08 ` Jonathan Morton
  2001-06-06 22:27 ` android
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 106+ messages in thread
From: Jonathan Morton @ 2001-06-06 22:08 UTC (permalink / raw)
To: android, Linus Torvalds; +Cc: linux-kernel

> I'd be happy to write a new routine in assembly

I sincerely hope you're joking.

It's the algorithm that needs fixing, not the implementation of that
algorithm. Writing in assembler? Hope you're proficient at writing in
x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
architectures we support these days. And you darn well better hope every
other kernel hacker is as proficient as that, to be able to read it.

IOW, no chance.

--------------------------------------------------------------
from:     Jonathan "Chromatix" Morton
mail:     chromi@cyberspace.org  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V?
PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)

^ permalink raw reply	[flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps
  2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden
                   ` (5 preceding siblings ...)
  2001-06-06 22:08 ` Jonathan Morton
@ 2001-06-06 22:27 ` android
  2001-06-06 22:33 ` Antoine
  2001-06-06 22:38 ` Robert Love
  2001-06-06 22:40 ` Jonathan Morton
  8 siblings, 1 reply; 106+ messages in thread
From: android @ 2001-06-06 22:27 UTC (permalink / raw)
To: Jonathan Morton; +Cc: linux-kernel

> > I'd be happy to write a new routine in assembly
>
> I sincerely hope you're joking.
>
> It's the algorithm that needs fixing, not the implementation of that
> algorithm. Writing in assembler? Hope you're proficient at writing in
> x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
> architectures we support these days. And you darn well better hope every
> other kernel hacker is as proficient as that, to be able to read it.

I realize that assembly is platform-specific. Being that I use an IA32
class machine, that's what I would write for. Others who use other
platforms could do the deed for their native language.

As for the algorithm, I'm sure that whatever method is used to handle
page swapping, it has to comply with the kernel's memory management
scheme already in place. That's why I would need the details, so that I
wouldn't create more problems than are already present. Being that most
users are on the IA32 platform, I'm sure they wouldn't reject an
assembly solution to this problem.

As for kernel acceptance, that's an issue for the political eggheads.
Not my forte. :-)

-- Ted

^ permalink raw reply	[flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps
  2001-06-06 22:27 ` android
@ 2001-06-06 22:33 ` Antoine
  0 siblings, 0 replies; 106+ messages in thread
From: Antoine @ 2001-06-06 22:33 UTC (permalink / raw)
To: linux-kernel

Hi,

I have a problem with kswapd: it suddenly takes 98% CPU and crashes my
server, and I don't know why. I'm running a Linux 2.2.17 kernel on a
Debian distro. If anyone can help me... thx ;)

Antoine

^ permalink raw reply	[flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps
  2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden
                   ` (6 preceding siblings ...)
  2001-06-06 22:27 ` android
@ 2001-06-06 22:38 ` Robert Love
  2001-06-06 22:40 ` Jonathan Morton
  8 siblings, 0 replies; 106+ messages in thread
From: Robert Love @ 2001-06-06 22:38 UTC (permalink / raw)
To: android; +Cc: Jonathan Morton, linux-kernel

On 06 Jun 2001 15:27:57 -0700, android wrote:
> > I sincerely hope you're joking.
>
> I realize that assembly is platform-specific. Being that I use the IA32
> class machine, that's what I would write for. Others who use other
> platforms could do the deed for their native language. <snip>

no, look at the code. it is not going to benefit from assembly (assuming
you can even implement it cleanly in assembly). it's basically an
iteration of other function calls.

doing a new implementation in assembly for each platform is not
feasible, anyhow. this is the sort of thing that needs to be uniform.

this really has nothing to do with the "iron" of the computer -- it's a
loop to check and free swap pages. assembly will not provide a benefit.

--
Robert M. Love
rml@ufl.edu
rml@tech9.net

^ permalink raw reply	[flat|nested] 106+ messages in thread
* Re: Break 2.4 VM in five easy steps
  2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden
                   ` (7 preceding siblings ...)
  2001-06-06 22:38 ` Robert Love
@ 2001-06-06 22:40 ` Jonathan Morton
  8 siblings, 0 replies; 106+ messages in thread
From: Jonathan Morton @ 2001-06-06 22:40 UTC (permalink / raw)
To: android; +Cc: linux-kernel

At 11:27 pm +0100 6/6/2001, android wrote:
>> > I'd be happy to write a new routine in assembly
>>
>> I sincerely hope you're joking.
>>
>> It's the algorithm that needs fixing, not the implementation of that
>> algorithm. Writing in assembler? Hope you're proficient at writing in
>> x86, PPC, 68k, MIPS (several varieties), ARM, SPARC, and whatever other
>> architectures we support these days. And you darn well better hope every
>> other kernel hacker is as proficient as that, to be able to read it.
> As for the algorithm, I'm sure that whatever method is used to handle
> page swapping, it has to comply with the kernel's memory management
> scheme already in place. That's why I would need the details so that I
> wouldn't create more problems than already present.

Have you actually been following this thread? The algorithm has been
discussed and at least one alternative brought forward.

--------------------------------------------------------------
from:     Jonathan "Chromatix" Morton
mail:     chromi@cyberspace.org  (not for attachments)

The key to knowledge is not to rely on people to teach you it.

GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V?
PS PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r++ y+(*)

^ permalink raw reply	[flat|nested] 106+ messages in thread
end of thread, other threads:[~2001-06-09 12:38 UTC | newest]

Thread overview: 106+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-06-05 22:19 Break 2.4 VM in five easy steps Derek Glidden
2001-06-05 23:38 ` Jeffrey W. Baker
2001-06-06  1:42 ` Russell Leighton
2001-06-06  7:14 ` Sean Hunter
2001-06-06  2:16 ` Andrew Morton
2001-06-06  3:19 ` Derek Glidden
2001-06-06 14:16 ` Disconnect
[not found] ` <3B1DEAC7.43DEFA1C@idb.hist.no>
2001-06-06 14:51 ` Derek Glidden
2001-06-06 21:34 ` Alan Cox
2001-06-09  8:07 ` Rik van Riel
2001-06-07  7:23 ` Helge Hafting
2001-06-07 16:56 ` Eric W. Biederman
2001-06-07 20:24 ` José Luis Domingo López
2001-06-06  4:03 ` Jeffrey W. Baker
2001-06-06  8:19 ` Xavier Bestel
2001-06-06  8:54 ` Sean Hunter
2001-06-06  9:57 ` Dr S.M. Huen
2001-06-06 10:06 ` DBs (ML)
2001-06-06 10:08 ` Vivek Dasmohapatra
2001-06-06 10:19 ` Lauri Tischler
2001-06-06 10:22 ` Sean Hunter
2001-06-06 10:48 ` Alexander Viro
2001-06-06 16:58 ` dean gaudet
2001-06-06 17:10 ` Remi Turk
2001-06-06 22:44 ` Kai Henningsen
2001-06-09  7:17 ` Rik van Riel
2001-06-06 16:47 ` dean gaudet
2001-06-06 17:17 ` Kurt Roeckx
2001-06-06 18:35 ` Dr S.M. Huen
2001-06-06 18:40 ` Mark Salisbury
2001-06-07  0:20 ` Mike A. Harris
2001-06-09  8:16 ` Rik van Riel
2001-06-09  8:57 ` Mike A. Harris
2001-06-07 21:31 ` Shane Nay
2001-06-07 20:00 ` Marcelo Tosatti
2001-06-07 21:55 ` Shane Nay
2001-06-07 20:29 ` Marcelo Tosatti
2001-06-07 23:29 ` VM Report was:Re: " Shane Nay
2001-06-08  1:18 ` Jonathan Morton
2001-06-08 12:50 ` Mike Galbraith
2001-06-08 14:19 ` Tobias Ringstrom
2001-06-08 16:51 ` Mike Galbraith
2001-06-08 19:09 ` Tobias Ringstrom
2001-06-09  4:36 ` Mike Galbraith
2001-06-08 15:51 ` John Stoffel
2001-06-08 17:01 ` Mike Galbraith
2001-06-09  3:34 ` Rik van Riel
2001-06-08 17:43 ` John Stoffel
2001-06-08 17:35 ` Marcelo Tosatti
2001-06-09  5:07 ` Mike Galbraith
2001-06-08 18:30 ` Mike Galbraith
2001-06-09 12:31 ` Zlatko Calusic
2001-06-08 20:58 ` John Stoffel
2001-06-08 20:04 ` Marcelo Tosatti
2001-06-08 23:44 ` Jonathan Morton
2001-06-09  2:36 ` Andrew Morton
2001-06-09  6:33 ` Mark Hahn
2001-06-09  3:43 ` Mike Galbraith
2001-06-09  4:05 ` Jonathan Morton
2001-06-09  5:09 ` Mike Galbraith
2001-06-06 10:04 ` Jonathan Morton
2001-06-06 11:16 ` Daniel Phillips
2001-06-06 13:58 ` Gerhard Mack
2001-06-08  4:56 ` C. Martins
2001-06-06 15:28 ` Richard Gooch
2001-06-06 15:42 ` Christian Bornträger
2001-06-06 15:57 ` Requirement: swap = RAM x 2.5 ?? Jeff Garzik
2001-06-06 18:42 ` Eric W. Biederman
2001-06-07  1:29 ` Jan Harkes
2001-06-06 16:12 ` Richard Gooch
2001-06-06 16:15 ` Jeff Garzik
2001-06-06 16:19 ` Richard Gooch
2001-06-06 16:53 ` Mike Galbraith
2001-06-06 17:05 ` Greg Hennessy
2001-06-06 17:14 ` Break 2.4 VM in five easy steps Ben Greear
2001-06-06 19:11 ` android
2001-06-07  0:27 ` Mike A. Harris
2001-06-06  9:16 ` Xavier Bestel
2001-06-06  9:25 ` Sean Hunter
2001-06-06 12:07 ` Jonathan Morton
2001-06-06 14:41 ` Derek Glidden
2001-06-06 20:29 ` José Luis Domingo López
2001-06-06 13:32 ` Eric W. Biederman
2001-06-06 14:41 ` Marc Heckmann
2001-06-06 14:51 ` Hugh Dickins
2001-06-06  7:47 ` Jonathan Morton
2001-06-06 13:08 ` Eric W. Biederman
2001-06-06 16:48 ` Jeffrey W. Baker
[not found] ` <m2lmn61ceb.fsf@sympatico.ca>
2001-06-06 14:37 ` Derek Glidden
2001-06-07  0:34 ` Mike A. Harris
2001-06-07  3:13 ` Miles Lane
2001-06-07 15:49 ` Derek Glidden
2001-06-07 19:06 ` Miles Lane
2001-06-09  5:57 ` Mike A. Harris
2001-06-06 18:59 ` Mike Galbraith
2001-06-06 19:39 ` Derek Glidden
2001-06-06 20:47 ` Linus Torvalds
2001-06-07  7:42 ` Eric W. Biederman
2001-06-07  8:11 ` Linus Torvalds
2001-06-07  8:54 ` Eric W. Biederman
2001-06-06 21:39 ` android
2001-06-06 22:08 ` Jonathan Morton
2001-06-06 22:27 ` android
2001-06-06 22:33 ` Antoine
2001-06-06 22:38 ` Robert Love
2001-06-06 22:40 ` Jonathan Morton