On Wed, 10 Dec 2003 09:47:57 -0800, William Lee Irwin III wrote:
> On Wed, Dec 10, 2003 at 02:58:29PM +0100, Roger Luethi wrote:
> > Actually, I'm rather well on my way wrapping things up. I documented
> > in detail how much 2.6 sucks in this area and where the potential for
> > improvements would have likely been, but now I've got a deadline to
> > meet and other things on my plate.
>
> Well, it'd be nice to see the code, then.

I attached the stunning code I wrote a few months ago, rediffed against
test11; it seems to compile. It does not include the eviction code
(although you can tell where it plugs in) -- that's a bit messy and I'm
not too confident that I got all the locking right.

The trigger in the page allocator worked pretty well in test4 to test6,
but it is sensitive to VM changes. Earlier 2.5 kernels went through the
slow path much more frequently (IIRC before akpm limited use of
blk_congestion_wait), for instance. That would require a different
trigger.

The time processes spend in the stunning queue (defined in stun_time())
is too short to gain much in terms of throughput -- that's because back
then I tried to put a cap on worst-case latency.

> you pointed them out. I had presumed it was due to physical scanning.

Everybody did, including me. Only after doing some of the benchmarks
did I realize I had been wrong. It's quite clear that physical scanning
accounts for at most a 50% higher execution time, which is a mere fifth
of the overall slowdown in compile benchmarks.

> > There are variables other than the demotion criteria that I found can
> > be important, to name a few:
> > - Trigger: Under which circumstances is suspending any processes
> >   considered? How often?
>
> This is generally part of the load control algorithm, but it
> essentially just tries to detect levels of overcommitment that would
> degrade performance so it can resolve them.

Level of overcommitment? What kind of criterion is that supposed to be?
You can have 10x overcommit and not thrash at all, if most of the
memory is allocated and filled but never referenced again. IOW, I can't
derive an algorithm from your handwaving.

> > - Eviction: Does regular pageout take care of the memory of a suspended
> >   process, or are pages marked old or even unmapped upon stunning?
>
> This is generally unmapping and evicting upon suspension. The effect
> isn't immediate anyway, since io is required, and batching the work for
> io contiguity etc. is a fair amount of savings, so there's little or no
> incentive to delay this apart from keeping io rates down to where user
> io and VM io aren't in competition.

I agree with that part.

> On Wed, Dec 10, 2003 at 02:58:29PM +0100, Roger Luethi wrote:
> > - Release: Is the stunning queue a simple FIFO? How long do the
> >   processes stay there? Does a process get a bonus after it's woken up
> >   again -- bigger quantum, chunk of free memory, prepaged working set
> >   before stunning?
>
> It's a form of process scheduling. Memory scheduling policies are not
> discussed very much in the sources I can get at, so some synthesis may
> be required unless material can be found on that, but in general this
> isn't a very interesting problem (at least not since the 70's or earlier).

Not interesting, yes. And I realize that it's not even important once
you accept the very real possibility of extreme latencies.

> > There's quite a bit of complexity involved and many variables will depend
> > on the scenario. Sort of like interactivity, except lots of people were
> > affected by the interactivity tuning and only few will notice and test
> > load control.
>
> It's basically just process scheduling, so I don't see an issue there.

The issue is that there are tons of knobs and dials that affect the
behavior, and it's hard to get good heuristics with a tiny test field.
Admittedly, things get easier once you want load control only for the
heavy thrashing case, and that's been my plan, too, since I realized
that it doesn't work well for the light and medium type I'd been
working on.

Roger