From mboxrd@z Thu Jan 1 00:00:00 1970 Date: Thu, 4 Jul 2002 22:51:16 -0700 (PDT) From: Linus Torvalds Subject: Re: vm lock contention reduction In-Reply-To: <3D2530B9.8BC0C0AE@zip.com.au> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org Return-Path: To: Andrew Morton Cc: Rik van Riel , Andrea Arcangeli , "linux-mm@kvack.org" List-ID: On Thu, 4 Jul 2002, Andrew Morton wrote: > > > > Get away from this "minimum wait" thing, because it is WRONG. > > Well yes, we do want to batch work up. And a crude way of doing that > is "each time 64 pages have come clean, wake up one waiter". That would probably work, but we also need to be careful that we don't get into some strange situations (ie we have 50 waiters that all needed memory at the same time, and less than 50*64 pages that caused us to be in trouble, so we only wake up the 46 first waiters and the last 4 waiters get stuck until the next batch even though we now have _lots_ of pages free). Dont't laugh - things like this has actually happened at times with some of our balancing work with HIGHMEM/NORMAL. Basically, the logic would go: "everybody who ends up having to wait for an allocation should free at least N pages", but then you would end up with 50*N pages total that the system thought it "needed" to free up, and that could be a big number that would cause the VM to want to free up stuff long after it was really done. > Or > "as soon as the number of reclaimable pages exceeds zone->pages_min". > Some logic would also be needed to prevent new page allocators from > jumping the queue, of course. Yeah, the unfairness is the thing that really can be nasty. On the other hand, some unfairness is ok too - and can improve throughput. So jumping the queue is fine, you just must not be able to _consistently_ jump the queue. (In fact, jumping the queue is inevitable to some degree - not allowing any queue jumping at all would imply that any time _anybody_ starts waiting, every single allocation afterwards will have to wait until the waiter got woken up. Which we have actually tried before at times, but which causes really really bad behaviour and horribly bad "pausing") You probably want the occasional allocator able to jump the queue, but the "big spenders" to be caught eventually. "Fairness" really doesn't mean that "everybody should wait equally much", it really means "people should wait roughly relative to how much as they 'spend' memory". If a process is frugal with it's memory footprint, and doesn't dirty a lot of pages/buffers, such a process should obviously wait less than one that allocates/dirties a lot. Right now, we get roughly this behaviour simply by way of statistical behaviour for the page allocator ("if somebody allocates 5 times as many pages, he's 5 times as likely to have to clean something up too"), but trying to be smarter about this could easily break this relative fairness. In particular, statefulness like "cannot jump the queue" breaks it. > > some fuzzy feeling about "we shouldn't wait unless we have to". We _do_ > > have to wait. > > Sure, page allocators must throttle their allocation rate to that at > which the IO system can retire writes. But by waiting on a randomly-chosen > disk block, we're at the mercy of the elevator. Yes. On the other hand, "random" is actually usually a fairly good choice in itself. And it a lot easier that many of the alternatives - especially when it comes to fairness. Linus -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/