Date: Fri, 5 Jul 2002 09:33:15 +0200
From: Andrea Arcangeli
Subject: Re: vm lock contention reduction
Message-ID: <20020705073315.GU1227@dualathlon.random>
References: <3D253DC9.545865D4@zip.com.au>
In-Reply-To: <3D253DC9.545865D4@zip.com.au>
To: Andrew Morton
Cc: Linus Torvalds, Rik van Riel, "linux-mm@kvack.org", "Martin J. Bligh"

On Thu, Jul 04, 2002 at 11:33:45PM -0700, Andrew Morton wrote:
> Well. First locks first. kmap_lock is a bad one on x86.

Actually I thought about kmap_lock and the per-process kmaps a bit more
with Martin (cc'ed) during OLS, and there is an easy process-scalable
solution to drop:

	the kmap_lock
	in turn the global pool
	in turn the global tlb flush

The only problem is that it's no longer both atomic *and* persistent,
it's only persistent. It's also atomic if the mm_count == 1, but the
kernel cannot rely on that: it has to assume it's always a blocking
operation (you find out whether it blocks only at runtime).

In short, the same design of the per-process kmaps will work just fine
if we add a semaphore to the mm_struct. Then, before starting to use the
kmap entry, we must acquire the semaphore. This way all the global
locking and the global tlb flush go away completely for normal tasks,
but there still remains the contention on that per-mm semaphore with
threads doing simultaneous pte manipulation or simultaneous pagecache
I/O. Furthermore this I/O will be serialized, so I suspect a threaded
benchmark like dbench may perform poorly that way. Or we should add a
pool of userspace pages so more than 1 thread is allowed to go ahead,
but we may still cacheline-bounce in the synchronization of the pool
across threads (similar to what we do now in the global pool).
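
A minimal user-space sketch of the per-mm semaphore idea above, not
kernel code. Every name in it (struct mm_sketch, mm_kmap, mm_kunmap,
run_demo) is hypothetical, and a pthread mutex stands in for the
semaphore that would live in the mm_struct. It only illustrates that the
lock is per-mm rather than global, and that threads sharing one mm
serialize on it:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for mm_struct: one private kmap slot
 * guarded by a per-mm lock (the "semaphore" from the text). */
struct mm_sketch {
    pthread_mutex_t kmap_sem;  /* per-mm, not global */
    void *kmap_slot;           /* page currently mapped in this mm, or NULL */
};

/* May block: another thread of the same mm can hold the slot. */
static void *mm_kmap(struct mm_sketch *mm, void *page)
{
    pthread_mutex_lock(&mm->kmap_sem);
    assert(mm->kmap_slot == NULL);
    mm->kmap_slot = page;      /* stands in for installing the pte */
    return mm->kmap_slot;
}

static void mm_kunmap(struct mm_sketch *mm)
{
    mm->kmap_slot = NULL;      /* stands in for clearing the pte */
    pthread_mutex_unlock(&mm->kmap_sem);
}

struct worker_arg {
    struct mm_sketch *mm;
    long *page;
    int iters;
};

static void *worker(void *p)
{
    struct worker_arg *a = p;
    for (int i = 0; i < a->iters; i++) {
        long *vaddr = mm_kmap(a->mm, a->page);
        (*vaddr)++;            /* plain increment is safe: kmap_sem is held */
        mm_kunmap(a->mm);
    }
    return NULL;
}

/* Threads sharing one mm contend on its kmap_sem; returns the final count. */
long run_demo(int nthreads, int iters)
{
    enum { MAX_THREADS = 64 };
    struct mm_sketch mm = { .kmap_slot = NULL };
    long page = 0;
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg = { &mm, &page, iters };

    if (nthreads > MAX_THREADS)
        nthreads = MAX_THREADS;
    pthread_mutex_init(&mm.kmap_sem, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, &arg);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&mm.kmap_sem);
    return page;
}
```

Calling run_demo(4, 1000) from a main() built with -pthread runs four
threads against one mm. Every access to the page goes through the single
per-mm lock, which is the serialization of threaded I/O described above;
separate processes would each have their own mm_sketch and would not
contend at all.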

Then there's the problem that the pagecache/FS API would have to be
changed to pass the vaddr through the stack, because page->virtual would
go away: the virtual address would be per-process, protected by the
mm->kmap_sem, so we couldn't store it in a global, and all tasks can kmap
the same page at the same time at virtual vaddr. This as well will break
some common code.

Last but not least, I hope that in 2.6 production I won't be running
benchmarks and profiling on a 32bit cpu anymore anyway. So I'm not very
motivated anymore to do that change after the comment from Linus about
the issue with threads.

Andrea