From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S262655AbTDEVLh (for ); Sat, 5 Apr 2003 16:11:37 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
    id S262659AbTDEVLh (for ); Sat, 5 Apr 2003 16:11:37 -0500
Received: from [12.47.58.55] ([12.47.58.55]:65420 "EHLO pao-ex01.pao.digeo.com")
    by vger.kernel.org with ESMTP id S262655AbTDEVLg (for );
    Sat, 5 Apr 2003 16:11:36 -0500
Date: Sat, 5 Apr 2003 13:24:06 -0800
From: Andrew Morton
To: Andrea Arcangeli
Cc: mbligh@aracnet.com, mingo@elte.hu, hugh@veritas.com, dmccr@us.ibm.com,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: objrmap and vmtruncate
Message-Id: <20030405132406.437b27d7.akpm@digeo.com>
In-Reply-To: <20030405163003.GD1326@dualathlon.random>
References: <20030404163154.77f19d9e.akpm@digeo.com>
    <12880000.1049508832@flay>
    <20030405024414.GP16293@dualathlon.random>
    <20030404192401.03292293.akpm@digeo.com>
    <20030405040614.66511e1e.akpm@digeo.com>
    <20030405163003.GD1326@dualathlon.random>
X-Mailer: Sylpheed version 0.8.9 (GTK+ 1.2.10; i586-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 05 Apr 2003 21:23:00.0998 (UTC) FILETIME=[893FDA60:01C2FBB9]
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Andrea Arcangeli wrote:
>
> On Sat, Apr 05, 2003 at 04:06:14AM -0800, Andrew Morton wrote:
> > Andrew Morton wrote:
> > >
> > > Nobody has written an "exploit" for this yet, but it's there.
> >
> > Here we go. The test app is called `rmap-test'. It is in ext3 CVS. See
> >
> > http://www.zip.com.au/~akpm/linux/ext3/
> >
> > It sets up N MAP_SHARED VMA's and N tasks touching them in various access
> > patterns.
>
> I'm not questioning during paging rmap is more efficient than objrmap,
> but your argument about rmap having lower complexity of objrmap and that
> rmap is needed is wrong. The fact is that with your 100 mappings per
> each of the 100 tasks case, both algorithms works in O(N) where N is
> the number of the pagetables mapping the page.

Nope. To unmap a page, full rmap has to scan 100 pte_chain slots, which is 3
cachelines worth. objrmap has to scan 10,000 vma's, 9,900 of which do not
map that page at all.

(Actually, there's a recent optimisation in objrmap which will on average
halve these figures).

> And objrmap can't avoided, it's needed for the truncate semantics
> against mmap.

What do you mean by this? vmtruncate continues to use the 2.4 algorithm for
that.

> Check all other important benchmarks not testing the paging load like
> page faults, kernel compile from Martin, fork, AIM etc... Those are IMHO
> an order of magnitude of more interest than your rmap-test paging load
> with some hundred thousand of vmas.

Andrea, I whine about rmap as much as anyone ;) I'm the guy who halved both
its speed and space overhead shortly after it was merged.

But the fact is that it is not completely useless overhead. It provides a
very robust VM which is stable and predictable under extreme and unusual
loads. That is valuable.

Yes, rmap adds a few% speed overhead - up to 10% for things which are
admittedly already very inefficient.

objrmap will reclaim a lot of that common-case overhead. But the cost of
that is apparently unviability for certain workloads on certain machines.
Once you hit 100k VMA's it's time to find a new operating system.

Maybe that is a tradeoff we want to make. I'm adding some balance here.

The space consumption of rmap is a much more serious problem than the speed
overhead. It makes some workloads on huge ia32 machines unviable.

Me, I have never seen any evidence that we need any of it. I have never seen
a demonstration of the alleged failure modes of 2.4's virtual scan. But then
I haven't tried very hard.

The extreme stability and scalability of full rmap is good. The space
consumption on highmem is bad.
The CPU cost is much less important than these things.
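
For anyone who wants to check the 100 vs. 10,000 figures quoted earlier, here
is a rough toy model of the two scans. It is only a sketch and is not taken
from rmap-test or from the kernel: the names `Vma`, `build_vmas` and
`scan_costs` are made up for illustration. It assumes every task maps the
same file in disjoint one-page windows and that every task has already
touched the page in question. It does not model the objrmap optimisation
that halves the figures on average.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vma:
    """Hypothetical stand-in for a vma: one task's window onto the file."""
    task: int
    start: int  # first file page covered (inclusive)
    end: int    # end of the window (exclusive)

    def maps(self, page: int) -> bool:
        return self.start <= page < self.end


def build_vmas(tasks, mappings_per_task, pages_per_mapping=1):
    """Assumption: each task maps the file in disjoint, equal-sized windows."""
    return [
        Vma(t, m * pages_per_mapping, (m + 1) * pages_per_mapping)
        for t in range(tasks)
        for m in range(mappings_per_task)
    ]


def scan_costs(vmas, page):
    """Return (rmap slots walked, vmas walked by objrmap, vmas that miss).

    Full rmap walks one slot per pte that maps the page. objrmap walks
    every vma attached to the file and tests each one.
    """
    mapping = sum(1 for v in vmas if v.maps(page))
    return mapping, len(vmas), len(vmas) - mapping


rmap_slots, objrmap_vmas, misses = scan_costs(build_vmas(100, 100), page=42)
print(rmap_slots, objrmap_vmas, misses)  # 100 10000 9900
```

Under these assumptions both scans find the same 100 ptes. The gap comes from
the 9,900 vma's that objrmap has to look at and discard.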