From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1762858AbYEGXpi (ORCPT );
	Wed, 7 May 2008 19:45:38 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1755568AbYEGXp0 (ORCPT );
	Wed, 7 May 2008 19:45:26 -0400
Received: from host36-195-149-62.serverdedicati.aruba.it ([62.149.195.36]:51354
	"EHLO mx.cpushare.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753113AbYEGXpX (ORCPT );
	Wed, 7 May 2008 19:45:23 -0400
Date: Thu, 8 May 2008 01:45:21 +0200
From: Andrea Arcangeli
To: Benjamin Herrenschmidt
Cc: Andrew Morton , clameter@sgi.com, steiner@sgi.com, holt@sgi.com,
	npiggin@suse.de, a.p.zijlstra@chello.nl, kvm-devel@lists.sourceforge.net,
	kanojsarcar@yahoo.com, rdreier@cisco.com, swise@opengridcomputing.com,
	linux-kernel@vger.kernel.org, avi@qumranet.com, linux-mm@kvack.org,
	general@lists.openfabrics.org, hugh@veritas.com, rusty@rustcorp.com.au,
	aliguori@us.ibm.com, chrisw@redhat.com, marcelo@kvack.org,
	dada1@cosmosbay.com, paulmck@us.ibm.com
Subject: Re: [PATCH 08 of 11] anon-vma-rwsem
Message-ID: <20080507234521.GN8276@duo.random>
References: <6b384bb988786aa78ef0.1210170958@duo.random>
	<20080507212650.GA8276@duo.random>
	<20080507222205.GC8276@duo.random>
	<20080507153103.237ea5b6.akpm@linux-foundation.org>
	<20080507224406.GI8276@duo.random>
	<1210202918.1421.20.camel@pasglop>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1210202918.1421.20.camel@pasglop>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 08, 2008 at 09:28:38AM +1000, Benjamin Herrenschmidt wrote:
>
> On Thu, 2008-05-08 at 00:44 +0200, Andrea Arcangeli wrote:
> >
> > Please note, we can't allow a thread to be in the middle of
> > zap_page_range while mmu_notifier_register runs.
>
> You said yourself that mmu_notifier_register can be as slow as you
> want ... what about you use stop_machine for it ? I'm not even joking
> here :-)

We can put a cap of time + a cap of vmas. It's not important if it
fails: in the only useful case we know of, it won't be slow at all. The
failure can happen because the time cap or the vma cap triggers, or
because there's a vmalloc shortage. We handle the failure in userland,
of course. There are zillions of allocations needed anyway, and any one
of them can fail, so this isn't a new failure path; it's the same
failure path that always existed before mmu_notifiers existed.

I can't possibly see how adding a new system-wide lock that forces all
truncates to be serialized against each other, practically eliminating
the need for the i_mmap_lock, could be superior to an approach that
adds no overhead to the VM at all and only requires kvm to pay an
additional cost at startup.

Furthermore, the only reason I had to implement mm_lock was to fix the
invalidate_range_start/end model. If we go with only invalidate_page
and invalidate_pages called inside the PT lock, and we use the PT lock
to serialize, we don't need mm_lock anymore, and no new lock from the
VM either. I tried to push for that, but everyone else wanted
invalidate_range_start/end, so I did the only thing possible: make
invalidate_range_start safe, to keep everyone happy without slowing
down the VM.