From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754048Ab2KMPd4 (ORCPT );
        Tue, 13 Nov 2012 10:33:56 -0500
Received: from mail-pb0-f46.google.com ([209.85.160.46]:37116 "EHLO
        mail-pb0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751662Ab2KMPdy (ORCPT );
        Tue, 13 Nov 2012 10:33:54 -0500
Date: Wed, 14 Nov 2012 00:33:50 +0900
From: Takuya Yoshikawa
To: Marcelo Tosatti
Cc: xiaoguangrong@linux.vnet.ibm.com, avi@redhat.com,
        linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
        qemu-devel@nongnu.org, owasserm@redhat.com, quintela@redhat.com,
        pbonzini@redhat.com, chegu_vinod@hp.com, yamahata@valinux.co.jp
Subject: Re: [PATCH] KVM: MMU: lazily drop large spte
Message-Id: <20121114003350.d6e8ff85658fccbf41183f05@gmail.com>
In-Reply-To: <20121112231032.GB5798@amt.cnet>
References: <50978DFE.1000005@linux.vnet.ibm.com>
        <20121112231032.GB5798@amt.cnet>
X-Mailer: Sylpheed 3.2.0beta3 (GTK+ 2.24.6; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Ccing live migration developers who should be interested in this work,

On Mon, 12 Nov 2012 21:10:32 -0200
Marcelo Tosatti wrote:

> On Mon, Nov 05, 2012 at 05:59:26PM +0800, Xiao Guangrong wrote:
> > Do not drop large spte until it can be replaced by small pages so that
> > the guest can happily read memory through it
> >
> > The idea is from Avi:
> > | As I mentioned before, write-protecting a large spte is a good idea,
> > | since it moves some work from protect-time to fault-time, so it reduces
> > | jitter.  This removes the need for the return value.
> >
> > Signed-off-by: Xiao Guangrong
> > ---
> >  arch/x86/kvm/mmu.c |   34 +++++++++-------------------------
> >  1 files changed, 9 insertions(+), 25 deletions(-)
>
> It's likely that other 4k pages are mapped read-write in the 2mb range
> covered by a read-only 2mb map. Therefore it's not entirely useful to
> map read-only.
>
> Can you measure an improvement with this change?

What we discussed at KVM Forum last week was the jitter we could
measure right after starting live migration: both Isaku and Chegu
reported such jitter.

So if this patch reduces such jitter for some real workloads, by lazily
dropping largepage mappings and saving read faults until that point,
that would be very nice!

But sadly, what they measured included interactions with the outside of
the guest, and they guessed that the main cause was the big QEMU lock
problem. The order of magnitude is so different that an improvement
from a kernel-side effort may not be easy to see.

FWIW: I am now changing the initial write protection done by
kvm_mmu_slot_remove_write_access() to be rmap based, as I proposed at
KVM Forum. ftrace showed that the change improved it from 1ms to
250-350us for a 10GB guest. My code still drops largepage mappings, so
the initial write protection time itself may not be such a big issue
here, I think.

Again, if we can eliminate read faults to such an extent that guests
can see a measurable improvement, that would be very nice!

Any thoughts?

Thanks,
        Takuya