Message-ID: <4F7702CB.4050704@linux.vnet.ibm.com>
Date: Sat, 31 Mar 2012 21:12:43 +0800
From: Xiao Guangrong
To: Xiao Guangrong
CC: Avi Kivity, Marcelo Tosatti, LKML, KVM
Subject: Re: [PATCH 00/13] KVM: MMU: fast page fault
References: <4F742951.7080003@linux.vnet.ibm.com> <4F7436FB.9000004@redhat.com> <4F744A43.4060600@linux.vnet.ibm.com> <4F745C4F.4060404@redhat.com> <4F757A7C.6020109@linux.vnet.ibm.com>
In-Reply-To: <4F757A7C.6020109@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/30/2012 05:18 PM, Xiao Guangrong wrote:

> On 03/29/2012 08:57 PM, Avi Kivity wrote:
>
>> On 03/29/2012 01:40 PM, Xiao Guangrong wrote:
>>>>> * Implementation
>>>>> We can freely walk the page between walk_shadow_page_lockless_begin and
>>>>> walk_shadow_page_lockless_end, it can ensure all the shadow page is valid.
>>>>>
>>>>> In the most case, cmpxchg is fair enough to change the access bit of spte,
>>>>> but the write-protect path on softmmu/nested mmu is a especial case: it is
>>>>> a read-check-modify path: read spte, check W bit, then clear W bit.
>>>>
>>>> We also set gpte.D and gpte.A, no? How do you handle that?
>>>>
>>>
>>>
>>> We still need walk gust page table before fast page fault to check
>>> whether the access is valid.
>>
>> Ok. But that's outside the mmu lock.
>>
>> We can also skip that: if !sp->unsync, copy access rights and gpte.A/D
>> into spare bits in the spte, and use that to check.
>>
>
>
> Great!
>
> gpte.A need not be copied into spte since EFEC.P = 1 means the shadow page
> table is present, gpte.A must be set in this case.
>
> And, we do not need to cache gpte access rights into spte, instead of it,
> we can atomicly read gpte to get these information (anyway, we need read gpte
> to get the gfn.)
>

This needs more thought. The idea gives an excellent improvement for dirty
page logging, but I am not sure we gain performance in the cases below:

- The page fault is triggered by an invalid access from the guest.
  In the original way, it is fixed on the FNAME(walk_addr) path, which walks
  the guest page table; we may need to call gfn_to_pfn (it is fast since the
  page has not been swapped out). With this idea, it is fixed on the fast
  page fault path, which walks the shadow page table with RCU locked and
  preemption disabled.
  I do not think the two differ much.

- The page fault is caused by the host, but we cannot quickly check whether
  the page is writable since the gfn is unknown. After walking the shadow
  page table we get the gfn (by reading the gpte); what do we do if the gfn
  is write-protected?

  - If the page is write-protected by the host (!spte.SPTE_HOST_WRITEABLE),
    we have no choice: just call gfn_to_pfn and wait for the page to be
    COWed. Compared with the original way, the time spent walking the shadow
    page table is wasted. Unfortunately, this is triggered really frequently
    if KSM is enabled, so it may be a regression.

  - If the write-protection is caused by page table protection, we have two
    choices:
    - Call the slow page fault path. That is unacceptable, since the number
      of this kind of page fault is very large and we may waste a lot of CPU
      time.
    - Hold mmu-lock and fix it in the last spte. It is OK but makes things a
      little complex; I am not sure if you agree with this. :)
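
To make the ordering of the cases above concrete, here is a rough user-space
sketch of the decision, not code from the patchset. The enum, the struct and
classify_fault() are hypothetical names made up for illustration; only
SPTE_HOST_WRITEABLE, gfn_to_pfn and mmu-lock come from the discussion above,
and the last case assumes the preferred choice (fix under mmu-lock) is taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of a write fault; names are illustrative only. */
enum fault_action {
	ACT_GUEST_FAULT,    /* guest's own invalid access */
	ACT_WAIT_FOR_COW,   /* host write-protect: call gfn_to_pfn, wait for COW */
	ACT_FIX_UNDER_LOCK, /* page-table write-protect: fix last spte under mmu-lock */
	ACT_FIX_LOCKLESS,   /* nothing blocks the write: fix on the fast path */
};

/* Hypothetical summary of what the shadow walk and the gpte read told us. */
struct fault_info {
	bool guest_access_invalid; /* guest page table forbids the access */
	bool host_writable;        /* models the SPTE_HOST_WRITEABLE bit */
	bool gfn_write_protected;  /* gfn is write-protected as a page table */
};

/* Classify a fault in the order the cases are discussed above. */
static enum fault_action classify_fault(const struct fault_info *f)
{
	assert(f != NULL);

	if (f->guest_access_invalid)
		return ACT_GUEST_FAULT;
	if (!f->host_writable)
		return ACT_WAIT_FOR_COW;   /* the shadow walk was wasted here */
	if (f->gfn_write_protected)
		return ACT_FIX_UNDER_LOCK; /* assumed alternative to the slow path */
	return ACT_FIX_LOCKLESS;
}
```

The point of the sketch is only that the second and third outcomes are
reached after the lockless walk has already been paid for, which is where the
possible regression with KSM comes from.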