From: Gleb Natapov <gleb@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	avi.kivity@gmail.com, pbonzini@redhat.com,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Anthony Liguori <anthony@codemonkey.ws>
Subject: Re: [PATCH v6 3/7] KVM: MMU: fast invalidate all pages
Date: Wed, 22 May 2013 18:42:51 +0300
Message-ID: <20130522154251.GP14287@redhat.com>
In-Reply-To: <519CE359.6040802@linux.vnet.ibm.com>

On Wed, May 22, 2013 at 11:25:13PM +0800, Xiao Guangrong wrote:
> On 05/22/2013 09:17 PM, Gleb Natapov wrote:
> > On Wed, May 22, 2013 at 05:41:10PM +0800, Xiao Guangrong wrote:
> >> On 05/22/2013 04:54 PM, Gleb Natapov wrote:
> >>> On Wed, May 22, 2013 at 04:46:04PM +0800, Xiao Guangrong wrote:
> >>>> On 05/22/2013 02:34 PM, Gleb Natapov wrote:
> >>>>> On Tue, May 21, 2013 at 10:33:30PM -0300, Marcelo Tosatti wrote:
> >>>>>> On Tue, May 21, 2013 at 11:39:03AM +0300, Gleb Natapov wrote:
> >>>>>>>> Any pages with stale information will be zapped by kvm_mmu_zap_all().
> >>>>>>>> When that happens, page faults will take place which will automatically 
> >>>>>>>> use the new generation number.
> >>>>>>>>
> >>>>>>>> So still not clear why is this necessary.
> >>>>>>>>
> >>>>>>> This is not, strictly speaking, necessary, but it is the sane thing to do.
> >>>>>>> You cannot update a page's generation number to prevent it from being
> >>>>>>> destroyed, since after kvm_mmu_zap_all() completes, stale ptes in the
> >>>>>>> shadow page may point to a now-deleted memslot. So why build a shadow page
> >>>>>>> table with a page that is in the process of being destroyed?
> >>>>>>
> >>>>>> OK, can this be introduced separately, in a later patch, with separate
> >>>>>> justification, then?
> >>>>>>
> >>>>>> Xiao please have the first patches of the patchset focus on the problem
> >>>>>> at hand: fix long mmu_lock hold times.
> >>>>>>
> >>>>>>> Not sure what you mean again. We flush TLB once before entering this function.
> >>>>>>> kvm_reload_remote_mmus() does this for us, no?
> >>>>>>
> >>>>>> kvm_reload_remote_mmus() is used as an optimization; it's separate from the
> >>>>>> problem's solution.
> >>>>>>
> >>>>>>>>
> >>>>>>>> What was suggested was... go to phrase which starts with "The only purpose
> >>>>>>>> of the generation number should be to".
> >>>>>>>>
> >>>>>>>> The comment quoted here does not match that description.
> >>>>>>>>
> >>>>>>> The comment describes what the code does, and in that respect it is correct.
> >>>>>>>
> >>>>>>> You propose to not reload roots right away and do it only when root sp
> >>>>>>> is encountered, right? So my question is what's the point? There are,
> >>>>>>> obviously, root sps with invalid generation number at this point, so
> >>>>>>> reload will happen regardless in kvm_mmu_prepare_zap_page(). So why not
> >>>>>>> do it here right away and avoid it in kvm_mmu_prepare_zap_page() for
> >>>>>>> invalid and obsolete sps, as I proposed in one of my emails?
> >>>>>>
> >>>>>> Sure. But Xiao please introduce that TLB collapsing optimization as a
> >>>>>> later patch, so we can reason about it in a more organized fashion.
> >>>>>
> >>>>> So, if I understand correctly, you are asking to move is_obsolete_sp()
> >>>>> check from kvm_mmu_get_page() and kvm_reload_remote_mmus() from
> >>>>> kvm_mmu_invalidate_all_pages() to a separate patch. Fine by me, but if
> >>>>> we drop kvm_reload_remote_mmus() from kvm_mmu_invalidate_all_pages() the
> >>>>> call to kvm_mmu_invalidate_all_pages() in emulator_fix_hypercall() will
> >>>>> become a nop. But I question the need to zap all shadow page tables there
> >>>>> in the first place, why kvm_flush_remote_tlbs() is not enough?
> >>>>
> >>>> I do not know either... I even do not know why kvm_flush_remote_tlbs
> >>>> is needed. :(
> >>> We changed the content of an executable page, we need to flush instruction
> >>> cache of all vcpus to not use stale data, so my suggestion to call
> >>
> >> I thought the reason was the icache too, but the icache is automatically
> >> flushed on x86; we only need to invalidate the prefetched instructions by
> >> executing a serializing operation.
> >>
> >> See the SDM in the chapter of
> >> "8.1.3 Handling Self- and Cross-Modifying Code"
> >>
> > Right, so we do cross-modifying code here and we need to make sure no
> > vcpu is running in guest mode while this happens, but
> > kvm_mmu_zap_all() does not provide this guarantee, since vcpus will
> > continue running after reloading roots!
> 
> Maybe we can introduce a function to atomically write the gpa; then the
> guest will either 1) see the old value, in which case it can be
> intercepted, or 2) see the new value, in which case it can continue to
> execute.
> 
The SDM says an atomic write is not enough. All vcpus must be guaranteed
not to execute code in the vicinity of the modified code. This is easy to
achieve, though:

vcpu0:
lock(x);
make_all_cpus_request(EXIT);
unlock(x);

vcpuX:
if (kvm_check_request(EXIT)) {
    lock(x);
    unlock(x);
}

> >>> kvm_flush_remote_tlbs() is obviously incorrect since this flushes tlb,
> >>> not instruction cache, but why kvm_reload_remote_mmus() would flush
> >>> instruction cache?
> >>
> >> kvm_reload_remote_mmus does not help here, I think.
> >>
> >> I found that this change was introduced by commit 7aa81cc0,
> >> and I have added Anthony to the CC.
> >>
> >> I also found some discussion related to calling
> >> kvm_reload_remote_mmus():
> >>
> >>>
> >>> But if the instruction is architecture dependent, and you run on the
> >>> wrong architecture, now you have to patch many locations at fault time,
> >>> introducing some nasty runtime code / data cache overlap performance
> >>> problems.  Granted, they go away eventually.
> >>>
> >>
> >> We're addressing that by blowing away the shadow cache and holding the
> >> big kvm lock to ensure SMP safety.  Not a great thing to do from a
> >> performance perspective but the whole point of patching is that the cost
> >> is amortized.
> >>
> >> (http://kerneltrap.org/mailarchive/linux-kernel/2007/9/14/260288)
> >>
> >> But I cannot understand...
> > Back then kvm->lock protected memslot access so code like:
> > 
> > mutex_lock(&vcpu->kvm->lock);
> > kvm_mmu_zap_all(vcpu->kvm);
> > mutex_unlock(&vcpu->kvm->lock);
> > 
> > which is what 7aa81cc0 does, was enough to guarantee that no vcpu would
> > run while code was patched.
> 
> So, at that time, kvm->lock was also held while a #PF was being fixed?
> 
It was, and also during kvm_mmu_load() which is called during vcpu entry
after roots are zapped.

> > This is no longer the case, and mutex_lock(&vcpu->kvm->lock); was
> > removed from that code path a long time ago, so now kvm_mmu_zap_all()
> > there is useless and the code is incorrect.
> > 
> > Lets drop kvm_mmu_zap_all() there (in separate patch) and fix the
> > patching properly later.
> 
> Will do.
> 

--
			Gleb.
