All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH v4 06/10] KVM: MMU: fast path of handling guest page fault
Date: Fri, 27 Apr 2012 11:52:13 -0300	[thread overview]
Message-ID: <20120427145213.GB28796@amt.cnet> (raw)
In-Reply-To: <4F9A3445.2060305@linux.vnet.ibm.com>

On Fri, Apr 27, 2012 at 01:53:09PM +0800, Xiao Guangrong wrote:
> On 04/27/2012 07:45 AM, Marcelo Tosatti wrote:
> 
> 
> >> +static bool
> >> +fast_pf_fix_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
> >> +		  u64 *sptep, u64 spte)
> >> +{
> >> +	gfn_t gfn;
> >> +
> >> +	spin_lock(&vcpu->kvm->mmu_lock);
> >> +
> >> +	/* The spte has been changed. */
> >> +	if (*sptep != spte)
> >> +		goto exit;
> >> +
> >> +	gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> >> +
> >> +	*sptep = spte | PT_WRITABLE_MASK;
> >> +	mark_page_dirty(vcpu->kvm, gfn);
> >> +
> >> +exit:
> >> +	spin_unlock(&vcpu->kvm->mmu_lock);
> > 
> > There was a misunderstanding. 
> 
> 
> Sorry, Marcelo! It is still not clear for me now. :(
> 
> > The suggestion is to change codepaths that
> > today assume that a side effect of holding mmu_lock is that sptes are
> > not updated or read before attempting to introduce any lockless spte
> > updates.
> > 
> > example)
> > 
> >         u64 mask, old_spte = *sptep;
> > 
> >         if (!spte_has_volatile_bits(old_spte) || (new_spte & mask) == mask)
> >                 __update_clear_spte_fast(sptep, new_spte);
> >         else
> >                 old_spte = __update_clear_spte_slow(sptep, new_spte);
> > 
> > The local old_spte copy might be stale by the
> > time "spte_has_volatile_bits(old_spte)" reads the writable bit.
> > 
> > example)
> > 
> > 
> > VCPU0                                       VCPU1
> > set_spte
> > mmu_spte_update decides fast write 
> > mov newspte to sptep
> > (non-locked write instruction)
> > newspte in store buffer
> > 
> >                                             lockless spte read path reads stale value
> > 
> > spin_unlock(mmu_lock) 
> > 
> > Depending on the what stale value CPU1 read and what decision it took
> > this can be a problem, say the new bits (and we do not want to verify
> > every single case). The spte write from inside mmu_lock should always be
> > atomic now?
> > 
> > So these details must be exposed to the reader, they are hidden now
> > (note mmu_spte_update is already a maze, its getting too deep).
> > 
> 
> 
> Actually, in this patch, all the spte update is under mmu-lock, and we
> lockless-ly read spte , but the spte will be verified again after holding
> mmu-lock.

Yes but the objective you are aiming for is to read and write sptes
without mmu_lock. That is, i am not talking about this patch. 
Please read carefully the two examples i gave (separated by "example)").

> +	spin_lock(&vcpu->kvm->mmu_lock);
> +
> +	/* The spte has been changed. */
> +	if (*sptep != spte)
> +		goto exit;
> +
> +	gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> +
> +	*sptep = spte | PT_WRITABLE_MASK;
> +	mark_page_dirty(vcpu->kvm, gfn);
> +
> +exit:
> +	spin_unlock(&vcpu->kvm->mmu_lock);
> 
> Is not the same as both read/update spte under mmu-lock?
> 
> Hmm, this is what you want?

The rules for code under mmu_lock should be:

1) Spte updates under mmu lock must always be atomic and 
with locked instructions.
2) Spte values must be read once, and appropriate action
must be taken when writing them back in case their value 
has changed (remote TLB flush might be required).

The maintenance of:

- gpte writable bit 
- protected by dirty log

Bits is tricky. We should think of a way to simplify things 
and get rid of them (or at least one of them), if possible.


  reply	other threads:[~2012-04-27 14:55 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-25  4:00 [PATCH v4 00/10] KVM: MMU: fast page fault Xiao Guangrong
2012-04-25  4:01 ` [PATCH v4 01/10] KVM: MMU: return bool in __rmap_write_protect Xiao Guangrong
2012-04-25  4:01 ` [PATCH v4 02/10] KVM: MMU: abstract spte write-protect Xiao Guangrong
2012-04-25  4:02 ` [PATCH v4 03/10] KVM: VMX: export PFEC.P bit on ept Xiao Guangrong
2012-04-25  4:02 ` [PATCH v4 04/10] KVM: MMU: introduce SPTE_MMU_WRITEABLE bit Xiao Guangrong
2012-04-25  4:03 ` [PATCH v4 05/10] KVM: MMU: introduce SPTE_WRITE_PROTECT bit Xiao Guangrong
2012-04-25  4:03 ` [PATCH v4 06/10] KVM: MMU: fast path of handling guest page fault Xiao Guangrong
2012-04-26 23:45   ` Marcelo Tosatti
2012-04-27  5:53     ` Xiao Guangrong
2012-04-27 14:52       ` Marcelo Tosatti [this message]
2012-04-28  6:10         ` Xiao Guangrong
2012-05-01  1:34           ` Marcelo Tosatti
2012-05-02  5:28             ` Xiao Guangrong
2012-05-02 21:07               ` Marcelo Tosatti
2012-05-03 11:26                 ` Xiao Guangrong
2012-05-05 14:08                   ` Marcelo Tosatti
2012-05-06  9:36                     ` Avi Kivity
2012-05-07  6:52                     ` Xiao Guangrong
2012-04-29  8:50         ` Takuya Yoshikawa
2012-05-01  2:31           ` Marcelo Tosatti
2012-05-02  5:39           ` Xiao Guangrong
2012-05-02 21:10             ` Marcelo Tosatti
2012-05-03 12:09               ` Xiao Guangrong
2012-05-03 12:13                 ` Avi Kivity
2012-05-03  0:15             ` Takuya Yoshikawa
2012-05-03 12:23               ` Xiao Guangrong
2012-05-03 12:40                 ` Takuya Yoshikawa
2012-04-25  4:04 ` [PATCH v4 07/10] KVM: MMU: lockless update spte on fast page fault path Xiao Guangrong
2012-04-25  4:04 ` [PATCH v4 08/10] KVM: MMU: trace fast page fault Xiao Guangrong
2012-04-25  4:05 ` [PATCH v4 09/10] KVM: MMU: fix kvm_mmu_pagetable_walk tracepoint Xiao Guangrong
2012-04-25  4:06 ` [PATCH v4 10/10] KVM: MMU: document mmu-lock and fast page fault Xiao Guangrong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120427145213.GB28796@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.