From: Marcelo Tosatti <mtosatti@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: gleb@redhat.com, avi.kivity@gmail.com, pbonzini@redhat.com,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH v3 04/15] KVM: MMU: flush tlb out of mmu lock when write-protect the sptes
Date: Wed, 13 Nov 2013 22:36:10 -0200
Message-ID: <20131114003609.GA15692@amt.cnet>
In-Reply-To: <1382534973-13197-5-git-send-email-xiaoguangrong@linux.vnet.ibm.com>
On Wed, Oct 23, 2013 at 09:29:22PM +0800, Xiao Guangrong wrote:
> Now we can flush all the TLBs outside the mmu lock without TLB corruption when
> write-protecting the sptes, because:
> - we have marked large sptes readonly instead of dropping them, which means we
> just change the spte from writable to readonly, so we only need to care about
> the case of changing the spte from present to present (changing the spte from
> present to nonpresent will flush all the TLBs immediately); in other words,
> the only case we need to care about is mmu_spte_update()
Xiao,
Any code location that reads the writable bit in the spte and assumes that,
if it is not set, the translation the spte refers to is not cached in a
remote CPU's TLB can become buggy. (*)
It might not be an issue right now, but it is so subtle that
it should be improved.
Can you add a fat comment on top of is_writable_pte describing this,
and explain why is_writable_pte users do not make an assumption
about (*)?
"Writeable bit of locklessly modifiable sptes might be cleared
but TLBs not flushed: so whenever reading locklessly modifiable sptes
you cannot assume TLBs are flushed".
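Something along these lines, as a minimal userspace sketch (C11) of where the comment would sit. The bit positions below are illustrative assumptions rather than the kernel's definitions, and spte_may_be_cached_writable() is a hypothetical helper that only exists here to spell out the rule:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; the real masks live in arch/x86/kvm/. */
#define PT_WRITABLE_MASK	(1ULL << 1)
#define SPTE_HOST_WRITEABLE	(1ULL << 10)
#define SPTE_MMU_WRITEABLE	(1ULL << 11)

/* The hardware writable bit must be distinct from the software bits. */
static_assert((PT_WRITABLE_MASK &
	       (SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE)) == 0,
	      "writable bit overlaps the software-available bits");

/*
 * The writable bit of locklessly modifiable sptes might be cleared
 * while the TLBs are not yet flushed. A caller that sees this return
 * false on such an spte cannot assume that no remote TLB still holds
 * a writable translation.
 */
static inline bool is_writable_pte(uint64_t pte)
{
	return (pte & PT_WRITABLE_MASK) != 0;
}

/* Both software bits set: the spte can be made writable without mmu_lock. */
static inline bool spte_is_locklessly_modifiable(uint64_t spte)
{
	const uint64_t mask = SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE;

	return (spte & mask) == mask;
}

/*
 * Hypothetical helper: conservative answer to "could a remote TLB
 * still cache this translation as writable?".
 */
static inline bool spte_may_be_cached_writable(uint64_t spte)
{
	return is_writable_pte(spte) || spte_is_locklessly_modifiable(spte);
}
```

An spte with both software bits set but the writable bit clear is the case the comment warns about: is_writable_pte() is false, yet the conservative helper still answers true.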
For example this one is unclear:
	if (!can_unsync && is_writable_pte(*sptep))
		goto set_pte;
And:
	if (!is_writable_pte(spte) &&
	    !(pt_protect && spte_is_locklessly_modifiable(spte)))
		return false;
This is safe because get_dirty_log and kvm_mmu_slot_remove_write_access are
serialized by a single mutex (if there were two mutexes, it would not be
safe). Can you add an assert to both kvm_mmu_slot_remove_write_access and
kvm_vm_ioctl_get_dirty_log that slots_lock is held, and explain why?
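To illustrate the shape of that assert, here is a toy userspace model, not kernel code. In the kernel the check would presumably be lockdep_assert_held(&kvm->slots_lock); everything named toy_* below is a hypothetical stand-in, and the "lock" is only a flag so the sketch stays self-contained:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct kvm: a held flag instead of a real mutex. */
struct toy_kvm {
	bool slots_lock_held;	/* models kvm->slots_lock ownership */
	int tlb_flushes;	/* counts flushes done outside mmu_lock */
};

static void toy_slots_lock(struct toy_kvm *kvm)
{
	assert(!kvm->slots_lock_held);
	kvm->slots_lock_held = true;
}

static void toy_slots_unlock(struct toy_kvm *kvm)
{
	assert(kvm->slots_lock_held);
	kvm->slots_lock_held = false;
}

/*
 * Both write-protect paths flush TLBs after dropping mmu_lock. That is
 * only safe while they are serialized against each other by the same
 * mutex, so each one asserts that the lock is held on entry.
 */
static void toy_remove_write_access(struct toy_kvm *kvm)
{
	assert(kvm->slots_lock_held);
	/* ... write-protect sptes under mmu_lock, then unlock ... */
	kvm->tlb_flushes++;
}

static void toy_get_dirty_log(struct toy_kvm *kvm, bool is_dirty)
{
	assert(kvm->slots_lock_held);
	/* ... write-protect dirty pages under mmu_lock, then unlock ... */
	if (is_dirty)
		kvm->tlb_flushes++;
}
```

Calling either toy function without first taking the toy lock trips the assert, which is the behaviour the real assertion is meant to document.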
So just improve the comments please, thanks (no need to resend whole
series).
> - in mmu_spte_update(), we have checked
> SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE instead of PT_WRITABLE_MASK, which
> means it does not depend on PT_WRITABLE_MASK anymore
>
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
> arch/x86/kvm/mmu.c | 18 ++++++++++++++----
> arch/x86/kvm/x86.c | 9 +++++++--
> 2 files changed, 21 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 62f18ec..337d173 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4273,15 +4273,25 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
>  			if (*rmapp)
>  				__rmap_write_protect(kvm, rmapp, false);
>
> -			if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
> -				kvm_flush_remote_tlbs(kvm);
> +			if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>  				cond_resched_lock(&kvm->mmu_lock);
> -			}
>  		}
>  	}
>
> -	kvm_flush_remote_tlbs(kvm);
>  	spin_unlock(&kvm->mmu_lock);
> +
> +	/*
> +	 * We can flush all the TLBs out of the mmu lock without TLB
> +	 * corruption since we just change the spte from writable to
> +	 * readonly so that we only need to care the case of changing
> +	 * spte from present to present (changing the spte from present
> +	 * to nonpresent will flush all the TLBs immediately), in other
> +	 * words, the only case we care is mmu_spte_update() where we
> +	 * haved checked SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE
> +	 * instead of PT_WRITABLE_MASK, that means it does not depend
> +	 * on PT_WRITABLE_MASK anymore.
> +	 */
> +	kvm_flush_remote_tlbs(kvm);
>  }
>
> #define BATCH_ZAP_PAGES 10
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b3aa650..573c6b3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3571,11 +3571,16 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  		offset = i * BITS_PER_LONG;
>  		kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
>  	}
> -	if (is_dirty)
> -		kvm_flush_remote_tlbs(kvm);
>
>  	spin_unlock(&kvm->mmu_lock);
>
> +	/*
> +	 * All the TLBs can be flushed out of mmu lock, see the comments in
> +	 * kvm_mmu_slot_remove_write_access().
> +	 */
> +	if (is_dirty)
> +		kvm_flush_remote_tlbs(kvm);
> +
>  	r = -EFAULT;
>  	if (copy_to_user(log->dirty_bitmap, dirty_bitmap_buffer, n))
>  		goto out;
> --
> 1.8.1.4