* [PATCH 0/7 -v2] KVM: Alleviate mmu_lock hold time when we start dirty logging
@ 2013-01-08 10:42 Takuya Yoshikawa
From: Takuya Yoshikawa @ 2013-01-08 10:42 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

Changelog v1->v2:
  The condition in patch 1 was changed like this:
    npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES)

This patch set makes kvm_mmu_slot_remove_write_access() rmap based and
adds conditional rescheduling to it.

The motivation for this change is of course to reduce the mmu_lock hold
time when we start dirty logging for a large memory slot.  You may not
see the problem if you just give the guest 8GB or less of memory with
THP enabled on the host -- this is for the worst case.

Takuya Yoshikawa (7):
  KVM: Write protect the updated slot only when dirty logging is enabled
  KVM: MMU: Remove unused parameter level from __rmap_write_protect()
  KVM: MMU: Make kvm_mmu_slot_remove_write_access() rmap based
  KVM: Remove unused slot_bitmap from kvm_mmu_page
  KVM: Make kvm_mmu_change_mmu_pages() take mmu_lock by itself
  KVM: Make kvm_mmu_slot_remove_write_access() take mmu_lock by itself
  KVM: MMU: Conditionally reschedule when kvm_mmu_slot_remove_write_access() takes a long time

 Documentation/virtual/kvm/mmu.txt |    7 ----
 arch/x86/include/asm/kvm_host.h   |    5 ---
 arch/x86/kvm/mmu.c                |   56 +++++++++++++++++++-----------------
 arch/x86/kvm/x86.c                |   12 ++++---
 virt/kvm/kvm_main.c               |    1 -
 5 files changed, 37 insertions(+), 44 deletions(-)

-- 
1.7.5.4
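
For context, the path this series optimizes is entered when userspace turns
on dirty logging for a memslot.  A minimal userspace sketch, assuming the
KVM ioctl API of this era (names and values are illustrative, error handling
omitted):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Enable dirty logging for an existing slot; on x86 this ends up in
 * kvm_arch_commit_memory_region(), which write protects the slot.
 */
static void start_dirty_logging(int vm_fd, __u32 slot, __u64 guest_phys,
				__u64 size, void *host_addr)
{
	struct kvm_userspace_memory_region region = {
		.slot            = slot,
		.flags           = KVM_MEM_LOG_DIRTY_PAGES,
		.guest_phys_addr = guest_phys,
		.memory_size     = size,
		.userspace_addr  = (unsigned long)host_addr,
	};

	ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}

/* Retrieve the dirty bitmap accumulated since the last call. */
static void get_dirty_log(int vm_fd, __u32 slot, void *bitmap)
{
	struct kvm_dirty_log log = {
		.slot         = slot,
		.dirty_bitmap = bitmap,	/* one bit per page in the slot */
	};

	ioctl(vm_fd, KVM_GET_DIRTY_LOG, &log);
}

Deleting a slot uses the same KVM_SET_USER_MEMORY_REGION ioctl with
memory_size set to zero, which is why patch 1 also checks npages before
write protecting the slot.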



* [PATCH 1/7] KVM: Write protect the updated slot only when dirty logging is enabled
From: Takuya Yoshikawa @ 2013-01-08 10:43 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

Calling kvm_mmu_slot_remove_write_access() for a deleted slot does
nothing but search for non-existent mmu pages which have mappings to
that deleted memory; this is safe but a waste of time.

Since we want to make the function rmap based in a later patch, in a
manner which makes it unsafe to call for a deleted slot, we make the
caller check that the slot is not being deleted (npages is non-zero)
and has dirty logging enabled.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/x86.c  |    8 +++++++-
 virt/kvm/kvm_main.c |    1 -
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1c9c834..add5e48 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6897,7 +6897,13 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	spin_lock(&kvm->mmu_lock);
 	if (nr_mmu_pages)
 		kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
-	kvm_mmu_slot_remove_write_access(kvm, mem->slot);
+	/*
+	 * Write protect all pages for dirty logging.
+	 * Existing largepage mappings are destroyed here and new ones will
+	 * not be created until the end of the logging.
+	 */
+	if (npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
+		kvm_mmu_slot_remove_write_access(kvm, mem->slot);
 	spin_unlock(&kvm->mmu_lock);
 	/*
 	 * If memory slot is created, or moved, we need to clear all
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e45c20c..f689a6d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -817,7 +817,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap) {
 		if (kvm_create_dirty_bitmap(&new) < 0)
 			goto out_free;
-		/* destroy any largepage mappings for dirty tracking */
 	}
 
 	if (!npages || base_gfn != old.base_gfn) {
-- 
1.7.5.4



* [PATCH 2/7] KVM: MMU: Remove unused parameter level from __rmap_write_protect()
From: Takuya Yoshikawa @ 2013-01-08 10:44 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

We no longer need to care about the mapping level in this function.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 01d7c2a..bee3509 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1142,7 +1142,7 @@ spte_write_protect(struct kvm *kvm, u64 *sptep, bool *flush, bool pt_protect)
 }
 
 static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
-				 int level, bool pt_protect)
+				 bool pt_protect)
 {
 	u64 *sptep;
 	struct rmap_iterator iter;
@@ -1180,7 +1180,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 	while (mask) {
 		rmapp = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
 				      PT_PAGE_TABLE_LEVEL, slot);
-		__rmap_write_protect(kvm, rmapp, PT_PAGE_TABLE_LEVEL, false);
+		__rmap_write_protect(kvm, rmapp, false);
 
 		/* clear the first set bit */
 		mask &= mask - 1;
@@ -1199,7 +1199,7 @@ static bool rmap_write_protect(struct kvm *kvm, u64 gfn)
 	for (i = PT_PAGE_TABLE_LEVEL;
 	     i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
 		rmapp = __gfn_to_rmap(gfn, i, slot);
-		write_protected |= __rmap_write_protect(kvm, rmapp, i, true);
+		write_protected |= __rmap_write_protect(kvm, rmapp, true);
 	}
 
 	return write_protected;
-- 
1.7.5.4



* [PATCH 3/7] KVM: MMU: Make kvm_mmu_slot_remove_write_access() rmap based
From: Takuya Yoshikawa @ 2013-01-08 10:44 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

This makes it possible to release mmu_lock and reschedule conditionally
in a later patch.  Although this may increase the time needed to protect
the whole slot when we start dirty logging, the kernel should not allow
userspace to trigger something that holds a spinlock for as long as tens
of milliseconds; in fact there is no upper bound, since the hold time is
roughly proportional to the number of guest pages.

Another point to note is that this patch removes the only user of
slot_bitmap, which would otherwise become a problem when we increase the
number of slots further.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c |   28 +++++++++++++++-------------
 1 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bee3509..b4d4fd1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4198,25 +4198,27 @@ int kvm_mmu_setup(struct kvm_vcpu *vcpu)
 
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 {
-	struct kvm_mmu_page *sp;
-	bool flush = false;
+	struct kvm_memory_slot *memslot;
+	gfn_t last_gfn;
+	int i;
 
-	list_for_each_entry(sp, &kvm->arch.active_mmu_pages, link) {
-		int i;
-		u64 *pt;
+	memslot = id_to_memslot(kvm->memslots, slot);
+	last_gfn = memslot->base_gfn + memslot->npages - 1;
 
-		if (!test_bit(slot, sp->slot_bitmap))
-			continue;
+	for (i = PT_PAGE_TABLE_LEVEL;
+	     i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
+		unsigned long *rmapp;
+		unsigned long last_index, index;
 
-		pt = sp->spt;
-		for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-			if (!is_shadow_present_pte(pt[i]) ||
-			      !is_last_spte(pt[i], sp->role.level))
-				continue;
+		rmapp = memslot->arch.rmap[i - PT_PAGE_TABLE_LEVEL];
+		last_index = gfn_to_index(last_gfn, memslot->base_gfn, i);
 
-			spte_write_protect(kvm, &pt[i], &flush, false);
+		for (index = 0; index <= last_index; ++index, ++rmapp) {
+			if (*rmapp)
+				__rmap_write_protect(kvm, rmapp, false);
 		}
 	}
+
 	kvm_flush_remote_tlbs(kvm);
 }
 
-- 
1.7.5.4
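
A note on the indexing above: the rmap heads walked by the new loop live in
per-memslot, per-level arrays (memslot->arch.rmap[]), and gfn_to_index()
turns a gfn into an index into the array for a given level.  As a rough
sketch of the helper this code relies on (based on its definition in this
era):

static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
{
	/* KVM_HPAGE_GFN_SHIFT(PT_PAGE_TABLE_LEVEL) must be 0. */
	return (gfn >> KVM_HPAGE_GFN_SHIFT(level)) -
	       (base_gfn >> KVM_HPAGE_GFN_SHIFT(level));
}

So, per level, the loop visits last_index + 1 rmap heads -- one per
(possibly huge) page frame covered by the slot -- and write protects only
those that actually have sptes attached.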



* [PATCH 4/7] KVM: Remove unused slot_bitmap from kvm_mmu_page
From: Takuya Yoshikawa @ 2013-01-08 10:45 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

Not needed any more.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 Documentation/virtual/kvm/mmu.txt |    7 -------
 arch/x86/include/asm/kvm_host.h   |    5 -----
 arch/x86/kvm/mmu.c                |   10 ----------
 3 files changed, 0 insertions(+), 22 deletions(-)

diff --git a/Documentation/virtual/kvm/mmu.txt b/Documentation/virtual/kvm/mmu.txt
index fa5f1db..43fcb76 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -187,13 +187,6 @@ Shadow pages contain the following information:
     perform a reverse map from a pte to a gfn. When role.direct is set, any
     element of this array can be calculated from the gfn field when used, in
     this case, the array of gfns is not allocated. See role.direct and gfn.
-  slot_bitmap:
-    A bitmap containing one bit per memory slot.  If the page contains a pte
-    mapping a page from memory slot n, then bit n of slot_bitmap will be set
-    (if a page is aliased among several slots, then it is not guaranteed that
-    all slots will be marked).
-    Used during dirty logging to avoid scanning a shadow page if none if its
-    pages need tracking.
   root_count:
     A counter keeping track of how many hardware registers (guest cr3 or
     pdptrs) are now pointing at the page.  While this counter is nonzero, the
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c431b33..f75e1fe 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -219,11 +219,6 @@ struct kvm_mmu_page {
 	u64 *spt;
 	/* hold the gfn of each spte inside spt */
 	gfn_t *gfns;
-	/*
-	 * One bit set per slot which has memory
-	 * in this shadow page.
-	 */
-	DECLARE_BITMAP(slot_bitmap, KVM_MEM_SLOTS_NUM);
 	bool unsync;
 	int root_count;          /* Currently serving as active root */
 	unsigned int unsync_children;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b4d4fd1..bb964b3 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1522,7 +1522,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 		sp->gfns = mmu_memory_cache_alloc(&vcpu->arch.mmu_page_cache);
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
 	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
-	bitmap_zero(sp->slot_bitmap, KVM_MEM_SLOTS_NUM);
 	sp->parent_ptes = 0;
 	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 	kvm_mod_used_mmu_pages(vcpu->kvm, +1);
@@ -2183,14 +2182,6 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page);
 
-static void page_header_update_slot(struct kvm *kvm, void *pte, gfn_t gfn)
-{
-	int slot = memslot_id(kvm, gfn);
-	struct kvm_mmu_page *sp = page_header(__pa(pte));
-
-	__set_bit(slot, sp->slot_bitmap);
-}
-
 /*
  * The function is based on mtrr_type_lookup() in
  * arch/x86/kernel/cpu/mtrr/generic.c
@@ -2497,7 +2488,6 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 		++vcpu->kvm->stat.lpages;
 
 	if (is_shadow_present_pte(*sptep)) {
-		page_header_update_slot(vcpu->kvm, sptep, gfn);
 		if (!was_rmapped) {
 			rmap_count = rmap_add(vcpu, sptep, gfn);
 			if (rmap_count > RMAP_RECYCLE_THRESHOLD)
-- 
1.7.5.4



* [PATCH 5/7] KVM: Make kvm_mmu_change_mmu_pages() take mmu_lock by itself
From: Takuya Yoshikawa @ 2013-01-08 10:46 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

There is no reason to make callers take mmu_lock, since we do not need
to protect kvm_mmu_change_mmu_pages() and
kvm_mmu_slot_remove_write_access() together by mmu_lock in
kvm_arch_commit_memory_region(): the former calls
kvm_mmu_commit_zap_page() and flushes TLBs by itself.

Note: we do not need to protect kvm->arch.n_requested_mmu_pages with
mmu_lock, as can be seen from the fact that it is read locklessly.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c |    4 ++++
 arch/x86/kvm/x86.c |    9 ++++-----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bb964b3..fc7d84a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2143,6 +2143,8 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int goal_nr_mmu_pages)
 	 * change the value
 	 */
 
+	spin_lock(&kvm->mmu_lock);
+
 	if (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages) {
 		while (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages &&
 			!list_empty(&kvm->arch.active_mmu_pages)) {
@@ -2157,6 +2159,8 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int goal_nr_mmu_pages)
 	}
 
 	kvm->arch.n_max_mmu_pages = goal_nr_mmu_pages;
+
+	spin_unlock(&kvm->mmu_lock);
 }
 
 int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index add5e48..080bbdc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3270,12 +3270,10 @@ static int kvm_vm_ioctl_set_nr_mmu_pages(struct kvm *kvm,
 		return -EINVAL;
 
 	mutex_lock(&kvm->slots_lock);
-	spin_lock(&kvm->mmu_lock);
 
 	kvm_mmu_change_mmu_pages(kvm, kvm_nr_mmu_pages);
 	kvm->arch.n_requested_mmu_pages = kvm_nr_mmu_pages;
 
-	spin_unlock(&kvm->mmu_lock);
 	mutex_unlock(&kvm->slots_lock);
 	return 0;
 }
@@ -6894,7 +6892,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	if (!kvm->arch.n_requested_mmu_pages)
 		nr_mmu_pages = kvm_mmu_calculate_mmu_pages(kvm);
 
-	spin_lock(&kvm->mmu_lock);
 	if (nr_mmu_pages)
 		kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
 	/*
@@ -6902,9 +6899,11 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 * Existing largepage mappings are destroyed here and new ones will
 	 * not be created until the end of the logging.
 	 */
-	if (npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
+	if (npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
+		spin_lock(&kvm->mmu_lock);
 		kvm_mmu_slot_remove_write_access(kvm, mem->slot);
-	spin_unlock(&kvm->mmu_lock);
+		spin_unlock(&kvm->mmu_lock);
+	}
 	/*
 	 * If memory slot is created, or moved, we need to clear all
 	 * mmio sptes.
-- 
1.7.5.4



* [PATCH 6/7] KVM: Make kvm_mmu_slot_remove_write_access() take mmu_lock by itself
From: Takuya Yoshikawa @ 2013-01-08 10:46 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

It is better to place the mmu_lock handling and the TLB flushing code
together, since the function is self-contained.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c |    3 +++
 arch/x86/kvm/x86.c |    5 +----
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index fc7d84a..b7a1235 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4199,6 +4199,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 	memslot = id_to_memslot(kvm->memslots, slot);
 	last_gfn = memslot->base_gfn + memslot->npages - 1;
 
+	spin_lock(&kvm->mmu_lock);
+
 	for (i = PT_PAGE_TABLE_LEVEL;
 	     i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
 		unsigned long *rmapp;
@@ -4214,6 +4216,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 	}
 
 	kvm_flush_remote_tlbs(kvm);
+	spin_unlock(&kvm->mmu_lock);
 }
 
 void kvm_mmu_zap_all(struct kvm *kvm)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 080bbdc..5483228 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6899,11 +6899,8 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 * Existing largepage mappings are destroyed here and new ones will
 	 * not be created until the end of the logging.
 	 */
-	if (npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
-		spin_lock(&kvm->mmu_lock);
+	if (npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES))
 		kvm_mmu_slot_remove_write_access(kvm, mem->slot);
-		spin_unlock(&kvm->mmu_lock);
-	}
 	/*
 	 * If memory slot is created, or moved, we need to clear all
 	 * mmio sptes.
-- 
1.7.5.4



* [PATCH 7/7] KVM: MMU: Conditionally reschedule when kvm_mmu_slot_remove_write_access() takes a long time
From: Takuya Yoshikawa @ 2013-01-08 10:47 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm, linux-kernel

If userspace starts dirty logging for a large slot, say 64GB of memory,
kvm_mmu_slot_remove_write_access() needs to hold mmu_lock for a long
time, on the order of tens of milliseconds.  This patch bounds the lock
hold time by asking the scheduler whether we need to reschedule for
others.

One penalty for this is that we need to flush TLBs before releasing
mmu_lock.  But since holding mmu_lock for a long time affects not only
the guest -- its vCPU threads, in other words -- but also the host as a
whole, it is a cost worth paying.

In practice, the cost will not be high because we can protect a fair
amount of memory before being rescheduled: on my test environment,
cond_resched_lock() was called only once while protecting 12GB of
memory, even without THP.  We can also revisit Avi's "unlocked TLB
flush" work later to completely suppress the extra TLB flushes if
needed.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/kvm/mmu.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b7a1235..a32e8cf 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4212,6 +4212,11 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 		for (index = 0; index <= last_index; ++index, ++rmapp) {
 			if (*rmapp)
 				__rmap_write_protect(kvm, rmapp, false);
+
+			if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+				kvm_flush_remote_tlbs(kvm);
+				cond_resched_lock(&kvm->mmu_lock);
+			}
 		}
 	}
 
-- 
1.7.5.4
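
The TLB flush is done before cond_resched_lock() because that call may drop
mmu_lock, and the sptes write protected so far must take effect before other
mmu_lock users can run.  A simplified sketch of what
cond_resched_lock(&kvm->mmu_lock) does (not the exact kernel implementation):

static int cond_resched_lock_sketch(spinlock_t *lock)
{
	int rescheduled = 0;

	if (need_resched() || spin_needbreak(lock)) {
		spin_unlock(lock);
		cond_resched();		/* may run other tasks here */
		spin_lock(lock);
		rescheduled = 1;
	}
	return rescheduled;
}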



* Re: [PATCH 0/7 -v2] KVM: Alleviate mmu_lock hold time when we start dirty logging
From: Marcelo Tosatti @ 2013-01-10 17:49 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: gleb, kvm, linux-kernel

On Tue, Jan 08, 2013 at 07:42:38PM +0900, Takuya Yoshikawa wrote:
> Changelog v1->v2:
>   The condition in patch 1 was changed like this:
>     npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
> 
> This patch set makes kvm_mmu_slot_remove_write_access() rmap based and
> adds conditional rescheduling to it.
> 
> The motivation for this change is of course to reduce the mmu_lock hold
> time when we start dirty logging for a large memory slot.  You may not
> see the problem if you just give the guest 8GB or less of memory with
> THP enabled on the host -- this is for the worst case.
> 
> Takuya Yoshikawa (7):
>   KVM: Write protect the updated slot only when dirty logging is enabled
>   KVM: MMU: Remove unused parameter level from __rmap_write_protect()
>   KVM: MMU: Make kvm_mmu_slot_remove_write_access() rmap based
>   KVM: Remove unused slot_bitmap from kvm_mmu_page
>   KVM: Make kvm_mmu_change_mmu_pages() take mmu_lock by itself
>   KVM: Make kvm_mmu_slot_remove_write_access() take mmu_lock by itself
>   KVM: MMU: Conditionally reschedule when kvm_mmu_slot_remove_write_access() takes a long time
> 
>  Documentation/virtual/kvm/mmu.txt |    7 ----
>  arch/x86/include/asm/kvm_host.h   |    5 ---
>  arch/x86/kvm/mmu.c                |   56 +++++++++++++++++++-----------------
>  arch/x86/kvm/x86.c                |   12 ++++---
>  virt/kvm/kvm_main.c               |    1 -
>  5 files changed, 37 insertions(+), 44 deletions(-)
> 
> -- 
> 1.7.5.4

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>



* Re: [PATCH 0/7 -v2] KVM: Alleviate mmu_lock hold time when we start dirty logging
From: Gleb Natapov @ 2013-01-14  9:15 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: mtosatti, kvm, linux-kernel

On Tue, Jan 08, 2013 at 07:42:38PM +0900, Takuya Yoshikawa wrote:
> Changelog v1->v2:
>   The condition in patch 1 was changed like this:
>     npages && (mem->flags & KVM_MEM_LOG_DIRTY_PAGES)
> 
> This patch set makes kvm_mmu_slot_remove_write_access() rmap based and
> adds conditional rescheduling to it.
> 
> The motivation for this change is of course to reduce the mmu_lock hold
> time when we start dirty logging for a large memory slot.  You may not
> see the problem if you just give the guest 8GB or less of memory with
> THP enabled on the host -- this is for the worst case.
> 
Applied, thanks.

> Takuya Yoshikawa (7):
>   KVM: Write protect the updated slot only when dirty logging is enabled
>   KVM: MMU: Remove unused parameter level from __rmap_write_protect()
>   KVM: MMU: Make kvm_mmu_slot_remove_write_access() rmap based
>   KVM: Remove unused slot_bitmap from kvm_mmu_page
>   KVM: Make kvm_mmu_change_mmu_pages() take mmu_lock by itself
>   KVM: Make kvm_mmu_slot_remove_write_access() take mmu_lock by itself
>   KVM: MMU: Conditionally reschedule when kvm_mmu_slot_remove_write_access() takes a long time
> 
>  Documentation/virtual/kvm/mmu.txt |    7 ----
>  arch/x86/include/asm/kvm_host.h   |    5 ---
>  arch/x86/kvm/mmu.c                |   56 +++++++++++++++++++-----------------
>  arch/x86/kvm/x86.c                |   12 ++++---
>  virt/kvm/kvm_main.c               |    1 -
>  5 files changed, 37 insertions(+), 44 deletions(-)
> 
> -- 
> 1.7.5.4

--
			Gleb.

