From: Zhuangyanying <ann.zhuangyanying@huawei.com>
To: <xiaoguangrong@tencent.com>, <pbonzini@redhat.com>,
	<arei.gonglei@huawei.com>
Cc: liu.jinsong@huawei.com, wangxinxin.wang@huawei.com,
	Zhuang Yanying <ann.zhuangyanying@huawei.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: [PATCH 4/4] KVM: MMU: fast cleanup D bit based on fast write protect
Date: Thu, 17 Jan 2019 13:55:31 +0000	[thread overview]
Message-ID: <1547733331-16140-5-git-send-email-ann.zhuangyanying@huawei.com> (raw)
In-Reply-To: <1547733331-16140-1-git-send-email-ann.zhuangyanying@huawei.com>

From: Zhuang Yanying <ann.zhuangyanying@huawei.com>

When live-migrating a guest with large memory, vCPUs may hang for a long
time while migration is starting, e.g. 9s for a 2T guest
(linux-5.0.0-rc2 + qemu-3.1.0).
The reason is that memory_global_dirty_log_start() takes too long while
the vCPUs wait for the BQL. The page-by-page D-bit cleanup is the main
time consumer. I think the idea of "KVM: MMU: fast write protect" by
xiaoguangrong, especially the function kvm_mmu_write_protect_all_pages(),
is very helpful. With a small modification on top of his patch, the hang
drops from 9s to 0.5s.

At the beginning of live migration, write protection is applied only to
the top-level SPTEs. A guest write then triggers an EPT violation, and
the write protection is pushed down the direct map with
for_each_shadow_entry(). Finally the Dirty bit of the target page (in the
level-1 page table) is cleared, and dirty page tracking starts. Of
course, the page containing the GPA is marked dirty in mmu_set_spte().
Xen has a similar implementation, except that it uses emt instead of
write protection.

Since the slot_enable_log_dirty callback is invoked with kvm->slots_lock
already held, the lock is dropped from kvm_mmu_write_protect_all_pages(),
and the symbol is exported so that the modular VMX code can call it.
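
To illustrate, here is a standalone toy sketch (not kernel code) of the
per-level decision added to mmu_load_shadow_page() below; the bit layout
and the helpers are simplified stand-ins for the real spte_clear_dirty()
and spte_write_protect():

	#include <stdbool.h>
	#include <stdint.h>
	#include <stdio.h>

	#define PT_PAGE_TABLE_LEVEL 1     /* 4K leaf level, as in KVM */
	#define SPTE_WRITABLE (1ull << 1) /* toy layout, not the real bits */
	#define SPTE_DIRTY    (1ull << 9)

	static bool spte_clear_dirty(uint64_t *sptep)
	{
		bool was_dirty = *sptep & SPTE_DIRTY;

		*sptep &= ~SPTE_DIRTY;    /* caller flushes TLB if it was set */
		return was_dirty;
	}

	static bool spte_write_protect(uint64_t *sptep)
	{
		bool was_writable = *sptep & SPTE_WRITABLE;

		*sptep &= ~SPTE_WRITABLE;
		return was_writable;
	}

	/*
	 * Mirrors the hunk below: 4K leaves only need the cheap D-bit
	 * clear (the SPTE stays writable, so later guest writes are
	 * logged through the D bit/PML instead of faulting), while
	 * large-page leaves are write-protected so that a guest write
	 * faults and the mapping can be split down to 4K for tracking.
	 */
	static bool track_leaf_spte(uint64_t *sptep, int level)
	{
		if (level == PT_PAGE_TABLE_LEVEL)
			return spte_clear_dirty(sptep);
		return spte_write_protect(sptep);
	}

	int main(void)
	{
		uint64_t spte_4k = SPTE_WRITABLE | SPTE_DIRTY;
		uint64_t spte_2m = SPTE_WRITABLE | SPTE_DIRTY;
		bool flush = false;

		flush |= track_leaf_spte(&spte_4k, 1); /* clears D bit only */
		flush |= track_leaf_spte(&spte_2m, 2); /* write-protects */
		printf("4K: %#llx, 2M: %#llx, flush=%d\n",
		       (unsigned long long)spte_4k,
		       (unsigned long long)spte_2m, flush);
		return 0;
	}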

Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
---
 arch/x86/kvm/mmu.c     | 8 +++++---
 arch/x86/kvm/vmx/vmx.c | 3 +--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 047b897..a18bcc0 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3257,7 +3257,10 @@ static bool mmu_load_shadow_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 			break;
 
 		if (is_last_spte(spte, sp->role.level)) {
-			flush |= spte_write_protect(sptep, false);
+			if (sp->role.level == PT_PAGE_TABLE_LEVEL)
+				flush |= spte_clear_dirty(sptep);
+			else
+				flush |= spte_write_protect(sptep, false);
 			continue;
 		}
 
@@ -6114,7 +6117,6 @@ void kvm_mmu_write_protect_all_pages(struct kvm *kvm, bool write_protect)
 {
 	u64 wp_all_indicator, kvm_wp_all_gen;
 
-	mutex_lock(&kvm->slots_lock);
 	wp_all_indicator = get_write_protect_all_indicator(kvm);
 	kvm_wp_all_gen = get_write_protect_all_gen(wp_all_indicator);
 
@@ -6134,8 +6136,8 @@ void kvm_mmu_write_protect_all_pages(struct kvm *kvm, bool write_protect)
 	 */
 	if (write_protect)
 		kvm_reload_remote_mmus(kvm);
-	mutex_unlock(&kvm->slots_lock);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_write_protect_all_pages);
 
 static unsigned long
 mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f6915f1..5236a07 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7180,8 +7180,7 @@ static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
 static void vmx_slot_enable_log_dirty(struct kvm *kvm,
 				     struct kvm_memory_slot *slot)
 {
-	kvm_mmu_slot_leaf_clear_dirty(kvm, slot);
-	kvm_mmu_slot_largepage_remove_write_access(kvm, slot);
+	kvm_mmu_write_protect_all_pages(kvm, true);
 }
 
 static void vmx_slot_disable_log_dirty(struct kvm *kvm,
-- 
1.8.3.1

Thread overview:

2019-01-17 13:55 [PATCH 0/4] KVM: MMU: fast cleanup D bit based on fast write protect Zhuangyanying
2019-01-17 13:55 ` [PATCH 1/4] KVM: MMU: correct the behavior of mmu_spte_update_no_track Zhuangyanying
2019-01-17 15:44   ` Sean Christopherson
2019-01-17 13:55 ` [PATCH 2/4] KVM: MMU: introduce possible_writable_spte_bitmap Zhuangyanying
2019-01-17 15:55   ` Sean Christopherson
2019-01-17 13:55 ` [PATCH 3/4] KVM: MMU: introduce kvm_mmu_write_protect_all_pages Zhuangyanying
2019-01-17 16:09   ` Sean Christopherson
2019-01-17 13:55 ` [PATCH 4/4] KVM: MMU: fast cleanup D bit based on fast write protect Zhuangyanying [this message]
2019-01-17 16:32   ` Sean Christopherson
2019-01-21  6:37     ` Zhuangyanying
2019-01-22 15:17       ` Sean Christopherson
2019-01-23 18:38         ` Zhuangyanying
