From: Zhuangyanying <ann.zhuangyanying@huawei.com>
To: xiaoguangrong@tencent.com, pbonzini@redhat.com, arei.gonglei@huawei.com
Cc: liu.jinsong@huawei.com, wangxinxin.wang@huawei.com,
	Zhuang Yanying <ann.zhuangyanying@huawei.com>,
	qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: [PATCH 4/4] KVM: MMU: fast cleanup D bit based on fast write protect
Date: Thu, 17 Jan 2019 13:55:31 +0000
Message-ID: <1547733331-16140-5-git-send-email-ann.zhuangyanying@huawei.com>
In-Reply-To: <1547733331-16140-1-git-send-email-ann.zhuangyanying@huawei.com>

From: Zhuang Yanying <ann.zhuangyanying@huawei.com>

When live-migrating a large-memory guest, a vCPU may hang for a long
time while migration is starting, e.g. 9s for a 2T guest
(linux-5.0.0-rc2 + qemu-3.1.0). The reason is that
memory_global_dirty_log_start() takes too long while the vCPU is
waiting for the BQL, and the page-by-page D bit cleanup is the main
time consumer. The idea of "KVM: MMU: fast write protect" by Xiao
Guangrong, especially the function kvm_mmu_write_protect_all_pages(),
is very helpful here. With a small modification on top of his patch,
the problem is solved: 9s down to 0.5s.

At the beginning of live migration, write protection is applied only
to the top-level SPTEs. A write from the VM then triggers an EPT
violation, and write protection is propagated down the direct map with
for_each_shadow_entry. Finally, the Dirty bit of the target page (in
the level-1 page table) is cleared, and dirty page tracking starts.
Of course, the page the faulting GPA belongs to is marked dirty in
mmu_set_spte(). Xen has a similar implementation, using EMT instead of
write protection.

Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
---
 arch/x86/kvm/mmu.c     | 8 +++++---
 arch/x86/kvm/vmx/vmx.c | 3 +--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 047b897..a18bcc0 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3257,7 +3257,10 @@ static bool mmu_load_shadow_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 			break;
 
 		if (is_last_spte(spte, sp->role.level)) {
-			flush |= spte_write_protect(sptep, false);
+			if (sp->role.level == PT_PAGE_TABLE_LEVEL)
+				flush |= spte_clear_dirty(sptep);
+			else
+				flush |= spte_write_protect(sptep, false);
 			continue;
 		}
 
@@ -6114,7 +6117,6 @@ void kvm_mmu_write_protect_all_pages(struct kvm *kvm, bool write_protect)
 {
 	u64 wp_all_indicator, kvm_wp_all_gen;
 
-	mutex_lock(&kvm->slots_lock);
 	wp_all_indicator = get_write_protect_all_indicator(kvm);
 	kvm_wp_all_gen = get_write_protect_all_gen(wp_all_indicator);
 
@@ -6134,8 +6136,8 @@ void kvm_mmu_write_protect_all_pages(struct kvm *kvm, bool write_protect)
 	 */
 	if (write_protect)
 		kvm_reload_remote_mmus(kvm);
-	mutex_unlock(&kvm->slots_lock);
 }
+EXPORT_SYMBOL_GPL(kvm_mmu_write_protect_all_pages);
 
 static unsigned long mmu_shrink_scan(struct shrinker *shrink,
 				     struct shrink_control *sc)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f6915f1..5236a07 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7180,8 +7180,7 @@ static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu)
 static void vmx_slot_enable_log_dirty(struct kvm *kvm,
 				      struct kvm_memory_slot *slot)
 {
-	kvm_mmu_slot_leaf_clear_dirty(kvm, slot);
-	kvm_mmu_slot_largepage_remove_write_access(kvm, slot);
+	kvm_mmu_write_protect_all_pages(kvm, true);
 }
 
 static void vmx_slot_disable_log_dirty(struct kvm *kvm,
-- 
1.8.3.1
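
To make the lazy scheme above concrete, here is a minimal user-space
model in C. It is a sketch only: the structures and names (struct spte
with a per-entry generation, wp_all_gen, FANOUT, handle_write_fault)
are assumptions invented for the demo and do not match KVM's real data
layout. Enabling dirty logging merely bumps a global generation counter
in O(1); the write-protect/D-bit work then happens lazily on each write
fault, walking root to leaf and clearing the D bit only at the last
level, mirroring the sp->role.level check the patch adds to
mmu_load_shadow_page().

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define LEVELS  4   /* EPT-style 4-level table (demo assumption)        */
#define FANOUT  4   /* tiny fanout instead of 512, to keep the demo small */

struct spte {
	bool wp;                /* write-protected?                        */
	bool dirty;             /* D bit; meaningful at the leaf level only */
	unsigned long gen;      /* generation this entry was last synced at */
	struct spte *table;     /* lower-level table of FANOUT entries, or NULL */
};

static unsigned long wp_all_gen;  /* global write-protect generation */

/* O(1) replacement for the page-by-page D-bit sweep at migration start */
static void write_protect_all_pages(void)
{
	wp_all_gen++;
}

static struct spte *alloc_table(int level)
{
	struct spte *t = calloc(FANOUT, sizeof(*t));

	if (level > 1)
		for (int i = 0; i < FANOUT; i++)
			t[i].table = alloc_table(level - 1);
	return t;
}

/*
 * Write-fault path: walk root -> leaf.  A stale generation means the
 * entry has not seen the current write-protect pass yet: non-leaf
 * entries get write-protected, the leaf gets its D bit cleared.
 */
static void handle_write_fault(struct spte *root, const int idx[LEVELS])
{
	struct spte *e = &root[idx[0]];

	for (int level = LEVELS; level >= 1; level--) {
		if (e->gen != wp_all_gen) {
			if (level == 1)
				e->dirty = false; /* spte_clear_dirty() analogue   */
			else
				e->wp = true;     /* spte_write_protect() analogue */
			e->gen = wp_all_gen;
		}
		if (level > 1)
			e = &e->table[idx[LEVELS - level + 1]];
	}
	e->dirty = true;  /* the faulting write itself marks the page dirty */
}

int main(void)
{
	struct spte *root = alloc_table(LEVELS);
	int path[LEVELS] = { 1, 2, 3, 0 };

	write_protect_all_pages();       /* migration start: just bump the generation */
	handle_write_fault(root, path);  /* cost is paid lazily, one fault at a time  */

	printf("leaf D bit after fault: %d\n",
	       root[1].table[2].table[3].table[0].dirty);
	return 0;
}

The design point this illustrates: the O(n)-in-guest-pages sweep at
migration start becomes an O(1) generation bump, and the sweep cost is
amortized across the guest's subsequent write faults.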