* [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2)
@ 2009-06-10 16:23 Izik Eidus
  2009-06-10 16:23 ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Izik Eidus
  2009-06-15 14:58 ` [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2) Ryan Harper
  0 siblings, 2 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 16:23 UTC (permalink / raw)
  To: kvm; +Cc: avi, Izik Eidus

RFC move to dirty bit tracking using the page table dirty bit (v2)

(BTW, it seems the VNC code in mainline has some bugs; I wasted 2
 hours debugging a rendering bug that I thought was related to this series,
 but it turned out to be unrelated.)

Thanks.

Izik Eidus (2):
  kvm: fix dirty bit tracking for slots with large pages
  kvm: change the dirty page tracking to work with dirty bit

 arch/ia64/kvm/kvm-ia64.c        |    4 +++
 arch/powerpc/kvm/powerpc.c      |    4 +++
 arch/s390/kvm/kvm-s390.c        |    4 +++
 arch/x86/include/asm/kvm_host.h |    3 ++
 arch/x86/kvm/mmu.c              |   42 ++++++++++++++++++++++++++++++++++++--
 arch/x86/kvm/svm.c              |    7 ++++++
 arch/x86/kvm/vmx.c              |    7 ++++++
 arch/x86/kvm/x86.c              |   26 ++++++++++++++++++++---
 include/linux/kvm_host.h        |    1 +
 virt/kvm/kvm_main.c             |    8 ++++++-
 10 files changed, 98 insertions(+), 8 deletions(-)


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages
  2009-06-10 16:23 [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2) Izik Eidus
@ 2009-06-10 16:23 ` Izik Eidus
  2009-06-10 16:23   ` [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bit Izik Eidus
  2009-06-14 11:10   ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Avi Kivity
  2009-06-15 14:58 ` [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2) Ryan Harper
  1 sibling, 2 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 16:23 UTC (permalink / raw)
  To: kvm; +Cc: avi, Izik Eidus

When a slot is already allocated and we are asked to start tracking it, we
need to break its large pages.

This code flushes the mmu when someone asks a slot to start dirty bit tracking.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 virt/kvm/kvm_main.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 669eb4a..3046e9c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1194,6 +1194,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		if (!new.dirty_bitmap)
 			goto out_free;
 		memset(new.dirty_bitmap, 0, dirty_bytes);
+		if (old.npages)
+			kvm_arch_flush_shadow(kvm);
 	}
 #endif /* not defined CONFIG_S390 */
 
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bit
  2009-06-10 16:23 ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Izik Eidus
@ 2009-06-10 16:23   ` Izik Eidus
  2009-06-10 17:00     ` Izik Eidus
                       ` (3 more replies)
  2009-06-14 11:10   ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Avi Kivity
  1 sibling, 4 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 16:23 UTC (permalink / raw)
  To: kvm; +Cc: avi, Izik Eidus

Change the dirty page tracking to work with the dirty bit instead of page
faults. Right now dirty page tracking works with the help of page faults:
when we want to track a page for being dirty, we write protect it and mark
it dirty when we get a write page fault. This patch moves that to looking
at the dirty bit of the spte.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 arch/ia64/kvm/kvm-ia64.c        |    4 +++
 arch/powerpc/kvm/powerpc.c      |    4 +++
 arch/s390/kvm/kvm-s390.c        |    4 +++
 arch/x86/include/asm/kvm_host.h |    3 ++
 arch/x86/kvm/mmu.c              |   42 ++++++++++++++++++++++++++++++++++++--
 arch/x86/kvm/svm.c              |    7 ++++++
 arch/x86/kvm/vmx.c              |    7 ++++++
 arch/x86/kvm/x86.c              |   26 ++++++++++++++++++++---
 include/linux/kvm_host.h        |    1 +
 virt/kvm/kvm_main.c             |    6 ++++-
 10 files changed, 96 insertions(+), 8 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 3199221..5914128 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1809,6 +1809,10 @@ void kvm_arch_exit(void)
 	kvm_vmm_info = NULL;
 }
 
+void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
+{
+}
+
 static int kvm_ia64_sync_dirty_log(struct kvm *kvm,
 		struct kvm_dirty_log *log)
 {
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 2cf915e..6beb368 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -418,6 +418,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
 	return -ENOTSUPP;
 }
 
+void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
+{
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 981ab04..ab6f115 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -130,6 +130,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 	return 0;
 }
 
+void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
+{
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
 		       unsigned int ioctl, unsigned long arg)
 {
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c7b0cc2..8a24149 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -527,6 +527,7 @@ struct kvm_x86_ops {
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
+	int (*dirty_bit_support)(void);
 };
 
 extern struct kvm_x86_ops *kvm_x86_ops;
@@ -796,4 +797,6 @@ int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
 int kvm_age_hva(struct kvm *kvm, unsigned long hva);
 int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
 
+int is_dirty_and_clean_rmapp(struct kvm *kvm, unsigned long *rmapp);
+
 #endif /* _ASM_X86_KVM_HOST_H */
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 809cce0..500e0e2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -140,6 +140,8 @@ module_param(oos_shadow, bool, 0644);
 #define ACC_USER_MASK    PT_USER_MASK
 #define ACC_ALL          (ACC_EXEC_MASK | ACC_WRITE_MASK | ACC_USER_MASK)
 
+#define SPTE_DONT_DIRTY (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
+
 #define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
 
 struct kvm_rmap_desc {
@@ -629,6 +631,29 @@ static u64 *rmap_next(struct kvm *kvm, unsigned long *rmapp, u64 *spte)
 	return NULL;
 }
 
+int is_dirty_and_clean_rmapp(struct kvm *kvm, unsigned long *rmapp)
+{
+	u64 *spte;
+	int dirty = 0;
+
+	if (!shadow_dirty_mask)
+		return 0;
+
+	spte = rmap_next(kvm, rmapp, NULL);
+	while (spte) {
+		if (*spte & PT_DIRTY_MASK) {
+			set_shadow_pte(spte, (*spte &= ~PT_DIRTY_MASK) |
+				       SPTE_DONT_DIRTY);
+			dirty = 1;
+			break;
+		}
+		spte = rmap_next(kvm, rmapp, spte);
+	}
+
+	return dirty;
+}
+
+
 static int rmap_write_protect(struct kvm *kvm, u64 gfn)
 {
 	unsigned long *rmapp;
@@ -1381,11 +1406,17 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
 static int kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp)
 {
 	int ret;
+	int i;
+
 	++kvm->stat.mmu_shadow_zapped;
 	ret = mmu_zap_unsync_children(kvm, sp);
 	kvm_mmu_page_unlink_children(kvm, sp);
 	kvm_mmu_unlink_parents(kvm, sp);
 	kvm_flush_remote_tlbs(kvm);
+	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
+		if (sp->spt[i] & PT_DIRTY_MASK)
+			mark_page_dirty(kvm, sp->gfns[i]);
+	}
 	if (!sp->role.invalid && !sp->role.direct)
 		unaccount_shadowed(kvm, sp->gfn);
 	if (sp->unsync)
@@ -1676,7 +1707,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 	 * whether the guest actually used the pte (in order to detect
 	 * demand paging).
 	 */
-	spte = shadow_base_present_pte | shadow_dirty_mask;
+	spte = shadow_base_present_pte;
+	if (!(spte & SPTE_DONT_DIRTY))
+		spte |= shadow_dirty_mask;
+
 	if (!speculative)
 		spte |= shadow_accessed_mask;
 	if (!dirty)
@@ -1725,8 +1759,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 		}
 	}
 
-	if (pte_access & ACC_WRITE_MASK)
-		mark_page_dirty(vcpu->kvm, gfn);
+	if (!shadow_dirty_mask) {
+		if (pte_access & ACC_WRITE_MASK)
+			mark_page_dirty(vcpu->kvm, gfn);
+	}
 
 set_pte:
 	set_shadow_pte(shadow_pte, spte);
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 37397f6..5b6351e 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2724,6 +2724,11 @@ static u64 svm_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 	return 0;
 }
 
+static int svm_dirty_bit_support(void)
+{
+	return 1;
+}
+
 static struct kvm_x86_ops svm_x86_ops = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -2785,6 +2790,8 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.set_tss_addr = svm_set_tss_addr,
 	.get_tdp_level = get_npt_level,
 	.get_mt_mask = svm_get_mt_mask,
+
+	.dirty_bit_support = svm_dirty_bit_support,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 673bcb3..8903314 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3774,6 +3774,11 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 	return ret;
 }
 
+static int vmx_dirty_bit_support(void)
+{
+	return false;
+}
+
 static struct kvm_x86_ops vmx_x86_ops = {
 	.cpu_has_kvm_support = cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
@@ -3833,6 +3838,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.set_tss_addr = vmx_set_tss_addr,
 	.get_tdp_level = get_ept_level,
 	.get_mt_mask = vmx_get_mt_mask,
+
+	.dirty_bit_support = vmx_dirty_bit_support,
 };
 
 static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 272e2e8..e06387c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1963,6 +1963,20 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
 	return 0;
 }
 
+void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
+{
+	int i;
+
+	spin_lock(&kvm->mmu_lock);
+	for (i = 0; i < memslot->npages; ++i) {
+		if (!test_bit(i, memslot->dirty_bitmap)) {
+			if (is_dirty_and_clean_rmapp(kvm, &memslot->rmap[i]))
+				set_bit(i, memslot->dirty_bitmap);
+		}
+	}
+	spin_unlock(&kvm->mmu_lock);
+}
+
 /*
  * Get (and clear) the dirty memory log for a memory slot.
  */
@@ -1982,10 +1996,14 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 
 	/* If nothing is dirty, don't bother messing with page tables. */
 	if (is_dirty) {
-		spin_lock(&kvm->mmu_lock);
-		kvm_mmu_slot_remove_write_access(kvm, log->slot);
-		spin_unlock(&kvm->mmu_lock);
-		kvm_flush_remote_tlbs(kvm);
+		if (!kvm_x86_ops->dirty_bit_support()) {
+			spin_lock(&kvm->mmu_lock);
+			/*  remove_write_access() flush the tlb */
+			kvm_mmu_slot_remove_write_access(kvm, log->slot);
+			spin_unlock(&kvm->mmu_lock);
+		} else {
+			kvm_flush_remote_tlbs(kvm);
+		}
 		memslot = &kvm->memslots[log->slot];
 		n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
 		memset(memslot->dirty_bitmap, 0, n);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index cdf1279..d1657a3 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -250,6 +250,7 @@ int kvm_dev_ioctl_check_extension(long ext);
 
 int kvm_get_dirty_log(struct kvm *kvm,
 			struct kvm_dirty_log *log, int *is_dirty);
+void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot);
 int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
 				struct kvm_dirty_log *log);
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3046e9c..a876231 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1135,8 +1135,11 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	}
 
 	/* Free page dirty bitmap if unneeded */
-	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
+	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
 		new.dirty_bitmap = NULL;
+		if (old.flags & KVM_MEM_LOG_DIRTY_PAGES)
+			kvm_arch_flush_shadow(kvm);
+	}
 
 	r = -ENOMEM;
 
@@ -1279,6 +1282,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
 	if (!memslot->dirty_bitmap)
 		goto out;
 
+	kvm_arch_get_dirty_log(kvm, memslot);
 	n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
 
 	for (i = 0; !any && i < n/sizeof(long); ++i)
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bit
  2009-06-10 16:23   ` [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bit Izik Eidus
@ 2009-06-10 17:00     ` Izik Eidus
  2009-06-10 20:42     ` Marcelo Tosatti
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 17:00 UTC (permalink / raw)
  To: kvm; +Cc: avi

Izik Eidus wrote:
> +static int vmx_dirty_bit_support(void)
> +{
> +	return false;
> +}
> +
>   


Again, idiotic bug: this should be:
return tdp_enable == false;


...



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bit
  2009-06-10 16:23   ` [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bit Izik Eidus
  2009-06-10 17:00     ` Izik Eidus
@ 2009-06-10 20:42     ` Marcelo Tosatti
  2009-06-10 23:57       ` Izik Eidus
  2009-06-11  8:24     ` Ulrich Drepper
  2009-06-11  9:33     ` Avi Kivity
  3 siblings, 1 reply; 24+ messages in thread
From: Marcelo Tosatti @ 2009-06-10 20:42 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm, avi

On Wed, Jun 10, 2009 at 07:23:25PM +0300, Izik Eidus wrote:
> change the dirty page tracking to work with dirty bity instead of page fault.
> right now the dirty page tracking work with the help of page faults, when we
> want to track a page for being dirty, we write protect it and we mark it dirty
> when we have write page fault, this code move into looking at the dirty bit
> of the spte.
> 
> Signed-off-by: Izik Eidus <ieidus@redhat.com>
> ---
>  arch/ia64/kvm/kvm-ia64.c        |    4 +++
>  arch/powerpc/kvm/powerpc.c      |    4 +++
>  arch/s390/kvm/kvm-s390.c        |    4 +++
>  arch/x86/include/asm/kvm_host.h |    3 ++
>  arch/x86/kvm/mmu.c              |   42 ++++++++++++++++++++++++++++++++++++--
>  arch/x86/kvm/svm.c              |    7 ++++++
>  arch/x86/kvm/vmx.c              |    7 ++++++
>  arch/x86/kvm/x86.c              |   26 ++++++++++++++++++++---
>  include/linux/kvm_host.h        |    1 +
>  virt/kvm/kvm_main.c             |    6 ++++-
>  10 files changed, 96 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
> index 3199221..5914128 100644
> --- a/arch/ia64/kvm/kvm-ia64.c
> +++ b/arch/ia64/kvm/kvm-ia64.c
> @@ -1809,6 +1809,10 @@ void kvm_arch_exit(void)
>  	kvm_vmm_info = NULL;
>  }
>  
> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
> +{
> +}
> +
>  static int kvm_ia64_sync_dirty_log(struct kvm *kvm,
>  		struct kvm_dirty_log *log)
>  {
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 2cf915e..6beb368 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -418,6 +418,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>  	return -ENOTSUPP;
>  }

>  

#ifndef KVM_ARCH_HAVE_DIRTY_LOG
> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
> +{
> +}
> +
#endif

in virt/kvm/main.c


> index c7b0cc2..8a24149 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -527,6 +527,7 @@ struct kvm_x86_ops {
>  	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>  	int (*get_tdp_level)(void);
>  	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
> +	int (*dirty_bit_support)(void);
>  };
>  
>  extern struct kvm_x86_ops *kvm_x86_ops;
> @@ -796,4 +797,6 @@ int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
>  int kvm_age_hva(struct kvm *kvm, unsigned long hva);
>  int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
>  
> +int is_dirty_and_clean_rmapp(struct kvm *kvm, unsigned long *rmapp);
> +
>  #endif /* _ASM_X86_KVM_HOST_H */
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 809cce0..500e0e2 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -140,6 +140,8 @@ module_param(oos_shadow, bool, 0644);
>  #define ACC_USER_MASK    PT_USER_MASK
>  #define ACC_ALL          (ACC_EXEC_MASK | ACC_WRITE_MASK | ACC_USER_MASK)
>  
> +#define SPTE_DONT_DIRTY (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
> +
>  #define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
>  
>  struct kvm_rmap_desc {
> @@ -629,6 +631,29 @@ static u64 *rmap_next(struct kvm *kvm, unsigned long *rmapp, u64 *spte)
>  	return NULL;
>  }
>  
> +int is_dirty_and_clean_rmapp(struct kvm *kvm, unsigned long *rmapp)
> +{
> +	u64 *spte;
> +	int dirty = 0;
> +
> +	if (!shadow_dirty_mask)
> +		return 0;
> +
> +	spte = rmap_next(kvm, rmapp, NULL);
> +	while (spte) {
> +		if (*spte & PT_DIRTY_MASK) {
> +			set_shadow_pte(spte, (*spte &= ~PT_DIRTY_MASK) |
> +				       SPTE_DONT_DIRTY);
> +			dirty = 1;
> +			break;
> +		}
> +		spte = rmap_next(kvm, rmapp, spte);
> +	}
> +
> +	return dirty;
> +}
> +
> +
>  static int rmap_write_protect(struct kvm *kvm, u64 gfn)
>  {
>  	unsigned long *rmapp;
> @@ -1381,11 +1406,17 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
>  static int kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp)
>  {
>  	int ret;
> +	int i;
> +
>  	++kvm->stat.mmu_shadow_zapped;
>  	ret = mmu_zap_unsync_children(kvm, sp);
>  	kvm_mmu_page_unlink_children(kvm, sp);
>  	kvm_mmu_unlink_parents(kvm, sp);
>  	kvm_flush_remote_tlbs(kvm);
> +	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
> +		if (sp->spt[i] & PT_DIRTY_MASK)
> +			mark_page_dirty(kvm, sp->gfns[i]);
> +	}

You probably also need to transfer the dirty bit in other places.

>  	if (!sp->role.invalid && !sp->role.direct)
>  		unaccount_shadowed(kvm, sp->gfn);
>  	if (sp->unsync)
> @@ -1676,7 +1707,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
>  	 * whether the guest actually used the pte (in order to detect
>  	 * demand paging).
>  	 */
> -	spte = shadow_base_present_pte | shadow_dirty_mask;
> +	spte = shadow_base_present_pte;
> +	if (!(spte & SPTE_DONT_DIRTY))
> +		spte |= shadow_dirty_mask;
> +
>  	if (!speculative)
>  		spte |= shadow_accessed_mask;
>  	if (!dirty)
> @@ -1725,8 +1759,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
>  		}
>  	}
>  
> -	if (pte_access & ACC_WRITE_MASK)
> -		mark_page_dirty(vcpu->kvm, gfn);
> +	if (!shadow_dirty_mask) {
> +		if (pte_access & ACC_WRITE_MASK)
> +			mark_page_dirty(vcpu->kvm, gfn);
> +	}

Can you avoid the mark_page_dirty() when SPTE_DONT_DIRTY is set? (Which
would be a good idea; gfn_to_memslot_unaliased() and friends show up high
in profiling.)

>  set_pte:
>  	set_shadow_pte(shadow_pte, spte);
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 37397f6..5b6351e 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -2724,6 +2724,11 @@ static u64 svm_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>  	return 0;
>  }
>  
> +static int svm_dirty_bit_support(void)
> +{
> +	return 1;
> +}
> +
>  static struct kvm_x86_ops svm_x86_ops = {
>  	.cpu_has_kvm_support = has_svm,
>  	.disabled_by_bios = is_disabled,
> @@ -2785,6 +2790,8 @@ static struct kvm_x86_ops svm_x86_ops = {
>  	.set_tss_addr = svm_set_tss_addr,
>  	.get_tdp_level = get_npt_level,
>  	.get_mt_mask = svm_get_mt_mask,
> +
> +	.dirty_bit_support = svm_dirty_bit_support,
>  };
>  
>  static int __init svm_init(void)
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 673bcb3..8903314 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3774,6 +3774,11 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>  	return ret;
>  }
>  
> +static int vmx_dirty_bit_support(void)
> +{
> +	return false;
> +}
> +
>  static struct kvm_x86_ops vmx_x86_ops = {
>  	.cpu_has_kvm_support = cpu_has_kvm_support,
>  	.disabled_by_bios = vmx_disabled_by_bios,
> @@ -3833,6 +3838,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
>  	.set_tss_addr = vmx_set_tss_addr,
>  	.get_tdp_level = get_ept_level,
>  	.get_mt_mask = vmx_get_mt_mask,
> +
> +	.dirty_bit_support = vmx_dirty_bit_support,
>  };
>  
>  static int __init vmx_init(void)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 272e2e8..e06387c 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1963,6 +1963,20 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
>  	return 0;
>  }
>  
> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
> +{
> +	int i;
> +
> +	spin_lock(&kvm->mmu_lock);
> +	for (i = 0; i < memslot->npages; ++i) {
> +		if (!test_bit(i, memslot->dirty_bitmap)) {
> +			if (is_dirty_and_clean_rmapp(kvm, &memslot->rmap[i]))
> +				set_bit(i, memslot->dirty_bitmap);
> +		}
> +	}
> +	spin_unlock(&kvm->mmu_lock);
> +}
> +
>  /*
>   * Get (and clear) the dirty memory log for a memory slot.
>   */
> @@ -1982,10 +1996,14 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
>  
>  	/* If nothing is dirty, don't bother messing with page tables. */
>  	if (is_dirty) {
> -		spin_lock(&kvm->mmu_lock);
> -		kvm_mmu_slot_remove_write_access(kvm, log->slot);
> -		spin_unlock(&kvm->mmu_lock);
> -		kvm_flush_remote_tlbs(kvm);
> +		if (!kvm_x86_ops->dirty_bit_support()) {
> +			spin_lock(&kvm->mmu_lock);
> +			/*  remove_write_access() flush the tlb */
> +			kvm_mmu_slot_remove_write_access(kvm, log->slot);
> +			spin_unlock(&kvm->mmu_lock);
> +		} else {
> +			kvm_flush_remote_tlbs(kvm);
> +		}
>  		memslot = &kvm->memslots[log->slot];
>  		n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
>  		memset(memslot->dirty_bitmap, 0, n);
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index cdf1279..d1657a3 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -250,6 +250,7 @@ int kvm_dev_ioctl_check_extension(long ext);
>  
>  int kvm_get_dirty_log(struct kvm *kvm,
>  			struct kvm_dirty_log *log, int *is_dirty);
> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot);
>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
>  				struct kvm_dirty_log *log);
>  
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 3046e9c..a876231 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1135,8 +1135,11 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  	}
>  
>  	/* Free page dirty bitmap if unneeded */
> -	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
> +	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
>  		new.dirty_bitmap = NULL;
> +		if (old.flags & KVM_MEM_LOG_DIRTY_PAGES)
> +			kvm_arch_flush_shadow(kvm);
> +	}

What's this for?

The idea of making the dirty bit accessible (it can also be used to map
host ptes read-only when the guest fault is read-only, for KSM (*)) is
good. But it would be better to first introduce infrastructure to handle
the dirty bit (making sure the bit is transferred properly), and then
have logging make use of it.

* EPT violations do not transfer fault information to the page fault
handler, but it is available (there's a vm-exit field).

>  	r = -ENOMEM;
>  
> @@ -1279,6 +1282,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
>  	if (!memslot->dirty_bitmap)
>  		goto out;
>  
> +	kvm_arch_get_dirty_log(kvm, memslot);
>  	n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
>  
>  	for (i = 0; !any && i < n/sizeof(long); ++i)
> -- 
> 1.5.6.5
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bit
  2009-06-10 20:42     ` Marcelo Tosatti
@ 2009-06-10 23:57       ` Izik Eidus
  2009-06-10 23:59         ` Izik Eidus
  2009-06-11  1:04         ` Marcelo Tosatti
  0 siblings, 2 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 23:57 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, avi

Marcelo Tosatti wrote:
> On Wed, Jun 10, 2009 at 07:23:25PM +0300, Izik Eidus wrote:
>   
>> change the dirty page tracking to work with dirty bity instead of page fault.
>> right now the dirty page tracking work with the help of page faults, when we
>> want to track a page for being dirty, we write protect it and we mark it dirty
>> when we have write page fault, this code move into looking at the dirty bit
>> of the spte.
>>
>> Signed-off-by: Izik Eidus <ieidus@redhat.com>
>> ---
>>  arch/ia64/kvm/kvm-ia64.c        |    4 +++
>>  arch/powerpc/kvm/powerpc.c      |    4 +++
>>  arch/s390/kvm/kvm-s390.c        |    4 +++
>>  arch/x86/include/asm/kvm_host.h |    3 ++
>>  arch/x86/kvm/mmu.c              |   42 ++++++++++++++++++++++++++++++++++++--
>>  arch/x86/kvm/svm.c              |    7 ++++++
>>  arch/x86/kvm/vmx.c              |    7 ++++++
>>  arch/x86/kvm/x86.c              |   26 ++++++++++++++++++++---
>>  include/linux/kvm_host.h        |    1 +
>>  virt/kvm/kvm_main.c             |    6 ++++-
>>  10 files changed, 96 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
>> index 3199221..5914128 100644
>> --- a/arch/ia64/kvm/kvm-ia64.c
>> +++ b/arch/ia64/kvm/kvm-ia64.c
>> @@ -1809,6 +1809,10 @@ void kvm_arch_exit(void)
>>  	kvm_vmm_info = NULL;
>>  }
>>  
>> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
>> +{
>> +}
>> +
>>  static int kvm_ia64_sync_dirty_log(struct kvm *kvm,
>>  		struct kvm_dirty_log *log)
>>  {
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index 2cf915e..6beb368 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -418,6 +418,10 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>>  	return -ENOTSUPP;
>>  }
>>     
>
>   
>>  
>>     
>
> #ifndef KVM_ARCH_HAVE_DIRTY_LOG
>   
>> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
>> +{
>> +}
>> +
>>     
> #endif
>
> in virt/kvm/main.c
>
>
>   
>> index c7b0cc2..8a24149 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -527,6 +527,7 @@ struct kvm_x86_ops {
>>  	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
>>  	int (*get_tdp_level)(void);
>>  	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
>> +	int (*dirty_bit_support)(void);
>>  };
>>  
>>  extern struct kvm_x86_ops *kvm_x86_ops;
>> @@ -796,4 +797,6 @@ int kvm_unmap_hva(struct kvm *kvm, unsigned long hva);
>>  int kvm_age_hva(struct kvm *kvm, unsigned long hva);
>>  int cpuid_maxphyaddr(struct kvm_vcpu *vcpu);
>>  
>> +int is_dirty_and_clean_rmapp(struct kvm *kvm, unsigned long *rmapp);
>> +
>>  #endif /* _ASM_X86_KVM_HOST_H */
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 809cce0..500e0e2 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -140,6 +140,8 @@ module_param(oos_shadow, bool, 0644);
>>  #define ACC_USER_MASK    PT_USER_MASK
>>  #define ACC_ALL          (ACC_EXEC_MASK | ACC_WRITE_MASK | ACC_USER_MASK)
>>  
>> +#define SPTE_DONT_DIRTY (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
>> +
>>  #define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level)
>>  
>>  struct kvm_rmap_desc {
>> @@ -629,6 +631,29 @@ static u64 *rmap_next(struct kvm *kvm, unsigned long *rmapp, u64 *spte)
>>  	return NULL;
>>  }
>>  
>> +int is_dirty_and_clean_rmapp(struct kvm *kvm, unsigned long *rmapp)
>> +{
>> +	u64 *spte;
>> +	int dirty = 0;
>> +
>> +	if (!shadow_dirty_mask)
>> +		return 0;
>> +
>> +	spte = rmap_next(kvm, rmapp, NULL);
>> +	while (spte) {
>> +		if (*spte & PT_DIRTY_MASK) {
>> +			set_shadow_pte(spte, (*spte &= ~PT_DIRTY_MASK) |
>> +				       SPTE_DONT_DIRTY);
>> +			dirty = 1;
>> +			break;
>> +		}
>> +		spte = rmap_next(kvm, rmapp, spte);
>> +	}
>> +
>> +	return dirty;
>> +}
>> +
>> +
>>  static int rmap_write_protect(struct kvm *kvm, u64 gfn)
>>  {
>>  	unsigned long *rmapp;
>> @@ -1381,11 +1406,17 @@ static int mmu_zap_unsync_children(struct kvm *kvm,
>>  static int kvm_mmu_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp)
>>  {
>>  	int ret;
>> +	int i;
>> +
>>  	++kvm->stat.mmu_shadow_zapped;
>>  	ret = mmu_zap_unsync_children(kvm, sp);
>>  	kvm_mmu_page_unlink_children(kvm, sp);
>>  	kvm_mmu_unlink_parents(kvm, sp);
>>  	kvm_flush_remote_tlbs(kvm);
>> +	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
>> +		if (sp->spt[i] & PT_DIRTY_MASK)
>> +			mark_page_dirty(kvm, sp->gfns[i]);
>> +	}
>>     
>
> Also need to transfer dirty bit in other places probably.
>   


Yes, I can think of some other cases, but maybe I can avoid them using
some trick.


>   
>>  	if (!sp->role.invalid && !sp->role.direct)
>>  		unaccount_shadowed(kvm, sp->gfn);
>>  	if (sp->unsync)
>> @@ -1676,7 +1707,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
>>  	 * whether the guest actually used the pte (in order to detect
>>  	 * demand paging).
>>  	 */
>> -	spte = shadow_base_present_pte | shadow_dirty_mask;
>> +	spte = shadow_base_present_pte;
>> +	if (!(spte & SPTE_DONT_DIRTY))
>> +		spte |= shadow_dirty_mask;
>> +
>>  	if (!speculative)
>>  		spte |= shadow_accessed_mask;
>>  	if (!dirty)
>> @@ -1725,8 +1759,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
>>  		}
>>  	}
>>  
>> -	if (pte_access & ACC_WRITE_MASK)
>> -		mark_page_dirty(vcpu->kvm, gfn);
>> +	if (!shadow_dirty_mask) {
>> +		if (pte_access & ACC_WRITE_MASK)
>> +			mark_page_dirty(vcpu->kvm, gfn);
>> +	}
>>     
>
> You can avoid the mark_page_dirty if SPTE_DONT_DIRTY? (which is a good idea,
> gfn_to_memslot_unaliased and friends show up high in profiling).
>   

This code shouldn't run on anything but EPT; shadow_dirty_mask should be
set to zero only on EPT without dirty bit tracking, so we still need to
mark the page dirty from this context...
>   
>>  set_pte:
>>  	set_shadow_pte(shadow_pte, spte);
>> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
>> index 37397f6..5b6351e 100644
>> --- a/arch/x86/kvm/svm.c
>> +++ b/arch/x86/kvm/svm.c
>> @@ -2724,6 +2724,11 @@ static u64 svm_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>>  	return 0;
>>  }
>>  
>> +static int svm_dirty_bit_support(void)
>> +{
>> +	return 1;
>> +}
>> +
>>  static struct kvm_x86_ops svm_x86_ops = {
>>  	.cpu_has_kvm_support = has_svm,
>>  	.disabled_by_bios = is_disabled,
>> @@ -2785,6 +2790,8 @@ static struct kvm_x86_ops svm_x86_ops = {
>>  	.set_tss_addr = svm_set_tss_addr,
>>  	.get_tdp_level = get_npt_level,
>>  	.get_mt_mask = svm_get_mt_mask,
>> +
>> +	.dirty_bit_support = svm_dirty_bit_support,
>>  };
>>  
>>  static int __init svm_init(void)
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 673bcb3..8903314 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3774,6 +3774,11 @@ static u64 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>>  	return ret;
>>  }
>>  
>> +static int vmx_dirty_bit_support(void)
>> +{
>> +	return false;
>> +}
>> +
>>  static struct kvm_x86_ops vmx_x86_ops = {
>>  	.cpu_has_kvm_support = cpu_has_kvm_support,
>>  	.disabled_by_bios = vmx_disabled_by_bios,
>> @@ -3833,6 +3838,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
>>  	.set_tss_addr = vmx_set_tss_addr,
>>  	.get_tdp_level = get_ept_level,
>>  	.get_mt_mask = vmx_get_mt_mask,
>> +
>> +	.dirty_bit_support = vmx_dirty_bit_support,
>>  };
>>  
>>  static int __init vmx_init(void)
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 272e2e8..e06387c 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1963,6 +1963,20 @@ static int kvm_vm_ioctl_reinject(struct kvm *kvm,
>>  	return 0;
>>  }
>>  
>> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
>> +{
>> +	int i;
>> +
>> +	spin_lock(&kvm->mmu_lock);
>> +	for (i = 0; i < memslot->npages; ++i) {
>> +		if (!test_bit(i, memslot->dirty_bitmap)) {
>> +			if (is_dirty_and_clean_rmapp(kvm, &memslot->rmap[i]))
>> +				set_bit(i, memslot->dirty_bitmap);
>> +		}
>> +	}
>> +	spin_unlock(&kvm->mmu_lock);
>> +}
>> +
>>  /*
>>   * Get (and clear) the dirty memory log for a memory slot.
>>   */
>> @@ -1982,10 +1996,14 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
>>  
>>  	/* If nothing is dirty, don't bother messing with page tables. */
>>  	if (is_dirty) {
>> -		spin_lock(&kvm->mmu_lock);
>> -		kvm_mmu_slot_remove_write_access(kvm, log->slot);
>> -		spin_unlock(&kvm->mmu_lock);
>> -		kvm_flush_remote_tlbs(kvm);
>> +		if (!kvm_x86_ops->dirty_bit_support()) {
>> +			spin_lock(&kvm->mmu_lock);
>> +			/*  remove_write_access() flushes the tlb */
>> +			kvm_mmu_slot_remove_write_access(kvm, log->slot);
>> +			spin_unlock(&kvm->mmu_lock);
>> +		} else {
>> +			kvm_flush_remote_tlbs(kvm);
>> +		}
>>  		memslot = &kvm->memslots[log->slot];
>>  		n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
>>  		memset(memslot->dirty_bitmap, 0, n);
>> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
>> index cdf1279..d1657a3 100644
>> --- a/include/linux/kvm_host.h
>> +++ b/include/linux/kvm_host.h
>> @@ -250,6 +250,7 @@ int kvm_dev_ioctl_check_extension(long ext);
>>  
>>  int kvm_get_dirty_log(struct kvm *kvm,
>>  			struct kvm_dirty_log *log, int *is_dirty);
>> +void kvm_arch_get_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot);
>>  int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
>>  				struct kvm_dirty_log *log);
>>  
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 3046e9c..a876231 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -1135,8 +1135,11 @@ int __kvm_set_memory_region(struct kvm *kvm,
>>  	}
>>  
>>  	/* Free page dirty bitmap if unneeded */
>> -	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
>> +	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
>>  		new.dirty_bitmap = NULL;
>> +		if (old.flags & KVM_MEM_LOG_DIRTY_PAGES)
>> +			kvm_arch_flush_shadow(kvm);
>> +	}
>>     
>
> Whats this for?
>   

We have added all this SPTE_DONT_DIRTY...; when we stop dirty bit 
tracking, we want to continue setting the dirty bit for the spte inside 
set_spte(), so writing to the page will be faster...

> The idea of making dirty bit accessible (also can use it to map host
> ptes read-only, when guest fault is read-only, for KSM (*)) is good. But
> better first introduce infrastructure to handle dirty bit (making sure
> the bit is transferred properly), then have logging make use of it.
>   

??? I don't understand this much; do you mean you want to keep going in 
the direction of this patch, or do you want some other way?

> * EPT violations do not transfer fault information to the page fault
> handler, but it's available (there's a vm-exit field).
>   

EPT in this patch runs as before (using page faults with 
mark_page_dirty() inside set_spte()).

>   
>>  	r = -ENOMEM;
>>  
>> @@ -1279,6 +1282,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
>>  	if (!memslot->dirty_bitmap)
>>  		goto out;
>>  
>> +	kvm_arch_get_dirty_log(kvm, memslot);
>>  	n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
>>  
>>  	for (i = 0; !any && i < n/sizeof(long); ++i)
>> -- 
>> 1.5.6.5
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>     


Thanks :).

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-10 23:57       ` Izik Eidus
@ 2009-06-10 23:59         ` Izik Eidus
  2009-06-11  1:04         ` Marcelo Tosatti
  1 sibling, 0 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 23:59 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, avi

Izik Eidus wrote:
> Marcelo Tosatti wrote:
>
>>>
>>>  
>>>      /* Free page dirty bitmap if unneeded */
>>> -    if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
>>> +    if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
>>>          new.dirty_bitmap = NULL;
>>> +        if (old.flags & KVM_MEM_LOG_DIRTY_PAGES)
>>> +            kvm_arch_flush_shadow(kvm);
>>> +    }
>>>     
>>
>> Whats this for?
>>   
>
> We have added all this SPTE_DONT_DIRTY...; when we stop dirty bit 
> tracking, we want to continue setting the dirty bit for the spte 
> inside set_spte(), so writing to the page will be faster...

Another way would be to do something like kvm_arch_clean_dont_dirty(); 
that might be better than flushing the whole shadow page tables.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-10 23:57       ` Izik Eidus
  2009-06-10 23:59         ` Izik Eidus
@ 2009-06-11  1:04         ` Marcelo Tosatti
  2009-06-11 11:27           ` Izik Eidus
  1 sibling, 1 reply; 24+ messages in thread
From: Marcelo Tosatti @ 2009-06-11  1:04 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm, avi

On Thu, Jun 11, 2009 at 02:57:02AM +0300, Izik Eidus wrote:
>> You can avoid the mark_page_dirty if SPTE_DONT_DIRTY? (which is a good idea,
>> gfn_to_memslot_unaliased and friends show up high in profiling).
>>   
>
> This code shouldn't run on anything but EPT; shadow_dirty_mask should be  
> set to zero only on EPT, which doesn't have dirty bit tracking, so we still  
> need to mark the page dirty from this context...

Right, I misunderstood; you were not skipping it for the
kvm_x86_ops->have_dirty_bit() case.

>>> --- a/virt/kvm/kvm_main.c
>>> +++ b/virt/kvm/kvm_main.c
>>> @@ -1135,8 +1135,11 @@ int __kvm_set_memory_region(struct kvm *kvm,
>>>  	}
>>>   	/* Free page dirty bitmap if unneeded */
>>> -	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES))
>>> +	if (!(new.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
>>>  		new.dirty_bitmap = NULL;
>>> +		if (old.flags & KVM_MEM_LOG_DIRTY_PAGES)
>>> +			kvm_arch_flush_shadow(kvm);
>>> +	}
>>>     
>>
>> Whats this for?
>>   
>
> We have added all this SPTE_DONT_DIRTY...; when we stop dirty bit  
> tracking, we want to continue setting the dirty bit for the spte inside  
> set_spte(), so writing to the page will be faster...

Right... so instead of using kvm_arch_flush_shadow make a
"kvm_arch_stop_dirty_logging(kvm, slot)", and clear the SPTE_DONT_DIRTY
bits?

>
>> The idea of making dirty bit accessible (also can use it to map host
>> ptes read-only, when guest fault is read-only, for KSM (*)) is good. But
>> better first introduce infrastructure to handle dirty bit (making sure
>> the bit is transferred properly), then have logging make use of it.
>>   
>
> ??? I don't understand this much; do you mean you want to keep going in  
> the direction of this patch, or do you want some other way?

What I'm saying is that with shadow and NPT (I believe) you can mark an
spte writable but not dirty, which gives you the ability to know whether
certain pages have been dirtied.

Should be OK to do that with shadow, as long as the gpte is already
dirty, right? So:

- guest create writable pte
- guest write fault on that pte
- vmexit
- write dirty bit to gpte
- create writable spte
-> 

from this point on, you can create a writable spte which is not dirtied, 
and use the dirty bit to know whether a page has been written to?

And unless I'm mistaken, NPT gives you that for free, since the hw updates
the spte with accessed/dirty bits.

Maybe this is a stupid idea. 

>> * EPT violations do not transfer fault information to the page fault
>> handler, but it's available (there's a vm-exit field).
>>   
>
> EPT in this patch runs as before (using page faults with  
> mark_page_dirty() inside set_spte()).

Right, I was talking about gup(write=0) when the guest fault is read-only,
to relieve KSM of write-protecting host ptes, but that is quite different.

>>   
>>>  	r = -ENOMEM;
>>>  @@ -1279,6 +1282,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
>>>  	if (!memslot->dirty_bitmap)
>>>  		goto out;
>>>  +	kvm_arch_get_dirty_log(kvm, memslot);
>>>  	n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
>>>   	for (i = 0; !any && i < n/sizeof(long); ++i)
>>> -- 
>>> 1.5.6.5
>>>
>
>
> Thanks :).

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-10 16:23   ` [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity Izik Eidus
  2009-06-10 17:00     ` Izik Eidus
  2009-06-10 20:42     ` Marcelo Tosatti
@ 2009-06-11  8:24     ` Ulrich Drepper
  2009-06-11  9:44       ` Izik Eidus
  2009-06-11  9:33     ` Avi Kivity
  3 siblings, 1 reply; 24+ messages in thread
From: Ulrich Drepper @ 2009-06-11  8:24 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm, avi

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Izik Eidus wrote:
> +		if (!kvm_x86_ops->dirty_bit_support()) {
> +			spin_lock(&kvm->mmu_lock);
> +			/*  remove_write_access() flushes the tlb */
> +			kvm_mmu_slot_remove_write_access(kvm, log->slot);
> +			spin_unlock(&kvm->mmu_lock);
> +		} else {
> +			kvm_flush_remote_tlbs(kvm);

It might not correspond to the common style, but I think a callback
function ->dirty_bit_support is overkill.  This is a function pointer
the compiler cannot see through.  Hence it's an indirect function call.
 But the implementation is always a simple yes/no (it seems).  Indirect
calls are rather expensive (most of the time they cannot be predicted
right).

Why not instead have a read-only data constant and an inline function
that tests that value?  It means no function call and only one data
access.


Also, you're inconsistent in the use of integers and true/false in the
implementations of this function.  Either use 0/1 or false/true.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)

iEYEARECAAYFAkowv08ACgkQ2ijCOnn/RHR71ACdH3xr3XPnCLgsMMwdTawfehEN
vs4An2DlErhU6SeanSYVIyP3eLB4sjsz
=UZ32
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-10 16:23   ` [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity Izik Eidus
                       ` (2 preceding siblings ...)
  2009-06-11  8:24     ` Ulrich Drepper
@ 2009-06-11  9:33     ` Avi Kivity
  2009-06-11  9:48       ` Izik Eidus
  3 siblings, 1 reply; 24+ messages in thread
From: Avi Kivity @ 2009-06-11  9:33 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm

Izik Eidus wrote:
> change the dirty page tracking to work with dirty bity instead of page fault.
> right now the dirty page tracking work with the help of page faults, when we
> want to track a page for being dirty, we write protect it and we mark it dirty
> when we have write page fault, this code move into looking at the dirty bit
> of the spte.
>
>   

I'm concerned about performance during the later stages of live 
migration.  Even if only 1000 pages are dirty, you still have to look at 
2,000,000 or more ptes (for an 8GB guest).  That's a lot of overhead.

I think we need to use the page table hierarchy, write protect the upper 
page table so we know which page tables we need to look at.

> +int is_dirty_and_clean_rmapp(struct kvm *kvm, unsigned long *rmapp)
> +{
> +	u64 *spte;
> +	int dirty = 0;
> +
> +	if (!shadow_dirty_mask)
> +		return 0;
> +
> +	spte = rmap_next(kvm, rmapp, NULL);
> +	while (spte) {
> +		if (*spte & PT_DIRTY_MASK) {
> +			set_shadow_pte(spte, (*spte &= ~PT_DIRTY_MASK) |
> +				       SPTE_DONT_DIRTY);
>   

Keep using shadow_dirty_mask here for consistency.

>  	kvm_flush_remote_tlbs(kvm);
> +	for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
> +		if (sp->spt[i] & PT_DIRTY_MASK)
> +			mark_page_dirty(kvm, sp->gfns[i]);
> +	}
>   

shadow_dirty_mask.

> @@ -2785,6 +2790,8 @@ static struct kvm_x86_ops svm_x86_ops = {
>  	.set_tss_addr = svm_set_tss_addr,
>  	.get_tdp_level = get_npt_level,
>  	.get_mt_mask = svm_get_mt_mask,
> +
> +	.dirty_bit_support = svm_dirty_bit_support,
>  };
>   

Just use shadow_dirty_mask != 0.

>  
> +static int vmx_dirty_bit_support(void)
> +{
> +	return false;
> +}
>   

It's false only when ept is enabled.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-11  8:24     ` Ulrich Drepper
@ 2009-06-11  9:44       ` Izik Eidus
  0 siblings, 0 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-11  9:44 UTC (permalink / raw)
  To: Ulrich Drepper; +Cc: kvm, avi

Ulrich Drepper wrote:
> Izik Eidus wrote:
>   
>> +		if (!kvm_x86_ops->dirty_bit_support()) {
>> +			spin_lock(&kvm->mmu_lock);
>> +			/*  remove_write_access() flushes the tlb */
>> +			kvm_mmu_slot_remove_write_access(kvm, log->slot);
>> +			spin_unlock(&kvm->mmu_lock);
>> +		} else {
>> +			kvm_flush_remote_tlbs(kvm);
>>     
>
>   

Hi.

> It might not correspond to the common style, but I think a callback
> function ->dirty_bit_support is overkill.  This is a function pointer
> the compiler cannot see through.  Hence it's an indirect function call.
>  But the implementation is always a simple yes/no (it seems).  Indirect
> calls are rather expensive (most of the time they cannot be predicted
> right).
>   

This function pointer will be called once per ioctl that fetches the dirty 
bit log, so I don't think it is a big issue (normally 30 times a second).

> Why not instead have a read-only data constants and have an inline
> function test that value?  It means no function call and only one data
> access.
>   

It may be relevant, but I'm not sure this optimization is needed for this 
patch, considering how rarely ->dirty_bit_support() will be called.

>
> Also, you're inconsistent in the use of integers and true/false in the
> implementations of this function.  Either use 0/1 or false/true.
>   

I will fix it, thanks.



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-11  9:33     ` Avi Kivity
@ 2009-06-11  9:48       ` Izik Eidus
  2009-06-11  9:53         ` Avi Kivity
  0 siblings, 1 reply; 24+ messages in thread
From: Izik Eidus @ 2009-06-11  9:48 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm

Avi Kivity wrote:
> Izik Eidus wrote:
>> change the dirty page tracking to work with dirty bity instead of 
>> page fault.
>> right now the dirty page tracking work with the help of page faults, 
>> when we
>> want to track a page for being dirty, we write protect it and we mark 
>> it dirty
>> when we have write page fault, this code move into looking at the 
>> dirty bit
>> of the spte.
>>
>>   
>
> I'm concerned about performance during the later stages of live 
> migration.  Even if only 1000 pages are dirty, you still have to look 
> at 2,000,000 or more ptes (for an 8GB guest).  That's a lot of overhead.
>
> I think we need to use the page table hierarchy, write protect the 
> upper page table so we know which page tables we need to look at.
>
>

Great idea, so should I add another bitmap for the page directory?
>>  
>> +static int vmx_dirty_bit_support(void)
>> +{
>> +    return false;
>> +}
>>   
>
> It's false only when ept is enabled.
>

Yeah, I found that out already....


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-11  9:48       ` Izik Eidus
@ 2009-06-11  9:53         ` Avi Kivity
  0 siblings, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2009-06-11  9:53 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm

Izik Eidus wrote:
> Avi Kivity wrote:
>> Izik Eidus wrote:
>>> change the dirty page tracking to work with dirty bity instead of 
>>> page fault.
>>> right now the dirty page tracking work with the help of page faults, 
>>> when we
>>> want to track a page for being dirty, we write protect it and we 
>>> mark it dirty
>>> when we have write page fault, this code move into looking at the 
>>> dirty bit
>>> of the spte.
>>>
>>>   
>>
>> I'm concerned about performance during the later stages of live 
>> migration.  Even if only 1000 pages are dirty, you still have to look 
>> at 2,000,000 or more ptes (for an 8GB guest).  That's a lot of overhead.
>>
>> I think we need to use the page table hierarchy, write protect the 
>> upper page table so we know which page tables we need to look at.
>>
>>
>
> Great idea, so i add another bitmap for the page directory?

No, why?

You need to drop write access to the shadow root ptes.  When you get a 
fault, restore write access to the root ptes, but drop access from the 
L3 ptes, and so on until you reach the L1 ptes.  There you clear the 
dirty bits, and add the page to a list of pages that need to be checked 
for dirty bits.  This way you only check ptes that have a chance to be 
dirty.

I'm not sure that will be faster, but there's a good chance.



-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-11  1:04         ` Marcelo Tosatti
@ 2009-06-11 11:27           ` Izik Eidus
  2009-06-11 12:24             ` Marcelo Tosatti
  0 siblings, 1 reply; 24+ messages in thread
From: Izik Eidus @ 2009-06-11 11:27 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, avi

Marcelo Tosatti wrote:
>
>
> What I'm saying is that with shadow and NPT (I believe) you can mark an
> spte writable but not dirty, which gives you the ability to know whether
> certain pages have been dirtied.
>   

Isn't this what this patch is doing?

> Should be OK to do that with shadow, as long as the gpte is already
> dirty, right? So:
>
> - guest create writable pte
> - guest write fault on that pte
> - vmexit
> - write dirty bit to gpte
> - create writable spte
> -> 
>
> from this point on, you can create a writable spte which is not dirtied, 
> and use the dirty bit to know whether a page has been written to?
>
> And unless i'm mistaken NPT gives you that for free since the hw updates
> the spte with accessed/dirty bits.
>   

Yes

> Maybe this is a stupid idea. 
>   

I'm not sure whether I'm missing something, or whether this is what I 
tried to do with this patch.

>   
>>> * EPT violations do not transfer fault information to the page fault
>>> handler, but it's available (there's a vm-exit field).
>>>   
>>>       
>> EPT in this patch runs as before (using page faults with  
>> mark_page_dirty() inside set_spte()).
>>     
>
> Right, i was talking about gup(write=0) when guest fault is read-only to
> relief KSM of wrprotecting host ptes, but that is quite different.
>
>   
>>>   
>>>       
>>>>  	r = -ENOMEM;
>>>>  @@ -1279,6 +1282,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
>>>>  	if (!memslot->dirty_bitmap)
>>>>  		goto out;
>>>>  +	kvm_arch_get_dirty_log(kvm, memslot);
>>>>  	n = ALIGN(memslot->npages, BITS_PER_LONG) / 8;
>>>>   	for (i = 0; !any && i < n/sizeof(long); ++i)
>>>> -- 
>>>> 1.5.6.5
>>>>
>> Thanks :).
>>     


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-11 11:27           ` Izik Eidus
@ 2009-06-11 12:24             ` Marcelo Tosatti
  2009-06-11 15:49               ` Izik Eidus
  0 siblings, 1 reply; 24+ messages in thread
From: Marcelo Tosatti @ 2009-06-11 12:24 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm, avi

On Thu, Jun 11, 2009 at 02:27:46PM +0300, Izik Eidus wrote:
> Marcelo Tosatti wrote:
>>
>>
>> What i'm saying is with shadow and NPT (i believe) you can mark a spte
>> writable but not dirty, which gives you the ability to know whether
>> certain pages have been dirtied.
>>   
>
> Isnt this what this patch is doing?

Yes, I was confused for some reason I don't remember.

So making the dirty bit available to the host is a good idea, but we would
have to check things like faults on out-of-sync pagetables (where the
guest dirty bit might be cleared in parallel; maybe it's OK, but I'm not
sure), verify that the transfer of the dirty bit when zapping is
consistent everywhere, etc.

So it would be nicer to first introduce an optimization to the way dirty
bit info is acquired, and then use that to optimize kvm's dirty log ioctl.

The link with KSM was that you can consult this dirty info, which is
fast, to know if the content of pages has changed. But it may be useless,
I don't know.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity
  2009-06-11 12:24             ` Marcelo Tosatti
@ 2009-06-11 15:49               ` Izik Eidus
  0 siblings, 0 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-11 15:49 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, avi

Marcelo Tosatti wrote:
> On Thu, Jun 11, 2009 at 02:27:46PM +0300, Izik Eidus wrote:
>   
>> Marcelo Tosatti wrote:
>>     
>>> What i'm saying is with shadow and NPT (i believe) you can mark a spte
>>> writable but not dirty, which gives you the ability to know whether
>>> certain pages have been dirtied.
>>>   
>>>       
>> Isnt this what this patch is doing?
>>     
>
> Yes, was confused for some reason i don't remember.
>
> So making the dirty bit available to the host is a good idea, but we would
> have to check things like faults on out-of-sync pagetables (where the
> guest dirty bit might be cleared in parallel; maybe it's OK, but I'm not
> sure), verify that the transfer of the dirty bit when zapping is
> consistent everywhere, etc.
>   

I am looking at the host sptes, not the guest emulated page tables; when 
the guest cleans its dirty bit, we don't clean it from the spte, or am I 
wrong?
When we zap the spte, I transfer the dirty bit into the bitmap in the 
SPTE_DONT_DIRTY case...


But Avi pointed out that in the case of live migration it might be 
overkill to walk all the sptes in the system, so maybe we want to take a 
page fault from the page directory (which would be write protected, or if 
it can't be write protected, at least not present), and then scan only the 
sptes that are inside those "write protected" page tables...

Do you have some ideas for this case?


> So it would be nicer to first introduce an optimization to the way dirty
> bit info is acquired, and then use that to optimize kvm's dirty log ioctl.
>
> The link with KSM was that you can consult this dirty info, which is
> fast, to know if the content of pages has changed. But it may be useless,
> I don't know.
>
>   
It sure can help for KSM, as we sometimes calculate a jhash of the page to 
know if the page content was changed; but I'm afraid that cleaning the 
dirty bit, which will need a tlb flush, plus taking the mmu_lock to walk 
the sptes, may save KSM some cpu but might hurt kvm; we will have to 
check this...



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages
  2009-06-10 16:23 ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Izik Eidus
  2009-06-10 16:23   ` [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity Izik Eidus
@ 2009-06-14 11:10   ` Avi Kivity
  1 sibling, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2009-06-14 11:10 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm

Izik Eidus wrote:
> When a slot is already allocated and is asked to be tracked, we need to
> break the large pages.
>
> This code flushes the mmu when someone asks a slot to start dirty bit
> tracking.
>
>   

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2)
  2009-06-10 16:23 [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2) Izik Eidus
  2009-06-10 16:23 ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Izik Eidus
@ 2009-06-15 14:58 ` Ryan Harper
  2009-06-15 15:19   ` Avi Kivity
  1 sibling, 1 reply; 24+ messages in thread
From: Ryan Harper @ 2009-06-15 14:58 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm, avi

* Izik Eidus <ieidus@redhat.com> [2009-06-10 11:25]:
> RFC move to dirty bit tracking using the page table dirty bit (v2)
> 

Where is this series at?  Did you want me to test one of the ept dirty
tracking patches for that hugepage+ept+local migration bug?

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh@us.ibm.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2)
  2009-06-15 14:58 ` [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2) Ryan Harper
@ 2009-06-15 15:19   ` Avi Kivity
  2009-07-07  3:32     ` Ryan Harper
  0 siblings, 1 reply; 24+ messages in thread
From: Avi Kivity @ 2009-06-15 15:19 UTC (permalink / raw)
  To: Ryan Harper; +Cc: Izik Eidus, kvm

On 06/15/2009 05:58 PM, Ryan Harper wrote:
> * Izik Eidus<ieidus@redhat.com>  [2009-06-10 11:25]:
>    
>> RFC move to dirty bit tracking using the page table dirty bit (v2)
>>
>>      
>
> Where is this series at?  Did you want me to test one of the ept dirty
> tracking patches for that hugepage+ept+local migration bug?
>    

The first patch (which might cure your oops) is already in qemu-kvm.git; 
please test it out.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2)
  2009-06-15 15:19   ` Avi Kivity
@ 2009-07-07  3:32     ` Ryan Harper
  2009-07-07  5:11       ` Avi Kivity
  0 siblings, 1 reply; 24+ messages in thread
From: Ryan Harper @ 2009-07-07  3:32 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Ryan Harper, Izik Eidus, kvm

* Avi Kivity <avi@redhat.com> [2009-06-15 10:19]:
> On 06/15/2009 05:58 PM, Ryan Harper wrote:
> >* Izik Eidus<ieidus@redhat.com>  [2009-06-10 11:25]:
> >   
> >>RFC move to dirty bit tracking using the page table dirty bit (v2)
> >>
> >>     
> >
> >Where is this series at?  Did you want me to test one of the ept dirty
> >tracking patches for that hugepage+ept+local migration bug?
> >   
> 
> The first patch (which might cure your oops) is already in qemu-kvm.git, 
> please test it out.

Sorry for taking forever on this... current upstream kvm modules from
kvm-kmod.git are working fine for this.  I believe the final patch that
was committed:

commit e244584fe3a5c20deddeca246548ac86dbc6e1d1
Author: Izik Eidus <ieidus@redhat.com>
Date:   Wed Jun 10 19:23:24 2009 +0300

    KVM: Fix dirty bit tracking for slots with large pages


did the trick.  Should this get pulled back into any of the stable
trees?


-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh@us.ibm.com

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2)
  2009-07-07  3:32     ` Ryan Harper
@ 2009-07-07  5:11       ` Avi Kivity
  0 siblings, 0 replies; 24+ messages in thread
From: Avi Kivity @ 2009-07-07  5:11 UTC (permalink / raw)
  To: Ryan Harper; +Cc: Izik Eidus, kvm

On 07/07/2009 06:32 AM, Ryan Harper wrote:
> * Avi Kivity<avi@redhat.com>  [2009-06-15 10:19]:
>    
>> On 06/15/2009 05:58 PM, Ryan Harper wrote:
>>      
>>> * Izik Eidus<ieidus@redhat.com>   [2009-06-10 11:25]:
>>>
>>>        
>>>> RFC move to dirty bit tracking using the page table dirty bit (v2)
>>>>
>>>>
>>>>          
>>> Where is this series at?  Did you want me to test one of the ept dirty
>>> tracking patches for that hugepage+ept+local migration bug?
>>>
>>>        
>> The first patch (which might cure your oops) is already in qemu-kvm.git,
>> please test it out.
>>      
>
> Sorry for taking forever on this... current upstream kvm modules from
> kvm-kmod.git are working fine for this.  I believe the final patch that
> was committed:
>
> commit e244584fe3a5c20deddeca246548ac86dbc6e1d1
> Author: Izik Eidus<ieidus@redhat.com>
> Date:   Wed Jun 10 19:23:24 2009 +0300
>
>      KVM: Fix dirty bit tracking for slots with large pages
>
>
> did the trick.  Should this get pulled back into any of the stable
> trees?
>
>    

It's in 2.6.30.1 and hopefully on its way to the other trees.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages
  2009-06-10 11:54   ` Avi Kivity
@ 2009-06-10 12:06     ` Izik Eidus
  0 siblings, 0 replies; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 12:06 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Marcelo Tosatti, Ryan Harper

Avi Kivity wrote:
> Izik Eidus wrote:
>> When a slot is already allocated and is asked to be tracked, we need 
>> to break the large pages.
>>
>> This code flushes the mmu when someone asks a slot to start dirty bit 
>> tracking.
>>
>> Signed-off-by: Izik Eidus <ieidus@redhat.com>
>> ---
>>  virt/kvm/kvm_main.c |    2 ++
>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 669eb4a..4a60c72 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -1160,6 +1160,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
>>              new.userspace_addr = mem->userspace_addr;
>>          else
>>              new.userspace_addr = 0;
>> +
>> +        kvm_arch_flush_shadow(kvm);
>>      }
>>      if (npages && !new.lpage_info) {
>>          largepages = 1 + (base_gfn + npages - 1) / KVM_PAGES_PER_HPAGE;
>>   
>
> Ryan, can you try this out with your large page migration failures?
>
Wait, I think it is in the wrong place; I am sending a second series :(

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages
  2009-06-10 11:51 ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Izik Eidus
@ 2009-06-10 11:54   ` Avi Kivity
  2009-06-10 12:06     ` Izik Eidus
  0 siblings, 1 reply; 24+ messages in thread
From: Avi Kivity @ 2009-06-10 11:54 UTC (permalink / raw)
  To: Izik Eidus; +Cc: kvm, Marcelo Tosatti, Ryan Harper

Izik Eidus wrote:
> When a slot is already allocated and is asked to be tracked, we need to
> break the large pages.
>
> This code flushes the mmu when someone asks a slot to start dirty bit
> tracking.
>
> Signed-off-by: Izik Eidus <ieidus@redhat.com>
> ---
>  virt/kvm/kvm_main.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 669eb4a..4a60c72 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1160,6 +1160,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  			new.userspace_addr = mem->userspace_addr;
>  		else
>  			new.userspace_addr = 0;
> +
> +		kvm_arch_flush_shadow(kvm);
>  	}
>  	if (npages && !new.lpage_info) {
>  		largepages = 1 + (base_gfn + npages - 1) / KVM_PAGES_PER_HPAGE;
>   

Ryan, can you try this out with your large page migration failures?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages
  2009-06-10 11:51 [PATCH 0/2] *** SUBJECT HERE *** Izik Eidus
@ 2009-06-10 11:51 ` Izik Eidus
  2009-06-10 11:54   ` Avi Kivity
  0 siblings, 1 reply; 24+ messages in thread
From: Izik Eidus @ 2009-06-10 11:51 UTC (permalink / raw)
  To: kvm; +Cc: avi, Izik Eidus

When a slot is already allocated and dirty tracking is requested, we need to break up the
large pages.

This code flushes the MMU when someone asks a slot to start dirty bit tracking.

Signed-off-by: Izik Eidus <ieidus@redhat.com>
---
 virt/kvm/kvm_main.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 669eb4a..4a60c72 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1160,6 +1160,8 @@ int __kvm_set_memory_region(struct kvm *kvm,
 			new.userspace_addr = mem->userspace_addr;
 		else
 			new.userspace_addr = 0;
+
+		kvm_arch_flush_shadow(kvm);
 	}
 	if (npages && !new.lpage_info) {
 		largepages = 1 + (base_gfn + npages - 1) / KVM_PAGES_PER_HPAGE;
-- 
1.5.6.5


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2009-07-07  5:10 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-06-10 16:23 [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2) Izik Eidus
2009-06-10 16:23 ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Izik Eidus
2009-06-10 16:23   ` [PATCH 2/2] kvm: change the dirty page tracking to work with dirty bity Izik Eidus
2009-06-10 17:00     ` Izik Eidus
2009-06-10 20:42     ` Marcelo Tosatti
2009-06-10 23:57       ` Izik Eidus
2009-06-10 23:59         ` Izik Eidus
2009-06-11  1:04         ` Marcelo Tosatti
2009-06-11 11:27           ` Izik Eidus
2009-06-11 12:24             ` Marcelo Tosatti
2009-06-11 15:49               ` Izik Eidus
2009-06-11  8:24     ` Ulrich Drepper
2009-06-11  9:44       ` Izik Eidus
2009-06-11  9:33     ` Avi Kivity
2009-06-11  9:48       ` Izik Eidus
2009-06-11  9:53         ` Avi Kivity
2009-06-14 11:10   ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Avi Kivity
2009-06-15 14:58 ` [PATCH 0/2] RFC use dirty bit for page dirty tracking (v2) Ryan Harper
2009-06-15 15:19   ` Avi Kivity
2009-07-07  3:32     ` Ryan Harper
2009-07-07  5:11       ` Avi Kivity
  -- strict thread matches above, loose matches on Subject: below --
2009-06-10 11:51 [PATCH 0/2] *** SUBJECT HERE *** Izik Eidus
2009-06-10 11:51 ` [PATCH 1/2] kvm: fix dirty bit tracking for slots with large pages Izik Eidus
2009-06-10 11:54   ` Avi Kivity
2009-06-10 12:06     ` Izik Eidus
