[PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot

* [PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot
@ 2013-03-12  8:43 Takuya Yoshikawa
  2013-03-12  8:44 ` [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte Takuya Yoshikawa
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Takuya Yoshikawa @ 2013-03-12  8:43 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm

This is only for mmio spte zapping, not for all zap_all() cases.

Takuya Yoshikawa (2):
  KVM: MMU: Mark sp mmio cached when creating mmio spte
  KVM: x86: Optimize mmio spte zapping when creating/moving memslot

 arch/x86/include/asm/kvm_host.h |    2 ++
 arch/x86/kvm/mmu.c              |   21 +++++++++++++++++++++
 arch/x86/kvm/x86.c              |    2 +-
 3 files changed, 24 insertions(+), 1 deletions(-)

-- 
1.7.5.4



* [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-12  8:43 [PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot Takuya Yoshikawa
@ 2013-03-12  8:44 ` Takuya Yoshikawa
  2013-03-13  5:06   ` Xiao Guangrong
  2013-03-12  8:45 ` [PATCH 2/2] KVM: x86: Optimize mmio spte zapping when creating/moving memslot Takuya Yoshikawa
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: Takuya Yoshikawa @ 2013-03-12  8:44 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm

This will be used not to zap unrelated mmu pages when creating/moving
a memory slot later.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |    3 +++
 2 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 635a74d..b84310a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -230,6 +230,7 @@ struct kvm_mmu_page {
 #endif
 
 	int write_flooding_count;
+	bool mmio_cached;
 };
 
 struct kvm_pio_request {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index fdacabb..de45ec1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -199,8 +199,11 @@ EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);
 
 static void mark_mmio_spte(u64 *sptep, u64 gfn, unsigned access)
 {
+	struct kvm_mmu_page *sp =  page_header(__pa(sptep));
+
 	access &= ACC_WRITE_MASK | ACC_USER_MASK;
 
+	sp->mmio_cached = true;
 	trace_mark_mmio_spte(sptep, gfn, access);
 	mmu_spte_set(sptep, shadow_mmio_mask | access | gfn << PAGE_SHIFT);
 }
-- 
1.7.5.4
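
Editorial sketch (not part of the patch): mark_mmio_spte() in the diff above
packs a marker mask, the access bits and the gfn into a single 64-bit spte.
The userspace model below illustrates that packing and how it can be undone.
All toy_* names are hypothetical, and the value of TOY_MMIO_MASK and the bit
positions of the access masks are assumptions chosen for illustration; the
real shadow_mmio_mask is configured via kvm_mmu_set_mmio_spte_mask().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT      12
#define TOY_ACC_WRITE_MASK  (1u << 1)     /* assumed bit position */
#define TOY_ACC_USER_MASK   (1u << 2)     /* assumed bit position */
#define TOY_MMIO_MASK       (3ULL << 62)  /* hypothetical marker bits */

/* Build an mmio spte: marker | filtered access bits | gfn << PAGE_SHIFT. */
static uint64_t toy_make_mmio_spte(uint64_t gfn, unsigned access)
{
	/* The shifted gfn must not collide with the marker bits. */
	assert(((gfn << TOY_PAGE_SHIFT) & TOY_MMIO_MASK) == 0);

	/* Only write and user permissions are kept, as in the patch context. */
	access &= TOY_ACC_WRITE_MASK | TOY_ACC_USER_MASK;
	return TOY_MMIO_MASK | access | (gfn << TOY_PAGE_SHIFT);
}

/* An spte is an mmio spte when all marker bits are set. */
static bool toy_is_mmio_spte(uint64_t spte)
{
	return (spte & TOY_MMIO_MASK) == TOY_MMIO_MASK;
}

/* Recover the gfn by stripping the marker and shifting back. */
static uint64_t toy_mmio_spte_gfn(uint64_t spte)
{
	return (spte & ~TOY_MMIO_MASK) >> TOY_PAGE_SHIFT;
}

/* Recover the access bits from the low PAGE_SHIFT bits. */
static unsigned toy_mmio_spte_access(uint64_t spte)
{
	return (unsigned)(spte & ((1ULL << TOY_PAGE_SHIFT) - 1));
}
```

Calling toy_make_mmio_spte(0xfee00, 7) and decoding the result gives back gfn
0xfee00 with only the write and user bits left in the access field.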



* [PATCH 2/2] KVM: x86: Optimize mmio spte zapping when creating/moving memslot
  2013-03-12  8:43 [PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot Takuya Yoshikawa
  2013-03-12  8:44 ` [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte Takuya Yoshikawa
@ 2013-03-12  8:45 ` Takuya Yoshikawa
  2013-03-12 12:06   ` Gleb Natapov
  2013-03-13  1:41 ` [PATCH 0/2] KVM: " Marcelo Tosatti
  2013-03-14  8:23 ` Gleb Natapov
  3 siblings, 1 reply; 21+ messages in thread
From: Takuya Yoshikawa @ 2013-03-12  8:45 UTC (permalink / raw)
  To: mtosatti, gleb; +Cc: kvm

When we create or move a memory slot, we need to zap mmio sptes.
Currently, zap_all() is used for this and this is causing two problems:
 - extra page faults after zapping mmu pages
 - long mmu_lock hold time during zapping mmu pages

For the latter, Marcelo reported a disastrous mmu_lock hold time during
hot-plug, which made the guest unresponsive for a long time.

This patch takes a simple way to fix these problems: do not zap mmu
pages unless they are marked mmio cached.  On our test box, this took
only 50us for the 4GB guest and we did not see ms of mmu_lock hold time
any more.

Note that we still need to do zap_all() for other cases.  So another
work is also needed: Xiao's work may be the one.

Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |   18 ++++++++++++++++++
 arch/x86/kvm/x86.c              |    2 +-
 3 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b84310a..028b03f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -768,6 +768,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
 				     struct kvm_memory_slot *slot,
 				     gfn_t gfn_offset, unsigned long mask);
 void kvm_mmu_zap_all(struct kvm *kvm);
+void kvm_mmu_zap_mmio_sptes(struct kvm *kvm);
 unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
 void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index de45ec1..c1a9b7b 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4189,6 +4189,24 @@ restart:
 	spin_unlock(&kvm->mmu_lock);
 }
 
+void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
+{
+	struct kvm_mmu_page *sp, *node;
+	LIST_HEAD(invalid_list);
+
+	spin_lock(&kvm->mmu_lock);
+restart:
+	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
+		if (!sp->mmio_cached)
+			continue;
+		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
+			goto restart;
+	}
+
+	kvm_mmu_commit_zap_page(kvm, &invalid_list);
+	spin_unlock(&kvm->mmu_lock);
+}
+
 static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 {
 	struct kvm *kvm;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 35b4912..16b6df2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6969,7 +6969,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 * mmio sptes.
 	 */
 	if ((change == KVM_MR_CREATE) || (change == KVM_MR_MOVE)) {
-		kvm_mmu_zap_all(kvm);
+		kvm_mmu_zap_mmio_sptes(kvm);
 		kvm_reload_remote_mmus(kvm);
 	}
 }
-- 
1.7.5.4
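
Editorial sketch (not part of the patch): the two patches together amount to
"set a flag when a page gains an mmio spte, then zap only flagged pages".
The userspace model below shows that filter next to a zap-everything pass.
struct toy_sp and the toy_* functions are hypothetical stand-ins for
struct kvm_mmu_page, mark_mmio_spte(), kvm_mmu_zap_all() and
kvm_mmu_zap_mmio_sptes(); locking, the invalid_list and the "goto restart"
handling of the real code are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_mmu_page. */
struct toy_sp {
	bool mmio_cached;  /* set when an mmio spte is created in this page */
	bool zapped;       /* set once the page has been zapped */
};

/* Models the flag update added to mark_mmio_spte(). */
static void toy_mark_mmio(struct toy_sp *sp)
{
	assert(sp != NULL);
	sp->mmio_cached = true;
}

/* Models a zap-all pass: every live page is zapped. */
static size_t toy_zap_all(struct toy_sp *pages, size_t n)
{
	size_t zapped = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].zapped)
			continue;
		pages[i].zapped = true;
		zapped++;
	}
	return zapped;
}

/* Models the mmio-only pass: pages without the flag are skipped. */
static size_t toy_zap_mmio(struct toy_sp *pages, size_t n)
{
	size_t zapped = 0;

	for (size_t i = 0; i < n; i++) {
		if (pages[i].zapped || !pages[i].mmio_cached)
			continue;
		pages[i].zapped = true;
		zapped++;
	}
	return zapped;
}
```

With eight pages of which two are marked, toy_zap_mmio() zaps two pages and
leaves the other six mapped, which is the source of both savings named in the
commit message: fewer refaults afterwards and less work under the lock.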



* Re: [PATCH 2/2] KVM: x86: Optimize mmio spte zapping when creating/moving memslot
  2013-03-12  8:45 ` [PATCH 2/2] KVM: x86: Optimize mmio spte zapping when creating/moving memslot Takuya Yoshikawa
@ 2013-03-12 12:06   ` Gleb Natapov
  2013-03-13  1:40     ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-03-12 12:06 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: mtosatti, kvm

On Tue, Mar 12, 2013 at 05:45:30PM +0900, Takuya Yoshikawa wrote:
> When we create or move a memory slot, we need to zap mmio sptes.
> Currently, zap_all() is used for this and this is causing two problems:
>  - extra page faults after zapping mmu pages
>  - long mmu_lock hold time during zapping mmu pages
> 
> For the latter, Marcelo reported a disastrous mmu_lock hold time during
> hot-plug, which made the guest unresponsive for a long time.
> 
> This patch takes a simple way to fix these problems: do not zap mmu
> pages unless they are marked mmio cached.  On our test box, this took
> only 50us for the 4GB guest and we did not see ms of mmu_lock hold time
> any more.
> 
> Note that we still need to do zap_all() for other cases.  So another
> work is also needed: Xiao's work may be the one.
> 
> Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> ---
>  arch/x86/include/asm/kvm_host.h |    1 +
>  arch/x86/kvm/mmu.c              |   18 ++++++++++++++++++
>  arch/x86/kvm/x86.c              |    2 +-
>  3 files changed, 20 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index b84310a..028b03f 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -768,6 +768,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
>  				     struct kvm_memory_slot *slot,
>  				     gfn_t gfn_offset, unsigned long mask);
>  void kvm_mmu_zap_all(struct kvm *kvm);
> +void kvm_mmu_zap_mmio_sptes(struct kvm *kvm);
>  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
>  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
>  
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index de45ec1..c1a9b7b 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4189,6 +4189,24 @@ restart:
>  	spin_unlock(&kvm->mmu_lock);
>  }
>  
> +void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
> +{
> +	struct kvm_mmu_page *sp, *node;
> +	LIST_HEAD(invalid_list);
> +
> +	spin_lock(&kvm->mmu_lock);
> +restart:
> +	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
> +		if (!sp->mmio_cached)
> +			continue;
> +		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> +			goto restart;
> +	}
> +
> +	kvm_mmu_commit_zap_page(kvm, &invalid_list);
> +	spin_unlock(&kvm->mmu_lock);
> +}
> +
>  static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	struct kvm *kvm;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 35b4912..16b6df2 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6969,7 +6969,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
>  	 * mmio sptes.
>  	 */
>  	if ((change == KVM_MR_CREATE) || (change == KVM_MR_MOVE)) {
I wonder why check for KVM_MR_MOVE here. For KVM_MR_MOVE
kvm_mmu_zap_all() should be called and it is indeed called by the common code.

> -		kvm_mmu_zap_all(kvm);
> +		kvm_mmu_zap_mmio_sptes(kvm);
>  		kvm_reload_remote_mmus(kvm);
>  	}
>  }
> -- 
> 1.7.5.4

--
			Gleb.


* Re: [PATCH 2/2] KVM: x86: Optimize mmio spte zapping when creating/moving memslot
  2013-03-12 12:06   ` Gleb Natapov
@ 2013-03-13  1:40     ` Marcelo Tosatti
  0 siblings, 0 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2013-03-13  1:40 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Takuya Yoshikawa, kvm

On Tue, Mar 12, 2013 at 02:06:22PM +0200, Gleb Natapov wrote:
> On Tue, Mar 12, 2013 at 05:45:30PM +0900, Takuya Yoshikawa wrote:
> > When we create or move a memory slot, we need to zap mmio sptes.
> > Currently, zap_all() is used for this and this is causing two problems:
> >  - extra page faults after zapping mmu pages
> >  - long mmu_lock hold time during zapping mmu pages
> > 
> > For the latter, Marcelo reported a disastrous mmu_lock hold time during
> > hot-plug, which made the guest unresponsive for a long time.
> > 
> > This patch takes a simple way to fix these problems: do not zap mmu
> > pages unless they are marked mmio cached.  On our test box, this took
> > only 50us for the 4GB guest and we did not see ms of mmu_lock hold time
> > any more.
> > 
> > Note that we still need to do zap_all() for other cases.  So another
> > work is also needed: Xiao's work may be the one.
> > 
> > Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
> > ---
> >  arch/x86/include/asm/kvm_host.h |    1 +
> >  arch/x86/kvm/mmu.c              |   18 ++++++++++++++++++
> >  arch/x86/kvm/x86.c              |    2 +-
> >  3 files changed, 20 insertions(+), 1 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index b84310a..028b03f 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -768,6 +768,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
> >  				     struct kvm_memory_slot *slot,
> >  				     gfn_t gfn_offset, unsigned long mask);
> >  void kvm_mmu_zap_all(struct kvm *kvm);
> > +void kvm_mmu_zap_mmio_sptes(struct kvm *kvm);
> >  unsigned int kvm_mmu_calculate_mmu_pages(struct kvm *kvm);
> >  void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int kvm_nr_mmu_pages);
> >  
> > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > index de45ec1..c1a9b7b 100644
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -4189,6 +4189,24 @@ restart:
> >  	spin_unlock(&kvm->mmu_lock);
> >  }
> >  
> > +void kvm_mmu_zap_mmio_sptes(struct kvm *kvm)
> > +{
> > +	struct kvm_mmu_page *sp, *node;
> > +	LIST_HEAD(invalid_list);
> > +
> > +	spin_lock(&kvm->mmu_lock);
> > +restart:
> > +	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link) {
> > +		if (!sp->mmio_cached)
> > +			continue;
> > +		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> > +			goto restart;
> > +	}
> > +
> > +	kvm_mmu_commit_zap_page(kvm, &invalid_list);
> > +	spin_unlock(&kvm->mmu_lock);
> > +}
> > +
> >  static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
> >  {
> >  	struct kvm *kvm;
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 35b4912..16b6df2 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -6969,7 +6969,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
> >  	 * mmio sptes.
> >  	 */
> >  	if ((change == KVM_MR_CREATE) || (change == KVM_MR_MOVE)) {
> I wonder why check for KVM_MR_MOVE here. For KVM_MR_MOVE
> kvm_mmu_zap_all() should be called and it is indeed called by the common code.

It's per memslot; the common code flushes via:

kvm_arch_flush_shadow_memslot(kvm, slot);



* Re: [PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot
  2013-03-12  8:43 [PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot Takuya Yoshikawa
  2013-03-12  8:44 ` [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte Takuya Yoshikawa
  2013-03-12  8:45 ` [PATCH 2/2] KVM: x86: Optimize mmio spte zapping when creating/moving memslot Takuya Yoshikawa
@ 2013-03-13  1:41 ` Marcelo Tosatti
  2013-03-14  8:23 ` Gleb Natapov
  3 siblings, 0 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2013-03-13  1:41 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: gleb, kvm

On Tue, Mar 12, 2013 at 05:43:33PM +0900, Takuya Yoshikawa wrote:
> This is only for mmio spte zapping, not for all zap_all() cases.
> 
> Takuya Yoshikawa (2):
>   KVM: MMU: Mark sp mmio cached when creating mmio spte
>   KVM: x86: Optimize mmio spte zapping when creating/moving memslot
> 
>  arch/x86/include/asm/kvm_host.h |    2 ++
>  arch/x86/kvm/mmu.c              |   21 +++++++++++++++++++++
>  arch/x86/kvm/x86.c              |    2 +-
>  3 files changed, 24 insertions(+), 1 deletions(-)
> 

Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>



* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-12  8:44 ` [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte Takuya Yoshikawa
@ 2013-03-13  5:06   ` Xiao Guangrong
  2013-03-13  7:28     ` Takuya Yoshikawa
  0 siblings, 1 reply; 21+ messages in thread
From: Xiao Guangrong @ 2013-03-13  5:06 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: mtosatti, gleb, kvm

On 03/12/2013 04:44 PM, Takuya Yoshikawa wrote:
> This will be used not to zap unrelated mmu pages when creating/moving
> a memory slot later.

How about save all mmio spte into a mmio-rmap?

The good things are:
- instead walking all shadow page, we can only walk the rmap
- Comparing to zap a shadow page, it does not need to flush TLB after
  zapping mmio sptes
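
Editorial sketch (not from the thread): one way to picture the mmio-rmap idea
is a separate list of pointers to every mmio spte, so that zapping walks only
that list instead of all shadow pages. The fixed-size array, the toy_* names
and TOY_RMAP_CAP are assumptions made for illustration; no such structure
exists in the posted patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RMAP_CAP 64  /* hypothetical capacity for the sketch */

/* Hypothetical global list of pointers to all current mmio sptes. */
struct toy_mmio_rmap {
	uint64_t *sptes[TOY_RMAP_CAP];
	size_t nr;
};

/* Record a newly created mmio spte; returns -1 when the list is full. */
static int toy_rmap_add(struct toy_mmio_rmap *r, uint64_t *sptep)
{
	assert(r != NULL && sptep != NULL);
	if (r->nr == TOY_RMAP_CAP)
		return -1;
	r->sptes[r->nr++] = sptep;
	return 0;
}

/*
 * Forget one spte, e.g. when its shadow page is zapped for another reason.
 * This is a linear search; the last entry is moved into the freed hole.
 */
static int toy_rmap_remove(struct toy_mmio_rmap *r, uint64_t *sptep)
{
	for (size_t i = 0; i < r->nr; i++) {
		if (r->sptes[i] == sptep) {
			r->sptes[i] = r->sptes[--r->nr];
			return 0;
		}
	}
	return -1;
}

/* Clear every recorded spte and empty the list; returns how many were hit. */
static size_t toy_rmap_zap_all(struct toy_mmio_rmap *r)
{
	size_t n = r->nr;

	for (size_t i = 0; i < n; i++)
		*r->sptes[i] = 0;
	r->nr = 0;
	return n;
}
```

In this model the zap touches only the recorded entries, while every other
path that drops an spte has to call toy_rmap_remove() to keep the list valid.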



* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-13  5:06   ` Xiao Guangrong
@ 2013-03-13  7:28     ` Takuya Yoshikawa
  2013-03-13  7:42       ` Xiao Guangrong
  0 siblings, 1 reply; 21+ messages in thread
From: Takuya Yoshikawa @ 2013-03-13  7:28 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: mtosatti, gleb, kvm

On Wed, 13 Mar 2013 13:06:23 +0800
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:

> On 03/12/2013 04:44 PM, Takuya Yoshikawa wrote:
> > This will be used not to zap unrelated mmu pages when creating/moving
> > a memory slot later.
> 
> How about save all mmio spte into a mmio-rmap?

The problem is that other mmu code would need to care about the pointers
stored in the new rmap list: when mmu_shrink zaps shadow pages for example.

Maybe worth thinking about, but I want to have a simple, back-portable patch
for distributors, as a first step: note that creating a memory slot can happen
many times for some guest configurations since QEMU is doing strange things
for re-mapping some regions IIRC.

> 
> The good things are:
> - instead walking all shadow page, we can only walk the rmap

Traversing the active list does not take such a long time compared to
other things to do for zapping pages: us, not ms order.  But I'm now
preparing for an additional work to avoid "goto restart" after deleting
entries.  That will at least help us not to traverse more than once.

> - Comparing to zap a shadow page, it does not need to flush TLB after
>   zapping mmio sptes

If we check each spte in the sp, we can achieve the similar goal:
similar to the old remove_write_access() code.  I implemented such
code but have not seen a clear improvement yet.  Pros and cons will
be there.
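
Editorial sketch (not the code Takuya mentions having implemented): checking
each spte in a shadow page means scanning all of its entries and clearing only
those that carry the mmio marker, leaving the page itself in place. The toy_*
names are hypothetical and TOY_MMIO_MASK is an assumed marker value; 512 is
the number of 64-bit entries in a 4 KiB page-table page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SPTES_PER_SP 512
#define TOY_MMIO_MASK    (3ULL << 62)  /* hypothetical marker bits */

/* An spte counts as mmio when all marker bits are set. */
static bool toy_is_mmio_spte(uint64_t spte)
{
	return (spte & TOY_MMIO_MASK) == TOY_MMIO_MASK;
}

/*
 * Scan one shadow page table and clear only its mmio sptes.
 * Returns the number of entries cleared; other mappings are untouched.
 */
static unsigned toy_clear_mmio_sptes(uint64_t *spt)
{
	unsigned cleared = 0;

	assert(spt != NULL);
	for (size_t i = 0; i < TOY_SPTES_PER_SP; i++) {
		if (!toy_is_mmio_spte(spt[i]))
			continue;
		spt[i] = 0;
		cleared++;
	}
	return cleared;
}
```

The loop always visits all 512 slots of a flagged page, however few of them
are mmio sptes, which is the per-page cost being weighed in this exchange.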

Thanks,
	Takuya


* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-13  7:28     ` Takuya Yoshikawa
@ 2013-03-13  7:42       ` Xiao Guangrong
  2013-03-13 12:33         ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Xiao Guangrong @ 2013-03-13  7:42 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: mtosatti, gleb, kvm

On 03/13/2013 03:28 PM, Takuya Yoshikawa wrote:
> On Wed, 13 Mar 2013 13:06:23 +0800
> Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
> 
>> On 03/12/2013 04:44 PM, Takuya Yoshikawa wrote:
>>> This will be used not to zap unrelated mmu pages when creating/moving
>>> a memory slot later.
>>
>> How about save all mmio spte into a mmio-rmap?
> 
> The problem is that other mmu code would need to care about the pointers
> stored in the new rmap list: when mmu_shrink zaps shadow pages for example.

It is not hard... all the codes have been wrapped by *zap_spte*.

> 
> Maybe worth thinking about, but I want to have a simple, back-portable patch
> for distributors, as a first step: note that creating a memory slot can happen
> many times for some guest configurations since QEMU is doing strange things
> for re-mapping some regions IIRC.

Hmm, that means memslots also need to be deleted frequently; this patch cannot
help much in the deletion case.

> 
>>
>> The good things are:
>> - instead walking all shadow page, we can only walk the rmap
> 
> Traversing the active list does not take such a long time compared to
> other things to do for zapping pages: us, not ms order.  But I'm now

The cost of walking shadow pages depends on how much memory the guest uses...

> preparing for an additional work to avoid "goto restart" after deleting
> entries.  That will at least help us not to traverse more than once.

If we drop the walking, you need not care about the "goto" stuff anymore...

> 
>> - Comparing to zap a shadow page, it does not need to flush TLB after
>>   zapping mmio sptes
> 
> If we check each spte in the sp, we can achieve the similar goal:
> similar to the old remove_write_access() code.  I implemented such
> code but have not seen a clear improvement yet.  Pros and cons will
> be there.

Checking every entry (512) in the shadow page is bad...

> 
> Thanks,
> 	Takuya
> 
> 
> 



* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-13  7:42       ` Xiao Guangrong
@ 2013-03-13 12:33         ` Gleb Natapov
  2013-03-13 12:42           ` Xiao Guangrong
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2013-03-13 12:33 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Takuya Yoshikawa, mtosatti, kvm

On Wed, Mar 13, 2013 at 03:42:18PM +0800, Xiao Guangrong wrote:
> On 03/13/2013 03:28 PM, Takuya Yoshikawa wrote:
> > On Wed, 13 Mar 2013 13:06:23 +0800
> > Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
> > 
> >> On 03/12/2013 04:44 PM, Takuya Yoshikawa wrote:
> >>> This will be used not to zap unrelated mmu pages when creating/moving
> >>> a memory slot later.
> >>
> >> How about save all mmio spte into a mmio-rmap?
> > 
> > The problem is that other mmu code would need to care about the pointers
> > stored in the new rmap list: when mmu_shrink zaps shadow pages for example.
> 
> It is not hard... all the codes have been wrapped by *zap_spte*.
> 
So are you going to send a patch? What do you think about applying this
as temporary solution?

--
			Gleb.


* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-13 12:33         ` Gleb Natapov
@ 2013-03-13 12:42           ` Xiao Guangrong
  2013-03-13 13:40             ` Takuya Yoshikawa
  0 siblings, 1 reply; 21+ messages in thread
From: Xiao Guangrong @ 2013-03-13 12:42 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Takuya Yoshikawa, mtosatti, kvm

On 03/13/2013 08:33 PM, Gleb Natapov wrote:
> On Wed, Mar 13, 2013 at 03:42:18PM +0800, Xiao Guangrong wrote:
>> On 03/13/2013 03:28 PM, Takuya Yoshikawa wrote:
>>> On Wed, 13 Mar 2013 13:06:23 +0800
>>> Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
>>>
>>>> On 03/12/2013 04:44 PM, Takuya Yoshikawa wrote:
>>>>> This will be used not to zap unrelated mmu pages when creating/moving
>>>>> a memory slot later.
>>>>
>>>> How about save all mmio spte into a mmio-rmap?
>>>
>>> The problem is that other mmu code would need to care about the pointers
>>> stored in the new rmap list: when mmu_shrink zaps shadow pages for example.
>>
>> It is not hard... all the codes have been wrapped by *zap_spte*.
>>
> So are you going to send a patch? What do you think about applying this
> as temporary solution?

Hi Gleb,

Since it only needs small change based on this patch, I think we can directly
apply the rmap-based way.

Takuya, could you please do this? ;)



* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-13 12:42           ` Xiao Guangrong
@ 2013-03-13 13:40             ` Takuya Yoshikawa
  2013-03-13 14:05               ` Xiao Guangrong
  0 siblings, 1 reply; 21+ messages in thread
From: Takuya Yoshikawa @ 2013-03-13 13:40 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Gleb Natapov, Takuya Yoshikawa, mtosatti, kvm

On Wed, 13 Mar 2013 20:42:41 +0800
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:

> >>>> How about save all mmio spte into a mmio-rmap?
> >>>
> >>> The problem is that other mmu code would need to care about the pointers
> >>> stored in the new rmap list: when mmu_shrink zaps shadow pages for example.
> >>
> >> It is not hard... all the codes have been wrapped by *zap_spte*.
> >>
> > So are you going to send a patch? What do you think about applying this
> > as temporary solution?
> 
> Hi Gleb,
> 
> Since it only needs small change based on this patch, I think we can directly
> apply the rmap-based way.
> 
> Takuya, could you please do this? ;)

Though I'm fine with my making the patch better, I'm still thinking
about the bad side of it, though.

In zap_spte, don't we need to search the pointer to be removed from the
global mmio-rmap list?  How long can that list be?

Implementing it will/may not be difficult but I'm not sure if we would
get pure improvement.  Unless it becomes 99% sure, I think we should
first take a basic approach.

What do you think?

Thanks,
	Takuya


* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-13 13:40             ` Takuya Yoshikawa
@ 2013-03-13 14:05               ` Xiao Guangrong
  2013-03-14  1:58                 ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Xiao Guangrong @ 2013-03-13 14:05 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: Gleb Natapov, Takuya Yoshikawa, mtosatti, kvm

On 03/13/2013 09:40 PM, Takuya Yoshikawa wrote:
> On Wed, 13 Mar 2013 20:42:41 +0800
> Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
> 
>>>>>> How about save all mmio spte into a mmio-rmap?
>>>>>
>>>>> The problem is that other mmu code would need to care about the pointers
>>>>> stored in the new rmap list: when mmu_shrink zaps shadow pages for example.
>>>>
>>>> It is not hard... all the codes have been wrapped by *zap_spte*.
>>>>
>>> So are you going to send a patch? What do you think about applying this
>>> as temporary solution?
>>
>> Hi Gleb,
>>
>> Since it only needs small change based on this patch, I think we can directly
>> apply the rmap-based way.
>>
>> Takuya, could you please do this? ;)
> 
> Though I'm fine with my making the patch better, I'm still thinking
> about the bad side of it, though.
> 
> In zap_spte, don't we need to search the pointer to be removed from the
> global mmio-rmap list?  How long can that list be?

It is not bad. On softmmu, the rmap list has already been long more than 300.
On hardmmu, normally the mmio spte is not frequently zapped (just set not clear).

The worst case is zap-all-mmio-spte that removes all mmio-spte. This operation
can be speed up after applying my previous patch:
KVM: MMU: fast drop all spte on the pte_list

> 
> Implementing it will/may not be difficult but I'm not sure if we would
> get pure improvement.  Unless it becomes 99% sure, I think we should
> first take a basic approach.

I definitely sure zapping all mmio-sptes is fast than zapping mmio shadow
pages. ;)

> 
> What do you think?

I am considering if zap all shadow page is faster enough (after my patchset), do
we really need to care it?



* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-13 14:05               ` Xiao Guangrong
@ 2013-03-14  1:58                 ` Marcelo Tosatti
  2013-03-14  2:26                   ` Takuya Yoshikawa
  2013-03-14  5:13                   ` Xiao Guangrong
  0 siblings, 2 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2013-03-14  1:58 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Takuya Yoshikawa, Gleb Natapov, Takuya Yoshikawa, kvm

On Wed, Mar 13, 2013 at 10:05:20PM +0800, Xiao Guangrong wrote:
> On 03/13/2013 09:40 PM, Takuya Yoshikawa wrote:
> > On Wed, 13 Mar 2013 20:42:41 +0800
> > Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
> > 
> >>>>>> How about save all mmio spte into a mmio-rmap?
> >>>>>
> >>>>> The problem is that other mmu code would need to care about the pointers
> >>>>> stored in the new rmap list: when mmu_shrink zaps shadow pages for example.
> >>>>
> >>>> It is not hard... all the codes have been wrapped by *zap_spte*.
> >>>>
> >>> So are you going to send a patch? What do you think about applying this
> >>> as temporary solution?
> >>
> >> Hi Gleb,
> >>
> >> Since it only needs small change based on this patch, I think we can directly
> >> apply the rmap-based way.
> >>
> >> Takuya, could you please do this? ;)
> > 
> > Though I'm fine with my making the patch better, I'm still thinking
> > about the bad side of it, though.
> > 
> > In zap_spte, don't we need to search the pointer to be removed from the
> > global mmio-rmap list?  How long can that list be?
> 
> It is not bad. On softmmu, the rmap list has already been long more than 300.
> On hardmmu, normally the mmio spte is not frequently zapped (just set not clear).
> 
> The worst case is zap-all-mmio-spte that removes all mmio-spte. This operation
> can be speed up after applying my previous patch:
> KVM: MMU: fast drop all spte on the pte_list
> 
> > 
> > Implementing it will/may not be difficult but I'm not sure if we would
> > get pure improvement.  Unless it becomes 99% sure, I think we should
> > first take a basic approach. 
> 
> I definitely sure zapping all mmio-sptes is fast than zapping mmio shadow
> pages. ;)

With a huge number of shadow pages (think 512GB guest, 262144 pte-level
shadow pages to map), it might be a problem.
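
Editorial note: the 262144 figure follows from simple arithmetic if the whole
guest is mapped with 4 KiB pages, which is an assumption of this sketch. Each
pte-level shadow page holds 512 entries and so covers 512 * 4 KiB = 2 MiB.
The helper below is hypothetical and only reproduces that calculation.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE      4096ULL  /* 4 KiB pages */
#define TOY_PTES_PER_PAGE   512ULL  /* 64-bit entries per page-table page */

/*
 * Number of pte-level shadow pages needed to map guest_bytes of memory,
 * assuming 4 KiB mappings throughout (no large pages).
 */
static uint64_t toy_pte_level_pages(uint64_t guest_bytes)
{
	uint64_t span = TOY_PAGE_SIZE * TOY_PTES_PER_PAGE;  /* 2 MiB per page */

	assert(guest_bytes <= UINT64_MAX - (span - 1));
	return (guest_bytes + span - 1) / span;  /* round up */
}
```

For a 512 GiB guest this gives 2^39 / 2^21 = 2^18 = 262144 pages; for the
4GB guest mentioned in the patch description it gives 2048.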

> > What do you think?
> 
> I am considering if zap all shadow page is faster enough (after my patchset), do
> we really need to care it?

Still needed: your patch reduces kvm_mmu_zap_all() time, but as you can
see with huge memory sized guests 100% improvement over the current
situation will be a bottleneck (and as you noted the deletion case is
still unsolved).

Suppose another improvement angle is to zap only whats necessary for the
given operation (say there is the memslot hint available, but unused for
x86).



* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-14  1:58                 ` Marcelo Tosatti
@ 2013-03-14  2:26                   ` Takuya Yoshikawa
  2013-03-14  2:39                     ` Marcelo Tosatti
  2013-03-14  5:13                   ` Xiao Guangrong
  1 sibling, 1 reply; 21+ messages in thread
From: Takuya Yoshikawa @ 2013-03-14  2:26 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Xiao Guangrong, Takuya Yoshikawa, Gleb Natapov, kvm

On Wed, 13 Mar 2013 22:58:21 -0300
Marcelo Tosatti <mtosatti@redhat.com> wrote:

> > > In zap_spte, don't we need to search the pointer to be removed from the
> > > global mmio-rmap list?  How long can that list be?
> > 
> > It is not bad. On softmmu, the rmap list has already been long more than 300.
> > On hardmmu, normally the mmio spte is not frequently zapped (just set not clear).

mmu_shrink() is an exception.

> > 
> > The worst case is zap-all-mmio-spte that removes all mmio-spte. This operation
> > can be speed up after applying my previous patch:
> > KVM: MMU: fast drop all spte on the pte_list

My point is other code may need to care more about latency.

Zapping all mmio sptes can happen only when changing memory regions:
not so latency severe but should be reasonably fast not to hold
mmu_lock for a (too) long time.

Compared to that, mmu_shrink() may be called any time and adding
more work to it should be avoided IMO.  It should return ASAP.

In general, we should try hard to keep ourselves from affecting
unrelated code path for optimizing something.  The global pte
list is something which can affect many code paths in the future.


So, I'm fine with trying mmio-rmap once we can actually measure
very long mmu_lock hold time by traversing shadow pages.

How about applying this first and then see the effect on big guests?

Thanks,
	Takuya


> > > Implementing it will/may not be difficult but I'm not sure if we would
> > > get pure improvement.  Unless it becomes 99% sure, I think we should
> > > first take a basic approach. 
> > 
> > I definitely sure zapping all mmio-sptes is fast than zapping mmio shadow
> > pages. ;)
> 
> With a huge number of shadow pages (think 512GB guest, 262144 pte-level
> shadow pages to map), it might be a problem.
> 
> > > What do you think?
> > 
> > I am considering if zap all shadow page is faster enough (after my patchset), do
> > we really need to care it?
> 
> Still needed: your patch reduces kvm_mmu_zap_all() time, but as you can
> see with huge memory sized guests 100% improvement over the current
> situation will be a bottleneck (and as you noted the deletion case is
> still unsolved).
> 
> Suppose another improvement angle is to zap only whats necessary for the
> given operation (say there is the memslot hint available, but unused for
> x86).


* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-14  2:26                   ` Takuya Yoshikawa
@ 2013-03-14  2:39                     ` Marcelo Tosatti
  2013-03-14  5:36                       ` Xiao Guangrong
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2013-03-14  2:39 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: Xiao Guangrong, Takuya Yoshikawa, Gleb Natapov, kvm

On Thu, Mar 14, 2013 at 11:26:41AM +0900, Takuya Yoshikawa wrote:
> On Wed, 13 Mar 2013 22:58:21 -0300
> Marcelo Tosatti <mtosatti@redhat.com> wrote:
> 
> > > > In zap_spte, don't we need to search the pointer to be removed from the
> > > > global mmio-rmap list?  How long can that list be?
> > > 
> > > It is not bad. On softmmu, the rmap list has already been long more than 300.
> > > On hardmmu, normally the mmio spte is not frequently zapped (just set not clear).
> 
> mmu_shrink() is an exception.
> 
> > > 
> > > The worst case is zap-all-mmio-spte that removes all mmio-spte. This operation
> > > can be speed up after applying my previous patch:
> > > KVM: MMU: fast drop all spte on the pte_list
> 
> My point is other code may need to care more about latency.
> 
> Zapping all mmio sptes can happen only when changing memory regions:
> not so latency severe but should be reasonably fast not to hold
> mmu_lock for a (too) long time.
> 
> Compared to that, mmu_shrink() may be called any time and adding
> more work to it should be avoided IMO.  It should return ASAP.

Good point.

> In general, we should try hard to keep ourselves from affecting
> unrelated code path for optimizing something.  The global pte
> list is something which can affect many code paths in the future.
> 
> 
> So, I'm fine with trying mmio-rmap once we can actually measure
> very long mmu_lock hold time by traversing shadow pages.
> 
> How about applying this first and then see the effect on big guests?

Works for me. Xiao?



* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-14  1:58                 ` Marcelo Tosatti
  2013-03-14  2:26                   ` Takuya Yoshikawa
@ 2013-03-14  5:13                   ` Xiao Guangrong
  2013-03-14  5:45                     ` Xiao Guangrong
  2013-03-16  2:01                     ` Takuya Yoshikawa
  1 sibling, 2 replies; 21+ messages in thread
From: Xiao Guangrong @ 2013-03-14  5:13 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Takuya Yoshikawa, Gleb Natapov, Takuya Yoshikawa, kvm

On 03/14/2013 09:58 AM, Marcelo Tosatti wrote:
> On Wed, Mar 13, 2013 at 10:05:20PM +0800, Xiao Guangrong wrote:
>> On 03/13/2013 09:40 PM, Takuya Yoshikawa wrote:
>>> On Wed, 13 Mar 2013 20:42:41 +0800
>>> Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
>>>
>>>>>>>> How about save all mmio spte into a mmio-rmap?
>>>>>>>
>>>>>>> The problem is that other mmu code would need to care about the pointers
>>>>>>> stored in the new rmap list: when mmu_shrink zaps shadow pages for example.
>>>>>>
>>>>>> It is not hard... all the codes have been wrapped by *zap_spte*.
>>>>>>
>>>>> So are you going to send a patch? What do you think about applying this
>>>>> as temporary solution?
>>>>
>>>> Hi Gleb,
>>>>
>>>> Since it only needs small change based on this patch, I think we can directly
>>>> apply the rmap-based way.
>>>>
>>>> Takuya, could you please do this? ;)
>>>
>>> Though I'm fine with my making the patch better, I'm still thinking
>>> about the bad side of it, though.
>>>
>>> In zap_spte, don't we need to search the pointer to be removed from the
>>> global mmio-rmap list?  How long can that list be?
>>
>> It is not bad. On softmmu, the rmap list has already been long more than 300.
>> On hardmmu, normally the mmio spte is not frequently zapped (just set not clear).
>>
>> The worst case is zap-all-mmio-spte that removes all mmio-spte. This operation
>> can be speed up after applying my previous patch:
>> KVM: MMU: fast drop all spte on the pte_list
>>
>>>
>>> Implementing it will/may not be difficult but I'm not sure if we would
>>> get pure improvement.  Unless it becomes 99% sure, I think we should
>>> first take a basic approach. 
>>
>> I definitely sure zapping all mmio-sptes is fast than zapping mmio shadow
>> pages. ;)
> 
> With a huge number of shadow pages (think 512GB guest, 262144 pte-level
> shadow pages to map), it might be a problem.

That is one of the reasons why I think zapping mmio shadow pages is not good. ;)

This patch needs to walk all shadow pages to find the mmio shadow pages
and zap them, so its cost depends on how much memory the guest uses (huge
memory means a huge number of shadow pages, as you said).  But the time
needed to zap the mmio sptes is constant, regardless of the memory used.
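
To make the cost difference concrete, here is a minimal sketch in plain C. It is a toy model, not kernel code: `toy_shadow_page`, `toy_zap_mmio_pages()` and `toy_zap_mmio_list()` are hypothetical names, and the "list" is simply an array of indices standing in for an mmio-rmap. Each function returns how many entries it had to visit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; this is not the kernel's struct kvm_mmu_page. */
struct toy_shadow_page {
	bool mmio_cached;	/* set when an mmio spte was created in this page */
	bool zapped;
};

/*
 * Flag-based approach: every shadow page is visited and the flagged
 * ones are zapped.  The number of visits is always n, so the cost
 * grows with the number of shadow pages (and thus with guest memory).
 */
size_t toy_zap_mmio_pages(struct toy_shadow_page *pages, size_t n)
{
	size_t visited = 0;

	for (size_t i = 0; i < n; i++) {
		visited++;
		if (pages[i].mmio_cached)
			pages[i].zapped = true;
	}
	return visited;
}

/*
 * List-based approach: only the recorded mmio entries are visited.
 * mmio_idx is an assumed stand-in for an mmio-rmap; the number of
 * visits depends on n_mmio only, not on the total number of pages.
 */
size_t toy_zap_mmio_list(struct toy_shadow_page *pages,
			 const size_t *mmio_idx, size_t n_mmio)
{
	size_t visited = 0;

	for (size_t i = 0; i < n_mmio; i++) {
		visited++;
		pages[mmio_idx[i]].zapped = true;
	}
	return visited;
}
```

In this model the second function does work proportional to the number of mmio entries only, which is what "constant regardless of the memory used" amounts to when that number is small and fixed.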

> 
>>> What do you think?
>>
>> I am considering if zap all shadow page is faster enough (after my patchset), do
>> we really need to care it?
> 
> Still needed: your patch reduces kvm_mmu_zap_all() time, but as you can
> see with huge memory sized guests 100% improvement over the current
> situation will be a bottleneck (and as you noted the deletion case is
> still unsolved).	

The improvement can be greater if more memory is used. (I only used 2G of memory in
the guest since my test case is a 32-bit program that cannot use huge memory, and
there was no lock contention in my test case.)

Actually, the time complexity of the current kvm_mmu_zap_all is the same as zapping
mmio shadow pages under the mmu-lock (O(n), where n is the number of shadow page tables).
Both of them walk all shadow page tables.  The rest of the work in kvm_mmu_zap is
constant.

And this is a TODO item:
(2): free shadow pages by using generation-number
After that, kvm_mmu_zap no longer needs to walk all shadow pages.

> 
> Suppose another improvement angle is to zap only whats necessary for the
> given operation (say there is the memslot hint available, but unused for
> x86).

Yes, I agree on this point. Zapping all shadow pages makes vcpus fault
on all memory accesses. That is the drawback.
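
As a rough illustration of "zap only what is necessary", the sketch below drops only the entries whose gfn falls inside one slot's range and leaves everything else mapped. It is a toy model in plain C: `toy_spte` and `toy_zap_slot_range()` are hypothetical names, and the linear scan is used only for brevity (a real implementation would presumably reach the entries through per-slot data rather than scanning everything).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy entry: one mapping for one guest frame number. */
struct toy_spte {
	uint64_t gfn;
	bool present;
};

/*
 * Zap only the entries that map gfns in [base_gfn, base_gfn + npages).
 * Entries outside the range stay present, so accesses to unrelated
 * memory would not have to fault again.  Returns the number zapped.
 */
size_t toy_zap_slot_range(struct toy_spte *sptes, size_t n,
			  uint64_t base_gfn, uint64_t npages)
{
	size_t zapped = 0;

	for (size_t i = 0; i < n; i++) {
		if (sptes[i].present &&
		    sptes[i].gfn >= base_gfn &&
		    sptes[i].gfn - base_gfn < npages) {
			sptes[i].present = false;
			zapped++;
		}
	}
	return zapped;
}
```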






* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-14  2:39                     ` Marcelo Tosatti
@ 2013-03-14  5:36                       ` Xiao Guangrong
  0 siblings, 0 replies; 21+ messages in thread
From: Xiao Guangrong @ 2013-03-14  5:36 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Takuya Yoshikawa, Takuya Yoshikawa, Gleb Natapov, kvm

On 03/14/2013 10:39 AM, Marcelo Tosatti wrote:
> On Thu, Mar 14, 2013 at 11:26:41AM +0900, Takuya Yoshikawa wrote:
>> On Wed, 13 Mar 2013 22:58:21 -0300
>> Marcelo Tosatti <mtosatti@redhat.com> wrote:
>>
>>>>> In zap_spte, don't we need to search the pointer to be removed from the
>>>>> global mmio-rmap list?  How long can that list be?
>>>>
>>>> It is not bad. On softmmu, the rmap list has already been long more than 300.
>>>> On hardmmu, normally the mmio spte is not frequently zapped (just set not clear).
>>
>> mmu_shrink() is an exception.
>>
>>>>
>>>> The worst case is zap-all-mmio-spte that removes all mmio-spte. This operation
>>>> can be speed up after applying my previous patch:
>>>> KVM: MMU: fast drop all spte on the pte_list
>>
>> My point is other code may need to care more about latency.
>>
>> Zapping all mmio sptes can happen only when changing memory regions:
>> not so latency severe but should be reasonably fast not to hold
>> mmu_lock for a (too) long time.
>>
>> Compared to that, mmu_shrink() may be called any time and adding
>> more work to it should be avoided IMO.  It should return ASAP.

Hmm? How frequently is mmu_shrink called? Well, it can be heavy sometimes, but
that is not the case during normal running.
How many mmio shadow pages do we have in the system? Not many, especially on
guests that support virtio.

And, if it is a real problem, it is worthwhile to optimize it, since it is
even worse for the normal page rmap on the shadow mmu.

I have an idea to avoid holding the mmu-lock, which I mentioned in the previous mail:
cache a generation-number in the mmio spte. When zapping mmio sptes is needed,
we can simply increase the global generation-number.
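
A minimal sketch of that generation-number idea, as a toy model in plain C rather than kernel code. All names (`toy_mmio_spte`, `toy_mmio_gen`, and the three functions) are hypothetical, and counter wraparound and locking are deliberately ignored here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy mmio spte that remembers when it was created. */
struct toy_mmio_spte {
	uint64_t gfn;
	uint32_t gen;	/* generation cached at creation time */
};

/* Global generation number; bumping it invalidates all older sptes. */
static uint32_t toy_mmio_gen;

/* Create an mmio spte stamped with the current generation. */
struct toy_mmio_spte toy_make_mmio_spte(uint64_t gfn)
{
	struct toy_mmio_spte s = { .gfn = gfn, .gen = toy_mmio_gen };

	return s;
}

/* An spte is usable only if its cached generation is still current. */
bool toy_mmio_spte_valid(const struct toy_mmio_spte *s)
{
	return s->gen == toy_mmio_gen;
}

/* "Zap" every mmio spte at once without visiting any of them. */
void toy_invalidate_all_mmio_sptes(void)
{
	toy_mmio_gen++;
}
```

In this model, stale sptes are not removed eagerly; they are simply detected as invalid the next time they are checked, so the invalidation itself touches no spte at all.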

> 
> Good point.
> 
>> In general, we should try hard to keep ourselves from affecting
>> unrelated code path for optimizing something.  The global pte
>> list is something which can affect many code paths in the future.
>>
>>
>> So, I'm fine with trying mmio-rmap once we can actually measure
>> very long mmu_lock hold time by traversing shadow pages.
>>
>> How about applying this first and then see the effect on big guests?
> 
> Works for me. Xiao?

Marcelo, I won't insist on it. ;)




* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-14  5:13                   ` Xiao Guangrong
@ 2013-03-14  5:45                     ` Xiao Guangrong
  2013-03-16  2:01                     ` Takuya Yoshikawa
  1 sibling, 0 replies; 21+ messages in thread
From: Xiao Guangrong @ 2013-03-14  5:45 UTC (permalink / raw)
  To: Xiao Guangrong
  Cc: Marcelo Tosatti, Takuya Yoshikawa, Gleb Natapov, Takuya Yoshikawa, kvm

On 03/14/2013 01:13 PM, Xiao Guangrong wrote:
> On 03/14/2013 09:58 AM, Marcelo Tosatti wrote:
>> On Wed, Mar 13, 2013 at 10:05:20PM +0800, Xiao Guangrong wrote:
>>> On 03/13/2013 09:40 PM, Takuya Yoshikawa wrote:
>>>> On Wed, 13 Mar 2013 20:42:41 +0800
>>>> Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:
>>>>
>>>>>>>>> How about save all mmio spte into a mmio-rmap?
>>>>>>>>
>>>>>>>> The problem is that other mmu code would need to care about the pointers
>>>>>>>> stored in the new rmap list: when mmu_shrink zaps shadow pages for example.
>>>>>>>
>>>>>>> It is not hard... all the codes have been wrapped by *zap_spte*.
>>>>>>>
>>>>>> So are you going to send a patch? What do you think about applying this
>>>>>> as temporary solution?
>>>>>
>>>>> Hi Gleb,
>>>>>
>>>>> Since it only needs small change based on this patch, I think we can directly
>>>>> apply the rmap-based way.
>>>>>
>>>>> Takuya, could you please do this? ;)
>>>>
>>>> Though I'm fine with my making the patch better, I'm still thinking
>>>> about the bad side of it, though.
>>>>
>>>> In zap_spte, don't we need to search the pointer to be removed from the
>>>> global mmio-rmap list?  How long can that list be?
>>>
>>> It is not bad. On softmmu, the rmap list has already been long more than 300.
>>> On hardmmu, normally the mmio spte is not frequently zapped (just set not clear).
>>>
>>> The worst case is zap-all-mmio-spte that removes all mmio-spte. This operation
>>> can be speed up after applying my previous patch:
>>> KVM: MMU: fast drop all spte on the pte_list
>>>
>>>>
>>>> Implementing it will/may not be difficult but I'm not sure if we would
>>>> get pure improvement.  Unless it becomes 99% sure, I think we should
>>>> first take a basic approach. 
>>>
>>> I definitely sure zapping all mmio-sptes is fast than zapping mmio shadow
>>> pages. ;)
>>
>> With a huge number of shadow pages (think 512GB guest, 262144 pte-level
>> shadow pages to map), it might be a problem.
> 
> That is one of the reasons why i think zap mmio shadow page is not good. ;)
> 
> This patch needs to walk all shadow pages to find all mmio shadow page out
> and zap them, it depends on how much memory is used on guest (huge memory
> causes huge shadow page as you said). But the time of zapping mmio spte is
> constant, no matter of memory used.
> 
>>
>>>> What do you think?
>>>
>>> I am considering if zap all shadow page is faster enough (after my patchset), do
>>> we really need to care it?
>>
>> Still needed: your patch reduces kvm_mmu_zap_all() time, but as you can
>> see with huge memory sized guests 100% improvement over the current
>> situation will be a bottleneck (and as you noted the deletion case is
>> still unsolved).	
> 
> The improvement can be greater if more memory is used. (I only used 2G memory in
> guest since my test case is 32bit program which can not use huge memory, and
> not lock contention in my testcase.)
> 
> Actually, the time complexity of current kvm_mmu_zap_all is the same as zap

                                    ^^^^^
Sorry, not the current way. I meant the optimized way in my patchset.



* Re: [PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot
  2013-03-12  8:43 [PATCH 0/2] KVM: Optimize mmio spte zapping when creating/moving memslot Takuya Yoshikawa
                   ` (2 preceding siblings ...)
  2013-03-13  1:41 ` [PATCH 0/2] KVM: " Marcelo Tosatti
@ 2013-03-14  8:23 ` Gleb Natapov
  3 siblings, 0 replies; 21+ messages in thread
From: Gleb Natapov @ 2013-03-14  8:23 UTC (permalink / raw)
  To: Takuya Yoshikawa; +Cc: mtosatti, kvm

On Tue, Mar 12, 2013 at 05:43:33PM +0900, Takuya Yoshikawa wrote:
> This is only for mmio spte zapping, not for all zap_all() cases.
> 
> Takuya Yoshikawa (2):
>   KVM: MMU: Mark sp mmio cached when creating mmio spte
>   KVM: x86: Optimize mmio spte zapping when creating/moving memslot
> 
>  arch/x86/include/asm/kvm_host.h |    2 ++
>  arch/x86/kvm/mmu.c              |   21 +++++++++++++++++++++
>  arch/x86/kvm/x86.c              |    2 +-
>  3 files changed, 24 insertions(+), 1 deletions(-)
> 
The patches are simple, improve the current situation and can be replaced
with something better if someone is willing to do the job of writing the patches.

Applied, thanks.

--
			Gleb.


* Re: [PATCH 1/2] KVM: MMU: Mark sp mmio cached when creating mmio spte
  2013-03-14  5:13                   ` Xiao Guangrong
  2013-03-14  5:45                     ` Xiao Guangrong
@ 2013-03-16  2:01                     ` Takuya Yoshikawa
  1 sibling, 0 replies; 21+ messages in thread
From: Takuya Yoshikawa @ 2013-03-16  2:01 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, Gleb Natapov, Takuya Yoshikawa, kvm

[ I'm still reading your patches, so please forgive me if I'm wrong. ]

On Thu, 14 Mar 2013 13:13:30 +0800
Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> wrote:

> Actually, the time complexity of current kvm_mmu_zap_all is the same as zap
> mmio shadow page in the mmu-lock (O(n), n is the number of shadow page table).
> Both of them walking all shadow page table.  The reset work of kvm_mmu_zap is
> constant.

Clearing rmap arrays, using memset, cannot be constant.
It's proportional to the number of guest pages (not shadow pages).
I guess we can think of it as practically constant for all cases,
so I think your optimization is great!

But anyway it's worth remembering the arrays can be very long.
512GB: 128M pages.  Clearing 1GB of memory will not take too long(?)...
So my guess is that your method can cover most of the use cases we can
think of now.
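
The numbers above can be reproduced with a small worked example. This is a sketch under stated assumptions that are not spelled out in the thread: 4KB guest pages, one rmap entry per 4KB guest page, 8 bytes per entry (an unsigned long on x86_64), and 512 ptes per pte-level shadow page. The function names are hypothetical.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT		12	/* assumed 4KB guest pages */
#define TOY_PTES_PER_PAGE	512	/* assumed 512 ptes per pte-level shadow page */

/* Number of 4KB guest pages backing guest_bytes of memory. */
uint64_t toy_guest_pages(uint64_t guest_bytes)
{
	return guest_bytes >> TOY_PAGE_SHIFT;
}

/* Bytes of rmap array to clear, with one entry per guest page. */
uint64_t toy_rmap_bytes(uint64_t guest_bytes, uint64_t entry_size)
{
	return toy_guest_pages(guest_bytes) * entry_size;
}

/* pte-level shadow pages needed to map every guest page once. */
uint64_t toy_pte_level_shadow_pages(uint64_t guest_bytes)
{
	return toy_guest_pages(guest_bytes) / TOY_PTES_PER_PAGE;
}
```

For a 512GB guest this gives 2^27 (128M) guest pages, 2^30 bytes (1GB) of rmap entries to clear at 8 bytes each, and 2^18 (262144) pte-level shadow pages, which matches the figures quoted in this thread.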

Thanks,
	Takuya

> 
> And this is a TODO thing:
> (2): free shadow pages by using generation-number
> After that, kvm_mmu_zap needn't to walking all shadow pages anymore.


