* [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
@ 2012-04-13 22:38 Ying Han
  2012-04-14 11:44 ` Hillf Danton
  2012-04-20 22:11   ` Andrew Morton
  0 siblings, 2 replies; 21+ messages in thread
From: Ying Han @ 2012-04-13 22:38 UTC (permalink / raw)
  To: Michal Hocko, Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki,
	Rik van Riel, Hillf Danton, Hugh Dickins, Dan Magenheimer,
	Andrew Morton
  Cc: linux-mm

mmu_shrink() is heavy by itself: it iterates over all kvms while holding
the kvm_lock. Rik and I spotted this code during LSF, and it turns out we
don't need to call the shrinker if there is nothing to shrink.

Signed-off-by: Ying Han <yinghan@google.com>
---
 arch/x86/kvm/mmu.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4cb1642..7025736 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -188,6 +188,11 @@ static u64 __read_mostly shadow_mmio_mask;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
 
+static inline int get_kvm_total_used_mmu_pages()
+{
+	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
+}
+
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask)
 {
 	shadow_mmio_mask = mmio_mask;
@@ -3900,6 +3905,9 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 	if (nr_to_scan == 0)
 		goto out;
 
+	if (!get_kvm_total_used_mmu_pages())
+		return 0;
+
 	raw_spin_lock(&kvm_lock);
 
 	list_for_each_entry(kvm, &vm_list, vm_list) {
@@ -3926,7 +3934,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 	raw_spin_unlock(&kvm_lock);
 
 out:
-	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
+	return get_kvm_total_used_mmu_pages();
 }
 
 static struct shrinker mmu_shrinker = {
-- 
1.7.7.3


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-13 22:38 [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages Ying Han
@ 2012-04-14 11:44 ` Hillf Danton
  2012-04-16 16:43   ` Ying Han
  2012-04-20 22:11   ` Andrew Morton
  1 sibling, 1 reply; 21+ messages in thread
From: Hillf Danton @ 2012-04-14 11:44 UTC (permalink / raw)
  To: Ying Han
  Cc: Michal Hocko, Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki,
	Rik van Riel, Hugh Dickins, Dan Magenheimer, Andrew Morton,
	linux-mm

On Sat, Apr 14, 2012 at 6:38 AM, Ying Han <yinghan@google.com> wrote:
> The mmu_shrink() is heavy by itself by iterating all kvms and holding
> the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
> don't need to call the shrinker if nothing to shrink.
>
> Signed-off-by: Ying Han <yinghan@google.com>
> ---
>  arch/x86/kvm/mmu.c |   10 +++++++++-
>  1 files changed, 9 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 4cb1642..7025736 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -188,6 +188,11 @@ static u64 __read_mostly shadow_mmio_mask;
>
>  static void mmu_spte_set(u64 *sptep, u64 spte);
>
> +static inline int get_kvm_total_used_mmu_pages()
> +{
> +       return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> +}
> +
>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask)
>  {
>        shadow_mmio_mask = mmio_mask;
> @@ -3900,6 +3905,9 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>        if (nr_to_scan == 0)
>                goto out;
>
> +       if (!get_kvm_total_used_mmu_pages())
> +               return 0;
> +
>        raw_spin_lock(&kvm_lock);
>
>        list_for_each_entry(kvm, &vm_list, vm_list) {
> @@ -3926,7 +3934,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>        raw_spin_unlock(&kvm_lock);
>
>  out:
> -       return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> +       return get_kvm_total_used_mmu_pages();
>  }
>
Just a nitpick: if the new helper were not created, only one hunk would
be needed.

btw, does it make sense to check nr_to_scan while scanning vm_list, and
bail out when it hits zero?
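
A minimal sketch of that single-hunk variant, with the check open-coded
instead of wrapped in a helper (a hypothetical rearrangement of the posted
patch, not a tested change):

    static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
    {
            struct kvm *kvm;
            int nr_to_scan = sc->nr_to_scan;

            if (nr_to_scan == 0)
                    goto out;

            /* skip the kvm_lock walk when no shadow pages are cached */
            if (!percpu_counter_read_positive(&kvm_total_used_mmu_pages))
                    return 0;

            raw_spin_lock(&kvm_lock);
            /* ... existing per-VM freeing loop, unchanged ... */
            raw_spin_unlock(&kvm_lock);

    out:
            return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
    }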

Good Weekend
-hd


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-14 11:44 ` Hillf Danton
@ 2012-04-16 16:43   ` Ying Han
  0 siblings, 0 replies; 21+ messages in thread
From: Ying Han @ 2012-04-16 16:43 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Michal Hocko, Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki,
	Rik van Riel, Hugh Dickins, Dan Magenheimer, Andrew Morton,
	linux-mm

On Sat, Apr 14, 2012 at 4:44 AM, Hillf Danton <dhillf@gmail.com> wrote:
> On Sat, Apr 14, 2012 at 6:38 AM, Ying Han <yinghan@google.com> wrote:
>> The mmu_shrink() is heavy by itself by iterating all kvms and holding
>> the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
>> don't need to call the shrinker if nothing to shrink.
>>
>> Signed-off-by: Ying Han <yinghan@google.com>
>> ---
>>  arch/x86/kvm/mmu.c |   10 +++++++++-
>>  1 files changed, 9 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 4cb1642..7025736 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -188,6 +188,11 @@ static u64 __read_mostly shadow_mmio_mask;
>>
>>  static void mmu_spte_set(u64 *sptep, u64 spte);
>>
>> +static inline int get_kvm_total_used_mmu_pages()
>> +{
>> +       return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
>> +}
>> +
>>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask)
>>  {
>>        shadow_mmio_mask = mmio_mask;
>> @@ -3900,6 +3905,9 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>>        if (nr_to_scan == 0)
>>                goto out;
>>
>> +       if (!get_kvm_total_used_mmu_pages())
>> +               return 0;
>> +
>>        raw_spin_lock(&kvm_lock);
>>
>>        list_for_each_entry(kvm, &vm_list, vm_list) {
>> @@ -3926,7 +3934,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>>        raw_spin_unlock(&kvm_lock);
>>
>>  out:
>> -       return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
>> +       return get_kvm_total_used_mmu_pages();
>>  }
>>
> Just nitpick.
> If new helper not created, there is only one hunk needed.

Hmm, I thought it looks nicer with the helper function instead of the long
percpu_counter_read_positive() call in the if block.

>
> btw, make sense to check nr_to_scan while scanning vm_list, and bail out
> if it hits zero?

I don't totally understand the nr_to_scan handling in that function, but
we could do a separate patch if that is needed.

--Ying
>
> Good Weekend
> -hd


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-13 22:38 [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages Ying Han
@ 2012-04-20 22:11   ` Andrew Morton
  2012-04-20 22:11   ` Andrew Morton
  1 sibling, 0 replies; 21+ messages in thread
From: Andrew Morton @ 2012-04-20 22:11 UTC (permalink / raw)
  To: Ying Han
  Cc: Michal Hocko, Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki,
	Rik van Riel, Hillf Danton, Hugh Dickins, Dan Magenheimer,
	linux-mm, kvm, Avi Kivity, Marcelo Tosatti

On Fri, 13 Apr 2012 15:38:41 -0700
Ying Han <yinghan@google.com> wrote:

> The mmu_shrink() is heavy by itself by iterating all kvms and holding
> the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
> don't need to call the shrinker if nothing to shrink.
> 

We should probably tell the kvm maintainers about this ;)

> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -188,6 +188,11 @@ static u64 __read_mostly shadow_mmio_mask;
>  
>  static void mmu_spte_set(u64 *sptep, u64 spte);
>  
> +static inline int get_kvm_total_used_mmu_pages()
> +{
> +	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> +}
> +
>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask)
>  {
>  	shadow_mmio_mask = mmio_mask;
> @@ -3900,6 +3905,9 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>  	if (nr_to_scan == 0)
>  		goto out;
>  
> +	if (!get_kvm_total_used_mmu_pages())
> +		return 0;
> +
>  	raw_spin_lock(&kvm_lock);
>  
>  	list_for_each_entry(kvm, &vm_list, vm_list) {
> @@ -3926,7 +3934,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>  	raw_spin_unlock(&kvm_lock);
>  
>  out:
> -	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> +	return get_kvm_total_used_mmu_pages();
>  }
>  
>  static struct shrinker mmu_shrinker = {

There's a small functional change: percpu_counter_read_positive() is an
approximate thing, so there will be cases where some pages are accounted
for only in the percpu_counter's per-cpu accumulators.  In that case
mmu_shrink() will bail out even though some freeable pages are in fact
available.  This is hopefully unimportant.
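
An illustrative userspace-style model of that behaviour (invented names,
not the real lib/percpu_counter.c): the fast read looks only at the folded
global count, so per-cpu deltas below the batch size are invisible to it.

    #define MODEL_NR_CPUS  4
    #define MODEL_BATCH    32

    struct model_counter {
            long count;                     /* folded, globally visible total */
            long cpu_delta[MODEL_NR_CPUS];  /* not yet folded into ->count */
    };

    /* fast path: reads only the folded total, clamped at zero */
    static long model_read_positive(struct model_counter *c)
    {
            return c->count > 0 ? c->count : 0;
    }

    /* updates accumulate per cpu and fold into ->count once per batch */
    static void model_add(struct model_counter *c, int cpu, long delta)
    {
            c->cpu_delta[cpu] += delta;
            if (c->cpu_delta[cpu] >= MODEL_BATCH ||
                c->cpu_delta[cpu] <= -MODEL_BATCH) {
                    c->count += c->cpu_delta[cpu];
                    c->cpu_delta[cpu] = 0;
            }
    }

A VM that has just allocated a few dozen shadow pages can therefore still
read back as zero on the fast path.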

Do we actually know that this patch helps anything?  Any measurements? Is
kvm_total_used_mmu_pages==0 at all common?


^ permalink raw reply	[flat|nested] 21+ messages in thread


* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-20 22:11   ` Andrew Morton
  (?)
@ 2012-04-20 22:53   ` Rik van Riel
  2012-04-20 23:07     ` Ying Han
  -1 siblings, 1 reply; 21+ messages in thread
From: Rik van Riel @ 2012-04-20 22:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ying Han, Michal Hocko, Johannes Weiner, Mel Gorman,
	KAMEZAWA Hiroyuki, Hillf Danton, Hugh Dickins, Dan Magenheimer,
	linux-mm, kvm, Avi Kivity, Marcelo Tosatti

On 04/20/2012 06:11 PM, Andrew Morton wrote:
> On Fri, 13 Apr 2012 15:38:41 -0700
> Ying Han<yinghan@google.com>  wrote:
>
>> The mmu_shrink() is heavy by itself by iterating all kvms and holding
>> the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
>> don't need to call the shrinker if nothing to shrink.

>> @@ -3900,6 +3905,9 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
>>   	if (nr_to_scan == 0)
>>   		goto out;
>>
>> +	if (!get_kvm_total_used_mmu_pages())
>> +		return 0;
>> +

> Do we actually know that this patch helps anything?  Any measurements? Is
> kvm_total_used_mmu_pages==0 at all common?
>

On re-reading mmu.c, it looks like even with EPT or NPT,
we end up creating mmu pages for the nested page tables.

I have not had the time to look into it more, but it would
be nice to know if the patch has any effect at all.

-- 
All rights reversed


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-20 22:53   ` Rik van Riel
@ 2012-04-20 23:07     ` Ying Han
  2012-04-21  1:56       ` Takuya Yoshikawa
  0 siblings, 1 reply; 21+ messages in thread
From: Ying Han @ 2012-04-20 23:07 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andrew Morton, Michal Hocko, Johannes Weiner, Mel Gorman,
	KAMEZAWA Hiroyuki, Hillf Danton, Hugh Dickins, Dan Magenheimer,
	linux-mm, kvm, Avi Kivity, Marcelo Tosatti, Mike Waychison

On Fri, Apr 20, 2012 at 3:53 PM, Rik van Riel <riel@redhat.com> wrote:
> On 04/20/2012 06:11 PM, Andrew Morton wrote:
>>
>> On Fri, 13 Apr 2012 15:38:41 -0700
>> Ying Han<yinghan@google.com>  wrote:
>>
>>> The mmu_shrink() is heavy by itself by iterating all kvms and holding
>>> the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
>>> don't need to call the shrinker if nothing to shrink.
>
>
>>> @@ -3900,6 +3905,9 @@ static int mmu_shrink(struct shrinker *shrink,
>>> struct shrink_control *sc)
>>>        if (nr_to_scan == 0)
>>>                goto out;
>>>
>>> +       if (!get_kvm_total_used_mmu_pages())
>>> +               return 0;
>>> +
>
>
>> Do we actually know that this patch helps anything?  Any measurements? Is
>> kvm_total_used_mmu_pages==0 at all common?
>>
>
> On re-reading mmu.c, it looks like even with EPT or NPT,
> we end up creating mmu pages for the nested page tables.

I think you are right here. So the patch doesn't help the real pain.

My understanding of the real pain is the poor implementation of the
mmu_shrinker: the mmu_shrink() callback iterates over all the registered
kvms and only does a little work at a time while holding two big locks.
I learned from mikew@ (also ++cc-ed) that this is causing latency spikes
and unfairness among kvm instances in some of the experiments we've seen.
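
Roughly, the shape of that walk (simplified from the mmu_shrink() of this
era; SRCU, the freed-page bookkeeping and the early-exit conditions are
omitted, so treat it as a sketch rather than the exact code):

    raw_spin_lock(&kvm_lock);               /* global: serializes shrinker calls */
    list_for_each_entry(kvm, &vm_list, vm_list) {
            spin_lock(&kvm->mmu_lock);      /* per-VM: contended with vcpu faults */
            /* zap a handful of shadow pages for this VM */
            spin_unlock(&kvm->mmu_lock);
    }
    raw_spin_unlock(&kvm_lock);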

Mike might tell more on that.

--Ying

>
> I have not had the time to look into it more, but it would
> be nice to know if the patch has any effect at all.
>
> --
> All rights reversed


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-20 23:07     ` Ying Han
@ 2012-04-21  1:56       ` Takuya Yoshikawa
  2012-04-21  2:15         ` Mike Waychison
  0 siblings, 1 reply; 21+ messages in thread
From: Takuya Yoshikawa @ 2012-04-21  1:56 UTC (permalink / raw)
  To: Ying Han
  Cc: Rik van Riel, Andrew Morton, Michal Hocko, Johannes Weiner,
	Mel Gorman, KAMEZAWA Hiroyuki, Hillf Danton, Hugh Dickins,
	Dan Magenheimer, linux-mm, kvm, Avi Kivity, Marcelo Tosatti,
	Mike Waychison

On Fri, 20 Apr 2012 16:07:41 -0700
Ying Han <yinghan@google.com> wrote:

> My understanding of the real pain is the poor implementation of the
> mmu_shrinker. It iterates all the registered mmu_shrink callbacks for
> each kvm and only does little work at a time while holding two big
> locks. I learned from mikew@ (also ++cc-ed) that is causing latency
> spikes and unfairness among kvm instance in some of the experiment
> we've seen.

Last year, I discussed the mmu_shrink issues on kvm ML:

	[PATCH 0/4] KVM: Make mmu_shrink() scan nr_to_scan shadow pages
	http://www.spinics.net/lists/kvm/msg65231.html

Sadly, we could not find any good way at that time.

Thanks,
	Takuya


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-21  1:56       ` Takuya Yoshikawa
@ 2012-04-21  2:15         ` Mike Waychison
  2012-04-21  2:29             ` Takuya Yoshikawa
  2012-04-22  9:16           ` Avi Kivity
  0 siblings, 2 replies; 21+ messages in thread
From: Mike Waychison @ 2012-04-21  2:15 UTC (permalink / raw)
  To: Takuya Yoshikawa
  Cc: Ying Han, Rik van Riel, Andrew Morton, Michal Hocko,
	Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki, Hillf Danton,
	Hugh Dickins, Dan Magenheimer, linux-mm, kvm, Avi Kivity,
	Marcelo Tosatti, Eric Northup

On Fri, Apr 20, 2012 at 6:56 PM, Takuya Yoshikawa
<takuya.yoshikawa@gmail.com> wrote:
> On Fri, 20 Apr 2012 16:07:41 -0700
> Ying Han <yinghan@google.com> wrote:
>
>> My understanding of the real pain is the poor implementation of the
>> mmu_shrinker. It iterates all the registered mmu_shrink callbacks for
>> each kvm and only does little work at a time while holding two big
>> locks. I learned from mikew@ (also ++cc-ed) that is causing latency
>> spikes and unfairness among kvm instance in some of the experiment
>> we've seen.

The pains we have with mmu_shrink are twofold:

 - Memory pressure against the shrinker applies globally.  Any task can
cause pressure within their own environment (using numa or memcg) and
cause the global shrinker to shrink all shadowed tables on the system
(regardless of how memory is isolated between tasks).
 - Massive lock contention when all these CPUs are hitting the global
lock (which backs everybody on the system up).

In our situation, we simply disable the shrinker altogether.  My
understanding is that with EPT or NPT, the amount of memory used by
these tables is bounded by the size of guest physical memory, whereas
with software shadowed tables, it is bounded by the address spaces
in the guest.  This bound makes it reasonable to not do any reclaim
and charge it as a "system overhead tax".

As for data, the most impressive result was a massive improvement in
round-trip latency to a webserver running in a guest while another
process on the system was thrashing through page-cache (on a dozen or
so spinning disks iirc).  We were using fake-numa, and would otherwise
not expect the antagonist to drastically affect the latency-sensitive
task (as per a lot of effort into making that work).  Unfortunately,
we saw the 99th%ile latency riding at the 140ms timeout cut-off (they
were likely tailing out much longer), with the 95%ile at over 40ms.
With the mmu_shrinker disabled, the 99th%ile latency quickly dropped
down to about 20ms.

CPU profiles were showing 30% of cpu time wasted on spinlocks, all the
mmu_list_lock iirc.

In our case, I'm much happier just disabling the damned thing altogether.

>
> Last year, I discussed the mmu_shrink issues on kvm ML:
>
>        [PATCH 0/4] KVM: Make mmu_shrink() scan nr_to_scan shadow pages
>        http://www.spinics.net/lists/kvm/msg65231.html
>
> Sadly, we could not find any good way at that time.
>
> Thanks,
>        Takuya


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-21  2:15         ` Mike Waychison
@ 2012-04-21  2:29             ` Takuya Yoshikawa
  2012-04-22  9:16           ` Avi Kivity
  1 sibling, 0 replies; 21+ messages in thread
From: Takuya Yoshikawa @ 2012-04-21  2:29 UTC (permalink / raw)
  To: Mike Waychison
  Cc: Ying Han, Rik van Riel, Andrew Morton, Michal Hocko,
	Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki, Hillf Danton,
	Hugh Dickins, Dan Magenheimer, linux-mm, kvm, Avi Kivity,
	Marcelo Tosatti, Eric Northup

On Fri, 20 Apr 2012 19:15:24 -0700
Mike Waychison <mikew@google.com> wrote:

> In our situation, we simple disable the shrinker altogether.  My
> understanding is that we EPT or NPT, the amount of memory used by
> these tables is bounded by the size of guest physical memory, whereas
> with software shadowed tables, it is bounded by the addresses spaces
> in the guest.  This bound makes it reasonable to not do any reclaim
> and charge it as a "system overhead tax".

IIRC, KVM's mmu_shrink is mainly for protecting the host from pathological
guests without EPT or NPT.

You can see Avi's summary: -- http://www.spinics.net/lists/kvm/msg65671.html
===
We should aim for the following:
- normal operation causes very little shrinks (some are okay)
- high pressure mostly due to kvm results in kvm being shrunk (this is a
pathological case caused by starting a guest with a huge amount of
memory, mapping it all to /dev/zero (or ksm), and getting the guest
to create shadow mappings for all of it)
- general high pressure is shared among other caches like dcache and icache

The cost of reestablishing an mmu page can be as high as half a
millisecond of cpu time, which is the reason I want to be conservative.
===

Thanks,
	Takuya

^ permalink raw reply	[flat|nested] 21+ messages in thread


* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-21  2:29             ` Takuya Yoshikawa
@ 2012-04-21  2:48               ` Mike Waychison
  -1 siblings, 0 replies; 21+ messages in thread
From: Mike Waychison @ 2012-04-21  2:48 UTC (permalink / raw)
  To: Takuya Yoshikawa
  Cc: Ying Han, Rik van Riel, Andrew Morton, Michal Hocko,
	Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki, Hillf Danton,
	Hugh Dickins, Dan Magenheimer, linux-mm, kvm, Avi Kivity,
	Marcelo Tosatti, Eric Northup

On Fri, Apr 20, 2012 at 7:29 PM, Takuya Yoshikawa
<takuya.yoshikawa@gmail.com> wrote:
> On Fri, 20 Apr 2012 19:15:24 -0700
> Mike Waychison <mikew@google.com> wrote:
>
>> In our situation, we simple disable the shrinker altogether.  My
>> understanding is that we EPT or NPT, the amount of memory used by
>> these tables is bounded by the size of guest physical memory, whereas
>> with software shadowed tables, it is bounded by the addresses spaces
>> in the guest.  This bound makes it reasonable to not do any reclaim
>> and charge it as a "system overhead tax".
>
> IIRC, KVM's mmu_shrink is mainly for protecting the host from pathological
> guest without EPT or NPT.
>
> You can see Avi's summary: -- http://www.spinics.net/lists/kvm/msg65671.html
> ===
> We should aim for the following:
> - normal operation causes very little shrinks (some are okay)
> - high pressure mostly due to kvm results in kvm being shrunk (this is a
> pathological case caused by a starting a guest with a huge amount of
> memory, and mapping it all to /dev/zero (or ksm), and getting the guest
> the create shadow mappings for all of it)
> - general high pressure is shared among other caches like dcache and icache
>
> The cost of reestablishing an mmu page can be as high as half a
> millisecond of cpu time, which is the reason I want to be conservative.

To add to that, on these systems (32-way), the fault itself isn't as
heavy-handed as a global lock in everyone's reclaim path :)

I'd be very happy if this stuff was memcg aware, but until that
happens, this code is disabled in our production builds.  Avoiding the
30% of CPU time lost to a spinlock when mixing VMs with IO is worth
paying the < 1% of system RAM these pages cost, if it means
tighter/more-deterministic service latencies.

> ===
>
> Thanks,
>        Takuya

^ permalink raw reply	[flat|nested] 21+ messages in thread


* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-20 22:11   ` Andrew Morton
  (?)
  (?)
@ 2012-04-22  9:04   ` Avi Kivity
  -1 siblings, 0 replies; 21+ messages in thread
From: Avi Kivity @ 2012-04-22  9:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ying Han, Michal Hocko, Johannes Weiner, Mel Gorman,
	KAMEZAWA Hiroyuki, Rik van Riel, Hillf Danton, Hugh Dickins,
	Dan Magenheimer, linux-mm, kvm, Marcelo Tosatti

On 04/21/2012 01:11 AM, Andrew Morton wrote:
> On Fri, 13 Apr 2012 15:38:41 -0700
> Ying Han <yinghan@google.com> wrote:
>
> > The mmu_shrink() is heavy by itself by iterating all kvms and holding
> > the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
> > don't need to call the shrinker if nothing to shrink.
> > 
>
> We should probably tell the kvm maintainers about this ;)
>
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -188,6 +188,11 @@ static u64 __read_mostly shadow_mmio_mask;
> >  
> >  static void mmu_spte_set(u64 *sptep, u64 spte);
> >  
> > +static inline int get_kvm_total_used_mmu_pages()
> > +{
> > +	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> > +}
> > +
> >  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask)
> >  {
> >  	shadow_mmio_mask = mmio_mask;
> > @@ -3900,6 +3905,9 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
> >  	if (nr_to_scan == 0)
> >  		goto out;
> >  
> > +	if (!get_kvm_total_used_mmu_pages())
> > +		return 0;
> > +
> >  	raw_spin_lock(&kvm_lock);
> >  
> >  	list_for_each_entry(kvm, &vm_list, vm_list) {
> > @@ -3926,7 +3934,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
> >  	raw_spin_unlock(&kvm_lock);
> >  
> >  out:
> > -	return percpu_counter_read_positive(&kvm_total_used_mmu_pages);
> > +	return get_kvm_total_used_mmu_pages();
> >  }
> >  
> >  static struct shrinker mmu_shrinker = {
>
> There's a small functional change: percpu_counter_read_positive() is an
> approximate thing, so there will be cases where there will be some
> pages which are accounted for only in the percpu_counter's per-cpu
> accumulators.  In that case mmu_shrink() will bale out when there are
> in fact some freeable pages available.  This is hopefully unimportant.
>
> Do we actually know that this patch helps anything?  Any measurements? Is
> kvm_total_used_mmu_pages==0 at all common?
>

It's very common - this corresponds to the case where the kvm module is
loaded but no virtual machines are present.  But in that case the
shrinker loop is not at all heavy - take a lock, iterate over an empty
list, release lock.
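
That path reduces to roughly this (a sketch of the existing loop with the
per-VM body elided):

    raw_spin_lock(&kvm_lock);
    list_for_each_entry(kvm, &vm_list, vm_list) {
            /* never entered: vm_list is empty when no VMs are running */
    }
    raw_spin_unlock(&kvm_lock);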

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-21  2:15         ` Mike Waychison
  2012-04-21  2:29             ` Takuya Yoshikawa
@ 2012-04-22  9:16           ` Avi Kivity
  2012-04-22 19:05             ` Eric Northup
  1 sibling, 1 reply; 21+ messages in thread
From: Avi Kivity @ 2012-04-22  9:16 UTC (permalink / raw)
  To: Mike Waychison
  Cc: Takuya Yoshikawa, Ying Han, Rik van Riel, Andrew Morton,
	Michal Hocko, Johannes Weiner, Mel Gorman, KAMEZAWA Hiroyuki,
	Hillf Danton, Hugh Dickins, Dan Magenheimer, linux-mm, kvm,
	Marcelo Tosatti, Eric Northup

On 04/21/2012 05:15 AM, Mike Waychison wrote:
> On Fri, Apr 20, 2012 at 6:56 PM, Takuya Yoshikawa
> <takuya.yoshikawa@gmail.com> wrote:
> > On Fri, 20 Apr 2012 16:07:41 -0700
> > Ying Han <yinghan@google.com> wrote:
> >
> >> My understanding of the real pain is the poor implementation of the
> >> mmu_shrinker. It iterates all the registered mmu_shrink callbacks for
> >> each kvm and only does little work at a time while holding two big
> >> locks. I learned from mikew@ (also ++cc-ed) that is causing latency
> >> spikes and unfairness among kvm instance in some of the experiment
> >> we've seen.
>
> The pains we have with mmu_shrink are twofold:
>
>  - Memory pressure against the shinker applies globally.  Any task can
> cause pressure within their own environment (using numa or memcg) and
> cause the global shrinker to shrink all shadowed tables on the system
> (regardless of how memory is isolated between tasks).
>  - Massive lock contention when all these CPUs are hitting the global
> lock (which backs everybody on the system up).
>
> In our situation, we simple disable the shrinker altogether.  My
> understanding is that we EPT or NPT, the amount of memory used by
> these tables is bounded by the size of guest physical memory, whereas
> with software shadowed tables, it is bounded by the addresses spaces
> in the guest.  

There is also a 2% (default) bound enforced on a per-vm basis.

> This bound makes it reasonable to not do any reclaim
> and charge it as a "system overhead tax".



>
> As for data, the most impressive result was a massive improvement in
> round-trip latency to a webserver running in a guest while another
> process on the system was thrashing through page-cache (on a dozen or
> so spinning disks iirc).  We were using fake-numa, and would otherwise
> not expect the antagonist to drastrically affect the latency-sensitive
> task (as per a lot of effort into making that work).  Unfortunately,
> we saw the 99th%ile latency riding at the 140ms timeout cut-off (they
> were likely tailing out much longer), with the 95%ile at over 40ms.
> With the mmu_shrinker disabled, the 99th%ile latency quickly dropped
> down to about 20ms.
>
> CPU profiles were showing 30% of cpu time wasted on spinlocks, all the
> mmu_list_lock iirc.
>
> In our case, I'm much happier just disabling the damned thing altogether.
>

There is no mmu_list_lock.  Do you mean kvm_lock or kvm->mmu_lock?

If the former, then we could easily fix this by dropping kvm_lock while
the work is being done.  If the latter, then it's more difficult.

(kvm_lock being contended implies that mmu_shrink is called concurrently?)

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-20 22:11   ` Andrew Morton
                     ` (2 preceding siblings ...)
  (?)
@ 2012-04-22  9:35   ` Avi Kivity
  2012-04-23 16:40     ` Ying Han
  -1 siblings, 1 reply; 21+ messages in thread
From: Avi Kivity @ 2012-04-22  9:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ying Han, Michal Hocko, Johannes Weiner, Mel Gorman,
	KAMEZAWA Hiroyuki, Rik van Riel, Hillf Danton, Hugh Dickins,
	Dan Magenheimer, linux-mm, kvm, Marcelo Tosatti

On 04/21/2012 01:11 AM, Andrew Morton wrote:
> On Fri, 13 Apr 2012 15:38:41 -0700
> Ying Han <yinghan@google.com> wrote:
>
> > The mmu_shrink() is heavy by itself by iterating all kvms and holding
> > the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
> > don't need to call the shrinker if nothing to shrink.
> > 
>
> We should probably tell the kvm maintainers about this ;)
>


Andrew, I see you added this to -mm.  First, it should go through the
kvm tree.  Second, unless we misunderstand something, the patch does
nothing, so I don't think it should be added at all.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-22  9:16           ` Avi Kivity
@ 2012-04-22 19:05             ` Eric Northup
  2012-04-23  8:37               ` Avi Kivity
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Northup @ 2012-04-22 19:05 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Mike Waychison, Takuya Yoshikawa, Ying Han, Rik van Riel,
	Andrew Morton, Michal Hocko, Johannes Weiner, Mel Gorman,
	KAMEZAWA Hiroyuki, Hillf Danton, Hugh Dickins, Dan Magenheimer,
	linux-mm, kvm, Marcelo Tosatti

On Sun, Apr 22, 2012 at 2:16 AM, Avi Kivity <avi@redhat.com> wrote:
> On 04/21/2012 05:15 AM, Mike Waychison wrote:
[...]
> There is no mmu_list_lock.  Do you mean kvm_lock or kvm->mmu_lock?
>
> If the former, then we could easily fix this by dropping kvm_lock while
> the work is being done.  If the latter, then it's more difficult.
>
> (kvm_lock being contended implies that mmu_shrink is called concurrently?)

On a 32-core system experiencing memory pressure, mmu_shrink was often
being called concurrently (before we turned it off).

With just one, or a small number of VMs on a host, when the
mmu_shrinker contends on the kvm_lock, that's just a proxy for the
contention on kvm->mmu_lock.  It is the one that gets reported,
though, since it gets acquired first.

The contention on mmu_lock would indeed be difficult to remove.  Our
case was perhaps unusual, because of the use of memory containers.  So
some cgroups were under memory pressure (thus calling the shrinker)
but the various VCPU threads (whose guest page tables were being
evicted by the shrinker) could immediately turn around and
successfully re-allocate them.  That made the kvm->mmu_lock really
hot.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-22 19:05             ` Eric Northup
@ 2012-04-23  8:37               ` Avi Kivity
  0 siblings, 0 replies; 21+ messages in thread
From: Avi Kivity @ 2012-04-23  8:37 UTC (permalink / raw)
  To: Eric Northup
  Cc: Mike Waychison, Takuya Yoshikawa, Ying Han, Rik van Riel,
	Andrew Morton, Michal Hocko, Johannes Weiner, Mel Gorman,
	KAMEZAWA Hiroyuki, Hillf Danton, Hugh Dickins, Dan Magenheimer,
	linux-mm, kvm, Marcelo Tosatti

On 04/22/2012 10:05 PM, Eric Northup wrote:
> On Sun, Apr 22, 2012 at 2:16 AM, Avi Kivity <avi@redhat.com> wrote:
> > On 04/21/2012 05:15 AM, Mike Waychison wrote:
> [...]
> > There is no mmu_list_lock.  Do you mean kvm_lock or kvm->mmu_lock?
> >
> > If the former, then we could easily fix this by dropping kvm_lock while
> > the work is being done.  If the latter, then it's more difficult.
> >
> > (kvm_lock being contended implies that mmu_shrink is called concurrently?)
>
> On a 32-core system experiencing memory pressure, mmu_shrink was often
> being called concurrently (before we turned it off).
>
> With just one, or a small number of VMs on a host, when the
> mmu_shrinker contents on the kvm_lock, that's just a proxy for the
> contention on kvm->mmu_lock.  It is the one that gets reported,
> though, since it gets acquired first.
>
> The contention on mmu_lock would indeed be difficult to remove.  Our
> case was perhaps unusual, because of the use of memory containers.  So
> some cgroups were under memory pressure (thus calling the shrinker)
> but the various VCPU threads (whose guest page tables were being
> evicted by the shrinker) could immediately turn around and
> successfully re-allocate them.  That made the kvm->mmu_lock really
> hot.

There are two flaws at work here: first, kvm doesn't really maintain the
pages in LRU or even approximate LRU order, so a hot page can be
evicted.  Second, the shrinker subsystem has no way to tell whether the
items are recently used or not.  If you're using EPT and the guest is
active, likely all mmu pages are in use.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-22  9:35   ` Avi Kivity
@ 2012-04-23 16:40     ` Ying Han
  2012-04-23 16:48       ` Rik van Riel
  0 siblings, 1 reply; 21+ messages in thread
From: Ying Han @ 2012-04-23 16:40 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Andrew Morton, Michal Hocko, Johannes Weiner, Mel Gorman,
	KAMEZAWA Hiroyuki, Rik van Riel, Hillf Danton, Hugh Dickins,
	Dan Magenheimer, linux-mm, kvm, Marcelo Tosatti

On Sun, Apr 22, 2012 at 2:35 AM, Avi Kivity <avi@redhat.com> wrote:
> On 04/21/2012 01:11 AM, Andrew Morton wrote:
>> On Fri, 13 Apr 2012 15:38:41 -0700
>> Ying Han <yinghan@google.com> wrote:
>>
>> > The mmu_shrink() is heavy by itself by iterating all kvms and holding
>> > the kvm_lock. spotted the code w/ Rik during LSF, and it turns out we
>> > don't need to call the shrinker if nothing to shrink.
>> >
>>
>> We should probably tell the kvm maintainers about this ;)
>>
>
>
> Andrew, I see you added this to -mm.  First, it should go through the
> kvm tree.  Second, unless we misunderstand something, the patch does
> nothing, so I don't think it should be added at all.

Avi, does this patch help the case you mentioned above, where the kvm
module is loaded but no virtual machines are present?  Why do we have to
walk the empty list while holding the spinlock?

--Ying

>
> --
> error compiling committee.c: too many arguments to function
>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-23 16:40     ` Ying Han
@ 2012-04-23 16:48       ` Rik van Riel
  2012-04-23 16:57         ` Avi Kivity
  0 siblings, 1 reply; 21+ messages in thread
From: Rik van Riel @ 2012-04-23 16:48 UTC (permalink / raw)
  To: Ying Han
  Cc: Avi Kivity, Andrew Morton, Michal Hocko, Johannes Weiner,
	Mel Gorman, KAMEZAWA Hiroyuki, Hillf Danton, Hugh Dickins,
	Dan Magenheimer, linux-mm, kvm, Marcelo Tosatti

On 04/23/2012 12:40 PM, Ying Han wrote:

> Avi, does this patch help the case as you mentioned above, where kvm
> module is loaded but no virtual machines are present ? Why we have to
> walk the empty while holding the spinlock?

It might help marginally, but rather than defending
the patch, it might be more useful to try finding a
solution to the problem Mike and Eric are triggering.

How can we prevent the code from taking the lock and
throwing out EPT/NPT pages when there is no real need
to, and they will be faulted back in soon anyway?

-- 
All rights reversed


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages
  2012-04-23 16:48       ` Rik van Riel
@ 2012-04-23 16:57         ` Avi Kivity
  0 siblings, 0 replies; 21+ messages in thread
From: Avi Kivity @ 2012-04-23 16:57 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Ying Han, Andrew Morton, Michal Hocko, Johannes Weiner,
	Mel Gorman, KAMEZAWA Hiroyuki, Hillf Danton, Hugh Dickins,
	Dan Magenheimer, linux-mm, kvm, Marcelo Tosatti

On 04/23/2012 07:48 PM, Rik van Riel wrote:
> On 04/23/2012 12:40 PM, Ying Han wrote:
>
>> Avi, does this patch help the case as you mentioned above, where kvm
>> module is loaded but no virtual machines are present ? Why we have to
>> walk the empty while holding the spinlock?
>
> It might help marginally, but rather than defending
> the patch, it might be more useful to try finding a
> solution to the problem Mike and Eric are triggering.
>
> How can we prevent the code from taking the lock and
> throwing out EPT/NPT pages when there is no real need
> to, and they will be faulted back in soon anyway?

We need to split the lru into an active and inactive list, just like the
linux mm.  The inactive list might be global (rather than vm local). 
Then we can just examine the inactive list, see that it's empty, and
quit.  Perhaps signal the guests to move some pages from the active to
inactive list.
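
One possible shape for that split (purely illustrative; the inactive list,
its lock and the function name are invented for the sketch, and a real
shrinker would have to report the remaining page count instead of 0):

    /* hypothetical: shadow pages age from per-VM active lists onto here */
    static LIST_HEAD(mmu_inactive_pages);
    static DEFINE_SPINLOCK(mmu_inactive_lock);

    static int mmu_shrink_sketch(struct shrinker *shrink,
                                 struct shrink_control *sc)
    {
            /* cheap exit: nothing has aged onto the inactive list */
            if (list_empty(&mmu_inactive_pages))
                    return 0;

            spin_lock(&mmu_inactive_lock);
            /* zap up to sc->nr_to_scan pages from the inactive list */
            spin_unlock(&mmu_inactive_lock);

            return 0;
    }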

This is hindered by lack of accessed bits on EPT page tables.  That
means we must resort to faults to reactivate shadow pages.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-04-23 16:57 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-13 22:38 [PATCH] kvm: don't call mmu_shrinker w/o used_mmu_pages Ying Han
2012-04-14 11:44 ` Hillf Danton
2012-04-16 16:43   ` Ying Han
2012-04-20 22:11 ` Andrew Morton
2012-04-20 22:11   ` Andrew Morton
2012-04-20 22:53   ` Rik van Riel
2012-04-20 23:07     ` Ying Han
2012-04-21  1:56       ` Takuya Yoshikawa
2012-04-21  2:15         ` Mike Waychison
2012-04-21  2:29           ` Takuya Yoshikawa
2012-04-21  2:29             ` Takuya Yoshikawa
2012-04-21  2:48             ` Mike Waychison
2012-04-21  2:48               ` Mike Waychison
2012-04-22  9:16           ` Avi Kivity
2012-04-22 19:05             ` Eric Northup
2012-04-23  8:37               ` Avi Kivity
2012-04-22  9:04   ` Avi Kivity
2012-04-22  9:35   ` Avi Kivity
2012-04-23 16:40     ` Ying Han
2012-04-23 16:48       ` Rik van Riel
2012-04-23 16:57         ` Avi Kivity
