linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 1/1] kvm: Use vcpu_id as pivot instead of last boosted vcpu in PLE handler
@ 2012-08-29 19:21 Raghavendra K T
  2012-09-02 10:12 ` Gleb Natapov
  0 siblings, 1 reply; 5+ messages in thread
From: Raghavendra K T @ 2012-08-29 19:21 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti, Rik van Riel
  Cc: Srikar, Nikunj A. Dadhania, KVM, Raghavendra K T, LKML,
	Srivatsa Vaddagiri, Gleb Natapov

 The idea of starting from next vcpu (source of yield_to + 1) seem to work
 well for overcomitted guest rather than using last boosted vcpu. We can also
 remove per VM variable with this approach.
 
 Iteration for eligible candidate after this patch starts from vcpu source+1
 and ends at source-1 (after wrapping)
 
 Thanks Nikunj for his quick verification of the patch.
 
 Please let me know if this patch is interesting and makes sense.

====8<====
From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>

 Currently we use next vcpu to last boosted vcpu as starting point
 while deciding eligible vcpu for directed yield.

 In overcomitted scenarios, if more vcpu try to do directed yield,
 they start from same vcpu, resulting in wastage of cpu time (because of
 failing yields and double runqueue lock).
 
 Since probability of same vcpu trying to do directed yield is already
 prevented by improved PLE handler, we can start from next vcpu from source
 of yield_to.

Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
---

 include/linux/kvm_host.h |    1 -
 virt/kvm/kvm_main.c      |   12 ++++--------
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b70b48b..64a090d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -275,7 +275,6 @@ struct kvm {
 #endif
 	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
 	atomic_t online_vcpus;
-	int last_boosted_vcpu;
 	struct list_head vm_list;
 	struct mutex lock;
 	struct kvm_io_bus *buses[KVM_NR_BUSES];
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2468523..65a6c83 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1584,7 +1584,6 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
 {
 	struct kvm *kvm = me->kvm;
 	struct kvm_vcpu *vcpu;
-	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
 	int yielded = 0;
 	int pass;
 	int i;
@@ -1594,21 +1593,18 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
 	 * currently running, because it got preempted by something
 	 * else and called schedule in __vcpu_run.  Hopefully that
 	 * VCPU is holding the lock that we need and will release it.
-	 * We approximate round-robin by starting at the last boosted VCPU.
+	 * We approximate round-robin by starting at the next VCPU.
 	 */
 	for (pass = 0; pass < 2 && !yielded; pass++) {
 		kvm_for_each_vcpu(i, vcpu, kvm) {
-			if (!pass && i <= last_boosted_vcpu) {
-				i = last_boosted_vcpu;
+			if (!pass && i <= me->vcpu_id) {
+				i = me->vcpu_id;
 				continue;
-			} else if (pass && i > last_boosted_vcpu)
+			} else if (pass && i >= me->vcpu_id)
 				break;
-			if (vcpu == me)
-				continue;
 			if (waitqueue_active(&vcpu->wq))
 				continue;
 			if (kvm_vcpu_yield_to(vcpu)) {
-				kvm->last_boosted_vcpu = i;
 				yielded = 1;
 				break;
 			}


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC 1/1] kvm: Use vcpu_id as pivot instead of last boosted vcpu in PLE handler
  2012-08-29 19:21 [PATCH RFC 1/1] kvm: Use vcpu_id as pivot instead of last boosted vcpu in PLE handler Raghavendra K T
@ 2012-09-02 10:12 ` Gleb Natapov
  2012-09-02 16:29   ` Rik van Riel
  0 siblings, 1 reply; 5+ messages in thread
From: Gleb Natapov @ 2012-09-02 10:12 UTC (permalink / raw)
  To: Raghavendra K T
  Cc: Avi Kivity, Marcelo Tosatti, Rik van Riel, Srikar,
	Nikunj A. Dadhania, KVM, LKML, Srivatsa Vaddagiri

On Thu, Aug 30, 2012 at 12:51:01AM +0530, Raghavendra K T wrote:
>  The idea of starting from next vcpu (source of yield_to + 1) seem to work
>  well for overcomitted guest rather than using last boosted vcpu. We can also
>  remove per VM variable with this approach.
>  
>  Iteration for eligible candidate after this patch starts from vcpu source+1
>  and ends at source-1 (after wrapping)
>  
>  Thanks Nikunj for his quick verification of the patch.
>  
>  Please let me know if this patch is interesting and makes sense.
> 
This last_boosted_vcpu thing caused us trouble during attempt to
implement vcpu destruction. It is good to see it removed from this POV.

> ====8<====
> From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> 
>  Currently we use next vcpu to last boosted vcpu as starting point
>  while deciding eligible vcpu for directed yield.
> 
>  In overcomitted scenarios, if more vcpu try to do directed yield,
>  they start from same vcpu, resulting in wastage of cpu time (because of
>  failing yields and double runqueue lock).
>  
>  Since probability of same vcpu trying to do directed yield is already
>  prevented by improved PLE handler, we can start from next vcpu from source
>  of yield_to.
> 
> Suggested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
> ---
> 
>  include/linux/kvm_host.h |    1 -
>  virt/kvm/kvm_main.c      |   12 ++++--------
>  2 files changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index b70b48b..64a090d 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -275,7 +275,6 @@ struct kvm {
>  #endif
>  	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
>  	atomic_t online_vcpus;
> -	int last_boosted_vcpu;
>  	struct list_head vm_list;
>  	struct mutex lock;
>  	struct kvm_io_bus *buses[KVM_NR_BUSES];
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2468523..65a6c83 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1584,7 +1584,6 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  {
>  	struct kvm *kvm = me->kvm;
>  	struct kvm_vcpu *vcpu;
> -	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
>  	int yielded = 0;
>  	int pass;
>  	int i;
> @@ -1594,21 +1593,18 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me)
>  	 * currently running, because it got preempted by something
>  	 * else and called schedule in __vcpu_run.  Hopefully that
>  	 * VCPU is holding the lock that we need and will release it.
> -	 * We approximate round-robin by starting at the last boosted VCPU.
> +	 * We approximate round-robin by starting at the next VCPU.
>  	 */
>  	for (pass = 0; pass < 2 && !yielded; pass++) {
>  		kvm_for_each_vcpu(i, vcpu, kvm) {
> -			if (!pass && i <= last_boosted_vcpu) {
> -				i = last_boosted_vcpu;
> +			if (!pass && i <= me->vcpu_id) {
> +				i = me->vcpu_id;
>  				continue;
> -			} else if (pass && i > last_boosted_vcpu)
> +			} else if (pass && i >= me->vcpu_id)
>  				break;
> -			if (vcpu == me)
> -				continue;
>  			if (waitqueue_active(&vcpu->wq))
>  				continue;
>  			if (kvm_vcpu_yield_to(vcpu)) {
> -				kvm->last_boosted_vcpu = i;
>  				yielded = 1;
>  				break;
>  			}

--
			Gleb.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC 1/1] kvm: Use vcpu_id as pivot instead of last boosted vcpu in PLE handler
  2012-09-02 10:12 ` Gleb Natapov
@ 2012-09-02 16:29   ` Rik van Riel
  2012-09-04 11:57     ` Raghavendra K T
  2012-09-15  2:22     ` Raghavendra K T
  0 siblings, 2 replies; 5+ messages in thread
From: Rik van Riel @ 2012-09-02 16:29 UTC (permalink / raw)
  To: Gleb Natapov
  Cc: Raghavendra K T, Avi Kivity, Marcelo Tosatti, Srikar,
	Nikunj A. Dadhania, KVM, LKML, Srivatsa Vaddagiri

On 09/02/2012 06:12 AM, Gleb Natapov wrote:
> On Thu, Aug 30, 2012 at 12:51:01AM +0530, Raghavendra K T wrote:
>>   The idea of starting from next vcpu (source of yield_to + 1) seem to work
>>   well for overcomitted guest rather than using last boosted vcpu. We can also
>>   remove per VM variable with this approach.
>>
>>   Iteration for eligible candidate after this patch starts from vcpu source+1
>>   and ends at source-1 (after wrapping)
>>
>>   Thanks Nikunj for his quick verification of the patch.
>>
>>   Please let me know if this patch is interesting and makes sense.
>>
> This last_boosted_vcpu thing caused us trouble during attempt to
> implement vcpu destruction. It is good to see it removed from this POV.

I like this implementation.  It should achieve pretty much
the same as my old code, but without the downsides and without
having to keep the same amount of global state.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC 1/1] kvm: Use vcpu_id as pivot instead of last boosted vcpu in PLE handler
  2012-09-02 16:29   ` Rik van Riel
@ 2012-09-04 11:57     ` Raghavendra K T
  2012-09-15  2:22     ` Raghavendra K T
  1 sibling, 0 replies; 5+ messages in thread
From: Raghavendra K T @ 2012-09-04 11:57 UTC (permalink / raw)
  To: Rik van Riel, Gleb Natapov
  Cc: Avi Kivity, Marcelo Tosatti, Srikar, Nikunj A. Dadhania, KVM,
	LKML, Srivatsa Vaddagiri

On 09/02/2012 09:59 PM, Rik van Riel wrote:
> On 09/02/2012 06:12 AM, Gleb Natapov wrote:
>> On Thu, Aug 30, 2012 at 12:51:01AM +0530, Raghavendra K T wrote:
>>> The idea of starting from next vcpu (source of yield_to + 1) seem to
>>> work
>>> well for overcomitted guest rather than using last boosted vcpu. We
>>> can also
>>> remove per VM variable with this approach.
>>>
>>> Iteration for eligible candidate after this patch starts from vcpu
>>> source+1
>>> and ends at source-1 (after wrapping)
>>>
>>> Thanks Nikunj for his quick verification of the patch.
>>>
>>> Please let me know if this patch is interesting and makes sense.
>>>
>> This last_boosted_vcpu thing caused us trouble during attempt to
>> implement vcpu destruction. It is good to see it removed from this POV.
>
> I like this implementation. It should achieve pretty much
> the same as my old code, but without the downsides and without
> having to keep the same amount of global state.
>

My theoretical understanding how it would help is,

       |
       V
T0 ------- T1

suppose there are 4 vcpus (v1..v4) out of 32/64 vcpus simpultaneously 
enter directed yield handler,

if last_boosted_vcpu = i then v1 .. v4 will start from i, and there may
be some unnecessary attempts for directed yields.

We may not see such attempts with above patch. But again I agree that,
whole directed_yield stuff itself is very complicated because of 
possibility of each vcpu in different state (running/pauseloop exited 
while spinning/eligible)  and how they are located w.r.t each other.

Here is the result I got for ebizzy, 32 vcpu guest 32 core PLE machine
for 1x 2x and 3x overcommits.

base = 3.5-rc5 kernel with ple handler improvements patches applied
patched = base + vcpuid patch

      base         stdev       patched     stdev       %improvement
1x  1955.6250    39.8961    1863.3750    37.8302    -4.71716
2x  2475.3750   165.0307    3078.8750   341.9500    24.38014
3x  2071.5556    91.5370    2112.6667    56.6171     1.98455

Note:
I have to admit that, I am seeing very inconsistent results while 
experimenting with 3.6-rc kernel (not specific to vcpuid patch but as a 
whole) but not sure if it is some thing wrong in my config or should I 
spend some time debugging. Anybody has observed same?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH RFC 1/1] kvm: Use vcpu_id as pivot instead of last boosted vcpu in PLE handler
  2012-09-02 16:29   ` Rik van Riel
  2012-09-04 11:57     ` Raghavendra K T
@ 2012-09-15  2:22     ` Raghavendra K T
  1 sibling, 0 replies; 5+ messages in thread
From: Raghavendra K T @ 2012-09-15  2:22 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: Rik van Riel, Gleb Natapov, Srikar, Nikunj A. Dadhania, KVM,
	LKML, Srivatsa Vaddagiri

On 09/02/2012 09:59 PM, Rik van Riel wrote:
> On 09/02/2012 06:12 AM, Gleb Natapov wrote:
>> On Thu, Aug 30, 2012 at 12:51:01AM +0530, Raghavendra K T wrote:
>>> The idea of starting from next vcpu (source of yield_to + 1) seem to
>>> work
>>> well for overcomitted guest rather than using last boosted vcpu. We
>>> can also
>>> remove per VM variable with this approach.
>>>
>>> Iteration for eligible candidate after this patch starts from vcpu
>>> source+1
>>> and ends at source-1 (after wrapping)
>>>
>>> Thanks Nikunj for his quick verification of the patch.
>>>
>>> Please let me know if this patch is interesting and makes sense.
>>>
>> This last_boosted_vcpu thing caused us trouble during attempt to
>> implement vcpu destruction. It is good to see it removed from this POV.
>
> I like this implementation. It should achieve pretty much
> the same as my old code, but without the downsides and without
> having to keep the same amount of global state.
>

I able to test this on 3.6-rc5 (where I do not see inconsistency may be
it was my bad to go with rc1), with 32 guest 1x and 2x overcommit
scenario

Here is the result on 16 core ple machine (with HT 32 thread) x240
machine

base = 3.6-rc5 + ple handler improvement patch
patched = base + vcpuid usage patch

+-----------+-----------+-----------+------------+-----------+
      ebizzy (records/sec higher is better)
+-----------+-----------+-----------+------------+-----------+
     base         stdev       patched    stdev        %improve
+-----------+-----------+-----------+------------+-----------+
1x 11293.3750   624.4378   11242.8750   583.1757    -0.44716
2x  3641.8750   468.9400    4088.8750   290.5470    12.27390
+-----------+-----------+-----------+------------+-----------+


Avi, Marcelo.. any comments on this?


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-09-15  2:26 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-29 19:21 [PATCH RFC 1/1] kvm: Use vcpu_id as pivot instead of last boosted vcpu in PLE handler Raghavendra K T
2012-09-02 10:12 ` Gleb Natapov
2012-09-02 16:29   ` Rik van Riel
2012-09-04 11:57     ` Raghavendra K T
2012-09-15  2:22     ` Raghavendra K T

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).