From: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
To: habanero@linux.vnet.ibm.com
Cc: Avi Kivity <avi@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Ingo Molnar <mingo@redhat.com>, Rik van Riel <riel@redhat.com>,
	Srikar <srikar@linux.vnet.ibm.com>, KVM <kvm@vger.kernel.org>,
	chegu vinod <chegu_vinod@hp.com>,
	LKML <linux-kernel@vger.kernel.org>, X86 <x86@kernel.org>,
	Gleb Natapov <gleb@redhat.com>,
	Srivatsa Vaddagiri <srivatsa.vaddagiri@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [RFC][PATCH] Improving directed yield scalability for PLE handler
Date: Mon, 10 Sep 2012 20:13:05 +0530	[thread overview]
Message-ID: <504DFC79.9010307@linux.vnet.ibm.com> (raw)
In-Reply-To: <1347046931.7332.51.camel@oc2024037011.ibm.com>

On 09/08/2012 01:12 AM, Andrew Theurer wrote:
> On Fri, 2012-09-07 at 23:36 +0530, Raghavendra K T wrote:
>> CCing PeterZ also.
>>
>> On 09/07/2012 06:41 PM, Andrew Theurer wrote:
>>> I have noticed recently that PLE/yield_to() is still not that scalable
>>> for really large guests, sometimes even with no CPU over-commit.  I have
>>> a small change that makes a very big difference.
[...]
>> We are indeed avoiding CPUs in guest mode when we check
>> task->flags & PF_VCPU in the vcpu_on_spin path.  Doesn't that suffice?
> My understanding is that it checks if the candidate vcpu task is in
> guest mode (let's call this vcpu g1vcpuN), and that vcpu will not be a
> target to yield to if it is already in guest mode.  I am concerned about
> a different vcpu, possibly from a different VM (let's call it g2vcpuN),
> but it also located on the same runqueue as g1vcpuN -and- running.  That
> vcpu, g2vcpuN, may also be doing a directed yield, and it may already be
> holding the rq lock.  Or it could be in guest mode.  If it is in guest
> mode, then let's still target this rq, and try to yield to g1vcpuN.
> However, if g2vcpuN is not in guest mode, then don't bother trying.

- If a non-vcpu task is currently running on the target cpu, this change
ignores the request to yield to the target vcpu.  That target vcpu could be
the most eligible one, i.e., the vcpu whose spinning is causing the other
vcpus to PLE-exit.
Is it possible to modify the check so that it applies only to vcpu tasks?

- Should we instead use p_rq->cfs_rq->skip to tell us that some yield was
already in progress at this time?

- Consider the following scenario:

cpu1                cpu2                      cpu3
a1                  a2                        a3
b1                  b2                        b3
                    c2 (yield target of a1)   c3 (yield target of a2)

Suppose vcpu a1 is doing a directed yield to vcpu c2 while the current task
a2 on the target cpu is itself doing a directed yield (to some vcpu c3).
Then this change will only allow a2 to schedule() over to b2 (if the
a2 -> c3 yield is successful).  Do we miss yielding to vcpu c2?
a1 might not find a suitable vcpu to yield to and might go back to
spinning.  Is my understanding correct?

> Patch include below.
>
> Here's the new, v2 result with the previous two:
>
> 10 VMs, 16-way each, all running dbench (2x cpu over-commit)
>              throughput +/- stddev
>                   -----     -----
> ple on:           2552 +/- .70%
> ple on: w/fixv1:  4621 +/- 2.12%  (81% improvement)
> ple on: w/fixv2:  6115*           (139% improvement)
>

The numbers look great.

> [*] I do not have stdev yet because all 10 runs are not complete
>
> for v1 to v2, host CPU dropped from 60% to 50%.  Time in spin_lock() is
> also dropping:
>
[...]
>
> So this seems to be working.  However I wonder just how far we can take
> this.  Ideally we need to be in <3-4% in host for PLE work, like I
> observe for the 8-way VMs.  We are still way off.
>
> -Andrew
>
>
> signed-off-by: Andrew Theurer<habanero@linux.vnet.ibm.com>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index fbf1fd0..c767915 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4844,6 +4844,9 @@ bool __sched yield_to(struct task_struct *p, bool preempt)
>
>   again:
>   	p_rq = task_rq(p);
> +	if (task_running(p_rq, p) || p->state || !(p_rq->curr->flags & PF_VCPU)) {

While we are checking the flags of the p_rq->curr task, task p can
migrate to some other runqueue.  In that case, will we miss yielding to
the most eligible vcpu?

> +		goto out_no_unlock;
> +	}

Nit:
We don't need the braces above for a single-statement body.

