From: Wanpeng Li <kernellwp@gmail.com>
To: Sean Christopherson <seanjc@google.com>
Cc: LKML <linux-kernel@vger.kernel.org>, kvm <kvm@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>
Subject: Re: [PATCH 2/3] KVM: X86: Bail out of direct yield in case of undercomitted scenarios
Date: Wed, 12 May 2021 10:43:18 +0800	[thread overview]
Message-ID: <CANRm+Czbc9AX3=Qj7dDCENyWj27drWniimZLnyKd9=--Ag8F+g@mail.gmail.com> (raw)
In-Reply-To: <YJr6v+hfMJxI2iAn@google.com>

On Wed, 12 May 2021 at 05:44, Sean Christopherson <seanjc@google.com> wrote:
>
> On Sat, May 08, 2021, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > In undercommitted scenarios, a vCPU gets scheduled easily, so
> > kvm_vcpu_yield_to() adds extra overhead; we observe many races where
> > vcpu->ready is true but the yield fails because p->state is
> > TASK_RUNNING. Bail out in such scenarios by checking the length of
> > the current CPU's runqueue.
> >
> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> > ---
> >  arch/x86/kvm/x86.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 5bd550e..c0244a6 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -8358,6 +8358,9 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
> >       struct kvm_vcpu *target = NULL;
> >       struct kvm_apic_map *map;
> >
> > +     if (single_task_running())
> > +             goto no_yield;
> > +
>
> Hmm, could we push the result of kvm_sched_yield() down into the guest?
> Currently the guest bails after the first attempt, which is perfect for this
> scenario, but it seems like it would make sense to keep trying to yield if there
> are multiple preempted vCPUs and

There can be a race with sustained yielding if there are multiple
preempted vCPUs: the vCPU you intend to yield to may have already
finished handling the IPI and be preempted again by the time the
yielding sender is scheduled back in and checks the next preempted
candidate.

> the "problem" was with the target.  E.g.

At the beginning of kvm_sched_yield() we just sample the run queue
length on the source CPU; it is a hint that the system is
under-committed, not a guarantee of accuracy.
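
A minimal host-side sketch of what pushing the result back to the guest
could look like, per the suggestion above: kvm_sched_yield() would
report whether it bailed out, and the hypercall handler would pass that
back so the guest loop quoted below can stop retrying. This is a
hypothetical illustration, not the posted patch; the return-value
plumbing and the choice of -EBUSY are assumptions.

/* Hypothetical: report the bail-out so the guest can stop retrying. */
static int kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
{
	/* The source run queue length is only a hint of under-commit. */
	if (single_task_running())
		return -EBUSY;

	/* ... existing APIC-map lookup and kvm_vcpu_yield_to() ... */
	return 0;
}

/* In kvm_emulate_hypercall():
 *	case KVM_HC_SCHED_YIELD:
 *		ret = kvm_sched_yield(vcpu, a0);
 *		break;
 */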

>
>         /*
>          * Make sure other vCPUs get a chance to run if they need to.  Yield at
>          * most once, and stop trying to yield if the VMM says yielding isn't
>          * going to happen.
>          */
>         for_each_cpu(cpu, mask) {
>                 if (vcpu_is_preempted(cpu)) {
>                         r = kvm_hypercall1(KVM_HC_SCHED_YIELD,
>                                            per_cpu(x86_cpu_to_apicid, cpu));
>                         if (r != -EBUSY)
>                                 break;
>                 }
>         }
>
>
> Unrelated to this patch, but it's the first time I've really looked at the guest
> side of directed yield...
>
> Wouldn't it also make sense for the guest side to hook .send_call_func_single_ipi?

The reschedule IPI is sent via the .smp_send_reschedule hook. There is
a lot of research aimed at accelerating idle vCPU reactivation; my
original intention was to boost synchronization primitives. I believe
we need extensive benchmarking of inter-VM fairness and the performance
benefit before hooking .send_call_func_single_ipi and
.smp_send_reschedule.
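
For reference, a guest-side sketch of what hooking
.send_call_func_single_ipi might look like, modeled on the existing
kvm_smp_send_call_func_ipi() in arch/x86/kernel/kvm.c; this is an
untested assumption for discussion, not something the series does.

static void kvm_smp_send_call_func_single_ipi(int cpu)
{
	/* Deliver the IPI first so the target has work pending. */
	native_send_call_func_single_ipi(cpu);

	/* If the target vCPU is preempted, ask the host to boost it. */
	if (vcpu_is_preempted(cpu))
		kvm_hypercall1(KVM_HC_SCHED_YIELD,
			       per_cpu(x86_cpu_to_apicid, cpu));
}

/* Registered next to the existing hook, e.g. in kvm_guest_init():
 *	smp_ops.send_call_func_single_ipi = kvm_smp_send_call_func_single_ipi;
 */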

>
> >       vcpu->stat.directed_yield_attempted++;
>
> Shouldn't directed_yield_attempted be incremented in this case?  It doesn't seem
> fundamentally different than the case where the target was scheduled in between
> the guest's check and the host's processing of the yield request.  In both
> instances, the guest did indeed attempt to yield.

Yes, it should be treated as attempted. I placed the check before the
counting because this patch improves the success ratio in
under-committed scenarios, and doing so makes it easy to see how much
of the failure ratio is left over. I can move the check after the
counting in the next version.
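
Concretely, the reordering would look something like the sketch below
(the body of the function is elided and the next version may differ):
count the attempt first, then bail out on an under-committed pCPU.

static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
{
	struct kvm_vcpu *target = NULL;
	struct kvm_apic_map *map;

	vcpu->stat.directed_yield_attempted++;

	/* Bail out, but only after the attempt has been counted. */
	if (single_task_running())
		goto no_yield;

	/* ... existing target lookup and kvm_vcpu_yield_to() ... */

no_yield:
	return;
}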

    Wanpeng


Thread overview: 9+ messages
2021-05-08  9:31 [PATCH 1/3] KVM: PPC: Book3S HV: exit halt polling on need_resched() as well Wanpeng Li
2021-05-08  9:31 ` [PATCH 2/3] KVM: X86: Bail out of direct yield in case of undercomitted scenarios Wanpeng Li
2021-05-11 21:44   ` Sean Christopherson
2021-05-12  2:43     ` Wanpeng Li [this message]
2021-05-12 16:59       ` Sean Christopherson
2021-05-08  9:31 ` [PATCH 3/3] KVM: X86: Fix vCPU preempted state from guest point of view Wanpeng Li
2021-05-11  0:18   ` Sean Christopherson
2021-05-11 10:28     ` Wanpeng Li
2021-05-12  0:02 ` [PATCH 1/3] KVM: PPC: Book3S HV: exit halt polling on need_resched() as well Wanpeng Li
