From: Wanpeng Li <kernellwp@gmail.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Sean Christopherson <seanjc@google.com>,
LKML <linux-kernel@vger.kernel.org>, kvm <kvm@vger.kernel.org>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>
Subject: Re: [PATCH] KVM: Boost vCPU candidiate in user mode which is delivering interrupt
Date: Tue, 20 Apr 2021 16:48:34 +0800
Message-ID: <CANRm+Czysw6z1u+fbsRF3JUyiJc0jErVATusar_Vj8CcSBy5LQ@mail.gmail.com>
In-Reply-To: <b2fca9a5-9b2b-b8f2-0d1e-fc8b9d9b5659@redhat.com>
On Tue, 20 Apr 2021 at 15:23, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 20/04/21 08:08, Wanpeng Li wrote:
> > On Tue, 20 Apr 2021 at 14:02, Wanpeng Li <kernellwp@gmail.com> wrote:
> >>
> >> On Tue, 20 Apr 2021 at 00:59, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>>
> >>> On 19/04/21 18:32, Sean Christopherson wrote:
> >>>> If false positives are a big concern, what about adding another pass to the loop
> >>>> and only yielding to usermode vCPUs with interrupts in the second full pass?
> >>>> I.e. give vCPUs that are already in kernel mode priority, and only yield to
> >>>> handle an interrupt if there are no vCPUs in kernel mode.
> >>>>
> >>>> kvm_arch_dy_runnable() pulls in pv_unhalted, which seems like a good thing.
> >>>
> >>> pv_unhalted won't help if you're waiting for a kernel spinlock though,
> >>> would it? Doing two passes (or looking for a "best" candidate that
> >>> prefers kernel mode vCPUs to user mode vCPUs waiting for an interrupt)
> >>> seems like the best choice overall.
> >>
> >> How about something like this:
>
> I was thinking of something simpler:
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 9b8e30dd5b9b..455c648f9adc 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3198,10 +3198,9 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
>  {
>  	struct kvm *kvm = me->kvm;
>  	struct kvm_vcpu *vcpu;
> -	int last_boosted_vcpu = me->kvm->last_boosted_vcpu;
>  	int yielded = 0;
>  	int try = 3;
> -	int pass;
> +	int pass, num_passes = 1;
>  	int i;
>
>  	kvm_vcpu_set_in_spin_loop(me, true);
> @@ -3212,13 +3211,14 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
>  	 * VCPU is holding the lock that we need and will release it.
>  	 * We approximate round-robin by starting at the last boosted VCPU.
>  	 */
> -	for (pass = 0; pass < 2 && !yielded && try; pass++) {
> -		kvm_for_each_vcpu(i, vcpu, kvm) {
> -			if (!pass && i <= last_boosted_vcpu) {
> -				i = last_boosted_vcpu;
> -				continue;
> -			} else if (pass && i > last_boosted_vcpu)
> -				break;
> +	for (pass = 0; pass < num_passes; pass++) {
> +		int idx = me->kvm->last_boosted_vcpu;
> +		int n = atomic_read(&kvm->online_vcpus);
> +		for (i = 0; i < n; i++, idx++) {
> +			if (idx == n)
> +				idx = 0;
> +
> +			vcpu = kvm_get_vcpu(kvm, idx);
>  			if (!READ_ONCE(vcpu->ready))
>  				continue;
>  			if (vcpu == me)
> @@ -3226,23 +3226,36 @@ void kvm_vcpu_on_spin(struct kvm_vcpu *me, bool yield_to_kernel_mode)
>  			if (rcuwait_active(&vcpu->wait) &&
>  			    !vcpu_dy_runnable(vcpu))
>  				continue;
> -			if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
> -				!kvm_arch_vcpu_in_kernel(vcpu))
> -				continue;
>  			if (!kvm_vcpu_eligible_for_directed_yield(vcpu))
>  				continue;
>
> +			if (READ_ONCE(vcpu->preempted) && yield_to_kernel_mode &&
> +			    !kvm_arch_vcpu_in_kernel(vcpu)) {
> +				/*
> +				 * A vCPU running in userspace can get to kernel mode via
> +				 * an interrupt.  That's a worse choice than a CPU already
> +				 * in kernel mode so only do it on a second pass.
> +				 */
> +				if (!vcpu_dy_runnable(vcpu))
> +					continue;
> +				if (pass == 0) {
> +					num_passes = 2;
> +					continue;
> +				}
> +			}
> +
>  			yielded = kvm_vcpu_yield_to(vcpu);
>  			if (yielded > 0) {
>  				kvm->last_boosted_vcpu = i;
> -				break;
> +				goto done;
>  			} else if (yielded < 0) {
>  				try--;
>  				if (!try)
> -					break;
> +					goto done;
>  			}
>  		}
>  	}
> +done:

We just tested the above patch on a 96-vCPU VM in an over-subscribed
scenario, and the pbzip2 score fluctuated drastically. Sometimes it is
worse than vanilla, but the average improvement is around 2.2%. The
new version of my patch gives around 9.3%, and the originally posted
patch around 10%, which is totally as expected since now both IPI
receivers in user mode and lock waiters are second-class citizens. A
big VM increases the probability that multiple vCPUs enter the PLE
handler: a vCPU that starts searching earlier can mark IPI receivers
in user mode as dy_eligible, and a vCPU that starts searching a little
later can then select one directly. With the above patch, however, a
vCPU that enters the PLE handler has to run the second full pass by
itself.

Wanpeng
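
For readers following the thread, the two-pass selection discussed above
can be sketched in userspace C. This is only an illustration under stated
assumptions, not the kernel code: struct sim_vcpu and
sim_pick_yield_target() are hypothetical names, irq_pending stands in for
vcpu_dy_runnable(), yield_to_kernel_mode and vcpu->preempted are assumed
true, the yield is assumed to succeed, and the rcuwait and
directed-yield-eligibility checks are left out. The sketch returns the
ring index of the chosen vCPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-vCPU state the patch reads. */
struct sim_vcpu {
	bool ready;       /* runnable but preempted (models vcpu->ready) */
	bool in_kernel;   /* models kvm_arch_vcpu_in_kernel() */
	bool irq_pending; /* models vcpu_dy_runnable() (assumption) */
};

/*
 * Pick a vCPU to yield to, scanning the ring from last_boosted.
 * Pass 0 accepts only kernel-mode candidates.  A second pass is
 * scheduled only if pass 0 saw a user-mode vCPU with an interrupt
 * pending, and on that pass such a vCPU is accepted as well.
 * Returns the chosen index, or -1 if there is no candidate.
 */
int sim_pick_yield_target(const struct sim_vcpu *v, int n, int me,
			  int last_boosted)
{
	int pass, num_passes = 1;

	assert(v != NULL && n > 0 && last_boosted >= 0 && last_boosted < n);

	for (pass = 0; pass < num_passes; pass++) {
		int idx = last_boosted;
		int i;

		for (i = 0; i < n; i++, idx++) {
			if (idx == n)
				idx = 0; /* wrap around the ring */
			if (idx == me || !v[idx].ready)
				continue;
			if (!v[idx].in_kernel) {
				if (!v[idx].irq_pending)
					continue;
				if (pass == 0) {
					/* Remember it, but prefer kernel mode first. */
					num_passes = 2;
					continue;
				}
			}
			return idx;
		}
	}
	return -1;
}
```

With this model, a spinning vCPU that finds no kernel-mode candidate has
to walk the whole ring a second time before it can pick a user-mode
interrupt receiver, which is the extra per-searcher cost described in the
reply above.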
Thread overview: 11+ messages
2021-04-16 3:08 [PATCH] KVM: Boost vCPU candidiate in user mode which is delivering interrupt Wanpeng Li
2021-04-17 13:09 ` Paolo Bonzini
2021-04-19 7:34 ` Wanpeng Li
2021-04-19 16:32 ` Sean Christopherson
2021-04-19 16:59 ` Paolo Bonzini
2021-04-20 6:02 ` Wanpeng Li
2021-04-20 6:08 ` Wanpeng Li
2021-04-20 7:22 ` Paolo Bonzini
2021-04-20 8:48 ` Wanpeng Li [this message]
2021-04-20 10:23 ` Paolo Bonzini
2021-04-20 10:27 ` Wanpeng Li