From: Sean Christopherson <seanjc@google.com>
To: "Longpeng (Mike,
	Cloud Infrastructure Service Product Dept.)" 
	<longpeng2@huawei.com>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Gonglei (Arei)" <arei.gonglei@huawei.com>,
	Huangzhichao <huangzhichao@huawei.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: The vcpu won't be wakened for a long time
Date: Tue, 21 Dec 2021 15:27:01 +0000
Message-ID: <YcHyReHoF+qjIVTy@google.com>
In-Reply-To: <8a1a3ac75a6e4acf9bd1ce9779835e1c@huawei.com>

On Sat, Dec 18, 2021, Longpeng (Mike, Cloud Infrastructure Service Product Dept.) wrote:
> > Hmm, that strongly suggests the "vcpu != kvm_get_running_vcpu()" is at fault.
> > Can you try running with the below commit?  It's currently sitting in kvm/queue,
> > but not marked for stable because I didn't think it was possible for the check
> > to cause a missed wake event in KVM's current code base.
> > 
> 
> The commit below fixes the bug; we have just completed the tests.
> Thanks.

Aha!  Somehow I missed this call chain when analyzing the change.

  irqfd_wakeup()
  |
  |-> kvm_arch_set_irq_inatomic()
      |
      |-> kvm_irq_delivery_to_apic_fast()
          |
          |-> kvm_apic_set_irq()


Paolo, can the changelog be amended to the below, and maybe even pull the commit
into 5.16?


KVM: VMX: Wake vCPU when delivering posted IRQ even if vCPU == this vCPU

Drop a check that guards triggering a posted interrupt on the currently
running vCPU, and more importantly guards waking the target vCPU if
triggering a posted interrupt fails because the vCPU isn't IN_GUEST_MODE.
If a vIRQ is delivered from asynchronous context, the target vCPU can be
the currently running vCPU and can also be blocking, in which case
skipping kvm_vcpu_wake_up() is effectively dropping what is supposed to
be a wake event for the vCPU.

The "do nothing" logic when "vcpu == running_vcpu" mostly works only
because the majority of calls to ->deliver_posted_interrupt(), especially
when using posted interrupts, come from synchronous KVM context.  But if
a device is exposed to the guest using vfio-pci passthrough, the VFIO IRQ
and vCPU are bound to the same pCPU, and the IRQ is _not_ configured to
use posted interrupts, wake events from the device will be delivered to
KVM from IRQ context, e.g.

  vfio_msihandler()
  |
  |-> eventfd_signal()
      |
      |-> ...
          |
          |-> irqfd_wakeup()
              |
              |-> kvm_arch_set_irq_inatomic()
                  |
                  |-> kvm_irq_delivery_to_apic_fast()
                      |
                      |-> kvm_apic_set_irq()

This also aligns the non-nested and nested usage of triggering posted
interrupts, and will allow for additional cleanups.

Fixes: 379a3c8ee444 ("KVM: VMX: Optimize posted-interrupt delivery for timer fastpath")
Cc: stable@vger.kernel.org
Reported-by: Longpeng (Mike) <longpeng2@huawei.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20211208015236.1616697-18-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
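
For illustration only (not part of the patch or the changelog): a minimal
userspace sketch of the decision made in vmx_deliver_posted_interrupt()
before and after the change.  Every model_* name, the struct, and the enum
are hypothetical stand-ins invented for this sketch; they are not the
kernel's definitions, and the real helpers do considerably more (PID.ON,
IPIs, memory ordering).  The sketch assumes only what the changelog states:
triggering a posted interrupt "fails" when the vCPU isn't IN_GUEST_MODE, and
the old code skipped both the trigger and the wake when the target was the
currently running vCPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vCPU mode; not the kernel's enum. */
enum model_mode { MODEL_OUTSIDE_GUEST_MODE, MODEL_IN_GUEST_MODE };

/* Hypothetical, heavily reduced model of a vCPU. */
struct model_vcpu {
	enum model_mode mode;
	bool blocking;	/* vCPU task is blocking, or about to block */
	bool woken;	/* set once a wake event reaches the vCPU */
};

/* Models kvm_get_running_vcpu(): the vCPU loaded on this pCPU, if any. */
static struct model_vcpu *model_running_vcpu;

/* Models kvm_vcpu_trigger_posted_interrupt(): succeeds only in guest mode. */
static bool model_trigger_posted_interrupt(const struct model_vcpu *vcpu)
{
	return vcpu->mode == MODEL_IN_GUEST_MODE;
}

/* Models kvm_vcpu_wake_up(): records that the wake event was delivered. */
static void model_wake_up(struct model_vcpu *vcpu)
{
	vcpu->woken = true;
}

/* Pre-patch logic: do nothing at all if the target is the running vCPU. */
static void model_deliver_old(struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	if (vcpu != model_running_vcpu &&
	    !model_trigger_posted_interrupt(vcpu))
		model_wake_up(vcpu);
}

/* Post-patch logic: wake whenever triggering the posted interrupt fails. */
static void model_deliver_new(struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	if (!model_trigger_posted_interrupt(vcpu))
		model_wake_up(vcpu);
}

/*
 * Scenario from the changelog: an IRQ arrives from asynchronous context on
 * the pCPU whose running vCPU is the target, while that vCPU is blocking
 * and therefore not in guest mode.  Returns true if the vCPU was woken.
 */
static bool model_irq_hits_blocking_running_vcpu(bool use_new_logic)
{
	struct model_vcpu vcpu = {
		.mode = MODEL_OUTSIDE_GUEST_MODE,
		.blocking = true,
		.woken = false,
	};

	model_running_vcpu = &vcpu;
	if (use_new_logic)
		model_deliver_new(&vcpu);
	else
		model_deliver_old(&vcpu);
	model_running_vcpu = NULL;

	return vcpu.woken;
}
```

In this model, model_irq_hits_blocking_running_vcpu(false) returns false,
i.e. the wake event is dropped by the old check, while passing true returns
true because the wake is issued whenever the trigger fails, regardless of
which vCPU is running.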




> > commit 6a8110fea2c1b19711ac1ef718680dfd940363c6
> > Author: Sean Christopherson <seanjc@google.com>
> > Date:   Wed Dec 8 01:52:27 2021 +0000
> > 
> >     KVM: VMX: Wake vCPU when delivering posted IRQ even if vCPU == this vCPU
> > 
> >     Drop a check that guards triggering a posted interrupt on the currently
> >     running vCPU, and more importantly guards waking the target vCPU if
> >     triggering a posted interrupt fails because the vCPU isn't IN_GUEST_MODE.
> >     The "do nothing" logic when "vcpu == running_vcpu" works only because KVM
> >     doesn't have a path to ->deliver_posted_interrupt() from asynchronous
> >     context, e.g. if apic_timer_expired() were changed to always go down the
> >     posted interrupt path for APICv, or if the IN_GUEST_MODE check in
> >     kvm_use_posted_timer_interrupt() were dropped, and the hrtimer fired in
> >     kvm_vcpu_block() after the final kvm_vcpu_check_block() check, the vCPU
> >     would be scheduled out without being awakened, i.e. would "miss" the
> >     timer interrupt.
> > 
> >     One could argue that invoking kvm_apic_local_deliver() from (soft) IRQ
> >     context for the current running vCPU should be illegal, but nothing in
> >     KVM actually enforces that rule.  There's also no strong obvious benefit
> >     to making such behavior illegal, e.g. checking IN_GUEST_MODE and calling
> >     kvm_vcpu_wake_up() is at worst marginally more costly than querying the
> >     current running vCPU.
> > 
> >     Lastly, this aligns the non-nested and nested usage of triggering posted
> >     interrupts, and will allow for additional cleanups.
> > 
> >     Signed-off-by: Sean Christopherson <seanjc@google.com>
> >     Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
> >     Message-Id: <20211208015236.1616697-18-seanjc@google.com>
> >     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 38749063da0e..f61a6348cffd 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -3995,8 +3995,7 @@ static int vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
> >          * guaranteed to see PID.ON=1 and sync the PIR to IRR if triggering a
> >          * posted interrupt "fails" because vcpu->mode != IN_GUEST_MODE.
> >          */
> > -       if (vcpu != kvm_get_running_vcpu() &&
> > -           !kvm_vcpu_trigger_posted_interrupt(vcpu, false))
> > +       if (!kvm_vcpu_trigger_posted_interrupt(vcpu, false))
> >                 kvm_vcpu_wake_up(vcpu);
> > 
> >         return 0;


Thread overview: 11+ messages
2021-12-14 13:55 The vcpu won't be wakened for a long time Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-12-14 17:36 ` Sean Christopherson
2021-12-16 14:03   ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-12-16 15:42     ` Sean Christopherson
2021-12-17  2:11       ` Wanpeng Li
2021-12-17  5:51         ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-12-18  9:08       ` Longpeng (Mike, Cloud Infrastructure Service Product Dept.)
2021-12-21 15:27         ` Sean Christopherson [this message]
2021-12-21 15:34           ` Paolo Bonzini
2021-12-22  6:07           ` Chao Gao
2021-12-22 15:44             ` Sean Christopherson
