From: Wanpeng Li <kernellwp@gmail.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	kvm <kvm@vger.kernel.org>
Subject: Re: [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection
Date: Thu, 9 Mar 2017 09:23:26 +0800
Message-ID: <CANRm+CzgCBOY+XJyNgFHZ65O-UPP4oFjGEepgPrGuciF7Q=+Bw@mail.gmail.com>
In-Reply-To: <1482164232-130035-7-git-send-email-pbonzini@redhat.com>

2016-12-20 0:17 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU
> is blocked", 2015-09-18) the posted interrupt descriptor is checked
> unconditionally for PIR.ON.  Therefore we don't need KVM_REQ_EVENT to
> trigger the scan and, if NMIs or SMIs are not involved, we can avoid
> the complicated event injection path.
>
> Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been
> there since APICv was introduced.
>
> However, without the KVM_REQ_EVENT safety net KVM needs to be much
> more careful about races between vmx_deliver_posted_interrupt and
> vcpu_enter_guest.  First, the IPI for posted interrupts may be issued
> between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts.
> If that happens, kvm_trigger_posted_interrupt returns true, but
> smp_kvm_posted_intr_ipi doesn't do anything about it.  The guest is
> entered with PIR.ON, but the posted interrupt IPI has not been sent
> and the interrupt is only delivered to the guest on the next vmentry
> (if any).  To fix this, disable interrupts before setting vcpu->mode.
> This ensures that the IPI is delayed until the guest enters non-root mode;
> it is then trapped by the processor causing the interrupt to be injected.
>
> Second, the IPI may be issued between
>
>                         kvm_x86_ops->hwapic_irr_update(vcpu,
>                                 kvm_lapic_find_highest_irr(vcpu));
>
> and vcpu->mode = IN_GUEST_MODE.  In this case, kvm_vcpu_kick is called
> but it (correctly) doesn't do anything because it sees vcpu->mode ==
> OUTSIDE_GUEST_MODE.  Again, the guest is entered with PIR.ON but no
> posted interrupt IPI is pending; this time, the fix for this is to move
> the RVI update after IN_GUEST_MODE.
>
> Both issues were previously masked by the liberal usage of KVM_REQ_EVENT.
> In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting
> in another vmentry which would inject the interrupt.
>
> This saves about 300 cycles on the self_ipi_* tests of vmexit.flat.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kvm/lapic.c | 11 ++++-------
>  arch/x86/kvm/vmx.c   |  8 +++++---
>  arch/x86/kvm/x86.c   | 44 +++++++++++++++++++++++++-------------------
>  3 files changed, 34 insertions(+), 29 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index f644dd1dbe71..5ea94b622e88 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -385,12 +385,8 @@ int __kvm_apic_update_irr(u32 *pir, void *regs)
>  int kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
>  {
>         struct kvm_lapic *apic = vcpu->arch.apic;
> -       int max_irr;
>
> -       max_irr = __kvm_apic_update_irr(pir, apic->regs);
> -
> -       kvm_make_request(KVM_REQ_EVENT, vcpu);
> -       return max_irr;
> +       return __kvm_apic_update_irr(pir, apic->regs);
>  }
>  EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>
> @@ -423,9 +419,10 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
>         vcpu = apic->vcpu;
>
>         if (unlikely(vcpu->arch.apicv_active)) {
> -               /* try to update RVI */
> +               /* need to update RVI */
>                 apic_clear_vector(vec, apic->regs + APIC_IRR);
> -               kvm_make_request(KVM_REQ_EVENT, vcpu);
> +               kvm_x86_ops->hwapic_irr_update(vcpu,
> +                               apic_find_highest_irr(apic));
>         } else {
>                 apic->irr_pending = false;
>                 apic_clear_vector(vec, apic->regs + APIC_IRR);
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 27e40b180242..3dd4fad35a3e 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5062,9 +5062,11 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>         if (pi_test_and_set_pir(vector, &vmx->pi_desc))
>                 return;
>
> -       r = pi_test_and_set_on(&vmx->pi_desc);
> -       kvm_make_request(KVM_REQ_EVENT, vcpu);
> -       if (r || !kvm_vcpu_trigger_posted_interrupt(vcpu))
> +       /* If a previous notification has sent the IPI, nothing to do.  */
> +       if (pi_test_and_set_on(&vmx->pi_desc))
> +               return;
> +
> +       if (!kvm_vcpu_trigger_posted_interrupt(vcpu))
>                 kvm_vcpu_kick(vcpu);
>  }
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c666414adc1d..725473ba6dd3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6710,19 +6710,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>                         kvm_hv_process_stimers(vcpu);
>         }
>
> -       /*
> -        * KVM_REQ_EVENT is not set when posted interrupts are set by
> -        * VT-d hardware, so we have to update RVI unconditionally.
> -        */
> -       if (kvm_lapic_enabled(vcpu)) {
> -               /*
> -                * Update architecture specific hints for APIC
> -                * virtual interrupt delivery.
> -                */
> -               if (kvm_x86_ops->sync_pir_to_irr)
> -                       kvm_x86_ops->sync_pir_to_irr(vcpu);
> -       }
> -
>         if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>                 ++vcpu->stat.req_event;
>                 kvm_apic_accept_events(vcpu);
> @@ -6767,20 +6754,39 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>         kvm_x86_ops->prepare_guest_switch(vcpu);
>         if (vcpu->fpu_active)
>                 kvm_load_guest_fpu(vcpu);
> +
> +       /*
> +        * Disabling IRQs before setting IN_GUEST_MODE.  Posted interrupt
> +        * IPI are then delayed after guest entry, which ensures that they
> +        * result in virtual interrupt delivery.
> +        */
> +       local_irq_disable();
>         vcpu->mode = IN_GUEST_MODE;
>
>         srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>
>         /*
> -        * We should set ->mode before check ->requests,
> -        * Please see the comment in kvm_make_all_cpus_request.
> -        * This also orders the write to mode from any reads
> -        * to the page tables done while the VCPU is running.
> -        * Please see the comment in kvm_flush_remote_tlbs.
> +        * 1) We should set ->mode before checking ->requests.  Please see
> +        * the comment in kvm_make_all_cpus_request.
> +        *
> +        * 2) For APICv, we should set ->mode before checking PIR.ON.  This
> +        * pairs with the memory barrier implicit in pi_test_and_set_on
> +        * (see vmx_deliver_posted_interrupt).
> +        *
> +        * 3) This also orders the write to mode from any reads to the page
> +        * tables done while the VCPU is running.  Please see the comment
> +        * in kvm_flush_remote_tlbs.
>          */
>         smp_mb__after_srcu_read_unlock();
>
> -       local_irq_disable();

Moving local_irq_disable() is unnecessary if you move sync_pir_to_irr.

- If the IPI arrives after both vcpu->mode = IN_GUEST_MODE and the
interrupt disable, the posted interrupt is delivered successfully.
- If the IPI arrives between vcpu->mode = IN_GUEST_MODE and the
interrupt disable, sync_pir_to_irr will catch the PIR and set RVI.
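
The two cases above can be sketched as a single-threaded toy model in C.
This is only an illustration under stated assumptions, not kernel code:
every toy_* name is hypothetical, the model ignores kvm_vcpu_kick and the
vcpu->requests check, and it assumes that an IPI raised while host IRQs
are off stays pending until vmentry, where the processor scans PIR.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_mode { OUTSIDE_GUEST_MODE, IN_GUEST_MODE };

/* Toy model of the state involved; not the real struct kvm_vcpu. */
struct toy_vcpu {
	enum toy_mode mode;
	bool irqs_off;     /* host IRQs disabled on the target CPU */
	bool pir_on;       /* models PIR.ON */
	int pir_vector;    /* vector waiting in PIR, -1 if none */
	int rvi;           /* models RVI, -1 if none */
	bool ipi_latched;  /* notification IPI held pending until vmentry */
};

/* Sender side, loosely following vmx_deliver_posted_interrupt. */
static void toy_deliver(struct toy_vcpu *v, int vector)
{
	assert(vector >= 0);
	v->pir_vector = vector;
	if (v->pir_on)
		return;                 /* an earlier notification is outstanding */
	v->pir_on = true;
	if (v->mode == IN_GUEST_MODE && v->irqs_off)
		v->ipi_latched = true;  /* IPI waits for non-root mode */
	/* Otherwise the IPI hits the host handler, which does nothing here. */
}

/* Models sync_pir_to_irr: fold PIR into RVI and clear ON. */
static void toy_sync_pir_to_irr(struct toy_vcpu *v)
{
	if (!v->pir_on)
		return;
	v->pir_on = false;
	v->rvi = v->pir_vector;
	v->pir_vector = -1;
}

/* Models vmentry: a latched IPI makes the processor scan PIR itself. */
static int toy_vmentry(struct toy_vcpu *v)
{
	if (v->ipi_latched) {
		v->ipi_latched = false;
		toy_sync_pir_to_irr(v);
	}
	return v->rvi;
}

enum toy_race { IPI_BEFORE_IRQ_DISABLE, IPI_AFTER_IRQ_DISABLE };

/* Entry sequence with sync_pir_to_irr run after the interrupt disable. */
static int toy_enter_guest(enum toy_race when, int vector)
{
	struct toy_vcpu v = { OUTSIDE_GUEST_MODE, false, false, -1, -1, false };

	v.mode = IN_GUEST_MODE;
	if (when == IPI_BEFORE_IRQ_DISABLE)
		toy_deliver(&v, vector);   /* IPI consumed by the host handler */
	v.irqs_off = true;
	toy_sync_pir_to_irr(&v);           /* picks up PIR left by that IPI */
	if (when == IPI_AFTER_IRQ_DISABLE)
		toy_deliver(&v, vector);   /* IPI latched until vmentry */
	return toy_vmentry(&v);            /* vector visible to the guest, or -1 */
}
```

In this model toy_enter_guest() returns the posted vector for both values
of the race argument. That matches the two bullets, but it says nothing
about orderings the model leaves out.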

Regards,
Wanpeng Li

> +       if (kvm_lapic_enabled(vcpu)) {
> +               /*
> +                * This handles the case where a posted interrupt was
> +                * notified with kvm_vcpu_kick.
> +                */
> +               if (kvm_x86_ops->sync_pir_to_irr)
> +                       kvm_x86_ops->sync_pir_to_irr(vcpu);
> +       }
>
>         if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
>             || need_resched() || signal_pending(current)) {
> --
> 1.8.3.1
>
