From: Paolo Bonzini <pbonzini@redhat.com>
To: Wanpeng Li <kernellwp@gmail.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
kvm <kvm@vger.kernel.org>
Subject: Re: [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection
Date: Thu, 9 Mar 2017 11:03:18 +0100
Message-ID: <7aef786e-17f8-d711-28f0-e640b605b60a@redhat.com>
In-Reply-To: <CANRm+Cyq7jphDZX49u=KtxOjWx3DzmdsL6EnNHPOeCnvPaoayQ@mail.gmail.com>
On 09/03/2017 10:40, Wanpeng Li wrote:
> 2017-03-09 9:23 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
>> 2016-12-20 0:17 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>>> Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU
>>> is blocked", 2015-09-18) the posted interrupt descriptor is checked
>>> unconditionally for PIR.ON. Therefore we don't need KVM_REQ_EVENT to
>>> trigger the scan and, if NMIs or SMIs are not involved, we can avoid
>>> the complicated event injection path.
>>>
>>> Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been
>>> there since APICv was introduced.
>>>
>>> However, without the KVM_REQ_EVENT safety net KVM needs to be much
>>> more careful about races between vmx_deliver_posted_interrupt and
>>> vcpu_enter_guest. First, the IPI for posted interrupts may be issued
>>> between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts.
>>> If that happens, kvm_trigger_posted_interrupt returns true, but
>>> smp_kvm_posted_intr_ipi doesn't do anything about it. The guest is
>>> entered with PIR.ON, but the posted interrupt IPI has not been sent
>>> and the interrupt is only delivered to the guest on the next vmentry
>>> (if any). To fix this, disable interrupts before setting vcpu->mode.
>>> This ensures that the IPI is delayed until the guest enters non-root mode;
>>> it is then trapped by the processor causing the interrupt to be injected.
>>>
>>> Second, the IPI may be issued between
>>>
>>>         kvm_x86_ops->hwapic_irr_update(vcpu,
>>>                 kvm_lapic_find_highest_irr(vcpu));
>>>
>>> and vcpu->mode = IN_GUEST_MODE. In this case, kvm_vcpu_kick is called
>>> but it (correctly) doesn't do anything because it sees vcpu->mode ==
>>> OUTSIDE_GUEST_MODE. Again, the guest is entered with PIR.ON but no
>>> posted interrupt IPI is pending; this time, the fix for this is to move
>>> the RVI update after IN_GUEST_MODE.
>>>
>>> Both issues were previously masked by the liberal usage of KVM_REQ_EVENT.
>>> In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting
>>> in another vmentry which would inject the interrupt.
>>>
>>> This saves about 300 cycles on the self_ipi_* tests of vmexit.flat.
>>>
>>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>>> ---
>>> arch/x86/kvm/lapic.c | 11 ++++-------
>>> arch/x86/kvm/vmx.c | 8 +++++---
>>> arch/x86/kvm/x86.c | 44 +++++++++++++++++++++++++-------------------
>>> 3 files changed, 34 insertions(+), 29 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index f644dd1dbe71..5ea94b622e88 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -385,12 +385,8 @@ int __kvm_apic_update_irr(u32 *pir, void *regs)
>>>  int kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
>>>  {
>>>  	struct kvm_lapic *apic = vcpu->arch.apic;
>>> -	int max_irr;
>>>
>>> -	max_irr = __kvm_apic_update_irr(pir, apic->regs);
>>> -
>>> -	kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> -	return max_irr;
>>> +	return __kvm_apic_update_irr(pir, apic->regs);
>>>  }
>>>  EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
>>>
>>> @@ -423,9 +419,10 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
>>>  	vcpu = apic->vcpu;
>>>
>>>  	if (unlikely(vcpu->arch.apicv_active)) {
>>> -		/* try to update RVI */
>>> +		/* need to update RVI */
>>>  		apic_clear_vector(vec, apic->regs + APIC_IRR);
>>> -		kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> +		kvm_x86_ops->hwapic_irr_update(vcpu,
>>> +				apic_find_highest_irr(apic));
>>>  	} else {
>>>  		apic->irr_pending = false;
>>>  		apic_clear_vector(vec, apic->regs + APIC_IRR);
>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>> index 27e40b180242..3dd4fad35a3e 100644
>>> --- a/arch/x86/kvm/vmx.c
>>> +++ b/arch/x86/kvm/vmx.c
>>> @@ -5062,9 +5062,11 @@ static void vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>>>  	if (pi_test_and_set_pir(vector, &vmx->pi_desc))
>>>  		return;
>>>
>>> -	r = pi_test_and_set_on(&vmx->pi_desc);
>>> -	kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> -	if (r || !kvm_vcpu_trigger_posted_interrupt(vcpu))
>>> +	/* If a previous notification has sent the IPI, nothing to do. */
>>> +	if (pi_test_and_set_on(&vmx->pi_desc))
>>> +		return;
>>> +
>>> +	if (!kvm_vcpu_trigger_posted_interrupt(vcpu))
>>>  		kvm_vcpu_kick(vcpu);
>>>  }
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index c666414adc1d..725473ba6dd3 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -6710,19 +6710,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>  			kvm_hv_process_stimers(vcpu);
>>>  	}
>>>
>>> -	/*
>>> -	 * KVM_REQ_EVENT is not set when posted interrupts are set by
>>> -	 * VT-d hardware, so we have to update RVI unconditionally.
>>> -	 */
>>> -	if (kvm_lapic_enabled(vcpu)) {
>>> -		/*
>>> -		 * Update architecture specific hints for APIC
>>> -		 * virtual interrupt delivery.
>>> -		 */
>>> -		if (kvm_x86_ops->sync_pir_to_irr)
>>> -			kvm_x86_ops->sync_pir_to_irr(vcpu);
>>> -	}
>>> -
>>>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>>>  		++vcpu->stat.req_event;
>>>  		kvm_apic_accept_events(vcpu);
>>> @@ -6767,20 +6754,39 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>  	kvm_x86_ops->prepare_guest_switch(vcpu);
>>>  	if (vcpu->fpu_active)
>>>  		kvm_load_guest_fpu(vcpu);
>>> +
>>> +	/*
>>> +	 * Disabling IRQs before setting IN_GUEST_MODE. Posted interrupt
>>> +	 * IPI are then delayed after guest entry, which ensures that they
>>> +	 * result in virtual interrupt delivery.
>>> +	 */
>>> +	local_irq_disable();
>>>  	vcpu->mode = IN_GUEST_MODE;
>>>
>>>  	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>>>
>>>  	/*
>>> -	 * We should set ->mode before check ->requests,
>>> -	 * Please see the comment in kvm_make_all_cpus_request.
>>> -	 * This also orders the write to mode from any reads
>>> -	 * to the page tables done while the VCPU is running.
>>> -	 * Please see the comment in kvm_flush_remote_tlbs.
>>> +	 * 1) We should set ->mode before checking ->requests. Please see
>>> +	 * the comment in kvm_make_all_cpus_request.
>>> +	 *
>>> +	 * 2) For APICv, we should set ->mode before checking PIR.ON. This
>>> +	 * pairs with the memory barrier implicit in pi_test_and_set_on
>>> +	 * (see vmx_deliver_posted_interrupt).
>>> +	 *
>>> +	 * 3) This also orders the write to mode from any reads to the page
>>> +	 * tables done while the VCPU is running. Please see the comment
>>> +	 * in kvm_flush_remote_tlbs.
>>>  	 */
>>>  	smp_mb__after_srcu_read_unlock();
>>>
>>> -	local_irq_disable();
>>
>> The local_irq_disable() movement is unnecessary if you move sync_pir_to_irr.
>
> In addition, this movement will increase the time of irq disable to
> some degree. Do you think I can send a patch to revert it?
The difference is a few dozen hundred clock cycles, I don't think it
matters. Also, a posted interrupt sent to the host while IN_GUEST_MODE
is more expensive than one sent while the processor is in non-root mode.
All in all, I think it's preferable to keep the local_irq_disable here.
Your observation seems correct though.
Paolo
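
To make the ordering under discussion concrete, here is a rough user-space
model of the entry sequence after this patch: interrupts off, then
vcpu->mode = IN_GUEST_MODE, then a full barrier, then the PIR-to-IRR sync.
It is only a sketch under stated assumptions, not kernel code. The entry_*
names, the struct layout and the irqs_disabled flag are made up for
illustration, and C11 atomics stand in for the kernel's bitops and for
smp_mb__after_srcu_read_unlock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; they only loosely mirror the KVM structures. */
enum { ENTRY_OUTSIDE_GUEST_MODE, ENTRY_IN_GUEST_MODE };

struct entry_vcpu {
	atomic_uint_fast32_t pir[8];	/* posted-interrupt requests, written by senders */
	atomic_bool on;			/* outstanding-notification bit (PIR.ON) */
	atomic_int mode;
	uint32_t irr[8];		/* virtual IRR, only touched by the vCPU thread */
	bool irqs_disabled;		/* stands in for the host interrupt flag */
};

static void entry_vcpu_init(struct entry_vcpu *vcpu)
{
	assert(vcpu);
	for (int i = 0; i < 8; i++) {
		atomic_init(&vcpu->pir[i], 0);
		vcpu->irr[i] = 0;
	}
	atomic_init(&vcpu->on, false);
	atomic_init(&vcpu->mode, ENTRY_OUTSIDE_GUEST_MODE);
	vcpu->irqs_disabled = false;
}

/*
 * Models the tail of vcpu_enter_guest after the patch.  Returns the highest
 * pending vector (the value that would be written to RVI), or -1 if none.
 */
static int entry_prepare_guest_entry(struct entry_vcpu *vcpu)
{
	int max_irr = -1;

	vcpu->irqs_disabled = true;			/* local_irq_disable() */
	atomic_store(&vcpu->mode, ENTRY_IN_GUEST_MODE);
	atomic_thread_fence(memory_order_seq_cst);	/* full barrier after the mode write */

	/* Rough equivalent of sync_pir_to_irr: only scan PIR if ON was set. */
	if (atomic_exchange(&vcpu->on, false)) {
		for (int i = 0; i < 8; i++)
			vcpu->irr[i] |= (uint32_t)atomic_exchange(&vcpu->pir[i],
								  (uint_fast32_t)0);
	}

	/* Find the highest set bit in the 256-bit IRR. */
	for (int i = 7; i >= 0 && max_irr < 0; i--) {
		for (int bit = 31; bit >= 0; bit--) {
			if (vcpu->irr[i] & (UINT32_C(1) << bit)) {
				max_irr = i * 32 + bit;
				break;
			}
		}
	}
	return max_irr;
}
```

If a vector is posted and ON is set while the model vCPU is still outside
guest mode (the kvm_vcpu_kick case), entry_prepare_guest_entry() folds it
into the IRR and returns it, which is the case the relocated sync_pir_to_irr
call is meant to cover.
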
> Regards,
> Wanpeng Li
>
>>
>> - If the IPI arrives after vcpu->mode = IN_GUEST_MODE and interrupt
>>   disable, the PI is delivered successfully.
>> - If the IPI arrives between vcpu->mode = IN_GUEST_MODE and interrupt
>>   disable, sync_pir_to_irr will catch the PIR and set RVI.
>>
>> Regards,
>> Wanpeng Li
>>
>>> +	if (kvm_lapic_enabled(vcpu)) {
>>> +		/*
>>> +		 * This handles the case where a posted interrupt was
>>> +		 * notified with kvm_vcpu_kick.
>>> +		 */
>>> +		if (kvm_x86_ops->sync_pir_to_irr)
>>> +			kvm_x86_ops->sync_pir_to_irr(vcpu);
>>> +	}
>>>
>>>  	if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
>>>  	    || need_resched() || signal_pending(current)) {
>>> --
>>> 1.8.3.1
>>>
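
The sender-side logic in the vmx.c hunk can be sketched as a small
user-space model. This is an illustration under stated assumptions, not the
kernel implementation: the model_* names and the enum of outcomes are
hypothetical, and C11 atomic read-modify-write operations stand in for
pi_test_and_set_pir() and pi_test_and_set_on(). The function returns what
the sender would do next instead of actually sending an IPI or kicking.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; they only loosely mirror the KVM structures. */
enum model_mode { MODEL_OUTSIDE_GUEST_MODE, MODEL_IN_GUEST_MODE };

enum model_action {
	MODEL_NOTHING,	/* vector or ON already set: an earlier sender handled it */
	MODEL_SEND_IPI,	/* target in guest mode: send the posted-interrupt IPI */
	MODEL_KICK,	/* target outside guest mode: fall back to kvm_vcpu_kick */
};

struct model_vcpu {
	atomic_uint_fast32_t pir[8];	/* 256 posted-interrupt request bits */
	atomic_bool on;			/* outstanding-notification bit (PIR.ON) */
	atomic_int mode;
};

static void model_vcpu_init(struct model_vcpu *vcpu)
{
	for (int i = 0; i < 8; i++)
		atomic_init(&vcpu->pir[i], 0);
	atomic_init(&vcpu->on, false);
	atomic_init(&vcpu->mode, MODEL_OUTSIDE_GUEST_MODE);
}

static enum model_action
model_deliver_posted_interrupt(struct model_vcpu *vcpu, unsigned int vector)
{
	uint_fast32_t bit;

	assert(vector < 256);
	bit = (uint_fast32_t)1 << (vector % 32);

	/* Like pi_test_and_set_pir: the vector was already pending. */
	if (atomic_fetch_or(&vcpu->pir[vector / 32], bit) & bit)
		return MODEL_NOTHING;

	/* Like pi_test_and_set_on: a previous notification is outstanding. */
	if (atomic_exchange(&vcpu->on, true))
		return MODEL_NOTHING;

	/*
	 * The sequentially consistent exchange above orders the ON write
	 * before this read of mode, mirroring the barrier pairing that the
	 * new comment in vcpu_enter_guest describes.
	 */
	if (atomic_load(&vcpu->mode) == MODEL_IN_GUEST_MODE)
		return MODEL_SEND_IPI;
	return MODEL_KICK;
}
```

In this model a second sender that finds ON already set returns
MODEL_NOTHING without kicking, which corresponds to the commit message's
remark that calling kvm_vcpu_kick when PIR.ON=1 is useless.
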
Thread overview: 25+ messages
2016-12-19 16:17 [PATCH v2 0/6] KVM: x86: cleanup and speedup for APICv Paolo Bonzini
2016-12-19 16:17 ` [PATCH 1/6] KVM: vmx: clear pending interrupts on KVM_SET_LAPIC Paolo Bonzini
2017-02-07 17:42 ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 2/6] kvm: nVMX: move nested events check to kvm_vcpu_running Paolo Bonzini
2017-02-07 18:16 ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 3/6] KVM: x86: preparatory changes for APICv cleanups Paolo Bonzini
2017-02-07 18:20 ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 4/6] KVM: vmx: move sync_pir_to_irr from apic_find_highest_irr to callers Paolo Bonzini
2016-12-19 16:17 ` [PATCH 5/6] KVM: x86: do not scan IRR twice on APICv vmentry Paolo Bonzini
2017-02-07 20:19 ` Radim Krčmář
2017-02-07 21:49 ` Radim Krčmář
2017-02-08 14:10 ` Paolo Bonzini
2017-02-08 14:24 ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection Paolo Bonzini
2017-02-07 19:58 ` Radim Krčmář
2017-02-08 16:23 ` Paolo Bonzini
2017-02-09 15:11 ` Radim Krčmář
2017-03-09 1:23 ` Wanpeng Li
2017-03-09 9:40 ` Wanpeng Li
2017-03-09 10:03 ` Paolo Bonzini [this message]
2017-02-07 17:23 ` [PATCH v2 0/6] KVM: x86: cleanup and speedup for APICv Paolo Bonzini
2017-02-07 21:52 ` Radim Krčmář
2017-02-08 10:04 ` Paolo Bonzini
2017-02-08 13:33 ` Radim Krčmář
2017-02-08 15:01 ` Paolo Bonzini