linux-kernel.vger.kernel.org archive mirror
From: "Radim Krčmář" <rkrcmar@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Subject: Re: [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection
Date: Tue, 7 Feb 2017 20:58:04 +0100
Message-ID: <20170207195804.GA1473@potion>
In-Reply-To: <1482164232-130035-7-git-send-email-pbonzini@redhat.com>

2016-12-19 17:17+0100, Paolo Bonzini:
> Since bf9f6ac8d749 ("KVM: Update Posted-Interrupts Descriptor when vCPU
> is blocked", 2015-09-18) the posted interrupt descriptor is checked
> unconditionally for PIR.ON.  Therefore we don't need KVM_REQ_EVENT to
> trigger the scan and, if NMIs or SMIs are not involved, we can avoid
> the complicated event injection path.
> 
> Calling kvm_vcpu_kick if PIR.ON=1 is also useless, though it has been
> there since APICv was introduced.
> 
> However, without the KVM_REQ_EVENT safety net KVM needs to be much
> more careful about races between vmx_deliver_posted_interrupt and
> vcpu_enter_guest.  First, the IPI for posted interrupts may be issued
> between setting vcpu->mode = IN_GUEST_MODE and disabling interrupts.
> If that happens, kvm_trigger_posted_interrupt returns true, but
> smp_kvm_posted_intr_ipi doesn't do anything about it.  The guest is
> entered with PIR.ON, but the posted interrupt IPI has not been sent
> and the interrupt is only delivered to the guest on the next vmentry
> (if any).  To fix this, disable interrupts before setting vcpu->mode.
> This ensures that the IPI is delayed until the guest enters non-root mode;
> it is then trapped by the processor causing the interrupt to be injected.
> 
> Second, the IPI may be issued between
> 
>                         kvm_x86_ops->hwapic_irr_update(vcpu,
>                                 kvm_lapic_find_highest_irr(vcpu));
> 
> and vcpu->mode = IN_GUEST_MODE.  In this case, kvm_vcpu_kick is called
> but it (correctly) doesn't do anything because it sees vcpu->mode ==
> OUTSIDE_GUEST_MODE.  Again, the guest is entered with PIR.ON but no
> posted interrupt IPI is pending; this time, the fix for this is to move
> the RVI update after IN_GUEST_MODE.
> 
> Both issues were previously masked by the liberal usage of KVM_REQ_EVENT.
> In both race scenarios KVM_REQ_EVENT would cancel guest entry, resulting
> in another vmentry which would inject the interrupt.
> 
> This saves about 300 cycles on the self_ipi_* tests of vmexit.flat.

Please mention that this also fixes an existing problem with posted
interrupts from devices.  If we didn't check PIR.ON after disabling host
interrupts, we might delay delivery to the next VM exit or posted
interrupt.  (It was recently posted.)
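
To illustrate why the PIR.ON check has to come after the write to
vcpu->mode, here is a minimal userspace sketch of the ordering using C11
atomics.  It is only a model, not KVM code: struct model_vcpu,
model_init(), model_deliver() and model_enter() are hypothetical names,
the sequentially consistent atomics stand in for the kernel barriers, and
clearing a flag stands in for sync_pir_to_irr copying PIR into the IRR.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the two fields involved in the race. */
enum model_mode { OUTSIDE_GUEST_MODE, IN_GUEST_MODE };

struct model_vcpu {
	atomic_int  mode;    /* stands in for vcpu->mode */
	atomic_bool pir_on;  /* stands in for PIR.ON     */
};

static void model_init(struct model_vcpu *v)
{
	atomic_init(&v->mode, OUTSIDE_GUEST_MODE);
	atomic_init(&v->pir_on, false);
}

/*
 * Sender side (loosely modelled on vmx_deliver_posted_interrupt):
 * set PIR.ON first, then read the mode.  The seq_cst exchange plays
 * the role of the barrier implicit in pi_test_and_set_on.
 * Returns true if the sender would send the posted interrupt IPI.
 */
static bool model_deliver(struct model_vcpu *v)
{
	if (atomic_exchange(&v->pir_on, true))
		return false;	/* a notification is already pending */
	return atomic_load(&v->mode) == IN_GUEST_MODE;
}

/*
 * Entry side (loosely modelled on vcpu_enter_guest):
 * write the mode first, then check PIR.ON.
 * Returns true if a pending posted interrupt was picked up before entry.
 */
static bool model_enter(struct model_vcpu *v)
{
	atomic_store(&v->mode, IN_GUEST_MODE);
	return atomic_exchange(&v->pir_on, false);
}
```

With this ordering, whichever side runs second observes the other side's
write: either the sender sees IN_GUEST_MODE and sends the IPI, or the
entry path sees PIR.ON and syncs it before entering the guest.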

> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> @@ -6767,20 +6754,39 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	kvm_x86_ops->prepare_guest_switch(vcpu);
>  	if (vcpu->fpu_active)
>  		kvm_load_guest_fpu(vcpu);
> +
> +	/*
> +	 * Disabling IRQs before setting IN_GUEST_MODE.  Posted interrupt
> +	 * IPI are then delayed after guest entry, which ensures that they
> +	 * result in virtual interrupt delivery.
> +	 */
> +	local_irq_disable();
>  	vcpu->mode = IN_GUEST_MODE;
>  
>  	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>  
>  	/*
> -	 * We should set ->mode before check ->requests,
> -	 * Please see the comment in kvm_make_all_cpus_request.
> -	 * This also orders the write to mode from any reads
> -	 * to the page tables done while the VCPU is running.
> -	 * Please see the comment in kvm_flush_remote_tlbs.
> +	 * 1) We should set ->mode before checking ->requests.  Please see
> +	 * the comment in kvm_make_all_cpus_request.
> +	 *
> +	 * 2) For APICv, we should set ->mode before checking PIR.ON.  This
> +	 * pairs with the memory barrier implicit in pi_test_and_set_on
> +	 * (see vmx_deliver_posted_interrupt).
> +	 *
> +	 * 3) This also orders the write to mode from any reads to the page
> +	 * tables done while the VCPU is running.  Please see the comment
> +	 * in kvm_flush_remote_tlbs.
>  	 */
>  	smp_mb__after_srcu_read_unlock();
> -	local_irq_disable();
> +	if (kvm_lapic_enabled(vcpu)) {
> +		/*
> +		 * This handles the case where a posted interrupt was
> +		 * notified with kvm_vcpu_kick.
> +		 */
> +		if (kvm_x86_ops->sync_pir_to_irr)
> +			kvm_x86_ops->sync_pir_to_irr(vcpu);

Hm, this doesn't work well with nesting when L1 has assigned devices:
if the posted interrupt arrives just before local_irq_disable(), then
we'll just enter L2 instead of doing a nested VM exit (in case L1 has
external-interrupt exiting enabled).

And after reading the code a bit, I think we allow posted interrupts in
L2 while L1 has assigned devices that use posted interrupts, and that
this combination doesn't work.

Am I missing something?

Thanks.


Thread overview: 25+ messages
2016-12-19 16:17 [PATCH v2 0/6] KVM: x86: cleanup and speedup for APICv Paolo Bonzini
2016-12-19 16:17 ` [PATCH 1/6] KVM: vmx: clear pending interrupts on KVM_SET_LAPIC Paolo Bonzini
2017-02-07 17:42   ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 2/6] kvm: nVMX: move nested events check to kvm_vcpu_running Paolo Bonzini
2017-02-07 18:16   ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 3/6] KVM: x86: preparatory changes for APICv cleanups Paolo Bonzini
2017-02-07 18:20   ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 4/6] KVM: vmx: move sync_pir_to_irr from apic_find_highest_irr to callers Paolo Bonzini
2016-12-19 16:17 ` [PATCH 5/6] KVM: x86: do not scan IRR twice on APICv vmentry Paolo Bonzini
2017-02-07 20:19   ` Radim Krčmář
2017-02-07 21:49     ` Radim Krčmář
2017-02-08 14:10     ` Paolo Bonzini
2017-02-08 14:24       ` Radim Krčmář
2016-12-19 16:17 ` [PATCH 6/6] kvm: x86: do not use KVM_REQ_EVENT for APICv interrupt injection Paolo Bonzini
2017-02-07 19:58   ` Radim Krčmář [this message]
2017-02-08 16:23     ` Paolo Bonzini
2017-02-09 15:11       ` Radim Krčmář
2017-03-09  1:23   ` Wanpeng Li
2017-03-09  9:40     ` Wanpeng Li
2017-03-09 10:03       ` Paolo Bonzini
2017-02-07 17:23 ` [PATCH v2 0/6] KVM: x86: cleanup and speedup for APICv Paolo Bonzini
2017-02-07 21:52   ` Radim Krčmář
2017-02-08 10:04     ` Paolo Bonzini
2017-02-08 13:33       ` Radim Krčmář
2017-02-08 15:01         ` Paolo Bonzini
