All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5]: KVM: nVMX: Fix multiple issues with nested-posted-interrupts
@ 2017-12-05  8:16 Liran Alon
  2017-12-05  8:16 ` [PATCH v2 1/5] KVM: nVMX: Remove pi_pending as signal to process nested posted-interrupts Liran Alon
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Liran Alon @ 2017-12-05  8:16 UTC (permalink / raw)
  To: pbonzini, rkrcmar, kvm; +Cc: jmattson, wanpeng.li, idan.brown

Hi,

This series aims to fix multiple issues with nested-posted-interrupts.

The first patch removes a per vCPU flag called pi_pending which
is used to signal KVM that it should emulate nested-posted-interrupt
dispatching on next resume of L2. However, this flag is unnecessary as
it has the exact same meaning as vmx->nested.pi_desc->control ON bit.

The second patch fixes an issue of not re-evaluating what should be done
with a new L1 pending interrupt that was discovered by syncing PIR to
IRR just before resuming L2 guest. For example, this pending L1 event
should in most cases result in exiting from L2 to L1 on
external-interrupt. But currently, we will just continue resuming L2
which is wrong.

The third patch clean-up & fix handling of directly injecting a L1
interrupt to L2 when L1 don't intercept external-interrupts. The
current handling of this case doesn't correctly consider the LAPIC TPR
and don't update it's IRR & ISR after injecting the interrupt to L2.
Fix this by using standard interrupt-injection code-path in this
scenario as-well.

The fourth patch fix multiple race-condition issues in sending &
dispatching nested-posted-interrupts. The patch fixes these issues by
checking if there is pending nested-posted-interrupts before each
vmentry and if yes, using self-IPI to make hardware dispatch them
instead of emulating behavior in software.

The fifth patch fixes a bug of not waking up a halted L2 when L1 sends
it a nested-posted-interrupt and L1 doesn't intercept HLT.

Regards,
-Liran

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: [PATCH v2 1/5] KVM: nVMX: Remove pi_pending as signal to process nested posted-interrupts
@ 2017-12-12  0:20 Liran Alon
  0 siblings, 0 replies; 18+ messages in thread
From: Liran Alon @ 2017-12-12  0:20 UTC (permalink / raw)
  To: rkrcmar, pbonzini, kvm
  Cc: konrad.wilk, jmattson, krish.sadhukhan, wanpeng.li, idan.brown


On 12/05/17 10:16, Liran Alon wrote: 
> When L1 wants to send a posted-interrupt to another L1 CPU which runs
> L2, it does the following operations:
> 1. Sets the relevant bit in vmx->nested.pi_desc PIR.
> 2. Set ON bit in vmx->nested.pi_desc->control field.
> 3. Sends an IPI to dest L1 CPU by writing the the posted-interrupt
>     notification-vector into the LAPIC ICR.
> 
> Step (3) will exit to L0 on APIC_WRITE which will eventually reach
> vmx_deliver_nested_posted_interrupt(). If dest L0 CPU is in guest,
> then a physical IPI of notification-vector will be sent which will
> trigger evaluation of posted-interrupts and clear
> vmx->nested.pi_desc->control ON bit.
> Otherwise (or if dest L0 CPU exited from guest just before sending the
> physical IPI), the nested-posted-interrupts will be evaluated on next
> vmentry by vmx_complete_nested_posted_interrupt().
> 
> In order for vmx_complete_nested_posted_interrupt() to know if it
> should do any work, the flag vmx->nested.pi_pending was used which
> was set by vmx_deliver_nested_posted_interrupt().
> 
> However, this seems unnecessary because if posted-interrupts was not
> processed yet, vmx->nested.pi_desc->control ON bit should still be
> set. Therefore, it should suffice to use it in order to know if work
> should be done.
> 
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>   arch/x86/kvm/vmx.c | 6 +-----
>   1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 714a0673ec3c..f5074ec5701b 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -453,7 +453,6 @@ struct nested_vmx {
>   	struct page *virtual_apic_page;
>   	struct page *pi_desc_page;
>   	struct pi_desc *pi_desc;
> -	bool pi_pending;
>   	u16 posted_intr_nv;
>   
>   	unsigned long *msr_bitmap;
> @@ -5050,10 +5049,9 @@ static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
>   	void *vapic_page;
>   	u16 status;
>   
> -	if (!vmx->nested.pi_desc || !vmx->nested.pi_pending)
> +	if (!vmx->nested.pi_desc)
>   		return;
>   
> -	vmx->nested.pi_pending = false;
>   	if (!pi_test_and_clear_on(vmx->nested.pi_desc))
>   		return;
>   
> @@ -5126,7 +5124,6 @@ static int vmx_deliver_nested_posted_interrupt(struct kvm_vcpu *vcpu,
>   		 * If a posted intr is not recognized by hardware,
>   		 * we will accomplish it in the next vmentry.
>   		 */
> -		vmx->nested.pi_pending = true;
>   		kvm_make_request(KVM_REQ_EVENT, vcpu);
>   		return 0;
>   	}
> @@ -10488,7 +10485,6 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
>   	/* Posted interrupts setting is only taken from vmcs12.  */
>   	if (nested_cpu_has_posted_intr(vmcs12)) {
>   		vmx->nested.posted_intr_nv = vmcs12->posted_intr_nv;
> -		vmx->nested.pi_pending = false;
>   		vmcs_write16(POSTED_INTR_NV, POSTED_INTR_NESTED_VECTOR);
>   	} else {
>   		exec_control &= ~PIN_BASED_POSTED_INTR;
> 

It turns out this patch is incorrect and should be removed from series.
I will remove it from next version of this patch series. Just didn't want you to queue it by mistake. :)

I figured that pi_pending is used as a flag to indicate if L1 has sent a posted-interrupt notification-vector such that KVM won't process the nested-posted-interrupts on vmentry in case L1 has already set the ON bit in vmx->nested.pi_desc->control field but not sent the notification-vector yet. That will break vCPU semantics.

(This is in contrast to how vmx_sync_pir_to_irr() does ignore whether a posted-interrupt notification-vector was sent or not and just consume posted-interrupts on each vmentry if the vmx->pi_desc.control ON bit is set. That is fine because in this case L0 KVM defines how it wishes to handle posted-interrupts). 

Regards,
-Liran

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2017-12-12  0:20 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-05  8:16 [PATCH v2 0/5]: KVM: nVMX: Fix multiple issues with nested-posted-interrupts Liran Alon
2017-12-05  8:16 ` [PATCH v2 1/5] KVM: nVMX: Remove pi_pending as signal to process nested posted-interrupts Liran Alon
2017-12-05  8:16 ` [PATCH v2 2/5] KVM: nVMX: Re-evaluate L1 pending events when running L2 and L1 got posted-interrupt Liran Alon
2017-12-06 18:52   ` Radim Krčmář
2017-12-07  2:29     ` Liran Alon
2017-12-11 22:53     ` Paolo Bonzini
2017-12-05  8:16 ` [PATCH v2 3/5] KVM: nVMX: Fix injection to L2 when L1 don't intercept external-interrupts Liran Alon
2017-12-06 20:20   ` Radim Krčmář
2017-12-07 11:19     ` Liran Alon
2017-12-07 16:41       ` Radim Krčmář
2017-12-05  8:16 ` [PATCH v2 4/5] KVM: nVMX: Deliver missed nested-PI notification-vector via self-IPI while interrupts disabled Liran Alon
2017-12-05  8:36   ` Wincy Van
2017-12-06 20:45   ` Radim Krčmář
2017-12-07 11:33     ` Liran Alon
2017-12-07 16:26       ` Radim Krčmář
2017-12-11 22:57         ` Paolo Bonzini
2017-12-05  8:16 ` [PATCH v2 5/5] KVM: nVMX: Wake halted L2 on nested posted-interrupt Liran Alon
2017-12-12  0:20 [PATCH v2 1/5] KVM: nVMX: Remove pi_pending as signal to process nested posted-interrupts Liran Alon

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.