From mboxrd@z Thu Jan  1 00:00:00 1970
From: Radim =?utf-8?B?S3LEjW3DocWZ?= <rkrcmar@redhat.com>
Subject: Re: [PATCH v2 2/5] KVM: nVMX: Re-evaluate L1 pending events when
 running L2 and L1 got posted-interrupt
Date: Wed, 6 Dec 2017 19:52:31 +0100
Message-ID: <20171206185231.GB3644@flask>
References: <1512461786-6465-1-git-send-email-liran.alon@oracle.com>
 <1512461786-6465-3-git-send-email-liran.alon@oracle.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: pbonzini@redhat.com, kvm@vger.kernel.org, jmattson@google.com,
        wanpeng.li@hotmail.com, idan.brown@oracle.com,
        Krish Sadhukhan <krish.sadhukhan@oracle.com>,
        Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Liran Alon <liran.alon@oracle.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:62128 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752096AbdLFSwg (ORCPT <rfc822;kvm@vger.kernel.org>);
        Wed, 6 Dec 2017 13:52:36 -0500
Content-Disposition: inline
In-Reply-To: <1512461786-6465-3-git-send-email-liran.alon@oracle.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

2017-12-05 10:16+0200, Liran Alon:
> In case posted-interrupt was delivered to CPU while it is in host
> (outside guest), then posted-interrupt delivery will be done by
> calling sync_pir_to_irr() at vmentry after interrupts are disabled.
> 
> sync_pir_to_irr() will check if vmx->pi_desc.control ON bit and if
> set, it will sync vmx->pi_desc.pir to IRR and afterwards update RVI to
> ensure virtual-interrupt-delivery will dispatch interrupt to guest.
> 
> However, it is possible that L1 will receive a posted-interrupt while
> CPU runs at host and is about to enter L2. In this case, the call to
> sync_pir_to_irr() will indeed update the L1's APIC IRR but
> vcpu_enter_guest() will then just resume into L2 guest without
> re-evaluating if it should exit from L2 to L1 as a result of this
> new pending L1 event.
> 
> To address this case, if sync_pir_to_irr() has a new L1 injectable
> interrupt and CPU is running L2, we set KVM_REQ_EVENT.
> This will cause vcpu_enter_guest() to run another iteration of
> evaluating pending KVM requests and will therefore consume
> KVM_REQ_EVENT which will make sure to call check_nested_events() which
> will handle the pending L1 event properly.
> 
> Signed-off-by: Liran Alon <liran.alon@oracle.com>
> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> ---
>  arch/x86/kvm/vmx.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index f5074ec5701b..47bbb8b691e8 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -9031,20 +9031,33 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
>  static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> +	int prev_max_irr;
>  	int max_irr;
>  
>  	WARN_ON(!vcpu->arch.apicv_active);
> +
> +	prev_max_irr = kvm_lapic_find_highest_irr(vcpu);
>  	if (pi_test_on(&vmx->pi_desc)) {
>  		pi_clear_on(&vmx->pi_desc);
> +
>  		/*
>  		 * IOMMU can write to PIR.ON, so the barrier matters even on UP.
>  		 * But on x86 this is just a compiler barrier anyway.
>  		 */
>  		smp_mb__after_atomic();
>  		max_irr = kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);

I think the optimization (partly livelock protection) is not worth the
overhead of two IRR scans for non-nested guests.  Please make
kvm_apic_update_irr() return both prev_max_irr and max_irr in one pass.

> +
> +		/*
> +		 * If we are running L2 and L1 has a new pending interrupt
> +		 * which can be injected, we should re-evaluate
> +		 * what should be done with this new L1 interrupt.
> +		 */
> +		if (is_guest_mode(vcpu) && (max_irr > prev_max_irr))
> +			kvm_make_request(KVM_REQ_EVENT, vcpu);

We don't need anything from KVM_REQ_EVENT and only use it to abort the
VM entry, kvm_vcpu_exiting_guest_mode() is better for that.

>  	} else {
> -		max_irr = kvm_lapic_find_highest_irr(vcpu);
> +		max_irr = prev_max_irr;
>  	}
> +
>  	vmx_hwapic_irr_update(vcpu, max_irr);

We also should just inject the interrupt if L2 is run without
nested_exit_on_intr(), maybe reusing the check in vmx_hwapic_irr_update?

Thanks.