From mboxrd@z Thu Jan  1 00:00:00 1970
From: Liran Alon <LIRAN.ALON@ORACLE.COM>
Subject: Re: [PATCH v2 2/5] KVM: nVMX: Re-evaluate L1 pending events when
 running L2 and L1 got posted-interrupt
Date: Thu, 07 Dec 2017 04:29:16 +0200
Message-ID: <5A28A77C.9060608@ORACLE.COM>
References: <1512461786-6465-1-git-send-email-liran.alon@oracle.com> <1512461786-6465-3-git-send-email-liran.alon@oracle.com> <20171206185231.GB3644@flask>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Cc: pbonzini@redhat.com, kvm@vger.kernel.org, jmattson@google.com,
        wanpeng.li@hotmail.com, idan.brown@ORACLE.COM,
        Krish Sadhukhan <krish.sadhukhan@ORACLE.COM>,
        Konrad Rzeszutek Wilk <konrad.wilk@ORACLE.COM>
To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from userp2130.oracle.com ([156.151.31.86]:60543 "EHLO
        userp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752132AbdLGC3X (ORCPT <rfc822;kvm@vger.kernel.org>);
        Wed, 6 Dec 2017 21:29:23 -0500
In-Reply-To: <20171206185231.GB3644@flask>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>


On 06/12/17 20:52, Radim Krčmář wrote:
> 2017-12-05 10:16+0200, Liran Alon:
>> In case posted-interrupt was delivered to CPU while it is in host
>> (outside guest), then posted-interrupt delivery will be done by
>> calling sync_pir_to_irr() at vmentry after interrupts are disabled.
>>
>> sync_pir_to_irr() will check if vmx->pi_desc.control ON bit and if
>> set, it will sync vmx->pi_desc.pir to IRR and afterwards update RVI to
>> ensure virtual-interrupt-delivery will dispatch interrupt to guest.
>>
>> However, it is possible that L1 will receive a posted-interrupt while
>> CPU runs at host and is about to enter L2. In this case, the call to
>> sync_pir_to_irr() will indeed update the L1's APIC IRR but
>> vcpu_enter_guest() will then just resume into L2 guest without
>> re-evaluating if it should exit from L2 to L1 as a result of this
>> new pending L1 event.
>>
>> To address this case, if sync_pir_to_irr() has a new L1 injectable
>> interrupt and CPU is running L2, we set KVM_REQ_EVENT.
>> This will cause vcpu_enter_guest() to run another iteration of
>> evaluating pending KVM requests and will therefore consume
>> KVM_REQ_EVENT which will make sure to call check_nested_events() which
>> will handle the pending L1 event properly.
>>
>> Signed-off-by: Liran Alon <liran.alon@oracle.com>
>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
>> Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> Signed-off-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> ---
>>   arch/x86/kvm/vmx.c | 15 ++++++++++++++-
>>   1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index f5074ec5701b..47bbb8b691e8 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -9031,20 +9031,33 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
>>   static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
>>   {
>>   	struct vcpu_vmx *vmx = to_vmx(vcpu);
>> +	int prev_max_irr;
>>   	int max_irr;
>>
>>   	WARN_ON(!vcpu->arch.apicv_active);
>> +
>> +	prev_max_irr = kvm_lapic_find_highest_irr(vcpu);
>>   	if (pi_test_on(&vmx->pi_desc)) {
>>   		pi_clear_on(&vmx->pi_desc);
>> +
>>   		/*
>>   		 * IOMMU can write to PIR.ON, so the barrier matters even on UP.
>>   		 * But on x86 this is just a compiler barrier anyway.
>>   		 */
>>   		smp_mb__after_atomic();
>>   		max_irr = kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
>
> I think the optimization (partly livelock protection) is not worth the
> overhead of two IRR scans for non-nested guests.  Please make
> kvm_apic_update_irr() return both prev_max_irr and max_irr in one pass.

OK. I will modify kvm_apic_update_irr().

>
>> +
>> +		/*
>> +		 * If we are running L2 and L1 has a new pending interrupt
>> +		 * which can be injected, we should re-evaluate
>> +		 * what should be done with this new L1 interrupt.
>> +		 */
>> +		if (is_guest_mode(vcpu) && (max_irr > prev_max_irr))
>> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
>
> We don't need anything from KVM_REQ_EVENT and only use it to abort the
> VM entry, kvm_vcpu_exiting_guest_mode() is better for that.

Yes you are right. I will change to kvm_vcpu_exiting_guest_mode().

>
>>   	} else {
>> -		max_irr = kvm_lapic_find_highest_irr(vcpu);
>> +		max_irr = prev_max_irr;
>>   	}
>> +
>>   	vmx_hwapic_irr_update(vcpu, max_irr);
>
> We also should just inject the interrupt if L2 is run without
> nested_exit_on_intr(), maybe reusing the check in vmx_hwapic_irr_update?
See next patch in series :)

>
> Thanks.
>

Regards,
-Liran