From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Subject: RE: [PATCH v11 3/3] x86, apicv: add virtual interrupt delivery
 support
Date: Wed, 23 Jan 2013 00:45:39 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E3145A1@SHSMSX101.ccr.corp.intel.com>
References: <1358331672-32384-1-git-send-email-yang.z.zhang@intel.com>
 <1358331672-32384-4-git-send-email-yang.z.zhang@intel.com>
 <A9667DDFB95DB7438FA9D7D576C3D87E30BB5A@SHSMSX101.ccr.corp.intel.com>
 <20130120125136.GA5119@redhat.com>
 <A9667DDFB95DB7438FA9D7D576C3D87E311507@SHSMSX101.ccr.corp.intel.com>
 <20130121050300.GA25818@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Shan, Haitao" <haitao.shan@intel.com>,
	"mtosatti@redhat.com" <mtosatti@redhat.com>,
	"Zhang, Xiantao" <xiantao.zhang@intel.com>,
	"Tian, Kevin" <kevin.tian@intel.com>
To: Gleb Natapov <gleb@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mga14.intel.com ([143.182.124.37]:52105 "EHLO mga14.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751127Ab3AWApy convert rfc822-to-8bit (ORCPT
	<rfc822;kvm@vger.kernel.org>); Tue, 22 Jan 2013 19:45:54 -0500
In-Reply-To: <20130121050300.GA25818@redhat.com>
Content-Language: en-US
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Gleb Natapov wrote on 2013-01-21:
> On Mon, Jan 21, 2013 at 12:49:01AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-01-20:
>>> On Thu, Jan 17, 2013 at 01:26:03AM +0000, Zhang, Yang Z wrote:
>>>> Previous patch is stale. Resend the new patch. The only change is
>>>> clear EOI and SELF-IPI reg in msr bitmap when vid is enabled.
>>>> 
>>>> ------------------------
>>>> @@ -340,6 +325,8 @@ static inline int apic_find_highest_irr(struct kvm_lapic
>>> *apic)
>>>>  {
>>>>  	int result;
>>>> +	/* Note that irr_pending is just a hint. It will be always
>>>> +	 * true with virtual interrupt delivery enabled. */
>>> This is not correct format for multi-line comments.
>> Sure, will correct it here and below.
>> 
>>>> +static void vmx_check_ioapic_entry(struct kvm_vcpu *vcpu,
>>>> +				   struct kvm_lapic_irq *irq)
>>>> +{
>>>> +	struct kvm_lapic **dst;
>>>> +	struct kvm_apic_map *map;
>>>> +	unsigned long bitmap = 1;
>>>> +	int i;
>>>> +
>>>> +	rcu_read_lock();
>>>> +	map = rcu_dereference(vcpu->kvm->arch.apic_map);
>>>> +
>>>> +	if (unlikely(!map)) {
>>>> +		set_eoi_exitmap_one(vcpu, irq->vector);
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	if (irq->dest_mode == 0) { /* physical mode */
>>>> +		if (irq->delivery_mode == APIC_DM_LOWEST ||
>>>> +				irq->dest_id == 0xff) {
>>>> +			set_eoi_exitmap_one(vcpu, irq->vector);
>>>> +			goto out;
>>>> +		}
>>>> +		dst = &map->phys_map[irq->dest_id & 0xff];
>>>> +	} else {
>>>> +		u32 mda = irq->dest_id << (32 - map->ldr_bits);
>>>> +
>>>> +		dst = map->logical_map[apic_cluster_id(map, mda)];
>>>> +
>>>> +		bitmap = apic_logical_id(map, mda);
>>>> +	}
>>>> +
>>>> +	for_each_set_bit(i, &bitmap, 16) {
>>>> +		if (!dst[i])
>>>> +			continue;
>>>> +		if (dst[i]->vcpu == vcpu) {
>>>> +			set_eoi_exitmap_one(vcpu, irq->vector);
>>>> +			break;
>>>> +		}
>>>> +	}
>>>> +
>>>> +out:
>>>> +	rcu_read_unlock();
>>>> +}
>>> The logic in this function belongs to lapic code. The only thing
>>> that is specific to vmx in the function is setting of the bit in
>>> vmx->eoi_exit_bitmap, but since eoi_exit_bitmap is calculated and
>>> loaded during same vcpu entry we do not need vmx->eoi_exit_bitmap at
>>> all. Declare it on a stack in vmx_update_eoi_exitmap() and pass it to
>>> set_eoi_exitmap() and vmx_load_eoi_exitmap().
>> IIRC, this logic is in lapic before v7. And you suggested to move the
>> whole function into vmx code. So, it better to move back to lapic file?
>> 
> IIRC I suggested to call it only from vmx, not move it there. Before
> that the calculation was done even with vid disabled and only result was
> ignored. With current logic KVM_REQ_EOIBITMAP will be set only with vid
> enabled so the calculation will not be done needlessly.
> 
> 
>>>> @@ -115,6 +116,42 @@ static void update_handled_vectors(struct
> kvm_ioapic
>>> *ioapic)
>>>>  	smp_wmb();
>>>>  }
>>>> +void set_eoi_exitmap(struct kvm_vcpu *vcpu)
>>>> +{
>>> This function is exported from the file and need to have more unique
>>> name. kvm_ioapic_calculate_eoi_exitmap() for instance.
>> Ok.
>> 
>>>> @@ -156,6 +193,7 @@ static void ioapic_write_indirect(struct kvm_ioapic
>>> *ioapic, u32 val)
>>>>  		if (e->fields.trig_mode == IOAPIC_LEVEL_TRIG
>>>>  		    && ioapic->irr & (1 << index))
>>>>  			ioapic_service(ioapic, index);
>>>> +		ioapic_update_eoi_exitmap(ioapic->kvm);
>>> ioapic_write_indirect() is called under ioapic->lock,
>>> ioapic_update_eoi_exitmap() takes the same lock. Have you tested the
>>> code?
>> ioapic_update_eoi_exitmap doesn't take any lock.
>> 
> Sorry. You are correct. Confused between different functions.
> 
>> I will do a full testing for every patch before sending out. It covers
>> both windows and Linux guest.
>> 
> We are getting close so please test with userspace irq chip too.
Thanks for your suggestion to test with userspace irqchip. I found some issues and will modify the logic:
As we known, APICv deponds on TPR shadow. But TPR shadow is per VM(it will be disabled when VM uses userspace irq chip), this means APICv also is per VM. But in current implementation, we use the global variable enable_apicv_reg to check whether APICv is used by target vcpu. This is wrong. Instead, it should to read VMCS to see whether the bit is set or not.

Best regards,
Yang