From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Subject: RE: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending
 APIC timers
Date: Thu, 21 Mar 2013 14:27:22 +0000
Message-ID: <A9667DDFB95DB7438FA9D7D576C3D87E099EA694@SHSMSX101.ccr.corp.intel.com>
References: <5144DAC3.7080401@web.de> <20130317084705.GC11223@redhat.com>
 <51459ECE.2000107@web.de> <20130317104717.GA6117@redhat.com>
 <20130320193033.GB11138@amt.cnet> <20130320200319.GA16367@amt.cnet>
 <20130320213238.GB9382@redhat.com> <20130320231913.GA2319@amt.cnet>
 <20130321045446.GC9382@redhat.com> <20130321140224.GA29237@amt.cnet>
 <20130321141853.GU3889@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
Cc: Jan Kiszka <jan.kiszka@web.de>, kvm <kvm@vger.kernel.org>
To: Gleb Natapov <gleb@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mga02.intel.com ([134.134.136.20]:42103 "EHLO mga02.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S932457Ab3CUO1Z convert rfc822-to-8bit (ORCPT
	<rfc822;kvm@vger.kernel.org>); Thu, 21 Mar 2013 10:27:25 -0400
In-Reply-To: <20130321141853.GU3889@redhat.com>
Content-Language: en-US
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 11:02:24AM -0300, Marcelo Tosatti wrote:
>> On Thu, Mar 21, 2013 at 06:54:46AM +0200, Gleb Natapov wrote:
>>> On Wed, Mar 20, 2013 at 08:19:13PM -0300, Marcelo Tosatti wrote:
>>>> On Wed, Mar 20, 2013 at 11:32:38PM +0200, Gleb Natapov wrote:
>>>>> On Wed, Mar 20, 2013 at 05:03:19PM -0300, Marcelo Tosatti wrote:
>>>>>> On Wed, Mar 20, 2013 at 04:30:33PM -0300, Marcelo Tosatti wrote:
>>>>>>> On Sun, Mar 17, 2013 at 12:47:17PM +0200, Gleb Natapov wrote:
>>>>>>>> On Sun, Mar 17, 2013 at 11:45:34AM +0100, Jan Kiszka wrote:
>>>>>>>>> On 2013-03-17 09:47, Gleb Natapov wrote:
>>>>>>>>>> On Sat, Mar 16, 2013 at 09:49:07PM +0100, Jan Kiszka wrote:
>>>>>>>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>>>>> 
>>>>>>>>>>> If the guest didn't take the last APIC timer interrupt yet and
>>>>>>>>>>> generates another one on top, e.g. via periodic mode, we do
>>>>>>>>>>> not block the VCPU even if the guest state is halted. The
>>>>>>>>>>> reason is that apic_has_pending_timer continues to return a
>>>>>>>>>>> non-zero value.
>>>>>>>>>>> 
>>>>>>>>>>> Fix this busy loop by taking the IRR content for the LVT vector in
>>>>>>>>>>> apic_has_pending_timer into account.
>>>>>>>>>>> 
>>>>>>>>>> Just drop coalescing tracking for lapic interrupt. Once posted
>>>>>>>>>> interrupts are merged, __apic_accept_irq() will no longer
>>>>>>>>>> return coalescing information, so the code will be dead anyway.
>>>>>>>>> 
>>>>>>>>> That requires the RTC decoalescing series to go first to avoid a
>>>>>>>>> regression, no? Then let's postpone this topic for now.
>>>>>>>>> 
>>>>>>>> Yes, but decoalescing will work only for RTC :(
>>>>>>> 
>>>>>>> Are you proposing to drop LAPIC interrupt reinjection?
>>>>>> 
>>>>>> Since timer handling and injection is VCPU-local for LAPIC,
>>>>>> __apic_accept_irq can (and must) return coalesced information (cannot
>>>>>> drop LAPIC interrupt reinjection).
>>>>>> 
>>>>> Why can't we drop LAPIC interrupt reinjection? Proposed posted
>>>>> interrupt patches do not properly check for interrupt coalescing
>>>>> even for VCPU-local injection.
>>>>> 
>>>>> --
>>>>> 			Gleb.
>>>> 
>>>> Because older Linux guests depend on reinjection for proper timekeeping.
>>> Which versions? Those without kvmclock? Can we make them use PIT
>>> instead? Posted interrupts are going to break them.
>> 
>> There is no reason to break them if it's OK to receive reinjection info
>> from LAPIC... it's a matter of returning the information from
>> apic_accept_irq, no big deal.
>> 
> But current PI patches do break them, that's my point. So we either
> need to revise them again, or drop LAPIC timer reinjection. Making
> apic_accept_irq semantics "it returns coalescing info, but only sometimes"
> is dubious though.
We could roll back to the initial idea: test both the IRR and the PIR to
get coalescing info. In that case, we always inject the LAPIC timer
interrupt in vcpu context, so apic_accept_irq() will return the right
coalescing info.
Also, we need to add a comment telling callers that apic_accept_irq() can
guarantee a correct return value only when the caller is in the target
vcpu's context.
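
To make the idea concrete, here is a rough userspace sketch of the
"test both IRR and PIR" check. It is not the actual KVM code: struct
model_lapic and all the model_* functions are hypothetical names made up
for illustration, and the bitmaps are plain arrays rather than the real
APIC page and posted-interrupt descriptor. It also assumes nothing moves
bits from PIR to IRR underneath the caller, which is why the result
should only be trusted in the target vcpu's context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VECTORS    256
#define BITS_PER_WORD 32
#define NR_WORDS      (NR_VECTORS / BITS_PER_WORD)

/* Hypothetical per-vcpu model; NOT the real struct kvm_lapic. */
struct model_lapic {
	uint32_t irr[NR_WORDS];	/* interrupts already requested */
	uint32_t pir[NR_WORDS];	/* interrupts posted but not yet synced */
};

static bool model_test_vec(const uint32_t *map, uint8_t vec)
{
	return (map[vec / BITS_PER_WORD] >> (vec % BITS_PER_WORD)) & 1u;
}

static void model_set_vec(uint32_t *map, uint8_t vec)
{
	map[vec / BITS_PER_WORD] |= (uint32_t)1 << (vec % BITS_PER_WORD);
}

/*
 * Post a timer interrupt for 'vec'.
 *
 * Returns true if the interrupt was newly queued, false if it was
 * coalesced with one still pending in either IRR or PIR.
 *
 * The return value is only meaningful when the caller runs in the
 * target vcpu's context (assumption of this sketch: PIR is not being
 * drained into IRR concurrently).
 */
static bool model_accept_timer_irq(struct model_lapic *lapic, uint8_t vec)
{
	assert(lapic != NULL);

	if (model_test_vec(lapic->irr, vec) || model_test_vec(lapic->pir, vec))
		return false;	/* coalesced */

	model_set_vec(lapic->pir, vec);
	return true;
}

/* Model of syncing posted interrupts into IRR: move all PIR bits over. */
static void model_sync_pir_to_irr(struct model_lapic *lapic)
{
	assert(lapic != NULL);

	for (size_t i = 0; i < NR_WORDS; i++) {
		lapic->irr[i] |= lapic->pir[i];
		lapic->pir[i] = 0;
	}
}
```

With this model, a second timer injection for the same vector reports
coalescing whether the first one is still sitting in PIR or has already
been synced into IRR, which is the information reinjection needs.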

Best regards,
Yang