From mboxrd@z Thu Jan  1 00:00:00 1970
From: Gleb Natapov <gleb@redhat.com>
Subject: Re: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending
 APIC timers
Date: Thu, 21 Mar 2013 18:27:32 +0200
Message-ID: <20130321162732.GF9382@redhat.com>
References: <51459ECE.2000107@web.de>
 <20130317104717.GA6117@redhat.com>
 <20130320193033.GB11138@amt.cnet>
 <20130320200319.GA16367@amt.cnet>
 <20130320213238.GB9382@redhat.com>
 <20130320231913.GA2319@amt.cnet>
 <20130321045446.GC9382@redhat.com>
 <20130321140224.GA29237@amt.cnet>
 <20130321141853.GU3889@redhat.com>
 <A9667DDFB95DB7438FA9D7D576C3D87E099EA694@SHSMSX101.ccr.corp.intel.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Marcelo Tosatti <mtosatti@redhat.com>,
	Jan Kiszka <jan.kiszka@web.de>, kvm <kvm@vger.kernel.org>
To: "Zhang, Yang Z" <yang.z.zhang@intel.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx1.redhat.com ([209.132.183.28]:62118 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933781Ab3CUQ1f (ORCPT <rfc822;kvm@vger.kernel.org>);
	Thu, 21 Mar 2013 12:27:35 -0400
Content-Disposition: inline
In-Reply-To: <A9667DDFB95DB7438FA9D7D576C3D87E099EA694@SHSMSX101.ccr.corp.intel.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Thu, Mar 21, 2013 at 02:27:22PM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-03-21:
> > On Thu, Mar 21, 2013 at 11:02:24AM -0300, Marcelo Tosatti wrote:
> >> On Thu, Mar 21, 2013 at 06:54:46AM +0200, Gleb Natapov wrote:
> >>> On Wed, Mar 20, 2013 at 08:19:13PM -0300, Marcelo Tosatti wrote:
> >>>> On Wed, Mar 20, 2013 at 11:32:38PM +0200, Gleb Natapov wrote:
> >>>>> On Wed, Mar 20, 2013 at 05:03:19PM -0300, Marcelo Tosatti wrote:
> >>>>>> On Wed, Mar 20, 2013 at 04:30:33PM -0300, Marcelo Tosatti wrote:
> >>>>>>> On Sun, Mar 17, 2013 at 12:47:17PM +0200, Gleb Natapov wrote:
> >>>>>>>> On Sun, Mar 17, 2013 at 11:45:34AM +0100, Jan Kiszka wrote:
> >>>>>>>>> On 2013-03-17 09:47, Gleb Natapov wrote:
> >>>>>>>>>> On Sat, Mar 16, 2013 at 09:49:07PM +0100, Jan Kiszka wrote:
> >>>>>>>>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>>>>>>>> 
> >>>>>>>>>>> If the guest didn't take the last APIC timer interrupt yet and
> >>>>>>>>>>> generates another one on top, e.g. via periodic mode, we do
> >>>>>>>>>>> not block the VCPU even if the guest state is halted. The
> >>>>>>>>>>> reason is that apic_has_pending_timer continues to return a
> >>>>>>>>>>> non-zero value.
> >>>>>>>>>>> 
> >>>>>>>>>>> Fix this busy loop by taking the IRR content for the LVT vector in
> >>>>>>>>>>> apic_has_pending_timer into account.
> >>>>>>>>>>> 
> >>>>>>>>>> Just drop coalescing tracking for the LAPIC interrupt. Once posted
> >>>>>>>>>> interrupts are merged, __apic_accept_irq() will no longer
> >>>>>>>>>> return coalescing information, so the code will be dead anyway.
> >>>>>>>>> 
> >>>>>>>>> That requires the RTC decoalescing series to go first to avoid a
> >>>>>>>>> regression, no? Then let's postpone this topic for now.
> >>>>>>>>> 
> >>>>>>>> Yes, but decoalescing will work only for RTC :(
> >>>>>>> 
> >>>>>>> Are you proposing to drop LAPIC interrupt reinjection?
> >>>>>> 
> >>>>>> Since timer handling and injection is VCPU-local for LAPIC,
> >>>>>> __apic_accept_irq can (and must) return coalesced information (cannot
> >>>>>> drop LAPIC interrupt reinjection).
> >>>>>> 
> >>>>> Why can't we drop LAPIC interrupt reinjection? Proposed posted
> >>>>> interrupt patches do not properly check for interrupt coalescing
> >>>>> even for VCPU-local injection.
> >>>>> 
> >>>>> --
> >>>>> 			Gleb.
> >>>> 
> >>>> Because older Linux guests depend on reinjection for proper timekeeping.
> >>> Which versions? Those without kvmclock? Can we make them use PIT
> >>> instead? Posted interrupts are going to break them.
> >> 
> >> There is no reason to break them if it's OK to receive reinjection info
> >> from LAPIC... it's a matter of returning the information from
> >> apic_accept_irq, no big deal.
> >> 
> > But current PI patches do break them, that's my point. So we either
> > need to revise them again, or drop LAPIC timer reinjection. Making
> > apic_accept_irq semantics "it returns coalescing info, but only sometimes"
> > is dubious though.
> We may roll back to the initial idea: test both IRR and PIR to get
> coalescing info. In that case, always inject the LAPIC timer in vcpu
> context, so that apic_accept_irq() returns the right coalescing info.
> Also, we need to add a comment telling callers that apic_accept_irq()
> can guarantee a correct return value only when the caller is in the
> target vcpu's context.
> 
We cannot touch the IRR while the vcpu is in non-root operation, so we will
have to pass a flag to apic_accept_irq() to let it know that it is being
called synchronously. While all this is possible, I want to know exactly
which guests we will break if we do not track interrupt coalescing for the
LAPIC timer. If only 2.0 SMP kernels break, we can probably drop it.
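
To make the flag idea concrete, here is a rough toy model in C. It is
not KVM code: every name in it (toy_apic, toy_accept_irq, the sync
parameter, the TOY_* results) is made up for illustration, and it is
single-threaded, whereas real code would need atomic bit operations.
It only sketches the semantics discussed above: a synchronous caller
can check both IRR and PIR and so report coalescing reliably, while an
asynchronous caller may only touch PIR and cannot tell whether the
vector is already pending in IRR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not KVM's real structures. */
#define TOY_NR_VECTORS 256

struct toy_apic {
    uint32_t irr[TOY_NR_VECTORS / 32]; /* interrupt request register */
    uint32_t pir[TOY_NR_VECTORS / 32]; /* posted-interrupt requests */
};

enum toy_result {
    TOY_COALESCED = 0, /* vector was already pending */
    TOY_DELIVERED = 1, /* vector was newly queued, nothing coalesced */
    TOY_UNKNOWN   = 2, /* queued in PIR, but IRR could not be inspected */
};

static bool toy_test_bit(const uint32_t *map, int vec)
{
    return (map[vec / 32] >> (vec % 32)) & 1u;
}

static void toy_set_bit(uint32_t *map, int vec)
{
    map[vec / 32] |= 1u << (vec % 32);
}

/*
 * sync == true:  caller runs in the target vcpu's context, so the vcpu
 *                is not in non-root operation and IRR may be touched.
 * sync == false: caller may only post to PIR.
 */
static enum toy_result toy_accept_irq(struct toy_apic *apic, int vec, bool sync)
{
    assert(vec >= 0 && vec < TOY_NR_VECTORS);

    if (sync) {
        /* Test both IRR and PIR to get reliable coalescing info. */
        if (toy_test_bit(apic->irr, vec) || toy_test_bit(apic->pir, vec))
            return TOY_COALESCED;
        toy_set_bit(apic->irr, vec);
        return TOY_DELIVERED;
    }

    /* Asynchronous path: a set PIR bit proves coalescing ... */
    if (toy_test_bit(apic->pir, vec))
        return TOY_COALESCED;
    /* ... but a clear one says nothing about IRR. */
    toy_set_bit(apic->pir, vec);
    return TOY_UNKNOWN;
}
```

In this model the LAPIC timer would always go through the sync path,
so its caller always gets TOY_DELIVERED or TOY_COALESCED; only
asynchronous senders can see TOY_UNKNOWN.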

--
			Gleb.