From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zhang, Yang Z"
Subject: RE: [PATCH] KVM: x86: Avoid busy loops over uninjectable pending APIC timers
Date: Thu, 21 Mar 2013 14:27:22 +0000
Message-ID:
References: <5144DAC3.7080401@web.de> <20130317084705.GC11223@redhat.com> <51459ECE.2000107@web.de> <20130317104717.GA6117@redhat.com> <20130320193033.GB11138@amt.cnet> <20130320200319.GA16367@amt.cnet> <20130320213238.GB9382@redhat.com> <20130320231913.GA2319@amt.cnet> <20130321045446.GC9382@redhat.com> <20130321140224.GA29237@amt.cnet> <20130321141853.GU3889@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 8BIT
Cc: Jan Kiszka , kvm
To: Gleb Natapov , Marcelo Tosatti
Return-path:
Received: from mga02.intel.com ([134.134.136.20]:42103 "EHLO mga02.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932457Ab3CUO1Z convert rfc822-to-8bit (ORCPT ); Thu, 21 Mar 2013 10:27:25 -0400
In-Reply-To: <20130321141853.GU3889@redhat.com>
Content-Language: en-US
Sender: kvm-owner@vger.kernel.org
List-ID:

Gleb Natapov wrote on 2013-03-21:
> On Thu, Mar 21, 2013 at 11:02:24AM -0300, Marcelo Tosatti wrote:
>> On Thu, Mar 21, 2013 at 06:54:46AM +0200, Gleb Natapov wrote:
>>> On Wed, Mar 20, 2013 at 08:19:13PM -0300, Marcelo Tosatti wrote:
>>>> On Wed, Mar 20, 2013 at 11:32:38PM +0200, Gleb Natapov wrote:
>>>>> On Wed, Mar 20, 2013 at 05:03:19PM -0300, Marcelo Tosatti wrote:
>>>>>> On Wed, Mar 20, 2013 at 04:30:33PM -0300, Marcelo Tosatti wrote:
>>>>>>> On Sun, Mar 17, 2013 at 12:47:17PM +0200, Gleb Natapov wrote:
>>>>>>>> On Sun, Mar 17, 2013 at 11:45:34AM +0100, Jan Kiszka wrote:
>>>>>>>>> On 2013-03-17 09:47, Gleb Natapov wrote:
>>>>>>>>>> On Sat, Mar 16, 2013 at 09:49:07PM +0100, Jan Kiszka wrote:
>>>>>>>>>>> From: Jan Kiszka
>>>>>>>>>>>
>>>>>>>>>>> If the guest didn't take the last APIC timer interrupt yet and
>>>>>>>>>>> generates another one on top, e.g. via periodic mode, we do
>>>>>>>>>>> not block the VCPU even if the guest state is halted. The
>>>>>>>>>>> reason is that apic_has_pending_timer continues to return a
>>>>>>>>>>> non-zero value.
>>>>>>>>>>>
>>>>>>>>>>> Fix this busy loop by taking the IRR content for the LVT vector in
>>>>>>>>>>> apic_has_pending_timer into account.
>>>>>>>>>>>
>>>>>>>>>> Just drop coalescing tacking for lapic interrupt. After posted
>>>>>>>>>> interrupt will be merged __apic_accept_irq() will not longer
>>>>>>>>>> return coalescing information, so the code will be dead anyway.
>>>>>>>>>
>>>>>>>>> That requires the RTC decoalescing series to go first to avoid a
>>>>>>>>> regression, no? Then let's postpone this topic for now.
>>>>>>>>>
>>>>>>>> Yes, but decoalescing will work only for RTC :(
>>>>>>>
>>>>>>> Are you proposing to drop LAPIC interrupt reinjection?
>>>>>>
>>>>>> Since timer handling and injection is VCPU-local for LAPIC,
>>>>>> __apic_accept_irq can (and must) return coalesced information (cannot
>>>>>> drop LAPIC interrupt reinjection).
>>>>>>
>>>>> Why can't we drop LAPIC interrupt reinjection? Proposed posted
>>>>> interrupt patches do not properly check for interrupt coalescing
>>>>> even for VCPU-local injection.
>>>>>
>>>>> --
>>>>> Gleb.
>>>>
>>>> Because older Linux guests depend on reinjection for proper timekeeping.
>>> Which versions? Those without kvmclock? Can we make them use PIT
>>> instead? Posted interrupts going to break them.
>>
>> There is no reason to break them if its OK to receive reinjection info
>> from LAPIC... its a matter of returning the information from
>> apic_accept_irq, no big deal.
>>
> But current PI patches do break them, thats my point. So we either
> need to revise them again, or drop LAPIC timer reinjection. Making
> apic_accept_irq semantics "it returns coalescing info, but only sometimes"
> is dubious though.
We may rollback to the initial idea: test both irr and pir to get coalescing info.
In this case, always inject the LAPIC timer in vcpu context, so that
apic_accept_irq() returns the right coalescing info. We also need to add a
comment telling callers that apic_accept_irq() can guarantee a correct return
value only when the caller is in the target vcpu context.

Best regards,
Yang
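
To make the "test both irr and pir" idea concrete, here is a rough userspace sketch. It is not the KVM code: struct toy_lapic, the toy_* helpers, the 32-bit bitmap layout and the return convention (true = newly delivered, false = coalesced) are all made up for illustration. It assumes the caller runs in the target vcpu's context, so nothing else touches irr or pir between the test and the set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256
#define NR_WORDS   (NR_VECTORS / 32)

/* Hypothetical, simplified per-vcpu state; not the real struct kvm_lapic. */
struct toy_lapic {
    uint32_t irr[NR_WORDS]; /* vectors already requested in the APIC */
    uint32_t pir[NR_WORDS]; /* vectors posted but not yet synced into irr */
};

static bool toy_test_bit(const uint32_t *map, unsigned int vec)
{
    return (map[vec / 32] >> (vec % 32)) & 1u;
}

static void toy_set_bit(uint32_t *map, unsigned int vec)
{
    map[vec / 32] |= UINT32_C(1) << (vec % 32);
}

/*
 * Post the timer vector and report whether it was newly delivered.
 * Returns false (coalesced) if the vector is still pending in either
 * irr or pir. The answer is only reliable when called from the target
 * vcpu's own context (an assumption of this sketch).
 */
static bool toy_accept_timer_irq(struct toy_lapic *apic, unsigned int vec)
{
    assert(vec < NR_VECTORS);

    bool pending = toy_test_bit(apic->irr, vec) ||
                   toy_test_bit(apic->pir, vec);
    if (!pending)
        toy_set_bit(apic->pir, vec);
    return !pending;
}

/* Move all posted vectors into irr, as a sync step before injection would. */
static void toy_sync_pir_to_irr(struct toy_lapic *apic)
{
    for (unsigned int i = 0; i < NR_WORDS; i++) {
        apic->irr[i] |= apic->pir[i];
        apic->pir[i] = 0;
    }
}

/* The guest takes the interrupt: the vector leaves irr. */
static void toy_ack(struct toy_lapic *apic, unsigned int vec)
{
    assert(vec < NR_VECTORS);
    apic->irr[vec / 32] &= ~(UINT32_C(1) << (vec % 32));
}
```

In this toy model a second timer tick is reported as coalesced whether the first one is still sitting in pir or has already been synced into irr, and is reported as delivered again only after the guest has taken it.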