From mboxrd@z Thu Jan  1 00:00:00 1970
From: Gleb Natapov <gleb@redhat.com>
Subject: Re: [patch 4/8] KVM: x86: replace hrtimer based timer emulation
Date: Wed, 8 Jul 2009 19:13:32 +0300
Message-ID: <20090708161332.GR28046@redhat.com>
References: <20090706015511.923596553@localhost.localdomain> <20090706015812.786509491@localhost.localdomain> <20090708125819.GM28046@redhat.com> <20090708131721.GA3382@amt.cnet> <20090708133956.GP28046@redhat.com> <20090708154252.GA6364@amt.cnet>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: kvm@vger.kernel.org
To: Marcelo Tosatti <mtosatti@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from mx2.redhat.com ([66.187.237.31]:46347 "EHLO mx2.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753670AbZGHQNf (ORCPT <rfc822;kvm@vger.kernel.org>);
	Wed, 8 Jul 2009 12:13:35 -0400
Received: from int-mx2.corp.redhat.com (int-mx2.corp.redhat.com [172.16.27.26])
	by mx2.redhat.com (8.13.8/8.13.8) with ESMTP id n68GDYKF025096
	for <kvm@vger.kernel.org>; Wed, 8 Jul 2009 12:13:34 -0400
Content-Disposition: inline
In-Reply-To: <20090708154252.GA6364@amt.cnet>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On Wed, Jul 08, 2009 at 12:42:52PM -0300, Marcelo Tosatti wrote:
> On Wed, Jul 08, 2009 at 04:39:56PM +0300, Gleb Natapov wrote:
> > On Wed, Jul 08, 2009 at 10:17:21AM -0300, Marcelo Tosatti wrote:
> > > On Wed, Jul 08, 2009 at 03:58:19PM +0300, Gleb Natapov wrote:
> > > > Excellent patch series.
> > > > 
> > > > On Sun, Jul 05, 2009 at 10:55:15PM -0300, Marcelo Tosatti wrote:
> > > > >  int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
> > > > >  {
> > > > > -	int ret;
> > > > > +	ktime_t now, expires;
> > > > >  
> > > > > -	ret = pit_has_pending_timer(vcpu);
> > > > > -	ret |= apic_has_pending_timer(vcpu);
> > > > > +	expires = kvm_vcpu_next_timer_event(vcpu);
> > > > > +	now = ktime_get();
> > > > > +	if (expires.tv64 <= now.tv64) {
> > > > > +		if (kvm_arch_interrupt_allowed(vcpu))
> > > > > +			set_bit(KVM_REQ_UNHALT, &vcpu->requests);
> > > > You shouldn't unhalt vcpu here. Not every timer event will generate
> > > > interrupt (vector can be masked in pic/ioapic)
> > > 
> > > Yeah. Note however that kvm_vcpu_next_timer_event only returns the
> > > expiration time for events that have been acked (for timers that have
> > > had events injected, but not acked it returns KTIME_MAX).
> > > 
> > > So, the code above will set one spurious unhalt if:
> > > 
> > > - inject timer irq
> > > - guest acks irq
> > > - guest mask irq
> > > - unhalt (once)
> > > 
> > > I had a "kvm_timer_mask" callback before (along with the attached
> > > patch), but decided to keep it simpler using the ack trick above.
> > > 
> > > I suppose one spurious unhalt is harmless, or is it a correctness issue?
> > > 
> > This is correctness issue. We should be as close to real CPU as
> > possible. This will save us may hours of debugging later :)
> 
> Hum, fine. Will update the kvm_timer_mask patch below and let you know.
> 
> > > > or timer event can generate NMI instead of interrupt.
> > > 
> > > In the NMI case it should not unhalt the processor?
> > Why? It should. It should jump to NMI handler.
> 
> I meant unhalt as in KVM_REQ_UNHALT so vcpu_enter_guest runs.
> 
Yes. It should. Inside vcpu_enter_guest() NMI will be injected
and nmi handler will be executed.

> What did you mention about ISR/IRR again?
On real HW the following may happen:
 Timer interrupt delivered to apic and placed into IRR
 Timer interrupt delivered to cpu and moved from IRR to ISR
 New timer interrupt delivered to apic and placed into IRR before
  previous one is acked.

In your patch ktimer->can_inject is set to false when timer is injected
and next interrupt is injected only after OS acks previous timer. So the
above situation cannot happen. I don't know if this important or not. It
is possible to write code that will work only with former behaviour, but
I don't see why somebody will want to do that. We can emulate former behaviour
though. If we will no rely on acks to count delivered event but make ->inject
callback return status that will indicate if interrupt was delivered to apic or
not.

--
			Gleb.