All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sheng Yang <sheng@linux.intel.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	kvm@vger.kernel.org, Gleb Natapov <gleb@redhat.com>
Subject: Re: [PATCH] KVM: Merge kvm_ioapic_get_delivery_bitmask into kvm_get_intr_delivery_bitmask
Date: Wed, 4 Mar 2009 10:41:34 +0800	[thread overview]
Message-ID: <200903041041.35016.sheng@linux.intel.com> (raw)
In-Reply-To: <20090303112037.GA13997@amt.cnet>

On Tuesday 03 March 2009 19:20:37 Marcelo Tosatti wrote:
> On Mon, Mar 02, 2009 at 04:19:33PM +0800, Sheng Yang wrote:
> > Gleb fixed bitmap ops usage in kvm_ioapic_get_delivery_bitmask.
> >
> > Sheng merged two functions, as well as fix several issues in
> > kvm_get_intr_delivery_bitmask:
> > 1. deliver_bitmask is a bitmap rather than a unsigned long intereger.
> > 2. Lowest priority target bitmap wrong calculated by mistake.
> > 3. Prevent potential NULL reference.
> > 4. Declaration in include/kvm_host.h caused powerpc compilation warning.
> >
> > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > Signed-off-by: Sheng Yang <sheng@linux.intel.com>
> > ---
> >  include/linux/kvm_host.h |    3 ---
> >  virt/kvm/ioapic.c        |   39 ---------------------------------------
> >  virt/kvm/ioapic.h        |    5 +++--
> >  virt/kvm/irq_comm.c      |   42
> > ++++++++++++++++++++++++++++++++++++++---- 4 files changed, 41
> > insertions(+), 48 deletions(-)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 3832243..fb60f31 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -363,9 +363,6 @@ void kvm_unregister_irq_mask_notifier(struct kvm
> > *kvm, int irq, struct kvm_irq_mask_notifier *kimn);
> >  void kvm_fire_mask_notifiers(struct kvm *kvm, int irq, bool mask);
> >
> > -void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > -				   union kvm_ioapic_redirect_entry *entry,
> > -				   unsigned long *deliver_bitmask);
> >  int kvm_set_irq(struct kvm *kvm, int irq_source_id, int irq, int level);
> >  void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned
> > pin); void kvm_register_irq_ack_notifier(struct kvm *kvm,
> > diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> > index 7c2cb2b..2c40ff8 100644
> > --- a/virt/kvm/ioapic.c
> > +++ b/virt/kvm/ioapic.c
> > @@ -161,45 +161,6 @@ static void ioapic_inj_nmi(struct kvm_vcpu *vcpu)
> >  	kvm_vcpu_kick(vcpu);
> >  }
> >
> > -void kvm_ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
> > -				     u8 dest_mode, unsigned long *mask)
> > -{
> > -	int i;
> > -	struct kvm *kvm = ioapic->kvm;
> > -	struct kvm_vcpu *vcpu;
> > -
> > -	ioapic_debug("dest %d dest_mode %d\n", dest, dest_mode);
> > -
> > -	*mask = 0;
> > -	if (dest_mode == 0) {	/* Physical mode. */
> > -		if (dest == 0xFF) {	/* Broadcast. */
> > -			for (i = 0; i < KVM_MAX_VCPUS; ++i)
> > -				if (kvm->vcpus[i] && kvm->vcpus[i]->arch.apic)
> > -					*mask |= 1 << i;
> > -			return;
> > -		}
> > -		for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> > -			vcpu = kvm->vcpus[i];
> > -			if (!vcpu)
> > -				continue;
> > -			if (kvm_apic_match_physical_addr(vcpu->arch.apic, dest)) {
> > -				if (vcpu->arch.apic)
> > -					*mask = 1 << i;
> > -				break;
> > -			}
> > -		}
> > -	} else if (dest != 0)	/* Logical mode, MDA non-zero. */
> > -		for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> > -			vcpu = kvm->vcpus[i];
> > -			if (!vcpu)
> > -				continue;
> > -			if (vcpu->arch.apic &&
> > -			    kvm_apic_match_logical_addr(vcpu->arch.apic, dest))
> > -				*mask |= 1 << vcpu->vcpu_id;
> > -		}
> > -	ioapic_debug("mask %x\n", *mask);
> > -}
> > -
> >  static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
> >  {
> >  	union kvm_ioapic_redirect_entry entry = ioapic->redirtbl[irq];
> > diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
> > index 7275f87..c8032ab 100644
> > --- a/virt/kvm/ioapic.h
> > +++ b/virt/kvm/ioapic.h
> > @@ -70,7 +70,8 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector,
> > int trigger_mode); int kvm_ioapic_init(struct kvm *kvm);
> >  int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level);
> >  void kvm_ioapic_reset(struct kvm_ioapic *ioapic);
> > -void kvm_ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
> > -				     u8 dest_mode, unsigned long *mask);
> > +void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
> > +				   union kvm_ioapic_redirect_entry *entry,
> > +				   unsigned long *deliver_bitmask);
> >
> >  #endif
> > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > index d165e05..bb6c5d0 100644
> > --- a/virt/kvm/irq_comm.c
> > +++ b/virt/kvm/irq_comm.c
> > @@ -47,15 +47,49 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic
> > *ioapic, union kvm_ioapic_redirect_entry *entry,
> >  				   unsigned long *deliver_bitmask)
> >  {
> > +	int i;
> > +	struct kvm *kvm = ioapic->kvm;
> >  	struct kvm_vcpu *vcpu;
> >
> > -	kvm_ioapic_get_delivery_bitmask(ioapic, entry->fields.dest_id,
> > -					entry->fields.dest_mode,
> > -					deliver_bitmask);
> > +	bitmap_zero(deliver_bitmask, KVM_MAX_VCPUS);
>
> I think you can drop the zeroing from the callers?

OK...

> > +
> > +	if (entry->fields.dest_mode == 0) {	/* Physical mode. */
> > +		if (entry->fields.dest_id == 0xFF) {	/* Broadcast. */
> > +			for (i = 0; i < KVM_MAX_VCPUS; ++i)
> > +				if (kvm->vcpus[i] && kvm->vcpus[i]->arch.apic)
> > +					__set_bit(i, deliver_bitmask);
> > +			return;
> > +		}
>
> The spec says broadcast is not supported with lowest priority delivery
> mode, and that "must not be configured by software". I wonder what happens
> in HW if you do that.
>

Um.. So you mean to prohibit this kind of action? OK.

> > +		for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> > +			vcpu = kvm->vcpus[i];
> > +			if (!vcpu)
> > +				continue;
> > +			if (kvm_apic_match_physical_addr(vcpu->arch.apic,
> > +					entry->fields.dest_id)) {
> > +				if (vcpu->arch.apic)
> > +					__set_bit(i, deliver_bitmask);
> > +				break;
> > +			}
> > +		}
> > +	} else if (entry->fields.dest_id != 0) /* Logical mode, MDA non-zero.
> > */ +		for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> > +			vcpu = kvm->vcpus[i];
> > +			if (!vcpu)
> > +				continue;
> > +			if (vcpu->arch.apic &&
> > +			    kvm_apic_match_logical_addr(vcpu->arch.apic,
> > +					entry->fields.dest_id))
> > +				__set_bit(i, deliver_bitmask);
> > +		}
> > +
> >  	switch (entry->fields.delivery_mode) {
> >  	case IOAPIC_LOWEST_PRIORITY:
> > +		/* Select one in deliver_bitmask */
> >  		vcpu = kvm_get_lowest_prio_vcpu(ioapic->kvm,
> >  				entry->fields.vector, deliver_bitmask);
> > +		bitmap_zero(deliver_bitmask, KVM_MAX_VCPUS);
> > +		if (!vcpu)
> > +			return;
> >  		__set_bit(vcpu->vcpu_id, deliver_bitmask);
> >  		break;
> >  	case IOAPIC_FIXED:
> > @@ -65,7 +99,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic
> > *ioapic, if (printk_ratelimit())
> >  			printk(KERN_INFO "kvm: unsupported delivery mode %d\n",
> >  				entry->fields.delivery_mode);
> > -		*deliver_bitmask = 0;
> > +		bitmap_zero(deliver_bitmask, KVM_MAX_VCPUS);
> >  	}
> >  }
>
> Unrelated question, what was the issue (in detail) which caused
> this change again:

I have dig out my old post (I thought you are also involved :) )

>[kvm-devel] The SMP RHEL 5.1 PAE guest can't boot up issue
>From:
>"Yang, Sheng" <sheng.yang@intel.com>  (Intel)
>  To:
>kvm-devel@lists.sourceforge.net
>  Date:
>2008-02-22 16:57
>I believe I have found the root cause of SMP RHEL5.1 PAE guest can't boot up
>issue. The problem was caused by
>kvm:6685637b211ad67bdce21bfd9f91bc888b3acb4f
>"KVM: VMX: Ensure vcpu time stamp counter is monotonous" (It didn't take me
>much time to found the solution, but a lot of time to find the proper
>explanation...  :( )
>
>As we guessed, the problem was the monotonous of TSC. I have traced to
>the 2.6.18 PAE guest kernel, and finally found it caused by a overflow in
> the loop of function update_wall_timer()(kernel/timer.c), when using TSC as
> clocksource by default.
>
>The reason is that the patch "KVM: VMX: Ensure vcpu time stamp counter is
>monotonous" bring big gap between different VCPUs (error between
>TSC_OFFSETs). Though I have proved that the patch can ensure the monotonous
>on each VCPU (which rejected my first thought...), the patch
>have 2 problems:
>
>1. It have accumulated the error. Each vcpu's TSC is monotonous, but get
>slower and slower, compared to the host. That's because the TSC is very
>accuracy and the interval between reading TSC is big. But this is not very
>critical.
>
>2. The critical one. In normal condition, VCPU0 migrated much more
>frequently than other VCPUs. And the patch add more "delta" (always negative
>if host TSC is stable) to TSC_OFFSET each
>time migrated. Then after boot for a while, VCPU0 became much
>slower than others (In my test, VCPU0 was migrated about two times than the
>others, and easily to be more than 100k cycles slower). In the guest kernel,
>clocksource TSC is global variable, the variable "cycle_last" may got the
>VCPU1's TSC value, then turn to VCPU0. For VCPU0's TSC_OFFSET is
>smaller than VCPU1's, so it's possible to got the "cycle_last" (from VCPU1)
>bigger than current TSC value (from VCPU0) in next tick. Then "u64 offset =
>clocksource_read() - cycle_last" overflowed and caused the "infinite" loop.
>And it can also explained why Marcelo's patch don't work - it just reduce
> the rate of gap increasing.
>
>The freezing didn't happen when using userspace IOAPIC, just because the
> qemu APIC didn't implement real LOWPRI(or round_robin) to choose CPU for
> delivery. It choose VCPU0 everytime if possible, so CPU1 in guest won't
> update cycle_last. :(
>
>This freezing only occurred on RHEL5/5.1 pae (kernel 2.6.18), because of
> they set IO-APIC IRQ0's dest_mask to 0x3 (with 2 vcpus) and dest_mode as
> LOWEST_PRIOITY, then other vcpus had chance to modify "cycle_last". In
> contrast, RHEL5/5.1 32e set IRQ0's dest_mode as FIXED, to CPU0, then don't
> have this problem. So does RHEL4(kernel 2.6.9).
>
>I don't know if the patch was still needed now, since it was posted long
> ago(I don't know which issue it solved). I'd like to post a revert patch if
> necessary.

-- 
regards
Yang, Sheng

  reply	other threads:[~2009-03-04  2:41 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-03-02  8:19 [PATCH] KVM: Merge kvm_ioapic_get_delivery_bitmask into kvm_get_intr_delivery_bitmask Sheng Yang
2009-03-03 11:20 ` Marcelo Tosatti
2009-03-04  2:41   ` Sheng Yang [this message]
2009-03-04  2:55     ` Marcelo Tosatti
2009-03-04  2:59       ` Sheng Yang
2009-03-04  5:33         ` [PATCH 1/1] " Sheng Yang
2009-03-04 18:30           ` Marcelo Tosatti

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200903041041.35016.sheng@linux.intel.com \
    --to=sheng@linux.intel.com \
    --cc=avi@redhat.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.