All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: Avi Kivity <avi@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	gleb@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] kvm: fix race with level interrupts
Date: Thu, 19 Jul 2012 15:24:56 +0300	[thread overview]
Message-ID: <20120719122456.GB9303@redhat.com> (raw)
In-Reply-To: <1342699875.3229.51.camel@ul30vt>

On Thu, Jul 19, 2012 at 06:11:15AM -0600, Alex Williamson wrote:
> On Thu, 2012-07-19 at 12:07 +0300, Michael S. Tsirkin wrote:
> > On Wed, Jul 18, 2012 at 05:20:40PM -0600, Alex Williamson wrote:
> > > On Thu, 2012-07-19 at 01:49 +0300, Michael S. Tsirkin wrote:
> > > > On Wed, Jul 18, 2012 at 04:26:41PM -0600, Alex Williamson wrote:
> > > > > On Thu, 2012-07-19 at 00:52 +0300, Michael S. Tsirkin wrote:
> > > > > > When more than 1 source id is in use for the same GSI, we have the
> > > > > > following race related to handling irq_states:
> > > > > > 
> > > > > > CPU 0 clears bit in irq_states. CPU 0 reads irq_state as 0.
> > > > > > CPU 1 sets bit in irq_states.  CPU 1 calls kvm_ioapic_set_irq(1). CPU 0
> > > > > > calls kvm_ioapic_set_irq(0).
> > > > > > 
> > > > > > Now ioapic thinks the level is 0 but irq_state is not 0.
> > > > > > 
> > > > > > Note that above is valid behaviour if CPU0 and CPU1 are using different
> > > > > > source ids.
> > > > > > 
> > > > > > Fix by performing all irq_states bitmap handling under pic/ioapic lock.
> > > > > > This also removes the need for atomics with irq_states handling.
> > > > > > 
> > > > > > Reported-by: Gleb Natapov <gleb@redhat.com>
> > > > > > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > > > ---
> > > > > > 
> > > > > > This survives stress testing for me (though I only tried virtio, not
> > > > > > device assignment).
> > > > > > Avi, Marcelo, though we have not observed this in the field,
> > > > > > it's a bugfix so probably 3.5 material?
> > > > > > I assume yes so the patch is against 3.5-rc7.
> > > > > > Also stable? It's a very old bug.
> > > > > > 
> > > > > > 
> > > > > >  arch/x86/include/asm/kvm_host.h | 15 ++++++++++++++-
> > > > > >  arch/x86/kvm/i8259.c            | 14 ++++++++++++--
> > > > > >  virt/kvm/ioapic.c               | 13 ++++++++++++-
> > > > > >  virt/kvm/ioapic.h               |  4 +++-
> > > > > >  virt/kvm/irq_comm.c             | 31 ++++---------------------------
> > > > > >  5 files changed, 45 insertions(+), 32 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > > index db7c1f2..fe6e9f1 100644
> > > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > > @@ -802,7 +802,20 @@ int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> > > > > >  void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault);
> > > > > >  bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl);
> > > > > >  
> > > > > > -int kvm_pic_set_irq(void *opaque, int irq, int level);
> > > > > > +static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > +				     int irq_source_id, int level)
> > > > > 
> > > > > This should probably be __kvm_irq_line_state given the calling
> > > > > restrictions.
> > > > 
> > > > It's such a trivial helper, do we need to split hairs about this?
> > > > 
> > > > Look it's not a good time for minor coding style nits.
> > > > 3.5 is imminent, it's about 1am here and I really don't have time to retest
> > > > today, so we'll release another kernel with a bug.
> > > > 
> > > > Could you focus on reviewing the functionality and correctness, and
> > > > leave ideas for better variable naming aside until 3.6?
> > > 
> > > Please get off your high horse, this bug has existed for a long time and
> > > nobody has noticed.  __ indicates locking and makes people think twice
> > > about arbitrarily calling it.  Correct naming prevents future bugs,
> > > which is something you've been riding me on in other areas...
> > 
> > It's all fine but you already sent two messages about improving coding
> > style in the same patch. What I really hoped for is a thorough review
> > that looks at all issues at once, including the functional side as well.
> > Or maybe style comments is all there is ... we'll see.
> 
> So you're accusing me of an superfluous review because I noted coding
> style errors?

No, what I mean is that you sent multiple messages on different parts of
the same patch which is confusing.  It seems reasonable to ask for a single
mail that looks at all issues at once with a small patch like this.

> Very professional.  My comments on your rfc should
> validate the review was thorough.

But you gave me no indication whether you have completed the review or
only looked at style and more comments are forthcoming.

> 
> > > > > > +{
> > > > > > +	/* Logical OR for level trig interrupt */
> > > > > > +	if (level)
> > > > > > +		__set_bit(irq_source_id, irq_state);
> > > > > > +	else
> > > > > > +		__clear_bit(irq_source_id, irq_state);
> > > > > > +
> > > > > > +	return !!(*irq_state);
> > > > > > +}
> > > > > > +
> > > > > > +int kvm_pic_set_irq(struct kvm_pic *pic, int irq, int irq_source_id, int level);
> > > > > > +void kvm_pic_clear_all(struct kvm_pic *pic, int irq_source_id);
> > > > > >  
> > > > > >  void kvm_inject_nmi(struct kvm_vcpu *vcpu);
> > > > > >  
> > > > > > diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> > > > > > index 81cf4fa..95b504b 100644
> > > > > > --- a/arch/x86/kvm/i8259.c
> > > > > > +++ b/arch/x86/kvm/i8259.c
> > > > > > @@ -188,13 +188,14 @@ void kvm_pic_update_irq(struct kvm_pic *s)
> > > > > >  	pic_unlock(s);
> > > > > >  }
> > > > > >  
> > > > > > -int kvm_pic_set_irq(void *opaque, int irq, int level)
> > > > > > +int kvm_pic_set_irq(struct kvm_pic *s, int irq, int src_id, int l)
> > > > > >  {
> > > > > > -	struct kvm_pic *s = opaque;
> > > > > >  	int ret = -1;
> > > > > > +	int level;
> > > > > >  
> > > > > >  	pic_lock(s);
> > > > > >  	if (irq >= 0 && irq < PIC_NUM_PINS) {
> > > > > > +		level = kvm_irq_line_state(&s->irq_states[irq], src_id, l);
> > > > > >  		ret = pic_set_irq1(&s->pics[irq >> 3], irq & 7, level);
> > > > > >  		pic_update_irq(s);
> > > > > >  		trace_kvm_pic_set_irq(irq >> 3, irq & 7, s->pics[irq >> 3].elcr,
> > > > > > @@ -205,6 +206,15 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
> > > > > >  	return ret;
> > > > > >  }
> > > > > >  
> > > > > > +void kvm_pic_clear_all(struct kvm_pic *s, int src_id)
> > > > > > +{
> > > > > > +	int i;
> > > > > > +	pic_lock(s);
> > > > > > +	for (i = 0; i < PIC_NUM_PINS; i++)
> > > > > > +		__clear_bit(src_id, &s->irq_states[i]);
> > > > > > +	pic_unlock(s);
> > > > > > +}
> > > > > > +
> > > > > >  /*
> > > > > >   * acknowledge interrupt 'irq'
> > > > > >   */
> > > > > > diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> > > > > > index 26fd54d..268b332 100644
> > > > > > --- a/virt/kvm/ioapic.c
> > > > > > +++ b/virt/kvm/ioapic.c
> > > > > > @@ -191,17 +191,19 @@ static int ioapic_deliver(struct kvm_ioapic *ioapic, int irq)
> > > > > >  	return kvm_irq_delivery_to_apic(ioapic->kvm, NULL, &irqe);
> > > > > >  }
> > > > > >  
> > > > > > -int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
> > > > > > +int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int src_id, int l)
> > > > > >  {
> > > > > >  	u32 old_irr;
> > > > > >  	u32 mask = 1 << irq;
> > > > > >  	union kvm_ioapic_redirect_entry entry;
> > > > > >  	int ret = 1;
> > > > > > +	int level;
> > > > > >  
> > > > > >  	spin_lock(&ioapic->lock);
> > > > > >  	old_irr = ioapic->irr;
> > > > > >  	if (irq >= 0 && irq < IOAPIC_NUM_PINS) {
> > > > > >  		entry = ioapic->redirtbl[irq];
> > > > > > +		level = kvm_irq_line_state(&ioapic->irq_states[irq], src_id, l);
> > > > > >  		level ^= entry.fields.polarity;
> > > > > >  		if (!level)
> > > > > >  			ioapic->irr &= ~mask;
> > > > > > @@ -221,6 +223,15 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
> > > > > >  	return ret;
> > > > > >  }
> > > > > >  
> > > > > > +void kvm_ioapic_clear_all(struct kvm_ioapic *ioapic, int src_id)
> > > > > > +{
> > > > > > +	int i;
> > > > > > +	spin_lock(&ioapic->lock);
> > > > > > +	for (i = 0; i < KVM_IOAPIC_NUM_PINS; i++)
> > > > > > +		__clear_bit(src_id, &ioapic->irq_states[i]);
> > > > > > +	spin_unlock(&ioapic->lock);
> > > > > > +}
> > > > > > +
> > > > > >  static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
> > > > > >  				     int trigger_mode)
> > > > > >  {
> > > > > > diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
> > > > > > index 32872a0..a30abfe 100644
> > > > > > --- a/virt/kvm/ioapic.h
> > > > > > +++ b/virt/kvm/ioapic.h
> > > > > > @@ -74,7 +74,9 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode);
> > > > > >  bool kvm_ioapic_handles_vector(struct kvm *kvm, int vector);
> > > > > >  int kvm_ioapic_init(struct kvm *kvm);
> > > > > >  void kvm_ioapic_destroy(struct kvm *kvm);
> > > > > > -int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level);
> > > > > > +int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
> > > > > > +		       int level);
> > > > > > +void kvm_ioapic_clear_all(struct kvm_ioapic *ioapic, int irq_source_id);
> > > > > >  void kvm_ioapic_reset(struct kvm_ioapic *ioapic);
> > > > > >  int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
> > > > > >  		struct kvm_lapic_irq *irq);
> > > > > > diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> > > > > > index 5afb431..83402d7 100644
> > > > > > --- a/virt/kvm/irq_comm.c
> > > > > > +++ b/virt/kvm/irq_comm.c
> > > > > > @@ -33,26 +33,12 @@
> > > > > >  
> > > > > >  #include "ioapic.h"
> > > > > >  
> > > > > > -static inline int kvm_irq_line_state(unsigned long *irq_state,
> > > > > > -				     int irq_source_id, int level)
> > > > > > -{
> > > > > > -	/* Logical OR for level trig interrupt */
> > > > > > -	if (level)
> > > > > > -		set_bit(irq_source_id, irq_state);
> > > > > > -	else
> > > > > > -		clear_bit(irq_source_id, irq_state);
> > > > > > -
> > > > > > -	return !!(*irq_state);
> > > > > > -}
> > > > > > -
> > > > > >  static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > > > >  			   struct kvm *kvm, int irq_source_id, int level)
> > > > > >  {
> > > > > >  #ifdef CONFIG_X86
> > > > > >  	struct kvm_pic *pic = pic_irqchip(kvm);
> > > > > > -	level = kvm_irq_line_state(&pic->irq_states[e->irqchip.pin],
> > > > > > -				   irq_source_id, level);
> > > > > > -	return kvm_pic_set_irq(pic, e->irqchip.pin, level);
> > > > > > +	return kvm_pic_set_irq(pic, e->irqchip.pin, irq_source_id, level);
> > > > > >  #else
> > > > > >  	return -1;
> > > > > >  #endif
> > > > > > @@ -62,10 +48,7 @@ static int kvm_set_ioapic_irq(struct kvm_kernel_irq_routing_entry *e,
> > > > > >  			      struct kvm *kvm, int irq_source_id, int level)
> > > > > >  {
> > > > > >  	struct kvm_ioapic *ioapic = kvm->arch.vioapic;
> > > > > > -	level = kvm_irq_line_state(&ioapic->irq_states[e->irqchip.pin],
> > > > > > -				   irq_source_id, level);
> > > > > > -
> > > > > > -	return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, level);
> > > > > > +	return kvm_ioapic_set_irq(ioapic, e->irqchip.pin, irq_source_id, level);
> > > > > >  }
> > > > > >  
> > > > > >  inline static bool kvm_is_dm_lowest_prio(struct kvm_lapic_irq *irq)
> > > > > > @@ -249,8 +232,6 @@ unlock:
> > > > > >  
> > > > > >  void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id)
> > > > > >  {
> > > > > > -	int i;
> > > > > > -
> > > > > >  	ASSERT(irq_source_id != KVM_USERSPACE_IRQ_SOURCE_ID);
> > > > > >  
> > > > > >  	mutex_lock(&kvm->irq_lock);
> > > > > > @@ -263,14 +244,10 @@ void kvm_free_irq_source_id(struct kvm *kvm, int irq_source_id)
> > > > > >  	if (!irqchip_in_kernel(kvm))
> > > > > >  		goto unlock;
> > > > > >  
> > > > > > -	for (i = 0; i < KVM_IOAPIC_NUM_PINS; i++) {
> > > > > > -		clear_bit(irq_source_id, &kvm->arch.vioapic->irq_states[i]);
> > > > > > -		if (i >= 16)
> > > > > > -			continue;
> > > > > > +	kvm_ioapic_clear_all(kvm->arch.vioapic, irq_source_id);
> > > > > >  #ifdef CONFIG_X86
> > > > > > -		clear_bit(irq_source_id, &pic_irqchip(kvm)->irq_states[i]);
> > > > > > +	kvm_pic_clear_all(pic_irqchip(kvm), irq_source_id);
> > > > > >  #endif
> > > > > > -	}
> > > > > >  unlock:
> > > > > >  	mutex_unlock(&kvm->irq_lock);
> > > > > >  }
> > > > > 
> > > > > 
> > > 
> > > 
> 
> 

      reply	other threads:[~2012-07-19 12:24 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-18 21:52 [PATCH] kvm: fix race with level interrupts Michael S. Tsirkin
2012-07-18 22:16 ` Alex Williamson
2012-07-18 22:44   ` Michael S. Tsirkin
2012-07-18 23:22     ` Alex Williamson
2012-07-19  9:15       ` Michael S. Tsirkin
2012-07-19 12:06         ` Alex Williamson
2012-07-19 12:10           ` Michael S. Tsirkin
2012-07-19  7:51     ` Gleb Natapov
2012-07-18 22:26 ` Alex Williamson
2012-07-18 22:49   ` Michael S. Tsirkin
2012-07-18 23:20     ` Alex Williamson
2012-07-19  7:47       ` Gleb Natapov
2012-07-19  9:07       ` Michael S. Tsirkin
2012-07-19 12:11         ` Alex Williamson
2012-07-19 12:24           ` Michael S. Tsirkin [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120719122456.GB9303@redhat.com \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=avi@redhat.com \
    --cc=gleb@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mtosatti@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.