* [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
@ 2010-04-20 15:54 Marcelo Tosatti
  2010-04-20 21:49 ` Bonenkamp, Ralf
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-20 15:54 UTC (permalink / raw)
  To: kvm; +Cc: bonenkamp, Chris Wright, Yang, Sheng


The assigned device interrupt work handler calls kvm_set_irq, which
can sleep (for example, waiting for the ioapic mutex), from an
irq-disabled section.

https://bugzilla.kernel.org/show_bug.cgi?id=15725

Fix by dropping assigned_dev_lock (and re-enabling interrupts)
before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. The other
cases do not require the lock or interrupts to be disabled (a new work
instance will be queued in case of a concurrent interrupt).

KVM-Stable-Tag.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 47ca447..7ac7bbe 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -64,24 +64,33 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
 				    interrupt_work);
 	kvm = assigned_dev->kvm;
 
-	spin_lock_irq(&assigned_dev->assigned_dev_lock);
 	if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSIX) {
 		struct kvm_guest_msix_entry *guest_entries =
 			assigned_dev->guest_msix_entries;
+
+		spin_lock_irq(&assigned_dev->assigned_dev_lock);
 		for (i = 0; i < assigned_dev->entries_nr; i++) {
 			if (!(guest_entries[i].flags &
 					KVM_ASSIGNED_MSIX_PENDING))
 				continue;
 			guest_entries[i].flags &= ~KVM_ASSIGNED_MSIX_PENDING;
+			/*
+ 			 * If kvm_assigned_dev_intr sets pending for an
+ 			 * entry smaller than this work instance is
+ 			 * currently processing, a new work instance
+ 			 * will be queued.
+ 			 */
+			spin_unlock_irq(&assigned_dev->assigned_dev_lock);
 			kvm_set_irq(assigned_dev->kvm,
 				    assigned_dev->irq_source_id,
 				    guest_entries[i].vector, 1);
+			spin_lock_irq(&assigned_dev->assigned_dev_lock);
 		}
+		spin_unlock_irq(&assigned_dev->assigned_dev_lock);
 	} else
 		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
 			    assigned_dev->guest_irq, 1);
 
-	spin_unlock_irq(&assigned_dev->assigned_dev_lock);
 }
 
 static irqreturn_t kvm_assigned_dev_intr(int irq, void *dev_id)



^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-20 15:54 [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section Marcelo Tosatti
@ 2010-04-20 21:49 ` Bonenkamp, Ralf
  2010-04-21  7:51   ` Yang, Sheng
  2010-04-21  7:48 ` Yang, Sheng
  2010-04-21  8:32 ` [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section Avi Kivity
  2 siblings, 1 reply; 21+ messages in thread
From: Bonenkamp, Ralf @ 2010-04-20 21:49 UTC (permalink / raw)
  To: Marcelo Tosatti, kvm; +Cc: Chris Wright, Yang, Sheng

Hi Marcelo,

Thanks for the patch.
I put it into my kernel source tree and tested the freshly built kernel in my testing environment. The problem is now - almost - gone. The only suspicious message (ONE occurrence, immediately after starting the Server 2008 R2 VM) in syslog is now:

BUG: scheduling while atomic: qemu/3674/0x00000002
Modules linked in: tun bridge stp llc ext3 jbd uhci_hcd snd_hda_codec_realtek snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc i915 drm_kms_helper drm i2c_algo_bit video output i2c_i801 i2c_core ide_pci_generic ide_core button intel_agp ehci_hcd thermal e1000e iTCO_wdt iTCO_vendor_support processor usbcore kvm_intel evdev sg psmouse pcspkr serio_raw kvm rtc_cmos rtc_core rtc_lib ext4 mbcache jbd2 crc16 dm_mod sr_mod cdrom sd_mod ata_generic pata_acpi ahci libata scsi_mod
Pid: 3674, comm: qemu Not tainted 2.6.33.2-KVM_patch #4
Call Trace:
 [<ffffffff8133d414>] ? schedule+0x544/0xa60
 [<ffffffffa01839ad>] ? mmu_zap_unsync_children+0x17d/0x200 [kvm]
 [<ffffffff8133e612>] ? __mutex_lock_slowpath+0x162/0x350
 [<ffffffff8133e809>] ? mutex_lock+0x9/0x20
 [<ffffffffa016cc87>] ? kvm_ioapic_set_irq+0x47/0x160 [kvm]
 [<ffffffffa016d914>] ? kvm_set_irq+0xf4/0x190 [kvm]
 [<ffffffffa016d9b0>] ? kvm_set_ioapic_irq+0x0/0x50 [kvm]
 [<ffffffffa016da00>] ? kvm_set_pic_irq+0x0/0x50 [kvm]
 [<ffffffffa018587f>] ? paging64_walk_addr+0x25f/0x750 [kvm]
 [<ffffffff8133e70d>] ? __mutex_lock_slowpath+0x25d/0x350
 [<ffffffffa016e905>] ? kvm_assigned_dev_ack_irq+0x35/0x90 [kvm]
 [<ffffffffa016d771>] ? kvm_notify_acked_irq+0x71/0x120 [kvm]
 [<ffffffffa016ce13>] ? kvm_ioapic_update_eoi+0x73/0xd0 [kvm]
 [<ffffffffa0192689>] ? apic_reg_write+0x569/0x700 [kvm]
 [<ffffffffa0192939>] ? apic_mmio_write+0x69/0x70 [kvm]
 [<ffffffffa0178fec>] ? emulator_write_emulated_onepage+0xac/0x1b0 [kvm]
 [<ffffffffa018d9c5>] ? x86_emulate_insn+0x1d25/0x4d20 [kvm]
 [<ffffffffa018b98e>] ? x86_decode_insn+0x96e/0xba0 [kvm]
 [<ffffffffa018442c>] ? kvm_mmu_unprotect_page_virt+0xec/0x100 [kvm]
 [<ffffffffa0178c13>] ? emulate_instruction+0xd3/0x380 [kvm]
 [<ffffffff8133fe3e>] ? __down_read+0xce/0xd0
 [<ffffffffa0191c69>] ? apic_update_ppr+0x29/0x70 [kvm]
 [<ffffffffa01fcc08>] ? handle_apic_access+0x18/0x40 [kvm_intel]
 [<ffffffffa017ba7d>] ? kvm_arch_vcpu_ioctl_run+0x3ad/0xcf0 [kvm]
 [<ffffffff81081af7>] ? wake_futex+0x37/0x70
 [<ffffffffa0168d5c>] ? kvm_vcpu_ioctl+0x53c/0x910 [kvm]
 [<ffffffff81013ab3>] ? __switch_to_xtra+0x163/0x1a0
 [<ffffffff810088b1>] ? __switch_to+0x271/0x340
 [<ffffffff81121405>] ? vfs_ioctl+0x35/0xd0
 [<ffffffff811215c8>] ? do_vfs_ioctl+0x88/0x570
 [<ffffffff8133d1c9>] ? schedule+0x2f9/0xa60
 [<ffffffff81121b30>] ? sys_ioctl+0x80/0xa0
 [<ffffffff810cf4ea>] ? fire_user_return_notifiers+0x3a/0x70
 [<ffffffff8100a042>] ? system_call_fastpath+0x16/0x1b
kvm: emulating exchange as write 

Honestly, my knowledge of the kvm internals is not sufficient to decide whether this bug report still belongs to my problem or is something different. So if you need more information or additional testing, please let me know.

Best regards
Ralf Bonenkamp  
 
-----Original Message-----
From: Marcelo Tosatti [mailto:mtosatti@redhat.com]
Sent: Tuesday, April 20, 2010 17:54
To: kvm
Cc: Ralf Bonenkamp; Chris Wright; Yang, Sheng
Subject: *** GMX suspected spam *** [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section


The assigned device interrupt work handler calls kvm_set_irq, which
can sleep, for example, waiting for the ioapic mutex, from irq disabled
section.

https://bugzilla.kernel.org/show_bug.cgi?id=15725

Fix by dropping assigned_dev_lock (and re-enabling interrupts)
before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. Other
cases do not require the lock or interrupts disabled (a new work
instance will be queued in case of concurrent interrupt).

KVM-Stable-Tag.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-20 15:54 [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section Marcelo Tosatti
  2010-04-20 21:49 ` Bonenkamp, Ralf
@ 2010-04-21  7:48 ` Yang, Sheng
  2010-04-21 15:58   ` Marcelo Tosatti
  2010-04-21  8:32 ` [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section Avi Kivity
  2 siblings, 1 reply; 21+ messages in thread
From: Yang, Sheng @ 2010-04-21  7:48 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, bonenkamp, Chris Wright

On Tuesday 20 April 2010 23:54:01 Marcelo Tosatti wrote:
> The assigned device interrupt work handler calls kvm_set_irq, which
> can sleep, for example, waiting for the ioapic mutex, from irq disabled
> section.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=15725
> 
> Fix by dropping assigned_dev_lock (and re-enabling interrupts)
> before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. Other
> cases do not require the lock or interrupts disabled (a new work
> instance will be queued in case of concurrent interrupt).

Looks fine, but depending on the new work being queued sounds a little
tricky...

How about a local_irq_disable() at the beginning? It can ensure no concurrent 
interrupts would happen as well I think.

> 
> KVM-Stable-Tag.
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> index 47ca447..7ac7bbe 100644
> --- a/virt/kvm/assigned-dev.c
> +++ b/virt/kvm/assigned-dev.c
> @@ -64,24 +64,33 @@ static void
>  kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
>  interrupt_work);
>  	kvm = assigned_dev->kvm;
> 
> -	spin_lock_irq(&assigned_dev->assigned_dev_lock);
>  	if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSIX) {
>  		struct kvm_guest_msix_entry *guest_entries =
>  			assigned_dev->guest_msix_entries;

irq_requested_type and guest_msix_entries should also be protected by the lock.
So how about another spin_lock()/unlock() pair that wraps the second kvm_set_irq()?

> +
> +		spin_lock_irq(&assigned_dev->assigned_dev_lock);
>  		for (i = 0; i < assigned_dev->entries_nr; i++) {
>  			if (!(guest_entries[i].flags &
>  					KVM_ASSIGNED_MSIX_PENDING))
>  				continue;
>  			guest_entries[i].flags &= ~KVM_ASSIGNED_MSIX_PENDING;
> +			/*
> + 			 * If kvm_assigned_dev_intr sets pending for an
> + 			 * entry smaller than this work instance is
> + 			 * currently processing, a new work instance
> + 			 * will be queued.
> + 			 */
> +			spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>  			kvm_set_irq(assigned_dev->kvm,
>  				    assigned_dev->irq_source_id,
>  				    guest_entries[i].vector, 1);
> +			spin_lock_irq(&assigned_dev->assigned_dev_lock);
>  		}
> +		spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>  	} else
>  		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
>  			    assigned_dev->guest_irq, 1);

Or could we make kvm_set_irq() atomic? Though the code path is a little long 
for spinlock.

> 
> -	spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>  }
> 
>  static irqreturn_t kvm_assigned_dev_intr(int irq, void *dev_id)

-- 
regards
Yang, Sheng

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-20 21:49 ` Bonenkamp, Ralf
@ 2010-04-21  7:51   ` Yang, Sheng
  0 siblings, 0 replies; 21+ messages in thread
From: Yang, Sheng @ 2010-04-21  7:51 UTC (permalink / raw)
  To: Bonenkamp, Ralf; +Cc: Marcelo Tosatti, kvm, Chris Wright

On Wednesday 21 April 2010 05:49:11 Bonenkamp, Ralf wrote:
> Hi Marcelo,
> 
> Thanks for the patch.
> I put it into my kernel source tree and tested the freshly build kernel in
>  my testing environment. The problem is now - almost - gone. The only
>  suspicious message (ONE occurrence immediate after starting the Server
>  2008 R2 VM) in syslog is now:
> 
> BUG: scheduling while atomic: qemu/3674/0x00000002
> Modules linked in: tun bridge stp llc ext3 jbd uhci_hcd
>  snd_hda_codec_realtek snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
>  snd_seq_device snd_pcm_oss snd_mixer_oss snd_hda_intel snd_hda_codec
>  snd_hwdep snd_pcm snd_timer snd soundcore snd_page_alloc i915
>  drm_kms_helper drm i2c_algo_bit video output i2c_i801 i2c_core
>  ide_pci_generic ide_core button intel_agp ehci_hcd thermal e1000e iTCO_wdt
>  iTCO_vendor_support processor usbcore kvm_intel evdev sg psmouse pcspkr
>  serio_raw kvm rtc_cmos rtc_core rtc_lib ext4 mbcache jbd2 crc16 dm_mod
>  sr_mod cdrom sd_mod ata_generic pata_acpi ahci libata scsi_mod Pid: 3674,
>  comm: qemu Not tainted 2.6.33.2-KVM_patch #4
> Call Trace:
>  [<ffffffff8133d414>] ? schedule+0x544/0xa60
>  [<ffffffffa01839ad>] ? mmu_zap_unsync_children+0x17d/0x200 [kvm]
>  [<ffffffff8133e612>] ? __mutex_lock_slowpath+0x162/0x350
>  [<ffffffff8133e809>] ? mutex_lock+0x9/0x20
>  [<ffffffffa016cc87>] ? kvm_ioapic_set_irq+0x47/0x160 [kvm]
>  [<ffffffffa016d914>] ? kvm_set_irq+0xf4/0x190 [kvm]
>  [<ffffffffa016d9b0>] ? kvm_set_ioapic_irq+0x0/0x50 [kvm]
>  [<ffffffffa016da00>] ? kvm_set_pic_irq+0x0/0x50 [kvm]
>  [<ffffffffa018587f>] ? paging64_walk_addr+0x25f/0x750 [kvm]
>  [<ffffffff8133e70d>] ? __mutex_lock_slowpath+0x25d/0x350
>  [<ffffffffa016e905>] ? kvm_assigned_dev_ack_irq+0x35/0x90 [kvm]
>  [<ffffffffa016d771>] ? kvm_notify_acked_irq+0x71/0x120 [kvm]

This time it seems to be kvm_notify_acked_irq() with RCU.

-- 
regards
Yang, Sheng

>  [<ffffffffa016ce13>] ? kvm_ioapic_update_eoi+0x73/0xd0 [kvm]
>  [<ffffffffa0192689>] ? apic_reg_write+0x569/0x700 [kvm]
>  [<ffffffffa0192939>] ? apic_mmio_write+0x69/0x70 [kvm]
>  [<ffffffffa0178fec>] ? emulator_write_emulated_onepage+0xac/0x1b0 [kvm]
>  [<ffffffffa018d9c5>] ? x86_emulate_insn+0x1d25/0x4d20 [kvm]
>  [<ffffffffa018b98e>] ? x86_decode_insn+0x96e/0xba0 [kvm]
>  [<ffffffffa018442c>] ? kvm_mmu_unprotect_page_virt+0xec/0x100 [kvm]
>  [<ffffffffa0178c13>] ? emulate_instruction+0xd3/0x380 [kvm]
>  [<ffffffff8133fe3e>] ? __down_read+0xce/0xd0
>  [<ffffffffa0191c69>] ? apic_update_ppr+0x29/0x70 [kvm]
>  [<ffffffffa01fcc08>] ? handle_apic_access+0x18/0x40 [kvm_intel]
>  [<ffffffffa017ba7d>] ? kvm_arch_vcpu_ioctl_run+0x3ad/0xcf0 [kvm]
>  [<ffffffff81081af7>] ? wake_futex+0x37/0x70
>  [<ffffffffa0168d5c>] ? kvm_vcpu_ioctl+0x53c/0x910 [kvm]
>  [<ffffffff81013ab3>] ? __switch_to_xtra+0x163/0x1a0
>  [<ffffffff810088b1>] ? __switch_to+0x271/0x340
>  [<ffffffff81121405>] ? vfs_ioctl+0x35/0xd0
>  [<ffffffff811215c8>] ? do_vfs_ioctl+0x88/0x570
>  [<ffffffff8133d1c9>] ? schedule+0x2f9/0xa60
>  [<ffffffff81121b30>] ? sys_ioctl+0x80/0xa0
>  [<ffffffff810cf4ea>] ? fire_user_return_notifiers+0x3a/0x70
>  [<ffffffff8100a042>] ? system_call_fastpath+0x16/0x1b
> kvm: emulating exchange as write
> 
> Honestly my knowledge of the kvm internals is not sufficient to decide if
>  this bug report still belongs to my problem or is something different. So
>  if you need more information or additional testing please let me know..
> 
> Best regards
> Ralf Bonenkamp
> 
> -----Original Message-----
> From: Marcelo Tosatti [mailto:mtosatti@redhat.com]
> Sent: Tuesday, April 20, 2010 17:54
> To: kvm
> Cc: Ralf Bonenkamp; Chris Wright; Yang, Sheng
> Subject: *** GMX suspected spam *** [UNTESTED] KVM: do not call kvm_set_irq
>  from irq disabled section
> 
> 
> The assigned device interrupt work handler calls kvm_set_irq, which
> can sleep, for example, waiting for the ioapic mutex, from irq disabled
> section.
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=15725
> 
> Fix by dropping assigned_dev_lock (and re-enabling interrupts)
> before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. Other
> cases do not require the lock or interrupts disabled (a new work
> instance will be queued in case of concurrent interrupt).
> 
> KVM-Stable-Tag.
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-20 15:54 [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section Marcelo Tosatti
  2010-04-20 21:49 ` Bonenkamp, Ralf
  2010-04-21  7:48 ` Yang, Sheng
@ 2010-04-21  8:32 ` Avi Kivity
  2010-04-21 16:03   ` Marcelo Tosatti
  2 siblings, 1 reply; 21+ messages in thread
From: Avi Kivity @ 2010-04-21  8:32 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, bonenkamp, Chris Wright, Yang, Sheng

On 04/20/2010 06:54 PM, Marcelo Tosatti wrote:
> The assigned device interrupt work handler calls kvm_set_irq, which
> can sleep, for example, waiting for the ioapic mutex, from irq disabled
> section.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=15725
>
> Fix by dropping assigned_dev_lock (and re-enabling interrupts)
> before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. Other
> cases do not require the lock or interrupts disabled (a new work
> instance will be queued in case of concurrent interrupt).
>
> KVM-Stable-Tag.
> Signed-off-by: Marcelo Tosatti<mtosatti@redhat.com>
>
> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> index 47ca447..7ac7bbe 100644
> --- a/virt/kvm/assigned-dev.c
> +++ b/virt/kvm/assigned-dev.c
> @@ -64,24 +64,33 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
>   				    interrupt_work);
>   	kvm = assigned_dev->kvm;
>
> -	spin_lock_irq(&assigned_dev->assigned_dev_lock);
>   	if (assigned_dev->irq_requested_type&  KVM_DEV_IRQ_HOST_MSIX) {
>   		struct kvm_guest_msix_entry *guest_entries =
>   			assigned_dev->guest_msix_entries;
> +
> +		spin_lock_irq(&assigned_dev->assigned_dev_lock);
>   		for (i = 0; i<  assigned_dev->entries_nr; i++) {
>   			if (!(guest_entries[i].flags&
>   					KVM_ASSIGNED_MSIX_PENDING))
>   				continue;
>   			guest_entries[i].flags&= ~KVM_ASSIGNED_MSIX_PENDING;
> +			/*
> + 			 * If kvm_assigned_dev_intr sets pending for an
> + 			 * entry smaller than this work instance is
> + 			 * currently processing, a new work instance
> + 			 * will be queued.
> + 			 */
> +			spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>    

What happens if assigned_dev->entries_nr is changed here?

>   			kvm_set_irq(assigned_dev->kvm,
>   				    assigned_dev->irq_source_id,
>   				    guest_entries[i].vector, 1);
> +			spin_lock_irq(&assigned_dev->assigned_dev_lock);
>   		}
> +		spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>   	} else
>   		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
>   			    assigned_dev->guest_irq, 1);
>
> -	spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>   }
>    

A less dangerous fix is to copy all vectors to be triggered into a local 
array, drop the lock, and replay the array into kvm_set_irq().  The else 
branch could do this as well.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21  7:48 ` Yang, Sheng
@ 2010-04-21 15:58   ` Marcelo Tosatti
  2010-04-21 17:12     ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-21 15:58 UTC (permalink / raw)
  To: Yang, Sheng, Gleb Natapov; +Cc: kvm, bonenkamp, Chris Wright

On Wed, Apr 21, 2010 at 03:48:12PM +0800, Yang, Sheng wrote:
> On Tuesday 20 April 2010 23:54:01 Marcelo Tosatti wrote:
> > The assigned device interrupt work handler calls kvm_set_irq, which
> > can sleep, for example, waiting for the ioapic mutex, from irq disabled
> > section.
> > 
> > https://bugzilla.kernel.org/show_bug.cgi?id=15725
> > 
> > Fix by dropping assigned_dev_lock (and re-enabling interrupts)
> > before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. Other
> > cases do not require the lock or interrupts disabled (a new work
> > instance will be queued in case of concurrent interrupt).
> 
> Looks fine, but depends on the new work would be queued sounds a little 
> tricky...

I think that's guaranteed behaviour, so you can schedule_work() from
within a worker.
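
A minimal sketch of the guarantee being relied on here, with hypothetical
example_* names rather than the assigned-dev code: once a work handler has
started running, its item is no longer pending, so a schedule_work() from an
interrupt arriving meanwhile queues a fresh execution instead of being lost.

#include <linux/workqueue.h>
#include <linux/interrupt.h>

static void example_handler(struct work_struct *work);
static DECLARE_WORK(example_work, example_handler);

static void example_handler(struct work_struct *work)
{
	/* long-running processing; the item stopped being "pending"
	 * the moment this function was entered */
}

static irqreturn_t example_intr(int irq, void *dev_id)
{
	/* even if example_handler() is running right now, this queues
	 * another execution of it */
	schedule_work(&example_work);
	return IRQ_HANDLED;
}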

> How about a local_irq_disable() at the beginning? It can ensure no concurrent 
> interrupts would happen as well I think.
> 
> > 
> > KVM-Stable-Tag.
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> > 
> > diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> > index 47ca447..7ac7bbe 100644
> > --- a/virt/kvm/assigned-dev.c
> > +++ b/virt/kvm/assigned-dev.c
> > @@ -64,24 +64,33 @@ static void
> >  kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
> >  interrupt_work);
> >  	kvm = assigned_dev->kvm;
> > 
> > -	spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  	if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSIX) {
> >  		struct kvm_guest_msix_entry *guest_entries =
> >  			assigned_dev->guest_msix_entries;
> 
> irq_requested_type and guest_msix_entries should also protected by the lock. 
> So how about another spin_lock()/unlock() pair wraps the second kvm_set_irq()?

I don't think it's necessary because irq_requested_type and
guest_msix_entries never change once set up.

They only change via deassign_irq, which first disables the IRQ and
flushes pending work.
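
The ordering being described, as a minimal sketch with hypothetical
example_* names (not the actual deassign_irq code): the interrupt is freed
and pending work is flushed before any of the fields the handler reads are
allowed to change.

#include <linux/interrupt.h>
#include <linux/workqueue.h>

struct example_dev {
	int host_irq;
	unsigned irq_requested_type;
	struct work_struct interrupt_work;
};

static void example_deassign_irq(struct example_dev *dev)
{
	free_irq(dev->host_irq, dev);		/* no new host interrupts */
	flush_work(&dev->interrupt_work);	/* no work handler still running */
	/* only now may irq_requested_type, entries_nr, etc. change */
	dev->irq_requested_type = 0;
}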

> > +
> > +		spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  		for (i = 0; i < assigned_dev->entries_nr; i++) {
> >  			if (!(guest_entries[i].flags &
> >  					KVM_ASSIGNED_MSIX_PENDING))
> >  				continue;
> >  			guest_entries[i].flags &= ~KVM_ASSIGNED_MSIX_PENDING;
> > +			/*
> > + 			 * If kvm_assigned_dev_intr sets pending for an
> > + 			 * entry smaller than this work instance is
> > + 			 * currently processing, a new work instance
> > + 			 * will be queued.
> > + 			 */
> > +			spin_unlock_irq(&assigned_dev->assigned_dev_lock);
> >  			kvm_set_irq(assigned_dev->kvm,
> >  				    assigned_dev->irq_source_id,
> >  				    guest_entries[i].vector, 1);
> > +			spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  		}
> > +		spin_unlock_irq(&assigned_dev->assigned_dev_lock);
> >  	} else
> >  		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
> >  			    assigned_dev->guest_irq, 1);
> 
> Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> for spinlock.

Yes, given the sleep-inside-RCU-protected section bug from
kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.

But as you said, the code paths are long and potentially slow, so
probably SRCU is a better alternative.

Gleb?
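
For reference, the SRCU pattern being proposed, as a minimal sketch with
hypothetical example_* names: SRCU read-side sections are allowed to sleep,
and the unregister side waits for them with synchronize_srcu() before
freeing.

#include <linux/srcu.h>
#include <linux/slab.h>

struct example_notifier {
	void (*irq_acked)(struct example_notifier *n);
};

static struct srcu_struct example_srcu;	/* init_srcu_struct(&example_srcu) at setup */

static void example_notify(struct example_notifier *n)
{
	int idx = srcu_read_lock(&example_srcu);
	n->irq_acked(n);			/* may sleep inside an SRCU read section */
	srcu_read_unlock(&example_srcu, idx);
}

static void example_unregister(struct example_notifier *n)
{
	/* unlink n from the notifier list here */
	synchronize_srcu(&example_srcu);	/* wait for all read sections to finish */
	kfree(n);
}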


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21  8:32 ` [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section Avi Kivity
@ 2010-04-21 16:03   ` Marcelo Tosatti
  2010-04-21 16:28     ` Avi Kivity
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-21 16:03 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, bonenkamp, Chris Wright, Yang, Sheng

On Wed, Apr 21, 2010 at 11:32:49AM +0300, Avi Kivity wrote:
> On 04/20/2010 06:54 PM, Marcelo Tosatti wrote:
> >The assigned device interrupt work handler calls kvm_set_irq, which
> >can sleep, for example, waiting for the ioapic mutex, from irq disabled
> >section.
> >
> >https://bugzilla.kernel.org/show_bug.cgi?id=15725
> >
> >Fix by dropping assigned_dev_lock (and re-enabling interrupts)
> >before invoking kvm_set_irq for the KVM_DEV_IRQ_HOST_MSIX case. Other
> >cases do not require the lock or interrupts disabled (a new work
> >instance will be queued in case of concurrent interrupt).
> >
> >KVM-Stable-Tag.
> >Signed-off-by: Marcelo Tosatti<mtosatti@redhat.com>
> >
> >diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
> >index 47ca447..7ac7bbe 100644
> >--- a/virt/kvm/assigned-dev.c
> >+++ b/virt/kvm/assigned-dev.c
> >@@ -64,24 +64,33 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
> >  				    interrupt_work);
> >  	kvm = assigned_dev->kvm;
> >
> >-	spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  	if (assigned_dev->irq_requested_type&  KVM_DEV_IRQ_HOST_MSIX) {
> >  		struct kvm_guest_msix_entry *guest_entries =
> >  			assigned_dev->guest_msix_entries;
> >+
> >+		spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  		for (i = 0; i<  assigned_dev->entries_nr; i++) {
> >  			if (!(guest_entries[i].flags&
> >  					KVM_ASSIGNED_MSIX_PENDING))
> >  				continue;
> >  			guest_entries[i].flags&= ~KVM_ASSIGNED_MSIX_PENDING;
> >+			/*
> >+ 			 * If kvm_assigned_dev_intr sets pending for an
> >+ 			 * entry smaller than this work instance is
> >+ 			 * currently processing, a new work instance
> >+ 			 * will be queued.
> >+ 			 */
> >+			spin_unlock_irq(&assigned_dev->assigned_dev_lock);
> 
> What happens if assigned_dev->entries_nr is changed here?

It does not. Interrupts and work instances are stopped before 
irq_requested_type or entries_nr can change.

> >  			kvm_set_irq(assigned_dev->kvm,
> >  				    assigned_dev->irq_source_id,
> >  				    guest_entries[i].vector, 1);
> >+			spin_lock_irq(&assigned_dev->assigned_dev_lock);
> >  		}
> >+		spin_unlock_irq(&assigned_dev->assigned_dev_lock);
> >  	} else
> >  		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
> >  			    assigned_dev->guest_irq, 1);
> >
> >-	spin_unlock_irq(&assigned_dev->assigned_dev_lock);
> >  }
> 
> A less dangerous fix is to copy all vectors to be triggered into a
> local array, drop the lock, and replay the array into kvm_set_irq().
> The else branch could do this as well.

It's a bit large, 256 max vectors.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21 16:03   ` Marcelo Tosatti
@ 2010-04-21 16:28     ` Avi Kivity
  0 siblings, 0 replies; 21+ messages in thread
From: Avi Kivity @ 2010-04-21 16:28 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, bonenkamp, Chris Wright, Yang, Sheng

On 04/21/2010 07:03 PM, Marcelo Tosatti wrote:
>>> diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
>>> index 47ca447..7ac7bbe 100644
>>> --- a/virt/kvm/assigned-dev.c
>>> +++ b/virt/kvm/assigned-dev.c
>>> @@ -64,24 +64,33 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
>>>   				    interrupt_work);
>>>   	kvm = assigned_dev->kvm;
>>>
>>> -	spin_lock_irq(&assigned_dev->assigned_dev_lock);
>>>   	if (assigned_dev->irq_requested_type&   KVM_DEV_IRQ_HOST_MSIX) {
>>>   		struct kvm_guest_msix_entry *guest_entries =
>>>   			assigned_dev->guest_msix_entries;
>>> +
>>> +		spin_lock_irq(&assigned_dev->assigned_dev_lock);
>>>   		for (i = 0; i<   assigned_dev->entries_nr; i++) {
>>>   			if (!(guest_entries[i].flags&
>>>   					KVM_ASSIGNED_MSIX_PENDING))
>>>   				continue;
>>>   			guest_entries[i].flags&= ~KVM_ASSIGNED_MSIX_PENDING;
>>> +			/*
>>> + 			 * If kvm_assigned_dev_intr sets pending for an
>>> + 			 * entry smaller than this work instance is
>>> + 			 * currently processing, a new work instance
>>> + 			 * will be queued.
>>> + 			 */
>>> +			spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>>>        
>> What happens if assigned_dev->entries_nr is changed here?
>>      
> It does not. Interrupts and work instances are stopped before
> irq_requested_type or entries_nr can change.
>    

Interesting.  A comment please.

>    
>>>   			kvm_set_irq(assigned_dev->kvm,
>>>   				    assigned_dev->irq_source_id,
>>>   				    guest_entries[i].vector, 1);
>>> +			spin_lock_irq(&assigned_dev->assigned_dev_lock);
>>>   		}
>>> +		spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>>>   	} else
>>>   		kvm_set_irq(assigned_dev->kvm, assigned_dev->irq_source_id,
>>>   			    assigned_dev->guest_irq, 1);
>>>
>>> -	spin_unlock_irq(&assigned_dev->assigned_dev_lock);
>>>   }
>>>        
>> A less dangerous fix is to copy all vectors to be triggered into a
>> local array, drop the lock, and replay the array into kvm_set_irq().
>> The else branch could do this as well.
>>      
> Its a bit large, 256 max vectors.
>    

Yeah.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21 15:58   ` Marcelo Tosatti
@ 2010-04-21 17:12     ` Gleb Natapov
  2010-04-21 17:37       ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2010-04-21 17:12 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > for spinlock.
> 
> Yes, given the sleep-inside-RCU-protected section bug from
> kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> 
> But as you said, the code paths are long and potentially slow, so
> probably SRCU is a better alternative.
> 
> Gleb?
kvm_set_irq() was converted to rcu from mutex to make msix interrupt
injection scalable.
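
A minimal sketch of the read side Gleb is referring to, with hypothetical
example_* names rather than the actual irq_comm.c code: lookups run under
rcu_read_lock(), so many vcpus can inject MSIs concurrently without
contending on a lock, but nothing inside the read section may sleep.

#include <linux/rcupdate.h>
#include <linux/types.h>

struct example_routing_table;		/* opaque for the sketch */

struct example_kvm {
	struct example_routing_table *irq_routing;  /* updated with rcu_assign_pointer() */
};

static int example_deliver(struct example_routing_table *t, u32 gsi, int level);

static int example_set_irq(struct example_kvm *kvm, u32 gsi, int level)
{
	struct example_routing_table *t;
	int ret;

	rcu_read_lock();
	t = rcu_dereference(kvm->irq_routing);
	ret = example_deliver(t, gsi, level);	/* must not sleep here */
	rcu_read_unlock();
	return ret;
}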

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21 17:12     ` Gleb Natapov
@ 2010-04-21 17:37       ` Marcelo Tosatti
  2010-04-21 17:58         ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-21 17:37 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > for spinlock.
> > 
> > Yes, given the sleep-inside-RCU-protected section bug from
> > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > 
> > But as you said, the code paths are long and potentially slow, so
> > probably SRCU is a better alternative.
> > 
> > Gleb?
> kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> injection scalable.

We meant ioapic lock. See the last report from Ralf on this thread. 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21 17:37       ` Marcelo Tosatti
@ 2010-04-21 17:58         ` Gleb Natapov
  2010-04-21 18:29           ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2010-04-21 17:58 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote:
> On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > > for spinlock.
> > > 
> > > Yes, given the sleep-inside-RCU-protected section bug from
> > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > > 
> > > But as you said, the code paths are long and potentially slow, so
> > > probably SRCU is a better alternative.
> > > 
> > > Gleb?
> > kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> > injection scalable.
> 
> We meant ioapic lock. See the last report from Ralf on this thread. 
Can we solve the problem by calling ack notifier outside rcu read
section in kvm_notify_acked_irq()?

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21 17:58         ` Gleb Natapov
@ 2010-04-21 18:29           ` Marcelo Tosatti
  2010-04-21 18:38             ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-21 18:29 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Wed, Apr 21, 2010 at 08:58:48PM +0300, Gleb Natapov wrote:
> On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote:
> > On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> > > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > > > for spinlock.
> > > > 
> > > > Yes, given the sleep-inside-RCU-protected section bug from
> > > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > > > 
> > > > But as you said, the code paths are long and potentially slow, so
> > > > probably SRCU is a better alternative.
> > > > 
> > > > Gleb?
> > > kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> > > injection scalable.
> > 
> > We meant ioapic lock. See the last report from Ralf on this thread. 
> Can we solve the problem by calling ack notifier outside rcu read
> section in kvm_notify_acked_irq()?

The unregister path does

- remove_from_list(entry)
- synchronize_rcu
- kfree(entry)

So if kvm_notify_acked_irq sleeps, synchronize_rcu can succeed, and the
notifier entry can be freed.
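
The unregister path above, as a minimal sketch with hypothetical example_*
names: synchronize_rcu() only waits for rcu_read_lock() sections, so a
notifier call that sleeps after rcu_read_unlock() can still be using the
entry when it is freed.

#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct example_ack_notifier {
	struct hlist_node link;
};

static void example_unregister(struct example_ack_notifier *kian)
{
	hlist_del_init_rcu(&kian->link);	/* remove_from_list(entry) */
	synchronize_rcu();			/* waits only for rcu_read_lock() sections */
	kfree(kian);				/* a sleeping caller may still reference this */
}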


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21 18:29           ` Marcelo Tosatti
@ 2010-04-21 18:38             ` Gleb Natapov
  2010-04-22 16:40               ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2010-04-21 18:38 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Wed, Apr 21, 2010 at 03:29:11PM -0300, Marcelo Tosatti wrote:
> On Wed, Apr 21, 2010 at 08:58:48PM +0300, Gleb Natapov wrote:
> > On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote:
> > > On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> > > > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > > > > for spinlock.
> > > > > 
> > > > > Yes, given the sleep-inside-RCU-protected section bug from
> > > > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > > > > 
> > > > > But as you said, the code paths are long and potentially slow, so
> > > > > probably SRCU is a better alternative.
> > > > > 
> > > > > Gleb?
> > > > kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> > > > injection scalable.
> > > 
> > > We meant ioapic lock. See the last report from Ralf on this thread. 
> > Can we solve the problem by calling ack notifier outside rcu read
> > section in kvm_notify_acked_irq()?
> 
> The unregister path does
> 
> - remove_from_list(entry)
> - synchronize_rcu
> - kfree(entry)
> 
> So if kvm_notify_acked_irq sleeps, synchronize_rcu can succeed, and the
> notifier entry can be freed.
What I mean is that kvm_notify_acked_irq() will iterate over all ack entries
in the rcu read protected section, but instead of calling the callbacks it
will collect them into an array and call them outside the rcu read section.
At that point it doesn't matter if an entry is unregistered, since the copy
is used to actually call the notifier.

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-21 18:38             ` Gleb Natapov
@ 2010-04-22 16:40               ` Marcelo Tosatti
  2010-04-22 18:11                 ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-22 16:40 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Wed, Apr 21, 2010 at 09:38:39PM +0300, Gleb Natapov wrote:
> On Wed, Apr 21, 2010 at 03:29:11PM -0300, Marcelo Tosatti wrote:
> > On Wed, Apr 21, 2010 at 08:58:48PM +0300, Gleb Natapov wrote:
> > > On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote:
> > > > On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> > > > > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > > > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > > > > > for spinlock.
> > > > > > 
> > > > > > Yes, given the sleep-inside-RCU-protected section bug from
> > > > > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > > > > > 
> > > > > > But as you said, the code paths are long and potentially slow, so
> > > > > > probably SRCU is a better alternative.
> > > > > > 
> > > > > > Gleb?
> > > > > kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> > > > > injection scalable.
> > > > 
> > > > We meant ioapic lock. See the last report from Ralf on this thread. 
> > > Can we solve the problem by calling ack notifier outside rcu read
> > > section in kvm_notify_acked_irq()?
> > 
> > The unregister path does
> > 
> > - remove_from_list(entry)
> > - synchronize_rcu
> > - kfree(entry)
> > 
> > So if kvm_notify_acked_irq sleeps, synchronize_rcu can succeed, and the
> > notifier entry can be freed.
> What I mean is kvm_notify_acked_irq() will iterate over all ack entries
> in rcu read protected section, but instead of calling callback it will
> collect them into the array and call them outside rcu read section. At
> this point it doesn't matter if entry is unregistered since the copy is
> used to actually call the notifier.

Here it is, but no, this trick can't be done safely for ack notifiers
because they are unregistered at runtime by device assignment.

How does the freeing path know it can proceed to free its entry if
there is no reliable way to know whether there is still a reference to it?
(think "priv" gets freed between rcu_read_unlock and ->irq_acked with
the patch below).

I can convert to SRCU if you have no objections.

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 0150aff..900ac05 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -241,8 +241,8 @@ int pit_has_pending_timer(struct kvm_vcpu *vcpu)
 
 static void kvm_pit_ack_irq(struct kvm_irq_ack_notifier *kian)
 {
-	struct kvm_kpit_state *ps = container_of(kian, struct kvm_kpit_state,
-						 irq_ack_notifier);
+	struct kvm_kpit_state *ps = kian->priv;
+
 	raw_spin_lock(&ps->inject_lock);
 	if (atomic_dec_return(&ps->pit_timer.pending) < 0)
 		atomic_inc(&ps->pit_timer.pending);
@@ -636,6 +636,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
 		     CLOCK_MONOTONIC, HRTIMER_MODE_ABS);
 	pit_state->irq_ack_notifier.gsi = 0;
 	pit_state->irq_ack_notifier.irq_acked = kvm_pit_ack_irq;
+	pit_state->irq_ack_notifier.priv = &pit->pit_state;
 	kvm_register_irq_ack_notifier(kvm, &pit_state->irq_ack_notifier);
 	pit_state->pit_timer.reinject = true;
 	mutex_unlock(&pit->pit_state.lock);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ce027d5..6c3fb06 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -404,6 +404,7 @@ struct kvm_irq_ack_notifier {
 	struct hlist_node link;
 	unsigned gsi;
 	void (*irq_acked)(struct kvm_irq_ack_notifier *kian);
+	void *priv;
 };
 
 #define KVM_ASSIGNED_MSIX_PENDING		0x1
diff --git a/virt/kvm/assigned-dev.c b/virt/kvm/assigned-dev.c
index 4d10b1e..779b749 100644
--- a/virt/kvm/assigned-dev.c
+++ b/virt/kvm/assigned-dev.c
@@ -121,8 +121,7 @@ static void kvm_assigned_dev_ack_irq(struct kvm_irq_ack_notifier *kian)
 	if (kian->gsi == -1)
 		return;
 
-	dev = container_of(kian, struct kvm_assigned_dev_kernel,
-			   ack_notifier);
+	dev = kian->priv;
 
 	kvm_set_irq(dev->kvm, dev->irq_source_id, dev->guest_irq, 0);
 
@@ -563,6 +562,7 @@ static int kvm_vm_ioctl_assign_device(struct kvm *kvm,
 	match->irq_source_id = -1;
 	match->kvm = kvm;
 	match->ack_notifier.irq_acked = kvm_assigned_dev_ack_irq;
+	match->ack_notifier.priv = match;
 	INIT_WORK(&match->interrupt_work,
 		  kvm_assigned_dev_interrupt_work_handler);
 
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index a0e8880..ebccea8 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -175,11 +175,14 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 	return ret;
 }
 
+#define MAX_ACK_NOTIFIER_PER_GSI 4
+
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 {
 	struct kvm_irq_ack_notifier *kian;
+	struct kvm_irq_ack_notifier acks[MAX_ACK_NOTIFIER_PER_GSI];
 	struct hlist_node *n;
-	int gsi;
+	int gsi, i, acks_nr = 0;
 
 	trace_kvm_ack_irq(irqchip, pin);
 
@@ -188,9 +191,15 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 	if (gsi != -1)
 		hlist_for_each_entry_rcu(kian, n, &kvm->irq_ack_notifier_list,
 					 link)
-			if (kian->gsi == gsi)
-				kian->irq_acked(kian);
+			if (kian->gsi == gsi) {
+				if (acks_nr == MAX_ACK_NOTIFIER_PER_GSI)
+					break;
+				acks[acks_nr++] = *kian;
+			}
 	rcu_read_unlock();
+
+	for (i = 0; i < acks_nr; i++)
+		acks[i].irq_acked(&acks[i]);
 }
 
 void kvm_register_irq_ack_notifier(struct kvm *kvm,

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-22 16:40               ` Marcelo Tosatti
@ 2010-04-22 18:11                 ` Gleb Natapov
  2010-04-22 19:40                   ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2010-04-22 18:11 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Thu, Apr 22, 2010 at 01:40:38PM -0300, Marcelo Tosatti wrote:
> On Wed, Apr 21, 2010 at 09:38:39PM +0300, Gleb Natapov wrote:
> > On Wed, Apr 21, 2010 at 03:29:11PM -0300, Marcelo Tosatti wrote:
> > > On Wed, Apr 21, 2010 at 08:58:48PM +0300, Gleb Natapov wrote:
> > > > On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote:
> > > > > On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> > > > > > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > > > > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > > > > > > for spinlock.
> > > > > > > 
> > > > > > > Yes, given the sleep-inside-RCU-protected section bug from
> > > > > > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > > > > > > 
> > > > > > > But as you said, the code paths are long and potentially slow, so
> > > > > > > probably SRCU is a better alternative.
> > > > > > > 
> > > > > > > Gleb?
> > > > > > kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> > > > > > injection scalable.
> > > > > 
> > > > > We meant ioapic lock. See the last report from Ralf on this thread. 
> > > > Can we solve the problem by calling ack notifier outside rcu read
> > > > section in kvm_notify_acked_irq()?
> > > 
> > > The unregister path does
> > > 
> > > - remove_from_list(entry)
> > > - synchronize_rcu
> > > - kfree(entry)
> > > 
> > > So if kvm_notify_acked_irq sleeps, synchronize_rcu can succeed, and the
> > > notifier entry can be freed.
> > What I mean is kvm_notify_acked_irq() will iterate over all ack entries
> > in rcu read protected section, but instead of calling callback it will
> > collect them into the array and call them outside rcu read section. At
> > this point it doesn't matter if entry is unregistered since the copy is
> > used to actually call the notifier.
> 
> Here it is, but no, this trick can't be done safely for ack notifiers
> because they are unregistered at runtime by device assignment.
> 
> How does the freeing path knows it can proceed to free its entry if
> there is no reliable way to know if there is a reference to itself?
> (think "priv" gets freed between rcu_read_unlock and ->irq_acked with
> the patch below).
> 
Yeah, I see :(

> I can convert to SRCU if you have no objections.
> 
AFAIR there was a disadvantage compared to RCU, but I don't remember what
it was exactly. What about converting the PIC/IOAPIC mutexes into spinlocks?

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-22 18:11                 ` Gleb Natapov
@ 2010-04-22 19:40                   ` Marcelo Tosatti
  2010-04-22 19:55                     ` Gleb Natapov
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-22 19:40 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Thu, Apr 22, 2010 at 09:11:30PM +0300, Gleb Natapov wrote:
> On Thu, Apr 22, 2010 at 01:40:38PM -0300, Marcelo Tosatti wrote:
> > On Wed, Apr 21, 2010 at 09:38:39PM +0300, Gleb Natapov wrote:
> > > On Wed, Apr 21, 2010 at 03:29:11PM -0300, Marcelo Tosatti wrote:
> > > > On Wed, Apr 21, 2010 at 08:58:48PM +0300, Gleb Natapov wrote:
> > > > > On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote:
> > > > > > On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> > > > > > > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > > > > > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > > > > > > > for spinlock.
> > > > > > > > 
> > > > > > > > Yes, given the sleep-inside-RCU-protected section bug from
> > > > > > > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > > > > > > > 
> > > > > > > > But as you said, the code paths are long and potentially slow, so
> > > > > > > > probably SRCU is a better alternative.
> > > > > > > > 
> > > > > > > > Gleb?
> > > > > > > kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> > > > > > > injection scalable.
> > > > > > 
> > > > > > We meant ioapic lock. See the last report from Ralf on this thread. 
> > > > > Can we solve the problem by calling ack notifier outside rcu read
> > > > > section in kvm_notify_acked_irq()?
> > > > 
> > > > The unregister path does
> > > > 
> > > > - remove_from_list(entry)
> > > > - synchronize_rcu
> > > > - kfree(entry)
> > > > 
> > > > So if kvm_notify_acked_irq sleeps, synchronize_rcu can succeed, and the
> > > > notifier entry can be freed.
> > > What I mean is kvm_notify_acked_irq() will iterate over all ack entries
> > > in rcu read protected section, but instead of calling callback it will
> > > collect them into the array and call them outside rcu read section. At
> > > this point it doesn't matter if entry is unregistered since the copy is
> > > used to actually call the notifier.
> > 
> > Here it is, but no, this trick can't be done safely for ack notifiers
> > because they are unregistered at runtime by device assignment.
> > 
> > How does the freeing path knows it can proceed to free its entry if
> > there is no reliable way to know if there is a reference to itself?
> > (think "priv" gets freed between rcu_read_unlock and ->irq_acked with
> > the patch below).
> > 
> Yeah, I see :(
> 
> > I can convert to SRCU if you have no objections.
> > 
> AFAIR there was a disadvantage comparing to RCU, but I don't remember what
> it was exactly. 

http://www.mentby.com/paul-e-mckenney/kvm-patch-v3-13-kvm-fix-race-in-irqrouting-logic.html

But for the KVM IRQ paths it should not matter much since usage of grace
periods is rare (registration/unregistration is very rare compared to the
read side), and the IRQ paths are already long and slow, so the added
overhead should not be noticeable.

> What about converting PIC/IOAPIC mutexes into spinlocks?

Works for me, but on large guests the spinning will be noticeable.
I believe.

Avi?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-22 19:40                   ` Marcelo Tosatti
@ 2010-04-22 19:55                     ` Gleb Natapov
  2010-04-23 11:05                       ` Avi Kivity
  0 siblings, 1 reply; 21+ messages in thread
From: Gleb Natapov @ 2010-04-22 19:55 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Yang, Sheng, kvm, bonenkamp, Chris Wright

On Thu, Apr 22, 2010 at 04:40:30PM -0300, Marcelo Tosatti wrote:
> On Thu, Apr 22, 2010 at 09:11:30PM +0300, Gleb Natapov wrote:
> > On Thu, Apr 22, 2010 at 01:40:38PM -0300, Marcelo Tosatti wrote:
> > > On Wed, Apr 21, 2010 at 09:38:39PM +0300, Gleb Natapov wrote:
> > > > On Wed, Apr 21, 2010 at 03:29:11PM -0300, Marcelo Tosatti wrote:
> > > > > On Wed, Apr 21, 2010 at 08:58:48PM +0300, Gleb Natapov wrote:
> > > > > > On Wed, Apr 21, 2010 at 02:37:34PM -0300, Marcelo Tosatti wrote:
> > > > > > > On Wed, Apr 21, 2010 at 08:12:27PM +0300, Gleb Natapov wrote:
> > > > > > > > On Wed, Apr 21, 2010 at 12:58:41PM -0300, Marcelo Tosatti wrote:
> > > > > > > > > > Or could we make kvm_set_irq() atomic? Though the code path is a little long 
> > > > > > > > > > for spinlock.
> > > > > > > > > 
> > > > > > > > > Yes, given the sleep-inside-RCU-protected section bug from
> > > > > > > > > kvm_notify_acked_irq, either that or convert IRQ locking to SRCU.
> > > > > > > > > 
> > > > > > > > > But as you said, the code paths are long and potentially slow, so
> > > > > > > > > probably SRCU is a better alternative.
> > > > > > > > > 
> > > > > > > > > Gleb?
> > > > > > > > kvm_set_irq() was converted to rcu from mutex to make msix interrupt
> > > > > > > > injection scalable.
> > > > > > > 
> > > > > > > We meant ioapic lock. See the last report from Ralf on this thread. 
> > > > > > Can we solve the problem by calling ack notifier outside rcu read
> > > > > > section in kvm_notify_acked_irq()?
> > > > > 
> > > > > The unregister path does
> > > > > 
> > > > > - remove_from_list(entry)
> > > > > - synchronize_rcu
> > > > > - kfree(entry)
> > > > > 
> > > > > So if kvm_notify_acked_irq sleeps, synchronize_rcu can succeed, and the
> > > > > notifier entry can be freed.
> > > > What I mean is kvm_notify_acked_irq() will iterate over all ack entries
> > > > in rcu read protected section, but instead of calling callback it will
> > > > collect them into the array and call them outside rcu read section. At
> > > > this point it doesn't matter if entry is unregistered since the copy is
> > > > used to actually call the notifier.
> > > 
> > > Here it is, but no, this trick can't be done safely for ack notifiers
> > > because they are unregistered at runtime by device assignment.
> > > 
> > > How does the freeing path knows it can proceed to free its entry if
> > > there is no reliable way to know if there is a reference to itself?
> > > (think "priv" gets freed between rcu_read_unlock and ->irq_acked with
> > > the patch below).
> > > 
> > Yeah, I see :(
> > 
> > > I can convert to SRCU if you have no objections.
> > > 
> > AFAIR there was a disadvantage comparing to RCU, but I don't remember what
> > it was exactly. 
> 
> http://www.mentby.com/paul-e-mckenney/kvm-patch-v3-13-kvm-fix-race-in-irqrouting-logic.html
> 
> But for KVM IRQ path's it should not matter much since usage of grace
> periods is rare (registration/unregistration is very rare compared to 
> read side), and the IRQ path's are already large and slow, so the added
> overhead should not be noticeable.
> 
The msix path shouldn't be slow, and this is the case we are trying to
optimize.

 SRCU's read-side primitives are also significantly slower than
 those of RCU.

Sounds not too optimistic.

> > What about converting PIC/IOAPIC mutexes into spinlocks?
> 
> Works for me, but on large guests the spinning will be noticeable.
> I believe.
For interrupts going through the IOAPIC, but we know this is not scalable
anyway.

> 
> Avi?

--
			Gleb.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-22 19:55                     ` Gleb Natapov
@ 2010-04-23 11:05                       ` Avi Kivity
  2010-04-23 13:02                         ` Chris Lalancette
  2010-04-23 17:03                         ` KVM: convert ioapic lock to spinlock Marcelo Tosatti
  0 siblings, 2 replies; 21+ messages in thread
From: Avi Kivity @ 2010-04-23 11:05 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, Yang, Sheng, kvm, bonenkamp, Chris Wright

On 04/22/2010 10:55 PM, Gleb Natapov wrote:
>
>
>>> What about converting PIC/IOAPIC mutexes into spinlocks?
>>>        
>> Works for me, but on large guests the spinning will be noticeable.
>> I believe.
>>      
> For interrupts going through IOPIC, but we know this is not scalable
> anyway.
>    

Yes.  We also wanted to convert the ioapic/pic to spinlocks so we could 
queue the interrupt from the PIT directly instead of using 
KVM_REQ_PENDING_TIMER which keeps confusing me.  Chris Lalancette posted 
a patchset for this a while back but it was never completed.

I'm not really happy with adding lots of spin_lock_irqsave()s though, 
especially on the ioapic which may iterate over all vcpus (not worried 
about scaling, but about a malicious guest hurting host latency).

An alternative is to make kvm_set_irq() irq safe: if msi, do things 
immediately, otherwise post a work item.  So we can call kvm_set_irq() 
directly from the interrupt.
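
A rough sketch of that dispatch, with hypothetical example_* names (this is
not a real KVM patch): the MSI path is injected immediately since it is
atomic, while the sleepable ioapic/pic path is deferred to a work item.

#include <linux/workqueue.h>

#define EXAMPLE_HOST_MSI	0x1	/* hypothetical flag */

struct example_assigned_dev {
	unsigned irq_requested_type;
	struct work_struct inject_work;	/* does the sleepable injection */
};

static void example_inject_msi(struct example_assigned_dev *dev);  /* atomic path */

/* safe to call directly from the host interrupt handler */
static void example_set_irq_from_irq(struct example_assigned_dev *dev)
{
	if (dev->irq_requested_type & EXAMPLE_HOST_MSI)
		example_inject_msi(dev);		/* immediate, does not sleep */
	else
		schedule_work(&dev->inject_work);	/* ioapic/pic path deferred */
}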

Alternative alternative (perhaps better for short term): switch 
assigned_dev_lock to a mutex (we're already in a work handler, no need 
for spinlock).  The race between the irq and removal of an assigned 
device is closed by free_irq():


  lock
  mark assigned device as going away
  unlock
  free_irq()
  actually kill it

like invalid mmu pages.
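
The same teardown, as a minimal sketch with hypothetical example_* names
(not a real patch): the flag is set under the mutex, free_irq() then
guarantees no interrupt handler is running or will run, and only afterwards
is the device actually destroyed.

#include <linux/mutex.h>
#include <linux/interrupt.h>
#include <linux/types.h>

struct example_dev_state {
	struct mutex lock;
	bool going_away;
	int host_irq;
};

static void example_destroy(struct example_dev_state *dev);	/* "actually kill it" */

static void example_teardown(struct example_dev_state *dev)
{
	mutex_lock(&dev->lock);
	dev->going_away = true;		/* handlers and work items check this and bail out */
	mutex_unlock(&dev->lock);
	free_irq(dev->host_irq, dev);	/* waits for in-flight interrupt handlers */
	example_destroy(dev);
}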

-- 

Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-23 11:05                       ` Avi Kivity
@ 2010-04-23 13:02                         ` Chris Lalancette
  2010-04-23 13:30                           ` Avi Kivity
  2010-04-23 17:03                         ` KVM: convert ioapic lock to spinlock Marcelo Tosatti
  1 sibling, 1 reply; 21+ messages in thread
From: Chris Lalancette @ 2010-04-23 13:02 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Gleb Natapov, Marcelo Tosatti, Yang, Sheng, kvm, bonenkamp, Chris Wright

On 04/23/2010 07:05 AM, Avi Kivity wrote:
> On 04/22/2010 10:55 PM, Gleb Natapov wrote:
>>
>>
>>>> What about converting PIC/IOAPIC mutexes into spinlocks?
>>>>        
>>> Works for me, but on large guests the spinning will be noticeable.
>>> I believe.
>>>      
>> For interrupts going through IOPIC, but we know this is not scalable
>> anyway.
>>    
> 
> Yes.  We also wanted to convert the ioapic/pic to spinlocks so we could
> queue the interrupt from the PIT directly instead of using
> KVM_REQ_PENDING_TIMER which keeps confusing me.  Chris Lalancette posted
> a patchset for this a while back but it was never completed.

Yeah, I'm sorry I never completed it.  It turns out that with the HPET
changes that went in around the time I was looking at it, that set of
patches wasn't really required to fix the problem I was seeing with kdump.

That being said, if it's useful to somebody, I can repost the patches
(though they are woefully out-of-date now).  Let me know if you want
to see them again.

-- 
Chris Lalancette

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [UNTESTED] KVM: do not call kvm_set_irq from irq disabled section
  2010-04-23 13:02                         ` Chris Lalancette
@ 2010-04-23 13:30                           ` Avi Kivity
  0 siblings, 0 replies; 21+ messages in thread
From: Avi Kivity @ 2010-04-23 13:30 UTC (permalink / raw)
  To: Chris Lalancette
  Cc: Gleb Natapov, Marcelo Tosatti, Yang, Sheng, kvm, bonenkamp, Chris Wright

On 04/23/2010 04:02 PM, Chris Lalancette wrote:
> On 04/23/2010 07:05 AM, Avi Kivity wrote:
>    
>> On 04/22/2010 10:55 PM, Gleb Natapov wrote:
>>      
>>>
>>>        
>>>>> What about converting PIC/IOAPIC mutexes into spinlocks?
>>>>>
>>>>>            
>>>> Works for me, but on large guests the spinning will be noticeable.
>>>> I believe.
>>>>
>>>>          
>>> For interrupts going through IOPIC, but we know this is not scalable
>>> anyway.
>>>
>>>        
>> Yes.  We also wanted to convert the ioapic/pic to spinlocks so we could
>> queue the interrupt from the PIT directly instead of using
>> KVM_REQ_PENDING_TIMER which keeps confusing me.  Chris Lalancette posted
>> a patchset for this a while back but it was never completed.
>>      
> Yeah, I'm sorry I never completed it.  It turns out that with the HPET
> changes that went in around the time I was looking at it, that set of
> patches wasn't really required to fix the problem I was seeing with kdump.
>
> That being said, if it's useful to somebody, I can repost the patches
> (though they are woefully out-of-date now).  Let me know if you want
> to see them again.
>    

Let's see if one of the alternatives works out.  I prefer to keep the 
critical sections short.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* KVM: convert ioapic lock to spinlock
  2010-04-23 11:05                       ` Avi Kivity
  2010-04-23 13:02                         ` Chris Lalancette
@ 2010-04-23 17:03                         ` Marcelo Tosatti
  1 sibling, 0 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2010-04-23 17:03 UTC (permalink / raw)
  To: Avi Kivity, bonenkamp; +Cc: Gleb Natapov, Yang, Sheng, kvm, Chris Wright


Ralf, can you give this patch a try, please? (revert the first one
first).

-----

kvm_set_irq is used from non-sleepable contexts, so convert the ioapic lock
from a mutex to a spinlock.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 3db15a8..8edd6fd 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -196,7 +196,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
 	union kvm_ioapic_redirect_entry entry;
 	int ret = 1;
 
-	mutex_lock(&ioapic->lock);
+	spin_lock(&ioapic->lock);
 	if (irq >= 0 && irq < IOAPIC_NUM_PINS) {
 		entry = ioapic->redirtbl[irq];
 		level ^= entry.fields.polarity;
@@ -213,7 +213,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
 		}
 		trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0);
 	}
-	mutex_unlock(&ioapic->lock);
+	spin_unlock(&ioapic->lock);
 
 	return ret;
 }
@@ -237,9 +237,9 @@ static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
 		 * is dropped it will be put into irr and will be delivered
 		 * after ack notifier returns.
 		 */
-		mutex_unlock(&ioapic->lock);
+		spin_unlock(&ioapic->lock);
 		kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i);
-		mutex_lock(&ioapic->lock);
+		spin_lock(&ioapic->lock);
 
 		if (trigger_mode != IOAPIC_LEVEL_TRIG)
 			continue;
@@ -258,9 +258,9 @@ void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode)
 	smp_rmb();
 	if (!test_bit(vector, ioapic->handled_vectors))
 		return;
-	mutex_lock(&ioapic->lock);
+	spin_lock(&ioapic->lock);
 	__kvm_ioapic_update_eoi(ioapic, vector, trigger_mode);
-	mutex_unlock(&ioapic->lock);
+	spin_unlock(&ioapic->lock);
 }
 
 static inline struct kvm_ioapic *to_ioapic(struct kvm_io_device *dev)
@@ -286,7 +286,7 @@ static int ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
 	ASSERT(!(addr & 0xf));	/* check alignment */
 
 	addr &= 0xff;
-	mutex_lock(&ioapic->lock);
+	spin_lock(&ioapic->lock);
 	switch (addr) {
 	case IOAPIC_REG_SELECT:
 		result = ioapic->ioregsel;
@@ -300,7 +300,7 @@ static int ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
 		result = 0;
 		break;
 	}
-	mutex_unlock(&ioapic->lock);
+	spin_unlock(&ioapic->lock);
 
 	switch (len) {
 	case 8:
@@ -337,7 +337,7 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
 	}
 
 	addr &= 0xff;
-	mutex_lock(&ioapic->lock);
+	spin_lock(&ioapic->lock);
 	switch (addr) {
 	case IOAPIC_REG_SELECT:
 		ioapic->ioregsel = data;
@@ -355,7 +355,7 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
 	default:
 		break;
 	}
-	mutex_unlock(&ioapic->lock);
+	spin_unlock(&ioapic->lock);
 	return 0;
 }
 
@@ -385,7 +385,7 @@ int kvm_ioapic_init(struct kvm *kvm)
 	ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL);
 	if (!ioapic)
 		return -ENOMEM;
-	mutex_init(&ioapic->lock);
+	spin_lock_init(&ioapic->lock);
 	kvm->arch.vioapic = ioapic;
 	kvm_ioapic_reset(ioapic);
 	kvm_iodevice_init(&ioapic->dev, &ioapic_mmio_ops);
@@ -418,9 +418,9 @@ int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
 	if (!ioapic)
 		return -EINVAL;
 
-	mutex_lock(&ioapic->lock);
+	spin_lock(&ioapic->lock);
 	memcpy(state, ioapic, sizeof(struct kvm_ioapic_state));
-	mutex_unlock(&ioapic->lock);
+	spin_unlock(&ioapic->lock);
 	return 0;
 }
 
@@ -430,9 +430,9 @@ int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
 	if (!ioapic)
 		return -EINVAL;
 
-	mutex_lock(&ioapic->lock);
+	spin_lock(&ioapic->lock);
 	memcpy(ioapic, state, sizeof(struct kvm_ioapic_state));
 	update_handled_vectors(ioapic);
-	mutex_unlock(&ioapic->lock);
+	spin_unlock(&ioapic->lock);
 	return 0;
 }
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 8a751b7..0b190c3 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -45,7 +45,7 @@ struct kvm_ioapic {
 	struct kvm_io_device dev;
 	struct kvm *kvm;
 	void (*ack_notifier)(void *opaque, int irq);
-	struct mutex lock;
+	spinlock_t lock;
 	DECLARE_BITMAP(handled_vectors, 256);
 };
 

^ permalink raw reply related	[flat|nested] 21+ messages in thread
