kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
* KVM Arm Device passthrough and linux-rt
@ 2019-06-04 12:58 Julien Grall
  2019-06-04 13:16 ` Steven Rostedt
  2019-06-04 13:50 ` Sebastian Andrzej Siewior
  0 siblings, 2 replies; 6+ messages in thread
From: Julien Grall @ 2019-06-04 12:58 UTC (permalink / raw)
  To: linux-rt-users, linux-arm-kernel, kvmarm
  Cc: Marc Zyngier, Sebastian Andrzej Siewior, Steven Rostedt, julia

Hi,

While trying device passthrough on Linux-rt with KVM Arm, I had
the following splat.

[  363.410141] 000: BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974
[  363.410150] 000: in_atomic(): 0, irqs_disabled(): 128, pid: 2916, name: qemu-system-aar
[  363.410153] 000: 4 locks held by qemu-system-aar/2916:
[  363.410157] 000:  #0: ffff8007bd248100 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x70/0xae0
[  363.410171] 000:  #1: ffff8007bd1e2b20 (&kvm->irq_srcu){....}, at: kvm_notify_acked_irq+0x7c/0x300
[  363.410179] 000:  #2: ffff8007bd1e2b20 (&kvm->irq_srcu){....}, at: irqfd_resampler_ack+0x0/0xd8
[  363.410187] 000:  #3: ffff8007c2b27d28 (&ctx->wqh#2){+.+.}, at: eventfd_signal+0x24/0x78
[  363.410196] 000: irq event stamp: 4033894
[  363.410197] 000: hardirqs last  enabled at (4033893): [<ffff000011119400>] _raw_spin_unlock_irqrestore+0x88/0x90
[  363.410203] 000: hardirqs last disabled at (4033894): [<ffff0000100bed18>] kvm_arch_vcpu_ioctl_run+0x2a8/0xc08
[  363.410207] 000: softirqs last  enabled at (0): [<ffff0000100edde0>] copy_process.isra.1.part.2+0x8d8/0x1958
[  363.410212] 000: softirqs last disabled at (0): [<0000000000000000>]  (null)
[  363.410216] 000: CPU: 0 PID: 2916 Comm: qemu-system-aar Tainted: G        W       5.0.14-rt9-00013-g4b2a13c8a804 #84
[  363.410219] 000: Hardware name: AMD Seattle (Rev.B0) Development Board (Overdrive) (DT)
[  363.410221] 000: Call trace:
[  363.410222] 000:  dump_backtrace+0x0/0x158
[  363.410225] 000:  show_stack+0x14/0x20
[  363.410227] 000:  dump_stack+0xa0/0xd4
[  363.410230] 000:  ___might_sleep+0x16c/0x1f8
[  363.410234] 000:  rt_spin_lock+0x5c/0x70
[  363.410237] 000:  eventfd_signal+0x24/0x78
[  363.410238] 000:  irqfd_resampler_ack+0x94/0xd8
[  363.410241] 000:  kvm_notify_acked_irq+0xf8/0x300
[  363.410243] 000:  vgic_v2_fold_lr_state+0x174/0x1e0
[  363.410246] 000:  kvm_vgic_sync_hwstate+0x5c/0x2b8
[  363.410249] 000:  kvm_arch_vcpu_ioctl_run+0x624/0xc08
[  363.410250] 000:  kvm_vcpu_ioctl+0x3a0/0xae0
[  363.410252] 000:  do_vfs_ioctl+0xbc/0x910
[  363.410255] 000:  ksys_ioctl+0x78/0xa8
[  363.410257] 000:  __arm64_sys_ioctl+0x1c/0x28
[  363.410260] 000:  el0_svc_common+0x90/0x118
[  363.410263] 000:  el0_svc_handler+0x2c/0x80
[  363.410265] 000:  el0_svc+0x8/0xc

This is happening because vgic_v2_fold_lr_state() is expected
to be called with interrupt disabled. However, some of the path
(e.g eventfd) will take a spinlock.

The spinlock is from the waitqueue, so using a raw_spin_lock cannot
even be considered.

Do you have any input on how this could be solved?

Cheers,

-- 
Julien Grall
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KVM Arm Device passthrough and linux-rt
  2019-06-04 12:58 KVM Arm Device passthrough and linux-rt Julien Grall
@ 2019-06-04 13:16 ` Steven Rostedt
  2019-06-04 13:53   ` Marc Zyngier
  2019-06-04 13:50 ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 6+ messages in thread
From: Steven Rostedt @ 2019-06-04 13:16 UTC (permalink / raw)
  To: Julien Grall
  Cc: linux-rt-users, julia, Marc Zyngier, Sebastian Andrzej Siewior,
	kvmarm, linux-arm-kernel

On Tue, 4 Jun 2019 13:58:51 +0100
Julien Grall <julien.grall@arm.com> wrote:

> This is happening because vgic_v2_fold_lr_state() is expected
> to be called with interrupt disabled. However, some of the path
> (e.g eventfd) will take a spinlock.
> 
> The spinlock is from the waitqueue, so using a raw_spin_lock cannot
> even be considered.
> 
> Do you have any input on how this could be solved?

What's the reason that vgic_v2_fold_lr_state() expects interrupts to be
disabled?

-- Steve
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KVM Arm Device passthrough and linux-rt
  2019-06-04 12:58 KVM Arm Device passthrough and linux-rt Julien Grall
  2019-06-04 13:16 ` Steven Rostedt
@ 2019-06-04 13:50 ` Sebastian Andrzej Siewior
  1 sibling, 0 replies; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-06-04 13:50 UTC (permalink / raw)
  To: Julien Grall
  Cc: linux-rt-users, julia, Marc Zyngier, Steven Rostedt, kvmarm,
	linux-arm-kernel

On 2019-06-04 13:58:51 [+0100], Julien Grall wrote:
> Hi,
Hi,

> This is happening because vgic_v2_fold_lr_state() is expected
> to be called with interrupt disabled. However, some of the path
> (e.g eventfd) will take a spinlock.
> 
> The spinlock is from the waitqueue, so using a raw_spin_lock cannot
> even be considered.
> 
> Do you have any input on how this could be solved?

There is swair (init_swait_queue_head() and friends) in case that works
for you.

> Cheers,

Sebastian
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KVM Arm Device passthrough and linux-rt
  2019-06-04 13:16 ` Steven Rostedt
@ 2019-06-04 13:53   ` Marc Zyngier
  2019-06-04 14:28     ` Marc Zyngier
  2019-06-04 15:04     ` Steven Rostedt
  0 siblings, 2 replies; 6+ messages in thread
From: Marc Zyngier @ 2019-06-04 13:53 UTC (permalink / raw)
  To: Steven Rostedt, Julien Grall
  Cc: Sebastian Andrzej Siewior, julia, linux-rt-users,
	linux-arm-kernel, kvmarm

On 04/06/2019 14:16, Steven Rostedt wrote:
> On Tue, 4 Jun 2019 13:58:51 +0100
> Julien Grall <julien.grall@arm.com> wrote:
> 
>> This is happening because vgic_v2_fold_lr_state() is expected
>> to be called with interrupt disabled. However, some of the path
>> (e.g eventfd) will take a spinlock.
>>
>> The spinlock is from the waitqueue, so using a raw_spin_lock cannot
>> even be considered.
>>
>> Do you have any input on how this could be solved?
> 
> What's the reason that vgic_v2_fold_lr_state() expects interrupts to be
> disabled?

That's to prevent the injection of an interrupt firing on the same CPU
while we're saving the corresponding vcpu interrupt context, among other
things (the whole guest exit path runs with interrupt disabled in order
to avoid this kind of thing).

One possibility would be to accumulate the set of interrupt that require
resampling (which is bound to be small, the number of LRs at most) and
call kvm_notify_acked_irq on that set once interrupts are re-enabled.

I'll have a look...

	M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: KVM Arm Device passthrough and linux-rt
  2019-06-04 13:53   ` Marc Zyngier
@ 2019-06-04 14:28     ` Marc Zyngier
  2019-06-04 15:04     ` Steven Rostedt
  1 sibling, 0 replies; 6+ messages in thread
From: Marc Zyngier @ 2019-06-04 14:28 UTC (permalink / raw)
  To: Steven Rostedt, Julien Grall
  Cc: Sebastian Andrzej Siewior, julia, linux-rt-users,
	linux-arm-kernel, kvmarm

On 04/06/2019 14:53, Marc Zyngier wrote:
> On 04/06/2019 14:16, Steven Rostedt wrote:
>> On Tue, 4 Jun 2019 13:58:51 +0100
>> Julien Grall <julien.grall@arm.com> wrote:
>>
>>> This is happening because vgic_v2_fold_lr_state() is expected
>>> to be called with interrupt disabled. However, some of the path
>>> (e.g eventfd) will take a spinlock.
>>>
>>> The spinlock is from the waitqueue, so using a raw_spin_lock cannot
>>> even be considered.
>>>
>>> Do you have any input on how this could be solved?
>>
>> What's the reason that vgic_v2_fold_lr_state() expects interrupts to be
>> disabled?
> 
> That's to prevent the injection of an interrupt firing on the same CPU
> while we're saving the corresponding vcpu interrupt context, among other
> things (the whole guest exit path runs with interrupt disabled in order
> to avoid this kind of thing).
> 
> One possibility would be to accumulate the set of interrupt that require
> resampling (which is bound to be small, the number of LRs at most) and
> call kvm_notify_acked_irq on that set once interrupts are re-enabled.
> 
> I'll have a look...

Julien,

Here's what I have in mind, compile-tested only. Please let me know how
it fares on your setup...

Thanks,

	M.

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index c36c86f1ec9a..9e587f5d90fa 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -307,6 +307,10 @@ struct vgic_cpu {
 	unsigned int used_lrs;
 	struct vgic_irq private_irqs[VGIC_NR_PRIVATE_IRQS];
 
+	/* SPIs to be resampled by SW at EOI time */
+	u16		spi_notify[VGIC_V2_MAX_LRS];
+	int		spi_notify_count;
+
 	raw_spinlock_t ap_list_lock;	/* Protects the ap_list */
 
 	/*
@@ -371,6 +375,7 @@ void kvm_vgic_put(struct kvm_vcpu *vcpu);
 bool kvm_vcpu_has_pending_irqs(struct kvm_vcpu *vcpu);
 void kvm_vgic_sync_hwstate(struct kvm_vcpu *vcpu);
 void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu);
+void kvm_vgic_process_resample(struct kvm_vcpu *vcpu);
 void kvm_vgic_reset_mapped_irq(struct kvm_vcpu *vcpu, u32 vintid);
 
 void vgic_v3_dispatch_sgi(struct kvm_vcpu *vcpu, u64 reg, bool allow_group1);
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 7eeebe5e9da2..a95c46b2d723 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -827,6 +827,12 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		guest_exit();
 		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 
+		/*
+		 * Now that interrupts are enabled, signal interrupts
+		 * that require SW resampling.
+		 */
+		kvm_vgic_process_resample(vcpu);
+
 		/* Exit types that need handling before we can be preempted */
 		handle_exit_early(vcpu, run, ret);
 
diff --git a/virt/kvm/arm/vgic/vgic-v2.c b/virt/kvm/arm/vgic/vgic-v2.c
index d91a8938aa7c..9953e1742d65 100644
--- a/virt/kvm/arm/vgic/vgic-v2.c
+++ b/virt/kvm/arm/vgic/vgic-v2.c
@@ -79,8 +79,7 @@ void vgic_v2_fold_lr_state(struct kvm_vcpu *vcpu)
 
 		/* Notify fds when the guest EOI'ed a level-triggered SPI */
 		if (lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid))
-			kvm_notify_acked_irq(vcpu->kvm, 0,
-					     intid - VGIC_NR_PRIVATE_IRQS);
+			vgic_cpu->spi_notify[vgic_cpu->spi_notify_count++] = intid;
 
 		irq = vgic_get_irq(vcpu->kvm, vcpu, intid);
 
diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c
index 9f87e58dbd4a..54b1e4ea5cfa 100644
--- a/virt/kvm/arm/vgic/vgic-v3.c
+++ b/virt/kvm/arm/vgic/vgic-v3.c
@@ -69,8 +69,7 @@ void vgic_v3_fold_lr_state(struct kvm_vcpu *vcpu)
 
 		/* Notify fds when the guest EOI'ed a level-triggered IRQ */
 		if (lr_signals_eoi_mi(val) && vgic_valid_spi(vcpu->kvm, intid))
-			kvm_notify_acked_irq(vcpu->kvm, 0,
-					     intid - VGIC_NR_PRIVATE_IRQS);
+			vgic_cpu->spi_notify[vgic_cpu->spi_notify_count++] = intid;
 
 		irq = vgic_get_irq(vcpu->kvm, vcpu, intid);
 		if (!irq)	/* An LPI could have been unmapped. */
diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 191deccf60bf..baa6aa494e86 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -880,6 +880,8 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 {
 	WARN_ON(vgic_v4_flush_hwstate(vcpu));
 
+	vcpu->arch.vgic_cpu.spi_notify_count = 0;
+
 	/*
 	 * If there are no virtual interrupts active or pending for this
 	 * VCPU, then there is no work to do and we can bail out without
@@ -908,6 +910,16 @@ void kvm_vgic_flush_hwstate(struct kvm_vcpu *vcpu)
 		vgic_restore_state(vcpu);
 }
 
+void kvm_vgic_process_resample(struct kvm_vcpu *vcpu)
+{
+	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
+	int i;
+
+	for (i = 0; i < vgic_cpu->spi_notify_count; i++)
+		kvm_notify_acked_irq(vcpu->kvm, 0,
+				     vgic_cpu->spi_notify[i] - VGIC_NR_PRIVATE_IRQS);
+}
+
 void kvm_vgic_load(struct kvm_vcpu *vcpu)
 {
 	if (unlikely(!vgic_initialized(vcpu->kvm)))

-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: KVM Arm Device passthrough and linux-rt
  2019-06-04 13:53   ` Marc Zyngier
  2019-06-04 14:28     ` Marc Zyngier
@ 2019-06-04 15:04     ` Steven Rostedt
  1 sibling, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2019-06-04 15:04 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-rt-users, julia, Sebastian Andrzej Siewior, Julien Grall,
	kvmarm, linux-arm-kernel

On Tue, 4 Jun 2019 14:53:26 +0100
Marc Zyngier <marc.zyngier@arm.com> wrote:

> That's to prevent the injection of an interrupt firing on the same CPU
> while we're saving the corresponding vcpu interrupt context, among other
> things (the whole guest exit path runs with interrupt disabled in order
> to avoid this kind of thing).

Can't we use a per_cpu local lock for this?

-- Steve
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-06-04 16:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-04 12:58 KVM Arm Device passthrough and linux-rt Julien Grall
2019-06-04 13:16 ` Steven Rostedt
2019-06-04 13:53   ` Marc Zyngier
2019-06-04 14:28     ` Marc Zyngier
2019-06-04 15:04     ` Steven Rostedt
2019-06-04 13:50 ` Sebastian Andrzej Siewior

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).