All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3 0/8] make interrupt injection lockless (almost)
@ 2009-08-12 12:17 Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 1/8] Do not call ack notifiers on PIC reset Gleb Natapov
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm

kvm->irq_lock protects too much stuff, but still fail to protect
everything it was design to protect (see ack notifiers call in pic). I
want to make IRQ injection fast path as lockless as possible. This patch
series removes kvm->irq_lock from irq injection path effectively making
interrupt injection to lapic lockless (several kvm_irq_delivery_to_apic()
may run in parallel), but access to lapic was never fully locked in the
first place. VCPU could access lapic in parallel with interrupt injection.
Patches 1/2 changes irq routing data structure to much more efficient one.

v1->v2:
  Drop MSI injection interface (for now).
  Use irq_lock to protect irq routing and ack notifiers.
  Splitting irq routing table changes to two patches (+ comments addressed).
  Drop ioapic/pic lock before calling ack notifiers.
v2->v3
  Drop patch that changes irq_lock to spinlock.
  Use mutex for ioapic lock.
  Do not call ack notifier if there is no GSI mapping.
  Call pic_clear_isr() after PIC state completely changed.

Gleb Natapov (8):
  Do not call ack notifiers on PIC reset.
  Change irq routing table to use gsi indexed array.
  Maintain back mapping from irqchip/pin to gsi.
  Move irq routing data structure to rcu locking
  Move irq ack notifier list to arch independent code.
  Convert irq notifiers lists to RCU locking.
  Move IO APIC to its own lock.
  Drop kvm->irq_lock lock from irq injection path.

 arch/ia64/include/asm/kvm.h      |    1 +
 arch/ia64/include/asm/kvm_host.h |    1 -
 arch/ia64/kvm/kvm-ia64.c         |    9 +--
 arch/x86/include/asm/kvm.h       |    1 +
 arch/x86/include/asm/kvm_host.h  |    1 -
 arch/x86/kvm/i8254.c             |    2 -
 arch/x86/kvm/i8259.c             |   38 +++++-------
 arch/x86/kvm/lapic.c             |    7 +--
 arch/x86/kvm/x86.c               |   12 +---
 include/linux/kvm_host.h         |   18 +++++-
 virt/kvm/eventfd.c               |    2 -
 virt/kvm/ioapic.c                |   80 ++++++++++++++++++------
 virt/kvm/ioapic.h                |    4 +
 virt/kvm/irq_comm.c              |  127 ++++++++++++++++++++------------------
 virt/kvm/kvm_main.c              |    4 +-
 15 files changed, 171 insertions(+), 136 deletions(-)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH v3 1/8] Do not call ack notifiers on PIC reset.
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-13  9:11   ` Marcelo Tosatti
  2009-08-12 12:17 ` [PATCH v3 2/8] Change irq routing table to use gsi indexed array Gleb Natapov
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm

For device assigned it may cause host hang since ack notifier callback
enables host interrupt and guest not necessary cleared interrupt
condition in an assigned device. For PIT we should not call ack notifier
here since interrupt was not acked by a guest and should be redelivered.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/x86/kvm/i8259.c |   16 ----------------
 1 files changed, 0 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index 01f1516..eb2b8b7 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -225,22 +225,6 @@ int kvm_pic_read_irq(struct kvm *kvm)
 
 void kvm_pic_reset(struct kvm_kpic_state *s)
 {
-	int irq, irqbase, n;
-	struct kvm *kvm = s->pics_state->irq_request_opaque;
-	struct kvm_vcpu *vcpu0 = kvm->bsp_vcpu;
-
-	if (s == &s->pics_state->pics[0])
-		irqbase = 0;
-	else
-		irqbase = 8;
-
-	for (irq = 0; irq < PIC_NUM_PINS/2; irq++) {
-		if (vcpu0 && kvm_apic_accept_pic_intr(vcpu0))
-			if (s->irr & (1 << irq) || s->isr & (1 << irq)) {
-				n = irq + irqbase;
-				kvm_notify_acked_irq(kvm, SELECT_PIC(n), n);
-			}
-	}
 	s->last_irr = 0;
 	s->irr = 0;
 	s->imr = 0;
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 2/8] Change irq routing table to use gsi indexed array.
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 1/8] Do not call ack notifiers on PIC reset Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 3/8] Maintain back mapping from irqchip/pin to gsi Gleb Natapov
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm

Use gsi indexed array instead of scanning all entries on each interrupt
injection.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 include/linux/kvm_host.h |   16 ++++++++--
 virt/kvm/irq_comm.c      |   77 +++++++++++++++++++++++++---------------------
 virt/kvm/kvm_main.c      |    1 -
 3 files changed, 55 insertions(+), 39 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f814512..09df31d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -129,7 +129,17 @@ struct kvm_kernel_irq_routing_entry {
 		} irqchip;
 		struct msi_msg msi;
 	};
-	struct list_head link;
+	struct hlist_node link;
+};
+
+struct kvm_irq_routing_table {
+	struct kvm_kernel_irq_routing_entry *rt_entries;
+	u32 nr_rt_entries;
+	/*
+	 * Array indexed by gsi. Each entry contains list of irq chips
+	 * the gsi is connected to.
+	 */
+	struct hlist_head map[0];
 };
 
 struct kvm {
@@ -167,7 +177,7 @@ struct kvm {
 
 	struct mutex irq_lock;
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
-	struct list_head irq_routing; /* of kvm_kernel_irq_routing_entry */
+	struct kvm_irq_routing_table *irq_routing;
 	struct hlist_head mask_notifier_list;
 #endif
 
@@ -396,7 +406,7 @@ void kvm_get_intr_delivery_bitmask(struct kvm_ioapic *ioapic,
 				   union kvm_ioapic_redirect_entry *entry,
 				   unsigned long *deliver_bitmask);
 #endif
-int kvm_set_irq(struct kvm *kvm, int irq_source_id, int irq, int level);
+int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level);
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin);
 void kvm_register_irq_ack_notifier(struct kvm *kvm,
 				   struct kvm_irq_ack_notifier *kian);
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 001663f..a9d2262 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -122,11 +122,13 @@ static int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
  *  = 0   Interrupt was coalesced (previous irq is still pending)
  *  > 0   Number of CPUs interrupt was delivered to
  */
-int kvm_set_irq(struct kvm *kvm, int irq_source_id, int irq, int level)
+int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 {
 	struct kvm_kernel_irq_routing_entry *e;
 	unsigned long *irq_state, sig_level;
 	int ret = -1;
+	struct kvm_irq_routing_table *irq_rt;
+	struct hlist_node *n;
 
 	trace_kvm_set_irq(irq, level, irq_source_id);
 
@@ -150,8 +152,9 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, int irq, int level)
 	 * IOAPIC.  So set the bit in both. The guest will ignore
 	 * writes to the unused one.
 	 */
-	list_for_each_entry(e, &kvm->irq_routing, link)
-		if (e->gsi == irq) {
+	irq_rt = kvm->irq_routing;
+	if (irq < irq_rt->nr_rt_entries)
+		hlist_for_each_entry(e, n, &irq_rt->map[irq], link) {
 			int r = e->set(e, kvm, sig_level);
 			if (r < 0)
 				continue;
@@ -163,20 +166,23 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, int irq, int level)
 
 void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 {
-	struct kvm_kernel_irq_routing_entry *e;
 	struct kvm_irq_ack_notifier *kian;
 	struct hlist_node *n;
 	unsigned gsi = pin;
+	int i;
 
 	trace_kvm_ack_irq(irqchip, pin);
 
-	list_for_each_entry(e, &kvm->irq_routing, link)
+	for (i = 0; i < kvm->irq_routing->nr_rt_entries; i++) {
+		struct kvm_kernel_irq_routing_entry *e;
+		e = &kvm->irq_routing->rt_entries[i];
 		if (e->type == KVM_IRQ_ROUTING_IRQCHIP &&
 		    e->irqchip.irqchip == irqchip &&
 		    e->irqchip.pin == pin) {
 			gsi = e->gsi;
 			break;
 		}
+	}
 
 	hlist_for_each_entry(kian, n, &kvm->arch.irq_ack_notifier_list, link)
 		if (kian->gsi == gsi)
@@ -267,22 +273,15 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, int irq, bool mask)
 			kimn->func(kimn, mask);
 }
 
-static void __kvm_free_irq_routing(struct list_head *irq_routing)
-{
-	struct kvm_kernel_irq_routing_entry *e, *n;
-
-	list_for_each_entry_safe(e, n, irq_routing, link)
-		kfree(e);
-}
-
 void kvm_free_irq_routing(struct kvm *kvm)
 {
 	mutex_lock(&kvm->irq_lock);
-	__kvm_free_irq_routing(&kvm->irq_routing);
+	kfree(kvm->irq_routing);
 	mutex_unlock(&kvm->irq_lock);
 }
 
-static int setup_routing_entry(struct kvm_kernel_irq_routing_entry *e,
+static int setup_routing_entry(struct kvm_irq_routing_table *rt,
+			       struct kvm_kernel_irq_routing_entry *e,
 			       const struct kvm_irq_routing_entry *ue)
 {
 	int r = -EINVAL;
@@ -319,6 +318,8 @@ static int setup_routing_entry(struct kvm_kernel_irq_routing_entry *e,
 	default:
 		goto out;
 	}
+
+	hlist_add_head(&e->link, &rt->map[e->gsi]);
 	r = 0;
 out:
 	return r;
@@ -330,43 +331,49 @@ int kvm_set_irq_routing(struct kvm *kvm,
 			unsigned nr,
 			unsigned flags)
 {
-	struct list_head irq_list = LIST_HEAD_INIT(irq_list);
-	struct list_head tmp = LIST_HEAD_INIT(tmp);
-	struct kvm_kernel_irq_routing_entry *e = NULL;
-	unsigned i;
+	struct kvm_irq_routing_table *new, *old;
+	u32 i, nr_rt_entries = 0;
 	int r;
 
 	for (i = 0; i < nr; ++i) {
+		if (ue[i].gsi >= KVM_MAX_IRQ_ROUTES)
+			return -EINVAL;
+		nr_rt_entries = max(nr_rt_entries, ue[i].gsi);
+	}
+
+	nr_rt_entries += 1;
+
+	new = kzalloc(sizeof(*new) + (nr_rt_entries * sizeof(struct hlist_head))
+		      + (nr * sizeof(struct kvm_kernel_irq_routing_entry)),
+		      GFP_KERNEL);
+
+	if (!new)
+		return -ENOMEM;
+
+	new->rt_entries = (void *)&new->map[nr_rt_entries];
+
+	new->nr_rt_entries = nr_rt_entries;
+
+	for (i = 0; i < nr; ++i) {
 		r = -EINVAL;
-		if (ue->gsi >= KVM_MAX_IRQ_ROUTES)
-			goto out;
 		if (ue->flags)
 			goto out;
-		r = -ENOMEM;
-		e = kzalloc(sizeof(*e), GFP_KERNEL);
-		if (!e)
-			goto out;
-		r = setup_routing_entry(e, ue);
+		r = setup_routing_entry(new, &new->rt_entries[i], ue);
 		if (r)
 			goto out;
 		++ue;
-		list_add(&e->link, &irq_list);
-		e = NULL;
 	}
 
 	mutex_lock(&kvm->irq_lock);
-	list_splice(&kvm->irq_routing, &tmp);
-	INIT_LIST_HEAD(&kvm->irq_routing);
-	list_splice(&irq_list, &kvm->irq_routing);
-	INIT_LIST_HEAD(&irq_list);
-	list_splice(&tmp, &irq_list);
+	old = kvm->irq_routing;
+	kvm->irq_routing = new;
 	mutex_unlock(&kvm->irq_lock);
 
+	new = old;
 	r = 0;
 
 out:
-	kfree(e);
-	__kvm_free_irq_routing(&irq_list);
+	kfree(new);
 	return r;
 }
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1df4c04..e8b03ee 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -944,7 +944,6 @@ static struct kvm *kvm_create_vm(void)
 	if (IS_ERR(kvm))
 		goto out;
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
-	INIT_LIST_HEAD(&kvm->irq_routing);
 	INIT_HLIST_HEAD(&kvm->mask_notifier_list);
 #endif
 
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 3/8] Maintain back mapping from irqchip/pin to gsi.
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 1/8] Do not call ack notifiers on PIC reset Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 2/8] Change irq routing table to use gsi indexed array Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-17  9:43   ` Avi Kivity
  2009-08-12 12:17 ` [PATCH v3 4/8] Move irq routing data structure to rcu locking Gleb Natapov
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm

Maintain back mapping from irqchip/pin to gsi to speedup
interrupt acknowledgment notifications.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/ia64/include/asm/kvm.h |    1 +
 arch/x86/include/asm/kvm.h  |    1 +
 include/linux/kvm_host.h    |    1 +
 virt/kvm/irq_comm.c         |   31 ++++++++++++++-----------------
 4 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/ia64/include/asm/kvm.h b/arch/ia64/include/asm/kvm.h
index 18a7e49..bc90c75 100644
--- a/arch/ia64/include/asm/kvm.h
+++ b/arch/ia64/include/asm/kvm.h
@@ -60,6 +60,7 @@ struct kvm_ioapic_state {
 #define KVM_IRQCHIP_PIC_MASTER   0
 #define KVM_IRQCHIP_PIC_SLAVE    1
 #define KVM_IRQCHIP_IOAPIC       2
+#define KVM_NR_IRQCHIPS          3
 
 #define KVM_CONTEXT_SIZE	8*1024
 
diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 4a5fe91..f02e87a 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -79,6 +79,7 @@ struct kvm_ioapic_state {
 #define KVM_IRQCHIP_PIC_MASTER   0
 #define KVM_IRQCHIP_PIC_SLAVE    1
 #define KVM_IRQCHIP_IOAPIC       2
+#define KVM_NR_IRQCHIPS          3
 
 /* for KVM_GET_REGS and KVM_SET_REGS */
 struct kvm_regs {
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 09df31d..7ac58df 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -133,6 +133,7 @@ struct kvm_kernel_irq_routing_entry {
 };
 
 struct kvm_irq_routing_table {
+	int chip[KVM_NR_IRQCHIPS][KVM_IOAPIC_NUM_PINS];
 	struct kvm_kernel_irq_routing_entry *rt_entries;
 	u32 nr_rt_entries;
 	/*
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index a9d2262..00e4be7 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -168,25 +168,16 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 {
 	struct kvm_irq_ack_notifier *kian;
 	struct hlist_node *n;
-	unsigned gsi = pin;
-	int i;
+	unsigned gsi;
 
 	trace_kvm_ack_irq(irqchip, pin);
 
-	for (i = 0; i < kvm->irq_routing->nr_rt_entries; i++) {
-		struct kvm_kernel_irq_routing_entry *e;
-		e = &kvm->irq_routing->rt_entries[i];
-		if (e->type == KVM_IRQ_ROUTING_IRQCHIP &&
-		    e->irqchip.irqchip == irqchip &&
-		    e->irqchip.pin == pin) {
-			gsi = e->gsi;
-			break;
-		}
-	}
-
-	hlist_for_each_entry(kian, n, &kvm->arch.irq_ack_notifier_list, link)
-		if (kian->gsi == gsi)
-			kian->irq_acked(kian);
+	gsi = kvm->irq_routing->chip[irqchip][pin];
+	if (gsi != -1)
+		hlist_for_each_entry(kian, n, &kvm->arch.irq_ack_notifier_list,
+				     link)
+			if (kian->gsi == gsi)
+				kian->irq_acked(kian);
 }
 
 void kvm_register_irq_ack_notifier(struct kvm *kvm,
@@ -308,6 +299,9 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
 		}
 		e->irqchip.irqchip = ue->u.irqchip.irqchip;
 		e->irqchip.pin = ue->u.irqchip.pin + delta;
+		if (e->irqchip.pin > KVM_IOAPIC_NUM_PINS)
+			goto out;
+		rt->chip[ue->u.irqchip.irqchip][e->irqchip.pin] = ue->gsi;
 		break;
 	case KVM_IRQ_ROUTING_MSI:
 		e->set = kvm_set_msi;
@@ -332,7 +326,7 @@ int kvm_set_irq_routing(struct kvm *kvm,
 			unsigned flags)
 {
 	struct kvm_irq_routing_table *new, *old;
-	u32 i, nr_rt_entries = 0;
+	u32 i, j, nr_rt_entries = 0;
 	int r;
 
 	for (i = 0; i < nr; ++i) {
@@ -353,6 +347,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
 	new->rt_entries = (void *)&new->map[nr_rt_entries];
 
 	new->nr_rt_entries = nr_rt_entries;
+	for (i = 0; i < 3; i++)
+		for (j = 0; j < KVM_IOAPIC_NUM_PINS; j++)
+			new->chip[i][j] = -1;
 
 	for (i = 0; i < nr; ++i) {
 		r = -EINVAL;
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 4/8] Move irq routing data structure to rcu locking
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
                   ` (2 preceding siblings ...)
  2009-08-12 12:17 ` [PATCH v3 3/8] Maintain back mapping from irqchip/pin to gsi Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 5/8] Move irq ack notifier list to arch independent code Gleb Natapov
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm


Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 virt/kvm/irq_comm.c |   16 +++++++++++-----
 1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 00e4be7..35c3bbf 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -152,7 +152,8 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 	 * IOAPIC.  So set the bit in both. The guest will ignore
 	 * writes to the unused one.
 	 */
-	irq_rt = kvm->irq_routing;
+	rcu_read_lock();
+	irq_rt = rcu_dereference(kvm->irq_routing);
 	if (irq < irq_rt->nr_rt_entries)
 		hlist_for_each_entry(e, n, &irq_rt->map[irq], link) {
 			int r = e->set(e, kvm, sig_level);
@@ -161,6 +162,7 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 
 			ret = r + ((ret < 0) ? 0 : ret);
 		}
+	rcu_read_unlock();
 	return ret;
 }
 
@@ -172,7 +174,10 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 
 	trace_kvm_ack_irq(irqchip, pin);
 
-	gsi = kvm->irq_routing->chip[irqchip][pin];
+	rcu_read_lock();
+	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
+	rcu_read_unlock();
+
 	if (gsi != -1)
 		hlist_for_each_entry(kian, n, &kvm->arch.irq_ack_notifier_list,
 				     link)
@@ -266,9 +271,9 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, int irq, bool mask)
 
 void kvm_free_irq_routing(struct kvm *kvm)
 {
-	mutex_lock(&kvm->irq_lock);
+	/* Called only during vm destruction. Nobody can use the pointer
+	   at this stage */
 	kfree(kvm->irq_routing);
-	mutex_unlock(&kvm->irq_lock);
 }
 
 static int setup_routing_entry(struct kvm_irq_routing_table *rt,
@@ -363,8 +368,9 @@ int kvm_set_irq_routing(struct kvm *kvm,
 
 	mutex_lock(&kvm->irq_lock);
 	old = kvm->irq_routing;
-	kvm->irq_routing = new;
+	rcu_assign_pointer(kvm->irq_routing, new);
 	mutex_unlock(&kvm->irq_lock);
+	synchronize_rcu();
 
 	new = old;
 	r = 0;
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 5/8] Move irq ack notifier list to arch independent code.
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
                   ` (3 preceding siblings ...)
  2009-08-12 12:17 ` [PATCH v3 4/8] Move irq routing data structure to rcu locking Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 6/8] Convert irq notifiers lists to RCU locking Gleb Natapov
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm

Mask irq notifier list is already there.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/ia64/include/asm/kvm_host.h |    1 -
 arch/x86/include/asm/kvm_host.h  |    1 -
 include/linux/kvm_host.h         |    1 +
 virt/kvm/irq_comm.c              |    5 ++---
 virt/kvm/kvm_main.c              |    1 +
 5 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/ia64/include/asm/kvm_host.h b/arch/ia64/include/asm/kvm_host.h
index d9b6325..a362e67 100644
--- a/arch/ia64/include/asm/kvm_host.h
+++ b/arch/ia64/include/asm/kvm_host.h
@@ -475,7 +475,6 @@ struct kvm_arch {
 	struct list_head assigned_dev_head;
 	struct iommu_domain *iommu_domain;
 	int iommu_flags;
-	struct hlist_head irq_ack_notifier_list;
 
 	unsigned long irq_sources_bitmap;
 	unsigned long irq_states[KVM_IOAPIC_NUM_PINS];
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b17d845..7dfde38 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -400,7 +400,6 @@ struct kvm_arch{
 	struct kvm_pic *vpic;
 	struct kvm_ioapic *vioapic;
 	struct kvm_pit *vpit;
-	struct hlist_head irq_ack_notifier_list;
 	int vapics_in_nmi_mode;
 
 	unsigned int tss_addr;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7ac58df..0026e4a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -180,6 +180,7 @@ struct kvm {
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
 	struct kvm_irq_routing_table *irq_routing;
 	struct hlist_head mask_notifier_list;
+	struct hlist_head irq_ack_notifier_list;
 #endif
 
 #ifdef KVM_ARCH_WANT_MMU_NOTIFIER
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 35c3bbf..ad84758 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -179,8 +179,7 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 	rcu_read_unlock();
 
 	if (gsi != -1)
-		hlist_for_each_entry(kian, n, &kvm->arch.irq_ack_notifier_list,
-				     link)
+		hlist_for_each_entry(kian, n, &kvm->irq_ack_notifier_list, link)
 			if (kian->gsi == gsi)
 				kian->irq_acked(kian);
 }
@@ -189,7 +188,7 @@ void kvm_register_irq_ack_notifier(struct kvm *kvm,
 				   struct kvm_irq_ack_notifier *kian)
 {
 	mutex_lock(&kvm->irq_lock);
-	hlist_add_head(&kian->link, &kvm->arch.irq_ack_notifier_list);
+	hlist_add_head(&kian->link, &kvm->irq_ack_notifier_list);
 	mutex_unlock(&kvm->irq_lock);
 }
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e8b03ee..fcc0c44 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -945,6 +945,7 @@ static struct kvm *kvm_create_vm(void)
 		goto out;
 #ifdef CONFIG_HAVE_KVM_IRQCHIP
 	INIT_HLIST_HEAD(&kvm->mask_notifier_list);
+	INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
 #endif
 
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 6/8] Convert irq notifiers lists to RCU locking.
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
                   ` (4 preceding siblings ...)
  2009-08-12 12:17 ` [PATCH v3 5/8] Move irq ack notifier list to arch independent code Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-12 12:17 ` [PATCH v3 7/8] Move IO APIC to its own lock Gleb Natapov
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm

Use RCU locking for mask/ack notifiers lists.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 virt/kvm/irq_comm.c |   22 ++++++++++++----------
 1 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index ad84758..d276b42 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -176,19 +176,19 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
 
 	rcu_read_lock();
 	gsi = rcu_dereference(kvm->irq_routing)->chip[irqchip][pin];
-	rcu_read_unlock();
-
 	if (gsi != -1)
-		hlist_for_each_entry(kian, n, &kvm->irq_ack_notifier_list, link)
+		hlist_for_each_entry_rcu(kian, n, &kvm->irq_ack_notifier_list,
+					 link)
 			if (kian->gsi == gsi)
 				kian->irq_acked(kian);
+	rcu_read_unlock();
 }
 
 void kvm_register_irq_ack_notifier(struct kvm *kvm,
 				   struct kvm_irq_ack_notifier *kian)
 {
 	mutex_lock(&kvm->irq_lock);
-	hlist_add_head(&kian->link, &kvm->irq_ack_notifier_list);
+	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
 	mutex_unlock(&kvm->irq_lock);
 }
 
@@ -196,8 +196,9 @@ void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
 				    struct kvm_irq_ack_notifier *kian)
 {
 	mutex_lock(&kvm->irq_lock);
-	hlist_del_init(&kian->link);
+	hlist_del_init_rcu(&kian->link);
 	mutex_unlock(&kvm->irq_lock);
+	synchronize_rcu();
 }
 
 int kvm_request_irq_source_id(struct kvm *kvm)
@@ -244,7 +245,7 @@ void kvm_register_irq_mask_notifier(struct kvm *kvm, int irq,
 {
 	mutex_lock(&kvm->irq_lock);
 	kimn->irq = irq;
-	hlist_add_head(&kimn->link, &kvm->mask_notifier_list);
+	hlist_add_head_rcu(&kimn->link, &kvm->mask_notifier_list);
 	mutex_unlock(&kvm->irq_lock);
 }
 
@@ -252,8 +253,9 @@ void kvm_unregister_irq_mask_notifier(struct kvm *kvm, int irq,
 				      struct kvm_irq_mask_notifier *kimn)
 {
 	mutex_lock(&kvm->irq_lock);
-	hlist_del(&kimn->link);
+	hlist_del_rcu(&kimn->link);
 	mutex_unlock(&kvm->irq_lock);
+	synchronize_rcu();
 }
 
 void kvm_fire_mask_notifiers(struct kvm *kvm, int irq, bool mask)
@@ -261,11 +263,11 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, int irq, bool mask)
 	struct kvm_irq_mask_notifier *kimn;
 	struct hlist_node *n;
 
-	WARN_ON(!mutex_is_locked(&kvm->irq_lock));
-
-	hlist_for_each_entry(kimn, n, &kvm->mask_notifier_list, link)
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(kimn, n, &kvm->mask_notifier_list, link)
 		if (kimn->irq == irq)
 			kimn->func(kimn, mask);
+	rcu_read_unlock();
 }
 
 void kvm_free_irq_routing(struct kvm *kvm)
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 7/8] Move IO APIC to its own lock.
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
                   ` (5 preceding siblings ...)
  2009-08-12 12:17 ` [PATCH v3 6/8] Convert irq notifiers lists to RCU locking Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-13  9:13   ` Marcelo Tosatti
  2009-08-12 12:17 ` [PATCH v3 8/8] Drop kvm->irq_lock lock from irq injection path Gleb Natapov
  2009-08-17  9:46 ` [PATCH v3 0/8] make interrupt injection lockless (almost) Avi Kivity
  8 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm


Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/ia64/kvm/kvm-ia64.c |    7 +---
 arch/x86/kvm/i8259.c     |   22 +++++++++---
 arch/x86/kvm/lapic.c     |    5 +--
 arch/x86/kvm/x86.c       |   10 +----
 virt/kvm/ioapic.c        |   80 +++++++++++++++++++++++++++++++++++-----------
 virt/kvm/ioapic.h        |    4 ++
 6 files changed, 86 insertions(+), 42 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 0ad09f0..4a98314 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -851,8 +851,7 @@ static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm,
 	r = 0;
 	switch (chip->chip_id) {
 	case KVM_IRQCHIP_IOAPIC:
-		memcpy(&chip->chip.ioapic, ioapic_irqchip(kvm),
-				sizeof(struct kvm_ioapic_state));
+		r = kvm_get_ioapic(kvm, &chip->chip.ioapic);
 		break;
 	default:
 		r = -EINVAL;
@@ -868,9 +867,7 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 	r = 0;
 	switch (chip->chip_id) {
 	case KVM_IRQCHIP_IOAPIC:
-		memcpy(ioapic_irqchip(kvm),
-				&chip->chip.ioapic,
-				sizeof(struct kvm_ioapic_state));
+		r = kvm_set_ioapic(kvm, &chip->chip.ioapic);
 		break;
 	default:
 		r = -EINVAL;
diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
index eb2b8b7..96ea7f7 100644
--- a/arch/x86/kvm/i8259.c
+++ b/arch/x86/kvm/i8259.c
@@ -38,7 +38,15 @@ static void pic_clear_isr(struct kvm_kpic_state *s, int irq)
 	s->isr_ack |= (1 << irq);
 	if (s != &s->pics_state->pics[0])
 		irq += 8;
+	/*
+	 * We are dropping lock while calling ack notifiers since ack
+	 * notifier callbacks for assigned devices call into PIC recursively.
+	 * Other interrupt may be delivered to PIC while lock is dropped but
+	 * it should be safe since PIC state is already updated at this stage.
+	 */
+	spin_unlock(&s->pics_state->lock);
 	kvm_notify_acked_irq(s->pics_state->kvm, SELECT_PIC(irq), irq);
+	spin_lock(&s->pics_state->lock);
 }
 
 void kvm_pic_clear_isr_ack(struct kvm *kvm)
@@ -176,16 +184,18 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
 static inline void pic_intack(struct kvm_kpic_state *s, int irq)
 {
 	s->isr |= 1 << irq;
-	if (s->auto_eoi) {
-		if (s->rotate_on_auto_eoi)
-			s->priority_add = (irq + 1) & 7;
-		pic_clear_isr(s, irq);
-	}
 	/*
 	 * We don't clear a level sensitive interrupt here
 	 */
 	if (!(s->elcr & (1 << irq)))
 		s->irr &= ~(1 << irq);
+
+	if (s->auto_eoi) {
+		if (s->rotate_on_auto_eoi)
+			s->priority_add = (irq + 1) & 7;
+		pic_clear_isr(s, irq);
+	}
+
 }
 
 int kvm_pic_read_irq(struct kvm *kvm)
@@ -282,9 +292,9 @@ static void pic_ioport_write(void *opaque, u32 addr, u32 val)
 				priority = get_priority(s, s->isr);
 				if (priority != 8) {
 					irq = (priority + s->priority_add) & 7;
-					pic_clear_isr(s, irq);
 					if (cmd == 5)
 						s->priority_add = (irq + 1) & 7;
+					pic_clear_isr(s, irq);
 					pic_update_irq(s->pics_state);
 				}
 				break;
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index ce195f8..f24d4d0 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -471,11 +471,8 @@ static void apic_set_eoi(struct kvm_lapic *apic)
 		trigger_mode = IOAPIC_LEVEL_TRIG;
 	else
 		trigger_mode = IOAPIC_EDGE_TRIG;
-	if (!(apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)) {
-		mutex_lock(&apic->vcpu->kvm->irq_lock);
+	if (!(apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI))
 		kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
-		mutex_unlock(&apic->vcpu->kvm->irq_lock);
-	}
 }
 
 static void apic_send_ipi(struct kvm_lapic *apic)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 850cf56..75dcdad 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2023,9 +2023,7 @@ static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 			sizeof(struct kvm_pic_state));
 		break;
 	case KVM_IRQCHIP_IOAPIC:
-		memcpy(&chip->chip.ioapic,
-			ioapic_irqchip(kvm),
-			sizeof(struct kvm_ioapic_state));
+		r = kvm_get_ioapic(kvm, &chip->chip.ioapic);
 		break;
 	default:
 		r = -EINVAL;
@@ -2055,11 +2053,7 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
 		spin_unlock(&pic_irqchip(kvm)->lock);
 		break;
 	case KVM_IRQCHIP_IOAPIC:
-		mutex_lock(&kvm->irq_lock);
-		memcpy(ioapic_irqchip(kvm),
-			&chip->chip.ioapic,
-			sizeof(struct kvm_ioapic_state));
-		mutex_unlock(&kvm->irq_lock);
+		r = kvm_set_ioapic(kvm, &chip->chip.ioapic);
 		break;
 	default:
 		r = -EINVAL;
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index fa05f67..e9de458 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -182,6 +182,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
 	union kvm_ioapic_redirect_entry entry;
 	int ret = 1;
 
+	mutex_lock(&ioapic->lock);
 	if (irq >= 0 && irq < IOAPIC_NUM_PINS) {
 		entry = ioapic->redirtbl[irq];
 		level ^= entry.fields.polarity;
@@ -196,34 +197,51 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
 		}
 		trace_kvm_ioapic_set_irq(entry.bits, irq, ret == 0);
 	}
+	mutex_unlock(&ioapic->lock);
+
 	return ret;
 }
 
-static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int pin,
-				    int trigger_mode)
+static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
+				     int trigger_mode)
 {
-	union kvm_ioapic_redirect_entry *ent;
+	int i;
+
+	for (i = 0; i < IOAPIC_NUM_PINS; i++) {
+		union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i];
 
-	ent = &ioapic->redirtbl[pin];
+		if (ent->fields.vector != vector)
+			continue;
 
-	kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, pin);
+		/*
+		 * We are dropping lock while calling ack notifiers because ack
+		 * notifier callbacks for assigned devices call into IOAPIC
+		 * recursively. Since remote_irr is cleared only after call
+		 * to notifiers if the same vector will be delivered while lock
+		 * is dropped it will be put into irr and will be delivered
+		 * after ack notifier returns.
+		 */
+		mutex_unlock(&ioapic->lock);
+		kvm_notify_acked_irq(ioapic->kvm, KVM_IRQCHIP_IOAPIC, i);
+		mutex_lock(&ioapic->lock);
+
+		if (trigger_mode != IOAPIC_LEVEL_TRIG)
+			continue;
 
-	if (trigger_mode == IOAPIC_LEVEL_TRIG) {
 		ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
 		ent->fields.remote_irr = 0;
-		if (!ent->fields.mask && (ioapic->irr & (1 << pin)))
-			ioapic_service(ioapic, pin);
+		if (!ent->fields.mask && (ioapic->irr & (1 << i)))
+			ioapic_service(ioapic, i);
 	}
 }
 
 void kvm_ioapic_update_eoi(struct kvm *kvm, int vector, int trigger_mode)
 {
 	struct kvm_ioapic *ioapic = kvm->arch.vioapic;
-	int i;
 
-	for (i = 0; i < IOAPIC_NUM_PINS; i++)
-		if (ioapic->redirtbl[i].fields.vector == vector)
-			__kvm_ioapic_update_eoi(ioapic, i, trigger_mode);
+	mutex_lock(&ioapic->lock);
+	__kvm_ioapic_update_eoi(ioapic, vector, trigger_mode);
+	mutex_unlock(&ioapic->lock);
 }
 
 static inline struct kvm_ioapic *to_ioapic(struct kvm_io_device *dev)
@@ -248,8 +266,8 @@ static int ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
 	ioapic_debug("addr %lx\n", (unsigned long)addr);
 	ASSERT(!(addr & 0xf));	/* check alignment */
 
-	mutex_lock(&ioapic->kvm->irq_lock);
 	addr &= 0xff;
+	mutex_lock(&ioapic->lock);
 	switch (addr) {
 	case IOAPIC_REG_SELECT:
 		result = ioapic->ioregsel;
@@ -263,6 +281,8 @@ static int ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
 		result = 0;
 		break;
 	}
+	mutex_unlock(&ioapic->lock);
+
 	switch (len) {
 	case 8:
 		*(u64 *) val = result;
@@ -275,7 +295,6 @@ static int ioapic_mmio_read(struct kvm_io_device *this, gpa_t addr, int len,
 	default:
 		printk(KERN_WARNING "ioapic: wrong length %d\n", len);
 	}
-	mutex_unlock(&ioapic->kvm->irq_lock);
 	return 0;
 }
 
@@ -291,15 +310,15 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
 		     (void*)addr, len, val);
 	ASSERT(!(addr & 0xf));	/* check alignment */
 
-	mutex_lock(&ioapic->kvm->irq_lock);
 	if (len == 4 || len == 8)
 		data = *(u32 *) val;
 	else {
 		printk(KERN_WARNING "ioapic: Unsupported size %d\n", len);
-		goto unlock;
+		return 0;
 	}
 
 	addr &= 0xff;
+	mutex_lock(&ioapic->lock);
 	switch (addr) {
 	case IOAPIC_REG_SELECT:
 		ioapic->ioregsel = data;
@@ -310,15 +329,14 @@ static int ioapic_mmio_write(struct kvm_io_device *this, gpa_t addr, int len,
 		break;
 #ifdef	CONFIG_IA64
 	case IOAPIC_REG_EOI:
-		kvm_ioapic_update_eoi(ioapic->kvm, data, IOAPIC_LEVEL_TRIG);
+		__kvm_ioapic_update_eoi(ioapic, data, IOAPIC_LEVEL_TRIG);
 		break;
 #endif
 
 	default:
 		break;
 	}
-unlock:
-	mutex_unlock(&ioapic->kvm->irq_lock);
+	mutex_unlock(&ioapic->lock);
 	return 0;
 }
 
@@ -347,6 +365,7 @@ int kvm_ioapic_init(struct kvm *kvm)
 	ioapic = kzalloc(sizeof(struct kvm_ioapic), GFP_KERNEL);
 	if (!ioapic)
 		return -ENOMEM;
+	mutex_init(&ioapic->lock);
 	kvm->arch.vioapic = ioapic;
 	kvm_ioapic_reset(ioapic);
 	kvm_iodevice_init(&ioapic->dev, &ioapic_mmio_ops);
@@ -358,3 +377,26 @@ int kvm_ioapic_init(struct kvm *kvm)
 	return ret;
 }
 
+int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
+{
+	struct kvm_ioapic *ioapic = ioapic_irqchip(kvm);
+	if (!ioapic)
+		return -EINVAL;
+
+	mutex_lock(&ioapic->lock);
+	memcpy(state, ioapic, sizeof(struct kvm_ioapic_state));
+	mutex_unlock(&ioapic->lock);
+	return 0;
+}
+
+int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state)
+{
+	struct kvm_ioapic *ioapic = ioapic_irqchip(kvm);
+	if (!ioapic)
+		return -EINVAL;
+
+	mutex_lock(&ioapic->lock);
+	memcpy(ioapic, state, sizeof(struct kvm_ioapic_state));
+	mutex_unlock(&ioapic->lock);
+	return 0;
+}
diff --git a/virt/kvm/ioapic.h b/virt/kvm/ioapic.h
index 7080b71..9de8272 100644
--- a/virt/kvm/ioapic.h
+++ b/virt/kvm/ioapic.h
@@ -44,6 +44,7 @@ struct kvm_ioapic {
 	struct kvm_io_device dev;
 	struct kvm *kvm;
 	void (*ack_notifier)(void *opaque, int irq);
+	struct mutex lock;
 };
 
 #ifdef DEBUG
@@ -73,4 +74,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level);
 void kvm_ioapic_reset(struct kvm_ioapic *ioapic);
 int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq);
+int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
+int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
+
 #endif
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH v3 8/8] Drop kvm->irq_lock lock from irq injection path.
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
                   ` (6 preceding siblings ...)
  2009-08-12 12:17 ` [PATCH v3 7/8] Move IO APIC to its own lock Gleb Natapov
@ 2009-08-12 12:17 ` Gleb Natapov
  2009-08-17  9:46 ` [PATCH v3 0/8] make interrupt injection lockless (almost) Avi Kivity
  8 siblings, 0 replies; 19+ messages in thread
From: Gleb Natapov @ 2009-08-12 12:17 UTC (permalink / raw)
  To: avi; +Cc: kvm

The only thing it protects now is interrupt injection into lapic and
this can work lockless. Even now with kvm->irq_lock in place access
to lapic is not entirely serialized since vcpu access doesn't take
kvm->irq_lock.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
---
 arch/ia64/kvm/kvm-ia64.c |    2 --
 arch/x86/kvm/i8254.c     |    2 --
 arch/x86/kvm/lapic.c     |    2 --
 arch/x86/kvm/x86.c       |    2 --
 virt/kvm/eventfd.c       |    2 --
 virt/kvm/irq_comm.c      |    6 +-----
 virt/kvm/kvm_main.c      |    2 --
 7 files changed, 1 insertions(+), 17 deletions(-)

diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 4a98314..f534e0f 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -982,10 +982,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			goto out;
 		if (irqchip_in_kernel(kvm)) {
 			__s32 status;
-			mutex_lock(&kvm->irq_lock);
 			status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
 				    irq_event.irq, irq_event.level);
-			mutex_unlock(&kvm->irq_lock);
 			if (ioctl == KVM_IRQ_LINE_STATUS) {
 				irq_event.status = status;
 				if (copy_to_user(argp, &irq_event,
diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 82ad523..b857ca3 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -688,10 +688,8 @@ static void __inject_pit_timer_intr(struct kvm *kvm)
 	struct kvm_vcpu *vcpu;
 	int i;
 
-	mutex_lock(&kvm->irq_lock);
 	kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
 	kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
-	mutex_unlock(&kvm->irq_lock);
 
 	/*
 	 * Provides NMI watchdog support via Virtual Wire mode.
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index f24d4d0..e41e948 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -501,9 +501,7 @@ static void apic_send_ipi(struct kvm_lapic *apic)
 		   irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
 		   irq.vector);
 
-	mutex_lock(&apic->vcpu->kvm->irq_lock);
 	kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq);
-	mutex_unlock(&apic->vcpu->kvm->irq_lock);
 }
 
 static u32 apic_get_tmcct(struct kvm_lapic *apic)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 75dcdad..b3e41ec 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2268,10 +2268,8 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			goto out;
 		if (irqchip_in_kernel(kvm)) {
 			__s32 status;
-			mutex_lock(&kvm->irq_lock);
 			status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
 					irq_event.irq, irq_event.level);
-			mutex_unlock(&kvm->irq_lock);
 			if (ioctl == KVM_IRQ_LINE_STATUS) {
 				irq_event.status = status;
 				if (copy_to_user(argp, &irq_event,
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 99017e8..95954ad 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -61,10 +61,8 @@ irqfd_inject(struct work_struct *work)
 	struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
 	struct kvm *kvm = irqfd->kvm;
 
-	mutex_lock(&kvm->irq_lock);
 	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1);
 	kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
-	mutex_unlock(&kvm->irq_lock);
 }
 
 /*
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index d276b42..2c4bf32 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -63,8 +63,6 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 	int i, r = -1;
 	struct kvm_vcpu *vcpu, *lowest = NULL;
 
-	WARN_ON(!mutex_is_locked(&kvm->irq_lock));
-
 	if (irq->dest_mode == 0 && irq->dest_id == 0xff &&
 			kvm_is_dm_lowest_prio(irq))
 		printk(KERN_INFO "kvm: apic: phys broadcast and lowest prio\n");
@@ -116,7 +114,7 @@ static int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 	return kvm_irq_delivery_to_apic(kvm, NULL, &irq);
 }
 
-/* This should be called with the kvm->irq_lock mutex held
+/*
  * Return value:
  *  < 0   Interrupt was ignored (masked or not delivered for other reasons)
  *  = 0   Interrupt was coalesced (previous irq is still pending)
@@ -132,8 +130,6 @@ int kvm_set_irq(struct kvm *kvm, int irq_source_id, u32 irq, int level)
 
 	trace_kvm_set_irq(irq, level, irq_source_id);
 
-	WARN_ON(!mutex_is_locked(&kvm->irq_lock));
-
 	if (irq < KVM_IOAPIC_NUM_PINS) {
 		irq_state = (unsigned long *)&kvm->arch.irq_states[irq];
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fcc0c44..ef96c04 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -137,7 +137,6 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
 				    interrupt_work);
 	kvm = assigned_dev->kvm;
 
-	mutex_lock(&kvm->irq_lock);
 	spin_lock_irq(&assigned_dev->assigned_dev_lock);
 	if (assigned_dev->irq_requested_type & KVM_DEV_IRQ_HOST_MSIX) {
 		struct kvm_guest_msix_entry *guest_entries =
@@ -156,7 +155,6 @@ static void kvm_assigned_dev_interrupt_work_handler(struct work_struct *work)
 			    assigned_dev->guest_irq, 1);
 
 	spin_unlock_irq(&assigned_dev->assigned_dev_lock);
-	mutex_unlock(&assigned_dev->kvm->irq_lock);
 }
 
 static irqreturn_t kvm_assigned_dev_intr(int irq, void *dev_id)
-- 
1.6.3.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/8] Do not call ack notifiers on PIC reset.
  2009-08-12 12:17 ` [PATCH v3 1/8] Do not call ack notifiers on PIC reset Gleb Natapov
@ 2009-08-13  9:11   ` Marcelo Tosatti
  2009-08-13  9:39     ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Marcelo Tosatti @ 2009-08-13  9:11 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: avi, kvm

On Wed, Aug 12, 2009 at 03:17:15PM +0300, Gleb Natapov wrote:
> For device assigned it may cause host hang since ack notifier callback
> enables host interrupt and guest not necessary cleared interrupt
> condition in an assigned device. For PIT we should not call ack notifier
> here since interrupt was not acked by a guest and should be redelivered.
> 
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> ---
>  arch/x86/kvm/i8259.c |   16 ----------------
>  1 files changed, 0 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> index 01f1516..eb2b8b7 100644
> --- a/arch/x86/kvm/i8259.c
> +++ b/arch/x86/kvm/i8259.c
> @@ -225,22 +225,6 @@ int kvm_pic_read_irq(struct kvm *kvm)
>  
>  void kvm_pic_reset(struct kvm_kpic_state *s)
>  {
> -	int irq, irqbase, n;
> -	struct kvm *kvm = s->pics_state->irq_request_opaque;
> -	struct kvm_vcpu *vcpu0 = kvm->bsp_vcpu;
> -
> -	if (s == &s->pics_state->pics[0])
> -		irqbase = 0;
> -	else
> -		irqbase = 8;
> -
> -	for (irq = 0; irq < PIC_NUM_PINS/2; irq++) {
> -		if (vcpu0 && kvm_apic_accept_pic_intr(vcpu0))
> -			if (s->irr & (1 << irq) || s->isr & (1 << irq)) {
> -				n = irq + irqbase;
> -				kvm_notify_acked_irq(kvm, SELECT_PIC(n), n);
> -			}
> -	}
>  	s->last_irr = 0;
>  	s->irr = 0;
>  	s->imr = 0;
> -- 
> 1.6.3.3

This used to be necessary to clear pending state from i8254.c
"irq_acked" logic. I think it'll break it.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/8] Move IO APIC to its own lock.
  2009-08-12 12:17 ` [PATCH v3 7/8] Move IO APIC to its own lock Gleb Natapov
@ 2009-08-13  9:13   ` Marcelo Tosatti
  2009-08-13  9:48     ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Marcelo Tosatti @ 2009-08-13  9:13 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: avi, kvm

On Wed, Aug 12, 2009 at 03:17:21PM +0300, Gleb Natapov wrote:
> 
> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> ---
>  arch/ia64/kvm/kvm-ia64.c |    7 +---
>  arch/x86/kvm/i8259.c     |   22 +++++++++---
>  arch/x86/kvm/lapic.c     |    5 +--
>  arch/x86/kvm/x86.c       |   10 +----
>  virt/kvm/ioapic.c        |   80 +++++++++++++++++++++++++++++++++++-----------
>  virt/kvm/ioapic.h        |    4 ++
>  6 files changed, 86 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
> index 0ad09f0..4a98314 100644
> --- a/arch/ia64/kvm/kvm-ia64.c
> +++ b/arch/ia64/kvm/kvm-ia64.c
> @@ -851,8 +851,7 @@ static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm,
>  	r = 0;
>  	switch (chip->chip_id) {
>  	case KVM_IRQCHIP_IOAPIC:
> -		memcpy(&chip->chip.ioapic, ioapic_irqchip(kvm),
> -				sizeof(struct kvm_ioapic_state));
> +		r = kvm_get_ioapic(kvm, &chip->chip.ioapic);
>  		break;
>  	default:
>  		r = -EINVAL;
> @@ -868,9 +867,7 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
>  	r = 0;
>  	switch (chip->chip_id) {
>  	case KVM_IRQCHIP_IOAPIC:
> -		memcpy(ioapic_irqchip(kvm),
> -				&chip->chip.ioapic,
> -				sizeof(struct kvm_ioapic_state));
> +		r = kvm_set_ioapic(kvm, &chip->chip.ioapic);
>  		break;
>  	default:
>  		r = -EINVAL;
> diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> index eb2b8b7..96ea7f7 100644
> --- a/arch/x86/kvm/i8259.c
> +++ b/arch/x86/kvm/i8259.c
> @@ -38,7 +38,15 @@ static void pic_clear_isr(struct kvm_kpic_state *s, int irq)
>  	s->isr_ack |= (1 << irq);
>  	if (s != &s->pics_state->pics[0])
>  		irq += 8;
> +	/*
> +	 * We are dropping lock while calling ack notifiers since ack
> +	 * notifier callbacks for assigned devices call into PIC recursively.
> +	 * Other interrupt may be delivered to PIC while lock is dropped but
> +	 * it should be safe since PIC state is already updated at this stage.
> +	 */
> +	spin_unlock(&s->pics_state->lock);
>  	kvm_notify_acked_irq(s->pics_state->kvm, SELECT_PIC(irq), irq);
> +	spin_lock(&s->pics_state->lock);
>  }
>  
>  void kvm_pic_clear_isr_ack(struct kvm *kvm)
> @@ -176,16 +184,18 @@ int kvm_pic_set_irq(void *opaque, int irq, int level)
>  static inline void pic_intack(struct kvm_kpic_state *s, int irq)
>  {
>  	s->isr |= 1 << irq;
> -	if (s->auto_eoi) {
> -		if (s->rotate_on_auto_eoi)
> -			s->priority_add = (irq + 1) & 7;
> -		pic_clear_isr(s, irq);
> -	}
>  	/*
>  	 * We don't clear a level sensitive interrupt here
>  	 */
>  	if (!(s->elcr & (1 << irq)))
>  		s->irr &= ~(1 << irq);
> +
> +	if (s->auto_eoi) {
> +		if (s->rotate_on_auto_eoi)
> +			s->priority_add = (irq + 1) & 7;
> +		pic_clear_isr(s, irq);
> +	}
> +
>  }
>  
>  int kvm_pic_read_irq(struct kvm *kvm)
> @@ -282,9 +292,9 @@ static void pic_ioport_write(void *opaque, u32 addr, u32 val)
>  				priority = get_priority(s, s->isr);
>  				if (priority != 8) {
>  					irq = (priority + s->priority_add) & 7;
> -					pic_clear_isr(s, irq);
>  					if (cmd == 5)
>  						s->priority_add = (irq + 1) & 7;
> +					pic_clear_isr(s, irq);
>  					pic_update_irq(s->pics_state);
>  				}
>  				break;
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index ce195f8..f24d4d0 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -471,11 +471,8 @@ static void apic_set_eoi(struct kvm_lapic *apic)
>  		trigger_mode = IOAPIC_LEVEL_TRIG;
>  	else
>  		trigger_mode = IOAPIC_EDGE_TRIG;
> -	if (!(apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)) {
> -		mutex_lock(&apic->vcpu->kvm->irq_lock);
> +	if (!(apic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI))
>  		kvm_ioapic_update_eoi(apic->vcpu->kvm, vector, trigger_mode);
> -		mutex_unlock(&apic->vcpu->kvm->irq_lock);
> -	}
>  }
>  
>  static void apic_send_ipi(struct kvm_lapic *apic)
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 850cf56..75dcdad 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2023,9 +2023,7 @@ static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
>  			sizeof(struct kvm_pic_state));
>  		break;
>  	case KVM_IRQCHIP_IOAPIC:
> -		memcpy(&chip->chip.ioapic,
> -			ioapic_irqchip(kvm),
> -			sizeof(struct kvm_ioapic_state));
> +		r = kvm_get_ioapic(kvm, &chip->chip.ioapic);
>  		break;
>  	default:
>  		r = -EINVAL;
> @@ -2055,11 +2053,7 @@ static int kvm_vm_ioctl_set_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
>  		spin_unlock(&pic_irqchip(kvm)->lock);
>  		break;
>  	case KVM_IRQCHIP_IOAPIC:
> -		mutex_lock(&kvm->irq_lock);
> -		memcpy(ioapic_irqchip(kvm),
> -			&chip->chip.ioapic,
> -			sizeof(struct kvm_ioapic_state));
> -		mutex_unlock(&kvm->irq_lock);
> +		r = kvm_set_ioapic(kvm, &chip->chip.ioapic);
>  		break;
>  	default:
>  		r = -EINVAL;
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index fa05f67..e9de458 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -182,6 +182,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
>  	union kvm_ioapic_redirect_entry entry;
>  	int ret = 1;
>  
> +	mutex_lock(&ioapic->lock);
>  	if (irq >= 0 && irq < IOAPIC_NUM_PINS) {
>  		entry = ioapic->redirtbl[irq];
>  		level ^= entry.fields.polarity;

But this is an RCU critical section now, right? 

If so, you can't sleep, must use a spinlock.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/8] Do not call ack notifiers on PIC reset.
  2009-08-13  9:11   ` Marcelo Tosatti
@ 2009-08-13  9:39     ` Gleb Natapov
  0 siblings, 0 replies; 19+ messages in thread
From: Gleb Natapov @ 2009-08-13  9:39 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: avi, kvm

On Thu, Aug 13, 2009 at 06:11:05AM -0300, Marcelo Tosatti wrote:
> On Wed, Aug 12, 2009 at 03:17:15PM +0300, Gleb Natapov wrote:
> > For device assigned it may cause host hang since ack notifier callback
> > enables host interrupt and guest not necessary cleared interrupt
> > condition in an assigned device. For PIT we should not call ack notifier
> > here since interrupt was not acked by a guest and should be redelivered.
> > 
> > Signed-off-by: Gleb Natapov <gleb@redhat.com>
> > ---
> >  arch/x86/kvm/i8259.c |   16 ----------------
> >  1 files changed, 0 insertions(+), 16 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/i8259.c b/arch/x86/kvm/i8259.c
> > index 01f1516..eb2b8b7 100644
> > --- a/arch/x86/kvm/i8259.c
> > +++ b/arch/x86/kvm/i8259.c
> > @@ -225,22 +225,6 @@ int kvm_pic_read_irq(struct kvm *kvm)
> >  
> >  void kvm_pic_reset(struct kvm_kpic_state *s)
> >  {
> > -	int irq, irqbase, n;
> > -	struct kvm *kvm = s->pics_state->irq_request_opaque;
> > -	struct kvm_vcpu *vcpu0 = kvm->bsp_vcpu;
> > -
> > -	if (s == &s->pics_state->pics[0])
> > -		irqbase = 0;
> > -	else
> > -		irqbase = 8;
> > -
> > -	for (irq = 0; irq < PIC_NUM_PINS/2; irq++) {
> > -		if (vcpu0 && kvm_apic_accept_pic_intr(vcpu0))
> > -			if (s->irr & (1 << irq) || s->isr & (1 << irq)) {
> > -				n = irq + irqbase;
> > -				kvm_notify_acked_irq(kvm, SELECT_PIC(n), n);
> > -			}
> > -	}
> >  	s->last_irr = 0;
> >  	s->irr = 0;
> >  	s->imr = 0;
> > -- 
> > 1.6.3.3
> 
> This used to be necessary to clear pending state from i8254.c
> "irq_acked" logic. I think it'll break it.
This is just a hack then and it does not exists in ioapic so if
it is really needed ioapic+pit combination is broken. But the problem
should be solved inside i8254.c not somewhere else. Setting irq_acked to 1 in
pit_load_count() seems like a right thing to do. Something like
the patch below. Ideally pending should be scaled instead of reset.
Also may be the problem exists because PIC doesn't call mask notifiers?


diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index b857ca3..aa7f68e 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -325,6 +325,9 @@ static void pit_load_count(struct kvm *kvm, int channel, u32 val)
 		return;
 	}
 
+	atomic_set(&pt->pending, 0);	
+	ps->irq_ack = 1;
+
 	/* Two types of timer
 	 * mode 1 is one shot, mode 2 is period, otherwise del timer */
 	switch (ps->channels[0].mode) {

--
			Gleb.

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/8] Move IO APIC to its own lock.
  2009-08-13  9:13   ` Marcelo Tosatti
@ 2009-08-13  9:48     ` Gleb Natapov
  2009-08-13  9:49       ` Avi Kivity
  0 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2009-08-13  9:48 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: avi, kvm

On Thu, Aug 13, 2009 at 06:13:30AM -0300, Marcelo Tosatti wrote:
> > +++ b/virt/kvm/ioapic.c
> > @@ -182,6 +182,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
> >  	union kvm_ioapic_redirect_entry entry;
> >  	int ret = 1;
> >  
> > +	mutex_lock(&ioapic->lock);
> >  	if (irq >= 0 && irq < IOAPIC_NUM_PINS) {
> >  		entry = ioapic->redirtbl[irq];
> >  		level ^= entry.fields.polarity;
> 
> But this is an RCU critical section now, right? 
> 
Correct! Forget about that. It was spinlock, but Avi prefers mutexes.

> If so, you can't sleep, must use a spinlock.
Either that or I can collect callbacks in critical section and call them
afterwords.

--
			Gleb.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/8] Move IO APIC to its own lock.
  2009-08-13  9:48     ` Gleb Natapov
@ 2009-08-13  9:49       ` Avi Kivity
  2009-08-13 10:09         ` Gleb Natapov
  0 siblings, 1 reply; 19+ messages in thread
From: Avi Kivity @ 2009-08-13  9:49 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm

On 08/13/2009 12:48 PM, Gleb Natapov wrote:
> On Thu, Aug 13, 2009 at 06:13:30AM -0300, Marcelo Tosatti wrote:
>    
>>> +++ b/virt/kvm/ioapic.c
>>> @@ -182,6 +182,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
>>>   	union kvm_ioapic_redirect_entry entry;
>>>   	int ret = 1;
>>>
>>> +	mutex_lock(&ioapic->lock);
>>>   	if (irq>= 0&&  irq<  IOAPIC_NUM_PINS) {
>>>   		entry = ioapic->redirtbl[irq];
>>>   		level ^= entry.fields.polarity;
>>>        
>> But this is an RCU critical section now, right?
>>
>>      
> Correct! Forget about that. It was spinlock, but Avi prefers mutexes.
>    

Well, I prefer correct code to mutexes.

>    
>> If so, you can't sleep, must use a spinlock.
>>      
> Either that or I can collect callbacks in critical section and call them
> afterwords.
>    

There's also srcu.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/8] Move IO APIC to its own lock.
  2009-08-13  9:49       ` Avi Kivity
@ 2009-08-13 10:09         ` Gleb Natapov
  2009-08-13 10:44           ` Avi Kivity
  0 siblings, 1 reply; 19+ messages in thread
From: Gleb Natapov @ 2009-08-13 10:09 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm

On Thu, Aug 13, 2009 at 12:49:45PM +0300, Avi Kivity wrote:
> On 08/13/2009 12:48 PM, Gleb Natapov wrote:
> >On Thu, Aug 13, 2009 at 06:13:30AM -0300, Marcelo Tosatti wrote:
> >>>+++ b/virt/kvm/ioapic.c
> >>>@@ -182,6 +182,7 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
> >>>  	union kvm_ioapic_redirect_entry entry;
> >>>  	int ret = 1;
> >>>
> >>>+	mutex_lock(&ioapic->lock);
> >>>  	if (irq>= 0&&  irq<  IOAPIC_NUM_PINS) {
> >>>  		entry = ioapic->redirtbl[irq];
> >>>  		level ^= entry.fields.polarity;
> >>But this is an RCU critical section now, right?
> >>
> >Correct! Forget about that. It was spinlock, but Avi prefers mutexes.
> 
> Well, I prefer correct code to mutexes.
> 
> >>If so, you can't sleep, must use a spinlock.
> >Either that or I can collect callbacks in critical section and call them
> >afterwords.
> 
> There's also srcu.
> 
What are the disadvantages? There should be some, otherwise why not use
it all the time.

--
			Gleb.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/8] Move IO APIC to its own lock.
  2009-08-13 10:09         ` Gleb Natapov
@ 2009-08-13 10:44           ` Avi Kivity
  2009-08-13 15:15             ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Avi Kivity @ 2009-08-13 10:44 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Paul E. McKenney

On 08/13/2009 01:09 PM, Gleb Natapov wrote:
>> There's also srcu.
>>
>>      
> What are the disadvantages? There should be some, otherwise why not use
> it all the time.
>    

I think it incurs an atomic op in the read path, but not much overhead 
otherwise.  Paul?

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 7/8] Move IO APIC to its own lock.
  2009-08-13 10:44           ` Avi Kivity
@ 2009-08-13 15:15             ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2009-08-13 15:15 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Gleb Natapov, Marcelo Tosatti, kvm

On Thu, Aug 13, 2009 at 01:44:06PM +0300, Avi Kivity wrote:
> On 08/13/2009 01:09 PM, Gleb Natapov wrote:
>>> There's also srcu.
>>>      
>> What are the disadvantages? There should be some, otherwise why not use
>> it all the time.
>
> I think it incurs an atomic op in the read path, but not much overhead 
> otherwise.  Paul?

There are not atomic operations in srcu_read_lock():

	int srcu_read_lock(struct srcu_struct *sp)
	{
		int idx;

		preempt_disable();
		idx = sp->completed & 0x1;
		barrier();  /* ensure compiler looks -once- at sp->completed. */
		per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
		srcu_barrier();  /* ensure compiler won't misorder critical section. */
		preempt_enable();
		return idx;
	}

There is a preempt_disable() and a preempt_enable(), which
non-atomically manipulate a field in the thread_info structure.
There is a barrier() and an srcu_barrier(), which are just compiler
directives (no code generated).  Other than that, simple arithmetic
and array accesses.  Shouldn't even be any cache misses in the common
case (the uncommon case being where synchronize_srcu() executing on
some other CPU).

There is even less in srcu_read_unlock():

	void srcu_read_unlock(struct srcu_struct *sp, int idx)
	{
		preempt_disable();
		srcu_barrier();  /* ensure compiler won't misorder critical section. */
		per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]--;
		preempt_enable();
	}

So SRCU should have pretty low overhead.  And, as with other forms
of RCU, legal use of the read-side primitives cannot possibly
participate in deadlocks.

So, to answer the question above, what are the disadvantages?

o	On the update side, synchronize_srcu() does takes some time,
	mostly blocking in synchronize_sched().  So, like other
	forms of RCU, you would use SRCU in read-mostly situations.

o	Just as with RCU, reads and updates run concurrently, with
	all the good and bad that this implies.  For an example
	of the good, srcu_read_lock() executes deterministically,
	no blocking or spinning.  For an example of the bad, there
	is no way to shut down SRCU readers.  These are opposite
	sides of the same coin.  ;-)

o	Although srcu_read_lock() and srcu_read_unlock() are light
	weight, they are expensive compared to other forms of RCU.

o	In contrast to other forms of RCU, SRCU requires that the
	return value from srcu_read_lock() be passed into
	srcu_read_unlock().  Usually not a problem, but does place
	another constraint on the code.

Please keep in mind that I have no idea about what you are thinking of
using SRCU for, so the above advice is necessarily quite generic.  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 3/8] Maintain back mapping from irqchip/pin to gsi.
  2009-08-12 12:17 ` [PATCH v3 3/8] Maintain back mapping from irqchip/pin to gsi Gleb Natapov
@ 2009-08-17  9:43   ` Avi Kivity
  0 siblings, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2009-08-17  9:43 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 08/12/2009 03:17 PM, Gleb Natapov wrote:
> Maintain back mapping from irqchip/pin to gsi to speedup
> interrupt acknowledgment notifications.
>
>
>   struct kvm_irq_routing_table {
> +	int chip[KVM_NR_IRQCHIPS][KVM_IOAPIC_NUM_PINS];
>   	struct kvm_kernel_irq_routing_entry *rt_entries;
>   	u32 nr_rt_entries;
>   	/*
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index a9d2262..00e4be7 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -168,25 +168,16 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
>   {
>   	struct kvm_irq_ack_notifier *kian;
>   	struct hlist_node *n;
> -	unsigned gsi = pin;
> -	int i;
> +	unsigned gsi;
>
>   	trace_kvm_ack_irq(irqchip, pin);
>
> -	for (i = 0; i<  kvm->irq_routing->nr_rt_entries; i++) {
> -		struct kvm_kernel_irq_routing_entry *e;
> -		e =&kvm->irq_routing->rt_entries[i];
> -		if (e->type == KVM_IRQ_ROUTING_IRQCHIP&&
> -		    e->irqchip.irqchip == irqchip&&
> -		    e->irqchip.pin == pin) {
> -			gsi = e->gsi;
> -			break;
> -		}
> -	}
> -
> -	hlist_for_each_entry(kian, n,&kvm->arch.irq_ack_notifier_list, link)
> -		if (kian->gsi == gsi)
> -			kian->irq_acked(kian);
> +	gsi = kvm->irq_routing->chip[irqchip][pin];
> +	if (gsi != -1)
>    

unsigned vs. int compare... it will work, but will raise eyebrows.  
Better make it a -1U.

> +		hlist_for_each_entry(kian, n,&kvm->arch.irq_ack_notifier_list,
> +				     link)
> +			if (kian->gsi == gsi)
> +				kian->irq_acked(kian);
>   }
>
>   void kvm_register_irq_ack_notifier(struct kvm *kvm,
> @@ -308,6 +299,9 @@ static int setup_routing_entry(struct kvm_irq_routing_table *rt,
>   		}
>   		e->irqchip.irqchip = ue->u.irqchip.irqchip;
>   		e->irqchip.pin = ue->u.irqchip.pin + delta;
> +		if (e->irqchip.pin>  KVM_IOAPIC_NUM_PINS)
> +			goto out;
>    

 >=

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 0/8] make interrupt injection lockless (almost)
  2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
                   ` (7 preceding siblings ...)
  2009-08-12 12:17 ` [PATCH v3 8/8] Drop kvm->irq_lock lock from irq injection path Gleb Natapov
@ 2009-08-17  9:46 ` Avi Kivity
  8 siblings, 0 replies; 19+ messages in thread
From: Avi Kivity @ 2009-08-17  9:46 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm

On 08/12/2009 03:17 PM, Gleb Natapov wrote:
> kvm->irq_lock protects too much stuff, but still fail to protect
> everything it was design to protect (see ack notifiers call in pic). I
> want to make IRQ injection fast path as lockless as possible. This patch
> series removes kvm->irq_lock from irq injection path effectively making
> interrupt injection to lapic lockless (several kvm_irq_delivery_to_apic()
> may run in parallel), but access to lapic was never fully locked in the
> first place. VCPU could access lapic in parallel with interrupt injection.
> Patches 1/2 changes irq routing data structure to much more efficient one.
>    

Apart from my small comments it looks ready.  Please drop the pic reset 
changes to reduce risks.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2009-08-17  9:46 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-08-12 12:17 [PATCH v3 0/8] make interrupt injection lockless (almost) Gleb Natapov
2009-08-12 12:17 ` [PATCH v3 1/8] Do not call ack notifiers on PIC reset Gleb Natapov
2009-08-13  9:11   ` Marcelo Tosatti
2009-08-13  9:39     ` Gleb Natapov
2009-08-12 12:17 ` [PATCH v3 2/8] Change irq routing table to use gsi indexed array Gleb Natapov
2009-08-12 12:17 ` [PATCH v3 3/8] Maintain back mapping from irqchip/pin to gsi Gleb Natapov
2009-08-17  9:43   ` Avi Kivity
2009-08-12 12:17 ` [PATCH v3 4/8] Move irq routing data structure to rcu locking Gleb Natapov
2009-08-12 12:17 ` [PATCH v3 5/8] Move irq ack notifier list to arch independent code Gleb Natapov
2009-08-12 12:17 ` [PATCH v3 6/8] Convert irq notifiers lists to RCU locking Gleb Natapov
2009-08-12 12:17 ` [PATCH v3 7/8] Move IO APIC to its own lock Gleb Natapov
2009-08-13  9:13   ` Marcelo Tosatti
2009-08-13  9:48     ` Gleb Natapov
2009-08-13  9:49       ` Avi Kivity
2009-08-13 10:09         ` Gleb Natapov
2009-08-13 10:44           ` Avi Kivity
2009-08-13 15:15             ` Paul E. McKenney
2009-08-12 12:17 ` [PATCH v3 8/8] Drop kvm->irq_lock lock from irq injection path Gleb Natapov
2009-08-17  9:46 ` [PATCH v3 0/8] make interrupt injection lockless (almost) Avi Kivity

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.