* [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache
@ 2019-06-06 16:54 Marc Zyngier
  2019-06-06 16:54 ` [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
                   ` (7 more replies)
  0 siblings, 8 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

It recently became apparent[1] that our LPI injection path is not as
efficient as it could be when injecting interrupts coming from a VFIO
assigned device.

Although the proposed patch wasn't 100% correct, it outlined at least
two issues:

(1) Injecting an LPI from VFIO always results in a context switch to a
    worker thread: no good

(2) We have no way of amortising the cost of translating a DID+EID pair
    to an LPI number

The reason for (1) is that we may sleep when translating an LPI, so we
need to be in process context. A way to fix that is to implement a
small LPI translation cache that can be looked up from an atomic
context. It would also solve (2).

This is what this small series proposes. It implements a very basic
LRU cache of pre-translated LPIs, which gets used to implement
kvm_arch_set_irq_inatomic. The size of the cache is currently
hard-coded at 4 times the number of vcpus, a number I have randomly
chosen with the utmost care.
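
For illustration, the heart of the fast path is a simple linear scan
of that list, roughly as below. This is only a simplified sketch of
what patches 3 and 7 actually implement (locking and refcounting
elided, helper name made up):

/* Sketch only: the real lookup runs under dist->lpi_list_lock */
static struct vgic_irq *lpi_cache_lookup(struct vgic_dist *dist,
					 phys_addr_t db, u32 devid,
					 u32 eventid)
{
	struct vgic_translation_cache_entry *cte;

	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
		/* A NULL entry means nothing useful past this point */
		if (!cte->irq)
			break;

		if (cte->db == db && cte->devid == devid &&
		    cte->eventid == eventid) {
			/* LRU: promote the hit to the head of the list */
			list_move(&cte->entry, &dist->lpi_translation_cache);
			return cte->irq;
		}
	}

	return NULL;
}

On a hit, the interrupt can be made pending without ever leaving
atomic context; on a miss, we fall back to the existing (sleeping)
injection path.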

Does it work? Well, it doesn't crash, and is thus perfect. More
seriously, I don't really have a way to benchmark it directly, so my
observations are indirect:

On a TX2 system, I run a 4 vcpu VM with an Ethernet interface passed
to it directly. From the host, I inject interrupts using debugfs. In
parallel, I look at the number of context switches and the number of
interrupts on the host. Without this series, I get the same number for
both IRQ and CS (about half a million of each per second is pretty
easy to reach). With this series, the number of context switches drops
to something pretty small (in the low 2k), while the number of
interrupts stays the same.

Yes, this is a pretty rubbish benchmark, what did you expect? ;-)

So I'm putting this out for people with real workloads to try out and
report what they see.

[1] https://lore.kernel.org/lkml/1552833373-19828-1-git-send-email-yuzenghui@huawei.com/

Marc Zyngier (8):
  KVM: arm/arm64: vgic: Add LPI translation cache definition
  KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive
  KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  KVM: arm/arm64: vgic-its: Add kvm parameter to
    vgic_its_free_collection
  KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
    specific commands
  KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
    disabling LPIs
  KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI
    injection
  KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic

 include/kvm/arm_vgic.h           |  10 +++
 virt/kvm/arm/vgic/vgic-init.c    |  34 ++++++++
 virt/kvm/arm/vgic/vgic-irqfd.c   |  36 ++++++--
 virt/kvm/arm/vgic/vgic-its.c     | 143 +++++++++++++++++++++++++++++--
 virt/kvm/arm/vgic/vgic-mmio-v3.c |   4 +-
 virt/kvm/arm/vgic/vgic.c         |  26 ++++--
 virt/kvm/arm/vgic/vgic.h         |   6 ++
 7 files changed, 238 insertions(+), 21 deletions(-)

-- 
2.20.1


* [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  2019-06-07  3:47   ` Saidi, Ali
                     ` (2 more replies)
  2019-06-06 16:54 ` [PATCH 2/8] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive Marc Zyngier
                   ` (6 subsequent siblings)
  7 siblings, 3 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Add the basic data structure that expresses an MSI to LPI
translation as well as the allocation/release hooks.

The size of the cache is arbitrarily defined as 4*nr_vcpus.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h        | 10 ++++++++++
 virt/kvm/arm/vgic/vgic-init.c | 34 ++++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic-its.c  |  2 ++
 virt/kvm/arm/vgic/vgic.h      |  3 +++
 4 files changed, 49 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index c36c86f1ec9a..5a0d6b07c5ef 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -173,6 +173,14 @@ struct vgic_io_device {
 	struct kvm_io_device dev;
 };
 
+struct vgic_translation_cache_entry {
+	struct list_head	entry;
+	phys_addr_t		db;
+	u32			devid;
+	u32			eventid;
+	struct vgic_irq		*irq;
+};
+
 struct vgic_its {
 	/* The base address of the ITS control register frame */
 	gpa_t			vgic_its_base;
@@ -260,6 +268,8 @@ struct vgic_dist {
 	struct list_head	lpi_list_head;
 	int			lpi_list_count;
 
+	struct list_head	lpi_translation_cache;
+
 	/* used by vgic-debug */
 	struct vgic_state_iter *iter;
 
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 3bdb31eaed64..25ae25694a28 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -64,6 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
 	struct vgic_dist *dist = &kvm->arch.vgic;
 
 	INIT_LIST_HEAD(&dist->lpi_list_head);
+	INIT_LIST_HEAD(&dist->lpi_translation_cache);
 	raw_spin_lock_init(&dist->lpi_list_lock);
 }
 
@@ -260,6 +261,27 @@ static void kvm_vgic_vcpu_enable(struct kvm_vcpu *vcpu)
 		vgic_v3_enable(vcpu);
 }
 
+void vgic_lpi_translation_cache_init(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	int i;
+
+	if (!list_empty(&dist->lpi_translation_cache))
+		return;
+
+	for (i = 0; i < LPI_CACHE_SIZE(kvm); i++) {
+		struct vgic_translation_cache_entry *cte;
+
+		/* An allocation failure is not fatal */
+		cte = kzalloc(sizeof(*cte), GFP_KERNEL);
+		if (WARN_ON(!cte))
+			break;
+
+		INIT_LIST_HEAD(&cte->entry);
+		list_add(&cte->entry, &dist->lpi_translation_cache);
+	}
+}
+
 /*
  * vgic_init: allocates and initializes dist and vcpu data structures
  * depending on two dimensioning parameters:
@@ -305,6 +327,7 @@ int vgic_init(struct kvm *kvm)
 	}
 
 	if (vgic_has_its(kvm)) {
+		vgic_lpi_translation_cache_init(kvm);
 		ret = vgic_v4_init(kvm);
 		if (ret)
 			goto out;
@@ -346,6 +369,17 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
 		INIT_LIST_HEAD(&dist->rd_regions);
 	}
 
+	if (vgic_has_its(kvm)) {
+		struct vgic_translation_cache_entry *cte, *tmp;
+
+		list_for_each_entry_safe(cte, tmp,
+					 &dist->lpi_translation_cache, entry) {
+			list_del(&cte->entry);
+			kfree(cte);
+		}
+		INIT_LIST_HEAD(&dist->lpi_translation_cache);
+	}
+
 	if (vgic_supports_direct_msis(kvm))
 		vgic_v4_teardown(kvm);
 }
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 44ceaccb18cf..5758504fd934 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1696,6 +1696,8 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
 			kfree(its);
 			return ret;
 		}
+
+		vgic_lpi_translation_cache_init(dev->kvm);
 	}
 
 	mutex_init(&its->its_lock);
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index abeeffabc456..a58e1b263dca 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -316,6 +316,9 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq);
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
+void vgic_lpi_translation_cache_init(struct kvm *kvm);
+
+#define LPI_CACHE_SIZE(kvm)	(atomic_read(&(kvm)->online_vcpus) * 4)
 
 bool vgic_supports_direct_msis(struct kvm *kvm);
 int vgic_v4_init(struct kvm *kvm);
-- 
2.20.1


* [PATCH 2/8] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
  2019-06-06 16:54 ` [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  2019-06-07 12:11   ` Auger Eric
  2019-06-06 16:54 ` [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Our LPI translation cache needs to be able to drop the refcount
on an LPI whilst already holding the lpi_list_lock.

Let's add a new primitive for this.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic.c | 26 +++++++++++++++++---------
 virt/kvm/arm/vgic/vgic.h |  1 +
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 191deccf60bf..376a297e2169 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -130,6 +130,22 @@ static void vgic_irq_release(struct kref *ref)
 {
 }
 
+/*
+ * Drop the refcount on the LPI. Must be called with lpi_list_lock held.
+ */
+void __vgic_put_lpi_locked(struct kvm *kvm, struct vgic_irq *irq)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	if (!kref_put(&irq->refcount, vgic_irq_release))
+		return;
+
+	list_del(&irq->lpi_list);
+	dist->lpi_list_count--;
+
+	kfree(irq);
+}
+
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
@@ -139,16 +155,8 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 		return;
 
 	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
-	if (!kref_put(&irq->refcount, vgic_irq_release)) {
-		raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
-		return;
-	};
-
-	list_del(&irq->lpi_list);
-	dist->lpi_list_count--;
+	__vgic_put_lpi_locked(kvm, irq);
 	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
-
-	kfree(irq);
 }
 
 void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index a58e1b263dca..80cd40575bc9 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -172,6 +172,7 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
 		     gpa_t addr, int len);
 struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
 			      u32 intid);
+void __vgic_put_lpi_locked(struct kvm *kvm, struct vgic_irq *irq);
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
 bool vgic_get_phys_line_level(struct vgic_irq *irq);
 void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending);
-- 
2.20.1


* [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
  2019-06-06 16:54 ` [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
  2019-06-06 16:54 ` [PATCH 2/8] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  2019-06-07  8:35   ` Julien Thierry
  2019-06-07 14:29   ` Auger Eric
  2019-06-06 16:54 ` [PATCH 4/8] KVM: arm/arm64: vgic-its: Add kvm parameter to vgic_its_free_collection Marc Zyngier
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

On a successful translation, preserve the parameters in the LPI
translation cache. Each new translation reuses the last slot
in the list, naturally evicting the least recently used entry.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 41 ++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 5758504fd934..bc370b6c5afa 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -538,6 +538,45 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
 	return 0;
 }
 
+static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
+				       u32 devid, u32 eventid,
+				       struct vgic_irq *irq)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_translation_cache_entry *cte;
+	unsigned long flags;
+
+	/* Do not cache a directly injected interrupt */
+	if (irq->hw)
+		return;
+
+	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
+
+	/* Always reuse the last entry (LRU policy) */
+	cte = list_last_entry(&dist->lpi_translation_cache,
+			      typeof(*cte), entry);
+
+	/*
+	 * Caching the translation implies having an extra reference
+	 * to the interrupt, so drop the potential reference on what
+	 * was in the cache, and increment it on the new interrupt.
+	 */
+	if (cte->irq)
+		__vgic_put_lpi_locked(kvm, cte->irq);
+
+	vgic_get_irq_kref(irq);
+
+	cte->db		= its->vgic_its_base + GITS_TRANSLATER;
+	cte->devid	= devid;
+	cte->eventid	= eventid;
+	cte->irq	= irq;
+
+	/* Move the new translation to the head of the list */
+	list_move(&cte->entry, &dist->lpi_translation_cache);
+
+	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+}
+
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq)
 {
@@ -558,6 +597,8 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 	if (!vcpu->arch.vgic_cpu.lpis_enabled)
 		return -EBUSY;
 
+	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
+
 	*irq = ite->irq;
 	return 0;
 }
-- 
2.20.1


* [PATCH 4/8] KVM: arm/arm64: vgic-its: Add kvm parameter to vgic_its_free_collection
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (2 preceding siblings ...)
  2019-06-06 16:54 ` [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  2019-06-07 14:29   ` Auger Eric
  2019-06-06 16:54 ` [PATCH 5/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands Marc Zyngier
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

As we are going to perform some VM-wide operations when freeing
a collection, add the kvm pointer to vgic_its_free_collection.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index bc370b6c5afa..f637edd77e1f 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -885,7 +885,8 @@ static int vgic_its_alloc_collection(struct vgic_its *its,
 	return 0;
 }
 
-static void vgic_its_free_collection(struct vgic_its *its, u32 coll_id)
+static void vgic_its_free_collection(struct kvm *kvm,
+				     struct vgic_its *its, u32 coll_id)
 {
 	struct its_collection *collection;
 	struct its_device *device;
@@ -974,7 +975,7 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
 	ite = vgic_its_alloc_ite(device, collection, event_id);
 	if (IS_ERR(ite)) {
 		if (new_coll)
-			vgic_its_free_collection(its, coll_id);
+			vgic_its_free_collection(kvm, its, coll_id);
 		return PTR_ERR(ite);
 	}
 
@@ -984,7 +985,7 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
 	irq = vgic_add_lpi(kvm, lpi_nr, vcpu);
 	if (IS_ERR(irq)) {
 		if (new_coll)
-			vgic_its_free_collection(its, coll_id);
+			vgic_its_free_collection(kvm, its, coll_id);
 		its_free_ite(kvm, ite);
 		return PTR_ERR(irq);
 	}
@@ -1025,7 +1026,7 @@ static void vgic_its_free_collection_list(struct kvm *kvm, struct vgic_its *its)
 	struct its_collection *cur, *temp;
 
 	list_for_each_entry_safe(cur, temp, &its->collection_list, coll_list)
-		vgic_its_free_collection(its, cur->collection_id);
+		vgic_its_free_collection(kvm, its, cur->collection_id);
 }
 
 /* Must be called with its_lock mutex held */
@@ -1110,7 +1111,7 @@ static int vgic_its_cmd_handle_mapc(struct kvm *kvm, struct vgic_its *its,
 		return E_ITS_MAPC_PROCNUM_OOR;
 
 	if (!valid) {
-		vgic_its_free_collection(its, coll_id);
+		vgic_its_free_collection(kvm, its, coll_id);
 	} else {
 		collection = find_collection(its, coll_id);
 
-- 
2.20.1


* [PATCH 5/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (3 preceding siblings ...)
  2019-06-06 16:54 ` [PATCH 4/8] KVM: arm/arm64: vgic-its: Add kvm parameter to vgic_its_free_collection Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  2019-06-06 16:54 ` [PATCH 6/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs Marc Zyngier
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

The LPI translation cache needs to be discarded when an ITS command
may affect the translation of an LPI (DISCARD and MAPD with V=0) or
the routing of an LPI to a redistributor with disabled LPIs (MOVI,
MOVALL).

We decide to perform a full invalidation of the cache, irrespective
of the LPI that is affected. Commands are supposed to be rare enough
that it doesn't matter.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index f637edd77e1f..62917aa15493 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -577,6 +577,29 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
 	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 }
 
+static void vgic_its_invalidate_cache(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_translation_cache_entry *cte;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
+
+	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
+		/*
+		 * If we hit a NULL entry, there is nothing after this
+		 * point.
+		 */
+		if (!cte->irq)
+			break;
+
+		__vgic_put_lpi_locked(kvm, cte->irq);
+		cte->irq = NULL;
+	}
+
+	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+}
+
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq)
 {
@@ -743,6 +766,8 @@ static int vgic_its_cmd_handle_discard(struct kvm *kvm, struct vgic_its *its,
 		 * don't bother here since we clear the ITTE anyway and the
 		 * pending state is a property of the ITTE struct.
 		 */
+		vgic_its_invalidate_cache(kvm);
+
 		its_free_ite(kvm, ite);
 		return 0;
 	}
@@ -778,6 +803,8 @@ static int vgic_its_cmd_handle_movi(struct kvm *kvm, struct vgic_its *its,
 	ite->collection = collection;
 	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
 
+	vgic_its_invalidate_cache(kvm);
+
 	return update_affinity(ite->irq, vcpu);
 }
 
@@ -1007,6 +1034,8 @@ static void vgic_its_free_device(struct kvm *kvm, struct its_device *device)
 	list_for_each_entry_safe(ite, temp, &device->itt_head, ite_list)
 		its_free_ite(kvm, ite);
 
+	vgic_its_invalidate_cache(kvm);
+
 	list_del(&device->dev_list);
 	kfree(device);
 }
@@ -1260,6 +1289,8 @@ static int vgic_its_cmd_handle_movall(struct kvm *kvm, struct vgic_its *its,
 		vgic_put_irq(kvm, irq);
 	}
 
+	vgic_its_invalidate_cache(kvm);
+
 	kfree(intids);
 	return 0;
 }
-- 
2.20.1


* [PATCH 6/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (4 preceding siblings ...)
  2019-06-06 16:54 ` [PATCH 5/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  2019-06-06 16:54 ` [PATCH 7/8] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection Marc Zyngier
  2019-06-06 16:54 ` [PATCH 8/8] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic Marc Zyngier
  7 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

If a vcpu disables LPIs at its redistributor level, we need to make sure
we won't pend more interrupts. For this, we need to invalidate the LPI
translation cache.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c     | 2 +-
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 4 +++-
 virt/kvm/arm/vgic/vgic.h         | 1 +
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 62917aa15493..bd76683781e8 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -577,7 +577,7 @@ static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
 	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 }
 
-static void vgic_its_invalidate_cache(struct kvm *kvm)
+void vgic_its_invalidate_cache(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct vgic_translation_cache_entry *cte;
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index 9f4843fe9cda..2aea9f40fecd 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -200,8 +200,10 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
 
 	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
 
-	if (was_enabled && !vgic_cpu->lpis_enabled)
+	if (was_enabled && !vgic_cpu->lpis_enabled) {
 		vgic_flush_pending_lpis(vcpu);
+		vgic_its_invalidate_cache(vcpu->kvm);
+	}
 
 	if (!was_enabled && vgic_cpu->lpis_enabled)
 		vgic_enable_lpis(vcpu);
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 80cd40575bc9..04335e997907 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -318,6 +318,7 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq);
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
 void vgic_lpi_translation_cache_init(struct kvm *kvm);
+void vgic_its_invalidate_cache(struct kvm *kvm);
 
 #define LPI_CACHE_SIZE(kvm)	(atomic_read(&(kvm)->online_vcpus) * 4)
 
-- 
2.20.1


* [PATCH 7/8] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (5 preceding siblings ...)
  2019-06-06 16:54 ` [PATCH 6/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  2019-06-06 16:54 ` [PATCH 8/8] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic Marc Zyngier
  7 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

When performing an MSI injection, let's first check if the translation
is already in the cache. If so, let's inject it quickly without
going through the whole translation process.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 58 ++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.h     |  1 +
 2 files changed, 59 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index bd76683781e8..3021cbf89b70 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -600,6 +600,42 @@ void vgic_its_invalidate_cache(struct kvm *kvm)
 	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
 }
 
+static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
+					     u32 devid, u32 eventid)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_translation_cache_entry *cte;
+	struct vgic_irq *irq = NULL;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
+
+	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
+		/*
+		 * If we hit a NULL entry, there is nothing after this
+		 * point.
+		 */
+		if (!cte->irq)
+			break;
+
+		if (cte->db == db &&
+		    cte->devid == devid &&
+		    cte->eventid == eventid) {
+			/*
+			 * Move this entry to the head, as it is the
+			 * most recently used.
+			 */
+			list_move(&cte->entry, &dist->lpi_translation_cache);
+			irq = cte->irq;
+			break;
+		}
+	}
+
+	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+
+	return irq;
+}
+
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq)
 {
@@ -683,6 +719,25 @@ static int vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
 	return 0;
 }
 
+int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi)
+{
+	struct vgic_irq *irq;
+	unsigned long flags;
+	phys_addr_t db;
+
+	db = (u64)msi->address_hi << 32 | msi->address_lo;
+	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
+
+	if (!irq)
+		return -1;
+
+	raw_spin_lock_irqsave(&irq->irq_lock, flags);
+	irq->pending_latch = true;
+	vgic_queue_irq_unlock(kvm, irq, flags);
+
+	return 0;
+}
+
 /*
  * Queries the KVM IO bus framework to get the ITS pointer from the given
  * doorbell address.
@@ -694,6 +749,9 @@ int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
 	struct vgic_its *its;
 	int ret;
 
+	if (!vgic_its_inject_cached_translation(kvm, msi))
+		return 1;
+
 	its = vgic_msi_to_its(kvm, msi);
 	if (IS_ERR(its))
 		return PTR_ERR(its);
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 04335e997907..fb2c83d6f748 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -317,6 +317,7 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq);
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
+int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi);
 void vgic_lpi_translation_cache_init(struct kvm *kvm);
 void vgic_its_invalidate_cache(struct kvm *kvm);
 
-- 
2.20.1


* [PATCH 8/8] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic
  2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (6 preceding siblings ...)
  2019-06-06 16:54 ` [PATCH 7/8] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection Marc Zyngier
@ 2019-06-06 16:54 ` Marc Zyngier
  7 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-06 16:54 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Now that we have a cache of MSI->LPI translations, it is pretty
easy to implement kvm_arch_set_irq_inatomic (this cache can be
parsed without sleeping).

Hopefully, this will improve some LPI-heavy workloads.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-irqfd.c | 36 ++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-irqfd.c b/virt/kvm/arm/vgic/vgic-irqfd.c
index 99e026d2dade..9f203ed8c8f3 100644
--- a/virt/kvm/arm/vgic/vgic-irqfd.c
+++ b/virt/kvm/arm/vgic/vgic-irqfd.c
@@ -77,6 +77,15 @@ int kvm_set_routing_entry(struct kvm *kvm,
 	return r;
 }
 
+static void kvm_populate_msi(struct kvm_kernel_irq_routing_entry *e,
+			     struct kvm_msi *msi)
+{
+	msi->address_lo = e->msi.address_lo;
+	msi->address_hi = e->msi.address_hi;
+	msi->data = e->msi.data;
+	msi->flags = e->msi.flags;
+	msi->devid = e->msi.devid;
+}
 /**
  * kvm_set_msi: inject the MSI corresponding to the
  * MSI routing entry
@@ -90,21 +99,36 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 {
 	struct kvm_msi msi;
 
-	msi.address_lo = e->msi.address_lo;
-	msi.address_hi = e->msi.address_hi;
-	msi.data = e->msi.data;
-	msi.flags = e->msi.flags;
-	msi.devid = e->msi.devid;
-
 	if (!vgic_has_its(kvm))
 		return -ENODEV;
 
 	if (!level)
 		return -1;
 
+	kvm_populate_msi(e, &msi);
 	return vgic_its_inject_msi(kvm, &msi);
 }
 
+/**
+ * kvm_arch_set_irq_inatomic: fast-path for irqfd injection
+ *
+ * Currently only direct MSI injection is supported.
+ */
+int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
+			      struct kvm *kvm, int irq_source_id, int level,
+			      bool line_status)
+{
+	if (e->type == KVM_IRQ_ROUTING_MSI && vgic_has_its(kvm) && level) {
+		struct kvm_msi msi;
+
+		kvm_populate_msi(e, &msi);
+		if (!vgic_its_inject_cached_translation(kvm, &msi))
+			return 0;
+	}
+
+	return -EWOULDBLOCK;
+}
+
 int kvm_vgic_setup_default_irq_routing(struct kvm *kvm)
 {
 	struct kvm_irq_routing_entry *entries;
-- 
2.20.1


* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-06 16:54 ` [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
@ 2019-06-07  3:47   ` Saidi, Ali
  2019-06-07  7:38     ` Marc Zyngier
  2019-06-07  8:12   ` Julien Thierry
  2019-06-07 12:09   ` Auger Eric
  2 siblings, 1 reply; 25+ messages in thread
From: Saidi, Ali @ 2019-06-07  3:47 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah


On 6/6/19, 11:55 AM, "Marc Zyngier" <marc.zyngier@arm.com> wrote:

    Add the basic data structure that expresses an MSI to LPI
    translation as well as the allocation/release hooks.
    
    THe size of the cache is arbitrarily defined as 4*nr_vcpus.

A cache size of 8/vCPU should result in cache hits in most cases and 16/vCPU will pretty much always result in a cache hit.

Ali



* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-07  3:47   ` Saidi, Ali
@ 2019-06-07  7:38     ` Marc Zyngier
  0 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-07  7:38 UTC (permalink / raw)
  To: Saidi, Ali; +Cc: kvm, Raslan, KarimAllah, kvmarm, linux-arm-kernel

On Fri, 07 Jun 2019 04:47:29 +0100,
"Saidi, Ali" <alisaidi@amazon.com> wrote:
> 
> 
> On 6/6/19, 11:55 AM, "Marc Zyngier" <marc.zyngier@arm.com> wrote:
> 
>     Add the basic data structure that expresses an MSI to LPI
>     translation as well as the allocation/release hooks.
>     
>     THe size of the cache is arbitrarily defined as 4*nr_vcpus.
> 
> A cache size of 8/vCPU should result in cache hits in most cases and
> 16/vCPU will pretty much always result in a cache hit.

What is this interesting observation based on? On the face of it, this
is just as random as what I have already.

Thanks,

	M.

-- 
Jazz is not dead, it just smells funny.

* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-06 16:54 ` [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
  2019-06-07  3:47   ` Saidi, Ali
@ 2019-06-07  8:12   ` Julien Thierry
  2019-06-07  8:38     ` Marc Zyngier
  2019-06-07 12:09   ` Auger Eric
  2 siblings, 1 reply; 25+ messages in thread
From: Julien Thierry @ 2019-06-07  8:12 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Marc,

On 06/06/2019 17:54, Marc Zyngier wrote:
> Add the basic data structure that expresses an MSI to LPI
> translation as well as the allocation/release hooks.
> 
> THe size of the cache is arbitrarily defined as 4*nr_vcpus.
> 

Since this is arbitrary and people might want to try it with
different sizes, could the number of (per vCPU) ITS translation cache
entries be passed as a kernel parameter?

> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h        | 10 ++++++++++
>  virt/kvm/arm/vgic/vgic-init.c | 34 ++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic/vgic-its.c  |  2 ++
>  virt/kvm/arm/vgic/vgic.h      |  3 +++
>  4 files changed, 49 insertions(+)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index c36c86f1ec9a..5a0d6b07c5ef 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -173,6 +173,14 @@ struct vgic_io_device {
>  	struct kvm_io_device dev;
>  };
>  
> +struct vgic_translation_cache_entry {
> +	struct list_head	entry;
> +	phys_addr_t		db;
> +	u32			devid;
> +	u32			eventid;
> +	struct vgic_irq		*irq;
> +};
> +
>  struct vgic_its {
>  	/* The base address of the ITS control register frame */
>  	gpa_t			vgic_its_base;
> @@ -260,6 +268,8 @@ struct vgic_dist {
>  	struct list_head	lpi_list_head;
>  	int			lpi_list_count;
>  
> +	struct list_head	lpi_translation_cache;
> +
>  	/* used by vgic-debug */
>  	struct vgic_state_iter *iter;
>  
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 3bdb31eaed64..25ae25694a28 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -64,6 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  
>  	INIT_LIST_HEAD(&dist->lpi_list_head);
> +	INIT_LIST_HEAD(&dist->lpi_translation_cache);
>  	raw_spin_lock_init(&dist->lpi_list_lock);
>  }
>  
> @@ -260,6 +261,27 @@ static void kvm_vgic_vcpu_enable(struct kvm_vcpu *vcpu)
>  		vgic_v3_enable(vcpu);
>  }
>  
> +void vgic_lpi_translation_cache_init(struct kvm *kvm)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	int i;
> +
> +	if (!list_empty(&dist->lpi_translation_cache))
> +		return;
> +
> +	for (i = 0; i < LPI_CACHE_SIZE(kvm); i++) {
> +		struct vgic_translation_cache_entry *cte;
> +
> +		/* An allocation failure is not fatal */
> +		cte = kzalloc(sizeof(*cte), GFP_KERNEL);
> +		if (WARN_ON(!cte))
> +			break;
> +
> +		INIT_LIST_HEAD(&cte->entry);
> +		list_add(&cte->entry, &dist->lpi_translation_cache);
> +	}
> +}
> +
>  /*
>   * vgic_init: allocates and initializes dist and vcpu data structures
>   * depending on two dimensioning parameters:
> @@ -305,6 +327,7 @@ int vgic_init(struct kvm *kvm)
>  	}
>  
>  	if (vgic_has_its(kvm)) {
> +		vgic_lpi_translation_cache_init(kvm);
>  		ret = vgic_v4_init(kvm);
>  		if (ret)
>  			goto out;
> @@ -346,6 +369,17 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>  		INIT_LIST_HEAD(&dist->rd_regions);
>  	}
>  
> +	if (vgic_has_its(kvm)) {
> +		struct vgic_translation_cache_entry *cte, *tmp;
> +
> +		list_for_each_entry_safe(cte, tmp,
> +					 &dist->lpi_translation_cache, entry) {
> +			list_del(&cte->entry);
> +			kfree(cte);
> +		}
> +		INIT_LIST_HEAD(&dist->lpi_translation_cache);

I would expect that removing all entries from a list would leave that
list as a "clean" empty list. Is INIT_LIST_HEAD() really needed here?

> +	}
> +
>  	if (vgic_supports_direct_msis(kvm))
>  		vgic_v4_teardown(kvm);
>  }
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 44ceaccb18cf..5758504fd934 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -1696,6 +1696,8 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
>  			kfree(its);
>  			return ret;
>  		}
> +
> +		vgic_lpi_translation_cache_init(dev->kvm);

I'm not sure I understand why we need to call that here. Isn't the
single call in vgic_init() enough? Are there cases where the other call
might come too late (I guess I might discover that in the rest of the
series).

>  	}
>  
>  	mutex_init(&its->its_lock);
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index abeeffabc456..a58e1b263dca 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -316,6 +316,9 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
>  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
>  			 u32 devid, u32 eventid, struct vgic_irq **irq);
>  struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
> +void vgic_lpi_translation_cache_init(struct kvm *kvm);
> +
> +#define LPI_CACHE_SIZE(kvm)	(atomic_read(&(kvm)->online_vcpus) * 4)
>  
>  bool vgic_supports_direct_msis(struct kvm *kvm);
>  int vgic_v4_init(struct kvm *kvm);
> 

Cheers,

-- 
Julien Thierry

* Re: [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  2019-06-06 16:54 ` [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
@ 2019-06-07  8:35   ` Julien Thierry
  2019-06-07  8:51     ` Marc Zyngier
  2019-06-07 14:29   ` Auger Eric
  1 sibling, 1 reply; 25+ messages in thread
From: Julien Thierry @ 2019-06-07  8:35 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Marc,

On 06/06/2019 17:54, Marc Zyngier wrote:
> On a successful translation, preserve the parameters in the LPI
> translation cache. Each translation is reusing the last slot
> in the list, naturally evincting the least recently used entry.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/vgic/vgic-its.c | 41 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 5758504fd934..bc370b6c5afa 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -538,6 +538,45 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
>  	return 0;
>  }
>  
> +static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> +				       u32 devid, u32 eventid,
> +				       struct vgic_irq *irq)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	struct vgic_translation_cache_entry *cte;
> +	unsigned long flags;
> +
> +	/* Do not cache a directly injected interrupt */
> +	if (irq->hw)
> +		return;
> +
> +	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
> +
> +	/* Always reuse the last entry (LRU policy) */
> +	cte = list_last_entry(&dist->lpi_translation_cache,
> +			      typeof(*cte), entry);
> +
> +	/*
> +	 * Caching the translation implies having an extra reference
> +	 * to the interrupt, so drop the potential reference on what
> +	 * was in the cache, and increment it on the new interrupt.
> +	 */
> +	if (cte->irq)
> +		__vgic_put_lpi_locked(kvm, cte->irq);
> +
> +	vgic_get_irq_kref(irq);

If cte->irq == irq, can we avoid the ref putting and getting and just
move the list entry (and update cte)?

Cheers,

-- 
Julien Thierry

* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-07  8:12   ` Julien Thierry
@ 2019-06-07  8:38     ` Marc Zyngier
  0 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-07  8:38 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Julien,

On 07/06/2019 09:12, Julien Thierry wrote:
> Hi Marc,
> 
> On 06/06/2019 17:54, Marc Zyngier wrote:
>> Add the basic data structure that expresses an MSI to LPI
>> translation as well as the allocation/release hooks.
>>
>> THe size of the cache is arbitrarily defined as 4*nr_vcpus.
>>
> 
> Since this arbitrary and that people migh want to try it with different
> size, could the number of (per vCPU) ITS translation cache entries be
> passed as a kernel parameter?

I'm not overly keen on the kernel parameter, but I'm open to this being
an optional parameter at vITS creation time, for particularly creative
use cases... What I'd really want is a way to dynamically dimension it,
but I can't really think of a way.
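
For the sake of argument, such a knob could look like a device
attribute set on the ITS fd before the first vCPU runs. The sketch
below is purely hypothetical: the attribute number is a placeholder
and is not part of today's ABI; only KVM_SET_DEVICE_ATTR and the
KVM_DEV_ARM_VGIC_GRP_CTRL group exist as such:

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical attribute, not a real ABI constant */
#define KVM_DEV_ARM_ITS_CACHE_ENTRIES	0xbad

static int set_its_cache_size(int its_fd, __u64 entries_per_vcpu)
{
	struct kvm_device_attr attr = {
		.group	= KVM_DEV_ARM_VGIC_GRP_CTRL,
		.attr	= KVM_DEV_ARM_ITS_CACHE_ENTRIES,
		.addr	= (__u64)(unsigned long)&entries_per_vcpu,
	};

	return ioctl(its_fd, KVM_SET_DEVICE_ATTR, &attr);
}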

> 
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h        | 10 ++++++++++
>>  virt/kvm/arm/vgic/vgic-init.c | 34 ++++++++++++++++++++++++++++++++++
>>  virt/kvm/arm/vgic/vgic-its.c  |  2 ++
>>  virt/kvm/arm/vgic/vgic.h      |  3 +++
>>  4 files changed, 49 insertions(+)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index c36c86f1ec9a..5a0d6b07c5ef 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -173,6 +173,14 @@ struct vgic_io_device {
>>  	struct kvm_io_device dev;
>>  };
>>  
>> +struct vgic_translation_cache_entry {
>> +	struct list_head	entry;
>> +	phys_addr_t		db;
>> +	u32			devid;
>> +	u32			eventid;
>> +	struct vgic_irq		*irq;
>> +};
>> +
>>  struct vgic_its {
>>  	/* The base address of the ITS control register frame */
>>  	gpa_t			vgic_its_base;
>> @@ -260,6 +268,8 @@ struct vgic_dist {
>>  	struct list_head	lpi_list_head;
>>  	int			lpi_list_count;
>>  
>> +	struct list_head	lpi_translation_cache;
>> +
>>  	/* used by vgic-debug */
>>  	struct vgic_state_iter *iter;
>>  
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index 3bdb31eaed64..25ae25694a28 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -64,6 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
>>  	struct vgic_dist *dist = &kvm->arch.vgic;
>>  
>>  	INIT_LIST_HEAD(&dist->lpi_list_head);
>> +	INIT_LIST_HEAD(&dist->lpi_translation_cache);
>>  	raw_spin_lock_init(&dist->lpi_list_lock);
>>  }
>>  
>> @@ -260,6 +261,27 @@ static void kvm_vgic_vcpu_enable(struct kvm_vcpu *vcpu)
>>  		vgic_v3_enable(vcpu);
>>  }
>>  
>> +void vgic_lpi_translation_cache_init(struct kvm *kvm)
>> +{
>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>> +	int i;
>> +
>> +	if (!list_empty(&dist->lpi_translation_cache))
>> +		return;
>> +
>> +	for (i = 0; i < LPI_CACHE_SIZE(kvm); i++) {
>> +		struct vgic_translation_cache_entry *cte;
>> +
>> +		/* An allocation failure is not fatal */
>> +		cte = kzalloc(sizeof(*cte), GFP_KERNEL);
>> +		if (WARN_ON(!cte))
>> +			break;
>> +
>> +		INIT_LIST_HEAD(&cte->entry);
>> +		list_add(&cte->entry, &dist->lpi_translation_cache);
>> +	}
>> +}
>> +
>>  /*
>>   * vgic_init: allocates and initializes dist and vcpu data structures
>>   * depending on two dimensioning parameters:
>> @@ -305,6 +327,7 @@ int vgic_init(struct kvm *kvm)
>>  	}
>>  
>>  	if (vgic_has_its(kvm)) {
>> +		vgic_lpi_translation_cache_init(kvm);
>>  		ret = vgic_v4_init(kvm);
>>  		if (ret)
>>  			goto out;
>> @@ -346,6 +369,17 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>>  		INIT_LIST_HEAD(&dist->rd_regions);
>>  	}
>>  
>> +	if (vgic_has_its(kvm)) {
>> +		struct vgic_translation_cache_entry *cte, *tmp;
>> +
>> +		list_for_each_entry_safe(cte, tmp,
>> +					 &dist->lpi_translation_cache, entry) {
>> +			list_del(&cte->entry);
>> +			kfree(cte);
>> +		}
>> +		INIT_LIST_HEAD(&dist->lpi_translation_cache);
> 
> I would expect that removing all entries from a list would leave that
> list as a "clean" empty list. Is INIT_LIST_HEAD() really needed here?

You're right, that's a leftover from an earlier debugging session.

> 
>> +	}
>> +
>>  	if (vgic_supports_direct_msis(kvm))
>>  		vgic_v4_teardown(kvm);
>>  }
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index 44ceaccb18cf..5758504fd934 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -1696,6 +1696,8 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
>>  			kfree(its);
>>  			return ret;
>>  		}
>> +
>> +		vgic_lpi_translation_cache_init(dev->kvm);
> 
> I'm not sure I understand why we need to call that here. Isn't the
> single call in vgic_init() enough? Are there cases where the other call
> might come to late (I guess I might discover that in the rest of the
> series).

That's because you're allowed to create the ITS after having initialized
the vgic itself (this is guarded with a vgic_initialized() check).
Overall, we follow the same pattern as the GICv4 init. Yes, the API is a
mess.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  2019-06-07  8:35   ` Julien Thierry
@ 2019-06-07  8:51     ` Marc Zyngier
  2019-06-07  8:56       ` Julien Thierry
  0 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2019-06-07  8:51 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

On 07/06/2019 09:35, Julien Thierry wrote:
> Hi Marc,
> 
> On 06/06/2019 17:54, Marc Zyngier wrote:
>> On a successful translation, preserve the parameters in the LPI
>> translation cache. Each translation is reusing the last slot
>> in the list, naturally evincting the least recently used entry.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/vgic/vgic-its.c | 41 ++++++++++++++++++++++++++++++++++++
>>  1 file changed, 41 insertions(+)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index 5758504fd934..bc370b6c5afa 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -538,6 +538,45 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
>>  	return 0;
>>  }
>>  
>> +static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>> +				       u32 devid, u32 eventid,
>> +				       struct vgic_irq *irq)
>> +{
>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>> +	struct vgic_translation_cache_entry *cte;
>> +	unsigned long flags;
>> +
>> +	/* Do not cache a directly injected interrupt */
>> +	if (irq->hw)
>> +		return;
>> +
>> +	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
>> +
>> +	/* Always reuse the last entry (LRU policy) */
>> +	cte = list_last_entry(&dist->lpi_translation_cache,
>> +			      typeof(*cte), entry);
>> +
>> +	/*
>> +	 * Caching the translation implies having an extra reference
>> +	 * to the interrupt, so drop the potential reference on what
>> +	 * was in the cache, and increment it on the new interrupt.
>> +	 */
>> +	if (cte->irq)
>> +		__vgic_put_lpi_locked(kvm, cte->irq);
>> +
>> +	vgic_get_irq_kref(irq);
> 
> If cte->irq == irq, can we avoid the ref putting and getting and just
> move the list entry (and update cte)?

But in that case, we should have hit in the cache in the first place, no?
Or is there a particular race I'm not thinking of just yet?

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  2019-06-07  8:51     ` Marc Zyngier
@ 2019-06-07  8:56       ` Julien Thierry
  2019-06-07  9:16         ` Marc Zyngier
  0 siblings, 1 reply; 25+ messages in thread
From: Julien Thierry @ 2019-06-07  8:56 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah



On 07/06/2019 09:51, Marc Zyngier wrote:
> On 07/06/2019 09:35, Julien Thierry wrote:
>> Hi Marc,
>>
>> On 06/06/2019 17:54, Marc Zyngier wrote:
>>> On a successful translation, preserve the parameters in the LPI
>>> translation cache. Each translation is reusing the last slot
>>> in the list, naturally evincting the least recently used entry.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  virt/kvm/arm/vgic/vgic-its.c | 41 ++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 41 insertions(+)
>>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>> index 5758504fd934..bc370b6c5afa 100644
>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>> @@ -538,6 +538,45 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
>>>  	return 0;
>>>  }
>>>  
>>> +static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>>> +				       u32 devid, u32 eventid,
>>> +				       struct vgic_irq *irq)
>>> +{
>>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>>> +	struct vgic_translation_cache_entry *cte;
>>> +	unsigned long flags;
>>> +
>>> +	/* Do not cache a directly injected interrupt */
>>> +	if (irq->hw)
>>> +		return;
>>> +
>>> +	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
>>> +
>>> +	/* Always reuse the last entry (LRU policy) */
>>> +	cte = list_last_entry(&dist->lpi_translation_cache,
>>> +			      typeof(*cte), entry);
>>> +
>>> +	/*
>>> +	 * Caching the translation implies having an extra reference
>>> +	 * to the interrupt, so drop the potential reference on what
>>> +	 * was in the cache, and increment it on the new interrupt.
>>> +	 */
>>> +	if (cte->irq)
>>> +		__vgic_put_lpi_locked(kvm, cte->irq);
>>> +
>>> +	vgic_get_irq_kref(irq);
>>
>> If cte->irq == irq, can we avoid the ref putting and getting and just
>> move the list entry (and update cte)?
> But in that case, we should have hit in the cache the first place, no?
> Or is there a particular race I'm not thinking of just yet?
> 

Yes, I had not made it far enough in the series to see the cache hits
and assumed this function would also be used to update the LRU policy.

You can dismiss this comment, sorry for the noise.

Cheers,

-- 
Julien Thierry

* Re: [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  2019-06-07  8:56       ` Julien Thierry
@ 2019-06-07  9:16         ` Marc Zyngier
  0 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-07  9:16 UTC (permalink / raw)
  To: Julien Thierry, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

On 07/06/2019 09:56, Julien Thierry wrote:
> 
> 
> On 07/06/2019 09:51, Marc Zyngier wrote:
>> On 07/06/2019 09:35, Julien Thierry wrote:
>>> Hi Marc,
>>>
>>> On 06/06/2019 17:54, Marc Zyngier wrote:
>>>> On a successful translation, preserve the parameters in the LPI
>>>> translation cache. Each translation is reusing the last slot
>>>> in the list, naturally evincting the least recently used entry.
>>>>
>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>> ---
>>>>  virt/kvm/arm/vgic/vgic-its.c | 41 ++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 41 insertions(+)
>>>>
>>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>>> index 5758504fd934..bc370b6c5afa 100644
>>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>>> @@ -538,6 +538,45 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
>>>>  	return 0;
>>>>  }
>>>>  
>>>> +static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>>>> +				       u32 devid, u32 eventid,
>>>> +				       struct vgic_irq *irq)
>>>> +{
>>>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>>>> +	struct vgic_translation_cache_entry *cte;
>>>> +	unsigned long flags;
>>>> +
>>>> +	/* Do not cache a directly injected interrupt */
>>>> +	if (irq->hw)
>>>> +		return;
>>>> +
>>>> +	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
>>>> +
>>>> +	/* Always reuse the last entry (LRU policy) */
>>>> +	cte = list_last_entry(&dist->lpi_translation_cache,
>>>> +			      typeof(*cte), entry);
>>>> +
>>>> +	/*
>>>> +	 * Caching the translation implies having an extra reference
>>>> +	 * to the interrupt, so drop the potential reference on what
>>>> +	 * was in the cache, and increment it on the new interrupt.
>>>> +	 */
>>>> +	if (cte->irq)
>>>> +		__vgic_put_lpi_locked(kvm, cte->irq);
>>>> +
>>>> +	vgic_get_irq_kref(irq);
>>>
>>> If cte->irq == irq, can we avoid the ref putting and getting and just
>>> move the list entry (and update cte)?
>> But in that case, we should have hit in the cache the first place, no?
>> Or is there a particular race I'm not thinking of just yet?
>>
> 
> Yes, I had not made it far enough in the series to see the cache hits
> and assumed this function would also be used to update the LRU policy.
> 
> You can dismiss this comment, sorry for the noise.

Well, I think you're onto something here. Consider the following
(slightly improbable, but not impossible, scenario):

CPU0:                        CPU1:

interrupt arrives,
cache miss

<physical interrupt affinity change>

                             interrupt arrives,
                             cache miss

                             resolve translation,
                             cache allocation
resolve translation,
cache allocation

Oh look, we have the same interrupt in the cache twice. Nothing really
bad should result from that, but that's not really the anticipated
behaviour. Which means the list_last_entry() is not the right thing to
do, and we should look up this particular interrupt in the cache before
adding it. Probably indicates that a long list is not the best data
structure for a cache (who would have thought?).
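
Something like the helper below would do it (untested sketch, name
made up), reusing the same walk as the lookup path and called with
the lpi_list_lock held before we pick the LRU slot:

static bool vgic_its_cache_hit(struct vgic_dist *dist, phys_addr_t db,
			       u32 devid, u32 eventid)
{
	struct vgic_translation_cache_entry *cte;

	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
		if (!cte->irq)
			break;

		if (cte->db == db && cte->devid == devid &&
		    cte->eventid == eventid) {
			/* Already cached by the other CPU: just make it MRU */
			list_move(&cte->entry, &dist->lpi_translation_cache);
			return true;
		}
	}

	return false;
}

vgic_its_cache_translation() would then return early on a hit instead
of blindly recycling the last entry.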

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-06 16:54 ` [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
  2019-06-07  3:47   ` Saidi, Ali
  2019-06-07  8:12   ` Julien Thierry
@ 2019-06-07 12:09   ` Auger Eric
  2019-06-07 12:44     ` Marc Zyngier
  2 siblings, 1 reply; 25+ messages in thread
From: Auger Eric @ 2019-06-07 12:09 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Marc,

On 6/6/19 6:54 PM, Marc Zyngier wrote:
> Add the basic data structure that expresses an MSI to LPI
> translation as well as the allocation/release hooks.
> 
> The size of the cache is arbitrarily defined as 4*nr_vcpus.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  include/kvm/arm_vgic.h        | 10 ++++++++++
>  virt/kvm/arm/vgic/vgic-init.c | 34 ++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic/vgic-its.c  |  2 ++
>  virt/kvm/arm/vgic/vgic.h      |  3 +++
>  4 files changed, 49 insertions(+)
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index c36c86f1ec9a..5a0d6b07c5ef 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -173,6 +173,14 @@ struct vgic_io_device {
>  	struct kvm_io_device dev;
>  };
>  
> +struct vgic_translation_cache_entry {
> +	struct list_head	entry;
> +	phys_addr_t		db;
it is not obvious to me why you need the db field. Isn't the LPI
uniquely identified by the devid and eventid? If I recall correctly,
theoretically the architecture allows handling LPIs even without an ITS.
> +	u32			devid;
> +	u32			eventid;
> +	struct vgic_irq		*irq;
> +};
> +
>  struct vgic_its {
>  	/* The base address of the ITS control register frame */
>  	gpa_t			vgic_its_base;
> @@ -260,6 +268,8 @@ struct vgic_dist {
>  	struct list_head	lpi_list_head;
>  	int			lpi_list_count;
>  
> +	struct list_head	lpi_translation_cache;
> +
>  	/* used by vgic-debug */
>  	struct vgic_state_iter *iter;
>  
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 3bdb31eaed64..25ae25694a28 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -64,6 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  
>  	INIT_LIST_HEAD(&dist->lpi_list_head);
> +	INIT_LIST_HEAD(&dist->lpi_translation_cache);
>  	raw_spin_lock_init(&dist->lpi_list_lock);
>  }
>  
> @@ -260,6 +261,27 @@ static void kvm_vgic_vcpu_enable(struct kvm_vcpu *vcpu)
>  		vgic_v3_enable(vcpu);
>  }
>  
> +void vgic_lpi_translation_cache_init(struct kvm *kvm)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	int i;
> +
> +	if (!list_empty(&dist->lpi_translation_cache))
> +		return;
> +
> +	for (i = 0; i < LPI_CACHE_SIZE(kvm); i++) {
> +		struct vgic_translation_cache_entry *cte;
> +
> +		/* An allocation failure is not fatal */
> +		cte = kzalloc(sizeof(*cte), GFP_KERNEL);
> +		if (WARN_ON(!cte))
> +			break;
> +
> +		INIT_LIST_HEAD(&cte->entry);
> +		list_add(&cte->entry, &dist->lpi_translation_cache);
> +	}
> +}
> +
>  /*
>   * vgic_init: allocates and initializes dist and vcpu data structures
>   * depending on two dimensioning parameters:
> @@ -305,6 +327,7 @@ int vgic_init(struct kvm *kvm)
>  	}
>  
>  	if (vgic_has_its(kvm)) {
> +		vgic_lpi_translation_cache_init(kvm);
>  		ret = vgic_v4_init(kvm);
>  		if (ret)
>  			goto out;
> @@ -346,6 +369,17 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>  		INIT_LIST_HEAD(&dist->rd_regions);
>  	}
>  
> +	if (vgic_has_its(kvm)) {
> +		struct vgic_translation_cache_entry *cte, *tmp;
> +
> +		list_for_each_entry_safe(cte, tmp,
> +					 &dist->lpi_translation_cache, entry) {
> +			list_del(&cte->entry);
> +			kfree(cte);
> +		}
> +		INIT_LIST_HEAD(&dist->lpi_translation_cache);
> +	}
> +
>  	if (vgic_supports_direct_msis(kvm))
>  		vgic_v4_teardown(kvm);
>  }
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 44ceaccb18cf..5758504fd934 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -1696,6 +1696,8 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
>  			kfree(its);
>  			return ret;
>  		}
> +
> +		vgic_lpi_translation_cache_init(dev->kvm);
>  	}
>  
>  	mutex_init(&its->its_lock);
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index abeeffabc456..a58e1b263dca 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -316,6 +316,9 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
>  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
>  			 u32 devid, u32 eventid, struct vgic_irq **irq);
>  struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
> +void vgic_lpi_translation_cache_init(struct kvm *kvm);
> +
> +#define LPI_CACHE_SIZE(kvm)	(atomic_read(&(kvm)->online_vcpus) * 4)
Couldn't the cache be a function of the number of allocated LPIs? We
could realloc the list accordingly. I don't see why it depends on the
number of vcpus rather than on the number of assigned devices/MSIs.

Thanks

Eric

>  
>  bool vgic_supports_direct_msis(struct kvm *kvm);
>  int vgic_v4_init(struct kvm *kvm);
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 2/8] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive
  2019-06-06 16:54 ` [PATCH 2/8] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive Marc Zyngier
@ 2019-06-07 12:11   ` Auger Eric
  0 siblings, 0 replies; 25+ messages in thread
From: Auger Eric @ 2019-06-07 12:11 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Marc,
On 6/6/19 6:54 PM, Marc Zyngier wrote:
> Our LPI translation cache needs to be able to drop the refcount
> on an LPI whilst already holding the lpi_list_lock.
> 
> Let's add a new primitive for this.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> ---
>  virt/kvm/arm/vgic/vgic.c | 26 +++++++++++++++++---------
>  virt/kvm/arm/vgic/vgic.h |  1 +
>  2 files changed, 18 insertions(+), 9 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
> index 191deccf60bf..376a297e2169 100644
> --- a/virt/kvm/arm/vgic/vgic.c
> +++ b/virt/kvm/arm/vgic/vgic.c
> @@ -130,6 +130,22 @@ static void vgic_irq_release(struct kref *ref)
>  {
>  }
>  
> +/*
> + * Drop the refcount on the LPI. Must be called with lpi_list_lock held.
> + */
> +void __vgic_put_lpi_locked(struct kvm *kvm, struct vgic_irq *irq)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +
> +	if (!kref_put(&irq->refcount, vgic_irq_release))
> +		return;
> +
> +	list_del(&irq->lpi_list);
> +	dist->lpi_list_count--;
> +
> +	kfree(irq);
> +}
> +
>  void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>  {
>  	struct vgic_dist *dist = &kvm->arch.vgic;
> @@ -139,16 +155,8 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
>  		return;
>  
>  	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
> -	if (!kref_put(&irq->refcount, vgic_irq_release)) {
> -		raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
> -		return;
> -	};
> -
> -	list_del(&irq->lpi_list);
> -	dist->lpi_list_count--;
> +	__vgic_put_lpi_locked(kvm, irq);
>  	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
> -
> -	kfree(irq);
>  }
>  
>  void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index a58e1b263dca..80cd40575bc9 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -172,6 +172,7 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
>  		     gpa_t addr, int len);
>  struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
>  			      u32 intid);
> +void __vgic_put_lpi_locked(struct kvm *kvm, struct vgic_irq *irq);
>  void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
>  bool vgic_get_phys_line_level(struct vgic_irq *irq);
>  void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending);
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-07 12:09   ` Auger Eric
@ 2019-06-07 12:44     ` Marc Zyngier
  2019-06-07 14:15       ` Auger Eric
  0 siblings, 1 reply; 25+ messages in thread
From: Marc Zyngier @ 2019-06-07 12:44 UTC (permalink / raw)
  To: Auger Eric, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Eric,

On 07/06/2019 13:09, Auger Eric wrote:
> Hi Marc,
> 
> On 6/6/19 6:54 PM, Marc Zyngier wrote:
>> Add the basic data structure that expresses an MSI to LPI
>> translation as well as the allocation/release hooks.
>>
>> The size of the cache is arbitrarily defined as 4*nr_vcpus.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h        | 10 ++++++++++
>>  virt/kvm/arm/vgic/vgic-init.c | 34 ++++++++++++++++++++++++++++++++++
>>  virt/kvm/arm/vgic/vgic-its.c  |  2 ++
>>  virt/kvm/arm/vgic/vgic.h      |  3 +++
>>  4 files changed, 49 insertions(+)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index c36c86f1ec9a..5a0d6b07c5ef 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -173,6 +173,14 @@ struct vgic_io_device {
>>  	struct kvm_io_device dev;
>>  };
>>  
>> +struct vgic_translation_cache_entry {
>> +	struct list_head	entry;
>> +	phys_addr_t		db;
> it is not obvious to me why you need the db field. Isn't the LPI
> uniquely identified by the devid and eventid? If I recall correctly,
> theoretically the architecture allows handling LPIs even without an ITS.

Only having DID+EID is unfortunately not enough, and the translation has
to be per ITS. Think of a system with two ITSs, and a PCI device in
front of each of the ITSs. There is no reason why the two devices would
have different IDs, as they belong to different PCI hierarchies.

So the cache must take the source ITS into account. The alternative
would be to keep a separate cache per ITS, but that would lead to more
overhead on the fast path, having to look up the ITS first.

As for LPIs without ITS, that wouldn't need a cache at all.
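
To make that concrete, here is a purely illustrative sketch (the helper
name is made up, this is not part of the series) of how the injection
path can recover the doorbell that disambiguates the ITS:

static phys_addr_t vgic_msi_to_db(struct kvm_msi *msi)
{
        /*
         * The guest writes its MSI to the GITS_TRANSLATER frame of a
         * given ITS, so the target address of the MSI is exactly the
         * doorbell cached as its->vgic_its_base + GITS_TRANSLATER.
         */
        return ((u64)msi->address_hi << 32) | msi->address_lo;
}

The (db, devid, eventid) triple recovered this way is what a cache
lookup would compare against cte->db/devid/eventid.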

>> +	u32			devid;
>> +	u32			eventid;
>> +	struct vgic_irq		*irq;
>> +};
>> +
>>  struct vgic_its {
>>  	/* The base address of the ITS control register frame */
>>  	gpa_t			vgic_its_base;
>> @@ -260,6 +268,8 @@ struct vgic_dist {
>>  	struct list_head	lpi_list_head;
>>  	int			lpi_list_count;
>>  
>> +	struct list_head	lpi_translation_cache;
>> +
>>  	/* used by vgic-debug */
>>  	struct vgic_state_iter *iter;
>>  
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index 3bdb31eaed64..25ae25694a28 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -64,6 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
>>  	struct vgic_dist *dist = &kvm->arch.vgic;
>>  
>>  	INIT_LIST_HEAD(&dist->lpi_list_head);
>> +	INIT_LIST_HEAD(&dist->lpi_translation_cache);
>>  	raw_spin_lock_init(&dist->lpi_list_lock);
>>  }
>>  
>> @@ -260,6 +261,27 @@ static void kvm_vgic_vcpu_enable(struct kvm_vcpu *vcpu)
>>  		vgic_v3_enable(vcpu);
>>  }
>>  
>> +void vgic_lpi_translation_cache_init(struct kvm *kvm)
>> +{
>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>> +	int i;
>> +
>> +	if (!list_empty(&dist->lpi_translation_cache))
>> +		return;
>> +
>> +	for (i = 0; i < LPI_CACHE_SIZE(kvm); i++) {
>> +		struct vgic_translation_cache_entry *cte;
>> +
>> +		/* An allocation failure is not fatal */
>> +		cte = kzalloc(sizeof(*cte), GFP_KERNEL);
>> +		if (WARN_ON(!cte))
>> +			break;
>> +
>> +		INIT_LIST_HEAD(&cte->entry);
>> +		list_add(&cte->entry, &dist->lpi_translation_cache);
>> +	}
>> +}
>> +
>>  /*
>>   * vgic_init: allocates and initializes dist and vcpu data structures
>>   * depending on two dimensioning parameters:
>> @@ -305,6 +327,7 @@ int vgic_init(struct kvm *kvm)
>>  	}
>>  
>>  	if (vgic_has_its(kvm)) {
>> +		vgic_lpi_translation_cache_init(kvm);
>>  		ret = vgic_v4_init(kvm);
>>  		if (ret)
>>  			goto out;
>> @@ -346,6 +369,17 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>>  		INIT_LIST_HEAD(&dist->rd_regions);
>>  	}
>>  
>> +	if (vgic_has_its(kvm)) {
>> +		struct vgic_translation_cache_entry *cte, *tmp;
>> +
>> +		list_for_each_entry_safe(cte, tmp,
>> +					 &dist->lpi_translation_cache, entry) {
>> +			list_del(&cte->entry);
>> +			kfree(cte);
>> +		}
>> +		INIT_LIST_HEAD(&dist->lpi_translation_cache);
>> +	}
>> +
>>  	if (vgic_supports_direct_msis(kvm))
>>  		vgic_v4_teardown(kvm);
>>  }
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index 44ceaccb18cf..5758504fd934 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -1696,6 +1696,8 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
>>  			kfree(its);
>>  			return ret;
>>  		}
>> +
>> +		vgic_lpi_translation_cache_init(dev->kvm);
>>  	}
>>  
>>  	mutex_init(&its->its_lock);
>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
>> index abeeffabc456..a58e1b263dca 100644
>> --- a/virt/kvm/arm/vgic/vgic.h
>> +++ b/virt/kvm/arm/vgic/vgic.h
>> @@ -316,6 +316,9 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
>>  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
>>  			 u32 devid, u32 eventid, struct vgic_irq **irq);
>>  struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
>> +void vgic_lpi_translation_cache_init(struct kvm *kvm);
>> +
>> +#define LPI_CACHE_SIZE(kvm)	(atomic_read(&(kvm)->online_vcpus) * 4)
> Couldn't the cache be a function of the number of allocated LPIs? We
> could realloc the list accordingly. I don't see why it depends on the
> number of vcpus rather than on the number of assigned devices/MSIs.

How do you find out about the number of LPIs? That's really for the
guest to decide what it wants to do. Also, KVM itself doesn't have much
of a clue about the number of assigned devices or their MSI capability.
That's why I've suggested that userspace could be involved here.

So far, I've used the number of vcpus as MSIs are usually used to deal
with per-CPU queues. This allows the cache to scale with the number of
queues that the guest is expected to deal with. Ali's reply earlier seems
to indicate that this is a common pattern, but it is the multiplying
factor that is hard to express.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-07 12:44     ` Marc Zyngier
@ 2019-06-07 14:15       ` Auger Eric
  2019-06-07 15:04         ` Marc Zyngier
  0 siblings, 1 reply; 25+ messages in thread
From: Auger Eric @ 2019-06-07 14:15 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Marc,
On 6/7/19 2:44 PM, Marc Zyngier wrote:
> Hi Eric,
> 
> On 07/06/2019 13:09, Auger Eric wrote:
>> Hi Marc,
>>
>> On 6/6/19 6:54 PM, Marc Zyngier wrote:
>>> Add the basic data structure that expresses an MSI to LPI
>>> translation as well as the allocation/release hooks.
>>>
>>> The size of the cache is arbitrarily defined as 4*nr_vcpus.
>>>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  include/kvm/arm_vgic.h        | 10 ++++++++++
>>>  virt/kvm/arm/vgic/vgic-init.c | 34 ++++++++++++++++++++++++++++++++++
>>>  virt/kvm/arm/vgic/vgic-its.c  |  2 ++
>>>  virt/kvm/arm/vgic/vgic.h      |  3 +++
>>>  4 files changed, 49 insertions(+)
>>>
>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>> index c36c86f1ec9a..5a0d6b07c5ef 100644
>>> --- a/include/kvm/arm_vgic.h
>>> +++ b/include/kvm/arm_vgic.h
>>> @@ -173,6 +173,14 @@ struct vgic_io_device {
>>>  	struct kvm_io_device dev;
>>>  };
>>>  
>>> +struct vgic_translation_cache_entry {
>>> +	struct list_head	entry;
>>> +	phys_addr_t		db;
>> it is not obvious to me why you need the db field. Isn't the LPI
>> uniquely identified by the devid and eventid? If I recall correctly,
>> theoretically the architecture allows handling LPIs even without an ITS.
> 
> Only having DID+EID is unfortunately not enough, and the translation has
> to be per ITS. Think of a system with two ITSs, and a PCI device in
> front of each of the ITSs. There is no reason why the two devices would
> have different IDs, as they belong to different PCI hierarchies.
> 
> So the cache must take the source ITS into account. The alternative
> would be to keep a separate cache per ITS, but that would lead to more
> overhead on the fast path, having to look up the ITS first.

Yes, you're right. In the meantime I double-checked the IORT spec, and
the DeviceID is only unique within an ITS group node. But there can be
several group nodes.

"ITS group nodes describe which ITS units are in the system. A node
allows grouping of more than one ITS, but all ITSs in the group must
share a common understanding of DeviceID values. That is, a given
DeviceID must represent the same device for all ITS units in the group."
> 
> As for LPIs without ITS, that wouldn't need a cache at all.
> 
>>> +	u32			devid;
>>> +	u32			eventid;
>>> +	struct vgic_irq		*irq;
>>> +};
>>> +
>>>  struct vgic_its {
>>>  	/* The base address of the ITS control register frame */
>>>  	gpa_t			vgic_its_base;
>>> @@ -260,6 +268,8 @@ struct vgic_dist {
>>>  	struct list_head	lpi_list_head;
>>>  	int			lpi_list_count;
>>>  
>>> +	struct list_head	lpi_translation_cache;
>>> +
>>>  	/* used by vgic-debug */
>>>  	struct vgic_state_iter *iter;
>>>  
>>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>>> index 3bdb31eaed64..25ae25694a28 100644
>>> --- a/virt/kvm/arm/vgic/vgic-init.c
>>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>>> @@ -64,6 +64,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
>>>  	struct vgic_dist *dist = &kvm->arch.vgic;
>>>  
>>>  	INIT_LIST_HEAD(&dist->lpi_list_head);
>>> +	INIT_LIST_HEAD(&dist->lpi_translation_cache);
>>>  	raw_spin_lock_init(&dist->lpi_list_lock);
>>>  }
>>>  
>>> @@ -260,6 +261,27 @@ static void kvm_vgic_vcpu_enable(struct kvm_vcpu *vcpu)
>>>  		vgic_v3_enable(vcpu);
>>>  }
>>>  
>>> +void vgic_lpi_translation_cache_init(struct kvm *kvm)
>>> +{
>>> +	struct vgic_dist *dist = &kvm->arch.vgic;
>>> +	int i;
>>> +
>>> +	if (!list_empty(&dist->lpi_translation_cache))
>>> +		return;
>>> +
>>> +	for (i = 0; i < LPI_CACHE_SIZE(kvm); i++) {
>>> +		struct vgic_translation_cache_entry *cte;
>>> +
>>> +		/* An allocation failure is not fatal */
>>> +		cte = kzalloc(sizeof(*cte), GFP_KERNEL);
>>> +		if (WARN_ON(!cte))
>>> +			break;
>>> +
>>> +		INIT_LIST_HEAD(&cte->entry);
>>> +		list_add(&cte->entry, &dist->lpi_translation_cache);
>>> +	}
>>> +}
>>> +
>>>  /*
>>>   * vgic_init: allocates and initializes dist and vcpu data structures
>>>   * depending on two dimensioning parameters:
>>> @@ -305,6 +327,7 @@ int vgic_init(struct kvm *kvm)
>>>  	}
>>>  
>>>  	if (vgic_has_its(kvm)) {
>>> +		vgic_lpi_translation_cache_init(kvm);
>>>  		ret = vgic_v4_init(kvm);
>>>  		if (ret)
>>>  			goto out;
>>> @@ -346,6 +369,17 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
>>>  		INIT_LIST_HEAD(&dist->rd_regions);
>>>  	}
>>>  
>>> +	if (vgic_has_its(kvm)) {
>>> +		struct vgic_translation_cache_entry *cte, *tmp;
>>> +
>>> +		list_for_each_entry_safe(cte, tmp,
>>> +					 &dist->lpi_translation_cache, entry) {
>>> +			list_del(&cte->entry);
>>> +			kfree(cte);
>>> +		}
>>> +		INIT_LIST_HEAD(&dist->lpi_translation_cache);
>>> +	}
>>> +
>>>  	if (vgic_supports_direct_msis(kvm))
>>>  		vgic_v4_teardown(kvm);
>>>  }
>>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>>> index 44ceaccb18cf..5758504fd934 100644
>>> --- a/virt/kvm/arm/vgic/vgic-its.c
>>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>>> @@ -1696,6 +1696,8 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
>>>  			kfree(its);
>>>  			return ret;
>>>  		}
>>> +
>>> +		vgic_lpi_translation_cache_init(dev->kvm);
>>>  	}
>>>  
>>>  	mutex_init(&its->its_lock);
>>> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
>>> index abeeffabc456..a58e1b263dca 100644
>>> --- a/virt/kvm/arm/vgic/vgic.h
>>> +++ b/virt/kvm/arm/vgic/vgic.h
>>> @@ -316,6 +316,9 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
>>>  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
>>>  			 u32 devid, u32 eventid, struct vgic_irq **irq);
>>>  struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
>>> +void vgic_lpi_translation_cache_init(struct kvm *kvm);
>>> +
>>> +#define LPI_CACHE_SIZE(kvm)	(atomic_read(&(kvm)->online_vcpus) * 4)
>> Couldn't the cache be a function of the number of allocated LPIs? We
>> could realloc the list accordingly. I don't see why it depends on the
>> number of vcpus rather than on the number of assigned devices/MSIs.
> 
> How do you find out about the number of LPIs? That's really for the
> guest to decide what it wants to do. Also, KVM itself doesn't have much
> of a clue about the number of assigned devices or their MSI capability.
> That's why I've suggested that userspace could be involved here.

Can't we set up a heuristic based on dist->lpi_list_count, which is
incremented in vgic_add_lpi() on MAPI/MAPTI? Of course not all of those
LPIs belong to assigned devices. But currently the cache is used for all
LPIs, including those injected from userspace (KVM_SIGNAL_MSI).
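
Very roughly, something like this is what I have in mind (just a
sketch; the lpi_cache_count field and the 1024 cap are invented here
for illustration):

static void vgic_lpi_cache_maybe_grow(struct kvm *kvm)
{
        struct vgic_dist *dist = &kvm->arch.vgic;
        struct vgic_translation_cache_entry *cte;
        unsigned long flags;

        /* Stop growing once we track as many entries as mapped LPIs */
        if (dist->lpi_cache_count >= 1024 ||
            dist->lpi_cache_count >= dist->lpi_list_count)
                return;

        cte = kzalloc(sizeof(*cte), GFP_KERNEL);
        if (!cte)
                return;

        INIT_LIST_HEAD(&cte->entry);

        raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
        list_add_tail(&cte->entry, &dist->lpi_translation_cache);
        dist->lpi_cache_count++;
        raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
}

which could be called from vgic_add_lpi() once the new LPI is accounted
for.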

Otherwise, couldn't the existing interface between KVM and VFIO be
leveraged to pass this info between the two?

Thanks

Eric
> 
> So far, I've used the number of vcpus as MSIs are usually used to deal
> with per-CPU queues. This allows the cache to scale with the number of
> queues that the guest is expected to deal with. Ali's reply earlier seems
> to indicate that this is a common pattern, but it is the multiplying
> factor that is hard to express.
> 
> Thanks,
> 
> 	M.
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/8] KVM: arm/arm64: vgic-its: Add kvm parameter to vgic_its_free_collection
  2019-06-06 16:54 ` [PATCH 4/8] KVM: arm/arm64: vgic-its: Add kvm parameter to vgic_its_free_collection Marc Zyngier
@ 2019-06-07 14:29   ` Auger Eric
  2019-06-07 14:49     ` Marc Zyngier
  0 siblings, 1 reply; 25+ messages in thread
From: Auger Eric @ 2019-06-07 14:29 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Marc,
On 6/6/19 6:54 PM, Marc Zyngier wrote:
> As we are going to perform some VM-wide operations when freeing
> a collection, add the kvm pointer to vgic_its_free_collection.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Do you actually use that commit in subsequent patches?

Thanks

Eric
> ---
>  virt/kvm/arm/vgic/vgic-its.c | 11 ++++++-----
>  1 file changed, 6 insertions(+), 5 deletions(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index bc370b6c5afa..f637edd77e1f 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -885,7 +885,8 @@ static int vgic_its_alloc_collection(struct vgic_its *its,
>  	return 0;
>  }
>  
> -static void vgic_its_free_collection(struct vgic_its *its, u32 coll_id)
> +static void vgic_its_free_collection(struct kvm *kvm,
> +				     struct vgic_its *its, u32 coll_id)
>  {
>  	struct its_collection *collection;
>  	struct its_device *device;
> @@ -974,7 +975,7 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
>  	ite = vgic_its_alloc_ite(device, collection, event_id);
>  	if (IS_ERR(ite)) {
>  		if (new_coll)
> -			vgic_its_free_collection(its, coll_id);
> +			vgic_its_free_collection(kvm, its, coll_id);
>  		return PTR_ERR(ite);
>  	}
>  
> @@ -984,7 +985,7 @@ static int vgic_its_cmd_handle_mapi(struct kvm *kvm, struct vgic_its *its,
>  	irq = vgic_add_lpi(kvm, lpi_nr, vcpu);
>  	if (IS_ERR(irq)) {
>  		if (new_coll)
> -			vgic_its_free_collection(its, coll_id);
> +			vgic_its_free_collection(kvm, its, coll_id);
>  		its_free_ite(kvm, ite);
>  		return PTR_ERR(irq);
>  	}
> @@ -1025,7 +1026,7 @@ static void vgic_its_free_collection_list(struct kvm *kvm, struct vgic_its *its)
>  	struct its_collection *cur, *temp;
>  
>  	list_for_each_entry_safe(cur, temp, &its->collection_list, coll_list)
> -		vgic_its_free_collection(its, cur->collection_id);
> +		vgic_its_free_collection(kvm, its, cur->collection_id);
>  }
>  
>  /* Must be called with its_lock mutex held */
> @@ -1110,7 +1111,7 @@ static int vgic_its_cmd_handle_mapc(struct kvm *kvm, struct vgic_its *its,
>  		return E_ITS_MAPC_PROCNUM_OOR;
>  
>  	if (!valid) {
> -		vgic_its_free_collection(its, coll_id);
> +		vgic_its_free_collection(kvm, its, coll_id);
>  	} else {
>  		collection = find_collection(its, coll_id);
>  
> 

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  2019-06-06 16:54 ` [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
  2019-06-07  8:35   ` Julien Thierry
@ 2019-06-07 14:29   ` Auger Eric
  1 sibling, 0 replies; 25+ messages in thread
From: Auger Eric @ 2019-06-07 14:29 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

Hi Marc,

On 6/6/19 6:54 PM, Marc Zyngier wrote:
> On a successful translation, preserve the parameters in the LPI
> translation cache. Each translation is reusing the last slot
> in the list, naturally evincting the least recently used entry.
evicting
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/vgic/vgic-its.c | 41 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 41 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 5758504fd934..bc370b6c5afa 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -538,6 +538,45 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
>  	return 0;
>  }
>  
> +static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
> +				       u32 devid, u32 eventid,
> +				       struct vgic_irq *irq)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	struct vgic_translation_cache_entry *cte;
> +	unsigned long flags;
> +
> +	/* Do not cache a directly injected interrupt */
> +	if (irq->hw)
> +		return;
> +
> +	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
> +
> +	/* Always reuse the last entry (LRU policy) */
> +	cte = list_last_entry(&dist->lpi_translation_cache,
> +			      typeof(*cte), entry);
> +
> +	/*
> +	 * Caching the translation implies having an extra reference
> +	 * to the interrupt, so drop the potential reference on what
> +	 * was in the cache, and increment it on the new interrupt.
> +	 */
> +	if (cte->irq)
> +		__vgic_put_lpi_locked(kvm, cte->irq);
> +
> +	vgic_get_irq_kref(irq);
> +
> +	cte->db		= its->vgic_its_base + GITS_TRANSLATER;
> +	cte->devid	= devid;
> +	cte->eventid	= eventid;
> +	cte->irq	= irq;
> +
> +	/* Move the new translation to the head of the list */
> +	list_move(&cte->entry, &dist->lpi_translation_cache);
> +
> +	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
> +}
> +
>  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
>  			 u32 devid, u32 eventid, struct vgic_irq **irq)
>  {
> @@ -558,6 +597,8 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
>  	if (!vcpu->arch.vgic_cpu.lpis_enabled)
>  		return -EBUSY;
>  
> +	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
> +
>  	*irq = ite->irq;
>  	return 0;
>  }
> 
Otherwise looks good to me.

Thanks

Eric

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 4/8] KVM: arm/arm64: vgic-its: Add kvm parameter to vgic_its_free_collection
  2019-06-07 14:29   ` Auger Eric
@ 2019-06-07 14:49     ` Marc Zyngier
  0 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-07 14:49 UTC (permalink / raw)
  To: Auger Eric, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

On 07/06/2019 15:29, Auger Eric wrote:
> Hi Marc,
> On 6/6/19 6:54 PM, Marc Zyngier wrote:
>> As we are going to perform some VM-wide operations when freeing
>> a collection, add the kvm pointer to vgic_its_free_collection.
>>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Do you actually use that commit in subsequent patches?

Ah! That's a leftover of a previous version, where I was pointlessly
invalidating the cache on MAPC with V=0. You're absolutely right, this
patch is completely useless!

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-06-07 14:15       ` Auger Eric
@ 2019-06-07 15:04         ` Marc Zyngier
  0 siblings, 0 replies; 25+ messages in thread
From: Marc Zyngier @ 2019-06-07 15:04 UTC (permalink / raw)
  To: Auger Eric, linux-arm-kernel, kvmarm, kvm; +Cc: Raslan, KarimAllah

On 07/06/2019 15:15, Auger Eric wrote:
> Hi Marc,
> On 6/7/19 2:44 PM, Marc Zyngier wrote:
>> Hi Eric,
>>
>> On 07/06/2019 13:09, Auger Eric wrote:
>>> Hi Marc,

[...]

>>>> +#define LPI_CACHE_SIZE(kvm)	(atomic_read(&(kvm)->online_vcpus) * 4)
>>> Couldn't the cache be a function of the number of allocated LPIs? We
>>> could realloc the list accordingly. I don't see why it depends on the
>>> number of vcpus rather than on the number of assigned devices/MSIs.
>>
>> How do you find out about the number of LPIs? That's really for the
>> guest to decide what it wants to do. Also, KVM itself doesn't have much
>> of a clue about the number of assigned devices or their MSI capability.
>> That's why I've suggested that userspace could be involved here.
> 
> Can't we set up a heuristic based on dist->lpi_list_count, which is
> incremented in vgic_add_lpi() on MAPI/MAPTI? Of course not all of those
> LPIs belong to assigned devices. But currently the cache is used for all
> LPIs, including those injected from userspace (KVM_SIGNAL_MSI).

I'm happy to grow the cache on MAPI, but that doesn't solve the real
problem: how do we cap it to a value that is "good enough"?

> Otherwise, couldn't the existing interface between KVM and VFIO be
> leveraged to pass this info between the two?

Same thing. The problem is in defining the limit. I guess only people
deploying real workloads can tell us what is a reasonable default, and
we can also make that a tuneable parameter...
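
For example, something as simple as this would already let people
experiment (only a sketch, the parameter name and default are made up):

/* Per-vcpu scale factor for the LPI translation cache, load-time tuneable */
static unsigned int lpi_cache_entries_per_vcpu = 4;
module_param(lpi_cache_entries_per_vcpu, uint, 0444);

#define LPI_CACHE_SIZE(kvm)	\
	(atomic_read(&(kvm)->online_vcpus) * lpi_cache_entries_per_vcpu)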

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2019-06-07 15:05 UTC | newest]

Thread overview: 25+ messages
2019-06-06 16:54 [PATCH 0/8] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
2019-06-06 16:54 ` [PATCH 1/8] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
2019-06-07  3:47   ` Saidi, Ali
2019-06-07  7:38     ` Marc Zyngier
2019-06-07  8:12   ` Julien Thierry
2019-06-07  8:38     ` Marc Zyngier
2019-06-07 12:09   ` Auger Eric
2019-06-07 12:44     ` Marc Zyngier
2019-06-07 14:15       ` Auger Eric
2019-06-07 15:04         ` Marc Zyngier
2019-06-06 16:54 ` [PATCH 2/8] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive Marc Zyngier
2019-06-07 12:11   ` Auger Eric
2019-06-06 16:54 ` [PATCH 3/8] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
2019-06-07  8:35   ` Julien Thierry
2019-06-07  8:51     ` Marc Zyngier
2019-06-07  8:56       ` Julien Thierry
2019-06-07  9:16         ` Marc Zyngier
2019-06-07 14:29   ` Auger Eric
2019-06-06 16:54 ` [PATCH 4/8] KVM: arm/arm64: vgic-its: Add kvm parameter to vgic_its_free_collection Marc Zyngier
2019-06-07 14:29   ` Auger Eric
2019-06-07 14:49     ` Marc Zyngier
2019-06-06 16:54 ` [PATCH 5/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands Marc Zyngier
2019-06-06 16:54 ` [PATCH 6/8] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs Marc Zyngier
2019-06-06 16:54 ` [PATCH 7/8] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection Marc Zyngier
2019-06-06 16:54 ` [PATCH 8/8] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic Marc Zyngier
