kvmarm.lists.cs.columbia.edu archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache
@ 2019-07-25 15:35 Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 01/10] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
                   ` (11 more replies)
  0 siblings, 12 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

It recently became apparent[1] that our LPI injection path is not as
efficient as it could be when injecting interrupts coming from a VFIO
assigned device.

Although the proposed patch wasn't 100% correct, it outlined at least
two issues:

(1) Injecting an LPI from VFIO always results in a context switch to a
    worker thread: no good

(2) We have no way of amortising the cost of translating a DID+EID pair
    to an LPI number

The reason for (1) is that we may sleep when translating an LPI, so we
do need a context process. A way to fix that is to implement a small
LPI translation cache that could be looked up from an atomic
context. It would also solve (2).

This is what this small series proposes. It implements a very basic
LRU cache of pre-translated LPIs, which gets used to implement
kvm_arch_set_irq_inatomic. The size of the cache is currently
hard-coded at 16 times the number of vcpus, a number I have picked
under the influence of Ali Saidi. If that's not enough for you, blame
me, though.

Does it work? well, it doesn't crash, and is thus perfect. More
seriously, I don't really have a way to benchmark it directly, so my
observations are only indirect:

On a TX2 system, I run a 4 vcpu VM with an Ethernet interface passed
to it directly. From the host, I inject interrupts using debugfs. In
parallel, I look at the number of context switch, and the number of
interrupts on the host. Without this series, I get the same number for
both IRQ and CS (about half a million of each per second is pretty
easy to reach). With this series, the number of context switches drops
to something pretty small (in the low 2k), while the number of
interrupts stays the same.

Yes, this is a pretty rubbish benchmark, what did you expect? ;-)

Andre did run some benchmarks of his own, with some rather positive
results[2]. So I'm putting this out for people with real workloads to
try out and report what they see.

[1] https://lore.kernel.org/lkml/1552833373-19828-1-git-send-email-yuzenghui@huawei.com/
[2] https://www.spinics.net/lists/arm-kernel/msg742655.html

* From v2:

  - Added invalidation on turning the ITS off
  - Added invalidation on MAPC with V=0
  - Added Rb's from Eric

* From v1:

  - Fixed race on allocation, where the same LPI could be cached multiple times
  - Now invalidate the cache on vgic teardown, avoiding memory leaks
  - Change patch split slightly, general reshuffling
  - Small cleanups here and there
  - Rebased on 5.2-rc4

Marc Zyngier (10):
  KVM: arm/arm64: vgic: Add LPI translation cache definition
  KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive
  KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation
  KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
    specific commands
  KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
    disabling LPIs
  KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS
    disable
  KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic
    teardown
  KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI
    injection
  KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic

 include/kvm/arm_vgic.h           |   3 +
 virt/kvm/arm/vgic/vgic-init.c    |   5 +
 virt/kvm/arm/vgic/vgic-irqfd.c   |  36 +++++-
 virt/kvm/arm/vgic/vgic-its.c     | 207 +++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic-mmio-v3.c |   4 +-
 virt/kvm/arm/vgic/vgic.c         |  26 ++--
 virt/kvm/arm/vgic/vgic.h         |   5 +
 7 files changed, 270 insertions(+), 16 deletions(-)

-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 01/10] KVM: arm/arm64: vgic: Add LPI translation cache definition
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 02/10] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive Marc Zyngier
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

Add the basic data structure that expresses an MSI to LPI
translation as well as the allocation/release hooks.

The size of the cache is arbitrarily defined as 16*nr_vcpus.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 include/kvm/arm_vgic.h        |  3 +++
 virt/kvm/arm/vgic/vgic-init.c |  5 ++++
 virt/kvm/arm/vgic/vgic-its.c  | 49 +++++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.h      |  2 ++
 4 files changed, 59 insertions(+)

diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 46bbc949c20a..298cfc38e4a1 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -249,6 +249,9 @@ struct vgic_dist {
 	struct list_head	lpi_list_head;
 	int			lpi_list_count;
 
+	/* LPI translation cache */
+	struct list_head	lpi_translation_cache;
+
 	/* used by vgic-debug */
 	struct vgic_state_iter *iter;
 
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index bdbc297d06fb..80127ca9269f 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -53,6 +53,7 @@ void kvm_vgic_early_init(struct kvm *kvm)
 	struct vgic_dist *dist = &kvm->arch.vgic;
 
 	INIT_LIST_HEAD(&dist->lpi_list_head);
+	INIT_LIST_HEAD(&dist->lpi_translation_cache);
 	raw_spin_lock_init(&dist->lpi_list_lock);
 }
 
@@ -294,6 +295,7 @@ int vgic_init(struct kvm *kvm)
 	}
 
 	if (vgic_has_its(kvm)) {
+		vgic_lpi_translation_cache_init(kvm);
 		ret = vgic_v4_init(kvm);
 		if (ret)
 			goto out;
@@ -335,6 +337,9 @@ static void kvm_vgic_dist_destroy(struct kvm *kvm)
 		INIT_LIST_HEAD(&dist->rd_regions);
 	}
 
+	if (vgic_has_its(kvm))
+		vgic_lpi_translation_cache_destroy(kvm);
+
 	if (vgic_supports_direct_msis(kvm))
 		vgic_v4_teardown(kvm);
 }
diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 482036612adf..0e5c1519bbe2 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -138,6 +138,14 @@ struct its_ite {
 	u32 event_id;
 };
 
+struct vgic_translation_cache_entry {
+	struct list_head	entry;
+	phys_addr_t		db;
+	u32			devid;
+	u32			eventid;
+	struct vgic_irq		*irq;
+};
+
 /**
  * struct vgic_its_abi - ITS abi ops and settings
  * @cte_esz: collection table entry size
@@ -1657,6 +1665,45 @@ static int vgic_register_its_iodev(struct kvm *kvm, struct vgic_its *its,
 	return ret;
 }
 
+/* Default is 16 cached LPIs per vcpu */
+#define LPI_DEFAULT_PCPU_CACHE_SIZE	16
+
+void vgic_lpi_translation_cache_init(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	unsigned int sz;
+	int i;
+
+	if (!list_empty(&dist->lpi_translation_cache))
+		return;
+
+	sz = atomic_read(&kvm->online_vcpus) * LPI_DEFAULT_PCPU_CACHE_SIZE;
+
+	for (i = 0; i < sz; i++) {
+		struct vgic_translation_cache_entry *cte;
+
+		/* An allocation failure is not fatal */
+		cte = kzalloc(sizeof(*cte), GFP_KERNEL);
+		if (WARN_ON(!cte))
+			break;
+
+		INIT_LIST_HEAD(&cte->entry);
+		list_add(&cte->entry, &dist->lpi_translation_cache);
+	}
+}
+
+void vgic_lpi_translation_cache_destroy(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_translation_cache_entry *cte, *tmp;
+
+	list_for_each_entry_safe(cte, tmp,
+				 &dist->lpi_translation_cache, entry) {
+		list_del(&cte->entry);
+		kfree(cte);
+	}
+}
+
 #define INITIAL_BASER_VALUE						  \
 	(GIC_BASER_CACHEABILITY(GITS_BASER, INNER, RaWb)		| \
 	 GIC_BASER_CACHEABILITY(GITS_BASER, OUTER, SameAsInner)		| \
@@ -1685,6 +1732,8 @@ static int vgic_its_create(struct kvm_device *dev, u32 type)
 			kfree(its);
 			return ret;
 		}
+
+		vgic_lpi_translation_cache_init(dev->kvm);
 	}
 
 	mutex_init(&its->its_lock);
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 57205beaa981..11291a9c8c1c 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -305,6 +305,8 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq);
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
+void vgic_lpi_translation_cache_init(struct kvm *kvm);
+void vgic_lpi_translation_cache_destroy(struct kvm *kvm);
 
 bool vgic_supports_direct_msis(struct kvm *kvm);
 int vgic_v4_init(struct kvm *kvm);
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 02/10] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 01/10] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 03/10] KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation Marc Zyngier
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

Our LPI translation cache needs to be able to drop the refcount
on an LPI whilst already holding the lpi_list_lock.

Let's add a new primitive for this.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic.c | 26 +++++++++++++++++---------
 virt/kvm/arm/vgic/vgic.h |  1 +
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic.c b/virt/kvm/arm/vgic/vgic.c
index 04786c8ec77e..deb84712f550 100644
--- a/virt/kvm/arm/vgic/vgic.c
+++ b/virt/kvm/arm/vgic/vgic.c
@@ -119,6 +119,22 @@ static void vgic_irq_release(struct kref *ref)
 {
 }
 
+/*
+ * Drop the refcount on the LPI. Must be called with lpi_list_lock held.
+ */
+void __vgic_put_lpi_locked(struct kvm *kvm, struct vgic_irq *irq)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+
+	if (!kref_put(&irq->refcount, vgic_irq_release))
+		return;
+
+	list_del(&irq->lpi_list);
+	dist->lpi_list_count--;
+
+	kfree(irq);
+}
+
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
@@ -128,16 +144,8 @@ void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq)
 		return;
 
 	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
-	if (!kref_put(&irq->refcount, vgic_irq_release)) {
-		raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
-		return;
-	};
-
-	list_del(&irq->lpi_list);
-	dist->lpi_list_count--;
+	__vgic_put_lpi_locked(kvm, irq);
 	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
-
-	kfree(irq);
 }
 
 void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 11291a9c8c1c..e523b3a54590 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -161,6 +161,7 @@ vgic_get_mmio_region(struct kvm_vcpu *vcpu, struct vgic_io_device *iodev,
 		     gpa_t addr, int len);
 struct vgic_irq *vgic_get_irq(struct kvm *kvm, struct kvm_vcpu *vcpu,
 			      u32 intid);
+void __vgic_put_lpi_locked(struct kvm *kvm, struct vgic_irq *irq);
 void vgic_put_irq(struct kvm *kvm, struct vgic_irq *irq);
 bool vgic_get_phys_line_level(struct vgic_irq *irq);
 void vgic_irq_set_phys_pending(struct vgic_irq *irq, bool pending);
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 03/10] KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 01/10] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 02/10] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 04/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands Marc Zyngier
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

There's a number of cases where we need to invalidate the caching
of translations, so let's add basic support for that.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 23 +++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.h     |  1 +
 2 files changed, 24 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 0e5c1519bbe2..cc6b5e49a312 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -535,6 +535,29 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
 	return 0;
 }
 
+void vgic_its_invalidate_cache(struct kvm *kvm)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_translation_cache_entry *cte;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
+
+	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
+		/*
+		 * If we hit a NULL entry, there is nothing after this
+		 * point.
+		 */
+		if (!cte->irq)
+			break;
+
+		__vgic_put_lpi_locked(kvm, cte->irq);
+		cte->irq = NULL;
+	}
+
+	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+}
+
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq)
 {
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index e523b3a54590..09908b80fb1e 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -308,6 +308,7 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
 void vgic_lpi_translation_cache_init(struct kvm *kvm);
 void vgic_lpi_translation_cache_destroy(struct kvm *kvm);
+void vgic_its_invalidate_cache(struct kvm *kvm);
 
 bool vgic_supports_direct_msis(struct kvm *kvm);
 int vgic_v4_init(struct kvm *kvm);
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 04/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (2 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 03/10] KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 05/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs Marc Zyngier
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

The LPI translation cache needs to be discarded when an ITS command
may affect the translation of an LPI (DISCARD, MAPC and MAPD with V=0)
or the routing of an LPI to a redistributor with disabled LPIs (MOVI,
MOVALL).

We decide to perform a full invalidation of the cache, irrespective
of the LPI that is affected. Commands are supposed to be rare enough
that it doesn't matter.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index cc6b5e49a312..09a179820816 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -722,6 +722,8 @@ static int vgic_its_cmd_handle_discard(struct kvm *kvm, struct vgic_its *its,
 		 * don't bother here since we clear the ITTE anyway and the
 		 * pending state is a property of the ITTE struct.
 		 */
+		vgic_its_invalidate_cache(kvm);
+
 		its_free_ite(kvm, ite);
 		return 0;
 	}
@@ -757,6 +759,8 @@ static int vgic_its_cmd_handle_movi(struct kvm *kvm, struct vgic_its *its,
 	ite->collection = collection;
 	vcpu = kvm_get_vcpu(kvm, collection->target_addr);
 
+	vgic_its_invalidate_cache(kvm);
+
 	return update_affinity(ite->irq, vcpu);
 }
 
@@ -985,6 +989,8 @@ static void vgic_its_free_device(struct kvm *kvm, struct its_device *device)
 	list_for_each_entry_safe(ite, temp, &device->itt_head, ite_list)
 		its_free_ite(kvm, ite);
 
+	vgic_its_invalidate_cache(kvm);
+
 	list_del(&device->dev_list);
 	kfree(device);
 }
@@ -1090,6 +1096,7 @@ static int vgic_its_cmd_handle_mapc(struct kvm *kvm, struct vgic_its *its,
 
 	if (!valid) {
 		vgic_its_free_collection(its, coll_id);
+		vgic_its_invalidate_cache(kvm);
 	} else {
 		collection = find_collection(its, coll_id);
 
@@ -1238,6 +1245,8 @@ static int vgic_its_cmd_handle_movall(struct kvm *kvm, struct vgic_its *its,
 		vgic_put_irq(kvm, irq);
 	}
 
+	vgic_its_invalidate_cache(kvm);
+
 	kfree(intids);
 	return 0;
 }
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 05/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (3 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 04/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 16:38   ` Auger Eric
  2019-07-25 15:35 ` [PATCH v3 06/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS disable Marc Zyngier
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

If a vcpu disables LPIs at its redistributor level, we need to make sure
we won't pend more interrupts. For this, we need to invalidate the LPI
translation cache.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index 936962abc38d..cb60da48810d 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -192,8 +192,10 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
 
 	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
 
-	if (was_enabled && !vgic_cpu->lpis_enabled)
+	if (was_enabled && !vgic_cpu->lpis_enabled) {
 		vgic_flush_pending_lpis(vcpu);
+		vgic_its_invalidate_cache(vcpu->kvm);
+	}
 
 	if (!was_enabled && vgic_cpu->lpis_enabled)
 		vgic_enable_lpis(vcpu);
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 06/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS disable
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (4 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 05/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 16:38   ` Auger Eric
  2019-07-25 15:35 ` [PATCH v3 07/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic teardown Marc Zyngier
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

If an ITS gets disabled, we need to make sure that further interrupts
won't hit in the cache. For that, we invalidate the translation cache
when the ITS is disabled.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 09a179820816..05406bd92ce9 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1597,6 +1597,8 @@ static void vgic_mmio_write_its_ctlr(struct kvm *kvm, struct vgic_its *its,
 		goto out;
 
 	its->enabled = !!(val & GITS_CTLR_ENABLE);
+	if (!its->enabled)
+		vgic_its_invalidate_cache(kvm);
 
 	/*
 	 * Try to process any pending commands. This function bails out early
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 07/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic teardown
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (5 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 06/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS disable Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 16:37   ` Auger Eric
  2019-07-25 15:35 ` [PATCH v3 08/10] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

In order to avoid leaking vgic_irq structures on teardown, we need to
drop all references to LPIs before deallocating the cache itself.

This is done by invalidating the cache on vgic teardown.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index 05406bd92ce9..d3e90a9d0a7a 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -1731,6 +1731,8 @@ void vgic_lpi_translation_cache_destroy(struct kvm *kvm)
 	struct vgic_dist *dist = &kvm->arch.vgic;
 	struct vgic_translation_cache_entry *cte, *tmp;
 
+	vgic_its_invalidate_cache(kvm);
+
 	list_for_each_entry_safe(cte, tmp,
 				 &dist->lpi_translation_cache, entry) {
 		list_del(&cte->entry);
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 08/10] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (6 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 07/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic teardown Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 15:35 ` [PATCH v3 09/10] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection Marc Zyngier
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

On a successful translation, preserve the parameters in the LPI
translation cache. Each translation is reusing the last slot
in the list, naturally evicting the least recently used entry.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 86 ++++++++++++++++++++++++++++++++++++
 1 file changed, 86 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index d3e90a9d0a7a..e61d3ea0ab40 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -535,6 +535,90 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
 	return 0;
 }
 
+static struct vgic_irq *__vgic_its_check_cache(struct vgic_dist *dist,
+					       phys_addr_t db,
+					       u32 devid, u32 eventid)
+{
+	struct vgic_translation_cache_entry *cte;
+
+	list_for_each_entry(cte, &dist->lpi_translation_cache, entry) {
+		/*
+		 * If we hit a NULL entry, there is nothing after this
+		 * point.
+		 */
+		if (!cte->irq)
+			break;
+
+		if (cte->db != db || cte->devid != devid ||
+		    cte->eventid != eventid)
+			continue;
+
+		/*
+		 * Move this entry to the head, as it is the most
+		 * recently used.
+		 */
+		if (!list_is_first(&cte->entry, &dist->lpi_translation_cache))
+			list_move(&cte->entry, &dist->lpi_translation_cache);
+
+		return cte->irq;
+	}
+
+	return NULL;
+}
+
+static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
+				       u32 devid, u32 eventid,
+				       struct vgic_irq *irq)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_translation_cache_entry *cte;
+	unsigned long flags;
+	phys_addr_t db;
+
+	/* Do not cache a directly injected interrupt */
+	if (irq->hw)
+		return;
+
+	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
+
+	if (unlikely(list_empty(&dist->lpi_translation_cache)))
+		goto out;
+
+	/*
+	 * We could have raced with another CPU caching the same
+	 * translation behind our back, so let's check it is not in
+	 * already
+	 */
+	db = its->vgic_its_base + GITS_TRANSLATER;
+	if (__vgic_its_check_cache(dist, db, devid, eventid))
+		goto out;
+
+	/* Always reuse the last entry (LRU policy) */
+	cte = list_last_entry(&dist->lpi_translation_cache,
+			      typeof(*cte), entry);
+
+	/*
+	 * Caching the translation implies having an extra reference
+	 * to the interrupt, so drop the potential reference on what
+	 * was in the cache, and increment it on the new interrupt.
+	 */
+	if (cte->irq)
+		__vgic_put_lpi_locked(kvm, cte->irq);
+
+	vgic_get_irq_kref(irq);
+
+	cte->db		= db;
+	cte->devid	= devid;
+	cte->eventid	= eventid;
+	cte->irq	= irq;
+
+	/* Move the new translation to the head of the list */
+	list_move(&cte->entry, &dist->lpi_translation_cache);
+
+out:
+	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+}
+
 void vgic_its_invalidate_cache(struct kvm *kvm)
 {
 	struct vgic_dist *dist = &kvm->arch.vgic;
@@ -578,6 +662,8 @@ int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 	if (!vcpu->arch.vgic_cpu.lpis_enabled)
 		return -EBUSY;
 
+	vgic_its_cache_translation(kvm, its, devid, eventid, ite->irq);
+
 	*irq = ite->irq;
 	return 0;
 }
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 09/10] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (7 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 08/10] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-25 16:38   ` Auger Eric
  2019-07-25 15:35 ` [PATCH v3 10/10] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic Marc Zyngier
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

When performing an MSI injection, let's first check if the translation
is already in the cache. If so, let's inject it quickly without
going through the whole translation process.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-its.c | 36 ++++++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic.h     |  1 +
 2 files changed, 37 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index e61d3ea0ab40..2be6b66b3856 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -566,6 +566,20 @@ static struct vgic_irq *__vgic_its_check_cache(struct vgic_dist *dist,
 	return NULL;
 }
 
+static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
+					     u32 devid, u32 eventid)
+{
+	struct vgic_dist *dist = &kvm->arch.vgic;
+	struct vgic_irq *irq;
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
+	irq = __vgic_its_check_cache(dist, db, devid, eventid);
+	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
+
+	return irq;
+}
+
 static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
 				       u32 devid, u32 eventid,
 				       struct vgic_irq *irq)
@@ -725,6 +739,25 @@ static int vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
 	return 0;
 }
 
+int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi)
+{
+	struct vgic_irq *irq;
+	unsigned long flags;
+	phys_addr_t db;
+
+	db = (u64)msi->address_hi << 32 | msi->address_lo;
+	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
+
+	if (!irq)
+		return -1;
+
+	raw_spin_lock_irqsave(&irq->irq_lock, flags);
+	irq->pending_latch = true;
+	vgic_queue_irq_unlock(kvm, irq, flags);
+
+	return 0;
+}
+
 /*
  * Queries the KVM IO bus framework to get the ITS pointer from the given
  * doorbell address.
@@ -736,6 +769,9 @@ int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
 	struct vgic_its *its;
 	int ret;
 
+	if (!vgic_its_inject_cached_translation(kvm, msi))
+		return 1;
+
 	its = vgic_msi_to_its(kvm, msi);
 	if (IS_ERR(its))
 		return PTR_ERR(its);
diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
index 09908b80fb1e..db308a62b34d 100644
--- a/virt/kvm/arm/vgic/vgic.h
+++ b/virt/kvm/arm/vgic/vgic.h
@@ -306,6 +306,7 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
 int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
 			 u32 devid, u32 eventid, struct vgic_irq **irq);
 struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
+int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi);
 void vgic_lpi_translation_cache_init(struct kvm *kvm);
 void vgic_lpi_translation_cache_destroy(struct kvm *kvm);
 void vgic_its_invalidate_cache(struct kvm *kvm);
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 10/10] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (8 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 09/10] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection Marc Zyngier
@ 2019-07-25 15:35 ` Marc Zyngier
  2019-07-30 17:04 ` [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Andre Przywara
  2019-08-18 17:55 ` Marc Zyngier
  11 siblings, 0 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-07-25 15:35 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

From: Marc Zyngier <marc.zyngier@arm.com>

Now that we have a cache of MSI->LPI translations, it is pretty
easy to implement kvm_arch_set_irq_inatomic (this cache can be
parsed without sleeping).

Hopefully, this will improve some LPI-heavy workloads.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
---
 virt/kvm/arm/vgic/vgic-irqfd.c | 36 ++++++++++++++++++++++++++++------
 1 file changed, 30 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-irqfd.c b/virt/kvm/arm/vgic/vgic-irqfd.c
index c9304b88e720..d8cdfea5cc96 100644
--- a/virt/kvm/arm/vgic/vgic-irqfd.c
+++ b/virt/kvm/arm/vgic/vgic-irqfd.c
@@ -66,6 +66,15 @@ int kvm_set_routing_entry(struct kvm *kvm,
 	return r;
 }
 
+static void kvm_populate_msi(struct kvm_kernel_irq_routing_entry *e,
+			     struct kvm_msi *msi)
+{
+	msi->address_lo = e->msi.address_lo;
+	msi->address_hi = e->msi.address_hi;
+	msi->data = e->msi.data;
+	msi->flags = e->msi.flags;
+	msi->devid = e->msi.devid;
+}
 /**
  * kvm_set_msi: inject the MSI corresponding to the
  * MSI routing entry
@@ -79,21 +88,36 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 {
 	struct kvm_msi msi;
 
-	msi.address_lo = e->msi.address_lo;
-	msi.address_hi = e->msi.address_hi;
-	msi.data = e->msi.data;
-	msi.flags = e->msi.flags;
-	msi.devid = e->msi.devid;
-
 	if (!vgic_has_its(kvm))
 		return -ENODEV;
 
 	if (!level)
 		return -1;
 
+	kvm_populate_msi(e, &msi);
 	return vgic_its_inject_msi(kvm, &msi);
 }
 
+/**
+ * kvm_arch_set_irq_inatomic: fast-path for irqfd injection
+ *
+ * Currently only direct MSI injection is supported.
+ */
+int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
+			      struct kvm *kvm, int irq_source_id, int level,
+			      bool line_status)
+{
+	if (e->type == KVM_IRQ_ROUTING_MSI && vgic_has_its(kvm) && level) {
+		struct kvm_msi msi;
+
+		kvm_populate_msi(e, &msi);
+		if (!vgic_its_inject_cached_translation(kvm, &msi))
+			return 0;
+	}
+
+	return -EWOULDBLOCK;
+}
+
 int kvm_vgic_setup_default_irq_routing(struct kvm *kvm)
 {
 	struct kvm_irq_routing_entry *entries;
-- 
2.20.1

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 07/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic teardown
  2019-07-25 15:35 ` [PATCH v3 07/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic teardown Marc Zyngier
@ 2019-07-25 16:37   ` Auger Eric
  0 siblings, 0 replies; 17+ messages in thread
From: Auger Eric @ 2019-07-25 16:37 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

Hi Marc,

On 7/25/19 5:35 PM, Marc Zyngier wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> In order to avoid leaking vgic_irq structures on teardown, we need to
> drop all references to LPIs before deallocating the cache itself.
> 
> This is done by invalidating the cache on vgic teardown.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  virt/kvm/arm/vgic/vgic-its.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 05406bd92ce9..d3e90a9d0a7a 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -1731,6 +1731,8 @@ void vgic_lpi_translation_cache_destroy(struct kvm *kvm)
>  	struct vgic_dist *dist = &kvm->arch.vgic;
>  	struct vgic_translation_cache_entry *cte, *tmp;
>  
> +	vgic_its_invalidate_cache(kvm);
> +
>  	list_for_each_entry_safe(cte, tmp,
>  				 &dist->lpi_translation_cache, entry) {
>  		list_del(&cte->entry);
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 09/10] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection
  2019-07-25 15:35 ` [PATCH v3 09/10] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection Marc Zyngier
@ 2019-07-25 16:38   ` Auger Eric
  0 siblings, 0 replies; 17+ messages in thread
From: Auger Eric @ 2019-07-25 16:38 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

Hi Marc,

On 7/25/19 5:35 PM, Marc Zyngier wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> When performing an MSI injection, let's first check if the translation
> is already in the cache. If so, let's inject it quickly without
> going through the whole translation process.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Thanks

Eric
> ---
>  virt/kvm/arm/vgic/vgic-its.c | 36 ++++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic/vgic.h     |  1 +
>  2 files changed, 37 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index e61d3ea0ab40..2be6b66b3856 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -566,6 +566,20 @@ static struct vgic_irq *__vgic_its_check_cache(struct vgic_dist *dist,
>  	return NULL;
>  }
>  
> +static struct vgic_irq *vgic_its_check_cache(struct kvm *kvm, phys_addr_t db,
> +					     u32 devid, u32 eventid)
> +{
> +	struct vgic_dist *dist = &kvm->arch.vgic;
> +	struct vgic_irq *irq;
> +	unsigned long flags;
> +
> +	raw_spin_lock_irqsave(&dist->lpi_list_lock, flags);
> +	irq = __vgic_its_check_cache(dist, db, devid, eventid);
> +	raw_spin_unlock_irqrestore(&dist->lpi_list_lock, flags);
> +
> +	return irq;
> +}
> +
>  static void vgic_its_cache_translation(struct kvm *kvm, struct vgic_its *its,
>  				       u32 devid, u32 eventid,
>  				       struct vgic_irq *irq)
> @@ -725,6 +739,25 @@ static int vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
>  	return 0;
>  }
>  
> +int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi)
> +{
> +	struct vgic_irq *irq;
> +	unsigned long flags;
> +	phys_addr_t db;
> +
> +	db = (u64)msi->address_hi << 32 | msi->address_lo;
> +	irq = vgic_its_check_cache(kvm, db, msi->devid, msi->data);
> +
> +	if (!irq)
> +		return -1;
> +
> +	raw_spin_lock_irqsave(&irq->irq_lock, flags);
> +	irq->pending_latch = true;
> +	vgic_queue_irq_unlock(kvm, irq, flags);
> +
> +	return 0;
> +}
> +
>  /*
>   * Queries the KVM IO bus framework to get the ITS pointer from the given
>   * doorbell address.
> @@ -736,6 +769,9 @@ int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
>  	struct vgic_its *its;
>  	int ret;
>  
> +	if (!vgic_its_inject_cached_translation(kvm, msi))
> +		return 1;
> +
>  	its = vgic_msi_to_its(kvm, msi);
>  	if (IS_ERR(its))
>  		return PTR_ERR(its);
> diff --git a/virt/kvm/arm/vgic/vgic.h b/virt/kvm/arm/vgic/vgic.h
> index 09908b80fb1e..db308a62b34d 100644
> --- a/virt/kvm/arm/vgic/vgic.h
> +++ b/virt/kvm/arm/vgic/vgic.h
> @@ -306,6 +306,7 @@ int vgic_copy_lpi_list(struct kvm *kvm, struct kvm_vcpu *vcpu, u32 **intid_ptr);
>  int vgic_its_resolve_lpi(struct kvm *kvm, struct vgic_its *its,
>  			 u32 devid, u32 eventid, struct vgic_irq **irq);
>  struct vgic_its *vgic_msi_to_its(struct kvm *kvm, struct kvm_msi *msi);
> +int vgic_its_inject_cached_translation(struct kvm *kvm, struct kvm_msi *msi);
>  void vgic_lpi_translation_cache_init(struct kvm *kvm);
>  void vgic_lpi_translation_cache_destroy(struct kvm *kvm);
>  void vgic_its_invalidate_cache(struct kvm *kvm);
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 05/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs
  2019-07-25 15:35 ` [PATCH v3 05/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs Marc Zyngier
@ 2019-07-25 16:38   ` Auger Eric
  0 siblings, 0 replies; 17+ messages in thread
From: Auger Eric @ 2019-07-25 16:38 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

Hi Marc,

On 7/25/19 5:35 PM, Marc Zyngier wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> If a vcpu disables LPIs at its redistributor level, we need to make sure
> we won't pend more interrupts. For this, we need to invalidate the LPI
> translation cache.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> index 936962abc38d..cb60da48810d 100644
> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> @@ -192,8 +192,10 @@ static void vgic_mmio_write_v3r_ctlr(struct kvm_vcpu *vcpu,
>  
>  	vgic_cpu->lpis_enabled = val & GICR_CTLR_ENABLE_LPIS;
>  
> -	if (was_enabled && !vgic_cpu->lpis_enabled)
> +	if (was_enabled && !vgic_cpu->lpis_enabled) {
>  		vgic_flush_pending_lpis(vcpu);
> +		vgic_its_invalidate_cache(vcpu->kvm);
> +	}
>  
>  	if (!was_enabled && vgic_cpu->lpis_enabled)
>  		vgic_enable_lpis(vcpu);
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 06/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS disable
  2019-07-25 15:35 ` [PATCH v3 06/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS disable Marc Zyngier
@ 2019-07-25 16:38   ` Auger Eric
  0 siblings, 0 replies; 17+ messages in thread
From: Auger Eric @ 2019-07-25 16:38 UTC (permalink / raw)
  To: Marc Zyngier, linux-arm-kernel, kvmarm, kvm
  Cc: Marc Zyngier, Andre Przywara, Raslan, KarimAllah, Saidi, Ali

Hi Marc,

On 7/25/19 5:35 PM, Marc Zyngier wrote:
> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> If an ITS gets disabled, we need to make sure that further interrupts
> won't hit in the cache. For that, we invalidate the translation cache
> when the ITS is disabled.
> 
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric
> ---
>  virt/kvm/arm/vgic/vgic-its.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 09a179820816..05406bd92ce9 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -1597,6 +1597,8 @@ static void vgic_mmio_write_its_ctlr(struct kvm *kvm, struct vgic_its *its,
>  		goto out;
>  
>  	its->enabled = !!(val & GITS_CTLR_ENABLE);
> +	if (!its->enabled)
> +		vgic_its_invalidate_cache(kvm);
>  
>  	/*
>  	 * Try to process any pending commands. This function bails out early
> 
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (9 preceding siblings ...)
  2019-07-25 15:35 ` [PATCH v3 10/10] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic Marc Zyngier
@ 2019-07-30 17:04 ` Andre Przywara
  2019-08-18 17:55 ` Marc Zyngier
  11 siblings, 0 replies; 17+ messages in thread
From: Andre Przywara @ 2019-07-30 17:04 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, Marc Zyngier, Raslan, KarimAllah, Saidi,  Ali,
	linux-arm-kernel, kvmarm

On Thu, 25 Jul 2019 16:35:33 +0100
Marc Zyngier <maz@kernel.org> wrote:

Hi,

> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> It recently became apparent[1] that our LPI injection path is not as
> efficient as it could be when injecting interrupts coming from a VFIO
> assigned device.
> 
> Although the proposed patch wasn't 100% correct, it outlined at least
> two issues:
> 
> (1) Injecting an LPI from VFIO always results in a context switch to a
>     worker thread: no good
> 
> (2) We have no way of amortising the cost of translating a DID+EID pair
>     to an LPI number
> 
> The reason for (1) is that we may sleep when translating an LPI, so we
> do need a context process. A way to fix that is to implement a small
> LPI translation cache that could be looked up from an atomic
> context. It would also solve (2).
> 
> This is what this small series proposes. It implements a very basic
> LRU cache of pre-translated LPIs, which gets used to implement
> kvm_arch_set_irq_inatomic. The size of the cache is currently
> hard-coded at 16 times the number of vcpus, a number I have picked
> under the influence of Ali Saidi. If that's not enough for you, blame
> me, though.
> 
> Does it work? well, it doesn't crash, and is thus perfect. More
> seriously, I don't really have a way to benchmark it directly, so my
> observations are only indirect:
> 
> On a TX2 system, I run a 4 vcpu VM with an Ethernet interface passed
> to it directly. From the host, I inject interrupts using debugfs. In
> parallel, I look at the number of context switch, and the number of
> interrupts on the host. Without this series, I get the same number for
> both IRQ and CS (about half a million of each per second is pretty
> easy to reach). With this series, the number of context switches drops
> to something pretty small (in the low 2k), while the number of
> interrupts stays the same.
> 
> Yes, this is a pretty rubbish benchmark, what did you expect? ;-)
> 
> Andre did run some benchmarks of his own, with some rather positive
> results[2]. So I'm putting this out for people with real workloads to
> try out and report what they see.

And I gave this series a try, on top of 5.3-rc2, with Robin's patch [3] to
fix a VFIO IRQ breakage.
The results were very similar, though at least one performance number was
slightly worse than compared to this series on top of v5.2.
But nevertheless there is still the big improvement compared to the
baseline without this series, so:

(for the whole series):
Tested-by: Andre Przywara <andre.przywara@arm.com>

Cheers,
Andre.

[3] http://lists.infradead.org/pipermail/linux-arm-kernel/2019-July/669468.html

> [1] https://lore.kernel.org/lkml/1552833373-19828-1-git-send-email-yuzenghui@huawei.com/
> [2] https://www.spinics.net/lists/arm-kernel/msg742655.html
> 
> * From v2:
> 
>   - Added invalidation on turning the ITS off
>   - Added invalidation on MAPC with V=0
>   - Added Rb's from Eric
> 
> * From v1:
> 
>   - Fixed race on allocation, where the same LPI could be cached multiple times
>   - Now invalidate the cache on vgic teardown, avoiding memory leaks
>   - Change patch split slightly, general reshuffling
>   - Small cleanups here and there
>   - Rebased on 5.2-rc4
> 
> Marc Zyngier (10):
>   KVM: arm/arm64: vgic: Add LPI translation cache definition
>   KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive
>   KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
>     specific commands
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
>     disabling LPIs
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS
>     disable
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic
>     teardown
>   KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
>   KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI
>     injection
>   KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic
> 
>  include/kvm/arm_vgic.h           |   3 +
>  virt/kvm/arm/vgic/vgic-init.c    |   5 +
>  virt/kvm/arm/vgic/vgic-irqfd.c   |  36 +++++-
>  virt/kvm/arm/vgic/vgic-its.c     | 207 +++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic/vgic-mmio-v3.c |   4 +-
>  virt/kvm/arm/vgic/vgic.c         |  26 ++--
>  virt/kvm/arm/vgic/vgic.h         |   5 +
>  7 files changed, 270 insertions(+), 16 deletions(-)
> 

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache
  2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
                   ` (10 preceding siblings ...)
  2019-07-30 17:04 ` [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Andre Przywara
@ 2019-08-18 17:55 ` Marc Zyngier
  11 siblings, 0 replies; 17+ messages in thread
From: Marc Zyngier @ 2019-08-18 17:55 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm
  Cc: Raslan, KarimAllah, Andre Przywara, Saidi, Ali

On Thu, 25 Jul 2019 16:35:33 +0100
Marc Zyngier <maz@kernel.org> wrote:

> From: Marc Zyngier <marc.zyngier@arm.com>
> 
> It recently became apparent[1] that our LPI injection path is not as
> efficient as it could be when injecting interrupts coming from a VFIO
> assigned device.
> 
> Although the proposed patch wasn't 100% correct, it outlined at least
> two issues:
> 
> (1) Injecting an LPI from VFIO always results in a context switch to a
>     worker thread: no good
> 
> (2) We have no way of amortising the cost of translating a DID+EID pair
>     to an LPI number
> 
> The reason for (1) is that we may sleep when translating an LPI, so we
> do need a context process. A way to fix that is to implement a small
> LPI translation cache that could be looked up from an atomic
> context. It would also solve (2).
> 
> This is what this small series proposes. It implements a very basic
> LRU cache of pre-translated LPIs, which gets used to implement
> kvm_arch_set_irq_inatomic. The size of the cache is currently
> hard-coded at 16 times the number of vcpus, a number I have picked
> under the influence of Ali Saidi. If that's not enough for you, blame
> me, though.
> 
> Does it work? well, it doesn't crash, and is thus perfect. More
> seriously, I don't really have a way to benchmark it directly, so my
> observations are only indirect:
> 
> On a TX2 system, I run a 4 vcpu VM with an Ethernet interface passed
> to it directly. From the host, I inject interrupts using debugfs. In
> parallel, I look at the number of context switch, and the number of
> interrupts on the host. Without this series, I get the same number for
> both IRQ and CS (about half a million of each per second is pretty
> easy to reach). With this series, the number of context switches drops
> to something pretty small (in the low 2k), while the number of
> interrupts stays the same.
> 
> Yes, this is a pretty rubbish benchmark, what did you expect? ;-)
> 
> Andre did run some benchmarks of his own, with some rather positive
> results[2]. So I'm putting this out for people with real workloads to
> try out and report what they see.
> 
> [1] https://lore.kernel.org/lkml/1552833373-19828-1-git-send-email-yuzenghui@huawei.com/
> [2] https://www.spinics.net/lists/arm-kernel/msg742655.html
> 
> * From v2:
> 
>   - Added invalidation on turning the ITS off
>   - Added invalidation on MAPC with V=0
>   - Added Rb's from Eric
> 
> * From v1:
> 
>   - Fixed race on allocation, where the same LPI could be cached multiple times
>   - Now invalidate the cache on vgic teardown, avoiding memory leaks
>   - Change patch split slightly, general reshuffling
>   - Small cleanups here and there
>   - Rebased on 5.2-rc4
> 
> Marc Zyngier (10):
>   KVM: arm/arm64: vgic: Add LPI translation cache definition
>   KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive
>   KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
>     specific commands
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on
>     disabling LPIs
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS
>     disable
>   KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic
>     teardown
>   KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation
>   KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI
>     injection
>   KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic
> 
>  include/kvm/arm_vgic.h           |   3 +
>  virt/kvm/arm/vgic/vgic-init.c    |   5 +
>  virt/kvm/arm/vgic/vgic-irqfd.c   |  36 +++++-
>  virt/kvm/arm/vgic/vgic-its.c     | 207 +++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic/vgic-mmio-v3.c |   4 +-
>  virt/kvm/arm/vgic/vgic.c         |  26 ++--
>  virt/kvm/arm/vgic/vgic.h         |   5 +
>  7 files changed, 270 insertions(+), 16 deletions(-)
> 

FWIW, I've now queued this for 5.4, with Eric's RBs and Andre's TBs.

Thanks,

	M.
-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-08-18 17:55 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-25 15:35 [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Marc Zyngier
2019-07-25 15:35 ` [PATCH v3 01/10] KVM: arm/arm64: vgic: Add LPI translation cache definition Marc Zyngier
2019-07-25 15:35 ` [PATCH v3 02/10] KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive Marc Zyngier
2019-07-25 15:35 ` [PATCH v3 03/10] KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation Marc Zyngier
2019-07-25 15:35 ` [PATCH v3 04/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands Marc Zyngier
2019-07-25 15:35 ` [PATCH v3 05/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs Marc Zyngier
2019-07-25 16:38   ` Auger Eric
2019-07-25 15:35 ` [PATCH v3 06/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS disable Marc Zyngier
2019-07-25 16:38   ` Auger Eric
2019-07-25 15:35 ` [PATCH v3 07/10] KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic teardown Marc Zyngier
2019-07-25 16:37   ` Auger Eric
2019-07-25 15:35 ` [PATCH v3 08/10] KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation Marc Zyngier
2019-07-25 15:35 ` [PATCH v3 09/10] KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection Marc Zyngier
2019-07-25 16:38   ` Auger Eric
2019-07-25 15:35 ` [PATCH v3 10/10] KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic Marc Zyngier
2019-07-30 17:04 ` [PATCH v3 00/10] KVM: arm/arm64: vgic: ITS translation cache Andre Przywara
2019-08-18 17:55 ` Marc Zyngier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).