kvmarm.lists.cs.columbia.edu archive mirror
* [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7
@ 2020-04-22 16:18 Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 1/6] KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER Marc Zyngier
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-04-22 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Julien Grall, Andre Przywara

Here are a few vgic fixes I've been piling up during the merge window,
plus a couple that Zenghui contributed, and which I added to the mix.

The first patch fixes a silly off-by-one bug in the ACTIVE handling
code, where we fail to stop the guest when writing to the first set of
GICv2 SPIs. Oopsie boo.

The second patch improves the handling of the ACTIVE registers, which
we never synchronise on the read side (the distributor state can only
be updated when the vcpu exits). Let's fix it the same way we do it on
the write side (stop-the-world, read, restart). Yes, this is
expensive.
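
In code terms, the read side now does roughly this (a sketch of what
patch 2 implements; the real handler is vgic_mmio_read_active() in
that patch, and the halt is skipped for GICv2 private interrupts):

	mutex_lock(&vcpu->kvm->lock);
	kvm_arm_halt_guest(vcpu->kvm);		/* stop the world */
	val = __vgic_mmio_read_active(vcpu, addr, len);
	kvm_arm_resume_guest(vcpu->kvm);	/* restart */
	mutex_unlock(&vcpu->kvm->lock);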

The following two patches deal with an issue where we consider the HW
state of an interrupt when responding to a userspace access. We should
never do this, as the guest shouldn't be running at this stage and if
it is, it is absolutely fine to return random bits to userspace. It
could also be that there is no active guest context at this stage, and
you end up with an Oops, which nobody really enjoys.

The last two patches fix a couple of memory leaks.

* From v2:
  - Now handle userspace access to GICv2 GICD_I{S,C}PENDR, which never
    really worked (pointed out by James)
  - Collected tags from Andre and James

Marc Zyngier (4):
  KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER
  KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER
    read
  KVM: arm: vgic: Only use the virtual state when userspace accesses
    enable bits
  KVM: arm: vgic-v2: Only use the virtual state when userspace accesses
    pending bits

Zenghui Yu (2):
  KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  KVM: arm64: vgic-its: Fix memory leak on the error path of
    vgic_add_lpi()

 virt/kvm/arm/vgic/vgic-init.c    |   6 +
 virt/kvm/arm/vgic/vgic-its.c     |  11 +-
 virt/kvm/arm/vgic/vgic-mmio-v2.c |  16 ++-
 virt/kvm/arm/vgic/vgic-mmio-v3.c |  28 ++--
 virt/kvm/arm/vgic/vgic-mmio.c    | 228 +++++++++++++++++++++++--------
 virt/kvm/arm/vgic/vgic-mmio.h    |  19 +++
 6 files changed, 230 insertions(+), 78 deletions(-)

-- 
2.26.1


* [PATCH v3 1/6] KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER
  2020-04-22 16:18 [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7 Marc Zyngier
@ 2020-04-22 16:18 ` Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 2/6] KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read Marc Zyngier
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-04-22 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Julien Grall, André Przywara, stable

When deciding whether a guest has to be stopped we check whether this
is a private interrupt or not. Unfortunately, there's an off-by-one bug
here, and we fail to recognize a whole range of interrupts as being
global (GICv2 SPIs 32-63).

Fix the condition from > to >=.
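
To make the failure mode concrete: intid is the base INTID of the
32-bit register being accessed, so a write to GICD_ISACTIVER1
(covering SPIs 32-63) arrives here with intid == 32. Sketching the
check with the constants from the patch (VGIC_NR_PRIVATE_IRQS is 32):

	intid >  VGIC_NR_PRIVATE_IRQS	/* 32 >  32: false, guest keeps running */
	intid >= VGIC_NR_PRIVATE_IRQS	/* 32 >= 32: true,  guest is halted */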

Cc: stable@vger.kernel.org
Fixes: abd7229626b93 ("KVM: arm/arm64: Simplify active_change_prepare and plug race")
Reported-by: André Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 virt/kvm/arm/vgic/vgic-mmio.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index 2199302597faf..d085e047953fa 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -444,7 +444,7 @@ static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 static void vgic_change_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
 {
 	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
-	    intid > VGIC_NR_PRIVATE_IRQS)
+	    intid >= VGIC_NR_PRIVATE_IRQS)
 		kvm_arm_halt_guest(vcpu->kvm);
 }
 
@@ -452,7 +452,7 @@ static void vgic_change_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
 static void vgic_change_active_finish(struct kvm_vcpu *vcpu, u32 intid)
 {
 	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
-	    intid > VGIC_NR_PRIVATE_IRQS)
+	    intid >= VGIC_NR_PRIVATE_IRQS)
 		kvm_arm_resume_guest(vcpu->kvm);
 }
 
-- 
2.26.1


* [PATCH v3 2/6] KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read
  2020-04-22 16:18 [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7 Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 1/6] KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER Marc Zyngier
@ 2020-04-22 16:18 ` Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 3/6] KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits Marc Zyngier
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-04-22 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Julien Grall, Andre Przywara

When a guest tries to read the active state of its interrupts,
we currently just return whatever state we have in memory. This
means that if such an interrupt lives in a List Register on another
CPU, we fail to observe the latest active state for this interrupt.

In order to remedy this, stop all the other vcpus so that they exit
and we can observe the most recent value for the state. This is
similar to what we are doing for the write side of the same
registers, and results in new MMIO handlers for userspace (which
do not need to stop the guest, as it is supposed to be stopped
already).

Reported-by: Julien Grall <julien@xen.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 virt/kvm/arm/vgic/vgic-mmio-v2.c |   4 +-
 virt/kvm/arm/vgic/vgic-mmio-v3.c |  12 ++--
 virt/kvm/arm/vgic/vgic-mmio.c    | 100 ++++++++++++++++++++-----------
 virt/kvm/arm/vgic/vgic-mmio.h    |   3 +
 4 files changed, 75 insertions(+), 44 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio-v2.c b/virt/kvm/arm/vgic/vgic-mmio-v2.c
index 5945f062d7497..d63881f60e1a5 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v2.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v2.c
@@ -422,11 +422,11 @@ static const struct vgic_register_region vgic_v2_dist_registers[] = {
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ACTIVE_SET,
 		vgic_mmio_read_active, vgic_mmio_write_sactive,
-		NULL, vgic_mmio_uaccess_write_sactive, 1,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_sactive, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ACTIVE_CLEAR,
 		vgic_mmio_read_active, vgic_mmio_write_cactive,
-		NULL, vgic_mmio_uaccess_write_cactive, 1,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_cactive, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PRI,
 		vgic_mmio_read_priority, vgic_mmio_write_priority, NULL, NULL,
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index e72dcc4542475..f2b37a081f26a 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -553,11 +553,11 @@ static const struct vgic_register_region vgic_v3_dist_registers[] = {
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISACTIVER,
 		vgic_mmio_read_active, vgic_mmio_write_sactive,
-		NULL, vgic_mmio_uaccess_write_sactive, 1,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_sactive, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ICACTIVER,
 		vgic_mmio_read_active, vgic_mmio_write_cactive,
-		NULL, vgic_mmio_uaccess_write_cactive,
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_cactive,
 		1, VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_IPRIORITYR,
 		vgic_mmio_read_priority, vgic_mmio_write_priority, NULL, NULL,
@@ -625,12 +625,12 @@ static const struct vgic_register_region vgic_v3_rd_registers[] = {
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ISACTIVER0,
 		vgic_mmio_read_active, vgic_mmio_write_sactive,
-		NULL, vgic_mmio_uaccess_write_sactive,
-		4, VGIC_ACCESS_32bit),
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_sactive, 4,
+		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ICACTIVER0,
 		vgic_mmio_read_active, vgic_mmio_write_cactive,
-		NULL, vgic_mmio_uaccess_write_cactive,
-		4, VGIC_ACCESS_32bit),
+		vgic_uaccess_read_active, vgic_mmio_uaccess_write_cactive, 4,
+		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_IPRIORITYR0,
 		vgic_mmio_read_priority, vgic_mmio_write_priority, 32,
 		VGIC_ACCESS_32bit | VGIC_ACCESS_8bit),
diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index d085e047953fa..b38e94e8f74ad 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -348,8 +348,39 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 	}
 }
 
-unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
-				    gpa_t addr, unsigned int len)
+
+/*
+ * If we are fiddling with an IRQ's active state, we have to make sure the IRQ
+ * is not queued on some running VCPU's LRs, because then the change to the
+ * active state can be overwritten when the VCPU's state is synced coming back
+ * from the guest.
+ *
+ * For shared interrupts as well as GICv3 private interrupts, we have to
+ * stop all the VCPUs because interrupts can be migrated while we don't hold
+ * the IRQ locks and we don't want to be chasing moving targets.
+ *
+ * For GICv2 private interrupts we don't have to do anything because
+ * userspace accesses to the VGIC state already require all VCPUs to be
+ * stopped, and only the VCPU itself can modify its private interrupts
+ * active state, which guarantees that the VCPU is not running.
+ */
+static void vgic_access_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
+{
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
+	    intid >= VGIC_NR_PRIVATE_IRQS)
+		kvm_arm_halt_guest(vcpu->kvm);
+}
+
+/* See vgic_access_active_prepare */
+static void vgic_access_active_finish(struct kvm_vcpu *vcpu, u32 intid)
+{
+	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
+	    intid >= VGIC_NR_PRIVATE_IRQS)
+		kvm_arm_resume_guest(vcpu->kvm);
+}
+
+static unsigned long __vgic_mmio_read_active(struct kvm_vcpu *vcpu,
+					     gpa_t addr, unsigned int len)
 {
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	u32 value = 0;
@@ -359,6 +390,10 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 	for (i = 0; i < len * 8; i++) {
 		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
+		/*
+		 * Even for HW interrupts, don't evaluate the HW state as
+		 * all the guest is interested in is the virtual state.
+		 */
 		if (irq->active)
 			value |= (1U << i);
 
@@ -368,6 +403,29 @@ unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 	return value;
 }
 
+unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
+				    gpa_t addr, unsigned int len)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	u32 val;
+
+	mutex_lock(&vcpu->kvm->lock);
+	vgic_access_active_prepare(vcpu, intid);
+
+	val = __vgic_mmio_read_active(vcpu, addr, len);
+
+	vgic_access_active_finish(vcpu, intid);
+	mutex_unlock(&vcpu->kvm->lock);
+
+	return val;
+}
+
+unsigned long vgic_uaccess_read_active(struct kvm_vcpu *vcpu,
+				    gpa_t addr, unsigned int len)
+{
+	return __vgic_mmio_read_active(vcpu, addr, len);
+}
+
 /* Must be called with irq->irq_lock held */
 static void vgic_hw_irq_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 				      bool active, bool is_uaccess)
@@ -426,36 +484,6 @@ static void vgic_mmio_change_active(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
 		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
 }
 
-/*
- * If we are fiddling with an IRQ's active state, we have to make sure the IRQ
- * is not queued on some running VCPU's LRs, because then the change to the
- * active state can be overwritten when the VCPU's state is synced coming back
- * from the guest.
- *
- * For shared interrupts, we have to stop all the VCPUs because interrupts can
- * be migrated while we don't hold the IRQ locks and we don't want to be
- * chasing moving targets.
- *
- * For private interrupts we don't have to do anything because userspace
- * accesses to the VGIC state already require all VCPUs to be stopped, and
- * only the VCPU itself can modify its private interrupts active state, which
- * guarantees that the VCPU is not running.
- */
-static void vgic_change_active_prepare(struct kvm_vcpu *vcpu, u32 intid)
-{
-	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
-	    intid >= VGIC_NR_PRIVATE_IRQS)
-		kvm_arm_halt_guest(vcpu->kvm);
-}
-
-/* See vgic_change_active_prepare */
-static void vgic_change_active_finish(struct kvm_vcpu *vcpu, u32 intid)
-{
-	if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3 ||
-	    intid >= VGIC_NR_PRIVATE_IRQS)
-		kvm_arm_resume_guest(vcpu->kvm);
-}
-
 static void __vgic_mmio_write_cactive(struct kvm_vcpu *vcpu,
 				      gpa_t addr, unsigned int len,
 				      unsigned long val)
@@ -477,11 +505,11 @@ void vgic_mmio_write_cactive(struct kvm_vcpu *vcpu,
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 
 	mutex_lock(&vcpu->kvm->lock);
-	vgic_change_active_prepare(vcpu, intid);
+	vgic_access_active_prepare(vcpu, intid);
 
 	__vgic_mmio_write_cactive(vcpu, addr, len, val);
 
-	vgic_change_active_finish(vcpu, intid);
+	vgic_access_active_finish(vcpu, intid);
 	mutex_unlock(&vcpu->kvm->lock);
 }
 
@@ -514,11 +542,11 @@ void vgic_mmio_write_sactive(struct kvm_vcpu *vcpu,
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 
 	mutex_lock(&vcpu->kvm->lock);
-	vgic_change_active_prepare(vcpu, intid);
+	vgic_access_active_prepare(vcpu, intid);
 
 	__vgic_mmio_write_sactive(vcpu, addr, len, val);
 
-	vgic_change_active_finish(vcpu, intid);
+	vgic_access_active_finish(vcpu, intid);
 	mutex_unlock(&vcpu->kvm->lock);
 }
 
diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
index 5af2aefad4359..30713a44e3faa 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.h
+++ b/virt/kvm/arm/vgic/vgic-mmio.h
@@ -152,6 +152,9 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 				    gpa_t addr, unsigned int len);
 
+unsigned long vgic_uaccess_read_active(struct kvm_vcpu *vcpu,
+				    gpa_t addr, unsigned int len);
+
 void vgic_mmio_write_cactive(struct kvm_vcpu *vcpu,
 			     gpa_t addr, unsigned int len,
 			     unsigned long val);
-- 
2.26.1


* [PATCH v3 3/6] KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits
  2020-04-22 16:18 [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7 Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 1/6] KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 2/6] KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read Marc Zyngier
@ 2020-04-22 16:18 ` Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 4/6] KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits Marc Zyngier
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-04-22 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Julien Grall, Andre Przywara

There is no point in accessing the HW when writing to any of the
ISENABLER/ICENABLER registers from userspace, as only the guest
should be allowed to change the HW state.

Introduce new userspace-specific accessors that deal solely with
the virtual state.

Reported-by: James Morse <james.morse@arm.com>
Tested-by: James Morse <james.morse@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 virt/kvm/arm/vgic/vgic-mmio-v2.c |  6 +++--
 virt/kvm/arm/vgic/vgic-mmio-v3.c | 16 +++++++-----
 virt/kvm/arm/vgic/vgic-mmio.c    | 42 ++++++++++++++++++++++++++++++++
 virt/kvm/arm/vgic/vgic-mmio.h    |  8 ++++++
 4 files changed, 64 insertions(+), 8 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio-v2.c b/virt/kvm/arm/vgic/vgic-mmio-v2.c
index d63881f60e1a5..f51c6e939c761 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v2.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v2.c
@@ -409,10 +409,12 @@ static const struct vgic_register_region vgic_v2_dist_registers[] = {
 		NULL, vgic_mmio_uaccess_write_v2_group, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ENABLE_SET,
-		vgic_mmio_read_enable, vgic_mmio_write_senable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_senable,
+		NULL, vgic_uaccess_write_senable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ENABLE_CLEAR,
-		vgic_mmio_read_enable, vgic_mmio_write_cenable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_cenable,
+		NULL, vgic_uaccess_write_cenable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_SET,
 		vgic_mmio_read_pending, vgic_mmio_write_spending, NULL, NULL, 1,
diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
index f2b37a081f26a..416613f2400c5 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
@@ -538,10 +538,12 @@ static const struct vgic_register_region vgic_v3_dist_registers[] = {
 		vgic_mmio_read_group, vgic_mmio_write_group, NULL, NULL, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISENABLER,
-		vgic_mmio_read_enable, vgic_mmio_write_senable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_senable,
+		NULL, vgic_uaccess_write_senable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ICENABLER,
-		vgic_mmio_read_enable, vgic_mmio_write_cenable, NULL, NULL, 1,
+		vgic_mmio_read_enable, vgic_mmio_write_cenable,
+		NULL, vgic_uaccess_write_cenable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ_SHARED(GICD_ISPENDR,
 		vgic_mmio_read_pending, vgic_mmio_write_spending,
@@ -609,11 +611,13 @@ static const struct vgic_register_region vgic_v3_rd_registers[] = {
 	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_IGROUPR0,
 		vgic_mmio_read_group, vgic_mmio_write_group, 4,
 		VGIC_ACCESS_32bit),
-	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_ISENABLER0,
-		vgic_mmio_read_enable, vgic_mmio_write_senable, 4,
+	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ISENABLER0,
+		vgic_mmio_read_enable, vgic_mmio_write_senable,
+		NULL, vgic_uaccess_write_senable, 4,
 		VGIC_ACCESS_32bit),
-	REGISTER_DESC_WITH_LENGTH(SZ_64K + GICR_ICENABLER0,
-		vgic_mmio_read_enable, vgic_mmio_write_cenable, 4,
+	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ICENABLER0,
+		vgic_mmio_read_enable, vgic_mmio_write_cenable,
+		NULL, vgic_uaccess_write_cenable, 4,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_LENGTH_UACCESS(SZ_64K + GICR_ISPENDR0,
 		vgic_mmio_read_pending, vgic_mmio_write_spending,
diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index b38e94e8f74ad..6e30034d14645 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -184,6 +184,48 @@ void vgic_mmio_write_cenable(struct kvm_vcpu *vcpu,
 	}
 }
 
+int vgic_uaccess_write_senable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
+
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
+
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		irq->enabled = true;
+		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
+
+int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
+
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
+
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		irq->enabled = false;
+		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
+
 unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
 				     gpa_t addr, unsigned int len)
 {
diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
index 30713a44e3faa..327d0a6938e4d 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.h
+++ b/virt/kvm/arm/vgic/vgic-mmio.h
@@ -138,6 +138,14 @@ void vgic_mmio_write_cenable(struct kvm_vcpu *vcpu,
 			     gpa_t addr, unsigned int len,
 			     unsigned long val);
 
+int vgic_uaccess_write_senable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val);
+
+int vgic_uaccess_write_cenable(struct kvm_vcpu *vcpu,
+			       gpa_t addr, unsigned int len,
+			       unsigned long val);
+
 unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
 				     gpa_t addr, unsigned int len);
 
-- 
2.26.1


* [PATCH v3 4/6] KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits
  2020-04-22 16:18 [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7 Marc Zyngier
                   ` (2 preceding siblings ...)
  2020-04-22 16:18 ` [PATCH v3 3/6] KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits Marc Zyngier
@ 2020-04-22 16:18 ` Marc Zyngier
  2020-04-23 11:05   ` James Morse
  2020-04-22 16:18 ` [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy Marc Zyngier
  2020-04-22 16:18 ` [PATCH v3 6/6] KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi() Marc Zyngier
  5 siblings, 1 reply; 15+ messages in thread
From: Marc Zyngier @ 2020-04-22 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Julien Grall, Andre Przywara

There is no point in accessing the HW when writing to any of the
ISPENDR/ICPENDR registers from userspace, as only the guest should
be allowed to change the HW state.

Introduce new userspace-specific accessors that deal solely with
the virtual state. Note that the API differs from that of GICv3,
where userspace exclusively uses ISPENDR to set the state. Too
bad we can't reuse it.
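
For illustration only (a sketch, not the in-tree handlers), the
per-bit semantics of a userspace write end up as:

	/* GICv3: ISPENDR carries the full state, one bit per IRQ */
	irq->pending_latch = !!(val & BIT(i));

	/* GICv2: ISPENDR sets, ICPENDR clears, set bits only */
	if (val & BIT(i))
		irq->pending_latch = is_ispendr;

where is_ispendr is a hypothetical flag standing in for which of the
two accessors introduced below is running.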

Fixes: 82e40f558de56 ("KVM: arm/arm64: vgic-v2: Handle SGI bits in GICD_I{S,C}PENDR0 as WI")
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 virt/kvm/arm/vgic/vgic-mmio-v2.c |  6 ++-
 virt/kvm/arm/vgic/vgic-mmio.c    | 86 ++++++++++++++++++++++++--------
 virt/kvm/arm/vgic/vgic-mmio.h    |  8 +++
 3 files changed, 76 insertions(+), 24 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-mmio-v2.c b/virt/kvm/arm/vgic/vgic-mmio-v2.c
index f51c6e939c761..a016f07adc281 100644
--- a/virt/kvm/arm/vgic/vgic-mmio-v2.c
+++ b/virt/kvm/arm/vgic/vgic-mmio-v2.c
@@ -417,10 +417,12 @@ static const struct vgic_register_region vgic_v2_dist_registers[] = {
 		NULL, vgic_uaccess_write_cenable, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_SET,
-		vgic_mmio_read_pending, vgic_mmio_write_spending, NULL, NULL, 1,
+		vgic_mmio_read_pending, vgic_mmio_write_spending,
+		NULL, vgic_uaccess_write_spending, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_PENDING_CLEAR,
-		vgic_mmio_read_pending, vgic_mmio_write_cpending, NULL, NULL, 1,
+		vgic_mmio_read_pending, vgic_mmio_write_cpending,
+		NULL, vgic_uaccess_write_cpending, 1,
 		VGIC_ACCESS_32bit),
 	REGISTER_DESC_WITH_BITS_PER_IRQ(GIC_DIST_ACTIVE_SET,
 		vgic_mmio_read_active, vgic_mmio_write_sactive,
diff --git a/virt/kvm/arm/vgic/vgic-mmio.c b/virt/kvm/arm/vgic/vgic-mmio.c
index 6e30034d14645..b2d73fc0d1ef4 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.c
+++ b/virt/kvm/arm/vgic/vgic-mmio.c
@@ -261,17 +261,6 @@ unsigned long vgic_mmio_read_pending(struct kvm_vcpu *vcpu,
 	return value;
 }
 
-/* Must be called with irq->irq_lock held */
-static void vgic_hw_irq_spending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
-				 bool is_uaccess)
-{
-	if (is_uaccess)
-		return;
-
-	irq->pending_latch = true;
-	vgic_irq_set_phys_active(irq, true);
-}
-
 static bool is_vgic_v2_sgi(struct kvm_vcpu *vcpu, struct vgic_irq *irq)
 {
 	return (vgic_irq_is_sgi(irq->intid) &&
@@ -282,7 +271,6 @@ void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
-	bool is_uaccess = !kvm_get_running_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
@@ -312,22 +300,48 @@ void vgic_mmio_write_spending(struct kvm_vcpu *vcpu,
 			continue;
 		}
 
+		irq->pending_latch = true;
 		if (irq->hw)
-			vgic_hw_irq_spending(vcpu, irq, is_uaccess);
-		else
-			irq->pending_latch = true;
+			vgic_irq_set_phys_active(irq, true);
+
 		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
 		vgic_put_irq(vcpu->kvm, irq);
 	}
 }
 
-/* Must be called with irq->irq_lock held */
-static void vgic_hw_irq_cpending(struct kvm_vcpu *vcpu, struct vgic_irq *irq,
-				 bool is_uaccess)
+int vgic_uaccess_write_spending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val)
 {
-	if (is_uaccess)
-		return;
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
+
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
 
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		irq->pending_latch = true;
+
+		/*
+		 * GICv2 SGIs are terribly broken. We can't restore
+		 * the source of the interrupt, so just pick the vcpu
+		 * itself as the source...
+		 */
+		if (is_vgic_v2_sgi(vcpu, irq))
+			irq->source |= BIT(vcpu->vcpu_id);
+
+		vgic_queue_irq_unlock(vcpu->kvm, irq, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
+
+/* Must be called with irq->irq_lock held */
+static void vgic_hw_irq_cpending(struct kvm_vcpu *vcpu, struct vgic_irq *irq)
+{
 	irq->pending_latch = false;
 
 	/*
@@ -350,7 +364,6 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val)
 {
-	bool is_uaccess = !kvm_get_running_vcpu();
 	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
 	int i;
 	unsigned long flags;
@@ -381,7 +394,7 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 		}
 
 		if (irq->hw)
-			vgic_hw_irq_cpending(vcpu, irq, is_uaccess);
+			vgic_hw_irq_cpending(vcpu, irq);
 		else
 			irq->pending_latch = false;
 
@@ -390,6 +403,35 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 	}
 }
 
+int vgic_uaccess_write_cpending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val)
+{
+	u32 intid = VGIC_ADDR_TO_INTID(addr, 1);
+	int i;
+	unsigned long flags;
+
+	for_each_set_bit(i, &val, len * 8) {
+		struct vgic_irq *irq = vgic_get_irq(vcpu->kvm, vcpu, intid + i);
+
+		raw_spin_lock_irqsave(&irq->irq_lock, flags);
+		/*
+		 * More fun with GICv2 SGIs! If we're clearing one of them
+		 * from userspace, which source vcpu to clear? Let's not
+		 * even think of it, and blow the whole set.
+		 */
+		if (is_vgic_v2_sgi(vcpu, irq))
+			irq->source = 0;
+
+		irq->pending_latch = false;
+
+		raw_spin_unlock_irqrestore(&irq->irq_lock, flags);
+
+		vgic_put_irq(vcpu->kvm, irq);
+	}
+
+	return 0;
+}
 
 /*
  * If we are fiddling with an IRQ's active state, we have to make sure the IRQ
diff --git a/virt/kvm/arm/vgic/vgic-mmio.h b/virt/kvm/arm/vgic/vgic-mmio.h
index 327d0a6938e4d..fefcca2b14dc7 100644
--- a/virt/kvm/arm/vgic/vgic-mmio.h
+++ b/virt/kvm/arm/vgic/vgic-mmio.h
@@ -157,6 +157,14 @@ void vgic_mmio_write_cpending(struct kvm_vcpu *vcpu,
 			      gpa_t addr, unsigned int len,
 			      unsigned long val);
 
+int vgic_uaccess_write_spending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val);
+
+int vgic_uaccess_write_cpending(struct kvm_vcpu *vcpu,
+				gpa_t addr, unsigned int len,
+				unsigned long val);
+
 unsigned long vgic_mmio_read_active(struct kvm_vcpu *vcpu,
 				    gpa_t addr, unsigned int len);
 
-- 
2.26.1


* [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-22 16:18 [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7 Marc Zyngier
                   ` (3 preceding siblings ...)
  2020-04-22 16:18 ` [PATCH v3 4/6] KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits Marc Zyngier
@ 2020-04-22 16:18 ` Marc Zyngier
  2020-04-23 11:35   ` James Morse
  2020-04-22 16:18 ` [PATCH v3 6/6] KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi() Marc Zyngier
  5 siblings, 1 reply; 15+ messages in thread
From: Marc Zyngier @ 2020-04-22 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Julien Grall, Andre Przywara

From: Zenghui Yu <yuzenghui@huawei.com>

It's likely that the vcpu fails to handle all virtual interrupts if
userspace decides to destroy it, leaving the pending ones in the
ap_list. If an unhandled one is an LPI, its vgic_irq structure will
eventually be leaked because of an extra refcount increment in
vgic_queue_irq_unlock().

This was detected by kmemleak on almost every guest destroy; the
backtrace is as follows:

unreferenced object 0xffff80725aed5500 (size 128):
comm "CPU 5/KVM", pid 40711, jiffies 4298024754 (age 166366.512s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 08 01 a9 73 6d 80 ff ff ...........sm...
c8 61 ee a9 00 20 ff ff 28 1e 55 81 6c 80 ff ff .a... ..(.U.l...
backtrace:
[<000000004bcaa122>] kmem_cache_alloc_trace+0x2dc/0x418
[<0000000069c7dabb>] vgic_add_lpi+0x88/0x418
[<00000000bfefd5c5>] vgic_its_cmd_handle_mapi+0x4dc/0x588
[<00000000cf993975>] vgic_its_process_commands.part.5+0x484/0x1198
[<000000004bd3f8e3>] vgic_its_process_commands+0x50/0x80
[<00000000b9a65b2b>] vgic_mmio_write_its_cwriter+0xac/0x108
[<0000000009641ebb>] dispatch_mmio_write+0xd0/0x188
[<000000008f79d288>] __kvm_io_bus_write+0x134/0x240
[<00000000882f39ac>] kvm_io_bus_write+0xe0/0x150
[<0000000078197602>] io_mem_abort+0x484/0x7b8
[<0000000060954e3c>] kvm_handle_guest_abort+0x4cc/0xa58
[<00000000e0d0cd65>] handle_exit+0x24c/0x770
[<00000000b44a7fad>] kvm_arch_vcpu_ioctl_run+0x460/0x1988
[<0000000025fb897c>] kvm_vcpu_ioctl+0x4f8/0xee0
[<000000003271e317>] do_vfs_ioctl+0x160/0xcd8
[<00000000e7f39607>] ksys_ioctl+0x98/0xd8

Fix it by retiring all pending LPIs in the ap_list on the destroy path.

p.s. I can also reproduce it on a normal guest shutdown. It is because
userspace still sends LPIs to the vcpu (through the KVM_SIGNAL_MSI
ioctl) while the guest is being shut down and is unable to handle
them. A little strange though; I haven't dug further...
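
For reference, vgic_flush_pending_lpis() already exists for the path
where LPIs get disabled at the redistributor level; at this point in
the tree it looks roughly like this (virt/kvm/arm/vgic/vgic.c, quoted
from memory):

	void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu)
	{
		struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
		struct vgic_irq *irq, *tmp;
		unsigned long flags;

		raw_spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);

		list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head,
					 ap_list) {
			if (irq->intid >= VGIC_MIN_LPI) {
				raw_spin_lock(&irq->irq_lock);
				list_del(&irq->ap_list);
				irq->vcpu = NULL;
				raw_spin_unlock(&irq->irq_lock);

				/*
				 * Drops the reference taken by
				 * vgic_queue_irq_unlock() when the LPI
				 * was queued; the one that would
				 * otherwise be leaked.
				 */
				vgic_put_irq(vcpu->kvm, irq);
			}
		}

		raw_spin_unlock_irqrestore(&vgic_cpu->ap_list_lock, flags);
	}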

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200414030349.625-2-yuzenghui@huawei.com
---
 virt/kvm/arm/vgic/vgic-init.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index a963b9d766b73..53ec9b9d9bc43 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
 
+	/*
+	 * Retire all pending LPIs on this vcpu anyway as we're
+	 * going to destroy it.
+	 */
+	vgic_flush_pending_lpis(vcpu);
+
 	INIT_LIST_HEAD(&vgic_cpu->ap_list_head);
 }
 
-- 
2.26.1


* [PATCH v3 6/6] KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi()
  2020-04-22 16:18 [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7 Marc Zyngier
                   ` (4 preceding siblings ...)
  2020-04-22 16:18 ` [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy Marc Zyngier
@ 2020-04-22 16:18 ` Marc Zyngier
  5 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-04-22 16:18 UTC (permalink / raw)
  To: linux-arm-kernel, kvmarm, kvm; +Cc: Julien Grall, Andre Przywara

From: Zenghui Yu <yuzenghui@huawei.com>

If we're going to fail out of vgic_add_lpi(), let's make sure the
allocated vgic_irq memory is also freed. Though it seems that both
cases are unlikely to fail.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20200414030349.625-3-yuzenghui@huawei.com
---
 virt/kvm/arm/vgic/vgic-its.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
index d53d34a33e35d..c012a52b19f57 100644
--- a/virt/kvm/arm/vgic/vgic-its.c
+++ b/virt/kvm/arm/vgic/vgic-its.c
@@ -96,14 +96,21 @@ static struct vgic_irq *vgic_add_lpi(struct kvm *kvm, u32 intid,
 	 * We "cache" the configuration table entries in our struct vgic_irq's.
 	 * However we only have those structs for mapped IRQs, so we read in
 	 * the respective config data from memory here upon mapping the LPI.
+	 *
+	 * Should any of these fail, behave as if we couldn't create the LPI
+	 * by dropping the refcount and returning the error.
 	 */
 	ret = update_lpi_config(kvm, irq, NULL, false);
-	if (ret)
+	if (ret) {
+		vgic_put_irq(kvm, irq);
 		return ERR_PTR(ret);
+	}
 
 	ret = vgic_v3_lpi_sync_pending_status(kvm, irq);
-	if (ret)
+	if (ret) {
+		vgic_put_irq(kvm, irq);
 		return ERR_PTR(ret);
+	}
 
 	return irq;
 }
-- 
2.26.1


* Re: [PATCH v3 4/6] KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits
  2020-04-22 16:18 ` [PATCH v3 4/6] KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits Marc Zyngier
@ 2020-04-23 11:05   ` James Morse
  0 siblings, 0 replies; 15+ messages in thread
From: James Morse @ 2020-04-23 11:05 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

Hi Marc,

On 22/04/2020 17:18, Marc Zyngier wrote:
> There is no point in accessing the HW when writing to any of the
> ISPENDR/ICPENDR registers from userspace, as only the guest should
> be allowed to change the HW state.
> 
> Introduce new userspace-specific accessors that deal solely with
> the virtual state. Note that the API differs from that of GICv3,
> where userspace exclusively uses ISPENDR to set the state. Too
> bad we can't reuse it.

Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James

* Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-22 16:18 ` [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy Marc Zyngier
@ 2020-04-23 11:35   ` James Morse
  2020-04-23 11:57     ` Zenghui Yu
  2020-04-23 12:03     ` Marc Zyngier
  0 siblings, 2 replies; 15+ messages in thread
From: James Morse @ 2020-04-23 11:35 UTC (permalink / raw)
  To: Marc Zyngier, Zenghui Yu
  Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

Hi Marc, Zenghui,

On 22/04/2020 17:18, Marc Zyngier wrote:
> From: Zenghui Yu <yuzenghui@huawei.com>
> 
> It's likely that the vcpu fails to handle all virtual interrupts if
> userspace decides to destroy it, leaving the pending ones in the
> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
> eventually be leaked because of an extra refcount increment in
> vgic_queue_irq_unlock().

> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index a963b9d766b73..53ec9b9d9bc43 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>  
> +	/*
> +	 * Retire all pending LPIs on this vcpu anyway as we're
> +	 * going to destroy it.
> +	 */

Looking at the other caller, do we need something like:
|	if (vgic_cpu->lpis_enabled)

?

> +	vgic_flush_pending_lpis(vcpu);
> +

Otherwise, I get this on a gic-v2 machine!:
[ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
[ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
[ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
5.7.0-rc2-00006-g4fb0f7bb0e27 #2
[ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development
Platform, BIOS EDK II Jul 30 2018
[ 1742.222596] Call trace:
[ 1742.225059]  dump_backtrace+0x0/0x328
[ 1742.228738]  show_stack+0x18/0x28
[ 1742.232071]  dump_stack+0x134/0x1b0
[ 1742.235578]  print_address_description.isra.0+0x6c/0x350
[ 1742.240910]  __kasan_report+0x10c/0x180
[ 1742.244763]  kasan_report+0x4c/0x68
[ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
[ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
[ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
[ 1742.261919]  kvm_vgic_destroy+0x30/0x48
[ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
[ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
[ 1742.273567]  kvm_vm_release+0x3c/0x60
[ 1742.277248]  __fput+0x218/0x630
[ 1742.280406]  ____fput+0x10/0x20
[ 1742.283565]  task_work_run+0xd8/0x1f0
[ 1742.287245]  do_exit+0x87c/0x2640
[ 1742.290575]  do_group_exit+0xd0/0x258
[ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
[ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
[ 1742.303529]  do_el0_svc+0x48/0xd0
[ 1742.306861]  el0_sync_handler+0x11c/0x1b8
[ 1742.310888]  el0_sync+0x158/0x180
[ 1742.315716] The buggy address belongs to the page:
[ 1742.320529] page:fffffe002366fc40 refcount:0 mapcount:0 mapping:000000007e21d29f index:0x0
[ 1742.328821] flags: 0x2ffff00000000000()
[ 1742.332678] raw: 2ffff00000000000 0000000000000000 ffffffff23660401 0000000000000000
[ 1742.340449] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 1742.348215] page dumped because: kasan: bad access detected
[ 1742.355304] Memory state around the buggy address:
[ 1742.360115]  ffff0008e1bf1e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.367360]  ffff0008e1bf1e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.374606] >ffff0008e1bf1f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.381851]                                   ^
[ 1742.386399]  ffff0008e1bf1f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.393645]  ffff0008e1bf2000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 1742.400889] ==================================================================
[ 1742.408132] Disabling lock debugging due to kernel taint


With that:
Reviewed-by: James Morse <james.morse@arm.com>


Thanks,

James

* Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-23 11:35   ` James Morse
@ 2020-04-23 11:57     ` Zenghui Yu
  2020-04-23 14:34       ` James Morse
  2020-04-23 12:03     ` Marc Zyngier
  1 sibling, 1 reply; 15+ messages in thread
From: Zenghui Yu @ 2020-04-23 11:57 UTC (permalink / raw)
  To: James Morse, Marc Zyngier
  Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

Hi James,

Thanks for having a look and testing it!

On 2020/4/23 19:35, James Morse wrote:
> Hi Marc, Zenghui,
> 
> On 22/04/2020 17:18, Marc Zyngier wrote:
>> From: Zenghui Yu <yuzenghui@huawei.com>
>>
>> It's likely that the vcpu fails to handle all virtual interrupts if
>> userspace decides to destroy it, leaving the pending ones in the
>> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
>> eventually be leaked because of an extra refcount increment in
>> vgic_queue_irq_unlock().
> 
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index a963b9d766b73..53ec9b9d9bc43 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>   {
>>   	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>   
>> +	/*
>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>> +	 * going to destroy it.
>> +	 */
> 
> Looking at the other caller, do we need something like:
> |	if (vgic_cpu->lpis_enabled)
> 
> ?

If LPIs are disabled at redistributor level, yes there should be no
pending LPIs in the ap_list. But I'm not sure how the following
use-after-free BUG can be triggered.

> 
>> +	vgic_flush_pending_lpis(vcpu);
>> +
> 
> Otherwise, I get this on a gic-v2 machine!:

I don't have a gic-v2 one and thus can't reproduce it :-(

> [ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
> 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development
> Platform, BIOS EDK II Jul 30 2018
> [ 1742.222596] Call trace:
> [ 1742.225059]  dump_backtrace+0x0/0x328
> [ 1742.228738]  show_stack+0x18/0x28
> [ 1742.232071]  dump_stack+0x134/0x1b0
> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
> [ 1742.240910]  __kasan_report+0x10c/0x180
> [ 1742.244763]  kasan_report+0x4c/0x68
> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0

Could you please show the result of

./scripts/faddr2line vmlinux vgic_flush_pending_lpis+0x250/0x2c0

on your setup?

> [ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
> [ 1742.261919]  kvm_vgic_destroy+0x30/0x48
> [ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
> [ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
> [ 1742.273567]  kvm_vm_release+0x3c/0x60
> [ 1742.277248]  __fput+0x218/0x630
> [ 1742.280406]  ____fput+0x10/0x20
> [ 1742.283565]  task_work_run+0xd8/0x1f0
> [ 1742.287245]  do_exit+0x87c/0x2640
> [ 1742.290575]  do_group_exit+0xd0/0x258
> [ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
> [ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
> [ 1742.303529]  do_el0_svc+0x48/0xd0
> [ 1742.306861]  el0_sync_handler+0x11c/0x1b8
> [ 1742.310888]  el0_sync+0x158/0x180
> [ 1742.315716] The buggy address belongs to the page:
> [ 1742.320529] page:fffffe002366fc40 refcount:0 mapcount:0 mapping:000000007e21d29f index:0x0
> [ 1742.328821] flags: 0x2ffff00000000000()
> [ 1742.332678] raw: 2ffff00000000000 0000000000000000 ffffffff23660401 0000000000000000
> [ 1742.340449] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
> [ 1742.348215] page dumped because: kasan: bad access detected
> [ 1742.355304] Memory state around the buggy address:
> [ 1742.360115]  ffff0008e1bf1e00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.367360]  ffff0008e1bf1e80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.374606] >ffff0008e1bf1f00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.381851]                                   ^
> [ 1742.386399]  ffff0008e1bf1f80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.393645]  ffff0008e1bf2000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 1742.400889] ==================================================================
> [ 1742.408132] Disabling lock debugging due to kernel taint
> 
> 
> With that:
> Reviewed-by: James Morse <james.morse@arm.com>

Thanks a lot!

Zenghui


* Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-23 11:35   ` James Morse
  2020-04-23 11:57     ` Zenghui Yu
@ 2020-04-23 12:03     ` Marc Zyngier
  2020-04-23 12:18       ` Zenghui Yu
  2020-04-23 14:34       ` James Morse
  1 sibling, 2 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-04-23 12:03 UTC (permalink / raw)
  To: James Morse; +Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

Hi James,

Thanks for the heads up.

On 2020-04-23 12:35, James Morse wrote:
> Hi Marc, Zenghui,
> 
> On 22/04/2020 17:18, Marc Zyngier wrote:
>> From: Zenghui Yu <yuzenghui@huawei.com>
>> 
>> It's likely that the vcpu fails to handle all virtual interrupts if
>> userspace decides to destroy it, leaving the pending ones in the
>> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
>> eventually be leaked because of an extra refcount increment in
>> vgic_queue_irq_unlock().
> 
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c 
>> b/virt/kvm/arm/vgic/vgic-init.c
>> index a963b9d766b73..53ec9b9d9bc43 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>  {
>>  	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> 
>> +	/*
>> +	 * Retire all pending LPIs on this vcpu anyway as we're
>> +	 * going to destroy it.
>> +	 */
> 
> Looking at the other caller, do we need something like:
> |	if (vgic_cpu->lpis_enabled)
> 
> ?

Huh... On its own, this call is absolutely harmless even if you
don't have LPIs. But see below.

> 
>> +	vgic_flush_pending_lpis(vcpu);
>> +
> 
> Otherwise, I get this on a gic-v2 machine!:
> [ 1742.187139] BUG: KASAN: use-after-free in 
> vgic_flush_pending_lpis+0x250/0x2c0
> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task
> qemu-system-aar/542
> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
> 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development
> Platform/ARM Juno Development
> Platform, BIOS EDK II Jul 30 2018
> [ 1742.222596] Call trace:
> [ 1742.225059]  dump_backtrace+0x0/0x328
> [ 1742.228738]  show_stack+0x18/0x28
> [ 1742.232071]  dump_stack+0x134/0x1b0
> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
> [ 1742.240910]  __kasan_report+0x10c/0x180
> [ 1742.244763]  kasan_report+0x4c/0x68
> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
> [ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
> [ 1742.261919]  kvm_vgic_destroy+0x30/0x48
> [ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
> [ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
> [ 1742.273567]  kvm_vm_release+0x3c/0x60
> [ 1742.277248]  __fput+0x218/0x630
> [ 1742.280406]  ____fput+0x10/0x20
> [ 1742.283565]  task_work_run+0xd8/0x1f0
> [ 1742.287245]  do_exit+0x87c/0x2640
> [ 1742.290575]  do_group_exit+0xd0/0x258
> [ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
> [ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
> [ 1742.303529]  do_el0_svc+0x48/0xd0
> [ 1742.306861]  el0_sync_handler+0x11c/0x1b8
> [ 1742.310888]  el0_sync+0x158/0x180
> [ 1742.315716] The buggy address belongs to the page:
> [ 1742.320529] page:fffffe002366fc40 refcount:0 mapcount:0
> mapping:000000007e21d29f index:0x0
> [ 1742.328821] flags: 0x2ffff00000000000()
> [ 1742.332678] raw: 2ffff00000000000 0000000000000000 ffffffff23660401
> 0000000000000000
> [ 1742.340449] raw: 0000000000000000 0000000000000000 00000000ffffffff
> 0000000000000000
> [ 1742.348215] page dumped because: kasan: bad access detected
> [ 1742.355304] Memory state around the buggy address:
> [ 1742.360115]  ffff0008e1bf1e00: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.367360]  ffff0008e1bf1e80: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.374606] >ffff0008e1bf1f00: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.381851]                                   ^
> [ 1742.386399]  ffff0008e1bf1f80: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.393645]  ffff0008e1bf2000: ff ff ff ff ff ff ff ff ff ff ff ff
> ff ff ff ff
> [ 1742.400889]
> ==================================================================
> [ 1742.408132] Disabling lock debugging due to kernel taint
> 
> 
> With that:
> Reviewed-by: James Morse <james.morse@arm.com>

I think this is slightly more concerning. The issue is that we have
started freeing parts of the interrupt state already (we free the
SPIs early in kvm_vgic_dist_destroy()).

If a SPI was pending or active at this stage (i.e. present in the
ap_list), we are going to iterate over memory that has been freed
already. This is bad, and this can happen on GICv3 as well.

I think this should solve it, but I need to test it on a GICv2 system:

diff --git a/virt/kvm/arm/vgic/vgic-init.c 
b/virt/kvm/arm/vgic/vgic-init.c
index 53ec9b9d9bc43..30dbec9fe0b4a 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)

  	vgic_debug_destroy(kvm);

-	kvm_vgic_dist_destroy(kvm);
-
  	kvm_for_each_vcpu(i, vcpu, kvm)
  		kvm_vgic_vcpu_destroy(vcpu);
+
+	kvm_vgic_dist_destroy(kvm);
  }

  void kvm_vgic_destroy(struct kvm *kvm)

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

* Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-23 12:03     ` Marc Zyngier
@ 2020-04-23 12:18       ` Zenghui Yu
  2020-04-23 14:34       ` James Morse
  1 sibling, 0 replies; 15+ messages in thread
From: Zenghui Yu @ 2020-04-23 12:18 UTC (permalink / raw)
  To: Marc Zyngier, James Morse
  Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

On 2020/4/23 20:03, Marc Zyngier wrote:
> 
> I think this is slightly more concerning. The issue is that we have
> started freeing parts of the interrupt state already (we free the
> SPIs early in kvm_vgic_dist_destroy()).
> 
> If a SPI was pending or active at this stage (i.e. present in the
> ap_list), we are going to iterate over memory that has been freed
> already. This is bad, and this can happen on GICv3 as well.

Ah, I think this should be the case.

> 
> I think this should solve it, but I need to test it on a GICv2 system:

Agreed.

> 
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 53ec9b9d9bc43..30dbec9fe0b4a 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
> 
>       vgic_debug_destroy(kvm);
> 
> -    kvm_vgic_dist_destroy(kvm);
> -
>       kvm_for_each_vcpu(i, vcpu, kvm)
>           kvm_vgic_vcpu_destroy(vcpu);
> +
> +    kvm_vgic_dist_destroy(kvm);
>   }
> 
>   void kvm_vgic_destroy(struct kvm *kvm)

Thanks for the fix,
Zenghui


* Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-23 11:57     ` Zenghui Yu
@ 2020-04-23 14:34       ` James Morse
  0 siblings, 0 replies; 15+ messages in thread
From: James Morse @ 2020-04-23 14:34 UTC (permalink / raw)
  To: Zenghui Yu, Marc Zyngier
  Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

Hi Zenghui,

On 23/04/2020 12:57, Zenghui Yu wrote:
> On 2020/4/23 19:35, James Morse wrote:
>> On 22/04/2020 17:18, Marc Zyngier wrote:
>>> From: Zenghui Yu <yuzenghui@huawei.com>
>>>
>>> It's likely that the vcpu fails to handle all virtual interrupts if
>>> userspace decides to destroy it, leaving the pending ones in the
>>> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
>>> eventually be leaked because of an extra refcount increment in
>>> vgic_queue_irq_unlock().
>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>>> index a963b9d766b73..53ec9b9d9bc43 100644
>>> --- a/virt/kvm/arm/vgic/vgic-init.c
>>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>>   {
>>>       struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>   +    /*
>>> +     * Retire all pending LPIs on this vcpu anyway as we're
>>> +     * going to destroy it.
>>> +     */
>>
>> Looking at the other caller, do we need something like:
>> |    if (vgic_cpu->lpis_enabled)
>>
>> ?
> 
> If LPIs are disabled at redistributor level, yes there should be no
> pending LPIs in the ap_list. But I'm not sure how can the following
> use-after-free BUG be triggered.
> 
>>
>>> +    vgic_flush_pending_lpis(vcpu);
>>> +
>>
>> Otherwise, I get this on a gic-v2 machine!:
> 
> I don't have a gic-v2 one and thus can't reproduce it :-(
> 
>> [ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
>> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted
>> 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
>> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development
>> Platform, BIOS EDK II Jul 30 2018
>> [ 1742.222596] Call trace:
>> [ 1742.225059]  dump_backtrace+0x0/0x328
>> [ 1742.228738]  show_stack+0x18/0x28
>> [ 1742.232071]  dump_stack+0x134/0x1b0
>> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
>> [ 1742.240910]  __kasan_report+0x10c/0x180
>> [ 1742.244763]  kasan_report+0x4c/0x68
>> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
>> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
> 
> Could you please show the result of
> 
> ./scripts/faddr2line vmlinux vgic_flush_pending_lpis+0x250/0x2c0
> 
> on your setup?

vgic_flush_pending_lpis+0x250/0x2c0:
vgic_flush_pending_lpis at arch/arm64/kvm/../../../virt/kvm/arm/vgic/vgic.c:159

Which is:
|	list_for_each_entry_safe(irq, tmp, &vgic_cpu->ap_list_head, ap_list) {


I think this confirms Marc's view of the KASAN splat.


>> With that:
>> Reviewed-by: James Morse <james.morse@arm.com>
> 
> Thanks a lot!

Heh, it looks like I had the wrong end of the stick with this... I wrongly assumed
calling this on gicv2 would be walking structures that held something else.


Thanks,

James

* Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-23 12:03     ` Marc Zyngier
  2020-04-23 12:18       ` Zenghui Yu
@ 2020-04-23 14:34       ` James Morse
  2020-04-23 15:13         ` Marc Zyngier
  1 sibling, 1 reply; 15+ messages in thread
From: James Morse @ 2020-04-23 14:34 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

Hi guys,

On 23/04/2020 13:03, Marc Zyngier wrote:
> On 2020-04-23 12:35, James Morse wrote:
>> On 22/04/2020 17:18, Marc Zyngier wrote:
>>> From: Zenghui Yu <yuzenghui@huawei.com>
>>>
>>> It's likely that the vcpu fails to handle all virtual interrupts if
>>> userspace decides to destroy it, leaving the pending ones in the
>>> ap_list. If an unhandled one is an LPI, its vgic_irq structure will
>>> eventually be leaked because of an extra refcount increment in
>>> vgic_queue_irq_unlock().
>>
>>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>>> index a963b9d766b73..53ec9b9d9bc43 100644
>>> --- a/virt/kvm/arm/vgic/vgic-init.c
>>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>>> @@ -348,6 +348,12 @@ void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
>>>  {
>>>      struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>
>>> +    /*
>>> +     * Retire all pending LPIs on this vcpu anyway as we're
>>> +     * going to destroy it.
>>> +     */
>>
>> Looking at the other caller, do we need something like:
>> |    if (vgic_cpu->lpis_enabled)
>>
>> ?
> 
> Huh... On its own, this call is absolutely harmless even if you
> don't have LPIs. But see below.
> 
>>
>>> +    vgic_flush_pending_lpis(vcpu);
>>> +
>>
>> Otherwise, I get this on a gic-v2 machine!:
>> [ 1742.187139] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.194302] Read of size 8 at addr ffff0008e1bf1f28 by task qemu-system-aar/542
>> [ 1742.203140] CPU: 2 PID: 542 Comm: qemu-system-aar Not tainted 5.7.0-rc2-00006-g4fb0f7bb0e27 #2
>> [ 1742.211780] Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Jul 30 2018
>> [ 1742.222596] Call trace:
>> [ 1742.225059]  dump_backtrace+0x0/0x328
>> [ 1742.228738]  show_stack+0x18/0x28
>> [ 1742.232071]  dump_stack+0x134/0x1b0
>> [ 1742.235578]  print_address_description.isra.0+0x6c/0x350
>> [ 1742.240910]  __kasan_report+0x10c/0x180
>> [ 1742.244763]  kasan_report+0x4c/0x68
>> [ 1742.248268]  __asan_report_load8_noabort+0x30/0x48
>> [ 1742.253081]  vgic_flush_pending_lpis+0x250/0x2c0
>> [ 1742.257718]  __kvm_vgic_destroy+0x1cc/0x478
>> [ 1742.261919]  kvm_vgic_destroy+0x30/0x48
>> [ 1742.265773]  kvm_arch_destroy_vm+0x20/0x128
>> [ 1742.269976]  kvm_put_kvm+0x3e0/0x8d0
>> [ 1742.273567]  kvm_vm_release+0x3c/0x60
>> [ 1742.277248]  __fput+0x218/0x630
>> [ 1742.280406]  ____fput+0x10/0x20
>> [ 1742.283565]  task_work_run+0xd8/0x1f0
>> [ 1742.287245]  do_exit+0x87c/0x2640
>> [ 1742.290575]  do_group_exit+0xd0/0x258
>> [ 1742.294254]  __arm64_sys_exit_group+0x3c/0x48
>> [ 1742.298631]  el0_svc_common.constprop.0+0x10c/0x348
>> [ 1742.303529]  do_el0_svc+0x48/0xd0
>> [ 1742.306861]  el0_sync_handler+0x11c/0x1b8
>> [ 1742.310888]  el0_sync+0x158/0x180

>> [ 1742.348215] page dumped because: kasan: bad access detected

> I think this is slightly more concerning. The issue is that we have
> started freeing parts of the interrupt state already (we free the
> SPIs early in kvm_vgic_dist_destroy()).

(I took this to be some wild pointer access. Previously for use-after-free I've seen it
print where it was allocated and where it was freed).


> If a SPI was pending or active at this stage (i.e. present in the
> ap_list), we are going to iterate over memory that has been freed
> already. This is bad, and this can happen on GICv3 as well.


> I think this should solve it, but I need to test it on a GICv2 system:
> 
> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
> index 53ec9b9d9bc43..30dbec9fe0b4a 100644
> --- a/virt/kvm/arm/vgic/vgic-init.c
> +++ b/virt/kvm/arm/vgic/vgic-init.c
> @@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
> 
>      vgic_debug_destroy(kvm);
> 
> -    kvm_vgic_dist_destroy(kvm);
> -
>      kvm_for_each_vcpu(i, vcpu, kvm)
>          kvm_vgic_vcpu_destroy(vcpu);
> +
> +    kvm_vgic_dist_destroy(kvm);
>  }
>
>  void kvm_vgic_destroy(struct kvm *kvm)

This works for me on Juno.
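
To make the ordering hazard concrete — a simplified sketch, not the
exact kernel code, assuming the v5.7-rc layout where every SPI's
struct vgic_irq lives in the kfree()d dist->spis array:

	/* Broken order: the shared SPI array dies first... */
	kvm_vgic_dist_destroy(kvm);		/* kfree(dist->spis) */
	kvm_for_each_vcpu(i, vcpu, kvm)
		kvm_vgic_vcpu_destroy(vcpu);	/* ...then each vcpu's ap_list is
						 * walked, possibly through
						 * entries in the freed array */

	/* Fixed order: drain the ap_lists while the SPIs are still alive. */
	kvm_for_each_vcpu(i, vcpu, kvm)
		kvm_vgic_vcpu_destroy(vcpu);
	kvm_vgic_dist_destroy(kvm);

Private interrupts (SGIs/PPIs) are embedded in each vcpu's own
vgic_cpu, so only the shared SPI storage is exposed to this
use-after-free.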


Thanks,

James
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy
  2020-04-23 14:34       ` James Morse
@ 2020-04-23 15:13         ` Marc Zyngier
  0 siblings, 0 replies; 15+ messages in thread
From: Marc Zyngier @ 2020-04-23 15:13 UTC (permalink / raw)
  To: James Morse; +Cc: Julien Grall, kvm, Andre Przywara, kvmarm, linux-arm-kernel

Hi James,

On 2020-04-23 15:34, James Morse wrote:
> Hi guys,
> 
> On 23/04/2020 13:03, Marc Zyngier wrote:
>> On 2020-04-23 12:35, James Morse wrote:

[...]

>>> [ 1742.348215] page dumped because: kasan: bad access detected
> 
>> I think this is slightly more concerning. The issue is that we have
>> started freeing parts of the interrupt state already (we free the
>> SPIs early in kvm_vgic_dist_destroy()).
> 
> (I took this to be some wild pointer access. Previously for
> use-after-free I've seen it
> print where it was allocated and where it was freed).

This is indeed what I managed to trigger by forcing a pending
SPI (the kvmtool UART interrupt) in the guest and forcefully
terminating it:

[ 3807.084237] ==================================================================
[ 3807.086516] BUG: KASAN: use-after-free in vgic_flush_pending_lpis+0x54/0x198
[ 3807.088027] Read of size 8 at addr ffff00085514a328 by task ioeventfd-worke/231
[ 3807.089771]
[ 3807.090911] CPU: 4 PID: 231 Comm: ioeventfd-worke Not tainted 5.7.0-rc2-00086-g2100c066e9a78 #200
[ 3807.092864] Hardware name: FVP Base RevC (DT)
[ 3807.094003] Call trace:
[ 3807.095180]  dump_backtrace+0x0/0x268
[ 3807.096445]  show_stack+0x1c/0x28
[ 3807.097961]  dump_stack+0xe8/0x144
[ 3807.099374]  print_address_description.isra.11+0x6c/0x354
[ 3807.101002]  __kasan_report+0x110/0x1c8
[ 3807.102332]  kasan_report+0x48/0x60
[ 3807.103769]  __asan_load8+0x9c/0xc0
[ 3807.105113]  vgic_flush_pending_lpis+0x54/0x198
[ 3807.107187]  __kvm_vgic_destroy+0x120/0x278
[ 3807.108814]  kvm_vgic_destroy+0x30/0x48
[ 3807.110443]  kvm_arch_destroy_vm+0x20/0xa8
[ 3807.111868]  kvm_put_kvm+0x234/0x460
[ 3807.113697]  kvm_vm_release+0x34/0x48
[ 3807.115162]  __fput+0x104/0x2f8
[ 3807.116464]  ____fput+0x14/0x20
[ 3807.117929]  task_work_run+0xbc/0x188
[ 3807.119419]  do_exit+0x514/0xff8
[ 3807.120859]  do_group_exit+0x78/0x108
[ 3807.122323]  get_signal+0x164/0xcc0
[ 3807.123951]  do_notify_resume+0x244/0x5e0
[ 3807.125416]  work_pending+0x8/0x10
[ 3807.126392]
[ 3807.126969] Allocated by task 229:
[ 3807.128834]  save_stack+0x24/0x50
[ 3807.130462]  __kasan_kmalloc.isra.10+0xc4/0xe0
[ 3807.132134]  kasan_kmalloc+0xc/0x18
[ 3807.133554]  __kmalloc+0x174/0x270
[ 3807.135182]  vgic_init.part.2+0xe0/0x4f0
[ 3807.136809]  vgic_init+0x48/0x58
[ 3807.138095]  vgic_set_common_attr.isra.4+0x2fc/0x388
[ 3807.140081]  vgic_v3_set_attr+0x8c/0x350
[ 3807.141692]  kvm_device_ioctl_attr+0x124/0x190
[ 3807.143260]  kvm_device_ioctl+0xe8/0x170
[ 3807.144947]  ksys_ioctl+0xb8/0xf8
[ 3807.146575]  __arm64_sys_ioctl+0x48/0x60
[ 3807.148365]  el0_svc_common.constprop.1+0xc8/0x1c8
[ 3807.150015]  do_el0_svc+0x94/0xa0
[ 3807.151605]  el0_sync_handler+0x120/0x190
[ 3807.152922]  el0_sync+0x140/0x180
[ 3807.153899]
[ 3807.154784] Freed by task 231:
[ 3807.156178]  save_stack+0x24/0x50
[ 3807.157805]  __kasan_slab_free+0x10c/0x188
[ 3807.159433]  kasan_slab_free+0x10/0x18
[ 3807.160897]  kfree+0x88/0x350
[ 3807.162570]  __kvm_vgic_destroy+0x5c/0x278
[ 3807.164153]  kvm_vgic_destroy+0x30/0x48
[ 3807.165780]  kvm_arch_destroy_vm+0x20/0xa8
[ 3807.167408]  kvm_put_kvm+0x234/0x460
[ 3807.168691]  kvm_vm_release+0x34/0x48
[ 3807.170281]  __fput+0x104/0x2f8
[ 3807.171870]  ____fput+0x14/0x20
[ 3807.173268]  task_work_run+0xbc/0x188
[ 3807.174733]  do_exit+0x514/0xff8
[ 3807.176242]  do_group_exit+0x78/0x108
[ 3807.177434]  get_signal+0x164/0xcc0
[ 3807.179289]  do_notify_resume+0x244/0x5e0
[ 3807.180755]  work_pending+0x8/0x10
[ 3807.181731]
[ 3807.182707] The buggy address belongs to the object at ffff00085514a000
[ 3807.182707]  which belongs to the cache kmalloc-4k of size 4096
[ 3807.185381] The buggy address is located 808 bytes inside of
[ 3807.185381]  4096-byte region [ffff00085514a000, ffff00085514b000)
[ 3807.187591] The buggy address belongs to the page:
[ 3807.189381] page:fffffe0021345200 refcount:1 mapcount:0 mapping:00000000090b1068 index:0x0 head:fffffe0021345200 order:3 compound_mapcount:0 compound_pincount:0
[ 3807.192148] flags: 0x2ffff00000010200(slab|head)
[ 3807.194123] raw: 2ffff00000010200 dead000000000100 dead000000000122 ffff00085a00f200
[ 3807.196379] raw: 0000000000000000 0000000080040004 00000001ffffffff 0000000000000000
[ 3807.198097] page dumped because: kasan: bad access detected
[ 3807.199289]
[ 3807.200123] Memory state around the buggy address:
[ 3807.201750]  ffff00085514a200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.203704]  ffff00085514a280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.205657] >ffff00085514a300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.207285]                                   ^
[ 3807.208826]  ffff00085514a380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.210812]  ffff00085514a400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[ 3807.212402] ==================================================================

>> If a SPI was pending or active at this stage (i.e. present in the
>> ap_list), we are going to iterate over memory that has been freed
>> already. This is bad, and this can happen on GICv3 as well.
> 
> 
>> I think this should solve it, but I need to test it on a GICv2 system:
>> 
>> diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
>> index 53ec9b9d9bc43..30dbec9fe0b4a 100644
>> --- a/virt/kvm/arm/vgic/vgic-init.c
>> +++ b/virt/kvm/arm/vgic/vgic-init.c
>> @@ -365,10 +365,10 @@ static void __kvm_vgic_destroy(struct kvm *kvm)
>> 
>>      vgic_debug_destroy(kvm);
>> 
>> -    kvm_vgic_dist_destroy(kvm);
>> -
>>      kvm_for_each_vcpu(i, vcpu, kvm)
>>          kvm_vgic_vcpu_destroy(vcpu);
>> +
>> +    kvm_vgic_dist_destroy(kvm);
>>  }
>>
>>  void kvm_vgic_destroy(struct kvm *kvm)
> 
> This works for me on Juno.

I've verified that the above splat disappears on the FVP too.
I'll squash the fix in, add your RB (which I assume stands)
and send the whole thing as a lockdown present to Paolo!

Thanks,

          M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-04-23 15:13 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-22 16:18 [PATCH v3 0/6] KVM: arm: vgic fixes for 5.7 Marc Zyngier
2020-04-22 16:18 ` [PATCH v3 1/6] KVM: arm: vgic: Fix limit condition when writing to GICD_I[CS]ACTIVER Marc Zyngier
2020-04-22 16:18 ` [PATCH v3 2/6] KVM: arm: vgic: Synchronize the whole guest on GIC{D, R}_I{S, C}ACTIVER read Marc Zyngier
2020-04-22 16:18 ` [PATCH v3 3/6] KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits Marc Zyngier
2020-04-22 16:18 ` [PATCH v3 4/6] KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits Marc Zyngier
2020-04-23 11:05   ` James Morse
2020-04-22 16:18 ` [PATCH v3 5/6] KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy Marc Zyngier
2020-04-23 11:35   ` James Morse
2020-04-23 11:57     ` Zenghui Yu
2020-04-23 14:34       ` James Morse
2020-04-23 12:03     ` Marc Zyngier
2020-04-23 12:18       ` Zenghui Yu
2020-04-23 14:34       ` James Morse
2020-04-23 15:13         ` Marc Zyngier
2020-04-22 16:18 ` [PATCH v3 6/6] KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi() Marc Zyngier
