* [PATCH 0/5] KVM: Turn the vcpu array into an xarray
@ 2021-11-05 19:20 Marc Zyngier
  2021-11-05 19:20 ` [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code Marc Zyngier
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
  To: kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

The kvm structure is pretty large. A large portion of it is the vcpu
array, which is 4kB on x86_64 and arm64 as they deal with 512 vcpu
VMs. Of course, hardly anyone runs VMs this big, so this is often a
net waste of memory and cache locality.

A possible approach is to turn the fixed-size array into an xarray,
which results in a net code deletion after a bit of cleanup.

This series is on top of the current linux/master as it touches the
RISC-V implementation. Only tested on arm64.
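
For quick review, the gist of the conversion (lifted from patch 5,
with the surrounding context trimmed):

-	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
+	struct xarray vcpu_array;

 static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
 {
 	/* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu.  */
 	smp_rmb();
-	return kvm->vcpus[i];
+	return xa_load(&kvm->vcpu_array, i);
 }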

Marc Zyngier (5):
  KVM: Move wiping of the kvm->vcpus array to common code
  KVM: mips: Use kvm_get_vcpu() instead of open-coded access
  KVM: s390: Use kvm_get_vcpu() instead of open-coded access
  KVM: x86: Use kvm_get_vcpu() instead of open-coded access
  KVM: Convert the kvm->vcpus array to a xarray

 arch/arm64/kvm/arm.c           | 10 +---------
 arch/mips/kvm/loongson_ipi.c   |  4 ++--
 arch/mips/kvm/mips.c           | 23 ++---------------------
 arch/powerpc/kvm/powerpc.c     | 10 +---------
 arch/riscv/kvm/vm.c            | 10 +---------
 arch/s390/kvm/kvm-s390.c       | 26 ++++++--------------------
 arch/x86/kvm/vmx/posted_intr.c |  2 +-
 arch/x86/kvm/x86.c             |  9 +--------
 include/linux/kvm_host.h       |  7 ++++---
 virt/kvm/kvm_main.c            | 33 ++++++++++++++++++++++++++-------
 10 files changed, 45 insertions(+), 89 deletions(-)

-- 
2.30.2



* [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
  2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
@ 2021-11-05 19:20 ` Marc Zyngier
  2021-11-05 20:12   ` Sean Christopherson
  2021-11-08 12:12   ` Claudio Imbrenda
  2021-11-05 19:20 ` [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access Marc Zyngier
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
  To: kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

All architectures have similar loops iterating over the vcpus,
freeing one vcpu at a time, and eventually wiping the reference
off the vcpus array. They are also inconsistently taking
the kvm->lock mutex when wiping the references from the array.

Make this code common, which will simplify further changes.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/arm.c       | 10 +---------
 arch/mips/kvm/mips.c       | 21 +--------------------
 arch/powerpc/kvm/powerpc.c | 10 +---------
 arch/riscv/kvm/vm.c        | 10 +---------
 arch/s390/kvm/kvm-s390.c   | 18 +-----------------
 arch/x86/kvm/x86.c         |  9 +--------
 include/linux/kvm_host.h   |  2 +-
 virt/kvm/kvm_main.c        | 20 ++++++++++++++++++--
 8 files changed, 25 insertions(+), 75 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f5490afe1ebf..75bb7215da03 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -175,19 +175,11 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
  */
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
-	int i;
-
 	bitmap_free(kvm->arch.pmu_filter);
 
 	kvm_vgic_destroy(kvm);
 
-	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
-		if (kvm->vcpus[i]) {
-			kvm_vcpu_destroy(kvm->vcpus[i]);
-			kvm->vcpus[i] = NULL;
-		}
-	}
-	atomic_set(&kvm->online_vcpus, 0);
+	kvm_destroy_vcpus(kvm);
 }
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 562aa878b266..ceacca74f808 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -171,25 +171,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	return 0;
 }
 
-void kvm_mips_free_vcpus(struct kvm *kvm)
-{
-	unsigned int i;
-	struct kvm_vcpu *vcpu;
-
-	kvm_for_each_vcpu(i, vcpu, kvm) {
-		kvm_vcpu_destroy(vcpu);
-	}
-
-	mutex_lock(&kvm->lock);
-
-	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
-		kvm->vcpus[i] = NULL;
-
-	atomic_set(&kvm->online_vcpus, 0);
-
-	mutex_unlock(&kvm->lock);
-}
-
 static void kvm_mips_free_gpa_pt(struct kvm *kvm)
 {
 	/* It should always be safe to remove after flushing the whole range */
@@ -199,7 +180,7 @@ static void kvm_mips_free_gpa_pt(struct kvm *kvm)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
-	kvm_mips_free_vcpus(kvm);
+	kvm_destroy_vcpus(kvm);
 	kvm_mips_free_gpa_pt(kvm);
 }
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 35e9cccdeef9..492e4a4121cb 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -463,9 +463,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
-	unsigned int i;
-	struct kvm_vcpu *vcpu;
-
 #ifdef CONFIG_KVM_XICS
 	/*
 	 * We call kick_all_cpus_sync() to ensure that all
@@ -476,14 +473,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 		kick_all_cpus_sync();
 #endif
 
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		kvm_vcpu_destroy(vcpu);
+	kvm_destroy_vcpus(kvm);
 
 	mutex_lock(&kvm->lock);
-	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
-		kvm->vcpus[i] = NULL;
-
-	atomic_set(&kvm->online_vcpus, 0);
 
 	kvmppc_core_destroy_vm(kvm);
 
diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
index 26399df15b63..6af6cde295eb 100644
--- a/arch/riscv/kvm/vm.c
+++ b/arch/riscv/kvm/vm.c
@@ -46,15 +46,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
-	int i;
-
-	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
-		if (kvm->vcpus[i]) {
-			kvm_vcpu_destroy(kvm->vcpus[i]);
-			kvm->vcpus[i] = NULL;
-		}
-	}
-	atomic_set(&kvm->online_vcpus, 0);
+	kvm_destroy_vcpus(kvm);
 }
 
 int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index c6257f625929..7af53b8788fa 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2819,27 +2819,11 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
 	free_page((unsigned long)(vcpu->arch.sie_block));
 }
 
-static void kvm_free_vcpus(struct kvm *kvm)
-{
-	unsigned int i;
-	struct kvm_vcpu *vcpu;
-
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		kvm_vcpu_destroy(vcpu);
-
-	mutex_lock(&kvm->lock);
-	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
-		kvm->vcpus[i] = NULL;
-
-	atomic_set(&kvm->online_vcpus, 0);
-	mutex_unlock(&kvm->lock);
-}
-
 void kvm_arch_destroy_vm(struct kvm *kvm)
 {
 	u16 rc, rrc;
 
-	kvm_free_vcpus(kvm);
+	kvm_destroy_vcpus(kvm);
 	sca_dispose(kvm);
 	kvm_s390_gisa_destroy(kvm);
 	/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c1c4e2b05a63..498a43126615 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11302,15 +11302,8 @@ static void kvm_free_vcpus(struct kvm *kvm)
 		kvm_clear_async_pf_completion_queue(vcpu);
 		kvm_unload_vcpu_mmu(vcpu);
 	}
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		kvm_vcpu_destroy(vcpu);
-
-	mutex_lock(&kvm->lock);
-	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
-		kvm->vcpus[i] = NULL;
 
-	atomic_set(&kvm->online_vcpus, 0);
-	mutex_unlock(&kvm->lock);
+	kvm_destroy_vcpus(kvm);
 }
 
 void kvm_arch_sync_events(struct kvm *kvm)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 60a35d9fe259..36967291b8c6 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -725,7 +725,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu_by_id(struct kvm *kvm, int id)
 		if (WARN_ON_ONCE(!memslot->npages)) {			\
 		} else
 
-void kvm_vcpu_destroy(struct kvm_vcpu *vcpu);
+void kvm_destroy_vcpus(struct kvm *kvm);
 
 void vcpu_load(struct kvm_vcpu *vcpu);
 void vcpu_put(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3f6d450355f0..d83553eeea21 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -435,7 +435,7 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
 	vcpu->last_used_slot = 0;
 }
 
-void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
+static void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	kvm_dirty_ring_free(&vcpu->dirty_ring);
 	kvm_arch_vcpu_destroy(vcpu);
@@ -450,7 +450,23 @@ void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
 	free_page((unsigned long)vcpu->run);
 	kmem_cache_free(kvm_vcpu_cache, vcpu);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_destroy);
+
+void kvm_destroy_vcpus(struct kvm *kvm)
+{
+	unsigned int i;
+	struct kvm_vcpu *vcpu;
+
+	kvm_for_each_vcpu(i, vcpu, kvm)
+		kvm_vcpu_destroy(vcpu);
+
+	mutex_lock(&kvm->lock);
+	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
+		kvm->vcpus[i] = NULL;
+
+	atomic_set(&kvm->online_vcpus, 0);
+	mutex_unlock(&kvm->lock);
+}
+EXPORT_SYMBOL_GPL(kvm_destroy_vcpus);
 
 #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
 static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)
-- 
2.30.2



* [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access
  2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
  2021-11-05 19:20 ` [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code Marc Zyngier
@ 2021-11-05 19:20 ` Marc Zyngier
  2021-11-06 15:56   ` Philippe Mathieu-Daudé
  2021-11-05 19:20 ` [PATCH 3/5] KVM: s390: " Marc Zyngier
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
  To: kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/mips/kvm/loongson_ipi.c | 4 ++--
 arch/mips/kvm/mips.c         | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/mips/kvm/loongson_ipi.c b/arch/mips/kvm/loongson_ipi.c
index 3681fc8fba38..5d53f32d837c 100644
--- a/arch/mips/kvm/loongson_ipi.c
+++ b/arch/mips/kvm/loongson_ipi.c
@@ -120,7 +120,7 @@ static int loongson_vipi_write(struct loongson_kvm_ipi *ipi,
 		s->status |= data;
 		irq.cpu = id;
 		irq.irq = 6;
-		kvm_vcpu_ioctl_interrupt(kvm->vcpus[id], &irq);
+		kvm_vcpu_ioctl_interrupt(kvm_get_vcpu(kvm, id), &irq);
 		break;
 
 	case CORE0_CLEAR_OFF:
@@ -128,7 +128,7 @@ static int loongson_vipi_write(struct loongson_kvm_ipi *ipi,
 		if (!s->status) {
 			irq.cpu = id;
 			irq.irq = -6;
-			kvm_vcpu_ioctl_interrupt(kvm->vcpus[id], &irq);
+			kvm_vcpu_ioctl_interrupt(kvm_get_vcpu(kvm, id), &irq);
 		}
 		break;
 
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index ceacca74f808..6228bf396d63 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -479,7 +479,7 @@ int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
 	if (irq->cpu == -1)
 		dvcpu = vcpu;
 	else
-		dvcpu = vcpu->kvm->vcpus[irq->cpu];
+		dvcpu = kvm_get_vcpu(vcpu->kvm, irq->cpu);
 
 	if (intr == 2 || intr == 3 || intr == 4 || intr == 6) {
 		kvm_mips_callbacks->queue_io_int(dvcpu, irq);
-- 
2.30.2



* [PATCH 3/5] KVM: s390: Use kvm_get_vcpu() instead of open-coded access
  2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
  2021-11-05 19:20 ` [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code Marc Zyngier
  2021-11-05 19:20 ` [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access Marc Zyngier
@ 2021-11-05 19:20 ` Marc Zyngier
  2021-11-08 12:13   ` Claudio Imbrenda
  2021-11-05 19:21 ` [PATCH 4/5] KVM: x86: " Marc Zyngier
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Marc Zyngier @ 2021-11-05 19:20 UTC (permalink / raw)
  To: kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/s390/kvm/kvm-s390.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 7af53b8788fa..4a0f62b03964 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -4572,7 +4572,7 @@ int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
 	}
 
 	for (i = 0; i < online_vcpus; i++) {
-		if (!is_vcpu_stopped(vcpu->kvm->vcpus[i]))
+		if (!is_vcpu_stopped(kvm_get_vcpu(vcpu->kvm, i)))
 			started_vcpus++;
 	}
 
@@ -4634,9 +4634,11 @@ int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
 	__disable_ibs_on_vcpu(vcpu);
 
 	for (i = 0; i < online_vcpus; i++) {
-		if (!is_vcpu_stopped(vcpu->kvm->vcpus[i])) {
+		struct kvm_vcpu *tmp = kvm_get_vcpu(vcpu->kvm, i);
+
+		if (!is_vcpu_stopped(tmp)) {
 			started_vcpus++;
-			started_vcpu = vcpu->kvm->vcpus[i];
+			started_vcpu = tmp;
 		}
 	}
 
-- 
2.30.2



* [PATCH 4/5] KVM: x86: Use kvm_get_vcpu() instead of open-coded access
  2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
                   ` (2 preceding siblings ...)
  2021-11-05 19:20 ` [PATCH 3/5] KVM: s390: " Marc Zyngier
@ 2021-11-05 19:21 ` Marc Zyngier
  2021-11-05 20:03   ` Sean Christopherson
  2021-11-05 19:21 ` [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray Marc Zyngier
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Marc Zyngier @ 2021-11-05 19:21 UTC (permalink / raw)
  To: kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/x86/kvm/vmx/posted_intr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..82a49720727d 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
 
 	if (!kvm_arch_has_assigned_device(kvm) ||
 	    !irq_remapping_cap(IRQ_POSTING_CAP) ||
-	    !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+	    !kvm_vcpu_apicv_active(kvm_get_vcpu(kvm, 0)))
 		return 0;
 
 	idx = srcu_read_lock(&kvm->irq_srcu);
-- 
2.30.2



* [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
  2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
                   ` (3 preceding siblings ...)
  2021-11-05 19:21 ` [PATCH 4/5] KVM: x86: " Marc Zyngier
@ 2021-11-05 19:21 ` Marc Zyngier
  2021-11-05 20:21   ` Sean Christopherson
  2021-11-16 14:13 ` [PATCH 0/5] KVM: Turn the vcpu array into an xarray Juergen Gross
  2021-11-16 15:03 ` Paolo Bonzini
  6 siblings, 1 reply; 24+ messages in thread
From: Marc Zyngier @ 2021-11-05 19:21 UTC (permalink / raw)
  To: kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
and is mostly empty in most cases (running 512 vcpu VMs is not that
common). This means that we end up with a 4kB block of unused memory
in the middle of the kvm structure.

Instead of wasting this memory, let's use an xarray, which gives us
almost the same flexibility as a normal array, but with reduced
memory usage for smaller VMs.
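
For reviewers less familiar with xarrays, here is a minimal sketch of
the operations this conversion relies on (based on the documented
<linux/xarray.h> API, with error handling elided):

	struct xarray xa;
	struct kvm_vcpu *vcpu;		/* previously allocated elsewhere */
	int ret;

	xa_init(&xa);				/* starts empty, nothing allocated */

	/* returns -EBUSY if index 0 is already occupied */
	ret = xa_insert(&xa, 0, vcpu, GFP_KERNEL_ACCOUNT);

	vcpu = xa_load(&xa, 0);			/* NULL if the index is empty */
	xa_erase(&xa, 0);			/* removes whatever is stored there */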

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 include/linux/kvm_host.h |  5 +++--
 virt/kvm/kvm_main.c      | 15 +++++++++------
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 36967291b8c6..3933d825e28b 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -29,6 +29,7 @@
 #include <linux/refcount.h>
 #include <linux/nospec.h>
 #include <linux/notifier.h>
+#include <linux/xarray.h>
 #include <asm/signal.h>
 
 #include <linux/kvm.h>
@@ -552,7 +553,7 @@ struct kvm {
 	struct mutex slots_arch_lock;
 	struct mm_struct *mm; /* userspace tied to this vm */
 	struct kvm_memslots __rcu *memslots[KVM_ADDRESS_SPACE_NUM];
-	struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
+	struct xarray vcpu_array;
 
 	/* Used to wait for completion of MMU notifiers.  */
 	spinlock_t mn_invalidate_lock;
@@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
 
 	/* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu.  */
 	smp_rmb();
-	return kvm->vcpus[i];
+	return xa_load(&kvm->vcpu_array, i);
 }
 
 #define kvm_for_each_vcpu(idx, vcpup, kvm) \
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d83553eeea21..4c18d7911fa5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -461,7 +461,7 @@ void kvm_destroy_vcpus(struct kvm *kvm)
 
 	mutex_lock(&kvm->lock);
 	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
-		kvm->vcpus[i] = NULL;
+		xa_erase(&kvm->vcpu_array, i);
 
 	atomic_set(&kvm->online_vcpus, 0);
 	mutex_unlock(&kvm->lock);
@@ -1066,6 +1066,7 @@ static struct kvm *kvm_create_vm(unsigned long type)
 	mutex_init(&kvm->slots_arch_lock);
 	spin_lock_init(&kvm->mn_invalidate_lock);
 	rcuwait_init(&kvm->mn_memslots_update_rcuwait);
+	xa_init(&kvm->vcpu_array);
 
 	INIT_LIST_HEAD(&kvm->devices);
 
@@ -3661,7 +3662,10 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 	}
 
 	vcpu->vcpu_idx = atomic_read(&kvm->online_vcpus);
-	BUG_ON(kvm->vcpus[vcpu->vcpu_idx]);
+	r = xa_insert(&kvm->vcpu_array, vcpu->vcpu_idx, vcpu, GFP_KERNEL_ACCOUNT);
+	BUG_ON(r == -EBUSY);
+	if (r)
+		goto unlock_vcpu_destroy;
 
 	/* Fill the stats id string for the vcpu */
 	snprintf(vcpu->stats_id, sizeof(vcpu->stats_id), "kvm-%d/vcpu-%d",
@@ -3671,15 +3675,14 @@ static int kvm_vm_ioctl_create_vcpu(struct kvm *kvm, u32 id)
 	kvm_get_kvm(kvm);
 	r = create_vcpu_fd(vcpu);
 	if (r < 0) {
+		xa_erase(&kvm->vcpu_array, vcpu->vcpu_idx);
 		kvm_put_kvm_no_destroy(kvm);
 		goto unlock_vcpu_destroy;
 	}
 
-	kvm->vcpus[vcpu->vcpu_idx] = vcpu;
-
 	/*
-	 * Pairs with smp_rmb() in kvm_get_vcpu.  Write kvm->vcpus
-	 * before kvm->online_vcpu's incremented value.
+	 * Pairs with smp_rmb() in kvm_get_vcpu.  Store the vcpu
+	 * pointer before kvm->online_vcpu's incremented value.
 	 */
 	smp_wmb();
 	atomic_inc(&kvm->online_vcpus);
-- 
2.30.2



* Re: [PATCH 4/5] KVM: x86: Use kvm_get_vcpu() instead of open-coded access
  2021-11-05 19:21 ` [PATCH 4/5] KVM: x86: " Marc Zyngier
@ 2021-11-05 20:03   ` Sean Christopherson
  2021-11-16 14:04     ` Paolo Bonzini
  0 siblings, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2021-11-05 20:03 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On Fri, Nov 05, 2021, Marc Zyngier wrote:
> As we are about to change the way vcpus are allocated, mandate
> the use of kvm_get_vcpu() instead of open-coding the access.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/x86/kvm/vmx/posted_intr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 5f81ef092bd4..82a49720727d 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
>  
>  	if (!kvm_arch_has_assigned_device(kvm) ||
>  	    !irq_remapping_cap(IRQ_POSTING_CAP) ||
> -	    !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> +	    !kvm_vcpu_apicv_active(kvm_get_vcpu(kvm, 0)))

Huh.  The existing code is decidedly odd.  I think it might even be broken, as
it's not obvious that vCPU0 _must_ be created when e.g. kvm_arch_irq_bypass_add_producer()
is called.

An equivalent, safe check would be:

diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..a3100591a9ca 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,

        if (!kvm_arch_has_assigned_device(kvm) ||
            !irq_remapping_cap(IRQ_POSTING_CAP) ||
-           !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+           !kvm_apicv_activated(kvm))
                return 0;

        idx = srcu_read_lock(&kvm->irq_srcu);


But I think even that is flawed, as APICv can be dynamically deactivated and
re-activated while the VM is running, and I don't see a path that re-updates
the IRTE when APICv is re-activated.  So I think a more conservative check is
needed, e.g.

diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 5f81ef092bd4..6cf5b2e86118 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,

        if (!kvm_arch_has_assigned_device(kvm) ||
            !irq_remapping_cap(IRQ_POSTING_CAP) ||
-           !kvm_vcpu_apicv_active(kvm->vcpus[0]))
+           !irqchip_in_kernel(kvm) || !enable_apicv)
                return 0;

        idx = srcu_read_lock(&kvm->irq_srcu);


Paolo, am I missing something?


* Re: [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
  2021-11-05 19:20 ` [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code Marc Zyngier
@ 2021-11-05 20:12   ` Sean Christopherson
  2021-11-06 11:17     ` Marc Zyngier
  2021-11-08 12:12   ` Claudio Imbrenda
  1 sibling, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2021-11-05 20:12 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On Fri, Nov 05, 2021, Marc Zyngier wrote:
> All architectures have similar loops iterating over the vcpus,
> freeing one vcpu at a time, and eventually wiping the reference
> off the vcpus array. They are also inconsistently taking
> the kvm->lock mutex when wiping the references from the array.

...

> +void kvm_destroy_vcpus(struct kvm *kvm)
> +{
> +	unsigned int i;
> +	struct kvm_vcpu *vcpu;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm)
> +		kvm_vcpu_destroy(vcpu);
> +
> +	mutex_lock(&kvm->lock);

But why is kvm->lock taken here?  Unless I'm overlooking an arch, everyone calls
this from kvm_arch_destroy_vm(), in which case this is the only remaining reference
to @kvm.  And if there's some magic path for which that's not true, I don't see how
it can possibly be safe to call kvm_vcpu_destroy() without holding kvm->lock, or
how this would guarantee that all vCPUs have actually been destroyed before nullifying
the array.

> +	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> +		kvm->vcpus[i] = NULL;
> +
> +	atomic_set(&kvm->online_vcpus, 0);
> +	mutex_unlock(&kvm->lock);
> +}
> +EXPORT_SYMBOL_GPL(kvm_destroy_vcpus);


* Re: [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
  2021-11-05 19:21 ` [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray Marc Zyngier
@ 2021-11-05 20:21   ` Sean Christopherson
  2021-11-06 11:48     ` Marc Zyngier
  0 siblings, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2021-11-05 20:21 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On Fri, Nov 05, 2021, Marc Zyngier wrote:
> At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
> and is mostly empty in most cases (running 512 vcpu VMs is not that
> common). This means that we end up with a 4kB block of unused memory
> in the middle of the kvm structure.

Heh, x86 is now up to 1024 entries.
 
> Instead of wasting this memory, let's use an xarray, which gives us
> almost the same flexibility as a normal array, but with reduced
> memory usage for smaller VMs.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
> @@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
>  
>  	/* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu.  */
>  	smp_rmb();
> -	return kvm->vcpus[i];
> +	return xa_load(&kvm->vcpu_array, i);
>  }

It'd be nice for this series to convert kvm_for_each_vcpu() to use xa_for_each()
as well.  Maybe as a patch on top so that potential explosions from that are
isolated from the initial conversion?

Or maybe even use xa_for_each_range() to cap at online_vcpus?  That's technically
a functional change, but IMO it's easier to reason about iterating over a snapshot
of vCPUs as opposed to being able to iterate over vCPUs as their being added.  In
practice I doubt it matters.

#define kvm_for_each_vcpu(idx, vcpup, kvm) \
	xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0, atomic_read(&kvm->online_vcpus))


* Re: [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
  2021-11-05 20:12   ` Sean Christopherson
@ 2021-11-06 11:17     ` Marc Zyngier
  2021-11-16 13:49       ` Paolo Bonzini
  0 siblings, 1 reply; 24+ messages in thread
From: Marc Zyngier @ 2021-11-06 11:17 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On Fri, 05 Nov 2021 20:12:12 +0000,
Sean Christopherson <seanjc@google.com> wrote:
> 
> On Fri, Nov 05, 2021, Marc Zyngier wrote:
> > All architectures have similar loops iterating over the vcpus,
> > freeing one vcpu at a time, and eventually wiping the reference
> > off the vcpus array. They are also inconsistently taking
> > the kvm->lock mutex when wiping the references from the array.
> 
> ...
> 
> > +void kvm_destroy_vcpus(struct kvm *kvm)
> > +{
> > +	unsigned int i;
> > +	struct kvm_vcpu *vcpu;
> > +
> > +	kvm_for_each_vcpu(i, vcpu, kvm)
> > +		kvm_vcpu_destroy(vcpu);
> > +
> > +	mutex_lock(&kvm->lock);
> 
> But why is kvm->lock taken here?  Unless I'm overlooking an arch,
> everyone calls this from kvm_arch_destroy_vm(), in which case this
> is the only remaining reference to @kvm.  And if there's some magic
> path for which that's not true, I don't see how it can possibly be
> safe to call kvm_vcpu_destroy() without holding kvm->lock, or how
> this would guarantee that all vCPUs have actually been destroyed
> before nullifying the array.

I asked myself the same question two years ago, and couldn't really
understand the requirement. However, x86 does just that, so I
preserved the behaviour.

If you too believe that this is just wrong, I'm happy to drop the
locking altogether. If that breaks someone's flow, they'll shout soon
enough.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
  2021-11-05 20:21   ` Sean Christopherson
@ 2021-11-06 11:48     ` Marc Zyngier
  2021-11-08  8:23       ` Marc Zyngier
  0 siblings, 1 reply; 24+ messages in thread
From: Marc Zyngier @ 2021-11-06 11:48 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On Fri, 05 Nov 2021 20:21:36 +0000,
Sean Christopherson <seanjc@google.com> wrote:
> 
> On Fri, Nov 05, 2021, Marc Zyngier wrote:
> > At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
> > and is mostly empty in most cases (running 512 vcpu VMs is not that
> > common). This means that we end up with a 4kB block of unused memory
> > in the middle of the kvm structure.
> 
> Heh, x86 is now up to 1024 entries.

Humph. I don't want to know whether people are actually using that in
practice. The only time I create VMs with 512 vcpus is to check
whether it still works...

>  
> > Instead of wasting this memory, let's use an xarray, which gives us
> > almost the same flexibility as a normal array, but with reduced
> > memory usage for smaller VMs.
> > 
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > ---
> > @@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
> >  
> >  	/* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu.  */
> >  	smp_rmb();
> > -	return kvm->vcpus[i];
> > +	return xa_load(&kvm->vcpu_array, i);
> >  }
> 
> It'd be nice for this series to convert kvm_for_each_vcpu() to use
> xa_for_each() as well.  Maybe as a patch on top so that potential
> explosions from that are isolated from the initial conversion?
> 
> Or maybe even use xa_for_each_range() to cap at online_vcpus?
> That's technically a functional change, but IMO it's easier to
> reason about iterating over a snapshot of vCPUs as opposed to being
> able to iterate over vCPUs as they're being added.  In practice I
> doubt it matters.
> 
> #define kvm_for_each_vcpu(idx, vcpup, kvm) \
> 	xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0, atomic_read(&kvm->online_vcpus))
>

I think that's already the behaviour of this iterator (we stop at the
first empty slot, capped to online_vcpus). The only change in behaviour
is that vcpup currently holds a pointer to the last vcpu if no empty
slot has been encountered; xa_for_each{,_range}() would set the
pointer to NULL at all times.

I doubt anyone relies on that, but it is probably worth eyeballing
some of the use cases...
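
For reference, the current iterator reads roughly like this (quoting
from memory, so worth double-checking against kvm_host.h):

	#define kvm_for_each_vcpu(idx, vcpup, kvm) \
		for (idx = 0; \
		     idx < atomic_read(&kvm->online_vcpus) && \
		     (vcpup = kvm_get_vcpu(kvm, idx)) != NULL; \
		     idx++)

i.e. it stops at the first NULL slot, and otherwise leaves vcpup
pointing at the last vcpu once the loop terminates.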

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.


* Re: [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access
  2021-11-05 19:20 ` [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access Marc Zyngier
@ 2021-11-06 15:56   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 24+ messages in thread
From: Philippe Mathieu-Daudé @ 2021-11-06 15:56 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, open list:BROADCOM NVRAM DRIVER, kvmarm, linuxppc-dev,
	Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

On Fri, Nov 5, 2021 at 9:14 PM Marc Zyngier <maz@kernel.org> wrote:
>
> As we are about to change the way vcpus are allocated, mandate
> the use of kvm_get_vcpu() instead of open-coding the access.
>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  arch/mips/kvm/loongson_ipi.c | 4 ++--
>  arch/mips/kvm/mips.c         | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>


* Re: [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
  2021-11-06 11:48     ` Marc Zyngier
@ 2021-11-08  8:23       ` Marc Zyngier
  0 siblings, 0 replies; 24+ messages in thread
From: Marc Zyngier @ 2021-11-08  8:23 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On 2021-11-06 11:48, Marc Zyngier wrote:
> On Fri, 05 Nov 2021 20:21:36 +0000,
> Sean Christopherson <seanjc@google.com> wrote:
>> 
>> On Fri, Nov 05, 2021, Marc Zyngier wrote:
>> > At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
>> > and is mostly empty in most cases (running 512 vcpu VMs is not that
>> > common). This means that we end up with a 4kB block of unused memory
>> > in the middle of the kvm structure.
>> 
>> Heh, x86 is now up to 1024 entries.
> 
> Humph. I don't want to know whether people are actually using that in
> practice. The only time I create VMs with 512 vcpus is to check
> whether it still works...
> 
>> 
>> > Instead of wasting this memory, let's use an xarray, which gives us
>> > almost the same flexibility as a normal array, but with reduced
>> > memory usage for smaller VMs.
>> >
>> > Signed-off-by: Marc Zyngier <maz@kernel.org>
>> > ---
>> > @@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
>> >
>> >  	/* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu.  */
>> >  	smp_rmb();
>> > -	return kvm->vcpus[i];
>> > +	return xa_load(&kvm->vcpu_array, i);
>> >  }
>> 
>> It'd be nice for this series to convert kvm_for_each_vcpu() to use
>> xa_for_each() as well.  Maybe as a patch on top so that potential
>> explosions from that are isolated from the initial conversion?
>> 
>> Or maybe even use xa_for_each_range() to cap at online_vcpus?
>> That's technically a functional change, but IMO it's easier to
>> reason about iterating over a snapshot of vCPUs as opposed to being
>> able to iterate over vCPUs as they're being added.  In practice I
>> doubt it matters.
>> 
>> #define kvm_for_each_vcpu(idx, vcpup, kvm) \
>> 	xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0, atomic_read(&kvm->online_vcpus))
>> 
> 
> I think that's already the behaviour of this iterator (we stop at the
> first empty slot, capped to online_vcpus). The only change in behaviour
> is that vcpup currently holds a pointer to the last vcpu if no empty
> slot has been encountered; xa_for_each{,_range}() would set the
> pointer to NULL at all times.
> 
> I doubt anyone relies on that, but it is probably worth eyeballing
> some of the use cases...

This turned out to be an interesting exercise, as we always use an
int for the index, and the xarray iterators insist on an unsigned
long (and even on a pointer to it). On the other hand, I couldn't
spot any case where we'd rely on the last value of the vcpu pointer.
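
To illustrate the clash, a hypothetical call site (types only):

	unsigned long idx;	/* the macro passes &idx to xa_find(), */
	struct kvm_vcpu *vcpu;	/* so a plain int won't do             */

	xa_for_each(&kvm->vcpu_array, idx, vcpu) {
		/* every caller currently declaring 'int i' needs retyping */
	}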

I'll repost the series once we have a solution for patch #4, and
we can then decide whether we want the iterator churn.
-- 
Jazz is not dead. It just smells funny...


* Re: [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
  2021-11-05 19:20 ` [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code Marc Zyngier
  2021-11-05 20:12   ` Sean Christopherson
@ 2021-11-08 12:12   ` Claudio Imbrenda
  1 sibling, 0 replies; 24+ messages in thread
From: Claudio Imbrenda @ 2021-11-08 12:12 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

On Fri,  5 Nov 2021 19:20:57 +0000
Marc Zyngier <maz@kernel.org> wrote:

> All architectures have similar loops iterating over the vcpus,
> freeing one vcpu at a time, and eventually wiping the reference
> off the vcpus array. They are also inconsistently taking
> the kvm->lock mutex when wiping the references from the array.
> 
> Make this code common, which will simplify further changes.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>

no objections

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

> ---
>  arch/arm64/kvm/arm.c       | 10 +---------
>  arch/mips/kvm/mips.c       | 21 +--------------------
>  arch/powerpc/kvm/powerpc.c | 10 +---------
>  arch/riscv/kvm/vm.c        | 10 +---------
>  arch/s390/kvm/kvm-s390.c   | 18 +-----------------
>  arch/x86/kvm/x86.c         |  9 +--------
>  include/linux/kvm_host.h   |  2 +-
>  virt/kvm/kvm_main.c        | 20 ++++++++++++++++++--
>  8 files changed, 25 insertions(+), 75 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index f5490afe1ebf..75bb7215da03 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -175,19 +175,11 @@ vm_fault_t kvm_arch_vcpu_fault(struct kvm_vcpu *vcpu, struct vm_fault *vmf)
>   */
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> -	int i;
> -
>  	bitmap_free(kvm->arch.pmu_filter);
>  
>  	kvm_vgic_destroy(kvm);
>  
> -	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> -		if (kvm->vcpus[i]) {
> -			kvm_vcpu_destroy(kvm->vcpus[i]);
> -			kvm->vcpus[i] = NULL;
> -		}
> -	}
> -	atomic_set(&kvm->online_vcpus, 0);
> +	kvm_destroy_vcpus(kvm);
>  }
>  
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
> index 562aa878b266..ceacca74f808 100644
> --- a/arch/mips/kvm/mips.c
> +++ b/arch/mips/kvm/mips.c
> @@ -171,25 +171,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  	return 0;
>  }
>  
> -void kvm_mips_free_vcpus(struct kvm *kvm)
> -{
> -	unsigned int i;
> -	struct kvm_vcpu *vcpu;
> -
> -	kvm_for_each_vcpu(i, vcpu, kvm) {
> -		kvm_vcpu_destroy(vcpu);
> -	}
> -
> -	mutex_lock(&kvm->lock);
> -
> -	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> -		kvm->vcpus[i] = NULL;
> -
> -	atomic_set(&kvm->online_vcpus, 0);
> -
> -	mutex_unlock(&kvm->lock);
> -}
> -
>  static void kvm_mips_free_gpa_pt(struct kvm *kvm)
>  {
>  	/* It should always be safe to remove after flushing the whole range */
> @@ -199,7 +180,7 @@ static void kvm_mips_free_gpa_pt(struct kvm *kvm)
>  
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> -	kvm_mips_free_vcpus(kvm);
> +	kvm_destroy_vcpus(kvm);
>  	kvm_mips_free_gpa_pt(kvm);
>  }
>  
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 35e9cccdeef9..492e4a4121cb 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -463,9 +463,6 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> -	unsigned int i;
> -	struct kvm_vcpu *vcpu;
> -
>  #ifdef CONFIG_KVM_XICS
>  	/*
>  	 * We call kick_all_cpus_sync() to ensure that all
> @@ -476,14 +473,9 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
>  		kick_all_cpus_sync();
>  #endif
>  
> -	kvm_for_each_vcpu(i, vcpu, kvm)
> -		kvm_vcpu_destroy(vcpu);
> +	kvm_destroy_vcpus(kvm);
>  
>  	mutex_lock(&kvm->lock);
> -	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> -		kvm->vcpus[i] = NULL;
> -
> -	atomic_set(&kvm->online_vcpus, 0);
>  
>  	kvmppc_core_destroy_vm(kvm);
>  
> diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
> index 26399df15b63..6af6cde295eb 100644
> --- a/arch/riscv/kvm/vm.c
> +++ b/arch/riscv/kvm/vm.c
> @@ -46,15 +46,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>  
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
> -	int i;
> -
> -	for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> -		if (kvm->vcpus[i]) {
> -			kvm_vcpu_destroy(kvm->vcpus[i]);
> -			kvm->vcpus[i] = NULL;
> -		}
> -	}
> -	atomic_set(&kvm->online_vcpus, 0);
> +	kvm_destroy_vcpus(kvm);
>  }
>  
>  int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index c6257f625929..7af53b8788fa 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2819,27 +2819,11 @@ void kvm_arch_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	free_page((unsigned long)(vcpu->arch.sie_block));
>  }
>  
> -static void kvm_free_vcpus(struct kvm *kvm)
> -{
> -	unsigned int i;
> -	struct kvm_vcpu *vcpu;
> -
> -	kvm_for_each_vcpu(i, vcpu, kvm)
> -		kvm_vcpu_destroy(vcpu);
> -
> -	mutex_lock(&kvm->lock);
> -	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> -		kvm->vcpus[i] = NULL;
> -
> -	atomic_set(&kvm->online_vcpus, 0);
> -	mutex_unlock(&kvm->lock);
> -}
> -
>  void kvm_arch_destroy_vm(struct kvm *kvm)
>  {
>  	u16 rc, rrc;
>  
> -	kvm_free_vcpus(kvm);
> +	kvm_destroy_vcpus(kvm);
>  	sca_dispose(kvm);
>  	kvm_s390_gisa_destroy(kvm);
>  	/*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c1c4e2b05a63..498a43126615 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11302,15 +11302,8 @@ static void kvm_free_vcpus(struct kvm *kvm)
>  		kvm_clear_async_pf_completion_queue(vcpu);
>  		kvm_unload_vcpu_mmu(vcpu);
>  	}
> -	kvm_for_each_vcpu(i, vcpu, kvm)
> -		kvm_vcpu_destroy(vcpu);
> -
> -	mutex_lock(&kvm->lock);
> -	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> -		kvm->vcpus[i] = NULL;
>  
> -	atomic_set(&kvm->online_vcpus, 0);
> -	mutex_unlock(&kvm->lock);
> +	kvm_destroy_vcpus(kvm);
>  }
>  
>  void kvm_arch_sync_events(struct kvm *kvm)
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 60a35d9fe259..36967291b8c6 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -725,7 +725,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu_by_id(struct kvm *kvm, int id)
>  		if (WARN_ON_ONCE(!memslot->npages)) {			\
>  		} else
>  
> -void kvm_vcpu_destroy(struct kvm_vcpu *vcpu);
> +void kvm_destroy_vcpus(struct kvm *kvm);
>  
>  void vcpu_load(struct kvm_vcpu *vcpu);
>  void vcpu_put(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 3f6d450355f0..d83553eeea21 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -435,7 +435,7 @@ static void kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
>  	vcpu->last_used_slot = 0;
>  }
>  
> -void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
> +static void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
>  {
>  	kvm_dirty_ring_free(&vcpu->dirty_ring);
>  	kvm_arch_vcpu_destroy(vcpu);
> @@ -450,7 +450,23 @@ void kvm_vcpu_destroy(struct kvm_vcpu *vcpu)
>  	free_page((unsigned long)vcpu->run);
>  	kmem_cache_free(kvm_vcpu_cache, vcpu);
>  }
> -EXPORT_SYMBOL_GPL(kvm_vcpu_destroy);
> +
> +void kvm_destroy_vcpus(struct kvm *kvm)
> +{
> +	unsigned int i;
> +	struct kvm_vcpu *vcpu;
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm)
> +		kvm_vcpu_destroy(vcpu);
> +
> +	mutex_lock(&kvm->lock);
> +	for (i = 0; i < atomic_read(&kvm->online_vcpus); i++)
> +		kvm->vcpus[i] = NULL;
> +
> +	atomic_set(&kvm->online_vcpus, 0);
> +	mutex_unlock(&kvm->lock);
> +}
> +EXPORT_SYMBOL_GPL(kvm_destroy_vcpus);
>  
>  #if defined(CONFIG_MMU_NOTIFIER) && defined(KVM_ARCH_WANT_MMU_NOTIFIER)
>  static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)



* Re: [PATCH 3/5] KVM: s390: Use kvm_get_vcpu() instead of open-coded access
  2021-11-05 19:20 ` [PATCH 3/5] KVM: s390: " Marc Zyngier
@ 2021-11-08 12:13   ` Claudio Imbrenda
  0 siblings, 0 replies; 24+ messages in thread
From: Claudio Imbrenda @ 2021-11-08 12:13 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Paolo Bonzini, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

On Fri,  5 Nov 2021 19:20:59 +0000
Marc Zyngier <maz@kernel.org> wrote:

> As we are about to change the way vcpus are allocated, mandate
> the use of kvm_get_vcpu() instead of open-coding the access.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>

makes sense

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>

> ---
>  arch/s390/kvm/kvm-s390.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 7af53b8788fa..4a0f62b03964 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -4572,7 +4572,7 @@ int kvm_s390_vcpu_start(struct kvm_vcpu *vcpu)
>  	}
>  
>  	for (i = 0; i < online_vcpus; i++) {
> -		if (!is_vcpu_stopped(vcpu->kvm->vcpus[i]))
> +		if (!is_vcpu_stopped(kvm_get_vcpu(vcpu->kvm, i)))
>  			started_vcpus++;
>  	}
>  
> @@ -4634,9 +4634,11 @@ int kvm_s390_vcpu_stop(struct kvm_vcpu *vcpu)
>  	__disable_ibs_on_vcpu(vcpu);
>  
>  	for (i = 0; i < online_vcpus; i++) {
> -		if (!is_vcpu_stopped(vcpu->kvm->vcpus[i])) {
> +		struct kvm_vcpu *tmp = kvm_get_vcpu(vcpu->kvm, i);
> +
> +		if (!is_vcpu_stopped(tmp)) {
>  			started_vcpus++;
> -			started_vcpu = vcpu->kvm->vcpus[i];
> +			started_vcpu = tmp;
>  		}
>  	}
>  



* Re: [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code
  2021-11-06 11:17     ` Marc Zyngier
@ 2021-11-16 13:49       ` Paolo Bonzini
  0 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2021-11-16 13:49 UTC (permalink / raw)
  To: Marc Zyngier, Sean Christopherson
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Juergen Gross, Nicholas Piggin, Paul Mackerras,
	Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On 11/6/21 12:17, Marc Zyngier wrote:
> 
> If you too believe that this is just wrong, I'm happy to drop the
> locking altogether. If that breaks someone's flow, they'll shout soon
> enough.

Yes, it's not necessary.  It was added in 2009 (commit 988a2cae6a3c, 
"KVM: Use macro to iterate over vcpus.") and it was unnecessary back 
then too.

Paolo



* Re: [PATCH 4/5] KVM: x86: Use kvm_get_vcpu() instead of open-coded access
  2021-11-05 20:03   ` Sean Christopherson
@ 2021-11-16 14:04     ` Paolo Bonzini
  2021-11-16 16:07       ` Sean Christopherson
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2021-11-16 14:04 UTC (permalink / raw)
  To: Sean Christopherson, Marc Zyngier
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Juergen Gross, Nicholas Piggin, Paul Mackerras,
	Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On 11/5/21 21:03, Sean Christopherson wrote:
> But I think even that is flawed, as APICv can be dynamically deactivated and
> re-activated while the VM is running, and I don't see a path that re-updates
> the IRTE when APICv is re-activated.  So I think a more conservative check is
> needed, e.g.
> 
> diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> index 5f81ef092bd4..6cf5b2e86118 100644
> --- a/arch/x86/kvm/vmx/posted_intr.c
> +++ b/arch/x86/kvm/vmx/posted_intr.c
> @@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
> 
>          if (!kvm_arch_has_assigned_device(kvm) ||
>              !irq_remapping_cap(IRQ_POSTING_CAP) ||
> -           !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> +           !irqchip_in_kernel(kvm) || !enable_apicv)
>                  return 0;
> 
>          idx = srcu_read_lock(&kvm->irq_srcu);

What happens then if pi_pre_block is called and the IRTE denotes a 
posted interrupt?

I might be wrong, but it seems to me that you have to change all of the 
occurrences this way.  As soon as enable_apicv is set, you need to go 
through the POSTED_INTR_WAKEUP_VECTOR just in case.

Paolo



* Re: [PATCH 0/5] KVM: Turn the vcpu array into an xarray
  2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
                   ` (4 preceding siblings ...)
  2021-11-05 19:21 ` [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray Marc Zyngier
@ 2021-11-16 14:13 ` Juergen Gross
  2021-11-16 14:21   ` Paolo Bonzini
  2021-11-16 15:03 ` Paolo Bonzini
  6 siblings, 1 reply; 24+ messages in thread
From: Juergen Gross @ 2021-11-16 14:13 UTC (permalink / raw)
  To: Marc Zyngier, kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Paolo Bonzini, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team



On 05.11.21 20:20, Marc Zyngier wrote:
> The kvm structure is pretty large. A large portion of it is the vcpu
> array, which is 4kB on x86_64 and arm64 as they deal with 512 vcpu
> VMs. Of course, hardly anyone runs VMs this big, so this is often a
> net waste of memory and cache locality.
> 
> A possible approach is to turn the fixed-size array into an xarray,
> which results in a net code deletion after a bit of cleanup.
> 
> This series is on top of the current linux/master as it touches the
> RISC-V implementation. Only tested on arm64.
> 
> Marc Zyngier (5):
>    KVM: Move wiping of the kvm->vcpus array to common code
>    KVM: mips: Use kvm_get_vcpu() instead of open-coded access
>    KVM: s390: Use kvm_get_vcpu() instead of open-coded access
>    KVM: x86: Use kvm_get_vcpu() instead of open-coded access
>    KVM: Convert the kvm->vcpus array to a xarray
> 
>   arch/arm64/kvm/arm.c           | 10 +---------
>   arch/mips/kvm/loongson_ipi.c   |  4 ++--
>   arch/mips/kvm/mips.c           | 23 ++---------------------
>   arch/powerpc/kvm/powerpc.c     | 10 +---------
>   arch/riscv/kvm/vm.c            | 10 +---------
>   arch/s390/kvm/kvm-s390.c       | 26 ++++++--------------------
>   arch/x86/kvm/vmx/posted_intr.c |  2 +-
>   arch/x86/kvm/x86.c             |  9 +--------
>   include/linux/kvm_host.h       |  7 ++++---
>   virt/kvm/kvm_main.c            | 33 ++++++++++++++++++++++++++-------
>   10 files changed, 45 insertions(+), 89 deletions(-)
> 

For x86 you can add my:

Tested-by: Juergen Gross <jgross@suse.com>


Juergen



* Re: [PATCH 0/5] KVM: Turn the vcpu array into an xarray
  2021-11-16 14:13 ` [PATCH 0/5] KVM: Turn the vcpu array into an xarray Juergen Gross
@ 2021-11-16 14:21   ` Paolo Bonzini
  2021-11-16 14:54     ` Juergen Gross
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2021-11-16 14:21 UTC (permalink / raw)
  To: Juergen Gross, Marc Zyngier, kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Nicholas Piggin, Sean Christopherson,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On 11/16/21 15:13, Juergen Gross wrote:
> On 05.11.21 20:20, Marc Zyngier wrote:
>> The kvm structure is pretty large. A large portion of it is the vcpu
>> array, which is 4kB on x86_64 and arm64 as they deal with 512 vcpu
>> VMs. Of course, hardly anyone runs VMs this big, so this is often a
>> net waste of memory and cache locality.
>>
>> A possible approach is to turn the fixed-size array into an xarray,
>> which results in a net code deletion after a bit of cleanup.
>>
>> This series is on top of the current linux/master as it touches the
>> RISC-V implementation. Only tested on arm64.
>>
>> Marc Zyngier (5):
>>    KVM: Move wiping of the kvm->vcpus array to common code
>>    KVM: mips: Use kvm_get_vcpu() instead of open-coded access
>>    KVM: s390: Use kvm_get_vcpu() instead of open-coded access
>>    KVM: x86: Use kvm_get_vcpu() instead of open-coded access
>>    KVM: Convert the kvm->vcpus array to a xarray
>>
>>   arch/arm64/kvm/arm.c           | 10 +---------
>>   arch/mips/kvm/loongson_ipi.c   |  4 ++--
>>   arch/mips/kvm/mips.c           | 23 ++---------------------
>>   arch/powerpc/kvm/powerpc.c     | 10 +---------
>>   arch/riscv/kvm/vm.c            | 10 +---------
>>   arch/s390/kvm/kvm-s390.c       | 26 ++++++--------------------
>>   arch/x86/kvm/vmx/posted_intr.c |  2 +-
>>   arch/x86/kvm/x86.c             |  9 +--------
>>   include/linux/kvm_host.h       |  7 ++++---
>>   virt/kvm/kvm_main.c            | 33 ++++++++++++++++++++++++++-------
>>   10 files changed, 45 insertions(+), 89 deletions(-)
>>
> 
> For x86 you can add my:
> 
> Tested-by: Juergen Gross <jgross@suse.com>

Heh, unfortunately x86 is the only one that needs a change in patch 4. 
I'll Cc you on my version.

Paolo



* Re: [PATCH 0/5] KVM: Turn the vcpu array into an xarray
  2021-11-16 14:21   ` Paolo Bonzini
@ 2021-11-16 14:54     ` Juergen Gross
  0 siblings, 0 replies; 24+ messages in thread
From: Juergen Gross @ 2021-11-16 14:54 UTC (permalink / raw)
  To: Paolo Bonzini, Marc Zyngier, kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Nicholas Piggin, Sean Christopherson,
	Paul Mackerras, Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team



On 16.11.21 15:21, Paolo Bonzini wrote:
> On 11/16/21 15:13, Juergen Gross wrote:
>> On 05.11.21 20:20, Marc Zyngier wrote:
>>> The kvm structure is pretty large. A large portion of it is the vcpu
>>> array, which is 4kB on x86_64 and arm64 as they deal with 512 vcpu
>>> VMs. Of course, hardly anyone runs VMs this big, so this is often a
>>> net waste of memory and cache locality.
>>>
>>> A possible approach is to turn the fixed-size array into an xarray,
>>> which results in a net code deletion after a bit of cleanup.
>>>
>>> This series is on top of the current linux/master as it touches the
>>> RISC-V implementation. Only tested on arm64.
>>>
>>> Marc Zyngier (5):
>>>    KVM: Move wiping of the kvm->vcpus array to common code
>>>    KVM: mips: Use kvm_get_vcpu() instead of open-coded access
>>>    KVM: s390: Use kvm_get_vcpu() instead of open-coded access
>>>    KVM: x86: Use kvm_get_vcpu() instead of open-coded access
>>>    KVM: Convert the kvm->vcpus array to a xarray
>>>
>>>   arch/arm64/kvm/arm.c           | 10 +---------
>>>   arch/mips/kvm/loongson_ipi.c   |  4 ++--
>>>   arch/mips/kvm/mips.c           | 23 ++---------------------
>>>   arch/powerpc/kvm/powerpc.c     | 10 +---------
>>>   arch/riscv/kvm/vm.c            | 10 +---------
>>>   arch/s390/kvm/kvm-s390.c       | 26 ++++++--------------------
>>>   arch/x86/kvm/vmx/posted_intr.c |  2 +-
>>>   arch/x86/kvm/x86.c             |  9 +--------
>>>   include/linux/kvm_host.h       |  7 ++++---
>>>   virt/kvm/kvm_main.c            | 33 ++++++++++++++++++++++++++-------
>>>   10 files changed, 45 insertions(+), 89 deletions(-)
>>>
>>
>> For x86 you can add my:
>>
>> Tested-by: Juergen Gross <jgross@suse.com>
> 
> Heh, unfortunately x86 is the only one that needs a change in patch 4. 
> I'll Cc you on my version.

I guess the changes in kvm_main.c are more important for my series. :-)

I've replaced patch 4 with your variant and everything is still working.
Not sure how relevant that is, though.


Juergen



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5] KVM: Turn the vcpu array into an xarray
  2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
                   ` (5 preceding siblings ...)
  2021-11-16 14:13 ` [PATCH 0/5] KVM: Turn the vcpu array into an xarray Juergen Gross
@ 2021-11-16 15:03 ` Paolo Bonzini
  2021-11-16 15:40   ` Marc Zyngier
  6 siblings, 1 reply; 24+ messages in thread
From: Paolo Bonzini @ 2021-11-16 15:03 UTC (permalink / raw)
  To: Marc Zyngier, kvm, linux-mips, kvmarm, linuxppc-dev
  Cc: Huacai Chen, Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

On 11/5/21 20:20, Marc Zyngier wrote:
> The kvm structure is pretty large. A large portion of it is the vcpu
> array, which is 4kB on x86_64 and arm64 as they deal with 512 vcpu
> VMs. Of course, hardly anyone runs VMs this big, so this is often a
> net waste of memory and cache locality.
> 
> A possible approach is to turn the fixed-size array into an xarray,
> which results in a net code deletion after a bit of cleanup.
> 
> This series is on top of the current linux/master as it touches the
> RISC-V implementation. Only tested on arm64.

Queued, only locally until I get a review for my replacement of patch 4 
(see 
https://lore.kernel.org/kvm/20211116142205.719375-1-pbonzini@redhat.com/T/).

Paolo

> Marc Zyngier (5):
>    KVM: Move wiping of the kvm->vcpus array to common code
>    KVM: mips: Use kvm_get_vcpu() instead of open-coded access
>    KVM: s390: Use kvm_get_vcpu() instead of open-coded access
>    KVM: x86: Use kvm_get_vcpu() instead of open-coded access
>    KVM: Convert the kvm->vcpus array to a xarray
> 
>   arch/arm64/kvm/arm.c           | 10 +---------
>   arch/mips/kvm/loongson_ipi.c   |  4 ++--
>   arch/mips/kvm/mips.c           | 23 ++---------------------
>   arch/powerpc/kvm/powerpc.c     | 10 +---------
>   arch/riscv/kvm/vm.c            | 10 +---------
>   arch/s390/kvm/kvm-s390.c       | 26 ++++++--------------------
>   arch/x86/kvm/vmx/posted_intr.c |  2 +-
>   arch/x86/kvm/x86.c             |  9 +--------
>   include/linux/kvm_host.h       |  7 ++++---
>   virt/kvm/kvm_main.c            | 33 ++++++++++++++++++++++++++-------
>   10 files changed, 45 insertions(+), 89 deletions(-)
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5] KVM: Turn the vcpu array into an xarray
  2021-11-16 15:03 ` Paolo Bonzini
@ 2021-11-16 15:40   ` Marc Zyngier
  0 siblings, 0 replies; 24+ messages in thread
From: Marc Zyngier @ 2021-11-16 15:40 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Juergen Gross, Nicholas Piggin,
	Sean Christopherson, Paul Mackerras, Michael Ellerman,
	James Morse, Suzuki K Poulose, Alexandru Elisei, kernel-team

On Tue, 16 Nov 2021 15:03:40 +0000,
Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 11/5/21 20:20, Marc Zyngier wrote:
> > The kvm structure is pretty large. A large portion of it is the vcpu
> > array, which is 4kB on x86_64 and arm64 as they deal with 512 vcpu
> > VMs. Of course, hardly anyone runs VMs this big, so this is often a
> > net waste of memory and cache locality.
> > 
> > A possible approach is to turn the fixed-size array into an xarray,
> > which results in a net code deletion after a bit of cleanup.
> > 
> > This series is on top of the current linux/master as it touches the
> > RISC-V implementation. Only tested on arm64.
> 
> Queued, only locally until I get a review for my replacement of patch
> 4 (see
> https://lore.kernel.org/kvm/20211116142205.719375-1-pbonzini@redhat.com/T/).

In which case, let me send a v2 with the changes that we discussed
with Sean. It will still have my version of patch 4, but that's
nothing you can't fix.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/5] KVM: x86: Use kvm_get_vcpu() instead of open-coded access
  2021-11-16 14:04     ` Paolo Bonzini
@ 2021-11-16 16:07       ` Sean Christopherson
  2021-11-16 16:48         ` Paolo Bonzini
  0 siblings, 1 reply; 24+ messages in thread
From: Sean Christopherson @ 2021-11-16 16:07 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Marc Zyngier, kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Juergen Gross, Nicholas Piggin, Paul Mackerras,
	Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On Tue, Nov 16, 2021, Paolo Bonzini wrote:
> On 11/5/21 21:03, Sean Christopherson wrote:
> > But I think even that is flawed, as APICv can be dynamically deactivated and
> > re-activated while the VM is running, and I don't see a path that re-updates
> > the IRTE when APICv is re-activated.  So I think a more conservative check is
> > needed, e.g.
> > 
> > diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
> > index 5f81ef092bd4..6cf5b2e86118 100644
> > --- a/arch/x86/kvm/vmx/posted_intr.c
> > +++ b/arch/x86/kvm/vmx/posted_intr.c
> > @@ -272,7 +272,7 @@ int pi_update_irte(struct kvm *kvm, unsigned int host_irq, uint32_t guest_irq,
> > 
> >          if (!kvm_arch_has_assigned_device(kvm) ||
> >              !irq_remapping_cap(IRQ_POSTING_CAP) ||
> > -           !kvm_vcpu_apicv_active(kvm->vcpus[0]))
> > +           !irqchip_in_kernel(kvm) || !enable_apicv)
> >                  return 0;
> > 
> >          idx = srcu_read_lock(&kvm->irq_srcu);
> 
> What happens then if pi_pre_block is called and the IRTE denotes a posted
> interrupt?
> 
> I might be wrong, but it seems to me that you have to change all of the
> occurrences this way.  As soon as enable_apicv is set, you need to go
> through the POSTED_INTR_WAKEUP_VECTOR just in case.

Sorry, I didn't grok that at all.  All occurrences of what?

^ permalink raw reply	[flat|nested] 24+ messages in thread
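
For context on the pi_pre_block() question: the guard at the head of
the blocking path follows the same pattern as pi_update_irte(), roughly
(paraphrased, not quoted from the kernel):

	if (!kvm_arch_has_assigned_device(vcpu->kvm) ||
	    !irq_remapping_cap(IRQ_POSTING_CAP) ||
	    !kvm_vcpu_apicv_active(vcpu))
		return 0;

If APICv is deactivated after an IRTE has already been configured for
posted delivery, this bails out early, the vCPU blocks without moving
its notification vector to POSTED_INTR_WAKEUP_VECTOR, and a posted
interrupt arriving while it sleeps may never wake it. That is the gap
Paolo's reply below is pointing at.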

* Re: [PATCH 4/5] KVM: x86: Use kvm_get_vcpu() instead of open-coded access
  2021-11-16 16:07       ` Sean Christopherson
@ 2021-11-16 16:48         ` Paolo Bonzini
  0 siblings, 0 replies; 24+ messages in thread
From: Paolo Bonzini @ 2021-11-16 16:48 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, kvm, linux-mips, kvmarm, linuxppc-dev, Huacai Chen,
	Aleksandar Markovic, Anup Patel, Atish Patra,
	Christian Borntraeger, Janosch Frank, David Hildenbrand,
	Claudio Imbrenda, Juergen Gross, Nicholas Piggin, Paul Mackerras,
	Michael Ellerman, James Morse, Suzuki K Poulose,
	Alexandru Elisei, kernel-team

On 11/16/21 17:07, Sean Christopherson wrote:
>>>           if (!kvm_arch_has_assigned_device(kvm) ||
>>>               !irq_remapping_cap(IRQ_POSTING_CAP) ||
>>> -           !kvm_vcpu_apicv_active(kvm->vcpus[0]))
>>> +           !irqchip_in_kernel(kvm) || !enable_apicv)
>>>                   return 0;
>>>
>>>           idx = srcu_read_lock(&kvm->irq_srcu);
>> What happens then if pi_pre_block is called and the IRTE denotes a posted
>> interrupt?
>>
>> I might be wrong, but it seems to me that you have to change all of the
>> occurrences this way.  As soon as enable_apicv is set, you need to go
>> through the POSTED_INTR_WAKEUP_VECTOR just in case.
> Sorry, I didn't grok that at all.  All occurrences of what?

Of the !assigned-device || !VTd-PI || !kvm_vcpu_apicv_active(vcpu) 
checks.  This way, CPUs are woken up correctly even if you have 
!kvm_vcpu_apicv_active(vcpu) but the IRTE is a posted-interrupt one.

Paolo


^ permalink raw reply	[flat|nested] 24+ messages in thread
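
One way to apply Paolo's suggestion consistently would be to hoist the
condition into a single helper that depends only on VM-wide state, and
use it at every VT-d PI call site (pi_update_irte(), pi_pre_block() and
friends). The sketch below assumes that approach; the helper name
vmx_can_use_vtd_pi() is illustrative, not taken from the thread.

/*
 * True whenever an IRTE could have been configured for posted
 * delivery at some point in the VM's lifetime. Deliberately ignores
 * the current per-vCPU APICv activation state: once enable_apicv is
 * set, the wakeup-vector path must be honoured just in case.
 */
static bool vmx_can_use_vtd_pi(struct kvm *kvm)
{
	return irqchip_in_kernel(kvm) && enable_apicv &&
	       kvm_arch_has_assigned_device(kvm) &&
	       irq_remapping_cap(IRQ_POSTING_CAP);
}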

end of thread, other threads:[~2021-11-16 16:48 UTC | newest]

Thread overview: 24+ messages
2021-11-05 19:20 [PATCH 0/5] KVM: Turn the vcpu array into an xarray Marc Zyngier
2021-11-05 19:20 ` [PATCH 1/5] KVM: Move wiping of the kvm->vcpus array to common code Marc Zyngier
2021-11-05 20:12   ` Sean Christopherson
2021-11-06 11:17     ` Marc Zyngier
2021-11-16 13:49       ` Paolo Bonzini
2021-11-08 12:12   ` Claudio Imbrenda
2021-11-05 19:20 ` [PATCH 2/5] KVM: mips: Use kvm_get_vcpu() instead of open-coded access Marc Zyngier
2021-11-06 15:56   ` Philippe Mathieu-Daudé
2021-11-05 19:20 ` [PATCH 3/5] KVM: s390: " Marc Zyngier
2021-11-08 12:13   ` Claudio Imbrenda
2021-11-05 19:21 ` [PATCH 4/5] KVM: x86: " Marc Zyngier
2021-11-05 20:03   ` Sean Christopherson
2021-11-16 14:04     ` Paolo Bonzini
2021-11-16 16:07       ` Sean Christopherson
2021-11-16 16:48         ` Paolo Bonzini
2021-11-05 19:21 ` [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray Marc Zyngier
2021-11-05 20:21   ` Sean Christopherson
2021-11-06 11:48     ` Marc Zyngier
2021-11-08  8:23       ` Marc Zyngier
2021-11-16 14:13 ` [PATCH 0/5] KVM: Turn the vcpu array into an xarray Juergen Gross
2021-11-16 14:21   ` Paolo Bonzini
2021-11-16 14:54     ` Juergen Gross
2021-11-16 15:03 ` Paolo Bonzini
2021-11-16 15:40   ` Marc Zyngier
