* [PATCH v4 0/8] KVM: Various fixes and improvements around kicking vCPUs
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

Changes since v3:
- "KVM: x86: hyper-v: Avoid calling kvm_make_vcpus_request_mask() with
  vcpu_mask==NULL" patch added.
- Untangle kvm_make_all_cpus_request_except()/kvm_make_vcpus_request_mask()
 [Sean]
- "KVM: Drop 'except' parameter from kvm_make_vcpus_request_mask()" patch
 added [Sean]
- "KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()" patch
 added.
- "KVM: Make kvm_make_vcpus_request_mask() use pre-allocated cpu_kick_mask"
 patch added.
- Add Sean's R-b tag to PATCH6.

This series is a continuation of Sean's "[PATCH 0/2] KVM: Fix a benign race
in kicking vCPUs" work and v2 of my "KVM: Optimize
kvm_make_vcpus_request_mask() a bit"/"KVM: x86: Fix stack-out-of-bounds
memory access from ioapic_write_indirect()" patchset.

From Sean:

"Fix benign races when kicking vCPUs where the task doing the kicking can
consume a stale vcpu->cpu.  The races are benign because of the
implications of task migration with respect to interrupts and being in
guest mode, but IMO they're worth fixing if only as an excuse to
document the flows.

Patch 2 is a tangentially related cleanup to prevent future me from
trying to get rid of the NULL check on the cpumask parameters, which
_looks_ like it can't ever be NULL, but has a subtle edge case due to the
way CONFIG_CPUMASK_OFFSTACK=y handles cpumasks."
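
As an aside for readers unfamiliar with that edge case: whether a
cpumask_var_t can be NULL at all depends on the kernel config. A minimal
sketch of the distinction, paraphrased from include/linux/cpumask.h (not
code from this series):

	#ifdef CONFIG_CPUMASK_OFFSTACK
	/* Heap-allocated; alloc_cpumask_var() can fail and leave it NULL. */
	typedef struct cpumask *cpumask_var_t;
	/* cpumask_available(m) is effectively: m != NULL */
	#else
	/* An on-stack array; the "pointer" can never be NULL. */
	typedef struct cpumask cpumask_var_t[1];
	/* cpumask_available(m) is effectively: true */
	#endif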

Patch 3 is a preparation for untangling kvm_make_all_cpus_request_except()
and kvm_make_vcpus_request_mask().

Patch 4 is a minor optimization of kvm_make_vcpus_request_mask() for big
guests.

Patch 5 is a minor cleanup.

Patch 6 fixes a real problem: in ioapic_write_indirect(), KVM does an
out-of-bounds access to stack memory.

Patches 7 and 8 get rid of dynamic cpumask allocation for kicking vCPUs.

Sean Christopherson (2):
  KVM: Clean up benign vcpu->cpu data races when kicking vCPUs
  KVM: Use cpumask_available() to check for NULL cpumask when kicking
    vCPUs

Vitaly Kuznetsov (6):
  KVM: x86: hyper-v: Avoid calling kvm_make_vcpus_request_mask() with
    vcpu_mask==NULL
  KVM: Optimize kvm_make_vcpus_request_mask() a bit
  KVM: Drop 'except' parameter from kvm_make_vcpus_request_mask()
  KVM: x86: Fix stack-out-of-bounds memory access from
    ioapic_write_indirect()
  KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()
  KVM: Make kvm_make_vcpus_request_mask() use pre-allocated
    cpu_kick_mask

 arch/x86/include/asm/kvm_host.h |   1 -
 arch/x86/kvm/hyperv.c           |  18 ++---
 arch/x86/kvm/ioapic.c           |  10 +--
 arch/x86/kvm/x86.c              |   8 +--
 include/linux/kvm_host.h        |   3 +-
 virt/kvm/kvm_main.c             | 118 ++++++++++++++++++++++++--------
 6 files changed, 106 insertions(+), 52 deletions(-)

-- 
2.31.1


* [PATCH v4 1/8] KVM: Clean up benign vcpu->cpu data races when kicking vCPUs
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

From: Sean Christopherson <seanjc@google.com>

Fix a benign data race reported by syzbot+KCSAN[*] by ensuring vcpu->cpu
is read exactly once, and by ensuring the vCPU is booted from guest mode
if kvm_arch_vcpu_should_kick() returns true.  Fix a similar race in
kvm_make_vcpus_request_mask() by ensuring the vCPU is interrupted if
kvm_request_needs_ipi() returns true.

Reading vcpu->cpu before vcpu->mode (via kvm_arch_vcpu_should_kick() or
kvm_request_needs_ipi()) means the target vCPU could get migrated (change
vcpu->cpu) and enter !OUTSIDE_GUEST_MODE between reading vcpu->cpu and
reading vcpu->mode.  If that happens, the kick/IPI will be sent to the
old pCPU, not the new pCPU that is now running the vCPU or reading SPTEs.

Although failing to kick the vCPU is not exactly ideal, practically
speaking it cannot cause a functional issue unless there is also a bug in
the caller, and any such bug would exist regardless of kvm_vcpu_kick()'s
behavior.

The purpose of sending an IPI is purely to get a vCPU into the host (or
out of reading SPTEs) so that the vCPU can recognize a change in state,
e.g. a KVM_REQ_* request.  If the vCPU's handling of the state change is
required for correctness, KVM must ensure either the vCPU sees the change
before entering the guest, or that the sender sees the vCPU as running in
guest mode.  All architectures handle this by (a) sending the request
before calling kvm_vcpu_kick() and (b) checking for requests _after_
setting vcpu->mode.
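
To illustrate the (a)/(b) pairing, a rough sketch of both sides (the
request name and the bail-out path are illustrative, not lifted from a
particular caller):

	/* Sender: (a) make the request visible, then kick. */
	kvm_make_request(KVM_REQ_FOO, vcpu);	/* KVM_REQ_FOO is made up */
	kvm_vcpu_kick(vcpu);			/* IPI only if IN_GUEST_MODE */

	/* vCPU entry: (b) set the mode, then re-check requests. */
	vcpu->mode = IN_GUEST_MODE;
	smp_mb();	/* order the mode write against the request read */
	if (kvm_request_pending(vcpu)) {
		vcpu->mode = OUTSIDE_GUEST_MODE;
		/* abort guest entry and go handle the request */
	}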

x86's READING_SHADOW_PAGE_TABLES has similar requirements; KVM needs to
ensure it kicks and waits for vCPUs that started reading SPTEs _before_
MMU changes were finalized, but any vCPU that starts reading after MMU
changes were finalized will see the new state and can continue on
uninterrupted.

For uses of kvm_vcpu_kick() that are not paired with a KVM_REQ_*, e.g.
x86's kvm_arch_sync_dirty_log(), the order of the kick must not be relied
upon for functional correctness, e.g. in the dirty log case, userspace
cannot assume it has a 100% complete log if vCPUs are still running.

All that said, eliminate the benign race since the cost of doing so is an
"extra" atomic cmpxchg() in the case where the target vCPU is loaded by
the current pCPU or is not loaded at all.  I.e. the kick will be skipped
due to kvm_vcpu_exiting_guest_mode() seeing a compatible vcpu->mode as
opposed to the kick being skipped because of the cpu checks.

Keep the "cpu != me" checks even though they appear useless/impossible at
first glance.  x86 processes guest IPI writes in a fast path that runs in
IN_GUEST_MODE, i.e. can call kvm_vcpu_kick() from IN_GUEST_MODE.  And
calling kvm_vm_bugged()->kvm_make_vcpus_request_mask() from IN_GUEST_MODE or
READING_SHADOW_PAGE_TABLES is perfectly reasonable.

Note, a race with the cpu_online() check in kvm_vcpu_kick() likely
persists, e.g. the vCPU could exit guest mode and get offlined between
the cpu_online() check and the sending of smp_send_reschedule().  But,
the online check appears to exist only to avoid a WARN in x86's
native_smp_send_reschedule() that fires if the target CPU is not online.
The reschedule WARN exists because CPU offlining takes the CPU out of the
scheduling pool, i.e. the WARN is intended to detect the case where the
kernel attempts to schedule a task on an offline CPU.  The actual sending
of the IPI is a non-issue as at worst it will simply be dropped on the
floor.  In other words, KVM's usurping of the reschedule IPI could
theoretically trigger a WARN if the stars align, but there will be no
loss of functionality.
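
For reference, the WARN in question looks roughly like this (paraphrased
from x86's native_smp_send_reschedule() in arch/x86/kernel/smp.c; check
the tree for the exact code):

	static void native_smp_send_reschedule(int cpu)
	{
		if (unlikely(cpu_is_offline(cpu))) {
			WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
			return;
		}
		/* An offline target makes the IPI itself harmless. */
		apic->send_IPI(cpu, RESCHEDULE_VECTOR);
	}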

[*] https://syzkaller.appspot.com/bug?extid=cd4154e502f43f10808a

Cc: Venkatesh Srinivas <venkateshs@google.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Fixes: 97222cc83163 ("KVM: Emulate local APIC in kernel")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 virt/kvm/kvm_main.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3e67c93ca403..786b914db98f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -273,14 +273,26 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
 			continue;
 
 		kvm_make_request(req, vcpu);
-		cpu = vcpu->cpu;
 
 		if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
 			continue;
 
-		if (tmp != NULL && cpu != -1 && cpu != me &&
-		    kvm_request_needs_ipi(vcpu, req))
-			__cpumask_set_cpu(cpu, tmp);
+		/*
+		 * Note, the vCPU could get migrated to a different pCPU at any
+		 * point after kvm_request_needs_ipi(), which could result in
+		 * sending an IPI to the previous pCPU.  But, that's ok because
+		 * the purpose of the IPI is to ensure the vCPU returns to
+		 * OUTSIDE_GUEST_MODE, which is satisfied if the vCPU migrates.
+		 * Entering READING_SHADOW_PAGE_TABLES after this point is also
+		 * ok, as the requirement is only that KVM wait for vCPUs that
+		 * were reading SPTEs _before_ any changes were finalized.  See
+		 * kvm_vcpu_kick() for more details on handling requests.
+		 */
+		if (tmp != NULL && kvm_request_needs_ipi(vcpu, req)) {
+			cpu = READ_ONCE(vcpu->cpu);
+			if (cpu != -1 && cpu != me)
+				__cpumask_set_cpu(cpu, tmp);
+		}
 	}
 
 	called = kvm_kick_many_cpus(tmp, !!(req & KVM_REQUEST_WAIT));
@@ -3309,16 +3321,24 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_wake_up);
  */
 void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 {
-	int me;
-	int cpu = vcpu->cpu;
+	int me, cpu;
 
 	if (kvm_vcpu_wake_up(vcpu))
 		return;
 
+	/*
+	 * Note, the vCPU could get migrated to a different pCPU at any point
+	 * after kvm_arch_vcpu_should_kick(), which could result in sending an
+	 * IPI to the previous pCPU.  But, that's ok because the purpose of the
+	 * IPI is to force the vCPU to leave IN_GUEST_MODE, and migrating the
+	 * vCPU also requires it to leave IN_GUEST_MODE.
+	 */
 	me = get_cpu();
-	if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
-		if (kvm_arch_vcpu_should_kick(vcpu))
+	if (kvm_arch_vcpu_should_kick(vcpu)) {
+		cpu = READ_ONCE(vcpu->cpu);
+		if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
 			smp_send_reschedule(cpu);
+	}
 	put_cpu();
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
-- 
2.31.1


* [PATCH v4 2/8] KVM: Use cpumask_available() to check for NULL cpumask when kicking vCPUs
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

From: Sean Christopherson <seanjc@google.com>

Check for a NULL cpumask_var_t when kicking multiple vCPUs via
cpumask_available(), which performs a !NULL check if and only if cpumasks
are configured to be allocated off-stack.  This is a meaningless
optimization, e.g. avoids a TEST+Jcc and TEST+CMOV on x86, but more
importantly helps document that the NULL check is necessary even though
all callers pass in a local variable.

No functional change intended.

Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 virt/kvm/kvm_main.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 786b914db98f..2082aceffbf6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -245,9 +245,13 @@ static void ack_flush(void *_completed)
 {
 }
 
-static inline bool kvm_kick_many_cpus(const struct cpumask *cpus, bool wait)
+static inline bool kvm_kick_many_cpus(cpumask_var_t tmp, bool wait)
 {
-	if (unlikely(!cpus))
+	const struct cpumask *cpus;
+
+	if (likely(cpumask_available(tmp)))
+		cpus = tmp;
+	else
 		cpus = cpu_online_mask;
 
 	if (cpumask_empty(cpus))
@@ -277,6 +281,14 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
 		if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
 			continue;
 
+		/*
+		 * tmp can be "unavailable" if cpumasks are allocated off stack
+		 * as allocation of the mask is deliberately not fatal and is
+		 * handled by falling back to kicking all online CPUs.
+		 */
+		if (!cpumask_available(tmp))
+			continue;
+
 		/*
 		 * Note, the vCPU could get migrated to a different pCPU at any
 		 * point after kvm_request_needs_ipi(), which could result in
@@ -288,7 +300,7 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
 		 * were reading SPTEs _before_ any changes were finalized.  See
 		 * kvm_vcpu_kick() for more details on handling requests.
 		 */
-		if (tmp != NULL && kvm_request_needs_ipi(vcpu, req)) {
+		if (kvm_request_needs_ipi(vcpu, req)) {
 			cpu = READ_ONCE(vcpu->cpu);
 			if (cpu != -1 && cpu != me)
 				__cpumask_set_cpu(cpu, tmp);
-- 
2.31.1


* [PATCH v4 3/8] KVM: x86: hyper-v: Avoid calling kvm_make_vcpus_request_mask() with vcpu_mask==NULL
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

In preparation for making kvm_make_vcpus_request_mask() use for_each_set_bit(),
switch kvm_hv_flush_tlb() to calling kvm_make_all_cpus_request() for the
'all cpus' case.

Note: kvm_make_all_cpus_request() (unlike kvm_make_vcpus_request_mask())
currently allocates a cpumask dynamically on each call, which is suboptimal.
Both kvm_make_all_cpus_request() and kvm_make_vcpus_request_mask() are
going to be switched to using pre-allocated per-CPU masks.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index fe4a02715266..783a7f2441bd 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1839,16 +1839,19 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 
 	cpumask_clear(&hv_vcpu->tlb_flush);
 
-	vcpu_mask = all_cpus ? NULL :
-		sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
-					vp_bitmap, vcpu_bitmap);
-
 	/*
 	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
 	 * analyze it here, flush TLB regardless of the specified address space.
 	 */
-	kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
-				    NULL, vcpu_mask, &hv_vcpu->tlb_flush);
+	if (all_cpus) {
+		kvm_make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH_GUEST);
+	} else {
+		vcpu_mask = sparse_set_to_vcpu_mask(kvm, sparse_banks, valid_bank_mask,
+						    vp_bitmap, vcpu_bitmap);
+
+		kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
+					    NULL, vcpu_mask, &hv_vcpu->tlb_flush);
+	}
 
 ret_success:
 	/* We always do full TLB flush, set 'Reps completed' = 'Rep Count' */
-- 
2.31.1


* [PATCH v4 4/8] KVM: Optimize kvm_make_vcpus_request_mask() a bit
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

Iterating over set bits in 'vcpu_bitmap' should be faster than going
through all vCPUs, especially when just a few bits are set.

Drop the kvm_make_vcpus_request_mask() call from kvm_make_all_cpus_request_except()
to avoid handling the special case when 'vcpu_bitmap' is NULL; move the
code to kvm_make_all_cpus_request_except() itself.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 virt/kvm/kvm_main.c | 88 +++++++++++++++++++++++++++------------------
 1 file changed, 53 insertions(+), 35 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2082aceffbf6..e32ba210025f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -261,50 +261,57 @@ static inline bool kvm_kick_many_cpus(cpumask_var_t tmp, bool wait)
 	return true;
 }
 
+static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
+				  unsigned int req, cpumask_var_t tmp,
+				  int current_cpu)
+{
+	int cpu = vcpu->cpu;
+
+	kvm_make_request(req, vcpu);
+
+	if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
+		return;
+
+	/*
+	 * tmp can be "unavailable" if cpumasks are allocated off stack as
+	 * allocation of the mask is deliberately not fatal and is handled by
+	 * falling back to kicking all online CPUs.
+	 */
+	if (!cpumask_available(tmp))
+		return;
+
+	/*
+	 * Note, the vCPU could get migrated to a different pCPU at any point
+	 * after kvm_request_needs_ipi(), which could result in sending an IPI
+	 * to the previous pCPU.  But, that's OK because the purpose of the IPI
+	 * is to ensure the vCPU returns to OUTSIDE_GUEST_MODE, which is
+	 * satisfied if the vCPU migrates. Entering READING_SHADOW_PAGE_TABLES
+	 * after this point is also OK, as the requirement is only that KVM wait
+	 * for vCPUs that were reading SPTEs _before_ any changes were
+	 * finalized. See kvm_vcpu_kick() for more details on handling requests.
+	 */
+	if (kvm_request_needs_ipi(vcpu, req)) {
+		cpu = READ_ONCE(vcpu->cpu);
+		if (cpu != -1 && cpu != current_cpu)
+			__cpumask_set_cpu(cpu, tmp);
+	}
+}
+
 bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
 				 struct kvm_vcpu *except,
 				 unsigned long *vcpu_bitmap, cpumask_var_t tmp)
 {
-	int i, cpu, me;
 	struct kvm_vcpu *vcpu;
+	int i, me;
 	bool called;
 
 	me = get_cpu();
 
-	kvm_for_each_vcpu(i, vcpu, kvm) {
-		if ((vcpu_bitmap && !test_bit(i, vcpu_bitmap)) ||
-		    vcpu == except)
-			continue;
-
-		kvm_make_request(req, vcpu);
-
-		if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
+	for_each_set_bit(i, vcpu_bitmap, KVM_MAX_VCPUS) {
+		vcpu = kvm_get_vcpu(kvm, i);
+		if (!vcpu || vcpu == except)
 			continue;
-
-		/*
-		 * tmp can be "unavailable" if cpumasks are allocated off stack
-		 * as allocation of the mask is deliberately not fatal and is
-		 * handled by falling back to kicking all online CPUs.
-		 */
-		if (!cpumask_available(tmp))
-			continue;
-
-		/*
-		 * Note, the vCPU could get migrated to a different pCPU at any
-		 * point after kvm_request_needs_ipi(), which could result in
-		 * sending an IPI to the previous pCPU.  But, that's ok because
-		 * the purpose of the IPI is to ensure the vCPU returns to
-		 * OUTSIDE_GUEST_MODE, which is satisfied if the vCPU migrates.
-		 * Entering READING_SHADOW_PAGE_TABLES after this point is also
-		 * ok, as the requirement is only that KVM wait for vCPUs that
-		 * were reading SPTEs _before_ any changes were finalized.  See
-		 * kvm_vcpu_kick() for more details on handling requests.
-		 */
-		if (kvm_request_needs_ipi(vcpu, req)) {
-			cpu = READ_ONCE(vcpu->cpu);
-			if (cpu != -1 && cpu != me)
-				__cpumask_set_cpu(cpu, tmp);
-		}
+		kvm_make_vcpu_request(kvm, vcpu, req, tmp, me);
 	}
 
 	called = kvm_kick_many_cpus(tmp, !!(req & KVM_REQUEST_WAIT));
@@ -316,12 +323,23 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
 bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
 				      struct kvm_vcpu *except)
 {
+	struct kvm_vcpu *vcpu;
 	cpumask_var_t cpus;
 	bool called;
+	int i, me;
 
 	zalloc_cpumask_var(&cpus, GFP_ATOMIC);
 
-	called = kvm_make_vcpus_request_mask(kvm, req, except, NULL, cpus);
+	me = get_cpu();
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		if (vcpu == except)
+			continue;
+		kvm_make_vcpu_request(kvm, vcpu, req, cpus, me);
+	}
+
+	called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT));
+	put_cpu();
 
 	free_cpumask_var(cpus);
 	return called;
-- 
2.31.1


* [PATCH v4 5/8] KVM: Drop 'except' parameter from kvm_make_vcpus_request_mask()
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

Both remaining callers of kvm_make_vcpus_request_mask() pass 'NULL' for the
'except' parameter, so it can just be dropped.

No functional change intended.

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/kvm/hyperv.c    | 2 +-
 arch/x86/kvm/x86.c       | 2 +-
 include/linux/kvm_host.h | 1 -
 virt/kvm/kvm_main.c      | 3 +--
 4 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 783a7f2441bd..5704bfe53ee0 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1850,7 +1850,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 						    vp_bitmap, vcpu_bitmap);
 
 		kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
-					    NULL, vcpu_mask, &hv_vcpu->tlb_flush);
+					    vcpu_mask, &hv_vcpu->tlb_flush);
 	}
 
 ret_success:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 86539c1686fa..a4752dcc2a75 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9229,7 +9229,7 @@ void kvm_make_scan_ioapic_request_mask(struct kvm *kvm,
 	zalloc_cpumask_var(&cpus, GFP_ATOMIC);
 
 	kvm_make_vcpus_request_mask(kvm, KVM_REQ_SCAN_IOAPIC,
-				    NULL, vcpu_bitmap, cpus);
+				    vcpu_bitmap, cpus);
 
 	free_cpumask_var(cpus);
 }
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e4d712e9f760..2f149ed140f7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -160,7 +160,6 @@ static inline bool is_error_page(struct page *page)
 #define KVM_ARCH_REQ(nr)           KVM_ARCH_REQ_FLAGS(nr, 0)
 
 bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
-				 struct kvm_vcpu *except,
 				 unsigned long *vcpu_bitmap, cpumask_var_t tmp);
 bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);
 bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e32ba210025f..2e9927c4eb32 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -298,7 +298,6 @@ static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
 }
 
 bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
-				 struct kvm_vcpu *except,
 				 unsigned long *vcpu_bitmap, cpumask_var_t tmp)
 {
 	struct kvm_vcpu *vcpu;
@@ -309,7 +308,7 @@ bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
 
 	for_each_set_bit(i, vcpu_bitmap, KVM_MAX_VCPUS) {
 		vcpu = kvm_get_vcpu(kvm, i);
-		if (!vcpu || vcpu == except)
+		if (!vcpu)
 			continue;
 		kvm_make_vcpu_request(kvm, vcpu, req, tmp, me);
 	}
-- 
2.31.1


* [PATCH v4 6/8] KVM: x86: Fix stack-out-of-bounds memory access from ioapic_write_indirect()
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

KASAN reports the following issue:

 BUG: KASAN: stack-out-of-bounds in kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
 Read of size 8 at addr ffffc9001364f638 by task qemu-kvm/4798

 CPU: 0 PID: 4798 Comm: qemu-kvm Tainted: G               X --------- ---
 Hardware name: AMD Corporation DAYTONA_X/DAYTONA_X, BIOS RYM0081C 07/13/2020
 Call Trace:
  dump_stack+0xa5/0xe6
  print_address_description.constprop.0+0x18/0x130
  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
  __kasan_report.cold+0x7f/0x114
  ? kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
  kasan_report+0x38/0x50
  kasan_check_range+0xf5/0x1d0
  kvm_make_vcpus_request_mask+0x174/0x440 [kvm]
  kvm_make_scan_ioapic_request_mask+0x84/0xc0 [kvm]
  ? kvm_arch_exit+0x110/0x110 [kvm]
  ? sched_clock+0x5/0x10
  ioapic_write_indirect+0x59f/0x9e0 [kvm]
  ? static_obj+0xc0/0xc0
  ? __lock_acquired+0x1d2/0x8c0
  ? kvm_ioapic_eoi_inject_work+0x120/0x120 [kvm]

The problem appears to be that 'vcpu_bitmap' is allocated as a single long
on the stack while it should really be KVM_MAX_VCPUS bits long. We also seem
to clear the lower 16 bits of it with bitmap_zero() for no particular reason
(my guess would be that the 'bitmap' and 'vcpu_bitmap' variables in
kvm_bitmap_or_dest_vcpus() caused the confusion: while the former is indeed
16 bits long, the latter should accommodate all possible vCPUs).
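
For context, the before/after declarations side by side (a sketch;
DECLARE_BITMAP() is the standard helper from include/linux/types.h):

	unsigned long vcpu_bitmap;                  /* before: only BITS_PER_LONG bits */
	DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS); /* after: expands to unsigned long
	                                             * vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)]
	                                             */

With KVM_MAX_VCPUS larger than BITS_PER_LONG, setting or testing a bit for
a vCPU id above BITS_PER_LONG-1 (63 on 64-bit) in the single-long version
lands outside the variable, which is exactly the stack-out-of-bounds
access KASAN flagged.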

Fixes: 7ee30bc132c6 ("KVM: x86: deliver KVM IOAPIC scan request to target vCPUs")
Fixes: 9a2ae9f6b6bb ("KVM: x86: Zero the IOAPIC scan request dest vCPUs bitmap")
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/ioapic.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index ff005fe738a4..8c065da73f8e 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -319,8 +319,8 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
 	unsigned index;
 	bool mask_before, mask_after;
 	union kvm_ioapic_redirect_entry *e;
-	unsigned long vcpu_bitmap;
 	int old_remote_irr, old_delivery_status, old_dest_id, old_dest_mode;
+	DECLARE_BITMAP(vcpu_bitmap, KVM_MAX_VCPUS);
 
 	switch (ioapic->ioregsel) {
 	case IOAPIC_REG_VERSION:
@@ -384,9 +384,9 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
 			irq.shorthand = APIC_DEST_NOSHORT;
 			irq.dest_id = e->fields.dest_id;
 			irq.msi_redir_hint = false;
-			bitmap_zero(&vcpu_bitmap, 16);
+			bitmap_zero(vcpu_bitmap, KVM_MAX_VCPUS);
 			kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq,
-						 &vcpu_bitmap);
+						 vcpu_bitmap);
 			if (old_dest_mode != e->fields.dest_mode ||
 			    old_dest_id != e->fields.dest_id) {
 				/*
@@ -399,10 +399,10 @@ static void ioapic_write_indirect(struct kvm_ioapic *ioapic, u32 val)
 				    kvm_lapic_irq_dest_mode(
 					!!e->fields.dest_mode);
 				kvm_bitmap_or_dest_vcpus(ioapic->kvm, &irq,
-							 &vcpu_bitmap);
+							 vcpu_bitmap);
 			}
 			kvm_make_scan_ioapic_request_mask(ioapic->kvm,
-							  &vcpu_bitmap);
+							  vcpu_bitmap);
 		} else {
 			kvm_make_scan_ioapic_request(ioapic->kvm);
 		}
-- 
2.31.1


* [PATCH v4 7/8] KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

Allocating a cpumask dynamically with zalloc_cpumask_var() is not ideal.
The allocation is somewhat slow and can (in theory, with
CONFIG_CPUMASK_OFFSTACK=y) fail. kvm_make_all_cpus_request_except() already
disables preemption, so we can use pre-allocated per-CPU cpumasks instead.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 virt/kvm/kvm_main.c | 29 +++++++++++++++++++++++------
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2e9927c4eb32..2f5fe4f54a51 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -155,6 +155,8 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm);
 static unsigned long long kvm_createvm_count;
 static unsigned long long kvm_active_vms;
 
+static DEFINE_PER_CPU(cpumask_var_t, cpu_kick_mask);
+
 __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
 						   unsigned long start, unsigned long end)
 {
@@ -323,14 +325,15 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
 				      struct kvm_vcpu *except)
 {
 	struct kvm_vcpu *vcpu;
-	cpumask_var_t cpus;
+	struct cpumask *cpus;
 	bool called;
 	int i, me;
 
-	zalloc_cpumask_var(&cpus, GFP_ATOMIC);
-
 	me = get_cpu();
 
+	cpus = this_cpu_cpumask_var_ptr(cpu_kick_mask);
+	cpumask_clear(cpus);
+
 	kvm_for_each_vcpu(i, vcpu, kvm) {
 		if (vcpu == except)
 			continue;
@@ -340,7 +343,6 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
 	called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT));
 	put_cpu();
 
-	free_cpumask_var(cpus);
 	return called;
 }
 
@@ -5581,9 +5583,15 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 		goto out_free_3;
 	}
 
+	for_each_possible_cpu(cpu) {
+		if (!alloc_cpumask_var_node(&per_cpu(cpu_kick_mask, cpu),
+					    GFP_KERNEL, cpu_to_node(cpu)))
+			goto out_free_4;
+	}
+
 	r = kvm_async_pf_init();
 	if (r)
-		goto out_free;
+		goto out_free_5;
 
 	kvm_chardev_ops.owner = module;
 	kvm_vm_fops.owner = module;
@@ -5609,7 +5617,11 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
 
 out_unreg:
 	kvm_async_pf_deinit();
-out_free:
+out_free_5:
+	for_each_possible_cpu(cpu) {
+		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
+	}
+out_free_4:
 	kmem_cache_destroy(kvm_vcpu_cache);
 out_free_3:
 	unregister_reboot_notifier(&kvm_reboot_notifier);
@@ -5629,8 +5641,13 @@ EXPORT_SYMBOL_GPL(kvm_init);
 
 void kvm_exit(void)
 {
+	int cpu;
+
 	debugfs_remove_recursive(kvm_debugfs_dir);
 	misc_deregister(&kvm_dev);
+	for_each_possible_cpu(cpu) {
+		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
+	}
 	kmem_cache_destroy(kvm_vcpu_cache);
 	kvm_async_pf_deinit();
 	unregister_syscore_ops(&kvm_syscore_ops);
-- 
2.31.1


* [PATCH v4 8/8] KVM: Make kvm_make_vcpus_request_mask() use pre-allocated cpu_kick_mask
From: Vitaly Kuznetsov @ 2021-08-27  9:25 UTC
  To: kvm, Paolo Bonzini
  Cc: Sean Christopherson, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

kvm_make_vcpus_request_mask() already disables preemption, so just like
kvm_make_all_cpus_request_except() it can be switched to using
pre-allocated per-CPU cpumasks. This allows for improvements for both
users of the function: in the Hyper-V emulation code, 'tlb_flush' can now
be dropped from 'struct kvm_vcpu_hv', and kvm_make_scan_ioapic_request_mask()
gets rid of dynamic allocation.

The cpumask_available() check in kvm_make_vcpu_request() can now be dropped
as it checks for an impossible condition: kvm_init() makes sure the per-CPU
masks are allocated.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 -
 arch/x86/kvm/hyperv.c           |  5 +----
 arch/x86/kvm/x86.c              |  8 +-------
 include/linux/kvm_host.h        |  2 +-
 virt/kvm/kvm_main.c             | 18 +++++++-----------
 5 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 09b256db394a..846552fa2012 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -569,7 +569,6 @@ struct kvm_vcpu_hv {
 	struct kvm_hyperv_exit exit;
 	struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
 	DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
-	cpumask_t tlb_flush;
 	bool enforce_cpuid;
 	struct {
 		u32 features_eax; /* HYPERV_CPUID_FEATURES.EAX */
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 5704bfe53ee0..f76e7228f687 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1755,7 +1755,6 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 	int i;
 	gpa_t gpa;
 	struct kvm *kvm = vcpu->kvm;
-	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
 	struct hv_tlb_flush_ex flush_ex;
 	struct hv_tlb_flush flush;
 	u64 vp_bitmap[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
@@ -1837,8 +1836,6 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 		}
 	}
 
-	cpumask_clear(&hv_vcpu->tlb_flush);
-
 	/*
 	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
 	 * analyze it here, flush TLB regardless of the specified address space.
@@ -1850,7 +1847,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
 						    vp_bitmap, vcpu_bitmap);
 
 		kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
-					    vcpu_mask, &hv_vcpu->tlb_flush);
+					    vcpu_mask);
 	}
 
 ret_success:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a4752dcc2a75..91c1e6c98b0f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9224,14 +9224,8 @@ static void process_smi(struct kvm_vcpu *vcpu)
 void kvm_make_scan_ioapic_request_mask(struct kvm *kvm,
 				       unsigned long *vcpu_bitmap)
 {
-	cpumask_var_t cpus;
-
-	zalloc_cpumask_var(&cpus, GFP_ATOMIC);
-
 	kvm_make_vcpus_request_mask(kvm, KVM_REQ_SCAN_IOAPIC,
-				    vcpu_bitmap, cpus);
-
-	free_cpumask_var(cpus);
+				    vcpu_bitmap);
 }
 
 void kvm_make_scan_ioapic_request(struct kvm *kvm)
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 2f149ed140f7..1ee85de0bf74 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -160,7 +160,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_ARCH_REQ(nr)           KVM_ARCH_REQ_FLAGS(nr, 0)
 
 bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
-				 unsigned long *vcpu_bitmap, cpumask_var_t tmp);
+				 unsigned long *vcpu_bitmap);
 bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);
 bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
 				      struct kvm_vcpu *except);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2f5fe4f54a51..dc52a04f0586 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -274,14 +274,6 @@ static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
 	if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
 		return;
 
-	/*
-	 * tmp can be "unavailable" if cpumasks are allocated off stack as
-	 * allocation of the mask is deliberately not fatal and is handled by
-	 * falling back to kicking all online CPUs.
-	 */
-	if (!cpumask_available(tmp))
-		return;
-
 	/*
 	 * Note, the vCPU could get migrated to a different pCPU at any point
 	 * after kvm_request_needs_ipi(), which could result in sending an IPI
@@ -300,22 +292,26 @@ static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
 }
 
 bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
-				 unsigned long *vcpu_bitmap, cpumask_var_t tmp)
+				 unsigned long *vcpu_bitmap)
 {
 	struct kvm_vcpu *vcpu;
+	struct cpumask *cpus;
 	int i, me;
 	bool called;
 
 	me = get_cpu();
 
+	cpus = this_cpu_cpumask_var_ptr(cpu_kick_mask);
+	cpumask_clear(cpus);
+
 	for_each_set_bit(i, vcpu_bitmap, KVM_MAX_VCPUS) {
 		vcpu = kvm_get_vcpu(kvm, i);
 		if (!vcpu)
 			continue;
-		kvm_make_vcpu_request(kvm, vcpu, req, tmp, me);
+		kvm_make_vcpu_request(kvm, vcpu, req, cpus, me);
 	}
 
-	called = kvm_kick_many_cpus(tmp, !!(req & KVM_REQUEST_WAIT));
+	called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT));
 	put_cpu();
 
 	return called;
-- 
2.31.1


* Re: [PATCH v4 3/8] KVM: x86: hyper-v: Avoid calling kvm_make_vcpus_request_mask() with vcpu_mask==NULL
From: Sean Christopherson @ 2021-09-02 20:57 UTC
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

On Fri, Aug 27, 2021, Vitaly Kuznetsov wrote:
> In preparation for making kvm_make_vcpus_request_mask() use for_each_set_bit(),
> switch kvm_hv_flush_tlb() to calling kvm_make_all_cpus_request() for the
> 'all cpus' case.
> 
> Note: kvm_make_all_cpus_request() (unlike kvm_make_vcpus_request_mask())
> currently allocates a cpumask dynamically on each call, which is suboptimal.
> Both kvm_make_all_cpus_request() and kvm_make_vcpus_request_mask() are
> going to be switched to using pre-allocated per-CPU masks.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com>

* Re: [PATCH v4 4/8] KVM: Optimize kvm_make_vcpus_request_mask() a bit
From: Sean Christopherson @ 2021-09-02 21:00 UTC
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

On Fri, Aug 27, 2021, Vitaly Kuznetsov wrote:
> Iterating over set bits in 'vcpu_bitmap' should be faster than going
> through all vCPUs, especially when just a few bits are set.
> 
> Drop the kvm_make_vcpus_request_mask() call from kvm_make_all_cpus_request_except()
> to avoid handling the special case when 'vcpu_bitmap' is NULL; move the
> code to kvm_make_all_cpus_request_except() itself.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  virt/kvm/kvm_main.c | 88 +++++++++++++++++++++++++++------------------
>  1 file changed, 53 insertions(+), 35 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2082aceffbf6..e32ba210025f 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -261,50 +261,57 @@ static inline bool kvm_kick_many_cpus(cpumask_var_t tmp, bool wait)
>  	return true;
>  }
>  
> +static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
> +				  unsigned int req, cpumask_var_t tmp,
> +				  int current_cpu)
> +{
> +	int cpu = vcpu->cpu;

'cpu' doesn't need to be initialized here.  Leaving it uninitialized will also
deter consumption before the READ_ONCE below.

> +
> +	kvm_make_request(req, vcpu);
> +
> +	if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
> +		return;
> +
> +	/*
> +	 * tmp can be "unavailable" if cpumasks are allocated off stack as
> +	 * allocation of the mask is deliberately not fatal and is handled by
> +	 * falling back to kicking all online CPUs.
> +	 */
> +	if (!cpumask_available(tmp))
> +		return;
> +
> +	/*
> +	 * Note, the vCPU could get migrated to a different pCPU at any point
> +	 * after kvm_request_needs_ipi(), which could result in sending an IPI
> +	 * to the previous pCPU.  But, that's OK because the purpose of the IPI
> +	 * is to ensure the vCPU returns to OUTSIDE_GUEST_MODE, which is
> +	 * satisfied if the vCPU migrates. Entering READING_SHADOW_PAGE_TABLES
> +	 * after this point is also OK, as the requirement is only that KVM wait
> +	 * for vCPUs that were reading SPTEs _before_ any changes were
> +	 * finalized. See kvm_vcpu_kick() for more details on handling requests.
> +	 */
> +	if (kvm_request_needs_ipi(vcpu, req)) {
> +		cpu = READ_ONCE(vcpu->cpu);
> +		if (cpu != -1 && cpu != current_cpu)
> +			__cpumask_set_cpu(cpu, tmp);
> +	}
> +}

* Re: [PATCH v4 5/8] KVM: Drop 'except' parameter from kvm_make_vcpus_request_mask()
From: Sean Christopherson @ 2021-09-02 21:00 UTC
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

On Fri, Aug 27, 2021, Vitaly Kuznetsov wrote:
> Both remaining callers of kvm_make_vcpus_request_mask() pass 'NULL' for the
> 'except' parameter, so it can just be dropped.
> 
> No functional change intended.

Trademark infringement.

> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <seanjc@google.com> 

* Re: [PATCH v4 7/8] KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()
From: Sean Christopherson @ 2021-09-02 21:08 UTC
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

On Fri, Aug 27, 2021, Vitaly Kuznetsov wrote:
> Allocating a cpumask dynamically with zalloc_cpumask_var() is not ideal.
> The allocation is somewhat slow and can (in theory, with
> CONFIG_CPUMASK_OFFSTACK=y) fail. kvm_make_all_cpus_request_except() already
> disables preemption, so we can use pre-allocated per-CPU cpumasks instead.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  virt/kvm/kvm_main.c | 29 +++++++++++++++++++++++------
>  1 file changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2e9927c4eb32..2f5fe4f54a51 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -155,6 +155,8 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm);
>  static unsigned long long kvm_createvm_count;
>  static unsigned long long kvm_active_vms;
>  
> +static DEFINE_PER_CPU(cpumask_var_t, cpu_kick_mask);
> +
>  __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
>  						   unsigned long start, unsigned long end)
>  {
> @@ -323,14 +325,15 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
>  				      struct kvm_vcpu *except)
>  {
>  	struct kvm_vcpu *vcpu;
> -	cpumask_var_t cpus;
> +	struct cpumask *cpus;
>  	bool called;
>  	int i, me;
>  
> -	zalloc_cpumask_var(&cpus, GFP_ATOMIC);
> -
>  	me = get_cpu();
>  
> +	cpus = this_cpu_cpumask_var_ptr(cpu_kick_mask);
> +	cpumask_clear(cpus);
> +
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		if (vcpu == except)
>  			continue;
> @@ -340,7 +343,6 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
>  	called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT));
>  	put_cpu();
>  
> -	free_cpumask_var(cpus);
>  	return called;
>  }
>  
> @@ -5581,9 +5583,15 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  		goto out_free_3;
>  	}
>  
> +	for_each_possible_cpu(cpu) {
> +		if (!alloc_cpumask_var_node(&per_cpu(cpu_kick_mask, cpu),
> +					    GFP_KERNEL, cpu_to_node(cpu)))
> +			goto out_free_4;

'r' needs to be explicitly set to -EFAULT, e.g. in the current code it's
guaranteed to be 0 here.

> +	}
> +
>  	r = kvm_async_pf_init();
>  	if (r)
> -		goto out_free;
> +		goto out_free_5;
>  
>  	kvm_chardev_ops.owner = module;
>  	kvm_vm_fops.owner = module;
> @@ -5609,7 +5617,11 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>  
>  out_unreg:
>  	kvm_async_pf_deinit();
> -out_free:
> +out_free_5:
> +	for_each_possible_cpu(cpu) {

Unnecessary braces.

> +		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
> +	}
> +out_free_4:
>  	kmem_cache_destroy(kvm_vcpu_cache);
>  out_free_3:
>  	unregister_reboot_notifier(&kvm_reboot_notifier);
> @@ -5629,8 +5641,13 @@ EXPORT_SYMBOL_GPL(kvm_init);
>  
>  void kvm_exit(void)
>  {
> +	int cpu;
> +
>  	debugfs_remove_recursive(kvm_debugfs_dir);
>  	misc_deregister(&kvm_dev);
> +	for_each_possible_cpu(cpu) {

Same here.

> +		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
> +	}
>  	kmem_cache_destroy(kvm_vcpu_cache);
>  	kvm_async_pf_deinit();
>  	unregister_syscore_ops(&kvm_syscore_ops);
> -- 
> 2.31.1
> 

* Re: [PATCH v4 8/8] KVM: Make kvm_make_vcpus_request_mask() use pre-allocated cpu_kick_mask
From: Sean Christopherson @ 2021-09-02 21:19 UTC
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

On Fri, Aug 27, 2021, Vitaly Kuznetsov wrote:
> kvm_make_vcpus_request_mask() already disables preemption, so just like
> kvm_make_all_cpus_request_except() it can be switched to using
> pre-allocated per-CPU cpumasks. This allows for improvements for both
> users of the function: in the Hyper-V emulation code, 'tlb_flush' can now
> be dropped from 'struct kvm_vcpu_hv', and kvm_make_scan_ioapic_request_mask()
> gets rid of dynamic allocation.
> 
> The cpumask_available() check in kvm_make_vcpu_request() can now be dropped
> as it checks for an impossible condition: kvm_init() makes sure the per-CPU
> masks are allocated.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 -
>  arch/x86/kvm/hyperv.c           |  5 +----
>  arch/x86/kvm/x86.c              |  8 +-------
>  include/linux/kvm_host.h        |  2 +-
>  virt/kvm/kvm_main.c             | 18 +++++++-----------
>  5 files changed, 10 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 09b256db394a..846552fa2012 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -569,7 +569,6 @@ struct kvm_vcpu_hv {
>  	struct kvm_hyperv_exit exit;
>  	struct kvm_vcpu_hv_stimer stimer[HV_SYNIC_STIMER_COUNT];
>  	DECLARE_BITMAP(stimer_pending_bitmap, HV_SYNIC_STIMER_COUNT);
> -	cpumask_t tlb_flush;
>  	bool enforce_cpuid;
>  	struct {
>  		u32 features_eax; /* HYPERV_CPUID_FEATURES.EAX */
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index 5704bfe53ee0..f76e7228f687 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -1755,7 +1755,6 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
>  	int i;
>  	gpa_t gpa;
>  	struct kvm *kvm = vcpu->kvm;
> -	struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>  	struct hv_tlb_flush_ex flush_ex;
>  	struct hv_tlb_flush flush;
>  	u64 vp_bitmap[KVM_HV_MAX_SPARSE_VCPU_SET_BITS];
> @@ -1837,8 +1836,6 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
>  		}
>  	}
>  
> -	cpumask_clear(&hv_vcpu->tlb_flush);
> -
>  	/*
>  	 * vcpu->arch.cr3 may not be up-to-date for running vCPUs so we can't
>  	 * analyze it here, flush TLB regardless of the specified address space.
> @@ -1850,7 +1847,7 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *vcpu, struct kvm_hv_hcall *hc, bool
>  						    vp_bitmap, vcpu_bitmap);
>  
>  		kvm_make_vcpus_request_mask(kvm, KVM_REQ_TLB_FLUSH_GUEST,
> -					    vcpu_mask, &hv_vcpu->tlb_flush);
> +					    vcpu_mask);
>  	}
>  
>  ret_success:
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a4752dcc2a75..91c1e6c98b0f 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9224,14 +9224,8 @@ static void process_smi(struct kvm_vcpu *vcpu)
>  void kvm_make_scan_ioapic_request_mask(struct kvm *kvm,
>  				       unsigned long *vcpu_bitmap)
>  {
> -	cpumask_var_t cpus;
> -
> -	zalloc_cpumask_var(&cpus, GFP_ATOMIC);
> -
>  	kvm_make_vcpus_request_mask(kvm, KVM_REQ_SCAN_IOAPIC,
> -				    vcpu_bitmap, cpus);
> -
> -	free_cpumask_var(cpus);
> +				    vcpu_bitmap);

This can opportunistically all go on a single line.

>  }
>  
>  void kvm_make_scan_ioapic_request(struct kvm *kvm)
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 2f149ed140f7..1ee85de0bf74 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -160,7 +160,7 @@ static inline bool is_error_page(struct page *page)
>  #define KVM_ARCH_REQ(nr)           KVM_ARCH_REQ_FLAGS(nr, 0)
>  
>  bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
> -				 unsigned long *vcpu_bitmap, cpumask_var_t tmp);
> +				 unsigned long *vcpu_bitmap);
>  bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);
>  bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
>  				      struct kvm_vcpu *except);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2f5fe4f54a51..dc52a04f0586 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -274,14 +274,6 @@ static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
>  	if (!(req & KVM_REQUEST_NO_WAKEUP) && kvm_vcpu_wake_up(vcpu))
>  		return;
>  
> -	/*
> -	 * tmp can be "unavailable" if cpumasks are allocated off stack as
> -	 * allocation of the mask is deliberately not fatal and is handled by
> -	 * falling back to kicking all online CPUs.
> -	 */
> -	if (!cpumask_available(tmp))
> -		return;

Hmm, maybe convert the param to an explicit "struct cpumask *" to try and convey
that cpumask_available() doesn't need to be checked?

And I believe you can also do:

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index dc52a04f0586..bfd2ecbd97a8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -247,15 +247,8 @@ static void ack_flush(void *_completed)
 {
 }
 
-static inline bool kvm_kick_many_cpus(cpumask_var_t tmp, bool wait)
+static inline bool kvm_kick_many_cpus(struct cpumask *cpus, bool wait)
 {
-       const struct cpumask *cpus;
-
-       if (likely(cpumask_available(tmp)))
-               cpus = tmp;
-       else
-               cpus = cpu_online_mask;
-
        if (cpumask_empty(cpus))
                return false;
 
> -
>  	/*
>  	 * Note, the vCPU could get migrated to a different pCPU at any point
>  	 * after kvm_request_needs_ipi(), which could result in sending an IPI
> @@ -300,22 +292,26 @@ static void kvm_make_vcpu_request(struct kvm *kvm, struct kvm_vcpu *vcpu,
>  }
>  
>  bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
> -				 unsigned long *vcpu_bitmap, cpumask_var_t tmp)
> +				 unsigned long *vcpu_bitmap)
>  {
>  	struct kvm_vcpu *vcpu;
> +	struct cpumask *cpus;
>  	int i, me;
>  	bool called;
>  
>  	me = get_cpu();
>  
> +	cpus = this_cpu_cpumask_var_ptr(cpu_kick_mask);
> +	cpumask_clear(cpus);
> +
>  	for_each_set_bit(i, vcpu_bitmap, KVM_MAX_VCPUS) {
>  		vcpu = kvm_get_vcpu(kvm, i);
>  		if (!vcpu)
>  			continue;
> -		kvm_make_vcpu_request(kvm, vcpu, req, tmp, me);
> +		kvm_make_vcpu_request(kvm, vcpu, req, cpus, me);
>  	}
>  
> -	called = kvm_kick_many_cpus(tmp, !!(req & KVM_REQUEST_WAIT));
> +	called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT));
>  	put_cpu();
>  
>  	return called;
> -- 
> 2.31.1
> 

* Re: [PATCH v4 7/8] KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()
From: Vitaly Kuznetsov @ 2021-09-03  7:20 UTC
  To: Sean Christopherson
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

Sean Christopherson <seanjc@google.com> writes:

> On Fri, Aug 27, 2021, Vitaly Kuznetsov wrote:
>> Allocating a cpumask dynamically with zalloc_cpumask_var() is not ideal.
>> The allocation is somewhat slow and can (in theory, with
>> CONFIG_CPUMASK_OFFSTACK=y) fail. kvm_make_all_cpus_request_except() already
>> disables preemption, so we can use pre-allocated per-CPU cpumasks instead.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  virt/kvm/kvm_main.c | 29 +++++++++++++++++++++++------
>>  1 file changed, 23 insertions(+), 6 deletions(-)
>> 
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 2e9927c4eb32..2f5fe4f54a51 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -155,6 +155,8 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm);
>>  static unsigned long long kvm_createvm_count;
>>  static unsigned long long kvm_active_vms;
>>  
>> +static DEFINE_PER_CPU(cpumask_var_t, cpu_kick_mask);
>> +
>>  __weak void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
>>  						   unsigned long start, unsigned long end)
>>  {
>> @@ -323,14 +325,15 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
>>  				      struct kvm_vcpu *except)
>>  {
>>  	struct kvm_vcpu *vcpu;
>> -	cpumask_var_t cpus;
>> +	struct cpumask *cpus;
>>  	bool called;
>>  	int i, me;
>>  
>> -	zalloc_cpumask_var(&cpus, GFP_ATOMIC);
>> -
>>  	me = get_cpu();
>>  
>> +	cpus = this_cpu_cpumask_var_ptr(cpu_kick_mask);
>> +	cpumask_clear(cpus);
>> +
>>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>>  		if (vcpu == except)
>>  			continue;
>> @@ -340,7 +343,6 @@ bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
>>  	called = kvm_kick_many_cpus(cpus, !!(req & KVM_REQUEST_WAIT));
>>  	put_cpu();
>>  
>> -	free_cpumask_var(cpus);
>>  	return called;
>>  }
>>  
>> @@ -5581,9 +5583,15 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>>  		goto out_free_3;
>>  	}
>>  
>> +	for_each_possible_cpu(cpu) {
>> +		if (!alloc_cpumask_var_node(&per_cpu(cpu_kick_mask, cpu),
>> +					    GFP_KERNEL, cpu_to_node(cpu)))
>> +			goto out_free_4;
>
> 'r' needs to be explicitly set to -EFAULT, e.g. in the current code it's
> guaranteed to be 0 here.

Oops, yes. Any particular reason to avoid -ENOMEM? (hope not, will use
this in v5)

>
>> +	}
>> +
>>  	r = kvm_async_pf_init();
>>  	if (r)
>> -		goto out_free;
>> +		goto out_free_5;
>>  
>>  	kvm_chardev_ops.owner = module;
>>  	kvm_vm_fops.owner = module;
>> @@ -5609,7 +5617,11 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
>>  
>>  out_unreg:
>>  	kvm_async_pf_deinit();
>> -out_free:
>> +out_free_5:
>> +	for_each_possible_cpu(cpu) {
>
> Unnecessary braces.
>
>> +		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
>> +	}
>> +out_free_4:
>>  	kmem_cache_destroy(kvm_vcpu_cache);
>>  out_free_3:
>>  	unregister_reboot_notifier(&kvm_reboot_notifier);
>> @@ -5629,8 +5641,13 @@ EXPORT_SYMBOL_GPL(kvm_init);
>>  
>>  void kvm_exit(void)
>>  {
>> +	int cpu;
>> +
>>  	debugfs_remove_recursive(kvm_debugfs_dir);
>>  	misc_deregister(&kvm_dev);
>> +	for_each_possible_cpu(cpu) {
>
> Same here.
>
>> +		free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
>> +	}
>>  	kmem_cache_destroy(kvm_vcpu_cache);
>>  	kvm_async_pf_deinit();
>>  	unregister_syscore_ops(&kvm_syscore_ops);
>> -- 
>> 2.31.1
>> 
>

-- 
Vitaly


* Re: [PATCH v4 7/8] KVM: Pre-allocate cpumasks for kvm_make_all_cpus_request_except()
From: Sean Christopherson @ 2021-09-03 14:54 UTC
  To: Vitaly Kuznetsov
  Cc: kvm, Paolo Bonzini, Wanpeng Li, Jim Mattson,
	Dr. David Alan Gilbert, Nitesh Narayan Lal, Lai Jiangshan,
	Maxim Levitsky, Eduardo Habkost, linux-kernel

On Fri, Sep 03, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <seanjc@google.com> writes:
> >> @@ -5581,9 +5583,15 @@ int kvm_init(void *opaque, unsigned vcpu_size, unsigned vcpu_align,
> >>  		goto out_free_3;
> >>  	}
> >>  
> >> +	for_each_possible_cpu(cpu) {
> >> +		if (!alloc_cpumask_var_node(&per_cpu(cpu_kick_mask, cpu),
> >> +					    GFP_KERNEL, cpu_to_node(cpu)))
> >> +			goto out_free_4;
> >
> > 'r' needs to be explicitly set to -EFAULT, e.g. in the current code it's
> > guaranteed to be 0 here.
> 
> Oops, yes. Any particular reason to avoid -ENOMEM? (hope not, will use
> this in v5)

Huh.  Yes, -ENOMEM.  I have no idea why I typed -EFAULT.
