linux-mips.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat
@ 2021-09-25  0:55 Sean Christopherson
  2021-09-25  0:55 ` [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU Sean Christopherson
                   ` (14 more replies)
  0 siblings, 15 replies; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

The main purpose of this series is differentiate between "halt" and a more
generic "block", where "halt" aligns with x86's HLT instruction, the
halt-polling mechanisms, and associated stats, and "block" means any guest
action that causes the vCPU to block/wait.

This series arose out of a discussion over adding a stat to track if a
vCPU is blocked/halted[*].  The TL;DR of the discussion is that x86 has
several non-halt "wait" states, and arguably those states should not
participate in halt-polling.  In practice, it really doesn't matter from
a functionality perspective because there are typically so few occurences
of the non-halt waits that they're in the noise compared to the number of
actual HLTs, especially for a long-running VM.  So, my justification for
the rename is that because it doesn't truly affect functionality, KVM
might as well be technically correct and only use halt-polling for HLT.

The other annoyance this series addresses is that KVM mixes "halt" and
"block", e.g. the existing function is kvm_vcpu_block(), but all the stats
and the tracepoint use "halt".  Ideally, KVM would probably avoid "block"
altogether as people often think of "blocked" as meaning the vCPU is
blocked due to _host_ activity.  But I don't have a better alternative,
e.g. "halt" is obviously taken, "wait" is equivalent to "halt" on arm64,
"stop" has specific meaning on s390, etc...  I tried to address the host
vs. guest issue by naming the new stat "blocking" instead of "blocked",
e.g. to convey that the vCPU is "actively blocking" instead of "being
blocked".

Patch 01 fixes a theoretical, benign s390 bug, and sets the stage for
additional cleanups.

Patches 02-04 reconcile discrepancies in when KVM considers halt-polling
to be "successful".  Some stats consider it a success so long as KVM
doesn't schedule() away, others consider it a success if and only if a
wake event is detected in the halt-polling loop.

Patches 05-06 are prep cleanup to split out the core "block" routine.

Patch 07 is more prep, and should also be a small perf optimization for
halt-polling on arm64.

Patch 08 is x86 cleanup to free up the name kvm_vcpu_halt().

Patches 09-10 rename the existing kvm_vcpu_block() to kvm_vcpu_halt(), and
split out the core "block" routine to a new helper.

Patches 11-12 are minor cleanups to avoid unnecessary ktime_get().

Patches 13-14 convert non-HLT x86 flows to use kvm_vcpu_block().

[*] https://lkml.kernel.org/r/20210817230508.142907-1-jingzhangos@google.com

Jing Zhang (1):
  KVM: stats: Add stat to detect if vcpu is currently blocking

Sean Christopherson (13):
  KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU
  KVM: Update halt-polling stats if and only if halt-polling was
    attempted
  KVM: Refactor and document halt-polling stats update helper
  KVM: Reconcile discrepancies in halt-polling stats
  KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch
    hook
  KVM: Drop obsolete kvm_arch_vcpu_block_finish()
  KVM: Don't block+unblock when halt-polling is successful
  KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt()
  KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt()
  KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt()
  KVM: Don't redo ktime_get() when calculating halt-polling
    stop/deadline
  KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs
  KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states

 arch/arm64/include/asm/kvm_host.h   |   1 -
 arch/arm64/kvm/arch_timer.c         |   2 +-
 arch/arm64/kvm/handle_exit.c        |   4 +-
 arch/arm64/kvm/psci.c               |   2 +-
 arch/mips/include/asm/kvm_host.h    |   1 -
 arch/mips/kvm/emulate.c             |   2 +-
 arch/powerpc/include/asm/kvm_host.h |   1 -
 arch/powerpc/kvm/book3s_pr.c        |   2 +-
 arch/powerpc/kvm/book3s_pr_papr.c   |   2 +-
 arch/powerpc/kvm/booke.c            |   2 +-
 arch/powerpc/kvm/powerpc.c          |   2 +-
 arch/s390/include/asm/kvm_host.h    |   2 -
 arch/s390/kvm/interrupt.c           |   3 +-
 arch/s390/kvm/kvm-s390.c            |   7 +-
 arch/x86/include/asm/kvm_host.h     |   4 +-
 arch/x86/kvm/vmx/nested.c           |   2 +-
 arch/x86/kvm/vmx/vmx.c              |   4 +-
 arch/x86/kvm/x86.c                  |  25 ++++--
 include/linux/kvm_host.h            |   6 +-
 include/linux/kvm_types.h           |   1 +
 virt/kvm/kvm_main.c                 | 131 +++++++++++++++++-----------
 21 files changed, 118 insertions(+), 88 deletions(-)

-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-27  6:54   ` Christian Borntraeger
  2021-09-25  0:55 ` [PATCH 02/14] KVM: Update halt-polling stats if and only if halt-polling was attempted Sean Christopherson
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Wrap s390's halt_poll_max_steal with READ_ONCE and snapshot the result of
kvm_arch_no_poll() in kvm_vcpu_block() to avoid a mostly-theoretical,
largely benign bug on s390 where the result of kvm_arch_no_poll() could
change due to userspace modifying halt_poll_max_steal while the vCPU is
blocking.  The bug is largely benign as it will either cause KVM to skip
updating halt-polling times (no_poll toggles false=>true) or to update
halt-polling times with a slightly flawed block_ns.

Note, READ_ONCE is unnecessary in the current code, add it in case the
arch hook is ever inlined, and to provide a hint that userspace can
change the param at will.

Fixes: 8b905d28ee17 ("KVM: s390: provide kvm_arch_no_poll function")
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/s390/kvm/kvm-s390.c | 2 +-
 virt/kvm/kvm_main.c      | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 6a6dd5e1daf6..7cabe6778b1b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3446,7 +3446,7 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
 {
 	/* do not poll with more than halt_poll_max_steal percent of steal time */
 	if (S390_lowcore.avg_steal_timer * 100 / (TICK_USEC << 12) >=
-	    halt_poll_max_steal) {
+	    READ_ONCE(halt_poll_max_steal)) {
 		vcpu->stat.halt_no_poll_steal++;
 		return true;
 	}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 191dac6b1bed..768a4cbb26a6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3213,6 +3213,7 @@ update_halt_poll_stats(struct kvm_vcpu *vcpu, u64 poll_ns, bool waited)
  */
 void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 {
+	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	ktime_t start, cur, poll_end;
 	bool waited = false;
 	u64 block_ns;
@@ -3220,7 +3221,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	kvm_arch_vcpu_blocking(vcpu);
 
 	start = cur = poll_end = ktime_get();
-	if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
+	if (vcpu->halt_poll_ns && halt_poll_allowed) {
 		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
 
 		++vcpu->stat.generic.halt_attempted_poll;
@@ -3275,7 +3276,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	update_halt_poll_stats(
 		vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
 
-	if (!kvm_arch_no_poll(vcpu)) {
+	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
 		} else if (vcpu->kvm->max_halt_poll_ns) {
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 02/14] KVM: Update halt-polling stats if and only if halt-polling was attempted
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
  2021-09-25  0:55 ` [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 18:57   ` David Matlack
  2021-09-25  0:55 ` [PATCH 03/14] KVM: Refactor and document halt-polling stats update helper Sean Christopherson
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Don't update halt-polling stats if halt-polling wasn't attempted.  This
is a nop as @poll_ns is guaranteed to be '0' (poll_end == start), but it
will allow a future patch to move the histogram stats into the helper to
resolve a discrepancy in what is considered a "successful" halt-poll.

No functional change intended.

Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 768a4cbb26a6..8b33f5045b4d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3214,6 +3214,7 @@ update_halt_poll_stats(struct kvm_vcpu *vcpu, u64 poll_ns, bool waited)
 void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
+	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
 	ktime_t start, cur, poll_end;
 	bool waited = false;
 	u64 block_ns;
@@ -3221,7 +3222,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	kvm_arch_vcpu_blocking(vcpu);
 
 	start = cur = poll_end = ktime_get();
-	if (vcpu->halt_poll_ns && halt_poll_allowed) {
+	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
 
 		++vcpu->stat.generic.halt_attempted_poll;
@@ -3273,8 +3274,9 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	kvm_arch_vcpu_unblocking(vcpu);
 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
-	update_halt_poll_stats(
-		vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
+	if (do_halt_poll)
+		update_halt_poll_stats(
+			vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
 
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 03/14] KVM: Refactor and document halt-polling stats update helper
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
  2021-09-25  0:55 ` [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU Sean Christopherson
  2021-09-25  0:55 ` [PATCH 02/14] KVM: Update halt-polling stats if and only if halt-polling was attempted Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 19:01   ` David Matlack
  2021-09-25  0:55 ` [PATCH 04/14] KVM: Reconcile discrepancies in halt-polling stats Sean Christopherson
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Add a comment to document that halt-polling is considered successful even
if the polling loop itself didn't detect a wake event, i.e. if a wake
event was detect in the final kvm_vcpu_check_block().  Invert the param
to the update helper so that the helper is a dumb function that is "told"
whether or not polling was successful, as opposed to having it determinine
success/failure based on blocking behavior.

Opportunistically tweak the params to the update helper to reduce the
line length for the call site so that it fits on a single line, and so
that the prototype conforms to the more traditional kernel style.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b33f5045b4d..12fe91a0a4c8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3199,13 +3199,15 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
-static inline void
-update_halt_poll_stats(struct kvm_vcpu *vcpu, u64 poll_ns, bool waited)
+static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
+					  ktime_t end, bool success)
 {
-	if (waited)
-		vcpu->stat.generic.halt_poll_fail_ns += poll_ns;
-	else
+	u64 poll_ns = ktime_to_ns(ktime_sub(end, start));
+
+	if (success)
 		vcpu->stat.generic.halt_poll_success_ns += poll_ns;
+	else
+		vcpu->stat.generic.halt_poll_fail_ns += poll_ns;
 }
 
 /*
@@ -3274,9 +3276,13 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	kvm_arch_vcpu_unblocking(vcpu);
 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
+	/*
+	 * Note, halt-polling is considered successful so long as the vCPU was
+	 * never actually scheduled out, i.e. even if the wake event arrived
+	 * after of the halt-polling loop itself, but before the full wait.
+	 */
 	if (do_halt_poll)
-		update_halt_poll_stats(
-			vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
+		update_halt_poll_stats(vcpu, start, poll_end, !waited);
 
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 04/14] KVM: Reconcile discrepancies in halt-polling stats
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (2 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 03/14] KVM: Refactor and document halt-polling stats update helper Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 21:26   ` David Matlack
  2021-09-25  0:55 ` [PATCH 05/14] KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch hook Sean Christopherson
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Move the halt-polling "success" and histogram stats update into the
dedicated helper to fix a discrepancy where the success/fail "time" stats
consider polling successful so long as the wait is avoided, but the main
"success" and histogram stats consider polling successful if and only if
a wake event was detected by the halt-polling loop.

Move halt_attempted_poll to the helper as well so that all the stats are
updated in a single location.  While it's a bit odd to update the stat
well after the fact, practically speaking there's no meaningful advantage
to updating before polling.

Note, there is a functional change in addition to the success vs. fail
change.  The histogram updates previously called ktime_get() instead of
using "cur".  But that change is desirable as it means all the stats are
now updated with the same polling time, and avoids the extra ktime_get(),
which isn't expensive but isn't free either.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 35 ++++++++++++++++-------------------
 1 file changed, 16 insertions(+), 19 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 12fe91a0a4c8..2ba22b68af3b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3202,12 +3202,23 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
 					  ktime_t end, bool success)
 {
+	struct kvm_vcpu_stat_generic *stats = &vcpu->stat.generic;
 	u64 poll_ns = ktime_to_ns(ktime_sub(end, start));
 
-	if (success)
-		vcpu->stat.generic.halt_poll_success_ns += poll_ns;
-	else
-		vcpu->stat.generic.halt_poll_fail_ns += poll_ns;
+	++vcpu->stat.generic.halt_attempted_poll;
+
+	if (success) {
+		++vcpu->stat.generic.halt_successful_poll;
+
+		if (!vcpu_valid_wakeup(vcpu))
+			++vcpu->stat.generic.halt_poll_invalid;
+
+		stats->halt_poll_success_ns += poll_ns;
+		KVM_STATS_LOG_HIST_UPDATE(stats->halt_poll_success_hist, poll_ns);
+	} else {
+		stats->halt_poll_fail_ns += poll_ns;
+		KVM_STATS_LOG_HIST_UPDATE(stats->halt_poll_fail_hist, poll_ns);
+	}
 }
 
 /*
@@ -3227,30 +3238,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
 
-		++vcpu->stat.generic.halt_attempted_poll;
 		do {
 			/*
 			 * This sets KVM_REQ_UNHALT if an interrupt
 			 * arrives.
 			 */
-			if (kvm_vcpu_check_block(vcpu) < 0) {
-				++vcpu->stat.generic.halt_successful_poll;
-				if (!vcpu_valid_wakeup(vcpu))
-					++vcpu->stat.generic.halt_poll_invalid;
-
-				KVM_STATS_LOG_HIST_UPDATE(
-				      vcpu->stat.generic.halt_poll_success_hist,
-				      ktime_to_ns(ktime_get()) -
-				      ktime_to_ns(start));
+			if (kvm_vcpu_check_block(vcpu) < 0)
 				goto out;
-			}
 			cpu_relax();
 			poll_end = cur = ktime_get();
 		} while (kvm_vcpu_can_poll(cur, stop));
-
-		KVM_STATS_LOG_HIST_UPDATE(
-				vcpu->stat.generic.halt_poll_fail_hist,
-				ktime_to_ns(ktime_get()) - ktime_to_ns(start));
 	}
 
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 05/14] KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch hook
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (3 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 04/14] KVM: Reconcile discrepancies in halt-polling stats Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-27  6:58   ` Christian Borntraeger
  2021-09-25  0:55 ` [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish() Sean Christopherson
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Move the clearing of valid_wakeup out of kvm_arch_vcpu_block_finish() so
that a future patch can drop said arch hook.  Unlike the other blocking-
related arch hooks (vcpu_blocking/unblocking()), vcpu_block_finish() needs
to be called even if the KVM doesn't actually block the vCPU.  This will
allow future patches to differentiate between truly blocking the vCPU and
emulating a halt condition without introducing a contradiction.

Alternatively, the hook could be renamed to kvm_arch_vcpu_halt_finish(),
but there's literally one call site in s390, and future cleanup can also
be done to handle valid_wakeup fully within kvm_s390_handle_wait() and
allow generic KVM to drop vcpu_valid_wakeup().

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/s390/kvm/interrupt.c | 1 +
 arch/s390/kvm/kvm-s390.c  | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 10722455fd02..520450a7956f 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -1336,6 +1336,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
 no_timer:
 	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
 	kvm_vcpu_block(vcpu);
+	vcpu->valid_wakeup = false;
 	__unset_cpu_idle(vcpu);
 	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
 
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 7cabe6778b1b..08ed68639a21 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5082,7 +5082,7 @@ static inline unsigned long nonhyp_mask(int i)
 
 void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu)
 {
-	vcpu->valid_wakeup = false;
+
 }
 
 static int __init kvm_s390_init(void)
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish()
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (4 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 05/14] KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch hook Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-27  6:58   ` Christian Borntraeger
  2021-09-28 21:28   ` David Matlack
  2021-09-25  0:55 ` [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful Sean Christopherson
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Drop kvm_arch_vcpu_block_finish() now that all arch implementations are
nops.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/arm64/include/asm/kvm_host.h   | 1 -
 arch/mips/include/asm/kvm_host.h    | 1 -
 arch/powerpc/include/asm/kvm_host.h | 1 -
 arch/s390/include/asm/kvm_host.h    | 2 --
 arch/s390/kvm/kvm-s390.c            | 5 -----
 arch/x86/include/asm/kvm_host.h     | 2 --
 virt/kvm/kvm_main.c                 | 1 -
 7 files changed, 13 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f8be56d5342b..4e0ad0fff540 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -716,7 +716,6 @@ void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
-static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
 void kvm_arm_init_debug(void);
 void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu);
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 696f6b009377..72b90d45a46e 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -897,7 +897,6 @@ static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
-static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
 #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLB
 int kvm_arch_flush_remote_tlb(struct kvm *kvm);
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 59cb38b04ede..8a84448d77a6 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -864,6 +864,5 @@ static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_exit(void) {}
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
-static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
 #endif /* __POWERPC_KVM_HOST_H__ */
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index a604d51acfc8..a22c9266ea05 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -1010,6 +1010,4 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
 static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
 
-void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu);
-
 #endif
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 08ed68639a21..17fabb260c35 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -5080,11 +5080,6 @@ static inline unsigned long nonhyp_mask(int i)
 	return 0x0000ffffffffffffUL >> (nonhyp_fai << 4);
 }
 
-void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu)
-{
-
-}
-
 static int __init kvm_s390_init(void)
 {
 	int i;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1e0e909b98b2..4e8c21083bdb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1916,8 +1916,6 @@ static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
 	static_call_cond(kvm_x86_vcpu_unblocking)(vcpu);
 }
 
-static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
-
 static inline int kvm_cpu_get_apicid(int mps_cpu)
 {
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2ba22b68af3b..2015a1f532ce 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3301,7 +3301,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	}
 
 	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
-	kvm_arch_vcpu_block_finish(vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_block);
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (5 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish() Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-25  9:50   ` Marc Zyngier
  2021-09-25  0:55 ` [PATCH 08/14] KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt() Sean Christopherson
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Invoke the arch hooks for block+unblock if and only if KVM actually
attempts to block the vCPU.  The only non-nop implementation is on arm64,
and if halt-polling is successful, there is no need for arm64 to put/load
the vGIC as KVM hasn't relinquished control of the vCPU in any way.

The primary motivation is to allow future cleanup to split out "block"
from "halt", but this is also likely a small performance boost on arm64
when halt-polling is successful.

Adjust the post-block path to update "cur" after unblocking, i.e. include
vGIC load time in halt_wait_ns and halt_wait_hist, so that the behavior
is consistent.  Moving just the pre-block arch hook would result in only
the vGIC put latency being included in the halt_wait stats.  There is no
obvious evidence that one way or the other is correct, so just ensure KVM
is consistent.

Cc: Marc Zyngier <maz@kernel.org>
Cc: James Morse <james.morse@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2015a1f532ce..f96cda8312f3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3232,8 +3232,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	bool waited = false;
 	u64 block_ns;
 
-	kvm_arch_vcpu_blocking(vcpu);
-
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
@@ -3250,6 +3248,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		} while (kvm_vcpu_can_poll(cur, stop));
 	}
 
+	kvm_arch_vcpu_blocking(vcpu);
 
 	prepare_to_rcuwait(&vcpu->wait);
 	for (;;) {
@@ -3262,6 +3261,9 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		schedule();
 	}
 	finish_rcuwait(&vcpu->wait);
+
+	kvm_arch_vcpu_unblocking(vcpu);
+
 	cur = ktime_get();
 	if (waited) {
 		vcpu->stat.generic.halt_wait_ns +=
@@ -3269,8 +3271,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		KVM_STATS_LOG_HIST_UPDATE(vcpu->stat.generic.halt_wait_hist,
 				ktime_to_ns(cur) - ktime_to_ns(poll_end));
 	}
+
 out:
-	kvm_arch_vcpu_unblocking(vcpu);
 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
 	/*
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 08/14] KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt()
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (6 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 21:59   ` David Matlack
  2021-09-25  0:55 ` [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Sean Christopherson
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Rename a variety of HLT-related helpers to free up the function name
"kvm_vcpu_halt" for future use in generic KVM code, e.g. to differentiate
between "block" and "halt".

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/vmx/nested.c       |  2 +-
 arch/x86/kvm/vmx/vmx.c          |  4 ++--
 arch/x86/kvm/x86.c              | 13 +++++++------
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4e8c21083bdb..cfebef10b89c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1679,7 +1679,7 @@ int kvm_emulate_monitor(struct kvm_vcpu *vcpu);
 int kvm_fast_pio(struct kvm_vcpu *vcpu, int size, unsigned short port, int in);
 int kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
 int kvm_emulate_halt(struct kvm_vcpu *vcpu);
-int kvm_vcpu_halt(struct kvm_vcpu *vcpu);
+int kvm_emulate_halt_noskip(struct kvm_vcpu *vcpu);
 int kvm_emulate_ap_reset_hold(struct kvm_vcpu *vcpu);
 int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index eedcebf58004..f689e463b678 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3618,7 +3618,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
 		    !(nested_cpu_has(vmcs12, CPU_BASED_INTR_WINDOW_EXITING) &&
 		      (vmcs12->guest_rflags & X86_EFLAGS_IF))) {
 			vmx->nested.nested_run_pending = 0;
-			return kvm_vcpu_halt(vcpu);
+			return kvm_emulate_halt_noskip(vcpu);
 		}
 		break;
 	case GUEST_ACTIVITY_WAIT_SIPI:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d118daed0530..858f5f1f1273 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4740,7 +4740,7 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
 		if (kvm_emulate_instruction(vcpu, 0)) {
 			if (vcpu->arch.halt_request) {
 				vcpu->arch.halt_request = 0;
-				return kvm_vcpu_halt(vcpu);
+				return kvm_emulate_halt_noskip(vcpu);
 			}
 			return 1;
 		}
@@ -5414,7 +5414,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
 
 		if (vcpu->arch.halt_request) {
 			vcpu->arch.halt_request = 0;
-			return kvm_vcpu_halt(vcpu);
+			return kvm_emulate_halt_noskip(vcpu);
 		}
 
 		/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b0c21d42f453..eade8a2bdccf 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8643,7 +8643,7 @@ void kvm_arch_exit(void)
 #endif
 }
 
-static int __kvm_vcpu_halt(struct kvm_vcpu *vcpu, int state, int reason)
+static int __kvm_emulate_halt(struct kvm_vcpu *vcpu, int state, int reason)
 {
 	++vcpu->stat.halt_exits;
 	if (lapic_in_kernel(vcpu)) {
@@ -8655,11 +8655,11 @@ static int __kvm_vcpu_halt(struct kvm_vcpu *vcpu, int state, int reason)
 	}
 }
 
-int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
+int kvm_emulate_halt_noskip(struct kvm_vcpu *vcpu)
 {
-	return __kvm_vcpu_halt(vcpu, KVM_MP_STATE_HALTED, KVM_EXIT_HLT);
+	return __kvm_emulate_halt(vcpu, KVM_MP_STATE_HALTED, KVM_EXIT_HLT);
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_halt);
+EXPORT_SYMBOL_GPL(kvm_emulate_halt_noskip);
 
 int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 {
@@ -8668,7 +8668,7 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 	 * TODO: we might be squashing a GUESTDBG_SINGLESTEP-triggered
 	 * KVM_EXIT_DEBUG here.
 	 */
-	return kvm_vcpu_halt(vcpu) && ret;
+	return kvm_emulate_halt_noskip(vcpu) && ret;
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_halt);
 
@@ -8676,7 +8676,8 @@ int kvm_emulate_ap_reset_hold(struct kvm_vcpu *vcpu)
 {
 	int ret = kvm_skip_emulated_instruction(vcpu);
 
-	return __kvm_vcpu_halt(vcpu, KVM_MP_STATE_AP_RESET_HOLD, KVM_EXIT_AP_RESET_HOLD) && ret;
+	return __kvm_emulate_halt(vcpu, KVM_MP_STATE_AP_RESET_HOLD,
+					KVM_EXIT_AP_RESET_HOLD) && ret;
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_ap_reset_hold);
 
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt()
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (7 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 08/14] KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt() Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-27  7:06   ` Christian Borntraeger
  2021-09-28 22:01   ` David Matlack
  2021-09-25  0:55 ` [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt() Sean Christopherson
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting
the actual "block" sequences into a separate helper (to be named
kvm_vcpu_block()).  x86 will use the standalone block-only path to handle
non-halt cases where the vCPU is not runnable.

Rename block_ns to halt_ns to match the new function name.

Opportunistically move an x86-specific comment to x86, and enhance it, too.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/arm64/kvm/arch_timer.c       |  2 +-
 arch/arm64/kvm/handle_exit.c      |  4 ++--
 arch/arm64/kvm/psci.c             |  2 +-
 arch/mips/kvm/emulate.c           |  2 +-
 arch/powerpc/kvm/book3s_pr.c      |  2 +-
 arch/powerpc/kvm/book3s_pr_papr.c |  2 +-
 arch/powerpc/kvm/booke.c          |  2 +-
 arch/powerpc/kvm/powerpc.c        |  2 +-
 arch/s390/kvm/interrupt.c         |  2 +-
 arch/x86/kvm/x86.c                | 11 +++++++++--
 include/linux/kvm_host.h          |  2 +-
 virt/kvm/kvm_main.c               | 20 +++++++++-----------
 12 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 3df67c127489..7e8396f74010 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -467,7 +467,7 @@ static void timer_save_state(struct arch_timer_context *ctx)
 }
 
 /*
- * Schedule the background timer before calling kvm_vcpu_block, so that this
+ * Schedule the background timer before calling kvm_vcpu_halt, so that this
  * thread is removed from its waitqueue and made runnable when there's a timer
  * interrupt to handle.
  */
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 275a27368a04..08f823984712 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -82,7 +82,7 @@ static int handle_no_fpsimd(struct kvm_vcpu *vcpu)
  *
  * WFE: Yield the CPU and come back to this vcpu when the scheduler
  * decides to.
- * WFI: Simply call kvm_vcpu_block(), which will halt execution of
+ * WFI: Simply call kvm_vcpu_halt(), which will halt execution of
  * world-switches and schedule other host processes until there is an
  * incoming IRQ or FIQ to the VM.
  */
@@ -95,7 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
 	} else {
 		trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
 		vcpu->stat.wfi_exit_stat++;
-		kvm_vcpu_block(vcpu);
+		kvm_vcpu_halt(vcpu);
 		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
 	}
 
diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 74c47d420253..e275b2ca08b9 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -46,7 +46,7 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 	 * specification (ARM DEN 0022A). This means all suspend states
 	 * for KVM will preserve the register state.
 	 */
-	kvm_vcpu_block(vcpu);
+	kvm_vcpu_halt(vcpu);
 	kvm_clear_request(KVM_REQ_UNHALT, vcpu);
 
 	return PSCI_RET_SUCCESS;
diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
index 22e745e49b0a..b494d8d39290 100644
--- a/arch/mips/kvm/emulate.c
+++ b/arch/mips/kvm/emulate.c
@@ -952,7 +952,7 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu *vcpu)
 	if (!vcpu->arch.pending_exceptions) {
 		kvm_vz_lose_htimer(vcpu);
 		vcpu->arch.wait = 1;
-		kvm_vcpu_block(vcpu);
+		kvm_vcpu_halt(vcpu);
 
 		/*
 		 * We we are runnable, then definitely go off to user space to
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index 6bc9425acb32..0ced1b16f0e5 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -492,7 +492,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
 
 	if (msr & MSR_POW) {
 		if (!vcpu->arch.pending_exceptions) {
-			kvm_vcpu_block(vcpu);
+			kvm_vcpu_halt(vcpu);
 			kvm_clear_request(KVM_REQ_UNHALT, vcpu);
 			vcpu->stat.generic.halt_wakeup++;
 
diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c
index ac14239f3424..1f10e7dfcdd0 100644
--- a/arch/powerpc/kvm/book3s_pr_papr.c
+++ b/arch/powerpc/kvm/book3s_pr_papr.c
@@ -376,7 +376,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
 		return kvmppc_h_pr_stuff_tce(vcpu);
 	case H_CEDE:
 		kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE);
-		kvm_vcpu_block(vcpu);
+		kvm_vcpu_halt(vcpu);
 		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
 		vcpu->stat.generic.halt_wakeup++;
 		return EMULATE_DONE;
diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index 977801c83aff..12abffa40cd9 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -718,7 +718,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
 
 	if (vcpu->arch.shared->msr & MSR_WE) {
 		local_irq_enable();
-		kvm_vcpu_block(vcpu);
+		kvm_vcpu_halt(vcpu);
 		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
 		hard_irq_disable();
 
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 8ab90ce8738f..565eed2dab81 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -236,7 +236,7 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
 		break;
 	case EV_HCALL_TOKEN(EV_IDLE):
 		r = EV_SUCCESS;
-		kvm_vcpu_block(vcpu);
+		kvm_vcpu_halt(vcpu);
 		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
 		break;
 	default:
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 520450a7956f..10bd648170b7 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -1335,7 +1335,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
 	VCPU_EVENT(vcpu, 4, "enabled wait: %llu ns", sltime);
 no_timer:
 	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
-	kvm_vcpu_block(vcpu);
+	kvm_vcpu_halt(vcpu);
 	vcpu->valid_wakeup = false;
 	__unset_cpu_idle(vcpu);
 	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eade8a2bdccf..0d71c73a61bb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8645,6 +8645,13 @@ void kvm_arch_exit(void)
 
 static int __kvm_emulate_halt(struct kvm_vcpu *vcpu, int state, int reason)
 {
+	/*
+	 * The vCPU has halted, e.g. executed HLT.  Update the run state if the
+	 * local APIC is in-kernel, the run loop will detect the non-runnable
+	 * state and halt the vCPU.  Exit to userspace if the local APIC is
+	 * managed by userspace, in which case userspace is responsible for
+	 * handling wake events.
+	 */
 	++vcpu->stat.halt_exits;
 	if (lapic_in_kernel(vcpu)) {
 		vcpu->arch.mp_state = state;
@@ -9886,7 +9893,7 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 	if (!kvm_arch_vcpu_runnable(vcpu) &&
 	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
 		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
-		kvm_vcpu_block(vcpu);
+		kvm_vcpu_halt(vcpu);
 		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 
 		if (kvm_x86_ops.post_block)
@@ -10120,7 +10127,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 			r = -EINTR;
 			goto out;
 		}
-		kvm_vcpu_block(vcpu);
+		kvm_vcpu_halt(vcpu);
 		if (kvm_apic_accept_events(vcpu) < 0) {
 			r = 0;
 			goto out;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3f87d6ad20bf..d2a8be3fb9ba 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -965,7 +965,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
 void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
 
-void kvm_vcpu_block(struct kvm_vcpu *vcpu);
+void kvm_vcpu_halt(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu);
 bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f96cda8312f3..280cf1dca7db 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3221,16 +3221,13 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
 	}
 }
 
-/*
- * The vCPU has executed a HLT instruction with in-kernel mode enabled.
- */
-void kvm_vcpu_block(struct kvm_vcpu *vcpu)
+void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
 	ktime_t start, cur, poll_end;
 	bool waited = false;
-	u64 block_ns;
+	u64 halt_ns;
 
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
@@ -3273,7 +3270,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	}
 
 out:
-	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
+	/* The total time the vCPU was "halted", including polling time. */
+	halt_ns = ktime_to_ns(cur) - ktime_to_ns(start);
 
 	/*
 	 * Note, halt-polling is considered successful so long as the vCPU was
@@ -3287,24 +3285,24 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
 		} else if (vcpu->kvm->max_halt_poll_ns) {
-			if (block_ns <= vcpu->halt_poll_ns)
+			if (halt_ns <= vcpu->halt_poll_ns)
 				;
 			/* we had a long block, shrink polling */
 			else if (vcpu->halt_poll_ns &&
-					block_ns > vcpu->kvm->max_halt_poll_ns)
+				 halt_ns > vcpu->kvm->max_halt_poll_ns)
 				shrink_halt_poll_ns(vcpu);
 			/* we had a short halt and our poll time is too small */
 			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-					block_ns < vcpu->kvm->max_halt_poll_ns)
+				 halt_ns < vcpu->kvm->max_halt_poll_ns)
 				grow_halt_poll_ns(vcpu);
 		} else {
 			vcpu->halt_poll_ns = 0;
 		}
 	}
 
-	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
+	trace_kvm_vcpu_wakeup(halt_ns, waited, vcpu_valid_wakeup(vcpu));
 }
-EXPORT_SYMBOL_GPL(kvm_vcpu_block);
+EXPORT_SYMBOL_GPL(kvm_vcpu_halt);
 
 bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
 {
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt()
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (8 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-27  7:41   ` Christian Borntraeger
  2021-09-28 22:03   ` David Matlack
  2021-09-25  0:55 ` [PATCH 11/14] KVM: stats: Add stat to detect if vcpu is currently blocking Sean Christopherson
                   ` (4 subsequent siblings)
  14 siblings, 2 replies; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Factor out the "block" part of kvm_vcpu_halt() so that x86 can emulate
non-halt wait/sleep/block conditions that should not be subjected to
halt-polling.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      | 50 ++++++++++++++++++++++++++++------------
 2 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d2a8be3fb9ba..655c2b24db2d 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -966,6 +966,7 @@ void kvm_sigset_activate(struct kvm_vcpu *vcpu);
 void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
 
 void kvm_vcpu_halt(struct kvm_vcpu *vcpu);
+bool kvm_vcpu_block(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu);
 bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 280cf1dca7db..fe34457530c2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3199,6 +3199,34 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+/*
+ * Block the vCPU until the vCPU is runnable, an event arrives, or a signal is
+ * pending.  This is mostly used when halting a vCPU, but may also be used
+ * directly for other vCPU non-runnable states, e.g. x86's Wait-For-SIPI.
+ */
+bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
+{
+	bool waited = false;
+
+	kvm_arch_vcpu_blocking(vcpu);
+
+	prepare_to_rcuwait(&vcpu->wait);
+	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		if (kvm_vcpu_check_block(vcpu) < 0)
+			break;
+
+		waited = true;
+		schedule();
+	}
+	finish_rcuwait(&vcpu->wait);
+
+	kvm_arch_vcpu_unblocking(vcpu);
+
+	return waited;
+}
+
 static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
 					  ktime_t end, bool success)
 {
@@ -3221,6 +3249,12 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
 	}
 }
 
+/*
+ * Emulate a vCPU halt condition, e.g. HLT on x86, WFI on arm, etc...  If halt
+ * polling is enabled, busy wait for a short time before blocking to avoid the
+ * expensive block+unblock sequence if a wake event arrives soon after the vCPU
+ * is halted.
+ */
 void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
@@ -3245,21 +3279,7 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 		} while (kvm_vcpu_can_poll(cur, stop));
 	}
 
-	kvm_arch_vcpu_blocking(vcpu);
-
-	prepare_to_rcuwait(&vcpu->wait);
-	for (;;) {
-		set_current_state(TASK_INTERRUPTIBLE);
-
-		if (kvm_vcpu_check_block(vcpu) < 0)
-			break;
-
-		waited = true;
-		schedule();
-	}
-	finish_rcuwait(&vcpu->wait);
-
-	kvm_arch_vcpu_unblocking(vcpu);
+	waited = kvm_vcpu_block(vcpu);
 
 	cur = ktime_get();
 	if (waited) {
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 11/14] KVM: stats: Add stat to detect if vcpu is currently blocking
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (9 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt() Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 22:04   ` David Matlack
  2021-09-25  0:55 ` [PATCH 12/14] KVM: Don't redo ktime_get() when calculating halt-polling stop/deadline Sean Christopherson
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

From: Jing Zhang <jingzhangos@google.com>

Add a "blocking" stat that userspace can use to detect the case where a
vCPU is not being run because of a vCPU/guest action, e.g. HLT or WFS on
x86, WFI on arm64, etc...  Current guest/host/halt stats don't show this
well, e.g. if a guest halts for a long period of time then the vCPU could
appear pathologically blocked due to a host condition, when in reality the
vCPU has been put into a not-runnable state by the guest.

Originally-by: Cannon Matthews <cannonmatthews@google.com>
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
[sean: renamed stat to "blocking", massaged changelog]
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/kvm_host.h  | 3 ++-
 include/linux/kvm_types.h | 1 +
 virt/kvm/kvm_main.c       | 2 ++
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 655c2b24db2d..9bb1972e396a 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1453,7 +1453,8 @@ struct _kvm_stats_desc {
 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	       \
 			HALT_POLL_HIST_COUNT),				       \
 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,	       \
-			HALT_POLL_HIST_COUNT)
+			HALT_POLL_HIST_COUNT),				       \
+	STATS_DESC_ICOUNTER(VCPU_GENERIC, blocking)
 
 extern struct dentry *kvm_debugfs_dir;
 
diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index 2237abb93ccd..c4f9257bf32d 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -94,6 +94,7 @@ struct kvm_vcpu_stat_generic {
 	u64 halt_poll_success_hist[HALT_POLL_HIST_COUNT];
 	u64 halt_poll_fail_hist[HALT_POLL_HIST_COUNT];
 	u64 halt_wait_hist[HALT_POLL_HIST_COUNT];
+	u64 blocking;
 };
 
 #define KVM_STATS_NAME_SIZE	48
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fe34457530c2..2980d2b88559 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3208,6 +3208,7 @@ bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
 {
 	bool waited = false;
 
+	vcpu->stat.generic.blocking = 1;
 	kvm_arch_vcpu_blocking(vcpu);
 
 	prepare_to_rcuwait(&vcpu->wait);
@@ -3223,6 +3224,7 @@ bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	finish_rcuwait(&vcpu->wait);
 
 	kvm_arch_vcpu_unblocking(vcpu);
+	vcpu->stat.generic.blocking = 0;
 
 	return waited;
 }
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 12/14] KVM: Don't redo ktime_get() when calculating halt-polling stop/deadline
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (10 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 11/14] KVM: stats: Add stat to detect if vcpu is currently blocking Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 22:08   ` David Matlack
  2021-09-25  0:55 ` [PATCH 13/14] KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs Sean Christopherson
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Calculate the halt-polling "stop" time using "cur" instead of redoing
ktime_get().  In the happy case where hardware correctly predicts
do_halt_poll, "cur" is only a few cycles old.  And if the branch is
mispredicted, arguably that extra latency should count toward the
halt-polling time.

In all likelihood, the numbers involved are in the noise and either
approach is perfectly ok.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/kvm_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 2980d2b88559..80f78daa6b8d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3267,7 +3267,7 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
-		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
+		ktime_t stop = ktime_add_ns(cur, vcpu->halt_poll_ns);
 
 		do {
 			/*
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 13/14] KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (11 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 12/14] KVM: Don't redo ktime_get() when calculating halt-polling stop/deadline Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 22:12   ` David Matlack
  2021-09-25  0:55 ` [PATCH 14/14] KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states Sean Christopherson
  2021-09-27  7:22 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Christian Borntraeger
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Go directly to kvm_vcpu_block() when handling the case where userspace
attempts to run an UNINITIALIZED vCPU.  The vCPU isn't halted and its time
spent in limbo arguably should not be factored into halt-polling as the
behavior of the VM at this point is not at all indicative of the behavior
of the VM once it is up and running, i.e. executing HLT in idle tasks.

Note, because this case is encountered only on the first run of an AP vCPU,
vcpu->halt_poll_ns is guaranteed to be '0', and so KVM will not attempt
halt-polling, i.e. this really only affects the post-block bookkeeping.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0d71c73a61bb..b444f9315766 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10127,7 +10127,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 			r = -EINTR;
 			goto out;
 		}
-		kvm_vcpu_halt(vcpu);
+		kvm_vcpu_block(vcpu);
 		if (kvm_apic_accept_events(vcpu) < 0) {
 			r = 0;
 			goto out;
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH 14/14] KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (12 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 13/14] KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs Sean Christopherson
@ 2021-09-25  0:55 ` Sean Christopherson
  2021-09-28 22:14   ` David Matlack
  2021-09-27  7:22 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Christian Borntraeger
  14 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-25  0:55 UTC (permalink / raw)
  To: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, David Matlack, Jing Zhang

Call kvm_vcpu_block() directly for all wait states except HALTED so that
kvm_vcpu_halt() is no longer a misnomer on x86.

Functionally, this means KVM will never attempt halt-polling or adjust
vcpu->halt_poll_ns for INIT_RECEIVED (a.k.a. Wait-For-SIPI (WFS)) or
AP_RESET_HOLD; UNINITIALIZED is handled in kvm_arch_vcpu_ioctl_run(),
and x86 doesn't use any other "wait" states.

As mentioned above, the motivation of this is purely so that "halt" isn't
overloaded on x86, e.g. in KVM's stats.  Skipping halt-polling for WFS
(and RESET_HOLD) has no meaningful effect on guest performance as there
are typically single-digit numbers of INIT-SIPI sequences per AP vCPU,
per boot, versus thousands of HLTs just to boot to console.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b444f9315766..a0f313c4bc49 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9893,7 +9893,10 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 	if (!kvm_arch_vcpu_runnable(vcpu) &&
 	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
 		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
-		kvm_vcpu_halt(vcpu);
+		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
+			kvm_vcpu_halt(vcpu);
+		else
+			kvm_vcpu_block(vcpu);
 		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
 
 		if (kvm_x86_ops.post_block)
-- 
2.33.0.685.g46640cef36-goog


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-25  0:55 ` [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful Sean Christopherson
@ 2021-09-25  9:50   ` Marc Zyngier
  2021-09-26  6:27     ` Paolo Bonzini
  0 siblings, 1 reply; 50+ messages in thread
From: Marc Zyngier @ 2021-09-25  9:50 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, David Matlack, Jing Zhang

On Sat, 25 Sep 2021 01:55:21 +0100,
Sean Christopherson <seanjc@google.com> wrote:
> 
> Invoke the arch hooks for block+unblock if and only if KVM actually
> attempts to block the vCPU.  The only non-nop implementation is on arm64,
> and if halt-polling is successful, there is no need for arm64 to put/load
> the vGIC as KVM hasn't relinquished control of the vCPU in any way.

This doesn't mean that there is no requirement for any state
change. The put/load on GICv4 is crucial for performance, and the VMCR
resync is a correctness requirement.

> 
> The primary motivation is to allow future cleanup to split out "block"
> from "halt", but this is also likely a small performance boost on arm64
> when halt-polling is successful.
> 
> Adjust the post-block path to update "cur" after unblocking, i.e. include
> vGIC load time in halt_wait_ns and halt_wait_hist, so that the behavior
> is consistent.  Moving just the pre-block arch hook would result in only
> the vGIC put latency being included in the halt_wait stats.  There is no
> obvious evidence that one way or the other is correct, so just ensure KVM
> is consistent.

This effectively reverts 07ab0f8d9a12 ("KVM: Call
kvm_arch_vcpu_blocking early into the blocking sequence"), which was a
huge gain on arm64, not to mention a correctness fix.

Without this, a GICv4 machine will always pay for the full poll
penalty, going into schedule(), and only then get a doorbell interrupt
signalling telling the kernel that there was an interrupt.

On a non-GICv4 machine, it means that interrupts injected by another
thread during the pooling will be evaluated with an outdated priority
mask, which can result in either a spurious wake-up or a missed
wake-up.

If it means introducing a new set of {pre,post}-poll arch-specific
hooks, so be it. But I don't think this change is acceptable as is.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-25  9:50   ` Marc Zyngier
@ 2021-09-26  6:27     ` Paolo Bonzini
  2021-09-26  9:02       ` Marc Zyngier
  0 siblings, 1 reply; 50+ messages in thread
From: Paolo Bonzini @ 2021-09-26  6:27 UTC (permalink / raw)
  To: Marc Zyngier, Sean Christopherson
  Cc: Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, David Matlack, Jing Zhang

On 25/09/21 11:50, Marc Zyngier wrote:
>> there is no need for arm64 to put/load
>> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> 
> This doesn't mean that there is no requirement for any state
> change. The put/load on GICv4 is crucial for performance, and the VMCR
> resync is a correctness requirement.

I wouldn't even say it's crucial for performance: halt polling cannot 
work and is a waste of time without (the current implementation of) 
put/load.

However, is activating the doorbell necessary?  If possible, polling the 
VGIC directly for pending VLPIs without touching the ITS (for example by 
emulating IAR reads) may make sense.  IIUC that must be done at EL2 
though, so maybe it would even make sense to move all of halt polling to 
EL2 for the nVHE case.  It all depends on benchmark results, of course.

Sorry for the many stupid questions I'm asking lately, but I'm trying to 
pay more attention to ARM and understand the VGIC and EL1/EL2 split better.

Paolo


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-26  6:27     ` Paolo Bonzini
@ 2021-09-26  9:02       ` Marc Zyngier
  2021-09-27 17:28         ` Sean Christopherson
  0 siblings, 1 reply; 50+ messages in thread
From: Marc Zyngier @ 2021-09-26  9:02 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Huacai Chen, Aleksandar Markovic,
	Paul Mackerras, Christian Borntraeger, Janosch Frank,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel,
	David Matlack, Jing Zhang

On Sun, 26 Sep 2021 07:27:28 +0100,
Paolo Bonzini <pbonzini@redhat.com> wrote:
> 
> On 25/09/21 11:50, Marc Zyngier wrote:
> >> there is no need for arm64 to put/load
> >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > 
> > This doesn't mean that there is no requirement for any state
> > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > resync is a correctness requirement.
> 
> I wouldn't even say it's crucial for performance: halt polling cannot
> work and is a waste of time without (the current implementation of)
> put/load.

Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
WFI (which is the only reason we end-up on this path). Only LPIs (and
SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
still follow the standard SW injection model.

However, there is still the ICH_VMCR_EL2 requirement (to get the
up-to-date priority mask and group enable bits) for SW-injected
interrupt wake-up to work correctly, and I really don't want to save
that one eagerly on each shallow exit.

> However, is activating the doorbell necessary?  If possible, polling
> the VGIC directly for pending VLPIs without touching the ITS (for
> example by emulating IAR reads) may make sense.  IIUC that must be
> done at EL2 though, so maybe it would even make sense to move all of
> halt polling to EL2 for the nVHE case.  It all depends on benchmark
> results, of course.

No, there is no architectural way to observe the VLPI state. EL2
cannot impersonate the guest an read ICV_IAR1_EL1 (because it
conveniently has the same encoding as ICC_IAR1_EL1), and if it could,
it would be *destructive* (not what you want). The equivalent of the
LR that is used to hold the highest priority VLPI presented to the
virtual CPU interface is not visible to SW at all.

There are exactly two ways for the hypervisor to get a hint about the
VLPI state (and that's only a hint, as everything can be spurious):

- Make the vPE non resident and use GICR_VPENDBASER.PendingLast bit to
  find out whether there are pending VLPIs

- Make the vPE non resident and get a doorbell interrupt

See the common pattern?

There is no polling mechanism, and the only way to flush the VLPI
state to memory is to destroy the GIC view of the vPE, which is a bit
counter-productive. It also only work on GICv4.1, and not GICv4 (which
is why we don't support live migration on GICv4).

> Sorry for the many stupid questions I'm asking lately, but I'm trying
> to pay more attention to ARM and understand the VGIC and EL1/EL2 split
> better.

Feel free to ask any question. The more people understand how the
architecture works, the better.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU
  2021-09-25  0:55 ` [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU Sean Christopherson
@ 2021-09-27  6:54   ` Christian Borntraeger
  0 siblings, 0 replies; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-27  6:54 UTC (permalink / raw)
  To: Sean Christopherson, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank,
	Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel,
	David Matlack, Jing Zhang



Am 25.09.21 um 02:55 schrieb Sean Christopherson:
> Wrap s390's halt_poll_max_steal with READ_ONCE and snapshot the result of
> kvm_arch_no_poll() in kvm_vcpu_block() to avoid a mostly-theoretical,
> largely benign bug on s390 where the result of kvm_arch_no_poll() could
> change due to userspace modifying halt_poll_max_steal while the vCPU is
> blocking.  The bug is largely benign as it will either cause KVM to skip
> updating halt-polling times (no_poll toggles false=>true) or to update
> halt-polling times with a slightly flawed block_ns.
> 
> Note, READ_ONCE is unnecessary in the current code, add it in case the
> arch hook is ever inlined, and to provide a hint that userspace can
> change the param at will.
> 
> Fixes: 8b905d28ee17 ("KVM: s390: provide kvm_arch_no_poll function")
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>

> ---
>   arch/s390/kvm/kvm-s390.c | 2 +-
>   virt/kvm/kvm_main.c      | 5 +++--
>   2 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 6a6dd5e1daf6..7cabe6778b1b 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -3446,7 +3446,7 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
>   {
>   	/* do not poll with more than halt_poll_max_steal percent of steal time */
>   	if (S390_lowcore.avg_steal_timer * 100 / (TICK_USEC << 12) >=
> -	    halt_poll_max_steal) {
> +	    READ_ONCE(halt_poll_max_steal)) {
>   		vcpu->stat.halt_no_poll_steal++;
>   		return true;
>   	}
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 191dac6b1bed..768a4cbb26a6 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3213,6 +3213,7 @@ update_halt_poll_stats(struct kvm_vcpu *vcpu, u64 poll_ns, bool waited)
>    */
>   void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   {
> +	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
>   	ktime_t start, cur, poll_end;
>   	bool waited = false;
>   	u64 block_ns;
> @@ -3220,7 +3221,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   	kvm_arch_vcpu_blocking(vcpu);
>   
>   	start = cur = poll_end = ktime_get();
> -	if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
> +	if (vcpu->halt_poll_ns && halt_poll_allowed) {
>   		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
>   
>   		++vcpu->stat.generic.halt_attempted_poll;
> @@ -3275,7 +3276,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   	update_halt_poll_stats(
>   		vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
>   
> -	if (!kvm_arch_no_poll(vcpu)) {
> +	if (halt_poll_allowed) {
>   		if (!vcpu_valid_wakeup(vcpu)) {
>   			shrink_halt_poll_ns(vcpu);
>   		} else if (vcpu->kvm->max_halt_poll_ns) {
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 05/14] KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch hook
  2021-09-25  0:55 ` [PATCH 05/14] KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch hook Sean Christopherson
@ 2021-09-27  6:58   ` Christian Borntraeger
  0 siblings, 0 replies; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-27  6:58 UTC (permalink / raw)
  To: Sean Christopherson, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank,
	Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel,
	David Matlack, Jing Zhang



Am 25.09.21 um 02:55 schrieb Sean Christopherson:
> Move the clearing of valid_wakeup out of kvm_arch_vcpu_block_finish() so
> that a future patch can drop said arch hook.  Unlike the other blocking-
> related arch hooks (vcpu_blocking/unblocking()), vcpu_block_finish() needs
> to be called even if the KVM doesn't actually block the vCPU.  This will
> allow future patches to differentiate between truly blocking the vCPU and
> emulating a halt condition without introducing a contradiction.
> 
> Alternatively, the hook could be renamed to kvm_arch_vcpu_halt_finish(),
> but there's literally one call site in s390, and future cleanup can also
> be done to handle valid_wakeup fully within kvm_s390_handle_wait() and
> allow generic KVM to drop vcpu_valid_wakeup().
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
>   arch/s390/kvm/interrupt.c | 1 +
>   arch/s390/kvm/kvm-s390.c  | 2 +-
>   2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 10722455fd02..520450a7956f 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -1336,6 +1336,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
>   no_timer:
>   	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
>   	kvm_vcpu_block(vcpu);
> +	vcpu->valid_wakeup = false;
>   	__unset_cpu_idle(vcpu);
>   	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
>   
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 7cabe6778b1b..08ed68639a21 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -5082,7 +5082,7 @@ static inline unsigned long nonhyp_mask(int i)
>   
>   void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu)
>   {
> -	vcpu->valid_wakeup = false;
> +

maybe just remove the line instead of adding an empty one?

>   }
>   
>   static int __init kvm_s390_init(void)
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish()
  2021-09-25  0:55 ` [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish() Sean Christopherson
@ 2021-09-27  6:58   ` Christian Borntraeger
  2021-09-28 21:28   ` David Matlack
  1 sibling, 0 replies; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-27  6:58 UTC (permalink / raw)
  To: Sean Christopherson, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank,
	Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel,
	David Matlack, Jing Zhang



Am 25.09.21 um 02:55 schrieb Sean Christopherson:
> Drop kvm_arch_vcpu_block_finish() now that all arch implementations are
> nops.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>

> ---
>   arch/arm64/include/asm/kvm_host.h   | 1 -
>   arch/mips/include/asm/kvm_host.h    | 1 -
>   arch/powerpc/include/asm/kvm_host.h | 1 -
>   arch/s390/include/asm/kvm_host.h    | 2 --
>   arch/s390/kvm/kvm-s390.c            | 5 -----
>   arch/x86/include/asm/kvm_host.h     | 2 --
>   virt/kvm/kvm_main.c                 | 1 -
>   7 files changed, 13 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f8be56d5342b..4e0ad0fff540 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -716,7 +716,6 @@ void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
>   static inline void kvm_arch_hardware_unsetup(void) {}
>   static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>   static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>   
>   void kvm_arm_init_debug(void);
>   void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu);
> diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> index 696f6b009377..72b90d45a46e 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -897,7 +897,6 @@ static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
>   static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>   static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>   static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>   
>   #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLB
>   int kvm_arch_flush_remote_tlb(struct kvm *kvm);
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 59cb38b04ede..8a84448d77a6 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -864,6 +864,5 @@ static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>   static inline void kvm_arch_exit(void) {}
>   static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>   static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>   
>   #endif /* __POWERPC_KVM_HOST_H__ */
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index a604d51acfc8..a22c9266ea05 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -1010,6 +1010,4 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
>   static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>   static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
>   
> -void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu);
> -
>   #endif
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 08ed68639a21..17fabb260c35 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -5080,11 +5080,6 @@ static inline unsigned long nonhyp_mask(int i)
>   	return 0x0000ffffffffffffUL >> (nonhyp_fai << 4);
>   }
>   
> -void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu)
> -{
> -
> -}
> -
>   static int __init kvm_s390_init(void)
>   {
>   	int i;
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 1e0e909b98b2..4e8c21083bdb 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1916,8 +1916,6 @@ static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
>   	static_call_cond(kvm_x86_vcpu_unblocking)(vcpu);
>   }
>   
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
> -
>   static inline int kvm_cpu_get_apicid(int mps_cpu)
>   {
>   #ifdef CONFIG_X86_LOCAL_APIC
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2ba22b68af3b..2015a1f532ce 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3301,7 +3301,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   	}
>   
>   	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
> -	kvm_arch_vcpu_block_finish(vcpu);
>   }
>   EXPORT_SYMBOL_GPL(kvm_vcpu_block);
>   
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt()
  2021-09-25  0:55 ` [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Sean Christopherson
@ 2021-09-27  7:06   ` Christian Borntraeger
  2021-09-28 22:01   ` David Matlack
  1 sibling, 0 replies; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-27  7:06 UTC (permalink / raw)
  To: Sean Christopherson, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank,
	Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel,
	David Matlack, Jing Zhang



Am 25.09.21 um 02:55 schrieb Sean Christopherson:
> Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting
> the actual "block" sequences into a separate helper (to be named
> kvm_vcpu_block()).  x86 will use the standalone block-only path to handle
> non-halt cases where the vCPU is not runnable.
> 
> Rename block_ns to halt_ns to match the new function name.
> 
> Opportunistically move an x86-specific comment to x86, and enhance it, too.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>

> ---
>   arch/arm64/kvm/arch_timer.c       |  2 +-
>   arch/arm64/kvm/handle_exit.c      |  4 ++--
>   arch/arm64/kvm/psci.c             |  2 +-
>   arch/mips/kvm/emulate.c           |  2 +-
>   arch/powerpc/kvm/book3s_pr.c      |  2 +-
>   arch/powerpc/kvm/book3s_pr_papr.c |  2 +-
>   arch/powerpc/kvm/booke.c          |  2 +-
>   arch/powerpc/kvm/powerpc.c        |  2 +-
>   arch/s390/kvm/interrupt.c         |  2 +-
>   arch/x86/kvm/x86.c                | 11 +++++++++--
>   include/linux/kvm_host.h          |  2 +-
>   virt/kvm/kvm_main.c               | 20 +++++++++-----------
>   12 files changed, 29 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 3df67c127489..7e8396f74010 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -467,7 +467,7 @@ static void timer_save_state(struct arch_timer_context *ctx)
>   }
>   
>   /*
> - * Schedule the background timer before calling kvm_vcpu_block, so that this
> + * Schedule the background timer before calling kvm_vcpu_halt, so that this
>    * thread is removed from its waitqueue and made runnable when there's a timer
>    * interrupt to handle.
>    */
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 275a27368a04..08f823984712 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -82,7 +82,7 @@ static int handle_no_fpsimd(struct kvm_vcpu *vcpu)
>    *
>    * WFE: Yield the CPU and come back to this vcpu when the scheduler
>    * decides to.
> - * WFI: Simply call kvm_vcpu_block(), which will halt execution of
> + * WFI: Simply call kvm_vcpu_halt(), which will halt execution of
>    * world-switches and schedule other host processes until there is an
>    * incoming IRQ or FIQ to the VM.
>    */
> @@ -95,7 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
>   	} else {
>   		trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
>   		vcpu->stat.wfi_exit_stat++;
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>   		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   	}
>   
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index 74c47d420253..e275b2ca08b9 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -46,7 +46,7 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
>   	 * specification (ARM DEN 0022A). This means all suspend states
>   	 * for KVM will preserve the register state.
>   	 */
> -	kvm_vcpu_block(vcpu);
> +	kvm_vcpu_halt(vcpu);
>   	kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   
>   	return PSCI_RET_SUCCESS;
> diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
> index 22e745e49b0a..b494d8d39290 100644
> --- a/arch/mips/kvm/emulate.c
> +++ b/arch/mips/kvm/emulate.c
> @@ -952,7 +952,7 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu *vcpu)
>   	if (!vcpu->arch.pending_exceptions) {
>   		kvm_vz_lose_htimer(vcpu);
>   		vcpu->arch.wait = 1;
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>   
>   		/*
>   		 * We we are runnable, then definitely go off to user space to
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index 6bc9425acb32..0ced1b16f0e5 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -492,7 +492,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
>   
>   	if (msr & MSR_POW) {
>   		if (!vcpu->arch.pending_exceptions) {
> -			kvm_vcpu_block(vcpu);
> +			kvm_vcpu_halt(vcpu);
>   			kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   			vcpu->stat.generic.halt_wakeup++;
>   
> diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c
> index ac14239f3424..1f10e7dfcdd0 100644
> --- a/arch/powerpc/kvm/book3s_pr_papr.c
> +++ b/arch/powerpc/kvm/book3s_pr_papr.c
> @@ -376,7 +376,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
>   		return kvmppc_h_pr_stuff_tce(vcpu);
>   	case H_CEDE:
>   		kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE);
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>   		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   		vcpu->stat.generic.halt_wakeup++;
>   		return EMULATE_DONE;
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 977801c83aff..12abffa40cd9 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -718,7 +718,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>   
>   	if (vcpu->arch.shared->msr & MSR_WE) {
>   		local_irq_enable();
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>   		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   		hard_irq_disable();
>   
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 8ab90ce8738f..565eed2dab81 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -236,7 +236,7 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
>   		break;
>   	case EV_HCALL_TOKEN(EV_IDLE):
>   		r = EV_SUCCESS;
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>   		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>   		break;
>   	default:
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 520450a7956f..10bd648170b7 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -1335,7 +1335,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
>   	VCPU_EVENT(vcpu, 4, "enabled wait: %llu ns", sltime);
>   no_timer:
>   	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
> -	kvm_vcpu_block(vcpu);
> +	kvm_vcpu_halt(vcpu);
>   	vcpu->valid_wakeup = false;
>   	__unset_cpu_idle(vcpu);
>   	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index eade8a2bdccf..0d71c73a61bb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8645,6 +8645,13 @@ void kvm_arch_exit(void)
>   
>   static int __kvm_emulate_halt(struct kvm_vcpu *vcpu, int state, int reason)
>   {
> +	/*
> +	 * The vCPU has halted, e.g. executed HLT.  Update the run state if the
> +	 * local APIC is in-kernel, the run loop will detect the non-runnable
> +	 * state and halt the vCPU.  Exit to userspace if the local APIC is
> +	 * managed by userspace, in which case userspace is responsible for
> +	 * handling wake events.
> +	 */
>   	++vcpu->stat.halt_exits;
>   	if (lapic_in_kernel(vcpu)) {
>   		vcpu->arch.mp_state = state;
> @@ -9886,7 +9893,7 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>   	if (!kvm_arch_vcpu_runnable(vcpu) &&
>   	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
>   		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>   		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>   
>   		if (kvm_x86_ops.post_block)
> @@ -10120,7 +10127,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>   			r = -EINTR;
>   			goto out;
>   		}
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>   		if (kvm_apic_accept_events(vcpu) < 0) {
>   			r = 0;
>   			goto out;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 3f87d6ad20bf..d2a8be3fb9ba 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -965,7 +965,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
>   void kvm_sigset_activate(struct kvm_vcpu *vcpu);
>   void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
>   
> -void kvm_vcpu_block(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_halt(struct kvm_vcpu *vcpu);
>   void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu);
>   void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu);
>   bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index f96cda8312f3..280cf1dca7db 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3221,16 +3221,13 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
>   	}
>   }
>   
> -/*
> - * The vCPU has executed a HLT instruction with in-kernel mode enabled.
> - */
> -void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> +void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>   {
>   	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
>   	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
>   	ktime_t start, cur, poll_end;
>   	bool waited = false;
> -	u64 block_ns;
> +	u64 halt_ns;
>   
>   	start = cur = poll_end = ktime_get();
>   	if (do_halt_poll) {
> @@ -3273,7 +3270,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   	}
>   
>   out:
> -	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
> +	/* The total time the vCPU was "halted", including polling time. */
> +	halt_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>   
>   	/*
>   	 * Note, halt-polling is considered successful so long as the vCPU was
> @@ -3287,24 +3285,24 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>   		if (!vcpu_valid_wakeup(vcpu)) {
>   			shrink_halt_poll_ns(vcpu);
>   		} else if (vcpu->kvm->max_halt_poll_ns) {
> -			if (block_ns <= vcpu->halt_poll_ns)
> +			if (halt_ns <= vcpu->halt_poll_ns)
>   				;
>   			/* we had a long block, shrink polling */
>   			else if (vcpu->halt_poll_ns &&
> -					block_ns > vcpu->kvm->max_halt_poll_ns)
> +				 halt_ns > vcpu->kvm->max_halt_poll_ns)
>   				shrink_halt_poll_ns(vcpu);
>   			/* we had a short halt and our poll time is too small */
>   			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> -					block_ns < vcpu->kvm->max_halt_poll_ns)
> +				 halt_ns < vcpu->kvm->max_halt_poll_ns)
>   				grow_halt_poll_ns(vcpu);
>   		} else {
>   			vcpu->halt_poll_ns = 0;
>   		}
>   	}
>   
> -	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
> +	trace_kvm_vcpu_wakeup(halt_ns, waited, vcpu_valid_wakeup(vcpu));
>   }
> -EXPORT_SYMBOL_GPL(kvm_vcpu_block);
> +EXPORT_SYMBOL_GPL(kvm_vcpu_halt);
>   
>   bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
>   {
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
                   ` (13 preceding siblings ...)
  2021-09-25  0:55 ` [PATCH 14/14] KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states Sean Christopherson
@ 2021-09-27  7:22 ` Christian Borntraeger
  2021-09-27 14:59   ` Sean Christopherson
  14 siblings, 1 reply; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-27  7:22 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini, David Matlack, Jon Cargille,
	Jim Mattson
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-arm-kernel,
	kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel, Jing Zhang,
	Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Janosch Frank

While looking into this series,

I realized that Davids patch

commit acd05785e48c01edb2c4f4d014d28478b5f19fb5
Author:     David Matlack <dmatlack@google.com>
AuthorDate: Fri Apr 17 15:14:46 2020 -0700
Commit:     Paolo Bonzini <pbonzini@redhat.com>
CommitDate: Fri Apr 24 12:53:17 2020 -0400

     kvm: add capability for halt polling

broke the possibility for an admin to disable halt polling for already running KVM guests.
In past times doing
echo 0 > /sys/module/kvm/parameters/halt_poll_ns

stopped polling system wide.
Now all KVM guests will use the halt_poll_ns value that was active during startup - even those that do not use KVM_CAP_HALT_POLL.

I guess this was not intended?

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt()
  2021-09-25  0:55 ` [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt() Sean Christopherson
@ 2021-09-27  7:41   ` Christian Borntraeger
  2021-09-28 22:03   ` David Matlack
  1 sibling, 0 replies; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-27  7:41 UTC (permalink / raw)
  To: Sean Christopherson, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank,
	Paolo Bonzini
  Cc: James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Jim Mattson, Joerg Roedel,
	linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel,
	David Matlack, Jing Zhang



Am 25.09.21 um 02:55 schrieb Sean Christopherson:
> Factor out the "block" part of kvm_vcpu_halt() so that x86 can emulate
> non-halt wait/sleep/block conditions that should not be subjected to
> halt-polling.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>

> ---
>   include/linux/kvm_host.h |  1 +
>   virt/kvm/kvm_main.c      | 50 ++++++++++++++++++++++++++++------------
>   2 files changed, 36 insertions(+), 15 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d2a8be3fb9ba..655c2b24db2d 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -966,6 +966,7 @@ void kvm_sigset_activate(struct kvm_vcpu *vcpu);
>   void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
>   
>   void kvm_vcpu_halt(struct kvm_vcpu *vcpu);
> +bool kvm_vcpu_block(struct kvm_vcpu *vcpu);
>   void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu);
>   void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu);
>   bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 280cf1dca7db..fe34457530c2 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3199,6 +3199,34 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>   	return ret;
>   }
>   
> +/*
> + * Block the vCPU until the vCPU is runnable, an event arrives, or a signal is
> + * pending.  This is mostly used when halting a vCPU, but may also be used
> + * directly for other vCPU non-runnable states, e.g. x86's Wait-For-SIPI.
> + */
> +bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
> +{
> +	bool waited = false;
> +
> +	kvm_arch_vcpu_blocking(vcpu);
> +
> +	prepare_to_rcuwait(&vcpu->wait);
> +	for (;;) {
> +		set_current_state(TASK_INTERRUPTIBLE);
> +
> +		if (kvm_vcpu_check_block(vcpu) < 0)
> +			break;
> +
> +		waited = true;
> +		schedule();
> +	}
> +	finish_rcuwait(&vcpu->wait);
> +
> +	kvm_arch_vcpu_unblocking(vcpu);
> +
> +	return waited;
> +}
> +
>   static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
>   					  ktime_t end, bool success)
>   {
> @@ -3221,6 +3249,12 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
>   	}
>   }
>   
> +/*
> + * Emulate a vCPU halt condition, e.g. HLT on x86, WFI on arm, etc...  If halt
> + * polling is enabled, busy wait for a short time before blocking to avoid the
> + * expensive block+unblock sequence if a wake event arrives soon after the vCPU
> + * is halted.
> + */
>   void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>   {
>   	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
> @@ -3245,21 +3279,7 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>   		} while (kvm_vcpu_can_poll(cur, stop));
>   	}
>   
> -	kvm_arch_vcpu_blocking(vcpu);
> -
> -	prepare_to_rcuwait(&vcpu->wait);
> -	for (;;) {
> -		set_current_state(TASK_INTERRUPTIBLE);
> -
> -		if (kvm_vcpu_check_block(vcpu) < 0)
> -			break;
> -
> -		waited = true;
> -		schedule();
> -	}
> -	finish_rcuwait(&vcpu->wait);
> -
> -	kvm_arch_vcpu_unblocking(vcpu);
> +	waited = kvm_vcpu_block(vcpu);
>   
>   	cur = ktime_get();
>   	if (waited) {
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27  7:22 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Christian Borntraeger
@ 2021-09-27 14:59   ` Sean Christopherson
  2021-09-27 15:03     ` Paolo Bonzini
  0 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-27 14:59 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Paolo Bonzini, David Matlack, Jon Cargille, Jim Mattson,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-arm-kernel,
	kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel, Jing Zhang,
	Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Janosch Frank

On Mon, Sep 27, 2021, Christian Borntraeger wrote:
> While looking into this series,
> 
> I realized that Davids patch
> 
> commit acd05785e48c01edb2c4f4d014d28478b5f19fb5
> Author:     David Matlack <dmatlack@google.com>
> AuthorDate: Fri Apr 17 15:14:46 2020 -0700
> Commit:     Paolo Bonzini <pbonzini@redhat.com>
> CommitDate: Fri Apr 24 12:53:17 2020 -0400
> 
>     kvm: add capability for halt polling
> 
> broke the possibility for an admin to disable halt polling for already running KVM guests.
> In past times doing
> echo 0 > /sys/module/kvm/parameters/halt_poll_ns
> 
> stopped polling system wide.
> Now all KVM guests will use the halt_poll_ns value that was active during
> startup - even those that do not use KVM_CAP_HALT_POLL.
> 
> I guess this was not intended?

Ouch.  I would go so far as to say that halt_poll_ns should be a hard limit on
the capability.  What about having the per-VM variable track only the capability,
and then use the module param to cap the max when doing adjustments?  E.g. add
a variant of this early in the series?

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 80f78daa6b8d..f50e4e31a0cf 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1078,8 +1078,6 @@ static struct kvm *kvm_create_vm(unsigned long type)
                        goto out_err_no_arch_destroy_vm;
        }

-       kvm->max_halt_poll_ns = halt_poll_ns;
-
        r = kvm_arch_init_vm(kvm, type);
        if (r)
                goto out_err_no_arch_destroy_vm;
@@ -3136,7 +3134,8 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
        sigemptyset(&current->real_blocked);
 }

-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu,
+                             unsigned int max_halt_poll_ns)
 {
        unsigned int old, val, grow, grow_start;

@@ -3150,8 +3149,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
        if (val < grow_start)
                val = grow_start;

-       if (val > vcpu->kvm->max_halt_poll_ns)
-               val = vcpu->kvm->max_halt_poll_ns;
+       if (val > max_halt_poll_ns)
+               val = max_halt_poll_ns;

        vcpu->halt_poll_ns = val;
 out:
@@ -3261,6 +3260,7 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
        bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
        bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+       unsigned int max_halt_poll_ns;
        ktime_t start, cur, poll_end;
        bool waited = false;
        u64 halt_ns;
@@ -3304,19 +3304,25 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
                update_halt_poll_stats(vcpu, start, poll_end, !waited);

        if (halt_poll_allowed) {
+               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
+               if (max_halt_poll_ns)
+                       max_halt_poll_ns = min(max_halt_poll_ns, halt_poll_ns);
+               else
+                       max_halt_poll_ns = halt_poll_ns;
+
                if (!vcpu_valid_wakeup(vcpu)) {
                        shrink_halt_poll_ns(vcpu);
-               } else if (vcpu->kvm->max_halt_poll_ns) {
+               } else if (max_halt_poll_ns) {
                        if (halt_ns <= vcpu->halt_poll_ns)
                                ;
                        /* we had a long block, shrink polling */
                        else if (vcpu->halt_poll_ns &&
-                                halt_ns > vcpu->kvm->max_halt_poll_ns)
+                                halt_ns > max_halt_poll_ns)
                                shrink_halt_poll_ns(vcpu);
                        /* we had a short halt and our poll time is too small */
-                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-                                halt_ns < vcpu->kvm->max_halt_poll_ns)
-                               grow_halt_poll_ns(vcpu);
+                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+                                halt_ns < max_halt_poll_ns)
+                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
                } else {
                        vcpu->halt_poll_ns = 0;
                }

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 14:59   ` Sean Christopherson
@ 2021-09-27 15:03     ` Paolo Bonzini
  2021-09-27 15:15       ` Sean Christopherson
  2021-09-27 15:16       ` Christian Borntraeger
  0 siblings, 2 replies; 50+ messages in thread
From: Paolo Bonzini @ 2021-09-27 15:03 UTC (permalink / raw)
  To: Sean Christopherson, Christian Borntraeger
  Cc: David Matlack, Jon Cargille, Jim Mattson, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, Jing Zhang, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank

On 27/09/21 16:59, Sean Christopherson wrote:
>> commit acd05785e48c01edb2c4f4d014d28478b5f19fb5
>> Author:     David Matlack<dmatlack@google.com>
>> AuthorDate: Fri Apr 17 15:14:46 2020 -0700
>> Commit:     Paolo Bonzini<pbonzini@redhat.com>
>> CommitDate: Fri Apr 24 12:53:17 2020 -0400
>> 
>>      kvm: add capability for halt polling
>> 
>> broke the possibility for an admin to disable halt polling for already running KVM guests.
>> In past times doing
>> echo 0 > /sys/module/kvm/parameters/halt_poll_ns
>> 
>> stopped polling system wide.
>> Now all KVM guests will use the halt_poll_ns value that was active during
>> startup - even those that do not use KVM_CAP_HALT_POLL.
>> 
>> I guess this was not intended?

No, but...

> I would go so far as to say that halt_poll_ns should be a hard limit on
> the capability

... this would not be a good idea I think.  Anything that wants to do a 
lot of polling can just do "for (;;)".

So I think there are two possibilities that makes sense:

* track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns 
follow that

* just make halt_poll_ns read-only.

Paolo


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 15:03     ` Paolo Bonzini
@ 2021-09-27 15:15       ` Sean Christopherson
  2021-09-27 15:16       ` Christian Borntraeger
  1 sibling, 0 replies; 50+ messages in thread
From: Sean Christopherson @ 2021-09-27 15:15 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Christian Borntraeger, David Matlack, Jon Cargille, Jim Mattson,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-arm-kernel,
	kvmarm, linux-mips, kvm, kvm-ppc, linux-kernel, Jing Zhang,
	Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Janosch Frank

On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> On 27/09/21 16:59, Sean Christopherson wrote:
> > > commit acd05785e48c01edb2c4f4d014d28478b5f19fb5
> > > Author:     David Matlack<dmatlack@google.com>
> > > AuthorDate: Fri Apr 17 15:14:46 2020 -0700
> > > Commit:     Paolo Bonzini<pbonzini@redhat.com>
> > > CommitDate: Fri Apr 24 12:53:17 2020 -0400
> > > 
> > >      kvm: add capability for halt polling
> > > 
> > > broke the possibility for an admin to disable halt polling for already running KVM guests.
> > > In past times doing
> > > echo 0 > /sys/module/kvm/parameters/halt_poll_ns
> > > 
> > > stopped polling system wide.
> > > Now all KVM guests will use the halt_poll_ns value that was active during
> > > startup - even those that do not use KVM_CAP_HALT_POLL.
> > > 
> > > I guess this was not intended?
> 
> No, but...
> 
> > I would go so far as to say that halt_poll_ns should be a hard limit on
> > the capability
> 
> ... this would not be a good idea I think.  Anything that wants to do a lot
> of polling can just do "for (;;)".

Hmm, true, there is no danger to the system in having the capability override the
module param.

> So I think there are two possibilities that makes sense:
> 
> * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns
> follow that

I think this option makes more sense, making halt_poll_ns read-only is basically
forcing users to switch to KVM_CAP_HALT_POLL.

> * just make halt_poll_ns read-only.
> 
> Paolo
>

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 15:03     ` Paolo Bonzini
  2021-09-27 15:15       ` Sean Christopherson
@ 2021-09-27 15:16       ` Christian Borntraeger
  2021-09-27 16:58         ` David Matlack
  2021-09-27 17:24         ` Paolo Bonzini
  1 sibling, 2 replies; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-27 15:16 UTC (permalink / raw)
  To: Paolo Bonzini, Sean Christopherson
  Cc: David Matlack, Jon Cargille, Jim Mattson, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips, kvm, kvm-ppc,
	linux-kernel, Jing Zhang, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank



Am 27.09.21 um 17:03 schrieb Paolo Bonzini:
> On 27/09/21 16:59, Sean Christopherson wrote:
>>> commit acd05785e48c01edb2c4f4d014d28478b5f19fb5
>>> Author:     David Matlack<dmatlack@google.com>
>>> AuthorDate: Fri Apr 17 15:14:46 2020 -0700
>>> Commit:     Paolo Bonzini<pbonzini@redhat.com>
>>> CommitDate: Fri Apr 24 12:53:17 2020 -0400
>>>
>>>      kvm: add capability for halt polling
>>>
>>> broke the possibility for an admin to disable halt polling for already running KVM guests.
>>> In past times doing
>>> echo 0 > /sys/module/kvm/parameters/halt_poll_ns
>>>
>>> stopped polling system wide.
>>> Now all KVM guests will use the halt_poll_ns value that was active during
>>> startup - even those that do not use KVM_CAP_HALT_POLL.
>>>
>>> I guess this was not intended?
> 
> No, but...
> 
>> I would go so far as to say that halt_poll_ns should be a hard limit on
>> the capability
> 
> ... this would not be a good idea I think.  Anything that wants to do a lot of polling can just do "for (;;)".
> 
> So I think there are two possibilities that makes sense:
> 
> * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that

what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> 
> * just make halt_poll_ns read-only.
> 
> Paolo
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 15:16       ` Christian Borntraeger
@ 2021-09-27 16:58         ` David Matlack
  2021-09-29  6:56           ` Christian Borntraeger
  2021-09-27 17:24         ` Paolo Bonzini
  1 sibling, 1 reply; 50+ messages in thread
From: David Matlack @ 2021-09-27 16:58 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Paolo Bonzini, Sean Christopherson, Jon Cargille, Jim Mattson,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-arm-kernel,
	KVMARM, LinuxMIPS, kvm list, KVMPPC, LKML, Jing Zhang,
	Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Janosch Frank

On Mon, Sep 27, 2021 at 8:17 AM Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
>
>
>
> Am 27.09.21 um 17:03 schrieb Paolo Bonzini:
> > On 27/09/21 16:59, Sean Christopherson wrote:
> >>> commit acd05785e48c01edb2c4f4d014d28478b5f19fb5
> >>> Author:     David Matlack<dmatlack@google.com>
> >>> AuthorDate: Fri Apr 17 15:14:46 2020 -0700
> >>> Commit:     Paolo Bonzini<pbonzini@redhat.com>
> >>> CommitDate: Fri Apr 24 12:53:17 2020 -0400
> >>>
> >>>      kvm: add capability for halt polling
> >>>
> >>> broke the possibility for an admin to disable halt polling for already running KVM guests.
> >>> In past times doing
> >>> echo 0 > /sys/module/kvm/parameters/halt_poll_ns
> >>>
> >>> stopped polling system wide.
> >>> Now all KVM guests will use the halt_poll_ns value that was active during
> >>> startup - even those that do not use KVM_CAP_HALT_POLL.
> >>>
> >>> I guess this was not intended?
> >
> > No, but...
> >
> >> I would go so far as to say that halt_poll_ns should be a hard limit on
> >> the capability
> >
> > ... this would not be a good idea I think.  Anything that wants to do a lot of polling can just do "for (;;)".

I agree. It would also be a maintenance burden and subtle "gotcha" to
have to increase halt_poll_ns anytime one wants to increase
KVM_CAP_HALT_POLL.

> >
> > So I think there are two possibilities that makes sense:
> >
> > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
>
> what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.

None of these options would cover Christian's original use-case
though. (Write to module to disable halt-polling system-wide.)

What about adding a writable "enable_halt_polling" module parameter
that affects all VMs? Once that is in place we could also consider
getting rid of halt_poll_ns entirely.

> >
> > * just make halt_poll_ns read-only.
> >
> > Paolo
> >

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 15:16       ` Christian Borntraeger
  2021-09-27 16:58         ` David Matlack
@ 2021-09-27 17:24         ` Paolo Bonzini
  2021-09-27 17:33           ` Sean Christopherson
  1 sibling, 1 reply; 50+ messages in thread
From: Paolo Bonzini @ 2021-09-27 17:24 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Sean Christopherson, David Matlack, Jon Cargille, Jim Mattson,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-arm-kernel,
	KVM ARM, open list:MIPS, kvm, kvm-ppc, Kernel Mailing List,
	Linux, Jing Zhang, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank

On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
> > So I think there are two possibilities that makes sense:
> >
> > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
>
> what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.

Yes, that's what I meant.  David pointed out that doesn't allow you to
disable halt polling altogether, but for that you can always ask each
VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
don't know about Google's usecase, but mine was actually more about
using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).

Paolo


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-26  9:02       ` Marc Zyngier
@ 2021-09-27 17:28         ` Sean Christopherson
  2021-09-28  9:24           ` Marc Zyngier
  0 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-27 17:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Paolo Bonzini, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, David Matlack, Jing Zhang

On Sun, Sep 26, 2021, Marc Zyngier wrote:
> On Sun, 26 Sep 2021 07:27:28 +0100,
> Paolo Bonzini <pbonzini@redhat.com> wrote:
> > 
> > On 25/09/21 11:50, Marc Zyngier wrote:
> > >> there is no need for arm64 to put/load
> > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > 
> > > This doesn't mean that there is no requirement for any state
> > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > resync is a correctness requirement.

Ah crud, I didn't blame that code beforehand, I simply assumed
kvm_arch_vcpu_blocking() was purely for the blocking/schedule() sequence.  The
comment in arm64's kvm_arch_vcpu_blocking() about kvm_arch_vcpu_runnable() makes
more sense now too.

> > I wouldn't even say it's crucial for performance: halt polling cannot
> > work and is a waste of time without (the current implementation of)
> > put/load.
> 
> Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> WFI (which is the only reason we end-up on this path). Only LPIs (and
> SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> still follow the standard SW injection model.
> 
> However, there is still the ICH_VMCR_EL2 requirement (to get the
> up-to-date priority mask and group enable bits) for SW-injected
> interrupt wake-up to work correctly, and I really don't want to save
> that one eagerly on each shallow exit.

IIUC, VMCR is resident in hardware while the guest is running, and KVM needs to
retrieve the VMCR when processing interrupts to determine if a interrupt is above
the priority threshold.  If that's the case, then IMO handling the VMCR via an
arch hook is unnecessarily fragile, e.g. any generic call that leads to
kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a guest
register.  A better approach for VMCR would be to retrieve the value from
hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
impossible to have bugs where KVM is working with a stale VMCR, e.g.

diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
index 48c6067fc5ec..0784de0c4080 100644
--- a/arch/arm64/kvm/vgic/vgic-mmio.c
+++ b/arch/arm64/kvm/vgic/vgic-mmio.c
@@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
 
 void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
 {
+       if (!vcpu->...->vmcr_available) {
+               preempt_disable();
+               kvm_vgic_vmcr_sync(vcpu);
+               preempt_enable();
+               vcpu->...->vmcr_available = true;
+       }
+
        if (kvm_vgic_global_state.type == VGIC_V2)
                vgic_v2_get_vmcr(vcpu, vmcr);
        else


Regarding vGIC v4, does KVM require it to be resident in hardware while the vCPU
is loaded?  If not, then we could do something like this, which would eliminate
the arch hooks entirely if the VMCR is handled as above.

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fe102cd2e518..efc862c4d802 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -365,31 +365,6 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
        return kvm_timer_is_pending(vcpu);
 }

-void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
-{
-       /*
-        * If we're about to block (most likely because we've just hit a
-        * WFI), we need to sync back the state of the GIC CPU interface
-        * so that we have the latest PMR and group enables. This ensures
-        * that kvm_arch_vcpu_runnable has up-to-date data to decide
-        * whether we have pending interrupts.
-        *
-        * For the same reason, we want to tell GICv4 that we need
-        * doorbells to be signalled, should an interrupt become pending.
-        */
-       preempt_disable();
-       kvm_vgic_vmcr_sync(vcpu);
-       vgic_v4_put(vcpu, true);
-       preempt_enable();
-}
-
-void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
-{
-       preempt_disable();
-       vgic_v4_load(vcpu);
-       preempt_enable();
-}
-
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
        struct kvm_s2_mmu *mmu;
@@ -697,7 +672,6 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
                        /* The distributor enable bits were changed */
                        preempt_disable();
                        vgic_v4_put(vcpu, false);
-                       vgic_v4_load(vcpu);
                        preempt_enable();
                }

@@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
                 */
                preempt_disable();

+               /*
+                * Reload vGIC v4 if necessary, as it may be put on-demand so
+                * that KVM can detect directly injected interrupts, e.g. when
+                * determining if the vCPU is runnable due to a pending event.
+                */
+               vgic_v4_load(vcpu);
+
                kvm_pmu_flush_hwstate(vcpu);

                local_irq_disable();
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 5dad4996cfb2..3ef360a20a22 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -969,6 +969,16 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)

        vgic_get_vmcr(vcpu, &vmcr);

+       /*
+        * Tell GICv4 that we need doorbells to be signalled, should an
+        * interrupt become pending.
+        */
+       if (vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vpe_resident) {
+               preempt_disable();
+               vgic_v4_put(vcpu, true);
+               preempt_enable();
+       }
+
        raw_spin_lock_irqsave(&vgic_cpu->ap_list_lock, flags);

        list_for_each_entry(irq, &vgic_cpu->ap_list_head, ap_list) {

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 17:24         ` Paolo Bonzini
@ 2021-09-27 17:33           ` Sean Christopherson
  2022-11-15  3:28             ` wangyanan (Y)
  0 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-27 17:33 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Christian Borntraeger, David Matlack, Jon Cargille, Jim Mattson,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-arm-kernel,
	KVM ARM, open list:MIPS, kvm, kvm-ppc, Kernel Mailing List,
	Linux, Jing Zhang, Marc Zyngier, Huacai Chen,
	Aleksandar Markovic, Paul Mackerras, Janosch Frank

On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
> <borntraeger@de.ibm.com> wrote:
> > > So I think there are two possibilities that makes sense:
> > >
> > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
> >
> > what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> 
> Yes, that's what I meant.  David pointed out that doesn't allow you to
> disable halt polling altogether, but for that you can always ask each
> VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
> don't know about Google's usecase, but mine was actually more about
> using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).

I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
in-the-field mitigation if halt-polling is broken.  It'd be trivial to support, e.g.

@@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
                update_halt_poll_stats(vcpu, start, poll_end, !waited);

        if (halt_poll_allowed) {
+               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
+               if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
+                       max_halt_poll_ns = halt_poll_ns;
+
                if (!vcpu_valid_wakeup(vcpu)) {
                        shrink_halt_poll_ns(vcpu);
-               } else if (vcpu->kvm->max_halt_poll_ns) {
+               } else if (max_halt_poll_ns) {
                        if (halt_ns <= vcpu->halt_poll_ns)
                                ;
                        /* we had a long block, shrink polling */
                        else if (vcpu->halt_poll_ns &&
-                                halt_ns > vcpu->kvm->max_halt_poll_ns)
+                                halt_ns > max_halt_poll_ns)
                                shrink_halt_poll_ns(vcpu);
                        /* we had a short halt and our poll time is too small */
-                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-                                halt_ns < vcpu->kvm->max_halt_poll_ns)
-                               grow_halt_poll_ns(vcpu);
+                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+                                halt_ns < max_halt_poll_ns)
+                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
                } else {
                        vcpu->halt_poll_ns = 0;
                }

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-27 17:28         ` Sean Christopherson
@ 2021-09-28  9:24           ` Marc Zyngier
  2021-09-28 16:21             ` Sean Christopherson
  0 siblings, 1 reply; 50+ messages in thread
From: Marc Zyngier @ 2021-09-28  9:24 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, David Matlack, Jing Zhang

On Mon, 27 Sep 2021 18:28:14 +0100,
Sean Christopherson <seanjc@google.com> wrote:
> 
> On Sun, Sep 26, 2021, Marc Zyngier wrote:
> > On Sun, 26 Sep 2021 07:27:28 +0100,
> > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > 
> > > On 25/09/21 11:50, Marc Zyngier wrote:
> > > >> there is no need for arm64 to put/load
> > > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > > 
> > > > This doesn't mean that there is no requirement for any state
> > > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > > resync is a correctness requirement.
> 
> Ah crud, I didn't blame that code beforehand, I simply assumed
> kvm_arch_vcpu_blocking() was purely for the blocking/schedule()
> sequence.  The comment in arm64's kvm_arch_vcpu_blocking() about
> kvm_arch_vcpu_runnable() makes more sense now too.
> 
> > > I wouldn't even say it's crucial for performance: halt polling cannot
> > > work and is a waste of time without (the current implementation of)
> > > put/load.
> > 
> > Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> > WFI (which is the only reason we end-up on this path). Only LPIs (and
> > SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> > still follow the standard SW injection model.
> > 
> > However, there is still the ICH_VMCR_EL2 requirement (to get the
> > up-to-date priority mask and group enable bits) for SW-injected
> > interrupt wake-up to work correctly, and I really don't want to save
> > that one eagerly on each shallow exit.
> 
> IIUC, VMCR is resident in hardware while the guest is running, and
> KVM needs to retrieve the VMCR when processing interrupts to
> determine if a interrupt is above the priority threshold.  If that's
> the case, then IMO handling the VMCR via an arch hook is
> unnecessarily fragile, e.g. any generic call that leads to
> kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a
> guest register.

Not quite. We only need to retrieve the VMCR if we are in a situation
where we need to trigger a wake-up from WFI at the point where we have
not done a vcpu_put() yet. All the other cases where the interrupt is
injected are managed by the HW. And the only case where
kvm_arch_vcpu_runnable() gets called is when blocking.

I also don't get why a hook would be fragile, as long as it has well
defined semantics.

> A better approach for VMCR would be to retrieve the value from
> hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
> impossible to have bugs where KVM is working with a stale VMCR, e.g.
> 
> diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> index 48c6067fc5ec..0784de0c4080 100644
> --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> @@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
>  
>  void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
>  {
> +       if (!vcpu->...->vmcr_available) {
> +               preempt_disable();
> +               kvm_vgic_vmcr_sync(vcpu);
> +               preempt_enable();
> +               vcpu->...->vmcr_available = true;
> +       }
> +

But most of the uses of vgic_get_vmcr() are in contexts where the vcpu
isn't running at all (such as save/restore). It really only operates
on the shadow state, and what you have above will only lead to state
corruption.

>         if (kvm_vgic_global_state.type == VGIC_V2)
>                 vgic_v2_get_vmcr(vcpu, vmcr);
>         else
> 
> 
> Regarding vGIC v4, does KVM require it to be resident in hardware
> while the vCPU is loaded?

It is a requirement. Otherwise, we end-up with an inconsistent state
between the delivery of doorbells and the state of the vgic. Also,
reloading the GICv4 state can be pretty expensive (multiple MMIO
accesses), which is why we really don't want to do that on the hot
path (kvm_arch_vcpu_ioctl_run() *is* a hot path).

> If not, then we could do something like
> this, which would eliminate the arch hooks entirely if the VMCR is
> handled as above.
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..efc862c4d802 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -365,31 +365,6 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
>         return kvm_timer_is_pending(vcpu);
>  }
> 
> -void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
> -{
> -       /*
> -        * If we're about to block (most likely because we've just hit a
> -        * WFI), we need to sync back the state of the GIC CPU interface
> -        * so that we have the latest PMR and group enables. This ensures
> -        * that kvm_arch_vcpu_runnable has up-to-date data to decide
> -        * whether we have pending interrupts.
> -        *
> -        * For the same reason, we want to tell GICv4 that we need
> -        * doorbells to be signalled, should an interrupt become pending.
> -        */
> -       preempt_disable();
> -       kvm_vgic_vmcr_sync(vcpu);
> -       vgic_v4_put(vcpu, true);
> -       preempt_enable();
> -}
> -
> -void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
> -{
> -       preempt_disable();
> -       vgic_v4_load(vcpu);
> -       preempt_enable();
> -}
> -
>  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  {
>         struct kvm_s2_mmu *mmu;
> @@ -697,7 +672,6 @@ static void check_vcpu_requests(struct kvm_vcpu *vcpu)
>                         /* The distributor enable bits were changed */
>                         preempt_disable();
>                         vgic_v4_put(vcpu, false);
> -                       vgic_v4_load(vcpu);
>                         preempt_enable();
>                 }
> 
> @@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>                  */
>                 preempt_disable();
> 
> +               /*
> +                * Reload vGIC v4 if necessary, as it may be put on-demand so
> +                * that KVM can detect directly injected interrupts, e.g. when
> +                * determining if the vCPU is runnable due to a pending event.
> +                */
> +               vgic_v4_load(vcpu);

You'd need to detect that a previous put has been done. But overall,
it puts the complexity at the wrong place. WFI (aka kvm_vcpu_block) is
the place where we want to handle this synchronisation, and not the
run loop.

Instead of having a well defined interface with the blocking code
where we implement the required synchronisation, you spray the vgic
crap all over, and it becomes much harder to reason about it. Guess
what, I'm not keen on it.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-28  9:24           ` Marc Zyngier
@ 2021-09-28 16:21             ` Sean Christopherson
  2021-09-30  9:36               ` Marc Zyngier
  0 siblings, 1 reply; 50+ messages in thread
From: Sean Christopherson @ 2021-09-28 16:21 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Paolo Bonzini, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, David Matlack, Jing Zhang

On Tue, Sep 28, 2021, Marc Zyngier wrote:
> On Mon, 27 Sep 2021 18:28:14 +0100,
> Sean Christopherson <seanjc@google.com> wrote:
> > 
> > On Sun, Sep 26, 2021, Marc Zyngier wrote:
> > > On Sun, 26 Sep 2021 07:27:28 +0100,
> > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > 
> > > > On 25/09/21 11:50, Marc Zyngier wrote:
> > > > >> there is no need for arm64 to put/load
> > > > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > > > 
> > > > > This doesn't mean that there is no requirement for any state
> > > > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > > > resync is a correctness requirement.
> > 
> > Ah crud, I didn't blame that code beforehand, I simply assumed
> > kvm_arch_vcpu_blocking() was purely for the blocking/schedule()
> > sequence.  The comment in arm64's kvm_arch_vcpu_blocking() about
> > kvm_arch_vcpu_runnable() makes more sense now too.
> > 
> > > > I wouldn't even say it's crucial for performance: halt polling cannot
> > > > work and is a waste of time without (the current implementation of)
> > > > put/load.
> > > 
> > > Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> > > WFI (which is the only reason we end-up on this path). Only LPIs (and
> > > SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> > > still follow the standard SW injection model.
> > > 
> > > However, there is still the ICH_VMCR_EL2 requirement (to get the
> > > up-to-date priority mask and group enable bits) for SW-injected
> > > interrupt wake-up to work correctly, and I really don't want to save
> > > that one eagerly on each shallow exit.
> > 
> > IIUC, VMCR is resident in hardware while the guest is running, and
> > KVM needs to retrieve the VMCR when processing interrupts to
> > determine if a interrupt is above the priority threshold.  If that's
> > the case, then IMO handling the VMCR via an arch hook is
> > unnecessarily fragile, e.g. any generic call that leads to
> > kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a
> > guest register.
> 
> Not quite. We only need to retrieve the VMCR if we are in a situation
> where we need to trigger a wake-up from WFI at the point where we have
> not done a vcpu_put() yet. All the other cases where the interrupt is
> injected are managed by the HW. And the only case where
> kvm_arch_vcpu_runnable() gets called is when blocking.
> 
> I also don't get why a hook would be fragile, as long as it has well
> defined semantics.

Generic KVM should not have to know that a seemingly benign arch hook,
kvm_arch_vcpu_runnable(), cannot be safely called without first calling another
arch hook.  E.g. I suspect there's a (benign?) race in kvm_vcpu_on_spin().  If
the loop is delayed between checking rcuwait_active() and vcpu_dy_runnable(),
and the target vCPU is awakened during that period, KVM can call
kvm_arch_vcpu_runnable() while the vCPU is running.

It's kind of a counter-example to my below suggestion as putting the vGIC would
indeed lead to state corruption if the vCPU is running, but I would argue that
arm64 should override kvm_arch_dy_runnable() so that its correctness is guaranteed,
e.g. by not calling kvm_arch_vcpu_runnable() if the vCPU is already running.

> > A better approach for VMCR would be to retrieve the value from
> > hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
> > impossible to have bugs where KVM is working with a stale VMCR, e.g.
> > 
> > diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> > index 48c6067fc5ec..0784de0c4080 100644
> > --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> > +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> > @@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> >  
> >  void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> >  {
> > +       if (!vcpu->...->vmcr_available) {
> > +               preempt_disable();
> > +               kvm_vgic_vmcr_sync(vcpu);
> > +               preempt_enable();
> > +               vcpu->...->vmcr_available = true;
> > +       }
> > +
> 
> But most of the uses of vgic_get_vmcr() are in contexts where the vcpu
> isn't running at all (such as save/restore). It really only operates
> on the shadow state, and what you have above will only lead to state
> corruption.

Ignoring the kvm_arch_dy_runnable() case for the moment, how would it lead to
corruption?  The idea is that the 'vmcr_available' flag would be cleared when the
vCPU is run, i.e. it tracks whether or not the shadow state may be stale.

> >         if (kvm_vgic_global_state.type == VGIC_V2)
> >                 vgic_v2_get_vmcr(vcpu, vmcr);
> >         else
> > 
> > 
> > Regarding vGIC v4, does KVM require it to be resident in hardware
> > while the vCPU is loaded?
> 
> It is a requirement. Otherwise, we end-up with an inconsistent state
> between the delivery of doorbells and the state of the vgic.

For my own understanding, does KVM require it to be resident in hardware while
the vCPU is loaded but _not_ running?  What I don't fully understand is how KVM
can safely load/put the vCPU if that true, i.e. wouldn't there always be a window
for badness?

> Also, reloading the GICv4 state can be pretty expensive (multiple MMIO
> accesses), which is why we really don't want to do that on the hot path
> (kvm_arch_vcpu_ioctl_run() *is* a hot path).

I wasn't suggesting to reload GICv4 on every entry, it would only be reloaded
if it was put between vcpu_load() and entry to the guest.

> > If not, then we could do something like
> > this, which would eliminate the arch hooks entirely if the VMCR is
> > handled as above.

...

> > @@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> >                  */
> >                 preempt_disable();
> > 
> > +               /*
> > +                * Reload vGIC v4 if necessary, as it may be put on-demand so
> > +                * that KVM can detect directly injected interrupts, e.g. when
> > +                * determining if the vCPU is runnable due to a pending event.
> > +                */
> > +               vgic_v4_load(vcpu);
> 
> You'd need to detect that a previous put has been done.

Not that it will likely matter, but doesn't the its_vpe.resident check automatically
handle this?

> But overall, it puts the complexity at the wrong place. WFI (aka
> kvm_vcpu_block) is the place where we want to handle this synchronisation,
> and not the run loop.
> 
> Instead of having a well defined interface with the blocking code
> where we implement the required synchronisation, you spray the vgic
> crap all over, and it becomes much harder to reason about it. Guess
> what, I'm not keen on it.

My objection to the arch hooks is that, from generic KVM's perspective, the
direct dependency is not on blocking, it's on calling kvm_arch_vcpu_runnable().
That's why I suggested handling this by tracking whether or not the VMCR is
up-to-date/stale, as it allows generic KVM to safely call kvm_arch_vcpu_runnable()
whenever the vCPU is loaded.

I don't have a strong opinion on arm64 preferring the sync to be specific to
WFI, but if that's the case then IMO this should be handled fully in arm64, e.g.
a patch like so (or with a wrapper around the call to kvm_vcpu_block() if we
want to guard against future calls into generic KVM)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fe102cd2e518..312f3acd3ca3 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -367,27 +367,12 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)

 void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
 {
-       /*
-        * If we're about to block (most likely because we've just hit a
-        * WFI), we need to sync back the state of the GIC CPU interface
-        * so that we have the latest PMR and group enables. This ensures
-        * that kvm_arch_vcpu_runnable has up-to-date data to decide
-        * whether we have pending interrupts.
-        *
-        * For the same reason, we want to tell GICv4 that we need
-        * doorbells to be signalled, should an interrupt become pending.
-        */
-       preempt_disable();
-       kvm_vgic_vmcr_sync(vcpu);
-       vgic_v4_put(vcpu, true);
-       preempt_enable();
+
 }

 void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
 {
-       preempt_disable();
-       vgic_v4_load(vcpu);
-       preempt_enable();
+
 }

 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 275a27368a04..9870e824a27e 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -95,8 +95,28 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
        } else {
                trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
                vcpu->stat.wfi_exit_stat++;
+
+               /*
+                * Sync back the state of the GIC CPU interface so that we have
+                * the latest PMR and group enables. This ensures that
+                * kvm_arch_vcpu_runnable has up-to-date data to decide whether
+                * we have pending interrupts, e.g. when determining if the
+                * vCPU should block.
+                *
+                * For the same reason, we want to tell GICv4 that we need
+                * doorbells to be signalled, should an interrupt become pending.
+                */
+               preempt_disable();
+               kvm_vgic_vmcr_sync(vcpu);
+               vgic_v4_put(vcpu, true);
+               preempt_enable();
+
                kvm_vcpu_block(vcpu);
                kvm_clear_request(KVM_REQ_UNHALT, vcpu);
+
+               preempt_disable();
+               vgic_v4_load(vcpu);
+               preempt_enable();
        }

        kvm_incr_pc(vcpu);






^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH 02/14] KVM: Update halt-polling stats if and only if halt-polling was attempted
  2021-09-25  0:55 ` [PATCH 02/14] KVM: Update halt-polling stats if and only if halt-polling was attempted Sean Christopherson
@ 2021-09-28 18:57   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 18:57 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:16PM -0700, Sean Christopherson wrote:
> Don't update halt-polling stats if halt-polling wasn't attempted.  This
> is a nop as @poll_ns is guaranteed to be '0' (poll_end == start), but it
> will allow a future patch to move the histogram stats into the helper to
> resolve a discrepancy in what is considered a "successful" halt-poll.
> 
> No functional change intended.
> 
> Cc: David Matlack <dmatlack@google.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  virt/kvm/kvm_main.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 768a4cbb26a6..8b33f5045b4d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3214,6 +3214,7 @@ update_halt_poll_stats(struct kvm_vcpu *vcpu, u64 poll_ns, bool waited)
>  void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  {
>  	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
> +	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
>  	ktime_t start, cur, poll_end;
>  	bool waited = false;
>  	u64 block_ns;
> @@ -3221,7 +3222,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	kvm_arch_vcpu_blocking(vcpu);
>  
>  	start = cur = poll_end = ktime_get();
> -	if (vcpu->halt_poll_ns && halt_poll_allowed) {
> +	if (do_halt_poll) {
>  		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
>  
>  		++vcpu->stat.generic.halt_attempted_poll;
> @@ -3273,8 +3274,9 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	kvm_arch_vcpu_unblocking(vcpu);
>  	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>  
> -	update_halt_poll_stats(
> -		vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
> +	if (do_halt_poll)
> +		update_halt_poll_stats(
> +			vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
>  
>  	if (halt_poll_allowed) {
>  		if (!vcpu_valid_wakeup(vcpu)) {
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 03/14] KVM: Refactor and document halt-polling stats update helper
  2021-09-25  0:55 ` [PATCH 03/14] KVM: Refactor and document halt-polling stats update helper Sean Christopherson
@ 2021-09-28 19:01   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 19:01 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:17PM -0700, Sean Christopherson wrote:
> Add a comment to document that halt-polling is considered successful even
> if the polling loop itself didn't detect a wake event, i.e. if a wake
> event was detect in the final kvm_vcpu_check_block().  Invert the param
> to the update helper so that the helper is a dumb function that is "told"
> whether or not polling was successful, as opposed to having it determinine
> success/failure based on blocking behavior.
> 
> Opportunistically tweak the params to the update helper to reduce the
> line length for the call site so that it fits on a single line, and so
> that the prototype conforms to the more traditional kernel style.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  virt/kvm/kvm_main.c | 20 +++++++++++++-------
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8b33f5045b4d..12fe91a0a4c8 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3199,13 +3199,15 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>  	return ret;
>  }
>  
> -static inline void
> -update_halt_poll_stats(struct kvm_vcpu *vcpu, u64 poll_ns, bool waited)
> +static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
> +					  ktime_t end, bool success)
>  {
> -	if (waited)
> -		vcpu->stat.generic.halt_poll_fail_ns += poll_ns;
> -	else
> +	u64 poll_ns = ktime_to_ns(ktime_sub(end, start));
> +
> +	if (success)
>  		vcpu->stat.generic.halt_poll_success_ns += poll_ns;
> +	else
> +		vcpu->stat.generic.halt_poll_fail_ns += poll_ns;
>  }
>  
>  /*
> @@ -3274,9 +3276,13 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	kvm_arch_vcpu_unblocking(vcpu);
>  	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>  
> +	/*
> +	 * Note, halt-polling is considered successful so long as the vCPU was
> +	 * never actually scheduled out, i.e. even if the wake event arrived
> +	 * after of the halt-polling loop itself, but before the full wait.
> +	 */
>  	if (do_halt_poll)
> -		update_halt_poll_stats(
> -			vcpu, ktime_to_ns(ktime_sub(poll_end, start)), waited);
> +		update_halt_poll_stats(vcpu, start, poll_end, !waited);
>  
>  	if (halt_poll_allowed) {
>  		if (!vcpu_valid_wakeup(vcpu)) {
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 04/14] KVM: Reconcile discrepancies in halt-polling stats
  2021-09-25  0:55 ` [PATCH 04/14] KVM: Reconcile discrepancies in halt-polling stats Sean Christopherson
@ 2021-09-28 21:26   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 21:26 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:18PM -0700, Sean Christopherson wrote:
> Move the halt-polling "success" and histogram stats update into the
> dedicated helper to fix a discrepancy where the success/fail "time" stats
> consider polling successful so long as the wait is avoided, but the main
> "success" and histogram stats consider polling successful if and only if
> a wake event was detected by the halt-polling loop.
> 
> Move halt_attempted_poll to the helper as well so that all the stats are
> updated in a single location.  While it's a bit odd to update the stat
> well after the fact, practically speaking there's no meaningful advantage
> to updating before polling.
> 
> Note, there is a functional change in addition to the success vs. fail
> change.  The histogram updates previously called ktime_get() instead of
> using "cur".  But that change is desirable as it means all the stats are
> now updated with the same polling time, and avoids the extra ktime_get(),
> which isn't expensive but isn't free either.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  virt/kvm/kvm_main.c | 35 ++++++++++++++++-------------------
>  1 file changed, 16 insertions(+), 19 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 12fe91a0a4c8..2ba22b68af3b 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3202,12 +3202,23 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>  static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
>  					  ktime_t end, bool success)
>  {
> +	struct kvm_vcpu_stat_generic *stats = &vcpu->stat.generic;
>  	u64 poll_ns = ktime_to_ns(ktime_sub(end, start));
>  
> -	if (success)
> -		vcpu->stat.generic.halt_poll_success_ns += poll_ns;
> -	else
> -		vcpu->stat.generic.halt_poll_fail_ns += poll_ns;
> +	++vcpu->stat.generic.halt_attempted_poll;
> +
> +	if (success) {
> +		++vcpu->stat.generic.halt_successful_poll;
> +
> +		if (!vcpu_valid_wakeup(vcpu))
> +			++vcpu->stat.generic.halt_poll_invalid;
> +
> +		stats->halt_poll_success_ns += poll_ns;
> +		KVM_STATS_LOG_HIST_UPDATE(stats->halt_poll_success_hist, poll_ns);
> +	} else {
> +		stats->halt_poll_fail_ns += poll_ns;
> +		KVM_STATS_LOG_HIST_UPDATE(stats->halt_poll_fail_hist, poll_ns);
> +	}
>  }
>  
>  /*
> @@ -3227,30 +3238,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	if (do_halt_poll) {
>  		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
>  
> -		++vcpu->stat.generic.halt_attempted_poll;
>  		do {
>  			/*
>  			 * This sets KVM_REQ_UNHALT if an interrupt
>  			 * arrives.
>  			 */
> -			if (kvm_vcpu_check_block(vcpu) < 0) {
> -				++vcpu->stat.generic.halt_successful_poll;
> -				if (!vcpu_valid_wakeup(vcpu))
> -					++vcpu->stat.generic.halt_poll_invalid;
> -
> -				KVM_STATS_LOG_HIST_UPDATE(
> -				      vcpu->stat.generic.halt_poll_success_hist,
> -				      ktime_to_ns(ktime_get()) -
> -				      ktime_to_ns(start));
> +			if (kvm_vcpu_check_block(vcpu) < 0)
>  				goto out;
> -			}
>  			cpu_relax();
>  			poll_end = cur = ktime_get();
>  		} while (kvm_vcpu_can_poll(cur, stop));
> -
> -		KVM_STATS_LOG_HIST_UPDATE(
> -				vcpu->stat.generic.halt_poll_fail_hist,
> -				ktime_to_ns(ktime_get()) - ktime_to_ns(start));
>  	}
>  
>  
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish()
  2021-09-25  0:55 ` [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish() Sean Christopherson
  2021-09-27  6:58   ` Christian Borntraeger
@ 2021-09-28 21:28   ` David Matlack
  1 sibling, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 21:28 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:20PM -0700, Sean Christopherson wrote:
> Drop kvm_arch_vcpu_block_finish() now that all arch implementations are
> nops.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/arm64/include/asm/kvm_host.h   | 1 -
>  arch/mips/include/asm/kvm_host.h    | 1 -
>  arch/powerpc/include/asm/kvm_host.h | 1 -
>  arch/s390/include/asm/kvm_host.h    | 2 --
>  arch/s390/kvm/kvm-s390.c            | 5 -----
>  arch/x86/include/asm/kvm_host.h     | 2 --
>  virt/kvm/kvm_main.c                 | 1 -
>  7 files changed, 13 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f8be56d5342b..4e0ad0fff540 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -716,7 +716,6 @@ void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
>  static inline void kvm_arch_hardware_unsetup(void) {}
>  static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
>  void kvm_arm_init_debug(void);
>  void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu);
> diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
> index 696f6b009377..72b90d45a46e 100644
> --- a/arch/mips/include/asm/kvm_host.h
> +++ b/arch/mips/include/asm/kvm_host.h
> @@ -897,7 +897,6 @@ static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
>  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
>  #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLB
>  int kvm_arch_flush_remote_tlb(struct kvm *kvm);
> diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
> index 59cb38b04ede..8a84448d77a6 100644
> --- a/arch/powerpc/include/asm/kvm_host.h
> +++ b/arch/powerpc/include/asm/kvm_host.h
> @@ -864,6 +864,5 @@ static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
>  static inline void kvm_arch_exit(void) {}
>  static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
>  
>  #endif /* __POWERPC_KVM_HOST_H__ */
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index a604d51acfc8..a22c9266ea05 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -1010,6 +1010,4 @@ static inline void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
>  static inline void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu) {}
>  static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu) {}
>  
> -void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu);
> -
>  #endif
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 08ed68639a21..17fabb260c35 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -5080,11 +5080,6 @@ static inline unsigned long nonhyp_mask(int i)
>  	return 0x0000ffffffffffffUL >> (nonhyp_fai << 4);
>  }
>  
> -void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu)
> -{
> -
> -}
> -
>  static int __init kvm_s390_init(void)
>  {
>  	int i;
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 1e0e909b98b2..4e8c21083bdb 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1916,8 +1916,6 @@ static inline void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
>  	static_call_cond(kvm_x86_vcpu_unblocking)(vcpu);
>  }
>  
> -static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
> -
>  static inline int kvm_cpu_get_apicid(int mps_cpu)
>  {
>  #ifdef CONFIG_X86_LOCAL_APIC
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2ba22b68af3b..2015a1f532ce 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3301,7 +3301,6 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	}
>  
>  	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
> -	kvm_arch_vcpu_block_finish(vcpu);
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_block);
>  
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 08/14] KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt()
  2021-09-25  0:55 ` [PATCH 08/14] KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt() Sean Christopherson
@ 2021-09-28 21:59   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 21:59 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:22PM -0700, Sean Christopherson wrote:
> Rename a variety of HLT-related helpers to free up the function name
> "kvm_vcpu_halt" for future use in generic KVM code, e.g. to differentiate
> between "block" and "halt".
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/include/asm/kvm_host.h |  2 +-
>  arch/x86/kvm/vmx/nested.c       |  2 +-
>  arch/x86/kvm/vmx/vmx.c          |  4 ++--
>  arch/x86/kvm/x86.c              | 13 +++++++------
>  4 files changed, 11 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4e8c21083bdb..cfebef10b89c 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1679,7 +1679,7 @@ int kvm_emulate_monitor(struct kvm_vcpu *vcpu);
>  int kvm_fast_pio(struct kvm_vcpu *vcpu, int size, unsigned short port, int in);
>  int kvm_emulate_cpuid(struct kvm_vcpu *vcpu);
>  int kvm_emulate_halt(struct kvm_vcpu *vcpu);
> -int kvm_vcpu_halt(struct kvm_vcpu *vcpu);
> +int kvm_emulate_halt_noskip(struct kvm_vcpu *vcpu);
>  int kvm_emulate_ap_reset_hold(struct kvm_vcpu *vcpu);
>  int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
>  
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index eedcebf58004..f689e463b678 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3618,7 +3618,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
>  		    !(nested_cpu_has(vmcs12, CPU_BASED_INTR_WINDOW_EXITING) &&
>  		      (vmcs12->guest_rflags & X86_EFLAGS_IF))) {
>  			vmx->nested.nested_run_pending = 0;
> -			return kvm_vcpu_halt(vcpu);
> +			return kvm_emulate_halt_noskip(vcpu);
>  		}
>  		break;
>  	case GUEST_ACTIVITY_WAIT_SIPI:
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d118daed0530..858f5f1f1273 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4740,7 +4740,7 @@ static int handle_rmode_exception(struct kvm_vcpu *vcpu,
>  		if (kvm_emulate_instruction(vcpu, 0)) {
>  			if (vcpu->arch.halt_request) {
>  				vcpu->arch.halt_request = 0;
> -				return kvm_vcpu_halt(vcpu);
> +				return kvm_emulate_halt_noskip(vcpu);
>  			}
>  			return 1;
>  		}
> @@ -5414,7 +5414,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
>  
>  		if (vcpu->arch.halt_request) {
>  			vcpu->arch.halt_request = 0;
> -			return kvm_vcpu_halt(vcpu);
> +			return kvm_emulate_halt_noskip(vcpu);
>  		}
>  
>  		/*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b0c21d42f453..eade8a2bdccf 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8643,7 +8643,7 @@ void kvm_arch_exit(void)
>  #endif
>  }
>  
> -static int __kvm_vcpu_halt(struct kvm_vcpu *vcpu, int state, int reason)
> +static int __kvm_emulate_halt(struct kvm_vcpu *vcpu, int state, int reason)
>  {
>  	++vcpu->stat.halt_exits;
>  	if (lapic_in_kernel(vcpu)) {
> @@ -8655,11 +8655,11 @@ static int __kvm_vcpu_halt(struct kvm_vcpu *vcpu, int state, int reason)
>  	}
>  }
>  
> -int kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> +int kvm_emulate_halt_noskip(struct kvm_vcpu *vcpu)
>  {
> -	return __kvm_vcpu_halt(vcpu, KVM_MP_STATE_HALTED, KVM_EXIT_HLT);
> +	return __kvm_emulate_halt(vcpu, KVM_MP_STATE_HALTED, KVM_EXIT_HLT);
>  }
> -EXPORT_SYMBOL_GPL(kvm_vcpu_halt);
> +EXPORT_SYMBOL_GPL(kvm_emulate_halt_noskip);
>  
>  int kvm_emulate_halt(struct kvm_vcpu *vcpu)
>  {
> @@ -8668,7 +8668,7 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
>  	 * TODO: we might be squashing a GUESTDBG_SINGLESTEP-triggered
>  	 * KVM_EXIT_DEBUG here.
>  	 */
> -	return kvm_vcpu_halt(vcpu) && ret;
> +	return kvm_emulate_halt_noskip(vcpu) && ret;
>  }
>  EXPORT_SYMBOL_GPL(kvm_emulate_halt);
>  
> @@ -8676,7 +8676,8 @@ int kvm_emulate_ap_reset_hold(struct kvm_vcpu *vcpu)
>  {
>  	int ret = kvm_skip_emulated_instruction(vcpu);
>  
> -	return __kvm_vcpu_halt(vcpu, KVM_MP_STATE_AP_RESET_HOLD, KVM_EXIT_AP_RESET_HOLD) && ret;
> +	return __kvm_emulate_halt(vcpu, KVM_MP_STATE_AP_RESET_HOLD,
> +					KVM_EXIT_AP_RESET_HOLD) && ret;
>  }
>  EXPORT_SYMBOL_GPL(kvm_emulate_ap_reset_hold);
>  
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt()
  2021-09-25  0:55 ` [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Sean Christopherson
  2021-09-27  7:06   ` Christian Borntraeger
@ 2021-09-28 22:01   ` David Matlack
  1 sibling, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 22:01 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:23PM -0700, Sean Christopherson wrote:
> Rename kvm_vcpu_block() to kvm_vcpu_halt() in preparation for splitting
> the actual "block" sequences into a separate helper (to be named
> kvm_vcpu_block()).  x86 will use the standalone block-only path to handle
> non-halt cases where the vCPU is not runnable.
> 
> Rename block_ns to halt_ns to match the new function name.
> 
> Opportunistically move an x86-specific comment to x86, and enhance it, too.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/arm64/kvm/arch_timer.c       |  2 +-
>  arch/arm64/kvm/handle_exit.c      |  4 ++--
>  arch/arm64/kvm/psci.c             |  2 +-
>  arch/mips/kvm/emulate.c           |  2 +-
>  arch/powerpc/kvm/book3s_pr.c      |  2 +-
>  arch/powerpc/kvm/book3s_pr_papr.c |  2 +-
>  arch/powerpc/kvm/booke.c          |  2 +-
>  arch/powerpc/kvm/powerpc.c        |  2 +-
>  arch/s390/kvm/interrupt.c         |  2 +-
>  arch/x86/kvm/x86.c                | 11 +++++++++--
>  include/linux/kvm_host.h          |  2 +-
>  virt/kvm/kvm_main.c               | 20 +++++++++-----------
>  12 files changed, 29 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
> index 3df67c127489..7e8396f74010 100644
> --- a/arch/arm64/kvm/arch_timer.c
> +++ b/arch/arm64/kvm/arch_timer.c
> @@ -467,7 +467,7 @@ static void timer_save_state(struct arch_timer_context *ctx)
>  }
>  
>  /*
> - * Schedule the background timer before calling kvm_vcpu_block, so that this
> + * Schedule the background timer before calling kvm_vcpu_halt, so that this
>   * thread is removed from its waitqueue and made runnable when there's a timer
>   * interrupt to handle.
>   */
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 275a27368a04..08f823984712 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -82,7 +82,7 @@ static int handle_no_fpsimd(struct kvm_vcpu *vcpu)
>   *
>   * WFE: Yield the CPU and come back to this vcpu when the scheduler
>   * decides to.
> - * WFI: Simply call kvm_vcpu_block(), which will halt execution of
> + * WFI: Simply call kvm_vcpu_halt(), which will halt execution of
>   * world-switches and schedule other host processes until there is an
>   * incoming IRQ or FIQ to the VM.
>   */
> @@ -95,7 +95,7 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
>  	} else {
>  		trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
>  		vcpu->stat.wfi_exit_stat++;
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>  		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>  	}
>  
> diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
> index 74c47d420253..e275b2ca08b9 100644
> --- a/arch/arm64/kvm/psci.c
> +++ b/arch/arm64/kvm/psci.c
> @@ -46,7 +46,7 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
>  	 * specification (ARM DEN 0022A). This means all suspend states
>  	 * for KVM will preserve the register state.
>  	 */
> -	kvm_vcpu_block(vcpu);
> +	kvm_vcpu_halt(vcpu);
>  	kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>  
>  	return PSCI_RET_SUCCESS;
> diff --git a/arch/mips/kvm/emulate.c b/arch/mips/kvm/emulate.c
> index 22e745e49b0a..b494d8d39290 100644
> --- a/arch/mips/kvm/emulate.c
> +++ b/arch/mips/kvm/emulate.c
> @@ -952,7 +952,7 @@ enum emulation_result kvm_mips_emul_wait(struct kvm_vcpu *vcpu)
>  	if (!vcpu->arch.pending_exceptions) {
>  		kvm_vz_lose_htimer(vcpu);
>  		vcpu->arch.wait = 1;
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>  
>  		/*
>  		 * We we are runnable, then definitely go off to user space to
> diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
> index 6bc9425acb32..0ced1b16f0e5 100644
> --- a/arch/powerpc/kvm/book3s_pr.c
> +++ b/arch/powerpc/kvm/book3s_pr.c
> @@ -492,7 +492,7 @@ static void kvmppc_set_msr_pr(struct kvm_vcpu *vcpu, u64 msr)
>  
>  	if (msr & MSR_POW) {
>  		if (!vcpu->arch.pending_exceptions) {
> -			kvm_vcpu_block(vcpu);
> +			kvm_vcpu_halt(vcpu);
>  			kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>  			vcpu->stat.generic.halt_wakeup++;
>  
> diff --git a/arch/powerpc/kvm/book3s_pr_papr.c b/arch/powerpc/kvm/book3s_pr_papr.c
> index ac14239f3424..1f10e7dfcdd0 100644
> --- a/arch/powerpc/kvm/book3s_pr_papr.c
> +++ b/arch/powerpc/kvm/book3s_pr_papr.c
> @@ -376,7 +376,7 @@ int kvmppc_h_pr(struct kvm_vcpu *vcpu, unsigned long cmd)
>  		return kvmppc_h_pr_stuff_tce(vcpu);
>  	case H_CEDE:
>  		kvmppc_set_msr_fast(vcpu, kvmppc_get_msr(vcpu) | MSR_EE);
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>  		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>  		vcpu->stat.generic.halt_wakeup++;
>  		return EMULATE_DONE;
> diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
> index 977801c83aff..12abffa40cd9 100644
> --- a/arch/powerpc/kvm/booke.c
> +++ b/arch/powerpc/kvm/booke.c
> @@ -718,7 +718,7 @@ int kvmppc_core_prepare_to_enter(struct kvm_vcpu *vcpu)
>  
>  	if (vcpu->arch.shared->msr & MSR_WE) {
>  		local_irq_enable();
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>  		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>  		hard_irq_disable();
>  
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 8ab90ce8738f..565eed2dab81 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -236,7 +236,7 @@ int kvmppc_kvm_pv(struct kvm_vcpu *vcpu)
>  		break;
>  	case EV_HCALL_TOKEN(EV_IDLE):
>  		r = EV_SUCCESS;
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>  		kvm_clear_request(KVM_REQ_UNHALT, vcpu);
>  		break;
>  	default:
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 520450a7956f..10bd648170b7 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -1335,7 +1335,7 @@ int kvm_s390_handle_wait(struct kvm_vcpu *vcpu)
>  	VCPU_EVENT(vcpu, 4, "enabled wait: %llu ns", sltime);
>  no_timer:
>  	srcu_read_unlock(&vcpu->kvm->srcu, vcpu->srcu_idx);
> -	kvm_vcpu_block(vcpu);
> +	kvm_vcpu_halt(vcpu);
>  	vcpu->valid_wakeup = false;
>  	__unset_cpu_idle(vcpu);
>  	vcpu->srcu_idx = srcu_read_lock(&vcpu->kvm->srcu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index eade8a2bdccf..0d71c73a61bb 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8645,6 +8645,13 @@ void kvm_arch_exit(void)
>  
>  static int __kvm_emulate_halt(struct kvm_vcpu *vcpu, int state, int reason)
>  {
> +	/*
> +	 * The vCPU has halted, e.g. executed HLT.  Update the run state if the
> +	 * local APIC is in-kernel, the run loop will detect the non-runnable
> +	 * state and halt the vCPU.  Exit to userspace if the local APIC is
> +	 * managed by userspace, in which case userspace is responsible for
> +	 * handling wake events.
> +	 */
>  	++vcpu->stat.halt_exits;
>  	if (lapic_in_kernel(vcpu)) {
>  		vcpu->arch.mp_state = state;
> @@ -9886,7 +9893,7 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  	if (!kvm_arch_vcpu_runnable(vcpu) &&
>  	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
>  		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>  		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>  
>  		if (kvm_x86_ops.post_block)
> @@ -10120,7 +10127,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  			r = -EINTR;
>  			goto out;
>  		}
> -		kvm_vcpu_block(vcpu);
> +		kvm_vcpu_halt(vcpu);
>  		if (kvm_apic_accept_events(vcpu) < 0) {
>  			r = 0;
>  			goto out;
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 3f87d6ad20bf..d2a8be3fb9ba 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -965,7 +965,7 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn);
>  void kvm_sigset_activate(struct kvm_vcpu *vcpu);
>  void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
>  
> -void kvm_vcpu_block(struct kvm_vcpu *vcpu);
> +void kvm_vcpu_halt(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu);
>  bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index f96cda8312f3..280cf1dca7db 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3221,16 +3221,13 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
>  	}
>  }
>  
> -/*
> - * The vCPU has executed a HLT instruction with in-kernel mode enabled.
> - */
> -void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> +void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>  {
>  	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
>  	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
>  	ktime_t start, cur, poll_end;
>  	bool waited = false;
> -	u64 block_ns;
> +	u64 halt_ns;
>  
>  	start = cur = poll_end = ktime_get();
>  	if (do_halt_poll) {
> @@ -3273,7 +3270,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	}
>  
>  out:
> -	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
> +	/* The total time the vCPU was "halted", including polling time. */
> +	halt_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>  
>  	/*
>  	 * Note, halt-polling is considered successful so long as the vCPU was
> @@ -3287,24 +3285,24 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  		if (!vcpu_valid_wakeup(vcpu)) {
>  			shrink_halt_poll_ns(vcpu);
>  		} else if (vcpu->kvm->max_halt_poll_ns) {
> -			if (block_ns <= vcpu->halt_poll_ns)
> +			if (halt_ns <= vcpu->halt_poll_ns)
>  				;
>  			/* we had a long block, shrink polling */
>  			else if (vcpu->halt_poll_ns &&
> -					block_ns > vcpu->kvm->max_halt_poll_ns)
> +				 halt_ns > vcpu->kvm->max_halt_poll_ns)
>  				shrink_halt_poll_ns(vcpu);
>  			/* we had a short halt and our poll time is too small */
>  			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> -					block_ns < vcpu->kvm->max_halt_poll_ns)
> +				 halt_ns < vcpu->kvm->max_halt_poll_ns)
>  				grow_halt_poll_ns(vcpu);
>  		} else {
>  			vcpu->halt_poll_ns = 0;
>  		}
>  	}
>  
> -	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
> +	trace_kvm_vcpu_wakeup(halt_ns, waited, vcpu_valid_wakeup(vcpu));
>  }
> -EXPORT_SYMBOL_GPL(kvm_vcpu_block);
> +EXPORT_SYMBOL_GPL(kvm_vcpu_halt);
>  
>  bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu)
>  {
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt()
  2021-09-25  0:55 ` [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt() Sean Christopherson
  2021-09-27  7:41   ` Christian Borntraeger
@ 2021-09-28 22:03   ` David Matlack
  1 sibling, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 22:03 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:24PM -0700, Sean Christopherson wrote:
> Factor out the "block" part of kvm_vcpu_halt() so that x86 can emulate
> non-halt wait/sleep/block conditions that should not be subjected to
> halt-polling.
> 
> No functional change intended.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  include/linux/kvm_host.h |  1 +
>  virt/kvm/kvm_main.c      | 50 ++++++++++++++++++++++++++++------------
>  2 files changed, 36 insertions(+), 15 deletions(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d2a8be3fb9ba..655c2b24db2d 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -966,6 +966,7 @@ void kvm_sigset_activate(struct kvm_vcpu *vcpu);
>  void kvm_sigset_deactivate(struct kvm_vcpu *vcpu);
>  
>  void kvm_vcpu_halt(struct kvm_vcpu *vcpu);
> +bool kvm_vcpu_block(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu);
>  void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu);
>  bool kvm_vcpu_wake_up(struct kvm_vcpu *vcpu);
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 280cf1dca7db..fe34457530c2 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3199,6 +3199,34 @@ static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>  	return ret;
>  }
>  
> +/*
> + * Block the vCPU until the vCPU is runnable, an event arrives, or a signal is
> + * pending.  This is mostly used when halting a vCPU, but may also be used
> + * directly for other vCPU non-runnable states, e.g. x86's Wait-For-SIPI.
> + */
> +bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
> +{
> +	bool waited = false;
> +
> +	kvm_arch_vcpu_blocking(vcpu);
> +
> +	prepare_to_rcuwait(&vcpu->wait);
> +	for (;;) {
> +		set_current_state(TASK_INTERRUPTIBLE);
> +
> +		if (kvm_vcpu_check_block(vcpu) < 0)
> +			break;
> +
> +		waited = true;
> +		schedule();
> +	}
> +	finish_rcuwait(&vcpu->wait);
> +
> +	kvm_arch_vcpu_unblocking(vcpu);
> +
> +	return waited;
> +}
> +
>  static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
>  					  ktime_t end, bool success)
>  {
> @@ -3221,6 +3249,12 @@ static inline void update_halt_poll_stats(struct kvm_vcpu *vcpu, ktime_t start,
>  	}
>  }
>  
> +/*
> + * Emulate a vCPU halt condition, e.g. HLT on x86, WFI on arm, etc...  If halt
> + * polling is enabled, busy wait for a short time before blocking to avoid the
> + * expensive block+unblock sequence if a wake event arrives soon after the vCPU
> + * is halted.
> + */
>  void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>  {
>  	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
> @@ -3245,21 +3279,7 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>  		} while (kvm_vcpu_can_poll(cur, stop));
>  	}
>  
> -	kvm_arch_vcpu_blocking(vcpu);
> -
> -	prepare_to_rcuwait(&vcpu->wait);
> -	for (;;) {
> -		set_current_state(TASK_INTERRUPTIBLE);
> -
> -		if (kvm_vcpu_check_block(vcpu) < 0)
> -			break;
> -
> -		waited = true;
> -		schedule();
> -	}
> -	finish_rcuwait(&vcpu->wait);
> -
> -	kvm_arch_vcpu_unblocking(vcpu);
> +	waited = kvm_vcpu_block(vcpu);
>  
>  	cur = ktime_get();
>  	if (waited) {
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 11/14] KVM: stats: Add stat to detect if vcpu is currently blocking
  2021-09-25  0:55 ` [PATCH 11/14] KVM: stats: Add stat to detect if vcpu is currently blocking Sean Christopherson
@ 2021-09-28 22:04   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 22:04 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:25PM -0700, Sean Christopherson wrote:
> From: Jing Zhang <jingzhangos@google.com>
> 
> Add a "blocking" stat that userspace can use to detect the case where a
> vCPU is not being run because of a vCPU/guest action, e.g. HLT or WFS on
> x86, WFI on arm64, etc...  Current guest/host/halt stats don't show this
> well, e.g. if a guest halts for a long period of time then the vCPU could
> appear pathologically blocked due to a host condition, when in reality the
> vCPU has been put into a not-runnable state by the guest.
> 
> Originally-by: Cannon Matthews <cannonmatthews@google.com>
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> [sean: renamed stat to "blocking", massaged changelog]
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>
> ---
>  include/linux/kvm_host.h  | 3 ++-
>  include/linux/kvm_types.h | 1 +
>  virt/kvm/kvm_main.c       | 2 ++
>  3 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 655c2b24db2d..9bb1972e396a 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1453,7 +1453,8 @@ struct _kvm_stats_desc {
>  	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	       \
>  			HALT_POLL_HIST_COUNT),				       \
>  	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,	       \
> -			HALT_POLL_HIST_COUNT)
> +			HALT_POLL_HIST_COUNT),				       \
> +	STATS_DESC_ICOUNTER(VCPU_GENERIC, blocking)
>  
>  extern struct dentry *kvm_debugfs_dir;
>  
> diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
> index 2237abb93ccd..c4f9257bf32d 100644
> --- a/include/linux/kvm_types.h
> +++ b/include/linux/kvm_types.h
> @@ -94,6 +94,7 @@ struct kvm_vcpu_stat_generic {
>  	u64 halt_poll_success_hist[HALT_POLL_HIST_COUNT];
>  	u64 halt_poll_fail_hist[HALT_POLL_HIST_COUNT];
>  	u64 halt_wait_hist[HALT_POLL_HIST_COUNT];
> +	u64 blocking;
>  };
>  
>  #define KVM_STATS_NAME_SIZE	48
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index fe34457530c2..2980d2b88559 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3208,6 +3208,7 @@ bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  {
>  	bool waited = false;
>  
> +	vcpu->stat.generic.blocking = 1;
>  	kvm_arch_vcpu_blocking(vcpu);
>  
>  	prepare_to_rcuwait(&vcpu->wait);
> @@ -3223,6 +3224,7 @@ bool kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  	finish_rcuwait(&vcpu->wait);
>  
>  	kvm_arch_vcpu_unblocking(vcpu);
> +	vcpu->stat.generic.blocking = 0;
>  
>  	return waited;
>  }
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 12/14] KVM: Don't redo ktime_get() when calculating halt-polling stop/deadline
  2021-09-25  0:55 ` [PATCH 12/14] KVM: Don't redo ktime_get() when calculating halt-polling stop/deadline Sean Christopherson
@ 2021-09-28 22:08   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 22:08 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:26PM -0700, Sean Christopherson wrote:
> Calculate the halt-polling "stop" time using "cur" instead of redoing
> ktime_get().  In the happy case where hardware correctly predicts
> do_halt_poll, "cur" is only a few cycles old.  And if the branch is
> mispredicted, arguably that extra latency should count toward the
> halt-polling time.
> 
> In all likelihood, the numbers involved are in the noise and either
> approach is perfectly ok.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  virt/kvm/kvm_main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 2980d2b88559..80f78daa6b8d 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3267,7 +3267,7 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>  
>  	start = cur = poll_end = ktime_get();
>  	if (do_halt_poll) {
> -		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
> +		ktime_t stop = ktime_add_ns(cur, vcpu->halt_poll_ns);
>  
>  		do {
>  			/*
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 13/14] KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs
  2021-09-25  0:55 ` [PATCH 13/14] KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs Sean Christopherson
@ 2021-09-28 22:12   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 22:12 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:27PM -0700, Sean Christopherson wrote:
> Go directly to kvm_vcpu_block() when handling the case where userspace
> attempts to run an UNINITIALIZED vCPU.  The vCPU isn't halted and its time
> spent in limbo arguably should not be factored into halt-polling as the
> behavior of the VM at this point is not at all indicative of the behavior
> of the VM once it is up and running, i.e. executing HLT in idle tasks.
> 
> Note, because this case is encountered only on the first run of an AP vCPU,
> vcpu->halt_poll_ns is guaranteed to be '0', and so KVM will not attempt
> halt-polling, i.e. this really only affects the post-block bookkeeping.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/x86.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0d71c73a61bb..b444f9315766 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10127,7 +10127,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  			r = -EINTR;
>  			goto out;
>  		}
> -		kvm_vcpu_halt(vcpu);
> +		kvm_vcpu_block(vcpu);
>  		if (kvm_apic_accept_events(vcpu) < 0) {
>  			r = 0;
>  			goto out;
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 14/14] KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states
  2021-09-25  0:55 ` [PATCH 14/14] KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states Sean Christopherson
@ 2021-09-28 22:14   ` David Matlack
  0 siblings, 0 replies; 50+ messages in thread
From: David Matlack @ 2021-09-28 22:14 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, Paolo Bonzini, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, Jing Zhang

On Fri, Sep 24, 2021 at 05:55:28PM -0700, Sean Christopherson wrote:
> Call kvm_vcpu_block() directly for all wait states except HALTED so that
> kvm_vcpu_halt() is no longer a misnomer on x86.
> 
> Functionally, this means KVM will never attempt halt-polling or adjust
> vcpu->halt_poll_ns for INIT_RECEIVED (a.k.a. Wait-For-SIPI (WFS)) or
> AP_RESET_HOLD; UNINITIALIZED is handled in kvm_arch_vcpu_ioctl_run(),
> and x86 doesn't use any other "wait" states.
> 
> As mentioned above, the motivation of this is purely so that "halt" isn't
> overloaded on x86, e.g. in KVM's stats.  Skipping halt-polling for WFS
> (and RESET_HOLD) has no meaningful effect on guest performance as there
> are typically single-digit numbers of INIT-SIPI sequences per AP vCPU,
> per boot, versus thousands of HLTs just to boot to console.
> 
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: David Matlack <dmatlack@google.com>

> ---
>  arch/x86/kvm/x86.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b444f9315766..a0f313c4bc49 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9893,7 +9893,10 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
>  	if (!kvm_arch_vcpu_runnable(vcpu) &&
>  	    (!kvm_x86_ops.pre_block || static_call(kvm_x86_pre_block)(vcpu) == 0)) {
>  		srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx);
> -		kvm_vcpu_halt(vcpu);
> +		if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
> +			kvm_vcpu_halt(vcpu);
> +		else
> +			kvm_vcpu_block(vcpu);
>  		vcpu->srcu_idx = srcu_read_lock(&kvm->srcu);
>  
>  		if (kvm_x86_ops.post_block)
> -- 
> 2.33.0.685.g46640cef36-goog
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 16:58         ` David Matlack
@ 2021-09-29  6:56           ` Christian Borntraeger
  0 siblings, 0 replies; 50+ messages in thread
From: Christian Borntraeger @ 2021-09-29  6:56 UTC (permalink / raw)
  To: David Matlack
  Cc: Paolo Bonzini, Sean Christopherson, Jon Cargille, Jim Mattson,
	James Morse, Alexandru Elisei, Suzuki K Poulose,
	David Hildenbrand, Cornelia Huck, Claudio Imbrenda,
	Vitaly Kuznetsov, Wanpeng Li, Joerg Roedel, linux-arm-kernel,
	KVMARM, LinuxMIPS, kvm list, KVMPPC, LKML, Jing Zhang,
	Marc Zyngier, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Janosch Frank



Am 27.09.21 um 18:58 schrieb David Matlack:
> On Mon, Sep 27, 2021 at 8:17 AM Christian Borntraeger
> <borntraeger@de.ibm.com> wrote:
>>
>>
>>
>> Am 27.09.21 um 17:03 schrieb Paolo Bonzini:
>>> On 27/09/21 16:59, Sean Christopherson wrote:
>>>>> commit acd05785e48c01edb2c4f4d014d28478b5f19fb5
>>>>> Author:     David Matlack<dmatlack@google.com>
>>>>> AuthorDate: Fri Apr 17 15:14:46 2020 -0700
>>>>> Commit:     Paolo Bonzini<pbonzini@redhat.com>
>>>>> CommitDate: Fri Apr 24 12:53:17 2020 -0400
>>>>>
>>>>>       kvm: add capability for halt polling
>>>>>
>>>>> broke the possibility for an admin to disable halt polling for already running KVM guests.
>>>>> In past times doing
>>>>> echo 0 > /sys/module/kvm/parameters/halt_poll_ns
>>>>>
>>>>> stopped polling system wide.
>>>>> Now all KVM guests will use the halt_poll_ns value that was active during
>>>>> startup - even those that do not use KVM_CAP_HALT_POLL.
>>>>>
>>>>> I guess this was not intended?
>>>
>>> No, but...
>>>
>>>> I would go so far as to say that halt_poll_ns should be a hard limit on
>>>> the capability
>>>
>>> ... this would not be a good idea I think.  Anything that wants to do a lot of polling can just do "for (;;)".
> 
> I agree. It would also be a maintenance burden and subtle "gotcha" to
> have to increase halt_poll_ns anytime one wants to increase
> KVM_CAP_HALT_POLL.

I think the idea of the upper bound is not about preventing wasting CPUs
but to reconfigure existing poll intervals on a global level. So I think
this idea is a bad idea in itself. Especially as the admin might not have
access to the monitor of user QEMUs.

>>>
>>> So I think there are two possibilities that makes sense:
>>>
>>> * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
>>
>> what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> 
> None of these options would cover Christian's original use-case
> though. (Write to module to disable halt-polling system-wide.)
> 
> What about adding a writable "enable_halt_polling" module parameter

that would then affect both classes with and without KVM_CAP_HALT_POLL.

> that affects all VMs? Once that is in place we could also consider
> getting rid of halt_poll_ns entirely.

As far as I can tell QEMU does not yet use KVM_CAP_HALT_POLL.
So having a system wide halt_poll_ns makes sense. And I think for all
processes not using KVM_CAP_HALT_POLL we should really follow what
halt_poll_ns is NOW and not what it used to be.


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful
  2021-09-28 16:21             ` Sean Christopherson
@ 2021-09-30  9:36               ` Marc Zyngier
  0 siblings, 0 replies; 50+ messages in thread
From: Marc Zyngier @ 2021-09-30  9:36 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Paolo Bonzini, Huacai Chen, Aleksandar Markovic, Paul Mackerras,
	Christian Borntraeger, Janosch Frank, James Morse,
	Alexandru Elisei, Suzuki K Poulose, David Hildenbrand,
	Cornelia Huck, Claudio Imbrenda, Vitaly Kuznetsov, Wanpeng Li,
	Jim Mattson, Joerg Roedel, linux-arm-kernel, kvmarm, linux-mips,
	kvm, kvm-ppc, linux-kernel, David Matlack, Jing Zhang

On Tue, 28 Sep 2021 17:21:12 +0100,
Sean Christopherson <seanjc@google.com> wrote:
> 
> On Tue, Sep 28, 2021, Marc Zyngier wrote:
> > On Mon, 27 Sep 2021 18:28:14 +0100,
> > Sean Christopherson <seanjc@google.com> wrote:
> > > 
> > > On Sun, Sep 26, 2021, Marc Zyngier wrote:
> > > > On Sun, 26 Sep 2021 07:27:28 +0100,
> > > > Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > > 
> > > > > On 25/09/21 11:50, Marc Zyngier wrote:
> > > > > >> there is no need for arm64 to put/load
> > > > > >> the vGIC as KVM hasn't relinquished control of the vCPU in any way.
> > > > > > 
> > > > > > This doesn't mean that there is no requirement for any state
> > > > > > change. The put/load on GICv4 is crucial for performance, and the VMCR
> > > > > > resync is a correctness requirement.
> > > 
> > > Ah crud, I didn't blame that code beforehand, I simply assumed
> > > kvm_arch_vcpu_blocking() was purely for the blocking/schedule()
> > > sequence.  The comment in arm64's kvm_arch_vcpu_blocking() about
> > > kvm_arch_vcpu_runnable() makes more sense now too.
> > > 
> > > > > I wouldn't even say it's crucial for performance: halt polling cannot
> > > > > work and is a waste of time without (the current implementation of)
> > > > > put/load.
> > > > 
> > > > Not quite. A non-V{LPI,SGI} could still be used as the a wake-up from
> > > > WFI (which is the only reason we end-up on this path). Only LPIs (and
> > > > SGIs on GICv4.1) can be directly injected, meaning that SPIs and PPIs
> > > > still follow the standard SW injection model.
> > > > 
> > > > However, there is still the ICH_VMCR_EL2 requirement (to get the
> > > > up-to-date priority mask and group enable bits) for SW-injected
> > > > interrupt wake-up to work correctly, and I really don't want to save
> > > > that one eagerly on each shallow exit.
> > > 
> > > IIUC, VMCR is resident in hardware while the guest is running, and
> > > KVM needs to retrieve the VMCR when processing interrupts to
> > > determine if a interrupt is above the priority threshold.  If that's
> > > the case, then IMO handling the VMCR via an arch hook is
> > > unnecessarily fragile, e.g. any generic call that leads to
> > > kvm_arch_vcpu_runnable() needs to know that arm64 lazily retrieves a
> > > guest register.
> > 
> > Not quite. We only need to retrieve the VMCR if we are in a situation
> > where we need to trigger a wake-up from WFI at the point where we have
> > not done a vcpu_put() yet. All the other cases where the interrupt is
> > injected are managed by the HW. And the only case where
> > kvm_arch_vcpu_runnable() gets called is when blocking.
> > 
> > I also don't get why a hook would be fragile, as long as it has well
> > defined semantics.
> 
> Generic KVM should not have to know that a seemingly benign arch hook,
> kvm_arch_vcpu_runnable(), cannot be safely called without first calling another
> arch hook.  E.g. I suspect there's a (benign?) race in kvm_vcpu_on_spin().  If
> the loop is delayed between checking rcuwait_active() and vcpu_dy_runnable(),
> and the target vCPU is awakened during that period, KVM can call
> kvm_arch_vcpu_runnable() while the vCPU is running.

Humph. Indeed, there is a potential gold-plated turd there.

> It's kind of a counter-example to my below suggestion as putting the vGIC would
> indeed lead to state corruption if the vCPU is running, but I would argue that
> arm64 should override kvm_arch_dy_runnable() so that its correctness is guaranteed,
> e.g. by not calling kvm_arch_vcpu_runnable() if the vCPU is already running.

I'll work something out for that case.

> > > A better approach for VMCR would be to retrieve the value from
> > > hardware on-demand, e.g. via a hook in vgic_get_vmcr(), so that it's all but
> > > impossible to have bugs where KVM is working with a stale VMCR, e.g.
> > > 
> > > diff --git a/arch/arm64/kvm/vgic/vgic-mmio.c b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > index 48c6067fc5ec..0784de0c4080 100644
> > > --- a/arch/arm64/kvm/vgic/vgic-mmio.c
> > > +++ b/arch/arm64/kvm/vgic/vgic-mmio.c
> > > @@ -828,6 +828,13 @@ void vgic_set_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > >  
> > >  void vgic_get_vmcr(struct kvm_vcpu *vcpu, struct vgic_vmcr *vmcr)
> > >  {
> > > +       if (!vcpu->...->vmcr_available) {
> > > +               preempt_disable();
> > > +               kvm_vgic_vmcr_sync(vcpu);
> > > +               preempt_enable();
> > > +               vcpu->...->vmcr_available = true;
> > > +       }
> > > +
> > 
> > But most of the uses of vgic_get_vmcr() are in contexts where the vcpu
> > isn't running at all (such as save/restore). It really only operates
> > on the shadow state, and what you have above will only lead to state
> > corruption.
> 
> Ignoring the kvm_arch_dy_runnable() case for the moment, how would
> it lead to corruption?  The idea is that the 'vmcr_available' flag
> would be cleared when the vCPU is run, i.e. it tracks whether or not
> the shadow state may be stale.

I guess that 'vmcr_available' would have to be initialised to 'true'
at vcpu reset time so that the userspace side cannot trigger a read
from the HW.

> > >         if (kvm_vgic_global_state.type == VGIC_V2)
> > >                 vgic_v2_get_vmcr(vcpu, vmcr);
> > >         else
> > > 
> > > 
> > > Regarding vGIC v4, does KVM require it to be resident in hardware
> > > while the vCPU is loaded?
> > 
> > It is a requirement. Otherwise, we end-up with an inconsistent state
> > between the delivery of doorbells and the state of the vgic.
> 
> For my own understanding, does KVM require it to be resident in
> hardware while the vCPU is loaded but _not_ running?  What I don't
> fully understand is how KVM can safely load/put the vCPU if that
> true, i.e. wouldn't there always be a window for badness?

No, that part is fine. It is when you start running the vcpu without
the GICv4 context loaded that ugly stuff happens (get a doorbell that
tells you to schedule the currently running vcpu, for example).

> 
> > Also, reloading the GICv4 state can be pretty expensive (multiple MMIO
> > accesses), which is why we really don't want to do that on the hot path
> > (kvm_arch_vcpu_ioctl_run() *is* a hot path).
> 
> I wasn't suggesting to reload GICv4 on every entry, it would only be reloaded
> if it was put between vcpu_load() and entry to the guest.
> 
> > > If not, then we could do something like
> > > this, which would eliminate the arch hooks entirely if the VMCR is
> > > handled as above.
> 
> ...
> 
> > > @@ -813,6 +787,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> > >                  */
> > >                 preempt_disable();
> > > 
> > > +               /*
> > > +                * Reload vGIC v4 if necessary, as it may be put on-demand so
> > > +                * that KVM can detect directly injected interrupts, e.g. when
> > > +                * determining if the vCPU is runnable due to a pending event.
> > > +                */
> > > +               vgic_v4_load(vcpu);
> > 
> > You'd need to detect that a previous put has been done.
> 
> Not that it will likely matter, but doesn't the its_vpe.resident
> check automatically handle this?

Sort of. I eventually want to get rid of this as it papers over all
sort of sins. I introduced it exactly because of the nesting that
vcpu_block triggers, but this is a bit of a layering violation between
KVM and the underlying GICv4 driver.

> 
> > But overall, it puts the complexity at the wrong place. WFI (aka
> > kvm_vcpu_block) is the place where we want to handle this synchronisation,
> > and not the run loop.
> > 
> > Instead of having a well defined interface with the blocking code
> > where we implement the required synchronisation, you spray the vgic
> > crap all over, and it becomes much harder to reason about it. Guess
> > what, I'm not keen on it.
> 
> My objection to the arch hooks is that, from generic KVM's
> perspective, the direct dependency is not on blocking, it's on
> calling kvm_arch_vcpu_runnable().  That's why I suggested handling
> this by tracking whether or not the VMCR is up-to-date/stale, as it
> allows generic KVM to safely call kvm_arch_vcpu_runnable() whenever
> the vCPU is loaded.
> 
> I don't have a strong opinion on arm64 preferring the sync to be
> specific to WFI, but if that's the case then IMO this should be
> handled fully in arm64, e.g.  a patch like so (or with a wrapper
> around the call to kvm_vcpu_block() if we want to guard against
> future calls into generic KVM)
> 
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..312f3acd3ca3 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -367,27 +367,12 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
> 
>  void kvm_arch_vcpu_blocking(struct kvm_vcpu *vcpu)
>  {
> -       /*
> -        * If we're about to block (most likely because we've just hit a
> -        * WFI), we need to sync back the state of the GIC CPU interface
> -        * so that we have the latest PMR and group enables. This ensures
> -        * that kvm_arch_vcpu_runnable has up-to-date data to decide
> -        * whether we have pending interrupts.
> -        *
> -        * For the same reason, we want to tell GICv4 that we need
> -        * doorbells to be signalled, should an interrupt become pending.
> -        */
> -       preempt_disable();
> -       kvm_vgic_vmcr_sync(vcpu);
> -       vgic_v4_put(vcpu, true);
> -       preempt_enable();
> +
>  }
> 
>  void kvm_arch_vcpu_unblocking(struct kvm_vcpu *vcpu)
>  {
> -       preempt_disable();
> -       vgic_v4_load(vcpu);
> -       preempt_enable();
> +
>  }
> 
>  void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index 275a27368a04..9870e824a27e 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -95,8 +95,28 @@ static int kvm_handle_wfx(struct kvm_vcpu *vcpu)
>         } else {
>                 trace_kvm_wfx_arm64(*vcpu_pc(vcpu), false);
>                 vcpu->stat.wfi_exit_stat++;
> +
> +               /*
> +                * Sync back the state of the GIC CPU interface so that we have
> +                * the latest PMR and group enables. This ensures that
> +                * kvm_arch_vcpu_runnable has up-to-date data to decide whether
> +                * we have pending interrupts, e.g. when determining if the
> +                * vCPU should block.
> +                *
> +                * For the same reason, we want to tell GICv4 that we need
> +                * doorbells to be signalled, should an interrupt become pending.
> +                */
> +               preempt_disable();
> +               kvm_vgic_vmcr_sync(vcpu);
> +               vgic_v4_put(vcpu, true);
> +               preempt_enable();
> +
>                 kvm_vcpu_block(vcpu);
>                 kvm_clear_request(KVM_REQ_UNHALT, vcpu);
> +
> +               preempt_disable();
> +               vgic_v4_load(vcpu);
> +               preempt_enable();
>         }
> 
>         kvm_incr_pc(vcpu);

I actually largely prefer this approach, which is massively more
readable than the current setup. Feel free to wrap that in your
series.

I'll also have a look at the vcpu_dy_runnable() asap.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2021-09-27 17:33           ` Sean Christopherson
@ 2022-11-15  3:28             ` wangyanan (Y)
  2022-11-16 17:19               ` David Matlack
  0 siblings, 1 reply; 50+ messages in thread
From: wangyanan (Y) @ 2022-11-15  3:28 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Wanpeng Li, kvm, David Hildenbrand, Kernel Mailing List, Linux,
	Paul Mackerras, Claudio Imbrenda, KVM ARM, Janosch Frank,
	Marc Zyngier, Joerg Roedel, Huacai Chen, Christian Borntraeger,
	Aleksandar Markovic, Jon Cargille, kvm-ppc, David Matlack,
	linux-arm-kernel, Jim Mattson, Cornelia Huck, open list:MIPS,
	Vitaly Kuznetsov

Hi Sean, Paolo,

I recently also notice the behavior change of param halt_poll_ns.
Now it loses the ability to:
1) dynamically disable halt polling for all the running VMs
by `echo 0 > /sys`
2) dynamically adjust the halt polling interval for all the
running VMs by `echo * > /sys`

While in our cases, we usually use above two abilities, and
KVM_CAP_HALT_POLL is not used yet.

On 2021/9/28 1:33, Sean Christopherson wrote:
> On Mon, Sep 27, 2021, Paolo Bonzini wrote:
>> On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
>> <borntraeger@de.ibm.com> wrote:
>>>> So I think there are two possibilities that makes sense:
>>>>
>>>> * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
>>> what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
>> Yes, that's what I meant.  David pointed out that doesn't allow you to
>> disable halt polling altogether, but for that you can always ask each
>> VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
>> don't know about Google's usecase, but mine was actually more about
>> using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
> I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
> in-the-field mitigation if halt-polling is broken.  It'd be trivial to support, e.g.
Do we have any plan to repost the diff as a fix?
I would be very nice that this issue can be solved.

Besides, I think we may need some Doc for users to describe
how halt_poll_ns works with KVM_CAP_HALT_POLL, like
"Documentation/virt/guest-halt-polling.rst".
> @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>                  update_halt_poll_stats(vcpu, start, poll_end, !waited);
>
>          if (halt_poll_allowed) {
> +               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
> +               if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
> +                       max_halt_poll_ns = halt_poll_ns;
> +
Does this mean that KVM_CAP_HALT_POLL will not be able to
disable halt polling for a VM individually when halt_poll_ns !=0?
>                  if (!vcpu_valid_wakeup(vcpu)) {
>                          shrink_halt_poll_ns(vcpu);
> -               } else if (vcpu->kvm->max_halt_poll_ns) {
> +               } else if (max_halt_poll_ns) {
>                          if (halt_ns <= vcpu->halt_poll_ns)
>                                  ;
>                          /* we had a long block, shrink polling */
>                          else if (vcpu->halt_poll_ns &&
> -                                halt_ns > vcpu->kvm->max_halt_poll_ns)
> +                                halt_ns > max_halt_poll_ns)
>                                  shrink_halt_poll_ns(vcpu);
>                          /* we had a short halt and our poll time is too small */
> -                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> -                                halt_ns < vcpu->kvm->max_halt_poll_ns)
> -                               grow_halt_poll_ns(vcpu);
> +                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> +                                halt_ns < max_halt_poll_ns)
> +                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
>                  } else {
>                          vcpu->halt_poll_ns = 0;
>                  }
> _______________________________________________
> kvmarm mailing list
> kvmarm@lists.cs.columbia.edu
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> .
Thanks,
Yanan

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2022-11-15  3:28             ` wangyanan (Y)
@ 2022-11-16 17:19               ` David Matlack
  2022-11-18  2:29                 ` wangyanan (Y)
  0 siblings, 1 reply; 50+ messages in thread
From: David Matlack @ 2022-11-16 17:19 UTC (permalink / raw)
  To: wangyanan (Y)
  Cc: Sean Christopherson, Paolo Bonzini, Wanpeng Li, kvm,
	David Hildenbrand, Kernel Mailing List, Linux, Paul Mackerras,
	Claudio Imbrenda, KVM ARM, Janosch Frank, Marc Zyngier,
	Joerg Roedel, Huacai Chen, Christian Borntraeger,
	Aleksandar Markovic, Jon Cargille, kvm-ppc, linux-arm-kernel,
	Jim Mattson, Cornelia Huck, open list:MIPS, Vitaly Kuznetsov

On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
> Hi Sean, Paolo,
> 
> I recently also notice the behavior change of param halt_poll_ns.
> Now it loses the ability to:
> 1) dynamically disable halt polling for all the running VMs
> by `echo 0 > /sys`
> 2) dynamically adjust the halt polling interval for all the
> running VMs by `echo * > /sys`
> 
> While in our cases, we usually use above two abilities, and
> KVM_CAP_HALT_POLL is not used yet.

I think the right path forward is to make KVM_CAP_HALT_POLL a pure
override of halt_poll_ns, and restore the pre-existing behavior of
halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch
below.

That will fix issues (1) and (2) above for any VM not using
KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
all changes to halt_poll_ns. If we truly need a mechanism for admins to
disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a
separate module parameter for that. But IMO, any setup that is
sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use
KVM_CAP_HALT_POLL to disable halt polling.

If everyone is happy with this approach I can test and send a real patch
to the mailing list.

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index e6e66c5e56f2..253ad055b6ad 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -788,6 +788,7 @@ struct kvm {
 	struct srcu_struct srcu;
 	struct srcu_struct irq_srcu;
 	pid_t userspace_pid;
+	bool override_halt_poll_ns;
 	unsigned int max_halt_poll_ns;
 	u32 dirty_ring_size;
 	bool vm_bugged;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 43bbe4fde078..479d0d0da0b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 			goto out_err_no_arch_destroy_vm;
 	}
 
-	kvm->max_halt_poll_ns = halt_poll_ns;
-
 	r = kvm_arch_init_vm(kvm, type);
 	if (r)
 		goto out_err_no_arch_destroy_vm;
@@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
 	sigemptyset(&current->real_blocked);
 }
 
-static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
 {
 	unsigned int old, val, grow, grow_start;
 
@@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
 	if (val < grow_start)
 		val = grow_start;
 
-	if (val > vcpu->kvm->max_halt_poll_ns)
-		val = vcpu->kvm->max_halt_poll_ns;
+	if (val > max)
+		val = max;
 
 	vcpu->halt_poll_ns = val;
 out:
@@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 {
 	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
 	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
+	unsigned int max_halt_poll_ns;
 	ktime_t start, cur, poll_end;
+	struct kvm *kvm = vcpu->kvm;
 	bool waited = false;
 	u64 halt_ns;
 
+	if (kvm->override_halt_poll_ns)
+		max_halt_poll_ns = kvm->max_halt_poll_ns;
+	else
+		max_halt_poll_ns = READ_ONCE(halt_poll_ns);
+
 	start = cur = poll_end = ktime_get();
 	if (do_halt_poll) {
 		ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
@@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
 	if (halt_poll_allowed) {
 		if (!vcpu_valid_wakeup(vcpu)) {
 			shrink_halt_poll_ns(vcpu);
-		} else if (vcpu->kvm->max_halt_poll_ns) {
+		} else if (max_halt_poll_ns) {
 			if (halt_ns <= vcpu->halt_poll_ns)
 				;
 			/* we had a long block, shrink polling */
-			else if (vcpu->halt_poll_ns &&
-				 halt_ns > vcpu->kvm->max_halt_poll_ns)
+			else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
 				shrink_halt_poll_ns(vcpu);
 			/* we had a short halt and our poll time is too small */
-			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
-				 halt_ns < vcpu->kvm->max_halt_poll_ns)
-				grow_halt_poll_ns(vcpu);
+			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
+				 halt_ns < max_halt_poll_ns)
+				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
 		} else {
 			vcpu->halt_poll_ns = 0;
 		}
@@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
 		if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
 			return -EINVAL;
 
+		kvm->override_halt_poll_ns = true;
 		kvm->max_halt_poll_ns = cap->args[0];
 		return 0;
 	}

> 
> On 2021/9/28 1:33, Sean Christopherson wrote:
> > On Mon, Sep 27, 2021, Paolo Bonzini wrote:
> > > On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
> > > <borntraeger@de.ibm.com> wrote:
> > > > > So I think there are two possibilities that makes sense:
> > > > > 
> > > > > * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
> > > > what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
> > > Yes, that's what I meant.  David pointed out that doesn't allow you to
> > > disable halt polling altogether, but for that you can always ask each
> > > VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
> > > don't know about Google's usecase, but mine was actually more about
> > > using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
> > I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
> > in-the-field mitigation if halt-polling is broken.  It'd be trivial to support, e.g.
> Do we have any plan to repost the diff as a fix?
> I would be very nice that this issue can be solved.
> 
> Besides, I think we may need some Doc for users to describe
> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
> "Documentation/virt/guest-halt-polling.rst".
> > @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> >                  update_halt_poll_stats(vcpu, start, poll_end, !waited);
> > 
> >          if (halt_poll_allowed) {
> > +               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
> > +               if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
> > +                       max_halt_poll_ns = halt_poll_ns;
> > +
> Does this mean that KVM_CAP_HALT_POLL will not be able to
> disable halt polling for a VM individually when halt_poll_ns !=0?
> >                  if (!vcpu_valid_wakeup(vcpu)) {
> >                          shrink_halt_poll_ns(vcpu);
> > -               } else if (vcpu->kvm->max_halt_poll_ns) {
> > +               } else if (max_halt_poll_ns) {
> >                          if (halt_ns <= vcpu->halt_poll_ns)
> >                                  ;
> >                          /* we had a long block, shrink polling */
> >                          else if (vcpu->halt_poll_ns &&
> > -                                halt_ns > vcpu->kvm->max_halt_poll_ns)
> > +                                halt_ns > max_halt_poll_ns)
> >                                  shrink_halt_poll_ns(vcpu);
> >                          /* we had a short halt and our poll time is too small */
> > -                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> > -                                halt_ns < vcpu->kvm->max_halt_poll_ns)
> > -                               grow_halt_poll_ns(vcpu);
> > +                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> > +                                halt_ns < max_halt_poll_ns)
> > +                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
> >                  } else {
> >                          vcpu->halt_poll_ns = 0;
> >                  }
> > _______________________________________________
> > kvmarm mailing list
> > kvmarm@lists.cs.columbia.edu
> > https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
> > .
> Thanks,
> Yanan

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat)
  2022-11-16 17:19               ` David Matlack
@ 2022-11-18  2:29                 ` wangyanan (Y)
  0 siblings, 0 replies; 50+ messages in thread
From: wangyanan (Y) @ 2022-11-18  2:29 UTC (permalink / raw)
  To: David Matlack
  Cc: Sean Christopherson, Paolo Bonzini, Wanpeng Li, kvm,
	David Hildenbrand, Kernel Mailing List, Linux, Paul Mackerras,
	Claudio Imbrenda, KVM ARM, Janosch Frank, Marc Zyngier,
	Joerg Roedel, Huacai Chen, Christian Borntraeger,
	Aleksandar Markovic, Jon Cargille, kvm-ppc, linux-arm-kernel,
	Jim Mattson, Cornelia Huck, open list:MIPS, Vitaly Kuznetsov,
	yuzenghui, wangyuan38



On 2022/11/17 1:19, David Matlack wrote:
> On Tue, Nov 15, 2022 at 11:28:56AM +0800, wangyanan (Y) wrote:
>> Hi Sean, Paolo,
>>
>> I recently also notice the behavior change of param halt_poll_ns.
>> Now it loses the ability to:
>> 1) dynamically disable halt polling for all the running VMs
>> by `echo 0 > /sys`
>> 2) dynamically adjust the halt polling interval for all the
>> running VMs by `echo * > /sys`
>>
>> While in our cases, we usually use above two abilities, and
>> KVM_CAP_HALT_POLL is not used yet.
> I think the right path forward is to make KVM_CAP_HALT_POLL a pure
> override of halt_poll_ns, and restore the pre-existing behavior of
> halt_poll_ns whenever KVM_CAP_HALT_POLL is not used. e.g. see the patch
> below.
Agree with this.
kvm.halt_poll_ns serves like a legacy method to control halt polling
globally. Once KVM_CAP_HALT_POLL is used for a VM, it should
hold 100% responsibility to control on the VM, including disabling
the polling. This strategy helps to keep the two mechanisms
decoupled.
> That will fix issues (1) and (2) above for any VM not using
> KVM_CAP_HALT_POLL. If a VM is using KVM_CAP_HALT_POLL, it will ignore
> all changes to halt_poll_ns. If we truly need a mechanism for admins to
> disable halt-polling on VMs using KVM_CAP_HALT_POLL, we can introduce a
> separate module parameter for that. But IMO, any setup that is
> sophisticated enough to use KVM_CAP_HALT_POLL should also be able to use
> KVM_CAP_HALT_POLL to disable halt polling.
>
> If everyone is happy with this approach I can test and send a real patch
> to the mailing list.
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index e6e66c5e56f2..253ad055b6ad 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -788,6 +788,7 @@ struct kvm {
>   	struct srcu_struct srcu;
>   	struct srcu_struct irq_srcu;
>   	pid_t userspace_pid;
> +	bool override_halt_poll_ns;
>   	unsigned int max_halt_poll_ns;
>   	u32 dirty_ring_size;
>   	bool vm_bugged;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 43bbe4fde078..479d0d0da0b5 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1198,8 +1198,6 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
>   			goto out_err_no_arch_destroy_vm;
>   	}
>   
> -	kvm->max_halt_poll_ns = halt_poll_ns;
> -
>   	r = kvm_arch_init_vm(kvm, type);
>   	if (r)
>   		goto out_err_no_arch_destroy_vm;
> @@ -3371,7 +3369,7 @@ void kvm_sigset_deactivate(struct kvm_vcpu *vcpu)
>   	sigemptyset(&current->real_blocked);
>   }
>   
> -static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
> +static void grow_halt_poll_ns(struct kvm_vcpu *vcpu, unsigned int max)
>   {
>   	unsigned int old, val, grow, grow_start;
>   
> @@ -3385,8 +3383,8 @@ static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
>   	if (val < grow_start)
>   		val = grow_start;
>   
> -	if (val > vcpu->kvm->max_halt_poll_ns)
> -		val = vcpu->kvm->max_halt_poll_ns;
> +	if (val > max)
> +		val = max;
>   
>   	vcpu->halt_poll_ns = val;
>   out:
> @@ -3501,10 +3499,17 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>   {
>   	bool halt_poll_allowed = !kvm_arch_no_poll(vcpu);
>   	bool do_halt_poll = halt_poll_allowed && vcpu->halt_poll_ns;
> +	unsigned int max_halt_poll_ns;
>   	ktime_t start, cur, poll_end;
> +	struct kvm *kvm = vcpu->kvm;
>   	bool waited = false;
>   	u64 halt_ns;
>   
> +	if (kvm->override_halt_poll_ns)
> +		max_halt_poll_ns = kvm->max_halt_poll_ns;
> +	else
> +		max_halt_poll_ns = READ_ONCE(halt_poll_ns);
> +
>   	start = cur = poll_end = ktime_get();
>   	if (do_halt_poll) {
>   		ktime_t stop = ktime_add_ns(start, vcpu->halt_poll_ns);
> @@ -3545,17 +3550,16 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>   	if (halt_poll_allowed) {
>   		if (!vcpu_valid_wakeup(vcpu)) {
>   			shrink_halt_poll_ns(vcpu);
> -		} else if (vcpu->kvm->max_halt_poll_ns) {
> +		} else if (max_halt_poll_ns) {
>   			if (halt_ns <= vcpu->halt_poll_ns)
>   				;
>   			/* we had a long block, shrink polling */
> -			else if (vcpu->halt_poll_ns &&
> -				 halt_ns > vcpu->kvm->max_halt_poll_ns)
> +			else if (vcpu->halt_poll_ns && halt_ns > max_halt_poll_ns)
>   				shrink_halt_poll_ns(vcpu);
>   			/* we had a short halt and our poll time is too small */
> -			else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
> -				 halt_ns < vcpu->kvm->max_halt_poll_ns)
> -				grow_halt_poll_ns(vcpu);
> +			else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
> +				 halt_ns < max_halt_poll_ns)
> +				grow_halt_poll_ns(vcpu, max_halt_poll_ns);
>   		} else {
>   			vcpu->halt_poll_ns = 0;
>   		}
> @@ -4588,6 +4592,7 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm,
>   		if (cap->flags || cap->args[0] != (unsigned int)cap->args[0])
>   			return -EINVAL;
>   
> +		kvm->override_halt_poll_ns = true;
>   		kvm->max_halt_poll_ns = cap->args[0];
>   		return 0;
>   	}
Looks sensible to me overall.
I will look at the RFC series, thanks for your quick response.

Yanan
.
>> On 2021/9/28 1:33, Sean Christopherson wrote:
>>> On Mon, Sep 27, 2021, Paolo Bonzini wrote:
>>>> On Mon, Sep 27, 2021 at 5:17 PM Christian Borntraeger
>>>> <borntraeger@de.ibm.com> wrote:
>>>>>> So I think there are two possibilities that makes sense:
>>>>>>
>>>>>> * track what is using KVM_CAP_HALT_POLL, and make writes to halt_poll_ns follow that
>>>>> what about using halt_poll_ns for those VMs that did not uses KVM_CAP_HALT_POLL and the private number for those that did.
>>>> Yes, that's what I meant.  David pointed out that doesn't allow you to
>>>> disable halt polling altogether, but for that you can always ask each
>>>> VM's userspace one by one, or just not use KVM_CAP_HALT_POLL. (Also, I
>>>> don't know about Google's usecase, but mine was actually more about
>>>> using KVM_CAP_HALT_POLL to *disable* halt polling on some VMs!).
>>> I kinda like the idea if special-casing halt_poll_ns=0, e.g. for testing or
>>> in-the-field mitigation if halt-polling is broken.  It'd be trivial to support, e.g.
>> Do we have any plan to repost the diff as a fix?
>> I would be very nice that this issue can be solved.
>>
>> Besides, I think we may need some Doc for users to describe
>> how halt_poll_ns works with KVM_CAP_HALT_POLL, like
>> "Documentation/virt/guest-halt-polling.rst".
>>> @@ -3304,19 +3304,23 @@ void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
>>>                   update_halt_poll_stats(vcpu, start, poll_end, !waited);
>>>
>>>           if (halt_poll_allowed) {
>>> +               max_halt_poll_ns = vcpu->kvm->max_halt_poll_ns;
>>> +               if (!max_halt_poll_ns || !halt_poll_ns)  <------ squish the max if halt_poll_ns==0
>>> +                       max_halt_poll_ns = halt_poll_ns;
>>> +
>> Does this mean that KVM_CAP_HALT_POLL will not be able to
>> disable halt polling for a VM individually when halt_poll_ns !=0?
>>>                   if (!vcpu_valid_wakeup(vcpu)) {
>>>                           shrink_halt_poll_ns(vcpu);
>>> -               } else if (vcpu->kvm->max_halt_poll_ns) {
>>> +               } else if (max_halt_poll_ns) {
>>>                           if (halt_ns <= vcpu->halt_poll_ns)
>>>                                   ;
>>>                           /* we had a long block, shrink polling */
>>>                           else if (vcpu->halt_poll_ns &&
>>> -                                halt_ns > vcpu->kvm->max_halt_poll_ns)
>>> +                                halt_ns > max_halt_poll_ns)
>>>                                   shrink_halt_poll_ns(vcpu);
>>>                           /* we had a short halt and our poll time is too small */
>>> -                       else if (vcpu->halt_poll_ns < vcpu->kvm->max_halt_poll_ns &&
>>> -                                halt_ns < vcpu->kvm->max_halt_poll_ns)
>>> -                               grow_halt_poll_ns(vcpu);
>>> +                       else if (vcpu->halt_poll_ns < max_halt_poll_ns &&
>>> +                                halt_ns < max_halt_poll_ns)
>>> +                               grow_halt_poll_ns(vcpu, max_halt_poll_ns);
>>>                   } else {
>>>                           vcpu->halt_poll_ns = 0;
>>>                   }
>>> _______________________________________________
>>> kvmarm mailing list
>>> kvmarm@lists.cs.columbia.edu
>>> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
>>> .
>> Thanks,
>> Yanan
> .


^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2022-11-18  2:29 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-25  0:55 [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat Sean Christopherson
2021-09-25  0:55 ` [PATCH 01/14] KVM: s390: Ensure kvm_arch_no_poll() is read once when blocking vCPU Sean Christopherson
2021-09-27  6:54   ` Christian Borntraeger
2021-09-25  0:55 ` [PATCH 02/14] KVM: Update halt-polling stats if and only if halt-polling was attempted Sean Christopherson
2021-09-28 18:57   ` David Matlack
2021-09-25  0:55 ` [PATCH 03/14] KVM: Refactor and document halt-polling stats update helper Sean Christopherson
2021-09-28 19:01   ` David Matlack
2021-09-25  0:55 ` [PATCH 04/14] KVM: Reconcile discrepancies in halt-polling stats Sean Christopherson
2021-09-28 21:26   ` David Matlack
2021-09-25  0:55 ` [PATCH 05/14] KVM: s390: Clear valid_wakeup in kvm_s390_handle_wait(), not in arch hook Sean Christopherson
2021-09-27  6:58   ` Christian Borntraeger
2021-09-25  0:55 ` [PATCH 06/14] KVM: Drop obsolete kvm_arch_vcpu_block_finish() Sean Christopherson
2021-09-27  6:58   ` Christian Borntraeger
2021-09-28 21:28   ` David Matlack
2021-09-25  0:55 ` [PATCH 07/14] KVM: Don't block+unblock when halt-polling is successful Sean Christopherson
2021-09-25  9:50   ` Marc Zyngier
2021-09-26  6:27     ` Paolo Bonzini
2021-09-26  9:02       ` Marc Zyngier
2021-09-27 17:28         ` Sean Christopherson
2021-09-28  9:24           ` Marc Zyngier
2021-09-28 16:21             ` Sean Christopherson
2021-09-30  9:36               ` Marc Zyngier
2021-09-25  0:55 ` [PATCH 08/14] KVM: x86: Tweak halt emulation helper names to free up kvm_vcpu_halt() Sean Christopherson
2021-09-28 21:59   ` David Matlack
2021-09-25  0:55 ` [PATCH 09/14] KVM: Rename kvm_vcpu_block() => kvm_vcpu_halt() Sean Christopherson
2021-09-27  7:06   ` Christian Borntraeger
2021-09-28 22:01   ` David Matlack
2021-09-25  0:55 ` [PATCH 10/14] KVM: Split out a kvm_vcpu_block() helper from kvm_vcpu_halt() Sean Christopherson
2021-09-27  7:41   ` Christian Borntraeger
2021-09-28 22:03   ` David Matlack
2021-09-25  0:55 ` [PATCH 11/14] KVM: stats: Add stat to detect if vcpu is currently blocking Sean Christopherson
2021-09-28 22:04   ` David Matlack
2021-09-25  0:55 ` [PATCH 12/14] KVM: Don't redo ktime_get() when calculating halt-polling stop/deadline Sean Christopherson
2021-09-28 22:08   ` David Matlack
2021-09-25  0:55 ` [PATCH 13/14] KVM: x86: Directly block (instead of "halting") UNINITIALIZED vCPUs Sean Christopherson
2021-09-28 22:12   ` David Matlack
2021-09-25  0:55 ` [PATCH 14/14] KVM: x86: Invoke kvm_vcpu_block() directly for non-HALTED wait states Sean Christopherson
2021-09-28 22:14   ` David Matlack
2021-09-27  7:22 ` disabling halt polling broken? (was Re: [PATCH 00/14] KVM: Halt-polling fixes, cleanups and a new stat) Christian Borntraeger
2021-09-27 14:59   ` Sean Christopherson
2021-09-27 15:03     ` Paolo Bonzini
2021-09-27 15:15       ` Sean Christopherson
2021-09-27 15:16       ` Christian Borntraeger
2021-09-27 16:58         ` David Matlack
2021-09-29  6:56           ` Christian Borntraeger
2021-09-27 17:24         ` Paolo Bonzini
2021-09-27 17:33           ` Sean Christopherson
2022-11-15  3:28             ` wangyanan (Y)
2022-11-16 17:19               ` David Matlack
2022-11-18  2:29                 ` wangyanan (Y)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).