kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86
@ 2021-04-13 18:29 Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 1/7] sched/vtime: Move guest enter/exit vtime accounting to separate helpers Sean Christopherson
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

This is an alternative to Wanpeng's series[*] to fix tick-based accounting
on x86.  The approach for fixing the bug is identical: defer accounting
until after tick IRQs are handled.  The difference is purely in how the
context tracking and vtime code is refactored in order to give KVM the
hooks it needs to fix the x86 bug.

x86 compile tested only, hence the RFC.  If folks like the direction and
there are no unsolvable issues, I'll cross-compile, properly test on x86,
and post an "official" series.

Sean Christopherson (7):
  sched/vtime: Move guest enter/exit vtime accounting to separate
    helpers
  context_tracking: Move guest enter/exit logic to standalone helpers
  context_tracking: Consolidate guest enter/exit wrappers
  context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain
  KVM: Move vtime accounting of guest exit to separate helper
  KVM: x86: Consolidate guest enter/exit logic to common helpers
  KVM: x86: Defer tick-based accounting 'til after IRQ handling

 arch/x86/kvm/svm/svm.c           |  39 +-----------
 arch/x86/kvm/vmx/vmx.c           |  39 +-----------
 arch/x86/kvm/x86.c               |   8 +++
 arch/x86/kvm/x86.h               |  48 +++++++++++++++
 include/linux/context_tracking.h | 100 ++++++++-----------------------
 include/linux/kvm_host.h         |  50 ++++++++++++++++
 include/linux/vtime.h            |  45 ++++++++++++--
 7 files changed, 175 insertions(+), 154 deletions(-)

-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH 1/7] sched/vtime: Move guest enter/exit vtime accounting to separate helpers
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
@ 2021-04-13 18:29 ` Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 2/7] context_tracking: Move guest enter/exit logic to standalone helpers Sean Christopherson
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

Provide separate helpers for guest enter/exit vtime accounting instead of
open coding the logic within the context tracking code.  This will allow
KVM x86 to handle vtime accounting slightly differently when using tick-
based accounting.

Opportunstically delete the vtime_account_kernel() stub now that all
callers are wrapped with CONFIG_VIRT_CPU_ACCOUNTING_NATIVE=y.

No functional change intended.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/context_tracking.h | 17 +++---------
 include/linux/vtime.h            | 45 +++++++++++++++++++++++++++++---
 2 files changed, 45 insertions(+), 17 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index bceb06498521..58f9a7251d3b 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -102,16 +102,12 @@ extern void context_tracking_init(void);
 static inline void context_tracking_init(void) { }
 #endif /* CONFIG_CONTEXT_TRACKING_FORCE */
 
-
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 /* must be called with irqs disabled */
 static __always_inline void guest_enter_irqoff(void)
 {
 	instrumentation_begin();
-	if (vtime_accounting_enabled_this_cpu())
-		vtime_guest_enter(current);
-	else
-		current->flags |= PF_VCPU;
+	vtime_account_guest_enter();
 	instrumentation_end();
 
 	if (context_tracking_enabled())
@@ -137,10 +133,7 @@ static __always_inline void guest_exit_irqoff(void)
 		__context_tracking_exit(CONTEXT_GUEST);
 
 	instrumentation_begin();
-	if (vtime_accounting_enabled_this_cpu())
-		vtime_guest_exit(current);
-	else
-		current->flags &= ~PF_VCPU;
+	vtime_account_guest_exit();
 	instrumentation_end();
 }
 
@@ -153,8 +146,7 @@ static __always_inline void guest_enter_irqoff(void)
 	 * to flush.
 	 */
 	instrumentation_begin();
-	vtime_account_kernel(current);
-	current->flags |= PF_VCPU;
+	vtime_account_guest_enter();
 	rcu_virt_note_context_switch(smp_processor_id());
 	instrumentation_end();
 }
@@ -163,8 +155,7 @@ static __always_inline void guest_exit_irqoff(void)
 {
 	instrumentation_begin();
 	/* Flush the guest cputime we spent on the guest */
-	vtime_account_kernel(current);
-	current->flags &= ~PF_VCPU;
+	vtime_account_guest_exit();
 	instrumentation_end();
 }
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 041d6524d144..f30b472a2201 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -3,6 +3,8 @@
 #define _LINUX_KERNEL_VTIME_H
 
 #include <linux/context_tracking_state.h>
+#include <linux/sched.h>
+
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 #include <asm/vtime.h>
 #endif
@@ -18,6 +20,17 @@ struct task_struct;
 static inline bool vtime_accounting_enabled_this_cpu(void) { return true; }
 extern void vtime_task_switch(struct task_struct *prev);
 
+static __always_inline void vtime_account_guest_enter(void)
+{
+	vtime_account_kernel(current);
+	current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+
+}
+
 #elif defined(CONFIG_VIRT_CPU_ACCOUNTING_GEN)
 
 /*
@@ -49,12 +62,38 @@ static inline void vtime_task_switch(struct task_struct *prev)
 		vtime_task_switch_generic(prev);
 }
 
+static __always_inline void vtime_account_guest_enter(void)
+{
+	if (vtime_accounting_enabled_this_cpu())
+		vtime_guest_enter(current);
+	else
+		current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+	if (vtime_accounting_enabled_this_cpu())
+		vtime_guest_exit(current);
+	else
+		current->flags &= ~PF_VCPU;
+}
+
+
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING */
 
-static inline bool vtime_accounting_enabled_cpu(int cpu) {return false; }
 static inline bool vtime_accounting_enabled_this_cpu(void) { return false; }
 static inline void vtime_task_switch(struct task_struct *prev) { }
 
+static __always_inline void vtime_account_guest_enter(void)
+{
+	current->flags |= PF_VCPU;
+}
+
+static __always_inline void vtime_account_guest_exit(void)
+{
+	current->flags &= ~PF_VCPU;
+}
+
 #endif
 
 /*
@@ -63,9 +102,7 @@ static inline void vtime_task_switch(struct task_struct *prev) { }
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 extern void vtime_account_kernel(struct task_struct *tsk);
 extern void vtime_account_idle(struct task_struct *tsk);
-#else /* !CONFIG_VIRT_CPU_ACCOUNTING */
-static inline void vtime_account_kernel(struct task_struct *tsk) { }
-#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 extern void arch_vtime_task_switch(struct task_struct *tsk);
-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH 2/7] context_tracking: Move guest enter/exit logic to standalone helpers
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 1/7] sched/vtime: Move guest enter/exit vtime accounting to separate helpers Sean Christopherson
@ 2021-04-13 18:29 ` Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 3/7] context_tracking: Consolidate guest enter/exit wrappers Sean Christopherson
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

Move guest enter/exit context tracking to standalone helpers, so that the
existing wrappers can be moved under KVM.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/context_tracking.h | 43 +++++++++++++++++++-------------
 1 file changed, 26 insertions(+), 17 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 58f9a7251d3b..89a1a5ccb2ab 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -71,6 +71,30 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	}
 }
 
+static __always_inline void context_tracking_guest_enter_irqoff(void)
+{
+	if (context_tracking_enabled())
+		__context_tracking_enter(CONTEXT_GUEST);
+
+	/* KVM does not hold any references to rcu protected data when it
+	 * switches CPU into a guest mode. In fact switching to a guest mode
+	 * is very similar to exiting to userspace from rcu point of view. In
+	 * addition CPU may stay in a guest mode for quite a long time (up to
+	 * one time slice). Lets treat guest mode as quiescent state, just like
+	 * we do with user-mode execution.
+	 */
+	if (!context_tracking_enabled_this_cpu()) {
+		instrumentation_begin();
+		rcu_virt_note_context_switch(smp_processor_id());
+		instrumentation_end();
+	}
+}
+
+static __always_inline void context_tracking_guest_exit_irqoff(void)
+{
+	if (context_tracking_enabled())
+		__context_tracking_exit(CONTEXT_GUEST);
+}
 
 /**
  * ct_state() - return the current context tracking state if known
@@ -110,27 +134,12 @@ static __always_inline void guest_enter_irqoff(void)
 	vtime_account_guest_enter();
 	instrumentation_end();
 
-	if (context_tracking_enabled())
-		__context_tracking_enter(CONTEXT_GUEST);
-
-	/* KVM does not hold any references to rcu protected data when it
-	 * switches CPU into a guest mode. In fact switching to a guest mode
-	 * is very similar to exiting to userspace from rcu point of view. In
-	 * addition CPU may stay in a guest mode for quite a long time (up to
-	 * one time slice). Lets treat guest mode as quiescent state, just like
-	 * we do with user-mode execution.
-	 */
-	if (!context_tracking_enabled_this_cpu()) {
-		instrumentation_begin();
-		rcu_virt_note_context_switch(smp_processor_id());
-		instrumentation_end();
-	}
+	context_tracking_guest_enter_irqoff();
 }
 
 static __always_inline void guest_exit_irqoff(void)
 {
-	if (context_tracking_enabled())
-		__context_tracking_exit(CONTEXT_GUEST);
+	context_tracking_guest_exit_irqoff();
 
 	instrumentation_begin();
 	vtime_account_guest_exit();
-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH 3/7] context_tracking: Consolidate guest enter/exit wrappers
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 1/7] sched/vtime: Move guest enter/exit vtime accounting to separate helpers Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 2/7] context_tracking: Move guest enter/exit logic to standalone helpers Sean Christopherson
@ 2021-04-13 18:29 ` Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 4/7] context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain Sean Christopherson
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

Consolidate the guest enter/exit wrappers by providing stubs for the
context tracking helpers as necessary.  This will allow moving the
wrappers under KVM without having to bleed too many #ifdefs into the
soon-to-be KVM code.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/context_tracking.h | 65 ++++++++++++++------------------
 1 file changed, 29 insertions(+), 36 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 89a1a5ccb2ab..ded56aed539a 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -76,18 +76,7 @@ static __always_inline void context_tracking_guest_enter_irqoff(void)
 	if (context_tracking_enabled())
 		__context_tracking_enter(CONTEXT_GUEST);
 
-	/* KVM does not hold any references to rcu protected data when it
-	 * switches CPU into a guest mode. In fact switching to a guest mode
-	 * is very similar to exiting to userspace from rcu point of view. In
-	 * addition CPU may stay in a guest mode for quite a long time (up to
-	 * one time slice). Lets treat guest mode as quiescent state, just like
-	 * we do with user-mode execution.
-	 */
-	if (!context_tracking_enabled_this_cpu()) {
-		instrumentation_begin();
-		rcu_virt_note_context_switch(smp_processor_id());
-		instrumentation_end();
-	}
+	return context_tracking_enabled_this_cpu();
 }
 
 static __always_inline void context_tracking_guest_exit_irqoff(void)
@@ -116,6 +105,17 @@ static inline void user_exit_irqoff(void) { }
 static inline enum ctx_state exception_enter(void) { return 0; }
 static inline void exception_exit(enum ctx_state prev_ctx) { }
 static inline enum ctx_state ct_state(void) { return CONTEXT_DISABLED; }
+
+static __always_inline bool context_tracking_guest_enter_irqoff(void)
+{
+	return false;
+}
+
+static __always_inline void context_tracking_guest_exit_irqoff(void)
+{
+
+}
+
 #endif /* !CONFIG_CONTEXT_TRACKING */
 
 #define CT_WARN_ON(cond) WARN_ON(context_tracking_enabled() && (cond))
@@ -126,48 +126,41 @@ extern void context_tracking_init(void);
 static inline void context_tracking_init(void) { }
 #endif /* CONFIG_CONTEXT_TRACKING_FORCE */
 
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 /* must be called with irqs disabled */
 static __always_inline void guest_enter_irqoff(void)
 {
+	/*
+	 * This is running in ioctl context so its safe to assume that it's the
+	 * stime pending cputime to flush.
+	 */
 	instrumentation_begin();
 	vtime_account_guest_enter();
 	instrumentation_end();
 
-	context_tracking_guest_enter_irqoff();
+	/*
+	 * KVM does not hold any references to rcu protected data when it
+	 * switches CPU into a guest mode. In fact switching to a guest mode
+	 * is very similar to exiting to userspace from rcu point of view. In
+	 * addition CPU may stay in a guest mode for quite a long time (up to
+	 * one time slice). Lets treat guest mode as quiescent state, just like
+	 * we do with user-mode execution.
+	 */
+	if (!context_tracking_guest_enter_irqoff()) {
+		instrumentation_begin();
+		rcu_virt_note_context_switch(smp_processor_id());
+		instrumentation_end();
+	}
 }
 
 static __always_inline void guest_exit_irqoff(void)
 {
 	context_tracking_guest_exit_irqoff();
 
-	instrumentation_begin();
-	vtime_account_guest_exit();
-	instrumentation_end();
-}
-
-#else
-static __always_inline void guest_enter_irqoff(void)
-{
-	/*
-	 * This is running in ioctl context so its safe
-	 * to assume that it's the stime pending cputime
-	 * to flush.
-	 */
-	instrumentation_begin();
-	vtime_account_guest_enter();
-	rcu_virt_note_context_switch(smp_processor_id());
-	instrumentation_end();
-}
-
-static __always_inline void guest_exit_irqoff(void)
-{
 	instrumentation_begin();
 	/* Flush the guest cputime we spent on the guest */
 	vtime_account_guest_exit();
 	instrumentation_end();
 }
-#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
 
 static inline void guest_exit(void)
 {
-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH 4/7] context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
                   ` (2 preceding siblings ...)
  2021-04-13 18:29 ` [RFC PATCH 3/7] context_tracking: Consolidate guest enter/exit wrappers Sean Christopherson
@ 2021-04-13 18:29 ` Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 5/7] KVM: Move vtime accounting of guest exit to separate helper Sean Christopherson
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

Move the guest enter/exit wrappers to kvm_host.h so that KVM can manage
its context tracking vs. vtime accounting without bleeding too many KVM
details into the context tracking code.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/context_tracking.h | 45 --------------------------------
 include/linux/kvm_host.h         | 45 ++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index ded56aed539a..2f4538380a8d 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -126,49 +126,4 @@ extern void context_tracking_init(void);
 static inline void context_tracking_init(void) { }
 #endif /* CONFIG_CONTEXT_TRACKING_FORCE */
 
-/* must be called with irqs disabled */
-static __always_inline void guest_enter_irqoff(void)
-{
-	/*
-	 * This is running in ioctl context so its safe to assume that it's the
-	 * stime pending cputime to flush.
-	 */
-	instrumentation_begin();
-	vtime_account_guest_enter();
-	instrumentation_end();
-
-	/*
-	 * KVM does not hold any references to rcu protected data when it
-	 * switches CPU into a guest mode. In fact switching to a guest mode
-	 * is very similar to exiting to userspace from rcu point of view. In
-	 * addition CPU may stay in a guest mode for quite a long time (up to
-	 * one time slice). Lets treat guest mode as quiescent state, just like
-	 * we do with user-mode execution.
-	 */
-	if (!context_tracking_guest_enter_irqoff()) {
-		instrumentation_begin();
-		rcu_virt_note_context_switch(smp_processor_id());
-		instrumentation_end();
-	}
-}
-
-static __always_inline void guest_exit_irqoff(void)
-{
-	context_tracking_guest_exit_irqoff();
-
-	instrumentation_begin();
-	/* Flush the guest cputime we spent on the guest */
-	vtime_account_guest_exit();
-	instrumentation_end();
-}
-
-static inline void guest_exit(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	guest_exit_irqoff();
-	local_irq_restore(flags);
-}
-
 #endif
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 3b06d12ec37e..444d5f0225cb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -332,6 +332,51 @@ struct kvm_vcpu {
 	struct kvm_dirty_ring dirty_ring;
 };
 
+/* must be called with irqs disabled */
+static __always_inline void guest_enter_irqoff(void)
+{
+	/*
+	 * This is running in ioctl context so its safe to assume that it's the
+	 * stime pending cputime to flush.
+	 */
+	instrumentation_begin();
+	vtime_account_guest_enter();
+	instrumentation_end();
+
+	/*
+	 * KVM does not hold any references to rcu protected data when it
+	 * switches CPU into a guest mode. In fact switching to a guest mode
+	 * is very similar to exiting to userspace from rcu point of view. In
+	 * addition CPU may stay in a guest mode for quite a long time (up to
+	 * one time slice). Lets treat guest mode as quiescent state, just like
+	 * we do with user-mode execution.
+	 */
+	if (!context_tracking_guest_enter_irqoff()) {
+		instrumentation_begin();
+		rcu_virt_note_context_switch(smp_processor_id());
+		instrumentation_end();
+	}
+}
+
+static __always_inline void guest_exit_irqoff(void)
+{
+	context_tracking_guest_exit_irqoff();
+
+	instrumentation_begin();
+	/* Flush the guest cputime we spent on the guest */
+	vtime_account_guest_exit();
+	instrumentation_end();
+}
+
+static inline void guest_exit(void)
+{
+	unsigned long flags;
+
+	local_irq_save(flags);
+	guest_exit_irqoff();
+	local_irq_restore(flags);
+}
+
 static inline int kvm_vcpu_exiting_guest_mode(struct kvm_vcpu *vcpu)
 {
 	/*
-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH 5/7] KVM: Move vtime accounting of guest exit to separate helper
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
                   ` (3 preceding siblings ...)
  2021-04-13 18:29 ` [RFC PATCH 4/7] context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain Sean Christopherson
@ 2021-04-13 18:29 ` Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 6/7] KVM: x86: Consolidate guest enter/exit logic to common helpers Sean Christopherson
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

Provide a standalone helper for guest exit vtime accounting so that x86
can defer tick-based accounting until the appropriate time, while still
updating context tracking immediately after VM-Exit.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/kvm_host.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 444d5f0225cb..20604bfae5a8 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -358,16 +358,21 @@ static __always_inline void guest_enter_irqoff(void)
 	}
 }
 
-static __always_inline void guest_exit_irqoff(void)
+static __always_inline void kvm_vtime_account_guest_exit(void)
 {
-	context_tracking_guest_exit_irqoff();
-
 	instrumentation_begin();
 	/* Flush the guest cputime we spent on the guest */
 	vtime_account_guest_exit();
 	instrumentation_end();
 }
 
+static __always_inline void guest_exit_irqoff(void)
+{
+	context_tracking_guest_exit_irqoff();
+
+	kvm_vtime_account_guest_exit();
+}
+
 static inline void guest_exit(void)
 {
 	unsigned long flags;
-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH 6/7] KVM: x86: Consolidate guest enter/exit logic to common helpers
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
                   ` (4 preceding siblings ...)
  2021-04-13 18:29 ` [RFC PATCH 5/7] KVM: Move vtime accounting of guest exit to separate helper Sean Christopherson
@ 2021-04-13 18:29 ` Sean Christopherson
  2021-04-13 18:29 ` [RFC PATCH 7/7] KVM: x86: Defer tick-based accounting 'til after IRQ handling Sean Christopherson
  2021-04-14 17:33 ` [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Thomas Gleixner
  7 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

Move the enter/exit logic in {svm,vmx}_vcpu_enter_exit() to common
helpers.  In addition to deduplicating code, this will allow tweaking the
vtime accounting in the VM-Exit path without splitting logic across x86,
VMX, and SVM.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/svm/svm.c | 39 ++----------------------------------
 arch/x86/kvm/vmx/vmx.c | 39 ++----------------------------------
 arch/x86/kvm/x86.h     | 45 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+), 74 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 48b396f33bee..0677595d07e5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3713,25 +3713,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	/*
-	 * VMENTER enables interrupts (host state), but the kernel state is
-	 * interrupts disabled when this is invoked. Also tell RCU about
-	 * it. This is the same logic as for exit_to_user_mode().
-	 *
-	 * This ensures that e.g. latency analysis on the host observes
-	 * guest mode as interrupt enabled.
-	 *
-	 * guest_enter_irqoff() informs context tracking about the
-	 * transition to guest mode and if enabled adjusts RCU state
-	 * accordingly.
-	 */
-	instrumentation_begin();
-	trace_hardirqs_on_prepare();
-	lockdep_hardirqs_on_prepare(CALLER_ADDR0);
-	instrumentation_end();
-
-	guest_enter_irqoff();
-	lockdep_hardirqs_on(CALLER_ADDR0);
+	kvm_guest_enter_irqoff();
 
 	if (sev_es_guest(vcpu->kvm)) {
 		__svm_sev_es_vcpu_run(svm->vmcb_pa);
@@ -3745,24 +3727,7 @@ static noinstr void svm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
 		vmload(__sme_page_pa(sd->save_area));
 	}
 
-	/*
-	 * VMEXIT disables interrupts (host state), but tracing and lockdep
-	 * have them in state 'on' as recorded before entering guest mode.
-	 * Same as enter_from_user_mode().
-	 *
-	 * guest_exit_irqoff() restores host context and reinstates RCU if
-	 * enabled and required.
-	 *
-	 * This needs to be done before the below as native_read_msr()
-	 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
-	 * into world and some more.
-	 */
-	lockdep_hardirqs_off(CALLER_ADDR0);
-	guest_exit_irqoff();
-
-	instrumentation_begin();
-	trace_hardirqs_off_finish();
-	instrumentation_end();
+	kvm_guest_exit_irqoff();
 }
 
 static __no_kcsan fastpath_t svm_vcpu_run(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c05e6e2854b5..19b0e25bf598 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6600,25 +6600,7 @@ static fastpath_t vmx_exit_handlers_fastpath(struct kvm_vcpu *vcpu)
 static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 					struct vcpu_vmx *vmx)
 {
-	/*
-	 * VMENTER enables interrupts (host state), but the kernel state is
-	 * interrupts disabled when this is invoked. Also tell RCU about
-	 * it. This is the same logic as for exit_to_user_mode().
-	 *
-	 * This ensures that e.g. latency analysis on the host observes
-	 * guest mode as interrupt enabled.
-	 *
-	 * guest_enter_irqoff() informs context tracking about the
-	 * transition to guest mode and if enabled adjusts RCU state
-	 * accordingly.
-	 */
-	instrumentation_begin();
-	trace_hardirqs_on_prepare();
-	lockdep_hardirqs_on_prepare(CALLER_ADDR0);
-	instrumentation_end();
-
-	guest_enter_irqoff();
-	lockdep_hardirqs_on(CALLER_ADDR0);
+	kvm_guest_enter_irqoff();
 
 	/* L1D Flush includes CPU buffer clear to mitigate MDS */
 	if (static_branch_unlikely(&vmx_l1d_should_flush))
@@ -6634,24 +6616,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 
 	vcpu->arch.cr2 = native_read_cr2();
 
-	/*
-	 * VMEXIT disables interrupts (host state), but tracing and lockdep
-	 * have them in state 'on' as recorded before entering guest mode.
-	 * Same as enter_from_user_mode().
-	 *
-	 * guest_exit_irqoff() restores host context and reinstates RCU if
-	 * enabled and required.
-	 *
-	 * This needs to be done before the below as native_read_msr()
-	 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
-	 * into world and some more.
-	 */
-	lockdep_hardirqs_off(CALLER_ADDR0);
-	guest_exit_irqoff();
-
-	instrumentation_begin();
-	trace_hardirqs_off_finish();
-	instrumentation_end();
+	kvm_guest_exit_irqoff();
 }
 
 static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index daccf20fbcd5..74ef92f47db8 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -8,6 +8,51 @@
 #include "kvm_cache_regs.h"
 #include "kvm_emulate.h"
 
+static __always_inline void kvm_guest_enter_irqoff(void)
+{
+	/*
+	 * VMENTER enables interrupts (host state), but the kernel state is
+	 * interrupts disabled when this is invoked. Also tell RCU about
+	 * it. This is the same logic as for exit_to_user_mode().
+	 *
+	 * This ensures that e.g. latency analysis on the host observes
+	 * guest mode as interrupt enabled.
+	 *
+	 * guest_enter_irqoff() informs context tracking about the
+	 * transition to guest mode and if enabled adjusts RCU state
+	 * accordingly.
+	 */
+	instrumentation_begin();
+	trace_hardirqs_on_prepare();
+	lockdep_hardirqs_on_prepare(CALLER_ADDR0);
+	instrumentation_end();
+
+	guest_enter_irqoff();
+	lockdep_hardirqs_on(CALLER_ADDR0);
+}
+
+static __always_inline void kvm_guest_exit_irqoff(void)
+{
+	/*
+	 * VMEXIT disables interrupts (host state), but tracing and lockdep
+	 * have them in state 'on' as recorded before entering guest mode.
+	 * Same as enter_from_user_mode().
+	 *
+	 * guest_exit_irqoff() restores host context and reinstates RCU if
+	 * enabled and required.
+	 *
+	 * This needs to be done before the below as native_read_msr()
+	 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
+	 * into world and some more.
+	 */
+	lockdep_hardirqs_off(CALLER_ADDR0);
+	guest_exit_irqoff();
+
+	instrumentation_begin();
+	trace_hardirqs_off_finish();
+	instrumentation_end();
+}
+
 #define KVM_NESTED_VMENTER_CONSISTENCY_CHECK(consistency_check)		\
 ({									\
 	bool failed = (consistency_check);				\
-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH 7/7] KVM: x86: Defer tick-based accounting 'til after IRQ handling
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
                   ` (5 preceding siblings ...)
  2021-04-13 18:29 ` [RFC PATCH 6/7] KVM: x86: Consolidate guest enter/exit logic to common helpers Sean Christopherson
@ 2021-04-13 18:29 ` Sean Christopherson
  2021-04-14 17:33 ` [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Thomas Gleixner
  7 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-13 18:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Thomas Gleixner,
	Christian Borntraeger, Michael Tokarev

When using tick-based accounting, defer the call to account guest time
until after servicing any IRQ(s) that happened in the guest (or
immediately after VM-Exit).  When using tick-based accounting, time is
accounted to the guest when PF_VCPU is set when the tick IRQ handler
runs.  The current approach of unconditionally accounting time in
kvm_guest_exit_irqoff() prevents IRQs that occur in the guest from ever
being processed with PF_VCPU set, since PF_VCPU ends up being set only
during the relatively short VM-Enter sequence, which runs entirely with
IRQs disabled.

Fixes: 87fa7f3e98a131 ("x86/kvm: Move context tracking where it belongs")
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 8 ++++++++
 arch/x86/kvm/x86.h | 9 ++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 16fb39503296..096bbf50b7a9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9230,6 +9230,14 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	local_irq_disable();
 	kvm_after_interrupt(vcpu);
 
+	/*
+	 * When using tick-based account, wait until after servicing IRQs to
+	 * account guest time so that any ticks that occurred while running the
+	 * guest are properly accounted to the guest.
+	 */
+	if (!IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_GEN))
+		kvm_vtime_account_guest_exit();
+
 	if (lapic_in_kernel(vcpu)) {
 		s64 delta = vcpu->arch.apic->lapic_timer.advance_expire_delta;
 		if (delta != S64_MIN) {
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 74ef92f47db8..039a7d585925 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -38,15 +38,18 @@ static __always_inline void kvm_guest_exit_irqoff(void)
 	 * have them in state 'on' as recorded before entering guest mode.
 	 * Same as enter_from_user_mode().
 	 *
-	 * guest_exit_irqoff() restores host context and reinstates RCU if
-	 * enabled and required.
+	 * context_tracking_guest_exit_irqoff() restores host context and
+	 * reinstates RCU if enabled and required.
 	 *
 	 * This needs to be done before the below as native_read_msr()
 	 * contains a tracepoint and x86_spec_ctrl_restore_host() calls
 	 * into world and some more.
 	 */
 	lockdep_hardirqs_off(CALLER_ADDR0);
-	guest_exit_irqoff();
+	context_tracking_guest_exit_irqoff();
+
+	if (IS_ENABLED(CONFIG_VIRT_CPU_ACCOUNTING_GEN))
+		kvm_vtime_account_guest_exit();
 
 	instrumentation_begin();
 	trace_hardirqs_off_finish();
-- 
2.31.1.295.g9ea45b61b8-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86
  2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
                   ` (6 preceding siblings ...)
  2021-04-13 18:29 ` [RFC PATCH 7/7] KVM: x86: Defer tick-based accounting 'til after IRQ handling Sean Christopherson
@ 2021-04-14 17:33 ` Thomas Gleixner
  2021-04-14 17:57   ` Sean Christopherson
  7 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2021-04-14 17:33 UTC (permalink / raw)
  To: Sean Christopherson, Paolo Bonzini
  Cc: Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Christian Borntraeger,
	Michael Tokarev

On Tue, Apr 13 2021 at 11:29, Sean Christopherson wrote:
> This is an alternative to Wanpeng's series[*] to fix tick-based accounting
> on x86.  The approach for fixing the bug is identical: defer accounting
> until after tick IRQs are handled.  The difference is purely in how the
> context tracking and vtime code is refactored in order to give KVM the
> hooks it needs to fix the x86 bug.
>
> x86 compile tested only, hence the RFC.  If folks like the direction and
> there are no unsolvable issues, I'll cross-compile, properly test on x86,
> and post an "official" series.

I like the final outcome of this, but we really want a small set of
patches first which actually fix the bug and is easy to backport and
then the larger consolidation on top.

Can you sort that out with Wanpeng please?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86
  2021-04-14 17:33 ` [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Thomas Gleixner
@ 2021-04-14 17:57   ` Sean Christopherson
  0 siblings, 0 replies; 10+ messages in thread
From: Sean Christopherson @ 2021-04-14 17:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, kvm, linux-kernel, Christian Borntraeger,
	Michael Tokarev

On Wed, Apr 14, 2021, Thomas Gleixner wrote:
> On Tue, Apr 13 2021 at 11:29, Sean Christopherson wrote:
> > This is an alternative to Wanpeng's series[*] to fix tick-based accounting
> > on x86.  The approach for fixing the bug is identical: defer accounting
> > until after tick IRQs are handled.  The difference is purely in how the
> > context tracking and vtime code is refactored in order to give KVM the
> > hooks it needs to fix the x86 bug.
> >
> > x86 compile tested only, hence the RFC.  If folks like the direction and
> > there are no unsolvable issues, I'll cross-compile, properly test on x86,
> > and post an "official" series.
> 
> I like the final outcome of this, but we really want a small set of
> patches first which actually fix the bug and is easy to backport and
> then the larger consolidation on top.
> 
> Can you sort that out with Wanpeng please?

Will do.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-04-14 17:57 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-13 18:29 [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Sean Christopherson
2021-04-13 18:29 ` [RFC PATCH 1/7] sched/vtime: Move guest enter/exit vtime accounting to separate helpers Sean Christopherson
2021-04-13 18:29 ` [RFC PATCH 2/7] context_tracking: Move guest enter/exit logic to standalone helpers Sean Christopherson
2021-04-13 18:29 ` [RFC PATCH 3/7] context_tracking: Consolidate guest enter/exit wrappers Sean Christopherson
2021-04-13 18:29 ` [RFC PATCH 4/7] context_tracking: KVM: Move guest enter/exit wrappers to KVM's domain Sean Christopherson
2021-04-13 18:29 ` [RFC PATCH 5/7] KVM: Move vtime accounting of guest exit to separate helper Sean Christopherson
2021-04-13 18:29 ` [RFC PATCH 6/7] KVM: x86: Consolidate guest enter/exit logic to common helpers Sean Christopherson
2021-04-13 18:29 ` [RFC PATCH 7/7] KVM: x86: Defer tick-based accounting 'til after IRQ handling Sean Christopherson
2021-04-14 17:33 ` [RFC PATCH 0/7] KVM: Fix tick-based vtime accounting on x86 Thomas Gleixner
2021-04-14 17:57   ` Sean Christopherson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).