linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest
@ 2015-02-10 14:41 riel
  2015-02-10 14:41 ` [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions riel
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: riel @ 2015-02-10 14:41 UTC (permalink / raw)
  To: paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

When running a KVM guest on a system with NOHZ_FULL enabled, and the
KVM guest running with idle=poll mode, we still get wakeups of the
rcuos/N threads.

This problem has already been solved for user space by telling the
RCU subsystem that the CPU is in an extended quiescent state while
running user space code.

This patch series extends that code a little bit to make it usable
to track KVM guest space, too.

I tested the code by booting a KVM guest with idle=poll, on a system
with NOHZ_FULL enabled on most CPUs, and a VCPU thread bound to a
CPU. In a 10 second interval, rcuos/N threads on other CPUs got woken
up several times, while the rcuos thread on the CPU running the bound
and alwasy running VCPU thread never got woken up once.

Thanks to Christian Borntraeger, Paul McKenney, Paulo Bonzini,
Frederic Weisbecker, and Will Deacon for reviewing and improving
earlier versions of this patch series.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions
  2015-02-10 14:41 [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
@ 2015-02-10 14:41 ` riel
  2015-02-10 15:28   ` Frederic Weisbecker
  2015-02-10 14:41 ` [PATCH 2/6] rcu,nohz: add state parameter to context_tracking_enter/exit riel
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: riel @ 2015-02-10 14:41 UTC (permalink / raw)
  To: paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

These wrapper functions allow architecture code (eg. ARM) to keep
calling context_tracking_user_enter & context_tracking_user_exit
the same way it always has, without error prone tricks like duplicate
defines of argument values in assembly code.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking.h |  2 ++
 kernel/context_tracking.c        | 37 +++++++++++++++++++++++++------------
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 37b81bd51ec0..03b9c733eae7 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -10,6 +10,8 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 extern void context_tracking_cpu_set(int cpu);
 
+extern void context_tracking_enter(void);
+extern void context_tracking_exit(void);
 extern void context_tracking_user_enter(void);
 extern void context_tracking_user_exit(void);
 extern void __context_tracking_task_switch(struct task_struct *prev,
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 937ecdfdf258..bbdc423936e6 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -39,15 +39,15 @@ void context_tracking_cpu_set(int cpu)
 }
 
 /**
- * context_tracking_user_enter - Inform the context tracking that the CPU is going to
- *                               enter userspace mode.
+ * context_tracking_enter - Inform the context tracking that the CPU is going
+ *                          enter user or guest space mode.
  *
  * This function must be called right before we switch from the kernel
- * to userspace, when it's guaranteed the remaining kernel instructions
- * to execute won't use any RCU read side critical section because this
- * function sets RCU in extended quiescent state.
+ * to user or guest space, when it's guaranteed the remaining kernel
+ * instructions to execute won't use any RCU read side critical section
+ * because this function sets RCU in extended quiescent state.
  */
-void context_tracking_user_enter(void)
+void context_tracking_enter(void)
 {
 	unsigned long flags;
 
@@ -105,20 +105,27 @@ void context_tracking_user_enter(void)
 	}
 	local_irq_restore(flags);
 }
+NOKPROBE_SYMBOL(context_tracking_enter);
+
+void context_tracking_user_enter(void)
+{
+	context_tracking_enter();
+}
 NOKPROBE_SYMBOL(context_tracking_user_enter);
 
 /**
- * context_tracking_user_exit - Inform the context tracking that the CPU is
- *                              exiting userspace mode and entering the kernel.
+ * context_tracking_exit - Inform the context tracking that the CPU is
+ *                         exiting user or guest mode and entering the kernel.
  *
- * This function must be called after we entered the kernel from userspace
- * before any use of RCU read side critical section. This potentially include
- * any high level kernel code like syscalls, exceptions, signal handling, etc...
+ * This function must be called after we entered the kernel from user or
+ * guest space before any use of RCU read side critical section. This
+ * potentially include any high level kernel code like syscalls, exceptions,
+ * signal handling, etc...
  *
  * This call supports re-entrancy. This way it can be called from any exception
  * handler without needing to know if we came from userspace or not.
  */
-void context_tracking_user_exit(void)
+void context_tracking_exit(void)
 {
 	unsigned long flags;
 
@@ -143,6 +150,12 @@ void context_tracking_user_exit(void)
 	}
 	local_irq_restore(flags);
 }
+NOKPROBE_SYMBOL(context_tracking_exit);
+
+void context_tracking_user_exit(void)
+{
+	context_tracking_exit();
+}
 NOKPROBE_SYMBOL(context_tracking_user_exit);
 
 /**
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/6] rcu,nohz: add state parameter to context_tracking_enter/exit
  2015-02-10 14:41 [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
  2015-02-10 14:41 ` [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions riel
@ 2015-02-10 14:41 ` riel
  2015-02-10 14:41 ` [PATCH 3/6] nohz: add stub context_tracking_is_enabled riel
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: riel @ 2015-02-10 14:41 UTC (permalink / raw)
  To: paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Add the expected ctx_state as a parameter to context_tracking_enter and
context_tracking_exit, allowing the same functions to not just track
kernel <> user space switching, but also kernel <> guest transitions.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking.h | 10 +++++-----
 kernel/context_tracking.c        | 14 +++++++-------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 03b9c733eae7..954253283709 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -10,8 +10,8 @@
 #ifdef CONFIG_CONTEXT_TRACKING
 extern void context_tracking_cpu_set(int cpu);
 
-extern void context_tracking_enter(void);
-extern void context_tracking_exit(void);
+extern void context_tracking_enter(enum ctx_state state);
+extern void context_tracking_exit(enum ctx_state state);
 extern void context_tracking_user_enter(void);
 extern void context_tracking_user_exit(void);
 extern void __context_tracking_task_switch(struct task_struct *prev,
@@ -37,7 +37,7 @@ static inline enum ctx_state exception_enter(void)
 		return 0;
 
 	prev_ctx = this_cpu_read(context_tracking.state);
-	context_tracking_user_exit();
+	context_tracking_exit(prev_ctx);
 
 	return prev_ctx;
 }
@@ -45,8 +45,8 @@ static inline enum ctx_state exception_enter(void)
 static inline void exception_exit(enum ctx_state prev_ctx)
 {
 	if (context_tracking_is_enabled()) {
-		if (prev_ctx == IN_USER)
-			context_tracking_user_enter();
+		if (prev_ctx != IN_KERNEL)
+			context_tracking_enter(prev_ctx);
 	}
 }
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index bbdc423936e6..38e38aeac8b9 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -47,7 +47,7 @@ void context_tracking_cpu_set(int cpu)
  * instructions to execute won't use any RCU read side critical section
  * because this function sets RCU in extended quiescent state.
  */
-void context_tracking_enter(void)
+void context_tracking_enter(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -75,7 +75,7 @@ void context_tracking_enter(void)
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
-	if ( __this_cpu_read(context_tracking.state) != IN_USER) {
+	if ( __this_cpu_read(context_tracking.state) != state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			trace_user_enter(0);
 			/*
@@ -101,7 +101,7 @@ void context_tracking_enter(void)
 		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
 		 * is false because we know that CPU is not tickless.
 		 */
-		__this_cpu_write(context_tracking.state, IN_USER);
+		__this_cpu_write(context_tracking.state, state);
 	}
 	local_irq_restore(flags);
 }
@@ -109,7 +109,7 @@ NOKPROBE_SYMBOL(context_tracking_enter);
 
 void context_tracking_user_enter(void)
 {
-	context_tracking_enter();
+	context_tracking_enter(IN_USER);
 }
 NOKPROBE_SYMBOL(context_tracking_user_enter);
 
@@ -125,7 +125,7 @@ NOKPROBE_SYMBOL(context_tracking_user_enter);
  * This call supports re-entrancy. This way it can be called from any exception
  * handler without needing to know if we came from userspace or not.
  */
-void context_tracking_exit(void)
+void context_tracking_exit(enum ctx_state state)
 {
 	unsigned long flags;
 
@@ -136,7 +136,7 @@ void context_tracking_exit(void)
 		return;
 
 	local_irq_save(flags);
-	if (__this_cpu_read(context_tracking.state) == IN_USER) {
+	if (__this_cpu_read(context_tracking.state) == state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
 			 * We are going to run code that may use RCU. Inform
@@ -154,7 +154,7 @@ NOKPROBE_SYMBOL(context_tracking_exit);
 
 void context_tracking_user_exit(void)
 {
-	context_tracking_exit();
+	context_tracking_exit(IN_USER);
 }
 NOKPROBE_SYMBOL(context_tracking_user_exit);
 
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/6] nohz: add stub context_tracking_is_enabled
  2015-02-10 14:41 [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
  2015-02-10 14:41 ` [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions riel
  2015-02-10 14:41 ` [PATCH 2/6] rcu,nohz: add state parameter to context_tracking_enter/exit riel
@ 2015-02-10 14:41 ` riel
  2015-02-10 14:41 ` [PATCH 4/6] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: riel @ 2015-02-10 14:41 UTC (permalink / raw)
  To: paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

With code elsewhere doing something conditional on whether or not
context tracking is enabled, we want a stub function that tells us
context tracking is not enabled, when CONFIG_CONTEXT_TRACKING is
not set.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking_state.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 97a81225d037..72ab10fe1e46 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -39,6 +39,8 @@ static inline bool context_tracking_in_user(void)
 #else
 static inline bool context_tracking_in_user(void) { return false; }
 static inline bool context_tracking_active(void) { return false; }
+static inline bool context_tracking_is_enabled(void) { return false; }
+static inline bool context_tracking_cpu_is_enabled(void) { return false; }
 #endif /* CONFIG_CONTEXT_TRACKING */
 
 #endif
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/6] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER
  2015-02-10 14:41 [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (2 preceding siblings ...)
  2015-02-10 14:41 ` [PATCH 3/6] nohz: add stub context_tracking_is_enabled riel
@ 2015-02-10 14:41 ` riel
  2015-02-10 14:41 ` [PATCH 5/6] nohz,kvm: export context_tracking_user_enter/exit riel
  2015-02-10 14:41 ` [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
  5 siblings, 0 replies; 19+ messages in thread
From: riel @ 2015-02-10 14:41 UTC (permalink / raw)
  To: paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Only run vtime_user_enter, vtime_user_exit, and the user enter & exit
trace points when we are entering or exiting user state, respectively.

The KVM code in guest_enter and guest_exit already take care of calling
vtime_guest_enter and vtime_guest_exit, respectively.

The RCU code only distinguishes between "idle" and "not idle or kernel".
There should be no need to add an additional (unused) state there.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/context_tracking.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 38e38aeac8b9..0e4e318d5ea4 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -77,7 +77,6 @@ void context_tracking_enter(enum ctx_state state)
 	local_irq_save(flags);
 	if ( __this_cpu_read(context_tracking.state) != state) {
 		if (__this_cpu_read(context_tracking.active)) {
-			trace_user_enter(0);
 			/*
 			 * At this stage, only low level arch entry code remains and
 			 * then we'll run in userspace. We can assume there won't be
@@ -85,7 +84,10 @@ void context_tracking_enter(enum ctx_state state)
 			 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
 			 * on the tick.
 			 */
-			vtime_user_enter(current);
+			if (state == IN_USER) {
+				trace_user_enter(0);
+				vtime_user_enter(current);
+			}
 			rcu_user_enter();
 		}
 		/*
@@ -143,8 +145,10 @@ void context_tracking_exit(enum ctx_state state)
 			 * RCU core about that (ie: we may need the tick again).
 			 */
 			rcu_user_exit();
-			vtime_user_exit(current);
-			trace_user_exit(0);
+			if (state == IN_USER) {
+				vtime_user_exit(current);
+				trace_user_exit(0);
+			}
 		}
 		__this_cpu_write(context_tracking.state, IN_KERNEL);
 	}
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/6] nohz,kvm: export context_tracking_user_enter/exit
  2015-02-10 14:41 [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (3 preceding siblings ...)
  2015-02-10 14:41 ` [PATCH 4/6] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
@ 2015-02-10 14:41 ` riel
  2015-02-10 14:41 ` [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
  5 siblings, 0 replies; 19+ messages in thread
From: riel @ 2015-02-10 14:41 UTC (permalink / raw)
  To: paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

Export context_tracking_user_enter/exit so it can be used by KVM.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 kernel/context_tracking.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 0e4e318d5ea4..5bdf1a342ab3 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -108,6 +108,7 @@ void context_tracking_enter(enum ctx_state state)
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_enter);
+EXPORT_SYMBOL_GPL(context_tracking_enter);
 
 void context_tracking_user_enter(void)
 {
@@ -155,6 +156,7 @@ void context_tracking_exit(enum ctx_state state)
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_exit);
+EXPORT_SYMBOL_GPL(context_tracking_exit);
 
 void context_tracking_user_exit(void)
 {
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 14:41 [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
                   ` (4 preceding siblings ...)
  2015-02-10 14:41 ` [PATCH 5/6] nohz,kvm: export context_tracking_user_enter/exit riel
@ 2015-02-10 14:41 ` riel
  2015-02-10 19:59   ` Andy Lutomirski
  5 siblings, 1 reply; 19+ messages in thread
From: riel @ 2015-02-10 14:41 UTC (permalink / raw)
  To: paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

From: Rik van Riel <riel@redhat.com>

The host kernel is not doing anything while the CPU is executing
a KVM guest VCPU, so it can be marked as being in an extended
quiescent state, identical to that used when running user space
code.

The only exception to that rule is when the host handles an
interrupt, which is already handled by the irq code, which
calls rcu_irq_enter and rcu_irq_exit.

The guest_enter and guest_exit functions already switch vtime
accounting independent of context tracking. Leave those calls
where they are, instead of moving them into the context tracking
code.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 include/linux/context_tracking.h       | 6 ++++++
 include/linux/context_tracking_state.h | 1 +
 include/linux/kvm_host.h               | 3 ++-
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 954253283709..b65fd1420e53 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -80,10 +80,16 @@ static inline void guest_enter(void)
 		vtime_guest_enter(current);
 	else
 		current->flags |= PF_VCPU;
+
+	if (context_tracking_is_enabled())
+		context_tracking_enter(IN_GUEST);
 }
 
 static inline void guest_exit(void)
 {
+	if (context_tracking_is_enabled())
+		context_tracking_exit(IN_GUEST);
+
 	if (vtime_accounting_enabled())
 		vtime_guest_exit(current);
 	else
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 72ab10fe1e46..90a7bab8779e 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -15,6 +15,7 @@ struct context_tracking {
 	enum ctx_state {
 		IN_KERNEL = 0,
 		IN_USER,
+		IN_GUEST,
 	} state;
 };
 
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 26f106022c88..c7828a6a9614 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -772,7 +772,8 @@ static inline void kvm_guest_enter(void)
 	 * one time slice). Lets treat guest mode as quiescent state, just like
 	 * we do with user-mode execution.
 	 */
-	rcu_virt_note_context_switch(smp_processor_id());
+	if (!context_tracking_cpu_is_enabled())
+		rcu_virt_note_context_switch(smp_processor_id());
 }
 
 static inline void kvm_guest_exit(void)
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions
  2015-02-10 14:41 ` [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions riel
@ 2015-02-10 15:28   ` Frederic Weisbecker
  2015-02-10 16:48     ` Rik van Riel
  0 siblings, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2015-02-10 15:28 UTC (permalink / raw)
  To: riel
  Cc: paulmck, will.deacon, linux-kernel, Catalin.Marinas, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On Tue, Feb 10, 2015 at 09:41:45AM -0500, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
> 
> These wrapper functions allow architecture code (eg. ARM) to keep
> calling context_tracking_user_enter & context_tracking_user_exit
> the same way it always has, without error prone tricks like duplicate
> defines of argument values in assembly code.
> 
> Signed-off-by: Rik van Riel <riel@redhat.com>

This patch alone doesn't make much sense. The changelog says it's about keeping
context_tracking_user_*() functions as wrappers but fails to explain to what they
wrap, why and what are the new context_tracking_enter/exit functions for.

Perhaps patches 1 and 2 should be merged together into something like:

	context_tracking: Generalize context tracking APIs to support user and guest

        Do that because we'll also track guest....etc  And keep the old user context tracking APIs
        for now to avoid painful enum parameter support in ARM assembly.... 

> ---
>  include/linux/context_tracking.h |  2 ++
>  kernel/context_tracking.c        | 37 +++++++++++++++++++++++++------------
>  2 files changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index 37b81bd51ec0..03b9c733eae7 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -10,6 +10,8 @@
>  #ifdef CONFIG_CONTEXT_TRACKING
>  extern void context_tracking_cpu_set(int cpu);
>  
> +extern void context_tracking_enter(void);
> +extern void context_tracking_exit(void);
>  extern void context_tracking_user_enter(void);
>  extern void context_tracking_user_exit(void);
>  extern void __context_tracking_task_switch(struct task_struct *prev,
> diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
> index 937ecdfdf258..bbdc423936e6 100644
> --- a/kernel/context_tracking.c
> +++ b/kernel/context_tracking.c
> @@ -39,15 +39,15 @@ void context_tracking_cpu_set(int cpu)
>  }
>  
>  /**
> - * context_tracking_user_enter - Inform the context tracking that the CPU is going to
> - *                               enter userspace mode.
> + * context_tracking_enter - Inform the context tracking that the CPU is going
> + *                          enter user or guest space mode.
>   *
>   * This function must be called right before we switch from the kernel
> - * to userspace, when it's guaranteed the remaining kernel instructions
> - * to execute won't use any RCU read side critical section because this
> - * function sets RCU in extended quiescent state.
> + * to user or guest space, when it's guaranteed the remaining kernel
> + * instructions to execute won't use any RCU read side critical section
> + * because this function sets RCU in extended quiescent state.
>   */
> -void context_tracking_user_enter(void)
> +void context_tracking_enter(void)
>  {
>  	unsigned long flags;
>  
> @@ -105,20 +105,27 @@ void context_tracking_user_enter(void)
>  	}
>  	local_irq_restore(flags);
>  }
> +NOKPROBE_SYMBOL(context_tracking_enter);
> +
> +void context_tracking_user_enter(void)
> +{
> +	context_tracking_enter();
> +}
>  NOKPROBE_SYMBOL(context_tracking_user_enter);
>  
>  /**
> - * context_tracking_user_exit - Inform the context tracking that the CPU is
> - *                              exiting userspace mode and entering the kernel.
> + * context_tracking_exit - Inform the context tracking that the CPU is
> + *                         exiting user or guest mode and entering the kernel.
>   *
> - * This function must be called after we entered the kernel from userspace
> - * before any use of RCU read side critical section. This potentially include
> - * any high level kernel code like syscalls, exceptions, signal handling, etc...
> + * This function must be called after we entered the kernel from user or
> + * guest space before any use of RCU read side critical section. This
> + * potentially include any high level kernel code like syscalls, exceptions,
> + * signal handling, etc...
>   *
>   * This call supports re-entrancy. This way it can be called from any exception
>   * handler without needing to know if we came from userspace or not.
>   */
> -void context_tracking_user_exit(void)
> +void context_tracking_exit(void)

Comment changes are good, thanks!

>  {
>  	unsigned long flags;
>  
> @@ -143,6 +150,12 @@ void context_tracking_user_exit(void)
>  	}
>  	local_irq_restore(flags);
>  }
> +NOKPROBE_SYMBOL(context_tracking_exit);
> +
> +void context_tracking_user_exit(void)
> +{
> +	context_tracking_exit();
> +}
>  NOKPROBE_SYMBOL(context_tracking_user_exit);
>  
>  /**
> -- 
> 1.9.3
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions
  2015-02-10 15:28   ` Frederic Weisbecker
@ 2015-02-10 16:48     ` Rik van Riel
  2015-02-10 17:25       ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Rik van Riel @ 2015-02-10 16:48 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: paulmck, will.deacon, linux-kernel, Catalin.Marinas, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On 02/10/2015 10:28 AM, Frederic Weisbecker wrote:
> On Tue, Feb 10, 2015 at 09:41:45AM -0500, riel@redhat.com wrote:
>> From: Rik van Riel <riel@redhat.com>
>>
>> These wrapper functions allow architecture code (eg. ARM) to keep
>> calling context_tracking_user_enter & context_tracking_user_exit
>> the same way it always has, without error prone tricks like duplicate
>> defines of argument values in assembly code.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
> 
> This patch alone doesn't make much sense. 

Agreed, my goal was to keep things super easy to review,
to reduce the chance of introducing bugs.

> The changelog says it's about keeping
> context_tracking_user_*() functions as wrappers but fails to explain to what they
> wrap, why and what are the new context_tracking_enter/exit functions for.
> 
> Perhaps patches 1 and 2 should be merged together into something like:
> 
> 	context_tracking: Generalize context tracking APIs to support user and guest
> 
>         Do that because we'll also track guest....etc  And keep the old user context tracking APIs
>         for now to avoid painful enum parameter support in ARM assembly.... 

Can do...

Paul, would you like me to resend the whole series, or just
a merged patch that replaces patches 1 & 2?

That is assuming Paul prefers having the patches merged into
one :)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions
  2015-02-10 16:48     ` Rik van Riel
@ 2015-02-10 17:25       ` Paul E. McKenney
  2015-02-10 17:36         ` Frederic Weisbecker
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-02-10 17:25 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Frederic Weisbecker, will.deacon, linux-kernel, Catalin.Marinas,
	kvm, mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On Tue, Feb 10, 2015 at 11:48:37AM -0500, Rik van Riel wrote:
> On 02/10/2015 10:28 AM, Frederic Weisbecker wrote:
> > On Tue, Feb 10, 2015 at 09:41:45AM -0500, riel@redhat.com wrote:
> >> From: Rik van Riel <riel@redhat.com>
> >>
> >> These wrapper functions allow architecture code (eg. ARM) to keep
> >> calling context_tracking_user_enter & context_tracking_user_exit
> >> the same way it always has, without error prone tricks like duplicate
> >> defines of argument values in assembly code.
> >>
> >> Signed-off-by: Rik van Riel <riel@redhat.com>
> > 
> > This patch alone doesn't make much sense. 
> 
> Agreed, my goal was to keep things super easy to review,
> to reduce the chance of introducing bugs.
> 
> > The changelog says it's about keeping
> > context_tracking_user_*() functions as wrappers but fails to explain to what they
> > wrap, why and what are the new context_tracking_enter/exit functions for.
> > 
> > Perhaps patches 1 and 2 should be merged together into something like:
> > 
> > 	context_tracking: Generalize context tracking APIs to support user and guest
> > 
> >         Do that because we'll also track guest....etc  And keep the old user context tracking APIs
> >         for now to avoid painful enum parameter support in ARM assembly.... 
> 
> Can do...
> 
> Paul, would you like me to resend the whole series, or just
> a merged patch that replaces patches 1 & 2?

I prefer the whole series, as it reduces my opportunity to introduce
human error when applying them.  ;-)

> That is assuming Paul prefers having the patches merged into
> one :)

I am OK with that.  I will merge the next set with the first two patches
merged, people have had sufficient opportunity to review.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions
  2015-02-10 17:25       ` Paul E. McKenney
@ 2015-02-10 17:36         ` Frederic Weisbecker
  2015-02-10 17:49           ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Frederic Weisbecker @ 2015-02-10 17:36 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Rik van Riel, will.deacon, linux-kernel, Catalin.Marinas, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On Tue, Feb 10, 2015 at 09:25:26AM -0800, Paul E. McKenney wrote:
> On Tue, Feb 10, 2015 at 11:48:37AM -0500, Rik van Riel wrote:
> > On 02/10/2015 10:28 AM, Frederic Weisbecker wrote:
> > > On Tue, Feb 10, 2015 at 09:41:45AM -0500, riel@redhat.com wrote:
> > >> From: Rik van Riel <riel@redhat.com>
> > >>
> > >> These wrapper functions allow architecture code (eg. ARM) to keep
> > >> calling context_tracking_user_enter & context_tracking_user_exit
> > >> the same way it always has, without error prone tricks like duplicate
> > >> defines of argument values in assembly code.
> > >>
> > >> Signed-off-by: Rik van Riel <riel@redhat.com>
> > > 
> > > This patch alone doesn't make much sense. 
> > 
> > Agreed, my goal was to keep things super easy to review,
> > to reduce the chance of introducing bugs.
> > 
> > > The changelog says it's about keeping
> > > context_tracking_user_*() functions as wrappers but fails to explain to what they
> > > wrap, why and what are the new context_tracking_enter/exit functions for.
> > > 
> > > Perhaps patches 1 and 2 should be merged together into something like:
> > > 
> > > 	context_tracking: Generalize context tracking APIs to support user and guest
> > > 
> > >         Do that because we'll also track guest....etc  And keep the old user context tracking APIs
> > >         for now to avoid painful enum parameter support in ARM assembly.... 
> > 
> > Can do...
> > 
> > Paul, would you like me to resend the whole series, or just
> > a merged patch that replaces patches 1 & 2?
> 
> I prefer the whole series, as it reduces my opportunity to introduce
> human error when applying them.  ;-)
> 
> > That is assuming Paul prefers having the patches merged into
> > one :)
> 
> I am OK with that.  I will merge the next set with the first two patches
> merged, people have had sufficient opportunity to review.

BTW, I have a few patches to make on the next cycle to fix a few context tracking
related things. And since it's too late to push this series for the current merge window,
now I wonder it may be easier if I take these patches. Otherwise you might experience
unpleasant rebase conflicts. Is that ok for you?

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions
  2015-02-10 17:36         ` Frederic Weisbecker
@ 2015-02-10 17:49           ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2015-02-10 17:49 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Rik van Riel, will.deacon, linux-kernel, Catalin.Marinas, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On Tue, Feb 10, 2015 at 06:36:47PM +0100, Frederic Weisbecker wrote:
> On Tue, Feb 10, 2015 at 09:25:26AM -0800, Paul E. McKenney wrote:
> > On Tue, Feb 10, 2015 at 11:48:37AM -0500, Rik van Riel wrote:
> > > On 02/10/2015 10:28 AM, Frederic Weisbecker wrote:
> > > > On Tue, Feb 10, 2015 at 09:41:45AM -0500, riel@redhat.com wrote:
> > > >> From: Rik van Riel <riel@redhat.com>
> > > >>
> > > >> These wrapper functions allow architecture code (eg. ARM) to keep
> > > >> calling context_tracking_user_enter & context_tracking_user_exit
> > > >> the same way it always has, without error prone tricks like duplicate
> > > >> defines of argument values in assembly code.
> > > >>
> > > >> Signed-off-by: Rik van Riel <riel@redhat.com>
> > > > 
> > > > This patch alone doesn't make much sense. 
> > > 
> > > Agreed, my goal was to keep things super easy to review,
> > > to reduce the chance of introducing bugs.
> > > 
> > > > The changelog says it's about keeping
> > > > context_tracking_user_*() functions as wrappers but fails to explain to what they
> > > > wrap, why and what are the new context_tracking_enter/exit functions for.
> > > > 
> > > > Perhaps patches 1 and 2 should be merged together into something like:
> > > > 
> > > > 	context_tracking: Generalize context tracking APIs to support user and guest
> > > > 
> > > >         Do that because we'll also track guest....etc  And keep the old user context tracking APIs
> > > >         for now to avoid painful enum parameter support in ARM assembly.... 
> > > 
> > > Can do...
> > > 
> > > Paul, would you like me to resend the whole series, or just
> > > a merged patch that replaces patches 1 & 2?
> > 
> > I prefer the whole series, as it reduces my opportunity to introduce
> > human error when applying them.  ;-)
> > 
> > > That is assuming Paul prefers having the patches merged into
> > > one :)
> > 
> > I am OK with that.  I will merge the next set with the first two patches
> > merged, people have had sufficient opportunity to review.
> 
> BTW, I have a few patches to make on the next cycle to fix a few context tracking
> related things. And since it's too late to push this series for the current merge window,
> now I wonder it may be easier if I take these patches. Otherwise you might experience
> unpleasant rebase conflicts. Is that ok for you?

Works for me!  ;-)

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 14:41 ` [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
@ 2015-02-10 19:59   ` Andy Lutomirski
  2015-02-10 20:13     ` Rik van Riel
  2015-02-10 20:14     ` Paul E. McKenney
  0 siblings, 2 replies; 19+ messages in thread
From: Andy Lutomirski @ 2015-02-10 19:59 UTC (permalink / raw)
  To: riel, paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> From: Rik van Riel <riel@redhat.com>
>
> The host kernel is not doing anything while the CPU is executing
> a KVM guest VCPU, so it can be marked as being in an extended
> quiescent state, identical to that used when running user space
> code.
>
> The only exception to that rule is when the host handles an
> interrupt, which is already handled by the irq code, which
> calls rcu_irq_enter and rcu_irq_exit.
>
> The guest_enter and guest_exit functions already switch vtime
> accounting independent of context tracking. Leave those calls
> where they are, instead of moving them into the context tracking
> code.
>
> Signed-off-by: Rik van Riel <riel@redhat.com>
> ---
>   include/linux/context_tracking.h       | 6 ++++++
>   include/linux/context_tracking_state.h | 1 +
>   include/linux/kvm_host.h               | 3 ++-
>   3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> index 954253283709..b65fd1420e53 100644
> --- a/include/linux/context_tracking.h
> +++ b/include/linux/context_tracking.h
> @@ -80,10 +80,16 @@ static inline void guest_enter(void)
>   		vtime_guest_enter(current);
>   	else
>   		current->flags |= PF_VCPU;
> +
> +	if (context_tracking_is_enabled())
> +		context_tracking_enter(IN_GUEST);

Why the if statement?

Also, have you checked how much this hurts guest lightweight entry/exit 
latency?  Context tracking is shockingly expensive for reasons I don't 
fully understand, but hopefully most of it is the vtime stuff.  (Context 
tracking is *so* expensive that I almost think we should set the 
performance taint flag if we enable it, assuming that flag ended up 
getting merged.  Also, we should make context tracking faster.)

--Andy

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 19:59   ` Andy Lutomirski
@ 2015-02-10 20:13     ` Rik van Riel
  2015-02-10 20:14     ` Paul E. McKenney
  1 sibling, 0 replies; 19+ messages in thread
From: Rik van Riel @ 2015-02-10 20:13 UTC (permalink / raw)
  To: Andy Lutomirski, paulmck
  Cc: will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On 02/10/2015 02:59 PM, Andy Lutomirski wrote:
> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
>> From: Rik van Riel <riel@redhat.com>
>>
>> The host kernel is not doing anything while the CPU is executing
>> a KVM guest VCPU, so it can be marked as being in an extended
>> quiescent state, identical to that used when running user space
>> code.
>>
>> The only exception to that rule is when the host handles an
>> interrupt, which is already handled by the irq code, which
>> calls rcu_irq_enter and rcu_irq_exit.
>>
>> The guest_enter and guest_exit functions already switch vtime
>> accounting independent of context tracking. Leave those calls
>> where they are, instead of moving them into the context tracking
>> code.
>>
>> Signed-off-by: Rik van Riel <riel@redhat.com>
>> ---
>>   include/linux/context_tracking.h       | 6 ++++++
>>   include/linux/context_tracking_state.h | 1 +
>>   include/linux/kvm_host.h               | 3 ++-
>>   3 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/context_tracking.h
>> b/include/linux/context_tracking.h
>> index 954253283709..b65fd1420e53 100644
>> --- a/include/linux/context_tracking.h
>> +++ b/include/linux/context_tracking.h
>> @@ -80,10 +80,16 @@ static inline void guest_enter(void)
>>           vtime_guest_enter(current);
>>       else
>>           current->flags |= PF_VCPU;
>> +
>> +    if (context_tracking_is_enabled())
>> +        context_tracking_enter(IN_GUEST);
> 
> Why the if statement?
> 
> Also, have you checked how much this hurts guest lightweight entry/exit
> latency?  Context tracking is shockingly expensive for reasons I don't
> fully understand, but hopefully most of it is the vtime stuff.

Guest_enter and guest_exit already do the vtime stuff today.

This patch series adds the rcu stuff, and modifies
context_tracking_enter & context_tracking_exit to not
do the vtime stuff twice.

> (Context tracking is *so* expensive that I almost think we should set the
> performance taint flag if we enable it, assuming that flag ended up
> getting merged.  Also, we should make context tracking faster.)

I am all for making it faster :)


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 19:59   ` Andy Lutomirski
  2015-02-10 20:13     ` Rik van Riel
@ 2015-02-10 20:14     ` Paul E. McKenney
  2015-02-10 20:19       ` Andy Lutomirski
  1 sibling, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-02-10 20:14 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: riel, will.deacon, linux-kernel, Catalin.Marinas, fweisbec, kvm,
	mtosatti, borntraeger, mingo, oleg, lcapitulino, pbonzini

On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> >From: Rik van Riel <riel@redhat.com>
> >
> >The host kernel is not doing anything while the CPU is executing
> >a KVM guest VCPU, so it can be marked as being in an extended
> >quiescent state, identical to that used when running user space
> >code.
> >
> >The only exception to that rule is when the host handles an
> >interrupt, which is already handled by the irq code, which
> >calls rcu_irq_enter and rcu_irq_exit.
> >
> >The guest_enter and guest_exit functions already switch vtime
> >accounting independent of context tracking. Leave those calls
> >where they are, instead of moving them into the context tracking
> >code.
> >
> >Signed-off-by: Rik van Riel <riel@redhat.com>
> >---
> >  include/linux/context_tracking.h       | 6 ++++++
> >  include/linux/context_tracking_state.h | 1 +
> >  include/linux/kvm_host.h               | 3 ++-
> >  3 files changed, 9 insertions(+), 1 deletion(-)
> >
> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >index 954253283709..b65fd1420e53 100644
> >--- a/include/linux/context_tracking.h
> >+++ b/include/linux/context_tracking.h
> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
> >  		vtime_guest_enter(current);
> >  	else
> >  		current->flags |= PF_VCPU;
> >+
> >+	if (context_tracking_is_enabled())
> >+		context_tracking_enter(IN_GUEST);
> 
> Why the if statement?
> 
> Also, have you checked how much this hurts guest lightweight
> entry/exit latency?  Context tracking is shockingly expensive for
> reasons I don't fully understand, but hopefully most of it is the
> vtime stuff.  (Context tracking is *so* expensive that I almost
> think we should set the performance taint flag if we enable it,
> assuming that flag ended up getting merged.  Also, we should make
> context tracking faster.)

It turns out that context_tracking_is_enabled() is a static inline
that uses a static_key, so the overhead should be minimal on platforms
having a full implementation of static keys.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 20:14     ` Paul E. McKenney
@ 2015-02-10 20:19       ` Andy Lutomirski
  2015-02-10 20:42         ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Andy Lutomirski @ 2015-02-10 20:19 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Rik van Riel, Will Deacon, linux-kernel, Catalin Marinas,
	Frédéric Weisbecker, kvm list, Marcelo Tosatti,
	Christian Borntraeger, Ingo Molnar, Oleg Nesterov,
	Luiz Capitulino, Paolo Bonzini

On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
>> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
>> >From: Rik van Riel <riel@redhat.com>
>> >
>> >The host kernel is not doing anything while the CPU is executing
>> >a KVM guest VCPU, so it can be marked as being in an extended
>> >quiescent state, identical to that used when running user space
>> >code.
>> >
>> >The only exception to that rule is when the host handles an
>> >interrupt, which is already handled by the irq code, which
>> >calls rcu_irq_enter and rcu_irq_exit.
>> >
>> >The guest_enter and guest_exit functions already switch vtime
>> >accounting independent of context tracking. Leave those calls
>> >where they are, instead of moving them into the context tracking
>> >code.
>> >
>> >Signed-off-by: Rik van Riel <riel@redhat.com>
>> >---
>> >  include/linux/context_tracking.h       | 6 ++++++
>> >  include/linux/context_tracking_state.h | 1 +
>> >  include/linux/kvm_host.h               | 3 ++-
>> >  3 files changed, 9 insertions(+), 1 deletion(-)
>> >
>> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>> >index 954253283709..b65fd1420e53 100644
>> >--- a/include/linux/context_tracking.h
>> >+++ b/include/linux/context_tracking.h
>> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
>> >             vtime_guest_enter(current);
>> >     else
>> >             current->flags |= PF_VCPU;
>> >+
>> >+    if (context_tracking_is_enabled())
>> >+            context_tracking_enter(IN_GUEST);
>>
>> Why the if statement?
>>
>> Also, have you checked how much this hurts guest lightweight
>> entry/exit latency?  Context tracking is shockingly expensive for
>> reasons I don't fully understand, but hopefully most of it is the
>> vtime stuff.  (Context tracking is *so* expensive that I almost
>> think we should set the performance taint flag if we enable it,
>> assuming that flag ended up getting merged.  Also, we should make
>> context tracking faster.)
>
> It turns out that context_tracking_is_enabled() is a static inline
> that uses a static_key, so the overhead should be minimal on platforms
> having a full implementation of static keys.

Shouldn't we just fold that into context_tracking_xyz_enter?

Also, why does the vtime stuff depend on RCU extended quiescent
states?  To me, they seem mostly orthogonal other than the fact that
they hook into the same places.

--Andy

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 20:19       ` Andy Lutomirski
@ 2015-02-10 20:42         ` Paul E. McKenney
  2015-02-10 21:00           ` Andy Lutomirski
  0 siblings, 1 reply; 19+ messages in thread
From: Paul E. McKenney @ 2015-02-10 20:42 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Rik van Riel, Will Deacon, linux-kernel, Catalin Marinas,
	Frédéric Weisbecker, kvm list, Marcelo Tosatti,
	Christian Borntraeger, Ingo Molnar, Oleg Nesterov,
	Luiz Capitulino, Paolo Bonzini

On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
> >> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> >> >From: Rik van Riel <riel@redhat.com>
> >> >
> >> >The host kernel is not doing anything while the CPU is executing
> >> >a KVM guest VCPU, so it can be marked as being in an extended
> >> >quiescent state, identical to that used when running user space
> >> >code.
> >> >
> >> >The only exception to that rule is when the host handles an
> >> >interrupt, which is already handled by the irq code, which
> >> >calls rcu_irq_enter and rcu_irq_exit.
> >> >
> >> >The guest_enter and guest_exit functions already switch vtime
> >> >accounting independent of context tracking. Leave those calls
> >> >where they are, instead of moving them into the context tracking
> >> >code.
> >> >
> >> >Signed-off-by: Rik van Riel <riel@redhat.com>
> >> >---
> >> >  include/linux/context_tracking.h       | 6 ++++++
> >> >  include/linux/context_tracking_state.h | 1 +
> >> >  include/linux/kvm_host.h               | 3 ++-
> >> >  3 files changed, 9 insertions(+), 1 deletion(-)
> >> >
> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >> >index 954253283709..b65fd1420e53 100644
> >> >--- a/include/linux/context_tracking.h
> >> >+++ b/include/linux/context_tracking.h
> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
> >> >             vtime_guest_enter(current);
> >> >     else
> >> >             current->flags |= PF_VCPU;
> >> >+
> >> >+    if (context_tracking_is_enabled())
> >> >+            context_tracking_enter(IN_GUEST);
> >>
> >> Why the if statement?
> >>
> >> Also, have you checked how much this hurts guest lightweight
> >> entry/exit latency?  Context tracking is shockingly expensive for
> >> reasons I don't fully understand, but hopefully most of it is the
> >> vtime stuff.  (Context tracking is *so* expensive that I almost
> >> think we should set the performance taint flag if we enable it,
> >> assuming that flag ended up getting merged.  Also, we should make
> >> context tracking faster.)
> >
> > It turns out that context_tracking_is_enabled() is a static inline
> > that uses a static_key, so the overhead should be minimal on platforms
> > having a full implementation of static keys.
> 
> Shouldn't we just fold that into context_tracking_xyz_enter?

If I am not getting too confused, Rik did that initially, but it caused
some pain for the ARM guys.  I don't see a performance downside, at
least not for a modern compiler that does a decent job of inlining.

> Also, why does the vtime stuff depend on RCU extended quiescent
> states?  To me, they seem mostly orthogonal other than the fact that
> they hook into the same places.

I might be missing your point, but...

If there are no scheduling-clock interrupts, then the CPU needs to be
in an extended quiescent state, otherwise you will get RCU CPU stall
warnings and eventually OOM.  Similarly, if there are no scheduling-clock
interupts, then you need to compute the vtime stuff based on start times
and deltas instead of relying on a scheduling-clock interrupt that never
comes.  So it isn't that the vtime and RCU stuff are directly related,
but rather that they both must take evasive action if there are to be
no scheduling-clock interrupts for an extended time period.

Therefore, they really need to key off of the same conditions.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 20:42         ` Paul E. McKenney
@ 2015-02-10 21:00           ` Andy Lutomirski
  2015-02-10 21:17             ` Paul E. McKenney
  0 siblings, 1 reply; 19+ messages in thread
From: Andy Lutomirski @ 2015-02-10 21:00 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Rik van Riel, Will Deacon, linux-kernel, Catalin Marinas,
	Frédéric Weisbecker, kvm list, Marcelo Tosatti,
	Christian Borntraeger, Ingo Molnar, Oleg Nesterov,
	Luiz Capitulino, Paolo Bonzini

On Tue, Feb 10, 2015 at 12:42 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
>> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
>> <paulmck@linux.vnet.ibm.com> wrote:
>> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
>> >> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
>> >> >From: Rik van Riel <riel@redhat.com>
>> >> >
>> >> >The host kernel is not doing anything while the CPU is executing
>> >> >a KVM guest VCPU, so it can be marked as being in an extended
>> >> >quiescent state, identical to that used when running user space
>> >> >code.
>> >> >
>> >> >The only exception to that rule is when the host handles an
>> >> >interrupt, which is already handled by the irq code, which
>> >> >calls rcu_irq_enter and rcu_irq_exit.
>> >> >
>> >> >The guest_enter and guest_exit functions already switch vtime
>> >> >accounting independent of context tracking. Leave those calls
>> >> >where they are, instead of moving them into the context tracking
>> >> >code.
>> >> >
>> >> >Signed-off-by: Rik van Riel <riel@redhat.com>
>> >> >---
>> >> >  include/linux/context_tracking.h       | 6 ++++++
>> >> >  include/linux/context_tracking_state.h | 1 +
>> >> >  include/linux/kvm_host.h               | 3 ++-
>> >> >  3 files changed, 9 insertions(+), 1 deletion(-)
>> >> >
>> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
>> >> >index 954253283709..b65fd1420e53 100644
>> >> >--- a/include/linux/context_tracking.h
>> >> >+++ b/include/linux/context_tracking.h
>> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
>> >> >             vtime_guest_enter(current);
>> >> >     else
>> >> >             current->flags |= PF_VCPU;
>> >> >+
>> >> >+    if (context_tracking_is_enabled())
>> >> >+            context_tracking_enter(IN_GUEST);
>> >>
>> >> Why the if statement?
>> >>
>> >> Also, have you checked how much this hurts guest lightweight
>> >> entry/exit latency?  Context tracking is shockingly expensive for
>> >> reasons I don't fully understand, but hopefully most of it is the
>> >> vtime stuff.  (Context tracking is *so* expensive that I almost
>> >> think we should set the performance taint flag if we enable it,
>> >> assuming that flag ended up getting merged.  Also, we should make
>> >> context tracking faster.)
>> >
>> > It turns out that context_tracking_is_enabled() is a static inline
>> > that uses a static_key, so the overhead should be minimal on platforms
>> > having a full implementation of static keys.
>>
>> Shouldn't we just fold that into context_tracking_xyz_enter?
>
> If I am not getting too confused, Rik did that initially, but it caused
> some pain for the ARM guys.  I don't see a performance downside, at
> least not for a modern compiler that does a decent job of inlining.

It's more of a tidiness issue to me than a performance issue.

>
>> Also, why does the vtime stuff depend on RCU extended quiescent
>> states?  To me, they seem mostly orthogonal other than the fact that
>> they hook into the same places.
>
> I might be missing your point, but...
>
> If there are no scheduling-clock interrupts, then the CPU needs to be
> in an extended quiescent state, otherwise you will get RCU CPU stall
> warnings and eventually OOM.  Similarly, if there are no scheduling-clock
> interupts, then you need to compute the vtime stuff based on start times
> and deltas instead of relying on a scheduling-clock interrupt that never
> comes.  So it isn't that the vtime and RCU stuff are directly related,
> but rather that they both must take evasive action if there are to be
> no scheduling-clock interrupts for an extended time period.

I'm probably missing something, but isn't vtime also used for accurate
CPU time stats?

--Andy

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest
  2015-02-10 21:00           ` Andy Lutomirski
@ 2015-02-10 21:17             ` Paul E. McKenney
  0 siblings, 0 replies; 19+ messages in thread
From: Paul E. McKenney @ 2015-02-10 21:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Rik van Riel, Will Deacon, linux-kernel, Catalin Marinas,
	Frédéric Weisbecker, kvm list, Marcelo Tosatti,
	Christian Borntraeger, Ingo Molnar, Oleg Nesterov,
	Luiz Capitulino, Paolo Bonzini

On Tue, Feb 10, 2015 at 01:00:35PM -0800, Andy Lutomirski wrote:
> On Tue, Feb 10, 2015 at 12:42 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Tue, Feb 10, 2015 at 12:19:28PM -0800, Andy Lutomirski wrote:
> >> On Tue, Feb 10, 2015 at 12:14 PM, Paul E. McKenney
> >> <paulmck@linux.vnet.ibm.com> wrote:
> >> > On Tue, Feb 10, 2015 at 11:59:09AM -0800, Andy Lutomirski wrote:
> >> >> On 02/10/2015 06:41 AM, riel@redhat.com wrote:
> >> >> >From: Rik van Riel <riel@redhat.com>
> >> >> >
> >> >> >The host kernel is not doing anything while the CPU is executing
> >> >> >a KVM guest VCPU, so it can be marked as being in an extended
> >> >> >quiescent state, identical to that used when running user space
> >> >> >code.
> >> >> >
> >> >> >The only exception to that rule is when the host handles an
> >> >> >interrupt, which is already handled by the irq code, which
> >> >> >calls rcu_irq_enter and rcu_irq_exit.
> >> >> >
> >> >> >The guest_enter and guest_exit functions already switch vtime
> >> >> >accounting independent of context tracking. Leave those calls
> >> >> >where they are, instead of moving them into the context tracking
> >> >> >code.
> >> >> >
> >> >> >Signed-off-by: Rik van Riel <riel@redhat.com>
> >> >> >---
> >> >> >  include/linux/context_tracking.h       | 6 ++++++
> >> >> >  include/linux/context_tracking_state.h | 1 +
> >> >> >  include/linux/kvm_host.h               | 3 ++-
> >> >> >  3 files changed, 9 insertions(+), 1 deletion(-)
> >> >> >
> >> >> >diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
> >> >> >index 954253283709..b65fd1420e53 100644
> >> >> >--- a/include/linux/context_tracking.h
> >> >> >+++ b/include/linux/context_tracking.h
> >> >> >@@ -80,10 +80,16 @@ static inline void guest_enter(void)
> >> >> >             vtime_guest_enter(current);
> >> >> >     else
> >> >> >             current->flags |= PF_VCPU;
> >> >> >+
> >> >> >+    if (context_tracking_is_enabled())
> >> >> >+            context_tracking_enter(IN_GUEST);
> >> >>
> >> >> Why the if statement?
> >> >>
> >> >> Also, have you checked how much this hurts guest lightweight
> >> >> entry/exit latency?  Context tracking is shockingly expensive for
> >> >> reasons I don't fully understand, but hopefully most of it is the
> >> >> vtime stuff.  (Context tracking is *so* expensive that I almost
> >> >> think we should set the performance taint flag if we enable it,
> >> >> assuming that flag ended up getting merged.  Also, we should make
> >> >> context tracking faster.)
> >> >
> >> > It turns out that context_tracking_is_enabled() is a static inline
> >> > that uses a static_key, so the overhead should be minimal on platforms
> >> > having a full implementation of static keys.
> >>
> >> Shouldn't we just fold that into context_tracking_xyz_enter?
> >
> > If I am not getting too confused, Rik did that initially, but it caused
> > some pain for the ARM guys.  I don't see a performance downside, at
> > least not for a modern compiler that does a decent job of inlining.
> 
> It's more of a tidiness issue to me than a performance issue.

I feel that the current patch does a good job of optimizing global tidiness.

> >> Also, why does the vtime stuff depend on RCU extended quiescent
> >> states?  To me, they seem mostly orthogonal other than the fact that
> >> they hook into the same places.
> >
> > I might be missing your point, but...
> >
> > If there are no scheduling-clock interrupts, then the CPU needs to be
> > in an extended quiescent state, otherwise you will get RCU CPU stall
> > warnings and eventually OOM.  Similarly, if there are no scheduling-clock
> > interupts, then you need to compute the vtime stuff based on start times
> > and deltas instead of relying on a scheduling-clock interrupt that never
> > comes.  So it isn't that the vtime and RCU stuff are directly related,
> > but rather that they both must take evasive action if there are to be
> > no scheduling-clock interrupts for an extended time period.
> 
> I'm probably missing something, but isn't vtime also used for accurate
> CPU time stats?

Right.

In my previous email, I only talked about what happens if there is no
scheduling-clock interrupt, and your question is instead about what
can happen if the scheduling-clock interrupt is enabled.  The accurate
CPU time stats are optional if you leave the scheduling clock on during
userspace execution, but become mandatory in the nohz_full case where
the scheduling clock is disabled across userspace execution.  So the
accurate CPU time stats are mandatory in the same situation where you
have an RCU extended quiescent state.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-02-10 21:17 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-10 14:41 [PATCH -v4 0/6] rcu,nohz,kvm: use RCU extended quiescent state when running KVM guest riel
2015-02-10 14:41 ` [PATCH 1/6] rcu,nohz: add context_tracking_user_enter/exit wrapper functions riel
2015-02-10 15:28   ` Frederic Weisbecker
2015-02-10 16:48     ` Rik van Riel
2015-02-10 17:25       ` Paul E. McKenney
2015-02-10 17:36         ` Frederic Weisbecker
2015-02-10 17:49           ` Paul E. McKenney
2015-02-10 14:41 ` [PATCH 2/6] rcu,nohz: add state parameter to context_tracking_enter/exit riel
2015-02-10 14:41 ` [PATCH 3/6] nohz: add stub context_tracking_is_enabled riel
2015-02-10 14:41 ` [PATCH 4/6] rcu,nohz: run vtime_user_enter/exit only when state == IN_USER riel
2015-02-10 14:41 ` [PATCH 5/6] nohz,kvm: export context_tracking_user_enter/exit riel
2015-02-10 14:41 ` [PATCH 6/6] kvm,rcu,nohz: use RCU extended quiescent state when running KVM guest riel
2015-02-10 19:59   ` Andy Lutomirski
2015-02-10 20:13     ` Rik van Riel
2015-02-10 20:14     ` Paul E. McKenney
2015-02-10 20:19       ` Andy Lutomirski
2015-02-10 20:42         ` Paul E. McKenney
2015-02-10 21:00           ` Andy Lutomirski
2015-02-10 21:17             ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).