* [PATCH 00/21] nohz patches for 3.12 preview v2
@ 2013-07-26 23:42 Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 01/21] sched: Consolidate open coded preemptible() checks Frederic Weisbecker
                   ` (20 more replies)
  0 siblings, 21 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman, Martin Schwidefsky,
	Heiko Carstens, Alex Shi, Paul Turner, Vincent Guittot

Hi,

This is a respin of the series that optimizes the full dynticks off-case with
static keys. Some distros appear to be interested in full dynticks, so we
need to optimize the off-case such that unconcerned users are not impacted
by performance regressions.

Thanks to Steve for his reviews on the previous version! I hope the
changelogs and comments are better in this version.

---

Changes since last posting:

* Rebase against 3.11-rc2

* Dropped because merged in -tip through urgent queue: 
	nohz: Do not warn about unstable tsc unless user uses nohz_full
	nohz: fix compile warning in tick_nohz_init()

Reported by Steve:

* Fix confusing comments in [03/21]
* Fix confusing changelog [05/21]
* Split [05/21] with new patch to enhance CONFIG_CONTEXT_TRACKING_FORCE
  Kconfig help text, see [06/21]

Bugfixes:

* Fix missing exported symbol; clarify changes by separating the guest APIs
optimization into a separate patch [09/21]

Further:

* Use static keys on full dynticks APIs [19-21/21]


You can fetch from:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	timers/nohz-3.12-preview-v2

Thanks,
	Frederic
---

Frederic Weisbecker (21):
      sched: Consolidate open coded preemptible() checks
      context_tracking: Fix guest accounting with native vtime
      vtime: Update a few comments
      context_tracking: Fix runtime CPU off-case
      nohz: Only enable context tracking on full dynticks CPUs
      context_tracking: Remove full dynticks' hacky dependency on wide context tracking
      context_tracking: Ground setup for static key use
      context_tracking: Optimize main APIs off case with static key
      context_tracking: Optimize guest APIs off case with static key
      context_tracking: Optimize context switch off case with static keys
      context_tracking: User/kernel boundary cross trace events
      vtime: Remove a few unneeded generic vtime state checks
      vtime: Fix racy cputime delta update
      context_tracking: Split low level state headers
      vtime: Describe overridden functions in dedicated arch headers
      vtime: Optimize full dynticks accounting off case with static keys
      vtime: Always scale generic vtime accounting results
      vtime: Always debug check snapshot source _before_ updating it
      nohz: Rename a few state variables
      nohz: Optimize full dynticks state checks with static keys
      nohz: Optimize full dynticks's sched hooks with static keys


 arch/ia64/include/asm/Kbuild            |    1 +
 arch/powerpc/include/asm/Kbuild         |    1 +
 arch/s390/include/asm/cputime.h         |    3 -
 arch/s390/include/asm/vtime.h           |    7 ++
 arch/s390/kernel/vtime.c                |    1 +
 include/linux/context_tracking.h        |  120 +++++++++++++++--------------
 include/linux/context_tracking_state.h  |   39 +++++++++
 include/linux/tick.h                    |   45 +++++++++--
 include/linux/vtime.h                   |   74 ++++++++++++++++--
 include/trace/events/context_tracking.h |   58 ++++++++++++++
 init/Kconfig                            |   28 +++++--
 kernel/context_tracking.c               |  128 ++++++++++++++++++-------------
 kernel/sched/core.c                     |    4 +-
 kernel/sched/cputime.c                  |   53 ++++---------
 kernel/time/Kconfig                     |    1 -
 kernel/time/tick-sched.c                |   56 ++++++--------
 16 files changed, 410 insertions(+), 209 deletions(-)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 01/21] sched: Consolidate open coded preemptible() checks
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 02/21] context_tracking: Fix guest accounting with native vtime Frederic Weisbecker
                   ` (19 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Ingo Molnar, Li Zhong, Paul E. McKenney,
	Peter Zijlstra, Steven Rostedt, Thomas Gleixner, Borislav Petkov,
	Alex Shi, Paul Turner, Mike Galbraith, Vincent Guittot

preempt_schedule() and preempt_schedule_context() open
code their preemptability checks.

Use the standard API instead for consolidation.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Alex Shi <alex.shi@intel.com>
Cc: Paul Turner <pjt@google.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
---
 kernel/context_tracking.c |    3 +--
 kernel/sched/core.c       |    4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 383f823..942835c 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -87,10 +87,9 @@ void user_enter(void)
  */
 void __sched notrace preempt_schedule_context(void)
 {
-	struct thread_info *ti = current_thread_info();
 	enum ctx_state prev_ctx;
 
-	if (likely(ti->preempt_count || irqs_disabled()))
+	if (likely(!preemptible()))
 		return;
 
 	/*
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b7c32cb..3fb7ace 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2510,13 +2510,11 @@ void __sched schedule_preempt_disabled(void)
  */
 asmlinkage void __sched notrace preempt_schedule(void)
 {
-	struct thread_info *ti = current_thread_info();
-
 	/*
 	 * If there is a non-zero preempt_count or interrupts are disabled,
 	 * we do not want to preempt the current task. Just return..
 	 */
-	if (likely(ti->preempt_count || irqs_disabled()))
+	if (likely(!preemptible()))
 		return;
 
 	do {
-- 
1.7.5.4



* [PATCH 02/21] context_tracking: Fix guest accounting with native vtime
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 01/21] sched: Consolidate open coded preemptible() checks Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 03/21] vtime: Update a few comments Frederic Weisbecker
                   ` (18 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

1) If context tracking is enabled with native vtime accounting (a
combination that is useless except for dev testing), we call
vtime_guest_enter() and vtime_guest_exit() on host <-> guest switches.
But those are stubs in this configuration. As a result, cputime is not
correctly flushed on kvm context switches.

2) If context tracking runs but is disabled on some CPUs, those
CPUs end up calling __guest_enter/__guest_exit, which in turn
call vtime_account_system(). We don't want to call this because we
use tick-based accounting for these CPUs.

Refactor the guest_enter/guest_exit code such that all combinations
finally work.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |   52 ++++++++++++++++----------------------
 kernel/context_tracking.c        |    6 +++-
 2 files changed, 26 insertions(+), 32 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index fc09d7b..5984f25 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -20,25 +20,6 @@ struct context_tracking {
 	} state;
 };
 
-static inline void __guest_enter(void)
-{
-	/*
-	 * This is running in ioctl context so we can avoid
-	 * the call to vtime_account() with its unnecessary idle check.
-	 */
-	vtime_account_system(current);
-	current->flags |= PF_VCPU;
-}
-
-static inline void __guest_exit(void)
-{
-	/*
-	 * This is running in ioctl context so we can avoid
-	 * the call to vtime_account() with its unnecessary idle check.
-	 */
-	vtime_account_system(current);
-	current->flags &= ~PF_VCPU;
-}
 
 #ifdef CONFIG_CONTEXT_TRACKING
 DECLARE_PER_CPU(struct context_tracking, context_tracking);
@@ -56,9 +37,6 @@ static inline bool context_tracking_active(void)
 extern void user_enter(void);
 extern void user_exit(void);
 
-extern void guest_enter(void);
-extern void guest_exit(void);
-
 static inline enum ctx_state exception_enter(void)
 {
 	enum ctx_state prev_ctx;
@@ -81,21 +59,35 @@ extern void context_tracking_task_switch(struct task_struct *prev,
 static inline bool context_tracking_in_user(void) { return false; }
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
+static inline enum ctx_state exception_enter(void) { return 0; }
+static inline void exception_exit(enum ctx_state prev_ctx) { }
+static inline void context_tracking_task_switch(struct task_struct *prev,
+						struct task_struct *next) { }
+#endif /* !CONFIG_CONTEXT_TRACKING */
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+extern void guest_enter(void);
+extern void guest_exit(void);
+#else
 static inline void guest_enter(void)
 {
-	__guest_enter();
+	/*
+	 * This is running in ioctl context so we can avoid
+	 * the call to vtime_account() with its unnecessary idle check.
+	 */
+	vtime_account_system(current);
+	current->flags |= PF_VCPU;
 }
 
 static inline void guest_exit(void)
 {
-	__guest_exit();
+	/*
+	 * This is running in ioctl context so we can avoid
+	 * the call to vtime_account() with its unnecessary idle check.
+	 */
+	vtime_account_system(current);
+	current->flags &= ~PF_VCPU;
 }
-
-static inline enum ctx_state exception_enter(void) { return 0; }
-static inline void exception_exit(enum ctx_state prev_ctx) { }
-static inline void context_tracking_task_switch(struct task_struct *prev,
-						struct task_struct *next) { }
-#endif /* !CONFIG_CONTEXT_TRACKING */
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
 
 #endif
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 942835c..1f47119 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -141,12 +141,13 @@ void user_exit(void)
 	local_irq_restore(flags);
 }
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 void guest_enter(void)
 {
 	if (vtime_accounting_enabled())
 		vtime_guest_enter(current);
 	else
-		__guest_enter();
+		current->flags |= PF_VCPU;
 }
 EXPORT_SYMBOL_GPL(guest_enter);
 
@@ -155,9 +156,10 @@ void guest_exit(void)
 	if (vtime_accounting_enabled())
 		vtime_guest_exit(current);
 	else
-		__guest_exit();
+		current->flags &= ~PF_VCPU;
 }
 EXPORT_SYMBOL_GPL(guest_exit);
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
 
 
 /**
-- 
1.7.5.4



* [PATCH 03/21] vtime: Update a few comments
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 01/21] sched: Consolidate open coded preemptible() checks Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 02/21] context_tracking: Fix guest accounting with native vtime Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 04/21] context_tracking: Fix runtime CPU off-case Frederic Weisbecker
                   ` (17 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Update a stale comment from the old vtime era and document some
locking that might be non-obvious.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |   10 ++++------
 kernel/sched/cputime.c           |    7 +++++++
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 5984f25..d883ff0 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -72,8 +72,9 @@ extern void guest_exit(void);
 static inline void guest_enter(void)
 {
 	/*
-	 * This is running in ioctl context so we can avoid
-	 * the call to vtime_account() with its unnecessary idle check.
+	 * This is running in ioctl context so it's safe
+	 * to assume that it's the stime pending cputime
+	 * to flush.
 	 */
 	vtime_account_system(current);
 	current->flags |= PF_VCPU;
@@ -81,10 +82,7 @@ static inline void guest_enter(void)
 
 static inline void guest_exit(void)
 {
-	/*
-	 * This is running in ioctl context so we can avoid
-	 * the call to vtime_account() with its unnecessary idle check.
-	 */
+	/* Flush the guest cputime we spent on the guest */
 	vtime_account_system(current);
 	current->flags &= ~PF_VCPU;
 }
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index a7959e0..223a35e 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -712,6 +712,13 @@ void vtime_user_enter(struct task_struct *tsk)
 
 void vtime_guest_enter(struct task_struct *tsk)
 {
+	/*
+	 * The flags must be updated under the lock with
+	 * the vtime_snap flush and update.
+	 * That enforces a right ordering and update sequence
+	 * synchronization against the reader (task_gtime())
+	 * that can thus safely catch up with a tickless delta.
+	 */
 	write_seqlock(&tsk->vtime_seqlock);
 	__vtime_account_system(tsk);
 	current->flags |= PF_VCPU;
-- 
1.7.5.4



* [PATCH 04/21] context_tracking: Fix runtime CPU off-case
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 03/21] vtime: Update a few comments Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 05/21] nohz: Only enable context tracking on full dynticks CPUs Frederic Weisbecker
                   ` (16 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

As long as context tracking is enabled on any CPU, even
a single one, all other CPUs need to keep track of their
user <-> kernel boundary crossings as well.

This is because a task can sleep while servicing an exception
that happened in the kernel or in userspace. When the task
eventually wakes up and returns from the exception, the CPU needs
to know whether we resume in userspace or in the kernel. exception_exit()
gets this information from exception_enter(), which saved the previous
state.

If the CPU where the exception happened didn't keep track of
this information, exception_exit() doesn't know which state
tracking to restore on the CPU where the task got migrated,
and we may return to userspace with the context tracking
subsystem thinking that we are in kernel mode.

This can be fixed in the long term if we move our context tracking
probes to the very low level arch fast path user <-> kernel boundary,
although even that is worrisome, as an exception can still happen
in the few instructions between the probe and the actual iret.

Also we are not yet ready to set these probes in the fast path,
given the potential overhead problem it induces.

So let's fix this by always enabling context tracking, even on CPUs
that are not in the full dynticks range. OTOH we can spare the
rcu_user_*() and vtime_user_*() calls there because the tick runs
on these CPUs and we can handle the RCU state machine and cputime
accounting through it.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 kernel/context_tracking.c |   52 ++++++++++++++++++++++++++++----------------
 1 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 1f47119..7b095de 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -54,17 +54,31 @@ void user_enter(void)
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
-	if (__this_cpu_read(context_tracking.active) &&
-	    __this_cpu_read(context_tracking.state) != IN_USER) {
+	if ( __this_cpu_read(context_tracking.state) != IN_USER) {
+		if (__this_cpu_read(context_tracking.active)) {
+			/*
+			 * At this stage, only low level arch entry code remains and
+			 * then we'll run in userspace. We can assume there won't be
+			 * any RCU read-side critical section until the next call to
+			 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
+			 * on the tick.
+			 */
+			vtime_user_enter(current);
+			rcu_user_enter();
+		}
 		/*
-		 * At this stage, only low level arch entry code remains and
-		 * then we'll run in userspace. We can assume there won't be
-		 * any RCU read-side critical section until the next call to
-		 * user_exit() or rcu_irq_enter(). Let's remove RCU's dependency
-		 * on the tick.
+		 * Even if context tracking is disabled on this CPU, because it's outside
+		 * the full dynticks mask for example, we still have to keep track of the
+		 * context transitions and states to prevent inconsistency on those of
+		 * other CPUs.
+		 * If a task triggers an exception in userspace, sleeps in the exception
+		 * handler and then migrates to another CPU, that new CPU must know where
+		 * the exception returns by the time we call exception_exit().
+		 * This information can only be provided by the previous CPU when it called
+		 * exception_enter().
+		 * OTOH we can spare the calls to vtime and RCU when context_tracking.active
+		 * is false because we know that CPU is not tickless.
 		 */
-		vtime_user_enter(current);
-		rcu_user_enter();
 		__this_cpu_write(context_tracking.state, IN_USER);
 	}
 	local_irq_restore(flags);
@@ -130,12 +144,14 @@ void user_exit(void)
 
 	local_irq_save(flags);
 	if (__this_cpu_read(context_tracking.state) == IN_USER) {
-		/*
-		 * We are going to run code that may use RCU. Inform
-		 * RCU core about that (ie: we may need the tick again).
-		 */
-		rcu_user_exit();
-		vtime_user_exit(current);
+		if (__this_cpu_read(context_tracking.active)) {
+			/*
+			 * We are going to run code that may use RCU. Inform
+			 * RCU core about that (ie: we may need the tick again).
+			 */
+			rcu_user_exit();
+			vtime_user_exit(current);
+		}
 		__this_cpu_write(context_tracking.state, IN_KERNEL);
 	}
 	local_irq_restore(flags);
@@ -178,8 +194,6 @@ EXPORT_SYMBOL_GPL(guest_exit);
 void context_tracking_task_switch(struct task_struct *prev,
 			     struct task_struct *next)
 {
-	if (__this_cpu_read(context_tracking.active)) {
-		clear_tsk_thread_flag(prev, TIF_NOHZ);
-		set_tsk_thread_flag(next, TIF_NOHZ);
-	}
+	clear_tsk_thread_flag(prev, TIF_NOHZ);
+	set_tsk_thread_flag(next, TIF_NOHZ);
 }
-- 
1.7.5.4



* [PATCH 05/21] nohz: Only enable context tracking on full dynticks CPUs
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (3 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 04/21] context_tracking: Fix runtime CPU off-case Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 06/21] context_tracking: Remove full dynticks' hacky dependency on wide context tracking Frederic Weisbecker
                   ` (15 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

The context tracking subsystem has the ability to selectively
enable tracking on any defined subset of CPUs. This means that
we can define a CPU range that doesn't run the context tracking
and another range that does.

Now what we want in practice is to enable the tracking on full
dynticks CPUs only. In order to do this, we just need to pass
our full dynticks CPU range selection from the full dynticks
subsystem to the context tracking.

This way we can spare the overhead of RCU user extended quiescent
states and vtime maintenance on the CPUs that are outside the
full dynticks range. Just keep in mind that the raw context tracking
itself is still necessary everywhere.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |    2 ++
 kernel/context_tracking.c        |    5 +++++
 kernel/time/tick-sched.c         |    4 ++++
 3 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index d883ff0..1ae37c7 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -34,6 +34,8 @@ static inline bool context_tracking_active(void)
 	return __this_cpu_read(context_tracking.active);
 }
 
+extern void context_tracking_cpu_set(int cpu);
+
 extern void user_enter(void);
 extern void user_exit(void);
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 7b095de..72bcb25 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -26,6 +26,11 @@ DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
 #endif
 };
 
+void context_tracking_cpu_set(int cpu)
+{
+	per_cpu(context_tracking.active, cpu) = true;
+}
+
 /**
  * user_enter - Inform the context tracking that the CPU is going to
  *              enter userspace mode.
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index e80183f..6d604fd 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -23,6 +23,7 @@
 #include <linux/irq_work.h>
 #include <linux/posix-timers.h>
 #include <linux/perf_event.h>
+#include <linux/context_tracking.h>
 
 #include <asm/irq_regs.h>
 
@@ -350,6 +351,9 @@ void __init tick_nohz_init(void)
 			return;
 	}
 
+	for_each_cpu(cpu, nohz_full_mask)
+		context_tracking_cpu_set(cpu);
+
 	cpu_notifier(tick_nohz_cpu_down_callback, 0);
 	cpulist_scnprintf(nohz_full_buf, sizeof(nohz_full_buf), nohz_full_mask);
 	pr_info("NO_HZ: Full dynticks CPUs: %s.\n", nohz_full_buf);
-- 
1.7.5.4



* [PATCH 06/21] context_tracking: Remove full dynticks' hacky dependency on wide context tracking
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (4 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 05/21] nohz: Only enable context tracking on full dynticks CPUs Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 07/21] context_tracking: Ground setup for static key use Frederic Weisbecker
                   ` (14 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Now that the full dynticks subsystem only enables the context tracking
on full dynticks CPUs, let's remove the dependency on CONTEXT_TRACKING_FORCE.

This dependency was a hack to enable the context tracking widely for the
full dynticks subsystem until the latter became able to enable it in a
more CPU-finegrained fashion.

Now CONTEXT_TRACKING_FORCE only stands for testing on archs that are
working on context tracking support while full dynticks can't be
used yet due to unmet dependencies. It simulates a system where all CPUs
are full dynticks so that RCU user extended quiescent states and dynticks
cputime accounting can be tested on the given arch.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 init/Kconfig        |   28 ++++++++++++++++++++++------
 kernel/time/Kconfig |    1 -
 2 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/init/Kconfig b/init/Kconfig
index 247084b..ffbf5d7 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -527,13 +527,29 @@ config RCU_USER_QS
 config CONTEXT_TRACKING_FORCE
 	bool "Force context tracking"
 	depends on CONTEXT_TRACKING
-	default CONTEXT_TRACKING
+	default y if !NO_HZ_FULL
 	help
-	  Probe on user/kernel boundaries by default in order to
-	  test the features that rely on it such as userspace RCU extended
-	  quiescent states.
-	  This test is there for debugging until we have a real user like the
-	  full dynticks mode.
+	  The major prerequisite for full dynticks to work is to
+	  support the context tracking subsystem. But there are also
+	  other dependencies to provide in order to make the full
+	  dynticks feature work.
+
+	  This option stands for testing when an arch implements the
+	  context tracking backend but doesn't yet fulfill all the
+	  requirements to make the full dynticks feature work.
+	  Without the full dynticks, there is no way to test the support
+	  for context tracking and the subsystems that rely on it: RCU
+	  userspace extended quiescent state and tickless cputime
+	  accounting. This option copes with the absence of the full
+	  dynticks subsystem by forcing the context tracking on all
+	  CPUs in the system.
+
+	  Say Y only if you're working on the development of an
+	  architecture backend for the context tracking.
+
+	  Say N otherwise, this option brings an overhead that you
+	  don't want in production.
+
 
 config RCU_FANOUT
 	int "Tree-based hierarchical RCU fanout value"
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index 70f27e8..747bbc7 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -105,7 +105,6 @@ config NO_HZ_FULL
 	select RCU_USER_QS
 	select RCU_NOCB_CPU
 	select VIRT_CPU_ACCOUNTING_GEN
-	select CONTEXT_TRACKING_FORCE
 	select IRQ_WORK
 	help
 	 Adaptively try to shutdown the tick whenever possible, even when
-- 
1.7.5.4



* [PATCH 07/21] context_tracking: Ground setup for static key use
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (5 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 06/21] context_tracking: Remove full dynticks' hacky dependency on wide context tracking Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 08/21] context_tracking: Optimize main APIs off case with static key Frederic Weisbecker
                   ` (13 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Prepare for using a static key in the context tracking subsystem.
This will help optimize the off-case for its many users:

* user_enter, user_exit, exception_enter, exception_exit, guest_enter,
  guest_exit, vtime_*()

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |    2 ++
 kernel/context_tracking.c        |   26 ++++++++++++++++++++------
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 1ae37c7..f9356eb 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -4,6 +4,7 @@
 #include <linux/sched.h>
 #include <linux/percpu.h>
 #include <linux/vtime.h>
+#include <linux/static_key.h>
 #include <asm/ptrace.h>
 
 struct context_tracking {
@@ -22,6 +23,7 @@ struct context_tracking {
 
 
 #ifdef CONFIG_CONTEXT_TRACKING
+extern struct static_key context_tracking_enabled;
 DECLARE_PER_CPU(struct context_tracking, context_tracking);
 
 static inline bool context_tracking_in_user(void)
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 72bcb25..f07505c 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -20,15 +20,16 @@
 #include <linux/hardirq.h>
 #include <linux/export.h>
 
-DEFINE_PER_CPU(struct context_tracking, context_tracking) = {
-#ifdef CONFIG_CONTEXT_TRACKING_FORCE
-	.active = true,
-#endif
-};
+struct static_key context_tracking_enabled = STATIC_KEY_INIT_FALSE;
+
+DEFINE_PER_CPU(struct context_tracking, context_tracking);
 
 void context_tracking_cpu_set(int cpu)
 {
-	per_cpu(context_tracking.active, cpu) = true;
+	if (!per_cpu(context_tracking.active, cpu)) {
+		per_cpu(context_tracking.active, cpu) = true;
+		static_key_slow_inc(&context_tracking_enabled);
+	}
 }
 
 /**
@@ -202,3 +203,16 @@ void context_tracking_task_switch(struct task_struct *prev,
 	clear_tsk_thread_flag(prev, TIF_NOHZ);
 	set_tsk_thread_flag(next, TIF_NOHZ);
 }
+
+#ifdef CONFIG_CONTEXT_TRACKING_FORCE
+static int __init context_tracking_init(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu)
+		context_tracking_cpu_set(cpu);
+
+	return 0;
+}
+early_initcall(context_tracking_init);
+#endif
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 08/21] context_tracking: Optimize main APIs off case with static key
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (6 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 07/21] context_tracking: Ground setup for static key use Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 09/21] context_tracking: Optimize guest " Frederic Weisbecker
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Optimize user and exception entry/exit APIs with static
keys. This minimizes the overhead for those who enable
CONFIG_NO_HZ_FULL without always using it. Having no range
passed to nohz_full= should result in the probes being
nopped (at least we hope so...).

If this proves not to be enough in the long term, we'll need
to bring in an exception slow path by re-routing the exception
handlers.
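The wrapper pattern can be modeled in plain userspace C. In this sketch an ordinary boolean stands in for the jump-label static key (which, in the kernel, patches the disabled branch out of the instruction stream), and the names simply mirror the kernel functions; none of this is the actual kernel implementation:

```c
#include <assert.h>
#include <stdbool.h>

static bool context_tracking_enabled;	/* static_key stand-in */
static int slow_path_calls;

/* The out-of-line slow paths; in the kernel these do the RCU and
 * vtime bookkeeping. Here they just count invocations. */
static void context_tracking_user_enter(void) { slow_path_calls++; }
static void context_tracking_user_exit(void)  { slow_path_calls++; }

/* Inline fast-path wrappers: when the key is off, nothing beyond
 * the branch (patched out entirely with a real static key) runs. */
static inline void user_enter(void)
{
	if (context_tracking_enabled)
		context_tracking_user_enter();
}

static inline void user_exit(void)
{
	if (context_tracking_enabled)
		context_tracking_user_exit();
}
```

With a real static key, `static_key_false()` compiles to a no-op on the disabled side, so unconcerned users pay essentially nothing on the entry/exit hot paths.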

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |   27 ++++++++++++++++++++++-----
 kernel/context_tracking.c        |   12 ++++++------
 2 files changed, 28 insertions(+), 11 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index f9356eb..e5ec0c9 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -38,23 +38,40 @@ static inline bool context_tracking_active(void)
 
 extern void context_tracking_cpu_set(int cpu);
 
-extern void user_enter(void);
-extern void user_exit(void);
+extern void context_tracking_user_enter(void);
+extern void context_tracking_user_exit(void);
+
+static inline void user_enter(void)
+{
+	if (static_key_false(&context_tracking_enabled))
+		context_tracking_user_enter();
+
+}
+static inline void user_exit(void)
+{
+	if (static_key_false(&context_tracking_enabled))
+		context_tracking_user_exit();
+}
 
 static inline enum ctx_state exception_enter(void)
 {
 	enum ctx_state prev_ctx;
 
+	if (!static_key_false(&context_tracking_enabled))
+		return 0;
+
 	prev_ctx = this_cpu_read(context_tracking.state);
-	user_exit();
+	context_tracking_user_exit();
 
 	return prev_ctx;
 }
 
 static inline void exception_exit(enum ctx_state prev_ctx)
 {
-	if (prev_ctx == IN_USER)
-		user_enter();
+	if (static_key_false(&context_tracking_enabled)) {
+		if (prev_ctx == IN_USER)
+			context_tracking_user_enter();
+	}
 }
 
 extern void context_tracking_task_switch(struct task_struct *prev,
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index f07505c..657f668 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -33,15 +33,15 @@ void context_tracking_cpu_set(int cpu)
 }
 
 /**
- * user_enter - Inform the context tracking that the CPU is going to
- *              enter userspace mode.
+ * context_tracking_user_enter - Inform the context tracking that the CPU is going to
+ *                               enter userspace mode.
  *
  * This function must be called right before we switch from the kernel
  * to userspace, when it's guaranteed the remaining kernel instructions
  * to execute won't use any RCU read side critical section because this
  * function sets RCU in extended quiescent state.
  */
-void user_enter(void)
+void context_tracking_user_enter(void)
 {
 	unsigned long flags;
 
@@ -131,8 +131,8 @@ EXPORT_SYMBOL_GPL(preempt_schedule_context);
 #endif /* CONFIG_PREEMPT */
 
 /**
- * user_exit - Inform the context tracking that the CPU is
- *             exiting userspace mode and entering the kernel.
+ * context_tracking_user_exit - Inform the context tracking that the CPU is
+ *                              exiting userspace mode and entering the kernel.
  *
  * This function must be called after we entered the kernel from userspace
  * before any use of RCU read side critical section. This potentially include
@@ -141,7 +141,7 @@ EXPORT_SYMBOL_GPL(preempt_schedule_context);
  * This call supports re-entrancy. This way it can be called from any exception
  * handler without needing to know if we came from userspace or not.
  */
-void user_exit(void)
+void context_tracking_user_exit(void)
 {
 	unsigned long flags;
 
-- 
1.7.5.4



* [PATCH 09/21] context_tracking: Optimize guest APIs off case with static key
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (7 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 08/21] context_tracking: Optimize main APIs off case with static key Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 10/21] context_tracking: Optimize context switch off case with static keys Frederic Weisbecker
                   ` (11 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Optimize guest entry/exit APIs with static keys. This minimizes
the overhead for those who enable CONFIG_NO_HZ_FULL without
always using it. Having no range passed to nohz_full= should
keep the probes' overhead to a minimum.
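A rough userspace model of the two paths guest_enter()/guest_exit() choose between (everything here is a stand-in: `struct task` for task_struct, `vtime_enabled` for the static key plus vtime state, `vtime_guest_events` for the accounting work; only PF_VCPU's role matches the kernel):

```c
#include <assert.h>
#include <stdbool.h>

#define PF_VCPU 0x00000010	/* "task is running a guest vCPU" flag */

struct task { unsigned int flags; };

static bool vtime_enabled;	/* static key && vtime accounting active */
static int vtime_guest_events;

static void guest_enter(struct task *t)
{
	if (vtime_enabled)
		vtime_guest_events++;	/* full vtime accounting path */
	t->flags |= PF_VCPU;		/* both paths mark the vCPU state */
}

static void guest_exit(struct task *t)
{
	if (vtime_enabled)
		vtime_guest_events++;
	t->flags &= ~PF_VCPU;
}
```

The point of the patch is that with no full dynticks CPU, only the cheap flag update remains.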

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |   19 +++++++++++++++++--
 kernel/context_tracking.c        |   23 ++---------------------
 kernel/sched/cputime.c           |    2 ++
 3 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index e5ec0c9..03a32b0 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -87,8 +87,23 @@ static inline void context_tracking_task_switch(struct task_struct *prev,
 #endif /* !CONFIG_CONTEXT_TRACKING */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-extern void guest_enter(void);
-extern void guest_exit(void);
+static inline void guest_enter(void)
+{
+	if (static_key_false(&context_tracking_enabled) &&
+	    vtime_accounting_enabled())
+		vtime_guest_enter(current);
+	else
+		current->flags |= PF_VCPU;
+}
+
+static inline void guest_exit(void)
+{
+	if (static_key_false(&context_tracking_enabled) &&
+	    vtime_accounting_enabled())
+		vtime_guest_exit(current);
+	else
+		current->flags &= ~PF_VCPU;
+}
 #else
 static inline void guest_enter(void)
 {
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 657f668..5afa36b 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -21,8 +21,10 @@
 #include <linux/export.h>
 
 struct static_key context_tracking_enabled = STATIC_KEY_INIT_FALSE;
+EXPORT_SYMBOL_GPL(context_tracking_enabled);
 
 DEFINE_PER_CPU(struct context_tracking, context_tracking);
+EXPORT_SYMBOL_GPL(context_tracking);
 
 void context_tracking_cpu_set(int cpu)
 {
@@ -163,27 +165,6 @@ void context_tracking_user_exit(void)
 	local_irq_restore(flags);
 }
 
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-void guest_enter(void)
-{
-	if (vtime_accounting_enabled())
-		vtime_guest_enter(current);
-	else
-		current->flags |= PF_VCPU;
-}
-EXPORT_SYMBOL_GPL(guest_enter);
-
-void guest_exit(void)
-{
-	if (vtime_accounting_enabled())
-		vtime_guest_exit(current);
-	else
-		current->flags &= ~PF_VCPU;
-}
-EXPORT_SYMBOL_GPL(guest_exit);
-#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
-
-
 /**
  * context_tracking_task_switch - context switch the syscall callbacks
  * @prev: the task that is being switched out
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 223a35e..bb6b29a 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -724,6 +724,7 @@ void vtime_guest_enter(struct task_struct *tsk)
 	current->flags |= PF_VCPU;
 	write_sequnlock(&tsk->vtime_seqlock);
 }
+EXPORT_SYMBOL_GPL(vtime_guest_enter);
 
 void vtime_guest_exit(struct task_struct *tsk)
 {
@@ -732,6 +733,7 @@ void vtime_guest_exit(struct task_struct *tsk)
 	current->flags &= ~PF_VCPU;
 	write_sequnlock(&tsk->vtime_seqlock);
 }
+EXPORT_SYMBOL_GPL(vtime_guest_exit);
 
 void vtime_account_idle(struct task_struct *tsk)
 {
-- 
1.7.5.4



* [PATCH 10/21] context_tracking: Optimize context switch off case with static keys
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (8 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 09/21] context_tracking: Optimize guest " Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 11/21] context_tracking: User/kernel boundary cross trace events Frederic Weisbecker
                   ` (10 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

There is no need for the syscall slowpath if no CPU is
full dynticks; rather, nop it out in that case.
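The flag handoff that __context_tracking_task_switch() performs can be modeled in a few lines of plain C (the struct and the flag value below are illustrative stand-ins, not the kernel definitions):

```c
#include <assert.h>

#define TIF_NOHZ (1u << 0)	/* illustrative bit, not the real TIF value */

struct task { unsigned int tif_flags; };

/* The syscall slow-path work flag follows the task that is
 * currently running: cleared on the outgoing task, set on the
 * incoming one. The static-key wrapper added by this patch makes
 * the whole thing a no-op when no CPU is full dynticks. */
static void task_switch(struct task *prev, struct task *next)
{
	prev->tif_flags &= ~TIF_NOHZ;
	next->tif_flags |= TIF_NOHZ;
}
```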

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |   11 +++++++++--
 kernel/context_tracking.c        |    6 +++---
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 03a32b0..66a8397 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -40,6 +40,8 @@ extern void context_tracking_cpu_set(int cpu);
 
 extern void context_tracking_user_enter(void);
 extern void context_tracking_user_exit(void);
+extern void __context_tracking_task_switch(struct task_struct *prev,
+					   struct task_struct *next);
 
 static inline void user_enter(void)
 {
@@ -74,8 +76,12 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	}
 }
 
-extern void context_tracking_task_switch(struct task_struct *prev,
-					 struct task_struct *next);
+static inline void context_tracking_task_switch(struct task_struct *prev,
+						struct task_struct *next)
+{
+	if (static_key_false(&context_tracking_enabled))
+		__context_tracking_task_switch(prev, next);
+}
 #else
 static inline bool context_tracking_in_user(void) { return false; }
 static inline void user_enter(void) { }
@@ -104,6 +110,7 @@ static inline void guest_exit(void)
 	else
 		current->flags &= ~PF_VCPU;
 }
+
 #else
 static inline void guest_enter(void)
 {
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 5afa36b..ef21e4f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -166,7 +166,7 @@ void context_tracking_user_exit(void)
 }
 
 /**
- * context_tracking_task_switch - context switch the syscall callbacks
+ * __context_tracking_task_switch - context switch the syscall callbacks
  * @prev: the task that is being switched out
  * @next: the task that is being switched in
  *
@@ -178,8 +178,8 @@ void context_tracking_user_exit(void)
  * migrate to some CPU that doesn't do the context tracking. As such the TIF
  * flag may not be desired there.
  */
-void context_tracking_task_switch(struct task_struct *prev,
-			     struct task_struct *next)
+void __context_tracking_task_switch(struct task_struct *prev,
+				    struct task_struct *next)
 {
 	clear_tsk_thread_flag(prev, TIF_NOHZ);
 	set_tsk_thread_flag(next, TIF_NOHZ);
-- 
1.7.5.4



* [PATCH 11/21] context_tracking: User/kernel boundary cross trace events
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (9 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 10/21] context_tracking: Optimize context switch off case with static keys Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 12/21] vtime: Remove a few unneeded generic vtime state checks Frederic Weisbecker
                   ` (9 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

These trace events can be useful to track all kernel/user round
trips, and they are also helpful to debug the context tracking
subsystem.
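A minimal userspace analogy of what the new tracepoints provide (a plain hook pointer stands in for the tracepoint machinery; the dummy int argument mirrors the fact that the kernel's event-class macros require at least one argument):

```c
#include <assert.h>
#include <stddef.h>

static void (*user_enter_probe)(int);	/* registered consumer, if any */
static int transitions;

static void count_probe(int dummy)
{
	(void)dummy;
	transitions++;	/* e.g. count kernel->user transitions */
}

/* Fired right before resuming userspace. A disconnected
 * tracepoint costs only a NULL check here; in the kernel it costs
 * even less thanks to jump labels. */
static void trace_user_enter(int dummy)
{
	if (user_enter_probe)
		user_enter_probe(dummy);
}
```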

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/trace/events/context_tracking.h |   58 +++++++++++++++++++++++++++++++
 kernel/context_tracking.c               |    5 +++
 2 files changed, 63 insertions(+), 0 deletions(-)
 create mode 100644 include/trace/events/context_tracking.h

diff --git a/include/trace/events/context_tracking.h b/include/trace/events/context_tracking.h
new file mode 100644
index 0000000..ce8007c
--- /dev/null
+++ b/include/trace/events/context_tracking.h
@@ -0,0 +1,58 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM context_tracking
+
+#if !defined(_TRACE_CONTEXT_TRACKING_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_CONTEXT_TRACKING_H
+
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(context_tracking_user,
+
+	TP_PROTO(int dummy),
+
+	TP_ARGS(dummy),
+
+	TP_STRUCT__entry(
+		__field( int,	dummy	)
+	),
+
+	TP_fast_assign(
+		__entry->dummy		= dummy;
+	),
+
+	TP_printk("%s", "")
+);
+
+/**
+ * user_enter - called when the kernel resumes to userspace
+ * @dummy:	dummy arg to make trace event macro happy
+ *
+ * This event occurs when the kernel resumes to userspace  after
+ * an exception or a syscall.
+ */
+DEFINE_EVENT(context_tracking_user, user_enter,
+
+	TP_PROTO(int dummy),
+
+	TP_ARGS(dummy)
+);
+
+/**
+ * user_exit - called when userspace enters the kernel
+ * @dummy:	dummy arg to make trace event macro happy
+ *
+ * This event occurs when userspace enters the kernel through
+ * an exception or a syscall.
+ */
+DEFINE_EVENT(context_tracking_user, user_exit,
+
+	TP_PROTO(int dummy),
+
+	TP_ARGS(dummy)
+);
+
+
+#endif /*  _TRACE_CONTEXT_TRACKING_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index ef21e4f..688efe4 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -20,6 +20,9 @@
 #include <linux/hardirq.h>
 #include <linux/export.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/context_tracking.h>
+
 struct static_key context_tracking_enabled = STATIC_KEY_INIT_FALSE;
 EXPORT_SYMBOL_GPL(context_tracking_enabled);
 
@@ -64,6 +67,7 @@ void context_tracking_user_enter(void)
 	local_irq_save(flags);
 	if ( __this_cpu_read(context_tracking.state) != IN_USER) {
 		if (__this_cpu_read(context_tracking.active)) {
+			trace_user_enter(0);
 			/*
 			 * At this stage, only low level arch entry code remains and
 			 * then we'll run in userspace. We can assume there won't be
@@ -159,6 +163,7 @@ void context_tracking_user_exit(void)
 			 */
 			rcu_user_exit();
 			vtime_user_exit(current);
+			trace_user_exit(0);
 		}
 		__this_cpu_write(context_tracking.state, IN_KERNEL);
 	}
-- 
1.7.5.4



* [PATCH 12/21] vtime: Remove a few unneeded generic vtime state checks
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (10 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 11/21] context_tracking: User/kernel boundary cross trace events Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 13/21] vtime: Fix racy cputime delta update Frederic Weisbecker
                   ` (8 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Some generic vtime APIs check if the vtime accounting
is enabled on the local CPU before doing their work.

Some of these checks are not needed because all their callers
already take care of that. Let's remove them.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 kernel/sched/cputime.c |   13 +------------
 1 files changed, 1 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index bb6b29a..5f273b47 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -664,9 +664,6 @@ static void __vtime_account_system(struct task_struct *tsk)
 
 void vtime_account_system(struct task_struct *tsk)
 {
-	if (!vtime_accounting_enabled())
-		return;
-
 	write_seqlock(&tsk->vtime_seqlock);
 	__vtime_account_system(tsk);
 	write_sequnlock(&tsk->vtime_seqlock);
@@ -686,12 +683,7 @@ void vtime_account_irq_exit(struct task_struct *tsk)
 
 void vtime_account_user(struct task_struct *tsk)
 {
-	cputime_t delta_cpu;
-
-	if (!vtime_accounting_enabled())
-		return;
-
-	delta_cpu = get_vtime_delta(tsk);
+	cputime_t delta_cpu = get_vtime_delta(tsk);
 
 	write_seqlock(&tsk->vtime_seqlock);
 	tsk->vtime_snap_whence = VTIME_SYS;
@@ -701,9 +693,6 @@ void vtime_account_user(struct task_struct *tsk)
 
 void vtime_user_enter(struct task_struct *tsk)
 {
-	if (!vtime_accounting_enabled())
-		return;
-
 	write_seqlock(&tsk->vtime_seqlock);
 	tsk->vtime_snap_whence = VTIME_USER;
 	__vtime_account_system(tsk);
-- 
1.7.5.4



* [PATCH 13/21] vtime: Fix racy cputime delta update
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (11 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 12/21] vtime: Remove a few unneeded generic vtime state checks Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 14/21] context_tracking: Split low level state headers Frederic Weisbecker
                   ` (7 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

get_vtime_delta() must be called under the task vtime_seqlock
with the code that does the cputime accounting flush.

Otherwise the cputime reader can be fooled and run into
a race where it sees the snapshot update but misses the
cputime flush. As a result it can report a cputime that is
way too short.

Fix vtime_account_user(), which wasn't complying with that rule.
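The ordering requirement can be illustrated with a toy, single-threaded model of the seqlock discipline (everything here, including the fake clock, is a stand-in; the point is only that the snapshot update and the cputime flush happen inside one write section, as the fix below arranges):

```c
#include <assert.h>

struct vtime {
	unsigned int seq;	/* seqcount: odd while a writer is active */
	unsigned long snap;	/* timestamp of the last accounting flush */
	unsigned long utime;	/* accumulated user cputime */
};

static unsigned long now;	/* fake clock */

static unsigned long get_vtime_delta(struct vtime *v)
{
	unsigned long delta = now - v->snap;
	v->snap = now;		/* snapshot update... */
	return delta;		/* ...must be flushed under the same seq */
}

/* Fixed vtime_account_user(): the delta is taken inside the write
 * section, so a reader that retries on an odd or changed seq can
 * never observe the new snapshot without the flushed cputime. */
static void vtime_account_user(struct vtime *v)
{
	v->seq++;			/* write_seqlock() */
	v->utime += get_vtime_delta(v);
	v->seq++;			/* write_sequnlock() */
}
```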

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 kernel/sched/cputime.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5f273b47..b62d5c0 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -683,9 +683,10 @@ void vtime_account_irq_exit(struct task_struct *tsk)
 
 void vtime_account_user(struct task_struct *tsk)
 {
-	cputime_t delta_cpu = get_vtime_delta(tsk);
+	cputime_t delta_cpu;
 
 	write_seqlock(&tsk->vtime_seqlock);
+	delta_cpu = get_vtime_delta(tsk);
 	tsk->vtime_snap_whence = VTIME_SYS;
 	account_user_time(tsk, delta_cpu, cputime_to_scaled(delta_cpu));
 	write_sequnlock(&tsk->vtime_seqlock);
-- 
1.7.5.4



* [PATCH 14/21] context_tracking: Split low level state headers
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (12 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 13/21] vtime: Fix racy cputime delta update Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 15/21] vtime: Describe overridden functions in dedicated arch headers Frederic Weisbecker
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

We plan to use the context tracking static key on inline
vtime APIs. For this we need to include the context tracking
headers from those of vtime.

However the vtime headers need to stay low level because they are
included in hardirq.h, which mostly contains standalone
definitions. But context_tracking.h includes sched.h for
a few task_struct references, so it wouldn't be sensible
to include it from vtime.h.

To solve this, let's split the context tracking headers and move
out the pure state definitions that only require a few low level
headers. We can then safely include that small part in vtime.h.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h       |   31 +------------------------
 include/linux/context_tracking_state.h |   39 ++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 30 deletions(-)
 create mode 100644 include/linux/context_tracking_state.h

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 66a8397..8b6eedb 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -2,40 +2,12 @@
 #define _LINUX_CONTEXT_TRACKING_H
 
 #include <linux/sched.h>
-#include <linux/percpu.h>
 #include <linux/vtime.h>
-#include <linux/static_key.h>
+#include <linux/context_tracking_state.h>
 #include <asm/ptrace.h>
 
-struct context_tracking {
-	/*
-	 * When active is false, probes are unset in order
-	 * to minimize overhead: TIF flags are cleared
-	 * and calls to user_enter/exit are ignored. This
-	 * may be further optimized using static keys.
-	 */
-	bool active;
-	enum ctx_state {
-		IN_KERNEL = 0,
-		IN_USER,
-	} state;
-};
-
 
 #ifdef CONFIG_CONTEXT_TRACKING
-extern struct static_key context_tracking_enabled;
-DECLARE_PER_CPU(struct context_tracking, context_tracking);
-
-static inline bool context_tracking_in_user(void)
-{
-	return __this_cpu_read(context_tracking.state) == IN_USER;
-}
-
-static inline bool context_tracking_active(void)
-{
-	return __this_cpu_read(context_tracking.active);
-}
-
 extern void context_tracking_cpu_set(int cpu);
 
 extern void context_tracking_user_enter(void);
@@ -83,7 +55,6 @@ static inline void context_tracking_task_switch(struct task_struct *prev,
 		__context_tracking_task_switch(prev, next);
 }
 #else
-static inline bool context_tracking_in_user(void) { return false; }
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline enum ctx_state exception_enter(void) { return 0; }
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
new file mode 100644
index 0000000..0f1979d
--- /dev/null
+++ b/include/linux/context_tracking_state.h
@@ -0,0 +1,39 @@
+#ifndef _LINUX_CONTEXT_TRACKING_STATE_H
+#define _LINUX_CONTEXT_TRACKING_STATE_H
+
+#include <linux/percpu.h>
+#include <linux/static_key.h>
+
+struct context_tracking {
+	/*
+	 * When active is false, probes are unset in order
+	 * to minimize overhead: TIF flags are cleared
+	 * and calls to user_enter/exit are ignored. This
+	 * may be further optimized using static keys.
+	 */
+	bool active;
+	enum ctx_state {
+		IN_KERNEL = 0,
+		IN_USER,
+	} state;
+};
+
+#ifdef CONFIG_CONTEXT_TRACKING
+extern struct static_key context_tracking_enabled;
+DECLARE_PER_CPU(struct context_tracking, context_tracking);
+
+static inline bool context_tracking_in_user(void)
+{
+	return __this_cpu_read(context_tracking.state) == IN_USER;
+}
+
+static inline bool context_tracking_active(void)
+{
+	return __this_cpu_read(context_tracking.active);
+}
+#else
+static inline bool context_tracking_in_user(void) { return false; }
+static inline bool context_tracking_active(void) { return false; }
+#endif /* CONFIG_CONTEXT_TRACKING */
+
+#endif
-- 
1.7.5.4



* [PATCH 15/21] vtime: Describe overridden functions in dedicated arch headers
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (13 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 14/21] context_tracking: Split low level state headers Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 16/21] vtime: Optimize full dynticks accounting off case with static keys Frederic Weisbecker
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman, Martin Schwidefsky,
	Heiko Carstens

If the arch overrides some generic vtime APIs, let it describe
them in a dedicated standalone header. This way it becomes
convenient to include it from the generic vtime headers without
pulling irrelevant stuff into such a low level header.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
---
 arch/ia64/include/asm/Kbuild    |    1 +
 arch/powerpc/include/asm/Kbuild |    1 +
 arch/s390/include/asm/cputime.h |    3 ---
 arch/s390/include/asm/vtime.h   |    7 +++++++
 arch/s390/kernel/vtime.c        |    1 +
 include/linux/vtime.h           |    4 ++++
 6 files changed, 14 insertions(+), 3 deletions(-)
 create mode 100644 arch/s390/include/asm/vtime.h
 create mode 100644 include/asm-generic/vtime.h

diff --git a/arch/ia64/include/asm/Kbuild b/arch/ia64/include/asm/Kbuild
index 05b03ec..a3456f3 100644
--- a/arch/ia64/include/asm/Kbuild
+++ b/arch/ia64/include/asm/Kbuild
@@ -3,3 +3,4 @@ generic-y += clkdev.h
 generic-y += exec.h
 generic-y += kvm_para.h
 generic-y += trace_clock.h
+generic-y += vtime.h
\ No newline at end of file
diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild
index 650757c..704e6f1 100644
--- a/arch/powerpc/include/asm/Kbuild
+++ b/arch/powerpc/include/asm/Kbuild
@@ -2,3 +2,4 @@
 generic-y += clkdev.h
 generic-y += rwsem.h
 generic-y += trace_clock.h
+generic-y += vtime.h
\ No newline at end of file
diff --git a/arch/s390/include/asm/cputime.h b/arch/s390/include/asm/cputime.h
index d2ff4137..f65bd36 100644
--- a/arch/s390/include/asm/cputime.h
+++ b/arch/s390/include/asm/cputime.h
@@ -13,9 +13,6 @@
 #include <asm/div64.h>
 
 
-#define __ARCH_HAS_VTIME_ACCOUNT
-#define __ARCH_HAS_VTIME_TASK_SWITCH
-
 /* We want to use full resolution of the CPU timer: 2**-12 micro-seconds. */
 
 typedef unsigned long long __nocast cputime_t;
diff --git a/arch/s390/include/asm/vtime.h b/arch/s390/include/asm/vtime.h
new file mode 100644
index 0000000..af9896c
--- /dev/null
+++ b/arch/s390/include/asm/vtime.h
@@ -0,0 +1,7 @@
+#ifndef _S390_VTIME_H
+#define _S390_VTIME_H
+
+#define __ARCH_HAS_VTIME_ACCOUNT
+#define __ARCH_HAS_VTIME_TASK_SWITCH
+
+#endif /* _S390_VTIME_H */
diff --git a/arch/s390/kernel/vtime.c b/arch/s390/kernel/vtime.c
index 9b9c1b7..abcfab5 100644
--- a/arch/s390/kernel/vtime.c
+++ b/arch/s390/kernel/vtime.c
@@ -19,6 +19,7 @@
 #include <asm/irq_regs.h>
 #include <asm/cputime.h>
 #include <asm/vtimer.h>
+#include <asm/vtime.h>
 #include <asm/irq.h>
 #include "entry.h"
 
diff --git a/include/asm-generic/vtime.h b/include/asm-generic/vtime.h
new file mode 100644
index 0000000..e69de29
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index b1dd2db..2ad0739 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -1,6 +1,10 @@
 #ifndef _LINUX_KERNEL_VTIME_H
 #define _LINUX_KERNEL_VTIME_H
 
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+#include <asm/vtime.h>
+#endif
+
 struct task_struct;
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
-- 
1.7.5.4



* [PATCH 16/21] vtime: Optimize full dynticks accounting off case with static keys
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (14 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 15/21] vtime: Describe overridden functions in dedicated arch headers Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 17/21] vtime: Always scale generic vtime accounting results Frederic Weisbecker
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

If no CPU is in the full dynticks range, we can avoid the full
dynticks cputime accounting through generic vtime along with its
overhead and use the traditional tick based accounting instead.

Let's do this and no-op the off case with static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/context_tracking.h |    6 +--
 include/linux/vtime.h            |   70 +++++++++++++++++++++++++++++++++-----
 kernel/sched/cputime.c           |   22 ++----------
 3 files changed, 67 insertions(+), 31 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 8b6eedb..655356a 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -66,8 +66,7 @@ static inline void context_tracking_task_switch(struct task_struct *prev,
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 static inline void guest_enter(void)
 {
-	if (static_key_false(&context_tracking_enabled) &&
-	    vtime_accounting_enabled())
+	if (vtime_accounting_enabled())
 		vtime_guest_enter(current);
 	else
 		current->flags |= PF_VCPU;
@@ -75,8 +74,7 @@ static inline void guest_enter(void)
 
 static inline void guest_exit(void)
 {
-	if (static_key_false(&context_tracking_enabled) &&
-	    vtime_accounting_enabled())
+	if (vtime_accounting_enabled())
 		vtime_guest_exit(current);
 	else
 		current->flags &= ~PF_VCPU;
diff --git a/include/linux/vtime.h b/include/linux/vtime.h
index 2ad0739..f5b72b3 100644
--- a/include/linux/vtime.h
+++ b/include/linux/vtime.h
@@ -1,22 +1,68 @@
 #ifndef _LINUX_KERNEL_VTIME_H
 #define _LINUX_KERNEL_VTIME_H
 
+#include <linux/context_tracking_state.h>
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 #include <asm/vtime.h>
 #endif
 
+
 struct task_struct;
 
+/*
+ * vtime_accounting_enabled() definitions/declarations
+ */
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
+static inline bool vtime_accounting_enabled(void) { return true; }
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING_NATIVE */
+
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+static inline bool vtime_accounting_enabled(void)
+{
+	if (static_key_false(&context_tracking_enabled)) {
+		if (context_tracking_active())
+			return true;
+	}
+
+	return false;
+}
+#endif /* CONFIG_VIRT_CPU_ACCOUNTING_GEN */
+
+#ifndef CONFIG_VIRT_CPU_ACCOUNTING
+static inline bool vtime_accounting_enabled(void) { return false; }
+#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
+
+
+/*
+ * Common vtime APIs
+ */
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
+
+#ifdef __ARCH_HAS_VTIME_TASK_SWITCH
 extern void vtime_task_switch(struct task_struct *prev);
+#else
+extern void vtime_common_task_switch(struct task_struct *prev);
+static inline void vtime_task_switch(struct task_struct *prev)
+{
+	if (vtime_accounting_enabled())
+		vtime_common_task_switch(prev);
+}
+#endif /* __ARCH_HAS_VTIME_TASK_SWITCH */
+
 extern void vtime_account_system(struct task_struct *tsk);
 extern void vtime_account_idle(struct task_struct *tsk);
 extern void vtime_account_user(struct task_struct *tsk);
-extern void vtime_account_irq_enter(struct task_struct *tsk);
 
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
-static inline bool vtime_accounting_enabled(void) { return true; }
-#endif
+#ifdef __ARCH_HAS_VTIME_ACCOUNT
+extern void vtime_account_irq_enter(struct task_struct *tsk);
+#else
+extern void vtime_common_account_irq_enter(struct task_struct *tsk);
+static inline void vtime_account_irq_enter(struct task_struct *tsk)
+{
+	if (vtime_accounting_enabled())
+		vtime_common_account_irq_enter(tsk);
+}
+#endif /* __ARCH_HAS_VTIME_ACCOUNT */
 
 #else /* !CONFIG_VIRT_CPU_ACCOUNTING */
 
@@ -24,14 +70,20 @@ static inline void vtime_task_switch(struct task_struct *prev) { }
 static inline void vtime_account_system(struct task_struct *tsk) { }
 static inline void vtime_account_user(struct task_struct *tsk) { }
 static inline void vtime_account_irq_enter(struct task_struct *tsk) { }
-static inline bool vtime_accounting_enabled(void) { return false; }
-#endif
+#endif /* !CONFIG_VIRT_CPU_ACCOUNTING */
 
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
 extern void arch_vtime_task_switch(struct task_struct *tsk);
-extern void vtime_account_irq_exit(struct task_struct *tsk);
-extern bool vtime_accounting_enabled(void);
+extern void vtime_gen_account_irq_exit(struct task_struct *tsk);
+
+static inline void vtime_account_irq_exit(struct task_struct *tsk)
+{
+	if (vtime_accounting_enabled())
+		vtime_gen_account_irq_exit(tsk);
+}
+
 extern void vtime_user_enter(struct task_struct *tsk);
+
 static inline void vtime_user_exit(struct task_struct *tsk)
 {
 	vtime_account_user(tsk);
@@ -39,7 +91,7 @@ static inline void vtime_user_exit(struct task_struct *tsk)
 extern void vtime_guest_enter(struct task_struct *tsk);
 extern void vtime_guest_exit(struct task_struct *tsk);
 extern void vtime_init_idle(struct task_struct *tsk, int cpu);
-#else
+#else /* !CONFIG_VIRT_CPU_ACCOUNTING_GEN  */
 static inline void vtime_account_irq_exit(struct task_struct *tsk)
 {
 	/* On hard|softirq exit we always account to hard|softirq cputime */
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index b62d5c0..0831b06 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -378,11 +378,8 @@ static inline void irqtime_account_process_tick(struct task_struct *p, int user_
 #ifdef CONFIG_VIRT_CPU_ACCOUNTING
 
 #ifndef __ARCH_HAS_VTIME_TASK_SWITCH
-void vtime_task_switch(struct task_struct *prev)
+void vtime_common_task_switch(struct task_struct *prev)
 {
-	if (!vtime_accounting_enabled())
-		return;
-
 	if (is_idle_task(prev))
 		vtime_account_idle(prev);
 	else
@@ -404,11 +401,8 @@ void vtime_task_switch(struct task_struct *prev)
  * vtime_account().
  */
 #ifndef __ARCH_HAS_VTIME_ACCOUNT
-void vtime_account_irq_enter(struct task_struct *tsk)
+void vtime_common_account_irq_enter(struct task_struct *tsk)
 {
-	if (!vtime_accounting_enabled())
-		return;
-
 	if (!in_interrupt()) {
 		/*
 		 * If we interrupted user, context_tracking_in_user()
@@ -428,7 +422,7 @@ void vtime_account_irq_enter(struct task_struct *tsk)
 	}
 	vtime_account_system(tsk);
 }
-EXPORT_SYMBOL_GPL(vtime_account_irq_enter);
+EXPORT_SYMBOL_GPL(vtime_common_account_irq_enter);
 #endif /* __ARCH_HAS_VTIME_ACCOUNT */
 #endif /* CONFIG_VIRT_CPU_ACCOUNTING */
 
@@ -669,11 +663,8 @@ void vtime_account_system(struct task_struct *tsk)
 	write_sequnlock(&tsk->vtime_seqlock);
 }
 
-void vtime_account_irq_exit(struct task_struct *tsk)
+void vtime_gen_account_irq_exit(struct task_struct *tsk)
 {
-	if (!vtime_accounting_enabled())
-		return;
-
 	write_seqlock(&tsk->vtime_seqlock);
 	if (context_tracking_in_user())
 		tsk->vtime_snap_whence = VTIME_USER;
@@ -732,11 +723,6 @@ void vtime_account_idle(struct task_struct *tsk)
 	account_idle_time(delta_cpu);
 }
 
-bool vtime_accounting_enabled(void)
-{
-	return context_tracking_active();
-}
-
 void arch_vtime_task_switch(struct task_struct *prev)
 {
 	write_seqlock(&prev->vtime_seqlock);
-- 
1.7.5.4



* [PATCH 17/21] vtime: Always scale generic vtime accounting results
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (15 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 16/21] vtime: Optimize full dynticks accounting off case with static keys Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 18/21] vtime: Always debug check snapshot source _before_ updating it Frederic Weisbecker
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

With full dynticks, cputime accounting can be a subtle mix of
CPUs using tick based accounting and others using generic vtime.

As long as the tick can have a share in producing these stats, we
want to scale the result against CFS precise accounting, as the tick
can miss tasks hiding between periodic interrupts.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 kernel/sched/cputime.c |    6 ------
 1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 0831b06..e9e742e 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -553,12 +553,6 @@ static void cputime_adjust(struct task_cputime *curr,
 {
 	cputime_t rtime, stime, utime, total;
 
-	if (vtime_accounting_enabled()) {
-		*ut = curr->utime;
-		*st = curr->stime;
-		return;
-	}
-
 	stime = curr->stime;
 	total = stime + curr->utime;
 
-- 
1.7.5.4


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 18/21] vtime: Always debug check snapshot source _before_ updating it
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (16 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 17/21] vtime: Always scale generic vtime accounting results Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 19/21] nohz: Rename a few state variables Frederic Weisbecker
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

The vtime delta update performed by get_vtime_delta() always checks
that the source of the snapshot is valid.

Meanwhile, the snapshot updaters that rely on get_vtime_delta() also
set the new snapshot origin. But some of them do this right before
the call to get_vtime_delta(), making its debug check useless.

This is easily fixable by moving the snapshot origin update after
the call to get_vtime_delta(). The order doesn't matter there.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 kernel/sched/cputime.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index e9e742e..c1d7493 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -660,9 +660,9 @@ void vtime_account_system(struct task_struct *tsk)
 void vtime_gen_account_irq_exit(struct task_struct *tsk)
 {
 	write_seqlock(&tsk->vtime_seqlock);
+	__vtime_account_system(tsk);
 	if (context_tracking_in_user())
 		tsk->vtime_snap_whence = VTIME_USER;
-	__vtime_account_system(tsk);
 	write_sequnlock(&tsk->vtime_seqlock);
 }
 
@@ -680,8 +680,8 @@ void vtime_account_user(struct task_struct *tsk)
 void vtime_user_enter(struct task_struct *tsk)
 {
 	write_seqlock(&tsk->vtime_seqlock);
-	tsk->vtime_snap_whence = VTIME_USER;
 	__vtime_account_system(tsk);
+	tsk->vtime_snap_whence = VTIME_USER;
 	write_sequnlock(&tsk->vtime_seqlock);
 }
 
-- 
1.7.5.4



* [PATCH 19/21] nohz: Rename a few state variables
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (17 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 18/21] vtime: Always debug check snapshot source _before_ updating it Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 20/21] nohz: Optimize full dynticks state checks with static keys Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 21/21] nohz: Optimize full dynticks's sched hooks " Frederic Weisbecker
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Rename the full dynticks cpumask and its state variable
to more exportable names.

These will be used later from global headers to optimize
the main full dynticks APIs in conjunction with static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 kernel/time/tick-sched.c |   42 +++++++++++++++++++++---------------------
 1 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6d604fd..71735ea 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -149,8 +149,8 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
 }
 
 #ifdef CONFIG_NO_HZ_FULL
-static cpumask_var_t nohz_full_mask;
-bool have_nohz_full_mask;
+static cpumask_var_t tick_nohz_full_mask;
+bool tick_nohz_full_running;
 
 static bool can_stop_full_tick(void)
 {
@@ -239,11 +239,11 @@ static void nohz_full_kick_ipi(void *info)
  */
 void tick_nohz_full_kick_all(void)
 {
-	if (!have_nohz_full_mask)
+	if (!tick_nohz_full_running)
 		return;
 
 	preempt_disable();
-	smp_call_function_many(nohz_full_mask,
+	smp_call_function_many(tick_nohz_full_mask,
 			       nohz_full_kick_ipi, NULL, false);
 	preempt_enable();
 }
@@ -271,10 +271,10 @@ out:
 
 int tick_nohz_full_cpu(int cpu)
 {
-	if (!have_nohz_full_mask)
+	if (!tick_nohz_full_running)
 		return 0;
 
-	return cpumask_test_cpu(cpu, nohz_full_mask);
+	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
 /* Parse the boot-time nohz CPU list from the kernel parameters. */
@@ -282,18 +282,18 @@ static int __init tick_nohz_full_setup(char *str)
 {
 	int cpu;
 
-	alloc_bootmem_cpumask_var(&nohz_full_mask);
-	if (cpulist_parse(str, nohz_full_mask) < 0) {
+	alloc_bootmem_cpumask_var(&tick_nohz_full_mask);
+	if (cpulist_parse(str, tick_nohz_full_mask) < 0) {
 		pr_warning("NOHZ: Incorrect nohz_full cpumask\n");
 		return 1;
 	}
 
 	cpu = smp_processor_id();
-	if (cpumask_test_cpu(cpu, nohz_full_mask)) {
+	if (cpumask_test_cpu(cpu, tick_nohz_full_mask)) {
 		pr_warning("NO_HZ: Clearing %d from nohz_full range for timekeeping\n", cpu);
-		cpumask_clear_cpu(cpu, nohz_full_mask);
+		cpumask_clear_cpu(cpu, tick_nohz_full_mask);
 	}
-	have_nohz_full_mask = true;
+	tick_nohz_full_running = true;
 
 	return 1;
 }
@@ -311,7 +311,7 @@ static int tick_nohz_cpu_down_callback(struct notifier_block *nfb,
 		 * If we handle the timekeeping duty for full dynticks CPUs,
 		 * we can't safely shutdown that CPU.
 		 */
-		if (have_nohz_full_mask && tick_do_timer_cpu == cpu)
+		if (tick_nohz_full_running && tick_do_timer_cpu == cpu)
 			return NOTIFY_BAD;
 		break;
 	}
@@ -330,14 +330,14 @@ static int tick_nohz_init_all(void)
 	int err = -1;
 
 #ifdef CONFIG_NO_HZ_FULL_ALL
-	if (!alloc_cpumask_var(&nohz_full_mask, GFP_KERNEL)) {
+	if (!alloc_cpumask_var(&tick_nohz_full_mask, GFP_KERNEL)) {
 		pr_err("NO_HZ: Can't allocate full dynticks cpumask\n");
 		return err;
 	}
 	err = 0;
-	cpumask_setall(nohz_full_mask);
-	cpumask_clear_cpu(smp_processor_id(), nohz_full_mask);
-	have_nohz_full_mask = true;
+	cpumask_setall(tick_nohz_full_mask);
+	cpumask_clear_cpu(smp_processor_id(), tick_nohz_full_mask);
+	tick_nohz_full_running = true;
 #endif
 	return err;
 }
@@ -346,20 +346,20 @@ void __init tick_nohz_init(void)
 {
 	int cpu;
 
-	if (!have_nohz_full_mask) {
+	if (!tick_nohz_full_running) {
 		if (tick_nohz_init_all() < 0)
 			return;
 	}
 
-	for_each_cpu(cpu, nohz_full_mask)
+	for_each_cpu(cpu, tick_nohz_full_mask)
 		context_tracking_cpu_set(cpu);
 
 	cpu_notifier(tick_nohz_cpu_down_callback, 0);
-	cpulist_scnprintf(nohz_full_buf, sizeof(nohz_full_buf), nohz_full_mask);
+	cpulist_scnprintf(nohz_full_buf, sizeof(nohz_full_buf), tick_nohz_full_mask);
 	pr_info("NO_HZ: Full dynticks CPUs: %s.\n", nohz_full_buf);
 }
 #else
-#define have_nohz_full_mask (0)
+#define tick_nohz_full_running (0)
 #endif
 
 /*
@@ -737,7 +737,7 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 		return false;
 	}
 
-	if (have_nohz_full_mask) {
+	if (tick_nohz_full_running) {
 		/*
 		 * Keep the tick alive to guarantee timekeeping progression
 		 * if there are full dynticks CPUs around
-- 
1.7.5.4



* [PATCH 20/21] nohz: Optimize full dynticks state checks with static keys
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (18 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 19/21] nohz: Rename a few state variables Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  2013-07-26 23:42 ` [PATCH 21/21] nohz: Optimize full dynticks's sched hooks " Frederic Weisbecker
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

These APIs are frequently accessed, so priority is given
to optimizing the full dynticks off-case in order to let
distros enable this feature without suffering significant
performance regressions.

Let's inline these APIs and optimize them with static keys.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/tick.h     |   25 +++++++++++++++++++++++--
 kernel/time/tick-sched.c |   14 ++------------
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9180f4b..c60b079 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -10,6 +10,8 @@
 #include <linux/irqflags.h>
 #include <linux/percpu.h>
 #include <linux/hrtimer.h>
+#include <linux/context_tracking_state.h>
+#include <linux/cpumask.h>
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS
 
@@ -158,15 +160,34 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 # endif /* !CONFIG_NO_HZ_COMMON */
 
 #ifdef CONFIG_NO_HZ_FULL
+extern bool tick_nohz_full_running;
+extern cpumask_var_t tick_nohz_full_mask;
+
+static inline bool tick_nohz_full_enabled(void)
+{
+	if (!static_key_false(&context_tracking_enabled))
+		return false;
+
+	return tick_nohz_full_running;
+}
+
+static inline bool tick_nohz_full_cpu(int cpu)
+{
+	if (!tick_nohz_full_enabled())
+		return false;
+
+	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
+}
+
 extern void tick_nohz_init(void);
-extern int tick_nohz_full_cpu(int cpu);
 extern void tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_all(void);
 extern void tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline void tick_nohz_init(void) { }
-static inline int tick_nohz_full_cpu(int cpu) { return 0; }
+static inline bool tick_nohz_full_enabled(void) { return false; }
+static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick(void) { }
 static inline void tick_nohz_full_kick_all(void) { }
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 71735ea..6d6bd6e 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -149,7 +149,7 @@ static void tick_sched_handle(struct tick_sched *ts, struct pt_regs *regs)
 }
 
 #ifdef CONFIG_NO_HZ_FULL
-static cpumask_var_t tick_nohz_full_mask;
+cpumask_var_t tick_nohz_full_mask;
 bool tick_nohz_full_running;
 
 static bool can_stop_full_tick(void)
@@ -269,14 +269,6 @@ out:
 	local_irq_restore(flags);
 }
 
-int tick_nohz_full_cpu(int cpu)
-{
-	if (!tick_nohz_full_running)
-		return 0;
-
-	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
-}
-
 /* Parse the boot-time nohz CPU list from the kernel parameters. */
 static int __init tick_nohz_full_setup(char *str)
 {
@@ -358,8 +350,6 @@ void __init tick_nohz_init(void)
 	cpulist_scnprintf(nohz_full_buf, sizeof(nohz_full_buf), tick_nohz_full_mask);
 	pr_info("NO_HZ: Full dynticks CPUs: %s.\n", nohz_full_buf);
 }
-#else
-#define tick_nohz_full_running (0)
 #endif
 
 /*
@@ -737,7 +727,7 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
 		return false;
 	}
 
-	if (tick_nohz_full_running) {
+	if (tick_nohz_full_enabled()) {
 		/*
 		 * Keep the tick alive to guarantee timekeeping progression
 		 * if there are full dynticks CPUs around
-- 
1.7.5.4



* [PATCH 21/21] nohz: Optimize full dynticks's sched hooks with static keys
  2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
                   ` (19 preceding siblings ...)
  2013-07-26 23:42 ` [PATCH 20/21] nohz: Optimize full dynticks state checks with static keys Frederic Weisbecker
@ 2013-07-26 23:42 ` Frederic Weisbecker
  20 siblings, 0 replies; 22+ messages in thread
From: Frederic Weisbecker @ 2013-07-26 23:42 UTC (permalink / raw)
  To: LKML
  Cc: Frederic Weisbecker, Steven Rostedt, Paul E. McKenney,
	Ingo Molnar, Thomas Gleixner, Peter Zijlstra, Borislav Petkov,
	Li Zhong, Mike Galbraith, Kevin Hilman

Scheduler IPIs and task context switches are serious fast paths.
Let's use static keys to hide, as much as we can, the off-case
impact of the full dynticks APIs called from these sites.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kevin Hilman <khilman@linaro.org>
---
 include/linux/tick.h     |   20 ++++++++++++++++----
 kernel/time/tick-sched.c |    8 ++++----
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index c60b079..a7ef1d6 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -180,20 +180,32 @@ static inline bool tick_nohz_full_cpu(int cpu)
 }
 
 extern void tick_nohz_init(void);
-extern void tick_nohz_full_check(void);
+extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_all(void);
-extern void tick_nohz_task_switch(struct task_struct *tsk);
+extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline void tick_nohz_init(void) { }
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
-static inline void tick_nohz_full_check(void) { }
+static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick(void) { }
 static inline void tick_nohz_full_kick_all(void) { }
-static inline void tick_nohz_task_switch(struct task_struct *tsk) { }
+static inline void __tick_nohz_task_switch(struct task_struct *tsk) { }
 #endif
 
+static inline void tick_nohz_full_check(void)
+{
+	if (tick_nohz_full_enabled())
+		__tick_nohz_full_check();
+}
+
+static inline void tick_nohz_task_switch(struct task_struct *tsk)
+{
+	if (tick_nohz_full_enabled())
+		__tick_nohz_task_switch(tsk);
+}
+
 
 # ifdef CONFIG_CPU_IDLE_GOV_MENU
 extern void menu_hrtimer_cancel(void);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6d6bd6e..73997be 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -197,7 +197,7 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now);
  * Re-evaluate the need for the tick on the current CPU
  * and restart it if necessary.
  */
-void tick_nohz_full_check(void)
+void __tick_nohz_full_check(void)
 {
 	struct tick_sched *ts = &__get_cpu_var(tick_cpu_sched);
 
@@ -211,7 +211,7 @@ void tick_nohz_full_check(void)
 
 static void nohz_full_kick_work_func(struct irq_work *work)
 {
-	tick_nohz_full_check();
+	__tick_nohz_full_check();
 }
 
 static DEFINE_PER_CPU(struct irq_work, nohz_full_kick_work) = {
@@ -230,7 +230,7 @@ void tick_nohz_full_kick(void)
 
 static void nohz_full_kick_ipi(void *info)
 {
-	tick_nohz_full_check();
+	__tick_nohz_full_check();
 }
 
 /*
@@ -253,7 +253,7 @@ void tick_nohz_full_kick_all(void)
  * It might need the tick due to per task/process properties:
  * perf events, posix cpu timers, ...
  */
-void tick_nohz_task_switch(struct task_struct *tsk)
+void __tick_nohz_task_switch(struct task_struct *tsk)
 {
 	unsigned long flags;
 
-- 
1.7.5.4



end of thread, other threads:[~2013-07-26 23:47 UTC | newest]

Thread overview: 22+ messages
2013-07-26 23:42 [PATCH 00/21] nohz patches for 3.12 preview v2 Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 01/21] sched: Consolidate open coded preemptible() checks Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 02/21] context_tracing: Fix guest accounting with native vtime Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 03/21] vtime: Update a few comments Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 04/21] context_tracking: Fix runtime CPU off-case Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 05/21] nohz: Only enable context tracking on full dynticks CPUs Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 06/21] context_tracking: Remove full dynticks' hacky dependency on wide context tracking Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 07/21] context_tracking: Ground setup for static key use Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 08/21] context_tracking: Optimize main APIs off case with static key Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 09/21] context_tracking: Optimize guest " Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 10/21] context_tracking: Optimize context switch off case with static keys Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 11/21] context_tracking: User/kernel broundary cross trace events Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 12/21] vtime: Remove a few unneeded generic vtime state checks Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 13/21] vtime: Fix racy cputime delta update Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 14/21] context_tracking: Split low level state headers Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 15/21] vtime: Describe overriden functions in dedicated arch headers Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 16/21] vtime: Optimize full dynticks accounting off case with static keys Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 17/21] vtime: Always scale generic vtime accounting results Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 18/21] vtime: Always debug check snapshot source _before_ updating it Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 19/21] nohz: Rename a few state variables Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 20/21] nohz: Optimize full dynticks state checks with static keys Frederic Weisbecker
2013-07-26 23:42 ` [PATCH 21/21] nohz: Optimize full dynticks's sched hooks " Frederic Weisbecker
