linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH] sched: Fix sched_wakeup tracepoint
@ 2015-06-05 11:41 Mathieu Desnoyers
  2015-06-05 12:09 ` Peter Zijlstra
  0 siblings, 1 reply; 19+ messages in thread
From: Mathieu Desnoyers @ 2015-06-05 11:41 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Mathieu Desnoyers, Thomas Gleixner, Ingo Molnar,
	Steven Rostedt, Francis Giraldeau

Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
remote wakeups. This commit appeared upstream in Linux v3.0.

Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
tracepoint. Analyzing wakeup latencies depends on getting the wakeup
chain right: which process is the waker, which is the wakee. Moving this
instrumentation outside of the waker context prevents trace analysis tools
from getting the waker pid, either through "current" in the tracepoint
probe, or by deducing it using other scheduler events based on the CPU
executing the tracepoint.

Another side-effect of moving this instrumentation to the scheduler ipi
is that the delay during which the wakeup is sitting in the pending
queue is not accounted for when calculating wakeup latency.

Therefore, move the sched_wakeup instrumentation back to the waker
context to fix those two shortcomings.

This patch is build-tested only, submitted for feedback. Francis, can
you try it out with your critical path analysis?

Fixes: 317f394160e9 ("sched: Move the second half of ttwu() to the remote cpu")
Reported-by: Francis Giraldeau <francis.giraldeau@gmail.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Ingo Molnar <mingo@kernel.org>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Francis Giraldeau <francis.giraldeau@gmail.com>
---
 kernel/sched/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3d5f6f6..0ed2021 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1457,7 +1457,6 @@ static void
 ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
 {
 	check_preempt_curr(rq, p, wake_flags);
-	trace_sched_wakeup(p, true);
 
 	p->state = TASK_RUNNING;
 #ifdef CONFIG_SMP
@@ -1505,6 +1504,7 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
 	if (task_on_rq_queued(p)) {
 		/* check_preempt_curr() may use rq clock */
 		update_rq_clock(rq);
+		trace_sched_wakeup(p, true);
 		ttwu_do_wakeup(rq, p, wake_flags);
 		ret = 1;
 	}
@@ -1619,6 +1619,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 
+	trace_sched_wakeup(p, true);
 #if defined(CONFIG_SMP)
 	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
 		sched_clock_cpu(cpu); /* sync clocks x-cpu */
@@ -1734,6 +1735,7 @@ static void try_to_wake_up_local(struct task_struct *p)
 	if (!task_on_rq_queued(p))
 		ttwu_activate(rq, p, ENQUEUE_WAKEUP);
 
+	trace_sched_wakeup(p, true);
 	ttwu_do_wakeup(rq, p, 0);
 	ttwu_stat(p, smp_processor_id(), 0);
 out:
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 11:41 [RFC PATCH] sched: Fix sched_wakeup tracepoint Mathieu Desnoyers
@ 2015-06-05 12:09 ` Peter Zijlstra
  2015-06-05 12:32   ` Mathieu Desnoyers
  2015-06-05 12:32   ` Thomas Gleixner
  0 siblings, 2 replies; 19+ messages in thread
From: Peter Zijlstra @ 2015-06-05 12:09 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, Steven Rostedt,
	Francis Giraldeau

On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
> Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
> moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
> remote wakeups. This commit appeared upstream in Linux v3.0.
> 
> Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
> tracepoint. Analyzing wakeup latencies depends on getting the wakeup
> chain right: which process is the waker, which is the wakee. Moving this
> instrumentation outside of the waker context prevents trace analysis tools
> from getting the waker pid, either through "current" in the tracepoint
> probe, or by deducing it using other scheduler events based on the CPU
> executing the tracepoint.
> 
> Another side-effect of moving this instrumentation to the scheduler ipi
> is that the delay during which the wakeup is sitting in the pending
> queue is not accounted for when calculating wakeup latency.
> 
> Therefore, move the sched_wakeup instrumentation back to the waker
> context to fix those two shortcomings.

What do you consider wakeup-latency? I don't see how moving the
tracepoint into the caller will magically account the queue time.

> +++ b/kernel/sched/core.c
> @@ -1457,7 +1457,6 @@ static void
>  ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
>  {
>  	check_preempt_curr(rq, p, wake_flags);
> -	trace_sched_wakeup(p, true);
>  
>  	p->state = TASK_RUNNING;
>  #ifdef CONFIG_SMP
> @@ -1505,6 +1504,7 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
>  	if (task_on_rq_queued(p)) {
>  		/* check_preempt_curr() may use rq clock */
>  		update_rq_clock(rq);
> +		trace_sched_wakeup(p, true);
>  		ttwu_do_wakeup(rq, p, wake_flags);
>  		ret = 1;
>  	}
> @@ -1619,6 +1619,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
>  
> +	trace_sched_wakeup(p, true);
>  #if defined(CONFIG_SMP)
>  	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
>  		sched_clock_cpu(cpu); /* sync clocks x-cpu */

You only need one site in try_to_wake_up(), put it right after
success=1.


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 12:09 ` Peter Zijlstra
@ 2015-06-05 12:32   ` Mathieu Desnoyers
  2015-06-05 12:51     ` Peter Zijlstra
  2015-06-05 12:32   ` Thomas Gleixner
  1 sibling, 1 reply; 19+ messages in thread
From: Mathieu Desnoyers @ 2015-06-05 12:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, rostedt, Francis Giraldeau

----- On Jun 5, 2015, at 2:09 PM, Peter Zijlstra peterz@infradead.org wrote:

> On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
>> Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
>> moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
>> remote wakeups. This commit appeared upstream in Linux v3.0.
>> 
>> Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
>> tracepoint. Analyzing wakeup latencies depends on getting the wakeup
>> chain right: which process is the waker, which is the wakee. Moving this
>> instrumentation outside of the waker context prevents trace analysis tools
>> from getting the waker pid, either through "current" in the tracepoint
>> probe, or by deducing it using other scheduler events based on the CPU
>> executing the tracepoint.
>> 
>> Another side-effect of moving this instrumentation to the scheduler ipi
>> is that the delay during which the wakeup is sitting in the pending
>> queue is not accounted for when calculating wakeup latency.
>> 
>> Therefore, move the sched_wakeup instrumentation back to the waker
>> context to fix those two shortcomings.
> 
> What do you consider wakeup-latency? I don't see how moving the
> tracepoint into the caller will magically account the queue time.

That would be the delay from timestamp@wakeup (in the waker) to
timestamp@sched_switch_in (in the wakee).

ttwu_pending goes through a lockless list that is populated before
sending the IPI and iterated on by the IPI handler. Therefore, moving
the tracepoint before the enqueue will account the time spent in the
queue as part of the wakeup latency.

> 
>> +++ b/kernel/sched/core.c
>> @@ -1457,7 +1457,6 @@ static void
>>  ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
>>  {
>>  	check_preempt_curr(rq, p, wake_flags);
>> -	trace_sched_wakeup(p, true);
>>  
>>  	p->state = TASK_RUNNING;
>>  #ifdef CONFIG_SMP
>> @@ -1505,6 +1504,7 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
>>  	if (task_on_rq_queued(p)) {
>>  		/* check_preempt_curr() may use rq clock */
>>  		update_rq_clock(rq);
>> +		trace_sched_wakeup(p, true);
>>  		ttwu_do_wakeup(rq, p, wake_flags);
>>  		ret = 1;
>>  	}
>> @@ -1619,6 +1619,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
>>  {
>>  	struct rq *rq = cpu_rq(cpu);
>>  
>> +	trace_sched_wakeup(p, true);
>>  #if defined(CONFIG_SMP)
>>  	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
>>  		sched_clock_cpu(cpu); /* sync clocks x-cpu */
> 
> You only need one site in try_to_wake_up(), put it right after
> success=1.

There are code paths further down the chain that check whether the task
is not fully dequeued from the runqueue (e.g. after grabbing the rq lock
in ttwu_remote()) and skip the sched_wakeup event in those cases. Do we
care about not tracing an event in those cases, or would tracing it (as
per your suggestion) be preferable?

Also, moving the tracepoint to try_to_wake_up() would not be enough. We would
also need to instrument try_to_wake_up_local().

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 12:09 ` Peter Zijlstra
  2015-06-05 12:32   ` Mathieu Desnoyers
@ 2015-06-05 12:32   ` Thomas Gleixner
  2015-06-05 12:36     ` Mathieu Desnoyers
  2015-06-05 12:46     ` Peter Zijlstra
  1 sibling, 2 replies; 19+ messages in thread
From: Thomas Gleixner @ 2015-06-05 12:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mathieu Desnoyers, linux-kernel, Ingo Molnar, Steven Rostedt,
	Francis Giraldeau

On Fri, 5 Jun 2015, Peter Zijlstra wrote:
> On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
> > Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
> > moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
> > remote wakeups. This commit appeared upstream in Linux v3.0.
> > 
> > Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
> > tracepoint. Analyzing wakeup latencies depends on getting the wakeup
> > chain right: which process is the waker, which is the wakee. Moving this
> > instrumentation outside of the waker context prevents trace analysis tools
> > from getting the waker pid, either through "current" in the tracepoint
> > probe, or by deducing it using other scheduler events based on the CPU
> > executing the tracepoint.
> > 
> > Another side-effect of moving this instrumentation to the scheduler ipi
> > is that the delay during which the wakeup is sitting in the pending
> > queue is not accounted for when calculating wakeup latency.
> > 
> > Therefore, move the sched_wakeup instrumentation back to the waker
> > context to fix those two shortcomings.
> 
> What do you consider wakeup-latency? I don't see how moving the
> tracepoint into the caller will magically account the queue time.

Well, the point of wakeup is when the wakee calls wakeup. If the trace
point is in the IPI then you account the time between the wakeup and
the actual handling in the IPI to the wakee, instead of accounting it
to the time between wakeup and sched switch.

Thanks,

	tglx


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 12:32   ` Thomas Gleixner
@ 2015-06-05 12:36     ` Mathieu Desnoyers
  2015-06-05 12:46     ` Peter Zijlstra
  1 sibling, 0 replies; 19+ messages in thread
From: Mathieu Desnoyers @ 2015-06-05 12:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Peter Zijlstra, linux-kernel, Ingo Molnar, rostedt, Giraldeau, Francis

----- On Jun 5, 2015, at 2:32 PM, Thomas Gleixner tglx@linutronix.de wrote:

> On Fri, 5 Jun 2015, Peter Zijlstra wrote:
>> On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
>> > Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
>> > moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
>> > remote wakeups. This commit appeared upstream in Linux v3.0.
>> > 
>> > Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
>> > tracepoint. Analyzing wakeup latencies depends on getting the wakeup
>> > chain right: which process is the waker, which is the wakee. Moving this
>> > instrumentation outside of the waker context prevents trace analysis tools
>> > from getting the waker pid, either through "current" in the tracepoint
>> > probe, or by deducing it using other scheduler events based on the CPU
>> > executing the tracepoint.
>> > 
>> > Another side-effect of moving this instrumentation to the scheduler ipi
>> > is that the delay during which the wakeup is sitting in the pending
>> > queue is not accounted for when calculating wakeup latency.
>> > 
>> > Therefore, move the sched_wakeup instrumentation back to the waker
>> > context to fix those two shortcomings.
>> 
>> What do you consider wakeup-latency? I don't see how moving the
>> tracepoint into the caller will magically account the queue time.
> 
> Well, the point of wakeup is when the wakee calls wakeup. If the trace

^ I think you actually mean "when the waker calls wakeup".

> point is in the IPI then you account the time between the wakeup and
> the actual handling in the IPI to the wakee instead of accounting it
> to the time between wakeup and sched switch.
> 

Thanks,

Mathieu

> Thanks,
> 
> 	tglx

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 12:32   ` Thomas Gleixner
  2015-06-05 12:36     ` Mathieu Desnoyers
@ 2015-06-05 12:46     ` Peter Zijlstra
  2015-06-08 16:54       ` Steven Rostedt
  1 sibling, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2015-06-05 12:46 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mathieu Desnoyers, linux-kernel, Ingo Molnar, Steven Rostedt,
	Francis Giraldeau

On Fri, 2015-06-05 at 14:32 +0200, Thomas Gleixner wrote:
> On Fri, 5 Jun 2015, Peter Zijlstra wrote:
> > On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
> > > Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
> > > moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
> > > remote wakeups. This commit appeared upstream in Linux v3.0.
> > > 
> > > Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
> > > tracepoint. Analyzing wakeup latencies depends on getting the wakeup
> > > chain right: which process is the waker, which is the wakee. Moving this
> > > instrumentation outside of the waker context prevents trace analysis tools
> > > from getting the waker pid, either through "current" in the tracepoint
> > > probe, or by deducing it using other scheduler events based on the CPU
> > > executing the tracepoint.
> > > 
> > > Another side-effect of moving this instrumentation to the scheduler ipi
> > > is that the delay during which the wakeup is sitting in the pending
> > > queue is not accounted for when calculating wakeup latency.
> > > 
> > > Therefore, move the sched_wakeup instrumentation back to the waker
> > > context to fix those two shortcomings.
> > 
> > What do you consider wakeup-latency? I don't see how moving the
> > tracepoint into the caller will magically account the queue time.
> 
> Well, the point of wakeup is when the wakee calls wakeup. If the trace
> point is in the IPI then you account the time between the wakeup and
> the actual handling in the IPI to the wakee instead of accounting it
> to the time between wakeup and sched switch.

My point exactly, wake->schedule is what we call the scheduling latency,
not the wake latency, which would be from 'event' to the task being
runnable.



* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 12:32   ` Mathieu Desnoyers
@ 2015-06-05 12:51     ` Peter Zijlstra
  2015-06-05 13:23       ` Mathieu Desnoyers
  0 siblings, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2015-06-05 12:51 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, rostedt, Francis Giraldeau

On Fri, 2015-06-05 at 12:32 +0000, Mathieu Desnoyers wrote:
> ----- On Jun 5, 2015, at 2:09 PM, Peter Zijlstra peterz@infradead.org wrote:
> 
> > On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
> >> Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
> >> moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
> >> remote wakeups. This commit appeared upstream in Linux v3.0.
> >> 
> >> Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
> >> tracepoint. Analyzing wakeup latencies depends on getting the wakeup
> >> chain right: which process is the waker, which is the wakee. Moving this
> >> instrumentation outside of the waker context prevents trace analysis tools
> >> from getting the waker pid, either through "current" in the tracepoint
> >> probe, or by deducing it using other scheduler events based on the CPU
> >> executing the tracepoint.
> >> 
> >> Another side-effect of moving this instrumentation to the scheduler ipi
> >> is that the delay during which the wakeup is sitting in the pending
> >> queue is not accounted for when calculating wakeup latency.
> >> 
> >> Therefore, move the sched_wakeup instrumentation back to the waker
> >> context to fix those two shortcomings.
> > 
> > What do you consider wakeup-latency? I don't see how moving the
> > tracepoint into the caller will magically account the queue time.
> 
> That would be the delay from timestamp@wakeup (in the waker) to
> timestamp@sched_switch_in (in the wakee).

That's scheduling latency. It's how long it takes a runnable task to
become running. Wakeup latency would be how long it takes from 'event'
to the task being runnable.

The unfortunate situation here is that placing that tracepoint in the
wakee -- which makes sense -- makes measuring either of the above
latencies impossible because it does not capture the becoming RUNNABLE
part.

> ttwu_pending goes through a lockless list that is populated before
> sending the IPI and iterated on by the IPI handler. Therefore, moving
> the tracepoint before the enqueue will account the time spent in the
> queue as part of the wakeup latency.

Which is wrong from the point of view of both latencies.

> > 
> >> +++ b/kernel/sched/core.c
> >> @@ -1457,7 +1457,6 @@ static void
> >>  ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
> >>  {
> >>  	check_preempt_curr(rq, p, wake_flags);
> >> -	trace_sched_wakeup(p, true);
> >>  
> >>  	p->state = TASK_RUNNING;
> >>  #ifdef CONFIG_SMP
> >> @@ -1505,6 +1504,7 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
> >>  	if (task_on_rq_queued(p)) {
> >>  		/* check_preempt_curr() may use rq clock */
> >>  		update_rq_clock(rq);
> >> +		trace_sched_wakeup(p, true);
> >>  		ttwu_do_wakeup(rq, p, wake_flags);
> >>  		ret = 1;
> >>  	}
> >> @@ -1619,6 +1619,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
> >>  {
> >>  	struct rq *rq = cpu_rq(cpu);
> >>  
> >> +	trace_sched_wakeup(p, true);
> >>  #if defined(CONFIG_SMP)
> >>  	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
> >>  		sched_clock_cpu(cpu); /* sync clocks x-cpu */
> > 
> > You only need one site in try_to_wake_up(), put it right after
> > success=1.
> 
> There are code paths further down the chain that check whether the task
> is not fully dequeued from the runqueue (e.g. after grabbing the rq lock
> in ttwu_remote()) and skip the sched_wakeup event in those cases. Do we care
> about not tracing an event in those cases, or would tracing it (as per your
> suggestion) be preferable?

It's an actual wakeup, we change task->state, therefore one should trace
it.

> 
> Also, moving the tracepoint to try_to_wake_up() would not be enough. We would
> also need to instrument try_to_wake_up_local().

Sure, I just objected to having 2 tracepoints in the 'normal' wakeup
path.


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 12:51     ` Peter Zijlstra
@ 2015-06-05 13:23       ` Mathieu Desnoyers
  2015-06-06 12:02         ` Peter Zijlstra
  2015-06-08  6:55         ` [RFC PATCH] sched: Fix sched_wakeup tracepoint Peter Zijlstra
  0 siblings, 2 replies; 19+ messages in thread
From: Mathieu Desnoyers @ 2015-06-05 13:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, rostedt, Francis Giraldeau

----- On Jun 5, 2015, at 2:51 PM, Peter Zijlstra peterz@infradead.org wrote:

> On Fri, 2015-06-05 at 12:32 +0000, Mathieu Desnoyers wrote:
>> ----- On Jun 5, 2015, at 2:09 PM, Peter Zijlstra peterz@infradead.org wrote:
>> 
>> > On Fri, Jun 05, 2015 at 01:41:49PM +0200, Mathieu Desnoyers wrote:
>> >> Commit 317f394160e9 "sched: Move the second half of ttwu() to the remote cpu"
>> >> moves ttwu_do_wakeup() to an IPI handler context on the remote CPU for
>> >> remote wakeups. This commit appeared upstream in Linux v3.0.
>> >> 
>> >> Unfortunately, ttwu_do_wakeup() happens to contain the "sched_wakeup"
>> >> tracepoint. Analyzing wakeup latencies depends on getting the wakeup
>> >> chain right: which process is the waker, which is the wakee. Moving this
>> >> instrumentation outside of the waker context prevents trace analysis tools
>> >> from getting the waker pid, either through "current" in the tracepoint
>> >> probe, or by deducing it using other scheduler events based on the CPU
>> >> executing the tracepoint.
>> >> 
>> >> Another side-effect of moving this instrumentation to the scheduler ipi
>> >> is that the delay during which the wakeup is sitting in the pending
>> >> queue is not accounted for when calculating wakeup latency.
>> >> 
>> >> Therefore, move the sched_wakeup instrumentation back to the waker
>> >> context to fix those two shortcomings.
>> > 
>> > What do you consider wakeup-latency? I don't see how moving the
>> > tracepoint into the caller will magically account the queue time.
>> 
>> That would be the delay from timestamp@wakeup (in the waker) to
>> timestamp@sched_switch_in (in the wakee).
> 
> That's scheduling latency. Its how long it takes a runnable task to
> become running. Wakeup latency would be how long it takes from 'event'
> to the task being runnable.
> 
> The unfortunate situation here is that placing that tracepoint in the
> wakee -- which makes sense -- makes measuring either of the above
> latencies impossible because it does not capture the becoming RUNNABLE
> part.
> 
>> ttwu_pending goes through a lockless list that is populated before
>> sending the IPI and iterated on by the IPI handler. Therefore, moving
>> the tracepoint before the enqueue will account the time spent in the
>> queue as part of the wakeup latency.
> 
> Which is wrong from all points of view latencies.

OK, so considering the definition naming feedback you provided, we
may need three tracepoints if we want to calculate both wakeup latency
and scheduling latency (naming ofc open to discussion):

sched_wakeup: when try_to_wake_up{,_local} is called in the waker.
sched_activate_task: when the wakee is marked runnable.
sched_switch: when scheduling actually happens.

We can then calculate wakeup latency as

  time@sched_activate - time@sched_wakeup

And scheduling latency as

  time@sched_switch - time@sched_activate

In the case of critical path analysis, we don't care about this
level of granularity. What we care about is the sum of the two,
which we can express as:

  time@sched_switch - time@sched_wakeup

Is there an officially blessed name for this?

> 
>> > 
>> >> +++ b/kernel/sched/core.c
>> >> @@ -1457,7 +1457,6 @@ static void
>> >>  ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
>> >>  {
>> >>  	check_preempt_curr(rq, p, wake_flags);
>> >> -	trace_sched_wakeup(p, true);
>> >>  
>> >>  	p->state = TASK_RUNNING;
>> >>  #ifdef CONFIG_SMP
>> >> @@ -1505,6 +1504,7 @@ static int ttwu_remote(struct task_struct *p, int wake_flags)
>> >>  	if (task_on_rq_queued(p)) {
>> >>  		/* check_preempt_curr() may use rq clock */
>> >>  		update_rq_clock(rq);
>> >> +		trace_sched_wakeup(p, true);
>> >>  		ttwu_do_wakeup(rq, p, wake_flags);
>> >>  		ret = 1;
>> >>  	}
>> >> @@ -1619,6 +1619,7 @@ static void ttwu_queue(struct task_struct *p, int cpu)
>> >>  {
>> >>  	struct rq *rq = cpu_rq(cpu);
>> >>  
>> >> +	trace_sched_wakeup(p, true);
>> >>  #if defined(CONFIG_SMP)
>> >>  	if (sched_feat(TTWU_QUEUE) && !cpus_share_cache(smp_processor_id(), cpu)) {
>> >>  		sched_clock_cpu(cpu); /* sync clocks x-cpu */
>> > 
>> > You only need one site in try_to_wake_up(), put it right after
>> > success=1.
>> 
>> There are code paths further down the chain that check whether the task
>> is not fully dequeued from the runqueue (e.g. after grabbing the rq lock
>> in ttwu_remote()) and skip the sched_wakeup event in those cases. Do we care
>> about not tracing an event in those cases, or would tracing it (as per your
>> suggestion) be preferable?
> 
> It's an actual wakeup, we change task->state, therefore one should trace
> it.

Works for me!

> 
>> 
>> Also, moving the tracepoint to try_to_wake_up() would not be enough. We would
>> also need to instrument try_to_wake_up_local().
> 
> Sure, I just objected to having 2 tracepoints in the 'normal' wakeup
> path.

Agreed. We may need to see if we want to add a "sched_activate_task" tracepoint
though.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 13:23       ` Mathieu Desnoyers
@ 2015-06-06 12:02         ` Peter Zijlstra
  2015-06-07 10:20           ` Mathieu Desnoyers
  2015-06-08  6:55         ` [RFC PATCH] sched: Fix sched_wakeup tracepoint Peter Zijlstra
  1 sibling, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2015-06-06 12:02 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, rostedt, Francis Giraldeau

On Fri, 2015-06-05 at 13:23 +0000, Mathieu Desnoyers wrote:
> OK, so considering the definition naming feedback you provided, we
> may need three tracepoints if we want to calculate both wakeup latency
> and scheduling latency (naming ofc open to discussion):
> 
> sched_wakeup: when try_to_wake_up{,_local} is called in the waker.
> sched_activate_task: when the wakee is marked runnable.
> sched_switch: when scheduling actually happens.

I would propose:

	sched_waking: upon calling try_to_wake_up() as soon as we know we need
to change state; guaranteed to be called from the context doing the
wakeup.

	sched_woken: the wakeup is complete (task is runnable, any delay
between this and actually getting on a cpu is down to the scheduler).

	sched_switch: when switching from task @prev to @next.

This means abandoning trace_sched_wakeup(); which might be a problem,
which is why I bloody hate tracepoints :-(

> We can then calculate wakeup latency as
> 
>   time@sched_activate - time@sched_wakeup
> 
> And scheduling latency as
> 
>   time@sched_switch - time@sched_activate
> 
> In the case of critical path analysis, we don't care about this
> level of granularity. What we care about is the sum of the two,
> which we can express as:
> 
>   time@sched_switch - time@sched_wakeup
> 
> Is there an officially blessed name for this?

No idea.


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-06 12:02         ` Peter Zijlstra
@ 2015-06-07 10:20           ` Mathieu Desnoyers
  2015-06-08 17:27             ` Steven Rostedt
  0 siblings, 1 reply; 19+ messages in thread
From: Mathieu Desnoyers @ 2015-06-07 10:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, rostedt,
	Francis Giraldeau, lttng-dev

----- On Jun 6, 2015, at 2:02 PM, Peter Zijlstra peterz@infradead.org wrote:

> On Fri, 2015-06-05 at 13:23 +0000, Mathieu Desnoyers wrote:
>> OK, so considering the definition naming feedback you provided, we
>> may need three tracepoints if we want to calculate both wakeup latency
>> and scheduling latency (naming ofc open to discussion):
>> 
>> sched_wakeup: when try_to_wake_up{,_local} is called in the waker.
>> sched_activate_task: when the wakee is marked runnable.
>> sched_switch: when scheduling actually happens.
> 
> I would propose:
> 
>	sched_waking: upon calling try_to_wake_up() as soon as we know we need
> to change state; guaranteed to be called from the context doing the
> wakeup.
> 
>	sched_woken: the wakeup is complete (task is runnable, any delay
> between this and actually getting on a cpu is down to the scheduler).
> 
>	sched_switch: when switching from task @prev to @next.

Agreed,

> 
> This means abandoning trace_sched_wakeup(); which might be a problem,
> which is why I bloody hate tracepoints :-(

OK. I guess it's about time we dive into that question. Should tracepoint
semantics be kept cast in stone forever? Not in my opinion, and here is why.

Most of the Linux kernel ABI exposed to userspace serves as runtime
support (system calls, virtual file systems, etc.). For all that, it makes
tons of sense to keep it stable, following the Documentation/ABI/README
guidelines. Even there, we have provisions for obsolescence and removal
of an ABI if need be, which provides userspace some time to adapt to
changes.

How are tracepoints different? Well, those are not meant to be used for
runtime support, but rather for analyzing systems, which means that
userspace tools do not need the tracepoint content to _run_, but rather
use it as an information source to perform analyses.

Even though I dislike analogies, I think we need one here. Let's consider
CAN bus ports for car debugging. Even though the transport is covered by
standards, they do not mandate the semantics of the data per se. I would
not expect a debugging device made in 2005 to work with the newest
generations of cars. However, I would expect new debug devices to be
compatible with older cars, and those debug devices to have means to
query which type of car they are debugging. Otherwise, the debugging
device is simply crap, because it cannot adapt to change. What should a
debug device created in 2005 do if connected to a new car? Ideally, it
should gracefully decline to interact with this car, and require a
software upgrade.

OK, now back to kernel tracepoints. My opinion is that it is a fundamental
requirement that trace analysis tools should be able to detect that they
are unable to understand tracepoint data they care about. It seems perfectly
fine to me to require that analysis tool upgrades are needed to interact
with a new kernel. However, a tool should be able to handle a range of
older kernel versions too.

This can be done by many means, including making sure preexisting event
names and field semantics are immutable, or by versioning tracepoints on
a per-event basis.

Here, in the case of sched_wakeup: we end up noticing that it accidentally
changed location in the kernel across versions, which makes it useless for
many analyses unless they use kernel version information to get the right
semantics associated with this event.

So here, for introducing sched_waking/sched_woken, we have a few ways
forward:

1) Keep sched_wakeup as it is, and add those two new events. Analyses
   can then continue using the old event for a while, and if they see
   that sched_waking/sched_woken are there, they can use those more
   precise events instead. This could allow us to do a gradual
   deprecation phase for the sched_wakeup tracepoint.

2) Remove sched_wakeup event, replacing it by sched_waking/sched_woken.
   Require immediate analysis tool upgrade to deal with this new
   information. Old tools should gracefully fail and ask users to
   upgrade. If they don't, fix them so they can handle change.
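
The detection logic option 1 asks of tools could look like the
following sketch (plain C; the function and its callers are invented
for illustration, not any real tool's API — only the event names come
from this thread):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hedged sketch: pick the most precise wakeup event a given kernel
 * exposes, preferring the newer waker-context event, falling back to
 * the legacy one, and declining gracefully on an unknown kernel. */
static const char *pick_wakeup_event(const char **events, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (strcmp(events[i], "sched_waking") == 0)
			return "sched_waking";	/* precise waker-context event */
	for (i = 0; i < n; i++)
		if (strcmp(events[i], "sched_wakeup") == 0)
			return "sched_wakeup";	/* legacy fallback */
	return NULL;	/* unknown kernel: ask the user to upgrade the tool */
}
```

A tool extended this way keeps working on older kernels, uses the more
precise events where available, and can refuse cleanly otherwise.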

Thoughts ?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 13:23       ` Mathieu Desnoyers
  2015-06-06 12:02         ` Peter Zijlstra
@ 2015-06-08  6:55         ` Peter Zijlstra
  2015-06-09  5:53           ` Mathieu Desnoyers
  1 sibling, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2015-06-08  6:55 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, rostedt, Francis Giraldeau

On Fri, 2015-06-05 at 13:23 +0000, Mathieu Desnoyers wrote:
> sched_wakeup: when try_to_wake_up{,_local} is called in the waker.
> sched_activate_task: when the wakee is marked runnable.
> sched_switch: when scheduling actually happens.
> 
> We can then calculate wakeup latency as
> 
>   time@sched_activate - time@sched_wakeup

One more thing, I think I would disagree with this. I would suggest
never to use the 'wakeup' (or 'waking' in my proposal) for timing. I
would suggest to use your interrupt tracepoint (or whatever else causes
wakeup to be called for this).

The wakeup times should be measured in tasktime -- of course, if
interrupts/preemption are disabled then tasktime == walltime.

The scheduling bit OTOH always needs to be measured in walltime, and is
most affected by the presence of other tasks on the system.

This too is why I'm not sure it makes sense to combine the two into a
single measurement. They should be measured in different time domains.
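
The two separate measurements can be sketched as follows (struct and
helper names are invented for illustration, not from any kernel or perf
API):

```c
#include <assert.h>

/* Hedged sketch of the two time domains: wakeup latency is accounted
 * in task time of the waker, scheduling latency in wall time. */
struct wake_sample {
	unsigned long long cause_task_ns;	/* waker task clock at the waking cause */
	unsigned long long runnable_task_ns;	/* waker task clock when wakee becomes runnable */
	unsigned long long runnable_wall_ns;	/* wall clock when wakee becomes runnable */
	unsigned long long running_wall_ns;	/* wall clock when wakee is switched in */
};

/* Task-time domain: unaffected by the waker itself being preempted. */
static unsigned long long wakeup_latency_ns(const struct wake_sample *s)
{
	return s->runnable_task_ns - s->cause_task_ns;
}

/* Wall-time domain: dominated by other runnable tasks on the system. */
static unsigned long long sched_latency_ns(const struct wake_sample *s)
{
	return s->running_wall_ns - s->runnable_wall_ns;
}
```

Keeping the two results separate, rather than summing them, preserves
the distinction between the two domains described above.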


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-05 12:46     ` Peter Zijlstra
@ 2015-06-08 16:54       ` Steven Rostedt
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Rostedt @ 2015-06-08 16:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Thomas Gleixner, Mathieu Desnoyers, linux-kernel, Ingo Molnar,
	Francis Giraldeau

On Fri, 05 Jun 2015 14:46:46 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > to the time between wakeup and sched switch.
> 
> My point exactly, wake->schedule is what we call the scheduling latency,
> not the wake latency, which would be from 'event' to the task being
> runnable.

Right, which means the tracepoint should be as close as possible to
"wakeup_process()" or whatever calls try_to_wake_up().

Which would be the "sched_waking" part of your proposal.

-- Steve



* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-07 10:20           ` Mathieu Desnoyers
@ 2015-06-08 17:27             ` Steven Rostedt
  2015-06-09  9:13               ` Peter Zijlstra
  0 siblings, 1 reply; 19+ messages in thread
From: Steven Rostedt @ 2015-06-08 17:27 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Francis Giraldeau, lttng-dev, Linus Torvalds


[ Keeping entire email because I added Linus ]

On Sun, 7 Jun 2015 10:20:14 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> ----- On Jun 6, 2015, at 2:02 PM, Peter Zijlstra peterz@infradead.org wrote:
> 
> > On Fri, 2015-06-05 at 13:23 +0000, Mathieu Desnoyers wrote:
> >> OK, so considering the definition naming feedback you provided, we
> >> may need a 3 tracepoint if we want to calculate both wakeup latency
> >> and scheduling latency (naming ofc open to discussion):
> >> 
> >> sched_wakeup: when try_to_wake_up{,_local} is called in the waker.
> >> sched_activate_task: when the wakee is marked runnable.
> >> sched_switch: when scheduling actually happens.
> > 
> > I would propose:
> > 
> >	sched_waking: upon calling try_to_wake_up() as soon as we know we need
> > to change state; guaranteed to be called from the context doing the
> > wakeup.
> > 
> >	sched_woken: the wakeup is complete (task is runnable, any delay
> > between this and actually getting on a cpu is down to the scheduler).
> > 
> >	sched_switch: when switching from task @prev to @next.
> 
> Agreed,
> 
> > 
> > This means abandoning trace_sched_wakeup(); which might be a problem,
> > which is why I bloody hate tracepoints :-(
> 
> OK. I guess it's about time we dive into that question. Should tracepoint
> semantic be kept cast in stone forever ? Not in my opinion, and here is why.
> 
> Most of the Linux kernel ABI exposed to userspace serves as support to
> runtime (system calls, virtual file systems, etc). For all that, it makes
> tons of sense to keep it stable, following the Documentation/ABI/README
> guidelines. Even there, we have provisions for obsolescence and removal
> of an ABI if need be, which provides userspace some time to adapt to
> changes.
> 
> How are tracepoints different ? Well, they are not meant to be used as
> runtime support, but rather for analyzing systems, which means that
> userspace tools using the tracepoint content do not need it to _run_,
> but rather as an information source for performing analyses.
> 
> Even though I dislike analogies, I think we need one here. Let's consider
> CAN bus ports for car debugging. Even though the transport is covered by
> standards, those standards do not mandate the semantics of the data per
> se. I would not expect a debugging device made in 2005 to work for the
> newest generations of cars. However, I would expect that new debug
> devices are compatible with older cars, and that those debug devices
> have means to query which type of car they are debugging. Otherwise, the
> debugging device is simply crap,
> because it cannot adapt to change. What should a debug device created in
> 2005 do if connected to a new car ? Ideally, it should gracefully decline
> to interact with this car, and require a software upgrade.
> 
> OK, now back to kernel tracepoints. My opinion is that it is a fundamental
> requirement that trace analysis tools should be able to detect that they
> are unable to understand tracepoint data they care about. It seems perfectly
> fine to me to require that analysis tool upgrades are needed to interact
> with a new kernel. However, a tool should be able to handle a range of
> older kernel versions too.
> 
> This can be done by many means, including making sure preexisting event
> names and field semantics are immutable, or by versioning of tracepoints on a
> per-event basis.
> 
> Here, in the case of sched_wakeup: we end up noticing that it accidentally
> changed location in the kernel across versions, which makes it useless for
> many analyses unless they use kernel version information to get the right
> semantics associated with this event.
> 
> So here, for introducing sched_waking/sched_woken, we have a few ways
> forward:
> 
> 1) Keep sched_wakeup as it is, and add those two new events. Analyses
>    can then continue using the old event for a while, and if they see
>    that sched_waking/sched_woken are there, they can use those more
>    precise events instead. This could allow us to do a gradual
>    deprecation phase for the sched_wakeup tracepoint.
> 
> 2) Remove sched_wakeup event, replacing it by sched_waking/sched_woken.
>    Require immediate analysis tool upgrade to deal with this new
>    information. Old tools should gracefully fail and ask users to
>    upgrade. If they don't, fix them so they can handle change.

Option 2 will break tools, and even if they fail "gracefully" that
probably still isn't acceptable, as the sched_wakeup tracepoint is a
popular one.


  3) Add the two tracepoints and remove the sched_wakeup() one, but
  then add a manual tracepoint for perf and ftrace that can simulate
  the sched_wakeup() from the other two tracepoints. This should keep
  tools working and we can have better wake up tracepoints implemented,
  without the overhead of the third "obsolete" tracepoint in the
  scheduling code.
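
Option 3's simulation could be sketched roughly as below (all names are
invented; a real implementation would live inside perf/ftrace and keep
per-CPU state rather than a single slot):

```c
#include <assert.h>

/* Hedged sketch: synthesize the legacy sched_wakeup record from the
 * two new events. The legacy semantics match 'woken' (task marked
 * runnable), so the simulated record is emitted there, while the
 * preceding 'waking' supplies the waker PID that the legacy event
 * could not provide on its own. */
struct sim_wakeup {
	int waker_pid;
	int wakee_pid;
	unsigned long long ts;
};

static int pending_waker = -1;	/* single-slot sketch; real code is per CPU */

static void on_sched_waking(int waker_pid)
{
	pending_waker = waker_pid;
}

/* Returns 1 and fills @out when a legacy-style record can be emitted. */
static int on_sched_woken(int wakee_pid, unsigned long long ts,
			  struct sim_wakeup *out)
{
	if (pending_waker < 0)
		return 0;	/* no matching 'waking' observed: drop */
	out->waker_pid = pending_waker;
	out->wakee_pid = wakee_pid;
	out->ts = ts;
	pending_waker = -1;
	return 1;
}
```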

-- Steve



* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-08  6:55         ` [RFC PATCH] sched: Fix sched_wakeup tracepoint Peter Zijlstra
@ 2015-06-09  5:53           ` Mathieu Desnoyers
  0 siblings, 0 replies; 19+ messages in thread
From: Mathieu Desnoyers @ 2015-06-09  5:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Thomas Gleixner, Ingo Molnar, rostedt, Francis Giraldeau

----- On Jun 8, 2015, at 8:55 AM, Peter Zijlstra peterz@infradead.org wrote:

> On Fri, 2015-06-05 at 13:23 +0000, Mathieu Desnoyers wrote:
>> sched_wakeup: when try_to_wake_up{,_local} is called in the waker.
>> sched_activate_task: when the wakee is marked runnable.
>> sched_switch: when scheduling actually happens.
>> 
>> We can then calculate wakeup latency as
>> 
>>   time@sched_activate - time@sched_wakeup
> 
> One more thing, I think I would disagree with this. I would suggest
> never to use the 'wakeup' (or 'waking' in my proposal) for timing. I
> would suggest to use your interrupt tracepoint (or whatever else causes
> wakeup to be called for this).

The nice thing about the 'waking' tracepoint is that there is only one
event to trace if we care about wakeup latency, and it only executes
when an actual wakeup is performed.

If we do care about more timing precision of the wakeup latency, we might
indeed want to trace both 'waking' and whatever calls it (interrupt, set
of system calls, etc.)

Instrumenting 'waking', however, seems like a good approximation of the
moment where the wakeup is requested by the waker. This is especially
true if we compute the critical path of a computation: there we already
account for process runtime within the waker before the 'woken' event.
So we don't really care about the extra precision that we would get by
tracing the exact syscall entry point.

In all cases we need the PID of the wakee targeted by 'waking', which
is unavailable unless we add the 'waking' event.
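
As a rough sketch of why this matters (the event layout below is
invented, not the actual trace format): because 'waking' fires in the
waker's context, the PID current on the emitting CPU identifies the
waker — exactly the correlation that was lost when sched_wakeup moved
into the remote IPI handler.

```c
#include <assert.h>

enum ev_type { EV_WAKING, EV_SWITCH_IN };

/* Illustrative flattened trace record; real tools decode perf/LTTng data. */
struct ev {
	enum ev_type type;
	int cur_pid;	/* PID running on the CPU that emitted the event */
	int target;	/* wakee PID, or the PID being switched in */
	unsigned long long ts;
};

#define MAX_PID 64
static int waker_of[MAX_PID];			/* wakee PID -> waker PID */
static unsigned long long waking_ts[MAX_PID];

/* On a switch-in, returns the recorded waker PID (or -1) and the delay
 * since the wakeup was requested. */
static int account_event(const struct ev *e, unsigned long long *delay)
{
	if (e->type == EV_WAKING) {
		waker_of[e->target] = e->cur_pid;	/* cur_pid IS the waker */
		waking_ts[e->target] = e->ts;
		return -1;
	}
	if (waker_of[e->target] == 0)
		return -1;	/* no wakeup observed for this PID */
	*delay = e->ts - waking_ts[e->target];
	return waker_of[e->target];
}
```

Note that, per Peter's point elsewhere in the thread, the delay computed
here spans two time domains and would ideally be split into wakeup
latency (task time) and scheduling latency (wall time).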

> 
> The wakeup times should be measured in tasktime -- of course, if
> interrupts/preemption are disabled then tasktime == walltime.
> 
> The scheduling bit OTOH always needs to be measured in walltime, and is
> most affected by the presence of other tasks on the system.
> 
> This too is why I'm not sure it makes sense to combine the two into a
> single measurement. They should be measured in different time domains.

That's a very good point! Agreed.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-08 17:27             ` Steven Rostedt
@ 2015-06-09  9:13               ` Peter Zijlstra
  2015-06-09 18:48                 ` Mathieu Desnoyers
                                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Peter Zijlstra @ 2015-06-09  9:13 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mathieu Desnoyers, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Francis Giraldeau, lttng-dev, Linus Torvalds


So how about we introduce the 'waking' tracepoint and leave the existing
wakeup one in place and preserve its woken semantics.

Steven, can we do aliases? Where one tracepoint is known to userspace
under multiple names? In that case we could rename the thing to woken
and have an alias wakeup which we can phase out over time.

The patch also takes away the success parameter to the tracepoint, but
does not quite go as far as actually removing it from the tracepoint
itself.

We can do that in a follow up patch which we can quickly revert if it
turns out people are actually still using that for something.

---
 include/trace/events/sched.h | 28 ++++++++++++++++++++--------
 kernel/sched/core.c          |  9 ++++++---
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index d57a575fe31f..b1608d5679e2 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -55,9 +55,9 @@ TRACE_EVENT(sched_kthread_stop_ret,
  */
 DECLARE_EVENT_CLASS(sched_wakeup_template,
 
-	TP_PROTO(struct task_struct *p, int success),
+	TP_PROTO(struct task_struct *p),
 
-	TP_ARGS(__perf_task(p), success),
+	TP_ARGS(__perf_task(p)),
 
@@ -71,25 +71,37 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
 		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
 		__entry->pid		= p->pid;
 		__entry->prio		= p->prio;
-		__entry->success	= success;
+		__entry->success	= 1; /* rudiment, kill when possible */
 		__entry->target_cpu	= task_cpu(p);
 	),
 
-	TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
+	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d",
 		  __entry->comm, __entry->pid, __entry->prio,
-		  __entry->success, __entry->target_cpu)
+		  __entry->target_cpu)
 );
 
+/*
+ * Tracepoint called when waking a task; this tracepoint is guaranteed to be
+ * called from the waking context.
+ */
+DEFINE_EVENT(sched_wakeup_template, sched_waking,
+	     TP_PROTO(struct task_struct *p),
+	     TP_ARGS(p));
+
+/*
+ * Tracepoint called when the task is actually woken; p->state == TASK_RUNNING.
+ * It is not always called from the waking context.
+ */
 DEFINE_EVENT(sched_wakeup_template, sched_wakeup,
-	     TP_PROTO(struct task_struct *p, int success),
-	     TP_ARGS(p, success));
+	     TP_PROTO(struct task_struct *p),
+	     TP_ARGS(p));
 
 /*
  * Tracepoint for waking up a new task:
  */
 DEFINE_EVENT(sched_wakeup_template, sched_wakeup_new,
-	     TP_PROTO(struct task_struct *p, int success),
-	     TP_ARGS(p, success));
+	     TP_PROTO(struct task_struct *p),
+	     TP_ARGS(p));
 
 #ifdef CREATE_TRACE_POINTS
 static inline long __trace_sched_switch_state(struct task_struct *p)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d5078c0f20e6..354e667620a9 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1487,9 +1487,9 @@ static void
 ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
 {
 	check_preempt_curr(rq, p, wake_flags);
-	trace_sched_wakeup(p, true);
-
 	p->state = TASK_RUNNING;
+	trace_sched_wakeup(p);
+
 #ifdef CONFIG_SMP
 	if (p->sched_class->task_woken)
 		p->sched_class->task_woken(rq, p);
@@ -1695,6 +1695,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 		goto out;
 
 	success = 1; /* we're going to change ->state */
+	trace_sched_waking(p);
 	cpu = task_cpu(p);
 
 	if (p->on_rq && ttwu_remote(p, wake_flags))
@@ -1761,6 +1762,8 @@ static void try_to_wake_up_local(struct task_struct *p)
 	if (!(p->state & TASK_NORMAL))
 		goto out;
 
+	trace_sched_waking(p);
+
 	if (!task_on_rq_queued(p))
 		ttwu_activate(rq, p, ENQUEUE_WAKEUP);
 
@@ -2119,7 +2122,7 @@ void wake_up_new_task(struct task_struct *p)
 	rq = __task_rq_lock(p);
 	activate_task(rq, p, 0);
 	p->on_rq = TASK_ON_RQ_QUEUED;
-	trace_sched_wakeup_new(p, true);
+	trace_sched_wakeup_new(p);
 	check_preempt_curr(rq, p, WF_FORK);
 #ifdef CONFIG_SMP
 	if (p->sched_class->task_woken)


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-09  9:13               ` Peter Zijlstra
@ 2015-06-09 18:48                 ` Mathieu Desnoyers
  2015-06-17 18:23                 ` Cong Wang
  2015-08-03 17:06                 ` [tip:sched/core] sched: Introduce the 'trace_sched_waking' tracepoint tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 19+ messages in thread
From: Mathieu Desnoyers @ 2015-06-09 18:48 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: rostedt, linux-kernel, Thomas Gleixner, Ingo Molnar,
	Francis Giraldeau, lttng-dev, Linus Torvalds

----- On Jun 9, 2015, at 11:13 AM, Peter Zijlstra peterz@infradead.org wrote:

> So how about we introduce the 'waking' tracepoint and leave the existing
> wakeup one in place and preserve its woken semantics.

That would work for me, but leaves me wondering how you would move
to the new 'woken' name.

> 
> Steven, can we do aliases? Where one tracepoint is known to userspace
> under multiple names? In that case we could rename the thing to woken
> and have an alias wakeup which we can phase out over time.

In LTTng, I have a LTTNG_TRACEPOINT_EVENT_MAP() macro, which allows me
to map a kernel event to my own event naming scheme. It allows me
to correct namespacing of some odd kernel tracepoints from within the
tracer. A similar implementation could be used to implement a tracepoint
alias within the kernel: one parameter is the name as exposed to userspace,
and the other is the name of the tracepoint it hooks into.

You could then rename the tracepoint name from 'wakeup' to 'woken', and
create an alias exposing the old 'wakeup' event.

> 
> The patch also takes away the success parameter to the tracepoint, but
> does not quite go as far as actually removing it from the tracepoint
> itself.

Sounds like a good intermediate step. If you ensure that the alias
declaration allows you to show a different set of fields for the alias,
you could even remove the success parameter from 'woken', and only show
it for the 'wakeup' alias.
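
As a purely hypothetical sketch of such an alias declaration (no
DEFINE_EVENT_ALIAS macro exists in the kernel; this only illustrates the
shape the idea could take on top of the existing DEFINE_EVENT machinery,
and exposing a different field set for the alias would need more than
this):

```c
/* Hypothetical: expose the renamed 'woken' event under the legacy
 * 'wakeup' name during a deprecation phase. Both names hook the same
 * probe arguments; only the userspace-visible name differs. */
#define DEFINE_EVENT_ALIAS(template, name, alias, proto, args)		\
	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args));	\
	DEFINE_EVENT(template, alias, PARAMS(proto), PARAMS(args))

DEFINE_EVENT_ALIAS(sched_wakeup_template, sched_woken, sched_wakeup,
		   TP_PROTO(struct task_struct *p),
		   TP_ARGS(p));
```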

> 
> We can do that in a follow up patch which we can quickly revert if it
> turns out people are actually still using that for something.

Or just pull the plug on the old 'wakeup' event after a deprecation
phase (to be defined).

Thanks,

Mathieu

> 
> ---
> include/trace/events/sched.h | 28 ++++++++++++++++++++--------
> kernel/sched/core.c          |  9 ++++++---
> 2 files changed, 26 insertions(+), 11 deletions(-)
> 
> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index d57a575fe31f..b1608d5679e2 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -55,9 +55,9 @@ TRACE_EVENT(sched_kthread_stop_ret,
>  */
> DECLARE_EVENT_CLASS(sched_wakeup_template,
> 
> -	TP_PROTO(struct task_struct *p, int success),
> +	TP_PROTO(struct task_struct *p),
> 
> -	TP_ARGS(__perf_task(p), success),
> +	TP_ARGS(__perf_task(p)),
> 
> @@ -71,25 +71,37 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
> 		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
> 		__entry->pid		= p->pid;
> 		__entry->prio		= p->prio;
> -		__entry->success	= success;
> +		__entry->success	= 1; /* rudiment, kill when possible */
> 		__entry->target_cpu	= task_cpu(p);
> 	),
> 
> -	TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
> +	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d",
> 		  __entry->comm, __entry->pid, __entry->prio,
> -		  __entry->success, __entry->target_cpu)
> +		  __entry->target_cpu)
> );
> 
> +/*
> + * Tracepoint called when waking a task; this tracepoint is guaranteed to be
> + * called from the waking context.
> + */
> +DEFINE_EVENT(sched_wakeup_template, sched_waking,
> +	     TP_PROTO(struct task_struct *p),
> +	     TP_ARGS(p));
> +
> +/*
> + * Tracepoint called when the task is actually woken; p->state == TASK_RUNNING.
> + * It is not always called from the waking context.
> + */
> DEFINE_EVENT(sched_wakeup_template, sched_wakeup,
> -	     TP_PROTO(struct task_struct *p, int success),
> -	     TP_ARGS(p, success));
> +	     TP_PROTO(struct task_struct *p),
> +	     TP_ARGS(p));
> 
> /*
>  * Tracepoint for waking up a new task:
>  */
> DEFINE_EVENT(sched_wakeup_template, sched_wakeup_new,
> -	     TP_PROTO(struct task_struct *p, int success),
> -	     TP_ARGS(p, success));
> +	     TP_PROTO(struct task_struct *p),
> +	     TP_ARGS(p));
> 
> #ifdef CREATE_TRACE_POINTS
> static inline long __trace_sched_switch_state(struct task_struct *p)
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d5078c0f20e6..354e667620a9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1487,9 +1487,9 @@ static void
> ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
> {
> 	check_preempt_curr(rq, p, wake_flags);
> -	trace_sched_wakeup(p, true);
> -
> 	p->state = TASK_RUNNING;
> +	trace_sched_wakeup(p);
> +
> #ifdef CONFIG_SMP
> 	if (p->sched_class->task_woken)
> 		p->sched_class->task_woken(rq, p);
> @@ -1695,6 +1695,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state,
> int wake_flags)
> 		goto out;
> 
> 	success = 1; /* we're going to change ->state */
> +	trace_sched_waking(p);
> 	cpu = task_cpu(p);
> 
> 	if (p->on_rq && ttwu_remote(p, wake_flags))
> @@ -1761,6 +1762,8 @@ static void try_to_wake_up_local(struct task_struct *p)
> 	if (!(p->state & TASK_NORMAL))
> 		goto out;
> 
> +	trace_sched_waking(p);
> +
> 	if (!task_on_rq_queued(p))
> 		ttwu_activate(rq, p, ENQUEUE_WAKEUP);
> 
> @@ -2119,7 +2122,7 @@ void wake_up_new_task(struct task_struct *p)
> 	rq = __task_rq_lock(p);
> 	activate_task(rq, p, 0);
> 	p->on_rq = TASK_ON_RQ_QUEUED;
> -	trace_sched_wakeup_new(p, true);
> +	trace_sched_wakeup_new(p);
> 	check_preempt_curr(rq, p, WF_FORK);
> #ifdef CONFIG_SMP
>  	if (p->sched_class->task_woken)

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-09  9:13               ` Peter Zijlstra
  2015-06-09 18:48                 ` Mathieu Desnoyers
@ 2015-06-17 18:23                 ` Cong Wang
  2015-06-17 18:47                   ` Steven Rostedt
  2015-08-03 17:06                 ` [tip:sched/core] sched: Introduce the 'trace_sched_waking' tracepoint tip-bot for Peter Zijlstra
  2 siblings, 1 reply; 19+ messages in thread
From: Cong Wang @ 2015-06-17 18:23 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, Mathieu Desnoyers, LKML, Thomas Gleixner,
	Ingo Molnar, Francis Giraldeau, lttng-dev, Linus Torvalds

On Tue, Jun 9, 2015 at 2:13 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> So how about we introduce the 'waking' tracepoint and leave the existing
> wakeup one in place and preserve its woken semantics.
>
> Steven, can we do aliases? Where one tracepoint is known to userspace
> under multiple names? In that case we could rename the thing to woken
> and have an alias wakeup which we can phase out over time.
>
> The patch also takes away the success parameter to the tracepoint, but
> does not quite go as far as actually removing it from the tracepoint
> itself.
>
> We can do that in a follow up patch which we can quickly revert if it
> turns out people are actually still using that for something.

+1 to this patch. How is it going?

Here at Twitter, we are analyzing scheduling latencies too, with our
own tool using existing tracepoints; it would be nice to have more
granularity on the scheduling latency.

And, you probably want to change perf sched to respect this
new 'waking' event too. ;)

Thanks.


* Re: [RFC PATCH] sched: Fix sched_wakeup tracepoint
  2015-06-17 18:23                 ` Cong Wang
@ 2015-06-17 18:47                   ` Steven Rostedt
  0 siblings, 0 replies; 19+ messages in thread
From: Steven Rostedt @ 2015-06-17 18:47 UTC (permalink / raw)
  To: Cong Wang
  Cc: Peter Zijlstra, Mathieu Desnoyers, LKML, Thomas Gleixner,
	Ingo Molnar, Francis Giraldeau, lttng-dev, Linus Torvalds

On Wed, 17 Jun 2015 11:23:00 -0700
Cong Wang <xiyou.wangcong@gmail.com> wrote:

> On Tue, Jun 9, 2015 at 2:13 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > So how about we introduce the 'waking' tracepoint and leave the existing
> > wakeup one in place and preserve its woken semantics.
> >
> > Steven, can we do aliases? Where one tracepoint is known to userspace
> > under multiple names? In that case we could rename the thing to woken
> > and have an alias wakeup which we can phase out over time.
> >
> > The patch also takes away the success parameter to the tracepoint, but
> > does not quite go as far as actually removing it from the tracepoint
> > itself.
> >
> > We can do that in a follow up patch which we can quickly revert if it
> > turns out people are actually still using that for something.
> 
> +1 to this patch. How is it going?

It's not a top priority. But it shouldn't be too hard to implement.
This could be something I do after the 4.2 merge window closes.

-- Steve


> 
> Here at Twitter, we are analyzing scheduling latencies too, with our
> own tool using existing tracepoints, it would be nice to have more
> granularity on the scheduling latency.
> 
> And, you probably want to change perf sched to respect this
> new 'waking' event too. ;)
> 
> Thanks.



* [tip:sched/core] sched: Introduce the 'trace_sched_waking' tracepoint
  2015-06-09  9:13               ` Peter Zijlstra
  2015-06-09 18:48                 ` Mathieu Desnoyers
  2015-06-17 18:23                 ` Cong Wang
@ 2015-08-03 17:06                 ` tip-bot for Peter Zijlstra
  2 siblings, 0 replies; 19+ messages in thread
From: tip-bot for Peter Zijlstra @ 2015-08-03 17:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: peterz, mingo, torvalds, tglx, mathieu.desnoyers,
	francis.giraldeau, linux-kernel, rostedt, efault, hpa

Commit-ID:  fbd705a0c6184580d0e2fbcbd47a37b6e5822511
Gitweb:     http://git.kernel.org/tip/fbd705a0c6184580d0e2fbcbd47a37b6e5822511
Author:     Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 9 Jun 2015 11:13:36 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Mon, 3 Aug 2015 12:21:22 +0200

sched: Introduce the 'trace_sched_waking' tracepoint

Mathieu reported that since 317f394160e9 ("sched: Move the second half
of ttwu() to the remote cpu") trace_sched_wakeup() can happen out of
context of the waker.

This is a problem when you want to analyse wakeup paths because it is
now very hard to correlate the wakeup event to whoever issued the
wakeup.

OTOH trace_sched_wakeup() is issued at the point where we set
p->state = TASK_RUNNING, which is right where we hand the task off to
the scheduler, so this is an important point when looking at
scheduling behaviour: up to here it has been the wakeup path, everything
hereafter is due to scheduler policy.

To bridge this gap, introduce a second tracepoint: trace_sched_waking.
It is guaranteed to be called in the waker context.

Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Francis Giraldeau <francis.giraldeau@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20150609091336.GQ3644@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/trace/events/sched.h      | 30 +++++++++++++++++++++---------
 kernel/sched/core.c               | 10 +++++++---
 kernel/trace/trace_sched_switch.c |  2 +-
 kernel/trace/trace_sched_wakeup.c |  2 +-
 4 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index d57a575..539d6bc 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -55,9 +55,9 @@ TRACE_EVENT(sched_kthread_stop_ret,
  */
 DECLARE_EVENT_CLASS(sched_wakeup_template,
 
-	TP_PROTO(struct task_struct *p, int success),
+	TP_PROTO(struct task_struct *p),
 
-	TP_ARGS(__perf_task(p), success),
+	TP_ARGS(__perf_task(p)),
 
 	TP_STRUCT__entry(
 		__array(	char,	comm,	TASK_COMM_LEN	)
@@ -71,25 +71,37 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
 		memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
 		__entry->pid		= p->pid;
 		__entry->prio		= p->prio;
-		__entry->success	= success;
+		__entry->success	= 1; /* rudiment, kill when possible */
 		__entry->target_cpu	= task_cpu(p);
 	),
 
-	TP_printk("comm=%s pid=%d prio=%d success=%d target_cpu=%03d",
+	TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d",
 		  __entry->comm, __entry->pid, __entry->prio,
-		  __entry->success, __entry->target_cpu)
+		  __entry->target_cpu)
 );
 
+/*
+ * Tracepoint called when waking a task; this tracepoint is guaranteed to be
+ * called from the waking context.
+ */
+DEFINE_EVENT(sched_wakeup_template, sched_waking,
+	     TP_PROTO(struct task_struct *p),
+	     TP_ARGS(p));
+
+/*
+ * Tracepoint called when the task is actually woken; p->state == TASK_RUNNING.
+ * It is not always called from the waking context.
+ */
 DEFINE_EVENT(sched_wakeup_template, sched_wakeup,
-	     TP_PROTO(struct task_struct *p, int success),
-	     TP_ARGS(p, success));
+	     TP_PROTO(struct task_struct *p),
+	     TP_ARGS(p));
 
 /*
  * Tracepoint for waking up a new task:
  */
 DEFINE_EVENT(sched_wakeup_template, sched_wakeup_new,
-	     TP_PROTO(struct task_struct *p, int success),
-	     TP_ARGS(p, success));
+	     TP_PROTO(struct task_struct *p),
+	     TP_ARGS(p));
 
 #ifdef CREATE_TRACE_POINTS
 static inline long __trace_sched_switch_state(struct task_struct *p)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 48be7dc..fa5826c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1654,9 +1654,9 @@ static void
 ttwu_do_wakeup(struct rq *rq, struct task_struct *p, int wake_flags)
 {
 	check_preempt_curr(rq, p, wake_flags);
-	trace_sched_wakeup(p, true);
-
 	p->state = TASK_RUNNING;
+	trace_sched_wakeup(p);
+
 #ifdef CONFIG_SMP
 	if (p->sched_class->task_woken) {
 		/*
@@ -1874,6 +1874,8 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	if (!(p->state & state))
 		goto out;
 
+	trace_sched_waking(p);
+
 	success = 1; /* we're going to change ->state */
 	cpu = task_cpu(p);
 
@@ -1949,6 +1951,8 @@ static void try_to_wake_up_local(struct task_struct *p)
 	if (!(p->state & TASK_NORMAL))
 		goto out;
 
+	trace_sched_waking(p);
+
 	if (!task_on_rq_queued(p))
 		ttwu_activate(rq, p, ENQUEUE_WAKEUP);
 
@@ -2307,7 +2311,7 @@ void wake_up_new_task(struct task_struct *p)
 	rq = __task_rq_lock(p);
 	activate_task(rq, p, 0);
 	p->on_rq = TASK_ON_RQ_QUEUED;
-	trace_sched_wakeup_new(p, true);
+	trace_sched_wakeup_new(p);
 	check_preempt_curr(rq, p, WF_FORK);
 #ifdef CONFIG_SMP
 	if (p->sched_class->task_woken)
diff --git a/kernel/trace/trace_sched_switch.c b/kernel/trace/trace_sched_switch.c
index 419ca37..f270088 100644
--- a/kernel/trace/trace_sched_switch.c
+++ b/kernel/trace/trace_sched_switch.c
@@ -26,7 +26,7 @@ probe_sched_switch(void *ignore, struct task_struct *prev, struct task_struct *n
 }
 
 static void
-probe_sched_wakeup(void *ignore, struct task_struct *wakee, int success)
+probe_sched_wakeup(void *ignore, struct task_struct *wakee)
 {
 	if (unlikely(!sched_ref))
 		return;
diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index 9b33dd1..12cbe77 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -514,7 +514,7 @@ static void wakeup_reset(struct trace_array *tr)
 }
 
 static void
-probe_wakeup(void *ignore, struct task_struct *p, int success)
+probe_wakeup(void *ignore, struct task_struct *p)
 {
 	struct trace_array_cpu *data;
 	int cpu = smp_processor_id();


end of thread, other threads:[~2015-08-03 17:07 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-05 11:41 [RFC PATCH] sched: Fix sched_wakeup tracepoint Mathieu Desnoyers
2015-06-05 12:09 ` Peter Zijlstra
2015-06-05 12:32   ` Mathieu Desnoyers
2015-06-05 12:51     ` Peter Zijlstra
2015-06-05 13:23       ` Mathieu Desnoyers
2015-06-06 12:02         ` Peter Zijlstra
2015-06-07 10:20           ` Mathieu Desnoyers
2015-06-08 17:27             ` Steven Rostedt
2015-06-09  9:13               ` Peter Zijlstra
2015-06-09 18:48                 ` Mathieu Desnoyers
2015-06-17 18:23                 ` Cong Wang
2015-06-17 18:47                   ` Steven Rostedt
2015-08-03 17:06                 ` [tip:sched/core] sched: Introduce the 'trace_sched_waking' tracepoint tip-bot for Peter Zijlstra
2015-06-08  6:55         ` [RFC PATCH] sched: Fix sched_wakeup tracepoint Peter Zijlstra
2015-06-09  5:53           ` Mathieu Desnoyers
2015-06-05 12:32   ` Thomas Gleixner
2015-06-05 12:36     ` Mathieu Desnoyers
2015-06-05 12:46     ` Peter Zijlstra
2015-06-08 16:54       ` Steven Rostedt
