* [PATCH v2] sched: Skip double execution of pick_next_task_fair
@ 2014-04-24 17:14 Tim Chen
From: Tim Chen @ 2014-04-24 17:14 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra; +Cc: linux-kernel, Andi Kleen, Len Brown

The current code calls pick_next_task_fair a second time in the slow
path if the first call did not pull any task.  This is unnecessary, as
we already know that no task can be pulled, and it doubles the delay
before the cpu can enter idle.

We instrumented some network workloads and saw that pick_next_task_fair
was called frequently before a cpu entered idle.  The call to
pick_next_task_fair can add non-trivial latency, as it calls load_balance,
which runs find_busiest_group on a hierarchy of sched domains spanning
the cpus.  On a large 4-socket system, we saw almost 0.25 msec spent
per call of pick_next_task_fair before a cpu could be idled.

This patch skips pick_next_task_fair in the slow path if it has already
been invoked.
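
The control-flow change can be illustrated with a small userspace model.
This is only a sketch and not kernel code: struct task, pick_fair,
pick_idle, pick_slow, pick_old, pick_new and count_fair_calls are
hypothetical stand-ins, and the model assumes an empty runqueue so that
the fair picker always returns NULL.  It counts how often the fair
picker runs before the idle task is chosen.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct. */
struct task { const char *name; };

static struct task idle_task = { "idle" };
static struct task retry_marker;		/* models RETRY_TASK */
#define RETRY_TASK (&retry_marker)

static int fair_calls;	/* number of calls into the fair picker */

/* Models the fair picker on an empty runqueue: nothing can be pulled. */
static struct task *pick_fair(void)
{
	fair_calls++;
	return NULL;
}

/* Models the idle class, which always has a task to offer. */
static struct task *pick_idle(void)
{
	return &idle_task;
}

/* Models the slow path: try each class in order (fair, then idle). */
static struct task *pick_slow(void)
{
	struct task *p = pick_fair();

	if (p && p != RETRY_TASK)
		return p;
	return pick_idle();
}

/* Old flow: a NULL result from the fast path falls into the slow path. */
static struct task *pick_old(void)
{
	struct task *p = pick_fair();

	if (p && p != RETRY_TASK)
		return p;
	return pick_slow();
}

/* New flow: a NULL result goes straight to the idle class. */
static struct task *pick_new(void)
{
	struct task *p = pick_fair();

	if (p == RETRY_TASK)
		return pick_slow();
	if (!p)
		p = pick_idle();
	return p;
}

/* Runs one pick and reports how many times the fair picker was called. */
static int count_fair_calls(struct task *(*pick)(void))
{
	fair_calls = 0;
	pick();
	return fair_calls;
}
```

In this model, count_fair_calls(pick_old) reports two calls into the
fair picker and count_fair_calls(pick_new) reports one, while both flows
end up returning the idle task.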

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
 kernel/sched/core.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 1d1b87b..547ccff 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2591,8 +2591,14 @@ pick_next_task(struct rq *rq, struct task_struct *prev)
 	if (likely(prev->sched_class == class &&
 		   rq->nr_running == rq->cfs.h_nr_running)) {
 		p = fair_sched_class.pick_next_task(rq, prev);
-		if (likely(p && p != RETRY_TASK))
-			return p;
+		if (unlikely(p == RETRY_TASK))
+			goto again;
+
+		/* assumes fair_sched_class->next == idle_sched_class */
+		if (unlikely(!p))
+			p = idle_sched_class.pick_next_task(rq, prev);
+
+		return p;
 	}
 
 again:
-- 
1.7.11.7


