* [PATCH 0/8] sched: fixes for the nr_running usage
@ 2013-08-18  8:25 Lei Wen
  2013-08-18  8:25 ` [PATCH 1/8] sched: change load balance number to h_nr_running of run queue Lei Wen
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Since nr_running and h_nr_running differ in what they represent, we
should take care which of the two is used in each place in the scheduler.
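
As a quick illustration of the difference (a simplified sketch, not
taken from the kernel source): with one task group holding two runnable
tasks, plus one task enqueued directly on the root cfs_rq, the counters
look roughly like this:

    root cfs_rq:   nr_running   = 2  (the group entity + the direct task)
                   h_nr_running = 3  (every runnable cfs task below it)
    group cfs_rq:  nr_running   = 2
                   h_nr_running = 2

rq->nr_running, by contrast, also counts tasks of other classes (rt,
stop), which is what the following patches are careful about.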

Lei Wen (8):
  sched: change load balance number to h_nr_running of run queue
  sched: change cpu_avg_load_per_task using h_nr_running
  sched: change update_rq_runnable_avg using h_nr_running
  sched: change pick_next_task_fair to h_nr_running
  sched: change update_sg_lb_stats to h_nr_running
  sched: change find_busiest_queue to h_nr_running
  sched: change active_load_balance_cpu_stop to use h_nr_running
  sched: document the difference between nr_running and h_nr_running

 kernel/sched/fair.c  |   23 +++++++++++++----------
 kernel/sched/sched.h |    6 ++++++
 2 files changed, 19 insertions(+), 10 deletions(-)

-- 
1.7.5.4



* [PATCH 1/8] sched: change load balance number to h_nr_running of run queue
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  2013-08-18  8:25 ` [PATCH 2/8] sched: change cpu_avg_load_per_task using h_nr_running Lei Wen
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Since rq->nr_running also includes migration and rt tasks, it is not
reasonable for load_balance to try to move nr_running tasks, as load
balancing only applies to the cfs class.

Change it to the cfs rq's h_nr_running, which correctly reflects the
number of tasks on the cfs runqueue.
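
For example (hypothetical numbers): a busiest rq running 1 cfs task and
4 rt tasks has nr_running = 5, so the old code enters the balance loop
and sizes loop_max as if five tasks were movable, although only the
single cfs task is actually eligible for migration.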

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f918635..d6153c8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5096,17 +5096,19 @@ redo:
 	schedstat_add(sd, lb_imbalance[idle], env.imbalance);
 
 	ld_moved = 0;
-	if (busiest->nr_running > 1) {
+	/* load balance only applies to CFS tasks, so use h_nr_running here */
+	if (busiest->cfs.h_nr_running > 1) {
 		/*
 		 * Attempt to move tasks. If find_busiest_group has found
-		 * an imbalance but busiest->nr_running <= 1, the group is
+		 * an imbalance but busiest->cfs.h_nr_running <= 1, the group is
 		 * still unbalanced. ld_moved simply stays zero, so it is
 		 * correctly treated as an imbalance.
 		 */
 		env.flags |= LBF_ALL_PINNED;
 		env.src_cpu   = busiest->cpu;
 		env.src_rq    = busiest;
-		env.loop_max  = min(sysctl_sched_nr_migrate, busiest->nr_running);
+		env.loop_max  = min(sysctl_sched_nr_migrate,
+				    busiest->cfs.h_nr_running);
 
 		update_h_load(env.src_cpu);
 more_balance:
-- 
1.7.5.4



* [PATCH 2/8] sched: change cpu_avg_load_per_task using h_nr_running
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
  2013-08-18  8:25 ` [PATCH 1/8] sched: change load balance number to h_nr_running of run queue Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  2013-08-18  8:25 ` [PATCH 3/8] sched: change update_rq_runnable_avg " Lei Wen
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Since cpu_avg_load_per_task is used only by the cfs scheduler, it
should report the average load of the cfs tasks on the run queue.
Thus change it to divide by h_nr_running, which matches that meaning.
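
For example (hypothetical numbers): with 2 cfs tasks and 3 rt tasks on
a cpu and cfs.runnable_load_avg = 2048, dividing by rq->nr_running (5)
reports an average of about 410 per task, while dividing by
cfs.h_nr_running (2) reports the intended 1024.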

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d6153c8..e6b99b4 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3008,7 +3008,7 @@ static unsigned long power_of(int cpu)
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
-	unsigned long nr_running = ACCESS_ONCE(rq->nr_running);
+	unsigned long nr_running = ACCESS_ONCE(rq->cfs.h_nr_running);
 	unsigned long load_avg = rq->cfs.runnable_load_avg;
 
 	if (nr_running)
-- 
1.7.5.4



* [PATCH 3/8] sched: change update_rq_runnable_avg using h_nr_running
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
  2013-08-18  8:25 ` [PATCH 1/8] sched: change load balance number to h_nr_running of run queue Lei Wen
  2013-08-18  8:25 ` [PATCH 2/8] sched: change cpu_avg_load_per_task using h_nr_running Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  2013-08-18  8:25 ` [PATCH 4/8] sched: change pick_next_task_fair to h_nr_running Lei Wen
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Since update_rq_runnable_avg is used only by the cfs scheduler, it
should not account for tasks outside the cfs class.

If one cfs task runs alongside one rt task, the cfs task should not be
aware of the rt task's existence; it should behave as if it were
occasionally throttled by some bandwidth control mechanism. Thus its
sleep time should not be taken into the runnable avg load calculation.

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e6b99b4..9869d4d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2893,7 +2893,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	}
 
 	if (!se) {
-		update_rq_runnable_avg(rq, rq->nr_running);
+		update_rq_runnable_avg(rq, rq->cfs.h_nr_running);
 		inc_nr_running(rq);
 	}
 	hrtick_update(rq);
@@ -4142,7 +4142,7 @@ static void __update_blocked_averages_cpu(struct task_group *tg, int cpu)
 			list_del_leaf_cfs_rq(cfs_rq);
 	} else {
 		struct rq *rq = rq_of(cfs_rq);
-		update_rq_runnable_avg(rq, rq->nr_running);
+		update_rq_runnable_avg(rq, rq->cfs.h_nr_running);
 	}
 }
 
-- 
1.7.5.4



* [PATCH 4/8] sched: change pick_next_task_fair to h_nr_running
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
                   ` (2 preceding siblings ...)
  2013-08-18  8:25 ` [PATCH 3/8] sched: change update_rq_runnable_avg " Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  2013-08-18  8:25 ` [PATCH 5/8] sched: change update_sg_lb_stats " Lei Wen
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Since pick_next_task_fair only wants to ensure there is some task in
the run queue to be picked, it should use h_nr_running instead of
nr_running, since nr_running does not reflect all tasks when task
groups exist.

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9869d4d..33576eb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3653,7 +3653,7 @@ static struct task_struct *pick_next_task_fair(struct rq *rq)
 	struct cfs_rq *cfs_rq = &rq->cfs;
 	struct sched_entity *se;
 
-	if (!cfs_rq->nr_running)
+	if (!cfs_rq->h_nr_running)
 		return NULL;
 
 	do {
-- 
1.7.5.4



* [PATCH 5/8] sched: change update_sg_lb_stats to h_nr_running
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
                   ` (3 preceding siblings ...)
  2013-08-18  8:25 ` [PATCH 4/8] sched: change pick_next_task_fair to h_nr_running Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  2013-08-18  8:25 ` [PATCH 6/8] sched: change find_busiest_queue " Lei Wen
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Since update_sg_lb_stats is used to calculate the sched_group load
statistics for cfs tasks, it should use h_nr_running instead of the
rq's nr_running.

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 33576eb..e026001 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4488,7 +4488,7 @@ static inline void update_sg_lb_stats(struct lb_env *env,
 	for_each_cpu_and(i, sched_group_cpus(group), env->cpus) {
 		struct rq *rq = cpu_rq(i);
 
-		nr_running = rq->nr_running;
+		nr_running = rq->cfs.h_nr_running;
 
 		/* Bias balancing toward cpus of our domain */
 		if (local_group) {
-- 
1.7.5.4



* [PATCH 6/8] sched: change find_busiest_queue to h_nr_running
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
                   ` (4 preceding siblings ...)
  2013-08-18  8:25 ` [PATCH 5/8] sched: change update_sg_lb_stats " Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  2013-08-18  8:25 ` [PATCH 7/8] sched: change active_load_balance_cpu_stop to use h_nr_running Lei Wen
  2013-08-18  8:25 ` [PATCH 8/8] sched: document the difference between nr_running and h_nr_running Lei Wen
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Since find_busiest_queue tries to avoid load balancing a runqueue that
has only one cfs task whose load is already above the calculated
imbalance value, we should use the cfs rq's h_nr_running instead of
the rq's nr_running.
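
For example (hypothetical scenario): a cpu running one heavy cfs task
plus one rt task has rq->nr_running = 2, so the old "nr_running == 1"
test does not skip it even though only a single cfs task is there and
its load already exceeds the imbalance; checking cfs.h_nr_running == 1
skips it as intended.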

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e026001..3656603 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4990,7 +4990,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		 * When comparing with imbalance, use weighted_cpuload()
 		 * which is not scaled with the cpu power.
 		 */
-		if (capacity && rq->nr_running == 1 && wl > env->imbalance)
+		if (capacity && rq->cfs.h_nr_running == 1
+			     && wl > env->imbalance)
 			continue;
 
 		/*
-- 
1.7.5.4



* [PATCH 7/8] sched: change active_load_balance_cpu_stop to use h_nr_running
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
                   ` (5 preceding siblings ...)
  2013-08-18  8:25 ` [PATCH 6/8] sched: change find_busiest_queue " Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  2013-08-18  8:25 ` [PATCH 8/8] sched: document the difference between nr_running and h_nr_running Lei Wen
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

We should only skip active load balancing when there is no cfs task at
all. If we just use rq->nr_running, the source cpu may have multiple rt
tasks but zero cfs tasks, which would confuse the active load balance
function: it tries to move a task, but finds none it could move.
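
For example (hypothetical scenario): a busiest_rq holding 3 rt tasks
and 0 cfs tasks has nr_running = 3, so the old "nr_running <= 1" check
lets the active balance proceed even though there is no cfs task to
move; checking cfs.h_nr_running == 0 catches exactly this case.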

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3656603..4c96124 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5349,7 +5349,7 @@ static int active_load_balance_cpu_stop(void *data)
 		goto out_unlock;
 
 	/* Is there any task to move? */
-	if (busiest_rq->nr_running <= 1)
+	if (busiest_rq->cfs.h_nr_running == 0)
 		goto out_unlock;
 
 	/*
-- 
1.7.5.4



* [PATCH 8/8] sched: document the difference between nr_running and h_nr_running
  2013-08-18  8:25 [PATCH 0/8] sched: fixes for the nr_running usage Lei Wen
                   ` (6 preceding siblings ...)
  2013-08-18  8:25 ` [PATCH 7/8] sched: change active_load_balance_cpu_stop to use h_nr_running Lei Wen
@ 2013-08-18  8:25 ` Lei Wen
  7 siblings, 0 replies; 9+ messages in thread
From: Lei Wen @ 2013-08-18  8:25 UTC (permalink / raw)
  To: Paul Turner, Peter Zijlstra, Ingo Molnar, mingo, leiwen, linux-kernel

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/sched.h |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ef0a7b2..b8f0924 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -248,6 +248,12 @@ struct cfs_bandwidth { };
 /* CFS-related fields in a runqueue */
 struct cfs_rq {
 	struct load_weight load;
+	/*
+	 * The difference between nr_running and h_nr_running is:
+	 * nr_running:   how many entities (tasks and group entities)
+	 *               directly share the cpu power of this cfs_rq
+	 * h_nr_running: how many tasks are runnable in this cfs_rq hierarchy
+	 */
 	unsigned int nr_running, h_nr_running;
 
 	u64 exec_clock;
-- 
1.7.5.4


