linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched: change pull_rt_task() to decrease time waiting on runqueue
@ 2011-05-18 12:57 Hillf Danton
  2011-05-18 13:27 ` Steven Rostedt
  0 siblings, 1 reply; 5+ messages in thread
From: Hillf Danton @ 2011-05-18 12:57 UTC (permalink / raw)
  To: LKML
  Cc: Mike Galbraith, Ingo Molnar, Peter Zijlstra, Yong Zhang, Steven Rostedt

pull_rt_task() is changed to push RT tasks instead, so that pushable
tasks on other runqueues get a chance to reach all CPUs whose runqueues
are lower in priority. A pull does not cover this, since a pull considers
only one runqueue as the destination for tasks on other runqueues. Thus
the time pushable tasks spend waiting on a runqueue can be decreased.

Thanks to all who commented while this work was being prepared.

Signed-off-by: Hillf Danton <dhillf@gmail.com>
---

--- a/kernel/sched_rt.c	2011-04-27 11:48:50.000000000 +0800
+++ b/kernel/sched_rt.c	2011-05-18 20:29:26.000000000 +0800
@@ -1423,77 +1423,13 @@ static void push_rt_tasks(struct rq *rq)
 static int pull_rt_task(struct rq *this_rq)
 {
 	int this_cpu = this_rq->cpu, ret = 0, cpu;
-	struct task_struct *p;
-	struct rq *src_rq;

 	if (likely(!rt_overloaded(this_rq)))
 		return 0;

 	for_each_cpu(cpu, this_rq->rd->rto_mask) {
-		if (this_cpu == cpu)
-			continue;
-
-		src_rq = cpu_rq(cpu);
-
-		/*
-		 * Don't bother taking the src_rq->lock if the next highest
-		 * task is known to be lower-priority than our current task.
-		 * This may look racy, but if this value is about to go
-		 * logically higher, the src_rq will push this task away.
-		 * And if its going logically lower, we do not care
-		 */
-		if (src_rq->rt.highest_prio.next >=
-		    this_rq->rt.highest_prio.curr)
-			continue;
-
-		/*
-		 * We can potentially drop this_rq's lock in
-		 * double_lock_balance, and another CPU could
-		 * alter this_rq
-		 */
-		double_lock_balance(this_rq, src_rq);
-
-		/*
-		 * Are there still pullable RT tasks?
-		 */
-		if (src_rq->rt.rt_nr_running <= 1)
-			goto skip;
-
-		p = pick_next_highest_task_rt(src_rq, this_cpu);
-
-		/*
-		 * Do we have an RT task that preempts
-		 * the to-be-scheduled task?
-		 */
-		if (p && (p->prio < this_rq->rt.highest_prio.curr)) {
-			WARN_ON(p == src_rq->curr);
-			WARN_ON(!p->se.on_rq);
-
-			/*
-			 * There's a chance that p is higher in priority
-			 * than what's currently running on its cpu.
-			 * This is just that p is wakeing up and hasn't
-			 * had a chance to schedule. We only pull
-			 * p if it is lower in priority than the
-			 * current task on the run queue
-			 */
-			if (p->prio < src_rq->curr->prio)
-				goto skip;
-
-			ret = 1;
-
-			deactivate_task(src_rq, p, 0);
-			set_task_cpu(p, this_cpu);
-			activate_task(this_rq, p, 0);
-			/*
-			 * We continue with the search, just in
-			 * case there's an even higher prio task
-			 * in another runqueue. (low likelihood
-			 * but possible)
-			 */
-		}
-skip:
-		double_unlock_balance(this_rq, src_rq);
+		if (this_cpu != cpu)
+			ret += push_rt_task(cpu_rq(cpu));
 	}

 	return ret;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] sched: change pull_rt_task() to decrease time waiting on runqueue
  2011-05-18 12:57 [PATCH] sched: change pull_rt_task() to decrease time waiting on runqueue Hillf Danton
@ 2011-05-18 13:27 ` Steven Rostedt
  2011-05-18 14:54   ` Hillf Danton
  0 siblings, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2011-05-18 13:27 UTC (permalink / raw)
  To: Hillf Danton
  Cc: LKML, Mike Galbraith, Ingo Molnar, Peter Zijlstra, Yong Zhang

On Wed, 2011-05-18 at 20:57 +0800, Hillf Danton wrote:
> pull_rt_task() is changed to push RT tasks instead, so that pushable
> tasks on other runqueues get a chance to reach all CPUs whose runqueues
> are lower in priority. A pull does not cover this, since a pull considers
> only one runqueue as the destination for tasks on other runqueues. Thus
> the time pushable tasks spend waiting on a runqueue can be decreased.

Do you have numbers and test cases for this? Or at least traces that
show how this helps?

Basically, you are saying that we want to iterate over all CPUs and have
them go through the algorithm of searching for rqs that they can push
to. But we already know that our run queue has dropped priority.

I'm not fully understanding the benefit of this patch.

-- Steve





* Re: [PATCH] sched: change pull_rt_task() to decrease time waiting on runqueue
  2011-05-18 13:27 ` Steven Rostedt
@ 2011-05-18 14:54   ` Hillf Danton
  2011-05-18 15:20     ` Steven Rostedt
  0 siblings, 1 reply; 5+ messages in thread
From: Hillf Danton @ 2011-05-18 14:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Mike Galbraith, Ingo Molnar, Peter Zijlstra, Yong Zhang

On Wed, May 18, 2011 at 9:27 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 2011-05-18 at 20:57 +0800, Hillf Danton wrote:
>> pull_rt_task() is changed to push RT tasks instead, so that pushable
>> tasks on other runqueues get a chance to reach all CPUs whose runqueues
>> are lower in priority. A pull does not cover this, since a pull considers
>> only one runqueue as the destination for tasks on other runqueues. Thus
>> the time pushable tasks spend waiting on a runqueue can be decreased.
>
> Do you have numbers and test cases for this? Or at least traces that
> show how this helps?
>
> Basically, you are saying that we want to iterate over all CPUs and have
> them go through the algorithm of searching for rqs that they can push
> to. But we already know that our run queue has dropped priority.
>
> I'm not fully understanding the benefit of this patch.
>

Hi Steve

In short, if there are pushable tasks and there are RQs in lower
priority, NOT LIMITED TO our RQ, tasks should be pushed to as many
RQs as we can.

Hillf


* Re: [PATCH] sched: change pull_rt_task() to decrease time waiting on runqueue
  2011-05-18 14:54   ` Hillf Danton
@ 2011-05-18 15:20     ` Steven Rostedt
  2011-05-19 13:55       ` Hillf Danton
  0 siblings, 1 reply; 5+ messages in thread
From: Steven Rostedt @ 2011-05-18 15:20 UTC (permalink / raw)
  To: Hillf Danton
  Cc: LKML, Mike Galbraith, Ingo Molnar, Peter Zijlstra, Yong Zhang

On Wed, 2011-05-18 at 22:54 +0800, Hillf Danton wrote:

> In short, if there are pushable tasks and there are RQs in lower
> priority, NOT LIMITED TO our RQ, tasks should be pushed to as many
> RQs as we can.

I understand what you are trying to do. But this change modifies a lot
of assumptions. Please supply test cases that show how this helps.

Have a look at:

http://lwn.net/Articles/425583/

Where I did a bit of work just to make sure my change to sched_rt.c was
appropriate. Just coming up with scenarios may not be good enough.
Seeing it in practice is worth much more.

For example, you may be making the fast path slower. This may do what
you expect, with a hit in performance. I'm not sure I like that idea.

-- Steve




* Re: [PATCH] sched: change pull_rt_task() to decrease time waiting on runqueue
  2011-05-18 15:20     ` Steven Rostedt
@ 2011-05-19 13:55       ` Hillf Danton
  0 siblings, 0 replies; 5+ messages in thread
From: Hillf Danton @ 2011-05-19 13:55 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: LKML, Mike Galbraith, Ingo Molnar, Peter Zijlstra, Yong Zhang

On Wed, May 18, 2011 at 11:20 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Wed, 2011-05-18 at 22:54 +0800, Hillf Danton wrote:
>
>> In short, if there are pushable tasks and there are RQs in lower
>> priority, NOT LIMITED TO our RQ, tasks should be pushed to as many
>> RQs as we can.
>
> I understand what you are trying to do. But this change modifies a lot
> of assumptions. Please supply test cases that show how this helps.
>
> Have a look at:
>
> http://lwn.net/Articles/425583/
>
> Where I did a bit of work just to make sure my change to sched_rt.c was
> appropriate. Just coming up with scenarios may not be good enough.
> Seeing it in practice is worth much more.
>
> For example, you may be making the fast path slower. This may do what
> you expect, with a hit in performance. I'm not sure I like that idea.
>
Hi Steve

The patch is updated with a few changes.

A clone of find_lowest_rq(), find_lowest_rq_for_pushing(), is added to
select the best RQ for a given pushable task, with the constraint that
the selected RQ must be different from the task's own RQ.

This is not the final version, since no test cases have exercised it
yet; it is posted as a target for comments.

Thank you very much for sharing your work.

Hillf
---

--- a/kernel/sched_rt.c	2011-04-27 11:48:50.000000000 +0800
+++ b/kernel/sched_rt.c	2011-05-19 21:26:42.000000000 +0800
@@ -1260,6 +1260,28 @@ static int find_lowest_rq(struct task_st
 	return -1;
 }

+static int find_lowest_rq_for_pushing(struct task_struct *task)
+{
+	struct sched_domain *sd;
+	struct cpumask *lowest_mask = __get_cpu_var(local_cpu_mask);
+	int cpu = task_cpu(task);
+
+	if (task->rt.nr_cpus_allowed == 1 ||
+	    !cpupri_find(&task_rq(task)->rd->cpupri, task, lowest_mask))
+		return -1;
+
+	for_each_domain(cpu, sd) {
+		int best_cpu;
+		best_cpu = cpumask_first_and(lowest_mask,
+					     sched_domain_span(sd));
+
+		if (best_cpu < nr_cpu_ids && best_cpu != cpu)
+			return best_cpu;
+	}
+
+	return -1;
+}
+
 /* Will lock the rq it finds */
 static struct rq *find_lock_lowest_rq(struct task_struct *task, struct rq *rq)
 {
@@ -1268,7 +1290,7 @@ static struct rq *find_lock_lowest_rq(st
 	int cpu;

 	for (tries = 0; tries < RT_MAX_TRIES; tries++) {
-		cpu = find_lowest_rq(task);
+		cpu = find_lowest_rq_for_pushing(task);

 		if ((cpu == -1) || (cpu == rq->cpu))
 			break;
@@ -1336,6 +1358,7 @@ static int push_rt_task(struct rq *rq)
 {
 	struct task_struct *next_task;
 	struct rq *lowest_rq;
+	int ret = 0;

 	if (!rq->rt.overloaded)
 		return 0;
@@ -1383,7 +1406,6 @@ retry:
 			 * since the other cpus will pull from us when they
 			 * are ready.
 			 */
-			dequeue_pushable_task(rq, next_task);
 			goto out;
 		}

@@ -1402,7 +1424,7 @@ retry:
 	deactivate_task(rq, next_task, 0);
 	set_task_cpu(next_task, lowest_rq->cpu);
 	activate_task(lowest_rq, next_task, 0);
-
+	ret = 1;
 	resched_task(lowest_rq->curr);

 	double_unlock_balance(rq, lowest_rq);
@@ -1410,7 +1432,7 @@ retry:
 out:
 	put_task_struct(next_task);

-	return 1;
+	return ret;
 }

 static void push_rt_tasks(struct rq *rq)
@@ -1423,77 +1445,20 @@ static void push_rt_tasks(struct rq *rq)
 static int pull_rt_task(struct rq *this_rq)
 {
 	int this_cpu = this_rq->cpu, ret = 0, cpu;
-	struct task_struct *p;
-	struct rq *src_rq;
+	int count = 0;

 	if (likely(!rt_overloaded(this_rq)))
 		return 0;

+again:
 	for_each_cpu(cpu, this_rq->rd->rto_mask) {
-		if (this_cpu == cpu)
-			continue;
-
-		src_rq = cpu_rq(cpu);
-
-		/*
-		 * Don't bother taking the src_rq->lock if the next highest
-		 * task is known to be lower-priority than our current task.
-		 * This may look racy, but if this value is about to go
-		 * logically higher, the src_rq will push this task away.
-		 * And if its going logically lower, we do not care
-		 */
-		if (src_rq->rt.highest_prio.next >=
-		    this_rq->rt.highest_prio.curr)
-			continue;
-
-		/*
-		 * We can potentially drop this_rq's lock in
-		 * double_lock_balance, and another CPU could
-		 * alter this_rq
-		 */
-		double_lock_balance(this_rq, src_rq);
-
-		/*
-		 * Are there still pullable RT tasks?
-		 */
-		if (src_rq->rt.rt_nr_running <= 1)
-			goto skip;
-
-		p = pick_next_highest_task_rt(src_rq, this_cpu);
-
-		/*
-		 * Do we have an RT task that preempts
-		 * the to-be-scheduled task?
-		 */
-		if (p && (p->prio < this_rq->rt.highest_prio.curr)) {
-			WARN_ON(p == src_rq->curr);
-			WARN_ON(!p->se.on_rq);
-
-			/*
-			 * There's a chance that p is higher in priority
-			 * than what's currently running on its cpu.
-			 * This is just that p is wakeing up and hasn't
-			 * had a chance to schedule. We only pull
-			 * p if it is lower in priority than the
-			 * current task on the run queue
-			 */
-			if (p->prio < src_rq->curr->prio)
-				goto skip;
-
-			ret = 1;
+		if (cpu != this_cpu)
+			count += push_rt_task(cpu_rq(cpu));
+	}

-			deactivate_task(src_rq, p, 0);
-			set_task_cpu(p, this_cpu);
-			activate_task(this_rq, p, 0);
-			/*
-			 * We continue with the search, just in
-			 * case there's an even higher prio task
-			 * in another runqueue. (low likelihood
-			 * but possible)
-			 */
-		}
-skip:
-		double_unlock_balance(this_rq, src_rq);
+	if (ret != count) {
+		ret = count;
+		goto again;
 	}

 	return ret;

