linux-kernel.vger.kernel.org archive mirror
* RT Load balance changes in sched-devel
       [not found]                 ` <4756D709.1020901@redhat.com>
@ 2007-12-09 17:16                   ` Dmitry Adamushko
  2007-12-09 18:32                     ` Gregory Haskins
  0 siblings, 1 reply; 5+ messages in thread
From: Dmitry Adamushko @ 2007-12-09 17:16 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Gregory Haskins, Ingo Molnar, Balbir Singh, Peter Zijlstra,
	vatsa, LKML Kernel

[ cc'ed lkml ]

I guess one possible load-balancing point has been left out of
consideration -- sched_setscheduler() (and also rt_mutex_setprio()).

(1)  NORMAL --> RT, when p->se.on_rq == 1 && ! task_running(rq, p)

(2) RT --> NORMAL, when task_running(rq, p) == 1

e.g. for (2) we may even get a completely idle rq (schedule() -->
schedule_balance_rt() will not help due to schedule_balance_rt()
having an rt_task(prev) check in place... and 'prev' is of NORMAL type
when it's scheduled out).
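
For reference, the check in question currently looks roughly like this
(sketched from the current sched_rt.c):

static void schedule_balance_rt(struct rq *rq, struct task_struct *prev)
{
	/* Try to pull RT tasks here if we lower this rq's prio */
	if (unlikely(rt_task(prev)) && rq->rt.highest_prio > prev->prio)
		pull_rt_task(rq);
}

After an RT --> NORMAL sched_setscheduler() on the running task, 'prev'
no longer passes rt_task(), so no pull is attempted even if this rq is
left with nothing to run.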


btw., both cases would be addressed by placing load-balance points
into sched_class_rt->{enqueue,dequeue}_task_rt()... push_rt_tasks()
and pull_rt_tasks() respectively. As a side effect (I think,
technically, it would be possible), 3 out of 4 *_balance_rt() calls
(the exception: schedule_tail_balance_rt()) in schedule() would become
unnecessary.

_BUT_

the enqueue/dequeue() interface would become less straightforward,
logically speaking.
Something like:

rq = activate_task(rq, ...) ; /* may unlock rq and lock/return another one */

would complicate the existing use cases.
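
E.g. a typical caller today does, schematically:

	activate_task(rq, p, ...);
	check_preempt_curr(rq, p);

and would have to become something like:

	/* may drop rq->lock and take/return another rq */
	rq = activate_task(rq, p, ...);
	check_preempt_curr(rq, p);	/* must use the possibly-new 'rq' */

i.e. every caller has to be audited so that everything after the call
operates on the returned 'rq'.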


-- 
Best regards,
Dmitry Adamushko


* Re: RT Load balance changes in sched-devel
  2007-12-09 17:16                   ` RT Load balance changes in sched-devel Dmitry Adamushko
@ 2007-12-09 18:32                     ` Gregory Haskins
  2007-12-10  0:57                       ` Steven Rostedt
  2007-12-10  2:53                       ` [PATCH RFC] sched: Fixed missed rt-balance points on priority shifts Gregory Haskins
  0 siblings, 2 replies; 5+ messages in thread
From: Gregory Haskins @ 2007-12-09 18:32 UTC (permalink / raw)
  To: Dmitry Adamushko, Steven Rostedt
  Cc: Peter Zijlstra, Ingo Molnar, Balbir Singh, vatsa, LKML Kernel

Hi Dmitry,

>>> On Sun, Dec 9, 2007 at 12:16 PM, in message
<b647ffbd0712090916l4eb9a944r4726680a5fdcae46@mail.gmail.com>, "Dmitry
Adamushko" <dmitry.adamushko@gmail.com> wrote: 
> [ cc'ed lkml ]
> 
> I guess one possible load-balancing point has been left out of
> consideration -- sched_setscheduler() (and also rt_mutex_setprio()).
> 
> (1)  NORMAL --> RT, when p->se.on_rq == 1 && ! task_running(rq, p)
> 
> (2) RT --> NORMAL, when task_running(rq, p) == 1
> 
> e.g. for (2) we may even get a completely idle rq (schedule() -->
> schedule_balance_rt() will not help due to schedule_balance_rt()
> having an rt_task(prev) check in place... and 'prev' is of NORMAL type
> when it's scheduled out).

Indeed.  I think you are correct on both counts.  This is an oversight, so good eyes!

> 
> 
> btw., both cases would be addressed by placing load-balance points
> into sched_class_rt->{enqueue,dequeue}_task_rt()... push_rt_tasks()
> and pull_rt_tasks() respectively. As a side effect (I think,
> technically, it would be possible), 3 out of 4 *_balance_rt() calls
> (the exception: schedule_tail_balance_rt()) in schedule() would become
> unnecessary.
> 
> _BUT_
> 
> the enqueue/dequeue() interface would become less straightforward,
> logically speaking.
> Something like:
> 
> rq = activate_task(rq, ...) ; /* may unlock rq and lock/return another one 
> */
> 
> would complicate the existing use cases.
> 

I think I would prefer to just fix the setscheduler/setprio cases for the class transition rather than change the behavior of these enqueue/dequeue calls.  But I will keep an open mind as I look into this issue.

Thanks for the review!
-Greg





* Re: RT Load balance changes in sched-devel
  2007-12-09 18:32                     ` Gregory Haskins
@ 2007-12-10  0:57                       ` Steven Rostedt
  2007-12-10  2:53                       ` [PATCH RFC] sched: Fixed missed rt-balance points on priority shifts Gregory Haskins
  1 sibling, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2007-12-10  0:57 UTC (permalink / raw)
  To: Gregory Haskins
  Cc: Dmitry Adamushko, Peter Zijlstra, Ingo Molnar, Balbir Singh,
	vatsa, LKML Kernel

Gregory Haskins wrote:
>> btw., both cases would be addressed by placing load-balance points
>> into sched_class_rt->{enqueue,dequeue}_task_rt()... push_rt_tasks()
>> and pull_rt_tasks() respectively. As a side effect (I think,
>> technically, it would be possible), 3 out of 4 *_balance_rt() calls
>> (the exception: schedule_tail_balance_rt()) in schedule() would become
>> unnecessary.
>>
>> _BUT_
>>
>> the enqueue/dequeue() interface would become less straightforward,
>> logically speaking.
>> Something like:

Also, push_rt and pull_rt use activate/deactivate as well, so this would
make that code a bit more complex.
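
Roughly, push_rt_task() migrates a task by doing something like this
(a sketch from memory of the sched-devel code; details may differ):

	deactivate_task(rq, next_task, 0);
	set_task_cpu(next_task, lowest_rq->cpu);
	activate_task(lowest_rq, next_task, 0);
	resched_task(lowest_rq->curr);

so if activate/deactivate themselves triggered push/pull, this path
would have to be careful not to recurse back into the balancer.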

>>
>> rq = activate_task(rq, ...) ; /* may unlock rq and lock/return another one 
>> */
>>
>> would complicate the existing use cases.
>>
> 
> I think I would prefer to just fix the setscheduler/setprio cases for the class transition rather than change the behavior of these enqueue/dequeue calls.  But I will keep an open mind as I look into this issue.

I agree with Gregory on this. I prefer to fix the two you found. I 
thought about them before, but somehow they were missed :-/

Anyway, I'll be working on adding some more patches on Monday. There may 
be other ways to clean this up.

> 
> Thanks for the review!

Yeah, thanks from me too!

-- Steve



* [PATCH RFC] sched: Fixed missed rt-balance points on priority shifts
  2007-12-09 18:32                     ` Gregory Haskins
  2007-12-10  0:57                       ` Steven Rostedt
@ 2007-12-10  2:53                       ` Gregory Haskins
  2007-12-10  3:18                         ` Gregory Haskins
  1 sibling, 1 reply; 5+ messages in thread
From: Gregory Haskins @ 2007-12-10  2:53 UTC (permalink / raw)
  To: dmitry.adamushko, mingo, rostedt; +Cc: linux-kernel, ghaskins

Hi Ingo, Steven, Dmitry,
   Here is a proposed fix for the issue that Dmitry brought up today.  It
   should apply cleanly to sched-devel (though I have a few of my other
   submitted fixes queued ahead of this that are not yet in sched-devel... so
   if you have a problem, let me know and I will rebase/resubmit).

Regards,
-Greg

--------------------------
sched: Fixed missed rt-balance points on priority shifts

Dmitry Adamushko identified several holes in the rt-migration strategy relating
to changing priority via sched_setscheduler or rt_mutex_setprio:

http://lkml.org/lkml/2007/12/9/94

This patch should button up those conditions.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: Dmitry Adamushko <dmitry.adamushko@gmail.com>
---

 kernel/sched.c    |    8 ++++++++
 kernel/sched_rt.c |   46 +++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 53 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 02f04bc..fd08ac2 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -348,6 +348,7 @@ struct rt_rq {
 	/* highest queued rt task prio */
 	int highest_prio;
 	int overloaded;
+	int needs_pull;
 };
 
 #ifdef CONFIG_SMP
@@ -4037,6 +4038,9 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 			check_preempt_curr(rq, p);
 		}
 	}
+
+	wakeup_balance_rt(rq, p);
+
 	task_rq_unlock(rq, &flags);
 }
 
@@ -4341,6 +4345,9 @@ recheck:
 			check_preempt_curr(rq, p);
 		}
 	}
+
+	wakeup_balance_rt(rq, p);
+
 	__task_rq_unlock(rq);
 	spin_unlock_irqrestore(&p->pi_lock, flags);
 
@@ -6887,6 +6894,7 @@ void __init sched_init(void)
 		INIT_LIST_HEAD(&rq->migration_queue);
 		rq->rt.highest_prio = MAX_RT_PRIO;
 		rq->rt.overloaded = 0;
+		rq->rt.needs_pull = 0;
 #endif
 		atomic_set(&rq->nr_iowait, 0);
 
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 65cbb78..1257575 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -84,6 +84,8 @@ static inline void inc_rt_tasks(struct task_struct *p, struct rq *rq)
 
 static inline void dec_rt_tasks(struct task_struct *p, struct rq *rq)
 {
+	int highest_prio = rq->rt.highest_prio;
+
 	WARN_ON(!rt_task(p));
 	WARN_ON(!rq->rt.rt_nr_running);
 	rq->rt.rt_nr_running--;
@@ -103,6 +105,42 @@ static inline void dec_rt_tasks(struct task_struct *p, struct rq *rq)
 	if (p->nr_cpus_allowed > 1)
 		rq->rt.rt_nr_migratory--;
 
+	if (rq->rt.highest_prio > highest_prio) {
+		/*
+		 * If the departing task is reducing our priority, we need to
+		 * check if we should pull tasks because it's always possible
+		 * that another RQ tried to push tasks away but skipped us due
+		 * to elevated priority.  That elevated priority is now
+		 * subsiding so there may be tasks that are newly eligible for
+		 * migration.  This pull operation is currently facilitated
+		 * via schedule().
+		 */
+		rq->rt.needs_pull = 1;
+
+		/*
+		 * FIXME: I am not sure about this next part:
+		 *
+		 * If the departing task is already running, we don't need to be
+		 * specific about rescheduling because presumably it will
+		 * happen momentarily anyway.  However, if the departing task
+		 * was *not* the current task (#), we should invoke a
+		 * reschedule to make sure we have the optimal task running.
+		 *
+		 * (#) It may seem like a pathological condition to have the
+		 * highest priority task not also be the current task.  However,
+		 * consider the condition where this highest task was enqueued
+		 * and subsequently dequeued before the RQ ever had a chance to
+		 * reschedule.
+		 *
+		 * I have no doubt that this is the proper thing to do to make
+		 * sure RT tasks are properly balanced.  What I cannot wrap my
+		 * head around at this late hour is if issuing a reschedule()
+		 * here may cause issues in other circumstances.  TBD
+		 */
+		if (!task_running(rq, p))
+			resched_task(rq->curr);
+	}
+
 	update_rt_migration(rq);
 #endif /* CONFIG_SMP */
 }
@@ -662,8 +700,14 @@ static int pull_rt_task(struct rq *this_rq)
 static void schedule_balance_rt(struct rq *rq, struct task_struct *prev)
 {
 	/* Try to pull RT tasks here if we lower this rq's prio */
-	if (unlikely(rt_task(prev)) && rq->rt.highest_prio > prev->prio)
+	if (unlikely(rq->rt.needs_pull)) {
+		/*
+		 * Clear the flag first, since pulling may release the lock
+		 * and someone else may re-set needs_pull.
+		 */
+		rq->rt.needs_pull = 0;
 		pull_rt_task(rq);
+	}
 }
 
 static void schedule_tail_balance_rt(struct rq *rq)



* Re: [PATCH RFC] sched: Fixed missed rt-balance points on priority shifts
  2007-12-10  2:53                       ` [PATCH RFC] sched: Fixed missed rt-balance points on priority shifts Gregory Haskins
@ 2007-12-10  3:18                         ` Gregory Haskins
  0 siblings, 0 replies; 5+ messages in thread
From: Gregory Haskins @ 2007-12-10  3:18 UTC (permalink / raw)
  To: mingo, dmitry.adamushko, rostedt, Gregory Haskins; +Cc: linux-kernel

>>> On Sun, Dec 9, 2007 at  9:53 PM, in message
<20071210024709.4760.68134.stgit@novell1.haskins.net>, Gregory Haskins
<ghaskins@novell.com> wrote: 

> +		 * I have no doubt that this is the proper thing to do to make
> +		 * sure RT tasks are properly balanced.  What I cannot wrap my
> +		 * head around at this late hour is if issuing a reschedule()
> +		 * here may cause issues in other circumstances.  TBD
> +		 */
> +		if (!task_running(rq, p))
> +			resched_task(rq->curr);
> +	}

It dawned on me after I sent this that a further optimization here is to predicate the reschedule on whether we are overloaded.  In other words:

if (!task_running(rq, p) && rt_overloaded(rq))
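
i.e. the tail of the dec_rt_tasks() hunk above would then read roughly
like this (just a sketch; rt_overloaded() as used elsewhere in sched-devel):

	if (rq->rt.highest_prio > highest_prio) {
		rq->rt.needs_pull = 1;

		if (!task_running(rq, p) && rt_overloaded(rq))
			resched_task(rq->curr);
	}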

Regards,
-Greg




Thread overview: 5+ messages
     [not found] <20071130145939.GN5681@linux.vnet.ibm.com>
     [not found] ` <b647ffbd0712011156p252ae29dnfa75494e4d2c845c@mail.gmail.com>
     [not found]   ` <20071203182223.GA4133@linux.vnet.ibm.com>
     [not found]     ` <47556B23.2060909@redhat.com>
     [not found]       ` <20071204153542.GC3388@linux.vnet.ibm.com>
     [not found]         ` <b647ffbd0712040828l51a26a82jc0f38d3f4aa2291e@mail.gmail.com>
     [not found]           ` <20071205134036.GA21933@linux.vnet.ibm.com>
     [not found]             ` <4756B8E9.3080709@redhat.com>
     [not found]               ` <20071205164800.GA24767@linux.vnet.ibm.com>
     [not found]                 ` <4756D709.1020901@redhat.com>
2007-12-09 17:16                   ` RT Load balance changes in sched-devel Dmitry Adamushko
2007-12-09 18:32                     ` Gregory Haskins
2007-12-10  0:57                       ` Steven Rostedt
2007-12-10  2:53                       ` [PATCH RFC] sched: Fixed missed rt-balance points on priority shifts Gregory Haskins
2007-12-10  3:18                         ` Gregory Haskins
