From: Gregory Haskins
Subject: [PATCH RFC] sched: Fix missed rt-balance points on priority shifts
To: dmitry.adamushko@gmail.com, mingo@elte.hu, rostedt@goodmis.org
Cc: linux-kernel@vger.kernel.org, ghaskins@novell.com
Date: Sun, 09 Dec 2007 21:53:23 -0500
Message-ID: <20071210024709.4760.68134.stgit@novell1.haskins.net>
In-Reply-To: <475BEE6B.BA47.005A.0@novell.com>
References: <475BEE6B.BA47.005A.0@novell.com>
User-Agent: StGIT/0.12.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Hi Ingo, Steven, Dmitry,

Here is a proposed fix for the issue that Dmitry brought up today.  It
should apply cleanly to sched-devel (though I have a few of my other
submitted fixes queued ahead of this that are not yet in sched-devel...
so if you hit a problem, let me know and I will rebase/resubmit).

Regards,
-Greg

--------------------------

sched: Fix missed rt-balance points on priority shifts

Dmitry Adamushko identified several holes in the rt-migration strategy
relating to changing a task's priority via sched_setscheduler() or
rt_mutex_setprio():

	http://lkml.org/lkml/2007/12/9/94

This patch should button up those conditions.

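To make the pull-side hole concrete, here is one possible interleaving
(the priorities are hypothetical, picked only for illustration; recall
that a numerically lower prio is a higher RT priority):

	CPU0 (rq0)                      CPU1 (rq1, rt-overloaded)
	----------                      -------------------------
	curr is RT at prio 10
	(PI-boosted)
	                                curr is RT at prio 5; a second
	                                RT task at prio 20 is queued.
	                                push_rt_tasks() skips rq0,
	                                since rq0->rt.highest_prio (10)
	                                outranks 20.
	rt_mutex_setprio() deboosts
	curr back down to prio 40.
	                                The prio-20 task is now eligible
	                                to run on CPU0, but nothing ever
	                                pulls it over.

With this patch, the dequeue during the deboost runs through
dec_rt_tasks(), which notices that rq0's highest_prio has dropped, sets
needs_pull, and the next pass through schedule() on CPU0 performs the
pull.  The new wakeup_balance_rt() call sites cover the complementary
push direction at the same two priority-change points.
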
Signed-off-by: Gregory Haskins
CC: Dmitry Adamushko
---

 kernel/sched.c    |    8 ++++++++
 kernel/sched_rt.c |   46 +++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 53 insertions(+), 1 deletions(-)

diff --git a/kernel/sched.c b/kernel/sched.c
index 02f04bc..fd08ac2 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -348,6 +348,7 @@ struct rt_rq {
 	/* highest queued rt task prio */
 	int highest_prio;
 	int overloaded;
+	int needs_pull;
 };
 
 #ifdef CONFIG_SMP
@@ -4037,6 +4038,9 @@ void rt_mutex_setprio(struct task_struct *p, int prio)
 			check_preempt_curr(rq, p);
 		}
 	}
+
+	wakeup_balance_rt(rq, p);
+
 	task_rq_unlock(rq, &flags);
 }
 
@@ -4341,6 +4345,9 @@ recheck:
 			check_preempt_curr(rq, p);
 		}
 	}
+
+	wakeup_balance_rt(rq, p);
+
 	__task_rq_unlock(rq);
 	spin_unlock_irqrestore(&p->pi_lock, flags);
 
@@ -6887,6 +6894,7 @@ void __init sched_init(void)
 		INIT_LIST_HEAD(&rq->migration_queue);
 		rq->rt.highest_prio = MAX_RT_PRIO;
 		rq->rt.overloaded = 0;
+		rq->rt.needs_pull = 0;
 #endif
 		atomic_set(&rq->nr_iowait, 0);
 
diff --git a/kernel/sched_rt.c b/kernel/sched_rt.c
index 65cbb78..1257575 100644
--- a/kernel/sched_rt.c
+++ b/kernel/sched_rt.c
@@ -84,6 +84,8 @@ static inline void inc_rt_tasks(struct task_struct *p, struct rq *rq)
 
 static inline void dec_rt_tasks(struct task_struct *p, struct rq *rq)
 {
+	int highest_prio = rq->rt.highest_prio;
+
 	WARN_ON(!rt_task(p));
 	WARN_ON(!rq->rt.rt_nr_running);
 	rq->rt.rt_nr_running--;
@@ -103,6 +105,42 @@ static inline void dec_rt_tasks(struct task_struct *p, struct rq *rq)
 	if (p->nr_cpus_allowed > 1)
 		rq->rt.rt_nr_migratory--;
 
+	if (rq->rt.highest_prio > highest_prio) {
+		/*
+		 * If the departing task is reducing our priority, we need to
+		 * check if we should pull tasks, because it's always possible
+		 * that another RQ tried to push tasks away but skipped us due
+		 * to elevated priority.  That elevated priority is now
+		 * subsiding, so there may be tasks that are newly eligible
+		 * for migration.  This pull operation is currently
+		 * facilitated via schedule().
+		 */
+		rq->rt.needs_pull = 1;
+
+		/*
+		 * FIXME: I am not sure about this next part:
+		 *
+		 * If the departing task is already running, we don't need
+		 * to be specific about rescheduling because presumably it
+		 * will happen momentarily anyway.  However, if the departing
+		 * task was *not* the current task (#), we should invoke a
+		 * reschedule to make sure we have the optimal task running.
+		 *
+		 * (#) It may seem like a pathological condition to have the
+		 * highest-priority task not also be the current task.
+		 * However, consider the case where this highest task was
+		 * enqueued and subsequently dequeued before the RQ ever
+		 * had a chance to reschedule.
+		 *
+		 * I have no doubt that this is the proper thing to do to
+		 * make sure RT tasks are properly balanced.  What I cannot
+		 * wrap my head around at this late hour is whether issuing
+		 * a reschedule here may cause issues in other cases.  TBD
+		 */
+		if (!task_running(rq, p))
+			resched_task(rq->curr);
+	}
+
 	update_rt_migration(rq);
 #endif /* CONFIG_SMP */
 }
@@ -662,8 +700,14 @@ static int pull_rt_task(struct rq *this_rq)
 static void schedule_balance_rt(struct rq *rq, struct task_struct *prev)
 {
 	/* Try to pull RT tasks here if we lower this rq's prio */
-	if (unlikely(rt_task(prev)) && rq->rt.highest_prio > prev->prio)
+	if (unlikely(rq->rt.needs_pull)) {
+		/*
+		 * Clear the flag first, since pulling may release the lock
+		 * and someone else may re-set needs_pull.
+		 */
+		rq->rt.needs_pull = 0;
 		pull_rt_task(rq);
+	}
 }
 
 static void schedule_tail_balance_rt(struct rq *rq)
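
P.S. For anyone reading this without sched-devel checked out: the
wakeup_balance_rt() helper that the two new call sites reference is not
part of this diff; it came in with the earlier RT push/pull series.
From memory it is roughly the following shape (a sketch only; the exact
predicate in the tree may differ):

	static void wakeup_balance_rt(struct rq *rq, struct task_struct *p)
	{
		/*
		 * p is changing state on this rq but will not run here
		 * right now; priorities may have shifted such that queued
		 * RT tasks have become pushable to other CPUs.
		 */
		if (unlikely(rt_task(p)) && !task_running(rq, p) &&
		    p->prio >= rq->curr->prio)
			push_rt_tasks(rq);
	}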