The interactivity estimator special-cases tasks that are waking up from
uninterruptible sleep, on the basis that most uninterruptible sleep
represents a task waiting on disk I/O and is not truly interactive. The
current system places a ceiling on the priority bonus such tasks can
receive. The problem with that system is that if there are enough
interactive tasks at high bonus levels, it can lead to I/O starvation.

In order to remove the ceiling but still maintain some special-case
treatment of uninterruptible sleep, we can make any sleep_avg increment
purely proportional to sleep time instead of being biased in the
non-linear fashion that it is for interactive tasks.

This will be to the detriment of interactive behaviour under disk I/O;
however, the current system unfairly biases against tasks waiting on
disk I/O and leads to a loss of disk throughput. This change should
restore a better balance between disk throughput and interactivity.

Signed-off-by: Con Kolivas

 kernel/sched.c |   36 ++++++++++++------------------------
 1 files changed, 12 insertions(+), 24 deletions(-)

Index: linux-2.6.15/kernel/sched.c
===================================================================
--- linux-2.6.15.orig/kernel/sched.c
+++ linux-2.6.15/kernel/sched.c
@@ -756,26 +756,17 @@ static int recalc_task_prio(task_t *p, u
 			p->sleep_avg = JIFFIES_TO_NS(MAX_SLEEP_AVG -
 						DEF_TIMESLICE);
 		} else {
-			/*
-			 * The lower the sleep avg a task has the more
-			 * rapidly it will rise with sleep time.
-			 */
-			sleep_time *= (MAX_BONUS - CURRENT_BONUS(p)) ? : 1;
 
 			/*
-			 * Tasks waking from uninterruptible sleep are
-			 * limited in their sleep_avg rise as they
-			 * are likely to be waiting on I/O
+			 * The lower the sleep avg a task has the more
+			 * rapidly it will rise with sleep time. This enables
+			 * tasks to rapidly recover to a low latency priority.
+			 * If a task was sleeping with the noninteractive
+			 * label do not apply this non-linear boost
 			 */
-			if (p->sleep_type == SLEEP_NONINTERACTIVE && p->mm) {
-				if (p->sleep_avg >= INTERACTIVE_SLEEP(p))
-					sleep_time = 0;
-				else if (p->sleep_avg + sleep_time >=
-						INTERACTIVE_SLEEP(p)) {
-					p->sleep_avg = INTERACTIVE_SLEEP(p);
-					sleep_time = 0;
-				}
-			}
+			if (p->sleep_type != SLEEP_NONINTERACTIVE || p->mm)
+				sleep_time *=
+					(MAX_BONUS - CURRENT_BONUS(p)) ? : 1;
 
 			/*
 			 * This code gives a bonus to interactive tasks.
@@ -818,11 +809,7 @@ static void activate_task(task_t *p, run
 	if (!rt_task(p))
 		p->prio = recalc_task_prio(p, now);
 
-	/*
-	 * This checks to make sure it's not an uninterruptible task
-	 * that is now waking up.
-	 */
-	if (p->sleep_type == SLEEP_NORMAL) {
+	if (p->sleep_type != SLEEP_NONINTERACTIVE) {
 		/*
 		 * Tasks which were woken up by interrupts (ie. hw events)
 		 * are most likely of interactive nature. So we give them
@@ -1356,8 +1343,9 @@ out_activate:
 	if (old_state == TASK_UNINTERRUPTIBLE) {
 		rq->nr_uninterruptible--;
 		/*
-		 * Tasks on involuntary sleep don't earn
-		 * sleep_avg beyond just interactive state.
+		 * Tasks waking from uninterruptible sleep are likely
+		 * to be sleeping involuntarily on I/O and are otherwise
+		 * cpu bound so label them as noninteractive.
 		 */
 		p->sleep_type = SLEEP_NONINTERACTIVE;
 	}
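
For illustration only, the difference between the non-linear and the
purely linear sleep_avg credit described in the changelog can be sketched
as a small userspace model. This is not the kernel code: every model_ and
MODEL_ name and both constants below are assumptions chosen for
readability, the bonus calculation is a simplification, and the p->mm
test from the patch is left out.

```c
#include <assert.h>

/* Illustrative constants; assumptions for this model, not kernel values. */
#define MODEL_MAX_BONUS      10ULL
#define MODEL_MAX_SLEEP_AVG  1000ULL	/* arbitrary time units */

enum model_sleep_type {
	MODEL_SLEEP_NORMAL,
	MODEL_SLEEP_NONINTERACTIVE,
};

/* Simplified stand-in for CURRENT_BONUS(): scales linearly with sleep_avg. */
static unsigned long long model_current_bonus(unsigned long long sleep_avg)
{
	return sleep_avg * MODEL_MAX_BONUS / MODEL_MAX_SLEEP_AVG;
}

/*
 * Credit sleep_time to sleep_avg and return the new sleep_avg.
 * Tasks not labelled noninteractive get the non-linear boost: the lower
 * their sleep_avg, the larger the multiplier. Noninteractive sleepers
 * are credited with exactly the time they slept.
 */
static unsigned long long model_credit_sleep(unsigned long long sleep_avg,
					     unsigned long long sleep_time,
					     enum model_sleep_type type)
{
	assert(sleep_avg <= MODEL_MAX_SLEEP_AVG);

	if (type != MODEL_SLEEP_NONINTERACTIVE) {
		unsigned long long mult =
			MODEL_MAX_BONUS - model_current_bonus(sleep_avg);

		/* Never multiply by zero, as with "? : 1" in the patch. */
		sleep_time *= mult ? mult : 1;
	}

	sleep_avg += sleep_time;
	if (sleep_avg > MODEL_MAX_SLEEP_AVG)
		sleep_avg = MODEL_MAX_SLEEP_AVG;	/* clamp in this model */

	return sleep_avg;
}
```

In this model a task with sleep_avg 0 that sleeps for 10 units ends up at
100 when labelled normal but only at 10 when labelled noninteractive.
There is no ceiling in either case, so a noninteractive sleeper can still
reach the maximum; it just gets there in proportion to the time slept.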