linux-kernel.vger.kernel.org archive mirror
* rt scheduler may calculate wrong rt_time
@ 2011-04-21 12:55 Thomas Giesel
  2011-04-22  8:21 ` Mike Galbraith
  0 siblings, 1 reply; 6+ messages in thread
From: Thomas Giesel @ 2011-04-21 12:55 UTC (permalink / raw)
  To: linux-kernel

Friends of the scheduler,

I found that the current scheduler (well, at least 2.6.38) computes a
wrong rt_time for realtime tasks in certain situations.

Example scenario:
- HZ = 1000, rt_runtime = 95 ms, rt_period = 100 ms (similar with other
  setups, but that's what I did)
- a high priority rt task (A) gets packets from Ethernet about every 10
  ms
- a low priority rt task (B) unfortunately runs for a longer time
  (here: endlessly :)
- no other tasks running (i.e. about 5 ms idle left per period)

When the realtime tasks exceed their runtime (e.g. because of (B)), they
are throttled. During this time the idle task is scheduled. While idle,
tick_nohz_stop_sched_tick() stops the scheduler tick, so
update_rq_clock() is _not_ called for a while. When a realtime task
is woken up during this time (e.g. (A) by network traffic),
update_rq_clock() is called from enqueue_task(). The task is not picked
yet, because it is still throttled. After a while
sched_rt_period_timer() unthrottles the realtime tasks and cpu_idle
calls schedule().

schedule() picks (A), which was woken up a while ago.
_pick_next_task_rt() sets exec_start to rq->clock_task. But
rq->clock_task was last updated when the task was woken up, which could
have been up to 5 ms ago in my example. So exec_start contains a time
_before_ the task actually started running. As a result, rt_time is
overestimated, which causes the rt tasks to be throttled even
earlier in the next period. This error may even grow from interval
to interval, because the throttle window (initially 5 ms) also
grows.
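
To make the arithmetic concrete, here is a small user-space toy model of
the sequence described above. It is only a sketch: toy_rq,
toy_update_clock(), toy_stale_charge() and toy_overcharge() are
hypothetical names made up for illustration, not kernel code, and the
model assumes the clock is refreshed only at wakeup and again when the
runtime is accounted.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; names are hypothetical stand-ins, not kernel API. */
struct toy_rq {
	uint64_t clock_task;	/* cached clock; changes only on an explicit update */
};

/* Stand-in for a clock update: copy the real time into the cached value. */
static void toy_update_clock(struct toy_rq *rq, uint64_t now_ns)
{
	rq->clock_task = now_ns;
}

/*
 * Runtime charged to a task that is woken at wake_ns while throttled,
 * picked at pick_ns without a clock update, and accounted at stop_ns.
 */
static uint64_t toy_stale_charge(uint64_t wake_ns, uint64_t pick_ns,
				 uint64_t stop_ns)
{
	struct toy_rq rq = { 0 };
	uint64_t exec_start;

	assert(wake_ns <= pick_ns && pick_ns <= stop_ns);

	toy_update_clock(&rq, wake_ns);	/* wakeup: enqueue refreshes the clock */
	exec_start = rq.clock_task;	/* pick: clock still holds the wakeup time */
	toy_update_clock(&rq, stop_ns);	/* later accounting sees the real time */

	return rq.clock_task - exec_start;
}

/* Charged runtime minus the time the task really ran (stop_ns - pick_ns). */
static uint64_t toy_overcharge(uint64_t wake_ns, uint64_t pick_ns,
			       uint64_t stop_ns)
{
	return toy_stale_charge(wake_ns, pick_ns, stop_ns) - (stop_ns - pick_ns);
}
```

With a wakeup at 96 ms, a pick at 100 ms and accounting at 101 ms, the
model charges 5 ms for 1 ms of real execution, i.e. the overcharge equals
the time between wakeup and pick.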

IMHO the best place to update clock_task would be a function called
from tick_nohz_restart_sched_tick(). But currently I don't see a
suitable interface to the scheduler to do this. For now I call
update_rq_clock(rq) just before put_prev_task() in schedule(). This
solves the issue, and the rt_runtime limit is then kept quite accurately.
(Well, the same result could be had by removing the "if (...)" in
put_prev_task().)
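
The effect of refreshing the clock before the pick can be sketched in a
user-space toy model as well. Again, this is not the kernel change
itself: toy_rq, toy_update_clock() and toy_charged_runtime() are
hypothetical names, and the refresh_before_pick flag merely stands in
for the extra clock update described above.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only; names are hypothetical stand-ins, not kernel API. */
struct toy_rq {
	uint64_t clock_task;	/* cached clock; changes only on an explicit update */
};

static void toy_update_clock(struct toy_rq *rq, uint64_t now_ns)
{
	rq->clock_task = now_ns;
}

/*
 * Runtime charged to a task woken at wake_ns, picked at pick_ns and
 * accounted at stop_ns. If refresh_before_pick is nonzero, the clock is
 * updated right before exec_start is sampled (the workaround).
 */
static uint64_t toy_charged_runtime(uint64_t wake_ns, uint64_t pick_ns,
				    uint64_t stop_ns, int refresh_before_pick)
{
	struct toy_rq rq = { 0 };
	uint64_t exec_start;

	assert(wake_ns <= pick_ns && pick_ns <= stop_ns);

	toy_update_clock(&rq, wake_ns);		/* wakeup while throttled */
	if (refresh_before_pick)
		toy_update_clock(&rq, pick_ns);	/* workaround: refresh first */
	exec_start = rq.clock_task;		/* sampled when the task is picked */
	toy_update_clock(&rq, stop_ns);		/* tick is running again */

	return rq.clock_task - exec_start;
}
```

For a wakeup at 96 ms, a pick at 100 ms and accounting at 101 ms, the
model charges 5 ms without the refresh and 1 ms with it, which matches
the time the task really ran.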

What do you think is the best way to solve this issue?

Thomas


Thread overview: 6+ messages
2011-04-21 12:55 rt scheduler may calculate wrong rt_time Thomas Giesel
2011-04-22  8:21 ` Mike Galbraith
2011-04-22 20:52   ` Thomas Giesel
2011-04-27 17:51   ` Thomas Giesel
2011-04-29  6:36     ` [patch] " Mike Galbraith
2011-05-16 10:37       ` [tip:sched/core] sched, rt: Update rq clock when unthrottling of an otherwise idle CPU tip-bot for Mike Galbraith
