linux-kernel.vger.kernel.org archive mirror
* [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE
@ 2016-10-24 14:06 Luca Abeni
  2016-10-24 14:06 ` [RFC v3 1/6] Track the active utilisation Luca Abeni
                   ` (5 more replies)
  0 siblings, 6 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-24 14:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt, Luca Abeni

Hi all,

this patchset implements CPU reclaiming (using the GRUB algorithm [1])
for SCHED_DEADLINE: the feature allows SCHED_DEADLINE tasks to consume
more than their reserved runtime, up to a maximum fraction of the CPU
time (so that some spare CPU time is left for other tasks to execute),
as long as this does not break the guarantees of the other
SCHED_DEADLINE tasks.
The patchset applies on top of tip/master.


The implemented CPU reclaiming algorithm is based on tracking the
utilization U_act of the active tasks (first 2 patches) and on modifying
the runtime accounting rule (see patch 0004). The original GRUB
algorithm is modified as described in [2] to support multiple CPUs (the
original algorithm only considered a single CPU, while this one tracks
U_act per runqueue) and to leave an "unreclaimable" fraction of CPU time
to non-SCHED_DEADLINE tasks (see patch 0005: the original algorithm can
consume 100% of the CPU time, starving all the other tasks).
Patch 0003 uses the "inactive timer" introduced in patch 0002 to fix
dl_overflow() and __setparam_dl().
Patch 0006 allows CPU reclaiming to be enabled only for selected tasks.
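
To give an idea of the effect of the new accounting rule (the
reservation parameters below are purely illustrative): with the
standard CBS accounting the runtime decreases as dq = -dt, while with
GRUB it decreases as dq = -U_act dt. So, if a task with a 25ms/100ms
reservation (U = 0.25) is the only active -deadline task on its CPU,
U_act = 0.25 and each millisecond of execution consumes only 0.25ms of
runtime: the task can keep executing for (almost) the whole period
instead of being throttled after 25ms, minus the fraction of CPU time
that patch 0005 leaves to non-deadline tasks.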


Changes since v2:
in general, I tried to address all the comments I received, and to add
some more comments in "critical" parts of the code. In particular, I:
- Updated to latest tip/master. This required some changes (for example,
  using "struct rq_flags" instead of "unsigned long" for task_rq_lock())
- Merged patches 0001 and 0002, as suggested by Juri
- Added some comments about GRUB in the changelog of the patch adding
  GRUB accounting
- Exchanged the order of two patches ("Make GRUB a task's flag" and
  "Do not reclaim the whole CPU bandwidth"), as suggested by Juri
- Removed unused code ("if (task_on_rq_queued(p))" in task_dead_dl(),
  noticed by Peter) from patch 0001
- Properly consider the migrations of queued dl tasks when updating
  the active utilization. This should address Peter's concern from
  http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02612.html
- Simplified the code for setting up the inactive timer, as suggested
  by Peter: http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02620.html
- Use hrtimer_is_queued() instead of "(hrtimer_active() &&
  !hrtimer_callback_running())", as pointed out by Peter:
  http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02805.html
- Fix select_task_rq_dl() (using "task_cpu(p) != cpu" instead
  of "rq != cpu_rq(cpu)"), as pointed out by Peter:
  http://lkml.iu.edu/hypermail/linux/kernel/1604.0/02822.html
  I also changed the logic used in select_task_rq_dl()
  (that now does not increase the active utilization in the selected
  runqueue)
- Because of the changes in the code, I am not sure if the race condition
  pointed out by Peter can still happen. I tried to trigger it in many ways,
  but I failed... If it turns out that the race is still possible, I'll fix
  it in the next round of patches, by introducing a new "is_contending" field
  (protected by pi_mutex) in the dl scheduling entity.


[1] Lipari, G., & Baruah, S. (2000). Greedy reclamation of unused bandwidth in constant-bandwidth servers. In Proceedings of the 12th Euromicro Conference on Real-Time Systems (Euromicro RTS 2000), pp. 193-200. IEEE.
[2] Abeni, L., Lelli, J., Scordino, C., & Palopoli, L. (2014, October). Greedy CPU reclaiming for SCHED DEADLINE. In Proceedings of the Real-Time Linux Workshop (RTLWS), Dusseldorf, Germany. 



Luca Abeni (6):
  Track the active utilisation
  Improve the tracking of active utilisation
  Fix the update of the total -deadline utilization
  GRUB accounting
  Do not reclaim the whole CPU bandwidth
  Make GRUB a task's flag

 include/linux/sched.h      |   1 +
 include/uapi/linux/sched.h |   1 +
 kernel/sched/core.c        |  44 ++++-----
 kernel/sched/deadline.c    | 220 ++++++++++++++++++++++++++++++++++++++++-----
 kernel/sched/sched.h       |  13 +++
 5 files changed, 234 insertions(+), 45 deletions(-)

-- 
2.7.4


* [RFC v3 1/6] Track the active utilisation
  2016-10-24 14:06 [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE Luca Abeni
@ 2016-10-24 14:06 ` Luca Abeni
  2016-10-25  9:09   ` Daniel Bristot de Oliveira
                     ` (2 more replies)
  2016-10-24 14:06 ` [RFC v3 2/6] Improve the tracking of " Luca Abeni
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-24 14:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt, Luca Abeni

The active utilisation here is defined as the total utilisation of the
active (TASK_RUNNING) tasks queued on a runqueue. Hence, it is increased
when a task wakes up and is decreased when a task blocks.

When a task is migrated from CPUi to CPUj, immediately subtract the task's
utilisation from CPUi and add it to CPUj. This mechanism is implemented by
modifying the pull and push functions.
Note: this is not fully correct from the theoretical point of view
(the utilisation should be removed from CPUi only at the 0 lag time),
but doing the right thing would be _MUCH_ more complex (leaving the
timer armed when the task is on a different CPU... Inactive timers should
be moved from per-task timers to per-runqueue lists of timers! Bah...)

The utilisation tracking mechanism implemented in this commit can be
fixed / improved by decreasing the active utilisation at the so-called
"0-lag time" instead of when the task blocks.

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
---
 kernel/sched/deadline.c | 39 ++++++++++++++++++++++++++++++++++++++-
 kernel/sched/sched.h    |  6 ++++++
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 37e2449..3d95c1d 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -43,6 +43,22 @@ static inline int on_dl_rq(struct sched_dl_entity *dl_se)
 	return !RB_EMPTY_NODE(&dl_se->rb_node);
 }
 
+static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
+{
+	u64 se_bw = dl_se->dl_bw;
+
+	dl_rq->running_bw += se_bw;
+}
+
+static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
+{
+	u64 se_bw = dl_se->dl_bw;
+
+	dl_rq->running_bw -= se_bw;
+	if (WARN_ON(dl_rq->running_bw < 0))
+		dl_rq->running_bw = 0;
+}
+
 static inline int is_leftmost(struct task_struct *p, struct dl_rq *dl_rq)
 {
 	struct sched_dl_entity *dl_se = &p->dl;
@@ -498,6 +514,8 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
 	struct rq *rq = rq_of_dl_rq(dl_rq);
 
+	add_running_bw(dl_se, dl_rq);
+
 	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
 	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
 		dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
@@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		return;
 	}
 
+	if (p->on_rq == TASK_ON_RQ_MIGRATING)
+		add_running_bw(&p->dl, &rq->dl);
+
 	/*
 	 * If p is throttled, we do nothing. In fact, if it exhausted
 	 * its budget it needs a replenishment and, since it now is on
 	 * its rq, the bandwidth timer callback (which clearly has not
 	 * run yet) will take care of this.
 	 */
-	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
+	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
+		add_running_bw(&p->dl, &rq->dl);
 		return;
+	}
 
 	enqueue_dl_entity(&p->dl, pi_se, flags);
 
@@ -972,6 +995,12 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 {
 	update_curr_dl(rq);
 	__dequeue_task_dl(rq, p, flags);
+
+	if (p->on_rq == TASK_ON_RQ_MIGRATING)
+		sub_running_bw(&p->dl, &rq->dl);
+
+	if (flags & DEQUEUE_SLEEP)
+		sub_running_bw(&p->dl, &rq->dl);
 }
 
 /*
@@ -1501,7 +1530,9 @@ static int push_dl_task(struct rq *rq)
 	}
 
 	deactivate_task(rq, next_task, 0);
+	sub_running_bw(&next_task->dl, &rq->dl);
 	set_task_cpu(next_task, later_rq->cpu);
+	add_running_bw(&next_task->dl, &later_rq->dl);
 	activate_task(later_rq, next_task, 0);
 	ret = 1;
 
@@ -1589,7 +1620,9 @@ static void pull_dl_task(struct rq *this_rq)
 			resched = true;
 
 			deactivate_task(src_rq, p, 0);
+			sub_running_bw(&p->dl, &src_rq->dl);
 			set_task_cpu(p, this_cpu);
+			add_running_bw(&p->dl, &this_rq->dl);
 			activate_task(this_rq, p, 0);
 			dmin = p->dl.deadline;
 
@@ -1695,6 +1728,9 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
 	if (!start_dl_timer(p))
 		__dl_clear_params(p);
 
+	if (task_on_rq_queued(p))
+		sub_running_bw(&p->dl, &rq->dl);
+
 	/*
 	 * Since this might be the only -deadline task on the rq,
 	 * this is the right place to try to pull some other one
@@ -1712,6 +1748,7 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
  */
 static void switched_to_dl(struct rq *rq, struct task_struct *p)
 {
+	add_running_bw(&p->dl, &rq->dl);
 
 	/* If p is not queued we will update its parameters at next wakeup. */
 	if (!task_on_rq_queued(p))
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 055f935..3a36c74 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -535,6 +535,12 @@ struct dl_rq {
 #else
 	struct dl_bw dl_bw;
 #endif
+	/*
+	 * "Active utilization" for this runqueue: increased when a
+	 * task wakes up (becomes TASK_RUNNING) and decreased when a
+	 * task blocks
+	 */
+	s64 running_bw;
 };
 
 #ifdef CONFIG_SMP
-- 
2.7.4


* [RFC v3 2/6] Improve the tracking of active utilisation
  2016-10-24 14:06 [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE Luca Abeni
  2016-10-24 14:06 ` [RFC v3 1/6] Track the active utilisation Luca Abeni
@ 2016-10-24 14:06 ` Luca Abeni
  2016-11-01 16:46   ` Juri Lelli
                     ` (3 more replies)
  2016-10-24 14:06 ` [RFC v3 3/6] Fix the update of the total -deadline utilization Luca Abeni
                   ` (3 subsequent siblings)
  5 siblings, 4 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-24 14:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt, Luca Abeni

This patch implements a more theoretically sound algorithm for
thracking the active utilisation: instead of decreasing it when a
task blocks, use a timer (the "inactive timer", named after the
"Inactive" task state of the GRUB algorithm) to decrease the
active utilisaation at the so called "0-lag time".
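
For reference, the 0-lag time used by the new timer is computed (as in
task_go_inactive() below) as

	t_0lag = deadline - remaining_runtime * dl_period / dl_runtime

To give a purely illustrative example (the numbers are made up): a task
with dl_runtime = 10ms and dl_period = 100ms that blocks with 4ms of
runtime left and with its current deadline 60ms in the future has its
0-lag point 60ms - 4ms * (100 / 10) = 20ms from now; the "inactive
timer" is armed to fire at that instant, and the active utilisation is
decreased when it fires (or immediately, if the 0-lag time has already
passed).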

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
---
 include/linux/sched.h   |   1 +
 kernel/sched/core.c     |   1 +
 kernel/sched/deadline.c | 139 ++++++++++++++++++++++++++++++++++++++++++------
 kernel/sched/sched.h    |   1 +
 4 files changed, 126 insertions(+), 16 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 348f51b..22543c6 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1433,6 +1433,7 @@ struct sched_dl_entity {
 	 * own bandwidth to be enforced, thus we need one timer per task.
 	 */
 	struct hrtimer dl_timer;
+	struct hrtimer inactive_timer;
 };
 
 union rcu_special {
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 94732d1..664c618 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2217,6 +2217,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 
 	RB_CLEAR_NODE(&p->dl.rb_node);
 	init_dl_task_timer(&p->dl);
+	init_inactive_task_timer(&p->dl);
 	__dl_clear_params(p);
 
 	INIT_LIST_HEAD(&p->rt.run_list);
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 3d95c1d..80d1541 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -47,6 +47,7 @@ static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 {
 	u64 se_bw = dl_se->dl_bw;
 
+	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
 	dl_rq->running_bw += se_bw;
 }
 
@@ -54,11 +55,52 @@ static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
 {
 	u64 se_bw = dl_se->dl_bw;
 
+	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
 	dl_rq->running_bw -= se_bw;
 	if (WARN_ON(dl_rq->running_bw < 0))
 		dl_rq->running_bw = 0;
 }
 
+static void task_go_inactive(struct task_struct *p)
+{
+	struct sched_dl_entity *dl_se = &p->dl;
+	struct hrtimer *timer = &dl_se->inactive_timer;
+	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
+	struct rq *rq = rq_of_dl_rq(dl_rq);
+	s64 zerolag_time;
+
+	WARN_ON(dl_se->dl_runtime == 0);
+
+	/* If the inactive timer is already armed, return immediately */
+	if (hrtimer_active(&dl_se->inactive_timer))
+		return;
+
+	zerolag_time = dl_se->deadline -
+		 div64_long((dl_se->runtime * dl_se->dl_period),
+			dl_se->dl_runtime);
+
+	/*
+	 * Using relative times instead of the absolute "0-lag time"
+	 * simplifies the code
+	 */
+	zerolag_time -= rq_clock(rq);
+
+	/*
+	 * If the "0-lag time" already passed, decrease the active
+	 * utilization now, instead of starting a timer
+	 */
+	if (zerolag_time < 0) {
+		sub_running_bw(dl_se, dl_rq);
+		if (!dl_task(p))
+			__dl_clear_params(p);
+
+		return;
+	}
+
+	get_task_struct(p);
+	hrtimer_start(timer, ns_to_ktime(zerolag_time), HRTIMER_MODE_REL);
+}
+
 static inline int is_leftmost(struct task_struct *p, struct dl_rq *dl_rq)
 {
 	struct sched_dl_entity *dl_se = &p->dl;
@@ -514,7 +556,20 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
 	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
 	struct rq *rq = rq_of_dl_rq(dl_rq);
 
-	add_running_bw(dl_se, dl_rq);
+	if (hrtimer_is_queued(&dl_se->inactive_timer)) {
+		hrtimer_try_to_cancel(&dl_se->inactive_timer);
+		WARN_ON(dl_task_of(dl_se)->nr_cpus_allowed > 1);
+	} else {
+		/*
+		 * The "inactive timer" has been cancelled in
+		 * select_task_rq_dl() (and the active utilisation has
+		 * been decreased). So, increase the active utilisation.
+		 * If select_task_rq_dl() could not cancel the timer,
+		 * inactive_task_timer() will find the task state as
+		 * TASK_RUNNING, and will do nothing, so we are still safe.
+		 */
+		add_running_bw(dl_se, dl_rq);
+	}
 
 	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
 	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
@@ -602,14 +657,8 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 
 	rq = task_rq_lock(p, &rf);
 
-	/*
-	 * The task might have changed its scheduling policy to something
-	 * different than SCHED_DEADLINE (through switched_fromd_dl()).
-	 */
-	if (!dl_task(p)) {
-		__dl_clear_params(p);
+	if (!dl_task(p))
 		goto unlock;
-	}
 
 	/*
 	 * The task might have been boosted by someone else and might be in the
@@ -796,6 +845,44 @@ static void update_curr_dl(struct rq *rq)
 	}
 }
 
+static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
+{
+	struct sched_dl_entity *dl_se = container_of(timer,
+						     struct sched_dl_entity,
+						     inactive_timer);
+	struct task_struct *p = dl_task_of(dl_se);
+	struct rq_flags rf;
+	struct rq *rq;
+
+	rq = task_rq_lock(p, &rf);
+
+	if (!dl_task(p)) {
+		__dl_clear_params(p);
+
+		goto unlock;
+	}
+	if (p->state == TASK_RUNNING)
+		goto unlock;
+
+	sched_clock_tick();
+	update_rq_clock(rq);
+
+	sub_running_bw(dl_se, &rq->dl);
+unlock:
+	task_rq_unlock(rq, p, &rf);
+	put_task_struct(p);
+
+	return HRTIMER_NORESTART;
+}
+
+void init_inactive_task_timer(struct sched_dl_entity *dl_se)
+{
+	struct hrtimer *timer = &dl_se->inactive_timer;
+
+	hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	timer->function = inactive_task_timer;
+}
+
 #ifdef CONFIG_SMP
 
 static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
@@ -1000,7 +1087,7 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		sub_running_bw(&p->dl, &rq->dl);
 
 	if (flags & DEQUEUE_SLEEP)
-		sub_running_bw(&p->dl, &rq->dl);
+		task_go_inactive(p);
 }
 
 /*
@@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
 	}
 	rcu_read_unlock();
 
+	rq = task_rq(p);
+	raw_spin_lock(&rq->lock);
+	if (hrtimer_active(&p->dl.inactive_timer)) {
+		sub_running_bw(&p->dl, &rq->dl);
+		hrtimer_try_to_cancel(&p->dl.inactive_timer);
+	}
+	raw_spin_unlock(&rq->lock);
+
 out:
 	return cpu;
 }
@@ -1244,6 +1339,11 @@ static void task_dead_dl(struct task_struct *p)
 	/* XXX we should retain the bw until 0-lag */
 	dl_b->total_bw -= p->dl.dl_bw;
 	raw_spin_unlock_irq(&dl_b->lock);
+	if (hrtimer_active(&p->dl.inactive_timer)) {
+		raw_spin_lock_irq(&task_rq(p)->lock);
+		sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));
+		raw_spin_unlock_irq(&task_rq(p)->lock);
+	}
 }
 
 static void set_curr_task_dl(struct rq *rq)
@@ -1720,15 +1820,22 @@ void __init init_sched_dl_class(void)
 static void switched_from_dl(struct rq *rq, struct task_struct *p)
 {
 	/*
-	 * Start the deadline timer; if we switch back to dl before this we'll
-	 * continue consuming our current CBS slice. If we stay outside of
-	 * SCHED_DEADLINE until the deadline passes, the timer will reset the
-	 * task.
+	 * task_go_inactive() can start the "inactive timer" (if the 0-lag
+	 * time is in the future). If the task switches back to dl before
+	 * the "inactive timer" fires, it can continue to consume its current
+	 * runtime using its current deadline. If it stays outside of
+	 * SCHED_DEADLINE until the 0-lag time passes, inactive_task_timer()
+	 * will reset the task parameters.
 	 */
-	if (!start_dl_timer(p))
-		__dl_clear_params(p);
+	if (task_on_rq_queued(p) && p->dl.dl_runtime)
+		task_go_inactive(p);
 
-	if (task_on_rq_queued(p))
+	/*
+	 * We cannot use inactive_task_timer() to invoke sub_running_bw()
+	 * at the 0-lag time, because the task could have been migrated
+	 * while SCHED_OTHER in the meanwhile.
+	 */
+	if (hrtimer_is_queued(&p->dl.inactive_timer))
 		sub_running_bw(&p->dl, &rq->dl);
 
 	/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3a36c74..e82c419 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1358,6 +1358,7 @@ extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime
 extern struct dl_bandwidth def_dl_bandwidth;
 extern void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime);
 extern void init_dl_task_timer(struct sched_dl_entity *dl_se);
+extern void init_inactive_task_timer(struct sched_dl_entity *dl_se);
 
 unsigned long to_ratio(u64 period, u64 runtime);
 
-- 
2.7.4


* [RFC v3 3/6] Fix the update of the total -deadline utilization
  2016-10-24 14:06 [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE Luca Abeni
  2016-10-24 14:06 ` [RFC v3 1/6] Track the active utilisation Luca Abeni
  2016-10-24 14:06 ` [RFC v3 2/6] Improve the tracking of " Luca Abeni
@ 2016-10-24 14:06 ` Luca Abeni
  2016-10-24 14:06 ` [RFC v3 4/6] GRUB accounting Luca Abeni
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-24 14:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt, Luca Abeni

Now that the inactive timer can be armed to fire at the 0-lag time,
it is possible to use inactive_task_timer() to update the total
-deadline utilization (dl_b->total_bw) at the correct time, fixing
dl_overflow() and __setparam_dl().
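
To illustrate the problem being fixed (with made-up numbers): when a
task with utilization 0.5 leaves SCHED_DEADLINE, the old code returned
the 0.5 to dl_b->total_bw immediately, so another task could pass
admission control right away even though the leaving task may still
legitimately consume its remaining runtime until its 0-lag time (say,
30ms later), causing a temporary overload. With this patch the
bandwidth is released only when inactive_task_timer() fires at the
0-lag time (or immediately, if that time has already passed).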

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
---
 kernel/sched/core.c     | 36 ++++++++++++------------------------
 kernel/sched/deadline.c | 34 +++++++++++++++++++++++++---------
 2 files changed, 37 insertions(+), 33 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 664c618..337a5f0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2507,9 +2507,6 @@ static inline int dl_bw_cpus(int i)
  * allocated bandwidth to reflect the new situation.
  *
  * This function is called while holding p's rq->lock.
- *
- * XXX we should delay bw change until the task's 0-lag point, see
- * __setparam_dl().
  */
 static int dl_overflow(struct task_struct *p, int policy,
 		       const struct sched_attr *attr)
@@ -2538,11 +2535,22 @@ static int dl_overflow(struct task_struct *p, int policy,
 		err = 0;
 	} else if (dl_policy(policy) && task_has_dl_policy(p) &&
 		   !__dl_overflow(dl_b, cpus, p->dl.dl_bw, new_bw)) {
+		/*
+		 * XXX this is slightly incorrect: when the task
+		 * utilization decreases, we should delay the total
+		 * utilization change until the task's 0-lag point.
+		 * But this would require setting the task's "inactive
+		 * timer" when the task is not inactive.
+		 */
 		__dl_clear(dl_b, p->dl.dl_bw);
 		__dl_add(dl_b, new_bw);
 		err = 0;
 	} else if (!dl_policy(policy) && task_has_dl_policy(p)) {
-		__dl_clear(dl_b, p->dl.dl_bw);
+		/*
+		 * Do not decrease the total deadline utilization here;
+		 * switched_from_dl() will take care of doing it at the correct
+		 * (0-lag) time.
+		 */
 		err = 0;
 	}
 	raw_spin_unlock(&dl_b->lock);
@@ -3912,26 +3920,6 @@ __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
 	dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
 	dl_se->flags = attr->sched_flags;
 	dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-
-	/*
-	 * Changing the parameters of a task is 'tricky' and we're not doing
-	 * the correct thing -- also see task_dead_dl() and switched_from_dl().
-	 *
-	 * What we SHOULD do is delay the bandwidth release until the 0-lag
-	 * point. This would include retaining the task_struct until that time
-	 * and change dl_overflow() to not immediately decrement the current
-	 * amount.
-	 *
-	 * Instead we retain the current runtime/deadline and let the new
-	 * parameters take effect after the current reservation period lapses.
-	 * This is safe (albeit pessimistic) because the 0-lag point is always
-	 * before the current scheduling deadline.
-	 *
-	 * We can still have temporary overloads because we do not delay the
-	 * change in bandwidth until that time; so admission control is
-	 * not on the safe side. It does however guarantee tasks will never
-	 * consume more than promised.
-	 */
 }
 
 /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 80d1541..4d3545b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -91,8 +91,14 @@ static void task_go_inactive(struct task_struct *p)
 	 */
 	if (zerolag_time < 0) {
 		sub_running_bw(dl_se, dl_rq);
-		if (!dl_task(p))
+		if (!dl_task(p)) {
+			struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
+
+			raw_spin_lock(&dl_b->lock);
+			__dl_clear(dl_b, p->dl.dl_bw);
 			__dl_clear_params(p);
+			raw_spin_unlock(&dl_b->lock);
+		}
 
 		return;
 	}
@@ -856,8 +862,13 @@ static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
 
 	rq = task_rq_lock(p, &rf);
 
-	if (!dl_task(p)) {
+	if (!dl_task(p) || p->state == TASK_DEAD) {
+		struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
+
+		raw_spin_lock(&dl_b->lock);
+		__dl_clear(dl_b, p->dl.dl_bw);
 		__dl_clear_params(p);
+		raw_spin_unlock(&dl_b->lock);
 
 		goto unlock;
 	}
@@ -1330,16 +1341,21 @@ static void task_fork_dl(struct task_struct *p)
 
 static void task_dead_dl(struct task_struct *p)
 {
-	struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
-
 	/*
 	 * Since we are TASK_DEAD we won't slip out of the domain!
 	 */
-	raw_spin_lock_irq(&dl_b->lock);
-	/* XXX we should retain the bw until 0-lag */
-	dl_b->total_bw -= p->dl.dl_bw;
-	raw_spin_unlock_irq(&dl_b->lock);
-	if (hrtimer_active(&p->dl.inactive_timer)) {
+	if (!hrtimer_active(&p->dl.inactive_timer)) {
+		struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
+
+		/*
+		 * If the "inactive timer" is not active, the 0-lag time
+		 * has already passed, so we immediately decrease the
+		 * total deadline utilization
+		 */
+		raw_spin_lock_irq(&dl_b->lock);
+		__dl_clear(dl_b, p->dl.dl_bw);
+		raw_spin_unlock_irq(&dl_b->lock);
+	} else {
 		raw_spin_lock_irq(&task_rq(p)->lock);
 		sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));
 		raw_spin_unlock_irq(&task_rq(p)->lock);
-- 
2.7.4


* [RFC v3 4/6] GRUB accounting
  2016-10-24 14:06 [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE Luca Abeni
                   ` (2 preceding siblings ...)
  2016-10-24 14:06 ` [RFC v3 3/6] Fix the update of the total -deadline utilization Luca Abeni
@ 2016-10-24 14:06 ` Luca Abeni
  2016-10-24 14:06 ` [RFC v3 5/6] Do not reclaim the whole CPU bandwidth Luca Abeni
  2016-10-24 14:06 ` [RFC v3 6/6] Make GRUB a task's flag Luca Abeni
  5 siblings, 0 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-24 14:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt, Luca Abeni

According to the GRUB (Greedy Reclamation of Unused Bandwidth) reclaiming
algorithm, the runtime is not decreased as "dq = -dt", but as
"dq = -Uact dt" (where Uact is the per-runqueue active utilization).
Hence, this commit modifies the runtime accounting rule in update_curr_dl()
to implement the GRUB rule.
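
As a concrete (purely illustrative) instance of the fixed-point
arithmetic used by grub_reclaim() below: running_bw stores Uact scaled
by 2^20 (the same scale used by to_ratio() for dl_bw), so with
Uact = 0.5 the field holds 524288 and 1ms of measured execution time
is accounted as

	(1000000 * 524288) >> 20 = 500000 ns

i.e. the running task is charged only half of the elapsed time.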

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
---
 kernel/sched/deadline.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 4d3545b..fa10fcd 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -771,6 +771,19 @@ int dl_runtime_exceeded(struct sched_dl_entity *dl_se)
 extern bool sched_rt_bandwidth_account(struct rt_rq *rt_rq);
 
 /*
+ * This function implements the GRUB accounting rule:
+ * according to the GRUB reclaiming algorithm, the runtime is
+ * not decreased as "dq = -dt", but as "dq = -Uact dt", where
+ * Uact is the (per-runqueue) active utilization.
+ * Since rq->dl.running_bw contains Uact * 2^20, the result
+ * has to be shifted right by 20.
+ */
+u64 grub_reclaim(u64 delta, struct rq *rq)
+{
+	return (delta * rq->dl.running_bw) >> 20;
+}
+
+/*
  * Update the current task's runtime statistics (provided it is still
  * a -deadline task and has not been removed from the dl_rq).
  */
@@ -812,6 +825,7 @@ static void update_curr_dl(struct rq *rq)
 
 	sched_rt_avg_update(rq, delta_exec);
 
+	delta_exec = grub_reclaim(delta_exec, rq);
 	dl_se->runtime -= delta_exec;
 
 throttle:
-- 
2.7.4


* [RFC v3 5/6] Do not reclaim the whole CPU bandwidth
  2016-10-24 14:06 [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE Luca Abeni
                   ` (3 preceding siblings ...)
  2016-10-24 14:06 ` [RFC v3 4/6] GRUB accounting Luca Abeni
@ 2016-10-24 14:06 ` Luca Abeni
  2016-10-24 14:06 ` [RFC v3 6/6] Make GRUB a task's flag Luca Abeni
  5 siblings, 0 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-24 14:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt, Luca Abeni

The original GRUB algorithm tends to reclaim 100% of the CPU time, and
this allows a CPU hog to starve non-deadline tasks.
To address this issue, allow the scheduler to reclaim only a specified
fraction of the CPU time.
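
With the default global limits (sched_rt_runtime_us = 950000,
sched_rt_period_us = 1000000, i.e. a 0.95 ratio) this gives, for
example, non_deadline_bw = (1 << 20) - to_ratio(period, runtime) =
1048576 - 996147 = 52429, about 5% of the fixed-point scale:
grub_reclaim() then charges the running task at least ~5% of the
elapsed wall-clock time even when running_bw is close to zero, so
-deadline tasks always leave roughly 5% (or more) of the CPU time to
non-deadline tasks.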

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
---
 kernel/sched/core.c     | 4 ++++
 kernel/sched/deadline.c | 7 ++++++-
 kernel/sched/sched.h    | 6 ++++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 337a5f0..6706b5f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8256,6 +8256,10 @@ static void sched_dl_do_global(void)
 		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
 
 		rcu_read_unlock_sched();
+		if (dl_b->bw == -1)
+			cpu_rq(cpu)->dl.non_deadline_bw = 0;
+		else
+			cpu_rq(cpu)->dl.non_deadline_bw = (1 << 20) - new_bw;
 	}
 }
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index fa10fcd..cab76e8 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -147,6 +147,11 @@ void init_dl_rq(struct dl_rq *dl_rq)
 #else
 	init_dl_bw(&dl_rq->dl_bw);
 #endif
+	if (global_rt_runtime() == RUNTIME_INF)
+		dl_rq->non_deadline_bw = 0;
+	else
+		dl_rq->non_deadline_bw = (1 << 20) -
+			to_ratio(global_rt_period(), global_rt_runtime());
 }
 
 #ifdef CONFIG_SMP
@@ -780,7 +785,7 @@ extern bool sched_rt_bandwidth_account(struct rt_rq *rt_rq);
  */
 u64 grub_reclaim(u64 delta, struct rq *rq)
 {
-	return (delta * rq->dl.running_bw) >> 20;
+	return (delta * (rq->dl.non_deadline_bw + rq->dl.running_bw)) >> 20;
 }
 
 /*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e82c419..2748431 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -541,6 +541,12 @@ struct dl_rq {
 	 * task blocks
 	 */
 	s64 running_bw;
+
+	/*
+	 * Fraction of the CPU utilization that cannot be reclaimed
+	 * by the GRUB algorithm.
+	 */
+	s64 non_deadline_bw;
 };
 
 #ifdef CONFIG_SMP
-- 
2.7.4


* [RFC v3 6/6] Make GRUB a task's flag
  2016-10-24 14:06 [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE Luca Abeni
                   ` (4 preceding siblings ...)
  2016-10-24 14:06 ` [RFC v3 5/6] Do not reclaim the whole CPU bandwidth Luca Abeni
@ 2016-10-24 14:06 ` Luca Abeni
  5 siblings, 0 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-24 14:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt, Luca Abeni
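
Illustrative userspace usage (not part of this patch): a task can opt
in to reclaiming by passing SCHED_FLAG_RECLAIM together with its
SCHED_DEADLINE parameters to sched_setattr(). The struct sched_attr
layout and the raw syscall invocation below follow the sched_setattr(2)
conventions (there is no glibc wrapper); the reservation values are
made up.

#define _GNU_SOURCE
#include <stdint.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/sched.h>	/* SCHED_DEADLINE, SCHED_FLAG_* */

#ifndef SCHED_FLAG_RECLAIM
#define SCHED_FLAG_RECLAIM	0x02
#endif

struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

static int set_deadline_with_reclaiming(void)
{
	struct sched_attr attr = {
		.size		= sizeof(attr),
		.sched_policy	= SCHED_DEADLINE,
		.sched_flags	= SCHED_FLAG_RECLAIM,
		.sched_runtime	=  10 * 1000 * 1000,	/*  10ms */
		.sched_deadline	= 100 * 1000 * 1000,	/* 100ms */
		.sched_period	= 100 * 1000 * 1000,	/* 100ms */
	};

	/* 0 == current thread */
	return syscall(SYS_sched_setattr, 0, &attr, 0);
}
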

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
---
 include/uapi/linux/sched.h | 1 +
 kernel/sched/core.c        | 3 ++-
 kernel/sched/deadline.c    | 3 ++-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 5f0fe01..e2a6c7b 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -47,5 +47,6 @@
  * For the sched_{set,get}attr() calls
  */
 #define SCHED_FLAG_RESET_ON_FORK	0x01
+#define SCHED_FLAG_RECLAIM		0x02
 
 #endif /* _UAPI_LINUX_SCHED_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6706b5f..421a315 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4087,7 +4087,8 @@ static int __sched_setscheduler(struct task_struct *p,
 			return -EINVAL;
 	}
 
-	if (attr->sched_flags & ~(SCHED_FLAG_RESET_ON_FORK))
+	if (attr->sched_flags &
+		~(SCHED_FLAG_RESET_ON_FORK | SCHED_FLAG_RECLAIM))
 		return -EINVAL;
 
 	/*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cab76e8..8c4c0e0 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -830,7 +830,8 @@ static void update_curr_dl(struct rq *rq)
 
 	sched_rt_avg_update(rq, delta_exec);
 
-	delta_exec = grub_reclaim(delta_exec, rq);
+	if (unlikely(dl_se->flags & SCHED_FLAG_RECLAIM))
+		delta_exec = grub_reclaim(delta_exec, rq);
 	dl_se->runtime -= delta_exec;
 
 throttle:
-- 
2.7.4


* Re: [RFC v3 1/6] Track the active utilisation
  2016-10-24 14:06 ` [RFC v3 1/6] Track the active utilisation Luca Abeni
@ 2016-10-25  9:09   ` Daniel Bristot de Oliveira
  2016-10-25  9:29     ` luca abeni
  2016-11-01 16:45   ` Juri Lelli
  2016-11-18 13:55   ` Peter Zijlstra
  2 siblings, 1 reply; 45+ messages in thread
From: Daniel Bristot de Oliveira @ 2016-10-25  9:09 UTC (permalink / raw)
  To: Luca Abeni, linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt

On 24/10/2016 16:06, Luca Abeni wrote:
> The active utilisation here is defined as the total utilisation of the
> active (TASK_RUNNING) tasks queued on a runqueue. Hence, it is increased
> when a task wakes up and is decreased when a task blocks.
> 
> When a task is migrated from CPUi to CPUj, immediately subtract the task's
> utilisation from CPUi and add it to CPUj. This mechanism is implemented by
> modifying the pull and push functions.
> Note: this is not fully correct from the theoretical point of view
> (the utilisation should be removed from CPUi only at the 0 lag time),
> but doing the right thing would be _MUCH_ more complex (leaving the
> timer armed when the task is on a different CPU... Inactive timers should
> be moved from per-task timers to per-runqueue lists of timers! Bah...)
> 
> The utilisation tracking mechanism implemented in this commit can be
> fixed / improved by decreasing the active utilisation at the so-called
> "0-lag time" instead of when the task blocks.
> 
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
> ---
>  kernel/sched/deadline.c | 39 ++++++++++++++++++++++++++++++++++++++-
>  kernel/sched/sched.h    |  6 ++++++
>  2 files changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 37e2449..3d95c1d 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -43,6 +43,22 @@ static inline int on_dl_rq(struct sched_dl_entity *dl_se)
>  	return !RB_EMPTY_NODE(&dl_se->rb_node);
>  }
>  
> +static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> +{
> +	u64 se_bw = dl_se->dl_bw;
> +
> +	dl_rq->running_bw += se_bw;
> +}

why not...

static *inline*
void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
{
	dl_rq->running_bw += dl_se->dl_bw;
}

am I missing something?

> +static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> +{
> +	u64 se_bw = dl_se->dl_bw;
> +
> +	dl_rq->running_bw -= se_bw;
> +	if (WARN_ON(dl_rq->running_bw < 0))
> +		dl_rq->running_bw = 0;
> +}

(if I am not missing anything...)

the same in the above function: use inline and remove the se_bw variable.

-- Daniel


* Re: [RFC v3 1/6] Track the active utilisation
  2016-10-25  9:09   ` Daniel Bristot de Oliveira
@ 2016-10-25  9:29     ` luca abeni
  2016-10-25 13:58       ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-10-25  9:29 UTC (permalink / raw)
  To: Daniel Bristot de Oliveira
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Juri Lelli,
	Claudio Scordino, Steven Rostedt

Hi Daniel,

On Tue, 25 Oct 2016 11:09:52 +0200
Daniel Bristot de Oliveira <bristot@redhat.com> wrote:
[...]
> > +static void add_running_bw(struct sched_dl_entity *dl_se, struct
> > dl_rq *dl_rq) +{
> > +	u64 se_bw = dl_se->dl_bw;
> > +
> > +	dl_rq->running_bw += se_bw;
> > +}  
> 
> why not...
> 
> static *inline*
> void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq
> *dl_rq) {
> 	dl_rq->running_bw += dl_se->dl_bw;
> }
> 
> am I missing something?

I do not know... Maybe I am the one missing something :)
I assumed that the compiler is smart enough to inline the function (and
to avoid creating a local variable on the stack), but if there is
agreement I can change the function in this way.



			Thanks,
				Luca


> 
> > +static void sub_running_bw(struct sched_dl_entity *dl_se, struct
> > dl_rq *dl_rq) +{
> > +	u64 se_bw = dl_se->dl_bw;
> > +
> > +	dl_rq->running_bw -= se_bw;
> > +	if (WARN_ON(dl_rq->running_bw < 0))
> > +		dl_rq->running_bw = 0;
> > +}  
> 
> (if I am not missing anything...)
> 
> the same in the above function: use inline and remove the se_bw
> variable.
> 
> -- Daniel


* Re: [RFC v3 1/6] Track the active utilisation
  2016-10-25  9:29     ` luca abeni
@ 2016-10-25 13:58       ` Steven Rostedt
  2016-10-25 18:04         ` Luca Abeni
  2016-11-18 14:23         ` Peter Zijlstra
  0 siblings, 2 replies; 45+ messages in thread
From: Steven Rostedt @ 2016-10-25 13:58 UTC (permalink / raw)
  To: luca abeni
  Cc: Daniel Bristot de Oliveira, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Tue, 25 Oct 2016 11:29:16 +0200
luca abeni <luca.abeni@unitn.it> wrote:

> Hi Daniel,
> 
> On Tue, 25 Oct 2016 11:09:52 +0200
> Daniel Bristot de Oliveira <bristot@redhat.com> wrote:
> [...]
> > > +static void add_running_bw(struct sched_dl_entity *dl_se, struct
> > > dl_rq *dl_rq) +{
> > > +	u64 se_bw = dl_se->dl_bw;
> > > +
> > > +	dl_rq->running_bw += se_bw;
> > > +}    
> > 
> > why not...
> > 
> > static *inline*
> > void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq
> > *dl_rq) {
> > 	dl_rq->running_bw += dl_se->dl_bw;
> > }
> > 
> > am I missing something?  
> 
> I do not know... Maybe I am the one missing something :)
> I assumed that the compiler is smart enough to inline the function (and
> to avoid creating a local variable on the stack), but if there is
> agreement I can change the function in this way.
> 
> 

I agree with Daniel, especially since I don't usually trust the
compiler. And the added variable is more of a distraction as it doesn't
seem to have any real purpose.

-- Steve


* Re: [RFC v3 1/6] Track the active utilisation
  2016-10-25 13:58       ` Steven Rostedt
@ 2016-10-25 18:04         ` Luca Abeni
  2016-11-18 14:23         ` Peter Zijlstra
  1 sibling, 0 replies; 45+ messages in thread
From: Luca Abeni @ 2016-10-25 18:04 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Daniel Bristot de Oliveira, linux-kernel, Peter Zijlstra,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Tue, 25 Oct 2016 09:58:11 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 25 Oct 2016 11:29:16 +0200
> luca abeni <luca.abeni@unitn.it> wrote:
> 
> > Hi Daniel,
> > 
> > On Tue, 25 Oct 2016 11:09:52 +0200
> > Daniel Bristot de Oliveira <bristot@redhat.com> wrote:
> > [...]
> > > > +static void add_running_bw(struct sched_dl_entity *dl_se,
> > > > struct dl_rq *dl_rq) +{
> > > > +	u64 se_bw = dl_se->dl_bw;
> > > > +
> > > > +	dl_rq->running_bw += se_bw;
> > > > +}    
> > > 
> > > why not...
> > > 
> > > static *inline*
> > > void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq
> > > *dl_rq) {
> > > 	dl_rq->running_bw += dl_se->dl_bw;
> > > }
> > > 
> > > am I missing something?  
> > 
> > I do not know... Maybe I am the one missing something :)
> > I assumed that the compiler is smart enough to inline the function
> > (and to avoid creating a local variable on the stack), but if there
> > is agreement I can change the function in this way.
> > 
> > 
> 
> I agree with Daniel, especially since I don't usually trust the
> compiler. And the added variable is more of a distraction as it
> doesn't seem to have any real purpose.

Ok, then; I'll fix this in the next round of patches.


		Thanks,
			Luca


* Re: [RFC v3 1/6] Track the active utilisation
  2016-10-24 14:06 ` [RFC v3 1/6] Track the active utilisation Luca Abeni
  2016-10-25  9:09   ` Daniel Bristot de Oliveira
@ 2016-11-01 16:45   ` Juri Lelli
  2016-11-01 21:10     ` luca abeni
  2016-11-18 13:55   ` Peter Zijlstra
  2 siblings, 1 reply; 45+ messages in thread
From: Juri Lelli @ 2016-11-01 16:45 UTC (permalink / raw)
  To: Luca Abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

Hi,

a few nitpicks on subject and changelog and a couple of questions below.

Subject should be changed to something like

 sched/deadline: track the active utilisation

On 24/10/16 16:06, Luca Abeni wrote:
> The active utilisation here is defined as the total utilisation of the

s/The active/Active/
s/here//
s/of the active/of active/

> active (TASK_RUNNING) tasks queued on a runqueue. Hence, it is increased
> when a task wakes up and is decreased when a task blocks.
> 
> When a task is migrated from CPUi to CPUj, immediately subtract the task's
> utilisation from CPUi and add it to CPUj. This mechanism is implemented by
> modifying the pull and push functions.
> Note: this is not fully correct from the theoretical point of view
> (the utilisation should be removed from CPUi only at the 0 lag time),

a more theoretically sound solution will follow.

> but doing the right thing would be _MUCH_ more complex (leaving the
> timer armed when the task is on a different CPU... Inactive timers should
> be moved from per-task timers to per-runqueue lists of timers! Bah...)

I'd remove this paragraph above.

> 
> The utilisation tracking mechanism implemented in this commit can be
> fixed / improved by decreasing the active utilisation at the so-called
> "0-lag time" instead of when the task blocks.

And maybe this as well, or put it as more information about the "more
theoretically sound" solution?

> 
> Signed-off-by: Juri Lelli <juri.lelli@arm.com>
> Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
> ---
>  kernel/sched/deadline.c | 39 ++++++++++++++++++++++++++++++++++++++-
>  kernel/sched/sched.h    |  6 ++++++
>  2 files changed, 44 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 37e2449..3d95c1d 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -43,6 +43,22 @@ static inline int on_dl_rq(struct sched_dl_entity *dl_se)
>  	return !RB_EMPTY_NODE(&dl_se->rb_node);
>  }
>  
> +static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> +{
> +	u64 se_bw = dl_se->dl_bw;
> +
> +	dl_rq->running_bw += se_bw;
> +}
> +
> +static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> +{
> +	u64 se_bw = dl_se->dl_bw;
> +
> +	dl_rq->running_bw -= se_bw;
> +	if (WARN_ON(dl_rq->running_bw < 0))
> +		dl_rq->running_bw = 0;
> +}
> +
>  static inline int is_leftmost(struct task_struct *p, struct dl_rq *dl_rq)
>  {
>  	struct sched_dl_entity *dl_se = &p->dl;
> @@ -498,6 +514,8 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  	struct rq *rq = rq_of_dl_rq(dl_rq);
>  
> +	add_running_bw(dl_se, dl_rq);
> +
>  	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>  	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
>  		dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
> @@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
>  		return;
>  	}
>  
> +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> +		add_running_bw(&p->dl, &rq->dl);
> +
>  	/*
>  	 * If p is throttled, we do nothing. In fact, if it exhausted
>  	 * its budget it needs a replenishment and, since it now is on
>  	 * its rq, the bandwidth timer callback (which clearly has not
>  	 * run yet) will take care of this.
>  	 */
> -	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
> +	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
> +		add_running_bw(&p->dl, &rq->dl);

Don't remember if we discussed this already, but do we need to add the bw here
even if the task is not actually enqueued until after the replenishment timer
fires?

>  		return;
> +	}
>  
>  	enqueue_dl_entity(&p->dl, pi_se, flags);
>  
> @@ -972,6 +995,12 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
>  {
>  	update_curr_dl(rq);
>  	__dequeue_task_dl(rq, p, flags);
> +
> +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> +		sub_running_bw(&p->dl, &rq->dl);
> +
> +	if (flags & DEQUEUE_SLEEP)
> +		sub_running_bw(&p->dl, &rq->dl);
>  }
>  
>  /*
> @@ -1501,7 +1530,9 @@ static int push_dl_task(struct rq *rq)
>  	}
>  
>  	deactivate_task(rq, next_task, 0);
> +	sub_running_bw(&next_task->dl, &rq->dl);
>  	set_task_cpu(next_task, later_rq->cpu);
> +	add_running_bw(&next_task->dl, &later_rq->dl);
>  	activate_task(later_rq, next_task, 0);
>  	ret = 1;
>  
> @@ -1589,7 +1620,9 @@ static void pull_dl_task(struct rq *this_rq)
>  			resched = true;
>  
>  			deactivate_task(src_rq, p, 0);
> +			sub_running_bw(&p->dl, &src_rq->dl);
>  			set_task_cpu(p, this_cpu);
> +			add_running_bw(&p->dl, &this_rq->dl);
>  			activate_task(this_rq, p, 0);
>  			dmin = p->dl.deadline;
>  
> @@ -1695,6 +1728,9 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
>  	if (!start_dl_timer(p))
>  		__dl_clear_params(p);
>  
> +	if (task_on_rq_queued(p))
> +		sub_running_bw(&p->dl, &rq->dl);
> +
>  	/*
>  	 * Since this might be the only -deadline task on the rq,
>  	 * this is the right place to try to pull some other one
> @@ -1712,6 +1748,7 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
>   */
>  static void switched_to_dl(struct rq *rq, struct task_struct *p)
>  {
> +	add_running_bw(&p->dl, &rq->dl);
>  
>  	/* If p is not queued we will update its parameters at next wakeup. */
>  	if (!task_on_rq_queued(p))

Don't we also need to remove bw in task_dead_dl()?

Thanks,

- Juri


* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-10-24 14:06 ` [RFC v3 2/6] Improve the tracking of " Luca Abeni
@ 2016-11-01 16:46   ` Juri Lelli
  2016-11-01 21:46     ` luca abeni
  2016-11-02  2:41   ` luca abeni
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 45+ messages in thread
From: Juri Lelli @ 2016-11-01 16:46 UTC (permalink / raw)
  To: Luca Abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

Hi,

On 24/10/16 16:06, Luca Abeni wrote:
> This patch implements a more theoretically sound algorithm for
> thracking the active utilisation: instead of decreasing it when a

s/thracking/tracking/
s/the//

> task blocks, use a timer (the "inactive timer", named after the
> "Inactive" task state of the GRUB algorithm) to decrease the
> active utilisaation at the so called "0-lag time".

s/utilisaation/utilisation/

> 
> Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
> ---
>  include/linux/sched.h   |   1 +
>  kernel/sched/core.c     |   1 +
>  kernel/sched/deadline.c | 139 ++++++++++++++++++++++++++++++++++++++++++------
>  kernel/sched/sched.h    |   1 +
>  4 files changed, 126 insertions(+), 16 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 348f51b..22543c6 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1433,6 +1433,7 @@ struct sched_dl_entity {
>  	 * own bandwidth to be enforced, thus we need one timer per task.
>  	 */
>  	struct hrtimer dl_timer;
> +	struct hrtimer inactive_timer;
>  };
>  
>  union rcu_special {
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 94732d1..664c618 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2217,6 +2217,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
>  
>  	RB_CLEAR_NODE(&p->dl.rb_node);
>  	init_dl_task_timer(&p->dl);
> +	init_inactive_task_timer(&p->dl);
>  	__dl_clear_params(p);
>  
>  	INIT_LIST_HEAD(&p->rt.run_list);
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 3d95c1d..80d1541 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -47,6 +47,7 @@ static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
>  {
>  	u64 se_bw = dl_se->dl_bw;
>  
> +	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);

This and the one below go in 1/6.

>  	dl_rq->running_bw += se_bw;
>  }
>  
> @@ -54,11 +55,52 @@ static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
>  {
>  	u64 se_bw = dl_se->dl_bw;
>  
> +	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
>  	dl_rq->running_bw -= se_bw;
>  	if (WARN_ON(dl_rq->running_bw < 0))
>  		dl_rq->running_bw = 0;
>  }
>  
> +static void task_go_inactive(struct task_struct *p)
> +{
> +	struct sched_dl_entity *dl_se = &p->dl;
> +	struct hrtimer *timer = &dl_se->inactive_timer;
> +	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> +	struct rq *rq = rq_of_dl_rq(dl_rq);
> +	s64 zerolag_time;
> +
> +	WARN_ON(dl_se->dl_runtime == 0);
> +
> +	/* If the inactive timer is already armed, return immediately */
> +	if (hrtimer_active(&dl_se->inactive_timer))
> +		return;
> +
> +	zerolag_time = dl_se->deadline -
> +		 div64_long((dl_se->runtime * dl_se->dl_period),
> +			dl_se->dl_runtime);
> +
> +	/*
> +	 * Using relative times instead of the absolute "0-lag time"
> +	 * allows to simplify the code
> +	 */
> +	zerolag_time -= rq_clock(rq);
> +
> +	/*
> +	 * If the "0-lag time" already passed, decrease the active
> +	 * utilization now, instead of starting a timer
> +	 */
> +	if (zerolag_time < 0) {
> +		sub_running_bw(dl_se, dl_rq);
> +		if (!dl_task(p))
> +			__dl_clear_params(p);
> +
> +		return;
> +	}
> +
> +	get_task_struct(p);
> +	hrtimer_start(timer, ns_to_ktime(zerolag_time), HRTIMER_MODE_REL);
> +}
> +
>  static inline int is_leftmost(struct task_struct *p, struct dl_rq *dl_rq)
>  {
>  	struct sched_dl_entity *dl_se = &p->dl;
> @@ -514,7 +556,20 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  	struct rq *rq = rq_of_dl_rq(dl_rq);
>  
> -	add_running_bw(dl_se, dl_rq);
> +	if (hrtimer_is_queued(&dl_se->inactive_timer)) {
> +		hrtimer_try_to_cancel(&dl_se->inactive_timer);

Why we are OK with just trying to cancel the inactive timer?

> +		WARN_ON(dl_task_of(dl_se)->nr_cpus_allowed > 1);

What's wrong with nr_cpus_allowed > 1 tasks?

> +	} else {
> +		/*
> +		 * The "inactive timer" has been cancelled in
> +		 * select_task_rq_dl() (and the acvive utilisation has
> +		 * been decreased). So, increase the active utilisation.
> +		 * If select_task_rq_dl() could not cancel the timer,
> +		 * inactive_task_timer() will * find the task state as
> +		 * TASK_RUNNING, and will do nothing, so we are still safe.
> +		 */
> +		add_running_bw(dl_se, dl_rq);
> +	}
>  
>  	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>  	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
> @@ -602,14 +657,8 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
>  
>  	rq = task_rq_lock(p, &rf);
>  
> -	/*
> -	 * The task might have changed its scheduling policy to something
> -	 * different than SCHED_DEADLINE (through switched_fromd_dl()).
> -	 */
> -	if (!dl_task(p)) {
> -		__dl_clear_params(p);
> +	if (!dl_task(p))
>  		goto unlock;
> -	}
>  
>  	/*
>  	 * The task might have been boosted by someone else and might be in the
> @@ -796,6 +845,44 @@ static void update_curr_dl(struct rq *rq)
>  	}
>  }
>  
> +static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
> +{
> +	struct sched_dl_entity *dl_se = container_of(timer,
> +						     struct sched_dl_entity,
> +						     inactive_timer);
> +	struct task_struct *p = dl_task_of(dl_se);
> +	struct rq_flags rf;
> +	struct rq *rq;
> +
> +	rq = task_rq_lock(p, &rf);
> +
> +	if (!dl_task(p)) {
> +		__dl_clear_params(p);
> +
> +		goto unlock;
> +	}
> +	if (p->state == TASK_RUNNING)
> +		goto unlock;
> +
> +	sched_clock_tick();
> +	update_rq_clock(rq);
> +
> +	sub_running_bw(dl_se, &rq->dl);
> +unlock:
> +	task_rq_unlock(rq, p, &rf);
> +	put_task_struct(p);
> +
> +	return HRTIMER_NORESTART;
> +}
> +
> +void init_inactive_task_timer(struct sched_dl_entity *dl_se)
> +{
> +	struct hrtimer *timer = &dl_se->inactive_timer;
> +
> +	hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	timer->function = inactive_task_timer;
> +}
> +
>  #ifdef CONFIG_SMP
>  
>  static void inc_dl_deadline(struct dl_rq *dl_rq, u64 deadline)
> @@ -1000,7 +1087,7 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
>  		sub_running_bw(&p->dl, &rq->dl);
>  
>  	if (flags & DEQUEUE_SLEEP)
> -		sub_running_bw(&p->dl, &rq->dl);
> +		task_go_inactive(p);
>  }
>  
>  /*
> @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
>  	}
>  	rcu_read_unlock();
>  
> +	rq = task_rq(p);
> +	raw_spin_lock(&rq->lock);
> +	if (hrtimer_active(&p->dl.inactive_timer)) {
> +		sub_running_bw(&p->dl, &rq->dl);
> +		hrtimer_try_to_cancel(&p->dl.inactive_timer);

Can't we subtract twice if it happens that after we grabbed rq_lock the timer
fired, so it's now waiting for that lock and it goes ahead and sub_running_bw
again after we release the lock?

> +	}
> +	raw_spin_unlock(&rq->lock);
> +
>  out:
>  	return cpu;
>  }
> @@ -1244,6 +1339,11 @@ static void task_dead_dl(struct task_struct *p)
>  	/* XXX we should retain the bw until 0-lag */
>  	dl_b->total_bw -= p->dl.dl_bw;
>  	raw_spin_unlock_irq(&dl_b->lock);
> +	if (hrtimer_active(&p->dl.inactive_timer)) {
> +		raw_spin_lock_irq(&task_rq(p)->lock);
> +		sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));

Don't we still need to wait for the 0-lag? Or maybe since the task is dying we
can release its bw instantaneously? In this case I'd add a comment about it.

> +		raw_spin_unlock_irq(&task_rq(p)->lock);
> +	}
>  }
>  
>  static void set_curr_task_dl(struct rq *rq)
> @@ -1720,15 +1820,22 @@ void __init init_sched_dl_class(void)
>  static void switched_from_dl(struct rq *rq, struct task_struct *p)
>  {
>  	/*
> -	 * Start the deadline timer; if we switch back to dl before this we'll
> -	 * continue consuming our current CBS slice. If we stay outside of
> -	 * SCHED_DEADLINE until the deadline passes, the timer will reset the
> -	 * task.
> +	 * task_go_inactive() can start the "inactive timer" (if the 0-lag
> +	 * time is in the future). If the task switches back to dl before
> +	 * the "inactive timer" fires, it can continue to consume its current
> +	 * runtime using its current deadline. If it stays outside of
> +	 * SCHED_DEADLINE until the 0-lag time passes, inactive_task_timer()
> +	 * will reset the task parameters.
>  	 */
> -	if (!start_dl_timer(p))
> -		__dl_clear_params(p);
> +	if (task_on_rq_queued(p) && p->dl.dl_runtime)
> +		task_go_inactive(p);
>  
> -	if (task_on_rq_queued(p))
> +	/*
> +	 * We cannot use inactive_task_timer() to invoke sub_running_bw()
> +	 * at the 0-lag time, because the task could have been migrated
> +	 * while SCHED_OTHER in the meanwhile.

But, from a theoretical PoV, we very much should, right?
Is this taken care of in next patch?

> +	 */
> +	if (hrtimer_is_queued(&p->dl.inactive_timer))
>  		sub_running_bw(&p->dl, &rq->dl);
>  

Thanks,

- Juri


* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-01 16:45   ` Juri Lelli
@ 2016-11-01 21:10     ` luca abeni
  2016-11-08 17:56       ` Juri Lelli
  0 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-11-01 21:10 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

Hi Juri,

On Tue, 1 Nov 2016 16:45:43 +0000
Juri Lelli <juri.lelli@arm.com> wrote:

> Hi,
> 
> a few nitpicks on subject and changelog and a couple of questions below.
> 
> Subject should be changed to something like
> 
>  sched/deadline: track the active utilisation
Ok; that's easy :)
I guess a similar change should be applied to the subjects of all the
other patches, right?


> 
> On 24/10/16 16:06, Luca Abeni wrote:
> > The active utilisation here is defined as the total utilisation of the  
> 
> s/The active/Active/
> s/here//
> s/of the active/of active/
Ok; I'll do this in the next revision of the patchset.


> > active (TASK_RUNNING) tasks queued on a runqueue. Hence, it is increased
> > when a task wakes up and is decreased when a task blocks.
> > 
> > When a task is migrated from CPUi to CPUj, immediately subtract the task's
> > utilisation from CPUi and add it to CPUj. This mechanism is implemented by
> > modifying the pull and push functions.
> > Note: this is not fully correct from the theoretical point of view
> > (the utilisation should be removed from CPUi only at the 0 lag time),  
> 
> a more theoretically sound solution will follow.
Notice that even the next patch (introducing the "inactive timer") ends up
migrating the utilisation immediately (on tasks' migration), without waiting
for the 0-lag time.
This is because of the reason explained in the following paragraph:

> > but doing the right thing would be _MUCH_ more complex (leaving the
> > timer armed when the task is on a different CPU... Inactive timers should
> > be moved from per-task timers to per-runqueue lists of timers! Bah...)  
> 
> I'd remove this paragraph above.
Ok. Re-reading the changelog, I suspect this is not the correct place for this
comment.


> > The utilisation tracking mechanism implemented in this commit can be
> > fixed / improved by decreasing the active utilisation at the so-called
> > "0-lag time" instead of when the task blocks.  
> 
> And maybe this as well, or put it as more information about the "more
> theoretically sound" solution?
Ok... I can remove the paragraph, or point to the next commit (which
implements the more theoretically sound solution). Is such a "forward
reference" in changelogs ok?

[...]
> > @@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> >  		return;
> >  	}
> >  
> > +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> > +		add_running_bw(&p->dl, &rq->dl);
> > +
> >  	/*
> >  	 * If p is throttled, we do nothing. In fact, if it exhausted
> >  	 * its budget it needs a replenishment and, since it now is on
> >  	 * its rq, the bandwidth timer callback (which clearly has not
> >  	 * run yet) will take care of this.
> >  	 */
> > -	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
> > +	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
> > +		add_running_bw(&p->dl, &rq->dl);  
> 
> Don't remember if we discussed this already, but do we need to add the bw here
> even if the task is not actually enqueued until after the replenishment timer
> fires?
I think yes... The active utilization does not depend on the fact that the task
is on the runqueue or not, but depends on the task's state (in GRUB parlance,
"inactive" vs "active contending"). In other words, even when a task is throttled
its utilization must be counted in the active utilization.


[...]
> >  	/*
> >  	 * Since this might be the only -deadline task on the rq,
> >  	 * this is the right place to try to pull some other one
> > @@ -1712,6 +1748,7 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
> >   */
> >  static void switched_to_dl(struct rq *rq, struct task_struct *p)
> >  {
> > +	add_running_bw(&p->dl, &rq->dl);
> >  
> >  	/* If p is not queued we will update its parameters at next wakeup. */
> >  	if (!task_on_rq_queued(p))  
> 
> Don't we also need to remove bw in task_dead_dl()?
I think task_dead_dl() is invoked after invoking dequeue_task_dl(), which takes care
of this... Or am I wrong? (I think I explicitly tested this, and modifications to
task_dead_dl() turned out to be unneeded)



			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-01 16:46   ` Juri Lelli
@ 2016-11-01 21:46     ` luca abeni
  2016-11-02  2:35       ` luca abeni
  0 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-11-01 21:46 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

Hi Juri,

On Tue, 1 Nov 2016 16:46:04 +0000
Juri Lelli <juri.lelli@arm.com> wrote:
[...]
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index 3d95c1d..80d1541 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -47,6 +47,7 @@ static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> >  {
> >  	u64 se_bw = dl_se->dl_bw;
> >  
> > +	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);  
> 
> This and the one below go in 1/6.
Ok; I'll move it there


> 
> >  	dl_rq->running_bw += se_bw;
> >  }
> >  
> > @@ -54,11 +55,52 @@ static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> >  {
> >  	u64 se_bw = dl_se->dl_bw;
> >  
> > +	lockdep_assert_held(&(rq_of_dl_rq(dl_rq))->lock);
> >  	dl_rq->running_bw -= se_bw;
> >  	if (WARN_ON(dl_rq->running_bw < 0))
> >  		dl_rq->running_bw = 0;
> >  }
> >  
> > +static void task_go_inactive(struct task_struct *p)
> > +{
> > +	struct sched_dl_entity *dl_se = &p->dl;
> > +	struct hrtimer *timer = &dl_se->inactive_timer;
> > +	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> > +	struct rq *rq = rq_of_dl_rq(dl_rq);
> > +	s64 zerolag_time;
> > +
> > +	WARN_ON(dl_se->dl_runtime == 0);
> > +
> > +	/* If the inactive timer is already armed, return immediately */
> > +	if (hrtimer_active(&dl_se->inactive_timer))
> > +		return;
> > +
> > +	zerolag_time = dl_se->deadline -
> > +		 div64_long((dl_se->runtime * dl_se->dl_period),
> > +			dl_se->dl_runtime);
> > +
> > +	/*
> > +	 * Using relative times instead of the absolute "0-lag time"
> > +	 * allows to simplify the code
> > +	 */
> > +	zerolag_time -= rq_clock(rq);
> > +
> > +	/*
> > +	 * If the "0-lag time" already passed, decrease the active
> > +	 * utilization now, instead of starting a timer
> > +	 */
> > +	if (zerolag_time < 0) {
> > +		sub_running_bw(dl_se, dl_rq);
> > +		if (!dl_task(p))
> > +			__dl_clear_params(p);
> > +
> > +		return;
> > +	}
> > +
> > +	get_task_struct(p);
> > +	hrtimer_start(timer, ns_to_ktime(zerolag_time), HRTIMER_MODE_REL);
> > +}
> > +
> >  static inline int is_leftmost(struct task_struct *p, struct dl_rq *dl_rq)
> >  {
> >  	struct sched_dl_entity *dl_se = &p->dl;
> > @@ -514,7 +556,20 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
> >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> >  	struct rq *rq = rq_of_dl_rq(dl_rq);
> >  
> > -	add_running_bw(dl_se, dl_rq);
> > +	if (hrtimer_is_queued(&dl_se->inactive_timer)) {
> > +		hrtimer_try_to_cancel(&dl_se->inactive_timer);  
> 
> Why we are OK with just trying to cancel the inactive timer?
This is my understanding of things: if the timer cannot be
cancelled, it either:
1) Already decreased the active utilisation (and the timer handler is
   just finishing its execution). In this case, cancelling it or not
   does not make any difference
2) Still has to decrease the active utilisation (and is probably
   waiting for the runqueue lock). In this case, the timer handler
   will find the task RUNNING, and will not decrease the active
   utilisation (see the sketch below)
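
For reference, the guard that makes case 2 safe looks roughly like this
(a simplified sketch of the handler's logic, not the exact code from the
patch):

	static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
	{
		struct sched_dl_entity *dl_se = container_of(timer,
						struct sched_dl_entity,
						inactive_timer);
		struct task_struct *p = dl_task_of(dl_se);
		struct rq_flags rf;
		struct rq *rq = task_rq_lock(p, &rf);

		/* The task woke up again before the 0-lag time: nothing to do */
		if (p->state == TASK_RUNNING)
			goto unlock;

		sub_running_bw(dl_se, &rq->dl);
	unlock:
		task_rq_unlock(rq, p, &rf);
		put_task_struct(p);

		return HRTIMER_NORESTART;
	}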

> 
> > +		WARN_ON(dl_task_of(dl_se)->nr_cpus_allowed > 1);  
> 
> What's wrong with nr_cpus_allowed > 1 tasks?
If nr_cpus_allowed > 1, then select_task_rq_dl() ran before the task
was enqueued, and it already took care of cancelling the
inactive timer. I added this WARN_ON() to check for possible races
between the timer handler and select_task_rq / enqueue_task_dl...
I tried hard to trigger such a race, but the WARN_ON() never
fired (it only fired when I had tasks bound to one single
CPU - I tried this combination for testing purposes).

[...]
> > @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> >  	}
> >  	rcu_read_unlock();
> >  
> > +	rq = task_rq(p);
> > +	raw_spin_lock(&rq->lock);
> > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > +		sub_running_bw(&p->dl, &rq->dl);
> > +		hrtimer_try_to_cancel(&p->dl.inactive_timer);  
> 
> Can't we subtract twice if it happens that after we grabbed rq_lock the timer
> fired, so it's now waiting for that lock and it goes ahead and sub_running_bw
> again after we release the lock?
Uhm... I somehow convinced myself that this could not happen, but I do not
remember the details, sorry :(
I'll check again in the next few days (BTW, wouldn't this trigger the WARN_ON you
mentioned above?)


> 
> > +	}
> > +	raw_spin_unlock(&rq->lock);
> > +
> >  out:
> >  	return cpu;
> >  }
> > @@ -1244,6 +1339,11 @@ static void task_dead_dl(struct task_struct *p)
> >  	/* XXX we should retain the bw until 0-lag */
> >  	dl_b->total_bw -= p->dl.dl_bw;
> >  	raw_spin_unlock_irq(&dl_b->lock);
> > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > +		raw_spin_lock_irq(&task_rq(p)->lock);
> > +		sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));  
> 
> Don't we still need to wait for the 0-lag? Or maybe since the task is dying we
> can release it's bw instantaneously? In this case I'd add a comment about it.
Uhmm... I think you are right; I'll double-check this.


> >  static void set_curr_task_dl(struct rq *rq)
> > @@ -1720,15 +1820,22 @@ void __init init_sched_dl_class(void)
> >  static void switched_from_dl(struct rq *rq, struct task_struct *p)
> >  {
> >  	/*
> > -	 * Start the deadline timer; if we switch back to dl before this we'll
> > -	 * continue consuming our current CBS slice. If we stay outside of
> > -	 * SCHED_DEADLINE until the deadline passes, the timer will reset the
> > -	 * task.
> > +	 * task_go_inactive() can start the "inactive timer" (if the 0-lag
> > +	 * time is in the future). If the task switches back to dl before
> > +	 * the "inactive timer" fires, it can continue to consume its current
> > +	 * runtime using its current deadline. If it stays outside of
> > +	 * SCHED_DEADLINE until the 0-lag time passes, inactive_task_timer()
> > +	 * will reset the task parameters.
> >  	 */
> > -	if (!start_dl_timer(p))
> > -		__dl_clear_params(p);
> > +	if (task_on_rq_queued(p) && p->dl.dl_runtime)
> > +		task_go_inactive(p);
> >  
> > -	if (task_on_rq_queued(p))
> > +	/*
> > +	 * We cannot use inactive_task_timer() to invoke sub_running_bw()
> > +	 * at the 0-lag time, because the task could have been migrated
> > +	 * while SCHED_OTHER in the meanwhile.  
> 
> But, from a theoretical pov, we very much should, right?
Yes, we should.

> Is this taken care of in next patch?
No, I left this small divergence between theory and practice. The problem is
that the inactive timer handler looks at the task's runqueue, and decreases
the active utilization of such a runqueue. If the task is a SCHED_DEADLINE
task, this is not an issue, because the inactive timer is armed when the
task is not on a runqueue... When the task is inserted in a runqueue again,
the timer is cancelled. But when the task becomes SCHED_NORMAL, it can go
on a runqueue (and migrate to different runqueues) without cancelling the
inactive timer... So, when the inactive timer fires we have no guarantee
that the task's runqueue is still the one from which we need to subtract
the utilisation (I've actually seen this bug in action, before changing the
code in the way it is now).
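
In code terms, the handler does something like this (a simplified
sketch), which is where the assumption breaks for a task that migrated
while SCHED_OTHER:

	rq = task_rq_lock(p, &rf);
	/*
	 * rq is the runqueue p is on *now*; if p migrated while it was
	 * SCHED_OTHER, this is not the runqueue whose running_bw was
	 * increased when the timer was armed.
	 */
	sub_running_bw(&p->dl, &rq->dl);
	task_rq_unlock(rq, p, &rf);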

If this is not acceptable, I'll try to find another solution to this issue.




			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-01 21:46     ` luca abeni
@ 2016-11-02  2:35       ` luca abeni
  2016-11-10 10:04         ` Juri Lelli
  0 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-11-02  2:35 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On Tue, 1 Nov 2016 22:46:33 +0100
luca abeni <luca.abeni@unitn.it> wrote:
[...]
> > > @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> > >  	}
> > >  	rcu_read_unlock();
> > >  
> > > +	rq = task_rq(p);
> > > +	raw_spin_lock(&rq->lock);
> > > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > > +		sub_running_bw(&p->dl, &rq->dl);
> > > +		hrtimer_try_to_cancel(&p->dl.inactive_timer);    
> > 
> > Can't we subtract twice if it happens that after we grabbed rq_lock the timer
> > fired, so it's now waiting for that lock and it goes ahead and sub_running_bw
> > again after we release the lock?  
> Uhm... I somehow convinced myself that this could not happen, but I do not
> remember the details, sorry :(
I think I remember the answer now: pi_lock is acquired before invoking select_task_rq
and is released after invoking enqueue_task... So, if there is a pending inactive
timer, its handler will be executed after the task is enqueued... It will see the task
as RUNNING, and will not decrease the active utilisation.
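
The ordering I am relying on is roughly this (simplified from
try_to_wake_up(); the timer handler uses task_rq_lock(), which also
needs pi_lock):

	/* waker side */
	raw_spin_lock_irqsave(&p->pi_lock, flags);
	...
	cpu = select_task_rq(p, p->wake_cpu, SD_BALANCE_WAKE, wake_flags);
	...
	ttwu_queue(p, cpu, wake_flags);	/* enqueue_task_dl() runs here */
	raw_spin_unlock_irqrestore(&p->pi_lock, flags);

	/*
	 * timer side: task_rq_lock(p, ...) in the handler also takes
	 * p->pi_lock, so a pending handler can only run after the wakeup
	 * has enqueued p, and it will see the task as RUNNING.
	 */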

Does this make sense?



				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-10-24 14:06 ` [RFC v3 2/6] Improve the tracking of " Luca Abeni
  2016-11-01 16:46   ` Juri Lelli
@ 2016-11-02  2:41   ` luca abeni
  2016-11-18 15:36   ` Peter Zijlstra
  2016-11-18 15:47   ` Peter Zijlstra
  3 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-11-02  2:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Ingo Molnar, Juri Lelli, Claudio Scordino,
	Steven Rostedt

On Mon, 24 Oct 2016 16:06:34 +0200
Luca Abeni <luca.abeni@unitn.it> wrote:
[...]
> @@ -514,7 +556,20 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  	struct rq *rq = rq_of_dl_rq(dl_rq);
>  
> -	add_running_bw(dl_se, dl_rq);
> +	if (hrtimer_is_queued(&dl_se->inactive_timer)) {
> +		hrtimer_try_to_cancel(&dl_se->inactive_timer);
Replying to myself here: after re-reading this code again, I now think that
if hrtimer_try_to_cancel() does not fail I need a put_task_struct() to
compensate for the one that would have happened in the inactive timer handler
I just cancelled...
I do not know how I previously missed this (or maybe I just managed to
confuse myself now :)

I will check this next week.
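
Something like this, I think (an untested sketch of the fix I have in
mind):

	if (hrtimer_is_queued(&dl_se->inactive_timer)) {
		if (hrtimer_try_to_cancel(&dl_se->inactive_timer) == 1)
			put_task_struct(dl_task_of(dl_se));
	}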



				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-01 21:10     ` luca abeni
@ 2016-11-08 17:56       ` Juri Lelli
  2016-11-08 18:17         ` Luca Abeni
                           ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Juri Lelli @ 2016-11-08 17:56 UTC (permalink / raw)
  To: luca abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On 01/11/16 22:10, Luca Abeni wrote:
> Hi Juri,
> 
> On Tue, 1 Nov 2016 16:45:43 +0000
> Juri Lelli <juri.lelli@arm.com> wrote:
> 
> > Hi,
> > 
> > a few nitpicks on subject and changelog and a couple of questions below.
> > 
> > Subject should be changed to something like
> > 
> >  sched/deadline: track the active utilisation
> Ok; that's easy :)
> I guess a similar change should be applied to the subjects of all the
> other patches, right?
> 

Yep. Subjects usually have the form:

 <modified_file(s)>: <short title>

> 
> > 
> > On 24/10/16 16:06, Luca Abeni wrote:
> > > The active utilisation here is defined as the total utilisation of the  
> > 
> > s/The active/Active/
> > s/here//
> > s/of the active/of active/
> Ok; I'll do this in the next revision of the patchset.
> 

Thanks.

> 
> > > active (TASK_RUNNING) tasks queued on a runqueue. Hence, it is increased
> > > when a task wakes up and is decreased when a task blocks.
> > > 
> > > When a task is migrated from CPUi to CPUj, immediately subtract the task's
> > > utilisation from CPUi and add it to CPUj. This mechanism is implemented by
> > > modifying the pull and push functions.
> > > Note: this is not fully correct from the theoretical point of view
> > > (the utilisation should be removed from CPUi only at the 0 lag time),  
> > 
> > a more theoretically sound solution will follow.
> Notice that even the next patch (introducing the "inactive timer") ends up
> migrating the utilisation immediately (on tasks' migration), without waiting
> for the 0-lag time.
> This is because of the reason explained in the following paragraph:
> 

OK, but it is still _more_ theoretically sound. :)

> > > but doing the right thing would be _MUCH_ more complex (leaving the
> > > timer armed when the task is on a different CPU... Inactive timers should
> > > be moved from per-task timers to per-runqueue lists of timers! Bah...)  
> > 
> > I'd remove this paragraph above.
> Ok. Re-reading the changelog, I suspect this is not the correct place for this
> comment.
> 
> 
> > > The utilisation tracking mechanism implemented in this commit can be
> > > fixed / improved by decreasing the active utilisation at the so-called
> > > "0-lag time" instead of when the task blocks.  
> > 
> > And maybe this as well, or put it as more information about the "more
> > theoretically sound" solution?
> Ok... I can remove the paragraph, or point to the next commit (which
> implements the more theoretically sound solution). Is such a "forward
> reference" in changelogs ok?
> 

I'd just say that a better solution will follow. The details about why
it's better can then go in the changelog and in comments in the
code of the next patch.

> [...]
> > > @@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> > >  		return;
> > >  	}
> > >  
> > > +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> > > +		add_running_bw(&p->dl, &rq->dl);
> > > +
> > >  	/*
> > >  	 * If p is throttled, we do nothing. In fact, if it exhausted
> > >  	 * its budget it needs a replenishment and, since it now is on
> > >  	 * its rq, the bandwidth timer callback (which clearly has not
> > >  	 * run yet) will take care of this.
> > >  	 */
> > > -	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
> > > +	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
> > > +		add_running_bw(&p->dl, &rq->dl);  
> > 
> > Don't remember if we discussed this already, but do we need to add the bw here
> > even if the task is not actually enqueued until after the replenishment timer
> > fires?
> I think yes... The active utilization does not depend on the fact that the task
> is on the runqueue or not, but depends on the task's state (in GRUB parlance,
> "inactive" vs "active contending"). In other words, even when a task is throttled
> its utilization must be counted in the active utilization.
> 

OK. Could you add a comment about this point please (so that I don't
forget again :)?

> 
> [...]
> > >  	/*
> > >  	 * Since this might be the only -deadline task on the rq,
> > >  	 * this is the right place to try to pull some other one
> > > @@ -1712,6 +1748,7 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
> > >   */
> > >  static void switched_to_dl(struct rq *rq, struct task_struct *p)
> > >  {
> > > +	add_running_bw(&p->dl, &rq->dl);
> > >  
> > >  	/* If p is not queued we will update its parameters at next wakeup. */
> > >  	if (!task_on_rq_queued(p))  
> > 
> > Don't we also need to remove bw in task_dead_dl()?
> I think task_dead_dl() is invoked after invoking dequeue_task_dl(), which takes care
> of this... Or am I wrong? (I think I explicitly tested this, and modifications to
> task_dead_dl() turned out to be unneeded)
> 

Mmm. You explicitly check that TASK_ON_RQ_MIGRATING or DEQUEUE_SLEEP
(which btw can be actually put together with an or condition), so I
don't think that any of those turn out to be true when the task dies.
Also, AFAIU, do_exit() works on current and the TASK_DEAD case is
handled in finish_task_switch(), so I don't think we are taking care of
the "task is dying" condition.

Peter, does what I'm saying make any sense? :)

I still have to set up things here to test these patches (sorry, I was
travelling), but could you try to create some tasks and then kill them
from another shell to see if the accounting deviates or not? Or did you
already do this test?

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-08 17:56       ` Juri Lelli
@ 2016-11-08 18:17         ` Luca Abeni
  2016-11-08 18:53           ` Juri Lelli
  2016-11-09 16:29         ` luca abeni
  2016-11-18 14:55         ` Peter Zijlstra
  2 siblings, 1 reply; 45+ messages in thread
From: Luca Abeni @ 2016-11-08 18:17 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

Hi Juri,

On Tue, 8 Nov 2016 17:56:35 +0000
Juri Lelli <juri.lelli@arm.com> wrote:
[...]
> > > >  static void switched_to_dl(struct rq *rq, struct task_struct *p)
> > > >  {
> > > > +	add_running_bw(&p->dl, &rq->dl);
> > > >  
> > > >  	/* If p is not queued we will update its parameters at next wakeup. */
> > > >  	if (!task_on_rq_queued(p))  
> > > 
> > > Don't we also need to remove bw in task_dead_dl()?
> > I think task_dead_dl() is invoked after invoking dequeue_task_dl(),
> > which takes care of this... Or am I wrong? (I think I explicitly
> > tested this, and modifications to task_dead_dl() turned out to be
> > unneeded)
> > 
> 
> Mmm. You explicitly check that TASK_ON_RQ_MIGRATING or DEQUEUE_SLEEP
> (which btw can be actually put together with an or condition), so I
> don't think that any of those turn out to be true when the task dies.
I might be very wrong here, but I think do_exit() just does something
like
	tsk->state = TASK_DEAD;
and then invokes schedule(), and __schedule() does
        if (!preempt && prev->state) {
                if (unlikely(signal_pending_state(prev->state, prev))) {
                        prev->state = TASK_RUNNING;
                } else {
                        deactivate_task(rq, prev, DEQUEUE_SLEEP);
			[...]
so dequeue_task_dl() will see DEQUEUE_SLEEP... Or am I misunderstanding
what you are saying?

> Also, AFAIU, do_exit() works on current and the TASK_DEAD case is
> handled in finish_task_switch(), so I don't think we are taking care
> of the "task is dying" condition.
Ok, so I am missing something... The state is set to TASK_DEAD, and
then schedule() is called... So, __schedule() sees the dying task as
"prev" and invokes deactivate_task() with the DEQUEUE_SLEEP flag...
After that, finish_task_switch() calls task_dead_dl(). Is this wrong?
If not, why aren't we taking care of the "task is dying" condition?


> Peter, does what I'm saying make any sense? :)
> 
> I still have to set up things here to test these patches (sorry, I was
> travelling), but could you try to create some tasks and then kill them
> from another shell to see if the accounting deviates or not? Or did
> you already do this test?
I think this is one of the tests I tried... 
I have to check if I changed this code after the test (but I do not
think I did). Anyway, tomorrow I'll write a script for automating this
test, and I'll leave it running for some hours.



				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-08 18:17         ` Luca Abeni
@ 2016-11-08 18:53           ` Juri Lelli
  2016-11-08 19:09             ` Luca Abeni
  0 siblings, 1 reply; 45+ messages in thread
From: Juri Lelli @ 2016-11-08 18:53 UTC (permalink / raw)
  To: Luca Abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On 08/11/16 19:17, Luca Abeni wrote:
> Hi Juri,
> 
> On Tue, 8 Nov 2016 17:56:35 +0000
> Juri Lelli <juri.lelli@arm.com> wrote:
> [...]
> > > > >  static void switched_to_dl(struct rq *rq, struct task_struct *p)
> > > > >  {
> > > > > +	add_running_bw(&p->dl, &rq->dl);
> > > > >  
> > > > >  	/* If p is not queued we will update its parameters at next wakeup. */
> > > > >  	if (!task_on_rq_queued(p))  
> > > > 
> > > > Don't we also need to remove bw in task_dead_dl()?
> > > I think task_dead_dl() is invoked after invoking dequeue_task_dl(),
> > > which takes care of this... Or am I wrong? (I think I explicitly
> > > tested this, and modifications to task_dead_dl() turned out to be
> > > unneeded)
> > > 
> > 
> > Mmm. You explicitly check that TASK_ON_RQ_MIGRATING or DEQUEUE_SLEEP
> > (which btw can be actually put together with an or condition), so I
> > don't think that any of those turn out to be true when the task dies.
> I might be very wrong here, but I think do_exit() just does something
> like
> 	tsk->state = TASK_DEAD;
> and then invokes schedule(), and __schedule() does
>         if (!preempt && prev->state) {
>                 if (unlikely(signal_pending_state(prev->state, prev))) {
>                         prev->state = TASK_RUNNING;
>                 } else {
>                         deactivate_task(rq, prev, DEQUEUE_SLEEP);
> 			[...]
> so dequeue_task_dl() will see DEQUEUE_SLEEP... Or am I misunderstanding
> what you are saying?
> 
> > Also, AFAIU, do_exit() works on current and the TASK_DEAD case is
> > handled in finish_task_switch(), so I don't think we are taking care
> > of the "task is dying" condition.
> Ok, so I am missing something... The state is set to TASK_DEAD, and
> then schedule() is called... So, __schedule() sees the dying task as
> "prev" and invokes deactivate_task() with the DEQUEUE_SLEEP flag...
> After that, finish_task_switch() calls task_dead_dl(). Is this wrong?
> If not, why aren't we taking care of the "task is dying" condition?
> 

No, I think you are right. But, semantically this cleanup goes in
task_dead_dl(), IMHO. It's most probably moot if it complicates things,
but it might be helpful to differentiate the case between a task that is
actually going to sleep (and for which we want to activate the timer)
and a task that is dying (and for which we want to release bw
immediately). So, it actually matters for next patch, not here. But,
maybe we want to do things clean from start?

> 
> > Peter, does what I'm saying make any sense? :)
> > 
> > I still have to set up things here to test these patches (sorry, I was
> > travelling), but could you try to create some tasks and then kill them
> > from another shell to see if the accounting deviates or not? Or did
> > you already do this test?
> I think this is one of the tests I tried... 
> I have to check if I changed this code after the test (but I do not
> think I did). Anyway, tomorrow I'll write a script for automating this
> test, and I'll leave it running for some hours.
> 

OK, thanks. As said I think that you actually handle the case already,
but I'll try to setup testing as well soon.

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-08 18:53           ` Juri Lelli
@ 2016-11-08 19:09             ` Luca Abeni
  2016-11-08 20:02               ` Juri Lelli
  0 siblings, 1 reply; 45+ messages in thread
From: Luca Abeni @ 2016-11-08 19:09 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

Hi again,

On Tue, 8 Nov 2016 18:53:09 +0000
Juri Lelli <juri.lelli@arm.com> wrote:
[...]
> > > Also, AFAIU, do_exit() works on current and the TASK_DEAD case is
> > > handled in finish_task_switch(), so I don't think we are taking
> > > care of the "task is dying" condition.
> > Ok, so I am missing something... The state is set to TASK_DEAD, and
> > then schedule() is called... So, __schedule() sees the dying task as
> > "prev" and invokes deactivate_task() with the DEQUEUE_SLEEP flag...
> > After that, finish_task_switch() calls task_dead_dl(). Is this
> > wrong? If not, why aren't we taking care of the "task is dying"
> > condition?
> > 
> 
> No, I think you are right. But, semantically this cleanup goes in
> task_dead_dl(), IMHO.
Just to be sure I understand correctly: you suggest to add a check for
"state == TASK_DEAD" (skipping the cleanup if the condition is true) in
dequeue_task_dl(), and to add a sub_running_bw() in task_dead_dl()...
Is this understanding correct?
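
In code, I understand the suggestion as something like this (just a
hypothetical sketch, to make sure we are talking about the same thing):

	static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
	{
		update_curr_dl(rq);
		__dequeue_task_dl(rq, p, flags);

		if (p->on_rq == TASK_ON_RQ_MIGRATING)
			sub_running_bw(&p->dl, &rq->dl);

		/* Skip the cleanup here if the task is dying... */
		if ((flags & DEQUEUE_SLEEP) && p->state != TASK_DEAD)
			sub_running_bw(&p->dl, &rq->dl);
	}

	/* ...and release the bandwidth from task_dead_dl() instead */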

> It's most probably moot if it complicates
> things, but it might be helpful to differentiate the case between a
> task that is actually going to sleep (and for which we want to
> activate the timer) and a task that is dying (and for which we want
> to release bw immediately).
I suspect the two cases should be handled in the same way :)

> So, it actually matters for next patch,
> not here. But, maybe we want to do things clean from start?
You mean, because patch 2/6 adds
+       if (hrtimer_active(&p->dl.inactive_timer)) {
+               raw_spin_lock_irq(&task_rq(p)->lock);
+               sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));
+               raw_spin_unlock_irq(&task_rq(p)->lock);
+       }
in task_dead_dl()? I suspect this hunk is actually unneeded (worse, it
is wrong :). I am trying to remember why it is there, but I cannot find
any reason... In the next days, I'll run some tests to check if that
hunk is actually needed. If yes, then I'll modify patch 1/6 as you
suggest; if it is not needed, I'll remove it from patch 2/6 and I'll
not do this change to patch 1/6... Is this ok?



			Thanks,
				Luca


> 
> > 
> > > Peter, does what I'm saying make any sense? :)
> > > 
> > > I still have to set up things here to test these patches (sorry,
> > > I was travelling), but could you try to create some tasks and
> > > then kill them from another shell to see if the accounting
> > > deviates or not? Or did you already do this test?
> > I think this is one of the tests I tried... 
> > I have to check if I changed this code after the test (but I do not
> > think I did). Anyway, tomorrow I'll write a script for automating
> > this test, and I'll leave it running for some hours.
> > 
> 
> OK, thanks. As said I think that you actually handle the case already,
> but I'll try to setup testing as well soon.
> 
> Thanks,
> 
> - Juri

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-08 19:09             ` Luca Abeni
@ 2016-11-08 20:02               ` Juri Lelli
  2016-11-09 15:25                 ` luca abeni
  0 siblings, 1 reply; 45+ messages in thread
From: Juri Lelli @ 2016-11-08 20:02 UTC (permalink / raw)
  To: Luca Abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On 08/11/16 20:09, Luca Abeni wrote:
> Hi again,
> 
> On Tue, 8 Nov 2016 18:53:09 +0000
> Juri Lelli <juri.lelli@arm.com> wrote:
> [...]
> > > > Also, AFAIU, do_exit() works on current and the TASK_DEAD case is
> > > > handled in finish_task_switch(), so I don't think we are taking
> > > > care of the "task is dying" condition.
> > > Ok, so I am missing something... The state is set to TASK_DEAD, and
> > > then schedule() is called... So, __schedule() sees the dying task as
> > > "prev" and invokes deactivate_task() with the DEQUEUE_SLEEP flag...
> > > After that, finish_task_switch() calls task_dead_dl(). Is this
> > > wrong? If not, why aren't we taking care of the "task is dying"
> > > condition?
> > > 
> > 
> > No, I think you are right. But, semantically this cleanup goes in
> > task_dead_dl(), IMHO.
> Just to be sure I understand correctly: you suggest to add a check for
> "state == TASK_DEAD" (skipping the cleanup if the condition is true) in
> dequeue_task_dl(), and to add a sub_running_bw() in task_dead_dl()...
> Is this understanding correct?

This is more ugly, I know. It probably makes sense though if we then need it in
the next patch. But you are saying the opposite (we don't actually need it), so
in that case we might just want to add a comment here explaining why we handle
the "task is dying" case together with "task is going to sleep" (so that I
don't forget? :).

> 
> > It's most probably moot if it complicates
> > things, but it might be helpful to differentiate the case between a
> > task that is actually going to sleep (and for which we want to
> > activate the timer) and a task that is dying (and for which we want
> > to release bw immediately).
> I suspect the two cases should be handled in the same way :)
> 
> > So, it actually matters for next patch,
> > not here. But, maybe we want to do things clean from start?
> You mean, because patch 2/6 adds
> +       if (hrtimer_active(&p->dl.inactive_timer)) {
> +               raw_spin_lock_irq(&task_rq(p)->lock);
> +               sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));
> +               raw_spin_unlock_irq(&task_rq(p)->lock);
> +       }
> in task_dead_dl()? I suspect this hunk is actually unneeded (worse, it
> is wrong :). I am trying to remember why it is there, but I cannot find
> any reason... In the next days, I'll run some tests to check if that
> hunk is actually needed. If yes, then I'll modify patch 1/6 as you
> suggest; if it is not needed, I'll remove it from patch 2/6 and I'll
> not do this change to patch 1/6... Is this ok?
> 

I guess yes, if we don't need to differentiate. Maybe just add a comment as I
am saying above?

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-08 20:02               ` Juri Lelli
@ 2016-11-09 15:25                 ` luca abeni
  0 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-11-09 15:25 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On Tue, 8 Nov 2016 20:02:29 +0000
Juri Lelli <juri.lelli@arm.com> wrote:
[...]
> > > So, it actually matters for next patch,
> > > not here. But, maybe we want to do things clean from start?  
> > You mean, because patch 2/6 adds
> > +       if (hrtimer_active(&p->dl.inactive_timer)) {
> > +               raw_spin_lock_irq(&task_rq(p)->lock);
> > +               sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));
> > +               raw_spin_unlock_irq(&task_rq(p)->lock);
> > +       }
> > in task_dead_dl()? I suspect this hunk is actually unneeded (worse, it
> > is wrong :). I am trying to remember why it is there, but I cannot find
> > any reason... In the next days, I'll run some tests to check if that
> > hunk is actually needed. If yes, then I'll modify patch 1/6 as you
> > suggest; if it is not needed, I'll remove it from patch 2/6 and I'll
> > not do this change to patch 1/6... Is this ok?
> >   
> 
> I guess yes, if we don't need to differentiate.
Ok; so, I ran some tests (and I found some old notes of mine). The
modifications to task_dead_dl() mentioned above are not actually needed;
I added them as a preparation for a change needed by patch 3... But I
now think this was an error; I am reworking this part of the code
(removing changes from task_dead_dl() and adding a "p->state == TASK_DEAD"
check in the inactive timer handler).

I'll post an update for patches 2 and 3 in a few days, after I finish
some more tests.



				Luca

> Maybe just add a comment as I  am saying above?
> 
> Thanks,
> 
> - Juri

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-08 17:56       ` Juri Lelli
  2016-11-08 18:17         ` Luca Abeni
@ 2016-11-09 16:29         ` luca abeni
  2016-11-18 14:55         ` Peter Zijlstra
  2 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-11-09 16:29 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On Tue, 8 Nov 2016 17:56:35 +0000
Juri Lelli <juri.lelli@arm.com> wrote:
[...]
> > > > @@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> > > >  		return;
> > > >  	}
> > > >  
> > > > +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> > > > +		add_running_bw(&p->dl, &rq->dl);
> > > > +
> > > >  	/*
> > > >  	 * If p is throttled, we do nothing. In fact, if it exhausted
> > > >  	 * its budget it needs a replenishment and, since it now is on
> > > >  	 * its rq, the bandwidth timer callback (which clearly has not
> > > >  	 * run yet) will take care of this.
> > > >  	 */
> > > > -	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
> > > > +	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
> > > > +		add_running_bw(&p->dl, &rq->dl);    
> > > 
> > > Don't remember if we discussed this already, but do we need to add the bw here
> > > even if the task is not actually enqueued until after the replenishment timer
> > > fires?  
> > I think yes... The active utilization does not depend on the fact that the task
> > is on the runqueue or not, but depends on the task's state (in GRUB parlance,
> > "inactive" vs "active contending"). In other words, even when a task is throttled
> > its utilization must be counted in the active utilization.
> >   
> 
> OK. Could you add a comment about this point please (so that I don't
> forget again :)?
So, I just changed the comment to:

        /*
         * If p is throttled, we do not enqueue it. In fact, if it exhausted
         * its budget it needs a replenishment and, since it now is on
         * its rq, the bandwidth timer callback (which clearly has not
         * run yet) will take care of this.
         * However, the active utilization does not depend on the fact
         * that the task is on the runqueue or not (but depends on the
         * task's state - in GRUB parlance, "inactive" vs "active contending").
         * In other words, even if a task is throttled its utilization must
         * be counted in the active utilization; hence, we need to call
         * add_running_bw().
         */

Is this ok?


			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-02  2:35       ` luca abeni
@ 2016-11-10 10:04         ` Juri Lelli
  2016-11-10 11:56           ` Juri Lelli
  0 siblings, 1 reply; 45+ messages in thread
From: Juri Lelli @ 2016-11-10 10:04 UTC (permalink / raw)
  To: luca abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On 02/11/16 03:35, Luca Abeni wrote:
> On Tue, 1 Nov 2016 22:46:33 +0100
> luca abeni <luca.abeni@unitn.it> wrote:
> [...]
> > > > @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> > > >  	}
> > > >  	rcu_read_unlock();
> > > >  
> > > > +	rq = task_rq(p);
> > > > +	raw_spin_lock(&rq->lock);
> > > > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > > > +		sub_running_bw(&p->dl, &rq->dl);
> > > > +		hrtimer_try_to_cancel(&p->dl.inactive_timer);    
> > > 
> > > Can't we subtract twice if it happens that after we grabbed rq_lock the timer
> > > fired, so it's now waiting for that lock and it goes ahead and sub_running_bw
> > > again after we release the lock?  
> > Uhm... I somehow convinced myself that this could not happen, but I do not
> > remember the details, sorry :(
> I think I remember the answer now: pi_lock is acquired before invoking select_task_rq
> and is released after invoking enqueue_task... So, if there is a pending inactive
> timer, its handler will be executed after the task is enqueued... It will see the task
> as RUNNING, and will not decrease the active utilisation.
> 

Oh, because we do task_rq_lock() in inactive_task_timer(). So, that should
save us from the double subtract. Would you mind adding something along
the line of what you said above as a comment for next version?

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-10 10:04         ` Juri Lelli
@ 2016-11-10 11:56           ` Juri Lelli
  2016-11-10 12:15             ` luca abeni
  0 siblings, 1 reply; 45+ messages in thread
From: Juri Lelli @ 2016-11-10 11:56 UTC (permalink / raw)
  To: luca abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On 10/11/16 10:04, Juri Lelli wrote:
> On 02/11/16 03:35, Luca Abeni wrote:
> > On Tue, 1 Nov 2016 22:46:33 +0100
> > luca abeni <luca.abeni@unitn.it> wrote:
> > [...]
> > > > > @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> > > > >  	}
> > > > >  	rcu_read_unlock();
> > > > >  
> > > > > +	rq = task_rq(p);
> > > > > +	raw_spin_lock(&rq->lock);
> > > > > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > > > > +		sub_running_bw(&p->dl, &rq->dl);
> > > > > +		hrtimer_try_to_cancel(&p->dl.inactive_timer);    
> > > > 
> > > > Can't we subtract twice if it happens that after we grabbed rq_lock the timer
> > > > fired, so it's now waiting for that lock and it goes ahead and sub_running_bw
> > > > again after we release the lock?  
> > > Uhm... I somehow convinced myself that this could not happen, but I do not
> > > remember the details, sorry :(
> > I think I remember the answer now: pi_lock is acquired before invoking select_task_rq
> > and is released after invoking enqueue_task... So, if there is a pending inactive
> > timer, its handler will be executed after the task is enqueued... It will see the task
> > as RUNNING, and will not decrease the active utilisation.
> > 
> 
> Oh, because we do task_rq_lock() in inactive_task_timer(). So, that should
> save us from the double subtract. Would you mind adding something along
> the line of what you said above as a comment for next version?
> 

Mmm, wait again.

Cannot the following happen?

 - inactive_timer fires and does sub_running_bw (as the task is not
   RUNNING)
 - another cpu does try_to_wake_up and blocks on pi_lock
 - inactive timer releases both pi and rq locks (but is still executing,
   let's say it is doing put_task_struct())
 - try_to_wake_up goes ahead and calls select_task_rq_dl
   + it finds inactive_timer active
   + sub_running_bw again :(

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-10 11:56           ` Juri Lelli
@ 2016-11-10 12:15             ` luca abeni
  2016-11-10 12:34               ` Juri Lelli
  0 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-11-10 12:15 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On Thu, 10 Nov 2016 11:56:10 +0000
Juri Lelli <juri.lelli@arm.com> wrote:

> On 10/11/16 10:04, Juri Lelli wrote:
> > On 02/11/16 03:35, Luca Abeni wrote:  
> > > On Tue, 1 Nov 2016 22:46:33 +0100
> > > luca abeni <luca.abeni@unitn.it> wrote:
> > > [...]  
> > > > > > @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> > > > > >  	}
> > > > > >  	rcu_read_unlock();
> > > > > >  
> > > > > > +	rq = task_rq(p);
> > > > > > +	raw_spin_lock(&rq->lock);
> > > > > > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > > > > > +		sub_running_bw(&p->dl, &rq->dl);
> > > > > > +		hrtimer_try_to_cancel(&p->dl.inactive_timer);      
> > > > > 
> > > > > Can't we subtract twice if it happens that after we grabbed
> > > > > rq_lock the timer fired, so it's now waiting for that lock
> > > > > and it goes ahead and sub_running_bw again after we release
> > > > > the lock?    
> > > > Uhm... I somehow convinced myself that this could not happen,
> > > > but I do not remember the details, sorry :(  
> > > I think I remember the answer now: pi_lock is acquired before
> > > invoking select_task_rq and is released after invoking
> > > enqueue_task... So, if there is a pending inactive timer, its
> > > handler will be executed after the task is enqueued... It will
> > > see the task as RUNNING, and will not decrease the active
> > > utilisation. 
> > 
> > Oh, because we do task_rq_lock() inactive_task_timer(). So, that
> > should save us from the double subtract. Would you mind adding
> > something along the line of what you said above as a comment for
> > next version? 
> 
> Mmm, wait again.
> 
> Cannot the following happen?
> 
>  - inactive_timer fires and does sub_running_bw (as the task is not
>    RUNNING)
>  - another cpu does try_to_wake_up and blocks on pi_lock
>  - inactive timer releases both pi and rq locks (but is still
> executing, let's say it is doing put_task_struct())
>  - try_to_wake_up goes ahead and calls select_task_rq_dl
>    + it finds inactive_timer active
>    + sub_running_bw again :(
Uhm... Right; this can happen :(

Ok; I'll think about some possible solution for this race... If I do
not find any simple way to solve it, I'll add a "contending" flag,
which allows to know if the inactive timer handler already executed or
not.

BTW, talking about sched_dl_entity flags: I see there are three
different int fields "dl_throttled", "dl_boosted" and "dl_yielded"; any
reason for doing this instead of having a "dl_flags" field and setting
its different bits when the entity is throttled, boosted or yielded? In
other words: if I need this "contending" flag, should I add a new
"dl_contending" field?



			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-10 12:15             ` luca abeni
@ 2016-11-10 12:34               ` Juri Lelli
  2016-11-10 12:45                 ` luca abeni
  0 siblings, 1 reply; 45+ messages in thread
From: Juri Lelli @ 2016-11-10 12:34 UTC (permalink / raw)
  To: luca abeni
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On 10/11/16 13:15, Luca Abeni wrote:
> On Thu, 10 Nov 2016 11:56:10 +0000
> Juri Lelli <juri.lelli@arm.com> wrote:
> 
> > On 10/11/16 10:04, Juri Lelli wrote:
> > > On 02/11/16 03:35, Luca Abeni wrote:  
> > > > On Tue, 1 Nov 2016 22:46:33 +0100
> > > > luca abeni <luca.abeni@unitn.it> wrote:
> > > > [...]  
> > > > > > > @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> > > > > > >  	}
> > > > > > >  	rcu_read_unlock();
> > > > > > >  
> > > > > > > +	rq = task_rq(p);
> > > > > > > +	raw_spin_lock(&rq->lock);
> > > > > > > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > > > > > > +		sub_running_bw(&p->dl, &rq->dl);
> > > > > > > +		hrtimer_try_to_cancel(&p->dl.inactive_timer);      
> > > > > > 
> > > > > > Can't we subtract twice if it happens that after we grabbed
> > > > > > rq_lock the timer fired, so it's now waiting for that lock
> > > > > > and it goes ahead and sub_running_bw again after we release
> > > > > > the lock?    
> > > > > Uhm... I somehow convinced myself that this could not happen,
> > > > > but I do not remember the details, sorry :(  
> > > > I think I remember the answer now: pi_lock is acquired before
> > > > invoking select_task_rq and is released after invoking
> > > > enqueue_task... So, if there is a pending inactive timer, its
> > > > handler will be executed after the task is enqueued... It will
> > > > see the task as RUNNING, and will not decrease the active
> > > > utilisation. 
> > > 
> > > Oh, because we do task_rq_lock() in inactive_task_timer(). So, that
> > > should save us from the double subtract. Would you mind adding
> > > something along the line of what you said above as a comment for
> > > next version? 
> > 
> > Mmm, wait again.
> > 
> > Cannot the following happen?
> > 
> >  - inactive_timer fires and does sub_running_bw (as the task is not
> >    RUNNING)
> >  - another cpu does try_to_wake_up and blocks on pi_lock
> >  - inactive timer releases both pi and rq locks (but is still
> > executing, let's say it is doing put_task_struct())
> >  - try_to_wake_up goes ahead and calls select_task_rq_dl
> >    + it finds inactive_timer active
> >    + sub_running_bw again :(
> Uhm... Right; this can happen :(
> 

:(

> Ok; I'll think about some possible solution for this race... If I do
> not find any simple way to solve it, I'll add a "contending" flag,
> which allows to know if the inactive timer handler already executed or
> not.
> 

Right, this might help.

Another thing that I was thinking of is whether we can use the return
value of hrtimer_try_to_cancel() to decide what to do (see the sketch
after this list):

 - if it returns 0 it means that the callback executed or the timer was
   never set, so nothing to do (as in nothing to sub_running_bw from)
 - if it returns 1 we succeeded, so we need to actively sub_running_bw
 - if -1 we can assume that it will eventually do sub_running_bw() so we
   don't need to care explicitly
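
Something like this, just to sketch the idea (assuming it replaces the
hrtimer_active() check, and that the reference taken when arming the
timer has to be dropped when we cancel it):

	switch (hrtimer_try_to_cancel(&p->dl.inactive_timer)) {
	case 1:		/* timer was queued and we cancelled it: do its work */
		sub_running_bw(&p->dl, &rq->dl);
		put_task_struct(p);
		break;
	case 0:		/* never armed, or the callback already completed */
	case -1:	/* callback running: it will do sub_running_bw() itself */
		break;
	}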

Now I guess the problem is that the task can be migrated while its
inactive_timer is set (by select_task_rq_dl or by other classes' load
balancing if setscheduled to a different class). Can't we store a back
reference to the rq from which the inactive_timer was queued and use
that to sub_running_bw() from? It seems that we might end up with some
"shadow" bandwidth, say when we do a wakeup migration, but maybe this is
something we can tolerate? Just thinking aloud. :)

> BTW, talking about sched_dl_entity flags: I see there are three
> different int fields "dl_throttled, "dl_boosted" and "dl_yielded"; any
> reason for doing this instead of having a "dl_flags" field and setting
> its different bits when the entity is throttled, boosted or yielded? In
> other words: if I need this "contending" flag, should I add a new
> "dl_contending" field?
> 

I think you might want to add a clean-up patch to your series (or a
separate one) fixing the current situation, and the build on to adding
the new flag if needed.

Thanks,

- Juri

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-10 12:34               ` Juri Lelli
@ 2016-11-10 12:45                 ` luca abeni
  0 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-11-10 12:45 UTC (permalink / raw)
  To: Juri Lelli
  Cc: linux-kernel, Peter Zijlstra, Ingo Molnar, Claudio Scordino,
	Steven Rostedt

On Thu, 10 Nov 2016 12:34:15 +0000
Juri Lelli <juri.lelli@arm.com> wrote:
[...]
> > Ok; I'll think about some possible solution for this race... If I do
> > not find any simple way to solve it, I'll add a "contending" flag,
> > which allows to know if the inactive timer handler already executed
> > or not.
> >   
> 
> Right, this might help.
> 
> Another thing that I was thinking of is whether we can use the return
> value of hrtimer_try_to_cancel() to decide what to do:
> 
>  - if it returns 0 it means that the callback executed or the timer
> was never set, so nothing to do (as in nothing to sub_running_bw from)
>  - if it returns 1 we succeeded, so we need to actively sub_running_bw
>  - if -1 we can assume that it will eventually do sub_running_bw() so
> we don't need to care explicitly
This is about what I did in the initial version of the patch, but I
found some problems... I'll try to have another look at this approach.


> Now I guess the problem is that the task can be migrated while its
> inactive_timer is set (by select_task_rq_dl or by other classes' load
> balancing if setscheduled to a different class). Can't we store a back
> reference to the rq from which the inactive_timer was queued and use
> that to sub_running_bw() from? It seems that we might end up with some
> "shadow" bandwidth, say when we do a wakeup migration, but maybe this
> is something we can tolerate? Just thinking aloud. :)
I'll think about this...


> > BTW, talking about sched_dl_entity flags: I see there are three
> > different int fields "dl_throttled", "dl_boosted" and "dl_yielded";
> > any reason for doing this instead of having a "dl_flags" field and
> > setting its different bits when the entity is throttled, boosted or
> > yielded? In other words: if I need this "contending" flag, should I
> > add a new "dl_contending" field?
> >   
> 
> I think you might want to add a clean-up patch to your series (or a
> separate one) fixing the current situation, and the build on to adding
> the new flag if needed.
Ok, can do this. I just wanted to know if there is a reason for having
different "dl_*" fields instead of a "flags" field (I do not know,
maybe performance?) or if it is just an historical accident and this
can be changed.



			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-10-24 14:06 ` [RFC v3 1/6] Track the active utilisation Luca Abeni
  2016-10-25  9:09   ` Daniel Bristot de Oliveira
  2016-11-01 16:45   ` Juri Lelli
@ 2016-11-18 13:55   ` Peter Zijlstra
  2016-11-18 15:06     ` luca abeni
  2 siblings, 1 reply; 45+ messages in thread
From: Peter Zijlstra @ 2016-11-18 13:55 UTC (permalink / raw)
  To: Luca Abeni
  Cc: linux-kernel, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt

On Mon, Oct 24, 2016 at 04:06:33PM +0200, Luca Abeni wrote:

> @@ -498,6 +514,8 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  	struct rq *rq = rq_of_dl_rq(dl_rq);
>  
> +	add_running_bw(dl_se, dl_rq);
> +
>  	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>  	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
>  		dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
> @@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
>  		return;
>  	}
>  
> +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> +		add_running_bw(&p->dl, &rq->dl);
> +
>  	/*
>  	 * If p is throttled, we do nothing. In fact, if it exhausted
>  	 * its budget it needs a replenishment and, since it now is on
>  	 * its rq, the bandwidth timer callback (which clearly has not
>  	 * run yet) will take care of this.
>  	 */
> -	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
> +	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
> +		add_running_bw(&p->dl, &rq->dl);
>  		return;
> +	}
>  
>  	enqueue_dl_entity(&p->dl, pi_se, flags);
>  

I realize the enqueue path is a bit of a maze, but this hurts my head.

Isn't there anything we can do to streamline this a bit?

Maybe move the add_running_bw() from update_dl_entity() to the
ENQUEUE_WAKEUP branch in enqueue_dl_entity()? Because that's what you
really want, isn't it? It's not actually related to recomputing the
absolute deadline.
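
Something like this (just to illustrate the placement, assuming the
current shape of enqueue_dl_entity()):

	static void
	enqueue_dl_entity(struct sched_dl_entity *dl_se,
			  struct sched_dl_entity *pi_se, int flags)
	{
		BUG_ON(on_dl_rq(dl_se));

		if (flags & ENQUEUE_WAKEUP) {
			add_running_bw(dl_se, dl_rq_of_se(dl_se));	/* moved here */
			update_dl_entity(dl_se, pi_se);
		} else if (flags & ENQUEUE_REPLENISH) {
			replenish_dl_entity(dl_se, pi_se);
		}

		__enqueue_dl_entity(dl_se);
	}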

> @@ -972,6 +995,12 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
>  {
>  	update_curr_dl(rq);
>  	__dequeue_task_dl(rq, p, flags);
> +
> +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> +		sub_running_bw(&p->dl, &rq->dl);
> +
> +	if (flags & DEQUEUE_SLEEP)
> +		sub_running_bw(&p->dl, &rq->dl);
>  }

We could look at adding more enqueue/dequeue flags to streamline this a
bit, but let's not do that now.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-10-25 13:58       ` Steven Rostedt
  2016-10-25 18:04         ` Luca Abeni
@ 2016-11-18 14:23         ` Peter Zijlstra
  2016-11-18 15:10           ` luca abeni
                             ` (2 more replies)
  1 sibling, 3 replies; 45+ messages in thread
From: Peter Zijlstra @ 2016-11-18 14:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: luca abeni, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Tue, Oct 25, 2016 at 09:58:11AM -0400, Steven Rostedt wrote:

> I agree with Daniel, especially since I don't usually trust the
> compiler. And the added variable is more of a distraction as it doesn't
> seem to have any real purpose.

I don't think there's anything here to trust the compiler on. Either
it inlines or it doesn't, it should generate 'correct' code either way.

If it doesn't inline, it's a dumb compiler and it will make these dumb
decisions throughout the tree and your kernel will be slow, not my
problem really ;-)

That said, I agree that the single line thing is actually easier to
read.

That said; there's something to be said for:

	u64 running_bw;

static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
{
	u64 old = dl_rq->running_bw;

	dl_rq->running_bw += dl_se->dl_bw;
	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
}

static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
{
	u64 old = dl_rq->running_bw;

	dl_rq->running_bw -= dl_se->dl_bw;
	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
}

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-08 17:56       ` Juri Lelli
  2016-11-08 18:17         ` Luca Abeni
  2016-11-09 16:29         ` luca abeni
@ 2016-11-18 14:55         ` Peter Zijlstra
  2 siblings, 0 replies; 45+ messages in thread
From: Peter Zijlstra @ 2016-11-18 14:55 UTC (permalink / raw)
  To: Juri Lelli
  Cc: luca abeni, linux-kernel, Ingo Molnar, Claudio Scordino, Steven Rostedt

On Tue, Nov 08, 2016 at 05:56:35PM +0000, Juri Lelli wrote:
> Mmm. You explicitly check that TASK_ON_RQ_MIGRATING or DEQUEUE_SLEEP
> (which btw can be actually put together with an or condition), so I
> don't think that any of those turn out to be true when the task dies.
> Also, AFAIU, do_exit() works on current and the TASK_DEAD case is
> handled in finish_task_switch(), so I don't think we are taking care of
> the "task is dying" condition.
> 
> Peter, does what I'm saying make any sense? :)

do_task_dead():
	__set_current_state(TASK_DEAD);
	schedule():
		if (prev->state)
			deactivate_task(DEQUEUE_SLEEP);

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-18 13:55   ` Peter Zijlstra
@ 2016-11-18 15:06     ` luca abeni
  0 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-11-18 15:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt

Hi Peter,

On Fri, 18 Nov 2016 14:55:54 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Oct 24, 2016 at 04:06:33PM +0200, Luca Abeni wrote:
> 
> > @@ -498,6 +514,8 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
> >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> >  	struct rq *rq = rq_of_dl_rq(dl_rq);
> >  
> > +	add_running_bw(dl_se, dl_rq);
> > +
> >  	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
> >  	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {
> >  		dl_se->deadline = rq_clock(rq) + pi_se->dl_deadline;
> > @@ -947,14 +965,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
> >  		return;
> >  	}
> >  
> > +	if (p->on_rq == TASK_ON_RQ_MIGRATING)
> > +		add_running_bw(&p->dl, &rq->dl);
> > +
> >  	/*
> >  	 * If p is throttled, we do nothing. In fact, if it exhausted
> >  	 * its budget it needs a replenishment and, since it now is on
> >  	 * its rq, the bandwidth timer callback (which clearly has not
> >  	 * run yet) will take care of this.
> >  	 */
> > -	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH))
> > +	if (p->dl.dl_throttled && !(flags & ENQUEUE_REPLENISH)) {
> > +		add_running_bw(&p->dl, &rq->dl);
> >  		return;
> > +	}
> >  
> >  	enqueue_dl_entity(&p->dl, pi_se, flags);
> >    
> 
> I realize the enqueue path is a bit of a maze, but this hurts my head.
> 
> Isn't there anything we can do to streamline this a bit?
> 
> Maybe move the add_running_bw() from update_dl_entity() to the
> ENQUEUE_WAKEUP branch in enqueue_dl_entity()? Because that's what you
> really want, isn't it? Its not actually related to recomputing the
> absolute deadline.

Right... I'll change the code accordingly. I am currently reworking
this code (after Juri's review I realized that there is an issue - see
the last email exchange with Juri), so I'll make this change as soon as
the code stabilizes.
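
For reference, this is roughly what I understand the change to look
like (only a sketch on top of the current enqueue_dl_entity(); the
exact placement still has to survive the fix mentioned above):

static void
enqueue_dl_entity(struct sched_dl_entity *dl_se,
		  struct sched_dl_entity *pi_se, int flags)
{
	BUG_ON(on_dl_rq(dl_se));

	if (flags & ENQUEUE_WAKEUP) {
		/*
		 * The entity becomes "active" on wakeup: account its
		 * bandwidth here instead of in update_dl_entity().
		 */
		add_running_bw(dl_se, dl_rq_of_se(dl_se));
		update_dl_entity(dl_se, pi_se);
	} else if (flags & ENQUEUE_REPLENISH) {
		replenish_dl_entity(dl_se, pi_se);
	}

	__enqueue_dl_entity(dl_se);
}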


		Thanks,
			Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-18 14:23         ` Peter Zijlstra
@ 2016-11-18 15:10           ` luca abeni
  2016-11-18 15:28             ` Peter Zijlstra
  2016-11-18 16:42           ` Steven Rostedt
  2016-12-05 22:30           ` luca abeni
  2 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-11-18 15:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Fri, 18 Nov 2016 15:23:59 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Tue, Oct 25, 2016 at 09:58:11AM -0400, Steven Rostedt wrote:
> 
> > I agree with Daniel, especially since I don't usually trust the
> > compiler. And the added variable is more of a distraction as it
> > doesn't seem to have any real purpose.  
> 
> I don't think there's anything here to trust the compiler on. Either
> it inlines or it doesn't, it should generate 'correct' code either
> way.
> 
> If it doesn't inline, it's a dumb compiler and it will make these dumb
> decisions throughout the tree and your kernel will be slow, not my
> problem really ;-)
> 
> That said, I agree that the single line thing is actually easier to
> read.

Ok; I already made that change locally.


> 
> That said; there's something to be said for:
> 
> 	u64 running_bw;

Ok; I originally made it signed because I wanted the "running_bw < 0"
check, but I can change it to "dl_rq->running_bw > old" (I did not
think about it).

I'll make these changes locally ASAP.


			Thanks,
				Luca

> 
> static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> {
> 	u64 old = dl_rq->running_bw;
> 
> 	dl_rq->running_bw += dl_se->dl_bw;
> 	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
> }
> 
> static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> {
> 	u64 old = dl_rq->running_bw;
> 
> 	dl_rq->running_bw -= dl_se->dl_bw;
> 	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
> }
> 
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-18 15:10           ` luca abeni
@ 2016-11-18 15:28             ` Peter Zijlstra
  0 siblings, 0 replies; 45+ messages in thread
From: Peter Zijlstra @ 2016-11-18 15:28 UTC (permalink / raw)
  To: luca abeni
  Cc: Steven Rostedt, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Fri, Nov 18, 2016 at 04:10:17PM +0100, luca abeni wrote:

> Ok; I originally made it signed because I wanted the "running_bw < 0"
> check, but I can change it to "dl_rq->running_bw > old" (I did not
> think about it).

Happy accident of me staring at unsigned over/under-flow for the past
several days :-)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-10-24 14:06 ` [RFC v3 2/6] Improve the tracking of " Luca Abeni
  2016-11-01 16:46   ` Juri Lelli
  2016-11-02  2:41   ` luca abeni
@ 2016-11-18 15:36   ` Peter Zijlstra
  2016-11-18 15:56     ` luca abeni
  2016-11-18 15:47   ` Peter Zijlstra
  3 siblings, 1 reply; 45+ messages in thread
From: Peter Zijlstra @ 2016-11-18 15:36 UTC (permalink / raw)
  To: Luca Abeni
  Cc: linux-kernel, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt

On Mon, Oct 24, 2016 at 04:06:34PM +0200, Luca Abeni wrote:
> @@ -514,7 +556,20 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
>  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
>  	struct rq *rq = rq_of_dl_rq(dl_rq);
>  
> +	if (hrtimer_is_queued(&dl_se->inactive_timer)) {
> +		hrtimer_try_to_cancel(&dl_se->inactive_timer);
> +		WARN_ON(dl_task_of(dl_se)->nr_cpus_allowed > 1);

Isn't that always so? That is, DL tasks cannot be but 'global', right?

Also, you could use the return value of hrtimer_try_to_cancel() to
determine hrtimer_is_queued() I suppose.

> +	} else {
> +		/*
> +		 * The "inactive timer" has been cancelled in
> +		 * select_task_rq_dl() (and the acvive utilisation has
> +		 * been decreased). So, increase the active utilisation.
> +		 * If select_task_rq_dl() could not cancel the timer,
> +		 * inactive_task_timer() will * find the task state as
                                             ^^^
superfluous '*'?

> +		 * TASK_RUNNING, and will do nothing, so we are still safe.
> +		 */
> +		add_running_bw(dl_se, dl_rq);
> +	}
>  
>  	if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
>  	    dl_entity_overflow(dl_se, pi_se, rq_clock(rq))) {

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-10-24 14:06 ` [RFC v3 2/6] Improve the tracking of " Luca Abeni
                     ` (2 preceding siblings ...)
  2016-11-18 15:36   ` Peter Zijlstra
@ 2016-11-18 15:47   ` Peter Zijlstra
  2016-11-18 16:06     ` luca abeni
  3 siblings, 1 reply; 45+ messages in thread
From: Peter Zijlstra @ 2016-11-18 15:47 UTC (permalink / raw)
  To: Luca Abeni
  Cc: linux-kernel, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt

On Mon, Oct 24, 2016 at 04:06:34PM +0200, Luca Abeni wrote:
> @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
>  	}
>  	rcu_read_unlock();
>  
> +	rq = task_rq(p);
> +	raw_spin_lock(&rq->lock);
> +	if (hrtimer_active(&p->dl.inactive_timer)) {
> +		sub_running_bw(&p->dl, &rq->dl);
> +		hrtimer_try_to_cancel(&p->dl.inactive_timer);
> +	}
> +	raw_spin_unlock(&rq->lock);

It's a bit sad having to take rq->lock here...

Also, what happens when hrtimer_try_to_cancel() fails?

> +
>  out:
>  	return cpu;
>  }

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-18 15:36   ` Peter Zijlstra
@ 2016-11-18 15:56     ` luca abeni
  0 siblings, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-11-18 15:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt

On Fri, 18 Nov 2016 16:36:15 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Oct 24, 2016 at 04:06:34PM +0200, Luca Abeni wrote:
> > @@ -514,7 +556,20 @@ static void update_dl_entity(struct sched_dl_entity *dl_se,
> >  	struct dl_rq *dl_rq = dl_rq_of_se(dl_se);
> >  	struct rq *rq = rq_of_dl_rq(dl_rq);
> >  
> > +	if (hrtimer_is_queued(&dl_se->inactive_timer)) {
> > +		hrtimer_try_to_cancel(&dl_se->inactive_timer);
> > +		WARN_ON(dl_task_of(dl_se)->nr_cpus_allowed > 1);  
> 
> Isn't that always so? That is, DL tasks cannot be but 'global', right?
Well, if I understand correctly, in general (that is, if admission
control is enabled) nr_cpus_allowed is equal to the number of CPUs in
the cpuset...
This is generally > 1 (and in this case select_task_rq_dl() is invoked
first, and tries to cancel the timer - so I think the timer cannot be
queued), or can be = 1 if we do partitioned scheduling (cpusets
containing only 1 CPU, or disabled admission control). If
nr_cpus_allowed is 1, then select_task_rq_dl() is not invoked, so the
timer can be queued.
In some of my tests I used partitioned scheduling; in some other tests
I disabled admission control to mix tasks with different affinities, so
I made the warning conditional on the number of CPUs being > 1.


> Also, you could use the return value of hrtimer_try_to_cancel() to
> determine hrtimer_is_queued() I suppose.

Ah, ok... I was under the impression that
"if (hrtimer_is_queued()) hrtimer_try_to_cancel()"
has less overhead than a plain "hrtimer_try_to_cancel()", but this was
just an uneducated guess... I'll change the code to avoid the check on
hrtimer_is_queued().
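
Something like this, I guess (just a sketch; it relies on
hrtimer_try_to_cancel() returning 1 only when it cancelled a queued
timer, 0 when the timer was not active, and -1 when the callback is
currently running):

	if (hrtimer_try_to_cancel(&dl_se->inactive_timer) == 1)
		WARN_ON(dl_task_of(dl_se)->nr_cpus_allowed > 1);
	else
		add_running_bw(dl_se, dl_rq);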

> 
> > +	} else {
> > +		/*
> > +		 * The "inactive timer" has been cancelled in
> > +		 * select_task_rq_dl() (and the acvive utilisation
> > has
> > +		 * been decreased). So, increase the active
> > utilisation.
> > +		 * If select_task_rq_dl() could not cancel the
> > timer,
> > +		 * inactive_task_timer() will * find the task
> > state as  
>                                              ^^^
> superfluous '*'?

Yes, sorry... Something went wrong when I re-indented the comment :(


			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-18 15:47   ` Peter Zijlstra
@ 2016-11-18 16:06     ` luca abeni
  2016-11-18 18:49       ` Peter Zijlstra
  0 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-11-18 16:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt

On Fri, 18 Nov 2016 16:47:48 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Oct 24, 2016 at 04:06:34PM +0200, Luca Abeni wrote:
> > @@ -1074,6 +1161,14 @@ select_task_rq_dl(struct task_struct *p, int cpu, int sd_flag, int flags)
> >  	}
> >  	rcu_read_unlock();
> >  
> > +	rq = task_rq(p);
> > +	raw_spin_lock(&rq->lock);
> > +	if (hrtimer_active(&p->dl.inactive_timer)) {
> > +		sub_running_bw(&p->dl, &rq->dl);
> > +		hrtimer_try_to_cancel(&p->dl.inactive_timer);
> > +	}
> > +	raw_spin_unlock(&rq->lock);  
> 
> It's a bit sad having to take rq->lock here...

I think I can move the locking inside the if() (so that rq->lock is not
taken if the inactive timer is not active); apart from this, the only
solution I can think of is to modify select_task_rq_dl() not to
change the cpu if the timer is active... (I think the task will be
migrated by a following push() if needed). What do you think? Any other
solution I am not seeing?


> Also, what happens when hrtimer_try_to_cancel() fails?

This is something I am working on... My original idea was that nothing
bad happens, because the timer handler will see the task as RUNNING and
will not decrease the running bw... But this is wrong.
My new idea is to add a "dl_contending" flag in the scheduling entity,
that indicates if the running bw has already been subtracted or not.
With this, the issue should be solved (if anyone sees additional
issues, or a better solution that does not require an additional flag,
let me know).

BTW, this code is also missing a put_task_struct() for the case in which
hrtimer_try_to_cancel() does not fail :(
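
Roughly, this is what I have in mind for select_task_rq_dl() (only a
sketch; the flag name, its exact semantics and the locking are not
final yet):

	rq = task_rq(p);
	raw_spin_lock(&rq->lock);
	if (p->dl.dl_contending) {
		/* The bandwidth is still accounted on the old rq. */
		sub_running_bw(&p->dl, &rq->dl);
		p->dl.dl_contending = 0;
		/*
		 * If we cancelled a queued inactive timer, drop the
		 * reference taken when it was armed; if the cancel
		 * fails, inactive_task_timer() will see the cleared
		 * flag and only drop its own reference.
		 */
		if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
			put_task_struct(p);
	}
	raw_spin_unlock(&rq->lock);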


			Thanks,
				Luca
> 
> > +
> >  out:
> >  	return cpu;
> >  }  

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-18 14:23         ` Peter Zijlstra
  2016-11-18 15:10           ` luca abeni
@ 2016-11-18 16:42           ` Steven Rostedt
  2016-12-05 22:30           ` luca abeni
  2 siblings, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2016-11-18 16:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: luca abeni, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Fri, 18 Nov 2016 15:23:59 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

 
> That said; there's something to be said for:
> 
> 	u64 running_bw;
> 
> static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> {
> 	u64 old = dl_rq->running_bw;
> 
> 	dl_rq->running_bw += dl_se->dl_bw;
> 	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
> }
> 
> static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> {
> 	u64 old = dl_rq->running_bw;
> 
> 	dl_rq->running_bw -= dl_se->dl_bw;
> 	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
> }
> 

Acked-by: me

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 2/6] Improve the tracking of active utilisation
  2016-11-18 16:06     ` luca abeni
@ 2016-11-18 18:49       ` Peter Zijlstra
  0 siblings, 0 replies; 45+ messages in thread
From: Peter Zijlstra @ 2016-11-18 18:49 UTC (permalink / raw)
  To: luca abeni
  Cc: linux-kernel, Ingo Molnar, Juri Lelli, Claudio Scordino, Steven Rostedt

On Fri, Nov 18, 2016 at 05:06:50PM +0100, luca abeni wrote:
> 
> > Also, what happens when hrtimer_try_to_cancel() fails?
> 
> This is something I am working on... My original idea was that nothing
> bad happens, because the timer handler will see the task as RUNNING and
> will not decrease the running bw... But this is wrong.
> My new idea is to add a "dl_contending" flag in the scheduling entity,
> that indicates if the running bw has already been subtracted or not.
> With this, the issue should be solved (if anyone sees additional
> issues, or a better solution that does not require an additional flag,
> let me know).

Right. My suggestion would be to make it obvious, use that flag if
that's what it takes. We can always try and be clever later.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-11-18 14:23         ` Peter Zijlstra
  2016-11-18 15:10           ` luca abeni
  2016-11-18 16:42           ` Steven Rostedt
@ 2016-12-05 22:30           ` luca abeni
  2016-12-06  8:35             ` Peter Zijlstra
  2 siblings, 1 reply; 45+ messages in thread
From: luca abeni @ 2016-12-05 22:30 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

Hi Peter,

On Fri, 18 Nov 2016 15:23:59 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
[...]
> 	u64 running_bw;
> 
> static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> {
> 	u64 old = dl_rq->running_bw;
> 
> 	dl_rq->running_bw += dl_se->dl_bw;
> 	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
> }
> 
> static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> {
> 	u64 old = dl_rq->running_bw;
> 
> 	dl_rq->running_bw -= dl_se->dl_bw;
> 	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
> }

I wanted to change "SCHED_WARN_ON(dl_rq->running_bw > old); /*
underflow */" into "if (SCHED_WARN_ON(...)) dl_rq->running_bw = 0" (to
avoid using nonsensical "running_bw" values), but I see that
"SCHED_WARN_ON()" cannot be used inside an if (this seems to be a
difference with respect to "SCHED_WARN()").
This is because of the definition used when CONFIG_SCHED_DEBUG is not
defined (I noticed the issue when testing with random kernel
configurations).

Is this expected? If yes, what should I do in this case? Something like
	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
	if (dl_rq->running_bw > old)
		dl_rq->running_bw = 0;
?
Or something else?


			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-12-05 22:30           ` luca abeni
@ 2016-12-06  8:35             ` Peter Zijlstra
  2016-12-06  8:57               ` luca abeni
  2016-12-06 13:47               ` luca abeni
  0 siblings, 2 replies; 45+ messages in thread
From: Peter Zijlstra @ 2016-12-06  8:35 UTC (permalink / raw)
  To: luca abeni
  Cc: Steven Rostedt, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Mon, Dec 05, 2016 at 11:30:05PM +0100, luca abeni wrote:
> Hi Peter,
> 
> On Fri, 18 Nov 2016 15:23:59 +0100
> Peter Zijlstra <peterz@infradead.org> wrote:
> [...]
> > 	u64 running_bw;
> > 
> > static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> > {
> > 	u64 old = dl_rq->running_bw;
> > 
> > 	dl_rq->running_bw += dl_se->dl_bw;
> > 	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
> > }
> > 
> > static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> > {
> > 	u64 old = dl_rq->running_bw;
> > 
> > 	dl_rq->running_bw -= dl_se->dl_bw;
> > 	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
> > }
> 
> I wanted to change "SCHED_WARN_ON(dl_rq->running_bw > old); /*
> underflow */" into "if (SCHED_WARN_ON(...)) dl_rq->running_bw = 0" (to
> avoid using nonsensical "running_bw" values), but I see that
> "SCHED_WARN_ON()" cannot be used inside an if (this seems to be a
> difference with respect to "SCHED_WARN()").

There's a SCHED_WARN? Did you mean to say WARN_ON()?

And yes, mostly by accident I think, I'm not a big user of that pattern
and neglected it when I did SCHED_WARN_ON().

> This is because of the definition used when CONFIG_SCHED_DEBUG is not
> defined (I noticed the issue when testing with random kernel
> configurations).

I'm fine changing the definition, just find something that works. The
current ((void)(x)) thing was to avoid unused complaints -- although I'm
not sure there were any.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-12-06  8:35             ` Peter Zijlstra
@ 2016-12-06  8:57               ` luca abeni
  2016-12-06 13:47               ` luca abeni
  1 sibling, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-12-06  8:57 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

On Tue, 6 Dec 2016 09:35:01 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Mon, Dec 05, 2016 at 11:30:05PM +0100, luca abeni wrote:
> > Hi Peter,
> > 
> > On Fri, 18 Nov 2016 15:23:59 +0100
> > Peter Zijlstra <peterz@infradead.org> wrote:
> > [...]  
> > > 	u64 running_bw;
> > > 
> > > static void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> > > {
> > > 	u64 old = dl_rq->running_bw;
> > > 
> > > 	dl_rq->running_bw += dl_se->dl_bw;
> > > 	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
> > > }
> > > 
> > > static void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
> > > {
> > > 	u64 old = dl_rq->running_bw;
> > > 
> > > 	dl_rq->running_bw -= dl_se->dl_bw;
> > > 	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
> > > }  
> > 
> > I wanted to change "SCHED_WARN_ON(dl_rq->running_bw > old); /*
> > underflow */" into "if (SCHED_WARN_ON(...)) dl_rq->running_bw =
> > 0" (to avoid using nonsensical "running_bw" values), but I see that
> > "SCHED_WARN_ON()" cannot be used inside an if (this seems to be a
> > difference with respect to "SCHED_WARN()").  
> 
> There's a SCHED_WARN? Did you mean to say WARN_ON()?

Sorry, I managed to confuse myself... I was thinking about WARN_ON()
(the one I used in the previous version of my patches).

> And yes, mostly by accident I think, I'm not a big user of that
> pattern and neglected it when I did SCHED_WARN_ON().

You mean the "if(WARN(...))" pattern? I think it was suggested in a
previous round of reviews.


> > This is because of the definition used when CONFIG_SCHED_DEBUG is
> > not defined (I noticed the issue when testing with random kernel
> > configurations).  
> 
> I'm fine changing the definition, just find something that works. The
> current ((void)(x)) thing was to avoid unused complaints -- although
> I'm not sure there were any.

Ok; I'll see if I manage to find a working definition.


			Thanks,
				Luca

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC v3 1/6] Track the active utilisation
  2016-12-06  8:35             ` Peter Zijlstra
  2016-12-06  8:57               ` luca abeni
@ 2016-12-06 13:47               ` luca abeni
  1 sibling, 0 replies; 45+ messages in thread
From: luca abeni @ 2016-12-06 13:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Steven Rostedt, Daniel Bristot de Oliveira, linux-kernel,
	Ingo Molnar, Juri Lelli, Claudio Scordino

Hi Peter,

On Tue, 6 Dec 2016 09:35:01 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
[...]
> > This is because of the definition used when CONFIG_SCHED_DEBUG is
> > not defined (I noticed the issue when testing with random kernel
> > configurations).  
> 
> I'm fine changing the definition, just find something that works. The
> current ((void)(x)) thing was to avoid unused complaints -- although
> I'm not sure there were any.

Below is what I came up with... It fixes the build and seems to work
fine, generating no warnings (I tested with gcc 5.4.0). To write this
patch, I re-used some code from include/asm-generic/bug.h, which has no
copyright header, so I just added my Signed-off-by (but I am not sure
if this is the correct way to go).
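
(The usage I am after is the one discussed earlier, e.g. in
sub_running_bw():

	u64 old = dl_rq->running_bw;

	dl_rq->running_bw -= dl_se->dl_bw;
	if (SCHED_WARN_ON(dl_rq->running_bw > old))	/* underflow */
		dl_rq->running_bw = 0;

and the patch below makes this build both with and without
CONFIG_SCHED_DEBUG.)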


				Luca



From 74e67d61c4b98c2498880932b953c65e9653c121 Mon Sep 17 00:00:00 2001
From: Luca Abeni <luca.abeni@unitn.it>
Date: Tue, 6 Dec 2016 10:02:28 +0100
Subject: [PATCH 7/7] sched.h: Improve SCHED_WARN_ON() when CONFIG_SCHED_DEBUG is not defined

With the current definition of SCHED_WARN_ON(), something like
	if (SCHED_WARN_ON(condition)) ...
fails with
	error: void value not ignored as it ought to be
	 #define SCHED_WARN_ON(x) ((void)(x))
	                          ^

This patch fixes the issue by using the same code used by WARN_ON().

Signed-off-by: Luca Abeni <luca.abeni@unitn.it>
---
 kernel/sched/sched.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index ef4bdaa..2e96aa4 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -19,7 +19,10 @@
 #ifdef CONFIG_SCHED_DEBUG
 #define SCHED_WARN_ON(x)	WARN_ONCE(x, #x)
 #else
-#define SCHED_WARN_ON(x)	((void)(x))
+#define SCHED_WARN_ON(x)	({					\
+	int __ret_warn_on = !!(x);					\
+	unlikely(__ret_warn_on);					\
+})
 #endif
 
 struct rq;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2016-12-06 14:01 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-24 14:06 [RFC v3 0/6] CPU reclaiming for SCHED_DEADLINE Luca Abeni
2016-10-24 14:06 ` [RFC v3 1/6] Track the active utilisation Luca Abeni
2016-10-25  9:09   ` Daniel Bristot de Oliveira
2016-10-25  9:29     ` luca abeni
2016-10-25 13:58       ` Steven Rostedt
2016-10-25 18:04         ` Luca Abeni
2016-11-18 14:23         ` Peter Zijlstra
2016-11-18 15:10           ` luca abeni
2016-11-18 15:28             ` Peter Zijlstra
2016-11-18 16:42           ` Steven Rostedt
2016-12-05 22:30           ` luca abeni
2016-12-06  8:35             ` Peter Zijlstra
2016-12-06  8:57               ` luca abeni
2016-12-06 13:47               ` luca abeni
2016-11-01 16:45   ` Juri Lelli
2016-11-01 21:10     ` luca abeni
2016-11-08 17:56       ` Juri Lelli
2016-11-08 18:17         ` Luca Abeni
2016-11-08 18:53           ` Juri Lelli
2016-11-08 19:09             ` Luca Abeni
2016-11-08 20:02               ` Juri Lelli
2016-11-09 15:25                 ` luca abeni
2016-11-09 16:29         ` luca abeni
2016-11-18 14:55         ` Peter Zijlstra
2016-11-18 13:55   ` Peter Zijlstra
2016-11-18 15:06     ` luca abeni
2016-10-24 14:06 ` [RFC v3 2/6] Improve the tracking of " Luca Abeni
2016-11-01 16:46   ` Juri Lelli
2016-11-01 21:46     ` luca abeni
2016-11-02  2:35       ` luca abeni
2016-11-10 10:04         ` Juri Lelli
2016-11-10 11:56           ` Juri Lelli
2016-11-10 12:15             ` luca abeni
2016-11-10 12:34               ` Juri Lelli
2016-11-10 12:45                 ` luca abeni
2016-11-02  2:41   ` luca abeni
2016-11-18 15:36   ` Peter Zijlstra
2016-11-18 15:56     ` luca abeni
2016-11-18 15:47   ` Peter Zijlstra
2016-11-18 16:06     ` luca abeni
2016-11-18 18:49       ` Peter Zijlstra
2016-10-24 14:06 ` [RFC v3 3/6] Fix the update of the total -deadline utilization Luca Abeni
2016-10-24 14:06 ` [RFC v3 4/6] GRUB accounting Luca Abeni
2016-10-24 14:06 ` [RFC v3 5/6] Do not reclaim the whole CPU bandwidth Luca Abeni
2016-10-24 14:06 ` [RFC v3 6/6] Make GRUB a task's flag Luca Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).