From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755570AbdERUP6 (ORCPT ); Thu, 18 May 2017 16:15:58 -0400 Received: from mail.santannapisa.it ([193.205.80.98]:25727 "EHLO mail.santannapisa.it" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755525AbdERUPt (ORCPT ); Thu, 18 May 2017 16:15:49 -0400 From: luca abeni To: linux-kernel@vger.kernel.org Cc: Peter Zijlstra , Ingo Molnar , Juri Lelli , Claudio Scordino , Steven Rostedt , Tommaso Cucinotta , Daniel Bristot de Oliveira , Joel Fernandes , Mathieu Poirier , Luca Abeni Subject: [PATCH 10/10] sched/deadline: documentation about GRUB reclaiming Date: Thu, 18 May 2017 22:13:37 +0200 Message-Id: <1495138417-6203-11-git-send-email-luca.abeni@santannapisa.it> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1495138417-6203-1-git-send-email-luca.abeni@santannapisa.it> References: <1495138417-6203-1-git-send-email-luca.abeni@santannapisa.it> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Claudio Scordino This patch adds the documentation about the GRUB reclaiming algorithm, adding a few details discussed in list. Signed-off-by: Claudio Scordino Signed-off-by: Luca Abeni --- Documentation/scheduler/sched-deadline.txt | 168 +++++++++++++++++++++++++++++ 1 file changed, 168 insertions(+) diff --git a/Documentation/scheduler/sched-deadline.txt b/Documentation/scheduler/sched-deadline.txt index cbc1b46..e89e36e 100644 --- a/Documentation/scheduler/sched-deadline.txt +++ b/Documentation/scheduler/sched-deadline.txt @@ -7,6 +7,8 @@ CONTENTS 0. WARNING 1. Overview 2. Scheduling algorithm + 2.1 Main algorithm + 2.2 Bandwidth reclaiming 3. Scheduling Real-Time Tasks 3.1 Definitions 3.2 Schedulability Analysis for Uniprocessor Systems @@ -44,6 +46,9 @@ CONTENTS 2. Scheduling algorithm ================== +2.1 Main algorithm +------------------ + SCHED_DEADLINE uses three parameters, named "runtime", "period", and "deadline", to schedule tasks. A SCHED_DEADLINE task should receive "runtime" microseconds of execution time every "period" microseconds, and @@ -113,6 +118,160 @@ CONTENTS remaining runtime = remaining runtime + runtime +2.2 Bandwidth reclaiming +------------------------ + + Bandwidth reclaiming for deadline tasks is based on the GRUB (Greedy + Reclamation of Unused Bandwidth) algorithm [15, 16, 17] and it is enabled + when flag SCHED_FLAG_RECLAIM is set. + + The following diagram illustrates the state names for tasks handled by GRUB: + + ------------ + (d) | Active | + ------------->| | + | | Contending | + | ------------ + | A | + ---------- | | + | | | | + | Inactive | |(b) | (a) + | | | | + ---------- | | + A | V + | ------------ + | | Active | + --------------| Non | + (c) | Contending | + ------------ + + A task can be in one of the following states: + + - ActiveContending: if it is ready for execution (or executing); + + - ActiveNonContending: if it just blocked and has not yet surpassed the 0-lag + time; + + - Inactive: if it is blocked and has surpassed the 0-lag time. + + State transitions: + + (a) When a task blocks, it does not become immediately inactive since its + bandwidth cannot be immediately reclaimed without breaking the + real-time guarantees. It therefore enters a transitional state called + ActiveNonContending. The scheduler arms the "inactive timer" to fire at + the 0-lag time, when the task's bandwidth can be reclaimed without + breaking the real-time guarantees. + + The 0-lag time for a task entering the ActiveNonContending state is + computed as + + (runtime * dl_period) + deadline - --------------------- + dl_runtime + + where runtime is the remaining runtime, while dl_runtime and dl_period + are the reservation parameters. + + (b) If the task wakes up before the inactive timer fires, the task re-enters + the ActiveContending state and the "inactive timer" is canceled. + In addition, if the task wakes up on a different runqueue, then + the task's utilization must be removed from the previous runqueue's active + utilization and must be added to the new runqueue's active utilization. + In order to avoid races between a task waking up on a runqueue while the + "inactive timer" is running on a different CPU, the "dl_non_contending" + flag is used to indicate that a task is not on a runqueue but is active + (so, the flag is set when the task blocks and is cleared when the + "inactive timer" fires or when the task wakes up). + + (c) When the "inactive timer" fires, the task enters the Inactive state and + its utilization is removed from the runqueue's active utilization. + + (d) When an inactive task wakes up, it enters the ActiveContending state and + its utilization is added to the active utilization of the runqueue where + it has been enqueued. + + For each runqueue, the algorithm GRUB keeps track of two different bandwidths: + + - Active bandwidth (running_bw): this is the sum of the bandwidths of all + tasks in active state (i.e., ActiveContending or ActiveNonContending); + + - Total bandwidth (this_bw): this is the sum of all tasks "belonging" to the + runqueue, including the tasks in Inactive state. + + + The algorithm reclaims the bandwidth of the tasks in Inactive state. + It does so by decrementing the runtime of the executing task Ti at a pace equal + to + + dq = -max{ Ui, (1 - Uinact) } dt + + where Uinact is the inactive utilization, computed as (this_bq - running_bw), + and Ui is the bandwidth of task Ti. + + + Let's now see a trivial example of two deadline tasks with runtime equal + to 4 and period equal to 8 (i.e., bandwidth equal to 0.5): + + A Task T1 + | + | | + | | + |-------- |---- + | | V + |---|---|---|---|---|---|---|---|--------->t + 0 1 2 3 4 5 6 7 8 + + + A Task T2 + | + | | + | | + | ------------------------| + | | V + |---|---|---|---|---|---|---|---|--------->t + 0 1 2 3 4 5 6 7 8 + + + A running_bw + | + 1 ----------------- ------ + | | | + 0.5- ----------------- + | | + |---|---|---|---|---|---|---|---|--------->t + 0 1 2 3 4 5 6 7 8 + + + - Time t = 0: + + Both tasks are ready for execution and therefore in ActiveContending state. + Suppose Task T1 is the first task to start execution. + Since there are no inactive tasks, its runtime is decreased as dq = -1 dt. + + - Time t = 2: + + Suppose that task T1 blocks + Task T1 therefore enters the ActiveNonContending state. Since its remaining + runtime is equal to 2, its 0-lag time is equal to t = 4. + Task T2 start execution, with runtime still decreased as dq = -1 dt since + there are no inactive tasks. + + - Time t = 4: + + This is the 0-lag time for Task T1. Since it didn't woken up in the + meantime, it enters the Inactive state. Its bandwidth is removed from + running_bw. + Task T2 continues its execution. However, its runtime is now decreased as + dq = - 0.5 dt because Uinact = 0.5. + Task T2 therefore reclaims the bandwidth unused by Task T1. + + - Time t = 8: + + Task T1 wakes up. It enters the ActiveContending state again, and the + running_bw is incremented. + + 3. Scheduling Real-Time Tasks ============================= @@ -330,6 +489,15 @@ CONTENTS 14 - J. Erickson, U. Devi and S. Baruah. Improved tardiness bounds for Global EDF. Proceedings of the 22nd Euromicro Conference on Real-Time Systems, 2010. + 15 - G. Lipari, S. Baruah, Greedy reclamation of unused bandwidth in + constant-bandwidth servers, 12th IEEE Euromicro Conference on Real-Time + Systems, 2000. + 16 - L. Abeni, J. Lelli, C. Scordino, L. Palopoli, Greedy CPU reclaiming for + SCHED DEADLINE. In Proceedings of the Real-Time Linux Workshop (RTLWS), + Dusseldorf, Germany, 2014. + 17 - L. Abeni, G. Lipari, A. Parri, Y. Sun, Multicore CPU reclaiming: parallel + or sequential?. In Proceedings of the 31st Annual ACM Symposium on Applied + Computing, 2016. 4. Bandwidth management -- 2.7.4