* [PATCH 0/2] sched/deadline: Fixes for constrained deadline tasks @ 2017-02-10 19:48 Daniel Bristot de Oliveira 2017-02-10 19:48 ` [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period Daniel Bristot de Oliveira 2017-02-10 19:48 ` [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline Daniel Bristot de Oliveira 0 siblings, 2 replies; 17+ messages in thread From: Daniel Bristot de Oliveira @ 2017-02-10 19:48 UTC (permalink / raw) To: linux-kernel Cc: Daniel Bristot de Oliveira, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Luca Abeni, Steven Rostedt While reading sched deadline code, I find out that a constrained deadline task could be replenished before the next period if activated after the deadline, opening the window to run for more than Q/P. The patch [2] explains and fixes this problem. Furthermore, while fixing this issue, I found that the replenishment timer was being fired at the deadline of the task. This works fine for implicit deadline tasks (deadline == period) because the deadline is at the same point in time of the next period. But that is not true for constrained deadline tasks (deadline < period). This problem is not as visible as the first because the runtime leakage takes place only in the second activation. Next activations receive the correct bandwidth. However, after the 2nd activation, tasks are activated in the (period - dl_deadline) instant, which is before the expected activation. This problem is explained in the fix description as well. Daniel Bristot de Oliveira (2): sched/deadline: Replenishment timer should fire in the next period sched/deadline: Throttle a constrained deadline task activated after the deadline kernel/sched/deadline.c | 53 +++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 51 insertions(+), 2 deletions(-) -- 2.9.3 ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period 2017-02-10 19:48 [PATCH 0/2] sched/deadline: Fixes for constrained deadline tasks Daniel Bristot de Oliveira @ 2017-02-10 19:48 ` Daniel Bristot de Oliveira 2017-02-10 21:47 ` Steven Rostedt 2017-02-11 7:12 ` luca abeni 2017-02-10 19:48 ` [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline Daniel Bristot de Oliveira 1 sibling, 2 replies; 17+ messages in thread From: Daniel Bristot de Oliveira @ 2017-02-10 19:48 UTC (permalink / raw) To: linux-kernel Cc: Daniel Bristot de Oliveira, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Luca Abeni, Steven Rostedt Currently, the replenishment timer is set to fire at the deadline of a task. Although that works for implicit deadline tasks because the deadline is equals to the begin of the next period, that is not correct for constrained deadline tasks (deadline < period). For instance: f.c: --------------- %< --------------- int main (void) { for(;;); } --------------- >% --------------- # gcc -o f f.c # trace-cmd record -e sched:sched_switch \ -e syscalls:sys_exit_sched_setattr \ chrt -d --sched-runtime 490000000 \ --sched-deadline 500000000 \ --sched-period 1000000000 0 ./f # trace-cmd report | grep "{pid of ./f}" After setting parameters, the task is replenished and continue running until being throttled. f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0 The task is throttled after running 492318 ms, as expected. f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> \ watchdog/3:32 [0] But then, the task is replenished 500719 ms after the first replenishment: <idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R \ ==> f:11295 [-1] Running for 490277 ms: f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> \ swapper/3:0 [120] Hence, in the first period, the task runs 2 * runtime, and that is a bug. During the first replenishment, the next deadline is set one period away. So the runtime/period starts to be respected. However, as the second replenishment took place in the wrong instant, the next replenishment will also be held in a wrong instant of time. Rather than occurring in the nth period away from the first activation, it is taking place in the (nth period - relative deadline). Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-kernel@vger.kernel.org --- kernel/sched/deadline.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 70ef2b1..3c94d85 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se, } } +static inline u64 dl_next_period(struct sched_dl_entity *dl_se) +{ + return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period; +} + /* * If the entity depleted all its runtime, and if we want it to sleep * while waiting for some new execution time to become available, we - * set the bandwidth enforcement timer to the replenishment instant + * set the bandwidth replenishment timer to the replenishment instant * and try to activate it. * * Notice that it is important for the caller to know if the timer @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p) * that it is actually coming from rq->clock and not from * hrtimer's time base reading. */ - act = ns_to_ktime(dl_se->deadline); + act = ns_to_ktime(dl_next_period(dl_se)); now = hrtimer_cb_get_time(timer); delta = ktime_to_ns(now) - rq_clock(rq); act = ktime_add_ns(act, delta); -- 2.9.3 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period 2017-02-10 19:48 ` [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period Daniel Bristot de Oliveira @ 2017-02-10 21:47 ` Steven Rostedt 2017-02-11 7:12 ` luca abeni 1 sibling, 0 replies; 17+ messages in thread From: Steven Rostedt @ 2017-02-10 21:47 UTC (permalink / raw) To: Daniel Bristot de Oliveira Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Luca Abeni On Fri, 10 Feb 2017 20:48:10 +0100 Daniel Bristot de Oliveira <bristot@redhat.com> wrote: > Currently, the replenishment timer is set to fire at the deadline > of a task. Although that works for implicit deadline tasks because the > deadline is equals to the begin of the next period, that is not correct > for constrained deadline tasks (deadline < period). > > For instance: > > f.c: > --------------- %< --------------- > int main (void) > { > for(;;); > } > --------------- >% --------------- > > # gcc -o f f.c > > # trace-cmd record -e sched:sched_switch \ > -e syscalls:sys_exit_sched_setattr \ > chrt -d --sched-runtime 490000000 \ > --sched-deadline 500000000 \ > --sched-period 1000000000 0 ./f > > # trace-cmd report | grep "{pid of ./f}" # trace-cmd report -F '.*:common_pid == 11295' does the same ;-) Or better yet, add "-F" (just -F, no extra paremeter. In report, -F means "filter", in record -F means "follow"), and that will only record the task that is created. Also, it may be nice to use --ts-diff option to see results. I ran this with my own spinning program, and here's the output: # trace-cmd report --ts-diff [...] userspin-1540 [003] 298.725084: (+822) sys_exit_sched_setattr: 0x0 userspin-1540 [003] 299.218199: (+493115) sched_switch: userspin:1540 [4294967295] R ==> rcuc/3:38 [98] <idle>-0 [003] 299.225111: (+6912) sched_switch: swapper/3:0 [120] R ==> userspin:1540 [4294967295] userspin-1540 [003] 299.718192: (+493081) sched_switch: userspin:1540 [4294967295] R ==> ktimersoftd/3:39 [98] <idle>-0 [003] 300.225125: (+506933) sched_switch: swapper/3:0 [120] R ==> userspin:1540 [4294967295] userspin-1540 [003] 300.717185: (+492060) sched_switch: userspin:1540 [4294967295] R ==> rcuc/3:38 [98] <idle>-0 [003] 301.225116: (+507931) sched_switch: swapper/3:0 [120] R ==> userspin:1540 [4294967295] userspin-1540 [003] 301.718177: (+493061) sched_switch: userspin:1540 [4294967295] R ==> rcuc/3:38 [98] <idle>-0 [003] 302.225120: (+506943) sched_switch: swapper/3:0 [120] R ==> userspin:1540 [4294967295] > > After setting parameters, the task is replenished and continue running > until being throttled. > f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0 > > The task is throttled after running 492318 ms, as expected. > f-11295 [003] 13322.606094: sched_switch: f:11295 [-1] R ==> \ > watchdog/3:32 [0] > > But then, the task is replenished 500719 ms after the first > replenishment: > <idle>-0 [003] 13322.614495: sched_switch: swapper/3:0 [120] R \ > ==> f:11295 [-1] > > Running for 490277 ms: > f-11295 [003] 13323.104772: sched_switch: f:11295 [-1] R ==> \ > swapper/3:0 [120] > > Hence, in the first period, the task runs 2 * runtime, and that is a bug. > > During the first replenishment, the next deadline is set one period away. > So the runtime/period starts to be respected. However, as the second > replenishment took place in the wrong instant, the next replenishment > will also be held in a wrong instant of time. Rather than occurring in > the nth period away from the first activation, it is taking place > in the (nth period - relative deadline). OK, after you patch, I get this: userspin-1581 [003] 103.450129: (+716) sys_exit_sched_setattr: 0x0 userspin-1581 [003] 103.943957: (+493828) sched_switch: userspin:1581 [4294967295] R ==> rcuc/3:38 [98] <idle>-0 [003] 104.450172: (+506215) sched_switch: swapper/3:0 [120] R ==> userspin:1581 [4294967295] userspin-1581 [003] 104.942349: (+492177) sched_switch: userspin:1581 [4294967295] R ==> watchdog/3:36 [0] <idle>-0 [003] 105.450164: (+507815) sched_switch: swapper/3:0 [120] R ==> userspin:1581 [4294967295] userspin-1581 [003] 105.942354: (+492190) sched_switch: userspin:1581 [4294967295] R ==> ktimersoftd/3:39 [98] <idle>-0 [003] 106.450159: (+507805) sched_switch: swapper/3:0 [120] R ==> userspin:1581 [4294967295] userspin-1581 [003] 106.942366: (+492207) sched_switch: userspin:1581 [4294967295] R ==> ktimersoftd/3:39 [98] <idle>-0 [003] 107.450164: (+507798) sched_switch: swapper/3:0 [120] R ==> userspin:1581 [4294967295] Looks promising. But unrelated, I noticed this with and without your patch: userspin-1567 [002] 81.455345: (+1196) sys_exit_sched_setattr: 0x0 userspin-1567 [002] 81.457599: (+2254) sched_switch: chrt:1567 [4294967295] D ==> rcuc/2:29 [98] <idle>-0 [002] 81.466030: (+8431) sched_switch: swapper/2:0 [120] R ==> chrt:1567 [4294967295] userspin-1567 [002] 81.466123: (+93) sched_switch: chrt:1567 [4294967295] D ==> rcuc/2:29 [98] <idle>-0 [002] 81.471862: (+5739) sched_switch: swapper/2:0 [120] R ==> chrt:1567 [4294967295] userspin-1567 [002] 81.471959: (+97) sched_switch: chrt:1567 [4294967295] D ==> swapper/2:0 [120] <idle>-0 [002] 81.478285: (+6326) sched_switch: swapper/2:0 [120] R ==> chrt:1567 [4294967295] userspin-1567 [002] 81.478369: (+84) sched_switch: chrt:1567 [4294967295] D ==> swapper/2:0 [120] <idle>-0 [002] 81.486664: (+8295) sched_switch: swapper/2:0 [120] R ==> chrt:1567 [4294967295] userspin-1567 [002] 81.487291: (+627) sched_switch: chrt:1567 [4294967295] D|W ==> swapper/2:0 [120] <idle>-0 [002] 81.494884: (+7593) sched_switch: swapper/2:0 [120] R ==> chrt:1567 [4294967295] userspin-1567 [002] 81.988095: (+493211) sched_switch: userspin:1567 [4294967295] R ==> rcuc/2:29 [98] <idle>-0 [002] 82.496009: (+507914) sched_switch: swapper/2:0 [120] R ==> userspin:1567 [4294967295] userspin-1567 [002] 82.988158: (+492149) sched_switch: userspin:1567 [4294967295] R ==> rcuc/2:29 [98] <idle>-0 [002] 83.494897: (+506739) sched_switch: swapper/2:0 [120] R ==> userspin:1567 [4294967295] userspin-1567 [002] 83.986181: (+491284) sched_switch: userspin:1567 [4294967295] R ==> rcuc/2:29 [98] <idle>-0 [002] 84.494895: (+508714) sched_switch: swapper/2:0 [120] R ==> userspin:1567 [4294967295] A bit of choppiness to get started. I'll have to look more into this. I'll take a look at this patch deeper to try to understand everything before applying a reviewed-by. -- Steve > > Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Juri Lelli <juri.lelli@arm.com> > Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> > Cc: Luca Abeni <luca.abeni@santannapisa.it> > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: linux-kernel@vger.kernel.org > > --- > kernel/sched/deadline.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index 70ef2b1..3c94d85 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -505,10 +505,15 @@ static void update_dl_entity(struct sched_dl_entity *dl_se, > } > } > > +static inline u64 dl_next_period(struct sched_dl_entity *dl_se) > +{ > + return dl_se->deadline - dl_se->dl_deadline + dl_se->dl_period; > +} > + > /* > * If the entity depleted all its runtime, and if we want it to sleep > * while waiting for some new execution time to become available, we > - * set the bandwidth enforcement timer to the replenishment instant > + * set the bandwidth replenishment timer to the replenishment instant > * and try to activate it. > * > * Notice that it is important for the caller to know if the timer > @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p) > * that it is actually coming from rq->clock and not from > * hrtimer's time base reading. > */ > - act = ns_to_ktime(dl_se->deadline); > + act = ns_to_ktime(dl_next_period(dl_se)); > now = hrtimer_cb_get_time(timer); > delta = ktime_to_ns(now) - rq_clock(rq); > act = ktime_add_ns(act, delta); ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period 2017-02-10 19:48 ` [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period Daniel Bristot de Oliveira 2017-02-10 21:47 ` Steven Rostedt @ 2017-02-11 7:12 ` luca abeni 2017-02-13 11:10 ` Peter Zijlstra 1 sibling, 1 reply; 17+ messages in thread From: luca abeni @ 2017-02-11 7:12 UTC (permalink / raw) To: Daniel Bristot de Oliveira Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Steven Rostedt Hi Daniel, On Fri, 10 Feb 2017 20:48:10 +0100 Daniel Bristot de Oliveira <bristot@redhat.com> wrote: [...] > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index 70ef2b1..3c94d85 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -505,10 +505,15 @@ static void update_dl_entity(struct > sched_dl_entity *dl_se, } > } > > +static inline u64 dl_next_period(struct sched_dl_entity *dl_se) > +{ > + return dl_se->deadline - dl_se->dl_deadline + > dl_se->dl_period; +} > + > /* > * If the entity depleted all its runtime, and if we want it to sleep > * while waiting for some new execution time to become available, we > - * set the bandwidth enforcement timer to the replenishment instant > + * set the bandwidth replenishment timer to the replenishment instant > * and try to activate it. > * > * Notice that it is important for the caller to know if the timer > @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p) > * that it is actually coming from rq->clock and not from > * hrtimer's time base reading. > */ > - act = ns_to_ktime(dl_se->deadline); > + act = ns_to_ktime(dl_next_period(dl_se)); Looks like there is a real bug in the code, and your fix looks correct to me. I think it should be committed. Thanks, Luca > now = hrtimer_cb_get_time(timer); > delta = ktime_to_ns(now) - rq_clock(rq); > act = ktime_add_ns(act, delta); ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period 2017-02-11 7:12 ` luca abeni @ 2017-02-13 11:10 ` Peter Zijlstra 2017-02-13 14:59 ` Steven Rostedt 0 siblings, 1 reply; 17+ messages in thread From: Peter Zijlstra @ 2017-02-13 11:10 UTC (permalink / raw) To: luca abeni Cc: Daniel Bristot de Oliveira, linux-kernel, Ingo Molnar, Juri Lelli, Tommaso Cucinotta, Steven Rostedt On Sat, Feb 11, 2017 at 08:12:37AM +0100, luca abeni wrote: > Hi Daniel, > > On Fri, 10 Feb 2017 20:48:10 +0100 > Daniel Bristot de Oliveira <bristot@redhat.com> wrote: > > [...] > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > > index 70ef2b1..3c94d85 100644 > > --- a/kernel/sched/deadline.c > > +++ b/kernel/sched/deadline.c > > @@ -505,10 +505,15 @@ static void update_dl_entity(struct > > sched_dl_entity *dl_se, } > > } > > > > +static inline u64 dl_next_period(struct sched_dl_entity *dl_se) > > +{ > > + return dl_se->deadline - dl_se->dl_deadline + > > dl_se->dl_period; +} > > + > > /* > > * If the entity depleted all its runtime, and if we want it to sleep > > * while waiting for some new execution time to become available, we > > - * set the bandwidth enforcement timer to the replenishment instant > > + * set the bandwidth replenishment timer to the replenishment instant > > * and try to activate it. > > * > > * Notice that it is important for the caller to know if the timer > > @@ -530,7 +535,7 @@ static int start_dl_timer(struct task_struct *p) > > * that it is actually coming from rq->clock and not from > > * hrtimer's time base reading. > > */ > > - act = ns_to_ktime(dl_se->deadline); > > + act = ns_to_ktime(dl_next_period(dl_se)); > > Looks like there is a real bug in the code, and your fix looks correct > to me. I think it should be committed. > I've interpreted this as: Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it> Holler if you disagree. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period 2017-02-13 11:10 ` Peter Zijlstra @ 2017-02-13 14:59 ` Steven Rostedt 2017-02-13 15:35 ` Juri Lelli 0 siblings, 1 reply; 17+ messages in thread From: Steven Rostedt @ 2017-02-13 14:59 UTC (permalink / raw) To: Peter Zijlstra Cc: luca abeni, Daniel Bristot de Oliveira, linux-kernel, Ingo Molnar, Juri Lelli, Tommaso Cucinotta On Mon, 13 Feb 2017 12:10:25 +0100 Peter Zijlstra <peterz@infradead.org> wrote: > I've interpreted this as: > > Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it> > > Holler if you disagree. You can add mine too. I put in a lot of trace_printk()s and it all appears to be exactly as Daniel describes. Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> -- Steve ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period 2017-02-13 14:59 ` Steven Rostedt @ 2017-02-13 15:35 ` Juri Lelli 0 siblings, 0 replies; 17+ messages in thread From: Juri Lelli @ 2017-02-13 15:35 UTC (permalink / raw) To: Steven Rostedt Cc: Peter Zijlstra, luca abeni, Daniel Bristot de Oliveira, linux-kernel, Ingo Molnar, Tommaso Cucinotta On 13/02/17 09:59, Steven Rostedt wrote: > On Mon, 13 Feb 2017 12:10:25 +0100 > Peter Zijlstra <peterz@infradead.org> wrote: > > > > I've interpreted this as: > > > > Reviewed-by: Luca Abeni <luca.abeni@santannapisa.it> > > > > Holler if you disagree. > > You can add mine too. I put in a lot of trace_printk()s and it all > appears to be exactly as Daniel describes. > > Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> > I'm a bit late, but I looked at it as well and it looks good. If you want, Reviewed-by: Juri Lelli <juri.lelli@arm.com> Thanks! - Juri ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-10 19:48 [PATCH 0/2] sched/deadline: Fixes for constrained deadline tasks Daniel Bristot de Oliveira 2017-02-10 19:48 ` [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period Daniel Bristot de Oliveira @ 2017-02-10 19:48 ` Daniel Bristot de Oliveira 2017-02-11 7:15 ` luca abeni ` (2 more replies) 1 sibling, 3 replies; 17+ messages in thread From: Daniel Bristot de Oliveira @ 2017-02-10 19:48 UTC (permalink / raw) To: linux-kernel Cc: Daniel Bristot de Oliveira, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Luca Abeni, Steven Rostedt During the activation, CBS checks if it can reuse the current task's runtime and period. If the deadline of the task is in the past, CBS cannot use the runtime, and so it replenishes the task. This rule works fine for implicit deadline tasks (deadline == period), and the CBS was designed for implicit deadline tasks. However, a task with constrained deadline (deadine < period) might be awakened after the deadline, but before the next period. In this case, replenishing the task would allow it to run for runtime / deadline. As in this case deadline < period, CBS enables a task to run for more than the runtime/period. In a very load system, this can cause the domino effect, making other tasks to miss their deadlines. To avoid this problem, in the activation of a constrained deadline task after the deadline but before the next period, throttle the task and set the replenishing timer to the begin of the next period, unless it is boosted. Reproducer: --------------- %< --------------- int main (int argc, char **argv) { int ret; int flags = 0; unsigned long l = 0; struct timespec ts; struct sched_attr attr; memset(&attr, 0, sizeof(attr)); attr.size = sizeof(attr); attr.sched_policy = SCHED_DEADLINE; attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms */ attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */ attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ ts.tv_sec = 0; ts.tv_nsec = 2000 * 1000; /* 2 ms */ ret = sched_setattr(0, &attr, flags); if (ret < 0) { perror("sched_setattr"); exit(-1); } for(;;) { /* XXX: you may need to adjust the loop */ for (l = 0; l < 150000; l++); /* * The ideia is to go to sleep right before the deadline * and then wake up before the next period to receive * a new replenishment. */ nanosleep(&ts, NULL); } exit(0); } --------------- >% --------------- On my box, this reproducer uses almost 50% of the CPU time, which is obviously wrong for a task with 2/2000 reservation. Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@arm.com> Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> Cc: Luca Abeni <luca.abeni@santannapisa.it> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-kernel@vger.kernel.org --- kernel/sched/deadline.c | 44 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index 3c94d85..b74d40e 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -694,6 +694,36 @@ void init_dl_task_timer(struct sched_dl_entity *dl_se) timer->function = dl_task_timer; } +/* During the activation, CBS checks if it can reuse the current task's + * runtime and period. If the deadline of the task is in the past, CBS + * cannot use the runtime, and so it replenishes the task. This rule + * works fine for implicit deadline tasks (deadline == period), and the + * CBS was designed for implicit deadline tasks. However, a task with + * constrained deadline (deadine < period) might be awakened after the + * deadline, but before the next period. In this case, replenishing the + * task would allow it to run for runtime / deadline. As in this case + * deadline < period, CBS enables a task to run for more than the + * runtime / period. In a very load system, this can cause the domino + * effect, making other tasks to miss their deadlines. + * + * To avoid this problem, in the activation of a constrained deadline + * task after the deadline but before the next period, throttle the + * task and set the replenishing timer to the begin of the next period, + * unless it is boosted. + */ +static inline void dl_check_constrained_dl(struct sched_dl_entity *dl_se) +{ + struct task_struct *p = dl_task_of(dl_se); + struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se)); + + if (dl_time_before(dl_se->deadline, rq_clock(rq)) && + dl_time_before(rq_clock(rq), dl_next_period(dl_se))) { + if (unlikely(dl_se->dl_boosted || !start_dl_timer(p))) + return; + dl_se->dl_throttled = 1; + } +} + static int dl_runtime_exceeded(struct sched_dl_entity *dl_se) { @@ -927,6 +957,11 @@ static void dequeue_dl_entity(struct sched_dl_entity *dl_se) __dequeue_dl_entity(dl_se); } +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se) +{ + return dl_se->dl_runtime < dl_se->dl_period; +} + static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags) { struct task_struct *pi_task = rt_mutex_get_top_task(p); @@ -953,6 +988,15 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags) } /* + * Check if a constrained deadline task was activated + * after the deadline but before the next period. + * If that is the case, the task will be throttled and + * the replenishment timer will be set to the next period. + */ + if (!pi_se->dl_throttled && dl_is_constrained(pi_se)) + dl_check_constrained_dl(pi_se); + + /* * If p is throttled, we do nothing. In fact, if it exhausted * its budget it needs a replenishment and, since it now is on * its rq, the bandwidth timer callback (which clearly has not -- 2.9.3 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-10 19:48 ` [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline Daniel Bristot de Oliveira @ 2017-02-11 7:15 ` luca abeni 2017-02-11 8:00 ` Mike Galbraith 2017-02-13 11:12 ` Peter Zijlstra 2017-02-13 11:12 ` Peter Zijlstra 2017-02-13 15:33 ` Steven Rostedt 2 siblings, 2 replies; 17+ messages in thread From: luca abeni @ 2017-02-11 7:15 UTC (permalink / raw) To: Daniel Bristot de Oliveira Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Steven Rostedt Hi Daniel, On Fri, 10 Feb 2017 20:48:11 +0100 Daniel Bristot de Oliveira <bristot@redhat.com> wrote: > During the activation, CBS checks if it can reuse the current task's > runtime and period. If the deadline of the task is in the past, CBS > cannot use the runtime, and so it replenishes the task. This rule > works fine for implicit deadline tasks (deadline == period), and the > CBS was designed for implicit deadline tasks. However, a task with > constrained deadline (deadine < period) might be awakened after the > deadline, but before the next period. In this case, replenishing the > task would allow it to run for runtime / deadline. As in this case > deadline < period, CBS enables a task to run for more than the > runtime/period. In a very load system, this can cause the domino > effect, making other tasks to miss their deadlines. I think you are right: SCHED_DEADLINE implements the original CBS algorithm here, but uses relative deadlines different from periods in other places (while the original algorithm only considered relative deadlines equal to periods). An this mix is dangerous... I think your fix is correct, and cures a real problem. Thanks, Luca > > To avoid this problem, in the activation of a constrained deadline > task after the deadline but before the next period, throttle the > task and set the replenishing timer to the begin of the next period, > unless it is boosted. > > Reproducer: > > --------------- %< --------------- > int main (int argc, char **argv) > { > int ret; > int flags = 0; > unsigned long l = 0; > struct timespec ts; > struct sched_attr attr; > > memset(&attr, 0, sizeof(attr)); > attr.size = sizeof(attr); > > attr.sched_policy = SCHED_DEADLINE; > attr.sched_runtime = 2 * 1000 * 1000; /* 2 ms > */ attr.sched_deadline = 2 * 1000 * 1000; /* 2 ms */ > attr.sched_period = 2 * 1000 * 1000 * 1000; /* 2 s */ > > ts.tv_sec = 0; > ts.tv_nsec = 2000 * 1000; /* 2 ms */ > > ret = sched_setattr(0, &attr, flags); > > if (ret < 0) { > perror("sched_setattr"); > exit(-1); > } > > for(;;) { > /* XXX: you may need to adjust the loop */ > for (l = 0; l < 150000; l++); > /* > * The ideia is to go to sleep right before the > deadline > * and then wake up before the next period to receive > * a new replenishment. > */ > nanosleep(&ts, NULL); > } > > exit(0); > } > --------------- >% --------------- > > On my box, this reproducer uses almost 50% of the CPU time, which is > obviously wrong for a task with 2/2000 reservation. > > Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: Peter Zijlstra <peterz@infradead.org> > Cc: Juri Lelli <juri.lelli@arm.com> > Cc: Tommaso Cucinotta <tommaso.cucinotta@sssup.it> > Cc: Luca Abeni <luca.abeni@santannapisa.it> > Cc: Steven Rostedt <rostedt@goodmis.org> > Cc: linux-kernel@vger.kernel.org > --- > kernel/sched/deadline.c | 44 > ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 > insertions(+) > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index 3c94d85..b74d40e 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -694,6 +694,36 @@ void init_dl_task_timer(struct sched_dl_entity > *dl_se) timer->function = dl_task_timer; > } > > +/* During the activation, CBS checks if it can reuse the current > task's > + * runtime and period. If the deadline of the task is in the past, > CBS > + * cannot use the runtime, and so it replenishes the task. This rule > + * works fine for implicit deadline tasks (deadline == period), and > the > + * CBS was designed for implicit deadline tasks. However, a task with > + * constrained deadline (deadine < period) might be awakened after > the > + * deadline, but before the next period. In this case, replenishing > the > + * task would allow it to run for runtime / deadline. As in this case > + * deadline < period, CBS enables a task to run for more than the > + * runtime / period. In a very load system, this can cause the domino > + * effect, making other tasks to miss their deadlines. > + * > + * To avoid this problem, in the activation of a constrained deadline > + * task after the deadline but before the next period, throttle the > + * task and set the replenishing timer to the begin of the next > period, > + * unless it is boosted. > + */ > +static inline void dl_check_constrained_dl(struct sched_dl_entity > *dl_se) +{ > + struct task_struct *p = dl_task_of(dl_se); > + struct rq *rq = rq_of_dl_rq(dl_rq_of_se(dl_se)); > + > + if (dl_time_before(dl_se->deadline, rq_clock(rq)) && > + dl_time_before(rq_clock(rq), dl_next_period(dl_se))) { > + if (unlikely(dl_se->dl_boosted > || !start_dl_timer(p))) > + return; > + dl_se->dl_throttled = 1; > + } > +} > + > static > int dl_runtime_exceeded(struct sched_dl_entity *dl_se) > { > @@ -927,6 +957,11 @@ static void dequeue_dl_entity(struct > sched_dl_entity *dl_se) __dequeue_dl_entity(dl_se); > } > > +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se) > +{ > + return dl_se->dl_runtime < dl_se->dl_period; > +} > + > static void enqueue_task_dl(struct rq *rq, struct task_struct *p, > int flags) { > struct task_struct *pi_task = rt_mutex_get_top_task(p); > @@ -953,6 +988,15 @@ static void enqueue_task_dl(struct rq *rq, > struct task_struct *p, int flags) } > > /* > + * Check if a constrained deadline task was activated > + * after the deadline but before the next period. > + * If that is the case, the task will be throttled and > + * the replenishment timer will be set to the next period. > + */ > + if (!pi_se->dl_throttled && dl_is_constrained(pi_se)) > + dl_check_constrained_dl(pi_se); > + > + /* > * If p is throttled, we do nothing. In fact, if it exhausted > * its budget it needs a replenishment and, since it now is > on > * its rq, the bandwidth timer callback (which clearly has > not ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-11 7:15 ` luca abeni @ 2017-02-11 8:00 ` Mike Galbraith 2017-02-11 14:35 ` Steven Rostedt 2017-02-13 11:12 ` Peter Zijlstra 1 sibling, 1 reply; 17+ messages in thread From: Mike Galbraith @ 2017-02-11 8:00 UTC (permalink / raw) To: luca abeni, Daniel Bristot de Oliveira Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Steven Rostedt On Sat, 2017-02-11 at 08:15 +0100, luca abeni wrote: > Hi Daniel, > > On Fri, 10 Feb 2017 20:48:11 +0100 > Daniel Bristot de Oliveira <bristot@redhat.com> wrote: > > > During the activation, CBS checks if it can reuse the current > > task's > > runtime and period. If the deadline of the task is in the past, CBS > > cannot use the runtime, and so it replenishes the task. This rule > > works fine for implicit deadline tasks (deadline == period), and > > the > > CBS was designed for implicit deadline tasks. However, a task with > > constrained deadline (deadine < period) might be awakened after the > > deadline, but before the next period. In this case, replenishing > > the > > task would allow it to run for runtime / deadline. As in this case > > deadline < period, CBS enables a task to run for more than the > > runtime/period. In a very load system, this can cause the domino > > effect, making other tasks to miss their deadlines. > > I think you are right: SCHED_DEADLINE implements the original CBS > algorithm here, but uses relative deadlines different from periods in > other places (while the original algorithm only considered relative > deadlines equal to periods). > An this mix is dangerous... I think your fix is correct, and cures a > real problem. Both of these should be tagged for stable as well, or? -Mike ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-11 8:00 ` Mike Galbraith @ 2017-02-11 14:35 ` Steven Rostedt 0 siblings, 0 replies; 17+ messages in thread From: Steven Rostedt @ 2017-02-11 14:35 UTC (permalink / raw) To: Mike Galbraith Cc: luca abeni, Daniel Bristot de Oliveira, linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta On Sat, 11 Feb 2017 09:00:06 +0100 Mike Galbraith <efault@gmx.de> wrote: > Both of these should be tagged for stable as well, or? > Yes. -- Steve ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-11 7:15 ` luca abeni 2017-02-11 8:00 ` Mike Galbraith @ 2017-02-13 11:12 ` Peter Zijlstra 1 sibling, 0 replies; 17+ messages in thread From: Peter Zijlstra @ 2017-02-13 11:12 UTC (permalink / raw) To: luca abeni Cc: Daniel Bristot de Oliveira, linux-kernel, Ingo Molnar, Juri Lelli, Tommaso Cucinotta, Steven Rostedt On Sat, Feb 11, 2017 at 08:15:26AM +0100, luca abeni wrote: > Hi Daniel, > > On Fri, 10 Feb 2017 20:48:11 +0100 > Daniel Bristot de Oliveira <bristot@redhat.com> wrote: > > > During the activation, CBS checks if it can reuse the current task's > > runtime and period. If the deadline of the task is in the past, CBS > > cannot use the runtime, and so it replenishes the task. This rule > > works fine for implicit deadline tasks (deadline == period), and the > > CBS was designed for implicit deadline tasks. However, a task with > > constrained deadline (deadine < period) might be awakened after the > > deadline, but before the next period. In this case, replenishing the > > task would allow it to run for runtime / deadline. As in this case > > deadline < period, CBS enables a task to run for more than the > > runtime/period. In a very load system, this can cause the domino > > effect, making other tasks to miss their deadlines. > > I think you are right: SCHED_DEADLINE implements the original CBS > algorithm here, but uses relative deadlines different from periods in > other places (while the original algorithm only considered relative > deadlines equal to periods). > An this mix is dangerous... I think your fix is correct, and cures a > real problem. > Made that a "Reviewed-by:" tag in your name again. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-10 19:48 ` [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline Daniel Bristot de Oliveira 2017-02-11 7:15 ` luca abeni @ 2017-02-13 11:12 ` Peter Zijlstra 2017-02-13 13:16 ` Daniel Bristot de Oliveira 2017-02-13 15:33 ` Steven Rostedt 2 siblings, 1 reply; 17+ messages in thread From: Peter Zijlstra @ 2017-02-13 11:12 UTC (permalink / raw) To: Daniel Bristot de Oliveira Cc: linux-kernel, Ingo Molnar, Juri Lelli, Tommaso Cucinotta, Luca Abeni, Steven Rostedt On Fri, Feb 10, 2017 at 08:48:11PM +0100, Daniel Bristot de Oliveira wrote: > +/* During the activation, CBS checks if it can reuse the current task's > + * runtime and period. If the deadline of the task is in the past, CBS Broken comment style, fixed that for you. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-13 11:12 ` Peter Zijlstra @ 2017-02-13 13:16 ` Daniel Bristot de Oliveira 0 siblings, 0 replies; 17+ messages in thread From: Daniel Bristot de Oliveira @ 2017-02-13 13:16 UTC (permalink / raw) To: Peter Zijlstra Cc: linux-kernel, Ingo Molnar, Juri Lelli, Tommaso Cucinotta, Luca Abeni, Steven Rostedt On 02/13/2017 12:12 PM, Peter Zijlstra wrote: > On Fri, Feb 10, 2017 at 08:48:11PM +0100, Daniel Bristot de Oliveira wrote: >> +/* During the activation, CBS checks if it can reuse the current task's >> + * runtime and period. If the deadline of the task is in the past, CBS > Broken comment style, fixed that for you. Thanks! I was thinking sending a v1 just because of this... -- Daniel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-10 19:48 ` [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline Daniel Bristot de Oliveira 2017-02-11 7:15 ` luca abeni 2017-02-13 11:12 ` Peter Zijlstra @ 2017-02-13 15:33 ` Steven Rostedt 2017-02-13 15:46 ` Daniel Bristot de Oliveira 2 siblings, 1 reply; 17+ messages in thread From: Steven Rostedt @ 2017-02-13 15:33 UTC (permalink / raw) To: Daniel Bristot de Oliveira Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Luca Abeni On Fri, 10 Feb 2017 20:48:11 +0100 Daniel Bristot de Oliveira <bristot@redhat.com> wrote: > > +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se) > +{ > + return dl_se->dl_runtime < dl_se->dl_period; > +} > + Is it ever appropriate for a dl task to have runtime == period? What purpose would that serve? Just run the task as FIFO higher than everything else. Or was this suppose to be dl_deadline < dl_period? -- Steve ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-13 15:33 ` Steven Rostedt @ 2017-02-13 15:46 ` Daniel Bristot de Oliveira 2017-02-13 16:21 ` Daniel Bristot de Oliveira 0 siblings, 1 reply; 17+ messages in thread From: Daniel Bristot de Oliveira @ 2017-02-13 15:46 UTC (permalink / raw) To: Steven Rostedt Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Luca Abeni On 02/13/2017 04:33 PM, Steven Rostedt wrote: >> +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se) >> +{ >> + return dl_se->dl_runtime < dl_se->dl_period; >> +} >> + > Is it ever appropriate for a dl task to have runtime == period? What > purpose would that serve? Just run the task as FIFO higher than > everything else. > > Or was this suppose to be dl_deadline < dl_period? Oooooops, my bad :-(, this was supposed to be dl_deadline < dl_period. I will send a v2 fixing it, along with the problem in the comment that peterz mentioned before. Thanks Steven! -- Daniel ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline 2017-02-13 15:46 ` Daniel Bristot de Oliveira @ 2017-02-13 16:21 ` Daniel Bristot de Oliveira 0 siblings, 0 replies; 17+ messages in thread From: Daniel Bristot de Oliveira @ 2017-02-13 16:21 UTC (permalink / raw) To: Steven Rostedt Cc: linux-kernel, Ingo Molnar, Peter Zijlstra, Juri Lelli, Tommaso Cucinotta, Luca Abeni On 02/13/2017 04:46 PM, Daniel Bristot de Oliveira wrote: > On 02/13/2017 04:33 PM, Steven Rostedt wrote: >>> +static inline bool dl_is_constrained(struct sched_dl_entity *dl_se) >>> +{ >>> + return dl_se->dl_runtime < dl_se->dl_period; >>> +} >>> + >> Is it ever appropriate for a dl task to have runtime == period? What >> purpose would that serve? Just run the task as FIFO higher than >> everything else. >> >> Or was this suppose to be dl_deadline < dl_period? > Oooooops, my bad :-(, this was supposed to be dl_deadline < dl_period. As + if (dl_time_before(dl_se->deadline, rq_clock(rq)) && + dl_time_before(rq_clock(rq), dl_next_period(dl_se))) { is never true for implicit deadline tasks, the problem was not causing further logical issues (apart from being conceptually obviously totally wrong, and causes the check to be called without the need to... me--). -- Daniel ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2017-02-13 16:21 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-02-10 19:48 [PATCH 0/2] sched/deadline: Fixes for constrained deadline tasks Daniel Bristot de Oliveira 2017-02-10 19:48 ` [PATCH 1/2] sched/deadline: Replenishment timer should fire in the next period Daniel Bristot de Oliveira 2017-02-10 21:47 ` Steven Rostedt 2017-02-11 7:12 ` luca abeni 2017-02-13 11:10 ` Peter Zijlstra 2017-02-13 14:59 ` Steven Rostedt 2017-02-13 15:35 ` Juri Lelli 2017-02-10 19:48 ` [PATCH 2/2] sched/deadline: Throttle a constrained deadline task activated after the deadline Daniel Bristot de Oliveira 2017-02-11 7:15 ` luca abeni 2017-02-11 8:00 ` Mike Galbraith 2017-02-11 14:35 ` Steven Rostedt 2017-02-13 11:12 ` Peter Zijlstra 2017-02-13 11:12 ` Peter Zijlstra 2017-02-13 13:16 ` Daniel Bristot de Oliveira 2017-02-13 15:33 ` Steven Rostedt 2017-02-13 15:46 ` Daniel Bristot de Oliveira 2017-02-13 16:21 ` Daniel Bristot de Oliveira
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.