All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kirill Tkhai <tkhai@yandex.ru>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>, Luca Abeni <luca.abeni@unitn.it>,
	Ingo Molnar <mingo@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"juri.lelli@gmail.com" <juri.lelli@gmail.com>
Subject: Re: Another SCHED_DEADLINE bug (with bisection and possible fix)
Date: Wed, 14 Jan 2015 15:43:07 +0300	[thread overview]
Message-ID: <4632021421239387@web25g.yandex.ru> (raw)
In-Reply-To: <20150113140436.GI25256@twins.programming.kicks-ass.net>

13.01.2015, 17:04, "Peter Zijlstra" <peterz@infradead.org>:
> On Tue, Jan 12, 2015 at 12:26:40PM +0300, Kirill Tkhai wrote:
>>>  Well, I'm inclined to agree to Luca's viewpoint. We should not change
>>>  parameters of a throttled task or we may affect other tasks.
>>  Could you explain your viewpoint more? How does this affects other tasks?
>
> I agree with Juri and Luca here. Luca gave an example that DoS's
> SCHED_DEADLINE.
>
> In Luca's example we'd constantly call sched_setattr() on a task, which
> results in it never throttling and that will affect other tasks, because
> if we're running, they cannot be.
>>  As I understand, in __setparam_dl() we are sure that there is enough
>>  dl_bw. In __sched_setscheduler() we call it after dl_overflow() check.
>
> Yes, _but_, as per the above. BW includes the sleep time, if we fail to
> sleep its all pointless.
>
> We got the runtime (before the throttle) under the promise that we'd go
> sleep. When we change our parameters while being throttled we should
> adjust the throttle time accordingly. If say we changed from 30% to 35%
> we are allowed to sleep less. Now sleeping more than strictly required
> is only detrimental to ourselves, so that is not (too) critical.
>
> However, the other way around, if we change from 35% to 30% we should
> now effectively sleep longer (for the same runtime we already consumed),
> and we must do this, because admission control will instantly assume the
> 5% change in bandwidth to be available. If we sleep short, this BW is
> not in fact available and we break our guarantees.
>>>>>   Anyway, I am fine with every patch that fixes the bug :)
>>>>   Deadline class requires root privileges. So, I do not see a problem
>>>>   here. Please, see __sched_setscheduler().
>
> Its not a priv issue, not doing this makes it impossible to use
> SCHED_DEADLINE in the intended way, even for root.
>
> Say we have a userspace task that evaluates and changes runtime
> parameters for other tasks (basically what Luca is doing IIRC), and the
> changes keep resetting the sleep time, the whole guarantee system comes
> down, rendering the deadline system useless.
>>>>   If in the future we allow non-privileged users to increase deadline,
>>>>   we will reflect that in __setparam_dl() too.
>>>  I'd say it is better to implement the right behavior even for root, so
>>>  that we will find it right when we'll grant access to non root users
>>>  too. Also, even if root can do everything, we always try not to break
>>>  guarantees that come with admission control (root or non root that is).
>
> Just so.

How about something like this?

 include/linux/sched/deadline.h |  2 ++
 kernel/sched/core.c            | 33 +++++++++++++++++++++++++++++----
 kernel/sched/deadline.c        | 10 +---------
 kernel/sched/sched.h           |  1 -
 4 files changed, 32 insertions(+), 14 deletions(-)
diff --git a/include/linux/sched/deadline.h b/include/linux/sched/deadline.h
index 9d303b8..c70a7fc 100644
--- a/include/linux/sched/deadline.h
+++ b/include/linux/sched/deadline.h
@@ -21,4 +21,6 @@ static inline int dl_task(struct task_struct *p)
 	return dl_prio(p->prio);
 }
 
+extern enum hrtimer_restart dl_task_timer(struct hrtimer *timer);
+
 #endif /* _SCHED_DEADLINE_H */
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c0accc0..edf9d91 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1809,6 +1809,8 @@ void __dl_clear_params(struct task_struct *p)
 {
 	struct sched_dl_entity *dl_se = &p->dl;
 
+	dl_se->dl_throttled = 0;
+	dl_se->dl_yielded = 0;
 	dl_se->dl_runtime = 0;
 	dl_se->dl_deadline = 0;
 	dl_se->dl_period = 0;
@@ -1840,6 +1842,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 
 	RB_CLEAR_NODE(&p->dl.rb_node);
 	hrtimer_init(&p->dl.dl_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	p->dl.dl_timer.function = dl_task_timer;
 	__dl_clear_params(p);
 
 	INIT_LIST_HEAD(&p->rt.run_list);
@@ -3250,16 +3253,38 @@ static void
 __setparam_dl(struct task_struct *p, const struct sched_attr *attr)
 {
 	struct sched_dl_entity *dl_se = &p->dl;
+	struct hrtimer *timer = &dl_se->dl_timer;
+	struct rq *rq = task_rq(p);
+	int pending = 0;
+
+	if (hrtimer_is_queued(timer)) {
+		/*
+		 * Not need smp_rmb() here, synchronization points
+		 * are rq lock in our caller and in dl_task_timer().
+		 */
+		pending = 1;
+	} else if (hrtimer_callback_running(timer)) {
+		smp_rmb(); /* Pairs with rq lock in timer handler */
+
+		/* Has the timer handler already finished? */
+		if (dl_se->dl_throttled)
+			pending = 1; /* No */
+	}
+
+	if (!pending) {
+		BUG_ON(dl_se->dl_throttled);
+		dl_se->dl_new = 1;
+	} else {
+		dl_se->deadline = rq_clock(rq) + attr->sched_deadline;
+		dl_se->runtime = attr->sched_runtime;
+	}
 
-	init_dl_task_timer(dl_se);
+	dl_se->dl_yielded = 0;
 	dl_se->dl_runtime = attr->sched_runtime;
 	dl_se->dl_deadline = attr->sched_deadline;
 	dl_se->dl_period = attr->sched_period ?: dl_se->dl_deadline;
 	dl_se->flags = attr->sched_flags;
 	dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
-	dl_se->dl_throttled = 0;
-	dl_se->dl_new = 1;
-	dl_se->dl_yielded = 0;
 }
 
 /*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index b52092f..b457ca7 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -500,7 +500,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
  * updating (and the queueing back to dl_rq) will be done by the
  * next call to enqueue_task_dl().
  */
-static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
+enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 {
 	struct sched_dl_entity *dl_se = container_of(timer,
 						     struct sched_dl_entity,
@@ -559,14 +559,6 @@ static enum hrtimer_restart dl_task_timer(struct hrtimer *timer)
 	return HRTIMER_NORESTART;
 }
 
-void init_dl_task_timer(struct sched_dl_entity *dl_se)
-{
-	struct hrtimer *timer = &dl_se->dl_timer;
-
-	hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
-	timer->function = dl_task_timer;
-}
-
 static
 int dl_runtime_exceeded(struct rq *rq, struct sched_dl_entity *dl_se)
 {
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9a2a45c..ad3a2f0 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1257,7 +1257,6 @@ extern void init_rt_bandwidth(struct rt_bandwidth *rt_b, u64 period, u64 runtime
 
 extern struct dl_bandwidth def_dl_bandwidth;
 extern void init_dl_bandwidth(struct dl_bandwidth *dl_b, u64 period, u64 runtime);
-extern void init_dl_task_timer(struct sched_dl_entity *dl_se);
 
 unsigned long to_ratio(u64 period, u64 runtime);
 

  reply	other threads:[~2015-01-14 12:43 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-29 23:27 Another SCHED_DEADLINE bug (with bisection and possible fix) luca abeni
2015-01-04 22:51 ` Kirill Tkhai
2015-01-05 15:21   ` Luca Abeni
2015-01-05 23:07     ` Kirill Tkhai
2015-01-07  7:01       ` Luca Abeni
2015-01-07 12:29         ` Kirill Tkhai
2015-01-07 12:45           ` Luca Abeni
2015-01-07 13:04             ` Kirill Tkhai
2015-01-07 13:14               ` Luca Abeni
2015-01-09 11:15               ` Luca Abeni
2015-01-09 11:46                 ` Kirill Tkhai
2015-01-13  8:10           ` Juri Lelli
2015-01-13  9:26             ` Kirill Tkhai
2015-01-13 14:04               ` Peter Zijlstra
2015-01-14 12:43                 ` Kirill Tkhai [this message]
2015-01-15 11:23                   ` Luca Abeni
2015-01-15 12:23                     ` Peter Zijlstra
2015-01-15 13:35                       ` Luca Abeni
2015-01-28 14:08                         ` Peter Zijlstra
2015-01-28 14:40                           ` Luca Abeni
2015-01-30 10:25                           ` Luca Abeni
2015-01-30 10:35                           ` Juri Lelli
2015-01-31  9:56                             ` Peter Zijlstra
2015-02-02 11:31                               ` Juri Lelli
2015-02-02 13:57                               ` Luca Abeni
2015-02-04 14:34                           ` [tip:sched/core] sched/deadline: Fix deadline parameter modification handling tip-bot for Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4632021421239387@web25g.yandex.ru \
    --to=tkhai@yandex.ru \
    --cc=juri.lelli@arm.com \
    --cc=juri.lelli@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luca.abeni@unitn.it \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.