linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE
@ 2014-10-24  9:16 Juri Lelli
  2014-10-24  9:16 ` [PATCH 1/4] sched/deadline: fix bandwidth check/update when migrating tasks between exclusive cpusets Juri Lelli
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Juri Lelli @ 2014-10-24  9:16 UTC (permalink / raw)
  To: linux-kernel, peterz
  Cc: juri.lelli, mingo, daniel.wagner, vincent, Juri Lelli

Hi all,

this is basically a resend of 4 patches fixing problems around AC
and PI related to SCHED_DEADLINE. You can find the full story here:

 - https://lkml.org/lkml/2014/9/19/134
 - https://lkml.org/lkml/2014/9/24/493

First two patches fix AC. They are the same as the previous version,
plus aligning to commit f10e00f4bf36 ("sched/dl: Use dl_bw_of() under
rcu_read_lock_sched()"). Last two fix PI.

Thanks a lot again to Daniel for spending time fixing and testing
the PI stuff! And to Peter and Vincent for reviews and comments.

Best Regards,

- Juri

Juri Lelli (4):
  sched/deadline: fix bandwidth check/update when migrating tasks
    between exclusive cpusets
  sched/deadline: ensure that updates to exclusive cpusets don't break
    AC
  sched/deadline: don't replenish from a !SCHED_DEADLINE entity
  sched/deadline: fix races between rt_mutex_setprio and dl_task_timer

 include/linux/sched.h   |  4 +++
 kernel/cpuset.c         | 23 ++++++------
 kernel/sched/core.c     | 94 +++++++++++++++++++++++++++++++++++++++----------
 kernel/sched/deadline.c | 58 +++++++++++++++++++++++++-----
 kernel/sched/sched.h    | 19 ++++++++++
 5 files changed, 160 insertions(+), 38 deletions(-)

-- 
2.1.2



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] sched/deadline: fix bandwidth check/update when migrating tasks between exclusive cpusets
  2014-10-24  9:16 [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Juri Lelli
@ 2014-10-24  9:16 ` Juri Lelli
  2014-10-24  9:16 ` [PATCH 2/4] sched/deadline: ensure that updates to exclusive cpusets don't break AC Juri Lelli
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Juri Lelli @ 2014-10-24  9:16 UTC (permalink / raw)
  To: linux-kernel, peterz
  Cc: juri.lelli, mingo, daniel.wagner, vincent, Juri Lelli,
	Dario Faggioli, Michael Trimarchi, Fabio Checconi, Li Zefan,
	cgroups

Exclusive cpusets are the only way users can restrict SCHED_DEADLINE
tasks' affinity (performing what is commonly called clustered
scheduling). Unfortunately, this is currently broken for two reasons:

 - No check is performed when the user tries to attach a task to
   an exclusive cpuset (recall that exclusive cpusets have an
   associated maximum allowed bandwidth).

 - Bandwidths of source and destination cpusets are not correctly
   updated after a task is migrated between them.

This patch fixes both things at once, as they are opposite faces
of the same coin.

The check is performed in cpuset_can_attach(), as there aren't any
points of failure after that function. The update is split in two
halves: we first reserve bandwidth in the destination cpuset, after
we pass the check in cpuset_can_attach(); we then release bandwidth
from the source cpuset when the task's affinity is actually changed.
Even if there can be time windows when sched_setattr() may
erroneously fail in the source cpuset, we are fine with it, as we
can't perform an atomic update of both cpusets at once.
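The two-phase reserve/release accounting described above can be
modelled in plain user-space C. This is only a sketch: the struct
and helper names mirror the kernel's struct dl_bw and __dl_*()
helpers, but none of this is the actual kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of struct dl_bw: bw is the per-CPU bandwidth
 * cap (bw == -1 means admission control is disabled), total_bw the
 * bandwidth currently allocated in this root_domain. */
struct dl_bw_model {
	int64_t bw;
	int64_t total_bw;
};

/* Mirrors __dl_overflow(): would swapping old_bw for new_bw exceed
 * the cap over 'cpus' CPUs? */
static bool dl_overflow_model(const struct dl_bw_model *dl_b, int cpus,
			      int64_t old_bw, int64_t new_bw)
{
	return dl_b->bw != -1 &&
	       dl_b->bw * cpus < dl_b->total_bw - old_bw + new_bw;
}

/* Phase 1: reserve in the destination, as task_can_attach() does;
 * returns -1 where the kernel would return -EBUSY. */
static int reserve_dst(struct dl_bw_model *dst, int cpus, int64_t tsk_bw)
{
	if (dl_overflow_model(dst, cpus, 0, tsk_bw))
		return -1;
	dst->total_bw += tsk_bw;	/* __dl_add() */
	return 0;
}

/* Phase 2: release from the source once the affinity change really
 * happens, as set_cpus_allowed_dl() does. */
static void release_src(struct dl_bw_model *src, int64_t tsk_bw)
{
	src->total_bw -= tsk_bw;	/* __dl_clear() */
}
```

With bw scaled so that 100 means a fully loaded CPU, a 2-CPU
destination already at 180 can still admit a 20-unit task, while any
further reservation is refused; the source only drops its share in
phase 2, which is why sched_setattr() may transiently over-account.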

Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Reported-by: Vincent Legout <vincent@legout.info>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Michael Trimarchi <michael@amarulasolutions.com>
Cc: Fabio Checconi <fchecconi@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: linux-kernel@vger.kernel.org
Cc: cgroups@vger.kernel.org
---
 include/linux/sched.h   |  2 ++
 kernel/cpuset.c         | 13 ++-------
 kernel/sched/core.c     | 73 ++++++++++++++++++++++++++++++++++++-------------
 kernel/sched/deadline.c | 27 ++++++++++++++++--
 kernel/sched/sched.h    | 19 +++++++++++++
 5 files changed, 102 insertions(+), 32 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 18f5262..f225f20 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2032,6 +2032,8 @@ static inline void tsk_restore_flags(struct task_struct *task,
 	task->flags |= orig_flags & flags;
 }
 
+extern int task_can_attach(struct task_struct *p,
+			   const struct cpumask *cs_cpus_allowed);
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p,
 			       const struct cpumask *new_mask);
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 22874d7..ab9be24 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -1428,17 +1428,8 @@ static int cpuset_can_attach(struct cgroup_subsys_state *css,
 		goto out_unlock;
 
 	cgroup_taskset_for_each(task, tset) {
-		/*
-		 * Kthreads which disallow setaffinity shouldn't be moved
-		 * to a new cpuset; we don't want to change their cpu
-		 * affinity and isolating such threads by their set of
-		 * allowed nodes is unnecessary.  Thus, cpusets are not
-		 * applicable for such threads.  This prevents checking for
-		 * success of set_cpus_allowed_ptr() on all attached tasks
-		 * before cpus_allowed may be changed.
-		 */
-		ret = -EINVAL;
-		if (task->flags & PF_NO_SETAFFINITY)
+		ret = task_can_attach(task, cs->cpus_allowed);
+		if (ret)
 			goto out_unlock;
 		ret = security_task_setscheduler(task);
 		if (ret)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c84bdc0..07ec9bb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2050,25 +2050,6 @@ static inline int dl_bw_cpus(int i)
 }
 #endif
 
-static inline
-void __dl_clear(struct dl_bw *dl_b, u64 tsk_bw)
-{
-	dl_b->total_bw -= tsk_bw;
-}
-
-static inline
-void __dl_add(struct dl_bw *dl_b, u64 tsk_bw)
-{
-	dl_b->total_bw += tsk_bw;
-}
-
-static inline
-bool __dl_overflow(struct dl_bw *dl_b, int cpus, u64 old_bw, u64 new_bw)
-{
-	return dl_b->bw != -1 &&
-	       dl_b->bw * cpus < dl_b->total_bw - old_bw + new_bw;
-}
-
 /*
  * We must be sure that accepting a new task (or allowing changing the
  * parameters of an existing one) is consistent with the bandwidth
@@ -4641,6 +4622,60 @@ void init_idle(struct task_struct *idle, int cpu)
 #endif
 }
 
+int task_can_attach(struct task_struct *p,
+		    const struct cpumask *cs_cpus_allowed)
+{
+	int ret = 0;
+
+	/*
+	 * Kthreads which disallow setaffinity shouldn't be moved
+	 * to a new cpuset; we don't want to change their cpu
+	 * affinity and isolating such threads by their set of
+	 * allowed nodes is unnecessary.  Thus, cpusets are not
+	 * applicable for such threads.  This prevents checking for
+	 * success of set_cpus_allowed_ptr() on all attached tasks
+	 * before cpus_allowed may be changed.
+	 */
+	if (p->flags & PF_NO_SETAFFINITY) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+#ifdef CONFIG_SMP
+	if (dl_task(p) && !cpumask_intersects(task_rq(p)->rd->span,
+					      cs_cpus_allowed)) {
+		unsigned int dest_cpu = cpumask_any_and(cpu_active_mask,
+							cs_cpus_allowed);
+		struct dl_bw *dl_b;
+		bool overflow;
+		int cpus;
+		unsigned long flags;
+
+		rcu_read_lock_sched();
+		dl_b = dl_bw_of(dest_cpu);
+		raw_spin_lock_irqsave(&dl_b->lock, flags);
+		cpus = dl_bw_cpus(dest_cpu);
+		overflow = __dl_overflow(dl_b, cpus, 0, p->dl.dl_bw);
+		if (overflow)
+			ret = -EBUSY;
+		else {
+			/*
+			 * We reserve space for this task in the destination
+			 * root_domain, as we can't fail after this point.
+			 * We will free resources in the source root_domain
+			 * later on (see set_cpus_allowed_dl()).
+			 */
+			__dl_add(dl_b, p->dl.dl_bw);
+		}
+		raw_spin_unlock_irqrestore(&dl_b->lock, flags);
+		rcu_read_unlock_sched();
+
+	}
+#endif
+out:
+	return ret;
+}
+
 #ifdef CONFIG_SMP
 /*
  * move_queued_task - move a queued task to new rq.
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index abfaf3d..710621b 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1498,10 +1498,35 @@ static void set_cpus_allowed_dl(struct task_struct *p,
 				const struct cpumask *new_mask)
 {
 	struct rq *rq;
+	struct root_domain *src_rd;
 	int weight;
 
 	BUG_ON(!dl_task(p));
 
+	rq = task_rq(p);
+	src_rd = rq->rd;
+	/*
+	 * Migrating a SCHED_DEADLINE task between exclusive
+	 * cpusets (different root_domains) entails a bandwidth
+	 * update. We already made space for us in the destination
+	 * domain (see cpuset_can_attach()).
+	 */
+	if (!cpumask_intersects(src_rd->span, new_mask)) {
+		struct dl_bw *src_dl_b;
+
+		rcu_read_lock_sched();
+		src_dl_b = dl_bw_of(cpu_of(rq));
+		/*
+		 * We now free resources of the root_domain we are migrating
+		 * off. In the worst case, sched_setattr() may temporarily fail
+		 * until we complete the update.
+		 */
+		raw_spin_lock(&src_dl_b->lock);
+		__dl_clear(src_dl_b, p->dl.dl_bw);
+		raw_spin_unlock(&src_dl_b->lock);
+		rcu_read_unlock_sched();
+	}
+
 	/*
 	 * Update only if the task is actually running (i.e.,
 	 * it is on the rq AND it is not throttled).
@@ -1518,8 +1543,6 @@ static void set_cpus_allowed_dl(struct task_struct *p,
 	if ((p->nr_cpus_allowed > 1) == (weight > 1))
 		return;
 
-	rq = task_rq(p);
-
 	/*
 	 * The process used to be able to migrate OR it can now migrate
 	 */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 6130251..5dfb014 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -176,6 +176,25 @@ struct dl_bw {
 	u64 bw, total_bw;
 };
 
+static inline
+void __dl_clear(struct dl_bw *dl_b, u64 tsk_bw)
+{
+	dl_b->total_bw -= tsk_bw;
+}
+
+static inline
+void __dl_add(struct dl_bw *dl_b, u64 tsk_bw)
+{
+	dl_b->total_bw += tsk_bw;
+}
+
+static inline
+bool __dl_overflow(struct dl_bw *dl_b, int cpus, u64 old_bw, u64 new_bw)
+{
+	return dl_b->bw != -1 &&
+	       dl_b->bw * cpus < dl_b->total_bw - old_bw + new_bw;
+}
+
 extern struct mutex sched_domains_mutex;
 
 #ifdef CONFIG_CGROUP_SCHED
-- 
2.1.2




* [PATCH 2/4] sched/deadline: ensure that updates to exclusive cpusets don't break AC
  2014-10-24  9:16 [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Juri Lelli
  2014-10-24  9:16 ` [PATCH 1/4] sched/deadline: fix bandwidth check/update when migrating tasks between exclusive cpusets Juri Lelli
@ 2014-10-24  9:16 ` Juri Lelli
  2014-10-24  9:16 ` [PATCH 3/4] sched/deadline: don't replenish from a !SCHED_DEADLINE entity Juri Lelli
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Juri Lelli @ 2014-10-24  9:16 UTC (permalink / raw)
  To: linux-kernel, peterz
  Cc: juri.lelli, mingo, daniel.wagner, vincent, Juri Lelli,
	Dario Faggioli, Michael Trimarchi, Fabio Checconi, Li Zefan,
	cgroups

How we deal with updates to exclusive cpusets is currently broken.
As an example, suppose we have an exclusive cpuset composed of two
cpus: A[cpu0,cpu1]. We can assign SCHED_DEADLINE tasks to it up to
the allowed bandwidth. If we now want to modify cpusetA's cpumask,
we have to check that removing a cpu's worth of bandwidth doesn't
break AC guarantees. The current code doesn't perform this check.

This patch fixes the problem above, denying an update if the
new cpumask won't have enough bandwidth for SCHED_DEADLINE tasks
that are currently active.
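The admission test the patch adds boils down to one inequality; a
user-space sketch of the cpuset_cpumask_can_shrink() logic (names
are illustrative, not the kernel API) could look like:

```c
#include <assert.h>
#include <stdint.h>

/* Can the cpuset shrink to trial_cpus CPUs without breaking AC?
 * bw is the per-CPU bandwidth cap (-1 disables admission control),
 * total_bw the bandwidth already allocated to -deadline tasks.
 * Returns 1 if the shrink is allowed, 0 if it would break AC
 * (the kernel then fails validate_change() with -EBUSY). */
static int can_shrink_model(int64_t bw, int64_t total_bw, int trial_cpus)
{
	if (bw != -1 && bw * trial_cpus < total_bw)
		return 0;
	return 1;
}
```

E.g. with bw = 100 per CPU and 150 units already allocated, shrinking
from two CPUs to one must be refused, since a single CPU can only
cover 100 units of the guaranteed bandwidth.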

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Reported-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Michael Trimarchi <michael@amarulasolutions.com>
Cc: Fabio Checconi <fchecconi@gmail.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: linux-kernel@vger.kernel.org
Cc: cgroups@vger.kernel.org
---
 include/linux/sched.h |  2 ++
 kernel/cpuset.c       | 10 ++++++++++
 kernel/sched/core.c   | 21 +++++++++++++++++++++
 3 files changed, 33 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f225f20..d4ed0db 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2032,6 +2032,8 @@ static inline void tsk_restore_flags(struct task_struct *task,
 	task->flags |= orig_flags & flags;
 }
 
+extern int cpuset_cpumask_can_shrink(const struct cpumask *cur,
+				     const struct cpumask *trial);
 extern int task_can_attach(struct task_struct *p,
 			   const struct cpumask *cs_cpus_allowed);
 #ifdef CONFIG_SMP
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index ab9be24..9066cc7 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -505,6 +505,16 @@ static int validate_change(struct cpuset *cur, struct cpuset *trial)
 			goto out;
 	}
 
+	/*
+	 * We can't shrink if we won't have enough room for SCHED_DEADLINE
+	 * tasks.
+	 */
+	ret = -EBUSY;
+	if (is_cpu_exclusive(cur) &&
+	    !cpuset_cpumask_can_shrink(cur->cpus_allowed,
+				       trial->cpus_allowed))
+		goto out;
+
 	ret = 0;
 out:
 	rcu_read_unlock();
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 07ec9bb..35674eb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4622,6 +4622,27 @@ void init_idle(struct task_struct *idle, int cpu)
 #endif
 }
 
+int cpuset_cpumask_can_shrink(const struct cpumask *cur,
+			      const struct cpumask *trial)
+{
+	int ret = 1, trial_cpus;
+	struct dl_bw *cur_dl_b;
+	unsigned long flags;
+
+	rcu_read_lock_sched();
+	cur_dl_b = dl_bw_of(cpumask_any(cur));
+	trial_cpus = cpumask_weight(trial);
+
+	raw_spin_lock_irqsave(&cur_dl_b->lock, flags);
+	if (cur_dl_b->bw != -1 &&
+	    cur_dl_b->bw * trial_cpus < cur_dl_b->total_bw)
+		ret = 0;
+	raw_spin_unlock_irqrestore(&cur_dl_b->lock, flags);
+	rcu_read_unlock_sched();
+
+	return ret;
+}
+
 int task_can_attach(struct task_struct *p,
 		    const struct cpumask *cs_cpus_allowed)
 {
-- 
2.1.2




* [PATCH 3/4] sched/deadline: don't replenish from a !SCHED_DEADLINE entity
  2014-10-24  9:16 [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Juri Lelli
  2014-10-24  9:16 ` [PATCH 1/4] sched/deadline: fix bandwidth check/update when migrating tasks between exclusive cpusets Juri Lelli
  2014-10-24  9:16 ` [PATCH 2/4] sched/deadline: ensure that updates to exclusive cpusets don't break AC Juri Lelli
@ 2014-10-24  9:16 ` Juri Lelli
  2014-10-28 11:01   ` [tip:sched/core] sched/deadline: Don't " tip-bot for Juri Lelli
  2014-10-24  9:16 ` [PATCH 4/4] sched/deadline: fix races between rt_mutex_setprio and dl_task_timer Juri Lelli
  2014-10-24 11:34 ` [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Daniel Wagner
  4 siblings, 1 reply; 10+ messages in thread
From: Juri Lelli @ 2014-10-24  9:16 UTC (permalink / raw)
  To: linux-kernel, peterz
  Cc: juri.lelli, mingo, daniel.wagner, vincent, Juri Lelli,
	Dario Faggioli, Michael Trimarchi, Fabio Checconi

In the deboost path, right after the dl_boosted flag has been
reset, we can currently end up replenishing using -deadline
parameters of a !SCHED_DEADLINE entity. This of course causes
a bug, as those parameters are empty.

In the case depicted above it is safe to simply bail out, as
the deboosted task is going to be back to its original scheduling
class anyway.
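The fixed branch in enqueue_task_dl() can be modelled as a small
decision function. This is a sketch with invented names, not the
kernel code: "task_is_dl" stands for dl_prio(p->normal_prio),
"pi_task_is_dl" for dl_prio(pi_task->normal_prio).

```c
#include <assert.h>
#include <stdbool.h>

/* Outcomes of the enqueue decision:
 *   0 = enqueue using the task's own -deadline parameters
 *   1 = enqueue using the boosting (pi) task's parameters
 *   2 = bail out: a deboosting !SCHED_DEADLINE task has nothing
 *       valid to replenish from (its -deadline parameters are empty)
 */
static int enqueue_dl_decision(bool has_pi_task, bool dl_boosted,
			       bool pi_task_is_dl, bool task_is_dl)
{
	if (has_pi_task && dl_boosted && pi_task_is_dl)
		return 1;	/* replenish from the donor's parameters */
	if (!task_is_dl)
		return 2;	/* the case this patch handles */
	return 0;
}
```

Before the patch, the second test was missing, so case 2 fell through
to a replenishment from empty parameters.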

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Michael Trimarchi <michael@amarulasolutions.com>
Cc: Fabio Checconi <fchecconi@gmail.com>
Cc: linux-kernel@vger.kernel.org
---
 kernel/sched/deadline.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 710621b..cf2c040 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -847,8 +847,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 	 * smaller than our one... OTW we keep our runtime and
 	 * deadline.
 	 */
-	if (pi_task && p->dl.dl_boosted && dl_prio(pi_task->normal_prio))
+	if (pi_task && p->dl.dl_boosted && dl_prio(pi_task->normal_prio)) {
 		pi_se = &pi_task->dl;
+	} else if (!dl_prio(p->normal_prio)) {
+		/*
+		 * Special case in which we have a !SCHED_DEADLINE task
+		 * that is going to be deboosted, but exceeds its
+		 * runtime while doing so. No point in replenishing
+		 * it, as it's going to return back to its original
+		 * scheduling class after this.
+		 */
+		BUG_ON(!p->dl.dl_boosted || flags != ENQUEUE_REPLENISH);
+		return;
+	}
 
 	/*
 	 * If p is throttled, we do nothing. In fact, if it exhausted
-- 
2.1.2




* [PATCH 4/4] sched/deadline: fix races between rt_mutex_setprio and dl_task_timer
  2014-10-24  9:16 [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Juri Lelli
                   ` (2 preceding siblings ...)
  2014-10-24  9:16 ` [PATCH 3/4] sched/deadline: don't replenish from a !SCHED_DEADLINE entity Juri Lelli
@ 2014-10-24  9:16 ` Juri Lelli
  2014-10-28 11:02   ` [tip:sched/core] sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer() tip-bot for Juri Lelli
  2014-10-24 11:34 ` [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Daniel Wagner
  4 siblings, 1 reply; 10+ messages in thread
From: Juri Lelli @ 2014-10-24  9:16 UTC (permalink / raw)
  To: linux-kernel, peterz
  Cc: juri.lelli, mingo, daniel.wagner, vincent, Juri Lelli,
	Dario Faggioli, Michael Trimarchi, Fabio Checconi

dl_task_timer() is racy against several paths. Daniel noticed that
the replenishment timer may experience a race condition against an
enqueue_dl_entity() called from rt_mutex_setprio(). With his own
words:

 rt_mutex_setprio() resets p->dl.dl_throttled. So the pattern is:
 start_dl_timer() throttled = 1, rt_mutex_setprio() throttled = 0,
 sched_switch() -> enqueue_task(), dl_task_timer-> enqueue_task()
 throttled is 0

=> BUG_ON(on_dl_rq(dl_se)) fires as the scheduling entity is already
enqueued on the -deadline runqueue.

As we do for the other races, we just bail out in the replenishment
timer code.
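The widened bail-out condition in dl_task_timer() is pure flag logic
and can be sketched in isolation (flag names mirror sched_dl_entity
fields, but this is a model, not kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Should the replenishment timer bail out instead of enqueueing?
 * Bail if the task left SCHED_DEADLINE, changed its reservation
 * parameters (dl_new), is in the PI boost/deboost path, or was
 * already unthrottled behind our back (e.g. by rt_mutex_setprio()):
 * in all these cases the task is on the runqueue, or will be
 * enqueued again, anyway. */
static bool dl_timer_should_bail(bool is_dl_task, bool dl_new,
				 bool dl_boosted, bool dl_throttled)
{
	return !is_dl_task || dl_new || dl_boosted || !dl_throttled;
}
```

The race Daniel reported is the dl_throttled case: the timer fires
with throttled already reset, and enqueueing again would trip
BUG_ON(on_dl_rq(dl_se)).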

Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Michael Trimarchi <michael@amarulasolutions.com>
Cc: Fabio Checconi <fchecconi@gmail.com>
Cc: linux-kernel@vger.kernel.org
---
 kernel/sched/deadline.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index cf2c040..28d6088 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -518,12 +518,20 @@ again:
 	}
 
 	/*
-	 * We need to take care of a possible races here. In fact, the
-	 * task might have changed its scheduling policy to something
-	 * different from SCHED_DEADLINE or changed its reservation
-	 * parameters (through sched_setattr()).
+	 * We need to take care of several possible races here:
+	 *
+	 *   - the task might have changed its scheduling policy
+	 *     to something different than SCHED_DEADLINE
+	 *   - the task might have changed its reservation parameters
+	 *     (through sched_setattr())
+	 *   - the task might have been boosted by someone else and
+	 *     might be in the boosting/deboosting path
+	 *
+	 * In all these cases we bail out, as the task is already
+	 * in the runqueue or is going to be enqueued back anyway.
 	 */
-	if (!dl_task(p) || dl_se->dl_new)
+	if (!dl_task(p) || dl_se->dl_new ||
+	    dl_se->dl_boosted || !dl_se->dl_throttled)
 		goto unlock;
 
 	sched_clock_tick();
-- 
2.1.2




* Re: [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE
  2014-10-24  9:16 [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Juri Lelli
                   ` (3 preceding siblings ...)
  2014-10-24  9:16 ` [PATCH 4/4] sched/deadline: fix races between rt_mutex_setprio and dl_task_timer Juri Lelli
@ 2014-10-24 11:34 ` Daniel Wagner
  2014-10-24 13:21   ` Juri Lelli
  4 siblings, 1 reply; 10+ messages in thread
From: Daniel Wagner @ 2014-10-24 11:34 UTC (permalink / raw)
  To: Juri Lelli, linux-kernel, peterz; +Cc: juri.lelli, mingo, vincent

Hi everyone,

On 10/24/2014 11:16 AM, Juri Lelli wrote:
> Hi all,
> 
> this is basically a resend of 4 patches fixing problems around AC
> and PI related to SCHED_DEADLINE. You can find the full story here:
> 
>  - https://lkml.org/lkml/2014/9/19/134
>  - https://lkml.org/lkml/2014/9/24/493
> 
> First two patches fix AC. They are the same as the previous version,
> plus aligning to commit f10e00f4bf36 ("sched/dl: Use dl_bw_of() under
> rcu_read_lock_sched()"). Last two fix PI.
> 
> Thanks a lot again to Daniel for spending time fixing and testing
> the PI stuff! And to Peter and Vincent for reviews and comments.

Thanks for the patches!

> Best Regards,
> 
> - Juri
> 
> Juri Lelli (4):
>   sched/deadline: fix bandwidth check/update when migrating tasks
>     between exclusive cpusets
>   sched/deadline: ensure that updates to exclusive cpusets don't break
>     AC
>   sched/deadline: don't replenish from a !SCHED_DEADLINE entity
>   sched/deadline: fix races between rt_mutex_setprio and dl_task_timer

I have tested the whole patch set with my test program and everything
seems to work fine. If you like you can add a

Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>

cheers,
daniel


* Re: [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE
  2014-10-24 11:34 ` [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE Daniel Wagner
@ 2014-10-24 13:21   ` Juri Lelli
  2014-10-24 13:39     ` Peter Zijlstra
  0 siblings, 1 reply; 10+ messages in thread
From: Juri Lelli @ 2014-10-24 13:21 UTC (permalink / raw)
  To: Daniel Wagner, linux-kernel, peterz; +Cc: juri.lelli, mingo, vincent

Hi Daniel,

On 24/10/14 12:34, Daniel Wagner wrote:
> Hi everyone,
> 
> On 10/24/2014 11:16 AM, Juri Lelli wrote:
>> Hi all,
>>
>> this is basically a resend of 4 patches fixing problems around AC
>> and PI related to SCHED_DEADLINE. You can find the full story here:
>>
>>  - https://lkml.org/lkml/2014/9/19/134
>>  - https://lkml.org/lkml/2014/9/24/493
>>
>> First two patches fix AC. They are the same as the previous version,
>> plus aligning to commit f10e00f4bf36 ("sched/dl: Use dl_bw_of() under
>> rcu_read_lock_sched()"). Last two fix PI.
>>
>> Thanks a lot again to Daniel for spending time fixing and testing
>> the PI stuff! And to Peter and Vincent for reviews and comments.
> 
> Thanks for the patches!
> 
>> Best Regards,
>>
>> - Juri
>>
>> Juri Lelli (4):
>>   sched/deadline: fix bandwidth check/update when migrating tasks
>>     between exclusive cpusets
>>   sched/deadline: ensure that updates to exclusive cpusets don't break
>>     AC
>>   sched/deadline: don't replenish from a !SCHED_DEADLINE entity
>>   sched/deadline: fix races between rt_mutex_setprio and dl_task_timer
> 
> I have tested the whole patch set with my test program and everything
> seems to work fine. If you like you can add a
> 

Cool! Thanks a lot.

> Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
> 

Peter, maybe you can add it while applying them?

Thanks,

- Juri



* Re: [PATCH 0/4] Fix various bits of AC and PI for SCHED_DEADLINE
  2014-10-24 13:21   ` Juri Lelli
@ 2014-10-24 13:39     ` Peter Zijlstra
  0 siblings, 0 replies; 10+ messages in thread
From: Peter Zijlstra @ 2014-10-24 13:39 UTC (permalink / raw)
  To: Juri Lelli; +Cc: Daniel Wagner, linux-kernel, juri.lelli, mingo, vincent

On Fri, Oct 24, 2014 at 02:21:14PM +0100, Juri Lelli wrote:
> > 
> 
> Peter, maybe you can add it while applying them?

Already done, thanks!


* [tip:sched/core] sched/deadline: Don't replenish from a !SCHED_DEADLINE entity
  2014-10-24  9:16 ` [PATCH 3/4] sched/deadline: don't replenish from a !SCHED_DEADLINE entity Juri Lelli
@ 2014-10-28 11:01   ` tip-bot for Juri Lelli
  0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Juri Lelli @ 2014-10-28 11:01 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, peterz, linux-kernel, tglx, raistlin, juri.lelli,
	mingo, hpa, daniel.wagner, fchecconi, michael

Commit-ID:  64be6f1f5f710f5995d41caf8a1767fe6d2b5a87
Gitweb:     http://git.kernel.org/tip/64be6f1f5f710f5995d41caf8a1767fe6d2b5a87
Author:     Juri Lelli <juri.lelli@arm.com>
AuthorDate: Fri, 24 Oct 2014 10:16:37 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 28 Oct 2014 10:46:00 +0100

sched/deadline: Don't replenish from a !SCHED_DEADLINE entity

In the deboost path, right after the dl_boosted flag has been
reset, we can currently end up replenishing using -deadline
parameters of a !SCHED_DEADLINE entity. This of course causes
a bug, as those parameters are empty.

In the case depicted above it is safe to simply bail out, as
the deboosted task is going to be back to its original scheduling
class anyway.

Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: vincent@legout.info
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Michael Trimarchi <michael@amarulasolutions.com>
Cc: Fabio Checconi <fchecconi@gmail.com>
Link: http://lkml.kernel.org/r/1414142198-18552-4-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/deadline.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 256e577..92279ea 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -847,8 +847,19 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 	 * smaller than our one... OTW we keep our runtime and
 	 * deadline.
 	 */
-	if (pi_task && p->dl.dl_boosted && dl_prio(pi_task->normal_prio))
+	if (pi_task && p->dl.dl_boosted && dl_prio(pi_task->normal_prio)) {
 		pi_se = &pi_task->dl;
+	} else if (!dl_prio(p->normal_prio)) {
+		/*
+		 * Special case in which we have a !SCHED_DEADLINE task
+		 * that is going to be deboosted, but exceeds its
+		 * runtime while doing so. No point in replenishing
+		 * it, as it's going to return back to its original
+		 * scheduling class after this.
+		 */
+		BUG_ON(!p->dl.dl_boosted || flags != ENQUEUE_REPLENISH);
+		return;
+	}
 
 	/*
 	 * If p is throttled, we do nothing. In fact, if it exhausted


* [tip:sched/core] sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer()
  2014-10-24  9:16 ` [PATCH 4/4] sched/deadline: fix races between rt_mutex_setprio and dl_task_timer Juri Lelli
@ 2014-10-28 11:02   ` tip-bot for Juri Lelli
  0 siblings, 0 replies; 10+ messages in thread
From: tip-bot for Juri Lelli @ 2014-10-28 11:02 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: juri.lelli, hpa, peterz, tglx, michael, daniel.wagner, mingo,
	linux-kernel, torvalds, raistlin, fchecconi

Commit-ID:  aee38ea95419c818dfdde52b115aeffe9cbb259b
Gitweb:     http://git.kernel.org/tip/aee38ea95419c818dfdde52b115aeffe9cbb259b
Author:     Juri Lelli <juri.lelli@arm.com>
AuthorDate: Fri, 24 Oct 2014 10:16:38 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Tue, 28 Oct 2014 10:46:01 +0100

sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer()

dl_task_timer() is racy against several paths. Daniel noticed that
the replenishment timer may experience a race condition against an
enqueue_dl_entity() called from rt_mutex_setprio(). With his own
words:

 rt_mutex_setprio() resets p->dl.dl_throttled. So the pattern is:
 start_dl_timer() throttled = 1, rt_mutex_setprio() throttled = 0,
 sched_switch() -> enqueue_task(), dl_task_timer-> enqueue_task()
 throttled is 0

=> BUG_ON(on_dl_rq(dl_se)) fires as the scheduling entity is already
enqueued on the -deadline runqueue.

As we do for the other races, we just bail out in the replenishment
timer code.

Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Signed-off-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: vincent@legout.info
Cc: Dario Faggioli <raistlin@linux.it>
Cc: Michael Trimarchi <michael@amarulasolutions.com>
Cc: Fabio Checconi <fchecconi@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414142198-18552-5-git-send-email-juri.lelli@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/deadline.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 92279ea..4616789 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -518,12 +518,20 @@ again:
 	}
 
 	/*
-	 * We need to take care of a possible races here. In fact, the
-	 * task might have changed its scheduling policy to something
-	 * different from SCHED_DEADLINE or changed its reservation
-	 * parameters (through sched_setattr()).
+	 * We need to take care of several possible races here:
+	 *
+	 *   - the task might have changed its scheduling policy
+	 *     to something different than SCHED_DEADLINE
+	 *   - the task might have changed its reservation parameters
+	 *     (through sched_setattr())
+	 *   - the task might have been boosted by someone else and
+	 *     might be in the boosting/deboosting path
+	 *
+	 * In all these cases we bail out, as the task is already
+	 * in the runqueue or is going to be enqueued back anyway.
 	 */
-	if (!dl_task(p) || dl_se->dl_new)
+	if (!dl_task(p) || dl_se->dl_new ||
+	    dl_se->dl_boosted || !dl_se->dl_throttled)
 		goto unlock;
 
 	sched_clock_tick();

