linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio
@ 2015-02-16  9:32 Xunlei Pang
  2015-02-16  9:32 ` [PATCH v4 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases Xunlei Pang
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Xunlei Pang @ 2015-02-16  9:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Andrew Morton,
	Dan Streetman, Xunlei Pang

From: Xunlei Pang <pang.xunlei@linaro.org>

If there're multiple nodes with the same prio as @node, currently
plist_add() will add @node behind all of them. Now we need to add
@node before all of these nodes for SMP RT scheduler.

This patch adds a common __plist_add() for adding @node before or
behind existing nodes with the same prio, then adds plist_add_head()
and plist_add_tail() inline wrapper functions for convenient uses.

Finally, define plist_add() with plist_add_tail() which has the same
behaviour as before.

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
 include/linux/plist.h | 30 +++++++++++++++++++++++++++++-
 lib/plist.c           | 15 ++++++++++++---
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/include/linux/plist.h b/include/linux/plist.h
index 9788360..e17bb96 100644
--- a/include/linux/plist.h
+++ b/include/linux/plist.h
@@ -138,7 +138,35 @@ static inline void plist_node_init(struct plist_node *node, int prio)
 	INIT_LIST_HEAD(&node->node_list);
 }
 
-extern void plist_add(struct plist_node *node, struct plist_head *head);
+extern void __plist_add(struct plist_node *node,
+				struct plist_head *head, bool is_head);
+
+/**
+ * plist_add_head - add @node to @head, before all existing same-prio nodes
+ *
+ * @node:	&struct plist_node pointer
+ * @head:	&struct plist_head pointer
+ */
+static inline
+void plist_add_head(struct plist_node *node, struct plist_head *head)
+{
+	__plist_add(node, head, 1);
+}
+
+/**
+ * plist_add_tail - add @node to @head, after all existing same-prio nodes
+ *
+ * @node:	&struct plist_node pointer
+ * @head:	&struct plist_head pointer
+ */
+static inline
+void plist_add_tail(struct plist_node *node, struct plist_head *head)
+{
+	__plist_add(node, head, 0);
+}
+
+#define plist_add plist_add_tail
+
 extern void plist_del(struct plist_node *node, struct plist_head *head);
 
 extern void plist_requeue(struct plist_node *node, struct plist_head *head);
diff --git a/lib/plist.c b/lib/plist.c
index d408e77..0e1f1b3 100644
--- a/lib/plist.c
+++ b/lib/plist.c
@@ -67,12 +67,16 @@ static void plist_check_head(struct plist_head *head)
 #endif
 
 /**
- * plist_add - add @node to @head
+ * __plist_add - add @node to @head
  *
  * @node:	&struct plist_node pointer
  * @head:	&struct plist_head pointer
+ * @is_head:	bool
+ *
+ * If there're any nodes with the same prio, add @node
+ * behind or before all of them according to @is_head.
  */
-void plist_add(struct plist_node *node, struct plist_head *head)
+void __plist_add(struct plist_node *node, struct plist_head *head, bool is_head)
 {
 	struct plist_node *first, *iter, *prev = NULL;
 	struct list_head *node_next = &head->node_list;
@@ -97,8 +101,13 @@ void plist_add(struct plist_node *node, struct plist_head *head)
 				struct plist_node, prio_list);
 	} while (iter != first);
 
-	if (!prev || prev->prio != node->prio)
+	if (!prev || prev->prio != node->prio) {
 		list_add_tail(&node->prio_list, &iter->prio_list);
+	} else if (is_head) {
+		list_add(&node->prio_list, &prev->prio_list);
+		list_del_init(&prev->prio_list);
+		node_next = &prev->node_list;
+	}
 ins_node:
 	list_add_tail(&node->node_list, node_next);
 
-- 
1.9.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases
  2015-02-16  9:32 [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio Xunlei Pang
@ 2015-02-16  9:32 ` Xunlei Pang
  2015-02-16 15:35   ` Steven Rostedt
  2015-02-16  9:32 ` [PATCH v4 3/3] sched/rt: Check to push the task when changing its affinity Xunlei Pang
  2015-02-16 15:32 ` [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio Steven Rostedt
  2 siblings, 1 reply; 5+ messages in thread
From: Xunlei Pang @ 2015-02-16  9:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Andrew Morton,
	Dan Streetman, Xunlei Pang

From: Xunlei Pang <pang.xunlei@linaro.org>

Currently, SMP RT scheduler has some trouble in dealing with
equal prio cases.

For example, in check_preempt_equal_prio():
When RT1(current task) gets preempted by RT2, if there is a
migratable RT3 with same prio, RT3 will be pushed away instead
of RT1 afterwards, because RT1 will be enqueued to the tail of
the pushable list when going through succeeding put_prev_task_rt()
triggered by resched. This broke FIFO.

Furthermore, this is also problematic for normal preempted cases
if there're some rt tasks queued with the same prio as current,
because current will be put behind these tasks in the pushable
queue.

So, if a task is running and gets preempted by a higher priority
task (or even with same priority for migrating), this patch ensures
that it is put before any existing task with the same priority in
the pushable queue.

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
 kernel/sched/rt.c | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index f4d4b07..65de40e 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -347,11 +347,15 @@ static inline void set_post_schedule(struct rq *rq)
 	rq->post_schedule = has_pushable_tasks(rq);
 }
 
-static void enqueue_pushable_task(struct rq *rq, struct task_struct *p)
+static void enqueue_pushable_task(struct rq *rq,
+				struct task_struct *p, bool head)
 {
 	plist_del(&p->pushable_tasks, &rq->rt.pushable_tasks);
 	plist_node_init(&p->pushable_tasks, p->prio);
-	plist_add(&p->pushable_tasks, &rq->rt.pushable_tasks);
+	if (head)
+		plist_add_head(&p->pushable_tasks, &rq->rt.pushable_tasks);
+	else
+		plist_add_tail(&p->pushable_tasks, &rq->rt.pushable_tasks);
 
 	/* Update the highest prio pushable task */
 	if (p->prio < rq->rt.highest_prio.next)
@@ -373,7 +377,8 @@ static void dequeue_pushable_task(struct rq *rq, struct task_struct *p)
 
 #else
 
-static inline void enqueue_pushable_task(struct rq *rq, struct task_struct *p)
+static inline void enqueue_pushable_task(struct rq *rq,
+					struct task_struct *p, bool head)
 {
 }
 
@@ -1248,7 +1253,7 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
 	enqueue_rt_entity(rt_se, flags & ENQUEUE_HEAD);
 
 	if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
-		enqueue_pushable_task(rq, p);
+		enqueue_pushable_task(rq, p, 0);
 }
 
 static void dequeue_task_rt(struct rq *rq, struct task_struct *p, int flags)
@@ -1494,8 +1499,12 @@ static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 	 * The previous task needs to be made eligible for pushing
 	 * if it is still active
 	 */
-	if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1)
-		enqueue_pushable_task(rq, p);
+	if (on_rt_rq(&p->rt) && p->nr_cpus_allowed > 1) {
+		if (task_running(rq, p) && (preempt_count() & PREEMPT_ACTIVE))
+			enqueue_pushable_task(rq, p, 1);
+		else
+			enqueue_pushable_task(rq, p, 0);
+	}
 }
 
 #ifdef CONFIG_SMP
@@ -1914,7 +1923,7 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 		rq->rt.rt_nr_migratory--;
 	} else {
 		if (!task_current(rq, p))
-			enqueue_pushable_task(rq, p);
+			enqueue_pushable_task(rq, p, 0);
 		rq->rt.rt_nr_migratory++;
 	}
 
-- 
1.9.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v4 3/3] sched/rt: Check to push the task when changing its affinity
  2015-02-16  9:32 [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio Xunlei Pang
  2015-02-16  9:32 ` [PATCH v4 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases Xunlei Pang
@ 2015-02-16  9:32 ` Xunlei Pang
  2015-02-16 15:32 ` [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio Steven Rostedt
  2 siblings, 0 replies; 5+ messages in thread
From: Xunlei Pang @ 2015-02-16  9:32 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Andrew Morton,
	Dan Streetman, Xunlei Pang

From: Xunlei Pang <pang.xunlei@linaro.org>

We may suffer from extra rt overload rq due to the affinity,
so when the affinity of any runnable rt task is changed, we
should check to trigger balancing, otherwise it will cause
some unnecessary delayed real-time response. Unfortunately,
current RT global scheduler doesn't trigger anything.

For example: a 2-cpu system with two runnable FIFO tasks(same
rt_priority) bound on CPU0, let's name them rt1(running) and
rt2(runnable) respectively; CPU1 has no RTs. Then, someone sets
the affinity of rt2 to 0x3(i.e. CPU0 and CPU1), but after this,
rt2 still can't be scheduled until rt1 enters schedule(), this
definitely causes some/big response latency for rt2.

So, when doing set_cpus_allowed_rt(), if detecting such cases,
check to trigger a push behaviour.

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
 kernel/sched/rt.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 68 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 65de40e..2637e23 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1433,10 +1433,9 @@ static struct sched_rt_entity *pick_next_rt_entity(struct rq *rq,
 	return next;
 }
 
-static struct task_struct *_pick_next_task_rt(struct rq *rq)
+static struct task_struct *peek_next_task_rt(struct rq *rq)
 {
 	struct sched_rt_entity *rt_se;
-	struct task_struct *p;
 	struct rt_rq *rt_rq  = &rq->rt;
 
 	do {
@@ -1445,7 +1444,14 @@ static struct task_struct *_pick_next_task_rt(struct rq *rq)
 		rt_rq = group_rt_rq(rt_se);
 	} while (rt_rq);
 
-	p = rt_task_of(rt_se);
+	return rt_task_of(rt_se);
+}
+
+static inline struct task_struct *_pick_next_task_rt(struct rq *rq)
+{
+	struct task_struct *p;
+
+	p = peek_next_task_rt(rq);
 	p->se.exec_start = rq_clock_task(rq);
 
 	return p;
@@ -1895,28 +1901,74 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 				const struct cpumask *new_mask)
 {
 	struct rq *rq;
-	int weight;
+	int old_weight, new_weight;
+	int preempt_push = 0, direct_push = 0;
 
 	BUG_ON(!rt_task(p));
 
 	if (!task_on_rq_queued(p))
 		return;
 
-	weight = cpumask_weight(new_mask);
+	old_weight = p->nr_cpus_allowed;
+	new_weight = cpumask_weight(new_mask);
+
+	rq = task_rq(p);
+
+	if (new_weight > 1 &&
+	    rt_task(rq->curr) &&
+	    !test_tsk_need_resched(rq->curr)) {
+		/*
+		 * We own p->pi_lock and rq->lock. rq->lock might
+		 * get released when doing direct pushing, however
+		 * p->pi_lock is always held, so it's safe to assign
+		 * the new_mask and new_weight to p below.
+		 */
+		if (!task_running(rq, p)) {
+			cpumask_copy(&p->cpus_allowed, new_mask);
+			p->nr_cpus_allowed = new_weight;
+			direct_push = 1;
+		} else if (cpumask_test_cpu(task_cpu(p), new_mask)) {
+			cpumask_copy(&p->cpus_allowed, new_mask);
+			p->nr_cpus_allowed = new_weight;
+			if (!cpupri_find(&rq->rd->cpupri, p, NULL))
+				goto update;
+
+			/*
+			 * At this point, current task gets migratable most
+			 * likely due to the change of its affinity, let's
+			 * figure out if we can migrate it.
+			 *
+			 * Is there any task with the same priority as that
+			 * of current task? If found one, we should resched.
+			 * NOTE: The target may be unpushable.
+			 */
+			if (p->prio == rq->rt.highest_prio.next) {
+				/* One target just in pushable_tasks list. */
+				requeue_task_rt(rq, p, 0);
+				preempt_push = 1;
+			} else if (rq->rt.rt_nr_total > 1) {
+				struct task_struct *next;
+
+				requeue_task_rt(rq, p, 0);
+				next = peek_next_task_rt(rq);
+				if (next != p && next->prio == p->prio)
+					preempt_push = 1;
+			}
+		}
+	}
 
+update:
 	/*
 	 * Only update if the process changes its state from whether it
 	 * can migrate or not.
 	 */
-	if ((p->nr_cpus_allowed > 1) == (weight > 1))
-		return;
-
-	rq = task_rq(p);
+	if ((old_weight > 1) == (new_weight > 1))
+		goto out;
 
 	/*
 	 * The process used to be able to migrate OR it can now migrate
 	 */
-	if (weight <= 1) {
+	if (new_weight <= 1) {
 		if (!task_current(rq, p))
 			dequeue_pushable_task(rq, p);
 		BUG_ON(!rq->rt.rt_nr_migratory);
@@ -1928,6 +1980,12 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 	}
 
 	update_rt_migration(&rq->rt);
+
+out:
+	if (direct_push)
+		push_rt_tasks(rq);
+	else if (preempt_push)
+		resched_curr(rq);
 }
 
 /* Assumes rq->lock is held */
-- 
1.9.1



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio
  2015-02-16  9:32 [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio Xunlei Pang
  2015-02-16  9:32 ` [PATCH v4 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases Xunlei Pang
  2015-02-16  9:32 ` [PATCH v4 3/3] sched/rt: Check to push the task when changing its affinity Xunlei Pang
@ 2015-02-16 15:32 ` Steven Rostedt
  2 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2015-02-16 15:32 UTC (permalink / raw)
  To: Xunlei Pang
  Cc: linux-kernel, Peter Zijlstra, Juri Lelli, Andrew Morton,
	Dan Streetman, Xunlei Pang

On Mon, 16 Feb 2015 17:32:22 +0800
Xunlei Pang <xlpang@126.com> wrote:

> From: Xunlei Pang <pang.xunlei@linaro.org>
> 
> If there're multiple nodes with the same prio as @node, currently
> plist_add() will add @node behind all of them. Now we need to add
> @node before all of these nodes for SMP RT scheduler.
> 
> This patch adds a common __plist_add() for adding @node before or
> behind existing nodes with the same prio, then adds plist_add_head()
> and plist_add_tail() inline wrapper functions for convenient uses.
> 
> Finally, define plist_add() with plist_add_tail() which has the same
> behaviour as before.
> 
> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
> ---
>  include/linux/plist.h | 30 +++++++++++++++++++++++++++++-
>  lib/plist.c           | 15 ++++++++++++---
>  2 files changed, 41 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/plist.h b/include/linux/plist.h
> index 9788360..e17bb96 100644
> --- a/include/linux/plist.h
> +++ b/include/linux/plist.h
> @@ -138,7 +138,35 @@ static inline void plist_node_init(struct plist_node *node, int prio)
>  	INIT_LIST_HEAD(&node->node_list);
>  }
>  
> -extern void plist_add(struct plist_node *node, struct plist_head *head);
> +extern void __plist_add(struct plist_node *node,
> +				struct plist_head *head, bool is_head);
> +
> +/**
> + * plist_add_head - add @node to @head, before all existing same-prio nodes
> + *
> + * @node:	&struct plist_node pointer
> + * @head:	&struct plist_head pointer
> + */
> +static inline
> +void plist_add_head(struct plist_node *node, struct plist_head *head)
> +{
> +	__plist_add(node, head, 1);

is_head is boolean, shouldn't that be "true" instead of "1".

> +}
> +
> +/**
> + * plist_add_tail - add @node to @head, after all existing same-prio nodes
> + *
> + * @node:	&struct plist_node pointer
> + * @head:	&struct plist_head pointer
> + */
> +static inline
> +void plist_add_tail(struct plist_node *node, struct plist_head *head)
> +{
> +	__plist_add(node, head, 0);

And "false" here.

> +}
> +
> +#define plist_add plist_add_tail
> +
>  extern void plist_del(struct plist_node *node, struct plist_head *head);
>  
>  extern void plist_requeue(struct plist_node *node, struct plist_head *head);
> diff --git a/lib/plist.c b/lib/plist.c
> index d408e77..0e1f1b3 100644
> --- a/lib/plist.c
> +++ b/lib/plist.c
> @@ -67,12 +67,16 @@ static void plist_check_head(struct plist_head *head)
>  #endif
>  
>  /**
> - * plist_add - add @node to @head
> + * __plist_add - add @node to @head
>   *
>   * @node:	&struct plist_node pointer
>   * @head:	&struct plist_head pointer
> + * @is_head:	bool
> + *
> + * If there're any nodes with the same prio, add @node
> + * behind or before all of them according to @is_head.
>   */
> -void plist_add(struct plist_node *node, struct plist_head *head)
> +void __plist_add(struct plist_node *node, struct plist_head *head, bool is_head)
>  {
>  	struct plist_node *first, *iter, *prev = NULL;
>  	struct list_head *node_next = &head->node_list;
> @@ -97,8 +101,13 @@ void plist_add(struct plist_node *node, struct plist_head *head)
>  				struct plist_node, prio_list);
>  	} while (iter != first);
>  
> -	if (!prev || prev->prio != node->prio)
> +	if (!prev || prev->prio != node->prio) {
>  		list_add_tail(&node->prio_list, &iter->prio_list);
> +	} else if (is_head) {
> +		list_add(&node->prio_list, &prev->prio_list);
> +		list_del_init(&prev->prio_list);
> +		node_next = &prev->node_list;

The above could use some comments. I would say the entire plist code
could use more comments, but that's out of scope with this series. But
at least lets add comments when adding new code. Something like:

	} else if (is_head) {
		/*
		 * prev has the same priority as the node that is
		 * being added. It is also the first node for this
		 * priority, but the new node needs to be added ahead of
		 * it. To accomplish this, insert node right after prev
		 * in the prio_list and then remove prev from that list.
		 * Then set node_next to prev->node_list so that the
		 * new node gets added before prev and not iter.
		 */

Something like that, because it took me some time to figure out how
plist works. It's one of those algorithms that seem to page fault out
quickly, and takes some time to page fault back into one's mind.


-- Steve


> +	}
>  ins_node:
>  	list_add_tail(&node->node_list, node_next);
>  


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v4 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases
  2015-02-16  9:32 ` [PATCH v4 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases Xunlei Pang
@ 2015-02-16 15:35   ` Steven Rostedt
  0 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2015-02-16 15:35 UTC (permalink / raw)
  To: Xunlei Pang
  Cc: linux-kernel, Peter Zijlstra, Juri Lelli, Andrew Morton,
	Dan Streetman, Xunlei Pang

On Mon, 16 Feb 2015 17:32:23 +0800
Xunlei Pang <xlpang@126.com> wrote:

> From: Xunlei Pang <pang.xunlei@linaro.org>
> 
> Currently, SMP RT scheduler has some trouble in dealing with
> equal prio cases.
> 
> For example, in check_preempt_equal_prio():
> When RT1(current task) gets preempted by RT2, if there is a
> migratable RT3 with same prio, RT3 will be pushed away instead
> of RT1 afterwards, because RT1 will be enqueued to the tail of
> the pushable list when going through succeeding put_prev_task_rt()
> triggered by resched. This broke FIFO.
> 
> Furthermore, this is also problematic for normal preempted cases
> if there're some rt tasks queued with the same prio as current,
> because current will be put behind these tasks in the pushable
> queue.
> 
> So, if a task is running and gets preempted by a higher priority
> task (or even with same priority for migrating), this patch ensures
> that it is put before any existing task with the same priority in
> the pushable queue.
> 
> Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>

I'd love to review this now, but unfortunately I need to get ready for
my trip to Linux Collab. I'll get back to this next week. Feel free to
ping me then. If I have time, I might review this while at the
conference, but don't place any bets that I will.

-- Steve

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-02-16 15:35 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-16  9:32 [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio Xunlei Pang
2015-02-16  9:32 ` [PATCH v4 2/3] sched/rt: Fix wrong SMP scheduler behavior for equal prio cases Xunlei Pang
2015-02-16 15:35   ` Steven Rostedt
2015-02-16  9:32 ` [PATCH v4 3/3] sched/rt: Check to push the task when changing its affinity Xunlei Pang
2015-02-16 15:32 ` [PATCH v4 1/3] lib/plist: Provide plist_add_head() for nodes with the same prio Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).