linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 1/2] sched/rt: Check to push task away when its affinity is changed
@ 2015-05-05 11:56 Xunlei Pang
  2015-05-05 11:56 ` [PATCH v2 2/2] sched/rt: Remove redundant conditions from task_woken_rt() Xunlei Pang
  2015-05-05 12:09 ` [PATCH v2 1/2] sched/rt: Check to push task away when its affinity is changed Peter Zijlstra
  0 siblings, 2 replies; 4+ messages in thread
From: Xunlei Pang @ 2015-05-05 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Ingo Molnar, Xunlei Pang

From: Xunlei Pang <pang.xunlei@linaro.org>

Task affinity can leave a runqueue RT-overloaded longer than
necessary, so when the affinity of any runnable RT task changes we
should check whether to trigger balancing; otherwise the task may
see an unnecessary delay in real-time response. Unfortunately, the
current RT global scheduler does nothing about this.

For example: a 2-CPU system with two runnable FIFO tasks (same
rt_priority) bound to CPU0, let's name them rt1 (running) and
rt2 (runnable) respectively; CPU1 has no RT tasks. Then someone sets
the affinity of rt2 to 0x3 (i.e. CPU0 and CPU1), but after this,
rt2 still can't be scheduled until rt1 enters schedule(), which
can cause a large response latency for rt2.
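
The mask arithmetic in this example can be sketched in userspace C.
This is only a toy model under stated assumptions: toy_cpumask_t,
toy_mask_weight(), toy_mask_has_cpu() and toy_find_free_cpu() are
hypothetical stand-ins, not kernel APIs, and a 32-bit word replaces
struct cpumask.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU (max 32). */
typedef uint32_t toy_cpumask_t;

/* Number of CPUs set in @mask, analogous to cpumask_weight(). */
int toy_mask_weight(toy_cpumask_t mask)
{
	int weight = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; mask; mask &= mask - 1)
		weight++;
	return weight;
}

/* True if @cpu is set in @mask. */
bool toy_mask_has_cpu(toy_cpumask_t mask, int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * Lowest-numbered CPU that @mask allows and that is not running an RT
 * task, or -1 if there is none. @rt_busy has one bit set per CPU that
 * currently runs an RT task.
 */
int toy_find_free_cpu(toy_cpumask_t mask, toy_cpumask_t rt_busy, int nr_cpus)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++)
		if (toy_mask_has_cpu(mask, cpu) && !toy_mask_has_cpu(rt_busy, cpu))
			return cpu;
	return -1;
}
```

With rt1 running on CPU0 (rt_busy = 0x1), rt2 has no free CPU while its
mask is 0x1, and gains CPU1 as a candidate once the mask becomes 0x3.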

This patch introduces a new sched_class::post_set_cpus_allowed()
for RT, called after set_cpus_allowed_rt(). In this new function,
if the task is runnable but not running, we try to push it away
once it becomes migratable.
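
The decision made by the new hook can be modelled in userspace as a
pure predicate. This is a sketch only: struct push_task and
push_should_try() are hypothetical names, and the real function
operates on struct task_struct and struct rq under rq->lock.

```c
#include <stdbool.h>

/* Hypothetical reduction of the task state the hook looks at. */
struct push_task {
	bool on_rq;            /* queued on a runqueue, i.e. runnable */
	bool running;          /* currently executing on its CPU */
	int  nr_cpus_allowed;  /* weight of the new affinity mask */
};

/*
 * Mirrors the conditions in post_set_cpus_allowed_rt(): try a push only
 * for a queued, non-running, migratable task, and only if the current
 * task on that runqueue is not already about to reschedule.
 */
bool push_should_try(const struct push_task *p, bool curr_need_resched)
{
	if (!p->on_rq)
		return false;

	return !p->running &&
	       p->nr_cpus_allowed > 1 &&
	       !curr_need_resched;
}
```

For rt2 in the example above (queued, not running, two allowed CPUs)
the predicate is true unless rt1 already has its resched flag set.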

The patch also solves a problem with move_queued_task() called
in set_cpus_allowed_ptr():
When a lower priority RT task is migrated because its current CPU
isn't in the new affinity mask, after move_queued_task() it will
miss the chance of being pushed away, because check_preempt_curr()
called by move_queued_task() doesn't set the "need resched flag"
for lower priority tasks.

Parts-suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
v1->v2:
Removed cpupri_find(), as it will probably be executed in push_rt_tasks().

 kernel/sched/core.c  |  3 +++
 kernel/sched/rt.c    | 15 +++++++++++++++
 kernel/sched/sched.h |  1 +
 3 files changed, 19 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d13fc13..64a1603 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4773,6 +4773,9 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
 
 	cpumask_copy(&p->cpus_allowed, new_mask);
 	p->nr_cpus_allowed = cpumask_weight(new_mask);
+
+	if (p->sched_class->post_set_cpus_allowed)
+		p->sched_class->post_set_cpus_allowed(p);
 }
 
 /*
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 8885b65..4176f33 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2280,6 +2280,20 @@ static void set_cpus_allowed_rt(struct task_struct *p,
 	update_rt_migration(&rq->rt);
 }
 
+static void post_set_cpus_allowed_rt(struct task_struct *p)
+{
+	struct rq *rq;
+
+	if (!task_on_rq_queued(p))
+		return;
+
+	rq = task_rq(p);
+	if (!task_running(rq, p) &&
+	    p->nr_cpus_allowed > 1 &&
+	    !test_tsk_need_resched(rq->curr))
+		push_rt_tasks(rq);
+}
+
 /* Assumes rq->lock is held */
 static void rq_online_rt(struct rq *rq)
 {
@@ -2494,6 +2508,7 @@ const struct sched_class rt_sched_class = {
 	.select_task_rq		= select_task_rq_rt,
 
 	.set_cpus_allowed       = set_cpus_allowed_rt,
+	.post_set_cpus_allowed  = post_set_cpus_allowed_rt,
 	.rq_online              = rq_online_rt,
 	.rq_offline             = rq_offline_rt,
 	.post_schedule		= post_schedule_rt,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e0e1299..6f90645 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1191,6 +1191,7 @@ struct sched_class {
 
 	void (*set_cpus_allowed)(struct task_struct *p,
 				 const struct cpumask *newmask);
+	void (*post_set_cpus_allowed)(struct task_struct *p);
 
 	void (*rq_online)(struct rq *rq);
 	void (*rq_offline)(struct rq *rq);
-- 
1.9.1




* [PATCH v2 2/2] sched/rt: Remove redundant conditions from task_woken_rt()
  2015-05-05 11:56 [PATCH v2 1/2] sched/rt: Check to push task away when its affinity is changed Xunlei Pang
@ 2015-05-05 11:56 ` Xunlei Pang
  2015-05-05 12:09 ` [PATCH v2 1/2] sched/rt: Check to push task away when its affinity is changed Peter Zijlstra
  1 sibling, 0 replies; 4+ messages in thread
From: Xunlei Pang @ 2015-05-05 11:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: Peter Zijlstra, Steven Rostedt, Juri Lelli, Ingo Molnar, Xunlei Pang

From: Xunlei Pang <pang.xunlei@linaro.org>

- Remove "has_pushable_tasks(rq)".
  For a queued p, "!task_running(rq, p)" and "p->nr_cpus_allowed > 1"
  already imply that "has_pushable_tasks(rq)" is true.

- Remove "!test_tsk_need_resched(rq->curr)".
  The condition mainly intends to ensure that higher priority RT tasks won't be
  pushed away. I can think of two reasons for getting rid of it.
  1) The following "rq->curr->prio <= p->prio" check still guarantees that
     purpose. "rq->curr->prio <= p->prio" implies the "need resched flag" wasn't
     set by check_preempt_curr(), except when set by check_preempt_equal_prio()
     for equal prio cases (in this case, removing the condition may result
     in an extra push_rt_tasks(), but this doesn't break the logic; in fact
     this extra push_rt_tasks() will probably return quickly).

     Additionally, there are cases where the "need resched flag" was set before
     the wakeup. The current implementation doesn't need to push lower priority
     tasks then, as the CPU will schedule anyway, while an extra push happens
     if the condition is removed. On the other hand, removing the condition gives
     a timely push for the woken tasks (better for the non-preemptible kernel).

  2) The following condition "rq->curr->nr_cpus_allowed < 2" was added by
     commit b3bc211cfe7d ("sched: Give CPU bound RT tasks preference"). But in the
     scenario described in that commit, the "need resched flag" was already set
     earlier in check_preempt_curr(), so "!test_tsk_need_resched(rq->curr)" is
     always false, which means that with the current implementation the commit
     has no effect on task_woken_rt().
  So, by removing this condition, we get the right logic.
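
The first point (the two remaining conditions imply
has_pushable_tasks()) can be illustrated with a toy model. It assumes
that a task is added to the pushable list on enqueue exactly when it is
not running and has more than one allowed CPU; the names model_rq,
model_rt_task, model_enqueue() and friends are hypothetical and only
mimic that rule.

```c
#include <stdbool.h>

/* Toy runqueue: only counts tasks on the pushable list. */
struct model_rq {
	int nr_pushable;
};

struct model_rt_task {
	bool running;
	int  nr_cpus_allowed;
};

/* Assumed enqueue rule: pushable iff not running and migratable. */
void model_enqueue(struct model_rq *rq, const struct model_rt_task *p)
{
	if (!p->running && p->nr_cpus_allowed > 1)
		rq->nr_pushable++;
}

bool model_has_pushable_tasks(const struct model_rq *rq)
{
	return rq->nr_pushable > 0;
}

/*
 * Enumerate small cases and verify that, for a queued task,
 * "!running && nr_cpus_allowed > 1" never holds while the pushable
 * list is empty. Returns true if no counterexample is found.
 */
bool model_check_implication(void)
{
	for (int running = 0; running <= 1; running++) {
		for (int nr = 1; nr <= 4; nr++) {
			struct model_rq rq = { 0 };
			struct model_rt_task p = { running != 0, nr };

			model_enqueue(&rq, &p);
			if (!p.running && p.nr_cpus_allowed > 1 &&
			    !model_has_pushable_tasks(&rq))
				return false;
		}
	}
	return true;
}
```

Under that assumed enqueue rule the extra has_pushable_tasks() test
adds no information, which is the argument made in the changelog.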

Signed-off-by: Xunlei Pang <pang.xunlei@linaro.org>
---
v1->v2:
Improved the changelog.

 kernel/sched/rt.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 4176f33..95b596b 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2232,8 +2232,6 @@ out:
 static void task_woken_rt(struct rq *rq, struct task_struct *p)
 {
 	if (!task_running(rq, p) &&
-	    !test_tsk_need_resched(rq->curr) &&
-	    has_pushable_tasks(rq) &&
 	    p->nr_cpus_allowed > 1 &&
 	    (dl_task(rq->curr) || rt_task(rq->curr)) &&
 	    (rq->curr->nr_cpus_allowed < 2 ||
-- 
1.9.1




* Re: [PATCH v2 1/2] sched/rt: Check to push task away when its affinity is changed
  2015-05-05 11:56 [PATCH v2 1/2] sched/rt: Check to push task away when its affinity is changed Xunlei Pang
  2015-05-05 11:56 ` [PATCH v2 2/2] sched/rt: Remove redundant conditions from task_woken_rt() Xunlei Pang
@ 2015-05-05 12:09 ` Peter Zijlstra
       [not found]   ` <OF0AFAEB82.B1AB4A37-ON48257E3C.0052FA35-48257E3C.00540890@zte.com.cn>
  1 sibling, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2015-05-05 12:09 UTC (permalink / raw)
  To: Xunlei Pang
  Cc: linux-kernel, Steven Rostedt, Juri Lelli, Ingo Molnar, Xunlei Pang

On Tue, May 05, 2015 at 07:56:07PM +0800, Xunlei Pang wrote:
> +++ b/kernel/sched/core.c
> @@ -4773,6 +4773,9 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>  
>  	cpumask_copy(&p->cpus_allowed, new_mask);
>  	p->nr_cpus_allowed = cpumask_weight(new_mask);
> +
> +	if (p->sched_class->post_set_cpus_allowed)
> +		p->sched_class->post_set_cpus_allowed(p);
>  }
>  
>  /*
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 8885b65..4176f33 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2280,6 +2280,20 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>  	update_rt_migration(&rq->rt);
>  }
>  
> +static void post_set_cpus_allowed_rt(struct task_struct *p)
> +{
> +	struct rq *rq;
> +
> +	if (!task_on_rq_queued(p))
> +		return;
> +
> +	rq = task_rq(p);
> +	if (!task_running(rq, p) &&
> +	    p->nr_cpus_allowed > 1 &&
> +	    !test_tsk_need_resched(rq->curr))
> +		push_rt_tasks(rq);
> +}

Guys, this is disgusting. Please don't do these minimal effort hacks.

Either fix up all the classes with a trivial set_cpus_allowed() function
and make do_set_cpus_allowed() := p->sched_class->set_cpus_allowed().

Or just do the p->{nr_,}cpus_allowed assignments in
set_cpus_allowed_rt() and keep it all in the one callback.
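
A userspace sketch of the first alternative, where every class has a
callback and do_set_cpus_allowed() only delegates, might look like the
following. All demo_* names are hypothetical, a 32-bit word stands in
for struct cpumask, and a counter stands in for push_rt_tasks(); it
shows the shape of the idea, not an actual kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_task;

struct demo_sched_class {
	void (*set_cpus_allowed)(struct demo_task *p, uint32_t new_mask);
};

struct demo_task {
	const struct demo_sched_class *sched_class;
	uint32_t cpus_allowed;
	int nr_cpus_allowed;
	bool on_rq;
	bool running;
	int push_attempts;	/* stand-in for calls to push_rt_tasks() */
};

static int demo_weight(uint32_t mask)
{
	int weight = 0;

	for (; mask; mask &= mask - 1)
		weight++;
	return weight;
}

/* Trivial callback for classes with nothing extra to do. */
void demo_set_cpus_allowed_common(struct demo_task *p, uint32_t new_mask)
{
	p->cpus_allowed = new_mask;
	p->nr_cpus_allowed = demo_weight(new_mask);
}

/* RT callback: update the mask and decide about pushing in one place. */
void demo_set_cpus_allowed_rt(struct demo_task *p, uint32_t new_mask)
{
	demo_set_cpus_allowed_common(p, new_mask);
	if (p->on_rq && !p->running && p->nr_cpus_allowed > 1)
		p->push_attempts++;
}

const struct demo_sched_class demo_fair_class = {
	.set_cpus_allowed = demo_set_cpus_allowed_common,
};

const struct demo_sched_class demo_rt_class = {
	.set_cpus_allowed = demo_set_cpus_allowed_rt,
};

/* The core function no longer touches the mask itself. */
void demo_do_set_cpus_allowed(struct demo_task *p, uint32_t new_mask)
{
	p->sched_class->set_cpus_allowed(p, new_mask);
}
```

The point of this layout is that no second "post" hook is needed: the
class callback sees the new mask applied and can act on it directly.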


* Re: [PATCH v2 1/2] sched/rt: Check to push task away when its affinity is changed
       [not found]   ` <OF0AFAEB82.B1AB4A37-ON48257E3C.0052FA35-48257E3C.00540890@zte.com.cn>
@ 2015-05-08 17:28     ` Steven Rostedt
  0 siblings, 0 replies; 4+ messages in thread
From: Steven Rostedt @ 2015-05-08 17:28 UTC (permalink / raw)
  To: pang.xunlei
  Cc: Peter Zijlstra, Juri Lelli, linux-kernel, linux-kernel-owner,
	Ingo Molnar, Xunlei Pang

On Tue, 5 May 2015 23:17:30 +0800
pang.xunlei@zte.com.cn wrote:

> > Or just do the p->{nr_,}cpus_allowed assignments in
> > set_cpus_allowed_rt() and keep it all in the one callback.
> 
> Ok, thanks.
> 
> How about this? 

This is more like what I had in mind.

> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index d13fc13..c995a02 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4768,11 +4768,15 @@ static struct rq *move_queued_task(struct task_struct *p, int new_cpu)
>  
>  void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask)
>  {
> +       bool updated = false;
> +
>         if (p->sched_class->set_cpus_allowed)
> -               p->sched_class->set_cpus_allowed(p, new_mask);
> +               updated = p->sched_class->set_cpus_allowed(p, new_mask);
>  
> -       cpumask_copy(&p->cpus_allowed, new_mask);
> -       p->nr_cpus_allowed = cpumask_weight(new_mask);
> +       if (!updated) {
> +               cpumask_copy(&p->cpus_allowed, new_mask);
> +               p->nr_cpus_allowed = cpumask_weight(new_mask);
> +       }

I'm fine with this if Peter is.

>  }
>  
>  /*
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 5e95145..3baffb2 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1574,7 +1574,7 @@ static void task_woken_dl(struct rq *rq, struct task_struct *p)
>         }
>  }
>  
> -static void set_cpus_allowed_dl(struct task_struct *p,
> +static bool set_cpus_allowed_dl(struct task_struct *p,
>                                 const struct cpumask *new_mask)
>  {
>         struct rq *rq;
> @@ -1610,7 +1610,7 @@ static void set_cpus_allowed_dl(struct task_struct *p,
>          * it is on the rq AND it is not throttled).
>          */
>         if (!on_dl_rq(&p->dl))
> -               return;
> +               return false;
>  

I would think SCHED_DEADLINE tasks would need the same "feature".

>         weight = cpumask_weight(new_mask);
>  
> @@ -1619,7 +1619,7 @@ static void set_cpus_allowed_dl(struct task_struct *p,
>          * can migrate or not.
>          */
>         if ((p->nr_cpus_allowed > 1) == (weight > 1))
> -               return;
> +               return false;
>  
>         /*
>          * The process used to be able to migrate OR it can now migrate
> @@ -1636,6 +1636,8 @@ static void set_cpus_allowed_dl(struct task_struct *p,
>         }
>  
>         update_dl_migration(&rq->dl);
> +
> +       return false;
>  }
>  
>  /* Assumes rq->lock is held */
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 8885b65..9e7a4bb 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -2241,7 +2241,7 @@ static void task_woken_rt(struct rq *rq, struct task_struct *p)
>                 push_rt_tasks(rq);
>  }
>  
> -static void set_cpus_allowed_rt(struct task_struct *p,
> +static bool set_cpus_allowed_rt(struct task_struct *p,
>                                 const struct cpumask *new_mask)
>  {
>         struct rq *rq;
> @@ -2250,18 +2250,18 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>         BUG_ON(!rt_task(p));
>  
>         if (!task_on_rq_queued(p))
> -               return;
> +               return false;
>  
>         weight = cpumask_weight(new_mask);
>  
> +       rq = task_rq(p);
> +
>         /*
>          * Only update if the process changes its state from whether it
>          * can migrate or not.

Comment needs updating.

>          */
>         if ((p->nr_cpus_allowed > 1) == (weight > 1))
> -               return;
> -
> -       rq = task_rq(p);
> +               goto check_push;
>  
>         /*
>          * The process used to be able to migrate OR it can now migrate
> @@ -2278,6 +2278,18 @@ static void set_cpus_allowed_rt(struct task_struct *p,
>         }
>  
>         update_rt_migration(&rq->rt);
> +
> +check_push:
> +       if (weight > 1 && !task_running(rq, p) &&
> +           !cpumask_subset(new_mask, &p->cpus_allowed)) {
> +               /* Update new affinity for pushing */
> +               cpumask_copy(&p->cpus_allowed, new_mask);
> +               p->nr_cpus_allowed = weight;
> +               push_rt_tasks(rq);
> +               return true;
> +       }
> +
> +       return false;
>  }
>  
>  /* Assumes rq->lock is held */
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index e0e1299..75f869b 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1189,7 +1189,8 @@ struct sched_class {
>         void (*task_waking) (struct task_struct *task);
>         void (*task_woken) (struct rq *this_rq, struct task_struct *task);
>  
> -       void (*set_cpus_allowed)(struct task_struct *p,
> +       /* If p's affinity was updated by it, return true. Otherwise false */

	/* Return true if p's affinity was updated, false otherwise */

-- Steve

> +       bool (*set_cpus_allowed)(struct task_struct *p,
>                                  const struct cpumask *newmask);
>  
>         void (*rq_online)(struct rq *rq);
> 
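
For reference, the "!cpumask_subset(new_mask, &p->cpus_allowed)" test
in the check_push block above is true only when the new mask contains
at least one CPU that the old mask did not. A minimal userspace sketch
of that bit logic, assuming a 32-bit word in place of struct cpumask
and hypothetical sketch_* names:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if every CPU in @a is also in @b, like cpumask_subset(a, b). */
bool sketch_mask_subset(uint32_t a, uint32_t b)
{
	return (a & ~b) == 0;
}

/*
 * True if @new_mask allows at least one CPU that @old_mask did not,
 * i.e. the task gained somewhere new to run and a push may now succeed.
 */
bool sketch_gains_cpu(uint32_t old_mask, uint32_t new_mask)
{
	return !sketch_mask_subset(new_mask, old_mask);
}
```

Widening 0x1 to 0x3 gains a CPU; narrowing 0x3 to 0x1 or leaving the
mask unchanged does not, so no push would be attempted in those cases.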


