linux-kernel.vger.kernel.org archive mirror
* [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
@ 2014-11-12  1:06 Wanpeng Li
  2014-11-12 15:08 ` Juri Lelli
  2014-11-12 16:27 ` Kirill Tkhai
  0 siblings, 2 replies; 19+ messages in thread
From: Wanpeng Li @ 2014-11-12  1:06 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra
  Cc: Juri Lelli, Kirill Tkhai, linux-kernel, Wanpeng Li

I observe that a dl task can't be migrated to other cpus during cpu hotplug;
in addition, the task may or may not run again if the cpu is added back. The
root cause I found is that a dl task is throttled and removed from the
dl rq after consuming all of its budget, so the stop task can't pick it up
from the dl rq and migrate it to other cpus during hotplug.

The method to reproduce:
schedtool -E -t 50000:100000 -e ./test
Here test is just a simple for loop. Then observe which cpu the test
task is on and take that cpu offline:
echo 0 > /sys/devices/system/cpu/cpuN/online

This patch adds dl task migration during cpu hotplug: after the dl timer
fires, if the current rq is offline, find the most suitable later-deadline
rq. If no suitable later-deadline rq is found, fall back to any eligible
online cpu so that the deadline task will come back to us, and the
push/pull mechanism should then move it around properly.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
---
v4 -> v5:
 * remove raw_spin_unlock(&rq->lock)
 * cleanup codes, spotted by Peterz
 * cleanup patch description
v3 -> v4:
 * use tsk_cpus_allowed wrapper
 * fix compile error
v2 -> v3:
 * don't get_task_struct
 * if cannot preempt any rq, fallback to pick any online cpus
 * use cpu_active_mask as original later_mask if cpu is offline
v1 -> v2:
 * push the task to another cpu in dl_task_timer() if rq is offline.

 kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 41 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index f3d7776..7c31906 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
 	return hrtimer_active(&dl_se->dl_timer);
 }
 
+static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
 /*
  * This is the bandwidth enforcement timer callback. If here, we know
  * a task is not on its dl_rq, since the fact that the timer was running
@@ -538,6 +539,43 @@ again:
 	update_rq_clock(rq);
 	dl_se->dl_throttled = 0;
 	dl_se->dl_yielded = 0;
+
+	/*
+	 * So if we find that the rq the task was on is no longer
+	 * available, we need to select a new rq.
+	 */
+	if (unlikely(!rq->online)) {
+		struct rq *later_rq = NULL;
+
+		later_rq = find_lock_later_rq(p, rq);
+
+		if (!later_rq) {
+			int cpu;
+
+			/*
+			 * If cannot preempt any rq, fallback to pick any
+			 * online cpu.
+			 */
+			cpu = cpumask_any_and(cpu_active_mask,
+					tsk_cpus_allowed(p));
+			if (cpu >= nr_cpu_ids) {
+				pr_warn("fail to find any online cpu and task will never come back\n");
+				goto unlock;
+			}
+			later_rq = cpu_rq(cpu);
+		}
+
+		deactivate_task(rq, p, 0);
+		set_task_cpu(p, later_rq->cpu);
+		activate_task(later_rq, p, 0);
+
+		resched_curr(later_rq);
+
+		double_unlock_balance(rq, later_rq);
+
+		goto unlock;
+	}
+
 	if (task_on_rq_queued(p)) {
 		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
 		if (dl_task(rq->curr))
@@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
 	 * We have to consider system topology and task affinity
 	 * first, then we can look for a suitable cpu.
 	 */
-	cpumask_copy(later_mask, task_rq(task)->rd->span);
-	cpumask_and(later_mask, later_mask, cpu_active_mask);
+	cpumask_copy(later_mask, cpu_active_mask);
+	if (likely(task_rq(task)->online))
+		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
 	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
 	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
 			task, later_mask);
-- 
1.9.1



* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12  1:06 [PATCH v5] sched/deadline: support dl task migration during cpu hotplug Wanpeng Li
@ 2014-11-12 15:08 ` Juri Lelli
  2014-11-12 15:39   ` Peter Zijlstra
                     ` (2 more replies)
  2014-11-12 16:27 ` Kirill Tkhai
  1 sibling, 3 replies; 19+ messages in thread
From: Juri Lelli @ 2014-11-12 15:08 UTC (permalink / raw)
  To: Wanpeng Li, Ingo Molnar, Peter Zijlstra; +Cc: Kirill Tkhai, linux-kernel

Hi,

On 12/11/14 01:06, Wanpeng Li wrote:
> I observe that dl task can't be migrated to other cpus during cpu hotplug, 
> in addition, task may/may not be running again if cpu is added back. The 
> root cause which I found is that dl task will be throtted and removed from 
> dl rq after comsuming all budget, which leads to stop task can't pick it up 
> from dl rq and migrate to other cpus during hotplug.
> 
> The method to reproduce:
> schedtool -E -t 50000:100000 -e ./test
> Actually test is just a simple for loop. Then observe which cpu the test
> task is on.
> echo 0 > /sys/devices/system/cpu/cpuN/online
> 
> This patch adds the dl task migration during cpu hotplug by finding a most 
> suitable later deadline rq after dl timer fire if current rq is offline, 
> if fail to find a suitable later deadline rq then fallback to any eligible 
> online cpu in order that the deadline task will come back to us, and the 
> push/pull mechanism should then move it around properly.
> 
> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
> ---
> v4 -> v5:
>  * remove raw_spin_unlock(&rq->lock)
>  * cleanup codes, spotted by Peterz
>  * cleanup patch description
> v3 -> v4:
>  * use tsk_cpus_allowed wrapper
>  * fix compile error
> v2 -> v3:
>  * don't get_task_struct
>  * if cannot preempt any rq, fallback to pick any online cpus
>  * use cpu_active_mask as original later_mask if cpu is offline
> v1 -> v2:
>  * push the task to another cpu in dl_task_timer() if rq is offline.
> 
>  kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index f3d7776..7c31906 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>  	return hrtimer_active(&dl_se->dl_timer);
>  }
>  
> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>  /*
>   * This is the bandwidth enforcement timer callback. If here, we know
>   * a task is not on its dl_rq, since the fact that the timer was running
> @@ -538,6 +539,43 @@ again:
>  	update_rq_clock(rq);
>  	dl_se->dl_throttled = 0;
>  	dl_se->dl_yielded = 0;
> +
> +	/*
> +	 * So if we find that the rq the task was on is no longer
> +	 * available, we need to select a new rq.
> +	 */
> +	if (unlikely(!rq->online)) {
> +		struct rq *later_rq = NULL;
> +
> +		later_rq = find_lock_later_rq(p, rq);
> +
> +		if (!later_rq) {
> +			int cpu;
> +
> +			/*
> +			 * If cannot preempt any rq, fallback to pick any
> +			 * online cpu.
> +			 */
> +			cpu = cpumask_any_and(cpu_active_mask,
> +					tsk_cpus_allowed(p));
> +			if (cpu >= nr_cpu_ids) {
> +				pr_warn("fail to find any online cpu and task will never come back\n");
> +				goto unlock;
> +			}
> +			later_rq = cpu_rq(cpu);
> +		}
> +
> +		deactivate_task(rq, p, 0);
> +		set_task_cpu(p, later_rq->cpu);
> +		activate_task(later_rq, p, 0);
> +
> +		resched_curr(later_rq);
> +
> +		double_unlock_balance(rq, later_rq);
> +
> +		goto unlock;
> +	}
> +
>  	if (task_on_rq_queued(p)) {
>  		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>  		if (dl_task(rq->curr))
> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>  	 * We have to consider system topology and task affinity
>  	 * first, then we can look for a suitable cpu.
>  	 */
> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
> +	cpumask_copy(later_mask, cpu_active_mask);
> +	if (likely(task_rq(task)->online))
> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);

So, here you consider the span only when the task_rq is online,
but there might be other cpus still online belonging to the same
rd->span. And you have to consider them when migrating. Actually,
migration must still be restricted to the online cpus of task's
original rd->span, or I fear you can break clustered scheduling.

Thanks,

- Juri

>  	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>  	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>  			task, later_mask);
> 



* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12 15:08 ` Juri Lelli
@ 2014-11-12 15:39   ` Peter Zijlstra
  2014-11-12 23:02     ` Wanpeng Li
  2015-01-05 14:52     ` Peter Zijlstra
  2014-11-12 23:22   ` Wanpeng Li
  2014-11-18 23:18   ` Wanpeng Li
  2 siblings, 2 replies; 19+ messages in thread
From: Peter Zijlstra @ 2014-11-12 15:39 UTC (permalink / raw)
  To: Juri Lelli; +Cc: Wanpeng Li, Ingo Molnar, Kirill Tkhai, linux-kernel

On Wed, Nov 12, 2014 at 03:08:44PM +0000, Juri Lelli wrote:
> > @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
> >  	 * We have to consider system topology and task affinity
> >  	 * first, then we can look for a suitable cpu.
> >  	 */
> > -	cpumask_copy(later_mask, task_rq(task)->rd->span);
> > -	cpumask_and(later_mask, later_mask, cpu_active_mask);
> > +	cpumask_copy(later_mask, cpu_active_mask);
> > +	if (likely(task_rq(task)->online))
> > +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
> 
> So, here you consider the span only when the task_rq is online,
> but there might be others cpus still online belonging to the same
> rd->span. And you have to consider them when migrating. Actually,
> migration must still be restricted to the online cpus of task's
> original rd->span, or I fear you can break clustered scheduling.

Ah, good point that, we must somehow find the right root domain to
'restore' the task to. Now I'm not entirely sure we still have this.
Lemme ponder that.


* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12  1:06 [PATCH v5] sched/deadline: support dl task migration during cpu hotplug Wanpeng Li
  2014-11-12 15:08 ` Juri Lelli
@ 2014-11-12 16:27 ` Kirill Tkhai
  2014-11-12 22:56   ` Wanpeng Li
  1 sibling, 1 reply; 19+ messages in thread
From: Kirill Tkhai @ 2014-11-12 16:27 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, linux-kernel

On Wed, 12/11/2014 at 09:06 +0800, Wanpeng Li wrote:
> I observe that dl task can't be migrated to other cpus during cpu hotplug, 
> in addition, task may/may not be running again if cpu is added back. The 
> root cause which I found is that dl task will be throtted and removed from 
> dl rq after comsuming all budget, which leads to stop task can't pick it up 
> from dl rq and migrate to other cpus during hotplug.
> 
> The method to reproduce:
> schedtool -E -t 50000:100000 -e ./test
> Actually test is just a simple for loop. Then observe which cpu the test
> task is on.
> echo 0 > /sys/devices/system/cpu/cpuN/online
> 
> This patch adds the dl task migration during cpu hotplug by finding a most 
> suitable later deadline rq after dl timer fire if current rq is offline, 
> if fail to find a suitable later deadline rq then fallback to any eligible 
> online cpu in order that the deadline task will come back to us, and the 
> push/pull mechanism should then move it around properly.
> 
> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
> ---
> v4 -> v5:
>  * remove raw_spin_unlock(&rq->lock)
>  * cleanup codes, spotted by Peterz
>  * cleanup patch description
> v3 -> v4:
>  * use tsk_cpus_allowed wrapper
>  * fix compile error
> v2 -> v3:
>  * don't get_task_struct
>  * if cannot preempt any rq, fallback to pick any online cpus
>  * use cpu_active_mask as original later_mask if cpu is offline
> v1 -> v2:
>  * push the task to another cpu in dl_task_timer() if rq is offline.
> 
>  kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 41 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index f3d7776..7c31906 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>  	return hrtimer_active(&dl_se->dl_timer);
>  }
>  
> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>  /*
>   * This is the bandwidth enforcement timer callback. If here, we know
>   * a task is not on its dl_rq, since the fact that the timer was running
> @@ -538,6 +539,43 @@ again:
>  	update_rq_clock(rq);
>  	dl_se->dl_throttled = 0;
>  	dl_se->dl_yielded = 0;
> +
> +	/*
> +	 * So if we find that the rq the task was on is no longer
> +	 * available, we need to select a new rq.
> +	 */
> +	if (unlikely(!rq->online)) {
> +		struct rq *later_rq = NULL;
> +
> +		later_rq = find_lock_later_rq(p, rq);
> +
> +		if (!later_rq) {
> +			int cpu;
> +
> +			/*
> +			 * If cannot preempt any rq, fallback to pick any
> +			 * online cpu.
> +			 */
> +			cpu = cpumask_any_and(cpu_active_mask,
> +					tsk_cpus_allowed(p));
> +			if (cpu >= nr_cpu_ids) {
> +				pr_warn("fail to find any online cpu and task will never come back\n");
> +				goto unlock;
> +			}
> +			later_rq = cpu_rq(cpu);

later_rq is not locked here, but you activate p on it and you do unlock below.

> +		}
> +
> +		deactivate_task(rq, p, 0);
> +		set_task_cpu(p, later_rq->cpu);
> +		activate_task(later_rq, p, 0);

		^^^^^

> +
> +		resched_curr(later_rq);
> +
> +		double_unlock_balance(rq, later_rq);

		^^^^^^

> +
> +		goto unlock;
> +	}
> +
>  	if (task_on_rq_queued(p)) {
>  		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>  		if (dl_task(rq->curr))
> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>  	 * We have to consider system topology and task affinity
>  	 * first, then we can look for a suitable cpu.
>  	 */
> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
> +	cpumask_copy(later_mask, cpu_active_mask);
> +	if (likely(task_rq(task)->online))
> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>  	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>  	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>  			task, later_mask);




* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12 16:27 ` Kirill Tkhai
@ 2014-11-12 22:56   ` Wanpeng Li
  2014-11-13 10:10     ` Kirill Tkhai
  0 siblings, 1 reply; 19+ messages in thread
From: Wanpeng Li @ 2014-11-12 22:56 UTC (permalink / raw)
  To: Kirill Tkhai; +Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, linux-kernel

Hi Kirill,
On Wed, Nov 12, 2014 at 07:27:06PM +0300, Kirill Tkhai wrote:
>On Wed, 12/11/2014 at 09:06 +0800, Wanpeng Li wrote:
>> I observe that dl task can't be migrated to other cpus during cpu hotplug, 
>> in addition, task may/may not be running again if cpu is added back. The 
>> root cause which I found is that dl task will be throtted and removed from 
>> dl rq after comsuming all budget, which leads to stop task can't pick it up 
>> from dl rq and migrate to other cpus during hotplug.
>> 
>> The method to reproduce:
>> schedtool -E -t 50000:100000 -e ./test
>> Actually test is just a simple for loop. Then observe which cpu the test
>> task is on.
>> echo 0 > /sys/devices/system/cpu/cpuN/online
>> 
>> This patch adds the dl task migration during cpu hotplug by finding a most 
>> suitable later deadline rq after dl timer fire if current rq is offline, 
>> if fail to find a suitable later deadline rq then fallback to any eligible 
>> online cpu in order that the deadline task will come back to us, and the 
>> push/pull mechanism should then move it around properly.
>> 
>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>> ---
>> v4 -> v5:
>>  * remove raw_spin_unlock(&rq->lock)
>>  * cleanup codes, spotted by Peterz
>>  * cleanup patch description
>> v3 -> v4:
>>  * use tsk_cpus_allowed wrapper
>>  * fix compile error
>> v2 -> v3:
>>  * don't get_task_struct
>>  * if cannot preempt any rq, fallback to pick any online cpus
>>  * use cpu_active_mask as original later_mask if cpu is offline
>> v1 -> v2:
>>  * push the task to another cpu in dl_task_timer() if rq is offline.
>> 
>>  kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 41 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index f3d7776..7c31906 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>>  	return hrtimer_active(&dl_se->dl_timer);
>>  }
>>  
>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>>  /*
>>   * This is the bandwidth enforcement timer callback. If here, we know
>>   * a task is not on its dl_rq, since the fact that the timer was running
>> @@ -538,6 +539,43 @@ again:
>>  	update_rq_clock(rq);
>>  	dl_se->dl_throttled = 0;
>>  	dl_se->dl_yielded = 0;
>> +
>> +	/*
>> +	 * So if we find that the rq the task was on is no longer
>> +	 * available, we need to select a new rq.
>> +	 */
>> +	if (unlikely(!rq->online)) {
>> +		struct rq *later_rq = NULL;
>> +
>> +		later_rq = find_lock_later_rq(p, rq);
>> +
>> +		if (!later_rq) {
>> +			int cpu;
>> +
>> +			/*
>> +			 * If cannot preempt any rq, fallback to pick any
>> +			 * online cpu.
>> +			 */
>> +			cpu = cpumask_any_and(cpu_active_mask,
>> +					tsk_cpus_allowed(p));
>> +			if (cpu >= nr_cpu_ids) {
>> +				pr_warn("fail to find any online cpu and task will never come back\n");
>> +				goto unlock;
>> +			}
>> +			later_rq = cpu_rq(cpu);
>
>later_rq is not locked here, but you activate p on it and you do unlock below.

Great catch! How about adding double_lock_balance(rq, later_rq); here?

Regards,
Wanpeng Li 

>
>> +		}
>> +
>> +		deactivate_task(rq, p, 0);
>> +		set_task_cpu(p, later_rq->cpu);
>> +		activate_task(later_rq, p, 0);
>
>		^^^^^
>
>> +
>> +		resched_curr(later_rq);
>> +
>> +		double_unlock_balance(rq, later_rq);
>
>		^^^^^^
>
>> +
>> +		goto unlock;
>> +	}
>> +
>>  	if (task_on_rq_queued(p)) {
>>  		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>>  		if (dl_task(rq->curr))
>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>  	 * We have to consider system topology and task affinity
>>  	 * first, then we can look for a suitable cpu.
>>  	 */
>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>> +	cpumask_copy(later_mask, cpu_active_mask);
>> +	if (likely(task_rq(task)->online))
>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>>  	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>>  	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>>  			task, later_mask);
>


* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12 15:39   ` Peter Zijlstra
@ 2014-11-12 23:02     ` Wanpeng Li
  2015-01-05 14:52     ` Peter Zijlstra
  1 sibling, 0 replies; 19+ messages in thread
From: Wanpeng Li @ 2014-11-12 23:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Juri Lelli, Wanpeng Li, Ingo Molnar, Kirill Tkhai, linux-kernel

Hi Peter,
On Wed, Nov 12, 2014 at 04:39:06PM +0100, Peter Zijlstra wrote:
>On Wed, Nov 12, 2014 at 03:08:44PM +0000, Juri Lelli wrote:
>> > @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>> >  	 * We have to consider system topology and task affinity
>> >  	 * first, then we can look for a suitable cpu.
>> >  	 */
>> > -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>> > -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>> > +	cpumask_copy(later_mask, cpu_active_mask);
>> > +	if (likely(task_rq(task)->online))
>> > +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>> 
>> So, here you consider the span only when the task_rq is online,
>> but there might be others cpus still online belonging to the same
>> rd->span. And you have to consider them when migrating. Actually,
>> migration must still be restricted to the online cpus of task's
>> original rd->span, or I fear you can break clustered scheduling.
>
>Ah, good point that, we must somehow find the right root domain to
>'restore' the task to. Now I'm not entirely sure we still have this.
>Lemme ponder that.

Any ideas would be greatly appreciated. :-)

Regards,
Wanpeng Li 


* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12 15:08 ` Juri Lelli
  2014-11-12 15:39   ` Peter Zijlstra
@ 2014-11-12 23:22   ` Wanpeng Li
  2014-11-18 23:18   ` Wanpeng Li
  2 siblings, 0 replies; 19+ messages in thread
From: Wanpeng Li @ 2014-11-12 23:22 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Wanpeng Li, Ingo Molnar, Peter Zijlstra, Kirill Tkhai, linux-kernel

Hi Juri,
On Wed, Nov 12, 2014 at 03:08:44PM +0000, Juri Lelli wrote:

Btw, could you review my other three deadline patches?

https://lkml.org/lkml/2014/11/6/51
https://lkml.org/lkml/2014/11/10/817
https://lkml.org/lkml/2014/11/10/818

Many thanks for your time. ;-)

Regards,
Wanpeng Li 

>Hi,
>
>On 12/11/14 01:06, Wanpeng Li wrote:
>> I observe that dl task can't be migrated to other cpus during cpu hotplug, 
>> in addition, task may/may not be running again if cpu is added back. The 
>> root cause which I found is that dl task will be throtted and removed from 
>> dl rq after comsuming all budget, which leads to stop task can't pick it up 
>> from dl rq and migrate to other cpus during hotplug.
>> 
>> The method to reproduce:
>> schedtool -E -t 50000:100000 -e ./test
>> Actually test is just a simple for loop. Then observe which cpu the test
>> task is on.
>> echo 0 > /sys/devices/system/cpu/cpuN/online
>> 
>> This patch adds the dl task migration during cpu hotplug by finding a most 
>> suitable later deadline rq after dl timer fire if current rq is offline, 
>> if fail to find a suitable later deadline rq then fallback to any eligible 
>> online cpu in order that the deadline task will come back to us, and the 
>> push/pull mechanism should then move it around properly.
>> 
>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>> ---
>> v4 -> v5:
>>  * remove raw_spin_unlock(&rq->lock)
>>  * cleanup codes, spotted by Peterz
>>  * cleanup patch description
>> v3 -> v4:
>>  * use tsk_cpus_allowed wrapper
>>  * fix compile error
>> v2 -> v3:
>>  * don't get_task_struct
>>  * if cannot preempt any rq, fallback to pick any online cpus
>>  * use cpu_active_mask as original later_mask if cpu is offline
>> v1 -> v2:
>>  * push the task to another cpu in dl_task_timer() if rq is offline.
>> 
>>  kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 41 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index f3d7776..7c31906 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>>  	return hrtimer_active(&dl_se->dl_timer);
>>  }
>>  
>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>>  /*
>>   * This is the bandwidth enforcement timer callback. If here, we know
>>   * a task is not on its dl_rq, since the fact that the timer was running
>> @@ -538,6 +539,43 @@ again:
>>  	update_rq_clock(rq);
>>  	dl_se->dl_throttled = 0;
>>  	dl_se->dl_yielded = 0;
>> +
>> +	/*
>> +	 * So if we find that the rq the task was on is no longer
>> +	 * available, we need to select a new rq.
>> +	 */
>> +	if (unlikely(!rq->online)) {
>> +		struct rq *later_rq = NULL;
>> +
>> +		later_rq = find_lock_later_rq(p, rq);
>> +
>> +		if (!later_rq) {
>> +			int cpu;
>> +
>> +			/*
>> +			 * If cannot preempt any rq, fallback to pick any
>> +			 * online cpu.
>> +			 */
>> +			cpu = cpumask_any_and(cpu_active_mask,
>> +					tsk_cpus_allowed(p));
>> +			if (cpu >= nr_cpu_ids) {
>> +				pr_warn("fail to find any online cpu and task will never come back\n");
>> +				goto unlock;
>> +			}
>> +			later_rq = cpu_rq(cpu);
>> +		}
>> +
>> +		deactivate_task(rq, p, 0);
>> +		set_task_cpu(p, later_rq->cpu);
>> +		activate_task(later_rq, p, 0);
>> +
>> +		resched_curr(later_rq);
>> +
>> +		double_unlock_balance(rq, later_rq);
>> +
>> +		goto unlock;
>> +	}
>> +
>>  	if (task_on_rq_queued(p)) {
>>  		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>>  		if (dl_task(rq->curr))
>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>  	 * We have to consider system topology and task affinity
>>  	 * first, then we can look for a suitable cpu.
>>  	 */
>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>> +	cpumask_copy(later_mask, cpu_active_mask);
>> +	if (likely(task_rq(task)->online))
>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>
>So, here you consider the span only when the task_rq is online,
>but there might be others cpus still online belonging to the same
>rd->span. And you have to consider them when migrating. Actually,
>migration must still be restricted to the online cpus of task's
>original rd->span, or I fear you can break clustered scheduling.
>
>Thanks,
>
>- Juri
>
>>  	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>>  	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>>  			task, later_mask);
>> 


* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12 22:56   ` Wanpeng Li
@ 2014-11-13 10:10     ` Kirill Tkhai
  2014-11-13 10:19       ` Wanpeng Li
  0 siblings, 1 reply; 19+ messages in thread
From: Kirill Tkhai @ 2014-11-13 10:10 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, linux-kernel

On Thu, 13/11/2014 at 06:56 +0800, Wanpeng Li wrote:
> Hi Kirill,
> On Wed, Nov 12, 2014 at 07:27:06PM +0300, Kirill Tkhai wrote:
> >On Wed, 12/11/2014 at 09:06 +0800, Wanpeng Li wrote:
> >> I observe that dl task can't be migrated to other cpus during cpu hotplug, 
> >> in addition, task may/may not be running again if cpu is added back. The 
> >> root cause which I found is that dl task will be throtted and removed from 
> >> dl rq after comsuming all budget, which leads to stop task can't pick it up 
> >> from dl rq and migrate to other cpus during hotplug.
> >> 
> >> The method to reproduce:
> >> schedtool -E -t 50000:100000 -e ./test
> >> Actually test is just a simple for loop. Then observe which cpu the test
> >> task is on.
> >> echo 0 > /sys/devices/system/cpu/cpuN/online
> >> 
> >> This patch adds the dl task migration during cpu hotplug by finding a most 
> >> suitable later deadline rq after dl timer fire if current rq is offline, 
> >> if fail to find a suitable later deadline rq then fallback to any eligible 
> >> online cpu in order that the deadline task will come back to us, and the 
> >> push/pull mechanism should then move it around properly.
> >> 
> >> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
> >> ---
> >> v4 -> v5:
> >>  * remove raw_spin_unlock(&rq->lock)
> >>  * cleanup codes, spotted by Peterz
> >>  * cleanup patch description
> >> v3 -> v4:
> >>  * use tsk_cpus_allowed wrapper
> >>  * fix compile error
> >> v2 -> v3:
> >>  * don't get_task_struct
> >>  * if cannot preempt any rq, fallback to pick any online cpus
> >>  * use cpu_active_mask as original later_mask if cpu is offline
> >> v1 -> v2:
> >>  * push the task to another cpu in dl_task_timer() if rq is offline.
> >> 
> >>  kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> >>  1 file changed, 41 insertions(+), 2 deletions(-)
> >> 
> >> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> >> index f3d7776..7c31906 100644
> >> --- a/kernel/sched/deadline.c
> >> +++ b/kernel/sched/deadline.c
> >> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
> >>  	return hrtimer_active(&dl_se->dl_timer);
> >>  }
> >>  
> >> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
> >>  /*
> >>   * This is the bandwidth enforcement timer callback. If here, we know
> >>   * a task is not on its dl_rq, since the fact that the timer was running
> >> @@ -538,6 +539,43 @@ again:
> >>  	update_rq_clock(rq);
> >>  	dl_se->dl_throttled = 0;
> >>  	dl_se->dl_yielded = 0;
> >> +
> >> +	/*
> >> +	 * So if we find that the rq the task was on is no longer
> >> +	 * available, we need to select a new rq.
> >> +	 */
> >> +	if (unlikely(!rq->online)) {
> >> +		struct rq *later_rq = NULL;
> >> +
> >> +		later_rq = find_lock_later_rq(p, rq);
> >> +
> >> +		if (!later_rq) {
> >> +			int cpu;
> >> +
> >> +			/*
> >> +			 * If cannot preempt any rq, fallback to pick any
> >> +			 * online cpu.
> >> +			 */
> >> +			cpu = cpumask_any_and(cpu_active_mask,
> >> +					tsk_cpus_allowed(p));
> >> +			if (cpu >= nr_cpu_ids) {
> >> +				pr_warn("fail to find any online cpu and task will never come back\n");
> >> +				goto unlock;
> >> +			}
> >> +			later_rq = cpu_rq(cpu);
> >
> >later_rq is not locked here, but you activate p on it and you do unlock below.
> 
> Great catch! How about add double_lock_balance(rq, later_rq); here?

This sounds good.

> 
> Regards,
> Wanpeng Li 
> 
> >
> >> +		}
> >> +
> >> +		deactivate_task(rq, p, 0);
> >> +		set_task_cpu(p, later_rq->cpu);
> >> +		activate_task(later_rq, p, 0);
> >
> >		^^^^^
> >
> >> +
> >> +		resched_curr(later_rq);
> >> +
> >> +		double_unlock_balance(rq, later_rq);
> >
> >		^^^^^^
> >
> >> +
> >> +		goto unlock;
> >> +	}
> >> +
> >>  	if (task_on_rq_queued(p)) {
> >>  		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
> >>  		if (dl_task(rq->curr))
> >> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
> >>  	 * We have to consider system topology and task affinity
> >>  	 * first, then we can look for a suitable cpu.
> >>  	 */
> >> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
> >> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
> >> +	cpumask_copy(later_mask, cpu_active_mask);
> >> +	if (likely(task_rq(task)->online))
> >> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
> >>  	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
> >>  	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
> >>  			task, later_mask);
> >

Also, we should think about the following situation.

A DL task is left on a dead rq. In your scheme it will be moved by the timer.
But what happens if somebody changes the class of the task before the timer
fires? In that case the task still remains on the dead rq.

We should handle this situation in some way.

Kirill




* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-13 10:10     ` Kirill Tkhai
@ 2014-11-13 10:19       ` Wanpeng Li
  2014-11-13 10:21         ` Kirill Tkhai
  0 siblings, 1 reply; 19+ messages in thread
From: Wanpeng Li @ 2014-11-13 10:19 UTC (permalink / raw)
  To: Kirill Tkhai, Wanpeng Li
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, linux-kernel

Hi Kirill,
On 11/13/14, 6:10 PM, Kirill Tkhai wrote:
> On Thu, 13/11/2014 at 06:56 +0800, Wanpeng Li wrote:
>> Hi Kirill,
>> On Wed, Nov 12, 2014 at 07:27:06PM +0300, Kirill Tkhai wrote:
>>> On Wed, 12/11/2014 at 09:06 +0800, Wanpeng Li wrote:
>>>> I observe that dl task can't be migrated to other cpus during cpu hotplug,
>>>> in addition, task may/may not be running again if cpu is added back. The
>>>> root cause which I found is that dl task will be throttled and removed from
>>>> dl rq after consuming all budget, which leads to stop task can't pick it up
>>>> from dl rq and migrate to other cpus during hotplug.
>>>>
>>>> The method to reproduce:
>>>> schedtool -E -t 50000:100000 -e ./test
>>>> Actually test is just a simple for loop. Then observe which cpu the test
>>>> task is on.
>>>> echo 0 > /sys/devices/system/cpu/cpuN/online
>>>>
>>>> This patch adds the dl task migration during cpu hotplug by finding a most
>>>> suitable later deadline rq after dl timer fire if current rq is offline,
>>>> if fail to find a suitable later deadline rq then fallback to any eligible
>>>> online cpu in order that the deadline task will come back to us, and the
>>>> push/pull mechanism should then move it around properly.
>>>>
>>>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>>>> ---
>>>> v4 -> v5:
>>>>   * remove raw_spin_unlock(&rq->lock)
>>>>   * cleanup codes, spotted by Peterz
>>>>   * cleanup patch description
>>>> v3 -> v4:
>>>>   * use tsk_cpus_allowed wrapper
>>>>   * fix compile error
>>>> v2 -> v3:
>>>>   * don't get_task_struct
>>>>   * if cannot preempt any rq, fallback to pick any online cpus
>>>>   * use cpu_active_mask as original later_mask if cpu is offline
>>>> v1 -> v2:
>>>>   * push the task to another cpu in dl_task_timer() if rq is offline.
>>>>
>>>>   kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>>>   1 file changed, 41 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>>>> index f3d7776..7c31906 100644
>>>> --- a/kernel/sched/deadline.c
>>>> +++ b/kernel/sched/deadline.c
>>>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>>>>   	return hrtimer_active(&dl_se->dl_timer);
>>>>   }
>>>>   
>>>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>>>>   /*
>>>>    * This is the bandwidth enforcement timer callback. If here, we know
>>>>    * a task is not on its dl_rq, since the fact that the timer was running
>>>> @@ -538,6 +539,43 @@ again:
>>>>   	update_rq_clock(rq);
>>>>   	dl_se->dl_throttled = 0;
>>>>   	dl_se->dl_yielded = 0;
>>>> +
>>>> +	/*
>>>> +	 * So if we find that the rq the task was on is no longer
>>>> +	 * available, we need to select a new rq.
>>>> +	 */
>>>> +	if (unlikely(!rq->online)) {
>>>> +		struct rq *later_rq = NULL;
>>>> +
>>>> +		later_rq = find_lock_later_rq(p, rq);
>>>> +
>>>> +		if (!later_rq) {
>>>> +			int cpu;
>>>> +
>>>> +			/*
>>>> +			 * If cannot preempt any rq, fallback to pick any
>>>> +			 * online cpu.
>>>> +			 */
>>>> +			cpu = cpumask_any_and(cpu_active_mask,
>>>> +					tsk_cpus_allowed(p));
>>>> +			if (cpu >= nr_cpu_ids) {
>>>> +				pr_warn("fail to find any online cpu and task will never come back\n");
>>>> +				goto unlock;
>>>> +			}
>>>> +			later_rq = cpu_rq(cpu);
>>> later_rq is not locked here, but you activate p on it and you do unlock below.
>> Great catch! How about add double_lock_balance(rq, later_rq); here?
> This sounds good.

I will do this in the next version. ;-)

>
>> Regards,
>> Wanpeng Li
>>
>>>> +		}
>>>> +
>>>> +		deactivate_task(rq, p, 0);
>>>> +		set_task_cpu(p, later_rq->cpu);
>>>> +		activate_task(later_rq, p, 0);
>>> 		^^^^^
>>>
>>>> +
>>>> +		resched_curr(later_rq);
>>>> +
>>>> +		double_unlock_balance(rq, later_rq);
>>> 		^^^^^^
>>>
>>>> +
>>>> +		goto unlock;
>>>> +	}
>>>> +
>>>>   	if (task_on_rq_queued(p)) {
>>>>   		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>>>>   		if (dl_task(rq->curr))
>>>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>>>   	 * We have to consider system topology and task affinity
>>>>   	 * first, then we can look for a suitable cpu.
>>>>   	 */
>>>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>>>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>>>> +	cpumask_copy(later_mask, cpu_active_mask);
>>>> +	if (likely(task_rq(task)->online))
>>>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>>>>   	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>>>>   	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>>>>   			task, later_mask);
> Also, we should think about the following situation.
>
> DL task is left on dead rq. In your scheme it will be moved by the timer.
> But what will be if somebody changes the class of the task (before timer)?

I think the timer will be cancelled in switched_from_dl().

Regards,
Wanpeng Li

> In this case the task still remains on dead rq.
>
> We should handle this situation in some way.
>
> Kirill
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-13 10:19       ` Wanpeng Li
@ 2014-11-13 10:21         ` Kirill Tkhai
  2014-11-16 22:59           ` Wanpeng Li
  2014-11-20  8:49           ` Wanpeng Li
  0 siblings, 2 replies; 19+ messages in thread
From: Kirill Tkhai @ 2014-11-13 10:21 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Wanpeng Li, Ingo Molnar, Peter Zijlstra, Juri Lelli, linux-kernel

On Thu, 13/11/2014 at 18:19 +0800, Wanpeng Li wrote:
> Hi Kirill,
> On 11/13/14, 6:10 PM, Kirill Tkhai wrote:
> > On Thu, 13/11/2014 at 06:56 +0800, Wanpeng Li wrote:
> >> Hi Kirill,
> >> On Wed, Nov 12, 2014 at 07:27:06PM +0300, Kirill Tkhai wrote:
> >>> On Wed, 12/11/2014 at 09:06 +0800, Wanpeng Li wrote:
> >>>> I observe that dl task can't be migrated to other cpus during cpu hotplug,
> >>>> in addition, task may/may not be running again if cpu is added back. The
> >>>> root cause which I found is that dl task will be throttled and removed from
> >>>> dl rq after consuming all budget, which leads to stop task can't pick it up
> >>>> from dl rq and migrate to other cpus during hotplug.
> >>>>
> >>>> The method to reproduce:
> >>>> schedtool -E -t 50000:100000 -e ./test
> >>>> Actually test is just a simple for loop. Then observe which cpu the test
> >>>> task is on.
> >>>> echo 0 > /sys/devices/system/cpu/cpuN/online
> >>>>
> >>>> This patch adds the dl task migration during cpu hotplug by finding a most
> >>>> suitable later deadline rq after dl timer fire if current rq is offline,
> >>>> if fail to find a suitable later deadline rq then fallback to any eligible
> >>>> online cpu in order that the deadline task will come back to us, and the
> >>>> push/pull mechanism should then move it around properly.
> >>>>
> >>>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
> >>>> ---
> >>>> v4 -> v5:
> >>>>   * remove raw_spin_unlock(&rq->lock)
> >>>>   * cleanup codes, spotted by Peterz
> >>>>   * cleanup patch description
> >>>> v3 -> v4:
> >>>>   * use tsk_cpus_allowed wrapper
> >>>>   * fix compile error
> >>>> v2 -> v3:
> >>>>   * don't get_task_struct
> >>>>   * if cannot preempt any rq, fallback to pick any online cpus
> >>>>   * use cpu_active_mask as original later_mask if cpu is offline
> >>>> v1 -> v2:
> >>>>   * push the task to another cpu in dl_task_timer() if rq is offline.
> >>>>
> >>>>   kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
> >>>>   1 file changed, 41 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> >>>> index f3d7776..7c31906 100644
> >>>> --- a/kernel/sched/deadline.c
> >>>> +++ b/kernel/sched/deadline.c
> >>>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
> >>>>   	return hrtimer_active(&dl_se->dl_timer);
> >>>>   }
> >>>>   
> >>>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
> >>>>   /*
> >>>>    * This is the bandwidth enforcement timer callback. If here, we know
> >>>>    * a task is not on its dl_rq, since the fact that the timer was running
> >>>> @@ -538,6 +539,43 @@ again:
> >>>>   	update_rq_clock(rq);
> >>>>   	dl_se->dl_throttled = 0;
> >>>>   	dl_se->dl_yielded = 0;
> >>>> +
> >>>> +	/*
> >>>> +	 * So if we find that the rq the task was on is no longer
> >>>> +	 * available, we need to select a new rq.
> >>>> +	 */
> >>>> +	if (unlikely(!rq->online)) {
> >>>> +		struct rq *later_rq = NULL;
> >>>> +
> >>>> +		later_rq = find_lock_later_rq(p, rq);
> >>>> +
> >>>> +		if (!later_rq) {
> >>>> +			int cpu;
> >>>> +
> >>>> +			/*
> >>>> +			 * If cannot preempt any rq, fallback to pick any
> >>>> +			 * online cpu.
> >>>> +			 */
> >>>> +			cpu = cpumask_any_and(cpu_active_mask,
> >>>> +					tsk_cpus_allowed(p));
> >>>> +			if (cpu >= nr_cpu_ids) {
> >>>> +				pr_warn("fail to find any online cpu and task will never come back\n");
> >>>> +				goto unlock;
> >>>> +			}
> >>>> +			later_rq = cpu_rq(cpu);
> >>> later_rq is not locked here, but you activate p on it and you do unlock below.
> >> Great catch! How about add double_lock_balance(rq, later_rq); here?
> > This sounds good.
> 
> I will do this in next version. ;-)
> 
> >
> >> Regards,
> >> Wanpeng Li
> >>
> >>>> +		}
> >>>> +
> >>>> +		deactivate_task(rq, p, 0);
> >>>> +		set_task_cpu(p, later_rq->cpu);
> >>>> +		activate_task(later_rq, p, 0);
> >>> 		^^^^^
> >>>
> >>>> +
> >>>> +		resched_curr(later_rq);
> >>>> +
> >>>> +		double_unlock_balance(rq, later_rq);
> >>> 		^^^^^^
> >>>
> >>>> +
> >>>> +		goto unlock;
> >>>> +	}
> >>>> +
> >>>>   	if (task_on_rq_queued(p)) {
> >>>>   		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
> >>>>   		if (dl_task(rq->curr))
> >>>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
> >>>>   	 * We have to consider system topology and task affinity
> >>>>   	 * first, then we can look for a suitable cpu.
> >>>>   	 */
> >>>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
> >>>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
> >>>> +	cpumask_copy(later_mask, cpu_active_mask);
> >>>> +	if (likely(task_rq(task)->online))
> >>>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
> >>>>   	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
> >>>>   	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
> >>>>   			task, later_mask);
> > Also, we should think about the following situation.
> >
> > DL task is left on dead rq. In your scheme it will be moved by the timer.
> > But what will be if somebody changes the class of the task (before timer)?
> 
> I think timer will be cancelled in switched_from_dl().

Yeah, but nobody will move this task to an alive rq.

> 
> Regards,
> Wanpeng Li
> 
> > In this case the task still remains on dead rq.
> >
> > We should handle this situation in some way.
> >
> > Kirill
> >
> >
> 



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-13 10:21         ` Kirill Tkhai
@ 2014-11-16 22:59           ` Wanpeng Li
  2014-11-20  8:49           ` Wanpeng Li
  1 sibling, 0 replies; 19+ messages in thread
From: Wanpeng Li @ 2014-11-16 22:59 UTC (permalink / raw)
  To: Kirill Tkhai
  Cc: Wanpeng Li, Ingo Molnar, Peter Zijlstra, Juri Lelli, linux-kernel

Hi Kirill,
On Thu, Nov 13, 2014 at 01:21:31PM +0300, Kirill Tkhai wrote:
>> >>>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>> >>>>   	 * We have to consider system topology and task affinity
>> >>>>   	 * first, then we can look for a suitable cpu.
>> >>>>   	 */
>> >>>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>> >>>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>> >>>> +	cpumask_copy(later_mask, cpu_active_mask);
>> >>>> +	if (likely(task_rq(task)->online))
>> >>>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>> >>>>   	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>> >>>>   	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>> >>>>   			task, later_mask);
>> > Also, we should think about the following situation.
>> >
>> > DL task is left on dead rq. In your scheme it will be moved by the timer.
>> > But what will be if somebody changes the class of the task (before timer)?
>> 
>> I think timer will be cancelled in switched_from_dl().
>
>Yeah, but nobody will move this task to alive rq.
>
>> 
>> Regards,
>> Wanpeng Li
>> 
>> > In this case the task still remains on dead rq.
>> >
>> > We should handle this situation in some way.

Your proposal is greatly appreciated.

Regards,
Wanpeng Li 

>> >
>> > Kirill
>> >
>> >
>> 
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12 15:08 ` Juri Lelli
  2014-11-12 15:39   ` Peter Zijlstra
  2014-11-12 23:22   ` Wanpeng Li
@ 2014-11-18 23:18   ` Wanpeng Li
  2014-11-19 10:13     ` Juri Lelli
  2 siblings, 1 reply; 19+ messages in thread
From: Wanpeng Li @ 2014-11-18 23:18 UTC (permalink / raw)
  To: Juri Lelli, Wanpeng Li, Ingo Molnar, Peter Zijlstra
  Cc: Kirill Tkhai, linux-kernel

Hi Juri,
On 11/12/14, 11:08 PM, Juri Lelli wrote:
> Hi,
>
> On 12/11/14 01:06, Wanpeng Li wrote:
>> I observe that dl task can't be migrated to other cpus during cpu hotplug,
>> in addition, task may/may not be running again if cpu is added back. The
>> root cause which I found is that dl task will be throttled and removed from
>> dl rq after consuming all budget, which leads to stop task can't pick it up
>> from dl rq and migrate to other cpus during hotplug.
>>
>> The method to reproduce:
>> schedtool -E -t 50000:100000 -e ./test
>> Actually test is just a simple for loop. Then observe which cpu the test
>> task is on.
>> echo 0 > /sys/devices/system/cpu/cpuN/online
>>
>> This patch adds the dl task migration during cpu hotplug by finding a most
>> suitable later deadline rq after dl timer fire if current rq is offline,
>> if fail to find a suitable later deadline rq then fallback to any eligible
>> online cpu in order that the deadline task will come back to us, and the
>> push/pull mechanism should then move it around properly.
>>
>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>> ---
>> v4 -> v5:
>>   * remove raw_spin_unlock(&rq->lock)
>>   * cleanup codes, spotted by Peterz
>>   * cleanup patch description
>> v3 -> v4:
>>   * use tsk_cpus_allowed wrapper
>>   * fix compile error
>> v2 -> v3:
>>   * don't get_task_struct
>>   * if cannot preempt any rq, fallback to pick any online cpus
>>   * use cpu_active_mask as original later_mask if cpu is offline
>> v1 -> v2:
>>   * push the task to another cpu in dl_task_timer() if rq is offline.
>>
>>   kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 41 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>> index f3d7776..7c31906 100644
>> --- a/kernel/sched/deadline.c
>> +++ b/kernel/sched/deadline.c
>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>>   	return hrtimer_active(&dl_se->dl_timer);
>>   }
>>   
>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>>   /*
>>    * This is the bandwidth enforcement timer callback. If here, we know
>>    * a task is not on its dl_rq, since the fact that the timer was running
>> @@ -538,6 +539,43 @@ again:
>>   	update_rq_clock(rq);
>>   	dl_se->dl_throttled = 0;
>>   	dl_se->dl_yielded = 0;
>> +
>> +	/*
>> +	 * So if we find that the rq the task was on is no longer
>> +	 * available, we need to select a new rq.
>> +	 */
>> +	if (unlikely(!rq->online)) {
>> +		struct rq *later_rq = NULL;
>> +
>> +		later_rq = find_lock_later_rq(p, rq);
>> +
>> +		if (!later_rq) {
>> +			int cpu;
>> +
>> +			/*
>> +			 * If cannot preempt any rq, fallback to pick any
>> +			 * online cpu.
>> +			 */
>> +			cpu = cpumask_any_and(cpu_active_mask,
>> +					tsk_cpus_allowed(p));
>> +			if (cpu >= nr_cpu_ids) {
>> +				pr_warn("fail to find any online cpu and task will never come back\n");
>> +				goto unlock;
>> +			}
>> +			later_rq = cpu_rq(cpu);
>> +		}
>> +
>> +		deactivate_task(rq, p, 0);
>> +		set_task_cpu(p, later_rq->cpu);
>> +		activate_task(later_rq, p, 0);
>> +
>> +		resched_curr(later_rq);
>> +
>> +		double_unlock_balance(rq, later_rq);
>> +
>> +		goto unlock;
>> +	}
>> +
>>   	if (task_on_rq_queued(p)) {
>>   		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>>   		if (dl_task(rq->curr))
>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>   	 * We have to consider system topology and task affinity
>>   	 * first, then we can look for a suitable cpu.
>>   	 */
>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>> +	cpumask_copy(later_mask, cpu_active_mask);
>> +	if (likely(task_rq(task)->online))
>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
> So, here you consider the span only when the task_rq is online,
> but there might be others cpus still online belonging to the same
> rd->span. And you have to consider them when migrating. Actually,
> migration must still be restricted to the online cpus of task's
> original rd->span, or I fear you can break clustered scheduling.

Sorry, what's clustered scheduling?

Regards,
Wanpeng Li

>
> Thanks,
>
> - Juri
>
>>   	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>>   	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>>   			task, later_mask);
>>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-18 23:18   ` Wanpeng Li
@ 2014-11-19 10:13     ` Juri Lelli
  2014-11-19 12:30       ` Wanpeng Li
  0 siblings, 1 reply; 19+ messages in thread
From: Juri Lelli @ 2014-11-19 10:13 UTC (permalink / raw)
  To: Wanpeng Li, Wanpeng Li, Ingo Molnar, Peter Zijlstra
  Cc: Kirill Tkhai, linux-kernel

Hi,

On 18/11/14 23:18, Wanpeng Li wrote:
> Hi Juri,
> On 11/12/14, 11:08 PM, Juri Lelli wrote:
>> Hi,
>>
>> On 12/11/14 01:06, Wanpeng Li wrote:
>>> I observe that dl task can't be migrated to other cpus during cpu hotplug,
>>> in addition, task may/may not be running again if cpu is added back. The
>>> root cause which I found is that dl task will be throttled and removed from
>>> dl rq after consuming all budget, which leads to stop task can't pick it up
>>> from dl rq and migrate to other cpus during hotplug.
>>>
>>> The method to reproduce:
>>> schedtool -E -t 50000:100000 -e ./test
>>> Actually test is just a simple for loop. Then observe which cpu the test
>>> task is on.
>>> echo 0 > /sys/devices/system/cpu/cpuN/online
>>>
>>> This patch adds the dl task migration during cpu hotplug by finding a most
>>> suitable later deadline rq after dl timer fire if current rq is offline,
>>> if fail to find a suitable later deadline rq then fallback to any eligible
>>> online cpu in order that the deadline task will come back to us, and the
>>> push/pull mechanism should then move it around properly.
>>>
>>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>>> ---
>>> v4 -> v5:
>>>   * remove raw_spin_unlock(&rq->lock)
>>>   * cleanup codes, spotted by Peterz
>>>   * cleanup patch description
>>> v3 -> v4:
>>>   * use tsk_cpus_allowed wrapper
>>>   * fix compile error
>>> v2 -> v3:
>>>   * don't get_task_struct
>>>   * if cannot preempt any rq, fallback to pick any online cpus
>>>   * use cpu_active_mask as original later_mask if cpu is offline
>>> v1 -> v2:
>>>   * push the task to another cpu in dl_task_timer() if rq is offline.
>>>
>>>   kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>>   1 file changed, 41 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>>> index f3d7776..7c31906 100644
>>> --- a/kernel/sched/deadline.c
>>> +++ b/kernel/sched/deadline.c
>>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>>>   	return hrtimer_active(&dl_se->dl_timer);
>>>   }
>>>   
>>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>>>   /*
>>>    * This is the bandwidth enforcement timer callback. If here, we know
>>>    * a task is not on its dl_rq, since the fact that the timer was running
>>> @@ -538,6 +539,43 @@ again:
>>>   	update_rq_clock(rq);
>>>   	dl_se->dl_throttled = 0;
>>>   	dl_se->dl_yielded = 0;
>>> +
>>> +	/*
>>> +	 * So if we find that the rq the task was on is no longer
>>> +	 * available, we need to select a new rq.
>>> +	 */
>>> +	if (unlikely(!rq->online)) {
>>> +		struct rq *later_rq = NULL;
>>> +
>>> +		later_rq = find_lock_later_rq(p, rq);
>>> +
>>> +		if (!later_rq) {
>>> +			int cpu;
>>> +
>>> +			/*
>>> +			 * If cannot preempt any rq, fallback to pick any
>>> +			 * online cpu.
>>> +			 */
>>> +			cpu = cpumask_any_and(cpu_active_mask,
>>> +					tsk_cpus_allowed(p));
>>> +			if (cpu >= nr_cpu_ids) {
>>> +				pr_warn("fail to find any online cpu and task will never come back\n");
>>> +				goto unlock;
>>> +			}
>>> +			later_rq = cpu_rq(cpu);
>>> +		}
>>> +
>>> +		deactivate_task(rq, p, 0);
>>> +		set_task_cpu(p, later_rq->cpu);
>>> +		activate_task(later_rq, p, 0);
>>> +
>>> +		resched_curr(later_rq);
>>> +
>>> +		double_unlock_balance(rq, later_rq);
>>> +
>>> +		goto unlock;
>>> +	}
>>> +
>>>   	if (task_on_rq_queued(p)) {
>>>   		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>>>   		if (dl_task(rq->curr))
>>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>>   	 * We have to consider system topology and task affinity
>>>   	 * first, then we can look for a suitable cpu.
>>>   	 */
>>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>>> +	cpumask_copy(later_mask, cpu_active_mask);
>>> +	if (likely(task_rq(task)->online))
>>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>> So, here you consider the span only when the task_rq is online,
>> but there might be others cpus still online belonging to the same
>> rd->span. And you have to consider them when migrating. Actually,
>> migration must still be restricted to the online cpus of task's
>> original rd->span, or I fear you can break clustered scheduling.
> 
> Sorry, what's clustered scheduling?
> 

It's a scheduling configuration in which you restrict tasks to run in
disjoint subsets of system CPUs. Translated to what we have, it's what
you get when you create exclusive cpusets (each one gets a rd) and
associate tasks to them.

My concern with the above is that you may end up breaking this setup
if you don't consider the sd->span when one of the CPUs of your
cpuset is off. But Pang Xunlei's patches may solve this; I still have to
check :/.
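
To make the concern above concrete, here is a small user-space sketch that models cpumasks as plain bit sets. It is only an illustration: toy_mask, later_mask_v5 and later_mask_span are hypothetical names, and the four-CPU layout (two exclusive cpusets, {0,1} and {2,3}) and the all-CPUs affinity mask are assumptions chosen so that the effect of skipping the span becomes visible.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t toy_mask;	/* bit n set means CPU n is in the set */

/*
 * Mask as built by the v5 hunk in find_later_rq(): start from the active
 * CPUs and apply the root domain span only if the task's rq is online.
 */
static toy_mask later_mask_v5(toy_mask active, toy_mask rd_span,
			      toy_mask allowed, int rq_online)
{
	toy_mask m = active;

	if (rq_online)
		m &= rd_span;
	return m & allowed;
}

/*
 * Mask that always stays inside the root domain span, i.e. the
 * restriction requested in the review (hypothetical helper).
 */
static toy_mask later_mask_span(toy_mask active, toy_mask rd_span,
				toy_mask allowed)
{
	return active & rd_span & allowed;
}
```

With CPU 0 offline (active = 0xE), a span of 0x3 and an affinity of 0xF, later_mask_v5(0xE, 0x3, 0xF, 0) yields 0xE, which includes CPUs 2 and 3 from the other cpuset, while later_mask_span(0xE, 0x3, 0xF) yields 0x2, the one CPU of the original span that is still online.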

Thanks,

- Juri

> Regards,
> Wanpeng Li
> 
>>
>> Thanks,
>>
>> - Juri
>>
>>>   	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>>>   	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>>>   			task, later_mask);
>>>
> 
> 


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-19 10:13     ` Juri Lelli
@ 2014-11-19 12:30       ` Wanpeng Li
  2014-11-19 13:49         ` Juri Lelli
  0 siblings, 1 reply; 19+ messages in thread
From: Wanpeng Li @ 2014-11-19 12:30 UTC (permalink / raw)
  To: Juri Lelli, Wanpeng Li, Ingo Molnar, Peter Zijlstra
  Cc: Kirill Tkhai, linux-kernel

Hi Juri,
On 11/19/14, 6:13 PM, Juri Lelli wrote:
> Hi,
>
> On 18/11/14 23:18, Wanpeng Li wrote:
>> Hi Juri,
>> On 11/12/14, 11:08 PM, Juri Lelli wrote:
>>> Hi,
>>>
>>> On 12/11/14 01:06, Wanpeng Li wrote:
>>>> I observe that dl task can't be migrated to other cpus during cpu hotplug,
>>>> in addition, task may/may not be running again if cpu is added back. The
>>>> root cause which I found is that dl task will be throttled and removed from
>>>> dl rq after consuming all budget, which leads to stop task can't pick it up
>>>> from dl rq and migrate to other cpus during hotplug.
>>>>
>>>> The method to reproduce:
>>>> schedtool -E -t 50000:100000 -e ./test
>>>> Actually test is just a simple for loop. Then observe which cpu the test
>>>> task is on.
>>>> echo 0 > /sys/devices/system/cpu/cpuN/online
>>>>
>>>> This patch adds the dl task migration during cpu hotplug by finding a most
>>>> suitable later deadline rq after dl timer fire if current rq is offline,
>>>> if fail to find a suitable later deadline rq then fallback to any eligible
>>>> online cpu in order that the deadline task will come back to us, and the
>>>> push/pull mechanism should then move it around properly.
>>>>
>>>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>>>> ---
>>>> v4 -> v5:
>>>>    * remove raw_spin_unlock(&rq->lock)
>>>>    * cleanup codes, spotted by Peterz
>>>>    * cleanup patch description
>>>> v3 -> v4:
>>>>    * use tsk_cpus_allowed wrapper
>>>>    * fix compile error
>>>> v2 -> v3:
>>>>    * don't get_task_struct
>>>>    * if cannot preempt any rq, fallback to pick any online cpus
>>>>    * use cpu_active_mask as original later_mask if cpu is offline
>>>> v1 -> v2:
>>>>    * push the task to another cpu in dl_task_timer() if rq is offline.
>>>>
>>>>    kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>>>    1 file changed, 41 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>>>> index f3d7776..7c31906 100644
>>>> --- a/kernel/sched/deadline.c
>>>> +++ b/kernel/sched/deadline.c
>>>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>>>>    	return hrtimer_active(&dl_se->dl_timer);
>>>>    }
>>>>    
>>>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>>>>    /*
>>>>     * This is the bandwidth enforcement timer callback. If here, we know
>>>>     * a task is not on its dl_rq, since the fact that the timer was running
>>>> @@ -538,6 +539,43 @@ again:
>>>>    	update_rq_clock(rq);
>>>>    	dl_se->dl_throttled = 0;
>>>>    	dl_se->dl_yielded = 0;
>>>> +
>>>> +	/*
>>>> +	 * So if we find that the rq the task was on is no longer
>>>> +	 * available, we need to select a new rq.
>>>> +	 */
>>>> +	if (unlikely(!rq->online)) {
>>>> +		struct rq *later_rq = NULL;
>>>> +
>>>> +		later_rq = find_lock_later_rq(p, rq);
>>>> +
>>>> +		if (!later_rq) {
>>>> +			int cpu;
>>>> +
>>>> +			/*
>>>> +			 * If cannot preempt any rq, fallback to pick any
>>>> +			 * online cpu.
>>>> +			 */
>>>> +			cpu = cpumask_any_and(cpu_active_mask,
>>>> +					tsk_cpus_allowed(p));
>>>> +			if (cpu >= nr_cpu_ids) {
>>>> +				pr_warn("fail to find any online cpu and task will never come back\n");
>>>> +				goto unlock;
>>>> +			}
>>>> +			later_rq = cpu_rq(cpu);
>>>> +		}
>>>> +
>>>> +		deactivate_task(rq, p, 0);
>>>> +		set_task_cpu(p, later_rq->cpu);
>>>> +		activate_task(later_rq, p, 0);
>>>> +
>>>> +		resched_curr(later_rq);
>>>> +
>>>> +		double_unlock_balance(rq, later_rq);
>>>> +
>>>> +		goto unlock;
>>>> +	}
>>>> +
>>>>    	if (task_on_rq_queued(p)) {
>>>>    		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>>>>    		if (dl_task(rq->curr))
>>>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>>>    	 * We have to consider system topology and task affinity
>>>>    	 * first, then we can look for a suitable cpu.
>>>>    	 */
>>>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>>>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>>>> +	cpumask_copy(later_mask, cpu_active_mask);
>>>> +	if (likely(task_rq(task)->online))
>>>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>>> So, here you consider the span only when the task_rq is online,
>>> but there might be others cpus still online belonging to the same
>>> rd->span. And you have to consider them when migrating. Actually,
>>> migration must still be restricted to the online cpus of task's
>>> original rd->span, or I fear you can break clustered scheduling.
>> Sorry, what's clustered scheduling?
>>
> It's a scheduling configuration in which you restrict tasks to run in
> disjoint subsets of system CPUs. Translated to what we have, it's what
> you get when you create exclusive cpusets (each one gets a rd) and
> associate tasks to them.
>
> My concern in what above is that you may end up breaking this setup
> if you don't consider the sd->span when one of the CPUs of your
> cpuset is off. But, Pang Xunlei patches may solve this, I still have to
> check :/.

Thanks for your explanation. Could you point out which of Pang's
patches solves this? ;-)

Regards,
Wanpeng Li

>
> Thanks,
>
> - Juri
>
>> Regards,
>> Wanpeng Li
>>
>>> Thanks,
>>>
>>> - Juri
>>>
>>>>    	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>>>>    	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>>>>    			task, later_mask);
>>>>
>>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-19 12:30       ` Wanpeng Li
@ 2014-11-19 13:49         ` Juri Lelli
  2014-11-19 23:08           ` Wanpeng Li
  0 siblings, 1 reply; 19+ messages in thread
From: Juri Lelli @ 2014-11-19 13:49 UTC (permalink / raw)
  To: Wanpeng Li, Wanpeng Li, Ingo Molnar, Peter Zijlstra
  Cc: Kirill Tkhai, linux-kernel

Hi,

On 19/11/14 12:30, Wanpeng Li wrote:
> Hi Juri,
> On 11/19/14, 6:13 PM, Juri Lelli wrote:
>> Hi,
>>
>> On 18/11/14 23:18, Wanpeng Li wrote:
>>> Hi Juri,
>>> On 11/12/14, 11:08 PM, Juri Lelli wrote:
>>>> Hi,
>>>>
>>>> On 12/11/14 01:06, Wanpeng Li wrote:
>>>>> I observe that dl task can't be migrated to other cpus during cpu hotplug,
>>>>> in addition, task may/may not be running again if cpu is added back. The
>>>>> root cause which I found is that dl task will be throttled and removed from
>>>>> dl rq after consuming all budget, which leads to stop task can't pick it up
>>>>> from dl rq and migrate to other cpus during hotplug.
>>>>>
>>>>> The method to reproduce:
>>>>> schedtool -E -t 50000:100000 -e ./test
>>>>> Actually test is just a simple for loop. Then observe which cpu the test
>>>>> task is on.
>>>>> echo 0 > /sys/devices/system/cpu/cpuN/online
>>>>>
>>>>> This patch adds the dl task migration during cpu hotplug by finding a most
>>>>> suitable later deadline rq after dl timer fire if current rq is offline,
>>>>> if fail to find a suitable later deadline rq then fallback to any eligible
>>>>> online cpu in order that the deadline task will come back to us, and the
>>>>> push/pull mechanism should then move it around properly.
>>>>>
>>>>> Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
>>>>> ---
>>>>> v4 -> v5:
>>>>>    * remove raw_spin_unlock(&rq->lock)
>>>>>    * cleanup codes, spotted by Peterz
>>>>>    * cleanup patch description
>>>>> v3 -> v4:
>>>>>    * use tsk_cpus_allowed wrapper
>>>>>    * fix compile error
>>>>> v2 -> v3:
>>>>>    * don't get_task_struct
>>>>>    * if cannot preempt any rq, fallback to pick any online cpus
>>>>>    * use cpu_active_mask as original later_mask if cpu is offline
>>>>> v1 -> v2:
>>>>>    * push the task to another cpu in dl_task_timer() if rq is offline.
>>>>>
>>>>>    kernel/sched/deadline.c | 43 +++++++++++++++++++++++++++++++++++++++++--
>>>>>    1 file changed, 41 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>>>>> index f3d7776..7c31906 100644
>>>>> --- a/kernel/sched/deadline.c
>>>>> +++ b/kernel/sched/deadline.c
>>>>> @@ -487,6 +487,7 @@ static int start_dl_timer(struct sched_dl_entity *dl_se, bool boosted)
>>>>>    	return hrtimer_active(&dl_se->dl_timer);
>>>>>    }
>>>>>    
>>>>> +static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq);
>>>>>    /*
>>>>>     * This is the bandwidth enforcement timer callback. If here, we know
>>>>>     * a task is not on its dl_rq, since the fact that the timer was running
>>>>> @@ -538,6 +539,43 @@ again:
>>>>>    	update_rq_clock(rq);
>>>>>    	dl_se->dl_throttled = 0;
>>>>>    	dl_se->dl_yielded = 0;
>>>>> +
>>>>> +	/*
>>>>> +	 * So if we find that the rq the task was on is no longer
>>>>> +	 * available, we need to select a new rq.
>>>>> +	 */
>>>>> +	if (unlikely(!rq->online)) {
>>>>> +		struct rq *later_rq = NULL;
>>>>> +
>>>>> +		later_rq = find_lock_later_rq(p, rq);
>>>>> +
>>>>> +		if (!later_rq) {
>>>>> +			int cpu;
>>>>> +
>>>>> +			/*
>>>>> +			 * If cannot preempt any rq, fallback to pick any
>>>>> +			 * online cpu.
>>>>> +			 */
>>>>> +			cpu = cpumask_any_and(cpu_active_mask,
>>>>> +					tsk_cpus_allowed(p));
>>>>> +			if (cpu >= nr_cpu_ids) {
>>>>> +				pr_warn("fail to find any online cpu and task will never come back\n");
>>>>> +				goto unlock;
>>>>> +			}
>>>>> +			later_rq = cpu_rq(cpu);
>>>>> +		}
>>>>> +
>>>>> +		deactivate_task(rq, p, 0);
>>>>> +		set_task_cpu(p, later_rq->cpu);
>>>>> +		activate_task(later_rq, p, 0);
>>>>> +
>>>>> +		resched_curr(later_rq);
>>>>> +
>>>>> +		double_unlock_balance(rq, later_rq);
>>>>> +
>>>>> +		goto unlock;
>>>>> +	}
>>>>> +
>>>>>    	if (task_on_rq_queued(p)) {
>>>>>    		enqueue_task_dl(rq, p, ENQUEUE_REPLENISH);
>>>>>    		if (dl_task(rq->curr))
>>>>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>>>>    	 * We have to consider system topology and task affinity
>>>>>    	 * first, then we can look for a suitable cpu.
>>>>>    	 */
>>>>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>>>>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>>>>> +	cpumask_copy(later_mask, cpu_active_mask);
>>>>> +	if (likely(task_rq(task)->online))
>>>>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>>>> So, here you consider the span only when the task_rq is online,
>>>> but there might be others cpus still online belonging to the same
>>>> rd->span. And you have to consider them when migrating. Actually,
>>>> migration must still be restricted to the online cpus of task's
>>>> original rd->span, or I fear you can break clustered scheduling.
>>> Sorry, what's clustered scheduling?
>>>
>> It's a scheduling configuration in which you restrict tasks to run in
>> disjoint subsets of system CPUs. Translated to what we have, it's what
>> you get when you create exclusive cpusets (each one gets a rd) and
>> associate tasks to them.
>>
>> My concern in what above is that you may end up breaking this setup
>> if you don't consider the sd->span when one of the CPUs of your
>> cpuset is off. But, Pang Xunlei patches may solve this, I still have to
>> check :/.
> 
> Thanks for your explanation. Could you point out which one of Pang's 
> patchset solve this? ;-)
> 

https://lkml.org/lkml/2014/11/17/443 may help with this, although I
still have to properly look at it.

Best,

- Juri

> Regards,
> Wanpeng Li
> 
>>
>> Thanks,
>>
>> - Juri
>>
>>> Regards,
>>> Wanpeng Li
>>>
>>>> Thanks,
>>>>
>>>> - Juri
>>>>
>>>>>    	cpumask_and(later_mask, later_mask, &task->cpus_allowed);
>>>>>    	best_cpu = cpudl_find(&task_rq(task)->rd->cpudl,
>>>>>    			task, later_mask);
>>>>>
>>>
> 
> 



* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-19 13:49         ` Juri Lelli
@ 2014-11-19 23:08           ` Wanpeng Li
  0 siblings, 0 replies; 19+ messages in thread
From: Wanpeng Li @ 2014-11-19 23:08 UTC (permalink / raw)
  To: Juri Lelli
  Cc: Ingo Molnar, Peter Zijlstra, Kirill Tkhai, linux-kernel, Wanpeng Li

Hi Juri,
On Wed, Nov 19, 2014 at 01:49:20PM +0000, Juri Lelli wrote:
[...]
>>>>>> @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>>>>>>    	 * We have to consider system topology and task affinity
>>>>>>    	 * first, then we can look for a suitable cpu.
>>>>>>    	 */
>>>>>> -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>>>>>> -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>>>>>> +	cpumask_copy(later_mask, cpu_active_mask);
>>>>>> +	if (likely(task_rq(task)->online))
>>>>>> +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>>>>> So, here you consider the span only when the task_rq is online,
>>>>> but there might be others cpus still online belonging to the same
>>>>> rd->span. And you have to consider them when migrating. Actually,
>>>>> migration must still be restricted to the online cpus of task's
>>>>> original rd->span, or I fear you can break clustered scheduling.
>>>> Sorry, what's clustered scheduling?
>>>>
>>> It's a scheduling configuration in which you restrict tasks to run in
>>> disjoint subsets of system CPUs. Translated to what we have, it's what
>>> you get when you create exclusive cpusets (each one gets a rd) and
>>> associate tasks to them.
>>>
>>> My concern in what above is that you may end up breaking this setup
>>> if you don't consider the sd->span when one of the CPUs of your
>>> cpuset is off. But, Pang Xunlei patches may solve this, I still have to
>>> check :/.
>> 
>> Thanks for your explanation. Could you point out which one of Pang's 
>> patchset solve this? ;-)
>> 
>
>https://lkml.org/lkml/2014/11/17/443 may help with this, although I
>still have to properly look at it.
>

Thanks for pointing that out. Btw, could you review my deadline patches
below? Many thanks for your time. ;-)

https://lkml.org/lkml/2014/11/18/1006
https://lkml.org/lkml/2014/11/19/182
https://lkml.org/lkml/2014/11/19/187

Regards,
Wanpeng Li 




* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-13 10:21         ` Kirill Tkhai
  2014-11-16 22:59           ` Wanpeng Li
@ 2014-11-20  8:49           ` Wanpeng Li
  1 sibling, 0 replies; 19+ messages in thread
From: Wanpeng Li @ 2014-11-20  8:49 UTC (permalink / raw)
  To: Peter Zijlstra, Juri Lelli
  Cc: Kirill Tkhai, Ingo Molnar, linux-kernel, Wanpeng Li

Hi Peterz, Juri,
On Thu, Nov 13, 2014 at 01:21:31PM +0300, Kirill Tkhai wrote:
>> > Also, we should think about the following situation.
>> >
>> > DL task is left on dead rq. In your scheme it will be moved by the timer.
>> > But what will be if somebody changes the class of the task (before timer)?
>> 
>> I think timer will be cancelled in switched_from_dl().
>
>Yeah, but nobody will move this task to alive rq.
>
>> > In this case the task still remains on dead rq.
>> >
>> > We should handle this situation in some way.

I think Pang's patch handles the exclusive cpusets rd->span issue. The concern
Kirill raised above may be the last one blocking the patch from being merged.
Do you experts have any ideas or proposals?

Regards,
Wanpeng Li 

>> >
>> > Kirill
>> >
>> >
>> 
>


* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2014-11-12 15:39   ` Peter Zijlstra
  2014-11-12 23:02     ` Wanpeng Li
@ 2015-01-05 14:52     ` Peter Zijlstra
  2015-01-06  2:14       ` Wanpeng Li
  1 sibling, 1 reply; 19+ messages in thread
From: Peter Zijlstra @ 2015-01-05 14:52 UTC (permalink / raw)
  To: Juri Lelli; +Cc: Wanpeng Li, Ingo Molnar, Kirill Tkhai, linux-kernel

On Wed, Nov 12, 2014 at 04:39:06PM +0100, Peter Zijlstra wrote:
> On Wed, Nov 12, 2014 at 03:08:44PM +0000, Juri Lelli wrote:
> > > @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
> > >  	 * We have to consider system topology and task affinity
> > >  	 * first, then we can look for a suitable cpu.
> > >  	 */
> > > -	cpumask_copy(later_mask, task_rq(task)->rd->span);
> > > -	cpumask_and(later_mask, later_mask, cpu_active_mask);
> > > +	cpumask_copy(later_mask, cpu_active_mask);
> > > +	if (likely(task_rq(task)->online))
> > > +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
> > 
> > So, here you consider the span only when the task_rq is online,
> > but there might be others cpus still online belonging to the same
> > rd->span. And you have to consider them when migrating. Actually,
> > migration must still be restricted to the online cpus of task's
> > original rd->span, or I fear you can break clustered scheduling.
> 
> Ah, good point that, we must somehow find the right root domain to
> 'restore' the task to. Now I'm not entirely sure we still have this.
> Lemme ponder that.

Ah, we should be able to find this by looking at the cpuset cgroup
information. The cpuset cgroup knows the available cpumask of this task,
which we can translate to the correct root domain in two separate ways
(either run up the cpuset cgroup hierarchy and find the highest domain
with balancing enabled, or look at whatever the rq->rd is for any one of
the allowed CPUs of the immediate cgroup this task belongs to).


* Re: [PATCH v5] sched/deadline: support dl task migration during cpu hotplug
  2015-01-05 14:52     ` Peter Zijlstra
@ 2015-01-06  2:14       ` Wanpeng Li
  0 siblings, 0 replies; 19+ messages in thread
From: Wanpeng Li @ 2015-01-06  2:14 UTC (permalink / raw)
  To: Peter Zijlstra, Juri Lelli
  Cc: Wanpeng Li, Ingo Molnar, Kirill Tkhai, linux-kernel

Hi Peter,
On Mon, Jan 05, 2015 at 03:52:10PM +0100, Peter Zijlstra wrote:
>On Wed, Nov 12, 2014 at 04:39:06PM +0100, Peter Zijlstra wrote:
>> On Wed, Nov 12, 2014 at 03:08:44PM +0000, Juri Lelli wrote:
>> > > @@ -1185,8 +1223,9 @@ static int find_later_rq(struct task_struct *task)
>> > >  	 * We have to consider system topology and task affinity
>> > >  	 * first, then we can look for a suitable cpu.
>> > >  	 */
>> > > -	cpumask_copy(later_mask, task_rq(task)->rd->span);
>> > > -	cpumask_and(later_mask, later_mask, cpu_active_mask);
>> > > +	cpumask_copy(later_mask, cpu_active_mask);
>> > > +	if (likely(task_rq(task)->online))
>> > > +		cpumask_and(later_mask, later_mask, task_rq(task)->rd->span);
>> > 
>> > So, here you consider the span only when the task_rq is online,
>> > but there might be others cpus still online belonging to the same
>> > rd->span. And you have to consider them when migrating. Actually,
>> > migration must still be restricted to the online cpus of task's
>> > original rd->span, or I fear you can break clustered scheduling.
>> 
>> Ah, good point that, we must somehow find the right root domain to
>> 'restore' the task to. Now I'm not entirely sure we still have this.
>> Lemme ponder that.
>
>Ah, we should be able to find this by looking at the cpuset cgroup
>information. The cpuset cgroup knows the available cpumask of this task,
>which we can translate to the correct root domain in two separate ways
>(either run up the cpuset cgroup hierarchy and find the highest domain
>with balancing enabled, or look at whatever the rq->rd is for any one of
>the allowed CPUs of the immediate cgroup this task belongs to).

Could the patch Juri pointed out help to sidestep the issue?

https://lkml.org/lkml/2014/11/19/293

Regards,
Wanpeng Li 


end of thread, other threads:[~2015-01-06  2:34 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-12  1:06 [PATCH v5] sched/deadline: support dl task migration during cpu hotplug Wanpeng Li
2014-11-12 15:08 ` Juri Lelli
2014-11-12 15:39   ` Peter Zijlstra
2014-11-12 23:02     ` Wanpeng Li
2015-01-05 14:52     ` Peter Zijlstra
2015-01-06  2:14       ` Wanpeng Li
2014-11-12 23:22   ` Wanpeng Li
2014-11-18 23:18   ` Wanpeng Li
2014-11-19 10:13     ` Juri Lelli
2014-11-19 12:30       ` Wanpeng Li
2014-11-19 13:49         ` Juri Lelli
2014-11-19 23:08           ` Wanpeng Li
2014-11-12 16:27 ` Kirill Tkhai
2014-11-12 22:56   ` Wanpeng Li
2014-11-13 10:10     ` Kirill Tkhai
2014-11-13 10:19       ` Wanpeng Li
2014-11-13 10:21         ` Kirill Tkhai
2014-11-16 22:59           ` Wanpeng Li
2014-11-20  8:49           ` Wanpeng Li
