linux-kernel.vger.kernel.org archive mirror
* [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead
@ 2015-04-07  3:36 Wanpeng Li
  2015-05-06  8:14 ` Juri Lelli
  0 siblings, 1 reply; 6+ messages in thread
From: Wanpeng Li @ 2015-04-07  3:36 UTC
  To: Ingo Molnar, Peter Zijlstra; +Cc: Juri Lelli, linux-kernel, Wanpeng Li

The total used dl bandwidth of each root domain is reset to 0 when the
sched domains are rebuilt after cpu hotplug, since the call path is:

_cpu_down
  cpuset_cpu_inactive() 
    cpuset_update_active_cpus()
      partition_sched_domains()
        build_sched_domains() 
          init_rootdomain() 
            init_dl_bw() 

The bandwidth that a dl task occupies is released when the task dies:
it is subtracted from the total used dl bandwidth of its root domain.
However, since that total has been reset to 0, the subtraction overflows.
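
The overflow described above can be reproduced in a small user-space
model. This is only a sketch under stated assumptions: struct
dl_bw_model and the model_*() helpers are hypothetical stand-ins for the
kernel's dl bandwidth accounting, the lock is left out, and the counter
is assumed to be an unsigned 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the per-root-domain dl bandwidth. */
struct dl_bw_model {
	uint64_t total_bw;	/* sum of the bandwidth of all admitted dl tasks */
};

/* Models the reset a freshly initialized root domain gets. */
static void model_init(struct dl_bw_model *b)
{
	b->total_bw = 0;
}

/* Account a task's bandwidth at admission time. */
static void model_add(struct dl_bw_model *b, uint64_t tsk_bw)
{
	b->total_bw += tsk_bw;
}

/* Release a task's bandwidth when the task dies. */
static void model_clear(struct dl_bw_model *b, uint64_t tsk_bw)
{
	b->total_bw -= tsk_bw;	/* wraps around if total_bw < tsk_bw */
}

/*
 * Admit one task, rebuild the root domain (counter reset to 0), then let
 * the task die. Returns the resulting total_bw.
 */
static uint64_t total_after_rebuild_and_exit(uint64_t tsk_bw)
{
	struct dl_bw_model rd;

	model_init(&rd);
	model_add(&rd, tsk_bw);
	model_init(&rd);		/* hotplug: new root domain, counter reset */
	model_clear(&rd, tsk_bw);	/* task death subtracts from 0 */
	return rd.total_bw;
}
```

In this model, total_after_rebuild_and_exit(100) returns UINT64_MAX - 99
rather than 0: the unsigned subtraction wraps around instead of going
negative.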

This patch fixes it by adding the bandwidth that a dl task occupies to
the new root domain when the task is migrated because of cpu hotplug,
and by adding the used dl bandwidth of the dl tasks to the new root
domain when the sched domains are rebuilt.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
---
 kernel/sched/core.c     |  1 +
 kernel/sched/deadline.c | 25 +++++++++++++++++++++++++
 kernel/sched/sched.h    |  1 +
 3 files changed, 27 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 28b0d75..c940999 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5586,6 +5586,7 @@ static void rq_attach_root(struct rq *rq, struct root_domain *rd)
 	rq->rd = rd;
 
 	cpumask_set_cpu(rq->cpu, rd->span);
+	attach_dl_bw(rq);
 	if (cpumask_test_cpu(rq->cpu, cpu_active_mask))
 		set_rq_online(rq);
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 5e95145..62680d7 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -224,6 +224,7 @@ static void dl_task_offline_migration(struct rq *rq, struct task_struct *p)
 {
 	struct rq *later_rq = NULL;
 	bool fallback = false;
+	struct dl_bw *dl_b;
 
 	later_rq = find_lock_later_rq(p, rq);
 
@@ -258,6 +259,11 @@ static void dl_task_offline_migration(struct rq *rq, struct task_struct *p)
 	set_task_cpu(p, later_rq->cpu);
 	activate_task(later_rq, p, ENQUEUE_REPLENISH);
 
+	dl_b = dl_bw_of(later_rq->cpu);
+	raw_spin_lock(&dl_b->lock);
+	__dl_add(dl_b, p->dl.dl_bw);
+	raw_spin_unlock(&dl_b->lock);
+
 	if (!fallback)
 		resched_curr(later_rq);
 
@@ -1776,6 +1782,25 @@ static void prio_changed_dl(struct rq *rq, struct task_struct *p,
 		switched_to_dl(rq, p);
 }
 
+void attach_dl_bw(struct rq *rq)
+{
+	struct rb_node *next_node = rq->dl.rb_leftmost;
+	struct sched_dl_entity *dl_se;
+	struct dl_bw *dl_b;
+
+	dl_b = dl_bw_of(rq->cpu);
+	raw_spin_lock(&dl_b->lock);
+next_node:
+	if (next_node) {
+		dl_se = rb_entry(next_node, struct sched_dl_entity, rb_node);
+		__dl_add(dl_b, dl_se->dl_bw);
+		next_node = rb_next(next_node);
+
+		goto next_node;
+	}
+	raw_spin_unlock(&dl_b->lock);
+}
+
 const struct sched_class dl_sched_class = {
 	.next			= &rt_sched_class,
 	.enqueue_task		= enqueue_task_dl,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e0e1299..a7b1a59 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1676,6 +1676,7 @@ extern void init_dl_rq(struct dl_rq *dl_rq);
 
 extern void cfs_bandwidth_usage_inc(void);
 extern void cfs_bandwidth_usage_dec(void);
+void attach_dl_bw(struct rq *rq);
 
 #ifdef CONFIG_NO_HZ_COMMON
 enum rq_nohz_flag_bits {
-- 
1.9.1



* Re: [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead
  2015-04-07  3:36 [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead Wanpeng Li
@ 2015-05-06  8:14 ` Juri Lelli
       [not found]   ` <CANRm+Cy43icX8DsLQJCwy+hbOuUynK5MOMfrHj5YA19LK_HmdQ@mail.gmail.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Juri Lelli @ 2015-05-06  8:14 UTC
  To: Wanpeng Li, Ingo Molnar, Peter Zijlstra; +Cc: linux-kernel

Hi Wanpeng,

I finally got to review this, sorry about the huge delay.

On 07/04/2015 04:36, Wanpeng Li wrote:
> The total used dl bandwidth of each root domain is reset to 0 when the
> sched domains are rebuilt after cpu hotplug, since the call path is:
> 
> _cpu_down
>   cpuset_cpu_inactive() 
>     cpuset_update_active_cpus()
>       partition_sched_domains()
>         build_sched_domains() 
>           init_rootdomain() 
>             init_dl_bw() 
> 
> The bandwidth that a dl task occupies is released when the task dies:
> it is subtracted from the total used dl bandwidth of its root domain.
> However, since that total has been reset to 0, the subtraction overflows.
> 

Right, that's a bug.

> This patch fixes it by adding the bandwidth that a dl task occupies to
> the new root domain when the task is migrated because of cpu hotplug,
> and by adding the used dl bandwidth of the dl tasks to the new root
> domain when the sched domains are rebuilt.
> 

But I think this fix still has a couple of problems:

 - what happens if a DL task is simply sleeping when the domains are
   reconfigured?

 - def_root_domain now has multiple accounting problems, as you also do
   this accounting when a cpu is moved there in the cpu offline path

Also, runqueue (and throttling) information is dynamic, while we are
trying to fix a static problem. It's probably not a good idea to mix
them.
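
To make the first concern concrete, here is a small user-space sketch.
It is an illustration under assumptions, not kernel code: struct
task_model and the helper names are hypothetical. It compares the
bandwidth a runqueue walk can see (queued tasks only) with the bandwidth
that was actually reserved (every admitted task, queued or sleeping).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task model: only the two fields the argument needs. */
struct task_model {
	uint64_t dl_bw;	/* bandwidth reserved for the task at admission */
	int queued;	/* 1 if on a runqueue, 0 if sleeping */
};

/* What a runqueue walk can see: queued tasks only. */
static uint64_t sum_queued_bw(const struct task_model *t, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		if (t[i].queued)
			sum += t[i].dl_bw;
	return sum;
}

/* What was reserved in total: every task, queued or not. */
static uint64_t sum_reserved_bw(const struct task_model *t, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += t[i].dl_bw;
	return sum;
}

/* Bandwidth that a runqueue-based rebuild would fail to restore. */
static uint64_t missing_bw(const struct task_model *t, size_t n)
{
	return sum_reserved_bw(t, n) - sum_queued_bw(t, n);
}
```

With three tasks of bandwidth 100 (queued), 200 (sleeping) and 50
(queued), the runqueue walk restores 150 out of 350, so the sleeping
task's 200 would still be subtracted from a too-small total when that
task dies.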

I'm not sure how (I need more time to think it through), but can
we maybe fix this using cpuset information?

Thanks,

- Juri




* Re: [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead
       [not found]   ` <CANRm+Cy43icX8DsLQJCwy+hbOuUynK5MOMfrHj5YA19LK_HmdQ@mail.gmail.com>
@ 2015-08-10 14:10     ` Juri Lelli
  2015-08-10 22:25       ` Wanpeng Li
  2015-08-30 11:25       ` Wanpeng Li
  0 siblings, 2 replies; 6+ messages in thread
From: Juri Lelli @ 2015-08-10 14:10 UTC
  To: Wanpeng Li; +Cc: Ingo Molnar, Peter Zijlstra, linux-kernel, wanpeng.li

On 06/08/15 09:39, Wanpeng Li wrote:
> Hi Juri,
>

Hi,

> Any ideas?
> 

Yes, actually. I might have a different fix, but I'd like to play with
it a bit more as it is a bit too intrusive. Let me see if I can come
up with something that I can share.

Thanks,

- Juri




* Re: [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead
  2015-08-10 14:10     ` Juri Lelli
@ 2015-08-10 22:25       ` Wanpeng Li
  2015-08-30 11:25       ` Wanpeng Li
  1 sibling, 0 replies; 6+ messages in thread
From: Wanpeng Li @ 2015-08-10 22:25 UTC
  To: Juri Lelli, Wanpeng Li; +Cc: Ingo Molnar, Peter Zijlstra, linux-kernel



On 8/10/15 10:10 PM, Juri Lelli wrote:
> On 06/08/15 09:39, Wanpeng Li wrote:
>> Hi Juri,
>>
> Hi,
>
>> Any ideas?
>>
> Yes, actually. I might have a different fix, but I'd like to play with
> it a bit more as it is a bit too intrusive. Let me see if I can come
> up with something that I can share.

Cool, looking forward to your patches. :)

Regards,
Wanpeng Li




* Re: [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead
  2015-08-10 14:10     ` Juri Lelli
  2015-08-10 22:25       ` Wanpeng Li
@ 2015-08-30 11:25       ` Wanpeng Li
  2015-09-01  9:49         ` Juri Lelli
  1 sibling, 1 reply; 6+ messages in thread
From: Wanpeng Li @ 2015-08-30 11:25 UTC
  To: Juri Lelli, Wanpeng Li, Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel

On 8/10/15 10:10 PM, Juri Lelli wrote:
> On 06/08/15 09:39, Wanpeng Li wrote:
>> Hi Juri,
>>
> Hi,
>
>> Any ideas?
>>
> Yes, actually. I might have a different fix, but I'd like to play with
> it a bit more as it is a bit too intrusive. Let me see if I can come
> up with something that I can share.

Ping Peter, Juri: any detailed ideas that would help me post another
version of my patch? ;-)

Regards,
Wanpeng Li




* Re: [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead
  2015-08-30 11:25       ` Wanpeng Li
@ 2015-09-01  9:49         ` Juri Lelli
  0 siblings, 0 replies; 6+ messages in thread
From: Juri Lelli @ 2015-09-01  9:49 UTC
  To: Wanpeng Li, Wanpeng Li, Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel

Hi,

On 30/08/15 12:25, Wanpeng Li wrote:
> On 8/10/15 10:10 PM, Juri Lelli wrote:
>> On 06/08/15 09:39, Wanpeng Li wrote:
>>> Hi Juri,
>>>
>> Hi,
>>
>>> Any ideas?
>>>
>> Yes, actually. I might have a different fix, but I'd like to play with
>> it a bit more as it is a bit too intrusive. Let me see if I can come
>> up with something that I can share.
> 
> Ping Peter, Juri: any detailed ideas that would help me post another
> version of my patch? ;-)
> 

Let me see if I'm able to post my version of the fix before the
end of this week ;).

Thanks!

- Juri




Thread overview: 6+ messages
2015-04-07  3:36 [PATCH] sched/deadline: fix dl bandwidth of root domain overflow after dl task dead Wanpeng Li
2015-05-06  8:14 ` Juri Lelli
     [not found]   ` <CANRm+Cy43icX8DsLQJCwy+hbOuUynK5MOMfrHj5YA19LK_HmdQ@mail.gmail.com>
2015-08-10 14:10     ` Juri Lelli
2015-08-10 22:25       ` Wanpeng Li
2015-08-30 11:25       ` Wanpeng Li
2015-09-01  9:49         ` Juri Lelli
