* [PATCH] sched: keep quiescent cpu out of idle balance loop
@ 2014-02-19  5:20 Lei Wen
  2014-02-19  9:04 ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Lei Wen @ 2014-02-19  5:20 UTC (permalink / raw)
  To: peterz, mingo, preeti.lkml, daniel.lezcano, viresh.kumar, xjian,
	leiwen, linux-kernel

A cpu which is put into quiescent mode removes itself from the
kernel's sched_domain hierarchy. We can therefore search the
sched_domains to check whether a cpu does not want to be disturbed,
since idle load balancing would otherwise send an IPI to it.

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 235cfa7..14230ae 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6783,6 +6783,8 @@ out_unlock:
  * - When one of the busy CPUs notice that there may be an idle rebalancing
  *   needed, they will kick the idle load balancer, which then does idle
  *   load balancing for all the idle CPUs.
+ * - cpus outside the calling cpu's sched_domain are excluded, so that
+ *   isolated cpus can be kept in their quiescent mode.
  */
 static struct {
 	cpumask_var_t idle_cpus_mask;
@@ -6792,10 +6794,16 @@ static struct {
 
 static inline int find_new_ilb(void)
 {
-	int ilb = cpumask_first(nohz.idle_cpus_mask);
+	int ilb;
+	int cpu = smp_processor_id();
+	struct sched_domain *tmp;
 
-	if (ilb < nr_cpu_ids && idle_cpu(ilb))
-		return ilb;
+	for_each_domain(cpu, tmp) {
+		ilb = cpumask_first_and(nohz.idle_cpus_mask,
+				sched_domain_span(tmp));
+		if (ilb < nr_cpu_ids && idle_cpu(ilb))
+			return ilb;
+	}
 
 	return nr_cpu_ids;
 }
-- 
1.8.3.2



* Re: [PATCH] sched: keep quiescent cpu out of idle balance loop
  2014-02-19  5:20 [PATCH] sched: keep quiescent cpu out of idle balance loop Lei Wen
@ 2014-02-19  9:04 ` Peter Zijlstra
  2014-02-20  2:42   ` Lei Wen
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2014-02-19  9:04 UTC (permalink / raw)
  To: Lei Wen
  Cc: mingo, preeti.lkml, daniel.lezcano, viresh.kumar, xjian, linux-kernel

On Wed, Feb 19, 2014 at 01:20:30PM +0800, Lei Wen wrote:
> A cpu which is put into quiescent mode removes itself from the
> kernel's sched_domain hierarchy. We can therefore search the
> sched_domains to check whether a cpu does not want to be disturbed,
> since idle load balancing would otherwise send an IPI to it.
> 
> Signed-off-by: Lei Wen <leiwen@marvell.com>
> ---
>  kernel/sched/fair.c | 14 +++++++++++---
>  1 file changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 235cfa7..14230ae 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6783,6 +6783,8 @@ out_unlock:
>   * - When one of the busy CPUs notice that there may be an idle rebalancing
>   *   needed, they will kick the idle load balancer, which then does idle
>   *   load balancing for all the idle CPUs.
> + * - cpus outside the calling cpu's sched_domain are excluded, so that
> + *   isolated cpus can be kept in their quiescent mode.
>   */
>  static struct {
>  	cpumask_var_t idle_cpus_mask;
> @@ -6792,10 +6794,16 @@ static struct {
>  
>  static inline int find_new_ilb(void)
>  {
> -	int ilb = cpumask_first(nohz.idle_cpus_mask);
> +	int ilb;
> +	int cpu = smp_processor_id();
> +	struct sched_domain *tmp;
>  
> -	if (ilb < nr_cpu_ids && idle_cpu(ilb))
> -		return ilb;
> +	for_each_domain(cpu, tmp) {
> +		ilb = cpumask_first_and(nohz.idle_cpus_mask,
> +				sched_domain_span(tmp));
> +		if (ilb < nr_cpu_ids && idle_cpu(ilb))
> +			return ilb;
> +	}

The ILB code is bad; but you just made it horrible. Don't add pointless
for_each_domain() iterations.

I'm thinking something like:

  ilb = cpumask_first_and(nohz.idle_cpus_mask, this_rq()->rd.span);

Should work just fine, no?

Better still would be to maybe not participate in the ILB in the first
place and leave this selection loop alone.
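
For illustration, a minimal sketch of that simplified selection (a
sketch only, not committed code; note that rq->rd is a pointer, so the
member access would be rd->span, a detail that comes up again later in
this thread):

static inline int find_new_ilb(void)
{
	/* only consider idle cpus inside this cpu's root domain */
	int ilb = cpumask_first_and(nohz.idle_cpus_mask,
				    this_rq()->rd->span);

	if (ilb < nr_cpu_ids && idle_cpu(ilb))
		return ilb;

	return nr_cpu_ids;
}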


* Re: [PATCH] sched: keep quiescent cpu out of idle balance loop
  2014-02-19  9:04 ` Peter Zijlstra
@ 2014-02-20  2:42   ` Lei Wen
  2014-02-20  8:50     ` Peter Zijlstra
  0 siblings, 1 reply; 15+ messages in thread
From: Lei Wen @ 2014-02-20  2:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Lei Wen, mingo, preeti.lkml, daniel.lezcano, viresh.kumar, xjian,
	linux-kernel

On Wed, Feb 19, 2014 at 5:04 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Wed, Feb 19, 2014 at 01:20:30PM +0800, Lei Wen wrote:
>> A cpu which is put into quiescent mode removes itself from the
>> kernel's sched_domain hierarchy. We can therefore search the
>> sched_domains to check whether a cpu does not want to be disturbed,
>> since idle load balancing would otherwise send an IPI to it.
>>
>> Signed-off-by: Lei Wen <leiwen@marvell.com>
>> ---
>>  kernel/sched/fair.c | 14 +++++++++++---
>>  1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 235cfa7..14230ae 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -6783,6 +6783,8 @@ out_unlock:
>>   * - When one of the busy CPUs notice that there may be an idle rebalancing
>>   *   needed, they will kick the idle load balancer, which then does idle
>>   *   load balancing for all the idle CPUs.
>> + * - cpus outside the calling cpu's sched_domain are excluded, so that
>> + *   isolated cpus can be kept in their quiescent mode.
>>   */
>>  static struct {
>>       cpumask_var_t idle_cpus_mask;
>> @@ -6792,10 +6794,16 @@ static struct {
>>
>>  static inline int find_new_ilb(void)
>>  {
>> -     int ilb = cpumask_first(nohz.idle_cpus_mask);
>> +     int ilb;
>> +     int cpu = smp_processor_id();
>> +     struct sched_domain *tmp;
>>
>> -     if (ilb < nr_cpu_ids && idle_cpu(ilb))
>> -             return ilb;
>> +     for_each_domain(cpu, tmp) {
>> +             ilb = cpumask_first_and(nohz.idle_cpus_mask,
>> +                             sched_domain_span(tmp));
>> +             if (ilb < nr_cpu_ids && idle_cpu(ilb))
>> +                     return ilb;
>> +     }
>
> The ILB code is bad; but you just made it horrible. Don't add pointless
> for_each_domain() iterations.
>
> I'm thinking something like:
>
>   ilb = cpumask_first_and(nohz.idle_cpus_mask, this_rq()->rd.span);
>
> Should work just fine, no?

Yes, it gives the same result as my previous patch.

>
> Better still would be to maybe not participate in the ILB in the first
> place and leave this selection loop alone.

I don't quite get your point here...
Do you mean that the idle cpu selection should be done earlier than
the current find_new_ilb()?

Thanks,
Lei


* Re: [PATCH] sched: keep quiescent cpu out of idle balance loop
  2014-02-20  2:42   ` Lei Wen
@ 2014-02-20  8:50     ` Peter Zijlstra
  2014-02-20  9:15       ` Lei Wen
  2014-02-20  9:17       ` Lei Wen
  0 siblings, 2 replies; 15+ messages in thread
From: Peter Zijlstra @ 2014-02-20  8:50 UTC (permalink / raw)
  To: Lei Wen
  Cc: Lei Wen, mingo, preeti.lkml, daniel.lezcano, viresh.kumar, xjian,
	linux-kernel

On Thu, Feb 20, 2014 at 10:42:51AM +0800, Lei Wen wrote:
> >> -     int ilb = cpumask_first(nohz.idle_cpus_mask);
> >> +     int ilb;
> >> +     int cpu = smp_processor_id();
> >> +     struct sched_domain *tmp;
> >>
> >> -     if (ilb < nr_cpu_ids && idle_cpu(ilb))
> >> -             return ilb;
> >> +     for_each_domain(cpu, tmp) {
> >> +             ilb = cpumask_first_and(nohz.idle_cpus_mask,
> >> +                             sched_domain_span(tmp));
> >> +             if (ilb < nr_cpu_ids && idle_cpu(ilb))
> >> +                     return ilb;
> >> +     }
> >
> > The ILB code is bad; but you just made it horrible. Don't add pointless
> > for_each_domain() iterations.
> >
> > I'm thinking something like:
> >
> >   ilb = cpumask_first_and(nohz.idle_cpus_mask, this_rq()->rd.span);
> >
> > Should work just fine, no?
> 
> Yes, it gives the same result as my previous patch.
> 
> >
> > Better still would be to maybe not participate in the ILB in the first
> > place and leave this selection loop alone.
> 
> I don't quite get your point here...
> Do you mean that the idle cpu selection should be done earlier than
> the current find_new_ilb()?

I meant that if you stop an idle CPU setting its bit in
nohz.idle_cpus_mask, you don't have to mask it out either.
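
In other words, a minimal sketch of that idea (illustrative only; how a
quiescent cpu is actually detected is what the rest of the thread
settles, and cpu_is_quiescent() below is a hypothetical placeholder):

void nohz_balance_enter_idle(int cpu)
{
	if (!cpu_active(cpu))
		return;

	/* Sketch: never set the bit for a quiescent cpu, so it never
	 * has to be masked back out in find_new_ilb(). */
	if (cpu_is_quiescent(cpu))	/* hypothetical helper */
		return;

	if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
		return;

	/* ... existing code sets the cpu in nohz.idle_cpus_mask ... */
}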


* Re: [PATCH] sched: keep quiescent cpu out of idle balance loop
  2014-02-20  8:50     ` Peter Zijlstra
@ 2014-02-20  9:15       ` Lei Wen
  2014-02-20  9:17       ` Lei Wen
  1 sibling, 0 replies; 15+ messages in thread
From: Lei Wen @ 2014-02-20  9:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Lei Wen, mingo, preeti.lkml, daniel.lezcano, viresh.kumar, xjian,
	linux-kernel

On Thu, Feb 20, 2014 at 4:50 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Feb 20, 2014 at 10:42:51AM +0800, Lei Wen wrote:
>> >> -     int ilb = cpumask_first(nohz.idle_cpus_mask);
>> >> +     int ilb;
>> >> +     int cpu = smp_processor_id();
>> >> +     struct sched_domain *tmp;
>> >>
>> >> -     if (ilb < nr_cpu_ids && idle_cpu(ilb))
>> >> -             return ilb;
>> >> +     for_each_domain(cpu, tmp) {
>> >> +             ilb = cpumask_first_and(nohz.idle_cpus_mask,
>> >> +                             sched_domain_span(tmp));
>> >> +             if (ilb < nr_cpu_ids && idle_cpu(ilb))
>> >> +                     return ilb;
>> >> +     }
>> >
>> > The ILB code is bad; but you just made it horrible. Don't add pointless
>> > for_each_domain() iterations.
>> >
>> > I'm thinking something like:
>> >
>> >   ilb = cpumask_first_and(nohz.idle_cpus_mask, this_rq()->rd.span);
>> >
>> > Should work just fine, no?
>>
>> Yes, it gives the same result as my previous patch.
>>
>> >
>> > Better still would be to maybe not participate in the ILB in the first
>> > place and leave this selection loop alone.
>>
>> I don't quite get your point here...
>> Do you mean that the idle cpu selection should be done earlier than
>> the current find_new_ilb()?
>
> I meant that if you stop an idle CPU setting its bit in
> nohz.idle_cpus_mask, you don't have to mask it out either.

Understood.
I will rework my previous patch according to your suggestion.

Thanks,
Lei


* [PATCH] sched: keep quiescent cpu out of idle balance loop
  2014-02-20  8:50     ` Peter Zijlstra
  2014-02-20  9:15       ` Lei Wen
@ 2014-02-20  9:17       ` Lei Wen
  2014-02-20 12:04         ` Peter Zijlstra
  2014-02-20 12:23         ` Peter Zijlstra
  1 sibling, 2 replies; 15+ messages in thread
From: Lei Wen @ 2014-02-20  9:17 UTC (permalink / raw)
  To: Peter Zijlstra, Lei Wen, mingo, preeti.lkml, daniel.lezcano,
	viresh.kumar, xjian, linux-kernel

A cpu which is put into quiescent mode removes itself from the
kernel's sched_domain hierarchy and does not want others to disturb
its running tasks. But the current scheduler does not check whether
a cpu is in such a mode, and still asks the quiescent cpu to take
part in nohz load balancing.

Fix this by preventing such a cpu from setting its bit in
nohz.idle_cpus_mask in the first place.

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 235cfa7..bc85022 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6883,6 +6883,13 @@ void nohz_balance_enter_idle(int cpu)
 	if (!cpu_active(cpu))
 		return;
 
+	/*
+	 * If this cpu is kept outside of the root domain, don't bother
+	 * to ask it for nohz balancing.
+	 */
+	if (!cpumask_test_cpu(cpu, this_rq()->rd.span))
+		return;
+
 	if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
 		return;
 
-- 
1.8.3.2



* Re: [PATCH] sched: keep quiescent cpu out of idle balance loop
  2014-02-20  9:17       ` Lei Wen
@ 2014-02-20 12:04         ` Peter Zijlstra
  2014-02-20 12:23         ` Peter Zijlstra
  1 sibling, 0 replies; 15+ messages in thread
From: Peter Zijlstra @ 2014-02-20 12:04 UTC (permalink / raw)
  To: Lei Wen
  Cc: mingo, preeti.lkml, daniel.lezcano, viresh.kumar, xjian, linux-kernel

On Thu, Feb 20, 2014 at 05:17:04PM +0800, Lei Wen wrote:
> A cpu which is put into quiescent mode removes itself from the
> kernel's sched_domain hierarchy and does not want others to disturb
> its running tasks. But the current scheduler does not check whether
> a cpu is in such a mode, and still asks the quiescent cpu to take
> part in nohz load balancing.
> 
> Fix this by preventing such a cpu from setting its bit in
> nohz.idle_cpus_mask in the first place.
> 
> Signed-off-by: Lei Wen <leiwen@marvell.com>

Thanks


* Re: [PATCH] sched: keep quiescent cpu out of idle balance loop
  2014-02-20  9:17       ` Lei Wen
  2014-02-20 12:04         ` Peter Zijlstra
@ 2014-02-20 12:23         ` Peter Zijlstra
  2014-02-21  2:23           ` [PATCH v2] " Lei Wen
  1 sibling, 1 reply; 15+ messages in thread
From: Peter Zijlstra @ 2014-02-20 12:23 UTC (permalink / raw)
  To: Lei Wen
  Cc: mingo, preeti.lkml, daniel.lezcano, viresh.kumar, xjian, linux-kernel

On Thu, Feb 20, 2014 at 05:17:04PM +0800, Lei Wen wrote:
> +++ b/kernel/sched/fair.c
> @@ -6883,6 +6883,13 @@ void nohz_balance_enter_idle(int cpu)
>  	if (!cpu_active(cpu))
>  		return;
>  
> +	/*
> +	 * If this cpu is kept outside of the root domain, don't bother
> +	 * to ask it for nohz balancing.
> +	 */
> +	if (!cpumask_test_cpu(cpu, this_rq()->rd.span))

 rd->span

Otherwise the compiler gets upset. Fixed it.

> +		return;
> +
>  	if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
>  		return;
>  
> -- 
> 1.8.3.2
> 


* [PATCH v2] sched: keep quiescent cpu out of idle balance loop
  2014-02-20 12:23         ` Peter Zijlstra
@ 2014-02-21  2:23           ` Lei Wen
  2014-02-21  5:51             ` Mike Galbraith
  0 siblings, 1 reply; 15+ messages in thread
From: Lei Wen @ 2014-02-21  2:23 UTC (permalink / raw)
  To: Peter Zijlstra, Lei Wen, mingo, preeti.lkml, daniel.lezcano,
	viresh.kumar, xjian, linux-kernel

A cpu which is put into quiescent mode removes itself from the
kernel's sched_domain hierarchy and does not want others to disturb
its running tasks. But the current scheduler does not check whether
a cpu is in such a mode, and still asks the quiescent cpu to take
part in nohz load balancing.

Fix this by preventing such a cpu from setting its bit in
nohz.idle_cpus_mask in the first place.

Signed-off-by: Lei Wen <leiwen@marvell.com>
---
 kernel/sched/fair.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 235cfa7..66194fc 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6883,6 +6883,13 @@ void nohz_balance_enter_idle(int cpu)
 	if (!cpu_active(cpu))
 		return;
 
+	/*
+	 * If this cpu is kept outside of the root domain, don't bother
+	 * to ask it for nohz balancing.
+	 */
+	if (!cpumask_test_cpu(cpu, this_rq()->rd->span))
+		return;
+
 	if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
 		return;
 
-- 
1.8.3.2



* Re: [PATCH v2] sched: keep quiescent cpu out of idle balance loop
  2014-02-21  2:23           ` [PATCH v2] " Lei Wen
@ 2014-02-21  5:51             ` Mike Galbraith
  2014-02-21  7:28               ` Lei Wen
  0 siblings, 1 reply; 15+ messages in thread
From: Mike Galbraith @ 2014-02-21  5:51 UTC (permalink / raw)
  To: Lei Wen
  Cc: Peter Zijlstra, mingo, preeti.lkml, daniel.lezcano, viresh.kumar,
	xjian, linux-kernel

On Fri, 2014-02-21 at 10:23 +0800, Lei Wen wrote: 
> A cpu which is put into quiescent mode removes itself from the
> kernel's sched_domain hierarchy and does not want others to disturb
> its running tasks. But the current scheduler does not check whether
> a cpu is in such a mode, and still asks the quiescent cpu to take
> part in nohz load balancing.

Let's isolate some CPUs.

Setup:
/-----"system" CPU0
\
  \---"rtcpus" CPUs1-3

crash> runqueues
PER-CPU DATA TYPE:
  struct rq runqueues;
PER-CPU ADDRESSES:
  [0]: ffff88022fc12c00
  [1]: ffff88022fc92c00
  [2]: ffff88022fd12c00
  [3]: ffff88022fd92c00 
crash> struct rq ffff88022fd92c00 | grep sd
  sd = 0x0,  <== yup, CPU3 is isolated bit of silicon
crash> struct rq ffff88022fd92c00 | grep rd
  rd = 0xffffffff81bffe60 <def_root_domain>,
crash> struct -x root_domain 0xffffffff81bffe60
...
  span = {{
      bits = {0xe} <== "rtcpus"
    }},
crash> struct rq ffff88022fc12c00 | grep rd
  rd = 0xffff8802242c5800, 
crash> struct -x root_domain 0xffff8802242c5800
... 
  span = {{
      bits = {0x1} <== "system"
    }},

Turn off load balancing in "system" as well now, CPU0 loses its 'sd',
and becomes an isolated island identical to "rtcpus" CPUs, and thus..

  span = {{
      bits = {0xf} <== oh darn
    }},

.."system" and "rtcpus" merge, all CPUs having NULL sd, as they are now
all remote silicon islands, but they now also share rd again, as if you
had never diddled domains in the first place.

-Mike



* Re: [PATCH v2] sched: keep quiescent cpu out of idle balance loop
  2014-02-21  5:51             ` Mike Galbraith
@ 2014-02-21  7:28               ` Lei Wen
  2014-02-21  8:34                 ` Mike Galbraith
  0 siblings, 1 reply; 15+ messages in thread
From: Lei Wen @ 2014-02-21  7:28 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Lei Wen, Peter Zijlstra, mingo, preeti.lkml, daniel.lezcano,
	viresh.kumar, xjian, linux-kernel

Mike,

On Fri, Feb 21, 2014 at 1:51 PM, Mike Galbraith <bitbucket@online.de> wrote:
> On Fri, 2014-02-21 at 10:23 +0800, Lei Wen wrote:
>> A cpu which is put into quiescent mode removes itself from the
>> kernel's sched_domain hierarchy and does not want others to disturb
>> its running tasks. But the current scheduler does not check whether
>> a cpu is in such a mode, and still asks the quiescent cpu to take
>> part in nohz load balancing.
>
> Let's isolate some CPUs.
>
> Setup:
> /-----"system" CPU0
> \
>   \---"rtcpus" CPUs1-3
>
> crash> runqueues
> PER-CPU DATA TYPE:
>   struct rq runqueues;
> PER-CPU ADDRESSES:
>   [0]: ffff88022fc12c00
>   [1]: ffff88022fc92c00
>   [2]: ffff88022fd12c00
>   [3]: ffff88022fd92c00
> crash> struct rq ffff88022fd92c00 | grep sd
>   sd = 0x0,  <== yup, CPU3 is isolated bit of silicon
> crash> struct rq ffff88022fd92c00 | grep rd
>   rd = 0xffffffff81bffe60 <def_root_domain>,
> crash> struct -x root_domain 0xffffffff81bffe60
> ...
>   span = {{
>       bits = {0xe} <== "rtcpus"
>     }},
> crash> struct rq ffff88022fc12c00 | grep rd
>   rd = 0xffff8802242c5800,
> crash> struct -x root_domain 0xffff8802242c5800
> ...
>   span = {{
>       bits = {0x1} <== "system"
>     }},
>
> Turn off load balancing in "system" as well now, CPU0 loses its 'sd',
> and becomes an isolated island identical to "rtcpus" CPUs, and thus..
>
>   span = {{
>       bits = {0xf} <== oh darn
>     }},
>
> .."system" and "rtcpus" merge, all CPUs having NULL sd, as they are now
> all remote silicon islands, but they now also share rd again, as if you
> had never diddled domains in the first place.

Great catch!
Actually, what I experimented with is:
1. set the top cpuset to disable load balancing
2. put cpus 0-2 into "system" and enable its load balancing
3. put cpu 3 into "rt" and disable load balancing

This way, the root span seen by cpus 0-2 always covers [0-2], as you
also mentioned. And it is true that if I disable load balancing there
as well, the span masks get merged.

So how about the change below?
+       if (!this_rq()->sd)
+               return;
Assuming an isolated cpu loses its sd, could you help confirm that
from crash too?

Or do you think it is wrong to merge the root domains when the
"system" group disables load balancing?

Thanks,
Lei


* Re: [PATCH v2] sched: keep quiescent cpu out of idle balance loop
  2014-02-21  7:28               ` Lei Wen
@ 2014-02-21  8:34                 ` Mike Galbraith
  2014-02-21  9:15                   ` [PATCH v3] " Lei Wen
  2014-02-21  9:15                   ` [PATCH v2] " Mike Galbraith
  0 siblings, 2 replies; 15+ messages in thread
From: Mike Galbraith @ 2014-02-21  8:34 UTC (permalink / raw)
  To: Lei Wen
  Cc: Lei Wen, Peter Zijlstra, mingo, preeti.lkml, daniel.lezcano,
	viresh.kumar, xjian, linux-kernel

On Fri, 2014-02-21 at 15:28 +0800, Lei Wen wrote:

> Actually, what I experimented with is:
> 1. set the top cpuset to disable load balancing
> 2. put cpus 0-2 into "system" and enable its load balancing
> 3. put cpu 3 into "rt" and disable load balancing

Exactly as I do, pertinent part of my cheezy script being...

        # ...and fire up the shield
        cset shield --userset=rtcpus --cpu=${START_CPU}-${END_CPU} --kthread=on

        # If cpuset wasn't previously mounted (no obnoxious systemd),
        # we just mounted it.  Find the mount point.
        if [ -z $CPUSET_ROOT ]; then
                CPUSET_ROOT=$(grep cpuset /proc/mounts|cut -d ' ' -f2)
                if [ -z $CPUSET_ROOT ]; then
                        # If it's not mounted now, bail.
                        echo EEK, cpuset is not mounted!
                        exit
                else
                        # ok, check for cgroup mount
                        if [ -f ${CPUSET_ROOT}/cpuset.cpus ]; then
                                CPUSET_PREFIX=cpuset.
                        fi
                fi
        fi

        echo 0 > ${CPUSET_ROOT}/${CPUSET_PREFIX}sched_load_balance
        echo 1 > ${CPUSET_ROOT}/system/${CPUSET_PREFIX}sched_load_balance
        echo 0 > ${CPUSET_ROOT}/rtcpus/${CPUSET_PREFIX}sched_load_balance
        echo 0 > ${CPUSET_ROOT}/rtcpus/${CPUSET_PREFIX}sched_relax_domain_level

> This way, the root span seen by cpus 0-2 always covers [0-2], as you
> also mentioned. And it is true that if I disable load balancing there
> as well, the span masks get merged.
> 
> So how about the change below?
> +       if (!this_rq()->sd)
> +               return;
> Assuming an isolated cpu loses its sd, could you help confirm that
> from crash too?

Yeah, isolated CPUs have no sd connectivity.  I sent Peter a patchlet
offline showing what I do to keep nohz at bay.
> Or do you think it is wrong to merge the root domains when the
> "system" group disables load balancing?

I think the construction stuff works fine, and !->sd is the perfect cue
to tell various things to keep their grubby mitts off of a CPU.
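
A minimal sketch of that cue (illustrative only, with a hypothetical
helper name, not kernel code): an isolated cpu has no sched_domain
attached to its rq no matter how the root-domain spans end up merged,
so callers that should leave it alone can simply test for a NULL sd.

static inline bool cpu_is_balancing_isolated(int cpu)
{
	return cpu_rq(cpu)->sd == NULL;	/* no sd: keep hands off */
}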

-Mike



* [PATCH v3] sched: keep quiescent cpu out of idle balance loop
  2014-02-21  8:34                 ` Mike Galbraith
@ 2014-02-21  9:15                   ` Lei Wen
  2014-02-21  9:41                     ` Mike Galbraith
  2014-02-21  9:15                   ` [PATCH v2] " Mike Galbraith
  1 sibling, 1 reply; 15+ messages in thread
From: Lei Wen @ 2014-02-21  9:15 UTC (permalink / raw)
  To: Mike Galbraith, Lei Wen, Peter Zijlstra, mingo, preeti.lkml,
	daniel.lezcano, viresh.kumar, xjian, linux-kernel

A cpu which is put into quiescent mode has its rq's sd member set
to NULL and does not want others to disturb its running tasks. But
the current scheduler does not check whether a cpu is in such a
mode, and still asks the quiescent cpu to take part in nohz load
balancing.

Fix this by preventing such a cpu from setting its bit in
nohz.idle_cpus_mask in the first place.

Signed-off-by: Lei Wen <leiwen@marvell.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <bitbucket@online.de>
---
Many thanks to Mike for pointing out, from the crash results, that the
root span gets merged when the last cpu becomes isolated!

 kernel/sched/fair.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 235cfa7..af30b6a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6883,6 +6883,14 @@ void nohz_balance_enter_idle(int cpu)
 	if (!cpu_active(cpu))
 		return;
 
+	/*
+	 * If this cpu is isolated, its rq's sd member becomes NULL.
+	 * Based on this observation, we can exclude this cpu from nohz
+	 * idle balancing, so that it is not disturbed.
+	 */
+	if (!this_rq()->sd)
+		return;
+
 	if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
 		return;
 
-- 
1.8.3.2



* Re: [PATCH v2] sched: keep quiescent cpu out of idle balance loop
  2014-02-21  8:34                 ` Mike Galbraith
  2014-02-21  9:15                   ` [PATCH v3] " Lei Wen
@ 2014-02-21  9:15                   ` Mike Galbraith
  1 sibling, 0 replies; 15+ messages in thread
From: Mike Galbraith @ 2014-02-21  9:15 UTC (permalink / raw)
  To: Lei Wen
  Cc: Lei Wen, Peter Zijlstra, mingo, preeti.lkml, daniel.lezcano,
	viresh.kumar, xjian, linux-kernel

On Fri, 2014-02-21 at 09:34 +0100, Mike Galbraith wrote:

> I think the construction stuff works fine, and !->sd is the perfect cue
> to tell various things to keep their grubby mitts off of a CPU.

Take idle_balance() for instance.. not much point in dropping rq->lock
just to take it again after doing _nothing_, and we certainly wouldn't
want to go play load balancer for connected CPUs anyway, we might get
something much more important to do RSN.
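
A sketch of that early-out (illustrative only; the idle_balance()
prototype differs between kernel versions, so take this as a shape,
not a patch):

static void idle_balance(int this_cpu, struct rq *this_rq)
{
	/* isolated cpu: no sched_domain, nobody to pull from, so do
	 * not drop and retake this_rq->lock for nothing */
	if (!this_rq->sd)
		return;

	/* ... existing newidle balancing ... */
}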

(that applies to rt in general - go off and play load balancer in the
SCHED_OTHER slums?  Surely you jest, elite snobs don't do scut work,
they actually might get their hands dirty;)

Or, at user discretion, telling CPUPRI stuff that no load balancing also
means no rt balancing, i.e. the user assumes responsibility for ALL task
placement, so we don't need to call neutered functions or spend cycles
maintaining data we will have no use for until the user tells us he is
finished with these CPUs.

etc.

-Mike 



* Re: [PATCH v3] sched: keep quiescent cpu out of idle balance loop
  2014-02-21  9:15                   ` [PATCH v3] " Lei Wen
@ 2014-02-21  9:41                     ` Mike Galbraith
  0 siblings, 0 replies; 15+ messages in thread
From: Mike Galbraith @ 2014-02-21  9:41 UTC (permalink / raw)
  To: Lei Wen
  Cc: Peter Zijlstra, mingo, preeti.lkml, daniel.lezcano, viresh.kumar,
	xjian, linux-kernel

On Fri, 2014-02-21 at 17:15 +0800, Lei Wen wrote:

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 235cfa7..af30b6a 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6883,6 +6883,14 @@ void nohz_balance_enter_idle(int cpu)
>  	if (!cpu_active(cpu))
>  		return;
>  
> +	/*
> +	 * If this cpu is isolated, its rq's sd member becomes NULL.
> +	 * Based on this observation, we can exclude this cpu from nohz
> +	 * idle balancing, so that it is not disturbed.
> +	 */
> +	if (!this_rq()->sd)
> +		return;
> +
>  	if (test_bit(NOHZ_TICK_STOPPED, nohz_flags(cpu)))
>  		return;

What about nohz_balance_exit_idle()?

I think Peter queued a patchlet to tell nohz balancing to go away.

-Mike



Thread overview: 15+ messages
2014-02-19  5:20 [PATCH] sched: keep quiescent cpu out of idle balance loop Lei Wen
2014-02-19  9:04 ` Peter Zijlstra
2014-02-20  2:42   ` Lei Wen
2014-02-20  8:50     ` Peter Zijlstra
2014-02-20  9:15       ` Lei Wen
2014-02-20  9:17       ` Lei Wen
2014-02-20 12:04         ` Peter Zijlstra
2014-02-20 12:23         ` Peter Zijlstra
2014-02-21  2:23           ` [PATCH v2] " Lei Wen
2014-02-21  5:51             ` Mike Galbraith
2014-02-21  7:28               ` Lei Wen
2014-02-21  8:34                 ` Mike Galbraith
2014-02-21  9:15                   ` [PATCH v3] " Lei Wen
2014-02-21  9:41                     ` Mike Galbraith
2014-02-21  9:15                   ` [PATCH v2] " Mike Galbraith
