All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] nohz: add tick_nohz_full_set_cpus() API
@ 2015-04-03 16:24 cmetcalf
  2015-04-03 16:24 ` [PATCH 2/2] nohz: make nohz_full imply isolcpus cmetcalf
  0 siblings, 1 reply; 89+ messages in thread
From: cmetcalf @ 2015-04-03 16:24 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

From: Chris Metcalf <cmetcalf@ezchip.com>

This is useful, for example, to modify a cpumask indicating some
special nohz-type functionality so that the nohz cores are
automatically added to that set.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
 include/linux/tick.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index d53ad4892a39..29456c443970 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -192,6 +192,12 @@ static inline void tick_nohz_full_clear_cpus(struct cpumask *mask)
 		cpumask_andnot(mask, mask, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_set_cpus(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -201,6 +207,7 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
 static inline void tick_nohz_full_clear_cpus(struct cpumask *mask) { }
+static inline void tick_nohz_full_set_cpus(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-03 16:24 [PATCH 1/2] nohz: add tick_nohz_full_set_cpus() API cmetcalf
@ 2015-04-03 16:24 ` cmetcalf
  2015-04-03 17:42   ` Frederic Weisbecker
  2015-04-03 18:08   ` [PATCH 2/2] nohz: make nohz_full imply isolcpus Mike Galbraith
  0 siblings, 2 replies; 89+ messages in thread
From: cmetcalf @ 2015-04-03 16:24 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

From: Chris Metcalf <cmetcalf@ezchip.com>

It's not clear that nohz_full is useful without isolcpus also
set, since otherwise the scheduler has to run periodically to
try to determine whether to steal work from other cores.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
I am puzzled why this has not been done before, so I suspect
there is some argument against it that I am missing, but I
wasn't able to turn anything up by searching LKML.

 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..275f12c608f2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
 	doms_cur = alloc_sched_domains(ndoms_cur);
 	if (!doms_cur)
 		doms_cur = &fallback_doms;
+	tick_nohz_full_set_cpus(cpu_isolated_map);
 	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
 	err = build_sched_domains(doms_cur[0], NULL);
 	register_sched_domain_sysctl();
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-03 16:24 ` [PATCH 2/2] nohz: make nohz_full imply isolcpus cmetcalf
@ 2015-04-03 17:42   ` Frederic Weisbecker
  2015-04-03 19:20     ` Chris Metcalf
  2015-04-03 18:08   ` [PATCH 2/2] nohz: make nohz_full imply isolcpus Mike Galbraith
  1 sibling, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-03 17:42 UTC (permalink / raw)
  To: cmetcalf, Rik van Riel
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Fri, Apr 03, 2015 at 12:24:08PM -0400, cmetcalf@ezchip.com wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
> 
> It's not clear that nohz_full is useful without isolcpus also
> set, since otherwise the scheduler has to run periodically to
> try to determine whether to steal work from other cores.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>

I think Rick has a similar patch.

> ---
> I am puzzled why this has not been done before, so I suspect
> there is some argument against it that I am missing, but I
> wasn't able to turn anything up by searching LKML.
> 
>  kernel/sched/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..275f12c608f2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
>  	doms_cur = alloc_sched_domains(ndoms_cur);
>  	if (!doms_cur)
>  		doms_cur = &fallback_doms;
> +	tick_nohz_full_set_cpus(cpu_isolated_map);
>  	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
>  	err = build_sched_domains(doms_cur[0], NULL);
>  	register_sched_domain_sysctl();
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-03 16:24 ` [PATCH 2/2] nohz: make nohz_full imply isolcpus cmetcalf
  2015-04-03 17:42   ` Frederic Weisbecker
@ 2015-04-03 18:08   ` Mike Galbraith
  2015-04-03 19:21     ` Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-04-03 18:08 UTC (permalink / raw)
  To: cmetcalf
  Cc: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Fri, 2015-04-03 at 12:24 -0400, cmetcalf@ezchip.com wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
> 
> It's not clear that nohz_full is useful without isolcpus also
> set, since otherwise the scheduler has to run periodically to
> try to determine whether to steal work from other cores.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>

Ack!  nohz_full= as currently defined makes zero sense when the cpu 
set (which should be spelled cpuset) remains connected to the 
scheduler.  Perturbation of tasks to PREVENT cpu domination is what 
the scheduler does for a living.  Sprinkling microsecond savers all 
over the kernel is pretty silly if you don't shut down the mother lode 
of perturbation.

> ---
> I am puzzled why this has not been done before, so I suspect
> there is some argument against it that I am missing, but I
> wasn't able to turn anything up by searching LKML.
> 
>  kernel/sched/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..275f12c608f2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct 
> cpumask *cpu_map)
>       doms_cur = alloc_sched_domains(ndoms_cur);
>       if (!doms_cur)
>               doms_cur = &fallback_doms;
> +     tick_nohz_full_set_cpus(cpu_isolated_map);
>       cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
>       err = build_sched_domains(doms_cur[0], NULL);
>       register_sched_domain_sysctl();

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-03 17:42   ` Frederic Weisbecker
@ 2015-04-03 19:20     ` Chris Metcalf
  2015-04-04 14:10       ` Rik van Riel
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-03 19:20 UTC (permalink / raw)
  To: Frederic Weisbecker, Rik van Riel
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On 04/03/2015 01:42 PM, Frederic Weisbecker wrote:
> On Fri, Apr 03, 2015 at 12:24:08PM -0400, cmetcalf@ezchip.com wrote:
>> From: Chris Metcalf <cmetcalf@ezchip.com>
>>
>> It's not clear that nohz_full is useful without isolcpus also
>> set, since otherwise the scheduler has to run periodically to
>> try to determine whether to steal work from other cores.
>>
>> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> I think Rick has a similar patch.

I didn't see anything relevant in linux-next, though I did see 
cpu_isolated_map
made into a public symbol in a recent commit by Rik.

Rik, what's the change you're proposing that's similar to this one? Thanks!

>> ---
>> I am puzzled why this has not been done before, so I suspect
>> there is some argument against it that I am missing, but I
>> wasn't able to turn anything up by searching LKML.
>>
>>   kernel/sched/core.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index f0f831e8a345..275f12c608f2 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
>>   	doms_cur = alloc_sched_domains(ndoms_cur);
>>   	if (!doms_cur)
>>   		doms_cur = &fallback_doms;
>> +	tick_nohz_full_set_cpus(cpu_isolated_map);
>>   	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
>>   	err = build_sched_domains(doms_cur[0], NULL);
>>   	register_sched_domain_sysctl();
>> -- 
>> 2.1.2
>>

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-03 18:08   ` [PATCH 2/2] nohz: make nohz_full imply isolcpus Mike Galbraith
@ 2015-04-03 19:21     ` Chris Metcalf
  2015-04-04  2:03       ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-03 19:21 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 04/03/2015 02:08 PM, Mike Galbraith wrote:
> On Fri, 2015-04-03 at 12:24 -0400, cmetcalf@ezchip.com wrote:
>> From: Chris Metcalf <cmetcalf@ezchip.com>
>>
>> It's not clear that nohz_full is useful without isolcpus also
>> set, since otherwise the scheduler has to run periodically to
>> try to determine whether to steal work from other cores.
>>
>> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> Ack!  nohz_full= as currently defined makes zero sense when the cpu
> set (which should be spelled cpuset) remains connected to the
> scheduler.  Perturbation of tasks to PREVENT cpu domination is what
> the scheduler does for a living.  Sprinkling microsecond savers all
> over the kernel is pretty silly if you don't shut down the mother lode
> of perturbation.

Sounds like a thumbs up for this patch, then?  :-)

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-03 19:21     ` Chris Metcalf
@ 2015-04-04  2:03       ` Mike Galbraith
  2015-04-04  3:43         ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-04-04  2:03 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Fri, 2015-04-03 at 15:21 -0400, Chris Metcalf wrote:
> On 04/03/2015 02:08 PM, Mike Galbraith wrote:
> > On Fri, 2015-04-03 at 12:24 -0400,  cmetcalf@ezchip.comwrote:
> > > From: Chris Metcalf <cmetcalf@ezchip.com>
> > > 
> > > It's not clear that nohz_full is useful without isolcpus also
> > > set, since otherwise the scheduler has to run periodically to
> > > try to determine whether to steal work from other cores.
> > > 
> > > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> > Ack!  nohz_full= as currently defined makes zero sense when the cpu
> > set (which should be spelled cpuset) remains connected to the
> > scheduler.  Perturbation of tasks to PREVENT cpu domination is what
> > the scheduler does for a living.  Sprinkling microsecond savers all
> > over the kernel is pretty silly if you don't shut down the mother 
> > lode
> > of perturbation.
> 
> Sounds like a thumbs up for this patch, then?  :-)

Yup.  The other thumb turns in the up direction when folks start 
spelling cpuset properly ;-)  Static isolcpus was supposed to go away.

        -Mike

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-04  2:03       ` Mike Galbraith
@ 2015-04-04  3:43         ` Mike Galbraith
  2015-04-06 19:28           ` Rik van Riel
  0 siblings, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-04-04  3:43 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Sat, 2015-04-04 at 04:03 +0200, Mike Galbraith wrote:
> On Fri, 2015-04-03 at 15:21 -0400, Chris Metcalf wrote:
> > On 04/03/2015 02:08 PM, Mike Galbraith wrote:
> > > On Fri, 2015-04-03 at 12:24 -0400,  cmetcalf@ezchip.comwrote:
> > > > From: Chris Metcalf <cmetcalf@ezchip.com>
> > > > 
> > > > It's not clear that nohz_full is useful without isolcpus also
> > > > set, since otherwise the scheduler has to run periodically to
> > > > try to determine whether to steal work from other cores.
> > > > 
> > > > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> > > Ack!  nohz_full= as currently defined makes zero sense when the 
> > > cpu
> > > set (which should be spelled cpuset) remains connected to the
> > > scheduler.  Perturbation of tasks to PREVENT cpu domination is 
> > > what
> > > the scheduler does for a living.  Sprinkling microsecond savers 
> > > all
> > > over the kernel is pretty silly if you don't shut down the 
> > > mother 
> > > lode
> > > of perturbation.
> > 
> > Sounds like a thumbs up for this patch, then?  :-)
> 
> Yup.  The other thumb turns in the up direction when folks start 
> spelling cpuset properly ;-)  Static isolcpus was supposed to go 
> away.

Speaking of microsecond savers, the (ick) deferment experiment below 
cut 60 core jitter in half.  Shooting the clocksource watchdog fixes 
alternating ~15us/~5us tick on my desktop box.

With workqueue twiddles and whatnot floating around, the thing is 
starting to look viable.

---
 kernel/sched/core.c       |    5 +++--
 kernel/time/clocksource.c |    5 +++++
 2 files changed, 8 insertions(+), 2 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2604,12 +2604,13 @@ u64 scheduler_tick_max_deferment(void)
        struct rq *rq = this_rq();
        unsigned long next, now = ACCESS_ONCE(jiffies);
 
-       next = rq->last_sched_tick + HZ;
+       next = (rq->last_sched_tick + HZ) | (rq->clock & 0x3f);
 
        if (time_before_eq(next, now))
                return 0;
 
-       return jiffies_to_nsecs(next - now);
+       /* Add noise to avoid CPUs colliding at tick boundaries */
+       return jiffies_to_nsecs(next - now) | (rq->clock & 0xfffff);
 }
 #endif
 
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -267,8 +267,13 @@ static void clocksource_watchdog(unsigne
         * to each other.
         */
        next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
+skip_nohz_full:
        if (next_cpu >= nr_cpu_ids)
                next_cpu = cpumask_first(cpu_online_mask);
+       if (next_cpu && tick_nohz_full_cpu(next_cpu)) {
+               next_cpu = cpumask_next(next_cpu, cpu_online_mask);
+               goto skip_nohz_full;
+       }
        watchdog_timer.expires += WATCHDOG_INTERVAL;
        add_timer_on(&watchdog_timer, next_cpu);
 out:

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-03 19:20     ` Chris Metcalf
@ 2015-04-04 14:10       ` Rik van Riel
  2015-04-04 17:09         ` Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Rik van Riel @ 2015-04-04 14:10 UTC (permalink / raw)
  To: Chris Metcalf, Frederic Weisbecker
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On 04/03/2015 03:20 PM, Chris Metcalf wrote:
> On 04/03/2015 01:42 PM, Frederic Weisbecker wrote:
>> On Fri, Apr 03, 2015 at 12:24:08PM -0400, cmetcalf@ezchip.com wrote:
>>> From: Chris Metcalf <cmetcalf@ezchip.com>
>>>
>>> It's not clear that nohz_full is useful without isolcpus also
>>> set, since otherwise the scheduler has to run periodically to
>>> try to determine whether to steal work from other cores.
>>>
>>> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
>> I think Rick has a similar patch.
> 
> I didn't see anything relevant in linux-next, though I did see
> cpu_isolated_map
> made into a public symbol in a recent commit by Rik.

I have a few patches in cgroups/for-4.1 as well that
export information about isolated and nohz_full cpus
in /sys/devices/system/cpu/

> Rik, what's the change you're proposing that's similar to this one? Thanks!

I don't have this particular one, and I like it.

I know there are use cases where isolcpus= without
nohz_full= makes sense, but I cannot think of the
reverse.

Acked-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-04 14:10       ` Rik van Riel
@ 2015-04-04 17:09         ` Chris Metcalf
  2015-04-05  5:05           ` Ingo Molnar
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-04 17:09 UTC (permalink / raw)
  To: Rik van Riel, Mike Galbraith
  Cc: Frederic Weisbecker, Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On 4/4/2015 10:10 AM, Rik van Riel wrote:
>> Rik, what's the change you're proposing that's similar to this one? Thanks!
> I don't have this particular one, and I like it.
>
> I know there are use cases where isolcpus= without
> nohz_full= makes sense, but I cannot think of the
> reverse.
>
> Acked-by: Rik van Riel<riel@redhat.com>

Thanks, I'll push it via the tile tree unless someone would prefer otherwise.
(The tick_nohz_full_set_cpus() and tick_nohz_full_clear_cpus() routines are in
earlier tile tree commits, the latter supporting a change to the tile network driver.)

Mike, I added your

Thumbs-Up-by: Mike Galbraith <umgwanakikbuti@gmail.com> :-)

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-04 17:09         ` Chris Metcalf
@ 2015-04-05  5:05           ` Ingo Molnar
  2015-04-06 17:15             ` Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Ingo Molnar @ 2015-04-05  5:05 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Rik van Riel, Mike Galbraith, Frederic Weisbecker,
	Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel


* Chris Metcalf <cmetcalf@ezchip.com> wrote:

> On 4/4/2015 10:10 AM, Rik van Riel wrote:
> >>Rik, what's the change you're proposing that's similar to this one? Thanks!
> >I don't have this particular one, and I like it.
> >
> >I know there are use cases where isolcpus= without
> >nohz_full= makes sense, but I cannot think of the
> >reverse.
> >
> >Acked-by: Rik van Riel<riel@redhat.com>
> 
> Thanks, I'll push it via the tile tree unless someone would prefer otherwise.

Yes, I'd prefer otherwise: please send the final, agreed upon patch to 
the timer tree.

> (The tick_nohz_full_set_cpus() and tick_nohz_full_clear_cpus() 
> routines are in earlier tile tree commits, the latter supporting a 
> change to the tile network driver.)

This is absolutely not OK, please push this through the timer tree. We 
don't do generic timer changes through architecture trees.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-05  5:05           ` Ingo Molnar
@ 2015-04-06 17:15             ` Chris Metcalf
  2015-04-06 18:16               ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs cmetcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-06 17:15 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Rik van Riel, Mike Galbraith, Frederic Weisbecker,
	Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On 04/05/2015 01:05 AM, Ingo Molnar wrote:
> * Chris Metcalf <cmetcalf@ezchip.com> wrote:
>
>> On 4/4/2015 10:10 AM, Rik van Riel wrote:
>>>> Rik, what's the change you're proposing that's similar to this one? Thanks!
>>> I don't have this particular one, and I like it.
>>>
>>> I know there are use cases where isolcpus= without
>>> nohz_full= makes sense, but I cannot think of the
>>> reverse.
>>>
>>> Acked-by: Rik van Riel<riel@redhat.com>
>> Thanks, I'll push it via the tile tree unless someone would prefer otherwise.
> Yes, I'd prefer otherwise: please send the final, agreed upon patch to
> the timer tree.

Not having done this before, I assume I just ask you to take patches
from LKML, or to pull from my tree?

I've set up a "timers" branch in my kernel.org git repo in any case
(git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git)
that has the commits in it.

>> (The tick_nohz_full_set_cpus() and tick_nohz_full_clear_cpus()
>> routines are in earlier tile tree commits, the latter supporting a
>> change to the tile network driver.)
> This is absolutely not OK, please push this through the timer tree. We
> don't do generic timer changes through architecture trees.

Sure; makes sense.  I will post a v2 with two patches in the patchset
and see if anyone else wants to comment in the next couple of days,
and then ask you to pull, or take the patch series, at that point.

Thanks!

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs
  2015-04-06 17:15             ` Chris Metcalf
@ 2015-04-06 18:16               ` cmetcalf
  2015-04-06 18:16                 ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus cmetcalf
  2015-04-06 18:29                 ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs Frederic Weisbecker
  0 siblings, 2 replies; 89+ messages in thread
From: cmetcalf @ 2015-04-06 18:16 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

From: Chris Metcalf <cmetcalf@ezchip.com>

The "clear" API is useful, for example, to modify a cpumask to avoid
the nohz cores so that interrupts aren't sent to them.

Likewise the "set" API is useful to modify a cpumask indicating some
special nohz-type functionality so that the nohz cores are
automatically added to that set.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
Frederic, if you could ack this patch (and maybe the next) before
I ask for it to be pulled into the timer tree that would be great.
I will wait a few days before asking in case anyone else has any
other issues or would like to provide an Acked-by.

v2: put the "...set_cpus" API together with the change to
init_sched_domains() so that they can go into the timer tree
independently of other changes in my tree.

 include/linux/tick.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9c085dc12ae9..29456c443970 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_clear_cpus(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_andnot(mask, mask, tick_nohz_full_mask);
+}
+
+static inline void tick_nohz_full_set_cpus(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_clear_cpus(struct cpumask *mask) { }
+static inline void tick_nohz_full_set_cpus(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-06 18:16               ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs cmetcalf
@ 2015-04-06 18:16                 ` cmetcalf
  2015-04-07 22:29                   ` Frederic Weisbecker
  2015-04-08  9:41                   ` Peter Zijlstra
  2015-04-06 18:29                 ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs Frederic Weisbecker
  1 sibling, 2 replies; 89+ messages in thread
From: cmetcalf @ 2015-04-06 18:16 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

From: Chris Metcalf <cmetcalf@ezchip.com>

It's not clear that nohz_full is useful without isolcpus also
set, since otherwise the scheduler has to run periodically to
try to determine whether to steal work from other cores.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
Acked-by: Rik van Riel <riel@redhat.com>
---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..275f12c608f2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
 	doms_cur = alloc_sched_domains(ndoms_cur);
 	if (!doms_cur)
 		doms_cur = &fallback_doms;
+	tick_nohz_full_set_cpus(cpu_isolated_map);
 	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
 	err = build_sched_domains(doms_cur[0], NULL);
 	register_sched_domain_sysctl();
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs
  2015-04-06 18:16               ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs cmetcalf
  2015-04-06 18:16                 ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus cmetcalf
@ 2015-04-06 18:29                 ` Frederic Weisbecker
  2015-04-06 19:09                   ` Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-06 18:29 UTC (permalink / raw)
  To: cmetcalf
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Mon, Apr 06, 2015 at 02:16:44PM -0400, cmetcalf@ezchip.com wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
> 
> The "clear" API is useful, for example, to modify a cpumask to avoid
> the nohz cores so that interrupts aren't sent to them.
> 
> Likewise the "set" API is useful to modify a cpumask indicating some
> special nohz-type functionality so that the nohz cores are
> automatically added to that set.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> ---
> Frederic, if you could ack this patch (and maybe the next) before
> I ask for it to be pulled into the timer tree that would be great.
> I will wait a few days before asking in case anyone else has any
> other issues or would like to provide an Acked-by.

Sure, or better yet I should take them since they are full nohz code.
And I have a pending pile to pull-request to timer tree.

Thanks.

> 
> v2: put the "...set_cpus" API together with the change to
> init_sched_domains() so that they can go into the timer tree
> independently of other changes in my tree.
> 
>  include/linux/tick.h | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 9c085dc12ae9..29456c443970 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
>  	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
>  }
>  
> +static inline void tick_nohz_full_clear_cpus(struct cpumask *mask)
> +{
> +	if (tick_nohz_full_enabled())
> +		cpumask_andnot(mask, mask, tick_nohz_full_mask);
> +}
> +
> +static inline void tick_nohz_full_set_cpus(struct cpumask *mask)
> +{
> +	if (tick_nohz_full_enabled())
> +		cpumask_or(mask, mask, tick_nohz_full_mask);
> +}
> +
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
> @@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> +static inline void tick_nohz_full_clear_cpus(struct cpumask *mask) { }
> +static inline void tick_nohz_full_set_cpus(struct cpumask *mask) { }
>  static inline void __tick_nohz_full_check(void) { }
>  static inline void tick_nohz_full_kick_cpu(int cpu) { }
>  static inline void tick_nohz_full_kick(void) { }
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs
  2015-04-06 18:29                 ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs Frederic Weisbecker
@ 2015-04-06 19:09                   ` Chris Metcalf
  2015-04-07  9:33                     ` Ingo Molnar
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-06 19:09 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On 04/06/2015 02:29 PM, Frederic Weisbecker wrote:
> On Mon, Apr 06, 2015 at 02:16:44PM -0400,cmetcalf@ezchip.com  wrote:
>> >From: Chris Metcalf<cmetcalf@ezchip.com>
>> >
>> >The "clear" API is useful, for example, to modify a cpumask to avoid
>> >the nohz cores so that interrupts aren't sent to them.
>> >
>> >Likewise the "set" API is useful to modify a cpumask indicating some
>> >special nohz-type functionality so that the nohz cores are
>> >automatically added to that set.
>> >
>> >Signed-off-by: Chris Metcalf<cmetcalf@ezchip.com>
>> >---
>> >Frederic, if you could ack this patch (and maybe the next) before
>> >I ask for it to be pulled into the timer tree that would be great.
>> >I will wait a few days before asking in case anyone else has any
>> >other issues or would like to provide an Acked-by.
> Sure, or better yet I should take them since they are full nohz code.
> And I have a pending pile to pull-request to timer tree.

That would be great, thanks.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-04  3:43         ` Mike Galbraith
@ 2015-04-06 19:28           ` Rik van Riel
  2015-04-07  3:10             ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Rik van Riel @ 2015-04-06 19:28 UTC (permalink / raw)
  To: Mike Galbraith, Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 04/03/2015 11:43 PM, Mike Galbraith wrote:

> Speaking of microsecond savers, the (ick) deferment experiment below
> cut 60 core jitter in half.  Shooting the clocksource watchdog fixes
> alternating ~15us/~5us tick on my desktop box.
>
> With workqueue twiddles and whatnot floating around, the thing is
> starting to look viable.

Doesn't look too bad to me, though the changes below
could probably use some comments when turned into a
final patch :)

> ---
>   kernel/sched/core.c       |    5 +++--
>   kernel/time/clocksource.c |    5 +++++
>   2 files changed, 8 insertions(+), 2 deletions(-)
>
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2604,12 +2604,13 @@ u64 scheduler_tick_max_deferment(void)
>          struct rq *rq = this_rq();
>          unsigned long next, now = ACCESS_ONCE(jiffies);
>
> -       next = rq->last_sched_tick + HZ;
> +       next = (rq->last_sched_tick + HZ) | (rq->clock & 0x3f);
>
>          if (time_before_eq(next, now))
>                  return 0;
>
> -       return jiffies_to_nsecs(next - now);
> +       /* Add noise to avoid CPUs colliding at tick boundaries */
> +       return jiffies_to_nsecs(next - now) | (rq->clock & 0xfffff);
>   }
>   #endif
>
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -267,8 +267,13 @@ static void clocksource_watchdog(unsigne
>           * to each other.
>           */
>          next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask);
> +skip_nohz_full:
>          if (next_cpu >= nr_cpu_ids)
>                  next_cpu = cpumask_first(cpu_online_mask);
> +       if (next_cpu && tick_nohz_full_cpu(next_cpu)) {
> +               next_cpu = cpumask_next(next_cpu, cpu_online_mask);
> +               goto skip_nohz_full;
> +       }
>          watchdog_timer.expires += WATCHDOG_INTERVAL;
>          add_timer_on(&watchdog_timer, next_cpu);
>   out:
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 2/2] nohz: make nohz_full imply isolcpus
  2015-04-06 19:28           ` Rik van Riel
@ 2015-04-07  3:10             ` Mike Galbraith
  0 siblings, 0 replies; 89+ messages in thread
From: Mike Galbraith @ 2015-04-07  3:10 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Chris Metcalf, Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Mon, 2015-04-06 at 15:28 -0400, Rik van Riel wrote:
> On 04/03/2015 11:43 PM, Mike Galbraith wrote:
> 
> > Speaking of microsecond savers, the (ick) deferment experiment 
> > below
> > cut 60 core jitter in half.  Shooting the clocksource watchdog 
> > fixes
> > alternating ~15us/~5us tick on my desktop box.
> > 
> > With workqueue twiddles and whatnot floating around, the thing is
> > starting to look viable.
> 
> Doesn't look too bad to me, though the changes below
> could probably use some comments when turned into a
> final patch :)

The watchdog yeah, the tick thing may want to become.. anything else.

        -Mike

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs
  2015-04-06 19:09                   ` Chris Metcalf
@ 2015-04-07  9:33                     ` Ingo Molnar
  0 siblings, 0 replies; 89+ messages in thread
From: Ingo Molnar @ 2015-04-07  9:33 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel


* Chris Metcalf <cmetcalf@ezchip.com> wrote:

> On 04/06/2015 02:29 PM, Frederic Weisbecker wrote:
> >On Mon, Apr 06, 2015 at 02:16:44PM -0400,cmetcalf@ezchip.com  wrote:
> >>>From: Chris Metcalf<cmetcalf@ezchip.com>
> >>>
> >>>The "clear" API is useful, for example, to modify a cpumask to avoid
> >>>the nohz cores so that interrupts aren't sent to them.
> >>>
> >>>Likewise the "set" API is useful to modify a cpumask indicating some
> >>>special nohz-type functionality so that the nohz cores are
> >>>automatically added to that set.
> >>>
> >>>Signed-off-by: Chris Metcalf<cmetcalf@ezchip.com>
> >>>---
> >>>Frederic, if you could ack this patch (and maybe the next) before
> >>>I ask for it to be pulled into the timer tree that would be great.
> >>>I will wait a few days before asking in case anyone else has any
> >>>other issues or would like to provide an Acked-by.
> >Sure, or better yet I should take them since they are full nohz code.
> >And I have a pending pile to pull-request to timer tree.
> 
> That would be great, thanks.

That's fine with me too!

My main worry was not about the specific patches but about the route 
they take: it gets better review and better response to future bugs if 
it goes through the usual maintainer tree (Frederic's).

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-06 18:16                 ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus cmetcalf
@ 2015-04-07 22:29                   ` Frederic Weisbecker
  2015-04-08  9:41                   ` Peter Zijlstra
  1 sibling, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-07 22:29 UTC (permalink / raw)
  To: cmetcalf, Peter Zijlstra
  Cc: Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Mon, Apr 06, 2015 at 02:16:45PM -0400, cmetcalf@ezchip.com wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
> 
> It's not clear that nohz_full is useful without isolcpus also
> set, since otherwise the scheduler has to run periodically to
> try to determine whether to steal work from other cores.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> Acked-by: Rik van Riel <riel@redhat.com>

Peter, are you ok with that change? I think that you planned to remove
cpu_isolated_map a while ago, so I prefer to wait for your ack.

Thanks.

> ---
>  kernel/sched/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..275f12c608f2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
>  	doms_cur = alloc_sched_domains(ndoms_cur);
>  	if (!doms_cur)
>  		doms_cur = &fallback_doms;
> +	tick_nohz_full_set_cpus(cpu_isolated_map);
>  	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
>  	err = build_sched_domains(doms_cur[0], NULL);
>  	register_sched_domain_sysctl();
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-06 18:16                 ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus cmetcalf
  2015-04-07 22:29                   ` Frederic Weisbecker
@ 2015-04-08  9:41                   ` Peter Zijlstra
  2015-04-08 14:04                     ` Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-08  9:41 UTC (permalink / raw)
  To: cmetcalf
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Mon, Apr 06, 2015 at 02:16:45PM -0400, cmetcalf@ezchip.com wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
> 
> It's not clear that nohz_full is useful without isolcpus also
> set, since otherwise the scheduler has to run periodically to
> try to determine whether to steal work from other cores.

So the Changelog and the patch don't seem to agree with one another.

The Changelog states that nohz_full should depend on isolcpus.
The patch implies nohz_full for isolcpus.

These are not the same; and I don't see the argument for the former make
sense for the latter.

In specific isolcpus without nohz_full does make sense.

> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> Acked-by: Rik van Riel <riel@redhat.com>
> ---
>  kernel/sched/core.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..275f12c608f2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
>  	doms_cur = alloc_sched_domains(ndoms_cur);
>  	if (!doms_cur)
>  		doms_cur = &fallback_doms;
> +	tick_nohz_full_set_cpus(cpu_isolated_map);
>  	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
>  	err = build_sched_domains(doms_cur[0], NULL);
>  	register_sched_domain_sysctl();

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-08  9:41                   ` Peter Zijlstra
@ 2015-04-08 14:04                     ` Chris Metcalf
  2015-04-08 14:26                       ` Peter Zijlstra
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-08 14:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 04/08/2015 05:41 AM, Peter Zijlstra wrote:
> On Mon, Apr 06, 2015 at 02:16:45PM -0400, cmetcalf@ezchip.com wrote:
>> From: Chris Metcalf <cmetcalf@ezchip.com>
>>
>> It's not clear that nohz_full is useful without isolcpus also
>> set, since otherwise the scheduler has to run periodically to
>> try to determine whether to steal work from other cores.
> So the Changelog and the patch don't seem to agree with one another.
>
> The Changelog states that nohz_full should depend on isolcpus.

The git commit message says "make nohz_full imply isolcpus".
That's consistent with the code.

> The patch implies nohz_full for isolcpus.
>
> These are not the same; and I don't see the argument for the former make
> sense for the latter.
>
> In specific isolcpus without nohz_full does make sense.
>
>> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
>> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
>> Acked-by: Rik van Riel <riel@redhat.com>
>> ---
>>   kernel/sched/core.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index f0f831e8a345..275f12c608f2 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
>>   	doms_cur = alloc_sched_domains(ndoms_cur);
>>   	if (!doms_cur)
>>   		doms_cur = &fallback_doms;
>> +	tick_nohz_full_set_cpus(cpu_isolated_map);
>>   	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
>>   	err = build_sched_domains(doms_cur[0], NULL);
>>   	register_sched_domain_sysctl();

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-08 14:04                     ` Chris Metcalf
@ 2015-04-08 14:26                       ` Peter Zijlstra
  2015-04-08 15:21                         ` Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-08 14:26 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Wed, Apr 08, 2015 at 10:04:43AM -0400, Chris Metcalf wrote:
> On 04/08/2015 05:41 AM, Peter Zijlstra wrote:
> >On Mon, Apr 06, 2015 at 02:16:45PM -0400, cmetcalf@ezchip.com wrote:
> >>From: Chris Metcalf <cmetcalf@ezchip.com>
> >>
> >>It's not clear that nohz_full is useful without isolcpus also
> >>set, since otherwise the scheduler has to run periodically to
> >>try to determine whether to steal work from other cores.
> >So the Changelog and the patch don't seem to agree with one another.
> >
> >The Changelog states that nohz_full should depend on isolcpus.
> 
> The git commit message says "make nohz_full imply isolcpus".
> That's consistent with the code.

Well, but then the Changelog doesn't make any sense.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-08 14:26                       ` Peter Zijlstra
@ 2015-04-08 15:21                         ` Chris Metcalf
  2015-04-08 17:24                           ` Frederic Weisbecker
  2015-04-08 17:27                           ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus Peter Zijlstra
  0 siblings, 2 replies; 89+ messages in thread
From: Chris Metcalf @ 2015-04-08 15:21 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 04/08/2015 10:26 AM, Peter Zijlstra wrote:
> On Wed, Apr 08, 2015 at 10:04:43AM -0400, Chris Metcalf wrote:
>> On 04/08/2015 05:41 AM, Peter Zijlstra wrote:
>>> On Mon, Apr 06, 2015 at 02:16:45PM -0400, cmetcalf@ezchip.com wrote:
>>>> From: Chris Metcalf <cmetcalf@ezchip.com>
>>>>
>>>> It's not clear that nohz_full is useful without isolcpus also
>>>> set, since otherwise the scheduler has to run periodically to
>>>> try to determine whether to steal work from other cores.
>>> So the Changelog and the patch don't seem to agree with one another.
>>>
>>> The Changelog states that nohz_full should depend on isolcpus.
>> The git commit message says "make nohz_full imply isolcpus".
>> That's consistent with the code.
> Well, but then the Changelog doesn't make any sense.

Apparently the body of the commit message isn't as clear as it might be :-)

It does say the same thing, though, basically that if nohz_full DOESN'T
imply isolcpus, that's a bad thing.  I'm happy to reword the text to avoid
the double negative and say:

   nohz_full is only useful with isolcpus also set, since otherwise the
   scheduler has to run periodically to try to determine whether to steal
   work from other cores.

Frederic, do you want me to respin the patch, or can you just update
the text of the commit message as above?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-08 15:21                         ` Chris Metcalf
@ 2015-04-08 17:24                           ` Frederic Weisbecker
  2015-04-08 19:20                             ` [PATCH v3 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs cmetcalf
  2015-04-08 17:27                           ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus Peter Zijlstra
  1 sibling, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-08 17:24 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Wed, Apr 08, 2015 at 11:21:56AM -0400, Chris Metcalf wrote:
> On 04/08/2015 10:26 AM, Peter Zijlstra wrote:
> >On Wed, Apr 08, 2015 at 10:04:43AM -0400, Chris Metcalf wrote:
> >>On 04/08/2015 05:41 AM, Peter Zijlstra wrote:
> >>>On Mon, Apr 06, 2015 at 02:16:45PM -0400, cmetcalf@ezchip.com wrote:
> >>>>From: Chris Metcalf <cmetcalf@ezchip.com>
> >>>>
> >>>>It's not clear that nohz_full is useful without isolcpus also
> >>>>set, since otherwise the scheduler has to run periodically to
> >>>>try to determine whether to steal work from other cores.
> >>>So the Changelog and the patch don't seem to agree with one another.
> >>>
> >>>The Changelog states that nohz_full should depend on isolcpus.
> >>The git commit message says "make nohz_full imply isolcpus".
> >>That's consistent with the code.
> >Well, but then the Changelog doesn't make any sense.
> 
> Apparently the body of the commit message isn't as clear as it might be :-)
> 
> It does say the same thing, though, basically that if nohz_full DOESN'T
> imply isolcpus, that's a bad thing.  I'm happy to reword the text to avoid
> the double negative and say:
> 
>   nohz_full is only useful with isolcpus also set, since otherwise the
>   scheduler has to run periodically to try to determine whether to steal
>   work from other cores.
> 
> Frederic, do you want me to respin the patch, or can you just update
> the text of the commit message as above?

Please respin. Writing or fixing changelogs is what takes me most time :-)

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-08 15:21                         ` Chris Metcalf
  2015-04-08 17:24                           ` Frederic Weisbecker
@ 2015-04-08 17:27                           ` Peter Zijlstra
  2015-04-08 18:12                             ` Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-08 17:27 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Wed, Apr 08, 2015 at 11:21:56AM -0400, Chris Metcalf wrote:
> Apparently the body of the commit message isn't as clear as it might be :-)
> 
> It does say the same thing, though, basically that if nohz_full DOESN'T
> imply isolcpus, that's a bad thing.  I'm happy to reword the text to avoid
> the double negative and say:
> 
>   nohz_full is only useful with isolcpus also set, since otherwise the
>   scheduler has to run periodically to try to determine whether to steal
>   work from other cores.

But you're doing the reverse! You're setting nohz_full for isolcpus, not
limiting the nohz_full mask to isolcpus.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-08 17:27                           ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus Peter Zijlstra
@ 2015-04-08 18:12                             ` Chris Metcalf
  2015-04-09  8:29                               ` Peter Zijlstra
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-08 18:12 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 04/08/2015 01:27 PM, Peter Zijlstra wrote:
> On Wed, Apr 08, 2015 at 11:21:56AM -0400, Chris Metcalf wrote:
>> Apparently the body of the commit message isn't as clear as it might be :-)
>>
>> It does say the same thing, though, basically that if nohz_full DOESN'T
>> imply isolcpus, that's a bad thing.  I'm happy to reword the text to avoid
>> the double negative and say:
>>
>>    nohz_full is only useful with isolcpus also set, since otherwise the
>>    scheduler has to run periodically to try to determine whether to steal
>>    work from other cores.
> But you're doing the reverse! You're setting nohz_full for isolcpus, not
> limiting the nohz_full mask to isolcpus.

Ah, I see.  Yes, that's right.  The idea is that if you are saying
"nohz_full=1-15" on the command line, you would like that to
imply "isolcpus=1-15" as well, without having to actually say so
explicitly.  If we instead limit nohz_full based on isolcpus, it's not
clear that it's actually worth making any change in this area.

I still maintain that the text has always correctly (if perhaps
confusingly) said what it is that the code was doing; where the
text says "x implies y", that means "x being set forces y to be set".
But I'm respinning it anyway for Frederic so I will avoid using the
word "imply" altogether to make this clearer.

The larger question is whether you agree with the proposed semantics.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v3 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs
  2015-04-08 17:24                           ` Frederic Weisbecker
@ 2015-04-08 19:20                             ` cmetcalf
  2015-04-08 19:20                               ` [PATCH v3 2/2] nohz: set isolcpus when nohz_full is set cmetcalf
  0 siblings, 1 reply; 89+ messages in thread
From: cmetcalf @ 2015-04-08 19:20 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

From: Chris Metcalf <cmetcalf@ezchip.com>

The "clear" API is useful, for example, to modify a cpumask to avoid
the nohz cores so that interrupts aren't sent to them.

Likewise the "set" API is useful to modify a cpumask indicating some
special nohz-type functionality so that the nohz cores are
automatically added to that set.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
v3: no change here, just in part 2/2.

v2: put the "...set_cpus" API together with the change to
init_sched_domains() so that they can go into the timer tree
independently of other changes in my tree.

 include/linux/tick.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9c085dc12ae9..29456c443970 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_clear_cpus(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_andnot(mask, mask, tick_nohz_full_mask);
+}
+
+static inline void tick_nohz_full_set_cpus(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_clear_cpus(struct cpumask *mask) { }
+static inline void tick_nohz_full_set_cpus(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-08 19:20                             ` [PATCH v3 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs cmetcalf
@ 2015-04-08 19:20                               ` cmetcalf
  0 siblings, 0 replies; 89+ messages in thread
From: cmetcalf @ 2015-04-08 19:20 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

From: Chris Metcalf <cmetcalf@ezchip.com>

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
Acked-by: Rik van Riel <riel@redhat.com>
---
v3: update changelog language to avoid using unclear "implies" [PeterZ]

 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..275f12c608f2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
 	doms_cur = alloc_sched_domains(ndoms_cur);
 	if (!doms_cur)
 		doms_cur = &fallback_doms;
+	tick_nohz_full_set_cpus(cpu_isolated_map);
 	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
 	err = build_sched_domains(doms_cur[0], NULL);
 	register_sched_domain_sysctl();
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-08 18:12                             ` Chris Metcalf
@ 2015-04-09  8:29                               ` Peter Zijlstra
  2015-04-09 12:02                                 ` Chris Metcalf
  2015-04-09 17:00                                 ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Chris Metcalf
  0 siblings, 2 replies; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-09  8:29 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Wed, Apr 08, 2015 at 02:12:34PM -0400, Chris Metcalf wrote:

> >But you're doing the reverse! You're setting nohz_full for isolcpus, not
> >limiting the nohz_full mask to isolcpus.
> 
> Ah, I see.  Yes, that's right.  

No its not, you should correct me when I'm wrong ;-)

So the problem is that:

+       tick_nohz_full_set_cpus(cpu_isolated_map);

reads like you're doing:

  nohz_full_map |= isolcpus_map

But in actual fact you're doing:

  isolcpus_map |= nohz_full_map

So that function is retarded, but the logic is fine.

So NAK on both patches for the reason that they're utterly confusing as
to wtf they actually do.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-09  8:29                               ` Peter Zijlstra
@ 2015-04-09 12:02                                 ` Chris Metcalf
  2015-04-09 12:45                                   ` Frederic Weisbecker
  2015-04-09 17:00                                 ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-09 12:02 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 4/9/2015 4:29 AM, Peter Zijlstra wrote:
> On Wed, Apr 08, 2015 at 02:12:34PM -0400, Chris Metcalf wrote:
>
>>> But you're doing the reverse! You're setting nohz_full for isolcpus, not
>>> limiting the nohz_full mask to isolcpus.
>> Ah, I see.  Yes, that's right.
> No its not, you should correct me when I'm wrong ;-)

Oh, I have no problem with that :-)  But, now that I know what was confusing
you about the patch I see what it was you were saying with your English
above too.  I thought you were saying something like "making nohz_full
imply isolcpus" again, but you weren't.  Phew, OK, I think we're done
talking at cross-purposes.

> So the problem is that:
>
> +       tick_nohz_full_set_cpus(cpu_isolated_map);
>
> reads like you're doing:
>
>    nohz_full_map |= isolcpus_map
>
> But in actual fact you're doing:
>
>    isolcpus_map |= nohz_full_map
>
> So that function is retarded, but the logic is fine.
>
> So NAK on both patches for the reason that they're utterly confusing as
> to wtf they actually do.

What about tick_nohz_full_cpumask_or(cpu_isolated_map) ?
At that point maybe the similarity to the existing cpumask API will make
it more clear that we are modifying the argument?

If not, do you have any suggestions what might do better?  Obviously
the goal is to make it something that macroizes away, otherwise I'd
suggest just explicitly using an #ifdef and cpumask_or().

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/2] nohz: make nohz_full imply isolcpus
  2015-04-09 12:02                                 ` Chris Metcalf
@ 2015-04-09 12:45                                   ` Frederic Weisbecker
  2015-04-09 16:59                                     ` [PATCH v5] nohz: set isolcpus when nohz_full is set Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-09 12:45 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Thu, Apr 09, 2015 at 08:02:28AM -0400, Chris Metcalf wrote:
> On 4/9/2015 4:29 AM, Peter Zijlstra wrote:
> >On Wed, Apr 08, 2015 at 02:12:34PM -0400, Chris Metcalf wrote:
> >
> >>>But you're doing the reverse! You're setting nohz_full for isolcpus, not
> >>>limiting the nohz_full mask to isolcpus.
> >>Ah, I see.  Yes, that's right.
> >No its not, you should correct me when I'm wrong ;-)
> 
> Oh, I have no problem with that :-)  But, now that I know what was confusing
> you about the patch I see what it was you were saying with your English
> above too.  I thought you were saying something like "making nohz_full
> imply isolcpus" again, but you weren't.  Phew, OK, I think we're done
> talking at cross-purposes.
> 
> >So the problem is that:
> >
> >+       tick_nohz_full_set_cpus(cpu_isolated_map);
> >
> >reads like you're doing:
> >
> >   nohz_full_map |= isolcpus_map
> >
> >But in actual fact you're doing:
> >
> >   isolcpus_map |= nohz_full_map
> >
> >So that function is retarded, but the logic is fine.
> >
> >So NAK on both patches for the reason that they're utterly confusing as
> >to wtf they actually do.
> 
> What about tick_nohz_full_cpumask_or(cpu_isolated_map) ?
> At that point maybe the similarity to the existing cpumask API will make
> it more clear that we are modifying the argument?
> 
> If not, do you have any suggestions what might do better?  Obviously
> the goal is to make it something that macroizes away, otherwise I'd
> suggest just explicitly using an #ifdef and cpumask_or().

Possible alternative: do it the other way around.

cpu_isolated_map is allocated and filled early (__setup or sched_init())
before tick_init() and tick_init() is before sched_init_smp() which first uses
cpu_isolated_map(). So we can call some sched_isolated_map_add(struct cpumask *cpumask)
from tick_nohz_init().

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v5] nohz: set isolcpus when nohz_full is set
  2015-04-09 12:45                                   ` Frederic Weisbecker
@ 2015-04-09 16:59                                     ` Chris Metcalf
  2015-04-09 17:12                                       ` Peter Zijlstra
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-09 16:59 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
Frederic wrote:
> cpu_isolated_map is allocated and filled early (__setup or sched_init())
> before tick_init() and tick_init() is before sched_init_smp() which first uses
> cpu_isolated_map(). So we can call some sched_isolated_map_add(struct cpumask *cpumask)
> from tick_nohz_init().

I'll re-send a v4 of the patch without your suggestion, just renaming
the methods to tick_nohz_full_cpumask_andnot() etc, since I still think
that that model is easier to understand - we tweak isolcpus in exactly
the spot where we first put it to use.  And, we do need those
tick_nohz_full_cpumask_xxx() accessors in other places anyway --
see my earlier patch for the tilegx network driver to remove the
nohz_full cores from the set of cores that get interrupted by the driver,
for example.

That said, I'm not opposed to your idea, and we could certainly do it that
way if that's the consensus.  For reference, here's what it looks like
when fleshed out; I'm calling it v5 to be sort of clear about this,
but either v4 or v5 would be fine.  I left the sched_isolated_map_add()
function enabled in all kernel configurations, not just NO_HZ_FULL,
since it's pretty trivial and it felt like the #ifdefs to disable it
conditionally would be noisier than the benefit to kernel size.

 include/linux/sched.h    | 1 +
 kernel/sched/core.c      | 5 +++++
 kernel/time/tick-sched.c | 3 +++
 3 files changed, 9 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 6d77432e14ff..18a961b9beba 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -323,6 +323,7 @@ struct task_struct;
 extern int lockdep_tasklist_lock_is_held(void);
 #endif /* #ifdef CONFIG_PROVE_RCU */
 
+extern void sched_isolated_map_add(const struct cpumask *);
 extern void sched_init(void);
 extern void sched_init_smp(void);
 extern asmlinkage void schedule_tail(struct task_struct *prev);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..b055c5e0e65c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5824,6 +5824,11 @@ static int __init isolated_cpu_setup(char *str)
 
 __setup("isolcpus=", isolated_cpu_setup);
 
+void sched_isolated_map_add(const struct cpumask *cpumask)
+{
+	cpumask_or(cpu_isolated_map, cpu_isolated_map, cpumask);
+}
+
 struct s_data {
 	struct sched_domain ** __percpu sd;
 	struct root_domain	*rd;
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index a4c4edac4528..b0092d02ca3f 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -385,6 +385,9 @@ void __init tick_nohz_init(void)
 	for_each_cpu(cpu, tick_nohz_full_mask)
 		context_tracking_cpu_set(cpu);
 
+	/* It's not meaningful to be nohz without disabling the scheduler. */
+	sched_isolated_map_add(tick_nohz_full_mask);
+
 	cpu_notifier(tick_nohz_cpu_down_callback, 0);
 	pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
 		cpumask_pr_args(tick_nohz_full_mask));
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs
  2015-04-09  8:29                               ` Peter Zijlstra
  2015-04-09 12:02                                 ` Chris Metcalf
@ 2015-04-09 17:00                                 ` Chris Metcalf
  2015-04-09 17:00                                   ` [PATCH v4 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
  2015-04-09 17:07                                   ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Peter Zijlstra
  1 sibling, 2 replies; 89+ messages in thread
From: Chris Metcalf @ 2015-04-09 17:00 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

The "andnot" API is useful, for example, to modify a cpumask to avoid
the nohz cores so that interrupts aren't sent to them.

Likewise the "or" API is useful to modify a cpumask indicating some
special nohz-type functionality so that the nohz cores are
automatically added to that set.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
v4: update accessor names to make them clearer [PeterZ]

v3: no change here, just in part 2/2.

v2: put the "...set_cpus" API together with the change to
init_sched_domains() so that they can go into the timer tree
independently of other changes in my tree.

 include/linux/tick.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9c085dc12ae9..ecc24e3a6fa2 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_cpumask_andnot(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_andnot(mask, mask, tick_nohz_full_mask);
+}
+
+static inline void tick_nohz_full_cpumask_or(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_cpumask_andnot(struct cpumask *mask) { }
+static inline void tick_nohz_full_cpumask_or(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v4 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-09 17:00                                 ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Chris Metcalf
@ 2015-04-09 17:00                                   ` Chris Metcalf
  2015-04-09 17:07                                   ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Peter Zijlstra
  1 sibling, 0 replies; 89+ messages in thread
From: Chris Metcalf @ 2015-04-09 17:00 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
Acked-by: Rik van Riel <riel@redhat.com>
---
v4: update accessor names to make them clearer [PeterZ]

v3: update changelog language to avoid using unclear "implies" [PeterZ]

 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..f8bce1b24cf3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6836,6 +6836,7 @@ static int init_sched_domains(const struct cpumask *cpu_map)
 	doms_cur = alloc_sched_domains(ndoms_cur);
 	if (!doms_cur)
 		doms_cur = &fallback_doms;
+	tick_nohz_full_cpumask_or(cpu_isolated_map);
 	cpumask_andnot(doms_cur[0], cpu_map, cpu_isolated_map);
 	err = build_sched_domains(doms_cur[0], NULL);
 	register_sched_domain_sysctl();
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs
  2015-04-09 17:00                                 ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Chris Metcalf
  2015-04-09 17:00                                   ` [PATCH v4 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
@ 2015-04-09 17:07                                   ` Peter Zijlstra
  2015-04-09 17:24                                     ` Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-09 17:07 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Thu, Apr 09, 2015 at 01:00:38PM -0400, Chris Metcalf wrote:
> +static inline void tick_nohz_full_cpumask_or(struct cpumask *mask)

This still reads as if you're doing: nohz_full_mask |= mask.

I think the suggestion done by Frederic is the right one, reverse the
lot, have:

	isolcpu_map_or(nohz_full_map) := isolcpus_map |= nohz_full_map

Or just completely give up and just write readable code under an #ifdef.



^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v5] nohz: set isolcpus when nohz_full is set
  2015-04-09 16:59                                     ` [PATCH v5] nohz: set isolcpus when nohz_full is set Chris Metcalf
@ 2015-04-09 17:12                                       ` Peter Zijlstra
  2015-04-10  1:05                                         ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-09 17:12 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Thu, Apr 09, 2015 at 12:59:39PM -0400, Chris Metcalf wrote:
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 6d77432e14ff..18a961b9beba 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -323,6 +323,7 @@ struct task_struct;
>  extern int lockdep_tasklist_lock_is_held(void);
>  #endif /* #ifdef CONFIG_PROVE_RCU */
>  
> +extern void sched_isolated_map_add(const struct cpumask *);
>  extern void sched_init(void);
>  extern void sched_init_smp(void);
>  extern asmlinkage void schedule_tail(struct task_struct *prev);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..b055c5e0e65c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5824,6 +5824,11 @@ static int __init isolated_cpu_setup(char *str)
>  
>  __setup("isolcpus=", isolated_cpu_setup);
>  
> +void sched_isolated_map_add(const struct cpumask *cpumask)
> +{
> +	cpumask_or(cpu_isolated_map, cpu_isolated_map, cpumask);
> +}
> +
>  struct s_data {
>  	struct sched_domain ** __percpu sd;
>  	struct root_domain	*rd;
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index a4c4edac4528..b0092d02ca3f 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -385,6 +385,9 @@ void __init tick_nohz_init(void)
>  	for_each_cpu(cpu, tick_nohz_full_mask)
>  		context_tracking_cpu_set(cpu);
>  
> +	/* It's not meaningful to be nohz without disabling the scheduler. */
> +	sched_isolated_map_add(tick_nohz_full_mask);
> +
>  	cpu_notifier(tick_nohz_cpu_down_callback, 0);
>  	pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
>  		cpumask_pr_args(tick_nohz_full_mask));

Right, this could work. Although I would suggest adding a comment
somewhere that we should be careful with init order. I checked, this
appears to be ordered right, but...

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs
  2015-04-09 17:07                                   ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Peter Zijlstra
@ 2015-04-09 17:24                                     ` Chris Metcalf
  2015-04-09 17:42                                       ` Peter Zijlstra
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-09 17:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 04/09/2015 01:07 PM, Peter Zijlstra wrote:
> On Thu, Apr 09, 2015 at 01:00:38PM -0400, Chris Metcalf wrote:
>> +static inline void tick_nohz_full_cpumask_or(struct cpumask *mask)
> This still reads as if you're doing: nohz_full_mask |= mask.
>
> I think the suggestion done by Frederic is the right one, reverse the
> lot, have:
>
> 	isolcpu_map_or(nohz_full_map) := isolcpus_map |= nohz_full_map
>
> Or just completely give up and just write readable code under an #ifdef.

OK, so let's go with v5 (in the other thread) plus comments on init 
ordering, then.
I'll repost that shortly.

However, I'd still appreciate guidance on the naming, since I do
have a patch outstanding to fiddle with cpumasks for nohz_full
(in the other case, for the tilegx network driver irq mask).

So here's the obvious readable code snippet approach:

#ifdef CONFIG_NO_HZ_FULL
         cpumask_or(some_random_map, some_random_map, tick_nohz_full_map);
#endif

Some possible names so we can macroize them to no-ops:

         exclude_nohz_full_cpus_from(some_random_map);
or
         remove_nohz_full_cpus_from(some_random_map);

         include_nohz_full_cpus_in(some_random_map);
or
         add_nohz_full_cpus_to(some_random_map);

or perhaps with better namespace prefixes, but more confusing to read:

         tick_nohz_full_exclude_cpus_from(some_random_map);
or
         tick_nohz_full_remove_cpus_from(some_random_map);

         tick_nohz_full_include_cpus_in(some_random_map);
or
         tick_nohz_full_add_cpus_to(some_random_map);

Any of these sound good? Any other ideas?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs
  2015-04-09 17:24                                     ` Chris Metcalf
@ 2015-04-09 17:42                                       ` Peter Zijlstra
  2015-04-09 18:01                                         ` [PATCH v6 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-09 17:42 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Thu, Apr 09, 2015 at 01:24:22PM -0400, Chris Metcalf wrote:
> However, I'd still appreciate guidance on the naming, since I do
> have a patch outstanding to fiddle with cpumasks for nohz_full
> (in the other case, for the tilegx network driver irq mask).
> 
> So here's the obvious readable code snippet approach:
> 
> #ifdef CONFIG_NO_HZ_FULL
>         cpumask_or(some_random_map, some_random_map, tick_nohz_full_map);
> #endif
> 
> Some possible names so we can macroize them to no-ops:
> 
>         exclude_nohz_full_cpus_from(some_random_map);
> or
>         remove_nohz_full_cpus_from(some_random_map);
> 
>         include_nohz_full_cpus_in(some_random_map);
> or
>         add_nohz_full_cpus_to(some_random_map);
> 
> or perhaps with better namespace prefixes, but more confusing to read:
> 
>         tick_nohz_full_exclude_cpus_from(some_random_map);
> or
>         tick_nohz_full_remove_cpus_from(some_random_map);
> 
>         tick_nohz_full_include_cpus_in(some_random_map);
> or
>         tick_nohz_full_add_cpus_to(some_random_map);
> 
> Any of these sound good? Any other ideas?

Yes, I think these are all clearer than previous attempts.

I'm not entirely sure which to pick though; I have a vague preference
for add/remove over include/exclude and the top set reads better then
the lower set but the lower set is more consistent in naming :/

But the important point is that these names all indicate you're going to
change the mask passed as argument.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v6 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs
  2015-04-09 17:42                                       ` Peter Zijlstra
@ 2015-04-09 18:01                                         ` Chris Metcalf
  2015-04-09 18:01                                           ` [PATCH v6 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
  2015-04-10  1:34                                           ` [PATCH v6 " Frederic Weisbecker
  0 siblings, 2 replies; 89+ messages in thread
From: Chris Metcalf @ 2015-04-09 18:01 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

The "removes_cpus_from" API is useful, for example, to modify a cpumask
to avoid the nohz cores so that interrupts aren't sent to them.

Likewise the "add_cpus_to" API is useful to modify a cpumask indicating
some special nohz-type functionality so that the nohz cores are
automatically added to that set.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
v6: I think we may finally have accessor names that are OK

v5: (skipped this patch)

v4: update accessor names to make them clearer [PeterZ]

v3: no change here, just in part 2/2.

v2: put the "...set_cpus" API together with the change to
init_sched_domains() so that they can go into the timer tree
independently of other changes in my tree.

 include/linux/tick.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9c085dc12ae9..8d1754c0f694 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
+static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_andnot(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
+static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v6 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-09 18:01                                         ` [PATCH v6 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
@ 2015-04-09 18:01                                           ` Chris Metcalf
  2015-04-10  1:37                                             ` Frederic Weisbecker
  2015-04-10  1:34                                           ` [PATCH v6 " Frederic Weisbecker
  1 sibling, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-09 18:01 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
I've come full circle on this issue.  For v5 I liked Frederic's
idea of adding an API, but the more I thought about it seemed
like an overly-generic mechanism for what is really a very specific
issue, namely the interaction between the scheduler and nohz_full.
We already have plenty of nohz_full-specific code in sched/core.c,
and I think we now have a cpumask function name that is OK, so
now I'm proposing we just stick with directly modifying the
cpu_isolated_map using the new name.  However, I've moved the
update to the map right to the top of sched_init_smp(), which seems
like a cleaner place to make this kind of a change.

v6: back to just using a direct update, with the new names

v5: test run of Frederic's proposal to use sched_isolated_map_add()

v4: update accessor names to make them clearer [PeterZ]

v3: update changelog language to avoid using unclear "implies" [PeterZ]

 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..041845c568d2 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7031,6 +7031,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_remove_cpus_from(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v5] nohz: set isolcpus when nohz_full is set
  2015-04-09 17:12                                       ` Peter Zijlstra
@ 2015-04-10  1:05                                         ` Mike Galbraith
  2015-04-10 15:33                                           ` Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-04-10  1:05 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Chris Metcalf, Frederic Weisbecker, Paul E. McKenney,
	Rafael J. Wysocki, Martin Schwidefsky, Ingo Molnar, linux-kernel

On Thu, 2015-04-09 at 19:12 +0200, Peter Zijlstra wrote:
> On Thu, Apr 09, 2015 at 12:59:39PM -0400, Chris Metcalf wrote:
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 6d77432e14ff..18a961b9beba 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -323,6 +323,7 @@ struct task_struct;
> >  extern int lockdep_tasklist_lock_is_held(void);
> >  #endif /* #ifdef CONFIG_PROVE_RCU */
> >  
> > +extern void sched_isolated_map_add(const struct cpumask *);
> >  extern void sched_init(void);
> >  extern void sched_init_smp(void);
> >  extern asmlinkage void schedule_tail(struct task_struct *prev);
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index f0f831e8a345..b055c5e0e65c 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -5824,6 +5824,11 @@ static int __init isolated_cpu_setup(char 
> > *str)
> >  
> >  __setup("isolcpus=", isolated_cpu_setup);
> >  
> > +void sched_isolated_map_add(const struct cpumask *cpumask)
> > +{
> > +   cpumask_or(cpu_isolated_map, cpu_isolated_map, cpumask);
> > +}
> > +
> >  struct s_data {
> >     struct sched_domain ** __percpu sd;
> >     struct root_domain      *rd;
> > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> > index a4c4edac4528..b0092d02ca3f 100644
> > --- a/kernel/time/tick-sched.c
> > +++ b/kernel/time/tick-sched.c
> > @@ -385,6 +385,9 @@ void __init tick_nohz_init(void)
> >     for_each_cpu(cpu, tick_nohz_full_mask)
> >             context_tracking_cpu_set(cpu);
> >  
> > +   /* It's not meaningful to be nohz without disabling the 
> > scheduler. */
> > +   sched_isolated_map_add(tick_nohz_full_mask);
> > +
> >     cpu_notifier(tick_nohz_cpu_down_callback, 0);
> >     pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
> >             cpumask_pr_args(tick_nohz_full_mask));
> 
> Right, this could work. Although I would suggest adding a comment
> somewhere that we should be careful with init order. I checked, this
> appears to be ordered right, but...

I'd embed it in domain construction, that way it'd be ready for the 
day nohz_full becomes dynamic, and people can start using cpusets to 
set up /tear down isolated sets on the fly.

        -Mike

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v6 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs
  2015-04-09 18:01                                         ` [PATCH v6 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
  2015-04-09 18:01                                           ` [PATCH v6 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
@ 2015-04-10  1:34                                           ` Frederic Weisbecker
  2015-04-10 15:31                                             ` Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-10  1:34 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Thu, Apr 09, 2015 at 02:01:45PM -0400, Chris Metcalf wrote:
> The "removes_cpus_from" API is useful, for example, to modify a cpumask
> to avoid the nohz cores so that interrupts aren't sent to them.
> 
> Likewise the "add_cpus_to" API is useful to modify a cpumask indicating
> some special nohz-type functionality so that the nohz cores are
> automatically added to that set.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> ---
> v6: I think we may finally have accessor names that are OK
> 
> v5: (skipped this patch)
> 
> v4: update accessor names to make them clearer [PeterZ]
> 
> v3: no change here, just in part 2/2.
> 
> v2: put the "...set_cpus" API together with the change to
> init_sched_domains() so that they can go into the timer tree
> independently of other changes in my tree.
> 
>  include/linux/tick.h | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 9c085dc12ae9..8d1754c0f694 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
>  	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
>  }
>  
> +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)

Or tick_nohz_full_affine() ?

> +{
> +	if (tick_nohz_full_enabled())
> +		cpumask_or(mask, mask, tick_nohz_full_mask);
> +}
> +
> +static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask)
> +{
> +	if (tick_nohz_full_enabled())
> +		cpumask_andnot(mask, mask, tick_nohz_full_mask);
> +}
> +
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
> @@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> +static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask) { }
>  static inline void __tick_nohz_full_check(void) { }
>  static inline void tick_nohz_full_kick_cpu(int cpu) { }
>  static inline void tick_nohz_full_kick(void) { }
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v6 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-09 18:01                                           ` [PATCH v6 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
@ 2015-04-10  1:37                                             ` Frederic Weisbecker
  2015-04-10 20:53                                               ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-10  1:37 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Thu, Apr 09, 2015 at 02:01:46PM -0400, Chris Metcalf wrote:
> nohz_full is only useful with isolcpus also set, since otherwise the
> scheduler has to run periodically to try to determine whether to steal
> work from other cores.
> 
> Accordingly, when booting with nohz_full=xxx on the command line, we
> should act as if isolcpus=xxx was also set, and set (or extend) the
> isolcpus set to include the nohz_full cpus.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> ---
> I've come full circle on this issue.  For v5 I liked Frederic's
> idea of adding an API, but the more I thought about it seemed
> like an overly-generic mechanism for what is really a very specific
> issue, namely the interaction between the scheduler and nohz_full.
> We already have plenty of nohz_full-specific code in sched/core.c,
> and I think we now have a cpumask function name that is OK, so
> now I'm proposing we just stick with directly modifying the
> cpu_isolated_map using the new name.  However, I've moved the
> update to the map right to the top of sched_init_smp(), which seems
> like a cleaner place to make this kind of a change.
> 
> v6: back to just using a direct update, with the new names
> 
> v5: test run of Frederic's proposal to use sched_isolated_map_add()
> 
> v4: update accessor names to make them clearer [PeterZ]
> 
> v3: update changelog language to avoid using unclear "implies" [PeterZ]
> 
>  kernel/sched/core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..041845c568d2 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7031,6 +7031,9 @@ void __init sched_init_smp(void)
>  	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
>  	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
>  
> +	/* nohz_full won't take effect without isolating the cpus. */
> +	tick_nohz_full_remove_cpus_from(cpu_isolated_map);

This should be the other one I guess.

> +
>  	sched_init_numa();
>  
>  	/*
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v6 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs
  2015-04-10  1:34                                           ` [PATCH v6 " Frederic Weisbecker
@ 2015-04-10 15:31                                             ` Chris Metcalf
  0 siblings, 0 replies; 89+ messages in thread
From: Chris Metcalf @ 2015-04-10 15:31 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On 04/09/2015 09:34 PM, Frederic Weisbecker wrote:
> On Thu, Apr 09, 2015 at 02:01:45PM -0400, Chris Metcalf wrote:
>> --- a/include/linux/tick.h
>> +++ b/include/linux/tick.h
>> @@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
>>   	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
>>   }
>>   
>> +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
> Or tick_nohz_full_affine() ?

I'd vote no to that, I think - the first problem I see is that it doesn't
make very clear that it modifies the argument, which is the problem
Peter Z has been having from the beginning with some suggestions.
The second problem is that it sounds like it might be setting the
argument unconditionally to the nohz_full set, when in fact it's doing
a cpumask_or() to increase the set of cpus further.

> +	/* nohz_full won't take effect without isolating the cpus. */
> +	tick_nohz_full_remove_cpus_from(cpu_isolated_map);
> This should be the other one I guess.
>

<Slaps self on forehead>  Thanks!

Now off to do some actual testing before sending the next patchset :-)

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v5] nohz: set isolcpus when nohz_full is set
  2015-04-10  1:05                                         ` Mike Galbraith
@ 2015-04-10 15:33                                           ` Chris Metcalf
  2015-04-10 15:57                                             ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-10 15:33 UTC (permalink / raw)
  To: Mike Galbraith, Peter Zijlstra
  Cc: Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On 04/09/2015 09:05 PM, Mike Galbraith wrote:
> On Thu, 2015-04-09 at 19:12 +0200, Peter Zijlstra wrote:
>> On Thu, Apr 09, 2015 at 12:59:39PM -0400, Chris Metcalf wrote:
>>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>>> index 6d77432e14ff..18a961b9beba 100644
>>> --- a/include/linux/sched.h
>>> +++ b/include/linux/sched.h
>>> @@ -323,6 +323,7 @@ struct task_struct;
>>>   extern int lockdep_tasklist_lock_is_held(void);
>>>   #endif /* #ifdef CONFIG_PROVE_RCU */
>>>   
>>> +extern void sched_isolated_map_add(const struct cpumask *);
>>>   extern void sched_init(void);
>>>   extern void sched_init_smp(void);
>>>   extern asmlinkage void schedule_tail(struct task_struct *prev);
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index f0f831e8a345..b055c5e0e65c 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -5824,6 +5824,11 @@ static int __init isolated_cpu_setup(char
>>> *str)
>>>   
>>>   __setup("isolcpus=", isolated_cpu_setup);
>>>   
>>> +void sched_isolated_map_add(const struct cpumask *cpumask)
>>> +{
>>> +   cpumask_or(cpu_isolated_map, cpu_isolated_map, cpumask);
>>> +}
>>> +
>>>   struct s_data {
>>>      struct sched_domain ** __percpu sd;
>>>      struct root_domain      *rd;
>>> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
>>> index a4c4edac4528..b0092d02ca3f 100644
>>> --- a/kernel/time/tick-sched.c
>>> +++ b/kernel/time/tick-sched.c
>>> @@ -385,6 +385,9 @@ void __init tick_nohz_init(void)
>>>      for_each_cpu(cpu, tick_nohz_full_mask)
>>>              context_tracking_cpu_set(cpu);
>>>   
>>> +   /* It's not meaningful to be nohz without disabling the
>>> scheduler. */
>>> +   sched_isolated_map_add(tick_nohz_full_mask);
>>> +
>>>      cpu_notifier(tick_nohz_cpu_down_callback, 0);
>>>      pr_info("NO_HZ: Full dynticks CPUs: %*pbl.\n",
>>>              cpumask_pr_args(tick_nohz_full_mask));
>> Right, this could work. Although I would suggest adding a comment
>> somewhere that we should be careful with init order. I checked, this
>> appears to be ordered right, but...
> I'd embed it in domain construction, that way it'd be ready for the
> day nohz_full becomes dynamic, and people can start using cpusets to
> set up /tear down isolated sets on the fly.

So, move it to the top of build_sched_domains(), and then for
every "const struct cpumask *cpu_map" argument, create a
temporary cpu_map so we can mask out the nohz_full cores?

The problem is, we already allow partition_sched_domains()
to override "isolcpus=", so it seems appropriate that you should
be able to override "nohz_full=" in the same way, which my
current patch (v6) does.

So I think the proposed solution is certainly no worse than what
we have now in terms of a future migration to cpusets.

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v5] nohz: set isolcpus when nohz_full is set
  2015-04-10 15:33                                           ` Chris Metcalf
@ 2015-04-10 15:57                                             ` Mike Galbraith
  0 siblings, 0 replies; 89+ messages in thread
From: Mike Galbraith @ 2015-04-10 15:57 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra, Frederic Weisbecker, Paul E. McKenney,
	Rafael J. Wysocki, Martin Schwidefsky, Ingo Molnar, linux-kernel

On Fri, 2015-04-10 at 11:33 -0400, Chris Metcalf wrote:
> 
> So I think the proposed solution is certainly no worse than what
> we have now in terms of a future migration to cpusets.

Yeah, whatever will fly in the here and now is fine.

        -Mike

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs
  2015-04-10  1:37                                             ` Frederic Weisbecker
@ 2015-04-10 20:53                                               ` Chris Metcalf
  2015-04-10 20:53                                                 ` [PATCH v7 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
  2015-04-14  0:33                                                 ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Frederic Weisbecker
  0 siblings, 2 replies; 89+ messages in thread
From: Chris Metcalf @ 2015-04-10 20:53 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

The "removes_cpus_from" API is useful, for example, to modify a cpumask
to avoid the nohz cores so that interrupts aren't sent to them.

Likewise the "add_cpus_to" API is useful to modify a cpumask indicating
some special nohz-type functionality so that the nohz cores are
automatically added to that set.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
v7: no change here, just in part 2/2.

v6: I think we may finally have accessor names that are OK

v5: (skipped this patch)

v4: update accessor names to make them clearer [PeterZ]

v3: no change here, just in part 2/2.

v2: put the "...set_cpus" API together with the change to
init_sched_domains() so that they can go into the timer tree
independently of other changes in my tree.

 include/linux/tick.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

 include/linux/tick.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9c085dc12ae9..8d1754c0f694 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
+static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_andnot(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
+static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v7 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-10 20:53                                               ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
@ 2015-04-10 20:53                                                 ` Chris Metcalf
  2015-04-14  0:37                                                   ` Frederic Weisbecker
  2015-04-14  0:33                                                 ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Frederic Weisbecker
  1 sibling, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-10 20:53 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
v7: oops, use add_cpus_to, not remove_cpus_from [Frederic]

v6: back to just using a direct update, with the new names

v5: test run of Frederic's proposal to use sched_isolated_map_add()

v4: update accessor names to make them clearer [PeterZ]

v3: update changelog language to avoid using unclear "implies" [PeterZ]

 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..4a5681b5a4fd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7031,6 +7031,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_add_cpus_to(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs
  2015-04-10 20:53                                               ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
  2015-04-10 20:53                                                 ` [PATCH v7 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
@ 2015-04-14  0:33                                                 ` Frederic Weisbecker
  2015-04-14  0:49                                                   ` Chris Metcalf
  1 sibling, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-14  0:33 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Fri, Apr 10, 2015 at 04:53:51PM -0400, Chris Metcalf wrote:
> The "removes_cpus_from" API is useful, for example, to modify a cpumask
> to avoid the nohz cores so that interrupts aren't sent to them.
> 
> Likewise the "add_cpus_to" API is useful to modify a cpumask indicating
> some special nohz-type functionality so that the nohz cores are
> automatically added to that set.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> ---
> v7: no change here, just in part 2/2.
> 
> v6: I think we may finally have accessor names that are OK
> 
> v5: (skipped this patch)
> 
> v4: update accessor names to make them clearer [PeterZ]
> 
> v3: no change here, just in part 2/2.
> 
> v2: put the "...set_cpus" API together with the change to
> init_sched_domains() so that they can go into the timer tree
> independently of other changes in my tree.
> 
>  include/linux/tick.h | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
>  include/linux/tick.h | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
> 
> diff --git a/include/linux/tick.h b/include/linux/tick.h
> index 9c085dc12ae9..8d1754c0f694 100644
> --- a/include/linux/tick.h
> +++ b/include/linux/tick.h
> @@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
>  	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
>  }
>  
> +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
> +{
> +	if (tick_nohz_full_enabled())
> +		cpumask_or(mask, mask, tick_nohz_full_mask);
> +}
> +
> +static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask)
> +{
> +	if (tick_nohz_full_enabled())
> +		cpumask_andnot(mask, mask, tick_nohz_full_mask);

I would prefer that you don't introduce new APIs if they aren't used in your
patchset. It seems that's the case for tick_nohz_full_remove_cpus_from().

Also we have housekeeping_affine() that affines a task to CPUs outside the
range of nohz full. In case you still need tick_nohz_full_remove_cpus_from()
in a further patchset, housekeeping_affine_cpumask() would extend the existing
naming.

> +}
> +
>  extern void __tick_nohz_full_check(void);
>  extern void tick_nohz_full_kick(void);
>  extern void tick_nohz_full_kick_cpu(int cpu);
> @@ -194,6 +206,8 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
>  #else
>  static inline bool tick_nohz_full_enabled(void) { return false; }
>  static inline bool tick_nohz_full_cpu(int cpu) { return false; }
> +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
> +static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask) { }
>  static inline void __tick_nohz_full_check(void) { }
>  static inline void tick_nohz_full_kick_cpu(int cpu) { }
>  static inline void tick_nohz_full_kick(void) { }
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v7 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-10 20:53                                                 ` [PATCH v7 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
@ 2015-04-14  0:37                                                   ` Frederic Weisbecker
  2015-04-14 15:17                                                     ` [PATCH v8 1/2] nohz: add tick_nohz_full_add_cpus_to() API Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-14  0:37 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Fri, Apr 10, 2015 at 04:53:52PM -0400, Chris Metcalf wrote:
> nohz_full is only useful with isolcpus also set, since otherwise the
> scheduler has to run periodically to try to determine whether to steal
> work from other cores.
> 
> Accordingly, when booting with nohz_full=xxx on the command line, we
> should act as if isolcpus=xxx was also set, and set (or extend) the
> isolcpus set to include the nohz_full cpus.
> 
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> ---
> v7: oops, use add_cpus_to, not remove_cpus_from [Frederic]
> 
> v6: back to just using a direct update, with the new names
> 
> v5: test run of Frederic's proposal to use sched_isolated_map_add()
> 
> v4: update accessor names to make them clearer [PeterZ]
> 
> v3: update changelog language to avoid using unclear "implies" [PeterZ]
> 
>  kernel/sched/core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..4a5681b5a4fd 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7031,6 +7031,9 @@ void __init sched_init_smp(void)
>  	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
>  	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
>  
> +	/* nohz_full won't take effect without isolating the cpus. */
> +	tick_nohz_full_add_cpus_to(cpu_isolated_map);
> +

I can't say I like this, but I can not come up with a better name.
So that patch is ok to me (if Peterz acks it) :-)

>  	sched_init_numa();
>  
>  	/*
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs
  2015-04-14  0:33                                                 ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Frederic Weisbecker
@ 2015-04-14  0:49                                                   ` Chris Metcalf
  2015-04-14 15:34                                                     ` Frederic Weisbecker
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-14  0:49 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On 4/13/2015 8:33 PM, Frederic Weisbecker wrote:
> On Fri, Apr 10, 2015 at 04:53:51PM -0400, Chris Metcalf wrote:
>> The "removes_cpus_from" API is useful, for example, to modify a cpumask
>> to avoid the nohz cores so that interrupts aren't sent to them.
>>
>> Likewise the "add_cpus_to" API is useful to modify a cpumask indicating
>> some special nohz-type functionality so that the nohz cores are
>> automatically added to that set.
>>
>> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
>> ---
>> [...]
>>
>> diff --git a/include/linux/tick.h b/include/linux/tick.h
>> index 9c085dc12ae9..8d1754c0f694 100644
>> --- a/include/linux/tick.h
>> +++ b/include/linux/tick.h
>> @@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
>>   	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
>>   }
>>   
>> +static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
>> +{
>> +	if (tick_nohz_full_enabled())
>> +		cpumask_or(mask, mask, tick_nohz_full_mask);
>> +}
>> +
>> +static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask)
>> +{
>> +	if (tick_nohz_full_enabled())
>> +		cpumask_andnot(mask, mask, tick_nohz_full_mask);
> I would prefer that you don't introduce new APIs if they aren't used in your
> patchset. It seems that's the case for tick_nohz_full_remove_cpus_from().

Yes, it's used in a tile-tree patch to remove nohz_full cpus from the set that
take interrupts from the tilegx network driver.

I can certainly make this patchset have just the _add_cpus_to() variant,
and the other patchset have just the _remove_cpus_from() variant.
It seemed to make sense to include them together as a matched set,
but doing it the way you suggest makes an equal if different amount of sense.

> Also we have housekeeping_affine() that affines a task to CPUs outside the
> range of nohz full. In case you still need tick_nohz_full_remove_cpus_from()
> in a further patchset, housekeeping_affine_cpumask() would extend the existing
> naming.

I like housekeeping_affine(), but it overwrites the affinity mask of
the task.  So I would expect housekeeping_affine_mask() to overwrite
the specified argument cpumask, which it doesn't in my definition.

I don't know that I can think of a good name in the "housekeeping_xxx"
namespace that feels better than the one I proposed.  In context of the
proposed client of the API so far, it's:

        if (!network_cpus_init()) {
                network_cpus_map = *cpu_online_mask;
tick_nohz_full_remove_cpus_from(&network_cpus_map);
        }

If housekeeping_mask were defined for non-nohz_full I could just use
it unconditionally here.  Alternately I could just put in an #ifdef for
CONFIG_NO_HZ_FULL and either use cpu_only_mask or housekeeping_mask
to initialize network_cpus_map, which dodges the bullet of creating
an acceptable API name here for the moment.

Frederic, what's your thought/preference?

-- 
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v8 1/2] nohz: add tick_nohz_full_add_cpus_to() API
  2015-04-14  0:37                                                   ` Frederic Weisbecker
@ 2015-04-14 15:17                                                     ` Chris Metcalf
  2015-04-14 15:17                                                       ` [PATCH v8 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-14 15:17 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

This API is useful to modify a cpumask indicating some special nohz-type
functionality so that the nohz cores are automatically added to that set.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
v8: Just include _add_cpus_to() in this patch set [Frederic]
  Add Frederic's Ack

v7: no change here, just in part 2/2.

v6: I think we may finally have accessor names that are OK

v5: (skipped this patch)

v4: update accessor names to make them clearer [PeterZ]

v3: no change here, just in part 2/2.

v2: put the "...set_cpus" API together with the change to
init_sched_domains() so that they can go into the timer tree
independently of other changes in my tree.

 include/linux/tick.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index ce655084ffe3..8d1754c0f694 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -186,6 +186,12 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask)
 {
 	if (tick_nohz_full_enabled())
@@ -200,6 +206,7 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v8 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-14 15:17                                                     ` [PATCH v8 1/2] nohz: add tick_nohz_full_add_cpus_to() API Chris Metcalf
@ 2015-04-14 15:17                                                       ` Chris Metcalf
  2015-04-14 15:26                                                         ` Frederic Weisbecker
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Metcalf @ 2015-04-14 15:17 UTC (permalink / raw)
  To: Peter Zijlstra (Intel),
	Frederic Weisbecker, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel
  Cc: Chris Metcalf

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
Acked-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
---
v8: no change in this patch, just in 1/2
  Add Frederic's Ack

v7: oops, use add_cpus_to, not remove_cpus_from [Frederic]

v6: back to just using a direct update, with the new names

v5: test run of Frederic's proposal to use sched_isolated_map_add()

v4: update accessor names to make them clearer [PeterZ]

v3: update changelog language to avoid using unclear "implies" [PeterZ]

v2: add Mike and Rik's Acked-by

 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f0f831e8a345..4a5681b5a4fd 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7031,6 +7031,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_add_cpus_to(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v8 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-14 15:17                                                       ` [PATCH v8 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
@ 2015-04-14 15:26                                                         ` Frederic Weisbecker
  2015-04-14 16:45                                                           ` Peter Zijlstra
  0 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-14 15:26 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Tue, Apr 14, 2015 at 11:17:55AM -0400, Chris Metcalf wrote:
> nohz_full is only useful with isolcpus also set, since otherwise the
> scheduler has to run periodically to try to determine whether to steal
> work from other cores.
> 
> Accordingly, when booting with nohz_full=xxx on the command line, we
> should act as if isolcpus=xxx was also set, and set (or extend) the
> isolcpus set to include the nohz_full cpus.
> 
> Acked-by: Frederic Weisbecker <fweisbec@gmail.com>

No need to put my ack, since I'll add my Signed-off-by when I apply the patches.
Note this won't go through this merge window though as we are in the middle of
it already.

Peter, are you ok with these two patches?

Thanks.

> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> Acked-by: Rik van Riel <riel@redhat.com>
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> ---
> v8: no change in this patch, just in 1/2
>   Add Frederic's Ack
> 
> v7: oops, use add_cpus_to, not remove_cpus_from [Frederic]
> 
> v6: back to just using a direct update, with the new names
> 
> v5: test run of Frederic's proposal to use sched_isolated_map_add()
> 
> v4: update accessor names to make them clearer [PeterZ]
> 
> v3: update changelog language to avoid using unclear "implies" [PeterZ]
> 
> v2: add Mike and Rik's Acked-by
> 
>  kernel/sched/core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f0f831e8a345..4a5681b5a4fd 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7031,6 +7031,9 @@ void __init sched_init_smp(void)
>  	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
>  	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
>  
> +	/* nohz_full won't take effect without isolating the cpus. */
> +	tick_nohz_full_add_cpus_to(cpu_isolated_map);
> +
>  	sched_init_numa();
>  
>  	/*
> -- 
> 2.1.2
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs
  2015-04-14  0:49                                                   ` Chris Metcalf
@ 2015-04-14 15:34                                                     ` Frederic Weisbecker
  0 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-04-14 15:34 UTC (permalink / raw)
  To: Chris Metcalf
  Cc: Peter Zijlstra (Intel),
	Paul E. McKenney, Rafael J. Wysocki, Martin Schwidefsky,
	Ingo Molnar, linux-kernel

On Mon, Apr 13, 2015 at 08:49:20PM -0400, Chris Metcalf wrote:
> On 4/13/2015 8:33 PM, Frederic Weisbecker wrote:
> >On Fri, Apr 10, 2015 at 04:53:51PM -0400, Chris Metcalf wrote:
> >>The "removes_cpus_from" API is useful, for example, to modify a cpumask
> >>to avoid the nohz cores so that interrupts aren't sent to them.
> >>
> >>Likewise the "add_cpus_to" API is useful to modify a cpumask indicating
> >>some special nohz-type functionality so that the nohz cores are
> >>automatically added to that set.
> >>
> >>Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> >>---
> >>[...]
> >>
> >>diff --git a/include/linux/tick.h b/include/linux/tick.h
> >>index 9c085dc12ae9..8d1754c0f694 100644
> >>--- a/include/linux/tick.h
> >>+++ b/include/linux/tick.h
> >>@@ -186,6 +186,18 @@ static inline bool tick_nohz_full_cpu(int cpu)
> >>  	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
> >>  }
> >>+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
> >>+{
> >>+	if (tick_nohz_full_enabled())
> >>+		cpumask_or(mask, mask, tick_nohz_full_mask);
> >>+}
> >>+
> >>+static inline void tick_nohz_full_remove_cpus_from(struct cpumask *mask)
> >>+{
> >>+	if (tick_nohz_full_enabled())
> >>+		cpumask_andnot(mask, mask, tick_nohz_full_mask);
> >I would prefer that you don't introduce new APIs if they aren't used in your
> >patchset. It seems that's the case for tick_nohz_full_remove_cpus_from().
> 
> Yes, it's used in a tile-tree patch to remove nohz_full cpus from the set that
> take interrupts from the tilegx network driver.
> 
> I can certainly make this patchset have just the _add_cpus_to() variant,
> and the other patchset have just the _remove_cpus_from() variant.
> It seemed to make sense to include them together as a matched set,
> but doing it the way you suggest makes an equal if different amount of sense.
> 
> >Also we have housekeeping_affine() that affines a task to CPUs outside the
> >range of nohz full. In case you still need tick_nohz_full_remove_cpus_from()
> >in a further patchset, housekeeping_affine_cpumask() would extend the existing
> >naming.
> 
> I like housekeeping_affine(), but it overwrites the affinity mask of
> the task.  So I would expect housekeeping_affine_mask() to overwrite
> the specified argument cpumask, which it doesn't in my definition.
> 
> I don't know that I can think of a good name in the "housekeeping_xxx"
> namespace that feels better than the one I proposed.  In context of the
> proposed client of the API so far, it's:
> 
>        if (!network_cpus_init()) {
>                network_cpus_map = *cpu_online_mask;
> tick_nohz_full_remove_cpus_from(&network_cpus_map);
>        }
> 
> If housekeeping_mask were defined for non-nohz_full I could just use
> it unconditionally here.  Alternately I could just put in an #ifdef for
> CONFIG_NO_HZ_FULL and either use cpu_only_mask or housekeeping_mask
> to initialize network_cpus_map, which dodges the bullet of creating
> an acceptable API name here for the moment.
> 
> Frederic, what's your thought/preference?

Right, the more I look into this whole, the more I think we should perhaps
make tick_nohz_full_cpumask a read-only wide visible cpumask like we do
for the cpumask in kernel/cpu.c defined as "const struct cpumask *const tick_nohz_full_mask"
and do the cpumask operations with it.

> 
> -- 
> Chris Metcalf, EZChip Semiconductor
> http://www.ezchip.com
> 

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v8 2/2] nohz: set isolcpus when nohz_full is set
  2015-04-14 15:26                                                         ` Frederic Weisbecker
@ 2015-04-14 16:45                                                           ` Peter Zijlstra
  0 siblings, 0 replies; 89+ messages in thread
From: Peter Zijlstra @ 2015-04-14 16:45 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Chris Metcalf, Paul E. McKenney, Rafael J. Wysocki,
	Martin Schwidefsky, Ingo Molnar, linux-kernel

On Tue, Apr 14, 2015 at 05:26:45PM +0200, Frederic Weisbecker wrote:
> On Tue, Apr 14, 2015 at 11:17:55AM -0400, Chris Metcalf wrote:
> > nohz_full is only useful with isolcpus also set, since otherwise the
> > scheduler has to run periodically to try to determine whether to steal
> > work from other cores.
> > 
> > Accordingly, when booting with nohz_full=xxx on the command line, we
> > should act as if isolcpus=xxx was also set, and set (or extend) the
> > isolcpus set to include the nohz_full cpus.
> > 
> > Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
> 
> No need to put my ack, since I'll add my Signed-off-by when I apply the patches.
> Note this won't go through this merge window though as we are in the middle of
> it already.
> 
> Peter, are you ok with these two patches?

Yeah,

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [GIT PULL] nohz: A few improvements v4
@ 2015-05-06 16:04 Frederic Weisbecker
  2015-05-06 16:04 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
                   ` (3 more replies)
  0 siblings, 4 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Chris Metcalf, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

Ingo,

Please pull the nohz/core branch (on top of tip:timers/core) that can be
found at:

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
	nohz/core

HEAD: 60d0b4dc36259029e4cc8d030d8f59b33a119814

---
Summary of changes:

* Fix rare crashes due to context tracking recursion when faulting on
  vmalloc'ed areas.
* Simplify the TIF_NOHZ propagation (less context switch overhead)
* Set nohz full CPUs as isolcpus. This way we enforce nohz CPUs to be
  undisturbed by globally affine tasks.

Thanks,
	Frederic
---

Chris Metcalf (2):
      nohz: Add tick_nohz_full_add_cpus_to() API
      nohz: Set isolcpus when nohz_full is set

Frederic Weisbecker (2):
      context_tracking: Protect against recursion
      context_tracking: Inherit TIF_NOHZ through forks instead of context switches


 include/linux/context_tracking.h       | 10 -----
 include/linux/context_tracking_state.h |  1 +
 include/linux/sched.h                  |  3 ++
 include/linux/tick.h                   |  7 ++++
 kernel/context_tracking.c              | 68 +++++++++++++++++++++++-----------
 kernel/sched/core.c                    |  4 +-
 6 files changed, 60 insertions(+), 33 deletions(-)

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH 1/4] context_tracking: Protect against recursion
  2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker
@ 2015-05-06 16:04 ` Frederic Weisbecker
  2015-05-07  9:58   ` Ingo Molnar
  2015-05-07 11:31   ` [tip:timers/nohz] " tip-bot for Frederic Weisbecker
  2015-05-06 16:04 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Chris Metcalf, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

Context tracking recursion can happen when an exception triggers in the
middle of a call to a context tracking probe.

This special case can be caused by vmalloc faults. If an access to a
memory area allocated by vmalloc happens in the middle of
context_tracking_enter(), we may run into an endless fault loop because
the exception in turn calls context_tracking_enter() which faults on
the same vmalloc'ed memory, triggering an exception again, etc...

Some rare crashes have been reported so lets protect against this with
a recursion counter.

Reported-by: Dave Jones <davej@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/context_tracking_state.h |  1 +
 kernel/context_tracking.c              | 30 ++++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 6b7b96a..678ecdf 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -12,6 +12,7 @@ struct context_tracking {
 	 * may be further optimized using static keys.
 	 */
 	bool active;
+	int recursion;
 	enum ctx_state {
 		CONTEXT_KERNEL = 0,
 		CONTEXT_USER,
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 72d59a1..b9e0b4f 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -38,6 +38,25 @@ void context_tracking_cpu_set(int cpu)
 	}
 }
 
+static bool context_tracking_recursion_enter(void)
+{
+	int recursion;
+
+	recursion = __this_cpu_inc_return(context_tracking.recursion);
+	if (recursion == 1)
+		return true;
+
+	WARN_ONCE((recursion < 1), "Invalid context tracking recursion value %d\n", recursion);
+	__this_cpu_dec(context_tracking.recursion);
+
+	return false;
+}
+
+static void context_tracking_recursion_exit(void)
+{
+	__this_cpu_dec(context_tracking.recursion);
+}
+
 /**
  * context_tracking_enter - Inform the context tracking that the CPU is going
  *                          enter user or guest space mode.
@@ -75,6 +94,11 @@ void context_tracking_enter(enum ctx_state state)
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
+	if (!context_tracking_recursion_enter()) {
+		local_irq_restore(flags);
+		return;
+	}
+
 	if ( __this_cpu_read(context_tracking.state) != state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
@@ -105,6 +129,7 @@ void context_tracking_enter(enum ctx_state state)
 		 */
 		__this_cpu_write(context_tracking.state, state);
 	}
+	context_tracking_recursion_exit();
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_enter);
@@ -139,6 +164,10 @@ void context_tracking_exit(enum ctx_state state)
 		return;
 
 	local_irq_save(flags);
+	if (!context_tracking_recursion_enter()) {
+		local_irq_restore(flags);
+		return;
+	}
 	if (__this_cpu_read(context_tracking.state) == state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
@@ -153,6 +182,7 @@ void context_tracking_exit(enum ctx_state state)
 		}
 		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
 	}
+	context_tracking_recursion_exit();
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_exit);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker
  2015-05-06 16:04 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
@ 2015-05-06 16:04 ` Frederic Weisbecker
  2015-05-07 11:31   ` [tip:timers/nohz] " tip-bot for Frederic Weisbecker
  2015-05-06 16:04 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker
  2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
  3 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Frederic Weisbecker, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Chris Metcalf, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

TIF_NOHZ is used by context_tracking to force syscall slow-path on every
task in order to track userspace roundtrips. As such, it must be set on
all running tasks.

It's currently explicitly inherited through context switches. There is
no need to do it on this fast-path though. The flag could be simply
set once for all on all tasks, whether they are running or not.

Lets do this by setting the flag to init task on early boot and let it
propagate through fork inheritance.

While at it mark context_tracking_cpu_set() as init code, we only need
it at early boot time.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Dave Jones <davej@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
---
 include/linux/context_tracking.h | 10 ---------
 include/linux/sched.h            |  3 +++
 kernel/context_tracking.c        | 44 +++++++++++++++++-----------------------
 kernel/sched/core.c              |  1 -
 4 files changed, 22 insertions(+), 36 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 2821838..b96bd29 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -14,8 +14,6 @@ extern void context_tracking_enter(enum ctx_state state);
 extern void context_tracking_exit(enum ctx_state state);
 extern void context_tracking_user_enter(void);
 extern void context_tracking_user_exit(void);
-extern void __context_tracking_task_switch(struct task_struct *prev,
-					   struct task_struct *next);
 
 static inline void user_enter(void)
 {
@@ -51,19 +49,11 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	}
 }
 
-static inline void context_tracking_task_switch(struct task_struct *prev,
-						struct task_struct *next)
-{
-	if (context_tracking_is_enabled())
-		__context_tracking_task_switch(prev, next);
-}
 #else
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline enum ctx_state exception_enter(void) { return 0; }
 static inline void exception_exit(enum ctx_state prev_ctx) { }
-static inline void context_tracking_task_switch(struct task_struct *prev,
-						struct task_struct *next) { }
 #endif /* !CONFIG_CONTEXT_TRACKING */
 
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8222ae4..f5d4f9e 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2540,6 +2540,9 @@ static inline unsigned long wait_task_inactive(struct task_struct *p,
 }
 #endif
 
+#define tasklist_empty() \
+	list_empty(&init_task.tasks)
+
 #define next_task(p) \
 	list_entry_rcu((p)->tasks.next, struct task_struct, tasks)
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index b9e0b4f..a3aaca7 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -30,14 +30,6 @@ EXPORT_SYMBOL_GPL(context_tracking_enabled);
 DEFINE_PER_CPU(struct context_tracking, context_tracking);
 EXPORT_SYMBOL_GPL(context_tracking);
 
-void context_tracking_cpu_set(int cpu)
-{
-	if (!per_cpu(context_tracking.active, cpu)) {
-		per_cpu(context_tracking.active, cpu) = true;
-		static_key_slow_inc(&context_tracking_enabled);
-	}
-}
-
 static bool context_tracking_recursion_enter(void)
 {
 	int recursion;
@@ -194,24 +186,26 @@ void context_tracking_user_exit(void)
 }
 NOKPROBE_SYMBOL(context_tracking_user_exit);
 
-/**
- * __context_tracking_task_switch - context switch the syscall callbacks
- * @prev: the task that is being switched out
- * @next: the task that is being switched in
- *
- * The context tracking uses the syscall slow path to implement its user-kernel
- * boundaries probes on syscalls. This way it doesn't impact the syscall fast
- * path on CPUs that don't do context tracking.
- *
- * But we need to clear the flag on the previous task because it may later
- * migrate to some CPU that doesn't do the context tracking. As such the TIF
- * flag may not be desired there.
- */
-void __context_tracking_task_switch(struct task_struct *prev,
-				    struct task_struct *next)
+void __init context_tracking_cpu_set(int cpu)
 {
-	clear_tsk_thread_flag(prev, TIF_NOHZ);
-	set_tsk_thread_flag(next, TIF_NOHZ);
+	static __initdata bool initialized = false;
+
+	if (!per_cpu(context_tracking.active, cpu)) {
+		per_cpu(context_tracking.active, cpu) = true;
+		static_key_slow_inc(&context_tracking_enabled);
+	}
+
+	if (initialized)
+		return;
+
+	/*
+	 * Set TIF_NOHZ to init/0 and let it propagate to all tasks through fork
+	 * This assumes that init is the only task at this early boot stage.
+	 */
+	set_tsk_thread_flag(&init_task, TIF_NOHZ);
+	WARN_ON_ONCE(!tasklist_empty());
+
+	initialized = true;
 }
 
 #ifdef CONFIG_CONTEXT_TRACKING_FORCE
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d8a6196..6f149f8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2338,7 +2338,6 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 */
 	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
 
-	context_tracking_task_switch(prev, next);
 	/* Here we just switch the register state and the stack. */
 	switch_to(prev, next, prev);
 	barrier();
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API
  2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker
  2015-05-06 16:04 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
  2015-05-06 16:04 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
@ 2015-05-06 16:04 ` Frederic Weisbecker
  2015-05-07 11:31   ` [tip:timers/nohz] " tip-bot for Chris Metcalf
  2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
  3 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Dave Jones, Thomas Gleixner, Oleg Nesterov,
	Paul E . McKenney, Ingo Molnar, Frederic Weisbecker,
	Rik van Riel, Martin Schwidefsky

From: Chris Metcalf <cmetcalf@ezchip.com>

This API is useful to modify a cpumask indicating some special nohz-type
functionality so that the nohz cores are automatically added to that set.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Link: http://lkml.kernel.org/r/1429024675-18938-1-git-send-email-cmetcalf@ezchip.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 include/linux/tick.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f8492da5..4191b56 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -134,6 +134,12 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -142,6 +148,7 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker
                   ` (2 preceding siblings ...)
  2015-05-06 16:04 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker
@ 2015-05-06 16:04 ` Frederic Weisbecker
  2015-05-07 11:32   ` [tip:timers/nohz] " tip-bot for Chris Metcalf
  2015-05-16 19:39   ` [PATCH 4/4] " Sasha Levin
  3 siblings, 2 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-06 16:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Dave Jones, Thomas Gleixner, Oleg Nesterov,
	Paul E . McKenney, Ingo Molnar, Frederic Weisbecker,
	Rik van Riel, Martin Schwidefsky

From: Chris Metcalf <cmetcalf@ezchip.com>

nohz_full is only useful with isolcpus also set, since otherwise the
scheduler has to run periodically to try to determine whether to steal
work from other cores.

Accordingly, when booting with nohz_full=xxx on the command line, we
should act as if isolcpus=xxx was also set, and set (or extend) the
isolcpus set to include the nohz_full cpus.

Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6f149f8..e95b4d8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7042,6 +7042,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_add_cpus_to(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH 1/4] context_tracking: Protect against recursion
  2015-05-06 16:04 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
@ 2015-05-07  9:58   ` Ingo Molnar
  2015-05-07 11:53     ` Frederic Weisbecker
  2015-05-07 11:31   ` [tip:timers/nohz] " tip-bot for Frederic Weisbecker
  1 sibling, 1 reply; 89+ messages in thread
From: Ingo Molnar @ 2015-05-07  9:58 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: LKML, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith,
	Chris Metcalf, Dave Jones, Thomas Gleixner, Oleg Nesterov,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky


* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> @@ -75,6 +94,11 @@ void context_tracking_enter(enum ctx_state state)
>  	WARN_ON_ONCE(!current->mm);
>  
>  	local_irq_save(flags);
> +	if (!context_tracking_recursion_enter()) {
> +		local_irq_restore(flags);
> +		return;
> +	}
> +
>  	if ( __this_cpu_read(context_tracking.state) != state) {
>  		if (__this_cpu_read(context_tracking.active)) {
>  			/*
> @@ -105,6 +129,7 @@ void context_tracking_enter(enum ctx_state state)
>  		 */
>  		__this_cpu_write(context_tracking.state, state);
>  	}
> +	context_tracking_recursion_exit();

>  	local_irq_restore(flags);
>  }

So why not add an 'out_irq_restore:' label and use goto instead of 
duplicating the return path in the recursion check?

>  NOKPROBE_SYMBOL(context_tracking_enter);
> @@ -139,6 +164,10 @@ void context_tracking_exit(enum ctx_state state)
>  		return;
>  
>  	local_irq_save(flags);
> +	if (!context_tracking_recursion_enter()) {
> +		local_irq_restore(flags);
> +		return;

Ditto.

No need to resend, I fixed this up in the patch.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [tip:timers/nohz] context_tracking: Protect against recursion
  2015-05-06 16:04 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
  2015-05-07  9:58   ` Ingo Molnar
@ 2015-05-07 11:31   ` tip-bot for Frederic Weisbecker
  1 sibling, 0 replies; 89+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2015-05-07 11:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: cmetcalf, riel, peterz, davej, schwidefsky, hpa, oleg, mingo,
	tglx, fweisbec, paulmck, bp, umgwanakikbuti, linux-kernel,
	rafael.j.wysocki

Commit-ID:  aed5ed47724f6a7453fa62e3c90f3cee93edbfe3
Gitweb:     http://git.kernel.org/tip/aed5ed47724f6a7453fa62e3c90f3cee93edbfe3
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Wed, 6 May 2015 18:04:23 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 7 May 2015 12:02:50 +0200

context_tracking: Protect against recursion

Context tracking recursion can happen when an exception triggers
in the middle of a call to a context tracking probe.

This special case can be caused by vmalloc faults. If an access
to a memory area allocated by vmalloc happens in the middle of
context_tracking_enter(), we may run into an endless fault loop
because the exception in turn calls context_tracking_enter()
which faults on the same vmalloc'ed memory, triggering an
exception again, etc...

Some rare crashes have been reported so lets protect against
this with a recursion counter.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1430928266-24888-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/context_tracking_state.h |  1 +
 kernel/context_tracking.c              | 29 +++++++++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index 6b7b96a..678ecdf 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -12,6 +12,7 @@ struct context_tracking {
 	 * may be further optimized using static keys.
 	 */
 	bool active;
+	int recursion;
 	enum ctx_state {
 		CONTEXT_KERNEL = 0,
 		CONTEXT_USER,
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 72d59a1..5b11a10 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -38,6 +38,25 @@ void context_tracking_cpu_set(int cpu)
 	}
 }
 
+static bool context_tracking_recursion_enter(void)
+{
+	int recursion;
+
+	recursion = __this_cpu_inc_return(context_tracking.recursion);
+	if (recursion == 1)
+		return true;
+
+	WARN_ONCE((recursion < 1), "Invalid context tracking recursion value %d\n", recursion);
+	__this_cpu_dec(context_tracking.recursion);
+
+	return false;
+}
+
+static void context_tracking_recursion_exit(void)
+{
+	__this_cpu_dec(context_tracking.recursion);
+}
+
 /**
  * context_tracking_enter - Inform the context tracking that the CPU is going
  *                          enter user or guest space mode.
@@ -75,6 +94,9 @@ void context_tracking_enter(enum ctx_state state)
 	WARN_ON_ONCE(!current->mm);
 
 	local_irq_save(flags);
+	if (!context_tracking_recursion_enter())
+		goto out_irq_restore;
+
 	if ( __this_cpu_read(context_tracking.state) != state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
@@ -105,6 +127,8 @@ void context_tracking_enter(enum ctx_state state)
 		 */
 		__this_cpu_write(context_tracking.state, state);
 	}
+	context_tracking_recursion_exit();
+out_irq_restore:
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_enter);
@@ -139,6 +163,9 @@ void context_tracking_exit(enum ctx_state state)
 		return;
 
 	local_irq_save(flags);
+	if (!context_tracking_recursion_enter())
+		goto out_irq_restore;
+
 	if (__this_cpu_read(context_tracking.state) == state) {
 		if (__this_cpu_read(context_tracking.active)) {
 			/*
@@ -153,6 +180,8 @@ void context_tracking_exit(enum ctx_state state)
 		}
 		__this_cpu_write(context_tracking.state, CONTEXT_KERNEL);
 	}
+	context_tracking_recursion_exit();
+out_irq_restore:
 	local_irq_restore(flags);
 }
 NOKPROBE_SYMBOL(context_tracking_exit);

^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [tip:timers/nohz] context_tracking: Inherit TIF_NOHZ through forks instead of context switches
  2015-05-06 16:04 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
@ 2015-05-07 11:31   ` tip-bot for Frederic Weisbecker
  0 siblings, 0 replies; 89+ messages in thread
From: tip-bot for Frederic Weisbecker @ 2015-05-07 11:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: fweisbec, riel, linux-kernel, paulmck, mingo, oleg,
	umgwanakikbuti, peterz, davej, rafael.j.wysocki, schwidefsky, bp,
	cmetcalf, tglx, hpa

Commit-ID:  fafe870f31212a72f3c2d74e7b90e4ef39e83ee1
Gitweb:     http://git.kernel.org/tip/fafe870f31212a72f3c2d74e7b90e4ef39e83ee1
Author:     Frederic Weisbecker <fweisbec@gmail.com>
AuthorDate: Wed, 6 May 2015 18:04:24 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 7 May 2015 12:02:51 +0200

context_tracking: Inherit TIF_NOHZ through forks instead of context switches

TIF_NOHZ is used by context_tracking to force syscall slow-path
on every task in order to track userspace roundtrips. As such,
it must be set on all running tasks.

It's currently explicitly inherited through context switches.
There is no need to do it in this fast-path though. The flag
could simply be set once for all on all tasks, whether they are
running or not.

Lets do this by setting the flag for the init task on early boot,
and let it propagate through fork inheritance.

While at it, mark context_tracking_cpu_set() as init code, we
only need it at early boot time.

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Dave Jones <davej@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Paul E . McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1430928266-24888-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/context_tracking.h | 10 ---------
 include/linux/sched.h            |  3 +++
 kernel/context_tracking.c        | 44 +++++++++++++++++-----------------------
 kernel/sched/core.c              |  1 -
 4 files changed, 22 insertions(+), 36 deletions(-)

diff --git a/include/linux/context_tracking.h b/include/linux/context_tracking.h
index 2821838..b96bd29 100644
--- a/include/linux/context_tracking.h
+++ b/include/linux/context_tracking.h
@@ -14,8 +14,6 @@ extern void context_tracking_enter(enum ctx_state state);
 extern void context_tracking_exit(enum ctx_state state);
 extern void context_tracking_user_enter(void);
 extern void context_tracking_user_exit(void);
-extern void __context_tracking_task_switch(struct task_struct *prev,
-					   struct task_struct *next);
 
 static inline void user_enter(void)
 {
@@ -51,19 +49,11 @@ static inline void exception_exit(enum ctx_state prev_ctx)
 	}
 }
 
-static inline void context_tracking_task_switch(struct task_struct *prev,
-						struct task_struct *next)
-{
-	if (context_tracking_is_enabled())
-		__context_tracking_task_switch(prev, next);
-}
 #else
 static inline void user_enter(void) { }
 static inline void user_exit(void) { }
 static inline enum ctx_state exception_enter(void) { return 0; }
 static inline void exception_exit(enum ctx_state prev_ctx) { }
-static inline void context_tracking_task_switch(struct task_struct *prev,
-						struct task_struct *next) { }
 #endif /* !CONFIG_CONTEXT_TRACKING */
 
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 26a2e61..185a750 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2532,6 +2532,9 @@ static inline unsigned long wait_task_inactive(struct task_struct *p,
 }
 #endif
 
+#define tasklist_empty() \
+	list_empty(&init_task.tasks)
+
 #define next_task(p) \
 	list_entry_rcu((p)->tasks.next, struct task_struct, tasks)
 
diff --git a/kernel/context_tracking.c b/kernel/context_tracking.c
index 5b11a10..0a495ab 100644
--- a/kernel/context_tracking.c
+++ b/kernel/context_tracking.c
@@ -30,14 +30,6 @@ EXPORT_SYMBOL_GPL(context_tracking_enabled);
 DEFINE_PER_CPU(struct context_tracking, context_tracking);
 EXPORT_SYMBOL_GPL(context_tracking);
 
-void context_tracking_cpu_set(int cpu)
-{
-	if (!per_cpu(context_tracking.active, cpu)) {
-		per_cpu(context_tracking.active, cpu) = true;
-		static_key_slow_inc(&context_tracking_enabled);
-	}
-}
-
 static bool context_tracking_recursion_enter(void)
 {
 	int recursion;
@@ -193,24 +185,26 @@ void context_tracking_user_exit(void)
 }
 NOKPROBE_SYMBOL(context_tracking_user_exit);
 
-/**
- * __context_tracking_task_switch - context switch the syscall callbacks
- * @prev: the task that is being switched out
- * @next: the task that is being switched in
- *
- * The context tracking uses the syscall slow path to implement its user-kernel
- * boundaries probes on syscalls. This way it doesn't impact the syscall fast
- * path on CPUs that don't do context tracking.
- *
- * But we need to clear the flag on the previous task because it may later
- * migrate to some CPU that doesn't do the context tracking. As such the TIF
- * flag may not be desired there.
- */
-void __context_tracking_task_switch(struct task_struct *prev,
-				    struct task_struct *next)
+void __init context_tracking_cpu_set(int cpu)
 {
-	clear_tsk_thread_flag(prev, TIF_NOHZ);
-	set_tsk_thread_flag(next, TIF_NOHZ);
+	static __initdata bool initialized = false;
+
+	if (!per_cpu(context_tracking.active, cpu)) {
+		per_cpu(context_tracking.active, cpu) = true;
+		static_key_slow_inc(&context_tracking_enabled);
+	}
+
+	if (initialized)
+		return;
+
+	/*
+	 * Set TIF_NOHZ to init/0 and let it propagate to all tasks through fork
+	 * This assumes that init is the only task at this early boot stage.
+	 */
+	set_tsk_thread_flag(&init_task, TIF_NOHZ);
+	WARN_ON_ONCE(!tasklist_empty());
+
+	initialized = true;
 }
 
 #ifdef CONFIG_CONTEXT_TRACKING_FORCE
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fe22f75..54f032c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2332,7 +2332,6 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 */
 	spin_release(&rq->lock.dep_map, 1, _THIS_IP_);
 
-	context_tracking_task_switch(prev, next);
 	/* Here we just switch the register state and the stack. */
 	switch_to(prev, next, prev);
 	barrier();

^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [tip:timers/nohz] nohz: Add tick_nohz_full_add_cpus_to() API
  2015-05-06 16:04 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker
@ 2015-05-07 11:31   ` tip-bot for Chris Metcalf
  0 siblings, 0 replies; 89+ messages in thread
From: tip-bot for Chris Metcalf @ 2015-05-07 11:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, davej, bp, riel, umgwanakikbuti, hpa, mingo,
	fweisbec, cmetcalf, oleg, rafael.j.wysocki, tglx, paulmck,
	peterz, schwidefsky

Commit-ID:  83dedea8a07fb4bf91863764b15c1c4ec00330f9
Gitweb:     http://git.kernel.org/tip/83dedea8a07fb4bf91863764b15c1c4ec00330f9
Author:     Chris Metcalf <cmetcalf@ezchip.com>
AuthorDate: Wed, 6 May 2015 18:04:25 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 7 May 2015 12:02:51 +0200

nohz: Add tick_nohz_full_add_cpus_to() API

This API is useful to modify a cpumask indicating some special
nohz-type functionality so that the nohz cores are automatically
added to that set.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Jones <davej@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1429024675-18938-1-git-send-email-cmetcalf@ezchip.com
Link: http://lkml.kernel.org/r/1430928266-24888-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 include/linux/tick.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index f8492da5..4191b56 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -134,6 +134,12 @@ static inline bool tick_nohz_full_cpu(int cpu)
 	return cpumask_test_cpu(cpu, tick_nohz_full_mask);
 }
 
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask)
+{
+	if (tick_nohz_full_enabled())
+		cpumask_or(mask, mask, tick_nohz_full_mask);
+}
+
 extern void __tick_nohz_full_check(void);
 extern void tick_nohz_full_kick(void);
 extern void tick_nohz_full_kick_cpu(int cpu);
@@ -142,6 +148,7 @@ extern void __tick_nohz_task_switch(struct task_struct *tsk);
 #else
 static inline bool tick_nohz_full_enabled(void) { return false; }
 static inline bool tick_nohz_full_cpu(int cpu) { return false; }
+static inline void tick_nohz_full_add_cpus_to(struct cpumask *mask) { }
 static inline void __tick_nohz_full_check(void) { }
 static inline void tick_nohz_full_kick_cpu(int cpu) { }
 static inline void tick_nohz_full_kick(void) { }

^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [tip:timers/nohz] nohz: Set isolcpus when nohz_full is set
  2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
@ 2015-05-07 11:32   ` tip-bot for Chris Metcalf
  2015-05-16 19:39   ` [PATCH 4/4] " Sasha Levin
  1 sibling, 0 replies; 89+ messages in thread
From: tip-bot for Chris Metcalf @ 2015-05-07 11:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: schwidefsky, fweisbec, umgwanakikbuti, davej, hpa, bp, peterz,
	cmetcalf, paulmck, tglx, riel, rafael.j.wysocki, oleg,
	linux-kernel, mingo

Commit-ID:  8cb9764fc88b41db11f251e8b2a0d006578b7eb4
Gitweb:     http://git.kernel.org/tip/8cb9764fc88b41db11f251e8b2a0d006578b7eb4
Author:     Chris Metcalf <cmetcalf@ezchip.com>
AuthorDate: Wed, 6 May 2015 18:04:26 +0200
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 7 May 2015 12:02:51 +0200

nohz: Set isolcpus when nohz_full is set

nohz_full is only useful with isolcpus are also set, since
otherwise the scheduler has to run periodically to try to
determine whether to steal work from other cores.

Accordingly, when booting with nohz_full=xxx on the command
line, we should act as if isolcpus=xxx was also set, and set
(or extend) the isolcpus set to include the nohz_full cpus.

Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Rik van Riel <riel@redhat.com>
Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Jones <davej@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1430928266-24888-5-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 54f032c..b8f4876 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7036,6 +7036,9 @@ void __init sched_init_smp(void)
 	alloc_cpumask_var(&non_isolated_cpus, GFP_KERNEL);
 	alloc_cpumask_var(&fallback_doms, GFP_KERNEL);
 
+	/* nohz_full won't take effect without isolating the cpus. */
+	tick_nohz_full_add_cpus_to(cpu_isolated_map);
+
 	sched_init_numa();
 
 	/*

^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH 1/4] context_tracking: Protect against recursion
  2015-05-07  9:58   ` Ingo Molnar
@ 2015-05-07 11:53     ` Frederic Weisbecker
  0 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-07 11:53 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, Rafael J . Wysocki, Peter Zijlstra, Mike Galbraith,
	Chris Metcalf, Dave Jones, Thomas Gleixner, Oleg Nesterov,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On Thu, May 07, 2015 at 11:58:32AM +0200, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > @@ -75,6 +94,11 @@ void context_tracking_enter(enum ctx_state state)
> >  	WARN_ON_ONCE(!current->mm);
> >  
> >  	local_irq_save(flags);
> > +	if (!context_tracking_recursion_enter()) {
> > +		local_irq_restore(flags);
> > +		return;
> > +	}
> > +
> >  	if ( __this_cpu_read(context_tracking.state) != state) {
> >  		if (__this_cpu_read(context_tracking.active)) {
> >  			/*
> > @@ -105,6 +129,7 @@ void context_tracking_enter(enum ctx_state state)
> >  		 */
> >  		__this_cpu_write(context_tracking.state, state);
> >  	}
> > +	context_tracking_recursion_exit();
> 
> >  	local_irq_restore(flags);
> >  }
> 
> So why not add an 'out_irq_restore:' label and use goto instead of 
> duplicating the return path in the recursion check?

Ah yeah. Sometimes people prefer that we just repeat the rollback code if
it's only one line. But I'm fine with the label as well.

> 
> >  NOKPROBE_SYMBOL(context_tracking_enter);
> > @@ -139,6 +164,10 @@ void context_tracking_exit(enum ctx_state state)
> >  		return;
> >  
> >  	local_irq_save(flags);
> > +	if (!context_tracking_recursion_enter()) {
> > +		local_irq_restore(flags);
> > +		return;
> 
> Ditto.
> 
> No need to resend, I fixed this up in the patch.
> 
> Thanks,
> 
> 	Ingo

Ah thanks a lot Ingo!

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
  2015-05-07 11:32   ` [tip:timers/nohz] " tip-bot for Chris Metcalf
@ 2015-05-16 19:39   ` Sasha Levin
  2015-05-17  5:30     ` Mike Galbraith
  1 sibling, 1 reply; 89+ messages in thread
From: Sasha Levin @ 2015-05-16 19:39 UTC (permalink / raw)
  To: Frederic Weisbecker, Ingo Molnar
  Cc: LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Mike Galbraith, Dave Jones, Thomas Gleixner, Oleg Nesterov,
	Paul E . McKenney, Ingo Molnar, Rik van Riel, Martin Schwidefsky

On 05/06/2015 12:04 PM, Frederic Weisbecker wrote:
> From: Chris Metcalf <cmetcalf@ezchip.com>
> 
> nohz_full is only useful with isolcpus also set, since otherwise the
> scheduler has to run periodically to try to determine whether to steal
> work from other cores.
> 
> Accordingly, when booting with nohz_full=xxx on the command line, we
> should act as if isolcpus=xxx was also set, and set (or extend) the
> isolcpus set to include the nohz_full cpus.
> 
> Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> Acked-by: Rik van Riel <riel@redhat.com>
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Rik van Riel <riel@redhat.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>

Hi folks,

I've noticed a regression in my testing a few days ago and bisected it down to
this patch. I was seeing frequent soft lockups/RCU lockups and the load of the
testing VMs would go beyond 400-500 (on 32 VCPU guests) - note I'm booting them
with nohz_full=1-27.

This patch sort of explains the behaviour I was seeing now: most of the cores
are no longer being used by the scheduler, and the remaining cores can't deal
with the load imposed on them which results in "lockups" which are really just
the CPUs being unable to keep up.

I always thought that nohz_full without isolcpus meant that the cores would
be available to the scheduler, but it won't interfere if there is one task
running on them. It seems that this patch changed that behaviour.

Did I misunderstand that?


Thanks,
Sasha

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-16 19:39   ` [PATCH 4/4] " Sasha Levin
@ 2015-05-17  5:30     ` Mike Galbraith
  2015-05-18  2:17       ` Rik van Riel
  2015-05-20 20:38       ` Afzal Mohammed
  0 siblings, 2 replies; 89+ messages in thread
From: Mike Galbraith @ 2015-05-17  5:30 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf,
	Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Sat, 2015-05-16 at 15:39 -0400, Sasha Levin wrote:
> On 05/06/2015 12:04 PM, Frederic Weisbecker wrote:
> > From: Chris Metcalf <cmetcalf@ezchip.com>
> > 
> > nohz_full is only useful with isolcpus also set, since otherwise the
> > scheduler has to run periodically to try to determine whether to steal
> > work from other cores.
> > 
> > Accordingly, when booting with nohz_full=xxx on the command line, we
> > should act as if isolcpus=xxx was also set, and set (or extend) the
> > isolcpus set to include the nohz_full cpus.
> > 
> > Acked-by: Mike Galbraith <umgwanakikbuti@gmail.com> ["thumbs up!"]
> > Acked-by: Rik van Riel <riel@redhat.com>
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
> > Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
> > Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Rik van Riel <riel@redhat.com>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> 
> Hi folks,
> 
> I've noticed a regression in my testing a few days ago and bisected it down to
> this patch. I was seeing frequent soft lockups/RCU lockups and the load of the
> testing VMs would go beyond 400-500 (on 32 VCPU guests) - note I'm booting them
> with nohz_full=1-27.
> 
> This patch sort of explains the behaviour I was seeing now: most of the cores
> are no longer being used by the scheduler, and the remaining cores can't deal
> with the load imposed on them which results in "lockups" which are really just
> the CPUs being unable to keep up.
> 
> I always thought that nohz_full without isolcpus meant that the cores would
> be available to the scheduler, but it won't interfere if there is one task
> running on them. It seems that this patch changed that behaviour.
> 
> Did I misunderstand that?

Yeah, tying nohz_full set to isolcpus set up an initial condition that
you have to tear down with cpusets if you want those cpus returned to
the general purpose pool.  I had considered the kernel setting initial
state to be an optimization, but have reconsidered.

You may have misunderstood somewhat though, if you do not explicitly
isolate the nohz_full set from the scheduler via either isolcpus or
cpusets, there is no exclusion from load balancing, the scheduler may
place additional tasks on a nohz_full cpu to resolve load imbalance.

Given that kernel initiated association to isolcpus, a user turning
NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
he/she does not have CPUSETS enabled, or should Rik's patch rendering
isolcpus immutable be merged, he/she would quickly discover that the
generic box has been transformed into an utterly inflexible specialist
incapable of performing mundane tasks ever again.

	-Mike


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-17  5:30     ` Mike Galbraith
@ 2015-05-18  2:17       ` Rik van Riel
  2015-05-18  3:29         ` Mike Galbraith
  2015-05-20 20:38       ` Afzal Mohammed
  1 sibling, 1 reply; 89+ messages in thread
From: Rik van Riel @ 2015-05-18  2:17 UTC (permalink / raw)
  To: Mike Galbraith, Sasha Levin
  Cc: Frederic Weisbecker, Ingo Molnar, LKML, Chris Metcalf,
	Rafael J . Wysocki, Peter Zijlstra, Dave Jones, Thomas Gleixner,
	Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/17/2015 01:30 AM, Mike Galbraith wrote:

> Given that kernel initiated association to isolcpus, a user turning
> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> he/she does not have CPUSETS enabled, or should Rik's patch rendering
> isolcpus immutable be merged, 

My patch does not aim to make isolcpus immutable, it aims to make
isolcpus resistent to system management tools (like libvirt)
automatically undoing isolcpus the instant a cpuset with the default
cpus (inherited from the root group) is created.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18  2:17       ` Rik van Riel
@ 2015-05-18  3:29         ` Mike Galbraith
  2015-05-18 14:07           ` Rik van Riel
  0 siblings, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-05-18  3:29 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
> 
> > Given that kernel initiated association to isolcpus, a user turning
> > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > he/she does not have CPUSETS enabled, or should Rik's patch rendering
> > isolcpus immutable be merged, 
> 
> My patch does not aim to make isolcpus immutable, it aims to make
> isolcpus resistent to system management tools (like libvirt)
> automatically undoing isolcpus the instant a cpuset with the default
> cpus (inherited from the root group) is created.

Aim or not, if cpusets is the sole modifier, it'll render isolcpus
immutable, no?  Cpusets could grow an override to the override I
suppose, to regain control of the resource it thinks it manages.

	-Mike


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18  3:29         ` Mike Galbraith
@ 2015-05-18 14:07           ` Rik van Riel
  2015-05-18 14:22             ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Rik van Riel @ 2015-05-18 14:07 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/17/2015 11:29 PM, Mike Galbraith wrote:
> On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
>> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
>>
>>> Given that kernel initiated association to isolcpus, a user turning
>>> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
>>> he/she does not have CPUSETS enabled, or should Rik's patch rendering
>>> isolcpus immutable be merged, 
>>
>> My patch does not aim to make isolcpus immutable, it aims to make
>> isolcpus resistent to system management tools (like libvirt)
>> automatically undoing isolcpus the instant a cpuset with the default
>> cpus (inherited from the root group) is created.
> 
> Aim or not, if cpusets is the sole modifier, it'll render isolcpus
> immutable, no?  Cpusets could grow an override to the override I
> suppose, to regain control of the resource it thinks it manages.

The other way would be to make /sys/devices/system/cpu/isolcpus
(which Greg KH promised he would queue up for 4.2) writable.

I am all for making isolcpus changeable at run time. I just want
to prevent it being changed accidentally at run time, by system
tools that want to place their workloads in cpusets.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18 14:07           ` Rik van Riel
@ 2015-05-18 14:22             ` Mike Galbraith
  2015-05-18 14:52               ` Rik van Riel
  0 siblings, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-05-18 14:22 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On Mon, 2015-05-18 at 10:07 -0400, Rik van Riel wrote:
> On 05/17/2015 11:29 PM, Mike Galbraith wrote:
> > On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
> >> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
> >>
> >>> Given that kernel initiated association to isolcpus, a user turning
> >>> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> >>> he/she does not have CPUSETS enabled, or should Rik's patch rendering
> >>> isolcpus immutable be merged, 
> >>
> >> My patch does not aim to make isolcpus immutable, it aims to make
> >> isolcpus resistent to system management tools (like libvirt)
> >> automatically undoing isolcpus the instant a cpuset with the default
> >> cpus (inherited from the root group) is created.
> > 
> > Aim or not, if cpusets is the sole modifier, it'll render isolcpus
> > immutable, no?  Cpusets could grow an override to the override I
> > suppose, to regain control of the resource it thinks it manages.
> 
> The other way would be to make /sys/devices/system/cpu/isolcpus
> (which Greg KH promised he would queue up for 4.2) writable.

Anything is better than override the override.  That's easy, but the
changelog would have to be farmed out to a talented used car salesman.

	-Mike


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18 14:22             ` Mike Galbraith
@ 2015-05-18 14:52               ` Rik van Riel
  2015-05-19  2:30                 ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Rik van Riel @ 2015-05-18 14:52 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/18/2015 10:22 AM, Mike Galbraith wrote:
> On Mon, 2015-05-18 at 10:07 -0400, Rik van Riel wrote:
>> On 05/17/2015 11:29 PM, Mike Galbraith wrote:
>>> On Sun, 2015-05-17 at 22:17 -0400, Rik van Riel wrote:
>>>> On 05/17/2015 01:30 AM, Mike Galbraith wrote:
>>>>
>>>>> Given that kernel initiated association to isolcpus, a user turning
>>>>> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
>>>>> he/she does not have CPUSETS enabled, or should Rik's patch rendering
>>>>> isolcpus immutable be merged, 
>>>>
>>>> My patch does not aim to make isolcpus immutable, it aims to make
>>>> isolcpus resistent to system management tools (like libvirt)
>>>> automatically undoing isolcpus the instant a cpuset with the default
>>>> cpus (inherited from the root group) is created.
>>>
>>> Aim or not, if cpusets is the sole modifier, it'll render isolcpus
>>> immutable, no?  Cpusets could grow an override to the override I
>>> suppose, to regain control of the resource it thinks it manages.
>>
>> The other way would be to make /sys/devices/system/cpu/isolcpus
>> (which Greg KH promised he would queue up for 4.2) writable.
> 
> Anything is better than override the override.  That's easy, but the
> changelog would have to be farmed out to a talented used car salesman.

For real time KVM, it is desirable to run the VCPU threads on
isolated CPUs as real time tasks.

Meanwhile, the emulator threads can run as normal tasks anywhere
on the system.

This means the /machine cpuset, which all guests live under,
needs to contain all the system's CPUs, both isolated and
non-isolated, otherwise it will be unable to have the VCPU
threads on isolated CPUs and the emulator threads on other
CPUs.

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-18 14:52               ` Rik van Riel
@ 2015-05-19  2:30                 ` Mike Galbraith
  2015-06-12 19:12                   ` Rik van Riel
  0 siblings, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-05-19  2:30 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On Mon, 2015-05-18 at 10:52 -0400, Rik van Riel wrote:

> For real time KVM, it is desirable to run the VCPU threads on
> isolated CPUs as real time tasks.
> 
> Meanwhile, the emulator threads can run as normal tasks anywhere
> on the system.
> 
> This means the /machine cpuset, which all guests live under,
> needs to contain all the system's CPUs, both isolated and
> non-isolated, otherwise it will be unable to have the VCPU
> threads on isolated CPUs and the emulator threads on other
> CPUs.

Ok, I know next to nothing about virtualization only because I failed at
maintaining the desired absolute zero, but...

Why do you use isolcpus for this?  This KVM setup sounds very similar to
what I do for real realtime.  I use a shield script to set up an
exclusive isolated set and a system set and move the mundane world into
the system set.  Inject realtime proggy into its exclusive home, it
creates/pins worker bees, done.  Why the cpuset isolcpus hybrid?

When you say emulator threads can run anywhere, it sounds like that
includes the isolated set, but even so, nothing prevents injecting
whatever (the kernel permits) into whichever set.

	-Mike



^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-17  5:30     ` Mike Galbraith
  2015-05-18  2:17       ` Rik van Riel
@ 2015-05-20 20:38       ` Afzal Mohammed
  2015-05-20 21:00         ` Paul E. McKenney
  1 sibling, 1 reply; 89+ messages in thread
From: Afzal Mohammed @ 2015-05-20 20:38 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

Hi,

On Sun, May 17, 2015 at 07:30:50AM +0200, Mike Galbraith wrote:

> Yeah, tying nohz_full set to isolcpus set up an initial condition that
> you have to tear down with cpusets if you want those cpus returned to
> the general purpose pool.  I had considered the kernel setting initial
> state to be an optimization, but have reconsidered.

> Given that kernel initiated association to isolcpus, a user turning
> NO_HZ_FULL_ALL on had better not have much generic load to manage.  If

On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
time as compared to w/o this patch, except boot cpu every one else
jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
it was working fine, but not after this - it is now like a single core
system.

Regards
Afzal

> he/she does not have CPUSETS enabled, or should Rik's patch rendering
> isolcpus immutable be merged, he/she would quickly discover that the
> generic box has been transformed into an utterly inflexible specialist
> incapable of performing mundane tasks ever again.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-20 20:38       ` Afzal Mohammed
@ 2015-05-20 21:00         ` Paul E. McKenney
  2015-05-21 12:12           ` Afzal Mohammed
  0 siblings, 1 reply; 89+ messages in thread
From: Paul E. McKenney @ 2015-05-20 21:00 UTC (permalink / raw)
  To: Afzal Mohammed
  Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

On Thu, May 21, 2015 at 02:08:09AM +0530, Afzal Mohammed wrote:
> Hi,
> 
> On Sun, May 17, 2015 at 07:30:50AM +0200, Mike Galbraith wrote:
> 
> > Yeah, tying nohz_full set to isolcpus set up an initial condition that
> > you have to tear down with cpusets if you want those cpus returned to
> > the general purpose pool.  I had considered the kernel setting initial
> > state to be an optimization, but have reconsidered.
> 
> > Given that kernel initiated association to isolcpus, a user turning
> > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> 
> On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> time as compared to w/o this patch, except boot cpu every one else
> jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> it was working fine, but not after this - it is now like a single core
> system.

I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
to do for you?

							Thanx, Paul

> Regards
> Afzal
> 
> > he/she does not have CPUSETS enabled, or should Rik's patch rendering
> > isolcpus immutable be merged, he/she would quickly discover that the
> > generic box has been transformed into an utterly inflexible specialist
> > incapable of performing mundane tasks ever again.
> 


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-20 21:00         ` Paul E. McKenney
@ 2015-05-21 12:12           ` Afzal Mohammed
  2015-05-21 12:57             ` Paul E. McKenney
  0 siblings, 1 reply; 89+ messages in thread
From: Afzal Mohammed @ 2015-05-21 12:12 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

Hi,

On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:

> > > Given that kernel initiated association to isolcpus, a user turning
> > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > 
> > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > time as compared to w/o this patch, except boot cpu every one else
> > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > it was working fine, but not after this - it is now like a single core
> > system.
> 
> I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> to do for you?

I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.

Thought that shutting down ticks as much as possible would be
beneficial to normal loads too, though it has been mentioned to be used
for specialized loads. Seems like drawbacks due to it weigh against
normal loads, but haven't so far observed any (on a laptop with normal
activities) before this change.

Regards
Afzal

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 12:12           ` Afzal Mohammed
@ 2015-05-21 12:57             ` Paul E. McKenney
  2015-05-21 13:06               ` Frederic Weisbecker
  0 siblings, 1 reply; 89+ messages in thread
From: Paul E. McKenney @ 2015-05-21 12:57 UTC (permalink / raw)
  To: Afzal Mohammed
  Cc: Mike Galbraith, Sasha Levin, Frederic Weisbecker, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> Hi,
> 
> On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> 
> > > > Given that kernel initiated association to isolcpus, a user turning
> > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > 
> > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > time as compared to w/o this patch, except boot cpu every one else
> > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > it was working fine, but not after this - it is now like a single core
> > > system.
> > 
> > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > to do for you?
> 
> I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> 
> Thought that shutting down ticks as much as possible would be
> beneficial to normal loads too, though it has been mentioned to be used
> for specialized loads. Seems like drawbacks due to it weigh against
> normal loads, but haven't so far observed any (on a laptop with normal
> activities) before this change.

Indeed, NO_HZ_FULL is special purpose.  You normally would select
NO_HZ_FULL_ALL only on a system intended for heavy compute without
normal-workload distractions or for some real-time systems.  For mixed
workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
use the boot parameters to select which CPUs are to be running the
specialized portion of the workload.

And you would of course need to lead enough CPUs running normally to
handle the non-specialized portion of the workload.

This sort of thing has traditionally required specialized kernels,
so the cool thing here is that we can make Linux do it.  Though, as
you noticed, careful configuration is still required.

Seem reasonable?

							Thanx, Paul


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 12:57             ` Paul E. McKenney
@ 2015-05-21 13:06               ` Frederic Weisbecker
  2015-05-21 13:27                 ` Paul E. McKenney
                                   ` (2 more replies)
  0 siblings, 3 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-21 13:06 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> > Hi,
> > 
> > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> > 
> > > > > Given that kernel initiated association to isolcpus, a user turning
> > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > > 
> > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > > time as compared to w/o this patch, except boot cpu every one else
> > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > > it was working fine, but not after this - it is now like a single core
> > > > system.
> > > 
> > > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > > to do for you?
> > 
> > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> > 
> > Thought that shutting down ticks as much as possible would be
> > beneficial to normal loads too, though it has been mentioned to be used
> > for specialized loads. Seems like drawbacks due to it weigh against
> > normal loads, but haven't so far observed any (on a laptop with normal
> > activities) before this change.
> 
> Indeed, NO_HZ_FULL is special purpose.  You normally would select
> NO_HZ_FULL_ALL only on a system intended for heavy compute without
> normal-workload distractions or for some real-time systems.  For mixed
> workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> use the boot parameters to select which CPUs are to be running the
> specialized portion of the workload.
> 
> And you would of course need to lead enough CPUs running normally to
> handle the non-specialized portion of the workload.
> 
> This sort of thing has traditionally required specialized kernels,
> so the cool thing here is that we can make Linux do it.  Though, as
> you noticed, careful configuration is still required.
> 
> Seem reasonable?

That said if he saw a big performance regression after applying these patches,
then there is likely a problem in the patchset. Well it could be due to that mode
which loops on full dynticks before resuming to userspace. Indeed when that is
enabled, I expect real throughput issues on workloads doing lots of kernel <->
userspace roundtrips. We just need to make sure this thing only works when requested.

Anyway, I need to look at the patchset.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:06               ` Frederic Weisbecker
@ 2015-05-21 13:27                 ` Paul E. McKenney
  2015-05-21 13:29                 ` Afzal Mohammed
  2015-05-21 18:59                 ` Mike Galbraith
  2 siblings, 0 replies; 89+ messages in thread
From: Paul E. McKenney @ 2015-05-21 13:27 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> > On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> > > Hi,
> > > 
> > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> > > 
> > > > > > Given that kernel initiated association to isolcpus, a user turning
> > > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > > > 
> > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > > > time as compared to w/o this patch, except boot cpu every one else
> > > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > > > it was working fine, but not after this - it is now like a single core
> > > > > system.
> > > > 
> > > > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > > > to do for you?
> > > 
> > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> > > 
> > > Thought that shutting down ticks as much as possible would be
> > > beneficial to normal loads too, though it has been mentioned to be used
> > > for specialized loads. Seems like drawbacks due to it weigh against
> > > normal loads, but haven't so far observed any (on a laptop with normal
> > > activities) before this change.
> > 
> > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > normal-workload distractions or for some real-time systems.  For mixed
> > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > use the boot parameters to select which CPUs are to be running the
> > specialized portion of the workload.
> > 
> > And you would of course need to lead enough CPUs running normally to
> > handle the non-specialized portion of the workload.
> > 
> > This sort of thing has traditionally required specialized kernels,
> > so the cool thing here is that we can make Linux do it.  Though, as
> > you noticed, careful configuration is still required.
> > 
> > Seem reasonable?
> 
> That said if he saw a big performance regression after applying these patches,
> then there is likely a problem in the patchset. Well it could be due to that mode
> which loops on full dynticks before resuming to userspace. Indeed when that is
> enabled, I expect real throughput issues on workloads doing lots of kernel <->
> userspace roundtrips. We just need to make sure this thing only works when requested.
> 
> Anyway, I need to look at the patchset.

Fair enough!

							Thanx, Paul


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:06               ` Frederic Weisbecker
  2015-05-21 13:27                 ` Paul E. McKenney
@ 2015-05-21 13:29                 ` Afzal Mohammed
  2015-05-21 14:14                   ` Paul E. McKenney
  2015-05-21 18:59                 ` Mike Galbraith
  2 siblings, 1 reply; 89+ messages in thread
From: Afzal Mohammed @ 2015-05-21 13:29 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

Hi,

On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:

> > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > normal-workload distractions or for some real-time systems.  For mixed
> > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > use the boot parameters to select which CPUs are to be running the
> > specialized portion of the workload.
> > 
> > And you would of course need to lead enough CPUs running normally to
> > handle the non-specialized portion of the workload.
> > 
> > This sort of thing has traditionally required specialized kernels,
> > so the cool thing here is that we can make Linux do it.  Though, as
> > you noticed, careful configuration is still required.
> > 
> > Seem reasonable?

Yes, thanks, some dots got connected :)

> That said if he saw a big performance regression after applying these patches,
> then there is likely a problem in the patchset. Well it could be due to that mode
> which loops on full dynticks before resuming to userspace. Indeed when that is
> enabled, I expect real throughput issues on workloads doing lots of kernel <->
> userspace roundtrips. We just need to make sure this thing only works when requested.

With this change (& having NO_HZ_FULL_ALL), hackbench was being served
only by the boot cpu, while w/o this change, all 8 (this is a quad
core HT processor) was being used - observation based on 'top'.

Regards
Afzal

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:29                 ` Afzal Mohammed
@ 2015-05-21 14:14                   ` Paul E. McKenney
  2015-05-21 14:46                     ` Frederic Weisbecker
  0 siblings, 1 reply; 89+ messages in thread
From: Paul E. McKenney @ 2015-05-21 14:14 UTC (permalink / raw)
  To: Afzal Mohammed
  Cc: Frederic Weisbecker, Mike Galbraith, Sasha Levin, Ingo Molnar,
	LKML, Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra,
	Dave Jones, Thomas Gleixner, Oleg Nesterov, Ingo Molnar,
	Rik van Riel, Martin Schwidefsky

On Thu, May 21, 2015 at 06:59:05PM +0530, Afzal Mohammed wrote:
> Hi,
> 
> On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> 
> > > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > > normal-workload distractions or for some real-time systems.  For mixed
> > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > > use the boot parameters to select which CPUs are to be running the
> > > specialized portion of the workload.
> > > 
> > > And you would of course need to lead enough CPUs running normally to
> > > handle the non-specialized portion of the workload.
> > > 
> > > This sort of thing has traditionally required specialized kernels,
> > > so the cool thing here is that we can make Linux do it.  Though, as
> > > you noticed, careful configuration is still required.
> > > 
> > > Seem reasonable?
> 
> Yes, thanks, some dots got connected :)
> 
> > That said if he saw a big performance regression after applying these patches,
> > then there is likely a problem in the patchset. Well it could be due to that mode
> > which loops on full dynticks before resuming to userspace. Indeed when that is
> > enabled, I expect real throughput issues on workloads doing lots of kernel <->
> > userspace roundtrips. We just need to make sure this thing only works when requested.
> 
> With this change (& having NO_HZ_FULL_ALL), hackbench was being served
> only by the boot cpu, while w/o this change, all 8 (this is a quad
> core HT processor) was being used - observation based on 'top'.

Good to know!  And it will be interesting to see what Frederic decides
based on his review of the patchset.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 14:14                   ` Paul E. McKenney
@ 2015-05-21 14:46                     ` Frederic Weisbecker
  0 siblings, 0 replies; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-21 14:46 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Afzal Mohammed, Mike Galbraith, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 07:14:09AM -0700, Paul E. McKenney wrote:
> On Thu, May 21, 2015 at 06:59:05PM +0530, Afzal Mohammed wrote:
> > Hi,
> > 
> > On Thu, May 21, 2015 at 03:06:23PM +0200, Frederic Weisbecker wrote:
> > > On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> > 
> > > > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > > > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > > > normal-workload distractions or for some real-time systems.  For mixed
> > > > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > > > use the boot parameters to select which CPUs are to be running the
> > > > specialized portion of the workload.
> > > > 
> > > > And you would of course need to lead enough CPUs running normally to
> > > > handle the non-specialized portion of the workload.
> > > > 
> > > > This sort of thing has traditionally required specialized kernels,
> > > > so the cool thing here is that we can make Linux do it.  Though, as
> > > > you noticed, careful configuration is still required.
> > > > 
> > > > Seem reasonable?
> > 
> > Yes, thanks, some dots got connected :)
> > 
> > > That said if he saw a big performance regression after applying these patches,
> > > then there is likely a problem in the patchset. Well it could be due to that mode
> > > which loops on full dynticks before resuming to userspace. Indeed when that is
> > > enabled, I expect real throughput issues on workloads doing lots of kernel <->
> > > userspace roundtrips. We just need to make sure this thing only works when requested.
> > 
> > With this change (& having NO_HZ_FULL_ALL), hackbench was being served
> > only by the boot cpu, while w/o this change, all 8 (this is a quad
> > core HT processor) was being used - observation based on 'top'.
> 
> Good to know!  And it will be interesting to see what Frederic decides
> based on his review of the patchset.

Ah wait I was confusing this patchset with the dataplane mode.

No the performance loss seen by Afzal is actually expected. Now that we set
all nohz full CPUs as isolcpus, when you start a process such as hackbench,
it will be affine to CPU 0 unless you explicitly tweak the affinity of
hackbench.

So yeah this is a desired effect :-)

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 13:06               ` Frederic Weisbecker
  2015-05-21 13:27                 ` Paul E. McKenney
  2015-05-21 13:29                 ` Afzal Mohammed
@ 2015-05-21 18:59                 ` Mike Galbraith
  2015-05-22 14:39                   ` Frederic Weisbecker
  2 siblings, 1 reply; 89+ messages in thread
From: Mike Galbraith @ 2015-05-21 18:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, 2015-05-21 at 15:06 +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 05:57:59AM -0700, Paul E. McKenney wrote:
> > On Thu, May 21, 2015 at 05:42:46PM +0530, Afzal Mohammed wrote:
> > > Hi,
> > > 
> > > On Wed, May 20, 2015 at 02:00:26PM -0700, Paul E. McKenney wrote:
> > > 
> > > > > > Given that kernel initiated association to isolcpus, a user turning
> > > > > > NO_HZ_FULL_ALL on had better not have much generic load to manage.  If
> > > > > 
> > > > > On a quad-core desktop system with NO_HZ_FULL_ALL, hackbench took 3x
> > > > > time as compared to w/o this patch, except boot cpu every one else
> > > > > jobless. Though NO_HZ_FULL_ALL (afaik) is not meant for generic load,
> > > > > it was working fine, but not after this - it is now like a single core
> > > > > system.
> > > > 
> > > > I have to ask...  What is your use case?  What are you wanting NO_HZ_FULL
> > > > to do for you?
> > > 
> > > I was just playing NO_HZ_FULL with tip-[sched,timers]-* changes.
> > > 
> > > Thought that shutting down ticks as much as possible would be
> > > beneficial to normal loads too, though it has been mentioned to be used
> > > for specialized loads. Seems like drawbacks due to it weigh against
> > > normal loads, but haven't so far observed any (on a laptop with normal
> > > activities) before this change.
> > 
> > Indeed, NO_HZ_FULL is special purpose.  You normally would select
> > NO_HZ_FULL_ALL only on a system intended for heavy compute without
> > normal-workload distractions or for some real-time systems.  For mixed
> > workloads, you would build with NO_HZ_FULL (but not NO_HZ_FULL_ALL) and
> > use the boot parameters to select which CPUs are to be running the
> > specialized portion of the workload.
> > 
> > And you would of course need to lead enough CPUs running normally to
> > handle the non-specialized portion of the workload.
> > 
> > This sort of thing has traditionally required specialized kernels,
> > so the cool thing here is that we can make Linux do it.  Though, as
> > you noticed, careful configuration is still required.
> > 
> > Seem reasonable?
> 
> That said if he saw a big performance regression after applying these patches,
> then there is likely a problem in the patchset. Well it could be due to that mode
> which loops on full dynticks before resuming to userspace. Indeed when that is
> enabled, I expect real throughput issues on workloads doing lots of kernel <->
> userspace roundtrips. We just need to make sure this thing only works when requested.
> 
> Anyway, I need to look at the patchset.

I think it's a mistake to make any rash assumption and/or mandate that
the user WILL use nohz_full CPUs immediately, or even at all for that
matter.  nohz_full currently is nothing but a CPU attribute, period,
nothing more, nothing less.  The overhead that comes with the it is the
only thing that suggests that the set really really should be used
exclusively for isolated compute loads, and even then, they had better
be pure userspace due to the hefty border crossing overhead.

Let that overhear be seriously reduced, as per the Ingo/Andy discussion,
and presto, the things become a very interesting to dynamic sets.  You
can really really use them as you see fit, and on the fly, it doesn't
cost you an arm and a leg just to turn it on.  The only restriction
being the static set declaration, that you have to manage until that too
goes away.

I see no reason to turn nohz_full into a rigid construct.

	-Mike


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-21 18:59                 ` Mike Galbraith
@ 2015-05-22 14:39                   ` Frederic Weisbecker
  2015-05-22 15:20                     ` Mike Galbraith
  0 siblings, 1 reply; 89+ messages in thread
From: Frederic Weisbecker @ 2015-05-22 14:39 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Thu, May 21, 2015 at 08:59:38PM +0200, Mike Galbraith wrote:
> I think it's a mistake to make any rash assumption and/or mandate that
> the user WILL use nohz_full CPUs immediately, or even at all for that
> matter.  nohz_full currently is nothing but a CPU attribute, period,
> nothing more, nothing less.

That's what the nohz_full parameter is for. Now if you're referring to
change the nohz_full toggle to a runtime interface such as sysfs or such,
I don't see that's a pressing need, especially considering the added
complexity. At least I haven't heard much requests for it.

> The overhead that comes with the it is the
> only thing that suggests that the set really really should be used
> exclusively for isolated compute loads, and even then, they had better
> be pure userspace due to the hefty border crossing overhead.

Yep.

> 
> Let that overhear be seriously reduced, as per the Ingo/Andy discussion,
> and presto, the things become a very interesting to dynamic sets.  You
> can really really use them as you see fit, and on the fly, it doesn't
> cost you an arm and a leg just to turn it on.  The only restriction
> being the static set declaration, that you have to manage until that too
> goes away.
> 
> I see no reason to turn nohz_full into a rigid construct.

I'm all for making nohz_full just about the tick and drive the rest of the
isolation features through other settings.

Ideally the isolation should be tuned by userspace, we have all the interface
available for that: process affinity, irq affinity, workqueue affinity (soon),
watchdog, etc... Then we'll also add syscall or prctl to loop on hard isolation
(dataplane).

Unfortunately people seem to prefer adding that complexity to the kernel and let
it assume bold and unflexible pre-setting on its own.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-22 14:39                   ` Frederic Weisbecker
@ 2015-05-22 15:20                     ` Mike Galbraith
  0 siblings, 0 replies; 89+ messages in thread
From: Mike Galbraith @ 2015-05-22 15:20 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Paul E. McKenney, Afzal Mohammed, Sasha Levin, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Ingo Molnar, Rik van Riel,
	Martin Schwidefsky

On Fri, 2015-05-22 at 16:39 +0200, Frederic Weisbecker wrote:
> On Thu, May 21, 2015 at 08:59:38PM +0200, Mike Galbraith wrote:
> > I think it's a mistake to make any rash assumption and/or mandate that
> > the user WILL use nohz_full CPUs immediately, or even at all for that
> > matter.  nohz_full currently is nothing but a CPU attribute, period,
> > nothing more, nothing less.
> 
> That's what the nohz_full parameter is for. Now if you're referring to
> change the nohz_full toggle to a runtime interface such as sysfs or such,
> I don't see that's a pressing need, especially considering the added
> complexity. At least I haven't heard much requests for it.

I'm not advocating anything like that, I'm just standing on my soap box
advocating NOT restricting user choices.  Cpusets works fine for me, and
I'd like them to keep on working.  If I need to add any hooks, buttons,
or rubber tubing I'll do it there ;-)

	-Mike


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 4/4] nohz: Set isolcpus when nohz_full is set
  2015-05-19  2:30                 ` Mike Galbraith
@ 2015-06-12 19:12                   ` Rik van Riel
  0 siblings, 0 replies; 89+ messages in thread
From: Rik van Riel @ 2015-06-12 19:12 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sasha Levin, Frederic Weisbecker, Ingo Molnar, LKML,
	Chris Metcalf, Rafael J . Wysocki, Peter Zijlstra, Dave Jones,
	Thomas Gleixner, Oleg Nesterov, Paul E . McKenney, Ingo Molnar,
	Martin Schwidefsky

On 05/18/2015 10:30 PM, Mike Galbraith wrote:
> On Mon, 2015-05-18 at 10:52 -0400, Rik van Riel wrote:
> 
>> For real time KVM, it is desirable to run the VCPU threads on
>> isolated CPUs as real time tasks.
>>
>> Meanwhile, the emulator threads can run as normal tasks anywhere
>> on the system.
>>
>> This means the /machine cpuset, which all guests live under,
>> needs to contain all the system's CPUs, both isolated and
>> non-isolated, otherwise it will be unable to have the VCPU
>> threads on isolated CPUs and the emulator threads on other
>> CPUs.
> 
> Ok, I know next to nothing about virtualization only because I failed at
> maintaining the desired absolute zero, but...
> 
> Why do you use isolcpus for this?  This KVM setup sounds very similar to
> what I do for real realtime.  I use a shield script to set up an
> exclusive isolated set and a system set and move the mundane world into
> the system set.  Inject realtime proggy into its exclusive home, it
> creates/pins worker bees, done.  Why the cpuset isolcpus hybrid?

Because libvirt doesn't know any better currently.

Ideally in the long run, we would use only cpusets, and not
boot with isolcpus at all.

Of course, things like irqbalance also key off isolcpus, to
avoid assigning irqs to the isolated cpus, so there are quite
a few puzzle pieces, and the best I can hope for in the short
term is to simply resolve the incompatibilities where one
piece of software undoes the settings required by another...

-- 
All rights reversed

^ permalink raw reply	[flat|nested] 89+ messages in thread

end of thread, other threads:[~2015-06-12 19:12 UTC | newest]

Thread overview: 89+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-03 16:24 [PATCH 1/2] nohz: add tick_nohz_full_set_cpus() API cmetcalf
2015-04-03 16:24 ` [PATCH 2/2] nohz: make nohz_full imply isolcpus cmetcalf
2015-04-03 17:42   ` Frederic Weisbecker
2015-04-03 19:20     ` Chris Metcalf
2015-04-04 14:10       ` Rik van Riel
2015-04-04 17:09         ` Chris Metcalf
2015-04-05  5:05           ` Ingo Molnar
2015-04-06 17:15             ` Chris Metcalf
2015-04-06 18:16               ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs cmetcalf
2015-04-06 18:16                 ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus cmetcalf
2015-04-07 22:29                   ` Frederic Weisbecker
2015-04-08  9:41                   ` Peter Zijlstra
2015-04-08 14:04                     ` Chris Metcalf
2015-04-08 14:26                       ` Peter Zijlstra
2015-04-08 15:21                         ` Chris Metcalf
2015-04-08 17:24                           ` Frederic Weisbecker
2015-04-08 19:20                             ` [PATCH v3 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs cmetcalf
2015-04-08 19:20                               ` [PATCH v3 2/2] nohz: set isolcpus when nohz_full is set cmetcalf
2015-04-08 17:27                           ` [PATCH v2 2/2] nohz: make nohz_full imply isolcpus Peter Zijlstra
2015-04-08 18:12                             ` Chris Metcalf
2015-04-09  8:29                               ` Peter Zijlstra
2015-04-09 12:02                                 ` Chris Metcalf
2015-04-09 12:45                                   ` Frederic Weisbecker
2015-04-09 16:59                                     ` [PATCH v5] nohz: set isolcpus when nohz_full is set Chris Metcalf
2015-04-09 17:12                                       ` Peter Zijlstra
2015-04-10  1:05                                         ` Mike Galbraith
2015-04-10 15:33                                           ` Chris Metcalf
2015-04-10 15:57                                             ` Mike Galbraith
2015-04-09 17:00                                 ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Chris Metcalf
2015-04-09 17:00                                   ` [PATCH v4 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
2015-04-09 17:07                                   ` [PATCH v4 1/2] nohz: add tick_nohz_full_cpumask_or() and _andnot() APIs Peter Zijlstra
2015-04-09 17:24                                     ` Chris Metcalf
2015-04-09 17:42                                       ` Peter Zijlstra
2015-04-09 18:01                                         ` [PATCH v6 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
2015-04-09 18:01                                           ` [PATCH v6 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
2015-04-10  1:37                                             ` Frederic Weisbecker
2015-04-10 20:53                                               ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Chris Metcalf
2015-04-10 20:53                                                 ` [PATCH v7 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
2015-04-14  0:37                                                   ` Frederic Weisbecker
2015-04-14 15:17                                                     ` [PATCH v8 1/2] nohz: add tick_nohz_full_add_cpus_to() API Chris Metcalf
2015-04-14 15:17                                                       ` [PATCH v8 2/2] nohz: set isolcpus when nohz_full is set Chris Metcalf
2015-04-14 15:26                                                         ` Frederic Weisbecker
2015-04-14 16:45                                                           ` Peter Zijlstra
2015-04-14  0:33                                                 ` [PATCH v7 1/2] nohz: add tick_nohz_full_add_cpus_to() and _remove_cpus_from() APIs Frederic Weisbecker
2015-04-14  0:49                                                   ` Chris Metcalf
2015-04-14 15:34                                                     ` Frederic Weisbecker
2015-04-10  1:34                                           ` [PATCH v6 " Frederic Weisbecker
2015-04-10 15:31                                             ` Chris Metcalf
2015-04-06 18:29                 ` [PATCH v2 1/2] nohz: add tick_nohz_full_clear_cpus() and _set_cpus() APIs Frederic Weisbecker
2015-04-06 19:09                   ` Chris Metcalf
2015-04-07  9:33                     ` Ingo Molnar
2015-04-03 18:08   ` [PATCH 2/2] nohz: make nohz_full imply isolcpus Mike Galbraith
2015-04-03 19:21     ` Chris Metcalf
2015-04-04  2:03       ` Mike Galbraith
2015-04-04  3:43         ` Mike Galbraith
2015-04-06 19:28           ` Rik van Riel
2015-04-07  3:10             ` Mike Galbraith
2015-05-06 16:04 [GIT PULL] nohz: A few improvements v4 Frederic Weisbecker
2015-05-06 16:04 ` [PATCH 1/4] context_tracking: Protect against recursion Frederic Weisbecker
2015-05-07  9:58   ` Ingo Molnar
2015-05-07 11:53     ` Frederic Weisbecker
2015-05-07 11:31   ` [tip:timers/nohz] " tip-bot for Frederic Weisbecker
2015-05-06 16:04 ` [PATCH 2/4] context_tracking: Inherit TIF_NOHZ through forks instead of context switches Frederic Weisbecker
2015-05-07 11:31   ` [tip:timers/nohz] " tip-bot for Frederic Weisbecker
2015-05-06 16:04 ` [PATCH 3/4] nohz: Add tick_nohz_full_add_cpus_to() API Frederic Weisbecker
2015-05-07 11:31   ` [tip:timers/nohz] " tip-bot for Chris Metcalf
2015-05-06 16:04 ` [PATCH 4/4] nohz: Set isolcpus when nohz_full is set Frederic Weisbecker
2015-05-07 11:32   ` [tip:timers/nohz] " tip-bot for Chris Metcalf
2015-05-16 19:39   ` [PATCH 4/4] " Sasha Levin
2015-05-17  5:30     ` Mike Galbraith
2015-05-18  2:17       ` Rik van Riel
2015-05-18  3:29         ` Mike Galbraith
2015-05-18 14:07           ` Rik van Riel
2015-05-18 14:22             ` Mike Galbraith
2015-05-18 14:52               ` Rik van Riel
2015-05-19  2:30                 ` Mike Galbraith
2015-06-12 19:12                   ` Rik van Riel
2015-05-20 20:38       ` Afzal Mohammed
2015-05-20 21:00         ` Paul E. McKenney
2015-05-21 12:12           ` Afzal Mohammed
2015-05-21 12:57             ` Paul E. McKenney
2015-05-21 13:06               ` Frederic Weisbecker
2015-05-21 13:27                 ` Paul E. McKenney
2015-05-21 13:29                 ` Afzal Mohammed
2015-05-21 14:14                   ` Paul E. McKenney
2015-05-21 14:46                     ` Frederic Weisbecker
2015-05-21 18:59                 ` Mike Galbraith
2015-05-22 14:39                   ` Frederic Weisbecker
2015-05-22 15:20                     ` Mike Galbraith

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.