* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 11:10 ` Ingo Molnar
@ 2013-02-07 15:41 ` Christoph Lameter
2013-02-07 16:12 ` Steven Rostedt
2013-02-07 16:25 ` Frederic Weisbecker
2 siblings, 0 replies; 28+ messages in thread
From: Christoph Lameter @ 2013-02-07 15:41 UTC
To: Ingo Molnar
Cc: Steven Rostedt, Frederic Weisbecker, LKML, Alessio Igor Bogani,
Andrew Morton, Chris Metcalf, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, 7 Feb 2013, Ingo Molnar wrote:
> Agreed?
Yes and please also change the texts in Kconfig to accurately describe
what happens to the timer tick.
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 11:10 ` Ingo Molnar
2013-02-07 15:41 ` Christoph Lameter
@ 2013-02-07 16:12 ` Steven Rostedt
2013-02-07 16:30 ` Paul E. McKenney
2013-02-07 16:25 ` Frederic Weisbecker
2 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2013-02-07 16:12 UTC
To: Ingo Molnar
Cc: Frederic Weisbecker, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, 2013-02-07 at 12:10 +0100, Ingo Molnar wrote:
> * Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > I'll reply to this as I come up with comments.
> >
> > First thing is, don't call it NO_HZ_FULL. A better name would
> > be NO_HZ_CPU. I would like to reserve NO_HZ_FULL when we
> > totally remove jiffies :-)
>
> I don't think we want yet another config option named in a
> weird way.
>
> What we want instead is to just split NO_HZ up into its
> conceptual parts:
>
> CONFIG_NO_HZ_IDLE
> CONFIG_NO_HZ_USER_SPACE
> CONFIG_NO_HZ_KERNEL_SPACE
>
> Where the current status quo is NO_HZ_IDLE=y, and Frederic is
> about to introduce NO_HZ_USER_SPACE=y. When jiffies get removed
> we get NO_HZ_KERNEL_SPACE=y.
NO_HZ_USER_SPACE is a bit of a misnomer, as we don't just stop
the tick for user space; it may remain stopped when entering the
kernel. The rule is that when there's just a single task on a CPU, the
tick can stop (no scheduling work needed). But if the task triggers
something that may require a tick (like printk), then the tick will start
again. Merely entering the kernel does not force a tick restart.
Maybe a better name would be NO_HZ_SINGLE_TASK?
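To make the rule concrete, here is a rough userspace C sketch. It is not
kernel code: struct cpu_state, its fields and tick_can_stop() are names made
up for illustration, and the real decision involves more inputs than these two.

```c
#include <stdbool.h>

/* Hypothetical per-CPU snapshot; field names are illustrative only. */
struct cpu_state {
	unsigned int nr_running;	/* runnable tasks on this CPU */
	bool tick_needed;		/* something (e.g. printk) asked for a tick */
};

/*
 * The tick may stay off only while exactly one task is runnable and
 * nothing has requested a tick. Whether that task is currently in user
 * or kernel mode is deliberately not an input. The idle case (no task
 * at all) is handled by the existing idle path and is not modelled here.
 */
static bool tick_can_stop(const struct cpu_state *cs)
{
	return cs->nr_running == 1 && !cs->tick_needed;
}
```

A second runnable task, or a pending tick request, makes tick_can_stop()
return false; a syscall by the lone task changes nothing.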
>
> The 'CONFIG_NO_HZ' meta-option, which we should leave for easy
> configurability and for compatibility, should get us the
> currently recommended default, which for the time being might
> be:
>
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_USER_SPACE is disabled
>
> Btw., you could add CONFIG_NO_HZ_KERNEL_SPACE right away, just
> keep it false all the time. That would document our future plans
> pretty well.
Maybe the removal of jiffies would be NO_HZ_COMPLETE?
-- Steve
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 16:12 ` Steven Rostedt
@ 2013-02-07 16:30 ` Paul E. McKenney
2013-02-07 17:06 ` Steven Rostedt
0 siblings, 1 reply; 28+ messages in thread
From: Paul E. McKenney @ 2013-02-07 16:30 UTC
To: Steven Rostedt
Cc: Ingo Molnar, Frederic Weisbecker, LKML, Alessio Igor Bogani,
Andrew Morton, Chris Metcalf, Christoph Lameter, Geoff Levand,
Gilad Ben Yossef, Hakan Akkan, Li Zhong, Namhyung Kim,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, Feb 07, 2013 at 11:12:00AM -0500, Steven Rostedt wrote:
> On Thu, 2013-02-07 at 12:10 +0100, Ingo Molnar wrote:
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > > I'll reply to this as I come up with comments.
> > >
> > > First thing is, don't call it NO_HZ_FULL. A better name would
> > > be NO_HZ_CPU. I would like to reserve NO_HZ_FULL when we
> > > totally remove jiffies :-)
> >
> > I don't think we want yet another config option named in a
> > weird way.
> >
> > What we want instead is to just split NO_HZ up into its
> > conceptual parts:
> >
> > CONFIG_NO_HZ_IDLE
> > CONFIG_NO_HZ_USER_SPACE
> > CONFIG_NO_HZ_KERNEL_SPACE
> >
> > Where the current status quo is NO_HZ_IDLE=y, and Frederic is
> > about to introduce NO_HZ_USER_SPACE=y. When jiffies get removed
> > we get NO_HZ_KERNEL_SPACE=y.
>
> Saying NO_HZ_USER_SPACE is a bit of a misnomer. As we don't just stop
> the tick for user space, but it may remain stopped when entering the
> kernel. The rule is that when there's just a single task on a CPU, the
> tick can stop (no scheduling work needed). But if the task triggers
> something that may require a tick (like printk) then the tick will start
> again. But just going into the kernel does not designate a tick restart.
>
> Maybe a better name would be NO_HZ_SINGLE_TASK ?
>
> >
> > The 'CONFIG_NO_HZ' meta-option, which we should leave for easy
> > configurability and for compatibility, should get us the
> > currently recommended default, which for the time being might
> > be:
> >
> > CONFIG_NO_HZ_IDLE=y
> > # CONFIG_NO_HZ_USER_SPACE is disabled
> >
> > Btw., you could add CONFIG_NO_HZ_KERNEL_SPACE right away, just
> > keep it false all the time. That would document our future plans
> > pretty well.
>
> Maybe the removal of jiffies would be NO_HZ_COMPLETE?
I suspect that removal of jiffies from the kernel will take a few stages,
with RCU being one of the laggards for awhile. Making RCU's state
machine depend wholly on process-based execution will take some care
and experimentation, especially for extreme and corner-case workloads.
For example, having RCU OOM the system just because a specific CPU was
unable to run some RCU kthread for an extended time is something to
be avoided. ;-)
Thanx, Paul
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 16:30 ` Paul E. McKenney
@ 2013-02-07 17:06 ` Steven Rostedt
2013-02-07 17:37 ` Paul E. McKenney
0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2013-02-07 17:06 UTC
To: paulmck
Cc: Ingo Molnar, Frederic Weisbecker, LKML, Alessio Igor Bogani,
Andrew Morton, Chris Metcalf, Christoph Lameter, Geoff Levand,
Gilad Ben Yossef, Hakan Akkan, Li Zhong, Namhyung Kim,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, 2013-02-07 at 08:30 -0800, Paul E. McKenney wrote:
> I suspect that removal of jiffies from the kernel will take a few stages,
> with RCU being one of the laggards for awhile. Making RCU's state
> machine depend wholly on process-based execution will take some care
> and experimentation, especially for extreme and corner-case workloads.
> For example, having RCU OOM the system just because a specific CPU was
> unable to run some RCU kthread for an extended time is something to
> be avoided. ;-)
Tickless doesn't mean no timeouts or periodic timers. I think we will
always have some sort of dynamic tick when needed. It will just be more
event driven than something that goes off constantly.
-- Steve
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 17:06 ` Steven Rostedt
@ 2013-02-07 17:37 ` Paul E. McKenney
0 siblings, 0 replies; 28+ messages in thread
From: Paul E. McKenney @ 2013-02-07 17:37 UTC
To: Steven Rostedt
Cc: Ingo Molnar, Frederic Weisbecker, LKML, Alessio Igor Bogani,
Andrew Morton, Chris Metcalf, Christoph Lameter, Geoff Levand,
Gilad Ben Yossef, Hakan Akkan, Li Zhong, Namhyung Kim,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, Feb 07, 2013 at 12:06:21PM -0500, Steven Rostedt wrote:
> On Thu, 2013-02-07 at 08:30 -0800, Paul E. McKenney wrote:
>
> > I suspect that removal of jiffies from the kernel will take a few stages,
> > with RCU being one of the laggards for awhile. Making RCU's state
> > machine depend wholly on process-based execution will take some care
> > and experimentation, especially for extreme and corner-case workloads.
> > For example, having RCU OOM the system just because a specific CPU was
> > unable to run some RCU kthread for an extended time is something to
> > be avoided. ;-)
>
> Tickless doesn't mean no timeouts or periodic timers. I think we will
> always have some sort of dynamic tick when needed. It will just be more
> event driven than something that goes off constantly.
As long as we don't end up replacing a single tick with multiple hrtimers
(or whatever), ending up with more overhead and disruption than we
started with. ;-)
Thanx, Paul
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 11:10 ` Ingo Molnar
2013-02-07 15:41 ` Christoph Lameter
2013-02-07 16:12 ` Steven Rostedt
@ 2013-02-07 16:25 ` Frederic Weisbecker
2013-02-07 16:41 ` Steven Rostedt
2013-02-07 19:07 ` Ingo Molnar
2 siblings, 2 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2013-02-07 16:25 UTC
To: Ingo Molnar
Cc: Steven Rostedt, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
2013/2/7 Ingo Molnar <mingo@kernel.org>:
>
> * Steven Rostedt <rostedt@goodmis.org> wrote:
>
>> I'll reply to this as I come up with comments.
>>
>> First thing is, don't call it NO_HZ_FULL. A better name would
>> be NO_HZ_CPU. I would like to reserve NO_HZ_FULL when we
>> totally remove jiffies :-)
>
> I don't think we want yet another config option named in a
> weird way.
>
> What we want instead is to just split NO_HZ up into its
> conceptual parts:
>
> CONFIG_NO_HZ_IDLE
Renaming CONFIG_NO_HZ to CONFIG_NO_HZ_IDLE is something I considered.
I was just worried about this option being present in many defconfigs.
Perhaps we can do that renaming and keep CONFIG_NO_HZ around for a little
while for backward compatibility (pretty much like what we did for
CONFIG_PERF_COUNTERS -> CONFIG_PERF_EVENTS).
> CONFIG_NO_HZ_USER_SPACE
> CONFIG_NO_HZ_KERNEL_SPACE
>
> Where the current status quo is NO_HZ_IDLE=y, and Frederic is
> about to introduce NO_HZ_USER_SPACE=y. When jiffies get removed
> we get NO_HZ_KERNEL_SPACE=y.
Note that in my tree I stop the tick in both rings (user and kernel). I
believe that restarting the tick on kernel entry isn't something we should
seriously consider. It would be a costly operation that may make
things worse. And in fact there is no big difference. Kernel space just
has more opportunities to be disturbed (RCU IPIs, async timer/work
scheduled by the kernel, etc...) and to get its tick restarted sometimes.
>
> The 'CONFIG_NO_HZ' meta-option, which we should leave for easy
> configurability and for compatibility, should get us the
> currently recommended default, which for the time being might
> be:
Ah looks like you considered the compatibility as well :)
>
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_USER_SPACE is disabled
>
> Btw., you could add CONFIG_NO_HZ_KERNEL_SPACE right away, just
> keep it false all the time. That would document our future plans
> pretty well.
>
> Once CONFIG_NO_HZ_USER_SPACE is proven problem-free, we might
> default to:
>
> CONFIG_NO_HZ_IDLE=y
> CONFIG_NO_HZ_USER_SPACE=y
>
> The goal is to have this in the distant future:
>
> CONFIG_NO_HZ=y
>
> CONFIG_NO_HZ_IDLE=y
> CONFIG_NO_HZ_USER_SPACE=y
> CONFIG_NO_HZ_KERNEL_SPACE=y
>
> And eventually we might even be able to get rid of all the 3
> variants, and only offer full-on/off.
>
> Agreed?
At least for now we seem to agree on adding CONFIG_NO_HZ_IDLE and keeping
CONFIG_NO_HZ for compatibility. Are you OK with that? If so I'll send
a patch.
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 16:25 ` Frederic Weisbecker
@ 2013-02-07 16:41 ` Steven Rostedt
2013-02-07 16:45 ` Frederic Weisbecker
2013-02-07 19:07 ` Ingo Molnar
1 sibling, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2013-02-07 16:41 UTC
To: Frederic Weisbecker
Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, 2013-02-07 at 17:25 +0100, Frederic Weisbecker wrote:
> At least for now we seem to agree on CONFIG_NO_HZ_IDLE and keep
> CONFIG_NO_HZ for compatibility. Are you ok with that? If so I'll send
> a patch.
I believe Ingo was suggesting that CONFIG_NO_HZ offer a choice of which
kind of NO_HZ you want. Something like:
config NO_HZ
	bool "Enable tickless support"

config NO_HZ_IDLE
	bool "Stop tick when CPU is idle"
	default y
	depends on NO_HZ

config NO_HZ_TASK
	bool "Stop tick on specified CPUs when single task is running"
	default n
	depends on NO_HZ

That is, if you select NO_HZ, NO_HZ_IDLE is also selected by default.
But the kernel code itself would test NO_HZ_IDLE.
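A rough illustration of that last point, as plain userspace C rather than
real kernel code. The macro is defined by hand here to stand in for what
Kconfig would generate, and idle_tick_may_stop() is a made-up name.

```c
/*
 * Stand-in for the Kconfig-generated define. In a real build this would
 * come from the generated config header, not from the source file.
 */
#define CONFIG_NO_HZ_IDLE 1

/* Code keys off the specific sub-option, not the NO_HZ meta-option. */
static int idle_tick_may_stop(void)
{
#ifdef CONFIG_NO_HZ_IDLE
	return 1;	/* idle CPUs may stop their tick */
#else
	return 0;	/* periodic tick stays on while idle */
#endif
}
```

The meta-option then only matters at configuration time, which is what
keeps old .config files working.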
-- Steve
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 16:41 ` Steven Rostedt
@ 2013-02-07 16:45 ` Frederic Weisbecker
2013-02-07 17:03 ` Steven Rostedt
0 siblings, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2013-02-07 16:45 UTC
To: Steven Rostedt
Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
2013/2/7 Steven Rostedt <rostedt@goodmis.org>:
> On Thu, 2013-02-07 at 17:25 +0100, Frederic Weisbecker wrote:
>
>> At least for now we seem to agree on CONFIG_NO_HZ_IDLE and keep
>> CONFIG_NO_HZ for compatibility. Are you ok with that? If so I'll send
>> a patch.
>
> I believe that Ingo was suggesting to have CONFIG_NO_HZ give options to
> what type of config NO_HZ you want. Something like:
>
> config NO_HZ
> bool "Enable tickless support"
>
> config NO_HZ_IDLE
> bool "Stop tick when CPU is idle"
> default y
> depends on NO_HZ
Sounds good!
>
> config NO_HZ_TASK
> bool "Stop tick on specified CPUs when single task is running"
> default n
> depends on NO_HZ
OK, I started another debate about that single-task thing. I'd rather we
not make it a fundamental component but an implementation detail that can
be dealt with dynamically in the future. Anyway, let's discuss that under
my previous answer.
>
> That is, if you select NO_HZ, by default NO_HZ_IDLE is also selected.
> But in the kernel the NO_HZ_IDLE is used.
Yeah, nice idea!
>
> -- Steve
>
>
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 16:45 ` Frederic Weisbecker
@ 2013-02-07 17:03 ` Steven Rostedt
2013-02-07 17:45 ` Frederic Weisbecker
0 siblings, 1 reply; 28+ messages in thread
From: Steven Rostedt @ 2013-02-07 17:03 UTC
To: Frederic Weisbecker
Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, 2013-02-07 at 17:45 +0100, Frederic Weisbecker wrote:
> >
> > config NO_HZ_TASK
> > bool "Stop tick on specified CPUs when single task is running"
> > default n
> > depends on NO_HZ
>
> Ok I launched another debate about that single task thing. I wish we
> don't make it a fundamental component but rather an implementation
> detail that can be dynamically dealt with in the future.
It's not just an implementation detail, as it is very visible to the
user. If they want to take advantage of a task NO_HZ they have to jump
through a few hoops to make sure only a single task is running on a
CPU. We should be broadcasting this requirement to educate the users on
exactly how they can take advantage of this feature.
> Anyway let's
> talk about that on my previous answer.
I already did ;-)
-- Steve
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 17:03 ` Steven Rostedt
@ 2013-02-07 17:45 ` Frederic Weisbecker
0 siblings, 0 replies; 28+ messages in thread
From: Frederic Weisbecker @ 2013-02-07 17:45 UTC
To: Steven Rostedt
Cc: Ingo Molnar, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
2013/2/7 Steven Rostedt <rostedt@goodmis.org>:
> On Thu, 2013-02-07 at 17:45 +0100, Frederic Weisbecker wrote:
>
>> >
>> > config NO_HZ_TASK
>> > bool "Stop tick on specified CPUs when single task is running"
>> > default n
>> > depends on NO_HZ
>>
>> Ok I launched another debate about that single task thing. I wish we
>> don't make it a fundamental component but rather an implementation
>> detail that can be dynamically dealt with in the future.
>
> It's not just an implementation detail, as it is very visible to the
> user. If they want to take advantage of a task NO_HZ they have to go
> through a bit of loops to make sure only a single task is running on a
> CPU. We should be broadcasting this requirement to educate the users on
> exactly how they can take advantage of this feature.
If you guys really insist I can make it CONFIG_NO_HZ_SINGLETASK. I
don't mind that much. Then when we support hrtick we can rename it to
NO_HZ_FULL or whatever.
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 16:25 ` Frederic Weisbecker
2013-02-07 16:41 ` Steven Rostedt
@ 2013-02-07 19:07 ` Ingo Molnar
2013-02-07 19:19 ` Steven Rostedt
2013-02-08 15:51 ` Frederic Weisbecker
1 sibling, 2 replies; 28+ messages in thread
From: Ingo Molnar @ 2013-02-07 19:07 UTC
To: Frederic Weisbecker
Cc: Steven Rostedt, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 2013/2/7 Ingo Molnar <mingo@kernel.org>:
> >
> > * Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> >> I'll reply to this as I come up with comments.
> >>
> >> First thing is, don't call it NO_HZ_FULL. A better name would
> >> be NO_HZ_CPU. I would like to reserve NO_HZ_FULL when we
> >> totally remove jiffies :-)
> >
> > I don't think we want yet another config option named in a
> > weird way.
> >
> > What we want instead is to just split NO_HZ up into its
> > conceptual parts:
> >
> > CONFIG_NO_HZ_IDLE
>
> Renaming CONFIG_NO_HZ to CONFIG_NO_HZ_IDLE is something I
> considered. I was just worried about this option being present
> in many defconfig.
I don't think renaming it is an option - it's present not just
in defconfigs, but in various distro configs, etc.
But we can add new config variables and use the existing
CONFIG_NO_HZ value to set their default values.
> Perhaps we can do that renaming and keep CONFIG_NO_HZ around a
> little while for backward compatibility (pretty much like what
> we've done for CONFIG_PERF_COUNTERS -> CONFIG_PERF_EVENTS).
Yes.
> > CONFIG_NO_HZ_USER_SPACE
> > CONFIG_NO_HZ_KERNEL_SPACE
> >
> > Where the current status quo is NO_HZ_IDLE=y, and Frederic is
> > about to introduce NO_HZ_USER_SPACE=y. When jiffies get removed
> > we get NO_HZ_KERNEL_SPACE=y.
>
> Note on my tree I stop the tick on both rings. I believe that
> restarting the tick on kernel entry isn't something we should
> seriously consider. It would be a costly operation that may
> make things worse. And in fact there is no big difference.
> Just kernelspace has more opportunities to be disturbed (RCU
> IPIs, async timer/work scheduled by the kernel, etc...) and
> get its tick restarted sometimes.
Ok.
Could we just simplify things and make this an unconditional
option of NO_HZ? Any reason why we'd want to make this
configurable, other than debugging?
I'm worried about the proliferation of not easily separable
config options. We already have way too many timer and scheduler
options to begin with.
> At least for now we seem to agree on CONFIG_NO_HZ_IDLE and
> keep CONFIG_NO_HZ for compatibility. Are you ok with that? If
> so I'll send a patch.
What would be the name of the new config option?
Can we just keep CONFIG_NO_HZ and extend it with your bits, and
make sure they work well?
Thanks,
Ingo
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 19:07 ` Ingo Molnar
@ 2013-02-07 19:19 ` Steven Rostedt
2013-02-08 15:51 ` Frederic Weisbecker
1 sibling, 0 replies; 28+ messages in thread
From: Steven Rostedt @ 2013-02-07 19:19 UTC
To: Ingo Molnar
Cc: Frederic Weisbecker, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
On Thu, 2013-02-07 at 20:07 +0100, Ingo Molnar wrote:
> Could we just simplify things and make this an unconditional
> option of NO_HZ? Any reason why we'd want to make this
> configurable, other than debugging?
I think the worry is the overhead required to keep it active. It
requires context_tracking to be enabled. Although, we may be able to
have both working.
Frederic, can we switch between context_tracking-based timing and
tick-based timing at run time?
If we can have it enabled without overhead then I see no problem with
it. We still need the boot time kernel parameter to implement it. Hmm,
even if we can't dynamically switch between context_tracking and
tick-based timing, we could make that decision at boot up based on the
kernel parameters.
>
> I'm worried about the proliferation of not easily separable
> config options. We already have way too many timer and scheduler
> options to begin with.
I agree.
>
> > At least for now we seem to agree on CONFIG_NO_HZ_IDLE and
> > keep CONFIG_NO_HZ for compatibility. Are you ok with that? If
> > so I'll send a patch.
>
> What would be the name of the new config option?
>
> Can we just keep CONFIG_NO_HZ and extend it with your bits, and
> make sure they work well?
As long as we do not introduce performance regressions. If we can keep
it active without causing the system to slow down when not in use, then
I think it should be always enabled if CONFIG_NO_HZ is selected.
-- Steve
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-07 19:07 ` Ingo Molnar
2013-02-07 19:19 ` Steven Rostedt
@ 2013-02-08 15:51 ` Frederic Weisbecker
2013-02-11 9:59 ` Ingo Molnar
1 sibling, 1 reply; 28+ messages in thread
From: Frederic Weisbecker @ 2013-02-08 15:51 UTC
To: Ingo Molnar
Cc: Steven Rostedt, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
2013/2/7 Ingo Molnar <mingo@kernel.org>:
>
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
>
>> 2013/2/7 Ingo Molnar <mingo@kernel.org>:
>> >
>> > * Steven Rostedt <rostedt@goodmis.org> wrote:
>> >
>> >> I'll reply to this as I come up with comments.
>> >>
>> >> First thing is, don't call it NO_HZ_FULL. A better name would
>> >> be NO_HZ_CPU. I would like to reserve NO_HZ_FULL when we
>> >> totally remove jiffies :-)
>> >
>> > I don't think we want yet another config option named in a
>> > weird way.
>> >
>> > What we want instead is to just split NO_HZ up into its
>> > conceptual parts:
>> >
>> > CONFIG_NO_HZ_IDLE
>>
>> Renaming CONFIG_NO_HZ to CONFIG_NO_HZ_IDLE is something I
>> considered. I was just worried about this option being present
>> in many defconfig.
>
> I don't think renaming it is an option - it's present not just
> in defconfigs, but in various distro configs, etc.
>
> But we can add new config variables and use the existing
> CONFIG_NO_HZ value to set their default values.
Sure.
>> Note on my tree I stop the tick on both rings. I believe that
>> restarting the tick on kernel entry isn't something we should
>> seriously consider. It would be a costly operation that may
>> make things worse. And in fact there is no big difference.
>> Just kernelspace has more opportunities to be disturbed (RCU
>> IPIs, async timer/work scheduled by the kernel, etc...) and
>> get its tick restarted sometimes.
>
> Ok.
>
> Could we just simplify things and make this an unconditional
> option of NO_HZ? Any reason why we'd want to make this
> configurable, other than debugging?
>
> I'm worried about the proliferation of not easily separable
> config options. We already have way too many timer and scheduler
> options to begin with.
Like Steve said, this is for overhead reasons. Syscalls take the
slow path, so that's OK. But we add a callback to every exception, irq
entry/exit, scheduler context switch, signal handling path, and user and
kernel preemption point. All of this could be reduced using static keys,
but even that doesn't make me comfortable with the idea.
Moreover, for now this is going to be used only for extreme use cases
such as real time and HPC. If we really have to merge this into an
all-in-one nohz Kconfig option, I suggest we wait for the feature to
mature a bit and prove that it can be useful beyond those specialized
workloads, and also that we can ensure its off-case overhead is not
significant.
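For readers unfamiliar with the idea, the pattern looks roughly like this.
It is a userspace sketch only: a plain flag with a branch hint stands in for
a real static key (which patches the branch in the kernel text instead of
testing a variable), and every name below is hypothetical.

```c
#include <stdbool.h>

/* Stand-in for a static key: off by default, flipped once at boot. */
static bool context_tracking_on;
/* Counts how often the slow-path callback actually ran. */
static unsigned long hook_calls;

/* The expensive callback we only want to pay for when the feature is on. */
static void context_tracking_hook(void)
{
	hook_calls++;
}

/* Simulated kernel entry point (exception, irq, preemption point, ...). */
static inline void kernel_entry(void)
{
	/* __builtin_expect is a GCC/Clang hint that the flag is usually false. */
	if (__builtin_expect(context_tracking_on, 0))
		context_tracking_hook();
}
```

With the flag off, each entry still pays for one predictable test, and with
a real static key even that shrinks to a patched no-op. The hooks remain
spread over every entry path, though, which is the part that still worries me.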
* Re: [ANNOUNCE] 3.8-rc6-nohz4
2013-02-08 15:51 ` Frederic Weisbecker
@ 2013-02-11 9:59 ` Ingo Molnar
0 siblings, 0 replies; 28+ messages in thread
From: Ingo Molnar @ 2013-02-11 9:59 UTC
To: Frederic Weisbecker
Cc: Steven Rostedt, LKML, Alessio Igor Bogani, Andrew Morton,
Chris Metcalf, Christoph Lameter, Geoff Levand, Gilad Ben Yossef,
Hakan Akkan, Li Zhong, Namhyung Kim, Paul E. McKenney,
Paul Gortmaker, Peter Zijlstra, Thomas Gleixner
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> > I'm worried about the proliferation of not easily separable
> > config options. We already have way too many timer and
> > scheduler options to begin with.
>
> Like Steve said, this is for overhead reasons. The syscall
> uses the slow path so that's ok. But we add a callback to
> every exception, irq entry/exit, scheduler sched switch,
> signal handling, user and kernel preemption point. This all
> could be lowered using static keys but even that doesn't make
> me feel comfortable with this idea.
>
> Moreover, for now this is going to be used only on extreme
> usecases such as real time and HPC. If we really have to merge
> this into an all-in-one nohz kconfig, I suggest we wait for
> the feature to mature a bit and prove that it can be useful
> beyond those specialized workloads, and also that we can
> ensure its off-case overhead is not significant.
I have no problem with making it an option initially - as long
as the options are logically named and interconnected.
In terms of overhead, a big plus is the reduction in user-space
execution overhead. At HZ=1000 we easily have 0.5%-1.0% overhead
currently. That is a *lot* of overhead if the box does mostly
user-space execution - which most boxes do, servers and
desktops alike, not just HPC systems.
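As a back-of-the-envelope check of those figures: 0.5%-1.0% at HZ=1000 works
out to roughly 5-10 microseconds spent per tick. The helper below is only
that arithmetic written out; the function name is made up and the per-tick
costs are derived from the percentages above, not measured.

```c
/*
 * Percentage of CPU time consumed by a periodic tick that fires hz times
 * per second and costs tick_cost_us microseconds each time.
 */
static double tick_overhead_percent(unsigned int hz, double tick_cost_us)
{
	/* microseconds of tick work per second, as a share of 1e6 us */
	return hz * tick_cost_us / 1e6 * 100.0;
}
```

tick_overhead_percent(1000, 5.0) gives 0.5 and tick_overhead_percent(1000, 10.0)
gives 1.0, so the range quoted above corresponds to 5-10 us per tick.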
Thanks,
Ingo