linux-kernel.vger.kernel.org archive mirror
* [PATCH] clockevents: Per cpu tick skew boot option
@ 2012-05-06 12:58 Mike Galbraith
  2012-05-06 13:10 ` Mike Galbraith
  2012-05-07 19:17 ` Thomas Gleixner
  0 siblings, 2 replies; 9+ messages in thread
From: Mike Galbraith @ 2012-05-06 12:58 UTC (permalink / raw)
  To: LKML; +Cc: Thomas Gleixner

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867
Historically, Linux has tried to make the regular timer tick on the
various CPUs not happen at the same time, to avoid contention on
xtime_lock.
    
Nowadays, with the tickless kernel, this contention no longer happens
since time keeping and updating are done differently. In addition,
this skew is actually hurting power consumption in a measurable way on
many-core systems.
End quote

Problems:

- Contrary to the above, systems do encounter contention on both
  xtime_lock and RCU structure locks when the tick is synchronized.
  
- Moderate sized RT systems suffer intolerable jitter due to the tick
  being synchronized.

- SGI reports the same for their large systems.

- Fully utilized systems reap no power saving benefit from skew removal,
  but do suffer from resulting induced lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>

---
 Documentation/kernel-parameters.txt |    5 +++++
 kernel/time/tick-sched.c            |   19 +++++++++++++++++++
 2 files changed, 24 insertions(+)

--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2426,6 +2426,11 @@ bytes respectively. Such letter suffixes
 
 	sched_debug	[KNL] Enables verbose scheduler debug messages.
 
+	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
+			xtime_lock contention on larger systems.  Note: increases
+			power consumption, and should only be enabled if running
+			jitter sensitive (HPC/RT) workloads.
+
 	security=	[SECURITY] Choose a security module to enable at boot.
 			If this boot parameter is not specified, only the first
 			security module asking for security registration will be
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -814,6 +814,8 @@ static enum hrtimer_restart tick_sched_t
 	return HRTIMER_RESTART;
 }
 
+static int sched_skew_tick;
+
 /**
  * tick_setup_sched_timer - setup the tick emulation timer
  */
@@ -831,6 +833,14 @@ void tick_setup_sched_timer(void)
 	/* Get the next period (per cpu) */
 	hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
 
+	/* Offset the tick to avert xtime_lock contention. */
+	if (sched_skew_tick) {
+		u64 offset = ktime_to_ns(tick_period) >> 1;
+		do_div(offset, num_possible_cpus());
+		offset *= smp_processor_id();
+		hrtimer_add_expires_ns(&ts->sched_timer, offset);
+	}
+
 	for (;;) {
 		hrtimer_forward(&ts->sched_timer, now, tick_period);
 		hrtimer_start_expires(&ts->sched_timer,
@@ -910,3 +920,12 @@ int tick_check_oneshot_change(int allow_
 	tick_nohz_switch_to_nohz();
 	return 0;
 }
+
+static int __init skew_tick(char *str)
+{
+	get_option(&str, &sched_skew_tick);
+
+	return 0;
+}
+early_param("skew_tick", skew_tick);
+
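
For illustration, the offset arithmetic in the tick-sched.c hunk above can
be sketched in userspace C. This is only a sketch under stated assumptions:
skew_offset_ns() and its parameters are hypothetical names, plain integer
division stands in for the kernel's do_div(), and the caller supplies the
tick period and CPU count that the patch reads from tick_period and
num_possible_cpus().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the per-CPU tick skew in the patch.
 * tick_period_ns: length of one tick in nanoseconds (assumed input)
 * ncpus:          number of possible CPUs (assumed input, must be > 0)
 * cpu:            id of the CPU whose offset is wanted
 */
static uint64_t skew_offset_ns(uint64_t tick_period_ns,
			       unsigned int ncpus, unsigned int cpu)
{
	uint64_t offset;

	assert(ncpus > 0);

	offset = tick_period_ns >> 1;	/* spread ticks over half a period */
	offset /= ncpus;		/* per-CPU step, truncating division */
	offset *= cpu;			/* CPU 0 keeps an offset of 0 */

	return offset;
}
```

Assuming HZ=1000 (a 1000000 ns tick period) and 48 possible CPUs, the step
is 10416 ns, so CPU 1 would be offset by 10416 ns and CPU 47 by 489552 ns.
As long as cpu is below ncpus, every offset stays under half a tick period.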




* Re: [PATCH] clockevents: Per cpu tick skew boot option
  2012-05-06 12:58 [PATCH] clockevents: Per cpu tick skew boot option Mike Galbraith
@ 2012-05-06 13:10 ` Mike Galbraith
  2012-05-07 19:17 ` Thomas Gleixner
  1 sibling, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2012-05-06 13:10 UTC (permalink / raw)
  To: LKML; +Cc: Thomas Gleixner

On Sun, 2012-05-06 at 14:58 +0200, Mike Galbraith wrote:

> - 0209f649 rcu: limit rcu_node leaf-level fanout
>   This patch was born to combat lock contention which testing showed
>   to have been _induced by_ skew removal.  Skew the tick, contention
>   disappeared virtually completely.

P.S.  This boot option patch will complement a proposed patch by Paul
McKenney to allow boot time fanout configuration.  Between the two,
users will then be able to boot time select the tradeoff they need for
optimal performance of their systems.

-Mike




* Re: [PATCH] clockevents: Per cpu tick skew boot option
  2012-05-06 12:58 [PATCH] clockevents: Per cpu tick skew boot option Mike Galbraith
  2012-05-06 13:10 ` Mike Galbraith
@ 2012-05-07 19:17 ` Thomas Gleixner
  2012-05-08  3:20   ` Mike Galbraith
  1 sibling, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2012-05-07 19:17 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML

On Sun, 6 May 2012, Mike Galbraith wrote:
>  
> +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> +			xtime_lock contention on larger systems.  Note: increases
> +			power consumption, and should only be enabled if running
> +			jitter sensitive (HPC/RT) workloads.
> +

The "=" is wrong as skew_tick should not take parameters. It's
disabled by default. So "skew_tick" simply enables it, right ?



* Re: [PATCH] clockevents: Per cpu tick skew boot option
  2012-05-07 19:17 ` Thomas Gleixner
@ 2012-05-08  3:20   ` Mike Galbraith
  2012-05-08  9:44     ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Mike Galbraith @ 2012-05-08  3:20 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML

On Mon, 2012-05-07 at 21:17 +0200, Thomas Gleixner wrote: 
> On Sun, 6 May 2012, Mike Galbraith wrote:
> >  
> > +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> > +			xtime_lock contention on larger systems.  Note: increases
> > +			power consumption, and should only be enabled if running
> > +			jitter sensitive (HPC/RT) workloads.
> > +
> 
> The "=" is wrong as skew_tick should not take parameters. It's
> disabled by default. So "skew_tick" simply enables it, right ?

Unless as I have RT set up, it's turned on by default, so '=' lets the
user turn it back off.

-Mike



* Re: [PATCH] clockevents: Per cpu tick skew boot option
  2012-05-08  3:20   ` Mike Galbraith
@ 2012-05-08  9:44     ` Thomas Gleixner
  2012-05-08 10:20       ` Mike Galbraith
  0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2012-05-08  9:44 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML

On Tue, 8 May 2012, Mike Galbraith wrote:

> On Mon, 2012-05-07 at 21:17 +0200, Thomas Gleixner wrote: 
> > On Sun, 6 May 2012, Mike Galbraith wrote:
> > >  
> > > +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> > > +			xtime_lock contention on larger systems.  Note: increases
> > > +			power consumption, and should only be enabled if running
> > > +			jitter sensitive (HPC/RT) workloads.
> > > +
> > 
> > The "=" is wrong as skew_tick should not take parameters. It's
> > disabled by default. So "skew_tick" simply enables it, right ?
> 
> Unless as I have RT set up, it's turned on by default, so '=' lets the
> user turn it back off.

Then the doc should say what the parameter after the "=" is :)


* Re: [PATCH] clockevents: Per cpu tick skew boot option
  2012-05-08  9:44     ` Thomas Gleixner
@ 2012-05-08 10:20       ` Mike Galbraith
  2012-05-10 18:16         ` Paul E. McKenney
  2012-05-24 23:52         ` [tip:timers/core] tick: Add " tip-bot for Mike Galbraith
  0 siblings, 2 replies; 9+ messages in thread
From: Mike Galbraith @ 2012-05-08 10:20 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: LKML

On Tue, 2012-05-08 at 11:44 +0200, Thomas Gleixner wrote: 
> On Tue, 8 May 2012, Mike Galbraith wrote:
> 
> > On Mon, 2012-05-07 at 21:17 +0200, Thomas Gleixner wrote: 
> > > On Sun, 6 May 2012, Mike Galbraith wrote:
> > > >  
> > > > +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> > > > +			xtime_lock contention on larger systems.  Note: increases
> > > > +			power consumption, and should only be enabled if running
> > > > +			jitter sensitive (HPC/RT) workloads.
> > > > +
> > > 
> > > The "=" is wrong as skew_tick should not take parameters. It's
> > > disabled by default. So "skew_tick" simply enables it, right ?
> > 
> > Unless as I have RT set up, it's turned on by default, so '=' lets the
> > user turn it back off.
> 
> Then the doc should say what the parameter after the "=" is :)

I only put anything there because boss said "Document", I was hiding it
along with fugly but damn useful <koff> HPC/RT cpuset patch ;-)

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867
Historically, Linux has tried to make the regular timer tick on the
various CPUs not happen at the same time, to avoid contention on
xtime_lock.
    
Nowadays, with the tickless kernel, this contention no longer happens
since time keeping and updating are done differently. In addition,
this skew is actually hurting power consumption in a measurable way on
many-core systems.
End quote

Problems:

- Contrary to the above, systems do encounter contention on both
  xtime_lock and RCU structure locks when the tick is synchronized.
  
- Moderate sized RT systems suffer intolerable jitter due to the tick
  being synchronized.

- SGI reports the same for their large systems.

- Fully utilized systems reap no power saving benefit from skew removal,
  but do suffer from resulting induced lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>

---
 Documentation/kernel-parameters.txt |    9 +++++++++
 kernel/time/tick-sched.c            |   19 +++++++++++++++++++
 2 files changed, 28 insertions(+)

--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2426,6 +2426,15 @@ bytes respectively. Such letter suffixes
 
 	sched_debug	[KNL] Enables verbose scheduler debug messages.
 
+	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
+			xtime_lock contention on larger systems, and/or RCU lock
+			contention on all systems with CONFIG_MAXSMP set.
+			Format: { "0" | "1" }
+			0 -- disable. (may be 1 via CONFIG_CMDLINE="skew_tick=1"
+			1 -- enable.
+			Note: increases power consumption, thus should only be
+			enabled if running jitter sensitive (HPC/RT) workloads.
+
 	security=	[SECURITY] Choose a security module to enable at boot.
 			If this boot parameter is not specified, only the first
 			security module asking for security registration will be
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -814,6 +814,8 @@ static enum hrtimer_restart tick_sched_t
 	return HRTIMER_RESTART;
 }
 
+static int sched_skew_tick;
+
 /**
  * tick_setup_sched_timer - setup the tick emulation timer
  */
@@ -831,6 +833,14 @@ void tick_setup_sched_timer(void)
 	/* Get the next period (per cpu) */
 	hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
 
+	/* Offset the tick to avert xtime_lock contention. */
+	if (sched_skew_tick) {
+		u64 offset = ktime_to_ns(tick_period) >> 1;
+		do_div(offset, num_possible_cpus());
+		offset *= smp_processor_id();
+		hrtimer_add_expires_ns(&ts->sched_timer, offset);
+	}
+
 	for (;;) {
 		hrtimer_forward(&ts->sched_timer, now, tick_period);
 		hrtimer_start_expires(&ts->sched_timer,
@@ -910,3 +920,12 @@ int tick_check_oneshot_change(int allow_
 	tick_nohz_switch_to_nohz();
 	return 0;
 }
+
+static int __init skew_tick(char *str)
+{
+	get_option(&str, &sched_skew_tick);
+
+	return 0;
+}
+early_param("skew_tick", skew_tick);
+
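
As a rough illustration of the documented Format: { "0" | "1" }, the
value handling can be modelled in userspace C. This is a sketch, not the
kernel code: parse_skew_tick() is a hypothetical name, and it uses strtol()
instead of the kernel's get_option(), so edge cases may behave differently.
It also normalizes the result to 0 or 1, whereas the patch stores the
parsed integer and only tests it for non-zero.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for the skew_tick() handler.
 * val:     text following "skew_tick=" (may be NULL or empty)
 * enabled: current setting, updated only when a number is parsed
 * Returns 1 if a value was parsed, 0 if the setting was left alone.
 */
static int parse_skew_tick(const char *val, int *enabled)
{
	char *end;
	long v;

	assert(enabled != NULL);

	if (val == NULL || *val == '\0')
		return 0;	/* nothing to parse: keep current setting */

	v = strtol(val, &end, 0);
	if (end == val)
		return 0;	/* no digits found: keep current setting */

	*enabled = (v != 0);	/* any non-zero value enables the skew */
	return 1;
}
```

In this model, a build that defaults to 1 (as in the RT setup mentioned
earlier in the thread) can be switched back off by passing "0", and a
missing or empty value leaves the default untouched.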




* Re: [PATCH] clockevents: Per cpu tick skew boot option
  2012-05-08 10:20       ` Mike Galbraith
@ 2012-05-10 18:16         ` Paul E. McKenney
  2012-05-23 15:23           ` [PATCH v3] " Mike Galbraith
  2012-05-24 23:52         ` [tip:timers/core] tick: Add " tip-bot for Mike Galbraith
  1 sibling, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2012-05-10 18:16 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: Thomas Gleixner, LKML

On Tue, May 08, 2012 at 12:20:58PM +0200, Mike Galbraith wrote:
> On Tue, 2012-05-08 at 11:44 +0200, Thomas Gleixner wrote: 
> > On Tue, 8 May 2012, Mike Galbraith wrote:
> > 
> > > On Mon, 2012-05-07 at 21:17 +0200, Thomas Gleixner wrote: 
> > > > On Sun, 6 May 2012, Mike Galbraith wrote:
> > > > >  
> > > > > +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> > > > > +			xtime_lock contention on larger systems.  Note: increases
> > > > > +			power consumption, and should only be enabled if running
> > > > > +			jitter sensitive (HPC/RT) workloads.
> > > > > +
> > > > 
> > > > The "=" is wrong as skew_tick should not take parameters. It's
> > > > disabled by default. So "skew_tick" simply enables it, right ?
> > > 
> > > Unless as I have RT set up, it's turned on by default, so '=' lets the
> > > user turn it back off.
> > 
> > Then the doc should say what the parameter after the "=" is :)
> 
> I only put anything there because boss said "Document", I was hiding it
> along with fugly but damn useful <koff> HPC/RT cpuset patch ;-)
> 
> Let the user decide whether power consumption or jitter is the
> more important consideration for their machines.
> 
> Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867
> Historically, Linux has tried to make the regular timer tick on the
> various CPUs not happen at the same time, to avoid contention on
> xtime_lock.
>     
> Nowadays, with the tickless kernel, this contention no longer happens
> since time keeping and updating are done differently. In addition,
> this skew is actually hurting power consumption in a measurable way on
> many-core systems.
> End quote
> 
> Problems:
> 
> - Contrary to the above, systems do encounter contention on both
>   xtime_lock and RCU structure locks when the tick is synchronized.
>   
> - Moderate sized RT systems suffer intolerable jitter due to the tick
>   being synchronized.
> 
> - SGI reports the same for their large systems.
> 
> - Fully utilized systems reap no power saving benefit from skew removal,
>   but do suffer from resulting induced lock contention.
> 
> - 0209f649 rcu: limit rcu_node leaf-level fanout
>   This patch was born to combat lock contention which testing showed
>   to have been _induced by_ skew removal.  Skew the tick, contention
>   disappeared virtually completely.
> 
> Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
> 
> ---
>  Documentation/kernel-parameters.txt |    9 +++++++++
>  kernel/time/tick-sched.c            |   19 +++++++++++++++++++
>  2 files changed, 28 insertions(+)
> 
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2426,6 +2426,15 @@ bytes respectively. Such letter suffixes
> 
>  	sched_debug	[KNL] Enables verbose scheduler debug messages.
> 
> +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> +			xtime_lock contention on larger systems, and/or RCU lock
> +			contention on all systems with CONFIG_MAXSMP set.

Suggest instead:

			contention on systems with large CONFIG_RCU_FANOUT
			values.

> +			Format: { "0" | "1" }
> +			0 -- disable. (may be 1 via CONFIG_CMDLINE="skew_tick=1"

Suggest simply:

			0 -- disable (default for typical kernel builds).

With these changes:

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> +			1 -- enable.
> +			Note: increases power consumption, thus should only be
> +			enabled if running jitter sensitive (HPC/RT) workloads.
> +
>  	security=	[SECURITY] Choose a security module to enable at boot.
>  			If this boot parameter is not specified, only the first
>  			security module asking for security registration will be
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -814,6 +814,8 @@ static enum hrtimer_restart tick_sched_t
>  	return HRTIMER_RESTART;
>  }
> 
> +static int sched_skew_tick;
> +
>  /**
>   * tick_setup_sched_timer - setup the tick emulation timer
>   */
> @@ -831,6 +833,14 @@ void tick_setup_sched_timer(void)
>  	/* Get the next period (per cpu) */
>  	hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
> 
> +	/* Offset the tick to avert xtime_lock contention. */
> +	if (sched_skew_tick) {
> +		u64 offset = ktime_to_ns(tick_period) >> 1;
> +		do_div(offset, num_possible_cpus());
> +		offset *= smp_processor_id();
> +		hrtimer_add_expires_ns(&ts->sched_timer, offset);
> +	}
> +
>  	for (;;) {
>  		hrtimer_forward(&ts->sched_timer, now, tick_period);
>  		hrtimer_start_expires(&ts->sched_timer,
> @@ -910,3 +920,12 @@ int tick_check_oneshot_change(int allow_
>  	tick_nohz_switch_to_nohz();
>  	return 0;
>  }
> +
> +static int __init skew_tick(char *str)
> +{
> +	get_option(&str, &sched_skew_tick);
> +
> +	return 0;
> +}
> +early_param("skew_tick", skew_tick);
> +
> 
> 



* [PATCH v3] clockevents: Per cpu tick skew boot option
  2012-05-10 18:16         ` Paul E. McKenney
@ 2012-05-23 15:23           ` Mike Galbraith
  0 siblings, 0 replies; 9+ messages in thread
From: Mike Galbraith @ 2012-05-23 15:23 UTC (permalink / raw)
  To: paulmck; +Cc: Thomas Gleixner, LKML

On Thu, 2012-05-10 at 11:16 -0700, Paul E. McKenney wrote:

> > --- a/Documentation/kernel-parameters.txt
> > +++ b/Documentation/kernel-parameters.txt
> > @@ -2426,6 +2426,15 @@ bytes respectively. Such letter suffixes
> > 
> >  	sched_debug	[KNL] Enables verbose scheduler debug messages.
> > 
> > +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> > +			xtime_lock contention on larger systems, and/or RCU lock
> > +			contention on all systems with CONFIG_MAXSMP set.
> 
> Suggest instead:
> 
> 			contention on systems with large CONFIG_RCU_FANOUT
> 			values.
> 
> > +			Format: { "0" | "1" }
> > +			0 -- disable. (may be 1 via CONFIG_CMDLINE="skew_tick=1"
> 
> Suggest simply:
> 
> 			0 -- disable (default for typical kernel builds).
> 
> With these changes:
> 
> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Hunted down round-tuit.

clockevents: Per cpu tick skew boot option

Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867
Historically, Linux has tried to make the regular timer tick on the
various CPUs not happen at the same time, to avoid contention on
xtime_lock.
    
Nowadays, with the tickless kernel, this contention no longer happens
since time keeping and updating are done differently. In addition,
this skew is actually hurting power consumption in a measurable way on
many-core systems.
End quote

Problems:
- Contrary to the above, all systems do encounter contention on
  xtime_lock and RCU structure locks when the tick is synchronized.

- Large systems and moderate sized RT systems suffer intolerable
  jitter with the tick synchronized.

- Fully utilized systems reap no power saving benefit, but do
  suffer from synchronized tick lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.  Measured latency on 48 core box
  was >330us.  Revert, and restore skew, it dropped back to ~70us.
  We absorbed a 400% latency increase to combat induced contention.

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

---
 Documentation/kernel-parameters.txt |    9 +++++++++
 kernel/time/tick-sched.c            |   19 +++++++++++++++++++
 2 files changed, 28 insertions(+)

--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2426,6 +2426,15 @@ bytes respectively. Such letter suffixes
 
 	sched_debug	[KNL] Enables verbose scheduler debug messages.
 
+	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
+			xtime_lock contention on larger systems, and/or RCU lock
+			contention on systems with large CONFIG_RCU_FANOUT values.
+			Format: { "0" | "1" }
+			0 -- disable (default for typical kernel builds).
+			1 -- enable.
+			Note: increases power consumption, thus should only be
+			enabled if running jitter sensitive (HPC/RT) workloads.
+
 	security=	[SECURITY] Choose a security module to enable at boot.
 			If this boot parameter is not specified, only the first
 			security module asking for security registration will be
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -814,6 +814,8 @@ static enum hrtimer_restart tick_sched_t
 	return HRTIMER_RESTART;
 }
 
+static int sched_skew_tick;
+
 /**
  * tick_setup_sched_timer - setup the tick emulation timer
  */
@@ -831,6 +833,14 @@ void tick_setup_sched_timer(void)
 	/* Get the next period (per cpu) */
 	hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
 
+	/* Offset the tick to avert xtime_lock contention. */
+	if (sched_skew_tick) {
+		u64 offset = ktime_to_ns(tick_period) >> 1;
+		do_div(offset, num_possible_cpus());
+		offset *= smp_processor_id();
+		hrtimer_add_expires_ns(&ts->sched_timer, offset);
+	}
+
 	for (;;) {
 		hrtimer_forward(&ts->sched_timer, now, tick_period);
 		hrtimer_start_expires(&ts->sched_timer,
@@ -910,3 +920,12 @@ int tick_check_oneshot_change(int allow_
 	tick_nohz_switch_to_nohz();
 	return 0;
 }
+
+static int __init skew_tick(char *str)
+{
+	get_option(&str, &sched_skew_tick);
+
+	return 0;
+}
+early_param("skew_tick", skew_tick);
+




* [tip:timers/core] tick: Add tick skew boot option
  2012-05-08 10:20       ` Mike Galbraith
  2012-05-10 18:16         ` Paul E. McKenney
@ 2012-05-24 23:52         ` tip-bot for Mike Galbraith
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot for Mike Galbraith @ 2012-05-24 23:52 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, mgalbraith, hpa, mingo, tglx

Commit-ID:  5307c9556bc17e3cd26d4e94fc3b2565921834de
Gitweb:     http://git.kernel.org/tip/5307c9556bc17e3cd26d4e94fc3b2565921834de
Author:     Mike Galbraith <mgalbraith@suse.de>
AuthorDate: Tue, 8 May 2012 12:20:58 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Fri, 25 May 2012 01:44:50 +0200

tick: Add tick skew boot option

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867:

"Historically, Linux has tried to make the regular timer tick on the
 various CPUs not happen at the same time, to avoid contention on
 xtime_lock.
    
 Nowadays, with the tickless kernel, this contention no longer happens
 since time keeping and updating are done differently. In addition,
 this skew is actually hurting power consumption in a measurable way on
 many-core systems."

Problems:

- Contrary to the above, systems do encounter contention on both
  xtime_lock and RCU structure locks when the tick is synchronized.
  
- Moderate sized RT systems suffer intolerable jitter due to the tick
  being synchronized.

- SGI reports the same for their large systems.

- Fully utilized systems reap no power saving benefit from skew removal,
  but do suffer from resulting induced lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.

Signed-off-by: Mike Galbraith <mgalbraith@suse.de>
Link: http://lkml.kernel.org/r/1336472458.21924.78.camel@marge.simpson.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

---
 Documentation/kernel-parameters.txt |    9 +++++++++
 kernel/time/tick-sched.c            |   18 ++++++++++++++++++
 2 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index b69cfdc12..ea38cd1 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2532,6 +2532,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 
 	sched_debug	[KNL] Enables verbose scheduler debug messages.
 
+	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
+			xtime_lock contention on larger systems, and/or RCU lock
+			contention on all systems with CONFIG_MAXSMP set.
+			Format: { "0" | "1" }
+			0 -- disable. (may be 1 via CONFIG_CMDLINE="skew_tick=1"
+			1 -- enable.
+			Note: increases power consumption, thus should only be
+			enabled if running jitter sensitive (HPC/RT) workloads.
+
 	security=	[SECURITY] Choose a security module to enable at boot.
 			If this boot parameter is not specified, only the first
 			security module asking for security registration will be
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6a3a5b9..4eddbb5 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -814,6 +814,8 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
 	return HRTIMER_RESTART;
 }
 
+static int sched_skew_tick;
+
 /**
  * tick_setup_sched_timer - setup the tick emulation timer
  */
@@ -831,6 +833,14 @@ void tick_setup_sched_timer(void)
 	/* Get the next period (per cpu) */
 	hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
 
+	/* Offset the tick to avert xtime_lock contention. */
+	if (sched_skew_tick) {
+		u64 offset = ktime_to_ns(tick_period) >> 1;
+		do_div(offset, num_possible_cpus());
+		offset *= smp_processor_id();
+		hrtimer_add_expires_ns(&ts->sched_timer, offset);
+	}
+
 	for (;;) {
 		hrtimer_forward(&ts->sched_timer, now, tick_period);
 		hrtimer_start_expires(&ts->sched_timer,
@@ -910,3 +920,11 @@ int tick_check_oneshot_change(int allow_nohz)
 	tick_nohz_switch_to_nohz();
 	return 0;
 }
+
+static int __init skew_tick(char *str)
+{
+	get_option(&str, &sched_skew_tick);
+
+	return 0;
+}
+early_param("skew_tick", skew_tick);


Thread overview: 9+ messages
2012-05-06 12:58 [PATCH] clockevents: Per cpu tick skew boot option Mike Galbraith
2012-05-06 13:10 ` Mike Galbraith
2012-05-07 19:17 ` Thomas Gleixner
2012-05-08  3:20   ` Mike Galbraith
2012-05-08  9:44     ` Thomas Gleixner
2012-05-08 10:20       ` Mike Galbraith
2012-05-10 18:16         ` Paul E. McKenney
2012-05-23 15:23           ` [PATCH v3] " Mike Galbraith
2012-05-24 23:52         ` [tip:timers/core] tick: Add " tip-bot for Mike Galbraith
