All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
@ 2015-07-06  7:12 Tony Lindgren
  2015-07-06 14:18 ` Thomas Gleixner
  2015-07-06 17:28 ` John Stultz
  0 siblings, 2 replies; 11+ messages in thread
From: Tony Lindgren @ 2015-07-06  7:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, linux-omap, Felipe Balbi, John Stultz,
	Nishanth Menon, Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

Some persistent clocksources can be on a slow external bus. For shorter
latencies for RT use, let's allow toggling the clocksource during idle
between a faster non-persistent runtime clocksource and a slower persistent
clocksource.

Cc: Felipe Balbi <balbi@ti.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Nishanth Menon <nm@ti.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---

Anybody got better ideas for something like last_idled_cpu() type check
at the end of this patch?

---
 include/linuxt-email-lkml-omap/clocksource.h |  2 ++
 kernel/time/clocksource.c   | 60 +++++++++++++++++++++++++++++++++++++++++++--
 kernel/time/timekeeping.c   | 13 +++++++++-
 3 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 278dd27..7e5ff99 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -232,6 +232,8 @@ static inline void __clocksource_update_freq_khz(struct clocksource *cs, u32 khz
 
 
 extern int timekeeping_notify(struct clocksource *clock);
+extern int clocksource_pm_enter(void);
+extern void clocksource_pm_exit(void);
 
 extern cycle_t clocksource_mmio_readl_up(struct clocksource *);
 extern cycle_t clocksource_mmio_readl_down(struct clocksource *);
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 841b72f..69dc307 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -93,6 +93,8 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 maxsec)
 /*[Clocksource internal variables]---------
  * curr_clocksource:
  *	currently selected clocksource.
+ * runtime_clocksource:
+ *	preferred clocksource for runtime, can be local and non-persistent
  * clocksource_list:
  *	linked list with the registered clocksources
  * clocksource_mutex:
@@ -101,6 +103,7 @@ clocks_calc_mult_shift(u32 *mult, u32 *shift, u32 from, u32 to, u32 maxsec)
  *	Name of the user-specified clocksource.
  */
 static struct clocksource *curr_clocksource;
+static struct clocksource *runtime_clocksource;
 static LIST_HEAD(clocksource_list);
 static DEFINE_MUTEX(clocksource_mutex);
 static char override_name[CS_NAME_LEN];
@@ -525,7 +528,8 @@ static inline void clocksource_update_max_deferment(struct clocksource *cs)
 
 #ifndef CONFIG_ARCH_USES_GETTIMEOFFSET
 
-static struct clocksource *clocksource_find_best(bool oneshot, bool skipcur)
+static struct clocksource *clocksource_find_best(bool oneshot, bool persistent,
+						 bool skipcur)
 {
 	struct clocksource *cs;
 
@@ -540,6 +544,8 @@ static struct clocksource *clocksource_find_best(bool oneshot, bool skipcur)
 	list_for_each_entry(cs, &clocksource_list, list) {
 		if (skipcur && cs == curr_clocksource)
 			continue;
+		if (persistent && !(cs->flags & CLOCK_SOURCE_SUSPEND_NONSTOP))
+			continue;
 		if (oneshot && !(cs->flags & CLOCK_SOURCE_VALID_FOR_HRES))
 			continue;
 		return cs;
@@ -553,7 +559,7 @@ static void __clocksource_select(bool skipcur)
 	struct clocksource *best, *cs;
 
 	/* Find the best suitable clocksource */
-	best = clocksource_find_best(oneshot, skipcur);
+	best = clocksource_find_best(oneshot, false, skipcur);
 	if (!best)
 		return;
 
@@ -802,6 +808,56 @@ int clocksource_unregister(struct clocksource *cs)
 }
 EXPORT_SYMBOL(clocksource_unregister);
 
+/**
+ * clocksource_pm_enter - change to a persistent clocksource before idle
+ *
+ * Changes system to use a persistent clocksource for idle. Intended to
+ * be called from CPUidle from the last active CPU.
+ */
+int clocksource_pm_enter(void)
+{
+	bool oneshot = tick_oneshot_mode_active();
+	struct clocksource *best;
+
+	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
+		      "Unable to get clocksource_mutex"))
+		return -EINTR;
+
+	best = clocksource_find_best(oneshot, true, false);
+	if (best) {
+		if (curr_clocksource != best &&
+		    !timekeeping_notify(best)) {
+			runtime_clocksource = curr_clocksource;
+			curr_clocksource = best;
+		}
+	}
+	mutex_unlock(&clocksource_mutex);
+
+	return !!best;
+}
+
+/**
+ * clocksource_pm_exit - change to a runtime clocksrouce after idle
+ *
+ * Changes system to use the best clocksource for runtime. Intended to
+ * be called after waking up from CPUidle on the first active CPU.
+ */
+void clocksource_pm_exit(void)
+{
+	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
+		      "Unable to get clocksource_mutex"))
+		return;
+
+	if (runtime_clocksource) {
+		if (curr_clocksource != runtime_clocksource &&
+		    !timekeeping_notify(runtime_clocksource)) {
+			curr_clocksource = runtime_clocksource;
+			runtime_clocksource = NULL;
+		}
+	}
+	mutex_unlock(&clocksource_mutex);
+}
+
 #ifdef CONFIG_SYSFS
 /**
  * sysfs_show_current_clocksources - sysfs interface for current clocksource
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index bca3667..0379260 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1086,7 +1086,18 @@ int timekeeping_notify(struct clocksource *clock)
 
 	if (tk->tkr_mono.clock == clock)
 		return 0;
-	stop_machine(change_clocksource, clock, NULL);
+
+	/*
+	 * We may want to toggle between a fast and a persistent
+	 * clocksource from CPUidle on the last active CPU and can't
+	 * use stop_machine at that point.
+	 */
+	if (cpumask_test_cpu(smp_processor_id(), cpu_online_mask) &&
+	    !rcu_is_watching())
+		change_clocksource(clock);
+	else
+		stop_machine(change_clocksource, clock, NULL);
+
 	tick_clock_notify();
 	return tk->tkr_mono.clock == clock ? 0 : -1;
 }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06  7:12 [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle Tony Lindgren
@ 2015-07-06 14:18 ` Thomas Gleixner
  2015-07-06 15:22   ` Tony Lindgren
  2015-07-06 17:28 ` John Stultz
  1 sibling, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2015-07-06 14:18 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: linux-kernel, linux-omap, Felipe Balbi, John Stultz,
	Nishanth Menon, Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

On Mon, 6 Jul 2015, Tony Lindgren wrote:

> Some persistent clocksources can be on a slow external bus. For shorter
> latencies for RT use, let's allow toggling the clocksource during idle
> between a faster non-persistent runtime clocksource and a slower persistent
> clocksource.

I really cannot follow that RT argument here. The whole switchover
causes latencies itself and whats worse is, that this breaks
timekeeping accuracy because there is no way to switch clocksources
atomically without loss.

> ---
>  include/linuxt-email-lkml-omap/clocksource.h |  2 ++

Interesting file name.

>  extern int timekeeping_notify(struct clocksource *clock);
> +extern int clocksource_pm_enter(void);
> +extern void clocksource_pm_exit(void);

Unfortunately you are not providing the call site, so I cannot see
from which context this is going to be called.

I fear its from the guts of the idle code probably with interrupts
disabled etc ...., right?
  
> +/**
> + * clocksource_pm_enter - change to a persistent clocksource before idle
> + *
> + * Changes system to use a persistent clocksource for idle. Intended to
> + * be called from CPUidle from the last active CPU.
> + */
> +int clocksource_pm_enter(void)
> +{
> +	bool oneshot = tick_oneshot_mode_active();
> +	struct clocksource *best;
> +
> +	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
> +		      "Unable to get clocksource_mutex"))
> +		return -EINTR;

This trylock serves which purpose?

> +
> +	best = clocksource_find_best(oneshot, true, false);
> +	if (best) {
> +		if (curr_clocksource != best &&
> +		    !timekeeping_notify(best)) {
> +			runtime_clocksource = curr_clocksource;
> +			curr_clocksource = best;
> +		}
> +	}
> +	mutex_unlock(&clocksource_mutex);
> +
> +	return !!best;
> +}
> +
> +/**
> + * clocksource_pm_exit - change to a runtime clocksrouce after idle
> + *
> + * Changes system to use the best clocksource for runtime. Intended to
> + * be called after waking up from CPUidle on the first active CPU.
> + */
> +void clocksource_pm_exit(void)
> +{
> +	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
> +		      "Unable to get clocksource_mutex"))
> +		return;
> +
> +	if (runtime_clocksource) {
> +		if (curr_clocksource != runtime_clocksource &&
> +		    !timekeeping_notify(runtime_clocksource)) {
> +			curr_clocksource = runtime_clocksource;
> +			runtime_clocksource = NULL;
> +		}
> +	}
> +	mutex_unlock(&clocksource_mutex);
> +}

I really cannot see how this is proper serialized.

>  #ifdef CONFIG_SYSFS
>  /**
>   * sysfs_show_current_clocksources - sysfs interface for current clocksource
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index bca3667..0379260 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -1086,7 +1086,18 @@ int timekeeping_notify(struct clocksource *clock)
>  
>  	if (tk->tkr_mono.clock == clock)
>  		return 0;
> -	stop_machine(change_clocksource, clock, NULL);
> +
> +	/*
> +	 * We may want to toggle between a fast and a persistent
> +	 * clocksource from CPUidle on the last active CPU and can't
> +	 * use stop_machine at that point.
> +	 */
> +	if (cpumask_test_cpu(smp_processor_id(), cpu_online_mask) &&

Can you please explain how this code gets called from an offline cpu?

> +	    !rcu_is_watching())

So pick some random combination of conditions and define that it is
correct, right? How on earth does !rcu_watching() tell that this is
the last running cpu.

> +		change_clocksource(clock);
> +	else
> +		stop_machine(change_clocksource, clock, NULL);

This patch definitely earns a place in my ugly code museum under the
category 'Does not explode in my face, so it must be correct'.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 14:18 ` Thomas Gleixner
@ 2015-07-06 15:22   ` Tony Lindgren
  2015-07-06 15:46     ` Thomas Gleixner
  0 siblings, 1 reply; 11+ messages in thread
From: Tony Lindgren @ 2015-07-06 15:22 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, linux-omap, Felipe Balbi, John Stultz,
	Nishanth Menon, Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

Hi,

* Thomas Gleixner <tglx@linutronix.de> [150706 07:20]:
> On Mon, 6 Jul 2015, Tony Lindgren wrote:
> 
> > Some persistent clocksources can be on a slow external bus. For shorter
> > latencies for RT use, let's allow toggling the clocksource during idle
> > between a faster non-persistent runtime clocksource and a slower persistent
> > clocksource.
> 
> I really cannot follow that RT argument here. The whole switchover
> causes latencies itself and whats worse is, that this breaks
> timekeeping accuracy because there is no way to switch clocksources
> atomically without loss.

It would be during deeper idle states.. But yeah the RT use would be
better replaced in the description with with "lower runtime timer
latency". 

The timekeeping accuracy issue certainly needs some thinking, and
also the resolution between the clocksources can be different.. In the
test case I have the slow timer is always on and of a lower resolution
than the ARM global timer being used during runtime.

Got some handy timer test in mind you want me to run to provide data
on the accuracy?
 
> > ---
> >  include/linuxt-email-lkml-omap/clocksource.h |  2 ++
> 
> Interesting file name.

Heh that needs to go back to sed land :)
 
> >  extern int timekeeping_notify(struct clocksource *clock);
> > +extern int clocksource_pm_enter(void);
> > +extern void clocksource_pm_exit(void);
> 
> Unfortunately you are not providing the call site, so I cannot see
> from which context this is going to be called.
> 
> I fear its from the guts of the idle code probably with interrupts
> disabled etc ...., right?

Yes from the last cpu active in cpuidle. Here's the related snippet
in my case:

--- a/arch/arm/mach-omap2/cpuidle44xx.c
+++ b/arch/arm/mach-omap2/cpuidle44xx.c
@@ -111,6 +111,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 				 (cx->mpu_logic_state == PWRDM_POWER_OFF);
 
 	tick_broadcast_enter();
+	clocksource_pm_enter();
 
 	/*
 	 * Call idle CPU PM enter notifier chain so that
@@ -167,6 +168,7 @@ static int omap_enter_idle_coupled(struct cpuidle_device *dev,
 	if (dev->cpu == 0 && mpuss_can_lose_context)
 		cpu_cluster_pm_exit();
 
+	clocksource_pm_exit();
 	tick_broadcast_exit();
 
 fail:

   
> > +/**
> > + * clocksource_pm_enter - change to a persistent clocksource before idle
> > + *
> > + * Changes system to use a persistent clocksource for idle. Intended to
> > + * be called from CPUidle from the last active CPU.
> > + */
> > +int clocksource_pm_enter(void)
> > +{
> > +	bool oneshot = tick_oneshot_mode_active();
> > +	struct clocksource *best;
> > +
> > +	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
> > +		      "Unable to get clocksource_mutex"))
> > +		return -EINTR;
> 
> This trylock serves which purpose?

Well we don't want to start changing clocksource if something is
running like you pointed out.

> > +	best = clocksource_find_best(oneshot, true, false);
> > +	if (best) {
> > +		if (curr_clocksource != best &&
> > +		    !timekeeping_notify(best)) {
> > +			runtime_clocksource = curr_clocksource;
> > +			curr_clocksource = best;
> > +		}
> > +	}
> > +	mutex_unlock(&clocksource_mutex);
> > +
> > +	return !!best;
> > +}
> > +
> > +/**
> > + * clocksource_pm_exit - change to a runtime clocksrouce after idle
> > + *
> > + * Changes system to use the best clocksource for runtime. Intended to
> > + * be called after waking up from CPUidle on the first active CPU.
> > + */
> > +void clocksource_pm_exit(void)
> > +{
> > +	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
> > +		      "Unable to get clocksource_mutex"))
> > +		return;
> > +
> > +	if (runtime_clocksource) {
> > +		if (curr_clocksource != runtime_clocksource &&
> > +		    !timekeeping_notify(runtime_clocksource)) {
> > +			curr_clocksource = runtime_clocksource;
> > +			runtime_clocksource = NULL;
> > +		}
> > +	}
> > +	mutex_unlock(&clocksource_mutex);
> > +}
> 
> I really cannot see how this is proper serialized.

We need to allow this only from the last cpu before hitting idle.
 
> >  #ifdef CONFIG_SYSFS
> >  /**
> >   * sysfs_show_current_clocksources - sysfs interface for current clocksource
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index bca3667..0379260 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -1086,7 +1086,18 @@ int timekeeping_notify(struct clocksource *clock)
> >  
> >  	if (tk->tkr_mono.clock == clock)
> >  		return 0;
> > -	stop_machine(change_clocksource, clock, NULL);
> > +
> > +	/*
> > +	 * We may want to toggle between a fast and a persistent
> > +	 * clocksource from CPUidle on the last active CPU and can't
> > +	 * use stop_machine at that point.
> > +	 */
> > +	if (cpumask_test_cpu(smp_processor_id(), cpu_online_mask) &&
> 
> Can you please explain how this code gets called from an offline cpu?

Last cpu getting idled..
 
> > +	    !rcu_is_watching())
> 
> So pick some random combination of conditions and define that it is
> correct, right? How on earth does !rcu_watching() tell that this is
> the last running cpu.

We have called rcu_idle_enter() from cpuidle_idle_call(). Do you have
some better test in mind when the last cpu is about hit idle?
 
> > +		change_clocksource(clock);
> > +	else
> > +		stop_machine(change_clocksource, clock, NULL);
> 
> This patch definitely earns a place in my ugly code museum under the
> category 'Does not explode in my face, so it must be correct'.

Yeah TSC-like issues revisited on other architectures.. I was
expecting something like that from you on this one :)

Regards,

Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 15:22   ` Tony Lindgren
@ 2015-07-06 15:46     ` Thomas Gleixner
  2015-07-06 16:42       ` Tony Lindgren
  2015-07-06 17:51       ` John Stultz
  0 siblings, 2 replies; 11+ messages in thread
From: Thomas Gleixner @ 2015-07-06 15:46 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: linux-kernel, linux-omap, Felipe Balbi, John Stultz,
	Nishanth Menon, Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

On Mon, 6 Jul 2015, Tony Lindgren wrote:
> * Thomas Gleixner <tglx@linutronix.de> [150706 07:20]:
> > On Mon, 6 Jul 2015, Tony Lindgren wrote:
> The timekeeping accuracy issue certainly needs some thinking, and
> also the resolution between the clocksources can be different.. In the
> test case I have the slow timer is always on and of a lower resolution
> than the ARM global timer being used during runtime.
> 
> Got some handy timer test in mind you want me to run to provide data
> on the accuracy?

John Stultz might have something.
  
> > > +/**
> > > + * clocksource_pm_enter - change to a persistent clocksource before idle
> > > + *
> > > + * Changes system to use a persistent clocksource for idle. Intended to
> > > + * be called from CPUidle from the last active CPU.
> > > + */
> > > +int clocksource_pm_enter(void)
> > > +{
> > > +	bool oneshot = tick_oneshot_mode_active();
> > > +	struct clocksource *best;
> > > +
> > > +	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
> > > +		      "Unable to get clocksource_mutex"))
> > > +		return -EINTR;
> > 
> > This trylock serves which purpose?
> 
> Well we don't want to start changing clocksource if something is
> running like you pointed out.

Well yes, but ....
 
> > I really cannot see how this is proper serialized.
> 
> We need to allow this only from the last cpu before hitting idle.

And I cannot see anything which does so.

cpu0  	     	 	  	cpu1
				is_idle	
go_idle()
  clocksource_pm_enter()
    lock(cs_mutex);
				wakeup()
				clocksource_pm_exit()
				  trylock fails ....

    ...  
    unlock(cs_mutex);
  
--> Crap!

> > > @@ -1086,7 +1086,18 @@ int timekeeping_notify(struct clocksource *clock)
> > >  
> > >  	if (tk->tkr_mono.clock == clock)
> > >  		return 0;
> > > -	stop_machine(change_clocksource, clock, NULL);
> > > +
> > > +	/*
> > > +	 * We may want to toggle between a fast and a persistent
> > > +	 * clocksource from CPUidle on the last active CPU and can't
> > > +	 * use stop_machine at that point.
> > > +	 */
> > > +	if (cpumask_test_cpu(smp_processor_id(), cpu_online_mask) &&
> > 
> > Can you please explain how this code gets called from an offline cpu?
> 
> Last cpu getting idled..

That does not make any sense at all. How is idle related to the online
mask? An idle cpu is still online.

> > > +	    !rcu_is_watching())
> > 
> > So pick some random combination of conditions and define that it is
> > correct, right? How on earth does !rcu_watching() tell that this is
> > the last running cpu.
> 
> We have called rcu_idle_enter() from cpuidle_idle_call(). Do you have
> some better test in mind when the last cpu is about hit idle?

The cpuidle code should know that. And if it does not know, it better
should keep track of that information and based on it provide the
proper serialization, so the call into the timekeeping code is not a
subject to guess work and race conditions.

Thanks,

	tglx

  

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 15:46     ` Thomas Gleixner
@ 2015-07-06 16:42       ` Tony Lindgren
  2015-07-06 17:51       ` John Stultz
  1 sibling, 0 replies; 11+ messages in thread
From: Tony Lindgren @ 2015-07-06 16:42 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: linux-kernel, linux-omap, Felipe Balbi, John Stultz,
	Nishanth Menon, Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

* Thomas Gleixner <tglx@linutronix.de> [150706 08:48]:
> On Mon, 6 Jul 2015, Tony Lindgren wrote:
> > * Thomas Gleixner <tglx@linutronix.de> [150706 07:20]:
> > > On Mon, 6 Jul 2015, Tony Lindgren wrote:
> > The timekeeping accuracy issue certainly needs some thinking, and
> > also the resolution between the clocksources can be different.. In the
> > test case I have the slow timer is always on and of a lower resolution
> > than the ARM global timer being used during runtime.
> > 
> > Got some handy timer test in mind you want me to run to provide data
> > on the accuracy?
> 
> John Stultz might have something.
>   
> > > > +/**
> > > > + * clocksource_pm_enter - change to a persistent clocksource before idle
> > > > + *
> > > > + * Changes system to use a persistent clocksource for idle. Intended to
> > > > + * be called from CPUidle from the last active CPU.
> > > > + */
> > > > +int clocksource_pm_enter(void)
> > > > +{
> > > > +	bool oneshot = tick_oneshot_mode_active();
> > > > +	struct clocksource *best;
> > > > +
> > > > +	if (WARN_ONCE(!mutex_trylock(&clocksource_mutex),
> > > > +		      "Unable to get clocksource_mutex"))
> > > > +		return -EINTR;
> > > 
> > > This trylock serves which purpose?
> > 
> > Well we don't want to start changing clocksource if something is
> > running like you pointed out.
> 
> Well yes, but ....
>  
> > > I really cannot see how this is proper serialized.
> > 
> > We need to allow this only from the last cpu before hitting idle.
> 
> And I cannot see anything which does so.
> 
> cpu0  	     	 	  	cpu1
> 				is_idle	
> go_idle()
>   clocksource_pm_enter()
>     lock(cs_mutex);
> 				wakeup()
> 				clocksource_pm_exit()
> 				  trylock fails ....
> 
>     ...  
>     unlock(cs_mutex);
>   
> --> Crap!

OK you're right, this only works with cpuidle and using
drivers/cpuidle/coupled.c.
 
> > > > @@ -1086,7 +1086,18 @@ int timekeeping_notify(struct clocksource *clock)
> > > >  
> > > >  	if (tk->tkr_mono.clock == clock)
> > > >  		return 0;
> > > > -	stop_machine(change_clocksource, clock, NULL);
> > > > +
> > > > +	/*
> > > > +	 * We may want to toggle between a fast and a persistent
> > > > +	 * clocksource from CPUidle on the last active CPU and can't
> > > > +	 * use stop_machine at that point.
> > > > +	 */
> > > > +	if (cpumask_test_cpu(smp_processor_id(), cpu_online_mask) &&
> > > 
> > > Can you please explain how this code gets called from an offline cpu?
> > 
> > Last cpu getting idled..
> 
> That does not make any sense at all. How is idle related to the online
> mask? An idle cpu is still online.

Oops yeah that's a bogus test, cpu off != offlined.
 
> > > > +	    !rcu_is_watching())
> > > 
> > > So pick some random combination of conditions and define that it is
> > > correct, right? How on earth does !rcu_watching() tell that this is
> > > the last running cpu.
> > 
> > We have called rcu_idle_enter() from cpuidle_idle_call(). Do you have
> > some better test in mind when the last cpu is about hit idle?
> 
> The cpuidle code should know that. And if it does not know, it better
> should keep track of that information and based on it provide the
> proper serialization, so the call into the timekeeping code is not a
> subject to guess work and race conditions.

OK I agree. Based on your comments this clearly needs to be limited to
cpuidle. And thanks for your comments.

Regards,

Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06  7:12 [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle Tony Lindgren
  2015-07-06 14:18 ` Thomas Gleixner
@ 2015-07-06 17:28 ` John Stultz
  2015-07-06 17:34   ` Thomas Gleixner
  1 sibling, 1 reply; 11+ messages in thread
From: John Stultz @ 2015-07-06 17:28 UTC (permalink / raw)
  To: Tony Lindgren
  Cc: Thomas Gleixner, lkml, linux-omap, Felipe Balbi, Nishanth Menon,
	Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

On Mon, Jul 6, 2015 at 12:12 AM, Tony Lindgren <tony@atomide.com> wrote:
> Some persistent clocksources can be on a slow external bus. For shorter
> latencies for RT use, let's allow toggling the clocksource during idle
> between a faster non-persistent runtime clocksource and a slower persistent
> clocksource.

So yea, as Thomas mentioned, this will cause problems for timekeeping
accuracy, since we end up resetting the ntp state when we change
clocksource (additionally you gain a bit of error each time you switch
clocksources). So you'll most likely see trouble w/ ntpd steering the
clock.

I'm not sure its quite clear in the description as to the need here.
Could you expand a bit as to the rational for why?  I assume it has to
do with you have a high-res clocksource that is good for fine-grained
clock_gettime() calls, but wraps quickly, making it poor for long idle
times. So you're trying to swap to a less fine grained clocksource
that won't wrap so fast?

The persistent-clock-like name muddies things further, since the
persistent-clock is used for suspend, while it looks like this is just
for idle times.

thanks
-john

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 17:28 ` John Stultz
@ 2015-07-06 17:34   ` Thomas Gleixner
  2015-07-06 17:53     ` John Stultz
  0 siblings, 1 reply; 11+ messages in thread
From: Thomas Gleixner @ 2015-07-06 17:34 UTC (permalink / raw)
  To: John Stultz
  Cc: Tony Lindgren, lkml, linux-omap, Felipe Balbi, Nishanth Menon,
	Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

On Mon, 6 Jul 2015, John Stultz wrote:
> On Mon, Jul 6, 2015 at 12:12 AM, Tony Lindgren <tony@atomide.com> wrote:
> > Some persistent clocksources can be on a slow external bus. For shorter
> > latencies for RT use, let's allow toggling the clocksource during idle
> > between a faster non-persistent runtime clocksource and a slower persistent
> > clocksource.
> 
> So yea, as Thomas mentioned, this will cause problems for timekeeping
> accuracy, since we end up resetting the ntp state when we change
> clocksource (additionally you gain a bit of error each time you switch
> clocksources). So you'll most likely see trouble w/ ntpd steering the
> clock.
> 
> I'm not sure its quite clear in the description as to the need here.
> Could you expand a bit as to the rational for why?  I assume it has to
> do with you have a high-res clocksource that is good for fine-grained
> clock_gettime() calls, but wraps quickly, making it poor for long idle
> times. So you're trying to swap to a less fine grained clocksource
> that won't wrap so fast?
> 
> The persistent-clock-like name muddies things further, since the
> persistent-clock is used for suspend, while it looks like this is just
> for idle times.

Though that are idle states where the cpu is powered off so the fast
clock source is powered off as well....

We all know how well that works from the x86/TSC horror. It's just the
same thing again with a different arch prefix.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 15:46     ` Thomas Gleixner
  2015-07-06 16:42       ` Tony Lindgren
@ 2015-07-06 17:51       ` John Stultz
  2015-07-07 10:24         ` Tony Lindgren
  1 sibling, 1 reply; 11+ messages in thread
From: John Stultz @ 2015-07-06 17:51 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Tony Lindgren, lkml, linux-omap, Felipe Balbi, Nishanth Menon,
	Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

On Mon, Jul 6, 2015 at 8:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 6 Jul 2015, Tony Lindgren wrote:
>> * Thomas Gleixner <tglx@linutronix.de> [150706 07:20]:
>> > On Mon, 6 Jul 2015, Tony Lindgren wrote:
>> The timekeeping accuracy issue certainly needs some thinking, and
>> also the resolution between the clocksources can be different.. In the
>> test case I have the slow timer is always on and of a lower resolution
>> than the ARM global timer being used during runtime.
>>
>> Got some handy timer test in mind you want me to run to provide data
>> on the accuracy?
>
> John Stultz might have something.

You can turn on ntp stats in your ntp.conf and chart the loopstats
data w/ gnuplot:

set terminal png
set output "loopstat.png"
plot "/var/log/ntpstats/loopstats.<dateoftest>" using 2:3 with linespoints


I also have  the drift-log.py and graph-log.py scripts I use here:
https://github.com/johnstultz-work/timetests

But those aren't really ready for mass consumption, as they're
probably not very cross-distro compatible.

thanks
-john

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 17:34   ` Thomas Gleixner
@ 2015-07-06 17:53     ` John Stultz
  2015-07-07 10:33       ` Tony Lindgren
  0 siblings, 1 reply; 11+ messages in thread
From: John Stultz @ 2015-07-06 17:53 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Tony Lindgren, lkml, linux-omap, Felipe Balbi, Nishanth Menon,
	Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

On Mon, Jul 6, 2015 at 10:34 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Mon, 6 Jul 2015, John Stultz wrote:
>> On Mon, Jul 6, 2015 at 12:12 AM, Tony Lindgren <tony@atomide.com> wrote:
>> > Some persistent clocksources can be on a slow external bus. For shorter
>> > latencies for RT use, let's allow toggling the clocksource during idle
>> > between a faster non-persistent runtime clocksource and a slower persistent
>> > clocksource.
>>
>> So yea, as Thomas mentioned, this will cause problems for timekeeping
>> accuracy, since we end up resetting the ntp state when we change
>> clocksource (additionally you gain a bit of error each time you switch
>> clocksources). So you'll most likely see trouble w/ ntpd steering the
>> clock.
>>
>> I'm not sure its quite clear in the description as to the need here.
>> Could you expand a bit as to the rational for why?  I assume it has to
>> do with you have a high-res clocksource that is good for fine-grained
>> clock_gettime() calls, but wraps quickly, making it poor for long idle
>> times. So you're trying to swap to a less fine grained clocksource
>> that won't wrap so fast?
>>
>> The persistent-clock-like name muddies things further, since the
>> persistent-clock is used for suspend, while it looks like this is just
>> for idle times.
>
> Though that are idle states where the cpu is powered off so the fast
> clock source is powered off as well....
>
> We all know how well that works from the x86/TSC horror. It's just the
> same thing again with a different arch prefix.

Right.. and it might be telling that on x86 systems with this issue,
we don't play games with it and that sort of hardware just has to use
the slower and less fine-grained clocksources all the time.

thanks
-john

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 17:51       ` John Stultz
@ 2015-07-07 10:24         ` Tony Lindgren
  0 siblings, 0 replies; 11+ messages in thread
From: Tony Lindgren @ 2015-07-07 10:24 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, lkml, linux-omap, Felipe Balbi, Nishanth Menon,
	Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

* John Stultz <john.stultz@linaro.org> [150706 10:53]:
> On Mon, Jul 6, 2015 at 8:46 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, 6 Jul 2015, Tony Lindgren wrote:
> >> * Thomas Gleixner <tglx@linutronix.de> [150706 07:20]:
> >> > On Mon, 6 Jul 2015, Tony Lindgren wrote:
> >> The timekeeping accuracy issue certainly needs some thinking, and
> >> also the resolution between the clocksources can be different.. In the
> >> test case I have the slow timer is always on and of a lower resolution
> >> than the ARM global timer being used during runtime.
> >>
> >> Got some handy timer test in mind you want me to run to provide data
> >> on the accuracy?
> >
> > John Stultz might have something.
> 
> You can turn on ntp stats in your ntp.conf and chart the loopstats
> data w/ gnuplot:
> 
> set terminal png
> set output "loopstat.png"
> plot "/var/log/ntpstats/loopstats.<dateoftest>" using 2:3 with linespoints
> 
> 
> I also have  the drift-log.py and graph-log.py scripts I use here:
> https://github.com/johnstultz-work/timetests
> 
> But those aren't really ready for mass consumption, as they're
> probably not very cross-distro compatible.

Great thanks, I'll play a bit with those to see how bad the jitter gets.

Regrds,

Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle
  2015-07-06 17:53     ` John Stultz
@ 2015-07-07 10:33       ` Tony Lindgren
  0 siblings, 0 replies; 11+ messages in thread
From: Tony Lindgren @ 2015-07-07 10:33 UTC (permalink / raw)
  To: John Stultz
  Cc: Thomas Gleixner, lkml, linux-omap, Felipe Balbi, Nishanth Menon,
	Yingjoe Chen, Rafael J. Wysocki, Peter Zijlstra

* John Stultz <john.stultz@linaro.org> [150706 10:55]:
> On Mon, Jul 6, 2015 at 10:34 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> > On Mon, 6 Jul 2015, John Stultz wrote:
> >> On Mon, Jul 6, 2015 at 12:12 AM, Tony Lindgren <tony@atomide.com> wrote:
> >> > Some persistent clocksources can be on a slow external bus. For shorter
> >> > latencies for RT use, let's allow toggling the clocksource during idle
> >> > between a faster non-persistent runtime clocksource and a slower persistent
> >> > clocksource.
> >>
> >> So yea, as Thomas mentioned, this will cause problems for timekeeping
> >> accuracy, since we end up resetting the ntp state when we change
> >> clocksource (additionally you gain a bit of error each time you switch
> >> clocksources). So you'll most likely see trouble w/ ntpd steering the
> >> clock.
> >>
> >> I'm not sure its quite clear in the description as to the need here.
> >> Could you expand a bit as to the rational for why?  I assume it has to
> >> do with you have a high-res clocksource that is good for fine-grained
> >> clock_gettime() calls, but wraps quickly, making it poor for long idle
> >> times. So you're trying to swap to a less fine grained clocksource
> >> that won't wrap so fast?
> >>
> >> The persistent-clock-like name muddies things further, since the
> >> persistent-clock is used for suspend, while it looks like this is just
> >> for idle times.
> >
> > Though that are idle states where the cpu is powered off so the fast
> > clock source is powered off as well....
> >
> > We all know how well that works from the x86/TSC horror. It's just the
> > same thing again with a different arch prefix.
> 
> Right.. and it might be telling that on x86 systems with this issue,
> we don't play games with it and that sort of hardware just has to use
> the slower and less fine-grained clocksources all the time.

Yes it's similar to the old x86 TSC vs ACPI PM timer issue. We have
ARM global timer powered off during idle.

It may be best to just use kernel cmdline to select the ARM global
timer and then disable any deeper power states if ARM global timer
is selected.

Regards,

Tony

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-07-07 10:33 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-06  7:12 [PATCH] clocksource: Allow toggling between runtime and persistent clocksource for idle Tony Lindgren
2015-07-06 14:18 ` Thomas Gleixner
2015-07-06 15:22   ` Tony Lindgren
2015-07-06 15:46     ` Thomas Gleixner
2015-07-06 16:42       ` Tony Lindgren
2015-07-06 17:51       ` John Stultz
2015-07-07 10:24         ` Tony Lindgren
2015-07-06 17:28 ` John Stultz
2015-07-06 17:34   ` Thomas Gleixner
2015-07-06 17:53     ` John Stultz
2015-07-07 10:33       ` Tony Lindgren

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.