All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH RFC] cpuidle: consolidate calls to time capture
@ 2020-03-16 21:08 Daniel Lezcano
  2020-03-17  3:52 ` kbuild test robot
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Daniel Lezcano @ 2020-03-16 21:08 UTC (permalink / raw)
  To: rjw; +Cc: ulf.hansson, daniel.lezcano, linux-pm, linux-kernel, khilman

A few years ago, we changed the code in cpuidle to replace ktime_get()
by a local_clock() to get rid of potential seq lock in the path and an
extra latency.

Meanwhile, the code evolved and we are getting the time in some other
places like the power domain governor and in the future break even
deadline proposal.

Unfortunately, as the time must be compared across the CPU, we have no
other option than using the ktime_get() again. Hopefully, we can
factor out all the calls to local_clock() and ktime_get() into a
single one when the CPU is entering idle as the value will be reuse in
different places.

We can assume the time to go through the code path distance is small
enough between ktime_get() call in the cpuidle_enter() function and
the other users inspecting the value.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
 drivers/base/power/domain_governor.c | 4 +++-
 drivers/cpuidle/cpuidle.c            | 6 +++---
 include/linux/cpuidle.h              | 1 +
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
index daa8c7689f7e..bee97f7b7b8d 100644
--- a/drivers/base/power/domain_governor.c
+++ b/drivers/base/power/domain_governor.c
@@ -279,8 +279,10 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
 		}
 	}
 
+	dev = per_cpu(cpuidle_devices, smp_processor_id());
+
 	/* The minimum idle duration is from now - until the next wakeup. */
-	idle_duration_ns = ktime_to_ns(ktime_sub(domain_wakeup, ktime_get()));
+	idle_duration_ns = ktime_to_ns(ktime_sub(domain_wakeup, dev->idle_start));
 	if (idle_duration_ns <= 0)
 		return false;
 
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index c149d9e20dfd..9db14581759b 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -206,7 +206,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 
 	struct cpuidle_state *target_state = &drv->states[index];
 	bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
-	ktime_t time_start, time_end;
+	ktime_t time_end;
 
 	/*
 	 * Tell the time framework to switch to a broadcast timer because our
@@ -228,14 +228,14 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
 	sched_idle_set_state(target_state);
 
 	trace_cpu_idle_rcuidle(index, dev->cpu);
-	time_start = ns_to_ktime(local_clock());
+	dev->idle_start = ktime_get();
 
 	stop_critical_timings();
 	entered_state = target_state->enter(dev, drv, index);
 	start_critical_timings();
 
 	sched_clock_idle_wakeup_event();
-	time_end = ns_to_ktime(local_clock());
+	time_end = ktime_get();
 	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
 
 	/* The cpu is no longer idle or about to enter idle. */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index ec2ef63771f0..112494658e01 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -89,6 +89,7 @@ struct cpuidle_device {
 	unsigned int		poll_time_limit:1;
 	unsigned int		cpu;
 	ktime_t			next_hrtimer;
+	ktime_t			idle_start;
 
 	int			last_state_idx;
 	u64			last_residency_ns;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] cpuidle: consolidate calls to time capture
  2020-03-16 21:08 [PATCH RFC] cpuidle: consolidate calls to time capture Daniel Lezcano
@ 2020-03-17  3:52 ` kbuild test robot
  2020-03-18 10:17 ` Rafael J. Wysocki
  2020-03-18 10:49 ` Ulf Hansson
  2 siblings, 0 replies; 7+ messages in thread
From: kbuild test robot @ 2020-03-17  3:52 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 10579 bytes --]

Hi Daniel,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on pm/linux-next]
[also build test ERROR on v5.6-rc6 next-20200316]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Daniel-Lezcano/cpuidle-consolidate-calls-to-time-capture/20200317-085736
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
config: x86_64-randconfig-s2-20200317 (attached as .config)
compiler: gcc-5 (Ubuntu 5.5.0-12ubuntu1) 5.5.0 20171010
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/timer.h:6:0,
                    from include/linux/clocksource.h:17,
                    from include/linux/clockchips.h:14,
                    from drivers//cpuidle/cpuidle.c:11:
   drivers//cpuidle/cpuidle.c: In function 'cpuidle_enter_state':
>> drivers//cpuidle/cpuidle.c:263:30: error: 'time_start' undeclared (first use in this function)
      diff = ktime_sub(time_end, time_start);
                                 ^
   include/linux/ktime.h:46:39: note: in definition of macro 'ktime_sub'
    #define ktime_sub(lhs, rhs) ((lhs) - (rhs))
                                          ^
   drivers//cpuidle/cpuidle.c:263:30: note: each undeclared identifier is reported only once for each function it appears in
      diff = ktime_sub(time_end, time_start);
                                 ^
   include/linux/ktime.h:46:39: note: in definition of macro 'ktime_sub'
    #define ktime_sub(lhs, rhs) ((lhs) - (rhs))
                                          ^

vim +/time_start +263 drivers//cpuidle/cpuidle.c

3810631332465d Rafael J. Wysocki 2015-02-12  195  
56cfbf74a17c40 Colin Cross       2012-05-07  196  /**
56cfbf74a17c40 Colin Cross       2012-05-07  197   * cpuidle_enter_state - enter the state and update stats
56cfbf74a17c40 Colin Cross       2012-05-07  198   * @dev: cpuidle device for this cpu
56cfbf74a17c40 Colin Cross       2012-05-07  199   * @drv: cpuidle driver for this cpu
7312280bd2ad9d Rafael J. Wysocki 2015-05-09  200   * @index: index into the states table in @drv of the state to enter
56cfbf74a17c40 Colin Cross       2012-05-07  201   */
56cfbf74a17c40 Colin Cross       2012-05-07  202  int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
554c06ba3ee29c Daniel Lezcano    2013-04-23  203  			int index)
56cfbf74a17c40 Colin Cross       2012-05-07  204  {
56cfbf74a17c40 Colin Cross       2012-05-07  205  	int entered_state;
56cfbf74a17c40 Colin Cross       2012-05-07  206  
554c06ba3ee29c Daniel Lezcano    2013-04-23  207  	struct cpuidle_state *target_state = &drv->states[index];
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  208  	bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
13eaa2bebd418b Daniel Lezcano    2020-03-16  209  	ktime_t time_end;
554c06ba3ee29c Daniel Lezcano    2013-04-23  210  
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  211  	/*
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  212  	 * Tell the time framework to switch to a broadcast timer because our
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  213  	 * local timer will be shut down.  If a local timer is used from another
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  214  	 * CPU as a broadcast timer, this call may fail if it is not available.
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  215  	 */
827a5aefc542b8 Rafael J. Wysocki 2015-05-10  216  	if (broadcast && tick_broadcast_enter()) {
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07  217  		index = find_deepest_state(drv, dev, target_state->exit_latency_ns,
0d94039fabccaa Rafael J. Wysocki 2015-05-10  218  					   CPUIDLE_FLAG_TIMER_STOP, false);
0d94039fabccaa Rafael J. Wysocki 2015-05-10  219  		if (index < 0) {
827a5aefc542b8 Rafael J. Wysocki 2015-05-10  220  			default_idle_call();
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  221  			return -EBUSY;
827a5aefc542b8 Rafael J. Wysocki 2015-05-10  222  		}
0d94039fabccaa Rafael J. Wysocki 2015-05-10  223  		target_state = &drv->states[index];
f187851b9b4a76 Nicholas Piggin   2017-09-01  224  		broadcast = false;
0d94039fabccaa Rafael J. Wysocki 2015-05-10  225  	}
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  226  
faad3849281411 Rafael J. Wysocki 2015-05-10  227  	/* Take note of the planned idle state. */
faad3849281411 Rafael J. Wysocki 2015-05-10  228  	sched_idle_set_state(target_state);
faad3849281411 Rafael J. Wysocki 2015-05-10  229  
30fe6884021b9f Sandeep Tripathy  2014-07-02  230  	trace_cpu_idle_rcuidle(index, dev->cpu);
13eaa2bebd418b Daniel Lezcano    2020-03-16  231  	dev->idle_start = ktime_get();
554c06ba3ee29c Daniel Lezcano    2013-04-23  232  
63caae84809217 Lucas Stach       2015-07-20  233  	stop_critical_timings();
554c06ba3ee29c Daniel Lezcano    2013-04-23  234  	entered_state = target_state->enter(dev, drv, index);
63caae84809217 Lucas Stach       2015-07-20  235  	start_critical_timings();
554c06ba3ee29c Daniel Lezcano    2013-04-23  236  
f9fccdb9efef60 Peter Zijlstra    2017-04-21  237  	sched_clock_idle_wakeup_event();
13eaa2bebd418b Daniel Lezcano    2020-03-16  238  	time_end = ktime_get();
30fe6884021b9f Sandeep Tripathy  2014-07-02  239  	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
554c06ba3ee29c Daniel Lezcano    2013-04-23  240  
faad3849281411 Rafael J. Wysocki 2015-05-10  241  	/* The cpu is no longer idle or about to enter idle. */
faad3849281411 Rafael J. Wysocki 2015-05-10  242  	sched_idle_set_state(NULL);
faad3849281411 Rafael J. Wysocki 2015-05-10  243  
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  244  	if (broadcast) {
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  245  		if (WARN_ON_ONCE(!irqs_disabled()))
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  246  			local_irq_disable();
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  247  
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  248  		tick_broadcast_exit();
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  249  	}
df8d9eeadd0f7a Rafael J. Wysocki 2015-04-29  250  
e7387da52028b0 Daniel Lezcano    2016-05-17  251  	if (!cpuidle_state_is_coupled(drv, index))
554c06ba3ee29c Daniel Lezcano    2013-04-23  252  		local_irq_enable();
554c06ba3ee29c Daniel Lezcano    2013-04-23  253  
7037b43e0076cc Fieah Lim         2018-09-11  254  	if (entered_state >= 0) {
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07  255  		s64 diff, delay = drv->states[entered_state].exit_latency_ns;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  256  		int i;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  257  
7037b43e0076cc Fieah Lim         2018-09-11  258  		/*
7037b43e0076cc Fieah Lim         2018-09-11  259  		 * Update cpuidle counters
7037b43e0076cc Fieah Lim         2018-09-11  260  		 * This can be moved to within driver enter routine,
7037b43e0076cc Fieah Lim         2018-09-11  261  		 * but that results in multiple copies of same code.
7037b43e0076cc Fieah Lim         2018-09-11  262  		 */
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07 @263  		diff = ktime_sub(time_end, time_start);
554c06ba3ee29c Daniel Lezcano    2013-04-23  264  
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07  265  		dev->last_residency_ns = diff;
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07  266  		dev->states_usage[entered_state].time_ns += diff;
56cfbf74a17c40 Colin Cross       2012-05-07  267  		dev->states_usage[entered_state].usage++;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  268  
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07  269  		if (diff < drv->states[entered_state].target_residency_ns) {
04dab58a39d402 Rafael J. Wysocki 2018-12-10  270  			for (i = entered_state - 1; i >= 0; i--) {
99e98d3fb1008e Rafael J. Wysocki 2019-11-04  271  				if (dev->states_usage[i].disable)
04dab58a39d402 Rafael J. Wysocki 2018-12-10  272  					continue;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  273  
04dab58a39d402 Rafael J. Wysocki 2018-12-10  274  				/* Shallower states are enabled, so update. */
04dab58a39d402 Rafael J. Wysocki 2018-12-10  275  				dev->states_usage[entered_state].above++;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  276  				break;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  277  			}
04dab58a39d402 Rafael J. Wysocki 2018-12-10  278  		} else if (diff > delay) {
04dab58a39d402 Rafael J. Wysocki 2018-12-10  279  			for (i = entered_state + 1; i < drv->state_count; i++) {
99e98d3fb1008e Rafael J. Wysocki 2019-11-04  280  				if (dev->states_usage[i].disable)
04dab58a39d402 Rafael J. Wysocki 2018-12-10  281  					continue;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  282  
04dab58a39d402 Rafael J. Wysocki 2018-12-10  283  				/*
04dab58a39d402 Rafael J. Wysocki 2018-12-10  284  				 * Update if a deeper state would have been a
04dab58a39d402 Rafael J. Wysocki 2018-12-10  285  				 * better match for the observed idle duration.
04dab58a39d402 Rafael J. Wysocki 2018-12-10  286  				 */
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07  287  				if (diff - delay >= drv->states[i].target_residency_ns)
04dab58a39d402 Rafael J. Wysocki 2018-12-10  288  					dev->states_usage[entered_state].below++;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  289  
04dab58a39d402 Rafael J. Wysocki 2018-12-10  290  				break;
04dab58a39d402 Rafael J. Wysocki 2018-12-10  291  			}
04dab58a39d402 Rafael J. Wysocki 2018-12-10  292  		}
56cfbf74a17c40 Colin Cross       2012-05-07  293  	} else {
c1d51f684c72b5 Rafael J. Wysocki 2019-11-07  294  		dev->last_residency_ns = 0;
56cfbf74a17c40 Colin Cross       2012-05-07  295  	}
56cfbf74a17c40 Colin Cross       2012-05-07  296  
56cfbf74a17c40 Colin Cross       2012-05-07  297  	return entered_state;
56cfbf74a17c40 Colin Cross       2012-05-07  298  }
56cfbf74a17c40 Colin Cross       2012-05-07  299  

:::::: The code at line 263 was first introduced by commit
:::::: c1d51f684c72b5eb2aecbbd47be3a2977a2dc903 cpuidle: Use nanoseconds as the unit of time

:::::: TO: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
:::::: CC: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 32475 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] cpuidle: consolidate calls to time capture
  2020-03-16 21:08 [PATCH RFC] cpuidle: consolidate calls to time capture Daniel Lezcano
  2020-03-17  3:52 ` kbuild test robot
@ 2020-03-18 10:17 ` Rafael J. Wysocki
  2020-03-18 11:04   ` Daniel Lezcano
  2020-03-18 10:49 ` Ulf Hansson
  2 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2020-03-18 10:17 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: ulf.hansson, linux-pm, linux-kernel, khilman

On Monday, March 16, 2020 10:08:43 PM CET Daniel Lezcano wrote:
> A few years ago, we changed the code in cpuidle to replace ktime_get()
> by a local_clock() to get rid of potential seq lock in the path and an
> extra latency.
> 
> Meanwhile, the code evolved and we are getting the time in some other
> places like the power domain governor and in the future break even
> deadline proposal.

Hmm?

Have any patches been posted for that?

> Unfortunately, as the time must be compared across the CPU, we have no
> other option than using the ktime_get() again. Hopefully, we can
> factor out all the calls to local_clock() and ktime_get() into a
> single one when the CPU is entering idle as the value will be reuse in
> different places.

So there are cases in which it is not necessary to synchronize the time
between CPUs and those would take the overhead unnecessarily.

This change looks premature to me at least.

Thanks!




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] cpuidle: consolidate calls to time capture
  2020-03-16 21:08 [PATCH RFC] cpuidle: consolidate calls to time capture Daniel Lezcano
  2020-03-17  3:52 ` kbuild test robot
  2020-03-18 10:17 ` Rafael J. Wysocki
@ 2020-03-18 10:49 ` Ulf Hansson
  2 siblings, 0 replies; 7+ messages in thread
From: Ulf Hansson @ 2020-03-18 10:49 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J. Wysocki, Linux PM, Linux Kernel Mailing List, Kevin Hilman

On Mon, 16 Mar 2020 at 22:10, Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> A few years ago, we changed the code in cpuidle to replace ktime_get()
> by a local_clock() to get rid of potential seq lock in the path and an
> extra latency.
>
> Meanwhile, the code evolved and we are getting the time in some other
> places like the power domain governor and in the future break even
> deadline proposal.
>
> Unfortunately, as the time must be compared across the CPU, we have no
> other option than using the ktime_get() again. Hopefully, we can
> factor out all the calls to local_clock() and ktime_get() into a
> single one when the CPU is entering idle as the value will be reuse in
> different places.
>
> We can assume the time to go through the code path distance is small
> enough between ktime_get() call in the cpuidle_enter() function and
> the other users inspecting the value.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  drivers/base/power/domain_governor.c | 4 +++-
>  drivers/cpuidle/cpuidle.c            | 6 +++---
>  include/linux/cpuidle.h              | 1 +
>  3 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
> index daa8c7689f7e..bee97f7b7b8d 100644
> --- a/drivers/base/power/domain_governor.c
> +++ b/drivers/base/power/domain_governor.c
> @@ -279,8 +279,10 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
>                 }
>         }
>
> +       dev = per_cpu(cpuidle_devices, smp_processor_id());
> +
>         /* The minimum idle duration is from now - until the next wakeup. */
> -       idle_duration_ns = ktime_to_ns(ktime_sub(domain_wakeup, ktime_get()));
> +       idle_duration_ns = ktime_to_ns(ktime_sub(domain_wakeup, dev->idle_start));
>         if (idle_duration_ns <= 0)
>                 return false;
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index c149d9e20dfd..9db14581759b 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -206,7 +206,7 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
>
>         struct cpuidle_state *target_state = &drv->states[index];
>         bool broadcast = !!(target_state->flags & CPUIDLE_FLAG_TIMER_STOP);
> -       ktime_t time_start, time_end;
> +       ktime_t time_end;
>
>         /*
>          * Tell the time framework to switch to a broadcast timer because our
> @@ -228,14 +228,14 @@ int cpuidle_enter_state(struct cpuidle_device *dev, struct cpuidle_driver *drv,
>         sched_idle_set_state(target_state);
>
>         trace_cpu_idle_rcuidle(index, dev->cpu);
> -       time_start = ns_to_ktime(local_clock());
> +       dev->idle_start = ktime_get();

I fully agree with Rafael, this is bad for all cases where the
local_clock is sufficient.

To avoid the ktime_get() in the cpu_power_down_ok() for the genpd
governor, I think a better option could be to use the
"ts->idle_entrytime", that has been set in tick_nohz_start_idle().

>
>         stop_critical_timings();
>         entered_state = target_state->enter(dev, drv, index);
>         start_critical_timings();
>
>         sched_clock_idle_wakeup_event();
> -       time_end = ns_to_ktime(local_clock());
> +       time_end = ktime_get();
>         trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
>
>         /* The cpu is no longer idle or about to enter idle. */
> diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
> index ec2ef63771f0..112494658e01 100644
> --- a/include/linux/cpuidle.h
> +++ b/include/linux/cpuidle.h
> @@ -89,6 +89,7 @@ struct cpuidle_device {
>         unsigned int            poll_time_limit:1;
>         unsigned int            cpu;
>         ktime_t                 next_hrtimer;
> +       ktime_t                 idle_start;
>
>         int                     last_state_idx;
>         u64                     last_residency_ns;
> --
> 2.17.1
>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] cpuidle: consolidate calls to time capture
  2020-03-18 10:17 ` Rafael J. Wysocki
@ 2020-03-18 11:04   ` Daniel Lezcano
  2020-03-18 14:32     ` Daniel Lezcano
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Lezcano @ 2020-03-18 11:04 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: ulf.hansson, linux-pm, linux-kernel, khilman


Hi Rafael,

On 18/03/2020 11:17, Rafael J. Wysocki wrote:
> On Monday, March 16, 2020 10:08:43 PM CET Daniel Lezcano wrote:
>> A few years ago, we changed the code in cpuidle to replace ktime_get()
>> by a local_clock() to get rid of potential seq lock in the path and an
>> extra latency.
>>
>> Meanwhile, the code evolved and we are getting the time in some other
>> places like the power domain governor and in the future break even
>> deadline proposal.
> 
> Hmm?
> 
> Have any patches been posted for that?

https://lkml.org/lkml/2020/3/11/1113

https://lkml.org/lkml/2020/3/13/466

but there is no consensus yet if that has a benefit or not.

>> Unfortunately, as the time must be compared across the CPU, we have no
>> other option than using the ktime_get() again. Hopefully, we can
>> factor out all the calls to local_clock() and ktime_get() into a
>> single one when the CPU is entering idle as the value will be reuse in
>> different places.
> 
> So there are cases in which it is not necessary to synchronize the time
> between CPUs and those would take the overhead unnecessarily.
> 
> This change looks premature to me at least.

The idea is to call one time ktime_get() when entering idle and store
the result in the struct cpuidle_device, so we have the information when
we entered idle.

Moreover, ktime_get() is called in do_idle() via:

tick_nohz_idle_enter()
  tick_nohz_start_idle()
    ts->idle_entrytime = ktime_get();

This is called at the first loop level. The idle loop is exiting and
re-entering again without passing through tick_nohz_idle_enter() in the
second loop level in case of interrupt processing, thus the
idle_entrytime is not updated and the return of
tick_nohz_get_sleep_length() will be greater than what is expected.

May be we can consider ktime_get_mono_fast_ns() which is lockless with a
particular care of the non-monotonic aspect if needed. Given the
description at [1] the time jump could a few nanoseconds in case of NMI.

The local_clock() can no be inspected across CPUs, the gap is too big
and continues to increase during system lifetime.

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/time/timekeeping.c#n396


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] cpuidle: consolidate calls to time capture
  2020-03-18 11:04   ` Daniel Lezcano
@ 2020-03-18 14:32     ` Daniel Lezcano
  2020-03-18 21:29       ` Rafael J. Wysocki
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Lezcano @ 2020-03-18 14:32 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: ulf.hansson, linux-pm, linux-kernel, khilman

On 18/03/2020 12:04, Daniel Lezcano wrote:
> 
> Hi Rafael,
> 
> On 18/03/2020 11:17, Rafael J. Wysocki wrote:
>> On Monday, March 16, 2020 10:08:43 PM CET Daniel Lezcano wrote:
>>> A few years ago, we changed the code in cpuidle to replace ktime_get()
>>> by a local_clock() to get rid of potential seq lock in the path and an
>>> extra latency.
>>>
>>> Meanwhile, the code evolved and we are getting the time in some other
>>> places like the power domain governor and in the future break even
>>> deadline proposal.
>>
>> Hmm?
>>
>> Have any patches been posted for that?
> 
> https://lkml.org/lkml/2020/3/11/1113
> 
> https://lkml.org/lkml/2020/3/13/466
> 
> but there is no consensus yet if that has a benefit or not.
> 
>>> Unfortunately, as the time must be compared across the CPU, we have no
>>> other option than using the ktime_get() again. Hopefully, we can
>>> factor out all the calls to local_clock() and ktime_get() into a
>>> single one when the CPU is entering idle as the value will be reuse in
>>> different places.
>>
>> So there are cases in which it is not necessary to synchronize the time
>> between CPUs and those would take the overhead unnecessarily.
>>
>> This change looks premature to me at least.
> 
> The idea is to call one time ktime_get() when entering idle and store
> the result in the struct cpuidle_device, so we have the information when
> we entered idle.
> 
> Moreover, ktime_get() is called in do_idle() via:
> 
> tick_nohz_idle_enter()
>   tick_nohz_start_idle()
>     ts->idle_entrytime = ktime_get();
> 
> This is called at the first loop level. The idle loop is exiting and
> re-entering again without passing through tick_nohz_idle_enter() in the
> second loop level in case of interrupt processing, thus the
> idle_entrytime is not updated and the return of
> tick_nohz_get_sleep_length() will be greater than what is expected.
> 
> May be we can consider ktime_get_mono_fast_ns() which is lockless with a
> particular care of the non-monotonic aspect if needed. Given the
> description at [1] the time jump could a few nanoseconds in case of NMI.
> 
> The local_clock() can no be inspected across CPUs, the gap is too big
> and continues to increase during system lifetime.

I took the opportunity to measure the duration to a call to ktime_get,
ktime_get_mono_fast_ns and local_clock.

The result is an average of 10000 measurements and an average of 1000
run of those.

The duration is measured with local_clock(), ktime_get() and
ktime_get_mono_fast_ns()

Measurement with local_clock():
-------------------------------

ktime_get():
N	min	max	sum	mean	stddev
1000	71	168	109052	109.052	13.0278

ktime_get_mono_fast_ns():
N	min	max	sum	mean	stddev
1000	66	153	101135	101.135	11.9262

local_clock():
N	min	max	sum	mean	stddev
1000	70	163	106896	106.896	12.8575


Measurement with ktime_get():
-----------------------------

ktime_get():
N	min	max	sum	mean	stddev
1000	71	124	100465	100.465	10.0272

ktime_get_mono_fast_ns():
N	min	max	sum	mean	stddev
1000	67	124	94451	94.451	9.67218

local_clock():
N	min	max	sum	mean	stddev
1000	71	123	99765	99.765	10.0508


Measurement with ktime_get_mono_fast_ns():
------------------------------------------

ktime_get():
N	min	max	sum	mean	stddev
1000	67	116	87562	87.562	4.38399

ktime_get_mono_fast_ns():
N	min	max	sum	mean	stddev
1000	62	104	81017	81.017	4.12453

local_clock():
N	min	max	sum	mean	stddev
1000	65	110	85919	85.919	4.24859



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH RFC] cpuidle: consolidate calls to time capture
  2020-03-18 14:32     ` Daniel Lezcano
@ 2020-03-18 21:29       ` Rafael J. Wysocki
  0 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2020-03-18 21:29 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Rafael J. Wysocki, Ulf Hansson, Linux PM,
	Linux Kernel Mailing List, Kevin Hilman

On Wed, Mar 18, 2020 at 3:32 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 18/03/2020 12:04, Daniel Lezcano wrote:
> >
> > Hi Rafael,
> >
> > On 18/03/2020 11:17, Rafael J. Wysocki wrote:
> >> On Monday, March 16, 2020 10:08:43 PM CET Daniel Lezcano wrote:
> >>> A few years ago, we changed the code in cpuidle to replace ktime_get()
> >>> by a local_clock() to get rid of potential seq lock in the path and an
> >>> extra latency.
> >>>
> >>> Meanwhile, the code evolved and we are getting the time in some other
> >>> places like the power domain governor and in the future break even
> >>> deadline proposal.
> >>
> >> Hmm?
> >>
> >> Have any patches been posted for that?
> >
> > https://lkml.org/lkml/2020/3/11/1113
> >
> > https://lkml.org/lkml/2020/3/13/466
> >
> > but there is no consensus yet if that has a benefit or not.
> >
> >>> Unfortunately, as the time must be compared across the CPU, we have no
> >>> other option than using the ktime_get() again. Hopefully, we can
> >>> factor out all the calls to local_clock() and ktime_get() into a
> >>> single one when the CPU is entering idle as the value will be reuse in
> >>> different places.
> >>
> >> So there are cases in which it is not necessary to synchronize the time
> >> between CPUs and those would take the overhead unnecessarily.
> >>
> >> This change looks premature to me at least.
> >
> > The idea is to call one time ktime_get() when entering idle and store
> > the result in the struct cpuidle_device, so we have the information when
> > we entered idle.
> >
> > Moreover, ktime_get() is called in do_idle() via:
> >
> > tick_nohz_idle_enter()
> >   tick_nohz_start_idle()
> >     ts->idle_entrytime = ktime_get();
> >
> > This is called at the first loop level. The idle loop is exiting and
> > re-entering again without passing through tick_nohz_idle_enter() in the
> > second loop level in case of interrupt processing, thus the
> > idle_entrytime is not updated and the return of
> > tick_nohz_get_sleep_length() will be greater than what is expected.
> >
> > May be we can consider ktime_get_mono_fast_ns() which is lockless with a
> > particular care of the non-monotonic aspect if needed. Given the
> > description at [1] the time jump could a few nanoseconds in case of NMI.
> >
> > The local_clock() can no be inspected across CPUs, the gap is too big
> > and continues to increase during system lifetime.
>
> I took the opportunity to measure the duration to a call to ktime_get,
> ktime_get_mono_fast_ns and local_clock.

The results you get depend a good deal on the conditions of the test,
the system on which they were obtained and so on.  Without this
information it is hard to draw any conclusions from those results.  In
particular, ktime_get() is not significantly slower than local_clock()
if there is no contention AFAICS, and the lack of contention cannot be
guaranteed here.

Generally speaking, the problem is that it is not sufficient to
measure the time before running the governor and after the CPU wakes
up, because in the cases that really care about the latency of that
operation the time needed to run the governor may be a significant
fraction of the entire overhead.  So it is necessary to take time
stamps in several places and putting ktime_get() in all of them
doesn't sound particularly attractive.

Anyway, there is no real need to make this change AFAICS, so I'm not
really sure what the entire argument is about.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-03-18 21:30 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-16 21:08 [PATCH RFC] cpuidle: consolidate calls to time capture Daniel Lezcano
2020-03-17  3:52 ` kbuild test robot
2020-03-18 10:17 ` Rafael J. Wysocki
2020-03-18 11:04   ` Daniel Lezcano
2020-03-18 14:32     ` Daniel Lezcano
2020-03-18 21:29       ` Rafael J. Wysocki
2020-03-18 10:49 ` Ulf Hansson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.