linux-pm.vger.kernel.org archive mirror
From: Martin Kepplinger <martin.kepplinger@puri.sm>
To: daniel.lezcano@linaro.org, viresh.kumar@linaro.org,
	kevin.wangtao@linaro.org, leo.yan@linaro.org,
	edubezval@gmail.com, vincent.guittot@linaro.org,
	javi.merino@kernel.org, rui.zhang@intel.com,
	daniel.thompson@linaro.org
Cc: linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 6/7] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver
Date: Mon, 5 Aug 2019 08:53:39 +0200	[thread overview]
Message-ID: <02ec23c3-37ee-4e9f-56a4-453a30a29747@puri.sm> (raw)
In-Reply-To: <20190805051111.24318-1-martin.kepplinger@puri.sm>

On 05.08.19 07:11, Martin Kepplinger wrote:
> ---
> 
> On 05-04-18, 18:16, Daniel Lezcano wrote:
>> The cpu idle cooling driver performs synchronized idle injection across all
>> cpus belonging to the same cluster and offers a new method to cool down a SoC.
>>
>> Each cluster has its own idle cooling device, each core has its own idle
>> injection thread, each idle injection thread uses play_idle to enter idle.  In
>> order to reach the deepest idle state, each cooling device has the idle
>> injection threads synchronized together.
>>
>> It has some similarity with the intel power clamp driver but it is actually
>> designed to work on the ARM architecture via the DT with a mathematical proof
>> with the power model which comes with the Documentation.
>>
>> The idle injection cycle is fixed while the running cycle is variable. That
>> allows to have control on the device reactivity for the user experience. At
>> the mitigation point the idle threads are unparked, they play idle the
>> specified amount of time and they schedule themselves. The last thread sets
>> the next idle injection deadline and when the timer expires it wakes up all
>> the threads which in turn play idle again. Meanwhile the running cycle is
>> changed by set_cur_state.  When the mitigation ends, the threads are parked.
>> The algorithm is self adaptive, so there is no need to handle hotplugging.
>>
>> If we take an example of the balanced point, we can use the DT for the hi6220.
>>
>> The sustainable power for the SoC is 3326mW to mitigate at 75°C. Eight cores
>> running at full blast at the maximum OPP consumes 5280mW. The first value is
>> given in the DT, the second is calculated from the OPP with the formula:
>>
>>    Pdyn = Cdyn x Voltage^2 x Frequency
>>
>> As the SoC vendors don't want to share the static leakage values, we assume
>> it is zero, so the Prun = Pdyn + Pstatic = Pdyn + 0 = Pdyn.
>>
>> In order to reduce the power to 3326mW, we have to apply a ratio to the
>> running time.
>>
>> ratio = (Prun - Ptarget) / Ptarget = (5280 - 3326) / 3326 = 0.5874
>>
>> We know the idle cycle which is fixed, let's assume 10ms. However from this
>> duration we have to subtract the wake up latency for the cluster idle state.
>> In our case, it is 1.5ms. So for a 10ms latency for idle, we are really idle
>> 8.5ms.
>>
>> As we know the idle duration and the ratio, we can compute the running cycle.
>>
>>    running_cycle = 8.5 / 0.5874 = 14.47ms
>>
>> So for 8.5ms of idle, we have 14.47ms of running cycle, and that brings the
>> SoC to the balanced trip point of 75°C.
>>
>> The driver has been tested on the hi6220 and it appears the temperature
>> stabilizes at 75°C with an idle injection time of 10ms (8.5ms real) and
>> running cycle of 14ms as expected by the theory above.
>>
>> Signed-off-by: Kevin Wangtao <kevin.wangtao@linaro.org>
>> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
>> ---
>>  drivers/thermal/Kconfig       |  10 +
>>  drivers/thermal/cpu_cooling.c | 479 ++++++++++++++++++++++++++++++++++++++++++
>>  include/linux/cpu_cooling.h   |   6 +
>>  3 files changed, 495 insertions(+)
>>
>> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
>> index 5aaae1b..6c34117 100644
>> --- a/drivers/thermal/Kconfig
>> +++ b/drivers/thermal/Kconfig
>> @@ -166,6 +166,16 @@ config CPU_FREQ_THERMAL
>>  	  This will be useful for platforms using the generic thermal interface
>>  	  and not the ACPI interface.
>>  
>> +config CPU_IDLE_THERMAL
>> +       bool "CPU idle cooling strategy"
>> +       depends on CPU_IDLE
>> +       help
>> +	 This implements the generic CPU cooling mechanism through
>> +	 idle injection.  This will throttle the CPU by injecting
>> +	 fixed idle cycle.  All CPUs belonging to the same cluster
>> +	 will enter idle synchronously to reach the deepest idle
>> +	 state.
>> +
>>  endchoice
>>  
>>  config CLOCK_THERMAL
>> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
>> index 5c219dc..1eec8d6 100644
>> --- a/drivers/thermal/cpu_cooling.c
>> +++ b/drivers/thermal/cpu_cooling.c
>> @@ -10,18 +10,33 @@
>>   *		Viresh Kumar <viresh.kumar@linaro.org>
>>   *
>>   */
>> +#define pr_fmt(fmt) "CPU cooling: " fmt
>> +
>>  #include <linux/module.h>
>>  #include <linux/thermal.h>
>>  #include <linux/cpufreq.h>
>> +#include <linux/cpuidle.h>
>>  #include <linux/err.h>
>> +#include <linux/freezer.h>
>>  #include <linux/idr.h>
>> +#include <linux/kthread.h>
>>  #include <linux/pm_opp.h>
>>  #include <linux/slab.h>
>> +#include <linux/sched/prio.h>
>> +#include <linux/sched/rt.h>
>> +#include <linux/smpboot.h>
>>  #include <linux/cpu.h>
>>  #include <linux/cpu_cooling.h>
>>  
>> +#include <linux/ratelimit.h>
>> +
>> +#include <linux/platform_device.h>
>> +#include <linux/of_platform.h>
>> +
>>  #include <trace/events/thermal.h>
>>  
>> +#include <uapi/linux/sched/types.h>
>> +
>>  #ifdef CONFIG_CPU_FREQ_THERMAL
>>  /*
>>   * Cooling state <-> CPUFreq frequency
>> @@ -928,3 +943,467 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
>>  }
>>  EXPORT_SYMBOL_GPL(cpufreq_cooling_unregister);
>>  #endif /* CONFIG_CPU_FREQ_THERMAL */
>> +
>> +#ifdef CONFIG_CPU_IDLE_THERMAL
>> +/**
>> + * struct cpuidle_cooling_device - data for the idle cooling device
>> + * @cdev: a pointer to a struct thermal_cooling_device
>> + * @cpumask: a cpumask containing the CPU managed by the cooling device
>> + * @timer: a hrtimer giving the tempo for the idle injection cycles
>> + * @kref: a kernel refcount on this structure
>> + * @count: an atomic to keep track of the last task exiting the idle cycle
>> + * @idle_cycle: an integer defining the duration of the idle injection
>> + * @state: a normalized integer giving the state of the cooling device
>> + */
>> +struct cpuidle_cooling_device {
>> +	struct thermal_cooling_device *cdev;
>> +	struct cpumask *cpumask;
>> +	struct hrtimer timer;
>> +	struct kref kref;
>> +	atomic_t count;
>> +	unsigned int idle_cycle;
>> +	unsigned long state;
>> +};
>> +
>> +struct cpuidle_cooling_thread {
>> +	struct task_struct *tsk;
>> +	int should_run;
>> +};
>> +
>> +static DEFINE_PER_CPU(struct cpuidle_cooling_thread, cpuidle_cooling_thread);
>> +static DEFINE_PER_CPU(struct cpuidle_cooling_device *, cpuidle_cooling_device);
>> +
>> +/**
>> + * cpuidle_cooling_wakeup - Wake up all idle injection threads
>> + * @idle_cdev: the idle cooling device
>> + *
>> + * Every idle injection task belonging to the idle cooling device and
>> + * running on an online cpu will be woken up by this call.
>> + */
>> +static void cpuidle_cooling_wakeup(struct cpuidle_cooling_device *idle_cdev)
>> +{
>> +	struct cpuidle_cooling_thread *cct;
>> +	int cpu;
>> +
>> +	for_each_cpu_and(cpu, idle_cdev->cpumask, cpu_online_mask) {
>> +		cct = per_cpu_ptr(&cpuidle_cooling_thread, cpu);
>> +		cct->should_run = 1;
>> +		wake_up_process(cct->tsk);
>> +	}
>> +}
>> +
>> +/**
>> + * cpuidle_cooling_wakeup_fn - Running cycle timer callback
>> + * @timer: a hrtimer structure
>> + *
>> + * When the mitigation is acting, the CPU is allowed to run an amount
>> + * of time, then the idle injection happens for the specified delay
>> + * and the idle task injection schedules itself until the timer event
>> + * wakes the idle injection tasks again for a new idle injection
>> + * cycle. The time between the end of the idle injection and the timer
>> + * expiration is the allocated running time for the CPU.
>> + *
>> + * Always returns HRTIMER_NORESTART
>> + */
>> +static enum hrtimer_restart cpuidle_cooling_wakeup_fn(struct hrtimer *timer)
>> +{
>> +	struct cpuidle_cooling_device *idle_cdev =
>> +		container_of(timer, struct cpuidle_cooling_device, timer);
>> +
>> +	cpuidle_cooling_wakeup(idle_cdev);
>> +
>> +	return HRTIMER_NORESTART;
>> +}
>> +
>> +/**
>> + * cpuidle_cooling_runtime - Running time computation
>> + * @idle_cdev: the idle cooling device
>> + *
>> + * The running duration is computed from the idle injection duration
>> + * which is fixed. If we reach 100% of idle injection ratio, that
>> + * means the running duration is zero. If we have a 50% ratio
>> + * injection, that means we have equal duration for idle and for
>> + * running duration.
>> + *
>> + * The formula is deduced as the following:
>> + *
>> + *  running = idle x ((100 / ratio) - 1)
>> + *
>> + * For precision purpose for integer math, we use the following:
>> + *
>> + *  running = (idle x 100) / ratio - idle
>> + *
>> + * For example, if we have an injected duration of 50%, then we end up
>> + * with 10ms of idle injection and 10ms of running duration.
>> + *
>> + * Returns a s64 nanosecond based
>> + */
>> +static s64 cpuidle_cooling_runtime(struct cpuidle_cooling_device *idle_cdev)
>> +{
>> +	s64 next_wakeup;
>> +	unsigned long state = idle_cdev->state;
>> +
>> +	/*
>> +	 * The function should not be called when there is no
>> +	 * mitigation because:
>> +	 * - that does not make sense
>> +	 * - we end up with a division by zero
>> +	 */
>> +	if (!state)
>> +		return 0;
>> +
>> +	next_wakeup = (s64)((idle_cdev->idle_cycle * 100) / state) -
>> +		idle_cdev->idle_cycle;
>> +
>> +	return next_wakeup * NSEC_PER_USEC;
>> +}
>> +
> 
> There is a bug in your calculation here when "state" reaches 100: the
> function then returns a running time of 0, the same value it returns
> when "state" is 0, which is dangerous. You stop cooling when it's most
> necessary :)
> 
> I'm not sure how much sense being 100% idle really makes, so when testing
> this I just clamp it: if (state == 100) { state = 99 }. Anyway, just don't
> return 0.
> 
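
To make the collision concrete, here is a minimal userspace sketch of the
same integer math. The names runtime_ns() and runtime_ns_clamped() are
made up for this illustration and are not part of the patch; the clamp to
99 is only the workaround I used while testing, not a proposed fix.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000LL

/*
 * Same arithmetic as cpuidle_cooling_runtime() in the quoted patch,
 * with idle_cycle given in microseconds and state in percent.
 */
static int64_t runtime_ns(unsigned int idle_cycle_us, unsigned long state)
{
	int64_t next_wakeup;

	/* No mitigation: also avoids the division by zero. */
	if (!state)
		return 0;

	next_wakeup = (int64_t)((idle_cycle_us * 100UL) / state) - idle_cycle_us;

	return next_wakeup * NSEC_PER_USEC;
}

/*
 * Hypothetical workaround (assumption, testing only): never let the
 * state reach 100, so the result differs from the "no mitigation" case.
 */
static int64_t runtime_ns_clamped(unsigned int idle_cycle_us, unsigned long state)
{
	if (state >= 100)
		state = 99;

	return runtime_ns(idle_cycle_us, state);
}
```

With a 10ms idle cycle (10000us), runtime_ns() gives 0 for both state 0
and state 100, and 10ms for state 50 as the comment in the patch says.
The clamped variant gives 101us of running time at state 100 instead.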

oh and also, this breaks S3 suspend:

Aug  5 06:09:20 pureos kernel: [  807.487887] PM: suspend entry (deep)
Aug  5 06:09:40 pureos kernel: [  807.501148] Filesystems sync: 0.013 seconds
Aug  5 06:09:40 pureos kernel: [  807.501591] Freezing user space processes ... (elapsed 0.003 seconds) done.
Aug  5 06:09:40 pureos kernel: [  807.504741] OOM killer disabled.
Aug  5 06:09:40 pureos kernel: [  807.504744] Freezing remaining freezable tasks ...
Aug  5 06:09:40 pureos kernel: [  827.517712] Freezing of tasks failed after 20.002 seconds (4 tasks refusing to freeze, wq_busy=0):
Aug  5 06:09:40 pureos kernel: [  827.527122] thermal-idle/0  S    0 161      2 0x00000028
Aug  5 06:09:40 pureos kernel: [  827.527131] Call trace:
Aug  5 06:09:40 pureos kernel: [  827.527148]  __switch_to+0xb4/0x200
Aug  5 06:09:40 pureos kernel: [  827.527156]  __schedule+0x1e0/0x488
Aug  5 06:09:40 pureos kernel: [  827.527162]  schedule+0x38/0xc8
Aug  5 06:09:40 pureos kernel: [  827.527169]  smpboot_thread_fn+0x250/0x2a8
Aug  5 06:09:40 pureos kernel: [  827.527176]  kthread+0xf4/0x120
Aug  5 06:09:40 pureos kernel: [  827.527182]  ret_from_fork+0x10/0x18
Aug  5 06:09:40 pureos kernel: [  827.527186] thermal-idle/1  S    0 162      2 0x00000028
Aug  5 06:09:40 pureos kernel: [  827.527192] Call trace:
Aug  5 06:09:40 pureos kernel: [  827.527197]  __switch_to+0x188/0x200
Aug  5 06:09:40 pureos kernel: [  827.527203]  __schedule+0x1e0/0x488
Aug  5 06:09:40 pureos kernel: [  827.527208]  schedule+0x38/0xc8
Aug  5 06:09:40 pureos kernel: [  827.527213]  smpboot_thread_fn+0x250/0x2a8
Aug  5 06:09:40 pureos kernel: [  827.527218]  kthread+0xf4/0x120
Aug  5 06:09:40 pureos kernel: [  827.527222]  ret_from_fork+0x10/0x18
Aug  5 06:09:40 pureos kernel: [  827.527226] thermal-idle/2  S    0 163      2 0x00000028
Aug  5 06:09:40 pureos kernel: [  827.527231] Call trace:
Aug  5 06:09:40 pureos kernel: [  827.527237]  __switch_to+0xb4/0x200
Aug  5 06:09:40 pureos kernel: [  827.527242]  __schedule+0x1e0/0x488
Aug  5 06:09:40 pureos kernel: [  827.527247]  schedule+0x38/0xc8
Aug  5 06:09:40 pureos kernel: [  827.527259]  smpboot_thread_fn+0x250/0x2a8
Aug  5 06:09:40 pureos kernel: [  827.527264]  kthread+0xf4/0x120
Aug  5 06:09:40 pureos kernel: [  827.527268]  ret_from_fork+0x10/0x18
Aug  5 06:09:40 pureos kernel: [  827.527272] thermal-idle/3  S    0 164      2 0x00000028
Aug  5 06:09:40 pureos kernel: [  827.527278] Call trace:
Aug  5 06:09:40 pureos kernel: [  827.527283]  __switch_to+0xb4/0x200
Aug  5 06:09:40 pureos kernel: [  827.527288]  __schedule+0x1e0/0x488
Aug  5 06:09:40 pureos kernel: [  827.527293]  schedule+0x38/0xc8
Aug  5 06:09:40 pureos kernel: [  827.527298]  smpboot_thread_fn+0x250/0x2a8
Aug  5 06:09:40 pureos kernel: [  827.527303]  kthread+0xf4/0x120
Aug  5 06:09:40 pureos kernel: [  827.527308]  ret_from_fork+0x10/0x18
Aug  5 06:09:40 pureos kernel: [  827.527375] Restarting kernel threads ... done.
Aug  5 06:09:40 pureos kernel: [  827.527771] OOM killer enabled.
Aug  5 06:09:40 pureos kernel: [  827.527772] Restarting tasks ... done.
Aug  5 06:09:40 pureos kernel: [  827.528926] PM: suspend exit


do you know where things might go wrong here?

thanks,

                            martin

