From: Daniel Lezcano <firstname.lastname@example.org> To: email@example.com, firstname.lastname@example.org Cc: email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, Jonathan Corbet <email@example.com>, "open list:DOCUMENTATION" <firstname.lastname@example.org> Subject: [PATCH v3 5/7] thermal/drivers/cpu_cooling: Add idle cooling device documentation Date: Thu, 5 Apr 2018 18:16:42 +0200 [thread overview] Message-ID: <email@example.com> (raw) In-Reply-To: <firstname.lastname@example.org> Provide some documentation for the idle injection cooling effect in order to let people to understand the rational of the approach for the idle injection CPU cooling device. Signed-off-by: Daniel Lezcano <email@example.com> --- Documentation/thermal/cpu-idle-cooling.txt | 166 +++++++++++++++++++++++++++++ 1 file changed, 166 insertions(+) create mode 100644 Documentation/thermal/cpu-idle-cooling.txt diff --git a/Documentation/thermal/cpu-idle-cooling.txt b/Documentation/thermal/cpu-idle-cooling.txt new file mode 100644 index 0000000..457cd99 --- /dev/null +++ b/Documentation/thermal/cpu-idle-cooling.txt @@ -0,0 +1,166 @@ + +Situation: +---------- + +Under certain circumstances a SoC can reach the maximum temperature +limit or is unable to stabilize the temperature around a temperature +control. When the SoC has to stabilize the temperature, the kernel can +act on a cooling device to mitigate the dissipated power. When the +maximum temperature is reached and to prevent a reboot or a shutdown, +a decision must be taken to reduce the temperature under the critical +threshold, that impacts the performance. + +Another situation is when the silicon reaches a certain temperature +which continues to increase even if the dynamic leakage is reduced to +its minimum by clock gating the component. The runaway phenomena will +continue with the static leakage and only powering down the component, +thus dropping the dynamic and static leakage will allow the component +to cool down. + +Last but not least, the system can ask for a specific power budget but +because of the OPP density, we can only choose an OPP with a power +budget lower than the requested one and underuse the CPU, thus losing +performances. In other words, one OPP under uses the CPU with a power +lesser than the power budget and the next OPP exceed the power budget, +an intermediate OPP could have been used if it were present. + +Solutions: +---------- + +If we can remove the static and the dynamic leakage for a specific +duration in a controlled period, the SoC temperature will +decrease. Acting at the idle state duration or the idle cycle +injection period, we can mitigate the temperature by modulating the +power budget. + +The Operating Performance Point (OPP) density has a great influence on +the control precision of cpufreq, however different vendors have a +plethora of OPP density, and some have large power gap between OPPs, +that will result in loss of performance during thermal control and +loss of power in other scenes. + +At a specific OPP, we can assume injecting idle cycle on all CPUs, +belonging to the same cluster, with a duration greater than the +cluster idle state target residency, we drop the static and the +dynamic leakage for this period (modulo the energy needed to enter +this state). So the sustainable power with idle cycles has a linear +relation with the OPP’s sustainable power and can be computed with a +coefficient similar to: + + Power(IdleCycle) = Coef x Power(OPP) + +Idle Injection: +--------------- + +The base concept of the idle injection is to force the CPU to go to an +idle state for a specified time each control cycle, it provides +another way to control CPU power and heat in addition to +cpufreq. Ideally, if all CPUs belonging to the same cluster, inject +their idle cycle synchronously, the cluster can reach its power down +state with a minimum power consumption and static leakage +drop. However, these idle cycles injection will add extra latencies as +the CPUs will have to wakeup from a deep sleep state. + + ^ + | + | + |------- ------- ------- + |_______|_____|_______|_____|_______|___________ + + <-----> + idle <----> + running + +With the fixed idle injection duration, we can give a value which is +an acceptable performance drop off or latency when we reach a specific +temperature and we begin to mitigate by varying the Idle injection +period. + +The mitigation begins with a maximum period value which decrease when +more cooling effect is requested. When the period duration is equal to +the idle duration, then we are in a situation the platform can’t +dissipate the heat enough and the mitigation fails. In this case the +situation is considered critical and there is nothing to do. The idle +injection duration must be changed by configuration and until we reach +the cooling effect, otherwise an additionnal cooling device must be +used or ultimately decrease the SoC performance by dropping the +highest OPP point of the SoC. + +The idle injection duration value must comply with the constraints: + +- It is lesser or equal to the latency we tolerate when the mitigation + begins. It is platform dependent and will depend on the user + experience, reactivity vs performance trade off we want. This value + should be specified. + +- It is greater than the idle state’s target residency we want to go + for thermal mitigation, otherwise we end up consuming more energy. + +Minimum period +-------------- + +The idle injection duration being fixed, it is obvious the minimum +period can’t be lesser than that, otherwise we will be scheduling the +idle injection task right before the idle injection duration is +complete, so waking up the CPU to put it asleep again. + +Maximum period +-------------- + +The maximum period is the initial period when the mitigation +begins. Theoretically when we reach the thermal trip point, we have to +sustain a specified power for specific temperature but at this time we +consume: + + Power = Capacitance x Voltage^2 x Frequency x Utilisation + +... which is more than the sustainable power (or there is something +wrong on the system setup). The ‘Capacitance’ and ‘Utilisation’ are a +fixed value, ‘Voltage’ and the ‘Frequency’ are fixed artificially +because we don’t want to change the OPP. We can group the +‘Capacitance’ and the ‘Utilisation’ into a single term which is the +‘Dynamic Power Coefficient (Cdyn)’ Simplifying the above, we have: + + Pdyn = Cdyn x Voltage^2 x Frequency + +The IPA will ask us somehow to reduce our power in order to target the +sustainable power defined in the device tree. So with the idle +injection mechanism, we want an average power (Ptarget) resulting on +an amount of time running at full power on a specific OPP and idle +another amount of time. That could be put in a equation: + + P(opp)target = ((trunning x (P(opp)running) + (tidle P(opp)idle)) / + (trunning + tidle) + ... + + tidle = trunning x ((P(opp)running / P(opp)target) - 1) + +At this point if we know the running period for the CPU, that gives us +the idle injection, we need. Alternatively if we have the idle +injection duration, we can compute the running duration with: + + trunning = tidle / ((P(opp)running / P(opp)target) - 1) + +Practically, if the running power is lesses than the targeted power, +we end up with a negative time value, so obviously the equation usage +is bound to a power reduction, hence a higher OPP is needed to have +the running power greater than the targeted power. + +However, in this demonstration we ignore three aspects: + + * The static leakage is not defined here, we can introduce it in the + equation but assuming it will be zero most of the time as it is + difficult to get the values from the SoC vendors + + * The idle state wake up latency (or entry + exit latency) is not + taken into account, it must be added in the equation in order to + rigorously compute the idle injection + + * The injected idle duration must be greater than the idle state + target residency, otherwise we end up consuming more energy and + potentially invert the mitigation effect + +So the final equation is: + + trunning = (tidle - twakeup ) x + (((P(opp)dyn + P(opp)static ) - P(opp)target) / P(opp)target ) -- 2.7.4
next prev parent reply other threads:[~2018-04-05 16:16 UTC|newest] Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-04-05 16:16 [PATCH v3 0/7] CPU cooling device new strategies Daniel Lezcano 2018-04-05 16:16 ` [PATCH v3 1/7] thermal/drivers/cpu_cooling: Fixup the header and copyright Daniel Lezcano [not found] ` <20180411061514.GL7671@vireshk-i7> 2018-04-11 8:56 ` Daniel Lezcano 2018-04-05 16:16 ` [PATCH v3 2/7] thermal/drivers/cpu_cooling: Add Software Package Data Exchange (SPDX) Daniel Lezcano 2018-04-05 16:16 ` [PATCH v3 3/7] thermal/drivers/cpu_cooling: Remove pointless field Daniel Lezcano 2018-04-05 16:16 ` [PATCH v3 4/7] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice Daniel Lezcano [not found] ` <20180411061851.GM7671@vireshk-i7> 2018-04-11 8:58 ` Daniel Lezcano 2018-04-05 16:16 ` Daniel Lezcano [this message] 2018-04-05 16:16 ` [PATCH v3 6/7] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver Daniel Lezcano 2018-04-11 8:51 ` Viresh Kumar 2018-04-11 9:29 ` Daniel Lezcano 2018-04-13 11:23 ` Sudeep Holla 2018-04-13 11:47 ` Daniel Lezcano 2018-04-16 7:37 ` Viresh Kumar 2018-04-16 7:44 ` Daniel Lezcano 2018-04-16 9:34 ` Sudeep Holla 2018-04-16 9:37 ` Viresh Kumar 2018-04-16 9:45 ` Daniel Lezcano 2018-04-16 9:50 ` Viresh Kumar 2018-04-16 10:03 ` Daniel Lezcano 2018-04-16 10:10 ` Viresh Kumar 2018-04-16 12:10 ` Daniel Lezcano 2018-04-16 12:30 ` Lorenzo Pieralisi 2018-04-16 13:57 ` Daniel Lezcano 2018-04-16 14:22 ` Lorenzo Pieralisi 2018-04-17 7:17 ` Daniel Lezcano 2018-04-17 10:24 ` Lorenzo Pieralisi 2018-04-16 12:31 ` Sudeep Holla 2018-04-16 12:49 ` Daniel Lezcano 2018-04-16 13:03 ` Sudeep Holla 2018-04-16 12:29 ` Sudeep Holla 2018-04-13 11:38 ` Daniel Thompson 2018-04-13 11:46 ` Daniel Lezcano 2019-08-05 5:11 ` Martin Kepplinger 2019-08-05 6:53 ` Martin Kepplinger 2019-08-05 7:39 ` Daniel Lezcano 2019-08-05 7:42 ` Martin Kepplinger 2019-08-05 7:58 ` Daniel Lezcano 2019-10-25 11:22 ` Martin Kepplinger 2019-10-25 14:45 ` Daniel Lezcano 2019-10-26 18:23 ` Martin Kepplinger 2019-10-28 15:16 ` Daniel Lezcano 2019-08-05 7:37 ` Daniel Lezcano 2019-08-05 7:40 ` Martin Kepplinger 2018-04-05 16:16 ` [PATCH v3 7/7] cpuidle/drivers/cpuidle-arm: Register the cooling device Daniel Lezcano 2018-04-11 8:51 ` Viresh Kumar
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --subject='Re: [PATCH v3 5/7] thermal/drivers/cpu_cooling: Add idle cooling device documentation' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).