linux-kernel.vger.kernel.org archive mirror
* [PATCH V3 1/4] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice
@ 2019-12-03  9:37 Daniel Lezcano
  2019-12-03  9:37 ` [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation Daniel Lezcano
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-03  9:37 UTC (permalink / raw)
  To: viresh.kumar, rui.zhang
  Cc: rjw, edubezval, linux-pm, amit.kucheria, linux-kernel

The next changes will add a new way to cool down a CPU by injecting
idle cycles. With the current configuration, a CPU cooling device is
the cpufreq cooling device. As we want to add a new CPU cooling
device, let's convert the CPU cooling to a choice giving a list of CPU
cooling devices. At this point, there is obviously only one CPU
cooling device.

There are no functional changes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
---
  V2:
    - Default CPU_FREQ_COOLING when CPU_THERMAL is set (Viresh Kumar)
---
 drivers/thermal/Kconfig     | 14 ++++++++++++--
 drivers/thermal/Makefile    |  2 +-
 include/linux/cpu_cooling.h |  6 +++---
 3 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 001a21abcc28..4e3ee036938b 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -150,8 +150,18 @@ config THERMAL_GOV_POWER_ALLOCATOR
 
 config CPU_THERMAL
 	bool "Generic cpu cooling support"
-	depends on CPU_FREQ
 	depends on THERMAL_OF
+	help
+	  Enable the CPU cooling features. If the system has no active
+	  cooling device available, this option allows the CPU to be
+	  used as a cooling device.
+
+if CPU_THERMAL
+
+config CPU_FREQ_THERMAL
+	bool "CPU frequency cooling device"
+	depends on CPU_FREQ
+	default y
 	help
 	  This implements the generic cpu cooling mechanism through frequency
 	  reduction. An ACPI version of this already exists
@@ -159,7 +169,7 @@ config CPU_THERMAL
 	  This will be useful for platforms using the generic thermal interface
 	  and not the ACPI interface.
 
-	  If you want this support, you should say Y here.
+endif
 
 config CLOCK_THERMAL
 	bool "Generic clock cooling support"
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 74a37c7f847a..d3b01cc96981 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -19,7 +19,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)	+= user_space.o
 thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)	+= power_allocator.o
 
 # cpufreq cooling
-thermal_sys-$(CONFIG_CPU_THERMAL)	+= cpu_cooling.o
+thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpu_cooling.o
 
 # clock cooling
 thermal_sys-$(CONFIG_CLOCK_THERMAL)	+= clock_cooling.o
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index b74732535e4b..3cdd85f987d7 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -19,7 +19,7 @@
 
 struct cpufreq_policy;
 
-#ifdef CONFIG_CPU_THERMAL
+#ifdef CONFIG_CPU_FREQ_THERMAL
 /**
  * cpufreq_cooling_register - function to create cpufreq cooling device.
  * @policy: cpufreq policy.
@@ -40,7 +40,7 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev);
 struct thermal_cooling_device *
 of_cpufreq_cooling_register(struct cpufreq_policy *policy);
 
-#else /* !CONFIG_CPU_THERMAL */
+#else /* !CONFIG_CPU_FREQ_THERMAL */
 static inline struct thermal_cooling_device *
 cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
@@ -58,6 +58,6 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
 	return NULL;
 }
-#endif /* CONFIG_CPU_THERMAL */
+#endif /* CONFIG_CPU_FREQ_THERMAL */
 
 #endif /* __CPU_COOLING_H__ */
-- 
2.17.1
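
For readers trying the series, a .config fragment selecting the options
touched by this patch could look like the sketch below. It assumes
CPU_FREQ and THERMAL_OF are already available on the platform; the
option names are the ones from this patch, the rest of the
configuration is left out.

```
CONFIG_THERMAL_OF=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_THERMAL=y
CONFIG_CPU_FREQ_THERMAL=y
```

Since CPU_FREQ_THERMAL defaults to y, enabling CPU_THERMAL alone should
keep the previous behaviour on platforms where CPU_FREQ is set.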



* [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation
  2019-12-03  9:37 [PATCH V3 1/4] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice Daniel Lezcano
@ 2019-12-03  9:37 ` Daniel Lezcano
  2019-12-04  4:24   ` Amit Kucheria
  2019-12-03  9:37 ` [PATCH V3 3/4] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver Daniel Lezcano
  2019-12-03  9:37 ` [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling Daniel Lezcano
  2 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-03  9:37 UTC (permalink / raw)
  To: viresh.kumar, rui.zhang
  Cc: rjw, edubezval, linux-pm, amit.kucheria, linux-kernel

Provide some documentation for the idle injection cooling effect in
order to let people to understand the rational of the approach for the
idle injection CPU cooling device.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 .../driver-api/thermal/cpu-idle-cooling.rst   | 166 ++++++++++++++++++
 1 file changed, 166 insertions(+)
 create mode 100644 Documentation/driver-api/thermal/cpu-idle-cooling.rst

diff --git a/Documentation/driver-api/thermal/cpu-idle-cooling.rst b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
new file mode 100644
index 000000000000..457cd9979ddb
--- /dev/null
+++ b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
@@ -0,0 +1,166 @@
+
+Situation:
+----------
+
+Under certain circumstances a SoC can reach the maximum temperature
+limit or is unable to stabilize the temperature around a temperature
+control. When the SoC has to stabilize the temperature, the kernel can
+act on a cooling device to mitigate the dissipated power. When the
+maximum temperature is reached and to prevent a reboot or a shutdown,
+a decision must be taken to reduce the temperature under the critical
+threshold, that impacts the performance.
+
+Another situation is when the silicon reaches a certain temperature
+which continues to increase even if the dynamic leakage is reduced to
+its minimum by clock gating the component. The runaway phenomenon will
+continue because of the static leakage, and only powering down the
+component, thus dropping both the dynamic and the static leakage, will
+allow the component to cool down.
+
+Last but not least, the system can ask for a specific power budget but
+because of the OPP density, we can only choose an OPP with a power
+budget lower than the requested one and underuse the CPU, thus losing
+performance. In other words, one OPP underuses the CPU with a power
+lower than the power budget and the next OPP exceeds the power budget;
+an intermediate OPP could have been used if it were present.
+
+Solutions:
+----------
+
+If we can remove the static and the dynamic leakage for a specific
+duration in a controlled period, the SoC temperature will
+decrease. Acting at the idle state duration or the idle cycle
+injection period, we can mitigate the temperature by modulating the
+power budget.
+
+The Operating Performance Point (OPP) density has a great influence on
+the control precision of cpufreq. However, OPP density varies widely
+between vendors, and some have a large power gap between OPPs, which
+results in a loss of performance during thermal control and a loss of
+power in other scenarios.
+
+At a specific OPP, we can assume that by injecting idle cycles on all
+CPUs belonging to the same cluster, with a duration greater than the
+cluster idle state target residency, we drop the static and the
+dynamic leakage for this period (modulo the energy needed to enter
+this state). So the sustainable power with idle cycles has a linear
+relation with the OPP’s sustainable power and can be computed with a
+coefficient similar to:
+
+	    Power(IdleCycle) = Coef x Power(OPP)
+
+Idle Injection:
+---------------
+
+The base concept of the idle injection is to force the CPU to go to an
+idle state for a specified time each control cycle. It provides
+another way to control CPU power and heat in addition to
+cpufreq. Ideally, if all CPUs belonging to the same cluster inject
+their idle cycles synchronously, the cluster can reach its power down
+state with a minimum power consumption and drop the static
+leakage. However, these injected idle cycles add extra latency as
+the CPUs have to wake up from a deep sleep state.
+
+     ^
+     |
+     |
+     |-------       -------       -------
+     |_______|_____|_______|_____|_______|___________
+
+      <----->
+       idle  <---->
+              running
+
+With a fixed idle injection duration, we can choose a value which gives
+an acceptable performance drop or latency when we reach a specific
+temperature, and we begin to mitigate by varying the idle injection
+period.
+
+The mitigation begins with a maximum period value which decreases when
+more cooling effect is requested. When the period duration is equal to
+the idle duration, we are in a situation where the platform can’t
+dissipate the heat fast enough and the mitigation fails. In this case
+the situation is considered critical and there is nothing more to do.
+The idle injection duration must then be changed by configuration until
+we reach the cooling effect, otherwise an additional cooling device
+must be used or, ultimately, the SoC performance must be decreased by
+dropping the highest OPP of the SoC.
+
+The idle injection duration value must comply with the constraints:
+
+- It is less than or equal to the latency we tolerate when the mitigation
+  begins. It is platform dependent and will depend on the user
+  experience, reactivity vs performance trade off we want. This value
+  should be specified.
+
+- It is greater than the target residency of the idle state we want to
+  enter for thermal mitigation, otherwise we end up consuming more energy.
+
+Minimum period
+--------------
+
+The idle injection duration being fixed, it is obvious the minimum
+period can’t be less than that, otherwise we will be scheduling the
+idle injection task right before the idle injection duration is
+complete, so waking up the CPU to put it asleep again.
+
+Maximum period
+--------------
+
+The maximum period is the initial period when the mitigation
+begins. Theoretically when we reach the thermal trip point, we have to
+sustain a specified power for a specific temperature but at this time we
+consume:
+
+ Power = Capacitance x Voltage^2 x Frequency x Utilisation
+
+... which is more than the sustainable power (or there is something
+wrong in the system setup). The ‘Capacitance’ and ‘Utilisation’ are
+fixed values, ‘Voltage’ and ‘Frequency’ are fixed artificially
+because we don’t want to change the OPP. We can group the
+‘Capacitance’ and the ‘Utilisation’ into a single term which is the
+‘Dynamic Power Coefficient (Cdyn)’. Simplifying the above, we have:
+
+ Pdyn = Cdyn x Voltage^2 x Frequency
+
+The IPA will ask us somehow to reduce our power in order to target the
+sustainable power defined in the device tree. So with the idle
+injection mechanism, we want an average power (Ptarget) resulting from
+an amount of time running at full power on a specific OPP and idle
+another amount of time. That could be put in an equation:
+
+ P(opp)target = ((trunning x P(opp)running) + (tidle x P(opp)idle)) /
+			(trunning + tidle)
+  ...
+
+ tidle = trunning x ((P(opp)running / P(opp)target) - 1)
+
+At this point if we know the running period for the CPU, that gives us
+the idle injection duration we need. Alternatively if we have the idle
+injection duration, we can compute the running duration with:
+
+ trunning = tidle / ((P(opp)running / P(opp)target) - 1)
+
+Practically, if the running power is less than the targeted power,
+we end up with a negative time value, so obviously the equation usage
+is bound to a power reduction, hence a higher OPP is needed to have
+the running power greater than the targeted power.
+
+However, in this demonstration we ignore three aspects:
+
+ * The static leakage is not defined here, we can introduce it in the
+   equation but assuming it will be zero most of the time as it is
+   difficult to get the values from the SoC vendors
+
+ * The idle state wake up latency (or entry + exit latency) is not
+   taken into account, it must be added in the equation in order to
+   rigorously compute the idle injection
+
+ * The injected idle duration must be greater than the idle state
+   target residency, otherwise we end up consuming more energy and
+   potentially invert the mitigation effect
+
+So the final equation is:
+
+ trunning = (tidle - twakeup ) x
+		(P(opp)target / ((P(opp)dyn + P(opp)static ) - P(opp)target))
-- 
2.17.1
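
To make the relation between the idle duration and the running duration
more concrete, here is a small user-space sketch of the equation
"trunning = tidle / ((P(opp)running / P(opp)target) - 1)" from the
documentation above. It is only an illustration: the function name, the
units (microseconds and milliwatts) and the assumption that the idle
power is zero are mine, not part of the patch.

```c
/*
 * Running duration for a given idle injection duration:
 *
 *   trunning = tidle / ((Prunning / Ptarget) - 1)
 *
 * rearranged for integer math as:
 *
 *   trunning = (tidle x Ptarget) / (Prunning - Ptarget)
 *
 * Assumptions (hypothetical helper, not kernel code): durations are in
 * microseconds, powers in milliwatts, and the idle power is zero.
 * Returns -1 when the running power is not above the target, as the
 * equation only applies to a power reduction.
 */
static long running_duration_us(long tidle_us, long p_running_mw,
				long p_target_mw)
{
	if (p_running_mw <= p_target_mw)
		return -1;

	return (tidle_us * p_target_mw) / (p_running_mw - p_target_mw);
}
```

For instance, with a 10ms idle injection, an OPP consuming 1500mW and a
1000mW target, the helper gives a 20ms running duration: 20ms at 1500mW
followed by 10ms idle averages to 1000mW over the 30ms period.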



* [PATCH V3 3/4] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver
  2019-12-03  9:37 [PATCH V3 1/4] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice Daniel Lezcano
  2019-12-03  9:37 ` [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation Daniel Lezcano
@ 2019-12-03  9:37 ` Daniel Lezcano
  2019-12-04  4:53   ` Amit Kucheria
  2019-12-03  9:37 ` [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling Daniel Lezcano
  2 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-03  9:37 UTC (permalink / raw)
  To: viresh.kumar, rui.zhang
  Cc: rjw, edubezval, linux-pm, amit.kucheria, linux-kernel

The cpu idle cooling device offers a new method to cool down a CPU by
injecting idle cycles at runtime.

It has some similarities with the Intel power clamp driver but it is
actually designed to be more generic and relies on the idle injection
powercap framework.

The idle injection cycle is fixed while the running cycle is variable. That
gives control over the device reactivity for the user experience.

An idle state powering down the CPU or the cluster allows the static
leakage to be dropped, thus restoring the heat capacity of the SoC. It
can be set with a trip point between the hot and the critical points,
giving the opportunity to prevent a hard reset of the system when the
cpufreq cooling fails to cool down the CPU.

With more sophisticated boards having a per core sensor, the idle
cooling device can cool down a single core without throttling
the compute capacity of several CPUs belonging to the same clock line,
so it could be used in collaboration with the cpufreq cooling device.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 V3:
   - Add missing parameter documentation (Viresh Kumar)
   - Fixed function description (Viresh Kumar)
   - Add entry in MAINTAINERS file
 V2:
   - Remove idle_duration_us field and use idle_inject API instead (Viresh Kumar)
   - Fixed function definition when CPU_IDLE_COOLING is not set
   - Inverted the initialization in the init function (Viresh Kumar)
---
 MAINTAINERS                       |   3 +
 drivers/thermal/Kconfig           |   7 +
 drivers/thermal/Makefile          |   1 +
 drivers/thermal/cpuidle_cooling.c | 234 ++++++++++++++++++++++++++++++
 include/linux/cpu_cooling.h       |  22 +++
 5 files changed, 267 insertions(+)
 create mode 100644 drivers/thermal/cpuidle_cooling.c

diff --git a/MAINTAINERS b/MAINTAINERS
index c570f0204b48..d2e92a0360f2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16187,12 +16187,15 @@ F:	Documentation/devicetree/bindings/thermal/
 
 THERMAL/CPU_COOLING
 M:	Amit Daniel Kachhap <amit.kachhap@gmail.com>
+M:	Daniel Lezcano <daniel.lezcano@linaro.org>
 M:	Viresh Kumar <viresh.kumar@linaro.org>
 M:	Javi Merino <javi.merino@kernel.org>
 L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	Documentation/driver-api/thermal/cpu-cooling-api.rst
+F:	Documentation/driver-api/thermal/cpu-idle-cooling.rst
 F:	drivers/thermal/cpu_cooling.c
+F:	drivers/thermal/cpuidle_cooling.c
 F:	include/linux/cpu_cooling.h
 
 THINKPAD ACPI EXTRAS DRIVER
diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index 4e3ee036938b..4ee9953ba5ce 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -169,6 +169,13 @@ config CPU_FREQ_THERMAL
 	  This will be useful for platforms using the generic thermal interface
 	  and not the ACPI interface.
 
+config CPU_IDLE_THERMAL
+	bool "CPU idle cooling device"
+	depends on IDLE_INJECT
+	help
+	  This implements the CPU cooling mechanism through
+	  idle injection. This will throttle the CPU by injecting
+	  idle cycles.
 endif
 
 config CLOCK_THERMAL
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index d3b01cc96981..9c8aa2d4bd28 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -20,6 +20,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)	+= power_allocator.o
 
 # cpufreq cooling
 thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpu_cooling.o
+thermal_sys-$(CONFIG_CPU_IDLE_THERMAL)	+= cpuidle_cooling.o
 
 # clock cooling
 thermal_sys-$(CONFIG_CLOCK_THERMAL)	+= clock_cooling.o
diff --git a/drivers/thermal/cpuidle_cooling.c b/drivers/thermal/cpuidle_cooling.c
new file mode 100644
index 000000000000..7d91a1b298d4
--- /dev/null
+++ b/drivers/thermal/cpuidle_cooling.c
@@ -0,0 +1,234 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ *  Copyright (C) 2019 Linaro Limited.
+ *
+ *  Author: Daniel Lezcano <daniel.lezcano@linaro.org>
+ *
+ */
+#include <linux/cpu_cooling.h>
+#include <linux/cpuidle.h>
+#include <linux/err.h>
+#include <linux/idle_inject.h>
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/thermal.h>
+
+/**
+ * struct cpuidle_cooling_device - data for the idle cooling device
+ * @ii_dev: a pointer to the idle injection device
+ * @state: a normalized integer giving the state of the cooling device
+ */
+struct cpuidle_cooling_device {
+	struct idle_inject_device *ii_dev;
+	unsigned long state;
+};
+
+static DEFINE_IDA(cpuidle_ida);
+
+/**
+ * cpuidle_cooling_runtime - Running time computation
+ * @idle_duration_us: the idle injection duration in microseconds
+ * @state: a percentage based number
+ *
+ * The running duration is computed from the idle injection duration
+ * which is fixed. If we reach 100% of idle injection ratio, that
+ * means the running duration is zero. If we have a 50% ratio
+ * injection, that means we have equal duration for idle and for
+ * running duration.
+ *
+ * The formula is deduced as the following:
+ *
+ *  running = idle x ((100 / ratio) - 1)
+ *
+ * For precision purpose for integer math, we use the following:
+ *
+ *  running = (idle x 100) / ratio - idle
+ *
+ * For example, with a 10ms idle injection duration and a ratio of 50%,
+ * we end up with 10ms of idle injection and 10ms of running duration.
+ *
+ * Returns an unsigned int for a usec based runtime duration.
+ */
+static unsigned int cpuidle_cooling_runtime(unsigned int idle_duration_us,
+					    unsigned long state)
+{
+	if (!state)
+		return 0;
+
+	return ((idle_duration_us * 100) / state) - idle_duration_us;
+}
+
+/**
+ * cpuidle_cooling_get_max_state - Get the maximum state
+ * @cdev  : the thermal cooling device
+ * @state : a pointer to the state variable to be filled
+ *
+ * The function always gives 100 as the injection ratio is percentage
+ * based for consistency across different platforms.
+ *
+ * The function can not fail, it always returns zero.
+ */
+static int cpuidle_cooling_get_max_state(struct thermal_cooling_device *cdev,
+					 unsigned long *state)
+{
+	/*
+	 * Depending on the configuration or the hardware, the running
+	 * cycle and the idle cycle could be different. We want to unify
+	 * that to a 0..100 interval, so the set state interface will
+	 * be the same whatever the platform is.
+	 *
+	 * The state 100% will make the cluster 100% ... idle. A 0%
+	 * injection ratio means no idle injection at all and 50%
+	 * means for 10ms of idle injection, we have 10ms of running
+	 * time.
+	 */
+	*state = 100;
+
+	return 0;
+}
+
+/**
+ * cpuidle_cooling_get_cur_state - Get the current cooling state
+ * @cdev: the thermal cooling device
+ * @state: a pointer to the state
+ *
+ * The function just copies the state value from the private thermal
+ * cooling device structure, the mapping is 1 <-> 1.
+ *
+ * The function can not fail, it always returns zero.
+ */
+static int cpuidle_cooling_get_cur_state(struct thermal_cooling_device *cdev,
+					 unsigned long *state)
+{
+	struct cpuidle_cooling_device *idle_cdev = cdev->devdata;
+
+	*state = idle_cdev->state;
+
+	return 0;
+}
+
+/**
+ * cpuidle_cooling_set_cur_state - Set the current cooling state
+ * @cdev: the thermal cooling device
+ * @state: the target state
+ *
+ * The function checks first if we are initiating the mitigation which
+ * in turn wakes up all the idle injection tasks belonging to the idle
+ * cooling device. In any case, it updates the internal state for the
+ * cooling device.
+ *
+ * The function can not fail, it always returns zero.
+ */
+static int cpuidle_cooling_set_cur_state(struct thermal_cooling_device *cdev,
+					 unsigned long state)
+{
+	struct cpuidle_cooling_device *idle_cdev = cdev->devdata;
+	struct idle_inject_device *ii_dev = idle_cdev->ii_dev;
+	unsigned long current_state = idle_cdev->state;
+	unsigned int runtime_us, idle_duration_us;
+
+	idle_cdev->state = state;
+
+	idle_inject_get_duration(ii_dev, &runtime_us, &idle_duration_us);
+
+	runtime_us = cpuidle_cooling_runtime(idle_duration_us, state);
+
+	idle_inject_set_duration(ii_dev, runtime_us, idle_duration_us);
+
+	if (current_state == 0 && state > 0) {
+		idle_inject_start(ii_dev);
+	} else if (current_state > 0 && !state) {
+		idle_inject_stop(ii_dev);
+	}
+
+	return 0;
+}
+
+/**
+ * cpuidle_cooling_ops - thermal cooling device ops
+ */
+static struct thermal_cooling_device_ops cpuidle_cooling_ops = {
+	.get_max_state = cpuidle_cooling_get_max_state,
+	.get_cur_state = cpuidle_cooling_get_cur_state,
+	.set_cur_state = cpuidle_cooling_set_cur_state,
+};
+
+/**
+ * cpuidle_of_cooling_register - Idle cooling device initialization function
+ * @np: a node pointer to a device tree cooling device node
+ * @drv: a cpuidle driver structure pointer
+ *
+ * This function is in charge of creating a cooling device per cpuidle
+ * driver and registering it with the thermal framework.
+ *
+ * Returns a valid pointer to a thermal cooling device or an ERR_PTR()
+ * encoding the error detected in the underlying subsystems.
+ */
+struct thermal_cooling_device *
+__init cpuidle_of_cooling_register(struct device_node *np,
+				   struct cpuidle_driver *drv)
+{
+	struct idle_inject_device *ii_dev;
+	struct cpuidle_cooling_device *idle_cdev;
+	struct thermal_cooling_device *cdev;
+	char dev_name[THERMAL_NAME_LENGTH];
+	int id, ret;
+
+	idle_cdev = kzalloc(sizeof(*idle_cdev), GFP_KERNEL);
+	if (!idle_cdev) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	id = ida_simple_get(&cpuidle_ida, 0, 0, GFP_KERNEL);
+	if (id < 0) {
+		ret = id;
+		goto out_kfree;
+	}
+
+	ii_dev = idle_inject_register(drv->cpumask);
+	if (IS_ERR(ii_dev)) {
+		ret = PTR_ERR(ii_dev);
+		goto out_id;
+	}
+
+	idle_inject_set_duration(ii_dev, 0, TICK_USEC);
+
+	idle_cdev->ii_dev = ii_dev;
+
+	snprintf(dev_name, sizeof(dev_name), "thermal-idle-%d", id);
+
+	cdev = thermal_of_cooling_device_register(np, dev_name, idle_cdev,
+						  &cpuidle_cooling_ops);
+	if (IS_ERR(cdev)) {
+		ret = PTR_ERR(cdev);
+		goto out_unregister;
+	}
+
+	return cdev;
+
+out_unregister:
+	idle_inject_unregister(ii_dev);
+out_id:
+	ida_simple_remove(&cpuidle_ida, id);
+out_kfree:
+	kfree(idle_cdev);
+out:
+	return ERR_PTR(ret);
+}
+
+/**
+ * cpuidle_cooling_register - Idle cooling device initialization function
+ * @drv: a cpuidle driver structure pointer
+ *
+ * This function is in charge of creating a cooling device per cpuidle
+ * driver and registering it with the thermal framework.
+ *
+ * Returns a valid pointer to a thermal cooling device or an ERR_PTR()
+ * encoding the error detected in the underlying subsystems.
+ */
+struct thermal_cooling_device *
+__init cpuidle_cooling_register(struct cpuidle_driver *drv)
+{
+	return cpuidle_of_cooling_register(NULL, drv);
+}
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index 3cdd85f987d7..da0970183d1f 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -60,4 +60,26 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 }
 #endif /* CONFIG_CPU_FREQ_THERMAL */
 
+struct cpuidle_driver;
+
+#ifdef CONFIG_CPU_IDLE_THERMAL
+extern struct thermal_cooling_device *
+__init cpuidle_cooling_register(struct cpuidle_driver *drv);
+extern struct thermal_cooling_device *
+__init cpuidle_of_cooling_register(struct device_node *np,
+				   struct cpuidle_driver *drv);
+#else /* !CONFIG_CPU_IDLE_THERMAL */
+static inline struct thermal_cooling_device *
+__init cpuidle_cooling_register(struct cpuidle_driver *drv)
+{
+	return ERR_PTR(-EINVAL);
+}
+static inline struct thermal_cooling_device *
+__init cpuidle_of_cooling_register(struct device_node *np,
+				   struct cpuidle_driver *drv)
+{
+	return ERR_PTR(-EINVAL);
+}
+#endif /* CONFIG_CPU_IDLE_THERMAL */
+
 #endif /* __CPU_COOLING_H__ */
-- 
2.17.1
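
To experiment with the state to running-time mapping outside the
kernel, the integer formula from cpuidle_cooling_runtime() can be
copied into a user-space sketch like the one below. The first function
mirrors the driver code in this patch; idle_ratio_percent() is a
hypothetical helper added here only to check the round trip, and it
assumes a non-zero idle duration.

```c
/*
 * User-space copy of the driver formula:
 *
 *   running = (idle x 100) / ratio - idle
 *
 * A state of 0 means no idle injection, so no running time is computed.
 */
static unsigned int cooling_runtime_us(unsigned int idle_duration_us,
				       unsigned long state)
{
	if (!state)
		return 0;

	return ((idle_duration_us * 100) / state) - idle_duration_us;
}

/*
 * Hypothetical helper (not in the patch): idle ratio in percent for a
 * given idle and running duration. Assumes idle_us is not zero.
 */
static unsigned int idle_ratio_percent(unsigned int idle_us,
				       unsigned int running_us)
{
	return (idle_us * 100) / (idle_us + running_us);
}
```

With a 10ms idle duration, a state of 50 gives 10ms of running time, a
state of 25 gives 30ms and a state of 100 gives no running time at all.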



* [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling
  2019-12-03  9:37 [PATCH V3 1/4] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice Daniel Lezcano
  2019-12-03  9:37 ` [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation Daniel Lezcano
  2019-12-03  9:37 ` [PATCH V3 3/4] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver Daniel Lezcano
@ 2019-12-03  9:37 ` Daniel Lezcano
  2019-12-03  9:40   ` Viresh Kumar
  2019-12-04  4:27   ` Amit Kucheria
  2 siblings, 2 replies; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-03  9:37 UTC (permalink / raw)
  To: viresh.kumar, rui.zhang
  Cc: rjw, edubezval, linux-pm, amit.kucheria, linux-kernel

As we introduced the idle injection cooling device called
cpuidle_cooling, let's be consistent and rename the cpu_cooling to
cpufreq_cooling as this one mitigates with OPPs changes.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
---
  V3:
    - Fix missing name conversion (Viresh Kumar)
---
 Documentation/driver-api/thermal/exynos_thermal.rst  | 2 +-
 MAINTAINERS                                          | 2 +-
 drivers/thermal/Makefile                             | 2 +-
 drivers/thermal/clock_cooling.c                      | 2 +-
 drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} | 6 +++---
 include/linux/clock_cooling.h                        | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)
 rename drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} (99%)

diff --git a/Documentation/driver-api/thermal/exynos_thermal.rst b/Documentation/driver-api/thermal/exynos_thermal.rst
index 5bd556566c70..d4e4a5b75805 100644
--- a/Documentation/driver-api/thermal/exynos_thermal.rst
+++ b/Documentation/driver-api/thermal/exynos_thermal.rst
@@ -67,7 +67,7 @@ TMU driver description:
 The exynos thermal driver is structured as::
 
 					Kernel Core thermal framework
-				(thermal_core.c, step_wise.c, cpu_cooling.c)
+				(thermal_core.c, step_wise.c, cpufreq_cooling.c)
 								^
 								|
 								|
diff --git a/MAINTAINERS b/MAINTAINERS
index d2e92a0360f2..26e4be914765 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16194,7 +16194,7 @@ L:	linux-pm@vger.kernel.org
 S:	Supported
 F:	Documentation/driver-api/thermal/cpu-cooling-api.rst
 F:	Documentation/driver-api/thermal/cpu-idle-cooling.rst
-F:	drivers/thermal/cpu_cooling.c
+F:	drivers/thermal/cpufreq_cooling.c
 F:	drivers/thermal/cpuidle_cooling.c
 F:	include/linux/cpu_cooling.h
 
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index 9c8aa2d4bd28..5c98472ffd8b 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -19,7 +19,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)	+= user_space.o
 thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)	+= power_allocator.o
 
 # cpufreq cooling
-thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpu_cooling.o
+thermal_sys-$(CONFIG_CPU_FREQ_THERMAL)	+= cpufreq_cooling.o
 thermal_sys-$(CONFIG_CPU_IDLE_THERMAL)	+= cpuidle_cooling.o
 
 # clock cooling
diff --git a/drivers/thermal/clock_cooling.c b/drivers/thermal/clock_cooling.c
index 3ad3256c48fd..7cb3ae4b44ee 100644
--- a/drivers/thermal/clock_cooling.c
+++ b/drivers/thermal/clock_cooling.c
@@ -7,7 +7,7 @@
  *  Copyright (C) 2013	Texas Instruments Inc.
  *  Contact:  Eduardo Valentin <eduardo.valentin@ti.com>
  *
- *  Highly based on cpu_cooling.c.
+ *  Highly based on cpufreq_cooling.c.
  *  Copyright (C) 2012	Samsung Electronics Co., Ltd(http://www.samsung.com)
  *  Copyright (C) 2012  Amit Daniel <amit.kachhap@linaro.org>
  */
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpufreq_cooling.c
similarity index 99%
rename from drivers/thermal/cpu_cooling.c
rename to drivers/thermal/cpufreq_cooling.c
index 6b9865c786ba..3a3f9cf94b6d 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpufreq_cooling.c
@@ -1,6 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 /*
- *  linux/drivers/thermal/cpu_cooling.c
+ *  linux/drivers/thermal/cpufreq_cooling.c
  *
  *  Copyright (C) 2012	Samsung Electronics Co., Ltd(http://www.samsung.com)
  *
@@ -694,7 +694,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 	u32 capacitance = 0;
 
 	if (!np) {
-		pr_err("cpu_cooling: OF node not available for cpu%d\n",
+		pr_err("cpufreq_cooling: OF node not available for cpu%d\n",
 		       policy->cpu);
 		return NULL;
 	}
@@ -705,7 +705,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 
 		cdev = __cpufreq_cooling_register(np, policy, capacitance);
 		if (IS_ERR(cdev)) {
-			pr_err("cpu_cooling: cpu%d failed to register as cooling device: %ld\n",
+			pr_err("cpufreq_cooling: cpu%d failed to register as cooling device: %ld\n",
 			       policy->cpu, PTR_ERR(cdev));
 			cdev = NULL;
 		}
diff --git a/include/linux/clock_cooling.h b/include/linux/clock_cooling.h
index b5cebf766e02..4b0a69863656 100644
--- a/include/linux/clock_cooling.h
+++ b/include/linux/clock_cooling.h
@@ -7,7 +7,7 @@
  *  Copyright (C) 2013	Texas Instruments Inc.
  *  Contact:  Eduardo Valentin <eduardo.valentin@ti.com>
  *
- *  Highly based on cpu_cooling.c.
+ *  Highly based on cpufreq_cooling.c.
  *  Copyright (C) 2012	Samsung Electronics Co., Ltd(http://www.samsung.com)
  *  Copyright (C) 2012  Amit Daniel <amit.kachhap@linaro.org>
  */
-- 
2.17.1



* Re: [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling
  2019-12-03  9:37 ` [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling Daniel Lezcano
@ 2019-12-03  9:40   ` Viresh Kumar
  2019-12-04  4:27   ` Amit Kucheria
  1 sibling, 0 replies; 10+ messages in thread
From: Viresh Kumar @ 2019-12-03  9:40 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: rui.zhang, rjw, edubezval, linux-pm, amit.kucheria, linux-kernel

On 03-12-19, 10:37, Daniel Lezcano wrote:
> As we introduced the idle injection cooling device called
> cpuidle_cooling, let's be consistent and rename the cpu_cooling to
> cpufreq_cooling as this one mitigates with OPPs changes.
> 
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>   V3:
>     - Fix missing name conversion (Viresh Kumar)
> ---
>  Documentation/driver-api/thermal/exynos_thermal.rst  | 2 +-
>  MAINTAINERS                                          | 2 +-
>  drivers/thermal/Makefile                             | 2 +-
>  drivers/thermal/clock_cooling.c                      | 2 +-
>  drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} | 6 +++---
>  include/linux/clock_cooling.h                        | 2 +-
>  6 files changed, 8 insertions(+), 8 deletions(-)
>  rename drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} (99%)

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>

-- 
viresh


* Re: [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation
  2019-12-03  9:37 ` [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation Daniel Lezcano
@ 2019-12-04  4:24   ` Amit Kucheria
  2019-12-04  6:50     ` Daniel Lezcano
  0 siblings, 1 reply; 10+ messages in thread
From: Amit Kucheria @ 2019-12-04  4:24 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Viresh Kumar, Zhang Rui, Rafael J. Wysocki, Eduardo Valentin,
	Linux PM list, Linux Kernel Mailing List

On Tue, Dec 3, 2019 at 3:07 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> Provide some documentation for the idle injection cooling effect in
> order to let people to understand the rational of the approach for the

s/rational/rationale

> idle injection CPU cooling device.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  .../driver-api/thermal/cpu-idle-cooling.rst   | 166 ++++++++++++++++++
>  1 file changed, 166 insertions(+)
>  create mode 100644 Documentation/driver-api/thermal/cpu-idle-cooling.rst
>
> diff --git a/Documentation/driver-api/thermal/cpu-idle-cooling.rst b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
> new file mode 100644
> index 000000000000..457cd9979ddb
> --- /dev/null
> +++ b/Documentation/driver-api/thermal/cpu-idle-cooling.rst
> @@ -0,0 +1,166 @@
> +
> +Situation:
> +----------
> +
> +Under certain circumstances a SoC can reach the maximum temperature
> +limit or is unable to stabilize the temperature around a temperature

s/the maximum/a critical/

s/or/and/

> +control. When the SoC has to stabilize the temperature, the kernel can
> +act on a cooling device to mitigate the dissipated power. When the
> +maximum temperature is reached and to prevent a reboot or a shutdown,
> +a decision must be taken to reduce the temperature under the critical
> +threshold, that impacts the performance.

Consider replacing above paragraph with:

When the critical temperature is reached, a decision must be taken to
reduce the temperature, that, in turn impacts performance.

> +
> +Another situation is when the silicon reaches a certain temperature
> +which continues to increase even if the dynamic leakage is reduced to
> +its minimum by clock gating the component. The runaway phenomena will

s/phenomena/phenomenon/

> +continue with the static leakage and only powering down the component,
> +thus dropping the dynamic and static leakage will allow the component
> +to cool down.
> +

Consider rephrasing as,

Another situation is when the silicon temperature continues to
increase even after the dynamic leakage is reduced to its minimum by
clock gating the component. This runaway phenomenon can continue due
to the static leakage. The only solution is to power down the
component, thus dropping the dynamic and static leakage that will
allow the component to cool down.


> +Last but not least, the system can ask for a specific power budget but
> +because of the OPP density, we can only choose an OPP with a power
> +budget lower than the requested one and underuse the CPU, thus losing
> +performances. In other words, one OPP under uses the CPU with a power

s/performances/performance.

s/underuse/under-utilize/
s/under uses/under-utilizes/

> +lesser than the power budget and the next OPP exceed the power budget,

s/lesser than the/less than the requested/
s/exceed/exceeds/

> +an intermediate OPP could have been used if it were present.

Make this a new sentence.

> +
> +Solutions:
> +----------
> +
> +If we can remove the static and the dynamic leakage for a specific
> +duration in a controlled period, the SoC temperature will
> +decrease. Acting at the idle state duration or the idle cycle

s/at/for/ ?

> +injection period, we can mitigate the temperature by modulating the
> +power budget.
> +
> +The Operating Performance Point (OPP) density has a great influence on
> +the control precision of cpufreq, however different vendors have a
> +plethora of OPP density, and some have large power gap between OPPs,
> +that will result in loss of performance during thermal control and
> +loss of power in other scenes.

s/scenes/scenarios/

> +
> +At a specific OPP, we can assume injecting idle cycle on all CPUs,
> +belonging to the same cluster, with a duration greater than the

Change to "we can assume that injecting idle cycles on all CPUs belonging
to the same cluster"

> +cluster idle state target residency, we drop the static and the

s/we drop/will lead to dropping/

> +dynamic leakage for this period (modulo the energy needed to enter
> +this state). So the sustainable power with idle cycles has a linear
> +relation with the OPP’s sustainable power and can be computed with a
> +coefficient similar to:
> +
> +           Power(IdleCycle) = Coef x Power(OPP)
> +
> +Idle Injection:
> +---------------
> +
> +The base concept of the idle injection is to force the CPU to go to an
> +idle state for a specified time each control cycle, it provides
> +another way to control CPU power and heat in addition to
> +cpufreq. Ideally, if all CPUs belonging to the same cluster, inject
> +their idle cycle synchronously, the cluster can reach its power down

cycles

> +state with a minimum power consumption and static leakage
> +drop. However, these idle cycles injection will add extra latencies as

s/static leakage drop/reduce static leakage to (almost) zero/

> +the CPUs will have to wakeup from a deep sleep state.
> +
> +     ^
> +     |
> +     |
> +     |-------       -------       -------
> +     |_______|_____|_______|_____|_______|___________
> +
> +      <----->
> +       idle  <---->
> +              running
> +
> +With the fixed idle injection duration, we can give a value which is
> +an acceptable performance drop off or latency when we reach a specific
> +temperature and we begin to mitigate by varying the Idle injection
> +period.
> +

I'm not sure what is the purpose of this statement. You've described
how the period value starts at a maximum and is adjusted dynamically
below.

> +The mitigation begins with a maximum period value which decrease when

Shouldn't the idle injection period increase to get more cooling effect?

> +more cooling effect is requested. When the period duration is equal to
> +the idle duration, then we are in a situation the platform can’t
> +dissipate the heat enough and the mitigation fails. In this case the
> +situation is considered critical and there is nothing to do. The idle
> +injection duration must be changed by configuration and until we reach
> +the cooling effect, otherwise an additionnal cooling device must be

typo: additional

> +used or ultimately decrease the SoC performance by dropping the
> +highest OPP point of the SoC.
> +
> +The idle injection duration value must comply with the constraints:
> +
> +- It is lesser or equal to the latency we tolerate when the mitigation

s/lesser/less than/

> +  begins. It is platform dependent and will depend on the user
> +  experience, reactivity vs performance trade off we want. This value
> +  should be specified.
> +
> +- It is greater than the idle state’s target residency we want to go
> +  for thermal mitigation, otherwise we end up consuming more energy.
> +
> +Minimum period
> +--------------
> +
> +The idle injection duration being fixed, it is obvious the minimum

Change to:
When the idle injection duration is fixed,

> +period can’t be lesser than that, otherwise we will be scheduling the
> +idle injection task right before the idle injection duration is
> +complete, so waking up the CPU to put it asleep again.
> +
> +Maximum period
> +--------------
> +
> +The maximum period is the initial period when the mitigation
> +begins. Theoretically when we reach the thermal trip point, we have to
> +sustain a specified power for specific temperature but at this time we
> +consume:
> +
> + Power = Capacitance x Voltage^2 x Frequency x Utilisation
> +
> +... which is more than the sustainable power (or there is something
> +wrong on the system setup). The ‘Capacitance’ and ‘Utilisation’ are a

s/on/in/

> +fixed value, ‘Voltage’ and the ‘Frequency’ are fixed artificially
> +because we don’t want to change the OPP. We can group the
> +‘Capacitance’ and the ‘Utilisation’ into a single term which is the
> +‘Dynamic Power Coefficient (Cdyn)’ Simplifying the above, we have:
> +
> + Pdyn = Cdyn x Voltage^2 x Frequency
> +
> +The IPA will ask us somehow to reduce our power in order to target the

s/IPA/power allocator governor/

> +sustainable power defined in the device tree. So with the idle
> +injection mechanism, we want an average power (Ptarget) resulting on

s/on/in

> +an amount of time running at full power on a specific OPP and idle
> +another amount of time. That could be put in a equation:
> +
> + P(opp)target = ((trunning x (P(opp)running) + (tidle P(opp)idle)) /

missed a 'x' after tidle.

Suggest using capital T for time everywhere to make it easier to read.

> +                       (trunning + tidle)
> +  ...
> +
> + tidle = trunning x ((P(opp)running / P(opp)target) - 1)
> +
> +At this point if we know the running period for the CPU, that gives us
> +the idle injection, we need. Alternatively if we have the idle

Lose the comma.

> +injection duration, we can compute the running duration with:
> +
> + trunning = tidle / ((P(opp)running / P(opp)target) - 1)
> +
> +Practically, if the running power is lesses than the targeted power,

s/lesses/less/

> +we end up with a negative time value, so obviously the equation usage
> +is bound to a power reduction, hence a higher OPP is needed to have
> +the running power greater than the targeted power.
> +
> +However, in this demonstration we ignore three aspects:
> +
> + * The static leakage is not defined here, we can introduce it in the
> +   equation but assuming it will be zero most of the time as it is
> +   difficult to get the values from the SoC vendors
> +
> + * The idle state wake up latency (or entry + exit latency) is not
> +   taken into account, it must be added in the equation in order to
> +   rigorously compute the idle injection
> +
> + * The injected idle duration must be greater than the idle state
> +   target residency, otherwise we end up consuming more energy and
> +   potentially invert the mitigation effect
> +
> +So the final equation is:
> +
> + trunning = (tidle - twakeup ) x
> +               (((P(opp)dyn + P(opp)static ) - P(opp)target) / P(opp)target )
> --
> 2.17.1
>
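
For reference, the step elided by the '...' between the P(opp)target
equation and the tidle expression can be written out as below. This is
only a sketch: it assumes the idle power P(opp)idle is negligible
(zero), which the text does not state explicitly, and it uses capital T
for the durations as suggested above.

```latex
\begin{align*}
P_{target} &= \frac{T_{running} \times P_{running} + T_{idle} \times P_{idle}}
                   {T_{running} + T_{idle}} \\
% assumption: P_{idle} \approx 0
P_{target} \times (T_{running} + T_{idle}) &= T_{running} \times P_{running} \\
T_{idle} \times P_{target} &= T_{running} \times (P_{running} - P_{target}) \\
T_{idle} &= T_{running} \times \left( \frac{P_{running}}{P_{target}} - 1 \right) \\
T_{running} &= \frac{T_{idle}}{\dfrac{P_{running}}{P_{target}} - 1}
\end{align*}
```

The last two lines match the tidle and trunning expressions in the
patch, and both are only meaningful when the running power is greater
than the targeted power.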


* Re: [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling
  2019-12-03  9:37 ` [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling Daniel Lezcano
  2019-12-03  9:40   ` Viresh Kumar
@ 2019-12-04  4:27   ` Amit Kucheria
  1 sibling, 0 replies; 10+ messages in thread
From: Amit Kucheria @ 2019-12-04  4:27 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Viresh Kumar, Zhang Rui, Rafael J. Wysocki, Eduardo Valentin,
	Linux PM list, Linux Kernel Mailing List

On Tue, Dec 3, 2019 at 3:07 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> As we introduced the idle injection cooling device called
> cpuidle_cooling, let's be consistent and rename the cpu_cooling to
> cpufreq_cooling as this one mitigates with OPPs changes.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Reviewed-by: Amit Kucheria <amit.kucheria@linaro.org>

> ---
>   V3:
>     - Fix missing name conversion (Viresh Kumar)
> ---
>  Documentation/driver-api/thermal/exynos_thermal.rst  | 2 +-
>  MAINTAINERS                                          | 2 +-
>  drivers/thermal/Makefile                             | 2 +-
>  drivers/thermal/clock_cooling.c                      | 2 +-
>  drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} | 6 +++---
>  include/linux/clock_cooling.h                        | 2 +-
>  6 files changed, 8 insertions(+), 8 deletions(-)
>  rename drivers/thermal/{cpu_cooling.c => cpufreq_cooling.c} (99%)
>
> diff --git a/Documentation/driver-api/thermal/exynos_thermal.rst b/Documentation/driver-api/thermal/exynos_thermal.rst
> index 5bd556566c70..d4e4a5b75805 100644
> --- a/Documentation/driver-api/thermal/exynos_thermal.rst
> +++ b/Documentation/driver-api/thermal/exynos_thermal.rst
> @@ -67,7 +67,7 @@ TMU driver description:
>  The exynos thermal driver is structured as::
>
>                                         Kernel Core thermal framework
> -                               (thermal_core.c, step_wise.c, cpu_cooling.c)
> +                               (thermal_core.c, step_wise.c, cpufreq_cooling.c)
>                                                                 ^
>                                                                 |
>                                                                 |
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d2e92a0360f2..26e4be914765 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16194,7 +16194,7 @@ L:      linux-pm@vger.kernel.org
>  S:     Supported
>  F:     Documentation/driver-api/thermal/cpu-cooling-api.rst
>  F:     Documentation/driver-api/thermal/cpu-idle-cooling.rst
> -F:     drivers/thermal/cpu_cooling.c
> +F:     drivers/thermal/cpufreq_cooling.c
>  F:     drivers/thermal/cpuidle_cooling.c
>  F:     include/linux/cpu_cooling.h
>
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index 9c8aa2d4bd28..5c98472ffd8b 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -19,7 +19,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_USER_SPACE)  += user_space.o
>  thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)      += power_allocator.o
>
>  # cpufreq cooling
> -thermal_sys-$(CONFIG_CPU_FREQ_THERMAL) += cpu_cooling.o
> +thermal_sys-$(CONFIG_CPU_FREQ_THERMAL) += cpufreq_cooling.o
>  thermal_sys-$(CONFIG_CPU_IDLE_THERMAL) += cpuidle_cooling.o
>
>  # clock cooling
> diff --git a/drivers/thermal/clock_cooling.c b/drivers/thermal/clock_cooling.c
> index 3ad3256c48fd..7cb3ae4b44ee 100644
> --- a/drivers/thermal/clock_cooling.c
> +++ b/drivers/thermal/clock_cooling.c
> @@ -7,7 +7,7 @@
>   *  Copyright (C) 2013 Texas Instruments Inc.
>   *  Contact:  Eduardo Valentin <eduardo.valentin@ti.com>
>   *
> - *  Highly based on cpu_cooling.c.
> + *  Highly based on cpufreq_cooling.c.
>   *  Copyright (C) 2012 Samsung Electronics Co., Ltd(http://www.samsung.com)
>   *  Copyright (C) 2012  Amit Daniel <amit.kachhap@linaro.org>
>   */
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpufreq_cooling.c
> similarity index 99%
> rename from drivers/thermal/cpu_cooling.c
> rename to drivers/thermal/cpufreq_cooling.c
> index 6b9865c786ba..3a3f9cf94b6d 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpufreq_cooling.c
> @@ -1,6 +1,6 @@
>  // SPDX-License-Identifier: GPL-2.0
>  /*
> - *  linux/drivers/thermal/cpu_cooling.c
> + *  linux/drivers/thermal/cpufreq_cooling.c
>   *
>   *  Copyright (C) 2012 Samsung Electronics Co., Ltd(http://www.samsung.com)
>   *
> @@ -694,7 +694,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>         u32 capacitance = 0;
>
>         if (!np) {
> -               pr_err("cpu_cooling: OF node not available for cpu%d\n",
> +               pr_err("cpufreq_cooling: OF node not available for cpu%d\n",
>                        policy->cpu);
>                 return NULL;
>         }
> @@ -705,7 +705,7 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>
>                 cdev = __cpufreq_cooling_register(np, policy, capacitance);
>                 if (IS_ERR(cdev)) {
> -                       pr_err("cpu_cooling: cpu%d failed to register as cooling device: %ld\n",
> +                       pr_err("cpufreq_cooling: cpu%d failed to register as cooling device: %ld\n",
>                                policy->cpu, PTR_ERR(cdev));
>                         cdev = NULL;
>                 }
> diff --git a/include/linux/clock_cooling.h b/include/linux/clock_cooling.h
> index b5cebf766e02..4b0a69863656 100644
> --- a/include/linux/clock_cooling.h
> +++ b/include/linux/clock_cooling.h
> @@ -7,7 +7,7 @@
>   *  Copyright (C) 2013 Texas Instruments Inc.
>   *  Contact:  Eduardo Valentin <eduardo.valentin@ti.com>
>   *
> - *  Highly based on cpu_cooling.c.
> + *  Highly based on cpufreq_cooling.c.
>   *  Copyright (C) 2012 Samsung Electronics Co., Ltd(http://www.samsung.com)
>   *  Copyright (C) 2012  Amit Daniel <amit.kachhap@linaro.org>
>   */
> --
> 2.17.1
>


* Re: [PATCH V3 3/4] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver
  2019-12-03  9:37 ` [PATCH V3 3/4] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver Daniel Lezcano
@ 2019-12-04  4:53   ` Amit Kucheria
  0 siblings, 0 replies; 10+ messages in thread
From: Amit Kucheria @ 2019-12-04  4:53 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Viresh Kumar, Zhang Rui, Rafael J. Wysocki, Eduardo Valentin,
	Linux PM list, Linux Kernel Mailing List

On Tue, Dec 3, 2019 at 3:07 PM Daniel Lezcano <daniel.lezcano@linaro.org> wrote:
>
> The cpu idle cooling device offers a new method to cool down a CPU by
> injecting idle cycles at runtime.
>
> It has some similarities with the intel power clamp driver but it is
> actually designed to be more generic and relying on the idle injection
> powercap framework.
>
> The idle injection cycle is fixed while the running cycle is variable. That
> allows to have control on the device reactivity for the user experience.

s/cycle/period/ ? Since you use that in your documentation.

> An idle state powering down the CPU or the cluster will allow to drop
> the static leakage, thus restoring the heat capacity of the SoC. It
> can be set with a trip point between the hot and the critical points,
> giving the opportunity to prevent a hard reset of the system when the
> cpufreq cooling fails to cool down the CPU.
>
> With more sophisticated boards having a per core sensor, the idle
> cooling device allows to cool down a single core without throttling
> the compute capacity of several cpus belonging to the same clock line,
> so it could be used in collaboration with the cpufreq cooling device.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  V3:
>    - Add missing parameter documentation (Viresh Kumar)
>    - Fixed function description (Viresh Kumar)
>    - Add entry in MAINTAINER file
>  V2:
>    - Remove idle_duration_us field and use idle_inject API instead (Viresh Kumar)
>    - Fixed function definition wheh CPU_IDLE_COOLING is not set
>    - Inverted the initialization in the init function (Viresh Kumar)
> ---
>  MAINTAINERS                       |   3 +
>  drivers/thermal/Kconfig           |   7 +
>  drivers/thermal/Makefile          |   1 +
>  drivers/thermal/cpuidle_cooling.c | 234 ++++++++++++++++++++++++++++++
>  include/linux/cpu_cooling.h       |  22 +++
>  5 files changed, 267 insertions(+)
>  create mode 100644 drivers/thermal/cpuidle_cooling.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c570f0204b48..d2e92a0360f2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -16187,12 +16187,15 @@ F:    Documentation/devicetree/bindings/thermal/
>
>  THERMAL/CPU_COOLING
>  M:     Amit Daniel Kachhap <amit.kachhap@gmail.com>
> +M:     Daniel Lezcano <daniel.lezcano@linaro.org>
>  M:     Viresh Kumar <viresh.kumar@linaro.org>
>  M:     Javi Merino <javi.merino@kernel.org>
>  L:     linux-pm@vger.kernel.org
>  S:     Supported
>  F:     Documentation/driver-api/thermal/cpu-cooling-api.rst
> +F:     Documentation/driver-api/thermal/cpu-idle-cooling.rst
>  F:     drivers/thermal/cpu_cooling.c
> +F:     drivers/thermal/cpuidle_cooling.c
>  F:     include/linux/cpu_cooling.h
>
>  THINKPAD ACPI EXTRAS DRIVER
> diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
> index 4e3ee036938b..4ee9953ba5ce 100644
> --- a/drivers/thermal/Kconfig
> +++ b/drivers/thermal/Kconfig
> @@ -169,6 +169,13 @@ config CPU_FREQ_THERMAL
>           This will be useful for platforms using the generic thermal interface
>           and not the ACPI interface.
>
> +config CPU_IDLE_THERMAL
> +       bool "CPU idle cooling device"
> +       depends on IDLE_INJECT
> +       help
> +         This implements the CPU cooling mechanism through
> +         idle injection. This will throttle the CPU by injecting
> +         idle cycle.
>  endif
>
>  config CLOCK_THERMAL
> diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
> index d3b01cc96981..9c8aa2d4bd28 100644
> --- a/drivers/thermal/Makefile
> +++ b/drivers/thermal/Makefile
> @@ -20,6 +20,7 @@ thermal_sys-$(CONFIG_THERMAL_GOV_POWER_ALLOCATOR)     += power_allocator.o
>
>  # cpufreq cooling
>  thermal_sys-$(CONFIG_CPU_FREQ_THERMAL) += cpu_cooling.o
> +thermal_sys-$(CONFIG_CPU_IDLE_THERMAL) += cpuidle_cooling.o
>
>  # clock cooling
>  thermal_sys-$(CONFIG_CLOCK_THERMAL)    += clock_cooling.o
> diff --git a/drivers/thermal/cpuidle_cooling.c b/drivers/thermal/cpuidle_cooling.c
> new file mode 100644
> index 000000000000..7d91a1b298d4
> --- /dev/null
> +++ b/drivers/thermal/cpuidle_cooling.c
> @@ -0,0 +1,234 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + *  Copyright (C) 2019 Linaro Limited.
> + *
> + *  Author: Daniel Lezcano <daniel.lezcano@linaro.org>
> + *
> + */
> +#include <linux/cpu_cooling.h>
> +#include <linux/cpuidle.h>
> +#include <linux/err.h>
> +#include <linux/idle_inject.h>
> +#include <linux/idr.h>
> +#include <linux/slab.h>
> +#include <linux/thermal.h>
> +
> +/**
> + * struct cpuidle_cooling_device - data for the idle cooling device
> + * @ii_dev: an atomic to keep track of the last task exiting the idle cycle
> + * @state: an normalized integer giving the state of the cooling device

s/an/a/

> + */
> +struct cpuidle_cooling_device {
> +       struct idle_inject_device *ii_dev;
> +       unsigned long state;
> +};
> +
> +static DEFINE_IDA(cpuidle_ida);
> +
> +/**
> + * cpuidle_cooling_runtime - Running time computation
> + * @idle_duration_us: the idle cooling device
> + * @state: a percentile based number
> + *
> + * The running duration is computed from the idle injection duration
> + * which is fixed. If we reach 100% of idle injection ratio, that

How about using the term 'duty cycle' instead of ratio? It describes
the on/off (running/idle) waveform better.

> + * means the running duration is zero. If we have a 50% ratio
> + * injection, that means we have equal duration for idle and for
> + * running duration.
> + *
> + * The formula is deduced as the following:

s/the following/follows/

> + *
> + *  running = idle x ((100 / ratio) - 1)
> + *
> + * For precision purpose for integer math, we use the following:
> + *
> + *  running = (idle x 100) / ratio - idle
> + *
> + * For example, if we have an injected duration of 50%, then we end up
> + * with 10ms of idle injection and 10ms of running duration.
> + *
> + * Returns an unsigned int for an usec based runtime duration.

s/an/a/ since usec is actually a short form for microseconds.

> + */
> +static unsigned int cpuidle_cooling_runtime(unsigned int idle_duration_us,
> +                                           unsigned long state)
> +{
> +       if (!state)
> +               return 0;
> +
> +       return ((idle_duration_us * 100) / state) - idle_duration_us;
> +}
> +
> +/**
> + * cpuidle_cooling_get_max_state - Get the maximum state
> + * @cdev  : the thermal cooling device
> + * @state : a pointer to the state variable to be filled
> + *
> + * The function always gives 100 as the injection ratio is percentile

s/gives/returns/

Split the sentence after ratio.

> + * based for consistency accros different platforms.

Typo: across

> + *
> + * The function can not fail, it always returns zero.
> + */
> +static int cpuidle_cooling_get_max_state(struct thermal_cooling_device *cdev,
> +                                        unsigned long *state)
> +{
> +       /*
> +        * Depending on the configuration or the hardware, the running
> +        * cycle and the idle cycle could be different. We want unify

s/want/want to/

> +        * that to an 0..100 interval, so the set state interface will
> +        * be the same whatever the platform is.
> +        *
> +        * The state 100% will make the cluster 100% ... idle. A 0%
> +        * injection ratio means no idle injection at all and 50%
> +        * means for 10ms of idle injection, we have 10ms of running
> +        * time.
> +        */
> +       *state = 100;
> +
> +       return 0;
> +}
> +
> +/**
> + * cpuidle_cooling_get_cur_state - Get the current cooling state
> + * @cdev: the thermal cooling device
> + * @state: a pointer to the state
> + *
> + * The function just copy the state value from the private thermal

s/copy/copies/

> + * cooling device structure, the mapping is 1 <-> 1.
> + *
> + * The function can not fail, it always returns zero.

Add the 'Return:' keyword at the beginning to conform to kernel-doc, and
do the same for all other function return values in this file.

> + */
> +static int cpuidle_cooling_get_cur_state(struct thermal_cooling_device *cdev,
> +                                        unsigned long *state)
> +{
> +       struct cpuidle_cooling_device *idle_cdev = cdev->devdata;
> +
> +       *state = idle_cdev->state;
> +
> +       return 0;
> +}
> +
> +/**
> + * cpuidle_cooling_set_cur_state - Set the current cooling state
> + * @cdev: the thermal cooling device
> + * @state: the target state
> + *
> + * The function checks first if we are initiating the mitigation which
> + * in turn wakes up all the idle injection tasks belonging to the idle
> + * cooling device. In any case, it updates the internal state for the
> + * cooling device.
> + *
> + * The function can not fail, it always returns zero.
> + */
> +static int cpuidle_cooling_set_cur_state(struct thermal_cooling_device *cdev,
> +                                        unsigned long state)
> +{
> +       struct cpuidle_cooling_device *idle_cdev = cdev->devdata;
> +       struct idle_inject_device *ii_dev = idle_cdev->ii_dev;
> +       unsigned long current_state = idle_cdev->state;
> +       unsigned int runtime_us, idle_duration_us;
> +
> +       idle_cdev->state = state;
> +
> +       idle_inject_get_duration(ii_dev, &runtime_us, &idle_duration_us);
> +
> +       runtime_us = cpuidle_cooling_runtime(idle_duration_us, state);
> +
> +       idle_inject_set_duration(ii_dev, runtime_us, idle_duration_us);
> +
> +       if (current_state == 0 && state > 0) {
> +               idle_inject_start(ii_dev);
> +       } else if (current_state > 0 && !state)  {
> +               idle_inject_stop(ii_dev);
> +       }
> +
> +       return 0;
> +}
> +
> +/**
> + * cpuidle_cooling_ops - thermal cooling device ops
> + */
> +static struct thermal_cooling_device_ops cpuidle_cooling_ops = {
> +       .get_max_state = cpuidle_cooling_get_max_state,
> +       .get_cur_state = cpuidle_cooling_get_cur_state,
> +       .set_cur_state = cpuidle_cooling_set_cur_state,
> +};
> +
> +/**
> + * cpuidle_of_cooling_register - Idle cooling device initialization function
> + * @drv: a cpuidle driver structure pointer
> + * @np: a node pointer to a device tree cooling device node
> + *
> + * This function is in charge of creating a cooling device per cpuidle
> + * driver and register it to thermal framework.
> + *
> + * Returns a valid pointer to a thermal cooling device or a PTR_ERR
> + * corresponding to the error detected in the underlying subsystems.
> + */
> +struct thermal_cooling_device *
> +__init cpuidle_of_cooling_register(struct device_node *np,
> +                                  struct cpuidle_driver *drv)
> +{
> +       struct idle_inject_device *ii_dev;
> +       struct cpuidle_cooling_device *idle_cdev;
> +       struct thermal_cooling_device *cdev;
> +       char dev_name[THERMAL_NAME_LENGTH];
> +       int id, ret;
> +
> +       idle_cdev = kzalloc(sizeof(*idle_cdev), GFP_KERNEL);
> +       if (!idle_cdev) {
> +               ret = -ENOMEM;
> +               goto out;
> +       }
> +
> +       id = ida_simple_get(&cpuidle_ida, 0, 0, GFP_KERNEL);
> +       if (id < 0) {
> +               ret = id;
> +               goto out_kfree;
> +       }
> +
> +       ii_dev = idle_inject_register(drv->cpumask);
> +       if (IS_ERR(ii_dev)) {
> +               ret = PTR_ERR(ii_dev);
> +               goto out_id;
> +       }
> +
> +       idle_inject_set_duration(ii_dev, 0, TICK_USEC);
> +
> +       idle_cdev->ii_dev = ii_dev;
> +
> +       snprintf(dev_name, sizeof(dev_name), "thermal-idle-%d", id);
> +
> +       cdev = thermal_of_cooling_device_register(np, dev_name, idle_cdev,
> +                                                 &cpuidle_cooling_ops);
> +       if (IS_ERR(cdev)) {
> +               ret = PTR_ERR(cdev);
> +               goto out_unregister;
> +       }
> +
> +       return cdev;
> +
> +out_unregister:
> +       idle_inject_unregister(ii_dev);
> +out_id:
> +       ida_simple_remove(&cpuidle_ida, id);
> +out_kfree:
> +       kfree(idle_cdev);
> +out:
> +       return ERR_PTR(ret);
> +}
> +
> +/**
> + * cpuidle_cooling_register - Idle cooling device initialization function
> + * @drv: a cpuidle driver structure pointer
> + *
> + * This function is in charge of creating a cooling device per cpuidle
> + * driver and register it to thermal framework.
> + *
> + * Returns a valid pointer to a thermal cooling device, a PTR_ERR
> + * corresponding to the error detected in the underlying subsystems.
> + */
> +struct thermal_cooling_device *
> +__init cpuidle_cooling_register(struct cpuidle_driver *drv)
> +{
> +       return cpuidle_of_cooling_register(NULL, drv);
> +}
> diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
> index 3cdd85f987d7..da0970183d1f 100644
> --- a/include/linux/cpu_cooling.h
> +++ b/include/linux/cpu_cooling.h
> @@ -60,4 +60,26 @@ of_cpufreq_cooling_register(struct cpufreq_policy *policy)
>  }
>  #endif /* CONFIG_CPU_FREQ_THERMAL */
>
> +struct cpuidle_driver;
> +
> +#ifdef CONFIG_CPU_IDLE_THERMAL
> +extern struct thermal_cooling_device *
> +__init cpuidle_cooling_register(struct cpuidle_driver *drv);
> +extern struct thermal_cooling_device *
> +__init cpuidle_of_cooling_register(struct device_node *np,
> +                                  struct cpuidle_driver *drv);
> +#else /* CONFIG_CPU_IDLE_THERMAL */
> +static inline struct thermal_cooling_device *
> +__init cpuidle_cooling_register(struct cpuidle_driver *drv)
> +{
> +       return ERR_PTR(-EINVAL);
> +}
> +static inline struct thermal_cooling_device *
> +__init cpuidle_of_cooling_register(struct device_node *np,
> +                                  struct cpuidle_driver *drv)
> +{
> +       return ERR_PTR(-EINVAL);
> +}
> +#endif /* CONFIG_CPU_IDLE_THERMAL */
> +
>  #endif /* __CPU_COOLING_H__ */
> --
> 2.17.1
>
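
As a quick sanity check of the arithmetic in cpuidle_cooling_runtime(),
here is a standalone userspace sketch of the same integer formula. The
function name cooling_runtime_us() and the assert() guard are made up
for illustration and are not part of the patch; the 10ms idle duration
used in the note below is just the value from the comment's example.

```c
#include <assert.h>

/*
 * Userspace copy of the integer math in cpuidle_cooling_runtime():
 *
 *   running = (idle x 100) / ratio - idle
 *
 * 'state' is the idle injection ratio in percent (0..100).
 */
static unsigned int cooling_runtime_us(unsigned int idle_duration_us,
                                       unsigned long state)
{
        /* Hypothetical guard, not in the patch: max state is 100. */
        assert(state <= 100);

        /* No mitigation requested: same early return as the patch. */
        if (!state)
                return 0;

        return ((idle_duration_us * 100) / state) - idle_duration_us;
}
```

With a 10000us idle duration, a state of 50 gives 10000us of running
time, a state of 25 gives 30000us, and a state of 100 gives 0us, i.e.
back to back idle injection.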


* Re: [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation
  2019-12-04  4:24   ` Amit Kucheria
@ 2019-12-04  6:50     ` Daniel Lezcano
  2019-12-04  7:10       ` Amit Kucheria
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-04  6:50 UTC (permalink / raw)
  To: Amit Kucheria
  Cc: Viresh Kumar, Zhang Rui, Rafael J. Wysocki, Eduardo Valentin,
	Linux PM list, Linux Kernel Mailing List


Hi Amit,

thanks for the review.


On 04/12/2019 05:24, Amit Kucheria wrote:

[ ... ]

>> +the CPUs will have to wakeup from a deep sleep state.
>> +
>> +     ^
>> +     |
>> +     |
>> +     |-------       -------       -------
>> +     |_______|_____|_______|_____|_______|___________
>> +
>> +      <----->
>> +       idle  <---->
>> +              running
>> +
>> +With the fixed idle injection duration, we can give a value which is
>> +an acceptable performance drop off or latency when we reach a specific
>> +temperature and we begin to mitigate by varying the Idle injection
>> +period.
>> +
> 
> I'm not sure what is the purpose of this statement. You've described
> how the period value starts at a maximum and is adjusted dynamically
> below.

There are different ways to inject idle periods. We can increase the
idle duration, and/or keep this duration constant and vary the period.
This statement clarifies that the method used is the latter, because a
constant latency per period is easier to deal with.

>> +The mitigation begins with a maximum period value which decrease when
> 
> Shouldn't the idle injection period increase to get more cooling effect?

The period is the inverse of the frequency. The higher the period, the
lower the frequency, thus fewer idle cycles and less cooling effect.
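
As a rough illustration of that relation (a sketch only, not code from
the patch; idle_ratio_percent() and print_idle_ratios() are hypothetical
names), the share of time spent idle grows as the period shrinks while
the idle duration stays fixed:

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: percentage of each period
 * spent idle when the idle duration is fixed and only the period varies.
 */
static unsigned int idle_ratio_percent(unsigned int idle_us,
				       unsigned int period_us)
{
	/* Period equal to (or below) the idle duration: nothing left to gain. */
	if (period_us == 0 || idle_us >= period_us)
		return 100;

	return (unsigned int)(((unsigned long long)idle_us * 100) / period_us);
}

/* Print the idle ratio for a shrinking period and a fixed idle duration. */
static void print_idle_ratios(unsigned int idle_us)
{
	unsigned int period_us;

	for (period_us = 8 * idle_us;
	     period_us >= idle_us && period_us > 0;
	     period_us /= 2)
		printf("period=%u us idle=%u us -> %u%% idle\n",
		       period_us, idle_us,
		       idle_ratio_percent(idle_us, period_us));
}
```

Calling print_idle_ratios(10000) walks the period from 80 ms down to
10 ms for a 10 ms idle duration. The last step is the case where the
period equals the idle duration and no further mitigation is possible.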

>> +more cooling effect is requested. When the period duration is equal to
>> +the idle duration, then we are in a situation the platform can’t
>> +dissipate the heat enough and the mitigation fails. In this case the
>> +situation is considered critical and there is nothing to do. The idle
>> +injection duration must be changed by configuration and until we reach
>> +the cooling effect, otherwise an additionnal cooling device must be
> 
> typo: additional
> 
>> +used or ultimately decrease the SoC performance by dropping the
>> +highest OPP point of the SoC.
>> +
>> +The idle injection duration value must comply with the constraints:
>> +
>> +- It is lesser or equal to the latency we tolerate when the mitigation
> 
> s/lesser/less than/
> 
>> +  begins. It is platform dependent and will depend on the user
>> +  experience, reactivity vs performance trade off we want. This value
>> +  should be specified.
>> +
>> +- It is greater than the idle state’s target residency we want to go
>> +  for thermal mitigation, otherwise we end up consuming more energy.
>> +
>> +Minimum period
>> +--------------
>> +
>> +The idle injection duration being fixed, it is obvious the minimum
> 
> Change to:
> When the idle injection duration is fixed,
> 

The idle duration is always fixed in the cpuidle cooling device, why do
you want to add the sentence above?





* Re: [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation
  2019-12-04  6:50     ` Daniel Lezcano
@ 2019-12-04  7:10       ` Amit Kucheria
  0 siblings, 0 replies; 10+ messages in thread
From: Amit Kucheria @ 2019-12-04  7:10 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Viresh Kumar, Zhang Rui, Rafael J. Wysocki, Eduardo Valentin,
	Linux PM list, Linux Kernel Mailing List

On Wed, Dec 4, 2019 at 12:20 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
>
> Hi Amit,
>
> thanks for the review.
>
>
> On 04/12/2019 05:24, Amit Kucheria wrote:
>
> [ ... ]
>
> >> +the CPUs will have to wakeup from a deep sleep state.
> >> +
> >> +     ^
> >> +     |
> >> +     |
> >> +     |-------       -------       -------
> >> +     |_______|_____|_______|_____|_______|___________
> >> +
> >> +      <----->
> >> +       idle  <---->
> >> +              running
> >> +
> >> +With the fixed idle injection duration, we can give a value which is
> >> +an acceptable performance drop off or latency when we reach a specific
> >> +temperature and we begin to mitigate by varying the Idle injection
> >> +period.
> >> +
> >
> > I'm not sure what is the purpose of this statement. You've described
> > how the period value starts at a maximum and is adjusted dynamically
> > below.
>
> There are different ways to inject idle periods. We can increase the
> idle duration, or keep the duration constant and vary the period. This
> statement clarifies that we use the latter, because a constant latency
> per period is easier to deal with.

I think I read "period" as "duration", which led to the confusion. I
suggest using duty cycle instead of period throughout this series. I
think it will improve the explanation.

The above paragraph could be rewritten as:

"We use a fixed duration of idle injection that gives an acceptable
performance penalty and a fixed latency. Mitigation can be increased
or decreased by modulating the duty cycle of the idle injection."

Perhaps you could also enhance your ASCII art above to show
fixed-duration idles and different duty cycles to drive home the point.
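
One possible way to generate such a diagram is sketched below. It is an
illustration only: draw_timeline() and its drawing convention ('_' for
injected idle, '-' for running) are assumptions, not something taken
from the series.

```c
#include <stddef.h>

/*
 * Hypothetical illustration only: fill buf with a timeline in which '_'
 * marks an injected idle slot and '-' marks running time. The idle width
 * is fixed; only the running width (hence the duty cycle) changes.
 * Returns the number of characters written, excluding the trailing NUL.
 */
static size_t draw_timeline(char *buf, size_t len, size_t idle_w,
			    size_t run_w, size_t cycles)
{
	size_t pos = 0, c, i;

	if (len == 0)
		return 0;

	for (c = 0; c < cycles; c++) {
		/* Fixed-width idle slot, truncated if the buffer is full. */
		for (i = 0; i < idle_w && pos + 1 < len; i++)
			buf[pos++] = '_';
		/* Variable-width running slot. */
		for (i = 0; i < run_w && pos + 1 < len; i++)
			buf[pos++] = '-';
	}
	buf[pos] = '\0';

	return pos;
}
```

Drawing the same idle width with run widths of, say, 6, 3 and 1 gives
three lines with identical idle slots that appear more and more often,
which is the effect of modulating the duty cycle.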

> >> +The mitigation begins with a maximum period value which decrease when
> >
> > Shouldn't the idle injection period increase to get more cooling effect?
>
> The period is the inverse of the frequency. The higher the period, the
> lower the frequency, thus fewer idle cycles and less cooling effect.

Yeah, I definitely confused period with duration :-)

> >> +more cooling effect is requested. When the period duration is equal to
> >> +the idle duration, then we are in a situation the platform can’t
> >> +dissipate the heat enough and the mitigation fails. In this case the
> >> +situation is considered critical and there is nothing to do. The idle
> >> +injection duration must be changed by configuration and until we reach
> >> +the cooling effect, otherwise an additionnal cooling device must be
> >
> > typo: additional
> >
> >> +used or ultimately decrease the SoC performance by dropping the
> >> +highest OPP point of the SoC.
> >> +
> >> +The idle injection duration value must comply with the constraints:
> >> +
> >> +- It is lesser or equal to the latency we tolerate when the mitigation
> >
> > s/lesser/less than/
> >
> >> +  begins. It is platform dependent and will depend on the user
> >> +  experience, reactivity vs performance trade off we want. This value
> >> +  should be specified.
> >> +
> >> +- It is greater than the idle state’s target residency we want to go
> >> +  for thermal mitigation, otherwise we end up consuming more energy.
> >> +
> >> +Minimum period
> >> +--------------
> >> +
> >> +The idle injection duration being fixed, it is obvious the minimum
> >
> > Change to:
> > When the idle injection duration is fixed,
> >
>
> The idle duration is always fixed in the cpuidle cooling device, why do
> you want to add the sentence above?

Ignore for now.

Regards,
Amit


end of thread, other threads:[~2019-12-04  7:10 UTC | newest]

Thread overview: 10+ messages
2019-12-03  9:37 [PATCH V3 1/4] thermal/drivers/Kconfig: Convert the CPU cooling device to a choice Daniel Lezcano
2019-12-03  9:37 ` [PATCH V3 2/4] thermal/drivers/cpu_cooling: Add idle cooling device documentation Daniel Lezcano
2019-12-04  4:24   ` Amit Kucheria
2019-12-04  6:50     ` Daniel Lezcano
2019-12-04  7:10       ` Amit Kucheria
2019-12-03  9:37 ` [PATCH V3 3/4] thermal/drivers/cpu_cooling: Introduce the cpu idle cooling driver Daniel Lezcano
2019-12-04  4:53   ` Amit Kucheria
2019-12-03  9:37 ` [PATCH V3 4/4] thermal/drivers/cpu_cooling: Rename to cpufreq_cooling Daniel Lezcano
2019-12-03  9:40   ` Viresh Kumar
2019-12-04  4:27   ` Amit Kucheria
